freebsd-nq

Author	SHA1	Message	Date
Scott Long	33ce28d137	Remove the trm(4) driver Differential Revision: https://reviews.freebsd.org/D22575	2019-11-28 02:32:17 +00:00
Konstantin Belousov	13189065cb	amd64: assert that EARLY_COUNTER does not corrupt memory. Reviewed by: imp Sponsored by: The FreeBSD Foundation MFC after: 1 week Differential revision: https://reviews.freebsd.org/D22514	2019-11-24 19:02:13 +00:00
Andrew Turner	68cad68149	Add kcsan_md_unsupported from NetBSD. It's used to ignore virtual addresses that may have a different physical address depending on the CPU. Sponsored by: DARPA, AFRL	2019-11-21 13:22:23 +00:00
Andrew Turner	1b8c58f283	Fix for style(9): use parentheses around return statements. Reported by: kib Sponsored by: DARPA, AFRL	2019-11-21 12:29:20 +00:00
Andrew Turner	849aef496d	Port the NetBSD KCSAN runtime to FreeBSD. Update the NetBSD Kernel Concurrency Sanitizer (KCSAN) runtime to work in the FreeBSD kernel. It is a useful tool for finding data races between threads executing on different CPUs. This can be enabled by enabling KCSAN in the kernel config, or by using the GENERIC-KCSAN amd64 kernel. It works on amd64 and arm64, however the latter needs a compiler change to allow -fsanitize=thread that KCSAN uses. Sponsored by: DARPA, AFRL Differential Revision: https://reviews.freebsd.org/D22315	2019-11-21 11:22:08 +00:00
Konstantin Belousov	da248a69aa	amd64: in double fault handler, do not rely on sane gsbase value. Typical reasons for double faults are either kernel stack overflow or bugs in the code that manipulates protection CPU state. The latter code is the code which often has to set up gsbase for kernel. Switching to explicit load of GSBASE MSR in the fault handler makes it more probable to output useful information. Now all IST handlers have nmi_pcpu structure on top of their stacks. It would be even more useful to save gsbase value at the moment of the fault. I did not do this because I do not want to modify PCB layout now. Tested by: pho Sponsored by: The FreeBSD Foundation MFC after: 1 week	2019-11-20 11:12:19 +00:00
Kyle Evans	f22a592111	Convert in-tree sysent targets to use new makesyscalls.lua flua is bootstrapped as part of the build for those on older versions/revisions that don't yet have flua installed. Once upgraded past r354833, "make sysent" will again naturally work as expected. Reviewed by: brooks Differential Revision: https://reviews.freebsd.org/D21894	2019-11-18 23:28:23 +00:00
John Baldwin	03b0d68c72	Check for errors from copyout() and suword*() in sv_copyout_args/strings. Reviewed by: brooks, kib Tested on: amd64 (amd64, i386, linux64), i386 (i386, linux) Sponsored by: DARPA Differential Revision: https://reviews.freebsd.org/D22401	2019-11-18 20:07:43 +00:00
Mark Johnston	85e06c728c	Set MALLOC_DEBUG_MAXZONES=1 in GENERIC-NODEBUG configurations. The purpose of this option is to make it easier to track down memory corruption bugs by reducing the number of malloc(9) types that might have recently been associated with a given chunk of memory. However, it increases fragmentation and is disabled in release kernels. MFC after: 2 weeks Sponsored by: The FreeBSD Foundation	2019-11-18 20:03:28 +00:00
Konstantin Belousov	b2e1b88984	amd64 copyout: remove irrelevant comment. Sponsored by: The FreeBSD Foundation MFC after: 3 days	2019-11-17 14:41:47 +00:00
Scott Long	e372160177	TSX Asynchronous Abort mitigation for Intel CVE-2019-11135. This CVE has already been announced in FreeBSD SA-19:26.mcu. Mitigation for TAA involves either turning off TSX or turning on the VERW mitigation used for MDS. Some CPUs will also be self-mitigating for TAA and require no software workaround. Control knobs are: machdep.mitigations.taa.enable: 0 - no software mitigation is enabled; 1 - attempt to disable TSX; 2 - use the VERW mitigation; 3 - automatically select the mitigation based on processor features. machdep.mitigations.taa.state: inactive - no mitigation is active/enabled; TSX disable - TSX is disabled in the bare metal CPU as well as any virtualized CPUs; VERW - VERW instruction clears CPU buffers; not vulnerable - the CPU has identified itself as not being vulnerable. Nothing in the base FreeBSD system uses TSX. However, the instructions are straight-forward to add to custom applications and require no kernel support, so the mitigation is provided for users with untrusted applications and tenants. Reviewed by: emaste, imp, kib, scottph Sponsored by: Intel Differential Revision: 22374	2019-11-16 00:26:42 +00:00
John Baldwin	5caa67fa84	Use a sv_copyout_auxargs hook in the Linux ELF ABIs. Reviewed by: emaste Tested on: amd64 (linux64 only), i386 Sponsored by: DARPA Differential Revision: https://reviews.freebsd.org/D22356	2019-11-15 23:01:43 +00:00
John Baldwin	e353233118	Add a sv_copyout_auxargs() hook in sysentvec. Change the FreeBSD ELF ABIs to use this new hook to copyout ELF auxv instead of doing it in the sv_fixup hook. In particular, this new hook allows the stack space to be allocated at the same time the auxv values are copied out to userland. This allows us to avoid wasting space for unused auxv entries as well as not having to recalculate where the auxv vector is by walking back up over the argv and environment vectors. Reviewed by: brooks, emaste Tested on: amd64 (amd64 and i386 binaries), i386, mips, mips64 Sponsored by: DARPA Differential Revision: https://reviews.freebsd.org/D22355	2019-11-15 18:42:13 +00:00
Josh Paetzel	052e12a508	Add the pvscsi driver to the tree. This driver allows the use of the paravirt SCSI controller in VMware products like ESXi. The pvscsi driver provides a substantial performance improvement in block devices versus the emulated mpt and mps SCSI/SAS controllers. Error handling in this driver has not been extensively tested yet. Submitted by: vbhakta@vmware.com Relnotes: yes Sponsored by: VMware, Panzura Differential Revision: D18613	2019-11-14 23:31:20 +00:00
Konstantin Belousov	c4f056e8ea	amd64: only set PCB_FULL_IRET pcb flag when #gp or similar exception comes from usermode. If CPU supports RDFSBASE, the flag also means that userspace fsbase and gsbase are already written into pcb, which might be not true when we handle #gp from kernel. The offender is rdmsr_safe(), and the visible result is corrupted userspace TLS base. Reported by: pstef Sponsored by: The FreeBSD Foundation MFC after: 3 days	2019-11-13 22:39:46 +00:00
Konstantin Belousov	c08973d09c	Workaround for Intel SKL002/SKL012S errata. Disable the use of executable 2M page mappings in EPT-format page tables on affected CPUs. For bhyve virtual machines, this effectively disables all use of superpage mappings on affected CPUs. The vm.pmap.allow_2m_x_ept sysctl can be set to override the default and enable mappings on affected CPUs. Alternate approaches have been suggested, but at present we do not believe the complexity is warranted for typical bhyve's use cases. Reviewed by: alc, emaste, markj, scottl Security: CVE-2018-12207 Sponsored by: The FreeBSD Foundation Differential revision: https://reviews.freebsd.org/D21884	2019-11-12 18:01:33 +00:00
Konstantin Belousov	a7af4a3e7d	amd64: move GDT into PCPU area. Reviewed by: jhb, markj Tested by: pho Sponsored by: The FreeBSD Foundation MFC after: 1 week Differential revision: https://reviews.freebsd.org/D22302	2019-11-12 15:51:47 +00:00
Konstantin Belousov	de6f295446	amd64: assert that size of the software prototype table for gdt is equal to the size of hardware gdt. Reviewed by: jhb, markj Tested by: pho Sponsored by: The FreeBSD Foundation MFC after: 1 week Differential revision: https://reviews.freebsd.org/D22302	2019-11-12 15:47:46 +00:00
Andriy Gapon	78f1851613	teach db_nextframe/x86 about [X]xen_intr_upcall interrupt handler Discussed with: kib, royger MFC after: 3 weeks Sponsored by: Panzura	2019-11-12 11:00:01 +00:00
Konstantin Belousov	6cd492bcd4	amd64: Issue MFENCE on context switch on AMD CPUs when reusing address space. On some AMD CPUs, in particular, machines that do not implement CLFLUSHOPT but do provide CLFLUSH, the CLFLUSH instruction is only synchronized with MFENCE. Code using CLFLUSH typically needs to brace it with MFENCE both before and after flush, see for instance pmap_invalidate_cache_range(). If context switch occurs while inside the protected region, we need to ensure visibility of flushes done on the old CPU, to new CPU. For all other machines, locked operation done to lock switched thread, should be enough. For case of different address spaces, reload of %cr3 is serializing. Reviewed by: cem, jhb, scottph Sponsored by: The FreeBSD Foundation MFC after: 1 week Differential revision: https://reviews.freebsd.org/D22007	2019-11-11 21:59:20 +00:00
Andriy Gapon	2961e6efeb	db_nextframe/amd64: remove TRAP_INTERRUPT frame type Besides the confusing name, this type is effectively unused. In all cases where it could be set, the INTERRUPT type is set by the earlier code. The conditions for TRAP_INTERRUPT are a subset of the conditions for INTERRUPT. Reviewed by: kib, markj MFC after: 2 weeks Differential Revision: https://reviews.freebsd.org/D22305	2019-11-11 17:11:49 +00:00
Konstantin Belousov	415d23ebfd	amd64: change r_gdt to the local variable in hammer_time(). Sponsored by: The FreeBSD Foundation MFC after: 1 week	2019-11-10 10:03:22 +00:00
Konstantin Belousov	d70bab39f2	amd64: Change SFENCE to locked op for synchronizing with CLFLUSHOPT on Intel. Reviewed by: cem, jhb Discussed with: alc, scottph Sponsored by: The FreeBSD Foundation MFC after: 1 week Differential revision: https://reviews.freebsd.org/D22007	2019-11-10 09:41:29 +00:00
Konstantin Belousov	98158c753d	amd64: move common_tss into pcpu. This saves some memory, around 256K I think. It removes some code, e.g. KPTI does not need to specially map common_tss anymore. Also, common_tss becomes domain-local. Reviewed by: jhb Tested by: pho Sponsored by: The FreeBSD Foundation MFC after: 1 week Differential revision: https://reviews.freebsd.org/D22231	2019-11-10 09:28:18 +00:00
Eric van Gyzen	854e90da4e	vmm: pass M_WAITOK to uma_zalloc when allocating FPU save area Submitted by: patrick.sullivan3@dell.com Reviewed by: markj MFC after: 2 weeks Sponsored by: Dell EMC Isilon Differential Revision: https://reviews.freebsd.org/D22276	2019-11-08 16:30:55 +00:00
Konstantin Belousov	83ba1468ab	amd64: Store %cr3 into pcpu saved_ucr3 on double fault. Tested by: pho Sponsored by: The FreeBSD Foundation MFC after: 1 week	2019-11-03 11:52:50 +00:00
Konstantin Belousov	7ccd639deb	amd64 ddb: Add printing of kernel/user and saved user %cr3 values from pcpu. Tested by: pho Sponsored by: The FreeBSD Foundation MFC after: 1 week	2019-11-03 11:51:53 +00:00
Edward Tomasz Napierala	2ae3f52cee	There's nothing architecture specific in "options STATS"; move it from sys/amd64/conf/NOTES to sys/conf/NOTES. Suggested by: jhb@ Sponsored by: Klara Inc, Netflix	2019-10-30 10:16:28 +00:00
Konstantin Belousov	af592d0465	Fix reset of the kernel stack pointer in TSS for !PTI case on pmap activation after r354095. Reported and tested by: pho Sponsored by: The FreeBSD Foundation MFC after: 2 weeks	2019-10-28 10:50:37 +00:00
Konstantin Belousov	20795e252a	Provide dummy definition of the amd64 struct pcb for -m32 compilation. I do not see a need in the proper x86/include/pcb.h header. Reported and tested by: antoine MFC after: 1 week	2019-10-26 18:22:52 +00:00
Konstantin Belousov	5e921ff49e	amd64: move pcb out of kstack to struct thread. This saves 320 bytes of the precious stack space. The only negative aspect of the change I can think of is that the struct thread increased by 320 bytes obviously, and that 320 bytes are not swapped out anymore. I believe the freed stack space is much more important than that. Also, current struct thread size is 1392 bytes on amd64, so UMA will allocate two thread structures per (4KB) slab, which leaves a space for pcb without increasing zone memory use. Reviewed by: alc, markj Tested by: pho Sponsored by: The FreeBSD Foundation MFC after: 2 weeks Differential revision: https://reviews.freebsd.org/D22138	2019-10-25 20:09:42 +00:00
Mateusz Guzik	08ded448cf	amd64 pmap: per-domain pv chunk list This significantly reduces contention since chunks get created and removed all the time. See the review for sample results. Reviewed by: kib Sponsored by: The FreeBSD Foundation Differential Revision: https://reviews.freebsd.org/D21976	2019-10-23 19:17:10 +00:00
Conrad Meyer	639ec13157	amd64: Add CFI directives for libc syscall stubs No functional change (in program code). Additional DWARF metadata is generated in the .eh_frame section. Also, it is now a compile-time requirement that machine/asm.h ENTRY() and END() macros are paired. (This is subject to ongoing discussion and may change.) This DWARF metadata allows llvm-libunwind to unwind program stacks when the program is executing the function. The goal is to collect accurate userspace stacktraces when programs have entered syscalls. (The motivation for "Call Frame Information," or CFI for short -- not to be confused with Control Flow Integrity -- is to sufficiently annotate assembly functions such that stack unwinders can unwind out of the local frame without the requirement of a dedicated framepointer register; i.e., -fomit-frame-pointer. This is necessary for C++ exception handling or collecting backtraces.) For the curious, a more thorough description of the metadata and some examples may be found at [1] and documentation at [2]. You can also look at 'cc -S -o - foo.c | less' and search for '.cfi_' to see the CFI directives generated by your C compiler. [1]: https://www.imperialviolet.org/2017/01/18/cfi.html [2]: https://sourceware.org/binutils/docs/as/CFI-directives.html Reviewed by: emaste, kib (with reservations) Differential Revision: https://reviews.freebsd.org/D22122	2019-10-23 19:03:03 +00:00
Mateusz Guzik	61b8430f38	amd64 pmap: conditionalize per-superpage locks on NUMA Instead of superpages use. The current code employs superpage-wide locking regardless and the better locking granularity is welcome with NUMA enabled even when superpage support is not used. Requested by: alc Sponsored by: The FreeBSD Foundation Differential Revision: https://reviews.freebsd.org/D21982	2019-10-22 22:55:46 +00:00
Mateusz Guzik	15e33b5493	amd64 pmap: fixup invlgen lookup for fictitious mappings Similarly to r353438, use dummy entry. Reported and tested by: Neel Chauhan Sponsored by: The FreeBSD Foundation	2019-10-22 22:54:41 +00:00
Andriy Gapon	869dbab7ba	vmm: remove a wmb() call After removing wmb(), vm_set_rendezvous_func() became super trivial, so there was no point in keeping it. The wmb (sfence on amd64, lock nop on i386) was not needed. This can be explained from several points of view. First, wmb() is used for store-store ordering (although, the primitive is undocumented). There was no obvious subsequent store that needed the barrier. Second, x86 has a memory model with strong ordering including total store order. An explicit store barrier may be needed only when working with special memory (device, special caching mode) or using special instructions (non-temporal stores). That was not the case for this code. Third, I believe that there is a misconception that sfence "flushes" the store buffer in a sense that it speeds up the propagation of stores from the store buffer to the global visibility. I think that such propagation always happens as fast as possible. sfence only makes subsequent stores wait for that propagation to complete. So, sfence is only useful for ordering of stores and only in the situations described above. Reviewed by: jhb MFC after: 23 days Differential Revision: https://reviews.freebsd.org/D21978	2019-10-19 07:10:15 +00:00
Mark Johnston	14327f5334	Tighten mapping protections on preloaded files on amd64. - We load the kernel at 0x200000. Memory below that address need not be executable, so do not map it as such. - Remove references to .ldata and related sections in the kernel linker script. They come from ld.bfd's default linker script, but are not used, and we now use ld.lld to link the amd64 kernel. lld does not contain a default linker script. - Pad the .bss to 2MB as we do between .text and .data. This forces the loader to load additional files starting in the following 2MB page, preserving the use of superpage mappings for kernel data. - Map memory above the kernel image with NX. The kernel linker now upgrades protections as needed, and other preloaded file types (e.g., entropy, microcode) need not be mapped with execute permissions in the first place. Reviewed by: kib MFC after: 1 month Sponsored by: Netflix Differential Revision: https://reviews.freebsd.org/D21859	2019-10-18 14:05:13 +00:00
Yuri Pankov	a161fba992	linux: futex_mtx should follow futex_list Move futex_mtx to linux_common.ko for amd64 and aarch64 along with respective list/mutex init/destroy. PR: 240989 Reported by: Alex S <iwtcex@gmail.com>	2019-10-18 12:25:33 +00:00
Conrad Meyer	dda17b3672	Implement NetGDB(4) NetGDB(4) is a component of a system using a panic-time network stack to remotely debug crashed FreeBSD kernels over the network, instead of traditional serial interfaces. There are three pieces in the complete NetGDB system. First, a dedicated proxy server must be running to accept connections from both NetGDB and gdb(1), and pass bidirectional traffic between the two protocols. Second, the NetGDB client is activated much like ordinary 'gdb' and similarly to 'netdump' in ddb(4) after a panic. Like other debugnet(4) clients (netdump(4)), the network interface on the route to the proxy server must be online and support debugnet(4). Finally, the remote (k)gdb(1) uses 'target remote <proxy>:<port>' (like any other TCP remote) to connect to the proxy server. The NetGDB v1 protocol speaks the literal GDB remote serial protocol, and uses a 1:1 relationship between GDB packets and sequences of debugnet packets (fragmented by MTU). There is no encryption utilized to keep debugging sessions private, so this is only appropriate for local segments or trusted networks. Submitted by: John Reimer <john.reimer AT emc.com> (earlier version) Discussed some with: emaste, markj Relnotes: sure Differential Revision: https://reviews.freebsd.org/D21568	2019-10-17 21:33:01 +00:00
Conrad Meyer	7790c8c199	Split out a more generic debugnet(4) from netdump(4) Debugnet is a simplistic and specialized panic- or debug-time reliable datagram transport. It can drive a single connection at a time and is currently unidirectional (debug/panic machine transmit to remote server only). It is mostly a verbatim code lift from netdump(4). Netdump(4) remains the only consumer (until the rest of this patch series lands). The INET-specific logic has been extracted somewhat more thoroughly than previously in netdump(4), into debugnet_inet.c. UDP-layer logic and up, as much as possible as is protocol-independent, remains in debugnet.c. The separation is not perfect and future improvement is welcome. Supporting INET6 is a long-term goal. Much of the diff is "gratuitous" renaming from 'netdump_' or 'nd_' to 'debugnet_' or 'dn_' -- sorry. I thought keeping the netdump name on the generic module would be more confusing than the refactoring. The only functional change here is the mbuf allocation / tracking. Instead of initiating solely on netdump-configured interface(s) at dumpon(8) configuration time, we watch for any debugnet-enabled NIC for link activation and query it for mbuf parameters at that time. If they exceed the existing high-water mark allocation, we re-allocate and track the new high-water mark. Otherwise, we leave the pre-panic mbuf allocation alone. In a future patch in this series, this will allow initiating netdump from panic ddb(4) without pre-panic configuration. No other functional change intended. Reviewed by: markj (earlier version) Some discussion with: emaste, jhb Objection from: marius Differential Revision: https://reviews.freebsd.org/D21421	2019-10-17 16:23:03 +00:00
Mark Johnston	6e6d41f20d	Introduce pmap_change_prot() for amd64. This updates the protection attributes of subranges of the kernel map. Unlike pmap_protect(), which is typically used for user mappings, pmap_change_prot() does not perform lazy upgrades of protections. pmap_change_prot() also updates the aliasing range of the direct map. Reviewed by: kib MFC after: 1 month Sponsored by: Netflix Differential Revision: https://reviews.freebsd.org/D21758	2019-10-16 22:12:34 +00:00
Mark Johnston	01cef4caa7	Remove page locking from pmap_mincore(). After r352110 the page lock no longer protects a page's identity, so there is no purpose in locking the page in pmap_mincore(). Instead, if vm.mincore_mapped is set to the non-default value of 0, re-lookup the page after acquiring its object lock, which holds the page's identity stable. The change removes the last callers of vm_page_pa_tryrelock(), so remove it. Reviewed by: kib Sponsored by: Netflix Differential Revision: https://reviews.freebsd.org/D21823	2019-10-16 22:03:27 +00:00
Conrad Meyer	f677fed5a2	ddb: Add support for disassembling 'crc32' on amd64	2019-10-16 18:27:27 +00:00
Andriy Gapon	edca4938f7	itwd(4): driver for watchdog function in ITE Super I/O chips The chips are commonly named with "IT" prefix. MFC after: 19 days	2019-10-16 14:57:38 +00:00
Jeff Roberson	638f867814	(6/6) Convert pmap to expect busy in write related operations now that all callers hold it. This simplifies pmap code and removes a dependency on the object lock. Reviewed by: kib, markj Tested by: pho Sponsored by: Netflix, Intel Differential Revision: https://reviews.freebsd.org/D21596	2019-10-15 03:51:46 +00:00
Jeff Roberson	0012f373e4	(4/6) Protect page valid with the busy lock. Atomics are used for page busy and valid state when the shared busy is held. The details of the locking protocol and valid and dirty synchronization are in the updated vm_page.h comments. Reviewed by: kib, markj Tested by: pho Sponsored by: Netflix, Intel Differential Revision: https://reviews.freebsd.org/D21594	2019-10-15 03:45:41 +00:00
Jeff Roberson	205be21d99	(3/6) Add a shared object busy synchronization mechanism that blocks new page busy acquires while held. This allows code that would need to acquire and release a very large number of page busy locks to use the old mechanism where busy is only checked and not held. This comes at the cost of false positives but never false negatives which the single consumer, vm_fault_soft_fast(), handles. Reviewed by: kib Tested by: pho Sponsored by: Netflix, Intel Differential Revision: https://reviews.freebsd.org/D21592	2019-10-15 03:41:36 +00:00
Mateusz Guzik	f48d651dd9	amd64 pmap: handle fictitious mappings with addresses beyond pv_table There are provisions to do it already with pv_dummy, but new locking code did not account for it. Previous one did not have the problem because it hashed the address into the lock array. While here annotate common vars with __read_mostly and __exclusive_cache_line. Reported by: Thomas Laus Tested by: jkim, Thomas Laus Fixes: r353149 ("amd64 pmap: implement per-superpage locks") Sponsored by: The FreeBSD Foundation	2019-10-11 14:57:47 +00:00
Doug Ambrisko	f2521a76ed	This driver attaches to the Intel VMD drive and connects a new PCI domain starting at the max. domain, and then works down. Then existing FreeBSD drivers will attach. Interrupt routing from the VMD MSI-X to the NVME drive is not well known, so any interrupt is sent to all children that register. VROC used Intel meta data so graid(8) works with it. However, graid(8) supports RAID 0,1,10 for read and write. I have some early code to support writes with RAID 5. Note that RAID 5 can have life issues with SSDs since it can cause write amplification from updating the parity data. Hot plug support needs a change to skip the following check to work: if (pcib_request_feature(dev, PCI_FEATURE_HP) != 0) { in sys/dev/pci/pci_pci.c. Looked at by: imp, rpokala, bcr Differential Revision: https://reviews.freebsd.org/D21383	2019-10-10 03:12:17 +00:00
Mateusz Guzik	fa43c5d49e	amd64: plug spurious cld instructions ABI already guarantees the direction is forward. Note this does not take care of i386-specific cld's. Reviewed by: kib Sponsored by: The FreeBSD Foundation Differential Revision: https://reviews.freebsd.org/D21906	2019-10-08 21:14:11 +00:00
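
Commit 5e921ff49e above argues that moving the 320-byte pcb into struct thread does not increase zone memory use, because two thread structures fit in a 4KB slab both before and after the change. The following is a minimal sketch of that arithmetic only. The three sizes are the figures quoted in the commit message; the helper name `items_per_slab` is hypothetical, and the model assumes the whole slab is available for items, ignoring any per-slab header or alignment that UMA may apply.

```python
# Figures quoted in the message of commit 5e921ff49e.
THREAD_SIZE = 1392  # bytes: struct thread on amd64 before the change
PCB_SIZE = 320      # bytes: pcb moved from the kernel stack into struct thread
SLAB_SIZE = 4096    # bytes: one 4KB slab


def items_per_slab(item_size, slab_size=SLAB_SIZE):
    """Whole items that fit in one slab.

    Assumption: the full slab is usable; slab headers and alignment
    padding are deliberately not modelled.
    """
    return slab_size // item_size


before = items_per_slab(THREAD_SIZE)
after = items_per_slab(THREAD_SIZE + PCB_SIZE)

# Bytes left over in the slab once the items are placed.
slack_before = SLAB_SIZE - before * THREAD_SIZE
slack_after = SLAB_SIZE - after * (THREAD_SIZE + PCB_SIZE)

print(f"before: {before} per slab, {slack_before} bytes unused")
print(f"after:  {after} per slab, {slack_after} bytes unused")
```

Under these assumptions the count per slab is unchanged, so the extra 320 bytes per thread come out of space that was previously unused in each slab.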

8130 Commits