freebsd-nq

Author	SHA1	Message	Date
Roger Pau Monne	ed78016d00	xen/privcmd: implement the dm op ioctl Use an interface compatible with the Linux one so that the user-space libraries already using the Linux interface can be used without much modification. This allows user-space to make use of the dm_op family of hypercalls, which are used by device models. Sponsored by: Citrix Systems R&D	2021-01-11 16:33:27 +01:00
Konstantin Belousov	45974de8fb	x86: Add rdtscp32() into cpufunc.h. Suggested by: markj MFC after: 1 week Sponsored by: The FreeBSD Foundation Differential Revision: https://reviews.freebsd.org/D27986	2021-01-10 04:42:34 +02:00
Mitchell Horne	72939459bd	amd64: use register macros for gdb_cpu_getreg() Prefer these newly-added definitions to bare values. MFC after: 2 weeks Sponsored by: NetApp, Inc. Sponsored by: Klara, Inc.	2020-12-18 16:16:03 +00:00
Mitchell Horne	0ef474de88	amd64: allow gdb(4) to write to most registers Similar to the recent patch to arm's gdb stub in r368414, allow GDB to update the contents of most general purpose registers. Reviewed by: cem, jhb, markj MFC after: 2 weeks Sponsored by: NetApp, Inc. Sponsored by: Klara, Inc. NetApp PR: 44 Differential Revision: https://reviews.freebsd.org/D27642	2020-12-18 16:09:24 +00:00
Peter Grehan	15add60d37	Convert vmm_ops calls to IFUNC There is no need for these to be function pointers since they are never modified post-module load. Rename AMD/Intel ops to be more consistent. Submitted by: adam_fenn.io Reviewed by: markj, grehan Approved by: grehan (bhyve) MFC after: 3 weeks Differential Revision: https://reviews.freebsd.org/D27375	2020-11-28 01:16:59 +00:00
Maxim Sobolev	fd2ef8ef5a	Unobfuscate "KERNLOAD" parameter on amd64. This change lines up amd64 with i386 and the rest of the supported architectures by defining KERNLOAD in vmparam.h and getting rid of the magic constant in the linker script, which, albeit documented via a comment, isn't programmatically accessible at compile time. Use KERNLOAD to eliminate another (matching) magic constant 100 lines down inside unremarkable TU "copy.c" 3 levels deep in the EFI loader tree. Reviewed by: markj Approved by: markj MFC after: 1 month Differential Revision: https://reviews.freebsd.org/D27355	2020-11-25 23:19:01 +00:00
John Baldwin	1925586e03	Honor the disabled setting for MSI-X interrupts for passthrough devices. Add a new ioctl to disable all MSI-X interrupts for a PCI passthrough device and invoke it if a write to the MSI-X capability registers disables MSI-X. This avoids leaving MSI-X interrupts enabled on the host if a guest device driver has disabled them (e.g. as part of detaching a guest device driver). This was found by Chelsio QA when testing that a Linux guest could switch from MSI-X to MSI interrupts when using the cxgb4vf driver. While here, explicitly fail requests to enable MSI on a passthrough device if MSI-X is enabled and vice versa. Reported by: Sony Arpita Das @ Chelsio Reviewed by: grehan, markj MFC after: 2 weeks Sponsored by: Chelsio Communications Differential Revision: https://reviews.freebsd.org/D27212	2020-11-24 23:18:52 +00:00
Mark Johnston	6f5a960678	vmm: Make pmap_invalidate_ept() wait synchronously for guest exits Currently EPT TLB invalidation is done by incrementing a generation counter and issuing an IPI to all CPUs currently running vCPU threads. The VMM inner loop caches the most recently observed generation on each host CPU and invalidates TLB entries before executing the VM if the cached generation number is not the most recent value. pmap_invalidate_ept() issues IPIs to force each vCPU to stop executing guest instructions and reload the generation number. However, it does not actually wait for vCPUs to exit, potentially creating a window where guests may continue to reference stale TLB entries. Fix the problem by bracketing guest execution with an SMR read section which is entered before loading the invalidation generation. Then, pmap_invalidate_ept() increments the current write sequence before loading pm_active and sending IPIs, and polls readers to ensure that all vCPUs potentially operating with stale TLB entries have exited before pmap_invalidate_ept() returns. Also ensure that unsynchronized loads of the generation counter are wrapped with atomic(9), and stop (inconsistently) updating the invalidation counter and pm_active bitmask with acquire semantics. Reviewed by: grehan, kib Sponsored by: The FreeBSD Foundation Differential Revision: https://reviews.freebsd.org/D26910	2020-11-11 15:01:17 +00:00
Mark Johnston	cff169880e	amd64: Make it easier to configure exception stack sizes The amd64 kernel handles certain types of exceptions on a dedicated stack. Currently the sizes of these stacks are all hard-coded to PAGE_SIZE, but for at least NMI handling it can be useful to use larger stacks. Add constants to intr_machdep.h to make this easier to tweak. No functional change intended. Reviewed by: kib MFC after: 1 week Sponsored by: NetApp, Inc. Sponsored by: Klara, Inc. Differential Revision: https://reviews.freebsd.org/D27076	2020-11-04 16:42:20 +00:00
Konstantin Belousov	546df7a45d	amd64 pmap.h: explicitly provide constants values instead of relying on some more advanced C features. This fixes gcc-toolchain build of exception.S. Reported and tested by: kevans Sponsored by: The FreeBSD Foundation MFC after: 1 week	2020-10-16 16:22:32 +00:00
Konstantin Belousov	e406235000	Fix for mis-interpretation of PCB_KERNFPU. Right now PCB_KERNFPU is used both as indication that kernel prepared hardware FPU context to use and that the thread is fpu-kern thread. This also breaks fpu_kern_enter(FPU_KERN_NOCTX), since fpu_kern_leave() then clears PCB_KERNFPU. Introduce new flag PCB_KERNFPU_THR which indicates that the thread is fpu-kern. Do not clear PCB_KERNFPU if fpu-kern thread leaves noctx fpu region. Reported and tested by: jhb (amd64) Sponsored by: The FreeBSD Foundation MFC after: 1 week Differential revision: https://reviews.freebsd.org/D25511	2020-10-14 23:01:41 +00:00
Konstantin Belousov	df01340989	amd64: Store full 64bit of FIP/FDP for 64bit processes when using XSAVE. If current process is 64bit, use rex-prefixed version of XSAVE (XSAVE64). If current process is 32bit and CPU supports saving segment registers cs/ds in the FPU save area, use non-prefixed variant of XSAVE. Reported and tested by: Michał Górny <mgorny@moritz.systems> PR: 250043 Reviewed by: emaste, markj Sponsored by: The FreeBSD Foundation MFC after: 1 week Differential revision: https://reviews.freebsd.org/D26643	2020-10-03 23:17:29 +00:00
Konstantin Belousov	5e8ea68fd8	Move ctx_switch_xsave declaration to amd64 md_var.h. Sponsored by: The FreeBSD Foundation MFC after: 3 days	2020-10-03 23:07:09 +00:00
Edward Tomasz Napierala	1e2521ffae	Get rid of sa->narg. It serves no purpose; use sa->callp->sy_narg instead. Reviewed by: kib Sponsored by: DARPA Differential Revision: https://reviews.freebsd.org/D26458	2020-09-27 18:47:06 +00:00
Mark Johnston	78257765f2	Add a vmparam.h constant indicating pmap support for large pages. Enable SHM_LARGEPAGE support on arm64. Reviewed by: alc, kib Sponsored by: Juniper Networks, Inc., Klara, Inc. Differential Revision: https://reviews.freebsd.org/D26467	2020-09-23 19:34:21 +00:00
D Scott Phillips	00e6614750	Sparsify the vm_page_dump bitmap On Ampere Altra systems, the sparse population of RAM within the physical address space causes the vm_page_dump bitmap to be much larger than necessary, increasing the size from ~8 MiB to > 2 GiB (and overflowing `int` for the size). Changing the page dump bitmap also changes the minidump file format, so changes are also necessary in libkvm. Reviewed by: jhb Approved by: scottl (implicit) MFC after: 1 week Sponsored by: Ampere Computing, Inc. Differential Revision: https://reviews.freebsd.org/D26131	2020-09-21 22:21:59 +00:00
D Scott Phillips	ab041f713a	Move vm_page_dump bitset array definition to MI code These definitions were repeated by all architectures, with small variations. Consolidate the common definitions in machine independent code and use bitset(9) macros for manipulation. Many opportunities for deduplication remain in the machine dependent minidump logic. The only intended functional change is increasing the bit index type to vm_pindex_t, allowing the indexing of pages with address of 8 TiB and greater. Reviewed by: kib, markj Approved by: scottl (implicit) MFC after: 1 week Sponsored by: Ampere Computing, Inc. Differential Revision: https://reviews.freebsd.org/D26129	2020-09-21 22:20:37 +00:00
Mark Johnston	2d838cd867	Add the MEM_EXTRACT_PADDR ioctl to /dev/mem. This allows privileged userspace processes to find information about the physical page backing a given mapping. It is useful in applications such as DPDK which perform some of their own memory management. Reviewed by: kib, jhb (previous version) MFC after: 2 weeks Sponsored by: Juniper Networks, Inc. Sponsored by: Klara Inc. Differential Revision: https://reviews.freebsd.org/D26237	2020-09-02 18:12:47 +00:00
Mateusz Guzik	543769bf83	amd64: clean up empty lines in .c and .h files	2020-09-01 21:16:54 +00:00
Konstantin Belousov	f3eb12e4a6	Add bhyve support for LA57 guest mode. Noted and reviewed by: grehan Sponsored by: The FreeBSD Foundation Differential revision: https://reviews.freebsd.org/D25273	2020-08-23 20:37:21 +00:00
Konstantin Belousov	9ce875d9b5	amd64 pmap: LA57 AKA 5-level paging Since LA57 was moved to the main SDM document with revision 072, it seems that we should have support for it, and silicons are coming. This patch makes pmap support both LA48 and LA57 hardware. The selection of page table level is done at startup, kernel always receives control from loader with 4-level paging. It is not clear how UEFI spec would adapt LA57, for instance it could hand out control in LA57 mode sometimes. To switch from LA48 to LA57 requires turning off long mode, requesting LA57 in CR4, then re-entering long mode. This is somewhat delicate and done in pmap_bootstrap_la57(). AP startup in LA57 mode is much easier, we only need to toggle a bit in CR4 and load right value in CR3. I decided to not change kernel map for now. Single PML5 entry is created that points to the existing kernel_pml4 (KML4Phys) page, and a pml5 entry to create our recursive mapping for vtopte()/vtopde(). This decision is motivated by the fact that we cannot overcommit for KVA, so large space there is unusable until machines start providing wider physical memory addressing. Another reason is that I do not want to break our fragile autotuning, so the KVA expansion is not included into this first step. Nice side effect is that minidumps are compatible. On the other hand, (very) large address space is definitely immediately useful for some userspace applications. For userspace, numbering of pte entries (or page table pages) is always done for 5-level structures even if we operate in 4-level mode. The pmap_is_la57() function is added to report the mode of the specified pmap, this is done not to allow simultaneous 4-/5-levels (which is not allowed by hw), but to accommodate for EPT which has separate level control and in principle might not allow 5-level EPT despite x86 paging supports it. Anyway, it does not seem critical to have 5-level EPT support now. Tested by: pho (LA48 hardware) Reviewed by: alc Sponsored by: The FreeBSD Foundation Differential revision: https://reviews.freebsd.org/D25273	2020-08-23 20:19:04 +00:00
Peter Grehan	f5f5f1e7d6	Support guest rdtscp and rdpid instructions on Intel VT-x Enable any of rdtscp and/or rdpid for bhyve guests on Intel-based hosts that support the "enable RDTSCP" VM-execution control. Submitted by: adam_fenn.io Reported by: chuck Reviewed by: chuck, grehan, jhb Approved by: jhb (bhyve), grehan MFC after: 3 weeks Relnotes: Yes Differential Revision: https://reviews.freebsd.org/D26003	2020-08-18 07:23:47 +00:00
Ruslan Bukin	c4cd699010	o Add machine/iommu.h and include MD iommu headers from it, so we don't ifdef for every arch in busdma_iommu.c; o No need to include specialreg.h for x86, remove it. Requested by: andrew Reviewed by: kib Sponsored by: DARPA/AFRL Differential Revision: https://reviews.freebsd.org/D25957	2020-08-05 19:11:31 +00:00
Alexander Motin	aba10e131f	Allow swi_sched() to be called from NMI context. For purposes of handling hardware error reported via NMIs I need a way to escape NMI context, being too restrictive to do something significant. To do it this change introduces new swi_sched() flag SWI_FROMNMI, making it careful about used KPIs. On platforms allowing IPI sending from NMI context (x86 for now) it immediately wakes clk_intr_event via new IPI_SWI, otherwise it works just like SWI_DELAY. To handle the delayed SWIs this patch calls clk_intr_event on every hardclock() tick. MFC after: 2 weeks Sponsored by: iXsystems, Inc. Differential Revision: https://reviews.freebsd.org/D25754	2020-07-25 15:19:38 +00:00
Konstantin Belousov	3ec7e1695c	amd64 pmap: microoptimize local shootdowns for PCID PTI configurations When pmap operates in PTI mode, we must reload %cr3 on return to userspace. In non-PCID mode the reload always flushes all non-global TLB entries and we take advantage of it by only invalidating the KPT TLB entries (there is no cached UPT entries at all). In PCID mode, we flush both KPT and UPT TLB explicitly, but we can take advantage of the fact that PCID mode command to reload %cr3 includes a flag to flush/not flush target TLB. In particular, we can avoid the flush for UPT, instead record that load of pc_ucr3 into %cr3 on return to usermode should be flushing. This is done by providing either all-1s or ~CR3_PCID_MASK in pc_ucr3_load_mask. The mask is automatically reset to all-1s on return to usermode. Similarly, we can avoid flushing UPT TLB on context switch, replacing it by setting pc_ucr3_load_mask. This unifies INVPCID and non-INVPCID PTI ifunc, leaving only 4 cases instead of 6. This trick is also applicable both to the TLB shootdown IPI handlers, since handlers interrupt the target thread. But then we need to check pc_curpmap in handlers, and this would reopen the same race for INVPCID machines as was fixed in r306350 for non-INVPCID. To not introduce the same bug, unconditionally do spinlock_enter() in pmap_activate(). Reviewed by: alc, markj Tested by: pho Sponsored by: The FreeBSD Foundation MFC after: 3 weeks Differential revision: https://reviews.freebsd.org/D25483	2020-07-18 18:19:57 +00:00
Mateusz Guzik	c4e64133d8	amd64: patch ffsl to use the compiler builtin This shortens fdalloc by over 60 bytes. Correctness verified by running both variants at the same time and comparing the result of each call. Note someone(tm) should make a pass at converting everything else feasible.	2020-07-16 11:28:24 +00:00
Konstantin Belousov	dc43978aa5	amd64: allow parallel shootdown IPIs Stop using smp_ipi_mtx to protect global shootdown state, and move/multiply the global state into pcpu. Now each CPU can initiate shootdown IPI independently from other CPUs. Initiator enters critical section, then fills its local PCPU shootdown info (pc_smp_tlb_XXX), then clears scoreboard generation at location (cpu, my_cpuid) for each target cpu. After that IPI is sent to all targets which scan for zeroed scoreboard generation words. Upon finding such word the shootdown data is read from corresponding cpu's pcpu, and generation is set. Meantime initiator loops waiting for all zeroed generations in scoreboard to update. Initiator does not disable interrupts, which should prevent non-invalidation IPIs from deadlocking, it only needs to disable preemption to pin itself to the instance of the pcpu smp_tlb data. The generation is set before the actual invalidation is performed in handler. It is safe because target CPU cannot return to userspace before handler finishes. In principle only NMI can preempt the handler, but NMI would see the kernel handler frame and not touch not-invalidated user page table. Handlers loop until they do not see zeroed scoreboard generations. This, together with hardware keeping one pending IPI in LAPIC IRR should prevent lost shootdowns. Notes. 1. The code does not protect writes to LAPIC ICR with exclusion. I believe this is fine because we in fact do not send IPIs from interrupt handlers. More for !x2APIC mode where ICR access for write requires two registers write, we disable interrupts around it. If considered incorrect, I can add per-cpu spinlock around ipi_send(). 2. Scoreboard lines owned by given target CPU can be padded to the cache line, to reduce ping-pong. Reviewed by: markj (previous version) Discussed with: alc Tested by: pho Sponsored by: The FreeBSD Foundation MFC after: 3 weeks Differential revision: https://reviews.freebsd.org/D25510	2020-07-14 20:37:50 +00:00
Conrad Meyer	c74a3041f0	Add domain policy allocation for amd64 fpu_kern_ctx Like other types of allocation, fpu_kern_ctx are frequently allocated per-cpu. Provide the API and sketch some example consumers. fpu_kern_alloc_ctx_domain() preferentially allocates memory from the provided domain, and falls back to other domains if that one is empty (DOMAINSET_PREF(domain) policy). Maybe it makes more sense to just shove one of these in the DPCPU area sooner or later -- left for future work. Reviewed by: markj Differential Revision: https://reviews.freebsd.org/D22053	2020-07-03 14:54:46 +00:00
Conrad Meyer	4daa95f85d	bhyve(8): For prototyping, reattempt decode in userspace If userspace has a newer bhyve than the kernel, it may be able to decode and emulate some instructions vmm.ko is unaware of. In this scenario, reset decoder state and try again. Reviewed by: grehan Differential Revision: https://reviews.freebsd.org/D24464	2020-06-25 00:18:42 +00:00
Conrad Meyer	f4ce062964	vmm(4): Add 12 user ABI compat after r349948 Reported by: kp Reviewed by: jhb, kp Tested by: kp Differential Revision: https://reviews.freebsd.org/D24929	2020-05-20 17:27:54 +00:00
Conrad Meyer	8a68ae80f6	vmm(4), bhyve(8): Expose kernel-emulated special devices to userspace Expose the special kernel LAPIC, IOAPIC, and HPET devices to userspace for use in, e.g., fallback instruction emulation (when userspace has a newer instruction decode/emulation layer than the kernel vmm(4)). Plumb the ioctl through libvmmapi and register the memory ranges in bhyve(8). Reviewed by: grehan Differential Revision: https://reviews.freebsd.org/D24525	2020-05-15 15:54:22 +00:00
John Baldwin	483d953a86	Initial support for bhyve save and restore. Save and restore (also known as suspend and resume) permits a snapshot to be taken of a guest's state that can later be resumed. In the current implementation, bhyve(8) creates a UNIX domain socket that is used by bhyvectl(8) to send a request to save a snapshot (and optionally exit after the snapshot has been taken). A snapshot currently consists of two files: the first holds a copy of guest RAM, and the second file holds other guest state such as vCPU register values and device model state. To resume a guest, bhyve(8) must be started with a matching pair of command line arguments to instantiate the same set of device models as well as a pointer to the saved snapshot. While the current implementation is useful for several use cases, it has a few limitations. The file format for saving the guest state is tied to the ABI of internal bhyve structures and is not self-describing (in that it does not communicate the set of device models present in the system). In addition, the state saved for some device models closely matches the internal data structures which might prove a challenge for compatibility of snapshot files across a range of bhyve versions. The file format also does not currently support versioning of individual chunks of state. As a result, the current file format is not a fixed binary format and future revisions to save and restore will break binary compatibility of snapshot files. The goal is to move to a more flexible format that adds versioning, etc. and at that point to commit to providing a reasonable level of compatibility. As a result, the current implementation is not enabled by default. It can be enabled via the WITH_BHYVE_SNAPSHOT=yes option for userland builds, and the kernel option BHYVE_SNAPSHOT. Submitted by: Mihai Tiganus, Flavius Anton, Darius Mihai Submitted by: Elena Mihailescu, Mihai Carabas, Sergiu Weisz Relnotes: yes Sponsored by: University Politehnica of Bucharest Sponsored by: Matthew Grooms (student scholarships) Sponsored by: iXsystems Differential Revision: https://reviews.freebsd.org/D19495	2020-05-05 00:02:04 +00:00
Conrad Meyer	cfdea69d24	vmm(4): Decode 3-byte VEX-prefixed instructions Reviewed by: grehan Differential Revision: https://reviews.freebsd.org/D24462	2020-04-21 21:33:06 +00:00
Conrad Meyer	497cb9259b	vmm.h: Add ABI assertions and mark implicit holes The static assertions were added (with size and offsets from gdb) and verified with a build prior to marking the holes explicitly. This is in preparation for a subsequent revision, pending in phabricator, that makes use of some of these unused bits without impacting the ABI. Reviewed by: grehan Differential Revision: https://reviews.freebsd.org/D24461	2020-04-17 15:19:42 +00:00
Conrad Meyer	b645fd4531	vmm(4): Expose instruction decode to userspace build Permit instruction decoding logic to be compiled outside of the kernel for rapid iteration and validation. Reviewed by: grehan Differential Revision: https://reviews.freebsd.org/D24439	2020-04-16 16:50:33 +00:00
Conrad Meyer	ca0ec73c11	Expand generic subword atomic primitives The goal of this change is to make the atomic_load_acq_{8,16}, atomic_testandset{,_acq}_long, and atomic_testandclear_long primitives available in MI-namespace. The second goal is to get this draft out of my local tree, as anything that requires a full tinderbox is a big burden out of tree. MD specifics can be refined individually afterwards. The generic implementations may not be ideal for your architecture; feel free to implement better versions. If no subword_atomic definitions are needed, the include can be removed from your arch's machine/atomic.h. Generic definitions are guarded by defined macros of the same name. To avoid picking up conflicting generic definitions, some macro defines are added to various MD machine/atomic.h to register an existing implementation. Include _atomic_subword.h in arm and arm64 machine/atomic.h. For some odd reason, KCSAN only generates some versions of primitives. Generate the _acq variants of atomic_load._8, atomic_load._16, and atomic_testandset.*_long. There are other questionably disabled primitives, but I didn't run into them, so I left them alone. KCSAN is only built for amd64 in tinderbox for now. Add atomic_subword implementations of atomic_load_acq_{8,16} implemented using masking and atomic_load_acq_32. Add generic atomic_subword implementations of atomic_testandset_long(), atomic_testandclear_long(), and atomic_testandset_acq_long(), using atomic_fcmpset_long() and atomic_fcmpset_acq_long(). On x86, add atomic_testandset_acq_long as an alias for atomic_testandset_long. Reviewed by: kevans, rlibby (previous versions both) Differential Revision: https://reviews.freebsd.org/D22963	2020-03-25 23:12:43 +00:00
Ryan Libby	6d1a70dd0a	amd64 atomic.h: minor codegen optimization in flag access Previously the pattern to extract status flags from inline assembly blocks was to use setcc in the block to write the flag to a register. This was suboptimal in a few ways: - It would lead to code like: sete %cl; test %cl; jne, i.e. a flag would just be loaded into a register and then reloaded to a flag. - The setcc would force the block to use an additional register. - If the client code didn't care for the flag value then the setcc would be entirely pointless but could not be eliminated by the optimizer. A more modern inline asm construct (since gcc 6 and clang 9) allows for "flag output operands", where a C variable can be written directly from a flag. The optimizer can then use this to produce direct code where the flag does not take a trip through a register. In practice this makes each affected operation sequence shorter by five bytes of instructions. It's unlikely this has a measurable performance impact. Reviewed by: kib, markj, mjg Sponsored by: Dell EMC Isilon Differential Revision: https://reviews.freebsd.org/D23869	2020-02-28 18:32:36 +00:00
Mateusz Guzik	2318ed2508	amd64: provide custom zpcpu set/add/sub routines Note that clobbers are highly overzealous, can be cleaned up later.	2020-02-12 11:15:33 +00:00
Mateusz Guzik	fb886947d9	amd64: store per-cpu allocations subtracted by __pcpu This eliminates a runtime subtraction from counter_u64_add. before: mov 0x4f00ed(%rip),%rax # 0xffffffff80c01788 <numfullpathfail4> sub 0x808ff6(%rip),%rax # 0xffffffff80f1a698 <__pcpu> addq $0x1,%gs:(%rax) after: mov 0x4f02fd(%rip),%rax # 0xffffffff80c01788 <numfullpathfail4> addq $0x1,%gs:(%rax) Reviewed by: jeff Differential Revision: https://reviews.freebsd.org/D23570	2020-02-12 11:12:13 +00:00
Mateusz Guzik	e2b81f518a	amd64: clean up counter(9) - stop open-coding access to per-cpu data, use common macros instead - consistently use counter_t type where appropriate	2020-02-07 16:22:02 +00:00
Mark Johnston	c3d326fd44	Define MAXCPU consistently between the kernel and KLDs. This reverts r177661. The change is no longer very useful since out-of-tree KLDs will be built to target SMP kernels anyway. Moreover it breaks the KBI in !SMP builds since cpuset_t's layout depends on the value of MAXCPU, and several kernel interfaces, notably smp_rendezvous_cpus(), take a cpuset_t as a parameter. PR: 243711 Reviewed by: jhb, kib Sponsored by: The FreeBSD Foundation Differential Revision: https://reviews.freebsd.org/D23512	2020-02-05 19:08:21 +00:00
Konstantin Belousov	b837dadd87	bhyve: terminate waiting loops if thread suspension is requested. PR: 242724 Reviewed by: markj Reported and tested by: Aleksandr Fedorov <aleksandr.fedorov@itglobal.com> (previous version) Sponsored by: The FreeBSD Foundation MFC after: 1 week Differential revision: https://reviews.freebsd.org/D22881	2020-01-02 22:37:04 +00:00
John Baldwin	cbd03a9df2	Support software breakpoints in the debug server on Intel CPUs. - Allow the userland hypervisor to intercept breakpoint exceptions (BP#) in the guest. A new capability (VM_CAP_BPT_EXIT) is used to enable this feature. These exceptions are reported to userland via a new VM_EXITCODE_BPT that includes the length of the original breakpoint instruction. If userland wishes to pass the exception through to the guest, it must be explicitly re-injected via vm_inject_exception(). - Export VMCS_ENTRY_INST_LENGTH as a VM_REG_GUEST_ENTRY_INST_LENGTH pseudo-register. Injecting a BP# on Intel requires setting this to the length of the breakpoint instruction. AMD SVM currently ignores writes to this register (but reports success) and fails to read it. - Rework the per-vCPU state tracked by the debug server. Rather than a single 'stepping_vcpu' global, add a structure for each vCPU that tracks state about that vCPU ('stepping', 'stepped', and 'hit_swbreak'). A global 'stopped_vcpu' tracks which vCPU is currently reporting an event. Event handlers for MTRAP and breakpoint exits loop until the associated event is reported to the debugger. Breakpoint events are discarded if the breakpoint is not present when a vCPU resumes in the breakpoint handler to retry submitting the breakpoint event. - Maintain a linked-list of active breakpoints in response to the GDB 'Z0' and 'z0' packets. Reviewed by: markj (earlier version) MFC after: 2 months Differential Revision: https://reviews.freebsd.org/D20309	2019-12-13 19:21:58 +00:00
Mark Johnston	5cff1f4dc3	Introduce vm_page_astate. This is a 32-bit structure embedded in each vm_page, consisting mostly of page queue state. The use of a structure makes it easy to store a snapshot of a page's queue state in a stack variable and use cmpset loops to update that state without requiring the page lock. This change merely adds the structure and updates references to atomic state fields. No functional change intended. Reviewed by: alc, jeff, kib Sponsored by: Netflix, Intel Differential Revision: https://reviews.freebsd.org/D22650	2019-12-10 18:14:50 +00:00
Warner Losh	f86e60008b	Regularize my copyright notice o Remove All Rights Reserved from my notices o imp@FreeBSD.org everywhere o regularize punctuation, eliminate date ranges o Make sure that it's clear that I don't claim All Rights reserved by listing All Rights Reserved on same line as other copyright holders (but not me). Other such holders are also listed last where it's clear.	2019-12-04 16:56:11 +00:00
Konstantin Belousov	13189065cb	amd64: assert that EARLY_COUNTER does not corrupt memory. Reviewed by: imp Sponsored by: The FreeBSD Foundation MFC after: 1 week Differential revision: https://reviews.freebsd.org/D22514	2019-11-24 19:02:13 +00:00
Andrew Turner	68cad68149	Add kcsan_md_unsupported from NetBSD. It's used to ignore virtual addresses that may have a different physical address depending on the CPU. Sponsored by: DARPA, AFRL	2019-11-21 13:22:23 +00:00
Andrew Turner	849aef496d	Port the NetBSD KCSAN runtime to FreeBSD. Update the NetBSD Kernel Concurrency Sanitizer (KCSAN) runtime to work in the FreeBSD kernel. It is a useful tool for finding data races between threads executing on different CPUs. This can be enabled by enabling KCSAN in the kernel config, or by using the GENERIC-KCSAN amd64 kernel. It works on amd64 and arm64, however the latter needs a compiler change to allow -fsanitize=thread that KCSAN uses. Sponsored by: DARPA, AFRL Differential Revision: https://reviews.freebsd.org/D22315	2019-11-21 11:22:08 +00:00
Konstantin Belousov	c08973d09c	Workaround for Intel SKL002/SKL012S errata. Disable the use of executable 2M page mappings in EPT-format page tables on affected CPUs. For bhyve virtual machines, this effectively disables all use of superpage mappings on affected CPUs. The vm.pmap.allow_2m_x_ept sysctl can be set to override the default and enable mappings on affected CPUs. Alternate approaches have been suggested, but at present we do not believe the complexity is warranted for typical bhyve use cases. Reviewed by: alc, emaste, markj, scottl Security: CVE-2018-12207 Sponsored by: The FreeBSD Foundation Differential revision: https://reviews.freebsd.org/D21884	2019-11-12 18:01:33 +00:00
Konstantin Belousov	a7af4a3e7d	amd64: move GDT into PCPU area. Reviewed by: jhb, markj Tested by: pho Sponsored by: The FreeBSD Foundation MFC after: 1 week Differential revision: https://reviews.freebsd.org/D22302	2019-11-12 15:51:47 +00:00
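
Note on c4e64133d8 (amd64: patch ffsl to use the compiler builtin): the message says correctness was verified by running both variants and comparing the result of each call. The sketch below shows what such a comparison might look like in userspace. It is not the code from the commit: ffsl_loop(), ffsl_builtin() and check_ffsl_variants() are hypothetical names chosen here, the loop-based variant is a stand-in for whatever implementation was replaced, and __builtin_ffsl() is assumed to be provided by the compiler (GCC and Clang both do).

```c
#include <assert.h>
#include <limits.h>

/*
 * Reference variant (hypothetical stand-in): scan upward from the least
 * significant bit. Returns the 1-based index of the lowest set bit, or 0
 * when no bit is set.
 */
static int
ffsl_loop(long mask)
{
	unsigned long m = (unsigned long)mask;
	int bit;

	if (m == 0)
		return (0);
	for (bit = 1; (m & 1UL) == 0; bit++)
		m >>= 1;
	return (bit);
}

/* Builtin variant: defer to the compiler's __builtin_ffsl(). */
static int
ffsl_builtin(long mask)
{
	return (__builtin_ffsl(mask));
}

/*
 * Run both variants on the same inputs and compare the result of each
 * call: zero, every single-bit mask, and every mask with all bits set
 * from position i upward.
 */
static void
check_ffsl_variants(void)
{
	const int nbits = (int)(sizeof(long) * CHAR_BIT);
	int i;

	assert(ffsl_loop(0) == ffsl_builtin(0));
	for (i = 0; i < nbits; i++) {
		long one = (long)(1UL << i);
		long run = (long)(~0UL << i);

		assert(ffsl_loop(one) == ffsl_builtin(one));
		assert(ffsl_loop(run) == ffsl_builtin(run));
	}
}
```

Calling check_ffsl_variants() from a small test program aborts on the first disagreement; a more thorough check might also feed both variants random masks.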


2068 Commits