freebsd-dev

Author	SHA1	Message	Date
Conrad Meyer	83dc49beaf	x86: Define pc_monitorbuf as a logical structure Rather than just accessing it via pointer cast. No functional change intended. Discussed with: kib (earlier version) Sponsored by: Dell EMC Isilon Differential Revision: https://reviews.freebsd.org/D20135	2019-05-04 17:35:13 +00:00
Konstantin Belousov	9891fa5592	Remove witness warning, same as r346351 for busdma_dmar. bounce_bus_dmamap_create() does not sleep either. Sponsored by: Mellanox Technologies MFC after: 1 week	2019-04-28 18:45:44 +00:00
Conrad Meyer	f1498d7aa3	x86: Halt non-BSP CPUs on panic IPI_STOP We may need the BSP to reboot, but we don't need any AP CPU that isn't the panic thread. Any CPU landing in this routine during panic isn't the panic thread, so we can just detect !BSP && panic and shut down the logical core. The savings can be demonstrated in a bhyve guest with multiple cores; before this change, N guest threads would spin at 100% CPU. After this change, only one or two threads spin (depending on whether the panicking CPU was the BSP or not). Konstantin points out that this may break any future patches which allow switching ddb(4) CPUs after panic and examining CPU-local state that cannot be inspected remotely. In the event that such a mechanism is incorporated, this behavior could be made configurable by tunable/sysctl. Reviewed by: kib Sponsored by: Dell EMC Isilon Differential Revision: https://reviews.freebsd.org/D20019	2019-04-24 18:24:22 +00:00
Tycho Nightingale	96ca24dc32	remove the 4GB boundary requirement on PCI DMA segments Reviewed by: kib Discussed with: jhb Sponsored by: Dell EMC Isilon Differential Revision: https://reviews.freebsd.org/D19867	2019-04-19 13:43:33 +00:00
Konstantin Belousov	2d8bfbdcb2	Use correct type name. Sponsored by: Mellanox Technologies MFC after: 1 week	2019-04-18 15:31:03 +00:00
Konstantin Belousov	f9feb09189	Correct handling of RMRR during early enumeration stages. On some machines, DMAR contexts must be created before all devices under the scope of the corresponding DMAR unit are enumerated. Current code has two problems with that: - scope lookup returns NULL device_t, which causes creation of a context with RMRR to be skipped, which is fatal for the affected device. - calculation of the final pci dbsf address fails if any bridge in the scope is not yet enumerated, because code relies on pcib_get_bus(). Make creation of contexts work either with device_t, or with DMAR PCI scope paths. Scope provides enough information to infer context address, and it is directly matched against DMAR tables scopes. When calculating bus addresses for the scope or device, use direct pci_cfgregread(PCIR_SECBUS_1) to get the secondary bus number, instead of pcib_get_bus(). The issue was observed on HP Gen servers, where iLO PCI devices are located behind south bridge switch. Turning on translation without satisfying RMRR requests caused iLO to mostly hang, up to the level of being unusable to control the server. While there, remove hw.dmar.dmar_match_verbose tunable, and make the normal logging under bootverbose useful and sufficient to diagnose DRHD and RMRR parsing and matching. Sponsored by: Mellanox Technologies MFC after: 1 week	2019-04-18 14:18:06 +00:00
Konstantin Belousov	c07640c430	Remove witness warning. dmar_bus_dmamap_create() does not sleep. Sponsored by: Mellanox Technologies MFC after: 1 week	2019-04-18 14:03:59 +00:00
Konstantin Belousov	1ad4a0314f	Reduce verbosity, do not announce details of irte programming by default. Sponsored by: Mellanox Technologies MFC after: 1 week	2019-04-18 14:02:33 +00:00
Konstantin Belousov	2a508645b4	pci_cfgreg.c: Use io port config access for early boot time. Some early PCIe chipsets are explicitly listed in the white-list to enable use of the MMIO config space accesses, perhaps because ACPI tables were not a reliable source of the base MCFG address at that time. For those chipsets, MCFG base was read from the known chipset MCFGbase config register. During very early stage of boot, when access to the PCI config space is performed (see e.g. pci_early_quirks.c), we cannot map 255MB of registers because the method used with pre-boot pmap overflows initial kernel page tables. Move fallback to read MCFGbase to the attachment method of the x86/legacy device, which removes code duplication, and results in the use of io accesses until MCFG is parsed or legacy attach called. For amd64, pre-initialize cfgmech with CFGMECH_1, right now we dynamically assign CFGMECH_1 to it anyway, and remove checks for CFGMECH_NONE. There is a mention in the Intel documentation for corresponding chipsets that OS must use either io port or MMIO access method, but we already break this rule by reading MCFGbase register, so one more access seems to be innocent. Reported by: longwitz@incore.de PR: 236838 Reviewed by: avg (other version), jhb Sponsored by: The FreeBSD Foundation MFC after: 1 week Differential revision: https://reviews.freebsd.org/D19833	2019-04-09 18:07:17 +00:00
Tycho Nightingale	9708c3a2b8	DMAR driver assumes all physical addresses are backed by a fully initialized struct vm_page. Reviewed by: kib Sponsored by: Dell EMC Isilon Differential Revision: https://reviews.freebsd.org/D19753	2019-04-02 18:50:49 +00:00
Tycho Nightingale	cec2287b6a	Use the BUS_DMA_NOWRITE flag to expose and create the read-only VT-d IOMMU mappings. Reviewed by: kib Sponsored by: Dell EMC Isilon Differential Revision: https://reviews.freebsd.org/D19729	2019-03-27 20:15:51 +00:00
Konstantin Belousov	fd8d844f76	amd64 KPTI: add control from procctl(2). Add the infrastructure to allow MD procctl(2) commands, and use it to introduce amd64 PTI control and reporting. PTI mode cannot be modified for existing pmap, the knob controls PTI of the new vmspace created on exec. Requested by: jhb Reviewed by: jhb, markj (previous version) Tested by: pho Sponsored by: The FreeBSD Foundation MFC after: 1 week Differential revision: https://reviews.freebsd.org/D19514	2019-03-16 11:44:33 +00:00
Konstantin Belousov	7e0a345bc5	Add symbolic name for TSC_AUX MSR address. Sponsored by: The FreeBSD Foundation MFC after: 3 days	2019-03-15 16:39:05 +00:00
Konstantin Belousov	3dcf329ee5	Add register number, CPUID bits, and print identification for TSX force abort errata. Sponsored by: The FreeBSD Foundation MFC after: 3 days	2019-03-12 18:59:01 +00:00
Vladimir Kondratyev	76cefcd810	Fix amd64/i386 LINT build after r344982 Submitted by: jkim Reported by: rpokala MFC with: r344982	2019-03-11 19:46:15 +00:00
Vladimir Kondratyev	2b4ee39838	atrtc(4): install ACPI RTC/CMOS operation region handler FreeBSD base system does not provide an ACPI handler for the PC/AT RTC/CMOS device with PnP ID PNP0B00; on some HP laptops, the absence of this handler causes suspend/resume and poweroff(8) to hang or fail [1], [2]. On these laptops EC _REG method queries the RTC date/time registers via ACPI before suspending/powering off. The handler should be registered before acpi_ec driver is loaded. This change adds handler to access CMOS RTC operation region described in section 9.15 of ACPI-6.2 specification [3]. It is installed only for ACPI version of atrtc(4) so it should not affect old ACPI-less i386 systems. It is possible to disable the handler with loader tunable: debug.acpi.disabled=atrtc Informational debugging printf can be enabled by setting hw.acpi.verbose=1 in loader.conf [1] https://wiki.freebsd.org/Laptops/HP_Envy_6Z-1100 [2] https://wiki.freebsd.org/Laptops/HP_Notebook_15-af104ur [3] https://uefi.org/sites/default/files/resources/ACPI_6_2.pdf PR: 207419, 213039 Submitted by: Anthony Jenkins <Scoobi_doo@yahoo.com> Reviewed by: ian Discussed on: acpi@, 2013-2015, several threads MFC after: 2 weeks Differential Revision: https://reviews.freebsd.org/D19314	2019-03-10 20:19:43 +00:00
John Baldwin	2e43efd0bb	Drop "All rights reserved" from my copyright statements. Reviewed by: rgrimes MFC after: 1 month Differential Revision: https://reviews.freebsd.org/D19485	2019-03-06 22:11:45 +00:00
Konstantin Belousov	a2d95495ee	Add usermode helpers for Intel userspace protection keys feature. Reviewed by: markj Tested by: pho Sponsored by: The FreeBSD Foundation MFC after: 2 weeks Differential revision: https://reviews.freebsd.org/D18893	2019-02-20 09:56:23 +00:00
Konstantin Belousov	e7a9df16e6	Add kernel support for Intel userspace protection keys feature on Skylake Xeons. See SDM rev. 68 Vol 3 4.6.2 Protection Keys and the description of the RDPKRU and WRPKRU instructions. Reviewed by: markj Tested by: pho Sponsored by: The FreeBSD Foundation MFC after: 2 weeks Differential revision: https://reviews.freebsd.org/D18893	2019-02-20 09:51:13 +00:00
Bruce Evans	27c56cf357	Fix hangs in r341810 waiting for AP startup. idle_td is dereferenced without thread-locking it to make its contents invariant, and was accessed without telling the compiler that its contents are not invariant. Some compilers optimized accesses to the supposedly invariant contents by moving the critical checks for changes outside of the loop that waits for changes. Fix this using atomic ops. This bug only showed up for the following configuration: a Turion2 system, amd64 kernels, compiled by gcc, and SCHED_4BSD. clang fails to do the optimization with all CFLAGS that I tried, because it doesn't fully optimize the '__asm __volatile' for cpu_spinwait() although this asm has no memory clobber. gcc only does the optimization with most CFLAGS. I mostly used -Os with all compilers. i386 works because gcc -m32 -Os only moves 1 of the 2 accesses outside of the loop. Non-Turion2 systems and SCHED_ULE worked due to different timing (when all APs start before the BP checks them outside of the loop). Reviewed by: kib	2019-02-20 02:40:38 +00:00
Konstantin Belousov	5671e0d62e	Add definition for %cr4 PKRU enable bit. Reviewed by: markj Tested by: pho Sponsored by: The FreeBSD Foundation MFC after: 3 days Differential revision: https://reviews.freebsd.org/D18893	2019-02-19 19:13:48 +00:00
Konstantin Belousov	eb785fab3b	Port sysctl kern.elf32.read_exec from amd64 to i386. Make it more comprehensive on i386, by not setting nx bit for any mapping, not just adding PF_X to all kernel-loaded ELF segments. This is needed for the compatibility with older i386 programs that assume that read access implies exec, e.g. old X servers with hand-rolled module loader. Reported and tested by: bde Sponsored by: The FreeBSD Foundation MFC after: 1 week	2019-02-07 02:17:34 +00:00
Konstantin Belousov	f76b5ab6cc	Fix resume on i386 PAE. It was broken before PAE/no-PAE merge, but since PAE is now the default, resume apparently becomes broken for all machines. The corrected issues: - the trampoline page is not mapped executable, so machine faults when paging is on; - MSR.EFER and %cr4 both should be loaded before paging is enabled, otherwise paging structures are invalid (cr4.PAE and EFER.NX). - MSR.EFER and %cr4 should be only loaded if present. I attempt to handle this by not touching the registers if the value is zero. There are some more bits still not quite correct, e.g. unconditional access to %cr4 in resumectx. Reported and debugging help by: bde Sponsored by: The FreeBSD Foundation MFC after: 1 week	2019-02-07 02:09:34 +00:00
Konstantin Belousov	ccc2d07e77	Update CPUID bits definitions and CPU identification based on changes in SDM rev. 069. Sponsored by: The FreeBSD Foundation MFC after: 3 days	2019-02-04 23:57:59 +00:00
Konstantin Belousov	c3f5a36651	x86: correctly limit max memory resource address. CPU and buses can manage up to the limit reported by cpu_maxphyaddr, so set mem_rman to the value returned by cpu_getmaxphyaddr(). For the PAE mode, it was missed both when rman_res_t was increased to uintmax_t, and from the PAE merge commit. When importing smaps or dump_avail chunks into memory rman, do not blindly ignore resources which end above the limit, chomp them instead if start is below the limit. The same change was already done to i386 add_physmap_entry(). Based on the submission by: bde MFC after: 2 months	2019-02-01 20:46:47 +00:00
Roger Pau Monné	27c36a12f1	xen: introduce a new way to setup event channel upcall The main differences with the currently implemented method are: - Requires a local APIC EOI, since it doesn't bypass the local APIC as the previous method used to do. - Can be set to use different IDT vectors on each vCPU. Note that FreeBSD doesn't make use of this feature since the event channel IDT vector is reserved system wide. Note that the old method of setting the event channel upcall is not removed, and will be used as a fallback if this newly introduced method is not available. MFC after: 1 month Sponsored by: Citrix Systems R&D	2019-01-30 11:34:52 +00:00
Konstantin Belousov	9a52756044	i386: Merge PAE and non-PAE pmaps into same kernel. Effectively all i386 kernels now have two pmaps compiled in: one managing PAE pagetables, and another non-PAE. The implementation is selected at cold time depending on the CPU features. The vm_paddr_t is always 64bit now. As a result, nx bit can be used on all capable CPUs. Option PAE only affects the bus_addr_t: it is still 32bit for non-PAE configs, for drivers compatibility. Kernel layout, esp. max kernel address, low memory PDEs and max user address (same as trampoline start) are now same for PAE and for non-PAE regardless of the type of page tables used. Non-PAE kernel (when using PAE pagetables) can handle physical memory up to 24G now, larger memory requires re-tuning the KVA consumers and instead the code caps the maximum at 24G. Unfortunately, a lot of drivers do not use busdma(9) properly so by default even 4G barrier is not easy. There are two tunables added: hw.above4g_allow and hw.above24g_allow, the first one is kept enabled for now to evaluate the status on HEAD, second is only for dev use. i386 now creates three freelists if there is any memory above 4G, to allow proper bounce pages allocation. Also, VM_KMEM_SIZE_SCALE changed from 3 to 1. The PAE_TABLES kernel config option is retired. In collaboration with: pho Discussed with: emaste Reviewed by: markj MFC after: 2 weeks Sponsored by: The FreeBSD Foundation Differential revision: https://reviews.freebsd.org/D18894	2019-01-30 02:07:13 +00:00
Konstantin Belousov	8f0916fc11	i386/PAE busdma: allow more bounce pages. If i386 has more than 4G of memory, allow the same number of busdma bounce pages as for amd64. In fact, in this case bouncing sometimes is much heavier than on amd64. Reviewed by: markj Tested by: pho Sponsored by: The FreeBSD Foundation MFC after: 2 weeks Differential revision: https://reviews.freebsd.org/D18854	2019-01-18 13:43:11 +00:00
Konstantin Belousov	957b9bbf3c	x86 busdma: fix mis-use of bus_addr_t where vm_paddr_t is assumed. Right now bus_addr_t and vm_paddr_t are always aliased to the same underlying integer type on x86, which makes the interchange hard to detect. Shortly, i386 kernel would use uint64_t for vm_paddr_t to enable automatic use of PAE paging structures if hardware allows it, while bus_addr_t would be extended to 64bit only when PAE option is specified. Fix all places that were identified as using bus_addr_t while page address was assumed. This was performed by testing the complete PAE merging patch on machine with > 4G of RAM enabled. Reviewed by: markj Tested by: pho Sponsored by: The FreeBSD Foundation MFC after: 2 weeks Differential revision: https://reviews.freebsd.org/D18854	2019-01-18 13:38:56 +00:00
Conrad Meyer	16068ae479	Add definitions for AMD Spectre/Meltdown CPUID information No functional change, aside from printing recognized bits in CPU identification. The bits are documented in 111006-B "Indirect Branch Control Extension"[1] and 124441 "Speculative Store Bypass Disable."[2] Notably missing (left as future work): * Integration with hw.spec_store_bypass_disable and hw_ssb_active flag, which are currently Intel-specific * Integration with hw_ibrs_active global flag, which are currently Intel-specific * SSB_NO integration in hw_ssb_recalculate() * Bhyve integration (PR 235010) [1]: https://developer.amd.com/wp-content/resources/111006-B_AMD64TechnologyIndirectBranchControlExtenstion_WP_7-18Update_FNL.pdf [2]: https://developer.amd.com/wp-content/resources/124441_AMD64_SpeculativeStoreBypassDisable_Whitepaper_final.pdf PR: 235010 (related, but does not fix) MFC after: a week	2019-01-17 19:44:47 +00:00
Konstantin Belousov	62ee17d2ee	Style(9) fixes for x86/busdma_bounce.c. Remove extra parentheses. Adjust indents and lines fill. Sponsored by: The FreeBSD Foundation MFC after: 1 week	2019-01-16 06:10:55 +00:00
Konstantin Belousov	e471df6670	Remove unused prototype. Sponsored by: The FreeBSD Foundation MFC after: 1 week	2019-01-16 05:51:03 +00:00
Conrad Meyer	15b7da10ac	vmm(4): Take steps towards multicore bhyve AMD support vmm's CPUID emulation presented Intel topology information to the guest, but disabled AMD topology information and in some cases passed through garbage. I.e., CPUID leaves 0x8000_001[de] were passed through to the guest, but guest CPUs can migrate between host threads, so the information presented was not consistent. This could easily be observed with 'cpucontrol -i 0xfoo /dev/cpuctl0'. Slightly improve this situation by enabling the AMD topology feature flag and presenting at least the CPUID fields used by FreeBSD itself to probe topology on more modern AMD64 hardware (Family 15h+). Older stuff is probably less interesting. I have not been able to empirically confirm it is sufficient, but it should not regress anything either. Reviewed by: araujo (previous version) Relnotes: sure	2019-01-16 02:19:04 +00:00
Conrad Meyer	6b83069e05	Expose threads-per-core and physical core count information With new sysctls (to the best of our ability to detect them). Restructured smp.4 slightly for clarity (keep relevant stuff closer to the top) while documenting. Reviewed by: markj, jhibbits (ppc parts) MFC after: 3 days Sponsored by: Dell EMC Isilon Differential Revision: https://reviews.freebsd.org/D18322	2019-01-04 18:31:17 +00:00
John Baldwin	a230c2f1b3	Correct variable name in two panic messages: num_msi_irq -> num_msi_irqs. MFC after: 1 week	2018-12-31 22:46:43 +00:00
Andriy Gapon	82a5a27527	add support for marking interrupt handlers as suspended The goal of this change is to fix a problem with PCI shared interrupts during suspend and resume. I have observed a couple of variations of the following scenario. Devices A and B are on the same PCI bus and share the same interrupt. Device A's driver is suspended first and the device is powered down. Device B generates an interrupt. Interrupt handlers of both drivers are called. Device A's interrupt handler accesses registers of the powered down device and gets back bogus values (I assume all 0xff). That data is interpreted as interrupt status bits, etc. So, the interrupt handler gets confused and may produce some noise or enter an infinite loop, etc. This change affects only PCI devices. The pci(4) bus driver marks a child's interrupt handler as suspended after the child's suspend method is called and before the device is powered down. This is done only for traditional PCI interrupts, because only they can be shared. At the moment the change is only for x86. Notable changes in core subsystems / interfaces: - BUS_SUSPEND_INTR and BUS_RESUME_INTR methods are added to bus interface along with convenience functions bus_suspend_intr and bus_resume_intr; - rman_set_irq_cookie and rman_get_irq_cookie functions are added to provide a way to associate an interrupt resource with an interrupt cookie; - intr_event_suspend_handler and intr_event_resume_handler functions are added to the MI interrupt handler interface. I added two new interrupt handler flags, IH_SUSP and IH_CHANGED, to implement the new intr_event functions. IH_SUSP marks a suspended interrupt handler. IH_CHANGED is used to implement a barrier that ensures that a change to the interrupt handler's state is visible to future interrupts. While there, I fixed some whitespace issues in comments and changed a couple of logically boolean variables to be bool. MFC after: 1 month (maybe) Differential Revision: https://reviews.freebsd.org/D15755	2018-12-17 17:11:00 +00:00
Mark Johnston	b6da2600f9	Fix the PAE kernel gcc build. The error was caused by map_ucode() casting a vm_paddr_t to a void *. Use a uintptr_t instead to match the caller. Fix some style bugs while here. Reported by: bde Reviewed by: bde MFC after: 1 week Sponsored by: The FreeBSD Foundation	2018-12-11 16:49:01 +00:00
Konstantin Belousov	94dd54b9a2	Free bootstacks after AP startup. Bootstacks are unused after APs executed sched_throw() in init_secondary_tail() and started executing on proper idle thread stack. Add sysinit that detects that the idle thread for each CPU was scheduled at least once, and free corresponding bootstack. Slight addition of the code (~200 bytes) is compensated by the saving, because even on typical small modern desktop CPU we leak 128K of memory otherwise (4 pages x 8 threads). Reviewed by: jhb MFC after: 1 week Differential revision: https://reviews.freebsd.org/D18486	2018-12-11 02:54:36 +00:00
Jayachandran C.	9417fa9e3c	acpica : move SRAT/SLIT parsing to sys/dev/acpica This moves the architecture independent parts of sys/x86/acpica/srat.c to sys/dev/acpica/acpi_pxm.c, to be used later on arm64. The function declarations are moved to sys/dev/acpica/acpivar.h We also need to update sys/conf/files.{i386,amd64} to use the new file. No functional changes. Reviewed by: markj, imp Differential Revision: https://reviews.freebsd.org/D17941	2018-12-08 19:10:58 +00:00
Jayachandran C.	a3a6167448	x86/acpica/srat.c: Add API for parsing proximity tables The SLIT and SRAT ACPI tables need to be parsed on arm64 as well, on systems that use UEFI/ACPI firmware and support NUMA. To do this, we need to move most of the logic of x86/acpica/srat.c to dev/acpica and provide an API that architectures can use to parse and configure ACPI NUMA information. This commit adds the API in srat.c as a first step, without making any functional changes. We will move the common code to sys/dev/acpica as the next step. The functions added are: * int acpi_pxm_init(int ncpus, vm_paddr_t maxphys) - to allocate and initialize data structures used * void acpi_pxm_parse_tables(void) - parse SRAT/SLIT, save the cpu and memory proximity information * void acpi_pxm_set_mem_locality(void) - use the saved data to set memory locality * void acpi_pxm_set_cpu_locality(void) - use the saved data to set cpu locality * void acpi_pxm_free(void) - free data structures allocated by init On arm64, we do not have a cpu APIC id that can be used as index to store CPU data, we need to use the Processor Uid. To help with this, define internal functions cpu_add, cpu_find, cpu_get_info to store and get CPU proximity information. Reviewed by: markj, jhb (previous version) Differential Revision: https://reviews.freebsd.org/D17940	2018-12-08 18:34:05 +00:00
Ben Widawsky	91890b73ad	Add definitions for Intel Speed Shift These definitions will be used by a driver to implement Hardware P-States (autonomous control of HWP, via Intel Speed Shift technology). Reviewed by: kib Approved by: emaste (mentor) Differential Revision: https://reviews.freebsd.org/D18050	2018-11-21 00:21:58 +00:00
John Baldwin	e13507f6f0	Axe MINIMUM_MSI_INT. Just allow MSI interrupts to always start at the end of the I/O APIC pins. Since existing machines already have more than 255 I/O APIC pins, IRQ 255 is no longer reliably invalid, so just remove the minimum starting value for MSI. Reviewed by: kib, markj Differential Revision: https://reviews.freebsd.org/D17991	2018-11-16 23:39:39 +00:00
Konstantin Belousov	2343757338	Align IA32_ARCH_CAP MSR definitions and use with SDM rev. 068. SDM rev. 068 was released yesterday and it contains the description of the MSR 0x10a IA32_ARCH_CAP. This change adds symbolic definitions for all bits present in the document, and decodes them in the CPU identification lines printed on boot. But also, the document defines SSB_NO as bit 4, while FreeBSD used bit 2 to detect the need to work-around Speculative Store Bypass issue. Change code to use the bit from SDM. Similarly, the document describes bit 3 as an indicator that L1TF issue is not present, in particular, no L1D flush is needed on VMENTRY. We used RDCL_NO to avoid flushing, and again I changed the code to follow new spec from SDM. In fact my Apollo Lake machine with latest ucode shows this: IA32_ARCH_CAPS=0x19<RDCL_NO,SKIP_L1DFL_VME,SSB_NO> Reviewed by: bwidawsk Sponsored by: The FreeBSD Foundation MFC after: 3 days Differential revision: https://reviews.freebsd.org/D18006	2018-11-16 21:27:11 +00:00
John Baldwin	b6b42932db	Convert the number of MSI IRQs on x86 from a constant to a tunable. The number of MSI IRQs still defaults to 512, but it can now be changed at boot time via the machdep.num_msi_irqs tunable. Reviewed by: kib, royger (older version) Reviewed by: markj MFC after: 1 month Relnotes: yes Differential Revision: https://reviews.freebsd.org/D17977	2018-11-15 18:37:41 +00:00
John Baldwin	c6aba52e4f	Revert r332735 and fix MSI-X to properly fail allocations when full. The off-by-one errors in 332735 weren't actual errors and were preventing the last MSI interrupt source from being used. Instead, the issue is that when all MSI interrupt sources were allocated, the loop in msix_alloc() would terminate with 'msi' still set to non-null. The only check for 'i' overflowing was in the 'msi' == NULL case, so msix_alloc() would try to reuse the last MSI interrupt source instead of failing. Fix by moving the check for all sources being in use to just after the loop. Reviewed by: kib, markj MFC after: 1 month Differential Revision: https://reviews.freebsd.org/D17976	2018-11-14 18:45:33 +00:00
Konstantin Belousov	83813c6696	Apply fix to un-cripple max cpu id on BSP earlier. We need to know actual value for the standard extended features before ifuncs are resolved. Reported and tested by: madpilot Sponsored by: The FreeBSD Foundation MFC after: 1 week	2018-11-12 19:17:26 +00:00
John Baldwin	7f7f6f85a1	Add a custom implementation of cpu_lock_delay() for x86. Avoid using DELAY() since it can try to use spin locks on CPUs without a P-state invariant TSC. For cpu_lock_delay(), always use the TSC if it exists (even if it is not P-state invariant) to delay for a microsecond. If the TSC does not exist, read from I/O port 0x84 to delay instead. PR: 228768 Reported by: Roger Hammerstein <cheeky.m@live.com> Reviewed by: kib MFC after: 3 days Differential Revision: https://reviews.freebsd.org/D17851	2018-11-05 22:54:03 +00:00
John Baldwin	3c03efc4ab	Add a delay_tsc() static function for when DELAY() uses the TSC. This uses slightly simpler logic than the existing code by using the full 64-bit counter and thus not having to worry about counter overflow. Reviewed by: kib MFC after: 3 days Differential Revision: https://reviews.freebsd.org/D17850	2018-11-05 22:51:45 +00:00
Konstantin Belousov	6bc6a54280	Add pci_early function to detect Intel stolen memory. On some Intel devices BIOS does not properly reserve memory (called "stolen memory") for the GPU. If the stolen memory is claimed by the OS, functions that depend on stolen memory (like frame buffer compression) can't be used. A function called pci_early_quirks that is called before the virtual memory system is started was added. In Linux, this PCI early quirks function iterates through all PCI slots to check for any device that requires quirks. While this more generic solution is preferable I only ported the Intel graphics specific parts because I think my implementation would be too similar to Linux GPL'd solution after looking at the Linux code too much. The code regarding Intel graphics stolen memory was ported from Linux. In the case of Intel graphics stolen memory this pci_early_quirks will read the stolen memory base and size from north bridge registers. The values are stored in global variables that are later read by linuxkpi_gplv2. Linuxkpi stores these values in a Linux-specific structure that is read by the drm driver. Relevant linuxkpi code is here: https://github.com/FreeBSDDesktop/kms-drm/blob/drm-v4.16/linuxkpi/gplv2/src/linux_compat.c#L37 For now, only amd64 arch is supported since that is the only arch supported by the new drm drivers. I was told that Intel GPUs are always located on 0:2:0 so these values are hard coded for now. Note that the structure and early execution of the detection code is not required in its current form, but we expect that the code will be added shortly which fixes the potential BIOS bugs by reserving the stolen range in phys_avail[]. This must be done as early as possible to avoid conflicts with the potential usage of the memory in kernel. Submitted by: Johannes Lundberg <johalun0@gmail.com> Reviewed by: bwidawsk, imp MFC after: 1 week Differential revision: https://reviews.freebsd.org/D16719 Differential revision: https://reviews.freebsd.org/D17775	2018-10-31 23:17:00 +00:00
Mark Johnston	9978bd996b	Add malloc_domainset(9) and _domainset variants to other allocator KPIs. Remove malloc_domain(9) and most other _domain KPIs added in r327900. The new functions allow the caller to specify a general NUMA domain selection policy, rather than specifically requesting an allocation from a specific domain. The latter policy tends to interact poorly with M_WAITOK, resulting in situations where a caller is blocked indefinitely because the specified domain is depleted. Most existing consumers of the _domain KPIs are converted to instead use a DOMAINSET_PREF() policy, in which we fall back to other domains to satisfy the allocation request. This change also defines a set of DOMAINSET_FIXED() policies, which only permit allocations from the specified domain. Discussed with: gallatin, jeff Reported and tested by: pho (previous version) MFC after: 2 weeks Sponsored by: The FreeBSD Foundation Differential Revision: https://reviews.freebsd.org/D17418	2018-10-30 18:26:34 +00:00

934 Commits