freebsd-dev

Author	SHA1	Message	Date
Konstantin Belousov	e164cafc69	Add hw.dmar.batch_coalesce tunable/sysctl, which specifies rate at which queued invalidation completion interrupt is requested with regard to the queued invalidation requests. In other words, setting the value of the knob to N requests completion interrupt after N items are processed. Existing behaviour is restored by setting hw.dmar.batch_coalesce=1. The knob significantly decreases the DMAR qi interrupt rate at the cost of slightly longer DMAR map entries recycling. Sponsored by: The FreeBSD Foundation	2016-04-17 10:56:56 +00:00
Konstantin Belousov	c5c20928d3	Add x86 CPU features definitions published in the Intel SDM rev. 58. Sponsored by: The FreeBSD Foundation MFC after: 1 week	2016-04-16 06:07:13 +00:00
Konstantin Belousov	9e297f96d4	Always calculate divisor for the counter mode of LAPIC timer. Even if initially configured in the TSC deadline mode, eventtimer subsystem can be switched to periodic, and then DCR register is loaded with uninitialized value. Reset the LAPIC eventtimer frequency and min/max periods when changing between deadline and counted periodic modes. Reported and tested by: Vladimir Zakharov <zakharov.vv@gmail.com> Sponsored by: The FreeBSD Foundation	2016-04-15 14:36:38 +00:00
Roger Pau Monné	9b44287ce5	busdma/bounce: revert r292255 Revert r292255 because it can create bounced regions without contiguous page offsets, which is needed for USB devices. Another solution would be to force bouncing the full buffer always (even when only one page requires bouncing), but this seems overly complicated and unnecessary, and it will probably involve using more bounce pages than the current code. Reported by: phk	2016-04-15 09:21:50 +00:00
Pedro F. Giffuni	a3269b0863	x86: for pointers replace 0 with NULL. These are mostly cosmetic, no functional change. Found with devel/coccinelle.	2016-04-14 17:04:06 +00:00
Warner Losh	bd3bce41db	Deprecate using hints.acpi.0.rsdp to communicate the RSDP to the system. This uses the hints mechanism. This mostly works today because when there's no static hints (the default), this value can be fetched from the hint. When there is a static hints file, the hint passed from the boot loader to the kernel is ignored, but for the BIOS case we're able to find it anyway. However, with UEFI, the fallback doesn't work, so we get a panic instead. Switch to acpi.rsdp and use TUNABLE_ULONG_FETCH instead. Continue to generate the old values to allow for transitions. In addition, fall back to the old method if the new method isn't present. Add comments about all this. Differential Revision: https://reviews.freebsd.org/D5866	2016-04-14 04:59:51 +00:00
Andriy Gapon	0d63fc3ed8	re-enable AMD Topology extension on certain models if disabled by BIOS Some BIOSes disable AMD Topology extension on AMD Family 15h notebook processors. We re-enable the extension, so that we can properly discover core and cache topology. Linux seems to do the same. Reported by: Johannes Dieterich <dieterich.joh@gmail.com> Reviewed by: jhb, kib Tested by: Johannes Dieterich <dieterich.joh@gmail.com> (earlier version) MFC after: 3 weeks Differential Revision: https://reviews.freebsd.org/D5883	2016-04-12 13:30:39 +00:00
Pedro F. Giffuni	74b8d63dcc	Cleanup unnecessary semicolons from the kernel. Found with devel/coccinelle.	2016-04-10 23:07:00 +00:00
John Baldwin	62d70a8174	Add more fine-grained kernel options for NUMA support. VM_NUMA_ALLOC is used to enable use of domain-aware memory allocation in the virtual memory system. DEVICE_NUMA is used to enable affinity reporting for devices such as bus_get_domain(). MAXMEMDOM must still be set to a value greater than 1 for any NUMA support to be effective. Note that 'cpuset -gd' always works if MAXMEMDOM is enabled and the system supports NUMA. Reviewed by: kib Differential Revision: https://reviews.freebsd.org/D5782	2016-04-09 13:58:04 +00:00
Sepherosa Ziehau	19605ff758	xen: Set ipi_{alloc,free} even for UP This keeps XEN apic_ops aligned w/ x86's. Suggested by: kib, jhb Reviewed by: jhb, royger Sponsored by: Microsoft OSTC Differential Revision: https://reviews.freebsd.org/D5871	2016-04-07 07:00:00 +00:00
Sepherosa Ziehau	8b0986c27f	x86: Allow interrupt vector allocation/free even on UP It is needed by the hypervisor FreeBSD guest to allocate/free private interrupt vectors. Reviewed by: kib, jhb, Dexuan Cui <decui microsoft com> Sponsored by: Microsoft OSTC Differential Revision: https://reviews.freebsd.org/D5849	2016-04-07 06:36:03 +00:00
Andriy Gapon	c77702de74	x86 topo: add some comments, descriptions and references to documentation Plus a minor cosmetic change. MFC after: 1 month	2016-04-05 10:36:40 +00:00
Andriy Gapon	4725e6bff3	new x86 smp topology detection code Previously, the code determined a topology of processing units (hardware threads, cores, packages) and then deduced a cache topology using certain assumptions. The new code builds a topology that includes both processing units and caches using the information provided by the hardware. At the moment, the discovered full topology is used only to create a scheduling topology for SCHED_ULE. There is no KPI for other kernel uses. Summary: - based on APIC ID derivation rules for Intel and AMD CPUs - can handle non-uniform topologies - requires homogeneous APIC ID assignment (same bit widths for ID components) - topology for dual-node AMD CPUs may not be optimal - topology for latest AMD CPU models may not be optimal as the code is several years old - supports only thread/package/core/cache nodes Todo: - AMD dual-node processors - latest AMD processors - NUMA nodes - checking for homogeneity of the APIC ID assignment across packages - more flexible cache placement within topology - expose topology to userland, e.g., via sysctl nodes Long term todo: - KPI for CPU sharing and affinity with respect to various resources (e.g., two logical processors may share the same FPU, etc) Reviewed by: mav Tested by: mav MFC after: 1 month Differential Revision: https://reviews.freebsd.org/D2728	2016-04-04 16:09:29 +00:00
John Baldwin	2b1e924b69	Move i386/i386/autoconf.c to sys/x86/x86 and use it on both amd64 and i386.	2016-04-03 23:03:54 +00:00
Konstantin Belousov	5c8e0b3bcb	Style(9), use tabs for the #define LOOPS line. Print unsigned values with %u. Make code slightly more compact by inlining loop limit. Noted by: bde Sponsored by: The FreeBSD Foundation	2016-04-01 08:47:23 +00:00
Konstantin Belousov	0df87548b9	Type of the interrupt handlers on x86 cannot be expressed in C. Simplify and unify placeholder type definitions. Reviewed by: jhb Sponsored by: The FreeBSD Foundation Differential revision: https://reviews.freebsd.org/D5771	2016-03-29 19:56:48 +00:00
Konstantin Belousov	d317106ce2	Fix several bugs in r297374: - fix UP build [1] - do not obliterate initial reading of rdtsc by the loop counter [2] - restore the meaning of the argument -1 to native_lapic_ipi_wait() as wait until LAPIC acknowledge without timeout - correct formula for calculating loop iteration count for 1us, it was inverted, and ensure that even on unlikely slow CPUs at least one check for ack is performed. Reported by: Michael Butler <imb@protected-networks.net> [1], rpokala[2], jhb[3] Tested by: Michael Butler Pointy hat to: kib Sponsored by: The FreeBSD Foundation	2016-03-29 19:54:13 +00:00
Konstantin Belousov	998e1ef11f	Calibrate the frequency of the native_lapic_ipi_wait() loop, and avoid a delay while waiting for IPI delivery acknowledgement in xAPIC mode. This makes the loop exit immediately after the delivery bit in APIC_ICR register is set, instead of waiting for some microseconds. We only need to ensure that some amount of time is allowed for the LAPIC to react to the command, and we need that the wait time is finite and reasonable. For that reason, it is irrelevant if the CPU frequency or throttling decrease the speed and make the loop, calibrated for full CPU speed at boot time, execute somewhat slower. Discussed with: bde, jhb Tested by: pho Sponsored by: The FreeBSD Foundation	2016-03-29 08:44:56 +00:00
Konstantin Belousov	d58c003a8a	Use ANSI function definition. Sponsored by: The FreeBSD Foundation	2016-03-29 08:31:34 +00:00
Konstantin Belousov	841d5e0151	Do not load LAPIC_DCR_TIMER with an undefined value. If we are in the deadline mode the divide configuration is not used and lapic_timer_divisor is not set. Reported by: dhw, mav Tested by: mav Sponsored by: The FreeBSD Foundation	2016-03-28 15:05:00 +00:00
Konstantin Belousov	ecabd74728	Use TSC deadline mode for LAPIC timer, when available. The mode fires LAPIC timer interrupt when TSC reaches the value written to the IA32_TSC_DEADLINE MSR. To arm or reset the timer in deadline mode, a single non-serializing MSR write is enough. This is an advance from the one-shot mode of LAPIC, where timer operated with the FSB frequency and required two (serialized in case of xAPIC) writes to the APIC registers. The LVT_TIMER register value is cached to avoid unneeded writes in the deadline mode. Unused arguments to specify period (which is passed in struct lapic as la_timer_period) and interrupt enable (which is always enabled) are removed from lapic_timer_{oneshot,periodic,deadline} functions. Instead, special lapic_timer_oneshot_nointr() function for interrupt-less one-shot calibration is added. Reviewed by: mav (previous version) Tested by: pho Sponsored by: The FreeBSD Foundation Differential revision: https://reviews.freebsd.org/D5738	2016-03-28 09:52:44 +00:00
Konstantin Belousov	7c4e76935e	Add defines for the LAPIC TSC deadline timer mode. The LVT timer mode field is two-bit, extend the mask. Also add comments about all MSRs writes to which are not serializing. Sponsored by: The FreeBSD Foundation	2016-03-28 09:43:40 +00:00
John Baldwin	7a2c1d8c60	Enable interrupts on the BSP once all PICs are initialized. This moves the enabling of interrupts slightly earlier (the old location was still before devices were enumerated and probed) and does it in the interrupt code (rather than in the device configuration code). This also avoids tripping over an assertion on the first TLB shootdown with earlier AP startup. Reviewed by: kib Sponsored by: Netflix Differential Revision: https://reviews.freebsd.org/D5710	2016-03-24 00:24:07 +00:00
Justin Hibbits	f8fd3fb518	Fix the resource_list_print_type() calls to use uintmax_t. Missed a bunch from r297000.	2016-03-22 22:25:08 +00:00
John Baldwin	4a5202f9c4	Check IPI status more frequently when waiting. An IPI cannot be sent via the local APIC if a previous IPI is still being delivered. Attempts to send an IPI will wait for a pending IPI to clear. Prior to r278325 these checks used a spin loop with a hardcoded maximum count which broke AP startup on some systems. However, r278325 also enforced a minimum latency of 5 microseconds if an IPI was still pending which resulted in a measurable performance hit. This change reduces that minimum latency to 1 microsecond. Tested by: stas MFC after: 3 days	2016-03-18 19:48:49 +00:00
Justin Hibbits	da1b038af9	Use uintmax_t (typedef'd to rman_res_t type) for rman ranges. On some architectures, u_long isn't large enough for resource definitions. Particularly, powerpc and arm allow 36-bit (or larger) physical addresses, but type `long' is only 32-bit. This extends rman's resources to uintmax_t. With this change, any resource can feasibly be placed anywhere in physical memory (within the constraints of the driver). Why uintmax_t and not something machine dependent, or uint64_t? Though it's possible for uintmax_t to grow, it's highly unlikely it will become 128-bit on 32-bit architectures. 64-bit architectures should have plenty of RAM to absorb the increase on resource sizes if and when this occurs, and the number of resources on memory-constrained systems should be sufficiently small as to not pose a drastic overhead. That being said, uintmax_t was chosen for source clarity. If it's specified as uint64_t, all printf()-like calls would either need casts to uintmax_t, or be littered with PRI64 macros. Casts to uintmax_t aren't horrible, but it would also bake into the API for resource_list_print_type() either a hidden assumption that entries get cast to uintmax_t for printing, or these calls would need the PRI64 macros. Since source code is meant to be read more often than written, I chose the clearest path of simply using uintmax_t. Tested on a PowerPC p5020-based board, which places all device resources in 0xfxxxxxxxx, and has 8GB RAM. Regression tested on qemu-system-i386 Regression tested on qemu-system-mips (malta profile) Tested PAE and devinfo on virtualbox (live CD) Special thanks to bz for his testing on ARM. Reviewed By: bz, jhb (previous) Relnotes: Yes Sponsored by: Alex Perez/Inertial Computing Differential Revision: https://reviews.freebsd.org/D4544	2016-03-18 01:28:41 +00:00
Justin Hibbits	534ccd7bbf	Replace all resource occurrences of '0UL/~0UL' with '0/~0'. Summary: The idea behind this is '~0ul' is well-defined, and casting to uintmax_t, on a 32-bit platform, will leave the upper 32 bits as 0. The maximum range of a resource is 0xFFF.... (all bits of the full type set). By dropping the 'ul' suffix, C type promotion rules apply, and the sign extension of ~0 on 32 bit platforms gets it to a type-independent 'unsigned max'. Reviewed By: cem Sponsored by: Alex Perez/Inertial Computing Differential Revision: https://reviews.freebsd.org/D5255	2016-03-03 05:07:35 +00:00
John Baldwin	cbc4d2db75	Remove taskqueue_enqueue_fast(). taskqueue_enqueue() was changed to support both fast and non-fast taskqueues 10 years ago in r154167. It has been a compat shim ever since. It's time for the compat shim to go. Submitted by: Howard Su <howard0su@gmail.com> Reviewed by: sephe Differential Revision: https://reviews.freebsd.org/D5131	2016-03-01 17:47:32 +00:00
Justin Hibbits	e665eafb25	Correct the memory rman ranges to be to BUS_SPACE_MAXADDR Summary: As part of the migration of rman_res_t to be typed to uintmax_t, memory ranges must be clamped appropriately for the bus, to prevent completely bogus addresses from being used. This is extracted from D4544. Reviewed By: cem Sponsored by: Alex Perez/Inertial Computing Differential Revision: https://reviews.freebsd.org/D5134	2016-03-01 02:59:06 +00:00
Jung-uk Kim	0eda5b3f23	Silence PVS-Studio warning (V595). It can never be NULL here.	2016-02-23 23:57:24 +00:00
Svatopluk Kraus	a1e1814d76	As <machine/pmap.h> is included from <vm/pmap.h>, there is no need to include it explicitly when <vm/pmap.h> is already included. Reviewed by: alc, kib Differential Revision: https://reviews.freebsd.org/D5373	2016-02-22 09:02:20 +00:00
Konstantin Belousov	2fe1339ea2	Some BIOSes' ACPI bytecode needs to take (sleepable) acpi mutex for acpi_GetInteger() execution. Intel DMAR interrupt remapping code needs to know UID of the HPET to properly route the FSB interrupts from the HPET, even when interrupt remapping is disabled, and the code is executed under some non-sleepable mutexes. Cache HPET UIDs in the device softc at the attach time and provide lock-less method to get UID, use the method from the dmar hpet handling code instead of calling GetInteger(). Reported and tested by: Larry Rosenman <ler@lerctr.org> Sponsored by: The FreeBSD Foundation MFC after: 1 week	2016-02-20 13:37:04 +00:00
Justin Hibbits	7915adb560	Introduce a RMAN_IS_DEFAULT_RANGE() macro, and use it. This simplifies checking for default resource range for bus_alloc_resource(), and improves readability. This is part of, and related to, the migration of rman_res_t from u_long to uintmax_t. Discussed with: jhb Suggested by: marcel	2016-02-20 01:32:58 +00:00
Konstantin Belousov	90edf67ecf	POSIX states that #include <signal.h> shall make both mcontext_t and ucontext_t available. Our code even has XXX comment about this. Add a bit of compliance by moving struct __ucontext definition into sys/_ucontext.h and including it into signal.h and sys/ucontext.h. Several machine/ucontext.h headers were changed to use namespace-safe types (like uint64_t->__uint64_t) to not depend on sys/types.h. struct __stack_t from sys/signal.h is made always visible in private namespace to satisfy sys/_ucontext.h requirements. Apparently mips _types.h pollutes global namespace with f_register_t type definition. This commit does not try to fix the issue. PR: 207079 Reported and tested by: Ting-Wei Lan <lantw44@gmail.com> Sponsored by: The FreeBSD Foundation MFC after: 2 weeks	2016-02-12 07:38:19 +00:00
Justin Hibbits	2dd1bdf183	Convert rman to use rman_res_t instead of u_long Summary: Migrate to using the semi-opaque type rman_res_t to specify rman resources. For now, this is still compatible with u_long. This is step one in migrating rman to use uintmax_t for resources instead of u_long. Going forward, this could feasibly be used to specify architecture-specific definitions of resource ranges, rather than baking a specific integer type into the API. This change has been broken out to facilitate MFC'ing drivers back to 10 without breaking ABI. Reviewed By: jhb Sponsored by: Alex Perez/Inertial Computing Differential Revision: https://reviews.freebsd.org/D5075	2016-01-27 02:23:54 +00:00
Sepherosa Ziehau	69a53a7a3a	hyperv: use x86 generic code to do the hypervisor detection This is first step to move the generic part of HV code into kernel instead of module, so that it is possible to use hypercall to implement some other paravirtualization code in the kernel. Submitted by: Howard Su <howard0su@gmail.com> Reviewed by: royger, delphij, adrian Approved by: adrian (mentor) Sponsored by: Microsoft OSTC Differential Revision: https://reviews.freebsd.org/D3072	2016-01-14 02:50:13 +00:00
Ed Maste	0e42ee5dd8	Move amd64 metadata.h to x86 and share with i386 MFC after: 1 week	2016-01-07 19:47:26 +00:00
Ian Lepore	69dcb7e771	Make the 'env' directive described in config(5) work on all architectures, providing compiled-in static environment data that is used instead of any data passed in from a boot loader. Previously 'env' worked only on i386 and arm xscale systems, because it required the MD startup code to examine the global envmode variable and decide whether to use static_env or an environment obtained from the boot loader, and set the global kern_envp accordingly. Most startup code wasn't doing so. Making things even more complex, some mips startup code uses an alternate scheme that involves calling init_static_kenv() to pass an empty buffer and its size, then uses a series of kern_setenv() calls to populate that buffer. Now all MD startup code calls init_static_kenv(), and that routine provides a single point where envmode is checked and the decision is made whether to use the compiled-in static_kenv or the values provided by the MD code. The routine also continues to serve its original purpose for mips; if a non-zero buffer size is passed the routine installs the empty buffer ready to accept kern_setenv() values. Now if the size is zero, the provided buffer full of existing env data is installed. A NULL pointer can be passed if the boot loader provides no env data; this allows the static env to be installed if envmode is set to do so. Most of the work here is a near-mechanical change to call the init function instead of directly setting kern_envp. A notable exception is in xen/pv.c; that code was originally installing a buffer full of preformatted env data along with its non-zero size (like mips code does), which would have allowed kern_setenv() calls to wipe out the preformatted data. Now it passes a zero for the size so that the buffer of data it installs is treated as non-writeable.	2016-01-02 02:53:48 +00:00
Konstantin Belousov	6b247f858e	Add standard extended feature bit 6 from the Intel SDM rev. 57, which indicates that data-pointer in the saved x87 FPU state is only updated on FPU exceptions. Sponsored by: The FreeBSD Foundation MFC after: 1 week	2015-12-29 22:14:21 +00:00
John Baldwin	9e8d8b4b0c	Move shared variables from {amd64,i386}/initcpu.c to x86/identcpu.c. While here, move the common bits of <machine/cputypes.h> to <x86/cputypes.h> as well. Reviewed by: kib Differential Revision: https://reviews.freebsd.org/D4670	2015-12-23 21:41:42 +00:00
Enji Cooper	b59f7a7ad8	Remove redundant declarations in sys/x86/xen which are now handled in other sys/x86 headers Differential Revision: https://reviews.freebsd.org/D4685 X-MFC with: r291949 Sponsored by: EMC / Isilon Storage Division	2015-12-23 17:43:55 +00:00
Conrad Meyer	986fd63b46	x86: Add CPUID_STDEXT_* macros for CPU feature bits A follow-up to r292478 and r292488. Sponsored by: EMC / Isilon Storage Division	2015-12-21 04:42:58 +00:00
Conrad Meyer	ce43b54ab2	x86: Detect feature flags "AVX512DQ", "AVX512IFMA", "AVX512BW", "AVX512VBMI" Documented in Intel Architecture Set Extensions Programming Reference (319433-023). Sponsored by: EMC / Isilon Storage Division	2015-12-20 03:34:30 +00:00
Conrad Meyer	f750a7edaa	x86: Detect feature flags "CLWB" and "PCOMMIT" "The availability of CLWB instruction is indicated by the presence of the CPUID feature flag CLWB (bit 24 of the EBX register)." CLWB is similar to CLFLUSHOPT, except that it is not required to discard cacheline contents. "On processors that supports PCOMMIT, PCOMMIT is enumerated through CPUID (CPUID.7.0.EBX[22]) only when the feature is enabled by BIOS." PCOMMIT is used to cause store-to-memory operations to become persistent (protected from power failure). Sponsored by: EMC / Isilon Storage Division	2015-12-19 20:47:15 +00:00
Roger Pau Monné	a7285da666	x86/bounce: try to always completely fill bounce pages Current code doesn't try to make use of the full page when bouncing because the size is only expanded to be a multiple of the alignment. Instead try to always create segments of PAGE_SIZE when using bounce pages. This allows us to remove the specific casing done for BUS_DMA_KEEP_PG_OFFSET, since the requirement is to make sure the offsets into contiguous segments are aligned, and now this is done by default. Sponsored by: Citrix Systems R&D Reviewed by: hps, kib Differential revision: https://reviews.freebsd.org/D4119	2015-12-15 10:07:03 +00:00
Konstantin Belousov	7c958a41fe	Merge common parts of i386 and amd64 md_var.h and smp.h into new headers x86/include x86_var.h and x86_smp.h. Reviewed by: emaste, jhb Sponsored by: The FreeBSD Foundation Differential revision: https://reviews.freebsd.org/D4358	2015-12-07 17:41:20 +00:00
Konstantin Belousov	9a5d210cb4	It seems that at least some KVM versions advertise support for EOI suppression but the version of the IOAPIC reported is 0x11 and neither IOAPIC EOIR nor the Linux trick of temporal reprogramming of the pin to edge-trigger mode to issue EOI work. Disable EOI suppression if KVM is detected. The mode can still be forced with the tunable. Reported and tested by: Roman Mamontov <mr.xanto@gmail.com> Sponsored by: The FreeBSD Foundation	2015-12-05 08:52:37 +00:00
Konstantin Belousov	27691a24ab	For amd64 non-PCID machines, and for i386 machines with support for the PG_G global pte flag, pmap_invalidate_all() fails to flush global TLB entries []. This is because TLB shootdown handler for such configs reloads CR3, and on i386 pmap_invalidate_all() does the same for the initiating CPU. Note that current code does not issue total invalidation requests for the kernel_pmap. Rename amd64 function invltlb_globpcid() to invltlb_glob(), it is not specific for PCID for quite some time, and implement the same functionality for i386. Use the function instead of invltlb() in shootdown handlers and in i386 pmap_invalidate_all(), but only for the kernel pmap (which maps pages with the PG_G attribute set), which takes care of PG_G TLB entries on flush. To detect the affected pmap in i386 TLB shootdown handler, pmap should be passed to the smp_masked_invltlb() function, which makes amd64 and i386 TLB shootdown code almost identical. Merge the code under x86/. Noted by: jhb [] Reviewed by: cem, jhb, pho Tested by: pho Sponsored by: The FreeBSD Foundation Differential revision: https://reviews.freebsd.org/D4346	2015-12-03 11:14:14 +00:00
Konstantin Belousov	906430e4f0	In the SandyBridge x2APIC workaround detection code, only fetch the environment variable when SandyBridge CPU is detected. Reduce code duplication. Sponsored by: The FreeBSD Foundation	2015-12-03 10:59:10 +00:00
Konstantin Belousov	2a8a46b161	Correct the number of DTLB entries reported for the CPUID Leaf 2 descriptor 0x6c. Confirmed by: Intel MFC after: 3 days	2015-11-24 19:55:11 +00:00

551 Commits