freebsd-skq

Author	SHA1	Message	Date
kib	409097f5b7	Add a define for index of IA32_XSS MSR, which is, per SDM rev. 50, an analog of XCR0 for ring 0 FPU state, used by XSAVES and XRSTORS. Sponsored by: The FreeBSD Foundation MFC after: 1 week	2014-09-06 19:47:37 +00:00
kib	a52df371a9	SDM rev. 50 defines the use of the next 8 bytes in the xstate header. It is the compaction bitmask, with the highest bit defining if compact format of the xsave area is used at all. Adjust the definition of struct xstate_hdr, provide define for bit 63. Sponsored by: The FreeBSD Foundation MFC after: 1 week	2014-09-06 19:39:12 +00:00
kib	51e3f3be5c	Add more bits for the XSAVE features from CPUID 0xd, sub-function 1 %eax report. Print the XSAVE features 0xd/1 in the boot banner. The printcpuinfo() is executed late enough so that XSAVE is already enabled. There is no known to me off the shelf hardware that implements any feature bits except XSAVEOPT, the list is taken from SDM rev. 50. The banner printing will allow us to note the hardware arrival. Sponsored by: The FreeBSD Foundation MFC after: 1 week	2014-09-06 15:45:45 +00:00
jhb	3a8cf1a38b	Create a separate structure for per-CPU state saved across suspend and resume that is a superset of a pcb. Move the FPU state out of the pcb and into this new structure. As part of this, move the FPU resume code on amd64 into a C function. This allows resumectx() to still operate only on a pcb and more closely mirrors the i386 code. Reviewed by: kib (earlier version)	2014-09-06 15:23:28 +00:00
jhb	f7c94cc497	Merge the amd64 and i386 identcpu.c into a single x86 implementation. This brings the structured extended features mask and VT-x reporting to i386 and Intel cache and TLB info (under bootverbose) to amd64.	2014-09-04 14:26:25 +00:00
jhb	f937116d20	- Move blacklists of broken TSCs out of the printcpuinfo() function and into the TSC probe routine. - Initialize cpu_exthigh once in finishidentcpu() which is called before printcpuinfo() (and matches the behavior on amd64).	2014-09-04 02:25:59 +00:00
jhb	59920f0385	Save and restore FPU state across suspend and resume. In earlier revisions of this patch, resumectx() called npxresume() directly, but that doesn't work because resumectx() runs with a non-standard %cs selector. Instead, all of the FPU suspend/resume handling is done in C. MFC after: 1 week	2014-08-30 17:48:38 +00:00
royger	c934f5fd28	atpic: make sure atpic_init is called after IO APIC initialization After r269510 the IO APIC and ATPIC initialization is done at the same order, which means atpic_init can be called before the IO APIC has been initalized. In that case the ATPIC will take over the interrupt sources, preventing the IO APIC from registering them. Reported by: David Wolfskill <david@catwhisker.org> Tested by: David Wolfskill <david@catwhisker.org>, Trond Endrestøl <Trond.Endrestol@fagskolen.gjovik.no> Sponsored by: Citrix Systems R&D	2014-08-07 17:00:50 +00:00
royger	8daf97263e	xen: add ACPI bus to xen_nexus when running as Dom0 Also disable a couple of ACPI devices that are not usable under Dom0. To this end a couple of booleans are added that allow disabling ACPI specific devices. Sponsored by: Citrix Systems R&D Reviewed by: jhb x86/xen/xen_nexus.c: - Return BUS_PROBE_SPECIFIC in the Xen Nexus attachement routine to force the usage of the Xen Nexus. - Attach the ACPI bus when running as Dom0. dev/acpica/acpi_cpu.c: dev/acpica/acpi_hpet.c: dev/acpica/acpi_timer.c - Add a variable that gates the addition of the devices. x86/include/init.h: - Declare variables that control the attachment of ACPI cpu, hpet and timer devices.	2014-08-04 09:05:28 +00:00
royger	937cdd0a36	xen: implement support for mapping IO APIC interrupts on Xen Allow a privileged Xen guest (Dom0) to parse the MADT ACPI interrupt overrides and register them with the interrupt subsystem. Also add a Xen specific implementation for bus_config_intr that registers interrupts on demand for all the vectors less than FIRST_MSI_INT. Sponsored by: Citrix Systems R&D x86/xen/pvcpu_enum.c: - Use helper functions from x86/acpica/madt.c in order to parse interrupt overrides from the MADT. - Walk the MADT and register any interrupt override with the interrupt subsystem. x86/xen/xen_nexus.c: - Add a custom bus_config_intr method for Xen that intercepts calls to configure unset interrupts and registers them on the fly (if the vector is < FIRST_MSI_INT).	2014-08-04 09:01:21 +00:00
royger	afa7324d1a	x86/madt: make the interrupt override parser a public function Split a portion of the code in madt_parse_interrupt_override to a separate function, that is public and can be used from other code. This will be needed by the Xen port, since FreeBSD needs to parse the interrupt overrides and notify Xen about them. This commit should not introduce any functional change. Sponsored by: Citrix Systems R&D Reviewed by: jhb, gibbs x86/acpica/madt.c: - Introduce madt_parse_interrupt_values() that parses the intr information from ACPI and returns the triggering and the polarity. This is a subset of the functionality that used to be part of madt_parse_interrupt_override(). - Make madt_found_sci_override a global variable that can be used from other files. x86/include/acpica_machdep.h: - Prototype of madt_parse_interrupt_values. - Extern declaration of madt_found_sci_override.	2014-08-04 08:58:50 +00:00
royger	668dd4b0cb	xen: change quality of the MADT ACPI enumerator Lower the quality of the MADT ACPI enumerator, so on Xen Dom0 we can force the usage of the Xen mptable enumerator even when ACPI is detected. This is needed because Xen might restrict the number of vCPUs available to Dom0, but the MADT ACPI table parsed in FreeBSD is the native one (which enumerates all the CPUs available in the system). Sponsored by: Citrix Systems R&D Reviewed by: gibbs x86/acpica/madt.c: - Lower MADT enumerator quality to -50. x86/xen/pvcpu_enum.c: - Rise Xen PV enumerator to 0.	2014-08-04 08:56:20 +00:00
royger	1bfb01ea6b	xen: change order of Xen intr init and IO APIC registration This change inserts the Xen interrupt subsystem (event channels) initialization between the system interrupt initialization and the IO APIC source registration. This is needed when running on Dom0, that routes physical interrupts on top of event channels, so that the interrupt sources found during IO APIC initialization can be registered using the Xen interrupt subsystem. The resulting order in the SI_SUB_INTR stage is the following: - System intr initialization - Xen intr initalization - IO APIC source registration Sponsored by: Citrix Systems R&D x86/x86/local_apic.c: - Change order of apic_setup_io to be called after xen interrupt subsystem is setup. x86/xen/xen_intr.c: - Init Xen event channels before apic_setup_io.	2014-08-04 08:54:34 +00:00
royger	3c669f2b53	xen: add a DDB command to print event channel information Add a new DDB command to dump all registered event channels. Sponsored by: Citrix Systems R&D x86/xen/xen_intr.c: - Add a new xen_evtchn command to DDB in order to dump all information related to event channels.	2014-08-04 08:52:10 +00:00
royger	8d27fa514f	xen: mask all event channels on init Mask all event channels during initialization. This is done so that we don't receive spurious interrupts while dynamically registering new event channels. There's a small window during registration where an event channel can fire before we have attached a handler to it. Sponsored by: Citrix Systems R&D x86/xen/xen_intr.c: - Mask all event channels on init.	2014-08-04 08:43:27 +00:00
royger	eb7b09e785	xen: implement event channel PIRQ support This allows Dom0 to manage physical hardware, redirecting the physical interrupts to event channels. Sponsored by: Citrix Systems R&D x86/xen/xen_intr.c: - Expand struct xenisrc to hold the level and triggering of PIRQ event channels. - Implement missing methods in xen_intr_pirq_pic. - Allow xen_intr_alloc_isrc to take a vector parameter that globally identifies the interrupt. This is only used for PIRQs that are bound to a specific hardware IRQ. - Introduce xen_register_pirq used to register IO APIC legacy PIRQ interrupts. - Add support for the dynamic PIRQ EOI map, this shared memory is modified by Xen (if it suppoorts that feature), and notifies the guest if an EOI is needed or not. If it's not available fall back to the old implementation using PHYSDEVOP_irq_status_query. - Rename xen_intr_isrc_count to xen_intr_auto_vector_count and replace it's usages. - Align static variables by name. xen/xen_intr.h: - Add prototype for xen_register_pirq.	2014-08-04 08:42:29 +00:00
jhb	05d21354c3	- Output a summary of optional VT-x features in dmesg similar to CPU features. If bootverbose is enabled, a detailed list is provided; otherwise, a single-line summary is displayed. - Add read-only sysctls for optional VT-x capabilities used by bhyve under a new hw.vmm.vmx.cap node. Move a few exiting sysctls that indicate the presence of optional capabilities under this node. CR: https://phabric.freebsd.org/D498 Reviewed by: grehan, neel MFC after: 1 week	2014-07-30 00:00:12 +00:00
marius	0af32d8345	Fix yet another comment typo in r269052.	2014-07-29 14:54:23 +00:00
marius	2ff80e0967	Fix comment typo in r269052. Submitted by: Daniel O'Connor	2014-07-29 13:26:24 +00:00
akiyama	966a1bc8d6	Add missing newline to output dmesg properly.	2014-07-28 13:47:02 +00:00
gavin	b01aa1f6d0	Add error return to dumpsys(), and use it in doadump(). This commit does not add error returns to minidumpsys() or textdump_dumpsys(); those can also be added later. Submitted by: Conrad Meyer (EMC / Isilon storage division)	2014-07-25 23:52:53 +00:00
marius	34c6aed3e2	Intel desktop Haswell CPUs may report benign corrected parity errors (see HSD131 erratum in [1]) at a considerable rate. So filter these (default), unless logging is enabled. Unfortunately, there really is no better way to reasonably implement suppressing these errors than to just skipping them in mca_log(). Given that they are reported for bank 0, they'd need to be masked in MSR_MC0_CTL. However, P6 family processors require that register to be set to either all 0s or all 1s, disabling way more than the one error in question when using all 0s there. Alternatively, it could be masked for the corresponding CMCI, but that still wouldn't keep the periodic scanner from detecting these spurious errors. Apart from that, register contents of MSR_MC0_CTL{,2} don't seem to be publicly documented, neither in the Intel Architectures Developer's Manual nor in the Haswell datasheets. Note that while HSD131 actually is only about C0-stepping as of revision 014 of the Intel desktop 4th generation processor family specification update, these corrected errors also have been observed with D0-stepping aka "Haswell Refresh". 1: http://www.intel.de/content/dam/www/public/us/en/documents/specification-updates/4th-gen-core-family-desktop-specification-update.pdf Reviewed by: jhb MFC after: 3 days Sponsored by: Bally Wulff Games & Entertainment GmbH	2014-07-24 10:14:51 +00:00
jhb	17d78db27b	Fix build with SMP disabled. CR: https://phabric.freebsd.org/D407 Reviewed by: royger	2014-07-15 15:40:33 +00:00
marcel	9f28abd980	Remove ia64. This includes: o All directories named ia64 o All files named ia64 o All ia64-specific code guarded by __ia64__ o All ia64-specific makefile logic o Mention of ia64 in comments and documentation This excludes: o Everything under contrib/ o Everything under crypto/ o sys/xen/interface o sys/sys/elf_common.h Discussed at: BSDcan	2014-07-07 00:27:09 +00:00
hselasky	35b126e324	Pull in r267961 and r267973 again. Fix for issues reported will follow.	2014-06-28 03:56:17 +00:00
gjb	fc21f40567	Revert r267961, r267973: These changes prevent sysctl(8) from returning proper output, such as: 1) no output from sysctl(8) 2) erroneously returning ENOMEM with tools like truss(1) or uname(1) truss: can not get etype: Cannot allocate memory	2014-06-27 22:05:21 +00:00
hselasky	bd1ed65f0f	Extend the meaning of the CTLFLAG_TUN flag to automatically check if there is an environment variable which shall initialize the SYSCTL during early boot. This works for all SYSCTL types both statically and dynamically created ones, except for the SYSCTL NODE type and SYSCTLs which belong to VNETs. A new flag, CTLFLAG_NOFETCH, has been added to be used in the case a tunable sysctl has a custom initialisation function allowing the sysctl to still be marked as a tunable. The kernel SYSCTL API is mostly the same, with a few exceptions for some special operations like iterating childrens of a static/extern SYSCTL node. This operation should probably be made into a factored out common macro, hence some device drivers use this. The reason for changing the SYSCTL API was the need for a SYSCTL parent OID pointer and not only the SYSCTL parent OID list pointer in order to quickly generate the sysctl path. The motivation behind this patch is to avoid parameter loading cludges inside the OFED driver subsystem. Instead of adding special code to the OFED driver subsystem to post-load tunables into dynamically created sysctls, we generalize this in the kernel. Other changes: - Corrected a possibly incorrect sysctl name from "hw.cbb.intr_mask" to "hw.pcic.intr_mask". - Removed redundant TUNABLE statements throughout the kernel. - Some minor code rewrites in connection to removing not needed TUNABLE statements. - Added a missing SYSCTL_DECL(). - Wrapped two very long lines. - Avoid malloc()/free() inside sysctl string handling, in case it is called to initialize a sysctl from a tunable, hence malloc()/free() is not ready when sysctls from the sysctl dataset are registered. - Bumped FreeBSD version to indicate SYSCTL API change. MFC after: 2 weeks Sponsored by: Mellanox Technologies	2014-06-27 16:33:43 +00:00
hselasky	e34bd25949	Fix compile warning: Remove duplicate external declaration.	2014-06-19 05:06:24 +00:00
royger	4f6426a64f	xen: fix out-of-bounds access to ipi_handle Fix the gate in xen_pv_lapic_ipi_vectored to prevent access to element at position nitems(xen_ipis). Sponsored by: Citrix Systems R&D Coverity ID: 1223203 Approved by: gibbs	2014-06-18 13:41:20 +00:00
kib	0b024c9158	Do not reference native_lapic_ipi_() functions in the UP build. The functions' definitions are protected by #ifdef SMP. Keeping apic_ops.ipi_() methods NULL would allow to catch the use on UP machines. Reviewed by: royger Sponsored by: The FreeBSD Foundation	2014-06-17 09:33:22 +00:00
royger	ac5689b414	xen: add missing files Commit missing files that actually belong to previous commits. Sponsored by: Citrix Systems R&D Approved by: gibbs	2014-06-16 08:54:04 +00:00
royger	152e3229be	isa: allow ISA bus to attach to xenpv bus This is needed because syscons depends on ISA. Sponsored by: Citrix Systems R&D Approved by: gibbs x86/isa/isa.c: - Allow the ISA bus to attach to xenpv.	2014-06-16 08:49:16 +00:00
royger	a2b989a585	xen: add hooks for Xen PV APIC Create the necessary hooks in order to provide a Xen PV APIC implementation that can be used on PVH. Most of the lapic ops shouldn't be called on Xen, since we trap those operations at a higher layer. Sponsored by: Citrix Systems R&D Approved by: gibbs x86/xen/hvm.c: x86/xen/xen_apic.c: - Move IPI related code to xen_apic.c x86/xen/xen_apic.c: - Introduce Xen PV APIC implementation, most of the functions of the lapic interface should never be called when running as PV(H) guest, so make sure FreeBSD panics when trying to use one of those. - Define the Xen APIC implementation in xen_apic_ops. xen/xen_pv.h: - Extern declaration of the xen_apic struct. x86/xen/pv.c: - Use xen_apic_ops as apic_ops when running as PVH guest. conf/files.amd64: conf/files.i386: - Include the xen_apic.c file in the build of i386/amd64 kernels using XENHVM.	2014-06-16 08:43:45 +00:00
royger	7c7f3fb2d0	amd64/i386: introduce APIC hooks for different APIC implementations. This is needed for Xen PV(H) guests, since there's no hardware lapic available on this kind of domains. This commit should not change functionality. Sponsored by: Citrix Systems R&D Reviewed by: jhb Approved by: gibbs amd64/include/cpu.h: amd64/amd64/mp_machdep.c: i386/include/cpu.h: i386/i386/mp_machdep.c: - Remove lapic_ipi_vectored hook from cpu_ops, since it's now implemented in the lapic hooks. amd64/amd64/mp_machdep.c: i386/i386/mp_machdep.c: - Use lapic_ipi_vectored directly, since it's now an inline function that will call the appropiate hook. x86/x86/local_apic.c: - Prefix bare metal public lapic functions with native_ and mark them as static. - Define default implementation of apic_ops. x86/include/apicvar.h: - Declare the apic_ops structure and create inline functions to access the hooks, so the change is transparent to existing users of the lapic_ functions. x86/xen/hvm.c: - Switch to use the new apic_ops.	2014-06-16 08:43:03 +00:00
royger	6a8d0be395	xen: fix style in pv.c Fix the lenght of some comments, and also add proper indentation to xen_init_ops Sponsored by: Citrix Systems R&D Approved by: gibbs	2014-06-16 08:41:57 +00:00
scottl	38f58a4cec	Eliminate the fake contig_dmamap and replace it with a new flag, BUS_DMA_KMEM_ALLOC. They serve the same purpose, but using the flag means that the map can be NULL again, which in turn enables significant optimizations for the common case of no bouncing. Obtained from: Netflix, Inc. MFC after: 3 days	2014-05-27 21:31:11 +00:00
scottl	21c10cffc1	Now that there are separate back-end implementations of busdma, the bounce implementation shouldn't steal flags from the common front-end. Move those flags to the back-end. Obtained from: Netflix, Inc. MFC after: 3 days	2014-05-27 14:18:57 +00:00
scottl	7cb2cf17ef	Revert r266481. It was based on faulty analysis of the problem. A correct fix is forthcoming. Obtained from: Netflix, Inc.	2014-05-27 14:06:23 +00:00
jhb	5662e763a6	Whitespace fix. Submitted by: kib	2014-05-22 18:13:17 +00:00
scottl	a1e98423bb	Old PCIe implementations cannot allow a DMA transfer to cross a 4GB boundary. This was addressed several years ago by creating a parent tag hierarchy for the root buses that set the boundary restriction for appropriate buses and allowed child deviced to inherit it. Somewhere along the way, this restriction was turned into a case for marking the tag as a candidate for needing bounce buffers, instead of just splitting the segment along the boundary line. This flag also causes all maps associated with this tag to be non-NULL, which in turn causes bus_dmamap_sync() to take the slow path of function pointer indirection to discover that there's no bouncing work to do. The end result is a lot of pages set aside in bounce pools that will never be used, and a slow path for data buffers in nearly every DMA-capable PCIe device. For example, our workload at Netflix was spending nearly 1% of all CPU time going through this slow path. Fix this problem by being more selective about when to set the COULD_BOUNCE flag. Only set it when the boundary restriction exists and the consumer cannot do more than a single DMA segment at once. This fixes the case of dynamic buffers (mbufs, bio's) but doesn't address static buffers allocated from bus_dmamem_alloc(). That case will be addressed in the future. For those interested, this was discovered thanks to Dtrace Flame Graphs. Discussed with: jhb, kib Obtained from: Netflix, Inc. MFC after: 3 days	2014-05-20 22:43:17 +00:00
jhb	db4e203198	Add definitions for more structured extended features as well as XSAVE Extended Features for AVX512 and MPX (Memory Protection Extensions). Obtained from: Intel's Instruction Set Extensions Programming Reference (March 2014)	2014-05-16 17:45:09 +00:00
imp	7081aab77a	Make this compile with gcc. Submitted by: royger@	2014-04-05 22:43:18 +00:00
rstone	254af40a92	Re-implement the DMAR I/O MMU code in terms of PCI RIDs Under the hood the VT-d spec is really implemented in terms of PCI RIDs instead of bus/slot/function, even though the spec makes pains to convert back to bus/slot/function in examples. However working with bus/slot/function is not correct when PCI ARI is in use, so convert to using RIDs in most cases. bus/slot/function will only be used when reporting errors to a user. Reviewed by: kib MFC after: 2 months Sponsored by: Sandvine Inc.	2014-04-01 15:48:46 +00:00
rstone	120bf54d08	Revert PCI RID changes. My PCI RID changes somehow got intermixed with my PCI ARI patch when I committed it. I may have accidentally applied a patch to a non-clean working tree. Revert everything while I figure out what went wrong. Pointy hat to: rstone	2014-04-01 15:06:03 +00:00
rstone	eabfe8df7a	Re-implement the DMAR I/O MMU code in terms of PCI RIDs Under the hood the VT-d spec is really implemented in terms of PCI RIDs instead of bus/slot/function, even though the spec makes pains to convert back to bus/slot/function in examples. However working with bus/slot/function is not correct when PCI ARI is in use, so convert to using RIDs in most cases. bus/slot/function will only be used when reporting errors to a user. Reviewed by: kib Sponsored by: Sandvine Inc.	2014-04-01 14:51:45 +00:00
tijl	606babe108	Rename __wchar_t so it no longer conflicts with __wchar_t from clang 3.4 -fms-extensions. MFC after: 2 weeks	2014-04-01 14:46:11 +00:00
takawata	d968f19902	Change default logic to CONFORM because this routine is shared with SCI polarity setting. Reviewed by: jhb	2014-03-28 02:38:14 +00:00
takawata	a40940f6b2	Strict value checking will cause problem. Bay trail DN2820FYKH is supported on Linux but does not work on FreeBSD. This behaviour is bug-compatible with Linux-3.13.5. References: http://d.hatena.ne.jp/syuu1228/20140326 http://lxr.linux.no/linux+v3.13.5/arch/x86/kernel/acpi/boot.c#L1094 Submitted by: syuu	2014-03-27 06:36:38 +00:00
takawata	35538e1725	To check polarity, check ACPI_MADT_POLARITY_CONFORMS, instead of ACPI_MADT_TRIGGER_CONFORMS. PR:amd64/188010 Submitted by: syuu	2014-03-27 06:08:07 +00:00
jhb	db52b17caa	Fix build without SMP. PR: kern/187854 MFC after: 1 week	2014-03-26 17:40:13 +00:00

1 2 3 4 5 ...

350 Commits