Commit Graph

457 Commits

Author SHA1 Message Date
Konstantin Belousov
9b3df93bf1 Typo in comment. 2015-07-20 19:51:41 +00:00
Konstantin Belousov
1ef630fb33 Fix warnings about unused functions for UP build.
Sponsored by:	The FreeBSD Foundation
2015-07-16 12:16:42 +00:00
Zbigniew Bodek
721555e7ee Fix KSTACK_PAGES issue when the default value was changed in KERNCONF
If KSTACK_PAGES was changed to anything alse than the default,
the value from param.h was taken instead in some places and
the value from KENRCONF in some others. This resulted in
inconsistency which caused corruption in SMP envorinment.

Ensure all places where KSTACK_PAGES are used the opt_kstack_pages.h
is included.

The file opt_kstack_pages.h could not be included in param.h
because was breaking the toolchain compilation.

Reviewed by:   kib
Obtained from: Semihalf
Sponsored by:  The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D3094
2015-07-16 10:46:52 +00:00
Christian Brueffer
9f026a420b Set the initial system time to a sane (as in: not end of 21st century) value when
booting on a PC with CMOS clock set to a year before 2000.

This uses 1980 (instead of 1970 as in the initial patch) as pivot year as
suggested by imp in the PR followup.

PR:		195703
Submitted by:	cs@soi.spb.ru
Reviewed by:	imp
MFC after:	1 weeks
2015-06-29 17:02:09 +00:00
Konstantin Belousov
1817023775 Add x86 PT_GETFSBASE, PT_GETGSBASE machine-depended ptrace requests to
obtain the thread %fs and %gs bases.  Add x86 PT_SETFSBASE and
PT_SETGSBASE requests to set the bases from debuggers.  The set
requests, similarly to the sysarch({I386,AMD64}_SET_FSBASE),
override the corresponding segment registers.

The main purpose of the operations is to retrieve and modify the tcb
address for debuggee.

Sponsored by:	The FreeBSD Foundation
MFC after:	2 weeks
2015-06-29 07:07:24 +00:00
Konstantin Belousov
1abfd35537 Split the DMAR unit domains and contexts. Domains carry address space
and related data structures.  Contexts attach requests initiators to
domains.  There is still 1:1 correspondence between contexts and
domains on the running system, since only busdma currently allocates
them, using dmar_get_ctx_for_dev().

Large part of the change is formal rename of the ctx to domain, but
patch also reworks the context allocation and free to allow for
independent domain creation.

The helper dmar_move_ctx_to_domain() is introduced for future use, to
reassign request initiator from one domain to another.  The hard issue
which is not yet resolved with the context move is proper handling (or
reserving) RMRR entries in the destination domain as required by ACPI
DMAR table for moved context.

Tested by:	pho
Sponsored by:	The FreeBSD Foundation
2015-06-26 07:01:29 +00:00
Jung-uk Kim
5ef5072350 Merge ACPICA 20150619. 2015-06-18 23:14:45 +00:00
John Baldwin
125954c873 Handle X2APIC entries in the MADT for APICs with an ID < 255. At least one
BIOS has been seen to include such entries even though the relevant specs
require that X2APIC entries only be used for CPUs with an APIC ID >= 255.

This was tested on a system with "plain" local APIC entries in the MADT
to ensure no regressions, but it has not yet been tested on a system with
X2APIC entries in the MADT.  Currently such systems do not boot at all,
and with this change they might now boot correctly.

Differential Revision:	https://reviews.freebsd.org/D2521
Reviewed by:	kib
MFC after:	2 weeks
2015-06-09 10:49:40 +00:00
Konstantin Belousov
32a1e9e4a5 Update print_INTEL_TLB() by the tag values from the Intel SDM
rev. 55.  The modern CPUs cache and TLB descriptions looked quite
questionable without the update, e.g. Haswell i7 4770S reported:
	Data TLB: 4 KB pages, 4-way set associative, 64 entries
	L2 cache: 256 kbytes, 8-way associative, 64 bytes/line
After the update, the report is:
	Data TLB: 1 GByte pages, 4-way set associative, 4 entries
	Data TLB: 4 KB pages, 4-way set associative, 64 entries
	Instruction TLB: 2M/4M pages, fully associative, 8 entries
	Instruction TLB: 4KByte pages, 8-way set associative, 64 entries
	64-Byte prefetching
	Shared 2nd-Level TLB: 4 KByte/2MByte pages, 8-way associative, 1024 entries
	L2 cache: 256 kbytes, 8-way associative, 64 bytes/line
Some tags were apparently removed from the table 3-21, Vol. 2A.  Keep
them around, but add a comment stating the removal.

Update the format line for cpu_stdext_feature according to the bits
from the SDM rev.55.  It appears that Haswells do not store %cs and
%ds values in the FPU save area.

Store content of the %ecx register from the CPUID leaf 0x7
subleaf 0 as cpu_stdext_feature2 and print defined bits from it,
again acording to SDM rev. 55.

Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
2015-06-06 22:03:24 +00:00
Konstantin Belousov
69baeadc31 Remove several write-only variables, all reported by the gcc 4.9
buildkernel run.

Some of them were write-only under some kernel options, e.g. variables
keeping values only used by CTR() macros.  It costs nothing to the
code readability and correctness to eliminate the warnings in those
cases too by removing the local cached values used only for
single-access.

Review:	https://reviews.freebsd.org/D2665
Reviewed by:	rodrigc
Looked at by:	bjk
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
2015-05-29 13:24:17 +00:00
Konstantin Belousov
35725a97c4 Explicitely enable queued invalidation completion interrupt when the
queue is started, not relying on the interrupt remaping method to
happen.  Also disable interrupts when shooting down the queue.

Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
2015-05-29 09:17:59 +00:00
Roger Pau Monné
02761cf314 xen: make sure xenpv bus is the last to attach
This is needed so other buses have a chance of attaching a real ISA bus, if
none is found xenpv will attach it.

Sponsored by: Citrix Systems R&D
2015-05-25 09:47:16 +00:00
Jung-uk Kim
fd90e2ed54 CALLOUT_MPSAFE has lost its meaning since r141428, i.e., for more than ten
years for head.  However, it is continuously misused as the mpsafe argument
for callout_init(9).  Deprecate the flag and clean up callout_init() calls
to make them more consistent.

Differential Revision:	https://reviews.freebsd.org/D2613
Reviewed by:	jhb
MFC after:	2 weeks
2015-05-22 17:05:21 +00:00
Konstantin Belousov
d9e8bbb64d When sleeping in Sx state using MWAIT instruction, accept fast wakeup
requests from writes to the monitored line.

Submitted by:	avg
2015-05-19 14:21:00 +00:00
Adrian Chadd
4f5e270a93 Update the comments to match what the code ended up becoming.
-1 is now "no locality information available".

Sponsored by:	Norse Corp, Inc.
2015-05-15 21:33:19 +00:00
Konstantin Belousov
a546448b8d Rewrite amd64 PCID implementation to follow an algorithm described in
the Vahalia' "Unix Internals" section 15.12 "Other TLB Consistency
Algorithms".  The same algorithm is already utilized by the MIPS pmap
to handle ASIDs.

The PCID for the address space is now allocated per-cpu during context
switch to the thread using pmap, when no PCID on the cpu was ever
allocated, or the current PCID is invalidated.  If the PCID is reused,
bit 63 of %cr3 can be set to avoid TLB flush.

Each cpu has PCID' algorithm generation count, which is saved in the
pmap pcpu block when pcpu PCID is allocated.  On invalidation, the
pmap generation count is zeroed, which signals the context switch code
that already allocated PCID is no longer valid.  The implication is
the TLB shootdown for the given cpu/address space, due to the
allocation of new PCID.

The pm_save mask is no longer has to be tracked, which (significantly)
reduces the targets of the TLB shootdown IPIs.  Previously, pm_save
was reset only on pmap_invalidate_all(), which made it accumulate the
cpuids of all processors on which the thread was scheduled between
full TLB shootdowns.

Besides reducing the amount of TLB shootdowns and removing atomics to
update pm_saves in the context switch code, the algorithm is much
simpler than the maintanence of pm_save and selection of the right
address space in the shootdown IPI handler.

Reviewed by:	alc
Tested by:	pho
Sponsored by:	The FreeBSD Foundation
MFC after:	3 weeks
2015-05-09 19:11:01 +00:00
Konstantin Belousov
b57a73f8e7 If x86 CPU implementation of the MWAIT instruction reasonably
interacts with interrupts, query ACPI and use MWAIT for entrance into
Cx sleep states.  Support C1 "I/O then halt" mode.  See Intel'
document 302223-007 "Intelб╝ Processor Vendor-Specific ACPI Interface
Specification" for description.

Move the acpi_cpu_c1() function into x86/cpu_machdep.c and use
it instead of inlining "sti; hlt" sequence in several places.

In the acpi(4) man page, besides documenting the dev.cpu.N.cx_methods
sysctl, correct the names for dev.cpu.N.{cx_usage,cx_lowest,cx_supported}
sysctls.

Both jkim and avg have some other patches implementing the mwait
functionality; this work is unrelated.  Linux does not rely on the
ACPI to provide correct tables describing Cx modes.  Instead, the
driver has pre-defined knowledge of the CPU models, it was supplied by
Intel.

Tested by:    pho (previous versions)
Sponsored by:	The FreeBSD Foundation
2015-05-09 12:28:48 +00:00
Roger Pau Monné
0df8b29da3 xen: introduce a newbus function to allocate unused memory
In order to map memory from other domains when running on Xen FreeBSD uses
unused physical memory regions. Until now this memory has been allocated
using bus_alloc_resource, but this is not completely safe as we can end up
using unreclaimed MMIO or ACPI regions.

Fix this by introducing a new newbus method that can be used by Xen drivers
to request for unused memory regions. On amd64 we make sure this memory
comes from regions above 4GB in order to prevent clashes with MMIO/ACPI
regions. On i386 there's nothing we can do, so just fall back to the
previous mechanism.

Sponsored by: Citrix Systems R&D
Tested by: Gustau Pérez <gperez@entel.upc.edu>
2015-05-08 14:48:40 +00:00
Adrian Chadd
415d7ccab2 Add initial memory locality cost awareness to the VM, and include
a basic ACPI SLIT table parser.

For now this just exports the map via sysctl; it'll eventually be useful
to userland when there's more useful NUMA support in -HEAD.

* Add an optional mem_locality map;
* add a mapping function taking from/to domain and returning the
  relative cost, or -1 if it's not available;
* Add a very basic SLIT parser to x86 ACPI.

Differential Revision:	https://reviews.freebsd.org/D2460
Reviewed by:	rpaulo, stas, jhb
Sponsored by:	Norse Corp, Inc (hardware, coding); Dell (hardware)
2015-05-08 00:56:56 +00:00
Neel Natu
712bd51ada Add macros for AMD-specific bits in MSR_EFER: LMSLE, FFXSR and TCE.
AMDID_FFXSR is at bit 25 so correct its value to 0x02000000.

MFC after:	1 week
2015-05-06 05:12:29 +00:00
John Baldwin
ed95805e90 Remove support for Xen PV domU kernels. Support for HVM domU kernels
remains.  Xen is planning to phase out support for PV upstream since it
is harder to maintain and has more overhead.  Modern x86 CPUs include
virtualization extensions that support HVM guests instead of PV guests.
In addition, the PV code was i386 only and not as well maintained recently
as the HVM code.
- Remove the i386-only NATIVE option that was used to disable certain
  components for PV kernels.  These components are now standard as they
  are on amd64.
- Remove !XENHVM bits from PV drivers.
- Remove various shims required for XEN (e.g. PT_UPDATES_FLUSH, LOAD_CR3,
  etc.)
- Remove duplicate copy of <xen/features.h>.
- Remove unused, i386-only xenstored.h.

Differential Revision:	https://reviews.freebsd.org/D2362
Reviewed by:	royger
Tested by:	royger (i386/amd64 HVM domU and amd64 PVH dom0)
Relnotes:	yes
2015-04-30 15:48:48 +00:00
Wei Hu
da2f98a1cf Microsoft vmbus, storage and other related driver enhancements for HyperV.
- Vmbus multi channel support.
    - Vector interrupt support.
    - Signal optimization.
    - Storvsc driver performance improvement.
    - Scatter and gather support for storvsc driver.
    - Minor bug fix for KVP driver.
Thanks royger, jhb and delphij from FreeBSD community for the reviews
and comments. Also thanks Hovy Xu from NetApp for the contributions to
the storvsc driver.

PR:     195238
Submitted by:   whu
Reviewed by:    royger, jhb, delphij
Approved by:    royger
MFC after:      2 weeks
Relnotes:       yes
Sponsored by:   Microsoft OSTC
2015-04-29 10:12:34 +00:00
Hans Petter Selasky
5caa65ca2d The add_bounce_page() function can be called when loading physical
pages which pass a NULL virtual address. If the BUS_DMA_KEEP_PG_OFFSET
flag is set, use the physical address to compute the page offset
instead. The physical address should always be valid when adding
bounce pages and should contain the same page offset like the virtual
address.

Submitted by:	Svatopluk Kraus <onwahe@gmail.com>
MFC after:	1 week
Reviewed by:	jhb@
2015-04-28 06:12:37 +00:00
Konstantin Belousov
02c26f81a7 Move common code from sys/i386/i386/mp_machdep.c and
sys/amd64/amd64/mp_machdep.c, to the new common x86 source
sys/x86/x86/mp_x86.c.

Proposed and reviewed by:	jhb
Review:	https://reviews.freebsd.org/D2347
Sponsored by:	The FreeBSD Foundation
2015-04-24 16:20:56 +00:00
John Baldwin
179fa75e6e Reassign copyright statements on several files from Advanced
Computing Technologies LLC to Hudson River Trading LLC.

Approved by:	Hudson River Trading LLC (who owns ACT LLC)
MFC after:	1 week
2015-04-23 14:22:20 +00:00
Konstantin Belousov
dfe7b3bfbc Move some common code from sys/amd64/amd64/machdep.c and
sys/i386/i386/machdep.c to new file sys/x86/x86/cpu_machdep.c.  Most
of the code is related to the idle handling.

Discussed with:	pluknet
Sponsored by:	The FreeBSD Foundation
2015-04-22 12:32:14 +00:00
Marius Strobl
62def911bd Refine the workaround for Intel HSD131 [1] added in r269052:
- Use the full mask described by the erratum as with a sufficiently high
  number of these false-positives, the overflow bit (bit 62) additionally
  gets set [7].
- HSD131 has been brought into several other Haswell-derived CPUs including
  to the next generation, i. e. Intel Broadwell. Thus, also skip reporting of
  these benign errors by default on CPU models affected by HSM142, HSW131 and
  BDM48 [2 - 5], describing the HSD131 silicon bug for additional models.
  Also, Celeron 2955U with a CPU ID of 0x45 have been reported to be covered
  by this fault [6], with the specification update concerned with HSM142 [2]
  only referring to 0x3c and 0x46.

Submitted by:	David Froehlich [7]
MFC after:	3 days

http://www.intel.de/content/dam/www/public/us/en/documents/specification-updates/4th-gen-core-family-desktop-specification-update.pdf [1]
http://www.intel.com/content/dam/www/public/us/en/documents/specification-updates/4th-gen-core-family-mobile-specification-update.pdf [2]
http://www.intel.com/content/dam/www/public/us/en/documents/specification-updates/5th-gen-core-family-spec-update.pdf [3]
http://www.intel.de/content/dam/www/public/us/en/documents/specification-updates/core-m-processor-family-spec-update.pdf [4]
http://www.intel.com/content/dam/www/public/us/en/documents/specification-updates/xeon-e3-1200v3-spec-update.pdf [5]
https://lists.freebsd.org/pipermail/freebsd-hackers/2015-January/046878.html [6]
2015-04-19 20:15:57 +00:00
Konstantin Belousov
9dda94167d Revert unrelated chunk from the r281707.
MFC after:	2 weeks
2015-04-18 21:27:28 +00:00
Konstantin Belousov
1c8e7232b4 Remove lazy pmap switch code from i386. Naive benchmark with md(4)
shows no difference with the code removed.

On both amd64 and i386, assert that a released pmap is not active.

Proposed and reviewed by:	alc
Discussed with:	Svatopluk Kraus <onwahe@gmail.com>, peter
Sponsored by:	The FreeBSD Foundation
MFC after:	2 weeks
2015-04-18 21:23:16 +00:00
Konstantin Belousov
34c15db9cd Add config option PAE_TABLES for the i386 kernel. It switches pmap to
use PAE format for the page tables, but does not incur other
consequences of the full PAE config.  In particular, vm_paddr_t and
bus_addr_t are left 32bit, and max supported memory is still limited
by 4GB.

The option allows to have nx permissions for memory mappings on i386
kernel, while keeping the usual i386 KBI and avoiding the kernel data
sizing problems typical for the PAE config.

Intel documented that the PAE format for page tables is available
starting with the Pentium Pro, but it is possible that the plain
Pentium CPUs have the required support (Appendix H).  The goal is to
enable the option and non-exec mappings on i386 for the GENERIC
kernel.  Anybody wanting a useful system on 486, have to reconfigure
the modern i386 kernel anyway.

Discussed with:	alc, jhb
Sponsored by:	The FreeBSD Foundation
MFC after:	2 weeks
2015-04-13 15:22:45 +00:00
Jung-uk Kim
9e222cd613 Fix build on i386.
Reported by:	bz
2015-04-12 22:40:27 +00:00
John Baldwin
dbee5c671a Move the 32-bit compatible procfs types from freebsd32.h to <sys/procfs.h>
and export them to userland.
- Define __HAVE_REG32 on platforms that define a reg32 structure and check
  for this in <sys/procfs.h> to control when to export prstatus32, etc.
- Add prstatus32_t and prpsinfo32_t typedefs for the 32-bit structures.
  libbfd looks for these types, and having them fixes 'gcore' in gdb of a
  32-bit process on a 64-bit platform.
- Use the structure definitions from <sys/procfs.h> in gcore's elf32 core
  dump code instead of duplicating the definitions.

Differential Revision:	https://reviews.freebsd.org/D2142
Reviewed by:	kib, nathanw (powerpc bits)
MFC after:	1 week
2015-04-08 16:30:45 +00:00
Konstantin Belousov
5f01ba81ae Account for the offset of the page run when allocating the
dmar_map_entry.  Non-zero offset both increases the required mapping
size, which is handled in dmar_bus_dmamap_load_something1(), and makes
it possible that allocated range crosses boundary, which needs a check
in dmar_gas_match_one().

Reported and tested by:	jimharris
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
2015-04-08 01:55:22 +00:00
Konstantin Belousov
bdf31595ff When mapping an allocated entry, use the entry size, instead of the
requested size.  If tag restrictions caused split entry, its size is
less then requsted.

Hardware provided by:	Michael Fuckner <michael@fuckner.net>
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
2015-03-24 12:48:51 +00:00
Konstantin Belousov
a3d78402d2 Assert that the mapping loop makes progress.
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
2015-03-24 12:46:21 +00:00
Konstantin Belousov
0a110d5b17 Use VT-d interrupt remapping block (IR) to perform FSB messages
translation.  In particular, despite IO-APICs only take 8bit apic id,
IR translation structures accept 32bit APIC Id, which allows x2APIC
mode to function properly.  Extend msi_cpu of struct msi_intrsrc and
io_cpu of ioapic_intsrc to full int from one byte.

KPI of IR is isolated into the x86/iommu/iommu_intrmap.h, to avoid
bringing all dmar headers into interrupt code. The non-PCI(e) devices
which generate message interrupts on FSB require special handling. The
HPET FSB interrupts are remapped, while DMAR interrupts are not.

For each msi and ioapic interrupt source, the iommu cookie is added,
which is in fact index of the IRE (interrupt remap entry) in the IR
table. Cookie is made at the source allocation time, and then used at
the map time to fill both IRE and device registers. The MSI
address/data registers and IO-APIC redirection registers are
programmed with the special values which are recognized by IR and used
to restore the IRE index, to find proper delivery mode and target.
Map all MSI interrupts in the block when msi_map() is called.

Since an interrupt source setup and dismantle code are done in the
non-sleepable context, flushing interrupt entries cache in the IR
hardware, which is done async and ideally waits for the interrupt,
requires busy-wait for queue to drain.  The dmar_qi_wait_for_seq() is
modified to take a boolean argument requesting busy-wait for the
written sequence number instead of waiting for interrupt.

Some interrupts are configured before IR is initialized, e.g. ACPI
SCI.  Add intr_reprogram() function to reprogram all already
configured interrupts, and call it immediately before an IR unit is
enabled.  There is still a small window after the IO-APIC redirection
entry is reprogrammed with cookie but before the unit is enabled, but
to fix this properly, IR must be started much earlier.

Add workarounds for 5500 and X58 northbridges, some revisions of which
have severe flaws in handling IR.  Use the same identification methods
as employed by Linux.

Review:	https://reviews.freebsd.org/D1892
Reviewed by:	neel
Discussed with:	jhb
Tested by:	glebius, pho (previous versions)
Sponsored by:	The FreeBSD Foundation
MFC after:	3 weeks
2015-03-19 13:57:47 +00:00
Konstantin Belousov
dcc33b0a8a Provide definitions for all descriptors types in the DMAR invalidation
queue.  They are for first-level translations and device TLB.

Review:	https://reviews.freebsd.org/D1892
Reviewed by:	neel
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
2015-03-19 13:05:55 +00:00
Konstantin Belousov
0f5830b045 Fix syntax error.
Review:	https://reviews.freebsd.org/D1892
Found by:	neel
Sponsored by:	The FreeBSD Foundation
MFC after:	3 days
2015-03-19 13:03:58 +00:00
Konstantin Belousov
743ffdaf3c When initial placement of the new entry crosses the boundary,
allocator tries to move the entry up, after the boundary.  The new
location may still fail to satisfy boundary requirement, for instance,
if the boundary is set to page size, and allocation is of multiple
pages.

Recheck that boundary is not crossed after the move.  If it is
crossed, give up on allocating the whole entry and split it.

Reported by:	Michael Fuckner <michael@fuckner.net>, running nvme(4)
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
2015-03-17 22:00:11 +00:00
Konstantin Belousov
582b656248 When inserting new entry into the address map, ensure that not only
next entry does not intersect with the tail of the new entry, but also
that previous entry is also before new entry start.

Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
2015-03-17 21:55:33 +00:00
Neel Natu
8958d18cb3 Add x86 specific APIs 'lapic_ipi_alloc()' and 'lapic_ipi_free()' to allow IPI
vectors to be dynamically allocated. This allows kernel modules like vmm.ko
to allocate unique IPI slots when loaded (as opposed to hard allocating one
or more vectors).

Also, reorganize the fixed IPI vectors to create a contiguous space for
dynamic IPI allocation.

Reviewed by:	kib, jhb
Differential Revision:	https://reviews.freebsd.org/D2042
2015-03-14 00:30:41 +00:00
Neel Natu
847383d090 Free up the IPI slot used by IPI_STOP_HARD.
Change the numeric value of IPI_STOP_HARD so it doesn't occupy a valid IPI
slot. This can be done because IPI_STOP_HARD is actually delivered via NMI.

Reviewed by:	kib
Differential Revision:	https://reviews.freebsd.org/D1983
2015-03-01 02:31:27 +00:00
Konstantin Belousov
68c2f49379 Since all generations of Intel CPUs have errata which causes hang on
the cache line flush in the LAPIC page, keep direct map page covering
LAPIC mapped uncached.

To have the (incomplete) check for the LAPIC range in
pmap_invalidate_cache_range() working, lapic_paddr must be initialized
in x2APIC mode too.

Sponsored by:	The FreeBSD Foundation
MFC after:	2 months
2015-02-27 11:13:46 +00:00
Roger Pau Monné
544b31a3e3 xen/intr: fix fallout from r278854
r278854 introduced a race in the event channel handling code. We must make
sure that the pending bit is cleared before executing the filter, or else we
might miss other events that would be injected after the filter has ran but
before the pending bit is cleared.

While there also mask event channels while FreeBSD executes the ithread
bound to that event channel. This refrains Xen from injecting more
interrupts while the ithread has not finished it's work.

Sponsored by: Citrix Systems R&D
Reported by: sbruno, robak
Tested by: robak
2015-02-26 16:05:09 +00:00
Konstantin Belousov
2d4c4c8dc7 Implements EOI suppression mode, where LAPIC on EOI command for
level-triggered interrupt does not broadcast the EOI message to all
APICs in the system.  Instead, interrupt handler must follow LAPIC EOI
with IOAPIC EOI.  For modern IOAPICs, the later is done by writing to
EOIR register.  Otherwise, Intel provided Linux with a trick of
temporary switching the pin config to edge and then back to level.

Detect presence of EOIR register by reading IO-APIC version.  The
summary table in the comments was taken from the Linux kernel.  For
Intel, newer IO-APICs are only briefly documented as part of the
ICH/PCH datasheet.  According to the BKDG and chipset documentation,
AMD LAPICs do not provide EOI suppression, althought IO-APICs do
declare version 0x21 and implement EOIR.

The trick to temporary switch pin to edge mode to clear IRR was tested
on modern chipset, by pretending that EOIR is not present, i.e. by
forcing io_haseoi to zero.

Tunable hw.lapic_eoi_suppression disables the optimization.

Reviewed by:	neel
Tested by:	pho
Review:	https://reviews.freebsd.org/D1943
Sponsored by:	The FreeBSD Foundation
MFC after:	2 months
2015-02-26 11:02:40 +00:00
Konstantin Belousov
45bc78eb45 For now, disable x2APIC mode when Xen is detected, even if CPU
declares support for it.  Newer versions of Xen works fine with x2APIC
code, but e.g. Xen 4.2 delivers GPF on the LAPIC MSR write, despite
x2APIC mode being known to hypervisor.

Discussed with:	royger
Sponsored by:	The FreeBSD Foundation
2015-02-25 16:44:07 +00:00
Konstantin Belousov
af02586229 Revert r276949 and redo the fix for PCIe/PCI bridges, which do not
follow specification and do not provide PCIe capability.

Verify if the port above such bridge is downstream PCIe (or root port)
and treat the bridge as PCIe/PCI then.  This allows to avoid
maintaining the table of device ids for bridges without capability,
while still calculate correct request originator for devices behind
the bridge.

Submitted by:	Jason Harmening <jason.harmening@gmail.com>
MFC after:	1 week
2015-02-21 22:38:32 +00:00
Tijl Coosemans
f140f6d14b Fix build on i386 without "device apic"
Reviewed by:	kib
2015-02-20 19:42:26 +00:00
Konstantin Belousov
117c6e7cf2 Fix UP build.
Sponsored by:	The FreeBSD Foundation
MFC after:	2 months
2015-02-18 10:51:48 +00:00
Konstantin Belousov
5f674c4cbd Initialize x2APIC mode on the resume path before accessing LAPIC.
Remove unneeded disable of LAPIC in the native_lapic_xapic_mode().  We
attempt to send wakeup IPI on the resume path right after BSP wakeup,
so disabling is wrong.

Reported and tested by:	glebius, "Ranjan1018 ." <214748mv@gmail.com>
Sponsored by:	The FreeBSD Foundation
MFC after:	2 months
2015-02-16 21:56:19 +00:00