Commit Graph

35 Commits

Author SHA1 Message Date
Roger Pau Monné
27c36a12f1 xen: introduce a new way to setup event channel upcall
The main differences with the currently implemented method are:

 - Requires a local APIC EOI, since it doesn't bypass the local APIC
   as the previous method used to do.
 - Can be set to use different IDT vectors on each vCPU. Note that
   FreeBSD doesn't make use of this feature since the event channel
   IDT vector is reserved system wide.

Note that the old method of setting the event channel upcall is
not removed, and will be used as a fallback if this newly introduced
method is not available.

MFC after:	1 month
Sponsored by:	Citrix Systems R&D
2019-01-30 11:34:52 +00:00
John Baldwin
b6b42932db Convert the number of MSI IRQs on x86 from a constant to a tunable.
The number of MSI IRQs still defaults to 512, but it can now be
changed at boot time via the machdep.num_msi_irqs tunable.

Reviewed by:	kib, royger (older version)
Reviewed by:	markj
MFC after:	1 month
Relnotes:	yes
Differential Revision:	https://reviews.freebsd.org/D17977
2018-11-15 18:37:41 +00:00
Roger Pau Monné
a74cdf4e74 xen: legacy PVH fixes for the new interrupt count
Register interrupts using the PIC pic_register_sources method instead
of doing it in apic_setup_io. This is now required, since the internal
interrupt structures are not yet setup when calling apic_setup_io.

Approved by:		re (gjb)
Sponsored by:		Citrix Systems R&D
2018-09-13 07:14:11 +00:00
Roger Pau Monné
4fcd5f3003 xen: limit the usage of PIRQs to a legacy PVH Dom0
That's the only mode in FreeBSD that requires the usage of PIRQs, so
there's no need to attach the PIRQ PIC when running in other modes.

Approved by:		re (gjb)
Sponsored by:		Citrix Systems R&D
2018-09-13 07:11:11 +00:00
John Baldwin
fd036deac1 Dynamically allocate IRQ ranges on x86.
Previously, x86 used static ranges of IRQ values for different types
of I/O interrupts.  Interrupt pins on I/O APICs and 8259A PICs used
IRQ values from 0 to 254.  MSI interrupts used a compile-time-defined
range starting at 256, and Xen event channels used a
compile-time-defined range after MSI.  Some recent systems have more
than 255 I/O APIC interrupt pins which resulted in those IRQ values
overflowing into the MSI range triggering an assertion failure.

Replace statically assigned ranges with dynamic ranges.  Do a single
pass computing the sizes of the IRQ ranges (PICs, MSI, Xen) to
determine the total number of IRQs required.  Allocate the interrupt
source and interrupt count arrays dynamically once this pass has
completed.  To minimize runtime complexity these arrays are only sized
once during bootup.  The PIC range is determined by the PICs present
in the system.  The MSI and Xen ranges continue to use a fixed size,
though this does make it possible to turn the MSI range size into a
tunable in the future.

As a result, various places are updated to use dynamic limits instead
of constants.  In addition, the vmstat(8) utility has been taught to
understand that some kernels may treat 'intrcnt' and 'intrnames' as
pointers rather than arrays when extracting interrupt stats from a
crashdump.  This is determined by the presence (vs absence) of a
global 'nintrcnt' symbol.

This change reverts r189404 which worked around a buggy BIOS which
enumerated an I/O APIC twice (using the same memory mapped address for
both entries but using an IRQ base of 256 for one entry and a valid
IRQ base for the second entry).  Making the "base" of MSI IRQ values
dynamic avoids the panic that r189404 worked around, and there may now
be valid I/O APICs with an IRQ base above 256 which this workaround
would incorrectly skip.

If in the future the issue reported in PR 130483 reoccurs, we will
have to add a pass over the I/O APIC entries in the MADT to detect
duplicates using the memory mapped address and use some strategy to
choose the "correct" one.

While here, reserve room in intrcnts for the Hyper-V counters.

PR:		229429, 130483
Reviewed by:	kib, royger, cem
Tested by:	royger (Xen), kib (DMAR)
Approved by:	re (gjb)
MFC after:	2 weeks
Differential Revision:	https://reviews.freebsd.org/D16861
2018-08-28 21:09:19 +00:00
Andrew Turner
2bf9501287 Create a new macro for static DPCPU data.
On arm64 (and possible other architectures) we are unable to use static
DPCPU data in kernel modules. This is because the compiler will generate
PC-relative accesses, however the runtime-linker expects to be able to
relocate these.

In preparation to fix this create two macros depending on if the data is
global or static.

Reviewed by:	bz, emaste, markj
Sponsored by:	ABT Systems Ltd
Differential Revision:	https://reviews.freebsd.org/D16140
2018-07-05 17:13:37 +00:00
Jeff Roberson
27a3c9d710 Restore r331606 with a bugfix to setup cpuset_domain[] earlier on all
platforms.  Original commit message as follows:

Only use CPUs in the domain the device is attached to for default
assignment.  Device drivers are able to override the default assignment
if they bind directly.  There are severe performance penalties for
handling interrupts on remote CPUs and this should only be done in
very controlled circumstances.

Reviewed by:    jhb, kib
Tested by:      pho
Sponsored by:   Netflix, Dell/EMC Isilon
Differential Revision:  https://reviews.freebsd.org/D14838
2018-03-28 18:47:35 +00:00
Jeff Roberson
261c408744 Backout r331606 until I can identify why it does not boot on some
machines.
2018-03-27 10:20:50 +00:00
Jeff Roberson
a48de40bcc Only use CPUs in the domain the device is attached to for default
assignment.  Device drivers are able to override the default assignment
if they bind directly.  There are severe performance penalties for
handling interrupts on remote CPUs and this should only be done in
very controlled circumstances.

Reviewed by:	jhb, kib
Tested by:	pho (earlier version)
Sponsored by:	Netflix, Dell/EMC Isilon
Differential Revision:	https://reviews.freebsd.org/D14838
2018-03-27 03:37:04 +00:00
Roger Pau Monné
ca7af67ac9 xen: fix IPI setup with EARLY_AP_STARTUP
Current Xen IPI setup functions require that the caller provide a device in
order to obtain the name of the interrupt from it. With early AP startup this
device is no longer available at the point where IPIs are bound, and a KASSERT
would trigger:

panic: NULL pcpu device_t
cpuid = 0
KDB: stack backtrace:
db_trace_self_wrapper() at db_trace_self_wrapper+0x2b/frame 0xffffffff82233a20
vpanic() at vpanic+0x186/frame 0xffffffff82233aa0
kassert_panic() at kassert_panic+0x126/frame 0xffffffff82233b10
xen_setup_cpus() at xen_setup_cpus+0x5b/frame 0xffffffff82233b50
mi_startup() at mi_startup+0x118/frame 0xffffffff82233b70
btext() at btext+0x2c

Fix this by no longer requiring the presence of a device in order to bind IPIs,
and simply use the "cpuX" format where X is the CPU identifier in order to
describe the interrupt.

Reported by:            sbruno, cperciva
Tested by:              sbruno
X-MFC-With:             r310177
Sponsored by:           Citrix Systems R&D
2016-12-22 16:09:44 +00:00
Bryan Drewery
28323add09 Fix improper use of "its".
Sponsored by:	Dell EMC Isilon
2016-11-08 23:59:41 +00:00
Roger Pau Monné
0f4d7d9fd7 xen/intr: add reference counts to event channels
Add a reference count to xenisrc. This is required for implementation of
unmap-notifications in the grant table userspace device (gntdev). We need to
hold a reference to the event channel port, in case the user deallocates the
port before we send the notification.

Submitted by:		jaggi
Reviewed by:		royger
Differential review:	https://reviews.freebsd.org/D7429
2016-10-31 13:00:53 +00:00
Roger Pau Monné
35fdb32d86 xen-intr: fix removal of event channels during resume
Event channel handlers cannot be removed during resume because there might
be an interrupt thread running on a CPU currently blocked in the
cpususpend_handler, which prevents the call to intr_remove_handler from
finishing and completely freezes the system during resume. r291022 tried to
fix this by allowing recursion in intr_remove_handler, but that's clearly
not enough.

Instead don't remove the handlers at the interrupt resume phase, and let
each driver remove the handler by itself during resume. In order to do this,
change the opaque event channel handler cookie to use the global interrupt
vector instead of the event channel port. The event channel port cannot be
used because after resume all event channels are reset, and the port numbers
can change.

Sponsored by:		Citrix Systems R&D
MFC after:		5 days
2016-07-29 16:34:54 +00:00
Roger Pau Monné
ea64b86f94 xen/intr: properly dispose event channels on resume
All event channels are torn down when performing a migration on Xen, make
sure all handlers are also removed and the event channel structure is
properly disposed so it can be reused.

Sponsored by:		Citrix Systems R&D
MFC after:		2 weeks
2015-11-18 18:10:28 +00:00
Roger Pau Monné
f186ed526a xen/intr: fix the event channel enabled per-cpu mask
Fix two issues with the current event channel code, first ENABLED_SETSIZE is
not correctly defined and then using a BITSET to store the per-cpu masks is
not portable to other arches, since on arm32 the event channel arrays shared
with the hypervisor are of type uint64_t and not long. Partially restore the
previous code but switch the bit operations to use the recently introduced
xen_{set/clear/test}_bit versions.

Reviewed by:		Julien Grall <julien.grall@citrix.com>
Sponsored by:		Citrix Systems R&D
Differential Revision:	https://reviews.freebsd.org/D4080
2015-11-05 14:33:46 +00:00
Conrad Meyer
ce7543042c xen: Add missing semi-colon for BITSET_DEFINE()
Broken when it was removed from the macro in r289867.

Pointy-hat:	markj
Sponsored by:	EMC / Isilon Storage Division
2015-10-24 19:04:55 +00:00
Roger Pau Monné
2f9ec994bc xen: Code cleanup and small bug fixes
xen/hypervisor.h:
 - Remove unused helpers: MULTI_update_va_mapping, is_initial_xendomain,
   is_running_on_xen
 - Remove unused define CONFIG_X86_PAE
 - Remove unused variable xen_start_info: note that it's used inpcifront
   which is not built at all
 - Remove forward declaration of HYPERVISOR_crash

xen/xen-os.h:
 - Remove unused define CONFIG_X86_PAE
 - Drop unused helpers: test_and_clear_bit, clear_bit,
   force_evtchn_callback
 - Implement a generic version (based on ofed/include/linux/bitops.h) of
   set_bit and test_bit and prefix them by xen_ to avoid any use by other
   code than Xen. Note that It would be worth to investigate a generic
   implementation in FreeBSD.
 - Replace barrier() by __compiler_membar()
 - Replace cpu_relax() by cpu_spinwait(): it's exactly the same as rep;nop
   = pause

xen/xen_intr.h:
 - Move the prototype of xen_intr_handle_upcall in it: Use by all the
   platform

x86/xen/xen_intr.c:
 - Use BITSET* for the enabledbits: Avoid to use custom helpers
 - test_bit/set_bit has been renamed to xen_test_bit/xen_set_bit
 - Don't export the variable xen_intr_pcpu

dev/xen/blkback/blkback.c:
 - Fix the string format when XBB_DEBUG is enabled: host_addr is typed
   uint64_t

dev/xen/balloon/balloon.c:
 - Remove set but not used variable
 - Use the correct type for frame_list: xen_pfn_t represents the frame
   number on any architecture

dev/xen/control/control.c:
 - Return BUS_PROBE_WILDCARD in xs_probe: Returning 0 in a probe callback
   means the driver can handle this device. If by any chance xenstore is the
   first driver, every new device with the driver is unset will use
   xenstore.

dev/xen/grant-table/grant_table.c:
 - Remove unused cmpxchg
 - Drop unused include opt_pmap.h: Doesn't exist on ARM64 and it doesn't
   contain anything required for the code on x86

dev/xen/netfront/netfront.c:
 - Use the correct type for rx_pfn_array: xen_pfn_t represents the frame
   number on any architecture

dev/xen/netback/netback.c:
 - Use the correct type for gmfn: xen_pfn_t represents the frame number on
   any architecture

dev/xen/xenstore/xenstore.c:
 - Return BUS_PROBE_WILDCARD in xctrl_probe: Returning 0 in a probe callback
   means the driver can handle this device. If by any chance xenstore is the
  first driver, every new device with the driver is unset will use xenstore.

Note that with the changes, x86/include/xen/xen-os.h doesn't contain anymore
arch-specific code. Although, a new series will add some helpers that differ
between x86 and ARM64, so I've kept the headers for now.

Submitted by:		Julien Grall <julien.grall@citrix.com>
Reviewed by:		royger
Differential Revision:	https://reviews.freebsd.org/D3921
Sponsored by:		Citrix Systems R&D
2015-10-21 10:44:07 +00:00
John Baldwin
3c790178c5 Remove some more vestiges of the Xen PV domu support. Specifically,
use vtophys() directly instead of vtomach() and retire the no-longer-used
headers <machine/xenfunc.h> and <machine/xenvar.h>.

Reported by:	bde (stale bits in <machine/xenfunc.h>)
Reviewed by:	royger (earlier version)
Differential Revision:	https://reviews.freebsd.org/D3266
2015-08-06 17:07:21 +00:00
John Baldwin
ed95805e90 Remove support for Xen PV domU kernels. Support for HVM domU kernels
remains.  Xen is planning to phase out support for PV upstream since it
is harder to maintain and has more overhead.  Modern x86 CPUs include
virtualization extensions that support HVM guests instead of PV guests.
In addition, the PV code was i386 only and not as well maintained recently
as the HVM code.
- Remove the i386-only NATIVE option that was used to disable certain
  components for PV kernels.  These components are now standard as they
  are on amd64.
- Remove !XENHVM bits from PV drivers.
- Remove various shims required for XEN (e.g. PT_UPDATES_FLUSH, LOAD_CR3,
  etc.)
- Remove duplicate copy of <xen/features.h>.
- Remove unused, i386-only xenstored.h.

Differential Revision:	https://reviews.freebsd.org/D2362
Reviewed by:	royger
Tested by:	royger (i386/amd64 HVM domU and amd64 PVH dom0)
Relnotes:	yes
2015-04-30 15:48:48 +00:00
Roger Pau Monné
544b31a3e3 xen/intr: fix fallout from r278854
r278854 introduced a race in the event channel handling code. We must make
sure that the pending bit is cleared before executing the filter, or else we
might miss other events that would be injected after the filter has ran but
before the pending bit is cleared.

While there also mask event channels while FreeBSD executes the ithread
bound to that event channel. This refrains Xen from injecting more
interrupts while the ithread has not finished it's work.

Sponsored by: Citrix Systems R&D
Reported by: sbruno, robak
Tested by: robak
2015-02-26 16:05:09 +00:00
Roger Pau Monné
a2c5251281 xen/intr: improve PIRQ handling
Improve and cleanup the Xen PIRQ event channel code:

 - Remove the xi_shared field as it is unused.
 - Clean the "pending" bit in the EOI handler, this is more similar to how
   native interrupts are handled.
 - Don't mask edge triggered PIRQs, edge trigger interrupts cannot be
   masked.
 - Panic if PHYSDEVOP_eoi fails.
 - Remove the usage of the PHYSDEVOP_alloc_irq_vector hypercall because
   it's just a no-op in the Xen versions that are supported by FreeBSD Dom0.

Sponsored by: Citrix Systems R&D
2015-02-16 16:30:42 +00:00
Roger Pau Monné
f229f35db7 xen/intr: balance dynamic interrupts across available vCPUs
By default Xen binds all event channels to vCPU#0, and FreeBSD only shuffles
the interrupt sources once, at the end of the boot process. Since new event
channels might be created after this point (because new devices or backends
are added), try to automatically shuffle them at creation time.

This does not affect VIRQ or IPI event channels, that are already bound to a
specific vCPU as requested by the caller.

Sponsored by: Citrix Systems R&D
2014-12-10 13:25:21 +00:00
Roger Pau Monné
23ca39cf61 xen: mask event channels while binding them to a vCPU
Mask the event channel source before trying to bind it to a CPU, this
prevents stray interrupts from firing while assigning them and hitting the
KASSERT in xen_intr_handle_upcall.

Sponsored by: Citrix Systems R&D
2014-12-10 11:42:02 +00:00
Roger Pau Monné
6d54cab1fe xen: allow to register event channels without handlers
This is needed by the event channel user-space device, that requires
registering event channels without unmasking them. intr_add_handler
will unconditionally unmask the event channel, so we avoid calling it
if no filter/handler is provided, and then the user will be in charge
of calling it when ready.

In order to do this, we need to change the opaque type
xen_intr_handle_t to contain the event channel port instead of the
opaque cookie returned by intr_add_handler, since now registration of
event channels without handlers are allowed. The cookie will now be
stored inside of the private xenisrc struct. Also, introduce a new
function called xen_intr_add_handler that allows adding a
filter/handler after the event channel has been registered.

Sponsored by: Citrix Systems R&D

x86/xen/xen_intr.c:
 - Leave the event channel without a handler if no filter/handler is
   provided to xen_intr_bind_isrc.
 - Don't perform an evtchn_mask_port, intr_add_handler will already do
   it.
 - Change the opaque type xen_intr_handle_t to contain a pointer to
   the event channel port number, and make the necessary changes to
   related functions.
 - Introduce a new function called xen_intr_add_handler that can be
   used to add filter/handlers to an event channel after registration.

xen/xen_intr.h:
 - Add prototype of xen_intr_add_handler.
2014-10-22 16:51:52 +00:00
Roger Pau Monné
44e06d158a msi: add Xen MSI implementation
This patch adds support for MSI interrupts when running on Xen. Apart
from adding the Xen related code needed in order to register MSI
interrupts this patch also makes the msi_init function a hook in
init_ops, so different MSI implementations can have different
initialization functions.

Sponsored by: Citrix Systems R&D

xen/interface/physdev.h:
 - Add the MAP_PIRQ_TYPE_MULTI_MSI to map multi-vector MSI to the Xen
   public interface.

x86/include/init.h:
 - Add a hook for setting custom msi_init methods.

amd64/amd64/machdep.c:
i386/i386/machdep.c:
 - Set the default msi_init hook to point to the native MSI
   initialization method.

x86/xen/pv.c:
 - Set the Xen MSI init hook when running as a Xen guest.

x86/x86/local_apic.c:
 - Call the msi_init hook instead of directly calling msi_init.

xen/xen_intr.h:
x86/xen/xen_intr.c:
 - Introduce support for registering/releasing MSI interrupts with
   Xen.
 - The MSI interrupts will use the same PIC as the IO APIC interrupts.

xen/xen_msi.h:
x86/xen/xen_msi.c:
 - Introduce a Xen MSI implementation.

x86/xen/xen_nexus.c:
 - Overwrite the default MSI hooks in the Xen Nexus to use the Xen MSI
   implementation.

x86/xen/xen_pci.c:
 - Introduce a Xen specific PCI bus that inherits from the ACPI PCI
   bus and overwrites the native MSI methods.
 - This is needed because when running under Xen the MSI messages used
   to configure MSI interrupts on PCI devices are written by Xen
   itself.

dev/acpica/acpi_pci.c:
 - Lower the quality of the ACPI PCI bus so the newly introduced Xen
   PCI bus can take over when needed.

conf/files.i386:
conf/files.amd64:
 - Add the newly created files to the build process.
2014-09-30 16:46:45 +00:00
Roger Pau Monné
9c7116e195 xen: don't set suspend/resume methods for the PIRQ PIC
The suspend/resume of event channels is already handled by the xen_intr_pic.
If those methods are set on the PIRQ PIC they are just called twice, which
breaks proper resume. This fix restores migration of FreeBSD guests to a
working state.

Sponsored by: Citrix Systems R&D
2014-09-15 15:15:52 +00:00
Roger Pau Monné
a36a5425c7 xen: change order of Xen intr init and IO APIC registration
This change inserts the Xen interrupt subsystem (event channels)
initialization between the system interrupt initialization and the IO
APIC source registration.

This is needed when running on Dom0, that routes physical interrupts
on top of event channels, so that the interrupt sources found during
IO APIC initialization can be registered using the Xen interrupt
subsystem.

The resulting order in the SI_SUB_INTR stage is the following:

- System intr initialization
- Xen intr initalization
- IO APIC source registration

Sponsored by: Citrix Systems R&D

x86/x86/local_apic.c:
 - Change order of apic_setup_io to be called after xen interrupt
   subsystem is setup.

x86/xen/xen_intr.c:
 - Init Xen event channels before apic_setup_io.
2014-08-04 08:54:34 +00:00
Roger Pau Monné
a33ea97e26 xen: add a DDB command to print event channel information
Add a new DDB command to dump all registered event channels.

Sponsored by: Citrix Systems R&D

x86/xen/xen_intr.c:
 - Add a new xen_evtchn command to DDB in order to dump all
   information related to event channels.
2014-08-04 08:52:10 +00:00
Roger Pau Monné
c9f3ec7fd2 xen: mask all event channels on init
Mask all event channels during initialization. This is done so that we
don't receive spurious interrupts while dynamically registering new
event channels. There's a small window during registration where an
event channel can fire before we have attached a handler to it.

Sponsored by: Citrix Systems R&D

x86/xen/xen_intr.c:
 - Mask all event channels on init.
2014-08-04 08:43:27 +00:00
Roger Pau Monné
25e34dd327 xen: implement event channel PIRQ support
This allows Dom0 to manage physical hardware, redirecting the
physical interrupts to event channels.

Sponsored by: Citrix Systems R&D

x86/xen/xen_intr.c:
 - Expand struct xenisrc to hold the level and triggering of PIRQ
   event channels.
 - Implement missing methods in xen_intr_pirq_pic.
 - Allow xen_intr_alloc_isrc to take a vector parameter that globally
   identifies the interrupt. This is only used for PIRQs that are
   bound to a specific hardware IRQ.
 - Introduce xen_register_pirq used to register IO APIC legacy PIRQ
   interrupts.
 - Add support for the dynamic PIRQ EOI map, this shared memory is
   modified by Xen (if it suppoorts that feature), and notifies the
   guest if an EOI is needed or not. If it's not available fall back
   to the old implementation using PHYSDEVOP_irq_status_query.
 - Rename xen_intr_isrc_count to xen_intr_auto_vector_count and
   replace it's usages.
 - Align static variables by name.

xen/xen_intr.h:
 - Add prototype for xen_register_pirq.
2014-08-04 08:42:29 +00:00
John Baldwin
e07ef9b0f6 Move <machine/apicvar.h> to <x86/apicvar.h>. 2014-01-23 20:10:22 +00:00
Justin T. Gibbs
5fdd34ee20 Formalize the concept of virtual CPU ids by adding a per-cpu vcpu_id
field.  Perform vcpu enumeration for Xen PV and HVM environments
and convert all Xen drivers to use vcpu_id instead of a hard coded
assumption of the mapping algorithm (acpi or apic ID) in use.

Submitted by:	Roger Pau Monné
Sponsored by:	Citrix Systems R&D
Reviewed by:	gibbs
Approved by:	re (blanket Xen)

amd64/include/pcpu.h:
i386/include/pcpu.h:
	Add vcpu_id to the amd64 and i386 pcpu structures.

dev/xen/timer/timer.c
x86/xen/xen_intr.c
	Use new vcpu_id instead of assuming acpi_id == vcpu_id.

i386/xen/mp_machdep.c:
i386/xen/mptable.c
x86/xen/hvm.c:
	Perform Xen HVM and Xen full PV vcpu_id mapping.

x86/xen/hvm.c:
x86/acpica/madt.c
	Change SYSINIT ordering of acpi CPU enumeration so that it
	is guaranteed to be available at the time of Xen HVM vcpu
	id mapping.
2013-10-05 23:11:01 +00:00
Justin T. Gibbs
428b7ca290 Add support for suspend/resume/migration operations when running as a
Xen PVHVM guest.

Submitted by:	Roger Pau Monné
Sponsored by:	Citrix Systems R&D
Reviewed by:	gibbs
Approved by:	re (blanket Xen)
MFC after:	2 weeks

sys/amd64/amd64/mp_machdep.c:
sys/i386/i386/mp_machdep.c:
	- Make sure that are no MMU related IPIs pending on migration.
	- Reset pending IPI_BITMAP on resume.
	- Init vcpu_info on resume.

sys/amd64/include/intr_machdep.h:
sys/i386/include/intr_machdep.h:
sys/x86/acpica/acpi_wakeup.c:
sys/x86/x86/intr_machdep.c:
sys/x86/isa/atpic.c:
sys/x86/x86/io_apic.c:
sys/x86/x86/local_apic.c:
	- Add a "suspend_cancelled" parameter to pic_resume().  For the
	  Xen PIC, restoration of interrupt services differs between
	  the aborted suspend and normal resume cases, so we must provide
	  this information.

sys/dev/acpica/acpi_timer.c:
sys/dev/xen/timer/timer.c:
sys/timetc.h:
	- Don't swap out "suspend safe" timers across a suspend/resume
	  cycle.  This includes the Xen PV and ACPI timers.

sys/dev/xen/control/control.c:
	- Perform proper suspend/resume process for PVHVM:
		- Suspend all APs before going into suspension, this allows us
		  to reset the vcpu_info on resume for each AP.
		- Reset shared info page and callback on resume.

sys/dev/xen/timer/timer.c:
	- Implement suspend/resume support for the PV timer. Since FreeBSD
	  doesn't perform a per-cpu resume of the timer, we need to call
	  smp_rendezvous in order to correctly resume the timer on each CPU.

sys/dev/xen/xenpci/xenpci.c:
	- Don't reset the PCI interrupt on each suspend/resume.

sys/kern/subr_smp.c:
	- When suspending a PVHVM domain make sure there are no MMU IPIs
	  in-flight, or we will get a lockup on resume due to the fact that
	  pending event channels are not carried over on migration.
	- Implement a generic version of restart_cpus that can be used by
	  suspended and stopped cpus.

sys/x86/xen/hvm.c:
	- Implement resume support for the hypercall page and shared info.
	- Clear vcpu_info so it can be reset by APs when resuming from
	  suspension.

sys/dev/xen/xenpci/xenpci.c:
sys/x86/xen/hvm.c:
sys/x86/xen/xen_intr.c:
	- Support UP kernel configurations.

sys/x86/xen/xen_intr.c:
	- Properly rebind per-cpus VIRQs and IPIs on resume.
2013-09-20 05:06:03 +00:00
Justin T. Gibbs
e44af46e4c Implement PV IPIs for PVHVM guests and further converge PV and HVM
IPI implmementations.

Submitted by: Roger Pau Monné
Sponsored by: Citrix Systems R&D
Submitted by: gibbs (misc cleanup, table driven config)
Reviewed by:  gibbs
MFC after: 2 weeks

sys/amd64/include/cpufunc.h:
sys/amd64/amd64/pmap.c:
	Move invltlb_globpcid() into cpufunc.h so that it can be
	used by the Xen HVM version of tlb shootdown IPI handlers.

sys/x86/xen/xen_intr.c:
sys/xen/xen_intr.h:
	Rename xen_intr_bind_ipi() to xen_intr_alloc_and_bind_ipi(),
	and remove the ipi vector parameter.  This api allocates
	an event channel port that can be used for ipi services,
	but knows nothing of the actual ipi for which that port
	will be used.  Removing the unused argument and cleaning
	up the comments surrounding its declaration helps clarify
	its actual role.

sys/amd64/amd64/mp_machdep.c:
sys/amd64/include/cpu.h:
sys/i386/i386/mp_machdep.c:
sys/i386/include/cpu.h:
	Implement a generic framework for amd64 and i386 that allows
	the implementation of certain CPU management functions to
	be selected at runtime.  Currently this is only used for
	the ipi send function, which we optimize for Xen when running
	on a Xen hypervisor, but can easily be expanded to support
	more operations.

sys/x86/xen/hvm.c:
	Implement Xen PV IPI handlers and operations, replacing native
	send IPI.

sys/amd64/include/pcpu.h:
sys/i386/include/pcpu.h:
sys/i386/include/smp.h:
	Remove NR_VIRQS and NR_IPIS from FreeBSD headers.  NR_VIRQS
	is defined already for us in the xen interface files.
	NR_IPIS is only needed in one file per Xen platform and is
	easily inferred by the IPI vector table that is defined in
	those files.

sys/i386/xen/mp_machdep.c:
	Restructure to more closely match the HVM implementation by
	performing table driven IPI setup.
2013-09-06 22:17:02 +00:00
Justin T. Gibbs
76acc41fb7 Implement vector callback for PVHVM and unify event channel implementations
Re-structure Xen HVM support so that:
	- Xen is detected and hypercalls can be performed very
	  early in system startup.
	- Xen interrupt services are implemented using FreeBSD's native
	  interrupt delivery infrastructure.
	- the Xen interrupt service implementation is shared between PV
	  and HVM guests.
	- Xen interrupt handlers can optionally use a filter handler
	  in order to avoid the overhead of dispatch to an interrupt
	  thread.
	- interrupt load can be distributed among all available CPUs.
	- the overhead of accessing the emulated local and I/O apics
	  on HVM is removed for event channel port events.
	- a similar optimization can eventually, and fairly easily,
	  be used to optimize MSI.

Early Xen detection, HVM refactoring, PVHVM interrupt infrastructure,
and misc Xen cleanups:

Sponsored by: Spectra Logic Corporation

Unification of PV & HVM interrupt infrastructure, bug fixes,
and misc Xen cleanups:

Submitted by: Roger Pau Monné
Sponsored by: Citrix Systems R&D

sys/x86/x86/local_apic.c:
sys/amd64/include/apicvar.h:
sys/i386/include/apicvar.h:
sys/amd64/amd64/apic_vector.S:
sys/i386/i386/apic_vector.s:
sys/amd64/amd64/machdep.c:
sys/i386/i386/machdep.c:
sys/i386/xen/exception.s:
sys/x86/include/segments.h:
	Reserve IDT vector 0x93 for the Xen event channel upcall
	interrupt handler.  On Hypervisors that support the direct
	vector callback feature, we can request that this vector be
	called directly by an injected HVM interrupt event, instead
	of a simulated PCI interrupt on the Xen platform PCI device.
	This avoids all of the overhead of dealing with the emulated
	I/O APIC and local APIC.  It also means that the Hypervisor
	can inject these events on any CPU, allowing upcalls for
	different ports to be handled in parallel.

sys/amd64/amd64/mp_machdep.c:
sys/i386/i386/mp_machdep.c:
	Map Xen per-vcpu area during AP startup.

sys/amd64/include/intr_machdep.h:
sys/i386/include/intr_machdep.h:
	Increase the FreeBSD IRQ vector table to include space
	for event channel interrupt sources.

sys/amd64/include/pcpu.h:
sys/i386/include/pcpu.h:
	Remove Xen HVM per-cpu variable data.  These fields are now
	allocated via the dynamic per-cpu scheme.  See xen_intr.c
	for details.

sys/amd64/include/xen/hypercall.h:
sys/dev/xen/blkback/blkback.c:
sys/i386/include/xen/xenvar.h:
sys/i386/xen/clock.c:
sys/i386/xen/xen_machdep.c:
sys/xen/gnttab.c:
	Prefer FreeBSD primatives to Linux ones in Xen support code.

sys/amd64/include/xen/xen-os.h:
sys/i386/include/xen/xen-os.h:
sys/xen/xen-os.h:
sys/dev/xen/balloon/balloon.c:
sys/dev/xen/blkback/blkback.c:
sys/dev/xen/blkfront/blkfront.c:
sys/dev/xen/console/xencons_ring.c:
sys/dev/xen/control/control.c:
sys/dev/xen/netback/netback.c:
sys/dev/xen/netfront/netfront.c:
sys/dev/xen/xenpci/xenpci.c:
sys/i386/i386/machdep.c:
sys/i386/include/pmap.h:
sys/i386/include/xen/xenfunc.h:
sys/i386/isa/npx.c:
sys/i386/xen/clock.c:
sys/i386/xen/mp_machdep.c:
sys/i386/xen/mptable.c:
sys/i386/xen/xen_clock_util.c:
sys/i386/xen/xen_machdep.c:
sys/i386/xen/xen_rtc.c:
sys/xen/evtchn/evtchn_dev.c:
sys/xen/features.c:
sys/xen/gnttab.c:
sys/xen/gnttab.h:
sys/xen/hvm.h:
sys/xen/xenbus/xenbus.c:
sys/xen/xenbus/xenbus_if.m:
sys/xen/xenbus/xenbusb_front.c:
sys/xen/xenbus/xenbusvar.h:
sys/xen/xenstore/xenstore.c:
sys/xen/xenstore/xenstore_dev.c:
sys/xen/xenstore/xenstorevar.h:
	Pull common Xen OS support functions/settings into xen/xen-os.h.

sys/amd64/include/xen/xen-os.h:
sys/i386/include/xen/xen-os.h:
sys/xen/xen-os.h:
	Remove constants, macros, and functions unused in FreeBSD's Xen
	support.

sys/xen/xen-os.h:
sys/i386/xen/xen_machdep.c:
sys/x86/xen/hvm.c:
	Introduce new functions xen_domain(), xen_pv_domain(), and
	xen_hvm_domain().  These are used in favor of #ifdefs so that
	FreeBSD can dynamically detect and adapt to the presence of
	a hypervisor.  The goal is to have an HVM optimized GENERIC,
	but more is necessary before this is possible.

sys/amd64/amd64/machdep.c:
sys/dev/xen/xenpci/xenpcivar.h:
sys/dev/xen/xenpci/xenpci.c:
sys/x86/xen/hvm.c:
sys/sys/kernel.h:
	Refactor magic ioport, Hypercall table and Hypervisor shared
	information page setup, and move it to a dedicated HVM support
	module.

	HVM mode initialization is now triggered during the
	SI_SUB_HYPERVISOR phase of system startup.  This currently
	occurs just after the kernel VM is fully setup which is
	just enough infrastructure to allow the hypercall table
	and shared info page to be properly mapped.

sys/xen/hvm.h:
sys/x86/xen/hvm.c:
	Add definitions and a method for configuring Hypervisor event
	delievery via a direct vector callback.

sys/amd64/include/xen/xen-os.h:
sys/x86/xen/hvm.c:

sys/conf/files:
sys/conf/files.amd64:
sys/conf/files.i386:
	Adjust kernel build to reflect the refactoring of early
	Xen startup code and Xen interrupt services.

sys/dev/xen/blkback/blkback.c:
sys/dev/xen/blkfront/blkfront.c:
sys/dev/xen/blkfront/block.h:
sys/dev/xen/control/control.c:
sys/dev/xen/evtchn/evtchn_dev.c:
sys/dev/xen/netback/netback.c:
sys/dev/xen/netfront/netfront.c:
sys/xen/xenstore/xenstore.c:
sys/xen/evtchn/evtchn_dev.c:
sys/dev/xen/console/console.c:
sys/dev/xen/console/xencons_ring.c
	Adjust drivers to use new xen_intr_*() API.

sys/dev/xen/blkback/blkback.c:
	Since blkback defers all event handling to a taskqueue,
	convert this task queue to a "fast" taskqueue, and schedule
	it via an interrupt filter.  This avoids an unnecessary
	ithread context switch.

sys/xen/xenstore/xenstore.c:
	The xenstore driver is MPSAFE.  Indicate as much when
	registering its interrupt handler.

sys/xen/xenbus/xenbus.c:
sys/xen/xenbus/xenbusvar.h:
	Remove unused event channel APIs.

sys/xen/evtchn.h:
	Remove all kernel Xen interrupt service API definitions
	from this file.  It is now only used for structure and
	ioctl definitions related to the event channel userland
	device driver.

	Update the definitions in this file to match those from
	NetBSD.  Implementing this interface will be necessary for
	Dom0 support.

sys/xen/evtchn/evtchnvar.h:
	Add a header file for implemenation internal APIs related
	to managing event channels event delivery.  This is used
	to allow, for example, the event channel userland device
	driver to access low-level routines that typical kernel
	consumers of event channel services should never access.

sys/xen/interface/event_channel.h:
sys/xen/xen_intr.h:
	Standardize on the evtchn_port_t type for referring to
	an event channel port id.  In order to prevent low-level
	event channel APIs from leaking to kernel consumers who
	should not have access to this data, the type is defined
	twice: Once in the Xen provided event_channel.h, and again
	in xen/xen_intr.h.  The double declaration is protected by
	__XEN_EVTCHN_PORT_DEFINED__ to ensure it is never declared
	twice within a given compilation unit.

sys/xen/xen_intr.h:
sys/xen/evtchn/evtchn.c:
sys/x86/xen/xen_intr.c:
sys/dev/xen/xenpci/evtchn.c:
sys/dev/xen/xenpci/xenpcivar.h:
	New implementation of Xen interrupt services.  This is
	similar in many respects to the i386 PV implementation with
	the exception that events for bound to event channel ports
	(i.e. not IPI, virtual IRQ, or physical IRQ) are further
	optimized to avoid mask/unmask operations that aren't
	necessary for these edge triggered events.

	Stubs exist for supporting physical IRQ binding, but will
	need additional work before this implementation can be
	fully shared between PV and HVM.

sys/amd64/amd64/mp_machdep.c:
sys/i386/i386/mp_machdep.c:
sys/i386/xen/mp_machdep.c
sys/x86/xen/hvm.c:
	Add support for placing vcpu_info into an arbritary memory
	page instead of using HYPERVISOR_shared_info->vcpu_info.
	This allows the creation of domains with more than 32 vcpus.

sys/i386/i386/machdep.c:
sys/i386/xen/clock.c:
sys/i386/xen/xen_machdep.c:
sys/i386/xen/exception.s:
	Add support for new event channle implementation.
2013-08-29 19:52:18 +00:00