This patch comes from Dave Jiang's Linux tree, davejiang/ntb. It hasn't
been accepted into Linus' tree, so I do not have an authoritative SHA1
to point at. Original commit log:
=====================================================================
A hardware errata causes the NTB to hang when heavy bi-directional
traffic in addition to the usage of BAR0/1 (where the registers reside,
including the doorbell registers to trigger interrupts).
This workaround is only available on Haswell and Broadwell platform.
The workaround is to enable split BAR in the BIOS to allow the 64bit
BAR4 to be split into two 32bit BAR4 and BAR5. The BAR4 shall be pointed
to LAPIC region of the remote host. We will bypass the db mechanism and
directly trigger the MSIX interrupts. The offsets and vectors are
exchanged during transport scratch pad negotiation. The scratch pads are
now overloaded in order to allow the exchange of the information. This
gets around using the doorbell and prevents the lockup with additional
pcode changes in BIOS.
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
=====================================================================
Notable changes in the FreeBSD version of this patch:
* The MSIX BAR is configurable, like hw.ntb.b2b_mw_idx (msix_mw_idx).
The Linux version of the patch only uses BAR4.
* MSIX negotiation aborts if the link goes down.
Obtained from: Linux (Dual BSD/GPL driver)
Sponsored by: EMC / Isilon Storage Division
Replace the hw.ntb.enable_writecombine tunable with
hw.ntb.default_mw_pat. It can be set with several specific numerical
values to select a caching type. Any bogus value is treated as
Uncacheable (UC).
The ntb_mw_set_wc() KPI has removed the restriction that the selected
mode must be one of UC, WC, or WB.
Sponsored by: EMC / Isilon Storage Division
And expose vm_memattr_t of current mapping to consumers (as well as the
ability to change it to one of UC, WB, WC).
After short discussion with: jhb (but no review)
Sponsored by: EMC / Isilon Storage Division
Adds a new tunable, ntb.hw.b2b_mw_idx, which specifies the offset (from the
total number of memory windows) to use for register access on hardware with
the SDOORBELL_LOCKUP errata. The default is -1, i.e., the last memory
window.
We map BARs before the b2b_mw_idx is selected, so map them all as memory
windows initially. The register memory window should not be write-combined,
so we explicitly disable WC on the selected MW later.
This introduces a layer of abstraction between consumer memory window
indices, which exclude any exclusive errata-workaround BARs, and internal
memory window indices, which include such BARs. An internal routine,
ntb_user_mw_to_idx(), converts the former to the latter. Public APIs have
been updated to use this instead of assuming the exclusive workaround BAR is
the last available MW.
Sponsored by: EMC / Isilon Storage Division
This feature is disabled by default. To enable it, tune
hw.if_ntb.enable_xeon_watchdog to non-zero.
If enabled, writes an unused NTB register every second to demonstrate to
a hardware watchdog that the NTB device is still alive. Most machines
with NTB will not need this -- you know who you are.
Sponsored by: EMC / Isilon Storage Division
32-bit BARs can only address memory mapped in the low 32 bits of
physical RAM. Expose this as a 'plimit' out parameter from
ntb_mw_get_range().
Fix if_ntb to allocate memory within this limit.
Sponsored by: EMC / Isilon Storage Division
Sometimes they'll read spurious values (observed: 0xc on Broadwell-DE),
failing link negotiation.
Discussed with: Dave Jiang, Allen Hubbe
Sponsored by: EMC / Isilon Storage Division
The tunable 'hw.ntb.enable_writecombine' may be set to zero to
administratively disable write combining the mapped NTB region.
Sponsored by: EMC / Isilon Storage Division
There is no need for the upstream and downstream addresses to be
different for the NTB configs. Go to using a single set of address. It
is still possible to configure them differently using module parameter
override however (CEM: tunable).
Authored by: Dave Jiang <dave.jiang@intel.com>
Reviewed by: Allen Hubbe <Allen.Hubbe@emc.com>
Reviewed by: Jon Mason <jdmason@kudzu.us>
Obtained from: Linux (Dual BSD/GPL driver)
Sponsored by: EMC / Isilon Storage Division
Because it can sleep drainking link work callout(s). Linux (dual
BSD/GPL driver) does something very similar.
At the same time, switch the NTB CTX lock to a non-spin mutex, because
the taskqueue_swi lock can't be taken after a spin mutex.
Suggested by: Witness
Sponsored by: EMC / Isilon Storage Division
In ntb_poll_link, we are intentionally writing the link bit, which is
absent from db_valid_mask. Don't panic on a kassert when we do so.
The Linux version of this (dual BSD/GPL) driver has the db_valid_mask
assertions in callers of db_iowrite() rather than db_iowrite() itself;
it skips the assertions in the equivalent of ntb_poll_link(). Rather
than duplicating the assertions in every caller, add a db_iowrite_raw()
that doesn't check and use it from ntb_poll_link().
Suggested by: kassert_panic
Sponsored by: EMC / Isilon Storage Division
AMD64 pmap assumes ranges will be in the DMAP, which isn't necessarily
true for NTB memory windows (especially 64-bit BARs).
Suggested by: pmap_change_attr_locked -> kassert_panic
Sponsored by: EMC / Isilon Storage Division
This should export all of the same information as the Linux ntb_hw_intel
debugfs info file, but with a bit more structure, in the sysctl tree
rooted at 'dev.ntb_hw.<N>.debug_info'.
Raw registers are marked as OPAQUE because reading them on some hardware
revisions may cause a hard lockup (NTB errata). They can be read with
'sysctl -x dev.ntb_hw.<N>.debug_info.registers'. On Xeon platforms,
some additional registers are available under 'registers.xeon_stats' and
'registers.xeon_hw_err'. They are exported as big-endian values so that
the 'sysctl -x' output is legible.
Shrink the feature mask to 32 bits so we can use the %b formatter in
'debug_info.features'.
Sponsored by: EMC / Isilon Storage Division
Mechanically replace "SOC" with "ATOM" to match Linux. No functional
change. Original Linux commit log follows:
Instead of using the platform code names, use the correct platform names
to identify the respective Intel NTB hardware.
Authored by: Dave Jiang
Obtained from: Linux (Dual BSD/GPL driver)
Sponsored by: EMC / Isilon Storage Division
Prints driver name to indicate what is being loaded.
Authored by: Dave Jiang
Obtained from: Linux (Dual BSD/GPL driver)
Sponsored by: EMC / Isilon Storage Division
Add module parameters for the addresses to be used in B2B topology.
Authored by: Allen Hubbe
Obtained from: Linux (Dual BSD/GPL driver)
Sponsored by: EMC / Isilon Storage Division
We skip actually bringing up Rootport/Transparent configurations, so
most of this doesn't apply. Original Linux commit log:
Link training should be enabled in the driver probe for root port mode.
We should not have to wait for transport to be loaded for this to
happen. Otherwise the ntb device will not show up on the transparent
bridge side of the link.
Authored by: Dave Jiang
Obtained from: Linux (Dual BSD/GPL driver)
Sponsored by: EMC / Isilon Storage Division
It is just a trivial wrapper around ntb_mw_set_trans().
Authored by: Allen Hubbe
Obtained from: Linux (Dual BSD/GPL driver)
Sponsored by: EMC / Isilon Storage Division
This is the last e26a5843 patch. The general thrust of the rewrite was
to move more responsibility for Memory Window and Doorbell interrupt
management from the ntb_hw driver to if_ntb.
A number of APIs have been added, removed, or replaced. The old
DB callback mechanism has been excised. Instead, callers (if_ntb) are
responsible for configuring MWs and handling their interrupts more
directly.
This adds a tunable, hw.ntb.max_mw_size, allowing users to limit the
size of memory windows used by if_ntb (identical to the Linux modparam
of the same name).
Despite attempts to keep mechanical name changes to separate commits,
some have snuck in here. At least the driver should be much more
similar to the latest Linux one now -- making porting fixes easier.
Authored by: Allen Hubbe
Obtained from: Linux (Dual BSD/GPL driver)
Sponsored by: EMC / Isilon Storage Division
No functional change. Part of the huge rewrite (e26a5843).
Obtained from: Linux (e26a5843) (Dual BSD/GPL driver)
Sponsored by: EMC / Isilon Storage Division
Move all Xeon secondary register setup to the setup_b2b_mw routine. We
use subroutines to make it a bit less wordy than the Linux version.
Adds a new tunable, 'hw.ntb.b2b_mw_share'. By default, it is off
(zero). If both sides enable it (any non-zero value), the NTB driver
attempts to use only half of a memory window for remote register MMIO
access.
This is still part of the large Linux rewrite (e26a5843).
Authored by: Allen Hubbe
Obtained from: Linux (e26a5843) (Dual BSD/GPL driver)
Sponsored by: EMC / Isilon Storage Division
This Linux commit was more or less a rewrite. Unfortunately, the commit
log does not give a lot of context for the rewrite. I have tried to
faithfully follow the changes made upstream, including matching function
names where possible, while churning the FreeBSD driver as little as
possible.
This is the bulk of the rewrite. There are two groups of changes to
follow in separate commits: fleshing out the rest of the changes to
xeon_setup_b2b_mw(), and some changes to if_ntb.
Yes, this is a big patch (3 files changed, 416 insertions(+), 237
deletions(-)), but the Linux patch was 13 files changed, 2,589
additions(+) and 2,195 deletions(-).
Original Linux commit log:
Change ntb_hw_intel to use the new NTB hardware abstraction layer.
Split ntb_transport into its own driver. Change it to use the new NTB
hardware abstraction layer.
Authored by: Allen Hubbe
Obtained from: Linux (e26a5843) (Dual BSD/GPL driver)
Sponsored by: EMC / Isilon Storage Division
Some interrupt-related function names changed to match Linux.
No functional change. Still part of the huge e26a5843 rewrite in Linux.
Obtained from: Linux (e26a5843) (Dual BSD/GPL driver)
Sponsored by: EMC / Isilon Storage Division
No functional change.
Still part of the huge e26a5843 rewrite. I'm trying to make it less of
a complete rewrite in the FreeBSD version of the driver. Still, it
helps if our names match Linux.
Obtained from: Linux (e26a5843) (Dual BSD/GPL driver)
Sponsored by: EMC / Isilon Storage Division