Commit Graph

119 Commits

Author SHA1 Message Date
imp
f88f0cb715 Fix typos from last commit, these should have been #. 2017-12-22 20:48:49 +00:00
imp
ff6ebd2b2f Use '#' rather than some made up name for fields we want to ignore. 2017-12-22 17:53:27 +00:00
mav
8267c687a5 Add initial support for Address Lookup Table (A-LUT).
When enabled by EEPROM, use it to relax translation address/size alignment
requirements for BAR2 window by 128 or 256 times.

MFC after:	1 week
Sponsored by:	iXsystems, Inc.
2017-10-01 09:48:31 +00:00
cem
437fd56dcd Add PNP metadata to a few drivers
An eventual devd(8) or other component should be able to scan buses and
automatically load drivers that match device ids described in this metadata.

Reviewed by:	imp
Sponsored by:	Dell EMC Isilon
Differential Revision:	https://reviews.freebsd.org/D12364
2017-09-14 15:34:45 +00:00
mav
69fd2bc45c Add second entry to LUT on a link side in B2B mode.
Each of two entries on a virtual side should have its counterpart on a
peer's link side.

MFC after:	1 week
2017-09-14 04:51:17 +00:00
mav
e31c0fd7de Increase negotiation polling period from 10ms to 100ms.
There is no real need to burn CPU if the other side may not be there yet.  For
example, the PLX hardware by default brings the NTB link up on reset, without
depending on the driver to do it.  In the case of Intel hardware this also
reduces the race between MSI-X workaround negotiation and upper layers, which
use the same scratchpad registers at different times.

MFC after:	12 days
2017-09-02 13:28:45 +00:00
mav
27adbb1a94 Make NTB drivers report more info via NewBus methods.
MFC after:	12 days
2017-09-02 11:56:16 +00:00
mav
4807d74429 Link Interface has no Link Error registers.
MFC after:	13 days
2017-09-01 09:48:19 +00:00
mav
213e12d71e Remove unneeded pmap_change_attr() calls.
Reported by:	kib
MFC after:	13 days
2017-08-31 17:02:06 +00:00
mav
376d970620 Add/polish some defines.
MFC after:	13 days
2017-08-31 16:32:11 +00:00
mav
cece25e8d3 Fix port control for PEX 8749.
That chip has three Station Ports, so previous address math was incorrect.

MFC after:	13 days
Sponsored by:	iXsystems, Inc.
2017-08-31 13:41:44 +00:00
mav
5849e8f575 Add NTB driver for PLX/Avago/Broadcom PCIe switches.
This driver supports both NTB-to-NTB and NTB-to-Root Port modes (though
the second with predictable complications on hot-plug and reboot events).
I tested it with PEX 8717 and PEX 8733 chips, but expect it should work
with many other compatible ones too.  It supports up to two NT bridges
per chip, each of which can have up to 2 64-bit or 4 32-bit memory windows,
6 or 12 scratchpad registers and 16 doorbells.  There are also 4 DMA engines
in those chips, but they are not yet supported.

While there, rename the Intel NTB driver from the generic ntb_hw(4) to the more
specific ntb_hw_intel(4), so it is now on par with the new ntb_hw_plx(4) driver
and similar to the Linux naming.

MFC after:	2 weeks
Sponsored by:	iXsystems, Inc.
2017-08-30 21:16:32 +00:00
mav
b813aecdd3 Fix fake interrupt when set doorbell is unmasked.
Since the doorbell bit is already set when interrupt handler is called,
the event was not propagated to upper layer.  It was working normally
because present code was not using masking actively, but that is going
to change.

MFC after:	2 weeks
Sponsored by:	iXsystems, Inc.
2017-08-28 19:52:57 +00:00
mav
42510015b1 Wrap previous MSIX workaround into #ifndef EARLY_AP_STARTUP.
With EARLY_AP_STARTUP we can successfully negotiate MSIX earlier.

Requested by:	jhb@
2016-07-30 21:06:59 +00:00
mav
6d315fe831 Block MSIX negotiation until SMP started and IRQ reshuffled. 2016-07-30 15:56:36 +00:00
mav
b3cab4a69b Clear scratchpad after MSIX negotiation to not leak garbage. 2016-07-29 20:52:18 +00:00
mav
910e26641a Once more refactor KPI between NTB hardware and consumers.
New design allows hardware resources to be split between several consumers.
For example, one BAR can be dedicated for remote memory access, while other
resources can be used for packet transport for virtual Ethernet interface.
Even without a resource split, this code allows specifying which consumer
driver should attach to the hardware.

In some respects this makes the code even closer to the Linux one, even though
Linux does not provide the described flexibility.
2016-07-28 10:48:20 +00:00
mav
1251b09fe1 Postpone ntb_get_msix_info() till we need to negotiate MSIX.
Calling it earlier increases the window when MSIX info may change.
This change does not solve the problem completely, but seems logical.
Complete solution should probably include link reset in case of MSIX
remap to trigger new negotiation, but we have no way to get notified
about that now.
2016-07-24 14:42:11 +00:00
sephe
d211e969f7 ntb: Fix LINT
Sponsored by:	Microsoft OSTC
2016-07-12 05:41:34 +00:00
mav
2ef64931ce Revert odd change, setting limit registers before base.
I don't know what errata is mentioned there, I was unable to find it, but
setting limit before the base simply does not work at all.  According to the
specification, an attempt to set the limit outside the present window range
resets it to zero, effectively disabling it.  And that is what I see in practice.

Fixing this properly disables access for remote side to our memory until
respective xlat is negotiated and set.  As I see, Linux does the same.
2016-07-10 20:22:04 +00:00
mav
5edde4436f Fix wrong copy/paste in r302510. 2016-07-10 19:52:26 +00:00
mav
351e95b628 Simplify MSIX MW BAR xlat setup, and don't forget to unlock its limit.
The last fixes SB01BASE_LOCKUP workaround after driver reload.
2016-07-10 01:09:16 +00:00
mav
f3fcdc1b95 Disable SB01BASE_LOCKUP workaround when split BARs disabled.
For some reason hack with sending MSI-X interrupts by writing to remote
LAPIC memory works only for 32-bit BARs, that are available only if split
BARs mode is enabled in BIOS.  If it is not, complain loudly and fall back
to less efficient workaround.
2016-07-09 23:22:44 +00:00
mav
4353c90d6c Reimplement doorbell register emulation for NTB_SB01BASE_LOCKUP.
This allows at least first three doorbells to work very close to normal
hardware, properly signaling events to upper layers without spurious or
lost events.  Doorbells above the first three may still report spurious
events due to lack of reliable information, but they are rarely used.
2016-07-09 11:57:21 +00:00
mav
2a1bf3bef3 Switch ctx_lock from mutex to rmlock.
It is an odd idea to serialize different MSI-X vectors.  Use of rmlocks
here allows them to execute in parallel, but still protects ctx.
If upper layers require any additional serialization -- they can
do it by themselves.
2016-07-09 11:47:52 +00:00
mav
5ab408cf77 NewBus'ify NTB subsystem.
This follows NTB subsystem modularization in Linux, tuning it to FreeBSD
native NewBus interfaces.  This change makes it possible to support different
types of hardware with different drivers, support multiple NTB instances in a
system, use the ntb_transport module for needs other than if_ntb, etc.

Sponsored by:	iXsystems, Inc.
2016-07-09 11:20:42 +00:00
mav
f3601c1cdc Remove some dead code found by Clang static analyzer. 2016-07-09 09:47:11 +00:00
mav
7cf7db1912 Fix NTB_SDOORBELL_LOCKUP workaround.
Since SBARxSZ register can be write-once, it can be unusable for disabling
the SBAR.  For such case also set SBARxBASE to zero to not intersect with
config BAR.
2016-07-09 09:34:24 +00:00
mav
33d0103e47 When negotiating NTB_SB01BASE_LOCKUP workaround, don't try to limit the
BAR size to 1MB.  According to Xeon v3 specifications and my tests, that
size register is write-once and so not writable after the BIOS has written it.

Instead of that, make the code work with BAR of any sufficient size,
properly calculating offset within its base.  It also simplifies the code.

Discussed with:	cem
MFC after:	2 weeks
Sponsored by:	iXsystems, Inc.
2016-06-04 00:18:59 +00:00
mav
a0753989c6 When negotiating MSIX parameters, give other head time to see our
NTB_MSIX_RECEIVED status, before making upper layers overwrite it.

This is not completely perfect, but now it works better than before.

MFC after:	2 weeks
Sponsored by:	iXsystems, Inc.
2016-06-04 00:08:37 +00:00
cem
d17b279014 ntb_hw(4): Only record the first three MSIX vectors
Don't overrun the msix_data array by reading the (unused) link state
interrupt information.

Reported by:	mav (earlier version)
Sponsored by:	EMC / Isilon Storage Division
Differential Revision:	https://reviews.freebsd.org/D6489
2016-05-23 19:46:58 +00:00
cem
27b91c5342 ntb_hw(4): Add sysctls for administrative/test link config, state
dev.ntb_hw.0.admin_up=0/1: Like ifconfig UP/DOWN.
dev.ntb_hw.0.active=0/1:   Like ifconfig 'status'

Reviewed by:	ngie
Sponsored by:	EMC / Isilon Storage Division
Differential Revision:	https://reviews.freebsd.org/D6429
2016-05-18 02:10:05 +00:00
pfg
eed4bd22ad sys/dev: minor spelling fixes.
Most affect comments, very few have user-visible effects.
2016-05-03 03:41:25 +00:00
skra
f4b6499ab5 As <machine/pmap.h> is included from <vm/pmap.h>, there is no need to
include it explicitly when <vm/pmap.h> is already included.

Reviewed by:	alc, kib
Differential Revision:	https://reviews.freebsd.org/D5373
2016-02-22 09:02:20 +00:00
cem
ff82ae2996 NTB: workaround for high traffic hardware hang
This patch comes from Dave Jiang's Linux tree, davejiang/ntb.  It hasn't
been accepted into Linus' tree, so I do not have an authoritative SHA1
to point at.  Original commit log:

=====================================================================
A hardware errata causes the NTB to hang when heavy bi-directional
traffic in addition to the usage of BAR0/1 (where the registers reside,
including the doorbell registers to trigger interrupts).

This workaround is only available on Haswell and Broadwell platform.
The workaround is to enable split BAR in the BIOS to allow the 64bit
BAR4 to be split into two 32bit BAR4 and BAR5. The BAR4 shall be pointed
to LAPIC region of the remote host. We will bypass the db mechanism and
directly trigger the MSIX interrupts. The offsets and vectors are
exchanged during transport scratch pad negotiation. The scratch pads are
now overloaded in order to allow the exchange of the information. This
gets around using the doorbell and prevents the lockup with additional
pcode changes in BIOS.

Signed-off-by:	Dave Jiang <dave.jiang@intel.com>
=====================================================================

Notable changes in the FreeBSD version of this patch:
* The MSIX BAR is configurable, like hw.ntb.b2b_mw_idx (msix_mw_idx).
  The Linux version of the patch only uses BAR4.
* MSIX negotiation aborts if the link goes down.

Obtained from:	Linux (Dual BSD/GPL driver)
Sponsored by:	EMC / Isilon Storage Division
2016-02-14 22:37:28 +00:00
cem
954cde6c76 ntb_hw(4): Print correct PAT name for non-WC/WB types mapped at load
Sponsored by:	EMC / Isilon Storage Division
2016-02-10 20:49:22 +00:00
cem
1e46ac5ade ntb_hw(4): Allow any x86 PAT caching flags for MW defaults
Replace the hw.ntb.enable_writecombine tunable with
hw.ntb.default_mw_pat.  It can be set with several specific numerical
values to select a caching type.  Any bogus value is treated as
Uncacheable (UC).

The ntb_mw_set_wc() KPI has removed the restriction that the selected
mode must be one of UC, WC, or WB.

Sponsored by:	EMC / Isilon Storage Division
2016-02-10 20:28:28 +00:00
cem
dc18eb23e2 NTB: WC/WB isn't enough; set MMR region as UC
And expose vm_memattr_t of current mapping to consumers (as well as the
ability to change it to one of UC, WB, WC).

After short discussion with:	jhb (but no review)
Sponsored by:	EMC / Isilon Storage Division
2015-11-25 01:59:08 +00:00
cem
4ab80ee6ff ntb: Add MW tunable for MMR Xeon errata workaround
Adds a new tunable, hw.ntb.b2b_mw_idx, which specifies the offset (from the
total number of memory windows) to use for register access on hardware with
the SDOORBELL_LOCKUP errata.  The default is -1, i.e., the last memory
window.

We map BARs before the b2b_mw_idx is selected, so map them all as memory
windows initially.  The register memory window should not be write-combined,
so we explicitly disable WC on the selected MW later.

This introduces a layer of abstraction between consumer memory window
indices, which exclude any exclusive errata-workaround BARs, and internal
memory window indices, which include such BARs.  An internal routine,
ntb_user_mw_to_idx(), converts the former to the latter.  Public APIs have
been updated to use this instead of assuming the exclusive workaround BAR is
the last available MW.

Sponsored by:	EMC / Isilon Storage Division
2015-11-24 18:51:17 +00:00
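
The index translation described in the commit above can be pictured roughly as follows. This is a minimal sketch under stated assumptions, not the driver's code: the commit names only ntb_user_mw_to_idx(), so resolve_b2b_mw_idx() and user_mw_to_idx() here are hypothetical stand-ins, and exactly one window is assumed to be reserved for the errata workaround.

```c
#include <assert.h>

/*
 * Hypothetical helper: turn the tunable into an internal MW index.
 * Negative values count back from the total number of memory windows,
 * so the default of -1 selects the last window.
 */
static unsigned
resolve_b2b_mw_idx(int tunable, unsigned mw_count)
{
	assert(mw_count > 0);
	assert(tunable >= -(int)mw_count && tunable < (int)mw_count);

	if (tunable < 0)
		return (mw_count - (unsigned)(-tunable));
	return ((unsigned)tunable);
}

/*
 * Hypothetical helper: map a consumer-visible MW index to an internal
 * one.  Consumer indices exclude the reserved workaround window, so
 * every index at or above it is shifted up by one.
 */
static unsigned
user_mw_to_idx(unsigned uidx, unsigned reserved_idx)
{
	if (uidx >= reserved_idx)
		uidx++;
	return (uidx);
}
```

With three windows and the default tunable, the reserved window is index 2 and consumers see indices 0 and 1 unchanged; if window 1 were reserved instead, consumer index 1 would map to internal index 2.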
cem
8737f2dc7d if_ntb: Add Xeon link watchdog register writes
This feature is disabled by default.  To enable it, tune
hw.if_ntb.enable_xeon_watchdog to non-zero.

If enabled, writes an unused NTB register every second to demonstrate to
a hardware watchdog that the NTB device is still alive.  Most machines
with NTB will not need this -- you know who you are.

Sponsored by:	EMC / Isilon Storage Division
2015-11-19 19:53:09 +00:00
cem
48ceeb626e NTB: Expose 32-bit BAR limits to consumers
32-bit BARs can only address memory mapped in the low 32 bits of
physical RAM.  Expose this as a 'plimit' out parameter from
ntb_mw_get_range().

Fix if_ntb to allocate memory within this limit.

Sponsored by:	EMC / Isilon Storage Division
2015-11-18 22:20:40 +00:00
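
A consumer might use such a limit along these lines. This is only an illustrative sketch: bar_addr_limit() and range_within_limit() are hypothetical names, not part of the NTB KPI, and the limit is assumed to be the highest addressable byte (inclusive).

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical: highest physical address a BAR of this width can reach. */
static uint64_t
bar_addr_limit(bool is_32bit)
{
	return (is_32bit ? UINT32_MAX : UINT64_MAX);
}

/*
 * Hypothetical: check that [base, base + size) lies at or below limit.
 * The comparison is arranged so that base + size cannot overflow.
 */
static bool
range_within_limit(uint64_t base, uint64_t size, uint64_t limit)
{
	if (size == 0)
		return (base <= limit);
	if (size - 1 > limit)
		return (false);
	return (base <= limit - (size - 1));
}
```

A buffer ending exactly at the 4 GB boundary passes the check for a 32-bit BAR, while one extending a single byte past it does not.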
cem
c87adb684e NTB: Mask off the low 12 bits of address/range registers
Sometimes they'll read spurious values (observed: 0xc on Broadwell-DE),
failing link negotiation.

Discussed with:	Dave Jiang, Allen Hubbe
Sponsored by:	EMC / Isilon Storage Division
2015-11-18 22:20:31 +00:00
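
The masking described in the commit above amounts to clearing the low 12 bits of the value read back. A minimal sketch, assuming a 64-bit register value; the macro and function names are hypothetical and not taken from the driver:

```c
#include <stdint.h>

/* Hypothetical name: the low 12 bits that may read back as garbage. */
#define ADDR_LOW12_MASK	UINT64_C(0xfff)

/* Clear the low 12 bits so spurious values do not break comparisons. */
static uint64_t
mask_low12(uint64_t reg)
{
	return (reg & ~ADDR_LOW12_MASK);
}
```

For instance, a register that reads back 0x10000000c (the spurious 0xc observed on Broadwell-DE in the low bits) would compare equal to 0x100000000 after masking.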
cem
7354d53b8d ntb_hw: Add programmatic interface to enable/disable WC
Enable users to enable/disable WC on memory windows programmatically.

Sponsored by:	EMC / Isilon Storage Division
2015-11-18 22:20:21 +00:00
cem
cb548ea272 ntb_hw: Add tunable to disable write-combining
The tunable 'hw.ntb.enable_writecombine' may be set to zero to
administratively disable write combining the mapped NTB region.

Sponsored by:	EMC / Isilon Storage Division
2015-11-18 22:20:13 +00:00
cem
173ce2499e NTB: Fix 32-bit BAR size validation
Sponsored by:	EMC / Isilon Storage Division
2015-11-18 22:20:04 +00:00
cem
6401e21ede NTB: MFV 8b782fab: unify translation addresses
There is no need for the upstream and downstream addresses to be
different for the NTB configs.  Go to using a single set of address. It
is still possible to configure them differently using module parameter
override however (CEM: tunable).

Authored by:	Dave Jiang <dave.jiang@intel.com>
Reviewed by:	Allen Hubbe <Allen.Hubbe@emc.com>
Reviewed by:	Jon Mason <jdmason@kudzu.us>
Obtained from:	Linux (Dual BSD/GPL driver)
Sponsored by:	EMC / Isilon Storage Division
2015-11-12 19:07:03 +00:00
cem
1d18b415e2 NTB: Add more HW registers to device sysctl tree
Sponsored by:	EMC / Isilon Storage Division
2015-11-11 18:56:11 +00:00
cem
e3eccc928a ntb: volatile some members set by interrupt routines
Sponsored by:	EMC / Isilon Storage Division
2015-11-11 18:56:02 +00:00
cem
ba49bb5bae ntb_hw: Similarly, add a debug-leveled macro for ntb_hw
Sponsored by:	EMC / Isilon Storage Division
2015-11-11 18:55:53 +00:00
cem
d3fa847401 if_ntb: Transport link cleanup needs to be on a taskqueue
Because it can sleep while draining link work callout(s).  Linux (dual
BSD/GPL driver) does something very similar.

At the same time, switch the NTB CTX lock to a non-spin mutex, because
the taskqueue_swi lock can't be taken after a spin mutex.

Suggested by:	Witness
Sponsored by:	EMC / Isilon Storage Division
2015-11-11 18:55:34 +00:00