Note that the MW allocation still must be BAR *aligned*. So, this only
loosens the constraints on MW allocation slightly. BAR-aligned does not
play well with large (GB+) BAR sizes.
Going forward, if anyone cares about if_ntb on very large BARs, I
suggest they add functionality to allocate a smaller window than the BAR
size, and set the BAR range to cover a window much larger than the
allocated window. This will require negotiating a window offset and
limit for protocol traffic. None of this is implemented in this
revision.
Sponsored by: EMC / Isilon Storage Division
Lower the payload data (IP) portion of the MTU from 0x10000 to
IP_MAXPACKET (0xFFFF) to avoid panicing the IP stack.
Sponsored by: EMC / Isilon Storage Division
This feature is disabled by default. To enable it, tune
hw.if_ntb.enable_xeon_watchdog to non-zero.
If enabled, writes an unused NTB register every second to demonstrate to
a hardware watchdog that the NTB device is still alive. Most machines
with NTB will not need this -- you know who you are.
Sponsored by: EMC / Isilon Storage Division
Discard the unused rx_free_q. Instead, reuse inputed packets by putting
them back on the *pend* queue after reinitialization.
If tx or rx handlers are unavailable, free mbufs rather than leaking
them.
With this change, if_ntb can receive more than 100
(NTB_QP_DEF_NUM_ENTRIES) packets.
Sponsored by: EMC / Isilon Storage Division
32-bit BARs can only address memory mapped in the low 32 bits of
physical RAM. Expose this as a 'plimit' out parameter from
ntb_mw_get_range().
Fix if_ntb to allocate memory within this limit.
Sponsored by: EMC / Isilon Storage Division
Use bus_space_write instead of (non-volatile) C pointer writes via an
iowrite32() shim in the same places as the Dual BSD/GPL Linux driver.
Update some types to fixed 32-bit sizes.
Sponsored by: EMC / Isilon Storage Division
Order of operations issue with the QP Num and MW count, which would
result in the receive buffer pointer being invalid if there are more
than 1 MW. Corrected with parenthesis to enforce the proper order of
operations.
Reported by: John I. Kading <John.Kading@gd-ms.com>
Reported by: Conrad Meyer <cem@FreeBSD.org>
Authored by: Jon Mason <jdmason@kudzu.us>
Obtained from: Linux (Dual BSD/GPL driver)
Sponsored by: EMC / Isilon Storage Division
Because it can sleep drainking link work callout(s). Linux (dual
BSD/GPL driver) does something very similar.
At the same time, switch the NTB CTX lock to a non-spin mutex, because
the taskqueue_swi lock can't be taken after a spin mutex.
Suggested by: Witness
Sponsored by: EMC / Isilon Storage Division
Add ntb_q_idx_t so it is more clear which struct members are of the same
type (some bogus uint64_ts snuck in that should have been unsigned int).
Add tx_err_no_buf and s/ENOMEM/EBUSY/ in tx_enqueue to match Linux.
Sponsored by: EMC / Isilon Storage Division
A plain 32 bit integer will overflow for values over 4GiB.
Change the plain integer size to the appropriate size type in
ntb_set_mw. Change the type of the size parameter and two local
variables used for size.
Even if there is no overflow, a size of zero is invalid here.
Authored by: Allen Hubbe
Reported by: Juyoung Jung
Obtained from: Linux (Dual BSD/GPL driver)
Sponsored by: EMC / Isilon Storage Division
It was possible for a synchronous update of the RX index in the error
case to get ahead of the asynchronous RX index update in the normal
case. Change the RX processing to preserve an RX completion order.
There were two error cases. First, if a buffer is not present to
receive data, there would be no queue entry to preserve the RX
completion order. Instead of dropping the RX frame, leave the RX frame
in the ring. Schedule RX processing when RX entries are enqueued, in
case there are RX frames waiting in the ring to be received.
Second, if a buffer is too small to receive data, drop the frame in the
ring, mark the RX entry as done, and indicate the error in the RX entry
length. Check for a negative length in the receive callback in
ntb_netdev, and count occurrences as rx_length_errors.
Authored by: Allen Hubbe
Obtained from: Linux (Dual BSD/GPL driver)
Sponsored by: EMC / Isilon Storage Division
Prints driver name to indicate what is being loaded.
Authored by: Dave Jiang
Obtained from: Linux (Dual BSD/GPL driver)
Sponsored by: EMC / Isilon Storage Division
Benchmarking showed a significant performance increase with the MTU size
to 64k instead of 16k. Change the driver default to 64k.
Authored by: Dave Jiang
Obtained from: Linux (Dual BSD/GPL driver)
Sponsored by: EMC / Isilon Storage Division
Throw away the result of the peer SPAD read. The peer will write our
local SPAD and we need to keep the locally read SPAD value to check if
the remote side is up.
Sponsored by: EMC / Isilon Storage Division
Reset the link stats when the link goes down. In particular, the TX and
RX index and count must be reset, or else the TX side will be sending
packets to the RX side where the RX side is not expecting them. Reset
all the stats, to be consistent.
Authored by: Allen Hubbe
Obtained from: Linux (Dual BSD/GPL driver)
Sponsored by: EMC / Isilon Storage Division
This is the last e26a5843 patch. The general thrust of the rewrite was
to move more responsibility for Memory Window and Doorbell interrupt
management from the ntb_hw driver to if_ntb.
A number of APIs have been added, removed, or replaced. The old
DB callback mechanism has been excised. Instead, callers (if_ntb) are
responsible for configuring MWs and handling their interrupts more
directly.
This adds a tunable, hw.ntb.max_mw_size, allowing users to limit the
size of memory windows used by if_ntb (identical to the Linux modparam
of the same name).
Despite attempts to keep mechanical name changes to separate commits,
some have snuck in here. At least the driver should be much more
similar to the latest Linux one now -- making porting fixes easier.
Authored by: Allen Hubbe
Obtained from: Linux (Dual BSD/GPL driver)
Sponsored by: EMC / Isilon Storage Division
No functional change. Part of the huge rewrite (e26a5843).
Obtained from: Linux (e26a5843) (Dual BSD/GPL driver)
Sponsored by: EMC / Isilon Storage Division
No functional change.
Still part of the huge e26a5843 rewrite. I'm trying to make it less of
a complete rewrite in the FreeBSD version of the driver. Still, it
helps if our names match Linux.
Obtained from: Linux (e26a5843) (Dual BSD/GPL driver)
Sponsored by: EMC / Isilon Storage Division
This is a follow-up to r289208: "Xeon Errata Workaround."
Add logic to support a variable number of memory windows and doorbell
callbacks. This was added to the Linux driver in the "Xeon Errata
Workaround" commit, but I skipped it because it didn't look neccessary
at the time. It is needed for future Haswell split-BAR support, so
bring it in now.
A new tunable was added for if_ntb, 'hw.ntb.max_num_clients'. By
default, it is set to zero -- infer the number of clients from the
number of memory windows available from the hardware. Any other
positive value can specify a different number of clients, limited by the
number of doorbell callbacks available (4 under MSI-X, or 15 (Xeon) or
34 (SoC) under legacy INTx).
Obtained from: Linux (Dual BSD/GPL driver)
Sponsored by: EMC / Isilon Storage Division
Original Linux commit log:
The NTB translate register must have the value to be BAR size aligned.
This alignment check make sure that the DMA memory allocated has the
proper alignment. Another requirement for NTB to function properly with
memory window BAR size greater or equal to 4M is to use the CMA feature
in 3.16 kernel with the appropriate CONFIG_CMA_ALIGNMENT and
CONFIG_CMA_SIZE_MBYTES set.
Authored by: Dave Jiang
Obtained from: Linux (Dual BSD/GPL driver)
Sponsored by: EMC / Isilon Storage Division
The detection of an uneven number of queues on the given memory windows
was not correct. The mw_num is zero based and the mod should be
division to spread them evenly over the mw's.
Authored by: Jon Mason
Obtained from: Linux (Dual BSD/GPL driver)
Sponsored by: EMC / Isilon Storage Division
Provide a better event interface between the client and transport.
Authored by: Jon Mason
Obtained from: Linux (Dual BSD/GPL driver)
Sponsored by: EMC / Isilon Storage Division
A WARN_ON is being hit in ntb_qp_link_work due to the NTB transport link
being down while the ntb qp link is still active. This is caused by the
transport link being brought down prior to the qp link worker thread
being terminated. To correct this, shutdown the qp's prior to bringing
the transport link down. Also, only call the qp worker thread if it is
in interrupt context, otherwise call the function directly.
Authored by: Jon Mason
Obtained from: Linux (Dual BSD/GPL driver)
Sponsored by: EMC / Isilon Storage Division
Many variable names in the NTB driver refer to the primary or secondary
side. However, these variables will be used to access the reverse case
when in NTB-RP mode. Make these names more generic in anticipation of
NTB-RP support.
Authored by: Jon Mason
Obtained from: Linux
Sponsored by: EMC / Isilon Storage Division
There is a Xeon hardware errata related to writes to SDOORBELL or B2BDOORBELL
in conjunction with inbound access to NTB MMIO Space, which may hang the
system. To workaround this issue, use one of the memory windows to access the
interrupt and scratch pad registers on the remote system. This bypasses the
issue, but removes one of the memory windows from use by the transport. This
reduction of MWs necessitates adding some logic to determine the number of
available MWs.
Since some NTB usage methodologies may have unidirectional traffic, the ability
to disable the workaround via modparm has been added.
See BF113 in
http://www.intel.com/content/dam/www/public/us/en/documents/specification-updates/xeon-c5500-c3500-spec-update.pdf
See BT119 in
http://www.intel.com/content/dam/www/public/us/en/documents/specification-updates/xeon-e5-family-spec-update.pdf
Authored by: Jon Mason
Obtained from: Linux
Sponsored by: EMC / Isilon Storage Division
The system will appear to lockup for long periods of time due to the NTB
driver spending too much time in memcpy. Avoid this by reducing the
number of packets that can be serviced on a given interrupt.
Authored by: Jon Mason
Obtained from: Linux
Sponsored by: EMC / Isilon Storage Division
The ring logic of the NTB receive buffer/transmit memory window requires
there to be at least 2 payload sized allotments. For the minimal size
case, split the buffer into two and set the transport_mtu to the
appropriate size.
Authored by: Jon Mason
Obtained from: Linux
Sponsored by: EMC / Isilon Storage Division
If the NTB link toggles, the driver could stop receiving due to the
tx_index not being set to 0 on the transmitting size on a link-up event.
This is due to the driver expecting the incoming data to start at the
beginning of the receive buffer and not at a random place.
Authored by: Jon Mason
Obtained from: Linux
Sponsored by: EMC / Isilon Storage Division
Each link-up will allocate a new NTB receive buffer when the NTB
properties are negotiated with the remote system. These allocations did
not check for existing buffers and thus did not free them. Now, the
driver will check for an existing buffer and free it if not of the
correct size, before trying to alloc a new one.
Authored by: Jon Mason
Obtained from: Linux
Sponsored by: EMC / Isilon Storage Division
64bit BAR sizes are permissible with an NTB device. To support them
various modifications and clean-ups were required, most significantly
using 2 32bit scratch pad registers for each BAR.
Also, modify the driver to allow more than 2 Memory Windows.
Authored by: Jon Mason
Obtained from: Linux
Sponsored by: EMC / Isilon Storage Division
->remote_rx_info and ->rx_info are struct ntb_rx_info pointers. If we
add sizeof(struct ntb_rx_info) then it goes too far.
Authored by: Dan Carpenter
Obtained from: Linux
Sponsored by: EMC / Isilon Storage Division
years for head. However, it is continuously misused as the mpsafe argument
for callout_init(9). Deprecate the flag and clean up callout_init() calls
to make them more consistent.
Differential Revision: https://reviews.freebsd.org/D2613
Reviewed by: jhb
MFC after: 2 weeks