This simplifies the driver's rx fast path as well as the bookkeeping
code that tracks various rx buffer sizes and layouts.
MFC after: 1 week
Sponsored by: Chelsio Communications
ext_arg2 is the only item in the third cacheline in an mbuf and could be
cold by the time rxb_free runs. Put the information needed by rxb_free
in the same line as the refcount, which is very likely to be hot given
that rxb_free runs when the refcount is decremented and reaches 0.
MFC after: 1 week
Sponsored by: Chelsio Communications
requested.
This is a tradeoff between PCIe efficiency during large packet rx and
packing efficiency during small packet rx.
MFC after: 1 week
Sponsored by: Chelsio Communications
than 128MB, which is the maximum supported by the hardware in RDMA mode.
Obtained from: Chelsio Communications
MFC after: 3 days
Sponsored by: Chelsio Communications
Virtualise tcp_always_keepalive, TCP and UDP log_in_vain. All three are
set in the netoptions startup script, which we would love to run for VNETs
as well [1].
While virtualising the log_in_vain sysctls seems pointles at first for as
long as the kernel message buffer is not virtualised, it at least allows
an administrator to debug the base system or an individual jail if needed
without turning the logging on for all jails running on a system.
PR: 243193 [1]
MFC after: 2 weeks
CPL_TX_PKT_XT disables the internal parser on the chip and instead
relies on the driver to provide the exact length of the L2 and L3
headers. This allows hw checksumming and TSO to be used with L2 and
L3 encapsulations that the chip doesn't understand directly.
Note that netmap tx still uses the old CPL as it never uses the hw
to generate the checksum on tx.
Reviewed by: jhb@
MFC after: 1 month
Sponsored by: Chelsio Communications
Differential Revision: https://reviews.freebsd.org/D22788
TX_PKTS2 is more efficient within the firmware and this improves netmap
Tx by a few Mpps in some common scenarios.
MFC after: 1 week
Sponsored by: Chelsio Communications
These were obtained from the Chelsio Unified Wire v3.12.0.1 beta
release.
Note that the firmwares are not uuencoded any more.
MFH: 1 month
Sponsored by: Chelsio Communications
improvements, the ECN bits need to be exposed to the TCP SYNcache.
This change is a minimal modification to the function headers, without any
functional change intended.
Submitted by: Richard Scheffenegger
Reviewed by: rgrimes@, rrs@, tuexen@
Differential Revision: https://reviews.freebsd.org/D22436
should try in order to link up with the peer.
Various FEC variables within the driver can now have multiple bits set
instead of being powers of 2. 0 and -1 in the user knobs still mean no
FEC and auto (driver decides) respectively for backward compatibility,
but no-FEC and auto now have their own bits in the internal
representation. There is a new bit that can be set to request the FEC
recommended by the cable/transceiver module.
Add sysctls to display link related capabilities of the local side as
well as the link partner.
Note that all this needs a new firmware and the documentation for the
driver FEC knobs will be updated after that firmware is added to the
driver.
MFC after: 1 week
Sponsored by: Chelsio Communications
This allows the driver to be updated for the next firmware without
waiting for it to be released.
MFC after: 2 weeks
Sponsored by: Chelsio Communications
This adds support for ifnet (NIC) KTLS using Chelsio T6 adapters.
Unlike the TOE-based KTLS in r353328, NIC TLS works with non-TOE
connections.
NIC KTLS on T6 is not able to use the normal TSO (LSO) path to segment
the encrypted TLS frames output by the crypto engine. Instead, the
TOE is placed into a special setup to permit "dummy" connections to be
associated with regular sockets using KTLS. This permits using the
TOE to segment the encrypted TLS records. However, this approach does
have some limitations:
1) Regular TOE sockets cannot be used when the TOE is in this special
mode. One can use either TOE and TOE-based KTLS or NIC KTLS, but
not both at the same time.
2) In NIC KTLS mode, the TOE is only able to accept a per-connection
timestamp offset that varies in the upper 4 bits. Put another way,
only connections whose timestamp offset has the 28 lower bits
cleared can use NIC KTLS and generate correct timestamps. The
driver will refuse to enable NIC KTLS on connections with a
timestamp offset with any of the lower 28 bits set. To use NIC
KTLS, users can either disable TCP timestamps by setting the
net.inet.tcp.rfc1323 sysctl to 0, or apply a local patch to the
tcp_new_ts_offset() function to clear the lower 28 bits of the
generated offset.
3) Because the TCP segmentation relies on fields mirrored in a TCB in
the TOE, not all fields in a TCP packet can be sent in the TCP
segments generated from a TLS record. Specifically, for packets
containing TCP options other than timestamps, the driver will
inject an "empty" TCP packet holding the requested options (e.g. a
SACK scoreboard) along with the segments from the TLS record.
These empty TCP packets are counted by the
dev.cc.N.txq.M.kern_tls_options sysctls.
Unlike TOE TLS which is able to buffer encrypted TLS records in
on-card memory to handle retransmits, NIC KTLS must re-encrypt TLS
records for retransmit requests as well as non-retransmit requests
that do not include the start of a TLS record but do include the
trailer. The T6 NIC KTLS code tries to optimize some of the cases for
requests to transmit partial TLS records. In particular it attempts
to minimize sending "waste" bytes that have to be given as input to
the crypto engine but are not needed on the wire to satisfy mbufs sent
from the TCP stack down to the driver.
TCP packets for TLS requests are broken down into the following
classes (with associated counters):
- Mbufs that send an entire TLS record in full do not have any waste
bytes (dev.cc.N.txq.M.kern_tls_full).
- Mbufs that send a short TLS record that ends before the end of the
trailer (dev.cc.N.txq.M.kern_tls_short). For sockets using AES-CBC,
the encryption must always start at the beginning, so if the mbuf
starts at an offset into the TLS record, the offset bytes will be
"waste" bytes. For sockets using AES-GCM, the encryption can start
at the 16 byte block before the starting offset capping the waste at
15 bytes.
- Mbufs that send a partial TLS record that has a non-zero starting
offset but ends at the end of the trailer
(dev.cc.N.txq.M.kern_tls_partial). In order to compute the
authentication hash stored in the trailer, the entire TLS record
must be sent as input to the crypto engine, so the bytes before the
offset are always "waste" bytes.
In addition, other per-txq sysctls are provided:
- dev.cc.N.txq.M.kern_tls_cbc: Count of sockets sent via this txq
using AES-CBC.
- dev.cc.N.txq.M.kern_tls_gcm: Count of sockets sent via this txq
using AES-GCM.
- dev.cc.N.txq.M.kern_tls_fin: Count of empty FIN-only packets sent to
compensate for the TOE engine not being able to set FIN on the last
segment of a TLS record if the TLS record mbuf had FIN set.
- dev.cc.N.txq.M.kern_tls_records: Count of TLS records sent via this
txq including full, short, and partial records.
- dev.cc.N.txq.M.kern_tls_octets: Count of non-waste bytes (TLS header
and payload) sent for TLS record requests.
- dev.cc.N.txq.M.kern_tls_waste: Count of waste bytes sent for TLS
record requests.
To enable NIC KTLS with T6, set the following tunables prior to
loading the cxgbe(4) driver:
hw.cxgbe.config_file=kern_tls
hw.cxgbe.kern_tls=1
Reviewed by: np
Sponsored by: Chelsio Communications
Differential Revision: https://reviews.freebsd.org/D21962
ccr(4) and TLS support in cxgbe(4) construct key contexts used by the
crypto engine in the T6. This consolidates some duplicated code for
helper functions used to build key contexts.
Reviewed by: np
MFC after: 1 month
Sponsored by: Chelsio Communications
Differential Revision: https://reviews.freebsd.org/D22156
TVSENSE may not be ready by the time t4_fw_initialize returns and the
firmware returns 0 if the driver asks for the Vdd before the sensor is
ready.
MFC after: 1 week
Sponsored by: Chelsio Communications
NIC KTLS will add a new TLS send tag type in cxgbe(4) that is a
distinct tag from a ratelimit tag. To support this, refactor
cxgbe_snd_tag to be a simple send tag with a type and convert the
existing ratelimit tag to a new cxgbe_rate_tag structure.
Reviewed by: np
Sponsored by: Chelsio Communications
Differential Revision: https://reviews.freebsd.org/D22072
Previously the table was allocated on first use by TOE and the
ratelimit code. The forthcoming NIC KTLS code also uses this table.
Allocate it unconditionally during attach to simplify consumers.
Reviewed by: np
Differential Revision: https://reviews.freebsd.org/D22028
This ensures the clip task won't race with t4_destroy_clip_table.
While here, make some mutex destroys unconditional since attach always
initializes them.
Reviewed by: np
MFC after: 1 week
Sponsored by: Chelsio Communications
Differential Revision: https://reviews.freebsd.org/D21952
This adds a TOE hook to allocate a KTLS session. It also recognizes
TLS mbufs in the socket buffer and sends those to the NIC using a TLS
work request to encrypt the record before segmenting it.
TOE TLS support must be enabled via the dev.t6nex.<N>.tls sysctl in
addition to enabling KTLS.
Reviewed by: np, gallatin
Sponsored by: Chelsio Communications
Differential Revision: https://reviews.freebsd.org/D21891
The PCI block in the adapter requires this field to be set to a valid
queue ID. It is not clear why it did not fail on all machines, but
the effect was that crypto operations reading input data via DMA
failed with an internal PCI read error on machines with 128G or more
of RAM.
Reported by: gallatin
Reviewed by: np
MFC after: 3 days
Sponsored by: Chelsio Communications
There are several mechanisms by which a vm_page reference is held,
preventing the page from being freed back to the page allocator. In
particular, holding the page's object lock is sufficient to prevent the
page from being freed; holding the busy lock or a wiring is sufficent as
well. These references are protected by the page lock, which must
therefore be acquired for many per-page operations. This results in
false sharing since the page locks are external to the vm_page
structures themselves and each lock protects multiple structures.
Transition to using an atomically updated per-page reference counter.
The object's reference is counted using a flag bit in the counter. A
second flag bit is used to atomically block new references via
pmap_extract_and_hold() while removing managed mappings of a page.
Thus, the reference count of a page is guaranteed not to increase if the
page is unbusied, unmapped, and the object's write lock is held. As
a consequence of this, the page lock no longer protects a page's
identity; operations which move pages between objects are now
synchronized solely by the objects' locks.
The vm_page_wire() and vm_page_unwire() KPIs are changed. The former
requires that either the object lock or the busy lock is held. The
latter no longer has a return value and may free the page if it releases
the last reference to that page. vm_page_unwire_noq() behaves the same
as before; the caller is responsible for checking its return value and
freeing or enqueuing the page as appropriate. vm_page_wire_mapped() is
introduced for use in pmap_extract_and_hold(). It fails if the page is
concurrently being unmapped, typically triggering a fallback to the
fault handler. vm_page_wire() no longer requires the page lock and
vm_page_unwire() now internally acquires the page lock when releasing
the last wiring of a page (since the page lock still protects a page's
queue state). In particular, synchronization details are no longer
leaked into the caller.
The change excises the page lock from several frequently executed code
paths. In particular, vm_object_terminate() no longer bounces between
page locks as it releases an object's pages, and direct I/O and
sendfile(SF_NOCACHE) completions no longer require the page lock. In
these latter cases we now get linear scalability in the common scenario
where different threads are operating on different files.
__FreeBSD_version is bumped. The DRM ports have been updated to
accomodate the KPI changes.
Reviewed by: jeff (earlier version)
Tested by: gallatin (earlier version), pho
Sponsored by: Netflix
Differential Revision: https://reviews.freebsd.org/D20486
Remove now-redundant items from toepcb and synq_entry and the code to
support them.
Let the driver calculate tx_align, rx_coalesce, and sndbuf by default.
Reviewed by: jhb@
MFC after: 1 week
Sponsored by: Chelsio Communications
Differential Revision: https://reviews.freebsd.org/D21387
descriptor. The per-tid tx credits are in demand during active Tx and
it's best not to use too many just for payload.
Sponsored by: Chelsio Communications
an updated rack depend on having access to the new
ratelimit api in this commit.
Sponsored by: Netflix Inc.
Differential Revision: https://reviews.freebsd.org/D20953
The driver used to log any non-zero cause and when running with a single
line interrupt it would spam the console/logs with reports of interrupts
that are of no interest to anyone.
MFC after: 1 week
Sponsored by: Chelsio Communications
The hold_count and wire_count fields of struct vm_page are separate
reference counters with similar semantics. The remaining essential
differences are that holds are not counted as a reference with respect
to LRU, and holds have an implicit free-on-last unhold semantic whereas
vm_page_unwire() callers must explicitly determine whether to free the
page once the last reference to the page is released.
This change removes the KPIs which directly manipulate hold_count.
Functions such as vm_fault_quick_hold_pages() now return wired pages
instead. Since r328977 the overhead of maintaining LRU for wired pages
is lower, and in many cases vm_fault_quick_hold_pages() callers would
swap holds for wirings on the returned pages anyway, so with this change
we remove a number of page lock acquisitions.
No functional change is intended. __FreeBSD_version is bumped.
Reviewed by: alc, kib
Discussed with: jeff
Discussed with: jhb, np (cxgbe)
Tested by: pho (previous version)
Sponsored by: Netflix
Differential Revision: https://reviews.freebsd.org/D19247
Previously the TOE code used its own custom unmapped mbufs via
EXT_FLAG_VENDOR1. The old version always wired the entire AIO request
buffer first for the duration of the AIO operation and constructed
multiple mbufs which used the wired buffer as an external buffer.
The new version determines how much room is available in the socket
buffer and only wires the pages needed for the available room building
chains of M_NOMAP mbufs. This means that a large AIO write will now
limit the amount of wired memory it uses to the size of the socket
buffer.
Reviewed by: gallatin, np
Sponsored by: Chelsio Communications
Differential Revision: https://reviews.freebsd.org/D20839
Since cxgbe(4) uses sglist instead of bus_dma, this required updates
to the code that generates scatter/gather lists for packets. Also,
unmapped mbufs are always sent via DMA and never as immediate data in
the payload of a work request.
Submitted by: gallatin (earlier version)
Reviewed by: gallatin, hselasky, rrs
Discussed with: np
Sponsored by: Netflix
Differential Revision: https://reviews.freebsd.org/D20616
handle_ddp_close.
This eliminates a bad race where an aio_ddp_requeue that happened to run
after handle_ddp_close could bump up the active count.
Discussed with: jhb@
MFC after: 3 days
Sponsored by: Chelsio Communications
t_maxseg was changed in r293284 to not have any adjustment for TCP
timestamps. t4_tom inadvertently went back to pre-r293284 semantics
in r332506.
Sponsored by: Chelsio Communications
Previously, the aiotx task relied on the aio jobs in the queue to hold
a reference on the socket. However, when the last job is completed,
there is nothing left to hold a reference to the socket buffer lock
used to check if the queue is empty. In addition, if the last job on
the queue is cancelled, the task can run with no queued jobs holding a
reference to the socket buffer lock the task uses to notice the queue
is empty.
Fix these races by holding an explicit reference on the socket when
the task is queued and dropping that reference when the task
completes.
Reviewed by: np
MFC after: 1 week
Sponsored by: Chelsio Communications
Differential Revision: https://reviews.freebsd.org/D20539
receive sockbuf's high water mark.
Calculate rx credits on the spot instead of tracking sbused/sb_cc and
rx_credits in the toepcb. The previous method worked when the high
water mark changed due to SB_AUTOSIZE but not when it was adjusted
directly (for example, by the soreserve in nfsrvd_addsock).
This fixes a connection hang while running iozone over an NFS mounted
share where nfsd's TCP sockets are being handled by t4_tom.
MFC after: 3 days
Sponsored by: Chelsio Communications
it hasn't been initialized.
This fixes a bug in r346570 that could cause a panic when servicing
TCP_INFO for offloaded connections.
MFC after: 3 days
Sponsored by: Chelsio Communications
- Perform ifp mismatch checks (to determine if a send tag is allocated
for a different ifp than the one the packet is being output on), in
ip_output() and ip6_output(). This avoids sending packets with send
tags to ifnet drivers that don't support send tags.
Since we are now checking for ifp mismatches before invoking
if_output, we can now try to allocate a new tag before invoking
if_output sending the original packet on the new tag if allocation
succeeds.
To avoid code duplication for the fragment and unfragmented cases,
add ip_output_send() and ip6_output_send() as wrappers around
if_output and nd6_output_ifp, respectively. All of the logic for
setting send tags and dealing with send tag-related errors is done
in these wrapper functions.
For pseudo interfaces that wrap other network interfaces (vlan and
lagg), wrapper send tags are now allocated so that ip*_output see
the wrapper ifp as the ifp in the send tag. The if_transmit
routines rewrite the send tags after performing an ifp mismatch
check. If an ifp mismatch is detected, the transmit routines fail
with EAGAIN.
- To provide clearer life cycle management of send tags, especially
in the presence of vlan and lagg wrapper tags, add a reference count
to send tags managed via m_snd_tag_ref() and m_snd_tag_rele().
Provide a helper function (m_snd_tag_init()) for use by drivers
supporting send tags. m_snd_tag_init() takes care of the if_ref
on the ifp meaning that code alloating send tags via if_snd_tag_alloc
no longer has to manage that manually. Similarly, m_snd_tag_rele
drops the refcount on the ifp after invoking if_snd_tag_free when
the last reference to a send tag is dropped.
This also closes use after free races if there are pending packets in
driver tx rings after the socket is closed (e.g. from tcpdrop).
In order for m_free to work reliably, add a new CSUM_SND_TAG flag in
csum_flags to indicate 'snd_tag' is set (rather than 'rcvif').
Drivers now also check this flag instead of checking snd_tag against
NULL. This avoids false positive matches when a forwarded packet
has a non-NULL rcvif that was treated as a send tag.
- cxgbe was relying on snd_tag_free being called when the inp was
detached so that it could kick the firmware to flush any pending
work on the flow. This is because the driver doesn't require ACK
messages from the firmware for every request, but instead does a
kind of manual interrupt coalescing by only setting a flag to
request a completion on a subset of requests. If all of the
in-flight requests don't have the flag when the tag is detached from
the inp, the flow might never return the credits. The current
snd_tag_free command issues a flush command to force the credits to
return. However, the credit return is what also frees the mbufs,
and since those mbufs now hold references on the tag, this meant
that snd_tag_free would never be called.
To fix, explicitly drop the mbuf's reference on the snd tag when the
mbuf is queued in the firmware work queue. This means that once the
inp's reference on the tag goes away and all in-flight mbufs have
been queued to the firmware, tag's refcount will drop to zero and
snd_tag_free will kick in and send the flush request. Note that we
need to avoid doing this in the middle of ethofld_tx(), so the
driver grabs a temporary reference on the tag around that loop to
defer the free to the end of the function in case it sends the last
mbuf to the queue after the inp has dropped its reference on the
tag.
- mlx5 preallocates send tags and was using the ifp pointer even when
the send tag wasn't in use. Explicitly use the ifp from other data
structures instead.
- Sprinkle some assertions in various places to assert that received
packets don't have a send tag, and that other places that overwrite
rcvif (e.g. 802.11 transmit) don't clobber a send tag pointer.
Reviewed by: gallatin, hselasky, rgrimes, ae
Sponsored by: Netflix
Differential Revision: https://reviews.freebsd.org/D20117
This allows replacing "sys/eventfilter.h" includes with "sys/_eventfilter.h"
in other header files (e.g., sys/{bus,conf,cpu}.h) and reduces header
pollution substantially.
EVENTHANDLER_DECLARE and EVENTHANDLER_LIST_DECLAREs were moved out of .c
files into appropriate headers (e.g., sys/proc.h, powernv/opal.h).
As a side effect of reduced header pollution, many .c files and headers no
longer contain needed definitions. The remainder of the patch addresses
adding appropriate includes to fix those files.
LOCK_DEBUG and LOCK_FILE_LINE_ARG are moved to sys/_lock.h, as required by
sys/mutex.h since r326106 (but silently protected by header pollution prior
to this change).
No functional change (intended). Of course, any out of tree modules that
relied on header pollution for sys/eventhandler.h, sys/lock.h, or
sys/mutex.h inclusion need to be fixed. __FreeBSD_version has been bumped.
Drivers can now pass up numa domain information via the
mbuf numa domain field. This information is then used
by TCP syncache_socket() to associate that information
with the inpcb. The domain information is then fed back
into transmitted mbufs in ip{6}_output(). This mechanism
is nearly identical to what is done to track RSS hash values
in the inp_flowid.
Follow on changes will use this information for lacp egress
port selection, binding TCP pacers to the appropriate NUMA
domain, etc.
Reviewed by: markj, kib, slavash, bz, scottl, jtl, tuexen
Sponsored by: Netflix
Differential Revision: https://reviews.freebsd.org/D20028
This is fairly similar to the AES-GCM support in ccr(4) in that it will
fall back to software for certain cases (requests with only AAD and
requests that are too large).
Tested by: cryptocheck, cryptotest.py
MFC after: 1 month
Sponsored by: Chelsio Communications
To workaround limitations in the crypto engine, empty buffers are
handled by manually constructing the final length block as the payload
passed to the crypto engine and disabling the normal "final" handling.
For HMAC this length block should hold the length of a single block
since the hash is actually the hash of the IPAD digest, but for
"plain" SHA the length should be zero instead.
Reported by: NIST SHA1 test failure
MFC after: 2 weeks
Sponsored by: Chelsio Communications
This commit adds new if_alloc_domain() and if_alloc_dev() methods to
allocate ifnets. When called with a domain on a NUMA machine,
ifalloc_domain() will record the NUMA domain in the ifnet, and it will
allocate the ifnet struct from memory which is local to that NUMA
node. Similarly, if_alloc_dev() is a wrapper for if_alloc_domain
which uses a driver supplied device_t to call ifalloc_domain() with
the appropriate domain.
Note that the new if_numa_domain field fits in an alignment pad in
struct ifnet, and so does not alter the size of the structure.
Reviewed by: glebius, kib, markj
Sponsored by: Netflix
Differential Revision: https://reviews.freebsd.org/D19930
This fixes a bug that prevented the driver from auto-flashing the
firmware when it didn't see one on the card. This feature was
introduced in r321390 and this bug was introduced in r343269.
Reported by: gallatin@
MFC after: 1 week
Sponsored by: Chelsio Communications
interrupt enable are not fatal.
The firmware sets up all the interrupt enables based on run time
configuration, which means the information in the enables is more
accurate than what's compiled into the driver. This change also allows
the fatal bits to be updated without any changes in the driver in some
cases.
MFC after: 1 week
Sponsored by: Chelsio Communications
The declaration in tcp_var.h is still around so t4_tom continued to
compile but wouldn't load. A separate commit will fix tcp_var.h
Reported By: Dustin Marquess (dmarquess at gmail)
Sponsored by: Chelsio Communications
Recent firmwares prefer to use a different format for viid internally
and this change allows them to do so.
MFC after: 1 week
Sponsored by: Chelsio Communications
Specifically, ccr(4) devices are also children of cxgbe nexus devices.
Rather than making assumptions about the child device's softc, walk
the list of ports from the nexus' softc to determine if a child is a
port in t4_child_location_str(). This fixes a panic when detaching a
ccr device.
Reviewed by: np
MFC after: 1 week
Sponsored by: Chelsio Communications
Differential Revision: https://reviews.freebsd.org/D19399
- Do not use nvf = 4 as it is not really supported by the firmware.
Firmwares 1.23.3.0 and above will ignore it silently.
- Increase PF4's share of the VIs and let it use all of the RSS table.
MFC after: 2 weeks
Sponsored by: Chelsio Communications
This fixes a panic during configuration if the tx channel of a port
isn't the same as its port id.
Reported by: Fabrice Bruel
MFC after: 1 week
Sponsored by: Chelsio Communications
"slow" interrupt handler:
- Expand the list of INT_CAUSE registers known to the driver.
- Add decode information for many more bits but decouple it from the
rest of intr_info so that it is entirely optional.
- Call t4_fatal_err exactly once, and from the top level PL intr handler.
t4_fatal_err:
- Use t4_shutdown_adapter from the common code to stop the adapter.
- Stop servicing slow interrupts after the first fatal one.
Driver/firmware interaction:
- CH_DUMP_MBOX: note whether the mailbox being dumped is a command or a
reply or something else.
- Log the raw value of pcie_fw for some errors.
- Use correct log levels (debug vs. error).
Sponsored by: Chelsio Communications
- Drain offload transmit queues when RATELIMIT is enabled but
TCP_OFFLOAD is not.
- Expose the per-VI nofldtxq and first_ofld_txq sysctls when
RATELIMIT is enabled but TCP_OFFLOAD is not.
- Clear offload transmit queue stats as part of a 'cxgbetool clearstats'
request when RATELIMIT is enabled but TCP_OFFLOAD is not.
Reviewed by: np
MFC after: 2 weeks
Sponsored by: Chelsio Communications
Differential Revision: https://reviews.freebsd.org/D18966
mean that the driver should taste the firmware in the KLD and use that
firmware's version for all its fw_install checks.
The driver gets firmware version information from compiled-in values by
default and this change allows custom (or older/newer) firmware modules
to be used with the stock driver.
There is no change in default behavior.
MFC after: 1 week
Sponsored by: Chelsio Communications
indicates an error. Also, do not remove it twice from the hf list in
this case.
Submitted by: Krishnamraju Eraparaju @ Chelsio
MFC after: 1 week
Sponsored by: Chelsio Communicatons
ccr reuses the control queue and first rx queue from the first port on
each adapter. The driver cannot send requests until those queues are
initialized. Refuse to create sessions for now if the queues aren't
ready. This is a workaround until cxgbe allocates one or more
dedicated queues for ccr.
PR: 233851
MFC after: 1 week
Sponsored by: Chelsio Communications
Differential Revision: https://reviews.freebsd.org/D18478
- Fix PR 227760 by getting the TOE to respond to the SYN after the call
to toe_syncache_add, not during it. The kernel syncache code calls
syncache_respond just before syncache_insert. If the ACK to the
syncache_respond is processed in another thread it may run before the
syncache_insert and won't find the entry. Note that this affects only
t4_tom because it's the only driver trying to insert and expand
syncache entries from different threads.
- Do not leak resources if an embryonic connection terminates at
SYN_RCVD because of L2 lookup failures.
- Retire lctx->synq and associated code because there is never a need to
walk the list of embryonic connections associated with a listener.
The per-tid state is still called a synq entry in the driver even
though the synq itself is now gone.
PR: 227760
MFC after: 2 weeks
Sponsored by: Chelsio Communications
through.
cxgb4vf doesn't own the buffer size list but still expects the first two
entries to be 4K and some power of 2 respectively. The BSD cxgbe
doesn't care where its preferred buffer sizes are as long as they're in
the list somewhere, so just move its entries towards the end as a
workaround.
MFC after: 1 month
Sponsored by: Chelsio Communicatons
card initialization. This is an expanded version of r333682.
Break up prep_firmware into simpler routines while here. Load the
firmware/config KLD only if needed.
MFC after: 1 month
Sponsored by: Chelsio Communications
This fixes builds of kernels without INET6 such as LINT-NOINET6.
Reported by: arybchik
Reviewed by: np
Sponsored by: Chelsio Communications
Differential Revision: https://reviews.freebsd.org/D18384
- Store the clip table in 'struct adapter' instead of in the TOM softc.
- Init the clip table during attach and teardown during detach.
- While here, add a dev.<nexus>.<unit>.misc.clip sysctl to dump the
CLIP table.
This does mean that we update the clip table even if TOE is not enabled,
but non-TOE things need the CLIP table anyway.
Reviewed by: np, Krishnamraju Eraparaju @ Chelsio
Sponsored by: Chelsio Communications
Differential Revision: https://reviews.freebsd.org/D18010
After the fix contained in r341144, cxgbe does not need anymore
to set the IFCAP_NETMAP flag manually.
Reviewed by: np
Approved by: gnn (mentor)
MFC after: 3 days
Differential Revision: https://reviews.freebsd.org/D17987
vmem's are not just used for TLS memory in TOM and the #include actually
predates the TLS code so should not have been removed when the TLS vmem
moved in r340466.
Pointy hat to: jhb
Sponsored by: Chelsio Communications
The key context is always placed immediately after the work request
header. The total work request length has to be rounded up by 16
however.
MFC after: 1 month
Sponsored by: Chelsio Communications
The addresses passed when reading and writing keys are always shifted
right by 5 as the memory locations are addressed in 32-byte chunks, so
the quantum needs to be 32, not 8.
MFC after: 1 month
Sponsored by: Chelsio Communications
For TOE TLS, we just want to advance the send pointer to skip over the
record just sent to the TOE. The recently added sbsndptr_adv() is
sufficient for that and is cheaper.
MFC after: 1 month
Sponsored by: Chelsio Communications
A kernel panic can occur if the cxgbe interface is DOWN
when activating netmap. This patch prevents the driver
from freeing up cxgbe netmap resources when they have not
been allocated.
Submitted by: Nicolas Witkowski <nwitkowski@verisign.com>
Reviewed by: np
MFC after: 1 week
Sponsored by: Verisign, Inc.
Differential Revision: https://reviews.freebsd.org/D17802
r254889 added tcp_state_change() as a centralized place to log state
changes in TCP connections for DTrace. r294869 and r296881 took
advantage of this central location to manage per-state counters.
However, TOE sockets were still performing some (but not all) state
change updates via direct assignments to t_state. This resulted in
state counters underflowing when TOE was in use. Fix by using
tcp_state_change() when changing a TOE connection's state.
Reviewed by: np, markj
MFC after: 1 month
Sponsored by: Chelsio Communications
Differential Revision: https://reviews.freebsd.org/D17915
Previously attempts to read the MC region were failing since the
length was greater than 2^31.
Reviewed by: np
MFC after: 2 months
Differential Revision: https://reviews.freebsd.org/D17857
- Use PH_loc.eight[1] as a general 'cflags' (Chelsio flags) field to
describe properties of a queued packet. The MC_RAW_WR flag
indicates an mbuf holding a raw work request. mbuf_cflags() returns
the current flags.
- Raw work request mbufs are allocated via alloc_wr_mbuf() which will
allocate a single contiguous range to hold the mbuf data. The
consumer can use mtod() to obtain the start of the work request and
write the required work request in the buffer. The mbuf can then be
enqueued directly to the txq via mp_ring_enqueue().
- Since raw work requests might potentially send arbitrary work
requests, only set the EQUIQ and EQUEQ bits on work requests that
support them such as the normal tunneled Ethernet packet work
requests.
Reviewed by: np
Sponsored by: Chelsio Communications
Differential Revision: https://reviews.freebsd.org/D17811
Currently this is a no-op, but will matter in the future when
cannot_use_txpkts() starts checking other conditions than just
needs_tso().
Sponsored by: Chelsio Communications
- Add a bus_child_location_str method to the nexus drivers that prints
out 'port=N' as the location string exported via devinfo and the
'%location' sysctl node.
- We can't use a bus_hint_device_unit to wire the unit numbers of
devices with a fixed devclass as the device gets assigned a unit in
make_device() before the device creator can set softc, etc.
Instead, when adding a child device, use a helper function much like
a bus_hint_device_unit method to look for wiring hints or to return
-1 to let the system choose a unit number. This function requires
an "at" hint for the port pointing to the nexus device and a "port"
hint listing the port number. For example:
hint.cxl.4.at="t5nex0"
hint.cxl.4.port="0"
wires cxl4 to the first port on the t5nex0 adapter.
Requested by: gallatin
MFC after: 2 months
interface into two groups. Filters can be used to match traffic
and distribute it across a group.
hw.cxgbe.nm_split_rss
Sponsored by: Chelsio Communications
subset of a VI's RSS indirection table.
This makes it possible to make groups out of rx queues and steer
different kinds of traffic to different groups. For example, an
interface with 8 rx queues could have all non-TCP traffic delivered to
queues 0-3 and all TCP traffic to queues 4-7.
Note that it is already possible for filters to steer traffic to a
particular queue or to distribute it using the full indirection table
(much like normal rx does).
Sponsored by: Chelsio Communications
The bits that explicitly request cidx updates do not work reliably with
all possible WRs that can be sent over the queue. The F_FW_WR_EQUIQ
requests that still remain may also have to be replaced with explicit
credit flush WRs in the future.
MFC after: 2 days
Sponsored by: Chelsio Communications
- Switch to using 32b port/link capabilities in the driver. The 32b
format is used internally by firmwares > 1.16.45.0 and the driver will
now interact with the firmware in its native format, whether it's 16b
or 32b. Note that the 16b format doesn't have room for 50G, 200G, or
400G speeds.
- Add a bit in the pause_settings knobs to allow negotiated PAUSE
settings to override manual settings.
- Ensure that manual link settings persist across an administrative
down/up as well as transceiver unplug/replug.
- Remove unused is_*G_port() functions.
Approved by: re@ (gjb@)
MFC after: 1 month
Sponsored by: Chelsio Communications
properly in a couple of places in the driver.
Submitted by: Krishnamraju Eraparaju @ Chelsio
Approved by: re@ (rgrimes@)
Sponsored by: Chelsio Communications
hashfilters. Two because the driver needs to look up a hashfilter by
its 4-tuple or tid.
A couple of fixes while here:
- Reject attempts to add duplicate hashfilters.
- Do not assume that any part of the 4-tuple that isn't specified is 0.
This makes it consistent with all other mandatory parameters that
already require explicit user input.
MFC after: 2 weeks
Sponsored by: Chelsio Communications
There used to be one control queue per adapter (the mgmtq) that was
initialized during adapter init and one per port that was initialized
later during port init. This change moves all the control queues (one
per port/channel) to the adapter so that they are initialized during
adapter init and are available before any port is up. This allows the
driver to issue ctrlq work requests over any channel without having to
bring up any port.
MFH: 2 weeks
Sponsored by: Chelsio Communications
own region in the TCAM starting with T6, unlike previous chips where
they were in the same region as normal filters.
These filters "hit" before anything else in the LE's lookup. The exact
order is:
a) High priority filters
b) TOE's active region (TCAM and/or hash)
c) Servers (TOE hw listeners)
d) Normal filters
MFC after: 1 week
Sponsored by: Chelsio Communications
traffic class for rate limiting.
Add experimental knobs that allow the user to specify a default pktsize
and burstsize for traffic classes associated with a port:
dev.<ifname>.<instance>.tc.pktsize
dev.<ifname>.<instance>.tc.burstsize
Sponsored by: Chelsio Communications
them display the current value of the bitfield rather than the fixed
value that was provided when the sysctl node was created.
MFC after: 1 week
Sponsored by: Chelsio Communications
- Ignore any type of TID where the start/end values are not in the
correct order. There are situations where the firmware isn't able to
reserve room for the number requested in the config file but doesn't
report a failure during configuration and instead sets end <= start.
- Track start/end in tid_tab and remove some redundant copies from
adapter->params.
- Move all the start/end and other read-only parameters to a quiet part
of tid_tab, away from the tid locks.
MFC after: 1 week
Sponsored by: Chelsio Communications
initializing the softc for a per-flow rate limiter. The limit happens
to be the same for both and the existing code worked by accident for
common configurations.
Reported by: gallatin@
Sponsored by: Chelsio Communications
Start in "class" instead of "flow" mode. This eliminates the need to
specify an MTU, which is not available that early anyway. It also
allows the user to manually configure ch-rl rate limiting after attach.
This used to fail because ch-rl isn't supported if cl-rl "flow" mode is
configured.
Set all traffic classes to 1Gbps during initialization. The goal is to
start off with _any_ valid configuration and 1Gbps works even for
gigabit cards.
Sponsored by: Chelsio Communications
Track session objects in the framework, and pass handles between the
framework (OCF), consumers, and drivers. Avoid redundancy and complexity in
individual drivers by allocating session memory in the framework and
providing it to drivers in ::newsession().
Session handles are no longer integers with information encoded in various
high bits. Use of the CRYPTO_SESID2FOO() macros should be replaced with the
appropriate crypto_ses2foo() function on the opaque session handle.
Convert OCF drivers (in particular, cryptosoft, as well as myriad others) to
the opaque handle interface. Discard existing session tracking as much as
possible (quick pass). There may be additional code ripe for deletion.
Convert OCF consumers (ipsec, geom_eli, krb5, cryptodev) to handle-style
interface. The conversion is largely mechnical.
The change is documented in crypto.9.
Inspired by
https://lists.freebsd.org/pipermail/freebsd-arch/2018-January/018835.html .
No objection from: ae (ipsec portion)
Reported by: jhb
- Add tracker argument to preemptible epochs
- Inline epoch read path in kernel and tied modules
- Change in_epoch to take an epoch as argument
- Simplify tfb_tcp_do_segment to not take a ti_locked argument,
there's no longer any benefit to dropping the pcbinfo lock
and trying to do so just adds an error prone branchfest to
these functions
- Remove cases of same function recursion on the epoch as
recursing is no longer free.
- Remove the the TAILQ_ENTRY and epoch_section from struct
thread as the tracker field is now stack or heap allocated
as appropriate.
Tested by: pho and Limelight Networks
Reviewed by: kbowling at llnw dot com
Sponsored by: Limelight Networks
Differential Revision: https://reviews.freebsd.org/D16066
can time out if it's backed up due to a non-stop deluge of PAUSE frames
from a misbehaving peer. Detect this situation and toggle MPS TxEn
to allow forward progress.
MFC after: 2 weeks
Sponsored by: Chelsio Communications
of descriptors processed. Add the ability to gather a certain maximum
number of frames in the driver's rx before waking up netmap rx. If
there aren't enough frames then netmap rx will be woken up as usual.
hw.cxgbe.nm_rx_nframes
Sponsored by: Chelsio Communications
The driver assumes the list can change (even though it does't right now)
and queries it every time the sysctl runs.
sysctl dev.<nexus>.<inst>.local_cpus
sysctl dev.<nexus>.<inst>.intr_cpus
sysctl dev.t6nex.0.local_cpus
sysctl dev.t6nex.0.intr_cpus
Sponsored by: Chelsio Communications
It was needed only for ia64 where it was implemented as a call to
bswapXX, which was always a real function. htobeXX with a constant
argument is calculated at compile-time everywhere else.
MFC after: 1 week
Sponsored by: Chelsio Communications
for a port. Fix other related issues while here:
- Require port lock for access to link_config.
- Allow 100Mbps operation by tracking the speed in Mbps. Yes, really.
- New port flag to indicate that the media list is immutable. It will
be used in future refinements.
This also fixes a bug where the driver reports incorrect media with
recent firmwares.
MFC after: 2 days
Sponsored by: Chelsio Communications
This is hardware support for the SO_MAX_PACING_RATE sockopt (see
setsockopt(2)), which is available in kernels built with "options
RATELIMIT".
Relnotes: Yes
Sponsored by: Chelsio Communications
If a socket is closed or shutdown and a partial record (or what
appears to be a partial record) is waiting in the socket buffer,
discard the partial record and close the connection rather than
waiting forever for the rest of the record.
Reported by: Harsh Jain @ Chelsio
Sponsored by: Chelsio Communications
An etid (ethoffload tid) is allocated for a send tag and it acquires a
reference on the traffic class that matches the send parameters
associated with the tag.
Sponsored by: Chelsio Communications
provisioned for NIC_ETHOFLD and the kernel has option RATELIMIT.
It is possible to use the chip's offload queues for normal NIC Tx and
not just TOE Tx. The difference is that these queues support out of
order processing of work requests and have a per-"flowid" mechanism for
tracking credits between the driver and hardware. This allows Tx for
any number of flows bound to different rate limits to be submitted to a
single Tx queue and the work requests for slow flows won't cause HOL
blocking for the rest.
Sponsored by: Chelsio Communications
by default.
This is the first of a series of commits that will add support for
RATELIMIT kernel option to the base if_cxgbe driver, for use with
ordinary NIC traffic "flows". RATELIMIT is already supported by t4_tom
for the fully-offloaded TCP connections that it handles.
Sponsored by: Chelsio Communications
if an error is reported while pre-processing the configuration file that
the driver attempted to use.
Also, allow the user to explicitly use the built-in configuration with
hw.cxgbe.config_file="built-in"
MFC after: 2 days
Sponsored by: Chelsio Communications
- Driver support for hardware NAT.
- Driver support for swapmac action.
- Validate a request to create a hashfilter against the filter mask.
- Add a hashfilter config file for T5.
Sponsored by: Chelsio Communications
This had been the default behavior but was changed accidentally as part
of the recent iw_cxgbe+OFED overhaul. Fix another bug in that change
while here: the global knob affects all the adapters in the system and
should be left alone by per-adapter code.
MFC after: 3 days
Sponsored by: Chelsio Communications
request, which can be used to configure hardware NAT and swapmac.
All firmwares released after Jan 2017 support this work request.
Sponsored by: Chelsio Communications
These filters reside in the card's memory instead of its TCAM and can be
configured via a new "hashfilter" subcommand in cxgbetool. Hash and
normal TCAM filters can be used together. The hardware does an
exact-match of packet fields for hash filters, unlike the masked match
performed for TCAM filters. Any T5/T6 card with memory can support at
least half a million hash filters. The sample config file with the
driver configures 512K of these, it is possible to double this to 1
million+ in some cases.
The chip does an exact-match of fields of incoming datagrams with hash
filters and performs the action configured for the filter if it matches.
The fields to match are specified in a "filter mask" in the firmware
config file. The filter mask always includes the 5-tuple (sip, dip,
sport, dport, ipproto). It can, optionally, also include any subset of
the filter mode (see filterMode and filterMask in the firmware config
file).
For example:
filterMode = fragmentation, mpshittype, protocol, vlan, port, fcoe
filterMask = protocol, port, vlan
Exact values of the 5-tuple, the physical port, and VLAN tag would have
to be provided while setting up a hash filter with the chip
configuration above.
Hash filters support all actions supported by TCAM filters. A packet
that hits a hash filter can be dropped, let through (with optional
steering to a specific queue or RSS region), switched out of another
port (with optional L2 rewrite of DMAC, SMAC, VLAN tag), or get NAT'ed.
(Support for some of these will show up in the driver in a follow-up
commit very shortly).
Sponsored by: Chelsio Communications
These firmwares and the following list of changes are from the public
ChelsioUwire-3.7.1.0 release.
T6 Firmware
================================================================================
Version : 1.19.1.0
Date : 04/23/2018
================================================================================
Fixes
-----
BASE:
- Fixed traffic stall when rate-limit is modified while running traffic.
- Fixes a firmware crash in FW_ETH_TX_EO_WR handling.
- Fixes host DCB support when FW_PORT_CMD is used.
ETH:
- Exit Auto-Negotiation if we don't receive base page from peer within 10s.
This fixes some cases where in we keep on restarting auto negotiation without
ever exiting, resulting in link failure.
- Fixes an issue where VF packets counter were not increasing if VF packets
coalesced WR is used by driver.
OFLD:
- Kernel and user mode NVMEoF performance enhancements.
FOiSCSI:
- Fixes fw crash when trying to connect to non-existence IPv6 iSNS target.
================================================================================
Version : 1.18.9.0
Date : 03/27/2018
================================================================================
Fixes
-----
BASE:
- For Ethernet frames less than 64B, pad them with zero bytes as per IEEE spec
(RFC 894).
- Added a new parameter iqtype to FW_IQ_CMD to identify the ingress NIC or offload
queues. This fixes an issue where driver was receiving interrupt with no new
messages in queue.
- FW_PARAMS_CMD processes all the valaid paramaters and returns value 0UL for
any unknown parameter.
OFLD:
- Fixes connection failure during SRQ reuse.
- Fixes incorrect cqe in case of WRITE with immediate operation.
FOiSCSI:
- Fixes a fw crash when wrong node-id is passed to FW_FOISCSI_CTRL_WR.
FOFCoE:
- Fixes a fw hang while creating NPIV.
Enhancements
------------
ETH:
- A new WR FW_ETH_TX_PKTS_VM_WR added to support VM packet coalescing.
================================================================================
Version : 1.18.4.0
Date : 02/28/2018
================================================================================
Fixes
-----
BASE:
- Fixed Rate limiting not working for 101Mbps<=rate limit<=163Mbps range.
- Fixed starting more than 32 VMs on PF4 causing firmware hang.
ETH:
- Fixed link failure due to FEC mismatch with optics.
- Fixed link failure with link toggle stress tests.
- Only BaseR FEC is supported for 50G.
- Fixed a bug in next page handling which sometimes causes link down.
- Fixed port down due to failre to read eeprom contents of some modules.
- Fixed a bug causing adapter to fail with spider configuration.
FOiSCSI:
- Fixed a bug causing login failure when connecting to multiple targets.
Enhancements
------------
BASE:
- Added a new firmware API to retrieve the maximum temperaturethreshold for
the chip (FW_PARAM_DEV_DIAG_MAXTMPTHRESH).
ETH:
- Added support for user to contol pause negotiation during auto negotiation.
FOiSCSI:
- Added a new facility to redirect few fw events to offload rx queue
(based on driver's configration)
- Driver can ignore providing ipv6 prefix len during ipv6 address configuration.
================================================================================
Version : 1.17.14.0
Date : 12/27/2017
================================================================================
FIXES
-----
BASE:
- Fixed an FLR failure during simulteneous power up of VM.
- Fixed an issue in vlan acl which was limiting vlan range to 1024.
ETH:
- Enabled RS-FEC for 25G active copper cable and 25GBASE-SR.
- When auto negotiation is enabled, final pause settings are resolved
based on local and peer pause settings.
- Handle NACK for an I2C access.
OFLD
- Fixed rdma connection cleanup in SO adpater.
- Fixed rdma connections during read invalidate.
- Fixed the crash when invalid BW rate is passed to fw.
- Fixed the traffic hang when BW allocation is changed from switch during traffic.
FOFCoE:
- Fixed an issue where initiator remains logged-in even after LLDP is disabled
on switch.
ENHANCEMENTS
------------
BASE:
- Added support for 248 VFs.
- Added fw driver periodic calibration for MC.
ETH:
- Added XLAUI port type support.
- Added raw mac entry deletion support (FW_VI_MAC_ID_BASED_FREE).
OFLD:
- Inline IPSec support added (flag F_FW_ULPTX_WR_DATA indicates the inline
IPSec WR).
- New work request FW_RI_RDMA_WRITE_CMPL_WR (write with completion) added to
T5 Firmware
================================================================================
Version : 1.19.1.0
Date : 04/23/2018
================================================================================
Fixes
-----
BASE:
- Fixes a firmware crash in FW_ETH_TX_EO_WR handling.
- Fixes host DCB support when FW_PORT_CMD is used.
ETH:
- Fixes an issue where VF packets counter were not increasing if VF packets
coalesced WR is used by driver.
OFLD:
- Fixes an issue where fw hangs if max traffic rate passed is 0.
FOiSCSI:
- Fixes fw crash when trying to connect to non-existence IPv6 iSNS target.
================================================================================
Version : 1.18.9.0
Date : 03/27/2018
================================================================================
Fixes
-----
BASE:
- For Ethernet frames less than 64B, pad them with zero bytes as per IEEE spec
(RFC 894).
- Added a new parameter iqtype to FW_IQ_CMD to identify the ingress NIC or offload
queues. This fixes an issue where driver was receiving interrupt with no new
messages in queue.
ETH:
- Pad the Ethernet packets of size less than 64B with zeros. This fixes the
incorrect checksum generation of packets less then 64B.
FOiSCSI:
- Fixes a fw crash when wrong node-id is passed to FW_FOISCSI_CTRL_WR.
FOFCoE:
- Fixes a fw hang while creating NPIV.
Enhancements
------------
ETH:
- A new WR FW_ETH_TX_PKTS_VM_WR added to support VM packet coalescing.
================================================================================
Version : 1.18.4.0
Date : 02/28/2018
================================================================================
Fixes
-----
BASE:
- Fixed starting more than 32 VMs on PF4 causing firmware hang.
FOiSCSI:
- Fixed a bug causing login failure when connecting to multiple targets.
Enhancements
------------
BASE:
- Added a new firmware API to retrieve the maximum temperaturethreshold for
the chip (FW_PARAM_DEV_DIAG_MAXTMPTHRESH).
ETH:
- Added support for user to contol pause negotiation during auto negotiation.
FOiSCSI:
- Added a new facility to redirect few fw events to offload rx queue
(based on driver's configration)
- Driver can ignore providing ipv6 prefix len during ipv6 address configuration.
================================================================================
Version : 1.17.14.0
Date : 12/27/2017
================================================================================
FIXES
-----
BASE:
- Fixed an issue in vlan acl which was limiting vlan range to 1024.
ETH:
- Corrected lane inversion logic.
- Fixed improper LED behavior in T580 cards.
- When auto negotiation is enabled, final pause settings are resolved
based on local and peer pause settings.
- Handle NACK for an I2C access.
OFLD
- Fixed rdma connections during read invalidate.
FOiSCSI:
- Fixed a connections hang when link is toggled frequently.
FOFCoE:
- Fixed an issue where initiator remains logged-in even after LLDP is disabled
on switch.
ENHANCEMENTS
------------
BASE:
- Added support for 124 VFs.
ETH:
- Added XLAUI port type support.
- Added raw mac entry deletion support (FW_VI_MAC_ID_BASED_FREE).
OFLD:
- New work request FW_RI_RDMA_WRITE_CMPL_WR (write with completion) added to
optimize NVMEoF write.
T4 Firmware
================================================================================
Version : 1.19.1.0
Date : 04/23/2018
================================================================================
Fixes
-----
BASE:
- Fixes a firmware crash in FW_ETH_TX_EO_WR handling.
- Fixes host DCB support when FW_PORT_CMD is used.
FOiSCSI:
- Fixes fw crash when trying to connect to non-existence IPv6 iSNS target.
================================================================================
Version : 1.18.9.0
Date : 03/27/2018
================================================================================
Fixes
-----
BASE:
- Added a new paramter iqtype to FW_IQ_CMD to identify the ingress NIC or
offload queues. This fixes an issue where driver was receiving interrupt with
no new messages in queue.
FOFCoE:
- Fixes a fw hang while creating NPIV.
Enhancements
------------
ETH:
- A new WR FW_ETH_TX_PKTS_VM_WR added to support VM packet coalescing.
================================================================================
Version : 1.18.4.0
Date : 02/28/2018
================================================================================
Enhancements
------------
BASE:
- Added a new firmware API to retrieve the maximum temperaturethreshold for
the chip (FW_PARAM_DEV_DIAG_MAXTMPTHRESH).
================================================================================
Version : 1.17.14.0
Date : 12/27/2017
================================================================================
FIXES
-----
BASE:
- Fixed an issue in vlan acl which was limiting vlan range to 1024.
MFC after: 3 days
Sponsored by: Chelsio Communications
Filter work requests are submitted in the nexus cdev's ioctl which then
blocks waiting for a reply. If driver detach runs in this state and
disables interrupts the ioctl will never complete and detach will hang
in destroy_cdev.
Sponsored by: Chelsio Communications
Reserve 3b in the 14b atid to identify the owner and use it to dispatch
the CPL. This allows all CPLs that use an atid to be used as shared
CPLs, although ACT_OPEN_RPL is the only one being converted in this
revision.
Sponsored by: Chelsio Communications
intended recipient of a CPL when it can't be determined solely from the
opcode. Retire the per-queue handlers for such CPLs in favor of the new
scheme.
Sponsored by: Chelsio Communications
Previously, get_keyid() was returning the address of the receive key
instead of the transmit key when renegotiating the transmit key. This
could either hang the card (if a connection was only offloading TLS TX
and thus had a receive key address of -1) or cause the connection to
fail by overwriting the wrong key (if both RX and TX TLS were
offloaded).
Submitted by: Harsh Jain @ Chelsio
Sponsored by: Chelsio Communications
Retrieve the tag from the correct ifnet and use the provided tag
(instead of hardcoded 0xffff, implying no tag) in the routines that
process offload policy.
Submitted by: Krishnamraju Eraparaju @ Chelsio
Sponsored by: Chelsio Communications
COP allows fine-grained control on whether to offload a TCP connection
using t4_tom, and what settings to apply to a connection selected for
offload. t4_tom must still be loaded and IFCAP_TOE must still be
enabled for full TCP offload to take place on an interface. The
difference is that IFCAP_TOE used to be the only knob and would enable
TOE for all new connections on the inteface, but now the driver will
also consult the COP, if any, before offloading to the hardware TOE.
A policy is a plain text file with any number of rules, one per line.
Each rule has a "match" part consisting of a socket-type (L = listen,
A = active open, P = passive open, D = don't care) and a pcap-filter(7)
expression, and a "settings" part that specifies whether to offload the
connection or not and the parameters to use if so. The general format
of a rule is: [socket-type] expr => settings
Example. See cxgbetool(8) for more information.
[L] ip && port http => offload
[L] port 443 => !offload
[L] port ssh => offload
[P] src net 192.168/16 && dst port ssh => offload !nagle !timestamp cong newreno
[P] dst port ssh => offload !nagle ecn cong tahoe
[P] dst port http => offload
[A] dst port 443 => offload tls
[A] dst net 192.168/16 => offload !timestamp cong highspeed
The driver processes the rules for each new listen, active open, or
passive open and stops at the first match. There is an implicit rule at
the end of every policy that prohibits offload when no rule in the
policy matches:
[D] all => !offload
This is a reworked and expanded version of a patch submitted by
Krishnamraju Eraparaju @ Chelsio.
Sponsored by: Chelsio Communications
Changelist:
- Turn tx_rings and rx_rings arrays into arrays of pointers to kring
structs. This patch includes fixes for ixv, ixl, ix, re, cxgbe, iflib,
vtnet and ptnet drivers to cope with the change.
- Generalize the nm_config() callback to accept a struct containing many
parameters.
- Introduce NKR_FAKERING to support buffers sharing (used for netmap
pipes)
- Improved API for external VALE modules.
- Various bug fixes and improvements to the netmap memory allocator,
including support for externally (userspace) allocated memory.
- Refactoring of netmap pipes: now linked rings share the same netmap
buffers, with a separate set of kring pointers (rhead, rcur, rtail).
Buffer swapping does not need to happen anymore.
- Large refactoring of the control API towards an extensible solution;
the goal is to allow the addition of more commands and extension of
existing ones (with new options) without the need of hacks or the
risk of running out of configuration space.
A new NIOCCTRL ioctl has been added to handle all the requests of the
new control API, which cover all the functionalities so far supported.
The netmap API bumps from 11 to 12 with this patch. Full backward
compatibility is provided for the old control command (NIOCREGIF), by
means of a new netmap_legacy module. Many parts of the old netmap.h
header has now been moved to netmap_legacy.h (included by netmap.h).
Approved by: hrs (mentor)
IFF_UP and IFF_DRV_RUNNING out of sync. ifhwioctl in the kernel pays no
attention to the return code from the driver ioctl during SIOCSIFFLAGS
so these messages are the only indication that the ioctl was called but
failed.
MFC after: 1 week
Sponsored by: Chelsio Communications
The TCB is read using a memory window right now. A better alternate to
get self-consistent, uncached information would be to use a GET_TCB
request but waiting for a reply from hw while holding non-sleepable
locks is quite inconvenient.
Sponsored by: Chelsio Communications
Differential Revision: https://reviews.freebsd.org/D14817
This fixes 32-bit compat (no ioctl command defintions are required
as struct ifreq is the same size). This is believed to be sufficent to
fully support ifconfig on 32-bit systems.
Reviewed by: kib
Obtained from: CheriBSD
MFC after: 1 week
Relnotes: yes
Sponsored by: DARPA, AFRL
Differential Revision: https://reviews.freebsd.org/D14900
Requests to modify the state of TLS connections need to be sent on the
same queue as TLS record transmit requests to ensure ordering.
However, in order to use the offload transmit queue in t4_set_tcb_field(),
the function needs to be updated to do proper flow control / credit
management when queueing a request to an offload queue. This required
passing a pointer to the toepcb itself to this function, so while here
remove the 'tid' and 'iqid' parameters and obtain those values from the
toepcb in t4_set_tcb_field() itself.
Submitted by: Harsh Jain @ Chelsio (original version)
Reviewed by: np
Sponsored by: Chelsio Communications
Differential Revision: https://reviews.freebsd.org/D14871
This fixes an avoidable EINVAL when the user tries to disable AN after
the port is initialized but l1cfg doesn't have a valid speed to use.
MFC after: 1 week
Sponsored by: Chelsio Communications
Compare sbavail() with the cached sb_off of already-sent data instead of
always comparing with zero. This will correctly close the connection and
send the FIN if the socket buffer contains some previously-sent data but
no unsent data.
Reported by: Harsh Jain @ Chelsio
Sponsored by: Chelsio Communications