cannot be freed while do_pass_accept_req is running. This closes a race
where do_pass_establish on another CPU (the driver chose a different
queue for the new tid) expands the synq entry into a full PCB and then
releases the only hold on it, all while do_pass_accept_req is still
running.
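
The general pattern for closing this kind of race is an extra hold taken by the path that is still running. The following is only a minimal sketch of that idea, assuming a plain reference count; struct synq_entry, entry_hold() and entry_release() are hypothetical names, not the driver's, and a real driver would use atomic operations.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for a synq entry; not the driver's structure. */
struct synq_entry {
	int refcnt;	/* number of holds (a real driver would use atomics) */
	bool freed;	/* set instead of calling free() so the sketch is observable */
};

/* Take an additional hold; the caller must already own one. */
static void
entry_hold(struct synq_entry *e)
{
	assert(e->refcnt > 0 && !e->freed);
	e->refcnt++;
}

/* Drop one hold.  Returns true if it was the last one and the entry is gone. */
static bool
entry_release(struct synq_entry *e)
{
	assert(e->refcnt > 0);
	if (--e->refcnt == 0) {
		e->freed = true;
		return (true);
	}
	return (false);
}
```

With the accept path holding its own reference, a release by the establish path on another CPU no longer drops the count to zero.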
MFC after: 3 days
This is the Compressed Local IPv6 table on the chip. To save space, the
chip uses an index into this table instead of a full IPv6 address in
some of its hardware data structures.
For now the driver fills this table with all the local IPv6 addresses
that it sees at the time the table is initialized. I'll improve this
later so that the table is updated whenever new IPv6 addresses are
configured or existing ones deleted.
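
As a rough illustration of the index-instead-of-address idea, here is a minimal sketch of such a table kept in software. CLIP_SIZE, struct clip_table and clip_add() are hypothetical; the real table lives on the chip and its size is not stated here.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define CLIP_SIZE 8	/* hypothetical size, for illustration only */

struct clip_table {
	uint8_t addr[CLIP_SIZE][16];	/* 128-bit IPv6 addresses */
	int used;			/* number of valid entries */
};

/*
 * Add an address if it is not already present.  Returns the index that
 * stands in for the full address, or -1 if the table is full.
 */
static int
clip_add(struct clip_table *t, const uint8_t a[16])
{
	assert(t->used >= 0 && t->used <= CLIP_SIZE);
	for (int i = 0; i < t->used; i++)
		if (memcmp(t->addr[i], a, 16) == 0)
			return (i);
	if (t->used == CLIP_SIZE)
		return (-1);
	memcpy(t->addr[t->used], a, 16);
	return (t->used++);
}
```

A structure that would otherwise carry 16 bytes of address can then carry the small integer returned by clip_add().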
MFC after: 1 week
- Teach find_best_mtu_idx() to deal with IPv6 endpoints.
- Install correct protosw in offloaded TCP/IPv6 sockets when DDP is
enabled.
- Move set_tcp_ddp_ulp_mode to t4_tom.c so that t4_tom.h can be included
without having to drag in t4_msg.h too. This was bothering the iWARP
driver for some reason.
MFC after: 1 week
- Add full support for IPv6 addresses.
- Read the size of the L2 table during attach. Do not assume that PCIe
physical function 4 of the card has all of the table to itself.
- Use FNV instead of Jenkins to hash L3 addresses and drop the private
copy of jhash.h from the driver.
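
For reference, a standalone sketch of 32-bit FNV-1 hashing applied to an L3 address. This is not the driver's code: fnv1_32() and l3_bucket() are illustrative names, and reducing the hash with a modulo is an assumption about how a bucket might be chosen.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define FNV1_32_INIT	0x811c9dc5u	/* standard 32-bit FNV offset basis */
#define FNV_32_PRIME	0x01000193u	/* standard 32-bit FNV prime */

/* 32-bit FNV-1: multiply by the prime, then xor in each byte. */
static uint32_t
fnv1_32(const void *buf, size_t len, uint32_t hval)
{
	const uint8_t *p = buf;

	assert(buf != NULL || len == 0);
	while (len-- > 0) {
		hval *= FNV_32_PRIME;
		hval ^= *p++;
	}
	return (hval);
}

/* Map an IPv4 (4 byte) or IPv6 (16 byte) address to a hash bucket. */
static unsigned int
l3_bucket(const uint8_t *addr, size_t addrlen, unsigned int nbuckets)
{
	assert(nbuckets > 0);
	return (fnv1_32(addr, addrlen, FNV1_32_INIT) % nbuckets);
}
```

The same routine handles both address families because it hashes a byte string of any length.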
MFC after: 1 week
on the fast data path) and use them instead of frobbing the adapter lock
and busy flag directly.
Other changes made while reworking all slow operations:
- Wait for the reply to a filter request (add/delete). This guarantees
that the operation is complete by the time the ioctl returns.
- Tidy up the tid_info structure.
- Do not allow the tx queue size to be set to something that's not a
power of 2.
MFC after: 1 week
resources are partitioned.
- Reduce the number of virtual interfaces reserved for PF4. This leaves
spare room in the source MAC table and allows the driver to set up
filters that rewrite the source MAC address.
- Reduce the number of filters and use the freed up space for the CLIP
(Compressed Local IPv6 addresses) table. This is a prerequisite for
IPv6 TOE support which will follow separately in a series of commits.
MFC after: 1 week
embryonic connection has been set up and never attempt to abort a tid
before this is done. This fixes a bad race where a listening socket is
closed when the driver is in the middle of step (b) here. The symptoms
of this were "ARP miss" errors from the driver followed by tid leaks.
A hardware-offloaded passive open works this way:
a) A SYN "hits" the TCAM entry for a server tid and the chip delivers it
to the queue associated with the server tid (say, queue A). It waits
for a response from the driver telling it what to do.
b) The driver decides it is ok to proceed. It adds the new tid to the
list of embryonic connections associated with the server tid and then
hands off the SYN to the kernel's syncache to make sure that the kernel
okays it too. If it does then the driver provides an L2 table entry,
queue id (say, queue B), etc. and instructs the chip to send the SYN/ACK
response.
c) The chip delivers a status to queue B depending on how the third step
of the 3-way handshake goes. The driver removes the tid from its list
of embryonic connections and either expands the syncache entry or
destroys the tid. In any case all subsequent messages for the new tid
will be delivered to queue B, not queue A. Anything running in queue B
knows that the L2 entry has long been set up and the new flag is of no
interest from here on. If the listener is closed it will deal with
so_comp as normal.
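
The gating described above can be sketched roughly as follows. All names here (struct synqe, SYNQE_SETUP_DONE, synqe_setup_done(), synqe_try_abort()) are hypothetical, and the sketch does not show what happens to an entry whose abort is skipped.

```c
#include <assert.h>
#include <stdbool.h>

#define SYNQE_SETUP_DONE 0x1	/* hypothetical flag name */

/* Hypothetical embryonic connection state; not the driver's structure. */
struct synqe {
	unsigned int flags;
	bool abort_sent;
};

/* End of step (b): L2 entry, queue id, etc. have been given to the chip. */
static void
synqe_setup_done(struct synqe *s)
{
	s->flags |= SYNQE_SETUP_DONE;
}

/*
 * Listener close path: abort the tid only if its setup has completed.
 * Returns true if an abort was issued.
 */
static bool
synqe_try_abort(struct synqe *s)
{
	assert(!s->abort_sent);
	if ((s->flags & SYNQE_SETUP_DONE) == 0)
		return (false);
	s->abort_sent = true;
	return (true);
}
```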
MFC after: 1 week
counter) when the syncache doesn't want the driver to reply to an
incoming SYN. This fixes a harmless bug where tids_in_use would
go out of sync with the hardware counter.
MFC after: 3 days
This lets userspace read arbitrary information from the SFP+ modules
etc. on this bus.
Reading multiple bytes in the same transaction isn't possible right now.
I'll update the driver once the chip's firmware supports this.
MFC after: 3 days
#defines. This also has the advantage that it makes the names more
compact, and also allows us to correct the non-uniform naming of
the PCIM_LINK_* defines, making them all consistent amongst themselves.
This is a mostly mechanical rename:
s/PCIR_EXPRESS_/PCIER_/g
s/PCIM_EXP_/PCIEM_/g
s/PCIM_LINK_/PCIEM_LINK_/g
When this is MFC'd, #defines will be added for the old names to assist
out-of-tree drivers.
Discussed with: jhb
MFC after: 1 week
evicted from the syncache but a later syncache_expand succeeds because
of syncookies. The TOE driver has to resort to more direct means to
install its hooks in the socket in this case.
the TOE driver reports that an active open failed. toe_connect_failed is
supposed to handle this but it should be provided the inpcb instead of the
tcpcb which may no longer be around.
Basically, this is automatic rx zero copy when feasible. TCP payload is
DMA'd directly into the userspace buffer described by the uio submitted
in soreceive by an application.
- Works with sockets that are being handled by the TCP offload engine
of a T4 chip (you need t4_tom.ko module loaded after cxgbe, and an
"ifconfig +toe" on the cxgbe interface).
- Does not require any modification to the application.
- Not enabled by default. Use hw.t4nex.<X>.toe.ddp="1" to enable it.
- Set up multiple DDP page sizes. When the driver attempts DDP it will
try to combine physically contiguous pages into regions of these sizes.
- Set the indicate size such that the payload carried in the indicate can
be copied in the header mbuf (and the 16K rx buffer can be recycled).
- Set DDP threshold to the max payload that the chip will coalesce and
deliver to the driver (this is ~16K by default, which is also why the
offload rx queue is backed by 16K buffers). If the chip is able to
coalesce up to the max it's allowed to, it's a good sign that the peer
is transmitting in bulk without any TCP PSH.
MFC after: 2 weeks