freebsd-dev

Author	SHA1	Message	Date
Navdeep Parhar	87b027ba69	cxgbe(4): Enable automatic cidx flush for all control queues. MFC after: 3 days	2017-01-09 22:20:09 +00:00
Navdeep Parhar	f50c49cca7	cxgbe(4): The wraparound logic in start_wrq_wr() should not get involved in work requests that end at the end of the descriptor ring, even though the pidx wraps around to 0. MFC after: 3 days	2017-01-09 22:18:08 +00:00
Navdeep Parhar	1de8c69de7	cxgbe(4): Deal with compressed error vectors. MFC after: 3 days Sponsored by: Chelsio Communications	2016-12-15 02:05:29 +00:00
Navdeep Parhar	77e9044c47	cxgbe(4): Fix bug in the calculation of the number of physically contiguous regions in an mbuf chain. If the payload of an mbuf ends at a page boundary count_mbuf_nsegs would incorrectly consider the next mbuf's payload physically contiguous based solely on a KVA comparison. MFC after: 1 week Sponsored by: Chelsio Communications	2016-10-24 19:09:56 +00:00
Navdeep Parhar	aa93b99aa0	cxgbe(4): Make the location/length of all descriptor rings available in the sysctl MIB.	2016-09-23 20:03:28 +00:00
Navdeep Parhar	8c0ca00b72	cxgbe(4): Setup congestion response for T6 rx queues.	2016-09-21 00:50:22 +00:00
Navdeep Parhar	0459a175eb	cxgbe(4): Fixes to wrq stats. - Increment tx_wrs_copied in the correct place. - Add tx_wrs_sspace to the sysctl MIB. Sponsored by: Chelsio Communications	2016-09-19 17:16:51 +00:00
Navdeep Parhar	97f2919d54	cxgbe(4): Use the interface's viid to calculate the PF/VF/VFValid fields to use in tx work requests.	2016-09-15 08:30:47 +00:00
Navdeep Parhar	ed7e5640a5	cxgbe(4): Use smaller min/max bursts for fl descriptors with a T6. Sponsored by: Chelsio Communications	2016-09-11 17:51:17 +00:00
Navdeep Parhar	0dbc6cfd75	cxgbe(4): Update the pad_boundary calculation for T6, which has a different range of boundaries. Sponsored by: Chelsio Communications	2016-09-11 17:22:54 +00:00
Navdeep Parhar	472a6004cf	cxgbe(4): Use correct macro for header length with T6 ASICs. This affects the transmit of the VF driver only. Sponsored by: Chelsio Communications	2016-09-11 16:11:51 +00:00
Navdeep Parhar	9e7cb06c17	cxgbe(4): Do not prescreen frames before attempting LRO. Sponsored by: Chelsio Communications	2016-09-09 07:34:14 +00:00
John Baldwin	6af45170c1	Chelsio T4/T5 VF driver. The cxgbev/cxlv driver supports Virtual Function devices for Chelsio T4 and T5 adapters. The VF devices share most of their code with the existing PF4 driver (cxgbe/cxl) and as such the VF device driver currently depends on the PF4 driver. Similar to the cxgbe/cxl drivers, the VF driver includes a t4vf/t5vf PCI device driver that attaches to the VF device. It then creates child cxgbev/cxlv devices representing ports assigned to the VF. By default, the PF driver assigns a single port to each VF. t4vf_hw.c contains VF-specific routines from the shared code used to fetch VF-specific parameters from the firmware. t4_vf.c contains the VF-specific PCI device driver and includes its own attach routine. VF devices are required to use a different firmware request when transmitting packets (which in turn requires a different CPL message to encapsulate messages). This alternate firmware request does not permit chaining multiple packets in a single message, so each packet results in a firmware request. In addition, the different CPL message requires more detailed information when enabling hardware checksums, so parse_pkt() on VF devices must examine L2 and L3 headers for all packets (not just TSO packets). Finally, L2 checksums on non-UDP/non-TCP packets do not work reliably (the firmware trashes the IPv4 fragment field), so IPv4 checksums for such packets are calculated in software. Most of the other changes in the non-VF-specific code are to expose various variables and functions private to the PF driver so that they can be used by the VF driver. Note that a limited subset of cxgbetool functions are supported on VF devices including register dumps, scheduler classes, and clearing of statistics. In addition, TOE is not supported on VF devices, only for the PF interfaces. 
Reviewed by: np MFC after: 2 months Sponsored by: Chelsio Communications Differential Revision: https://reviews.freebsd.org/D7599	2016-09-07 18:13:57 +00:00
John Baldwin	e06ab612d2	Don't break out of the m_advance() loop if len drops to zero. If a packet contains the Ethernet header (14 bytes) in the first mbuf and the payload (IP + UDP + data) in the second mbuf, then the attempt to fetch the l3hdr will return a NULL pointer. The first loop iteration will drop len to zero and exit the loop without setting 'p'. However, the desired data is at the start of the second mbuf, so the correct behavior is to loop around and let the conditional set 'p' to m_data of the next mbuf (and leave offset as 0). Reviewed by: np Sponsored by: Chelsio Communications	2016-09-07 18:08:43 +00:00
Navdeep Parhar	7cba15b16e	cxgbe/cxgbei: Retire all DDP related code from cxgbei and switch to routines available in t4_tom to manage the iSCSI DDP page pod region. This adds the ability to use multiple DDP page sizes to the iSCSI driver, among other improvements. Sponsored by: Chelsio Communications	2016-09-01 20:43:01 +00:00
John Baldwin	59c1e950b9	Make SGE parameter handling more VF-friendly. Add fields to hold the SGE control register and free list buffer sizes to the sge_params structure. Populate these new fields in t4_init_sge_params() for PF devices and change t4_read_chip_settings() to pull these values out of the params structure instead of reading registers directly. This will permit t4_read_chip_settings() to be reused for VF devices which cannot read SGE registers directly. While here, move the call to t4_init_sge_params() to get_params__post_init(). The VF driver will populate the SGE parameters structure via a different method before calling t4_read_chip_settings(). Reviewed by: np Sponsored by: Chelsio Communications Differential Revision: https://reviews.freebsd.org/D7476	2016-08-15 17:40:05 +00:00
John Baldwin	ec55567ce6	Track the base absolute ID of ingress and egress queues. Use this to map an absolute queue ID to a logical queue ID in interrupt handlers. For the regular cxgbe/cxl drivers this should be a no-op as the base absolute ID should be zero. VF devices have a non-zero base absolute ID and require this change. While here, export the absolute ID of egress queues via a sysctl. Reviewed by: np Sponsored by: Chelsio Communications Differential Revision: https://reviews.freebsd.org/D7446	2016-08-09 17:49:42 +00:00
John Baldwin	8f6690d385	Fix a typo.	2016-08-08 21:28:02 +00:00
John Baldwin	315048f2ad	Store the offset of the KDOORBELL and GTS registers in the softc. VF devices use a different register layout than PF devices. Storing the offset in a value in the softc allows code to be shared between the PF and VF drivers. Reviewed by: np Sponsored by: Chelsio Communications Differential Revision: https://reviews.freebsd.org/D7389	2016-08-01 22:39:51 +00:00
John Baldwin	29c229e9fc	Mark spg_len and fl_pktshift static. These variables are no longer exported to t4_netmap.c after r296478.	2016-07-28 17:37:12 +00:00
John Baldwin	069af0eb14	Install a handler for firmware work request error messages. If a driver sends a malformed or disallowed work request, the firmware responds with a work request error. Previously the driver treated this as an unexpected message and panicked. Now it decodes the error message to aid in debugging. Reviewed by: np (older version) MFC after: 1 month Sponsored by: Chelsio Communications Differential Revision: https://reviews.freebsd.org/D6950	2016-07-22 21:52:07 +00:00
Navdeep Parhar	671bf2b8b2	cxgbe(4): Changes to the CPL-handler registration mechanism and code related to "shared" CPLs. a) Combine t4_set_tcb_field and t4_set_tcb_field_rpl into a single function. Allow callers to direct the response to any iq. Tidy up set_ulp_mode_iscsi while there to use names from t4_tcb.h instead of magic constants. b) Remove all CPL handler tables from struct adapter. This reduces its size by around 2KB. All handlers are now registered at MOD_LOAD instead of attach or some kind of initialization/activation. The registration functions do not need an adapter parameter any more. c) Add per-iq handlers to deal with CPLs whose destination cannot be determined solely from the opcode. There are 2 such CPLs in use right now: SET_TCB_RPL and L2T_WRITE_RPL. The base driver continues to send filter and L2T_WRITEs over the mgmtq and solicits the reply on fwq. t4_tom (including the DDP code) now uses the port's ctrlq to send L2T_WRITEs and SET_TCB_FIELDs and solicits the reply on an ofld_rxq. fwq and ofld_rxq have different handlers that know what kind of tid to expect in the reply. Update t4_write_l2e and callers to support any wrq/iq combination. Approved by: re@ (kib@) Sponsored by: Chelsio Communications	2016-07-05 01:29:24 +00:00
Navdeep Parhar	62291463de	cxgbe(4): Merge netmap support from the ncxgbe/ncxl interfaces to the vcxgbe/vcxl interfaces and retire the 'n' interfaces. The main cxgbe/cxl interfaces and tunables related to them are not affected by any of this and will continue to operate as usual. The driver used to create an additional 'n' interface for every cxgbe/cxl interface if "device netmap" was in the kernel. The 'n' interface shared the wire with the main interface but was otherwise autonomous (with its own MAC address, etc.). It did not have normal tx/rx but had a specialized netmap-only data path. r291665 added another set of virtual interfaces (the 'v' interfaces) to the driver. These had normal tx/rx but no netmap support. This revision consolidates the features of both the interfaces into the 'v' interface which now has a normal data path, TOE support, and native netmap support. The 'v' interfaces need to be created explicitly with the hw.cxgbe.num_vis tunable. This means "device netmap" will not result in the automatic creation of any virtual interfaces. The following tunables can be used to override the default number of queues allocated for each 'v' interface. nofld* = 0 will disable TOE on the virtual interface and nnm* = 0 will disable native netmap support. # number of normal NIC queues hw.cxgbe.ntxq_vi hw.cxgbe.nrxq_vi # number of TOE queues hw.cxgbe.nofldtxq_vi hw.cxgbe.nofldrxq_vi # number of netmap queues hw.cxgbe.nnmtxq_vi hw.cxgbe.nnmrxq_vi hw.cxgbe.nnm{t,r}xq{10,1}g tunables have been removed. --- tl;dr version --- The workflow for netmap on cxgbe starting with FreeBSD 11 is: 1) "device netmap" in the kernel config. 2) "hw.cxgbe.num_vis=2" in loader.conf. num_vis > 2 is ok too, you'll end up with multiple autonomous netmap-capable interfaces for every port. 3) "dmesg | grep vcxl | grep netmap" to verify that the interface has netmap queues. 4) Use any of the 'v' interfaces for netmap. pkt-gen -i vcxl<n>... . 
One major improvement is that the netmap interface has a normal data path as expected. 5) Just ignore the cxl interfaces if you want to use netmap only. No need to bring them up. The vcxl interfaces are completely independent and everything should just work. --------------------- Approved by: re@ (gjb@) Relnotes: Yes Sponsored by: Chelsio Communications	2016-06-23 02:53:00 +00:00
Navdeep Parhar	02f972e8f3	cxgbe(4): Add a sysctl to manage the binding of a txq to a traffic class. Sponsored by: Chelsio Communications	2016-06-08 14:15:29 +00:00
Pedro F. Giffuni	453130d9bf	sys/dev: minor spelling fixes. Most affect comments, very few have user-visible effects.	2016-05-03 03:41:25 +00:00
Navdeep Parhar	cda2ab0e7a	cxgbe(4): Always dispatch all work requests that have been written to the descriptor ring before leaving drain_wrq_wr_list.	2016-04-12 22:11:29 +00:00
Sepherosa Ziehau	6dd38b8716	tcp/lro: Use tcp_lro_flush_all in device drivers to avoid code duplication And factor out tcp_lro_rx_done, which deduplicates the same logic with netinet/tcp_lro.c Reviewed by: gallatin (1st version), hps, zbb, np, Dexuan Cui <decui microsoft com> Sponsored by: Microsoft OSTC Differential Revision: https://reviews.freebsd.org/D5725	2016-04-01 06:28:33 +00:00
Navdeep Parhar	78552b23a5	cxgbe(4): Be consistent and call ETHER_BPF_MTAP before writing anything to the descriptor ring no matter what path the frame takes within the driver's tx.	2016-03-22 18:56:23 +00:00
Navdeep Parhar	90e7434a6d	cxgbe(4): Add a struct sge_params to store per-adapter SGE parameters. Move the code that reads all the parameters to t4_init_sge_params in the shared code. Use these per-adapter values instead of globals. Sponsored by: Chelsio Communications	2016-03-08 00:23:56 +00:00
Navdeep Parhar	d1205d093d	cxgbe(4): Very basic T6 awareness. This is part of ongoing work to update to the latest internal shared code. - Add a chip_params structure to keep track of hardware constants for all generations of Terminators handled by cxgbe. - Update t4_hw_pci_read_cfg4 to work with T6. - Update the hardware debug sysctls (hidden within dev.<tNnex>.<n>.misc.*) to work with T6. Most of the changes are in the decoders for the CIM logic analyzer and the MPS TCAM. - Acquire the regwin lock around indirect register accesses. Obtained from: Chelsio Communications Sponsored by: Chelsio Communications	2016-03-04 13:11:13 +00:00
Gleb Smirnoff	b4b12e52fb	Garbage collect unused arguments of m_init().	2016-02-10 18:54:18 +00:00
Hans Petter Selasky	e936121d31	Add optimizing LRO wrapper: - Add optimizing LRO wrapper which pre-sorts all incoming packets according to the hash type and flowid. This prevents exhaustion of the LRO entries due to too many connections at the same time. Testing using a larger number of higher bandwidth TCP connections showed that the incoming ACK packet aggregation rate increased from ~1.3:1 to almost 3:1. Another test showed that for a number of TCP connections greater than 16 per hardware receive ring, where 8 TCP connections was the LRO active entry limit, there was a significant improvement in throughput due to being able to fully aggregate more than 8 TCP streams. For very few very high bandwidth TCP streams, the optimizing LRO wrapper will add CPU usage instead of reducing CPU usage. This is expected. Network drivers which want to use the optimizing LRO wrapper need to call "tcp_lro_queue_mbuf()" instead of "tcp_lro_rx()" and "tcp_lro_flush_all()" instead of "tcp_lro_flush()". Further the LRO control structure must be initialized using "tcp_lro_init_args()" passing a non-zero number into the "lro_mbufs" argument. - Make LRO statistics 64-bit. Previously 32-bit integers were used for statistics which can be prone to wrap-around. Fix this while at it and update all SYSCTLs which expose LRO statistics. - Ensure all data is freed when destroying an LRO control structure, especially leftover LRO entries. - Reduce number of memory allocations needed when setting up an LRO control structure by precomputing the total amount of memory needed. - Add own memory allocation counter for LRO. - Bump the FreeBSD version to force recompilation of all KLDs due to change of the LRO control structure size. Sponsored by: Mellanox Technologies Reviewed by: gallatin, sbruno, rrs, gnn, transport Tested by: Netflix Differential Revision: https://reviews.freebsd.org/D4914	2016-01-19 15:33:28 +00:00
John Baldwin	fe2ebb7644	Add support for configuring additional virtual interfaces (VIs) on a port. Each virtual interface has its own MAC address, queues, and statistics. The dedicated netmap interfaces (ncxgbeX / ncxlX) were already implemented as additional VIs on each port. This change allows additional non-netmap interfaces to be configured on each port. Additional virtual interfaces use the naming scheme vcxgbeX or vcxlX. Additional VIs are enabled by setting the hw.cxgbe.num_vis tunable to a value greater than 1 before loading the cxgbe(4) or cxl(4) driver. NB: The first VI on each port is the "main" interface (cxgbeX or cxlX). T4/T5 NICs provide a limited number of MAC addresses for each physical port. As a result, a maximum of six VIs can be configured on each port (including the "main" interface and the netmap interface when netmap is enabled). One user-visible result is that when netmap is enabled, packets received or transmitted via the netmap interface are no longer counted in the stats for the "main" interface, but are now accounted to the netmap interface. The netmap interfaces now also have a new-bus device and export various information sysctl nodes via dev.n(cxgbe|cxl).X. The cxgbetool 'clearstats' command clears the stats for all VIs on the specified port along with the port's stats. There is currently no way to clear the stats of an individual VI. Reviewed by: np MFC after: 1 month Sponsored by: Chelsio	2015-12-03 00:02:01 +00:00
Navdeep Parhar	9af71ab3bc	cxgbe(4): Add a new knob that controls the congestion response of netmap rx queues. The default is to drop rather than backpressure. This decouples the congestion settings of NIC and netmap rx queues. MFC after: 3 days	2015-07-06 20:56:59 +00:00
Navdeep Parhar	41f7622b64	cxgbe(4): Do not override the global defaults for congestion drops. The hw.cxgbe.cong_drop knob is not affected by this change because the driver sets up congestion drop on a per-queue basis. MFC after: 3 days	2015-07-06 20:28:42 +00:00
Navdeep Parhar	dbbf46c40c	cxgbe: get_fl_payload returns a header mbuf when successful. MFC after: 3 days	2015-06-23 05:55:13 +00:00
Navdeep Parhar	6af2071b47	cxgbe: set minimum burst size when fetching freelist buffers to 128B. MFC after: 3 days	2015-06-01 00:55:15 +00:00
Navdeep Parhar	70ca622987	cxgbe(4): provide the exact RSS hash type instead of a catch-all value to the upper layers.	2015-03-26 18:45:51 +00:00
Navdeep Parhar	1605bac6fb	cxgbe(4): set up congestion management for netmap rx queues. The hw.cxgbe.cong_drop knob controls the response of the chip when netmap queues are congested.	2015-02-24 18:40:10 +00:00
Navdeep Parhar	c5bb375553	cxgbe(4): there is no need to force an "unimplemented" panic needlessly. The calls to free_nm_txq and free_nm_rxq are made just a few lines prior to the panic.	2015-02-20 22:57:54 +00:00
Navdeep Parhar	7951040f8a	cxgbe(4): major tx rework. a) Front load as much work as possible in if_transmit, before any driver lock or software queue has to get involved. b) Replace buf_ring with a brand new mp_ring (multiproducer ring). This is specifically for the tx multiqueue model where one of the if_transmit producer threads becomes the consumer and other producers carry on as usual. mp_ring is implemented as standalone code and it should be possible to use it in any driver with tx multiqueue. It also has: - the ability to enqueue/dequeue multiple items. This might become significant if packet batching is ever implemented. - an abdication mechanism to allow a thread to give up writing tx descriptors and have another if_transmit thread take over. A thread that's writing tx descriptors can end up doing so for an unbounded time period if a) there are other if_transmit threads continuously feeding the software queue, and b) the chip keeps up with whatever the thread is throwing at it. - accurate statistics about interesting events even when the stats come at the expense of additional branches/conditional code. The NIC txq lock is uncontested on the fast path at this point. I've left it there for synchronization with the control events (interface up/down, modload/unload). c) Add support for "type 1" coalescing work request in the normal NIC tx path. This work request is optimized for frames with a single item in the DMA gather list. These are very common when forwarding packets. Note that netmap tx in cxgbe already uses these "type 1" work requests. d) Do not request automatic cidx updates every 32 descriptors. Instead, request updates via bits in individual work requests (still every 32 descriptors approximately). Also, request an automatic final update when the queue idles after activity. This means NIC tx reclaim is still performed lazily but it will catch up quickly as soon as the queue idles. 
This seems to be the best middle ground and I'll probably do something similar for netmap tx as well. e) Implement a faster tx path for WRQs (used by TOE tx and control queues, _not_ by the normal NIC tx). Allow work requests to be written directly to the hardware descriptor ring if room is available. I will convert t4_tom and iw_cxgbe modules to this faster style gradually. MFC after: 2 months	2014-12-31 23:19:16 +00:00
Navdeep Parhar	b741402c40	cxgbe(4): allow the driver to use rx buffers that do not end on a pack boundary. MFC after: 2 weeks	2014-12-06 01:47:38 +00:00
Navdeep Parhar	e3207e1973	cxgbe(4): Allow for different pad and pack boundaries for different adapters. Set the pack boundary for T5 cards to be the same as the PCIe max payload size. The chip likes it this way. In this revision the driver allocates rx buffers that align on both boundaries. This is not a strict requirement and a followup commit will switch the driver to a more relaxed allocation strategy. MFC after: 2 weeks	2014-12-06 00:13:56 +00:00
Hans Petter Selasky	c25290420e	Start process of removing the use of the deprecated "M_FLOWID" flag from the FreeBSD network code. The flag is still kept around in the "sys/mbuf.h" header file, but no longer has any users. Instead the "m_pkthdr.rsstype" field in the mbuf structure is now used to decide the meaning of the "m_pkthdr.flowid" field. To modify the "m_pkthdr.rsstype" field please use the existing "M_HASHTYPE_XXX" macros as defined in the "sys/mbuf.h" header file. This patch introduces new behaviour in the transmit direction. Previously network drivers checked if "M_FLOWID" was set in "m_flags" before using the "m_pkthdr.flowid" field. This check has now been replaced by checking if "M_HASHTYPE_GET(m)" is different from "M_HASHTYPE_NONE". In the future more hashtypes will be added, for example hashtypes for hardware dedicated flows. "M_HASHTYPE_OPAQUE" indicates that the "m_pkthdr.flowid" value is valid and has no particular type. This change removes the need for an "if" statement in TCP transmit code checking for the presence of a valid flowid value. The "if" statement mentioned above is now a direct variable assignment which is then later checked by the respective network drivers like before. Additional notes: - The SCTP code changes will be committed as a separate patch. - Removal of the "M_FLOWID" flag will also be done separately. - The FreeBSD version has been bumped. MFC after: 1 month Sponsored by: Mellanox Technologies	2014-12-01 11:45:24 +00:00
Navdeep Parhar	4d6db4e0f7	cxgbe(4): some optimizations in freelist handling. MFC after: 2 weeks.	2014-08-02 06:55:36 +00:00
Navdeep Parhar	f10405b396	cxgbe(4): Fix an off by one error when looking for the BAR2 doorbell address of an egress queue. MFC after: 2 weeks	2014-08-02 01:48:25 +00:00
Navdeep Parhar	b2daa9a9cd	cxgbe(4): minor optimizations in ingress queue processing. Reorganize struct sge_iq. Make the iq entry size a compile time constant. While here, eliminate RX_FL_ESIZE and use EQ_ESIZE directly. MFC after: 2 weeks	2014-08-02 00:56:34 +00:00
Navdeep Parhar	82eff304b6	cxgbe(4): Keep track of the clusters that have to be freed by the custom free routine (rxb_free) in the driver. Fail MOD_UNLOAD with EBUSY if any such cluster has been handed up to the kernel but hasn't been freed yet. This prevents a panic later when the cluster finally needs to be freed but rxb_free is gone from the kernel. MFC after: 1 week	2014-07-23 22:29:22 +00:00
Navdeep Parhar	c086e3d1b7	Add missing newline to an error message. MFC after: 3 days	2014-07-22 19:48:21 +00:00
Navdeep Parhar	c3fb772502	Simplify r267600, there's no need to distinguish between allocated and inlined mbufs. MFC after: 1 week	2014-07-22 02:02:39 +00:00
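
Note on e06ab612d2 (m_advance loop): the fix described in that commit can be illustrated with a small userland sketch. This is not the driver code. `struct seg` and `seg_advance()` are hypothetical stand-ins for an mbuf chain and m_advance(), written only to show why the loop must keep going when len drops to zero at the end of a buffer.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for an mbuf: a chained buffer with a length. */
struct seg {
	struct seg *next;
	unsigned char *data;
	int len;
};

/*
 * Advance 'len' bytes from (*ps, *poffset) and return a pointer to the
 * byte reached.  The loop does not stop merely because len hit zero:
 * when the advance ends exactly at the end of one buffer, the wanted
 * byte is at offset 0 of the next buffer, so one more pass is needed
 * to set 'p'.  A "while (len)" loop would return NULL in that case.
 */
static void *
seg_advance(struct seg **ps, int *poffset, int len)
{
	struct seg *s = *ps;
	int offset = *poffset;
	unsigned char *p = NULL;

	assert(len > 0);

	for (;;) {
		if (offset + len < s->len) {
			/* Target lies inside the current buffer. */
			offset += len;
			p = s->data + offset;
			break;
		}
		/* Consume the rest of this buffer and move on. */
		len -= s->len - offset;
		s = s->next;
		offset = 0;
		if (s == NULL)
			break;	/* ran off the end of the chain */
	}

	*ps = s;
	*poffset = offset;
	return (p);
}
```

With a 14-byte first buffer (the Ethernet header) and the payload in a second buffer, advancing 14 bytes from the start returns the first byte of the second buffer and leaves the offset at 0, which is the behavior the commit message describes.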


129 Commits