freebsd-skq

Author	SHA1	Message	Date
Navdeep Parhar	319f290030	cxgbe(4): adapter_full_init is always a synchronized operation. MFC after: 1 week	2015-02-08 08:52:18 +00:00
Navdeep Parhar	d86a5ff917	cxgbe(4): a change to the synchronization rules within the driver. This is purely cosmetic because the new rules are already followed. MFC after: 1 week	2015-02-08 08:42:45 +00:00
Navdeep Parhar	cb6c101a86	cxgbe(4): fix a test made while enabling TOE. MFC after: 1 week	2015-02-07 01:50:32 +00:00
Navdeep Parhar	85dc477798	cxgbe(4): Add a minimal if_cxl module that pulls in the real driver as a dependency. This ensures "ifconfig cxl<n> ..." does the right thing even when it's run with no driver loaded. if_cxl.ko is the tiniest module in /boot/kernel. MFC after: 2 weeks	2015-02-06 01:10:04 +00:00
Navdeep Parhar	a2355dd909	cxgbe(4): reserve id for iSCSI upper layer driver.	2015-02-05 08:52:20 +00:00
John Baldwin	86f05ea6cf	Lock the socket buffer before jumping to the 'out' label if sblock() fails in t4_soreceive_ddp().	2015-01-26 16:32:41 +00:00
John Baldwin	de5a10ecbc	- Update a disabled KASSERT() to use sbused() instead of accessing the no-longer existent sb_cc sockbuf member. - Use sbavail() instead of sbused() in t4_soreceive_ddp() to match the usage in soreceive_stream() on which it is based. Discussed with: glebius (2)	2015-01-26 16:29:14 +00:00
John Baldwin	4f621933a5	Fix a couple of panics when detaching from a cxgbe/cxl interface that was never brought up: - Allow NULL to be passed to sglist_free(). - Don't try to stop an interface that was never fully initialized. Reviewed by: np	2015-01-26 16:26:28 +00:00
Hans Petter Selasky	d39d7c8636	Add missing linuxapi module dependencies and always use the FreeBSD "MODULE_VERSION" macro definition. Remove the redefinition of the "MODULE_VERSION" macro from the Linux kernel compatibility API. MFC after: 1 month Reported by: np@ Sponsored by: Mellanox Technologies	2015-01-19 21:53:00 +00:00
Navdeep Parhar	88d7f6bddf	Allow cxgbe(4) to be built on i386. Driver attach will succeed only on a subset of i386 systems.	2015-01-16 01:32:40 +00:00
Navdeep Parhar	e503548810	cxgbe/iw_cxgbe: fix whitespace nit in r277102. Reported by: stefanf@	2015-01-13 16:18:31 +00:00
Navdeep Parhar	b3e112f962	cxgbe/iw_cxgbe: allow any size during the initial MPA exchange. MFC after: 1 month	2015-01-13 01:40:12 +00:00
Navdeep Parhar	db8bcd1b21	cxgbe/tom: allocate page pod addresses instead of ppod#. MFC after: 2 weeks	2015-01-07 06:20:33 +00:00
Navdeep Parhar	f8c479085f	cxgbe/tom: use vmem(9) as the DDP page pod allocator. MFC after: 1 month	2015-01-06 01:30:32 +00:00
Navdeep Parhar	008015d2f4	cxgbe(4): fix the description of a strange bunch of counters. MFC after: 1 week	2015-01-05 23:43:24 +00:00
Navdeep Parhar	79b93bf6a3	cxgbe/tom: do not engage the TOE's payload chopper for payload < 2 MSS or for 10Gbps ports. MFC after: 2 weeks	2015-01-03 00:09:21 +00:00
Navdeep Parhar	402873f32a	cxgbe/tom: fix the MSS calculation for IPv6 connections handled by the TOE. MFC after: 1 week	2015-01-02 21:13:24 +00:00
Navdeep Parhar	dd1be4d418	cxgbe/tom: log some more details in send_flowc_wr. MFC after: 1 week	2015-01-02 20:52:51 +00:00
Navdeep Parhar	255155fc91	cxgbe(4): remove buf_ring specific restriction on the txq size. MFC after: 2 months	2015-01-01 09:33:46 +00:00
Navdeep Parhar	7951040f8a	cxgbe(4): major tx rework. a) Front load as much work as possible in if_transmit, before any driver lock or software queue has to get involved. b) Replace buf_ring with a brand new mp_ring (multiproducer ring). This is specifically for the tx multiqueue model where one of the if_transmit producer threads becomes the consumer and other producers carry on as usual. mp_ring is implemented as standalone code and it should be possible to use it in any driver with tx multiqueue. It also has: - the ability to enqueue/dequeue multiple items. This might become significant if packet batching is ever implemented. - an abdication mechanism to allow a thread to give up writing tx descriptors and have another if_transmit thread take over. A thread that's writing tx descriptors can end up doing so for an unbounded time period if a) there are other if_transmit threads continuously feeding the software queue, and b) the chip keeps up with whatever the thread is throwing at it. - accurate statistics about interesting events even when the stats come at the expense of additional branches/conditional code. The NIC txq lock is uncontested on the fast path at this point. I've left it there for synchronization with the control events (interface up/down, modload/unload). c) Add support for "type 1" coalescing work request in the normal NIC tx path. This work request is optimized for frames with a single item in the DMA gather list. These are very common when forwarding packets. Note that netmap tx in cxgbe already uses these "type 1" work requests. d) Do not request automatic cidx updates every 32 descriptors. Instead, request updates via bits in individual work requests (still every 32 descriptors approximately). Also, request an automatic final update when the queue idles after activity. This means NIC tx reclaim is still performed lazily but it will catch up quickly as soon as the queue idles. 
This seems to be the best middle ground and I'll probably do something similar for netmap tx as well. e) Implement a faster tx path for WRQs (used by TOE tx and control queues, _not_ by the normal NIC tx). Allow work requests to be written directly to the hardware descriptor ring if room is available. I will convert t4_tom and iw_cxgbe modules to this faster style gradually. MFC after: 2 months	2014-12-31 23:19:16 +00:00
John Baldwin	5ad25ceb41	Check for SS_NBIO in so->so_state instead of sb->sb_flags in soreceive_stream(). Differential Revision: https://reviews.freebsd.org/D1299 Reviewed by: bz, gnn MFC after: 1 week	2014-12-15 17:52:08 +00:00
Navdeep Parhar	a7570ee305	Move KTR_CXGBE from t4_tom.h to adapter.h so that the base if_cxgbe code can use it too. MFC after: 1 week	2014-12-12 21:54:59 +00:00
Navdeep Parhar	b741402c40	cxgbe(4): allow the driver to use rx buffers that do not end on a pack boundary. MFC after: 2 weeks	2014-12-06 01:47:38 +00:00
Navdeep Parhar	e3207e1973	cxgbe(4): Allow for different pad and pack boundaries for different adapters. Set the pack boundary for T5 cards to be the same as the PCIe max payload size. The chip likes it this way. In this revision the driver allocate rx buffers that align on both boundaries. This is not a strict requirement and a followup commit will switch the driver to a more relaxed allocation strategy. MFC after: 2 weeks	2014-12-06 00:13:56 +00:00
Hans Petter Selasky	c25290420e	Start process of removing the use of the deprecated "M_FLOWID" flag from the FreeBSD network code. The flag is still kept around in the "sys/mbuf.h" header file, but does no longer have any users. Instead the "m_pkthdr.rsstype" field in the mbuf structure is now used to decide the meaning of the "m_pkthdr.flowid" field. To modify the "m_pkthdr.rsstype" field please use the existing "M_HASHTYPE_XXX" macros as defined in the "sys/mbuf.h" header file. This patch introduces new behaviour in the transmit direction. Previously network drivers checked if "M_FLOWID" was set in "m_flags" before using the "m_pkthdr.flowid" field. This check has now been replaced by checking if "M_HASHTYPE_GET(m)" is different from "M_HASHTYPE_NONE". In the future more hashtypes will be added, for example hashtypes for hardware dedicated flows. "M_HASHTYPE_OPAQUE" indicates that the "m_pkthdr.flowid" value is valid and has no particular type. This change removes the need for an "if" statement in TCP transmit code checking for the presence of a valid flowid value. The "if" statement mentioned above is now a direct variable assignment which is then later checked by the respective network drivers like before. Additional notes: - The SCTP code changes will be committed as a separate patch. - Removal of the "M_FLOWID" flag will also be done separately. - The FreeBSD version has been bumped. MFC after: 1 month Sponsored by: Mellanox Technologies	2014-12-01 11:45:24 +00:00
Gleb Smirnoff	651e4e6a30	Merge from projects/sendfile: extend protocols API to support sending not ready data: o Add new flag to pru_send() flags - PRUS_NOTREADY. o Add new protocol method pru_ready(). Sponsored by: Nginx, Inc. Sponsored by: Netflix	2014-11-30 13:24:21 +00:00
Gleb Smirnoff	0f9d0a73a4	Merge from projects/sendfile: o Introduce a notion of "not ready" mbufs in socket buffers. These mbufs are now being populated by some I/O in background and are referenced outside. This forces following implications: - An mbuf which is "not ready" can't be taken out of the buffer. - An mbuf that is behind a "not ready" in the queue neither. - If socket buffer is flushed, then "not ready" mbufs shouldn't be freed. o In struct sockbuf the sb_cc field is split into sb_ccc and sb_acc. The sb_ccc stands for "claimed character count", or "committed character count". And the sb_acc is "available character count". Consumers of socket buffer API shouldn't already access them directly, but use sbused() and sbavail() respectively. o Not ready mbufs are marked with M_NOTREADY, and ready but blocked ones with M_BLOCKED. o New field sb_fnrdy points to the first not ready mbuf, to avoid linear search. o New function sbready() is provided to activate certain amount of mbufs in a socket buffer. A special note on SCTP: SCTP has its own sockbufs. Unfortunately, FreeBSD stack doesn't yet allow protocol specific sockbufs. Thus, SCTP does some hacks to make itself compatible with FreeBSD: it manages sockbufs on its own, but keeps sb_cc updated to inform the stack of amount of data in them. The new notion of "not ready" data isn't supported by SCTP. Instead, only a mechanical substitute is done: s/sb_cc/sb_ccc/. A proper solution would be to take away struct sockbuf from struct socket and allow protocols to implement their own socket buffers, like SCTP already does. This was discussed with rrs@. Sponsored by: Netflix Sponsored by: Nginx, Inc.	2014-11-30 12:52:33 +00:00
Navdeep Parhar	729fee332b	cxgbe(4): figure out the max payload size and save it for later. MFC after: 1 week	2014-11-19 20:16:56 +00:00
Navdeep Parhar	aa8d1792d1	iw_cxgbe: don't forget to close the socket in c4iw_connect if soconnect fails. Submitted by: hariprasad at chelsio dot com	2014-11-13 03:59:36 +00:00
Navdeep Parhar	05c4567dd9	Fix some bad interaction between cxgbe(4) and lacp lagg(4) that could leave a port permanently disabled when a copper cable is unplugged and then plugged right back in. lacp_linkstate goes looking for the current ifmedia on a link state change and it could get stale information from cxgbe(4) on a module unplug followed by replug. The fix is to process module events before link-state events within the driver, and to always rebuild the ifmedia list on a module change event (instead of rebuilding it lazily). Thanks to asomers@ for the problem report and detailed analysis to go with it. MFC after: 1 week	2014-11-12 23:29:22 +00:00
Gleb Smirnoff	cfa6009e36	In preparation of merging projects/sendfile, transform bare access to sb_cc member of struct sockbuf to a couple of inline functions: sbavail() and sbused() Right now they are equal, but once notion of "not ready socket buffer data", will be checked in, they are going to be different. Sponsored by: Netflix Sponsored by: Nginx, Inc.	2014-11-12 09:57:15 +00:00
John Baldwin	7bfc98355a	Add device ID for the T502-BT (dual-port 1G) adapter. Reviewed by: np MFC after: 1 week	2014-11-11 20:05:50 +00:00
Navdeep Parhar	62fc63abfb	cxgbe(4): adjust PMRX and PMTX parameters. MFC after: 1 week	2014-11-10 19:45:28 +00:00
Navdeep Parhar	527e4e62ac	Always request a completion for every work request for iWARP. The initial MPA exchange must be tracked this way so that t4_tom's state for the tid is all clean at the time the tid transitions to RDMA mode. Once it does, t4_tom is out of the way and iw_cxgbe uses the qp endpoints directly. Sponsored by: Chelsio Communications	2014-10-28 18:10:57 +00:00
Navdeep Parhar	3030c8bce9	iwcm_event status needs to be populated for close_complete_upcall Submitted by: Hariprasad at Chelsio dot com Sponsored by: Chelsio Communications	2014-10-27 23:11:48 +00:00
Navdeep Parhar	d25d06afc0	Some cxgbe/iw_cxgbe fixes: - Free rt in c4iw_connect only if it is allocated. - Call soclose instead of so_shutdown if there is an abort from the peer. - Close socket and return failure if TOE is not enabled. Submitted by: Hariprasad at Chelsio dot com Sponsored by: Chelsio Communications	2014-10-27 22:22:46 +00:00
Navdeep Parhar	1284501329	cxgbe(4): bump up PF4's share of some global resources. This increases the size of the per-port RSS slice and also allows the driver to use a larger number of tx and rx queues. MFC after: 2 weeks	2014-10-25 00:14:44 +00:00
Navdeep Parhar	53f49a7bd8	cxgbe/iw_cxgbe: wake up waiters after flushing the qp. Obtained from: Chelsio	2014-10-22 18:55:44 +00:00
Hans Petter Selasky	f0188618f2	Fix multiple incorrect SYSCTL arguments in the kernel: - Wrong integer type was specified. - Wrong or missing "access" specifier. The "access" specifier sometimes included the SYSCTL type, which it should not, except for procedural SYSCTL nodes. - Logical OR where binary OR was expected. - Properly assert the "access" argument passed to all SYSCTL macros, using the CTASSERT macro. This applies to both static- and dynamically created SYSCTLs. - Properly assert the data type for both static and dynamic SYSCTLs. In the case of static SYSCTLs we only assert that the data pointed to by the SYSCTL data pointer has the correct size, hence there is no easy way to assert types in the C language outside a C-function. - Rewrote some code which doesn't pass a constant "access" specifier when creating dynamic SYSCTL nodes, which is now a requirement. - Updated "EXAMPLES" section in SYSCTL manual page. MFC after: 3 days Sponsored by: Mellanox Technologies	2014-10-21 07:31:21 +00:00
Hans Petter Selasky	2c6eb461a7	Update the OFED Linux compatibility layer and Mellanox hardware driver(s): - Properly name an inclusion guard - Fix compile warnings regarding unsigned enums - Add two new sysctl nodes - Remove all empty linux header files - Make an error printout more verbose - Use "mod_delayed_work()" instead of cancelling and starting a timeout. - Implement more Linux scatterlist functions. MFC after: 3 days Sponsored by: Mellanox Technologies	2014-10-15 13:40:29 +00:00
Navdeep Parhar	19abdd0654	cxgbe/tom: don't leak resources tied to an active open request that cannot be sent to the chip because a prerequisite L2 resolution failed. Submitted by: Hariprasad at chelsio dot com (original version) MFC after: 2 weeks.	2014-10-07 21:26:22 +00:00
Navdeep Parhar	2d8910854b	cxgbe(4): implement if_get_counter.	2014-09-27 05:50:31 +00:00
Navdeep Parhar	acc45299f5	cxgbe(4): explicitly set various if_hw_tso* values. MFC after: 3 days	2014-09-26 22:21:02 +00:00
Navdeep Parhar	db25c97a1a	Make sure the adapter's management queue and the event queue are available before any upper layer driver (TOE, iWARP, or iSCSI) registers with the base cxgbe(4) driver. Submitted by: Hariprasad at chelsio dot com Reviewed by: np@	2014-09-26 18:53:00 +00:00
Navdeep Parhar	3a260c2a19	Update comment (missed this bit in r272079).	2014-09-24 20:08:43 +00:00
Navdeep Parhar	13251b21e1	cxgbe/tom: Catch up with r271119, syncache_add doesn't need tcbinfo lock.	2014-09-24 20:04:11 +00:00
Navdeep Parhar	1dee8327d4	cxgbe(4): Verify that the addresses in if_multiaddrs really are multicast addresses. (The chip doesn't really care, it's just that it needs to be told explicitly if unicast DMACs are checked for "hits" in the hash that is used after the TCAM entries are all used up).	2014-09-23 22:57:11 +00:00
Navdeep Parhar	8374717dc0	cxgbe(4): add support for the SIOCGI2C ioctl.	2014-09-12 21:56:57 +00:00
Navdeep Parhar	3eb2c201a6	cxgbe(4): knobs to enable/disable PAUSE frame based flow control. MFC after: 1 week	2014-09-12 05:25:56 +00:00
Robert Watson	7524f39b9f	Add a new M_START() mbuf macro that returns a pointer to the start of an mbuf's storage (internal or external). Add a new M_SIZE() mbuf macro that returns the size of an mbuf's storage (internal or external). These contrast with m_data and m_len, which are with respect to data in the buffer, rather than the buffer itself. Rewrite M_LEADINGSPACE() and M_TRAILINGSPACE() in terms of M_START() and M_SIZE(). This is done as we currently have many instances of using mbuf flags to generate pointers or lengths for internal storage in header and regular mbufs, as well as to external storage. Rather than replicate this logic throughout the network stack, centralising the implementation will make it easier for us to refine mbuf storage. This should also help reduce bugs by limiting the amount of mbuf-type-specific pointer arithmetic. Followup changes will propagate use of the macros throughout the stack. M_SIZE() conflicts with one macro in the Chelsio driver; rename that macro in a slightly unsatisfying way to eliminate the collision. MFC after: 3 days Obtained from: jeff (with enhancements) Sponsored by: EMC / Isilon Storage Division Reviewed by: bz, glebius, np Differential Revision: https://reviews.freebsd.org/D753	2014-09-11 07:16:15 +00:00
Navdeep Parhar	4309e7b020	Whitespace nit. MFC after: 1 week	2014-09-09 18:36:00 +00:00
Hans Petter Selasky	c7818b48b6	- Update the OFED Linux Emulation layer as a preparation for a hardware driver update from Mellanox Technologies. - Remove empty files from the OFED Linux Emulation layer. - Fix compile warnings related to printf() and the "%lld" and "%llx" format specifiers. - Add some missing 2-clause BSD copyrights. - Add "Mellanox Technologies, Ltd." to list of copyright holders. - Add some new compatibility files. - Fix order of uninit in the mlx4ib module to avoid crash at unload using the new module_exit_order() function. MFC after: 1 week Sponsored by: Mellanox Technologies	2014-08-27 13:21:53 +00:00
Luigi Rizzo	4bf50f18eb	Update to the current version of netmap. Mostly bugfixes or features developed in the past 6 months, so this is a 10.1 candidate. Basically no user API changes (some bugfixes in sys/net/netmap_user.h). In detail: 1. netmap support for virtio-net, including in netmap mode. Under bhyve and with a netmap backend [2] we reach over 1Mpps with standard APIs (e.g. libpcap), and 5-8 Mpps in netmap mode. 2. (kernel) add support for multiple memory allocators, so we can better partition physical and virtual interfaces giving access to separate users. The most visible effect is one additional argument to the various kernel functions to compute buffer addresses. All netmap-supported drivers are affected, but changes are mechanical and trivial 3. (kernel) simplify the prototype for txsync() and rxsync() driver methods. All netmap drivers affected, changes mostly mechanical. 4. add support for netmap-monitor ports. Think of it as a mirroring port on a physical switch: a netmap monitor port replicates traffic present on the main port. Restrictions apply. Drive carefully. 5. if_lem.c: support for various paravirtualization features, experimental and disabled by default. Most of these are described in our ANCS'13 paper [1]. Paravirtualized support in netmap mode is new, and beats the numbers in the paper by a large factor (under qemu-kvm, we measured guest-host throughput up to 10-12 Mpps). A lot of refactoring and additional documentation in the files in sys/dev/netmap, but apart from #2 and #3 above, almost nothing of this stuff is visible to other kernel parts. Example programs in tools/tools/netmap have been updated with bugfixes and to support more of the existing features. This is meant to go into 10.1 so we plan an MFC before the Aug.22 deadline. A lot of this code has been contributed by my colleagues at UNIPI, including Giuseppe Lettieri, Vincenzo Maffione, Stefano Garzarella. MFC after: 3 days.	2014-08-16 15:00:01 +00:00
Navdeep Parhar	6b6e7079b3	cxgbe(4): Do not poke T4-only registers on a T5 (and vice versa). Obtained from: Chelsio Communications MFC after: 1 week	2014-08-08 18:36:53 +00:00
Navdeep Parhar	bc22dc708f	cxgbe(4): Let caller specify whether it's ok to sleep in t4_sched_config and t4_sched_params. MFC after: 2 weeks	2014-08-06 19:38:03 +00:00
Navdeep Parhar	46a646940f	cxgbe(4): Do not run any sleepable code in the SIOCSIFFLAGS handler when IFF_PROMISC or IFF_ALLMULTI is being flipped. bpf(4) holds its global mutex around ifpromisc in at least the bpf_dtor path. MFC after: 3 days	2014-08-04 22:32:16 +00:00
Navdeep Parhar	b2c5bf0de2	cxgbe(4): Remove an unused version of t4_enable_vi. MFC after: 2 weeks	2014-08-02 18:37:22 +00:00
Navdeep Parhar	4d6db4e0f7	cxgbe(4): some optimizations in freelist handling. MFC after: 2 weeks.	2014-08-02 06:55:36 +00:00
Navdeep Parhar	f10405b396	cxgbe(4): Fix an off by one error when looking for the BAR2 doorbell address of an egress queue. MFC after: 2 weeks	2014-08-02 01:48:25 +00:00
Navdeep Parhar	b2daa9a9cd	cxgbe(4): minor optimizations in ingress queue processing. Reorganize struct sge_iq. Make the iq entry size a compile time constant. While here, eliminate RX_FL_ESIZE and use EQ_ESIZE directly. MFC after: 2 weeks	2014-08-02 00:56:34 +00:00
Navdeep Parhar	0fe982772d	Some hooks in cxgbe(4) for the offloaded iSCSI driver. (I'm committing this on behalf of my colleagues in the Storage team at Chelsio). Submitted by: Sreenivasa Honnur <shonnur at chelsio dot com> Sponsored by: Chelsio Communications.	2014-07-24 18:39:08 +00:00
Navdeep Parhar	82eff304b6	cxgbe(4): Keep track of the clusters that have to be freed by the custom free routine (rxb_free) in the driver. Fail MOD_UNLOAD with EBUSY if any such cluster has been handed up to the kernel but hasn't been freed yet. This prevents a panic later when the cluster finally needs to be freed but rxb_free is gone from the kernel. MFC after: 1 week	2014-07-23 22:29:22 +00:00
Navdeep Parhar	c086e3d1b7	Add missing newline to an error message. MFC after: 3 days	2014-07-22 19:48:21 +00:00
Navdeep Parhar	c3fb772502	Simplify r267600, there's no need to distinguish between allocated and inlined mbufs. MFC after: 1 week	2014-07-22 02:02:39 +00:00
Navdeep Parhar	bae4e5af99	cxgbe(4): Display CF facility correctly in the device log. MFC after: 3 days	2014-07-15 18:24:41 +00:00
Navdeep Parhar	44eb893659	Allow multi-byte reads in the private CHELSIO_T4_GET_I2C ioctl. The firmware allows up to 48B to be read this way but the driver limits itself to 8B at a time to remain compatible with old cxgbetool binaries. MFC after: 1 week	2014-07-15 01:03:29 +00:00
Navdeep Parhar	30f337891d	cxgbe(4): Add an iSCSI softc to the adapter structure.	2014-07-11 21:02:54 +00:00
Gleb Smirnoff	15c28f87b8	All mbuf external free functions never fail, so let them be void. Sponsored by: Nginx, Inc.	2014-07-11 13:58:48 +00:00
Hans Petter Selasky	af3b2549c4	Pull in r267961 and r267973 again. Fix for issues reported will follow.	2014-06-28 03:56:17 +00:00
Glen Barber	37a107a407	Revert r267961, r267973: These changes prevent sysctl(8) from returning proper output, such as: 1) no output from sysctl(8) 2) erroneously returning ENOMEM with tools like truss(1) or uname(1) truss: can not get etype: Cannot allocate memory	2014-06-27 22:05:21 +00:00
Hans Petter Selasky	3da1cf1e88	Extend the meaning of the CTLFLAG_TUN flag to automatically check if there is an environment variable which shall initialize the SYSCTL during early boot. This works for all SYSCTL types both statically and dynamically created ones, except for the SYSCTL NODE type and SYSCTLs which belong to VNETs. A new flag, CTLFLAG_NOFETCH, has been added to be used in the case a tunable sysctl has a custom initialisation function allowing the sysctl to still be marked as a tunable. The kernel SYSCTL API is mostly the same, with a few exceptions for some special operations like iterating children of a static/extern SYSCTL node. This operation should probably be made into a factored out common macro, hence some device drivers use this. The reason for changing the SYSCTL API was the need for a SYSCTL parent OID pointer and not only the SYSCTL parent OID list pointer in order to quickly generate the sysctl path. The motivation behind this patch is to avoid parameter loading kludges inside the OFED driver subsystem. Instead of adding special code to the OFED driver subsystem to post-load tunables into dynamically created sysctls, we generalize this in the kernel. Other changes: - Corrected a possibly incorrect sysctl name from "hw.cbb.intr_mask" to "hw.pcic.intr_mask". - Removed redundant TUNABLE statements throughout the kernel. - Some minor code rewrites in connection to removing not needed TUNABLE statements. - Added a missing SYSCTL_DECL(). - Wrapped two very long lines. - Avoid malloc()/free() inside sysctl string handling, in case it is called to initialize a sysctl from a tunable, hence malloc()/free() is not ready when sysctls from the sysctl dataset are registered. - Bumped FreeBSD version to indicate SYSCTL API change. MFC after: 2 weeks Sponsored by: Mellanox Technologies	2014-06-27 16:33:43 +00:00
Navdeep Parhar	327235b3d6	cxgbe(4): Update the bundled T4 and T5 firmwares to versions 1.11.27.0. Obtained from: Chelsio MFC after: 3 days	2014-06-22 23:40:20 +00:00
Navdeep Parhar	0835ddc766	Consider the total number of descriptors available (and not just those that are ready to be reclaimed) when deciding whether to resume tx after a stall. MFC after: 3 days	2014-06-20 20:28:46 +00:00
Navdeep Parhar	ccc69b2fa9	cxgbe(4): Fix bug in the fast rx buffer recycle path. In some cases rx buffers were getting recycled when they should have been left alone. MFC after: 3 days	2014-06-18 00:16:35 +00:00
Attilio Rao	3ae10f7477	- Modify vm_page_unwire() and vm_page_enqueue() to directly accept the queue where to enqueue pages that are going to be unwired. - Add stronger checks to the enqueue/dequeue for the pagequeues when adding and removing pages to them. Of course, for unmanaged pages the queue parameter of vm_page_unwire() will be ignored, just as the active parameter today. This makes adding new pagequeues quicker. This change effectively modifies the KPI. __FreeBSD_version will be, however, bumped just when the full cache of free pages will be evicted. Sponsored by: EMC / Isilon storage division Reviewed by: alc Tested by: pho	2014-06-16 18:15:27 +00:00
Navdeep Parhar	861e42b209	cxgbe(4): Properly account for the freelist buffers used when returning early from service_iq due to a budget restriction. This fixes a potential rx hang when using INTx. MFC after: 3 days	2014-06-05 00:38:32 +00:00
Navdeep Parhar	368541ba1e	cxgbe(4): Fix a NULL dereference when the very first call to get_scatter_segment() in get_fl_payload() fails. While here, fix the code to adjust fl_bufs_used when a failure occurs for any other scatter segment. MFC after: 3 days	2014-05-30 22:59:45 +00:00
Navdeep Parhar	298d969c53	cxgbe(4): netmap support for Terminator 5 (T5) based 10G/40G cards. Netmap gets its own hardware-assisted virtual interface and won't take over or disrupt the "normal" interface in any way. You can use both simultaneously. For kernels with DEV_NETMAP, cxgbe(4) carves out an ncxl<N> interface (note the 'n' prefix) in the hardware to accompany each cxl<N> interface. These two ifnet's per port share the same wire but really are separate interfaces in the hardware and software. Each gets its own L2 MAC addresses (unicast and multicast), MTU, checksum caps, etc. You should run netmap on the 'n' interfaces only, that's what they are for. With this, pkt-gen is able to transmit > 45Mpps out of a single 40G port of a T580 card. 2 port tx is at ~56Mpps total (28M + 28M) as of now. Single port receive is at 33Mpps but this is very much a work in progress. I expect it to be closer to 40Mpps once done. In any case the current effort can already saturate multiple 10G ports of a T5 card at the smallest legal packet size. T4 gear is totally untested. trantor:~# ./pkt-gen -i ncxl0 -f tx -D 00:07:43:ab:cd:ef 881.952141 main [1621] interface is ncxl0 881.952250 extract_ip_range [275] range is 10.0.0.1:0 to 10.0.0.1:0 881.952253 extract_ip_range [275] range is 10.1.0.1:0 to 10.1.0.1:0 881.962540 main [1804] mapped 334980KB at 0x801dff000 Sending on netmap:ncxl0: 4 queues, 1 threads and 1 cpus. 10.0.0.1 -> 10.1.0.1 (00:00:00:00:00:00 -> 00:07:43:ab:cd:ef) 881.962562 main [1882] Sending 512 packets every 0.000000000 s 881.962563 main [1884] Wait 2 secs for phy reset 884.088516 main [1886] Ready... 
884.088535 nm_open [457] overriding ifname ncxl0 ringid 0x0 flags 0x1 884.088607 sender_body [996] start 884.093246 sender_body [1064] drop copy 885.090435 main_thread [1418] 45206353 pps (45289533 pkts in 1001840 usec) 886.091600 main_thread [1418] 45322792 pps (45375593 pkts in 1001165 usec) 887.092435 main_thread [1418] 45313992 pps (45351784 pkts in 1000834 usec) 888.094434 main_thread [1418] 45315765 pps (45406397 pkts in 1002000 usec) 889.095434 main_thread [1418] 45333218 pps (45378551 pkts in 1001000 usec) 890.097434 main_thread [1418] 45315247 pps (45405877 pkts in 1002000 usec) 891.099434 main_thread [1418] 45326515 pps (45417168 pkts in 1002000 usec) 892.101434 main_thread [1418] 45333039 pps (45423705 pkts in 1002000 usec) 893.103434 main_thread [1418] 45324105 pps (45414708 pkts in 1001999 usec) 894.105434 main_thread [1418] 45318042 pps (45408723 pkts in 1002001 usec) 895.106434 main_thread [1418] 45332430 pps (45377762 pkts in 1001000 usec) 896.107434 main_thread [1418] 45338072 pps (45383410 pkts in 1001000 usec) ... Relnotes: Yes Sponsored by: Chelsio Communications.	2014-05-27 18:18:41 +00:00
Bjoern A. Zeeb	255cd9fd58	Move the tcp_fields_to_host() and tcp_fields_to_net() (inline) functions to the tcp_var.h header file in order to avoid further duplication with upcoming commits. Reviewed by: np MFC after: 2 weeks	2014-05-23 20:15:01 +00:00
Navdeep Parhar	7a5b897dfe	cxgbe(4): Remove stray if_up from the code that creates the tracing ifnet.	2014-05-23 01:45:44 +00:00
Maksim Yevmenkin	080a4b9b1c	use correct (integer) type for the temperature sysctl Reviewed by: np, scottl Obtained from: Netflix MFC after: 3 days	2014-04-17 19:29:15 +00:00
Navdeep Parhar	8b3f42d52d	cxgbe(4): Recognize the "spider" configuration where a T5 card's 40G QSFP port is presented as 4 distinct 10G SFP+ ports to the driver. MFC after: 2 weeks	2014-03-21 00:56:56 +00:00
Navdeep Parhar	65bd4d1cb4	cxgbe(4): Use ifi_oqdrops in if_data to count drops in the tx path.	2014-03-20 02:28:05 +00:00
Navdeep Parhar	475992bdfb	cxgbe(4): if_iqdrops statistic should include tunnel congestion drops. MFC after: 1 week	2014-03-20 01:58:04 +00:00
Navdeep Parhar	38035ed6dc	cxgbe(4): significant rx rework. - More flexible cluster size selection, including the ability to fall back to a safe cluster size (PAGE_SIZE from zone_jumbop by default) in case an allocation of a larger size fails. - A single get_fl_payload() function that assembles the payload into an mbuf chain for any kind of freelist. This replaces two variants: one for freelists with buffer packing enabled and another for those without. - Buffer packing with any sized cluster. It was limited to 4K clusters only before this change. - Enable buffer packing for TOE rx queues as well. - Statistics and tunables to go with all these changes. The driver's man page will be updated separately. MFC after: 5 weeks	2014-03-18 20:14:13 +00:00
Dimitry Andric	e9e21b6e41	In cxgbe, conditionalize the t4_pgprot_wc() function, since it is only used when DOT5 is defined. Reviewed by: np MFC after: 3 days	2014-02-14 23:38:42 +00:00
Scott Long	f7a74e061b	Add a new sysctl, dev.cxgbe.N.rsrv_noflow, and a companion tunable, hw.cxgbe.rsrv_noflow. When set, queue 0 of the port is reserved for TX packets without a flowid. The hash value of packets with a flowid is bumped up by 1. The intent is to provide a private queue for link-level packets like LACP that is unlikely to overflow or suffer deep queue latency. Reviewed by: np Obtained from: Netflix MFC after: 3 days	2014-02-06 18:40:38 +00:00
Navdeep Parhar	e46dcc5670	cxgbe(4): Use the rx channel map (instead of the tx channel map) as the congestion channel map. MFC after: 1 week	2014-02-06 03:30:12 +00:00
Navdeep Parhar	7293a15f54	cxgbe(4): The T5 allows for a different freelist starvation threshold for queues with buffer packing. Use the correct value to calculate a freelist's low water mark. MFC after: 1 week	2014-02-06 03:21:43 +00:00
Navdeep Parhar	454813ff9c	cxgbe(4): Use the port's tx channel to identify it to t4_clr_port_stats. MFC after: 3 days	2014-02-06 02:34:29 +00:00
Adrian Chadd	3af0f449ae	Add an option to enable or disable the small RX packet copying that is done to improve performance of small frames. When doing RX packing, the RX copying isn't necessarily required. Reviewed by: np	2014-01-02 23:23:33 +00:00
Navdeep Parhar	88bb82e511	Do not create a hardware IPv6 server if the listen address is not in6addr_any and is not in the CLIP table either. This fixes a reported TOE+IPv6 NULL-dereference panic in do_pass_open_rpl(). While here, stop creating hardware servers for any loopback address. It's just a waste of server tids. MFC after: 1 week	2013-12-17 21:41:23 +00:00
Navdeep Parhar	93e9cae3fa	Read card capabilities after firmware initialization, instead of setting them up as part of firmware initialization (which the driver gets to do only if it's the master driver). Read the range of tids available for the ETHOFLD functionality if it's enabled. New is_ftid() and is_etid() functions to test whether a tid falls within the range of filter tids or ETHOFLD tids respectively. MFC after: 2 weeks	2013-12-14 03:08:03 +00:00
Adrian Chadd	ac68deae6d	Print out the full PCIe link negotiation during dmesg. I found this useful when checking whether a NIC is in a PCIE 3.0 8x slot or not. Reviewed by: np Sponsored by: Netflix, inc.	2013-12-10 00:07:04 +00:00
Navdeep Parhar	d419aaa126	Unstaticize t4_list and t4_uld_list. This works around a clang annoyance[1] and allows kgdb to find these symbols. [1] http://lists.freebsd.org/pipermail/freebsd-hackers/2012-November/041166.html MFC after: 3 days	2013-12-09 23:33:57 +00:00
Navdeep Parhar	273ef9912d	cxgbe(4): save a copy of the RSS map for each port for the driver's use.	2013-12-08 17:47:37 +00:00
Navdeep Parhar	05337b80ee	cxgbe(4): T4_SET_SCHED_CLASS and T4_SET_SCHED_QUEUE ioctls to program scheduling classes in the chip and to bind tx queue(s) to a scheduling class respectively. These can be used for various kinds of tx traffic throttling (to force selected tx queues to drain at a fixed Kbps rate, or a % of the port's total bandwidth, or at a fixed pps rate, etc.). Obtained from: Chelsio	2013-12-03 18:34:52 +00:00
Navdeep Parhar	2471928bf8	Disable an assertion that relies on some code[1] that isn't in HEAD yet. [1] http://lists.freebsd.org/pipermail/freebsd-net/2013-August/036573.html	2013-11-27 19:54:19 +00:00
Navdeep Parhar	245a0bd40a	cxgbe(4): update the internal list of device features. MFC after: 3 days	2013-11-21 20:07:58 +00:00
Navdeep Parhar	1192eeb8a3	cxgbe(4): Tidy up the display for payload memory statistics (pm_stats). # sysctl -n dev.t4nex.0.misc.pm_stats # sysctl -n dev.t5nex.0.misc.pm_stats MFC after: 1 week	2013-11-07 00:25:49 +00:00


343 Commits