freebsd-nq

Author	SHA1	Message	Date
Hans Petter Selasky	af3b2549c4	Pull in r267961 and r267973 again. Fix for issues reported will follow.	2014-06-28 03:56:17 +00:00
Glen Barber	37a107a407	Revert r267961, r267973: These changes prevent sysctl(8) from returning proper output, such as: 1) no output from sysctl(8) 2) erroneously returning ENOMEM with tools like truss(1) or uname(1) truss: can not get etype: Cannot allocate memory	2014-06-27 22:05:21 +00:00
Hans Petter Selasky	3da1cf1e88	Extend the meaning of the CTLFLAG_TUN flag to automatically check if there is an environment variable which shall initialize the SYSCTL during early boot. This works for all SYSCTL types both statically and dynamically created ones, except for the SYSCTL NODE type and SYSCTLs which belong to VNETs. A new flag, CTLFLAG_NOFETCH, has been added to be used in the case a tunable sysctl has a custom initialisation function allowing the sysctl to still be marked as a tunable. The kernel SYSCTL API is mostly the same, with a few exceptions for some special operations like iterating childrens of a static/extern SYSCTL node. This operation should probably be made into a factored out common macro, hence some device drivers use this. The reason for changing the SYSCTL API was the need for a SYSCTL parent OID pointer and not only the SYSCTL parent OID list pointer in order to quickly generate the sysctl path. The motivation behind this patch is to avoid parameter loading cludges inside the OFED driver subsystem. Instead of adding special code to the OFED driver subsystem to post-load tunables into dynamically created sysctls, we generalize this in the kernel. Other changes: - Corrected a possibly incorrect sysctl name from "hw.cbb.intr_mask" to "hw.pcic.intr_mask". - Removed redundant TUNABLE statements throughout the kernel. - Some minor code rewrites in connection to removing not needed TUNABLE statements. - Added a missing SYSCTL_DECL(). - Wrapped two very long lines. - Avoid malloc()/free() inside sysctl string handling, in case it is called to initialize a sysctl from a tunable, hence malloc()/free() is not ready when sysctls from the sysctl dataset are registered. - Bumped FreeBSD version to indicate SYSCTL API change. MFC after: 2 weeks Sponsored by: Mellanox Technologies	2014-06-27 16:33:43 +00:00
Navdeep Parhar	327235b3d6	cxgbe(4): Update the bundled T4 and T5 firmwares to versions 1.11.27.0. Obtained from: Chelsio MFC after: 3 days	2014-06-22 23:40:20 +00:00
Navdeep Parhar	0835ddc766	Consider the total number of descriptors available (and not just those that are ready to be reclaimed) when deciding whether to resume tx after a stall. MFC after: 3 days	2014-06-20 20:28:46 +00:00
Navdeep Parhar	ccc69b2fa9	cxgbe(4): Fix bug in the fast rx buffer recycle path. In some cases rx buffers were getting recycled when they should have been left alone. MFC after: 3 days	2014-06-18 00:16:35 +00:00
Attilio Rao	3ae10f7477	- Modify vm_page_unwire() and vm_page_enqueue() to directly accept the queue where to enqueue pages that are going to be unwired. - Add stronger checks to the enqueue/dequeue for the pagequeues when adding and removing pages to them. Of course, for unmanaged pages the queue parameter of vm_page_unwire() will be ignored, just as the active parameter today. This makes adding new pagequeues quicker. This change effectively modifies the KPI. __FreeBSD_version will be, however, bumped just when the full cache of free pages will be evicted. Sponsored by: EMC / Isilon storage division Reviewed by: alc Tested by: pho	2014-06-16 18:15:27 +00:00
Navdeep Parhar	861e42b209	cxgbe(4): Properly account for the freelist buffers used when returning early from service_iq due to a budget restriction. This fixes a potential rx hang when using INTx. MFC after: 3 days	2014-06-05 00:38:32 +00:00
Navdeep Parhar	368541ba1e	cxgbe(4): Fix a NULL dereference when the very first call to get_scatter_segment() in get_fl_payload() fails. While here, fix the code to adjust fl_bufs_used when a failure occurs for any other scatter segment. MFC after: 3 days	2014-05-30 22:59:45 +00:00
Navdeep Parhar	298d969c53	cxgbe(4): netmap support for Terminator 5 (T5) based 10G/40G cards. Netmap gets its own hardware-assisted virtual interface and won't take over or disrupt the "normal" interface in any way. You can use both simultaneously. For kernels with DEV_NETMAP, cxgbe(4) carves out an ncxl<N> interface (note the 'n' prefix) in the hardware to accompany each cxl<N> interface. These two ifnet's per port share the same wire but really are separate interfaces in the hardware and software. Each gets its own L2 MAC addresses (unicast and multicast), MTU, checksum caps, etc. You should run netmap on the 'n' interfaces only, that's what they are for. With this, pkt-gen is able to transmit > 45Mpps out of a single 40G port of a T580 card. 2 port tx is at ~56Mpps total (28M + 28M) as of now. Single port receive is at 33Mpps but this is very much a work in progress. I expect it to be closer to 40Mpps once done. In any case the current effort can already saturate multiple 10G ports of a T5 card at the smallest legal packet size. T4 gear is totally untested. trantor:~# ./pkt-gen -i ncxl0 -f tx -D 00:07:43:ab:cd:ef 881.952141 main [1621] interface is ncxl0 881.952250 extract_ip_range [275] range is 10.0.0.1:0 to 10.0.0.1:0 881.952253 extract_ip_range [275] range is 10.1.0.1:0 to 10.1.0.1:0 881.962540 main [1804] mapped 334980KB at 0x801dff000 Sending on netmap:ncxl0: 4 queues, 1 threads and 1 cpus. 10.0.0.1 -> 10.1.0.1 (00:00:00:00:00:00 -> 00:07:43:ab:cd:ef) 881.962562 main [1882] Sending 512 packets every 0.000000000 s 881.962563 main [1884] Wait 2 secs for phy reset 884.088516 main [1886] Ready... 
884.088535 nm_open [457] overriding ifname ncxl0 ringid 0x0 flags 0x1 884.088607 sender_body [996] start 884.093246 sender_body [1064] drop copy 885.090435 main_thread [1418] 45206353 pps (45289533 pkts in 1001840 usec) 886.091600 main_thread [1418] 45322792 pps (45375593 pkts in 1001165 usec) 887.092435 main_thread [1418] 45313992 pps (45351784 pkts in 1000834 usec) 888.094434 main_thread [1418] 45315765 pps (45406397 pkts in 1002000 usec) 889.095434 main_thread [1418] 45333218 pps (45378551 pkts in 1001000 usec) 890.097434 main_thread [1418] 45315247 pps (45405877 pkts in 1002000 usec) 891.099434 main_thread [1418] 45326515 pps (45417168 pkts in 1002000 usec) 892.101434 main_thread [1418] 45333039 pps (45423705 pkts in 1002000 usec) 893.103434 main_thread [1418] 45324105 pps (45414708 pkts in 1001999 usec) 894.105434 main_thread [1418] 45318042 pps (45408723 pkts in 1002001 usec) 895.106434 main_thread [1418] 45332430 pps (45377762 pkts in 1001000 usec) 896.107434 main_thread [1418] 45338072 pps (45383410 pkts in 1001000 usec) ... Relnotes: Yes Sponsored by: Chelsio Communications.	2014-05-27 18:18:41 +00:00
Bjoern A. Zeeb	255cd9fd58	Move the tcp_fields_to_host() and tcp_fields_to_net() (inline) functions to the tcp_var.h header file in order to avoid further duplication with upcoming commits. Reviewed by: np MFC after: 2 weeks	2014-05-23 20:15:01 +00:00
Navdeep Parhar	7a5b897dfe	cxgbe(4): Remove stray if_up from the code that creates the tracing ifnet.	2014-05-23 01:45:44 +00:00
Maksim Yevmenkin	080a4b9b1c	use correct (integer) type for the temperature sysctl Reviewed by: np, scottl Obtained from: Netflix MFC after: 3 days	2014-04-17 19:29:15 +00:00
Navdeep Parhar	8b3f42d52d	cxgbe(4): Recognize the "spider" configuration where a T5 card's 40G QSFP port is presented as 4 distinct 10G SFP+ ports to the driver. MFC after: 2 weeks	2014-03-21 00:56:56 +00:00
Navdeep Parhar	65bd4d1cb4	cxgbe(4): Use ifi_oqdrops in if_data to count drops in the tx path.	2014-03-20 02:28:05 +00:00
Navdeep Parhar	475992bdfb	cxgbe(4): if_iqdrops statistic should include tunnel congestion drops. MFC after: 1 week	2014-03-20 01:58:04 +00:00
Navdeep Parhar	38035ed6dc	cxgbe(4): significant rx rework. - More flexible cluster size selection, including the ability to fall back to a safe cluster size (PAGE_SIZE from zone_jumbop by default) in case an allocation of a larger size fails. - A single get_fl_payload() function that assembles the payload into an mbuf chain for any kind of freelist. This replaces two variants: one for freelists with buffer packing enabled and another for those without. - Buffer packing with any sized cluster. It was limited to 4K clusters only before this change. - Enable buffer packing for TOE rx queues as well. - Statistics and tunables to go with all these changes. The driver's man page will be updated separately. MFC after: 5 weeks	2014-03-18 20:14:13 +00:00
Dimitry Andric	e9e21b6e41	In cxgbe, conditionalize the t4_pgprot_wc() function, since it is only used when DOT5 is defined. Reviewed by: np MFC after: 3 days	2014-02-14 23:38:42 +00:00
Scott Long	f7a74e061b	Add a new sysctl, dev.cxgbe.N.rsrv_noflow, and a companion tunable, hw.cxgbe.rsrv_noflow. When set, queue 0 of the port is reserved for TX packets without a flowid. The hash value of packets with a flowid is bumped up by 1. The intent is to provide a private queue for link-level packets like LACP that is unlikely to overflow or suffer deep queue latency. Reviewed by: np Obtained from: Netflix MFC after: 3 days	2014-02-06 18:40:38 +00:00
Navdeep Parhar	e46dcc5670	cxgbe(4): Use the rx channel map (instead of the tx channel map) as the congestion channel map. MFC after: 1 week	2014-02-06 03:30:12 +00:00
Navdeep Parhar	7293a15f54	cxgbe(4): The T5 allows for a different freelist starvation threshold for queues with buffer packing. Use the correct value to calculate a freelist's low water mark. MFC after: 1 week	2014-02-06 03:21:43 +00:00
Navdeep Parhar	454813ff9c	cxgbe(4): Use the port's tx channel to identify it to t4_clr_port_stats. MFC after: 3 days	2014-02-06 02:34:29 +00:00
Adrian Chadd	3af0f449ae	Add an option to enable or disable the small RX packet copying that is done to improve performance of small frames. When doing RX packing, the RX copying isn't necessarily required. Reviewed by: np	2014-01-02 23:23:33 +00:00
Navdeep Parhar	88bb82e511	Do not create a hardware IPv6 server if the listen address is not in6addr_any and is not in the CLIP table either. This fixes a reported TOE+IPv6 NULL-dereference panic in do_pass_open_rpl(). While here, stop creating hardware servers for any loopback address. It's just a waste of server tids. MFC after: 1 week	2013-12-17 21:41:23 +00:00
Navdeep Parhar	93e9cae3fa	Read card capabilities after firmware initialization, instead of setting them up as part of firmware initialization (which the driver gets to do only if it's the master driver). Read the range of tids available for the ETHOFLD functionality if it's enabled. New is_ftid() and is_etid() functions to test whether a tid falls within the range of filter tids or ETHOFLD tids respectively. MFC after: 2 weeks	2013-12-14 03:08:03 +00:00
Adrian Chadd	ac68deae6d	Print out the full PCIe link negotiation during dmesg. I found this useful when checking whether a NIC is in a PCIE 3.0 8x slot or not. Reviewed by: np Sponsored by: Netflix, inc.	2013-12-10 00:07:04 +00:00
Navdeep Parhar	d419aaa126	Unstaticize t4_list and t4_uld_list. This works around a clang annoyance[1] and allows kgdb to find these symbols. [1] http://lists.freebsd.org/pipermail/freebsd-hackers/2012-November/041166.html MFC after: 3 days	2013-12-09 23:33:57 +00:00
Navdeep Parhar	273ef9912d	cxgbe(4): save a copy of the RSS map for each port for the driver's use.	2013-12-08 17:47:37 +00:00
Navdeep Parhar	05337b80ee	cxgbe(4): T4_SET_SCHED_CLASS and T4_SET_SCHED_QUEUE ioctls to program scheduling classes in the chip and to bind tx queue(s) to a scheduling class respectively. These can be used for various kinds of tx traffic throttling (to force selected tx queues to drain at a fixed Kbps rate, or a % of the port's total bandwidth, or at a fixed pps rate, etc.). Obtained from: Chelsio	2013-12-03 18:34:52 +00:00
Navdeep Parhar	2471928bf8	Disable an assertion that relies on some code[1] that isn't in HEAD yet. [1] http://lists.freebsd.org/pipermail/freebsd-net/2013-August/036573.html	2013-11-27 19:54:19 +00:00
Navdeep Parhar	245a0bd40a	cxgbe(4): update the internal list of device features. MFC after: 3 days	2013-11-21 20:07:58 +00:00
Navdeep Parhar	1192eeb8a3	cxgbe(4): Tidy up the display for payload memory statistics (pm_stats). # sysctl -n dev.t4nex.0.misc.pm_stats # sysctl -n dev.t5nex.0.misc.pm_stats MFC after: 1 week	2013-11-07 00:25:49 +00:00
Navdeep Parhar	be2c01211c	cxgbe(4): Exclude MPS_RPLC_MAP_CTL (0x11114) from the register dump. Turns out it's a write-only register with strange side effects on read. Submitted by: gnn MFC after: 3 days	2013-11-04 21:06:21 +00:00
Gleb Smirnoff	66e01d73cd	- Provide necessary includes. - Remove unnecessary includes. Sponsored by: Netflix Sponsored by: Nginx, Inc.	2013-10-29 11:17:49 +00:00
Gleb Smirnoff	c3322cb91c	Include necessary headers that now are available due to pollution via if_var.h. Sponsored by: Netflix Sponsored by: Nginx, Inc.	2013-10-28 07:29:16 +00:00
Gleb Smirnoff	76039bc84f	The r48589 promised to remove implicit inclusion of if_var.h soon. Prepare to this event, adding if_var.h to files that do need it. Also, include all includes that now are included due to implicit pollution via if_var.h Sponsored by: Netflix Sponsored by: Nginx, Inc.	2013-10-26 17:58:36 +00:00
Navdeep Parhar	87f804e879	Fix typo in previous commit.	2013-10-18 00:00:08 +00:00
Navdeep Parhar	02318fcf86	iw_cxgbe should have a dependency on t4nex. Reported by: trasz@	2013-10-17 23:57:17 +00:00
Navdeep Parhar	fb93f5c47f	iw_cxgbe: iWARP driver for Chelsio T4/T5 chips. This is a straight port of the iw_cxgb4 found in OFED distributions. Obtained from: Chelsio	2013-10-17 18:37:25 +00:00
Navdeep Parhar	b3eda7872d	cxgbe(4): Store the log2 of the # of doorbells per BAR2 page for both ingress and egress queues, and for both T4 and T5. These values are used by the T4/T5 iWARP driver.	2013-10-14 23:32:56 +00:00
Navdeep Parhar	48d05478bf	cxgbe(4): Update T4 and T5 firmwares to 1.9.12.0	2013-10-14 21:25:07 +00:00
Gleb Smirnoff	4cdc1f5421	There are some high performance NICs that count statistics in hardware, and there are ifnets, that do that via counter(9). Provide a flag that would skip cache line trashing '+=' operation in ether_input(). Sponsored by: Netflix Sponsored by: Nginx, Inc. Reviewed by: melifaro, adrian Approved by: re (marius)	2013-10-09 19:04:40 +00:00
Dimitry Andric	64db896617	Fix kernel build on amd64 after r256118, since the machine/md_var.h header is not implicitly included there. So include it explicitly. Approved by: re (delphij) Pointy hat to: dim MFC after: 3 days X-MFC-With: r256118	2013-10-07 22:30:03 +00:00
Dimitry Andric	42355a4ff6	Remove redundant declaration of cpu_clflush_line_size in sys/dev/cxgbe/t4_sge.c, to silence a gcc warning. Approved by: re (gjb) MFC after: 3 days	2013-10-07 16:56:56 +00:00
Navdeep Parhar	eb22728291	Rework the tx credit mechanism between the cxgbe/tom driver and the card. This helps smooth out some burstiness in the exchange. Approved by: re (glebius)	2013-09-09 04:38:57 +00:00
Navdeep Parhar	c81d56a0aa	Fix a miscalculation that caused cxgbe/tom to auto-increment a TOE socket's tx buffer size too aggressively. Approved by: re (delphij)	2013-09-09 00:16:59 +00:00
Navdeep Parhar	4f641559c7	For TOE connections, the window scale factor in CPL_PASS_ACCEPT_REQ is set to 15 to indicate that the peer did not send a window scale option with its SYN. Do not send a window scale option in the SYN|ACK reply in that case.	2013-09-03 23:34:04 +00:00
Navdeep Parhar	32e9219012	Fix the sysctl that displays whether buffer packing is enabled or not.	2013-08-30 02:13:36 +00:00
Navdeep Parhar	1458bff9a4	Implement support for rx buffer packing. Enable it by default for T5 cards. This is a T4 and T5 chip feature which lets the chip deliver multiple Ethernet frames in a single buffer. This is more efficient within the chip, in the driver, and reduces wastage of space in rx buffers. - Always allocate rx buffers from the jumbop zone, no matter what the MTU is. Do not use the normal cluster refcounting mechanism. - Reserve space for an mbuf and a refcount in the cluster itself and let the chip DMA multiple frames in the rest. - Use the embedded mbuf for the first frame and allocate mbufs on the fly for any additional frames delivered in the cluster. Each of these mbufs has a reference on the underlying cluster.	2013-08-30 01:45:36 +00:00
Navdeep Parhar	480e603c79	Merge r254386 from user/np/cxl_tuning. Add an INET|INET6 check missing in said revision. r254386: Flush inactive LRO entries periodically.	2013-08-29 06:26:22 +00:00
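
Note on f7a74e061b (rsrv_noflow): the queue selection described in that commit message can be sketched roughly as below. This is an illustration only, not the driver's code. The function name select_txq and its parameters are hypothetical, and the exact mapping (modulo over the remaining queues, then plus one) is an assumption based on the phrase "bumped up by 1".

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical helper (not taken from the cxgbe sources): pick a tx queue
 * index in [0, ntxq) for a packet.
 *
 * With rsrv_noflow set, queue 0 is kept for packets without a flowid
 * (for example link-level traffic such as LACP), and packets with a
 * flowid are spread over queues 1..ntxq-1.
 */
static unsigned
select_txq(bool has_flowid, uint32_t flowid, unsigned ntxq, bool rsrv_noflow)
{
	/* No flowid, or only one queue to choose from: use queue 0. */
	if (!has_flowid || ntxq <= 1)
		return (0);

	/* Assumed mapping: hash over the non-reserved queues, shifted by 1. */
	if (rsrv_noflow)
		return ((flowid % (ntxq - 1)) + 1);

	/* Default behaviour assumed here: hash over all queues. */
	return (flowid % ntxq);
}
```

With 8 queues and the knob set, a packet carrying flowid 7 would land on queue 1 in this sketch, while a packet without a flowid always lands on queue 0.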

225 Commits