freebsd-dev

Author	SHA1	Message	Date
Maksim Yevmenkin	c5d8a885d4	Correct typo(?) and actually set PTHRESH to 32 and not 16 as per Intel Linux driver 3.8.21. MFC after: 1 week	2012-06-07 22:57:26 +00:00
Maksim Yevmenkin	cd1fb2e095	Before it gets lost in the noise. Put a bandaid to prevent ixgbe(4) from completely locking up the system under high load. Our platform has a few CPU cores and a single active ixgbe(4) port with 4 queues. Under high enough traffic load, at about 7.5GBs and 700,000 packets/sec (outbound), the entire system would deadlock. What we found was that each CPU was in an endless loop on a different ix taskqueue thread. The OACTIVE flag had gotten set on each queue, and the ixgbe_handle_queue() function was continuously rescheduling itself via the taskqueue_enqueue. Since all CPUs were busy with their taskqueue threads, the ixgbe_local_timer() function couldn't run to clear the OACTIVE flag. Submitted by: scottl MFC after: 1 week	2012-06-05 18:48:02 +00:00
Bjoern A. Zeeb	e2c0161e2e	MFp4 bz_ipv6_fast: Add TSO6 and LRO/IPv6 support. Fix the module Makefile to at least properly include opt_inet6.h and allow builds without INET or INET6. Sponsored by: The FreeBSD Foundation Sponsored by: iXsystems Reviewed by: gnn (as part of the whole) MFC After: 3 days	2012-05-25 03:02:56 +00:00
Luigi Rizzo	f9125c3ec9	fix a typo in a comment	2012-05-17 14:36:19 +00:00
Luigi Rizzo	9b034c6f08	Properly disable crc stripping when operating in netmap mode. Contrarily to what i wrote in my previous commit, the 82599 does include the CRC in the length. The operating mode is reset in ixgbe_init_locked() and so we need to hook into the places where the two registers (HLREG0 and RDRXCTL) are modified.	2012-04-13 16:42:54 +00:00
Luigi Rizzo	aa15c59eb1	Enable prefetching of descriptors on the TX ring, using the same values as in the Intel driver 3.8.21 for linux. The fact that it is standard in the above driver suggests that it has no bad side effects. But of course there must be a reason for enabling features, not just "it does not harm", so here it is a good one: Prefetching enables full line rate even using a single queue (14.88 Mpps, compared to ~12 Mpps without prefetch). This in turn is terribly useful when one wants to schedule traffic. For obvious reasons the difference is only visible with netmap or other high speed solutions, but presumably the advantage should be in the order of a fraction of a microsecond when starting transmission on an empty queue. Discussed with Jack Vogel. MFC after: 1 week	2012-04-11 15:02:14 +00:00
Scott Long	62ce43ccc8	More conversions of drivers to use the PCI parent DMA tag.	2012-03-12 18:15:08 +00:00
Luigi Rizzo	64ae02c365	A bunch of netmap fixes: USERSPACE: 1. add support for devices with different number of rx and tx queues; 2. add better support for zero-copy operation, adding an extra field to the netmap ring to indicate how many buffers we have already processed but not yet released (with help from Eddie Kohler); 3. The two changes above unfortunately require an API change, so while at it add a version field and some spares to the ioctl() argument to help detect mismatches. 4. update the manual page for the two changes above; 5. update sample applications in tools/tools/netmap KERNEL: 1. simplify the internal structures moving the global wait queues to the 'struct netmap_adapter'; 2. simplify the functions that map kring<->nic ring indexes 3. normalize device-specific code, helps maintenance; 4. start exploring the impact of micro-optimizations (prefetch etc.) in the ixgbe driver. Use 'legacy' descriptors on the tx ring and prefetch slots gives about 20% speedup at 900 MHz. Another 7-10% would come from removing the explicit calls to bus_dmamap* in the core (they are effectively NOPs in this case, but it takes expensive load of the per-buffer dma maps to figure out that they are all NULL). Rx performance not investigated. I am postponing the MFC so i can import a few more improvements before merging.	2012-02-27 19:05:01 +00:00
Luigi Rizzo	5644ccec61	(This commit only touches code within the DEV_NETMAP blocks) Introduce some functions to map NIC ring indexes into netmap ring indexes and vice versa. This way we can implement the bound checks only in one place (and hopefully in a correct way). On passing, make the code and comments more uniform across the various drivers.	2012-02-15 23:13:29 +00:00
Jack F Vogel	85d0a26ed4	New hardware support: Intel X540 adapter support added. Some shared code reorganization along with the new adapter. Sync changes to OACTIVE in igb into this driver. Misc small fixes.	2012-01-30 16:42:02 +00:00
Luigi Rizzo	2157a17ce2	ixgbe changes: - remove experimental code for disabling CRC - use the correct constant for conversion between interrupt rate and EITR values (the previous values were off by a factor of 2) - make dev.ix.N.queueM.interrupt_rate a RW sysctl variable. Changing individual values affects the queue immediately, and propagates to all interfaces at the next reinit. - add dev.ix.N.queueM.irqs rdonly sysctl, to export the actual interrupt counts Netmap-related changes for ixgbe: - use the "new" format for TX descriptors in netmap mode. - pass interrupt mitigation delays to the user process doing poll() on a netmap file descriptor. On the RX side this means we will not check the ring more than once per interrupt. This gives the process a chance to sleep and process packets in larger batches, thus reducing CPU usage. On the TX side we take this even further: completed transmissions are reclaimed every half ring even if the NIC interrupts more often. This saves even more CPU without any additional tx delays. Generic Netmap-related changes: - align the netmap_kring to cache lines so that there is no false sharing (possibly useful for multiqueue NICs and MSIX interrupts, which are handled by different cores). It's a minor improvement but it does not cost anything. Reviewed by: Jack Vogel Approved by: Jack Vogel	2012-01-26 09:55:16 +00:00
Luigi Rizzo	e3ca4599b0	netmap-related changes: 1. correct the initialization of RDT when there is an ixgbe_init() while a netmap client is active. This code was previously in ixgbe_initialize_receive_units() but RDT is overwritten shortly afterwards in ixgbe_init_locked() 2. add code (not active yet) to disable CRCSTRIP while in netmap mode. From all evidence i could gather, it seems that when the 82599 has to write a data block that is not a full cache line, it first reads the line (64 bytes) and then writes back the updated version. This hurts reception of min-sized frames, which are only 60 bytes if the CRC is stripped: i could never get above 11Mpps (received from one queue) with CRCSTRIP enabled, while 64+4-byte packets reach 14.2 Mpps (the theoretical maximum). Leaving the CRC in gets us 14.88Mpps for 60+4 byte frames, (and penalizes 64+4). The min-size case is important not just because it looks good in benchmarks, but also because this is the size of pure acks. Note we cannot leave CRCSTRIP on by default because it is incompatible with some other features (LRO etc.)	2012-01-19 09:36:19 +00:00
Luigi Rizzo	6e10c8b8c5	small code cleanup in preparation for future modifications in the memory allocator used by netmap. No functional change, two small bug fixes: - in if_re.c add a missing bus_dmamap_sync() - in netmap.c comment out a spurious free() in an error handling block	2012-01-10 19:57:23 +00:00
Kevin Lo	5bbe0c5357	ether_ifattach() sets if_mtu to ETHERMTU, don't bother set it again Reviewed by: yongari	2012-01-07 09:41:57 +00:00
Matthew D Fleming	117f85276f	Consistently use types in ixgbe driver code: - {ixgbe,ixv}_header_split is passed to TUNABLE_INT, so declare it int, not bool. - {ixgbe,ixv}_tx_ctx_setup() returns a boolean value, so declare it bool, not int. - {ixgbe,ixv}_tso_setup() returns a bool, so declare it bool, not boolean_t. - {ixgbe,ixv}_txeof() returns a bool, so declare it bool, not boolean_t. - Do not re-define bool if the symbol already exists. MFC after: 2 weeks Sponsored by: Isilon Systems, LLC	2011-12-12 18:27:28 +00:00
Luigi Rizzo	506cc70cce	1. Fix the handling of link reset while in netmap mode. A link reset now is completely transparent for the netmap client: even if the NIC resets its own ring (e.g. restarting from 0), the client will not see any change in the current rx/tx positions, because the driver will keep track of the offset between the two. 2. make the device-specific code more uniform across different drivers There were some inconsistencies in the implementation of the netmap support routines, now drivers have been aligned to a common code structure. 3. import netmap support for ixgbe . This is implemented as a very small patch for ixgbe.c (233 lines, 11 chunks, mostly comments: in total the patch has only 54 lines of new code) , as most of the code is in an external file sys/dev/netmap/ixgbe_netmap.h , following some initial comments from Jack Vogel about making changes less intrusive. (Note, i have emailed Jack multiple times asking if he had comments on this structure of the code; i got no reply so i assume he is fine with it). Support for other drivers (em, lem, re, igb) will come later. "ixgbe" is now the reference driver for netmap support. Both the external file (sys/dev/netmap/ixgbe_netmap.h) and the device-specific patches (in sys/dev/ixgbe/ixgbe.c) are heavily commented and should serve as a reference for other device drivers. Tested on i386 and amd64 with the pkt-gen program in tools/tools/netmap, the sender does 14.88 Mpps at 1050 MHz and 14.2 Mpps at 900 MHz on an i7-860 with 4 cores and 82599 card. Haven't tried yet more aggressive optimizations such as adding 'prefetch' instructions in the time-critical parts of the code.	2011-12-05 12:06:53 +00:00
Qing Li	62e3af5225	The maximum read size of incoming packets is done in 1024-byte increments. The current code was rounding down the maximum frame size instead of rounding up, resulting in a read size of 1024 bytes, in the non-jumbo frame case, and splitting the packets across multiple mbufs. Consequently the above problem exposed another issue, which is when packets were split across multiple mbufs, and all of the mbufs in the chain have the M_PKTHDR flag set. Submitted by: original patch by Ray Ruvinskiy at BlueCoat dot com Reviewed by: jfv, kmacy, rwatson Approved by: re (rwatson) MFC after: 5 days	2011-09-05 17:54:19 +00:00
Jack F Vogel	b6582d0066	First off: update the driver README, the old one was horribly crusty, and this still isn't perfect, but it's at least a bit more recent. Secondly, a few improvements to the driver from Andrew Boyer, support hint to allow devices to not attach, add VLAN_HWTSO capability so vlans can use TSO, fix in the interrupt handler to make sure the stack TX queue is processed. Oh, and also make sure IPv6 does not cause a re-init in the ioctl routine. Thanks for your efforts Andrew! Thanks to Claudio Jeker for noticing the ixgbe_xmit() routine was not correctly swapping the dma map from the first to the last descriptor in a multi-descriptor transmission, corrected this.	2011-06-02 00:34:57 +00:00
Jack F Vogel	e2314c6ccb	- Add the RX refresh changes from igb to ixgbe - Also a couple minor tweaks to the TX code from the same source. - Add the INET ioctl code which has been missing from this driver, and which caused IP aliases to reset the interface. - Last, some minor logic changes that just reflect upcoming hardware support, but have no other functional effect now. MFC after a week	2011-04-25 23:34:21 +00:00
Jack F Vogel	7d5f64a903	Don't bother to run the flowcontrol code if there is no change. Thanks to Andrew for the tweak.	2011-01-22 00:19:15 +00:00
Jack F Vogel	1d4e0b19e4	Missing case for 82598DA type adapter, thanks Andrew.	2011-01-22 00:08:06 +00:00
Jack F Vogel	c6f98cde15	Leftover bogus TX UNLOCK removed. Thanks to Andrew Boyer.	2011-01-21 23:55:28 +00:00
Jack F Vogel	182b3808b5	Update driver to version 2.3.8: CRITICAL FIX - with stats changes the older 82598 will panic and trash the stack on driver load, FCOE registers ONLY exist in 82599 and must not be read otherwise. kern/153951 - to correct incorrect media type on adapters with pluggable modules I have eliminated the old static table in favor of a new dynamic shared code routine. This also has the benefit of detecting changes when a different module is inserted. Performance/enhancement to the Flow Director code from my linux coworker (the developer of the code). Fixes from Michael Tuexen - a data corruption problem on the 82599 (CRITICAL), fix so the buf size correctly adjusts as the cluster changes, and max descriptors are set properly. Also added 16K clusters for those REALLY big jumbos :) In the RX path, the RX LOCK was not being released, and this causes LOR problems. Add the code that igb already has. Sync with in house shared code, this was necessary for the Flow Director fix. MFC in 2 days	2011-01-19 19:36:27 +00:00
Matthew D Fleming	5bc0787f29	Specify a CTLTYPE_FOO so that a future sysctl(8) change does not need to rely on the format string.	2011-01-18 21:14:23 +00:00
Matthew D Fleming	8c49f18771	sysctl(9) cleanup checkpoint: amd64 GENERIC builds cleanly. Commit the Intel drivers.	2011-01-12 19:53:23 +00:00
Jack F Vogel	006d15596a	kern/153772 fix variable names. Thank you Andrew Boyer for catching these MFC in 3 days	2011-01-07 22:34:56 +00:00
Jack F Vogel	43fcb978a7	This small little change is a bug that drove me nuts finding. The test to compare the mbuf m_len against a fixed value and then returning needs to be removed. When using VLANS and doing HW_TAGGING, and IPV6, the ICMP6 packets actually fail this condition, the constant assumes that the tag is IN the frame, and it's not, so the length is actually tiny. Furthermore, I'm not sure what the point was to just return?? MFC after: 3 days	2010-12-04 01:43:38 +00:00
Jack F Vogel	f0fe67b43c	Interrupt handler, and stats changes from Michael Tuexen, thanks Michael!	2010-11-27 01:34:09 +00:00
Jack F Vogel	aa26851c4f	A couple fixes got clobbered, putting them back.	2010-11-26 23:57:13 +00:00
Jack F Vogel	1a4e34498c	Update ixgbe driver to version 2.3.6 - This adds a VM SRIOV interface, ixv, it is however transparent to the user, it links with the ixgbe.ko, but when ixgbe is loaded in a virtualized guest with SRIOV configured this will be detected. - Sync shared code to latest - Many bug fixes and improvements, thanks to everyone who has been using the driver and reporting issues.	2010-11-26 22:46:32 +00:00
Rebecca Cran	b1ce21c6ef	Fix typos. PR: bin/148894 Submitted by: olgeni	2010-11-09 10:59:09 +00:00
Pyun YongHyeon	dd20cce19a	Do not allocate multicast array memory in multicast filter configuration function. For failed memory allocations, em(4)/lem(4) called panic(9) which is not acceptable on production box. igb(4)/ixgb(4)/ix(4) allocated the required memory in stack which consumed 768 bytes of stack memory which looks too big. To address these issues, allocate multicast array memory in device attach time and make multicast configuration success under any conditions. This change also removes the excessive use of memory in stack. Reviewed by: jfv	2010-08-28 00:34:22 +00:00
Pyun YongHyeon	ad1917be37	Do not call voluntary panic(9) in case of if_alloc() failure. Reviewed by: jfv	2010-08-28 00:09:19 +00:00
Jack F Vogel	1fa9ef23cc	BAH, I apologize, the wrong version of the code got fat fingered in place, this is the correct version that actually works... <sheepish grin> MFC: in a week	2010-06-30 01:10:08 +00:00
Jack F Vogel	5f46ec799a	Add a new sysctl option, this will allow one to limit the advertised speed of an SFP+ to 1G, effectively "forcing" link at that lower speed. It is off by default and is enabled by sysctl dev.ix.0.force_gig=1, 0 will set it back to the norm.	2010-06-30 01:01:06 +00:00
Jack F Vogel	91c0189dc0	Change the mbuf memory calls back to NOWAIT as a problem has been seen in one case with doing the M_WAITOK	2010-06-11 20:59:29 +00:00
Jack F Vogel	0301599d3d	Remove a disable_queue from the beginning of the interrupt handler, automask handles it. Also, add in msix vector descriptions. MFC for 8.1 asap	2010-06-11 19:03:59 +00:00
Jack F Vogel	2d8f84cbea	Fixes for panic experienced in test at Intel, when doing bidirectional stress traffic on 82598. Also a couple bug fixes from Michael Tuexen, thank you!! Add a workaround into the header so that 8 REL can use the driver (adds local copy of ALTQ fix). MFC: in a few days	2010-06-03 00:00:45 +00:00
Jack F Vogel	3f13ffab71	A few changes: When not defining header split do not allocate mbufs, this can be a BIG savings in the mbuf memory pool. Also keep separate dma maps for the header and payload pieces when doing header split. The basis of this code was a patch done a while ago by yongari, thank you :) A number of white space changes. MFC: in a few days	2010-05-19 00:03:48 +00:00
Jack F Vogel	245c81a9ea	A few minor fixes: - add a moderation value to the Link vector - allow disabling HW RSC on the 82599 if LRO is not enabled. - correct error in the stats code - change optic type on the 82598 DA device Thanks to Andrew Boyer for the changes.	2010-05-14 22:00:37 +00:00
Jack F Vogel	c99cdece4e	Remove the tx queue selection based on the cpu when no flowid is present, this was causing some bad reordering, now just use 0. Also, add a few watchdog bits, and tx handler bits that were corrected in igb.	2010-04-16 16:33:05 +00:00
Jack F Vogel	1eadf156c2	fix my clobber of the copyright date :)	2010-03-30 19:54:29 +00:00
Jack F Vogel	9de5aff5b4	Thanks to Michael Tuexen for adding SCTP support for 82599, also for finding a one character bug that kept TSO from working. Sometimes with direct attach cables a failure can occur in init, the old method of calling detach was broken, there is no way to return an error to the system from init, so I have changed it to return failure thru the ioctl. And, have fixed the ALTQ code changes of Max Laier, sorry Max :)	2010-03-30 19:09:18 +00:00
Jack F Vogel	c00148556a	Update the driver to Intel version 2.1.6 - add some new hardware support for 82599 - Big change to interrupt architecture, it now uses a queue which contains an RX/TX pair as the recipient of the interrupt. This will reduce overall system interrupts/msix usage. - Improved RX mbuf handling: the old get_buf routine is no longer synchronized with rxeof, this allows the elimination of packet discards due to mbuf allocation failure. - Much simplified and improved AIM code, it now happens in the queue interrupt context and takes into account both the traffic on the RX AND TX side. - variety of small tweaks, like ring size, that have been seen as performance improvements. - Thanks to those that provided feedback or suggested changes, I hope I've caught all of them.	2010-03-27 00:21:40 +00:00
Max Laier	193cbc4d24	Fix drbr and altq interaction: - introduce drbr_needs_enqueue that returns whether the interface/br needs an enqueue operation: returns true if altq is enabled or there are already packets in the ring (as we need to maintain packet order) - update all drbr consumers - fix drbr_flush - avoid using the driver queue (IFQ_DRV_*) in the altq case as the multiqueue consumer does not provide enough protection, serialize altq interaction with the main queue lock - make drbr_dequeue_cond work with altq Discussed with: kmacy, yongari, jfv MFC after: 4 weeks	2010-02-13 16:04:58 +00:00
Martin Blapp	c2ede4b379	Remove extraneous semicolons, no functional changes. Submitted by: Marc Balmer <marc@msys.ch> MFC after: 1 week	2010-01-07 21:01:37 +00:00
Jack F Vogel	2969bf0e46	Update driver to Intel version 2.0.7: This adds new feature support for the 82599, a hardware assist to LRO, doing this required a large revamp to the RX cleanup code because the descriptor ring may now be processed out of order, this necessitated the elimination of global pointers. Additionally, the RX routine now does not refresh mbufs on every descriptor, rather it will do a range, and then update the hardware pointer at that time. These are performance oriented changes. The TX side now has a cleaner simpler watchdog algorithm as well, in TX cleanup a read of ticks is stored, that can then be compared in local_timer to determine if there is a hang. Various other cleanups along the way, thanks to all who have provided input and testing.	2009-12-07 21:30:54 +00:00
John Baldwin	e1b17582f4	Take a step towards removing if_watchdog/if_timer. Don't explicitly set if_watchdog/if_timer to NULL/0 when initializing an ifnet. if_alloc() sets those members to NULL/0 already.	2009-11-06 14:55:01 +00:00
Jack F Vogel	ac54649762	Stats missed packet handling was still not quite right, thanks to Dmitrij Tejblum for the correction, need a variable with scope only within the for loop for all queues. MFC: 3 days	2009-09-11 00:00:23 +00:00
Jack F Vogel	0cde297e03	If an interface is brought up with no cable it will experience watchdog resets, this is due to a missing check for link in the new multiqueue start code. MFC: 3 days	2009-09-04 22:45:07 +00:00

68 Commits