freebsd-dev

Author	SHA1	Message	Date
Sean Bruno	98ae230f07	em(4): Add Skylake/I219 support. - driver rev 7.5.2 - use new functions em_flush* for i219 devices Differential Revision: https://reviews.freebsd.org/D3163 Submitted by: erj jfv Reviewed by: jfv MFC after: 1 month Relnotes: Yes Sponsored by: Intel Corporation	2015-09-04 17:21:55 +00:00
Sean Bruno	67ebffd348	e1000: Shared code updates - Fix compiler warning in 80003es2lan.c - Add return value handler for e1000_*_kmrn_reg_80003es2lan - Fix usage of DEBUGOUT - Remove unnecessary variable initializations. - Removed unused variables (complaints from gcc). - Edit defines in 82571.h. - Add workaround for igb hw errata. - Shared code changes for Skylake/I219 support. - Remove unused OBFF and LTR functions. Differential Revision: https://reviews.freebsd.org/D3162 Submitted by: erj MFC after: 1 month Sponsored by: Intel Corporation	2015-09-04 16:30:48 +00:00
Sean Bruno	02415af2ee	igb(4): Update and fix HW errata - HW errata workaround for IPv6 offload w/ extension headers - Edited start of if_igb.c (Device IDs / #includes) to match ixgbe/ixl Differential Revision: https://reviews.freebsd.org/D3165 Submitted by: erj MFC after: 1 month Sponsored by: Intel Corporation	2015-09-04 16:07:27 +00:00
Sean Bruno	fac8243601	Restrict tso_max to IP_MAXPACKET to avoid the panic reported in: https://lists.freebsd.org/pipermail/freebsd-current/2015-August/057192.html Submitted by: pyunyh@gmail.com MFC after: 2 weeks	2015-08-31 19:12:10 +00:00
Sean Bruno	48600901a8	Style/whitespace cleanup in shared/common code. Differential Revision: https://reviews.freebsd.org/D3159 Submitted by: erj MFC after: 2 weeks	2015-08-24 16:32:57 +00:00
Sean Bruno	7c669ab6cc	Bump all copyright dates to 2015 Differential Revision: https://reviews.freebsd.org/D3160 Submitted by: erj MFC after: 2 weeks Sponsored by: Intel Corporation	2015-08-16 20:13:58 +00:00
Sean Bruno	d2635c677b	e1000/if_lem.c bump to 1.1.0 - deprecate fbsd 8 Differential Revision: https://reviews.freebsd.org/D3164 Submitted by: erj MFC after: 2 weeks Sponsored by: Intel Corporation	2015-08-16 20:10:43 +00:00
Sean Bruno	df40405fab	Increase EM_MAX_SCATTER to 64 such that the size of em_xmit()::segs[EM_MAX_SCATTER] doesn't get overrun by things like NFS that can and do shove more than 32 segs when being used with em(4) and TSO4. Update tso handling code in em_xmit() with update from jhb@ in email thread: https://lists.freebsd.org/pipermail/freebsd-net/2014-July/039306.html set ifp->if_hw_tsomax, ifp->if_hw_tsomaxsegcount & ifp->if_hw_tsomaxsegsize to appropriate values. Define a TSO workaround "magic" number of 4 that is used to avoid an alignment issue in hardware. Change a couple of integer values that were used as booleans to actual bool types. Ensure that em_enable_intr() enables the appropriate mask of interrupts and not just a hardcoded define of values. PR: 200221 199174 195078 Differential Revision: https://reviews.freebsd.org/D3192 Reviewed by: erj jhb hiren MFC after: 2 weeks Sponsored by: Limelight Networks	2015-08-16 19:43:44 +00:00
Sean Bruno	38be29d321	Add capability to disable CRC stripping. This breaks IPMI/BMC capabilities on certain adapters. Linux has been doing the exact same thing since 2008 `eb7c3adb1c` PR: 161277 Differential Revision: https://reviews.freebsd.org/D3282 Submitted by: Fravadona@gmail.com Reviewed by: erj wblock MFC after: 2 weeks Relnotes: yes Sponsored by: Limelight Networks	2015-08-16 19:06:23 +00:00
Hans Petter Selasky	577c341353	Free mbufs when busdma loading fails. Reviewed by: erj, sbruno MFC after: 1 month	2015-08-01 20:40:37 +00:00
Sean Bruno	a82cd51680	Remove unused txd_saved. Initialize txd_upper, txd_lower and txd_used at declaration. Differential Revision: D3174 Reviewed by: erj hiren MFC after: 2 weeks Sponsored by: Limelight Networks	2015-07-25 19:24:33 +00:00
Sean Bruno	f46fb03de7	Add an adapter CORE lock in the DDB hook em_dump_queue to avoid WITNESS panic in em_init_locked() while debugging. MFC after: 2 weeks Sponsored by: Limelight Networks	2015-07-16 16:32:57 +00:00
Kevin Lo	f7c698e20d	Fix typo in register definition. Submitted by: James Hung Reviewed by: sbruno	2015-07-16 08:03:23 +00:00
Luigi Rizzo	847bf38369	Sync netmap sources with the version in our private tree. This commit contains large contributions from Giuseppe Lettieri and Stefano Garzarella, is partly supported by grants from Verisign and Cisco, and brings in the following: - fix zerocopy monitor ports and introduce copying monitor ports (the latter are lower performance but give access to all traffic in parallel with the application) - exclusive open mode, useful to implement solutions that recover from crashes of the main netmap client (suggested by Patrick Kelsey) - revised memory allocator in preparation for the 'passthrough mode' (ptnetmap) recently presented at bsdcan. ptnetmap is described in S. Garzarella, G. Lettieri, L. Rizzo; Virtual device passthrough for high speed VM networking, ACM/IEEE ANCS 2015, Oakland (CA) May 2015 http://info.iet.unipi.it/~luigi/research.html - fix rx CRC handling on ixl - add module dependencies for netmap when building drivers as modules - minor simplifications to device-specific routines (txsync, rxsync) - general code cleanup (remove unused variables, introduce macros to access rings and remove duplicate code). Applications do not need to be recompiled, unless of course they want to use the new features (monitors and exclusive open). Those willing to try this code on stable/10 can just update the sys/dev/netmap/, sys/net/netmap with the version in HEAD and apply the small patches to individual device drivers. MFC after: 1 month Sponsored by: (partly) Verisign, Cisco	2015-07-10 05:51:36 +00:00
Sean Bruno	23c9098b2a	Change EM_MULTIQUEUE to a real kernconf entry and enable support for up to 2 rx/tx queues for the 82574. Program the 82574 to enable 5 msix vectors, assign 1 to each rx queue, 1 to each tx queue and 1 to the link handler. Inspired by DragonFlyBSD, enable some RSS logic for handling tx queue handling/processing. Move multiqueue handler functions so that they line up better in a diff review to if_igb.c Always enqueue tx work to be done in em_mq_start, if unable to acquire the TX lock, then this will be processed in the background later by the taskqueue. Remove mbuf argument from em_start_mq_locked() as the work is always enqueued. (stolen from igb) Setup TARC, TXDCTL and RXDCTL registers for better performance and stability in multiqueue and singlequeue implementations. Handle Intel errata 3 and generic multiqueue behavior with the initialization of TARC(0) and TARC(1) Bind interrupt threads to cpus in order. (stolen from igb) Add 2 new DDB functions, one to display the queue(s) and their settings and one to reset the adapter. Primarily used for debugging. In the multiqueue configuration, bump RXD and TXD ring size to max for the adapter (4096). Setup an RDTR of 64 and an RADV of 128 in multiqueue configuration to cut down on the number of interrupts. RADV was arbitrarily set to 2x RDTR and can be adjusted as needed. Cleanup the display in top a bit to make it clearer where the taskqueue threads are running and what they should be doing. Ensure that both queues are processed by em_local_timer() by writing them both to the IMS register to generate soft interrupts. Ensure that a soft interrupt is generated when em_msix_link() is run so that any races between assertion of the link/status interrupt and a rx/tx interrupt are handled.
Document existing tunables: hw.em.eee_setting, hw.em.msix, hw.em.smart_pwr_down, hw.em.sbp Document use of hw.em.num_queues and the new kernel option EM_MULTIQUEUE Thanks to Intel for their continued support of FreeBSD. Reviewed by: erj jfv hiren gnn wblock Obtained from: Intel Corporation MFC after: 2 weeks Relnotes: Yes Sponsored by: Limelight Networks Differential Revision: https://reviews.freebsd.org/D1994	2015-06-03 18:01:09 +00:00
Sean Bruno	b7a728aaba	Simplify hang detection by stealing the techniques used in ixl(4) and applying them to em(4). Rely on iterations through the local timer, and the tx queue state to determine if an actual hang has occurred. Any time a descriptor is used (packet sent), the tx queue is flagged as busy. Then when txeof runs, it either clears the flag when all is clean, or resets it to 1 if ANY are cleaned, if nothing is cleaned it increments the flag. Local timer simply checks to see if busy ever reaches MAX (10, which is compile time configurable), and then sets it as HUNG, at that point there is one more timer cycle in which to have any cleans, if not a watchdog reset will occur. Differential Revision: https://reviews.freebsd.org/D2019 Submitted by: jfv Reviewed by: hiren Obtained from: Intel Corporation MFC after: 2 weeks Relnotes: Yes Sponsored by: Limelight Networks	2015-06-02 18:28:41 +00:00
Sean Bruno	316f4c880a	Bump rx_overruns when indicated by the ICR mask. PR: 199716 MFC after: 3 days Sponsored by: Limelight Networks	2015-05-22 17:01:43 +00:00
John Baldwin	625d12c609	Various fixes to the stats in igb(4), ixgbe(4), and ixl(4). - Use hardware counters for ifnet stats in igb(4) when possible. This ensures these stats include packets that bypass the regular stack via netmap. - Don't dereference values off the end of the igb(4) VF stats structure. Instead, add a dedicated if_get_counter method for igb(4) VF interfaces. - Report missed packets on igb(4) as input queue drops rather than an input error. - Report buf_ring drop counts as output queue drops for igb(4) and ixgbe(4). - Export the buf_ring drop stats for individual rings via sysctl on ixgbe(4). - Fix a typo in ixl(4) that caused output queue drops to be reported as input queue drops and input queue drops to be unreported. Differential Revision: https://reviews.freebsd.org/D2402 Reviewed by: jfv, rstone (6) Sponsored by: Norse Corp, Inc.	2015-04-30 18:23:38 +00:00
Hiren Panchasara	270538b2b6	For igb(4), when we are doing multiqueue, we are all setup to have full 32bit RSS hash from the card. We do not need to hide that under "ifdef RSS" and should expose that by default so others like lagg(4) can use that and avoid hashing the traffic by themselves. While here, improve comments and get rid of hidden/unimplemented RSS support code for UDP. Differential Revision: https://reviews.freebsd.org/D2296 Reviewed by: jfv, erj Discussed with: adrian Sponsored by: Limelight Networks	2015-04-21 20:24:15 +00:00
Adrian Chadd	977dc4e243	Migrate using CPU_ZERO() + CPU_SET() -> CPU_SETOF(). Tested: * ixgbe, igb, RSS enabled Submitted by: jhb Sponsored by: Norse Corp, Inc.	2015-02-25 21:44:53 +00:00
Adrian Chadd	9756bd5982	Change uses of taskqueue_start_threads_pinned() -> taskqueue_start_threads_cpuset() Differential Revision: https://reviews.freebsd.org/D1897 Reviewed by: jfv	2015-02-24 22:17:12 +00:00
Adrian Chadd	b2bdc62a95	Refactor / restructure the RSS code into generic, IPv4 and IPv6 specific bits. The motivation here is to eventually teach netisr and potentially other networking subsystems a bit more about how RSS work queues / buckets are configured so things have a hope of auto-configuring in the future. * net/rss_config.[ch] takes care of the generic bits for doing configuration, hash function selection, etc; * toeplitz.[ch] is now in net/ rather than netinet/; * (and would be in libkern if it didn't directly include RSS_KEYSIZE; that's a later thing to fix up.) * netinet/in_rss.[ch] now just contains the IPv4 specific methods; * and netinet/in6_rss.[ch] now just contains the IPv6 specific methods. This should have no functional impact on anyone currently using the RSS support. Differential Revision: D1383 Reviewed by: gnn, jfv (intel driver bits)	2015-01-18 18:06:40 +00:00
Jack F Vogel	4da1bbcda5	Revert r275136, it was not approved, it was sloppy, if a feature like this is needed please resubmit for Intel's approval.	2014-12-02 23:02:57 +00:00
Hans Petter Selasky	c25290420e	Start process of removing the use of the deprecated "M_FLOWID" flag from the FreeBSD network code. The flag is still kept around in the "sys/mbuf.h" header file, but no longer has any users. Instead the "m_pkthdr.rsstype" field in the mbuf structure is now used to decide the meaning of the "m_pkthdr.flowid" field. To modify the "m_pkthdr.rsstype" field please use the existing "M_HASHTYPE_XXX" macros as defined in the "sys/mbuf.h" header file. This patch introduces new behaviour in the transmit direction. Previously network drivers checked if "M_FLOWID" was set in "m_flags" before using the "m_pkthdr.flowid" field. This check has now been replaced by checking if "M_HASHTYPE_GET(m)" is different from "M_HASHTYPE_NONE". In the future more hashtypes will be added, for example hashtypes for hardware dedicated flows. "M_HASHTYPE_OPAQUE" indicates that the "m_pkthdr.flowid" value is valid and has no particular type. This change removes the need for an "if" statement in TCP transmit code checking for the presence of a valid flowid value. The "if" statement mentioned above is now a direct variable assignment which is then later checked by the respective network drivers like before. Additional notes: - The SCTP code changes will be committed as a separate patch. - Removal of the "M_FLOWID" flag will also be done separately. - The FreeBSD version has been bumped. MFC after: 1 month Sponsored by: Mellanox Technologies	2014-12-01 11:45:24 +00:00
Alfred Perlstein	56c14bca7e	Make igb and ixgbe check tunables at probe time. This allows one to make a kernel module to tune the number of queues before the driver loads. This is needed so that a module at SI_SUB_CPU can set tunables for these drivers to take. Otherwise getenv is called too early by the TUNABLE macros. Reviewed by: smh Phabric: https://reviews.freebsd.org/D1149	2014-11-26 20:19:36 +00:00
Hans Petter Selasky	f0188618f2	Fix multiple incorrect SYSCTL arguments in the kernel: - Wrong integer type was specified. - Wrong or missing "access" specifier. The "access" specifier sometimes included the SYSCTL type, which it should not, except for procedural SYSCTL nodes. - Logical OR where binary OR was expected. - Properly assert the "access" argument passed to all SYSCTL macros, using the CTASSERT macro. This applies to both static- and dynamically created SYSCTLs. - Properly assert the data type for both static and dynamic SYSCTLs. In the case of static SYSCTLs we only assert that the data pointed to by the SYSCTL data pointer has the correct size, hence there is no easy way to assert types in the C language outside a C-function. - Rewrote some code which doesn't pass a constant "access" specifier when creating dynamic SYSCTL nodes, which is now a requirement. - Updated "EXAMPLES" section in SYSCTL manual page. MFC after: 3 days Sponsored by: Mellanox Technologies	2014-10-21 07:31:21 +00:00
John Baldwin	8423f42aa8	Various fixes to stats: - Read the counts of received, dropped, and transmitted management packets and add sysctl nodes for them. - Fix the total octets received/transmitted to read all 64 bits of the counters. - Add missing sysctl nodes for rlec, tncrs, fcruc, tor, and tot. - Remove spurious spaces. Reviewed by: Eric Joyner @ Intel MFC after: 1 week	2014-10-10 16:36:25 +00:00
Gleb Smirnoff	bd071d4d19	- Remove empty wrappers ether_poll_[de]register_drv(). [1] - Move polling(9) declarations out of ifq.h back to if_var.h they are absolutely unrelated to queues. Submitted by: Mikhail <mp lenta.ru> [1]	2014-09-28 14:05:18 +00:00
Gleb Smirnoff	5d53210ced	- Provide igb_get_counter() to return counters that are not collected, but taken from hardware. - Mechanically convert to if_inc_counter() the rest of counters.	2014-09-19 11:49:41 +00:00
Adrian Chadd	0936a8208b	Fix the handling of EOP in status descriptors for if_igb(4) and don't double-free mbufs. Like ixgbe(4) chipsets, EOP is only set on the final descriptor in a chain of descriptors. So, to free the whole list of descriptors, we should free the current slot _and_ the assembled list of descriptors that make up the fragment list. The existing code was setting discard once it saw EOP + an error status; it then freed all the subsequent descriptors until the next EOP. That's totally the wrong order.	2014-09-18 16:20:17 +00:00
Gleb Smirnoff	df3601781d	- Use if_inc_counter() to increment various counters. - Do not ever set a counter to a value. For those counters that we don't increment, but return directly from hardware create cases in if_get_counter() method. Sponsored by: Netflix Sponsored by: Nginx, Inc.	2014-09-18 15:56:14 +00:00
Adrian Chadd	1c2427605c	Set DROP_EN on each RX queue if transmit flow-control is disabled. This allows the NIC to drop frames on the receive queue and not cause the MAC to block on receiving to _any_ queue. Tested: igb0@pci0:5:0:0: class=0x020000 card=0x152115d9 chip=0x15218086 rev=0x01 hdr=0x00 vendor = 'Intel Corporation' device = 'I350 Gigabit Network Connection' class = network subclass = ethernet Discussed with: Eric Joyner <eric.joyner@intel.com> MFC after: 1 week Sponsored by: Norse Corp, Inc.	2014-09-15 19:53:49 +00:00
Gleb Smirnoff	09a8241fc9	It is actually possible to have if_t a typedef to non-void type, and keep both converted to drvapi and non-converted drivers compilable. o Make if_t typedef to struct ifnet *. o Remove shim functions. Sponsored by: Netflix Sponsored by: Nginx, Inc.	2014-08-31 12:48:13 +00:00
Gleb Smirnoff	1bffa9511f	Use define from if_var.h to access a field inside struct if_data, that resides in struct ifnet. Sponsored by: Nginx, Inc.	2014-08-30 19:55:54 +00:00
Luigi Rizzo	4bf50f18eb	Update to the current version of netmap. Mostly bugfixes or features developed in the past 6 months, so this is a 10.1 candidate. Basically no user API changes (some bugfixes in sys/net/netmap_user.h). In detail: 1. netmap support for virtio-net, including in netmap mode. Under bhyve and with a netmap backend [2] we reach over 1Mpps with standard APIs (e.g. libpcap), and 5-8 Mpps in netmap mode. 2. (kernel) add support for multiple memory allocators, so we can better partition physical and virtual interfaces giving access to separate users. The most visible effect is one additional argument to the various kernel functions to compute buffer addresses. All netmap-supported drivers are affected, but changes are mechanical and trivial 3. (kernel) simplify the prototype for txsync() and rxsync() driver methods. All netmap drivers affected, changes mostly mechanical. 4. add support for netmap-monitor ports. Think of it as a mirroring port on a physical switch: a netmap monitor port replicates traffic present on the main port. Restrictions apply. Drive carefully. 5. if_lem.c: support for various paravirtualization features, experimental and disabled by default. Most of these are described in our ANCS'13 paper [1]. Paravirtualized support in netmap mode is new, and beats the numbers in the paper by a large factor (under qemu-kvm, we measured guest-host throughput up to 10-12 Mpps). A lot of refactoring and additional documentation in the files in sys/dev/netmap, but apart from #2 and #3 above, almost nothing of this stuff is visible to other kernel parts. Example programs in tools/tools/netmap have been updated with bugfixes and to support more of the existing features. This is meant to go into 10.1 so we plan an MFC before the Aug.22 deadline. A lot of this code has been contributed by my colleagues at UNIPI, including Giuseppe Lettieri, Vincenzo Maffione, Stefano Garzarella. MFC after: 3 days.	2014-08-16 15:00:01 +00:00
Adrian Chadd	fa4be7cc42	Fix the igb(4) redirection table to correctly populate. This is similar to the ixgbe(4) fix. Tested: * Intel I350 gigabit adapter	2014-07-23 05:40:28 +00:00
Hiren Panchasara	eee92ad073	The description is a bit misleading. Trying to make it more obvious. Phabric: https://phabric.freebsd.org/D435 Reviewed by: gnn	2014-07-18 16:25:35 +00:00
Rick Macklem	e2ade3b6f7	Move the "retry:" label so that the calls to m_pullup() are not done after the call to m_defrag(). This fixes a problem where m_pullup() would prepend an mbuf to the list created by m_defrag() making the chain greater than 32 again. Tested by: rcarter@pinyon.org Reviewed by: yongari, jfv MFC after: 2 weeks	2014-07-15 23:32:13 +00:00
Mark Johnston	58e6549541	Correct the setting of the VID in transmit descriptors when hardware VLAN tagging is enabled. This was broken in r266978. Reported by: gjb Tested by: gjb	2014-07-10 16:46:46 +00:00
Adrian Chadd	8c0d2adf3f	Initialise these variables so gcc doesn't complain. Submitted by: luigi	2014-06-30 23:34:36 +00:00
Adrian Chadd	1d72a9bea9	Add initial RSS awareness to the igb(4) driver. The igb(4) hardware is capable of RSS hashing RX packets and doing RSS queue selection for up to 8 queues. (I believe some hardware is limited to 4 queues, but I haven't tested on that.) However, even if multi-queue is enabled for igb(4), the RX path doesn't use the RSS flowid from the received descriptor. It just uses the MSIX queue id. This patch does a handful of things if RSS is enabled: * Instead of using a random key at boot, fetch the RSS key from the RSS code and program that in to the RSS redirection table. That whole chunk of code should be double checked for endian correctness. * Use the RSS queue mapping to CPU ID to figure out where to thread pin the RX swi thread and the taskqueue threads for each queue. * The software queue is now really an "RSS bucket". * When programming the RSS indirection table, use the RSS code to figure out which RSS bucket each slot in the indirection table maps to. * When transmitting, use the flowid RSS mapping if the mbuf has an RSS aware hash. The existing method wasn't guaranteed to align correctly with the destination RSS bucket (and thus CPU ID.) This code warns if the number of RSS buckets isn't the same as the automatically configured number of hardware queues. The administrator will have to tweak one of them for better performance. There's currently no way to re-balance the RSS indirection table after startup. I'll worry about that later. Additionally, it may be worthwhile to always use the full 32 bit flowid if multi-queue is enabled. It'll make things like lagg(4) behave better with respect to traffic distribution.	2014-06-30 04:34:59 +00:00
Hans Petter Selasky	af3b2549c4	Pull in r267961 and r267973 again. Fix for issues reported will follow.	2014-06-28 03:56:17 +00:00
Glen Barber	37a107a407	Revert r267961, r267973: These changes prevent sysctl(8) from returning proper output, such as: 1) no output from sysctl(8) 2) erroneously returning ENOMEM with tools like truss(1) or uname(1) truss: can not get etype: Cannot allocate memory	2014-06-27 22:05:21 +00:00
Hans Petter Selasky	3da1cf1e88	Extend the meaning of the CTLFLAG_TUN flag to automatically check if there is an environment variable which shall initialize the SYSCTL during early boot. This works for all SYSCTL types both statically and dynamically created ones, except for the SYSCTL NODE type and SYSCTLs which belong to VNETs. A new flag, CTLFLAG_NOFETCH, has been added to be used in the case a tunable sysctl has a custom initialisation function allowing the sysctl to still be marked as a tunable. The kernel SYSCTL API is mostly the same, with a few exceptions for some special operations like iterating children of a static/extern SYSCTL node. This operation should probably be made into a factored out common macro, hence some device drivers use this. The reason for changing the SYSCTL API was the need for a SYSCTL parent OID pointer and not only the SYSCTL parent OID list pointer in order to quickly generate the sysctl path. The motivation behind this patch is to avoid parameter loading kludges inside the OFED driver subsystem. Instead of adding special code to the OFED driver subsystem to post-load tunables into dynamically created sysctls, we generalize this in the kernel. Other changes: - Corrected a possibly incorrect sysctl name from "hw.cbb.intr_mask" to "hw.pcic.intr_mask". - Removed redundant TUNABLE statements throughout the kernel. - Some minor code rewrites in connection to removing not needed TUNABLE statements. - Added a missing SYSCTL_DECL(). - Wrapped two very long lines. - Avoid malloc()/free() inside sysctl string handling, in case it is called to initialize a sysctl from a tunable, hence malloc()/free() is not ready when sysctls from the sysctl dataset are registered. - Bumped FreeBSD version to indicate SYSCTL API change. MFC after: 2 weeks Sponsored by: Mellanox Technologies	2014-06-27 16:33:43 +00:00
Jack F Vogel	8cc64f1e21	Sync the E1000 shared code with Intel internal, this adds fixes, and more importantly, new I218 adapter support to the em driver. MFC after: 1 week	2014-06-26 21:33:32 +00:00
John Baldwin	46e89834dc	- Don't compare bus_dma map pointers for static DMA allocations against NULL to determine if bus_dmamap_unload() or bus_dmamem_free() should be called. Instead, check the associated bus and virtual addresses. - Don't clear static DMA maps to NULL. Reviewed by: jfv	2014-06-12 11:15:19 +00:00
Luigi Rizzo	c7156fe92f	make sure if_transmit returns 0 if the mbuf is enqueued. ixgbe/ixv.c still needs a similar fix but it takes a little more restructuring of the code. MFC after: 3 days	2014-06-06 20:49:56 +00:00
Marcel Moolenaar	9e11529015	Convert em(4) to use the driver API. Submitted by: Anuranjan Shukla <anshukla@juniper.net> Obtained from: Juniper Networks, Inc.	2014-06-02 18:52:03 +00:00
Luigi Rizzo	0d88706547	reference the correct variable in a comment MFC after: 3 days	2014-05-28 06:50:16 +00:00
Eitan Adler	eb0a187849	e1000: add missing braces Obtained from: DragonFlyBSD	2014-05-26 02:19:50 +00:00
George V. Neville-Neil	e1cda2b313	The timestamp bit is number 17, and not number 9, in the stat error field of the receive descriptor. MFC after: 1 week	2014-01-30 18:32:33 +00:00
Gleb Smirnoff	3dbdfe820b	Fix compilation with IGB_LEGACY_TX defined. PR: 185909 Submitted by: Aurelien Rougemont <beorn binaries.fr>	2014-01-25 20:39:23 +00:00
Luigi Rizzo	17885a7bfd	It is 2014 and we have a new version of netmap. Most relevant features: - netmap emulation on any NIC, even those without native netmap support. On the ixgbe we have measured about 4Mpps/core/queue in this mode, which is still a lot more than with sockets/bpf. - seamless interconnection of VALE switch, NICs and host stack. If you disable accelerations on your NIC (say em0) ifconfig em0 -txcsum -txcsum you can use the VALE switch to connect the NIC and the host stack: vale-ctl -h valeXX:em0 allowing sharing the NIC with other netmap clients. - THE USER API HAS SLIGHTLY CHANGED (head/cur/tail pointers instead of pointers/count as before). This was unavoidable to support, in the future, multiple threads operating on the same rings. Netmap clients require very small source code changes to compile again. On the plus side, the new API should be easier to understand and the internals are a lot simpler. The manual page has been updated extensively to reflect the current features and give some examples. This is the result of work of several people including Giuseppe Lettieri, Vincenzo Maffione, Michio Honda and myself, and has been financially supported by EU projects CHANGE and OPENLAB, from NetApp University Research Fund, NEC, and of course the Universita` di Pisa.	2014-01-06 12:53:15 +00:00
Luigi Rizzo	7091cd69d0	use the correct netmap <-> nic slot mapping on the transmit ring for 'lem'. This bug would manifest only in netmap mode and on packets transmitted after a NIC reset while netmap mode is active. MFC after: 3 days	2013-12-26 05:22:38 +00:00
Eitan Adler	7a22215c53	Fix undefined behavior: (1 << 31) is not defined as 1 is an int and this shifts into the sign bit. Instead use (1U << 31) which gets the expected result. This fix is not ideal as it assumes a 32 bit int, but does fix the issue for most cases. A similar change was made in OpenBSD. Discussed with: -arch, rdivacky Reviewed by: cperciva	2013-11-30 22:17:27 +00:00
Konstantin Belousov	d480f5b820	Fix several issues with the busdma(9) KPI use in the e1000 drivers. The problems do not affect bouncing busdma in a visible way, but are critical for the dmar backend. - The bus_dmamap_create(9) is not documented to take BUS_DMA_NOWAIT flag. - Unload descriptor map after receive. - Do not reset descriptor map to NULL, bus_dmamap_load(9) requires valid map, and also this leaks the map. Reported and tested by: pho Approved by: jfv Sponsored by: The FreeBSD Foundation MFC after: 2 weeks	2013-11-02 09:16:11 +00:00
Luigi Rizzo	ce3ee1e7c4	update to the latest netmap snapshot. This includes the following: - use separate memory regions for VALE ports - locking fixes - some simplifications in the NIC-specific routines - performance improvements for the VALE switch - some new features in the pkt-gen test program - documentation updates There are small API changes that require programs to be recompiled (NETMAP_API has been bumped so you will detect old binaries at runtime). In particular: - struct netmap_slot now is 16 bytes to support an extra pointer, which may save one data copy when using VALE ports or VMs; - the struct netmap_if has two extra fields; MFC after: 3 days	2013-11-01 21:21:14 +00:00
Gleb Smirnoff	76039bc84f	The r48589 promised to remove implicit inclusion of if_var.h soon. Prepare for this event by adding if_var.h to files that do need it. Also, include all includes that now are included due to implicit pollution via if_var.h Sponsored by: Netflix Sponsored by: Nginx, Inc.	2013-10-26 17:58:36 +00:00
Jack F Vogel	7609433eb6	Update the Intel igb driver to version 2.4.0 - This version has support for the new Intel Avoton systems, including 2.5Gb support, further it now has IPv6/TSO6 support as well. Shared code has been updated where necessary as well. Thanks to my new assistant Eric Joyner for doing the transmit path changes to bring in the IPv6/TSO6 support. Thanks to Gleb for catching the one bug and change needed in NETMAP. Approved by: re	2013-10-09 17:32:52 +00:00
Hiren Panchasara	5b9d734b08	Expose system level ixgbe sysctls. Device level sysctls are already exposed as dev.ix.<device> Fixing the case where number of queues for igb is auto-tuned and hw.igb.num_queues does not return current/updated value. Reviewed by: jfv Approved by: re (delphij) MFC after: 2 weeks	2013-10-05 19:17:56 +00:00
Andre Oppermann	1b4381afbb	Restructure the mbuf pkthdr to make it fit for upcoming capabilities and features. The changes in particular are: o Remove rarely used "header" pointer and replace it with a 64bit protocol/ layer specific union PH_loc for local use. Protocols can flexibly overlay their own 8 to 64 bit fields to store information while the packet is worked on. o Mechanically convert IP reassembly, IGMP/MLD and ATM to use pkthdr.PH_loc instead of pkthdr.header. o Extend csum_flags to 64bits to allow for additional future offload information to be carried (e.g. iSCSI, IPsec offload, and others). o Move the RSS hash type enumerator from abusing m_flags to its own 8bit rsstype field. Adjust accessor macros. o Add cosqos field to store Class of Service / Quality of Service information with the packet. It is not yet supported in any drivers but allows us to get on par with Cisco/Juniper in routing applications (plus MPLS QoS) with a modernized ALTQ. o Add four 8 bit fields l[2-5]hlen to store the relative header offsets from the start of the packet. This is important for various offload capabilities and to relieve the drivers from having to parse the packet and protocol headers to find out location of checksums and other information. Header parsing in drivers is a lot of copy-paste and unhandled corner cases which we want to avoid. o Add another flexible 64bit union to map various additional persistent packet information, like ether_vtag, tso_segsz and csum fields. Depending on the csum_flags settings some fields may have different usage making it very flexible and adaptable to future capabilities. o Restructure the CSUM flags to better signify their outbound (down the stack) and inbound (up the stack) use. The CSUM flags used to be a bit chaotic and rather poorly documented leading to incorrect use in many places. Bring clarity into their use through better naming. Compatibility mappings are provided to preserve the API. 
The drivers can be corrected one by one and MFC'd without issue. o The size of pkthdr stays the same at 48/56bytes (32/64bit architectures). Sponsored by: The FreeBSD Foundation	2013-08-24 19:51:18 +00:00
Jack F Vogel	83cef45266	Alter the mq_start routine to do a TRYLOCK and call to the locked routine rather than just queueing. The former code was an attempt at getting UDP performance up, but there have been customer reports of problems with it, so the ixgbe approach seems the best solution for now.	2013-08-13 00:25:39 +00:00
Scott Long	c68534f1d5	Update PCI drivers to no longer look at the MEMIO-enabled bit in the PCI command register. The lazy BAR allocation code in FreeBSD sometimes disables this bit when it detects a range conflict, and will re-enable it on demand when a driver allocates the BAR. Thus, the bit is no longer a reliable indication of capability, and should not be checked. This results in the elimination of a lot of code from drivers, and also gives the opportunity to simplify a lot of drivers to use a helper API to set the busmaster enable bit. This change fixes some recent reports of disk controllers and their associated drives/enclosures disappearing during boot. Submitted by: jhb Reviewed by: jfv, marius, achadd, achim MFC after: 1 day	2013-08-12 23:30:01 +00:00
Jack F Vogel	4dc63104ae	Improve the MSIX setup code in the drivers, thanks to Marius for the changes. Make sure that pci_alloc_msix() does give us the vectors we need and fall back to MSI when it doesn't, also release any that were allocated when insufficient. MFC after: 3 days	2013-08-12 22:54:38 +00:00
Jack F Vogel	d0913b7f25	Make the various driver MSIX setup routines fallback to MSI more gracefully. This change was suggested by Marius Strobl, thank you. PR: kern/181016 MFC after: ASAP	2013-08-06 21:01:38 +00:00
Jack F Vogel	54a6317360	When the igb driver is static there are cases when early interrupts occur, resulting in a panic in refresh_mbufs, to prevent this add a check in the interrupt handler for DRV_RUNNING. MFC after: 1 day (critical for 9.2)	2013-08-06 18:00:53 +00:00
Jack F Vogel	a1db87ec73	Change the E1000 driver option header handling to match the ixgbe driver. As it was, when building them as a module INET and INET6 are not defined. In these drivers it does not cause a panic, however it does result in different behavior in the ioctl routine when you are using a module vs static, and I think the behavior should be the same. MFC after: 3 days	2013-07-12 22:36:26 +00:00
Luigi Rizzo	4dc07530d7	if_lem.c: make sure that lem_rxeof() can drain the entire rx queue irrespective of the setting of lem_rx_process_limit, while giving a chance to the taskqueue scheduler to act after each chunk. This makes lem_rxeof similar to the one in if_em.c and if_igb.c . if_lem.c and if_em.c: add a sysctl to manually configure the 'itr' moderation register. Approved by: Jack Vogel	2013-05-09 17:07:30 +00:00
Luigi Rizzo	1405478115	simplify the code to initialize the RDT while in netmap mode.	2013-05-09 16:57:02 +00:00
Eitan Adler	f7efb9e28e	Update Intel email address. PR: docs/175349 Submitted by: Lars Eggert <lars@netapp.com> Discussed with: jfv	2013-05-02 01:36:52 +00:00
Luigi Rizzo	9b2e4517d5	use netmap_rx_irq() and netmap_tx_irq() instead of replicating the logic in the individual driver.	2013-04-30 16:51:58 +00:00
Luigi Rizzo	d61ba75247	use netmap_rx_irq() / netmap_tx_irq() to handle interrupts in netmap mode, removing the logic from individual drivers. (note: if_lem.c not updated yet due to some other pending modifications)	2013-04-30 16:18:29 +00:00
Jack F Vogel	386c110e3c	Corrections to the RX checksum code, make sure it's disabled as well as enabled when necessary. And simplify the checksum routine itself, adding UDP bit to the test. Thanks to Kevin Lo for pointing out the problems and code suggestions.	2013-04-15 17:01:42 +00:00
Jack F Vogel	f0105d2d23	Simplify allocate_legacy code, txr pointer was breaking LEGACY compile, thanks to Nick Rogers for pointing this out.	2013-04-10 17:51:39 +00:00
Jack F Vogel	3b0b7ffbb9	Correct the multicast handling in the E1000 drivers as was done in ixgbe, thanks to Mike Karels for this fix. When exiting promiscuous mode MPE bit was being unconditionally cleared, this should not be done if we are in MAX multicast groups.	2013-04-03 23:39:54 +00:00
Sean Bruno	8e3ff376cf	Update man page for igb(4) with a little bit of information about hw.igb.num_queues for those so inclined. PR: kern/177384 Submitted by: hiren.panchasara@gmail.com Reviewed by: sbruno@ Approved by: jfv@ Obtained from: Yahoo! Inc. MFC after: 2 weeks	2013-04-03 21:55:19 +00:00
Jack F Vogel	be2095895a	Change the define in the header to eliminate unnecessary data when using LEGACY TX.	2013-03-29 18:46:13 +00:00
Jack F Vogel	c05891a6da	Change defines in the igb driver to allow an easier selection of the older if_start/non-multiqueue interface from the stack. This is not the default, but can be turned on in the Makefile now regardless of the OS level to allow either testing or use of ALTQ. MFC after: one week	2013-03-29 18:25:45 +00:00
Jack F Vogel	6ab6bfe32f	Refresh on the shared code for the E1000 drivers. - bear with me, there are lots of white space changes, I would not do them, but I am a mere consumer of this stuff and if these drivers are to stay in shape they need to be taken. em driver changes: support for the new i217/i218 interfaces igb driver changes: - TX mq start has a quick turnaround to the stack - Link/media handling improvement - When link status changes happen the current flow control state will now be displayed. - A few white space/style changes. lem driver changes: - the shared code uncovered a bogus write to the RLPML register (which does not exist in this hardware) in the vlan code,this is removed.	2013-02-21 00:25:45 +00:00
Randall Stewart	ded5ea6a25	This fixes an out-of-order problem with several of the newer drivers. The basic problem was that the driver was pulling the mbuf off the drbr ring and then, when sending with xmit(), encountering a full transmit ring. Thus the lower layer xmit() function would return an error, and the drivers would then append the data back on to the ring. For TCP this is a horrible scenario sure to bring on a fast-retransmit. The fix is to use drbr_peek() to pull the data pointer but not remove it from the ring. If it fails then we either call the new drbr_putback or drbr_advance method. Advance moves it forward (we do this sometimes when the xmit() function frees the mbuf). When we succeed we always call advance. The putback will always copy the mbuf back to the top of the ring. Note that the putback cannot be used with a drbr_dequeue(), only with drbr_peek(). Most of the time, in putback, we would not need to copy it back since most likely the mbuf is still the same, but sometimes xmit() functions will change the mbuf via a pullup or other call. So the optimal case for the single consumer is to always copy it back. If we ever do a multiple_consumer (for lagg?) we will need a test and atomic in the putback, possibly a separate putback_mc() in the ring buf. Reviewed by: jhb@freebsd.org, jlv@freebsd.org	2013-02-07 15:20:54 +00:00
Sofian Brabez	61bfd86762	Use DEVMETHOD_END macro defined in sys/bus.h instead of {0, 0} sentinel on device_method_t arrays Reviewed by: cognet Approved by: cognet	2013-01-30 18:01:20 +00:00
Steven Hartland	31e85bd9cd	Fixed mbuf free when receive structures fail to allocate. This prevents quad igb card on high core machines, without any nmbcluster or igb queue tuning wedging the boot process if all nics are configured. Reviewed by: jfv Approved by: pjd (mentor) MFC after: 1 week	2013-01-12 16:05:55 +00:00
Gleb Smirnoff	c6499eccad	Mechanically substitute flags from historic mbuf allocator with malloc(9) flags in sys/dev.	2012-12-04 09:32:43 +00:00
Gleb Smirnoff	9c402aeb41	drbr_enqueue() always consumes mbuf, whether or not it fails. The mbuf pointer is no longer valid, so can't be reused after. Fix igb_mq_start() where mbuf pointer was used after drbr_enqueue(). This eventually leads us to all invocations of igb_mq_start_locked() called with third argument as NULL. This allows us to simplify this function. Submitted by: Karim Fodil-Lemelin <fodillemlinkarim gmail.com> Reviewed by: jfv	2012-11-26 20:03:57 +00:00
Eitan Adler	2da1951583	Now that device disabling is generic, remove extraneous code from the device drivers that used to provide this feature. This is a subset of 241856 (which was reverted) Reviewed by: des Approved by: cperciva (implicit) MFC after: 1 week	2012-10-22 22:29:48 +00:00
Eitan Adler	a8de37b024	This isn't functionally identical. In some cases a hint to disable unit 0 would in fact disable all units. This reverts r241856 Approved by: cperciva (implicit)	2012-10-22 13:06:09 +00:00
Eitan Adler	76b7512247	Now that device disabling is generic, remove extraneous code from the device drivers that used to provide this feature. Reviewed by: des Approved by: cperciva MFC after: 1 week	2012-10-22 03:41:14 +00:00
Eitan Adler	db702c59cf	remove duplicate semicolons where possible. Approved by: cperciva MFC after: 1 week	2012-10-22 03:00:37 +00:00
Gleb Smirnoff	063efed28c	The drbr(9) API appeared to be so unclear that most drivers in tree used it incorrectly, which led to inaccurate, overrated if_obytes accounting. The drbr(9) used to update ifnet stats on drbr_enqueue(), which is not accurate since enqueuing doesn't imply successful processing by driver. Neither does dequeuing. Most drivers also called drbr_stats_update() which did accounting again, leading to doubled if_obytes statistics. And in case of severe transmitting, when a packet could be several times enqueued and dequeued it could have been accounted several times. o Thus, make drbr(9) API thinner. Now drbr(9) merely chooses between ALTQ queueing or buf_ring(9) queueing. - It doesn't touch the buf_ring stats any more. - It doesn't touch ifnet stats anymore. - drbr_stats_update() no longer exists. o buf_ring(9) handles its stats itself: - It handles br_drops itself. - br_prod_bytes stats are dropped. Rationale: no one ever reads them but update of a common counter on every packet negatively affects performance due to excessive cache invalidation. - buf_ring_enqueue_bytes() reduced to buf_ring_enqueue(), since we no longer account bytes. o Drivers handle their stats themselves: if_obytes, if_omcasts. o mlx4(4), igb(4), em(4), vxge(4), oce(4) and ixv(4) no longer use drbr_stats_update(), and update ifnet stats themselves. o bxe(4) was the most correct driver, it didn't call drbr_stats_update(), thus it was the only driver accurate under moderate load. Now it also maintains stats itself. o ixgbe(4) had already taken stats from hardware, so just - drop software stats updating. - take multicast packet count from hardware as well. o mxge(4) just no longer needs NO_SLOW_STATS define. o cxgb(4), cxgbe(4) need no change, since they obtain stats from hardware. Reviewed by: jfv, gnn	2012-09-28 18:28:27 +00:00
John Baldwin	aceb040376	Merge similar fixes from 223198 from igb to ixgbe: - Use a dedicated task to handle deferred transmits from the if_transmit method instead of reusing the existing per-queue interrupt task. Reusing the per-queue interrupt task could result in both an interrupt thread and the taskqueue thread trying to handle received packets on a single queue resulting in out-of-order packet processing and lock contention. - Don't define ixgbe_start() at all where if_transmit is used. Tested by: Vijay Singh Reviewed by: jfv MFC after: 2 weeks	2012-09-26 18:11:43 +00:00
Sean Bruno	126a39ce60	This patch fixes a nit in the em, lem, and igb driver statistics. Increment adapter->dropped_pkts instead of if_ierrors because if_ierrors is overwritten by hw stats collection. Submitted by: Andrew Boyer <aboyer@averesystems.com> Reviewed by: Jack F Vogel <jfv@freebsd.org> MFC after: 2 weeks	2012-09-23 22:53:39 +00:00
Gavin Atkinson	e935190a33	Switch some PCI register reads from using magic numbers to using the names defined in pcireg.h MFC after: 1 week	2012-09-19 12:27:23 +00:00
Gavin Atkinson	389c8bd51e	Align the PCI Express #defines with the style used for the PCI-X #defines. This also has the advantage that it makes the names more compact, and also allows us to correct the non-uniform naming of the PCIM_LINK_* defines, making them all consistent amongst themselves. This is a mostly mechanical rename: s/PCIR_EXPRESS_/PCIER_/g s/PCIM_EXP_/PCIEM_/g s/PCIM_LINK_/PCIEM_LINK_/g When this is MFC'd, #defines will be added for the old names to assist out-of-tree drivers. Discussed with: jhb MFC after: 1 week	2012-09-18 22:04:59 +00:00
Eitan Adler	96240c89f0	Correct double "the the" Approved by: cperciva MFC after: 3 days	2012-09-14 21:28:56 +00:00
Jack F Vogel	252781f47d	Customer report of a panic on boot due to the old "m_getjcl:invalid cluster type" that occurred some time back with the igb driver. This happens often when booting over the net. I believe the NIC hardware is left in a warm state when handed over to the driver, and a stray RX interrupt happens earlier than the code is prepared for it to happen. This change was verified to fix the problem, it's kind of a band-aid... but it is similar to what was done in the igb code.	2012-08-15 17:12:40 +00:00
Jack F Vogel	724f79462b	Make the polling interface in igb able to handle multiqueue, and correct the rxdone handling. Update the polling man page to include igb as well. Thanks to Mark Johnston for these changes.	2012-08-06 22:43:49 +00:00
Jack F Vogel	6aa4d618ca	Correct the mq_start routine to avoid out-of-order packet delivery, always enqueue when possible. Also correct the DEPLETED test as multiple bits might be set. Thanks to Randall Stewart for the changes!	2012-08-06 20:44:05 +00:00
Sean Bruno	8844c80848	CPU_NEXT() already handles wrapping around to the beginning. Also, in a system with sparse CPU IDs, you can have a valid CPU ID > mp_ncpus (e.g. if you have two CPUs 0 and 4, with mp_maxid == 4 and mp_ncpus == 2). Introduced at svn r235210 Submitted by: jhb@ Reviewed by: jfv@	2012-08-02 00:00:34 +00:00
Jack F Vogel	b4750260cd	Clean up some unused leftover code from em Make IRQ style a tuneable Fix lock handling in the interrupt handler MFC after: 3 days	2012-07-31 18:44:10 +00:00
Luigi Rizzo	fc1fa1f2fe	remove some extra testing code that slipped into the previous commit Reported-by: Alexander Motin	2012-07-25 12:51:33 +00:00

323 Commits