freebsd-nq

Author	SHA1	Message	Date
Luigi Rizzo	c7156fe92f	make sure if_transmit returns 0 if the mbuf is enqueued. ixgbe/ixv.c still needs a similar fix but it takes a little more restructuring of the code. MFC after: 3 days	2014-06-06 20:49:56 +00:00
Luigi Rizzo	0d88706547	reference the correct variable in a comment MFC after: 3 days	2014-05-28 06:50:16 +00:00
Gleb Smirnoff	3dbdfe820b	Fix compilation with IGB_LEGACY_TX defined. PR: 185909 Submitted by: Aurelien Rougemont <beorn binaries.fr>	2014-01-25 20:39:23 +00:00
Luigi Rizzo	17885a7bfd	It is 2014 and we have a new version of netmap. Most relevant features: - netmap emulation on any NIC, even those without native netmap support. On the ixgbe we have measured about 4Mpps/core/queue in this mode, which is still a lot more than with sockets/bpf. - seamless interconnection of VALE switch, NICs and host stack. If you disable accelerations on your NIC (say em0) ifconfig em0 -txcsum -txcsum you can use the VALE switch to connect the NIC and the host stack: vale-ctl -h valeXX:em0 allowing sharing the NIC with other netmap clients. - THE USER API HAS SLIGHTLY CHANGED (head/cur/tail pointers instead of pointers/count as before). This was unavoidable to support, in the future, multiple threads operating on the same rings. Netmap clients require very small source code changes to compile again. On the plus side, the new API should be easier to understand and the internals are a lot simpler. The manual page has been updated extensively to reflect the current features and give some examples. This is the result of work of several people including Giuseppe Lettieri, Vincenzo Maffione, Michio Honda and myself, and has been financially supported by EU projects CHANGE and OPENLAB, from NetApp University Research Fund, NEC, and of course the Universita` di Pisa.	2014-01-06 12:53:15 +00:00
Konstantin Belousov	d480f5b820	Fix several issues with the busdma(9) KPI use in the e1000 drivers. The problems do not affect bouncing busdma in a visible way, but are critical for the dmar backend. - The bus_dmamap_create(9) is not documented to take BUS_DMA_NOWAIT flag. - Unload descriptor map after receive. - Do not reset descriptor map to NULL, bus_dmamap_load(9) requires valid map, and also this leaks the map. Reported and tested by: pho Approved by: jfv Sponsored by: The FreeBSD Foundation MFC after: 2 weeks	2013-11-02 09:16:11 +00:00
Luigi Rizzo	ce3ee1e7c4	update to the latest netmap snapshot. This includes the following: - use separate memory regions for VALE ports - locking fixes - some simplifications in the NIC-specific routines - performance improvements for the VALE switch - some new features in the pkt-gen test program - documentation updates There are small API changes that require programs to be recompiled (NETMAP_API has been bumped so you will detect old binaries at runtime). In particular: - struct netmap_slot now is 16 bytes to support an extra pointer, which may save one data copy when using VALE ports or VMs; - the struct netmap_if has two extra fields; MFC after: 3 days	2013-11-01 21:21:14 +00:00
Gleb Smirnoff	76039bc84f	The r48589 promised to remove implicit inclusion of if_var.h soon. Prepare to this event, adding if_var.h to files that do need it. Also, include all includes that now are included due to implicit pollution via if_var.h Sponsored by: Netflix Sponsored by: Nginx, Inc.	2013-10-26 17:58:36 +00:00
Jack F Vogel	7609433eb6	Update the Intel igb driver to version 2.4.0 - This version has support for the new Intel Avoton systems, including 2.5Gb support, further it now has IPv6/TSO6 support as well. Shared code has been updated where necessary as well. Thanks to my new assistant Eric Joyner for doing the transmit path changes to bring in the IPv6/TSO6 support. Thanks to Gleb for catching the one bug and change needed in NETMAP. Approved by: re	2013-10-09 17:32:52 +00:00
Hiren Panchasara	5b9d734b08	Expose system level ixgbe sysctls. Device level sysctls are already exposed as dev.ix.<device> Fixing the case where number of queues for igb is auto-tuned and hw.igb.num_queues does not return current/updated value. Reviewed by: jfv Approved by: re (delphij) MFC after: 2 weeks	2013-10-05 19:17:56 +00:00
Andre Oppermann	1b4381afbb	Restructure the mbuf pkthdr to make it fit for upcoming capabilities and features. The changes in particular are: o Remove rarely used "header" pointer and replace it with a 64bit protocol/ layer specific union PH_loc for local use. Protocols can flexibly overlay their own 8 to 64 bit fields to store information while the packet is worked on. o Mechanically convert IP reassembly, IGMP/MLD and ATM to use pkthdr.PH_loc instead of pkthdr.header. o Extend csum_flags to 64bits to allow for additional future offload information to be carried (e.g. iSCSI, IPsec offload, and others). o Move the RSS hash type enumerator from abusing m_flags to its own 8bit rsstype field. Adjust accessor macros. o Add cosqos field to store Class of Service / Quality of Service information with the packet. It is not yet supported in any drivers but allows us to get on par with Cisco/Juniper in routing applications (plus MPLS QoS) with a modernized ALTQ. o Add four 8 bit fields l[2-5]hlen to store the relative header offsets from the start of the packet. This is important for various offload capabilities and to relieve the drivers from having to parse the packet and protocol headers to find out location of checksums and other information. Header parsing in drivers is a lot of copy-paste and unhandled corner cases which we want to avoid. o Add another flexible 64bit union to map various additional persistent packet information, like ether_vtag, tso_segsz and csum fields. Depending on the csum_flags settings some fields may have different usage making it very flexible and adaptable to future capabilities. o Restructure the CSUM flags to better signify their outbound (down the stack) and inbound (up the stack) use. The CSUM flags used to be a bit chaotic and rather poorly documented leading to incorrect use in many places. Bring clarity into their use through better naming. Compatibility mappings are provided to preserve the API. The drivers can be corrected one by one and MFC'd without issue. o The size of pkthdr stays the same at 48/56bytes (32/64bit architectures). Sponsored by: The FreeBSD Foundation	2013-08-24 19:51:18 +00:00
Jack F Vogel	83cef45266	Alter the mq_start routine to do a TRYLOCK and call to the locked routine rather than just queueing. The former code was an attempt at getting UDP performance up, but there have been customer reports of problems with it, so the ixgbe approach seems the best solution for now.	2013-08-13 00:25:39 +00:00
Scott Long	c68534f1d5	Update PCI drivers to no longer look at the MEMIO-enabled bit in the PCI command register. The lazy BAR allocation code in FreeBSD sometimes disables this bit when it detects a range conflict, and will re-enable it on demand when a driver allocates the BAR. Thus, the bit is no longer a reliable indication of capability, and should not be checked. This results in the elimination of a lot of code from drivers, and also gives the opportunity to simplify a lot of drivers to use a helper API to set the busmaster enable bit. This change fixes some recent reports of disk controllers and their associated drives/enclosures disappearing during boot. Submitted by: jhb Reviewed by: jfv, marius, achadd, achim MFC after: 1 day	2013-08-12 23:30:01 +00:00
Jack F Vogel	4dc63104ae	Improve the MSIX setup code in the drivers, thanks to Marius for the changes. Make sure that pci_alloc_msix() does give us the vectors we need and fall back to MSI when it doesn't, also release any that were allocated when insufficient. MFC after: 3 days	2013-08-12 22:54:38 +00:00
Jack F Vogel	d0913b7f25	Make the various driver MSIX setup routines fallback to MSI more gracefully. This change was suggested by Marius Strobl, thank you. PR: kern/181016 MFC after: ASAP	2013-08-06 21:01:38 +00:00
Jack F Vogel	54a6317360	When the igb driver is static there are cases when early interrupts occur, resulting in a panic in refresh_mbufs, to prevent this add a check in the interrupt handler for DRV_RUNNING. MFC after: 1 day (critical for 9.2)	2013-08-06 18:00:53 +00:00
Jack F Vogel	a1db87ec73	Change the E1000 driver option header handling to match the ixgbe driver. As it was, when building them as a module INET and INET6 are not defined. In these drivers it does not cause a panic, however it does result in different behavior in the ioctl routine when you are using a module vs static, and I think the behavior should be the same. MFC after: 3 days	2013-07-12 22:36:26 +00:00
Luigi Rizzo	d61ba75247	use netmap_rx_irq() / netmap_tx_irq() to handle interrupts in netmap mode, removing the logic from individual drivers. (note: if_lem.c not updated yet due to some other pending modifications)	2013-04-30 16:18:29 +00:00
Jack F Vogel	f0105d2d23	Simplify allocate_legacy code, txr pointer was breaking LEGACY compile, thanks to Nick Rogers for pointing this out.	2013-04-10 17:51:39 +00:00
Jack F Vogel	3b0b7ffbb9	Correct the multicast handling in the E1000 drivers as was done in ixgbe, thanks to Mike Karels for this fix. When exiting promiscuous mode MPE bit was being unconditionally cleared, this should not be done if we are in MAX multicast groups.	2013-04-03 23:39:54 +00:00
Sean Bruno	8e3ff376cf	Update man page for igb(4) with a little bit of information about hw.igb.num_queues for those so inclined. PR: kern/177384 Submitted by: hiren.panchasara@gmail.com Reviewed by: sbruno@ Approved by: jfv@ Obtained from: Yahoo! Inc. MFC after: 2 weeks	2013-04-03 21:55:19 +00:00
Jack F Vogel	c05891a6da	Change defines in the igb driver to allow an easier selection of the older if_start/non-multiqueue interface from the stack. This is not the default, but can be turned on in the Makefile now regardless of the OS level to allow either testing or use of ALTQ. MFC after: one week	2013-03-29 18:25:45 +00:00
Jack F Vogel	6ab6bfe32f	Refresh on the shared code for the E1000 drivers. - bear with me, there are lots of white space changes, I would not do them, but I am a mere consumer of this stuff and if these drivers are to stay in shape they need to be taken. em driver changes: support for the new i217/i218 interfaces igb driver changes: - TX mq start has a quick turnaround to the stack - Link/media handling improvement - When link status changes happen the current flow control state will now be displayed. - A few white space/style changes. lem driver changes: - the shared code uncovered a bogus write to the RLPML register (which does not exist in this hardware) in the vlan code, this is removed.	2013-02-21 00:25:45 +00:00
Randall Stewart	ded5ea6a25	This fixes an out-of-order problem with several of the newer drivers. The basic problem was that the driver was pulling the mbuf off the drbr ring and then when sending with xmit(), encountering a full transmit ring. Thus the lower layer xmit() function would return an error, and the drivers would then append the data back on to the ring. For TCP this is a horrible scenario sure to bring on a fast-retransmit. The fix is to use drbr_peek() to pull the data pointer but not remove it from the ring. If it fails then we either call the new drbr_putback or drbr_advance method. Advance moves it forward (we do this sometimes when the xmit() function frees the mbuf). When we succeed we always call advance. The putback will always copy the mbuf back to the top of the ring. Note that the putback cannot be used with a drbr_dequeue() only with drbr_peek(). We most of the time, in putback, would not need to copy it back since most likely the mbuf is still the same, but sometimes xmit() functions will change the mbuf via a pullup or other call. So the optimal case for the single consumer is to always copy it back. If we ever do a multiple_consumer (for lagg?) we will need a test and atomic in the put back possibly a separate putback_mc() in the ring buf. Reviewed by: jhb@freebsd.org, jlv@freebsd.org	2013-02-07 15:20:54 +00:00
Sofian Brabez	61bfd86762	Use DEVMETHOD_END macro defined in sys/bus.h instead of {0, 0} sentinel on device_method_t arrays Reviewed by: cognet Approved by: cognet	2013-01-30 18:01:20 +00:00
Steven Hartland	31e85bd9cd	Fixed mbuf free when receive structures fail to allocate. This prevents quad igb card on high core machines, without any nmbcluster or igb queue tuning wedging the boot process if all nics are configured. Reviewed by: jfv Approved by: pjd (mentor) MFC after: 1 week	2013-01-12 16:05:55 +00:00
Gleb Smirnoff	c6499eccad	Mechanically substitute flags from historic mbuf allocator with malloc(9) flags in sys/dev.	2012-12-04 09:32:43 +00:00
Gleb Smirnoff	9c402aeb41	drbr_enqueue() always consumes the mbuf, whether it failed or not. The mbuf pointer is no longer valid, so can't be reused after. Fix igb_mq_start() where mbuf pointer was used after drbr_enqueue(). This eventually leads us to all invocations of igb_mq_start_locked() called with third argument as NULL. This allows us to simplify this function. Submitted by: Karim Fodil-Lemelin <fodillemlinkarim gmail.com> Reviewed by: jfv	2012-11-26 20:03:57 +00:00
Eitan Adler	a8de37b024	This isn't functionally identical. In some cases a hint to disable unit 0 would in fact disable all units. This reverts r241856 Approved by: cperciva (implicit)	2012-10-22 13:06:09 +00:00
Eitan Adler	76b7512247	Now that device disabling is generic, remove extraneous code from the device drivers that used to provide this feature. Reviewed by: des Approved by: cperciva MFC after: 1 week	2012-10-22 03:41:14 +00:00
Gleb Smirnoff	063efed28c	The drbr(9) API appeared to be so unclear, that most drivers in tree used it incorrectly, which led to inaccurate overrated if_obytes accounting. The drbr(9) used to update ifnet stats on drbr_enqueue(), which is not accurate since enqueuing doesn't imply successful processing by driver. Dequeuing doesn't imply that either. Most drivers also called drbr_stats_update() which did accounting again, leading to doubled if_obytes statistics. And in case of severe transmitting, when a packet could be several times enqueued and dequeued it could have been accounted several times. o Thus, make drbr(9) API thinner. Now drbr(9) merely chooses between ALTQ queueing or buf_ring(9) queueing. - It doesn't touch the buf_ring stats any more. - It doesn't touch ifnet stats anymore. - drbr_stats_update() no longer exists. o buf_ring(9) handles its stats itself: - It handles br_drops itself. - br_prod_bytes stats are dropped. Rationale: no one ever reads them but update of a common counter on every packet negatively affects performance due to excessive cache invalidation. - buf_ring_enqueue_bytes() reduced to buf_ring_enqueue(), since we no longer account bytes. o Drivers handle their stats themselves: if_obytes, if_omcasts. o mlx4(4), igb(4), em(4), vxge(4), oce(4) and ixv(4) no longer use drbr_stats_update(), and update ifnet stats themselves. o bxe(4) was the most correct driver, it didn't call drbr_stats_update(), thus it was the only driver accurate under moderate load. Now it also maintains stats itself. o ixgbe(4) had already taken stats from hardware, so just - drop software stats updating. - take multicast packet count from hardware as well. o mxge(4) just no longer needs NO_SLOW_STATS define. o cxgb(4), cxgbe(4) need no change, since they obtain stats from hardware. Reviewed by: jfv, gnn	2012-09-28 18:28:27 +00:00
Sean Bruno	126a39ce60	This patch fixes a nit in the em, lem, and igb driver statistics. Increment adapter->dropped_pkts instead of if_ierrors because if_ierrors is overwritten by hw stats collection. Submitted by: Andrew Boyer <aboyer@averesystems.com> Reviewed by: Jack F Vogel <jfv@freebsd.org> MFC after: 2 weeks	2012-09-23 22:53:39 +00:00
Jack F Vogel	724f79462b	Make the polling interface in igb able to handle multiqueue, and correct the rxdone handling. Update the polling man page to include igb as well. Thanks to Mark Johnston for these changes.	2012-08-06 22:43:49 +00:00
Jack F Vogel	6aa4d618ca	Correct the mq_start routine to avoid out-of-order packet delivery, always enqueue when possible. Also correct the DEPLETED test as multiple bits might be set. Thanks to Randall Stewart for the changes!	2012-08-06 20:44:05 +00:00
Sean Bruno	8844c80848	CPU_NEXT() already handles wrapping around to the beginning. Also, in a system with sparse CPU IDs, you can have a valid CPU ID > mp_ncpus (e.g. if you have two CPUs 0 and 4, with mp_maxid == 4 and mp_ncpus == 2). Introduced at svn r235210 Submitted by: jhb@ Reviewed by: jfv@	2012-08-02 00:00:34 +00:00
Jack F Vogel	fcc144ad4e	Change the interface to the Energy Efficient Ethernet (EEE) setting in the igb and em driver. This was necessitated by a shared code change that I was given late in the game, a data type changed from bool to int, in the last update I dealt with it by a cast, but it was pointed out (thanks jhb) that there was a potential problem with this. John suggested this safer approach, and it is fine with me... MFC after:2 days (to catch the 9.1 update)	2012-07-07 20:21:05 +00:00
Jack F Vogel	996922aeee	Correct small regressions pointed out by jhb, thanks John. MFC after:5 days	2012-07-05 23:36:17 +00:00
Jack F Vogel	ab5d036272	Sync with Intel internal source: shared code update and small changes in core required Add support for new i210/i211 devices Improve queue calculation based on mac type MFC after:5 days	2012-07-05 20:26:57 +00:00
John Baldwin	03b0ca8b28	Commit a portion of 233708 I missed earlier and don't include the definition of igb_start() and igb_start_locked() (nor set if_start in the ifnet) when igb(4) uses if_transmit.	2012-06-01 15:52:41 +00:00
Sean Bruno	daf8162d1f	Modify the binding of queues to attach to as many CPUs as possible when using more than one igb(4) adapter. This means that queues will not be bound to the same CPUs if there are more CPUs available. This is only applicable to a system that has multiple interfaces. Obtained from: Yahoo! Inc. MFC after: 3 days	2012-05-10 00:00:28 +00:00
John Baldwin	8546e82467	Reapply r223198 which was reverted in the previous vendor import. Some portions were already reapplied in r233708: - Use a dedicated task to handle deferred transmits from the if_transmit method instead of reusing the existing per-queue interrupt task. Reusing the per-queue interrupt task could result in both an interrupt thread and the taskqueue thread trying to handle received packets on a single queue resulting in out-of-order packet processing. - Call ether_ifdetach() earlier in igb_detach(). - Drain tasks and free taskqueues during igb_detach(). MFC after: 1 week	2012-04-11 21:33:45 +00:00
John Baldwin	d8a8648379	Fix a few issues with transmit handling in em(4) and igb(4): - Do not define the foo_start() methods or set if_start in the ifnet if multiq transmit is enabled. Also, set if_transmit and if_qflush before ether_ifattach rather than after when multiq transmit is enabled. This helps to ensure that the drivers never try to mix different transmit methods. - Properly restart transmit during resume. igb(4) was not restarting it at all, and em(4) was restarting even if the link was down and was calling the wrong method if multiq transmit was enabled. - Remove all the 'more' handling for transmit completions. Transmit completion processing does not have a processing limit, so it always runs to completion and never has more work to do when it returns. Instead, the previous code was returning 'true' anytime there were packets in the queue that weren't still in the process of being transmitted. The effect was that the driver would continuously reschedule a task to process TX completions in effect running at 100% CPU polling the hardware until it finished transmitting all of the packets in the ring. Now it will just wait for the next TX completion interrupt. - Restart packet transmission when the link becomes active. - Fix the MSI-X queue interrupt handlers to restart packet transmission if there are pending packets in the relevant software queue (IFQ or buf_ring) after processing TX completions. This is the root cause for the OACTIVE hangs as if the MSI-X queue handler drained all the pending packets from the TX ring, nothing would ever restart it. As such, remove some previously-added workarounds to reschedule a task to poll the TX ring anytime OACTIVE was set. Tested by: sbruno Reviewed by: jfv MFC after: 1 week	2012-03-30 19:54:48 +00:00
John Baldwin	c3173381be	Properly handle failures in igb_setup_msix() by returning 0 if MSI or MSI-X allocation fails. Reviewed by: jfv MFC after: 2 weeks	2012-03-01 22:13:10 +00:00
Luigi Rizzo	64ae02c365	A bunch of netmap fixes: USERSPACE: 1. add support for devices with different number of rx and tx queues; 2. add better support for zero-copy operation, adding an extra field to the netmap ring to indicate how many buffers we have already processed but not yet released (with help from Eddie Kohler); 3. The two changes above unfortunately require an API change, so while at it add a version field and some spares to the ioctl() argument to help detect mismatches. 4. update the manual page for the two changes above; 5. update sample applications in tools/tools/netmap KERNEL: 1. simplify the internal structures moving the global wait queues to the 'struct netmap_adapter'; 2. simplify the functions that map kring<->nic ring indexes 3. normalize device-specific code, helps maintenance; 4. start exploring the impact of micro-optimizations (prefetch etc.) in the ixgbe driver. Use 'legacy' descriptors on the tx ring and prefetch slots gives about 20% speedup at 900 MHz. Another 7-10% would come from removing the explicit calls to bus_dmamap* in the core (they are effectively NOPs in this case, but it takes expensive load of the per-buffer dma maps to figure out that they are all NULL). Rx performance not investigated. I am postponing the MFC so i can import a few more improvements before merging.	2012-02-27 19:05:01 +00:00
Luigi Rizzo	5644ccec61	(This commit only touches code within the DEV_NETMAP blocks) Introduce some functions to map NIC ring indexes into netmap ring indexes and vice versa. This way we can implement the bound checks only in one place (and hopefully in a correct way). On passing, make the code and comments more uniform across the various drivers.	2012-02-15 23:13:29 +00:00
Luigi Rizzo	6e10c8b8c5	small code cleanup in preparation for future modifications in the memory allocator used by netmap. No functional change, two small bug fixes: - in if_re.c add a missing bus_dmamap_sync() - in netmap.c comment out a spurious free() in an error handling block	2012-01-10 19:57:23 +00:00
Kevin Lo	5bbe0c5357	ether_ifattach() sets if_mtu to ETHERMTU, don't bother set it again Reviewed by: yongari	2012-01-07 09:41:57 +00:00
Luigi Rizzo	2d0d326d91	put back netmap support, deleted by mistake in a previous commit	2011-12-22 15:33:41 +00:00
John Baldwin	ef93f57495	Restore the sysctl changes from 223676 and 227309 lost in the previous import: - Add read-only sysctls for all of the tunables supported by the igb and em drivers. - Make the per-instance 'enable_aim' sysctl truly per-instance by having it change a per-instance variable (which is used to control AIM) rather than having all of the per-instance sysctls operate on a single global variable. While here, restore the previously existing hw.igb.rx_processing_limit tunable as it is very useful to be able to set a default tunable that applies to all adapters in the system.	2011-12-21 20:10:11 +00:00
Matthew D Fleming	30a497c860	Consistently use types in e1000 driver code: - Two struct members eee_disable are used in a function that expects an int *, so declare them int, not bool. - igb_tx_ctx_setup() returns a boolean value, so declare it bool, not int. - igb_header_split is passed to TUNABLE_INT, so declare it int, not bool. - igb_tso_setup() returns a bool, so declare it bool, not boolean_t. - Do not re-define bool/true/false if the symbols already exist. MFC after: 2 weeks Sponsored by: Isilon Systems, LLC	2011-12-12 18:27:34 +00:00
Jack F Vogel	62aca36544	Last change still had an issue, one more time...	2011-12-11 18:46:14 +00:00

127 Commits