Commit Graph

192 Commits

Author SHA1 Message Date
Hiren Panchasara
df7b11fa09 Expose full 32bit RSS hash from card regardless of whether RSS is defined or
not. When doing multiqueue, we are all setup to have full 32bit RSS hash from
the card. We do not need to hide that under "ifdef RSS" and should expose that
by default so others like lagg(4) can use that and avoid hashing the traffic by
themselves.

While here, delete the FreeBSD version check and use of deprecated M_FLOWID.

Reviewed by:	adrian, erj
MFC after:	1 week
Sponsored by:	Limelight Networks
2015-07-14 09:13:18 +00:00
Luigi Rizzo
847bf38369 Sync netmap sources with the version in our private tree.
This commit contains large contributions from Giuseppe Lettieri and
Stefano Garzarella, is partly supported by grants from Verisign and Cisco,
and brings in the following:

- fix zerocopy monitor ports and introduce copying monitor ports
  (the latter are lower performance but give access to all traffic
  in parallel with the application)

- exclusive open mode, useful to implement solutions that recover
  from crashes of the main netmap client (suggested by Patrick Kelsey)

- revised memory allocator in preparation for the 'passthrough mode'
  (ptnetmap) recently presented at bsdcan. ptnetmap is described in
        S. Garzarella, G. Lettieri, L. Rizzo;
        Virtual device passthrough for high speed VM networking,
        ACM/IEEE ANCS 2015, Oakland (CA) May 2015
        http://info.iet.unipi.it/~luigi/research.html

- fix rx CRC handing on ixl

- add module dependencies for netmap when building drivers as modules

- minor simplifications to device-specific routines (*txsync, *rxsync)

- general code cleanup (remove unused variables, introduce macros
  to access rings and remove duplicate code,

Applications do not need to be recompiled, unless of course
they want to use the new features (monitors and exclusive open).

Those willing to try this code on stable/10 can just update the
sys/dev/netmap/*, sys/net/netmap* with the version in HEAD
and apply the small patches to individual device drivers.

MFC after:	1 month
Sponsored by:	(partly) Verisign, Cisco
2015-07-10 05:51:36 +00:00
John Baldwin
9e34aea25c Catch up to the SRIOV API changes in r283670. 2015-06-01 20:05:06 +00:00
Jack F Vogel
48056c88e1 Delta D2489 - Add SRIOV support to the Intel 10G driver.
NOTE: This is a technology preview, while it has undergone
      development testing, Intel has not yet completed full
      validation of the feature. It is being integrated for
      early access and customer testing.
2015-06-01 17:43:34 +00:00
Jack F Vogel
3012653750 Revert last commit, to remove added skeleton tree. 2015-06-01 17:35:29 +00:00
Jack F Vogel
2533e32559 Delta D2489 - Add SRIOV support to the Intel 10G driver.
NOTE: This is a technology preview, while it has undergone development
      tests, Intel has not yet completed full validation of the feature.
      It is being integrated for early access and customer testing.
2015-06-01 17:15:25 +00:00
Bjoern A. Zeeb
24f5e9f694 Remove the extra extern which makes gcc complain; I assume it came from
r282289.

We do include ixgbe.h which does include ixgbe_common.h which has the
extern statement for ixgbe_stop_mac_link_on_d3_82599().
2015-05-01 12:10:36 +00:00
Eric Joyner
6f37f2324d Add support for certain Intel X550 devices.
These include standalone X550 adapters, X552 10GbE backplane, and
X552/X557-AT 10GBASE-T; with the latter two being integrated into Xeon D SoCs.

As well, this bumps the ixgbe version number to 2.8.3, and includes updates
to shared code for support for the new devices.

Differential Revision: D2414
Reviewed by:	gnn, adrian
Approved by:	jfv (mentor), gnn (mentor)
2015-04-30 22:53:27 +00:00
John Baldwin
625d12c609 Various fixes to the stats in igb(4), ixgbe(4), and ixl(4).
- Use hardware counters for ifnet stats in igb(4) when possible.  This
  ensures these stats include packets that bypass the regular stack via
  netmap.
- Don't derefence values off the end of the igb(4) VF stats structure.
  Instead, add a dedicated if_get_counter method for igb(4) VF interfaces.
- Report missed packets on igb(4) as input queue drops rather than an
  input error.
- Report bug_ring drop counts as output queue drops for igb(4) and ixgbe(4).
- Export the buf_ring drop stats for individual rings via sysctl on
  ixgbe(4).
- Fix a typo that in ixl(4) that caused output queue drops to be reported
  as input queue drops and input queue drops to be unreported.

Differential Revision:	https://reviews.freebsd.org/D2402
Reviewed by:	jfv, rstone (6)
Sponsored by:	Norse Corp, Inc.
2015-04-30 18:23:38 +00:00
Marcelo Araujo
32e85f856a Add back ixgbe_rxeof, just remove the assignment to more. 2015-04-20 17:24:39 +00:00
Marcelo Araujo
201df06bb2 Remove unused variable.
Differential Revision:	D2331
Reviewed by:		erj
2015-04-20 17:21:15 +00:00
Eric Joyner
892406d827 Make changes to busdma code in tx/rx path similar to the ones made in r257541.
- bus_dmamap_create() does not take the BUS_DMA_NOWAIT flag
- properly unload maps
- do not assign NULL to dma map pointers

Submitted by:	Konstantin Belousov <kostikbel@gmail.com>
Approved by:	jfv (mentor)
MFC after:	1 week
2015-04-01 17:19:55 +00:00
Andrew Turner
6c85c62d00 Fix building ixgbe with gcc, it doesn't like nested extern declarations.
The fix is to move the extern declaration ix_crcstrip out of
ixgbe_setup_hw_rsc.
2015-03-19 13:00:02 +00:00
Jack F Vogel
bff38d637c Fix i386 LINT build issues, and remove unused variable. 2015-03-18 20:11:59 +00:00
Adrian Chadd
a1edda90b2 Fix ixgbe(4) to compile - with RSS; with ix+ixv in the kernel.
* Fix the multiple same-named devclasses; the duplicate name
  trips up the linker.

* Re-do the taskqueue stuff to use the new cpuset API, not the old
  pinned API.

* Add includes for the new location of the RSS configuration routines.

This allows ixgbe to compile as a module /and/ linked into the kernel,
along with RSS working.

Sponsored by:	Norse Corp, Inc.
2015-03-18 05:05:30 +00:00
Jack F Vogel
bc2e8d79c7 Resolve a few build issues, add module directories back into Makefile,
then correct a NETMAP problem resulting from the split, and finally
temporarily disable the X550 functionality.
2015-03-17 22:40:50 +00:00
Jack F Vogel
758cc3dcd5 Update to the Intel ixgbe driver:
- Split the driver into independent pf and vf loadables. This is
	  in preparation for SRIOV support which will be following shortly.
	  This also allows us to keep a seperate revision control over the
	  two parts, making for easier sustaining.
	- Make the TX/RX code a shared/seperated file, in the old code base
	  the ixv code would miss fixes that went into ixgbe, this model
	  will eliminate that problem.
	- The driver loadables will now match the device names, something that
	  has been requested for some time.
	- Rather than a modules/ixgbe there is now modules/ix and modules/ixv
	- It will also be possible to make your static kernel with only one
	  or the other for streamlined installs, or both.

Enjoy!

Submitted by: jfv and erj
2015-03-17 18:32:28 +00:00
Marcelo Araujo
53f7fd617c Fix style(9) from my previous commit r279803.
Spotted by:	kevlo
2015-03-09 10:29:15 +00:00
Marcelo Araujo
18733a0b76 Now ifconfig(8) can set the media option as 10Gbase-T for ixgbe(4).
Differential Revision:	D823
Reviewed by:	jfvogel
Approved by:	jfvogel
Sponsored by:	QNAP Systems Inc.
2015-03-09 08:43:27 +00:00
Marcelo Araujo
df634d45d3 Fix the media detected for copper cables NIC based on chipset X540T.
Phabric:        D811
Reviewed by:    jfvogel
Approved by:	jfvogel
Sponsored by:   QNAP Systems Inc.
2015-03-09 08:22:11 +00:00
Enji Cooper
74c1c91c47 Pad RX copy alignment calculation to avoid illegal memory accesses
The optimization made in r239940 is valid for struct mbuf's current structure
and size in FreeBSD, but hardcodes assumptions about sizes of struct mbuf,
which are unfortunately broken if additional data is added to the beginning of
struct mbuf

X-MFC note (discussed with rwatson):

This change requires the MPKTHSIZE definition, which is only available after
head@r277203 and will not be MFCed as it breaks mbuf(9) KPI.

A direct commit to stable/10 and merges to other branches to add the necessary
definitions to work with the code as-is will be done to facilitate this MFC

PR: 194314
MFC after: 2 weeks
Approved/Reviewed by: erj, jfv
Sponsored by: EMC / Isilon Storage Division
2015-02-28 14:57:57 +00:00
Adrian Chadd
977dc4e243 Migrate using CPU_ZERO() + CPU_SET() -> CPU_SETOF().
Tested:

* ixgbe, igb, RSS enabled

Submitted by:	jhb
Sponsored by:	Norse Corp, Inc.
2015-02-25 21:44:53 +00:00
Adrian Chadd
9756bd5982 Change uses of taskqueue_start_threads_pinned() -> taskqueue_start_threads_cpuset()
Differential Revision:	https://reviews.freebsd.org/D1897
Reviewed by:	jfv
2015-02-24 22:17:12 +00:00
Adrian Chadd
b2bdc62a95 Refactor / restructure the RSS code into generic, IPv4 and IPv6 specific
bits.

The motivation here is to eventually teach netisr and potentially
other networking subsystems a bit more about how RSS work queues / buckets
are configured so things have a hope of auto-configuring in the future.

* net/rss_config.[ch] takes care of the generic bits for doing
  configuration, hash function selection, etc;
* topelitz.[ch] is now in net/ rather than netinet/;
* (and would be in libkern if it didn't directly include RSS_KEYSIZE;
  that's a later thing to fix up.)
* netinet/in_rss.[ch] now just contains the IPv4 specific methods;
* and netinet/in6_rss.[ch] now just contains the IPv6 specific methods.

This should have no functional impact on anyone currently using
the RSS support.

Differential Revision:	D1383
Reviewed by:	gnn, jfv (intel driver bits)
2015-01-18 18:06:40 +00:00
Jack F Vogel
4da1bbcda5 Revert r275136, it was not approved, it was sloppy, if a feature
like this is needed please resubmit for Intel's approval.
2014-12-02 23:02:57 +00:00
Hans Petter Selasky
c25290420e Start process of removing the use of the deprecated "M_FLOWID" flag
from the FreeBSD network code. The flag is still kept around in the
"sys/mbuf.h" header file, but does no longer have any users. Instead
the "m_pkthdr.rsstype" field in the mbuf structure is now used to
decide the meaning of the "m_pkthdr.flowid" field. To modify the
"m_pkthdr.rsstype" field please use the existing "M_HASHTYPE_XXX"
macros as defined in the "sys/mbuf.h" header file.

This patch introduces new behaviour in the transmit direction.
Previously network drivers checked if "M_FLOWID" was set in "m_flags"
before using the "m_pkthdr.flowid" field. This check has now now been
replaced by checking if "M_HASHTYPE_GET(m)" is different from
"M_HASHTYPE_NONE". In the future more hashtypes will be added, for
example hashtypes for hardware dedicated flows.

"M_HASHTYPE_OPAQUE" indicates that the "m_pkthdr.flowid" value is
valid and has no particular type. This change removes the need for an
"if" statement in TCP transmit code checking for the presence of a
valid flowid value. The "if" statement mentioned above is now a direct
variable assignment which is then later checked by the respective
network drivers like before.

Additional notes:
- The SCTP code changes will be committed as a separate patch.
- Removal of the "M_FLOWID" flag will also be done separately.
- The FreeBSD version has been bumped.

MFC after:	1 month
Sponsored by:	Mellanox Technologies
2014-12-01 11:45:24 +00:00
Alfred Perlstein
56c14bca7e Make igb and ixgbe check tunables at probe time.
This allows one to make a kernel module to tune the
number of queues before the driver loads.

This is needed so that a module at SI_SUB_CPU can set
tunables for these drivers to take.  Otherwise getenv
is called too early by the TUNABLE macros.

Reviewed by: smh
Phabric: https://reviews.freebsd.org/D1149
2014-11-26 20:19:36 +00:00
Alexander V. Chernikov
86e10a0c6a Fix r273112: do not turn DROP_EN by default.
Due to adapter->hw.fc.requested_mode is filled with default value
after ixgbe_initialize_receive_units(), this leads to enabling
DROP_EN in most cases.

Tested by:	ae
MFC after:	1 week
2014-11-16 18:08:00 +00:00
Hans Petter Selasky
f0188618f2 Fix multiple incorrect SYSCTL arguments in the kernel:
- Wrong integer type was specified.

- Wrong or missing "access" specifier. The "access" specifier
sometimes included the SYSCTL type, which it should not, except for
procedural SYSCTL nodes.

- Logical OR where binary OR was expected.

- Properly assert the "access" argument passed to all SYSCTL macros,
using the CTASSERT macro. This applies to both static- and dynamically
created SYSCTLs.

- Properly assert the the data type for both static and dynamic
SYSCTLs. In the case of static SYSCTLs we only assert that the data
pointed to by the SYSCTL data pointer has the correct size, hence
there is no easy way to assert types in the C language outside a
C-function.

- Rewrote some code which doesn't pass a constant "access" specifier
when creating dynamic SYSCTL nodes, which is now a requirement.

- Updated "EXAMPLES" section in SYSCTL manual page.

MFC after:	3 days
Sponsored by:	Mellanox Technologies
2014-10-21 07:31:21 +00:00
Adrian Chadd
bba61d1a89 Set the DROP_EN bit before the RX queue is brought up and active.
He noticed issues setting this bit in SRRCTL after the queue was up,
so doing it from the sysctl handler isn't enough and may not actually
work correctly.

This commit doesn't remove the sysctl path or try to change its
behaviour.  I'll talk with others about how to finish fixing that
before I tackle that.

PR:		kern/194311
Submitted by:	luigi
MFC after:	3 days
Sponsored by:	Norse Corp, Inc
2014-10-15 01:22:56 +00:00
Gleb Smirnoff
058a38660e Convert to if_get_counter(). 2014-09-28 07:29:45 +00:00
Gleb Smirnoff
5fe12776aa Mechanically switch ixv(4) to if_inc_counter(). 2014-09-28 07:19:32 +00:00
Adrian Chadd
e45d876dd7 The error bits are not valid with EOP=0; so intermediary fragments should
not be discarded.

Submitted by:	Marc De La Gueronniere <mdelagueronniere@verisign.com>
MFC after:	1 week
Sponsored by:	Verisign, Inc.
2014-09-15 20:54:12 +00:00
Adrian Chadd
5894690d0d Fix a double-free of mbufs in rx_ixgbe_discard().
fmp->buf at the free point is already part of the chain being freed,
so double-freeing is counter-productive.

Submitted by:	Marc De La Gueronniere <mdelagueronniere@verisign.com>
MFC after:	1 week
Sponsored by:	Verisign, Inc.
2014-09-15 20:50:26 +00:00
Christian Brueffer
380064c8dc Use the right constants in comparisons. This is currently a nop, as
MIN_RXD == MIN_TXD and MAX_RXD == MAX_TXD.

Reviewed by:	Eric Joyner @ Intel
MFC after:	1 week
2014-09-08 19:24:25 +00:00
Gleb Smirnoff
1bffa9511f Use define from if_var.h to access a field inside struct if_data,
that resides in struct ifnet.

Sponsored by:	Nginx, Inc.
2014-08-30 19:55:54 +00:00
Alexander V. Chernikov
ea463f2dc0 * Add SIOCGI2C driver ioctl used to retrieve i2c info.
* Convert ixgbe to use this ioctl
* Convert ifconfig to use generic i2c handler for  "ix" interfaces.

Approved by:	Eric Joyner (ixgbe part)
MFC after:	2 weeks
Sponsored by:	Yandex LLC
2014-08-29 18:02:58 +00:00
Luigi Rizzo
4bf50f18eb Update to the current version of netmap.
Mostly bugfixes or features developed in the past 6 months,
so this is a 10.1 candidate.

Basically no user API changes (some bugfixes in sys/net/netmap_user.h).

In detail:

1. netmap support for virtio-net, including in netmap mode.
  Under bhyve and with a netmap backend [2] we reach over 1Mpps
  with standard APIs (e.g. libpcap), and 5-8 Mpps in netmap mode.

2. (kernel) add support for multiple memory allocators, so we can
  better partition physical and virtual interfaces giving access
  to separate users. The most visible effect is one additional
  argument to the various kernel functions to compute buffer
  addresses. All netmap-supported drivers are affected, but changes
  are mechanical and trivial

3. (kernel) simplify the prototype for *txsync() and *rxsync()
  driver methods. All netmap drivers affected, changes mostly mechanical.

4. add support for netmap-monitor ports. Think of it as a mirroring
  port on a physical switch: a netmap monitor port replicates traffic
  present on the main port. Restrictions apply. Drive carefully.

5. if_lem.c: support for various paravirtualization features,
  experimental and disabled by default.
  Most of these are described in our ANCS'13 paper [1].
  Paravirtualized support in netmap mode is new, and beats the
  numbers in the paper by a large factor (under qemu-kvm,
  we measured gues-host throughput up to 10-12 Mpps).

A lot of refactoring and additional documentation in the files
in sys/dev/netmap, but apart from #2 and #3 above, almost nothing
of this stuff is visible to other kernel parts.

Example programs in tools/tools/netmap have been updated with bugfixes
and to support more of the existing features.

This is meant to go into 10.1 so we plan an MFC before the Aug.22 deadline.

A lot of this code has been contributed by my colleagues at UNIPI,
including Giuseppe Lettieri, Vincenzo Maffione, Stefano Garzarella.

MFC after:	3 days.
2014-08-16 15:00:01 +00:00
Steven Hartland
d7b5bae512 Renamed hw.ixgbe.unsupported_sfp -> hw.ix.unsupported_sfp
This now matches all other ixgbe sysctl / tunables.

Sponsored by:	Multiplay
2014-08-14 13:25:05 +00:00
Adrian Chadd
ce44e5f776 Add the UDP hash -> RSS mbuf hash type for the ixgbe(4) driver. 2014-07-20 08:43:53 +00:00
Adrian Chadd
e3af537e75 Teach ixgbe(4) about rss_gethashconfig().
If RSS is enabled, ixgbe(4) will query the RSS API for the types of hashes
which should be used.  It'll then only enable hashes that are exposed via
the RSS layer.

This way it won't try to do things like enable UDP hashing if RSS explicitly
states that it isn't supported in lookups.

Tested:

* 82599EB ixgbe(4) NIC
2014-07-20 07:45:48 +00:00
Adrian Chadd
e965b0dcd1 Disable the ixgbe(4) UDP 4-tuple hashing for the time being.
A mix of fragmented and non-fragmented UDP in a single stream will end up
being hashed differently, resulting in out-of-order behaviour in the receive
path.

This was done in the linux e1000 driver in 2011.

Discussed with:	jfv
2014-07-20 07:43:41 +00:00
Adrian Chadd
c64a6bc62d Correctly program the RSS redirection table entries.
Without this, the RSS bucket assignments aren't correct - they're
DCBA instead of ABCD in each DWORD.

Tested:	82599EB ixgbe(4), TCP and UDP RSS
2014-07-20 04:11:18 +00:00
Hiren Panchasara
2814ea66e3 Fix a typo.
PR:		191898
Submitted by:	vsjcfm@gmail.com
2014-07-17 06:21:58 +00:00
Adrian Chadd
8c0d2adf3f Initialise these variables so gcc doesn't complain.
Submitted by:	luigi
2014-06-30 23:34:36 +00:00
Adrian Chadd
7063e348ab Add initial RSS awareness to the ixgbe(4) driver.
The ixgbe(4) hardware is capable of RSS hashing RX packets and doing RSS
queue selection for up to 8 queues.

However, even if multi-queue is enabled for ixgbe(4), the RX path doesn't use
the RSS flowid from the received descriptor.  It just uses the MSIX queue id.

This patch does a handful of things if RSS is enabled:

* Instead of using a random key at boot, fetch the RSS key from the RSS code
  and program that in to the RSS redirection table.

  That whole chunk of code should be double checked for endian correctness.

* Use the RSS queue mapping to CPU ID to figure out where to thread pin
  the RX swi thread and the taskqueue threads for each queue.

* The software queue is now really an "RSS bucket".

* When programming the RSS indirection table, use the RSS code to
  figure out which RSS bucket each slot in the indirection table maps
  to.

* When transmitting, use the flowid RSS mapping if the mbuf has
  an RSS aware hash.  The existing method wasn't guaranteed to align
  correctly with the destination RSS bucket (and thus CPU ID.)

This code warns if the number of RSS buckets isn't the same as the
automatically configured number of hardware queues.  The administrator
will have to tweak one of them for better performance.

There's currently no way to re-balance the RSS indirection table after
startup.  I'll worry about that later.

Additionally, it may be worthwhile to always use the full 32 bit flowid if
multi-queue is enabled.  It'll make things like lagg(4) behave better with
respect to traffic distribution.
2014-06-30 04:38:29 +00:00
Hans Petter Selasky
af3b2549c4 Pull in r267961 and r267973 again. Fix for issues reported will follow. 2014-06-28 03:56:17 +00:00
Glen Barber
37a107a407 Revert r267961, r267973:
These changes prevent sysctl(8) from returning proper output,
such as:

 1) no output from sysctl(8)
 2) erroneously returning ENOMEM with tools like truss(1)
    or uname(1)
 truss: can not get etype: Cannot allocate memory
2014-06-27 22:05:21 +00:00
Hans Petter Selasky
3da1cf1e88 Extend the meaning of the CTLFLAG_TUN flag to automatically check if
there is an environment variable which shall initialize the SYSCTL
during early boot. This works for all SYSCTL types both statically and
dynamically created ones, except for the SYSCTL NODE type and SYSCTLs
which belong to VNETs. A new flag, CTLFLAG_NOFETCH, has been added to
be used in the case a tunable sysctl has a custom initialisation
function allowing the sysctl to still be marked as a tunable. The
kernel SYSCTL API is mostly the same, with a few exceptions for some
special operations like iterating childrens of a static/extern SYSCTL
node. This operation should probably be made into a factored out
common macro, hence some device drivers use this. The reason for
changing the SYSCTL API was the need for a SYSCTL parent OID pointer
and not only the SYSCTL parent OID list pointer in order to quickly
generate the sysctl path. The motivation behind this patch is to avoid
parameter loading cludges inside the OFED driver subsystem. Instead of
adding special code to the OFED driver subsystem to post-load tunables
into dynamically created sysctls, we generalize this in the kernel.

Other changes:
- Corrected a possibly incorrect sysctl name from "hw.cbb.intr_mask"
to "hw.pcic.intr_mask".
- Removed redundant TUNABLE statements throughout the kernel.
- Some minor code rewrites in connection to removing not needed
TUNABLE statements.
- Added a missing SYSCTL_DECL().
- Wrapped two very long lines.
- Avoid malloc()/free() inside sysctl string handling, in case it is
called to initialize a sysctl from a tunable, hence malloc()/free() is
not ready when sysctls from the sysctl dataset are registered.
- Bumped FreeBSD version to indicate SYSCTL API change.

MFC after:	2 weeks
Sponsored by:	Mellanox Technologies
2014-06-27 16:33:43 +00:00
John Baldwin
46e89834dc - Don't compare bus_dma map pointers for static DMA allocations against
NULL to determine if bus_dmamap_unload() or bus_dmamem_free() should be
  called.  Instead, check the associated bus and virtual addresses.
- Don't clear static DMA maps to NULL.

Reviewed by:	jfv
2014-06-12 11:15:19 +00:00