Commit Graph

28 Commits

Author SHA1 Message Date
Scott Long
1a8959dac6 Multi-queue NIC drivers and multi-port lagg tend to use the same lower
bits of the flowid as each other, resulting in a poor distribution of
packets among queues in certain cases.  Work around this by adding a
set of sysctls for controlling a bit-shift on the flowid when doing
multi-port aggrigation in lagg and lacp.  By default, lagg/lacp will
now use bits 16 and higher instead of 0 and higher.

Reviewed by:	max
Obtained from:	Netflix
MFC after:	3 days
2013-12-30 01:32:17 +00:00
Gleb Smirnoff
c3322cb91c Include necessary headers that now are available due to pollution
via if_var.h.

Sponsored by:	Netflix
Sponsored by:	Nginx, Inc.
2013-10-28 07:29:16 +00:00
Gleb Smirnoff
76039bc84f The r48589 promised to remove implicit inclusion of if_var.h soon. Prepare
to this event, adding if_var.h to files that do need it. Also, include
all includes that now are included due to implicit pollution via if_var.h

Sponsored by:	Netflix
Sponsored by:	Nginx, Inc.
2013-10-26 17:58:36 +00:00
Andrey V. Elsukov
d5773da82a Use the same actor key for media types of the same speed.
PR:		176097
MFC after:	2 weeks
2013-10-17 15:14:58 +00:00
Adrian Chadd
49de4f2214 Break out the static, global LACP debug options into a per-lagg unit
sysctl tree.

* Create a net.link.lagg.X.lacp node
* Add a debug node under that for tx_test and rx_test
* Add lacp_strict_mode, defaulting to 1

tx_test and rx_test are still a bitmap of unit numbers for now.
At some point it would be nice to create child nodes of the lagg bundle
for each sub-interface, and then populate those with various knobs
and statistics.

Sponsored by:	Netflix
2013-07-26 19:41:13 +00:00
Adrian Chadd
387e754ae5 Fix typo.
Sponsored by:	Netflix
2013-07-25 19:10:23 +00:00
Adrian Chadd
31402c27b8 Bring over some link aggregation / LACP protocol improvements and debugging
additions.

* Add some new tracing events to aid in debugging.
* Add in a debugging mode to drop transmit and received frames, specifically
  to test whether seeing or hearing heartbeats correctly cause LACP to
  drop the port.
* Add in (and make default) a strict LACP mode, which requires the
  heartbeat on a port to be heard before it's used.  Sometimes vendor ports
  will hang but the link layer stays up, resulting in hung traffic.
* Add logging the number of link status flaps, again to aid in debugging
  badly behaving switch ports.
* Calculate the lagg interface port speed as the multiple of the
  configured ports, rather than the largest.

Obtained from:	Netflix
MFC after:	2 weeks
2013-07-13 04:25:03 +00:00
Gleb Smirnoff
eb1b1807af Mechanically substitute flags from historic mbuf allocator with
malloc(9) flags within sys.

Exceptions:

- sys/contrib not touched
- sys/mbuf.h edited manually
2012-12-05 08:04:20 +00:00
Andrew Thompson
5fc4c149ab Turn LACP debugging from a compile time option to a sysctl, it is very handy to
be able to turn it on when negotiation to a switch misbehaves.

Submitted by:	Andrew Boyer
MFC after:	3 days
2012-05-26 08:09:01 +00:00
Andrew Thompson
86f67641a9 Add the ability to set which packet layers are used for the load balance hash
calculation.
2012-03-06 22:58:13 +00:00
Andrew Thompson
0bf97ae271 Using the flowid in the mbuf assumes the network card is giving a good hash for
the traffic flow, this may not be the case giving poor traffic distribution.
Add a sysctl which allows us to fall back to our own flow hash code.

PR:		kern/164901
Submitted by:	Eugene Grosbein
MFC after:	1 week
2012-02-22 22:01:30 +00:00
Andrew Thompson
5c6026e91f Use the flowid if its available for selecting the tx port. 2009-04-30 14:25:44 +00:00
Andrew Thompson
be07c18007 Update the interface baudrate taking into account the max speed for the
different aggregation protocols.
2008-12-17 20:58:10 +00:00
Andrew Thompson
3de1800850 Switch the LACP state machine over to its own mutex to protect the internals,
this means that it no longer grabs the lagg rwlock. Use two port table arrays
which list the active ports for Tx and switch between them with an atomic op.
Now the lagg rwlock is only exclusively locked for management (ioctls) and
queuing of lacp control frames isnt needed.
2008-03-16 19:25:30 +00:00
Andrew Thompson
af0084c92e Pass any unmatched slowprotocols frames up the stack instead of dropping them,
there are more subtypes than just LACP.
2007-12-31 01:16:35 +00:00
Andrew Thompson
5c0d5fddf5 Use the safer callout_init_rw() to allow the softclock to grab the
rwlock for us.
2007-11-21 05:28:49 +00:00
Andrew Thompson
b3d37ca5f8 Allow the LACP state to be queried from userland which at the moment is the
actor and partner peer info. Print out the active aggregator and per port data
in verbose mode from ifconfig.

Approved by:	re (mux)
2007-07-05 09:18:57 +00:00
Andrew Thompson
ec32b37ecd non-functional cleanup
- remove dead code
- use consistent variable names
- gc unused defines
- whitespace cleanup
2007-06-12 07:29:11 +00:00
Andrew Thompson
fe45e65f10 Compare the partner system priority when choosing the aggregator. 2007-05-19 09:37:04 +00:00
Andrew Thompson
998971a70f Implement the Marker Protocol. A marker frame is placed on the interface queue
of each port and any further packets are blocked, when the all the marker frames
have been returned to us from the remote network device then we can be sure
that all interface queues are empty.

This is needed when a port is added or removed from the aggregation since it
will affect the hash based distribution, if the queues are not empty then a
packet from an existing connection may be placed on a different interface and
arrive out of order. This was previously achieved by suppressing transmission for
1 second, now that there is an active feedback this timeout as been increased
to 3 seconds and used as a fallback.
2007-05-19 07:47:04 +00:00
Andrew Thompson
3362a47464 Fix locking assert where we should hold the reader lock. 2007-05-18 23:38:35 +00:00
Andrew Thompson
3bf517e389 Change from a mutex to a read/write lock. This allows the tx port to be
selected simultaneously by multiple senders and transmit/receive is not
serialised between aggregated interfaces.
2007-05-15 07:41:46 +00:00
Andrew Thompson
108fe96a44 Avoid touching various unsafe parts if the interface is disappearing. 2007-05-07 00:28:55 +00:00
Andrew Thompson
d74fd34568 Change from using if_delmulti() to if_delmulti_ifma() as it simplifies the code
and is safe to use if the ifp has disappeared.

Suggested by:	bms
2007-05-07 00:18:56 +00:00
Andrew Thompson
e3163ef60a - Add a disabled state for ports that can not be aggregated
- Refine check for lacp links, set to disabled if not suitable
2007-05-03 08:56:20 +00:00
Andrew Thompson
c0194db365 Test for IFM_FDX rather than IFM_HDX as the half-duplex bit may not be set even
if the link is not full-duplex.
2007-05-02 07:52:55 +00:00
Andrew Thompson
18242d3b09 Rename the trunk(4) driver to lagg(4) as it is too similar to vlan trunking.
The name trunk is misused as the networking term trunk means carrying multiple
VLANs over a single connection. The IEEE standard for link aggregation (802.3
section 3) does not talk about 'trunk' at all while it is used throughout IEEE
802.1Q in describing vlans.

The lagg(4) driver provides link aggregation, failover and fault tolerance.

Discussed on:	current@
2007-04-17 00:35:11 +00:00
Andrew Thompson
b47888ceba Add the trunk(4) driver for providing link aggregation, failover and fault
tolerance.  This driver allows aggregation of multiple network interfaces as
one virtual interface using a number of different protocols/algorithms.

failover    - Sends traffic through the secondary port if the master becomes
              inactive.
fec         - Supports Cisco Fast EtherChannel.
lacp        - Supports the IEEE 802.3ad Link Aggregation Control Protocol
              (LACP) and the Marker Protocol.
loadbalance - Static loadbalancing using an outgoing hash.
roundrobin  - Distributes outgoing traffic using a round-robin scheduler
              through all active ports.

This code was obtained from OpenBSD and this also includes 802.3ad LACP support
from agr(4) in NetBSD.
2007-04-10 00:27:25 +00:00