Commit Graph

146 Commits

Author SHA1 Message Date
Matt Macy
d7c5a620e2 ifnet: Replace if_addr_lock rwlock with epoch + mutex
Run on LLNW canaries and tested by pho@

gallatin:
Using a 14-core, 28-HTT single socket E5-2697 v3 with a 40GbE MLX5
based ConnectX 4-LX NIC, I see an almost 12% improvement in received
packet rate, and a larger improvement in bytes delivered all the way
to userspace.

When the host receiving 64 streams of netperf -H $DUT -t UDP_STREAM -- -m 1,
I see, using nstat -I mce0 1 before the patch:

InMpps OMpps  InGbs  OGbs err TCP Est %CPU syscalls csw     irq GBfree
4.98   0.00   4.42   0.00 4235592     33   83.80 4720653 2149771   1235 247.32
4.73   0.00   4.20   0.00 4025260     33   82.99 4724900 2139833   1204 247.32
4.72   0.00   4.20   0.00 4035252     33   82.14 4719162 2132023   1264 247.32
4.71   0.00   4.21   0.00 4073206     33   83.68 4744973 2123317   1347 247.32
4.72   0.00   4.21   0.00 4061118     33   80.82 4713615 2188091   1490 247.32
4.72   0.00   4.21   0.00 4051675     33   85.29 4727399 2109011   1205 247.32
4.73   0.00   4.21   0.00 4039056     33   84.65 4724735 2102603   1053 247.32

After the patch

InMpps OMpps  InGbs  OGbs err TCP Est %CPU syscalls csw     irq GBfree
5.43   0.00   4.20   0.00 3313143     33   84.96 5434214 1900162   2656 245.51
5.43   0.00   4.20   0.00 3308527     33   85.24 5439695 1809382   2521 245.51
5.42   0.00   4.19   0.00 3316778     33   87.54 5416028 1805835   2256 245.51
5.42   0.00   4.19   0.00 3317673     33   90.44 5426044 1763056   2332 245.51
5.42   0.00   4.19   0.00 3314839     33   88.11 5435732 1792218   2499 245.52
5.44   0.00   4.19   0.00 3293228     33   91.84 5426301 1668597   2121 245.52

Similarly, netperf reports 230Mb/s before the patch, and 270Mb/s after the patch

Reviewed by:	gallatin
Sponsored by:	Limelight Networks
Differential Revision:	https://reviews.freebsd.org/D15366
2018-05-18 20:13:34 +00:00
Pedro F. Giffuni
df57947f08 spdx: initial adoption of licensing ID tags.
The Software Package Data Exchange (SPDX) group provides a specification
to make it easier for automated tools to detect and summarize well known
opensource licenses. We are gradually adopting the specification, noting
that the tags are considered only advisory and do not, in any way,
superceed or replace the license texts.

Special thanks to Wind River for providing access to "The Duke of
Highlander" tool: an older (2014) run over FreeBSD tree was useful as a
starting point.

Initially, only tag files that use BSD 4-Clause "Original" license.

RelNotes:	yes
Differential Revision:	https://reviews.freebsd.org/D13133
2017-11-18 14:26:50 +00:00
Pyun YongHyeon
9935c65a28 Remove unnecessary assignment.
Found by:	PVS-Studio
2017-04-14 07:27:23 +00:00
Edward Tomasz Napierala
1691e8bfb1 Remove NULL checks after M_WAITOK allocations from msk(4).
MFC after:	1 month
2016-08-09 15:51:11 +00:00
Pedro F. Giffuni
453130d9bf sys/dev: minor spelling fixes.
Most affect comments, very few have user-visible effects.
2016-05-03 03:41:25 +00:00
Pyun YongHyeon
15b185341d ifnet lock was changed to use sx(9) long time ago.
Don't hold a driver lock for if_free(9).
2016-02-22 00:58:04 +00:00
Pyun YongHyeon
9dda5c8ffc Fix variable assignment.
Found by:	PVS-Studio
2016-02-18 01:24:10 +00:00
Pyun YongHyeon
97c477f86f Set DMA alignment constraint of status, TX and RX LEs(List Elements
in Marvell terms) to 32768.  32768 looks overkill but it will
ensure correct DMAed update.  This change addresses occasional
watchdog timeouts reported on 10.2-RELEASE.

Tested by:	Johann Hugo <jhugo@meraka.csir.co.za>
MFC after:	2 weeks
2015-08-28 01:32:42 +00:00
Robert Watson
6ad1954bcb Eliminate unnecessary checking for M_EXT on mbufs returned by m_getjcl().
Reviewed by:	bz, glebius, yongari
MFC after:	3 days
Sponsored by:	EMC / Isilon Storage Division
Differential Revision:	https://reviews.freebsd.org/D938
2014-10-13 06:51:40 +00:00
Gleb Smirnoff
1162f06501 Mechanically convert to if_inc_counter(). 2014-09-18 19:57:13 +00:00
Gleb Smirnoff
1bffa9511f Use define from if_var.h to access a field inside struct if_data,
that resides in struct ifnet.

Sponsored by:	Nginx, Inc.
2014-08-30 19:55:54 +00:00
John Baldwin
068d8643ad Fix various NIC drivers to properly cleanup static DMA resources.
In particular, don't check the value of the bus_dma map against NULL
to determine if either bus_dmamem_alloc() or bus_dmamap_load() succeeded.
Instead, assume that bus_dmamap_load() succeeeded (and thus that
bus_dmamap_unload() should be called) if the bus address for a resource
is non-zero, and assume that bus_dmamem_alloc() succeeded (and thus
that bus_dmamem_free() should be called) if the virtual address for a
resource is not NULL.

In many cases these bugs could result in leaks when a driver was detached.

Reviewed by:	yongari
MFC after:	2 weeks
2014-06-11 14:53:58 +00:00
Pyun YongHyeon
52ee8ac027 Increase the number of TX DMA segments from 32 to 35. It turned
out 32 is not enough to support a full sized TSO packet.
While I'm here fix a long standing bug introduced in r169632 in
bce(4) where it didn't include L2 header length of TSO packet in
the maximum DMA segment size calculation.

In collaboration with:	rmacklem
MFC after:		2 weeks
2014-03-31 01:54:59 +00:00
Pyun YongHyeon
f6f20128c7 Revert r234666. Clearing TWSI IRQ seems to cause watchdog timeout
on old Yukon II controllers.

Tested by:	bsam
MFC after:	2 weeks
2014-02-07 05:08:59 +00:00
Eitan Adler
7a22215c53 Fix undefined behavior: (1 << 31) is not defined as 1 is an int and this
shifts into the sign bit.  Instead use (1U << 31) which gets the
expected result.

This fix is not ideal as it assumes a 32 bit int, but does fix the issue
for most cases.

A similar change was made in OpenBSD.

Discussed with:	-arch, rdivacky
Reviewed by:	cperciva
2013-11-30 22:17:27 +00:00
Pyun YongHyeon
b52d3ddba3 Perform media change after setting IFF_DRV_RUNNING flag. Without it,
driver would ignore the first link state update if controller
already established a link.

Reported by:	bsam
Tested by:	bsam
2013-11-01 05:03:47 +00:00
Gleb Smirnoff
76039bc84f The r48589 promised to remove implicit inclusion of if_var.h soon. Prepare
to this event, adding if_var.h to files that do need it. Also, include
all includes that now are included due to implicit pollution via if_var.h

Sponsored by:	Netflix
Sponsored by:	Nginx, Inc.
2013-10-26 17:58:36 +00:00
Marius Strobl
2dc26832f7 - Merge from r249476: Ensure that PCI bus BUS_GET_DMA_TAG() method sees
the actual PCI device which makes the request for DMA tag, instead of
  some descendant of the PCI device, by creating a pass-through trampoline.
- Sprinkle const on tables.
- Use NULL instead of 0 for pointers.
- Take advantage of nitems().

MFC after:	1 week
2013-05-30 12:16:55 +00:00
Gabor Kovesdan
ab3f6b347e - Correct mispellings of the word occurrence
Submitted by:	Christoph Mallon <christoph.mallon@gmx.de> (via private mail)
2013-04-17 11:40:10 +00:00
Pyun YongHyeon
1c3515d2d5 RX checksum offloading on old Yukon controllers seem to cause more
problems.  Disable RX checksum offloading on controllers that don't
use new descriptor format but give chance to enable it with
ifconfig(8).
2013-02-27 05:03:35 +00:00
Gleb Smirnoff
c6499eccad Mechanically substitute flags from historic mbuf allocator with
malloc(9) flags in sys/dev.
2012-12-04 09:32:43 +00:00
Pyun YongHyeon
fca56fb2c3 For Yukon II controllers that implement optional temperature sensor
and voltage sensor, TWSI is used to get sensor data.  msk(4) does
not monitor these sensors and interrupt for TWSI completion is
disabled by default.
However, due to unknown reason, the TWSI completion interrupt fires
and it resulted in interrupt storm.  To fix it, acknowledges the
TWSI completion interrupt if driver see the event.  Given that not
all Yukon II controllers show the issue it could be a silicon bug
which does not honor interrupt masking.

Probably the right way to address the issue is disabling automatic
TWSI cycle initiation against these sensors.  It would be even
better to implement reading voltage/temperature from the NIC but it
requires access to National LM80 through TWSI and documentation to
do that is not available yet(probably will never happen).

Reported by:	jhb
Tested by:	jhb
MFC after:	2 weeks
2012-04-25 02:46:13 +00:00
Kevin Lo
5bbe0c5357 ether_ifattach() sets if_mtu to ETHERMTU, don't bother set it again
Reviewed by:	yongari
2012-01-07 09:41:57 +00:00
Pyun YongHyeon
7659f3c35d Increase wait time for OP_TCPSTART command processing. It seems
100us is not enough to ensure prefetch unit work.
2011-12-19 19:02:36 +00:00
Marius Strobl
4b7ec27007 - There's no need to overwrite the default device method with the default
one. Interestingly, these are actually the default for quite some time
  (bus_generic_driver_added(9) since r52045 and bus_generic_print_child(9)
  since r52045) but even recently added device drivers do this unnecessarily.
  Discussed with: jhb, marcel
- While at it, use DEVMETHOD_END.
  Discussed with: jhb
- Also while at it, use __FBSDID.
2011-11-22 21:28:20 +00:00
Pyun YongHyeon
355a415e92 Enable 64bit DMA addressing support for all msk(4) controllers.
Unnecessarily complex LE format used on Marvell controller was
main reason not to enable 64bit DMA addressing in driver.  If high
32bit address of DMA address of TX/RX buffer is changed, driver has
to generate a new LE.  In TX path, driver will keep track of lastly
used high 32bit address of DMA address and generate a new LE
whenever it sees high address change in the DMA address. In RX path,
driver will always use two LEs to specify 64bit DMA address of RX
buffer.  If the high 32bit address of DMA address of RX buffer is
the same as previous DMA address of RX buffer, driver does not have
to use two LEs but driver will use two LEs for simplicity in RX
ring management.

One of draw back for switching to 64bit DMA addressing is that the
large amount of LEs are used to specify 64bit DMA address such that
number of available LEs for TX/RX buffers are considerably reduced.
To mitigate the issue, increase number of available LEs from 256 to
384 for TX and from 256 to 512 for RX. For 32bit architectures,
msk(4) does not use 64bit DMA addressing to save resources.

Tested by:	das
2011-11-16 19:25:26 +00:00
Pyun YongHyeon
57c81d92ae Close a race where SIOCGIFMEDIA ioctl get inconsistent link status.
Because driver is accessing a common MII structure in
mii_pollstat(), updating user supplied structure should be done
before dropping a driver lock.

Reported by:	Karim (fodillemlinkarimi <> gmail dot com)
2011-10-17 19:49:00 +00:00
Pyun YongHyeon
7c017a713e Correctly check MAC running status before disabling TX/RX MACs. 2011-05-31 01:30:58 +00:00
Pyun YongHyeon
81e2a01a77 style(9) 2011-05-24 20:39:07 +00:00
Pyun YongHyeon
8be664b8b7 When MTU is changed, check whether driver should be reinitialized or
not.  If reinitialized is required, clear driver running flag.
2011-05-23 21:56:04 +00:00
Pyun YongHyeon
e0029a7260 Add initial support for Marvell 88E8055/88E8075 Yukon Supreme. 2011-05-23 21:51:47 +00:00
Pyun YongHyeon
fe0b141e73 Do not touch ASF related register for controllers that do not have
these registers. Also disable Watchdog of ASF microcontroller.
2011-05-23 21:11:46 +00:00
Pyun YongHyeon
c6a34f768e Make sure to enable all clocks before accessing registers.
Releasing PHY from power down/COMA is done after enabling all
clocks. While I'm here remove unnecessary controller reset.
2011-05-23 21:00:56 +00:00
Pyun YongHyeon
d91192e329 Do not configure RAM registers for controllers that do not have
them.  These registers are defined only for Yukon XL, Yukon EC and
Yukon FE.
2011-05-23 20:18:09 +00:00
Pyun YongHyeon
7b4f47c1db Rework store and forward configuration of TX MAC FIFO. Basically it
enables store and forward mode except for jumbo frame on Yukon
Ultra.
2011-05-23 20:09:32 +00:00
Pyun YongHyeon
10e71e2260 Do not blindly clear entire GPHY control register. It seems some
bits of the register is used for other purposes such that clearing
these bits resulted in unexpected results such as corrupted RX
frames or missing LE status updates.  For old controllers like
Yukon EC it had no effect but it caused all kind of troubles on
Yukon Supreme.
This change shall improve stability of controllers like Yukon
Ultra, Ultra2, Extreme, Optima and Supreme.
2011-05-23 19:58:08 +00:00
Gleb Smirnoff
4c5a247b54 When msk_detach() is called from msk_attach(), ifp may be
yet not initialized.
2011-04-25 04:55:50 +00:00
John Baldwin
3b0a4aef96 Do a sweep of the tree replacing calls to pci_find_extcap() with calls to
pci_find_cap() instead.
2011-03-23 13:10:15 +00:00
Matthew D Fleming
cbc134ad03 Introduce signed and unsigned version of CTLTYPE_QUAD, renaming
existing uses.  Rename sysctl_handle_quad() to sysctl_handle_64().
2011-01-19 23:00:25 +00:00
Matthew D Fleming
f4f04709ac Fix a few more SYSCTL_PROC() that were missing a CTLFLAG type specifier. 2011-01-19 00:57:58 +00:00
Pyun YongHyeon
3c5571b374 Fix endianness bug introduced in r205091.
After controller updates control word in a RX LE, driver converts
it to host byte order. The checksum value in the control word is
stored in big endian form by controller. r205091 didn't account for
the host byte order conversion such that the checksum value was
incorrectly interpreted on big endian architectures which in turn
made all TCP/UDP frames dropped. Make RX checksum offload work
on any architectures by swapping the checksum value.

Reported by:	Sreekanth M. ( kanthms <> netlogicmicro dot com )
Tested by:	Sreekanth M. ( kanthms <> netlogicmicro dot com )
2010-12-31 22:18:41 +00:00
Marius Strobl
efd4fc3fb3 o Flesh out the generic IEEE 802.3 annex 31B full duplex flow control
support in mii(4):
  - Merge generic flow control advertisement (which can be enabled by
    passing by MIIF_DOPAUSE to mii_attach(9)) and parsing support from
    NetBSD into mii_physubr.c and ukphy_subr.c. Unlike as in NetBSD,
    IFM_FLOW isn't implemented as a global option via the "don't care
    mask" but instead as a media specific option this. This has the
    following advantages:
    o allows flow control advertisement with autonegotiation to be
      turned on and off via ifconfig(8) with the default typically
      being off (though MIIF_FORCEPAUSE has been added causing flow
      control to be always advertised, allowing to easily MFC this
      changes for drivers that previously used home-grown support for
      flow control that behaved that way without breaking POLA)
    o allows to deal with PHY drivers where flow control advertisement
      with manual selection doesn't work or at least isn't implemented,
      like it's the case with brgphy(4), e1000phy(4) and ip1000phy(4),
      by setting MIIF_NOMANPAUSE
    o the available combinations of media options are readily available
      from the `ifconfig -m` output
  - Add IFM_FLOW to IFM_SHARED_OPTION_DESCRIPTIONS and IFM_ETH_RXPAUSE
    and IFM_ETH_TXPAUSE to IFM_SUBTYPE_ETHERNET_OPTION_DESCRIPTIONS so
    these are understood by ifconfig(8).
o Make the master/slave support in mii(4) actually usable:
  - Change IFM_ETH_MASTER from being implemented as a global option via
    the "don't care mask" to a media specific one as it actually is only
    applicable to IFM_1000_T to date.
  - Let mii_phy_setmedia() set GTCR_MAN_MS in IFM_1000_T slave mode to
    actually configure manually selected slave mode (like we also do in
    the PHY specific implementations).
  - Add IFM_ETH_MASTER to IFM_SUBTYPE_ETHERNET_OPTION_DESCRIPTIONS so it
    is understood by ifconfig(8).
o Switch bge(4), bce(4), msk(4), nfe(4) and stge(4) along with brgphy(4),
  e1000phy(4) and ip1000phy(4) to use the generic flow control support
  instead of home-grown solutions via IFM_FLAGs. This includes changing
  these PHY drivers and smcphy(4) to no longer unconditionally advertise
  support for flow control but only if the selected media has IFM_FLOW
  set (or MIIF_FORCEPAUSE is set) and implemented for these media variants,
  i.e. typically only for copper.
o Switch brgphy(4), ciphy(4), e1000phy(4) and ip1000phy(4) to report and
  set IFM_1000_T master mode via IFM_ETH_MASTER instead of via IFF_LINK0
  and some IFM_FLAGn.
o Switch brgphy(4) to add at least the the supported copper media based on
  the contents of the BMSR via mii_phy_add_media() instead of hardcoding
  them. The latter approach seems to have developed historically, besides
  causing unnecessary code duplication it was also undesirable because
  brgphy_mii_phy_auto() already based the capability advertisement on the
  contents of the BMSR though.
o Let brgphy(4) set IFM_1000_T master mode on all supported PHY and not
  just BCM5701. Apparently this was a misinterpretation of a workaround
  in the Linux tg3 driver; BCM5701 seem to require RGPHY_1000CTL_MSE and
  BRGPHY_1000CTL_MSC to be set when configuring autonegotiation but
  this doesn't mean we can't set these as well on other PHYs for manual
  media selection.
o Let ukphy_status() report IFM_1000_T master mode via IFM_ETH_MASTER so
  IFM_1000_T master mode support now is generally available with all PHY
  drivers.
o Don't let e1000phy(4) set master/slave bits for IFM_1000_SX as it's
  not applicable there.

Reviewed by:	yongari (plus additional testing)
Obtained from:	NetBSD (partially), OpenBSD (partially)
MFC after:	2 weeks
2010-11-14 13:26:10 +00:00
Rebecca Cran
b1ce21c6ef Fix typos.
PR:	bin/148894
Submitted by:	olgeni
2010-11-09 10:59:09 +00:00
Marius Strobl
8e5d93dbb4 Convert the PHY drivers to honor the mii_flags passed down and convert
the NIC drivers as well as the PHY drivers to take advantage of the
mii_attach() introduced in r213878 to get rid of certain hacks. For
the most part these were:
- Artificially limiting miibus_{read,write}reg methods to certain PHY
  addresses; we now let mii_attach() only probe the PHY at the desired
  address(es) instead.
- PHY drivers setting MIIF_* flags based on the NIC driver they hang
  off from, partly even based on grabbing and using the softc of the
  parent; we now pass these flags down from the NIC to the PHY drivers
  via mii_attach(). This got us rid of all such hacks except those of
  brgphy() in combination with bce(4) and bge(4), which is way beyond
  what can be expressed with simple flags.

While at it, I took the opportunity to change the NIC drivers to pass
up the error returned by mii_attach() (previously by mii_phy_probe())
and unify the error message used in this case where and as appropriate
as mii_attach() actually can fail for a number of reasons, not just
because of no PHY(s) being present at the expected address(es).

Reviewed by:	jhb, yongari
2010-10-15 14:52:11 +00:00
John Baldwin
d1a02e0932 Catch up to rename of the constant for the Master Data Parity Error bit in
the PCI status register.

Pointed out by:	mdf
Pointy hat to:	jhb
2010-09-09 20:26:30 +00:00
Pyun YongHyeon
3edfecaa37 When VLAN hardware tagging is disabled, make sure to disable VLAN
checksum offloading as well as TSO over VLAN.

Reported by:	jhb
2010-05-04 22:24:19 +00:00
Pyun YongHyeon
31fefd0d5d Make sure to check whether driver is running before processing
received frames. Also check driver has valid ifp pointer before
calling msk_stop() in device_shutdown handler. While I'm here
remove unnecessary accesses to interrupt mask registers in
device_shutdown handler because driver puts the controller into
reset state.
With these changes, msk(4) now survive from heavy RX traffic(1byte
UDP frame) while reboot is in progress.

Reported by:	Mark Atkinson < atkin901 <> gmail dot com >
2010-05-04 17:12:36 +00:00
Pyun YongHyeon
3d763c3133 Drop driver lock before exiting from interrupt handler.
Submitted by:	jhb
MFC after:	3 days
2010-05-04 17:02:34 +00:00
Pyun YongHyeon
e19bd6ee55 Add basic support for Marvell 88E8059 Yukon Optima.
Tested by:	James LaLagna < jameslalagna <> gmail dot com >
MFC after:	5 days
2010-04-30 18:58:55 +00:00
Pyun YongHyeon
7c8db6fd16 Disable non-ASF packet flushing on Yukon Extreme as vendor's driver
does. Without this change, Yukon Extreme seems to generate lots of
RX FIFO overruns even though controller has available RX buffers.
These excessive RX FIFO overruns generated lots of pause frames
which in turn killed devices plugged into switch. It seems there is
still occasional RX frame corruption on Yukon Extreme but this
change seems to fix the pause frame storm.

Reported by:	jhb
Tested by:	jhb
MFC after:	5 days
2010-04-30 18:04:46 +00:00