Commit Graph

38 Commits

Author SHA1 Message Date
imp
c4d87b52ed Add PNP info to PCI attachments of age driver
Reviewed by: imp, chuck
Submitted by: Lakhan Shiva Kamireddy <lakhanshiva@gmail.com>
Sponsored by: Google, Inc. (GSoC 2018)
2018-06-13 20:25:32 +00:00
mmacy
7aeac9ef18 ifnet: Replace if_addr_lock rwlock with epoch + mutex
Run on LLNW canaries and tested by pho@

gallatin:
Using a 14-core, 28-HTT single socket E5-2697 v3 with a 40GbE MLX5
based ConnectX 4-LX NIC, I see an almost 12% improvement in received
packet rate, and a larger improvement in bytes delivered all the way
to userspace.

When the host receiving 64 streams of netperf -H $DUT -t UDP_STREAM -- -m 1,
I see, using nstat -I mce0 1 before the patch:

InMpps OMpps  InGbs  OGbs err TCP Est %CPU syscalls csw     irq GBfree
4.98   0.00   4.42   0.00 4235592     33   83.80 4720653 2149771   1235 247.32
4.73   0.00   4.20   0.00 4025260     33   82.99 4724900 2139833   1204 247.32
4.72   0.00   4.20   0.00 4035252     33   82.14 4719162 2132023   1264 247.32
4.71   0.00   4.21   0.00 4073206     33   83.68 4744973 2123317   1347 247.32
4.72   0.00   4.21   0.00 4061118     33   80.82 4713615 2188091   1490 247.32
4.72   0.00   4.21   0.00 4051675     33   85.29 4727399 2109011   1205 247.32
4.73   0.00   4.21   0.00 4039056     33   84.65 4724735 2102603   1053 247.32

After the patch

InMpps OMpps  InGbs  OGbs err TCP Est %CPU syscalls csw     irq GBfree
5.43   0.00   4.20   0.00 3313143     33   84.96 5434214 1900162   2656 245.51
5.43   0.00   4.20   0.00 3308527     33   85.24 5439695 1809382   2521 245.51
5.42   0.00   4.19   0.00 3316778     33   87.54 5416028 1805835   2256 245.51
5.42   0.00   4.19   0.00 3317673     33   90.44 5426044 1763056   2332 245.51
5.42   0.00   4.19   0.00 3314839     33   88.11 5435732 1792218   2499 245.52
5.44   0.00   4.19   0.00 3293228     33   91.84 5426301 1668597   2121 245.52

Similarly, netperf reports 230Mb/s before the patch, and 270Mb/s after the patch

Reviewed by:	gallatin
Sponsored by:	Limelight Networks
Differential Revision:	https://reviews.freebsd.org/D15366
2018-05-18 20:13:34 +00:00
pfg
1537078d8f sys/dev: further adoption of SPDX licensing ID tags.
Mainly focus on files that use BSD 2-Clause license, however the tool I
was using misidentified many licenses so this was mostly a manual - error
prone - task.

The Software Package Data Exchange (SPDX) group provides a specification
to make it easier for automated tools to detect and summarize well known
opensource licenses. We are gradually adopting the specification, noting
that the tags are considered only advisory and do not, in any way,
superceed or replace the license texts.
2017-11-27 14:52:40 +00:00
yongari
835df89217 Remove dead code. 2017-04-14 08:27:42 +00:00
pfg
96555d3833 sys/dev: extend use of the howmany() macro when available.
We have a howmany() macro in the <sys/param.h> header that is
convenient to re-use as it makes things easier to read.
2016-04-26 15:03:15 +00:00
pfg
e2a6cba651 sys/dev: use our nitems() macro when it is avaliable through param.h.
No functional change, only trivial cases are done in this sweep,
Drivers that can get further enhancements will be done independently.

Discussed in:	freebsd-current
2016-04-19 23:37:24 +00:00
yongari
a28774fe4e Fix variable assignment.
Found by:	PVS-Studio
2016-02-18 01:24:10 +00:00
glebius
25bbf4092e Mechanically convert to if_inc_counter(). 2014-09-18 21:01:41 +00:00
glebius
9b93b159b3 Use define from if_var.h to access a field inside struct if_data,
that resides in struct ifnet.

Sponsored by:	Nginx, Inc.
2014-08-30 19:55:54 +00:00
jhb
e8769d6be2 Fix various NIC drivers to properly cleanup static DMA resources.
In particular, don't check the value of the bus_dma map against NULL
to determine if either bus_dmamem_alloc() or bus_dmamap_load() succeeded.
Instead, assume that bus_dmamap_load() succeeeded (and thus that
bus_dmamap_unload() should be called) if the bus address for a resource
is non-zero, and assume that bus_dmamem_alloc() succeeded (and thus
that bus_dmamem_free() should be called) if the virtual address for a
resource is not NULL.

In many cases these bugs could result in leaks when a driver was detached.

Reviewed by:	yongari
MFC after:	2 weeks
2014-06-11 14:53:58 +00:00
glebius
ff6e113f1b The r48589 promised to remove implicit inclusion of if_var.h soon. Prepare
to this event, adding if_var.h to files that do need it. Also, include
all includes that now are included due to implicit pollution via if_var.h

Sponsored by:	Netflix
Sponsored by:	Nginx, Inc.
2013-10-26 17:58:36 +00:00
markj
001cab5cd9 Be sure to actually decrement the "count" parameter for each processed
descriptor so that we return when the threshold has been reached.

Reviewed by:	yongari
MFC after:	1 week
2013-06-17 22:59:47 +00:00
yongari
4bfa972e11 Rework jumbo frame handling. QAC confirmed that the controller
requires 8 bytes alignment on RX buffer.  Given that non-jumbo
frame works on any alignments I guess this DMA limitation for RX
buffer could be jumbo frame specific one.  Also I'm not sure
whether this DMA limitation is related with 64bit DMA.  Previously
age(4) disabled 64bit DMA addressing due to silent data corruption.
So we may need more testing on re-enabling 64bit DMA in future.

While I'm here, change mbuf chaining algorithm to use fixed sized
buffer and force software checksum if controller reports length
error. According to QAC, RFD is not updated at all for jumbo frame
so it works just like alc(4) controllers.  This change also added
alignment fixup for strict alignment architectures.  Because I'm
not aware of any non-x86 machines that use age(4) controllers it's
just for completeness at this moment.

Wit this change, jumbo frame should work with age(4).

Tested by:	Christian Gusenbauer < c47g <> gmx dot at >
MFC after:	1 week
2013-02-05 00:37:45 +00:00
glebius
a69aaa7721 Mechanically substitute flags from historic mbuf allocator with
malloc(9) flags in sys/dev.
2012-12-04 09:32:43 +00:00
yongari
730fb9d66a TSO engine of L1 requires a separate DMA descriptor for TCP
payload.  This means driver has to split a TX buffer into two
pieces of TX buffers when the TX buffer contains both
ethernet/IP/TCP header and partial TCP payload.  The controller
does not require all header should be in a TX buffer but driver
forced it to compute IP/TCP header size/offset which is required
parameter to configure DMA descriptor for TSO.
While here, slightly reorder DMA descriptor setup to enhance
readability and remove unnecessary code for TSO(upper stack never
requests TSO when the frame length is less than or equal to MTU).

Reported by:	Yamagi Burmeister <lists <> yamagi dot org>
Tested by:	Yamagi Burmeister <lists <> yamagi dot org>
MFC After:	1 week
2012-10-30 07:55:03 +00:00
yongari
4c371e596b Close a race where SIOCGIFMEDIA ioctl get inconsistent link status.
Because driver is accessing a common MII structure in
mii_pollstat(), updating user supplied structure should be done
before dropping a driver lock.

Reported by:	Karim (fodillemlinkarimi <> gmail dot com)
2011-10-17 19:49:00 +00:00
marius
d0f32374e6 - Remove attempts to implement setting of BMCR_LOOP/MIIF_NOLOOP
(reporting IFM_LOOP based on BMCR_LOOP is left in place though as
  it might provide useful for debugging). For most mii(4) drivers it
  was unclear whether the PHYs driven by them actually support
  loopback or not. Moreover, typically loopback mode also needs to
  be activated on the MAC, which none of the Ethernet drivers using
  mii(4) implements. Given that loopback media has no real use (and
  obviously hardly had a chance to actually work) besides for driver
  development (which just loopback mode should be sufficient for
  though, i.e one doesn't necessary need support for loopback media)
  support for it is just dropped as both NetBSD and OpenBSD already
  did quite some time ago.
- Let mii_phy_add_media() also announce the support of IFM_NONE.
- Restructure the PHY entry points to use a structure of entry points
  instead of discrete function pointers, and extend this to include
  a "reset" entry point. Make sure any PHY-specific reset routine is
  always used, and provide one for lxtphy(4) which disables MII
  interrupts (as is done for a few other PHYs we have drivers for).
  This includes changing NIC drivers which previously just called the
  generic mii_phy_reset() to now actually call the PHY-specific reset
  routine, which might be crucial in some cases. While at it, the
  redundant checks in these NIC drivers for mii->mii_instance not being
  zero before calling the reset routines were removed because as soon
  as one PHY driver attaches mii->mii_instance is incremented and we
  hardly can end up in their media change callbacks etc if no PHY driver
  has attached as mii_attach() would have failed in that case and not
  attach a miibus(4) instance.
  Consequently, NIC drivers now no longer should call mii_phy_reset()
  directly, so it was removed from EXPORT_SYMS.
- Add a mii_phy_dev_attach() as a companion helper to mii_phy_dev_probe().
  The purpose of that function is to perform the common steps to attach
  a PHY driver instance and to hook it up to the miibus(4) instance and to
  optionally also handle the probing, addition and initialization of the
  supported media. So all a PHY driver without any special requirements
  has to do in its bus attach method is to call mii_phy_dev_attach()
  along with PHY-specific MIIF_* flags, a pointer to its PHY functions
  and the add_media set to one. All PHY drivers were updated to take
  advantage of mii_phy_dev_attach() as appropriate. Along with these
  changes the capability mask was added to the mii_softc structure so
  PHY drivers taking advantage of mii_phy_dev_attach() but still
  handling media on their own do not need to fiddle with the MII attach
  arguments anyway.
- Keep track of the PHY offset in the mii_softc structure. This is done
  for compatibility with NetBSD/OpenBSD.
- Keep track of the PHY's OUI, model and revision in the mii_softc
  structure. Several PHY drivers require this information also after
  attaching and previously had to wrap their own softc around mii_softc.
  NetBSD/OpenBSD also keep track of the model and revision on their
  mii_softc structure. All PHY drivers were updated to take advantage
  as appropriate.
- Convert the mebers of the MII data structure to unsigned where
  appropriate. This is partly inspired by NetBSD/OpenBSD.
- According to IEEE 802.3-2002 the bits actually have to be reversed
  when mapping an OUI to the MII ID registers. All PHY drivers and
  miidevs where changed as necessary. Actually this now again allows to
  largely share miidevs with NetBSD, which fixed this problem already
  9 years ago. Consequently miidevs was synced as far as possible.
- Add MIIF_NOMANPAUSE and mii_phy_flowstatus() calls to drivers that
  weren't explicitly converted to support flow control before. It's
  unclear whether flow control actually works with these but typically
  it should and their net behavior should be more correct with these
  changes in place than without if the MAC driver sets MIIF_DOPAUSE.

Obtained from:	NetBSD (partially)
Reviewed by:	yongari (earlier version), silence on arch@ and net@
2011-05-03 19:51:29 +00:00
yongari
bdc9b04112 Partially revert r184106. RX buffer ring also needs bus_dmamap_sync().
Tested by:	Yamagi Burmeister (lists <> yamagi dot org)
MFC after:	1 week
2011-04-01 18:53:41 +00:00
yongari
a5be52d802 64bit DMA caused data corruption. Unfortunately there is no known
workaround to use 64bit DMA.
Disable 64bit DMA on Attansic L1 controller.

Tested by:	Yamagi Burmeister (lists <> yamagi dot org)
MFC after:	1 week
2011-04-01 16:45:26 +00:00
jhb
00c3c01f4f Do a sweep of the tree replacing calls to pci_find_extcap() with calls to
pci_find_cap() instead.
2011-03-23 13:10:15 +00:00
jhb
018ecd0b32 Forgot to remove unlock of the driver lock from age_start_locked() when
converting it to a locked variant.

PR:		kern/153948
2011-01-13 13:04:49 +00:00
jhb
be4690f32e Add a 'locked' variant of the foo_start() routine and call it directly
from interrupt handlers and watchdog routines instead of queueing a task
to call foo_start().

Reviewed by:	yongari
MFC after:	1 month
2011-01-03 18:28:30 +00:00
marius
385153aa98 Convert the PHY drivers to honor the mii_flags passed down and convert
the NIC drivers as well as the PHY drivers to take advantage of the
mii_attach() introduced in r213878 to get rid of certain hacks. For
the most part these were:
- Artificially limiting miibus_{read,write}reg methods to certain PHY
  addresses; we now let mii_attach() only probe the PHY at the desired
  address(es) instead.
- PHY drivers setting MIIF_* flags based on the NIC driver they hang
  off from, partly even based on grabbing and using the softc of the
  parent; we now pass these flags down from the NIC to the PHY drivers
  via mii_attach(). This got us rid of all such hacks except those of
  brgphy() in combination with bce(4) and bge(4), which is way beyond
  what can be expressed with simple flags.

While at it, I took the opportunity to change the NIC drivers to pass
up the error returned by mii_attach() (previously by mii_phy_probe())
and unify the error message used in this case where and as appropriate
as mii_attach() actually can fail for a number of reasons, not just
because of no PHY(s) being present at the expected address(es).

Reviewed by:	jhb, yongari
2010-10-15 14:52:11 +00:00
yongari
92023f4cc9 Make sure to not use stale ip/tcp header pointers. The ip/tcp
header parser uses m_pullup(9) to get access to mbuf chain.
m_pullup(9) can allocate new mbuf chain and free old one if the
space left in the mbuf chain is not enough to hold requested
contiguous bytes. Previously drivers can use stale ip/tcp header
pointer if m_pullup(9) returned new mbuf chain.

Reported by:	Andrew Boyer (aboyer <> averesystems dot com)
MFC after:	10 days
2010-10-14 18:31:40 +00:00
yongari
86e44bf444 Remove unnecessary controller reinitialization.
PR:	kern/87506
2010-08-24 19:41:15 +00:00
yongari
66db61c8c0 With r206844, CSUM_TCP is also set for CSUM_TSO case. Modify
drivers to take into account for the change. Basically CSUM_TSO
should be checked before checking CSUM_TCP.
2010-04-19 22:10:40 +00:00
yongari
a924508d55 Add TSO support on VLANs. While I'm here remove unnecessary check
of VLAN hardware checksum offloading. vlan(4) already takes care of
this.
2010-02-26 22:43:23 +00:00
yongari
b9dd684580 Fix multicast handling. All Atheros controllers use big-endian form
in computing multicast hash.

PR:	kern/139137
2009-09-29 23:03:16 +00:00
rwatson
be5740a255 Use if_maddr_rlock()/if_maddr_runlock() rather than IF_ADDR_LOCK()/
IF_ADDR_UNLOCK() across network device drivers when accessing the
per-interface multicast address list, if_multiaddrs.  This will
allow us to change the locking strategy without affecting our driver
programming interface or binary interface.

For two wireless drivers, remove unnecessary locking, since they
don't actually access the multicast address list.

Approved by:	re (kib)
MFC after:	6 weeks
2009-06-26 11:45:06 +00:00
yongari
a457681b2d pci(4) handles PCIM_CMD_INTxDIS so there is no need to poke this
bit in driver.
2009-05-20 03:33:27 +00:00
yongari
b702ca48cf o Don't access VPD even if hardware advertised the capability.
It seems that some revision of controller hang while accessing
  the VPD. Because VPD access routine are unused, nuke it.
o Let TWSI reload EEPROM if VPD capability is detected. Reloading
  EEPROM will also set ethernet address so age(4) now reads AGE_PAR0
  and AGE_PAR1 register to get ethernet address. This removes a lot
  of hack and enhance readability a lot.
o Double PHY reset timeout as it takes more time to take PHY out of
  power-saving state.
o Explicitly check power-saving state by checking undocumented PHY
  registers. If link is not up, poke undocumented registers to take
  PHY out of power-saving state. This is the same way what Linux
  does. On resume, make sure to wake up PHY.
o Don't rely on auto-clearing feature of master reset bit, just wait
  1ms and check idle status of MAC.
o Add PCI device revision information in bootverbose mode.
This should fix occasional controller hang in device attach phase.

Reported by:	barbara < barbara.xxx1975 at libero DOT it >
Tested by:	barbara < barbara.xxx1975 at libero DOT it >
2009-03-28 07:39:35 +00:00
yongari
f128d57f70 Fix inversed logic. pci_find_extcap() returns 0 when it finds
specified capability.
2009-03-23 00:27:46 +00:00
yongari
41e01818b6 Remove informational messages left. These messages were intended to
show up in verbose boot mode.

Reported by:	pluknet ( pluknet<> gmail DOT com )
2008-11-07 07:02:28 +00:00
kevlo
92f71ae8f1 No need to sync descriptors twice in age_rxintr() 2008-10-21 03:16:50 +00:00
kevlo
088a5c65f8 Fix a typo: jme -> age 2008-08-14 02:43:18 +00:00
yongari
065c59620f Use DELAY() instead of pause if waiting time is less than 1ms.
This will fix driver hang if hz < 1000.

Pointed out by:	thompsa
2008-07-18 01:00:54 +00:00
rpaulo
c482e7889a Fix typo in comment. 2008-06-08 14:42:43 +00:00
yongari
af582dfcb7 Add age(4), a driver for Attansic/Atheros L1 gigabit ethernet
controller. L1 has several threshold/timer registers and they
seem to require careful tuned parameters to get best
performance. Datasheet for L1 is not available to open source
driver writers so age(4) focus on stability and correctness of
basic Tx/Rx operation. ATM the performance of age(4) is far from
optimal which in turn means there are mis-programmed registers or
incorrectly configured registers.
Currently age(4) supports all known hardware assistance including
  - MSI support.
  - TCP Segmentation Offload.
  - Hardware VLAN tag insertion/stripping.
  - TCP/UDP checksum offload.
  - Interrupt moderation.
  - Hardware statistics counter support.
  - Jumbo frame support.
  - WOL support.

L1 gigabit ethernet controller is mainly found on ASUS
motherboards. Note, it seems that there are other variants of
hardware as known as L2(Fast ethernet) and newer gigabit ethernet
(AR81xx) from Atheros. These are not supported by age(4) and
requires a seperate driver. Big thanks to all people who reported
feedback or tested patches.

Tested by:	kevlo, bsam, Francois Ranchin < fyr AT fyrou DOT net >
		Thomas Nystroem < thn AT saeab DOT se >
		Roman Pogosyan < asternetadmin AT gmail DOT com >
		Derek Tattersal < dlt AT mebtel DOT net >
		Oliver Seitz < karlkiste AT yahoo DOT com >
2008-05-19 01:39:59 +00:00