Unnecessarily complex LE format used on Marvell controller was
main reason not to enable 64bit DMA addressing in driver. If high
32bit address of DMA address of TX/RX buffer is changed, driver has
to generate a new LE. In TX path, driver will keep track of lastly
used high 32bit address of DMA address and generate a new LE
whenever it sees high address change in the DMA address. In RX path,
driver will always use two LEs to specify 64bit DMA address of RX
buffer. If the high 32bit address of DMA address of RX buffer is
the same as previous DMA address of RX buffer, driver does not have
to use two LEs but driver will use two LEs for simplicity in RX
ring management.
One of draw back for switching to 64bit DMA addressing is that the
large amount of LEs are used to specify 64bit DMA address such that
number of available LEs for TX/RX buffers are considerably reduced.
To mitigate the issue, increase number of available LEs from 256 to
384 for TX and from 256 to 512 for RX. For 32bit architectures,
msk(4) does not use 64bit DMA addressing to save resources.
Tested by: das
does. Without this change, Yukon Extreme seems to generate lots of
RX FIFO overruns even though controller has available RX buffers.
These excessive RX FIFO overruns generated lots of pause frames
which in turn killed devices plugged into switch. It seems there is
still occasional RX frame corruption on Yukon Extreme but this
change seems to fix the pause frame storm.
Reported by: jhb
Tested by: jhb
MFC after: 5 days
Yukon FE and Yukon Ultra2. These controllers provide very simple
checksum computation mechanism and it requires additional pseudo
header checksum computation in upper stack. Even though I couldn't
see much performance difference with/without Rx checksum offloading
it may help notebook based controllers.
Actually controller can compute two checksum value by giving
different starting position of checksum computation on received
frame. However, for long time, Marvell's checksum offloading engine
have been known to have several silicon bugs so don't blindly trust
computed partial checksum value. Instead, compute partial checksum
twice by giving the same checksum computation position and compare
the result. If the value is different it's clear indication of
hardware bug. This configuration lose IP checksum offloading
capability but I think it's better to take safe route.
Note, Rx checksum offloading for Yukon XL was still disabled due to
known silicon bug.
does not generate excessive interrupts any more so we don't need
to have two copies of interrupt handler.
While I'm here remove two STAT_PUT_IDX register accesses in LE
status event handler. After r204539 msk(4) always sync status LEs
so there is no need to resort to reading STAT_PUT_IDX register to
know the end of status LE processing. Just trust status LE's
ownership bit.
countdown timer register. The timer resolution may vary among
controllers but the value would be represented by core clock
cycles. msk(4) will automatically computes number of required clock
cycles from given micro-seconds unit.
The default interrupt holdoff timer value is 100us which will
ensure less than 10k interrupts under load. The timer value can be
changed with dev.mskc.0.int_holdoff sysctl node.
Note, the interrupt moderation is shared resource on dual-port
controllers so you can't use separate interrupt moderation value
for each port. This means we can't stop interrupt moderation in
driver stop routine. Also have msk_tick() reclaim transmitted Tx
buffers as safety belt. With this change there is no need to check
missing Tx completion interrupt in watchdog handler, so remove it.
for controllers like 88E8053 which reports two MSI messages.
Because we don't get anything useful things with 2 MSI messages,
allocating 1 MSI message would be more sane approach.
While I'm here, enable MSI for dual-port controllers too. Because
status block is shared for dual-port controllers, I don't think
msk(4) will encounter problem for using MSI on dual-port
controllers.
The register offset is not valid on 88E8072 controller. Also don't
blindly increase max read request size to 4096, instead, use 2048
which seems to be more sane value and only change the value if the
hardware default size(512) was used on that register.
For PCIX controllers, use system defined constant rather than using
magic value.
While I'm here stop showing negotiated link width.
not require checksum LE configuration if checksum start and write
position is the same as before. So keep track last checksum start
and write position and insert new LE whenever the position is
changed. This reduces number of LEs used in TX path as well as
slightly enhance TX performance.
Unlike most other PHYs there is no easy way to know which media
type the PHY supports on Marvell PHYs. MIIF_HAVEFIBER flags is now
passed via bus-specific instance variable of a device. While I'm
here add 88E1112 specific work around to set SIGDET polarity low.
Many thanks "Eugene Perevyazko <john <> dnepro dot net>" who kindly
gave remote access to system with DGE-560SX.
These controllers use newer descriptor format and the new descriptor
format uses status LE to indicate the status of checksum. Rx
checksummed value used in previous controllers were very cryptic
and I failed to understand how to use them. In addition most
controllers in previous generations had Rx checksum offloading bug.
While I'm here introduce a MSK_FLAG_NORX_CSUM flag to bypass
checking Rx checksum offloading as Yukon FE+ A0 has status LE bug.
this feature hardware automatically computes TCP/UDP payload
offset. Introduce MSK_FLAG_AUTOTX_CSUM to mark the capability.
Yukon Extreme B0 revision is known to have a silicon for the
feature so disable it. Yukon Extreme B0 still can do Tx checksum
offloading but CPU have to compute TCP/UDP payload offset. To
enable traditional checksum offloading, disable automatic Tx
checksum calculation capability.
Yukon Extreme A0 revision could not use store-and-forward mode for
jumbo frames(silicon bug) so disable Tx checksum offloading for
jumbo frames.
I believe controllers that have MSK_FLAG_AUTOTX_CSUM capability or
new descriptor format do not have Tx checksum offload bug so
disable checksum offloading workaround for for short frames.
Tested by: jhb, Warren Block ( wblock <> wonkity dot com )
Yukon Extreme uses new descriptor format for TSO and has Tx frame
parser which greatly reduces CPU cycles spent in computing TCP/UDP
payload offset calculation in Tx checksum offloading path. The new
descriptor format also removed TCP/UDP payload computation for TSO
which in turn results in better TSO performance. It seems Yukon
Extreme has a lot of new (unknown) features but only basic
offloading is supported at this time. So far there are two known
issues.
o Sometimes Rx overrun errors happen when pulling data over
gigabit link. Running over 100Mbps seem to ok.
o Ethernet hardware address shows all-zeroed value on 88E8070.
Assigning ethernet address with ifconfig is necessary to make it
work.
Support for Yukon Extreme is not perfect but it would be better
than having a non-working device. Special thanks to jbh who fixed
several bugs of initial patch.
Tested by: jhb, Warren Block ( wblock <> wonkity dot com )
severe silicon bugs that can't handle VLAN hardware tagging as well
as status LE writeback bug. The status LE writeback bug is so
critical we can't trust status word of received frame. To accept
frames on Yukon FE+ A0 msk(4) just do minimal check for received
frames and pass them to upper stack. This means msk(4) can pass
corrupted frames to upper layer. You have been warned!
Also I supposed RX_GMF_FL_THR to be 32bits register but Linux
driver treated it as 16bit register so follow their leads. At least
this does not seem to break msk(4) on Yukon FE+.
Tested by: bz, Tanguy Bouzeloc ( the.zauron <> gmail dot com )
Bruce Cran ( bruce <> cran dot org dot uk )
Michael Reifenberger ( mike <> reifenberger dot com )
Stephen Montgomery-Smith ( stephen <> missouri dot edu )
Yukon FE+ is fast ethernet controller and uses new descriptor
format. Since I don't have this controller, the support code was
written from guess and various feedback from enthusiastic users.
Thanks to all users who patiently tested my initial patches.
Special thanks to Tanguy Bouzeloc who fixed critical bug of initial
patch.
Tested by: bz, Tanguy Bouzeloc ( the.zauron <> gmail dot com )
Bruce Cran ( bruce <> cran dot org dot uk )
Michael Reifenberger ( mike <> reifenberger dot com )
Stephen Montgomery-Smith ( stephen <> missouri dot edu )
mark controller's capability. Controllers that have jumbo frame
support sets MSK_FLAG_JUMBO, and controllers that does not support
checksum offloading for jumbo frames will set MSK_FLAG_JUMBO_NOCSUM.
For Fast Ethernet controllers it will set MSK_FLAG_FASTETHER and it
would be used in link state handling.
While here, disable Tx checksum offloading if jumbo frame is used
on controllers that does not have Tx checksum offloading capability
for jumbo frame(e.g. Yukon EC Ultra).
for jumbo frame.
o Nuke unneeded jlist lock which was used to protect jumbo buffer
management in local allocator.
o Added a new tunable hw.mskc.jumbo_disable to disable jumbo
frame support for the driver. The tunable could be set for
systems that do not need to use jumbo frames and it would
save (9K * number of Rx descriptors) bytes kernel memory.
o Jumbo buffer allocation failure is no longer critical error
for the operation of msk(4). If msk(4) encounter the allocation
failure it just disables jumbo frame support and continues to
work without your intervention.
Using local allocator had several drawbacks such as requirement of
large amount of continuous kernel memory and fixed (small) number
of available buffers. The need for large continuous memory resulted
in failure of loading driver with kldload on running systems.
Also small number of buffer used in local allocator showed poor
performance for some applications.
have hardware ram buffer. The silicon bug seem to be triggered by
pause frames if receive buffer is not aligned on FIFO word(8 bytes).
To workaround the issue, make sure to align Rx buffers on 8 bytes.
Unfortunately this workaround requires yet another Rx fixup for
strict alignment architecture machines to align IP header.
For newer hardwares that lacks ram buffer may not have this bug so
check number of available ram buffer size to see the existence of
ram buffer.
Reported by: Ian Freislich (ianf <at> clue dot co dot za), das
Tested by: Ian Freislich (ianf <at> clue dot co dot za)
frames. This bug seems to happen on certain hardware model/revision
(e.g. 88E8053) but it's not identified which hardwares are affected.
Revision 1.4 of if_mskreg.h was not enough to workaround the bug.
To workaround it, inrease GMAC FIFO threshold by one FIFO word to
flush received pause frames.
Reported by: das, Kirill Nuzhdin < kirill.nuzhdin AT rad dot chem dot msu dot ru >
Tested by: das, Kirill Nuzhdin
only at address 0 which is supposed to be the only valid phy address
on Marvell PHY. The more correct solution would be masking PHY
address ranges allowable in PHY probe routine. Unfortunately,
FreeBSD has no way to retrict the PHY address ranges or to pass special
flags to PHY driver.
This change assumes that PHY hardwares attached to msk(4) would be
Marvell made 88E11xx PHY.
With this changes the phantom phys attached on 88E8036(Yukon FE)
should disappear.
Reported by: Oleg Lomaka < oleg AT lomaka DOT org DOT ua >
Tested by: Oleg Lomaka < oleg AT lomaka DOT org DOT ua >
publicly available datasheet for Yukon II and don't know what
bug/workaround exist for the specific hardware revision. Also I don't
think the vendor will release hardware errata in near future.
The hardware feature lists were not used at all except setting water
mark registers. Since msk(4) should know exact chip model/revision
number to decide which hardware capability could be used the extra
feature lists were redundant.
o Enable jumbo frame support for EC Ultra and disable jumbo frame
for FE.
o Enable store and forward mode for standard MTU sized frame.
o Enable TSO for EC Ultra. However TSO/checksum offload is disabled
for jumbo frame case. Because EC Ultra can't use store and forward
mode for jumbo frame TSO/checksum offload is not available.
o Adjust Tx GMAC almost empty threshold value and add a jumbo frame
water mark. The maic value was obtained from Marvell's sk98lin
driver.
o Fix EC Ultra chip revision number.
MCLBYTES for the segment size but it used too many Tx descriptors in
TSO case.
While I'm here adjust maximum size of the sum of all segment lengths
in a given DMA mapping to 65535, the maximum size, in bytes, of a IP
packet.
It seems that valid pause frames(Tx flow control) cause GMAC to hang
such that it resulted in watchdog timeout. As a work around don't
flush Rx MAC FIFO if we've received pause frames.
Tested by: Harald Schmalzbauer (h DOT schmalzbauer AT omnisec DOT de)
controller. Due to lack of documentation, this driver is based on the
code from sk(4) and Marvell's myk(4) driver for FreeBSD. I've also
adopted the OpenBSD interface name, msk(4) in order to reduce naming
differences between BSDs.
The msk(4) driver supports the following Gigabit Ethernet adapters.
o SysKonnect SK-9Sxx Gigabit Ethernet
o SysKonnect SK-9Exx Gigabit Ethernet
o Marvell Yukon 88E8021CU Gigabit Ethernet
o Marvell Yukon 88E8021 SX/LX Gigabit Ethernet
o Marvell Yukon 88E8022CU Gigabit Ethernet
o Marvell Yukon 88E8022 SX/LX Gigabit Ethernet
o Marvell Yukon 88E8061CU Gigabit Ethernet
o Marvell Yukon 88E8061 SX/LX Gigabit Ethernet
o Marvell Yukon 88E8062CU Gigabit Ethernet
o Marvell Yukon 88E8062 SX/LX Gigabit Ethernet
o Marvell Yukon 88E8035 Gigabit Ethernet
o Marvell Yukon 88E8036 Gigabit Ethernet
o Marvell Yukon 88E8038 Gigabit Ethernet
o Marvell Yukon 88E8050 Gigabit Ethernet
o Marvell Yukon 88E8052 Gigabit Ethernet
o Marvell Yukon 88E8053 Gigabit Ethernet
o Marvell Yukon 88E8055 Gigabit Ethernet
o Marvell Yukon 88E8056 Gigabit Ethernet
o D-Link 550SX Gigabit Ethernet
o D-Link 560T Gigabit Ethernet
Unlike OpenBSD/NetBSD msk(4), the msk(4) driver supports all hardware
features including TCP/UDP checksum offload for transmit, MSI, TCP
segmentation offload(TSO), hardware VLAN tag stripping/insertion,
and jumbo frames(up to 9022 bytes). The only unsupported hardware
feature except RLMT is Rx checksum offload which I don't know how to
make it work reliably.
Known Issues:
It seems msk(4) does not work on the second port of dual port NIC.
(The first port works without problems.)
Thanks to Marvell for releasing the BSD licensed myk(4) driver and
thanks to all users helped fixing bugs.
Tested by: bz, philip, bms,
YAMAMOTO Shigeru < shigeru AT iij DOT ad DOT jp >,
Dmitry Pryanishnikov < dmitry AT atlantis DOT dp DOT ua >,
Jia-Shiun Li < jiashiun AT gmail DOT com >,
David Duchscher < daved AT tamu DOT edu >,
Arno J. Klaassen < arno AT heho DOT snv DOT jussieu DOT fr>,
Nicolae Namolovan < adrenalinup AT gmail DOT com>,
Andre Guibert de Bruet < andy AT siliconlandmark DOT com >
current ML
Tested on: i386, amd64