#defines. This also has the advantage that it makes the names more
compact, iand also allows us to correct the non-uniform naming of
the PCIM_LINK_* defines, making them all consistent amongst themselves.
This is a mostly mechanical rename:
s/PCIR_EXPRESS_/PCIER_/g
s/PCIM_EXP_/PCIEM_/g
s/PCIM_LINK_/PCIEM_LINK_/g
When this is MFC'd, #defines will be added for the old names to assist
out-of-tree drivers.
Discussed with: jhb
MFC after: 1 week
negotiate with each other on the TLP payload size so blindly
forcing the size to 128 can cause a completion error which in turn
will stop device.
Reported by: Geans Pin < geanspin <> broadcom dot com >
MFC after: 5 days
updated.
o Number of times NIC ran out of RX buffer descriptors
o Number of inbound packet errors
o Number of inbound packets that were chosen to be discarded
Previously only the discarded packet counter was used to update
if_ierrors. This change fixes wrong if_ierrors counter on
BCM570[0-4] controllers. For BCM5705 and later controllers bge(4)
already correctly counted it.
Reported by: Eugene Grosbein <egrosbein <> rdtc dot ru>
device in device attach. This would help to narrow down issue to a
specific controller and operating mode of the controller.
While I'm here rename BGE_MISCCFG_BOARD_ID with
BGE_MISCCFG_BOARD_ID_MASK.
AMD-8131 PCI-X bridge. The bridge seems to reorder write access to
mailbox registers such that it caused watchdog timeouts by
out-of-order TX completions.
Tested by: Michael L. Squires <mikes <> siralan dot org >
Reviewed by: jhb
one. Interestingly, these are actually the default for quite some time
(bus_generic_driver_added(9) since r52045 and bus_generic_print_child(9)
since r52045) but even recently added device drivers do this unnecessarily.
Discussed with: jhb, marcel
- While at it, use DEVMETHOD_END.
Discussed with: jhb
- Also while at it, use __FBSDID.
The SYSCTL_NODE macro defines a list that stores all child-elements of
that node. If there's no SYSCTL_DECL macro anywhere else, there's no
reason why it shouldn't be static.
bge(4) sends BGE_FW_CMD_DRV_ALIVE command to firmware every 2
seconds. BGE_FW_CMD_DRV_ALIVE command requires 4 bytes data. This
data contains timeout value in seconds until the next
BGE_FW_CMD_DRV_ALIVE command.
Broadcom recommends driver set the value 3 times longer than the
interval that it sends BGE_FW_CMD_DRV_ALIVE. Currently bge(4) uses
3 seconds so probably we have to increase it in future and use
different ALIVE command(e.g. BGE_FW_CMD_DRV_ALIVE3).
No functional changes.
This bit(SW event 7 in publicly available data sheet) is used to
make RX CPU handle a firmware command and the bit is automatically
cleared after RX CPU completed the command.
Generally firmware command takes the following steps.
1. Write BGE_SRAM_FW_CMD_MB with a command.
2. Write BGE_SRAM_FW_CMD_LEN_MB with the length of the command in bytes.
3. Write BGE_SRAM_FW_CMD_DATA_MB with actual command data.
4. Generate BGE_RX_CPU_EVENT and let firmware handle the command.
5. Wait for the ACK of the firmware command.
No functional changes.
about the various driver events like load, unload, reset, suspend,
restart, and ioctl operations.
Define driver's event rather than using hard-coded values. We don't
still send suspend/resume event to firmware.
Previously bge(4) used BGE_SDI_STATUS to send events. Because driver
has to access firmware mail box to inform current state, using
BGE_SDI_STATUS register was wrong. The end result was the same as
BGE_SDI_STATUS is 0x0C04.
No functional changes.
The origin of GENCOMM seems to come from Alteon Tigon Host/NIC
interface definition where it defines general communications region
which is active when firmware is loaded and running. This region
was used in communication between the host and processor internal
to the Tigon chip.
Broadcom data sheet also defines the region as 'Software Gencomm'
in NetXtreme memory map but lacks detailed description of its
interface so it was hard to know which ones are used for which
interface.
This change shall slightly enhance readability.
No functional changes.
larger than 4KB in size. However the maximum DMA segment size
created in DMA tag is 4KB, so we wouldn't encounter the issue here.
Just record this issue such that let developers not to create a DMA
segment that is larger than 4KB for BCM5719. It's possible to split
a DMA segment into multiple smaller ones in run time but I believe
it's not worth to implement that.
o Protect bge(4) status block access and register dump with driver lock.
o Add missing bus_dmamap_sync() before dumping status block.
o Use minimum status block size, 32 bytes, for status block dump on most
controllers except BCM5700 AX/BX.
While I'm here, make the handler show 5717 Plus in hardware flags.
disabled for BCM5719 A0 revision due to known hardware errata.
Many thanks to Broadcom for continuing support of FreeBSD.
Submitted by: Geans Pin at Broadcom
newer controllers. However, all data sheet I have access has no
indication that buffer manager should not be touched on these
controllers. It seems the buffer manager always runs on BCM5705 or
newer controllers. Some controller(e.g. BCM5719) needs other buffer
manager configuration so driver should enable buffer manager for
all controllers. Both Linux and OpenBSD/NetBSD use the same
approach.
This change polls enable bit of block to know whether specified
block was really stopped as well as enabling buffer manager for all
controllers in driver initialization.
Obtained from: NetBSD
have similar hardware features of BCM5718 family except the number
of receive return ring is 4. The BCM57765 family is known to
support IEEE 802.3az EEE(Energy Efficient Ethernet) but this change
does not include EEE support code. I hope EEE is implemented in
near future.
This change will support BCM57761, BCM57765, BCM57781, BCM57785,
BCM57791 and BCM57795. All hardware offloading features are
supported and suspend/resume also should work.
Many thanks to Broadcom for continuing support of FreeBSD.
Tested by: Paul Thornton (prt <> prt dot org)
HW donated by: Broadcom
(reporting IFM_LOOP based on BMCR_LOOP is left in place though as
it might provide useful for debugging). For most mii(4) drivers it
was unclear whether the PHYs driven by them actually support
loopback or not. Moreover, typically loopback mode also needs to
be activated on the MAC, which none of the Ethernet drivers using
mii(4) implements. Given that loopback media has no real use (and
obviously hardly had a chance to actually work) besides for driver
development (which just loopback mode should be sufficient for
though, i.e one doesn't necessary need support for loopback media)
support for it is just dropped as both NetBSD and OpenBSD already
did quite some time ago.
- Let mii_phy_add_media() also announce the support of IFM_NONE.
- Restructure the PHY entry points to use a structure of entry points
instead of discrete function pointers, and extend this to include
a "reset" entry point. Make sure any PHY-specific reset routine is
always used, and provide one for lxtphy(4) which disables MII
interrupts (as is done for a few other PHYs we have drivers for).
This includes changing NIC drivers which previously just called the
generic mii_phy_reset() to now actually call the PHY-specific reset
routine, which might be crucial in some cases. While at it, the
redundant checks in these NIC drivers for mii->mii_instance not being
zero before calling the reset routines were removed because as soon
as one PHY driver attaches mii->mii_instance is incremented and we
hardly can end up in their media change callbacks etc if no PHY driver
has attached as mii_attach() would have failed in that case and not
attach a miibus(4) instance.
Consequently, NIC drivers now no longer should call mii_phy_reset()
directly, so it was removed from EXPORT_SYMS.
- Add a mii_phy_dev_attach() as a companion helper to mii_phy_dev_probe().
The purpose of that function is to perform the common steps to attach
a PHY driver instance and to hook it up to the miibus(4) instance and to
optionally also handle the probing, addition and initialization of the
supported media. So all a PHY driver without any special requirements
has to do in its bus attach method is to call mii_phy_dev_attach()
along with PHY-specific MIIF_* flags, a pointer to its PHY functions
and the add_media set to one. All PHY drivers were updated to take
advantage of mii_phy_dev_attach() as appropriate. Along with these
changes the capability mask was added to the mii_softc structure so
PHY drivers taking advantage of mii_phy_dev_attach() but still
handling media on their own do not need to fiddle with the MII attach
arguments anyway.
- Keep track of the PHY offset in the mii_softc structure. This is done
for compatibility with NetBSD/OpenBSD.
- Keep track of the PHY's OUI, model and revision in the mii_softc
structure. Several PHY drivers require this information also after
attaching and previously had to wrap their own softc around mii_softc.
NetBSD/OpenBSD also keep track of the model and revision on their
mii_softc structure. All PHY drivers were updated to take advantage
as appropriate.
- Convert the mebers of the MII data structure to unsigned where
appropriate. This is partly inspired by NetBSD/OpenBSD.
- According to IEEE 802.3-2002 the bits actually have to be reversed
when mapping an OUI to the MII ID registers. All PHY drivers and
miidevs where changed as necessary. Actually this now again allows to
largely share miidevs with NetBSD, which fixed this problem already
9 years ago. Consequently miidevs was synced as far as possible.
- Add MIIF_NOMANPAUSE and mii_phy_flowstatus() calls to drivers that
weren't explicitly converted to support flow control before. It's
unclear whether flow control actually works with these but typically
it should and their net behavior should be more correct with these
changes in place than without if the MAC driver sets MIIF_DOPAUSE.
Obtained from: NetBSD (partially)
Reviewed by: yongari (earlier version), silence on arch@ and net@
Unlike other controllers which have more advanced jumbo support,
these controllers have one send ring, one standard receive producer
ring and one receive return ring. In order to receive jumbo frames
on the controllers, driver now will increase Rx buffer size to 9k.
Two Rx modes are supported on these controllers and I chose
standard Rx BDs over extended Rx BDs. The extended Rx BD mode
allows up to 4 segmentations for each Rx BDs such that kernel does
not have to allocate large buffer of contiguous memory for
receiving. The extended Rx BD mode is already used on controllers
that have separate jumbo receive ring. However, using extended Rx
BDs on BCM5714/BCM5715/BCM5780 reduces the number of Rx BDs to 256
entries which in turn may reduce the performance. Also UMA backed
page allocator for jumbo frame returns contiguous memory so using
extended Rx BD has no advantage on FreeBSD unless highly customized
local allocator implemented in driver is used.
To use jumbo buffers in standard receive ring, Rx buffer allocation
handler was changed to allocate MJUM9BYTES sized mbuf.
PR: kern/155192
Tested by: Vijay Singh <vijju.singh <> gmail dot com>
Submitted by: mjacob (initial version)
DMA boundary bug and runs with PCI-X mode. watchdog timeout was
observed on BCM5704 which lives behind certain PCI-X bridge(e.g.
AMD 8131 PCI-X bridge). It's still not clear whether the root
cause came from that PCI-X bridge or not. The watchdog timeout
indicates the issue is in TX path. If the bridge reorders TX
mailbox write accesses it would generate all kinds of problems but
I'm not sure. This should be revisited.
Tested by: Michael L. Squires (mikes <> siralan dot org)
issue seen on PCIX BCM5704 controller. r216970 fixed the issue but
the DMA address space restriction was applied to all bge(4)
controllers such that it caused unnecessary performance degradation
for controllers that have no such issues.
bus_dma(9)'s capability which honors boundary restrictions of DMA
tag for dynamic buffers. However it seems this does not work well
and it triggered watchodg timeouts on controller that has the
hardware bug. It's not clear whether there is still another
hardware bug not mentioned in errata. This should be revisited
since this change shall make use of bounce buffers which in turn
reduces performance a lot on systems that have more than 4GB
memory.
Reported by: Michael L. Squires (mikes <> siralan dot org)
Tested by: Michael L. Squires (mikes <> siralan dot org)
MFC after: 3 days
versions of FreeBSD. In fact we are already missing a lot of conditional
code necessary to support older versions of FreeBSD, including alternatives
for vital functionality not yet provided by the respective subsystem back
then (see for example r199663). So this change shouldn't actually break
this driver on versions of FreeBSD that were supported before. Besides,
this driver also isn't maintained as an multi-release version outside of
the main repository, so removing the conditional code shouldn't be a
problem in that regard either.
- Sprinkle some more const on tables.
of certain MAC models from brgphy(4) to bge(4) where it belongs. While at it,
update the list of models having that restriction to what OpenBSD uses, which
in turn seems to have obtained that information from the Linux tg3 driver.
support in mii(4):
- Merge generic flow control advertisement (which can be enabled by
passing by MIIF_DOPAUSE to mii_attach(9)) and parsing support from
NetBSD into mii_physubr.c and ukphy_subr.c. Unlike as in NetBSD,
IFM_FLOW isn't implemented as a global option via the "don't care
mask" but instead as a media specific option this. This has the
following advantages:
o allows flow control advertisement with autonegotiation to be
turned on and off via ifconfig(8) with the default typically
being off (though MIIF_FORCEPAUSE has been added causing flow
control to be always advertised, allowing to easily MFC this
changes for drivers that previously used home-grown support for
flow control that behaved that way without breaking POLA)
o allows to deal with PHY drivers where flow control advertisement
with manual selection doesn't work or at least isn't implemented,
like it's the case with brgphy(4), e1000phy(4) and ip1000phy(4),
by setting MIIF_NOMANPAUSE
o the available combinations of media options are readily available
from the `ifconfig -m` output
- Add IFM_FLOW to IFM_SHARED_OPTION_DESCRIPTIONS and IFM_ETH_RXPAUSE
and IFM_ETH_TXPAUSE to IFM_SUBTYPE_ETHERNET_OPTION_DESCRIPTIONS so
these are understood by ifconfig(8).
o Make the master/slave support in mii(4) actually usable:
- Change IFM_ETH_MASTER from being implemented as a global option via
the "don't care mask" to a media specific one as it actually is only
applicable to IFM_1000_T to date.
- Let mii_phy_setmedia() set GTCR_MAN_MS in IFM_1000_T slave mode to
actually configure manually selected slave mode (like we also do in
the PHY specific implementations).
- Add IFM_ETH_MASTER to IFM_SUBTYPE_ETHERNET_OPTION_DESCRIPTIONS so it
is understood by ifconfig(8).
o Switch bge(4), bce(4), msk(4), nfe(4) and stge(4) along with brgphy(4),
e1000phy(4) and ip1000phy(4) to use the generic flow control support
instead of home-grown solutions via IFM_FLAGs. This includes changing
these PHY drivers and smcphy(4) to no longer unconditionally advertise
support for flow control but only if the selected media has IFM_FLOW
set (or MIIF_FORCEPAUSE is set) and implemented for these media variants,
i.e. typically only for copper.
o Switch brgphy(4), ciphy(4), e1000phy(4) and ip1000phy(4) to report and
set IFM_1000_T master mode via IFM_ETH_MASTER instead of via IFF_LINK0
and some IFM_FLAGn.
o Switch brgphy(4) to add at least the the supported copper media based on
the contents of the BMSR via mii_phy_add_media() instead of hardcoding
them. The latter approach seems to have developed historically, besides
causing unnecessary code duplication it was also undesirable because
brgphy_mii_phy_auto() already based the capability advertisement on the
contents of the BMSR though.
o Let brgphy(4) set IFM_1000_T master mode on all supported PHY and not
just BCM5701. Apparently this was a misinterpretation of a workaround
in the Linux tg3 driver; BCM5701 seem to require RGPHY_1000CTL_MSE and
BRGPHY_1000CTL_MSC to be set when configuring autonegotiation but
this doesn't mean we can't set these as well on other PHYs for manual
media selection.
o Let ukphy_status() report IFM_1000_T master mode via IFM_ETH_MASTER so
IFM_1000_T master mode support now is generally available with all PHY
drivers.
o Don't let e1000phy(4) set master/slave bits for IFM_1000_SX as it's
not applicable there.
Reviewed by: yongari (plus additional testing)
Obtained from: NetBSD (partially), OpenBSD (partially)
MFC after: 2 weeks
the dual port BCM5717 and BCM5718 devices which are intended for
mainstream workstation and entry-level server designs and
represents the twelfth generation of NetXtreme Ethernet controllers.
This family is the successor to the BCM5714/BCM5715 family and
supports IPv4/IPv6 checksum offloading, TSO, VLAN hardware tagging,
jumbo frames, MSI/MSIX, IOV, RSS and TSS.
This change set supports all hardware features except IOV and
RSS/TSS. Unlike its predecessors, only extended RX buffer
descriptors can be posted to the jumbo producer ring. Single RX
buffer descriptors for jumbo frame are not supported. RSS requires
a more substantial set of changes and will apply to a larger set
of NetXtreme devices so RSS/TSS multi-queue support will be
implemented in a future releases.
Special thanks to Broadcom who kindly sent a sample board to me
and to davidch who gave provided the initial support code.
Submitted by: davidch (initial version)
HW donated by: Broadcom
to BCM6906 A0/A2. This should fix a long standing BCM5906 A2 lockup
issues. Data sheet explicitly mentions BCM5906 A0, A1 and A2 use
de-pipelined mode on these revisions.
Special thanks to Buganini who tried all combinations of
experimental patches for more than 10 days.
Tested by: Buganini <buganini <> gmail dot com >
auto-negotiation results in half-duplex operation, excess collision
on the ethernet link may cause internal chip delays that may result
in subsequent valid frames being dropped due to insufficient
receive buffer resources. The workaround is to choose de-pipeline
method as a flow control decision for SDI. De-pipeline method
allows only 1 data in TxMbuf at a time such that a request to RDMA
from SDI is made only when TxMbuf is empty. Thanks for david for
providing detailed errata information.
receive two back-to-back send BDs with less than or equal to 8
total bytes then the device may hang. The two back-to-back send
BDs must be in the same frame for this failure to occur.
Thanks to davidch for detailed errata information.
Reviewed by: davidch
the NIC drivers as well as the PHY drivers to take advantage of the
mii_attach() introduced in r213878 to get rid of certain hacks. For
the most part these were:
- Artificially limiting miibus_{read,write}reg methods to certain PHY
addresses; we now let mii_attach() only probe the PHY at the desired
address(es) instead.
- PHY drivers setting MIIF_* flags based on the NIC driver they hang
off from, partly even based on grabbing and using the softc of the
parent; we now pass these flags down from the NIC to the PHY drivers
via mii_attach(). This got us rid of all such hacks except those of
brgphy() in combination with bce(4) and bge(4), which is way beyond
what can be expressed with simple flags.
While at it, I took the opportunity to change the NIC drivers to pass
up the error returned by mii_attach() (previously by mii_phy_probe())
and unify the error message used in this case where and as appropriate
as mii_attach() actually can fail for a number of reasons, not just
because of no PHY(s) being present at the expected address(es).
Reviewed by: jhb, yongari
header parser uses m_pullup(9) to get access to mbuf chain.
m_pullup(9) can allocate new mbuf chain and free old one if the
space left in the mbuf chain is not enough to hold requested
contiguous bytes. Previously drivers can use stale ip/tcp header
pointer if m_pullup(9) returned new mbuf chain.
Reported by: Andrew Boyer (aboyer <> averesystems dot com)
MFC after: 10 days
auto polling such that it made all controllers obtain link status
information from the state of the LNKRDY input signal. Broadcom
recommends disabling auto polling such that driver should rely on
PHY interrupts for link status change indications. Unfortunately it
seems some controllers(BCM5703, BCM5704 and BCM5705) have PHY
related issues so Linux took other approach to workaround it.
bge(4) didn't follow that and it used to enable auto polling to
workaround it. Restore this old behavior for BCM5700 family
controllers and BCM5705 to use auto polling. For BCM5700 and
BCM5701, it seems it does not need to enable auto polling but I
restored it for safety.
Special thanks to marius who tried lots of patches with patience.
Reported by: marius
Tested by: marius
Link UP state could be reported first before actual completion of
auto-negotiation. This change makes bge(4) reprogram BGE_MAC_MODE,
BGE_TX_MODE and BGE_RX_MODE register only after controller got a
valid link.
receive producer ring only for BCM5700. It was believed that
BCM5700 with external SSRAM is the only controller that supports
mini ring but it seems all BCM570[0-4] requires to disable mini
receive producer ring. Otherwise, it caused unexpected RX DMA
error or watchdog timeouts.
Reported by: marius, Steve Kargl <sgk <> troutmask dot apl dot washington dot edu>
Tested by: marius, Steve Kargl <sgk <> troutmask dot apl dot washington dot edu>
before setting the flag, interrupt was already enabled such that
interrupt handler could be run before setting IFF_DRV_RUNNING flag.
This can lose initial link state change interrupt which in turn
make bge(4) think that it still does not have valid link. Fix this
race by protecting the taskqueue with a driver lock.
While I'm here move reenabling interrupt code after handling of link
state chage.
Reviewed by: davidch
Previously bge(4) always enabled auto polling for non-BGE_FLAG_TBI
controllers. With this change, auto polling is not used anymore so
polling through mii(4) was introduced.
Reviewed by: davidch
as 5788. This caused BGE_MISC_LOCAL_CTL register is used to
generate link state change interrupt for non-5788 controllers. The
interrupt handler may or may not detect link state attention as
status block wouldn't be updated when an interrupt was generated
with BGE_MISC_LOCAL_CTL register. All controllers except 5700 and
5788 should use host coalescing mode register to trigger an
interrupt.
versions of controller support different number of ring control
blocks such that adjust code a bit to access known number of
send/receive ring control blocks. Previously bge(4) blindly
accessed 16 send/receive RCBs. Also move initializing standard
receive producer ring producer index, jumbo receive producer ring
producer index and mini receive producer ring producer index to
the end of each receive producer ring initialization.
Do not assume mini receive producer ring is available only when
controller has jumbo frame capability, instead explicitly check
ASIC version BCM5700 to disable mini receive producer ring.
Additionally always enable send ring 0 regardless of controller
versions. Previously bge(4) didn't enable send ring 0 if controller
is BGE_IS_5705_PLUS. Becase bge(4) need 1 send ring to send frames
at least, I have no idea how it would have worked so far.
Submitted by: davidch
BGE_MI_MODE register accesses. Previously bge(4) used to read
BGE_MI_MODE register to detect whether it needs to disable
autopolling feature or not. Because we don't touch autopolling in
other part of driver there is no reason to read BGE_MI_MODE
register given that we know default value in advance. In order to
achieve the goal, check whether the controller has CPMU(Central
Power Mangement Unit) capability. If controller has CPMU feature,
use 500KHz MII management interface(mdio/mdc) frequency regardless
core clock frequency. Otherwise use default MII clock. While I'm
here, add CPMU register definition.
In bge_miibus_readreg(), rearrange code a bit and remove goto
statement. In bge_miibus_writereg(), make sure to restore
autopolling even if MII write failed. The delay time inserted after
accessing BGE_MI_MODE register increased from 40us to 80us.
The default PHY address is now stored in softc. All PHYs supported
by bge(4) currently uses PHY address 1 but it will be changed when
we add newer controllers. This change will make it easier to change
default PHY address depending on PHY models.
Submitted by: davidch
these names are used in data sheet. Also use UnicastPkts,
MulticastPkts and BroadcastPkts instead of UcastPkts, McastPkts
and BcastPkts to clarify its meaning.
Suggested by: bde
controllers. bge(4) exported MAC statistics on controllers that
maintain the statistics in the NIC's internal memory. Newer
controllers require register access to fetch these values. These
counters provide useful information to diagnose driver issues.
parent driver. Use that information to configure flow-control.
One drawback is there is no way to disable flow-control as we still
don't have proper way to not advertise RX/TX pause capability to
link partner. But I don't think it would cause severe problems and
users can selectively disable flow-control in switch port.
has reached. This reduced number of dropped frames when
flow-control is enabled. Previously it dropped incoming frames once
RX MBUF low watermark has reached. The value used in MAC RX MBUF
low watermark is greater than or equal to 4 so receiving two more
RX frames should not be a problem.
Obtained from: OpenBSD
too many bge(4) controllers there and model name does not
necessarily match asic/chip revision. Relying on VPD string made
it hard to identify exact asic/chip revision so the first step to
debug bge(4) was getting exact asic/chip information with verbose
boot which may not be available on production server.
already updated after allocating mbuf so driver had to use the last
index instead of using next producer index. This should fix driver
hang which may happen under high network load.
Reported by: Igor Sysoev <is <> rambler-co dot ru>, Vlad Galu <dudu <> dudu dot ro>
Tested by: Igor Sysoev <is <> rambler-co dot ru>, Vlad Galu <dudu <> dudu dot ro>
MFC after: 10 days
or not by comparing reported TX consumer index with saved index. So
remove unnecessary check done after freeing transmitted mbufs.
While I'm here nuke unnecessary variable initializations.
tag. All controllers that are not BCM5755 or higher have 4GB
boundary DMA bug. Previously bge(4) used 32bit DMA address to
workaround the bug(r199670). However this caused the use of bounce
buffers such that it resulted in poor performance for systems which
have more than 4GB memory. Because bus_dma(9) honors boundary
restriction requirement of DMA tag for dynamic buffers, having a
separate TX/RX mbuf DMA tag will greatly reduce the possibility of
using bounce buffers. For DMA buffers allocated with
bus_dmamem_alloc(9), now bge(4) explicitly checks whether the
requested memory region crossed the boundary or not.
With this change, only the DMA buffer that crossed the boundary
will use 32bit DMA address. Other DMA buffers are not affected as
separate DMA tag is created for each DMA buffer.
Even if 32bit DMA address space is used for a buffer, the chance to
use bounce buffer is still very low as the size of buffer is small.
This change should eliminate most usage of bounce buffers on
systems that have more than 4GB memory.
More correct fix would be teaching bus_dma(9) to honor boundary
restriction for buffers created with bus_dmamem_alloc(9) but it
seems that is not easy.
While I'm here cleanup bge_dma_map_addr() and remove unnecessary
member variables in bge_dmamap_arg structure.
Tested by: marcel
datagrams with checksum value 0 when TX UDP checksum offloading is
enabled. Generating UDP checksum value 0 is RFC 768 violation.
Even though the probability of generating such UDP datagrams is
low, I don't want to see FreeBSD boxes to inject such datagrams
into network so disable UDP checksum offloading by default. Users
still override this behavior by setting a sysctl variable or loader
tunable, dev.bge.%d.forced_udpcsum.
I have no idea why this issue was not reported so far given that
bge(4) is one of the most commonly used controller on high-end
server class systems. Thanks to andre@ who passed the PR to me.
PR: kern/104826
r165114 added that code and that change ignored the same logic
committed in r135772. In addition, data FIFO protection should be
selectively enabled instead of applying to all PCIe devices.
While I'm here add BCM5785 to devices that do not require this
fix.