datagrams with checksum value 0 when TX UDP checksum offloading is
enabled. Generating UDP checksum value 0 is RFC 768 violation.
Even though the probability of generating such UDP datagrams is
low, I don't want to see FreeBSD boxes to inject such datagrams
into network so disable UDP checksum offloading by default. Users
still override this behavior by setting a sysctl variable or loader
tunable, dev.bge.%d.forced_udpcsum.
I have no idea why this issue was not reported so far given that
bge(4) is one of the most commonly used controller on high-end
server class systems. Thanks to andre@ who passed the PR to me.
PR: kern/104826
buffers it should also reinitialize RX descriptors otherwise some
stale data could be passed to controller. This could end up with
mbuf double free or unexpected NULL pointer dereference in upper
stack. To fix the issue, save loaded buffer's length and
reinitialize RX descriptors with the saved value whenever bge(4)
reuses the loaded RX buffers.
While I'm here, increase the number of RX buffers to 512 from 256.
This simplifies RX buffer handling as well as giving more RX
buffers. Controller supports just fixed number of RX buffers
(i.e. 512) and bge(4) used to rely on hope that our CPU is fast
enough to keep up with the controller. With this change, bge(4)
will use 1MB for RX buffers but I don't think it would cause
problems in these days.
Reported by: marcel
Tested by: marcel
over GMII, make sure to enable GMII. With this change brgphy(4) is
used to handle the dual mode PHY. Since we still don't have a sane
way to pass PHY specific information to mii(4) layer special
handling is needed in brgphy(4) to determine which mode of PHY was
configured in parent interface.
This change make BCM5715S work.
Tested by: olli
Obtained from: OpenBSD
MFC after: 1 week
o Don't enable BGE_FLAG_BER_BUG on both 5722 and 5756, and based
on their PCI IDs rather than their chip IDs.
Reported by: several PC-BSD users via kmoore
Reviewed by: yongari, imp, jhb, davidch
Sponsored by: iXsystems, Inc.
MFC after: 2 weeks
hw.bge.forced_collapse. hw.bge.forced_collapse affects all bge(4)
controllers on system which may not desirable behavior of the
sysctl node. Also allow the sysctl node could be modified at any
time.
Reviewed by: bde (initial version)
seem to require a special firmware to use TSO. But the firmware is
not available to FreeBSD and Linux claims that the TSO performed by
the firmware is slower than hardware based TSO. Moreover the
firmware based TSO has one known bug which can't handle TSO if
ethernet header + IP/TCP header is greater than 80 bytes. The
workaround for the TSO bug exist but it seems it's too expensive
than not using TSO at all. Some hardwares also have the TSO bug so
limit the TSO to the controllers that are not affected TSO issues
(e.g. 5755 or higher).
While I'm here set VLAN tag bit to all descriptors that belengs to
a frame instead of the first descriptor of a frame. The datasheet
is not clear how to handle VLAN tag bit but it worked either way in
my testing. This makes it simplify TSO configuration a little bit.
Big thanks to davidch@ who sent me detailed TSO information.
Without this I was not able to implement it.
Tested by: current
have a DMA bug when buffer address crosses a multiple of the 4GB
boundary(e.g. 4GB, 8GB, 12GB etc). Limit DMA address to be within
4GB address for these controllers. The second DMA bug limits DMA
address to be within 40bit address space. This bug applies to
BCM5714 and BCM5715 and 5708(bce(4) controller). This is not
actually a MAC controller bug but an issue with the embedded PCIe
to PCI-X bridge in the device. So for BCM5714/BCM5715 controllers
also limit the DMA address to be within 40bit address space.
Special thanks to davidch@ who gave me detailed errata information.
I think this change will fix long standing bge(4) instability
issues on systems with more than 4GB memory.
Reviewed by: davidch
PCI flush to get correct status block update. Add an optimized
interrupt handler that is activated for MSI case. Actual interrupt
handling is done by taskqueue such that the handler does not
require driver lock for Rx path. The MSI capable bge(4) controllers
automatically disables further interrupt once it enters interrupt
state so we don't need PIO access to disable interrupt in interrupt
handler.
directly access them at fixed address. While I'm here don't touch
other bits of PCIe device control register except max payload size.
Reviewed by: marius
Introduce two spare dma maps for standard buffer and jumbo buffer
respectively. If loading a dma map failed reuse previously loaded
dma map. This should fix unloaded dma map is used in case of dma
map load failure. Also don't blindly unload dma map and defer
dma map sync and unloading operation until we know dma map for new
buffer is successfully loaded. This change saves unnecessary dma
load/unload operation. Previously bge(4) tried to reuse mbuf
with unloaded dma map which is really bad thing in bus_dma(9)
perspective.
While I'm here update if_iqdrops if we can't allocate Rx buffers.
and Rx DMA tag separately. Previously it used a common mbuf DMA tag
for both Tx and Rx path but Rx buffer(standard ring case) should
have a single DMA segment and maximum buffer size of the segment
should be less than or equal to MCLBYTES. This change also make it
possible to add TSO with minor changes.
BGE_PCI_PRODID_ASICREV register to store the chip identifier and its revision.
- Add new grouping macro for 7575+ chips (BGE_IS_5755_PLUS).
- Add IDs for Fujitsu-branded Broadcom adapters.
PR: kern/127587
Tested by: Thomas Quinot <thomas@quinot.org> (BCM7561 A0)
MFC after: 2 weeks
Obtained from: OpenBSD
quirk requiring it to be enabled even when using MSI. This makes
the latter work again after r189285.
- Remove a comment which no longer applies since r190194.
- If boot verbose, print asicrev, chiprev and bus type on attach.
- For PCI Express devices:
1) Adjust max read request size to 4Kbytes
2) Turn on FIFO_LONG_BURST in RDMA during bge_blockinit()
Though 1) does not seem to have much to do with the poor TX performance
observed on PCI Express bge(4), 2) does fix the problem. [1]
- Nuke the RX CPU self-diag, which prevents working cards from working
(Linux tg3 does not have this diag neither does OpenBSD's bge(4)).
The increasing of the firmware handshaking timeout to 20000 retries
done as part of the original commit isn't merged as way already have a
way higher BGE_TIMEOUT of 100000.
PR: 119361 [1]
Obtained from: tg3 via DragonflyBSD [1], DragonflyBSD
- Rename BGE_FLAG_EEPROM to BGE_FLAG_EADDR to underline it's absence means
"there's no chip containing an Ethernet address fitted to the BGE chip
so we have to get it from the firmware instead" rather than "there's no
EEPROM, but maybe NVRAM or something else".
- Don't treat BCM5906[M] generally like chips w/o BGE_FLAG_EADDR set, just
in the two cases really necessary. This gets us line with the original
patch for DragonFlyBSD.
- For sparc64 restore the intended behavior of obtaining the Ethernet
address from the firmware in case BGE_FLAG_EADDR is not set, even for
BCM5906[M].
- Fix some style(9) bugs introduced with rev. 1.208 of if_bge.c
Approved by: jhb
Additional testing by: Thomas Nystroem (BCM5906)
all cards/modes.
In addition to the intr forcing added with rev. 1.205 adopt the other
places to use the same logic.
We need to exclude a few chips/revisions (5700, 5788) from using the
enhanced version and fall back to the old way as that is the only
method they support.
Tested by: phk
Suggested by: davidch, Broadcom (thanks a lot for the help!)
MFC after: 16 days
10/100 operation and place the mailbox registers at a different offset.
They also do not have an EEPROM, so the MAC address must be read from
NVRAM instead.
MFC after: 1 month
PR: kern/118975
Submitted by: benjsc, Thomas Nyström thn at saeab dot se
Submitted by: sephe (original patch for DragonflyBSD)
Blade 2500, Fire V210 and probably some other sparc64 machines.
These chips are typically not fitted with an EEPROM which means
that we have to obtain the MAC address via OFW and that some chip
tests will just always fail.
These changes are based on the respective code found in OpenBSD
with some additional info obtained from OpenSolaris and some style
suggestions by jkim@. They also have the desired side-effect of
respecting the 'local-mac-address?' system configuration variable
for the affected BGEs.
- In bge_attach() factor out calling bge_release_resources() before
going to the fail label into the fail label as well as replace a
magic 6 with ETHER_ADDR_LEN.
Reviewed by: yongari (before style changes), jkim
- Move some PHY bug detections from brgphy.c to if_bge.c.
- Do not penalize working PHYs.
- Re-arrange bge_flags roughly by their categories.
- Fix minor style(9) nits.
PR: kern/107257
Obtained from: OpenBSD
Tested by: Mike Hibler <mike at flux dot utah dot edu>
- Do not repeatedly read vendor/device IDs while probing.
- Remove redundant bzero(3) for softc. device_get_softc(9) does it for free[1].
Reviewed by: glebius
Suggested by: glebius[1]
- Use the appropriate register writing method when reseting the chip
- Program the descriptor DMA engine correctly.
- More reliably detect certain chips and their features.
Also add some low-level debugging tools to help future work on this driver.
Submitted by: David Christenson (proof of concept changes)
Sponsored by: www.UIA.net
discarded RX packets to input error for BCM5705 or newer chipset as the others.
Unfortunately we cannot do the same for output errors because ifOutDiscards
equivalent register does not exist. While I am here, replace misleading and
wrong BGE_RX_STATS/BGE_TX_STATS with BGE_MAC_STATS. They were reversed but
worked accidently.
to it. Try to co-operate with the IPMI/ASF firmware accessing the PHY.
One we get link we don't mess with the PHY. If we do then over time
the NIC will go off line. It would be nice if we could tell if IPMI
was enabled on the chip but I can't figure out a reliable way to do
that. The scheme I tried worked on a Dell PE850 but not on an HP machine.
So we assume any NIC that has ASF capability needs to deal with it.
The code was inspired by the support in Linux from kernel.org and Broadcom.
Broadcom did give me some info. but it is rather limited and is mostly
just what is in the Linux driver. Thanks to the numerous people that
helped debug the many prior versions and that I didn't break other
bge(4) HW.
Reviewed by: several people
Tested by: even more
BCM5787 based NICs.
- Recognize BCM5703 B0 ASIC.
- Rewrite the jumbo capability matching macro, so that chips known
to work are listed there. [*]
[*] I'm still not sure about this. Probably more corrections
will be done to this macro after discussion with davidch@
and brad@OpenBSD.
Obtained from: OpenBSD (brad)
- Add more device IDs, ASIC revisions and chip IDs.
- Rewrite a bit code that picks the description for device.
- Introduce several macros to shorten quirks for bugs and
features.[*]
- Use some magic values, that OpenBSD has successfully
possessed from Linux (Broadcom supplied) driver.
- Remove disabled code that tried to access VPD.
[*] The macro that matches Jumbo capable NICs is
rewritten to preserve our current behavior. I
need clarify whether our or theirs is correct.
PR: 68351 (and may be others)
Obtained from: OpenBSD, brad@ mostly
2) add missing bus_dmamap_sync() call in bge_intr()
Tested by: Husnu Demir <hdemir AT metu DOT edu DOT tr>
Approved by: glebius (mentor)
MFC after: 3 days
as input/output interface errors.
- Keep values of rx/tx discards & tx collisions inside struct bge_softc.
So we can keep statistic across ifconfig down/up runs (cause bringing
bge up will reset chip).
Approved by: glebius (mentor)
MFC after: 1 week
we can cache its value in the softc. Eliminates one PCI register
write per call to bge_start().
A 1.8% speedup for UDP_RR test on my old box.
Obtained from: NetBSD(jonathan) via delphij
- Give up endianess support and switch to native-endian format for
accessing hardware structures. In fact embedded processor for
BCM57xx is big-endian architure(MIPS) and it requires native-endian
format for NIC structures.The NIC performs necessary byte/word
swapping depending on programmed endian type.
- With above changes all htole16/htole32 calls were gone.
- Remove bge_vhandle member in softc and changed to use explicit
register access. This may add additional performance penalty
that than that of previous memory access. But most of the access
is performed on initialization phase(e.g. RCB setup), it would be
negligible.
Due to incorrect use of bus_dma(9) in bge(4) it still panics sparc64
system in device detach path. The issue would be fixed in next patch.
Reviewed by: jkim (initial version)
Silence from: ps
Tested by: glebius
Obtained from: NetBSD via OpenBSD
cluster allocator, that wasn't MPSAFE. Instead, utilize our new generic
UMA jumbo cluster allocator. Since UMA gives us a 9k piece that is contigous
in virtual memory, but isn't contigous in physical memory we need to handle
a few segments. To deal with this we utilize Tigon chip feature - extended
RX descriptors, that can handle up to four DMA segments for one frame.
Details:
o Remove bge_alloc_jumbo_mem(), bge_free_jumbo_mem(),
bge_jalloc(), bge_jfree() functions.
o Remove SLIST heads, bge_jumbo_tag, bge_jumbo_map from softc.
o Use extended RX BDs for Jumbo receive producer ring, and
initialize it appropriately.
o New bge_newbuf_jumbo():
- Allocate an mbuf with Jumbo cluster with help of m_cljget().
- Load the cluster for DMA with help of bus_dmamap_load_mbuf_sg().
- Assert that we got 3 segments in the DMA mapping.
- Fill in these 3 segments into the extended RX descriptor.
struct ifnet or the layer 2 common structure it was embedded in have
been replaced with a struct ifnet pointer to be filled by a call to the
new function, if_alloc(). The layer 2 common structure is also allocated
via if_alloc() based on the interface type. It is hung off the new
struct ifnet member, if_l2com.
This change removes the size of these structures from the kernel ABI and
will allow us to better manage them as interfaces come and go.
Other changes of note:
- Struct arpcom is no longer referenced in normal interface code.
Instead the Ethernet address is accessed via the IFP2ENADDR() macro.
To enforce this ac_enaddr has been renamed to _ac_enaddr.
- The second argument to ether_ifattach is now always the mac address
from driver private storage rather than sometimes being ac_enaddr.
Reviewed by: sobomax, sam
I have from Broadcom does not give much information on these devices,
so the Broadcom Linux driver was used for clues to what these chips
support. It turns out they are similar to the 5705 with the 5751
being the PCI-Express version and needing special work-arounds and
settings.
mode. The 5704 apparently has some s00p3r s33kr1t registers for setting
the advertisement of pause frame ability (i.e flow control) when in
autoneg mode. If we don't set these registers correctly, we may not
be able to negotiate a proper link with some switches. (Symptom is that
the NIC reports the link as up (PCS synched) but no traffic can be
exchanged.)
PR: kern/67598
These are 10/100 only NICs found on the IBM Thinkpad R40E and
G40. These seem to be based on the BCM5705 MAC but with a PHY
that doesn't support 1000Mbps modes.
Submitted by: Igor Sviridov <sia@nest.org>
- 5705 doesn't support jumbo frames
- Statistics must be read from registers
- RX return ring must be capped at 512 entries
- Omit initialization of certain device blocks
- Acknowledge link change interrupts by setting the 'link changed'
bit in the status register (used to have no effect)
- Remember to toggle the MI completion bit too
- Set the mbuf low watermark differently (on-chip memory buffers,
not BSD mbufs)
- Don't enable Ethernet@WireSpeed feature for certain 5705 chip revs
- Add additional PCI IDs for 5705 and 5782 parts
- Add a forgotten 5704 PCI ID
Most changes ripped kicking and screaming from the Broadcom linux driver.
Thanks to Paul Saab for sanity testing. (My lack of sanity has been
confirmed.)
ASIC revision is really the major number of the CHIPID. Also store
the chipid, asic rev and chip revision in the softc for later use.
- The write twice to send producer index workaround only applies to
the 5700_BX chips, so only do it there.
Requested by: jdp
- Do not initalize the LED's to 0x00. The default configuration
the chip comes up in should yeild proper operation of the LED's.
Confirmed by: John Cagle <john.cagle@hp.com>
Approved by: re (blanket)
as separate 16-bit entities. Some of the ring control blocks are
in NIC memory, so they must be referenced using 32-bit accesses.
Smaller accesses have been observed to fail under some conditions.
This caused the rings to be set up wrong, leading to writes by the
card outside of the intended bounds of the rings. This problem was
diagnosed by Michael Barthelow. Don Bowman submitted a patch which
fixed the problem using a slightly different approach.
Reference ring control blocks in NIC memory using a pointer to
volatile.
Parenthesize the BGE_HOSTADDR macro definition properly.
MFC after: 3 days
o don't strip the Ethernet header from inbound packets; pass packets
up the stack intact (required significant changes to some drivers)
o reference common definitions in net/ethernet.h (e.g. ETHER_ALIGN)
o track ether_ifattach/ether_ifdetach API changes
o track bpf changes (use BPF_TAP and BPF_MTAP)
o track vlan changes (ifnet capabilities, revised processing scheme, etc.)
o use if_input to pass packets "up"
o call ether_ioctl for default handling of ioctls
Reviewed by: many
Approved by: re
cards to test; however the submitter reports that this patch works
with the on-board interface on the IBM x235 server.
Submitted by: Jung-uk Kim <jkim@niksun.com>
MFC after: 1 month
of the Netgear GA302-T. I changed the symbolic names in the
submitter's patch to reflect the part number of the chip instead
of the board.
PR: kern/38988
Submitted by: Brad Chapman <chapmanb@arches.uga.edu>
MFC after: 2 days
up when operating in PCI-X mode. For some received packets there is
data corruption in the first few bytes in that case. Aligning the
packet buffer eliminates the corruption. With this fix, the code
that offsets the packet buffer up by 2 bytes to align the payload is
disabled for BCM5701s operating in PCI-X mode. On the i386, which
permits unaligned accesses, the payload is left unaligned. On other
platforms, the packet is copied after reception to force alignment
of the payload. Obviously, this work-around reduces performance in
those cases (BCM5701 plus PCI-X) where it is in effect.
MFC after: 3 days
interrupts. This is a bit harder than it needs to be because there's
more than one way to generate link attentions, at least one of which
does not work on the BCM5700, but does on the 5701.
For the 5701, we can safely use the 'link changed' bit in the status
block, and we enable link change attentions in the mac event register.
For the 5700, we have to use MII interrupts, which require checking
the MAC status register rather than the status block. This requires
doing an extra register access on each interrupt which I'd prefer to
avoid, but them's the breaks. Testing with both a 3c996-T and 3c996B-T
shows that we do in fact detect the link going up and down properly
on cable insertions/disconnections.
Also, avoid twiddling the autopoll enable bit in the MI mode register
when doing a PHY read. I think this coupled with the other changes
will stop the interrupt storms Paul Saab has been harassing me about.
Manually setting the link to 100baseTX full duplex seems to work ok
for me. (I'm typing over the 3c996B-T right now.)
Lastly, teach the driver how to recognize a 3c996B-SX by checking
the hardware config word in the EEPROM in order to detect the media.
We attach 5701 fiber cards correctly now, but I haven't verified that
they send/receive packets yet since I don't have a second fiber
interface at home. (I know that fiber 5700 cards work, so I'm
keeping my fingers crossed.)
3c996B-T, with the 5701 rev B5 ASIC). One thing that confuses me
still is that the 'link state change' bit in the status block seems
to change state an awful lot. I have a workaround for this in place
now, but it needs more investigation. For the moment though, this
is enough to get the driver to work with this card.
ethernet controllers. This adds support for the 3Com 3c996-T, the
SysKonnect SK-9D21 and SK-9D41, and the built-in gigE NICs on
Dell PowerEdge 2550 servers. The latter configuration hauls ass:
preliminary measurements show TCP speeds of over 900Mbps using
only normal size frames.
TCP/IP checksum offload, jumbo frames and VLAN tag insertion/stripping
are supported, as well as interrupt moderation.
Still need to fix autonegotiation support for 1000baseSX NICs, but
beyond that, driver is pretty solid.