Commit Graph

118 Commits

Author SHA1 Message Date
Pyun YongHyeon
351a76f9aa Instead of allocating variables for each events/hardware flags, use
a dedicated flag that represents controller capabilities/events.
This will simplify many part of code that requires different
workaround for each controller revisions and will enhance
readability.
While I'm here move PHY wakeup code up before mii_phy_probe() which
seems to help to wake PHY in some cases.
2008-07-02 06:29:43 +00:00
Pyun YongHyeon
ace7ed5dd5 Switch to memory space register mapping over IO space. If that
mapping fails fall back to traditional IO space access.
2008-07-02 05:21:09 +00:00
Pyun YongHyeon
f98dd8cf50 While accessing EEPROM command register use pre-defined constant
instead of hardcoded value.
2008-07-02 05:01:19 +00:00
Pyun YongHyeon
9dfcacbe29 After the change of r176757 re(4) no longer relys on reading
RL_TXCFG register to identify a device in device probe. Reflect the
fact by modifing device description with general ethernet
controller family.
Note, rl_basetype in struct rl_type is not used and the more
detailed information is provided with rl_hwrev structure.
2008-07-02 04:55:39 +00:00
Pyun YongHyeon
dd6bd66671 Remove duplicated H/W revision check. 2008-07-02 04:27:36 +00:00
Pyun YongHyeon
339a44fb62 Don't touch MSI enable bit in RL_CFG2 register. For unknown reason
clearing MSI enable bit for MSI capable hardwares resulted in Tx
problems. MSI enable bit is set only when MSI is requested from
user.

Tested by:	remko
2008-04-15 00:46:15 +00:00
Pyun YongHyeon
a4148af5f0 Padding more bytes than necessary one broke another variants of
PCIe RealTek chips. Only pad IP packets if the payload is less than
28 bytes.

Obtained from:	NetBSD
PR:		kern/122221
2008-03-31 04:03:14 +00:00
Pyun YongHyeon
99c8ae87a4 In revision 1.70, 1.71 and 1.84 re(4) tried to workaround checksum
offload bugs by manual padding for short IP/UDP frames. Unfortunately
it seems that these workaround does not work reliably on newer PCIe
variants of RealTek chips.

To workaround the hardware bug, always pad short frames if Tx IP
checksum offload is requested. It seems that the hardware has a
bug in IP checksum offload handling. NetBSD manually pads short
frames only when the length of IP frame is less than 28 bytes but I
chose 60 bytes to safety. Also unconditionally set IP checksum
offload bit in Tx descriptor if any TCP or UDP checksum offload is
requested. This is the same way as Linux does but it's not
mentioned in data sheet.

Obtained from:	NetBSD
Tested by:	remko, danger
2008-03-28 01:21:21 +00:00
Pyun YongHyeon
2000cf6c0b MSI handling on some RealTek chips are broken so disable it by
default.

Reported by:	Giulio Ferro ( auryn AT zirakzigil DOT org )
Tested by:	Giulio Ferro ( auryn AT zirakzigil DOT org )
2008-03-23 05:35:18 +00:00
Pyun YongHyeon
03ca7ae8a9 For MSI capable hardwares, enable MSI enable bit in RL_CFG2
register.  If MSI was disabled by hw.re.msi_disable tunable
expliclty clear the MSI enable bit.
2008-03-23 05:31:35 +00:00
Pyun YongHyeon
ce6283934e Some RealTek chips are known to be buggy on DAC handling, so
disable DAC by default.
2008-03-23 05:13:45 +00:00
Pyun YongHyeon
ccf34c81f8 VLAN hardware tag information should be set for all desciptors of a
multi-descriptor transmission attempt. Datasheet said nothing about
this requirements. This should fix a long-standing VLAN hardware
tagging issues with re(4).

Reported by:	Giulio Ferro ( auryn AT zirakzigil DOT org )
Tested by:	Giulio Ferro ( auryn AT zirakzigil DOT org )
2008-03-23 05:06:16 +00:00
Pyun YongHyeon
70acaecfd0 Always honor configured VLAN/checksum offload capabilities.
Previously re(4) used to blindly enable VLAN hardware tag stripping
and Rx checksum offload regardless of enabled optional features of
interface.
2008-03-23 04:59:13 +00:00
Pyun YongHyeon
dfdb409ef0 Don't map memory/IO resource in device probe and just use PCI
vendor/revision/sub device id of the hardware to probe it.
This is the same way as NetBSD does and it enhances readabilty
a lot.
2008-03-03 04:15:08 +00:00
Pyun YongHyeon
c1d0b5737f Don't allow jumbo frame on 8139C+ controller.
While I'm here add a check for minimal MTU length.
2008-03-03 03:41:06 +00:00
Pyun YongHyeon
7467bd5370 Implement WOL.
Tested by:	Fabian Keil ( freebsd-listen AT fabienkeli DOT de )
2008-03-03 03:33:58 +00:00
John Baldwin
304a4c6fb1 - Retire npe_defrag(), gem_defrag(), msk_defrag(), nfe_defrag(), and
re_defrag() and use m_collapse() instead.
- Replace a reference to ath_defrag() in a comment in if_wpi.c with
  m_collapse().
2008-01-17 23:37:47 +00:00
Pyun YongHyeon
738489d1c1 Fix build. 2008-01-15 03:47:24 +00:00
Pyun YongHyeon
d65abd6663 Overhaul re(4).
o Increased number of Rx/Tx descriptors to 256 for 8169 GigEs
  because it's hard to push the hardware to the limit with default
  64 descriptors.
  TSO requires large number of Tx descriptors to pass a full sized
  TCP segment(65535 bytes IP packet) to hardware. Previously it
  consumed 32 Tx descriptors, assuming MCLBYTES DMA segment size,
  to send the TCP segment which means re(4) couldn't queue more
  than two full sized IP packets.
  For 8139C+ it still uses 64 Rx/Tx descriptors due to its hardware
  limitations. With this changes there are (very) small waste of
  memory for 8139C+ users but I don't think it would affect 8139C+
  users for most cases.
o Various bus_dma(9) fixes.
   - The hardware supports DAC so allow 64bit DMA operations.
   - Removed BUS_DMA_ALLOC_NOW flag.
   - Increased DMA segment size to 4096 from MCLBYTES because TSO
     consumes too many descriptors with MCLBYTES DMA segment size.
   - Tx/Rx side bus_dmamap_load_mbuf_sg(9) support. With these
     changes the code is more readable than previous one and got a
     (slightly) better performance as it doesn't need to pass/
     decode arguments to/from callback function.
   - Removed unnecessary callback function re_dmamap_desc() and
     nuked rl_dmaload_arg structure which was used in the callback.
   - Additional protection for DMA map load failure. In case of
     failure reuse current map instead of returning a bogus DMA
     map.
  -  Deferred DMA map unloading/sync operation for maximum
     performance until we really need to load new DMA map. If we
     happen to reuse current map(e.g. input error) there is no need
     to sync/unload/load again.
  -  The number of allowable Tx DMA segments for a mbuf chains are
     now 32 instead of magic nseg value. If the number of available
     Tx descriptors are short enough to send highly fragmented mbuf
     chains an optimized re_defrag() is called to collapse mbuf
     chains which is supposed to be much faster than m_defrag(9).
     re_defrag() was borrowed from ath(4).
   - Separated Rx/Tx DMA tag from a common DMA tag such that Rx DMA
     tag correctly uses DMA maps that were created with DMA alignment
     restriction(8bytes alignments). Tx DMA tag does not have such
     alignment limitation.
   - Added additional sanity checks for DMA ring map load failure.
   - Added additional spare Rx DMA map for graceful handling of Rx
     DMA map load failure.
   - Fixed misused bus_dmamap_sync(9) and added missing
     bus_dmamap_sync(9) in re_encap()/re_txeof()/re_rxeof().
o Enabled TSO again as re(4) have reasonable number of Tx
  descriptors.
o Don't touch DMA address of a Tx descriptor in re_txeof(). It's
  not needed.
o Fix incorrect update of if_ierrors counter. For Rx buffer
  shortage it should update if_qdrops as the buffer is reused.
o Added checks for unsupported H/W revisions and return ENXIO for
  these hardwares. This is required to remove resource allocation
  code in re_probe as other drivers do in device probe routine.
o Modified descriptor index manipulation macros as it's now possible
  to have different number of descriptors for Rx/Tx.
o In re_start, to save a lock operation, use IFQ_DRV_IS_EMPTY before
  trying to invoke IFQ_DRV_DEQUEUE. Also don't blindly call re_encap
  since we already know the number of available Tx descriptors in
  advance.
o Removed RL_TX_DESC_THLD which was used to reserve RL_TX_DESC_THLD
  descriptors in Tx path. There is no such a limitation mentioned in
  8139C+/8169/8110/8168/8101/8111 datasheet and it seems to work ok
  without reserving RL_TX_DESC_THLD descriptors.
o Fix a comment for RL_GTXSTART. The register is 8bits register.
o Added comments for 8169/8139C+ hardware restrictions on descriptors.
o Removed forward declaration for "struct rl_softc", it's not needed.
o Added a new structure rl_txdesc for Tx descriptor managements and
  a structure rl_rxdesc for Rx descriptor managements.
o Removed unused member variable rl_intlock in driver softc. There are
  still several unused member variables which are supposed to be used
  to access hardware statistics counters. But it seems that accessing
  hardware counters were not implemented yet.
2008-01-15 01:10:31 +00:00
Pyun YongHyeon
a0637caa3f By definition promiscuous mode should see all unicast frames as well
as multicast/broadcast frames. Previously re(4) ignored multicast
frames in promiscuous mode. The RTL8169 datasheet was not clear
how it handles multicast frames in promiscuous mode.

PR:	kern/118572
MFC after:	3 days
2007-12-20 07:26:20 +00:00
Pyun YongHyeon
1acbb78ada Add another RTL8168 revision 3 which is found on RTL8111-GR Gigabit
Ethernet Controller. Multicast filtering wasn't tested and needs more
expore. While I'm here change complex if statements with switch
statement which would improve readability.

Reported by:	Abdullah Ibn Hamad Al-Marri < wearabnet AT yahoo DOT ca >
Tested by:	Abdullah Ibn Hamad Al-Marri < wearabnet AT yahoo DOT ca >
2007-12-08 00:14:09 +00:00
Pyun YongHyeon
7c103000b6 Always honor promiscuous flag prior to programming Rx multicast
filter. This fixes a regression introduced in rev 1.89.

PR:	114632
MFC after:	3 days
2007-12-03 01:28:08 +00:00
Pyun YongHyeon
6a087a8722 Fix function prototype for device_shutdown method. 2007-11-22 02:45:00 +00:00
Remko Lodder
3b9982e59c Add support for D-Link DGE-528(T) Rev.B1
PR:		112774
Submitted by:	Denis Fortin <fortin at acm dot org>
Approved by:	imp (mentor), yongari
MFC After:	3 days
2007-11-12 15:44:00 +00:00
Pyun YongHyeon
c4aca09a2a Make sure to take PHY out of power down mode in device attach.
Without this the PHY wouldn't work as expected. This should fix
dual-boot Windows XP machine where RealTek Windows drivers put the
PHY in power down mode during shutdown. The magic PHY register
accesses come from RealTek driver. No datasheets mention the magic
PHY registers.
In general, the PHY wakeup code should go into PHY driver. However it
seems that it only apply to RTL8169S single chip and it would be
another hack if we have rgephy(4) check what parent driver/chip model
is attached.

Reported by:	lofi, Laurens Timmermans ( laurens AT timkapel DOT nl )
Tested by:	lofi
Obtained from:	RealTek FreeBSD driver
Approved by:	re (Ken Smith)
2007-08-14 02:00:04 +00:00
Marius Strobl
9282563532 Initialize the rl_vlanctl field of the descriptors to zero (in order
to clear RL_TDESC_VLANCTL_TAG). This fixes sending packets in the
native VLAN when running both tagged and an untagged VLAN over the
same trunk and descriptors are recycled.

Approved by:	re (kensmith)
MFC after:	1 week
2007-08-05 11:20:33 +00:00
Pyun YongHyeon
4693e424a7 style(9)
Pointed out by:	cnst
Approved by:	re (kensmith)
2007-07-27 00:43:12 +00:00
Pyun YongHyeon
5774c5ff93 Add MSI support.
Ever since switching to adaptive polling re(4) occasionally spews
watchdog timeouts on systems with MSI capability. This change is
minimal one for supporting MSI and re(4) also needs MSIX support
for RTL8111C in future. Because softc structure of re(4) is shared
with rl(4), rl(4) was touched to use the modified softc.

Reported by:	cnst
Tested by:	cnst
Approved by:	re (kensmith)
2007-07-24 01:24:03 +00:00
Pyun YongHyeon
141f92e7b5 re(4) devices requires an external EEPROM. Depending on models it
would be 93C46(1Kbit) or 93C56(2Kbit). One of differences between them
is number of address lines required to access the EEPROM. For example,
93C56 EEPROM needs 8 address lines to read/write data. If 93C56
recevied premature end of required number of serial clock(CLK) to set
OP code/address of EEPROM, the result would be unexpected behavior.
Previously it tried to detect 93C46, which requires 6 address lines,
and then assumed it would be 93C56 if read data was not expected
value. However, this approach didn't work in some models/situations
as 93C56 requries 8 address lines to access its data. In order to fix
it, change EEPROM probing order such that 93C56 is detected reliably.

While I'm here change hard-coded address line numbers with defined
constant to enhance readability.

PR:	112710
Approved by:	re (mux)
2007-07-06 00:05:12 +00:00
Pyun YongHyeon
f28a171ce2 Disable TSO support.
Without bus_dma clean up and increment of number of Tx descriptors
it's hard to guarantee correct Tx operation in TSO case. The TSO
support would be enabled again when I get more feeback from re(4)
patch posted to current.
2007-06-16 02:54:19 +00:00
Pyun YongHyeon
eed497bbe5 Don't reinitialize the hardware if only PROMISC flag was changed.
Previously whenever PROMISC mode turned on/off link renegotiation
occurs and it could resulted in network unavailability for serveral
seconds.(Depending on switch STP settings it could last several tens
seconds.)

Reported by:	Prokofiev S.P.  < proks AT logos DOT uptel DOT net >
Tested by:	Prokofiev S.P.  < proks AT logos DOT uptel DOT net >
2007-04-18 00:40:43 +00:00
Remko Lodder
2ee2c3b4e4 Add support for the RTL8110SC driver.
PR:		110804
Submitted by:	Daan Vreeken
Sponsored by:	Vitsch Electronics (patch)
Approved by:	imp (mentor)
MFC After:	3 days
2007-03-28 18:07:12 +00:00
Christian S.J. Peron
59a0d28bac Catch up the rest of the drivers with the ether_vlan_mtap modifications.
If these drivers are setting M_VLANTAG because they are stripping the
layer 2 802.1Q headers, then they need to be re-inserting them so any
bpf(4) peers can properly decode them.

It should be noted that this is compiled tested only.

MFC after:	3 weeks
2007-03-04 03:38:08 +00:00
John Baldwin
3d4c1b5744 Use taskqueue_drain() to wait for any pending tasks to complete rather
than just pausing for a second.
2007-02-27 18:45:37 +00:00
Paolo Pisati
ef544f6312 o break newbus api: add a new argument of type driver_filter_t to
bus_setup_intr()

o add an int return code to all fast handlers

o retire INTR_FAST/IH_FAST

For more info: http://docs.freebsd.org/cgi/getmsg.cgi?fetch=465712+0+current/freebsd-current

Reviewed by: many
Approved by: re@
2007-02-23 12:19:07 +00:00
Bill Paul
e2bcb489ef The TCP checksum offload handling in the 8111B/8168B and 8101E PCIe can
apparently be confused by short TCP segments that have been manually
padded to the minimum ethernet frame size. The driver does short frame
padding in software as a workaround for a bug in the 8169 PCI devices
that causes short IP fragments to be corrupted due to an apparent
conflict between the hardware autopadding and hardware IP checksumming.

To fix this, we avoid software padding for short TCP segments, since
the hardware seems to autopad and checksum these correctly (even the
older 8169 NICs get these right). Short UDP packets appear to be
handled correctly in all cases. This should work around the IP header
checksum bug in the 8169 while not tripping the TCP checksum bug in
the 8111B/8168B and 8101E.
2007-01-25 17:30:30 +00:00
Pyun YongHyeon
d01fac16ac It seems that enabling Tx and Rx before setting descriptor DMA
addresses shall access invalid descriptor DMA addresses on PCIe
hardwares and then panicked the system.
To fix it set descriptor DMA addresses before enabling Tx and Rx
such that hardware can see valid descriptor DMA addresses. Also
set RL_EARLY_TX_THRESH before starting Tx and Rx.

Reported by:	steve.tell AT crashmail DOT de
Tested by:	steve.tell AT crashmail DOT de
Obtained from:	NetBSD
MFC after:	1 week
2007-01-23 00:44:12 +00:00
Marius Strobl
b4b958792b o In re_newbuf() and re_encap() if re_dma_map_desc() aborts the mapping
operation as it ran out of free descriptors or if there are too many
  segments in the first place, call bus_dmamap_unload() in order to
  unload the already loaded segments.
  For trying to map the defragmented mbuf (chain) in re_encap() this
  introduces re_dma_map_desc() setting arg.rl_maxsegs to 0 as a new
  failure mode. Previously we just ignored this case, corrupting our
  view of the TX ring.
o In re_txeof():
  - Don't clear IFF_DRV_OACTIVE unless there are at least 4 free TX
    descriptors. Further down the road re_encap() will bail if there
    aren't at least 4 free TX descriptors, causing re_start() to
    abort and prepend the dequeued mbuf again so it makes no sense
    to pretend we could process mbufs again when in fact we won't.
    While at it replace this magic 4 with a macro RL_TX_DESC_THLD
    throughout this driver.
  - Don't cancel the watchdog timeout as soon as there's at least one
    free TX descriptor but instead only if all descriptors have been
    handled. It's perfectly normal, especially in the DEVICE_POLLING
    case, that re_txeof() is called when only a part of the enqueued
    TX descriptors have been handled, causing the watchdog to be
    disarmed prematurely.
o In re_encap():
  - If m_defrag() fails just drop the packet like other NIC drivers
    do. This should only happen when there's a mbuf shortage, in which
    case it was possible to end up with an IFQ full of packets which
    couldn't be processed as they couldn't be defragmented as they
    were taking up all the mbufs themselves. This includes adjusting
    re_start() to not trying to prepend the mbuf (chain) if re_encap()
    has freed it.
  - Remove dupe initialization of members of struct rl_dmaload_arg to
    values that didn't change since trying to process the fragmented
    mbuf chain.
    While at it remove an unused member from struct rl_dmaload_arg.
o In re_start() remove a abandoned, banal comment. The corresponding
  code was moved to re_attach() some time ago.

With these changes re(4) now survives one day (until stopped) of
hammering out packets here.

Reviewed by:	yongari
MFC after:	2 weeks
2007-01-16 20:35:23 +00:00
Bill Paul
bb7dfefb45 Fix re_setmulti() so that it works correctly for PCIe chips where
the multicast hash table are in reverse order compared to older
devices.
2007-01-11 20:31:35 +00:00
Marius Strobl
1d545c7a44 - Use the re_tick() callout instead of if_slowtimo() for driving
re_watchdog() in order to avoid races accessing if_timer.
- Use bus_get_dma_tag() so re(4) works on platforms requiring it.
- Remove invalid BUS_DMA_ALLOCNOW when creating the parent DMA tag
  and the tags that are used for static memory allocations.
- Don't bother to set if_mtu to ETHERMTU, ether_ifattach() does that.
- Remove an unused variable in re_intr().
2006-12-20 02:13:59 +00:00
Pyun YongHyeon
b61178a956 Fix typo. 2006-11-21 05:41:11 +00:00
Pyun YongHyeon
dc74159da6 Add TSO support.
Tested by:	wilko,  Pieter de Goeje < pieter AT degoeje DOT nl >
2006-11-21 04:40:30 +00:00
Pyun YongHyeon
960fd5b3d0 o Correctly set IFCAP_VLAN_HWCSUM as re(4) can do VLAN tagging/checksum
offloading in hardware.
o Correctly set media header length for VLAN.
2006-11-21 04:23:52 +00:00
Pyun YongHyeon
19ecd2310c Don't set RL_CFG1_FULLDUPLEX bit. The RL_CFG1_FULLDUPLEX bit in
config register 1 is only valid on 8129.
2006-11-21 04:14:44 +00:00
Andre Oppermann
78ba57b9e1 Move ethernet VLAN tags from mtags to its own mbuf packet header field
m_pkthdr.ether_vlan.  The presence of the M_VLANTAG flag on the mbuf
signifies the presence and validity of its content.

Drivers that support hardware VLAN tag stripping fill in the received
VLAN tag (containing both vlan and priority information) into the
ether_vtag mbuf packet header field:

	m->m_pkthdr.ether_vtag = vlan_id;	/* ntohs()? */
	m->m_flags |= M_VLANTAG;

to mark the packet m with the specified VLAN tag.

On output the driver should check the mbuf for the M_VLANTAG flag to
see if a VLAN tag is present and valid:

	if (m->m_flags & M_VLANTAG) {
		... = m->m_pkthdr.ether_vtag;	/* htons()? */
		... pass tag to hardware ...
	}

VLAN tags are stored in host byte order.  Byte swapping may be necessary.

(Note: This driver conversion was mechanic and did not add or remove any
byte swapping in the drivers.)

Remove zone_mtag_vlan UMA zone and MTAG_VLAN definition.  No more tag
memory allocation have to be done.

Reviewed by:	thompsa, yar
Sponsored by:	TCP/IP Optimization Fundraise 2005
2006-09-17 13:33:30 +00:00
Gleb Smirnoff
6b9f5c941c - Consistently use if_printf() only in interface methods: if_start(),
if_watchdog, etc., or in functions used only in these methods.
  In all other functions in the driver use device_printf().
- Use __func__ instead of typing function name.

Submitted by:	Alex Lyashkov <umka sevcity.net>
2006-09-15 15:16:12 +00:00
Pyun YongHyeon
baa1277289 Make 8139C+ work again which was broken since rev 1.68.
Ever since rev 1.68 re(4) checks the validity of link in re_start.
But rlphy(4) got a garbled data due to a different bit layout used on
8139C+ and it couldn't report correct link state. To fix it, ignore
BMCR_LOOP and BMCR_ISO bits which have different meanings on 8139C+.
I think this also make dhclient(8) work on 8139C+.

Reported by:	Gerrit Kuehn <gerrit AT pmp DOT uni-hannover DOT de>
Tested by:	Gerrit Kuehn <gerrit AT pmp DOT uni-hannover DOT de>
2006-09-08 00:58:02 +00:00
Pyun YongHyeon
be09900714 Fix re(4) breakge introduced in tree from rev 1.68.
This should fix incorrect configuration of station address on
big-endian architectures.

Reviewed by:	wpaul
Tested on:	sparc64
2006-08-03 00:15:19 +00:00
Bill Paul
0fc4974f79 Another small update to the re(4) driver:
- Change the workaround for the autopad/checksum offload bug so that
  instead of lying about the map size, we actually create a properly
  padded mbuf and map it as usual. The other trick works, but is ugly.
  This approach also gives us a chance to zero the pad space to avoid
  possibly leaking data.

- With the PCIe devices, it looks issuing a TX command while there's
  already a transmission in progress doesn't have any effect. In other
  words, if you send two packets in rapid succession, the second one may
  end up sitting in the TX DMA ring until another transmit command is
  issued later in the future. Basically, if re_txeof() sees that there
  are still descriptors outstanding, it needs to manually resume the
  TX DMA channel by issuing another TX command to make sure all
  transmissions are flushed out. (The PCI devices seem to keep the
  TX channel moving until all descriptors have been consumed. I'm not
  sure why the PCIe devices behave differently.)

  (You can see this issue if you do the following test: plug an re(4)
  interface into another host via crossover cable, and from the other
  host do 'ping -c 2 <host with re(4) NIC>' to prime the ARP cache,
  then do 'ping -c 1 -s 1473 <host with re(4) NIC>'. You're supposed
  to see two packets sent in response, but you may only see one. If
  you do 'ping -c 1 -s 1473 <host with re(4) NIC>' again, you'll
  see two packets, but one will be the missing fragment from the last
  ping, followed by one of the fragments from this ping.)

- Add the PCI ID for the US Robotics 997902 NIC, which is based on
  the RTL8169S.

- Add a tsleep() of 1 second in re_detach() after the interrupt handler
  is disconnected. This should allow any tasks queued up by the ISR
  to drain. Now, I know you're supposed to use taskqueue_drain() for
  this, but something about the way taskqueue_drain() works with
  taskqueue_fast queues doesn't seem quite right, and I refuse to be
  tricked into fixing it.
2006-08-01 17:18:25 +00:00
Bill Paul
498bd0d326 Fix the following bugs in re(4)
- Correct the PCI ID for the 8169SC/8110SC in the device list (I added
  the macro for it to if_rlreg.h before, but forgot to use it.)

- Remove the extra interrupt spinlock I added previously. After giving it
  some more thought, it's not really needed.

- Work around a hardware bug in some versions of the 8169. When sending
  very small IP datagrams with checksum offload enabled, a conflict can
  occur between the TX autopadding feature and the hardware checksumming
  that can corrupt the outbound packet. This is the reason that checksum
  offload sometimes breaks NFS: if you're using NFS over UDP, and you're
  very unlucky, you might find yourself doing a fragmented NFS write where
  the last fragment is smaller than the minimum ethernet frame size (60
  bytes). (It's rare, but if you keep NFS running long enough it'll
  happen.) If checksum offload is enabled, the chip will have to both
  autopad the fragment and calculate its checksum header. This confuses
  some revs of the 8169, causing the packet that appears on the wire
  to be corrupted. (The IP addresses and the checksum field are mangled.)
  This will cause the NFS write to fail. Unfortunately, when NFS retries,
  it sends the same write request over and over again, and it keeps
  failing, so NFS stays wedged.

  (A simple way to provoke the failure is to connect the failing system
  to a network with a known good machine and do "ping -s 1473 <badhost>"
  from the good system. The ping will fail.)

  Someone had previously worked around this using the heavy-handed
  approahch of just disabling checksum offload. The correct fix is to
  manually pad short frames where the TCP/IP stack has requested
  checksum offloading. This allows us to have checksum offload turned
  on by default but still let NFS work right.

- Not a bug, but change the ID strings for devices with hardware rev
  0x30000000 and 0x38000000 to both be 8168B/8111B. According to RealTek,
  they're both the same device, but 0x30000000 is an earlier silicon spin.
2006-07-30 23:25:21 +00:00