much more often that expected and negatively impact performance when
running at 100mbps. I need to figure out if there's a better way to
handle this, but for now this shouldn't hurt anything.
mtx_enter(lock, type) becomes:
mtx_lock(lock) for sleep locks (MTX_DEF-initialized locks)
mtx_lock_spin(lock) for spin locks (MTX_SPIN-initialized)
similarily, for releasing a lock, we now have:
mtx_unlock(lock) for MTX_DEF and mtx_unlock_spin(lock) for MTX_SPIN.
We change the caller interface for the two different types of locks
because the semantics are entirely different for each case, and this
makes it explicitly clear and, at the same time, it rids us of the
extra `type' argument.
The enter->lock and exit->unlock change has been made with the idea
that we're "locking data" and not "entering locked code" in mind.
Further, remove all additional "flags" previously passed to the
lock acquire/release routines with the exception of two:
MTX_QUIET and MTX_NOSWITCH
The functionality of these flags is preserved and they can be passed
to the lock/unlock routines by calling the corresponding wrappers:
mtx_{lock, unlock}_flags(lock, flag(s)) and
mtx_{lock, unlock}_spin_flags(lock, flag(s)) for MTX_DEF and MTX_SPIN
locks, respectively.
Re-inline some lock acq/rel code; in the sleep lock case, we only
inline the _obtain_lock()s in order to ensure that the inlined code
fits into a cache line. In the spin lock case, we inline recursion and
actually only perform a function call if we need to spin. This change
has been made with the idea that we generally tend to avoid spin locks
and that also the spin locks that we do have and are heavily used
(i.e. sched_lock) do recurse, and therefore in an effort to reduce
function call overhead for some architectures (such as alpha), we
inline recursion for this case.
Create a new malloc type for the witness code and retire from using
the M_DEV type. The new type is called M_WITNESS and is only declared
if WITNESS is enabled.
Begin cleaning up some machdep/mutex.h code - specifically updated the
"optimized" inlined code in alpha/mutex.h and wrote MTX_LOCK_SPIN
and MTX_UNLOCK_SPIN asm macros for the i386/mutex.h as we presently
need those.
Finally, caught up to the interface changes in all sys code.
Contributors: jake, jhb, jasone (in no particular order)
takes care of all the 10/100 and gigE PCI drivers that I've done.
Next will be the wireless drivers, then the USB ones. I may pick up
some stragglers along the way. I'm sort of playing this by ear: if
anyone spots any places where I've screwed up horribly, please let me
know.
laptops. I've checked that this still works with the other cards and
it works with the 3c556 that I have access to, but I want to check that
it works with the 556B mentioned in PR #20878 before I close out the PR
and merge to -stable.
the 3c450-TX HomeConnect. Like the 3cSOHO100-TX OfficeConnect, this NIC
uses the same ASIC as the 3c905B/3c905C but is targeted for a particular
market segment (home users). It is somewhat less expensive than the
3c905B/3c905C ($49, according to the 3Com web site), comes with its
own custom driver kit and is bundled with various goofy Windows software
packages designed to demonstrate the niftyness of home networking (networked
game demos, etc...).
Changes are:
- Add PCI ID to list in if_xlreg.h.
- Update xl_devs table in if_xl.c.
- Update xl_choose_xcvr() to consider the HomeConnect the
same as all the other 10baseT/100baseTX cards.
- When setting/clearing promisc mode, just update the filter, don't
reset the whole interface.
- Call xl_init() in xl_ifmedia_upd() when setting miibus media modes. This
fixes a problem with the 3c905B-COMBO where switching from 10base5/AUI
or 10base2/BNC to a 10/100 mode doesn't always work right.
- Attempt to reset the interface in xl_init() so that we know we're getting
the receive and transmit rings reset properly.
strategy used in the 3Com Linux driver. The new strategy is to use transmit
descriptor polling -- that is, the NIC polls the descriptors to see when
new packets are available for transmission. The advantage to the new scheme
is that no register accesses are needed in the transmit routine. The old
scheme requires several register accesses to stall the TX engine, update the
TX DMA list pointer register, then unstall the TX engine. Hopefully the new
scheme will provide improved transmit performance with less CPU overhead.
This only affects the 3c90xB or 3c90xC cards, not the 3c90x cards. This
means the original 3c900 and 3c905 cards are unaffected. Newer cards include
the 3c900B series, the 3c905B, 3c980, 3c980B, 3c905C and 3c905C, and the
3cSOHO100-TX OfficeConnect.
It's GPL'ed of course, but looking over it tonight I learned of Yet Another
Fast EtherLink XL Adapter: the 3c980C server adapter. This is basically
an updated version of the 3c980 that uses the Tornado ASIC instead of the
earlier Hurricane ASIC. The only change here is to add the new PCI device
ID (0x9805) and corresponding table entries.
due to the fact that there are non-MII cards supported by the same
driver and I don't have all of the cards available for testing. There's
also the 3c905B-COMBO which has MII, AUI and BNC media ports all in one
package. Supporting the COMBO is difficult because we have to add the
10base5 and 10base2 media types to the same ifmedia struct as the
MII-attached types, however there is no way to force the miibus and
child PHYs into existence before xl_attach() completes, so there is
no ifmedia struct available in xl_attach(). What we do inistead is
use the mediainit method as a callback: when a child PHY is attached,
it calls the miibus mediainit routine which selects a default media.
This routing also calls the NIC driver's mediainit method (if it
implements one) at which point we can safely add the other media
types.
into a loadable module, and all of the platform dependencies are gone
(except for the alpha_XXX_dmamap() thing, which is another issue -- I
still don't know how to use the busdma stuff with a network driver).
Also increase the delay in xl_reset(); testing on a 486/66 with a 3c905C
shows that reading the EEPROM fails immediately after a reset. Waiting
a little longer after the reset completes seems to fix it.
chipset. First you thrilled to the 3c905, then you trembled at the
3c905B, now gaze in wonder at: the 3c905C! This appears to be another
3c90X series chip called the Tornado (PCI ID 0x10B7/0x9200) and should
be equivalent (from the driver API perspective) to the 3c905B, so all
we have to do is add the PCI ID to the list.
preparation for tsunami support. Previous chipsets' direct-mapped DMA
mask was always 1024*1024*1024. The Tsunami chipset needs it to be
2*1024*1024*1024
These changes should not affect the i386 port
Reviewed by: Doug Rabson <dfr@nlsystems.com>
- Try to unbreak what I broke by screwing with the tx queuing again.
I'm waiting for a few more people to test out this code and report back
before I move it into current. Hopefully it will be soon. Basically I
reverted to the old TX queuing strategy.
- Add experimental support for the 3c900B-FL (10mbps ST fiber). The card
should be detected properly and the 10baseFL mode supported, but again
I'm still waiting for word from a tester to see if this actually works.
It shouldn't affect the other cards though; all the differences are in
media selection.
- Set the TX start threshold register to get better performance.
- Increase the size of the RX and TX rings. UDP performance was pretty
bad because the TX ring was too small. Should be substantially better
now (I can saturate the link with either TCP or UDP now).
- Change some of the #defines to reflect proper 3Com ASIC names (boomerang,
cyclone, krakatoa, hurricane).
- Simplify and reorganize interrupt handler; ack all interrupts right
away and then process them. This avoids a potential race condition.
(Noted by Matt Dillon.)
- Reorganize the bridging code to eliminate using a goto to jump into
the middle of an if() {} clause. Sorry, that just made my brain itch.
- Use m_adj() in xl_rxeof().
- Make the payload alignment in xl_newbuf() the default (instead of
just conditionally defined for the alpha) to improve NFS performance
(avoids need for nfs_realign()).
3c900B-TPC (twisted pair and coax). Treated similarly to the
3c900B-COMBO, except no AUI port.
- Fix media selection so that it's possible to select the AUI and BNC
ports on the 3c905B-COMBO. This board is now fully supported.
- Change TX queueing strategy to hopefully be more efficient by avoiding
register accesses in xl_start(). Should provide small performance
improvement and a little better reliability.
(cut-down version of the "cyclone" for the small office/home office
"cheap bastard" market). Basically the same as a 3c905B but without
Wake-on-LAN, ROM socket, etc...
- Wait longer for the reset to complete in xl_attach() to try and avoid
'command never completed' warnings.
- Clean up a few odds and ends in xl_attach().
- Add PCI ID for the 3c905B-COMBO (a new card). Right now this is
treated as a 3c905B; I need to dig up one of these cards for testing
before I can make the AUI and BNC ports work.
- Add a hack to force reading the I/O address directly from the PCI
registers if pci_map_port() fails. I SHOULD NOT HAVE TO DO THIS:
SOMEBODY WITH MORE PCI CLUES THAN I SHOULD INVESTIGATE WHY THIS
HAPPENS.
sys/alpha/conf/GENERIC.
Note: the PNIC ignores the lower few bits of the RX buffer DMA address,
which means we have to add yet another kludge to make it happy. Since
we can't offset the packet data, we copy the first few bytes of the
received data into a separate mbuf with proper alignment. This puts
the IP header where it needs to be to prevent unaligned accesses.
Also modified the PNIC driver to use a non-interrupt driven TX
strategy. This improves performance somewhat on x86/SMP systems where
interrupt delivery doesn't seem to be as fast with an SMP kernel as
with a UP kernel.
Revert the transmission packet queueing strategy changes. Clearly I missed
something while debugging this, although I never encountered any problems
on my test machines.
Also make one other minor change: jack up the TX reclaim threshold for
3c90xB adapters in order to stave off 'transmission error: 82' errors.
Document the existence of the tx reclaim register (for inspecting the
current reclaim threshold) in register window 5 (if_xlreg.h).
agressive. With the old code, if a descriptor chain was already on its
way to the chip, xl_start() would try to splice new chains onto the end
of the current chain by stopping the transmitter, modifying the tail
pointer of the current chain to point to the head of the new chain, then
restart the transmitter. The manual says you're allowed to do this and
it works, but I'm not too keen on it anymore.
The new code waits until the eixsting chain has been sent and then
queues the next waiting chain in the 'transmit ok' handler.
Performance still looks good one way or the other.
This is a 100BaseFX board with SC fiber media connectors. I don't actually
have one of these but I've been told it works with the xl driver.
Submitted by: Jason Wright from the openbsd group
the OpenBSD group to fix a problem with the default ifmedia not being
set properly in some cases with a 3c905B, leading to a panic in ifmedia_set().
Also apply a patch to force the transmit start routine to check the
transmitter to make sure it isn't wedged if the outbound tx queue appears
full. This seems to cure some problems with 'watchdog timeout' errors
cropping up in some cases. I tried to do this before by checking for the
IFF_OACTIVE flag on entry to xl_start(), but if the IFF_OACTIVE flag is
set, ether_output() won't even call xl_start(). It should work now.
Lastly, increase the size of the TX queue from 10 descriptors to 16 to
hopefully make it less likely that the TX queue will fill up.
XCVR value read from the EEPROM is completely wrong. I've had one report
of a 3c900 card that returns an xcvr value of 14, which is impossible
(the manual states that all vales above 8 are reserved). If the value
is out of the expe
Add PCI vendor ID for the 3c980-TX server adapter card, which apparently
also uses the cyclone chip. Graciously supplied Mats O Jansson
<maja@cntw.com>.
Also noted by Mats, the 10mpbs cyclone adapters should be named 3c900B,
not 3c905B. I haven't actually encountered a 10mbps only cyclone adapter
yet, nor anybody who has one, but this makes sense given the naming
scheme used for the older boomerang adapters.
of associated mbuf clusters) in the RX ring from 4 to 16. On my
really fast PI 400Mhz test machines, 4 descriptors (and associated
mbuf clusters) is enough to achieve decent performance without any
RX overruns. However, one person reported problems with the following
scenario:
- P90 system running FreeBSD with a 3c905B-TX adapter, slow IDE hard
disk (Quantum Bigfoot?)
- PII 266 with SCSI disks running LoseNT and also with a 3c905B-TX
- Both machines connected together via crossover cable at 100Mbps
full-duplex
- LoseNT machine writing largs amounts of data (2.5 GB work of
files each in the neighborhood of 1 to 2 MB in size) via samba to
the FreeBSD machine
In this case, the LoseNT machine is sending data very fast. Apparently
there weren't any problems initially because the user was writing to
one particular disk which was relatively fast, however after this disk
filled up and the user started writing to the second slower disk, RX
overruns would occur and sometimes the RX DMA engine would stall after
a 100 to 500MB had been transfered. The xl_rxeof() handler is supposed
to detect this condition and restart the upload engine; I'm not sure
why it doesn't, unless interrupts are being lost and the rx handler
isn't getting called.
This is still an improvement over the Linux driver, which uses 32
descriptors in its receive ring. :)
Problem reported by: Heiko Schaefer <hschaefer@fto.de>
stability now. ALso modify /sys/conf/files, /sys/i386/conf/GENERIC
and /sys/i386/conf/LINT to add entries for the XL driver. Deactivate
support for the XL adapters in the vortex driver. LAstly, add a man
page.
(Also added an MLINKS entry for the ThunderLAN man page which I forgot
previously.)