Check for suspend before the device polling, rather than after it.
Check to see if the current thread owns the lock in ioctl and return
EBUSY if it does.
This advances the locking to the point that I can eject my fxp card 10
times in a row, but I agree with Jeff Hsu that we need to get the
network layer locking finished before chasing more of the races here
(actually, he doesn't think this set is worth it even). There's a
number of races between FXP_LOCK in detach and all other users of
FXP_LOCK, and this gets back to the 'device with sleepers being
forcibly detached' problem as well...
1) always call fxp_stop in fxp_detach. Since we don't read from
the card, there's no need to carefully look at things with
bus_child_present.
2) Call FXP_UNLOCK() before calling bus_teardown_intr to avoid
a possible deadlock reported by jhb.
3) add gone to the softc. Set it to true in detach.
4) Return immediately if gone is true in fxp_ioctl
5) Return immediately if gone is true in fxp_intr
- Add fxp_start_body() and change fxp_start() to just acquire locks and
then call fxp_start_body(). Places that would call fxp_start() with
locks held (mutex recursion) now call fxp_start_body() directly.
Remove MTX_RECURSE flag from sc_mtx. [gallatin]
- Change fxp_attach() to work without the softc lock, saving interrupt
hooking until the head of fxp_attach().
- Call ether_ifattach() before overriding ifp parameters. This reverts
part of 1.155.
- Remove multiple error paths in fxp_attach().
- Teardown interrupt in fxp_detach() before unlocking the softc.
- Make sure mutex is not held in fxp_release()
- Delete the miibus instance and/or self in fxp_release(), not in
fxp_detach(). This can happen if attach fails partway through.
- Move ifmedia_removeall to fxp_release() since attach may fail after
media have been allocated.
- Add locking to fxp_suspend, fxp_resume, fxp_start, fxp_intr,
fxp_poll, fxp_tick, fxp_ioctl, fxp_watchdog.
- Pass in ifp to fxp_intr_body since its callers sometimes already use
it.
- Add compatibility define for INTR_MPSAFE for 4.x. [gallatin]
- You don't need to bzero softc.
Ideas from: gallatin, mux
Tested by: >400M packets of dd/ssh, NFS, ping on i386 UP
network layer (ether).
- Don't abuse module names to facilitate ifconfig module loading;
such abuse isn't really needed. (And if we do need type information
associated with a module then we should make it explicit and not
use hacks.)
PCIM_CMD_MEMEN and PCIM_CMD_BUSMASTEREN, becaise some braindead
BIOSes (such as one found in my vprmatrix notebook) forget
to initialize it properly resulting in attachment failure.
- Remove a useless device_is_alive() check.
- Disable interrupts if bus_child_present() so that this
check is more useful.
This fixes the hangs I was seeing when unloading the fxp driver.
Suggestions from: hsu, njl
- Be sure to teardown the interrupt first so that "kldunload if_fxp"
doesn't panic the box. It's now deadlocking rather than crashing,
which isn't really better, but I'm unsure this is fxp(4)'s fault.
- Change a bus_dmamap_sync() call to also do a BUS_DMASYNC_PREREAD
now that we can pass several operations.
we're using an atomic operation to clear the suspend flag
in fxp_start(). Since other architectures may need the
same thing, we want to do it all the time and not only
in the __alpha__ case. However, we don't want to use
atomic operations on 16-bit integers, because those may
not be available on any architecture. We're thus faking
a 32-bit atomic operation here. This patch also deals
with endianness here.
which deals with both endianness and alignment issues.
- Collect low-hanging fruits for endianness safety.
- Use 0xffffffff instead of -1 where appropriate.
endian safe.
- Change some u_int to u_int8_t which make more sense here since
we're really defining bytes. That produces the same code due to
how bitfields work.
- Add the definition of the vlan_drop_en bit (not used yet).
- Add some useful comments.
Obtained from: NetBSD
other allocations/initializations have been successful. I kinda
doubt it will fix the recent breakage that some people are seeing,
but this could have caused problems for sure.
This patch is rather big because I had to significantly redesign
the driver to make the busdma conversion possible. Most notably,
hardware and software structures were carefully splitted to get
rid of all the structs overlapping evilness.
Special thanks to phk and Richard Puga <puga@mauibuilt.com> for
providing me with fxp(4) hardware to do this work.
Thanks to marcel for testing this on ia64, and to Fred Clift
<fclift@verio.net> for testing this on alpha.
Tested on: i386, ia64, alpha
the fxp driver. This is enabled only for the 82550/82551 chips
(PCI revision code 12 or 13). RX and TX checksum offload are
both supported. Transmit offload is limited to TCP and UDP only
right now: there seems to be a problem with IP header checksumming
on transmit in some cases.
This chip has hardware VLAN support as well. I hope to enable
support for this eventually.
o don't strip the Ethernet header from inbound packets; pass packets
up the stack intact (required significant changes to some drivers)
o reference common definitions in net/ethernet.h (e.g. ETHER_ALIGN)
o track ether_ifattach/ether_ifdetach API changes
o track bpf changes (use BPF_TAP and BPF_MTAP)
o track vlan changes (ifnet capabilities, revised processing scheme, etc.)
o use if_input to pass packets "up"
o call ether_ioctl for default handling of ioctls
Reviewed by: many
Approved by: re
just limited to the DEVICE_POLLING case. This removes the FXP_RFA_RNRMARK
hack, and replaces it with a softc flag that is used to record when
the handling of a no-resource condition was deferred due to running
out of DEVICE_POLLING cycles. This was tested on -stable, but the
code is essentially the same as in -current. It should only affect
the case where DEVICE_POLLING is defined.
The details of the mechanism behind the crashes are still uncertain
but the most likely cause seems to be some kind of hardware confusion
when the no-resource recovery code is accidentally invoked while
the receiver is still active. This could have happened if the
hardware left the 0x4000 bit of the RFA status word set. The comments
in the commit log for revision 1.142 stating that the driver could
clash with the hardware writing to this status word were not correct.
Tested by: Guy Helmer <ghelmer@palisadesys.com>
behaviour of the hardware: a possibly reserved bit of the receive
descriptor (RFA) `status' field is borrowed to record no-resource
(RNR) events, and the same status field is read and written to at
a time that may clash with the hardware updating this field.
There is no hardware documentation available to determine if these
things are safe to do; the second issue almost certainly isn't, and
the first is only safe if there is documentation saying that this
bit is free to be used by the driver. The PR referenced below
provides extremely convincing evidence that the changes cause random
crashes on some (unusual) hardware.
Since these features are only required by the DEVICE_POLLING case,
this commit makes their use conditional on that option. It does not
change the DEVICE_POLLING case, but at least people with the rare
hardware on which this code causes problems can now avoid the crashes
by not enabling DEVICE_POLLING.
PR: kern/42260
Reviewed by: luigi
Problem revision found by: Pawel Malachowski <pawmal@unia.3lo.lublin.pl>
Tested by: Pawel Malachowski <pawmal@unia.3lo.lublin.pl>
MFC after: 1 week
Linux driver defines 0x103[B-E] so add those as well.
Obtained from: Intel Linux e100 driver
MFC: Immediately if re@ allows it, otherwise after 4.7-RELEASE
Also take this chance to cleanup the code in fxp_intr_body.
Add a missing block of code to disable interrupts when
reinitializing the interface while doing polling (the RELENG_4
version was correct).
MFC after: 3 days
1.131 is slightly broken, and I would commit the fix to that here, but it
has been reported that any deviation from the original code is causing
problems with some 82557 chips, causing them to lock hard.
Until those issues have been figured out, going back to the original
code is the best plan.
Frustrated: Silby
are packets queued for transmission.
This driver is strange -- it never sets IFF_OACTIVE, so all
transmissions always cause a call to fxp_start. However, if the
link gets stuck, there was nothing to reset it, so there was still
a possibility of lockups.
MFC after: 3 days