struct ifnet or the layer 2 common structure it was embedded in have
been replaced with a struct ifnet pointer to be filled by a call to the
new function, if_alloc(). The layer 2 common structure is also allocated
via if_alloc() based on the interface type. It is hung off the new
struct ifnet member, if_l2com.
This change removes the size of these structures from the kernel ABI and
will allow us to better manage them as interfaces come and go.
Other changes of note:
- Struct arpcom is no longer referenced in normal interface code.
Instead the Ethernet address is accessed via the IFP2ENADDR() macro.
To enforce this ac_enaddr has been renamed to _ac_enaddr.
- The second argument to ether_ifattach is now always the mac address
from driver private storage rather than sometimes being ac_enaddr.
Reviewed by: sobomax, sam
latest 82550 and 82551 chipsets (revision IDs 0x0e, 0x0f and 0x10).
We were only enabling it for revisions 0x0c and 0x0d, now it's
enabled for any 8255x NIC with a revision ID bigger than 0x0c. It
should be safe, and this is what Intel does in their open source
driver.
MFC after: 2 weeks
Tested by: Pavel Lobach lobach_pavel at mail dot ru
from an mbuf into the fxp_encap() function, as done in other drivers.
- Don't waste time calling bus_dmamap_load_mbuf() if we know the mbuf
chain is too long to fit in a TX descriptor, call m_defrag() first.
- Convert fxp(4) to use bus_dmamap_load_mbuf_sg().
a vlan interface attached to a fxp(4) card when it has not been
initialized yet. We now set the links from our internel TX descriptor
structure to the TX command blocks at attach time rather than at init
time. While I'm here, slightly improve the style in fxp_attach().
PR: kern/78112
Reported by: Gavin Atkinson <gavin.atkinson@ury.york.ac.uk> and others
Tested by: flz, Gavin Atkinson <gavin.atkinson@ury.york.ac.uk>
MFC after: 1 week
testing it to know whether we should enable the 82503 serial mode...
Move code to the right location and disallow the use of the 82503
serial mode if the sc->revision field is 0 again. This makes fxp(4)
work correctly with ATMEL 350 93C46 cards (3 port 82559 based with a
82555 PHY), as well as with the older ATMEL 220 93C46 (same flavour)
and with the even older 10Mbps-only 82557 cards with the 82503 serial
interface.
Tested by: Andre Albsmeier <andrer@albsmeier.net>, krion
MFC after: 2 weeks
interface as well. This is not an expected revision id per the
datasheet, but unfortunately there are such cards out there with
a 82557 chipset, and they want to use the 82503.
PR: kern/75739
Reported by: Andre Albsmeier <andre.albsmeier@siemens.com>
datasheet says it is only valid for such chipsets and shouldn't be used
with others. This fixes some 82559 based cards which otherwise only
work at 10Mbit.
MFC after: 5 days
Tested by: krion
following drivers: bfe(4), em(4), fxp(4), lnc(4), tun(4), de(4) rl(4),
sis(4) and xl(4)
More patches are pending on: http://peoples.freebsd.org/~mlaier/ Please take
a look and tell me if "your" driver is missing, so I can fix this.
Tested-by: many
No-objection: -current, -net
sysctls were global (hw.fxp_rnr and hw.fxp_noflow), all of them are
now per-device. Sample output of "sysctl dev.fxp0" with this patch,
with the standard %foo nodes removed :
dev.fxp0.int_delay: 1000
dev.fxp0.bundle_max: 6
dev.fxp0.rnr: 0
dev.fxp0.noflow: 0
In particular, disabling it was likely to break configurations
involving ng_vlan(4) since the latter couldn't control
the parent's VLAN_MTU in the way vlan(4) did.
Pointed out by: ru
this problem put these lines back in. While they should be
unnecessary, they appear to be sometimes necessary.
Reviewed in concept: dfr
Approved by: re (scottl@)
to the pci attachment. Cardbus is a derived class of pci so all pci
drivers are automatically available for matching against cardbus devices.
Reviewed by: imp
if_xname, if_dname, and if_dunit. if_xname is the name of the interface
and if_dname/unit are the driver name and instance.
This change paves the way for interface renaming and enhanced pseudo
device creation and configuration symantics.
Approved By: re (in principle)
Reviewed By: njl, imp
Tested On: i386, amd64, sparc64
Obtained From: NetBSD (if_xname)
However, they are presently necessary due to bigger bogusness in the
pci bus layer not doing the right thing on suspend/resume or on
initial device probe. This is exactly the sort of thing that the
BURN_BRIDGES option was invented for. Mark all of them as
BURN_BRIDGES. As soon as I have the powerstate stuff properly
integrated into the pci bus code, I intend to remove all these
workarounds.
Add two new arguments to bus_dma_tag_create(): lockfunc and lockfuncarg.
Lockfunc allows a driver to provide a function for managing its locking
semantics while using busdma. At the moment, this is used for the
asynchronous busdma_swi and callback mechanism. Two lockfunc implementations
are provided: busdma_lock_mutex() performs standard mutex operations on the
mutex that is specified from lockfuncarg. dftl_lock() is a panic
implementation and is defaulted to when NULL, NULL are passed to
bus_dma_tag_create(). The only time that NULL, NULL should ever be used is
when the driver ensures that bus_dmamap_load() will not be deferred.
Drivers that do not provide their own locking can pass
busdma_lock_mutex,&Giant args in order to preserve the former behaviour.
sparc64 and powerpc do not provide real busdma_swi functions, so this is
largely a noop on those platforms. The busdma_swi on is64 is not properly
locked yet, so warnings will be emitted on this platform when busdma
callback deferrals happen.
If anyone gets panics or warnings from dflt_lock() being called, please
let me know right away.
Reviewed by: tmm, gibbs
in the netisr case. This would result in a lock reversal. This
fixes the net.isr.enable=1 case. Better performance might be
obtained by chaining all packets received, dropping the lock, and
then calling if_input() on each one.
Reported by: hmp
structure, which is new to the 82550 and 82551, is used to transmit
a packet. This appears to fix the packet truncation problem that was
observed when using 82550-based fxp cards to transmit ICMP or fragmented
UDP packets of certain lengths which only had one to three bytes in the
second and final mbuf of the packet. This matches a note in the "Intel
8255x 10/100 Mbps Ethernet Controller Family Open Source Software Developer
Manual", which says that the hardware parse bit should be set when sending
these types of packets.
There have also been unconfirmed reports of similar problems when
transmitting TCP packets, which should not be affected by the above
mentioned change because the hardware parse bit was already being set
if the stack requested hardware checksumming of the packet. If the
problem remains, the use of the IPCB structure can be disabled to
cause the driver to fall back to using the older 82559 interface with
82550-based cards by setting
hint.fxp.UNIT_NUMBER.ipcbxmit_disable
to a non-zero value at boot time, or using kenv to set this variable
before using kldload to load the fxp driver.
Approved by: re (jhb)
on if_fxp cards. When flow control is enabled, if the operating system
doesn't acknowledge the packet buffer filling, the card will begin to
generate ethernet quench packets, but appears to get into a feedback
loop of some sort, hosing local switches. This is a temporary workaround
for 5.1: the ability to configure flow control should probably be
exposed by some or another management interface on ethernet link layer
devices.
Approved by: re (bmah)
Reviewed by: mux
by using a __packed keyword for the fxp_rfa structure. The Intel
guys who designed this structure with unaligned fields deserve
to be shot.
Tested by: kris
Approved by: re@ (jhb)
Check for suspend before the device polling, rather than after it.
Check to see if the current thread owns the lock in ioctl and return
EBUSY if it does.
This advances the locking to the point that I can eject my fxp card 10
times in a row, but I agree with Jeff Hsu that we need to get the
network layer locking finished before chasing more of the races here
(actually, he doesn't think this set is worth it even). There's a
number of races between FXP_LOCK in detach and all other users of
FXP_LOCK, and this gets back to the 'device with sleepers being
forcibly detached' problem as well...
1) always call fxp_stop in fxp_detach. Since we don't read from
the card, there's no need to carefully look at things with
bus_child_present.
2) Call FXP_UNLOCK() before calling bus_teardown_intr to avoid
a possible deadlock reported by jhb.
3) add gone to the softc. Set it to true in detach.
4) Return immediately if gone is true in fxp_ioctl
5) Return immediately if gone is true in fxp_intr
- Add fxp_start_body() and change fxp_start() to just acquire locks and
then call fxp_start_body(). Places that would call fxp_start() with
locks held (mutex recursion) now call fxp_start_body() directly.
Remove MTX_RECURSE flag from sc_mtx. [gallatin]
- Change fxp_attach() to work without the softc lock, saving interrupt
hooking until the head of fxp_attach().
- Call ether_ifattach() before overriding ifp parameters. This reverts
part of 1.155.
- Remove multiple error paths in fxp_attach().
- Teardown interrupt in fxp_detach() before unlocking the softc.
- Make sure mutex is not held in fxp_release()
- Delete the miibus instance and/or self in fxp_release(), not in
fxp_detach(). This can happen if attach fails partway through.
- Move ifmedia_removeall to fxp_release() since attach may fail after
media have been allocated.
- Add locking to fxp_suspend, fxp_resume, fxp_start, fxp_intr,
fxp_poll, fxp_tick, fxp_ioctl, fxp_watchdog.
- Pass in ifp to fxp_intr_body since its callers sometimes already use
it.
- Add compatibility define for INTR_MPSAFE for 4.x. [gallatin]
- You don't need to bzero softc.
Ideas from: gallatin, mux
Tested by: >400M packets of dd/ssh, NFS, ping on i386 UP
network layer (ether).
- Don't abuse module names to facilitate ifconfig module loading;
such abuse isn't really needed. (And if we do need type information
associated with a module then we should make it explicit and not
use hacks.)
PCIM_CMD_MEMEN and PCIM_CMD_BUSMASTEREN, becaise some braindead
BIOSes (such as one found in my vprmatrix notebook) forget
to initialize it properly resulting in attachment failure.
- Remove a useless device_is_alive() check.
- Disable interrupts if bus_child_present() so that this
check is more useful.
This fixes the hangs I was seeing when unloading the fxp driver.
Suggestions from: hsu, njl
- Be sure to teardown the interrupt first so that "kldunload if_fxp"
doesn't panic the box. It's now deadlocking rather than crashing,
which isn't really better, but I'm unsure this is fxp(4)'s fault.
- Change a bus_dmamap_sync() call to also do a BUS_DMASYNC_PREREAD
now that we can pass several operations.
we're using an atomic operation to clear the suspend flag
in fxp_start(). Since other architectures may need the
same thing, we want to do it all the time and not only
in the __alpha__ case. However, we don't want to use
atomic operations on 16-bit integers, because those may
not be available on any architecture. We're thus faking
a 32-bit atomic operation here. This patch also deals
with endianness here.
which deals with both endianness and alignment issues.
- Collect low-hanging fruits for endianness safety.
- Use 0xffffffff instead of -1 where appropriate.
endian safe.
- Change some u_int to u_int8_t which make more sense here since
we're really defining bytes. That produces the same code due to
how bitfields work.
- Add the definition of the vlan_drop_en bit (not used yet).
- Add some useful comments.
Obtained from: NetBSD
other allocations/initializations have been successful. I kinda
doubt it will fix the recent breakage that some people are seeing,
but this could have caused problems for sure.
This patch is rather big because I had to significantly redesign
the driver to make the busdma conversion possible. Most notably,
hardware and software structures were carefully splitted to get
rid of all the structs overlapping evilness.
Special thanks to phk and Richard Puga <puga@mauibuilt.com> for
providing me with fxp(4) hardware to do this work.
Thanks to marcel for testing this on ia64, and to Fred Clift
<fclift@verio.net> for testing this on alpha.
Tested on: i386, ia64, alpha
the fxp driver. This is enabled only for the 82550/82551 chips
(PCI revision code 12 or 13). RX and TX checksum offload are
both supported. Transmit offload is limited to TCP and UDP only
right now: there seems to be a problem with IP header checksumming
on transmit in some cases.
This chip has hardware VLAN support as well. I hope to enable
support for this eventually.
o don't strip the Ethernet header from inbound packets; pass packets
up the stack intact (required significant changes to some drivers)
o reference common definitions in net/ethernet.h (e.g. ETHER_ALIGN)
o track ether_ifattach/ether_ifdetach API changes
o track bpf changes (use BPF_TAP and BPF_MTAP)
o track vlan changes (ifnet capabilities, revised processing scheme, etc.)
o use if_input to pass packets "up"
o call ether_ioctl for default handling of ioctls
Reviewed by: many
Approved by: re
just limited to the DEVICE_POLLING case. This removes the FXP_RFA_RNRMARK
hack, and replaces it with a softc flag that is used to record when
the handling of a no-resource condition was deferred due to running
out of DEVICE_POLLING cycles. This was tested on -stable, but the
code is essentially the same as in -current. It should only affect
the case where DEVICE_POLLING is defined.
The details of the mechanism behind the crashes are still uncertain
but the most likely cause seems to be some kind of hardware confusion
when the no-resource recovery code is accidentally invoked while
the receiver is still active. This could have happened if the
hardware left the 0x4000 bit of the RFA status word set. The comments
in the commit log for revision 1.142 stating that the driver could
clash with the hardware writing to this status word were not correct.
Tested by: Guy Helmer <ghelmer@palisadesys.com>
behaviour of the hardware: a possibly reserved bit of the receive
descriptor (RFA) `status' field is borrowed to record no-resource
(RNR) events, and the same status field is read and written to at
a time that may clash with the hardware updating this field.
There is no hardware documentation available to determine if these
things are safe to do; the second issue almost certainly isn't, and
the first is only safe if there is documentation saying that this
bit is free to be used by the driver. The PR referenced below
provides extremely convincing evidence that the changes cause random
crashes on some (unusual) hardware.
Since these features are only required by the DEVICE_POLLING case,
this commit makes their use conditional on that option. It does not
change the DEVICE_POLLING case, but at least people with the rare
hardware on which this code causes problems can now avoid the crashes
by not enabling DEVICE_POLLING.
PR: kern/42260
Reviewed by: luigi
Problem revision found by: Pawel Malachowski <pawmal@unia.3lo.lublin.pl>
Tested by: Pawel Malachowski <pawmal@unia.3lo.lublin.pl>
MFC after: 1 week
Linux driver defines 0x103[B-E] so add those as well.
Obtained from: Intel Linux e100 driver
MFC: Immediately if re@ allows it, otherwise after 4.7-RELEASE
Also take this chance to cleanup the code in fxp_intr_body.
Add a missing block of code to disable interrupts when
reinitializing the interface while doing polling (the RELENG_4
version was correct).
MFC after: 3 days
1.131 is slightly broken, and I would commit the fix to that here, but it
has been reported that any deviation from the original code is causing
problems with some 82557 chips, causing them to lock hard.
Until those issues have been figured out, going back to the original
code is the best plan.
Frustrated: Silby
are packets queued for transmission.
This driver is strange -- it never sets IFF_OACTIVE, so all
transmissions always cause a call to fxp_start. However, if the
link gets stuck, there was nothing to reset it, so there was still
a possibility of lockups.
MFC after: 3 days
most cases NULL is passed, but in some cases such as network driver locks
(which use the MTX_NETWORK_LOCK macro) and UMA zone locks, a name is used.
Tested on: i386, alpha, sparc64
instead of relying on the previous filters to be present.
Back out r1.125, as a reset is needed to unload any existing microcode,
(which clears the multicast addresses), as it is superceded by this change.
Non-SMP, i386-only, no polling in the idle loop at the moment.
To use this code you must compile a kernel with
options DEVICE_POLLING
and at runtime enable polling with
sysctl kern.polling.enable=1
The percentage of CPU reserved to userland can be set with
sysctl kern.polling.user_frac=NN (default is 50)
while the remainder is used by polling device drivers and netisr's.
These are the only two variables that you should need to touch. There
are a few more parameters in kern.polling but the default values
are adequate for all purposes. See the code in kern_poll.c for
more details on them.
Polling in the idle loop will be implemented shortly by introducing
a kernel thread which does the job. Until then, the amount of CPU
dedicated to polling will never exceed (100-user_frac).
The equivalent (actually, better) code for -stable is at
http://info.iet.unipi.it/~luigi/polling/
and also supports polling in the idle loop.
NOTE to Alpha developers:
There is really nothing in this code that is i386-specific.
If you move the 2 lines supporting the new option from
sys/conf/{files,options}.i386 to sys/conf/{files,options} I am
pretty sure that this should work on the Alpha as well, just that
I do not have a suitable test box to try it. If someone feels like
trying it, I would appreciate it.
NOTE to other developers:
sure some things could be done better, and as always I am open to
constructive criticism, which a few of you have already given and
I greatly appreciated.
However, before proposing radical architectural changes, please
take some time to possibly try out this code, or at the very least
read the comments in kern_poll.c, especially re. the reason why I
am using a soft netisr and cannot (I believe) replace it with a
simple timeout.
Quick description of files touched by this commit:
sys/conf/files.i386
new file kern/kern_poll.c
sys/conf/options.i386
new option
sys/i386/i386/trap.c
poll in trap (disabled by default)
sys/kern/kern_clock.c
initialization and hardclock hooks.
sys/kern/kern_intr.c
minor swi_net changes
sys/kern/kern_poll.c
the bulk of the code.
sys/net/if.h
new flag
sys/net/if_var.h
declaration for functions used in device drivers.
sys/net/netisr.h
NETISR_POLL
sys/dev/fxp/if_fxp.c
sys/dev/fxp/if_fxpvar.h
sys/pci/if_dc.c
sys/pci/if_dcreg.h
sys/pci/if_sis.c
sys/pci/if_sisreg.h
device driver modifications
and packet bundling. Make the microcode settings controllable via sysctl
and loader tunables.
Submitted by: Marko Zec <zec@tel.fer.hr>
(with some munging and dynamic sysctl support by me)
Also extend the workaround for Dynamic Standby mode to later '559 chips,
not just the ICH2 variants.
This is taken verbatim from the Intel's e100-1.6.22 release, with
the addition of their LICENSE file at the top.
Submitted by: Marko Zec <zec@tel.fer.hr>
the chip can cause a PCI protocol violation in under certain scenarios.
The workaround is to rewrite the EEPROM to disable Dynamic Standby Mode.
Once the EEPROM is rewritten, the system needs to be rebooted in order
to pick up the new settings.
This has been tested on several ICH2/ICH2-M systems, found in 815E based
boards, and usually identified by the presence of the 82562 ET/EM PHY.
Thanks to: Mike Tansca, Paul Saab for samples of the problematic boards.
of " &= ". Also change the MII PHY device mask to check the correct bits.
Cookie to: Andre Albsmeier <andre.albsmeier@mchp.siemens.de>
Pointy hat to: me
a 82557 (e.g.: a newer chip) then:
+ enable MWI, if the PCI configuration indicates the system supports it
+ enable usage of extended TxCB, for better performance
+ enable hardware flow control. FC frames will be passed up to the
host only if promiscuous mode is enabled.
mtx_enter(lock, type) becomes:
mtx_lock(lock) for sleep locks (MTX_DEF-initialized locks)
mtx_lock_spin(lock) for spin locks (MTX_SPIN-initialized)
similarily, for releasing a lock, we now have:
mtx_unlock(lock) for MTX_DEF and mtx_unlock_spin(lock) for MTX_SPIN.
We change the caller interface for the two different types of locks
because the semantics are entirely different for each case, and this
makes it explicitly clear and, at the same time, it rids us of the
extra `type' argument.
The enter->lock and exit->unlock change has been made with the idea
that we're "locking data" and not "entering locked code" in mind.
Further, remove all additional "flags" previously passed to the
lock acquire/release routines with the exception of two:
MTX_QUIET and MTX_NOSWITCH
The functionality of these flags is preserved and they can be passed
to the lock/unlock routines by calling the corresponding wrappers:
mtx_{lock, unlock}_flags(lock, flag(s)) and
mtx_{lock, unlock}_spin_flags(lock, flag(s)) for MTX_DEF and MTX_SPIN
locks, respectively.
Re-inline some lock acq/rel code; in the sleep lock case, we only
inline the _obtain_lock()s in order to ensure that the inlined code
fits into a cache line. In the spin lock case, we inline recursion and
actually only perform a function call if we need to spin. This change
has been made with the idea that we generally tend to avoid spin locks
and that also the spin locks that we do have and are heavily used
(i.e. sched_lock) do recurse, and therefore in an effort to reduce
function call overhead for some architectures (such as alpha), we
inline recursion for this case.
Create a new malloc type for the witness code and retire from using
the M_DEV type. The new type is called M_WITNESS and is only declared
if WITNESS is enabled.
Begin cleaning up some machdep/mutex.h code - specifically updated the
"optimized" inlined code in alpha/mutex.h and wrote MTX_LOCK_SPIN
and MTX_UNLOCK_SPIN asm macros for the i386/mutex.h as we presently
need those.
Finally, caught up to the interface changes in all sys code.
Contributors: jake, jhb, jasone (in no particular order)
an override as a loader settable variable (fxp_iomap). fxp_iomap is
a bitmap of fxp units that should be configured to use PCI I/O space
in stead of PCI Memory space.
Reviewed by: Kees Jan Koster <dutchman@tccn.cs.kun.nl>, dg@freebsd.org
All calls to mtx_init() for mutexes that recurse must now include
the MTX_RECURSE bit in the flag argument variable. This change is in
preparation for an upcoming (further) mutex API cleanup.
The witness code will call panic() if a lock is found to recurse but
the MTX_RECURSE bit was not set during the lock's initialization.
The old MTX_RECURSE "state" bit (in mtx_lock) has been renamed to
MTX_RECURSED, which is more appropriate given its meaning.
The following locks have been made "recursive," thus far:
eventhandler, Giant, callout, sched_lock, possibly some others declared
in the architecture-specific code, all of the network card driver locks
in pci/, as well as some other locks in dev/ stuff that I've found to
be recursive.
Reviewed by: jhb
claimed that their Intel NIC is comatose after a warm boot from Windoze.
This is most likely due to the card getting put in the D3 state. This
should bring it back to life.