- don't force busdma to pre-allocate bounce pages for parent tag.
- use system supplied roundup2 macro instead of rolling its own version.
- TX/RX decriptor length should be multiple of 128. There is no
no need to expand the size with the multiple of 4096.
- don't create/destroy DMA maps in TX/RX handlers. Use pre-allocated
DMA maps. Since creating DMA maps on sparc64 is time consuming
operations(resource mananger overhead), this change should boost
performance on sparc64. I could get > 2x speedup on Ultra60.
- TX/RX descriptors could be aligned on 128 boundary. Aligning them
on PAGE_SIZE is waste of resource.
- don't blindly create TX DMA tag with size of MCLBYTES * 8. The size
is only valid under jumbo frame environments. Instead of using the
hardcoded value, re-compute necessary size on the fly.
- RX side bus_dmamap_load_mbuf_sg(9) support.
- remove unused macro EM_ROUNDUP and constant EM_MMBA.
Reviewed by: scottl
Tested by: glebius
for a notebook with em(4) adapter.
- Introduce tunables em.hw.txd and em.hw.rxd, which allow administrator
to configure number of transmit and receive descriptors.
- Check em.hw.txd and em.hw.rxd against hardware limits [*] and require
them to be multiple of 128.
[*] According to comments in if_em.h the 82540EM/82541ER chips can handle
more than 256 descriptors. Since we don't have this hardware to test,
we decided to mimic NetBSD wm(4) driver, that limits these chips to
256 descriptors.
In collaboration with: yongari
rather than in ifindex_table[]; all (except one) accesses are
through ifp anyway. IF_LLADDR() works faster, and all (except
one) ifaddr_byindex() users were converted to use ifp->if_addr.
- Stop storing a (pointer to) Ethernet address in "struct arpcom",
and drop the IFP2ENADDR() macro; all users have been converted
to use IF_LLADDR() instead.
overruns and number of watchdog timeouts.
- Do not log(9) RX overrun events, since this pessimizes
things under load [1].
- Do not increase if->if_oerrors in em_watchdog(), since
this leads to counter slipping back, when if->if_oerrors
is recalculated in em_update_stats_counters(). Instead
increase watchdog counter in em_watchdog() and take it
into account in em_update_stats_counters().
Submitted by: ade [1]
- disable jumbo frame support on strict alignment architectures due
to the limitation of hardware. The driver needs a fix-up code for
RX side. The fix will show up in near future.
- fix endian issue for 82544 on PCI-X bus. I couldn't test this as
I don't have the NIC/hardware.
- prefer PCIR_BAR to hardcoded EM_MMBA.
- Properly checks for for 64bit BAR [1]
- replace inl/outl with bus_space(9) [1]
- fix endian issue on VLAN handling.
- reorder header files and remove unnecessary one.
Reviewed by: cognet
No response from: pdeuskar, tackerman
Obtained from: OpenBSD [1]
since the link takes a bit to negotiate, the information is pretty
much never available during the probe. As such, the boot output
pretty much always prints N/A for speed and duplex. Since we print
out the output of ifconfig during the user space boot, this early
boot information is also generally redundant, and added to the noise.
MFC after: 2 weeks
- Destroy mutex in case of attach failure. [1]
- Lock properly em_watchdog(). [1]
- Lock properly em_sysctl_int_delay(). [1]
- Remove unused global adapter linked list.
- Remove unused dma_size field from struct em_dma_alloc.
- Do not touch interface statistics, that must be edited
only by upper layers. [1]
Submitted by: yongari [1]
o Do not mask the RX overrun interrupt.
o Rewrite em_intr():
- Axe EM_MAX_INTR.
- Cycle acknowledging interrupts and processing
packets until zero interrupt cause register is
read.
- If RX overrun comes in log this fact. [ NetBSD also
resets adapter in this case, but my tests showed that
this is not needed and only pessimizes behavior under
heavy load. ]
- Since almost all functions is rewritten, style the
remaining lines.
This fixes em(4) interfaces wedging under high load.
In collaboration with: wpaul, cognet
Obtained from: NetBSD
flag. This fixes panic, when 'ifconfig em0 down' was called and it calls
em_stop() while the em_process_receive_interrupts() has temporarily
dropped the lock.
the descriptors set.
- In em_process_receive_interrupts(), call bus_dmamap_sync() for the
descriptors set each time we modify one descriptor, instead of doing it only
at the function exit, to make sure the adapters know he can re-use the
descriptor.
This helps on arm with write-back data cache (and possibly on other arches
with bounce pages, I don't know) under heavy network load. Without this,
if we attempt to process more than num_rx_desc descriptors, the adapter
would just stop processing rx interrupts.
The receive function em_process_receive_interrupts() unlocks the
adapter while ether_input() processes the packet, and then locks
it back. In the meantime, em_init() may be called, either from
em_watchdog() from softclock interrupt or from the ifconfig(8)
program. The em_init() resets the card, in particular it sets
adapter->next_rx_desc_to_check to 0 and resets hardware RX Head
and Tail descriptor pointers. The loop in
em_process_receive_interrupts() does not expect these things to
change, and a mess may result.
This fixes long wedges of em(4) interfaces receive part under high
load and IP fastforwarding enabled.
PR: kern/87418
Submitted by: Dmitrij Tejblum <tejblum yandex-team.ru>
opt_device_polling.h
- Include opt_device_polling.h into appropriate files.
- Embrace with HAVE_KERNEL_OPTION_HEADERS the include in the files that
can be compiled as loadable modules.
Reviewed by: bde
o Axe poll in trap.
o Axe IFF_POLLING flag from if_flags.
o Rework revision 1.21 (Giant removal), in such a way that
poll_mtx is not dropped during call to polling handler.
This fixes problem with idle polling.
o Make registration and deregistration from polling in a
functional way, insted of next tick/interrupt.
o Obsolete kern.polling.enable. Polling is turned on/off
with ifconfig.
Detailed kern_poll.c changes:
- Remove polling handler flags, introduced in 1.21. The are not
needed now.
- Forget and do not check if_flags, if_capenable and if_drv_flags.
- Call all registered polling handlers unconditionally.
- Do not drop poll_mtx, when entering polling handlers.
- In ether_poll() NET_LOCK_GIANT prior to locking poll_mtx.
- In netisr_poll() axe the block, where polling code asks drivers
to unregister.
- In netisr_poll() and ether_poll() do polling always, if any
handlers are present.
- In ether_poll_[de]register() remove a lot of error hiding code. Assert
that arguments are correct, instead.
- In ether_poll_[de]register() use standard return values in case of
error or success.
- Introduce poll_switch() that is a sysctl handler for kern.polling.enable.
poll_switch() goes through interface list and enabled/disables polling.
A message that kern.polling.enable is deprecated is printed.
Detailed driver changes:
- On attach driver announces IFCAP_POLLING in if_capabilities, but
not in if_capenable.
- On detach driver calls ether_poll_deregister() if polling is enabled.
- In polling handler driver obtains its lock and checks IFF_DRV_RUNNING
flag. If there is no, then unlocks and returns.
- In ioctl handler driver checks for IFCAP_POLLING flag requested to
be set or cleared. Driver first calls ether_poll_[de]register(), then
obtains driver lock and [dis/en]ables interrupts.
- In interrupt handler driver checks IFCAP_POLLING flag in if_capenable.
If present, then returns.This is important to protect from spurious
interrupts.
Reviewed by: ru, sam, jhb
make function reenterable. In the runtime the race is masked by serializing
of em_process_receive_interrupts() either by interrupt thread, or by
polling. The race can be triggered when polling is switched on or off.
replacement and has additional features which make it superior.
Discussed on: -arch
Reviewed by: thompsa
X-MFC-after: never (RELENG_6 as transition period)
could get an interrupt after we free the ifp, and the interrupt
handler depended on the ifp being still alive, this could, in theory,
cause a crash. Eliminate this possibility by moving the if_free to
after the bus_teardown_intr() call.
IFF_DRV_RUNNING, as well as the move from ifnet.if_flags to
ifnet.if_drv_flags. Device drivers are now responsible for
synchronizing access to these flags, as they are in if_drv_flags. This
helps prevent races between the network stack and device driver in
maintaining the interface flags field.
Many __FreeBSD__ and __FreeBSD_version checks maintained and continued;
some less so.
Reviewed by: pjd, bz
MFC after: 7 days
over iteration of their multicast address lists when synchronizing the
hardware address filter with the network stack-maintained list.
Problem reported by: Ed Maste (emaste at phaedrus dot sandvine dot ca>
MFC after: 1 week
struct ifnet or the layer 2 common structure it was embedded in have
been replaced with a struct ifnet pointer to be filled by a call to the
new function, if_alloc(). The layer 2 common structure is also allocated
via if_alloc() based on the interface type. It is hung off the new
struct ifnet member, if_l2com.
This change removes the size of these structures from the kernel ABI and
will allow us to better manage them as interfaces come and go.
Other changes of note:
- Struct arpcom is no longer referenced in normal interface code.
Instead the Ethernet address is accessed via the IFP2ENADDR() macro.
To enforce this ac_enaddr has been renamed to _ac_enaddr.
- The second argument to ether_ifattach is now always the mac address
from driver private storage rather than sometimes being ac_enaddr.
Reviewed by: sobomax, sam
- Changed from using explicit devices id to using descriptive labels.
- Added support for 82573 and 82546 Quad adapters.
- Corrected support for 82547EI and 82541ER (mac_type was not assigned)
- Removed #ifdef DBG_STATS and extraneous code.
if_em_hw.c/if_em_hw.h
- Added support for 82573 and 82546 Quad adapters.
- Brought forward Intel's most current mac and phy changes.
the interface when going to toggle VLAN support for
internal reasons. If the IFCAP_VLAN_HWTAGGING bit is
cleared, we should rely on the (re)init routine to turn
VLAN support off and never touch the relevant hardware bits.
This applies to other capability bits, too. The user
obviously has a reason for clearing a capability bit,
e.g., if his particular NIC is buggy and hangs if a
certain hardware capability is turned on even for a
fraction of a second.
The flag adapter->em_insert_vlan_header still is set or
reset irrespective of the IFCAP_VLAN_HWTAGGING setting,
as before, in order to handle the case when a user sets
promiscuous mode on an interface first and later turns
its IFCAP_VLAN_HWTAGGING bit on.
This change might look orthogonal to rev#1.85, but in fact
it is not. It introduces bugfixes that hopefully will make
implementing the general scheme mentioned in the commit
message of rev#1.85 easier.
configuration: it appears to work properly in the non-promiscuous case, but
we've not yet implemented a more general solution that maintains full
functionality with promiscuous mode enabled. While my hope is that we can
get one implemented soon, this will improve functionality substantially in
the mean time.
MFC after: 3 days
- Because em_encap() can now fail in a way that leaves us without an
mbuf chain, potentially set *m_headp to NULL if that happens, so that
the caller can do the right thing. This case can occur when we try
to prepend the vlan header mbuf but can't allocate additional memory.
- Modify the caller of em_encap() to detect a NULL m_head and not try
to queue the mbuf if that happens.
- When em_encap() fails, make sure to call bus_dmamap_destroy() to
clean up.
promiscuous mode introduced in 1.45, which programs the em card not
to strip or prepend tags when in promiscuous mode without also
modifying behavior to manually prepend a vlan header in the event
that the card isn't doing it on transmit. Due to a feature of card
operation, if the global VLAN prepend/strip register isn't set,
setting the VLAN tag flag on individual packet descriptors will
cause the packet to be transmitted using ISL encapsulation rather
than 802.1Q VLAN encapsulation.
This fix causes em_encap() to prepend the header by tracking whether
the card is configured to temporarily disable prepending/stripping
due to promiscuous mode. As a result, entering promiscuous mode on
the parent em interface no longer causes vlans to appear to "wedge"
or transmit ISL-encapsulated frames, which typically will not be
configured/spoken by the other endpoints on the VLAN trunk. This
bug may also exist in other drivers, and the additional vlan
encapsulation logic should be abstracted and centralized in
if_vlan.c if so.
RELENG_5_3 candidate.
MFC after: 1 week
Tested by: pjd, rwatson
Reported by: astesin at ukrtelecom dot net
Reported by: Mike Tancsa <mike at sentex dot net>
Reported by: Iasen Kostov <tbyte at OTEL dot net>
with a weak memory model or x86 + PAE (or more specifically, your
driver is using bounce pages) and you have had problems with em(4),
this may fix it. At least this is needed to have em(4) work properly
on FreeBSD/arm.
Original version by: cognet
Reviewed by: tackerman
Tested by: cognet
Since the e1000 DMA engines hava no constraints on the alignment of buffer
transfers, there is no reason to tell busdma that there is. This save a
minimum of 1 malloc call per packet, which translates to eliminating 4 locks.
It also means that buffers are not needlessly bounced when transfered. The
end result is a 38% improvement in pps in a 4 way bridging environment.
Obtained from: Sandvine, Inc.
Removed support for Intel 82541ER
Added fix for 82547 which corrects an issue with Jumbo frames larger than 10k.
Added fix for vlan tagged frames not being properly bridged.
Corrected TBI workaround.
Corrected incorrect LED operation issues
Submitted by: tackerman (Tony Ackerman)
MFC after: 2 weeks
following drivers: bfe(4), em(4), fxp(4), lnc(4), tun(4), de(4) rl(4),
sis(4) and xl(4)
More patches are pending on: http://peoples.freebsd.org/~mlaier/ Please take
a look and tell me if "your" driver is missing, so I can fix this.
Tested-by: many
No-objection: -current, -net
because VLAN hardware features are enabled in em(4) by default.
Note: Currently vlan(4) has a bug that it consults
if_capabilities, not if_capenable. This will be fixed
after all the network drivers set VLAN bits in
if_capenable properly.
mutex to be locked. It is redundant since em_init() is called and this
correctly locks the mutex and calls em_stop().
5.2 release candidate since this can cause a panic if the watchdog
expires.
Tested by: kuriyama
- In the receive routine handle the case where last descriptor could have
less than 4 bytes of data.
- Handle race between detach/ioctl routine.
MFC after: 3 days
if_xname, if_dname, and if_dunit. if_xname is the name of the interface
and if_dname/unit are the driver name and instance.
This change paves the way for interface renaming and enhanced pseudo
device creation and configuration symantics.
Approved By: re (in principle)
Reviewed By: njl, imp
Tested On: i386, amd64, sparc64
Obtained From: NetBSD (if_xname)
lock around a call to the original function. Make the timeout
function in callout_reset() use the wrapped function to avoid a
lock assertion panic.
Reviewed by: sam
Reported by: cgiordano@ids.net
o correct recursive locking when polling and in em_82547_move_tail
o destroy mutex on detach
o add EM_LOCK_ASSERT and similar macros for creating+deleteing the mtx
Submitted by: Daniel Eischen <eischen@vigrid.com>
Bug Fixes:
- Allow users to use LAA
- Remember promiscuous mode settings while bridging
- Allow gratuitous arp's to be sent
PR: 52966/54488
MFC after: 1 week
recompiling the driver. See the comments near the top of "if_em.h"
for descriptions of these delays. Four new loader tunables control
the system-wide default values:
hw.em.tx_int_delay
hw.em.rx_int_delay
hw.em.tx_abs_int_delay
hw.em.rx_abs_int_delay
The tunables are specified in microseconds. The valid range is
0-67108 usec., and 0 means that the timer is disabled.
There are also four new sysctls (actually, a set of four for each
"em" device in the system) to query and change the interrupt delays
after the system is up:
hw.em0.tx_int_delay
hw.em0.rx_int_delay
hw.em0.tx_abs_int_delay (not present for 82542/3/4 adapters)
hw.em0.rx_abs_int_delay (not present for 82542/3/4 adapters)
It seems to be OK to change these values even while the adapter is
passing traffic.
Approved by: Prafulla Deuskar <pdeuskar@FreeBSD.ORG>
MFC after: 4 weeks
Add two new arguments to bus_dma_tag_create(): lockfunc and lockfuncarg.
Lockfunc allows a driver to provide a function for managing its locking
semantics while using busdma. At the moment, this is used for the
asynchronous busdma_swi and callback mechanism. Two lockfunc implementations
are provided: busdma_lock_mutex() performs standard mutex operations on the
mutex that is specified from lockfuncarg. dftl_lock() is a panic
implementation and is defaulted to when NULL, NULL are passed to
bus_dma_tag_create(). The only time that NULL, NULL should ever be used is
when the driver ensures that bus_dmamap_load() will not be deferred.
Drivers that do not provide their own locking can pass
busdma_lock_mutex,&Giant args in order to preserve the former behaviour.
sparc64 and powerpc do not provide real busdma_swi functions, so this is
largely a noop on those platforms. The busdma_swi on is64 is not properly
locked yet, so warnings will be emitted on this platform when busdma
callback deferrals happen.
If anyone gets panics or warnings from dflt_lock() being called, please
let me know right away.
Reviewed by: tmm, gibbs
Add sysctl's to display statistics/debug_info
Set WAIT_FOR_AUTONEG_DEFAULT to zero by default
Increment packet in/out statistics inline instead of every two seconds.
MFC after: 3 days
- Use htole* macros where appropriate so that the driver could work on non-x86 architectures
- Use m_getcl() instead of MGETHDR/MCLGET macros
Submitted by: sam (Sam Leffler)
network layer (ether).
- Don't abuse module names to facilitate ifconfig module loading;
such abuse isn't really needed. (And if we do need type information
associated with a module then we should make it explicit and not
use hacks.)
- Added support for ITR (interrupt throttle register). This feature is available on
adapters based on 82545 and above
- Fixed problem with vlan support when traffic has priority bits set. (kern/45907)
PR: kern/45907
MFC after: 1 week
o don't strip the Ethernet header from inbound packets; pass packets
up the stack intact (required significant changes to some drivers)
o reference common definitions in net/ethernet.h (e.g. ETHER_ALIGN)
o track ether_ifattach/ether_ifdetach API changes
o track bpf changes (use BPF_TAP and BPF_MTAP)
o track vlan changes (ifnet capabilities, revised processing scheme, etc.)
o use if_input to pass packets "up"
o call ether_ioctl for default handling of ioctls
Reviewed by: many
Approved by: re
- Set RDTR to zero by default instead of 28.
- Fixed a problem with TX hangs with jumbo frames when number of fragments in the mbuf chain
is large.
- Added support for 82540EP based cards.
MFC after: 3 days
We are having panics with the driver under stress with jumbo frames.
Unfortunately we didnot catch it during our regular test cycle.
I am going to MFC the backout immediately.