Non-SMP, i386-only, no polling in the idle loop at the moment.
To use this code you must compile a kernel with
options DEVICE_POLLING
and at runtime enable polling with
sysctl kern.polling.enable=1
The percentage of CPU reserved to userland can be set with
sysctl kern.polling.user_frac=NN (default is 50)
while the remainder is used by polling device drivers and netisr's.
These are the only two variables that you should need to touch. There
are a few more parameters in kern.polling but the default values
are adequate for all purposes. See the code in kern_poll.c for
more details on them.
Polling in the idle loop will be implemented shortly by introducing
a kernel thread which does the job. Until then, the amount of CPU
dedicated to polling will never exceed (100-user_frac).
The equivalent (actually, better) code for -stable is at
http://info.iet.unipi.it/~luigi/polling/
and also supports polling in the idle loop.
NOTE to Alpha developers:
There is really nothing in this code that is i386-specific.
If you move the 2 lines supporting the new option from
sys/conf/{files,options}.i386 to sys/conf/{files,options} I am
pretty sure that this should work on the Alpha as well, just that
I do not have a suitable test box to try it. If someone feels like
trying it, I would appreciate it.
NOTE to other developers:
sure some things could be done better, and as always I am open to
constructive criticism, which a few of you have already given and
I greatly appreciated.
However, before proposing radical architectural changes, please
take some time to possibly try out this code, or at the very least
read the comments in kern_poll.c, especially re. the reason why I
am using a soft netisr and cannot (I believe) replace it with a
simple timeout.
Quick description of files touched by this commit:
sys/conf/files.i386
new file kern/kern_poll.c
sys/conf/options.i386
new option
sys/i386/i386/trap.c
poll in trap (disabled by default)
sys/kern/kern_clock.c
initialization and hardclock hooks.
sys/kern/kern_intr.c
minor swi_net changes
sys/kern/kern_poll.c
the bulk of the code.
sys/net/if.h
new flag
sys/net/if_var.h
declaration for functions used in device drivers.
sys/net/netisr.h
NETISR_POLL
sys/dev/fxp/if_fxp.c
sys/dev/fxp/if_fxpvar.h
sys/pci/if_dc.c
sys/pci/if_dcreg.h
sys/pci/if_sis.c
sys/pci/if_sisreg.h
device driver modifications
and packet bundling. Make the microcode settings controllable via sysctl
and loader tunables.
Submitted by: Marko Zec <zec@tel.fer.hr>
(with some munging and dynamic sysctl support by me)
Also extend the workaround for Dynamic Standby mode to later '559 chips,
not just the ICH2 variants.
This is taken verbatim from the Intel's e100-1.6.22 release, with
the addition of their LICENSE file at the top.
Submitted by: Marko Zec <zec@tel.fer.hr>
the chip can cause a PCI protocol violation in under certain scenarios.
The workaround is to rewrite the EEPROM to disable Dynamic Standby Mode.
Once the EEPROM is rewritten, the system needs to be rebooted in order
to pick up the new settings.
This has been tested on several ICH2/ICH2-M systems, found in 815E based
boards, and usually identified by the presence of the 82562 ET/EM PHY.
Thanks to: Mike Tansca, Paul Saab for samples of the problematic boards.
of " &= ". Also change the MII PHY device mask to check the correct bits.
Cookie to: Andre Albsmeier <andre.albsmeier@mchp.siemens.de>
Pointy hat to: me
a 82557 (e.g.: a newer chip) then:
+ enable MWI, if the PCI configuration indicates the system supports it
+ enable usage of extended TxCB, for better performance
+ enable hardware flow control. FC frames will be passed up to the
host only if promiscuous mode is enabled.
mtx_enter(lock, type) becomes:
mtx_lock(lock) for sleep locks (MTX_DEF-initialized locks)
mtx_lock_spin(lock) for spin locks (MTX_SPIN-initialized)
similarily, for releasing a lock, we now have:
mtx_unlock(lock) for MTX_DEF and mtx_unlock_spin(lock) for MTX_SPIN.
We change the caller interface for the two different types of locks
because the semantics are entirely different for each case, and this
makes it explicitly clear and, at the same time, it rids us of the
extra `type' argument.
The enter->lock and exit->unlock change has been made with the idea
that we're "locking data" and not "entering locked code" in mind.
Further, remove all additional "flags" previously passed to the
lock acquire/release routines with the exception of two:
MTX_QUIET and MTX_NOSWITCH
The functionality of these flags is preserved and they can be passed
to the lock/unlock routines by calling the corresponding wrappers:
mtx_{lock, unlock}_flags(lock, flag(s)) and
mtx_{lock, unlock}_spin_flags(lock, flag(s)) for MTX_DEF and MTX_SPIN
locks, respectively.
Re-inline some lock acq/rel code; in the sleep lock case, we only
inline the _obtain_lock()s in order to ensure that the inlined code
fits into a cache line. In the spin lock case, we inline recursion and
actually only perform a function call if we need to spin. This change
has been made with the idea that we generally tend to avoid spin locks
and that also the spin locks that we do have and are heavily used
(i.e. sched_lock) do recurse, and therefore in an effort to reduce
function call overhead for some architectures (such as alpha), we
inline recursion for this case.
Create a new malloc type for the witness code and retire from using
the M_DEV type. The new type is called M_WITNESS and is only declared
if WITNESS is enabled.
Begin cleaning up some machdep/mutex.h code - specifically updated the
"optimized" inlined code in alpha/mutex.h and wrote MTX_LOCK_SPIN
and MTX_UNLOCK_SPIN asm macros for the i386/mutex.h as we presently
need those.
Finally, caught up to the interface changes in all sys code.
Contributors: jake, jhb, jasone (in no particular order)
an override as a loader settable variable (fxp_iomap). fxp_iomap is
a bitmap of fxp units that should be configured to use PCI I/O space
in stead of PCI Memory space.
Reviewed by: Kees Jan Koster <dutchman@tccn.cs.kun.nl>, dg@freebsd.org
All calls to mtx_init() for mutexes that recurse must now include
the MTX_RECURSE bit in the flag argument variable. This change is in
preparation for an upcoming (further) mutex API cleanup.
The witness code will call panic() if a lock is found to recurse but
the MTX_RECURSE bit was not set during the lock's initialization.
The old MTX_RECURSE "state" bit (in mtx_lock) has been renamed to
MTX_RECURSED, which is more appropriate given its meaning.
The following locks have been made "recursive," thus far:
eventhandler, Giant, callout, sched_lock, possibly some others declared
in the architecture-specific code, all of the network card driver locks
in pci/, as well as some other locks in dev/ stuff that I've found to
be recursive.
Reviewed by: jhb
claimed that their Intel NIC is comatose after a warm boot from Windoze.
This is most likely due to the card getting put in the D3 state. This
should bring it back to life.
is enabling as all entries are still called with Giant being held.
Maintaining compatability with NetBSD makes what should be very simple
kinda ugly.
Reviewed by: Jason Evans
understand exactly what it is about SMPng that tickles this bug. What I
do know is that the foo_init() routine in most drivers is often called
twice when an interface is brought up. One time is due to the ifconfig(8)
command calling the SIOCSIFFLAGS ioctl to set the IFF_UP flag, and another
is probably due to the kernel calling ifp->if_init at some point. In any
case, the SMPng changes seem to affect the timing of these two events in
such a way that there is a significant delay before any packets are sent
onto the wire after the interface is first brought up. This manifested
itself locally as an SMPng test machine which failed to obtain an address
via DHCP when booting up.
It looks like the second call to fxp_init() is happening faster now than
it did before, and I think it catches the chip while it's in the process
of dealing with the configuration command from the first call. Whatever
the case, a FXP_CSR_SCB_CNA interrupt event is now generated shortly after
the second fxp_init() call. (This interrupt is apparently never generated
by a non-SMPng kernel, so nobody noticed.)
There are two problems with this: first, fxp_intr() does not handle the
FXP_CSR_SCB_CNA interrupt event (it never tests for it or does anything
to deal with it), and second, the meaning of FXP_CSR_SCB_CNA is not
documented in the driver. (Apparently it means "command unit not active.")
Bad coder. No biscuit.
The fix is to have the FXP_CSR_SCB_CNA interrupt handled just like the
FXP_SCB_STATACK_CXTNO interrupt. This prevents the state machine for
the configuration/RX filter programming stuff from getting wedged for
several seconds and preventing packet transmission.
Noticed by: jhb
lock up under moderate to heavy load.
The status & command fields share a 32-bit longword. The programming
API of the eepro apparently requires that you update the command field
of a transmit slot that you've already given to the card. This means
the card could be updating the status field of the same longword at
the same time. Since alphas can only operate on 32-bit chunks of
memory, both the status & command fields are loaded from memory &
operated on in registers when the following line of C is executed:
sc->cbl_last->cb_command &= ~FXP_CB_COMMAND_S;
The race is caused by the card DMA'ing up the status at just the wrong
time -- after it has been loaded into a register & before it has been
written back. The old value of the status is written back, clobbering
the status the card just DMA'ed up. The fact that the card has sent
this frame is missed & the transmit engine appears to hang.
Luckily, as numerous people on the freebsd-alpha list pointed out, the
load-locked/store-conditional instructions used by the atomic
functions work with respect changes in memory due to I/O devices. We
now use them to safely update the command field.
Tested by: Bernd Walter <ticso@mail.cicely.de>
ether_ifdetach().
The former consolidates the operations of if_attach(), ng_ether_attach(),
and bpfattach(). The latter consolidates the corresponding detach operations.
Reviewed by: julian, freebsd-net
of the individual drivers and into the common routine ether_input().
Also, remove the (incomplete) hack for matching ethernet headers
in the ip_fw code.
The good news: net result of 1016 lines removed, and this should make
bridging now work with *all* Ethernet drivers.
The bad news: it's nearly impossible to test every driver, especially
for bridging, and I was unable to get much testing help on the mailing
lists.
Reviewed by: freebsd-net