47 Commits

Author SHA1 Message Date
Robert Watson
f2d2d69438 Rework netisr policy mechanism so that per-protocol dispatch policies can
be represented:

- A single policy namespace is defined, consisting of four possible
  policies: "default" to use the global default, "deferred" to force
  deferred dispatch, "direct" to employ direct dispatch where possible, and
  "hybrid" which makes a dynamic decision based on CPU affinity, ordering,
  etc.  Routines are implemented to convert between strings and an integer
  namespace.

- A new global variable, netisr_dispatch_policy, subsumes existing global
  variables for direct dispatch, forced direct dispatch, etc, and is used
  for explicit policy interpretation and composition.  Old variables remain
  so that they can be exported by legacy sysctls for use by old netstat(1)
  binaries.  A new sysctl and tunable, netisr.dispatch.policy, accepts the
  above strings for specifying a global policy default.

- The protocol registration structure, netisr_handler, grows an nh_dispatch
  field, which accepts a per-protocol policy override.  The default value is
  '0', which corresponds to "default", meaning that protocols will accept
  the global default policy unless otherwise specified.

- Policies are now interpreted and composed explicitly at various points in
  packet dispatch; protocol policies override global policies.

- Protocols grow the ability to express a non-opinion about affinity even
  when implementing m2cpuid by returning NETISR_CPUID_NONE.  In that case, the
  framework falls back on source ordering, rather than simply using the
  current CPU.

These changes are in support of allowing link layer re-dispatch based on
RSS or similar hashes provided by NICs, especially in the case where the
number of hardware receive queues matches hardware core count, rather than
hardware thread count, requiring further software redistribution (e.g.,
on RMI XLR).

MFC after:      3 weeks
Reviewed by:    bz
Sponsored by:   Juniper Networks, Inc.
2011-05-24 12:34:19 +00:00
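The policy namespace and the override rule described in f2d2d69438 can be sketched in userland C. This is only an illustration under stated assumptions, not the kernel code: the enum constants, the lookup table, and the functions `policy_from_str`, `policy_to_str`, and `policy_compose` are hypothetical names. Only the four policy strings, the convention that 0 means "default", and the rule that a protocol policy overrides the global policy come from the commit message.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical integer namespace; 0 is "default", as the commit states. */
enum policy {
	POLICY_INVALID = -1,
	POLICY_DEFAULT = 0,
	POLICY_DEFERRED,
	POLICY_HYBRID,
	POLICY_DIRECT
};

static const struct {
	enum policy	 value;
	const char	*name;
} policy_table[] = {
	{ POLICY_DEFAULT,	"default" },
	{ POLICY_DEFERRED,	"deferred" },
	{ POLICY_HYBRID,	"hybrid" },
	{ POLICY_DIRECT,	"direct" },
};

#define	POLICY_TABLE_LEN	(sizeof(policy_table) / sizeof(policy_table[0]))

/* Convert a policy string to its integer value. */
static enum policy
policy_from_str(const char *s)
{
	for (size_t i = 0; i < POLICY_TABLE_LEN; i++)
		if (strcmp(s, policy_table[i].name) == 0)
			return (policy_table[i].value);
	return (POLICY_INVALID);
}

/* Convert an integer policy back to its string, or NULL if unknown. */
static const char *
policy_to_str(enum policy p)
{
	for (size_t i = 0; i < POLICY_TABLE_LEN; i++)
		if (policy_table[i].value == p)
			return (policy_table[i].name);
	return (NULL);
}

/*
 * Compose the global and per-protocol policies: the protocol wins
 * unless it registered "default".  The global policy is assumed to be
 * a concrete choice already.
 */
static enum policy
policy_compose(enum policy global, enum policy proto)
{
	assert(global != POLICY_DEFAULT && global != POLICY_INVALID);
	return (proto == POLICY_DEFAULT ? global : proto);
}
```

With a global policy of "direct", a protocol that left its override at 0 composes to "direct", while one that registered "deferred" composes to "deferred".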
Robert Watson
60efbc9991 Whitespace tweak.
MFC after:	3 days
2010-03-01 00:43:05 +00:00
Robert Watson
c4fbf89fc5 Fix constant assignment for netisr protocol information sysctl.
MFC after:	1 week
Spotted by:	bz
2010-02-22 16:16:16 +00:00
Robert Watson
2d22f334ea Export netisr configuration and statistics to userspace via sysctl(9).
MFC after:	1 week
Sponsored by:	Juniper Networks
2010-02-22 15:03:16 +00:00
Bjoern A. Zeeb
d0ea47437a Update epair(4) to the new netisr implementation and polish
things a bit:
- use dpcpu data to track the ifps with packets queued up,
- per-cpu locking and driver flags
- along with .nh_drainedcpu and NETISR_POLICY_CPU.
- Put the mbufs in flight reference count, preventing interfaces
  from going away, under INVARIANTS as this is a general problem
  of the stack and should be solved in if.c/netisr but still good
  to verify the internal queuing logic.
- Permit changing the MTU to virtually anything, as we do for loopback.

Hook epair(4) up to the build.

Approved by:	re (kib)
2009-07-26 12:20:07 +00:00
Bjoern A. Zeeb
ed655c8c07 Add an optional callback function that will be invoked when a per-CPU
queue was drained.  It will never fire for a directly dispatched packet.

You will most likely never want to use this for any ordinary netisr usage
and you will never blame netisr in case you try to use it and it does
not work as expected.

Reviewed by:	rwatson
2009-06-14 17:15:18 +00:00
Robert Watson
ed54411c19 Garbage collect NETISR_POLL and NETISR_POLLMORE, which are no longer
required for options DEVICE_POLLING.

De-fragment the NETISR_ constant space and lower NETISR_MAXPROT from
32 to 16 -- when sizing queue arrays using this compile-time constant,
significant amounts of memory are saved.

Warn on the console when tunable values for netisr are automatically
adjusted during boot due to exceeding limits, invalid values, or as a
result of DEVICE_POLLING.
2009-06-01 15:03:58 +00:00
Robert Watson
d4b5cae49b Reimplement the netisr framework in order to support parallel netisr
threads:

- Support up to one netisr thread per CPU, each processing its own
  workstream, or set of per-protocol queues.  Threads may be bound
  to specific CPUs, or allowed to migrate, based on a global policy.

  In the future it would be desirable to support topology-centric
  policies, such as "one netisr per package".

- Allow each protocol to advertise an ordering policy, which can
  currently be one of:

  NETISR_POLICY_SOURCE: packets must maintain ordering with respect to
    an implicit or explicit source (such as an interface or socket).

  NETISR_POLICY_FLOW: make use of mbuf flow identifiers to place work,
    as well as allowing protocols to provide a flow generation function
    for mbufs without flow identifiers (m2flow).  Falls back on
    NETISR_POLICY_SOURCE if no flow ID is available.

  NETISR_POLICY_CPU: allow protocols to inspect and assign a CPU for
    each packet handled by netisr (m2cpuid).

- Provide utility functions for querying the number of workstreams
  being used, as well as a mapping function from workstream to CPU ID,
  which protocols may use in work placement decisions.

- Add explicit interfaces to get and set per-protocol queue limits, and
  get and clear drop counters, which query data or apply changes across
  all workstreams.

- Add a more extensible netisr registration interface, in which
  protocols declare 'struct netisr_handler' structures for each
  registered NETISR_ type.  These include name, handler function,
  optional mbuf to flow ID function, optional mbuf to CPU ID function,
  queue limit, and ordering policy.  Padding is present to allow these
  to be expanded in the future.  If no queue limit is declared, then
  a default is used.

- Queue limits are now per-workstream, and raised from the previous
  IFQ_MAXLEN default of 50 to 256.

- All protocols are updated to use the new registration interface, and
  with the exception of netnatm, default queue limits.  Most protocols
  register as NETISR_POLICY_SOURCE, except IPv4 and IPv6, which use
  NETISR_POLICY_FLOW, and will therefore take advantage of driver-
  generated flow IDs if present.

- Formalize a non-packet based interface between interface polling and
  the netisr, rather than having polling pretend to be two protocols.
  Provide two explicit hooks in the netisr worker for start and end
  events for runs: netisr_poll() and netisr_pollmore(), as well as a
  function, netisr_sched_poll(), to allow the polling code to schedule
  netisr execution.  DEVICE_POLLING still embeds single-netisr
  assumptions in its implementation, so for now if it is compiled into
  the kernel, a single and un-bound netisr thread is enforced
  regardless of tunable configuration.

In the default configuration, the new netisr implementation maintains
the same basic assumptions as the previous implementation: a single,
un-bound worker thread processes all deferred work, and direct dispatch
is enabled by default wherever possible.

Performance measurement shows a marginal performance improvement over
the old implementation due to the use of batched dequeue.

An rmlock is used to synchronize use and registration/unregistration
using the framework; currently, synchronized use is disabled
(replicating current netisr policy) due to a measurable 3%-6% hit in
ping-pong micro-benchmarking.  It will be enabled once further rmlock
optimization has taken place.  However, in practice, netisrs are
rarely registered or unregistered at runtime.

A new man page for netisr will follow, but since one doesn't currently
exist, it hasn't been updated.

This change is not appropriate for MFC, although the polling shutdown
handler should be merged to 7-STABLE.

Bump __FreeBSD_version.

Reviewed by:	bz
2009-06-01 10:41:38 +00:00
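The three ordering policies introduced in d4b5cae49b can be illustrated with a small userland sketch of work placement. Everything here is an assumption made for illustration: `struct pkt`, `enum order_policy`, `select_workstream`, `example_m2cpuid`, and the use of a simple modulo to map a value onto a workstream are hypothetical and are not the kernel's actual data structures or algorithm. The only behavior taken from the commit message is that the flow policy falls back on source ordering when no flow ID is available, and that the CPU policy asks the protocol (m2cpuid) where to place each packet.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for the three NETISR_POLICY_* orderings. */
enum order_policy { ORDER_SOURCE, ORDER_FLOW, ORDER_CPU };

/* Hypothetical packet summary; the kernel works on mbufs instead. */
struct pkt {
	uintptr_t	source;		/* stands in for an interface or socket */
	bool		has_flowid;	/* whether a flow ID is available */
	uint32_t	flowid;
};

/* Protocol-supplied callback that picks a CPU for a packet. */
typedef unsigned int (*m2cpuid_fn)(const struct pkt *);

/* Example callback: always asks for CPU 1. */
static unsigned int
example_m2cpuid(const struct pkt *p)
{
	(void)p;
	return (1);
}

/* Pick one of nws workstreams for a packet under the given policy. */
static unsigned int
select_workstream(enum order_policy policy, const struct pkt *p,
    unsigned int nws, m2cpuid_fn m2cpuid)
{
	assert(nws > 0);
	switch (policy) {
	case ORDER_CPU:
		assert(m2cpuid != NULL);
		return (m2cpuid(p) % nws);
	case ORDER_FLOW:
		if (p->has_flowid)
			return (p->flowid % nws);
		/* No flow ID: fall back on source ordering. */
		return ((unsigned int)(p->source % nws));
	case ORDER_SOURCE:
	default:
		return ((unsigned int)(p->source % nws));
	}
}
```

Packets from the same source, or with the same flow ID, always map to the same workstream in this sketch, which is what keeps their relative order intact.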
Robert Watson
a765f96051 Garbage collect unused NETISR_{ATM,NETGRAPH,PPP} netisr constants. 2009-05-18 10:33:23 +00:00
Robert Watson
2f120c90a7 Garbage collect now-unused NETISR_FORCEQUEUE, which overrode the global
direct dispatch policy for specific protocols (NETISR_USB).  We leave
the additional 'flags' argument to netisr_register() for the time being,
even though it is no longer required.
2009-05-13 17:22:33 +00:00
Robert Watson
270b609935 Remove now-unused NETISR_USB. 2009-05-13 17:17:05 +00:00
Bruce M Simpson
1e98b429fe Reserve a netisr slot for the IGMPv3 output queue. 2009-03-04 02:54:11 +00:00
Robert Watson
59dd72d040 Remove NETISR_MPSAFE, which allows specific netisr handlers to be directly
dispatched without Giant, and add NETISR_FORCEQUEUE, which allows specific
netisr handlers to always be dispatched via a queue (deferred).  Mark the
usb and if_ppp netisr handlers as NETISR_FORCEQUEUE, and explicitly
acquire Giant in those handlers.

Previously, any netisr handler not marked NETISR_MPSAFE would necessarily
run deferred and with Giant acquired.  This change removes Giant
scaffolding from the netisr infrastructure, but NETISR_FORCEQUEUE allows
non-MPSAFE handlers to continue to force deferred dispatch so as to avoid
lock order reversals between their acquisition of Giant and any calling
context.

It is likely we will be able to remove NETISR_FORCEQUEUE once
IFF_NEEDSGIANT is removed, as non-MPSAFE usb and if_ppp drivers will no
longer be supported.

Reviewed by:	bz
MFC after:	1 month
X-MFC note:	We can't remove NETISR_MPSAFE from stable/7 for KPI reasons,
		but the rest can go back.
2008-07-04 00:21:38 +00:00
Robert Watson
315f04614c Update netisr comment for the SMPng world order: netisr is no longer
implemented using the ISR facility, and cannot be triggered by calling
splnet()/splx().

MFC after:	3 weeks
2007-12-31 20:58:50 +00:00
Warner Losh
c398230b64 /* -> /*- for license, minor formatting changes 2005-01-07 01:45:51 +00:00
Robert Watson
d989c7b389 Introduce a netisr to deliver kernel-generated routing, avoiding
recursive entering of the socket code from the routing code:

- Modify rt_dispatch() to bundle up the sockaddr family, if any,
  associated with a pending mbuf to dispatch to routing sockets, in
  an m_tag on the mbuf.

- Allocate NETISR_ROUTE for use by routing sockets.

- Introduce rtsintrq, an ifqueue to be used by the netisr, and
  introduce rts_input(), a function to unbundle the tagged sockaddr
  and inject the mbuf and address into raw_input(), which previously
  occurred in rt_dispatch().

- Introduce rts_init() to initialize rtsintrq, its mutex, and
  register the netisr.  Perform this at the same point in system
  initialization as setup of the domains.

This change introduces asynchrony between the generation of a
pending routing socket message and delivery to sockets for use
by userspace.  It avoids socket->routing->rtsock->socket use and
helps to avoid lock order reversals between the routing code and
socket code (in particular, raw socket control blocks), as route
locks are held over calls to rt_dispatch().

Reviewed by:		"George V.Neville-Neil" <gnn@neville-neil.com>
Conceptual head nod by:	sam
2004-06-09 02:48:23 +00:00
Warner Losh
f36cfd49ad Remove advertising clause from University of California Regent's
license, per letter dated July 22, 1999 and email from Peter Wemm,
Alan Cox and Robert Watson.

Approved by: core, peter, alc, rwatson
2004-04-07 20:46:16 +00:00
Sam Leffler
7902224c6b o add a flags parameter to netisr_register that is used to specify
  whether or not the isr needs to hold Giant when running; Giant-less
  operation is also controlled by the setting of debug_mpsafenet
o mark all netisr's except NETISR_IP as needing Giant
o add a GIANT_REQUIRED assertion to the top of netisr's that need Giant
o pick up Giant (when debug_mpsafenet is 1) inside ip_input before
  calling up with a packet
o change netisr handling so swi_net runs w/o Giant; instead we grab
  Giant before invoking handlers based on whether the handler needs Giant
o change netisr handling so that netisr's that are marked MPSAFE may
  have multiple instances active at a time
o add netisr statistics for packets dropped because the isr is inactive

Supported by:	FreeBSD Foundation
2003-11-08 22:28:40 +00:00
Peter Wemm
3c6b084e96 Finish driving a stake through the heart of netns and the associated
ifdefs scattered around the place - it's dead, Jim!

The SMB stuff had stolen AF_NS, make it official.
2003-03-05 19:24:24 +00:00
Jonathan Lemon
1cafed3941 Update netisr handling: each SWI now registers its queue, and all queue
drain routines are done by swi_net, which allows for better queue control
at some future point.  Packets may also be directly dispatched to a netisr
instead of queued; this may be of interest at some installations, but
currently defaults to off.

Reviewed by: hsu, silby, jayanth, sam
Sponsored by: DARPA, NAI Labs
2003-03-04 23:19:55 +00:00
Robert Watson
4a583fd480 Slight whitespace cleanup. Whitespace sync to MAC tree. 2002-07-27 19:53:02 +00:00
Alfred Perlstein
929ddbbb89 Remove __P. 2002-03-19 21:54:18 +00:00
Luigi Rizzo
e4fc250c15 Device Polling code for -current.
Non-SMP, i386-only, no polling in the idle loop at the moment.

To use this code you must compile a kernel with

        options DEVICE_POLLING

and at runtime enable polling with

        sysctl kern.polling.enable=1

The percentage of CPU reserved to userland can be set with

        sysctl kern.polling.user_frac=NN (default is 50)

while the remainder is used by polling device drivers and netisr's.
These are the only two variables that you should need to touch. There
are a few more parameters in kern.polling but the default values
are adequate for all purposes. See the code in kern_poll.c for
more details on them.

Polling in the idle loop will be implemented shortly by introducing
a kernel thread which does the job. Until then, the amount of CPU
dedicated to polling will never exceed (100-user_frac).
The equivalent (actually, better) code for -stable is at

	http://info.iet.unipi.it/~luigi/polling/

and also supports polling in the idle loop.

NOTE to Alpha developers:
There is really nothing in this code that is i386-specific.
If you move the 2 lines supporting the new option from
sys/conf/{files,options}.i386 to sys/conf/{files,options} I am
pretty sure that this should work on the Alpha as well, just that
I do not have a suitable test box to try it. If someone feels like
trying it, I would appreciate it.

NOTE to other developers:
Sure, some things could be done better, and as always I am open to
constructive criticism, which a few of you have already given and
I greatly appreciated.
However, before proposing radical architectural changes, please
take some time to possibly try out this code, or at the very least
read the comments in kern_poll.c, especially re. the reason why I
am using a soft netisr and cannot (I believe) replace it with a
simple timeout.

Quick description of files touched by this commit:

sys/conf/files.i386
        new file kern/kern_poll.c
sys/conf/options.i386
        new option
sys/i386/i386/trap.c
        poll in trap (disabled by default)
sys/kern/kern_clock.c
        initialization and hardclock hooks.
sys/kern/kern_intr.c
        minor swi_net changes
sys/kern/kern_poll.c
        the bulk of the code.
sys/net/if.h
        new flag
sys/net/if_var.h
        declaration for functions used in device drivers.
sys/net/netisr.h
        NETISR_POLL
sys/dev/fxp/if_fxp.c
sys/dev/fxp/if_fxpvar.h
sys/pci/if_dc.c
sys/pci/if_dcreg.h
sys/pci/if_sis.c
sys/pci/if_sisreg.h
        device driver modifications
2001-12-14 17:56:12 +00:00
Jake Burkholder
1eb44f0270 Remove the last of the MD netisr code. It is now all MI. Remove
spending, which was unused now that all software interrupts have
their own thread.  Make the legacy schednetisr use an atomic op
for setting bits in the netisr mask.

Reviewed by:	jhb
2000-12-05 00:36:00 +00:00
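The last point of 1eb44f0270, setting bits in the netisr mask with an atomic operation, can be sketched as follows. This is a userland approximation using C11 `<stdatomic.h>` rather than the kernel's own atomic primitives, and the names `pending_mask`, `sched_netisr_bit`, and `take_pending` are hypothetical. The idea is that several contexts can mark different netisrs as pending without a lock and without losing each other's bits.

```c
#include <assert.h>
#include <stdatomic.h>

/* Hypothetical stand-in for the pending-netisr bitmask. */
static atomic_uint pending_mask;

/* Mark netisr number num as pending; safe against concurrent callers. */
static void
sched_netisr_bit(unsigned int num)
{
	assert(num < 32);
	atomic_fetch_or(&pending_mask, 1u << num);
}

/* Atomically fetch and clear the whole mask, as a consumer would. */
static unsigned int
take_pending(void)
{
	return (atomic_exchange(&pending_mask, 0));
}
```

A plain `mask |= bit` is a read-modify-write sequence, so two concurrent setters could overwrite each other; the atomic OR avoids that.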
John Baldwin
8088699f79 - Overhaul the software interrupt code to use interrupt threads for each
type of software interrupt.  Roughly, what used to be a bit in spending
  now maps to a swi thread.  Each thread can have multiple handlers, just
  like a hardware interrupt thread.
- Instead of using a bitmask of pending interrupts, we schedule the specific
  software interrupt thread to run, so spending, NSWI, and the shandlers
  array are no longer needed.  We can now have an arbitrary number of
  software interrupt threads.  When you register a software interrupt
  thread via sinthand_add(), you get back a struct intrhand that you pass
  to sched_swi() when you wish to schedule your swi thread to run.
- Convert the name of 'struct intrec' to 'struct intrhand' as it is a bit
  more intuitive.  Also, prefix all the members of struct intrhand with
  'ih_'.
- Make swi_net() a MI function since there is now no point in it being
  MD.

Submitted by:	cp
2000-10-25 05:19:40 +00:00
Poul-Henning Kamp
6cb2a0952f Do some cleanups of the HARP atm codes interface into the system:
Define the NETISR just like all the other NETISRs.

unifdef -Usun -D__FreeBSD__: we will probably never support sun4c,
and if we do we can't use the Solaris code anyway, and I doubt
anybody will be running Fore ATM cards in them in the first place.
2000-10-12 00:03:50 +00:00
Peter Wemm
242c5536ea Clean up some loose ends in the network code, including the X.25 and ISO
#ifdefs.  Clean out unused netisr's and leftover netisr linker set gunk.
Tested on x86 and alpha, including world.

Approved by:	jkh
2000-02-13 03:32:07 +00:00
Bill Paul
a0067d7b89 Attempt to fix a problem with receiving packets on USB ethernet interfaces.
Packets are received inside USB bulk transfer callbacks, which run at
splusb() (actually splbio()). The packet input queues are meant to be
manipulated at splimp(). However the locking apparently breaks down under
certain circumstances and the input queues can get trampled.

There's a similar problem with if_ppp, which is driven by hardware/tty
interrupts from the serial driver, but which must also manipulate the
packet input queues at splimp(). The fix there is to use a netisr, and
that's the fix I used here. (I can hear you groaning back there. Hush up.)

The usb_ethersubr module maintains a single queue of its own. When a
packet is received in the USB callback routine, it's placed on this
queue with usb_ether_input(). This routine also schedules a soft net
interrupt with schednetisr(). The ISR routine then runs later, at
splnet, outside of the USB callback/interrupt context, and passes the
packet to ether_input(), hopefully in a safe manner.

The reason this is implemented as a separate module is that there are
a limited number of NETISRs that we can use, and snarfing one up for
each driver that needs it is wasteful (there will be three once I get
the CATC driver done). It also reduces code duplication to a certain
small extent. Unfortunately, it also needs to be linked in with the
usb.ko module in order for the USB ethernet drivers to share it.

Also removed some unneeded includes from if_aue.c and if_kue.c.

Fix suggested by: peter
Not rejected as a harebrained idea by: n_hibma
2000-01-10 23:12:54 +00:00
Peter Wemm
664a31e496 Change #ifdef KERNEL to #ifdef _KERNEL in the public headers. "KERNEL"
is an application space macro and the applications are supposed to be free
to use it as they please (but cannot).  This is consistent with the other
BSDs, which made this change quite some time ago.  More commits to come.
1999-12-29 04:46:21 +00:00
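The namespace problem addressed by 664a31e496 can be shown with a small sketch. The header fragment and the macro `EXAMPLE_KERNEL_DECLS_VISIBLE` are hypothetical and exist only for this illustration. The point is that a guard on the reserved name `_KERNEL` is unaffected by an application that defines `KERNEL` for its own purposes.

```c
#include <assert.h>

/* An application is free to use KERNEL for its own purposes. */
#define	KERNEL	42

/* Hypothetical public-header fragment guarded by the reserved name. */
#ifdef _KERNEL
#define	EXAMPLE_KERNEL_DECLS_VISIBLE	1
#else
#define	EXAMPLE_KERNEL_DECLS_VISIBLE	0
#endif

/* In an ordinary application build the kernel-only part stays hidden. */
static_assert(EXAMPLE_KERNEL_DECLS_VISIBLE == 0,
    "kernel-only declarations leaked into an application build");

static int
kernel_decls_visible(void)
{
	return (EXAMPLE_KERNEL_DECLS_VISIBLE);
}
```

Had the guard been `#ifdef KERNEL`, the application's own definition would have exposed the kernel-only declarations.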
Yoshinobu Inoue
76429de41a KAME related header files additions and merges.
(only those which don't affect c source files so much)

Reviewed by: cvs-committers
Obtained from: KAME project
1999-11-05 14:41:39 +00:00
Julian Elischer
4cf49a4355 Whistle's Netgraph link-layer (sometimes more) networking infrastructure.
Been in production for 3 years now. Gives Instant Frame relay to if_sr
and if_ar drivers, and PPPOE support soon. See:
ftp://ftp.whistle.com/pub/archie/netgraph/index.html
for on-line manual pages.

Reviewed by: Doug Rabson (dfr@freebsd.org)
Obtained from:  Whistle CVS tree
1999-10-21 09:06:11 +00:00
Peter Wemm
c3aac50f28 $Id$ -> $FreeBSD$ 1999-08-28 01:08:13 +00:00
Peter Wemm
2ef43b0971 Make NETISR_SET use a SYSINIT() rather than a linker set. 1999-04-26 08:52:16 +00:00
Bruce Evans
35b88f573a Fixed pedantic syntax errors caused by a trailing semicolon in a macro
definition.
1998-06-07 11:52:17 +00:00
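The class of error fixed in 35b88f573a can be illustrated with a generic example. The macro `DECLARE_COUNTER` and the function `bump_hits` are hypothetical and are not the macro that the commit changed. When a macro definition ends in a semicolon and its use is also followed by one, the expansion contains a stray `;` at file scope, which a compiler in pedantic mode may diagnose.

```c
#include <assert.h>
#include <limits.h>

/*
 * Problematic form, kept in a comment so this file stays pedantically
 * clean:
 *
 *	#define DECLARE_COUNTER(name)	static int name = 0;
 *	DECLARE_COUNTER(hits);
 *
 * The use expands to "static int hits = 0;;", leaving an extra ';'
 * outside of any function.
 */

/* Fixed form: no trailing semicolon; the caller supplies it. */
#define	DECLARE_COUNTER(name)	static int name = 0

DECLARE_COUNTER(hits);

/* Increment the counter and return its new value. */
static int
bump_hits(void)
{
	assert(hits < INT_MAX);
	return (++hits);
}
```

Leaving the semicolon to the caller also keeps macro invocations looking like ordinary declarations or statements.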
Bruce Evans
514ede0953 Fixed gratuitous ANSIisms. 1997-09-16 11:44:05 +00:00
Kenjiro Cho
68713f97a1 merge ATM driver 1997-05-09 12:19:06 +00:00
Peter Wemm
6875d25465 Back out part 1 of the MCFH that changed $Id$ to $FreeBSD$. We are not
ready for it yet.
1997-02-22 09:48:43 +00:00
Jordan K. Hubbard
1130b656e5 Make the long-awaited change from $Id$ to $FreeBSD$
This will make a number of things easier in the future, as well as (finally!)
avoiding the Id-smashing problem which has plagued developers for so long.

Boy, I'm glad we're not using sup anymore.  This update would have been
insane otherwise.
1997-01-14 07:20:47 +00:00
Julian Elischer
655929bfba Obtained from: netatalk distribution netatalk@itd.umich.edu
Kernel AppleTalk protocol support.
Both CAP and netatalk can make use of this.
It still needs some work, but it seemed the right time to commit it
so others can experiment.
1996-05-24 01:35:45 +00:00
Peter Wemm
06cc185852 Add a simplistic netisr register routine - I need this now for ppp-2.2. 1995-10-31 19:07:53 +00:00
Julian Elischer
cc6a66f20e Reviewed by: julian and jhay@mikom.csir.co.za
Submitted by:	Mike Mitchell, supervisor@alb.asctmd.com

This is a bulk import of Mike's IPX/SPX protocol stacks and all the
related gunf that goes with it.
It is not guaranteed to work 100% correctly at this time,
but as we had several people trying to work on it,
I figured it would be better to get it checked in so
they could all get the same thing to work on.

Mike's been using it for a year or so,
but on 2.0.

More changes and stuff will be merged in from other developers now that this is in.

Mike Mitchell, Network Engineer
AMTECH Systems Corporation, Technology and Manufacturing
8600 Jefferson Street, Albuquerque, New Mexico 87113 (505) 856-8000
supervisor@alb.asctmd.com
1995-10-26 20:31:59 +00:00
Garrett Wollman
748e0b0acc Make networking domains drop-ins, through the magic of GNU ld. (Some day,
there may even be LKMs.)  Also, change the internal name of `unixdomain'
to `localdomain' since AF_LOCAL is now the preferred name of this family.
Declare netisr correctly and in the right place.
1995-05-11 00:13:26 +00:00
Stefan Eßer
623976474c Submitted by: Wolfgang Stanglmeier <wolf@dentaro.GUN.de>
Reviewed by: <wollman>
First hooks and defines for the ISDN driver,
that soon will see the light ...
1995-01-05 19:51:51 +00:00
Paul Richards
cea1da3be2 Make idempotent.
Submitted by:	Paul
1994-08-21 05:11:48 +00:00
David Greenman
3c4dd3568f Added $Id$ 1994-08-02 07:55:43 +00:00
Rodney W. Grimes
26f9a76710 The big 4.4BSD Lite to FreeBSD 2.0.0 (Development) patch.
Reviewed by:	Rodney W. Grimes
Submitted by:	John Dyson and David Greenman
1994-05-25 09:21:21 +00:00
Rodney W. Grimes
df8bae1de4 BSD 4.4 Lite Kernel Sources 1994-05-24 10:09:53 +00:00