Commit Graph

76 Commits

Author SHA1 Message Date
Robert Watson
530c006014 Merge the remainder of kern_vimage.c and vimage.h into vnet.c and
vnet.h; we now use jails (rather than vimages) as the abstraction
for virtualization management, and what remained was specific to
virtual network stacks.  Minor cleanups are done in the process,
and comments updated to reflect these changes.

Reviewed by:	bz
Approved by:	re (vimage blanket)
2009-08-01 19:26:27 +00:00
Robert Watson
d0728d7174 Introduce and use a sysinit-based initialization scheme for virtual
network stacks, VNET_SYSINIT:

- Add VNET_SYSINIT and VNET_SYSUNINIT macros to declare events that will
  occur each time a network stack is instantiated and destroyed.  In the
  !VIMAGE case, these are simply mapped into regular SYSINIT/SYSUNINIT.
  For the VIMAGE case, we instead use SYSINIT's to track their order and
  properties on registration, using them for each vnet when created/
  destroyed, or immediately on module load for already-started vnets.
- Remove vnet_modinfo mechanism that existed to serve this purpose
  previously, as well as its dependency scheme: we now just use the
  SYSINIT ordering scheme.
- Implement VNET_DOMAIN_SET() to allow protocol domains to declare that
  they want init functions to be called for each virtual network stack
  rather than just once at boot, compiling down to DOMAIN_SET() in the
  non-VIMAGE case.
- Walk all virtualized kernel subsystems and make use of these instead
  of modinfo or DOMAIN_SET() for init/uninit events.  In some cases,
  convert modular components from using modevent to using sysinit (where
  appropriate).  In some cases, do minor rejuggling of SYSINIT ordering
  to make room for or better manage events.

Portions submitted by:	jhb (VNET_SYSINIT), bz (cleanup)
Discussed with:		jhb, bz, julian, zec
Reviewed by:		bz
Approved by:		re (VIMAGE blanket)
2009-07-23 20:46:49 +00:00
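The registration behaviour described in d0728d7174 above can be sketched in user space. This is a minimal illustration, not the kernel implementation: every name here (`vnet_sketch`, `vnet_sysinit_register`, `vnet_start`, the order values 10 and 20) is hypothetical, and teardown (the SYSUNINIT side) is omitted. It assumes handlers are kept sorted by an integer order, run for each stack as it starts, and run immediately on already-started stacks when registered late, as a module load would.

```c
#include <assert.h>
#include <stddef.h>

#define SKETCH_MAX 8

/* Hypothetical stand-in for a virtual network stack instance. */
struct vnet_sketch {
	int	trace[SKETCH_MAX];	/* order values of handlers run so far */
	size_t	ntrace;
};

typedef void (*vnet_init_fn)(struct vnet_sketch *);

struct vnet_sysinit_sketch {
	int		order;		/* lower values run first */
	vnet_init_fn	fn;
};

static struct vnet_sysinit_sketch handlers[SKETCH_MAX];
static size_t nhandlers;
static struct vnet_sketch *started[SKETCH_MAX];
static size_t nstarted;

static void
record(struct vnet_sketch *vnet, int order)
{
	if (vnet->ntrace < SKETCH_MAX)
		vnet->trace[vnet->ntrace++] = order;
}

/* Two example per-stack init handlers; they only record that they ran. */
static void init_ifnet(struct vnet_sketch *vnet) { record(vnet, 10); }
static void init_inet(struct vnet_sketch *vnet) { record(vnet, 20); }

/*
 * Register a handler: keep the table sorted by order, then run the
 * handler at once on every stack that has already been started.
 */
static int
vnet_sysinit_register(int order, vnet_init_fn fn)
{
	size_t i, v;

	if (nhandlers == SKETCH_MAX)
		return (-1);
	for (i = nhandlers; i > 0 && handlers[i - 1].order > order; i--)
		handlers[i] = handlers[i - 1];
	handlers[i].order = order;
	handlers[i].fn = fn;
	nhandlers++;
	for (v = 0; v < nstarted; v++)
		fn(started[v]);
	return (0);
}

/* Start a stack: run every registered handler in order. */
static int
vnet_start(struct vnet_sketch *vnet)
{
	size_t i;

	if (nstarted == SKETCH_MAX)
		return (-1);
	for (i = 0; i < nhandlers; i++)
		handlers[i].fn(vnet);
	started[nstarted++] = vnet;
	return (0);
}
```

In this sketch a stack started before a late registration sees the late handler out of order, while any stack started afterwards runs all handlers in sorted order.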
Robert Watson
eddfbb763d Build on Jeff Roberson's linker-set based dynamic per-CPU allocator
(DPCPU), as suggested by Peter Wemm, and implement a new per-virtual
network stack memory allocator.  Modify vnet to use the allocator
instead of monolithic global container structures (vinet, ...).  This
change solves many binary compatibility problems associated with
VIMAGE, and restores ELF symbols for virtualized global variables.

Each virtualized global variable exists as a "reference copy", and also
once per virtual network stack.  Virtualized global variables are
tagged at compile-time, placing them in a special linker set, which is
loaded into a contiguous region of kernel memory.  Virtualized global
variables in the base kernel are linked as normal, but those in modules
are copied and relocated to a reserved portion of the kernel's vnet
region with the help of the kernel linker.

Virtualized global variables exist in per-vnet memory set up when the
network stack instance is created, and are initialized statically from
the reference copy.  Run-time access occurs via an accessor macro, which
converts from the current vnet and requested symbol to a per-vnet
address.  When "options VIMAGE" is not compiled into the kernel, normal
global ELF symbols will be used instead and indirection is avoided.

This change restores static initialization for network stack global
variables, restores support for non-global symbols and types, eliminates
the need for many subsystem constructors, eliminates large per-subsystem
structures that caused many binary compatibility issues both for
monitoring applications (netstat) and kernel modules, removes the
per-function INIT_VNET_*() macros throughout the stack, eliminates the
need for vnet_symmap ksym(2) munging, and eliminates duplicate
definitions of virtualized globals under VIMAGE_GLOBALS.

Bump __FreeBSD_version and update UPDATING.

Portions submitted by:  bz
Reviewed by:            bz, zec
Discussed with:         gnn, jamie, jeff, jhb, julian, sam
Suggested by:           peter
Approved by:            re (kensmith)
2009-07-14 22:48:30 +00:00
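As a rough illustration of the accessor scheme described in eddfbb763d above, the following user-space sketch keeps a "reference copy" of the virtualized globals in one contiguous region, copies it into each stack instance, and resolves a symbol by applying its offset within the reference region to the per-stack base. The names (`vnet_ref`, `vnet_sketch_ptr`, `VNET_SKETCH_INT`) and the two example variables are assumptions made for illustration; a plain struct stands in for the linker set, and none of this is the kernel's actual macro set.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Stand-in for the linker set: one contiguous region of tagged globals. */
struct vnet_ref_region {
	int	ip_defttl;
	int	tcp_mssdflt;
};

/* The reference copy, statically initialized like ordinary globals. */
static struct vnet_ref_region vnet_ref = {
	.ip_defttl = 64,
	.tcp_mssdflt = 536,
};

struct vnet_sketch {
	void	*data;		/* this stack's private copy of the region */
};

/* Create a stack instance initialized from the reference copy. */
static struct vnet_sketch *
vnet_sketch_alloc(void)
{
	struct vnet_sketch *vnet;

	vnet = malloc(sizeof(*vnet));
	if (vnet == NULL)
		return (NULL);
	vnet->data = malloc(sizeof(vnet_ref));
	if (vnet->data == NULL) {
		free(vnet);
		return (NULL);
	}
	memcpy(vnet->data, &vnet_ref, sizeof(vnet_ref));
	return (vnet);
}

static void
vnet_sketch_free(struct vnet_sketch *vnet)
{
	free(vnet->data);
	free(vnet);
}

/* Translate an address in the reference copy to this stack's copy. */
static void *
vnet_sketch_ptr(const struct vnet_sketch *vnet, const void *ref_addr)
{
	ptrdiff_t off;

	off = (const char *)ref_addr - (const char *)&vnet_ref;
	assert(off >= 0 && (size_t)off < sizeof(vnet_ref));
	return ((char *)vnet->data + off);
}

/* Accessor for int-typed variables: (current stack, symbol) to lvalue. */
#define	VNET_SKETCH_INT(vnet, name)					\
	(*(int *)vnet_sketch_ptr((vnet), &vnet_ref.name))
```

Writing through the accessor changes only one stack's copy; the reference copy and other stacks keep their statically initialized values.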
Robert Watson
eb956cd041 Use if_maddr_rlock()/if_maddr_runlock() rather than IF_ADDR_LOCK()/
IF_ADDR_UNLOCK() across network device drivers when accessing the
per-interface multicast address list, if_multiaddrs.  This will
allow us to change the locking strategy without affecting our driver
programming interface or binary interface.

For two wireless drivers, remove unnecessary locking, since they
don't actually access the multicast address list.

Approved by:	re (kib)
MFC after:	6 weeks
2009-06-26 11:45:06 +00:00
Bjoern A. Zeeb
5736e6fb9d After cleaning up rt_tables from vnet.h and cleaning up opt_route.h
a lot of files no longer need route.h either. Garbage collect them.
While here remove now unneeded vnet.h #includes as well.
2009-06-23 17:03:45 +00:00
Alexander Motin
c1b8a9edab Mark ng_ether node hooks as HI_STACK. It is usually the last point where
netgraph may unroll the call stack, and I have found that in some cases the 2K
guaranteed there for i386 may not be enough for the NIC driver and BPF.
2009-06-23 12:30:21 +00:00
Marko Zec
f089869fa5 Introduce a mechanism for detecting calls from outbound path of the
network stack when reentering the inbound path from netgraph, and
force queueing of mbufs at the outbound netgraph node.

The mechanism relies on two components.  First, in netgraph nodes
where outbound path of the network stack calls into netgraph, the
current thread has to be appropriately marked using the new
NG_OUTBOUND_THREAD_REF() macro before proceeding to call further
into the netgraph topology, and unmarked using the
NG_OUTBOUND_THREAD_UNREF() macro before returning to the caller.
Second, netgraph nodes which can potentially reenter the network
stack in the inbound path have to mark their inbound hooks using
NG_HOOK_SET_TO_INBOUND() macro.  The netgraph framework will then
detect when there is a danger of a call graph looping back from
outbound to inbound path via netgraph, and defer handing off the
mbufs to the "inbound" node to a worker thread with a clean stack.

In this first pass only the most obvious netgraph nodes have been
updated to ensure no outbound to inbound calls can occur.  Nodes
such as ng_ipfw, ng_gif etc. should be examined further to determine
whether a potential for outbound to inbound call looping exists.

This commit changes the layout of struct thread, but due to
__FreeBSD_version number shortage a version bump has been omitted
at this time, nevertheless kernel and modules have to be rebuilt.

Reviewed by:	julian, rwatson, bz
Approved by:	julian (mentor)
2009-06-11 16:50:49 +00:00
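The two-part check described in f089869fa5 above can be sketched as follows. This is a simplified model under stated assumptions: the thread mark is modelled as a counter and the hook mark as a boolean, and the names `thread_sketch`, `hook_sketch` and `must_queue` are hypothetical rather than the definitions behind NG_OUTBOUND_THREAD_REF() or NG_HOOK_SET_TO_INBOUND().

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical per-thread state: how many outbound callers are active. */
struct thread_sketch {
	int	outbound_refs;
};

/* Hypothetical per-hook state: does this hook lead into the inbound path? */
struct hook_sketch {
	bool	to_inbound;
};

/* Mark the thread before calling from the outbound path into the graph. */
static void
outbound_thread_ref(struct thread_sketch *td)
{
	td->outbound_refs++;
}

/* Unmark the thread before returning to the outbound caller. */
static void
outbound_thread_unref(struct thread_sketch *td)
{
	assert(td->outbound_refs > 0);
	td->outbound_refs--;
}

/*
 * Delivery must be deferred to a worker thread with a clean stack only
 * when an outbound-marked thread is about to enter an inbound hook.
 */
static bool
must_queue(const struct thread_sketch *td, const struct hook_sketch *hook)
{
	return (td->outbound_refs > 0 && hook->to_inbound);
}
```

A counter rather than a flag is used here so that nested outbound calls unmark correctly; whether the kernel does the same is not stated in the commit message.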
Bjoern A. Zeeb
8d8bc0182e After r193232 rt_tables in vnet.h are no longer indirectly dependent on
the ROUTETABLES kernel option thus there is no need to include opt_route.h
anymore in all consumers of vnet.h and no longer depend on it for module
builds.

Remove the hidden include in flowtable.h as well and leave the two
explicit #includes in ip_input.c and ip_output.c.
2009-06-08 19:57:35 +00:00
Marko Zec
aef8f3445b In preparation to make options VIMAGE operational, where needed,
initialize / release netgraph related state in iattach() / idetach()
functions called via the vnet module registration / initialization
framework, instead of initialization / cleanups being done in
mod_event handlers.

While here, introduce a crude hack aimed at preventing ng_ether from
autoattaching to ng_eiface ifnets, which are also netgraph nodes already.

Reviewed by:	bz
Approved by:	julian (mentor)
2009-04-26 07:14:50 +00:00
Bjoern A. Zeeb
33553d6e99 For all files including net/vnet.h directly include opt_route.h and
net/route.h.

Remove the hidden include of opt_route.h and net/route.h from net/vnet.h.

We need to make sure that both opt_route.h and net/route.h are included
before net/vnet.h because of the way MRT figures out the number of FIBs
from the kernel option. If we do not, we end up with the default number
of 1 when including net/vnet.h and array sizes are wrong.

This does not change the list of files which depend on opt_route.h
but we can identify them now more easily.
2009-02-27 14:12:05 +00:00
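The include-order problem described in 33553d6e99 above can be reproduced with the preprocessor alone. The sketch below is an assumption-laden illustration: `SKETCH_ROUTETABLES` stands in for the kernel option, the two `SKETCH_NUMFIBS_*` macros stand in for the FIB count seen by a header, and the struct names are hypothetical. It shows the same conditional evaluated before and after the option becomes visible.

```c
#include <assert.h>
#include <stddef.h>

/* Wrong order: the header is processed before the option is defined. */
#ifdef SKETCH_ROUTETABLES
#define	SKETCH_NUMFIBS_EARLY	SKETCH_ROUTETABLES
#else
#define	SKETCH_NUMFIBS_EARLY	1	/* silently falls back to the default */
#endif

struct vnet_net_early_sketch {
	void	*rt_tables[SKETCH_NUMFIBS_EARLY];
};

/* Stand-in for the options header finally being included. */
#define	SKETCH_ROUTETABLES	4

/* Right order: the same conditional now sees the option. */
#ifdef SKETCH_ROUTETABLES
#define	SKETCH_NUMFIBS_LATE	SKETCH_ROUTETABLES
#else
#define	SKETCH_NUMFIBS_LATE	1
#endif

struct vnet_net_late_sketch {
	void	*rt_tables[SKETCH_NUMFIBS_LATE];
};

/* Number of table slots each translation unit would believe in. */
static size_t
numfibs_early(void)
{
	return (sizeof(((struct vnet_net_early_sketch *)0)->rt_tables) /
	    sizeof(void *));
}

static size_t
numfibs_late(void)
{
	return (sizeof(((struct vnet_net_late_sketch *)0)->rt_tables) /
	    sizeof(void *));
}
```

Two files that disagree in this way would compile cleanly yet disagree about the array size, which is why the commit makes the dependency explicit in every consumer.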
Julian Elischer
f664dcde7c shave about 7% off the overhead of ng_ether by using per-hook
receive data methods.
2008-12-25 09:02:55 +00:00
Bjoern A. Zeeb
4b79449e2f Rather than using hidden includes (with circular dependencies),
directly include only the header files needed. This reduces the
unneeded spamming of various headers into lots of files.

For now, this leaves us with very few modules including vnet.h
and thus needing to depend on opt_route.h.

Reviewed by:	brooks, gnn, des, zec, imp
Sponsored by:	The FreeBSD Foundation
2008-12-02 21:37:28 +00:00
Dag-Erling Smørgrav
1ede983cc9 Retire the MALLOC and FREE macros. They are an abomination unto style(9).
MFC after:	3 months
2008-10-23 15:53:51 +00:00
Bjoern A. Zeeb
603724d3ab Commit step 1 of the vimage project, (network stack)
virtualization work done by Marko Zec (zec@).

This is the first in a series of commits over the course
of the next few weeks.

Mark all uses of global variables to be virtualized
with a V_ prefix.
Use macros to map them back to their global names for
now, so this is a NOP change only.

We hope to have caught at least 85-90% of what is needed
so we do not invalidate a lot of outstanding patches again.

Obtained from:	//depot/projects/vimage-commit2/...
Reviewed by:	brooks, des, ed, mav, julian,
		jamie, kris, rwatson, zec, ...
		(various people I forgot, different versions)
		md5 (with a bit of help)
Sponsored by:	NLnet Foundation, The FreeBSD Foundation
X-MFC after:	never
V_Commit_Message_Reviewed_By:	more people than the patch
2008-08-17 23:27:27 +00:00
Bruce M Simpson
ec002fee99 Implement reference counting for ifmultiaddr, in_multi, and in6_multi
structures. Detect when ifnet instances are detached from the network
stack and perform appropriate cleanup to prevent memory leaks.

This has been implemented in such a way as to be backwards ABI compatible.
Kernel consumers are changed to use if_delmulti_ifma(); in_delmulti()
is unable to detect interface removal by design, as it performs searches
on structures which are removed with the interface.

With this architectural change, the panics FreeBSD users have experienced
with carp and pfsync should be resolved.

Obtained from:	p4 branch bms_netdev
Reviewed by:	andre
Sponsored by:	Garance A Drosehn
Idea from:	NetBSD
MFC after:	1 month
2007-03-20 00:36:10 +00:00
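The reference-counting idea in ec002fee99 above can be sketched in user space. This is a minimal model, assuming that detaching the interface clears the record's back-pointer while outstanding references keep the record itself alive; the structure and function names (`ifma_sketch`, `ifma_ref`, `ifma_detach`, `ifma_release`) are hypothetical and are not the kernel's if_delmulti_ifma() implementation.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

struct ifnet_sketch {
	const char	*name;
};

/* Hypothetical multicast membership record. */
struct ifma_sketch {
	int			 refcount;
	struct ifnet_sketch	*ifp;	/* NULL once the interface is gone */
};

/* Allocate a record holding one reference for the caller. */
static struct ifma_sketch *
ifma_alloc(struct ifnet_sketch *ifp)
{
	struct ifma_sketch *ifma;

	ifma = malloc(sizeof(*ifma));
	if (ifma == NULL)
		return (NULL);
	ifma->refcount = 1;
	ifma->ifp = ifp;
	return (ifma);
}

static void
ifma_ref(struct ifma_sketch *ifma)
{
	ifma->refcount++;
}

/* Interface detach: holders can now see that the interface is gone. */
static void
ifma_detach(struct ifma_sketch *ifma)
{
	ifma->ifp = NULL;
}

/* Drop one reference; returns true if the record was freed. */
static bool
ifma_release(struct ifma_sketch *ifma)
{
	assert(ifma->refcount > 0);
	if (--ifma->refcount > 0)
		return (false);
	free(ifma);
	return (true);
}
```

Because the record outlives the interface, a consumer holding a reference can release it by pointer after detach instead of searching structures that were removed with the interface.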
Gleb Smirnoff
447a8026ec Turn off by default "feature" that overwrites MAC address
on output frames.

Many people were confused when CARP, ng_bridge(4) and other
subsystems did not work because ng_ether(4) overwrote the source
MAC address.
2006-08-04 13:36:27 +00:00
Andrew Thompson
9674cf0e27 Remove the dependency of bridgestp.h on if_bridgevar.h by moving a couple of
private structures to if_bridge.c.
2006-07-27 21:01:48 +00:00
Tai-hwa Liang
87909ba75c Fixing compilation bustage: net/if_bridgevar.h depends on net/bridgestp.h.
2006-07-27 06:15:37 +00:00
Gleb Smirnoff
04b3fa1623 Do not force queueing on peer hooks. This was important only for
5.0-CURRENT. And it looks like this didn't work before Julian's
revamp of netgraph queue code.

Reviewed by:	julian
2006-01-12 21:09:12 +00:00
Ruslan Ermilov
4a0d6638b3 - Store pointer to the link-level address right in "struct ifnet"
rather than in ifindex_table[]; all (except one) accesses are
  through ifp anyway.  IF_LLADDR() works faster, and all (except
  one) ifaddr_byindex() users were converted to use ifp->if_addr.

- Stop storing a (pointer to) Ethernet address in "struct arpcom",
  and drop the IFP2ENADDR() macro; all users have been converted
  to use IF_LLADDR() instead.
2005-11-11 16:04:59 +00:00
Ruslan Ermilov
5240dcdb07 Make IFP2NG() usable as an lvalue.
2005-11-09 13:34:43 +00:00
Ruslan Ermilov
c60c00bce1 Restore part of the code dropped in rev. 1.25 that makes sure ether_demux()
will receive an mbuf chain with Ethernet header in the data portion of the
first mbuf.
2005-11-03 07:46:04 +00:00
Andrew Thompson
fd6238a659 Further clean up the bridge hooks in if_ethersubr.c and ng_ether.c
- move the function pointer definitions to if_bridgevar.h
- move most of the logic to the new BRIDGE_INPUT and BRIDGE_OUTPUT macros
- remove unneeded functions from if_bridgevar.h and sort a little.
2005-10-14 02:38:47 +00:00
Andrew Thompson
9cff52f7f6 Clean up the if_bridge hooks a bit in if_ethersubr.c and ng_ether.c, move
the broadcast/multicast test to bridge_input().

Requested by:	glebius
2005-10-13 09:43:30 +00:00
Gleb Smirnoff
6512768b89 A deja vu of:
http://lists.freebsd.org/pipermail/cvs-src/2004-October/033496.html

The same problem applies to if_bridge(4), too.

- Copy-and-paste the if_bridge(4) related block from
  if_ethersubr.c to ng_ether.c
- Add XXXs, so that copy-and-paste would be noticed by
  any future editors of this code.
- Also add XXXs near if_bridge(4) declarations.

Silence from:	thompsa
2005-10-07 14:14:47 +00:00
Max Laier
b6de9e91bd Remove bridge(4) from the tree. if_bridge(4) is a fully functional
replacement and has additional features which make it superior.

Discussed on:	-arch
Reviewed by:	thompsa
X-MFC-after:	never (RELENG_6 as transition period)
2005-09-27 18:10:43 +00:00
Robert Watson
13f4c340ae Propagate rename of IFF_OACTIVE and IFF_RUNNING to IFF_DRV_OACTIVE and
IFF_DRV_RUNNING, as well as the move from ifnet.if_flags to
ifnet.if_drv_flags.  Device drivers are now responsible for
synchronizing access to these flags, as they are in if_drv_flags.  This
helps prevent races between the network stack and device driver in
maintaining the interface flags field.

Many __FreeBSD__ and __FreeBSD_version checks maintained and continued;
some less so.

Reviewed by:	pjd, bz
MFC after:	7 days
2005-08-09 10:20:02 +00:00
Brooks Davis
fc74a9f93a Stop embedding struct ifnet at the top of driver softcs. Instead the
struct ifnet or the layer 2 common structure it was embedded in have
been replaced with a struct ifnet pointer to be filled by a call to the
new function, if_alloc(). The layer 2 common structure is also allocated
via if_alloc() based on the interface type. It is hung off the new
struct ifnet member, if_l2com.

This change removes the size of these structures from the kernel ABI and
will allow us to better manage them as interfaces come and go.

Other changes of note:
 - Struct arpcom is no longer referenced in normal interface code.
   Instead the Ethernet address is accessed via the IFP2ENADDR() macro.
   To enforce this ac_enaddr has been renamed to _ac_enaddr.
 - The second argument to ether_ifattach is now always the mac address
   from driver private storage rather than sometimes being ac_enaddr.

Reviewed by:	sobomax, sam
2005-06-10 16:49:24 +00:00
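The layout change described in fc74a9f93a above can be sketched as follows. This is a user-space illustration with hypothetical names (`sketch_if_alloc`, `sketch_if_free`, `sketch_attach` and the `*_sketch` types); it assumes only what the message states: the driver softc holds a pointer to a separately allocated interface structure, and the layer 2 common structure is allocated by interface type and hung off `if_l2com`.

```c
#include <assert.h>
#include <stdlib.h>

enum { IFT_SKETCH_ETHER = 1, IFT_SKETCH_OTHER = 2 };

/* Stand-in for the layer 2 common structure of Ethernet interfaces. */
struct arpcom_sketch {
	unsigned char	enaddr[6];
};

struct ifnet_sketch {
	int	 if_type;
	void	*if_l2com;	/* layer 2 common structure, or NULL */
	void	*if_softc;	/* back-pointer to the driver's state */
};

/* The softc now holds a pointer instead of embedding the structure. */
struct softc_sketch {
	struct ifnet_sketch	*ifp;
	unsigned char		 enaddr[6];	/* driver-private MAC address */
};

/* Allocate the interface and, depending on type, its layer 2 structure. */
static struct ifnet_sketch *
sketch_if_alloc(int type)
{
	struct ifnet_sketch *ifp;

	ifp = calloc(1, sizeof(*ifp));
	if (ifp == NULL)
		return (NULL);
	ifp->if_type = type;
	if (type == IFT_SKETCH_ETHER) {
		ifp->if_l2com = calloc(1, sizeof(struct arpcom_sketch));
		if (ifp->if_l2com == NULL) {
			free(ifp);
			return (NULL);
		}
	}
	return (ifp);
}

static void
sketch_if_free(struct ifnet_sketch *ifp)
{
	free(ifp->if_l2com);
	free(ifp);
}

/* Driver attach path: fill in the pointer rather than use an embedded copy. */
static int
sketch_attach(struct softc_sketch *sc)
{
	sc->ifp = sketch_if_alloc(IFT_SKETCH_ETHER);
	if (sc->ifp == NULL)
		return (-1);
	sc->ifp->if_softc = sc;
	return (0);
}
```

Since the driver only ever sees a pointer, the size of the interface structure no longer leaks into the driver's own layout, which is the ABI point the commit message makes.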
Gleb Smirnoff
cefddd662e Add new netgraph control message NGM_ETHER_DETACH, which actually
removes netgraph node and unwraps Ethernet interface.

This gives us ability to unload ng_ether.ko, when all interfaces
are detached, making ng_ether(4) developers happy.

Reviewed by:	ru
2005-02-14 12:01:09 +00:00
Gleb Smirnoff
ba20540e3d - bzero sockaddr_dl
- use constant instead of number

Suggested by:	ru
2005-02-12 19:19:29 +00:00
Gleb Smirnoff
810d5e89f4 Add two new netgraph messages NGM_ETHER_ADD_MULTI and NGM_ETHER_DEL_MULTI,
to join and leave Ethernet multicast membership, respectively. Messages
take MAC address as argument.

Sponsored by:	Rinet ISP
2005-02-12 11:41:32 +00:00
Ruslan Ermilov
953a8b2e6d Removed redundant MODULE_VERSION().
2005-02-12 11:14:25 +00:00
Ruslan Ermilov
bbb75d7844 Fixed an off-by-one error when dealing with interface name
(if_xname is IFNAMSIZ-sized and NUL-terminated).
2005-02-03 12:50:10 +00:00
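The message for bbb75d7844 above does not show the code that was fixed, so the following is only a sketch of the constraint it names: a buffer of IFNAMSIZ bytes that must include the terminating NUL can hold at most IFNAMSIZ - 1 name characters. The constant `SKETCH_IFNAMSIZ` (set to 16 here as an assumption) and the helper `ifname_copy` are hypothetical.

```c
#include <assert.h>
#include <string.h>

#define	SKETCH_IFNAMSIZ	16	/* assumed buffer size, NUL included */

/*
 * Copy an interface name into a fixed-size buffer.  A name of exactly
 * SKETCH_IFNAMSIZ characters must be rejected: it would leave no room
 * for the NUL terminator.  Returns 0 on success, -1 on a bad name.
 */
static int
ifname_copy(char dst[SKETCH_IFNAMSIZ], const char *src)
{
	size_t len;

	len = strlen(src);
	if (len == 0 || len >= SKETCH_IFNAMSIZ)
		return (-1);
	memcpy(dst, src, len + 1);	/* copies the NUL as well */
	return (0);
}
```

Comparing with `>` instead of `>=` in the length check would be one way to produce the off-by-one the commit refers to.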
Gleb Smirnoff
1c7899c74e This change adds reliability for Ethernet trunks built with ng_one2many:
- Introduce another ng_ether(4) callback ng_ether_link_state_p, which
  is called from if_link_state_change(), every time link is changed.
- In ng_ether_link_state() send netgraph control message notifying
  of link state change to a node connected to "lower" hook.

Reviewed by:	sam
MFC after:	2 weeks
2005-01-08 12:42:03 +00:00
Warner Losh
c398230b64 /* -> /*- for license, minor formatting changes
2005-01-07 01:45:51 +00:00
Gleb Smirnoff
3cbbee76f8 Remove a check that never returns true, because in this case we panic
before the check.
2004-12-30 12:22:51 +00:00
Gleb Smirnoff
a176c2aeaf Fix packet flow when both ng_ether(4) and bridge(4) are in use:
- push all bridge logic from if_ethersubr.c into bridge.c
  make bridge_in() return mbuf pointer (or NULL).
- call only bridge_in() from ether_input(), after ng_ether_input()
  was optionally called.
- call bridge_in() from ng_ether_rcv_upper().

Long description:	http://lists.freebsd.org/mailman/htdig/freebsd-net/2004-May/003881.html
Reported by:		Jian-Wei Wang <jwwang at FreeBSD.csie.NCTU.edu.tw>
Tested by:		myself, Sergey Lyubka
Reviewed by:		sam
Approved by:		julian (mentor)
MFC after:		2 months
2004-10-12 10:33:42 +00:00
Dag-Erling Smørgrav
7ea5573cd8 Return 0, not NULL, from a function declared as returning int.
2004-10-09 14:20:18 +00:00
Alexander Kabaev
445e045b0d Avoid casts as lvalues.
2004-07-28 06:59:55 +00:00
Julian Elischer
be4252b367 Slight cosmetic changes.
Also introduce a macro to be called by persistent nodes to signal their
persistence during shutdown to hide this mechanism from the node author.

Make node flags have a consistent style in naming.

Document the change.
2004-07-20 17:15:38 +00:00
Julian Elischer
3ca24c284d Having moved metadata usage to mbuf tags, remove code that supports
the old way of doing it.

Submitted by:	Gleb Smirnoff <glebius@cell.sick.ru>
2004-06-26 22:24:16 +00:00
Julian Elischer
f8aae7776f Switch to using C99 sparse initialisers for the type methods array.
Should make no binary difference.

Submitted by:	Gleb Smirnoff <glebius@cell.sick.ru>
Reviewed by:	Harti Brandt <harti@freebsd.org>
MFC after:	1 week
2004-05-29 00:51:19 +00:00
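For readers unfamiliar with the C99 feature named in f8aae7776f above, here is a small sketch of designated ("sparse") initialisers applied to a method table. The structure `ng_type_sketch` and its members are hypothetical and much smaller than a real netgraph type; the point is only that unnamed members are zero-initialized, so the resulting object is the same as a fully positional initialiser with explicit NULLs.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical method table with several optional callbacks. */
struct ng_type_sketch {
	const char	*name;
	int		(*constructor)(void);
	int		(*rcvmsg)(int cmd);
	int		(*shutdown)(void);
	int		(*rcvdata)(int len);
};

static int
sketch_rcvmsg(int cmd)
{
	return (cmd);
}

static int
sketch_rcvdata(int len)
{
	return (len > 0 ? 0 : -1);
}

/*
 * Only the methods this type implements are named; every member left
 * out (constructor, shutdown) is initialized to NULL by the compiler.
 */
static const struct ng_type_sketch sketch_type = {
	.name =		"sketch",
	.rcvmsg =	sketch_rcvmsg,
	.rcvdata =	sketch_rcvdata,
};
```

Naming the members also keeps the initialiser correct if members are later added or reordered in the structure.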
Archie Cobbs
1a292b8015 Allow ng_ether "lower" and "orphans" hooks to be connected at the same time.
Reviewed by:	julian
PR:		kern/63317
2004-05-16 19:31:35 +00:00
Luigi Rizzo
212b6d5244 + rename and document an unused field in struct arpcom (field is still
there so there are no ABI changes);
+ replace 5 redefinitions of the IFP2AC macro with one in if_arp.h

Eventually (but before freezing the ABI) we need to get rid of
struct arpcom (initially with the help of some smart #defines
to avoid having to touch each and every driver, see below).

Apart from the struct ifnet, struct arpcom now only stores a copy
of the MAC address (ac_enaddr, but we already have another copy in
the struct ifnet -- if_addrhead), and a netgraph-specific field
which is _always_ accessed through the ifp, so it might well go
into the struct ifnet too (where, besides, there is already an entry
for AF_NETGRAPH data...)

Too bad ac_enaddr is widely referenced by all drivers. But
this can be fixed as follows:

#define ac_enaddr       ac_if.the_original_ac_enaddr_in_struct_ifnet

(note that the right hand side would likely be a pointer rather than
the base address of an array.)
2004-04-18 01:15:32 +00:00
Pawel Jakub Dawidek
3c976c3f95 Back out previous change; it breaks the build and is an unneeded
layering violation. As pointed out, there is a much better way to do this.
Sorry guys, I need to find a better way to force reviews.

Requested by:	harti, julian, scottl (mentor)
Pointy hat to:	pjd
2004-02-20 08:26:27 +00:00
Pawel Jakub Dawidek
d592e95ba7 Export private structure owned by ng_ether(4) module outside.
It is required by the new failure detection algorithm for ng_one2many(4).

No response from:	julian, archie
Silent on:		net@
Approved by:		scottl (mentor)
2004-02-19 16:58:01 +00:00
Brian Feldman
b712e9ec66 Also, don't crash in the netgraph disconnect node if the interface is
detached from the other direction.
2004-01-09 02:03:24 +00:00
Brian Feldman
018df1c369 Don't try to ifpromisc() on an interface that's deleted already.
2004-01-09 00:41:45 +00:00
Ruslan Ermilov
8c7e4101f8 Made the Ethernet address parse type standard.
OK'ed by:	archie
2003-12-17 12:40:34 +00:00
Brooks Davis
9bf40ede4a Replace the if_name and if_unit members of struct ifnet with new members
if_xname, if_dname, and if_dunit. if_xname is the name of the interface
and if_dname/unit are the driver name and instance.

This change paves the way for interface renaming and enhanced pseudo
device creation and configuration semantics.

Approved By:	re (in principle)
Reviewed By:	njl, imp
Tested On:	i386, amd64, sparc64
Obtained From:	NetBSD (if_xname)
2003-10-31 18:32:15 +00:00