Commit Graph

55 Commits

Author SHA1 Message Date
Yaroslav Tykhiy
06dc090fe0 Use TAILQ_FOREACH. 2006-06-29 17:31:43 +00:00
Christian S.J. Peron
16d878cc99 Fix the following bpf(4) race condition which can result in a panic:
(1) bpf peer attaches to interface netif0
	(2) Packet is received by netif0
	(3) ifp->if_bpf pointer is checked and handed off to bpf
	(4) bpf peer detaches from netif0 resulting in ifp->if_bpf being
	    initialized to NULL.
	(5) ifp->if_bpf is dereferenced by bpf machinery
	(6) Kaboom

This race condition likely explains the various different kernel panics
reported around sending SIGINT to tcpdump or dhclient processes. But really
this race can result in kernel panics anywhere you have frequent bpf attach
and detach operations with high packet per second load.

Summary of changes:

- Remove the bpf interface's "driverp" member
- When we attach bpf interfaces, we now set the ifp->if_bpf member to the
  bpf interface structure. Once this is done, ifp->if_bpf should never be
  NULL. [1]
- Introduce bpf_peers_present function, an inline operation which will do
  a lockless read bpf peer list associated with the interface. It should
  be noted that the bpf code will pickup the bpf_interface lock before adding
  or removing bpf peers. This should serialize the access to the bpf descriptor
  list, removing the race.
- Expose the bpf_if structure in bpf.h so that the bpf_peers_present function
  can use it. This also removes the struct bpf_if; hack that was there.
- Adjust all consumers of the raw if_bpf structure to use bpf_peers_present

Now what happens is:

	(1) Packet is received by netif0
	(2) Check to see if bpf descriptor list is empty
	(3) Pickup the bpf interface lock
	(4) Hand packet off to process

From the attach/detach side:

	(1) Pickup the bpf interface lock
	(2) Add/remove from bpf descriptor list

Now that we are storing the bpf interface structure with the ifnet, there is
is no need to walk the bpf interface list to locate the correct bpf interface.
We now simply look up the interface, and initialize the pointer. This has a
nice side effect of changing a bpf interface attach operation from O(N) (where
N is the number of bpf interfaces), to O(1).

[1] From now on, we can no longer check ifp->if_bpf to tell us whether or
    not we have any bpf peers that might be interested in receiving packets.

In collaboration with:	sam@
MFC after:	1 month
2006-06-02 19:59:33 +00:00
Ruslan Ermilov
303989a2f3 Use sparse initializers for "struct domain" and "struct protosw",
so they are easier to follow for the human being.
2005-11-09 13:29:16 +00:00
Andrew Thompson
4e7e0183e1 Move the cloned interface list management in to if_clone. For some drivers the
softc lists and associated mutex are now unused so these have been removed.

Calling if_clone_detach() will now destroy all the cloned interfaces for the
driver and in most cases is all thats needed to unload.

Idea by:	brooks
Reviewed by:	brooks
2005-11-08 20:08:34 +00:00
Andrew Thompson
febd0759f3 Change the reference counting to count the number of cloned interfaces for each
cloner. This ensures that ifc->ifc_units is not prematurely freed in
if_clone_detach() before the clones are destroyed, resulting in memory modified
after free. This could be triggered with if_vlan.

Assert that all cloners have been destroyed when freeing the memory.

Change all simple cloners to destroy their clones with ifc_simple_destroy() on
module unload so the reference count is properly updated. This also cleans up
the interface destroy routines and allows future optimisation.

Discussed with:	brooks, pjd, -current
Reviewed by:	brooks
2005-10-12 19:52:16 +00:00
David Malone
01399f34a5 Fix some long standing bugs in writing to the BPF device attached to
a DLT_NULL interface. In particular:

        1) Consistently use type u_int32_t for the header of a
           DLT_NULL device - it continues to represent the address
           family as always.
        2) In the DLT_NULL case get bpf_movein to store the u_int32_t
           in a sockaddr rather than in the mbuf, to be consistent
           with all the DLT types.
        3) Consequently fix a bug in bpf_movein/bpfwrite which
           only permitted packets up to 4 bytes less than the MTU
           to be written.
        4) Fix all DLT_NULL devices to have the code required to
           allow writing to their bpf devices.
        5) Move the code to allow writing to if_lo from if_simloop
           to looutput, because it only applies to DLT_NULL devices
           but was being applied to other devices that use if_simloop
           possibly incorrectly.

PR:		82157
Submitted by:	Matthew Luckie <mjl@luckie.org.nz>
Approved by:	re (scottl)
2005-06-26 18:11:11 +00:00
Brooks Davis
b03965ddca Initialze ifp->if_softc.
Submitted by:	ume
2005-06-13 17:17:07 +00:00
Brooks Davis
fc74a9f93a Stop embedding struct ifnet at the top of driver softcs. Instead the
struct ifnet or the layer 2 common structure it was embedded in have
been replaced with a struct ifnet pointer to be filled by a call to the
new function, if_alloc(). The layer 2 common structure is also allocated
via if_alloc() based on the interface type. It is hung off the new
struct ifnet member, if_l2com.

This change removes the size of these structures from the kernel ABI and
will allow us to better manage them as interfaces come and go.

Other changes of note:
 - Struct arpcom is no longer referenced in normal interface code.
   Instead the Ethernet address is accessed via the IFP2ENADDR() macro.
   To enforce this ac_enaddr has been renamed to _ac_enaddr.
 - The second argument to ether_ifattach is now always the mac address
   from driver private storage rather than sometimes being ac_enaddr.

Reviewed by:	sobomax, sam
2005-06-10 16:49:24 +00:00
Sam Leffler
89bc9a3171 the rt parameter to ifa_rtrequest callbacks should always be non-null;
eliminate grauitous ptr checks that follow ptr deref's

Noticed by:	Coverity Prevent analysis tool
2005-02-24 01:34:01 +00:00
Hajimu UMEMOTO
529ed56f83 don't see NBPFILTER. 2005-01-11 07:17:33 +00:00
Hajimu UMEMOTO
2d106a00c9 remove HAVE_OLD_BPF part. 2005-01-11 07:14:37 +00:00
Hajimu UMEMOTO
9b1a707635 fix typo. 2005-01-11 07:05:56 +00:00
Warner Losh
c398230b64 /* -> /*- for license, minor formatting changes 2005-01-07 01:45:51 +00:00
Poul-Henning Kamp
3e019deaed Do a pass over all modules in the kernel and make them return EOPNOTSUPP
for unknown events.

A number of modules return EINVAL in this instance, and I have left
those alone for now and instead taught MOD_QUIESCE to accept this
as "didn't do anything".
2004-07-15 08:26:07 +00:00
Brooks Davis
f889d2ef8d Major overhaul of pseudo-interface cloning. Highlights include:
- Split the code out into if_clone.[ch].
 - Locked struct if_clone. [1]
 - Add a per-cloner match function rather then simply matching names of
   the form <name><unit> and <name>.
 - Use the match function to allow creation of <interface>.<tag>
   vlan interfaces.  The old way is preserved unchanged!
 - Also the match function to allow creation of stf(4) interfaces named
   stf0, stf, or 6to4.  This is the only major user visible change in
   that "ifconfig stf" creates the interface stf rather then stf0 and
   does not print "stf0" to stdout.
 - Allow destroy functions to fail so they can refuse to delete
   interfaces.  Currently, we forbid the deletion of interfaces which
   were created in the init function, particularly lo0, pflog0, and
   pfsync0.  In the case of lo0 this was a panic implementation so it
   does not count as a user visiable change. :-)
 - Since most interfaces do not need the new functionality, an family of
   wrapper functions, ifc_simple_*(), were created to wrap old style
   cloner functions.
 - The IF_CLONE_INITIALIZER macro is replaced with a new incompatible
   IFC_CLONE_INITIALIZER and ifc_simple consumers use IFC_SIMPLE_DECLARE
   instead.

Submitted by:   Maurycy Pawlowski-Wieronski <maurycy at fouk.org> [1]
Reviewed by:    andre, mlaier
Discussed on:	net
2004-06-22 20:13:25 +00:00
Poul-Henning Kamp
5dba30f15a add missing #include <sys/module.h> 2004-05-30 20:27:19 +00:00
Brooks Davis
1861b71020 Use an tempory struct ifnet *ifp instead of sc->sc_if to access the
ifnet in stf_clone_create.  Also use if_printf() instead of printf().
2004-04-19 05:06:27 +00:00
Brooks Davis
bb2bfb4fa9 Staticize <if>_clone_{create,destroy} functions.
Reviewed by:	mlaier
2004-04-14 00:57:49 +00:00
Robert Watson
15db03a075 Introduce stf_mtx to protect global softc list in if_stf. Add
stf_destroy() to handle the common softc destruction path for the
two destruction sources: interface cloning destroy, and module
unload.

NOTE: sc_ro, the cached route for stf conversion, is not synchronized
against concurrent access in this change, that will follow in a future
change.

Reviewed by:	pjd
2004-03-09 20:29:19 +00:00
Robert Watson
591cf7ce2d Const-poison ip_stf_ttl to make it clear that the variable is not
modified at run-time.
2004-03-07 05:15:42 +00:00
Sam Leffler
437ffe1823 o eliminate widespread on-stack mbuf use for bpf by introducing
a new bpf_mtap2 routine that does the right thing for an mbuf
  and a variable-length chunk of data that should be prepended.
o while we're sweeping the drivers, use u_int32_t uniformly when
  when prepending the address family (several places were assuming
  sizeof(int) was 4)
o return M_ASSERTVALID to BPF_MTAP* now that all stack-allocated
  mbufs have been eliminated; this may better be moved to the bpf
  routines

Reviewed by:	arch@ and several others
2003-12-28 03:56:00 +00:00
Brooks Davis
9bf40ede4a Replace the if_name and if_unit members of struct ifnet with new members
if_xname, if_dname, and if_dunit. if_xname is the name of the interface
and if_dname/unit are the driver name and instance.

This change paves the way for interface renaming and enhanced pseudo
device creation and configuration symantics.

Approved By:	re (in principle)
Reviewed By:	njl, imp
Tested On:	i386, amd64, sparc64
Obtained From:	NetBSD (if_xname)
2003-10-31 18:32:15 +00:00
Sam Leffler
d1dd20be6e Locking for updates to routing table entries. Each rtentry gets a mutex
that covers updates to the contents.  Note this is separate from holding
a reference and/or locking the routing table itself.

Other/related changes:

o rtredirect loses the final parameter by which an rtentry reference
  may be returned; this was never used and added unwarranted complexity
  for locking.
o minor style cleanups to routing code (e.g. ansi-fy function decls)
o remove the logic to bump the refcnt on the parent of cloned routes,
  we assume the parent will remain as long as the clone; doing this avoids
  a circularity in locking during delete
o convert some timeouts to MPSAFE callouts

Notes:

1. rt_mtx in struct rtentry is guarded by #ifdef _KERNEL as user-level
   applications cannot/do-no know about mutex's.  Doing this requires
   that the mutex be the last element in the structure.  A better solution
   is to introduce an externalized version of struct rtentry but this is
   a major task because of the intertwining of rtentry and other data
   structures that are visible to user applications.
2. There are known LOR's that are expected to go away with forthcoming
   work to eliminate many held references.  If not these will be resolved
   prior to release.
3. ATM changes are untested.

Sponsored by:	FreeBSD Foundation
Obtained from:	BSD/OS (partly)
2003-10-04 03:44:50 +00:00
Jonathan Lemon
1cafed3941 Update netisr handling; Each SWI now registers its queue, and all queue
drain routines are done by swi_net, which allows for better queue control
at some future point.  Packets may also be directly dispatched to a netisr
instead of queued, this may be of interest at some installations, but
currently defaults to off.

Reviewed by: hsu, silby, jayanth, sam
Sponsored by: DARPA, NAI Labs
2003-03-04 23:19:55 +00:00
Warner Losh
a163d034fa Back out M_* changes, per decision of the TRB.
Approved by: trb
2003-02-19 05:47:46 +00:00
Alfred Perlstein
44956c9863 Remove M_TRYWAIT/M_WAITOK/M_WAIT. Callers should use 0.
Merge M_NOWAIT/M_DONTWAIT into a single flag M_NOWAIT.
2003-01-21 08:56:16 +00:00
SUZUKI Shinsuke
8d95b0ce40 sync with KAME to simplify rev 1.28's patch (no functional changes)
Obtained from: KAME
Reviewd by: fenner
Approved by: re (jhb)
2003-01-15 20:09:52 +00:00
Bill Fenner
2b0e597682 Fix alignment problems -- the embedded v4 address is guaranteed to
be only 16-bit aligned, so only do byte operations to compare with it.
2003-01-05 14:03:26 +00:00
Sam Leffler
6fc32a2495 network interface and link layer changes:
o on input don't strip the Ethernet header from packets
o input packet handling is now done with if_input
o track changes to ether_ifattach/ether_ifdetach API
o track changes to bpf tapping
o call ether_ioctl for default handling of ioctl's
o use constants from net/ethernet.h where possible

Reviewed by:	many
Approved by:	re
2002-11-15 00:00:15 +00:00
Robert Watson
6b459e4956 When packets pass in and out of six-to-four (STF) tunnels, perform
labeling checks and operations as with other network interfaces.
Eventually, if it proves desirable, we might want to offer special
casing of this or other tunnel interfaces where we have an existing
label of interest, rather than treating it as though it's an
entirely fresh mbuf in the incoming/outgoing encapsulation directions.

Obtained from:	TrustedBSD Project
Sponsored by:	DARPA, Network Associates Laboratories
2002-10-20 22:39:55 +00:00
Sam Leffler
5d84645305 Replace aux mbufs with packet tags:
o instead of a list of mbufs use a list of m_tag structures a la openbsd
o for netgraph et. al. extend the stock openbsd m_tag to include a 32-bit
  ABI/module number cookie
o for openbsd compatibility define a well-known cookie MTAG_ABI_COMPAT and
  use this in defining openbsd-compatible m_tag_find and m_tag_get routines
o rewrite KAME use of aux mbufs in terms of packet tags
o eliminate the most heavily used aux mbufs by adding an additional struct
  inpcb parameter to ip_output and ip6_output to allow the IPsec code to
  locate the security policy to apply to outbound packets
o bump __FreeBSD_version so code can be conditionalized
o fixup ipfilter's call to ip_output based on __FreeBSD_version

Reviewed by:	julian, luigi (silent), -arch, -net, darren
Approved by:	julian, silence from everyone else
Obtained from:	openbsd (mostly)
MFC after:	1 month
2002-10-16 01:54:46 +00:00
Hajimu UMEMOTO
f26b2d5bf2 - increment interface output counter. sync w/ netbsd-current
- increase if_oerrors.  sync w/netbsd

Obtained from:	KAME
2002-09-17 14:25:19 +00:00
Hajimu UMEMOTO
ce9d7b2f1a - reject SIOCSIFADDR if embedded address is in private address range
- reject packets from private address range.  from hitachi

Obtained from:	KAME
2002-09-17 10:45:51 +00:00
Brooks Davis
ae5a19be8e Move all unit number management cloned interfaces into the cloning
code.  The reverts the API change which made the <if>_clone_destory()
functions return an int instead of void bringing us into closer
alignment with NetBSD.

Reviewed by:	net (a long time ago)
2002-05-25 20:17:04 +00:00
SUZUKI Shinsuke
88ff5695c1 just merged cosmetic changes from KAME to ease sync between KAME and FreeBSD.
(based on freebsd4-snap-20020128)

Reviewed by:	ume
MFC after:	1 week
2002-04-19 04:46:24 +00:00
Alfred Perlstein
929ddbbb89 Remove __P. 2002-03-19 21:54:18 +00:00
Maxime Henrion
3b16e7b252 Simplify the interface cloning framework by handling unit
unit allocation with a bitmap in the generic layer.  This
allows us to get rid of the duplicated rman code in every
clonable interface.

Reviewed by:	brooks
Approved by:	phk
2002-03-11 09:26:07 +00:00
Brooks Davis
b75496fedf Change the network interface cloning API so the destroy function returns
an int errorcode instead of void in preperation for merging cloning of
the loopback device.

Submitted by:	mux
MFC after:	2 weeks
2002-03-04 21:43:49 +00:00
Peter Wemm
777b9faaa4 Fix warnings. 2002-02-28 00:09:17 +00:00
Mike Smith
e8783c4d6b Staticise private interface lists. 2002-01-08 10:30:09 +00:00
Andrew R. Reiter
1ed4b9feff - malloc should be passed M_WAITOK, not M_WAIT (a mbuf flag)
- make use of M_ZERO to remove a call to bzero()
2001-12-07 01:32:40 +00:00
Ruslan Ermilov
8071913df2 Pull post-4.4BSD change to sys/net/route.c from BSD/OS 4.2.
Have sys/net/route.c:rtrequest1(), which takes ``rt_addrinfo *''
as the argument.  Pass rt_addrinfo all the way down to rtrequest1
and ifa->ifa_rtrequest.  3rd argument of ifa->ifa_rtrequest is now
``rt_addrinfo *'' instead of ``sockaddr *'' (almost noone is
using it anyways).

Benefit: the following command now works.  Previously we needed
two route(8) invocations, "add" then "change".
# route add -inet6 default ::1 -ifp gif0

Remove unsafe typecast in rtrequest(), from ``rtentry *'' to
``sockaddr *''.  It was introduced by 4.3BSD-Reno and never
corrected.

Obtained from:	BSD/OS, NetBSD
MFC after:	1 month
PR:		kern/28360
2001-10-17 18:07:05 +00:00
Jonathan Lemon
cf912c8943 Use in_ifaddrhashtbl instead of in_ifaddrhead to look up IP address. 2001-09-29 05:02:36 +00:00
Brooks Davis
abb6470625 Make stf a clonable device.
Yes this really is rather silly and the implementation is overkill given
that you are only allowed one of them, but NetBSD implements cloning on
this device and it's a less cluttered example of cloning then most.
2001-09-19 00:13:00 +00:00
Julian Elischer
ff265614c1 Patches from KAME to remove usage of Varargs in existing
IPV4 code. For now they will still have some in the developing stuff (IPv6)

Submitted by:	Keiichi SHIMA / <keiichi@iij.ad.jp>
Obtained from:	KAME
2001-09-07 07:19:12 +00:00
Julian Elischer
f0ffb944d2 Patches from Keiichi SHIMA <keiichi@iij.ad.jp>
to make ip use the standard protosw structure again.

Obtained from: Well, KAME I guess.
2001-09-03 20:03:55 +00:00
Brooks Davis
53dab5fe7b gif(4) and stf(4) modernization:
- Remove gif dependencies from stf.
 - Make gif and stf into modules
 - Make gif cloneable.

PR:		kern/27983
Reviewed by:	ru, ume
Obtained from:	NetBSD
MFC after:	1 week
2001-07-02 21:02:09 +00:00
Hajimu UMEMOTO
8acb22906f inject outbound packet to BPF.
Submitted by:	itojun
Obtained from:	KAME
MFC after:	10 days
2001-06-24 14:52:55 +00:00
Hajimu UMEMOTO
3384154590 Sync with recent KAME.
This work was based on kame-20010528-freebsd43-snap.tgz and some
critical problem after the snap was out were fixed.
There are many many changes since last KAME merge.

TODO:
  - The definitions of SADB_* in sys/net/pfkeyv2.h are still different
    from RFC2407/IANA assignment because of binary compatibility
    issue.  It should be fixed under 5-CURRENT.
  - ip6po_m member of struct ip6_pktopts is no longer used.  But, it
    is still there because of binary compatibility issue.  It should
    be removed under 5-CURRENT.

Reviewed by:	itojun
Obtained from:	KAME
MFC after:	3 weeks
2001-06-11 12:39:29 +00:00
Poul-Henning Kamp
22f2982675 Use <sys/queue.h> macro api rather than fondle its implementation detals.
Created with:	/usr/bin/sed
Reviewed by:	/sbin/md5
2001-02-03 11:46:35 +00:00