Commit Graph

1796 Commits

Author SHA1 Message Date
Robert Watson
b1c53bc9c0 Take a first cut at cleaning up ifnet removal and multicast socket
panics, which occur when stale ifnet pointers are left in struct
moptions hung off of inpcbs:

- Add in_ifdetach(), which matches in6_ifdetach(), and allows the
  protocol to perform early tear-down on the interface early in
  if_detach().

- Annotate that if_detach() needs careful consideration.

- Remove calls to in_pcbpurgeif0() in the handling of SIOCDIFADDR --
  this is not the place to detect interface removal!  This also
  removes what is basically a nasty (and now unnecessary) hack.

- Invoke in_pcbpurgeif0() from in_ifdetach(), in both raw and UDP
  IPv4 sockets.

It is now possible to run the msocket_ifnet_remove regression test
using HEAD without panicking.

MFC after:	3 days
2005-09-18 17:36:28 +00:00
Ruslan Ermilov
83908c6560 The arguments to printf() were swapped. 2005-09-16 20:38:33 +00:00
Yaroslav Tykhiy
ffdd61c31d Do assorted nitpicking in diagnostics while I'm here:
- Use __func__ consistently instead of copying function name
  to message strings.  Code tends to migrate around source files.
- DIAGNOSTIC is for information, INVARIANTS is for panics.
2005-09-16 12:24:28 +00:00
Yaroslav Tykhiy
14e9825634 It's nice to have relevant comments both in if {} and else {},
not in just one of them.
2005-09-16 11:58:58 +00:00
Yaroslav Tykhiy
f4ec4126bb Test the new M_VLANTAG packet flag before calling
m_tag_locate().  This adds little overhead of a simple
bitwise operation in case hardware VLAN acceleration
is on, yet saves the more expensive function call if
the acceleration is off.

Reviewed by:	ru, glebius
X-MFC-after:	6.0
2005-09-16 11:44:43 +00:00
Andre Oppermann
035ba19027 Undo a tad little optimization to bpf_mtap() introduced in rev. 1.95
which broke the correct handling of the BIOCGSEESENT flag in the bpf
listener.

PR:		kern/56441
Submitted by:	<vys at renet.ru>
MFC after:	3 days
2005-09-14 16:37:05 +00:00
Andre Oppermann
17a8471fcd Remove bogous semicolons at the end of the definitions of
'do { ... } while (0)' macros.

PR:		kern/83088
Sumbitted by:	<antoine.brodin at laposte.net>
2005-09-14 14:57:04 +00:00
Robert Watson
0a53be4671 In netkqfilter(), return EINVAL instead of 1 (EPERM) when a filter type
is requested on a network interface file descriptor that is non-applicable.

MFC after:	3 days
2005-09-12 19:26:03 +00:00
Craig Rodrigues
6a3d26b2b7 Forward declare z_errmsg with static linkage since it is defined
with static linkage later in the file.  Eliminates GCC 4.0 error.
2005-09-11 16:13:02 +00:00
Christian S.J. Peron
fe0fc7efe3 Protect interface and address lists using the appropriate mutex. These
locks were not aquired because the user buffers were not wired, thus it was
possible that that SYSCTL_OUT could sleep, causing a number of different
problems such as lock ordering issues and dead locks.

-Wire user supplied buffer to ensure SYSCTL_OUT will not sleep.
-Pickup ifnet locks to protect the list.
-Where applicable pickup address locks.
-Pickup radix node head locks.
-Remove splnet stubs
-Remove various comments about locking here, because they are no
 longer needed.

It is the hope that these changes will make sysctl_rtsock MP safe.

MFC after:	3 weeks
2005-09-10 15:12:24 +00:00
David E. O'Brien
5b1c0294e4 Forward declaring static variables as extern is invalid ISO-C. Now that
GCC can properly handle forward static declarations, do this properly.
2005-09-07 10:06:14 +00:00
Andrew Thompson
59280079d3 Add support for multicast to the bridge and allow inet6 addresses to be
assigned to the interface.

IPv6 auto-configuration is disabled. An IPv6 link-local address has a
link-local scope within one link, the spec is unclear for the bridge case and
it may cause scope violation.

An address can be assigned in the usual way;
  ifconfig bridge0 inet6 xxxx:...

Tested by:	bmah
Reviewed by:	ume (netinet6)
Approved by:	mlaier (mentor)
MFC after:	1 week
2005-09-06 21:11:59 +00:00
Christian S.J. Peron
b75a24a075 Instead of caching the PID which opened the bpf descriptor, continuously
refresh the PID which has the descriptor open. The PID is refreshed in various
operations like ioctl(2), kevent(2) or poll(2). This produces more accurate
information about current bpf consumers. While we are here remove the bd_pcomm
member of the bpf stats structure because now that we have an accurate PID we
can lookup the via the kern.proc.pid sysctl variable. This is the trick that
NetBSD decided to use to deal with this issue.

Special care needs to be taken when MFC'ing this change, as we have made a
change to the bpf stats structure. What will end up happening is we will leave
the pcomm structure but just mark it as being un-used. This way we keep the ABI
in tact.

MFC after:	1 month
Discussed with:	Rui Paulo < rpaulo at NetBSD dot org >
2005-09-05 23:08:04 +00:00
Sam Leffler
62313e4c3f reclaim sbuf and clear lock on error in ifconf
Submitted by:	Ted Unangst
Reviewed by:	rwatson
MFC after:	3 days
2005-09-04 17:32:47 +00:00
Yaroslav Tykhiy
eefbcf0e62 Use VLAN_TAG_VALUE() not only to read a dot1q tag
value from an m_tag, but also to set it.  This reduces
complex code duplication and improves its readability.

Alas, we shouldn't rename the macro to VLAN_TAG_LVALUE()
globally because that would cause pain for kernel module
port maintainers and vendors using FreeBSD as their codebase.
Added a clarifying comment instead.

Discussed with:	ru, glebius
X-MFC-After:	6.0-RELEASE (MFC is good just to reduce the diff)
2005-08-31 11:36:50 +00:00
Gleb Smirnoff
ba26134b19 Fix fallout from revision 1.77, mark outgoing packets with M_VLANTAG flag.
PR:		kern/80646
Reviewed by:	yar
MFC after:	3 days
2005-08-30 14:14:08 +00:00
Andrew Thompson
68e84b98b2 Fix a panic in softclock() if the interface is destroyed with a bpf consumer
attached.

This is caused by bpf_detachd clearing IFF_PROMISC on the interface which does
a SIOCSIFFLAGS ioctl. The problem here is that while the interface has been
stopped, IFF_UP has not been cleared so IFF_UP != IFF_DRV_RUNNING, this causes
the ioctl function to init() the interface which resets the callouts.

The destroy then completes and frees the softc but softclock will panic on a
dead callout pointer.

Ensure ifp->if_flags matches reality by clearing IFF_UP when we destroy.

Silence from:	rwatson
Approved by:	mlaier (mentor)
MFC after:	3 days
2005-08-27 01:17:42 +00:00
Robert Watson
7e994955ac De-spl parts of the routing socket code now generally protected
through locking; leave some spl references around code where there
are open questions about global variable references.  Also, add
an XXX regarding locking in sysctl.

MFC after:	3 days
2005-08-25 13:30:04 +00:00
Andrew Thompson
dba31bdea1 The mtu check in bridge_enqueue is bogus as the maximum Ethernet frame is
actually 1514, so comparing the mbuf length which includes the Ethernet header
to the interface MTU is wrong.

The check was a little over the top so just remove it.

Approved by:	mlaier (mentor)
MFC after:	3 days
2005-08-23 19:49:00 +00:00
Max Laier
0bdf5171c8 Don't loop back packets that have been routed by pf. This fixes an endless
loop where the same packet is sent over and over again.

Obtained from:	OpenBSD
Reported by:	Sergey Lapin
Tested by:	Sergey Lapin
MFC after:	7 days
2005-08-23 14:13:17 +00:00
Christian S.J. Peron
93e39f0b93 Introduce two new ioctl(2) commands, BIOCLOCK and BIOCSETWF. These commands
enhance the security of bpf(4) by further relinquishing the privilege of
the bpf(4) consumer (assuming the ioctl commands are being implemented).

Once BIOCLOCK is executed, the device becomes locked which prevents the
execution of ioctl(2) commands which can change the underly parameters of the
bpf(4) device. An example might be the setting of bpf(4) filter programs or
attaching to different network interfaces.

BIOCSETWF can be used to set write filters for outgoing packets. Currently if
a bpf(4) consumer is compromised, the bpf(4) descriptor can essentially be used
as a raw socket, regardless of consumer's UID. Write filters give users the
ability to constrain which packets can be sent through the bpf(4) descriptor.

These features are currently implemented by a couple programs which came from
OpenBSD, such as the new dhclient and pflogd.

-Modify bpf_setf(9) to accept a "cmd" parameter. This will be used to specify
 whether a read or write filter is to be set.
-Add a bpf(4) filter program as a parameter to bpf_movein(9) as we will run the
 filter program on the mbuf data once we move the packet in from user-space.
-Rather than execute two uiomove operations, (one for the link header and the
 other for the packet data), execute one and manually copy the linker header
 into the sockaddr structure via bcopy.
-Restructure bpf_setf to compensate for write filters, as well as read.
-Adjust bpf(4) stats structures to include a bd_locked member.

It should be noted that the FreeBSD and OpenBSD implementations differ a bit in
the sense that we unconditionally enforce the lock, where OpenBSD enforces it
only if the calling credential is not root.

Idea from:	OpenBSD
Reviewed by:	mlaier
2005-08-22 19:35:48 +00:00
Christian S.J. Peron
4ddfb5312a Add missing braces around bpf_filter which were missed when I
merged the bpfstat code.

Pointed out by:	iedowse
Pointy hat to:	csjp
MFC after:	3 days
2005-08-18 22:30:52 +00:00
Andrew Thompson
23e7643185 Mark the callouts as MPSAFE as if_bridge has been giant-free since day 1.
Use the SMP friendly callout_init_mtx() while we are here.

Approved by:	mlaier (mentor)
MFC after:	3 days
2005-08-18 20:17:00 +00:00
Brooks Davis
dc7c539e33 When we started calling if_findindex() from if_alloc() with an empty
struct ifnet most of if_findindex() become a complex no-op.  Remove it
and replace it with a corrected version of the four line for loop it
devolved to plus some error handling.  This should probably be replaced
with subr_unit at some point.

Switch from checking ifaddr_byindex to ifnet_byindex when looking for
empty indexes.  Since we're doing this from if_alloc/if_free, we can
only be sure that ifnet_byindex will be correct.  This fixes panics when
loading the ef(4) module.  The panics were caused by the fact that
if_alloc was called four time before if_attach was called and thus
ifaddr_byindex was not set and the same unit was allocated again.  This
in turn caused the first if_attach to fail because the ifp was not the
one in ifnet_byindex(ifp->if_index).

Reported by:	"Wojciech A. Koszek" <dunstan at freebsd dot czest dot pl>
PR:		kern/84987
MFC After:	1 day
2005-08-18 18:36:40 +00:00
Brooks Davis
7cf30146f0 - Move IF_ADDR_LOCK_DESTROY(ifp) from if_free to if_free_type.
- Add a note that additions should be made to if_free_type and not
  if_free to help avoid this in the future.

This apparently fixes a use after free in if_bridge and may fix bugs
in other direct if_free_type consumers.

Reported by:	thompsa
2005-08-16 17:02:35 +00:00
Brooks Davis
f3447eb493 Vlan interfaces change their type after ether_ifattach() so we needs to
use if_free_type(ifp, IFT_ETHER) to delete them and stop leaking struct
arpcoms.

Reported by:	thompsa
MFC After:	3 days
2005-08-15 20:27:34 +00:00
Andrew Thompson
691cdb5351 Ensure that we are holding the lock when initialising the bridge interface. We
could initialise while unlocked if the bridge is not up when setting the inet
address, ether_ioctl() would call bridge_init.

Change it so bridge_init is always called unlocked and then locks before
calling bstp_initialization().

Reported by:    Michal Mertl
Approved by:    mlaier (mentor)
MFC after:      3 days
2005-08-15 02:54:29 +00:00
Andrew Thompson
a1c0fd4dee Ensure that we are holding the lock when initialising the bridge interface. We
could initialise while unlocked if the bridge is not up when setting the inet
address, ether_ioctl() would call bridge_init.

Change it so bridge_init is always called unlocked and then locks before
calling bstp_initialization().

Reported by:	Michal Mertl
Approved by:	mlaier (mentor)
MFC after:	3 days
2005-08-15 02:50:13 +00:00
Gleb Smirnoff
00ff5c4778 Axe ppp_for_tty(). Use tty->t_lsc pointer to store sc. This
also eliminates recursive use of ppp_softc_list_mtx.

PR:		kern/84686
Reviewed by:	phk
MFC after:	1 week
2005-08-12 08:27:15 +00:00
Gleb Smirnoff
791888619d o To prevent a race between RTM_DELETE message and
arptimer() deleting stale entry, we need to lock
  rtentry before unlocking radix head.

Reviewed by:	sam
2005-08-11 08:26:31 +00:00
Gleb Smirnoff
530f95fc08 o Make rt_check() function more strict:
- rt0 passed to rt_check() must not be NULL, assert this.
  - rt returned by rt_check() must be valid locked rtentry,
    if no error occured.
o Modify callers, so that they never pass NULL rt0
  to rt_check().

Reviewed by:	sam, ume (nd6.c)
2005-08-11 08:14:53 +00:00
Robert Watson
fc57457045 For each interface flag, indicate whether or not it is owned by the
device driver, owned by the network stack, or initialized by the device
driver before attach and read-only from then on.

Not all device drivers and network stack components currently follow
these rules, especially with respect to IFF_UP, and a few exceptions
with IFF_ALLMULTI.

MFC after:	7 days
2005-08-09 12:56:20 +00:00
Robert Watson
13f4c340ae Propagate rename of IFF_OACTIVE and IFF_RUNNING to IFF_DRV_OACTIVE and
IFF_DRV_RUNNING, as well as the move from ifnet.if_flags to
ifnet.if_drv_flags.  Device drivers are now responsible for
synchronizing access to these flags, as they are in if_drv_flags.  This
helps prevent races between the network stack and device driver in
maintaining the interface flags field.

Many __FreeBSD__ and __FreeBSD_version checks maintained and continued;
some less so.

Reviewed by:	pjd, bz
MFC after:	7 days
2005-08-09 10:20:02 +00:00
Robert Watson
292ee7be1c Rename IFF_RUNNING to IFF_DRV_RUNNING, IFF_OACTIVE to IFF_DRV_OACTIVE,
and move both flags from ifnet.if_flags to ifnet.if_drv_flags, making
and documenting the locking of these flags the responsibility of the
device driver, not the network stack.  The flags for these two fields
will be mutually exclusive so that they can be exposed to user space as
though they were stored in the same variable.

Provide #defines to provide the old names #ifndef _KERNEL, so that user
applications (such as ifconfig) can use the old flag names.  Using the
old names in a device driver will result in a compile error in order to
help device driver writers adopt the new model.

When exposing the interface flags to user space, via interface ioctls
or routing sockets, or the two fields together.  Since the driver flags
cannot currently be set for user space, no new logic is currently
required to handle this case.

Add some assertions that general purpose network stack routines, such
as if_setflags(), are not improperly used on driver-owned flags.

With this change, a large number of very minor network stack races are
closed, subject to correct device driver locking.  Most were likely
never triggered.

Driver sweep to follow; many thanks to pjd and bz for the line-by-line
review they gave this patch.

Reviewed by:	pjd, bz
MFC after:	7 days
2005-08-09 10:16:17 +00:00
Gleb Smirnoff
9bd8ca3014 In preparation for fixing races in ARP (and probably in other
L2/L3 mappings) make rt_check() return a locked rtentry.
2005-08-09 08:39:56 +00:00
Andrew Thompson
3155122ec2 Use m_copypacket() which is an optimization of the common case
m_copym(m, 0, M_COPYALL, how).

This is required for strict alignment architectures where we align the IP
header in the input path but m_copym() will create an unaligned copy in
bridge_broadcast(). m_copypacket() preserves alignment of the first mbuf.

Noticed by:	Petri Simolin
Approved by:	mlaier (mentor)
MFC after:	3 days
2005-08-08 22:21:55 +00:00
Robert Watson
6a113b3de7 Merge the dev_clone and dev_clone_cred event handlers into a single
event handler, dev_clone, which accepts a credential argument.
Implementors of the event can ignore it if they're not interested,
and most do.  This avoids having multiple event handler types and
fall-back/precedence logic in devfs.

This changes the kernel API for /dev cloning, and may affect third
party packages containg cloning kernel modules.

Requested by:	phk
MFC after:	3 days
2005-08-08 19:55:32 +00:00
Sam Leffler
456d182d5b destroy lock _before_ free'ing the structure it resides in 2005-08-06 18:42:01 +00:00
John Baldwin
6da3131abd Initialize the if_addr mutex in if_alloc() rather than waiting until
if_attach().  This allows ethernet drivers to use it in their routines
to program their MAC filters before ether_ifattach() is called (de(4) is
one such driver).  Also, the if_addr mutex is destroyed in if_free()
rather than if_detach(), so there was another potential bug in that a
driver that failed during attach and called if_free() without having
called ether_ifattach() would have tried to destroy an uninitialized mutex.

Reported by:	Holm Tiffe holm at freibergnet dot de
Discussed with:	rwatson
2005-08-04 14:39:47 +00:00
Robert Watson
c3b31afd92 Protect link layer network interface multicast address list manipulation
using ifp->if_addr_mtx:

- Initialize if_addr_mtx when ifnet is initialized.

- Destroy if_addr_mtx when ifnet is torn down.

- Rename ifmaof_ifpforaddr() to if_findmulti(); assert if_addr_mtx.
  Staticize.

- Extract ifmultiaddr allocation and initialization into if_allocmulti();
  accept a 'mflags' argument to indicate whether or not sleeping is
  permitted.  This centralizes error handling and address duplication.

- Extract ifmultiaddr tear-down and deallocation in if_freemulti().

- Re-structure if_addmulti() to hold if_addr_mtx around manipulation of
  the ifnet multicast address list and reference count manipulation.
  Make use of non-sleeping allocations.  Annotate the fact that we only
  generate routing socket events for explicit address addition, not
  implicit link layer address addition.

- Re-structure if_delmulti() to hold if_addr_mtx around manipulation of
  the ifnet multicast address list and reference count manipulation.
  Annotate the lack of a routing socket event for implicit link layer
  address removal.

- De-spl all and sundry.

Problem reported by:	Ed Maste <emaste at phaedrus dot sandvine dot ca>
MFC after:		1 week
2005-08-02 23:23:26 +00:00
Robert Watson
09df718e0e When allocating link layer ifnet address list entries in
ifp->if_resolvemulti(), do so with M_NOWAIT rather than M_WAITOK, so
that a mutex can be held over the call.  In the FDDI code, add a
missing M_ZERO.  Consumers are already aware that if_resolvemulti()
can fail.

MFC after:	1 week
2005-08-02 17:52:52 +00:00
Robert Watson
de6073aab0 Add if_addr_mtx to struct ifnet, a mutex to protect ifnet-related address
lists.  Add accessor macros.

This changes the size of struct ifnet, but ideally, all ifnet consumers
are now using if_alloc() to allocate these structures rather than
embedding them into device driver softc's, so this won't modify the
network device driver ABI.

MFC after:	1 week
2005-08-02 17:43:35 +00:00
Bjoern A. Zeeb
9e669156d4 Add support for IPv6 over GRE [1]. PR kern/80340 includes the
FreeBSD specific ip_newid() changes NetBSD does not have.
Correct handling of non AF_INET packets passed to bpf [2].

PR:		kern/80340[1], NetBSD PRs 29150[1], 30844[2]
Obtained from:	NetBSD ip_gre.c rev. 1.34,1.35, if_gre.c rev. 1.56
Submitted by:	Gert Doering <gert at greenie.muc.de>[2]
MFC after:	4 days
2005-08-01 08:14:21 +00:00
Christian S.J. Peron
422a63da6e Rather than hold a mutex over calls to SYSCTL_OUT allocate a
temporary buffer then pass the array to user-space once we have
dropped the lock.

While we are here, drop an assertion which could result in a
kernel panic under certain race conditions.

Pointed out by:	rwatson
2005-07-26 17:21:56 +00:00
Hajimu UMEMOTO
a1f7e5f8ee scope cleanup. with this change
- most of the kernel code will not care about the actual encoding of
  scope zone IDs and won't touch "s6_addr16[1]" directly.
- similarly, most of the kernel code will not care about link-local
  scoped addresses as a special case.
- scope boundary check will be stricter.  For example, the current
  *BSD code allows a packet with src=::1 and dst=(some global IPv6
  address) to be sent outside of the node, if the application do:
    s = socket(AF_INET6);
    bind(s, "::1");
    sendto(s, some_global_IPv6_addr);
  This is clearly wrong, since ::1 is only meaningful within a single
  node, but the current implementation of the *BSD kernel cannot
  reject this attempt.

Submitted by:	JINMEI Tatuya <jinmei__at__isl.rdc.toshiba.co.jp>
Obtained from:	KAME
2005-07-25 12:31:43 +00:00
Andrew Thompson
39bb2fca46 We check that all the member interfaces have the same MTU on attach to the
bridge but the interface can still be changed afterwards.

This falls under the 'dont do that' category but log an warning when INVARIANTS
is defined.

Approved by:	mlaier (mentor)
MFC after:	3 days
2005-07-25 02:22:37 +00:00
Christian S.J. Peron
69f7644bc9 Introduce new sysctl variable: net.bpf.stats. This sysctl variable can
be used to pass statistics regarding dropped, matched and received
packet counts from the kernel to user-space. While we are here
introduce a new counter for filtered or matched packets. We currently
keep track of packets received or dropped by the bpf device, but not
how many packets actually matched the bpf filter.

-Introduce net.bpf.stats sysctl OID
-Move sysctl variables after the function prototypes so we can
 reference bpf_stats_sysctl(9) without build errors.
-Introduce bpf descriptor counter which is used mainly for sizing
 of the xbpf_d array.
-Introduce a xbpf_d structure which will act as an external
 representation of the bpf_d structure.
-Add a the following members to the bpfd structure:

	bd_fcount	- Number of packets which matched bpf filter
	bd_pid		- PID which opened the bpf device
	bd_pcomm	- Process name which opened the device.

It should be noted that it's possible that the process which opened
the device could be long gone at the time of stats collection. An
example might be a process that opens the bpf device forks then exits
leaving the child process with the bpf fd.

Reviewed by:	mdodd
2005-07-24 17:21:17 +00:00
Robert Watson
638ccea02a Allocate one of the spare ifnet integer fields to hold if_drv_flags,
which in the future will hold IFF_OACTIVE and IFF_RUNNING, and have
its access synchronized by the device driver rather than the
protocol stack.  This will avoid potential races in the management
of flags in if_flags.

Discussed with:	various (scottl, jhb, ...)
MFC after:	1 week
2005-07-21 22:01:06 +00:00
Poul-Henning Kamp
514bcb8955 Add some KASSERTS to catch null pointers. 2005-07-21 09:00:51 +00:00
Andrew Thompson
12b47243c6 Clear the PROMISC flag from the vlan interface when we remove a member. We
checked for IFT_L2VLAN in bridge_ioctl_add() but not bridge_delete_member().

Approved by:	mlaier (mentor)
2005-07-20 19:42:51 +00:00
Robert Watson
2432c31c8b In multicast routines:
Compare pointers with NULL rather than treating them as booleans.

Compare pointers with NULL rather than 0 to make it more clear
they are pointers.

Assign pointers value of NULL rather than 0 to make it more clear
they are pointers.

MFC after:	3 days
2005-07-19 10:12:58 +00:00
Robert Watson
d8d5b10e84 Rename equal() macro to sa_equal(), which matches the definitions
of sa_equal() in other files, and makes it more clear what equal()
is comparing.

MFC after:	3 days
2005-07-19 10:03:47 +00:00
Robert Watson
f002340544 Lock down netnatm and mark as MPSAFE:
- Introduce a subsystem mutex, natm_mtx, manipulated with accessor macros
  NATM_LOCK_INIT(), NATM_LOCK(), NATM_UNLOCK(), NATM_LOCK_ASSERT().  It
  protects the consistency of pcb-related data structures.  Finer grained
  locking is possible, but should be done in the context of specific
  measurements (as very little work is done in netnatm -- most is in the
  ATM device driver or socket layer, so there's probably not much
  contention).

- Remove GIANT_REQUIRED, mark as NETISR_MPSAFE, remove
  NET_NEEDS_GIANT("netnatm").

- Conditionally acquire Giant when entering network interfaces for
  ifp->if_ioctl() using IFF_LOCKGIANT(ifp)/IFF_UNLOCKGIANT(ifp) in order
  to coexist with non-MPSAFE atm ifnet drivers..

- De-spl.

MFC after:	2 weeks
Reviewed by:	harti, bms (various versions)
2005-07-18 16:55:46 +00:00
George V. Neville-Neil
ba7be0a934 Fix for PR 82974. We were not checking that the route looked up in
the case of an RTM_CHANGE was specific, i.e. that it matched completely.  This
led to a route change of a non-existent route changing the default route
as the radix code would simply back track to that point and hand that
route back to the routing socket code.

PR: 82974
Reviewed by: Tai-hwa Liang <avatar@mmlab.cse.yzu.edu.tw>
             Ben Kaduk <minimarmot@gmail.com>
             Bjoern A. Zeeb <bzeeb-lists@lists.zabbadoz.net>
Obtained from:	OpenBSD with modifications.
MFC after: 2 weeks
2005-07-15 09:18:34 +00:00
Max Laier
52023244de Move eventhandler for 'ifnet_departure_event' at the end of the progress.
Some of the (IPv6) cleanup functions send packets to inform peers of the
departure.  These packets confused users of ifnet_departure_event (pf at the
moment).

PR:		kern/80627
Tested by:	Divacky Roman
MFC after:	1 week
2005-07-14 20:26:43 +00:00
Yaroslav Tykhiy
1a3b685942 MFp4:
- Introduce a helper function if_setflag() containing the code common
  to ifpromisc() and if_allmulti() instead of duplicating the code poorly,
  with different bugs.
- Call ifp->if_ioctl() in a consistent way: always use more compatible C
  syntax and check whether ifp->if_ioctl is not NULL prior to the call.

MFC after:	1 month
2005-07-14 13:56:51 +00:00
Andrew Thompson
489fc2258f Previously the bridge MTU was set to ETHERMTU and could not be changed. Since
we can only bridge interfaces with the same value it meant that all members had
to be set at ETHERMTU as well.

Allow the first member to be added to define the MTU for the bridge, the check
still applies to all additional members.

Print an informative message if the MTU is incorrect [1]

Requested by:	Niki Denev [1]
Approved by:	mlaier (mentor)
MFC after:	3 days
2005-07-13 20:40:19 +00:00
Sam Leffler
e0d80bffb5 additions from libpcap 0.9.1 release
Approved by:	re (scottl)
2005-07-11 03:16:23 +00:00
Andrew Thompson
ea32e73208 - Previously when broadcasting to N number of interfaces we would run pfil
hooks for each outgoing interface but also run pfil hooks _N times_ on the
  bridge interface. This is changed so pfil hooks are run once for the bridge
  interface (bridge0) and then only on the outgoing interfaces in the broadcast
  loop.

- Simplify bridge_enqueue() by moving bridge_pfil() to the callers.

- Check (inet6_pfil_hook.ph_busy_count >= 0), it may be possible to have a
  packet filter hooked for only ipv6 but we were only checking if ipv4 hooks
  were busy.

- Minor optimisation for null mbuf check after bridge_pfil(), move it into the
  if-block as it couldnt possibly be null outside.

Prodded by:	mlaier
Approved by:	re (scottl), mlaier (mentor)
2005-07-06 01:24:45 +00:00
Robert Watson
3c308b091f Eliminate MAC entry point mac_create_mbuf_from_mbuf(), which is
redundant with respect to existing mbuf copy label routines.  Expose
a new mac_copy_mbuf() routine at the top end of the Framework and
use that; use the existing mpo_copy_mbuf_label() routine on the
bottom end.

Obtained from:	TrustedBSD Project
Sponsored by:	SPARTA, SPAWAR
Approved by:	re (scottl)
2005-07-05 23:39:51 +00:00
Andrew Thompson
ede3a2773d Check the alignment of the IP header before passing the packet up to the
packet filter. This would cause a panic on architectures that require strict
alignment such as sparc64, ia64 and ppc.

This uses the code block from if_bridge and the newly added macro
IP_HDR_ALIGNED_P().

This /might/ be a temporary messure before all NIC drivers are educated
to align the header themself.

PR:		ia64/81284
Obtained from:	NetBSD (if_bridge)
Approved by:	re (dwhite), mlaier (mentor)
2005-07-03 18:24:03 +00:00
Andrew Thompson
2fcb030ad5 Check the alignment of the IP header before passing the packet up to the
packet filter. This would cause a panic on architectures that require strict
alignment such as sparc64 (tier1) and ia64/ppc (tier2).

This adds two new macros that check the alignment, these are compile time
dependent on __NO_STRICT_ALIGNMENT which is set for i386 and amd64 where
alignment isn't need so the cost is avoided.

 IP_HDR_ALIGNED_P()
 IP6_HDR_ALIGNED_P()

Move bridge_ip_checkbasic()/bridge_ip6_checkbasic() up so that the alignment
is checked for ipfw and dummynet too.

PR:		ia64/81284
Obtained from:	NetBSD
Approved by:	re (dwhite), mlaier (mentor)
2005-07-02 23:13:31 +00:00
Suleiman Souhlal
571dcd15e2 Fix the recent panics/LORs/hangs created by my kqueue commit by:
- Introducing the possibility of using locks different than mutexes
for the knlist locking. In order to do this, we add three arguments to
knlist_init() to specify the functions to use to lock, unlock and
check if the lock is owned. If these arguments are NULL, we assume
mtx_lock, mtx_unlock and mtx_owned, respectively.

- Using the vnode lock for the knlist locking, when doing kqueue operations
on a vnode. This way, we don't have to lock the vnode while holding a
mutex, in filt_vfsread.

Reviewed by:	jmg
Approved by:	re (scottl), scottl (mentor override)
Pointyhat to:	ssouhlal
Will be happy:	everyone
2005-07-01 16:28:32 +00:00
Gleb Smirnoff
82dd5411d9 Use m_uiotombuf() instead of own implementation. This is not just
a cosmetic change. m_uiotombuf() produces a packet header mbuf, while
original implementation did not. When kernel is compiled with MAC
support, headerless mbuf will cause panic.

Reported by:	Alexander Nikiforenko <asn rambler-co.ru>
Approved by:	re (scottl)
MFC After:	2 weeks
2005-07-01 15:22:47 +00:00
Andrew Thompson
49808fa4fc Sync if_bridge to NetBSD r1.31
Rename conflicting variables when handling SNAP Ethernet frames.

Obtained from:	NetBSD
Approved by:	mlaier (mentor)
Approved by:	re (blanket)
2005-06-29 19:23:32 +00:00
Qing Li
16a2e0a6c8 Require gateways for routes to be of the same address family as the
route itself.

It fixes a bug where an IPv4 route for example has an IPv6 gateway
specified:

     route add 10.1.1.1 -inet6 fe80::1%fxp0

     Destination  Gateway       Flags  Refs  Use  Netif Expire
     10.1.1.1     fe80::1%fxp0  UGHS   0     0    fxp0

The fix rejects these illegal combinations:

     route: writing to routing socket: Invalid argument
     add host 10.1.1.1: gateway fe80::1%fxp0: Invalid argument

Reviewed by:	KAME jinmei@isl.rdc.toshiba.co.jp
Reviewed by:	andre (mentor)
Approved by:	re
MFC after:	5
2005-06-28 23:32:22 +00:00
Bjoern A. Zeeb
066b192e3b Fix panic after ifnet changes in rev. 1.30. sc->sc_ifp is a
pointer now and needs to be allocated before using.

Reviewed by:	gnn
Approved by:	re (scottl), rwatson (mentor)
2005-06-28 06:55:45 +00:00
Andrew Thompson
ca6c404ce3 Fix a panic when bringing up the bridge interface. We were casting a ifnet
pointer to a softc which is no longer valid since the ifnet struct was split
out from the softc.

Approved by:	mlaier (mentor)
Approved by:	re (blanket)
2005-06-27 21:58:12 +00:00
David Malone
01399f34a5 Fix some long standing bugs in writing to the BPF device attached to
a DLT_NULL interface. In particular:

        1) Consistently use type u_int32_t for the header of a
           DLT_NULL device - it continues to represent the address
           family as always.
        2) In the DLT_NULL case get bpf_movein to store the u_int32_t
           in a sockaddr rather than in the mbuf, to be consistent
           with all the DLT types.
        3) Consequently fix a bug in bpf_movein/bpfwrite which
           only permitted packets up to 4 bytes less than the MTU
           to be written.
        4) Fix all DLT_NULL devices to have the code required to
           allow writing to their bpf devices.
        5) Move the code to allow writing to if_lo from if_simloop
           to looutput, because it only applies to DLT_NULL devices
           but was being applied to other devices that use if_simloop
           possibly incorrectly.

PR:		82157
Submitted by:	Matthew Luckie <mjl@luckie.org.nz>
Approved by:	re (scottl)
2005-06-26 18:11:11 +00:00
Brooks Davis
1436936ab0 Spelling/grammer fixes in comment.
Reported by:	Hans Petter Selasky <hselasky at c2i dot net>
Approved by:	re (ifnet blanked)
2005-06-17 17:19:34 +00:00
Brooks Davis
b03965ddca Initialze ifp->if_softc.
Submitted by:	ume
2005-06-13 17:17:07 +00:00
Brooks Davis
28ef2db496 Return NULL instead of a bogus pointer from if_alloc when if_com_alloc
fails.

Move detaching the ifnet from the ifindex_table into if_free so we can
both keep the sanity checks and actually delete the ifnets. [0]

Reported by:	gallatin [0]
Approved by:	re (blanket)
2005-06-12 00:53:03 +00:00
Andrew Thompson
e7acea8202 Catch up with the struct ifnet changes and use if_alloc().
Reviewed by:	brooks
Approved by:	mlaier (mentor)
2005-06-10 23:52:01 +00:00
Brooks Davis
fc74a9f93a Stop embedding struct ifnet at the top of driver softcs. Instead the
struct ifnet or the layer 2 common structure it was embedded in have
been replaced with a struct ifnet pointer to be filled by a call to the
new function, if_alloc(). The layer 2 common structure is also allocated
via if_alloc() based on the interface type. It is hung off the new
struct ifnet member, if_l2com.

This change removes the size of these structures from the kernel ABI and
will allow us to better manage them as interfaces come and go.

Other changes of note:
 - Struct arpcom is no longer referenced in normal interface code.
   Instead the Ethernet address is accessed via the IFP2ENADDR() macro.
   To enforce this ac_enaddr has been renamed to _ac_enaddr.
 - The second argument to ether_ifattach is now always the mac address
   from driver private storage rather than sometimes being ac_enaddr.

Reviewed by:	sobomax, sam
2005-06-10 16:49:24 +00:00
Max Laier
2c67c57c8b Add missing {} in last commit. 2005-06-10 15:53:21 +00:00
Andrew Thompson
c8b0129238 Add dummynet(4) support to if_bridge, this code is largely based on bridge.c.
This is the final piece to match bridge.c in functionality, we can now be a
drop-in replacement.

Approved by:	mlaier (mentor)
2005-06-10 01:25:22 +00:00
Hartmut Brandt
25029d6c31 When returing an RTM_GET message through the routing socket fill
in the rtm_index field whenever we have an interface pointer. This
is consistent with the RTM_GET messages returned by sysctl().
2005-06-09 12:20:50 +00:00
Andrew Thompson
82116c339c Bring in IPFW layer2 filtering from bridge.c, this allows Ethernet filtering
using the layer2, mac and mac-type keywords.

This is one of the last features that bridge.c has over if_bridge and gets us
very close to a full functional replacement.

Approved by:	mlaier (mentor)
2005-06-07 21:20:18 +00:00
Christian S.J. Peron
0eb206049e Change the maximum bpf program instruction limitation from being hard-
coded at 512 (BPF_MAXINSNS) to being tunable. This is useful for users
who wish to use complex or large bpf programs when filtering traffic.
For now we will default it to BPF_MAXINSNS. I have tested bpf programs
with well over 21,000 instructions without any problems.

Discussed with:	phk
2005-06-06 22:19:59 +00:00
Brooks Davis
9d80a3307a Send link state change notifications to /dev/devctl. This is needed to
start the OpenBSD dhclient when links come up.
2005-06-06 19:08:11 +00:00
Andrew Thompson
f2999b2fdf Change ipv6 packet filtering to match ipv4. It now checks pfil_member and
pfil_bridge to determine which interfaces to filter on.

Approved by:	mlaier (mentor)
2005-06-06 02:41:29 +00:00
Andrew Thompson
5a6530a38d Fix indentation of two comment blocks from the last commit.
Approved by:	mlaier (mentor)
2005-06-05 03:49:23 +00:00
Andrew Thompson
8f86751705 Add hooks into the networking layer to support if_bridge. This changes struct
ifnet so a buildworld is necessary.

Approved by:	mlaier (mentor)
Obtained from:	NetBSD
2005-06-05 03:13:13 +00:00
Andrew Thompson
31997bf223 Add if_bridge, which provides more advanced Ethernet bridging and 802.1d
spanning tree support.

Based on Jason Wright's bridge driver from OpenBSD, and modified by Jason R.
Thorpe in NetBSD.

Reviewed by:	mlaier, bms, green
Silence from:	-net
Approved by:	mlaier (mentor)
Obtained from:	NetBSD
2005-06-05 02:59:26 +00:00
Sam Leffler
f6f1669c0f integrate changes from libpcap-0.9.1-096
Reviewed by:	bms
2005-05-28 21:56:41 +00:00
Brooks Davis
dbf49e18bb Update refrenced URL for SNMP list of ifTypes to refer to iana.org
instead of a dead location on ftp.isi.edu.
2005-05-28 06:11:38 +00:00
Gleb Smirnoff
748741c7ae Plug mbuf leak, that I have introduced in 1.85. Also restore important comment
from if_ethersubr.c:1.178. While here adjust formatting, to make code more
readable.

Reported by:	Alexey Kamyshev, rwatson
2005-05-26 06:50:00 +00:00
Peter Edwards
45778b37b2 Separate out address-detaching part of if_detach into if_purgeaddrs,
so if_tap doesn't need to rely on locally-rolled code to do same.

The observable symptom of if_tap's bzero'ing the address details
was a crash in "ifconfig tap0" after an if_tap device was closed.

Reported By: Matti Saarinen (mjsaarin at cc dot helsinki dot fi)
2005-05-25 13:52:03 +00:00
Max Laier
d274e6b641 Fix semantics of ph_busy_count == -1 to pass instead of block.
PR:		kern/81128
Submitted by:	Joost Bekkers
MFC-after:	2 weeks
2005-05-23 17:07:16 +00:00
Colin Percival
fd94099ec2 If we are going to
1. Copy a NULL-terminated string into a fixed-length buffer, and
2. copyout that buffer to userland,
we really ought to
0. Zero the entire buffer
first.

Security: FreeBSD-SA-05:08.kmem
2005-05-06 02:50:00 +00:00
Maksim Yevmenkin
75ae257016 Change m_uiotombuf so it will accept offset at which data should be copied
to the mbuf. Offset cannot exceed MHLEN bytes. This is currently used to
fix Ethernet header alignment problem on alpha and sparc64. Also change all
users of m_uiotombuf to pass proper offset.

Reviewed by:	jmg, sam
Tested by:	Sten Spans "sten AT blinkenlights DOT nl"
MFC after:	1 week
2005-05-04 18:55:03 +00:00
Christian S.J. Peron
a3272e3ce3 -introduce net.bpf sysctl instead of the less intuitive debug.*
debug.bpf_bufsize is now net.bpf.bufsize
    debug.bpf_maxbufsize is now net.bpf.maxbufsize

-move function prototypes for bpf_drvinit and bpf_clone up to the
 top of the file with the others
-assert bpfd lock in catchpacket() and bpf_wakeup()

MFC after:	2 weeks
2005-05-04 03:09:28 +00:00
Gleb Smirnoff
984be3efbf - Call if_link_state_change() for each vlan, when link changes
on parent.
- Remove route.h include.
- Fix comment about MII.

Sponsored by:	Rambler
Reviewed by:	yar
2005-04-20 12:16:41 +00:00
Gleb Smirnoff
68a3482f69 Do not call all link state callbacks directly, but schedule
a taskqueue(9) task. This fixes LORs and adds possibility
to serve such events pseudorecursively, when link state
change of interface causes subsequent change on other
interfaces.

Sponsored by:	Rambler
Reviewed by:	sam, brooks, mux
2005-04-20 09:30:54 +00:00
Colin Percival
fbd24c5ed6 Zero the ifr.ifr_name buffer in ifconf() in order to avoid
accidental disclosure of kernel memory to userland.

Security:	FreeBSD-SA-05:04.ifconf
2005-04-15 01:52:40 +00:00
Matthew N. Dodd
f7251b07e2 Add #defines for control fields and address bits. 2005-04-13 08:14:14 +00:00
Matthew N. Dodd
b137ea624b Provide a sysctl (net.link.tap.user_open) to allow unpriviliged
acces to tap(4) device nodes based on file system permission.

Duplicate the 'debug.if_tap_debug' sysctl under the
'net.link.tap' hierarchy.
2005-04-13 00:30:19 +00:00
Poul-Henning Kamp
f4f6abcb4e Explicitly hold a reference to the cdev we have just cloned. This
closes the race where the cdev was reclaimed before it ever made it
back to devfs lookup.
2005-03-31 12:19:44 +00:00
Brian Feldman
4549709fb5 You must selwakeup{,pri}() when closing a selectable object or the
td->td_sel will get trashed and crash the system.  Fix BPF's mistake
in this area.

MFC after:	1 day
2005-03-27 23:16:17 +00:00
Sam Leffler
7a7fa27b23 rt_newaddrmsg will blow up if given something other than RTM_ADD
or RTM_DELETE; add an assertion, may want to do something more
heavyhanded in the future

Noticed by:	Coverity Prevent analysis tool
Reviewed by:	mdodd
2005-03-26 21:49:43 +00:00