Commit Graph

1740 Commits

Author SHA1 Message Date
csjp
7bdd5fbb1c Protect interface and address lists using the appropriate mutex. These
locks were not aquired because the user buffers were not wired, thus it was
possible that that SYSCTL_OUT could sleep, causing a number of different
problems such as lock ordering issues and dead locks.

-Wire user supplied buffer to ensure SYSCTL_OUT will not sleep.
-Pickup ifnet locks to protect the list.
-Where applicable pickup address locks.
-Pickup radix node head locks.
-Remove splnet stubs
-Remove various comments about locking here, because they are no
 longer needed.

It is the hope that these changes will make sysctl_rtsock MP safe.

MFC after:	3 weeks
2005-09-10 15:12:24 +00:00
obrien
b888392910 Forward declaring static variables as extern is invalid ISO-C. Now that
GCC can properly handle forward static declarations, do this properly.
2005-09-07 10:06:14 +00:00
thompsa
ab218f3cd3 Add support for multicast to the bridge and allow inet6 addresses to be
assigned to the interface.

IPv6 auto-configuration is disabled. An IPv6 link-local address has a
link-local scope within one link, the spec is unclear for the bridge case and
it may cause scope violation.

An address can be assigned in the usual way;
  ifconfig bridge0 inet6 xxxx:...

Tested by:	bmah
Reviewed by:	ume (netinet6)
Approved by:	mlaier (mentor)
MFC after:	1 week
2005-09-06 21:11:59 +00:00
csjp
17dd22a407 Instead of caching the PID which opened the bpf descriptor, continuously
refresh the PID which has the descriptor open. The PID is refreshed in various
operations like ioctl(2), kevent(2) or poll(2). This produces more accurate
information about current bpf consumers. While we are here remove the bd_pcomm
member of the bpf stats structure because now that we have an accurate PID we
can lookup the via the kern.proc.pid sysctl variable. This is the trick that
NetBSD decided to use to deal with this issue.

Special care needs to be taken when MFC'ing this change, as we have made a
change to the bpf stats structure. What will end up happening is we will leave
the pcomm structure but just mark it as being un-used. This way we keep the ABI
in tact.

MFC after:	1 month
Discussed with:	Rui Paulo < rpaulo at NetBSD dot org >
2005-09-05 23:08:04 +00:00
sam
3a5df528dd reclaim sbuf and clear lock on error in ifconf
Submitted by:	Ted Unangst
Reviewed by:	rwatson
MFC after:	3 days
2005-09-04 17:32:47 +00:00
yar
a8babb6361 Use VLAN_TAG_VALUE() not only to read a dot1q tag
value from an m_tag, but also to set it.  This reduces
complex code duplication and improves its readability.

Alas, we shouldn't rename the macro to VLAN_TAG_LVALUE()
globally because that would cause pain for kernel module
port maintainers and vendors using FreeBSD as their codebase.
Added a clarifying comment instead.

Discussed with:	ru, glebius
X-MFC-After:	6.0-RELEASE (MFC is good just to reduce the diff)
2005-08-31 11:36:50 +00:00
glebius
be4e3433fd Fix fallout from revision 1.77, mark outgoing packets with M_VLANTAG flag.
PR:		kern/80646
Reviewed by:	yar
MFC after:	3 days
2005-08-30 14:14:08 +00:00
thompsa
c61e775527 Fix a panic in softclock() if the interface is destroyed with a bpf consumer
attached.

This is caused by bpf_detachd clearing IFF_PROMISC on the interface which does
a SIOCSIFFLAGS ioctl. The problem here is that while the interface has been
stopped, IFF_UP has not been cleared so IFF_UP != IFF_DRV_RUNNING, this causes
the ioctl function to init() the interface which resets the callouts.

The destroy then completes and frees the softc but softclock will panic on a
dead callout pointer.

Ensure ifp->if_flags matches reality by clearing IFF_UP when we destroy.

Silence from:	rwatson
Approved by:	mlaier (mentor)
MFC after:	3 days
2005-08-27 01:17:42 +00:00
rwatson
e888ea0cd4 De-spl parts of the routing socket code now generally protected
through locking; leave some spl references around code where there
are open questions about global variable references.  Also, add
an XXX regarding locking in sysctl.

MFC after:	3 days
2005-08-25 13:30:04 +00:00
thompsa
c0a34d34c0 The mtu check in bridge_enqueue is bogus as the maximum Ethernet frame is
actually 1514, so comparing the mbuf length which includes the Ethernet header
to the interface MTU is wrong.

The check was a little over the top so just remove it.

Approved by:	mlaier (mentor)
MFC after:	3 days
2005-08-23 19:49:00 +00:00
mlaier
20fd9e3a1f Don't loop back packets that have been routed by pf. This fixes an endless
loop where the same packet is sent over and over again.

Obtained from:	OpenBSD
Reported by:	Sergey Lapin
Tested by:	Sergey Lapin
MFC after:	7 days
2005-08-23 14:13:17 +00:00
csjp
a41f7da35f Introduce two new ioctl(2) commands, BIOCLOCK and BIOCSETWF. These commands
enhance the security of bpf(4) by further relinquishing the privilege of
the bpf(4) consumer (assuming the ioctl commands are being implemented).

Once BIOCLOCK is executed, the device becomes locked which prevents the
execution of ioctl(2) commands which can change the underly parameters of the
bpf(4) device. An example might be the setting of bpf(4) filter programs or
attaching to different network interfaces.

BIOCSETWF can be used to set write filters for outgoing packets. Currently if
a bpf(4) consumer is compromised, the bpf(4) descriptor can essentially be used
as a raw socket, regardless of consumer's UID. Write filters give users the
ability to constrain which packets can be sent through the bpf(4) descriptor.

These features are currently implemented by a couple programs which came from
OpenBSD, such as the new dhclient and pflogd.

-Modify bpf_setf(9) to accept a "cmd" parameter. This will be used to specify
 whether a read or write filter is to be set.
-Add a bpf(4) filter program as a parameter to bpf_movein(9) as we will run the
 filter program on the mbuf data once we move the packet in from user-space.
-Rather than execute two uiomove operations, (one for the link header and the
 other for the packet data), execute one and manually copy the linker header
 into the sockaddr structure via bcopy.
-Restructure bpf_setf to compensate for write filters, as well as read.
-Adjust bpf(4) stats structures to include a bd_locked member.

It should be noted that the FreeBSD and OpenBSD implementations differ a bit in
the sense that we unconditionally enforce the lock, where OpenBSD enforces it
only if the calling credential is not root.

Idea from:	OpenBSD
Reviewed by:	mlaier
2005-08-22 19:35:48 +00:00
csjp
6de3f76016 Add missing braces around bpf_filter which were missed when I
merged the bpfstat code.

Pointed out by:	iedowse
Pointy hat to:	csjp
MFC after:	3 days
2005-08-18 22:30:52 +00:00
thompsa
805a9715f8 Mark the callouts as MPSAFE as if_bridge has been giant-free since day 1.
Use the SMP friendly callout_init_mtx() while we are here.

Approved by:	mlaier (mentor)
MFC after:	3 days
2005-08-18 20:17:00 +00:00
brooks
7538f97807 When we started calling if_findindex() from if_alloc() with an empty
struct ifnet most of if_findindex() become a complex no-op.  Remove it
and replace it with a corrected version of the four line for loop it
devolved to plus some error handling.  This should probably be replaced
with subr_unit at some point.

Switch from checking ifaddr_byindex to ifnet_byindex when looking for
empty indexes.  Since we're doing this from if_alloc/if_free, we can
only be sure that ifnet_byindex will be correct.  This fixes panics when
loading the ef(4) module.  The panics were caused by the fact that
if_alloc was called four time before if_attach was called and thus
ifaddr_byindex was not set and the same unit was allocated again.  This
in turn caused the first if_attach to fail because the ifp was not the
one in ifnet_byindex(ifp->if_index).

Reported by:	"Wojciech A. Koszek" <dunstan at freebsd dot czest dot pl>
PR:		kern/84987
MFC After:	1 day
2005-08-18 18:36:40 +00:00
brooks
618a796b66 - Move IF_ADDR_LOCK_DESTROY(ifp) from if_free to if_free_type.
- Add a note that additions should be made to if_free_type and not
  if_free to help avoid this in the future.

This apparently fixes a use after free in if_bridge and may fix bugs
in other direct if_free_type consumers.

Reported by:	thompsa
2005-08-16 17:02:35 +00:00
brooks
aa637c9682 Vlan interfaces change their type after ether_ifattach() so we needs to
use if_free_type(ifp, IFT_ETHER) to delete them and stop leaking struct
arpcoms.

Reported by:	thompsa
MFC After:	3 days
2005-08-15 20:27:34 +00:00
thompsa
5eb423e0c4 Ensure that we are holding the lock when initialising the bridge interface. We
could initialise while unlocked if the bridge is not up when setting the inet
address, ether_ioctl() would call bridge_init.

Change it so bridge_init is always called unlocked and then locks before
calling bstp_initialization().

Reported by:    Michal Mertl
Approved by:    mlaier (mentor)
MFC after:      3 days
2005-08-15 02:54:29 +00:00
thompsa
97eb04040a Ensure that we are holding the lock when initialising the bridge interface. We
could initialise while unlocked if the bridge is not up when setting the inet
address, ether_ioctl() would call bridge_init.

Change it so bridge_init is always called unlocked and then locks before
calling bstp_initialization().

Reported by:	Michal Mertl
Approved by:	mlaier (mentor)
MFC after:	3 days
2005-08-15 02:50:13 +00:00
glebius
4c2db6c3c4 Axe ppp_for_tty(). Use tty->t_lsc pointer to store sc. This
also eliminates recursive use of ppp_softc_list_mtx.

PR:		kern/84686
Reviewed by:	phk
MFC after:	1 week
2005-08-12 08:27:15 +00:00
glebius
eff02aa90f o To prevent a race between RTM_DELETE message and
arptimer() deleting stale entry, we need to lock
  rtentry before unlocking radix head.

Reviewed by:	sam
2005-08-11 08:26:31 +00:00
glebius
fa253399af o Make rt_check() function more strict:
- rt0 passed to rt_check() must not be NULL, assert this.
  - rt returned by rt_check() must be valid locked rtentry,
    if no error occured.
o Modify callers, so that they never pass NULL rt0
  to rt_check().

Reviewed by:	sam, ume (nd6.c)
2005-08-11 08:14:53 +00:00
rwatson
76ad033815 For each interface flag, indicate whether or not it is owned by the
device driver, owned by the network stack, or initialized by the device
driver before attach and read-only from then on.

Not all device drivers and network stack components currently follow
these rules, especially with respect to IFF_UP, and a few exceptions
with IFF_ALLMULTI.

MFC after:	7 days
2005-08-09 12:56:20 +00:00
rwatson
5d770a09e8 Propagate rename of IFF_OACTIVE and IFF_RUNNING to IFF_DRV_OACTIVE and
IFF_DRV_RUNNING, as well as the move from ifnet.if_flags to
ifnet.if_drv_flags.  Device drivers are now responsible for
synchronizing access to these flags, as they are in if_drv_flags.  This
helps prevent races between the network stack and device driver in
maintaining the interface flags field.

Many __FreeBSD__ and __FreeBSD_version checks maintained and continued;
some less so.

Reviewed by:	pjd, bz
MFC after:	7 days
2005-08-09 10:20:02 +00:00
rwatson
74759aaa78 Rename IFF_RUNNING to IFF_DRV_RUNNING, IFF_OACTIVE to IFF_DRV_OACTIVE,
and move both flags from ifnet.if_flags to ifnet.if_drv_flags, making
and documenting the locking of these flags the responsibility of the
device driver, not the network stack.  The flags for these two fields
will be mutually exclusive so that they can be exposed to user space as
though they were stored in the same variable.

Provide #defines to provide the old names #ifndef _KERNEL, so that user
applications (such as ifconfig) can use the old flag names.  Using the
old names in a device driver will result in a compile error in order to
help device driver writers adopt the new model.

When exposing the interface flags to user space, via interface ioctls
or routing sockets, or the two fields together.  Since the driver flags
cannot currently be set for user space, no new logic is currently
required to handle this case.

Add some assertions that general purpose network stack routines, such
as if_setflags(), are not improperly used on driver-owned flags.

With this change, a large number of very minor network stack races are
closed, subject to correct device driver locking.  Most were likely
never triggered.

Driver sweep to follow; many thanks to pjd and bz for the line-by-line
review they gave this patch.

Reviewed by:	pjd, bz
MFC after:	7 days
2005-08-09 10:16:17 +00:00
glebius
f2edd835a7 In preparation for fixing races in ARP (and probably in other
L2/L3 mappings) make rt_check() return a locked rtentry.
2005-08-09 08:39:56 +00:00
thompsa
56add66791 Use m_copypacket() which is an optimization of the common case
m_copym(m, 0, M_COPYALL, how).

This is required for strict alignment architectures where we align the IP
header in the input path but m_copym() will create an unaligned copy in
bridge_broadcast(). m_copypacket() preserves alignment of the first mbuf.

Noticed by:	Petri Simolin
Approved by:	mlaier (mentor)
MFC after:	3 days
2005-08-08 22:21:55 +00:00
rwatson
daa1c89f45 Merge the dev_clone and dev_clone_cred event handlers into a single
event handler, dev_clone, which accepts a credential argument.
Implementors of the event can ignore it if they're not interested,
and most do.  This avoids having multiple event handler types and
fall-back/precedence logic in devfs.

This changes the kernel API for /dev cloning, and may affect third
party packages containg cloning kernel modules.

Requested by:	phk
MFC after:	3 days
2005-08-08 19:55:32 +00:00
sam
2269fa4b40 destroy lock _before_ free'ing the structure it resides in 2005-08-06 18:42:01 +00:00
jhb
a1fc2e1330 Initialize the if_addr mutex in if_alloc() rather than waiting until
if_attach().  This allows ethernet drivers to use it in their routines
to program their MAC filters before ether_ifattach() is called (de(4) is
one such driver).  Also, the if_addr mutex is destroyed in if_free()
rather than if_detach(), so there was another potential bug in that a
driver that failed during attach and called if_free() without having
called ether_ifattach() would have tried to destroy an uninitialized mutex.

Reported by:	Holm Tiffe holm at freibergnet dot de
Discussed with:	rwatson
2005-08-04 14:39:47 +00:00
rwatson
127682bc8c Protect link layer network interface multicast address list manipulation
using ifp->if_addr_mtx:

- Initialize if_addr_mtx when ifnet is initialized.

- Destroy if_addr_mtx when ifnet is torn down.

- Rename ifmaof_ifpforaddr() to if_findmulti(); assert if_addr_mtx.
  Staticize.

- Extract ifmultiaddr allocation and initialization into if_allocmulti();
  accept a 'mflags' argument to indicate whether or not sleeping is
  permitted.  This centralizes error handling and address duplication.

- Extract ifmultiaddr tear-down and deallocation in if_freemulti().

- Re-structure if_addmulti() to hold if_addr_mtx around manipulation of
  the ifnet multicast address list and reference count manipulation.
  Make use of non-sleeping allocations.  Annotate the fact that we only
  generate routing socket events for explicit address addition, not
  implicit link layer address addition.

- Re-structure if_delmulti() to hold if_addr_mtx around manipulation of
  the ifnet multicast address list and reference count manipulation.
  Annotate the lack of a routing socket event for implicit link layer
  address removal.

- De-spl all and sundry.

Problem reported by:	Ed Maste <emaste at phaedrus dot sandvine dot ca>
MFC after:		1 week
2005-08-02 23:23:26 +00:00
rwatson
f859ef367b When allocating link layer ifnet address list entries in
ifp->if_resolvemulti(), do so with M_NOWAIT rather than M_WAITOK, so
that a mutex can be held over the call.  In the FDDI code, add a
missing M_ZERO.  Consumers are already aware that if_resolvemulti()
can fail.

MFC after:	1 week
2005-08-02 17:52:52 +00:00
rwatson
a3335c93b9 Add if_addr_mtx to struct ifnet, a mutex to protect ifnet-related address
lists.  Add accessor macros.

This changes the size of struct ifnet, but ideally, all ifnet consumers
are now using if_alloc() to allocate these structures rather than
embedding them into device driver softc's, so this won't modify the
network device driver ABI.

MFC after:	1 week
2005-08-02 17:43:35 +00:00
bz
6860ef65eb Add support for IPv6 over GRE [1]. PR kern/80340 includes the
FreeBSD specific ip_newid() changes NetBSD does not have.
Correct handling of non AF_INET packets passed to bpf [2].

PR:		kern/80340[1], NetBSD PRs 29150[1], 30844[2]
Obtained from:	NetBSD ip_gre.c rev. 1.34,1.35, if_gre.c rev. 1.56
Submitted by:	Gert Doering <gert at greenie.muc.de>[2]
MFC after:	4 days
2005-08-01 08:14:21 +00:00
csjp
90616e3ddf Rather than hold a mutex over calls to SYSCTL_OUT allocate a
temporary buffer then pass the array to user-space once we have
dropped the lock.

While we are here, drop an assertion which could result in a
kernel panic under certain race conditions.

Pointed out by:	rwatson
2005-07-26 17:21:56 +00:00
ume
da2cf62b28 scope cleanup. with this change
- most of the kernel code will not care about the actual encoding of
  scope zone IDs and won't touch "s6_addr16[1]" directly.
- similarly, most of the kernel code will not care about link-local
  scoped addresses as a special case.
- scope boundary check will be stricter.  For example, the current
  *BSD code allows a packet with src=::1 and dst=(some global IPv6
  address) to be sent outside of the node, if the application do:
    s = socket(AF_INET6);
    bind(s, "::1");
    sendto(s, some_global_IPv6_addr);
  This is clearly wrong, since ::1 is only meaningful within a single
  node, but the current implementation of the *BSD kernel cannot
  reject this attempt.

Submitted by:	JINMEI Tatuya <jinmei__at__isl.rdc.toshiba.co.jp>
Obtained from:	KAME
2005-07-25 12:31:43 +00:00
thompsa
b8029d769b We check that all the member interfaces have the same MTU on attach to the
bridge but the interface can still be changed afterwards.

This falls under the 'dont do that' category but log an warning when INVARIANTS
is defined.

Approved by:	mlaier (mentor)
MFC after:	3 days
2005-07-25 02:22:37 +00:00
csjp
b29c8f5d64 Introduce new sysctl variable: net.bpf.stats. This sysctl variable can
be used to pass statistics regarding dropped, matched and received
packet counts from the kernel to user-space. While we are here
introduce a new counter for filtered or matched packets. We currently
keep track of packets received or dropped by the bpf device, but not
how many packets actually matched the bpf filter.

-Introduce net.bpf.stats sysctl OID
-Move sysctl variables after the function prototypes so we can
 reference bpf_stats_sysctl(9) without build errors.
-Introduce bpf descriptor counter which is used mainly for sizing
 of the xbpf_d array.
-Introduce a xbpf_d structure which will act as an external
 representation of the bpf_d structure.
-Add a the following members to the bpfd structure:

	bd_fcount	- Number of packets which matched bpf filter
	bd_pid		- PID which opened the bpf device
	bd_pcomm	- Process name which opened the device.

It should be noted that it's possible that the process which opened
the device could be long gone at the time of stats collection. An
example might be a process that opens the bpf device forks then exits
leaving the child process with the bpf fd.

Reviewed by:	mdodd
2005-07-24 17:21:17 +00:00
rwatson
2a95094158 Allocate one of the spare ifnet integer fields to hold if_drv_flags,
which in the future will hold IFF_OACTIVE and IFF_RUNNING, and have
its access synchronized by the device driver rather than the
protocol stack.  This will avoid potential races in the management
of flags in if_flags.

Discussed with:	various (scottl, jhb, ...)
MFC after:	1 week
2005-07-21 22:01:06 +00:00
phk
a2c6f4658e Add some KASSERTS to catch null pointers. 2005-07-21 09:00:51 +00:00
thompsa
7762c9b44d Clear the PROMISC flag from the vlan interface when we remove a member. We
checked for IFT_L2VLAN in bridge_ioctl_add() but not bridge_delete_member().

Approved by:	mlaier (mentor)
2005-07-20 19:42:51 +00:00
rwatson
e648c739b2 In multicast routines:
Compare pointers with NULL rather than treating them as booleans.

Compare pointers with NULL rather than 0 to make it more clear
they are pointers.

Assign pointers value of NULL rather than 0 to make it more clear
they are pointers.

MFC after:	3 days
2005-07-19 10:12:58 +00:00
rwatson
d19a0a8dd9 Rename equal() macro to sa_equal(), which matches the definitions
of sa_equal() in other files, and makes it more clear what equal()
is comparing.

MFC after:	3 days
2005-07-19 10:03:47 +00:00
rwatson
e11e852d54 Lock down netnatm and mark as MPSAFE:
- Introduce a subsystem mutex, natm_mtx, manipulated with accessor macros
  NATM_LOCK_INIT(), NATM_LOCK(), NATM_UNLOCK(), NATM_LOCK_ASSERT().  It
  protects the consistency of pcb-related data structures.  Finer grained
  locking is possible, but should be done in the context of specific
  measurements (as very little work is done in netnatm -- most is in the
  ATM device driver or socket layer, so there's probably not much
  contention).

- Remove GIANT_REQUIRED, mark as NETISR_MPSAFE, remove
  NET_NEEDS_GIANT("netnatm").

- Conditionally acquire Giant when entering network interfaces for
  ifp->if_ioctl() using IFF_LOCKGIANT(ifp)/IFF_UNLOCKGIANT(ifp) in order
  to coexist with non-MPSAFE atm ifnet drivers..

- De-spl.

MFC after:	2 weeks
Reviewed by:	harti, bms (various versions)
2005-07-18 16:55:46 +00:00
gnn
dffbb40d0a Fix for PR 82974. We were not checking that the route looked up in
the case of an RTM_CHANGE was specific, i.e. that it matched completely.  This
led to a route change of a non-existent route changing the default route
as the radix code would simply back track to that point and hand that
route back to the routing socket code.

PR: 82974
Reviewed by: Tai-hwa Liang <avatar@mmlab.cse.yzu.edu.tw>
             Ben Kaduk <minimarmot@gmail.com>
             Bjoern A. Zeeb <bzeeb-lists@lists.zabbadoz.net>
Obtained from:	OpenBSD with modifications.
MFC after: 2 weeks
2005-07-15 09:18:34 +00:00
mlaier
0fcf9ba9f3 Move eventhandler for 'ifnet_departure_event' at the end of the progress.
Some of the (IPv6) cleanup functions send packets to inform peers of the
departure.  These packets confused users of ifnet_departure_event (pf at the
moment).

PR:		kern/80627
Tested by:	Divacky Roman
MFC after:	1 week
2005-07-14 20:26:43 +00:00
yar
9a76361325 MFp4:
- Introduce a helper function if_setflag() containing the code common
  to ifpromisc() and if_allmulti() instead of duplicating the code poorly,
  with different bugs.
- Call ifp->if_ioctl() in a consistent way: always use more compatible C
  syntax and check whether ifp->if_ioctl is not NULL prior to the call.

MFC after:	1 month
2005-07-14 13:56:51 +00:00
thompsa
a3d1367c6d Previously the bridge MTU was set to ETHERMTU and could not be changed. Since
we can only bridge interfaces with the same value it meant that all members had
to be set at ETHERMTU as well.

Allow the first member to be added to define the MTU for the bridge, the check
still applies to all additional members.

Print an informative message if the MTU is incorrect [1]

Requested by:	Niki Denev [1]
Approved by:	mlaier (mentor)
MFC after:	3 days
2005-07-13 20:40:19 +00:00
sam
b82cb32ca3 additions from libpcap 0.9.1 release
Approved by:	re (scottl)
2005-07-11 03:16:23 +00:00
thompsa
23f56c61a9 - Previously when broadcasting to N number of interfaces we would run pfil
hooks for each outgoing interface but also run pfil hooks _N times_ on the
  bridge interface. This is changed so pfil hooks are run once for the bridge
  interface (bridge0) and then only on the outgoing interfaces in the broadcast
  loop.

- Simplify bridge_enqueue() by moving bridge_pfil() to the callers.

- Check (inet6_pfil_hook.ph_busy_count >= 0), it may be possible to have a
  packet filter hooked for only ipv6 but we were only checking if ipv4 hooks
  were busy.

- Minor optimisation for null mbuf check after bridge_pfil(), move it into the
  if-block as it couldnt possibly be null outside.

Prodded by:	mlaier
Approved by:	re (scottl), mlaier (mentor)
2005-07-06 01:24:45 +00:00