> Fix a panic in softclock() if the interface is destroyed with a bpf consumer
> attached.
>
> This is caused by bpf_detachd clearing IFF_PROMISC on the interface which does
> a SIOCSIFFLAGS ioctl. The problem here is that while the interface has been
> stopped, IFF_UP has not been cleared, so IFF_UP != IFF_DRV_RUNNING. This causes
> the ioctl function to init() the interface, which resets the callouts.
>
> The destroy then completes and frees the softc but softclock will panic on a
> dead callout pointer.
>
> Ensure ifp->if_flags matches reality by clearing IFF_UP when we destroy.
Approved by: re (kensmith), mlaier (mentor)
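Illustration only: a minimal user-space sketch of the flag mismatch described above, not the if_bridge code. The toy_* names, the struct layout and the flag values are assumptions made for the example; it only models "marked up but not running causes a re-init" and how clearing IFF_UP on destroy avoids that.

```c
#include <assert.h>
#include <stdbool.h>

/* Illustrative values; the names mirror the kernel flags, the model does not. */
#define TOY_IFF_UP          0x1
#define TOY_IFF_DRV_RUNNING 0x40

struct toy_ifnet {
	int if_flags;      /* stack-owned flags */
	int if_drv_flags;  /* driver-owned flags */
	int init_calls;    /* times init() re-armed the (hypothetical) callouts */
};

/* Stop the interface: the driver is no longer running. */
static void
toy_stop(struct toy_ifnet *ifp)
{
	ifp->if_drv_flags &= ~TOY_IFF_DRV_RUNNING;
}

/* Model of a SIOCSIFFLAGS handler: up but not running means init(). */
static void
toy_ioctl_setflags(struct toy_ifnet *ifp)
{
	bool up = (ifp->if_flags & TOY_IFF_UP) != 0;
	bool running = (ifp->if_drv_flags & TOY_IFF_DRV_RUNNING) != 0;

	if (up && !running) {
		ifp->if_drv_flags |= TOY_IFF_DRV_RUNNING;
		ifp->init_calls++;
	}
}

/*
 * Destroy path. clear_up selects the old behaviour (false) or the fixed
 * one (true). Returns how many times init() ran during destroy; anything
 * non-zero means callouts were re-armed on a softc about to be freed.
 */
static int
toy_destroy(struct toy_ifnet *ifp, bool clear_up)
{
	int before = ifp->init_calls;

	toy_stop(ifp);
	if (clear_up)
		ifp->if_flags &= ~TOY_IFF_UP;
	/* Stands in for the ioctl issued when IFF_PROMISC is cleared. */
	toy_ioctl_setflags(ifp);
	return (ifp->init_calls - before);
}
```

With clear_up false the sketch re-initializes once during destroy; with clear_up true it does not.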
> Mark the callouts as MPSAFE as if_bridge has been giant-free since day 1.
>
> Use the SMP friendly callout_init_mtx() while we are here.
>
Approved by: re (kensmith), mlaier (mentor)
Don't loop back packets that have been routed by pf. This fixes an
endless loop where the same packet is sent over and over again.
Obtained from: OpenBSD
Reported by: Sergey Lapin
Tested by: Sergey Lapin
Approved by: re (scottl)
De-spl parts of the routing socket code now generally protected
through locking; leave some spl references around code where there
are open questions about global variable references. Also, add
an XXX regarding locking in sysctl.
Approved by: re (scottl)
> The mtu check in bridge_enqueue is bogus as the maximum Ethernet frame is
> actually 1514, so comparing the mbuf length which includes the Ethernet
> header to the interface MTU is wrong.
>
> The check was a little over the top so just remove it.
Approved by: re (scottl), mlaier (mentor)
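Illustration only: the arithmetic behind the entry above, as a small sketch. The toy_* names are hypothetical. The commit removed the check outright; the payload-only variant is shown here just to make the 1500 + 14 = 1514 relationship concrete.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define TOY_ETHER_HDR_LEN 14    /* 6 dst + 6 src + 2 ethertype */
#define TOY_ETHERMTU      1500  /* payload bytes, header not included */

/* The bogus style of check: whole frame length compared to the MTU. */
static bool
toy_bogus_check(size_t frame_len, size_t mtu)
{
	return (frame_len <= mtu);
}

/* Payload-only comparison: subtract the Ethernet header first. */
static bool
toy_payload_check(size_t frame_len, size_t mtu)
{
	return (frame_len >= TOY_ETHER_HDR_LEN &&
	    frame_len - TOY_ETHER_HDR_LEN <= mtu);
}

/* A full-sized frame is 1514 bytes and only the second check accepts it. */
static void
toy_mtu_demo(void)
{
	assert(!toy_bogus_check(1514, TOY_ETHERMTU));
	assert(toy_payload_check(1514, TOY_ETHERMTU));
}
```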
For each interface flag, indicate whether or not it is owned by the
device driver, owned by the network stack, or initialized by the device
driver before attach and read-only from then on.
Not all device drivers and network stack components currently follow
these rules, especially with respect to IFF_UP, and a few exceptions
with IFF_ALLMULTI.
Approved by: re (scottl)
to RELENG_6:
Rename IFF_RUNNING to IFF_DRV_RUNNING, IFF_OACTIVE to IFF_DRV_OACTIVE,
and move both flags from ifnet.if_flags to ifnet.if_drv_flags, making
and documenting the locking of these flags the responsibility of the
device driver, not the network stack. The flags for these two fields
will be mutually exclusive so that they can be exposed to user space as
though they were stored in the same variable.
Provide #defines to provide the old names #ifndef _KERNEL, so that user
applications (such as ifconfig) can use the old flag names. Using the
old names in a device driver will result in a compile error in order to
help device driver writers adopt the new model.
When exposing the interface flags to user space, via interface ioctls
or routing sockets, OR the two fields together. Since the driver flags
cannot currently be set from user space, no new logic is currently
required to handle this case.
Add some assertions that general purpose network stack routines, such
as if_setflags(), are not improperly used on driver-owned flags.
With this change, a large number of very minor network stack races are
closed, subject to correct device driver locking. Most were likely
never triggered.
Driver sweep to follow; many thanks to pjd and bz for the line-by-line
review they gave this patch.
Reviewed by: pjd, bz
Approved by: re (scottl)
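Illustration only: a minimal sketch of why the two flag fields must use disjoint bits if they are to be presented to user space as one value. The toy_* names and the struct are assumptions for the example, and the flag values are illustrative; the asserts stand in for the kind of assertions the entry describes.

```c
#include <assert.h>

#define TOY_IFF_UP          0x1
#define TOY_IFF_DRV_RUNNING 0x40
#define TOY_IFF_DRV_OACTIVE 0x400
#define TOY_DRV_MASK        (TOY_IFF_DRV_RUNNING | TOY_IFF_DRV_OACTIVE)

struct toy_ifnet {
	int if_flags;      /* owned by the network stack */
	int if_drv_flags;  /* owned (and locked) by the device driver */
};

/* Combined view for user space: valid only because the bit sets are disjoint. */
static int
toy_user_flags(const struct toy_ifnet *ifp)
{
	assert((ifp->if_flags & TOY_DRV_MASK) == 0);
	assert((ifp->if_drv_flags & ~TOY_DRV_MASK) == 0);
	return (ifp->if_flags | ifp->if_drv_flags);
}

/* Stack-side setter that refuses to touch driver-owned bits. */
static void
toy_stack_setflags(struct toy_ifnet *ifp, int flag, int on)
{
	assert((flag & TOY_DRV_MASK) == 0);
	if (on)
		ifp->if_flags |= flag;
	else
		ifp->if_flags &= ~flag;
}
```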
Lock down netnatm and mark as MPSAFE:
- Introduce a subsystem mutex, natm_mtx, manipulated with accessor macros
NATM_LOCK_INIT(), NATM_LOCK(), NATM_UNLOCK(), NATM_LOCK_ASSERT(). It
protects the consistency of pcb-related data structures. Finer grained
locking is possible, but should be done in the context of specific
measurements (as very little work is done in netnatm -- most is in the
ATM device driver or socket layer, so there's probably not much
contention).
- Remove GIANT_REQUIRED, mark as NETISR_MPSAFE, remove
NET_NEEDS_GIANT("netnatm").
- Conditionally acquire Giant when entering network interfaces for
ifp->if_ioctl() using IFF_LOCKGIANT(ifp)/IFF_UNLOCKGIANT(ifp) in order
to coexist with non-MPSAFE atm ifnet drivers.
- De-spl.
Reviewed by: harti, bms (various versions)
Approved by: re (hrs)
if_iso88025subr.c:1.68 from HEAD to RELENG_6:
When allocating link layer ifnet address list entries in
ifp->if_resolvemulti(), do so with M_NOWAIT rather than M_WAITOK, so
that a mutex can be held over the call. In the FDDI code, add a
missing M_ZERO. Consumers are already aware that if_resolvemulti()
can fail.
Approved by: re (scottl)
- Rename ifmaof_ifpforaddr() to if_findmulti(); assert if_addr_mtx.
Staticize.
Problem reported by: Ed Maste <emaste at phaedrus dot sandvine dot ca>
Approved by: re (scottl)
correct nits in the addition of if_addr_mtx:
if.c:1.240:
Initialize the if_addr mutex in if_alloc() rather than waiting until
if_attach(). This allows ethernet drivers to use it in their routines
to program their MAC filters before ether_ifattach() is called (de(4) is
one such driver). Also, the if_addr mutex is destroyed in if_free()
rather than if_detach(), so there was another potential bug in that a
driver that failed during attach and called if_free() without having
called ether_ifattach() would have tried to destroy an uninitialized
mutex.
Reported by: Holm Tiffe holm at freibergnet dot de
Discussed with: rwatson
if.c:1.241:
destroy lock _before_ free'ing the structure it resides in
if.c:1.242:
- Move IF_ADDR_LOCK_DESTROY(ifp) from if_free to if_free_type.
- Add a note that additions should be made to if_free_type and not
if_free to help avoid this in the future.
This apparently fixes a use after free in if_bridge and may fix bugs
in other direct if_free_type consumers.
Reported by: thompsa
Approved by: re (hrs)
Protect link layer network interface multicast address list manipulation
using ifp->if_addr_mtx:
- Initialize if_addr_mtx when ifnet is initialized.
- Destroy if_addr_mtx when ifnet is torn down.
- Rename ifmaof_ifpforaddr() to if_findmulti(); assert if_addr_mtx.
Staticize.
- Extract ifmultiaddr allocation and initialization into if_allocmulti();
accept a 'mflags' argument to indicate whether or not sleeping is
permitted. This centralizes error handling and address duplication.
- Extract ifmultiaddr tear-down and deallocation in if_freemulti().
- Re-structure if_addmulti() to hold if_addr_mtx around manipulation of
the ifnet multicast address list and reference count manipulation.
Make use of non-sleeping allocations. Annotate the fact that we only
generate routing socket events for explicit address addition, not
implicit link layer address addition.
- Re-structure if_delmulti() to hold if_addr_mtx around manipulation of
the ifnet multicast address list and reference count manipulation.
Annotate the lack of a routing socket event for implicit link layer
address removal.
- De-spl all and sundry.
Problem reported by: Ed Maste <emaste at phaedrus dot sandvine dot ca>
Approved by: re (hrs)
- Introduce a helper function if_setflag() containing the code common
to ifpromisc() and if_allmulti() instead of duplicating the code poorly,
with different bugs.
- Call ifp->if_ioctl() in a consistent way: always use more compatible C
syntax and check whether ifp->if_ioctl is not NULL prior to the call.
Discussed with: yar
Approved by: re (scottl)
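Illustration only: a sketch of the shape such a shared helper could take, with a reference count per flag and a NULL check before the driver call. It is not the committed if_setflag(); the toy_* names, the struct, the error values and the stub driver ioctl are all assumptions for the example.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

#define TOY_IFF_PROMISC  0x100  /* illustrative values */
#define TOY_IFF_ALLMULTI 0x200

struct toy_ifnet {
	int if_flags;
	int if_pcount;   /* outstanding promiscuous-mode requests */
	int if_amcount;  /* outstanding all-multicast requests */
	int (*if_ioctl)(struct toy_ifnet *, int);  /* may be NULL */
};

static int toy_ioctl_calls;  /* counts calls into the stub driver */

/* Stub driver ioctl: accepts any flag change. */
static int
toy_driver_ioctl(struct toy_ifnet *ifp, int flags)
{
	(void)ifp;
	(void)flags;
	toy_ioctl_calls++;
	return (0);
}

/* Common code: only the 0 <-> 1 refcount transitions reach the driver. */
static int
toy_setflag(struct toy_ifnet *ifp, int flag, int *refcount, int onswitch)
{
	int oldcount = *refcount;
	int oldflags = ifp->if_flags;
	int error;

	if (!onswitch && oldcount == 0)
		return (EINVAL);          /* nothing to release */
	*refcount += onswitch ? 1 : -1;
	if ((oldcount == 0) == (*refcount == 0))
		return (0);               /* no on/off transition */
	if (ifp->if_ioctl == NULL) {      /* check before calling */
		*refcount = oldcount;
		return (ENODEV);
	}
	if (*refcount > 0)
		ifp->if_flags |= flag;
	else
		ifp->if_flags &= ~flag;
	error = (*ifp->if_ioctl)(ifp, ifp->if_flags);
	if (error != 0) {                 /* roll back on driver failure */
		*refcount = oldcount;
		ifp->if_flags = oldflags;
	}
	return (error);
}

static int
toy_promisc(struct toy_ifnet *ifp, int on)
{
	return (toy_setflag(ifp, TOY_IFF_PROMISC, &ifp->if_pcount, on));
}

static int
toy_allmulti(struct toy_ifnet *ifp, int on)
{
	return (toy_setflag(ifp, TOY_IFF_ALLMULTI, &ifp->if_amcount, on));
}
```

Both wrappers differ only in the flag and the counter they pass, which is the duplication the helper is meant to remove.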
Allocate one of the spare ifnet integer fields to hold if_drv_flags,
which in the future will hold IFF_OACTIVE and IFF_RUNNING, and have
its access synchronized by the device driver rather than the
protocol stack. This will avoid potential races in the management
of flags in if_flags.
Discussed with: various (scottl, jhb, ...)
Approved by: re (kensmith)
calls before an if_attach.
Partial MFC of 1.243. The change it was a response to has not been
MFCd, but the comment is relevant without it.
Approved by: re (kensmith)
Axe ppp_for_tty(). Use tty->t_lsc pointer to store sc. This
also eliminates recursive use of ppp_softc_list_mtx.
PR: kern/84686
Reviewed by: phk
Approved by: re (kensmith)
date: 2005/08/18 22:30:52; author: csjp; state: Exp; lines: +4 -2
Add missing braces around bpf_filter which were missed when I
merged the bpfstat code.
Approved by: re (kensmith)
> Ensure that we are holding the lock when initialising the bridge interface. We
> could initialise while unlocked if the bridge is not up when setting the inet
> address, ether_ioctl() would call bridge_init.
>
> Change it so bridge_init is always called unlocked and then locks before
> calling bstp_initialization().
Approved by: re (kensmith), mlaier (mentor)
In multicast routines:
Compare pointers with NULL rather than treating them as booleans.
Compare pointers with NULL rather than 0 to make it more clear
they are pointers.
Assign pointers value of NULL rather than 0 to make it more clear
they are pointers.
Approved by: re (kensmith)
Rename equal() macro to sa_equal(), which matches the definitions
of sa_equal() in other files, and makes it more clear what equal()
is comparing.
Approved by: re (kensmith)
To prevent a race between RTM_DELETE message and
arptimer() deleting stale entry, we need to lock
rtentry before unlocking radix head.
Approved by: re (kensmith)
o Make rt_check() function more strict:
- rt0 passed to rt_check() must not be NULL, assert this.
- rt returned by rt_check() must be valid locked rtentry,
if no error occurred.
o Modify callers, so that they never pass NULL rt0
to rt_check().
o Modify callers, so that they unlock rtentry.
Revisions merged:
net/if_atmsubr.c - 1.39, 1.41
net/if_fwsubr.c - 1.13, 1.15
net/if_iso88025subr.c - 1.69, 1.71
net/route.c - 1.110, 1.111
netinet6/nd6.c - 1.51-1.53
netinet/if_ether.c intentionally not touched by this merge.
Approved by: re (kensmith)
Add if_addr_mtx to struct ifnet, a mutex to protect ifnet-related address
lists. Add accessor macros.
This changes the size of struct ifnet, but ideally, all ifnet consumers
are now using if_alloc() to allocate these structures rather than
embedding them into device driver softc's, so this won't modify the
network device driver ABI.
Approved by: re (hrs)
HEAD to RELENG_6: changes to introduce a credentialed version of the
clone event handler, and then changes to merge the regular and
credentialed versions into a single interface (along with updates to
existing consumers). With this merge, 6.x and 7.x are in sync.
First batch merges devfs_devs.c:1.37, devfs_vnops.c:1.115,
kern_conf.c:1.187, tty_pty.c:1.138, mac_vfs.c:1.109, mac_biba.c:1.36,
mac_lomac.c:1.36, mac_mls.c:1.73, mac_stub.c:1.53, mac_test.c:1.61,
conf.h:1.223, mac.h:1.68, mac_policy.h:1.67 from HEAD to RELENG_6:
When devfs cloning takes place, provide access to the credential of the
process that caused the clone event to take place for the device driver
creating the device. This allows cloned device drivers to adapt the
device node based on security aspects of the process, such as the uid,
gid, and MAC label.
- Add a cred reference to struct cdev, so that when a device node is
instantiated as a vnode, the cloning credential can be exposed to
MAC.
- Add make_dev_cred(), a version of make_dev() that additionally
accepts the credential to stick in the struct cdev. Implement it and
make_dev() in terms of a back-end make_dev_credv().
- Add a new event handler, dev_clone_cred, which can be registered to
receive the credential instead of dev_clone, if desired.
- Modify the MAC entry point mac_create_devfs_device() to accept an
optional credential pointer (may be NULL), so that MAC policies can
inspect and act on the label or other elements of the credential
when initializing the skeleton device protections.
- Modify tty_pty.c to register clone_dev_cred and invoke make_dev_cred(),
so that the pty clone credential is exposed to the MAC Framework.
While currently primarily focussed on MAC policies, this change is also
a prerequisite for changes to allow ptys to be instantiated with the UID
of the process looking up the pty. This requires further changes to the
pty driver -- in particular, to immediately recycle pty nodes on last
close so that the credential-related state can be recreated on next
lookup.
Submitted by: Andrew Reisse <andrew.reisse@sparta.com>
Obtained from: TrustedBSD Project
Sponsored by: SPAWAR, SPARTA
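Illustration only: the "two front ends over one va_list back end" pattern mentioned for make_dev(), make_dev_cred() and make_dev_credv(), as a user-space sketch. The toy_* types and signatures are assumptions for the example and are much simpler than the kernel interfaces.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>
#include <string.h>

struct toy_cred { unsigned cr_uid; };

struct toy_cdev {
	char si_name[32];
	const struct toy_cred *si_cred;  /* NULL when no credential was given */
};

/* Back end: takes a va_list so both front ends can share it. */
static void
toy_make_dev_credv(struct toy_cdev *dev, const struct toy_cred *cr,
    const char *fmt, va_list ap)
{
	assert(dev != NULL && fmt != NULL);
	memset(dev, 0, sizeof(*dev));
	vsnprintf(dev->si_name, sizeof(dev->si_name), fmt, ap);
	dev->si_cred = cr;
}

/* Front end without a credential. */
static void
toy_make_dev(struct toy_cdev *dev, const char *fmt, ...)
{
	va_list ap;

	va_start(ap, fmt);
	toy_make_dev_credv(dev, NULL, fmt, ap);
	va_end(ap);
}

/* Front end that additionally records the cloning credential. */
static void
toy_make_dev_cred(struct toy_cdev *dev, const struct toy_cred *cr,
    const char *fmt, ...)
{
	va_list ap;

	va_start(ap, fmt);
	toy_make_dev_credv(dev, cr, fmt, ap);
	va_end(ap);
}
```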
Second batch merges scsi_target.c:1.68, coda_fbsd.c:1.43,
firewirereg.h:1.38, fwdev.c:1.47, nmdm.c:1.36, snp.c:1.100, dsp.c:1.82,
mixer.c:1.45, vkbd.c:1.9, devfs_vnops.c:1.117, tty_pty.c:1.139,
tty_tty.c:1.57, bpf.c:1.156, if_tap.c:1.56, if_tun.c:1.153,
smb_dev.c:1.28, conf.h:1.224 from HEAD to RELENG_6:
Merge the dev_clone and dev_clone_cred event handlers into a single
event handler, dev_clone, which accepts a credential argument.
Implementors of the event can ignore it if they're not interested,
and most do. This avoids having multiple event handler types and
fall-back/precedence logic in devfs.
This changes the kernel API for /dev cloning, and may affect third
party packages containing cloning kernel modules.
Requested by: phk
These changes modify the kernel device driver API for device cloning,
and might require minor modifications to third party device drivers that
make use of devfs cloning. It will not be merged to RELENG_5.
Approved by: re (scottl)
> Use m_copypacket() which is an optimization of the common case
> m_copym(m, 0, M_COPYALL, how).
>
> This is required for strict alignment architectures where we align the
> IP header in the input path but m_copym() will create an unaligned copy
> in bridge_broadcast(). m_copypacket() preserves alignment of the first
> mbuf.
>
> Noticed by: Petri Simolin
> Approved by: mlaier (mentor)
Approved by: re (kensmith), mlaier (mentor)
Add support for IPv6 over GRE [1]. PR kern/80340 includes the
FreeBSD specific ip_newid() changes NetBSD does not have.
Correct handling of non AF_INET packets passed to bpf [2].
PR: kern/80340[1], NetBSD PRs 29150[1], 30844[2]
Obtained from: NetBSD ip_gre.c rev. 1.34,1.35, if_gre.c rev. 1.56
Submitted by: Gert Doering <gert at greenie.muc.de>[2]
Approved by: re (kensmith)
> We check that all the member interfaces have the same MTU on attach to the
> bridge but the interface can still be changed afterwards.
>
> This falls under the 'don't do that' category but log a warning when INVARIANTS
> is defined.
>
> Approved by: mlaier (mentor)
>
> Revision Changes Path
> 1.14 +8 -0 src/sys/net/if_bridge.c
Approved by: re (kensmith), mlaier (mentor)
> Clear the PROMISC flag from the vlan interface when we remove a member. We
> checked for IFT_L2VLAN in bridge_ioctl_add() but not bridge_delete_member().
>
> Approved by: mlaier (mentor)
>
> Revision Changes Path
> 1.13 +1 -0 src/sys/net/if_bridge.c
Approved by: re (kensmith), mlaier (mentor)
Move eventhandler for 'ifnet_departure_event' to the end of the process.
Some of the (IPv6) cleanup functions send packets to inform peers of the
departure. These packets confused users of ifnet_departure_event (pf at
the moment).
PR: kern/80627
Tested by: Divacky Roman
Approved by: re (kensmith)
hooks for each outgoing interface but also run pfil hooks _N times_ on the
bridge interface. This is changed so pfil hooks are run once for the bridge
interface (bridge0) and then only on the outgoing interfaces in the broadcast
loop.
- Simplify bridge_enqueue() by moving bridge_pfil() to the callers.
- Check (inet6_pfil_hook.ph_busy_count >= 0), it may be possible to have a
packet filter hooked for only ipv6 but we were only checking if ipv4 hooks
were busy.
- Minor optimisation for null mbuf check after bridge_pfil(), move it into the
if-block as it couldn't possibly be null outside.
Prodded by: mlaier
Approved by: re (scottl), mlaier (mentor)
redundant with respect to existing mbuf copy label routines. Expose
a new mac_copy_mbuf() routine at the top end of the Framework and
use that; use the existing mpo_copy_mbuf_label() routine on the
bottom end.
Obtained from: TrustedBSD Project
Sponsored by: SPARTA, SPAWAR
Approved by: re (scottl)
packet filter. This would cause a panic on architectures that require strict
alignment such as sparc64, ia64 and ppc.
This uses the code block from if_bridge and the newly added macro
IP_HDR_ALIGNED_P().
This /might/ be a temporary measure before all NIC drivers are educated
to align the header themselves.
PR: ia64/81284
Obtained from: NetBSD (if_bridge)
Approved by: re (dwhite), mlaier (mentor)
packet filter. This would cause a panic on architectures that require strict
alignment such as sparc64 (tier1) and ia64/ppc (tier2).
This adds two new macros that check the alignment, these are compile time
dependent on __NO_STRICT_ALIGNMENT which is set for i386 and amd64 where
alignment isn't needed so the cost is avoided.
IP_HDR_ALIGNED_P()
IP6_HDR_ALIGNED_P()
Move bridge_ip_checkbasic()/bridge_ip6_checkbasic() up so that the alignment
is checked for ipfw and dummynet too.
PR: ia64/81284
Obtained from: NetBSD
Approved by: re (dwhite), mlaier (mentor)
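Illustration only: a sketch of a compile-time-selected alignment predicate and a copy-if-unaligned helper, in the spirit of the entry above. TOY_NO_STRICT_ALIGNMENT, TOY_HDR_ALIGNED_P() and toy_align_hdr() are hypothetical names for the example, not the macros added by the commit; a 4-byte alignment requirement is assumed.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/*
 * On platforms that tolerate unaligned access the test compiles away;
 * elsewhere it checks that the pointer is 4-byte aligned.
 */
#ifdef TOY_NO_STRICT_ALIGNMENT
#define TOY_HDR_ALIGNED_P(p) 1
#else
#define TOY_HDR_ALIGNED_P(p) ((((uintptr_t)(p)) & 3) == 0)
#endif

/*
 * Return a pointer through which the header can be read safely: the
 * original if already aligned, otherwise an aligned copy in scratch.
 * Returns NULL if the scratch buffer is too small.
 */
static const void *
toy_align_hdr(const void *hdr, size_t len, uint32_t *scratch,
    size_t scratch_len)
{
	assert(hdr != NULL && scratch != NULL);
	if (TOY_HDR_ALIGNED_P(hdr))
		return (hdr);
	if (len > scratch_len)
		return (NULL);
	memcpy(scratch, hdr, len);
	return (scratch);
}
```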
- Introducing the possibility of using locks different than mutexes
for the knlist locking. In order to do this, we add three arguments to
knlist_init() to specify the functions to use to lock, unlock and
check if the lock is owned. If these arguments are NULL, we assume
mtx_lock, mtx_unlock and mtx_owned, respectively.
- Using the vnode lock for the knlist locking, when doing kqueue operations
on a vnode. This way, we don't have to lock the vnode while holding a
mutex, in filt_vfsread.
Reviewed by: jmg
Approved by: re (scottl), scottl (mentor override)
Pointyhat to: ssouhlal
Will be happy: everyone
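Illustration only: a user-space sketch of a list whose lock, unlock and "is locked" operations are supplied at init time, falling back to mutex-style defaults when NULL is passed. The toy_* names and the toy lock types are assumptions for the example; this is not the kernel knlist code.

```c
#include <assert.h>
#include <stddef.h>

/* Default lock type, standing in for a mutex. */
struct toy_mtx { int locked; };

static void
toy_mtx_lock(void *arg)
{
	struct toy_mtx *m = arg;

	assert(!m->locked);
	m->locked = 1;
}

static void
toy_mtx_unlock(void *arg)
{
	struct toy_mtx *m = arg;

	assert(m->locked);
	m->locked = 0;
}

static int
toy_mtx_owned(void *arg)
{
	return (((struct toy_mtx *)arg)->locked);
}

/* Alternative lock type, standing in for a vnode lock. */
struct toy_vnode { int v_lockdepth; int v_acquired; };

static void
toy_vn_lock(void *arg)
{
	struct toy_vnode *v = arg;

	v->v_lockdepth++;
	v->v_acquired++;
}

static void
toy_vn_unlock(void *arg)
{
	struct toy_vnode *v = arg;

	assert(v->v_lockdepth > 0);
	v->v_lockdepth--;
}

static int
toy_vn_locked(void *arg)
{
	return (((struct toy_vnode *)arg)->v_lockdepth > 0);
}

struct toy_knlist {
	void (*kl_lock)(void *);
	void (*kl_unlock)(void *);
	int (*kl_locked)(void *);
	void *kl_lockarg;
	int kl_count;
};

/* NULL function arguments select the mutex-style defaults. */
static void
toy_knlist_init(struct toy_knlist *kl, void *lock, void (*lk)(void *),
    void (*unlk)(void *), int (*locked)(void *))
{
	kl->kl_lockarg = lock;
	kl->kl_lock = (lk != NULL) ? lk : toy_mtx_lock;
	kl->kl_unlock = (unlk != NULL) ? unlk : toy_mtx_unlock;
	kl->kl_locked = (locked != NULL) ? locked : toy_mtx_owned;
	kl->kl_count = 0;
}

/* List operations use whatever lock was configured. */
static void
toy_knlist_add(struct toy_knlist *kl)
{
	kl->kl_lock(kl->kl_lockarg);
	assert(kl->kl_locked(kl->kl_lockarg));
	kl->kl_count++;
	kl->kl_unlock(kl->kl_lockarg);
}
```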
a cosmetic change. m_uiotombuf() produces a packet header mbuf, while
original implementation did not. When kernel is compiled with MAC
support, headerless mbuf will cause panic.
Reported by: Alexander Nikiforenko <asn rambler-co.ru>
Approved by: re (scottl)
MFC After: 2 weeks
route itself.
It fixes a bug where an IPv4 route for example has an IPv6 gateway
specified:
route add 10.1.1.1 -inet6 fe80::1%fxp0
Destination Gateway Flags Refs Use Netif Expire
10.1.1.1 fe80::1%fxp0 UGHS 0 0 fxp0
The fix rejects these illegal combinations:
route: writing to routing socket: Invalid argument
add host 10.1.1.1: gateway fe80::1%fxp0: Invalid argument
Reviewed by: KAME jinmei@isl.rdc.toshiba.co.jp
Reviewed by: andre (mentor)
Approved by: re
MFC after: 5
pointer to a softc which is no longer valid since the ifnet struct was split
out from the softc.
Approved by: mlaier (mentor)
Approved by: re (blanket)
a DLT_NULL interface. In particular:
1) Consistently use type u_int32_t for the header of a
DLT_NULL device - it continues to represent the address
family as always.
2) In the DLT_NULL case get bpf_movein to store the u_int32_t
in a sockaddr rather than in the mbuf, to be consistent
with all the DLT types.
3) Consequently fix a bug in bpf_movein/bpfwrite which
only permitted packets up to 4 bytes less than the MTU
to be written.
4) Fix all DLT_NULL devices to have the code required to
allow writing to their bpf devices.
5) Move the code to allow writing to if_lo from if_simloop
to looutput, because it only applies to DLT_NULL devices
but was being applied to other devices that use if_simloop
possibly incorrectly.
PR: 82157
Submitted by: Matthew Luckie <mjl@luckie.org.nz>
Approved by: re (scottl)
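Illustration only: a sketch of splitting a written DLT_NULL packet into its 4-byte address-family header and payload, with the header not counted against the MTU (the off-by-4 limit in item 3). toy_null_movein() and its signature are assumptions for the example, not the bpf_movein() code.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define TOY_NULL_HDRLEN sizeof(uint32_t)  /* address family, host byte order */

/*
 * Split buf into address family and payload. Returns the payload length,
 * or -1 if the buffer is too short or the payload exceeds the MTU.
 */
static long
toy_null_movein(const unsigned char *buf, size_t len, size_t mtu,
    uint32_t *af, const unsigned char **payload)
{
	assert(buf != NULL && af != NULL && payload != NULL);
	if (len < TOY_NULL_HDRLEN)
		return (-1);
	/* Only the payload is compared to the MTU, not the 4-byte header. */
	if (len - TOY_NULL_HDRLEN > mtu)
		return (-1);
	memcpy(af, buf, TOY_NULL_HDRLEN);  /* buf may be unaligned */
	*payload = buf + TOY_NULL_HDRLEN;
	return ((long)(len - TOY_NULL_HDRLEN));
}
```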
fails.
Move detaching the ifnet from the ifindex_table into if_free so we can
both keep the sanity checks and actually delete the ifnets. [0]
Reported by: gallatin [0]
Approved by: re (blanket)