- In ifc_name2unit(), disallow leading zeroes in a unit.
Exploit: ifconfig lo01 create
- In ifc_name2unit(), properly handle overflows. Otherwise,
either of two local panic()'s can occur, either because
no interface with such a name could be found after it was
successfully created, or because the code will bogusly
assume that it's a wildcard (unit < 0 due to overflow).
Exploit: ifconfig lo<overflowed_integer> create
- Previous revision made the following sequence trigger
a KASSERT() failure in queue(3):
Exploit: ifconfig lo0 destroy; ifconfig lo0 destroy
This is because IFC_IFLIST_REMOVE() is always called
before ifc->ifc_destroy() has been run, not accounting
for the fact that the latter can fail and leave the
interface operating (like is the case for "lo0").
So we ended up calling LIST_REMOVE() twice. We cannot
defer IFC_IFLIST_REMOVE() until after a call to
ifc->ifc_destroy() because the ifnet may have been
removed and its memory has been freed, so recover from
this by re-inserting the ifnet in the cloned interfaces
list if ifc->ifc_destroy() indicates a failure.
If the packet is rejected from pfil(9) then continue the loop rather than
returning, this means that we can still try to send it out the remaining
interfaces but more importantly the mbuf is freed and refcount decremented on
exit.
rather than in ifindex_table[]; all (except one) accesses are
through ifp anyway. IF_LLADDR() works faster, and all (except
one) ifaddr_byindex() users were converted to use ifp->if_addr.
- Stop storing a (pointer to) Ethernet address in "struct arpcom",
and drop the IFP2ENADDR() macro; all users have been converted
to use IF_LLADDR() instead.
copy of Ethernet address.
- Change iso88025_ifattach() and fddi_ifattach() to accept MAC
address as an argument, similar to ether_ifattach(), to make
this work.
softc lists and associated mutex are now unused so these have been removed.
Calling if_clone_detach() will now destroy all the cloned interfaces for the
driver and in most cases is all thats needed to unload.
Idea by: brooks
Reviewed by: brooks
Having an additional MT_HEADER mbuf type is superfluous and redundant
as nothing depends on it. It only adds a layer of confusion. The
distinction between header mbuf's and data mbuf's is solely done
through the m->m_flags M_PKTHDR flag.
Non-native code is not changed in this commit. For compatibility
MT_HEADER is mapped to MT_DATA.
Sponsored by: TCP/IP Optimization Fundraise 2005
promisc flag from the member interface, this is a no-op anyway since the
interface is disappearing. The driver may have already released
its resources such as miibus and this is likely to panic the kernel.
Submitted and tested by: Wojciech A. Koszek
MFC after: 2 weeks
OR the flags bits with the driver managed status flags. This fixes an
issue where RUNNING flags would not be reported to processes, which
conflicts with the flags information provided by ifconfig(8).
that the following funtions are not used, wrap in '#ifdef noused' for the
moment.
bstp_enable_change_detection
bstp_disable_change_detection
bstp_set_bridge_priority
bstp_set_port_priority
bstp_set_path_cost
- move the function pointer definitions to if_bridgevar.h
- move most of the logic to the new BRIDGE_INPUT and BRIDGE_OUTPUT macros
- remove unneeded functions from if_bridgevar.h and sort a little.
Use bridge_ifdetach() to notify the bridge that a member has been detached. The
bridge can then remove it from its interface list and not try to send out via a
dead pointer.
cloner. This ensures that ifc->ifc_units is not prematurely freed in
if_clone_detach() before the clones are destroyed, resulting in memory modified
after free. This could be triggered with if_vlan.
Assert that all cloners have been destroyed when freeing the memory.
Change all simple cloners to destroy their clones with ifc_simple_destroy() on
module unload so the reference count is properly updated. This also cleans up
the interface destroy routines and allows future optimisation.
Discussed with: brooks, pjd, -current
Reviewed by: brooks
http://lists.freebsd.org/pipermail/cvs-src/2004-October/033496.html
The same problem applies to if_bridge(4), too.
- Copy-and-paste the if_bridge(4) related block from
if_ethersubr.c to ng_ether.c
- Add XXXs, so that copy-and-paste would be noticed by
any future editors of this code.
- Also add XXXs near if_bridge(4) declarations.
Silence from: thompsa
module name to something that wouldn't conflict with
sys/dev/firewire/firewire.c.
Submitted by: Cai, Quanqing <caiquanqing at gmail dot com>
PR: kern/82727
MFC after: 3 days
opt_device_polling.h
- Include opt_device_polling.h into appropriate files.
- Embrace with HAVE_KERNEL_OPTION_HEADERS the include in the files that
can be compiled as loadable modules.
Reviewed by: bde
to the parent interface, such as IFF_PROMISC and
IFF_ALLMULTI. In addition, vlan(4) gains ability
to migrate from one parent to another w/o losing
its own flags.
PR: kern/81978
MFC after: 2 weeks
as it is done for usual promiscuous mode already. This info is important
because promiscuous mode in the hands of a malicious party can jeopardize
the whole network.
o Axe poll in trap.
o Axe IFF_POLLING flag from if_flags.
o Rework revision 1.21 (Giant removal), in such a way that
poll_mtx is not dropped during call to polling handler.
This fixes problem with idle polling.
o Make registration and deregistration from polling in a
functional way, insted of next tick/interrupt.
o Obsolete kern.polling.enable. Polling is turned on/off
with ifconfig.
Detailed kern_poll.c changes:
- Remove polling handler flags, introduced in 1.21. The are not
needed now.
- Forget and do not check if_flags, if_capenable and if_drv_flags.
- Call all registered polling handlers unconditionally.
- Do not drop poll_mtx, when entering polling handlers.
- In ether_poll() NET_LOCK_GIANT prior to locking poll_mtx.
- In netisr_poll() axe the block, where polling code asks drivers
to unregister.
- In netisr_poll() and ether_poll() do polling always, if any
handlers are present.
- In ether_poll_[de]register() remove a lot of error hiding code. Assert
that arguments are correct, instead.
- In ether_poll_[de]register() use standard return values in case of
error or success.
- Introduce poll_switch() that is a sysctl handler for kern.polling.enable.
poll_switch() goes through interface list and enabled/disables polling.
A message that kern.polling.enable is deprecated is printed.
Detailed driver changes:
- On attach driver announces IFCAP_POLLING in if_capabilities, but
not in if_capenable.
- On detach driver calls ether_poll_deregister() if polling is enabled.
- In polling handler driver obtains its lock and checks IFF_DRV_RUNNING
flag. If there is no, then unlocks and returns.
- In ioctl handler driver checks for IFCAP_POLLING flag requested to
be set or cleared. Driver first calls ether_poll_[de]register(), then
obtains driver lock and [dis/en]ables interrupts.
- In interrupt handler driver checks IFCAP_POLLING flag in if_capenable.
If present, then returns.This is important to protect from spurious
interrupts.
Reviewed by: ru, sam, jhb
replacement and has additional features which make it superior.
Discussed on: -arch
Reviewed by: thompsa
X-MFC-after: never (RELENG_6 as transition period)
copied mbuf, which keeps the IP header 32-bit aligned. This copied mbuf is
reinjected back into ether_input and off to the IP routines.
Reported and tested by: Peter van Dijk
Approved by: mlaier (mentor)
MFC after: 3 days
- Rearrange code so that in a case of failure the affected
route is not changed. Otherwise, a bogus rtentry will be
left and later rt_check() can recurse on its lock. [1]
- Remove comment about protocol cloning.
- Fix two places where rtentry mutex was recursed on, because
accessed via two different pointers, that were actually pointing
to the same rtentry in some cases. [1]
- Return EADDRINUSE instead of bogus EDQUOT, in case when gateway
uses the same route. [2]
Reported & tested by: ps, Andrej Zverev <az inec.ru> [1]
PR: kern/64090 [2]
panics, which occur when stale ifnet pointers are left in struct
moptions hung off of inpcbs:
- Add in_ifdetach(), which matches in6_ifdetach(), and allows the
protocol to perform early tear-down on the interface early in
if_detach().
- Annotate that if_detach() needs careful consideration.
- Remove calls to in_pcbpurgeif0() in the handling of SIOCDIFADDR --
this is not the place to detect interface removal! This also
removes what is basically a nasty (and now unnecessary) hack.
- Invoke in_pcbpurgeif0() from in_ifdetach(), in both raw and UDP
IPv4 sockets.
It is now possible to run the msocket_ifnet_remove regression test
using HEAD without panicking.
MFC after: 3 days
- Use __func__ consistently instead of copying function name
to message strings. Code tends to migrate around source files.
- DIAGNOSTIC is for information, INVARIANTS is for panics.
m_tag_locate(). This adds little overhead of a simple
bitwise operation in case hardware VLAN acceleration
is on, yet saves the more expensive function call if
the acceleration is off.
Reviewed by: ru, glebius
X-MFC-after: 6.0
locks were not aquired because the user buffers were not wired, thus it was
possible that that SYSCTL_OUT could sleep, causing a number of different
problems such as lock ordering issues and dead locks.
-Wire user supplied buffer to ensure SYSCTL_OUT will not sleep.
-Pickup ifnet locks to protect the list.
-Where applicable pickup address locks.
-Pickup radix node head locks.
-Remove splnet stubs
-Remove various comments about locking here, because they are no
longer needed.
It is the hope that these changes will make sysctl_rtsock MP safe.
MFC after: 3 weeks
assigned to the interface.
IPv6 auto-configuration is disabled. An IPv6 link-local address has a
link-local scope within one link, the spec is unclear for the bridge case and
it may cause scope violation.
An address can be assigned in the usual way;
ifconfig bridge0 inet6 xxxx:...
Tested by: bmah
Reviewed by: ume (netinet6)
Approved by: mlaier (mentor)
MFC after: 1 week
refresh the PID which has the descriptor open. The PID is refreshed in various
operations like ioctl(2), kevent(2) or poll(2). This produces more accurate
information about current bpf consumers. While we are here remove the bd_pcomm
member of the bpf stats structure because now that we have an accurate PID we
can lookup the via the kern.proc.pid sysctl variable. This is the trick that
NetBSD decided to use to deal with this issue.
Special care needs to be taken when MFC'ing this change, as we have made a
change to the bpf stats structure. What will end up happening is we will leave
the pcomm structure but just mark it as being un-used. This way we keep the ABI
in tact.
MFC after: 1 month
Discussed with: Rui Paulo < rpaulo at NetBSD dot org >
value from an m_tag, but also to set it. This reduces
complex code duplication and improves its readability.
Alas, we shouldn't rename the macro to VLAN_TAG_LVALUE()
globally because that would cause pain for kernel module
port maintainers and vendors using FreeBSD as their codebase.
Added a clarifying comment instead.
Discussed with: ru, glebius
X-MFC-After: 6.0-RELEASE (MFC is good just to reduce the diff)
attached.
This is caused by bpf_detachd clearing IFF_PROMISC on the interface which does
a SIOCSIFFLAGS ioctl. The problem here is that while the interface has been
stopped, IFF_UP has not been cleared so IFF_UP != IFF_DRV_RUNNING, this causes
the ioctl function to init() the interface which resets the callouts.
The destroy then completes and frees the softc but softclock will panic on a
dead callout pointer.
Ensure ifp->if_flags matches reality by clearing IFF_UP when we destroy.
Silence from: rwatson
Approved by: mlaier (mentor)
MFC after: 3 days
through locking; leave some spl references around code where there
are open questions about global variable references. Also, add
an XXX regarding locking in sysctl.
MFC after: 3 days
actually 1514, so comparing the mbuf length which includes the Ethernet header
to the interface MTU is wrong.
The check was a little over the top so just remove it.
Approved by: mlaier (mentor)
MFC after: 3 days
enhance the security of bpf(4) by further relinquishing the privilege of
the bpf(4) consumer (assuming the ioctl commands are being implemented).
Once BIOCLOCK is executed, the device becomes locked which prevents the
execution of ioctl(2) commands which can change the underly parameters of the
bpf(4) device. An example might be the setting of bpf(4) filter programs or
attaching to different network interfaces.
BIOCSETWF can be used to set write filters for outgoing packets. Currently if
a bpf(4) consumer is compromised, the bpf(4) descriptor can essentially be used
as a raw socket, regardless of consumer's UID. Write filters give users the
ability to constrain which packets can be sent through the bpf(4) descriptor.
These features are currently implemented by a couple programs which came from
OpenBSD, such as the new dhclient and pflogd.
-Modify bpf_setf(9) to accept a "cmd" parameter. This will be used to specify
whether a read or write filter is to be set.
-Add a bpf(4) filter program as a parameter to bpf_movein(9) as we will run the
filter program on the mbuf data once we move the packet in from user-space.
-Rather than execute two uiomove operations, (one for the link header and the
other for the packet data), execute one and manually copy the linker header
into the sockaddr structure via bcopy.
-Restructure bpf_setf to compensate for write filters, as well as read.
-Adjust bpf(4) stats structures to include a bd_locked member.
It should be noted that the FreeBSD and OpenBSD implementations differ a bit in
the sense that we unconditionally enforce the lock, where OpenBSD enforces it
only if the calling credential is not root.
Idea from: OpenBSD
Reviewed by: mlaier
struct ifnet most of if_findindex() become a complex no-op. Remove it
and replace it with a corrected version of the four line for loop it
devolved to plus some error handling. This should probably be replaced
with subr_unit at some point.
Switch from checking ifaddr_byindex to ifnet_byindex when looking for
empty indexes. Since we're doing this from if_alloc/if_free, we can
only be sure that ifnet_byindex will be correct. This fixes panics when
loading the ef(4) module. The panics were caused by the fact that
if_alloc was called four time before if_attach was called and thus
ifaddr_byindex was not set and the same unit was allocated again. This
in turn caused the first if_attach to fail because the ifp was not the
one in ifnet_byindex(ifp->if_index).
Reported by: "Wojciech A. Koszek" <dunstan at freebsd dot czest dot pl>
PR: kern/84987
MFC After: 1 day
- Add a note that additions should be made to if_free_type and not
if_free to help avoid this in the future.
This apparently fixes a use after free in if_bridge and may fix bugs
in other direct if_free_type consumers.
Reported by: thompsa
could initialise while unlocked if the bridge is not up when setting the inet
address, ether_ioctl() would call bridge_init.
Change it so bridge_init is always called unlocked and then locks before
calling bstp_initialization().
Reported by: Michal Mertl
Approved by: mlaier (mentor)
MFC after: 3 days
could initialise while unlocked if the bridge is not up when setting the inet
address, ether_ioctl() would call bridge_init.
Change it so bridge_init is always called unlocked and then locks before
calling bstp_initialization().
Reported by: Michal Mertl
Approved by: mlaier (mentor)
MFC after: 3 days
- rt0 passed to rt_check() must not be NULL, assert this.
- rt returned by rt_check() must be valid locked rtentry,
if no error occured.
o Modify callers, so that they never pass NULL rt0
to rt_check().
Reviewed by: sam, ume (nd6.c)
device driver, owned by the network stack, or initialized by the device
driver before attach and read-only from then on.
Not all device drivers and network stack components currently follow
these rules, especially with respect to IFF_UP, and a few exceptions
with IFF_ALLMULTI.
MFC after: 7 days
IFF_DRV_RUNNING, as well as the move from ifnet.if_flags to
ifnet.if_drv_flags. Device drivers are now responsible for
synchronizing access to these flags, as they are in if_drv_flags. This
helps prevent races between the network stack and device driver in
maintaining the interface flags field.
Many __FreeBSD__ and __FreeBSD_version checks maintained and continued;
some less so.
Reviewed by: pjd, bz
MFC after: 7 days
and move both flags from ifnet.if_flags to ifnet.if_drv_flags, making
and documenting the locking of these flags the responsibility of the
device driver, not the network stack. The flags for these two fields
will be mutually exclusive so that they can be exposed to user space as
though they were stored in the same variable.
Provide #defines to provide the old names #ifndef _KERNEL, so that user
applications (such as ifconfig) can use the old flag names. Using the
old names in a device driver will result in a compile error in order to
help device driver writers adopt the new model.
When exposing the interface flags to user space, via interface ioctls
or routing sockets, or the two fields together. Since the driver flags
cannot currently be set for user space, no new logic is currently
required to handle this case.
Add some assertions that general purpose network stack routines, such
as if_setflags(), are not improperly used on driver-owned flags.
With this change, a large number of very minor network stack races are
closed, subject to correct device driver locking. Most were likely
never triggered.
Driver sweep to follow; many thanks to pjd and bz for the line-by-line
review they gave this patch.
Reviewed by: pjd, bz
MFC after: 7 days
m_copym(m, 0, M_COPYALL, how).
This is required for strict alignment architectures where we align the IP
header in the input path but m_copym() will create an unaligned copy in
bridge_broadcast(). m_copypacket() preserves alignment of the first mbuf.
Noticed by: Petri Simolin
Approved by: mlaier (mentor)
MFC after: 3 days
event handler, dev_clone, which accepts a credential argument.
Implementors of the event can ignore it if they're not interested,
and most do. This avoids having multiple event handler types and
fall-back/precedence logic in devfs.
This changes the kernel API for /dev cloning, and may affect third
party packages containg cloning kernel modules.
Requested by: phk
MFC after: 3 days
if_attach(). This allows ethernet drivers to use it in their routines
to program their MAC filters before ether_ifattach() is called (de(4) is
one such driver). Also, the if_addr mutex is destroyed in if_free()
rather than if_detach(), so there was another potential bug in that a
driver that failed during attach and called if_free() without having
called ether_ifattach() would have tried to destroy an uninitialized mutex.
Reported by: Holm Tiffe holm at freibergnet dot de
Discussed with: rwatson
using ifp->if_addr_mtx:
- Initialize if_addr_mtx when ifnet is initialized.
- Destroy if_addr_mtx when ifnet is torn down.
- Rename ifmaof_ifpforaddr() to if_findmulti(); assert if_addr_mtx.
Staticize.
- Extract ifmultiaddr allocation and initialization into if_allocmulti();
accept a 'mflags' argument to indicate whether or not sleeping is
permitted. This centralizes error handling and address duplication.
- Extract ifmultiaddr tear-down and deallocation in if_freemulti().
- Re-structure if_addmulti() to hold if_addr_mtx around manipulation of
the ifnet multicast address list and reference count manipulation.
Make use of non-sleeping allocations. Annotate the fact that we only
generate routing socket events for explicit address addition, not
implicit link layer address addition.
- Re-structure if_delmulti() to hold if_addr_mtx around manipulation of
the ifnet multicast address list and reference count manipulation.
Annotate the lack of a routing socket event for implicit link layer
address removal.
- De-spl all and sundry.
Problem reported by: Ed Maste <emaste at phaedrus dot sandvine dot ca>
MFC after: 1 week
ifp->if_resolvemulti(), do so with M_NOWAIT rather than M_WAITOK, so
that a mutex can be held over the call. In the FDDI code, add a
missing M_ZERO. Consumers are already aware that if_resolvemulti()
can fail.
MFC after: 1 week
lists. Add accessor macros.
This changes the size of struct ifnet, but ideally, all ifnet consumers
are now using if_alloc() to allocate these structures rather than
embedding them into device driver softc's, so this won't modify the
network device driver ABI.
MFC after: 1 week
FreeBSD specific ip_newid() changes NetBSD does not have.
Correct handling of non AF_INET packets passed to bpf [2].
PR: kern/80340[1], NetBSD PRs 29150[1], 30844[2]
Obtained from: NetBSD ip_gre.c rev. 1.34,1.35, if_gre.c rev. 1.56
Submitted by: Gert Doering <gert at greenie.muc.de>[2]
MFC after: 4 days
temporary buffer then pass the array to user-space once we have
dropped the lock.
While we are here, drop an assertion which could result in a
kernel panic under certain race conditions.
Pointed out by: rwatson
- most of the kernel code will not care about the actual encoding of
scope zone IDs and won't touch "s6_addr16[1]" directly.
- similarly, most of the kernel code will not care about link-local
scoped addresses as a special case.
- scope boundary check will be stricter. For example, the current
*BSD code allows a packet with src=::1 and dst=(some global IPv6
address) to be sent outside of the node, if the application do:
s = socket(AF_INET6);
bind(s, "::1");
sendto(s, some_global_IPv6_addr);
This is clearly wrong, since ::1 is only meaningful within a single
node, but the current implementation of the *BSD kernel cannot
reject this attempt.
Submitted by: JINMEI Tatuya <jinmei__at__isl.rdc.toshiba.co.jp>
Obtained from: KAME
bridge but the interface can still be changed afterwards.
This falls under the 'dont do that' category but log an warning when INVARIANTS
is defined.
Approved by: mlaier (mentor)
MFC after: 3 days
be used to pass statistics regarding dropped, matched and received
packet counts from the kernel to user-space. While we are here
introduce a new counter for filtered or matched packets. We currently
keep track of packets received or dropped by the bpf device, but not
how many packets actually matched the bpf filter.
-Introduce net.bpf.stats sysctl OID
-Move sysctl variables after the function prototypes so we can
reference bpf_stats_sysctl(9) without build errors.
-Introduce bpf descriptor counter which is used mainly for sizing
of the xbpf_d array.
-Introduce a xbpf_d structure which will act as an external
representation of the bpf_d structure.
-Add a the following members to the bpfd structure:
bd_fcount - Number of packets which matched bpf filter
bd_pid - PID which opened the bpf device
bd_pcomm - Process name which opened the device.
It should be noted that it's possible that the process which opened
the device could be long gone at the time of stats collection. An
example might be a process that opens the bpf device forks then exits
leaving the child process with the bpf fd.
Reviewed by: mdodd
which in the future will hold IFF_OACTIVE and IFF_RUNNING, and have
its access synchronized by the device driver rather than the
protocol stack. This will avoid potential races in the management
of flags in if_flags.
Discussed with: various (scottl, jhb, ...)
MFC after: 1 week
Compare pointers with NULL rather than treating them as booleans.
Compare pointers with NULL rather than 0 to make it more clear
they are pointers.
Assign pointers value of NULL rather than 0 to make it more clear
they are pointers.
MFC after: 3 days
- Introduce a subsystem mutex, natm_mtx, manipulated with accessor macros
NATM_LOCK_INIT(), NATM_LOCK(), NATM_UNLOCK(), NATM_LOCK_ASSERT(). It
protects the consistency of pcb-related data structures. Finer grained
locking is possible, but should be done in the context of specific
measurements (as very little work is done in netnatm -- most is in the
ATM device driver or socket layer, so there's probably not much
contention).
- Remove GIANT_REQUIRED, mark as NETISR_MPSAFE, remove
NET_NEEDS_GIANT("netnatm").
- Conditionally acquire Giant when entering network interfaces for
ifp->if_ioctl() using IFF_LOCKGIANT(ifp)/IFF_UNLOCKGIANT(ifp) in order
to coexist with non-MPSAFE atm ifnet drivers..
- De-spl.
MFC after: 2 weeks
Reviewed by: harti, bms (various versions)
the case of an RTM_CHANGE was specific, i.e. that it matched completely. This
led to a route change of a non-existent route changing the default route
as the radix code would simply back track to that point and hand that
route back to the routing socket code.
PR: 82974
Reviewed by: Tai-hwa Liang <avatar@mmlab.cse.yzu.edu.tw>
Ben Kaduk <minimarmot@gmail.com>
Bjoern A. Zeeb <bzeeb-lists@lists.zabbadoz.net>
Obtained from: OpenBSD with modifications.
MFC after: 2 weeks
Some of the (IPv6) cleanup functions send packets to inform peers of the
departure. These packets confused users of ifnet_departure_event (pf at the
moment).
PR: kern/80627
Tested by: Divacky Roman
MFC after: 1 week