Commit Graph

1914 Commits

Author SHA1 Message Date
Robert Watson
13f4c340ae Propagate rename of IFF_OACTIVE and IFF_RUNNING to IFF_DRV_OACTIVE and
IFF_DRV_RUNNING, as well as the move from ifnet.if_flags to
ifnet.if_drv_flags.  Device drivers are now responsible for
synchronizing access to these flags, as they are in if_drv_flags.  This
helps prevent races between the network stack and device driver in
maintaining the interface flags field.

Many __FreeBSD__ and __FreeBSD_version checks maintained and continued;
some less so.

Reviewed by:	pjd, bz
MFC after:	7 days
2005-08-09 10:20:02 +00:00
Robert Watson
292ee7be1c Rename IFF_RUNNING to IFF_DRV_RUNNING, IFF_OACTIVE to IFF_DRV_OACTIVE,
and move both flags from ifnet.if_flags to ifnet.if_drv_flags, making
and documenting the locking of these flags the responsibility of the
device driver, not the network stack.  The flags for these two fields
will be mutually exclusive so that they can be exposed to user space as
though they were stored in the same variable.

Provide #defines to provide the old names #ifndef _KERNEL, so that user
applications (such as ifconfig) can use the old flag names.  Using the
old names in a device driver will result in a compile error in order to
help device driver writers adopt the new model.

When exposing the interface flags to user space, via interface ioctls
or routing sockets, or the two fields together.  Since the driver flags
cannot currently be set for user space, no new logic is currently
required to handle this case.

Add some assertions that general purpose network stack routines, such
as if_setflags(), are not improperly used on driver-owned flags.

With this change, a large number of very minor network stack races are
closed, subject to correct device driver locking.  Most were likely
never triggered.

Driver sweep to follow; many thanks to pjd and bz for the line-by-line
review they gave this patch.

Reviewed by:	pjd, bz
MFC after:	7 days
2005-08-09 10:16:17 +00:00
Gleb Smirnoff
9bd8ca3014 In preparation for fixing races in ARP (and probably in other
L2/L3 mappings) make rt_check() return a locked rtentry.
2005-08-09 08:39:56 +00:00
Andrew Thompson
3155122ec2 Use m_copypacket() which is an optimization of the common case
m_copym(m, 0, M_COPYALL, how).

This is required for strict alignment architectures where we align the IP
header in the input path but m_copym() will create an unaligned copy in
bridge_broadcast(). m_copypacket() preserves alignment of the first mbuf.

Noticed by:	Petri Simolin
Approved by:	mlaier (mentor)
MFC after:	3 days
2005-08-08 22:21:55 +00:00
Robert Watson
6a113b3de7 Merge the dev_clone and dev_clone_cred event handlers into a single
event handler, dev_clone, which accepts a credential argument.
Implementors of the event can ignore it if they're not interested,
and most do.  This avoids having multiple event handler types and
fall-back/precedence logic in devfs.

This changes the kernel API for /dev cloning, and may affect third
party packages containg cloning kernel modules.

Requested by:	phk
MFC after:	3 days
2005-08-08 19:55:32 +00:00
Sam Leffler
456d182d5b destroy lock _before_ free'ing the structure it resides in 2005-08-06 18:42:01 +00:00
John Baldwin
6da3131abd Initialize the if_addr mutex in if_alloc() rather than waiting until
if_attach().  This allows ethernet drivers to use it in their routines
to program their MAC filters before ether_ifattach() is called (de(4) is
one such driver).  Also, the if_addr mutex is destroyed in if_free()
rather than if_detach(), so there was another potential bug in that a
driver that failed during attach and called if_free() without having
called ether_ifattach() would have tried to destroy an uninitialized mutex.

Reported by:	Holm Tiffe holm at freibergnet dot de
Discussed with:	rwatson
2005-08-04 14:39:47 +00:00
Robert Watson
c3b31afd92 Protect link layer network interface multicast address list manipulation
using ifp->if_addr_mtx:

- Initialize if_addr_mtx when ifnet is initialized.

- Destroy if_addr_mtx when ifnet is torn down.

- Rename ifmaof_ifpforaddr() to if_findmulti(); assert if_addr_mtx.
  Staticize.

- Extract ifmultiaddr allocation and initialization into if_allocmulti();
  accept a 'mflags' argument to indicate whether or not sleeping is
  permitted.  This centralizes error handling and address duplication.

- Extract ifmultiaddr tear-down and deallocation in if_freemulti().

- Re-structure if_addmulti() to hold if_addr_mtx around manipulation of
  the ifnet multicast address list and reference count manipulation.
  Make use of non-sleeping allocations.  Annotate the fact that we only
  generate routing socket events for explicit address addition, not
  implicit link layer address addition.

- Re-structure if_delmulti() to hold if_addr_mtx around manipulation of
  the ifnet multicast address list and reference count manipulation.
  Annotate the lack of a routing socket event for implicit link layer
  address removal.

- De-spl all and sundry.

Problem reported by:	Ed Maste <emaste at phaedrus dot sandvine dot ca>
MFC after:		1 week
2005-08-02 23:23:26 +00:00
Robert Watson
09df718e0e When allocating link layer ifnet address list entries in
ifp->if_resolvemulti(), do so with M_NOWAIT rather than M_WAITOK, so
that a mutex can be held over the call.  In the FDDI code, add a
missing M_ZERO.  Consumers are already aware that if_resolvemulti()
can fail.

MFC after:	1 week
2005-08-02 17:52:52 +00:00
Robert Watson
de6073aab0 Add if_addr_mtx to struct ifnet, a mutex to protect ifnet-related address
lists.  Add accessor macros.

This changes the size of struct ifnet, but ideally, all ifnet consumers
are now using if_alloc() to allocate these structures rather than
embedding them into device driver softc's, so this won't modify the
network device driver ABI.

MFC after:	1 week
2005-08-02 17:43:35 +00:00
Bjoern A. Zeeb
9e669156d4 Add support for IPv6 over GRE [1]. PR kern/80340 includes the
FreeBSD specific ip_newid() changes NetBSD does not have.
Correct handling of non AF_INET packets passed to bpf [2].

PR:		kern/80340[1], NetBSD PRs 29150[1], 30844[2]
Obtained from:	NetBSD ip_gre.c rev. 1.34,1.35, if_gre.c rev. 1.56
Submitted by:	Gert Doering <gert at greenie.muc.de>[2]
MFC after:	4 days
2005-08-01 08:14:21 +00:00
Christian S.J. Peron
422a63da6e Rather than hold a mutex over calls to SYSCTL_OUT allocate a
temporary buffer then pass the array to user-space once we have
dropped the lock.

While we are here, drop an assertion which could result in a
kernel panic under certain race conditions.

Pointed out by:	rwatson
2005-07-26 17:21:56 +00:00
Hajimu UMEMOTO
a1f7e5f8ee scope cleanup. with this change
- most of the kernel code will not care about the actual encoding of
  scope zone IDs and won't touch "s6_addr16[1]" directly.
- similarly, most of the kernel code will not care about link-local
  scoped addresses as a special case.
- scope boundary check will be stricter.  For example, the current
  *BSD code allows a packet with src=::1 and dst=(some global IPv6
  address) to be sent outside of the node, if the application do:
    s = socket(AF_INET6);
    bind(s, "::1");
    sendto(s, some_global_IPv6_addr);
  This is clearly wrong, since ::1 is only meaningful within a single
  node, but the current implementation of the *BSD kernel cannot
  reject this attempt.

Submitted by:	JINMEI Tatuya <jinmei__at__isl.rdc.toshiba.co.jp>
Obtained from:	KAME
2005-07-25 12:31:43 +00:00
Andrew Thompson
39bb2fca46 We check that all the member interfaces have the same MTU on attach to the
bridge but the interface can still be changed afterwards.

This falls under the 'dont do that' category but log an warning when INVARIANTS
is defined.

Approved by:	mlaier (mentor)
MFC after:	3 days
2005-07-25 02:22:37 +00:00
Christian S.J. Peron
69f7644bc9 Introduce new sysctl variable: net.bpf.stats. This sysctl variable can
be used to pass statistics regarding dropped, matched and received
packet counts from the kernel to user-space. While we are here
introduce a new counter for filtered or matched packets. We currently
keep track of packets received or dropped by the bpf device, but not
how many packets actually matched the bpf filter.

-Introduce net.bpf.stats sysctl OID
-Move sysctl variables after the function prototypes so we can
 reference bpf_stats_sysctl(9) without build errors.
-Introduce bpf descriptor counter which is used mainly for sizing
 of the xbpf_d array.
-Introduce a xbpf_d structure which will act as an external
 representation of the bpf_d structure.
-Add a the following members to the bpfd structure:

	bd_fcount	- Number of packets which matched bpf filter
	bd_pid		- PID which opened the bpf device
	bd_pcomm	- Process name which opened the device.

It should be noted that it's possible that the process which opened
the device could be long gone at the time of stats collection. An
example might be a process that opens the bpf device forks then exits
leaving the child process with the bpf fd.

Reviewed by:	mdodd
2005-07-24 17:21:17 +00:00
Robert Watson
638ccea02a Allocate one of the spare ifnet integer fields to hold if_drv_flags,
which in the future will hold IFF_OACTIVE and IFF_RUNNING, and have
its access synchronized by the device driver rather than the
protocol stack.  This will avoid potential races in the management
of flags in if_flags.

Discussed with:	various (scottl, jhb, ...)
MFC after:	1 week
2005-07-21 22:01:06 +00:00
Poul-Henning Kamp
514bcb8955 Add some KASSERTS to catch null pointers. 2005-07-21 09:00:51 +00:00
Andrew Thompson
12b47243c6 Clear the PROMISC flag from the vlan interface when we remove a member. We
checked for IFT_L2VLAN in bridge_ioctl_add() but not bridge_delete_member().

Approved by:	mlaier (mentor)
2005-07-20 19:42:51 +00:00
Robert Watson
2432c31c8b In multicast routines:
Compare pointers with NULL rather than treating them as booleans.

Compare pointers with NULL rather than 0 to make it more clear
they are pointers.

Assign pointers value of NULL rather than 0 to make it more clear
they are pointers.

MFC after:	3 days
2005-07-19 10:12:58 +00:00
Robert Watson
d8d5b10e84 Rename equal() macro to sa_equal(), which matches the definitions
of sa_equal() in other files, and makes it more clear what equal()
is comparing.

MFC after:	3 days
2005-07-19 10:03:47 +00:00
Robert Watson
f002340544 Lock down netnatm and mark as MPSAFE:
- Introduce a subsystem mutex, natm_mtx, manipulated with accessor macros
  NATM_LOCK_INIT(), NATM_LOCK(), NATM_UNLOCK(), NATM_LOCK_ASSERT().  It
  protects the consistency of pcb-related data structures.  Finer grained
  locking is possible, but should be done in the context of specific
  measurements (as very little work is done in netnatm -- most is in the
  ATM device driver or socket layer, so there's probably not much
  contention).

- Remove GIANT_REQUIRED, mark as NETISR_MPSAFE, remove
  NET_NEEDS_GIANT("netnatm").

- Conditionally acquire Giant when entering network interfaces for
  ifp->if_ioctl() using IFF_LOCKGIANT(ifp)/IFF_UNLOCKGIANT(ifp) in order
  to coexist with non-MPSAFE atm ifnet drivers..

- De-spl.

MFC after:	2 weeks
Reviewed by:	harti, bms (various versions)
2005-07-18 16:55:46 +00:00
George V. Neville-Neil
ba7be0a934 Fix for PR 82974. We were not checking that the route looked up in
the case of an RTM_CHANGE was specific, i.e. that it matched completely.  This
led to a route change of a non-existent route changing the default route
as the radix code would simply back track to that point and hand that
route back to the routing socket code.

PR: 82974
Reviewed by: Tai-hwa Liang <avatar@mmlab.cse.yzu.edu.tw>
             Ben Kaduk <minimarmot@gmail.com>
             Bjoern A. Zeeb <bzeeb-lists@lists.zabbadoz.net>
Obtained from:	OpenBSD with modifications.
MFC after: 2 weeks
2005-07-15 09:18:34 +00:00
Max Laier
52023244de Move eventhandler for 'ifnet_departure_event' at the end of the progress.
Some of the (IPv6) cleanup functions send packets to inform peers of the
departure.  These packets confused users of ifnet_departure_event (pf at the
moment).

PR:		kern/80627
Tested by:	Divacky Roman
MFC after:	1 week
2005-07-14 20:26:43 +00:00
Yaroslav Tykhiy
1a3b685942 MFp4:
- Introduce a helper function if_setflag() containing the code common
  to ifpromisc() and if_allmulti() instead of duplicating the code poorly,
  with different bugs.
- Call ifp->if_ioctl() in a consistent way: always use more compatible C
  syntax and check whether ifp->if_ioctl is not NULL prior to the call.

MFC after:	1 month
2005-07-14 13:56:51 +00:00
Andrew Thompson
489fc2258f Previously the bridge MTU was set to ETHERMTU and could not be changed. Since
we can only bridge interfaces with the same value it meant that all members had
to be set at ETHERMTU as well.

Allow the first member to be added to define the MTU for the bridge, the check
still applies to all additional members.

Print an informative message if the MTU is incorrect [1]

Requested by:	Niki Denev [1]
Approved by:	mlaier (mentor)
MFC after:	3 days
2005-07-13 20:40:19 +00:00
Sam Leffler
e0d80bffb5 additions from libpcap 0.9.1 release
Approved by:	re (scottl)
2005-07-11 03:16:23 +00:00
Andrew Thompson
ea32e73208 - Previously when broadcasting to N number of interfaces we would run pfil
hooks for each outgoing interface but also run pfil hooks _N times_ on the
  bridge interface. This is changed so pfil hooks are run once for the bridge
  interface (bridge0) and then only on the outgoing interfaces in the broadcast
  loop.

- Simplify bridge_enqueue() by moving bridge_pfil() to the callers.

- Check (inet6_pfil_hook.ph_busy_count >= 0), it may be possible to have a
  packet filter hooked for only ipv6 but we were only checking if ipv4 hooks
  were busy.

- Minor optimisation for null mbuf check after bridge_pfil(), move it into the
  if-block as it couldnt possibly be null outside.

Prodded by:	mlaier
Approved by:	re (scottl), mlaier (mentor)
2005-07-06 01:24:45 +00:00
Robert Watson
3c308b091f Eliminate MAC entry point mac_create_mbuf_from_mbuf(), which is
redundant with respect to existing mbuf copy label routines.  Expose
a new mac_copy_mbuf() routine at the top end of the Framework and
use that; use the existing mpo_copy_mbuf_label() routine on the
bottom end.

Obtained from:	TrustedBSD Project
Sponsored by:	SPARTA, SPAWAR
Approved by:	re (scottl)
2005-07-05 23:39:51 +00:00
Andrew Thompson
ede3a2773d Check the alignment of the IP header before passing the packet up to the
packet filter. This would cause a panic on architectures that require strict
alignment such as sparc64, ia64 and ppc.

This uses the code block from if_bridge and the newly added macro
IP_HDR_ALIGNED_P().

This /might/ be a temporary messure before all NIC drivers are educated
to align the header themself.

PR:		ia64/81284
Obtained from:	NetBSD (if_bridge)
Approved by:	re (dwhite), mlaier (mentor)
2005-07-03 18:24:03 +00:00
Andrew Thompson
2fcb030ad5 Check the alignment of the IP header before passing the packet up to the
packet filter. This would cause a panic on architectures that require strict
alignment such as sparc64 (tier1) and ia64/ppc (tier2).

This adds two new macros that check the alignment, these are compile time
dependent on __NO_STRICT_ALIGNMENT which is set for i386 and amd64 where
alignment isn't need so the cost is avoided.

 IP_HDR_ALIGNED_P()
 IP6_HDR_ALIGNED_P()

Move bridge_ip_checkbasic()/bridge_ip6_checkbasic() up so that the alignment
is checked for ipfw and dummynet too.

PR:		ia64/81284
Obtained from:	NetBSD
Approved by:	re (dwhite), mlaier (mentor)
2005-07-02 23:13:31 +00:00
Suleiman Souhlal
571dcd15e2 Fix the recent panics/LORs/hangs created by my kqueue commit by:
- Introducing the possibility of using locks different than mutexes
for the knlist locking. In order to do this, we add three arguments to
knlist_init() to specify the functions to use to lock, unlock and
check if the lock is owned. If these arguments are NULL, we assume
mtx_lock, mtx_unlock and mtx_owned, respectively.

- Using the vnode lock for the knlist locking, when doing kqueue operations
on a vnode. This way, we don't have to lock the vnode while holding a
mutex, in filt_vfsread.

Reviewed by:	jmg
Approved by:	re (scottl), scottl (mentor override)
Pointyhat to:	ssouhlal
Will be happy:	everyone
2005-07-01 16:28:32 +00:00
Gleb Smirnoff
82dd5411d9 Use m_uiotombuf() instead of own implementation. This is not just
a cosmetic change. m_uiotombuf() produces a packet header mbuf, while
original implementation did not. When kernel is compiled with MAC
support, headerless mbuf will cause panic.

Reported by:	Alexander Nikiforenko <asn rambler-co.ru>
Approved by:	re (scottl)
MFC After:	2 weeks
2005-07-01 15:22:47 +00:00
Andrew Thompson
49808fa4fc Sync if_bridge to NetBSD r1.31
Rename conflicting variables when handling SNAP Ethernet frames.

Obtained from:	NetBSD
Approved by:	mlaier (mentor)
Approved by:	re (blanket)
2005-06-29 19:23:32 +00:00
Qing Li
16a2e0a6c8 Require gateways for routes to be of the same address family as the
route itself.

It fixes a bug where an IPv4 route for example has an IPv6 gateway
specified:

     route add 10.1.1.1 -inet6 fe80::1%fxp0

     Destination  Gateway       Flags  Refs  Use  Netif Expire
     10.1.1.1     fe80::1%fxp0  UGHS   0     0    fxp0

The fix rejects these illegal combinations:

     route: writing to routing socket: Invalid argument
     add host 10.1.1.1: gateway fe80::1%fxp0: Invalid argument

Reviewed by:	KAME jinmei@isl.rdc.toshiba.co.jp
Reviewed by:	andre (mentor)
Approved by:	re
MFC after:	5
2005-06-28 23:32:22 +00:00
Bjoern A. Zeeb
066b192e3b Fix panic after ifnet changes in rev. 1.30. sc->sc_ifp is a
pointer now and needs to be allocated before using.

Reviewed by:	gnn
Approved by:	re (scottl), rwatson (mentor)
2005-06-28 06:55:45 +00:00
Andrew Thompson
ca6c404ce3 Fix a panic when bringing up the bridge interface. We were casting a ifnet
pointer to a softc which is no longer valid since the ifnet struct was split
out from the softc.

Approved by:	mlaier (mentor)
Approved by:	re (blanket)
2005-06-27 21:58:12 +00:00
David Malone
01399f34a5 Fix some long standing bugs in writing to the BPF device attached to
a DLT_NULL interface. In particular:

        1) Consistently use type u_int32_t for the header of a
           DLT_NULL device - it continues to represent the address
           family as always.
        2) In the DLT_NULL case get bpf_movein to store the u_int32_t
           in a sockaddr rather than in the mbuf, to be consistent
           with all the DLT types.
        3) Consequently fix a bug in bpf_movein/bpfwrite which
           only permitted packets up to 4 bytes less than the MTU
           to be written.
        4) Fix all DLT_NULL devices to have the code required to
           allow writing to their bpf devices.
        5) Move the code to allow writing to if_lo from if_simloop
           to looutput, because it only applies to DLT_NULL devices
           but was being applied to other devices that use if_simloop
           possibly incorrectly.

PR:		82157
Submitted by:	Matthew Luckie <mjl@luckie.org.nz>
Approved by:	re (scottl)
2005-06-26 18:11:11 +00:00
Brooks Davis
1436936ab0 Spelling/grammer fixes in comment.
Reported by:	Hans Petter Selasky <hselasky at c2i dot net>
Approved by:	re (ifnet blanked)
2005-06-17 17:19:34 +00:00
Brooks Davis
b03965ddca Initialze ifp->if_softc.
Submitted by:	ume
2005-06-13 17:17:07 +00:00
Brooks Davis
28ef2db496 Return NULL instead of a bogus pointer from if_alloc when if_com_alloc
fails.

Move detaching the ifnet from the ifindex_table into if_free so we can
both keep the sanity checks and actually delete the ifnets. [0]

Reported by:	gallatin [0]
Approved by:	re (blanket)
2005-06-12 00:53:03 +00:00
Andrew Thompson
e7acea8202 Catch up with the struct ifnet changes and use if_alloc().
Reviewed by:	brooks
Approved by:	mlaier (mentor)
2005-06-10 23:52:01 +00:00
Brooks Davis
fc74a9f93a Stop embedding struct ifnet at the top of driver softcs. Instead the
struct ifnet or the layer 2 common structure it was embedded in have
been replaced with a struct ifnet pointer to be filled by a call to the
new function, if_alloc(). The layer 2 common structure is also allocated
via if_alloc() based on the interface type. It is hung off the new
struct ifnet member, if_l2com.

This change removes the size of these structures from the kernel ABI and
will allow us to better manage them as interfaces come and go.

Other changes of note:
 - Struct arpcom is no longer referenced in normal interface code.
   Instead the Ethernet address is accessed via the IFP2ENADDR() macro.
   To enforce this ac_enaddr has been renamed to _ac_enaddr.
 - The second argument to ether_ifattach is now always the mac address
   from driver private storage rather than sometimes being ac_enaddr.

Reviewed by:	sobomax, sam
2005-06-10 16:49:24 +00:00
Max Laier
2c67c57c8b Add missing {} in last commit. 2005-06-10 15:53:21 +00:00
Andrew Thompson
c8b0129238 Add dummynet(4) support to if_bridge, this code is largely based on bridge.c.
This is the final piece to match bridge.c in functionality, we can now be a
drop-in replacement.

Approved by:	mlaier (mentor)
2005-06-10 01:25:22 +00:00
Hartmut Brandt
25029d6c31 When returing an RTM_GET message through the routing socket fill
in the rtm_index field whenever we have an interface pointer. This
is consistent with the RTM_GET messages returned by sysctl().
2005-06-09 12:20:50 +00:00
Andrew Thompson
82116c339c Bring in IPFW layer2 filtering from bridge.c, this allows Ethernet filtering
using the layer2, mac and mac-type keywords.

This is one of the last features that bridge.c has over if_bridge and gets us
very close to a full functional replacement.

Approved by:	mlaier (mentor)
2005-06-07 21:20:18 +00:00
Christian S.J. Peron
0eb206049e Change the maximum bpf program instruction limitation from being hard-
coded at 512 (BPF_MAXINSNS) to being tunable. This is useful for users
who wish to use complex or large bpf programs when filtering traffic.
For now we will default it to BPF_MAXINSNS. I have tested bpf programs
with well over 21,000 instructions without any problems.

Discussed with:	phk
2005-06-06 22:19:59 +00:00
Brooks Davis
9d80a3307a Send link state change notifications to /dev/devctl. This is needed to
start the OpenBSD dhclient when links come up.
2005-06-06 19:08:11 +00:00
Andrew Thompson
f2999b2fdf Change ipv6 packet filtering to match ipv4. It now checks pfil_member and
pfil_bridge to determine which interfaces to filter on.

Approved by:	mlaier (mentor)
2005-06-06 02:41:29 +00:00
Andrew Thompson
5a6530a38d Fix indentation of two comment blocks from the last commit.
Approved by:	mlaier (mentor)
2005-06-05 03:49:23 +00:00
Andrew Thompson
8f86751705 Add hooks into the networking layer to support if_bridge. This changes struct
ifnet so a buildworld is necessary.

Approved by:	mlaier (mentor)
Obtained from:	NetBSD
2005-06-05 03:13:13 +00:00
Andrew Thompson
31997bf223 Add if_bridge, which provides more advanced Ethernet bridging and 802.1d
spanning tree support.

Based on Jason Wright's bridge driver from OpenBSD, and modified by Jason R.
Thorpe in NetBSD.

Reviewed by:	mlaier, bms, green
Silence from:	-net
Approved by:	mlaier (mentor)
Obtained from:	NetBSD
2005-06-05 02:59:26 +00:00
Sam Leffler
f6f1669c0f integrate changes from libpcap-0.9.1-096
Reviewed by:	bms
2005-05-28 21:56:41 +00:00
Brooks Davis
dbf49e18bb Update refrenced URL for SNMP list of ifTypes to refer to iana.org
instead of a dead location on ftp.isi.edu.
2005-05-28 06:11:38 +00:00
Gleb Smirnoff
748741c7ae Plug mbuf leak, that I have introduced in 1.85. Also restore important comment
from if_ethersubr.c:1.178. While here adjust formatting, to make code more
readable.

Reported by:	Alexey Kamyshev, rwatson
2005-05-26 06:50:00 +00:00
Peter Edwards
45778b37b2 Separate out address-detaching part of if_detach into if_purgeaddrs,
so if_tap doesn't need to rely on locally-rolled code to do same.

The observable symptom of if_tap's bzero'ing the address details
was a crash in "ifconfig tap0" after an if_tap device was closed.

Reported By: Matti Saarinen (mjsaarin at cc dot helsinki dot fi)
2005-05-25 13:52:03 +00:00
Max Laier
d274e6b641 Fix semantics of ph_busy_count == -1 to pass instead of block.
PR:		kern/81128
Submitted by:	Joost Bekkers
MFC-after:	2 weeks
2005-05-23 17:07:16 +00:00
Colin Percival
fd94099ec2 If we are going to
1. Copy a NULL-terminated string into a fixed-length buffer, and
2. copyout that buffer to userland,
we really ought to
0. Zero the entire buffer
first.

Security: FreeBSD-SA-05:08.kmem
2005-05-06 02:50:00 +00:00
Maksim Yevmenkin
75ae257016 Change m_uiotombuf so it will accept offset at which data should be copied
to the mbuf. Offset cannot exceed MHLEN bytes. This is currently used to
fix Ethernet header alignment problem on alpha and sparc64. Also change all
users of m_uiotombuf to pass proper offset.

Reviewed by:	jmg, sam
Tested by:	Sten Spans "sten AT blinkenlights DOT nl"
MFC after:	1 week
2005-05-04 18:55:03 +00:00
Christian S.J. Peron
a3272e3ce3 -introduce net.bpf sysctl instead of the less intuitive debug.*
debug.bpf_bufsize is now net.bpf.bufsize
    debug.bpf_maxbufsize is now net.bpf.maxbufsize

-move function prototypes for bpf_drvinit and bpf_clone up to the
 top of the file with the others
-assert bpfd lock in catchpacket() and bpf_wakeup()

MFC after:	2 weeks
2005-05-04 03:09:28 +00:00
Gleb Smirnoff
984be3efbf - Call if_link_state_change() for each vlan, when link changes
on parent.
- Remove route.h include.
- Fix comment about MII.

Sponsored by:	Rambler
Reviewed by:	yar
2005-04-20 12:16:41 +00:00
Gleb Smirnoff
68a3482f69 Do not call all link state callbacks directly, but schedule
a taskqueue(9) task. This fixes LORs and adds possibility
to serve such events pseudorecursively, when link state
change of interface causes subsequent change on other
interfaces.

Sponsored by:	Rambler
Reviewed by:	sam, brooks, mux
2005-04-20 09:30:54 +00:00
Colin Percival
fbd24c5ed6 Zero the ifr.ifr_name buffer in ifconf() in order to avoid
accidental disclosure of kernel memory to userland.

Security:	FreeBSD-SA-05:04.ifconf
2005-04-15 01:52:40 +00:00
Matthew N. Dodd
f7251b07e2 Add #defines for control fields and address bits. 2005-04-13 08:14:14 +00:00
Matthew N. Dodd
b137ea624b Provide a sysctl (net.link.tap.user_open) to allow unpriviliged
acces to tap(4) device nodes based on file system permission.

Duplicate the 'debug.if_tap_debug' sysctl under the
'net.link.tap' hierarchy.
2005-04-13 00:30:19 +00:00
Poul-Henning Kamp
f4f6abcb4e Explicitly hold a reference to the cdev we have just cloned. This
closes the race where the cdev was reclaimed before it ever made it
back to devfs lookup.
2005-03-31 12:19:44 +00:00
Brian Feldman
4549709fb5 You must selwakeup{,pri}() when closing a selectable object or the
td->td_sel will get trashed and crash the system.  Fix BPF's mistake
in this area.

MFC after:	1 day
2005-03-27 23:16:17 +00:00
Sam Leffler
7a7fa27b23 rt_newaddrmsg will blow up if given something other than RTM_ADD
or RTM_DELETE; add an assertion, may want to do something more
heavyhanded in the future

Noticed by:	Coverity Prevent analysis tool
Reviewed by:	mdodd
2005-03-26 21:49:43 +00:00
Andrew Gallatin
f83935f874 Zero the reserved fields of the header, as per rfc 2734. This change
results in connectivty to MacOSX hosts via fwip.

Thanks to Apple's Arulchandran Paramasivam <arulchandranp@apple.com> for
letting us know what we were doing wrong.

Reviewed by: dfr
MFC After: 7 days
2005-03-25 16:05:42 +00:00
Matthew N. Dodd
96a205962e - Break after nested switch.
- Default returns an error.
2005-03-24 02:08:22 +00:00
Gleb Smirnoff
d4d2297060 ifma_protospec is a pointer. Use NULL when assigning or compating it. 2005-03-20 14:31:45 +00:00
Gleb Smirnoff
5515c2e793 Add a sysctl net.link.log_link_state_change, which allows to
suppress logging of interface link state changes.

Requested by:	sam, kan
2005-03-12 12:58:03 +00:00
Maxim Sobolev
5c16270365 When neither of supported frame type is enabled via kernel options enable
them all, otherwise the driver will be useless and will only confuse user
as manual page says nothing about the need to enable one of those frame
types explicitly in the kernel config.

PR:		kern/47152
Submitted by:	Andriy Gapon <avg@icyb.net.ua>
MFC after:	3 days
2005-03-06 23:03:58 +00:00
Maxim Sobolev
a10260280f Fix ef(4) driver when kernel compiled w/o IPX.
MFC after:	3 days
2005-03-06 22:59:40 +00:00
John-Mark Gurney
7819da7944 fix a bug where bpf would try to wakeup before updating the state.. This
was causing kqueue not to see the correct state and not wake up a process
that is waiting...

Submitted by:	nCircle Network Security, Inc.
2005-03-02 21:59:39 +00:00
Gleb Smirnoff
31199c8463 Use NET_CALLOUT_MPSAFE macro. 2005-03-01 12:01:17 +00:00
Gleb Smirnoff
3a84d72a78 Revert change to struct ifnet. Use ifnet pointer in softc. Embedding
ifnet into smth will soon be removed.

Requested by:	brooks
2005-03-01 10:59:14 +00:00
Robert Watson
a8e93fb7ec In bpf_setf(), protect against races between multiple user threads
attempting to change the BPF filter on a BPF descriptor at the same
time: retrieve the old filter pointer under the same locked region
as setting the new pointer.

MFC after:	3 days
2005-02-28 14:04:09 +00:00
Robert Watson
d1a67300e2 Update a comment describing bpf_iflist to indicate that the BPF interface
structures correspond to specific link layers, so the same network
interface may appear more than once.

MFC after:	3 days
2005-02-28 12:35:52 +00:00
Gleb Smirnoff
e8c34a71eb Remove carp_softc.sc_ifp member in favor of union pointers in struct ifnet.
Obtained from:	OpenBSD
2005-02-26 13:55:07 +00:00
Brooks Davis
bc9d299133 Change the definition of struct if_data's member ifi_epoch from wall
clock time to uptime because wall clock time may go backwards.

This is a change in the API which will impact SNMP agents who are using
ifi_epoch to set RFC2233's ifCounterDiscontinuityTime.  None are know to
exist today.  This will not impact applications that are using the
<index, epoch> tuple to verify interface uniqueness except that it
eliminates a race which could lead to a false assumption of uniqueness.

Because this is a behavior change, bump __FreeBSD_version.

Discussed with:	re (jhb, scottl)
MFC after:	3 days
Pointed out by:	pkh (way back at EuroBSDCon)
Pointy hat:	brooks
2005-02-25 19:46:41 +00:00
Maxim Konovalov
a6d008350d o Move ifcr_count sanity check up and reject negative values before we
panic at kmem_alloc() via malloc(9).

PR:		kern/77748
Submitted by:	Wojciech A. Koszek
OK'ed by:	brooks
Security:	local DoS, a sample code in the PR.
MFC after:	3 days
2005-02-24 13:14:41 +00:00
Gleb Smirnoff
58996b1337 Fix long lines in comment introduced in previous commit. 2005-02-24 10:15:50 +00:00
Sam Leffler
89bc9a3171 the rt parameter to ifa_rtrequest callbacks should always be non-null;
eliminate grauitous ptr checks that follow ptr deref's

Noticed by:	Coverity Prevent analysis tool
2005-02-24 01:34:01 +00:00
Sam Leffler
8d78bea456 eliminate dead code and collapse the remainder
Noticed by:	Coverity Prevent analysis tool
Reviewed by:	rwatson
2005-02-23 22:50:19 +00:00
Gleb Smirnoff
8b25904e36 Typo in comment. 2005-02-22 15:29:29 +00:00
Robert Watson
7e2041e0c4 When prepending an LCC SNAP header to an atalk outgoing ethernet packet,
allocate the additional mbuf (if needed) using a non-sleeping memory
allocation.

MFC after:	7 days
2005-02-22 15:03:25 +00:00
Gleb Smirnoff
4d96314f88 - In if_link_state_change() extract function body from if-block, to improve
readability.
- Call carp_carpdev_state() from if_link_state_change() if interface has
  associated CARP interface.

Sponsored by:	Rambler
2005-02-22 14:21:59 +00:00
Gleb Smirnoff
a97719482d Add CARP (Common Address Redundancy Protocol), which allows multiple
hosts to share an IP address, providing high availability and load
balancing.

Original work on CARP done by Michael Shalayeff, with many
additions by Marco Pfatschbacher and Ryan McBride.

FreeBSD port done solely by Max Laier.

Patch by:	mlaier
Obtained from:	OpenBSD (mickey, mcbride)
2005-02-22 13:04:05 +00:00
Ruslan Ermilov
6ee20ab521 Allocate the M_VLANTAG m_pkthdr flag, and use it to indicate that
a packet has VLAN mbuf tag attached.  This is faster to check than
m_tag_locate(), and allows us to use the tags in non-vlan(4) VLAN
producers.

The first argument to VLAN_OUTPUT_TAG() is now unused but retained
for backward compatibility.

While here, embellish a fix in rev. 1.174 of if_ethersubr.c -- it
now checks for packets with VLAN (mbuf) tags, and it should now
be possible to bridge(4) on vlan(4)'s whose parent interfaces
support VLAN decapsulation in hardware.

Reviewed by:	sam
2005-02-18 22:31:19 +00:00
Gleb Smirnoff
eb46c866bb Check for non-NULL ac_netgraph field in interface arpcom, instead of
checking global presence of ng_ether(4).

Reviewed by:	ru
2005-02-14 11:58:54 +00:00
Ruslan Ermilov
6c23e6cc5a If no vlan(4) interfaces are configured for the interface, and the
driver did VLAN decapsulation in hardware, we were passing a frame
as if it came for the parent (non-VLAN) interface.  Stop this from
happening.

Reminded by:	glebius
Security:	This could pose a security risk in some setups
2005-02-14 08:29:42 +00:00
Xin LI
b0b4b28bf1 Validate ifc->ifc_len before submitting its incarnation to sbuf_new,
which will finally lead to kernel panic.

Security:	This prevents a local (root-launched) DoS
Submitted by:	Wojciech A. Koszek [dunstan at freebsd czest pl]
PR:		77421
MFC After:	1 week
2005-02-12 17:51:12 +00:00
Poul-Henning Kamp
c711aea6ca Make a bunch of malloc types static.
Found by:	src/tools/tools/kernxref
2005-02-10 12:02:37 +00:00
Gleb Smirnoff
8b02df2485 Log changes of link state.
Reviewed by:	rwatson
2005-01-30 12:57:47 +00:00
Robert Watson
31c436a2a9 Acquire the raw_cb mutex around LIST_REMOVE() of a raw socket control
block from the global raw socket list.

Submitted by:	Roselyn Lee <rosel at verniernetworks dot com>
MFC after:	1 week
2005-01-24 22:56:09 +00:00
Yaroslav Tykhiy
cab574d841 Fix spelling in a comment. 2005-01-24 15:48:00 +00:00
Yaroslav Tykhiy
c6e6ca3e7b Reduce the global name space pollution.
The cloner structure isn't referenced by name outside this file.
2005-01-23 23:10:33 +00:00
Gleb Smirnoff
28935658c4 - Reduce number of arguments passed to dummynet_io(), we already have cookie
in struct ip_fw_args itself.
- Remove redundant &= 0xffff from dummynet_io().
2005-01-16 11:13:18 +00:00
Gleb Smirnoff
c31d24c37c Remove ip_fw.h and ip_dummynet.h from includes. 2005-01-15 22:04:17 +00:00
Gleb Smirnoff
6c69a7c30b o Clean up interface between ip_fw_chk() and its callers:
- ip_fw_chk() returns action as function return value. Field retval is
  removed from args structure. Action is not flag any more. It is one
  of integer constants.
- Any action-specific cookies are returned either in new "cookie" field
  in args structure (dummynet, future netgraph glue), or in mbuf tag
  attached to packet (divert, tee, some future action).

o Convert parsing of return value from ip_fw_chk() in ipfw_check_{in,out}()
  to a switch structure, so that the functions are more readable, and a future
  actions can be added with less modifications.

Approved by:	andre
MFC after:	2 months
2005-01-14 09:00:46 +00:00
Giorgos Keramidas
2ccfeeaef4 Fix a typo in a comment that may be confusing if one doesn't really
check what the code does.  Separators are spaces, commas or tabs;
not '*' characters (as one may assume by reading the old comment).
2005-01-11 10:47:51 +00:00
Hajimu UMEMOTO
529ed56f83 don't see NBPFILTER. 2005-01-11 07:17:33 +00:00
Hajimu UMEMOTO
2d106a00c9 remove HAVE_OLD_BPF part. 2005-01-11 07:14:37 +00:00
Hajimu UMEMOTO
4b9a5e9f07 we are not OLD_BPF system. 2005-01-11 07:08:15 +00:00
Hajimu UMEMOTO
9b1a707635 fix typo. 2005-01-11 07:05:56 +00:00
Gleb Smirnoff
1c7899c74e This change adds reliability for Ethernet trunks built with ng_one2many:
- Introduce another ng_ether(4) callback ng_ether_link_state_p, which
  is called from if_link_state_change(), every time link is changed.
- In ng_ether_link_state() send netgraph control message notifying
  of link state change to a node connected to "lower" hook.

Reviewed by:	sam
MFC after:	2 weeks
2005-01-08 12:42:03 +00:00
Warner Losh
c398230b64 /* -> /*- for license, minor formatting changes 2005-01-07 01:45:51 +00:00
Roman Kurakin
d676cb6fad Add FR support to sppp (MFCronyx).
Silence on: net@, current@, hackers@.
No objections: joerg

Requested by: by many (mostly Cronyx) users for a long long time.
MFC after:	10 days

PR:		kern/21771, kern/66348
2004-12-28 00:07:57 +00:00
Pawel Jakub Dawidek
77fc70c1ef Fix mbuf leak.
Submitted by:	Johnny Eriksson <bygg@cafax.se>
MFC after:	5 days
2004-12-27 15:53:44 +00:00
Poul-Henning Kamp
f62f3a1121 Include fcntl.h
Include selinfo.h (don't rely on vnode.h to do so)
Check O_NONBLOCK instead of IO_NELAY
Don't include vnode.h
2004-12-22 17:39:21 +00:00
Poul-Henning Kamp
9eaed5e66e Don't include filedesc.h
Include fcntl.h
Include selinfo.h (don't rely on vnode.h to do so)
Check O_NONBLOCK instead of IO_NDELAY
Don't include vnode.h
2004-12-22 17:38:43 +00:00
Poul-Henning Kamp
e76eee5562 Include fcntl.h
Check O_NONBLOCK instead of IO_NDELAY
Include uio.h
Don't include vnode.h
Don't include filedesc.h
2004-12-22 17:37:57 +00:00
Poul-Henning Kamp
27d7317dda Check O_NONBLOCK instead of IO_NDELAY.
Don't include <sys/vnode.h>
2004-12-22 17:32:53 +00:00
John-Mark Gurney
86c9a45388 don't try to recurse on the bpf lock.. kqueue already locks the bpf lock
now...

Submitted by:	Ed Maste of Sandvine Inc.
MFC after:	1 week
2004-12-17 03:21:46 +00:00
Roman Kurakin
1fd90fb4a0 Kill double inclusion for <netinet/in.h> and <netinet/in_systm.h>. 2004-12-14 18:18:54 +00:00
Roman Kurakin
e42ddbdf64 Make sppp MPSAFE.
MPSAFE could be turned off by IFF_NEEDSGIANT.

Silence on: net@, current@, hackers@.
No objections: joerg
2004-12-12 14:54:15 +00:00
Sam Leffler
94f5c9cfc0 Cleanup link state change notification:
o add new if_link_state_change routine that deals with link state changes
o change mii to use if_link_state_change
2004-12-08 05:45:59 +00:00
Sam Leffler
3518d22073 Don't require a device to be marked up when issuing BIOCSETIF. 2004-12-08 05:40:02 +00:00
Max Laier
69fb23b73d Implement the check I was talking about in the previous message already.
Introduce domain_init_status to keep track of the init status of the domains
list (surprise). 0 = uninitialized, 1 = initialized/unpopulated, 2 =
initialized/done. Higher values can be used to support late addition of
domains which right now "works", but is potential dangerous. I choose to
only give a warning when doing so.

Use domain_init_status with if_attachdomain[1]() to ensure that we have a
complete domains list when we init the if_afdata array. Store the current
value of domain_init_status in if_afdata_initialized. This way we can update
if_afdata after a new protocol has been added (once that is allowed).

Submitted by:	se (with changes)
Reviewed by:	julian, glebius, se
PR:		kern/73321	(partly)
2004-11-30 22:38:37 +00:00
Robert Watson
6237419d5c Assign if_broadcastaddr to NULL not 0 in if_attach().
Printf() a warning if if_attachdomain() is called more than once on an
  interface to generate some noise on mailing lists when this occurs.

Fix up style in if_start(), where spaces crept in instead of tabs at
some point.

MFC after:	1 week
MFC note:	Not the printf().
2004-11-23 23:31:33 +00:00
John-Mark Gurney
1f48dc25d7 sync comment on IFF_OACTIVE with reality.. IFF_OACTIVE is set when the
hardware cannot take anymore packets, and so will supress the calling of
the device's if_start method...

Submitted by:	bde
2004-11-17 18:32:44 +00:00
Max Laier
0b39ef4db1 Remove the #if 0 wrapping around !ALTQ stuff that can't be used due to ABI
stability anyway.
2004-11-09 21:29:28 +00:00
Poul-Henning Kamp
756d52a195 Initialize struct pr_userreqs in new/sparse style and fill in common
default elements in net_init_domain().

This makes it possible to grep these structures and see any bogosities.
2004-11-08 14:44:54 +00:00
Olivier Houchard
943efa1bd1 Don't abuse tp->t_sc in sl(4) either. 2004-11-07 14:36:47 +00:00
Olivier Houchard
7358f4bb52 Don't abuse tp->t_sc, as it is now used by tty drivers.
This fixes the panic that occurs when using ppp(4)

Reported and tested by:	Yann Berthier (yb at sainte-barbe dot org)
2004-11-07 14:35:53 +00:00
Gleb Smirnoff
411f23b06e Utilize m_uiotombuf() in device write method, instead of home-grown
implementation. This also gives a performance improvement, because
m_uiotombuf() utilizes clusters.

Approved by:	julian (mentor)
MFC after:	1 month
2004-10-31 17:39:46 +00:00
Robert Watson
0b762445b9 Move if_handoff() from an inline in if_var.h to a function to if.c
in orden to harden the ABI for 5.x; this will permit us to modify
the locking in the ifnet packet dispatch without requiring drivers
to be recompiled.

MFC after:	3 days
Discussed at:	EuroBSDCon Developer's Summit
2004-10-30 09:39:13 +00:00
Robert Watson
b4d4574a55 Add additional "spare" fields to 'struct ifnet' in order to improve
the resistance of the network driver ABI to changes that will be
required as we optimize locking.

MFC after:	3 days
Discussed at:	Developer Summit
2004-10-30 08:45:13 +00:00
John-Mark Gurney
2f27e1512c use NULL instead of 0 when casting/comparing w/ a pointer... 2004-10-25 17:04:40 +00:00
Robert Watson
31302ebf9d Define IFF_LOCKGIANT() and IFF_UNLOCKGIANT() macros, which conditionally
acquire Giant if the passed interface has IFF_NEEDSGIANT set on it.
Modify calls into (ifp)->if_ioctl() in if.c to use these macros in order
to ensure that Giant is held.

MFC after:	3 days
Bumped into by:	jmg
2004-10-19 18:11:55 +00:00
Robert Watson
81158452be Push acquisition of the accept mutex out of sofree() into the caller
(sorele()/sotryfree()):

- This permits the caller to acquire the accept mutex before the socket
  mutex, avoiding sofree() having to drop the socket mutex and re-order,
  which could lead to races permitting more than one thread to enter
  sofree() after a socket is ready to be free'd.

- This also covers clearing of the so_pcb weak socket reference from
  the protocol to the socket, preventing races in clearing and
  evaluation of the reference such that sofree() might be called more
  than once on the same socket.

This appears to close a race I was able to easily trigger by repeatedly
opening and resetting TCP connections to a host, in which the
tcp_close() code called as a result of the RST raced with the close()
of the accepted socket in the user process resulting in simultaneous
attempts to de-allocate the same socket.  The new locking increases
the overhead for operations that may potentially free the socket, so we
will want to revise the synchronization strategy here as we normalize
the reference counting model for sockets.  The use of the accept mutex
in freeing of sockets that are not listen sockets is primarily
motivated by the potential need to remove the socket from the
incomplete connection queue on its parent (listen) socket, so cleaning
up the reference model here may allow us to substantially weaken the
synchronization requirements.

RELENG_5_3 candidate.

MFC after:	3 days
Reviewed by:	dwhite
Discussed with:	gnn, dwhite, green
Reported by:	Marc UBM Bocklet <ubm at u-boot-man dot de>
Reported by:	Vlad <marchenko at gmail dot com>
2004-10-18 22:19:43 +00:00
Gleb Smirnoff
a176c2aeaf Fix packet flow when both ng_ether(4) and bridge(4) are in use:
- push all bridge logic from if_ethersubr.c into bridge.c
  make bridge_in() return mbuf pointer (or NULL).
- call only bridge_in() from ether_input(), after ng_ether_input()
  was optinally called.
- call bridge_in() from ng_ether_rcv_upper().

Long description:	http://lists.freebsd.org/mailman/htdig/freebsd-net/2004-May/003881.html
Reported by:		Jian-Wei Wang <jwwang at FreeBSD.csie.NCTU.edu.tw>
Tested by:		myself, Sergey Lyubka
Reviewed by:		sam
Approved by:		julian (mentor)
MFC after:		2 months
2004-10-12 10:33:42 +00:00
Andre Oppermann
de10fe70e1 Correctly unregister a netisr by clearing the ni->ni_queue field to NULL as
well.  This field is actually used by various netisr functions to determine
the availablility of the specified netisr.  This uncomplete unregister leads
directly to a crash when the KLD unregistering the netisr is unloaded.

Submitted by:	Sam <sah@softcardsystems.com>
MFC after:	3 days
2004-10-11 20:01:43 +00:00
Robert Watson
acf032f516 When harvesting entropy from an ethernet mbuf, do so before freeing the
mbuf.

RELENG_5 candidate.
2004-10-11 10:21:34 +00:00
Gleb Smirnoff
570343bfec Assign pointer NULL, not 0.
Approved by:	julian (mentor)
2004-10-11 07:28:36 +00:00
Max Laier
85bba4455a Change pfil starvation prevention from fail-open to fail-close.
We return ENOBUF to indicate the problem, which is an errno that should be
handled well everywhere.

Requested & Submitted by:	green
Silently okay'ed by:		The rest of the firewall gang
MFC after:			3 days
2004-10-08 12:07:20 +00:00
Brooks Davis
ab67442f0c Since net/net_osdep.c contained only one function that could be
trivially implemented as a macro, do that and remove it.  NetBSD did
this quite a while ago.
2004-10-08 00:24:30 +00:00
Brian Feldman
93daabdd83 Don't recurse the BPF descriptor lock during the BIOCSDLT operation
(and panic).  To try to finish making BPF safe, at the very least,
the BPF descriptor lock really needs to change into a reader/writer
lock that controls access to "settings," and a mutex that controls
access to the selinfo/knote/callout.  Also, use of callout_drain()
instead of callout_stop() (which is really a much more widespread
issue).
2004-10-06 04:25:37 +00:00
Sam Leffler
b83a279f19 Add 802.11-specific events that are dispatched through the routing socket.
This really doesn't belong here but is preferred (for the moment) over
adding yet another mechanism for sending msgs from the kernel to user apps.

Reviewed by:	imp
2004-10-05 19:48:33 +00:00
Sam Leffler
0cc8f89a4a add ETHERTYPE_PAE for EAPOL/802.1x 2004-10-05 19:28:52 +00:00
Max Laier
d6a8d58875 Add an additional struct inpcb * argument to pfil(9) in order to enable
passing along socket information. This is required to work around a LOR with
the socket code which results in an easy reproducible hard lockup with
debug.mpsafenet=1. This commit does *not* fix the LOR, but enables us to do
so later. The missing piece is to turn the filter locking into a leaf lock
and will follow in a seperate (later) commit.

This will hopefully be MT5'ed in order to fix the problem for RELENG_5 in
forseeable future.

Suggested by:		rwatson
A lot of work by:	csjp (he'd be even more helpful w/o mentor-reviews ;)
Reviewed by:		rwatson, csjp
Tested by:		-pf, -ipfw, LINT, csjp and myself
MFC after:		3 days

LOR IDs:		14 - 17 (not fixed yet)
2004-09-29 04:54:33 +00:00
Max Laier
fa97ea3131 Switch order for mtx_unlock and cv_signal as (condvar(9)) sez:
A thread must hold mp while calling cv_signal(), cv_broadcast(), or
     cv_broadcastpri() even though it isn't passed as an argument.

and is right with this claim.

While here remove a "\" from the macro -> __inline conversion.

Found by:	csjp
MFC after:	4 days
2004-09-22 20:55:56 +00:00
Stefan Farfeleder
e7b80a8e24 Prefer C99's __func__ over GCC's __FUNCTION__. 2004-09-22 17:16:04 +00:00
Brian Feldman
5ed8cedc83 Call sbuf_finish() before sbuf_data() so as to not panic the system. 2004-09-22 12:53:27 +00:00
Brooks Davis
4dcf2bbbff Fix a LOR where ifconf() used copyout while holding a mutex. This LOR
was seen when configuring addresses on interfaces using ifconfig.  This
patch has been verified to work with over eight thousand addresses
assigned to an interface.

LOR id:		031
2004-09-22 08:59:41 +00:00
Brooks Davis
71672bb6f6 Log the renaming of an interface. This should make it easier to follow
kernel log files.
2004-09-18 05:02:08 +00:00
Robert Watson
6874bcf242 Destroy global tapmtx when the if_tap module is unloaded.
RELENG_5 candidated.
2004-09-17 03:55:50 +00:00
Brooks Davis
c859ef977e Fix a LOR where copyout was called while holding a lock.
Reported by:    rwatson
2004-09-15 04:41:56 +00:00
Robert Watson
46448b5a1b Reformulate bpf_dettachd() to acquire the BIF_LOCK() as well as
BPFD_LOCK() when removing a descriptor from an interface descriptor
list.  Hold both over the operation, and do a better job at
maintaining the invariant that you can't find partially connected
descriptors on an active interface descriptor list.

This appears to close a race that resulted in the kernel performing
a NULL pointer dereference when BPF sessions are detached during
heavy network activity on SMP systems.

RELENG_5 candidate.
2004-09-09 04:11:12 +00:00
Robert Watson
4a3feeaa86 Reformulate use of linked lists in 'struct bpf_d' and 'struct bpf_if'
to use queue(3) list macros rather than hand-crafted lists.  While
here, move to doubly linked lists to eliminate iterating lists in
order to remove entries.  This change simplifies and clarifies the
list logic in the BPF descriptor code as a first step towards revising
the locking strategy.

RELENG_5 candidate.

Reviewed by:	fenner
2004-09-09 00:19:27 +00:00
Robert Watson
d17d818425 Compare/set pointers using NULL not 0. 2004-09-09 00:11:50 +00:00
Brooks Davis
55287f2a60 Re-add ifi_epoch, to struct if_data, this time replacing ifi_unused
to avoid ABI changes.  It is set to the last time the interface
counters were zeroed, currently the time if_attach() was called.  It is
intentended to be a valid value for RFC2233's ifCounterDiscontinuityTime
and to make it easier for applications to verify that the interface they
find at a given index is the one that was there last time they looked.

Due to space constraints ifi_epoch is a time_t rather then a struct
timeval.  SNMP would prefer higher precision, but this unlikely to be
useful in practice.
2004-09-08 04:50:55 +00:00
John-Mark Gurney
9b90387dcf don't call f_detach if the filter has alread removed the knote.. This
happens when a proc exits, but needs to inform the user that this has
happened..  This also means we can remove the check for detached from
proc and sig f_detach functions as this is doing in kqueue now...

MFC after:	5 days
2004-09-06 19:02:42 +00:00
Robert Watson
ccaae37ab1 Correct a comment typo: s/Note/Not/.
Pointed out by:	kensmith
2004-09-03 01:37:02 +00:00
Brooks Davis
4ff62bd97b Back out ifi_epoch. The ABI breakage is too disruptive this close to
5-STABLE. ifi_epoch will shortly be reintroduced with less precistion
using the space currently allocated to ifi_unused.
2004-09-02 05:07:29 +00:00
Max Laier
7b21048cea Fix an assertion when if_down()ing a ALTQ managed interface. The lock should
have been in place all the time the mtx_assert in the ALTQ code just
discovered the shortcoming.

PR:		i386/71195
Tested by:	Bettan (PR originator), myself
MFC after:	5 days
2004-09-01 19:56:47 +00:00
Brooks Davis
9e734b4468 Use a spare byte in struct if_data to store the structure size without
increasing it.  Add code to ifconfig to use this size to find the
sockaddr_dl after the struct if_data in the routing message.  This
allows struct if_data to grow (up to 255 bytes) without breaking
ifconfig.

Submitted by:	peter
2004-09-01 18:22:14 +00:00
Brooks Davis
1fc4519b1d Add a new variable, ifi_epoch, to struct if_data. It is set to the last
time the interface counters were zeroed, currently the time if_attach()
was called.  It is indentended to be a valid value for RFC2233's
ifCounterDiscontinuityTime and to make it easier for applications to
verify that the interface they find at a given index is the one that was
there last time they looked.

An if_epoch "compatability" macro has not been created as ifi_epoch has
never been a member of struct ifnet.

Approved by:	andre, bms, wollman
2004-08-30 06:29:26 +00:00
Yaroslav Tykhiy
b9803f29dd Use an ANSI-style definition for slstart()
in accord with the rest of the file.
2004-08-30 04:48:52 +00:00
Yaroslav Tykhiy
ecfb8f3f7b Grant the poor old SLIP driver with an if_start handler
so that it becomes happy and no longer panics the system
upon getting the very first packet to transmit.

Reported and tested by:	Igor Timkin <ivt@gamma.ru>
Reviewed by:		rwatson
MFC after:		5 days
2004-08-30 04:32:52 +00:00
Robert Watson
ace437c3c6 Correct typo in printf() warning.
Submitted by:	Pawel Worach <pawel.worach at telia.com>
2004-08-28 19:27:25 +00:00
Robert Watson
1d8cd39e71 Change the default disposition of debug.mpsafenet from 0 to 1, which
will cause the network stack to operate without the Giant lock by
default.  This change has the potential to improve performance by
increasing parallelism and decreasing latency in network processing.

Due to the potential exposure of existing or new bugs, the following
compatibility functionality is maintained:

- It is still possible to disable Giant-free operation by setting
  debug.mpsafenet to 0 in loader.conf.

- Add "options NET_WITH_GIANT", which will restore the default value of
  debug.mpsafenet to 0, and is intended for use on systems compiled with
  known unsafe components, or where a more conservative configuration is
  desired.

- Add a new declaration, NET_NEEDS_GIANT("componentname"), which permits
  kernel components to declare dependence on Giant over the network
  stack.  If the declaration is made by a preloaded module or a compiled
  in component, the disposition of debug.mpsafenet will be set to 0 and
  a warning concerning performance degraded operation printed to the
  console.  If it is declared by a loadable kernel module after boot, a
  warning is displayed but the disposition cannot be changed.  This is
  implemented by defining a new SYSINIT() value, SI_SUB_SETTINGS, which
  is intended for the processing of configuration choices after tunables
  are read in and the console is available to generate errors, but
  before much else gets going.

This compatibility behavior will go away when we've finished the last
of the locking work and are confident that operation is correct.
2004-08-28 15:11:13 +00:00
Brooks Davis
b9907cd45b When detaching an interface, don't leave an obsolete pointer to the
soon to be deleted struct ifnet around.

PR:		kern/52260
MFC After:	3 days
2004-08-27 19:42:40 +00:00
Andre Oppermann
3161f583ca Apply error and success logic consistently to the function netisr_queue() and
its users.

netisr_queue() now returns (0) on success and ERRNO on failure.  At the
moment ENXIO (netisr queue not functional) and ENOBUFS (netisr queue full)
are supported.

Previously it would return (1) on success but the return value of IF_HANDOFF()
was interpreted wrongly and (0) was actually returned on success.  Due to this
schednetisr() was never called to kick the scheduling of the isr.  However this
was masked by other normal packets coming through netisr_dispatch() causing the
dequeueing of waiting packets.

PR:		kern/70988
Found by:	MOROHOSHI Akihiko <moro@remus.dti.ne.jp>
MFC after:	3 days
2004-08-27 18:33:08 +00:00
Andre Oppermann
c21fd23260 Always compile PFIL_HOOKS into the kernel and remove the associated kernel
compile option.  All FreeBSD packet filters now use the PFIL_HOOKS API and
thus it becomes a standard part of the network stack.

If no hooks are connected the entire packet filter hooks section and related
activities are jumped over.  This removes any performance impact if no hooks
are active.

Both OpenBSD and DragonFlyBSD have integrated PFIL_HOOKS permanently as well.
2004-08-27 15:16:24 +00:00
Robert Watson
d4e02af583 Revert previous revision, 1.7, as removal of GIANT_REQUIRED was made
in the wrong branch (and hence to the wrong function).
2004-08-24 14:17:58 +00:00
Robert Watson
b84209fbec MT4 if_fwsubr.c:1.6:
date: 2004/08/22 14:48:55;  author: rwatson;  state: Exp;  lines: +0 -2
  Don't need to assert Giant in fw_output(), only in the firewire start
  routine.

Approved by:	re (scottl)
2004-08-24 14:16:08 +00:00
Peter Pentchev
18aee723a3 Fix a typo (attacked -> attached).
Approved by:	sam
2004-08-24 08:47:15 +00:00
Robert Watson
6063b5f0ad Style update: use newer style function prototypes in if_sl.c in
prep for merging locking.
2004-08-22 21:32:52 +00:00
Robert Watson
201a36deca Don't need to assert Giant in fw_output(), only in the firewire start
routine.
2004-08-22 14:48:55 +00:00
Robert Watson
b062951a3d If a tunable for the routing socket netisr queue max is defined, allow it
to override the default value, rather than the default value overriding
the tunable.
2004-08-21 21:45:40 +00:00
Robert Watson
190a4c9436 Allow the size of the routing socket netisr queue to be configured using
the tunable or sysctl 'net.route.netisr_maxqlen'.  Default the maximum
depth to 256 rather than IFQ_MAXLEN due to the downsides of dropping
routing messages.

MT5 candidate.

Discussed with:	mdodd, mlaier, Vincent Jardin <jardin at 6wind.com>
2004-08-21 21:20:06 +00:00
Christian S.J. Peron
5090559b7f When a prison is given the ability to create raw sockets (when the
security.jail.allow_raw_sockets sysctl MIB is set to 1) where privileged
access to jails is given out, it is possible for prison root to manipulate
various network parameters which effect the host environment. This commit
plugs a number of security holes associated with the use of raw sockets
and prisons.

This commit makes the following changes:

- Add a comment to rtioctl warning developers that if they add
  any ioctl commands, they should use super-user checks where necessary,
  as it is possible for PRISON root to make it this far in execution.
- Add super-user checks for the execution of the SIOCGETVIFCNT
  and SIOCGETSGCNT IP multicast ioctl commands.
- Add a super-user check to rip_ctloutput(). If the calling cred
  is PRISON root, make sure the socket option name is IP_HDRINCL,
  otherwise deny the request.

Although this patch corrects a number of security problems associated
with raw sockets and prisons, the warning in jail(8) should still
apply, and by default we should keep the default value of
security.jail.allow_raw_sockets MIB to 0 (or disabled) until
we are certain that we have tracked down all the problems.

Looking forward, we will probably want to eliminate the
references to curthread.

This may be a MFC candidate for RELENG_5.

Reviewed by:	rwatson
Approved by:	bmilekic (mentor)
2004-08-21 17:38:57 +00:00
Andre Oppermann
9b932e9e04 Convert ipfw to use PFIL_HOOKS. This is change is transparent to userland
and preserves the ipfw ABI.  The ipfw core packet inspection and filtering
functions have not been changed, only how ipfw is invoked is different.

However there are many changes how ipfw is and its add-on's are handled:

 In general ipfw is now called through the PFIL_HOOKS and most associated
 magic, that was in ip_input() or ip_output() previously, is now done in
 ipfw_check_[in|out]() in the ipfw PFIL handler.

 IPDIVERT is entirely handled within the ipfw PFIL handlers.  A packet to
 be diverted is checked if it is fragmented, if yes, ip_reass() gets in for
 reassembly.  If not, or all fragments arrived and the packet is complete,
 divert_packet is called directly.  For 'tee' no reassembly attempt is made
 and a copy of the packet is sent to the divert socket unmodified.  The
 original packet continues its way through ip_input/output().

 ipfw 'forward' is done via m_tag's.  The ipfw PFIL handlers tag the packet
 with the new destination sockaddr_in.  A check if the new destination is a
 local IP address is made and the m_flags are set appropriately.  ip_input()
 and ip_output() have some more work to do here.  For ip_input() the m_flags
 are checked and a packet for us is directly sent to the 'ours' section for
 further processing.  Destination changes on the input path are only tagged
 and the 'srcrt' flag to ip_forward() is set to disable destination checks
 and ICMP replies at this stage.  The tag is going to be handled on output.
 ip_output() again checks for m_flags and the 'ours' tag.  If found, the
 packet will be dropped back to the IP netisr where it is going to be picked
 up by ip_input() again and the directly sent to the 'ours' section.  When
 only the destination changes, the route's 'dst' is overwritten with the
 new destination from the forward m_tag.  Then it jumps back at the route
 lookup again and skips the firewall check because it has been marked with
 M_SKIP_FIREWALL.  ipfw 'forward' has to be compiled into the kernel with
 'option IPFIREWALL_FORWARD' to enable it.

 DUMMYNET is entirely handled within the ipfw PFIL handlers.  A packet for
 a dummynet pipe or queue is directly sent to dummynet_io().  Dummynet will
 then inject it back into ip_input/ip_output() after it has served its time.
 Dummynet packets are tagged and will continue from the next rule when they
 hit the ipfw PFIL handlers again after re-injection.

 BRIDGING and IPFW_ETHER are not changed yet and use ipfw_chk() directly as
 they did before.  Later this will be changed to dedicated ETHER PFIL_HOOKS.

More detailed changes to the code:

 conf/files
	Add netinet/ip_fw_pfil.c.

 conf/options
	Add IPFIREWALL_FORWARD option.

 modules/ipfw/Makefile
	Add ip_fw_pfil.c.

 net/bridge.c
	Disable PFIL_HOOKS if ipfw for bridging is active.  Bridging ipfw
	is still directly invoked to handle layer2 headers and packets would
	get a double ipfw when run through PFIL_HOOKS as well.

 netinet/ip_divert.c
	Removed divert_clone() function.  It is no longer used.

 netinet/ip_dummynet.[ch]
	Neither the route 'ro' nor the destination 'dst' need to be stored
	while in dummynet transit.  Structure members and associated macros
	are removed.

 netinet/ip_fastfwd.c
	Removed all direct ipfw handling code and replace it with the new
	'ipfw forward' handling code.

 netinet/ip_fw.h
	Removed 'ro' and 'dst' from struct ip_fw_args.

 netinet/ip_fw2.c
	(Re)moved some global variables and the module handling.

 netinet/ip_fw_pfil.c
	New file containing the ipfw PFIL handlers and module initialization.

 netinet/ip_input.c
	Removed all direct ipfw handling code and replace it with the new
	'ipfw forward' handling code.  ip_forward() does not longer require
	the 'next_hop' struct sockaddr_in argument.  Disable early checks
	if 'srcrt' is set.

 netinet/ip_output.c
	Removed all direct ipfw handling code and replace it with the new
	'ipfw forward' handling code.

 netinet/ip_var.h
	Add ip_reass() as general function.  (Used from ipfw PFIL handlers
	for IPDIVERT.)

 netinet/raw_ip.c
	Directly check if ipfw and dummynet control pointers are active.

 netinet/tcp_input.c
	Rework the 'ipfw forward' to local code to work with the new way of
	forward tags.

 netinet/tcp_sack.c
	Remove include 'opt_ipfw.h' which is not needed here.

 sys/mbuf.h
	Remove m_claim_next() macro which was exclusively for ipfw 'forward'
	and is no longer needed.

Approved by:	re (scottl)
2004-08-17 22:05:54 +00:00
John-Mark Gurney
ad3b9257c2 Add locking to the kqueue subsystem. This also makes the kqueue subsystem
a more complete subsystem, and removes the knowlege of how things are
implemented from the drivers.  Include locking around filter ops, so a
module like aio will know when not to be unloaded if there are outstanding
knotes using it's filter ops.

Currently, it uses the MTX_DUPOK even though it is not always safe to
aquire duplicate locks.  Witness currently doesn't support the ability
to discover if a dup lock is ok (in some cases).

Reviewed by:	green, rwatson (both earlier versions)
2004-08-15 06:24:42 +00:00
Robert Watson
3b7d076fe7 Use IFQ_SET_MAXLEN() to set the maximum queue depth of the routing
socket netisr queue.

Pointed out by:	winter
2004-08-13 22:23:21 +00:00
Tony Ackerman
b59db7bbe8 Added two new media types for 10GBASE-SR and 10GBASE-LR 2004-08-12 23:48:26 +00:00
Andre Oppermann
2dc1d58164 Convert the routing table to use an UMA zone for rtentries. The zone is
called "rtentry".

This saves a considerable amount of kernel memory.  R_Zmalloc previously
used 256 byte blocks (plus kmalloc overhead) whereas UMA only needs 132
bytes.

Idea from:	OpenBSD
2004-08-11 17:26:56 +00:00
Maksim Yevmenkin
285b72aa78 Set IFF_RUNNING flag on the interface as soon as the control device is opened. 2004-08-11 00:12:27 +00:00
Max Laier
de0332d4fa Add a "void *if_carp" placeholder to struct ifnet with prospect to bring in
the "Common address redundancy protocol" (CARP) during the 5-STABLE cycle.
Hence doing the ABI break now.

Approved by:	re (scottl)
2004-08-07 09:32:04 +00:00
Robert Watson
ebcd28e669 As SLIP directly accesses the tty code from its if_start() routine,
mark if_sl as IFF_NEEDSGIANT.
2004-08-06 22:41:13 +00:00
Peter Pentchev
3f35d5150b Do not attempt to clean up data that has not been initialized yet.
This fixes two kernel panics on boot when the xl driver fails to
allocate bus/port/memory resources.

Reviewed by:	silence on -net
2004-08-06 09:08:33 +00:00
Maxim Sobolev
97c4cd9853 Set ip_v field properly.
PR:	kern/69957
2004-08-05 08:12:46 +00:00
Robert Watson
46691dd8d7 Do a lockless read of the BPF interface structure descriptor list head
before grabbing BPF locks to see if there are any entries in order to
avoid the cost of locking if there aren't any.  Avoids a mutex lock/
unlock for each packet received if there are no BPF listeners.
2004-08-05 02:37:36 +00:00
Alexander Kabaev
445e045b0d Avoid casts as lvalues. 2004-07-28 06:59:55 +00:00
Alexander Kabaev
a0ec13c419 Initialize ; variable eraly to shut up GCC warning. 2004-07-28 06:48:36 +00:00
Robert Watson
af5e59bf28 Add a new network interface flag, IFF_NEEDSGIANT, which will allow
device drivers to declare that the ifp->if_start() method implemented
by the driver requires Giant in order to operate correctly.

Add a 'struct task' to 'struct ifnet' that can be used to execute a
deferred ifp->if_start() in the event that if_start needs to be called
in a Giant-free environment.  To do this, introduce if_start(), a
wrapper function for ifp->if_start().  If the interface can run MPSAFE,
it directly dispatches into the interface start routine.  If it can't
run MPSAFE, we're running with debug.mpsafenet != 0, and Giant isn't
currently held, the task is queued to execute in a swi holding Giant
via if_start_deferred().

Modify if_handoff() to use if_start() instead of direct dispatch.
Modify 802.11 to use if_start() instead of direct dispatch.

This is intended to provide increased compatibility for non-MPSAFE
network device drivers in the presence of Giant-free operation via
asynchronous dispatch.  However, this commit does not mark any network
interfaces as IFF_NEEDSGIANT.
2004-07-27 23:20:45 +00:00
Yaroslav Tykhiy
d6fcfb7ae1 Stop tinkering with the parent's VLAN_MTU capability.
Now it is user-controlled through ifconfig(8).

The former ``automagic'' way of operation created more
trouble than good.  First, VLAN_MTU consumers other than
vlan(4) had appeared, e.g., ng_vlan(4).  Second, there was
no way to disable VLAN_MTU manually if it were causing
trouble, e.g., data corruption.

Dropping the ``automagic'' should be completely invisible
to the user since
a) all the drivers supporting VLAN_MTU
have it enabled by default, and in the first place
b) there is only one driver that can really toggle VLAN_MTU
in the hardware under its control (it's fxp(4), to which
I added VLAN_MTU controls to illustrate the principle.)
2004-07-26 14:46:04 +00:00
Robert Watson
572bde2aea Prefer NULL to '0' when checking a pointer value. 2004-07-24 16:58:56 +00:00
Brooks Davis
b4e9f8379e Actually free the unit when destroying the interface.
Reported by:	la at delfi.lt
Tested by:	la at delfi.lt
PR:		68618
2004-07-22 22:50:15 +00:00
Max Laier
ca64c799d4 When removing the last reference to a cloner, do not try to unlock twice -
esp. not since the backing memory was just freed.

Reviewed by:	rwatson
2004-07-20 21:44:28 +00:00
Robert Watson
08f85b089e Comment clarifying debug_mpsafenet. 2004-07-18 21:50:22 +00:00
Robert Watson
8bbfdc98e4 Gratuitous whitespace change to un-wrap a short line. 2004-07-18 19:53:35 +00:00
Poul-Henning Kamp
672c05d49c Preparation commit for the tty cleanups that will follow in the near
future:

rename ttyopen() -> tty_open() and ttyclose() -> tty_close().

We need the ttyopen() and ttyclose() for the new generic cdevsw
functions for tty devices in order to have consistent naming.
2004-07-15 20:47:41 +00:00
Poul-Henning Kamp
3e019deaed Do a pass over all modules in the kernel and make them return EOPNOTSUPP
for unknown events.

A number of modules return EINVAL in this instance, and I have left
those alone for now and instead taught MOD_QUIESCE to accept this
as "didn't do anything".
2004-07-15 08:26:07 +00:00
Max Laier
bfe4641596 Fix a copy-and-paste-o in IFQ_DRV_PREPEND - all pointyhats to me.
While here also fix a (not less stupid) braino in IFQ_DRV_PURGE.

Reported-by:	clement
Tested-by:	clement (_PREPEND in sis(4))
2004-07-14 13:31:41 +00:00
Robert Watson
efe0ab01b2 Convert SLIP to using C99 structure initialization for its struct
linesw.
2004-07-14 05:01:40 +00:00
Bruce M Simpson
086e98c437 Use ETHER_IS_MULTICAST() consistently in ether_resolvemulti().
Reviewed by:	jmallett
2004-07-09 05:26:27 +00:00
Bruce M Simpson
ca28620f0d Use M_ZERO instead of bzero(). 2004-07-06 03:34:16 +00:00
Bruce M Simpson
9b3d77e7c9 Be consistent and use bzero() instead of memset(). 2004-07-06 03:29:41 +00:00
Bruce M Simpson
b3c9a01e5e Use M_ZERO instead of memset() (!). 2004-07-06 03:28:24 +00:00
Bruce M Simpson
e1a8c3dc33 Use M_ZERO instead of bzero(). 2004-07-06 03:26:26 +00:00
Bruce M Simpson
60323f48bd Replace a bzero() after malloc() with M_ZERO. 2004-07-06 03:16:55 +00:00
Bruce M Simpson
832cb4aef7 Style. 2004-07-06 03:07:50 +00:00
Robert Watson
28b8605232 In the BPF and ethernet bridging code, don't allow callouts to execute
without Giant if we're not debug.mpsafenet=1.
2004-07-05 16:28:31 +00:00
Bruce M Simpson
29c2dfbe32 Workaround a locking problem in vlan(4). vlan_setmulti() may be called
with sleepable locks held from further up in the network stack, and
attempts to allocate memory to hold multicast group membership information
with M_WAITOK.

This panic was triggered specifically when an exiting routing daemon
process closes its raw sockets after joining multicast groups on them.

While we're here, comment some possible locking badness.

PR:	kern/48560
2004-07-04 18:32:54 +00:00
Bruce M Simpson
15a66c21c0 style(9)/whitespace cleanup while I'm in this file. 2004-07-04 16:43:24 +00:00
Bruce M Simpson
4c9e94d42c The net.link.ether.bridge.enable sysctl MIB variable enables bridge
functionality by setting to a non-zero value. This is an integer, but
is treated as a boolean by the code, so clamp it to a boolean value
when set so as to avoid unnecessary bridge reinitialization if it's
changed to another value.

PR:		kern/61174
Requested by:	Bruce Cran
2004-07-04 15:53:28 +00:00
Brooks Davis
f93dfa28b1 Don't announce the ethernet address when it's 00:00:00:00:00:00. It's
not of any interest.  This primairly happens when vlan(4) interfaces are
created.
2004-07-02 19:44:59 +00:00
Max Laier
7929aa036c Bring in the first chunk of altq driver modifications. This covers the
following drivers: bfe(4), em(4), fxp(4), lnc(4), tun(4), de(4) rl(4),
sis(4) and xl(4)

More patches are pending on: http://peoples.freebsd.org/~mlaier/ Please take
a look and tell me if "your" driver is missing, so I can fix this.

Tested-by:	many
No-objection:	-current, -net
2004-07-02 12:16:02 +00:00
Roman Kurakin
e874bf6648 Do not m_free packet since IF_HANDOFF (called from netisr_queue) will
do it for us, just count it.
2004-06-28 15:32:24 +00:00
Pawel Jakub Dawidek
0a44517d3a Those are unneeded too. 2004-06-27 09:06:10 +00:00
Pawel Jakub Dawidek
46e3b1cbe7 Add two missing includes and remove two uneeded.
This is quite serious fix, because even with MAC framework compiled in,
MAC entry points in those two files were simply ignored.
2004-06-27 09:03:22 +00:00
Poul-Henning Kamp
cb9ea5f4cb Pick the hotchar out of the tty structure instead of caching private
copies.

No current line disciplines have a dynamically changing hotchar, and
expecting to receive anything sensible during a change in ldisc is
insane so no locking of the hotchar field is necessary.
2004-06-26 09:20:07 +00:00
Poul-Henning Kamp
4776c07426 Fix line discipline switching issues: If opening a new ldisc fails,
we have to revert to TTYDISC which we know will successfully open
rather than try the previous ldisc which might also fail to open.

Do not let ldisc implementations muck about with ->t_line, and remove
code which checks for reopens, it should never happen.

Move ldisc->l_hotchar to tty->t_hotchar and have ldisc implementation
initialize it in their open routines.  Reset to zero when we enter
TTYDISC.  ("no" should really be -1 since zero could be a valid
hotchar for certain old european mainframe protocols.)
2004-06-26 08:44:04 +00:00
Roman Kurakin
1127aac31e Do not count loobacks as other fuilures.
As a result magic will not be rejected any more in case of loopback.

Discussed with:	joerg@
2004-06-25 10:25:33 +00:00
Joerg Wunsch
b46f884b80 Add a couple of #ifdef DEBUG printf()s in vlan_input() I found to be
useful when debugging the ether_demux() problem (when bridging over
VLANs).
2004-06-24 12:32:41 +00:00
Joerg Wunsch
cd0cd0149b When considering an ethernet frame that is not destined for us, do not
only allow this to be further processed when bridging is active on
that interface, but also if the current packet has a VLAN tag and
VLANs are active on our interface.  This gives the VLAN layers a
chance to also consider the packet (and perhaps drop it instead of the
main dispatcher).

This fixes a situation where bridging was only active on VLAN
interfaces but ether_demux() called on behalf of the main interface
had already thrown the packet away.

MFC after:	4 weeks
2004-06-24 12:31:44 +00:00
Dag-Erling Smørgrav
d7647d966e Make dependencies on the TCP/IP stack conditional on INET / INET6. This
makes it possible to build a kernel with NIC drivers but no TCP/IP stack.

Sponsored by:	Teleplan AS
2004-06-24 10:58:08 +00:00
Brooks Davis
f889d2ef8d Major overhaul of pseudo-interface cloning. Highlights include:
- Split the code out into if_clone.[ch].
 - Locked struct if_clone. [1]
 - Add a per-cloner match function rather then simply matching names of
   the form <name><unit> and <name>.
 - Use the match function to allow creation of <interface>.<tag>
   vlan interfaces.  The old way is preserved unchanged!
 - Also the match function to allow creation of stf(4) interfaces named
   stf0, stf, or 6to4.  This is the only major user visible change in
   that "ifconfig stf" creates the interface stf rather then stf0 and
   does not print "stf0" to stdout.
 - Allow destroy functions to fail so they can refuse to delete
   interfaces.  Currently, we forbid the deletion of interfaces which
   were created in the init function, particularly lo0, pflog0, and
   pfsync0.  In the case of lo0 this was a panic implementation so it
   does not count as a user visiable change. :-)
 - Since most interfaces do not need the new functionality, an family of
   wrapper functions, ifc_simple_*(), were created to wrap old style
   cloner functions.
 - The IF_CLONE_INITIALIZER macro is replaced with a new incompatible
   IFC_CLONE_INITIALIZER and ifc_simple consumers use IFC_SIMPLE_DECLARE
   instead.

Submitted by:   Maurycy Pawlowski-Wieronski <maurycy at fouk.org> [1]
Reviewed by:    andre, mlaier
Discussed on:	net
2004-06-22 20:13:25 +00:00
Mark Murray
3410878421 Give zlib the ability to be a module that can be depended on,
in the MODULE_DEPEND() sense.
2004-06-20 17:42:35 +00:00
Bruce Evans
7a637a637e Include <sys/_lock.h>'s prerequisite <sys/queue.h> before including the
former, not after.

Don't hide this bug by including <sys/queue.h> in <sys/_lock.h>.
2004-06-19 14:58:35 +00:00
Poul-Henning Kamp
f3732fd15b Second half of the dev_t cleanup.
The big lines are:
	NODEV -> NULL
	NOUDEV -> NODEV
	udev_t -> dev_t
	udev2dev() -> findcdev()

Various minor adjustments including handling of userland access to kernel
space struct cdev etc.
2004-06-17 17:16:53 +00:00
Poul-Henning Kamp
89c9c53da0 Do the dreaded s/dev_t/struct cdev */
Bump __FreeBSD_version accordingly.
2004-06-16 09:47:26 +00:00
Max Laier
affc907d0c Replace IF_HANDOFF with new IFQ_HANDOFF to enqueue with ALTQ once enabled on
the respective drivers.
2004-06-15 23:57:42 +00:00
Robert Watson
730262cdf7 Lock down rawcb_list, a global list of control blocks for raw sockets,
using rawcb_mtx.  Hold this mutex while modifying or iterating over
the control list; this means that the mutex is held over calls into
socket delivery code, which no longer causes a lock order reversal as
the routing socket code uses a netisr to avoid recursing socket ->
routing -> socket.

Note: Locking of IPsec consumers of rawcb_list is not included in this
commit.
2004-06-15 04:13:59 +00:00
Max Laier
62d7f46e88 Fix a typeo in IFQ_HANDOFF. 2004-06-15 03:40:39 +00:00
Max Laier
4cb655c020 Transform tbr_dequeue into a function pointer in order to build drivers with
ALTQ enabled versions of IFQ_* macros by default, as requested by serveral
others. This is a follow-up to the quick fix I committed yesterday which
turned off the ALTQ checks for non-ALTQ kernels.
2004-06-15 01:45:19 +00:00
Doug Rabson
941d37182e Fix big-endian build. 2004-06-14 08:17:51 +00:00
Max Laier
930e2cfa1f Unbreak non-ALTQ kernel linking. I forgot about tbr_dequeue.
In the end drivers should be building with ALTQ checks by default, but for
now build them with the old macros for non-ALTQ kernels.

Note: Check new features w/ LINT *and* w/ LINT minus the new feature.

Found-by:	rwatson
2004-06-14 03:55:09 +00:00
Doug Rabson
eedccad06a Add MAC framework bits to the output path. 2004-06-13 19:55:16 +00:00
Doug Rabson
d9eb70ad37 Remove advertising clause. 2004-06-13 19:15:44 +00:00
Max Laier
02b199f158 Link ALTQ to the build and break with ABI for struct ifnet. Please recompile
your (network) modules as well as any userland that might make sense of
sizeof(struct ifnet).
This does not change the queueing yet. These changes will follow in a
seperate commit. Same with the driver changes, which need case by case
evaluation.

__FreeBSD_version bump will follow.

Tested-by:	(i386)LINT
2004-06-13 17:29:10 +00:00
Doug Rabson
b8b3323469 Add a new driver to support IP over firewire. This driver is intended to
conform to the rfc2734 and rfc3146 standard for IP over firewire and
should eventually supercede the fwe driver. Right now the broadcast
channel number is hardwired and we don't support MCAP for multicast
channel allocation - more infrastructure is required in the firewire
code itself to fix these problems.
2004-06-13 10:54:36 +00:00
Robert Watson
395a08c904 Extend coverage of SOCK_LOCK(so) to include so_count, the socket
reference count:

- Assert SOCK_LOCK(so) macros that directly manipulate so_count:
  soref(), sorele().

- Assert SOCK_LOCK(so) in macros/functions that rely on the state of
  so_count: sofree(), sotryfree().

- Acquire SOCK_LOCK(so) before calling these functions or macros in
  various contexts in the stack, both at the socket and protocol
  layers.

- In some cases, perform soisdisconnected() before sotryfree(), as
  this could result in frobbing of a non-present socket if
  sotryfree() actually frees the socket.

- Note that sofree()/sotryfree() will release the socket lock even if
  they don't free the socket.

Submitted by:	sam
Sponsored by:	FreeBSD Foundation
Obtained from:	BSD/OS
2004-06-12 20:47:32 +00:00
Robert Watson
935becd8dd Constify raw_sendspace and raw_recvspace, as they're not mutable. 2004-06-11 03:52:56 +00:00
Robert Watson
b8f9429d55 Switch to conditionally acquiring and dropping Giant around calls into
ifp->if_output() basedd on debug.mpsafenet.  That way once bpfwrite()
can be called without Giant, it will acquire Giant (if desired) before
entering the network stack.
2004-06-11 03:47:21 +00:00
Robert Watson
8240bf1e04 Un-staticize 'dst' sockaddr in the stack of bpfwrite() to prevent
the need to synchronize access to the structure.  I believe this
should fit into the stack under the necessary circumstances, but
if not we can either add synchronization or use a thread-local
malloc for the duration.
2004-06-11 03:45:42 +00:00
Robert Watson
d989c7b389 Introduce a netisr to deliver kernel-generated routing, avoiding
recursive entering of the socket code from the routing code:

- Modify rt_dispatch() to bundle up the sockaddr family, if any,
  associated with a pending mbuf to dispatch to routing sockets, in
  an m_tag on the mbuf.

- Allocate NETISR_ROUTE for use by routing sockets.

- Introduce rtsintrq, an ifqueue to be used by the netisr, and
  introduce rts_input(), a function to unbundle the tagged sockaddr
  and inject the mbuf and address into raw_input(), which previously
  occurred in rt_dispatch().

- Introduce rts_init() to initialize rtsintrq, its mutex, and
  register the netisr.  Perform this at the same point in system
  initialization as setup of the domains.

This change introduces asynchrony between the generation of a
pending routing socket message and delivery to sockets for use
by userspace.  It avoids socket->routing->rtsock->socket use and
helps to avoid lock order reversals between the routing code and
socket code (in particular, raw socket control blocks), as route
locks are held over calls to rt_dispatch().

Reviewed by:		"George V.Neville-Neil" <gnn@neville-neil.com>
Conceptual head nod by:	sam
2004-06-09 02:48:23 +00:00
Poul-Henning Kamp
3786c125c7 Use ldisc_[de]register() instead of frobbing linesw[] directly. 2004-06-07 20:43:37 +00:00
Christian Weisgerber
16b4a34316 Add helper functions to calculate the standard ethernet CRC in
little/big endian fashion, so that network drivers can just reference
the standard implementation and don't have to bring their own.

As discussed on arch@.

Obtained from:	NetBSD
2004-06-02 21:34:14 +00:00
Poul-Henning Kamp
5dba30f15a add missing #include <sys/module.h> 2004-05-30 20:27:19 +00:00
Poul-Henning Kamp
41ee9f1c69 Add some missing <sys/module.h> includes which are masked by the
one on death-row in <sys/kernel.h>
2004-05-30 17:57:46 +00:00
David Malone
bde800e688 Make the comment for DLT_NULL slightly more accurate.
PR:		62272
Submitted by:	Radim Kolar <hsn@netmag.cz>
MFC after:	1 week
2004-05-30 17:03:48 +00:00
Yaroslav Tykhiy
6cbd3e99ec if_printf() won't emit a newline unless told to. 2004-05-26 11:41:26 +00:00
Roman Kurakin
9105841d31 Keepalive timer should be added if we does not have any sppp consumers before
and should be deleted if we do not have any anymore.
2004-05-25 21:54:07 +00:00
Yaroslav Tykhiy
656acce4f4 After all the relevant drivers have been fixed, fix vlan(4) itself
WRT manipulating capabilities of the parent interface:

- use ioctl(SIOCSIFCAP) to toggle VLAN_MTU (the way that was done
  before was just wrong);

- use the right order of conditional clauses to set the MTU fudge
  (that is logically independent from toggling VLAN_MTU.)
2004-05-25 14:30:12 +00:00
Maxime Henrion
7131aeaea1 Remove another redundant if_output initialization. 2004-05-24 11:01:45 +00:00
Yaroslav Tykhiy
b08347a005 Consult parent's if_capenable for active VLAN-related capabilities.
This change is possible since all the relevant drivers have been
fixed to set if_capenable properly.  The field if_capabilities tracks
supported capabilities, which may be disabled administratively.

Inheriting checksum offload support from the parent interface isn't
that easy because the checksumming capabilities of the parent may be
toggled on the fly.  Disable the code for now.
2004-05-23 22:32:15 +00:00