Commit Graph

1662 Commits

Author SHA1 Message Date
Hajimu UMEMOTO
529ed56f83 don't see NBPFILTER. 2005-01-11 07:17:33 +00:00
Hajimu UMEMOTO
2d106a00c9 remove HAVE_OLD_BPF part. 2005-01-11 07:14:37 +00:00
Hajimu UMEMOTO
4b9a5e9f07 we are not OLD_BPF system. 2005-01-11 07:08:15 +00:00
Hajimu UMEMOTO
9b1a707635 fix typo. 2005-01-11 07:05:56 +00:00
Gleb Smirnoff
1c7899c74e This change adds reliability for Ethernet trunks built with ng_one2many:
- Introduce another ng_ether(4) callback ng_ether_link_state_p, which
  is called from if_link_state_change(), every time link is changed.
- In ng_ether_link_state() send netgraph control message notifying
  of link state change to a node connected to "lower" hook.

Reviewed by:	sam
MFC after:	2 weeks
2005-01-08 12:42:03 +00:00
Warner Losh
c398230b64 /* -> /*- for license, minor formatting changes 2005-01-07 01:45:51 +00:00
Roman Kurakin
d676cb6fad Add FR support to sppp (MFCronyx).
Silence on: net@, current@, hackers@.
No objections: joerg

Requested by: by many (mostly Cronyx) users for a long long time.
MFC after:	10 days

PR:		kern/21771, kern/66348
2004-12-28 00:07:57 +00:00
Pawel Jakub Dawidek
77fc70c1ef Fix mbuf leak.
Submitted by:	Johnny Eriksson <bygg@cafax.se>
MFC after:	5 days
2004-12-27 15:53:44 +00:00
Poul-Henning Kamp
f62f3a1121 Include fcntl.h
Include selinfo.h (don't rely on vnode.h to do so)
Check O_NONBLOCK instead of IO_NELAY
Don't include vnode.h
2004-12-22 17:39:21 +00:00
Poul-Henning Kamp
9eaed5e66e Don't include filedesc.h
Include fcntl.h
Include selinfo.h (don't rely on vnode.h to do so)
Check O_NONBLOCK instead of IO_NDELAY
Don't include vnode.h
2004-12-22 17:38:43 +00:00
Poul-Henning Kamp
e76eee5562 Include fcntl.h
Check O_NONBLOCK instead of IO_NDELAY
Include uio.h
Don't include vnode.h
Don't include filedesc.h
2004-12-22 17:37:57 +00:00
Poul-Henning Kamp
27d7317dda Check O_NONBLOCK instead of IO_NDELAY.
Don't include <sys/vnode.h>
2004-12-22 17:32:53 +00:00
John-Mark Gurney
86c9a45388 don't try to recurse on the bpf lock.. kqueue already locks the bpf lock
now...

Submitted by:	Ed Maste of Sandvine Inc.
MFC after:	1 week
2004-12-17 03:21:46 +00:00
Roman Kurakin
1fd90fb4a0 Kill double inclusion for <netinet/in.h> and <netinet/in_systm.h>. 2004-12-14 18:18:54 +00:00
Roman Kurakin
e42ddbdf64 Make sppp MPSAFE.
MPSAFE could be turned off by IFF_NEEDSGIANT.

Silence on: net@, current@, hackers@.
No objections: joerg
2004-12-12 14:54:15 +00:00
Sam Leffler
94f5c9cfc0 Cleanup link state change notification:
o add new if_link_state_change routine that deals with link state changes
o change mii to use if_link_state_change
2004-12-08 05:45:59 +00:00
Sam Leffler
3518d22073 Don't require a device to be marked up when issuing BIOCSETIF. 2004-12-08 05:40:02 +00:00
Max Laier
69fb23b73d Implement the check I was talking about in the previous message already.
Introduce domain_init_status to keep track of the init status of the domains
list (surprise). 0 = uninitialized, 1 = initialized/unpopulated, 2 =
initialized/done. Higher values can be used to support late addition of
domains which right now "works", but is potential dangerous. I choose to
only give a warning when doing so.

Use domain_init_status with if_attachdomain[1]() to ensure that we have a
complete domains list when we init the if_afdata array. Store the current
value of domain_init_status in if_afdata_initialized. This way we can update
if_afdata after a new protocol has been added (once that is allowed).

Submitted by:	se (with changes)
Reviewed by:	julian, glebius, se
PR:		kern/73321	(partly)
2004-11-30 22:38:37 +00:00
Robert Watson
6237419d5c Assign if_broadcastaddr to NULL not 0 in if_attach().
Printf() a warning if if_attachdomain() is called more than once on an
  interface to generate some noise on mailing lists when this occurs.

Fix up style in if_start(), where spaces crept in instead of tabs at
some point.

MFC after:	1 week
MFC note:	Not the printf().
2004-11-23 23:31:33 +00:00
John-Mark Gurney
1f48dc25d7 sync comment on IFF_OACTIVE with reality.. IFF_OACTIVE is set when the
hardware cannot take anymore packets, and so will supress the calling of
the device's if_start method...

Submitted by:	bde
2004-11-17 18:32:44 +00:00
Max Laier
0b39ef4db1 Remove the #if 0 wrapping around !ALTQ stuff that can't be used due to ABI
stability anyway.
2004-11-09 21:29:28 +00:00
Poul-Henning Kamp
756d52a195 Initialize struct pr_userreqs in new/sparse style and fill in common
default elements in net_init_domain().

This makes it possible to grep these structures and see any bogosities.
2004-11-08 14:44:54 +00:00
Olivier Houchard
943efa1bd1 Don't abuse tp->t_sc in sl(4) either. 2004-11-07 14:36:47 +00:00
Olivier Houchard
7358f4bb52 Don't abuse tp->t_sc, as it is now used by tty drivers.
This fixes the panic that occurs when using ppp(4)

Reported and tested by:	Yann Berthier (yb at sainte-barbe dot org)
2004-11-07 14:35:53 +00:00
Gleb Smirnoff
411f23b06e Utilize m_uiotombuf() in device write method, instead of home-grown
implementation. This also gives a performance improvement, because
m_uiotombuf() utilizes clusters.

Approved by:	julian (mentor)
MFC after:	1 month
2004-10-31 17:39:46 +00:00
Robert Watson
0b762445b9 Move if_handoff() from an inline in if_var.h to a function to if.c
in orden to harden the ABI for 5.x; this will permit us to modify
the locking in the ifnet packet dispatch without requiring drivers
to be recompiled.

MFC after:	3 days
Discussed at:	EuroBSDCon Developer's Summit
2004-10-30 09:39:13 +00:00
Robert Watson
b4d4574a55 Add additional "spare" fields to 'struct ifnet' in order to improve
the resistance of the network driver ABI to changes that will be
required as we optimize locking.

MFC after:	3 days
Discussed at:	Developer Summit
2004-10-30 08:45:13 +00:00
John-Mark Gurney
2f27e1512c use NULL instead of 0 when casting/comparing w/ a pointer... 2004-10-25 17:04:40 +00:00
Robert Watson
31302ebf9d Define IFF_LOCKGIANT() and IFF_UNLOCKGIANT() macros, which conditionally
acquire Giant if the passed interface has IFF_NEEDSGIANT set on it.
Modify calls into (ifp)->if_ioctl() in if.c to use these macros in order
to ensure that Giant is held.

MFC after:	3 days
Bumped into by:	jmg
2004-10-19 18:11:55 +00:00
Robert Watson
81158452be Push acquisition of the accept mutex out of sofree() into the caller
(sorele()/sotryfree()):

- This permits the caller to acquire the accept mutex before the socket
  mutex, avoiding sofree() having to drop the socket mutex and re-order,
  which could lead to races permitting more than one thread to enter
  sofree() after a socket is ready to be free'd.

- This also covers clearing of the so_pcb weak socket reference from
  the protocol to the socket, preventing races in clearing and
  evaluation of the reference such that sofree() might be called more
  than once on the same socket.

This appears to close a race I was able to easily trigger by repeatedly
opening and resetting TCP connections to a host, in which the
tcp_close() code called as a result of the RST raced with the close()
of the accepted socket in the user process resulting in simultaneous
attempts to de-allocate the same socket.  The new locking increases
the overhead for operations that may potentially free the socket, so we
will want to revise the synchronization strategy here as we normalize
the reference counting model for sockets.  The use of the accept mutex
in freeing of sockets that are not listen sockets is primarily
motivated by the potential need to remove the socket from the
incomplete connection queue on its parent (listen) socket, so cleaning
up the reference model here may allow us to substantially weaken the
synchronization requirements.

RELENG_5_3 candidate.

MFC after:	3 days
Reviewed by:	dwhite
Discussed with:	gnn, dwhite, green
Reported by:	Marc UBM Bocklet <ubm at u-boot-man dot de>
Reported by:	Vlad <marchenko at gmail dot com>
2004-10-18 22:19:43 +00:00
Gleb Smirnoff
a176c2aeaf Fix packet flow when both ng_ether(4) and bridge(4) are in use:
- push all bridge logic from if_ethersubr.c into bridge.c
  make bridge_in() return mbuf pointer (or NULL).
- call only bridge_in() from ether_input(), after ng_ether_input()
  was optinally called.
- call bridge_in() from ng_ether_rcv_upper().

Long description:	http://lists.freebsd.org/mailman/htdig/freebsd-net/2004-May/003881.html
Reported by:		Jian-Wei Wang <jwwang at FreeBSD.csie.NCTU.edu.tw>
Tested by:		myself, Sergey Lyubka
Reviewed by:		sam
Approved by:		julian (mentor)
MFC after:		2 months
2004-10-12 10:33:42 +00:00
Andre Oppermann
de10fe70e1 Correctly unregister a netisr by clearing the ni->ni_queue field to NULL as
well.  This field is actually used by various netisr functions to determine
the availablility of the specified netisr.  This uncomplete unregister leads
directly to a crash when the KLD unregistering the netisr is unloaded.

Submitted by:	Sam <sah@softcardsystems.com>
MFC after:	3 days
2004-10-11 20:01:43 +00:00
Robert Watson
acf032f516 When harvesting entropy from an ethernet mbuf, do so before freeing the
mbuf.

RELENG_5 candidate.
2004-10-11 10:21:34 +00:00
Gleb Smirnoff
570343bfec Assign pointer NULL, not 0.
Approved by:	julian (mentor)
2004-10-11 07:28:36 +00:00
Max Laier
85bba4455a Change pfil starvation prevention from fail-open to fail-close.
We return ENOBUF to indicate the problem, which is an errno that should be
handled well everywhere.

Requested & Submitted by:	green
Silently okay'ed by:		The rest of the firewall gang
MFC after:			3 days
2004-10-08 12:07:20 +00:00
Brooks Davis
ab67442f0c Since net/net_osdep.c contained only one function that could be
trivially implemented as a macro, do that and remove it.  NetBSD did
this quite a while ago.
2004-10-08 00:24:30 +00:00
Brian Feldman
93daabdd83 Don't recurse the BPF descriptor lock during the BIOCSDLT operation
(and panic).  To try to finish making BPF safe, at the very least,
the BPF descriptor lock really needs to change into a reader/writer
lock that controls access to "settings," and a mutex that controls
access to the selinfo/knote/callout.  Also, use of callout_drain()
instead of callout_stop() (which is really a much more widespread
issue).
2004-10-06 04:25:37 +00:00
Sam Leffler
b83a279f19 Add 802.11-specific events that are dispatched through the routing socket.
This really doesn't belong here but is preferred (for the moment) over
adding yet another mechanism for sending msgs from the kernel to user apps.

Reviewed by:	imp
2004-10-05 19:48:33 +00:00
Sam Leffler
0cc8f89a4a add ETHERTYPE_PAE for EAPOL/802.1x 2004-10-05 19:28:52 +00:00
Max Laier
d6a8d58875 Add an additional struct inpcb * argument to pfil(9) in order to enable
passing along socket information. This is required to work around a LOR with
the socket code which results in an easy reproducible hard lockup with
debug.mpsafenet=1. This commit does *not* fix the LOR, but enables us to do
so later. The missing piece is to turn the filter locking into a leaf lock
and will follow in a seperate (later) commit.

This will hopefully be MT5'ed in order to fix the problem for RELENG_5 in
forseeable future.

Suggested by:		rwatson
A lot of work by:	csjp (he'd be even more helpful w/o mentor-reviews ;)
Reviewed by:		rwatson, csjp
Tested by:		-pf, -ipfw, LINT, csjp and myself
MFC after:		3 days

LOR IDs:		14 - 17 (not fixed yet)
2004-09-29 04:54:33 +00:00
Max Laier
fa97ea3131 Switch order for mtx_unlock and cv_signal as (condvar(9)) sez:
A thread must hold mp while calling cv_signal(), cv_broadcast(), or
     cv_broadcastpri() even though it isn't passed as an argument.

and is right with this claim.

While here remove a "\" from the macro -> __inline conversion.

Found by:	csjp
MFC after:	4 days
2004-09-22 20:55:56 +00:00
Stefan Farfeleder
e7b80a8e24 Prefer C99's __func__ over GCC's __FUNCTION__. 2004-09-22 17:16:04 +00:00
Brian Feldman
5ed8cedc83 Call sbuf_finish() before sbuf_data() so as to not panic the system. 2004-09-22 12:53:27 +00:00
Brooks Davis
4dcf2bbbff Fix a LOR where ifconf() used copyout while holding a mutex. This LOR
was seen when configuring addresses on interfaces using ifconfig.  This
patch has been verified to work with over eight thousand addresses
assigned to an interface.

LOR id:		031
2004-09-22 08:59:41 +00:00
Brooks Davis
71672bb6f6 Log the renaming of an interface. This should make it easier to follow
kernel log files.
2004-09-18 05:02:08 +00:00
Robert Watson
6874bcf242 Destroy global tapmtx when the if_tap module is unloaded.
RELENG_5 candidated.
2004-09-17 03:55:50 +00:00
Brooks Davis
c859ef977e Fix a LOR where copyout was called while holding a lock.
Reported by:    rwatson
2004-09-15 04:41:56 +00:00
Robert Watson
46448b5a1b Reformulate bpf_dettachd() to acquire the BIF_LOCK() as well as
BPFD_LOCK() when removing a descriptor from an interface descriptor
list.  Hold both over the operation, and do a better job at
maintaining the invariant that you can't find partially connected
descriptors on an active interface descriptor list.

This appears to close a race that resulted in the kernel performing
a NULL pointer dereference when BPF sessions are detached during
heavy network activity on SMP systems.

RELENG_5 candidate.
2004-09-09 04:11:12 +00:00
Robert Watson
4a3feeaa86 Reformulate use of linked lists in 'struct bpf_d' and 'struct bpf_if'
to use queue(3) list macros rather than hand-crafted lists.  While
here, move to doubly linked lists to eliminate iterating lists in
order to remove entries.  This change simplifies and clarifies the
list logic in the BPF descriptor code as a first step towards revising
the locking strategy.

RELENG_5 candidate.

Reviewed by:	fenner
2004-09-09 00:19:27 +00:00
Robert Watson
d17d818425 Compare/set pointers using NULL not 0. 2004-09-09 00:11:50 +00:00
Brooks Davis
55287f2a60 Re-add ifi_epoch, to struct if_data, this time replacing ifi_unused
to avoid ABI changes.  It is set to the last time the interface
counters were zeroed, currently the time if_attach() was called.  It is
intentended to be a valid value for RFC2233's ifCounterDiscontinuityTime
and to make it easier for applications to verify that the interface they
find at a given index is the one that was there last time they looked.

Due to space constraints ifi_epoch is a time_t rather then a struct
timeval.  SNMP would prefer higher precision, but this unlikely to be
useful in practice.
2004-09-08 04:50:55 +00:00
John-Mark Gurney
9b90387dcf don't call f_detach if the filter has alread removed the knote.. This
happens when a proc exits, but needs to inform the user that this has
happened..  This also means we can remove the check for detached from
proc and sig f_detach functions as this is doing in kqueue now...

MFC after:	5 days
2004-09-06 19:02:42 +00:00
Robert Watson
ccaae37ab1 Correct a comment typo: s/Note/Not/.
Pointed out by:	kensmith
2004-09-03 01:37:02 +00:00
Brooks Davis
4ff62bd97b Back out ifi_epoch. The ABI breakage is too disruptive this close to
5-STABLE. ifi_epoch will shortly be reintroduced with less precistion
using the space currently allocated to ifi_unused.
2004-09-02 05:07:29 +00:00
Max Laier
7b21048cea Fix an assertion when if_down()ing a ALTQ managed interface. The lock should
have been in place all the time the mtx_assert in the ALTQ code just
discovered the shortcoming.

PR:		i386/71195
Tested by:	Bettan (PR originator), myself
MFC after:	5 days
2004-09-01 19:56:47 +00:00
Brooks Davis
9e734b4468 Use a spare byte in struct if_data to store the structure size without
increasing it.  Add code to ifconfig to use this size to find the
sockaddr_dl after the struct if_data in the routing message.  This
allows struct if_data to grow (up to 255 bytes) without breaking
ifconfig.

Submitted by:	peter
2004-09-01 18:22:14 +00:00
Brooks Davis
1fc4519b1d Add a new variable, ifi_epoch, to struct if_data. It is set to the last
time the interface counters were zeroed, currently the time if_attach()
was called.  It is indentended to be a valid value for RFC2233's
ifCounterDiscontinuityTime and to make it easier for applications to
verify that the interface they find at a given index is the one that was
there last time they looked.

An if_epoch "compatability" macro has not been created as ifi_epoch has
never been a member of struct ifnet.

Approved by:	andre, bms, wollman
2004-08-30 06:29:26 +00:00
Yaroslav Tykhiy
b9803f29dd Use an ANSI-style definition for slstart()
in accord with the rest of the file.
2004-08-30 04:48:52 +00:00
Yaroslav Tykhiy
ecfb8f3f7b Grant the poor old SLIP driver with an if_start handler
so that it becomes happy and no longer panics the system
upon getting the very first packet to transmit.

Reported and tested by:	Igor Timkin <ivt@gamma.ru>
Reviewed by:		rwatson
MFC after:		5 days
2004-08-30 04:32:52 +00:00
Robert Watson
ace437c3c6 Correct typo in printf() warning.
Submitted by:	Pawel Worach <pawel.worach at telia.com>
2004-08-28 19:27:25 +00:00
Robert Watson
1d8cd39e71 Change the default disposition of debug.mpsafenet from 0 to 1, which
will cause the network stack to operate without the Giant lock by
default.  This change has the potential to improve performance by
increasing parallelism and decreasing latency in network processing.

Due to the potential exposure of existing or new bugs, the following
compatibility functionality is maintained:

- It is still possible to disable Giant-free operation by setting
  debug.mpsafenet to 0 in loader.conf.

- Add "options NET_WITH_GIANT", which will restore the default value of
  debug.mpsafenet to 0, and is intended for use on systems compiled with
  known unsafe components, or where a more conservative configuration is
  desired.

- Add a new declaration, NET_NEEDS_GIANT("componentname"), which permits
  kernel components to declare dependence on Giant over the network
  stack.  If the declaration is made by a preloaded module or a compiled
  in component, the disposition of debug.mpsafenet will be set to 0 and
  a warning concerning performance degraded operation printed to the
  console.  If it is declared by a loadable kernel module after boot, a
  warning is displayed but the disposition cannot be changed.  This is
  implemented by defining a new SYSINIT() value, SI_SUB_SETTINGS, which
  is intended for the processing of configuration choices after tunables
  are read in and the console is available to generate errors, but
  before much else gets going.

This compatibility behavior will go away when we've finished the last
of the locking work and are confident that operation is correct.
2004-08-28 15:11:13 +00:00
Brooks Davis
b9907cd45b When detaching an interface, don't leave an obsolete pointer to the
soon to be deleted struct ifnet around.

PR:		kern/52260
MFC After:	3 days
2004-08-27 19:42:40 +00:00
Andre Oppermann
3161f583ca Apply error and success logic consistently to the function netisr_queue() and
its users.

netisr_queue() now returns (0) on success and ERRNO on failure.  At the
moment ENXIO (netisr queue not functional) and ENOBUFS (netisr queue full)
are supported.

Previously it would return (1) on success but the return value of IF_HANDOFF()
was interpreted wrongly and (0) was actually returned on success.  Due to this
schednetisr() was never called to kick the scheduling of the isr.  However this
was masked by other normal packets coming through netisr_dispatch() causing the
dequeueing of waiting packets.

PR:		kern/70988
Found by:	MOROHOSHI Akihiko <moro@remus.dti.ne.jp>
MFC after:	3 days
2004-08-27 18:33:08 +00:00
Andre Oppermann
c21fd23260 Always compile PFIL_HOOKS into the kernel and remove the associated kernel
compile option.  All FreeBSD packet filters now use the PFIL_HOOKS API and
thus it becomes a standard part of the network stack.

If no hooks are connected the entire packet filter hooks section and related
activities are jumped over.  This removes any performance impact if no hooks
are active.

Both OpenBSD and DragonFlyBSD have integrated PFIL_HOOKS permanently as well.
2004-08-27 15:16:24 +00:00
Robert Watson
d4e02af583 Revert previous revision, 1.7, as removal of GIANT_REQUIRED was made
in the wrong branch (and hence to the wrong function).
2004-08-24 14:17:58 +00:00
Robert Watson
b84209fbec MT4 if_fwsubr.c:1.6:
date: 2004/08/22 14:48:55;  author: rwatson;  state: Exp;  lines: +0 -2
  Don't need to assert Giant in fw_output(), only in the firewire start
  routine.

Approved by:	re (scottl)
2004-08-24 14:16:08 +00:00
Peter Pentchev
18aee723a3 Fix a typo (attacked -> attached).
Approved by:	sam
2004-08-24 08:47:15 +00:00
Robert Watson
6063b5f0ad Style update: use newer style function prototypes in if_sl.c in
prep for merging locking.
2004-08-22 21:32:52 +00:00
Robert Watson
201a36deca Don't need to assert Giant in fw_output(), only in the firewire start
routine.
2004-08-22 14:48:55 +00:00
Robert Watson
b062951a3d If a tunable for the routing socket netisr queue max is defined, allow it
to override the default value, rather than the default value overriding
the tunable.
2004-08-21 21:45:40 +00:00
Robert Watson
190a4c9436 Allow the size of the routing socket netisr queue to be configured using
the tunable or sysctl 'net.route.netisr_maxqlen'.  Default the maximum
depth to 256 rather than IFQ_MAXLEN due to the downsides of dropping
routing messages.

MT5 candidate.

Discussed with:	mdodd, mlaier, Vincent Jardin <jardin at 6wind.com>
2004-08-21 21:20:06 +00:00
Christian S.J. Peron
5090559b7f When a prison is given the ability to create raw sockets (when the
security.jail.allow_raw_sockets sysctl MIB is set to 1) where privileged
access to jails is given out, it is possible for prison root to manipulate
various network parameters which effect the host environment. This commit
plugs a number of security holes associated with the use of raw sockets
and prisons.

This commit makes the following changes:

- Add a comment to rtioctl warning developers that if they add
  any ioctl commands, they should use super-user checks where necessary,
  as it is possible for PRISON root to make it this far in execution.
- Add super-user checks for the execution of the SIOCGETVIFCNT
  and SIOCGETSGCNT IP multicast ioctl commands.
- Add a super-user check to rip_ctloutput(). If the calling cred
  is PRISON root, make sure the socket option name is IP_HDRINCL,
  otherwise deny the request.

Although this patch corrects a number of security problems associated
with raw sockets and prisons, the warning in jail(8) should still
apply, and by default we should keep the default value of
security.jail.allow_raw_sockets MIB to 0 (or disabled) until
we are certain that we have tracked down all the problems.

Looking forward, we will probably want to eliminate the
references to curthread.

This may be a MFC candidate for RELENG_5.

Reviewed by:	rwatson
Approved by:	bmilekic (mentor)
2004-08-21 17:38:57 +00:00
Andre Oppermann
9b932e9e04 Convert ipfw to use PFIL_HOOKS. This is change is transparent to userland
and preserves the ipfw ABI.  The ipfw core packet inspection and filtering
functions have not been changed, only how ipfw is invoked is different.

However there are many changes how ipfw is and its add-on's are handled:

 In general ipfw is now called through the PFIL_HOOKS and most associated
 magic, that was in ip_input() or ip_output() previously, is now done in
 ipfw_check_[in|out]() in the ipfw PFIL handler.

 IPDIVERT is entirely handled within the ipfw PFIL handlers.  A packet to
 be diverted is checked if it is fragmented, if yes, ip_reass() gets in for
 reassembly.  If not, or all fragments arrived and the packet is complete,
 divert_packet is called directly.  For 'tee' no reassembly attempt is made
 and a copy of the packet is sent to the divert socket unmodified.  The
 original packet continues its way through ip_input/output().

 ipfw 'forward' is done via m_tag's.  The ipfw PFIL handlers tag the packet
 with the new destination sockaddr_in.  A check if the new destination is a
 local IP address is made and the m_flags are set appropriately.  ip_input()
 and ip_output() have some more work to do here.  For ip_input() the m_flags
 are checked and a packet for us is directly sent to the 'ours' section for
 further processing.  Destination changes on the input path are only tagged
 and the 'srcrt' flag to ip_forward() is set to disable destination checks
 and ICMP replies at this stage.  The tag is going to be handled on output.
 ip_output() again checks for m_flags and the 'ours' tag.  If found, the
 packet will be dropped back to the IP netisr where it is going to be picked
 up by ip_input() again and the directly sent to the 'ours' section.  When
 only the destination changes, the route's 'dst' is overwritten with the
 new destination from the forward m_tag.  Then it jumps back at the route
 lookup again and skips the firewall check because it has been marked with
 M_SKIP_FIREWALL.  ipfw 'forward' has to be compiled into the kernel with
 'option IPFIREWALL_FORWARD' to enable it.

 DUMMYNET is entirely handled within the ipfw PFIL handlers.  A packet for
 a dummynet pipe or queue is directly sent to dummynet_io().  Dummynet will
 then inject it back into ip_input/ip_output() after it has served its time.
 Dummynet packets are tagged and will continue from the next rule when they
 hit the ipfw PFIL handlers again after re-injection.

 BRIDGING and IPFW_ETHER are not changed yet and use ipfw_chk() directly as
 they did before.  Later this will be changed to dedicated ETHER PFIL_HOOKS.

More detailed changes to the code:

 conf/files
	Add netinet/ip_fw_pfil.c.

 conf/options
	Add IPFIREWALL_FORWARD option.

 modules/ipfw/Makefile
	Add ip_fw_pfil.c.

 net/bridge.c
	Disable PFIL_HOOKS if ipfw for bridging is active.  Bridging ipfw
	is still directly invoked to handle layer2 headers and packets would
	get a double ipfw when run through PFIL_HOOKS as well.

 netinet/ip_divert.c
	Removed divert_clone() function.  It is no longer used.

 netinet/ip_dummynet.[ch]
	Neither the route 'ro' nor the destination 'dst' need to be stored
	while in dummynet transit.  Structure members and associated macros
	are removed.

 netinet/ip_fastfwd.c
	Removed all direct ipfw handling code and replace it with the new
	'ipfw forward' handling code.

 netinet/ip_fw.h
	Removed 'ro' and 'dst' from struct ip_fw_args.

 netinet/ip_fw2.c
	(Re)moved some global variables and the module handling.

 netinet/ip_fw_pfil.c
	New file containing the ipfw PFIL handlers and module initialization.

 netinet/ip_input.c
	Removed all direct ipfw handling code and replace it with the new
	'ipfw forward' handling code.  ip_forward() does not longer require
	the 'next_hop' struct sockaddr_in argument.  Disable early checks
	if 'srcrt' is set.

 netinet/ip_output.c
	Removed all direct ipfw handling code and replace it with the new
	'ipfw forward' handling code.

 netinet/ip_var.h
	Add ip_reass() as general function.  (Used from ipfw PFIL handlers
	for IPDIVERT.)

 netinet/raw_ip.c
	Directly check if ipfw and dummynet control pointers are active.

 netinet/tcp_input.c
	Rework the 'ipfw forward' to local code to work with the new way of
	forward tags.

 netinet/tcp_sack.c
	Remove include 'opt_ipfw.h' which is not needed here.

 sys/mbuf.h
	Remove m_claim_next() macro which was exclusively for ipfw 'forward'
	and is no longer needed.

Approved by:	re (scottl)
2004-08-17 22:05:54 +00:00
John-Mark Gurney
ad3b9257c2 Add locking to the kqueue subsystem. This also makes the kqueue subsystem
a more complete subsystem, and removes the knowlege of how things are
implemented from the drivers.  Include locking around filter ops, so a
module like aio will know when not to be unloaded if there are outstanding
knotes using it's filter ops.

Currently, it uses the MTX_DUPOK even though it is not always safe to
aquire duplicate locks.  Witness currently doesn't support the ability
to discover if a dup lock is ok (in some cases).

Reviewed by:	green, rwatson (both earlier versions)
2004-08-15 06:24:42 +00:00
Robert Watson
3b7d076fe7 Use IFQ_SET_MAXLEN() to set the maximum queue depth of the routing
socket netisr queue.

Pointed out by:	winter
2004-08-13 22:23:21 +00:00
Tony Ackerman
b59db7bbe8 Added two new media types for 10GBASE-SR and 10GBASE-LR 2004-08-12 23:48:26 +00:00
Andre Oppermann
2dc1d58164 Convert the routing table to use an UMA zone for rtentries. The zone is
called "rtentry".

This saves a considerable amount of kernel memory.  R_Zmalloc previously
used 256 byte blocks (plus kmalloc overhead) whereas UMA only needs 132
bytes.

Idea from:	OpenBSD
2004-08-11 17:26:56 +00:00
Maksim Yevmenkin
285b72aa78 Set IFF_RUNNING flag on the interface as soon as the control device is opened. 2004-08-11 00:12:27 +00:00
Max Laier
de0332d4fa Add a "void *if_carp" placeholder to struct ifnet with prospect to bring in
the "Common address redundancy protocol" (CARP) during the 5-STABLE cycle.
Hence doing the ABI break now.

Approved by:	re (scottl)
2004-08-07 09:32:04 +00:00
Robert Watson
ebcd28e669 As SLIP directly accesses the tty code from its if_start() routine,
mark if_sl as IFF_NEEDSGIANT.
2004-08-06 22:41:13 +00:00
Peter Pentchev
3f35d5150b Do not attempt to clean up data that has not been initialized yet.
This fixes two kernel panics on boot when the xl driver fails to
allocate bus/port/memory resources.

Reviewed by:	silence on -net
2004-08-06 09:08:33 +00:00
Maxim Sobolev
97c4cd9853 Set ip_v field properly.
PR:	kern/69957
2004-08-05 08:12:46 +00:00
Robert Watson
46691dd8d7 Do a lockless read of the BPF interface structure descriptor list head
before grabbing BPF locks to see if there are any entries in order to
avoid the cost of locking if there aren't any.  Avoids a mutex lock/
unlock for each packet received if there are no BPF listeners.
2004-08-05 02:37:36 +00:00
Alexander Kabaev
445e045b0d Avoid casts as lvalues. 2004-07-28 06:59:55 +00:00
Alexander Kabaev
a0ec13c419 Initialize ; variable eraly to shut up GCC warning. 2004-07-28 06:48:36 +00:00
Robert Watson
af5e59bf28 Add a new network interface flag, IFF_NEEDSGIANT, which will allow
device drivers to declare that the ifp->if_start() method implemented
by the driver requires Giant in order to operate correctly.

Add a 'struct task' to 'struct ifnet' that can be used to execute a
deferred ifp->if_start() in the event that if_start needs to be called
in a Giant-free environment.  To do this, introduce if_start(), a
wrapper function for ifp->if_start().  If the interface can run MPSAFE,
it directly dispatches into the interface start routine.  If it can't
run MPSAFE, we're running with debug.mpsafenet != 0, and Giant isn't
currently held, the task is queued to execute in a swi holding Giant
via if_start_deferred().

Modify if_handoff() to use if_start() instead of direct dispatch.
Modify 802.11 to use if_start() instead of direct dispatch.

This is intended to provide increased compatibility for non-MPSAFE
network device drivers in the presence of Giant-free operation via
asynchronous dispatch.  However, this commit does not mark any network
interfaces as IFF_NEEDSGIANT.
2004-07-27 23:20:45 +00:00
Yaroslav Tykhiy
d6fcfb7ae1 Stop tinkering with the parent's VLAN_MTU capability.
Now it is user-controlled through ifconfig(8).

The former ``automagic'' way of operation created more
trouble than good.  First, VLAN_MTU consumers other than
vlan(4) had appeared, e.g., ng_vlan(4).  Second, there was
no way to disable VLAN_MTU manually if it were causing
trouble, e.g., data corruption.

Dropping the ``automagic'' should be completely invisible
to the user since
a) all the drivers supporting VLAN_MTU
have it enabled by default, and in the first place
b) there is only one driver that can really toggle VLAN_MTU
in the hardware under its control (it's fxp(4), to which
I added VLAN_MTU controls to illustrate the principle.)
2004-07-26 14:46:04 +00:00
Robert Watson
572bde2aea Prefer NULL to '0' when checking a pointer value. 2004-07-24 16:58:56 +00:00
Brooks Davis
b4e9f8379e Actually free the unit when destroying the interface.
Reported by:	la at delfi.lt
Tested by:	la at delfi.lt
PR:		68618
2004-07-22 22:50:15 +00:00
Max Laier
ca64c799d4 When removing the last reference to a cloner, do not try to unlock twice -
esp. not since the backing memory was just freed.

Reviewed by:	rwatson
2004-07-20 21:44:28 +00:00
Robert Watson
08f85b089e Comment clarifying debug_mpsafenet. 2004-07-18 21:50:22 +00:00
Robert Watson
8bbfdc98e4 Gratuitous whitespace change to un-wrap a short line. 2004-07-18 19:53:35 +00:00
Poul-Henning Kamp
672c05d49c Preparation commit for the tty cleanups that will follow in the near
future:

rename ttyopen() -> tty_open() and ttyclose() -> tty_close().

We need the ttyopen() and ttyclose() for the new generic cdevsw
functions for tty devices in order to have consistent naming.
2004-07-15 20:47:41 +00:00
Poul-Henning Kamp
3e019deaed Do a pass over all modules in the kernel and make them return EOPNOTSUPP
for unknown events.

A number of modules return EINVAL in this instance, and I have left
those alone for now and instead taught MOD_QUIESCE to accept this
as "didn't do anything".
2004-07-15 08:26:07 +00:00
Max Laier
bfe4641596 Fix a copy-and-paste-o in IFQ_DRV_PREPEND - all pointyhats to me.
While here also fix a (not less stupid) braino in IFQ_DRV_PURGE.

Reported-by:	clement
Tested-by:	clement (_PREPEND in sis(4))
2004-07-14 13:31:41 +00:00
Robert Watson
efe0ab01b2 Convert SLIP to using C99 structure initialization for its struct
linesw.
2004-07-14 05:01:40 +00:00
Bruce M Simpson
086e98c437 Use ETHER_IS_MULTICAST() consistently in ether_resolvemulti().
Reviewed by:	jmallett
2004-07-09 05:26:27 +00:00
Bruce M Simpson
ca28620f0d Use M_ZERO instead of bzero(). 2004-07-06 03:34:16 +00:00
Bruce M Simpson
9b3d77e7c9 Be consistent and use bzero() instead of memset(). 2004-07-06 03:29:41 +00:00
Bruce M Simpson
b3c9a01e5e Use M_ZERO instead of memset() (!). 2004-07-06 03:28:24 +00:00