Commit Graph

1331 Commits

Author SHA1 Message Date
Andre Oppermann
97d8d152c2 Introduce tcp_hostcache and remove the tcp specific metrics from
the routing table.  Move all usage and references in the tcp stack
from the routing table metrics to the tcp hostcache.

It caches measured parameters of past tcp sessions to provide better
initial start values for following connections from or to the same
source or destination.  Depending on the network parameters to/from
the remote host this can lead to significant speedups for new tcp
connections after the first one because they inherit and shortcut
the learning curve.

tcp_hostcache is designed for multiple concurrent access in SMP
environments with high contention and is hash indexed by remote
ip address.

It removes significant locking requirements from the tcp stack with
regard to the routing table.

Reviewed by:	sam (mentor), bms
Reviewed by:	-net, -current, core@kame.net (IPv6 parts)
Approved by:	re (scottl)
2003-11-20 20:07:39 +00:00
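The idea of a cache of per-host parameters, hash indexed by remote IP address, can be sketched in user-space C as follows. This is only an illustration under stated assumptions: the names (hc_entry, hc_lookup, hc_update), the table sizes, the cached fields, and the round-robin eviction are all hypothetical and are not taken from the tcp_hostcache code. The sketch also omits all locking, which the real code needs for SMP use.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define HC_BUCKETS      64   /* hypothetical table size, a power of two */
#define HC_BUCKET_LIMIT 4    /* hypothetical cap on entries per bucket */

/* Parameters remembered for one remote host (fields are illustrative). */
struct hc_entry {
    uint32_t addr;     /* remote IPv4 address, host byte order */
    uint32_t rtt_us;   /* last measured round-trip time */
    uint32_t mtu;      /* last known path MTU */
    int      in_use;
};

struct hc_bucket {
    struct hc_entry slot[HC_BUCKET_LIMIT];
    unsigned        next_victim;   /* round-robin eviction cursor */
};

static struct hc_bucket hc_table[HC_BUCKETS];

/* Fold the address down to a bucket index. */
static unsigned
hc_hash(uint32_t addr)
{
    addr ^= addr >> 16;
    addr ^= addr >> 8;
    return (addr & (HC_BUCKETS - 1));
}

/* Return the cached entry for addr, or NULL if nothing is cached. */
const struct hc_entry *
hc_lookup(uint32_t addr)
{
    const struct hc_bucket *b = &hc_table[hc_hash(addr)];

    for (size_t i = 0; i < HC_BUCKET_LIMIT; i++)
        if (b->slot[i].in_use && b->slot[i].addr == addr)
            return (&b->slot[i]);
    return (NULL);
}

/* Record measurements from a finished connection. */
void
hc_update(uint32_t addr, uint32_t rtt_us, uint32_t mtu)
{
    struct hc_bucket *b = &hc_table[hc_hash(addr)];
    struct hc_entry *e = NULL;

    for (size_t i = 0; i < HC_BUCKET_LIMIT; i++) {
        if (b->slot[i].in_use && b->slot[i].addr == addr) {
            e = &b->slot[i];          /* an existing entry wins */
            break;
        }
        if (e == NULL && !b->slot[i].in_use)
            e = &b->slot[i];          /* remember the first free slot */
    }
    if (e == NULL) {                  /* bucket full: evict round-robin */
        e = &b->slot[b->next_victim];
        b->next_victim = (b->next_victim + 1) % HC_BUCKET_LIMIT;
    }
    e->addr = addr;
    e->rtt_us = rtt_us;
    e->mtu = mtu;
    e->in_use = 1;
}
```

A new connection would call hc_lookup() on the peer address and, on a hit, start from the cached values instead of the defaults; hc_update() would be called when a connection closes.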
Andre Oppermann
26d02ca7ba Remove RTF_PRCLONING from routing table and adjust users of it
accordingly.  The define is left intact for ABI compatibility
with userland.

This is a pre-step for the introduction of tcp_hostcache.  The
network stack remains fully useable with this change.

Reviewed by:	sam (mentor), bms
Reviewed by:	-net, -current, core@kame.net (IPv6 parts)
Approved by:	re (scottl)
2003-11-20 19:47:31 +00:00
Robert Watson
a557af222b Introduce a MAC label reference in 'struct inpcb', which caches
the MAC label referenced from 'struct socket' in the IPv4 and
IPv6-based protocols.  This permits MAC labels to be checked during
network delivery operations without dereferencing inp->inp_socket
to get to so->so_label, which will eventually avoid our having to
grab the socket lock during delivery at the network layer.

This change introduces 'struct inpcb' as a labeled object to the
MAC Framework, along with the normal circus of entry points:
initialization, creation from socket, destruction, as well as a
delivery access control check.

For most policies, the inpcb label will simply be a cache of the
socket label, so a new protocol switch method is introduced,
pr_sosetlabel() to notify protocols that the socket layer label
has been updated so that the cache can be updated while holding
appropriate locks.  Most protocols implement this using
pru_sosetlabel_null(), but IPv4/IPv6 protocols using inpcbs use
the worker function in_pcbsosetlabel(), which calls into the
MAC Framework to perform a cache update.

Biba, LOMAC, and MLS implement these entry points, as do the stub
policy, and test policy.

Reviewed by:	sam, bms
Obtained from:	TrustedBSD Project
Sponsored by:	DARPA, Network Associates Laboratories
2003-11-18 00:39:07 +00:00
Andre Oppermann
9188b4a169 Introduce ip_fastforward and remove ip_flow.
Short description of ip_fastforward:

 o adds full direct process-to-completion IPv4 forwarding code
 o handles ip fragmentation incl. hw support (ip_flow did not)
 o sends icmp needfrag to source if DF is set (ip_flow did not)
 o supports ipfw and ipfilter (ip_flow did not)
 o supports divert, ipfw fwd and ipfilter nat (ip_flow did not)
 o returns anything it can't handle back to normal ip_input

Enable with sysctl -w net.inet.ip.fastforwarding=1

Reviewed by:	sam (mentor)
2003-11-14 21:02:22 +00:00
Bruce M Simpson
f16770ae7e Fix a bug whereby the physical endpoints of a gre(4) tunnel would not
be printed, if the module were loaded into a kernel which had INET6 enabled.

The gre(4) driver does not use INET6, nor is it specified for IPv6. The
tunnel_status() function in ifconfig(8) is somewhat overzealous and assumes
that all tunnel interfaces speak KAME ifioctls.

This fix follows the path of least resistance, by teaching gre(4) about
the two KAME ifioctls concerned.

PR:	bin/56341
2003-11-14 20:58:00 +00:00
Bruce M Simpson
05b2efe044 Add a sysctl MIB, NET_RT_IFMALIST, to retrieve multicast group memberships
in a protocol-independent way.

Submitted by:	harti
2003-11-14 18:48:15 +00:00
Hajimu UMEMOTO
979ed0d5e6 fix comments.
Obtained from:	KAME
2003-11-14 18:17:07 +00:00
Ruslan Ermilov
26f9b263ae - vlan_start(): Increment the correct interface statistics member.
Reviewed by:	mdodd

- vlan_input(): Macroize the VLAN tag extraction from mbuf.
2003-11-12 12:58:19 +00:00
Robert Watson
eca8a663d4 Modify the MAC Framework so that instead of embedding a (struct label)
in various kernel objects to represent security data, we embed a
(struct label *) pointer, which now references labels allocated using
a UMA zone (mac_label.c).  This allows the size and shape of struct
label to be varied without changing the size and shape of these kernel
objects, which become part of the frozen ABI with 5-STABLE.  This opens
the door for boot-time selection of the number of label slots, and hence
changes to the bound on the number of simultaneous labeled policies
at boot-time instead of compile-time.  This also makes it easier to
embed label references in new objects as required for locking/caching
with fine-grained network stack locking, such as inpcb structures.

This change also moves us further in the direction of hiding the
structure of kernel objects from MAC policy modules, not to mention
dramatically reducing the number of '&' symbols appearing in both the
MAC Framework and MAC policy modules, and improving readability.

While this results in minimal performance change with MAC enabled, it
will observably shrink the size of a number of critical kernel data
structures for the !MAC case, and should have a small (but measurable)
performance benefit (i.e., struct vnode, struct socket) due to memory
conservation and reduced cost of zeroing memory.

NOTE: Users of MAC must recompile their kernel and all MAC modules as a
result of this change.  Because this is an API change, third party
MAC modules will also need to be updated to make less use of the '&'
symbol.

Suggestions from:	bmilekic
Obtained from:		TrustedBSD Project
Sponsored by:		DARPA, Network Associates Laboratories
2003-11-12 03:14:31 +00:00
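The layout change described above, an embedded (struct label) replaced by a (struct label *), can be illustrated with a small user-space sketch. Everything here is an assumption made for illustration: the object and function names are hypothetical, LABEL_SLOTS is an arbitrary value, and calloc() stands in for the UMA zone allocation mentioned in the commit message.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

#define LABEL_SLOTS 4   /* hypothetical number of policy slots */

struct label {
    long slot[LABEL_SLOTS];
};

/* Old style: the label is part of the object, so the object's size
 * changes whenever struct label changes. */
struct obj_embedded {
    int          id;
    struct label lbl;
};

/* New style: the object holds only a pointer, so its size is
 * independent of the label's size and shape. */
struct obj_pointer {
    int           id;
    struct label *lbl;
};

/* Allocate a zeroed label; calloc() is a stand-in for a zone allocator. */
int
obj_label_init(struct obj_pointer *o)
{
    o->lbl = calloc(1, sizeof(*o->lbl));
    return (o->lbl != NULL ? 0 : -1);
}

void
obj_label_destroy(struct obj_pointer *o)
{
    free(o->lbl);
    o->lbl = NULL;
}

void
obj_label_set(struct obj_pointer *o, int slot, long value)
{
    assert(slot >= 0 && slot < LABEL_SLOTS);
    o->lbl->slot[slot] = value;     /* no '&' needed at the call site */
}

long
obj_label_get(const struct obj_pointer *o, int slot)
{
    assert(slot >= 0 && slot < LABEL_SLOTS);
    return (o->lbl->slot[slot]);
}
```

With this layout, growing LABEL_SLOTS changes sizeof(struct label) but leaves sizeof(struct obj_pointer) untouched, which is the property the commit relies on for ABI stability.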
Mike Silbersack
20880eabe9 Remove the m_defrag call from if_loop; testing with m_fragment
has shown that the IPv6 stack can clearly handle fragmented
mbuf chains without a problem.

MFC after:	1 week
2003-11-11 17:58:36 +00:00
Seigo Tanimura
512824f8f7 - Implement selwakeuppri() which allows raising the priority of a
  thread being woken up.  The thread woken up can run at a priority as
  high as after tsleep().

- Replace selwakeup()s with selwakeuppri()s and pass appropriate
  priorities.

- Add cv_broadcastpri() which raises the priority of the broadcast
  threads.  Used by selwakeuppri() if collision occurs.

Not objected in:	-arch, -current
2003-11-09 09:17:26 +00:00
Sam Leffler
7138d65c3f replace explicit changes to rt_refcnt by RT_ADDREF and RT_REMREF
macros that expand to include assertions when the system is built
with INVARIANTS

Supported by:	FreeBSD Foundation
2003-11-08 23:36:32 +00:00
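The pattern of wrapping reference count changes in macros that carry assertions only in debug builds might look like the sketch below. It is a user-space approximation, not the kernel definition: assert() stands in for the kernel's assertion facility, struct rtentry_sketch and RT_ASSERT are hypothetical names, and the exact conditions checked are assumptions.

```c
#include <assert.h>

#define INVARIANTS 1   /* remove this line to compile the checks away */

#ifdef INVARIANTS
#define RT_ASSERT(cond) assert(cond)
#else
#define RT_ASSERT(cond) ((void)0)
#endif

/* Hypothetical stand-in for the routing entry; only the count matters. */
struct rtentry_sketch {
    long rt_refcnt;
};

/* Take a reference; the count must never have gone negative. */
#define RT_ADDREF(rt) do {                      \
    RT_ASSERT((rt)->rt_refcnt >= 0);            \
    (rt)->rt_refcnt++;                          \
} while (0)

/* Drop a reference; there must be one to drop. */
#define RT_REMREF(rt) do {                      \
    RT_ASSERT((rt)->rt_refcnt > 0);             \
    (rt)->rt_refcnt--;                          \
} while (0)
```

Compared with open-coded rt->rt_refcnt++ and rt->rt_refcnt-- statements, an unbalanced release trips the assertion at the point of the bug rather than surfacing later as a use after free.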
Sam Leffler
7902224c6b o add a flags parameter to netisr_register that is used to specify
  whether or not the isr needs to hold Giant when running; Giant-less
  operation is also controlled by the setting of debug_mpsafenet
o mark all netisr's except NETISR_IP as needing Giant
o add a GIANT_REQUIRED assertion to the top of netisr's that need Giant
o pickup Giant (when debug_mpsafenet is 1) inside ip_input before
  calling up with a packet
o change netisr handling so swi_net runs w/o Giant; instead we grab
  Giant before invoking handlers based on whether the handler needs Giant
o change netisr handling so that netisr's that are marked MPSAFE may
  have multiple instances active at a time
o add netisr statistics for packets dropped because the isr is inactive

Supported by:	FreeBSD Foundation
2003-11-08 22:28:40 +00:00
Sam Leffler
d3be1471c7 o make debug_mpsafenet globally visible
o move it from subr_bus.c to netisr.c where it more properly belongs
o add NET_PICKUP_GIANT and NET_DROP_GIANT macros that will be used to
  grab Giant as needed when MPSAFE operation is enabled

Supported by:	FreeBSD Foundation
2003-11-05 23:42:51 +00:00
Hajimu UMEMOTO
4a201dfc18 - update comments to reflect recent BSDs.
- nuke unused macro PSUEDO_SET().
- I believe our if_xname stuff is nothing strange against other BSDs.

Obtained from:	KAME
2003-11-04 14:08:31 +00:00
Brooks Davis
9bf40ede4a Replace the if_name and if_unit members of struct ifnet with new members
if_xname, if_dname, and if_dunit. if_xname is the name of the interface
and if_dname/unit are the driver name and instance.

This change paves the way for interface renaming and enhanced pseudo
device creation and configuration semantics.

Approved By:	re (in principle)
Reviewed By:	njl, imp
Tested On:	i386, amd64, sparc64
Obtained From:	NetBSD (if_xname)
2003-10-31 18:32:15 +00:00
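The relationship between the three new members can be sketched as below. This is a user-space illustration under assumptions: struct ifnet_sketch and if_initname_sketch() are hypothetical names, the 16-byte buffer is chosen for the example, and deriving if_xname by concatenating driver name and unit is shown only as the obvious default, since the point of the change is that the interface name no longer has to follow that pattern.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

#define XNAME_MAX 16   /* hypothetical buffer size for the interface name */

struct ifnet_sketch {
    char        if_xname[XNAME_MAX];  /* interface name, e.g. "fxp0" */
    const char *if_dname;             /* driver name, e.g. "fxp" */
    int         if_dunit;             /* driver instance, e.g. 0 */
};

/*
 * Fill in all three members from a driver name and unit number.
 * Returns 0 on success, -1 if the combined name does not fit.
 */
int
if_initname_sketch(struct ifnet_sketch *ifp, const char *dname, int unit)
{
    int n;

    ifp->if_dname = dname;
    ifp->if_dunit = unit;
    n = snprintf(ifp->if_xname, sizeof(ifp->if_xname), "%s%d", dname, unit);
    if (n < 0 || (size_t)n >= sizeof(ifp->if_xname))
        return (-1);
    return (0);
}
```

Code that previously printed the name with a "%s%d" format over if_name and if_unit would print if_xname with a plain "%s" instead.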
Brooks Davis
aaaad65dcd Make TUNDEBUG use if_printf instead of printf. 2003-10-31 02:48:12 +00:00
Brooks Davis
13fb40df0a Replace a couple printfs with if_printfs. 2003-10-31 01:35:07 +00:00
Sam Leffler
9c63e9dbd7 Overhaul routing table entry cleanup by introducing a new rtexpunge
routine that takes a locked routing table reference and removes all
references to the entry in the various data structures. This
eliminates instances of recursive locking and also closes races
where the lock on the entry had to be dropped prior to calling
rtrequest(RTM_DELETE).  This also cleans up confusion where the
caller held a reference to an entry that might have been reclaimed
(and in some cases used that reference).

Supported by:	FreeBSD Foundation
2003-10-30 23:02:51 +00:00
Sam Leffler
319de71e19 avoid recursive lock panic by unlocking before calling rtrequest;
this is consistent with other places but will be replaced
shortly by a "proper fix"

Supported by:	FreeBSD Foundation
Pain felt by:	Jiri Mikulas
2003-10-29 23:01:37 +00:00
Sam Leffler
8dbc9c8bf4 Always queue looped back packets (rather than potentially using
direct dispatch) to avoid extensive kernel stack usage and to
avoid directly re-entering the network stack.  The latter causes
locking problems when, for example, a complete TCP handshake
happens w/o a context switch.
2003-10-29 18:37:47 +00:00
Sam Leffler
9c855a36c1 Introduce the notion of "persistent mbuf tags"; these are tags that stay
with an mbuf until it is reclaimed.  This is in contrast to tags that
vanish when an mbuf chain passes through an interface.  Persistent tags
are used, for example, by MAC labels.

Add an m_tag_delete_nonpersistent function to strip non-persistent tags
from mbufs and use it to strip such tags from packets as they pass through
the loopback interface and when turned around by icmp.  This fixes problems
with "tag leakage".

Pointed out by:	Jonathan Stone
Reviewed by:	Robert Watson
2003-10-29 05:40:07 +00:00
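Stripping non-persistent tags from a packet's tag list can be sketched as a linked-list filter. The types and function names below are hypothetical user-space stand-ins (they are not the mbuf tag API), and marking persistence with a per-tag flag is an assumption made for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-ins for a packet and its tag list. */
struct tag {
    struct tag *next;
    int         id;
    int         persistent;   /* nonzero: stays until the packet is freed */
};

struct pkt {
    struct tag *tags;
};

/* Prepend a tag; returns 0 on success, -1 if allocation fails. */
int
pkt_tag_add(struct pkt *p, int id, int persistent)
{
    struct tag *t = malloc(sizeof(*t));

    if (t == NULL)
        return (-1);
    t->id = id;
    t->persistent = persistent;
    t->next = p->tags;
    p->tags = t;
    return (0);
}

/* Unlink and free every tag that is not marked persistent. */
void
pkt_tag_delete_nonpersistent(struct pkt *p)
{
    struct tag **pp = &p->tags;

    while (*pp != NULL) {
        struct tag *t = *pp;

        if (!t->persistent) {
            *pp = t->next;     /* unlink, keep pp on the same link */
            free(t);
        } else {
            pp = &t->next;     /* keep the tag, advance */
        }
    }
}

int
pkt_tag_count(const struct pkt *p)
{
    int n = 0;

    for (const struct tag *t = p->tags; t != NULL; t = t->next)
        n++;
    return (n);
}
```

Calling pkt_tag_delete_nonpersistent() at the point where a packet is turned around models the fix for "tag leakage": per-hop tags are dropped while tags such as a security label survive.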
Brooks Davis
5e17543a23 Use VLANNAME instead of "vlan". 2003-10-28 20:58:02 +00:00
Alexander Kabaev
78ece5537d Delay if_lo module initialization until domain list has been
completely populated. This prevents a system crash on boot.
2003-10-27 04:16:32 +00:00
Hajimu UMEMOTO
9dcd9885cf use official # for IFT_STF
(is there any backward compat issue?  i don't think so)

Obtained from:	KAME
2003-10-26 14:29:04 +00:00
Hajimu UMEMOTO
234a35c714 Since dp->dom_ifattach calls malloc() with M_WAITOK, we cannot
use mutex lock directly here.  Protect ifp->if_afdata instead.

Reported by:	grehan
2003-10-24 16:57:59 +00:00
Warner Losh
39efe73c15 Remove unnecessary (caddr_t) casts of if_broadcastaddr. 2003-10-23 17:47:55 +00:00
Brooks Davis
76d6b7514f Use IF_MAXUNIT instead of rolling our own identical TUN_MAXUNIT. 2003-10-23 16:29:20 +00:00
Warner Losh
64760eb06e Merge from p4 (noticed these changes with DES' if_ethersubr.c changes caused
a minor conflict):
o Use ETHER_ADDR_LEN in preference to '6'.
o Remove two unnecessary (caddr_t) casts.  One of them causes problems in
  my tree where etherbroadcastaddr is const, and (caddr_t) casts the const
  away.
2003-10-23 15:09:34 +00:00
Dag-Erling Smørgrav
72fd1b6a20 Clean up whitespace, remove "register" keyword, ANSIfy.
No functional changes.
2003-10-23 13:49:10 +00:00
Hajimu UMEMOTO
86b51224d4 we have ppsratecheck(). 2003-10-22 19:23:51 +00:00
Hajimu UMEMOTO
e115574c1d protect by IFNET_RLOCK. 2003-10-22 15:10:39 +00:00
Mike Silbersack
5d5b5d0f99 Add a new macro M_ASSERTVALID which ensures that the mbuf in question
is non-free.  (More checks can/should be added in the future.)

Use M_ASSERTVALID in BPF_MTAP so that we catch when freed mbufs are
passed in, even if no bpf listeners are active.

Inspired by a bug in if_dc caught by Kenjiro Cho.
2003-10-19 22:33:41 +00:00
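The point of placing the validity check ahead of the listener test can be shown with a small sketch. The names (struct buf, BUF_FREE, BUF_ASSERTVALID, buf_tap) are hypothetical, assert() stands in for the kernel assertion, and representing "free" as a flag bit is an assumption.

```c
#include <assert.h>
#include <stddef.h>

#define BUF_FREE 0x1u   /* hypothetical flag: buffer is on the free list */

struct buf {
    unsigned flags;
};

/* Fail loudly if a freed buffer is handed to us. */
#define BUF_ASSERTVALID(b) assert(((b)->flags & BUF_FREE) == 0)

int tap_listeners;   /* number of active listeners */
int tap_delivered;   /* number of buffers handed to listeners */

/*
 * Tap point: the assertion runs unconditionally, so a driver that
 * passes a freed buffer is caught even when nobody is listening.
 */
void
buf_tap(const struct buf *b)
{
    BUF_ASSERTVALID(b);
    if (tap_listeners > 0)
        tap_delivered++;
}
```

If the check sat inside the listener branch, the bug would only show up on machines that happen to have a capture session running.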
Hajimu UMEMOTO
31b1bfe1b0 - add dom_if{attach,detach} framework.
- transition to use ifp->if_afdata.

Obtained from:	KAME
2003-10-17 15:46:31 +00:00
Sam Leffler
ea04521020 Correct handling of cloning loop avoidance: rtalloc1 may return a null
pointer in which case we should not do the unlock.

Supported by:	FreeBSD Foundation
2003-10-16 16:17:17 +00:00
Hajimu UMEMOTO
212bd869db AF_LINK sockaddr has to be attached to ifp->if_addrlist until the
end, as much of the code assumes that TAILQ_FIRST(ifp->if_addrlist)
is non-null.

Submitted by:	itojun
2003-10-16 13:38:29 +00:00
Hajimu UMEMOTO
b42ac57f4f - support AES counter mode for ESP.
- use size_t as return type of schedlen(), as there's no error
  check needed.
- clear key schedule buffer before freeing.

Obtained from:	KAME
2003-10-13 14:57:41 +00:00
Hajimu UMEMOTO
c65ee7c758 - support AES XCBC MAC for AH
- correct SADB_X_AALG_RIPEMD160HMAC to 8

Obtained from:	KAME
2003-10-13 04:54:51 +00:00
Robert Watson
d32864c331 Comment spelling fix. 2003-10-12 20:51:26 +00:00
Sam Leffler
3299a156c7 fix braino: null the pointer whose memory we just freed, not some other
pointers that are (potentially) used later
2003-10-11 04:48:35 +00:00
Sam Leffler
3e6a836eea ensure local variable is initialized prior to use 2003-10-07 16:56:35 +00:00
Hajimu UMEMOTO
8c99329e89 return(code) -> return (code)
Obtained from:	KAME
2003-10-07 15:02:29 +00:00
Sam Leffler
4de5d90c8e fix typo that caused a panic when processing an ICMP redirect
Sponsored by:	FreeBSD Foundation
2003-10-05 19:05:53 +00:00
Sam Leffler
d1dd20be6e Locking for updates to routing table entries. Each rtentry gets a mutex
that covers updates to the contents.  Note this is separate from holding
a reference and/or locking the routing table itself.

Other/related changes:

o rtredirect loses the final parameter by which an rtentry reference
  may be returned; this was never used and added unwarranted complexity
  for locking.
o minor style cleanups to routing code (e.g. ansi-fy function decls)
o remove the logic to bump the refcnt on the parent of cloned routes,
  we assume the parent will remain as long as the clone; doing this avoids
  a circularity in locking during delete
o convert some timeouts to MPSAFE callouts

Notes:

1. rt_mtx in struct rtentry is guarded by #ifdef _KERNEL as user-level
   applications cannot/do not know about mutexes.  Doing this requires
   that the mutex be the last element in the structure.  A better solution
   is to introduce an externalized version of struct rtentry but this is
   a major task because of the intertwining of rtentry and other data
   structures that are visible to user applications.
2. There are known LOR's that are expected to go away with forthcoming
   work to eliminate many held references.  If not these will be resolved
   prior to release.
3. ATM changes are untested.

Sponsored by:	FreeBSD Foundation
Obtained from:	BSD/OS (partly)
2003-10-04 03:44:50 +00:00
Sam Leffler
5f7a7923ea add a stub for bpfattach2 so bpf is not required with the 802.11
module or related drivers

Spotted by:	Dan Lukes <dan@obluda.cz>
2003-10-04 01:32:28 +00:00
Robert Watson
5fd04e380f When direct dispatching a netisr (net.isr.enable=1), if there are already
any queued packets for the isr, process those packets before the newly
submitted packet, maintaining ordering of all packets being delivered
to the netisr.  Remove the bypass counter since we don't bypass anymore.
Leave the comment about possible problems and options since later
performance optimization may change the strategy for addressing ordering
problems here.

Specifically, this maintains the strong isr ordering guarantee; additional
parallelism and lower latency may be possible by moving to weaker
guarantees (per-interface, for example).  We will probably at some point
also want to remove the one instance netisr dispatch limit currently
enforced by a mutex, but it's not clear that's 100% safe yet, even in
the netperf branch.

Reviewed by:	sam, others
2003-10-03 18:27:24 +00:00
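The ordering rule described above (drain anything already queued before handling the directly dispatched packet) can be sketched as follows. This is a single-threaded user-space model: struct isr and its functions are hypothetical, packets are plain integers, and the mutex that limits a netisr to one running instance is not modelled.

```c
#include <assert.h>
#include <stddef.h>

#define QLEN 8   /* hypothetical queue capacity */

struct isr {
    int    queue[QLEN];      /* ring buffer of pending packets */
    size_t head, count;
    int    seen[2 * QLEN];   /* order in which packets were handled */
    size_t nseen;
};

void
isr_init(struct isr *s)
{
    s->head = s->count = s->nseen = 0;
}

/* The protocol handler; here it only records the processing order. */
static void
isr_handle(struct isr *s, int pkt)
{
    if (s->nseen < 2 * QLEN)
        s->seen[s->nseen++] = pkt;
}

/* Deferred path: queue the packet for later; -1 if the queue is full. */
int
isr_queue(struct isr *s, int pkt)
{
    if (s->count == QLEN)
        return (-1);
    s->queue[(s->head + s->count) % QLEN] = pkt;
    s->count++;
    return (0);
}

/* Direct path: handle older queued packets first, then the new one. */
void
isr_dispatch(struct isr *s, int pkt)
{
    while (s->count > 0) {
        int q = s->queue[s->head];

        s->head = (s->head + 1) % QLEN;
        s->count--;
        isr_handle(s, q);
    }
    isr_handle(s, pkt);
}
```

Because isr_dispatch() never handles the new packet ahead of queued ones, there is nothing left to count as a "bypass", which matches the removal of the bypass counter.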
Sam Leffler
aea8b30f12 trivial locking rtsock_cb
Sponsored by:	FreeBSD Foundation
2003-10-03 18:27:02 +00:00
Sam Leffler
becc44d76c cleanups prior to adding locking (and in some cases to eliminate locking):
o move route_cb to be private to rtsock.c
o replace global static route_proto by locals
o eliminate global #define shorthands for info references
o remove some register decls
o ansi-fy function decls
o move items to be close in scope to their usage
o add rt_dispatch function for dispatching the actual message
o cleanup tangled logic for doing all-but-me msg send

Supported by:	FreeBSD Foundation
2003-10-03 18:15:54 +00:00
Robert Watson
e590eca2ad Create a tunable for net.isr.enable so that it may be set from
inception, rather than having to wait for the boot to finish.
2003-10-02 02:54:10 +00:00
Robert Watson
3164565d39 Temporarily turn net.isr.enable back off again until patches to
correct potential nits in packet ordering are resolved.
2003-10-01 22:15:16 +00:00