Commit Graph

81 Commits

Author SHA1 Message Date
wpaul
6a0eb38f30 Fix a bug which I discovered recently while doing IPv6 testing at
Wind River. In the IPv4 output path, one of the tests in ip_output()
checks how many slots are actually available in the interface output
queue before attempting to send a packet. If, for example, we need
to transmit a packet of 32K bytes over an interface with an MTU of
1500, we know it's going to take about 21 fragments to do it. If
there's less than 21 slots left in the output queue, there's no point
in transmitting anything at all: IP does not do retransmission, so
sending only some of the fragments would just be a waste of bandwidth.
(In an extreme case, if you're sending a heavy stream of fragmented
packets, you might find yourself sending nothing by the first fragment
of all your packets.) So if ip_output() notices there's not enough
room in the output queue to send the frame, it just dumps the packet
and returns ENOBUFS to the app.

It turns out ip6_output() lacks this code. Consequently, this caused
the netperf UDPIPV6_STREAM test to produce very poor results with large
write sizes. This commit adds code to check the remaining space in the
output queue and junk fragmented packets if they're too big to be
sent, just like with IPv4. (I can't imagine anyone's running an NFS
server using UDP over IPv6, but if they are, this will likely make them
a lot happier. :)
2004-05-14 03:57:17 +00:00
imp
b49b7fe799 Remove advertising clause from University of California Regent's
license, per letter dated July 22, 1999 and email from Peter Wemm,
Alan Cox and Robert Watson.

Approved by: core, peter, alc, rwatson
2004-04-07 20:46:16 +00:00
ume
11f479f519 Validate IPv6 socket options more carefully to avoid a panic.
PR:		kern/61513
Reviewed by:	cperciva, nectar
2004-03-26 19:52:18 +00:00
ume
92aaace604 IPSEC and FAST_IPSEC have the same internal API now;
so merge these (IPSEC has an extra ipsecstat)

Submitted by:	"Bjoern A. Zeeb" <bzeeb+freebsd@zabbadoz.net>
2004-02-17 14:02:37 +00:00
ume
4975c09f54 - obey ip6po_minmtu.
- notify a proper path MTU to applications.

Obtained from:	KAME
2004-02-08 18:22:27 +00:00
ume
de3407d028 pass pcb rather than so. it is expected that per socket policy
works again.
2004-02-03 18:20:55 +00:00
ume
11aa947d12 Catch a few places where NULL (pointer) was used where 0 (integer) was
expected (fix build).
2003-12-23 11:01:17 +00:00
peter
7cc77e03ae Catch a few places where NULL (pointer) was used where 0 (integer) was
expected.
2003-12-23 02:36:43 +00:00
suz
ed93d04274 fixed a bug that IPv6 routing header does not work properly if specified from userland application
reviewed by: ume
2003-12-22 03:12:13 +00:00
suz
3c416f8489 fixed an IPv6 path MTU discovery failure owing to a lack of initialization
Reviewed by: ume
Approved by: re (scottl)
MFC after: 1 day
2003-12-17 04:31:07 +00:00
ume
939be2da2f pktopt may be null.
Approved by:	re (rwatson)
2003-11-24 01:53:36 +00:00
andre
6164d7c280 Introduce tcp_hostcache and remove the tcp specific metrics from
the routing table.  Move all usage and references in the tcp stack
from the routing table metrics to the tcp hostcache.

It caches measured parameters of past tcp sessions to provide better
initial start values for following connections from or to the same
source or destination.  Depending on the network parameters to/from
the remote host this can lead to significant speedups for new tcp
connections after the first one because they inherit and shortcut
the learning curve.

tcp_hostcache is designed for multiple concurrent access in SMP
environments with high contention and is hash indexed by remote
ip address.

It removes significant locking requirements from the tcp stack with
regard to the routing table.

Reviewed by:	sam (mentor), bms
Reviewed by:	-net, -current, core@kame.net (IPv6 parts)
Approved by:	re (scottl)
2003-11-20 20:07:39 +00:00
ume
11df5d4f1a correct to look right interface. 2003-11-17 07:53:32 +00:00
sam
c997776d7c replace explicit changes to rt_refcnt by RT_ADDREF and RT_REMREF
macros that expand to include assertions when the system is built
with INVARIANTS

Supported by:	FreeBSD Foundation
2003-11-08 23:36:32 +00:00
ume
bbea30a3ab correct behavior when ipv6mr_interface is 0. Matthias Drochner
Notified by:	itojun
Obtained from:	NetBSD
2003-11-06 16:42:59 +00:00
ume
373abd9403 - cleanup SP refcnt issue.
- share policy-on-socket for listening socket.
- don't copy policy-on-socket at all.  secpolicy no longer contain
  spidx, which saves a lot of memory.
- deep-copy pcb policy if it is an ipsec policy.  assign ID field to
  all SPD entries.  make it possible for racoon to grab SPD entry on
  pcb.
- fixed the order of searching SA table for packets.
- fixed to get a security association header.  a mode is always needed
  to compare them.
- fixed that the incorrect time was set to
  sadb_comb_{hard|soft}_usetime.
- disallow port spec for tunnel mode policy (as we don't reassemble).
- an user can define a policy-id.
- clear enc/auth key before freeing.
- fixed that the kernel crashed when key_spdacquire() was called
  because key_spdacquire() had been implemented imcopletely.
- preparation for 64bit sequence number.
- maintain ordered list of SA, based on SA id.
- cleanup secasvar management; refcnt is key.c responsibility;
  alloc/free is keydb.c responsibility.
- cleanup, avoid double-loop.
- use hash for spi-based lookup.
- mark persistent SP "persistent".
  XXX in theory refcnt should do the right thing, however, we have
  "spdflush" which would touch all SPs.  another solution would be to
  de-register persistent SPs from sptree.
- u_short -> u_int16_t
- reduce kernel stack usage by auto variable secasindex.
- clarify function name confusion.  ipsec_*_policy ->
  ipsec_*_pcbpolicy.
- avoid variable name confusion.
  (struct inpcbpolicy *)pcb_sp, spp (struct secpolicy **), sp (struct
  secpolicy *)
- count number of ipsec encapsulations on ipsec4_output, so that we
  can tell ip_output() how to handle the packet further.
- When the value of the ul_proto is ICMP or ICMPV6, the port field in
  "src" of the spidx specifies ICMP type, and the port field in "dst"
  of the spidx specifies ICMP code.
- avoid from applying IPsec transport mode to the packets when the
  kernel forwards the packets.

Tested by:	nork
Obtained from:	KAME
2003-11-04 16:02:05 +00:00
ume
2ecf5196e3 do not insert a dest option header (even specified by a user) that
should be placed before a routing header, unless a routing header
really exists.

Obtained from:	KAME
2003-10-31 16:32:12 +00:00
ume
ecb479d77a re-add wrongly disappered IPV6_CHECKSUM stuff by introducing
ip6_raw_ctloutput().

Obtained from:	KAME
2003-10-26 18:17:01 +00:00
ume
19c7c976c8 remove the ip6r0_addr and ip6r0_slmap members from ip6_rthdr0{}
according to rfc2292bis.

Obtained from:	KAME
2003-10-24 20:37:05 +00:00
ume
881c4fa391 Switch Advanced Sockets API for IPv6 from RFC2292 to RFC3542
(aka RFC2292bis).  Though I believe this commit doesn't break
backward compatibility againt existing binaries, it breaks
backward compatibility of API.
Now, the applications which use Advanced Sockets API such as
telnet, ping6, mld6query and traceroute6 use RFC3542 API.

Obtained from:	KAME
2003-10-24 18:26:30 +00:00
ume
5199c863f8 - change scope to zone.
- change node-local to interface-local.
- better error handling of address-to-scope mapping.
- use in6_clearscope().

Obtained from:	KAME
2003-10-21 20:05:32 +00:00
ume
1bfb498609 correct linkmtu handling.
Obtained from:	KAME
2003-10-20 15:27:48 +00:00
ume
babf2c3ec0 - add dom_if{attach,detach} framework.
- transition to use ifp->if_afdata.

Obtained from:	KAME
2003-10-17 15:46:31 +00:00
ume
a72f1bdb76 nuke SCOPEDROUTING. Though it was there for a long time,
it was never enabled.
2003-10-10 16:04:00 +00:00
ume
cb2c1545ab - fix typo in comments.
- style.
- NULL is not 0.
- some variables were renamed.
- nuke unused logic.
(there is no functional change.)

Obtained from:	KAME
2003-10-08 18:26:08 +00:00
sam
5506d3b8f4 must lock route when the caller provided a route but not
an interface; otherwise the subsequent unlock blows up

Suffered by:	Marcel Moolenaar <marcel@xcllnt.net>
Supported by:	FreeBSD Foundation
2003-10-07 20:57:35 +00:00
ume
6c1377b9ef return(code) -> return (code)
(reduce diffs against KAME)
2003-10-06 14:02:09 +00:00
sam
9d93fce265 Locking for updates to routing table entries. Each rtentry gets a mutex
that covers updates to the contents.  Note this is separate from holding
a reference and/or locking the routing table itself.

Other/related changes:

o rtredirect loses the final parameter by which an rtentry reference
  may be returned; this was never used and added unwarranted complexity
  for locking.
o minor style cleanups to routing code (e.g. ansi-fy function decls)
o remove the logic to bump the refcnt on the parent of cloned routes,
  we assume the parent will remain as long as the clone; doing this avoids
  a circularity in locking during delete
o convert some timeouts to MPSAFE callouts

Notes:

1. rt_mtx in struct rtentry is guarded by #ifdef _KERNEL as user-level
   applications cannot/do-no know about mutex's.  Doing this requires
   that the mutex be the last element in the structure.  A better solution
   is to introduce an externalized version of struct rtentry but this is
   a major task because of the intertwining of rtentry and other data
   structures that are visible to user applications.
2. There are known LOR's that are expected to go away with forthcoming
   work to eliminate many held references.  If not these will be resolved
   prior to release.
3. ATM changes are untested.

Sponsored by:	FreeBSD Foundation
Obtained from:	BSD/OS (partly)
2003-10-04 03:44:50 +00:00
ume
59fe55cb24 Obey RANDOM_IP_ID.
Requested by:	sam
2003-10-01 16:00:12 +00:00
ume
7a9738e262 randomize IPv6 fragment ID.
Obtained from:	KAME
2003-10-01 15:13:29 +00:00
sam
d1d4c947ce Correct pfil_run_hooks return handling: if the return value is non-zero
then the mbuf has been consumed by a hook; otherwise beware of a null
mbuf return (gack).  In particular the bridge was doing the wrong thing.
While in the ipv6 code make it's handling of pfil_run_hooks identical
to netbsd.

Pointed out by:	Pyun YongHyeon <yongari@kt-is.co.kr>
2003-09-30 04:46:08 +00:00
sam
cd738e8574 o update PFIL_HOOKS support to current API used by netbsd
o revamp IPv4+IPv6+bridge usage to match API changes
o remove pfil_head instances from protosw entries (no longer used)
o add locking
o bump FreeBSD version for 3rd party modules

Heavy lifting by:	"Max Laier" <max@love2party.net>
Supported by:		FreeBSD Foundation
Obtained from:		NetBSD (bits of pfil.h and pfil.c)
2003-09-23 17:54:04 +00:00
jlemon
26815368d4 Remove unused variables in the IPSEC case.
Submitted by:  Lars Eggert <larse@ISI.EDU>
2003-02-20 18:22:21 +00:00
jlemon
a8bc02dcb2 Add a TCP TIMEWAIT state which uses less space than a fullblown TCP
control block.  Allow the socket and tcpcb structures to be freed
earlier than inpcb.  Update code to understand an inp w/o a socket.

Reviewed by: hsu, silby, jayanth
Sponsored by: DARPA, NAI Labs
2003-02-19 22:32:43 +00:00
imp
cf874b345d Back out M_* changes, per decision of the TRB.
Approved by: trb
2003-02-19 05:47:46 +00:00
alfred
bf8e8a6e8f Remove M_TRYWAIT/M_WAITOK/M_WAIT. Callers should use 0.
Merge M_NOWAIT/M_DONTWAIT into a single flag M_NOWAIT.
2003-01-21 08:56:16 +00:00
sam
8a8e425d5f purge extraneous clears of M_PKTHDR since M_MOVE_PKTHDR does this already 2003-01-06 21:29:27 +00:00
sam
b16cb0a948 Correct mbuf packet header propagation. Previously, packet headers
were sometimes propagated using M_COPY_PKTHDR which actually did
something between a "move" and a  "copy" operation.  This is replaced
by M_MOVE_PKTHDR (which copies the pkthdr contents and "removes" it
from the source mbuf) and m_dup_pkthdr which copies the packet
header contents including any m_tag chain.  This corrects numerous
problems whereby mbuf tags could be lost during packet manipulations.

These changes also introduce arguments to m_tag_copy and m_tag_copy_chain
to specify if the tag copy work should potentially block.  This
introduces an incompatibility with openbsd which we may want to revisit.

Note that move/dup of packet headers does not handle target mbufs
that have a cluster bound to them.  We may want to support this;
for now we watch for it with an assert.

Finally, M_COPYFLAGS was updated to include M_FIRSTFRAG|M_LASTFRAG.

Supported by:	Vernier Networks
Reviewed by:	Robert Watson <rwatson@FreeBSD.org>
2002-12-30 20:22:40 +00:00
ume
e7761b68ee plugged memory leakage in some erroneous cases
Obtained from:	KAME
MFC after:	1 week
2002-10-31 19:45:48 +00:00
sam
0ef6c52bbc Tie new "Fast IPsec" code into the build. This involves the usual
configuration stuff as well as conditional code in the IPv4 and IPv6
areas.  Everything is conditional on FAST_IPSEC which is mutually
exclusive with IPSEC (KAME IPsec implmentation).

As noted previously, don't use FAST_IPSEC with INET6 at the moment.

Reviewed by:	KAME, rwatson
Approved by:	silence
Supported by:	Vernier Networks
2002-10-16 02:25:05 +00:00
sam
2a86be217a Replace aux mbufs with packet tags:
o instead of a list of mbufs use a list of m_tag structures a la openbsd
o for netgraph et. al. extend the stock openbsd m_tag to include a 32-bit
  ABI/module number cookie
o for openbsd compatibility define a well-known cookie MTAG_ABI_COMPAT and
  use this in defining openbsd-compatible m_tag_find and m_tag_get routines
o rewrite KAME use of aux mbufs in terms of packet tags
o eliminate the most heavily used aux mbufs by adding an additional struct
  inpcb parameter to ip_output and ip6_output to allow the IPsec code to
  locate the security policy to apply to outbound packets
o bump __FreeBSD_version so code can be conditionalized
o fixup ipfilter's call to ip_output based on __FreeBSD_version

Reviewed by:	julian, luigi (silent), -arch, -net, darren
Approved by:	julian, silence from everyone else
Obtained from:	openbsd (mostly)
MFC after:	1 month
2002-10-16 01:54:46 +00:00
archie
7a233d4c9f Replace (ab)uses of "NULL" where "0" is really meant. 2002-08-22 21:24:01 +00:00
ume
ed0d6e9ce4 make sure to set/unset INP_IPV4 according to a value
of IN6P_IPV6_V6ONLY

Reviewed by:	Keiichi SHIMA <keiichi@iij.ad.jp>
2002-07-24 19:19:53 +00:00
ume
cd3f29ae66 do not refer to IN6P_BINDV6ONLY anymore.
Obtained from:	KAME
MFC after:	1 week
2002-07-22 15:51:02 +00:00
suz
553226e8e1 just merged cosmetic changes from KAME to ease sync between KAME and FreeBSD.
(based on freebsd4-snap-20020128)

Reviewed by:	ume
MFC after:	1 week
2002-04-19 04:46:24 +00:00
jhb
dc2e474f79 Change the suser() API to take advantage of td_ucred as well as do a
general cleanup of the API.  The entire API now consists of two functions
similar to the pre-KSE API.  The suser() function takes a thread pointer
as its only argument.  The td_ucred member of this thread must be valid
so the only valid thread pointers are curthread and a few kernel threads
such as thread0.  The suser_cred() function takes a pointer to a struct
ucred as its first argument and an integer flag as its second argument.
The flag is currently only used for the PRISON_ROOT flag.

Discussed on:	smp@
2002-04-01 21:31:13 +00:00
ume
1787e9ff8d Fix cached route problem.
Submitted by:	Keiichi SHIMA <keiichi@iij.ad.jp> (KAME)
Reviewed by:	JINMEI Tatuya <jinmei@isl.rdc.toshiba.co.jp> (KAME)
MFC after:	1 week
2002-03-29 15:42:44 +00:00
alfred
f7f2a5dc47 Remove duplicate extern declarations to silence warnings. 2002-03-19 19:45:41 +00:00
arr
915c6a9163 - Replace M_WAIT with M_TRYWAIT since the M_WAIT flag is deprecated.
Spotted by: bde
2001-12-09 17:48:08 +00:00
julian
5596676e6c KSE Milestone 2
Note ALL MODULES MUST BE RECOMPILED
make the kernel aware that there are smaller units of scheduling than the
process. (but only allow one thread per process at this time).
This is functionally equivalent to teh previousl -current except
that there is a thread associated with each process.

Sorry john! (your next MFC will be a doosie!)

Reviewed by: peter@freebsd.org, dillon@freebsd.org

X-MFC after:    ha ha ha ha
2001-09-12 08:38:13 +00:00