2456 Commits

Author SHA1 Message Date
Andre Oppermann
eaf80179e2 Have TCP Inflight disable itself if the RTT is below a certain
threshold.  Inflight doesn't make sense on a LAN as it has
trouble figuring out the maximal bandwidth because of the coarse
tick granularity.

The sysctl net.inet.tcp.inflight.rttthresh specifies the threshold
in milliseconds below which inflight will disengage.  It defaults
to 10ms.

Tested by:	Joao Barros <joao.barros-at-gmail.com>,
		Rich Murphey <rich-at-whiteoaklabs.com>
Sponsored by:	TCP/IP Optimization Fundraise 2005
2006-02-16 19:38:07 +00:00
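
The threshold check described in eaf80179e2 can be pictured roughly as follows. This is a minimal userland sketch, not the kernel code: the variable inflight_rttthresh_ms and the function inflight_engaged() are hypothetical names standing in for the net.inet.tcp.inflight.rttthresh sysctl and whatever test the TCP stack actually performs.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for net.inet.tcp.inflight.rttthresh (milliseconds). */
static int inflight_rttthresh_ms = 10;

/*
 * Return true if inflight limiting should stay engaged for a connection
 * whose measured RTT is rtt_ms.  Below the threshold (typically a LAN)
 * it disengages.
 */
static bool
inflight_engaged(int rtt_ms)
{
	assert(rtt_ms >= 0);
	return (rtt_ms >= inflight_rttthresh_ms);
}
```

With the default of 10ms, a 1ms LAN round trip would leave inflight disengaged, while an 80ms WAN round trip would keep it active.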
Andre Oppermann
cf744713e8 In in_pcbconnect_setup() reduce code duplication and use ip_rtaddr()
to find the outgoing interface for this connection.

Sponsored by:	TCP/IP Optimization Fundraise 2005
MFC after:	2 weeks
2006-02-16 15:45:28 +00:00
Andre Oppermann
a4684d742d Make sysctl_msec_to_ticks(SYSCTL_HANDLER_ARGS) generally available instead
of being private to tcp_timer.c.

Sponsored by:	TCP/IP Optimization Fundraise 2005
MFC after:	3 days
2006-02-16 15:40:36 +00:00
Ruslan Ermilov
ea9dce1461 When sending a packet from dummynet, indicate that we're forwarding
it so that ip_id etc. don't get overwritten.  This fixes forwarding
of fragmented IP packets through a dummynet pipe -- fragments came
out with modified and different(!) ip_id's, making it impossible to
reassemble a datagram at the receiver side.

Submitted by:	Alexander Karptsov (reworked by me)
MFC after:	3 days
2006-02-14 06:36:39 +00:00
Qing Li
eee9df08bd Set the M_ZERO flag when calling uma_zalloc() to allocate a syncache entry.
Reviewed by:	andre, glebius
MFC after:	3 days
2006-02-09 21:29:02 +00:00
Qing Li
c1fd993af9 Redo the previous fix by setting the UMA_ZONE_ZINIT bit in the syncache
zone, eliminating the need to call bzero() after each syncache entry
allocation.

Suggested by:	glebius
Reviewed by:	andre
MFC after:	3 days
2006-02-08 23:32:57 +00:00
Qing Li
737b12e98f Fix a crash that occurs because the memory of the newly allocated
syncache entry in syncache_lookup() is not cleared, which may leave an
arbitrary, bogus rtentry pointer that later gets freed.

Reviewed by: andre
MFC after: 3 days
2006-02-07 19:59:46 +00:00
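
The class of bug fixed in 737b12e98f (and reworked in the two commits above it) can be illustrated in userland. This is a sketch under stated assumptions: struct cache_entry and its helpers are hypothetical, and calloc() stands in for a zeroing allocation such as uma_zalloc() with M_ZERO.

```c
#include <assert.h>
#include <stdlib.h>

struct route_entry {
	int refcnt;
};

/* Hypothetical cache entry that owns an optional route pointer. */
struct cache_entry {
	struct route_entry *rt;
	unsigned int flags;
};

/*
 * Allocate a zeroed entry.  With a non-zeroing allocator, e->rt would
 * hold whatever bytes were left in memory, and cache_entry_free() would
 * later pass that garbage to free().  (All-bits-zero reads back as NULL
 * on mainstream platforms.)
 */
static struct cache_entry *
cache_entry_alloc(void)
{
	return (calloc(1, sizeof(struct cache_entry)));
}

static void
cache_entry_free(struct cache_entry *e)
{
	assert(e != NULL);
	free(e->rt);	/* safe only because rt is NULL or a valid pointer */
	free(e);
}
```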
Oleg Bulyzhin
6edb555dbc Fix a five-year-old bug in ip_reass(): if we are using 'full' (i.e.
including pseudo header) hardware rx checksum offloading, ip_reass() fails to
calculate the TCP/UDP checksum for the reassembled packet correctly.  This
should also fix the recent 'NFS over UDP over bge' issue exposed by if_bge.c
rev. 1.123.

Reviewed by:	sam (earlier version), bde
Approved by:	glebius (mentor)
MFC after:	2 weeks
2006-02-07 11:48:10 +00:00
Hajimu UMEMOTO
d5e8a67ee9 Never select the PCB that has INP_IPV6 flag and is bound to :: if
we have another PCB which is bound to 0.0.0.0.  If a PCB has the
INP_IPV6 flag, then we set its cost higher than IPv4 only PCBs.

Submitted by:	Keiichi SHIMA <keiichi__at__iijlab.net>
Obtained from:	KAME
MFC after:	1 week
2006-02-04 07:59:17 +00:00
Gleb Smirnoff
a7908db153 Dropping the lock in the transmit_event() is not safe, because we
store some pipe pointers on stack. If user reconfigures dummynet
in the interlock gap, we can work with freed pipes after relock.

To fix this, we decided not to send packets in transmit_event(),
but fill a queue. At the end of dummynet() and dummynet_io(),
after the lock is dropped, if there is something in the queue
we run dummynet_send() to process the queue.

In collaboration with:	ru
2006-02-03 11:38:19 +00:00
Gleb Smirnoff
ce62866023 Axe unused function. 2006-02-03 10:42:28 +00:00
Christian S.J. Peron
f5cdbcf14c Use PFIL_HOOKED macros in if_bridge and pass the right argument to
rw_assert. This un-breaks the build.

Submitted by:	Kostik Belousov
Pointy hat to:	csjp
2006-02-02 16:41:20 +00:00
Christian S.J. Peron
604afec496 Somewhat re-factor the read/write locking mechanism associated with the packet
filtering mechanisms to use the new rwlock(9) locking API:

- Drop the variables stored in the pfil_head structure which were specific to
  conditions and the home rolled read/write locking mechanism.
- Drop some includes which were used for condition variables
- Drop the inline functions, and convert them to macros. Also, move these
  macros into pfil.h
- Move pfil list locking macros into pfil.h as well
- Rename ph_busy_count to ph_nhooks. This variable will represent the number
  of IN/OUT hooks registered with the pfil head structure
- Define PFIL_HOOKED macro which evaluates to true if there are any
  hooks to be run by pfil_run_hooks
- In the IP/IP6 stacks, change the ph_busy_count comparison to use the new
  PFIL_HOOKED macro.
- Drop optimization in pfil_run_hooks which checks to see if there are any
  hooks to be run, and returns if not. This check is already performed by the
  IP stacks when they call:

        if (!PFIL_HOOKED(ph))
                goto skip_hooks;

- Drop in an assertion which makes sure that the number of hooks never drops
  below 0 for good measure. This in theory should never happen, and if it
  does then there are problems somewhere
- Drop special logic around PFIL_WAITOK because rw_wlock(9) does not sleep
- Drop variables which support home rolled read/write locking mechanism from
  the IPFW firewall chain structure.
- Swap out the read/write firewall chain lock internal to use the rwlock(9)
  API instead of our home rolled version
- Convert the inlined functions to macros

Reviewed by:	mlaier, andre, glebius
Thanks to:	jhb for the new locking API
2006-02-02 03:13:16 +00:00
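
The hook-count check described in 604afec496 can be sketched as below. Only the names ph_nhooks and PFIL_HOOKED come from the commit message; the structure layout, the macro body, and the helper functions are assumptions made for illustration and are not the actual pfil.h definitions.

```c
#include <assert.h>

/* Hypothetical, trimmed-down stand-in for the pfil head structure. */
struct pfil_head_sketch {
	int ph_nhooks;	/* number of IN/OUT hooks registered */
};

/* Assumed definition: true if at least one hook is registered. */
#define	PFIL_HOOKED(ph)	((ph)->ph_nhooks > 0)

static void
hook_added(struct pfil_head_sketch *ph)
{
	ph->ph_nhooks++;
}

static void
hook_removed(struct pfil_head_sketch *ph)
{
	ph->ph_nhooks--;
	/* The count should never drop below zero. */
	assert(ph->ph_nhooks >= 0);
}

/* Returns 1 if the hooks would be run for a packet, 0 if skipped. */
static int
input_path(struct pfil_head_sketch *ph)
{
	if (!PFIL_HOOKED(ph))
		goto skip_hooks;
	/* ... the real stack would call pfil_run_hooks() here ... */
	return (1);
skip_hooks:
	return (0);
}
```

Keeping the test in the caller means the common no-hooks case costs one integer comparison and no function call.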
Andre Oppermann
1dfcf0d2a3 Move the IPSEC related code blocks to their own file to unclutter
and significantly improve the readability of ip_input() and
ip_output() again.

The resulting IPSEC hooks in ip_input() and ip_output() may be
used later on for making IPSEC loadable.

This move is mostly mechanical and should preserve current IPSEC
behaviour as-is.  Nothing shall prevent improvements in the way
IPSEC interacts with the IPv4 stack.

Discussed with:	bz, gnn, rwatson; (earlier version)
2006-02-01 13:55:03 +00:00
Ruslan Ermilov
e46c3da737 Brain-o (use standard int types now). 2006-02-01 06:15:37 +00:00
Ruslan Ermilov
bc7eeed4c9 Fix multicast routing on 64-bit platforms.
Tested on:	amd64
MFC after:	3 days
2006-01-31 22:39:35 +00:00
Andrew Thompson
235073f4c0 Now that the bridge also processes Ethernet frames as itself, two arp replies
will be sent if there is an address on the bridge. Exclude the bridge from the
special arp handling.

This has been tested with all combinations of addresses on the bridge and members.

Pointed out by:	Michal Mertl
2006-01-31 21:29:41 +00:00
Gleb Smirnoff
25af0bb50e Add some initial locking to gif(4). It doesn't cover the whole driver,
however IPv4-in-IPv4 tunnels are now stable on SMP. Details:

- Add per-softc mutex.
- Hold the mutex on output.

The main problem was the rtentry, placed in softc. It could be
freed by ip_output(). Meanwhile, another thread being in
in_gif_output() can read and write this rtentry.

Reported by:	many
Tested by:	Alexander Shiryaev <aixp mail.ru>
2006-01-30 08:39:09 +00:00
Andrew Thompson
74948aa6f3 Back out r1.148; it causes two arp replies to be sent with different mac
addresses. One for the bridged interface with the IP address assigned but then
another with the mac for the bridge itself.
2006-01-29 23:21:01 +00:00
Andre Oppermann
ab48768b20 When doing IP forwarding with [FAST_]IPSEC compiled into the kernel
ip_forward() would report back a zero MTU in ICMP needfrag messages
because on an IPSEC SP lookup failure no MTU got computed.

Fix this by changing the logic to compute a new MTU in any case if
IPSEC didn't do it.

Change MTU computation logic to use egress interface MTU if available
or the next smaller MTU compared to the current packet size instead
of falling back to a very small fixed MTU.

Fix associated comment.

PR:		kern/91412
MFC after:	3 days
2006-01-24 17:57:19 +00:00
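
The MTU selection described in ab48768b20 might look roughly like this. It is a sketch, not the ip_forward() code: the function names are hypothetical, the table uses the plateau values listed in RFC 1191, and falling back to 68 (the minimum IPv4 MTU) when nothing smaller exists is an assumption.

```c
#include <assert.h>
#include <stddef.h>

/* Plateau values from RFC 1191, largest first (illustrative table). */
static const int mtu_plateaus[] = {
	65535, 32000, 17914, 8166, 4352, 2002, 1492, 1006, 508, 296, 68
};

/* Return the largest plateau strictly smaller than pktlen. */
static int
next_smaller_mtu(int pktlen)
{
	size_t i;

	for (i = 0; i < sizeof(mtu_plateaus) / sizeof(mtu_plateaus[0]); i++)
		if (mtu_plateaus[i] < pktlen)
			return (mtu_plateaus[i]);
	return (68);	/* assumed floor: minimum IPv4 MTU */
}

/*
 * Pick the MTU to report in an ICMP needfrag message: prefer the egress
 * interface MTU when known (> 0), otherwise step down from the packet size.
 */
static int
needfrag_mtu(int egress_ifmtu, int pktlen)
{
	assert(pktlen > 0);
	if (egress_ifmtu > 0)
		return (egress_ifmtu);
	return (next_smaller_mtu(pktlen));
}
```

For a 1500-byte packet with no known egress interface, this sketch reports 1492 rather than a very small fixed value.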
Andre Oppermann
1dec73a153 In ip_mdq() compute the TV_DELTA the correct way around.
PR:		kern/91851
Submitted by:	SAKAI Hiroaki <sakai.hiroaki-at-jp.fujitsu.com>
MFC after:	3 days
2006-01-24 17:09:12 +00:00
Andre Oppermann
31343a3da2 In in_control() remove the temporary in_ifaddr structure from the
ia_hash only if it actually is an AF_INET address.  All other places
test for sa_family == AF_INET but this one.

PR:		kern/92091
Submitted by:	Seth Kingsley <sethk-at-meowfishies.com>
MFC after:	3 days
2006-01-24 16:19:31 +00:00
Oleg Bulyzhin
44a515834f Fix minor bug in uRPF:
If net.link.ether.inet.useloopback=1 and we send broadcast packet using our
  own source ip address it may be rejected by uRPF rules.

  Same bug was fixed for IPv6 in rev. 1.115 by suz.

PR:		kern/76971
Approved by:	glebius (mentor)
MFC after:	3 days
2006-01-24 13:38:06 +00:00
Gleb Smirnoff
0b4ae859ac Implement 'ipfw fwd laddr,port' feature for UDP. According to ipfw(8)
it should work, however it never did. People expect it to work.

PR:		kern/90834
2006-01-24 09:08:54 +00:00
Gleb Smirnoff
1c0b0f523d Fix build. 2006-01-23 20:10:49 +00:00
Andre Oppermann
06003a1e7c Simplify ip_next_mtu() and make its logic easier to follow while
silencing code analysis tools.

Found by:	Coverity Prevent(tm)
Coverity ID:	CID341
Sponsored by:	TCP/IP Optimization Fundraise 2005
2006-01-23 17:06:32 +00:00
Robert Watson
136d4f1cf2 Convert remaining functions to ANSI C function declarations; remove
'register' where present.

MFC after:	1 week
2006-01-22 01:16:25 +00:00
Robert Watson
d0c75d36b9 Convert last remaining function in ip_gre.c to ANSI C function
declaration.

MFC after:	1 week
2006-01-22 01:08:30 +00:00
Bjoern A. Zeeb
3f2e28fe9f Fix stack corruptions on amd64.
Vararg functions have a different calling convention than regular
functions on amd64. Casting a vararg function to a regular one to
match the function pointer declaration will hide the varargs from
the caller and we will end up with an incorrectly setup stack.

Entirely remove the varargs from these functions and change the
functions to match the declaration of the function pointers.
Remove the now unnecessary casts.

Lots of explanations and help from:     peter
Reviewed by:                            peter
PR:                                     amd64/89261
MFC after:                              6 days
2006-01-21 10:44:34 +00:00
Christian S.J. Peron
9c57c204be - Change the return type for init_tables from void to int so we can propagate
  errors from rn_inithead back to the ipfw initialization function.
- Check return value of rn_inithead for failure, if table allocation has
  failed for any reason, free up any tables we have created and return ENOMEM
- In ipfw_init check the return value of init_tables and free up any mutexes or
  UMA zones which may have been created.
- Assert that the supplied table is not NULL before attempting to dereference.

This fixes panics which were a result of invalid memory accesses due to failed
table allocation. This is an issue mainly because the R_Zalloc function is a
malloc(M_NOWAIT) wrapper, thus making it possible for allocations to fail.

Found by:	Coverity Prevent (tm)
Coverity ID:	CID79
MFC after:	1 week
2006-01-20 05:35:27 +00:00
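
The error-propagation pattern described in 9c57c204be can be sketched in userland as below. All names are hypothetical: table_alloc() stands in for an allocation that can fail (as a malloc(M_NOWAIT) wrapper can), and fail_at is a test hook used to force a failure.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

#define	NTABLES	4

static int fail_at = -1;	/* test hook: index whose allocation fails */
static int live_tables;		/* number of tables currently allocated */

static void *
table_alloc(int i)
{
	void *p;

	if (i == fail_at)
		return (NULL);
	p = malloc(16);
	if (p != NULL)
		live_tables++;
	return (p);
}

static void
table_free(void *p)
{
	free(p);
	live_tables--;
}

/*
 * Allocate all tables.  On failure, release the ones already created
 * and report ENOMEM to the caller instead of continuing with a NULL table.
 */
static int
init_tables_sketch(void **tables)
{
	int i, j;

	assert(tables != NULL);
	for (i = 0; i < NTABLES; i++) {
		tables[i] = table_alloc(i);
		if (tables[i] == NULL) {
			for (j = 0; j < i; j++) {
				table_free(tables[j]);
				tables[j] = NULL;
			}
			return (ENOMEM);
		}
	}
	return (0);
}
```

A caller would check the return value and undo its own earlier setup (mutexes, zones) before returning the error further up.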
Christian S.J. Peron
e9186cb94b Destroy the dynamic rule zone in the event that we fail to insert the
initial default rule.

MFC after:	1 week
2006-01-20 03:21:25 +00:00
Andre Oppermann
0270746230 Do not dereference the ip header pointer in the IPv6 case.
This fixes a bug in the previous commit.

Found by:	Coverity Prevent(tm)
Coverity ID:	CID253
Sponsored by:	TCP/IP Optimization Fundraise 2005
MFC after:	3 days
2006-01-18 18:59:30 +00:00
Andre Oppermann
8f8d29f686 In in_delayed_cksum() we can't perform a m_pullup() as it may
change the mbuf pointer and we don't have any way of passing
it back to the callers.  Instead just fail silently without
updating the checksum but leaving the mbuf+chain intact.

A search in our GNATS database did not turn up any match for
the existing warning message when this case is encountered.

Found by:	Coverity Prevent(tm)
Coverity ID:	CID779
Sponsored by:	TCP/IP Optimization Fundraise 2005
MFC after:	3 days
2006-01-18 18:49:16 +00:00
Andre Oppermann
79eb490467 In syncache_expand() insert a proper syncache_free() to fix a case
that currently can't be triggered.  But better be safe than sorry
later on.  Additionally it properly silences Coverity Prevent for
future tests.

Found by:	Coverity Prevent(tm)
Coverity ID:	CID802
Sponsored by:	TCP/IP Optimization Fundraise 2005
MFC after:	3 days
2006-01-18 18:25:03 +00:00
Andre Oppermann
39550088cf Prevent dereferencing a NULL route pointer when trying to update the
route MTU.

This bug is very difficult to reach and not remotely exploitable.

Found by:	Coverity Prevent(tm)
Coverity ID:	CID162
Sponsored by:	TCP/IP Optimization Fundraise 2005
MFC after:	3 days
2006-01-18 15:05:05 +00:00
Andre Oppermann
5d691e6da8 Return mbuf pointer or NULL from ip_fastforward() as the mbuf pointer
may have changed by m_pullup() during fastforward processing.

While this is a bug it is actually never triggered in real world
situations and it is not remotely exploitable.

Found by:	Coverity Prevent(tm)
Coverity ID:	CID780
Sponsored by:	TCP/IP Optimization Fundraise 2005
2006-01-18 14:24:39 +00:00
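
The pointer-return convention adopted in 5d691e6da8 (and the reason 8f8d29f686 avoids m_pullup() where the pointer cannot be handed back) can be illustrated with a userland analogue. struct pkt and the functions below are hypothetical; pkt_pullup() only mimics the relevant m_pullup() property that the buffer may move, and that it is freed with NULL returned on failure.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

struct pkt {
	size_t len;
	unsigned char data[];
};

static struct pkt *
pkt_new(size_t len)
{
	struct pkt *p = calloc(1, sizeof(*p) + len);

	if (p != NULL)
		p->len = len;
	return (p);
}

/* Ensure at least `need` bytes; the packet may move.  Frees p on failure. */
static struct pkt *
pkt_pullup(struct pkt *p, size_t need)
{
	struct pkt *n;

	assert(p != NULL);
	if (p->len >= need)
		return (p);
	n = realloc(p, sizeof(*n) + need);
	if (n == NULL) {
		free(p);
		return (NULL);
	}
	memset(n->data + n->len, 0, need - n->len);
	n->len = need;
	return (n);
}

/*
 * Return the (possibly relocated) packet, or NULL if it was consumed.
 * The caller must continue with the returned pointer, never the old one.
 */
static struct pkt *
fastforward_sketch(struct pkt *p)
{
	p = pkt_pullup(p, 20);
	if (p == NULL)
		return (NULL);
	/* ... header processing would go here ... */
	return (p);
}
```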
Robert Watson
d248c7d7f5 Modify the IP fragment reassembly code so that it uses a new UMA zone,
ipq_zone, to allocate fragment headers from, rather than using cast mbuf
storage.  This was one of the few remaining uses of mbuf storage for
local data structures that relied on dtom().  Implement the resource
limit on ipq's using UMA zone limits, but preserve current sysctl
semantics using a sysctl proc.

MFC after:	3 weeks
2006-01-15 18:58:21 +00:00
Robert Watson
dfa60d9354 Staticize ipqlock, since it is local to ip_input.c.
MFC after:	3 days
2006-01-15 17:05:48 +00:00
George V. Neville-Neil
34f83c52e7 Check the correct TTL in both the IPv6 and IPv4 cases.
Submitted by:	glebius
Reviewed by:	gnn, bz
Found with:     Coverity Prevent(tm)
2006-01-14 16:39:31 +00:00
Gleb Smirnoff
ecedca7441 UMA can return NULL not only when our zone is full, but
also in case of a generic memory shortage. In the latter case we may
not find an old entry.

Found with:	Coverity Prevent(tm)
2006-01-14 13:04:08 +00:00
Robert Watson
e5bc0aa3c3 Remove dead code: 'opts' is not used in udp_append(), only in udp_input(),
so no need to assign it to NULL or conditionally free it.

Found with:	Coverity Prevent(tm)
MFC after:	3 days
2006-01-14 11:18:32 +00:00
Andrew Thompson
54c427e0e2 Include the bridge interface itself in the special arp handling.
PR:		90973
MFC after:	1 week
2006-01-12 21:05:30 +00:00
Colin Percival
9ed97bee65 Correct insecure temporary file usage in texindex. [06:01]
Correct insecure temporary file usage in ee. [06:02]
Correct a race condition when setting file permissions, sanitize file
names by default, and fix a buffer overflow when handling files
larger than 4GB in cpio. [06:03]
Fix an error in the handling of IP fragments in ipfw which can cause
a kernel panic. [06:04]

Security:	FreeBSD-SA-06:01.texindex
Security:	FreeBSD-SA-06:02.ee
Security:	FreeBSD-SA-06:03.cpio
Security:	FreeBSD-SA-06:04.ipfw
2006-01-11 08:02:16 +00:00
Andrew Thompson
73ff045c57 Add RFC 3378 EtherIP support. This change makes it possible to add gif
interfaces to bridges, which will then send and receive IP protocol 97 packets.
Packets are Ethernet frames with an EtherIP header prepended.

Obtained from:	NetBSD
MFC after:	2 weeks
2005-12-21 21:29:45 +00:00
Xin LI
92e0a4a2a4 Use the same indent character as the other IPPROTO_* lines. 2005-12-20 09:38:03 +00:00
George V. Neville-Neil
496f9fc522 Add protocol number for SCTP.
Submitted by:	Randall Stewart rrs at cisco.com
MFC after:	1 week
2005-12-20 09:24:04 +00:00
Gleb Smirnoff
3939390679 Add a knob to suppress logging of attempts to modify
permanent ARP entries.

Submitted by:	Andrew Alcheyev <buddy telenet.ru>
2005-12-18 19:11:56 +00:00
Ed Maste
bd2b686fe8 Add descriptions for sysctl -d.
Approved by:	glebius
Silence from:	rwatson (mentor)
2005-12-16 15:01:44 +00:00
Gleb Smirnoff
6e02dbdfa3 Cleanup __FreeBSD_version. 2005-12-16 13:10:32 +00:00
John Baldwin
636a309adb Use %t (ptrdiff_t modifier) to print a couple of pointer differences rather
than casting them to int.
2005-12-15 21:57:32 +00:00
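
The change in 636a309adb relies on the `t` length modifier, which standard C99 printf also provides. A minimal userland sketch, with a hypothetical helper name:

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/*
 * Format the distance between two pointers into the same array.
 * %td takes a ptrdiff_t directly, so no cast to int (and no truncation
 * on 64-bit platforms) is needed.
 */
static int
format_offset(char *out, size_t outlen, const char *base, const char *cur)
{
	ptrdiff_t delta;

	assert(base != NULL && cur != NULL);
	delta = cur - base;
	return (snprintf(out, outlen, "offset %td", delta));
}
```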