timestamped TCP packets where FreeBSD will send DATA+FIN and
A W2K box will ack just the DATA portion. If this occurs
after FreeBSD has done a (NewReno) fast-retransmit and is
recovering it (dupacks > threshold) it triggers a case in
tcp_newreno_partial_ack() (tcp_newreno() in stable) where
tcp_output() is called with the expectation that the retransmit
timer will be reloaded. But tcp_output() falls through and
returns without doing anything, causing the persist timer to be
loaded instead. This causes the connection to hang until W2K gives up.
This occurs because in the case where only the FIN must be acked, the
'len' calculation in tcp_output() will be 0, a lot of checks will be
skipped, and the FIN check will also be skipped because it is designed
to handle FIN retransmits, not forced transmits from tcp_newreno().
The solution is to simply set TF_ACKNOW before calling tcp_output()
to absolute guarentee that it will run the send code and reset the
retransmit timer. TF_ACKNOW is already used for this purpose in other
cases.
For some unknown reason this patch also seems to greatly reduce
the number of duplicate acks received when Guido runs his tests over
a lossy network. It is quite possible that there are other
tcp_newreno{_partial_ack()} cases which were not generating the expected
output which this patch also fixes.
X-MFC after: Will be MFC'd after the freeze is over
o Move len initialization closer to place of its first usage.
o Compare len with 0 to improve readability.
o Explicitly zero out phlen in ip_insertoptions() in failure case.
Suggested by: jhb
Reviewed by: jhb
MFC after: 2 weeks
Windows 2000 box and a FreeBSD box could stall. The problem turned out
to be a timestamp reply bug in the W2K TCP stack. FreeBSD sends a
timestamp with the SYN, W2K returns a timestamp of 0 in the SYN+ACK
causing FreeBSD to calculate an insane SRTT and RTT, resulting in
a maximal retransmit timeout (60 seconds). If there is any packet
loss on the connection for the first six or so packets the retransmit
case may be hit (the window will still be too small for fast-retransmit),
causing a 60+ second pause. The W2K box gives up and closes the
connection.
This commit works around the W2K bug.
15:04:59.374588 FREEBSD.20 > W2K.1036: S 1420807004:1420807004(0) win 65535 <mss 1460,nop,wscale 2,nop,nop,timestamp 188297344 0> (DF) [tos 0x8]
15:04:59.377558 W2K.1036 > FREEBSD.20: S 4134611565:4134611565(0) ack 1420807005 win 17520 <mss 1460,nop,wscale 0,nop,nop,timestamp 0 0> (DF)
Bug reported by: Guido van Rooij <guido@gvr.org>
packets in addition to IPPROTO_IPV4 and IPPROTO_IPV6, explicitly specify
IPPROTO_IPV4 or IPPROTO_IPV6 instead of -1 when calling encap_attach().
MFC after: 28 days
(along with other if_gre changes)
- use `struct uma_zone *' instead of uma_zone_t, so that <sys/uma.h> isn't
a prerequisite.
- don't include <sys/uma.h>.
Namespace pollution makes "opaque" types like uma_zone_t perfectly
non-opaque. Such types should never be used (see style(9)).
Fixed subsequently grwon dependencies of this header on its own pollution:
- include <sys/_mutex.h> and its prerequisite <sys/_lock.h> instead of
depending on namespace pollution 2 layers deep in <sys/uma.h>.
firewall logging on and off when at elevated securelevel(8). It would
be nice to be able to only lock these at securelevel >= 3, like rules
are, but there is no such functionality at present. I don't see reason
to be adding features to securelevel(8) with MAC being merged into 5.0.
PR: kern/39396
Reviewed by: luigi
MFC after: 1 week
called <machine/_types.h>.
o <machine/ansi.h> will continue to live so it can define MD clock
macros, which are only MD because of gratuitous differences between
architectures.
o Change all headers to make use of this. This mainly involves
changing:
#ifdef _BSD_FOO_T_
typedef _BSD_FOO_T_ foo_t;
#undef _BSD_FOO_T_
#endif
to:
#ifndef _FOO_T_DECLARED
typedef __foo_t foo_t;
#define _FOO_T_DECLARED
#endif
Concept by: bde
Reviewed by: jake, obrien
in6_v4mapsin6_sockaddr() which allocate the appropriate sockaddr_in*
structure and initialize it with the address and port information passed
as arguments. Use calls to these new functions to replace code that is
replicated multiple times in in_setsockaddr(), in_setpeeraddr(),
in6_setsockaddr(), in6_setpeeraddr(), in6_mapped_sockaddr(), and
in6_mapped_peeraddr(). Inline COMMON_END in tcp_usr_accept() so that
we can call in_sockaddr() with temporary copies of the address and port
after the PCB is unlocked.
Fix the lock violation in tcp6_usr_accept() (caused by calling MALLOC()
inside in6_mapped_peeraddr() while the PCB is locked) by changing
the implementation of tcp6_usr_accept() to match tcp_usr_accept().
Reviewed by: suz
not meant to duplicate) TCP/Vegas. Add four sysctls and default the
implementation to 'off'.
net.inet.tcp.inflight_enable enable algorithm (defaults to 0=off)
net.inet.tcp.inflight_debug debugging (defaults to 1=on)
net.inet.tcp.inflight_min minimum window limit
net.inet.tcp.inflight_max maximum window limit
MFC after: 1 week
Implement the M_SKIP_FIREWALL bit in m_flags to avoid loops
for firewall-generated packets (the constant has to go in sys/mbuf.h).
Better comments on keepalive generation, and enforce dyn_rst_lifetime
and dyn_fin_lifetime to be less than dyn_keepalive_period.
Enforce limits (up to 64k) on the number of dynamic buckets, and
retry allocation with smaller sizes.
Raise default number of dynamic rules to 4096.
Improved handling of set of rules -- now you can atomically
enable/disable multiple sets, move rules from one set to another,
and swap sets.
sbin/ipfw/ipfw2.c:
userland support for "noerror" pipe attribute.
userland support for sets of rules.
minor improvements on rule parsing and printing.
sbin/ipfw/ipfw.8:
more documentation on ipfw2 extensions, differences from ipfw1
(so we can use the same manpage for both), stateful rules,
and some additional examples.
Feedback and more examples needed here.
we can use the names _receive() and _send() for the receive() and send()
checks. Rename related constants, policy implementations, etc.
Obtained from: TrustedBSD Project
Sponsored by: DARPA, NAI Labs
When a pipe or queue has the "noerror" attribute, do not report
drops to the caller (ip_output() and friends).
(2 lines to implement it, 2 lines to document it.)
This will let you simulate losses on the sender side as if they
happened in the middle of the network, i.e. with no explicit feedback
to the sender.
manpage and ipfw2.c changes to follow shortly, together with other
ipfw2 changes.
Requested by: silby
MFC after: 3 days
satisfy consumers of ip_var.h that need a complete definition of
struct ipq and don't include mac.h.
Obtained from: TrustedBSD Project
Sponsored by: DARPA, NAI Labs
The bugfix (ipfw2.c) makes the handling of port numbers with
a dash in the name, e.g. ftp-data, consistent with old ipfw:
use \\ before the - to consider it as part of the name and not
a range separator.
The new feature (all this description will go in the manpage):
each rule now belongs to one of 32 different sets, which can
be optionally specified in the following form:
ipfw add 100 set 23 allow ip from any to any
If "set N" is not specified, the rule belongs to set 0.
Individual sets can be disabled, enabled, and deleted with the commands:
ipfw disable set N
ipfw enable set N
ipfw delete set N
Enabling/disabling of a set is atomic. Rules belonging to a disabled
set are skipped during packet matching, and they are not listed
unless you use the '-S' flag in the show/list commands.
Note that dynamic rules, once created, are always active until
they expire or their parent rule is deleted.
Set 31 is reserved for the default rule and cannot be disabled.
All sets are enabled by default. The enable/disable status of the sets
can be shown with the command
ipfw show sets
Hopefully, this feature will make life easier to those who want to
have atomic ruleset addition/deletion/tests. Examples:
To add a set of rules atomically:
ipfw disable set 18
ipfw add ... set 18 ... # repeat as needed
ipfw enable set 18
To delete a set of rules atomically
ipfw disable set 18
ipfw delete set 18
ipfw enable set 18
To test a ruleset and disable it and regain control if something
goes wrong:
ipfw disable set 18
ipfw add ... set 18 ... # repeat as needed
ipfw enable set 18 ; echo "done "; sleep 30 && ipfw disable set 18
here if everything goes well, you press control-C before
the "sleep" terminates, and your ruleset will be left
active. Otherwise, e.g. if you cannot access your box,
the ruleset will be disabled after the sleep terminates.
I think there is only one more thing that one might want, namely
a command to assign all rules in set X to set Y, so one can
test a ruleset using the above mechanisms, and once it is
considered acceptable, make it part of an existing ruleset.
case, also preserve the MAC label. Note that this mbuf allocation
is fairly non-optimal, but not my fault.
Obtained from: TrustedBSD Project
Sponsored by: DARPA, NAI Labs
kernel access control.
Add MAC support for the UDP protocol. Invoke appropriate MAC entry
points to label packets that are generated by local UDP sockets,
and to authorize delivery of mbufs to local sockets both in the
multicast/broadcast case and the unicast case.
Obtained from: TrustedBSD Project
Sponsored by: DARPA, NAI Labs
pointer and incoming mbuf pointer will be non-NULL in tcp_respond().
This is relied on by the MAC code for correctness, as well as
existing code.
Obtained from: TrustedBSD PRoject
Sponsored by: DARPA, NAI Labs