Redirect (and temporal) route expiration was broken a while ago.
This change brings route expiration back, with unified IPv4/IPv6 handling code.
It introduces net.inet.icmp.redirtimeout sysctl, allowing to set
an expiration time for redirected routes. It defaults to 10 minutes,
analogues with net.inet6.icmp6.redirtimeout.
Implementation uses separate file, route_temporal.c, as route.c is already
bloated with tons of different functions.
Internally, expiration is implemented as an per-rnh callout scheduled when
route with non-zero rt_expire time is added or rt_expire is changed.
It does not add any overhead when no temporal routes are present.
Callout traverses entire routing tree under wlock, scheduling expired routes
for deletion and calculating the next time it needs to be run. The rationale
for such implemention is the following: typically workloads requiring large
amount of routes have redirects turned off already, while the systems with
small amount of routes will not inhibit large overhead during tree traversal.
This changes also fixes netstat -rn display of route expiration time, which
has been broken since the conversion from kread() to sysctl.
Reviewed by: bz
MFC after: 3 weeks
Differential Revision: https://reviews.freebsd.org/D23075
After r354748 mld_input() can change the mbuf. The new pointer
is never returned to icmp6_input() and when passed to
icmp6_rip6_input() the mbuf may no longer valid leading to
a panic.
Pass a pointer to the mbuf to mld_input() so we can return an
updated version in the non-error case.
Add a test sending an MLD packet case which will trigger this bug.
Pointyhat to: bz
Reported by: gallatin, thj
MFC After: 2 weeks
X-MFC with: r354748
Sponsored by: Netflix
RFC 8200 says:
"If the fragment is a whole datagram (that is, both the Fragment
Offset field and the M flag are zero), then it does not need
any further reassembly and should be processed as a fully
reassembled packet (i.e., updating Next Header, adjust Payload
Length, removing the Fragment header, etc.). .."
That means we should remove the fragment header and make all the adjustments
rather than just skipping over the fragment header. The difference should
be noticeable in that a properly handled atomic fragment triggering an ICMPv6
message at an upper layer (e.g. dest unreach, unreachable port) will not
include the fragment header.
Update the test cases to also test for an unfragmentable part. That is
needed so that the next header is properly updated (not just lengths).
MFC after: 3 weeks
Sponsored by: Netflix
Differential Revision: https://reviews.freebsd.org/D22155
Remove mentions of fragmentation tests from extension header test.
Remove setting an MTU > IF_MAXMTU from the test cases to avoid warnings;
this was only possible in a local research tree.
MFC after: 2 weeks
Sponsored by: Netflix
Add a simple test case which can exercise some of the IPv6 extension
header code paths. At the moment only a small set of extension headers
is implemented and no options to the ones which take them.
Also implements a "bad" case to make sure that error handling works.
The tests were used to test m_pullup() changes to the code paths while
removing the KAME PULLDOWN_TEST cases and related macros.
MFC after: 3 weeks
Sponsored by: Netflix
There are times when we have to wait for reply packets. There are
either an ICMPv6 (error) reply or the expiration timeout.
In these cases synchonous ICMPv6 replies should arrive, always,
unless the packet is lost. Due to errors experienced with the
test software sending an invlaid request on at least i386 (*) these
packets are not generated. That means we are waiting for a long time
for the replies or even timeout the test case.
Manually set the "End" flag on these test cases as well, so they do
fail rather than timeout as the sniffer timeout happens. This improves
debugging options, reflects the error properly, and saves time on each
test suit run.
(*) The real cause for that is still to be found (see the referenced PRs)
PR: 241493, 239380
MFC after: 2 weeks
Sponsored by: Netflix
The change to conform to RFC 8200 for overlapping fragments now frees
the entire reassembly queue if the overlapping fragments are not an
exact match.
As a result we do see one less packet in the timeout statistics from
expiry. No other statistics change as the event is not counted.
It can be argued that we should improve the statistics counters in
that case.
This test case update should have been committed alongside the original
commit.
Pointyhat to: bz
MFC after: 3 weeks
X-MFC with: r354046
Sponsored by: Netflix
When we receive the packet with the first fragmented part (fragoff=0)
we remember the length of the unfragmentable part and the next header
(and should probably also remember ECN) as meta-data on the reassembly
queue.
Someone replying this packet so far could change these 2 (3) values.
While changing the next header seems more severe, for a full size
fragmented UDP packet, for example, adding an extension header to the
unfragmentable part would go unnoticed (as the framented part would be
considered an exact duplicate) but make reassembly fail.
So do not allow updating the meta-data after we have seen the first
fragmented part anymore.
The frag6_20 test case is added which failed before triggering an
ICMPv6 "param prob" due to the check for each queued fragment for
a max-size violation if a fragoff=0 packet was received.
MFC after: 3 weeks
Sponsored by: Netflix
When done with tests check that both the per-VNET and the global-fragmented-
packets-in-system counters are zero to make sure we do not leak counters or
queue entries.
This implies that for all test cases we either have to check for the ICMPv6
packet sent in case of TLL=0 expiry (if it is sent) or sleep at least long
enough for the TTL to expire for all packets (e.g., fragments where we do not
have the off=0 packet).
This also means that statistics are now updated to include all the expired
packets.
There are cases when we do not check for counters to be zero and this is
when testing VNET teardown to behave properly and not panic, when we are
intentionally leaving fragments in the system.
MFC after: 3 weeks
Sponsored by: Netflix
In order to ensure that changing the frag6 code does not change behaviour
or break code a set of test cases were implemented.
Like some other test cases these use Scapy to generate packets and possibly
wait for expected answers. In most cases we do check the global and
per interface (netstat) statistics output using the libxo output and grep
to validate fields and numbers. This is a bit hackish but we currently have
no better way to match a selected number of stats only (we have to ignore
some of the ND6 variables; otherwise we could use the entire list).
Test cases include atomic fragments, single fragments, multi-fragments,
and try to cover most error cases in the code currently.
In addition vnet teardown is tested to not panic.
A separate set (not in-tree currently) of probes were used in order to
make sure that the test cases actually test what they should.
The "sniffer" code was copied and adjusted from the netpfil version
as we sometimes will not get packets or have longer timeouts to deal with.
Sponsored by: Netflix