Commit Graph

123 Commits

Author SHA1 Message Date
Justin Hibbits
3d0d5b21c9 IfAPI: Explicitly include <net/if_private.h> in netstack
Summary:
In preparation of making if_t completely opaque outside of the netstack,
explicitly include the header.  <net/if_var.h> will stop including the
header in the future.

Sponsored by:	Juniper Networks, Inc.
Reviewed by:	glebius, melifaro
Differential Revision: https://reviews.freebsd.org/D38200
2023-01-31 15:02:16 -05:00
Gleb Smirnoff
a0d7d2476f frag6: use callout(9) directly instead of pr_slowtimo
Reviewed by:		melifaro
Differential revision:	https://reviews.freebsd.org/D36162
2022-08-17 11:50:31 -07:00
Gordon Bergling
3cf59750eb netinet6: Fix a typo in a sysctl description
- remove a double 'a'

MFC after:	3 days
2021-11-30 07:24:44 +01:00
Mateusz Guzik
8afe9481cf frag6: do less work in frag6_slowtimo if possible
frag6_slowtimo avoidably uses CPU on otherwise idle boxes

Reviewed by:	kp
Sponsored by:	Rubicon Communications, LLC ("Netgate")
Differential Revision:	https://reviews.freebsd.org/D31528
2021-08-14 18:51:00 +02:00
Mateusz Guzik
c17ae18080 frag6: drop the volatile keyword from frag6_nfrags and mark with __exclusive_cache_line
The keyword adds nothing as all operations on the var are performed
through atomic_*

Reviewed by:	kp
Sponsored by:	Rubicon Communications, LLC ("Netgate")
Differential Revision:	https://reviews.freebsd.org/D31528
2021-08-14 18:50:29 +02:00
Kristof Provost
bb4a7d94b9 net: Introduce IPV6_DSCP(), IPV6_ECN() and IPV6_TRAFFIC_CLASS() macros
Introduce convenience macros to retrieve the DSCP, ECN or traffic class
bits from an IPv6 header.

Use them where appropriate.

Reviewed by:	ae (previous version), rscheff, tuexen, rgrimes
MFC after:	2 weeks
Sponsored by:	Rubicon Communications, LLC ("Netgate")
Differential Revision:	https://reviews.freebsd.org/D29056
2021-03-04 20:56:48 +01:00
Alexander V. Chernikov
8268d82cff Remove per-packet ifa refcounting from IPv6 fast path.
Currently ip6_input() calls in6ifa_ifwithaddr() for
 every local packet, in order to check if the target ip
 belongs to the local ifa in proper state and increase
 its counters.

in6ifa_ifwithaddr() references found ifa.
With epoch changes, both `ip6_input()` and all other current callers
 of `in6ifa_ifwithaddr()` do not need this reference
 anymore, as epoch provides stability guarantee.

Given that, update `in6ifa_ifwithaddr()` to allow
 it to return ifa without referencing it, while preserving
 option for getting referenced ifa if so desired.

MFC after:		1 week
Differential Revision:	https://reviews.freebsd.org/D28648
2021-02-15 22:33:12 +00:00
Mateusz Guzik
662c13053f net: clean up empty lines in .c and .h files 2020-09-01 21:19:14 +00:00
Pawel Biernacki
7029da5c36 Mark more nodes as CTLFLAG_MPSAFE or CTLFLAG_NEEDGIANT (17 of many)
r357614 added CTLFLAG_NEEDGIANT to make it easier to find nodes that are
still not MPSAFE (or already are but aren’t properly marked).
Use it in preparation for a general review of all nodes.

This is non-functional change that adds annotations to SYSCTL_NODE and
SYSCTL_PROC nodes using one of the soon-to-be-required flags.

Mark all obvious cases as MPSAFE.  All entries that haven't been marked
as MPSAFE before are by default marked as NEEDGIANT

Approved by:	kib (mentor, blanket)
Commented by:	kib, gallatin, melifaro
Differential Revision:	https://reviews.freebsd.org/D23718
2020-02-26 14:26:36 +00:00
Bjoern A. Zeeb
a4adf6cc65 Fix m_pullup() problem after removing PULLDOWN_TESTs and KAME EXT_*macros.
r354748-354750 replaced the KAME macros with m_pulldown() calls.
Contrary to the rest of the network stack m_len checks before m_pulldown()
were not put in placed (see r354748).
Put these m_len checks in place for now (to go along with the style of the
network stack since the initial commits).  These are not put in for
performance but to avoid an error scenario (even though it also will help
performance at the moment as it avoid allocating an extra mbuf; not because
of the unconditional function call).

The observed error case went like this:
(1) an mbuf with M_EXT arrives and we call m_pullup() unconditionally on it.
(2) m_pullup() will call m_get() unless the requested length is larger than
MHLEN (in which case it'll m_freem() the perfectly fine mbuf) and migrate the
requested length of data and pkthdr into the new mbuf.
(3) If m_get() succeeds, a further m_pullup() call going over MHLEN will fail.
This was observed with failing auto-configuration as an RA packet of
200 bytes exceeded MHLEN and the m_pullup() called from nd6_ra_input()
dropped the mbuf.
(Re-)adding the m_len checks before m_pullup() calls avoids this problems
with mbufs using external storage for now.

MFC after:	3 weeks
Sponsored by:	Netflix
2019-12-01 00:22:04 +00:00
Bjoern A. Zeeb
a61b5cfbbf netinet6: Remove PULLDOWN_TESTs.
Remove the KAME introduced PULLDOWN_TESTs which did not even
have a compile-time option in sys/conf to turn them on for a
custom kernel build. They made the code a lot harder to read
or more complicated in a few cases.

Convert the IP6_EXTHDR_CHECK() calls into FreeBSD looking code.
Rather than throwing the packet away if it would not fit the
KAME mbuf expectations, convert the macros to m_pullup() calls.
Do not do any extra manual conditional checks upfront as to
whether the m_len would suffice (*), simply let m_pullup() do
its work (incl. an early check).

Remove extra m_pullup() calls where earlier in the function or
the only caller has already done the pullup.

Discussed with:	rwatson (*)
Reviewed by:	ae
MFC after:	8 weeks
Sponsored by:	Netflix
Differential Revision:	https://reviews.freebsd.org/D22334
2019-11-15 21:40:40 +00:00
Bjoern A. Zeeb
a8fe77d877 netinet*: update *mp to pass the proper value back
In ip6_[direct_]input() we are looping over the extension headers
to deal with the next header.  We pass a pointer to an mbuf pointer
to the handling functions.  In certain cases the mbuf can be updated
there and we need to pass the new one back.  That missing in
dest6_input() and route6_input().  In tcp6_input() we should also
update it before we call tcp_input().

In addition to that mark the mbuf NULL all the times when we return
that we are done with handling the packet and no next header should
be checked (IPPROTO_DONE).  This will eventually allow us to assert
proper behaviour and catch the above kind of errors more easily,
expecting *mp to always be set.

This change is extracted from a larger patch and not an exhaustive
change across the entire stack yet.

PR:			240135
Reported by:		prabhakar.lakhera gmail.com
MFC after:		3 weeks
Sponsored by:		Netflix
2019-11-12 15:46:28 +00:00
Bjoern A. Zeeb
c1131de6f1 frag6: properly handle atomic fragments according to RFCs.
RFC 8200 says:
	"If the fragment is a whole datagram (that is, both the Fragment
         Offset field and the M flag are zero), then it does not need
         any further reassembly and should be processed as a fully
         reassembled packet (i.e., updating Next Header, adjust Payload
         Length, removing the Fragment header, etc.).  .."

That means we should remove the fragment header and make all the adjustments
rather than just skipping over the fragment header.  The difference should
be noticeable in that a properly handled atomic fragment triggering an ICMPv6
message at an upper layer (e.g. dest unreach, unreachable port) will not
include the fragment header.

Update the test cases to also test for an unfragmentable part.  That is
needed so that the next header is properly updated (not just lengths).

MFC after:	3 weeks
Sponsored by:	Netflix
Differential Revision:	https://reviews.freebsd.org/D22155
2019-11-08 14:36:44 +00:00
Bjoern A. Zeeb
6e6b5143f5 Properly set VNET when nuking recvif from fragment queues.
In theory the eventhandler invoke should be in the same VNET as
the the current interface. We however cannot guarantee that for
all cases in the future.

So before checking if the fragmentation handling for this VNET
is active, switch the VNET to the VNET of the interface to always
get the one we want.

Reviewed by:	hselasky
MFC after:	3 weeks
Sponsored by:	Netflix
Differential Revision:	https://reviews.freebsd.org/D22153
2019-10-25 18:54:06 +00:00
Bjoern A. Zeeb
702828f643 frag6: do not leak counter in error cases
When allocating the IPv6 fragement packet queue entry we do checks
against counters and if we pass we increment one of the counters
to claim the spot.  Right after that we have two cases (malloc and MAC)
which can both fail in which case we free the entry but never released
our claim on the counter.  In theory this can lead to not accepting new
fragments after a long time, especially if it would be MAC "refusing"
them.
Rather than immediately subtracting the value in the error case, only
increment it after these two cases so we can no longer leak it.

MFC after:	3 weeks
Sponsored by:	Netflix
2019-10-25 16:29:09 +00:00
Bjoern A. Zeeb
619456bb59 frag6: prevent overwriting initial fragoff=0 packet meta-data.
When we receive the packet with the first fragmented part (fragoff=0)
we remember the length of the unfragmentable part and the next header
(and should probably also remember ECN) as meta-data on the reassembly
queue.
Someone replying this packet so far could change these 2 (3) values.
While changing the next header seems more severe, for a full size
fragmented UDP packet, for example, adding an extension header to the
unfragmentable part would go unnoticed (as the framented part would be
considered an exact duplicate) but make reassembly fail.
So do not allow updating the meta-data after we have seen the first
fragmented part anymore.

The frag6_20 test case is added which failed before triggering an
ICMPv6 "param prob" due to the check for each queued fragment for
a max-size violation if a fragoff=0 packet was received.

MFC after:	3 weeks
Sponsored by:	Netflix
2019-10-24 22:07:45 +00:00
Bjoern A. Zeeb
cd188da20f frag6: handling of overlapping fragments to conform to RFC 8200
While the comment was updated in r350746, the code was not.
RFC8200 says that unless fragment overlaps are exact (same fragment
twice) not only the current fragment but the entire reassembly queue
for this packet must be silently discarded, which we now do if
fragment offset and fragment length do not match.

Obtained from:	jtl
MFC after:	3 weeks
Sponsored by:	Netflix
Differential Revision:	https://reviews.freebsd.org/D16850
2019-10-24 20:22:52 +00:00
Bjoern A. Zeeb
53707abd41 frag6: export another counter read-only by sysctl
Similar to the system global counter also export the per-VNET counter
"frag6_nfragpackets" detailing the current number of fragment packets
in this VNET's reassembly queues.
The read-only counter is helpful for in-VNET statistical monitoring and
for test-cases.

MFC after:	3 weeks
Sponsored by:	Netflix
2019-10-24 20:00:37 +00:00
Bjoern A. Zeeb
dda02192f9 frag6: fix counter leak in error case and optimise code
In case the first fragmented part (off=0) arrives we check for the
maximum packet size for each fragmented part we already queued with the
addition of the unfragmentable part from the first one.

For one we do not have to enter the loop at all if this is the first
fragmented part to arrive, and we can skip the check.

Should we encounter an error case we send an ICMPv6 message for any
fragment exceeding the maximum length limit.  While dequeueing the
original packet and freeing it, statistics were not updated and leaked
both the reassembly queue count for the fragment and the global
fragment count.  Found by code inspection and confirmed by tightening
test cases checking more statistical and system counters.

While here properly wrap a line.

MFC after:	3 weeks
Sponsored by:	Netflix
2019-10-24 19:57:18 +00:00
Bjoern A. Zeeb
e5fffe9a69 frag6.c: do not leak packet queue entry in error case
When we are checking for the maximum reassembled packet size of the
fragmentable part and run into the error case (packet too big),
we are leaking the packet queue enntry if this was a first fragment
to arrive.
Properly cleanup, removing the queue entry from the bucket, decrementing
counters, and freeing the memory.

MFC after:	3 weeks
Sponsored by:	Netflix
2019-10-24 19:47:32 +00:00
Bjoern A. Zeeb
30809ba9e3 frag6: leave a note about upper layer header checks TBD
Per sepcification the upper layer header needs to be within the first
fragment.  The check was not done so far and there is an open review for
related work, so just leave a note as to where to put it.
Move the extraction of frag offset up to this as it is needed to determine
whether this is a first fragment or not.

MFC after:	3 weeks
Sponsored by:	Netflix
2019-10-24 12:16:15 +00:00
Bjoern A. Zeeb
7715d794ef frag6: check global limits before hash and lock
Check whether we are accepting more fragments (based on global limits)
before doing expensive operations of calculating the hash and taking the
bucket lock.   This slightly increases a "race" between check time and
incrementing counters (which is already there) possibly allowing a few
more fragments than the maximum limits.  However, when under attack,
we rather save this CPU time for other packets/work.

MFC after:		3 weeks
Sponsored by:		Netflix
2019-10-24 11:58:24 +00:00
Bjoern A. Zeeb
efdfee93c0 frag6: small improvements
Rather than walking the mbuf chain manually use m_last() which doing
exactly that for us.
Defer initializing srcifp for longer as there are multiple exit paths
out of the function which do not need it set.  Initialize before taking
the lock though.
Rename the mtx lock to match the type better.

MFC after:	3 weeks
Sponsored by:	Netflix
2019-10-24 08:15:40 +00:00
Bjoern A. Zeeb
da89a0fe94 frag6: remove IP6_REASS_MBUF macro
The IP6_REASS_MBUF() macro did some pointer gynmastics to end up with the
same type as it gets in [*(cast **)&].  Spelling it out instead saves all
this and makes the code more readable and less obfuscated directly using
the structure field.

MFC after:	3 weeks
Sponsored by:	Netflix
2019-10-24 07:53:10 +00:00
Bjoern A. Zeeb
f1664f3258 frag6: add "big picture"
Add some ASCII relation of how the bits plug together.  The terminology
difference of "fragmented packets" and "fragment packets" is subtle.
While here clear up more whitespace and comments.

No functional change.

MFC after:	3 weeks
Sponsored by:	Netflix
2019-10-23 23:10:12 +00:00
Bjoern A. Zeeb
21f08a074d frag6: replace KAME hand-rolled queues with queue(9) TAILQs
Remove the KAME custom circular queue for fragments and fragmented packets
and replace them with a standard TAILQ.
This make the code a lot more understandable and maintainable and removes
further hand-rolled code from the the tree using a standard interface instead.

Hide the still public structures under #ifdef _KERNEL as there is no
use for them in user space.
The naming is a bit confusing now as struct ip6q and the ip6q[] buckets
array are not the same anymore;  sadly struct ip6q is also used by the
MAC framework and we cannot rename it.

Submitted by:	jtl (initally)
MFC after:	3 weeks
Sponsored by:	Netflix
Differential Revision:	https://reviews.freebsd.org/D16847 (jtl's original)
2019-10-23 23:01:18 +00:00
Bjoern A. Zeeb
3c7165b35e frag6: whitespace changes
Remove trailing white space, add a blank line, and compress a comment.
No functional changes.

MFC after:	10 days
Sponsored by:	Netflix
2019-10-23 20:37:15 +00:00
Bjoern A. Zeeb
67a10c4644 frag6: fix vnet teardown leak
When shutting down a VNET we did not cleanup the fragmentation hashes.
This has multiple problems: (1) leak memory but also (2) leak on the
global counters, which might eventually lead to a problem on a system
starting and stopping a lot of vnets and dealing with a lot of IPv6
fragments that the counters/limits would be exhausted and processing
would no longer take place.

Unfortunately we do not have a useable variable to indicate when
per-VNET initialization of frag6 has happened (or when destroy happened)
so introduce a boolean to flag this. This is needed here as well as
it was in r353635 for ip_reass.c in order to avoid tripping over the
already destroyed locks if interfaces go away after the frag6 destroy.

While splitting things up convert the TRY_LOCK to a LOCK operation in
now frag6_drain_one().  The try-lock was derived from a manual hand-rolled
implementation and carried forward all the time.  We no longer can afford
not to get the lock as that would mean we would continue to leak memory.

Assert that all the buckets are empty before destroying to lock to
ensure long-term stability of a clean shutdown.

Reported by:	hselasky
Reviewed by:	hselasky
MFC after:	3 weeks
Sponsored by:	Netflix
Differential Revision:	https://reviews.freebsd.org/D22054
2019-10-21 08:48:47 +00:00
Bjoern A. Zeeb
65456706c0 frag6: add read-only sysctl for nfrags.
Add a read-only sysctl exporting the global number of fragments
(base system and all vnets).  This is helpful to (a) know how many
fragments are currently being processed, (b) if there are possible
leaks, (c) if vnet teardown is not working correctly, and lastly
(d) it can be used as part of test-suits to ensure (a) to (c).

MFC after:	3 weeks
Sponsored by:	Netflix
2019-10-21 08:36:15 +00:00
Hans Petter Selasky
a55383e720 Fix panic in network stack due to use after free when receiving
partial fragmented packets before a network interface is detached.

When sending IPv4 or IPv6 fragmented packets and a fragment is lost
before the network device is freed, the mbuf making up the fragment
will remain in the temporary hashed fragment list and cause a panic
when it times out due to accessing a freed network interface
structure.


1) Make sure the m_pkthdr.rcvif always points to a valid network
interface. Else the rcvif field should be set to NULL.

2) Use the rcvif of the last received fragment as m_pkthdr.rcvif for
the fully defragged packet, instead of the first received fragment.

Panic backtrace for IPv6:

panic()
icmp6_reflect() # tries to access rcvif->if_afdata[AF_INET6]->xxx
icmp6_error()
frag6_freef()
frag6_slowtimo()
pfslowtimo()
softclock_call_cc()
softclock()
ithread_loop()

Reviewed by:	bz
Differential Revision:	https://reviews.freebsd.org/D19622
MFC after:	1 week
Sponsored by:	Mellanox Technologies
2019-10-16 09:11:49 +00:00
Bjoern A. Zeeb
1540a98e36 frag6: move public structure into file local space.
Move ip6asfrag and the accompanying IP6_REASS_MBUF macro from
ip6_var.h into frag6.c as they are not used outside frag6.c.
Sadly struct ip6q is all over the mac framework so we have to
leave it public.

This reduces the public KPI space.

MFC after:		3 months
X-MFC:			possibly MFC the #define only to stable branches
Sponsored by:		Netflix
2019-08-08 10:59:54 +00:00
Bjoern A. Zeeb
5778b399f1 frag6.c: cleanup varaibles and return statements.
Consitently put () around return values.
Do not assign variables at the time of variable declaration.
Sort variables.  Rename ia to ia6, remove/reuse some variables used only
once or twice for temporary calculations.

No functional changes intended.

MFC after:		3 months
Sponsored by:		Netflix
2019-08-08 10:15:47 +00:00
Bjoern A. Zeeb
23d374aa14 frag6.c: initial comment and whitespace cleanup.
Cleanup some comments (start with upper case, ends in punctuation,
use width and do not consume vertical space).  Update comments to
RFC8200.  Some whitespace changes.

No functional changes.

MFC after:		3 months
Sponsored by:		Netflix
2019-08-08 09:42:57 +00:00
Bjoern A. Zeeb
9cb1a47af2 frag6.c: rename ip6q[] to ipq6b[] and consistently use "bucket"
The hash buckets array is called ip6q.  The data structure ip6q is a
description of different object, the one the array holds these days
(since r337776).  To clear some of this confusion, rename the array
to ip6qb.

When iterating over all buckets or addressing them directly, we
use at least the variables i, hash, and bucket.  To keep the
terminology consistent use the variable name "bucket" and always
make it an uint32_t and not sometimes an int.

No functional behaviour changes intended.

MFC after:		3 months
Sponsored by:		Netflix
2019-08-05 11:01:12 +00:00
Bjoern A. Zeeb
c00464a245 frag6.c: re-order functions within file
Re-order functions within the file in preparation for an upcoming
code simplification.

No functional changes.

MFC after:		3 months
Sponsored by:		Netflix
2019-08-05 09:49:24 +00:00
Bjoern A. Zeeb
f349c821f5 frag6.c: fix includes
Bring back systm.h after r350532 and banish errno.h, time.h, and
machine/atomic.h.

Reported by:	bde (Thank you!)
Pointyhat to:	bz
MFC after:	12 weeks
X-MFC:		with r350532
Sponsored by:	Netflix
2019-08-03 16:56:44 +00:00
Bjoern A. Zeeb
09b361c792 frag6.c: make compile with gcc
Removing the prototype from the header and making the function static
in r350533 makes architectures using gcc complain "function declaration
isn't a prototype".  Add the missing void given the function has no
arguments.

Reported by:		the CI machinery
Pointyhat to:		bz
MFC after:		3 months
X-MFC with:		r350533
Sponsored by:		Netflix
2019-08-02 11:05:00 +00:00
Bjoern A. Zeeb
487a161cff frag6.c: rename malloc type
Rename M_FTABLE to M_FRAG6 as the former sounds very much like the former
"flowtable" rather than anything to do with fragments and reassembly.

While here, let malloc( , .. | M_ZERO) do the zeroing rather than calling
bzero() ourselves.

MFC after:		3 months
Sponsored by:		Netflix
2019-08-02 10:54:57 +00:00
Bjoern A. Zeeb
a687de6aee frag6.c: remove dead code
Remove all the #if 0 and #if notyet blocks of dead code which have been
there for at least 18 years from what I can see.

No functional changes.

MFC after:		3 months
Sponsored by:		Netflix
2019-08-02 10:41:51 +00:00
Bjoern A. Zeeb
757cb678e5 frag6.c: move variables and sysctls into local file
Move the sysctls and the related variables only used in frag6.c
into the file and out of in6_proto.c.  That way everything belonging
together is in one place.

Sort the variables into global and per-vnet scopes and make
them static.  No longer export the (helper) function
frag6_set_bucketsize() now also file-local only.

Should be no functional changes, only reduced public KPI/KBI surface.

MFC after:		3 months
Sponsored by:		Netflix
2019-08-02 10:29:53 +00:00
Bjoern A. Zeeb
1a3044fa2c frag6.c: sort includes
Sort includes and remove duplicate kernel.h as well as the unneeded
systm.h.
Hide the mac framework incude behind #fidef MAC.

MFC after:		3 months
Sponsored by:		Netflix
2019-08-02 10:06:54 +00:00
Hans Petter Selasky
6bbdbbb830 Revert r346530 until further.
MFC after:		1 week
Sponsored by:		Mellanox Technologies
2019-04-22 19:36:19 +00:00
Hans Petter Selasky
04f44499ca Fix build for mips and powerpc after r346530.
Need to include sys/kernel.h to define SYSINIT() which is used
by sys/eventhandler.h .

MFC after:		1 week
Sponsored by:		Mellanox Technologies
2019-04-22 08:32:00 +00:00
Hans Petter Selasky
40eb389666 Fix panic in network stack due to memory use after free in relation to
fragmented packets.

When sending IPv4 and IPv6 fragmented packets and a fragment is lost,
the mbuf making up the fragment will remain in the temporary hashed
fragment list for a while. If the network interface departs before the
so-called slow timeout clears the packet, the fragment causes a panic
when the timeout kicks in due to accessing a freed network interface
structure.

Make sure that when a network device is departing, all hashed IPv4 and
IPv6 fragments belonging to it, get freed.

Backtrace:
panic()
icmp6_reflect()

hlim = ND_IFINFO(m->m_pkthdr.rcvif)->chlim;
^^^^ rcvif->if_afdata[AF_INET6] is NULL.

icmp6_error()
frag6_freef()
frag6_slowtimo()
pfslowtimo()
softclock_call_cc()
softclock()
ithread_loop()

Differential Revision:	https://reviews.freebsd.org/D19622
Reviewed by:		bz (network), adrian
MFC after:		1 week
Sponsored by:		Mellanox Technologies
2019-04-22 07:27:24 +00:00
Tom Jones
2946a9415c Add stat counter for ipv6 atomic fragments
Add a stat counter to track ipv6 atomic fragments. Atomic fragments can be
generated in response to invalid path MTU values, but are also a potential
attack vector and considered harmful (see RFC6946 and RFC8021).

While here add tracking of the atomic fragment counter to netstat and systat.

Reviewed by:    tuexen, jtl, bz
Approved by:    jtl (mentor), bz (mentor)
Event:  Aberdeen hackathon 2019
Differential Revision:  https://reviews.freebsd.org/D17511
2019-04-19 17:06:43 +00:00
Tom Jones
198fdaeda1 When dropping a fragment queue count the number of fragments in the queue
When dropping a fragment queue, account for the number of fragments in the
queue. This improves accounting between the number of fragments received and
the number of fragments dropped.

Reviewed by:	jtl, bz, transport
Approved by:	jtl (mentor), bz (mentor)
Differential Revision:	https://review.freebsd.org/D17521
2019-02-19 19:57:55 +00:00
Kristof Provost
505e91f500 frag6: Fix fragment reassembly
r337776 started hashing the fragments into buckets for faster lookup.

The hashkey is larger than intended. This results in random stack data being
included in the hashed data, which in turn means that fragments of the same
packet might end up in different buckets, causing the reassembly to fail.

Set the correct size for hashkey.

PR:		231045
Approved by:	re (kib)
MFC after:	3 days
2018-08-31 08:37:15 +00:00
Jonathan T. Looney
2ceeacbe71 Lower the default limits on the IPv6 reassembly queue.
Currently, the limits are quite high. On machines with millions of
mbuf clusters, the reassembly queue limits can also run into
the millions. Lower these values.

Also, try to ensure that no bucket will have a reassembly
queue larger than approximately 100 items. This limits the cost to
find the correct reassembly queue when processing an incoming
fragment.

Due to the low limits on each bucket's length, increase the size of
the hash table from 64 to 1024.

Reviewed by:	jhb
Security:	FreeBSD-SA-18:10.ip
Security:	CVE-2018-6923
2018-08-14 17:32:07 +00:00
Jonathan T. Looney
5f9f192dc5 Drop 0-byte IPv6 fragments.
Currently, we process IPv6 fragments with 0 bytes of payload, add them
to the reassembly queue, and do not recognize them as duplicating or
overlapping with adjacent 0-byte fragments. An attacker can exploit this
to create long fragment queues.

There is no legitimate reason for a fragment with no payload. However,
because IPv6 packets with an empty payload are acceptable, allow an
"atomic" fragment with no payload.

Reviewed by:	jhb
Security:	FreeBSD-SA-18:10.ip
Security:	CVE-2018-6923
2018-08-14 17:29:22 +00:00
Jonathan T. Looney
1e9f3b734e Implement a limit on on the number of IPv6 reassembly queues per bucket.
There is a hashing algorithm which should distribute IPv6 reassembly
queues across the available buckets in a relatively even way. However,
if there is a flaw in the hashing algorithm which allows a large number
of IPv6 fragment reassembly queues to end up in a single bucket, a per-
bucket limit could help mitigate the performance impact of this flaw.

Implement such a limit, with a default of twice the maximum number of
reassembly queues divided by the number of buckets. Recalculate the
limit any time the maximum number of reassembly queues changes.
However, allow the user to override the value using a sysctl
(net.inet6.ip6.maxfragbucketsize).

Reviewed by:	jhb
Security:	FreeBSD-SA-18:10.ip
Security:	CVE-2018-6923
2018-08-14 17:27:41 +00:00