key_delsav used to conditionally dereference vnet, leading to panics as
it was getting unset too early.
While the particular condition was removed, it makes sense to handle all
operations of the sort with correct vnet set so change it.
Reviewed by: ae
Sponsored by: Rubicon Communications, LLC ("Netgate")
Differential Revision: https://reviews.freebsd.org/D34313
When TCP_MD5SIG is set on a socket, all packets are dropped that don't
contain an MD5 signature. Relax this behavior to accept a non-signed
packet when a security association doesn't exist with the peer.
This is useful when a listen socket set with TCP_MD5SIG wants to handle
connections protected with and without MD5 signatures.
Reviewed by: bz (previous version)
Sponsored by: nepustil.net
Sponsored by: Klara Inc.
Differential Revision: https://reviews.freebsd.org/D33227
Return ENOENT from tcp_ipsec_input() when a security association is not
found. This allows callers of TCP_MD5_INPUT() to differentiate between a
security association not found and receiving a bad signature.
Also return ENOENT from tcp_ipsec_output() for consistency.
Reviewed by: ae
Sponsored by: nepustil.net
Sponsored by: Klara Inc.
Differential Revision: https://reviews.freebsd.org/D33226
The historical BSD network stack loop that rolls over domains and
over protocols has no advantages over more modern SYSINIT(9).
While doing the sweep, split global and per-VNET initializers.
Getting rid of pr_init allows to achieve several things:
o Get rid of ifdef's that protect against double foo_init() when
both INET and INET6 are compiled in.
o Isolate initializers statically to the module they init.
o Makes code easier to understand and maintain.
Reviewed by: melifaro
Differential revision: https://reviews.freebsd.org/D33537
When adding an SPD entry that already exists, a refcount wraparound
panic is encountered. This was caused from dropping a reference on the
wrong security policy.
Fixes: 4920e38fec ("ipsec: fix race condition in key.c")
Reviewed by: wma
Sponsored by: Klara Inc.
Differential Revision: https://reviews.freebsd.org/D33100
Same comparison problem as in key_do_getnewspi.
Reviewed by: ae
Sponsored by: Rubicon Communications, LLC ("Netgate")
Differential Revision: https://reviews.freebsd.org/D32827
SPIs get allocated and inserted in separate steps. Prior to the change
there was nothing preventing 2 differnet threads from ending up with the
same one.
PR: 258849
Reported by: Herbie.Robinson@stratus.com
Reviewed by: ae
Sponsored by: Rubicon Communications, LLC ("Netgate")
Differential Revision: https://reviews.freebsd.org/D32826
The 'count' variable would end up being -1 post loop, while the
following condition would check for 0 instead.
PR: 258849
Reported by: Herbie.Robinson@stratus.com
Reviewed by: ae
Sponsored by: Rubicon Communications, LLC ("Netgate")
Differential Revision: https://reviews.freebsd.org/D32826
Discard and send ICMPv6 Packet Too Big to sender when we try to encapsulate
and forward a packet which total length exceeds the PMTU.
Logic is based on the IPv4 implementation.
Common code was moved to a separate function.
Differential revision: https://reviews.freebsd.org/D31771
Obtained from: Semihalf
Sponsored by: Stormshield
If we fail to find to PMTU in hostcache, we assume it's equal
to link's MTU.
This patch prevents packets larger then link's MTU to be dropped
silently if there is no PMTU in hostcache.
Differential revision: https://reviews.freebsd.org/D31770
Obtained from: Semihalf
Sponsored by: Stormshield
pfil_run_hooks which eventually can get called asserts on it.
Reviewed by: ae
Sponsored by: Rubicon Communications, LLC ("Netgate")
Differential Revision: https://reviews.freebsd.org/D32007
key_allocsa() expects to handle only IPSec protocols and has an
assertion to this effect. However, ipsec4_ctlinput() has to handle
messages from ICMP unreachable packets and was not validating the
protocol number. In practice such a packet would simply fail to match
any SADB entries and would thus be ignored.
Reported by: syzbot+6a9ef6fcfadb9f3877fe@syzkaller.appspotmail.com
Reviewed by: ae
MFC after: 1 week
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D31890
If we matched SP to a packet, but no associated SA was found
ipsec4_allocsa will return NULL while setting error=0.
This resulted in use after free and potential kernel panic.
Return EINPROGRESS if the case described above instead.
Obtained from: Semihalf
Sponsored by: Stormshield
Differential revision: https://reviews.freebsd.org/D30994
If an encapsulated frame is going to have DF bit set check its desitnitions'
PMTU and if it won't fit drop it and:
Generate ICMP 3/4 message if the packet was to be forwarded.
Return EMSGSIZE error otherwise.
Obtained from: Semihalf
Sponsored by: Stormshield
Differential revision: https://reviews.freebsd.org/D30993
SO_RERROR indicates that receive buffer overflows should be handled as
errors. Historically receive buffer overflows have been ignored and
programs could not tell if they missed messages or messages had been
truncated because of overflows. Since programs historically do not
expect to get receive overflow errors, this behavior is not the
default.
This is really really important for programs that use route(4) to keep
in sync with the system. If we loose a message then we need to reload
the full system state, otherwise the behaviour from that point is
undefined and can lead to chasing bogus bug reports.
Reviewed by: philip (network), kbowling (transport), gbe (manpages)
MFC after: 2 weeks
Differential Revision: https://reviews.freebsd.org/D26652
Creation of a zone is expensive and there is no need to have one for
every vnet. Moreover, this wastes memory as these separate zones
cannot use the same per-cpu caches. Finally, this is a step towards
replacing the custom zone with pcpu-16.
Two counter_u64_zero calls induce back-to-back IPIs to zero everything
out. Instead, pass the M_ZERO flag to let uma just iterate all buffers.
The counter(9) API abstraction is already violated by not using
counter_u64_alloc.
Reviewed by: ae
Sponsored by: Rubicon Communications, LLC ("Netgate")
Differential Revision: https://reviews.freebsd.org/D30916
Several protocol methods take a sockaddr as input. In some cases the
sockaddr lengths were not being validated, or were validated after some
out-of-bounds accesses could occur. Add requisite checking to various
protocol entry points, and convert some existing checks to assertions
where appropriate.
Reported by: syzkaller+KASAN
Reviewed by: tuexen, melifaro
MFC after: 2 weeks
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D29519
Historically receive buffer overflows have been ignored and programs
could not tell if they missed messages or messages had been truncated
because of overflows. Since programs historically do not expect to get
receive overflow errors, this behavior is not the default.
This is really really important for programs that use route(4) to keep in sync
with the system. If we loose a message then we need to reload the full system
state, otherwise the behaviour from that point is undefined and can lead
to chasing bogus bug reports.
Currently, OpenCrypto consumers can request asynchronous dispatch by
setting a flag in the cryptop. (Currently only IPSec may do this.) I
think this is a bit confusing: we (conditionally) set cryptop flags to
request async dispatch, and then crypto_dispatch() immediately examines
those flags to see if the consumer wants async dispatch. The flag names
are also confusing since they don't specify what "async" applies to:
dispatch or completion.
Add a new KPI, crypto_dispatch_async(), rather than encoding the
requested dispatch type in each cryptop. crypto_dispatch_async() falls
back to crypto_dispatch() if the session's driver provides asynchronous
dispatch. Get rid of CRYPTOP_ASYNC() and CRYPTOP_ASYNC_KEEPORDER().
Similarly, add crypto_dispatch_batch() to request processing of a tailq
of cryptops, rather than encoding the scheduling policy using cryptop
flags. Convert GELI, the only user of this interface (disabled by
default) to use the new interface.
Add CRYPTO_SESS_SYNC(), which can be used by consumers to determine
whether crypto requests will be dispatched synchronously. This is just
a helper macro. Use it instead of looking at cap flags directly.
Fix style in crypto_done(). Also get rid of CRYPTO_RETW_EMPTY() and
just check the relevant queues directly. This could result in some
unnecessary wakeups but I think it's very uncommon to be using more than
one queue per worker in a given workload, so checking all three queues
is a waste of cycles.
Reviewed by: jhb
Sponsored by: Ampere Computing
Submitted by: Klara, Inc.
MFC after: 2 weeks
Differential Revision: https://reviews.freebsd.org/D28194
This is similar to the logic used in ip_output() to convert mbufs
prior to computing checksums. Unmapped mbufs can be sent when using
sendfile() over IPsec or using KTLS over IPsec.
Reported by: Sony Arpita Das @ Chelsio QA
Reviewed by: np
Sponsored by: Chelsio
Differential Revision: https://reviews.freebsd.org/D28187
This patch adds 80% of UINT32_MAX limit on sequence number.
When sequence number reaches limit kernel sends SADB_EXPIRE message to
IKE daemon which is responsible to perform rekeying.
Submitted by: Patryk Duda <pdk@semihalf.com>
Reviewed by: ae
Differential revision: https://reviews.freebsd.org/D22370
Obtained from: Semihalf
Sponsored by: Stormshield
Implement support for including IPsec ESN (Extended Sequence Number) to
both encrypt and authenticate mode (eg. AES-CBC and SHA256) and combined
mode (eg. AES-GCM). Both ESP and AH protocols are updated. Additionally
pass relevant information about ESN to crypto layer.
For the ETA mode the ESN is stored in separate crp_esn buffer because
the high-order 32 bits of the sequence number are appended after the
Next Header (RFC 4303).
For the AEAD modes the high-order 32 bits of the sequence number
[e.g. RFC 4106, Chapter 5 AAD Construction] are included as part of
crp_aad (SPI + ESN (32 high order bits) + Seq nr (32 low order bits)).
Submitted by: Grzegorz Jaszczyk <jaz@semihalf.com>
Patryk Duda <pdk@semihalf.com>
Reviewed by: jhb, gnn
Differential revision: https://reviews.freebsd.org/D22369
Obtained from: Semihalf
Sponsored by: Stormshield
As RFC 4304 describes there is anti-replay algorithm responsibility
to provide appropriate value of Extended Sequence Number.
This patch introduces anti-replay algorithm with ESN support based on
RFC 4304, however to avoid performance regressions window implementation
was based on RFC 6479, which was already implemented in FreeBSD.
To keep things clean and improve code readability, implementation of window
is kept in seperate functions.
Submitted by: Grzegorz Jaszczyk <jaz@semihalf.com>
Patryk Duda <pdk@semihalf.com>
Reviewed by: jhb
Differential revision: https://reviews.freebsd.org/D22367
Obtained from: Semihalf
Sponsored by: Stormshield
- Rename from the teardown callback from 'zeroize' to 'cleanup' since
this no longer zeroes keys.
- Change the callback return type to void. Nothing checked the return
value and it was always zero.
- Don't have esp call into ah since it no longer needs to depend on
this to clear the auth key. Instead, both are now private and
self-contained.
Reviewed by: delphij
Sponsored by: Chelsio Communications
Differential Revision: https://reviews.freebsd.org/D25443
When an IPsec packet has been encrypted or decrypted, the next step in
the packet's traversal through the network stack is invoked from a
crypto worker thread, not from the original calling thread. These
threads need to enter the network epoch before passing packets down to
IP output routines or up to transport protocols.
Reviewed by: ae
Sponsored by: Chelsio Communications
Differential Revision: https://reviews.freebsd.org/D25444
This is in preparation for enabling a loadable SCTP stack. Analogous to
IPSEC/IPSEC_SUPPORT, the SCTP_SUPPORT kernel option must be configured
in order to support a loadable SCTP implementation.
Discussed with: tuexen
MFC after: 2 weeks
Sponsored by: The FreeBSD Foundation
This fixes ipsec.ko to include all of IPSEC_DEBUG.
Reviewed by: imp
MFC after: 2 weeks
Sponsored by: Netflix
Differential Revision: https://reviews.freebsd.org/D25046
r361390 decreased blocksize of AES-CTR from 16 to 1.
Because of that ESP payload is no longer aligned to 16 bytes
before being encrypted and sent.
This is a good change since RFC3686 specifies that the last block
doesn't need to be aligned.
Since FreeBSD before r361390 couldn't decrypt partial blocks encrypted
with AES-CTR we need to enforce 16 byte alignment in order to preserve
compatibility.
Add a sysctl(on by default) to control it.
Submitted by: Kornel Duleba <mindal@semihalf.com>
Reviewed by: jhb
Obtained from: Semihalf
Sponsored by: Stormshield
Differential Revision: https://reviews.freebsd.org/D24999
Some crypto consumers such as GELI and KTLS for file-backed sendfile
need to store their output in a separate buffer from the input.
Currently these consumers copy the contents of the input buffer into
the output buffer and queue an in-place crypto operation on the output
buffer. Using a separate output buffer avoids this copy.
- Create a new 'struct crypto_buffer' describing a crypto buffer
containing a type and type-specific fields. crp_ilen is gone,
instead buffers that use a flat kernel buffer have a cb_buf_len
field for their length. The length of other buffer types is
inferred from the backing store (e.g. uio_resid for a uio).
Requests now have two such structures: crp_buf for the input buffer,
and crp_obuf for the output buffer.
- Consumers now use helper functions (crypto_use_*,
e.g. crypto_use_mbuf()) to configure the input buffer. If an output
buffer is not configured, the request still modifies the input
buffer in-place. A consumer uses a second set of helper functions
(crypto_use_output_*) to configure an output buffer.
- Consumers must request support for separate output buffers when
creating a crypto session via the CSP_F_SEPARATE_OUTPUT flag and are
only permitted to queue a request with a separate output buffer on
sessions with this flag set. Existing drivers already reject
sessions with unknown flags, so this permits drivers to be modified
to support this extension without requiring all drivers to change.
- Several data-related functions now have matching versions that
operate on an explicit buffer (e.g. crypto_apply_buf,
crypto_contiguous_subsegment_buf, bus_dma_load_crp_buf).
- Most of the existing data-related functions operate on the input
buffer. However crypto_copyback always writes to the output buffer
if a request uses a separate output buffer.
- For the regions in input/output buffers, the following conventions
are followed:
- AAD and IV are always present in input only and their
fields are offsets into the input buffer.
- payload is always present in both buffers. If a request uses a
separate output buffer, it must set a new crp_payload_start_output
field to the offset of the payload in the output buffer.
- digest is in the input buffer for verify operations, and in the
output buffer for compute operations. crp_digest_start is relative
to the appropriate buffer.
- Add a crypto buffer cursor abstraction. This is a more general form
of some bits in the cryptosoft driver that tried to always use uio's.
However, compared to the original code, this avoids rewalking the uio
iovec array for requests with multiple vectors. It also avoids
allocate an iovec array for mbufs and populating it by instead walking
the mbuf chain directly.
- Update the cryptosoft(4) driver to support separate output buffers
making use of the cursor abstraction.
Sponsored by: Netflix
Differential Revision: https://reviews.freebsd.org/D24545
The changes in r359374 added various sanity checks in sessions and
requests created by crypto consumers in part to permit backend drivers
to make assumptions instead of duplicating checks for various edge
cases. One of the new checks was to reject sessions which provide a
pointer to a key while claiming the key is zero bits long.
IPsec ESP tripped over this as it passes along whatever key is
provided for NULL, including a pointer to a zero-length key when an
empty string ("") is used with setkey(8). One option would be to
teach the IPsec key layer to not allocate keys of zero length, but I
went with a simpler fix of just not passing any keys down and always
using a key length of zero for NULL algorithms.
PR: 245832
Reported by: CI
Examples of depecrated algorithms in manual pages and sample configs
are updated where relevant. I removed the one example of combining
ESP and AH (vs using a cipher and auth in ESP) as RFC 8221 says this
combination is NOT RECOMMENDED.
Specifically, this removes support for the following ciphers:
- des-cbc
- 3des-cbc
- blowfish-cbc
- cast128-cbc
- des-deriv
- des-32iv
- camellia-cbc
This also removes support for the following authentication algorithms:
- hmac-md5
- keyed-md5
- keyed-sha1
- hmac-ripemd160
Reviewed by: cem, gnn (older verisons)
Relnotes: yes
Sponsored by: Chelsio Communications
Differential Revision: https://reviews.freebsd.org/D24342