SADB_ACQUIRE requests are send by kernel, when security policy doesn't
have corresponding security association for outbound packet. IKE daemon
usually registers its handler for such messages and when the kernel asks
for SA it can handle this request. Now such requests will contain
additional fields that can help IKE daemon to create SA. And IKE now
can create SAs using only information from SADB_ACQUIRE request, this
is useful when many if_ipsec(4) interfaces are in use and IKE doesn track
security policies that was installed by kernel.
Obtained from: Yandex LLC
MFC after: 3 weeks
Sponsored by: Yandex LLC
Do not call crypto_newsession() while holding xforms_lock mutex.
Release mutex before invoking crypto_newsession(), and use
ipsec_kmod_enter()/ipsec_kmod_exit() functions to protect from doing
access to unloaded kernel module memory.
Move xform-releated functions into subr_ipsec.c to be able use
ipsec_kmod_* functions. Also unconditionally build ipsec_kmod_*
functions, since now they are always used by IPSec code.
Add xf_cntr field to struct xformsw, it is used by ipsec_kmod_*
functions. Also constify xf_name field, since it is not expected to be
modified.
Approved by: re (kib)
Differential Revision: https://reviews.freebsd.org/D17302
Track session objects in the framework, and pass handles between the
framework (OCF), consumers, and drivers. Avoid redundancy and complexity in
individual drivers by allocating session memory in the framework and
providing it to drivers in ::newsession().
Session handles are no longer integers with information encoded in various
high bits. Use of the CRYPTO_SESID2FOO() macros should be replaced with the
appropriate crypto_ses2foo() function on the opaque session handle.
Convert OCF drivers (in particular, cryptosoft, as well as myriad others) to
the opaque handle interface. Discard existing session tracking as much as
possible (quick pass). There may be additional code ripe for deletion.
Convert OCF consumers (ipsec, geom_eli, krb5, cryptodev) to handle-style
interface. The conversion is largely mechnical.
The change is documented in crypto.9.
Inspired by
https://lists.freebsd.org/pipermail/freebsd-arch/2018-January/018835.html .
No objection from: ae (ipsec portion)
Reported by: jhb
IPSEC_PCBCTL() functions, which include tcp_ipsec_pcbctl(),
ipsec4_pcbctl(), and ipsec6_pcbctl(), should all have matching locking
semantics.
ipsec4_pcbctl() and ipsec6_pcbctl() expect the inp to be unlocked on
entry and exit and appear to be correctly implemented as such. But
tcp_ipsec_pcbctl() had other semantics. This patch fixes the semantics
for tcp_ipsec_pcbctl().
Submitted by: Jason Eggleston <jason@eggnet.com>
MFH: 2 weeks
Sponsored by: Limelight Networks
Differential Revision: https://reviews.freebsd.org/D14623
Per-cpu zone allocations are very rarely done compared to regular zones.
The intent is to avoid pessimizing the latter case with per-cpu specific
code.
In particular contrary to the claim in r334824, M_ZERO is sometimes being
used for such zones. But the zeroing method is completely different and
braching on it in the fast path for regular zones is a waste of time.
Currently it has several disadvantages:
- it uses single mutex to protect internal structures. It is used by
data- and control- path, thus there are no parallelism at all.
- it uses single list to keep encap handlers for both INET and INET6
families.
- struct encaptab keeps unneeded information (src, dst, masks, protosw),
that isn't used by code in the source tree.
- matches are prioritized and when many tunneling interfaces are
registered, encapcheck handler of each interface is invoked for each
packet. The search takes O(n) for n interfaces. All this work is done
with exclusive lock held.
What this patch includes:
- the datapath is converted to be lockless using epoch(9) KPI.
- struct encaptab now linked using CK_LIST.
- all unused fields removed from struct encaptab. Several new fields
addedr: min_length is the minimum packet length, that encapsulation
handler expects to see; exact_match is maximum number of bits, that
can return an encapsulation handler, when it wants to consume a packet.
- IPv6 and IPv4 handlers are stored in separate lists;
- added new "encap_lookup_t" method, that will be used later. It is
targeted to speedup lookup of needed interface, when gif(4)/gre(4) have
many interfaces.
- the need to use protosw structure is eliminated. The only pr_input
method was used from this structure, so I don't see the need to keep
using it.
- encap_input_t method changed to avoid using mbuf tags to store softc
pointer. Now it is passed directly trough encap_input_t method.
encap_getarg() funtions is removed.
- all sockaddr structures and code that uses them removed. We don't have
any code in the tree that uses them. All consumers use encap_attach_func()
method, that relies on invoking of encapcheck() to determine the needed
handler.
- introduced struct encap_config, it contains parameters of encap handler
that is going to be registered by encap_attach() function.
- encap handlers are stored in lists ordered by exact_match value, thus
handlers that need more bits to match will be checked first, and if
encapcheck method returns exact_match value, the search will be stopped.
- all current consumers changed to use new KPI.
Reviewed by: mmacy
Sponsored by: Yandex LLC
Differential Revision: https://reviews.freebsd.org/D15617
The RFC specifies that under IPv6 the complete AH header must be 64 bit
aligned, and under IPv4, 32 bit aligned. Prior to this change, we (along
with other BSDs and MacOS) had violated this requirement.
This makes it possible to set up IPv6-AH between Linux and BSD, and also
probably between Windows and BSD.
PR: 222684
Reported and tested by: Jason Mader <jasonmader AT gmail.com>
Obtained from: NetBSD xform_ah.c 1.105
(b939fe2483972eb43d71bf990cfb7f26dece7839 NetBSD/src on GH)
by Maxime Villard
MFC after: 35.2731 hours
Relnotes: probably (breaks ipv6 compat with older FreeBSD/NetBSD/MacOS)
Sponsored by: Dell EMC Isilon
When large SPDs are used, we face two problems:
- too many CPU cycles are spent during the linear searches in the SPD
for each packet
- too much contention on multi socket systems, since we use a single
shared lock.
Main changes:
- added the sysctl tree 'net.key.spdcache' to control the SPD cache
(disabled by default).
- cache the sp indexes that are used to perform SP lookups.
- use a range of dedicated mutexes to protect the cache lines.
Submitted by: Emeric Poupon <emeric.poupon@stormshield.eu>
Reviewed by: ae
Sponsored by: Stormshield
Differential Revision: https://reviews.freebsd.org/D15050
Don't assume M_PKTHDR is set only on the first mbuf of the chain.
The check is replaced by (m1 != m), which is equivalent to the previous
code: we want to modify m->m_pkthdr.len only when 'm' was not passed in
m_adj().
Fix a pretty bad mistake, that has always been there:
m_adj(m1, -(m1->m_len - roff));
if (m1 != m)
m->m_pkthdr.len -= (m1->m_len - roff);
This is wrong: m_adj() will modify m1->m_len, so we're using a wrong
value when manually adjusting m->m_pkthdr.len.
Reported by: Maxime Villard <max at m00nbsd dot net>
Obtained from: NetBSD
MFC after: 1 week
When using hardware crypto engines, the callback functions used to handle
an IPsec packet after it has been encrypted or decrypted can be invoked
asynchronously from a worker thread that is not associated with a vnet.
Extend 'struct xform_data' to include a vnet pointer and save the current
vnet in this new member when queueing crypto requests in IPsec. In the
IPsec callback routines, use the new member to set the current vnet while
processing the modified packet.
This fixes a panic when using hardware offload such as ccr(4) with IPsec
after VIMAGE was enabled in GENERIC.
Reported by: Sony Arpita Das and Harsh Jain @ Chelsio
Reviewed by: bz
MFC after: 1 week
Sponsored by: Chelsio Communications
Differential Revision: https://reviews.freebsd.org/D14763
o count in_nomem counter when we have failed to allocate mbuf for
promisc socket;
o count in_msgtarget counter when we have secussfully sent data to socket;
o Since we are sending messages in a loop, returning error on first fail
interrupts the loop, and all remaining sockets will not receive this
message. So, do not return error when we have failed to send data to ALL
or REGISTERED target. Return error only for KEY_SENDUP_ONE case. Now,
when some socket has overfilled its receive buffer, this will not break
other sockets.
MFC after: 2 weeks
Fix a vulnerability in IPsec-IPv6-AH, that allows an attacker to remotely
crash the kernel with a single packet.
In this loop we need to increment 'ad' by two, because the length field
of the option header does not count the size of the option header itself.
If the length is zero, then 'count' is incremented by zero, and there's
an infinite loop. Beyond that, this code was written with the assumption
that since the IPv6 packet already went through the generic IPv6 option
parser, several fields are guaranteed to be valid; but this assumption
does not hold because of the missing '+2', and there's as a result a
triggerable buffer overflow (write zeros after the end of the mbuf,
potentially to the next mbuf in memory since it's a pool).
Add the missing '+2', this place will be reinforced in separate commits.
Reported by: Maxime Villard <maxv at NetBSD.org>
MFC after: 1 week
This reduces noise when kernel is compiled by newer GCC versions,
such as one used by external toolchain ports.
Reviewed by: kib, andrew(sys/arm and sys/arm64), emaste(partial), erj(partial)
Reviewed by: jhb (sys/dev/pci/* sys/kern/vfs_aio.c and sys/kern/kern_synch.c)
Differential Revision: https://reviews.freebsd.org/D10385
SPDB was cleaned using TAILQ_CONCAT() instead of calling key_unlink()
for each SP, thus we need to properly clean lists in each bucket of
V_sphashtbl to avoid panic in hashdestroy() when INVARIANTS is enabled.
Do the same for V_acqaddrhashtbl and V_acqseqhashtbl.
When we are called in DEFAULT_VNET, destroy also all global locks and
drain key_timer callout.
Reported by: kp
Tested by: kp
MFC after: 1 week
Mainly focus on files that use BSD 2-Clause license, however the tool I
was using misidentified many licenses so this was mostly a manual - error
prone - task.
The Software Package Data Exchange (SPDX) group provides a specification
to make it easier for automated tools to detect and summarize well known
opensource licenses. We are gradually adopting the specification, noting
that the tags are considered only advisory and do not, in any way,
superceed or replace the license texts.
No functional change intended.
Mainly focus on files that use BSD 3-Clause license.
The Software Package Data Exchange (SPDX) group provides a specification
to make it easier for automated tools to detect and summarize well known
opensource licenses. We are gradually adopting the specification, noting
that the tags are considered only advisory and do not, in any way,
superceed or replace the license texts.
Special thanks to Wind River for providing access to "The Duke of
Highlander" tool: an older (2014) run over FreeBSD tree was useful as a
starting point.
The HMAC construction natively permits any key size between 0 and the input
block length. Before r324017, the auth_hash 'keysize' member was the hash
output length, which was used by ipsec for key sizes. (Non-ipsec consumers
need the ability to use other keysizes, hence, r324017.)
The ipsec SADB code blindly uses the auth_hash 'keysize' member for both
minimum and maximum key size, which is wrong (from an HMAC perspective).
For now, just switch it to 'hashsize', which matches the existing
expectations.
Instead it should probably use the range [0, keysize]. But there may be
other broken code in ipsec that rejects hashes with too small a minimum
key size.
Reported by: olivier@
Reviewed by: olivier, no objection from ae
Sponsored by: Dell EMC Isilon
Differential Revision: https://reviews.freebsd.org/D12770
key_updateaddresses() is used to update SA addresses and NAT-T
configuration in SADB_UPDATE message. This is done using cloning SA
content from old SA into new one. But addresses and NAT-T configuration
are taking from SADB_UPDATE message. Use newsa pointer to set NAT-T
properties into cloned SA.
PR: 223382
MFC after: 1 week
fine when a lot of different flows to be ciphered/deciphered are involved.
However, when a software crypto driver is used, there are
situations where we could benefit from making crypto(9) multi threaded:
- a single flow is to be ciphered: only one thread is used to cipher it,
- a single ESP flow is to be deciphered: only one thread is used to
decipher it.
The idea here is to call crypto(9) using a new mode (CRYPTO_F_ASYNC) to
dispatch the crypto jobs on multiple threads, if the underlying crypto
driver is working in synchronous mode.
Another flag is added (CRYPTO_F_ASYNC_KEEPORDER) to make crypto(9)
dispatch the crypto jobs in the order they are received (an additional
queue/thread is used), so that the packets are reinjected in the network
using the same order they were posted.
A new sysctl net.inet.ipsec.async_crypto can be used to activate
this new behavior (disabled by default).
Submitted by: Emeric Poupon <emeric.poupon@stormshield.eu>
Reviewed by: ae, jmg, jhb
Differential Revision: https://reviews.freebsd.org/D10680
Sponsored by: Stormshield
Theoretically, HMACs do not actually have any limit on key sizes.
Transforms should compact input keys larger than the HMAC block size by
using the transform (hash) on the input key.
(Short input keys are padded out with zeros to the HMAC block size.)
Still, not all FreeBSD crypto drivers that provide HMAC functionality
handle longer-than-blocksize keys appropriately, so enforce a "maximum" key
length in the crypto API for auth_hashes that previously expressed a
requirement. (The "maximum" is the size of a single HMAC block for the
given transform.) Unconstrained auth_hashes are left as-is.
I believe the previous hardcoded sizes were committed in the original
import of opencrypto from OpenBSD and are due to specific protocol
details of IPSec. Note that none of the previous sizes actually matched
the appropriate HMAC block size.
The previous hardcoded sizes made the SHA tests in cryptotest.py
useless for testing FreeBSD crypto drivers; none of the NIST-KAT example
inputs had keys sized to the previous expectations.
The following drivers were audited to check that they handled keys up to
the block size of the HMAC safely:
Software HMAC:
* padlock(4)
* cesa
* glxsb
* safe(4)
* ubsec(4)
Hardware accelerated HMAC:
* ccr(4)
* hifn(4)
* sec(4) (Only supports up to 64 byte keys despite claiming to
support SHA2 HMACs, but validates input key sizes)
* cryptocteon (MIPS)
* nlmsec (MIPS)
* rmisec (MIPS) (Amusingly, does not appear to use key material at
all -- presumed broken)
Reviewed by: jhb (previous version), rlibby (previous version)
Sponsored by: Dell EMC Isilon
Differential Revision: https://reviews.freebsd.org/D12437
When a security policy should match TCP connection with specific ports,
the SYN+ACK segment send by syncache_respond() is considered as forwarded
packet, because at this moment TCP connection does not have PCB structure,
and ip_output() is called without inpcb pointer. In this case SPIDX filled
for SP lookup will not contain TCP ports and security policy will not
be found. This can lead to unencrypted SYN+ACK on the wire.
This patch restores the old behavior, when ports will not be filled only
for forwarded packets.
Reported by: Dewayne Geraghty <dewayne.geraghty at heuristicsystems.com.au>
MFC after: 1 week
key_msg2sp() is used for parsing data from setsockopt(IP[V6]_IPSEC_POLICY)
call. This socket option is usually used to configure IPsec bypass for
socket. Only privileged user can set this socket option.
The message syntax is described here
http://www.kame.net/newsletter/20021210/
and our libipsec is usually used to create the correct request.
Add additional checks:
* that sadb_x_ipsecrequest_len is not out of bounds of user supplied buffer
* that src/dst's sa_len is the same
* that 2*sa_len is not out of bounds of user supplied buffer
* that 2*sa_len fits into bounds of sadb_x_ipsecrequest
Reported by: Ilja van Sprundel
MFC after: 1 week
Differential Revision: https://reviews.freebsd.org/D11796
from enc_hhook().
This should solve the problem when pf is used with if_enc(4) interface,
and outbound packet with existing PCB checked by pf, and this leads to
deadlock due to pf does its own PCB lookup and tries to take rlock when
wlock is already held.
Now we pass PCB pointer if it is known to the pfil hook, this helps to
avoid extra PCB lookup and thus rlock acquiring is not needed.
For inbound packets it is safe to pass NULL, because we do not held any
PCB locks yet.
PR: 220217
MFC after: 3 weeks
Sponsored by: Yandex LLC
is not specified.
Due to the long call chain IPsec code can produce the kernel stack
exhaustion on the i386 architecture. The debugging code usually is not
used, but it requires a lot of stack space to keep buffers for strings
formatting. This patch conditionally defines macros to disable building
of IPsec debugging code.
IPsec currently has two sysctl variables to configure debug output:
* net.key.debug variable is used to enable debug output for PF_KEY
protocol. Such debug messages are produced by KEYDBG() macro and
usually they can be interesting for developers.
* net.inet.ipsec.debug variable is used to enable debug output for
DPRINTF() macro and ipseclog() function. DPRINTF() macro usually
is used for development debugging. ipseclog() function is used for
debugging by administrator.
The patch disables KEYDBG() and DPRINTF() macros, and formatting buffers
declarations when IPSEC_DEBUG is not present in kernel config. This reduces
stack requirement for up to several hundreds of bytes.
The net.inet.ipsec.debug variable still can be used to enable ipseclog()
messages by administrator.
PR: 219476
Reported by: eugen
No objection from: #network
MFC after: 1 week
Differential Revision: https://reviews.freebsd.org/D10869
There are two possible ways how crypto callback are called: directly from
caller and deffered from crypto thread.
For outbound packets the direct call chain is the following:
IPSEC_OUTPUT() method -> ipsec[46]_common_output() ->
-> ipsec[46]_perform_request() -> xform_output() ->
-> crypto_dispatch() -> crypto_invoke() -> crypto_done() ->
-> xform_output_cb() -> ipsec_process_done() -> ip[6]_output().
The SA and SP references are held while crypto processing is not finished.
The error handling code wrongly expected that crypto callback always called
from the crypto thread context, and it did references releasing in
xform_output_cb(). But when the crypto callback called directly, in case of
error the error handling code in ipsec[46]_perform_request() also did
references releasing.
To fix this, remove error handling from ipsec[46]_perform_request() and do it
in xform_output() before crypto_dispatch().
MFC after: 10 days
There are two possible ways how crypto callback are called: directly from
caller and deffered from crypto thread.
For inbound packets the direct call chain is the following:
IPSEC_INPUT() method -> ipsec_common_input() -> xform_input() ->
-> crypto_dispatch() -> crypto_invoke() -> crypto_done() ->
-> xform_input_cb() -> ipsec[46]_common_input_cb() -> netisr_queue().
The SA reference is held while crypto processing is not finished.
The error handling code wrongly expected that crypto callback always called
from the crypto thread context, and it did SA reference releasing in
xform_input_cb(). But when the crypto callback called directly, in case of
error (e.g. data authentification failed) the error handling in
ipsec_common_input() also did SA reference releasing.
To fix this, remove error handling from ipsec_common_input() and do it
in xform_input() before crypto_dispatch().
PR: 219356
MFC after: 10 days
A long long time ago the register keyword told the compiler to store
the corresponding variable in a CPU register, but it is not relevant
for any compiler used in the FreeBSD world today.
ANSIfy related prototypes while here.
Reviewed by: cem, jhb
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D10193
PCB SP cache acquires extra reference, when SP is stored in the cache.
Release this reference when PCB is destroyed in ipsec_delete_pcbpolicy().
In ipsec_copy_pcbpolicy() release reference to SP in case if sp_in or
sp_out are not NULL.
Reported by: Slawa Olhovchenkov <slw at zxy spb ru>
MFC after: 1 week
When the replay window size is large than UINT8_MAX, add to the request
the SADB_X_EXT_SA_REPLAY extension header that was added in r309144.
Also add support of SADB_X_EXT_NAT_T_TYPE, SADB_X_EXT_NAT_T_SPORT,
SADB_X_EXT_NAT_T_DPORT, SADB_X_EXT_NAT_T_OAI, SADB_X_EXT_NAT_T_OAR,
SADB_X_EXT_SA_REPLAY, SADB_X_EXT_NEW_ADDRESS_SRC, SADB_X_EXT_NEW_ADDRESS_DST
extension headers to the key_debug that is used by `setkey -x`.
Modify kdebug_sockaddr() to use inet_ntop() for IP addresses formatting.
And modify kdebug_sadb_x_policy() to show policy scope and priority.
Reviewed by: gnn, Emeric Poupon
MFC after: 2 weeks
Differential Revision: https://reviews.freebsd.org/D10375
destination addresses. Previous code has used only destination address
for lookup. But for inbound packets the source address was used as SA
destination address. Thus only outbound SA were used for both directions.
Now we use addresses from a packet as is, thus SAs for both directions are
needed.
Reported by: Mike Tancsa
MFC after: 1 week
Currently are defined three scopes: global, ifnet, and pcb.
Generic security policies that IKE daemon can add via PF_KEY interface
or an administrator creates with setkey(8) utility have GLOBAL scope.
Such policies can be applied by the kernel to outgoing packets and checked
agains inbound packets after IPsec processing.
Security policies created by if_ipsec(4) interfaces have IFNET scope.
Such policies are applied to packets that are passed through if_ipsec(4)
interface.
And security policies created by application using setsockopt()
IP_IPSEC_POLICY option have PCB scope. Such policies are applied to
packets related to specific socket. Currently there is no way to list
PCB policies via setkey(8) utility.
Modify setkey(8) and libipsec(3) to be able distinguish the scope of
security policies in the `setkey -DP` listing. Add two optional flags:
'-t' to list only policies related to virtual *tunneling* interfaces,
i.e. policies with IFNET scope, and '-g' to list only policies with GLOBAL
scope. By default policies from all scopes are listed.
To implement this PF_KEY's sadb_x_policy structure was modified.
sadb_x_policy_reserved field is used to pass the policy scope from the
kernel to userland. SADB_SPDDUMP message extended to support filtering
by scope: sadb_msg_satype field is used to specify bit mask of requested
scopes.
For IFNET policies the sadb_x_policy_priority field of struct sadb_x_policy
is used to pass if_ipsec's interface if_index to the userland. For GLOBAL
policies sadb_x_policy_priority is used only to manage order of security
policies in the SPDB. For IFNET policies it is not used, so it can be used
to keep if_index.
After this change the output of `setkey -DP` now looks like:
# setkey -DPt
0.0.0.0/0[any] 0.0.0.0/0[any] any
in ipsec
esp/tunnel/87.250.242.144-87.250.242.145/unique:145
spid=7 seq=3 pid=58025 scope=ifnet ifname=ipsec0
refcnt=1
# setkey -DPg
::/0 ::/0 icmp6 135,0
out none
spid=5 seq=1 pid=872 scope=global
refcnt=1
No objection from: #network
Obtained from: Yandex LLC
MFC after: 2 weeks
Sponsored by: Yandex LLC
Differential Revision: https://reviews.freebsd.org/D9805
In case when decrypted and decapsulated packet is an UDP datagram,
check that its checksum is not zero before doing incremental checksum
adjustment.
Reported by: Eugene Grosbein
Tested by: Eugene Grosbein
Small summary
-------------
o Almost all IPsec releated code was moved into sys/netipsec.
o New kernel modules added: ipsec.ko and tcpmd5.ko. New kernel
option IPSEC_SUPPORT added. It enables support for loading
and unloading of ipsec.ko and tcpmd5.ko kernel modules.
o IPSEC_NAT_T option was removed. Now NAT-T support is enabled by
default. The UDP_ENCAP_ESPINUDP_NON_IKE encapsulation type
support was removed. Added TCP/UDP checksum handling for
inbound packets that were decapsulated by transport mode SAs.
setkey(8) modified to show run-time NAT-T configuration of SA.
o New network pseudo interface if_ipsec(4) added. For now it is
build as part of ipsec.ko module (or with IPSEC kernel).
It implements IPsec virtual tunnels to create route-based VPNs.
o The network stack now invokes IPsec functions using special
methods. The only one header file <netipsec/ipsec_support.h>
should be included to declare all the needed things to work
with IPsec.
o All IPsec protocols handlers (ESP/AH/IPCOMP protosw) were removed.
Now these protocols are handled directly via IPsec methods.
o TCP_SIGNATURE support was reworked to be more close to RFC.
o PF_KEY SADB was reworked:
- now all security associations stored in the single SPI namespace,
and all SAs MUST have unique SPI.
- several hash tables added to speed up lookups in SADB.
- SADB now uses rmlock to protect access, and concurrent threads
can do SA lookups in the same time.
- many PF_KEY message handlers were reworked to reflect changes
in SADB.
- SADB_UPDATE message was extended to support new PF_KEY headers:
SADB_X_EXT_NEW_ADDRESS_SRC and SADB_X_EXT_NEW_ADDRESS_DST. They
can be used by IKE daemon to change SA addresses.
o ipsecrequest and secpolicy structures were cardinally changed to
avoid locking protection for ipsecrequest. Now we support
only limited number (4) of bundled SAs, but they are supported
for both INET and INET6.
o INPCB security policy cache was introduced. Each PCB now caches
used security policies to avoid SP lookup for each packet.
o For inbound security policies added the mode, when the kernel does
check for full history of applied IPsec transforms.
o References counting rules for security policies and security
associations were changed. The proper SA locking added into xform
code.
o xform code was also changed. Now it is possible to unregister xforms.
tdb_xxx structures were changed and renamed to reflect changes in
SADB/SPDB, and changed rules for locking and refcounting.
Reviewed by: gnn, wblock
Obtained from: Yandex LLC
Relnotes: yes
Sponsored by: Yandex LLC
Differential Revision: https://reviews.freebsd.org/D9352
This function is used only by ipsec_getpolicybysock() to fill security
policy index selector for locally generated packets (that have INPCB).
The function incorrectly assumes that spidx is the same for both directions.
Fix this by using new direction argument to specify correct INPCB security
policy - sp_in or sp_out. There is no need to fill both policy indeces,
because they are overwritten for each packet.
This fixes security policy matching for outbound packets when user has
specified TCP/UDP ports in the security policy upperspec.
PR: 213869
MFC after: 1 week
Since the previous algorithm, based on bit shifting, does not scale
with large replay windows, the algorithm used here is based on
RFC 6479: IPsec Anti-Replay Algorithm without Bit Shifting.
The replay window will be fast to be updated, but will cost as many bits
in RAM as its size.
The previous implementation did not provide a lock on the replay window,
which may lead to replay issues.
Reviewed by: ae
Obtained from: emeric.poupon@stormshield.eu
Sponsored by: Stormshield
Differential Revision: https://reviews.freebsd.org/D8468
than removing the network interfaces first. This change is rather larger
and convoluted as the ordering requirements cannot be separated.
Move the pfil(9) framework to SI_SUB_PROTO_PFIL, move Firewalls and
related modules to their own SI_SUB_PROTO_FIREWALL.
Move initialization of "physical" interfaces to SI_SUB_DRIVERS,
move virtual (cloned) interfaces to SI_SUB_PSEUDO.
Move Multicast to SI_SUB_PROTO_MC.
Re-work parts of multicast initialisation and teardown, not taking the
huge amount of memory into account if used as a module yet.
For interface teardown we try to do as many of them as we can on
SI_SUB_INIT_IF, but for some this makes no sense, e.g., when tunnelling
over a higher layer protocol such as IP. In that case the interface
has to go along (or before) the higher layer protocol is shutdown.
Kernel hhooks need to go last on teardown as they may be used at various
higher layers and we cannot remove them before we cleaned up the higher
layers.
For interface teardown there are multiple paths:
(a) a cloned interface is destroyed (inside a VIMAGE or in the base system),
(b) any interface is moved from a virtual network stack to a different
network stack ("vmove"), or (c) a virtual network stack is being shut down.
All code paths go through if_detach_internal() where we, depending on the
vmove flag or the vnet state, make a decision on how much to shut down;
in case we are destroying a VNET the individual protocol layers will
cleanup their own parts thus we cannot do so again for each interface as
we end up with, e.g., double-frees, destroying locks twice or acquiring
already destroyed locks.
When calling into protocol cleanups we equally have to tell them
whether they need to detach upper layer protocols ("ulp") or not
(e.g., in6_ifdetach()).
Provide or enahnce helper functions to do proper cleanup at a protocol
rather than at an interface level.
Approved by: re (hrs)
Obtained from: projects/vnet
Reviewed by: gnn, jhb
Sponsored by: The FreeBSD Foundation
MFC after: 2 weeks
Differential Revision: https://reviews.freebsd.org/D6747
Coverity points out that 'continue' is equivalent to 'break' in a do {}
while(false) loop.
Reported by: Coverity
CID: 1354983
Sponsored by: EMC / Isilon Storage Division
Use own protosw structures for both address families.
Check proto in encapcheck function and use -1 as proto argument in
encap_attach_func(), both address families can have IPPROTO_IPV4
and IPPROTO_IPV6 protocols.
Reported by: bz
RFC3173 says that the IP datagram MUST be sent in the original
non-compressed form, when the total size of a compressed payload
and the IPComp header is not smaller than the size of the original
payload. In tunnel mode for small packets IPComp will send
encapsulated IP datagrams without IPComp header.
Add ip_encap handler for IPPROTO_IPV4 and IPPROTO_IPV6 to handle
these datagrams. The handler does lookup for SA related to IPComp
protocol and given from mbuf source and destination addresses as
tunnel endpoints. It decapsulates packets only when corresponding SA
is found.
Reported by: gnn
Reviewed by: gnn
Differential Revision: https://reviews.freebsd.org/D6062
Use hhook(9) framework to achieve ability of loading and unloading
if_enc(4) kernel module. INET and INET6 code on initialization registers
two helper hooks points in the kernel. if_enc(4) module uses these helper
hook points and registers its hooks. IPSEC code uses these hhook points
to call helper hooks implemented in if_enc(4).
Set zero ivsize for enc_xform_null and remove special handling from
xform_esp.c.
Reviewed by: gnn
Differential Revision: https://reviews.freebsd.org/D1503
degradation (7%) for host host TCP connections over 10Gbps links,
even when there were no secuirty policies in place. There is no
change in performance on 1Gbps network links. Testing GENERIC vs.
GENERIC-NOIPSEC vs. GENERIC with this change shows that the new
code removes any overhead introduced by having IPSEC always in the
kernel.
Differential Revision: D3993
MFC after: 1 month
Sponsored by: Rubicon Communications (Netgate)
Currently we perform crypto requests for IPSEC synchronous for most of
crypto providers (software, aesni) and only VIA padlock calls crypto
callback asynchronous. In synchronous mode it is possible, that security
policy will be removed during the processing crypto request. And crypto
callback will release the last reference to SP. Then upon return into
ipsec[46]_process_packet() IPSECREQUEST_UNLOCK() will be called to already
freed request. To prevent this we will take extra reference to SP.
PR: 201876
Sponsored by: Yandex LLC
defines the keys differently than NIST does, so we have to muck with
key lengths and nonce/IVs to be standard compliant...
Remove the iv from secasvar as it was unused...
Add a counter protected by a mutex to ensure that the counter for GCM
and ICM will never be repeated.. This is a requirement for security..
I would use atomics, but we don't have a 64bit one on all platforms..
Fix a bug where IPsec was depending upon the OCF to ensure that the
blocksize was always at least 4 bytes to maintain alignment... Move
this logic into IPsec so changes to OCF won't break IPsec...
In one place, espx was always non-NULL, so don't test that it's
non-NULL before doing work..
minor style cleanups...
drop setting key and klen as they were not used...
Enforce that OCF won't pass invalid key lengths to AES that would
panic the machine...
This was has been tested by others too... I tested this against
NetBSD 6.1.5 using mini-test suite in
https://github.com/jmgurney/ipseccfgs and the only things that don't
pass are keyed md5 and sha1, and 3des-deriv (setkey syntax error),
all other modes listed in setkey's man page... The nice thing is
that NetBSD uses setkey, so same config files were used on both...
Reviewed by: gnn
use CTASSERTs now that we have them...
Replace a draft w/ RFC that's over 10 years old.
Note that _AALG and _EALG do not need to match what the IKE daemons
think they should be.. This is part of the KABI... I decided to
renumber AESCTR, but since we've never had working AESCTR mode, I'm
not really breaking anything.. and it shortens a loop by quite
a bit..
remove SKIPJACK IPsec support... SKIPJACK never made it out of draft
(in 1999), only has 80bit key, NIST recommended it stop being used
after 2010, and setkey nor any of the IKE daemons I checked supported
it...
jmgurney/ipsecgcm: a357a33, c75808b, e008669, b27b6d6
Reviewed by: gnn (earlier version)
The IPsec SA statistic keeping is used even for decision making on expiry/rekeying SAs.
When there are multiple transformations being done the statistic keeping might be wrong.
This mostly impacts multiple encapsulations on IPsec since the usual scenario it is not noticed due to the code path not taken.
Differential Revision: https://reviews.freebsd.org/D3239
Reviewed by: ae, gnn
Approved by: gnn(mentor)
problems that was introduced in r285336... I have verified that
HMAC-SHA2-256 both ah only and w/ AES-CBC interoperate w/ a NetBSD
6.1.5 vm...
Reviewed by: gnn
mode and with hardware support on systems that have AESNI instructions.
Differential Revision: D2936
Reviewed by: jmg, eri, cognet
Sponsored by: Rubicon Communications (Netgate)
When IPSEC is enabled on the kernel the forwarding path has an optimization to not enter the code paths
for checking security policies but first checks if there is any security policy active at all.
The patch introduces the same optimization but for traffic generated from the host itself.
This reduces the overhead by 50% on my tests for generated host traffic without and SP active.
Differential Revision: https://reviews.freebsd.org/D2980
Reviewed by: ae, gnn
Approved by: gnn(mentor)
years for head. However, it is continuously misused as the mpsafe argument
for callout_init(9). Deprecate the flag and clean up callout_init() calls
to make them more consistent.
Differential Revision: https://reviews.freebsd.org/D2613
Reviewed by: jhb
MFC after: 2 weeks
extension header type. The key_flush_sad() now will send SADB_EXPIRE
message when HARD lifetime expires. This is required by RFC 2367 and some
keying daemons rely on these messages. HARD lifetime messages have
precedence over SOFT lifetime messages, so now they will be checked first.
Also now SADB_EXPIRE messages will be send even the SA has not been used,
because keying daemons might want to rekey such SA.
PR: 200282, 200283
Submitted by: Tobias Brunner <tobias at strongswan dot org>
MFC after: 2 weeks