freebsd-dev

Author	SHA1	Message	Date
John Baldwin	dae61c9d09	Simplify IPsec transform-specific teardown. - Rename from the teardown callback from 'zeroize' to 'cleanup' since this no longer zeroes keys. - Change the callback return type to void. Nothing checked the return value and it was always zero. - Don't have esp call into ah since it no longer needs to depend on this to clear the auth key. Instead, both are now private and self-contained. Reviewed by: delphij Sponsored by: Chelsio Communications Differential Revision: https://reviews.freebsd.org/D25443	2020-06-25 23:59:16 +00:00
John Baldwin	20869b25cc	Use zfree() to explicitly zero IPsec keys. Reviewed by: delphij Sponsored by: Chelsio Communications Differential Revision: https://reviews.freebsd.org/D25442	2020-06-25 20:31:06 +00:00
John Baldwin	28d2a72bbf	Consistently include opt_ipsec.h for consumers of <netipsec/ipsec.h>. This fixes ipsec.ko to include all of IPSEC_DEBUG. Reviewed by: imp MFC after: 2 weeks Sponsored by: Netflix Differential Revision: https://reviews.freebsd.org/D25046	2020-05-29 19:22:40 +00:00
John Baldwin	9c0e3d3a53	Add support for optional separate output buffers to in-kernel crypto. Some crypto consumers such as GELI and KTLS for file-backed sendfile need to store their output in a separate buffer from the input. Currently these consumers copy the contents of the input buffer into the output buffer and queue an in-place crypto operation on the output buffer. Using a separate output buffer avoids this copy. - Create a new 'struct crypto_buffer' describing a crypto buffer containing a type and type-specific fields. crp_ilen is gone, instead buffers that use a flat kernel buffer have a cb_buf_len field for their length. The length of other buffer types is inferred from the backing store (e.g. uio_resid for a uio). Requests now have two such structures: crp_buf for the input buffer, and crp_obuf for the output buffer. - Consumers now use helper functions (crypto_use_, e.g. crypto_use_mbuf()) to configure the input buffer. If an output buffer is not configured, the request still modifies the input buffer in-place. A consumer uses a second set of helper functions (crypto_use_output_) to configure an output buffer. - Consumers must request support for separate output buffers when creating a crypto session via the CSP_F_SEPARATE_OUTPUT flag and are only permitted to queue a request with a separate output buffer on sessions with this flag set. Existing drivers already reject sessions with unknown flags, so this permits drivers to be modified to support this extension without requiring all drivers to change. - Several data-related functions now have matching versions that operate on an explicit buffer (e.g. crypto_apply_buf, crypto_contiguous_subsegment_buf, bus_dma_load_crp_buf). - Most of the existing data-related functions operate on the input buffer. However crypto_copyback always writes to the output buffer if a request uses a separate output buffer. - For the regions in input/output buffers, the following conventions are followed: - AAD and IV are always present in input only and their fields are offsets into the input buffer. - payload is always present in both buffers. If a request uses a separate output buffer, it must set a new crp_payload_start_output field to the offset of the payload in the output buffer. - digest is in the input buffer for verify operations, and in the output buffer for compute operations. crp_digest_start is relative to the appropriate buffer. - Add a crypto buffer cursor abstraction. This is a more general form of some bits in the cryptosoft driver that tried to always use uio's. However, compared to the original code, this avoids rewalking the uio iovec array for requests with multiple vectors. It also avoids allocate an iovec array for mbufs and populating it by instead walking the mbuf chain directly. - Update the cryptosoft(4) driver to support separate output buffers making use of the cursor abstraction. Sponsored by: Netflix Differential Revision: https://reviews.freebsd.org/D24545	2020-05-25 22:12:04 +00:00
John Baldwin	897e43124e	Don't pass bogus keys down for NULL algorithms. The changes in r359374 added various sanity checks in sessions and requests created by crypto consumers in part to permit backend drivers to make assumptions instead of duplicating checks for various edge cases. One of the new checks was to reject sessions which provide a pointer to a key while claiming the key is zero bits long. IPsec ESP tripped over this as it passes along whatever key is provided for NULL, including a pointer to a zero-length key when an empty string ("") is used with setkey(8). One option would be to teach the IPsec key layer to not allocate keys of zero length, but I went with a simpler fix of just not passing any keys down and always using a key length of zero for NULL algorithms. PR: 245832 Reported by: CI	2020-05-02 01:00:29 +00:00
John Baldwin	16aabb761c	Remove support for IPsec algorithms deprecated in r348205 and r360202. Examples of depecrated algorithms in manual pages and sample configs are updated where relevant. I removed the one example of combining ESP and AH (vs using a cipher and auth in ESP) as RFC 8221 says this combination is NOT RECOMMENDED. Specifically, this removes support for the following ciphers: - des-cbc - 3des-cbc - blowfish-cbc - cast128-cbc - des-deriv - des-32iv - camellia-cbc This also removes support for the following authentication algorithms: - hmac-md5 - keyed-md5 - keyed-sha1 - hmac-ripemd160 Reviewed by: cem, gnn (older verisons) Relnotes: yes Sponsored by: Chelsio Communications Differential Revision: https://reviews.freebsd.org/D24342	2020-05-02 00:06:58 +00:00
John Baldwin	c034143269	Refactor driver and consumer interfaces for OCF (in-kernel crypto). - The linked list of cryptoini structures used in session initialization is replaced with a new flat structure: struct crypto_session_params. This session includes a new mode to define how the other fields should be interpreted. Available modes include: - COMPRESS (for compression/decompression) - CIPHER (for simply encryption/decryption) - DIGEST (computing and verifying digests) - AEAD (combined auth and encryption such as AES-GCM and AES-CCM) - ETA (combined auth and encryption using encrypt-then-authenticate) Additional modes could be added in the future (e.g. if we wanted to support TLS MtE for AES-CBC in the kernel we could add a new mode for that. TLS modes might also affect how AAD is interpreted, etc.) The flat structure also includes the key lengths and algorithms as before. However, code doesn't have to walk the linked list and switch on the algorithm to determine which key is the auth key vs encryption key. The 'csp_auth_' fields are always used for auth keys and settings and 'csp_cipher_' for cipher. (Compression algorithms are stored in csp_cipher_alg.) - Drivers no longer register a list of supported algorithms. This doesn't quite work when you factor in modes (e.g. a driver might support both AES-CBC and SHA2-256-HMAC separately but not combined for ETA). Instead, a new 'crypto_probesession' method has been added to the kobj interface for symmteric crypto drivers. This method returns a negative value on success (similar to how device_probe works) and the crypto framework uses this value to pick the "best" driver. There are three constants for hardware (e.g. ccr), accelerated software (e.g. aesni), and plain software (cryptosoft) that give preference in that order. One effect of this is that if you request only hardware when creating a new session, you will no longer get a session using accelerated software. Another effect is that the default setting to disallow software crypto via /dev/crypto now disables accelerated software. Once a driver is chosen, 'crypto_newsession' is invoked as before. - Crypto operations are now solely described by the flat 'cryptop' structure. The linked list of descriptors has been removed. A separate enum has been added to describe the type of data buffer in use instead of using CRYPTO_F_* flags to make it easier to add more types in the future if needed (e.g. wired userspace buffers for zero-copy). It will also make it easier to re-introduce separate input and output buffers (in-kernel TLS would benefit from this). Try to make the flags related to IV handling less insane: - CRYPTO_F_IV_SEPARATE means that the IV is stored in the 'crp_iv' member of the operation structure. If this flag is not set, the IV is stored in the data buffer at the 'crp_iv_start' offset. - CRYPTO_F_IV_GENERATE means that a random IV should be generated and stored into the data buffer. This cannot be used with CRYPTO_F_IV_SEPARATE. If a consumer wants to deal with explicit vs implicit IVs, etc. it can always generate the IV however it needs and store partial IVs in the buffer and the full IV/nonce in crp_iv and set CRYPTO_F_IV_SEPARATE. The layout of the buffer is now described via fields in cryptop. crp_aad_start and crp_aad_length define the boundaries of any AAD. Previously with GCM and CCM you defined an auth crd with this range, but for ETA your auth crd had to span both the AAD and plaintext (and they had to be adjacent). crp_payload_start and crp_payload_length define the boundaries of the plaintext/ciphertext. Modes that only do a single operation (COMPRESS, CIPHER, DIGEST) should only use this region and leave the AAD region empty. If a digest is present (or should be generated), it's starting location is marked by crp_digest_start. Instead of using the CRD_F_ENCRYPT flag to determine the direction of the operation, cryptop now includes an 'op' field defining the operation to perform. For digests I've added a new VERIFY digest mode which assumes a digest is present in the input and fails the request with EBADMSG if it doesn't match the internally-computed digest. GCM and CCM already assumed this, and the new AEAD mode requires this for decryption. The new ETA mode now also requires this for decryption, so IPsec and GELI no longer do their own authentication verification. Simple DIGEST operations can also do this, though there are no in-tree consumers. To eventually support some refcounting to close races, the session cookie is now passed to crypto_getop() and clients should no longer set crp_sesssion directly. - Assymteric crypto operation structures should be allocated via crypto_getkreq() and freed via crypto_freekreq(). This permits the crypto layer to track open asym requests and close races with a driver trying to unregister while asym requests are in flight. - crypto_copyback, crypto_copydata, crypto_apply, and crypto_contiguous_subsegment now accept the 'crp' object as the first parameter instead of individual members. This makes it easier to deal with different buffer types in the future as well as separate input and output buffers. It's also simpler for driver writers to use. - bus_dmamap_load_crp() loads a DMA mapping for a crypto buffer. This understands the various types of buffers so that drivers that use DMA do not have to be aware of different buffer types. - Helper routines now exist to build an auth context for HMAC IPAD and OPAD. This reduces some duplicated work among drivers. - Key buffers are now treated as const throughout the framework and in device drivers. However, session key buffers provided when a session is created are expected to remain alive for the duration of the session. - GCM and CCM sessions now only specify a cipher algorithm and a cipher key. The redundant auth information is not needed or used. - For cryptosoft, split up the code a bit such that the 'process' callback now invokes a function pointer in the session. This function pointer is set based on the mode (in effect) though it simplifies a few edge cases that would otherwise be in the switch in 'process'. It does split up GCM vs CCM which I think is more readable even if there is some duplication. - I changed /dev/crypto to support GMAC requests using CRYPTO_AES_NIST_GMAC as an auth algorithm and updated cryptocheck to work with it. - Combined cipher and auth sessions via /dev/crypto now always use ETA mode. The COP_F_CIPHER_FIRST flag is now a no-op that is ignored. This was actually documented as being true in crypto(4) before, but the code had not implemented this before I added the CIPHER_FIRST flag. - I have not yet updated /dev/crypto to be aware of explicit modes for sessions. I will probably do that at some point in the future as well as teach it about IV/nonce and tag lengths for AEAD so we can support all of the NIST KAT tests for GCM and CCM. - I've split up the exising crypto.9 manpage into several pages of which many are written from scratch. - I have converted all drivers and consumers in the tree and verified that they compile, but I have not tested all of them. I have tested the following drivers: - cryptosoft - aesni (AES only) - blake2 - ccr and the following consumers: - cryptodev - IPsec - ktls_ocf - GELI (lightly) I have not tested the following: - ccp - aesni with sha - hifn - kgssapi_krb5 - ubsec - padlock - safe - armv8_crypto (aarch64) - glxsb (i386) - sec (ppc) - cesa (armv7) - cryptocteon (mips64) - nlmsec (mips64) Discussed with: cem Relnotes: yes Sponsored by: Chelsio Communications Differential Revision: https://reviews.freebsd.org/D23677	2020-03-27 18:25:23 +00:00
Bjoern A. Zeeb	a4adf6cc65	Fix m_pullup() problem after removing PULLDOWN_TESTs and KAME EXT_*macros. r354748-354750 replaced the KAME macros with m_pulldown() calls. Contrary to the rest of the network stack m_len checks before m_pulldown() were not put in placed (see r354748). Put these m_len checks in place for now (to go along with the style of the network stack since the initial commits). These are not put in for performance but to avoid an error scenario (even though it also will help performance at the moment as it avoid allocating an extra mbuf; not because of the unconditional function call). The observed error case went like this: (1) an mbuf with M_EXT arrives and we call m_pullup() unconditionally on it. (2) m_pullup() will call m_get() unless the requested length is larger than MHLEN (in which case it'll m_freem() the perfectly fine mbuf) and migrate the requested length of data and pkthdr into the new mbuf. (3) If m_get() succeeds, a further m_pullup() call going over MHLEN will fail. This was observed with failing auto-configuration as an RA packet of 200 bytes exceeded MHLEN and the m_pullup() called from nd6_ra_input() dropped the mbuf. (Re-)adding the m_len checks before m_pullup() calls avoids this problems with mbufs using external storage for now. MFC after: 3 weeks Sponsored by: Netflix	2019-12-01 00:22:04 +00:00
Bjoern A. Zeeb	63abacc204	netinet*: replace IP6_EXTHDR_GET() In a few places we have IP6_EXTHDR_GET() left in upper layer protocols. The IP6_EXTHDR_GET() macro might perform an m_pulldown() in case the data fragment is not contiguous. Convert these last remaining instances into m_pullup()s instead. In CARP, for example, we will a few lines later call m_pullup() anyway, the IPsec code coming from OpenBSD would otherwise have done the m_pullup() and are copying the data a bit later anyway, so pulling it in seems no better or worse. Note: this leaves very few m_pulldown() cases behind in the tree and we might want to consider removing them as well to make mbuf management easier again on a path to variable size mbufs, especially given m_pulldown() still has an issue not re-checking M_WRITEABLE(). Reviewed by: gallatin MFC after: 8 weeks Sponsored by: Netflix Differential Revision: https://reviews.freebsd.org/D22335	2019-11-15 21:44:17 +00:00
John Baldwin	0f70218343	Make the warning intervals for deprecated crypto algorithms tunable. New sysctl/tunables can now set the interval (in seconds) between rate-limited crypto warnings. The new sysctls are: - kern.cryptodev_warn_interval for /dev/crypto - net.inet.ipsec.crypto_warn_interval for IPsec - kern.kgssapi_warn_interval for KGSSAPI Reviewed by: cem MFC after: 1 month Relnotes: yes Sponsored by: Chelsio Communications Differential Revision: https://reviews.freebsd.org/D20555	2019-06-11 23:00:55 +00:00
John Baldwin	c2fd516f3a	Add deprecation warnings for IPsec algorithms deprecated in RFC 8221. All of these algorithms are either explicitly marked MUST NOT, or they are implicitly MUST NOTs by virtue of not being included in IETF's list of protocols at all despite having assignments from IANA. Specifically, this adds warnings for the following ciphers: - des-cbc - blowfish-cbc - cast128-cbc - des-deriv - des-32iv - camellia-cbc Warnings for the following authentication algorithms are also added: - hmac-md5 - keyed-md5 - keyed-sha1 - hmac-ripemd160 Reviewed by: cem, gnn MFC after: 3 days Sponsored by: Chelsio Communications Differential Revision: https://reviews.freebsd.org/D20340	2019-05-23 22:06:57 +00:00
Conrad Meyer	1b0909d51a	OpenCrypto: Convert sessions to opaque handles instead of integers Track session objects in the framework, and pass handles between the framework (OCF), consumers, and drivers. Avoid redundancy and complexity in individual drivers by allocating session memory in the framework and providing it to drivers in ::newsession(). Session handles are no longer integers with information encoded in various high bits. Use of the CRYPTO_SESID2FOO() macros should be replaced with the appropriate crypto_ses2foo() function on the opaque session handle. Convert OCF drivers (in particular, cryptosoft, as well as myriad others) to the opaque handle interface. Discard existing session tracking as much as possible (quick pass). There may be additional code ripe for deletion. Convert OCF consumers (ipsec, geom_eli, krb5, cryptodev) to handle-style interface. The conversion is largely mechnical. The change is documented in crypto.9. Inspired by https://lists.freebsd.org/pipermail/freebsd-arch/2018-January/018835.html . No objection from: ae (ipsec portion) Reported by: jhb	2018-07-18 00:56:25 +00:00
Conrad Meyer	2e08e39ff5	OCF: Add a typedef for session identifiers No functional change. This should ease the transition from an integer session identifier model to an opaque pointer model.	2018-07-13 23:46:07 +00:00
Conrad Meyer	ede2f7731d	Correctly handle the padding for IPv6-AH, as specified by RFC4302 The RFC specifies that under IPv6 the complete AH header must be 64 bit aligned, and under IPv4, 32 bit aligned. Prior to this change, we (along with other BSDs and MacOS) had violated this requirement. This makes it possible to set up IPv6-AH between Linux and BSD, and also probably between Windows and BSD. PR: 222684 Reported and tested by: Jason Mader <jasonmader AT gmail.com> Obtained from: NetBSD xform_ah.c 1.105 (b939fe2483972eb43d71bf990cfb7f26dece7839 NetBSD/src on GH) by Maxime Villard MFC after: 35.2731 hours Relnotes: probably (breaks ipv6 compat with older FreeBSD/NetBSD/MacOS) Sponsored by: Dell EMC Isilon	2018-06-04 18:51:06 +00:00
John Baldwin	fd40ecf3d4	Set the proper vnet in IPsec callback functions. When using hardware crypto engines, the callback functions used to handle an IPsec packet after it has been encrypted or decrypted can be invoked asynchronously from a worker thread that is not associated with a vnet. Extend 'struct xform_data' to include a vnet pointer and save the current vnet in this new member when queueing crypto requests in IPsec. In the IPsec callback routines, use the new member to set the current vnet while processing the modified packet. This fixes a panic when using hardware offload such as ccr(4) with IPsec after VIMAGE was enabled in GENERIC. Reported by: Sony Arpita Das and Harsh Jain @ Chelsio Reviewed by: bz MFC after: 1 week Sponsored by: Chelsio Communications Differential Revision: https://reviews.freebsd.org/D14763	2018-03-20 17:05:23 +00:00
Andrey V. Elsukov	6ca39da354	Check packet length to do not make out of bounds access. Also save ah_nxt value to use it later, since ah pointer can become invalid. Reported by: Maxime Villard <max at m00nbsd dot net> MFC after: 5 days	2018-02-19 11:14:38 +00:00
Andrey V. Elsukov	055679e67e	Adopt revision 1.76 and 1.77 from NetBSD: Fix a vulnerability in IPsec-IPv6-AH, that allows an attacker to remotely crash the kernel with a single packet. In this loop we need to increment 'ad' by two, because the length field of the option header does not count the size of the option header itself. If the length is zero, then 'count' is incremented by zero, and there's an infinite loop. Beyond that, this code was written with the assumption that since the IPv6 packet already went through the generic IPv6 option parser, several fields are guaranteed to be valid; but this assumption does not hold because of the missing '+2', and there's as a result a triggerable buffer overflow (write zeros after the end of the mbuf, potentially to the next mbuf in memory since it's a pool). Add the missing '+2', this place will be reinforced in separate commits. Reported by: Maxime Villard <maxv at NetBSD.org> MFC after: 1 week	2018-01-24 19:48:25 +00:00
Andrey V. Elsukov	b12a7532e3	Merge revision 1.35 from NetBSD: fix pointer/offset mistakes in handling of IPv4 options Reported by: Maxime Villard <maxv at NetBSD.org> MFC after: 1 week	2018-01-24 19:06:44 +00:00
Alexander Kabaev	151ba7933a	Do pass removing some write-only variables from the kernel. This reduces noise when kernel is compiled by newer GCC versions, such as one used by external toolchain ports. Reviewed by: kib, andrew(sys/arm and sys/arm64), emaste(partial), erj(partial) Reviewed by: jhb (sys/dev/pci/* sys/kern/vfs_aio.c and sys/kern/kern_synch.c) Differential Revision: https://reviews.freebsd.org/D10385	2017-12-25 04:48:39 +00:00
Fabien Thomas	39bbca6ffd	crypto(9) is called from ipsec in CRYPTO_F_CBIFSYNC mode. This is working fine when a lot of different flows to be ciphered/deciphered are involved. However, when a software crypto driver is used, there are situations where we could benefit from making crypto(9) multi threaded: - a single flow is to be ciphered: only one thread is used to cipher it, - a single ESP flow is to be deciphered: only one thread is used to decipher it. The idea here is to call crypto(9) using a new mode (CRYPTO_F_ASYNC) to dispatch the crypto jobs on multiple threads, if the underlying crypto driver is working in synchronous mode. Another flag is added (CRYPTO_F_ASYNC_KEEPORDER) to make crypto(9) dispatch the crypto jobs in the order they are received (an additional queue/thread is used), so that the packets are reinjected in the network using the same order they were posted. A new sysctl net.inet.ipsec.async_crypto can be used to activate this new behavior (disabled by default). Submitted by: Emeric Poupon <emeric.poupon@stormshield.eu> Reviewed by: ae, jmg, jhb Differential Revision: https://reviews.freebsd.org/D10680 Sponsored by: Stormshield	2017-11-03 10:27:22 +00:00
Conrad Meyer	3693b18840	opencrypto: Loosen restriction on HMAC key sizes Theoretically, HMACs do not actually have any limit on key sizes. Transforms should compact input keys larger than the HMAC block size by using the transform (hash) on the input key. (Short input keys are padded out with zeros to the HMAC block size.) Still, not all FreeBSD crypto drivers that provide HMAC functionality handle longer-than-blocksize keys appropriately, so enforce a "maximum" key length in the crypto API for auth_hashes that previously expressed a requirement. (The "maximum" is the size of a single HMAC block for the given transform.) Unconstrained auth_hashes are left as-is. I believe the previous hardcoded sizes were committed in the original import of opencrypto from OpenBSD and are due to specific protocol details of IPSec. Note that none of the previous sizes actually matched the appropriate HMAC block size. The previous hardcoded sizes made the SHA tests in cryptotest.py useless for testing FreeBSD crypto drivers; none of the NIST-KAT example inputs had keys sized to the previous expectations. The following drivers were audited to check that they handled keys up to the block size of the HMAC safely: Software HMAC: * padlock(4) * cesa * glxsb * safe(4) * ubsec(4) Hardware accelerated HMAC: * ccr(4) * hifn(4) * sec(4) (Only supports up to 64 byte keys despite claiming to support SHA2 HMACs, but validates input key sizes) * cryptocteon (MIPS) * nlmsec (MIPS) * rmisec (MIPS) (Amusingly, does not appear to use key material at all -- presumed broken) Reviewed by: jhb (previous version), rlibby (previous version) Sponsored by: Dell EMC Isilon Differential Revision: https://reviews.freebsd.org/D12437	2017-09-26 16:18:10 +00:00
Andrey V. Elsukov	7f1f65918b	Disable IPsec debugging code by default when IPSEC_DEBUG kernel option is not specified. Due to the long call chain IPsec code can produce the kernel stack exhaustion on the i386 architecture. The debugging code usually is not used, but it requires a lot of stack space to keep buffers for strings formatting. This patch conditionally defines macros to disable building of IPsec debugging code. IPsec currently has two sysctl variables to configure debug output: * net.key.debug variable is used to enable debug output for PF_KEY protocol. Such debug messages are produced by KEYDBG() macro and usually they can be interesting for developers. * net.inet.ipsec.debug variable is used to enable debug output for DPRINTF() macro and ipseclog() function. DPRINTF() macro usually is used for development debugging. ipseclog() function is used for debugging by administrator. The patch disables KEYDBG() and DPRINTF() macros, and formatting buffers declarations when IPSEC_DEBUG is not present in kernel config. This reduces stack requirement for up to several hundreds of bytes. The net.inet.ipsec.debug variable still can be used to enable ipseclog() messages by administrator. PR: 219476 Reported by: eugen No objection from: #network MFC after: 1 week Differential Revision: https://reviews.freebsd.org/D10869	2017-05-29 09:30:38 +00:00
Andrey V. Elsukov	3aee70991d	Fix possible double releasing for SA and SP references. There are two possible ways how crypto callback are called: directly from caller and deffered from crypto thread. For outbound packets the direct call chain is the following: IPSEC_OUTPUT() method -> ipsec[46]_common_output() -> -> ipsec[46]_perform_request() -> xform_output() -> -> crypto_dispatch() -> crypto_invoke() -> crypto_done() -> -> xform_output_cb() -> ipsec_process_done() -> ip[6]_output(). The SA and SP references are held while crypto processing is not finished. The error handling code wrongly expected that crypto callback always called from the crypto thread context, and it did references releasing in xform_output_cb(). But when the crypto callback called directly, in case of error the error handling code in ipsec[46]_perform_request() also did references releasing. To fix this, remove error handling from ipsec[46]_perform_request() and do it in xform_output() before crypto_dispatch(). MFC after: 10 days	2017-05-23 09:32:26 +00:00
Andrey V. Elsukov	5f7c516f21	Fix possible double releasing for SA reference. There are two possible ways how crypto callback are called: directly from caller and deffered from crypto thread. For inbound packets the direct call chain is the following: IPSEC_INPUT() method -> ipsec_common_input() -> xform_input() -> -> crypto_dispatch() -> crypto_invoke() -> crypto_done() -> -> xform_input_cb() -> ipsec[46]_common_input_cb() -> netisr_queue(). The SA reference is held while crypto processing is not finished. The error handling code wrongly expected that crypto callback always called from the crypto thread context, and it did SA reference releasing in xform_input_cb(). But when the crypto callback called directly, in case of error (e.g. data authentification failed) the error handling in ipsec_common_input() also did SA reference releasing. To fix this, remove error handling from ipsec_common_input() and do it in xform_input() before crypto_dispatch(). PR: 219356 MFC after: 10 days	2017-05-23 09:01:48 +00:00
Andrey V. Elsukov	fcf596178b	Merge projects/ipsec into head/. Small summary ------------- o Almost all IPsec releated code was moved into sys/netipsec. o New kernel modules added: ipsec.ko and tcpmd5.ko. New kernel option IPSEC_SUPPORT added. It enables support for loading and unloading of ipsec.ko and tcpmd5.ko kernel modules. o IPSEC_NAT_T option was removed. Now NAT-T support is enabled by default. The UDP_ENCAP_ESPINUDP_NON_IKE encapsulation type support was removed. Added TCP/UDP checksum handling for inbound packets that were decapsulated by transport mode SAs. setkey(8) modified to show run-time NAT-T configuration of SA. o New network pseudo interface if_ipsec(4) added. For now it is build as part of ipsec.ko module (or with IPSEC kernel). It implements IPsec virtual tunnels to create route-based VPNs. o The network stack now invokes IPsec functions using special methods. The only one header file <netipsec/ipsec_support.h> should be included to declare all the needed things to work with IPsec. o All IPsec protocols handlers (ESP/AH/IPCOMP protosw) were removed. Now these protocols are handled directly via IPsec methods. o TCP_SIGNATURE support was reworked to be more close to RFC. o PF_KEY SADB was reworked: - now all security associations stored in the single SPI namespace, and all SAs MUST have unique SPI. - several hash tables added to speed up lookups in SADB. - SADB now uses rmlock to protect access, and concurrent threads can do SA lookups in the same time. - many PF_KEY message handlers were reworked to reflect changes in SADB. - SADB_UPDATE message was extended to support new PF_KEY headers: SADB_X_EXT_NEW_ADDRESS_SRC and SADB_X_EXT_NEW_ADDRESS_DST. They can be used by IKE daemon to change SA addresses. o ipsecrequest and secpolicy structures were cardinally changed to avoid locking protection for ipsecrequest. Now we support only limited number (4) of bundled SAs, but they are supported for both INET and INET6. o INPCB security policy cache was introduced. Each PCB now caches used security policies to avoid SP lookup for each packet. o For inbound security policies added the mode, when the kernel does check for full history of applied IPsec transforms. o References counting rules for security policies and security associations were changed. The proper SA locking added into xform code. o xform code was also changed. Now it is possible to unregister xforms. tdb_xxx structures were changed and renamed to reflect changes in SADB/SPDB, and changed rules for locking and refcounting. Reviewed by: gnn, wblock Obtained from: Yandex LLC Relnotes: yes Sponsored by: Yandex LLC Differential Revision: https://reviews.freebsd.org/D9352	2017-02-06 08:49:57 +00:00
Fabien Thomas	bf4356266d	IPsec RFC6479 support for replay window sizes up to 2^32 - 32 packets. Since the previous algorithm, based on bit shifting, does not scale with large replay windows, the algorithm used here is based on RFC 6479: IPsec Anti-Replay Algorithm without Bit Shifting. The replay window will be fast to be updated, but will cost as many bits in RAM as its size. The previous implementation did not provide a lock on the replay window, which may lead to replay issues. Reviewed by: ae Obtained from: emeric.poupon@stormshield.eu Sponsored by: Stormshield Differential Revision: https://reviews.freebsd.org/D8468	2016-11-25 14:44:49 +00:00
Andrey V. Elsukov	f367798498	Take extra reference to security policy before calling crypto_dispatch(). Currently we perform crypto requests for IPSEC synchronous for most of crypto providers (software, aesni) and only VIA padlock calls crypto callback asynchronous. In synchronous mode it is possible, that security policy will be removed during the processing crypto request. And crypto callback will release the last reference to SP. Then upon return into ipsec[46]_process_packet() IPSECREQUEST_UNLOCK() will be called to already freed request. To prevent this we will take extra reference to SP. PR: 201876 Sponsored by: Yandex LLC	2015-09-30 08:16:33 +00:00
John-Mark Gurney	42e5fcbf2b	these are comparing authenticators and need to be constant time... This could be a side channel attack... Now that we have a function for this, use it... jmgurney/ipsecgcm: 24d704cc and 7f37a14	2015-07-31 00:31:52 +00:00
John-Mark Gurney	a09a7146a7	RFC4868 section 2.3 requires that the output be half... This fixes problems that was introduced in r285336... I have verified that HMAC-SHA2-256 both ah only and w/ AES-CBC interoperate w/ a NetBSD 6.1.5 vm... Reviewed by: gnn	2015-07-29 07:15:16 +00:00
George V. Neville-Neil	16de9ac1b5	Add support for AES modes to IPSec. These modes work both in software only mode and with hardware support on systems that have AESNI instructions. Differential Revision: D2936 Reviewed by: jmg, eri, cognet Sponsored by: Rubicon Communications (Netgate)	2015-07-09 18:16:35 +00:00
Andrey V. Elsukov	3d80e82d60	Fix possible use after free due to security policy deletion. When we are passing mbuf to IPSec processing via ipsec[46]_process_packet(), we hold one reference to security policy and release it just after return from this function. But IPSec processing can be deffered and when we release reference to security policy after ipsec[46]_process_packet(), user can delete this security policy from SPDB. And when IPSec processing will be done, xform's callback function will do access to already freed memory. To fix this move KEY_FREESP() into callback function. Now IPSec code will release reference to SP after processing will be finished. Differential Revision: https://reviews.freebsd.org/D2324 No objections from: #network Sponsored by: Yandex LLC	2015-04-27 00:55:56 +00:00
Andrey V. Elsukov	962ac6c727	Change ipsec_address() and ipsec_logsastr() functions to take two additional arguments - buffer and size of this buffer. ipsec_address() is used to convert sockaddr structure to presentation format. The IPv6 part of this function returns pointer to the on-stack buffer and at the moment when it will be used by caller, it becames invalid. IPv4 version uses 4 static buffers and returns pointer to new buffer each time when it called. But anyway it is still possible to get corrupted data when several threads will use this function. ipsec_logsastr() is used to format string about SA entry. It also uses static buffer and has the same problem with concurrent threads. To fix these problems add the buffer pointer and size of this buffer to arguments. Now each caller will pass buffer and its size to these functions. Also convert all places where these functions are used (except disabled code). And now ipsec_address() uses inet_ntop() function from libkern. PR: 185996 Differential Revision: https://reviews.freebsd.org/D2321 Reviewed by: gnn Sponsored by: Yandex LLC	2015-04-18 16:58:33 +00:00
Andrey V. Elsukov	f0514a8b8a	Remove now unused mtag argument from ipsec*_common_input_cb. Obtained from: Yandex LLC Sponsored by: Yandex LLC	2014-12-11 17:14:49 +00:00
Andrey V. Elsukov	08537f4526	Remove code related to PACKET_TAG_IPSEC_IN_CRYPTO_DONE mbuf tag. It isn't used in FreeBSD. Obtained from: Yandex LLC Sponsored by: Yandex LLC	2014-12-11 17:07:21 +00:00
Andrey V. Elsukov	2d957916ef	Remove route chaching support from ipsec code. It isn't used for some time. * remove sa_route_union declaration and route_cache member from struct secashead; * remove key_sa_routechange() call from ICMP and ICMPv6 code; * simplify ip_ipsec_mtu(); * remove #include <net/route.h>; Sponsored by: Yandex LLC	2014-12-02 04:20:50 +00:00
Gleb Smirnoff	6df8a71067	Remove SYSCTL_VNET_* macros, and simply put CTLFLAG_VNET where needed. Sponsored by: Nginx, Inc.	2014-11-07 09:39:05 +00:00
Gleb Smirnoff	eedc7fd9e8	Provide includes that are needed in these files, and before were read in implicitly via if.h -> if_var.h pollution. Sponsored by: Netflix Sponsored by: Nginx, Inc.	2013-10-26 18:18:50 +00:00
Andrey V. Elsukov	db8c087944	Migrate structs ahstat, espstat, ipcompstat, ipipstat, pfkeystat, ipsec4stat, ipsec6stat to PCPU counters.	2013-07-09 10:08:13 +00:00
Andrey V. Elsukov	a04d64d875	Use corresponding macros to update statistics for AH, ESP, IPIP, IPCOMP, PFKEY. MFC after: 2 weeks	2013-06-20 11:44:16 +00:00
Gleb Smirnoff	8ad458a471	Do not reduce ip_len by size of IP header in the ip_input() before passing a packet to protocol input routines. For several protocols this mean that now protocol needs to do subtraction itself, and for another half this means that we do not need to add header length back to the packet. Make ip_stripoptions() to adjust ip_len, since now we enter this function with a packet header whose ip_len does represent length of entire packet, not payload only.	2012-10-23 08:33:13 +00:00
Gleb Smirnoff	20472bce45	Couple of changes missed from r241913, which converted IPv4 stack to network byte order.	2012-10-22 22:42:28 +00:00
Pawel Jakub Dawidek	0e4fb1db44	Eliminate 'err' variable and just use existing 'error'.	2011-11-26 23:15:28 +00:00
Pawel Jakub Dawidek	0a95a08ecb	Simplify code a bit.	2011-11-26 23:13:30 +00:00
Bjoern A. Zeeb	db178eb816	Make IPsec compile without INET adding appropriate #ifdef checks. Unfold the IPSEC_COMMON_INPUT_CB() macro in xform_{ah,esp,ipcomp}.c to not need three different versions depending on INET, INET6 or both. Mark two places preparing for not yet supported functionality with IPv6. Reviewed by: gnn Sponsored by: The FreeBSD Foundation Sponsored by: iXsystems MFC after: 4 days	2011-04-27 19:28:42 +00:00
Fabien Thomas	73171433c5	Optimisation in IPSEC(4): - Remove contention on ISR during the crypto operation by using rwlock(9). - Remove a second lookup of the SA in the callback. Gain on 6 cores CPU with SHA1/AES128 can be up to 30%. Reviewed by: vanhu MFC after: 1 month	2011-03-31 15:23:32 +00:00
Fabien Thomas	11d2f4df50	Fix two SA refcount: - AH does not release the SA like in ESP/IPCOMP when handling EAGAIN - ipsec_process_done incorrectly release the SA. Reviewed by: vanhu MFC after: 1 week	2011-03-31 13:14:24 +00:00
VANHULLEBUS Yvan	442da28aeb	Fixed IPsec's HMAC_SHA256-512 support to be RFC4868 compliant. This will break interoperability with all older versions of FreeBSD for those algorithms. Reviewed by: bz, gnn Obtained from: NETASQ MFC after: 1w	2011-02-18 09:40:13 +00:00
Robert Watson	530c006014	Merge the remainder of kern_vimage.c and vimage.h into vnet.c and vnet.h, we now use jails (rather than vimages) as the abstraction for virtualization management, and what remained was specific to virtual network stacks. Minor cleanups are done in the process, and comments updated to reflect these changes. Reviewed by: bz Approved by: re (vimage blanket)	2009-08-01 19:26:27 +00:00
Robert Watson	eddfbb763d	Build on Jeff Roberson's linker-set based dynamic per-CPU allocator (DPCPU), as suggested by Peter Wemm, and implement a new per-virtual network stack memory allocator. Modify vnet to use the allocator instead of monolithic global container structures (vinet, ...). This change solves many binary compatibility problems associated with VIMAGE, and restores ELF symbols for virtualized global variables. Each virtualized global variable exists as a "reference copy", and also once per virtual network stack. Virtualized global variables are tagged at compile-time, placing the in a special linker set, which is loaded into a contiguous region of kernel memory. Virtualized global variables in the base kernel are linked as normal, but those in modules are copied and relocated to a reserved portion of the kernel's vnet region with the help of a the kernel linker. Virtualized global variables exist in per-vnet memory set up when the network stack instance is created, and are initialized statically from the reference copy. Run-time access occurs via an accessor macro, which converts from the current vnet and requested symbol to a per-vnet address. When "options VIMAGE" is not compiled into the kernel, normal global ELF symbols will be used instead and indirection is avoided. This change restores static initialization for network stack global variables, restores support for non-global symbols and types, eliminates the need for many subsystem constructors, eliminates large per-subsystem structures that caused many binary compatibility issues both for monitoring applications (netstat) and kernel modules, removes the per-function INIT_VNET_*() macros throughout the stack, eliminates the need for vnet_symmap ksym(2) munging, and eliminates duplicate definitions of virtualized globals under VIMAGE_GLOBALS. Bump __FreeBSD_version and update UPDATING. Portions submitted by: bz Reviewed by: bz, zec Discussed with: gnn, jamie, jeff, jhb, julian, sam Suggested by: peter Approved by: re (kensmith)	2009-07-14 22:48:30 +00:00
Marko Zec	bfe1aba468	Introduce vnet module registration / initialization framework with dependency tracking and ordering enforcement. With this change, per-vnet initialization functions introduced with r190787 are no longer directly called from traditional initialization functions (which cc in most cases inlined to pre-r190787 code), but are instead registered via the vnet framework first, and are invoked only after all prerequisite modules have been initialized. In the long run, this framework should allow us to both initialize and dismantle multiple vnet instances in a correct order. The problem this change aims to solve is how to replay the initialization sequence of various network stack components, which have been traditionally triggered via different mechanisms (SYSINIT, protosw). Note that this initialization sequence was and still can be subtly different depending on whether certain pieces of code have been statically compiled into the kernel, loaded as modules by boot loader, or kldloaded at run time. The approach is simple - we record the initialization sequence established by the traditional mechanisms whenever vnet_mod_register() is called for a particular vnet module. The vnet_mod_register_multi() variant allows a single initializer function to be registered multiple times but with different arguments - currently this is only used in kern/uipc_domain.c by net_add_domain() with different struct domain * as arguments, which allows for protosw-registered initialization routines to be invoked in a correct order by the new vnet initialization framework. For the purpose of identifying vnet modules, each vnet module has to have a unique ID, which is statically assigned in sys/vimage.h. Dynamic assignment of vnet module IDs is not supported yet. A vnet module may specify a single prerequisite module at registration time by filling in the vmi_dependson field of its vnet_modinfo struct with the ID of the module it depends on. Unless specified otherwise, all vnet modules depend on VNET_MOD_NET (container for ifnet list head, rt_tables etc.), which thus has to and will always be initialized first. The framework will panic if it detects any unresolved dependencies before completing system initialization. Detection of unresolved dependencies for vnet modules registered after boot (kldloaded modules) is not provided. Note that the fact that each module can specify only a single prerequisite may become problematic in the long run. In particular, INET6 depends on INET being already instantiated, due to TCP / UDP structures residing in INET container. IPSEC also depends on INET, which will in turn additionally complicate making INET6-only kernel configs a reality. The entire registration framework can be compiled out by turning on the VIMAGE_GLOBALS kernel config option. Reviewed by: bz Approved by: julian (mentor)	2009-04-11 05:58:58 +00:00

1 2

71 Commits