Commit Graph

504 Commits

Author SHA1 Message Date
Ed Maste
5b2b45a421 r335795 build fix: make static functions static
-Werror,-Wmissing-prototypes makes this an error otherwise.

MFC with:	335795
Sponsored by:	The FreeBSD Foundation
2018-06-29 14:51:36 +00:00
Andrey V. Elsukov
52b3619f45 Make debug output produced by setkey -x command a more human readable.
Add text names of SADB message types and extension headers to the output.

Obtained from:	Yandex LLC
MFC after:	2 weeks
Sponsored by:	Yandex LLC
Differential Revision:	https://reviews.freebsd.org/D16036
2018-06-29 13:59:33 +00:00
Mateusz Guzik
4e180881ae uma: implement provisional api for per-cpu zones
Per-cpu zone allocations are very rarely done compared to regular zones.
The intent is to avoid pessimizing the latter case with per-cpu specific
code.

In particular contrary to the claim in r334824, M_ZERO is sometimes being
used for such zones. But the zeroing method is completely different and
braching on it in the fast path for regular zones is a waste of time.
2018-06-08 21:40:03 +00:00
Andrey V. Elsukov
6d8fdfa9d5 Rework IP encapsulation handling code.
Currently it has several disadvantages:
- it uses single mutex to protect internal structures. It is used by
  data- and control- path, thus there are no parallelism at all.
- it uses single list to keep encap handlers for both INET and INET6
  families.
- struct encaptab keeps unneeded information (src, dst, masks, protosw),
  that isn't used by code in the source tree.
- matches are prioritized and when many tunneling interfaces are
  registered, encapcheck handler of each interface is invoked for each
  packet. The search takes O(n) for n interfaces. All this work is done
  with exclusive lock held.

What this patch includes:
- the datapath is converted to be lockless using epoch(9) KPI.
- struct encaptab now linked using CK_LIST.
- all unused fields removed from struct encaptab. Several new fields
  addedr: min_length is the minimum packet length, that encapsulation
  handler expects to see; exact_match is maximum number of bits, that
  can return an encapsulation handler, when it wants to consume a packet.
- IPv6 and IPv4 handlers are stored in separate lists;
- added new "encap_lookup_t" method, that will be used later. It is
  targeted to speedup lookup of needed interface, when gif(4)/gre(4) have
  many interfaces.
- the need to use protosw structure is eliminated. The only pr_input
  method was used from this structure, so I don't see the need to keep
  using it.
- encap_input_t method changed to avoid using mbuf tags to store softc
  pointer. Now it is passed directly trough encap_input_t method.
  encap_getarg() funtions is removed.
- all sockaddr structures and code that uses them removed. We don't have
  any code in the tree that uses them. All consumers use encap_attach_func()
  method, that relies on invoking of encapcheck() to determine the needed
  handler.
- introduced struct encap_config, it contains parameters of encap handler
  that is going to be registered by encap_attach() function.
- encap handlers are stored in lists ordered by exact_match value, thus
  handlers that need more bits to match will be checked first, and if
  encapcheck method returns exact_match value, the search will be stopped.
- all current consumers changed to use new KPI.

Reviewed by:	mmacy
Sponsored by:	Yandex LLC
Differential Revision:	https://reviews.freebsd.org/D15617
2018-06-05 20:51:01 +00:00
Conrad Meyer
ede2f7731d Correctly handle the padding for IPv6-AH, as specified by RFC4302
The RFC specifies that under IPv6 the complete AH header must be 64 bit
aligned, and under IPv4, 32 bit aligned. Prior to this change, we (along
with other BSDs and MacOS) had violated this requirement.

This makes it possible to set up IPv6-AH between Linux and BSD, and also
probably between Windows and BSD.

PR:		222684
Reported and tested by:	Jason Mader <jasonmader AT gmail.com>
Obtained from:	NetBSD xform_ah.c 1.105
		(b939fe2483972eb43d71bf990cfb7f26dece7839 NetBSD/src on GH)
		by Maxime Villard
MFC after:	35.2731 hours
Relnotes:	probably (breaks ipv6 compat with older FreeBSD/NetBSD/MacOS)
Sponsored by:	Dell EMC Isilon
2018-06-04 18:51:06 +00:00
Andrey V. Elsukov
33c1b2bd88 Temporary disable SPDCACHE statistic accounting until proper fix will be
committed. This fixes the kernel build without option IPSEC.
2018-05-28 09:23:28 +00:00
Matt Macy
c82dfce3ec netipsec/!VIMAGE: don't declare/define spdcache_destroy on non-VIMAGE builds
this breaks MIPS compiles in universe
2018-05-24 23:47:27 +00:00
Fabien Thomas
f8e73c47d8 Add a SPD cache to speed up lookups.
When large SPDs are used, we face two problems:

- too many CPU cycles are spent during the linear searches in the SPD
  for each packet
- too much contention on multi socket systems, since we use a single
  shared lock.

Main changes:

- added the sysctl tree 'net.key.spdcache' to control the SPD cache
  (disabled by default).
- cache the sp indexes that are used to perform SP lookups.
- use a range of dedicated mutexes to protect the cache lines.

Submitted by: Emeric Poupon <emeric.poupon@stormshield.eu>
Reviewed by: ae
Sponsored by:	Stormshield
Differential Revision: https://reviews.freebsd.org/D15050
2018-05-22 15:54:25 +00:00
Andrey V. Elsukov
bf18dfa28e Merge r1.22-1.23 from NetBSD:
Don't assume M_PKTHDR is set only on the first mbuf of the chain.
  The check is replaced by (m1 != m), which is equivalent to the previous
  code: we want to modify m->m_pkthdr.len only when 'm' was not passed in
  m_adj().

  Fix a pretty bad mistake, that has always been there:
   m_adj(m1, -(m1->m_len - roff));
   if (m1 != m)
	m->m_pkthdr.len -= (m1->m_len - roff);

  This is wrong: m_adj() will modify m1->m_len, so we're using a wrong
  value when manually adjusting m->m_pkthdr.len.

Reported by:	Maxime Villard <max at m00nbsd dot net>
Obtained from:	NetBSD
MFC after:	1 week
2018-04-26 12:23:31 +00:00
John Baldwin
fd40ecf3d4 Set the proper vnet in IPsec callback functions.
When using hardware crypto engines, the callback functions used to handle
an IPsec packet after it has been encrypted or decrypted can be invoked
asynchronously from a worker thread that is not associated with a vnet.
Extend 'struct xform_data' to include a vnet pointer and save the current
vnet in this new member when queueing crypto requests in IPsec.  In the
IPsec callback routines, use the new member to set the current vnet while
processing the modified packet.

This fixes a panic when using hardware offload such as ccr(4) with IPsec
after VIMAGE was enabled in GENERIC.

Reported by:	Sony Arpita Das and Harsh Jain @ Chelsio
Reviewed by:	bz
MFC after:	1 week
Sponsored by:	Chelsio Communications
Differential Revision:	https://reviews.freebsd.org/D14763
2018-03-20 17:05:23 +00:00
Andrey V. Elsukov
017a5e581a Rework key_sendup_mbuf() a bit:
o count in_nomem counter when we have failed to allocate mbuf for
  promisc socket;
o count in_msgtarget counter when we have secussfully sent data to socket;
o Since we are sending messages in a loop, returning error on first fail
  interrupts the loop, and all remaining sockets will not receive this
  message. So, do not return error when we have failed to send data to ALL
  or REGISTERED target. Return error only for KEY_SENDUP_ONE case. Now,
  when some socket has overfilled its receive buffer, this will not break
  other sockets.

MFC after:	2 weeks
2018-03-11 19:14:01 +00:00
Andrey V. Elsukov
d158b221e5 Add KASSERT to check that proper targed was used.
MFC after:	2 weeks
2018-03-11 18:46:40 +00:00
Andrey V. Elsukov
e3004d2429 Replace panic() with KASSERTs.
MFC after:	2 weeks
2018-03-11 18:37:55 +00:00
Andrey V. Elsukov
dccd41cb39 Check that we have PF_KEY sockets before iterating over all RAW sockets.
MFC after:	2 weeks
2018-03-11 18:10:59 +00:00
Andrey V. Elsukov
26cb1e1dba Remove obsoleted and unused key_sendup() function.
Also remove declaration for nonexistend key_usrreq() function.

MFC after:	2 weeks
2018-03-11 18:03:55 +00:00
Andrey V. Elsukov
15bf717a93 Remove unused variables and sysctl declaration.
MFC after:	1 week
2018-02-19 12:20:51 +00:00
Andrey V. Elsukov
6ca39da354 Check packet length to do not make out of bounds access. Also save ah_nxt
value to use it later, since ah pointer can become invalid.

Reported by:	Maxime Villard <max at m00nbsd dot net>
MFC after:	5 days
2018-02-19 11:14:38 +00:00
Andrey V. Elsukov
055679e67e Adopt revision 1.76 and 1.77 from NetBSD:
Fix a vulnerability in IPsec-IPv6-AH, that allows an attacker to remotely
  crash the kernel with a single packet.

  In this loop we need to increment 'ad' by two, because the length field
  of the option header does not count the size of the option header itself.

  If the length is zero, then 'count' is incremented by zero, and there's
  an infinite loop. Beyond that, this code was written with the assumption
  that since the IPv6 packet already went through the generic IPv6 option
  parser, several fields are guaranteed to be valid; but this assumption
  does not hold because of the missing '+2', and there's as a result a
  triggerable buffer overflow (write zeros after the end of the mbuf,
  potentially to the next mbuf in memory since it's a pool).

  Add the missing '+2', this place will be reinforced in separate commits.

Reported by:	Maxime Villard <maxv at NetBSD.org>
MFC after:	1 week
2018-01-24 19:48:25 +00:00
Andrey V. Elsukov
b12a7532e3 Merge revision 1.35 from NetBSD:
fix pointer/offset mistakes in handling of IPv4 options

Reported by:	Maxime Villard <maxv at NetBSD.org>
MFC after:	1 week
2018-01-24 19:06:44 +00:00
Alexander Kabaev
151ba7933a Do pass removing some write-only variables from the kernel.
This reduces noise when kernel is compiled by newer GCC versions,
such as one used by external toolchain ports.

Reviewed by: kib, andrew(sys/arm and sys/arm64), emaste(partial), erj(partial)
Reviewed by: jhb (sys/dev/pci/* sys/kern/vfs_aio.c and sys/kern/kern_synch.c)
Differential Revision: https://reviews.freebsd.org/D10385
2017-12-25 04:48:39 +00:00
Andrey V. Elsukov
d8ba1ddc0f Do better cleaning in key_destroy() for VIMAGE case.
SPDB was cleaned using TAILQ_CONCAT() instead of calling key_unlink()
for each SP, thus we need to properly clean lists in each bucket of
V_sphashtbl to avoid panic in hashdestroy() when INVARIANTS is enabled.

Do the same for V_acqaddrhashtbl and V_acqseqhashtbl.

When we are called in DEFAULT_VNET, destroy also all global locks and
drain key_timer callout.

Reported by:	kp
Tested by:	kp
MFC after:	1 week
2017-12-01 09:59:42 +00:00
Pedro F. Giffuni
fe267a5590 sys: general adoption of SPDX licensing ID tags.
Mainly focus on files that use BSD 2-Clause license, however the tool I
was using misidentified many licenses so this was mostly a manual - error
prone - task.

The Software Package Data Exchange (SPDX) group provides a specification
to make it easier for automated tools to detect and summarize well known
opensource licenses. We are gradually adopting the specification, noting
that the tags are considered only advisory and do not, in any way,
superceed or replace the license texts.

No functional change intended.
2017-11-27 15:23:17 +00:00
Pedro F. Giffuni
51369649b0 sys: further adoption of SPDX licensing ID tags.
Mainly focus on files that use BSD 3-Clause license.

The Software Package Data Exchange (SPDX) group provides a specification
to make it easier for automated tools to detect and summarize well known
opensource licenses. We are gradually adopting the specification, noting
that the tags are considered only advisory and do not, in any way,
superceed or replace the license texts.

Special thanks to Wind River for providing access to "The Duke of
Highlander" tool: an older (2014) run over FreeBSD tree was useful as a
starting point.
2017-11-20 19:43:44 +00:00
Conrad Meyer
f95f6841c8 ipsec: Use the same keysize values for HMAC as prior to r324017
The HMAC construction natively permits any key size between 0 and the input
block length. Before r324017, the auth_hash 'keysize' member was the hash
output length, which was used by ipsec for key sizes. (Non-ipsec consumers
need the ability to use other keysizes, hence, r324017.)

The ipsec SADB code blindly uses the auth_hash 'keysize' member for both
minimum and maximum key size, which is wrong (from an HMAC perspective).
For now, just switch it to 'hashsize', which matches the existing
expectations.

Instead it should probably use the range [0, keysize]. But there may be
other broken code in ipsec that rejects hashes with too small a minimum
key size.

Reported by:	olivier@
Reviewed by:	olivier, no objection from ae
Sponsored by:	Dell EMC Isilon
Differential Revision:	https://reviews.freebsd.org/D12770
2017-11-15 22:42:20 +00:00
Andrey V. Elsukov
cd48d883bd Use correct pointer in key_updateaddresses() when updating NAT-T config.
key_updateaddresses() is used to update SA addresses and NAT-T
configuration in SADB_UPDATE message. This is done using cloning SA
content from old SA into new one. But addresses and NAT-T configuration
are taking from SADB_UPDATE message. Use newsa pointer to set NAT-T
properties into cloned SA.

PR:		223382
MFC after:	1 week
2017-11-03 11:33:13 +00:00
Fabien Thomas
39bbca6ffd crypto(9) is called from ipsec in CRYPTO_F_CBIFSYNC mode. This is working
fine when a lot of different flows to be ciphered/deciphered are involved.

However, when a software crypto driver is used, there are
situations where we could benefit from making crypto(9) multi threaded:
- a single flow is to be ciphered: only one thread is used to cipher it,
- a single ESP flow is to be deciphered: only one thread is used to
decipher it.

The idea here is to call crypto(9) using a new mode (CRYPTO_F_ASYNC) to
dispatch the crypto jobs on multiple threads, if the underlying crypto
driver is working in synchronous mode.

Another flag is added (CRYPTO_F_ASYNC_KEEPORDER) to make crypto(9)
dispatch the crypto jobs in the order they are received (an additional
queue/thread is used), so that the packets are reinjected in the network
using the same order they were posted.

A new sysctl net.inet.ipsec.async_crypto can be used to activate
this new behavior (disabled by default).

Submitted by:	Emeric Poupon <emeric.poupon@stormshield.eu>
Reviewed by:	ae, jmg, jhb
Differential Revision:    https://reviews.freebsd.org/D10680
Sponsored by:	Stormshield
2017-11-03 10:27:22 +00:00
Conrad Meyer
3693b18840 opencrypto: Loosen restriction on HMAC key sizes
Theoretically, HMACs do not actually have any limit on key sizes.
Transforms should compact input keys larger than the HMAC block size by
using the transform (hash) on the input key.

(Short input keys are padded out with zeros to the HMAC block size.)

Still, not all FreeBSD crypto drivers that provide HMAC functionality
handle longer-than-blocksize keys appropriately, so enforce a "maximum" key
length in the crypto API for auth_hashes that previously expressed a
requirement.  (The "maximum" is the size of a single HMAC block for the
given transform.)  Unconstrained auth_hashes are left as-is.

I believe the previous hardcoded sizes were committed in the original
import of opencrypto from OpenBSD and are due to specific protocol
details of IPSec.  Note that none of the previous sizes actually matched
the appropriate HMAC block size.

The previous hardcoded sizes made the SHA tests in cryptotest.py
useless for testing FreeBSD crypto drivers; none of the NIST-KAT example
inputs had keys sized to the previous expectations.

The following drivers were audited to check that they handled keys up to
the block size of the HMAC safely:

  Software HMAC:
    * padlock(4)
    * cesa
    * glxsb
    * safe(4)
    * ubsec(4)

  Hardware accelerated HMAC:
    * ccr(4)
    * hifn(4)
    * sec(4) (Only supports up to 64 byte keys despite claiming to
      support SHA2 HMACs, but validates input key sizes)
    * cryptocteon (MIPS)
    * nlmsec (MIPS)
    * rmisec (MIPS) (Amusingly, does not appear to use key material at
      all -- presumed broken)

Reviewed by:	jhb (previous version), rlibby (previous version)
Sponsored by:	Dell EMC Isilon
Differential Revision:	https://reviews.freebsd.org/D12437
2017-09-26 16:18:10 +00:00
Andrey V. Elsukov
6f9e437bdc Fix possible double releasing for SA reference.
This is missing part of r318734. When crypto subsystem returns error
the xform code handles an error independently.

PR:		221849
MFC after:	5 days
2017-09-01 11:51:07 +00:00
Andrey V. Elsukov
c32396978e Remove stale comments.
MFC after:	1 week
2017-08-21 13:54:29 +00:00
Andrey V. Elsukov
22bbefb2c9 Fix the regression introduced in r275710.
When a security policy should match TCP connection with specific ports,
the SYN+ACK segment send by syncache_respond() is considered as forwarded
packet, because at this moment TCP connection does not have PCB structure,
and ip_output() is called without inpcb pointer. In this case SPIDX filled
for SP lookup will not contain TCP ports and security policy will not
be found. This can lead to unencrypted SYN+ACK on the wire.

This patch restores the old behavior, when ports will not be filled only
for forwarded packets.

Reported by:	Dewayne Geraghty <dewayne.geraghty at heuristicsystems.com.au>
MFC after:	1 week
2017-08-21 13:52:21 +00:00
Andrey V. Elsukov
e54647920b Make user supplied data checks a bit stricter.
key_msg2sp() is used for parsing data from setsockopt(IP[V6]_IPSEC_POLICY)
call. This socket option is usually used to configure IPsec bypass for
socket. Only privileged user can set this socket option.
The message syntax is described here
	http://www.kame.net/newsletter/20021210/

and our libipsec is usually used to create the correct request.
Add additional checks:
* that sadb_x_ipsecrequest_len is not out of bounds of user supplied buffer
* that src/dst's sa_len is the same
* that 2*sa_len is not out of bounds of user supplied buffer
* that 2*sa_len fits into bounds of sadb_x_ipsecrequest

Reported by:	Ilja van Sprundel
MFC after:	1 week
Differential Revision:	https://reviews.freebsd.org/D11796
2017-08-09 19:58:38 +00:00
Andrey V. Elsukov
1a01e0e7ac Add inpcb pointer to struct ipsec_ctx_data and pass it to the pfil hook
from enc_hhook().

This should solve the problem when pf is used with if_enc(4) interface,
and outbound packet with existing PCB checked by pf, and this leads to
deadlock due to pf does its own PCB lookup and tries to take rlock when
wlock is already held.

Now we pass PCB pointer if it is known to the pfil hook, this helps to
avoid extra PCB lookup and thus rlock acquiring is not needed.
For inbound packets it is safe to pass NULL, because we do not held any
PCB locks yet.

PR:		220217
MFC after:	3 weeks
Sponsored by:	Yandex LLC
2017-07-31 11:04:35 +00:00
Andrey V. Elsukov
8e5060a03e Build kdebug_secreplay() function only when IPSEC_DEBUG is defined.
This should fix the build on sparc.

Reported by:	emaste
X-MFC with:	r319118
2017-06-01 10:04:12 +00:00
Andrey V. Elsukov
7f1f65918b Disable IPsec debugging code by default when IPSEC_DEBUG kernel option
is not specified.

Due to the long call chain IPsec code can produce the kernel stack
exhaustion on the i386 architecture. The debugging code usually is not
used, but it requires a lot of stack space to keep buffers for strings
formatting. This patch conditionally defines macros to disable building
of IPsec debugging code.

IPsec currently has two sysctl variables to configure debug output:
 * net.key.debug variable is used to enable debug output for PF_KEY
   protocol. Such debug messages are produced by KEYDBG() macro and
   usually they can be interesting for developers.
 * net.inet.ipsec.debug variable is used to enable debug output for
   DPRINTF() macro and ipseclog() function. DPRINTF() macro usually
   is used for development debugging. ipseclog() function is used for
   debugging by administrator.

The patch disables KEYDBG() and DPRINTF() macros, and formatting buffers
declarations when IPSEC_DEBUG is not present in kernel config. This reduces
stack requirement for up to several hundreds of bytes.
The net.inet.ipsec.debug variable still can be used to enable ipseclog()
messages by administrator.

PR:		219476
Reported by:	eugen
No objection from:	#network
MFC after:	1 week
Differential Revision:	https://reviews.freebsd.org/D10869
2017-05-29 09:30:38 +00:00
Andrey V. Elsukov
3aee70991d Fix possible double releasing for SA and SP references.
There are two possible ways how crypto callback are called: directly from
caller and deffered from crypto thread.

For outbound packets the direct call chain is the following:
 IPSEC_OUTPUT() method -> ipsec[46]_common_output() ->
 -> ipsec[46]_perform_request() -> xform_output() ->
 -> crypto_dispatch() -> crypto_invoke() -> crypto_done() ->
 -> xform_output_cb() -> ipsec_process_done() -> ip[6]_output().

The SA and SP references are held while crypto processing is not finished.
The error handling code wrongly expected that crypto callback always called
from the crypto thread context, and it did references releasing in
xform_output_cb(). But when the crypto callback called directly, in case of
error the error handling code in ipsec[46]_perform_request() also did
references releasing.

To fix this, remove error handling from ipsec[46]_perform_request() and do it
in xform_output() before crypto_dispatch().

MFC after:	10 days
2017-05-23 09:32:26 +00:00
Andrey V. Elsukov
5f7c516f21 Fix possible double releasing for SA reference.
There are two possible ways how crypto callback are called: directly from
caller and deffered from crypto thread.

For inbound packets the direct call chain is the following:
 IPSEC_INPUT() method -> ipsec_common_input() -> xform_input() ->
 -> crypto_dispatch() -> crypto_invoke() -> crypto_done() ->
 -> xform_input_cb() -> ipsec[46]_common_input_cb() -> netisr_queue().

The SA reference is held while crypto processing is not finished.
The error handling code wrongly expected that crypto callback always called
from the crypto thread context, and it did SA reference releasing in
xform_input_cb(). But when the crypto callback called directly, in case of
error (e.g. data authentification failed) the error handling in
ipsec_common_input() also did SA reference releasing.

To fix this, remove error handling from ipsec_common_input() and do it
in xform_input() before crypto_dispatch().

PR:		219356
MFC after:	10 days
2017-05-23 09:01:48 +00:00
Ed Maste
3e85b721d6 Remove register keyword from sys/ and ANSIfy prototypes
A long long time ago the register keyword told the compiler to store
the corresponding variable in a CPU register, but it is not relevant
for any compiler used in the FreeBSD world today.

ANSIfy related prototypes while here.

Reviewed by:	cem, jhb
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D10193
2017-05-17 00:34:34 +00:00
Andrey V. Elsukov
44c6ff8e2a Fix SP refcount leak.
PCB SP cache acquires extra reference, when SP is stored in the cache.
Release this reference when PCB is destroyed in ipsec_delete_pcbpolicy().
In ipsec_copy_pcbpolicy() release reference to SP in case if sp_in or
sp_out are not NULL.

Reported by:	Slawa Olhovchenkov <slw at zxy spb ru>
MFC after:	1 week
2017-04-26 00:34:05 +00:00
Andrey V. Elsukov
4e0e8f3107 Add large replay widow support to setkey(8) and libipsec.
When the replay window size is large than UINT8_MAX, add to the request
the SADB_X_EXT_SA_REPLAY extension header that was added in r309144.

Also add support of SADB_X_EXT_NAT_T_TYPE, SADB_X_EXT_NAT_T_SPORT,
SADB_X_EXT_NAT_T_DPORT, SADB_X_EXT_NAT_T_OAI, SADB_X_EXT_NAT_T_OAR,
SADB_X_EXT_SA_REPLAY, SADB_X_EXT_NEW_ADDRESS_SRC, SADB_X_EXT_NEW_ADDRESS_DST
extension headers to the key_debug that is used by `setkey -x`.

Modify kdebug_sockaddr() to use inet_ntop() for IP addresses formatting.
And modify kdebug_sadb_x_policy() to show policy scope and priority.

Reviewed by:	gnn, Emeric Poupon
MFC after:	2 weeks
Differential Revision:	https://reviews.freebsd.org/D10375
2017-04-13 14:44:17 +00:00
Andrey V. Elsukov
9c2b99b912 When we are doing SA lookup for TCP-MD5, check both source and
destination addresses. Previous code has used only destination address
for lookup. But for inbound packets the source address was used as SA
destination address. Thus only outbound SA were used for both directions.
Now we use addresses from a packet as is, thus SAs for both directions are
needed.

Reported by:	Mike Tancsa
MFC after:	1 week
2017-04-04 13:41:50 +00:00
Andrey V. Elsukov
db3b3ec5b4 GC some unused declarations.
MFC after:	1 week
2017-04-03 04:44:56 +00:00
Andrey V. Elsukov
8291fb89cf Fix bug in r308972 that leads to panic when non-compressed IPComp
packet is received.

Reported by:	Denis Ahrens <denis h3q com>
MFC after:	3 days
2017-03-29 10:24:48 +00:00
Andrey V. Elsukov
22986c6740 Introduce the concept of IPsec security policies scope.
Currently are defined three scopes: global, ifnet, and pcb.
Generic security policies that IKE daemon can add via PF_KEY interface
or an administrator creates with setkey(8) utility have GLOBAL scope.
Such policies can be applied by the kernel to outgoing packets and checked
agains inbound packets after IPsec processing.
Security policies created by if_ipsec(4) interfaces have IFNET scope.
Such policies are applied to packets that are passed through if_ipsec(4)
interface.
And security policies created by application using setsockopt()
IP_IPSEC_POLICY option have PCB scope. Such policies are applied to
packets related to specific socket. Currently there is no way to list
PCB policies via setkey(8) utility.

Modify setkey(8) and libipsec(3) to be able distinguish the scope of
security policies in the `setkey -DP` listing. Add two optional flags:
'-t' to list only policies related to virtual *tunneling* interfaces,
i.e. policies with IFNET scope, and '-g' to list only policies with GLOBAL
scope. By default policies from all scopes are listed.

To implement this PF_KEY's sadb_x_policy structure was modified.
sadb_x_policy_reserved field is used to pass the policy scope from the
kernel to userland. SADB_SPDDUMP message extended to support filtering
by scope: sadb_msg_satype field is used to specify bit mask of requested
scopes.

For IFNET policies the sadb_x_policy_priority field of struct sadb_x_policy
is used to pass if_ipsec's interface if_index to the userland. For GLOBAL
policies sadb_x_policy_priority is used only to manage order of security
policies in the SPDB. For IFNET policies it is not used, so it can be used
to keep if_index.

After this change the output of `setkey -DP` now looks like:
# setkey -DPt
0.0.0.0/0[any] 0.0.0.0/0[any] any
	in ipsec
	esp/tunnel/87.250.242.144-87.250.242.145/unique:145
	spid=7 seq=3 pid=58025 scope=ifnet ifname=ipsec0
	refcnt=1
# setkey -DPg
::/0 ::/0 icmp6 135,0
	out none
	spid=5 seq=1 pid=872 scope=global
	refcnt=1

No objection from:	#network
Obtained from:	Yandex LLC
MFC after:	2 weeks
Sponsored by:	Yandex LLC
Differential Revision:	https://reviews.freebsd.org/D9805
2017-03-07 00:13:53 +00:00
Andrey V. Elsukov
fbed6d606a For translated packets do not adjust UDP checksum if it is zero.
In case when decrypted and decapsulated packet is an UDP datagram,
check that its checksum is not zero before doing incremental checksum
adjustment.

Reported by:	Eugene Grosbein
Tested by:	Eugene Grosbein
2017-02-18 19:53:37 +00:00
Andrey V. Elsukov
ddc9f8e87f Fix LINT build for powerpc.
Build kernel modules support only when both IPSEC and TCP_SIGNATURE
are not defined.

Reported by:	emaste
2017-02-16 11:38:50 +00:00
Gleb Smirnoff
cfff3743cd Move tcp_fields_to_net() static inline into tcp_var.h, just below its
friend tcp_fields_to_host(). There is third party code that also uses
this inline.

Reviewed by:	ae
2017-02-10 17:46:26 +00:00
Andrey V. Elsukov
fcf596178b Merge projects/ipsec into head/.
Small summary
 -------------

o Almost all IPsec releated code was moved into sys/netipsec.
o New kernel modules added: ipsec.ko and tcpmd5.ko. New kernel
  option IPSEC_SUPPORT added. It enables support for loading
  and unloading of ipsec.ko and tcpmd5.ko kernel modules.
o IPSEC_NAT_T option was removed. Now NAT-T support is enabled by
  default. The UDP_ENCAP_ESPINUDP_NON_IKE encapsulation type
  support was removed. Added TCP/UDP checksum handling for
  inbound packets that were decapsulated by transport mode SAs.
  setkey(8) modified to show run-time NAT-T configuration of SA.
o New network pseudo interface if_ipsec(4) added. For now it is
  build as part of ipsec.ko module (or with IPSEC kernel).
  It implements IPsec virtual tunnels to create route-based VPNs.
o The network stack now invokes IPsec functions using special
  methods. The only one header file <netipsec/ipsec_support.h>
  should be included to declare all the needed things to work
  with IPsec.
o All IPsec protocols handlers (ESP/AH/IPCOMP protosw) were removed.
  Now these protocols are handled directly via IPsec methods.
o TCP_SIGNATURE support was reworked to be more close to RFC.
o PF_KEY SADB was reworked:
  - now all security associations stored in the single SPI namespace,
    and all SAs MUST have unique SPI.
  - several hash tables added to speed up lookups in SADB.
  - SADB now uses rmlock to protect access, and concurrent threads
    can do SA lookups in the same time.
  - many PF_KEY message handlers were reworked to reflect changes
    in SADB.
  - SADB_UPDATE message was extended to support new PF_KEY headers:
    SADB_X_EXT_NEW_ADDRESS_SRC and SADB_X_EXT_NEW_ADDRESS_DST. They
    can be used by IKE daemon to change SA addresses.
o ipsecrequest and secpolicy structures were cardinally changed to
  avoid locking protection for ipsecrequest. Now we support
  only limited number (4) of bundled SAs, but they are supported
  for both INET and INET6.
o INPCB security policy cache was introduced. Each PCB now caches
  used security policies to avoid SP lookup for each packet.
o For inbound security policies added the mode, when the kernel does
  check for full history of applied IPsec transforms.
o References counting rules for security policies and security
  associations were changed. The proper SA locking added into xform
  code.
o xform code was also changed. Now it is possible to unregister xforms.
  tdb_xxx structures were changed and renamed to reflect changes in
  SADB/SPDB, and changed rules for locking and refcounting.

Reviewed by:	gnn, wblock
Obtained from:	Yandex LLC
Relnotes:	yes
Sponsored by:	Yandex LLC
Differential Revision:	https://reviews.freebsd.org/D9352
2017-02-06 08:49:57 +00:00
Andrey V. Elsukov
18b105c27b Add direction argument to ipsec_setspidx_inpcb() function.
This function is used only by ipsec_getpolicybysock() to fill security
policy index selector for locally generated packets (that have INPCB).
The function incorrectly assumes that spidx is the same for both directions.
Fix this by using new direction argument to specify correct INPCB security
policy - sp_in or sp_out. There is no need to fill both policy indeces,
because they are overwritten for each packet.
This fixes security policy matching for outbound packets when user has
specified TCP/UDP ports in the security policy upperspec.

PR:		213869
MFC after:	1 week
2017-01-08 12:40:07 +00:00
Scott Long
b84ef73179 Add a missing header 2016-11-26 23:15:11 +00:00
Ed Maste
79b4dcad2b netipsec: fix build after 309144
Reported by:	rakuco
2016-11-26 00:59:01 +00:00
Fabien Thomas
bf4356266d IPsec RFC6479 support for replay window sizes up to 2^32 - 32 packets.
Since the previous algorithm, based on bit shifting, does not scale
with large replay windows, the algorithm used here is based on
RFC 6479: IPsec Anti-Replay Algorithm without Bit Shifting.
The replay window will be fast to be updated, but will cost as many bits
in RAM as its size.

The previous implementation did not provide a lock on the replay window,
which may lead to replay issues.

Reviewed by:	ae
Obtained from:	emeric.poupon@stormshield.eu
Sponsored by:	Stormshield
Differential Revision:	https://reviews.freebsd.org/D8468
2016-11-25 14:44:49 +00:00
Kevin Lo
c3bef61e58 Remove the 4.3BSD compatible macro m_copy(), use m_copym() instead.
Reviewed by:	gnn
Differential Revision:	https://reviews.freebsd.org/D7878
2016-09-15 07:41:48 +00:00
Andrey V. Elsukov
0c127808dd Remove redundant sanity checks from ipsec[46]_common_input_cb().
This check already has been done in the each protocol callback.
2016-08-31 11:51:52 +00:00
Bjoern A. Zeeb
89856f7e2d Get closer to a VIMAGE network stack teardown from top to bottom rather
than removing the network interfaces first. This change is rather larger
and convoluted as the ordering requirements cannot be separated.

Move the pfil(9) framework to SI_SUB_PROTO_PFIL, move Firewalls and
related modules to their own SI_SUB_PROTO_FIREWALL.
Move initialization of "physical" interfaces to SI_SUB_DRIVERS,
move virtual (cloned) interfaces to SI_SUB_PSEUDO.
Move Multicast to SI_SUB_PROTO_MC.

Re-work parts of multicast initialisation and teardown, not taking the
huge amount of memory into account if used as a module yet.

For interface teardown we try to do as many of them as we can on
SI_SUB_INIT_IF, but for some this makes no sense, e.g., when tunnelling
over a higher layer protocol such as IP. In that case the interface
has to go along (or before) the higher layer protocol is shutdown.

Kernel hhooks need to go last on teardown as they may be used at various
higher layers and we cannot remove them before we cleaned up the higher
layers.

For interface teardown there are multiple paths:
(a) a cloned interface is destroyed (inside a VIMAGE or in the base system),
(b) any interface is moved from a virtual network stack to a different
network stack ("vmove"), or (c) a virtual network stack is being shut down.
All code paths go through if_detach_internal() where we, depending on the
vmove flag or the vnet state, make a decision on how much to shut down;
in case we are destroying a VNET the individual protocol layers will
cleanup their own parts thus we cannot do so again for each interface as
we end up with, e.g., double-frees, destroying locks twice or acquiring
already destroyed locks.
When calling into protocol cleanups we equally have to tell them
whether they need to detach upper layer protocols ("ulp") or not
(e.g., in6_ifdetach()).

Provide or enahnce helper functions to do proper cleanup at a protocol
rather than at an interface level.

Approved by:		re (hrs)
Obtained from:		projects/vnet
Reviewed by:		gnn, jhb
Sponsored by:		The FreeBSD Foundation
MFC after:		2 weeks
Differential Revision:	https://reviews.freebsd.org/D6747
2016-06-21 13:48:49 +00:00
Conrad Meyer
ab11d379fa netipsec: Fix minor style nit
Coverity points out that 'continue' is equivalent to 'break' in a do {}
while(false) loop.

Reported by:	Coverity
CID:		1354983
Sponsored by:	EMC / Isilon Storage Division
2016-05-10 20:14:11 +00:00
Pedro F. Giffuni
a4641f4eaa sys/net*: minor spelling fixes.
No functional change.
2016-05-03 18:05:43 +00:00
Conrad Meyer
5a8c498f2e netipsec: Don't leak memory when deep copy fails
Reported by:	Coverity
CID:		1331693
Sponsored by:	EMC / Isilon Storage Division
2016-04-26 23:23:44 +00:00
Andrey V. Elsukov
0d05666462 Fix build for NOINET and NOINET6 kernels.
Use own protosw structures for both address families.
Check proto in encapcheck function and use -1 as proto argument in
encap_attach_func(), both address families can have IPPROTO_IPV4
and IPPROTO_IPV6 protocols.

Reported by:	bz
2016-04-24 17:09:51 +00:00
Andrey V. Elsukov
2ada524ece Use ipsec_address() function to print IP addresses. 2016-04-24 09:05:29 +00:00
Andrey V. Elsukov
3cbd4ec3e4 Handle non-compressed packets for IPComp in tunnel mode.
RFC3173 says that the IP datagram MUST be sent in the original
non-compressed form, when the total size of a compressed payload
and the IPComp header is not smaller than the size of the original
payload. In tunnel mode for small packets IPComp will send
encapsulated IP datagrams without IPComp header.
Add ip_encap handler for IPPROTO_IPV4 and IPPROTO_IPV6 to handle
these datagrams. The handler does lookup for SA related to IPComp
protocol and given from mbuf source and destination addresses as
tunnel endpoints. It decapsulates packets only when corresponding SA
is found.

Reported by:	gnn
Reviewed by:	gnn
Differential Revision:	https://reviews.freebsd.org/D6062
2016-04-24 09:02:17 +00:00
Andrey V. Elsukov
61a292a396 Remove stale function declaration 2016-04-21 11:02:06 +00:00
Andrey V. Elsukov
efb10c3ce7 Constify mbuf pointer for IPSEC functions where mbuf isn't modified. 2016-04-21 10:58:07 +00:00
Pedro F. Giffuni
02abd40029 kernel: use our nitems() macro when it is available through param.h.
No functional change, only trivial cases are done in this sweep,

Discussed in:	freebsd-current
2016-04-19 23:48:27 +00:00
Pedro F. Giffuni
155d72c498 sys/net* : for pointers replace 0 with NULL.
Mostly cosmetical, no functional change.

Found with devel/coccinelle.
2016-04-15 17:30:33 +00:00
Andrey V. Elsukov
6f814d0ec1 Fix handling of net.inet.ipsec.dfbit=2 variable.
IP_DF macro is in host bytes order, but ip_off field is in network bytes
order. So, use htons() for correct check.
2016-03-18 09:03:00 +00:00
Robert Watson
bc84fc89be Put IPSec's anouncement of its successful intialisation under bootverbose:
now that it's a default kernel option, we don't really need to tell the
world about it on every boot, especially as it won't be used by most users.
2016-03-13 19:27:46 +00:00
Mark Johnston
3543e138e5 Set tres to NULL to avoid a double free if the m_pullup() below fails.
Reviewed by:	glebius
MFC after:	1 week
Differential Revision:	https://reviews.freebsd.org/D5497
2016-03-02 05:04:04 +00:00
Andrey V. Elsukov
b833ff5ab4 Fix useless check. m_pkthdr.len should be equal to orglen.
MFC after:	2 weeks
2016-02-24 12:28:49 +00:00
Gleb Smirnoff
8ec07310fa These files were getting sys/malloc.h and vm/uma.h with header pollution
via sys/mbuf.h
2016-02-01 17:41:21 +00:00
Andrey V. Elsukov
ef91a9765d Overhaul if_enc(4) and make it loadable in run-time.
Use hhook(9) framework to achieve ability of loading and unloading
if_enc(4) kernel module. INET and INET6 code on initialization registers
two helper hooks points in the kernel. if_enc(4) module uses these helper
hook points and registers its hooks. IPSEC code uses these hhook points
to call helper hooks implemented in if_enc(4).
2015-11-25 07:31:59 +00:00
Fabien Thomas
d6d3f24890 Implement the sadb_x_policy_priority field as it is done in Linux:
lower priority policies are inserted first.

Submitted by:	Emeric Poupon <emeric.poupon@stormshield.eu>
Reviewed by:	ae
Sponsored by:	Stormshield
2015-11-17 14:39:33 +00:00
Andrey V. Elsukov
0c80e7df43 Use explicitly specified ivsize instead of blocksize when we mean IV size.
Set zero ivsize for enc_xform_null and remove special handling from
xform_esp.c.

Reviewed by:	gnn
Differential Revision:	https://reviews.freebsd.org/D1503
2015-11-16 07:10:42 +00:00
George V. Neville-Neil
26882b4239 Turning on IPSEC used to introduce a slight amount of performance
degradation (7%) for host host TCP connections over 10Gbps links,
even when there were no secuirty policies in place. There is no
change in performance on 1Gbps network links. Testing GENERIC vs.
GENERIC-NOIPSEC vs. GENERIC with this change shows that the new
code removes any overhead introduced by having IPSEC always in the
kernel.

Differential Revision:	D3993
MFC after:	1 month
Sponsored by:	Rubicon Communications (Netgate)
2015-10-27 00:42:15 +00:00
Andrey V. Elsukov
f367798498 Take extra reference to security policy before calling crypto_dispatch().
Currently we perform crypto requests for IPSEC synchronous for most of
crypto providers (software, aesni) and only VIA padlock calls crypto
callback asynchronous. In synchronous mode it is possible, that security
policy will be removed during the processing crypto request. And crypto
callback will release the last reference to SP. Then upon return into
ipsec[46]_process_packet() IPSECREQUEST_UNLOCK() will be called to already
freed request. To prevent this we will take extra reference to SP.

PR:		201876
Sponsored by:	Yandex LLC
2015-09-30 08:16:33 +00:00
John-Mark Gurney
a2bc81bf7c Make IPsec work with AES-GCM and AES-ICM (aka CTR) in OCF... IPsec
defines the keys differently than NIST does, so we have to muck with
key lengths and nonce/IVs to be standard compliant...

Remove the iv from secasvar as it was unused...

Add a counter protected by a mutex to ensure that the counter for GCM
and ICM will never be repeated..  This is a requirement for security..
I would use atomics, but we don't have a 64bit one on all platforms..

Fix a bug where IPsec was depending upon the OCF to ensure that the
blocksize was always at least 4 bytes to maintain alignment... Move
this logic into IPsec so changes to OCF won't break IPsec...

In one place, espx was always non-NULL, so don't test that it's
non-NULL before doing work..

minor style cleanups...

drop setting key and klen as they were not used...

Enforce that OCF won't pass invalid key lengths to AES that would
panic the machine...

This was has been tested by others too...  I tested this against
NetBSD 6.1.5 using mini-test suite in
https://github.com/jmgurney/ipseccfgs and the only things that don't
pass are keyed md5 and sha1, and 3des-deriv (setkey syntax error),
all other modes listed in setkey's man page...  The nice thing is
that NetBSD uses setkey, so same config files were used on both...

Reviewed by:	gnn
2015-08-04 17:47:11 +00:00
John-Mark Gurney
42e5fcbf2b these are comparing authenticators and need to be constant time...
This could be a side channel attack...  Now that we have a function
for this, use it...

jmgurney/ipsecgcm:	24d704cc and 7f37a14
2015-07-31 00:31:52 +00:00
John-Mark Gurney
817c7ed900 Clean up this header file...
use CTASSERTs now that we have them...

Replace a draft w/ RFC that's over 10 years old.

Note that _AALG and _EALG do not need to match what the IKE daemons
think they should be..  This is part of the KABI...  I decided to
renumber AESCTR, but since we've never had working AESCTR mode, I'm
not really breaking anything..  and it shortens a loop by quite
a bit..

remove SKIPJACK IPsec support...  SKIPJACK never made it out of draft
(in 1999), only has 80bit key, NIST recommended it stop being used
after 2010, and setkey nor any of the IKE daemons I checked supported
it...

jmgurney/ipsecgcm: a357a33, c75808b, e008669, b27b6d6

Reviewed by:	gnn (earlier version)
2015-07-31 00:23:21 +00:00
Ermal Luçi
59959de526 Correct IPSec SA statistic keeping
The IPsec SA statistic keeping is used even for decision making on expiry/rekeying SAs.
When there are multiple transformations being done the statistic keeping might be wrong.

This mostly impacts multiple encapsulations on IPsec since the usual scenario it is not noticed due to the code path not taken.

Differential Revision:	https://reviews.freebsd.org/D3239
Reviewed by:		ae, gnn
Approved by:		gnn(mentor)
2015-07-30 20:56:27 +00:00
John-Mark Gurney
a09a7146a7 RFC4868 section 2.3 requires that the output be half... This fixes
problems that was introduced in r285336...  I have verified that
HMAC-SHA2-256 both ah only and w/ AES-CBC interoperate w/ a NetBSD
6.1.5 vm...

Reviewed by:	gnn
2015-07-29 07:15:16 +00:00
Ermal Luçi
705f4d9c6a IPSEC, remove variable argument function its already due.
Differential Revision:		https://reviews.freebsd.org/D3080
Reviewed by:	gnn, ae
Approved by:	gnn(mentor)
2015-07-21 21:46:24 +00:00
George V. Neville-Neil
0b75d21e18 Summary: Fix LINT build. The names of the new AES modes were not
correctly used under the REGRESSION kernel option.
2015-07-10 02:23:50 +00:00
George V. Neville-Neil
16de9ac1b5 Add support for AES modes to IPSec. These modes work both in software only
mode and with hardware support on systems that have AESNI instructions.

Differential Revision:	D2936
Reviewed by:	jmg, eri, cognet
Sponsored by:	Rubicon Communications (Netgate)
2015-07-09 18:16:35 +00:00
Andrey V. Elsukov
280d77a3bb Fill the port and protocol information in the SADB_ACQUIRE message
in case when security policy has it as required by RFC 2367.

PR:		192774
Differential Revision:	https://reviews.freebsd.org/D2972
MFC after:	1 week
2015-07-06 12:40:31 +00:00
Ermal Luçi
c1fc5e9601 Reduce overhead of IPSEC for traffic generated from host
When IPSEC is enabled on the kernel the forwarding path has an optimization to not enter the code paths
for checking security policies but first checks if there is any security policy active at all.

The patch introduces the same optimization but for traffic generated from the host itself.
This reduces the overhead by 50% on my tests for generated host traffic without and SP active.

Differential Revision:	https://reviews.freebsd.org/D2980
Reviewed by:	ae, gnn
Approved by:	gnn(mentor)
2015-07-03 15:31:56 +00:00
John-Mark Gurney
49672bcc54 drop key_sa_stir_iv as it isn't used...
Reviewed by:	eri, ae
2015-06-11 13:05:37 +00:00
Jung-uk Kim
fd90e2ed54 CALLOUT_MPSAFE has lost its meaning since r141428, i.e., for more than ten
years for head.  However, it is continuously misused as the mpsafe argument
for callout_init(9).  Deprecate the flag and clean up callout_init() calls
to make them more consistent.

Differential Revision:	https://reviews.freebsd.org/D2613
Reviewed by:	jhb
MFC after:	2 weeks
2015-05-22 17:05:21 +00:00
Andrey V. Elsukov
dc4ea824d4 In the reply to SADB_X_SPDGET message use the same sequence number that
was in the request. Some IKE deamons expect it will the same. Linux and
NetBSD also follow this behaviour.

PR:		137309
MFC after:	2 weeks
2015-05-20 11:59:53 +00:00
Andrey V. Elsukov
4ec5fcbe09 Remove unneded mbuf length adjustment, M_PREPEND() already did that.
PR:		139387
MFC after:	1 week
2015-05-19 17:14:27 +00:00
Andrey V. Elsukov
5bae2b34a4 Change SA's state before sending SADB_EXPIRE message. This state will
be reported to keying daemon.

MFC after:	2 weeks
2015-05-19 08:37:03 +00:00
Andrey V. Elsukov
664802113f Teach key_expire() send SADB_EXPIRE message with the SADB_EXT_LIFETIME_HARD
extension header type. The key_flush_sad() now will send SADB_EXPIRE
message when HARD lifetime expires. This is required by RFC 2367 and some
keying daemons rely on these messages. HARD lifetime messages have
precedence over SOFT lifetime messages, so now they will be checked first.
Also now SADB_EXPIRE messages will be send even the SA has not been used,
because keying daemons might want to rekey such SA.

PR:		200282, 200283
Submitted by:	Tobias Brunner <tobias at strongswan dot org>
MFC after:	2 weeks
2015-05-19 08:30:04 +00:00
George V. Neville-Neil
a9b294b433 Summary: Remove spurious, extra, next header comments.
Correct the name of the pad length field.
2015-05-15 18:04:49 +00:00
Andrey V. Elsukov
6508929bc2 Fix the comment. We will not do SPD lookup again, because
ip[6]_ipsec_output() will find PACKET_TAG_IPSEC_OUT_DONE mbuf tag.

Sponsored by:	Yandex LLC
2015-04-28 11:03:47 +00:00
Andrey V. Elsukov
574fde00be Since PFIL can change mbuf pointer, we should update pointers after
calling ipsec_filter().

Sponsored by:	Yandex LLC
2015-04-28 09:29:28 +00:00
Andrey V. Elsukov
a9b9f6b6c6 Make ipsec_in_reject() static. We use ipsec[46]_in_reject() instead.
Sponsored by:	Yandex LLC
2015-04-27 01:12:51 +00:00
Andrey V. Elsukov
3d80e82d60 Fix possible use after free due to security policy deletion.
When we are passing mbuf to IPSec processing via ipsec[46]_process_packet(),
we hold one reference to security policy and release it just after return
from this function. But IPSec processing can be deffered and when we release
reference to security policy after ipsec[46]_process_packet(), user can
delete this security policy from SPDB. And when IPSec processing will be
done, xform's callback function will do access to already freed memory.

To fix this move KEY_FREESP() into callback function. Now IPSec code will
release reference to SP after processing will be finished.

Differential Revision:	https://reviews.freebsd.org/D2324
No objections from:	#network
Sponsored by:	Yandex LLC
2015-04-27 00:55:56 +00:00
Andrey V. Elsukov
962ac6c727 Change ipsec_address() and ipsec_logsastr() functions to take two
additional arguments - buffer and size of this buffer.

ipsec_address() is used to convert sockaddr structure to presentation
format. The IPv6 part of this function returns pointer to the on-stack
buffer and at the moment when it will be used by caller, it becames
invalid. IPv4 version uses 4 static buffers and returns pointer to
new buffer each time when it called. But anyway it is still possible
to get corrupted data when several threads will use this function.

ipsec_logsastr() is used to format string about SA entry. It also
uses static buffer and has the same problem with concurrent threads.

To fix these problems add the buffer pointer and size of this
buffer to arguments. Now each caller will pass buffer and its size
to these functions. Also convert all places where these functions
are used (except disabled code).

And now ipsec_address() uses inet_ntop() function from libkern.

PR:		185996
Differential Revision:	https://reviews.freebsd.org/D2321
Reviewed by:	gnn
Sponsored by:	Yandex LLC
2015-04-18 16:58:33 +00:00
Andrey V. Elsukov
1d3b268c04 Requeue mbuf via netisr when we use IPSec tunnel mode and IPv6.
ipsec6_common_input_cb() uses partial copy of ip6_input() to parse
headers. But this isn't correct, when we use tunnel mode IPSec.

When we stripped outer IPv6 header from the decrypted packet, it
can become IPv4 packet and should be handled by ip_input. Also when
we use tunnel mode IPSec with IPv6 traffic, we should pass decrypted
packet with inner IPv6 header to ip6_input, it will correctly handle
it and also can decide to forward it.

The "skip" variable points to offset where payload starts. In tunnel
mode we reset it to zero after stripping the outer header. So, when
it is zero, we should requeue mbuf via netisr.

Differential Revision:	https://reviews.freebsd.org/D2306
Reviewed by:	adrian, gnn
Sponsored by:	Yandex LLC
2015-04-18 16:51:24 +00:00
Andrey V. Elsukov
1ae800e7a6 Fix handling of scoped IPv6 addresses in IPSec code.
* in ipsec_encap() embed scope zone ids into link-local addresses
  in the new IPv6 header, this helps ip6_output() disambiguate the
  scope;
* teach key_ismyaddr6() use in6_localip(). in6_localip() is less
  strict than key_sockaddrcmp(). It doesn't compare all fileds of
  struct sockaddr_in6, but it is faster and it should be safe,
  because all SA's data was checked for correctness. Also, since
  IPv6 link-local addresses in the &V_in6_ifaddrhead are stored in
  kernel-internal form, we need to embed scope zone id from SA into
  the address before calling in6_localip.
* in ipsec_common_input() take scope zone id embedded in the address
  and use it to initialize sin6_scope_id, then use this sockaddr
  structure to lookup SA, because we keep addresses in the SADB without
  embedded scope zone id.

Differential Revision:	https://reviews.freebsd.org/D2304
Reviewed by:	gnn
Sponsored by:	Yandex LLC
2015-04-18 16:46:31 +00:00
Andrey V. Elsukov
61f376155d Remove xform_ipip.c and code related to XF_IP4.
The only thing is used from this code is ipip_output() function, that does
IPIP encapsulation. Other parts of XF_IP4 code were removed in r275133.
Also it isn't possible to configure the use of XF_IP4, nor from userland
via setkey(8), nor from the kernel.

Simplify the ipip_output() function and rename it to ipsec_encap().
* move IP_DF handling from ipsec4_process_packet() into ipsec_encap();
* since ipsec_encap() called from ipsec[64]_process_packet(), it
  is safe to assume that mbuf is contiguous at least to IP header
  for used IP version. Remove all unneeded m_pullup(), m_copydata
  and related checks.
* use V_ip_defttl and V_ip6_defhlim for outer headers;
* use V_ip4_ipsec_ecn and V_ip6_ipsec_ecn for outer headers;
* move all diagnostic messages to the ipsec_encap() callers;
* simplify handling of ipsec_encap() results: if it returns non zero
  value, print diagnostic message and free mbuf.
* some style(9) fixes.

Differential Revision:	https://reviews.freebsd.org/D2303
Reviewed by:	glebius
Sponsored by:	Yandex LLC
2015-04-18 16:38:45 +00:00
Gleb Smirnoff
6d947416cc o Use new function ip_fillid() in all places throughout the kernel,
where we want to create a new IP datagram.
o Add support for RFC6864, which allows to set IP ID for atomic IP
  datagrams to any value, to improve performance. The behaviour is
  controlled by net.inet.ip.rfc6864 sysctl knob, which is enabled by
  default.
o In case if we generate IP ID, use counter(9) to improve performance.
o Gather all code related to IP ID into ip_id.c.

Differential Revision:		https://reviews.freebsd.org/D2177
Reviewed by:			adrian, cy, rpaulo
Tested by:			Emeric POUPON <emeric.poupon stormshield.eu>
Sponsored by:			Netflix
Sponsored by:			Nginx, Inc.
Relnotes:			yes
2015-04-01 22:26:39 +00:00
Andrey V. Elsukov
ba76ce40b4 Remove extra '&'. sin6 is already a pointer.
PR:		195011
MFC after:	1 week
2015-03-07 18:44:52 +00:00
Andrey V. Elsukov
47568136c5 Fix possible memory leak and several races in the IPsec policy management
code.

Resurrect the state field in the struct secpolicy, it has
IPSEC_SPSTATE_ALIVE value when security policy linked in the chain,
and IPSEC_SPSTATE_DEAD value in all other cases. This field protects
from trying to unlink one security policy several times from the different
threads.

Take additional reference in the key_flush_spd() to be sure that policy
won't be freed from the different thread while we are sending SPDEXPIRE message.

Add KEY_FREESP() call to the key_unlink() to release additional reference
that we take when use key_getsp*() functions.

Differential Revision:	https://reviews.freebsd.org/D1914
Tested by:		Emeric POUPON <emeric.poupon at stormshield dot eu>
Reviewed by:	hrs
Sponsored by:	Yandex LLC
2015-02-24 10:35:07 +00:00
Andrey V. Elsukov
b489a49fc0 key_spdget uses key_setdumpsp() without SPTREE_RLOCK held (it uses
referenced pointer to sp). Remove SPTREE_RLOCK_ASSERT from
key_setdumpsp() to fix wrong assertion.

Reported by:	Emeric POUPON
Obtained from:	Yandex LLC
Sponsored by:	Yandex LLC
2015-01-27 17:46:55 +00:00
Robert Watson
2a8c860fe3 In order to reduce use of M_EXT outside of the mbuf allocator and
socket-buffer implementations, introduce a return value for MCLGET()
(and m_cljget() that underlies it) to allow the caller to avoid testing
M_EXT itself.  Update all callers to use the return value.

With this change, very few network device drivers remain aware of
M_EXT; the primary exceptions lie in mbuf-chain pretty printers for
debugging, and in a few cases, custom mbuf and cluster allocation
implementations.

NB: This is a difficult-to-test change as it touches many drivers for
which I don't have physical devices.  Instead we've gone for intensive
review, but further post-commit review would definitely be appreciated
to spot errors where changes could not easily be made mechanically,
but were largely mechanical in nature.

Differential Revision:	https://reviews.freebsd.org/D1440
Reviewed by:	adrian, bz, gnn
Sponsored by:	EMC / Isilon Storage Division
2015-01-06 12:59:37 +00:00
Andrey V. Elsukov
fe07a9d08f Fix VIMAGE build. 2014-12-25 13:38:51 +00:00
Andrey V. Elsukov
93201211e9 Rename ip4_def_policy variable to def_policy. It is used by both IPv4 and
IPv6. Initialize it only once in def_policy_init(). Remove its
initialization from key_init() and make it static.

Remove several fields from struct secpolicy:
* lock - it isn't so useful having mutex in the structure, but the only
  thing we do with it is initialization and destroying.
* state - it has only two values - DEAD and ALIVE. Instead of take a lock
  and change the state to DEAD, then take lock again in GC function and
  delete policy from the chain - keep in the chain only ALIVE policies.
* scangen - it was used in GC function to protect from sending several
  SADB_SPDEXPIRE messages for one SPD entry. Now we don't keep DEAD entries
  in the chain and there is no need to have scangen variable.

Use TAILQ to implement SPD entries chain. Use rmlock to protect access
to SPD entries chain. Protect all SP lookup with RLOCK, and use WLOCK
when we are inserting (or removing) SP entry in the chain.

Instead of using pattern "LOCK(); refcnt++; UNLOCK();", use refcount(9)
API to implement refcounting in SPD. Merge code from key_delsp() and
_key_delsp() into _key_freesp(). And use KEY_FREESP() macro in all cases
when we want to release reference or just delete SP entry.

Obtained from:	Yandex LLC
Sponsored by:	Yandex LLC
2014-12-24 18:34:56 +00:00
Andrey V. Elsukov
a91150da31 Treat errors when retrieving security policy as policy violation.
Obtained from:	Yandex LLC
Sponsored by:	Yandex LLC
2014-12-11 18:46:11 +00:00
Andrey V. Elsukov
e65ada3e3c Initialize error variable.
Obtained from:	Yandex LLC
Sponsored by:	Yandex LLC
2014-12-11 18:40:56 +00:00
Andrey V. Elsukov
0275b2e369 Remove flag/flags argument from the following functions:
ipsec_getpolicybyaddr()
 ipsec4_checkpolicy()
 ip_ipsec_output()
 ip6_ipsec_output()

The only flag used here was IP_FORWARDING.

Obtained from:	Yandex LLC
Sponsored by:	Yandex LLC
2014-12-11 18:35:34 +00:00
Andrey V. Elsukov
619764beab Remove flags and tunalready arguments from ipsec4_process_packet()
and make its prototype similar to ipsec6_process_packet.
The flags argument isn't used here, tunalready is always zero.

Obtained from:	Yandex LLC
Sponsored by:	Yandex LLC
2014-12-11 17:34:49 +00:00
Andrey V. Elsukov
f0514a8b8a Remove now unused mtag argument from ipsec*_common_input_cb.
Obtained from:	Yandex LLC
Sponsored by:	Yandex LLC
2014-12-11 17:14:49 +00:00
Andrey V. Elsukov
08537f4526 Remove code related to PACKET_TAG_IPSEC_IN_CRYPTO_DONE mbuf tag.
It isn't used in FreeBSD.

Obtained from:	Yandex LLC
Sponsored by:	Yandex LLC
2014-12-11 17:07:21 +00:00
Andrey V. Elsukov
566cbcc82a Remove unused mtag variable.
Obtained from:	Yandex LLC
Sponsored by:	Yandex LLC
2014-12-11 17:01:53 +00:00
Andrey V. Elsukov
4268212124 key_getspacq() returns holding the spacq_lock. Unlock it in all cases.
MFC after:	1 week
Sponsored by:	Yandex LLC
2014-12-07 06:47:00 +00:00
Andrey V. Elsukov
3d6aff5615 Fix style(9) and remove m_freem(NULL).
Add XXX comment, it looks incorrect, because m_pkthdr.len is already
incremented by M_PREPEND().

Sponsored by:	Yandex LLC
2014-12-04 05:02:12 +00:00
Andrey V. Elsukov
18961126cb Remove __P() macro.
Suggested by:	kevlo
Sponsored by:	Yandex LLC
2014-12-03 04:08:41 +00:00
Andrey V. Elsukov
2e84e6eac9 ANSIfy function declarations.
Sponsored by:	Yandex LLC
2014-12-03 03:50:54 +00:00
Andrey V. Elsukov
bd766f3425 Remove unneded check. No need to do m_pullup to the size that we prepended.
Sponsored by:	Yandex LLC
2014-12-02 05:28:40 +00:00
Andrey V. Elsukov
2d957916ef Remove route chaching support from ipsec code. It isn't used for some time.
* remove sa_route_union declaration and route_cache member from struct secashead;
* remove key_sa_routechange() call from ICMP and ICMPv6 code;
* simplify ip_ipsec_mtu();
* remove #include <net/route.h>;

Sponsored by:	Yandex LLC
2014-12-02 04:20:50 +00:00
Andrey V. Elsukov
1fea1b0889 Remove unused structure declarations.
Sponsored by:	Yandex LLC
2014-12-02 02:41:44 +00:00
Andrey V. Elsukov
0e23cc372d Remove unused declartations.
Sponsored by:	Yandex LLC
2014-12-02 02:32:28 +00:00
Andrey V. Elsukov
ffbf9cdeb6 Remove ip4_input() declaration. It was removed in r275133.
MFC after:	1 month
2014-11-27 00:27:39 +00:00
Andrey V. Elsukov
b05765d75f Do not use xform_ipip as decapsulation fallback.
xform_ipip was used as fallback with low priority for IPIP
encapsulated packets that were decrypted. In some cases
it can decapsulate packets, that it shouldn't. This leads to situations,
when wrong configurations are magically working. Also it can propagate
wrong ingress interface and this can break security.

Now we redesigned the IPSEC code and IPIP encapsulation is called directly
from ipsec_output, and decapsulation is done in the ipsec_input with m_striphdr.

Differential Revision:	https://reviews.freebsd.org/D1220
MFC after:	1 month
Sponsored by:	Yandex LLC
2014-11-26 17:44:49 +00:00
Andrey V. Elsukov
f9d8f66552 Count statistics for the specific address family.
MFC after:	1 week
Sponsored by:	Yandex LLC
2014-11-13 12:58:33 +00:00
Andrey V. Elsukov
612faae7a2 Strip IP header only when we act in tunnel mode.
MFC after:	1 week
Sponsored by:	Yandex LLC
2014-11-13 10:48:59 +00:00
Andrey V. Elsukov
ab2164e0b5 Remove redundant ip6_plen initialization.
MFC after:	1 week
Sponsored by:	Yandex LLC
2014-11-13 10:47:24 +00:00
Andrey V. Elsukov
67fd172767 ipsec6_process_packet is called before ip6_output fixes ip6_plen.
Update ip6_plen before bpf processing to be able see correct value.

MFC after:	1 week
Sponsored by:	Yandex LLC
2014-11-12 22:51:30 +00:00
Andrey V. Elsukov
f3c93842bf Fix ips_out_nosa errors accounting.
MFC after:	1 week
Sponsored by:	Yandex LLC
2014-11-12 14:00:49 +00:00
Andrey V. Elsukov
b6e1ad3a3a Pass mbuf to pfil processing before stripping outer IP header as it
is described in if_enc(4).

MFC after:	2 week
Sponsored by:	Yandex LLC
2014-11-07 12:05:20 +00:00
Gleb Smirnoff
6df8a71067 Remove SYSCTL_VNET_* macros, and simply put CTLFLAG_VNET where needed.
Sponsored by:	Nginx, Inc.
2014-11-07 09:39:05 +00:00
Andrey V. Elsukov
1f194d8ae1 When mode isn't explicitly specified (wildcard) and inner protocol isn't
IPv4 or IPv6, assume it is the transport mode.

Reported by:	jmg
MFC after:	1 week
Sponsored by:	Yandex LLC
2014-11-06 20:23:57 +00:00
Andrey V. Elsukov
f5196a58a0 Use in_localip() instead of handmade implementation.
MFC after:	1 week
Sponsored by:	Yandex LLC
2014-10-31 12:19:22 +00:00
John Baldwin
a4432e6bf7 Use a static callout to drive key_timehandler() instead of timeout().
While here, make key_timehandler() private to key.c.

Submitted by:	bz (2)
Tested by:	bz
2014-10-23 20:43:16 +00:00
Hans Petter Selasky
f0188618f2 Fix multiple incorrect SYSCTL arguments in the kernel:
- Wrong integer type was specified.

- Wrong or missing "access" specifier. The "access" specifier
sometimes included the SYSCTL type, which it should not, except for
procedural SYSCTL nodes.

- Logical OR where binary OR was expected.

- Properly assert the "access" argument passed to all SYSCTL macros,
using the CTASSERT macro. This applies to both static- and dynamically
created SYSCTLs.

- Properly assert the the data type for both static and dynamic
SYSCTLs. In the case of static SYSCTLs we only assert that the data
pointed to by the SYSCTL data pointer has the correct size, hence
there is no easy way to assert types in the C language outside a
C-function.

- Rewrote some code which doesn't pass a constant "access" specifier
when creating dynamic SYSCTL nodes, which is now a requirement.

- Updated "EXAMPLES" section in SYSCTL manual page.

MFC after:	3 days
Sponsored by:	Mellanox Technologies
2014-10-21 07:31:21 +00:00
Andrey V. Elsukov
a28b277a9f Do not strip outer header when operating in transport mode.
Instead requeue mbuf back to IPv4 protocol handler. If there is one extra IP-IP
encapsulation, it will be handled with tunneling interface. And thus proper
interface will be exposed into mbuf's rcvif. Also, tcpdump that listens on tunneling
interface will see packets in both directions.

Sponsored by:	Yandex LLC
2014-10-02 02:00:21 +00:00
Gleb Smirnoff
6ff8af1ca5 Mechanically convert to if_inc_counter(). 2014-09-19 10:18:14 +00:00
Kevin Lo
73d76e77b6 Change pr_output's prototype to avoid the need for explicit casts.
This is a follow up to r269699.

Phabric:	D564
Reviewed by:	jhb
2014-08-15 02:43:02 +00:00
Kevin Lo
8f5a8818f5 Merge 'struct ip6protosw' and 'struct protosw' into one. Now we have
only one protocol switch structure that is shared between ipv4 and ipv6.

Phabric:	D476
Reviewed by:	jhb
2014-08-08 01:57:15 +00:00
Gleb Smirnoff
fcc34a238c Fix style bug: rename the refcount field of m_ext to ext_cnt, to match
other members.

Sponsored by:	Nginx, Inc.
2014-07-11 14:34:29 +00:00
Marko Zec
b01e3d0802 The assumption in ipsec4_process_packet() that the payload may be
only IPv4 is wrong, so check the IP version before mangling the
payload header.
2014-07-01 08:02:25 +00:00
Bjoern A. Zeeb
6d120f909c Use IPv4 statistics in ipsec4_process_packet() rather than the IPv6
version.  This also unbreaks the NOINET6 builds after r266800.
2014-05-28 23:01:20 +00:00
VANHULLEBUS Yvan
aaf2cfc0d6 Fixed IPv4-in-IPv6 and IPv6-in-IPv4 IPsec tunnels.
For IPv6-in-IPv4, you may need to do the following command
on the tunnel interface if it is configured as IPv4 only:
ifconfig <interface> inet6 -ifdisabled

Code logic inspired from NetBSD.

PR: kern/169438
Submitted by: emeric.poupon@netasq.com
Reviewed by: fabient, ae
Obtained from: NETASQ
2014-05-28 12:45:27 +00:00
Bjoern A. Zeeb
799653be1c Only do a ports check if this is a NAT-T SA. Otherwise other
lookups providing ports may get unexpected results.

MFC After:	2 weeks
2014-05-24 09:29:23 +00:00
Andrey V. Elsukov
cd92c2e1ec Remove _IP_VHL* macros and related ifdefs.
MFC after:	1 week
2014-04-16 05:31:54 +00:00
Andrey V. Elsukov
e064642c9a The check for local address spoofing lacks ifaddr locking.
Remove these loops and use in_localip() and in6_localip()
functions instead.

MFC after:	1 week
Sponsored by:	Yandex LLC
2014-04-04 16:58:32 +00:00
Andrey V. Elsukov
b48f1835a7 Remove unused variable.
MFC after:	1 week
Sponsored by:	Yandex LLC
2014-04-04 15:57:27 +00:00
Andrey V. Elsukov
c91a8e0ebe Remove dead code.
MFC after:	1 week
Sponsored by:	Yandex LLC
2014-04-04 15:55:38 +00:00
John Baldwin
5b26ea5df3 Remove more constants related to static sysctl nodes. The MAXID constants
were primarily used to size the sysctl name list macros that were removed
in r254295.  A few other constants either did not have an associated
sysctl node, or the associated node used OID_AUTO instead.

PR:		ports/184525 (exp-run)
2014-02-25 18:44:33 +00:00
Andrey V. Elsukov
00a689c438 Initialize prot variable.
PR:		177417
MFC after:	1 week
2013-11-11 13:19:55 +00:00
Gleb Smirnoff
eedc7fd9e8 Provide includes that are needed in these files, and before were read
in implicitly via if.h -> if_var.h pollution.

Sponsored by:	Netflix
Sponsored by:	Nginx, Inc.
2013-10-26 18:18:50 +00:00