Commit Graph

477 Commits

Author SHA1 Message Date
Andrey V. Elsukov
6f9e437bdc Fix possible double releasing for SA reference.
This is missing part of r318734. When crypto subsystem returns error
the xform code handles an error independently.

PR:		221849
MFC after:	5 days
2017-09-01 11:51:07 +00:00
Andrey V. Elsukov
c32396978e Remove stale comments.
MFC after:	1 week
2017-08-21 13:54:29 +00:00
Andrey V. Elsukov
22bbefb2c9 Fix the regression introduced in r275710.
When a security policy should match TCP connection with specific ports,
the SYN+ACK segment send by syncache_respond() is considered as forwarded
packet, because at this moment TCP connection does not have PCB structure,
and ip_output() is called without inpcb pointer. In this case SPIDX filled
for SP lookup will not contain TCP ports and security policy will not
be found. This can lead to unencrypted SYN+ACK on the wire.

This patch restores the old behavior, when ports will not be filled only
for forwarded packets.

Reported by:	Dewayne Geraghty <dewayne.geraghty at heuristicsystems.com.au>
MFC after:	1 week
2017-08-21 13:52:21 +00:00
Andrey V. Elsukov
e54647920b Make user supplied data checks a bit stricter.
key_msg2sp() is used for parsing data from setsockopt(IP[V6]_IPSEC_POLICY)
call. This socket option is usually used to configure IPsec bypass for
socket. Only privileged user can set this socket option.
The message syntax is described here
	http://www.kame.net/newsletter/20021210/

and our libipsec is usually used to create the correct request.
Add additional checks:
* that sadb_x_ipsecrequest_len is not out of bounds of user supplied buffer
* that src/dst's sa_len is the same
* that 2*sa_len is not out of bounds of user supplied buffer
* that 2*sa_len fits into bounds of sadb_x_ipsecrequest

Reported by:	Ilja van Sprundel
MFC after:	1 week
Differential Revision:	https://reviews.freebsd.org/D11796
2017-08-09 19:58:38 +00:00
Andrey V. Elsukov
1a01e0e7ac Add inpcb pointer to struct ipsec_ctx_data and pass it to the pfil hook
from enc_hhook().

This should solve the problem when pf is used with if_enc(4) interface,
and outbound packet with existing PCB checked by pf, and this leads to
deadlock due to pf does its own PCB lookup and tries to take rlock when
wlock is already held.

Now we pass PCB pointer if it is known to the pfil hook, this helps to
avoid extra PCB lookup and thus rlock acquiring is not needed.
For inbound packets it is safe to pass NULL, because we do not held any
PCB locks yet.

PR:		220217
MFC after:	3 weeks
Sponsored by:	Yandex LLC
2017-07-31 11:04:35 +00:00
Andrey V. Elsukov
8e5060a03e Build kdebug_secreplay() function only when IPSEC_DEBUG is defined.
This should fix the build on sparc.

Reported by:	emaste
X-MFC with:	r319118
2017-06-01 10:04:12 +00:00
Andrey V. Elsukov
7f1f65918b Disable IPsec debugging code by default when IPSEC_DEBUG kernel option
is not specified.

Due to the long call chain IPsec code can produce the kernel stack
exhaustion on the i386 architecture. The debugging code usually is not
used, but it requires a lot of stack space to keep buffers for strings
formatting. This patch conditionally defines macros to disable building
of IPsec debugging code.

IPsec currently has two sysctl variables to configure debug output:
 * net.key.debug variable is used to enable debug output for PF_KEY
   protocol. Such debug messages are produced by KEYDBG() macro and
   usually they can be interesting for developers.
 * net.inet.ipsec.debug variable is used to enable debug output for
   DPRINTF() macro and ipseclog() function. DPRINTF() macro usually
   is used for development debugging. ipseclog() function is used for
   debugging by administrator.

The patch disables KEYDBG() and DPRINTF() macros, and formatting buffers
declarations when IPSEC_DEBUG is not present in kernel config. This reduces
stack requirement for up to several hundreds of bytes.
The net.inet.ipsec.debug variable still can be used to enable ipseclog()
messages by administrator.

PR:		219476
Reported by:	eugen
No objection from:	#network
MFC after:	1 week
Differential Revision:	https://reviews.freebsd.org/D10869
2017-05-29 09:30:38 +00:00
Andrey V. Elsukov
3aee70991d Fix possible double releasing for SA and SP references.
There are two possible ways how crypto callback are called: directly from
caller and deffered from crypto thread.

For outbound packets the direct call chain is the following:
 IPSEC_OUTPUT() method -> ipsec[46]_common_output() ->
 -> ipsec[46]_perform_request() -> xform_output() ->
 -> crypto_dispatch() -> crypto_invoke() -> crypto_done() ->
 -> xform_output_cb() -> ipsec_process_done() -> ip[6]_output().

The SA and SP references are held while crypto processing is not finished.
The error handling code wrongly expected that crypto callback always called
from the crypto thread context, and it did references releasing in
xform_output_cb(). But when the crypto callback called directly, in case of
error the error handling code in ipsec[46]_perform_request() also did
references releasing.

To fix this, remove error handling from ipsec[46]_perform_request() and do it
in xform_output() before crypto_dispatch().

MFC after:	10 days
2017-05-23 09:32:26 +00:00
Andrey V. Elsukov
5f7c516f21 Fix possible double releasing for SA reference.
There are two possible ways how crypto callback are called: directly from
caller and deffered from crypto thread.

For inbound packets the direct call chain is the following:
 IPSEC_INPUT() method -> ipsec_common_input() -> xform_input() ->
 -> crypto_dispatch() -> crypto_invoke() -> crypto_done() ->
 -> xform_input_cb() -> ipsec[46]_common_input_cb() -> netisr_queue().

The SA reference is held while crypto processing is not finished.
The error handling code wrongly expected that crypto callback always called
from the crypto thread context, and it did SA reference releasing in
xform_input_cb(). But when the crypto callback called directly, in case of
error (e.g. data authentification failed) the error handling in
ipsec_common_input() also did SA reference releasing.

To fix this, remove error handling from ipsec_common_input() and do it
in xform_input() before crypto_dispatch().

PR:		219356
MFC after:	10 days
2017-05-23 09:01:48 +00:00
Ed Maste
3e85b721d6 Remove register keyword from sys/ and ANSIfy prototypes
A long long time ago the register keyword told the compiler to store
the corresponding variable in a CPU register, but it is not relevant
for any compiler used in the FreeBSD world today.

ANSIfy related prototypes while here.

Reviewed by:	cem, jhb
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D10193
2017-05-17 00:34:34 +00:00
Andrey V. Elsukov
44c6ff8e2a Fix SP refcount leak.
PCB SP cache acquires extra reference, when SP is stored in the cache.
Release this reference when PCB is destroyed in ipsec_delete_pcbpolicy().
In ipsec_copy_pcbpolicy() release reference to SP in case if sp_in or
sp_out are not NULL.

Reported by:	Slawa Olhovchenkov <slw at zxy spb ru>
MFC after:	1 week
2017-04-26 00:34:05 +00:00
Andrey V. Elsukov
4e0e8f3107 Add large replay widow support to setkey(8) and libipsec.
When the replay window size is large than UINT8_MAX, add to the request
the SADB_X_EXT_SA_REPLAY extension header that was added in r309144.

Also add support of SADB_X_EXT_NAT_T_TYPE, SADB_X_EXT_NAT_T_SPORT,
SADB_X_EXT_NAT_T_DPORT, SADB_X_EXT_NAT_T_OAI, SADB_X_EXT_NAT_T_OAR,
SADB_X_EXT_SA_REPLAY, SADB_X_EXT_NEW_ADDRESS_SRC, SADB_X_EXT_NEW_ADDRESS_DST
extension headers to the key_debug that is used by `setkey -x`.

Modify kdebug_sockaddr() to use inet_ntop() for IP addresses formatting.
And modify kdebug_sadb_x_policy() to show policy scope and priority.

Reviewed by:	gnn, Emeric Poupon
MFC after:	2 weeks
Differential Revision:	https://reviews.freebsd.org/D10375
2017-04-13 14:44:17 +00:00
Andrey V. Elsukov
9c2b99b912 When we are doing SA lookup for TCP-MD5, check both source and
destination addresses. Previous code has used only destination address
for lookup. But for inbound packets the source address was used as SA
destination address. Thus only outbound SA were used for both directions.
Now we use addresses from a packet as is, thus SAs for both directions are
needed.

Reported by:	Mike Tancsa
MFC after:	1 week
2017-04-04 13:41:50 +00:00
Andrey V. Elsukov
db3b3ec5b4 GC some unused declarations.
MFC after:	1 week
2017-04-03 04:44:56 +00:00
Andrey V. Elsukov
8291fb89cf Fix bug in r308972 that leads to panic when non-compressed IPComp
packet is received.

Reported by:	Denis Ahrens <denis h3q com>
MFC after:	3 days
2017-03-29 10:24:48 +00:00
Andrey V. Elsukov
22986c6740 Introduce the concept of IPsec security policies scope.
Currently are defined three scopes: global, ifnet, and pcb.
Generic security policies that IKE daemon can add via PF_KEY interface
or an administrator creates with setkey(8) utility have GLOBAL scope.
Such policies can be applied by the kernel to outgoing packets and checked
agains inbound packets after IPsec processing.
Security policies created by if_ipsec(4) interfaces have IFNET scope.
Such policies are applied to packets that are passed through if_ipsec(4)
interface.
And security policies created by application using setsockopt()
IP_IPSEC_POLICY option have PCB scope. Such policies are applied to
packets related to specific socket. Currently there is no way to list
PCB policies via setkey(8) utility.

Modify setkey(8) and libipsec(3) to be able distinguish the scope of
security policies in the `setkey -DP` listing. Add two optional flags:
'-t' to list only policies related to virtual *tunneling* interfaces,
i.e. policies with IFNET scope, and '-g' to list only policies with GLOBAL
scope. By default policies from all scopes are listed.

To implement this PF_KEY's sadb_x_policy structure was modified.
sadb_x_policy_reserved field is used to pass the policy scope from the
kernel to userland. SADB_SPDDUMP message extended to support filtering
by scope: sadb_msg_satype field is used to specify bit mask of requested
scopes.

For IFNET policies the sadb_x_policy_priority field of struct sadb_x_policy
is used to pass if_ipsec's interface if_index to the userland. For GLOBAL
policies sadb_x_policy_priority is used only to manage order of security
policies in the SPDB. For IFNET policies it is not used, so it can be used
to keep if_index.

After this change the output of `setkey -DP` now looks like:
# setkey -DPt
0.0.0.0/0[any] 0.0.0.0/0[any] any
	in ipsec
	esp/tunnel/87.250.242.144-87.250.242.145/unique:145
	spid=7 seq=3 pid=58025 scope=ifnet ifname=ipsec0
	refcnt=1
# setkey -DPg
::/0 ::/0 icmp6 135,0
	out none
	spid=5 seq=1 pid=872 scope=global
	refcnt=1

No objection from:	#network
Obtained from:	Yandex LLC
MFC after:	2 weeks
Sponsored by:	Yandex LLC
Differential Revision:	https://reviews.freebsd.org/D9805
2017-03-07 00:13:53 +00:00
Andrey V. Elsukov
fbed6d606a For translated packets do not adjust UDP checksum if it is zero.
In case when decrypted and decapsulated packet is an UDP datagram,
check that its checksum is not zero before doing incremental checksum
adjustment.

Reported by:	Eugene Grosbein
Tested by:	Eugene Grosbein
2017-02-18 19:53:37 +00:00
Andrey V. Elsukov
ddc9f8e87f Fix LINT build for powerpc.
Build kernel modules support only when both IPSEC and TCP_SIGNATURE
are not defined.

Reported by:	emaste
2017-02-16 11:38:50 +00:00
Gleb Smirnoff
cfff3743cd Move tcp_fields_to_net() static inline into tcp_var.h, just below its
friend tcp_fields_to_host(). There is third party code that also uses
this inline.

Reviewed by:	ae
2017-02-10 17:46:26 +00:00
Andrey V. Elsukov
fcf596178b Merge projects/ipsec into head/.
Small summary
 -------------

o Almost all IPsec releated code was moved into sys/netipsec.
o New kernel modules added: ipsec.ko and tcpmd5.ko. New kernel
  option IPSEC_SUPPORT added. It enables support for loading
  and unloading of ipsec.ko and tcpmd5.ko kernel modules.
o IPSEC_NAT_T option was removed. Now NAT-T support is enabled by
  default. The UDP_ENCAP_ESPINUDP_NON_IKE encapsulation type
  support was removed. Added TCP/UDP checksum handling for
  inbound packets that were decapsulated by transport mode SAs.
  setkey(8) modified to show run-time NAT-T configuration of SA.
o New network pseudo interface if_ipsec(4) added. For now it is
  build as part of ipsec.ko module (or with IPSEC kernel).
  It implements IPsec virtual tunnels to create route-based VPNs.
o The network stack now invokes IPsec functions using special
  methods. The only one header file <netipsec/ipsec_support.h>
  should be included to declare all the needed things to work
  with IPsec.
o All IPsec protocols handlers (ESP/AH/IPCOMP protosw) were removed.
  Now these protocols are handled directly via IPsec methods.
o TCP_SIGNATURE support was reworked to be more close to RFC.
o PF_KEY SADB was reworked:
  - now all security associations stored in the single SPI namespace,
    and all SAs MUST have unique SPI.
  - several hash tables added to speed up lookups in SADB.
  - SADB now uses rmlock to protect access, and concurrent threads
    can do SA lookups in the same time.
  - many PF_KEY message handlers were reworked to reflect changes
    in SADB.
  - SADB_UPDATE message was extended to support new PF_KEY headers:
    SADB_X_EXT_NEW_ADDRESS_SRC and SADB_X_EXT_NEW_ADDRESS_DST. They
    can be used by IKE daemon to change SA addresses.
o ipsecrequest and secpolicy structures were cardinally changed to
  avoid locking protection for ipsecrequest. Now we support
  only limited number (4) of bundled SAs, but they are supported
  for both INET and INET6.
o INPCB security policy cache was introduced. Each PCB now caches
  used security policies to avoid SP lookup for each packet.
o For inbound security policies added the mode, when the kernel does
  check for full history of applied IPsec transforms.
o References counting rules for security policies and security
  associations were changed. The proper SA locking added into xform
  code.
o xform code was also changed. Now it is possible to unregister xforms.
  tdb_xxx structures were changed and renamed to reflect changes in
  SADB/SPDB, and changed rules for locking and refcounting.

Reviewed by:	gnn, wblock
Obtained from:	Yandex LLC
Relnotes:	yes
Sponsored by:	Yandex LLC
Differential Revision:	https://reviews.freebsd.org/D9352
2017-02-06 08:49:57 +00:00
Andrey V. Elsukov
18b105c27b Add direction argument to ipsec_setspidx_inpcb() function.
This function is used only by ipsec_getpolicybysock() to fill security
policy index selector for locally generated packets (that have INPCB).
The function incorrectly assumes that spidx is the same for both directions.
Fix this by using new direction argument to specify correct INPCB security
policy - sp_in or sp_out. There is no need to fill both policy indeces,
because they are overwritten for each packet.
This fixes security policy matching for outbound packets when user has
specified TCP/UDP ports in the security policy upperspec.

PR:		213869
MFC after:	1 week
2017-01-08 12:40:07 +00:00
Scott Long
b84ef73179 Add a missing header 2016-11-26 23:15:11 +00:00
Ed Maste
79b4dcad2b netipsec: fix build after 309144
Reported by:	rakuco
2016-11-26 00:59:01 +00:00
Fabien Thomas
bf4356266d IPsec RFC6479 support for replay window sizes up to 2^32 - 32 packets.
Since the previous algorithm, based on bit shifting, does not scale
with large replay windows, the algorithm used here is based on
RFC 6479: IPsec Anti-Replay Algorithm without Bit Shifting.
The replay window will be fast to be updated, but will cost as many bits
in RAM as its size.

The previous implementation did not provide a lock on the replay window,
which may lead to replay issues.

Reviewed by:	ae
Obtained from:	emeric.poupon@stormshield.eu
Sponsored by:	Stormshield
Differential Revision:	https://reviews.freebsd.org/D8468
2016-11-25 14:44:49 +00:00
Kevin Lo
c3bef61e58 Remove the 4.3BSD compatible macro m_copy(), use m_copym() instead.
Reviewed by:	gnn
Differential Revision:	https://reviews.freebsd.org/D7878
2016-09-15 07:41:48 +00:00
Andrey V. Elsukov
0c127808dd Remove redundant sanity checks from ipsec[46]_common_input_cb().
This check already has been done in the each protocol callback.
2016-08-31 11:51:52 +00:00
Bjoern A. Zeeb
89856f7e2d Get closer to a VIMAGE network stack teardown from top to bottom rather
than removing the network interfaces first. This change is rather larger
and convoluted as the ordering requirements cannot be separated.

Move the pfil(9) framework to SI_SUB_PROTO_PFIL, move Firewalls and
related modules to their own SI_SUB_PROTO_FIREWALL.
Move initialization of "physical" interfaces to SI_SUB_DRIVERS,
move virtual (cloned) interfaces to SI_SUB_PSEUDO.
Move Multicast to SI_SUB_PROTO_MC.

Re-work parts of multicast initialisation and teardown, not taking the
huge amount of memory into account if used as a module yet.

For interface teardown we try to do as many of them as we can on
SI_SUB_INIT_IF, but for some this makes no sense, e.g., when tunnelling
over a higher layer protocol such as IP. In that case the interface
has to go along (or before) the higher layer protocol is shutdown.

Kernel hhooks need to go last on teardown as they may be used at various
higher layers and we cannot remove them before we cleaned up the higher
layers.

For interface teardown there are multiple paths:
(a) a cloned interface is destroyed (inside a VIMAGE or in the base system),
(b) any interface is moved from a virtual network stack to a different
network stack ("vmove"), or (c) a virtual network stack is being shut down.
All code paths go through if_detach_internal() where we, depending on the
vmove flag or the vnet state, make a decision on how much to shut down;
in case we are destroying a VNET the individual protocol layers will
cleanup their own parts thus we cannot do so again for each interface as
we end up with, e.g., double-frees, destroying locks twice or acquiring
already destroyed locks.
When calling into protocol cleanups we equally have to tell them
whether they need to detach upper layer protocols ("ulp") or not
(e.g., in6_ifdetach()).

Provide or enahnce helper functions to do proper cleanup at a protocol
rather than at an interface level.

Approved by:		re (hrs)
Obtained from:		projects/vnet
Reviewed by:		gnn, jhb
Sponsored by:		The FreeBSD Foundation
MFC after:		2 weeks
Differential Revision:	https://reviews.freebsd.org/D6747
2016-06-21 13:48:49 +00:00
Conrad Meyer
ab11d379fa netipsec: Fix minor style nit
Coverity points out that 'continue' is equivalent to 'break' in a do {}
while(false) loop.

Reported by:	Coverity
CID:		1354983
Sponsored by:	EMC / Isilon Storage Division
2016-05-10 20:14:11 +00:00
Pedro F. Giffuni
a4641f4eaa sys/net*: minor spelling fixes.
No functional change.
2016-05-03 18:05:43 +00:00
Conrad Meyer
5a8c498f2e netipsec: Don't leak memory when deep copy fails
Reported by:	Coverity
CID:		1331693
Sponsored by:	EMC / Isilon Storage Division
2016-04-26 23:23:44 +00:00
Andrey V. Elsukov
0d05666462 Fix build for NOINET and NOINET6 kernels.
Use own protosw structures for both address families.
Check proto in encapcheck function and use -1 as proto argument in
encap_attach_func(), both address families can have IPPROTO_IPV4
and IPPROTO_IPV6 protocols.

Reported by:	bz
2016-04-24 17:09:51 +00:00
Andrey V. Elsukov
2ada524ece Use ipsec_address() function to print IP addresses. 2016-04-24 09:05:29 +00:00
Andrey V. Elsukov
3cbd4ec3e4 Handle non-compressed packets for IPComp in tunnel mode.
RFC3173 says that the IP datagram MUST be sent in the original
non-compressed form, when the total size of a compressed payload
and the IPComp header is not smaller than the size of the original
payload. In tunnel mode for small packets IPComp will send
encapsulated IP datagrams without IPComp header.
Add ip_encap handler for IPPROTO_IPV4 and IPPROTO_IPV6 to handle
these datagrams. The handler does lookup for SA related to IPComp
protocol and given from mbuf source and destination addresses as
tunnel endpoints. It decapsulates packets only when corresponding SA
is found.

Reported by:	gnn
Reviewed by:	gnn
Differential Revision:	https://reviews.freebsd.org/D6062
2016-04-24 09:02:17 +00:00
Andrey V. Elsukov
61a292a396 Remove stale function declaration 2016-04-21 11:02:06 +00:00
Andrey V. Elsukov
efb10c3ce7 Constify mbuf pointer for IPSEC functions where mbuf isn't modified. 2016-04-21 10:58:07 +00:00
Pedro F. Giffuni
02abd40029 kernel: use our nitems() macro when it is available through param.h.
No functional change, only trivial cases are done in this sweep,

Discussed in:	freebsd-current
2016-04-19 23:48:27 +00:00
Pedro F. Giffuni
155d72c498 sys/net* : for pointers replace 0 with NULL.
Mostly cosmetical, no functional change.

Found with devel/coccinelle.
2016-04-15 17:30:33 +00:00
Andrey V. Elsukov
6f814d0ec1 Fix handling of net.inet.ipsec.dfbit=2 variable.
IP_DF macro is in host bytes order, but ip_off field is in network bytes
order. So, use htons() for correct check.
2016-03-18 09:03:00 +00:00
Robert Watson
bc84fc89be Put IPSec's anouncement of its successful intialisation under bootverbose:
now that it's a default kernel option, we don't really need to tell the
world about it on every boot, especially as it won't be used by most users.
2016-03-13 19:27:46 +00:00
Mark Johnston
3543e138e5 Set tres to NULL to avoid a double free if the m_pullup() below fails.
Reviewed by:	glebius
MFC after:	1 week
Differential Revision:	https://reviews.freebsd.org/D5497
2016-03-02 05:04:04 +00:00
Andrey V. Elsukov
b833ff5ab4 Fix useless check. m_pkthdr.len should be equal to orglen.
MFC after:	2 weeks
2016-02-24 12:28:49 +00:00
Gleb Smirnoff
8ec07310fa These files were getting sys/malloc.h and vm/uma.h with header pollution
via sys/mbuf.h
2016-02-01 17:41:21 +00:00
Andrey V. Elsukov
ef91a9765d Overhaul if_enc(4) and make it loadable in run-time.
Use hhook(9) framework to achieve ability of loading and unloading
if_enc(4) kernel module. INET and INET6 code on initialization registers
two helper hooks points in the kernel. if_enc(4) module uses these helper
hook points and registers its hooks. IPSEC code uses these hhook points
to call helper hooks implemented in if_enc(4).
2015-11-25 07:31:59 +00:00
Fabien Thomas
d6d3f24890 Implement the sadb_x_policy_priority field as it is done in Linux:
lower priority policies are inserted first.

Submitted by:	Emeric Poupon <emeric.poupon@stormshield.eu>
Reviewed by:	ae
Sponsored by:	Stormshield
2015-11-17 14:39:33 +00:00
Andrey V. Elsukov
0c80e7df43 Use explicitly specified ivsize instead of blocksize when we mean IV size.
Set zero ivsize for enc_xform_null and remove special handling from
xform_esp.c.

Reviewed by:	gnn
Differential Revision:	https://reviews.freebsd.org/D1503
2015-11-16 07:10:42 +00:00
George V. Neville-Neil
26882b4239 Turning on IPSEC used to introduce a slight amount of performance
degradation (7%) for host host TCP connections over 10Gbps links,
even when there were no secuirty policies in place. There is no
change in performance on 1Gbps network links. Testing GENERIC vs.
GENERIC-NOIPSEC vs. GENERIC with this change shows that the new
code removes any overhead introduced by having IPSEC always in the
kernel.

Differential Revision:	D3993
MFC after:	1 month
Sponsored by:	Rubicon Communications (Netgate)
2015-10-27 00:42:15 +00:00
Andrey V. Elsukov
f367798498 Take extra reference to security policy before calling crypto_dispatch().
Currently we perform crypto requests for IPSEC synchronous for most of
crypto providers (software, aesni) and only VIA padlock calls crypto
callback asynchronous. In synchronous mode it is possible, that security
policy will be removed during the processing crypto request. And crypto
callback will release the last reference to SP. Then upon return into
ipsec[46]_process_packet() IPSECREQUEST_UNLOCK() will be called to already
freed request. To prevent this we will take extra reference to SP.

PR:		201876
Sponsored by:	Yandex LLC
2015-09-30 08:16:33 +00:00
John-Mark Gurney
a2bc81bf7c Make IPsec work with AES-GCM and AES-ICM (aka CTR) in OCF... IPsec
defines the keys differently than NIST does, so we have to muck with
key lengths and nonce/IVs to be standard compliant...

Remove the iv from secasvar as it was unused...

Add a counter protected by a mutex to ensure that the counter for GCM
and ICM will never be repeated..  This is a requirement for security..
I would use atomics, but we don't have a 64bit one on all platforms..

Fix a bug where IPsec was depending upon the OCF to ensure that the
blocksize was always at least 4 bytes to maintain alignment... Move
this logic into IPsec so changes to OCF won't break IPsec...

In one place, espx was always non-NULL, so don't test that it's
non-NULL before doing work..

minor style cleanups...

drop setting key and klen as they were not used...

Enforce that OCF won't pass invalid key lengths to AES that would
panic the machine...

This was has been tested by others too...  I tested this against
NetBSD 6.1.5 using mini-test suite in
https://github.com/jmgurney/ipseccfgs and the only things that don't
pass are keyed md5 and sha1, and 3des-deriv (setkey syntax error),
all other modes listed in setkey's man page...  The nice thing is
that NetBSD uses setkey, so same config files were used on both...

Reviewed by:	gnn
2015-08-04 17:47:11 +00:00
John-Mark Gurney
42e5fcbf2b these are comparing authenticators and need to be constant time...
This could be a side channel attack...  Now that we have a function
for this, use it...

jmgurney/ipsecgcm:	24d704cc and 7f37a14
2015-07-31 00:31:52 +00:00
John-Mark Gurney
817c7ed900 Clean up this header file...
use CTASSERTs now that we have them...

Replace a draft w/ RFC that's over 10 years old.

Note that _AALG and _EALG do not need to match what the IKE daemons
think they should be..  This is part of the KABI...  I decided to
renumber AESCTR, but since we've never had working AESCTR mode, I'm
not really breaking anything..  and it shortens a loop by quite
a bit..

remove SKIPJACK IPsec support...  SKIPJACK never made it out of draft
(in 1999), only has 80bit key, NIST recommended it stop being used
after 2010, and setkey nor any of the IKE daemons I checked supported
it...

jmgurney/ipsecgcm: a357a33, c75808b, e008669, b27b6d6

Reviewed by:	gnn (earlier version)
2015-07-31 00:23:21 +00:00
Ermal Luçi
59959de526 Correct IPSec SA statistic keeping
The IPsec SA statistic keeping is used even for decision making on expiry/rekeying SAs.
When there are multiple transformations being done the statistic keeping might be wrong.

This mostly impacts multiple encapsulations on IPsec since the usual scenario it is not noticed due to the code path not taken.

Differential Revision:	https://reviews.freebsd.org/D3239
Reviewed by:		ae, gnn
Approved by:		gnn(mentor)
2015-07-30 20:56:27 +00:00
John-Mark Gurney
a09a7146a7 RFC4868 section 2.3 requires that the output be half... This fixes
problems that was introduced in r285336...  I have verified that
HMAC-SHA2-256 both ah only and w/ AES-CBC interoperate w/ a NetBSD
6.1.5 vm...

Reviewed by:	gnn
2015-07-29 07:15:16 +00:00
Ermal Luçi
705f4d9c6a IPSEC, remove variable argument function its already due.
Differential Revision:		https://reviews.freebsd.org/D3080
Reviewed by:	gnn, ae
Approved by:	gnn(mentor)
2015-07-21 21:46:24 +00:00
George V. Neville-Neil
0b75d21e18 Summary: Fix LINT build. The names of the new AES modes were not
correctly used under the REGRESSION kernel option.
2015-07-10 02:23:50 +00:00
George V. Neville-Neil
16de9ac1b5 Add support for AES modes to IPSec. These modes work both in software only
mode and with hardware support on systems that have AESNI instructions.

Differential Revision:	D2936
Reviewed by:	jmg, eri, cognet
Sponsored by:	Rubicon Communications (Netgate)
2015-07-09 18:16:35 +00:00
Andrey V. Elsukov
280d77a3bb Fill the port and protocol information in the SADB_ACQUIRE message
in case when security policy has it as required by RFC 2367.

PR:		192774
Differential Revision:	https://reviews.freebsd.org/D2972
MFC after:	1 week
2015-07-06 12:40:31 +00:00
Ermal Luçi
c1fc5e9601 Reduce overhead of IPSEC for traffic generated from host
When IPSEC is enabled on the kernel the forwarding path has an optimization to not enter the code paths
for checking security policies but first checks if there is any security policy active at all.

The patch introduces the same optimization but for traffic generated from the host itself.
This reduces the overhead by 50% on my tests for generated host traffic without and SP active.

Differential Revision:	https://reviews.freebsd.org/D2980
Reviewed by:	ae, gnn
Approved by:	gnn(mentor)
2015-07-03 15:31:56 +00:00
John-Mark Gurney
49672bcc54 drop key_sa_stir_iv as it isn't used...
Reviewed by:	eri, ae
2015-06-11 13:05:37 +00:00
Jung-uk Kim
fd90e2ed54 CALLOUT_MPSAFE has lost its meaning since r141428, i.e., for more than ten
years for head.  However, it is continuously misused as the mpsafe argument
for callout_init(9).  Deprecate the flag and clean up callout_init() calls
to make them more consistent.

Differential Revision:	https://reviews.freebsd.org/D2613
Reviewed by:	jhb
MFC after:	2 weeks
2015-05-22 17:05:21 +00:00
Andrey V. Elsukov
dc4ea824d4 In the reply to SADB_X_SPDGET message use the same sequence number that
was in the request. Some IKE deamons expect it will the same. Linux and
NetBSD also follow this behaviour.

PR:		137309
MFC after:	2 weeks
2015-05-20 11:59:53 +00:00
Andrey V. Elsukov
4ec5fcbe09 Remove unneded mbuf length adjustment, M_PREPEND() already did that.
PR:		139387
MFC after:	1 week
2015-05-19 17:14:27 +00:00
Andrey V. Elsukov
5bae2b34a4 Change SA's state before sending SADB_EXPIRE message. This state will
be reported to keying daemon.

MFC after:	2 weeks
2015-05-19 08:37:03 +00:00
Andrey V. Elsukov
664802113f Teach key_expire() send SADB_EXPIRE message with the SADB_EXT_LIFETIME_HARD
extension header type. The key_flush_sad() now will send SADB_EXPIRE
message when HARD lifetime expires. This is required by RFC 2367 and some
keying daemons rely on these messages. HARD lifetime messages have
precedence over SOFT lifetime messages, so now they will be checked first.
Also now SADB_EXPIRE messages will be send even the SA has not been used,
because keying daemons might want to rekey such SA.

PR:		200282, 200283
Submitted by:	Tobias Brunner <tobias at strongswan dot org>
MFC after:	2 weeks
2015-05-19 08:30:04 +00:00
George V. Neville-Neil
a9b294b433 Summary: Remove spurious, extra, next header comments.
Correct the name of the pad length field.
2015-05-15 18:04:49 +00:00
Andrey V. Elsukov
6508929bc2 Fix the comment. We will not do SPD lookup again, because
ip[6]_ipsec_output() will find PACKET_TAG_IPSEC_OUT_DONE mbuf tag.

Sponsored by:	Yandex LLC
2015-04-28 11:03:47 +00:00
Andrey V. Elsukov
574fde00be Since PFIL can change mbuf pointer, we should update pointers after
calling ipsec_filter().

Sponsored by:	Yandex LLC
2015-04-28 09:29:28 +00:00
Andrey V. Elsukov
a9b9f6b6c6 Make ipsec_in_reject() static. We use ipsec[46]_in_reject() instead.
Sponsored by:	Yandex LLC
2015-04-27 01:12:51 +00:00
Andrey V. Elsukov
3d80e82d60 Fix possible use after free due to security policy deletion.
When we are passing mbuf to IPSec processing via ipsec[46]_process_packet(),
we hold one reference to security policy and release it just after return
from this function. But IPSec processing can be deffered and when we release
reference to security policy after ipsec[46]_process_packet(), user can
delete this security policy from SPDB. And when IPSec processing will be
done, xform's callback function will do access to already freed memory.

To fix this move KEY_FREESP() into callback function. Now IPSec code will
release reference to SP after processing will be finished.

Differential Revision:	https://reviews.freebsd.org/D2324
No objections from:	#network
Sponsored by:	Yandex LLC
2015-04-27 00:55:56 +00:00
Andrey V. Elsukov
962ac6c727 Change ipsec_address() and ipsec_logsastr() functions to take two
additional arguments - buffer and size of this buffer.

ipsec_address() is used to convert sockaddr structure to presentation
format. The IPv6 part of this function returns pointer to the on-stack
buffer and at the moment when it will be used by caller, it becames
invalid. IPv4 version uses 4 static buffers and returns pointer to
new buffer each time when it called. But anyway it is still possible
to get corrupted data when several threads will use this function.

ipsec_logsastr() is used to format string about SA entry. It also
uses static buffer and has the same problem with concurrent threads.

To fix these problems add the buffer pointer and size of this
buffer to arguments. Now each caller will pass buffer and its size
to these functions. Also convert all places where these functions
are used (except disabled code).

And now ipsec_address() uses inet_ntop() function from libkern.

PR:		185996
Differential Revision:	https://reviews.freebsd.org/D2321
Reviewed by:	gnn
Sponsored by:	Yandex LLC
2015-04-18 16:58:33 +00:00
Andrey V. Elsukov
1d3b268c04 Requeue mbuf via netisr when we use IPSec tunnel mode and IPv6.
ipsec6_common_input_cb() uses partial copy of ip6_input() to parse
headers. But this isn't correct, when we use tunnel mode IPSec.

When we stripped outer IPv6 header from the decrypted packet, it
can become IPv4 packet and should be handled by ip_input. Also when
we use tunnel mode IPSec with IPv6 traffic, we should pass decrypted
packet with inner IPv6 header to ip6_input, it will correctly handle
it and also can decide to forward it.

The "skip" variable points to offset where payload starts. In tunnel
mode we reset it to zero after stripping the outer header. So, when
it is zero, we should requeue mbuf via netisr.

Differential Revision:	https://reviews.freebsd.org/D2306
Reviewed by:	adrian, gnn
Sponsored by:	Yandex LLC
2015-04-18 16:51:24 +00:00
Andrey V. Elsukov
1ae800e7a6 Fix handling of scoped IPv6 addresses in IPSec code.
* in ipsec_encap() embed scope zone ids into link-local addresses
  in the new IPv6 header, this helps ip6_output() disambiguate the
  scope;
* teach key_ismyaddr6() use in6_localip(). in6_localip() is less
  strict than key_sockaddrcmp(). It doesn't compare all fileds of
  struct sockaddr_in6, but it is faster and it should be safe,
  because all SA's data was checked for correctness. Also, since
  IPv6 link-local addresses in the &V_in6_ifaddrhead are stored in
  kernel-internal form, we need to embed scope zone id from SA into
  the address before calling in6_localip.
* in ipsec_common_input() take scope zone id embedded in the address
  and use it to initialize sin6_scope_id, then use this sockaddr
  structure to lookup SA, because we keep addresses in the SADB without
  embedded scope zone id.

Differential Revision:	https://reviews.freebsd.org/D2304
Reviewed by:	gnn
Sponsored by:	Yandex LLC
2015-04-18 16:46:31 +00:00
Andrey V. Elsukov
61f376155d Remove xform_ipip.c and code related to XF_IP4.
The only thing is used from this code is ipip_output() function, that does
IPIP encapsulation. Other parts of XF_IP4 code were removed in r275133.
Also it isn't possible to configure the use of XF_IP4, nor from userland
via setkey(8), nor from the kernel.

Simplify the ipip_output() function and rename it to ipsec_encap().
* move IP_DF handling from ipsec4_process_packet() into ipsec_encap();
* since ipsec_encap() called from ipsec[64]_process_packet(), it
  is safe to assume that mbuf is contiguous at least to IP header
  for used IP version. Remove all unneeded m_pullup(), m_copydata
  and related checks.
* use V_ip_defttl and V_ip6_defhlim for outer headers;
* use V_ip4_ipsec_ecn and V_ip6_ipsec_ecn for outer headers;
* move all diagnostic messages to the ipsec_encap() callers;
* simplify handling of ipsec_encap() results: if it returns non zero
  value, print diagnostic message and free mbuf.
* some style(9) fixes.

Differential Revision:	https://reviews.freebsd.org/D2303
Reviewed by:	glebius
Sponsored by:	Yandex LLC
2015-04-18 16:38:45 +00:00
Gleb Smirnoff
6d947416cc o Use new function ip_fillid() in all places throughout the kernel,
where we want to create a new IP datagram.
o Add support for RFC6864, which allows to set IP ID for atomic IP
  datagrams to any value, to improve performance. The behaviour is
  controlled by net.inet.ip.rfc6864 sysctl knob, which is enabled by
  default.
o In case if we generate IP ID, use counter(9) to improve performance.
o Gather all code related to IP ID into ip_id.c.

Differential Revision:		https://reviews.freebsd.org/D2177
Reviewed by:			adrian, cy, rpaulo
Tested by:			Emeric POUPON <emeric.poupon stormshield.eu>
Sponsored by:			Netflix
Sponsored by:			Nginx, Inc.
Relnotes:			yes
2015-04-01 22:26:39 +00:00
Andrey V. Elsukov
ba76ce40b4 Remove extra '&'. sin6 is already a pointer.
PR:		195011
MFC after:	1 week
2015-03-07 18:44:52 +00:00
Andrey V. Elsukov
47568136c5 Fix possible memory leak and several races in the IPsec policy management
code.

Resurrect the state field in the struct secpolicy, it has
IPSEC_SPSTATE_ALIVE value when security policy linked in the chain,
and IPSEC_SPSTATE_DEAD value in all other cases. This field protects
from trying to unlink one security policy several times from the different
threads.

Take additional reference in the key_flush_spd() to be sure that policy
won't be freed from the different thread while we are sending SPDEXPIRE message.

Add KEY_FREESP() call to the key_unlink() to release additional reference
that we take when use key_getsp*() functions.

Differential Revision:	https://reviews.freebsd.org/D1914
Tested by:		Emeric POUPON <emeric.poupon at stormshield dot eu>
Reviewed by:	hrs
Sponsored by:	Yandex LLC
2015-02-24 10:35:07 +00:00
Andrey V. Elsukov
b489a49fc0 key_spdget uses key_setdumpsp() without SPTREE_RLOCK held (it uses
referenced pointer to sp). Remove SPTREE_RLOCK_ASSERT from
key_setdumpsp() to fix wrong assertion.

Reported by:	Emeric POUPON
Obtained from:	Yandex LLC
Sponsored by:	Yandex LLC
2015-01-27 17:46:55 +00:00
Robert Watson
2a8c860fe3 In order to reduce use of M_EXT outside of the mbuf allocator and
socket-buffer implementations, introduce a return value for MCLGET()
(and m_cljget() that underlies it) to allow the caller to avoid testing
M_EXT itself.  Update all callers to use the return value.

With this change, very few network device drivers remain aware of
M_EXT; the primary exceptions lie in mbuf-chain pretty printers for
debugging, and in a few cases, custom mbuf and cluster allocation
implementations.

NB: This is a difficult-to-test change as it touches many drivers for
which I don't have physical devices.  Instead we've gone for intensive
review, but further post-commit review would definitely be appreciated
to spot errors where changes could not easily be made mechanically,
but were largely mechanical in nature.

Differential Revision:	https://reviews.freebsd.org/D1440
Reviewed by:	adrian, bz, gnn
Sponsored by:	EMC / Isilon Storage Division
2015-01-06 12:59:37 +00:00
Andrey V. Elsukov
fe07a9d08f Fix VIMAGE build. 2014-12-25 13:38:51 +00:00
Andrey V. Elsukov
93201211e9 Rename ip4_def_policy variable to def_policy. It is used by both IPv4 and
IPv6. Initialize it only once in def_policy_init(). Remove its
initialization from key_init() and make it static.

Remove several fields from struct secpolicy:
* lock - it isn't so useful having mutex in the structure, but the only
  thing we do with it is initialization and destroying.
* state - it has only two values - DEAD and ALIVE. Instead of take a lock
  and change the state to DEAD, then take lock again in GC function and
  delete policy from the chain - keep in the chain only ALIVE policies.
* scangen - it was used in GC function to protect from sending several
  SADB_SPDEXPIRE messages for one SPD entry. Now we don't keep DEAD entries
  in the chain and there is no need to have scangen variable.

Use TAILQ to implement SPD entries chain. Use rmlock to protect access
to SPD entries chain. Protect all SP lookup with RLOCK, and use WLOCK
when we are inserting (or removing) SP entry in the chain.

Instead of using pattern "LOCK(); refcnt++; UNLOCK();", use refcount(9)
API to implement refcounting in SPD. Merge code from key_delsp() and
_key_delsp() into _key_freesp(). And use KEY_FREESP() macro in all cases
when we want to release reference or just delete SP entry.

Obtained from:	Yandex LLC
Sponsored by:	Yandex LLC
2014-12-24 18:34:56 +00:00
Andrey V. Elsukov
a91150da31 Treat errors when retrieving security policy as policy violation.
Obtained from:	Yandex LLC
Sponsored by:	Yandex LLC
2014-12-11 18:46:11 +00:00
Andrey V. Elsukov
e65ada3e3c Initialize error variable.
Obtained from:	Yandex LLC
Sponsored by:	Yandex LLC
2014-12-11 18:40:56 +00:00
Andrey V. Elsukov
0275b2e369 Remove flag/flags argument from the following functions:
ipsec_getpolicybyaddr()
 ipsec4_checkpolicy()
 ip_ipsec_output()
 ip6_ipsec_output()

The only flag used here was IP_FORWARDING.

Obtained from:	Yandex LLC
Sponsored by:	Yandex LLC
2014-12-11 18:35:34 +00:00
Andrey V. Elsukov
619764beab Remove flags and tunalready arguments from ipsec4_process_packet()
and make its prototype similar to ipsec6_process_packet.
The flags argument isn't used here, tunalready is always zero.

Obtained from:	Yandex LLC
Sponsored by:	Yandex LLC
2014-12-11 17:34:49 +00:00
Andrey V. Elsukov
f0514a8b8a Remove now unused mtag argument from ipsec*_common_input_cb.
Obtained from:	Yandex LLC
Sponsored by:	Yandex LLC
2014-12-11 17:14:49 +00:00
Andrey V. Elsukov
08537f4526 Remove code related to PACKET_TAG_IPSEC_IN_CRYPTO_DONE mbuf tag.
It isn't used in FreeBSD.

Obtained from:	Yandex LLC
Sponsored by:	Yandex LLC
2014-12-11 17:07:21 +00:00
Andrey V. Elsukov
566cbcc82a Remove unused mtag variable.
Obtained from:	Yandex LLC
Sponsored by:	Yandex LLC
2014-12-11 17:01:53 +00:00
Andrey V. Elsukov
4268212124 key_getspacq() returns holding the spacq_lock. Unlock it in all cases.
MFC after:	1 week
Sponsored by:	Yandex LLC
2014-12-07 06:47:00 +00:00
Andrey V. Elsukov
3d6aff5615 Fix style(9) and remove m_freem(NULL).
Add XXX comment, it looks incorrect, because m_pkthdr.len is already
incremented by M_PREPEND().

Sponsored by:	Yandex LLC
2014-12-04 05:02:12 +00:00
Andrey V. Elsukov
18961126cb Remove __P() macro.
Suggested by:	kevlo
Sponsored by:	Yandex LLC
2014-12-03 04:08:41 +00:00
Andrey V. Elsukov
2e84e6eac9 ANSIfy function declarations.
Sponsored by:	Yandex LLC
2014-12-03 03:50:54 +00:00
Andrey V. Elsukov
bd766f3425 Remove unneded check. No need to do m_pullup to the size that we prepended.
Sponsored by:	Yandex LLC
2014-12-02 05:28:40 +00:00
Andrey V. Elsukov
2d957916ef Remove route chaching support from ipsec code. It isn't used for some time.
* remove sa_route_union declaration and route_cache member from struct secashead;
* remove key_sa_routechange() call from ICMP and ICMPv6 code;
* simplify ip_ipsec_mtu();
* remove #include <net/route.h>;

Sponsored by:	Yandex LLC
2014-12-02 04:20:50 +00:00
Andrey V. Elsukov
1fea1b0889 Remove unused structure declarations.
Sponsored by:	Yandex LLC
2014-12-02 02:41:44 +00:00
Andrey V. Elsukov
0e23cc372d Remove unused declartations.
Sponsored by:	Yandex LLC
2014-12-02 02:32:28 +00:00
Andrey V. Elsukov
ffbf9cdeb6 Remove ip4_input() declaration. It was removed in r275133.
MFC after:	1 month
2014-11-27 00:27:39 +00:00
Andrey V. Elsukov
b05765d75f Do not use xform_ipip as decapsulation fallback.
xform_ipip was used as fallback with low priority for IPIP
encapsulated packets that were decrypted. In some cases
it can decapsulate packets, that it shouldn't. This leads to situations,
when wrong configurations are magically working. Also it can propagate
wrong ingress interface and this can break security.

Now we redesigned the IPSEC code and IPIP encapsulation is called directly
from ipsec_output, and decapsulation is done in the ipsec_input with m_striphdr.

Differential Revision:	https://reviews.freebsd.org/D1220
MFC after:	1 month
Sponsored by:	Yandex LLC
2014-11-26 17:44:49 +00:00
Andrey V. Elsukov
f9d8f66552 Count statistics for the specific address family.
MFC after:	1 week
Sponsored by:	Yandex LLC
2014-11-13 12:58:33 +00:00
Andrey V. Elsukov
612faae7a2 Strip IP header only when we act in tunnel mode.
MFC after:	1 week
Sponsored by:	Yandex LLC
2014-11-13 10:48:59 +00:00
Andrey V. Elsukov
ab2164e0b5 Remove redundant ip6_plen initialization.
MFC after:	1 week
Sponsored by:	Yandex LLC
2014-11-13 10:47:24 +00:00
Andrey V. Elsukov
67fd172767 ipsec6_process_packet is called before ip6_output fixes ip6_plen.
Update ip6_plen before bpf processing to be able see correct value.

MFC after:	1 week
Sponsored by:	Yandex LLC
2014-11-12 22:51:30 +00:00
Andrey V. Elsukov
f3c93842bf Fix ips_out_nosa errors accounting.
MFC after:	1 week
Sponsored by:	Yandex LLC
2014-11-12 14:00:49 +00:00
Andrey V. Elsukov
b6e1ad3a3a Pass mbuf to pfil processing before stripping outer IP header as it
is described in if_enc(4).

MFC after:	2 week
Sponsored by:	Yandex LLC
2014-11-07 12:05:20 +00:00
Gleb Smirnoff
6df8a71067 Remove SYSCTL_VNET_* macros, and simply put CTLFLAG_VNET where needed.
Sponsored by:	Nginx, Inc.
2014-11-07 09:39:05 +00:00
Andrey V. Elsukov
1f194d8ae1 When mode isn't explicitly specified (wildcard) and inner protocol isn't
IPv4 or IPv6, assume it is the transport mode.

Reported by:	jmg
MFC after:	1 week
Sponsored by:	Yandex LLC
2014-11-06 20:23:57 +00:00
Andrey V. Elsukov
f5196a58a0 Use in_localip() instead of handmade implementation.
MFC after:	1 week
Sponsored by:	Yandex LLC
2014-10-31 12:19:22 +00:00
John Baldwin
a4432e6bf7 Use a static callout to drive key_timehandler() instead of timeout().
While here, make key_timehandler() private to key.c.

Submitted by:	bz (2)
Tested by:	bz
2014-10-23 20:43:16 +00:00
Hans Petter Selasky
f0188618f2 Fix multiple incorrect SYSCTL arguments in the kernel:
- Wrong integer type was specified.

- Wrong or missing "access" specifier. The "access" specifier
sometimes included the SYSCTL type, which it should not, except for
procedural SYSCTL nodes.

- Logical OR where binary OR was expected.

- Properly assert the "access" argument passed to all SYSCTL macros,
using the CTASSERT macro. This applies to both static- and dynamically
created SYSCTLs.

- Properly assert the the data type for both static and dynamic
SYSCTLs. In the case of static SYSCTLs we only assert that the data
pointed to by the SYSCTL data pointer has the correct size, hence
there is no easy way to assert types in the C language outside a
C-function.

- Rewrote some code which doesn't pass a constant "access" specifier
when creating dynamic SYSCTL nodes, which is now a requirement.

- Updated "EXAMPLES" section in SYSCTL manual page.

MFC after:	3 days
Sponsored by:	Mellanox Technologies
2014-10-21 07:31:21 +00:00
Andrey V. Elsukov
a28b277a9f Do not strip outer header when operating in transport mode.
Instead requeue mbuf back to IPv4 protocol handler. If there is one extra IP-IP
encapsulation, it will be handled with tunneling interface. And thus proper
interface will be exposed into mbuf's rcvif. Also, tcpdump that listens on tunneling
interface will see packets in both directions.

Sponsored by:	Yandex LLC
2014-10-02 02:00:21 +00:00
Gleb Smirnoff
6ff8af1ca5 Mechanically convert to if_inc_counter(). 2014-09-19 10:18:14 +00:00
Kevin Lo
73d76e77b6 Change pr_output's prototype to avoid the need for explicit casts.
This is a follow up to r269699.

Phabric:	D564
Reviewed by:	jhb
2014-08-15 02:43:02 +00:00
Kevin Lo
8f5a8818f5 Merge 'struct ip6protosw' and 'struct protosw' into one. Now we have
only one protocol switch structure that is shared between ipv4 and ipv6.

Phabric:	D476
Reviewed by:	jhb
2014-08-08 01:57:15 +00:00
Gleb Smirnoff
fcc34a238c Fix style bug: rename the refcount field of m_ext to ext_cnt, to match
other members.

Sponsored by:	Nginx, Inc.
2014-07-11 14:34:29 +00:00
Marko Zec
b01e3d0802 The assumption in ipsec4_process_packet() that the payload may be
only IPv4 is wrong, so check the IP version before mangling the
payload header.
2014-07-01 08:02:25 +00:00
Bjoern A. Zeeb
6d120f909c Use IPv4 statistics in ipsec4_process_packet() rather than the IPv6
version.  This also unbreaks the NOINET6 builds after r266800.
2014-05-28 23:01:20 +00:00
VANHULLEBUS Yvan
aaf2cfc0d6 Fixed IPv4-in-IPv6 and IPv6-in-IPv4 IPsec tunnels.
For IPv6-in-IPv4, you may need to do the following command
on the tunnel interface if it is configured as IPv4 only:
ifconfig <interface> inet6 -ifdisabled

Code logic inspired from NetBSD.

PR: kern/169438
Submitted by: emeric.poupon@netasq.com
Reviewed by: fabient, ae
Obtained from: NETASQ
2014-05-28 12:45:27 +00:00
Bjoern A. Zeeb
799653be1c Only do a ports check if this is a NAT-T SA. Otherwise other
lookups providing ports may get unexpected results.

MFC After:	2 weeks
2014-05-24 09:29:23 +00:00
Andrey V. Elsukov
cd92c2e1ec Remove _IP_VHL* macros and related ifdefs.
MFC after:	1 week
2014-04-16 05:31:54 +00:00
Andrey V. Elsukov
e064642c9a The check for local address spoofing lacks ifaddr locking.
Remove these loops and use in_localip() and in6_localip()
functions instead.

MFC after:	1 week
Sponsored by:	Yandex LLC
2014-04-04 16:58:32 +00:00
Andrey V. Elsukov
b48f1835a7 Remove unused variable.
MFC after:	1 week
Sponsored by:	Yandex LLC
2014-04-04 15:57:27 +00:00
Andrey V. Elsukov
c91a8e0ebe Remove dead code.
MFC after:	1 week
Sponsored by:	Yandex LLC
2014-04-04 15:55:38 +00:00
John Baldwin
5b26ea5df3 Remove more constants related to static sysctl nodes. The MAXID constants
were primarily used to size the sysctl name list macros that were removed
in r254295.  A few other constants either did not have an associated
sysctl node, or the associated node used OID_AUTO instead.

PR:		ports/184525 (exp-run)
2014-02-25 18:44:33 +00:00
Andrey V. Elsukov
00a689c438 Initialize prot variable.
PR:		177417
MFC after:	1 week
2013-11-11 13:19:55 +00:00
Gleb Smirnoff
eedc7fd9e8 Provide includes that are needed in these files, and before were read
in implicitly via if.h -> if_var.h pollution.

Sponsored by:	Netflix
Sponsored by:	Nginx, Inc.
2013-10-26 18:18:50 +00:00
Gleb Smirnoff
76039bc84f The r48589 promised to remove implicit inclusion of if_var.h soon. Prepare
to this event, adding if_var.h to files that do need it. Also, include
all includes that now are included due to implicit pollution via if_var.h

Sponsored by:	Netflix
Sponsored by:	Nginx, Inc.
2013-10-26 17:58:36 +00:00
John Baldwin
fd77bbb967 Remove most of the remaining sysctl name list macros. They were only
ever intended for use in sysctl(8) and it has not used them for many
years.

Reviewed by:	bde
Tested by:	exp-run by bdrewery
2013-08-26 18:16:05 +00:00
Andrey V. Elsukov
6794f46021 Remove the large part of struct ipsecstat. Only few fields of this
structure is used, but they already have equal fields in the struct
newipsecstat, that was introduced with FAST_IPSEC and then was merged
together with old ipsecstat structure.

This fixes kernel stack overflow on some architectures after migration
ipsecstat to PCPU counters.

Reported by:	Taku YAMAMOTO, Maciej Milewski
2013-07-23 14:14:24 +00:00
Andrey V. Elsukov
db8c087944 Migrate structs ahstat, espstat, ipcompstat, ipipstat, pfkeystat,
ipsec4stat, ipsec6stat to PCPU counters.
2013-07-09 10:08:13 +00:00
Andrey V. Elsukov
c80211e3cf Prepare network statistics structures for migration to PCPU counters.
Use uint64_t as type for all fields of structures.

Changed structures: ahstat, arpstat, espstat, icmp6_ifstat, icmp6stat,
in6_ifstat, ip6stat, ipcompstat, ipipstat, ipsecstat, mrt6stat, mrtstat,
pfkeystat, pim6stat, pimstat, rip6stat, udpstat.

Discussed with:	arch@
2013-07-09 09:32:06 +00:00
Andrey V. Elsukov
a04d64d875 Use corresponding macros to update statistics for AH, ESP, IPIP, IPCOMP,
PFKEY.

MFC after:	2 weeks
2013-06-20 11:44:16 +00:00
Andrey V. Elsukov
6659296cb0 Use IPSECSTAT_INC() and IPSEC6STAT_INC() macros for ipsec statistics
accounting.

MFC after:	2 weeks
2013-06-20 09:55:53 +00:00
Andrey V. Elsukov
9cb8d207af Use IP6STAT_INC/IP6STAT_DEC macros to update ip6 stats.
MFC after:	1 week
2013-04-09 07:11:22 +00:00
Gleb Smirnoff
dcba52a5f1 Use m_get2() + m_align() instead of hand made key_alloc_mbuf(). Code
examination shows, that although key_alloc_mbuf() could return chains,
the callers never use chains, so m_get2() should suffice.

Sponsored by:	Nginx, Inc.
2013-03-15 10:20:15 +00:00
Gleb Smirnoff
eb1b1807af Mechanically substitute flags from historic mbuf allocator with
malloc(9) flags within sys.

Exceptions:

- sys/contrib not touched
- sys/mbuf.h edited manually
2012-12-05 08:04:20 +00:00
Gleb Smirnoff
8ad458a471 Do not reduce ip_len by size of IP header in the ip_input()
before passing a packet to protocol input routines.
  For several protocols this mean that now protocol needs to
do subtraction itself, and for another half this means that
we do not need to add header length back to the packet.

  Make ip_stripoptions() to adjust ip_len, since now we enter
this function with a packet header whose ip_len does represent
length of entire packet, not payload only.
2012-10-23 08:33:13 +00:00
Gleb Smirnoff
d2bffb140e - Fix one more miss from r241913.
- Add XXX comment about necessity of the entire block,
  that "fixes up" the IP header.
2012-10-23 08:22:01 +00:00
Gleb Smirnoff
20472bce45 Couple of changes missed from r241913, which converted
IPv4 stack to network byte order.
2012-10-22 22:42:28 +00:00
Gleb Smirnoff
8f134647ca Switch the entire IPv4 stack to keep the IP packet header
in network byte order. Any host byte order processing is
done in local variables and host byte order values are
never[1] written to a packet.

  After this change a packet processed by the stack isn't
modified at all[2] except for TTL.

  After this change a network stack hacker doesn't need to
scratch his head trying to figure out what is the byte order
at the given place in the stack.

[1] One exception still remains. The raw sockets convert host
byte order before pass a packet to an application. Probably
this would remain for ages for compatibility.

[2] The ip_input() still subtructs header len from ip->ip_len,
but this is planned to be fixed soon.

Reviewed by:	luigi, Maxim Dounin <mdounin mdounin.ru>
Tested by:	ray, Olivier Cochard-Labbe <olivier cochard.me>
2012-10-22 21:09:03 +00:00
Andre Oppermann
c9b652e3e8 Mechanically remove the last stray remains of spl* calls from net*/*.
They have been Noop's for a long time now.
2012-10-18 13:57:24 +00:00
Kevin Lo
9f614af4cf Add missing break 2012-09-18 08:00:43 +00:00
VANHULLEBUS Yvan
d1b835208a In NAT-T transport mode, allow a client to open a new connection just after
closing another.
It worked only in tunnel mode before.

Submitted by:	Andreas Longwitz <longwitz@incore.de>
MFC after: 1M
2012-09-12 12:14:50 +00:00
Gleb Smirnoff
d6d3f01e0a Merge the projects/pf/head branch, that was worked on for last six months,
into head. The most significant achievements in the new code:

 o Fine grained locking, thus much better performance.
 o Fixes to many problems in pf, that were specific to FreeBSD port.

New code doesn't have that many ifdefs and much less OpenBSDisms, thus
is more attractive to our developers.

  Those interested in details, can browse through SVN log of the
projects/pf/head branch. And for reference, here is exact list of
revisions merged:

r232043, r232044, r232062, r232148, r232149, r232150, r232298, r232330,
r232332, r232340, r232386, r232390, r232391, r232605, r232655, r232656,
r232661, r232662, r232663, r232664, r232673, r232691, r233309, r233782,
r233829, r233830, r233834, r233835, r233836, r233865, r233866, r233868,
r233873, r234056, r234096, r234100, r234108, r234175, r234187, r234223,
r234271, r234272, r234282, r234307, r234309, r234382, r234384, r234456,
r234486, r234606, r234640, r234641, r234642, r234644, r234651, r235505,
r235506, r235535, r235605, r235606, r235826, r235991, r235993, r236168,
r236173, r236179, r236180, r236181, r236186, r236223, r236227, r236230,
r236252, r236254, r236298, r236299, r236300, r236301, r236397, r236398,
r236399, r236499, r236512, r236513, r236525, r236526, r236545, r236548,
r236553, r236554, r236556, r236557, r236561, r236570, r236630, r236672,
r236673, r236679, r236706, r236710, r236718, r237154, r237155, r237169,
r237314, r237363, r237364, r237368, r237369, r237376, r237440, r237442,
r237751, r237783, r237784, r237785, r237788, r237791, r238421, r238522,
r238523, r238524, r238525, r239173, r239186, r239644, r239652, r239661,
r239773, r240125, r240130, r240131, r240136, r240186, r240196, r240212.

I'd like to thank people who participated in early testing:

Tested by:	Florian Smeets <flo freebsd.org>
Tested by:	Chekaluk Vitaly <artemrts ukr.net>
Tested by:	Ben Wilber <ben desync.com>
Tested by:	Ian FREISLICH <ianf cloudseed.co.za>
2012-09-08 06:41:54 +00:00
John Baldwin
2541fcd953 Unexpand a couple of TAILQ_FOREACH()s. 2012-08-17 16:01:24 +00:00
Bjoern A. Zeeb
174b0d419b Fix a bug introduced in r221129 that leads to a panic wen using bundled
SAs.  For now allow same address family bundles.  While discovered with
ESP and AH, which does not make a lot of sense, IPcomp could be a possible
problematic candidate.

PR:		kern/164400
MFC after:	3 days
2012-07-22 17:46:05 +00:00
Bjoern A. Zeeb
81d5d46b3c Add multi-FIB IPv6 support to the core network stack supplementing
the original IPv4 implementation from r178888:

- Use RT_DEFAULT_FIB in the IPv4 implementation where noticed.
- Use rt*fib() KPI with explicit RT_DEFAULT_FIB where applicable in
  the NFS code.
- Use the new in6_rt* KPI in TCP, gif(4), and the IPv6 network stack
  where applicable.
- Split in6_rtqtimo() and in6_mtutimo() as done in IPv4 and equally
  prevent multiple initializations of callouts in in6_inithead().
- Use wrapper functions where needed to preserve the current KPI to
  ease MFCs.  Use BURN_BRIDGES to indicate expected future cleanup.
- Fix (related) comments (both technical or style).
- Convert to rtinit() where applicable and only use custom loops where
  currently not possible otherwise.
- Multicast group, most neighbor discovery address actions and faith(4)
  are locked to the default FIB.  Individual IPv6 addresses will only
  appear in the default FIB, however redirect information and prefixes
  of connected subnets are automatically propagated to all FIBs by
  default (mimicking IPv4 behavior as closely as possible).

Sponsored by:	Cisco Systems, Inc.
2012-02-03 13:08:44 +00:00
Bjoern A. Zeeb
83e521ec73 Clean up some #endif comments removing from short sections. Add #endif
comments to longer, also refining strange ones.

Properly use #ifdef rather than #if defined() where possible.  Four
#if defined(PCBGROUP) occurances (netinet and netinet6) were ignored to
avoid conflicts with eventually upcoming changes for RSS.

Reported by:	bde (most)
Reviewed by:	bde
MFC after:	3 days
2012-01-22 02:13:19 +00:00
Pawel Jakub Dawidek
d3e8e66d75 Remove unused 'plen' variable. 2011-11-26 23:57:03 +00:00
Pawel Jakub Dawidek
cdb7ebe38c The esp_max_ivlen global variable is not needed, we can just use
EALG_MAX_BLOCK_LEN.
2011-11-26 23:27:41 +00:00
Pawel Jakub Dawidek
5be4c9b9e6 malloc(M_WAITOK) never fails, so there is no need to check for NULL. 2011-11-26 23:18:19 +00:00
Pawel Jakub Dawidek
0e4fb1db44 Eliminate 'err' variable and just use existing 'error'. 2011-11-26 23:15:28 +00:00
Pawel Jakub Dawidek
0a95a08ecb Simplify code a bit. 2011-11-26 23:13:30 +00:00