freebsd-dev

Author	SHA1	Message	Date
Conrad Meyer	a8a16c7128	Replace read_random(9) with more appropriate arc4rand(9) KPIs Reviewed by: ae, delphij Sponsored by: Dell EMC Isilon Differential Revision: https://reviews.freebsd.org/D19760	2019-04-04 01:02:50 +00:00
Conrad Meyer	1b0909d51a	OpenCrypto: Convert sessions to opaque handles instead of integers Track session objects in the framework, and pass handles between the framework (OCF), consumers, and drivers. Avoid redundancy and complexity in individual drivers by allocating session memory in the framework and providing it to drivers in ::newsession(). Session handles are no longer integers with information encoded in various high bits. Use of the CRYPTO_SESID2FOO() macros should be replaced with the appropriate crypto_ses2foo() function on the opaque session handle. Convert OCF drivers (in particular, cryptosoft, as well as myriad others) to the opaque handle interface. Discard existing session tracking as much as possible (quick pass). There may be additional code ripe for deletion. Convert OCF consumers (ipsec, geom_eli, krb5, cryptodev) to handle-style interface. The conversion is largely mechnical. The change is documented in crypto.9. Inspired by https://lists.freebsd.org/pipermail/freebsd-arch/2018-January/018835.html . No objection from: ae (ipsec portion) Reported by: jhb	2018-07-18 00:56:25 +00:00
Conrad Meyer	2e08e39ff5	OCF: Add a typedef for session identifiers No functional change. This should ease the transition from an integer session identifier model to an opaque pointer model.	2018-07-13 23:46:07 +00:00
John Baldwin	fd40ecf3d4	Set the proper vnet in IPsec callback functions. When using hardware crypto engines, the callback functions used to handle an IPsec packet after it has been encrypted or decrypted can be invoked asynchronously from a worker thread that is not associated with a vnet. Extend 'struct xform_data' to include a vnet pointer and save the current vnet in this new member when queueing crypto requests in IPsec. In the IPsec callback routines, use the new member to set the current vnet while processing the modified packet. This fixes a panic when using hardware offload such as ccr(4) with IPsec after VIMAGE was enabled in GENERIC. Reported by: Sony Arpita Das and Harsh Jain @ Chelsio Reviewed by: bz MFC after: 1 week Sponsored by: Chelsio Communications Differential Revision: https://reviews.freebsd.org/D14763	2018-03-20 17:05:23 +00:00
Alexander Kabaev	151ba7933a	Do pass removing some write-only variables from the kernel. This reduces noise when kernel is compiled by newer GCC versions, such as one used by external toolchain ports. Reviewed by: kib, andrew(sys/arm and sys/arm64), emaste(partial), erj(partial) Reviewed by: jhb (sys/dev/pci/* sys/kern/vfs_aio.c and sys/kern/kern_synch.c) Differential Revision: https://reviews.freebsd.org/D10385	2017-12-25 04:48:39 +00:00
Fabien Thomas	39bbca6ffd	crypto(9) is called from ipsec in CRYPTO_F_CBIFSYNC mode. This is working fine when a lot of different flows to be ciphered/deciphered are involved. However, when a software crypto driver is used, there are situations where we could benefit from making crypto(9) multi threaded: - a single flow is to be ciphered: only one thread is used to cipher it, - a single ESP flow is to be deciphered: only one thread is used to decipher it. The idea here is to call crypto(9) using a new mode (CRYPTO_F_ASYNC) to dispatch the crypto jobs on multiple threads, if the underlying crypto driver is working in synchronous mode. Another flag is added (CRYPTO_F_ASYNC_KEEPORDER) to make crypto(9) dispatch the crypto jobs in the order they are received (an additional queue/thread is used), so that the packets are reinjected in the network using the same order they were posted. A new sysctl net.inet.ipsec.async_crypto can be used to activate this new behavior (disabled by default). Submitted by: Emeric Poupon <emeric.poupon@stormshield.eu> Reviewed by: ae, jmg, jhb Differential Revision: https://reviews.freebsd.org/D10680 Sponsored by: Stormshield	2017-11-03 10:27:22 +00:00
Andrey V. Elsukov	7f1f65918b	Disable IPsec debugging code by default when IPSEC_DEBUG kernel option is not specified. Due to the long call chain IPsec code can produce the kernel stack exhaustion on the i386 architecture. The debugging code usually is not used, but it requires a lot of stack space to keep buffers for strings formatting. This patch conditionally defines macros to disable building of IPsec debugging code. IPsec currently has two sysctl variables to configure debug output: * net.key.debug variable is used to enable debug output for PF_KEY protocol. Such debug messages are produced by KEYDBG() macro and usually they can be interesting for developers. * net.inet.ipsec.debug variable is used to enable debug output for DPRINTF() macro and ipseclog() function. DPRINTF() macro usually is used for development debugging. ipseclog() function is used for debugging by administrator. The patch disables KEYDBG() and DPRINTF() macros, and formatting buffers declarations when IPSEC_DEBUG is not present in kernel config. This reduces stack requirement for up to several hundreds of bytes. The net.inet.ipsec.debug variable still can be used to enable ipseclog() messages by administrator. PR: 219476 Reported by: eugen No objection from: #network MFC after: 1 week Differential Revision: https://reviews.freebsd.org/D10869	2017-05-29 09:30:38 +00:00
Andrey V. Elsukov	3aee70991d	Fix possible double releasing for SA and SP references. There are two possible ways how crypto callback are called: directly from caller and deffered from crypto thread. For outbound packets the direct call chain is the following: IPSEC_OUTPUT() method -> ipsec[46]_common_output() -> -> ipsec[46]_perform_request() -> xform_output() -> -> crypto_dispatch() -> crypto_invoke() -> crypto_done() -> -> xform_output_cb() -> ipsec_process_done() -> ip[6]_output(). The SA and SP references are held while crypto processing is not finished. The error handling code wrongly expected that crypto callback always called from the crypto thread context, and it did references releasing in xform_output_cb(). But when the crypto callback called directly, in case of error the error handling code in ipsec[46]_perform_request() also did references releasing. To fix this, remove error handling from ipsec[46]_perform_request() and do it in xform_output() before crypto_dispatch(). MFC after: 10 days	2017-05-23 09:32:26 +00:00
Andrey V. Elsukov	5f7c516f21	Fix possible double releasing for SA reference. There are two possible ways how crypto callback are called: directly from caller and deffered from crypto thread. For inbound packets the direct call chain is the following: IPSEC_INPUT() method -> ipsec_common_input() -> xform_input() -> -> crypto_dispatch() -> crypto_invoke() -> crypto_done() -> -> xform_input_cb() -> ipsec[46]_common_input_cb() -> netisr_queue(). The SA reference is held while crypto processing is not finished. The error handling code wrongly expected that crypto callback always called from the crypto thread context, and it did SA reference releasing in xform_input_cb(). But when the crypto callback called directly, in case of error (e.g. data authentification failed) the error handling in ipsec_common_input() also did SA reference releasing. To fix this, remove error handling from ipsec_common_input() and do it in xform_input() before crypto_dispatch(). PR: 219356 MFC after: 10 days	2017-05-23 09:01:48 +00:00
Andrey V. Elsukov	fcf596178b	Merge projects/ipsec into head/. Small summary ------------- o Almost all IPsec releated code was moved into sys/netipsec. o New kernel modules added: ipsec.ko and tcpmd5.ko. New kernel option IPSEC_SUPPORT added. It enables support for loading and unloading of ipsec.ko and tcpmd5.ko kernel modules. o IPSEC_NAT_T option was removed. Now NAT-T support is enabled by default. The UDP_ENCAP_ESPINUDP_NON_IKE encapsulation type support was removed. Added TCP/UDP checksum handling for inbound packets that were decapsulated by transport mode SAs. setkey(8) modified to show run-time NAT-T configuration of SA. o New network pseudo interface if_ipsec(4) added. For now it is build as part of ipsec.ko module (or with IPSEC kernel). It implements IPsec virtual tunnels to create route-based VPNs. o The network stack now invokes IPsec functions using special methods. The only one header file <netipsec/ipsec_support.h> should be included to declare all the needed things to work with IPsec. o All IPsec protocols handlers (ESP/AH/IPCOMP protosw) were removed. Now these protocols are handled directly via IPsec methods. o TCP_SIGNATURE support was reworked to be more close to RFC. o PF_KEY SADB was reworked: - now all security associations stored in the single SPI namespace, and all SAs MUST have unique SPI. - several hash tables added to speed up lookups in SADB. - SADB now uses rmlock to protect access, and concurrent threads can do SA lookups in the same time. - many PF_KEY message handlers were reworked to reflect changes in SADB. - SADB_UPDATE message was extended to support new PF_KEY headers: SADB_X_EXT_NEW_ADDRESS_SRC and SADB_X_EXT_NEW_ADDRESS_DST. They can be used by IKE daemon to change SA addresses. o ipsecrequest and secpolicy structures were cardinally changed to avoid locking protection for ipsecrequest. Now we support only limited number (4) of bundled SAs, but they are supported for both INET and INET6. o INPCB security policy cache was introduced. Each PCB now caches used security policies to avoid SP lookup for each packet. o For inbound security policies added the mode, when the kernel does check for full history of applied IPsec transforms. o References counting rules for security policies and security associations were changed. The proper SA locking added into xform code. o xform code was also changed. Now it is possible to unregister xforms. tdb_xxx structures were changed and renamed to reflect changes in SADB/SPDB, and changed rules for locking and refcounting. Reviewed by: gnn, wblock Obtained from: Yandex LLC Relnotes: yes Sponsored by: Yandex LLC Differential Revision: https://reviews.freebsd.org/D9352	2017-02-06 08:49:57 +00:00
Fabien Thomas	bf4356266d	IPsec RFC6479 support for replay window sizes up to 2^32 - 32 packets. Since the previous algorithm, based on bit shifting, does not scale with large replay windows, the algorithm used here is based on RFC 6479: IPsec Anti-Replay Algorithm without Bit Shifting. The replay window will be fast to be updated, but will cost as many bits in RAM as its size. The previous implementation did not provide a lock on the replay window, which may lead to replay issues. Reviewed by: ae Obtained from: emeric.poupon@stormshield.eu Sponsored by: Stormshield Differential Revision: https://reviews.freebsd.org/D8468	2016-11-25 14:44:49 +00:00
Andrey V. Elsukov	0c80e7df43	Use explicitly specified ivsize instead of blocksize when we mean IV size. Set zero ivsize for enc_xform_null and remove special handling from xform_esp.c. Reviewed by: gnn Differential Revision: https://reviews.freebsd.org/D1503	2015-11-16 07:10:42 +00:00
Andrey V. Elsukov	f367798498	Take extra reference to security policy before calling crypto_dispatch(). Currently we perform crypto requests for IPSEC synchronous for most of crypto providers (software, aesni) and only VIA padlock calls crypto callback asynchronous. In synchronous mode it is possible, that security policy will be removed during the processing crypto request. And crypto callback will release the last reference to SP. Then upon return into ipsec[46]_process_packet() IPSECREQUEST_UNLOCK() will be called to already freed request. To prevent this we will take extra reference to SP. PR: 201876 Sponsored by: Yandex LLC	2015-09-30 08:16:33 +00:00
John-Mark Gurney	a2bc81bf7c	Make IPsec work with AES-GCM and AES-ICM (aka CTR) in OCF... IPsec defines the keys differently than NIST does, so we have to muck with key lengths and nonce/IVs to be standard compliant... Remove the iv from secasvar as it was unused... Add a counter protected by a mutex to ensure that the counter for GCM and ICM will never be repeated.. This is a requirement for security.. I would use atomics, but we don't have a 64bit one on all platforms.. Fix a bug where IPsec was depending upon the OCF to ensure that the blocksize was always at least 4 bytes to maintain alignment... Move this logic into IPsec so changes to OCF won't break IPsec... In one place, espx was always non-NULL, so don't test that it's non-NULL before doing work.. minor style cleanups... drop setting key and klen as they were not used... Enforce that OCF won't pass invalid key lengths to AES that would panic the machine... This was has been tested by others too... I tested this against NetBSD 6.1.5 using mini-test suite in https://github.com/jmgurney/ipseccfgs and the only things that don't pass are keyed md5 and sha1, and 3des-deriv (setkey syntax error), all other modes listed in setkey's man page... The nice thing is that NetBSD uses setkey, so same config files were used on both... Reviewed by: gnn	2015-08-04 17:47:11 +00:00
John-Mark Gurney	42e5fcbf2b	these are comparing authenticators and need to be constant time... This could be a side channel attack... Now that we have a function for this, use it... jmgurney/ipsecgcm: 24d704cc and 7f37a14	2015-07-31 00:31:52 +00:00
John-Mark Gurney	817c7ed900	Clean up this header file... use CTASSERTs now that we have them... Replace a draft w/ RFC that's over 10 years old. Note that _AALG and _EALG do not need to match what the IKE daemons think they should be.. This is part of the KABI... I decided to renumber AESCTR, but since we've never had working AESCTR mode, I'm not really breaking anything.. and it shortens a loop by quite a bit.. remove SKIPJACK IPsec support... SKIPJACK never made it out of draft (in 1999), only has 80bit key, NIST recommended it stop being used after 2010, and setkey nor any of the IKE daemons I checked supported it... jmgurney/ipsecgcm: a357a33, c75808b, e008669, b27b6d6 Reviewed by: gnn (earlier version)	2015-07-31 00:23:21 +00:00
John-Mark Gurney	a09a7146a7	RFC4868 section 2.3 requires that the output be half... This fixes problems that was introduced in r285336... I have verified that HMAC-SHA2-256 both ah only and w/ AES-CBC interoperate w/ a NetBSD 6.1.5 vm... Reviewed by: gnn	2015-07-29 07:15:16 +00:00
George V. Neville-Neil	0b75d21e18	Summary: Fix LINT build. The names of the new AES modes were not correctly used under the REGRESSION kernel option.	2015-07-10 02:23:50 +00:00
George V. Neville-Neil	16de9ac1b5	Add support for AES modes to IPSec. These modes work both in software only mode and with hardware support on systems that have AESNI instructions. Differential Revision: D2936 Reviewed by: jmg, eri, cognet Sponsored by: Rubicon Communications (Netgate)	2015-07-09 18:16:35 +00:00
Andrey V. Elsukov	3d80e82d60	Fix possible use after free due to security policy deletion. When we are passing mbuf to IPSec processing via ipsec[46]_process_packet(), we hold one reference to security policy and release it just after return from this function. But IPSec processing can be deffered and when we release reference to security policy after ipsec[46]_process_packet(), user can delete this security policy from SPDB. And when IPSec processing will be done, xform's callback function will do access to already freed memory. To fix this move KEY_FREESP() into callback function. Now IPSec code will release reference to SP after processing will be finished. Differential Revision: https://reviews.freebsd.org/D2324 No objections from: #network Sponsored by: Yandex LLC	2015-04-27 00:55:56 +00:00
Andrey V. Elsukov	962ac6c727	Change ipsec_address() and ipsec_logsastr() functions to take two additional arguments - buffer and size of this buffer. ipsec_address() is used to convert sockaddr structure to presentation format. The IPv6 part of this function returns pointer to the on-stack buffer and at the moment when it will be used by caller, it becames invalid. IPv4 version uses 4 static buffers and returns pointer to new buffer each time when it called. But anyway it is still possible to get corrupted data when several threads will use this function. ipsec_logsastr() is used to format string about SA entry. It also uses static buffer and has the same problem with concurrent threads. To fix these problems add the buffer pointer and size of this buffer to arguments. Now each caller will pass buffer and its size to these functions. Also convert all places where these functions are used (except disabled code). And now ipsec_address() uses inet_ntop() function from libkern. PR: 185996 Differential Revision: https://reviews.freebsd.org/D2321 Reviewed by: gnn Sponsored by: Yandex LLC	2015-04-18 16:58:33 +00:00
Andrey V. Elsukov	f0514a8b8a	Remove now unused mtag argument from ipsec*_common_input_cb. Obtained from: Yandex LLC Sponsored by: Yandex LLC	2014-12-11 17:14:49 +00:00
Andrey V. Elsukov	08537f4526	Remove code related to PACKET_TAG_IPSEC_IN_CRYPTO_DONE mbuf tag. It isn't used in FreeBSD. Obtained from: Yandex LLC Sponsored by: Yandex LLC	2014-12-11 17:07:21 +00:00
Andrey V. Elsukov	2d957916ef	Remove route chaching support from ipsec code. It isn't used for some time. * remove sa_route_union declaration and route_cache member from struct secashead; * remove key_sa_routechange() call from ICMP and ICMPv6 code; * simplify ip_ipsec_mtu(); * remove #include <net/route.h>; Sponsored by: Yandex LLC	2014-12-02 04:20:50 +00:00
Gleb Smirnoff	6df8a71067	Remove SYSCTL_VNET_* macros, and simply put CTLFLAG_VNET where needed. Sponsored by: Nginx, Inc.	2014-11-07 09:39:05 +00:00
Gleb Smirnoff	eedc7fd9e8	Provide includes that are needed in these files, and before were read in implicitly via if.h -> if_var.h pollution. Sponsored by: Netflix Sponsored by: Nginx, Inc.	2013-10-26 18:18:50 +00:00
Andrey V. Elsukov	db8c087944	Migrate structs ahstat, espstat, ipcompstat, ipipstat, pfkeystat, ipsec4stat, ipsec6stat to PCPU counters.	2013-07-09 10:08:13 +00:00
Andrey V. Elsukov	a04d64d875	Use corresponding macros to update statistics for AH, ESP, IPIP, IPCOMP, PFKEY. MFC after: 2 weeks	2013-06-20 11:44:16 +00:00
Pawel Jakub Dawidek	d3e8e66d75	Remove unused 'plen' variable.	2011-11-26 23:57:03 +00:00
Pawel Jakub Dawidek	cdb7ebe38c	The esp_max_ivlen global variable is not needed, we can just use EALG_MAX_BLOCK_LEN.	2011-11-26 23:27:41 +00:00
Pawel Jakub Dawidek	5be4c9b9e6	malloc(M_WAITOK) never fails, so there is no need to check for NULL.	2011-11-26 23:18:19 +00:00
Pawel Jakub Dawidek	0e4fb1db44	Eliminate 'err' variable and just use existing 'error'.	2011-11-26 23:15:28 +00:00
Pawel Jakub Dawidek	0a95a08ecb	Simplify code a bit.	2011-11-26 23:13:30 +00:00
Pawel Jakub Dawidek	b6a4c9acdb	There is no need to virtualize esp_max_ivlen.	2011-11-26 23:11:41 +00:00
Bjoern A. Zeeb	db178eb816	Make IPsec compile without INET adding appropriate #ifdef checks. Unfold the IPSEC_COMMON_INPUT_CB() macro in xform_{ah,esp,ipcomp}.c to not need three different versions depending on INET, INET6 or both. Mark two places preparing for not yet supported functionality with IPv6. Reviewed by: gnn Sponsored by: The FreeBSD Foundation Sponsored by: iXsystems MFC after: 4 days	2011-04-27 19:28:42 +00:00
Fabien Thomas	73171433c5	Optimisation in IPSEC(4): - Remove contention on ISR during the crypto operation by using rwlock(9). - Remove a second lookup of the SA in the callback. Gain on 6 cores CPU with SHA1/AES128 can be up to 30%. Reviewed by: vanhu MFC after: 1 month	2011-03-31 15:23:32 +00:00
VANHULLEBUS Yvan	442da28aeb	Fixed IPsec's HMAC_SHA256-512 support to be RFC4868 compliant. This will break interoperability with all older versions of FreeBSD for those algorithms. Reviewed by: bz, gnn Obtained from: NETASQ MFC after: 1w	2011-02-18 09:40:13 +00:00
Dimitry Andric	3e288e6238	After some off-list discussion, revert a number of changes to the DPCPU_DEFINE and VNET_DEFINE macros, as these cause problems for various people working on the affected files. A better long-term solution is still being considered. This reversal may give some modules empty set_pcpu or set_vnet sections, but these are harmless. Changes reverted: ------------------------------------------------------------------------ r215318 \| dim \| 2010-11-14 21:40:55 +0100 (Sun, 14 Nov 2010) \| 4 lines Instead of unconditionally emitting .globl's for the __start_set_xxx and __stop_set_xxx symbols, only emit them when the set_vnet or set_pcpu sections are actually defined. ------------------------------------------------------------------------ r215317 \| dim \| 2010-11-14 21:38:11 +0100 (Sun, 14 Nov 2010) \| 3 lines Apply the STATIC_VNET_DEFINE and STATIC_DPCPU_DEFINE macros throughout the tree. ------------------------------------------------------------------------ r215316 \| dim \| 2010-11-14 21:23:02 +0100 (Sun, 14 Nov 2010) \| 2 lines Add macros to define static instances of VNET_DEFINE and DPCPU_DEFINE.	2010-11-22 19:32:54 +00:00
Dimitry Andric	31c6a0037e	Apply the STATIC_VNET_DEFINE and STATIC_DPCPU_DEFINE macros throughout the tree.	2010-11-14 20:38:11 +00:00
Bjoern A. Zeeb	82cea7e6f3	MFP4: @176978-176982, 176984, 176990-176994, 177441 "Whitspace" churn after the VIMAGE/VNET whirls. Remove the need for some "init" functions within the network stack, like pim6_init(), icmp_init() or significantly shorten others like ip6_init() and nd6_init(), using static initialization again where possible and formerly missed. Move (most) variables back to the place they used to be before the container structs and VIMAGE_GLOABLS (before r185088) and try to reduce the diff to stable/7 and earlier as good as possible, to help out-of-tree consumers to update from 6.x or 7.x to 8 or 9. This also removes some header file pollution for putatively static global variables. Revert VIMAGE specific changes in ipfilter::ip_auth.c, that are no longer needed. Reviewed by: jhb Discussed with: rwatson Sponsored by: The FreeBSD Foundation Sponsored by: CK Software GmbH MFC after: 6 days	2010-04-29 11:52:42 +00:00
VANHULLEBUS Yvan	a45bff047c	Changed an IPSEC_ASSERT to a simple test, as such invalid packets may come from outside without being discarded before. Submitted by: aurelien.ansel@netasq.com Reviewed by: bz (secteam) Obtained from: NETASQ MFC after: 1m	2009-10-01 15:33:53 +00:00
Robert Watson	530c006014	Merge the remainder of kern_vimage.c and vimage.h into vnet.c and vnet.h, we now use jails (rather than vimages) as the abstraction for virtualization management, and what remained was specific to virtual network stacks. Minor cleanups are done in the process, and comments updated to reflect these changes. Reviewed by: bz Approved by: re (vimage blanket)	2009-08-01 19:26:27 +00:00
Robert Watson	1e77c1056a	Remove unused VNET_SET() and related macros; only VNET_GET() is ever actually used. Rename VNET_GET() to VNET() to shorten variable references. Discussed with: bz, julian Reviewed by: bz Approved by: re (kensmith, kib)	2009-07-16 21:13:04 +00:00
Robert Watson	eddfbb763d	Build on Jeff Roberson's linker-set based dynamic per-CPU allocator (DPCPU), as suggested by Peter Wemm, and implement a new per-virtual network stack memory allocator. Modify vnet to use the allocator instead of monolithic global container structures (vinet, ...). This change solves many binary compatibility problems associated with VIMAGE, and restores ELF symbols for virtualized global variables. Each virtualized global variable exists as a "reference copy", and also once per virtual network stack. Virtualized global variables are tagged at compile-time, placing the in a special linker set, which is loaded into a contiguous region of kernel memory. Virtualized global variables in the base kernel are linked as normal, but those in modules are copied and relocated to a reserved portion of the kernel's vnet region with the help of a the kernel linker. Virtualized global variables exist in per-vnet memory set up when the network stack instance is created, and are initialized statically from the reference copy. Run-time access occurs via an accessor macro, which converts from the current vnet and requested symbol to a per-vnet address. When "options VIMAGE" is not compiled into the kernel, normal global ELF symbols will be used instead and indirection is avoided. This change restores static initialization for network stack global variables, restores support for non-global symbols and types, eliminates the need for many subsystem constructors, eliminates large per-subsystem structures that caused many binary compatibility issues both for monitoring applications (netstat) and kernel modules, removes the per-function INIT_VNET_*() macros throughout the stack, eliminates the need for vnet_symmap ksym(2) munging, and eliminates duplicate definitions of virtualized globals under VIMAGE_GLOBALS. Bump __FreeBSD_version and update UPDATING. Portions submitted by: bz Reviewed by: bz, zec Discussed with: gnn, jamie, jeff, jhb, julian, sam Suggested by: peter Approved by: re (kensmith)	2009-07-14 22:48:30 +00:00
Marko Zec	bfe1aba468	Introduce vnet module registration / initialization framework with dependency tracking and ordering enforcement. With this change, per-vnet initialization functions introduced with r190787 are no longer directly called from traditional initialization functions (which cc in most cases inlined to pre-r190787 code), but are instead registered via the vnet framework first, and are invoked only after all prerequisite modules have been initialized. In the long run, this framework should allow us to both initialize and dismantle multiple vnet instances in a correct order. The problem this change aims to solve is how to replay the initialization sequence of various network stack components, which have been traditionally triggered via different mechanisms (SYSINIT, protosw). Note that this initialization sequence was and still can be subtly different depending on whether certain pieces of code have been statically compiled into the kernel, loaded as modules by boot loader, or kldloaded at run time. The approach is simple - we record the initialization sequence established by the traditional mechanisms whenever vnet_mod_register() is called for a particular vnet module. The vnet_mod_register_multi() variant allows a single initializer function to be registered multiple times but with different arguments - currently this is only used in kern/uipc_domain.c by net_add_domain() with different struct domain * as arguments, which allows for protosw-registered initialization routines to be invoked in a correct order by the new vnet initialization framework. For the purpose of identifying vnet modules, each vnet module has to have a unique ID, which is statically assigned in sys/vimage.h. Dynamic assignment of vnet module IDs is not supported yet. A vnet module may specify a single prerequisite module at registration time by filling in the vmi_dependson field of its vnet_modinfo struct with the ID of the module it depends on. Unless specified otherwise, all vnet modules depend on VNET_MOD_NET (container for ifnet list head, rt_tables etc.), which thus has to and will always be initialized first. The framework will panic if it detects any unresolved dependencies before completing system initialization. Detection of unresolved dependencies for vnet modules registered after boot (kldloaded modules) is not provided. Note that the fact that each module can specify only a single prerequisite may become problematic in the long run. In particular, INET6 depends on INET being already instantiated, due to TCP / UDP structures residing in INET container. IPSEC also depends on INET, which will in turn additionally complicate making INET6-only kernel configs a reality. The entire registration framework can be compiled out by turning on the VIMAGE_GLOBALS kernel config option. Reviewed by: bz Approved by: julian (mentor)	2009-04-11 05:58:58 +00:00
Marko Zec	1ed81b739e	First pass at separating per-vnet initializer functions from existing functions for initializing global state. At this stage, the new per-vnet initializer functions are directly called from the existing global initialization code, which should in most cases result in compiler inlining those new functions, hence yielding a near-zero functional change. Modify the existing initializer functions which are invoked via protosw, like ip_init() et. al., to allow them to be invoked multiple times, i.e. per each vnet. Global state, if any, is initialized only if such functions are called within the context of vnet0, which will be determined via the IS_DEFAULT_VNET(curvnet) check (currently always true). While here, V_irtualize a few remaining global UMA zones used by net/netinet/netipsec networking code. While it is not yet clear to me or anybody else whether this is the right thing to do, at this stage this makes the code more readable, and makes it easier to track uncollected UMA-zone-backed objects on vnet removal. In the long run, it's quite possible that some form of shared use of UMA zone pools among multiple vnets should be considered. Bump __FreeBSD_version due to changes in layout of structs vnet_ipfw, vnet_inet and vnet_net. Approved by: julian (mentor)	2009-04-06 22:29:41 +00:00
Marko Zec	44e33a0758	Change the initialization methodology for global variables scheduled for virtualization. Instead of initializing the affected global variables at instatiation, assign initial values to them in initializer functions. As a rule, initialization at instatiation for such variables should never be introduced again from now on. Furthermore, enclose all instantiations of such global variables in #ifdef VIMAGE_GLOBALS blocks. Essentialy, this change should have zero functional impact. In the next phase of merging network stack virtualization infrastructure from p4/vimage branch, the new initialization methology will allow us to switch between using global variables and their counterparts residing in virtualization containers with minimum code churn, and in the long run allow us to intialize multiple instances of such container structures. Discussed at: devsummit Strassburg Reviewed by: bz, julian Approved by: julian (mentor) Obtained from: //depot/projects/vimage-commit2/... X-MFC after: never Sponsored by: NLnet Foundation, The FreeBSD Foundation	2008-11-19 09:39:34 +00:00
Marko Zec	8b615593fc	Step 1.5 of importing the network stack virtualization infrastructure from the vimage project, as per plan established at devsummit 08/08: http://wiki.freebsd.org/Image/Notes200808DevSummit Introduce INIT_VNET_() initializer macros, VNET_FOREACH() iterator macros, and CURVNET_SET() context setting macros, all currently resolving to NOPs. Prepare for virtualization of selected SYSCTL objects by introducing a family of SYSCTL_V_() macros, currently resolving to their global counterparts, i.e. SYSCTL_V_INT() == SYSCTL_INT(). Move selected #defines from sys/sys/vimage.h to newly introduced header files specific to virtualized subsystems (sys/net/vnet.h, sys/netinet/vinet.h etc.). All the changes are verified to have zero functional impact at this point in time by doing MD5 comparision between pre- and post-change object files(). () netipsec/keysock.c did not validate depending on compile time options. Implemented by: julian, bz, brooks, zec Reviewed by: julian, bz, brooks, kris, rwatson, ... Approved by: julian (mentor) Obtained from: //depot/projects/vimage-commit2/... X-MFC after: never Sponsored by: NLnet Foundation, The FreeBSD Foundation	2008-10-02 15:37:58 +00:00
Bjoern A. Zeeb	603724d3ab	Commit step 1 of the vimage project, (network stack) virtualization work done by Marko Zec (zec@). This is the first in a series of commits over the course of the next few weeks. Mark all uses of global variables to be virtualized with a V_ prefix. Use macros to map them back to their global names for now, so this is a NOP change only. We hope to have caught at least 85-90% of what is needed so we do not invalidate a lot of outstanding patches again. Obtained from: //depot/projects/vimage-commit2/... Reviewed by: brooks, des, ed, mav, julian, jamie, kris, rwatson, zec, ... (various people I forgot, different versions) md5 (with a bit of help) Sponsored by: NLnet Foundation, The FreeBSD Foundation X-MFC after: never V_Commit_Message_Reviewed_By: more people than the patch	2008-08-17 23:27:27 +00:00
Bjoern A. Zeeb	eaa9325f48	In addition to the ipsec_osdep.h removal a week ago, now also eliminate IPSEC_SPLASSERT_SOFTNET which has been 'unused' since FreeBSD 5.0.	2008-05-24 15:32:46 +00:00

1 2

70 Commits