freebsd-skq

Author	SHA1	Message	Date
rwatson	57ca4583e7	Build on Jeff Roberson's linker-set based dynamic per-CPU allocator (DPCPU), as suggested by Peter Wemm, and implement a new per-virtual network stack memory allocator. Modify vnet to use the allocator instead of monolithic global container structures (vinet, ...). This change solves many binary compatibility problems associated with VIMAGE, and restores ELF symbols for virtualized global variables. Each virtualized global variable exists as a "reference copy", and also once per virtual network stack. Virtualized global variables are tagged at compile-time, placing them in a special linker set, which is loaded into a contiguous region of kernel memory. Virtualized global variables in the base kernel are linked as normal, but those in modules are copied and relocated to a reserved portion of the kernel's vnet region with the help of the kernel linker. Virtualized global variables exist in per-vnet memory set up when the network stack instance is created, and are initialized statically from the reference copy. Run-time access occurs via an accessor macro, which converts from the current vnet and requested symbol to a per-vnet address. When "options VIMAGE" is not compiled into the kernel, normal global ELF symbols will be used instead and indirection is avoided. This change restores static initialization for network stack global variables, restores support for non-global symbols and types, eliminates the need for many subsystem constructors, eliminates large per-subsystem structures that caused many binary compatibility issues both for monitoring applications (netstat) and kernel modules, removes the per-function INIT_VNET_*() macros throughout the stack, eliminates the need for vnet_symmap ksym(2) munging, and eliminates duplicate definitions of virtualized globals under VIMAGE_GLOBALS. Bump __FreeBSD_version and update UPDATING. 
Portions submitted by: bz Reviewed by: bz, zec Discussed with: gnn, jamie, jeff, jhb, julian, sam Suggested by: peter Approved by: re (kensmith)	2009-07-14 22:48:30 +00:00
rwatson	bd6eb7be79	Add address list locking for in6_ifaddrhead/ia_link: as with locking for in_ifaddrhead, we stick with an rwlock for the time being, which we will revisit in the future with a possible move to rmlocks. Some pieces of code require significant further reworking to be safe from all classes of writer-writer races. Reviewed by: bz MFC after: 6 weeks	2009-06-25 16:35:28 +00:00
rwatson	ea70a3542d	Add a new global rwlock, in_ifaddr_lock, which will synchronize use of the in_ifaddrhead and INADDR_HASH address lists. Previously, these lists were used unsynchronized as they were effectively never changed in steady state, but we've seen increasing reports of writer-writer races on very busy VPN servers as core count has gone up (and similar configurations where address lists change frequently and concurrently). For the time being, use rwlocks rather than rmlocks in order to take advantage of their better lock debugging support. As a result, we don't enable ip_input()'s read-locking of INADDR_HASH until an rmlock conversion is complete and a performance analysis has been done. This means that one class of reader-writer races still exists. MFC after: 6 weeks Reviewed by: bz	2009-06-25 11:52:33 +00:00
rwatson	9c4380a8ee	Convert netinet6 to using queue(9) rather than hand-crafted linked lists for the global IPv6 address list (in6_ifaddr -> in6_ifaddrhead). Adopt the code styles and conventions present in netinet where possible. Reviewed by: gnn, bz MFC after: 6 weeks (possibly not MFCable?)	2009-06-24 21:00:25 +00:00
bz	55f6868044	Move setting of ports from NAT-T below key_getsah() and actually below key_setsaval(). Without that, the lookup for the SA had failed as we were looking for a SA with the new, updated port numbers instead of the old ones and were comparing the ports in key_cmpsaidx(). This makes updating the remote -> local SA on the initiator work again. Problem introduced with: p4 changeset 152114	2009-06-19 21:01:55 +00:00
bz	b017972f11	Add the explicit include of vimage.h to another five .c files still missing it. Remove the "hidden" kernel only include of vimage.h from ip_var.h added with the very first Vimage commit r181803 to avoid further kernel poisoning.	2009-06-17 12:44:11 +00:00
vanhu	16c1346b9a	Added support for NAT-Traversal (RFC 3948) in IPsec stack. Thanks to (no special order) Emmanuel Dreyfus (manu@netbsd.org), Larry Baird (lab@gta.com), gnn, bz, and other FreeBSD devs, Julien Vanherzeele (julien.vanherzeele@netasq.com, for years of bug reporting), the PFSense team, and all people who used / tried the NAT-T patch for years and reported bugs, patches, etc... X-MFC: never Reviewed by: bz Approved by: gnn(mentor) Obtained from: NETASQ	2009-06-12 15:44:35 +00:00
bz	86d0c6ee3d	Properly hide IPv4 only variables and functions under #ifdef INET.	2009-06-10 19:25:46 +00:00
bz	b7ff2bdc20	After r193232 rt_tables in vnet.h are no longer indirectly dependent on the ROUTETABLES kernel option thus there is no need to include opt_route.h anymore in all consumers of vnet.h and no longer depend on it for module builds. Remove the hidden include in flowtable.h as well and leave the two explicit #includes in ip_input.c and ip_output.c.	2009-06-08 19:57:35 +00:00
zec	8b1f38241a	Introduce an infrastructure for dismantling vnet instances. Vnet modules and protocol domains may now register destructor functions to clean up and release per-module state. The destructor mechanisms can be triggered by invoking "vimage -d", or a future equivalent command which will be provided via the new jail framework. While this patch introduces numerous placeholder destructor functions, many of those are currently incomplete, thus leaking memory or (even worse) failing to stop all running timers. Many such issues are already known and will be incrementally fixed over the next weeks in smaller commits. Apart from introducing new fields in structs ifnet, domain, protosw and vnet_net, which requires the kernel and modules to be rebuilt, this change should have no impact on nooptions VIMAGE builds, since vnet destructors can only be called in VIMAGE kernels. Moreover, destructor functions should be in general compiled in only in options VIMAGE builds, except for kernel modules which can be safely kldunloaded at run time. Bump __FreeBSD_version to 800097. Reviewed by: bz, julian Approved by: rwatson, kib (re), julian (mentor)	2009-06-08 17:15:40 +00:00
rwatson	2bab695560	Reimplement the netisr framework in order to support parallel netisr threads: - Support up to one netisr thread per CPU, each processing its own workstream, or set of per-protocol queues. Threads may be bound to specific CPUs, or allowed to migrate, based on a global policy. In the future it would be desirable to support topology-centric policies, such as "one netisr per package". - Allow each protocol to advertise an ordering policy, which can currently be one of: NETISR_POLICY_SOURCE: packets must maintain ordering with respect to an implicit or explicit source (such as an interface or socket). NETISR_POLICY_FLOW: make use of mbuf flow identifiers to place work, as well as allowing protocols to provide a flow generation function for mbufs without flow identifiers (m2flow). Falls back on NETISR_POLICY_SOURCE if no flow ID is available. NETISR_POLICY_CPU: allow protocols to inspect and assign a CPU for each packet handled by netisr (m2cpuid). - Provide utility functions for querying the number of workstreams being used, as well as a mapping function from workstream to CPU ID, which protocols may use in work placement decisions. - Add explicit interfaces to get and set per-protocol queue limits, and get and clear drop counters, which query data or apply changes across all workstreams. - Add a more extensible netisr registration interface, in which protocols declare 'struct netisr_handler' structures for each registered NETISR_ type. These include name, handler function, optional mbuf to flow ID function, optional mbuf to CPU ID function, queue limit, and ordering policy. Padding is present to allow these to be expanded in the future. If no queue limit is declared, then a default is used. - Queue limits are now per-workstream, and raised from the previous IFQ_MAXLEN default of 50 to 256. - All protocols are updated to use the new registration interface, and with the exception of netnatm, default queue limits. 
Most protocols register as NETISR_POLICY_SOURCE, except IPv4 and IPv6, which use NETISR_POLICY_FLOW, and will therefore take advantage of driver- generated flow IDs if present. - Formalize a non-packet based interface between interface polling and the netisr, rather than having polling pretend to be two protocols. Provide two explicit hooks in the netisr worker for start and end events for runs: netisr_poll() and netisr_pollmore(), as well as a function, netisr_sched_poll(), to allow the polling code to schedule netisr execution. DEVICE_POLLING still embeds single-netisr assumptions in its implementation, so for now if it is compiled into the kernel, a single and un-bound netisr thread is enforced regardless of tunable configuration. In the default configuration, the new netisr implementation maintains the same basic assumptions as the previous implementation: a single, un-bound worker thread processes all deferred work, and direct dispatch is enabled by default wherever possible. Performance measurement shows a marginal performance improvement over the old implementation due to the use of batched dequeue. An rmlock is used to synchronize use and registration/unregistration using the framework; currently, synchronized use is disabled (replicating current netisr policy) due to a measurable 3%-6% hit in ping-pong micro-benchmarking. It will be enabled once further rmlock optimization has taken place. However, in practice, netisrs are rarely registered or unregistered at runtime. A new man page for netisr will follow, but since one doesn't currently exist, it hasn't been updated. This change is not appropriate for MFC, although the polling shutdown handler should be merged to 7-STABLE. Bump __FreeBSD_version. Reviewed by: bz	2009-06-01 10:41:38 +00:00
vanhu	48cef84e5f	Lock SPTREE before parsing it in key_spddump() Approved by: gnn(mentor) Obtained from: NETASQ MFC after: 2 weeks	2009-05-27 09:44:14 +00:00
vanhu	6e1cb07c00	Only decrease refcnt once when flushing SPD entries, to avoid flushing entries which are still used. Approved by: gnn(mentor) Obtained from: NETASQ MFC after: 1 month	2009-05-27 09:31:50 +00:00
bz	9642ff6e28	Add sysctls to toggle the behaviour of the (former) IPSEC_FILTERTUNNEL kernel option. This also permits tuning of the option per virtual network stack, as well as separately per inet, inet6. The kernel option is left for a transition period, marked deprecated, and will be removed soon. Initially requested by: phk (1 year 1 day ago) MFC after: 4 weeks	2009-05-23 16:42:38 +00:00
zec	d78a1b1a82	Change the curvnet variable from a global const struct vnet *, previously always pointing to the default vnet context, to a dynamically changing thread-local one. The curvnet context should be set on entry to networking code via CURVNET_SET() macros, and reverted to previous state via CURVNET_RESTORE(). Recursions on curvnet are permitted, though strongly discouraged. This change should have no functional impact on nooptions VIMAGE kernel builds, where CURVNET_ macros expand to whitespace. The curthread->td_vnet (aka curvnet) variable's purpose is to be an indicator of the vnet context in which the current network-related operation takes place, in case we cannot deduce the current vnet context from any other source, such as by looking at mbuf's m->m_pkthdr.rcvif->if_vnet, socket's so->so_vnet etc. Moreover, so far curvnet has turned out to be an invaluable consistency checking aid: it helps to catch cases when sockets, ifnets or any other vnet-aware structures may have leaked from one vnet to another. The exact placement of the CURVNET_SET() / CURVNET_RESTORE() macros was a result of an empirical iterative process, with an aim to reduce recursions on CURVNET_SET() to a minimum, while still reducing the scope of CURVNET_SET() to networking only operations - the alternative would be calling CURVNET_SET() on each system call entry. In general, curvnet has to be set in three typical cases: when processing socket-related requests from userspace or from within the kernel; when processing inbound traffic flowing from device drivers to upper layers of the networking stack, and when executing timer-driven networking functions. This change also introduces a DDB subcommand to show the list of all vnet instances. Approved by: julian (mentor)	2009-05-05 10:56:12 +00:00
zec	7a56f17240	Make indentation more uniform across vnet container structs. This is a purely cosmetic / NOP change. Reviewed by: bz Approved by: julian (mentor) Verified by: svn diff -x -w producing no output	2009-05-02 08:16:26 +00:00
zec	39b6dc8ba2	Permit building kernels with options VIMAGE, restricted to only a single active network stack instance. Turning on options VIMAGE at compile time yields the following changes relative to default kernel build: 1) V_ accessor macros for virtualized variables resolve to structure fields via base pointers, instead of being resolved as fields in global structs or plain global variables. As an example, V_ifnet becomes: options VIMAGE: ((struct vnet_net *) vnet_net)->_ifnet default build: vnet_net_0._ifnet options VIMAGE_GLOBALS: ifnet 2) INIT_VNET_ macros will declare and set up base pointers to be used by V_ accessor macros, instead of resolving to whitespace: INIT_VNET_NET(ifp->if_vnet); becomes struct vnet_net *vnet_net = (ifp->if_vnet)->mod_data[VNET_MOD_NET]; 3) Memory for vnet modules registered via vnet_mod_register() is now allocated at run time in sys/kern/kern_vimage.c, instead of per vnet module structs being declared as globals. If required, vnet modules can now request the framework to provide them with allocated bzeroed memory by filling in the vmi_size field in their vmi_modinfo structures. 4) structs socket, ifnet, inpcbinfo, tcpcb and syncache_head are extended to hold a pointer to the parent vnet. options VIMAGE builds will fill in those fields as required. 5) curvnet is introduced as a new global variable in options VIMAGE builds, always pointing to the default and only struct vnet. 6) struct sysctl_oid has been extended with additional two fields to store major and minor virtualization module identifiers, oid_v_subs and oid_v_mod. SYSCTL_V_ family of macros will fill in those fields accordingly, and store the offset in the appropriate vnet container struct in oid_arg1. In sysctl handlers dealing with virtualized sysctls, the SYSCTL_RESOLVE_V_ARG1() macro will compute the address of the target variable and make it available in arg1 variable for further processing. 
Unused fields in structs vnet_inet, vnet_inet6 and vnet_ipfw have been deleted. Reviewed by: bz, rwatson Approved by: julian (mentor)	2009-04-30 13:36:26 +00:00
bms	0915b81c76	Stub out IN6_LOOKUP_MULTI() for GETSPI requests, for now. This has the effect that IPv6 multicast traffic won't trigger an SPI allocation when IPSEC is in use, however, this obviously needs to stomp on locks, and IN6_LOOKUP_MULTI() is about to go away. This definitely needs to be revisited before 8.x is branched as a release branch.	2009-04-29 11:15:58 +00:00
bz	a12cc82f1a	key_gettunnel() has been unused with FAST_IPSEC (now IPSEC). KAME had explicit checks at one point using it, so just hide it behind #if 0 for now until we are sure if we can completely dump it or not. MFC after: 1 month	2009-04-27 21:04:16 +00:00
zec	b39b54e6de	Introduce vnet module registration / initialization framework with dependency tracking and ordering enforcement. With this change, per-vnet initialization functions introduced with r190787 are no longer directly called from traditional initialization functions (which cc in most cases inlined to pre-r190787 code), but are instead registered via the vnet framework first, and are invoked only after all prerequisite modules have been initialized. In the long run, this framework should allow us to both initialize and dismantle multiple vnet instances in a correct order. The problem this change aims to solve is how to replay the initialization sequence of various network stack components, which have been traditionally triggered via different mechanisms (SYSINIT, protosw). Note that this initialization sequence was and still can be subtly different depending on whether certain pieces of code have been statically compiled into the kernel, loaded as modules by boot loader, or kldloaded at run time. The approach is simple - we record the initialization sequence established by the traditional mechanisms whenever vnet_mod_register() is called for a particular vnet module. The vnet_mod_register_multi() variant allows a single initializer function to be registered multiple times but with different arguments - currently this is only used in kern/uipc_domain.c by net_add_domain() with different struct domain * as arguments, which allows for protosw-registered initialization routines to be invoked in a correct order by the new vnet initialization framework. For the purpose of identifying vnet modules, each vnet module has to have a unique ID, which is statically assigned in sys/vimage.h. Dynamic assignment of vnet module IDs is not supported yet. A vnet module may specify a single prerequisite module at registration time by filling in the vmi_dependson field of its vnet_modinfo struct with the ID of the module it depends on. 
Unless specified otherwise, all vnet modules depend on VNET_MOD_NET (container for ifnet list head, rt_tables etc.), which thus has to and will always be initialized first. The framework will panic if it detects any unresolved dependencies before completing system initialization. Detection of unresolved dependencies for vnet modules registered after boot (kldloaded modules) is not provided. Note that the fact that each module can specify only a single prerequisite may become problematic in the long run. In particular, INET6 depends on INET being already instantiated, due to TCP / UDP structures residing in INET container. IPSEC also depends on INET, which will in turn additionally complicate making INET6-only kernel configs a reality. The entire registration framework can be compiled out by turning on the VIMAGE_GLOBALS kernel config option. Reviewed by: bz Approved by: julian (mentor)	2009-04-11 05:58:58 +00:00
zec	c85551e0bc	First pass at separating per-vnet initializer functions from existing functions for initializing global state. At this stage, the new per-vnet initializer functions are directly called from the existing global initialization code, which should in most cases result in compiler inlining those new functions, hence yielding a near-zero functional change. Modify the existing initializer functions which are invoked via protosw, like ip_init() et al., to allow them to be invoked multiple times, i.e. once per vnet. Global state, if any, is initialized only if such functions are called within the context of vnet0, which will be determined via the IS_DEFAULT_VNET(curvnet) check (currently always true). While here, V_irtualize a few remaining global UMA zones used by net/netinet/netipsec networking code. While it is not yet clear to me or anybody else whether this is the right thing to do, at this stage this makes the code more readable, and makes it easier to track uncollected UMA-zone-backed objects on vnet removal. In the long run, it's quite possible that some form of shared use of UMA zone pools among multiple vnets should be considered. Bump __FreeBSD_version due to changes in layout of structs vnet_ipfw, vnet_inet and vnet_net. Approved by: julian (mentor)	2009-04-06 22:29:41 +00:00
vanhu	7e0f7398ba	Fixed comments so it stays in 80 chars by line with hard tabs of 8 chars.... Approved by: gnn(mentor)	2009-03-23 16:20:39 +00:00
vanhu	21967caaf2	Spelling fix in a comment Approved by: gnn(mentor)	2009-03-20 09:12:01 +00:00
vanhu	cea6d30cdc	Fixed style for some comments Approved by: gnn(mentor)	2009-03-19 15:50:45 +00:00
vanhu	72aca0d947	Fixed style for some comments Approved by: gnn(mentor)	2009-03-19 15:44:13 +00:00
vanhu	e33d6fbff6	Fixed deletion of sav entries in key_delsah() Approved by: gnn(mentor) Obtained from: NETASQ MFC after: 1 month	2009-03-18 14:01:41 +00:00
vanhu	a5f4a55744	SAs are valid (but dying) when they reached soft lifetime, even if they have never been used. Approved by: gnn(mentor) MFC after: 2 weeks	2009-03-05 16:22:32 +00:00
bz	4321e2a8f4	Add size-guards evaluated at compile-time to the main struct vnet_* which are not in a module of their own like gif. Single kernel compiles and universe will fail if the size of the struct changes. The expected values are given in sys/vimage.h. See the comments there on how to handle this. Requested by: peter	2009-03-01 11:01:00 +00:00
bz	df2be82cec	For all files including net/vnet.h directly include opt_route.h and net/route.h. Remove the hidden include of opt_route.h and net/route.h from net/vnet.h. We need to make sure that both opt_route.h and net/route.h are included before net/vnet.h because of the way MRT figures out the number of FIBs from the kernel option. If we do not, we end up with the default number of 1 when including net/vnet.h and array sizes are wrong. This does not change the list of files which depend on opt_route.h but we can identify them now more easily.	2009-02-27 14:12:05 +00:00
bz	710220924b	Shuffle the vimage.h includes or add where missing.	2009-02-27 13:22:26 +00:00
rdivacky	e5bfcba080	Change the functions to ANSI in those cases where it breaks promotion to int rule. See ISO C Standard: SS6.7.5.3:15. Approved by: kib (mentor) Reviewed by: warner Tested by: silence on -current	2009-02-24 18:09:31 +00:00
bz	8d30abae87	Try to remove/assimilate as much of formerly IPv4/6 specific (duplicate) code in sys/netipsec/ipsec.c and fold it into common, INET/6 independent functions. The file local functions ipsec4_setspidx_inpcb() and ipsec6_setspidx_inpcb() were 1:1 identical after the change in r186528. Rename to ipsec_setspidx_inpcb() and remove the duplicate. Public functions ipsec[46]_get_policy() were 1:1 identical. Remove one copy and merge in the factored out code from ipsec_get_policy() into the other. The public function left is now called ipsec_get_policy() and callers were adapted. Public functions ipsec[46]_set_policy() were 1:1 identical. Rename file local ipsec_set_policy() function to ipsec_set_policy_internal(). Remove one copy of the public functions, rename the other to ipsec_set_policy() and adapt callers. Public functions ipsec[46]_hdrsiz() were logically identical (ignoring one questionable assert in the v6 version). Rename the file local ipsec_hdrsiz() to ipsec_hdrsiz_internal(), the public function to ipsec_hdrsiz(), remove the duplicate copy and adapt the callers. The v6 version had been unused anyway. Cleanup comments. Public functions ipsec[46]_in_reject() were logically identical apart from statistics. Move the common code into a file local ipsec46_in_reject() leaving vimage+statistics in small AF specific wrapper functions. Note: unfortunately we already have a public ipsec_in_reject(). Reviewed by: sam Discussed with: rwatson (renaming to *_internal) MFC after: 26 days X-MFC: keep wrapper functions for public symbols?	2009-02-08 09:27:07 +00:00
bz	133ba226c9	Use NULL rather than 0 when comparing pointers. MFC after: 2 weeks	2009-01-30 20:17:08 +00:00
vanhu	1f1b367cb7	Remove remain <= MHLEN restriction in m_makespace(), which caused assert with big packets PR: kern/124609 Submitted by: fabien.thomas@netasq.com Approved by: gnn(mentor) Obtained from: NetBSD MFC after: 1 month	2009-01-28 10:41:10 +00:00
bz	086c4b5b79	Switch the last protosw* structs to C99 initializers. Reviewed by: ed, julian, Christoph Mallon <christoph.mallon@gmx.de> MFC after: 2 weeks	2009-01-05 20:29:01 +00:00
rwatson	1cfb2a06d0	Fix non-C99 initialization for protosw initializing pr_ousrreq.	2009-01-04 22:15:15 +00:00
rwatson	e259848db5	Unlike with struct protosw, several instances of struct ip6protosw did not use C99-style sparse structure initialization, so remove NULL assignments for now-removed pr_usrreq function pointers. Reported by: Chris Ruiz <yr.retarded at gmail.com>	2009-01-04 21:53:42 +00:00
bz	2207460cee	Like in the rest of the file and the network stack use inp as variable name for the inpcb. For consistency with the other *_hdrsiz functions use 'size' instead of 'siz' as variable name. No functional change. MFC after: 4 weeks	2008-12-27 23:24:59 +00:00
bz	14a7b2c5e1	Non-functional (style) changes: - Always use round brackets with return (). - Add empty line to beginning of functions without local variables. - Comments start with a capital letter and end in a '.'. While there adapt a few comments. Reviewed by: rwatson MFC after: 4 weeks	2008-12-27 22:58:16 +00:00
bz	2b6604ad9e	Convert function definitions to constantly use ANSI-style parameter declarations. Reviewed by: rwatson MFC after: 4 weeks	2008-12-27 21:20:34 +00:00
bz	9d1d3ec9fe	Rewrite ipsec6_setspidx_inpcb() to match the logic in the (now) equivalent IPv4 counterpart. MFC after: 4 weeks	2008-12-27 20:37:53 +00:00
bz	95679d0335	For consistency with ipsec4_setspidx_inpcb() rename file local function ipsec6_setspidx_in6pcb() to ipsec6_setspidx_inpcb(). MFC after: 4 weeks	2008-12-27 19:42:59 +00:00
bz	f460f8ea26	Change the in6p variable names to inp to be able to diff the v4 to the v6 implementations. MFC after: 4 weeks	2008-12-27 19:37:46 +00:00
bz	52b35d1d21	Make ipsec_getpolicybysock() static and no longer export it. It has not been used outside this file since about the FAST_IPSEC -> IPSEC change. MFC after: 4 weeks	2008-12-27 09:36:22 +00:00
bz	46b52a2d38	Remove long unused netinet/ipprotosw.h (basically since r82884). Discussed with: rwatson MFC after: 4 weeks	2008-12-23 16:52:03 +00:00
bz	03f6bb9dc9	Another step assimilating IPv[46] PCB code - directly use the inpcb names rather than the following IPv6 compat macros: in6pcb,in6p_sp, in6p_ip6_nxt,in6p_flowinfo,in6p_vflag, in6p_flags,in6p_socket,in6p_lport,in6p_fport,in6p_ppcb and sotoin6pcb(). Apart from removing duplicate code in netipsec, this is a pure whitespace, not a functional change. Discussed with: rwatson Reviewed by: rwatson (version before review requested changes) MFC after: 4 weeks (set the timer and see then)	2008-12-15 21:50:54 +00:00
bz	98e7fe0e6a	Second round of putting global variables, which were virtualized but formerly missed under VIMAGE_GLOBAL. Put the extern declarations of the virtualized globals under VIMAGE_GLOBAL as the globals themselves are already. This will help by the time when we are going to remove the globals entirely. Sponsored by: The FreeBSD Foundation	2008-12-13 19:13:03 +00:00
zec	7b573d1496	Conditionally compile out V_ globals while instantiating the appropriate container structures, depending on VIMAGE_GLOBALS compile time option. Make VIMAGE_GLOBALS a new compile-time option, which by default will not be defined, resulting in instantiations of global variables selected for V_irtualization (enclosed in #ifdef VIMAGE_GLOBALS blocks) to be effectively compiled out. Instantiate new global container structures to hold V_irtualized variables: vnet_net_0, vnet_inet_0, vnet_inet6_0, vnet_ipsec_0, vnet_netgraph_0, and vnet_gif_0. Update the VSYM() macro so that depending on VIMAGE_GLOBALS the V_ macros resolve either to the original globals, or to fields inside container structures, i.e. effectively #ifdef VIMAGE_GLOBALS #define V_rt_tables rt_tables #else #define V_rt_tables vnet_net_0._rt_tables #endif Update SYSCTL_V_*() macros to operate either on globals or on fields inside container structs. Extend the internal kldsym() lookups with the ability to resolve selected fields inside the virtualization container structs. This applies only to the fields which are explicitly registered for kldsym() visibility via VNET_MOD_DECLARE() and vnet_mod_register(), currently this is done only in sys/net/if.c. Fix a few broken instances of MODULE_GLOBAL() macro use in SCTP code, and modify the MODULE_GLOBAL() macro to resolve to V_ macros, which in turn result in proper code being generated depending on VIMAGE_GLOBALS. De-virtualize local static variables in sys/contrib/pf/net/pf_subr.c which were prematurely V_irtualized by automated V_ prepending scripts during earlier merging steps. PF virtualization will be done separately, most probably after next PF import. Convert a few variable initializations at instantiation to initialization in init functions, most notably in ipfw. Also convert TUNABLE_INT() initializers for V_ variables to TUNABLE_FETCH_INT() in initializer functions. 
Discussed at: devsummit Strassburg Reviewed by: bz, julian Approved by: julian (mentor) Obtained from: //depot/projects/vimage-commit2/... X-MFC after: never Sponsored by: NLnet Foundation, The FreeBSD Foundation	2008-12-10 23:12:39 +00:00
bz	604d89458a	Rather than using hidden includes (with circular dependencies), directly include only the header files needed. This reduces the unneeded spamming of various headers into lots of files. For now, this leaves us with very few modules including vnet.h and thus needing to depend on opt_route.h. Reviewed by: brooks, gnn, des, zec, imp Sponsored by: The FreeBSD Foundation	2008-12-02 21:37:28 +00:00
zec	7ecd715d48	Unhide declarations of network stack virtualization structs from underneath #ifdef VIMAGE blocks. This change introduces some churn in #include ordering and nesting throughout the network stack and drivers but is not expected to cause any additional issues. In the next step this will allow us to instantiate the virtualization container structures and switch from using global variables to their "containerized" counterparts. Reviewed by: bz, julian Approved by: julian (mentor) Obtained from: //depot/projects/vimage-commit2/... X-MFC after: never Sponsored by: NLnet Foundation, The FreeBSD Foundation	2008-11-28 23:30:51 +00:00
bz	9ef49d8b6f	Unify ipsec[46]_delete_pcbpolicy in ipsec_delete_pcbpolicy. Ignoring different names because of macros (in6pcb, in6p_sp) and inp vs. in6p variable name both functions were entirely identical. Reviewed by: rwatson (as part of a larger changeset) MFC after: 6 weeks (*) (*) possibly need to leave stub wrappers in 7 to keep the symbols.	2008-11-27 10:43:08 +00:00
zec	95a15f5c84	Merge more of currently non-functional (i.e. resolving to whitespace) macros from p4/vimage branch. Do a better job at enclosing all instantiations of globals scheduled for virtualization in #ifdef VIMAGE_GLOBALS blocks. De-virtualize and mark as const saorder_state_alive and saorder_state_any arrays from ipsec code, given that they are never updated at runtime, so virtualizing them would be pointless. Reviewed by: bz, julian Approved by: julian (mentor) Obtained from: //depot/projects/vimage-commit2/... X-MFC after: never Sponsored by: NLnet Foundation, The FreeBSD Foundation	2008-11-26 22:32:07 +00:00
bz	5de6e674b3	Unbreak the build without INET6.	2008-11-25 09:49:05 +00:00
zec	815d52c5df	Change the initialization methodology for global variables scheduled for virtualization. Instead of initializing the affected global variables at instantiation, assign initial values to them in initializer functions. As a rule, initialization at instantiation for such variables should never be introduced again from now on. Furthermore, enclose all instantiations of such global variables in #ifdef VIMAGE_GLOBALS blocks. Essentially, this change should have zero functional impact. In the next phase of merging network stack virtualization infrastructure from p4/vimage branch, the new initialization methodology will allow us to switch between using global variables and their counterparts residing in virtualization containers with minimum code churn, and in the long run allow us to initialize multiple instances of such container structures. Discussed at: devsummit Strassburg Reviewed by: bz, julian Approved by: julian (mentor) Obtained from: //depot/projects/vimage-commit2/... X-MFC after: never Sponsored by: NLnet Foundation, The FreeBSD Foundation	2008-11-19 09:39:34 +00:00
des	66f807ed8b	Retire the MALLOC and FREE macros. They are an abomination unto style(9). MFC after: 3 months	2008-10-23 15:53:51 +00:00
zec	8797d4caec	Step 1.5 of importing the network stack virtualization infrastructure from the vimage project, as per plan established at devsummit 08/08: http://wiki.freebsd.org/Image/Notes200808DevSummit Introduce INIT_VNET_*() initializer macros, VNET_FOREACH() iterator macros, and CURVNET_SET() context setting macros, all currently resolving to NOPs. Prepare for virtualization of selected SYSCTL objects by introducing a family of SYSCTL_V_*() macros, currently resolving to their global counterparts, i.e. SYSCTL_V_INT() == SYSCTL_INT(). Move selected #defines from sys/sys/vimage.h to newly introduced header files specific to virtualized subsystems (sys/net/vnet.h, sys/netinet/vinet.h etc.). All the changes are verified to have zero functional impact at this point in time by doing MD5 comparison between pre- and post-change object files(*). (*) netipsec/keysock.c did not validate depending on compile time options. Implemented by: julian, bz, brooks, zec Reviewed by: julian, bz, brooks, kris, rwatson, ... Approved by: julian (mentor) Obtained from: //depot/projects/vimage-commit2/... X-MFC after: never Sponsored by: NLnet Foundation, The FreeBSD Foundation	2008-10-02 15:37:58 +00:00
bz	1021d43b56	Commit step 1 of the vimage project, (network stack) virtualization work done by Marko Zec (zec@). This is the first in a series of commits over the course of the next few weeks. Mark all uses of global variables to be virtualized with a V_ prefix. Use macros to map them back to their global names for now, so this is a NOP change only. We hope to have caught at least 85-90% of what is needed so we do not invalidate a lot of outstanding patches again. Obtained from: //depot/projects/vimage-commit2/... Reviewed by: brooks, des, ed, mav, julian, jamie, kris, rwatson, zec, ... (various people I forgot, different versions) md5 (with a bit of help) Sponsored by: NLnet Foundation, The FreeBSD Foundation X-MFC after: never V_Commit_Message_Reviewed_By: more people than the patch	2008-08-17 23:27:27 +00:00
vanhu	72791f9bc1	Increase statistic counters for enc0 interface when enabled and processing IPSec traffic. Approved by: gnn (mentor) MFC after: 1 week	2008-08-12 09:05:01 +00:00
vanhu	3a946f98dc	Add lifetime information to generated SPD entries when SPDDUMP Approved by: gnn (mentor) MFC after: 4 weeks	2008-08-05 15:36:50 +00:00
trhodes	56ab14a8ae	Fill in a few sysctl descriptions. Approved by: rwatson	2008-07-26 00:55:35 +00:00
trhodes	b3b4a48308	Document a few sysctls. While here, remove dead code related to ip4_esp_randpad. Reviewed by: gnn, bz (older version) Approved by: gnn Tested with: make universe	2008-07-20 17:51:58 +00:00
rwatson	754034c5cf	Remove unused support for local and foreign addresses in generic raw socket support. These utility routines are used only for routing and pfkey sockets, neither of which have a notion of address, so were required to mock up fake socket addresses to avoid connection requirements for applications that did not specify their own fake addresses (most of them). Quite a bit of the removed code is #ifdef notdef, since raw sockets don't support bind() or connect() in practice. Removing this simplifies the raw socket implementation, and removes two (commented out) uses of dtom(9). Fake addresses passed to sendto(2) by applications are ignored for compatibility reasons, but this is now done in a more consistent way (and with a comment). Possibly, EINVAL could be returned here in the future if it is determined that no applications depend on the semantic inconsistency of specifying a destination address for a protocol without address support, but this will require some amount of careful surveying. NB: This does not affect netinet, netinet6, or other wire protocol raw sockets, which provide their own independent infrastructure with control block address support specific to the protocol. MFC after: 3 weeks Reviewed by: bz	2008-07-09 15:48:16 +00:00
julian	4dcc97b12c	Enter the 1990s. Use real function declaration.	2008-06-29 00:49:50 +00:00
bz	db8afa9bc3	In addition to the ipsec_osdep.h removal a week ago, now also eliminate IPSEC_SPLASSERT_SOFTNET which has been 'unused' since FreeBSD 5.0.	2008-05-24 15:32:46 +00:00
gnn	5e9c239f57	Remove last bits of OS adaptation code from the IPSec code. Reviewed By: bz	2008-05-17 04:00:11 +00:00
bz	e1cf25141c	Fix a bug that when getting/dumping the soft lifetime we reported the hard lifetime instead. MFC after: 3 days	2008-03-24 15:01:20 +00:00
bz	42fbad307b	Import change from KAME, rev. 1.362 kame/kame/sys/netkey/key.c In case of "new SA", we must check the hard lifetime of the old SA to find out if it is not permanent and we can delete it. Submitted by: sakane via gnn MFC after: 3 days	2008-03-24 14:55:09 +00:00
bz	418e4a564c	Add ';' missed with the SYSINIT changes. Not noticed by tb as TCP_SIGNATURE is not in LINT. MFC after: 1 month	2008-03-21 18:31:42 +00:00
rwatson	877d7c65ba	In keeping with style(9)'s recommendations on macros, use a ';' after each SYSINIT() macro invocation. This makes a number of lightweight C parsers much happier with the FreeBSD kernel source, including cflow's prcc and lxr. MFC after: 1 month Discussed with: imp, rink	2008-03-16 10:58:09 +00:00
bz	33dfb1706b	Correct IPsec behaviour with a 'use' level in SP but no SA available. In that case return and continue processing the packet without IPsec. PR: 121384 MFC after: 5 days Reported by: Cyrus Rahman (crahman gmail.com) Tested by: Cyrus Rahman (crahman gmail.com) [slightly older version]	2008-03-14 16:38:11 +00:00
bz	ee90b5b6c8	Remove the "Fast " from the "Fast IPsec: Initialized Security Association Processing." printf. People kept asking questions about this after the IPsec shuffle. This still is the Fast IPsec implementation so no worries that it would be any slower now. There are no functional changes. Discussed with: sam MFC after: 4 days	2008-03-14 16:25:40 +00:00
bz	767a2621f0	Fix bugs when allocating and passing information of current lifetime and soft lifetime [1] introduced in rev. 1.21 of key.c. Along with that, fix a related problem in key_debug printing the correct data. While there replace a printf by panic in a sanity check. PR: 120751 Submitted by: Kazuaki ODA (kazuaki aliceblue.jp) [1] MFC after: 5 days	2008-03-02 17:12:28 +00:00
bz	cfb85f0c07	Rather than passing around a cached 'priv', pass in an ucred to ipsec_set_policy and do the privilege check only if needed. Try to assimilate both ip_ctloutput code blocks calling ipsec*_set_policy. Reviewed by: rwatson	2008-02-02 14:11:31 +00:00
bz	05fda2a0bf	Add sysctls to if_enc(4) to control whether the firewalls or bpf will see inner and outer headers or just inner or outer headers for incoming and outgoing IPsec packets. This is useful in bpf to not have over long lines for debugging or selecting packets based on the inner headers. It also properly defines the behavior of what the firewalls see. Last but not least it gives you if_enc(4) for IPv6 as well. [ As some auxiliary state was not available in the later input path we save it in the tdbi. That way tcpdump can give a consistent view of either of (authentic,confidential) for both before and after states. ] Discussed with: thompsa (2007-04-25, basic idea of unifying paths) Reviewed by: thompsa, gnn	2007-11-28 22:33:53 +00:00
bz	0e9e73cbd0	Adjust a comment that suggests that we might consider a panic. Make clear that this is not a good idea when called from tcp_output()->ipsec_hdrsiz_tcp()->ipsec4_hdrsize_tcp() as we do not know if IPsec processing is needed at that point.	2007-11-28 21:48:21 +00:00
bz	a7318bd80c	Move the priv check before the malloc call for so_pcb. In case attach fails because of the priv check we leaked the memory and left so_pcb as fodder for invariants. Reported by: Pawel Worach Reviewed by: rwatson	2007-11-16 22:35:33 +00:00
bz	5c6a60df9f	Add a missing priv check in key_attach to prevent non-su users from messing with the spdb and sadb. Problem sneaked in with the fast_ipsec+v6->ipsec merger by no longer going via raw_usrreqs.pr_attach. Reported by: Pawel Worach Identified by: rwatson Reviewed by: rwatson MFC after: 3 days	2007-11-12 23:47:48 +00:00
gnn	a2ad10dc87	Fix for an infinite loop in processing ESP, IPv6 packets. The control input routine passes a NULL as its void argument when it has reached the innermost header, which terminates the loop. Reported by: Pawel Worach <pawel.worach@gmail.com> Approved by: re	2007-09-12 05:54:53 +00:00
rwatson	23574c8673	Remove the now-unused NET_{LOCK,UNLOCK,ASSERT}_GIANT() macros, which previously conditionally acquired Giant based on debug.mpsafenet. As that has now been removed, they are no longer required. Removing them significantly simplifies error-handling in the socket layer, eliminating quite a bit of unwinding of locking in error cases. While here clean up the now unneeded opt_net.h, which previously was used for the NET_WITH_GIANT kernel option. Clean up some related gotos for consistency. Reviewed by: bz, csjp Tested by: kris Approved by: re (kensmith)	2007-08-06 14:26:03 +00:00
bz	ee4925e857	Replace hard coded options by their defined PFIL_{IN,OUT} names. Approved by: re (hrs)	2007-07-19 09:57:54 +00:00
gnn	aeca69ded5	Commit the change from FAST_IPSEC to IPSEC. The FAST_IPSEC option is now deprecated, as well as the KAME IPsec code. What was FAST_IPSEC is now IPSEC. Approved by: re Sponsored by: Secure Computing	2007-07-03 12:13:45 +00:00
gnn	0cd74db89b	Commit IPv6 support for FAST_IPSEC to the tree. This commit includes only the kernel files, the rest of the files will follow in a second commit. Reviewed by: bz Approved by: re Supported by: Secure Computing	2007-07-01 11:41:27 +00:00
bz	028d7c7c98	'spi' and the return value of ntohl are unsigned. Remove the extra >=0 check which was always true. Document the special meaning of spi values of 0 and 1-255 with a comment. Found with: Coverity Prevent(tm) CID: 2047	2007-06-16 09:25:23 +00:00
bz	e1f2e76904	In case of failure we can directly return ENOBUFS because 'result' is still NULL and we do not need to free anything. That allows us to gc the entire goto parts and a now unused variable. Found with: Coverity Prevent(tm) CID: 2519	2007-06-16 00:15:14 +00:00
bz	e622d327e5	Add a missing return so that we drop out in case of an error and do not continue with a NULL pointer. [1] While here change the return of the error handling code path above. I cannot see why we should always return 0 there. Neither does KAME nor do we in here for the similar check in all the other functions. Found with: Coverity Prevent(tm) [1] CID: 2521	2007-06-15 23:45:39 +00:00
bz	3a2d39f8a2	With the current code 'src' is never NULL. Nevertheless move the check for NULL before dereferencing the pointer. Found with: Coverity Prevent(tm) CID: 2528	2007-06-15 22:35:59 +00:00
bz	28982ea6ee	Looking at {ah,esp}_input_cb it seems we might be able to end up without an mtag in ipsec4_common_input_cb. So in case of !IPCOMP (AH,ESP) only change the m_tag_id if an mtag was passed to ipsec4_common_input_cb. Found with: Coverity Prevent(tm) CID: 2523	2007-06-15 22:23:33 +00:00
bz	77956753fe	s,#,*, in a multi-line comment. This is C. No functional change.	2007-06-15 21:34:12 +00:00
bz	9868265580	Though we are only called for the three security protocols we can handle, document those sprotos using an IPSEC_ASSERT so that it will be clear that 'spi' will always be initialized when used the first time. Found with: Coverity Prevent(tm) CID: 2533	2007-06-15 21:32:51 +00:00
rwatson	00b02345d4	Eliminate now-unused SUSER_ALLOWJAIL arguments to priv_check_cred(); in some cases, move to priv_check() if it was an operation on a thread and no other flags were present. Eliminate caller-side jail exception checking (also now-unused); jail privilege exception code now goes solely in kern_jail.c. We can't yet eliminate suser() due to some cases in the KAME code where a privilege check is performed and then used in many different deferred paths. Do, however, move those prototypes to priv.h. Reviewed by: csjp Obtained from: TrustedBSD Project	2007-06-12 00:12:01 +00:00
bz	7bbae86575	In ipsec6_output_tunnel() make sure that the SA contents do not change. The same would apply to ipsec6_output_trans() but there is a larger patch around which already corrected that case. Do not interfere with that one.	2007-05-29 22:44:24 +00:00
bz	c255051269	fix typo: s,applyed,applied,g	2007-05-29 22:34:58 +00:00
bz	183fd7a84a	Implement ICMPv6 support in ipsec6_get_ulp(). This is needed to make security policies work correctly if ICMPv6 type and/or code are given. See setkey(8) 'upperspec' para. for details.	2007-05-29 22:32:12 +00:00
bz	4662f48b4e	Add missing break; so when comparing AF_INET6 addresses, scope and ports we do not run into the default case and return 'no match' instead of 'match'.	2007-05-29 22:18:44 +00:00
gnn	38b76f0623	Integrate the Camellia Block Cipher. For more information see RFC 4132 and its bibliography. Submitted by: Tomoyuki Okazaki <okazaki at kick dot gr dot jp> MFC after: 1 month	2007-05-09 19:37:02 +00:00
rwatson	922d6e13fa	Update comment regarding how we check privilege on FreeBSD: we now use priv_check().	2007-04-10 16:09:00 +00:00
sam	19daed61a7	add include now required for crypto flags	2007-03-22 22:25:25 +00:00
sam	f96ba7ffda	Overhaul driver/subsystem api's: o make all crypto drivers have a device_t; pseudo drivers like the s/w crypto driver synthesize one o change the api between the crypto subsystem and drivers to use kobj; cryptodev_if.m defines this api o use the fact that all crypto drivers now have a device_t to add support for specifying which of several potential devices to use when doing crypto operations o add new ioctls that allow user apps to select a specific crypto device to use (previous ioctls maintained for compatibility) o overhaul crypto subsystem code to eliminate lots of cruft and hide implementation details from drivers o bring in numerous fixes from Michael Richardson/hifn; mostly for 795x parts o add an optional mechanism for mmap'ing the hifn 795x public key h/w to user space for use by openssl (not enabled by default) o update crypto test tools to use new ioctl's and add cmd line options to specify a device to use for tests These changes will also enable much future work on improving the core crypto subsystem; including proper load balancing and interposing code between the core and drivers to dispatch small operations to the s/w driver as appropriate. These changes were instigated by the work of Michael Richardson. Reviewed by: pjd Approved by: re	2007-03-21 03:42:51 +00:00
bz	762d6693b6	s,#if INET6,#ifdef INET6, This unbreaks the build for FAST_IPSEC && !INET6 and was wrong anyway. Reported by: Dmitry Pryanishnikov <dmitry atlantis.dp.ua>	2006-12-14 17:33:46 +00:00
bz	297206ec2a	MFp4: 92972, 98913 + one more change In ip6_sprintf no longer use and return one of eight static buffers for printing/logging ipv6 addresses. The caller now has to hand in a sufficiently large buffer as first argument.	2006-12-12 12:17:58 +00:00

233 Commits