freebsd-skq

Author	SHA1	Message	Date
kevlo	0cbbbb7d30	Clean up some unused leftover code. Pointed out by: ae	2013-03-22 01:45:54 +00:00
kevlo	b0b955ade2	Remove unused global variables. Reviewed by: ae, glebius	2013-03-22 01:40:17 +00:00
glebius	f07362f54e	- Use m_getcl() instead of hand allocating. - Do not calculate constant length values at run time, CTASSERT() their sanity. - Remove superfluous cleaning of mbuf fields after allocation. - Replace compat macros with function calls. Sponsored by: Nginx, Inc.	2013-03-15 13:48:53 +00:00
glebius	79cb402edb	- Use m_getcl() instead of hand allocating. - Use m_get()/m_gethdr() instead of macros. - Remove superfluous cleaning of mbuf fields after allocation. Sponsored by: Nginx, Inc.	2013-03-15 12:50:29 +00:00
glebius	ace684a132	Use m_getcl() instead of hand made allocation. Sponsored by: Nginx, Inc.	2013-03-15 12:33:23 +00:00
ae	4e920d3af6	Take the inpcb rlock before calculating checksum, it was accidentally moved in r191672. Obtained from: Yandex LLC MFC after: 1 week	2013-03-12 02:20:20 +00:00
np	e7cfe70efd	Generate lle_event in the IPv6 neighbor discovery code too. Reviewed by: bz@	2013-01-26 00:05:22 +00:00
np	09b8766144	Avoid NULL dereference in nd6_storelladdr when no mbuf is provided. It is called this way from a couple of places in the OFED code. (toecore calls it too but that's going to change shortly). Reviewed by: bz@	2013-01-25 23:11:13 +00:00
ae	0bad7195e9	Simplify in6_setscope() function to get better performance. Currently we use interface indices as zone IDs for link-local and interface-local scopes, and since we don't have any tool to configure zone IDs, there is no need to acquire the afdata lock several times per packet only to read if_index value. So, now in6_setscope reads zone IDs for interface-local, link-local and global scopes without a lock. Sponsored by: Yandex LLC MFC after: 2 weeks	2013-01-10 00:10:24 +00:00
ae	6be782d67f	Remove unneeded variable. MFC after: 1 week	2013-01-09 18:54:58 +00:00
ume	e33acd92c3	Add no_prefer_iface option. It stops treating the address on the interface as special by source address selection rule even when the interface is outgoing interface. This is desired in some situation. Requested by: hrs Reviewed by: IHANet folks including hrs MFC after: 1 week	2013-01-09 18:18:08 +00:00
ae	5f7fde904c	The in6_setscope() function determines the scope zone id of an address and embeds it into the address. Inside the kernel we keep addresses with embedded zone id only for two scopes: link-local and interface-local. For other scopes this function is a nop in most cases. To reduce the overhead of locking, first check that the address is capable of embedding. Also, handle the loopback address before acquiring the lock. Sponsored by: Yandex LLC MFC after: 1 week	2013-01-09 00:36:06 +00:00
peter	3f8d5a8f51	Temporarily revert rev 244678. This is causing loopback problems with the lo (loopback) interfaces.	2013-01-03 10:21:28 +00:00
glebius	9f622a1b38	The SIOCSIFFLAGS ioctl handler runs if_up()/if_down() that notify all interested parties in case if interface flag IFF_UP has changed. However, not only SIOCSIFFLAGS can raise the flag, but SIOCAIFADDR and SIOCAIFADDR_IN6 can, too. The actual |= is done not in the protocol code, but in code of interface drivers. To fix this historical layering violation, we will check whether ifp->if_ioctl(SIOCSIFADDR) raised the IFF_UP flag, and if it did, run the if_up() handler. This fixes configuring an address under CARP control on an interface that was initially !IFF_UP. P.S. I intentionally omitted handling the IFF_SMART flag. This flag was never ever used in any driver since it was introduced, and since it means another layering violation, it should be garbage collected instead of pretended to be supported.	2012-12-25 13:01:58 +00:00
ae	5f9f8c19a2	When we have some address to forward (e.g. it was specified with ipfw fwd), we should pass it as first argument into in6_selectroute_fib function to initiate new route lookup. MFC after: 1 week	2012-12-19 17:28:17 +00:00
ae	ddb9833615	Make dst_sa initialization only when it is actually needed. MFC after: 1 week	2012-12-19 17:08:49 +00:00
ae	e0bd011045	The selectroute functions do their own accounting of EHOSTUNREACH errors, no need to do it twice. MFC after: 1 week	2012-12-19 17:02:07 +00:00
ae	001d436ac8	Use M_PROTO7 flag for M_IP6_NEXTHOP, because M_PROTO2 was used for M_AUTHIPHDR. Pointy hat to: ae Reported by: Vadim Goncharov MFC after: 3 days	2012-12-17 14:36:56 +00:00
ae	6ae8779032	In addition to the tailq of IPv6 addresses, add a hash table. For now use 256 buckets and fnv_hash function. Use xor'ed 32-bit s6_addr32 parts of in6_addr structure as a hash key. Update in6_localip and in6_is_addr_deprecated to use the hash table for faster lookup. Sponsored by: Yandex LLC Discussed with: dwmalone, glebius, bz	2012-12-15 20:04:24 +00:00
glebius	8137816adb	Fix problem in r238990. The LLE_LINKED flag should be tested prior to entering llentry_free(), and in case if we lose the race, we should simply perform LLE_FREE_LOCKED(). Otherwise, if the race is lost by the thread performing arptimer(), it will remove two references from the lle instead of one. Reported by: Ian FREISLICH <ianf clue.co.za>	2012-12-13 11:11:15 +00:00
hrs	377b89c55f	- Move definition of V_deembed_scopeid to scope6_var.h. - Deembed scope id in L3 address in in6_lltable_dump(). - Simplify scope id recovery in rtsock routines. - Remove embedded scope id handling in ndp(8) and route(8) completely.	2012-12-05 19:45:24 +00:00
glebius	8e20fa5ae9	Mechanically substitute flags from historic mbuf allocator with malloc(9) flags within sys. Exceptions: - sys/contrib not touched - sys/mbuf.h edited manually	2012-12-05 08:04:20 +00:00
ae	dcda6a6951	Remove opt_inet.h, it isn't required here. MFC after: 1 week	2012-11-20 14:09:37 +00:00
hrs	ec229e5ea6	Check if an extracted zoneid is equal to the non-zero sin6_scope_id only when it is link-local or MC interface-local.	2012-11-18 16:06:51 +00:00
tuexen	9a8531105a	Add support for SCTP/UDP/IPV6. This completes the support of http://tools.ietf.org/html/draft-ietf-tsvwg-sctp-udp-encaps MFC after: 1 week	2012-11-17 20:04:04 +00:00
ae	4e1bc5063b	Reduce the overhead of locking, use IF_AFDATA_RLOCK() when we are doing simple lookups. Sponsored by: Yandex LLC MFC after: 1 week	2012-11-16 12:12:02 +00:00
ae	6d611aa2a2	if_afdata lock was converted from mutex to rwlock long ago, so we can replace IF_AFDATA_LOCK() macro depending on the access type. Sponsored by: Yandex LLC MFC after: 1 week	2012-11-14 17:36:06 +00:00
ae	d80ce03f42	SCOPE6_LOCK protects V_sid_default, no need to acquire it without any access to V_sid_default. Sponsored by: Yandex LLC MFC after: 1 week	2012-11-14 17:23:48 +00:00
ae	42ab5bd8d7	zoneid has unsigned type. MFC after: 1 week	2012-11-14 17:14:03 +00:00
obrien	a841c9341b	Use consistent style.	2012-11-13 01:48:00 +00:00
ae	4354018055	Remove the recently added sysctl variable net.pfil.forward. Instead, add protocol specific mbuf flags M_IP_NEXTHOP and M_IP6_NEXTHOP. Use them to indicate that the mbuf's chain contains the PACKET_TAG_IPFORWARD tag. And do a tag lookup only when this flag is set. Suggested by: andre	2012-11-02 01:20:55 +00:00
tuexen	139b791e20	Whitespace changes due to upstream integration of SCTP changes in the FreeBSD code base.	2012-10-29 20:47:32 +00:00
ae	71112b5a8e	Remove the IPFIREWALL_FORWARD kernel option and make possible to turn on the related functionality in the runtime via the sysctl variable net.pfil.forward. It is turned off by default. Sponsored by: Yandex LLC Discussed with: net@ MFC after: 2 weeks	2012-10-25 09:39:14 +00:00
delphij	3948ce713c	Remove __P. Submitted by: kevlo Reviewed by: md5(1) MFC after: 2 months	2012-10-22 21:49:56 +00:00
glebius	5cc3ac5902	Switch the entire IPv4 stack to keep the IP packet header in network byte order. Any host byte order processing is done in local variables and host byte order values are never[1] written to a packet. After this change a packet processed by the stack isn't modified at all[2] except for TTL. After this change a network stack hacker doesn't need to scratch his head trying to figure out what is the byte order at the given place in the stack. [1] One exception still remains. The raw sockets convert host byte order before passing a packet to an application. Probably this would remain for ages for compatibility. [2] The ip_input() still subtracts header len from ip->ip_len, but this is planned to be fixed soon. Reviewed by: luigi, Maxim Dounin <mdounin mdounin.ru> Tested by: ray, Olivier Cochard-Labbe <olivier cochard.me>	2012-10-22 21:09:03 +00:00
melifaro	a9d460420e	Eliminate code checking if found IPv6 rte is dynamic. IPv6 redirects are using (different) ND-based approach described in RFC 4861. This change is similar to r241406 which conditionally skips the same check in IPv4. This change is part of bigger patch eliminating rte locking. Sponsored by: Yandex LLC. OK'd by: hrs MFC after: 2 weeks	2012-10-22 12:54:52 +00:00
andre	34a9a386cb	Mechanically remove the last stray remains of spl* calls from net/. They have been Noop's for a long time now.	2012-10-18 13:57:24 +00:00
melifaro	85ee5d74ce	Cleanup documentation: cloning route support has been removed in r186119. MFC after: 2 weeks	2012-10-13 09:31:01 +00:00
kevlo	ceb08698f2	Revert previous commit... Pointyhat to: kevlo (myself)	2012-10-10 08:36:38 +00:00
kevlo	8747a46991	Prefer NULL over 0 for pointers	2012-10-09 08:27:40 +00:00
avg	a1635c3475	ip6_ipsec_output: fix a typo in r241344 Acting as a remote drone of glebius.	2012-10-08 13:45:40 +00:00
glebius	9086143e8c	After r241245 it appeared that in_delayed_cksum(), which still expects host byte order, was sometimes called with net byte order. Since we are moving towards net byte order throughout the stack, the function was converted to expect net byte order, and its consumers fixed appropriately: - ip_output(), ipfilter(4) not changed, since already call in_delayed_cksum() with header in net byte order. - divert(4), ng_nat(4), ipfw_nat(4) now don't need to swap byte order there and back. - mrouting code and IPv6 ipsec now need to switch byte order there and back, but I hope, this is temporary solution. - In ipsec(4) shifted switch to net byte order prior to in_delayed_cksum(). - pf_route() catches up on r241245 changes to ip_output().	2012-10-08 08:03:58 +00:00
glebius	5190d38ee3	Merge the projects/pf/head branch, that was worked on for last six months, into head. The most significant achievements in the new code: o Fine grained locking, thus much better performance. o Fixes to many problems in pf, that were specific to FreeBSD port. New code doesn't have that many ifdefs and much less OpenBSDisms, thus is more attractive to our developers. Those interested in details, can browse through SVN log of the projects/pf/head branch. And for reference, here is exact list of revisions merged: r232043, r232044, r232062, r232148, r232149, r232150, r232298, r232330, r232332, r232340, r232386, r232390, r232391, r232605, r232655, r232656, r232661, r232662, r232663, r232664, r232673, r232691, r233309, r233782, r233829, r233830, r233834, r233835, r233836, r233865, r233866, r233868, r233873, r234056, r234096, r234100, r234108, r234175, r234187, r234223, r234271, r234272, r234282, r234307, r234309, r234382, r234384, r234456, r234486, r234606, r234640, r234641, r234642, r234644, r234651, r235505, r235506, r235535, r235605, r235606, r235826, r235991, r235993, r236168, r236173, r236179, r236180, r236181, r236186, r236223, r236227, r236230, r236252, r236254, r236298, r236299, r236300, r236301, r236397, r236398, r236399, r236499, r236512, r236513, r236525, r236526, r236545, r236548, r236553, r236554, r236556, r236557, r236561, r236570, r236630, r236672, r236673, r236679, r236706, r236710, r236718, r237154, r237155, r237169, r237314, r237363, r237364, r237368, r237369, r237376, r237440, r237442, r237751, r237783, r237784, r237785, r237788, r237791, r238421, r238522, r238523, r238524, r238525, r239173, r239186, r239644, r239652, r239661, r239773, r240125, r240130, r240131, r240136, r240186, r240196, r240212. I'd like to thank people who participated in early testing: Tested by: Florian Smeets <flo freebsd.org> Tested by: Chekaluk Vitaly <artemrts ukr.net> Tested by: Ben Wilber <ben desync.com> Tested by: Ian FREISLICH <ianf cloudseed.co.za>	2012-09-08 06:41:54 +00:00
trociny	4b20dde343	In ip6_ctloutput() guard inp_flags modifications with INP_WLOCK. MFC after: 2 weeks	2012-08-19 08:16:13 +00:00
glebius	abf245020a	Fix races between in_lltable_prefix_free(), lla_lookup(), llentry_free() and arptimer(): o Use callout_init_rw() for lle timeout, this allows us safely disestablish them. - This allows us to simplify the arptimer() and make it race safe. o Consistently use ifp->if_afdata_lock to lock access to linked lists in the lle hashes. o Introduce new lle flag LLE_LINKED, which marks an entry that is attached to the hash. - Use LLE_LINKED to avoid double unlinking via consequent calls to llentry_free(). - Mark lle with LLE_DELETED via |= operation instead of =, so that other flags won't be lost. o Make LLE_ADDREF(), LLE_REMREF() and LLE_FREE_LOCKED() more consistent and provide more informative KASSERTs. The patch is a collaborative work of all submitters and myself. PR: kern/165863 Submitted by: Andrey Zonov <andrey zonov.org> Submitted by: Ryan Stone <rysto32 gmail.com> Submitted by: Eric van Gyzen <eric_van_gyzen dell.com>	2012-08-02 13:57:49 +00:00
glebius	588de42f27	Some more whitespace cleanup.	2012-08-01 09:00:26 +00:00
bz	73fec218b5	In case of IPsec we have to do delayed checksum calculations before adding any extension header, or rather before calling into IPsec processing as we may send the packet and not return to IPv6 output processing here. PR: kern/170116 MFC After: 3 days	2012-07-31 23:34:06 +00:00
glebius	53cb168f80	Some style(9) and whitespace changes. Together with: Andrey Zonov <andrey zonov.org>	2012-07-31 11:31:12 +00:00
bz	712f939cb3	Properly apply #ifdef INET and leave a comment that we are (will) apply delayed IPv6 checksum processing in ip6_output.c when doing IPsec. PR: kern/170116 MFC after: 3 days	2012-07-31 05:44:03 +00:00
bz	e07aa136d7	Improve the should-never-hit printf to ease debugging in case we'd ever hit it again when doing the delayed IPv6 checksum calculations. MFC after: 3 days	2012-07-31 05:34:54 +00:00
bz	adb924e941	For consistency put the IPsec comment inside the #ifdef section. MFC after: 3 days	2012-07-29 00:45:24 +00:00
bz	1be989ea6c	Fix a comment that we do not have an SA yet but need to acquire one. MFC after: 3 days	2012-07-29 00:44:41 +00:00
tuexen	2357a49326	Changes which improve compilation if neither INET nor INET6 is defined. MFC after: 3 days	2012-07-15 20:16:17 +00:00
tuexen	5895ece053	#ifdef INET and INET6 consistently. This also fixes a bug, where it was done wrong. MFC after: 3 days	2012-07-15 11:04:49 +00:00
hrs	43a17c6d76	Remove "prefer_source" address selection option. FreeBSD has had an implementation of RFC 3484 for this purpose for a long time and "prefer_source" was never implemented actually. ND6_IFF_PREFER_SOURCE macro is left intact.	2012-07-09 06:21:46 +00:00
bz	1c7f26864e	Implement handling of "atomic fragments" as outlined in draft-gont-6man-ipv6-atomic-fragments to mitigate one class of possible fragmentation-based attacks. MFC after: 5 days	2012-07-08 15:30:24 +00:00
bz	23dfdbf152	As mentioned in the commit message of r237571 (copied from a prototype patch of mine) also check if the 2nd in6_setscope() failed and return the error in that case. MFC after: 5 days	2012-07-08 08:49:37 +00:00
glebius	418a04b467	When ip_output()/ip6_output() is supplied a struct route *ro argument, it skips FLOWTABLE lookup. However, the non-NULL ro has dual meaning here: it may be supplied to provide route, and it may be supplied to store and return to caller the route that ip_output()/ip6_output() finds. In the latter case skipping FLOWTABLE lookup is pessimisation. The difference between struct route filled by FLOWTABLE and filled by rtalloc() family is that the former doesn't hold a reference on its rtentry. Reference is held by flow entry, and it is about to be released in future. Thus, route filled by FLOWTABLE shouldn't be passed to RTFREE() macro. - Introduce new flag for struct route/route_in6, that marks route not holding a reference on rtentry. - Introduce new macro RO_RTFREE() that cleans up a struct route depending on its kind. - All callers to ip_output()/ip6_output() that do supply non-NULL but empty route should use RO_RTFREE() to free results of lookup. - ip_output()/ip6_output() now do FLOWTABLE lookup always when ro->ro_rt == NULL. Tested by: tuexen (SCTP part)	2012-07-04 07:37:53 +00:00
glebius	6223b4d21b	Remove route caching from IP multicast routing code. There is no reason to do that, and also, cached route never got unreferenced, which meant a reference leak. Reviewed by: bms	2012-07-02 19:44:18 +00:00
tuexen	26a057bce7	Move common code parts to sctp_common_input_processing(). MFC after: 3 days	2012-07-02 16:44:09 +00:00
bms	89f56f08d4	Kick the current-state report timer when a V1 group report would be triggered. Submitted by: rpaulo@ MFC after: 3 days	2012-06-28 23:48:40 +00:00
bms	0adc232251	Fix a typo in MLD query exponent processing. Submitted by: rpaulo@ MFC after: 3 days	2012-06-28 23:45:37 +00:00
bms	b6e9461377	In MLDv2 general query processing, do not enforce the strict check on query origins. Submitted by: Gu Yong MFC after: 3 days	2012-06-28 23:44:47 +00:00
tuexen	d2f6540a5c	Pass the src and dst address of a received packet explicitly around. MFC after: 3 days	2012-06-28 16:01:08 +00:00
delphij	dbcf8e863a	Fix a LOR acquiring the if_afdata lock while holding an rtentry lock. Possibly do some extra work in case we would not get into the ifa0 != NULL paths later as we already do for the mltaddr before. XXX We should possibly error in case in6_setscope fails. Reference: http://lists.freebsd.org/pipermail/freebsd-net/2011-September/029829.html Submitted by: bz MFC after: 1 week	2012-06-25 20:56:32 +00:00
tuexen	108138a997	Unify sctp_input() and sctp6_input(). MFC after: 3 days	2012-06-25 19:13:43 +00:00
tuexen	6e403ce2d9	Whitespace cleanup. MFC after: 3 days	2012-06-25 17:15:09 +00:00
tuexen	c8e30b0c20	Pass the packet length explicitly around. MFC after: 3 days	2012-06-24 23:12:24 +00:00
tuexen	9f34b53f7b	Do packet logging in a consistent way. MFC after: 3 days	2012-06-24 21:25:54 +00:00
bz	c56eb1ad39	Just add a comment to further investigate when being closer to that code again next time. The condition of the 2nd if() is very unlikely ever met.	2012-06-22 21:26:35 +00:00
tuexen	e8af286138	Pass flowid explicitly through the stack instead of taking it from the mbuf chain at different places. While there: Fix several bugs related to VRFs. MFC after: 3 days	2012-06-14 06:54:48 +00:00
tuexen	66a991aeee	Deliver IPV6_TCLASS, IPV6_HOPLIMIT and IPV6_PKTINFO cmsgs (if requested) on IPV6 sockets, which have been marked to be not IPV6_V6ONLY, for each received IPV4 packet. MFC after: 3 days	2012-06-12 13:57:56 +00:00
bz	2d7f5ff3a8	Plug two interface address refcount leaks in early error return cases in the ioctl path. Reported by: rpaulo Reviewed by: emax MFC after: 3 days	2012-06-05 13:27:37 +00:00
emax	0985bae1a6	Plug reference leak. Interface routes are refcounted as packets move through the stack, and there's garbage collection tied to it so that route changes can safely propagate while traffic is flowing. In our setup, we weren't changing or deleting any routes, but the refcounting logic in ip6_input() was wrong and caused a reference leak on every inbound V6 packet. This eventually caused a 32bit overflow, and the resulting 0 value caused the garbage collection to run on the active route. That then snowballed into the panic. Reviewed by: scottl MFC after: 3 days	2012-06-03 07:36:59 +00:00
tuexen	dc64091687	Separate SCTP checksum offloading for IPv4 and IPv6. While there: remove some trailing whitespaces. MFC after: 3 days X-MFC with: 236170	2012-05-30 20:56:07 +00:00
emax	b57a3bc250	When we return deprecated addresses, we need to reference them. Reviewed by: bz, scottl MFC after: 3 days	2012-05-30 20:02:39 +00:00
bz	ac429c7044	It turns out that too many drivers are not only parsing the L2/3/4 headers for TSO but also for generic checksum offloading. Ideally we would only have one common function shared amongst all drivers, and perhaps when updating them for IPv6 we should introduce that. Eventually we should provide the meta information along with mbufs to avoid (re-)parsing entirely. To not break IPv6 (checksums and offload) and to be able to MFC the changes without risking to hurt 3rd party drivers, duplicate the v4 framework, as other OSes have done as well. Introduce interface capability flags for TX/RX checksum offload with IPv6, to allow independent toggling (where possible). Add CSUM_*_IPV6 flags for UDP/TCP over IPv6, and reserve further for SCTP, and IPv6 fragmentation. Define CSUM_DELAY_DATA_IPV6 as we do for legacy IP and add an alias for CSUM_DATA_VALID_IPV6. This pretty much brings IPv6 handling in line with IPv4. TSO is still handled in a different way and not via if_hwassist. Update ifconfig to allow (un)setting of the new capability flags. Update loopback to announce the new capabilities and if_hwassist flags. Individual driver updates will have to follow, as will SCTP. Reported by: gallatin, dim, .. Reviewed by: gallatin (glanced at?) MFC after: 3 days X-MFC with: r235961,235959,235958	2012-05-28 09:30:13 +00:00
bz	2d07c3f75f	Correctly get the payload length in host byte order. While we already plan to support >64k payload here, the IPv6 header payload length obviously is only 16 bit and the calculations need to be right. Reported by: dim Tested by: dim MFC after: 1 day X-MFC: with r235958	2012-05-26 23:58:51 +00:00
tuexen	520d26f351	Get rid of SCTP specific code to avoid CRC32C computations on loopback. Just use offloading. MFC after: 3 days	2012-05-26 09:16:33 +00:00
bz	95907ed567	MFp4 bz_ipv6_fast: Use M_ZERO with malloc rather than calling bzero() ourselves. Change if () panic() checks to KASSERT()s as they are only catching invariants in code flow but not dependent on network input/output. Move initial assignments indirecting pointers after the lock has been acquired. Passing layer boundaries, reset M_PROTOFLAGS. Remove a NULL assignment before free. Sponsored by: The FreeBSD Foundation Sponsored by: iXsystems Reviewed by: gnn (as part of the whole) MFC After: 3 days	2012-05-25 09:27:16 +00:00
bz	f528e8d71f	MFp4 bz_ipv6_fast: Factor out Hop-By-Hop option processing. It's still not heavily used, it reduces the footprint of ip6_input() and makes ip6_input() more readable. Sponsored by: The FreeBSD Foundation Sponsored by: iXsystems Reviewed by: gnn (as part of the whole) MFC After: 3 days	2012-05-25 02:58:21 +00:00
bz	179af0a3e5	MFp4 bz_ipv6_fast: Defer checksum calculations on UDP6 output and respect the mbuf flags set by NICs having done checksum validation for us already, thus saving the computing time in the input path as well. Sponsored by: The FreeBSD Foundation Sponsored by: iXsystems Reviewed by: gnn (as part of the whole) MFC After: 3 days	2012-05-25 02:19:17 +00:00
bz	d52a8c3e3b	MFp4 bz_ipv6_fast: Add support for delayed checksum calculations in the IPv6 output path. We currently cannot offload to the card if we add extension headers (which incl. fragmentation). Fix two SCTP offload support copy&paste bugs: calculate checksums if fragmenting and no need to flag IPv4 header checksums in the IPv6 forwarding path. Sponsored by: The FreeBSD Foundation Sponsored by: iXsystems Reviewed by: gnn (as part of the whole) MFC After: 3 days	2012-05-25 02:17:16 +00:00
bz	eda9e50c52	MFp4 bz_ipv6_fast: Hide the ip6aux functions. The only one referenced outside ip6_input.c is not compiled in yet (__notyet__) in route6.c (r235954). We do have accessor functions that should be used. Sponsored by: The FreeBSD Foundation Sponsored by: iXsystems Reviewed by: gnn (as part of the whole) MFC After: 3 days X-MFC: KPI?	2012-05-25 01:48:15 +00:00
bz	a57f599e5c	MFp4 bz_ipv6_fast: Simplify the code removing a return from an earlier else case, not differing from the default function return called now. Sponsored by: The FreeBSD Foundation Sponsored by: iXsystems Reviewed by: gnn (as part of the whole) MFC After: 3 days	2012-05-25 01:45:05 +00:00
bz	2c462da250	MFp4 bz_ipv6_fast: We currently nowhere set IP6A_SWAP making the entire check useless with the current code. Keep around but do not compile in. Sponsored by: The FreeBSD Foundation Sponsored by: iXsystems Reviewed by: gnn (as part of the whole) MFC After: 3 days	2012-05-25 01:43:52 +00:00
bz	3971a44478	MFp4 bz_ipv6_fast: No need to hold the (expensive) rt lock over (expensive) logging. Sponsored by: The FreeBSD Foundation Sponsored by: iXsystems Reviewed by: gnn (as part of the whole) MFC After: 3 days	2012-05-25 01:42:48 +00:00
bz	d9ff82d313	MFp4 bz_ipv6_fast: Introduce a (for now copied stripped down) in6_cksum_pseudo() function. We should be able to use this from in6_cksum() but we should also ponder possible MD specific improvements. It takes an extra csum argument to allow for easy checks as will be done by the upper layer protocol input paths. Sponsored by: The FreeBSD Foundation Sponsored by: iXsystems Reviewed by: gnn (as part of the whole) MFC After: 3 days	2012-05-24 18:25:09 +00:00
bz	0a74a9f559	MFp4 bz_ipv6_fast: Optimize in6_cksum(), re-ordering work and limiting variable initialization, removing a bzero() for mostly re-initialized struct values, making use of the newly introduced in6_getscope(), as well as converting an if/panic to a KASSERT(). Sponsored by: The FreeBSD Foundation Sponsored by: iXsystems Reviewed by: gnn (as part of the whole) MFC After: 3 days	2012-05-24 18:05:10 +00:00
bz	bbf3b9348d	MFp4 bz_ipv6_fast: Introduce in6_getscope() to allow more effective checksum computations without the need to copy the address to clear the scope. Sponsored by: The FreeBSD Foundation Sponsored by: iXsystems Reviewed by: gnn (as part of the whole) MFC After: 3 days	2012-05-24 16:30:13 +00:00
tuexen	25827ad382	Use consistent text at the beginning of the files. MFC after: 3 days	2012-05-23 11:26:28 +00:00
marius	b8d578077a	Rewrite nd6_sysctl_{d,p}rlist() to avoid misaligned accesses to char arrays casted to structs by getting rid of these buffers entirely. In r169832, it was tried to paper over this issue by 32-bit aligning the buffers. Depending on compiler optimizations that still was insufficient for 64-bit architectures with strong alignment requirements though. While at it, add comments regarding the total lack of locking in this area. Tested by: bz Reviewed by: bz (slightly earlier version), yongari (earlier version) MFC after: 1 week	2012-05-20 05:12:31 +00:00
tuexen	442f3db615	Missed to commit this in r235414. MFC after: 3 days	2012-05-13 19:25:21 +00:00
tuexen	abe6735879	Use ECONNABORTED in cases where the ABORT was sent to the peer. MFC after: 3 days	2012-05-13 16:56:16 +00:00
tuexen	b3431e25a4	Provide in the association change notification the received ABORT chunk in case of SCTP_COMM_LOST or SCTP_CANT_STR_ASSOC as required by RFC 6458. MFC after: 3 days	2012-05-12 20:11:35 +00:00
glebius	348e8ce8be	in6_pcblookup_local() still can return a pcb with NULL inp_socket. To avoid panic, do not dereference inp_socket, but obtain reuse port option from inp_flags2, like this is done after next call to in_pcblookup_local() a few lines down below. Submitted by: rwatson	2012-03-21 08:43:38 +00:00
tuexen	52199a8cf4	Clean up, no functional change. MFC after: 3 days.	2012-03-15 14:22:05 +00:00
bz	b70c7f1db8	In nd6_options() ignore the RFC 6106 options completely rather than printing them if nd6_debug is enabled as unknown. Leave a comment about the RFC4191 option as I am undecided so far. Discussed with: hrs MFC after: 3 days	2012-03-04 18:51:45 +00:00
hrs	c2fcac3f88	Allow to configure net.inet6.ip6.{accept_rtadv,no_radr} by the loader tunables as well because they have to be configured before interface initialization for AF_INET6.	2012-03-02 07:23:28 +00:00
hrs	ae62f802c7	Remove a redundant check.	2012-03-02 07:22:04 +00:00
bz	9eb6f57f87	In selectroute() add a missing fibnum argument to an in6_rtalloc() call in an #if 0 section. In in6_selecthlim() optimize a case where in6p cannot be NULL due to an earlier check. More consistently use u_int instead of int for fibnum function arguments. Sponsored by: Cisco Systems, Inc. MFC after: 3 days	2012-02-24 20:06:04 +00:00
kmacy	a99e9d281d	When using flowtable, llentries can outlive the interface with which they're associated, at which point the lle_tbl pointer points to freed memory and the llt_free pointer is no longer valid. Move the free pointer into the llentry itself and update the initialization sites. MFC after: 2 weeks	2012-02-23 18:21:37 +00:00
tuexen	01e294b2fd	Remove two clang warnings. MFC after: 1 month.	2012-02-18 16:06:15 +00:00
bz	dcdb23291f	Merge multi-FIB IPv6 support from projects/multi-fibv6/head/: Extend the so far IPv4-only support for multiple routing tables (FIBs) introduced in r178888 to IPv6 providing feature parity. This includes an extended rtalloc(9) KPI for IPv6, the necessary adjustments to the network stack, and user land support as in netstat. Sponsored by: Cisco Systems, Inc. Reviewed by: melifaro (basically) MFC after: 10 days	2012-02-17 02:39:58 +00:00
glebius	b1246dbed6	Remove casts from inet6 address testing macros, thus preserving qualifier from original argument. Obtained from: NetBSD, r. 1.67 Submitted by: maxim	2012-01-26 12:04:19 +00:00
pluknet	cf0bd26fe1	Remove unused variable. The actual ia6->ia6_lifetime access is hidden in IFA6_IS_INVALID/IFA6_IS_DEPRECATED macros since a long time ago (see netinet6/nd6.c, r1.104 of KAME for the reference). MFC after: 3 days	2012-01-25 08:53:42 +00:00
bz	e59c01b14f	Plug a possible ifa_ref leak in case of premature return from in6_purgeaddr(). Reviewed by: rwatson MFC after: 3 days	2012-01-24 13:57:30 +00:00
pluknet	728e6ff16e	Remove the stale XXX rt_newaddrmsg comment. A routing socket message is generated since r192282. Reviewed by: bz MFC after: 3 days	2012-01-24 09:51:42 +00:00
bz	e8bf125640	Remove unnecessary line break. MFC after: 3 days	2012-01-24 06:21:38 +00:00
bz	a8d3ef905d	Clean up some #endif comments removing from short sections. Add #endif comments to longer, also refining strange ones. Properly use #ifdef rather than #if defined() where possible. Four #if defined(PCBGROUP) occurrences (netinet and netinet6) were ignored to avoid conflicts with eventually upcoming changes for RSS. Reported by: bde (most) Reviewed by: bde MFC after: 3 days	2012-01-22 02:13:19 +00:00
tuexen	bddf5b6a08	Small cleanup, no functional change.	2012-01-15 14:03:05 +00:00
tuexen	ebc0602463	Add an SCTP sysctl "blackhole", similar to the one for TCP. If set to 1, no ABORT is sent back in response to an incoming INIT. If set to 2, no ABORT is sent back in response to an out of the blue packet. If set to 0 (the default), ABORTs are sent. Discussed with rrs@. MFC after: 1 month.	2012-01-08 09:56:24 +00:00
jhb	4ef366671a	Convert all users of IF_ADDR_LOCK to use new locking macros that specify either a read lock or write lock. Reviewed by: bz MFC after: 2 weeks	2012-01-05 19:00:36 +00:00
bz	bfb18e3280	Mark a couple of file local functions static and stop exporting them. MFC after: 1 week	2012-01-05 01:14:35 +00:00
bz	e889f6860b	Convert an #ifdef DIAGNOSTIC if/panic to a KASSERT. MFC after: 1 week	2012-01-05 01:13:25 +00:00
jhb	4e0513fc73	Use the mli_relinmhead list normally used to defer calls to in6m_release_locked() to defer calls to mld_v1_transmit_report() until after the IF_ADDR_LOCK is dropped. This removes a race where the lock is dropped and reacquired while attempting to walk an interface's address list. Reviewed by: bz MFC after: 1 week	2012-01-04 13:35:20 +00:00
glebius	2886f1415c	Use correct locking when traversing interface address list. Reviewed by: bz	2012-01-04 07:01:23 +00:00
jhb	66d3d3405c	When cancelling multicast timers on an interface, don't release the reference on a group in the leaving state while iterating over the loop. Instead, use the same approach used in igmp_ifdetach() and mld_ifdetach() of placing the groups to free on pending release list and then releasing the references after dropping the IF_ADDR_LOCK. This closes an ugly race where the code was dropping the lock in the middle of iterating over the list. It also fixes some additional potential use-after-free bugs since the cancellation routine also applied other changes to the group after dropping the reference. Now those changes are performed before the reference is dropped and the group is potentially freed. Prodded to fix by: glebius Reviewed by: bz MFC after: 1 week	2012-01-03 20:34:52 +00:00
jhb	badf97ee75	Grab a reference on the matching interface address (ifa) in the handling of the SIOC[DG]LIFADDR ioctls before dropping the IF_ADDR_LOCK() and release the reference after using it. This prevents the address from being potentially freed out from under the ioctl handler. Reviewed by: bz MFC after: 1 week	2012-01-03 19:44:36 +00:00
jhb	dd61fe0873	Use TAILQ_FOREACH() instead of TAILQ_FOREACH_SAFE() for some loops that do not modify the queues they iterate over. Submitted by: glebius	2012-01-03 16:22:29 +00:00
bz	ea62212755	Remove an unneeded inpcb forward declaration and align the function declaration following to match the style in the rest of the file. MFC after: 3 days	2012-01-02 13:03:13 +00:00
bz	55bd70ed9a	Remove a declaration to a non-existent function. MFC after: 3 days Sponsored by: The FreeBSD Foundation	2011-12-31 16:19:22 +00:00
jhb	7a0151720c	Use queue(3) macros instead of home-rolled versions in several places in the INET6 code. This includes retiring the 'ndpr_next' and 'pfr_next' macros. Submitted by: pluknet (earlier version) Reviewed by: pluknet	2011-12-29 18:25:18 +00:00
tuexen	b9ef107414	Address issues found by clang. While there, fix also some style issues. MFC after: 3 months.	2011-12-27 10:16:24 +00:00
jhb	f3f015b978	Fix a bug where TAILQ_FIRST(&V_ifnet) was accessed without holding the proper lock. Reviewed by: bz MFC after: 1 week	2011-12-24 18:11:54 +00:00
glebius	653f8c5e71	Provide ABI compatibility shim to enable configuring of addresses with ifconfig(8) prior to r228571. Requested by: brooks	2011-12-21 12:39:08 +00:00
maxim	330e98b5de	o Convert IPv6 read-only stats sysctls to the read-write ones. o Teach netstat(1) -z to reset these stats sysctls. PR: bin/153206 Reviewed by: glebius Sponsored by: NGINX, Inc. MFC after: 1 month	2011-12-19 05:50:34 +00:00
tuexen	3a4d069b21	Fix unused parameter warnings. While there, fix some whitespace issues. MFC after: 3 months.	2011-12-17 19:21:40 +00:00
glebius	27a36f6ac8	A major overhaul of the CARP implementation. The ip_carp.c was started from scratch, copying needed functionality from the old implementation on demand, with a thorough review of all code. The main change is that interface layer has been removed from the CARP. Now redundant addresses are configured exactly on the interfaces, they run on. The CARP configuration itself is, as before, configured and read via SIOCSVH/SIOCGVH ioctls. A new prefix created with SIOCAIFADDR or SIOCAIFADDR_IN6 may now be configured to a particular virtual host id, which makes the prefix redundant. ifconfig(8) semantics has been changed too: now one doesn't need to clone carpXX interface, he/she should directly configure a vhid on an Ethernet interface. To supply vhid data from the kernel to an application the getifaddrs(8) function had been changed to pass ifam_data with each address. [1] The new implementation definitely closes all PRs related to carp(4) being an interface, and may close several others. It also allows to run a single redundant IP per interface. Big thanks to Bjoern Zeeb for his help with inet6 part of patch, for idea on using ifam_data and for several rounds of reviewing! PR: kern/117000, kern/126945, kern/126714, kern/120130, kern/117448 Reviewed by: bz Submitted by: bz [1]	2011-12-16 12:16:56 +00:00
glebius	c9b9c0a5a3	Fix double free. PR: kern/163089 Submitted by: Herbie Robinson <Herbie.Robinson stratus.com>	2011-12-07 13:37:42 +00:00
bz	2c9fd688ff	Return the correct value for the IPV6_MULTICAST_HOPS getsockopt() call. Submitted by: rpaulo MFC after: 3 days	2011-11-13 02:32:10 +00:00
qingli	3b996bbc11	A default route learned from the RAs could be deleted manually after its installation. This removal may be accidental and can prevent the default route from being installed in the future if the associated default router has the best preference. The cause is the lack of status update in the default router on the state of its route installation in the kernel FIB. This patch fixes the described problem. Reviewed by: hrs, discussed with hrs MFC after: 5 days	2011-11-11 23:22:38 +00:00
trociny	e290adc1a6	Fix false positive EADDRINUSE that could be returned by bind, due to the typo made in r227207. Reported by: kib Tested by: kib	2011-11-11 14:09:09 +00:00
ed	0c56cf839d	Mark all SYSCTL_NODEs static that have no corresponding SYSCTL_DECLs. The SYSCTL_NODE macro defines a list that stores all child-elements of that node. If there's no SYSCTL_DECL macro anywhere else, there's no reason why it shouldn't be static.	2011-11-07 15:43:11 +00:00
glebius	599a401646	In icmp6_redirect_input: - Assert that we got a valid mbuf with rcvif pointer. [1] - Use __func__ in logging. Submitted by: prabhakar lakhera <prabhakar.lakhera gmail.com> [1] Submitted by: Kristof Provost <kristof sigsegv.be> [1]	2011-11-07 14:22:18 +00:00
ed	e97eae1577	Mark MALLOC_DEFINEs static that have no corresponding MALLOC_DECLAREs. This means that their use is restricted to a single C file.	2011-11-07 06:44:47 +00:00
trociny	f9135967f2	Cache SO_REUSEPORT socket option in inpcb-layer in order to avoid inp_socket->so_options dereference when we may not acquire the lock on the inpcb. This fixes the crash due to NULL pointer dereference in in_pcbbind_setup() when inp_socket->so_options in a pcb returned by in_pcblookup_local() was checked. Reported by: dave jones <s.dave.jones@gmail.com>, Arnaud Lacombe <lacombar@gmail.com> Suggested by: rwatson Glanced by: rwatson Tested by: dave jones <s.dave.jones@gmail.com>	2011-11-06 10:47:20 +00:00
trociny	ddbde914da	Before dereferencing intotw() check for NULL, the same way as it is done for in_pcb (see r157474). MFC after: 1 week	2011-11-06 09:29:52 +00:00
pluknet	494eaea7b0	Remove a couple of write-only variables.	2011-11-03 09:09:05 +00:00
qingli	4fd26a87dc	The code change made in r226040 was incomplete and resulted in routes such as fe80::1%lo0 not being installed. This patch completes the original intended fix. Reviewed by: hrs, bz MFC after: 3 days	2011-10-16 22:24:04 +00:00
qingli	dd2b5b91eb	The IPv6 code was in flux at the time of r196865 due to the L2/L3 separation rewrite changes. r196865 was committed to fix a scope violation problem in the following test scenario: box-1# ifconfig em0 inet6 2001:db8:1:: prefixlen 64 anycast box-1# ifconfig em1 inet6 2001:db8:2::1 prefixlen 64 box-2# ifconfig re0 inet6 2001:db8:1::6 prefixlen 64 em0 and re0 are on the same link. box-2# ping6 2001:db8:1:: PING6(56=40+8+8 bytes) 2001:db8:1::6 --> 2001:db8:1:: the ICMPv6 response should have a source address of em1, which is 2001:db8:2::1, not the link-local address of em0. That code is no longer necessary and breaks the IPv6-Ready logo testing, so revert it now. Reviewed by: hrs MFC after: 3 days	2011-10-16 22:15:13 +00:00
hrs	145ad1ad8c	Fix a problem that an interface unexpectedly becomes IFF_UP by just doing "ifconfig inet6 -ifdisabled" when the interface has ND6_IFF_AUTO_LINKLOCAL flag and no link-local address.	2011-10-16 19:46:52 +00:00
glebius	16af3ddcd2	Use TAILQ_FOREACH() in the nd6_dad_find() instead of hand-rolled implementation.	2011-10-13 13:33:23 +00:00
glebius	623fcd8af9	Restore functions in6_ifaddloop() and in6_ifremloop() that were inlined by Qing Li in his big new-ARP commit. I am going to utilize them in my newcarp work, and also these functions left declared in in6_var.h for all the time they were absent. Reviewed by: bz	2011-10-13 13:05:36 +00:00
qingli	d9f932fe21	The IFA_RTSELF instead of the IFA_ROUTE flag should be checked to determine if a loopback route should be installed for an interface IPv6 address. Another condition is the address must not belong to a loopback interface. Reviewed by: hrs MFC after: 3 days	2011-10-05 16:27:11 +00:00
bz	772dd0b1c3	Fix an obvious bug from r186196 shadowing a variable, not correctly appending the new mbuf to the chain reference but possibly causing an mbuf nextpkt loop leading to a memory used after handoff (or having been freed) and leaking an mbuf here. Reviewed by: rwatson, brooks MFC after: 3 days	2011-09-30 18:20:16 +00:00
kmacy	e3079e1350	Make KBI changes required for future MFCing of inpcb rtentry / llentry caching. Reviewed by: rwatson, bz Approved by: re (kib)	2011-09-20 20:27:26 +00:00
hrs	45d40bbb1a	Copy ip6po_minmtu and ip6po_prefer_tempaddr in ip6_copypktopts(). This fixes inconsistency when options are specified by both setsockopt() and ancillary data types. PR: kern/158307 Approved by: re (bz)	2011-09-20 00:29:17 +00:00
hrs	08320280c6	Add $ipv6_cpe_wanif to enable functionality required for IPv6 CPE (r225485). When setting an interface name to it, the following configurations will be enabled: 1. "no_radr" is set to all IPv6 interfaces automatically. 2. "-no_radr accept_rtadv" will be set only for $ipv6_cpe_wanif. This is done just before evaluating $ifconfig_IF_ipv6 in the rc.d scripts (this means you can manually supersede this configuration if necessary). 3. The node will add RA-sending routers to the default router list even if net.inet6.ip6.forwarding=1. This mode is added to conform to RFC 6204 (a router which connects the end-user network to a service provider network). To enable packet forwarding, you still need to set ipv6_gateway_enable=YES. Note that accepting router entries into the default router list when packet forwarding capability and a routing daemon are enabled can result in messing up the routing table. To minimize such unexpected behaviors, "no_radr" is set on all interfaces but $ipv6_cpe_wanif. Approved by: re (bz)	2011-09-13 00:06:11 +00:00
pluknet	081544b729	Fix if_addr_mtx recursion in mld6. mld_set_version() is called only from mld_v1_input_query() and mld_v2_input_query() both holding the if_addr_mtx lock, and then calling into mld_v2_cancel_link_timers() acquires it the second time, which results in mtx recursion. To avoid that, delay if_addr_mtx acquisition until after mld_set_version() is called; while here, further reduce locking scope to protect only the needed pieces: if_multiaddrs, in6m_lookup_locked(). PR: kern/158426 Reported by: Thomas <tps vr-web.de>, Tom Vijlbrief <tom.vijlbrief xs4all.nl> Tested by: Tom Vijlbrief Reviewed by: bz Approved by: re (kib)	2011-08-22 23:39:40 +00:00
bz	eccbdd061b	Add support for IPv6 to ipfw fwd: Distinguish IPv4 and IPv6 addresses and optional port numbers in user space to set the option for the correct protocol family. Add support in the kernel for carrying the new IPv6 destination address and port. Add support to TCP and UDP for IPv6 and fix UDP IPv4 to not change the address in the IP header. Add support for IPv6 forwarding to a non-local destination. Add a regression test utilizing VIMAGE to check all 20 possible combinations I could think of. Obtained from: David Dolson at Sandvine Incorporated (original version for ipfw fwd IPv6 support) Sponsored by: Sandvine Incorporated PR: bin/117214 MFC after: 4 weeks Approved by: re (kib)	2011-08-20 17:05:11 +00:00
bz	fa01a4aee0	Add an in6_localip() helper function as in6_localaddr() is not doing what people think: returning true for an address in any connected subnet, not necessarily on the local machine. Sponsored by: Sandvine Incorporated MFC after: 2 weeks Approved by: re (kib)	2011-08-20 16:43:47 +00:00
tuexen	f47c615e88	The result of a joint work between rrs@ and myself at the IETF: * Decouple the path supervision using a separate HB timer per path. * Add support for potentially failed state. * Bring back RTO.min to 1 second. * Accept packets on IP-addresses already announced via an ASCONF * While there: do some cleanups. Approved by: re@ MFC after: 2 months.	2011-08-03 20:21:00 +00:00
tuexen	8600c7f735	The socket API only specifies SCTP for SOCK_SEQPACKET and SOCK_STREAM, but not SOCK_DGRAM. So don't register it for SOCK_DGRAM. While there, fix some indentation.	2011-07-12 19:29:29 +00:00
zec	99a0b299b3	Permit ARP to proceed for IPv4 host routes for which the gateway is the same as the host address. This already works fine for INET6 and ND6. While here, remove two function pointers from struct lltable which are only initialized but never used. MFC after: 3 days	2011-07-08 09:38:33 +00:00
bz	e15f804c7b	Update packet filter (pf) code to OpenBSD 4.5. You need to update userland (world and ports) tools to be in sync with the kernel. Submitted by: mlaier Submitted by: eri	2011-06-28 11:57:25 +00:00
bz	1aaf930d63	Add the missing call to ip6_ipsec_filtertunnel() to be able to control whether decapsulated IPsec packets will be passed to pfil again depending on the setting of the net.ip6.ipsec6.filtertunnel sysctl. PR: kern/157670 Submitted by: Manuel Kasper (mk neon1.net) MFC after: 2 weeks	2011-06-08 10:59:36 +00:00
bz	b4b3d062cd	Correct comments and debug logging in ipsec to better match reality. MFC after: 3 days	2011-06-08 03:02:11 +00:00
rwatson	6e29aea1db	Implement a CPU-affine TCP and UDP connection lookup data structure, struct inpcbgroup. pcbgroups, or "connection groups", supplement the existing inpcbinfo connection hash table, which when pcbgroups are enabled, might now be thought of more usefully as a per-protocol 4-tuple reservation table. Connections are assigned to connection groups base on a hash of their 4-tuple; wildcard sockets require special handling, and are members of all connection groups. During a connection lookup, a per-connection group lock is employed rather than the global pcbinfo lock. By aligning connection groups with input path processing, connection groups take on an effective CPU affinity, especially when aligned with RSS work placement (see a forthcoming commit for details). This eliminates cache line migration associated with global, protocol-layer data structures in steady state TCP and UDP processing (with the exception of protocol-layer statistics; further commit to follow). Elements of this approach were inspired by Willman, Rixner, and Cox's 2006 USENIX paper, "An Evaluation of Network Stack Parallelization Strategies in Modern Operating Systems". However, there are also significant differences: we maintain the inpcb lock, rather than using the connection group lock for per-connection state. Likewise, the focus of this implementation is alignment with NIC packet distribution strategies such as RSS, rather than pure software strategies. Despite that focus, software distribution is supported through the parallel netisr implementation, and works well in configurations where the number of hardware threads is greater than the number of NIC input queues, such as in the RMI XLR threaded MIPS architecture. Another important difference is the continued maintenance of existing hash tables as "reservation tables" -- these are useful both to distinguish the resource allocation aspect of protocol name management and the more common-case lookup aspect. 
In configurations where connection tables are aligned with hardware hashes, it is desirable to use the traditional lookup tables for loopback or encapsulated traffic rather than take the expense of hardware hashes that are hard to implement efficiently in software (such as RSS Toeplitz). Connection group support is enabled by compiling "options PCBGROUP" into your kernel configuration; for the time being, this is an experimental feature, and hence is not enabled by default. Subject to the limited MFCability of change dependencies in inpcb, and its change to the inpcbinfo init function signature, this change in principle could be merged to FreeBSD 8.x. Reviewed by: bz Sponsored by: Juniper Networks, Inc.	2011-06-06 12:55:02 +00:00
hrs	0ae2d5f6c6	Do not activate automatic LL addr configuration when 0/1->1 transition of ND6_IFF_IFDISABLED flag.	2011-06-06 04:12:57 +00:00
hrs	acbda2ccc1	- Make the code more proactively clear an ND6_IFF_IFDISABLED flag when an explicit action for INET6 configuration happens. The changes are: 1. When an ND6 flag is changed via SIOCSIFINFO_FLAGS ioctl, setting ND6_IFF_ACCEPT_RTADV and/or ND6_IFF_AUTO_LINKLOCAL now triggers an attempt to clear the ND6_IFF_IFDISABLED flag. 2. When an AF_INET6 address is added successfully to an interface and it is marked as ND6_IFF_IFDISABLED, an attempt to clear the ND6_IFF_IFDISABLED happens. This simplifies ND6_IFF_IFDISABLED flag manipulation by users via ifconfig(8); in most cases manual configuration is no longer needed. - When ND6_IFF_AUTO_LINKLOCAL is set and no link-local address is assigned to an interface, SIOCSIFINFO_FLAGS ioctl now calls in6_ifattach() to configure a link-local address. This change ensures link-local address configuration when "ifconfig IF inet6" command is invoked. For example, "ifconfig IF inet6 auto_linklocal" now always try to configure an LL addr even if ND6_IFF_AUTO_LINKLOCAL is already set to 1 (i.e. down/up cycle is no longer needed). Reviewed by: bz	2011-06-06 02:37:38 +00:00
hrs	4c2206b625	- Accept Router Advertisement messages even when net.inet6.ip6.forwarding=1. - A new per-interface knob IFF_ND6_NO_RADR and sysctl IPV6CTL_NO_RADR. This controls if accepting a route in an RA message as the default route. The default value for each interface can be set by net.inet6.ip6.no_radr. The system wide default value is 0. - A new sysctl: net.inet6.ip6.norbit_raif. This controls if setting R-bit in NA on RA accepting interfaces. The default is 0 (R-bit is set based on net.inet6.ip6.forwarding). Background: IPv6 host/router model suggests a router sends an RA and a host accepts it for router discovery. Because of that, KAME implementation does not allow accepting RAs when net.inet6.ip6.forwarding=1. Accepting RAs on a router can make the routing table confused since it can change the default router unintentionally. However, in practice there are cases where we cannot distinguish a host from a router clearly. For example, a customer edge router often works as a host against the ISP, and as a router against the LAN at the same time. Another example is a complex network configurations like an L2TP tunnel for IPv6 connection to Internet over an Ethernet link with another native IPv6 subnet. In this case, the physical interface for the native IPv6 subnet works as a host, and the pseudo-interface for L2TP works as the default IP forwarding route. Problem: Disabling processing RA messages when net.inet6.ip6.forwarding=1 and accepting them when net.inet6.ip6.forward=0 cause the following practical issues: - A router cannot perform SLAAC. It becomes a problem if a box has multiple interfaces and you want to use SLAAC on some of them, for example. A customer edge router for IPv6 Internet access service using an IPv6-over-IPv6 tunnel sometimes needs SLAAC on the physical interface for administration purpose; updating firmware and so on (link-local addresses can be used there, but GUAs by SLAAC are often used for scalability). 
- When a host has multiple IPv6 interfaces and it receives multiple RAs on them, controlling the default route is difficult. Router preferences defined in RFC 4191 works only when the routers on the links are under your control. Details of Implementation Changes: Router Advertisement messages will be accepted even when net.inet6.ip6.forwarding=1. More precisely, the conditions are as follow: (ACCEPT_RTADV && !NO_RADR && !ip6.forwarding) => Normal RA processing on that interface. (as IPv6 host) (ACCEPT_RTADV && (NO_RADR \|\| ip6.forwarding)) => Accept RA but add the router to the defroute list with rtlifetime=0 unconditionally. This effectively prevents from setting the received router address as the box's default route. (!ACCEPT_RTADV) => No RA processing on that interface. ACCEPT_RTADV and NO_RADR are per-interface knob. In short, all interface are classified as "RA-accepting" or not. An RA-accepting interface always processes RA messages regardless of ip6.forwarding. The difference caused by NO_RADR or ip6.forwarding is whether the RA source address is considered as the default router or not. R-bit in NA on the RA accepting interfaces is set based on net.inet6.ip6.forwarding. While RFC 6204 W-1 rule (for CPE case) suggests a router should disable the R-bit completely even when the box has net.inet6.ip6.forwarding=1, I believe there is no technical reason with doing so. This behavior can be set by a new sysctl net.inet6.ip6.norbit_raif (the default is 0). Usage: # ifconfig fxp0 inet6 accept_rtadv => accept RA on fxp0 # ifconfig fxp0 inet6 accept_rtadv no_radr => accept RA on fxp0 but ignore default route information in it. # sysctl net.inet6.ip6.norbit_no_radr=1 => R-bit in NAs on RA accepting interfaces will always be set to 0.	2011-06-06 02:14:23 +00:00
hrs	58bf7d3104	Use uint8_t for sockaddr sa_len. Reviewed by: bz	2011-06-05 11:40:30 +00:00
rwatson	e9eb5d3b9c	Add _mbuf() variants of various inpcb-related interfaces, including lookup, hash install, etc. For now, these arguments are unused, but as we add RSS support, we will want to use hashes extracted from mbufs, rather than manually calculated hashes of header fields, due to the expense of the software version of Toeplitz (and similar hashes). Add notes that it would be nice to be able to pass mbufs into lookup routines in pf(4), optimising firewall lookup in the same way, but the code structure there doesn't facilitate that currently. (In principle there is no reason this couldn't be MFCed -- the change extends rather than modifies the KBI. However, it won't be useful without other previous possibly less MFCable changes.) Reviewed by: bz Sponsored by: Juniper Networks, Inc.	2011-06-04 16:33:06 +00:00
rwatson	fdfdadb612	Decompose the current single inpcbinfo lock into two locks: - The existing ipi_lock continues to protect the global inpcb list and inpcb counter. This lock is now relegated to a small number of allocation and free operations, and occasional operations that walk all connections (including, awkwardly, certain UDP multicast receive operations -- something to revisit). - A new ipi_hash_lock protects the two inpcbinfo hash tables for looking up connections and bound sockets, manipulated using new INP_HASH_*() macros. This lock, combined with inpcb locks, protects the 4-tuple address space. Unlike the current ipi_lock, ipi_hash_lock follows the individual inpcb connection locks, so may be acquired while manipulating a connection on which a lock is already held, avoiding the need to acquire the inpcbinfo lock preemptively when a binding change might later be required. As a result, however, lookup operations necessarily go through a reference acquire while holding the lookup lock, later acquiring an inpcb lock -- if required. A new function in_pcblookup() looks up connections, and accepts flags indicating how to return the inpcb. Due to lock order changes, callers no longer need acquire locks before performing a lookup: the lookup routine will acquire the ipi_hash_lock as needed. In the future, it will also be able to use alternative lookup and locking strategies transparently to callers, such as pcbgroup lookup. New lookup flags are, supplementing the existing INPLOOKUP_WILDCARD flag: INPLOOKUP_RLOCKPCB - Acquire a read lock on the returned inpcb INPLOOKUP_WLOCKPCB - Acquire a write lock on the returned inpcb Callers must pass exactly one of these flags (for the time being). Some notes: - All protocols are updated to work within the new regime; especially, TCP, UDPv4, and UDPv6. pcbinfo ipi_lock acquisitions are largely eliminated, and global hash lock hold times are dramatically reduced compared to previous locking. 
- The TCP syncache still relies on the pcbinfo lock, something that we may want to revisit. - Support for reverting to the FreeBSD 7.x locking strategy in TCP input is no longer available -- hash lookup locks are now held only very briefly during inpcb lookup, rather than for potentially extended periods. However, the pcbinfo ipi_lock will still be acquired if a connection state might change such that a connection is added or removed. - Raw IP sockets continue to use the pcbinfo ipi_lock for protection, due to maintaining their own hash tables. - The interface in6_pcblookup_hash_locked() is maintained, which allows callers to acquire hash locks and perform one or more lookups atomically with 4-tuple allocation: this is required only for TCPv6, as there is no in6_pcbconnect_setup(), which there should be. - UDPv6 locking remains significantly more conservative than UDPv4 locking, which relates to source address selection. This needs attention, as it likely significantly reduces parallelism in this code for multithreaded socket use (such as in BIND). - In the UDPv4 and UDPv6 multicast cases, we need to revisit locking somewhat, as they relied on ipi_lock to stablise 4-tuple matches, which is no longer sufficient. A second check once the inpcb lock is held should do the trick, keeping the general case from requiring the inpcb lock for every inpcb visited. - This work reminds us that we need to revisit locking of the v4/v6 flags, which may be accessed lock-free both before and after this change. - Right now, a single lock name is used for the pcbhash lock -- this is undesirable, and probably another argument is required to take care of this (or a char array name field in the pcbinfo?). This is not an MFC candidate for 8.x due to its impact on lookup and locking semantics. It's possible some of these issues could be worked around with compatibility wrappers, if necessary. Reviewed by: bz Sponsored by: Juniper Networks, Inc.	2011-05-30 09:43:55 +00:00
bz	2a56bd3155	Add FEATURE() definitions for IPv4 and IPv6 so that we can use feature_present(3) to dynamically decide whether to use one or the other family. Reviewed by: gnn Sponsored by: The FreeBSD Foundation Sponsored by: iXsystems MFC after: 10 days	2011-05-25 00:34:25 +00:00
rwatson	79b3da72c2	Move from passing a wildcard boolean to a general set of lookup flags into in_pcb_lport(), in_pcblookup_local(), and in_pcblookup_hash(), and similarly for IPv6 functions. In the future, we would like to support other flags relating to locking strategy. This change doesn't appear to modify the KBI in practice, as callers already passed in INPLOOKUP_WILDCARD rather than a simple boolean. MFC after: 3 weeks Reviewed by: bz Sponsored by: Juniper Networks, Inc.	2011-05-23 15:23:18 +00:00
qingli	a1bf1a2582	The statically configured (permanent) ARP entries are removed when an interface is brought down, even though the interface address is still valid. This patch maintains the permanent ARP entries as long as the interface address (having the same prefix as that of the ARP entries) is valid. Reviewed by: delphij MFC after: 5 days	2011-05-20 19:12:20 +00:00
tuexen	a51881ba59	Remove code without any effect.	2011-05-03 20:34:02 +00:00
tuexen	ad795d2c5d	Improve compilation of SCTP code without INET support. Some bugs were fixed while doing this: * ASCONF-ACK messages might use wrong port number when using IPv6. * Checking for additional addresses takes the correct address into account and also does not do more comparisons than necessary. This patch is based on one received from bz@ who was sponsored by The FreeBSD Foundation and iXsystems. MFC after: 1 week	2011-04-30 11:18:16 +00:00
bz	734a66b389	Make the UDP code compile without INET. Expose udp_usrreq.c to IPv6 only as well compiling out most functions adding or extending #ifdef INET coverage. Reviewed by: gnn Sponsored by: The FreeBSD Foundation Sponsored by: iXsystems MFC after: 4 days	2011-04-30 11:17:00 +00:00
bz	337d0d95c2	Make the PCB code compile without INET support by adding #ifdef INETs and correcting few #includes. Reviewed by: gnn Sponsored by: The FreeBSD Foundation Sponsored by: iXsystems MFC after: 4 days	2011-04-30 11:04:34 +00:00
bz	d28e675043	Make IPsec compile without INET adding appropriate #ifdef checks. Unfold the IPSEC_COMMON_INPUT_CB() macro in xform_{ah,esp,ipcomp}.c to not need three different versions depending on INET, INET6 or both. Mark two places preparing for not yet supported functionality with IPv6. Reviewed by: gnn Sponsored by: The FreeBSD Foundation Sponsored by: iXsystems MFC after: 4 days	2011-04-27 19:28:42 +00:00
ticso	1aaffda228	correct variable type name in comment	2011-04-25 09:00:52 +00:00
bz	80925a07fa	MFp4 CH=191760,191770: Not compiling in and not initializing from inetsw from in_proto.c for IPv6 only, we need to initialize upper layer protocols from inet6sw. Make sure to not initialize them twice in a Dual-Stack environment but only conditionally on no INET as we have done for TCP for a long time. Otherwise we would leak resources. Reviewed by: gnn Sponsored by: The FreeBSD Foundation Sponsored by: iXsystems MFC after: 3 days	2011-04-20 08:05:23 +00:00
bz	1584693544	Fix IPv6 ND. After r219562 we in nd6_ns_input() were erroneously always passing the cached proxydl reference (sockaddr_dl initialized or not) to nd6_na_output(). nd6_na_output() will thus assume a proxy NA. Revert to conditionally passing either &proxydl or NULL if no proxy case desired. Tested by: ipv6gw and ref9-i386 Reported by: Pete French (petefrench ingresso.co.uk on stable) Reported by: bz, simon on Y! cluster Reported by: kib PR: kern/151908 MFC after: 3 days	2011-04-17 16:07:08 +00:00
bz	6eed38b749	Remove a check in udp6_send() that prevented v4-mapped v6 addresses from working. We store v4 and v6 addresses as a union but for v4-mapped addresses only store the 32bits w/o the ::ffff: word. That failed the check as for example 127.0.0.1 would be ::7f00:1 rather than ::ffff:7f00:1 and the IN6_IS_ADDR_V4MAPPED() never worked here. Given we can hardly get here with an unbound local address or invalid inp_vflags remove the check. Reported by: tuexen Reviewed by: tuexen MFC after: 3 days	2011-04-09 02:22:49 +00:00
bz	1710f285d4	After r219579 and r219779 unbreak v4-mapped v6 sockets for UDP some more. Similar to what we do for TCP check for v4-mapped addresses and then handle them or the normal v6 address case. For either set inp_vflags before calling into the pcb connect function so that we have an unambiguous view in case we need to set the local address or port. Looked at: tuexen (as part of more) MFC after: 3 days	2011-04-09 01:29:46 +00:00
jeff	2d7d8c05e7	- Merge changes to the base system to support OFED. These include a wider arg2 for sysctl, updates to vlan code, IFT_INFINIBAND, and other miscellaneous small features.	2011-03-21 09:40:01 +00:00
bz	524448845c	Merge the two identical implementations for local port selections from in_pcbbind_setup() and in6_pcbsetport() in a single in_pcb_lport(). MFC after: 2 weeks	2011-03-12 21:46:37 +00:00
bz	5d37412b23	Push a possible "unbind" in some situation from in6_pcbsetport() to callers. This also fixes a problem when the prison call could set the inp->in6p_laddr (laddr) and a following priv_check_cred() call would return an error and will allow us to merge the IPv4 and IPv6 implementation. MFC after: 2 weeks	2011-03-12 16:45:15 +00:00
bz	d94569d0ed	Make sure the locally cached value of rt->rt_gateway stays stable, even after dropping the reference and unlocking. Previously we have dereferenced a NULL pointer (after r121765). Simply unlocking after the block does not work either because of lock ordering (see r121765) and in addition we would still hold a pointer to something that might be gone by the time we access it. Thus take a copy of the value rather than just caching the pointer. PR: kern/151908 Submitted by: chenyl (netstar2008 126.com) (initial version) MFC after: 2 weeks	2011-03-12 09:41:25 +00:00
brucec	6d9b42b486	Fix typos - remove duplicate "the". PR: bin/154928 Submitted by: Eitan Adler <lists at eitanadler.com> MFC after: 3 days	2011-02-21 09:01:34 +00:00
tuexen	9ccaf288c9	Fix bugs related to M_FLOWID: * Store the flowid when receiving an SCTP/IPv6 packet. * Store the flowid when receiving an SCTP packet with wrong CRC. * Initialize flowid correctly. * Put test code under INVARIANTS. MFC after: 3 months.	2011-02-07 15:04:23 +00:00
rrs	fa5090f1dd	1) Typo correction in comments and one spacing change. 2) Mass update to all copyrights. MFC after: 3 Months	2011-02-05 12:12:51 +00:00
tuexen	c17bd2abe0	Improve plausibility check in sctp_handle_sack(). Allow cmt_on_off to support values 0 (no CMT), 1 (CMT), and 2 (CMT/RP). MFC after: 3 months.	2010-12-22 17:59:38 +00:00
jhay	1f7a05a1ef	Add IFT_L2VLAN to the list that is capable of supplying the ingredients of the EUI64 part of an IPv6 address. Otherwise vlans will all use the MAC address of the first ethernet interface of the system. MFC after: 1 week	2010-12-22 11:58:31 +00:00
bz	ac70658e77	Loosen the locking in nd6_free() again after r216022 to avoid a LOR and a recursed lock. Reported by: delphij Tested by: delphij PR: kern/148857 MFC After: 3 days	2010-12-07 22:43:29 +00:00
bz	a8c33e5555	Plug well observed races on la_hold entries with the callout handler. Call the handler function with the lock held, return unlocked as we might free the entry. Rework functions later in the call graph to be either called with the lock held or, only if needed, unlocked. Place asserts to document and tighten assumptions on various lle locking, which were not always true before. We call nd6_ns_output() unlocked and the assignment of ip6->ip6_src was decentralized to minimize possible complexity introduced with the formerly missing locking there. This also resulted in a push down of local variable scopes into smaller blocks. Reported by: many PR: kern/148857 Submitted by: Dmitrij Tejblum (tejblum yandex-team.ru) (original version) MFC After: 4 days	2010-11-29 00:04:08 +00:00
brucec	cd6001f0b6	Fix more continuous/contiguous typos (cf. r215955)	2010-11-27 21:51:39 +00:00
dim	fb307d7d1d	After some off-list discussion, revert a number of changes to the DPCPU_DEFINE and VNET_DEFINE macros, as these cause problems for various people working on the affected files. A better long-term solution is still being considered. This reversal may give some modules empty set_pcpu or set_vnet sections, but these are harmless. Changes reverted: ------------------------------------------------------------------------ r215318 | dim | 2010-11-14 21:40:55 +0100 (Sun, 14 Nov 2010) | 4 lines Instead of unconditionally emitting .globl's for the __start_set_xxx and __stop_set_xxx symbols, only emit them when the set_vnet or set_pcpu sections are actually defined. ------------------------------------------------------------------------ r215317 | dim | 2010-11-14 21:38:11 +0100 (Sun, 14 Nov 2010) | 3 lines Apply the STATIC_VNET_DEFINE and STATIC_DPCPU_DEFINE macros throughout the tree. ------------------------------------------------------------------------ r215316 | dim | 2010-11-14 21:23:02 +0100 (Sun, 14 Nov 2010) | 2 lines Add macros to define static instances of VNET_DEFINE and DPCPU_DEFINE.	2010-11-22 19:32:54 +00:00
bz	6a4a2e62a2	In case of an early return from the function there is no need to zero the route upfront, so defer as long as we can. MFC after: 3 days	2010-11-20 12:27:40 +00:00
bz	dc5dcc5e85	Do not initialize flag variables before needed. Consistently use the LLE_ prefix for lla_lookup() and the ND6_ prefix for nd6_lookup() even though both are defined the same. Use the right flag variable when checking each. No real functional change. MFC after: 4 days	2010-11-17 10:43:20 +00:00
bz	7d3dc5f192	No need to re-initialize the callout. We initially do it in in6_lltable_new() right after allocation. Worse, we are losing the right flags here. MFC after: 4 days	2010-11-17 09:25:08 +00:00
dim	fda4020a88	Apply the STATIC_VNET_DEFINE and STATIC_DPCPU_DEFINE macros throughout the tree.	2010-11-14 20:38:11 +00:00
bz	de9392f9e0	Make the IPsec SADB embedded route cache a union to be able to hold both the legacy and IPv6 route destination address. Previously in case of IPv6, there was a memory overwrite due to not enough space for the IPv6 address. PR: kern/122565 MFC After: 2 weeks	2010-10-23 20:35:40 +00:00
rpaulo	01eca72a5f	Purposely tell the compiler that we ignore the return value of ADDCARRY() in the REDUCE macro. Reviewed by: dim, rdivacky	2010-10-13 10:45:22 +00:00
delphij	f0a3248def	Add a bandaid for a long-standing race condition during route entry un-expiring. The previous version of the code had no locking when testing rt_refcnt. The lack of locking may result in a condition where a routing entry has a reference count but at the same time has the RTPRF_OURS bit set and an expiration timer. This would eventually lead to a panic: "panic: rtqkill route really not free", for instance when the system accepts ICMP redirects from a local gateway at a moderate frequency. Commit this workaround for now until we have a better solution. PR: kern/149804 Reviewed by: bz Tested by: Zhao Xin, Pete French MFC after: 2 weeks	2010-09-27 19:26:56 +00:00
attilio	d658ddc7c7	IP_BINDANY is not correctly handled in getsockopt() case. Fix it by specifying the correct bits. Sponsored by: Sandvine Incorporated Reviewed by: bz, emaste, rstone Obtained from: Sandvine Incorporated MFC after: 10 days	2010-09-24 14:38:54 +00:00
tuexen	3f139046bf	Remove unused variables. MFC after: 2 weeks.	2010-09-15 20:41:20 +00:00
bz	18b40a43ee	MFp4 CH=183052 183053 183258: In protosw we define pr_protocol as short, while on the wire it is an uint8_t. That way we can have "internal" protocols like DIVERT, SEND or gaps for modules (PROTO_SPACER). Switch ipproto_{un,}register to accept a short protocol number(*) and do an upfront check for valid boundaries. With this we also consistently report EPROTONOSUPPORT for out of bounds protocols, as we did for proto == 0. This allows a caller to not error for this case, which is especially important if we want to automatically call these from domain handling. (*) the functions have been without any in-tree consumer since the initial introduction, so this is considered safe. Implement ip6proto_{un,}register() similarly to their legacy IP counterparts to allow modules to hook up dynamically. Reviewed by: philip, will MFC after: 1 week	2010-09-02 17:43:44 +00:00
tuexen	ff0f13be61	Fix the SCTP_WITH_NO_CSUM option when used in combination with interface supporting CRC offload. While at it, make use of the feature that the loopback interface provides CRC offloading. MFC after: 4 weeks	2010-08-29 18:50:30 +00:00
tuexen	8d90bd696f	Fix the switching on/off of CMT using sysctl and socket option. Fix the switching on/off of PF and NR-SACKs using sysctl. Add minor improvement in handling malloc failures. Improve the address checks when sending. MFC after: 4 weeks	2010-08-28 17:59:51 +00:00
ume	a9738fb2b3	optp may be NULL.	2010-08-20 17:52:49 +00:00
anchie	a1cc5611b8	Fix mbuf leakages and remove unnecessary duplicate mbuf frees. Use the right copy of an mbuf for the IP6_EXTHDR_CHECK. Reported by: zec, hrs Approved by: bz (mentor)	2010-08-19 23:16:44 +00:00
anchie	c6c2feb282	MFp4: anchie_soc2009 branch: Add kernel side support for Secure Neighbor Discovery (SeND), RFC 3971. The implementation consists of a kernel module that gets packets from the nd6 code, sends them to user space on a dedicated socket and reinjects them back for further processing. Hooks are used from nd6 code paths to divert relevant packets to the send implementation for processing in user space. The hooks are only triggered if the send module is loaded. In case no user space application is connected to the send socket, processing continues normally as if the module would not be loaded. Unloading the module is not possible at this time due to missing nd6 locking. The native SeND socket is similar to a raw IPv6 socket but with its own, internal pseudo-protocol. Approved by: bz (mentor)	2010-08-19 11:31:03 +00:00
ume	30f327ba13	Make `ping6 -I' work with net.inet6.ip6.use_defaultzone=1. MFC after: 2 weeks	2010-08-17 17:30:56 +00:00
bz	007e782fce	In rip6_input(), in case of multicast, we might skip the normal processing and go to the next iteration early if multicast filtering would decide that this socket shall not receive the data. Unlock the pcb in that case or we leak the read lock and next time trying to get a write lock, would hang forever. PR: kern/149608 Submitted by: Chris Luke (chrisy flirble.org) MFC after: 3 days	2010-08-14 14:13:44 +00:00
will	d548943ae9	Unbreak LINT by moving all carp hooks to net/if.c / netinet/ip_carp.h, with the appropriate ifdefs. Reviewed by: bz Approved by: ken (mentor)	2010-08-11 20:18:19 +00:00
will	aa4e762c4a	Allow carp(4) to be loaded as a kernel module. Follow precedent set by bridge(4), lagg(4) etc. and make use of function pointers and pf_proto_register() to hook carp into the network stack. Currently, because of the uncertainty about whether the unload path is free of race condition panics, unloads are disallowed by default. Compiling with CARPMOD_CAN_UNLOAD in CFLAGS removes this anti foot shooting measure. This commit requires IP6PROTOSPACER, introduced in r211115. Reviewed by: bz, simon Approved by: ken (mentor) MFC after: 2 weeks	2010-08-11 00:51:50 +00:00
bz	7ff79c51bb	MFp4 CH180235: Add proto spacers to inet6sw like we have for legacy IP. This allows us to dynamically pf_proto_register() for INET6 from modules, needed by upcoming CARP changes and SeND. MC and SCTP could make use of it as well in theory in the future after upcoming VIMAGE vnet teardown work. Discussed with: will, anchie MFC after: 10 days	2010-08-09 19:53:24 +00:00
bz	b6078715a1	Document the mandatory argument to the arptimer() and nd6_llinfo_timer() functions with a KASSERT(). Note: there is no need to return after panic. In the legacy IP case, only assign the arg after the check, in the IPv6 case, remove the extra checks for the table and interface as they have to be there unless we freed and forgot to cancel the timer. It doesn't matter anyway as we would panic on the NULL pointer deref immediately and the bug is elsewhere. This unifies the code of both address families to some extent. Reviewed by: rwatson MFC after: 6 days	2010-07-31 21:33:18 +00:00
bz	b43b431ac5	Since r186119 IP6 input counters for octets and packets were not working anymore. In addition more checks and operations were missing. In case lla_lookup results in a match, get the ifaddr to update the statistics counters, and check that the address is neither tentative, duplicate or otherwise invalid before accepting the packet. If ok, record the address information in the mbuf. [ as is done in case lla_lookup does not return a result and we go through the FIB ]. Reported by: remko Tested by: remko MFC after: 2 weeks	2010-07-21 13:01:21 +00:00
alfred	1e2de7fa4d	Fix our version of IPv6 address representation. We do not respect rules 3 and 4 in the required list: 1. omit leading zeros 2. "::" used to their maximum extent whenever possible 3. "::" used where shortens address the most 4. "::" used in the former part in case of a tie breaker 5. do not shorten one 16 bit 0 field 6. use lower case http://tools.ietf.org/html/draft-ietf-6man-text-addr-representation-04.html Submitted by: Kalluru Abhiram @ Juniper Networks Obtained from: Juniper Networks Reviewed by: hrs, dougb	2010-05-19 00:35:47 +00:00
kmacy	a68dd336d5	allocate ipv6 flows from the ipv6 flow zone reported by: rrs@ MFC after: 3 days	2010-05-16 21:48:39 +00:00
kmacy	cdb02238ee	do a proper fix Pointed out by: np@ MFC after: 3 days	2010-05-13 19:47:36 +00:00
kmacy	b52d51265d	fix compile error on some builds by doing the equivalent of an "extern VNET_DEFINE" without "__used" MFC after: 3 days	2010-05-13 19:36:13 +00:00
kmacy	f5c26f02f1	try working around panic by validating rt and lle MFC after: 3 days	2010-05-12 03:29:11 +00:00
kmacy	c4a8bb5da4	boot time size the flowtable MFC after: 3 days	2010-05-10 21:31:20 +00:00
kmacy	9cddffd405	Add flowtable support to IPv6 Tested by: qingli@ Reviewed by: qingli@ MFC after: 3 days	2010-05-09 20:32:00 +00:00
bz	0a90ef1728	MFP4: @176978-176982, 176984, 176990-176994, 177441 "Whitespace" churn after the VIMAGE/VNET whirls. Remove the need for some "init" functions within the network stack, like pim6_init(), icmp_init() or significantly shorten others like ip6_init() and nd6_init(), using static initialization again where possible and formerly missed. Move (most) variables back to the place they used to be before the container structs and VIMAGE_GLOBALS (before r185088) and try to reduce the diff to stable/7 and earlier as good as possible, to help out-of-tree consumers to update from 6.x or 7.x to 8 or 9. This also removes some header file pollution for putatively static global variables. Revert VIMAGE specific changes in ipfilter::ip_auth.c, that are no longer needed. Reviewed by: jhb Discussed with: rwatson Sponsored by: The FreeBSD Foundation Sponsored by: CK Software GmbH MFC after: 6 days	2010-04-29 11:52:42 +00:00
bz	c7fd54ae5a	Enhance the historic behaviour of raw sockets and jails in a way that we allow all possible jail IPs as source address rather than forcing the "primary". While IPv6 naturally has source address selection, for legacy IP we do not go through the pain in case IP_HDRINCL was not set. People should bind(2) for that. This will, for example, allow ping(|6) -S to work correctly for non-primary addresses. Reported by: (ten 211.ru) Tested by: (ten 211.ru) MFC after: 4 days	2010-04-27 15:07:08 +00:00
bz	e66b2bd00b	Make sure IPv6 source address selection does not change interface addresses while walking the IPv6 address list if in the jail case something is connecting to ::1. Reported by: Pieter de Boer (pieter thedarkside.nl) Tested by: Pieter de Boer (pieter thedarkside.nl) MFC after: 4 days	2010-04-27 15:05:03 +00:00
kib	8568816295	Provide 32bit compat for SIOCGDEFIFACE_IN6. Based on submission by: pluknet gmail com Reviewed by: emaste MFC after: 2 weeks	2010-04-27 09:47:14 +00:00
bz	d7a91dc6bf	Plug reference leaks in the link-layer code ("new-arp") that previously prevented the link-layer entry from being freed. In both in.c and in6.c (though that code path seems to be basically dead) plug a reference leak in case of a pending callout being drained. In if_ether.c consistently add a reference before resetting the callout and in case we canceled a pending one remove the reference for that. In the final case in arptimer, before freeing the expired entry, remove the reference again and explicitly call callout_stop() to clear the active flag. In nd6.c:nd6_free() we are only ever called from the callout function and thus need to remove the reference there as well before calling into llentry_free(). In if_llatbl.c when freeing entire tables make sure that in case we cancel a pending callout to remove the reference as well. Reviewed by: qingli (earlier version) MFC after: 10 days Problem observed, patch tested by: simon on ipv6gw.f.o, Christian Kratzer (ck cksoft.de), Evgenii Davidov (dado korolev-net.ru) PR: kern/144564 Configurations still affected: with options FLOWTABLE	2010-04-11 16:04:08 +00:00
bms	530b2e634f	When embedding the scope ID in MLDv1 output, check if the scope of the address being embedded is in fact link-local, before attempting to embed it. Note that this operation is a side-effect of trying to avoid recursion on the IN6 scope lock. PR: 144560 Submitted by: Petr Lampa MFC after: 3 days	2010-04-10 12:24:21 +00:00
tuexen	238a37de82	* Fix some race condition in SACK/NR-SACK processing. * Fix handling of mapping arrays when draining mbufs or processing FORWARD-TSN chunks. * Cleanup code (no duplicate code anymore for SACKs and NR-SACKs). Part of this code was developed together with rrs. MFC after: 2 weeks.	2010-04-03 15:40:14 +00:00
bz	95992079a2	We are holding a write lock here so avoid acquiring it twice calling the "locked" version rather than the wrapper function. MFC after: 6 days	2010-03-25 10:29:00 +00:00
rrs	5db64758fc	The proper fix for the delayed SCTP checksum is to have the delayed function take an argument as to the offset to the SCTP header. This allows it to work for V4 and V6. This of course means changing all callers of the function to either pass the header len, if they have it, or create it (ip_hl << 2 or sizeof(ip6_hdr)). PR: 144529 MFC after: 2 weeks	2010-03-12 22:58:52 +00:00
rrs	42b8493f26	With the recent change of the sctp checksum to support offload, no delayed checksum was added to the ip6 output code. This causes cards that do not support SCTP checksum offload to have SCTP packets that are IPv6 NOT have the sctp checksum performed. Thus you could not communicate with a peer. This adds the missing bits to make the checksum happen for these cards. PR: 144529 MFC after: 2 weeks	2010-03-12 08:10:30 +00:00
qingli	03d2b3884a	Use reference counting instead of locking to secure an address while that address is being used to generate temporary IPv6 address. This approach is sufficient and avoids recursive locking. MFC after: 3 days	2010-02-27 07:12:25 +00:00
pjd	3e1636c7b5	No need to include security/mac/mac_framework.h here.	2010-02-18 22:30:37 +00:00
bz	5fa77a0084	Correct a typo. Submitted by: kensmith MFC after: 3 days	2010-01-24 10:22:39 +00:00
bz	5d1c4cb181	Garbage collect references to the no longer implemented tcp_fasttimo(). Discussed with: rwatson MFC after: 5 days	2010-01-17 13:07:52 +00:00
bz	d80ba03e3c	Add ip4.saddrsel/ip4.nosaddrsel (and equivalent for ip6) to control whether to use source address selection (default) or the primary jail address for unbound outgoing connections. This is intended to be used by people upgrading from single-IP jails to multi-IP jails but not having to change firewall rules, application ACLs, ... but to force their connections (unless otherwise changed) to the primary jail IP they had been using for years, as well as for people preferring to implement similar policies. Note that for IPv6, if configured incorrectly, this might lead to scope violations, which single-IPv6 jails could as well, as by the design of jails. [1] Reviewed by: jamie, hrs (ipv6 part) Pointed out by: hrs [1] MFC After: 2 weeks Asked for by: Jase Thew (bazerka beardz.net)	2010-01-17 12:57:11 +00:00
trasz	2cff5b762a	Replace several instances of 'if (!a & b)' with 'if (!(a & b))' in order to silence newer GCC versions.	2010-01-08 15:44:49 +00:00
bz	a650e17670	Correct a typo. Submitted by: sn_ (sn_ gmx.net) on hackers@ MFC after: 3 days	2010-01-06 23:05:00 +00:00
qingli	a506bcc1c4	The IFA_RTSELF address flag marks that a loopback route has been installed for the interface address. This marker is necessary to properly support PPP types of links where multiple links can have the same local end IP address. The IFA_RTSELF flag bit maps to the RTF_HOST value, which was combined into the route flag bits during prefix installation in IPv6. This inclusion caused the prefix route to be unusable. This patch fixes this bug by excluding the IFA_RTSELF flag during route installation. MFC after: 5 days	2010-01-04 23:39:53 +00:00
qingli	5357cdf4a4	Multiple IPv6 addresses of the same prefix can be installed on the same interface. The first address will install the prefix route into the kernel routing table and that prefix will be marked as on-link. Without RADIX_MPATH enabled, the other address aliases of the same prefix will update the prefix reference count but no other routes will be installed. Consequently the prefixes associated with these addresses would not be marked as on-link. As such, incoming packets destined to these address aliases will fail the ND6 on-link check on input. This patch fixes the above problem by searching the kernel routing table and trying to find an on-link prefix on the given interface. MFC after: 5 days	2009-12-30 21:51:23 +00:00
qingli	ed965a92bc	The proxy arp entries could not be added into the system over the IFF_POINTOPOINT link types. The reason was due to the routing entry returned from the kernel covering the remote end is of an interface type that does not support ARP. This patch fixes this problem by providing a hint to the kernel routing code, which indicates the prefix route instead of the PPP host route should be returned to the caller. Since a host route to the local end point is also added into the routing table, and there could be multiple such instantiations due to multiple PPP links can be created with the same local end IP address, this patch also fixes the loopback route installation failure problem observed prior to this patch. The reference count of loopback route to local end would be either incremented or decremented. The first instantiation would create the entry and the last removal would delete the route entry. MFC after: 5 days	2009-12-30 21:35:34 +00:00
bms	bc3414fcae	Use ALLOW_NEW_SOURCES and BLOCK_OLD_SOURCES to signal a join or leave with SSM MLDv2 by default. This is current practice and complies with RFC 4604, as well as being required by production IPv6 networks in Japan. The behaviour may be disabled by setting the net.inet6.mld.use_allow sysctl/tunable to 0. Requested by: Hideki Yamamoto MFC after: 1 week	2009-12-22 20:40:22 +00:00
bms	f88254b35a	Add missing #include <sys/ktr.h>. Submitted by: Hideki Yamamoto MFC after: 1 week	2009-12-15 10:40:40 +00:00
bz	932cbdbe4d	Throughout the network stack we have a few places of if (jailed(cred)) left. If you are running with a vnet (virtual network stack) those will return true and defer you to classic IP-jails handling and thus things will be "denied" or returned with an error. Work around this problem by introducing another "jailed()" function, jailed_without_vnet(), that also takes vnets into account, and permits the calls, should the jail from the given cred have its own virtual network stack. We cannot change the classic jailed() call to do that, as it is used outside the network stack as well. Discussed with: julian, zec, jamie, rwatson (back in Sept) MFC after: 5 days	2009-12-13 13:57:32 +00:00
bms	cb3a6b3546	Adapt r197136 to IPv6 stack: Comment some flawed assumptions in in6p_join_group() about mixing SSM full-state and delta-based APIs. MFC after: 1 day	2009-11-19 13:39:07 +00:00
bms	b006145221	Adapt r197135 to IPv6 stack: Don't allow joins w/o source on an existing group. This is almost always pilot error. We don't need to check for group filter UNDEFINED state at t1, because we only ever allocate filters with their groups, so we unconditionally reject such calls with EINVAL. Trying to change the active filter mode w/o going through IPV6_MSFILTER is also disallowed. MFC after: 1 day	2009-11-19 13:33:23 +00:00
bms	d8ea9b2a5b	Adapt r197132 to IPv6 stack: Tighten input checking in in6p_join_group(): * Don't try to use the source address, when its family is unspecified. * If we get a join without a source, on an existing inclusive mode group, this is an error, as it would change the filter mode. Fix a problem with the handling of in6_mfilter for new memberships: * Do not rely on im6f being NULL; it is explicitly initialized to a non-NULL pointer when constructing a membership. * Explicitly initialize *im6f to EX mode when the source address is unspecified. This fixes a problem with in_mfilter slot recycling in the join path. MFC after: 1 day	2009-11-19 13:30:06 +00:00
bms	375b60ebd5	Adapt r197314 to IPv6 stack: Return ENOBUFS consistently if user attempts to exceed in_mcast_maxsocksrc resource limit. MFC after: 1 day	2009-11-19 12:21:20 +00:00
bms	63de6a0a63	Adapt r197130 to IPv6 stack: Fix an obvious logic error in the IPv4 multicast leave processing, where the filter mode vector was not updated correctly after the leave. MFC after: 1 day	2009-11-19 12:18:30 +00:00
bms	028af3a421	Adapt the fix for IGMPv2 in r199287 for the IPv6 stack. Only multicast routing is affected by the issue. MFC after: 1 day	2009-11-19 11:55:19 +00:00
ume	ff25cdd646	- We are not guaranteed that we're not dropping a reference that we did not add. Call LLE_REMREF() only when callout_stop() actually canceled a pending callout. - callout_reset() may cancel a pending callout. When callout_reset() canceled a pending callout, call LLE_REMREF() to drop a reference for the canceled callout. MFC after: 1 week	2009-11-12 14:48:36 +00:00

1396 Commits