freebsd-nq

Author	SHA1	Message	Date
Kristof Provost	effaab8861	netpfil: Introduce PFIL_FWD flag Forwarded packets passed through PFIL_OUT, which made it difficult for firewalls to figure out if they were forwarding or producing packets. This in turn is an issue for pf for IPv6 fragment handling: it needs to call ip6_output() or ip6_forward() to handle the fragments. Figuring out which was difficult (and until now, incorrect). Having pfil distinguish the two removes an ugly piece of code from pf. Introduce a new variant of the netpfil callbacks with a flags variable, which has PFIL_FWD set for forwarded packets. This allows pf to reliably work out if a packet is forwarded. Reviewed by: ae, kevans Differential Revision: https://reviews.freebsd.org/D13715	2018-03-23 16:56:44 +00:00
Sean Bruno	06b479a6a7	Refactor ip6_getpcbopt() for better locking and memory management Created GET_PKTOPT_EXT_HDR() and GET_PKTOPT_SOCKADDR() macros to handle safely fetching options from in6p_outputopts, including properly dealing with in6p locking and preparing memory for sooptcopyout(). Changed the function signature of ip6_getpcbopt() to allow the function to acquire and release locks on in6p as needed. Submitted by: Jason Eggleston <jason@eggnet.com> Sponsored by: Limelight Networks Differential Revision: https://reviews.freebsd.org/D14619	2018-03-22 23:34:48 +00:00
Sean Bruno	2a499acf59	Simple locking fixes in ip_ctloutput, ip6_ctloutput, rip_ctloutput. Submitted by: Jason Eggleston <jason@eggnet.com> Sponsored by: Limelight Networks Differential Revision: https://reviews.freebsd.org/D14624	2018-03-22 22:29:32 +00:00
Sean Bruno	5cbeca4497	Handle locking and memory safety for IPV6_PATHMTU in ip6_ctloutput(). Submitted by: Jason Eggleston <jason@eggnet.com> Reviewed by: ae Sponsored by: Limelight Networks Differential Revision: https://reviews.freebsd.org/D14622	2018-03-22 21:18:34 +00:00
Sean Bruno	37d4fc1e70	Improve write locking in ip6_ctloutput() with macros. Submitted by: Jason Eggleston <jason@eggnet.com> Sponsored by: Limelight Networks Differential Revision: https://reviews.freebsd.org/D14620	2018-03-22 20:21:05 +00:00
Jonathan T. Looney	7fb2986ff6	If the INP lock is uncontested, avoid taking a reference and jumping through the lock-switching hoops. A few of the INP lookup operations that lock INPs after the lookup do so using this mechanism (to maintain lock ordering): 1. Lock lookup structure. 2. Find INP. 3. Acquire reference on INP. 4. Drop lock on lookup structure. 5. Acquire INP lock. 6. Drop reference on INP. This change provides a slightly shorter path for cases where the INP lock is uncontested: 1. Lock lookup structure. 2. Find INP. 3. Try to acquire the INP lock. 4. If successful, drop lock on lookup structure. Of course, if the INP lock is contested, the functions will need to revert to the previous way of switching locks safely. This saves a few atomic operations when the INP lock is uncontested. Discussed with: gallatin, rrs, rwatson MFC after: 2 weeks Sponsored by: Netflix, Inc. Differential Revision: https://reviews.freebsd.org/D12911	2018-03-21 15:54:46 +00:00
Alexander V. Chernikov	1435dcd94f	Fix outgoing TCP/UDP packet drop on arp/ndp entry expiration. Current arp/nd code relies on the feedback from the datapath indicating that the entry is still used. This mechanism is incorporated into the arpresolve()/nd6_resolve() routines. After the inpcb route cache introduction, the packet path for the locally-originated packets changed, passing cached lle pointer to the ether_output() directly. This resulted in the arp/ndp entry expiring each time exactly after the configured max_age interval. During the small window between the ARP/NDP request and reply from the router, most of the packets got lost. Fix this behaviour by plugging datapath notification code to the packet path used by route cache. Unify the notification code by using single inlined function with the per-AF callbacks. Reported by: sthaug at nethelp.no Reviewed by: ae MFC after: 2 weeks	2018-03-17 17:05:48 +00:00
Eric van Gyzen	0bbfb20fe5	Update the MTU in affected routes when IPv6 RA changes the MTU ip6_calcmtu() only looks at the interface MTU if neither the TCP hostcache nor the route provides an MTU. Update the routes so they do not provide stale MTUs. This fixes UNH IPv6 conformance test cases v6LC_4_1_08 and v6LC_4_1_09, which use a RA to reduce the link MTU from 1500 to 1280. Reported and tested by: Farrell Woods <Farrell_Woods@Dell.com> Reviewed by: dab, melifaro Discussed with: ae MFC after: 1 week Sponsored by: Dell EMC Differential Revision: https://reviews.freebsd.org/D14257	2018-02-12 19:49:20 +00:00
Eric van Gyzen	43105e589a	Fix ICMPv6 redirects icmp6_redirect_input() validates that a redirect packet came from the current gateway for the respective destination. To do this, it compares the source address, which has an embedded scope zone id, to the next-hop address, which does not. If the address is link-local, which should be the case, the comparison fails and the redirect is ignored. Insert the scope zone id into the next-hop address so the comparison is accurate. Unsurprisingly, this fixes 35 UNH IPv6 conformance test cases. Submitted by: Farrell Woods <Farrell_Woods@Dell.com> (initial revision) Reviewed by: ae melifaro dab MFC after: 1 week Relnotes: yes Sponsored by: Dell EMC Differential Revision: https://reviews.freebsd.org/D14254	2018-02-09 00:13:05 +00:00
Andrey V. Elsukov	68e0e5a673	Modify ip6_get_prevhdr() so that it can be used safely. Instead of returning a pointer to the previous header, return its offset. In frag6_input() use m_copyback() and the determined offset to store the next header instead of accessing it by pointer and assuming that the memory is contiguous. In rip6_input() use the offset returned by ip6_get_prevhdr() instead of calculating it from pointer arithmetic, because the IP header can belong to another mbuf in the chain. Reported by: Maxime Villard <max at m00nbsd dot net> Reviewed by: kp MFC after: 1 week Differential Revision: https://reviews.freebsd.org/D14158	2018-02-05 09:22:07 +00:00
Andrey V. Elsukov	883cd89b05	Merge r1.120 from NetBSD: Fix a pretty simple, yet pretty tragic typo: we should return IPPROTO_DONE, not IPPROTO_NONE. With IPPROTO_NONE we will keep parsing the header chain on an mbuf that was already freed. Reported by: Maxime Villard <max at m00nbsd dot net> MFC after: 3 days	2018-02-02 07:39:34 +00:00
Eric van Gyzen	f8116f391a	ND6: Set the correct state for new neighbor cache entries Restore state 6. Many of the UNH tests end up exercising this state, where we have a new neighbor cache entry and a new link-layer entry is being created for it. The link-layer address is currently unknown so the initial state of the "llentry" should remain initialized to ND6_LLINFO_NOSTATE so that the ND code will send a solicitation. Setting this to ND6_LLINFO_STALE implies that the link-level entry is valid and can be used (but needs to be refreshed via the Neighbor Unreachability state machine). https://forums.freebsd.org/threads/64287/ Submitted by: Farrell Woods <Farrell_Woods@Dell.com> Reviewed by: mjoras, dab, ae MFC after: 1 week Sponsored by: Dell EMC Differential Revision: https://reviews.freebsd.org/D14059	2018-01-29 16:12:26 +00:00
Andrey V. Elsukov	2164def67c	Do not skip scope zone violation check, when mbuf has M_FASTFWD_OURS flag. When mbuf has M_FASTFWD_OURS flag, this means that a destination address is our local, but we still need to pass scope zone violation check, because protocol level expects that IPv6 link-local addresses have embedded scope zone indexes. This should fix the problem, when ipfw is used to forward packets to local address and source address of a packet is IPv6 LLA. Reported by: sbruno MFC after: 3 weeks	2018-01-29 11:03:29 +00:00
Andrey V. Elsukov	efc284cb12	Assign IPv6 link-local address to loopback interfaces with unit > 0. When an interface has IFF_LOOPBACK flag in6_ifattach() tries to assign IPv6 loopback address to this interface. It uses in6ifa_ifpwithaddr() to check, that interface doesn't already have given address and then uses in6_ifattach_loopback(). If in6_ifattach_loopback() fails, it just exits and thus skips assignment of IPv6 LLA. Fix this using in6ifa_ifwithaddr() function. If IPv6 loopback address is already assigned in the system, do not call in6_ifattach_loopback(). PR: 138678 MFC after: 3 weeks	2018-01-29 10:33:55 +00:00
Navdeep Parhar	09b0b8c058	Do not generate illegal mbuf chains during IP fragment reassembly. Only the first mbuf of the reassembled datagram should have a pkthdr. This was discovered with cxgbe(4) + IPSEC + ping with payload more than interface MTU. cxgbe can generate !M_WRITEABLE mbufs and this results in m_unshare being called on the reassembled datagram, and it complains: panic: m_unshare: m0 0xfffff80020f82600, m 0xfffff8005d054100 has M_PKTHDR PR: 224922 Reviewed by: ae@ MFC after: 1 week Sponsored by: Chelsio Communications Differential Revision: https://reviews.freebsd.org/D14009	2018-01-24 05:09:21 +00:00
Alan Somers	81e04458b8	sys/netinet6: fix typos in comments. No functional change. MFC after: 3 weeks Sponsored by: Spectra Logic Corp	2018-01-23 19:40:05 +00:00
Pedro F. Giffuni	ac2fffa4b7	Revert r327828, r327949, r327953, r328016-r328026, r328041: Uses of mallocarray(9). The use of mallocarray(9) has rocketed the required swap to build FreeBSD. This is likely caused by the allocation size attributes which put extra pressure on the compiler. Given that most of these checks are superfluous we have to choose better where to use mallocarray(9). We still have more uses of mallocarray(9) but hopefully this is enough to bring swap usage to a reasonable level. Reported by: wosch PR: 225197	2018-01-21 15:42:36 +00:00
Pedro F. Giffuni	443133416b	net*: make some use of mallocarray(9). Focus on code where we are doing multiplications within malloc(9). None of these are likely to overflow, however the change is still useful as some static checkers can benefit from the allocation attributes we use for mallocarray. This initial sweep only covers malloc(9) calls with M_NOWAIT. No good reason but I started doing the changes before r327796 and at that time it was convenient to make sure the surrounding code could handle NULL values. X-Differential revision: https://reviews.freebsd.org/D13837	2018-01-15 21:21:51 +00:00
Pedro F. Giffuni	3760a9ac78	Fix some typos. Obtained from: OpenBSD (CVS v1.5)	2017-12-28 20:40:56 +00:00
Pedro F. Giffuni	a8e6714356	netinet6/ip6_id.c: niels kindly dropped clause 3/4 from the license. This brings back r327293 from OpenBSD, with the important difference that we are now getting it from their ip6_id.c file. Obtained from: OpenBSD (CVS v1.3)	2017-12-28 20:35:21 +00:00
Pedro F. Giffuni	b3c64c30fa	Start syncing changes from OpenBSD's ip6_id.c instead of ip_id.c. correct non-repetitive ID code, based on comments from niels provos. - seed2 is necessary, but use it as "seed2 + x" not "seed2 ^ x". - skipping number is not needed, so disable it for 16bit generator (makes the repetition period to 30000) Obtained from: OpenBSD (CVS rev. 1.2) MFC after: 1 week	2017-12-28 20:26:51 +00:00
Pedro F. Giffuni	d82751000f	Revert r327293 netinet6/ip6_id.c: niels kindly dropped clause 3/4 from the license. I was looking at the wrong file. There is an important merge that must be done before I can bring this change.	2017-12-28 20:10:10 +00:00
Pedro F. Giffuni	e9738d25c1	netinet6/ip6_id.c: niels kindly dropped clause 3/4 from the license. This file is supposed to be based on the OpenBSD CVS v1.6 but checking the OpenBSD repository the license had already dropped the 2&3 clauses by then. Catch up with the licensing. Obtained from: OpenBSD (CVS 1.2)	2017-12-28 19:42:53 +00:00
Alexander Kabaev	151ba7933a	Do pass removing some write-only variables from the kernel. This reduces noise when kernel is compiled by newer GCC versions, such as one used by external toolchain ports. Reviewed by: kib, andrew(sys/arm and sys/arm64), emaste(partial), erj(partial) Reviewed by: jhb (sys/dev/pci/* sys/kern/vfs_aio.c and sys/kern/kern_synch.c) Differential Revision: https://reviews.freebsd.org/D10385	2017-12-25 04:48:39 +00:00
Alexander Kabaev	bf51c9665d	Silence clang analyzer false positive. clang does not know that two lookup calls will return the same pointer, so it assumes correctly that using the old pointer after dropping the reference to it is a bit risky.	2017-12-23 16:45:26 +00:00
Andrey V. Elsukov	a406128960	Follow the RFC6980 and silently ignore following IPv6 NDP messages that had the IPv6 fragmentation header: o Neighbor Solicitation o Neighbor Advertisement o Router Solicitation o Router Advertisement o Redirect Introduce M_FRAGMENTED mbuf flag, and set it after IPv6 fragment reassembly is completed. Then check the presence of this flag in corresponding ND6 handling routines. PR: 224247 MFC after: 2 weeks	2017-12-15 12:37:32 +00:00
Michael Tuexen	9f0abda051	Retire SCTP_WITH_NO_CSUM option. This option was used in the early days to allow performance measurements extrapolating the use of SCTP checksum offloading. Since this feature is now available, get rid of this option. This also un-breaks the LINT kernel. Thanks to markj@ for making me aware of the problem.	2017-12-07 22:19:08 +00:00
Pedro F. Giffuni	fe267a5590	sys: general adoption of SPDX licensing ID tags. Mainly focus on files that use BSD 2-Clause license, however the tool I was using misidentified many licenses so this was mostly a manual - error prone - task. The Software Package Data Exchange (SPDX) group provides a specification to make it easier for automated tools to detect and summarize well known opensource licenses. We are gradually adopting the specification, noting that the tags are considered only advisory and do not, in any way, supersede or replace the license texts. No functional change intended.	2017-11-27 15:23:17 +00:00
Pedro F. Giffuni	51369649b0	sys: further adoption of SPDX licensing ID tags. Mainly focus on files that use BSD 3-Clause license. The Software Package Data Exchange (SPDX) group provides a specification to make it easier for automated tools to detect and summarize well known opensource licenses. We are gradually adopting the specification, noting that the tags are considered only advisory and do not, in any way, supersede or replace the license texts. Special thanks to Wind River for providing access to "The Duke of Highlander" tool: an older (2014) run over FreeBSD tree was useful as a starting point.	2017-11-20 19:43:44 +00:00
Pedro F. Giffuni	df57947f08	spdx: initial adoption of licensing ID tags. The Software Package Data Exchange (SPDX) group provides a specification to make it easier for automated tools to detect and summarize well known opensource licenses. We are gradually adopting the specification, noting that the tags are considered only advisory and do not, in any way, supersede or replace the license texts. Special thanks to Wind River for providing access to "The Duke of Highlander" tool: an older (2014) run over FreeBSD tree was useful as a starting point. Initially, only tag files that use BSD 4-Clause "Original" license. RelNotes: yes Differential Revision: https://reviews.freebsd.org/D13133	2017-11-18 14:26:50 +00:00
Konstantin Belousov	06193f0be0	Use hardware timestamps to report packet timestamps for SO_TIMESTAMP and other similar socket options. Provide new control message SCM_TIME_INFO to supply information about timestamp. Currently it indicates that the timestamp was hardware-assisted and high-precision, for software timestamps the message is not returned. Reserved fields are added to ABI to report additional info about it, it is expected that raw hardware clock value might be useful for some applications. Reviewed by: gallatin (previous version), hselasky Sponsored by: Mellanox Technologies MFC after: 2 weeks X-Differential revision: https://reviews.freebsd.org/D12638	2017-11-07 09:46:26 +00:00
Michael Tuexen	28a6adde1d	Allow the setting of the MTU for future paths using an SCTP socket option. This functionality was missing. MFC after: 1 week	2017-11-03 20:46:12 +00:00
Kristof Provost	a0bf3ee425	Evaluate packet size after the firewall had its chance in the ip6 fast path Defer the packet size check until after the firewall has had a look at it. This means that the firewall now has the opportunity to (re-)fragment an oversized packet. This mirrors what the slow path does. Reviewed by: ae MFC after: 1 week Differential Revision: https://reviews.freebsd.org/D12779	2017-10-25 19:21:48 +00:00
Gleb Smirnoff	0e229f343f	Hide struct socket and struct unpcb from the userland. Violators may define _WANT_SOCKET and _WANT_UNPCB respectively and are not guaranteed for stability of the structures. The violators list is the usual one: libprocstat(3) and netstat(1) internally and lsof in ports. In struct xunpcb remove the inclusion of kernel structure and add a bunch of spare fields. The xsocket already has socket not included, but add there spares as well. Embed xsockbuf into xsocket. Sort declarations in sys/socketvar.h to separate kernel only from userland available ones. PR: 221820 (exp-run)	2017-10-02 23:29:56 +00:00
Michael Tuexen	2e8bb5ddf4	Fix a locking issue found by Coverity scanning the usrsctp library. MFC after: 3 days	2017-09-09 20:51:54 +00:00
Bjoern A. Zeeb	ae69ad884d	After inpcb route caching was put back in place there is no need for flowtable anymore (as flowtable was never considered to be useful in the forwarding path). Reviewed by: np Differential Revision: https://reviews.freebsd.org/D11448	2017-07-27 13:03:36 +00:00
Michael Tuexen	5ba7f91f9d	Use memset/memcpy instead of bzero/bcopy. Just use one variant instead of both. Use the memset/memcpy ones since they cause less problems in crossplatform deployment. MFC after: 1 week	2017-07-19 14:28:58 +00:00
Jonathan T. Looney	8b07e00e99	Fix an unnecessary/incorrect check in the PKTOPT_EXTHDRCPY macro. This macro allocates memory and, if malloc does not return NULL, copies data into the new memory. However, it doesn't just check whether malloc returns NULL. It also checks whether we called malloc with M_NOWAIT. That is not necessary. While it may be that malloc() will only return NULL when the M_NOWAIT flag is set, we don't need to check for this when checking malloc's return value. Further, in this case, the check was not completely accurate, because it checked for flags == M_NOWAIT, rather than treating it as a bit field and checking for (flags & M_NOWAIT). Reviewed by: ae MFC after: 2 weeks Sponsored by: Netflix Differential Revision: https://reviews.freebsd.org/D10942	2017-05-30 14:50:28 +00:00
Jonathan T. Looney	fb04394554	Fix two places in the ICMP6 code where we could dereference a NULL pointer in the icmp6_input() function. When processing an ICMP6_ECHO_REQUEST, if IP6_EXTHDR_GET fails, it will set nicmp6 and n to NULL. Therefore, we should condition our modification to nicmp6 on n being not NULL. And, when processing an ICMP6_WRUREQUEST in the (mode != FQDN) case, if m_dup_pkthdr() fails, the code will set n to NULL. However, the very next line dereferences n. Therefore, when m_dup_pkthdr() fails, we should discontinue further processing and follow the same path as when m_gethdr() fails. Reported by: clang static analyzer Reviewed by: ae MFC after: 2 weeks Sponsored by: Netflix Differential Revision: https://reviews.freebsd.org/D10941	2017-05-30 14:41:31 +00:00
Jonathan T. Looney	382a6bbcf1	Enforce the limit on ICMP messages before doing work to formulate the response. Delete an unneeded rate limit for UDP under IPv6. Because ICMP6 messages have their own rate limit, it is unnecessary to apply a second rate limit to UDP messages. Reviewed by: glebius MFC after: 2 weeks Sponsored by: Netflix Differential Revision: https://reviews.freebsd.org/D10387	2017-05-30 14:32:44 +00:00
Michael Tuexen	5dba6ada91	The connect() system call should return -1 and set errno to EAFNOSUPPORT if it is called on a TCP socket * with an IPv6 address and the socket is bound to an IPv4-mapped IPv6 address. * with an IPv4-mapped IPv6 address and the socket is bound to an IPv6 address. Thanks to Jonathan T. Leighton for reporting this issue. Reviewed by: bz gnn MFC after: 3 days Differential Revision: https://reviews.freebsd.org/D9163	2017-05-22 15:29:10 +00:00
Ed Maste	3e85b721d6	Remove register keyword from sys/ and ANSIfy prototypes A long long time ago the register keyword told the compiler to store the corresponding variable in a CPU register, but it is not relevant for any compiler used in the FreeBSD world today. ANSIfy related prototypes while here. Reviewed by: cem, jhb Sponsored by: The FreeBSD Foundation Differential Revision: https://reviews.freebsd.org/D10193	2017-05-17 00:34:34 +00:00
Enji Cooper	bd7459366e	Add missing braces around MCAST_EXCLUDE check when KTR support is compiled into the kernel This ensures that .iss_asm (the number of ASM listeners) isn't incorrectly decremented for MLD-layer source datagrams when inspecting im*s_st[1] (the second state in the structure). MFC after: 2 months PR: 217509 [1] Reported by: Coverity (Isilon) Reviewed by: ae ("This patch looks correct to me." [1]) Submitted by: Miles Ohlrich <miles.ohlrich@isilon.com> Sponsored by: Dell EMC Isilon	2017-05-13 18:41:24 +00:00
Navdeep Parhar	ce9ac139d4	ip6_output runs with the inp lock held, just like ip_output.	2017-05-10 00:14:55 +00:00
Michael Tuexen	d274bcc661	Fix an issue with MTU calculation if an ICMP message is received for an SCTP/UDP packet. MFC after: 1 week	2017-04-26 20:21:05 +00:00
Michael Tuexen	6ebfa5ee14	Use consistently uint32_t for mtu values. This does not change functionality, but this cleanup is needed for further improvements of ICMP handling. MFC after: 1 week	2017-04-26 19:26:40 +00:00
Kristof Provost	d78c0804fb	Rename variable for clarity Rename the mtu variable in ip6_fragment(), because mtu is misleading. The variable actually holds the fragment length. No functional change. Suggested by: ae	2017-04-22 13:04:36 +00:00
Kristof Provost	00eab743ab	pf: Fix possible incorrect IPv6 fragmentation When forwarding pf tracks the size of the largest fragment in a fragmented packet, and refragments based on this size. It failed to ensure that this size was a multiple of 8 (as is required for all but the last fragment), so it could end up generating incorrect fragments. For example, if we received an 8 byte and 12 byte fragment pf would emit a first fragment with 12 bytes of payload and the final fragment would claim to be at offset 8 (not 12). We now assert that the fragment size is a multiple of 8 in ip6_fragment(), so other users won't make the same mistake. Reported by: Antonios Atlasis <aatlasis at secfu net> MFC after: 3 days	2017-04-20 09:05:53 +00:00
Andrey V. Elsukov	c33a231337	Rework r316770 to make it protocol independent and general, like we do for streaming sockets. And do more cleanup in the sbappendaddr_locked_internal() to prevent leaking information from an existing mbuf to the one that will possibly be created later by netgraph. Suggested by: glebius Tested by: Irina Liakh <spell at itl ua> MFC after: 1 week	2017-04-14 09:00:48 +00:00
Andrey V. Elsukov	8428914909	Clear h/w csum flags on mbuf handled by UDP. When checksums of received IP and UDP header already checked, UDP uses sbappendaddr_locked() to pass received data to the socket. sbappendaddr_locked() uses given mbuf as is, and if NIC supports checksum offloading, mbuf contains csum_data and csum_flags that were calculated for already stripped headers. Some NICs support only limited checksums offloading and do not use CSUM_PSEUDO_HDR flag, and csum_data contains some value that UDP/TCP should use for pseudo header checksum calculation. When L2TP is used for tunneling with mpd5, ng_ksocket receives mbuf with filled csum_flags and csum_data, that were calculated for outer headers. When L2TP header is stripped, a packet that was tunneled goes to the IP layer and due to presence of csum_flags (without CSUM_PSEUDO_HDR) and csum_data, the UDP/TCP checksum check fails for this packet. Reported by: Irina Liakh <spell at itl ua> Tested by: Irina Liakh <spell at itl ua> MFC after: 1 week	2017-04-13 17:03:57 +00:00
Steven Hartland	4d806fc663	Allow explicitly assigned IPv6 loopback address to be used in jails If a jail has an explicitly assigned IPv6 loopback address then allow it to be used instead of remapping requests for the loopback address to the first IPv6 address assigned to the jail. This fixes issues where applications attempt to detect their bound port where they requested a loopback address, which was available, but instead the kernel remapped it to the jail's first address. This is the same fix applied to IPv4 fix by: r316313 Also: * Correct the description of prison_check_ip6_locked to match the code. MFC after: 2 weeks Relnotes: Yes Sponsored by: Multiplay	2017-03-31 09:10:05 +00:00
Mike Karels	8c1960d506	Fix reference count leak with L2 caching. ip_forward, TCP/IPv6, and probably SCTP leaked references to L2 cache entry because they used their own routes on the stack, not in_pcb routes. The original model for route caching was callers that provided a route structure to ip{,6}input() would keep the route, and this model was used for L2 caching as well. Instead, change L2 caching to be done by default only when using a route structure in the in_pcb; the pcb deallocation code frees L2 as well as L3 caches. A separate change will add route caching to TCP/IPv6. Another suggestion was to have the transport protocols indicate willingness to use L2 caching, but this approach keeps the changes in the network level. Reviewed by: ae gnn MFC after: 2 weeks Differential Revision: https://reviews.freebsd.org/D10059	2017-03-25 15:06:28 +00:00
Alan Somers	559b42968c	Constrain IPv6 routes to single FIBs when net.add_addr_allfibs=0 sys/netinet6/icmp6.c Use the interface's FIB for source address selection in ICMPv6 error responses. sys/netinet6/in6.c In in6_newaddrmsg, announce arrival of local addresses on the interface's FIB only. In in6_lltable_rtcheck, use a per-fib ND6 cache instead of a single cache. sys/netinet6/in6_src.c In in6_selectsrc, use the caller's fib instead of the default fib. In in6_selectsrc_socket, remove a superfluous check. sys/netinet6/nd6.c In nd6_lle_event, use the interface's fib for routing socket messages. In nd6_is_new_addr_neighbor, check all FIBs when trying to determine whether an address is a neighbor. Also, simplify the code for point to point interfaces. sys/netinet6/nd6.h sys/netinet6/nd6.c sys/netinet6/nd6_rtr.c Make defrouter_select fib-aware, and make all of its callers pass in the interface fib. sys/netinet6/nd6_nbr.c When inputting a Neighbor Solicitation packet, consider the interface fib instead of the default fib for DAD. Output NS and Neighbor Advertisement packets on the correct fib. sys/netinet6/nd6_rtr.c Allow installing the same host route on different interfaces in different FIBs. If rt_add_addr_allfibs=0, only install or delete the prefix route on the interface fib. tests/sys/netinet/fibs_test.sh Clear some expected failures, but add a skip for the newly revealed BUG217871. PR: 196361 Submitted by: Erick Turnquist <jhujhiti@adjectivism.org> Reported by: Jason Healy <jhealy@logn.net> Reviewed by: asomers MFC after: 3 weeks Sponsored by: Spectra Logic Corp Differential Revision: https://reviews.freebsd.org/D9451	2017-03-17 16:50:37 +00:00
Ermal Luçi	dce33a45c9	The patch provides the same socket option as Linux IP_ORIGDSTADDR. Unfortunately they will have different integer value due to Linux value being already assigned in FreeBSD. The patch is similar to IP_RECVDSTADDR but also provides the destination port value to the application. This allows/improves implementation of transparent proxies on UDP sockets due to having the whole information on forwarded packets. Reviewed by: adrian, aw Approved by: ae (mentor) Sponsored by: rsync.net Differential Revision: D9235	2017-03-06 04:01:58 +00:00
Warner Losh	fbbd9655e5	Renumber copyright clause 4 Renumber clause 4 to 3, per what everybody else did when BSD granted them permission to remove clause 3. My insistence on keeping the same numbering for legal reasons is too pedantic, so give up on that point. Submitted by: Jan Schaumann <jschauma@stevens.edu> Pull Request: https://github.com/freebsd/freebsd/pull/96	2017-02-28 23:42:47 +00:00
Andrey V. Elsukov	9907aba370	When IPv6 fragments reassembly is complete, update mbuf's csum_data and csum_flags using information from all fragments. This fixes dropping of reassembled packets due to wrong checksum when the IPv6 checksum offloading is enabled on a network card. Obtained from: Yandex LLC MFC after: 1 week Sponsored by: Yandex LLC	2017-02-28 22:58:19 +00:00
Andrey V. Elsukov	627c036f65	Remove IPsec related PCB code from SCTP. The inpcb structure has inp_sp pointer that is initialized by ipsec_init_pcbpolicy() function. This pointer keeps storage for IPsec security policies associated with a specific socket. An application can use IP_IPSEC_POLICY and IPV6_IPSEC_POLICY socket options to configure these security policies. Then ip[6]_output() uses inpcb pointer to specify that an outgoing packet is associated with some socket. And IPSEC_OUTPUT() method can use a security policy stored in the inp_sp. For inbound packet the protocol-specific input routine uses IPSEC_CHECK_POLICY() method to check that a packet conforms to inbound security policy configured in the inpcb. SCTP protocol doesn't specify inpcb for ip[6]_output() when it sends packets. Thus IPSEC_OUTPUT() method does not consider such packets as associated with some socket and can not apply security policies from inpcb, even if they are configured. Since IPSEC_CHECK_POLICY() method is called from protocol-specific input routine, it can specify inpcb pointer and associated with socket inbound policy will be checked. But there are two problems: 1. Such check is asymmetric, because we can not apply security policy from inpcb for outgoing packet. 2. IPSEC_CHECK_POLICY() expects that caller holds INPCB lock and access to inp_sp is protected. But for SCTP this is not correct, because SCTP uses own locks to protect inpcb. To fix these problems remove IPsec related PCB code from SCTP. This implies that IP_IPSEC_POLICY and IPV6_IPSEC_POLICY socket options will not be applicable to SCTP sockets. To be able to correctly check inbound security policies for SCTP, mark its protocol header with the PR_LASTHDR flag. Reported by: tuexen Reviewed by: tuexen Differential Revision: https://reviews.freebsd.org/D9538	2017-02-13 11:37:52 +00:00
Ermal Luçi	c10c5b1eba	Committed without approval from mentor. Reported by: gnn	2017-02-12 06:56:33 +00:00
Ermal Luçi	70d81c5e91	Use proper value for socket option on IPv6 Reported-by: ohartmann@walstatt.org	2017-02-10 06:20:27 +00:00
Ermal Luçi	4616026faf	Revert r313527 Heh svn is not git	2017-02-10 05:58:16 +00:00
Ermal Luçi	c0fadfdbbf	Correct missed variable name. Reported-by: ohartmann@walstatt.org	2017-02-10 05:51:39 +00:00
Ermal Luçi	ed55edceef	The patch provides the same socket option as Linux IP_ORIGDSTADDR. Unfortunately they will have different integer value due to Linux value being already assigned in FreeBSD. The patch is similar to IP_RECVDSTADDR but also provides the destination port value to the application. This allows/improves implementation of transparent proxies on UDP sockets due to having the whole information on forwarded packets. Sponsored-by: rsync.net Differential Revision: D9235 Reviewed-by: adrian	2017-02-10 05:16:14 +00:00
Andrey V. Elsukov	fcf596178b	Merge projects/ipsec into head/. Small summary ------------- o Almost all IPsec related code was moved into sys/netipsec. o New kernel modules added: ipsec.ko and tcpmd5.ko. New kernel option IPSEC_SUPPORT added. It enables support for loading and unloading of ipsec.ko and tcpmd5.ko kernel modules. o IPSEC_NAT_T option was removed. Now NAT-T support is enabled by default. The UDP_ENCAP_ESPINUDP_NON_IKE encapsulation type support was removed. Added TCP/UDP checksum handling for inbound packets that were decapsulated by transport mode SAs. setkey(8) modified to show run-time NAT-T configuration of SA. o New network pseudo interface if_ipsec(4) added. For now it is built as part of ipsec.ko module (or with IPSEC kernel). It implements IPsec virtual tunnels to create route-based VPNs. o The network stack now invokes IPsec functions using special methods. The only one header file <netipsec/ipsec_support.h> should be included to declare all the needed things to work with IPsec. o All IPsec protocols handlers (ESP/AH/IPCOMP protosw) were removed. Now these protocols are handled directly via IPsec methods. o TCP_SIGNATURE support was reworked to be more close to RFC. o PF_KEY SADB was reworked: - now all security associations stored in the single SPI namespace, and all SAs MUST have unique SPI. - several hash tables added to speed up lookups in SADB. - SADB now uses rmlock to protect access, and concurrent threads can do SA lookups in the same time. - many PF_KEY message handlers were reworked to reflect changes in SADB. - SADB_UPDATE message was extended to support new PF_KEY headers: SADB_X_EXT_NEW_ADDRESS_SRC and SADB_X_EXT_NEW_ADDRESS_DST. They can be used by IKE daemon to change SA addresses. o ipsecrequest and secpolicy structures were cardinally changed to avoid locking protection for ipsecrequest. Now we support only limited number (4) of bundled SAs, but they are supported for both INET and INET6.
o INPCB security policy cache was introduced. Each PCB now caches used security policies to avoid SP lookup for each packet. o For inbound security policies added the mode, when the kernel does check for full history of applied IPsec transforms. o References counting rules for security policies and security associations were changed. The proper SA locking added into xform code. o xform code was also changed. Now it is possible to unregister xforms. tdb_xxx structures were changed and renamed to reflect changes in SADB/SPDB, and changed rules for locking and refcounting. Reviewed by: gnn, wblock Obtained from: Yandex LLC Relnotes: yes Sponsored by: Yandex LLC Differential Revision: https://reviews.freebsd.org/D9352	2017-02-06 08:49:57 +00:00
Andriy Voskoboinyk	2bbd06fc33	Garbage collect IFT_IEEE80211 (but leave the define for possible reuse) This interface type ("a parent interface of wlanX") is not used since r287197 Reviewed by: adrian, glebius Differential Revision: https://reviews.freebsd.org/D9308	2017-01-28 17:08:40 +00:00
Luiz Otavio O Souza	338e227ac0	After the in_control() changes in r257692, an existing address is (intentionally) deleted first and then completely added again (so all the events, announces and hooks are given a chance to run). This causes an issue with CARP where the existing CARP data structure is removed together with the last address for a given VHID, which will cause a subsequent failure when the address is later re-added. This change fixes this issue by adding a new flag to keep the CARP data structure when an address is not being removed. There was an additional issue with IPv6 CARP addresses, where the CARP data structure would never be removed after a change and lead to VHIDs which cannot be destroyed. Reviewed by: glebius Obtained from: pfSense MFC after: 2 weeks Sponsored by: Rubicon Communications, LLC (Netgate)	2017-01-25 19:04:08 +00:00
Hans Petter Selasky	f3e7afe2d7	Implement kernel support for hardware rate limited sockets. - Add RATELIMIT kernel configuration keyword which must be set to enable the new functionality. - Add support for hardware driven, Receive Side Scaling, RSS aware, rate limited sendqueues and expose the functionality through the already established SO_MAX_PACING_RATE setsockopt(). The API supports rates in the range from 1 to 4Gbytes/s which are suitable for regular TCP and UDP streams. The setsockopt(2) manual page has been updated. - Add rate limit function callback API to "struct ifnet" which supports the following operations: if_snd_tag_alloc(), if_snd_tag_modify(), if_snd_tag_query() and if_snd_tag_free(). - Add support to ifconfig to view, set and clear the IFCAP_TXRTLMT flag, which tells if a network driver supports rate limiting or not. - This patch also adds support for rate limiting through VLAN and LAGG intermediate network devices. - How rate limiting works: 1) The userspace application calls setsockopt() after accepting or making a new connection to set the rate which is then stored in the socket structure in the kernel. Later on when packets are transmitted a check is made in the transmit path for rate changes. A rate change implies a non-blocking ifp->if_snd_tag_alloc() call will be made to the destination network interface, which then sets up a custom sendqueue with the given rate limitation parameter. A "struct m_snd_tag" pointer is returned which serves as a "snd_tag" hint in the m_pkthdr for the subsequently transmitted mbufs. 2) When the network driver sees the "m->m_pkthdr.snd_tag" different from NULL, it will move the packets into a designated rate limited sendqueue given by the snd_tag pointer. It is up to the individual drivers how the rate limited traffic will be rate limited. 3) Route changes are detected by the NIC drivers in the ifp->if_transmit() routine when the ifnet pointer in the incoming snd_tag mismatches the one of the network interface. 
The network adapter frees the mbuf and returns EAGAIN which causes the ip_output() to release and clear the send tag. Upon the next ip_output() call, allocation of a new "snd_tag" will be attempted. 4) When the PCB is detached the custom sendqueue will be released by a non-blocking ifp->if_snd_tag_free() call to the currently bound network interface. Reviewed by: wblock (manpages), adrian, gallatin, scottl (network) Differential Revision: https://reviews.freebsd.org/D3687 Sponsored by: Mellanox Technologies MFC after: 3 months	2017-01-18 13:31:17 +00:00
Mark Johnston	762d16d9e4	Improve some of the sysctl descriptions added in r299827. Submitted by: Marie Helene Kvello-Aune <marieheleneka@gmail.com> (original version) MFC after: 1 week Differential Revision: https://reviews.freebsd.org/D5336	2017-01-16 19:35:19 +00:00
Maxim Sobolev	339efd75a4	Add a new socket option SO_TS_CLOCK to pick from several different clock sources to return timestamps when SO_TIMESTAMP is enabled. Two additional clock sources are: o nanosecond resolution realtime clock (equivalent of CLOCK_REALTIME); o nanosecond resolution monotonic clock (equivalent of CLOCK_MONOTONIC). In addition to this, this option provides unified interface to get bintime (equivalent of using SO_BINTIME), except it is also supported with IPv6 where SO_BINTIME has never been supported. The long term plan is to deprecate SO_BINTIME and move everything to using SO_TS_CLOCK. Idea for this enhancement has been briefly discussed on the Net session during dev summit in Ottawa last June and the general input was positive. This change is believed to benefit network benchmarks/profiling as well as other scenarios where precise time of arrival measurement is necessary. There are two regression test cases as part of this commit: one extends unix domain test code (unix_cmsg) to test new SCM_XXX types and another one implements a totally new test case which exchanges UDP packets between two processes using both conventional methods (i.e. calling clock_gettime(2) before recv(2) and after send(2)), as well as using setsockopt()+recv() in receive path. The resulting delays are checked for sanity for all supported clock types. Reviewed by: adrian, gnn Differential Revision: https://reviews.freebsd.org/D9171	2017-01-16 17:46:38 +00:00
Mark Johnston	8cd3b2042c	Release the ND6 list lock before making a prefix off-link in nd6_timer(). Reported by: Jim <BM-2cWfdfG5CJsquqkJyry7hZT9LypbSEWEkQ@bitmessage.ch> X-MFC With: r306829	2017-01-08 18:46:00 +00:00
Michael Tuexen	b7b84c0e02	Whitespace changes. The toolchain for processing the sources has been updated. No functional change. MFC after: 3 days	2016-12-26 11:06:41 +00:00
Mark Johnston	62280740d6	Remove a bogus KASSERT from nd6_prefix_unlink(). The caller may unlink a prefix before purging referencing addresses. An identical assertion in nd6_prefix_del() verifies that the addresses are purged before the prefix is freed. PR: 215372 X-MFC With: r306829	2016-12-19 19:21:28 +00:00
Andrey V. Elsukov	ad9f4d6ab6	ip[6]_tryforward does inbound and outbound packet firewall processing. This can lead to change of mbuf pointer (packet filter could do m_pullup(), NAT, etc). Also in case of change of destination address, tryforward can decide that packet should be handled by local system. In this case modified mbuf can be returned to the ip[6]_input(). To handle this correctly, check M_FASTFWD_OURS flag after return from ip[6]_tryforward. And if it is present, update variables that depend on the mbuf pointer and skip another inbound firewall processing. No objection from: #network MFC after: 3 weeks Sponsored by: Yandex LLC Differential Revision: https://reviews.freebsd.org/D8764	2016-12-19 11:02:49 +00:00
Andrey V. Elsukov	8a030e9c6e	Modify IPv6 statistic accounting in ip6_input(). Add rcvif local variable to keep inbound interface pointer. Count ifs6_in_discard errors in all "goto bad" cases. Now it will count errors even if mbuf was freed. Modify all places where m->m_pkthdr.rcvif is used to use local rcvif variable. Obtained from: Yandex LLC MFC after: 1 month	2016-12-12 11:26:59 +00:00
Andrey V. Elsukov	5a1842a24a	Add ip6_tryforward() - a run to completion forwarding implementation for IPv6. It gets performance benefits from reduced number of checks. It doesn't copy mbuf to be able to send an ICMPv6 error message, because it keeps mbuf unchanged until the moment, when the route decision has been made. It doesn't do IPsec checks, and when some IPsec security policies are present, ip6_input() uses normal slow path. Reviewed by: bz, gnn Obtained from: Yandex LLC MFC after: 1 month Sponsored by: Yandex LLC Differential Revision: https://reviews.freebsd.org/D8527	2016-12-12 10:57:32 +00:00
Michael Tuexen	5b495f17a5	Whitespace changes. The tools used to generate the sources have been updated and produce different whitespace. Commit this separately to avoid intermixing these with real code changes. MFC after: 3 days	2016-12-06 10:21:25 +00:00
Michael Tuexen	3e1465754f	Make ICMPv6 hard error handling for TCP consistent with the ICMPv4 handling. Ensure that: * Protocol unreachable errors are handled by indicating ECONNREFUSED to the TCP user for both IPv4 and IPv6. These were ignored for IPv6. * Communication prohibited errors are handled by indicating ECONNREFUSED to the TCP user for both IPv4 and IPv6. These were ignored for IPv6. * Hop Limit exceeded errors are handled by indicating EHOSTUNREACH to the TCP user for both IPv4 and IPv6. For IPv6 the TCP connection was dropped but errno wasn't set. Reviewed by: gallatin, rrs MFC after: 1 month Sponsored by: Netflix Differential Revision: 7904	2016-10-21 10:32:57 +00:00
George V. Neville-Neil	aec9c8d5a5	Limit the number of mbufs that can be allocated for IPV6_2292PKTOPTIONS (and IPV6_PKTOPTIONS). PR: 100219 Submitted by: Joseph Kong MFC after: 2 weeks Differential Revision: https://reviews.freebsd.org/D5157	2016-10-17 23:25:31 +00:00
Gleb Smirnoff	cc94f0c2d7	- Revert r300854, r303657 which tried to fix regression from r297225. - Fix the regression proper way using RO_RTFREE(). Submitted by: ae	2016-10-13 20:15:47 +00:00
Mark Johnston	d748f7efcd	Lock the ND prefix list and add refcounting for prefixes. This change extends the nd6 lock to protect the ND prefix list as well as the list of advertising routers associated with each prefix. To handle cases where the nd6 lock must be dropped while iterating over either the prefix or default router lists, a generation counter is used to track modifications to the lists. Additionally, a new mutex is used to serialize prefix on-link/off-link transitions. This mutex must be acquired before the nd6 lock and is held while updating the routing table in nd6_prefix_onlink() and nd6_prefix_offlink(). Reviewed by: ae, tuexen (SCTP bits) Tested by: Jason Wolfe <jason@llnw.com>, Larry Rosenman <ler@lerctr.org> MFC after: 2 months Differential Revision: https://reviews.freebsd.org/D8125	2016-10-07 21:10:53 +00:00
Mark Johnston	c26158449e	Reduce the number of conditional statements in nd6_prefix_onlink(). MFC after: 1 week	2016-10-07 21:03:18 +00:00
Mark Johnston	7b0e84b7c8	Combine several checks in nd6_prefix_offlink() into one. MFC after: 1 week	2016-10-07 21:02:30 +00:00
Mark Johnston	7782accf13	Fix whitespace around prototypes in nd6_rtr.c. MFC after: 1 week	2016-10-07 00:36:18 +00:00
Mark Johnston	07c1f95976	Fix a typo. MFC after: 1 week	2016-10-07 00:35:28 +00:00
Mark Johnston	a88d6d7e07	Shorten and simplify some of the loops in pfxlist_onlink_check(). No functional change intended. MFC after: 1 week	2016-10-07 00:34:57 +00:00
Mark Johnston	f7d91d8cdd	Use a const reference to prefixes in nd6_is_new_addr_neighbor(). MFC after: 1 week	2016-10-07 00:26:36 +00:00
Mark Johnston	0ed7d74424	nd6_dad_timer(): don't assert that the address is tentative. It appears that this assertion can be tripped in some cases when multiple interfaces are on the same link. Until this is resolved, revert a part of r306305 and simply log a message if the DAD timer fires on a non-tentative address. Reported by: jhb X-MFC With: r306305	2016-10-01 01:30:34 +00:00
Andrey V. Elsukov	5a03e7819a	Fix bug introduced in r274300. In icmp6_reflect() use original source address of erroneous packet as destination address for source selection algorithm when original destination address is not one of our own. Reported by: Mark Kamichoff <prox at prolixium com> Tested by: Mark Kamichoff <prox at prolixium com> MFC after: 1 week	2016-09-29 19:57:37 +00:00
Mark Johnston	970fe0938e	Convert checks in nd6_dad_start() and nd6_dad_timer() to assertions. In particular, these functions can assume they are operating on tentative addresses. MFC after: 2 weeks	2016-09-24 21:40:24 +00:00
Mark Johnston	0bbf244e9f	Rename ndpr_refcnt to ndpr_addrcnt. This field counts derived addresses and is not a true refcount for prefix objects, so the previous name was misleading. MFC after: 1 week	2016-09-24 01:14:25 +00:00
Mark Johnston	2d12d25c6a	Reduce code duplication around NDP message handlers in icmp6_input(). No functional change intended. MFC after: 2 weeks	2016-09-20 18:08:17 +00:00
Kevin Lo	c3bef61e58	Remove the 4.3BSD compatible macro m_copy(), use m_copym() instead. Reviewed by: gnn Differential Revision: https://reviews.freebsd.org/D7878	2016-09-15 07:41:48 +00:00
Mike Karels	0f5687f2ae	Fix L2 caching for UDP over IPv6 ip6_output() was missing cache invalidation code analogous to ip_output.c. r304545 disabled L2 caching for UDP/IPv6 as a workaround. This change adds the missing cache invalidation code and reverts r304545. Reviewed by: gnn Approved by: gnn (mentor) Tested by: peter@, Mike Andrews MFC after: 3 weeks Differential Revision: https://reviews.freebsd.org/D7591	2016-08-24 00:52:30 +00:00
Bjoern A. Zeeb	77ecef378a	Remove the kernel option for IPSEC_FILTERTUNNEL, which was deprecated more than 7 years ago in favour of a sysctl in r192648.	2016-08-21 18:55:30 +00:00
Mike Karels	db727c1bd7	Disable L2 caching for UDP over IPv6 The ip6_output routine is missing L2 cache invalidation as done in ip_output. Even with that code, some problems with UDP over IPv6 have been reported. Disabling L2 cache for that problem works around the problem for now. PR: 211872 211926 Reviewed by: gnn Approved by: gnn (mentor) MFC after: immediate	2016-08-20 20:46:53 +00:00
Andrey V. Elsukov	d8caf56e9e	Add ipfw_nat64 module that implements stateless and stateful NAT64. The module works together with ipfw(4) and is implemented as its external action module. Stateless NAT64 registers external action with name nat64stl. This keyword should be used to create NAT64 instance and to address this instance in rules. Stateless NAT64 uses two lookup tables with mapped IPv4->IPv6 and IPv6->IPv4 addresses to perform translation. A configuration of instance should look like this: 1. Create lookup tables: # ipfw table T46 create type addr valtype ipv6 # ipfw table T64 create type addr valtype ipv4 2. Fill T46 and T64 tables. 3. Add rule to allow neighbor solicitation and advertisement: # ipfw add allow icmp6 from any to any icmp6types 135,136 4. Create NAT64 instance: # ipfw nat64stl NAT create table4 T46 table6 T64 5. Add rules that match the traffic: # ipfw add nat64stl NAT ip from any to table(T46) # ipfw add nat64stl NAT ip from table(T64) to 64:ff9b::/96 6. Configure DNS64 for IPv6 clients and add route to 64:ff9b::/96 via NAT64 host. Stateful NAT64 registers external action with name nat64lsn. The only one option required to create nat64lsn instance - prefix4. It defines the pool of IPv4 addresses used for translation. A configuration of instance should look like this: 1. Add rule to allow neighbor solicitation and advertisement: # ipfw add allow icmp6 from any to any icmp6types 135,136 2. Create NAT64 instance: # ipfw nat64lsn NAT create prefix4 A.B.C.D/28 3. Add rules that match the traffic: # ipfw add nat64lsn NAT ip from any to A.B.C.D/28 # ipfw add nat64lsn NAT ip6 from any to 64:ff9b::/96 4. Configure DNS64 for IPv6 clients and add route to 64:ff9b::/96 via NAT64 host. Obtained from: Yandex LLC Relnotes: yes Sponsored by: Yandex LLC Differential Revision: https://reviews.freebsd.org/D6434	2016-08-13 16:09:49 +00:00
Stephen J. Kiernan	0ce1624d0e	Move IPv4-specific jail functions to new file netinet/in_jail.c _prison_check_ip4 renamed to prison_check_ip4_locked Move IPv6-specific jail functions to new file netinet6/in6_jail.c _prison_check_ip6 renamed to prison_check_ip6_locked Add appropriate prototypes to sys/sys/jail.h Adjust kern_jail.c to call prison_check_ip4_locked and prison_check_ip6_locked accordingly. Add netinet/in_jail.c and netinet6/in6_jail.c to the list of files that need to be built when INET and INET6, respectively, are configured in the kernel configuration file. Reviewed by: jtl Approved by: sjg (mentor) Sponsored by: Juniper Networks, Inc. Differential Revision: https://reviews.freebsd.org/D6799	2016-08-09 02:16:21 +00:00
Andrey V. Elsukov	723758b7ce	Fix NULL pointer dereference. ro pointer can be NULL when IPSec consumes mbuf. PR: 211486 MFC after: 3 days	2016-08-02 12:18:06 +00:00
Andrew Gallatin	d4c22202e6	Rework IPV6 TCP path MTU discovery to match IPv4 - Re-write tcp_ctlinput6() to closely mimic the IPv4 tcp_ctlinput() - Now that tcp_ctlinput6() updates t_maxseg, we can allow ip6_output() to send TCP packets without looking at the tcp host cache for every single transmit. - Make the icmp6 code mimic the IPv4 code & avoid returning PRC_HOSTDEAD because it is so expensive. Without these changes in place, every TCP6 pmtu discovery or host unreachable ICMP resulted in a call to in6_pcbnotify() which walks the tcbinfo table with the write lock held. Because the tcbinfo table is shared between IPv4 and IPv6, this causes huge scalability issues on servers with lots of (~100K) TCP connections, to the point where even a small percent of IPv6 traffic had a disproportionate impact on overall throughput. Reviewed by: bz, rrs, ae (all earlier versions), lstewart (in Netflix's tree) Sponsored by: Netflix Differential Revision: https://reviews.freebsd.org/D7272	2016-08-01 17:02:21 +00:00
Stephen J. Kiernan	4ac21b4f09	Prepare for network stack as a module - Move cr_canseeinpcb to sys/netinet/in_prot.c in order to separate the INET and INET6-specific code from the rest of the prot code (It is only used by the network stack, so it makes sense for it to live with the other network stack code.) - Move cr_canseeinpcb prototype from sys/systm.h to netinet/in_systm.h - Rename cr_seeotheruids to cr_canseeotheruids and cr_seeothergids to cr_canseeothergids, make them non-static, and add prototypes (so they can be seen/called by in_prot.c functions.) - Remove sw_csum variable from ip6_forward in ip6_forward.c, as it is an unused variable. Reviewed by: gnn, jtl Approved by: sjg (mentor) Sponsored by: Juniper Networks, Inc. Differential Revision: https://reviews.freebsd.org/D2901	2016-07-27 20:34:09 +00:00
Mike Karels	ea17754c5a	Fix per-connection L2 caching in fast path r301217 re-added per-connection L2 caching from a previous change, but it omitted caching in the fast path. Add it. Reviewed By: gallatin Approved by: gnn (mentor) Differential Revision: https://reviews.freebsd.org/D7239	2016-07-22 02:11:49 +00:00

1804 Commits