freebsd-dev

Author	SHA1	Message	Date
Kristof Provost	2602284308	pf: Fix compilation warning with gcc While fixing the PF_ANEQ() macro I messed up the parentheses, leading to compilation warnings with gcc. Spotted by: ian Pointy Hat: kp	2015-10-25 18:09:03 +00:00
Kristof Provost	7d7624233a	PF_ANEQ() macro will in most situations return TRUE when comparing two identical IPv4 packets (when it should return FALSE). It happens because PF_ANEQ() doesn't stop if the first 32 bits of IPv4 packets are equal and starts to check the next 3*32 bits (as for an IPv6 packet). Those bits contain some garbage and as a result PF_ANEQ() wrongly returns TRUE. Fix: Check if the packet is of AF_INET type and, if it is, compare only the first 32 bits of data. PR: 204005 Submitted by: Miłosz Kaniewski	2015-10-25 13:14:53 +00:00
Ed Maste	40a02d00a5	if_tap: correct typo in sysctl description (Enably) Sponsored by: The FreeBSD Foundation	2015-10-21 19:56:16 +00:00
Alexander V. Chernikov	f221bcaa06	Remove several compat functions from pre-fib era.	2015-10-17 17:26:44 +00:00
Hiroki Sato	b7a581eaa6	Fix a panic when destroying a lagg interface. Differential Revision: https://reviews.freebsd.org/D3883	2015-10-16 01:16:01 +00:00
Kristof Provost	c110fc49da	pf: Fix TSO issues In certain configurations (mostly but not exclusively as a VM on Xen) pf produced packets with an invalid TCP checksum. The problem was that pf could only handle packets with a full checksum. The FreeBSD IP stack produces TCP packets with a pseudo-header checksum (only addresses, length and protocol). Certain network interfaces expect to see the pseudo-header checksum, so they end up producing packets with invalid checksums. To fix this stop calculating the full checksum and teach pf to only update TCP checksums if TSO is disabled or the change affects the pseudo-header checksum. PR: 154428, 193579, 198868 Reviewed by: sbruno MFC after: 1 week Relnotes: yes Sponsored by: RootBSD Differential Revision: https://reviews.freebsd.org/D3779	2015-10-14 16:21:41 +00:00
Hiroki Sato	023d10cbc7	Fix a bug that caused reinitialization failure of MAC addresses on the lagg interface when removing the primary port. PR: 201916 Differential Revision: https://reviews.freebsd.org/D3301	2015-10-07 06:32:34 +00:00
Marcelo Araujo	973532fc7d	Completely remove the fec aggregation protocol. The removal began with revision r271733. NOTE: This patch must never be merged to 10-Stable Reviewed by: glebius Approved by: bapt (mentor) Relnotes: Yes Sponsored by: EuroBSDCon Sweden. Differential Revision: D3786	2015-10-04 08:00:29 +00:00
Hiroki Sato	f1aaad0cd9	Add IFCAP_LINKSTATE support.	2015-10-03 09:15:23 +00:00
Andrey V. Elsukov	1a6fb597b0	Always detach encap handler when reconfiguring tunnel. Reported by: hrs MFC after: 1 week	2015-10-03 03:57:58 +00:00
Alexander V. Chernikov	1558cb2448	Eliminate nd6_nud_hint() and its TCP bindings. Initially function was introduced in r53541 (KAME initial commit) to "provide hints from upper layer protocols that indicate a connection is making "forward progress"" (quote from RFC 2461 7.3.1 Reachability Confirmation). However, it was converted to do nothing (e.g. just return) in r122922 (tcp_hostcache implementation) back in 2003. Some defines were moved to tcp_var.h in r169541. Then, it was broken (for non-corner cases) by r186119 (L2<>L3 split) in 2008 (NULL ifp in nd6_lookup). So, right now this code is broken and has no "real" base users. Differential Revision: https://reviews.freebsd.org/D3699	2015-09-27 05:29:34 +00:00
Alexander V. Chernikov	1fe201c322	Simplify the way of attaching IPv6 link-layer header. Problem description: How do we currently perform layer 2 resolution and header imposition: For IPv4 we have the following chain: ip_output() -> (ether\|atm\|whatever)_output() -> arpresolve() Lookup is done in proper place (link-layer output routine) and it is possible to provide cached lle data. For IPv6 situation is more complex: ip6_output() -> nd6_output() -> nd6_output_ifp() -> (whatever)_output() -> nd6_storelladdr() We have ip6_output() which calls nd6_output() instead of link output routine. nd6_output() does the following: * checks if lle exists, creates it if needed (similar to arpresolve()) * performs lle state transitions (similar to arpresolve()) * calls nd6_output_ifp() which pushes packets to link output routine along with running SeND/MAC hooks regardless of lle state (e.g. works as run-hooks placeholder). After that, iface output routine like ether_output() calls nd6_storelladdr() which performs lle lookup once again. As a result, we perform lookup twice for each outgoing packet for most types of interfaces. We also need to maintain runtime-checked table of 'nd6-free' interfaces (see nd6_need_cache()). Fix this behavior by eliminating first ND lookup. To be more specific: * make all nd6_output() consumers use nd6_output_ifp() instead * rename nd6_output[_slow]() to nd6_resolve_[slow]() * convert nd6_resolve() and nd6_resolve_slow() to arpresolve() semantics, e.g. copy L2 address to buffer instead of pushing packet towards lower layers * Make all nd6_storelladdr() users use nd6_resolve() * eliminate nd6_storelladdr() The resulting callchain is the following: ip6_output() -> nd6_output_ifp() -> (whatever)_output() -> nd6_resolve() Error handling: Currently sending packet to non-existing la results in ip6_<output\|forward> -> nd6_output() -> nd6_output_lle() which returns 0. In new scenario packet is propagated to <ether\|whatever>_output() -> nd6_resolve() which will return EWOULDBLOCK, and that result will be converted to 0. (And EWOULDBLOCK is actually used by IB/TOE code). Sponsored by: Yandex LLC Differential Revision: https://reviews.freebsd.org/D1469	2015-09-16 14:26:28 +00:00
Andrey V. Elsukov	b71bed24a6	Use KASSERT for some checks, that are late to do. Discussed with: melifaro, glebius	2015-09-16 13:17:00 +00:00
Oleg Bulyzhin	3f70ebbf05	Remove superfluous m_freem(). MFC after: 1 month	2015-09-16 10:07:45 +00:00
Alexander V. Chernikov	59c180c35c	Unify loopback route switching: * prepare gateway before insertion * use RTM_CHANGE instead of explicit find/change route * Remove fib argument from ifa_switch_loopback_route added in r264887: if old ifp fib differs from the new one, then the caller is doing something wrong * Make ifa_*_loopback_route call single ifa_maintain_loopback_route().	2015-09-16 06:23:15 +00:00
Alexander V. Chernikov	d3cdb71655	* Require explicit lle unlink prior to calling llentry_delete(). This one slightly decreases time of holding afdata wlock. * While here, make nd6_free() return void. No one has used its return value since r186119.	2015-09-15 06:48:19 +00:00
Eric van Gyzen	17a036563d	Fix the handling of IPv6 On-Link Redirects. On receipt of a redirect message, install an interface route for the redirected destination. On removal of the corresponding Neighbor Cache entry, remove the interface route. This requires changes in rtredirect_fib() to cope with an AF_LINK address for the gateway and with the absence of RTF_GATEWAY. This fixes the "Redirected On-Link" test cases in the Tahi IPv6 Ready Logo Phase 2 test suite. Unrelated to the above, fix a recursion on the radix node head lock triggered by the Tahi Redirected to Alternate Router test cases. When I first wrote this patch in October 2012, all Section 2 (Neighbor Discovery) test cases passed on 10-CURRENT, 9-STABLE, and 8-STABLE. cem@ recently rebased the 10.x patch onto head and reported that it passes Tahi. (Thanks!) These other test cases also passed in 2012: * the RTF_MODIFIED case, with IPv4 and IPv6 (using a RTF_HOST\|RTF_GATEWAY route for the destination) * the redirected-to-self case, with IPv4 and IPv6 * a valid IPv4 redirect All testing in 2012 was done with WITNESS and INVARIANTS. Tested by: EMC / Isilon Storage Division via Conrad Meyer (cem) in 2015, Mark Kelley <mark_kelley@dell.com> in 2012, TC Telkamp <terence_telkamp@dell.com> in 2012 PR: 152791 Reviewed by: melifaro (current rev), bz (earlier rev) Approved by: kib (mentor) MFC after: 1 month Relnotes: yes Sponsored by: Dell Inc. Differential Revision: https://reviews.freebsd.org/D3602	2015-09-14 19:17:25 +00:00
Alexander V. Chernikov	3e7a2321e3	* Do more fine-grained locking: call eventhandlers/free_entry without holding afdata wlock * convert per-af delete_address callback to global lltable_delete_entry() and more low-level "delete this lle" per-af callback * fix some bugs/inconsistencies in IPv4/IPv6 ifscrub procedures Sponsored by: Yandex LLC Differential Revision: https://reviews.freebsd.org/D3573	2015-09-14 16:48:19 +00:00
Hans Petter Selasky	d76d40126e	Update TSO limits to include all headers. To make driver programming easier the TSO limits are changed to reflect the values used in the BUSDMA tag a network adapter driver is using. The TCP/IP network stack will subtract space for all linklevel and protocol level headers and ensure that the full mbuf chain passed to the network adapter fits within the given limits. Implementation notes: If a network adapter driver needs to fixup the first mbuf in order to support VLAN tag insertion, the size of the VLAN tag should be subtracted from the TSO limit. Else not. Network adapters which typically inline the complete header mbuf could technically transmit one more segment. This patch does not implement a mechanism to recover the last segment for data transmission. It is believed when sufficiently large mbuf clusters are used, the segment limit will not be reached and recovering the last segment will not have any effect. The current TSO algorithm tries to send MTU-sized packets, where the MTU typically is 1500 bytes, which gives 1448 bytes of TCP data payload per packet for IPv4. That means if the TSO length limitation is set to 65536 bytes, there will be a data payload remainder of (65536 - 1500) mod 1448 bytes which is equal to 324 bytes. Trying to recover total TSO length due to inlining mbuf header data will not have any effect, because adding or removing the ETH/IP/TCP headers to or from 324 bytes will not cause more or less TCP payload to be TSO'ed. Existing network adapter limits will be updated separately. Differential Revision: https://reviews.freebsd.org/D3458 Reviewed by: rmacklem MFC after: 2 weeks	2015-09-14 08:36:22 +00:00
Hiroki Sato	b1c250ff3f	- Remove GIF_{SEND,ACCEPT}_REVETHIP. - Simplify EADDRNOTAVAIL and EAFNOSUPPORT conditions. MFC after: 3 days	2015-09-10 05:59:39 +00:00
Alexander V. Chernikov	441f9243df	Constantify lookup key in ifa_ifwith* functions. Some places in our network stack already have const arguments (like if_output() routines and LLE functions). Code using ifa_ifwith (and similar functions) along with LLE/_output functions is currently bound to use tricks like __DECONST(). Provide a cleaner way by making sockaddr lookup key really constant. MFC after: 2 weeks Differential Revision: https://reviews.freebsd.org/D3464	2015-09-05 05:33:20 +00:00
Hiroki Sato	18e199ad72	Fix a panic which was reproducible by an infinite loop of "ifconfig epair0 create && ifconfig epair0a destroy". This was caused by an uninitialized function pointer in softc->media.	2015-09-02 16:30:45 +00:00
Alexander V. Chernikov	3b0fd911fa	Simplify lla_rt_output()/nd6_add_ifa_lle() by setting lle state in alloc handler, based on flags.	2015-08-31 05:03:36 +00:00
Adrian Chadd	8f1111cf0b	Remove now unused (and #if 0'ed out) headers.	2015-08-29 04:33:31 +00:00
Adrian Chadd	e5562eb934	Replace the printf()s with optional rate limited debugging for RSS. Submitted by: Tiwei Bie <btw@mail.ustc.edu.cn> Differential Revision: https://reviews.freebsd.org/D3471	2015-08-28 05:58:16 +00:00
Kristof Provost	64b3b4d611	pf: Remove support for 'scrub fragment crop\|drop-ovl' The crop/drop-ovl fragment scrub modes are not very useful and likely to confuse users into making poor choices. It's also a fairly large amount of complex code, so just remove the support altogether. Users who have 'scrub fragment crop\|drop-ovl' in their pf configuration will be implicitly converted to 'scrub fragment reassemble'. Reviewed by: gnn, eri Relnotes: yes Differential Revision: https://reviews.freebsd.org/D3466	2015-08-27 21:27:47 +00:00
Luiz Otavio O Souza	c614d0a443	Fix the spelling of eri's name. Pointy hat to: loos MFC with: r287009	2015-08-24 23:40:36 +00:00
Luiz Otavio O Souza	0a70aaf8f5	Add ALTQ(9) support for the CoDel algorithm. CoDel is a parameterless queue discipline that handles variable bandwidth and RTT. It can be used as the single queue discipline on an interface or as a sub discipline of existing queue disciplines such as PRIQ, CBQ, HFSC, FAIRQ. Differential Revision: https://reviews.freebsd.org/D3272 Reviewed by: rpaulo, gnn (previous version) Obtained from: pfSense Sponsored by: Rubicon Communications (Netgate)	2015-08-21 22:02:22 +00:00
Alexander V. Chernikov	5a2555160f	* Split allocation and table linking for lle's. Before that, the logic besides lle_create() was the following: return existing if found, create if not. This behaviour was error-prone since we had to deal with 'sudden' static<>dynamic lle changes. This commit fixes bunch of different issues like: - refcount leak when lle is converted to static. Simple check case: console 1: while true; do for i in `arp -an\|awk '$4~/incomp/{print$2}'\|tr -d '()'`; do arp -s $i 00:22:44:66:88:00 ; arp -d $i; done; done console 2: ping -f any-dead-host-in-L2 console 3: # watch for memory consumption: vmstat -m \| awk '$1~/lltable/{print$2}' - possible problems in arptimer() / nd6_timer() when dropping/reacquiring lock. New logic explicitly handles use-or-create cases in every lla_create user. Basically, most of the changes are purely mechanical. However, we explicitly avoid using existing lle's for interface/static LLE records. * While here, call lle_event handlers on all real table lle change. * Create lltable_free_entry() calling existing per-lltable lle_free_t callback for entry deletion	2015-08-20 12:05:17 +00:00
Hiren Panchasara	0e02b43a07	Make LAG LACP fast timeout tunable through IOCTL. Differential Revision: D3300 Submitted by: LN Sundararajan <lakshmi.n at msystechnologies> Reviewed by: wblock, smh, gnn, hiren, rpokala at panasas MFC after: 2 weeks Sponsored by: Panasas	2015-08-12 20:21:04 +00:00
Alexander V. Chernikov	0447c1367a	Use single 'lle_timer' callout in lltable instead of two different names of the same timer.	2015-08-11 12:38:54 +00:00
Alexander V. Chernikov	314294de5c	Store addresses instead of sockaddrs inside llentry. This permits us having all (not fully true yet) the info needed in lookup process in the first 64 bytes of 'struct llentry'. struct llentry layout: BEFORE: [rwlock .. state .. state .. MAC ] (lle+1) [sockaddr_in[6]] AFTER [ in[6]_addr MAC .. state .. rwlock ] Currently, address part of struct llentry has only 16 bytes for the key. However, lltable does not restrict any custom lltable consumers with long keys use the previous approach (store key at (lle+1)). Sponsored by: Yandex LLC	2015-08-11 09:26:11 +00:00
Alexander V. Chernikov	41cb42a633	MFP r276712. * Split lltable_init() into lltable_allocate_htbl() (alloc hash table with default callbacks) and lltable_link() ( links any lltable to the list). * Switch from LLTBL_HASHTBL_SIZE to per-lltable hash size field. * Move lltable setup to separate functions in in[6]_domifattach.	2015-08-11 05:51:00 +00:00
Alexander V. Chernikov	2caee4be35	Rename rt_foreach_fib() to rt_foreach_fib_walk(). Suggested by: julian	2015-08-10 20:50:31 +00:00
Alexander V. Chernikov	11cdad9873	Partially merge r274887,r275334,r275577,r275578,r275586 to minimize differences between projects/routing and HEAD. This commit tries to keep code logic the same while changing underlying code to use unified callbacks. * Add llt_foreach_entry method to traverse all entries in given llt * Add llt_dump_entry method to export particular lle entry in sysctl/rtsock format (code is not indented properly to minimize diff). Will be fixed in the next commits. * Add llt_link_entry/llt_unlink_entry methods to link/unlink particular lle. * Add llt_fill_sa_entry method to export address in the lle to sockaddr format. * Add llt_hash method to use in generic hash table support code. * Add llt_free_entry method which is used in llt_prefix_free code. * Prepare for fine-grained locking by separating lle unlink and deletion in lltable_free() and lltable_prefix_free(). * Provide lltable_get<ifp\|af>() functions to reduce direct 'struct lltable' access by external callers. * Remove @llt argument from lle_free() lle callback since it was unused. * Temporarily add L3_CADDR() macro for 'const' sockaddr typecasting. * Switch to per-af hashing code. * Rename LLE_FREE_LOCKED() callback from in[6]_lltable_free() to in_[6]lltable_destroy() to avoid clashing with llt_free_entry() method. Update description from these functions. * Use unified lltable_free_entry() function instead of per-af one. Reviewed by: ae	2015-08-10 12:03:59 +00:00
Alexander V. Chernikov	4bdf0b6a9a	MFP r274295: * Move interface route cleanup to route.c:rt_flushifroutes() * Convert most of "for (fibnum = 0; fibnum < rt_numfibs; fibnum++)" users to use new rt_foreach_fib() instead of hand-rolling cycles.	2015-08-08 18:14:59 +00:00
Alexander V. Chernikov	e362cf0e9f	MFP r274553: * Move lle creation/deletion from lla_lookup to separate functions: lla_lookup(LLE_CREATE) -> lla_create lla_lookup(LLE_DELETE) -> lla_delete lla_create now returns with LLE_EXCLUSIVE lock for lle. * Provide typedefs for new/existing lltable callbacks. Reviewed by: ae	2015-08-08 17:48:54 +00:00
Luiz Otavio O Souza	9224217213	Remove the mtx_sleep() from the kqueue f_event filter. The filter is called from the network hot path and must not sleep. The filter runs with the descriptor lock held and does not manipulate the buffers, so it is not necessary to sleep when the hold buffer is in use. Just ignore the hold buffer contents when it is being copied to user space (when hold buffer in use is set). This fixes the "Sleeping thread owns a non-sleepable lock" panic when the userland thread is too busy reading the packets from bpf(4). PR: 200323 MFC after: 2 weeks Sponsored by: Rubicon Communications (Netgate)	2015-08-03 22:14:45 +00:00
Luiz Otavio O Souza	98fa5d858c	Add a KASSERT() to make sure we won't rotate the buffers twice (rotate the buffers while the hold buffer is in use). Suggested by: ed, ghelmer MFC with: r286142	2015-08-03 18:22:31 +00:00
John-Mark Gurney	bba6880eab	looks like all archs either have clang or cdefs included before.. drop this include as unnecessary.. Requested by: bde	2015-08-02 21:33:40 +00:00
John-Mark Gurney	70e47040b0	convert to C11's _Static_assert, and pull in sys/cdefs.h for compatibility w/ older non-C11 compilers... passed make tinderbox.. Suggested by: imp	2015-08-02 00:15:52 +00:00
Luiz Otavio O Souza	f87e372ef2	Remove two unnecessary sleeps from the hot path in bpf(4). The first one never triggers because bpf_canfreebuf() can only be true for zero-copy buffers and zero-copy buffers are not read with read(2). The second also never triggers, because we check the free buffer before calling ROTATE_BUFFERS(). If the hold buffer is in use the free buffer will be NULL and there is nothing else to do besides drop the packet. If the free buffer isn't NULL the hold buffer _is_ free and it is safe to rotate the buffers. Update the comment in ROTATE_BUFFERS macro to match the logic described here. While here fix a few typos in comments. MFC after: 2 weeks Sponsored by: Rubicon Communications (Netgate)	2015-07-31 21:43:27 +00:00
Luiz Otavio O Souza	faa693cdbe	Remove the sleep from the buffer allocation routine. The buffer must be allocated (or even changed) before the interface is set and thus, there is no need to verify if the buffer is in use. MFC after: 2 weeks Sponsored by: Rubicon Communications (Netgate)	2015-07-31 20:25:54 +00:00
Luiz Otavio O Souza	4f42daa4a3	Do not allocate the buffers at opening of the descriptor, because once the buffer is allocated we are committed to a particular buffer method (BPF_BUFMODE_BUFFER in this case). If we are using zero-copy buffers, the userland program must register its buffers before setting the interface. If we are using kernel memory buffers, we can allocate the buffer at the time that the interface is being set. This fix allows the usage of BIOCSETBUFMODE after r235746. Update the comments to reflect the recent changes. MFC after: 2 weeks Sponsored by: Rubicon Communications (Netgate)	2015-07-31 20:02:12 +00:00
Andrey V. Elsukov	926381e108	Ansify if_stf.c	2015-07-31 09:04:22 +00:00
John-Mark Gurney	af024d3b23	temporarily fix build.. This isn't the final fix, and testing is still on going, but it has passed world for mips and powerpc... I know this has an extra semicolon, but this is the patch that is tested... Looks like better fix is to use _Static_assert...	2015-07-31 07:48:08 +00:00
John-Mark Gurney	817c7ed900	Clean up this header file... use CTASSERTs now that we have them... Replace a draft w/ RFC that's over 10 years old. Note that _AALG and _EALG do not need to match what the IKE daemons think they should be.. This is part of the KABI... I decided to renumber AESCTR, but since we've never had working AESCTR mode, I'm not really breaking anything.. and it shortens a loop by quite a bit.. remove SKIPJACK IPsec support... SKIPJACK never made it out of draft (in 1999), only has 80bit key, NIST recommended it stop being used after 2010, and setkey nor any of the IKE daemons I checked supported it... jmgurney/ipsecgcm: a357a33, c75808b, e008669, b27b6d6 Reviewed by: gnn (earlier version)	2015-07-31 00:23:21 +00:00
Andrey V. Elsukov	a5965d1513	Build if_stf(4) module only when both INET and INET6 support are enabled.	2015-07-30 10:26:43 +00:00
Luiz Otavio O Souza	8b15f615e0	Follow r256586 and rename the kernel version of the Free() macro to R_Free(). This matches the other macros and reduces the chances to clash with other headers. This also fixes the build of radix.c outside of the kernel environment. Reviewed by: glebius	2015-07-30 02:09:03 +00:00
Andrey V. Elsukov	10a0e0bf0a	Eliminate the use of m_copydata() in gif_encapcheck(). ip_encap already has inspected mbuf's data, at least an IP header. And it is safe to use mtod() and do direct access to needed fields. Add M_ASSERTPKTHDR() to gif_encapcheck(), since the code expects that mbuf has a packet header. Move the code from gif_validate[46] into in[6]_gif_encapcheck(), also remove "martian filters" checks. According to RFC 4213 it is enough to verify that the source address is the address of the encapsulator, as configured on the decapsulator. Reviewed by: melifaro Obtained from: Yandex LLC Sponsored by: Yandex LLC	2015-07-29 14:07:43 +00:00
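
Note on 7d7624233a (PF_ANEQ): the behaviour described in that commit message can be illustrated with a minimal sketch. This is not the pf macro itself. The struct addr128 type and both function names are hypothetical stand-ins, assuming only what the message states: addresses occupy four 32-bit words, and for AF_INET only the first word is meaningful.

```c
#include <stdbool.h>
#include <stdint.h>
#include <sys/socket.h>	/* AF_INET, AF_INET6 */

/* Hypothetical stand-in for pf's address storage: 128 bits as four words. */
struct addr128 {
	uint32_t w[4];
};

/*
 * Pre-fix behaviour as described in the commit message: all four words
 * are compared, even when only the first one holds an IPv4 address.
 */
static bool
addr_neq_all_words(const struct addr128 *a, const struct addr128 *b)
{
	return (a->w[0] != b->w[0] || a->w[1] != b->w[1] ||
	    a->w[2] != b->w[2] || a->w[3] != b->w[3]);
}

/*
 * Post-fix behaviour as described: for AF_INET compare only the first
 * 32 bits; for any other family compare the full 128 bits.
 */
static bool
addr_neq(const struct addr128 *a, const struct addr128 *b, int af)
{
	if (af == AF_INET)
		return (a->w[0] != b->w[0]);
	return (addr_neq_all_words(a, b));
}
```

With two IPv4 addresses whose first word matches but whose remaining words hold different leftover data, addr_neq_all_words() reports "not equal" while addr_neq(..., AF_INET) reports "equal", which is the difference the commit message describes.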

3469 Commits
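
Note on d76d40126e (TSO limits): the remainder quoted in that commit message can be checked with a few lines of C. The function name and parameter names below are hypothetical. The formula is the one given in the message, (TSO limit - MTU) mod payload per packet.

```c
#include <stdint.h>

/*
 * Payload remainder as computed in the commit message:
 * (tso_limit - mtu) mod payload_per_pkt.
 * Assumes tso_limit >= mtu and payload_per_pkt > 0.
 */
static uint32_t
tso_payload_remainder(uint32_t tso_limit, uint32_t mtu,
    uint32_t payload_per_pkt)
{
	return ((tso_limit - mtu) % payload_per_pkt);
}
```

Calling tso_payload_remainder(65536, 1500, 1448) gives 324, matching the figure in the message: 65536 - 1500 = 64036, and 64036 - 44 * 1448 = 324.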