freebsd-skq

Author	SHA1	Message	Date
ae	1037184c10	Replace rw_init/rw_destroy with corresponding macros. Obtained from: Yandex LLC	2016-10-06 14:42:06 +00:00
bz	7a1c0b1ad1	Get closer to a VIMAGE network stack teardown from top to bottom rather than removing the network interfaces first. This change is rather larger and convoluted as the ordering requirements cannot be separated. Move the pfil(9) framework to SI_SUB_PROTO_PFIL, move Firewalls and related modules to their own SI_SUB_PROTO_FIREWALL. Move initialization of "physical" interfaces to SI_SUB_DRIVERS, move virtual (cloned) interfaces to SI_SUB_PSEUDO. Move Multicast to SI_SUB_PROTO_MC. Re-work parts of multicast initialisation and teardown, not taking the huge amount of memory into account if used as a module yet. For interface teardown we try to do as many of them as we can on SI_SUB_INIT_IF, but for some this makes no sense, e.g., when tunnelling over a higher layer protocol such as IP. In that case the interface has to go along (or before) the higher layer protocol is shutdown. Kernel hhooks need to go last on teardown as they may be used at various higher layers and we cannot remove them before we cleaned up the higher layers. For interface teardown there are multiple paths: (a) a cloned interface is destroyed (inside a VIMAGE or in the base system), (b) any interface is moved from a virtual network stack to a different network stack ("vmove"), or (c) a virtual network stack is being shut down. All code paths go through if_detach_internal() where we, depending on the vmove flag or the vnet state, make a decision on how much to shut down; in case we are destroying a VNET the individual protocol layers will cleanup their own parts thus we cannot do so again for each interface as we end up with, e.g., double-frees, destroying locks twice or acquiring already destroyed locks. When calling into protocol cleanups we equally have to tell them whether they need to detach upper layer protocols ("ulp") or not (e.g., in6_ifdetach()). Provide or enahnce helper functions to do proper cleanup at a protocol rather than at an interface level. Approved by: re (hrs) Obtained from: projects/vnet Reviewed by: gnn, jhb Sponsored by: The FreeBSD Foundation MFC after: 2 weeks Differential Revision: https://reviews.freebsd.org/D6747	2016-06-21 13:48:49 +00:00
bz	8757b6342b	Provide a public interface to rt_flushifroutes which takes the address family as an argument as well. This will be used to cleanup individual protocols during VNET teardown. Obtained from: projects/vnet Sponsored by: The FreeBSD Foundation	2016-06-06 12:49:47 +00:00
gnn	d75e0c471e	This change re-adds L2 caching for TCP and UDP, as originally added in D4306 but removed due to other changes in the system. Restore the llentry pointer to the "struct route", and use it to cache the L2 lookup (ARP or ND6) as appropriate. Submitted by: Mike Karels Differential Revision: https://reviews.freebsd.org/D6262	2016-06-02 17:51:29 +00:00
bz	a460d01567	Fix compile errors after r297225: - properly V_irtualise variable access unbreaking VIMAGE kernels. - remove the volatile from the function return type to make architecture using gcc happy [-Wreturn-type] "type qualifiers ignored on function return type" I am not entirely happy with this solution putting the u_int there but it will do for now.	2016-03-24 11:40:10 +00:00
gnn	c3d5404bbe	FreeBSD previously provided route caching for TCP (and UDP). Re-add route caching for TCP, with some improvements. In particular, invalidate the route cache if a new route is added, which might be a better match. The cache is automatically invalidated if the old route is deleted. Submitted by: Mike Karels Reviewed by: gnn Differential Revision: https://reviews.freebsd.org/D4306	2016-03-24 07:54:56 +00:00
dteske	51b30e8967	Merge SVN r295220 (bz) from projects/vnet/ Fix a panic that occurs when a vnet interface is unavailable at the time the vnet jail referencing said interface is stopped. Sponsored by: FIS Global, Inc.	2016-02-11 17:07:19 +00:00
melifaro	196af28820	Fix flowtable part missed in r294706.	2016-01-25 09:31:32 +00:00
melifaro	23582454c7	MFP r287070,r287073: split radix implementation and route table structure. There are number of radix consumers in kernel land (pf,ipfw,nfs,route) with different requirements. In fact, first 3 don't have _any_ requirements and first 2 does not use radix locking. On the other hand, routing structure do have these requirements (rnh_gen, multipath, custom to-be-added control plane functions, different locking). Additionally, radix should not known anything about its consumers internals. So, radix code now uses tiny 'struct radix_head' structure along with internal 'struct radix_mask_head' instead of 'struct radix_node_head'. Existing consumers still uses the same 'struct radix_node_head' with slight modifications: they need to pass pointer to (embedded) 'struct radix_head' to all radix callbacks. Routing code now uses new 'struct rib_head' with different locking macro: RADIX_NODE_HEAD prefix was renamed to RIB_ (which stands for routing information base). New net/route_var.h header was added to hold routing subsystem internal data. 'struct rib_head' was placed there. 'struct rtentry' will also be moved there soon.	2016-01-25 06:33:15 +00:00
melifaro	0f8d03361b	Fix panic in IP redirect. Panic was introduced in r293466. Found by: Yamagi Burmeister <lists at yamagi.org>>	2016-01-14 16:31:00 +00:00
melifaro	6342484c52	Remove now-unused wrappers for various routing functions.	2016-01-14 08:54:44 +00:00
melifaro	7fd2ccf67e	Remove RTF_RNH_LOCKED support from rtalloc1_fib(). Last caller using it was eliminated in r293471. Sponsored by: Yandex LLC	2016-01-13 14:32:48 +00:00
melifaro	334ff06bce	(Temporarily) remove route_redirect_event eventhandler. Such handler should pass different set of variables, instead of directly providing 2 locked route entries. Given that it hasn't been really used since at least 2012, remove current code. Will re-add it after finishing most major routing-related changes. Discussed with: np	2016-01-09 06:26:40 +00:00
melifaro	ff6e5f3297	Please Coverity by removing unneccessary check (rt_key() is always set). Coverity CID: 1347797	2016-01-09 05:39:06 +00:00
melifaro	108f5b92be	Do more fine-grained locking in rtrequest1_fib(). Last consumer using RTF_RNH_LOCKED flag was eliminated in r291643. Restrict passing RTF_RNH_LOCKED to rtrequest1_fib() and do better locking for RTM_ADD / RTM_DELETE cases.	2016-01-08 16:25:11 +00:00
melifaro	31d78f6810	Add rib_lookup_info() to provide API for retrieving individual route entries data in unified format. There are control plane functions that require information other than just next-hop data (e.g. individual rtentry fields like flags or prefix/mask). Given that the goal is to avoid rte reference/refcounting, re-use rt_addrinfo structure to store most rte fields. If caller wants to retrieve key/mask or gateway (which are sockaddrs and are allocated separately), it needs to provide sufficient-sized sockaddrs structures w/ ther pointers saved in passed rt_addrinfo. Convert: * lltable new records checks (in_lltable_rtcheck(), nd6_is_new_addr_neighbor(). * rtsock pre-add/change route check. * IPv6 NS ND-proxy check (RADIX_MPATH code was eliminated because 1) we don't support RTF_ANNOUNCE ND-proxy for networks and there should not be multiple host routes for such hosts 2) if we have multiple routes we should inspect them (which is not done). 3) the entire idea of abusing KRT as storage for ND proxy seems odd. Userland programs should be used for that purpose).	2016-01-04 15:03:20 +00:00
melifaro	a10bfcd45d	Fix PINNED routes handling. Before r291643, adding new interface prefix had the following logic: try_add: EEXIST && (PINNED) { try_del(w/o PINNED flag) if (OK) try_add(PINNED) } In r291643, deletion was performed w/ PINNED flag held which leaded to new interface prefixes (like ::1) overriding older ones. Fix this by requesting deletion w/o RTF_PINNED. PR: kern/205285 Submitted by: Fabian Keil <fk at fabiankeil.de>	2015-12-13 16:37:01 +00:00
melifaro	b97e91d3ed	Move RTF_PINNED handling to generic route code. This eliminates last RTF_RNH_LOCKED rtrequest1_fib() user.	2015-12-02 08:17:31 +00:00
ngie	f8a8bc03e7	Fix LINT-NOIP kernels after r291467 rn is only used if INET or INET6 are defined Sponsored by: EMC / Isilon Storage Division	2015-12-01 05:59:53 +00:00
melifaro	0dd8ba37e1	Move flowtable rte checks to separate function.	2015-11-30 05:59:22 +00:00
melifaro	e198456483	Add new rt_foreach_fib_walk_del() function for deleting route entries by filter function instead of picking into routing table details in each consumer. Remove now-unused rt_expunge() (eliminating last external RTF_RNH_LOCKED user). This simplifies future nexthops/mulitipath changes and rtrequest1_fib() locking refactoring. Actual changes: Add "rt_chain" field to permit rte grouping while doing batched delete from routing table (thus growing rte 200->208 on amd64). Add "rti_filter" / "rti_filterdata" / "rti_spare" fields to rt_addrinfo to pass filter function to various routing subsystems in standard way. Convert all rt_expunge() customers to new rt_addinfo-based api and eliminate rt_expunge().	2015-11-30 05:51:14 +00:00
melifaro	4322426d2c	Pass provided af instead of AF_UNSPEC to setwa_f callback.	2015-11-14 18:16:17 +00:00
bdrewery	722eb97ab8	Avoid passing an uninitialized 'i'. Currently nothing was depending on it anyhow. Coverity CID: 1331562	2015-10-29 18:58:18 +00:00
melifaro	45d403ed23	Remove several compat functions from pre-fib era.	2015-10-17 17:26:44 +00:00
vangyzen	75d72d4482	Fix the handling of IPv6 On-Link Redirects. On receipt of a redirect message, install an interface route for the redirected destination. On removal of the corresponding Neighbor Cache entry, remove the interface route. This requires changes in rtredirect_fib() to cope with an AF_LINK address for the gateway and with the absence of RTF_GATEWAY. This fixes the "Redirected On-Link" test cases in the Tahi IPv6 Ready Logo Phase 2 test suite. Unrelated to the above, fix a recursion on the radix node head lock triggered by the Tahi Redirected to Alternate Router test cases. When I first wrote this patch in October 2012, all Section 2 (Neighbor Discovery) test cases passed on 10-CURRENT, 9-STABLE, and 8-STABLE. cem@ recently rebased the 10.x patch onto head and reported that it passes Tahi. (Thanks!) These other test cases also passed in 2012: * the RTF_MODIFIED case, with IPv4 and IPv6 (using a RTF_HOST\|RTF_GATEWAY route for the destination) * the redirected-to-self case, with IPv4 and IPv6 * a valid IPv4 redirect All testing in 2012 was done with WITNESS and INVARIANTS. Tested by: EMC / Isilon Storage Division via Conrad Meyer (cem) in 2015, Mark Kelley <mark_kelley@dell.com> in 2012, TC Telkamp <terence_telkamp@dell.com> in 2012 PR: 152791 Reviewed by: melifaro (current rev), bz (earlier rev) Approved by: kib (mentor) MFC after: 1 month Relnotes: yes Sponsored by: Dell Inc. Differential Revision: https://reviews.freebsd.org/D3602	2015-09-14 19:17:25 +00:00
melifaro	bbab608243	Constantify lookup key in ifa_ifwith* functions. Some places in our network stack already have const arguments (like if_output() routines and LLE functions). Code using ifa_ifwith (and similar functins) along with LLE/_output functions is currently bound to use tricks like __DECONST(). Provide a cleaner way by making sockaddr lookup key really constant. MFC after: 2 weeks Differential Revision: https://reviews.freebsd.org/D3464	2015-09-05 05:33:20 +00:00
melifaro	ba06112c24	Rename rt_foreach_fib() to rt_foreach_fib_walk(). Suggested by: julian	2015-08-10 20:50:31 +00:00
melifaro	33c52eed18	MFP r274295: * Move interface route cleanup to route.c:rt_flushifroutes() * Convert most of "for (fibnum = 0; fibnum < rt_numfibs; fibnum++)" users to use new rt_foreach_fib() instead of hand-rolling cycles.	2015-08-08 18:14:59 +00:00
loos	79c1792263	Follow r256586 and rename the kernel version of the Free() macro to R_Free(). This matches the other macros and reduces the chances to clash with other headers. This also fixes the build of radix.c outside of the kernel environment. Reviewed by: glebius	2015-07-30 02:09:03 +00:00
araujo	5ca111b13f	Remove duplicate header entry.	2015-04-16 02:44:37 +00:00
melifaro	f8d64c469a	Finish r274175: do control plane MTU tracking. Update route MTU in case of ifnet MTU change. Add new RTF_FIXEDMTU to track explicitly specified MTU. Old behavior: ifconfig em0 mtu 1500->9000 -> all routes traversing em0 do not change MTU. User has to manually update all routes. ifconfig em0 mtu 9000->1500 -> all routes traversing em0 do not change MTU. However, if ip[6]_output finds route with rt_mtu > interface mtu, rt_mtu gets updated. New behavior: ifconfig em0 mtu 1500->9000 -> all interface routes in all fibs gets updated with new MTU unless RTF_FIXEDMTU flag set on them. ifconfig em0 mtu 9000->1500 -> all routes in all fibs gets updated with new MTU unless RTF_FIXEDMTU flag set on them AND rt_mtu is less than ifp mtu. route add ... -mtu XXX automatically sets RTF_FIXEDMTU flag. route change .. -mtu 0 automatically removes RTF_FIXEDMTU flag. PR: 194238 MFC after: 1 month CR: D1125	2014-11-17 01:05:29 +00:00
melifaro	6eee43ec5e	Revert r274585: rte lock is properly destroyed in uma dtor callback. Pointed by: glebius	2014-11-16 18:15:23 +00:00
melifaro	56eff7ebb0	Make witness happy: destroy rte lock before free. MFC after: 2 weeks	2014-11-16 14:56:31 +00:00
melifaro	dfd9a3788d	Fix build. Pointy hat to: melifaro	2014-11-06 17:50:35 +00:00
melifaro	5c54c0c246	Finish r274118: remove useless fields from struct domain. Sponsored by: Yandex LLC	2014-11-06 14:39:04 +00:00
melifaro	11af63037f	Make checks for rt_mtu generic: Some virtual if drivers has (ab)used ifa ifa_rtrequest hook to enforce route MTU to be not bigger that interface MTU. While ifa_rtrequest hooking might be an option in some situation, it is not feasible to do MTU checks there: generic (or per-domain) routing code is perfectly capable of doing this. We currrently have 3 places where MTU is altered: 1) route addition. In this case domain overrides radix _addroute callback (in[6]_addroute) and all necessary checks/fixes are/can be done there. 2) route change (especially, GW change). In this case, there are no explicit per-domain calls, but one can override rte by setting ifa_rtrequest hook to domain handler (inet6 does this). 3) ifconfig ifaceX mtu YYYY In this case, we have no callbacks, but ip[6]_output performes runtime checks and decreases rt_mtu if necessary. Generally, the goals are to be able to handle all MTU changes in control plane, not in runtime part, and properly deal with increased interface MTU. This commit changes the following: * removes hooks setting MTU from drivers side * adds proper per-doman MTU checks for case 1) * adds generic MTU check for case 2) * The latter is done by using new dom_ifmtu callback since if_mtu denotes L3 interface MTU, e.g. maximum trasmitted _packet_ size. However, IPv6 mtu might be different from if_mtu one (e.g. default 1280) for some cases, so we need an abstract way to know maximum MTU size for given interface and domain. * moves rt_setmetrics() before MTU/ifa_rtrequest hooks since it copies user-supplied data which must be checked. * removes RT_LOCK_ASSERT() from other ifa_rtrequest hooks to be able to use this functions on new non-inserted rte. More changes will follow soon. MFC after: 1 month Sponsored by: Yandex LLC	2014-11-06 13:13:09 +00:00
hrs	5e2751fde9	Make net.add_addr_allfibs vnet-local.	2014-09-21 03:48:20 +00:00
asomers	081aa8a15c	Revisions 264905 and 266860 added a "int fib" argument to ifa_ifwithnet and ifa_ifwithdstaddr. For the sake of backwards compatibility, the new arguments were added to new functions named ifa_ifwithnet_fib and ifa_ifwithdstaddr_fib, while the old functions became wrappers around the new ones that passed RT_ALL_FIBS for the fib argument. However, the backwards compatibility is not desired for FreeBSD 11, because there are numerous other incompatible changes to the ifnet(9) API. We therefore decided to remove it from head but leave it in place for stable/9 and stable/10. In addition, this commit adds the fib argument to ifa_ifwithbroadaddr for consistency's sake. sys/sys/param.h Increment __FreeBSD_version sys/net/if.c sys/net/if_var.h sys/net/route.c Add fibnum argument to ifa_ifwithbroadaddr, and remove the _fib versions of ifa_ifwithdstaddr, ifa_ifwithnet, and ifa_ifwithroute. sys/net/route.c sys/net/rtsock.c sys/netinet/in_pcb.c sys/netinet/ip_options.c sys/netinet/ip_output.c sys/netinet6/nd6.c Fixup calls of modified functions. share/man/man9/ifnet.9 Document changed API. CR: https://reviews.freebsd.org/D458 MFC after: Never Sponsored by: Spectra Logic	2014-09-11 20:21:03 +00:00
hselasky	35b126e324	Pull in r267961 and r267973 again. Fix for issues reported will follow.	2014-06-28 03:56:17 +00:00
gjb	fc21f40567	Revert r267961, r267973: These changes prevent sysctl(8) from returning proper output, such as: 1) no output from sysctl(8) 2) erroneously returning ENOMEM with tools like truss(1) or uname(1) truss: can not get etype: Cannot allocate memory	2014-06-27 22:05:21 +00:00
hselasky	bd1ed65f0f	Extend the meaning of the CTLFLAG_TUN flag to automatically check if there is an environment variable which shall initialize the SYSCTL during early boot. This works for all SYSCTL types both statically and dynamically created ones, except for the SYSCTL NODE type and SYSCTLs which belong to VNETs. A new flag, CTLFLAG_NOFETCH, has been added to be used in the case a tunable sysctl has a custom initialisation function allowing the sysctl to still be marked as a tunable. The kernel SYSCTL API is mostly the same, with a few exceptions for some special operations like iterating childrens of a static/extern SYSCTL node. This operation should probably be made into a factored out common macro, hence some device drivers use this. The reason for changing the SYSCTL API was the need for a SYSCTL parent OID pointer and not only the SYSCTL parent OID list pointer in order to quickly generate the sysctl path. The motivation behind this patch is to avoid parameter loading cludges inside the OFED driver subsystem. Instead of adding special code to the OFED driver subsystem to post-load tunables into dynamically created sysctls, we generalize this in the kernel. Other changes: - Corrected a possibly incorrect sysctl name from "hw.cbb.intr_mask" to "hw.pcic.intr_mask". - Removed redundant TUNABLE statements throughout the kernel. - Some minor code rewrites in connection to removing not needed TUNABLE statements. - Added a missing SYSCTL_DECL(). - Wrapped two very long lines. - Avoid malloc()/free() inside sysctl string handling, in case it is called to initialize a sysctl from a tunable, hence malloc()/free() is not ready when sysctls from the sysctl dataset are registered. - Bumped FreeBSD version to indicate SYSCTL API change. MFC after: 2 weeks Sponsored by: Mellanox Technologies	2014-06-27 16:33:43 +00:00
asomers	7ca8bf0f2c	Fix unintended KBI change from r264905. Add _fib versions of ifa_ifwithnet() and ifa_ifwithdstaddr() The legacy functions will call the _fib() versions with RT_ALL_FIBS, preserving legacy behavior. sys/net/if_var.h sys/net/if.c Add legacy-compatible functions as described above. Ensure legacy behavior when RT_ALL_FIBS is passed as fibnum. sys/netinet/in_pcb.c sys/netinet/ip_output.c sys/netinet/ip_options.c sys/net/route.c sys/net/rtsock.c sys/netinet6/nd6.c Call with _fib() functions if we must use a specific fib, or the legacy functions otherwise. tests/sys/netinet/fibs_test.sh tests/sys/netinet/udp_dontroute.c Improve the udp_dontroute test. The bug that this test exercises is that ifa_ifwithnet() will return the wrong address, if multiple interfaces have addresses on the same subnet but with different fibs. The previous version of the test only considered one possible failure mode: that ifa_ifwithnet_fib() might fail to find any suitable address at all. The new version also checks whether ifa_ifwithnet_fib() finds the correct address by checking where the ARP request goes. Reported by: bz, hrs Reviewed by: hrs MFC after: 1 week X-MFC-with: 264905 Sponsored by: Spectra Logic	2014-05-29 21:03:49 +00:00
melifaro	bb33a54f34	Remove additional fib checks from rtalloc1_fib. It looks like current consumers are either unaware of MRT (and uses RT_DEFAULT_FIB implicitly) or know what thay are doing, In latter case they will be either hit by KASSERT or ESCRH will be returned due to NULL rnh.	2014-05-03 16:38:05 +00:00
melifaro	a4407f98c0	Pass radix head ptr along with rte to rtexpunge(). Rename rtexpunge to rt_expunge().	2014-05-03 16:28:54 +00:00
melifaro	1883ddc524	Move rt_setmetrics() from rtsock.c to route.c. All rtsock-initiated rte creation/modification are now performed in route.c holding radix tree write lock. This reduces the need for per-rte mutex. Sponsored by: Yandex LLC MFC after: 1 month	2014-04-29 19:14:42 +00:00
melifaro	b1337c7d4c	Do not use senderr() in rtrequest1_fib_change(). Suggested by: glebius MFC after: 4 weeks	2014-04-29 12:52:36 +00:00
melifaro	5416196308	Remove useless `register' declarations. MFC after: 1 month	2014-04-26 22:42:21 +00:00
melifaro	e815654815	Decouple RTM_CHANGE from RTM_GET handling in rtsock.c:route_output(). RTM_CHANGE is now handled inside route.c:rtrequest1_fib() as it should be. Note change change handler is a separate function rtrequest1_fib_change(). MFC after: 1 month	2014-04-26 21:03:41 +00:00
melifaro	7b860c446e	Unify sa_equal() macro usage. MFC after: 2 weeks	2014-04-26 14:52:03 +00:00
asomers	f8a34b6f49	Fix subnet and default routes on different FIBs on the same subnet. These two bugs are closely related. The root cause is that ifa_ifwithnet does not consider FIBs when searching for an interface address. sys/net/if_var.h sys/net/if.c Add a fib argument to ifa_ifwithnet and ifa_ifwithdstadddr. Those functions will only return an address whose interface fib equals the argument. sys/net/route.c Update calls to ifa_ifwithnet and ifa_ifwithdstaddr with fib arguments. sys/netinet/in.c Update in_addprefix to consider the interface fib when adding prefixes. This will prevent it from not adding a subnet route when one already exists on a different fib. sys/net/rtsock.c sys/netinet/in_pcb.c sys/netinet/ip_output.c sys/netinet/ip_options.c sys/netinet6/nd6.c Add RT_DEFAULT_FIB arguments to ifa_ifwithdstaddr and ifa_ifwithnet. In some cases it there wasn't a clear specific fib number to use. In others, I was unable to test those functions so I chose RT_DEFAULT_FIB to minimize divergence from current behavior. I will fix some of the latter changes along with PR kern/187553. tests/sys/netinet/fibs_test.sh tests/sys/netinet/udp_dontroute.c tests/sys/netinet/Makefile Revert r263738. The udp_dontroute test was right all along. However, bugs kern/187550 and kern/187553 cancelled each other out when it came to this test. Because of kern/187553, ifa_ifwithnet searched the default fib instead of the requested one, but because of kern/187550, there was an applicable subnet route on the default fib. The new test added in r263738 doesn't work right, however. I can verify with dtrace that ifa_ifwithnet returned the wrong address before I applied this commit, but route(8) miraculously found the correct interface to use anyway. I don't know how. Clear expected failure messages for kern/187550 and kern/187552. PR: kern/187550 PR: kern/187552 Reviewed by: melifaro MFC after: 3 weeks Sponsored by: Spectra Logic	2014-04-24 23:56:56 +00:00

1 2 3 4 5 ...

262 Commits