freebsd-skq

Author	SHA1	Message	Date
Alexander V. Chernikov	76cfc6fa0d	Fix a use after free in update_rtm_from_rc(). update_rtm_from_rc() calls update_rtm_from_info() internally. The latter one may update provided prtm pointer with a new rtm. Reassign rtm from prtm afeter calling update_rtm_from_info() to avoid touching the freed rtm. PR: 255871 Submitted by: lylgood@foxmail.com MFC after: 3 days	2021-05-14 16:06:41 +00:00
Alexander V. Chernikov	25682e6a49	Fix rtsock sockaddr alignment. `b31fbebeb3` introduced alloc_sockaddr_aligned() which, in fact, failed to produce aligned addresses. Reported by: Oskar Holmlund <oskar.holmlund at yahoo.com> MFC after: immediately	2021-04-27 08:04:19 +00:00
Alexander V. Chernikov	5d1403a79a	[rtsock] Enforce netmask/RTF_HOST consistency. Traditionally we had 2 sources of information whether the added/delete route request targets network or a host route: netmask (RTA_NETMASK) and RTF_HOST flag. The former one is tricky: netmask can be empty or can explicitly specify the host netmask. Parsing netmask sockaddr requires per-family parsing and that's what rtsock code traditionally avoided. As a result, consistency was not enforced and it was possible to specify network with the RTF_HOST flag and vice versa. Continue normalization efforts from D29826 and D29826 and ensure that RTF_HOST flag always reflects host/network data from netmask field. Differential Revision: https://reviews.freebsd.org/D29958 MFC after: 2 days	2021-04-24 22:41:27 +00:00
Alexander V. Chernikov	b31fbebeb3	Relax rtsock message restrictions. Address multiple issues with strict rtsock message validation. D28668 "normalisation" approach was based on the assumption that we always have at least "standard" sockaddr len. It turned out to be false - certain older applications like quagga or routed abuse sin[6]_len field and set it to the offset to the first fully-zero bit in the mask. It is impossible to normalise such sockaddrs without reallocation. With that in mind, change the approach to use a distinct memory buffer for the altered sockaddrs. This allows supporting the older software while maintaining the guarantee on the "standard" sockaddrs. PR: 255273,255089 Differential Revision: https://reviews.freebsd.org/D29826 MFC after: 3 days	2021-04-20 21:34:19 +00:00
Alexander V. Chernikov	758c9d54d4	Improve error reporting in rtsock.c MFC after: 3 days	2021-04-19 20:36:41 +00:00
Alexander V. Chernikov	7f5f3fcc32	Fix direct route installation with net/bird. Slighly relax the gateway validation rules imposed by the `2fe5a79425`, by requiring only first 8 bytes (everyhing before sdl_data to be present in the AF_LINK gateway. Reported by: olivier	2021-04-10 16:31:16 +01:00
Alexander V. Chernikov	e5b394f2d0	Fix setting static entries for arp/ndp. rtsock message validation changes committed in `2fe5a79425` did not take llinfo messages into account. Add a special validation case for RTA_GATEWAY llinfo messages. MFC after: 2 days	2021-02-20 18:26:35 +00:00
Alexander V. Chernikov	f9e1cd6c99	Fix arp/ndp deletion broken by `2fe5a79425`. Changes in the `2fe5a79425` moved dst sockaddr masking from the routing control plane to the rtsock code. It broke arp/ndp deletion. It turns out, arp/ndp perform RTM_GET request first to get an interface index necessary for the deletion. Then they simply stamp the reply with RTF_LLDATA and set the command to RTM_DELETE. As a result, kernel receives request with non-empty RTA_NETMASK and clears RTA_DST host bits before passing the message to the lla code. De facto, the only needed bits are RTA_DST, RTA_GATEWAY and the subset of rtm_flags. With that in mind, fix the interace by clearing RTA_NETMASK for every messages with RTF_LLDATA. While here, cleanup arp/ndp code a bit. MFC after: 1 day Reviewed by: gnn Differential Revision: https://reviews.freebsd.org/D28804	2021-02-19 21:17:17 +00:00
Alexander V. Chernikov	a4513bace0	Fix NOINET6 build broken by `2fe5a79425`. Reported by: mjg	2021-02-16 21:49:48 +00:00
Alexander V. Chernikov	2fe5a79425	Fix dst/netmask handling in routing socket code. Traditionally routing socket code did almost zero checks on the input message except for the most basic size checks. This resulted in the unclear KPI boundary for the routing system code (`rtrequest` and now `rib_action()`) w.r.t message validness. Multiple potential problems and nuances exists: Host bits in RTAX_DST sockaddr. Existing applications do send prefixes with hostbits uncleared. Even `route(8)` does this, as they hope the kernel would do the job of fixing it. Code inside `rib_action()` needs to handle it on its own (see `rt_maskedcopy()` ugly hack). * There are multiple way of adding the host route: it can be DST without netmask or DST with /32(/128) netmask. Also, RTF_HOST has to be set correspondingly. Currently, these 2 options create 2 DIFFERENT routes in the kernel. * no sockaddr length/content checking for the "secondary" fields exists: nothing stops rtsock application to send sockaddr_in with length of 25 (instead of 16). Kernel will accept it, install to RIB as is and propagate to all rtsock consumers, potentially triggering bugs in their code. Same goes for sin_port, sin_zero, etc. The goal of this change is to make rtsock verify all sockaddr and prefix consistency. Said differently, `rib_action()` or internals should NOT require to change any of the sockaddrs supplied by `rt_addrinfo` structure due to incorrectness. To be more specific, this change implements the following: * sockaddr cleanup/validation check is added immediately after getting sockaddrs from rtm. * Per-family dst/netmask checks clears host bits in dst and zeros all dst/netmask "secondary" fields. * The same netmask checking code converts /32(/128) netmasks to "host" route case (NULL netmask, RTF_HOST), removing the dualism. * Instead of allowing ANY "known" sockaddr families (0<..<AF_MAX), allow only actually supported ones (inet, inet6, link). * Automatically convert `sockaddr_sdl` (AF_LINK) gateways to `sockaddr_sdl_short`. Reported by: Guy Yur <guyyur at gmail.com> Reviewed By: donner Differential Revision: https://reviews.freebsd.org/D28668 MFC after: 3 days	2021-02-16 20:30:04 +00:00
Alexander V. Chernikov	64d5c27777	Remove now-unused RTF_RNH_LOCKED route flag. MFC after: 1 week	2021-02-15 20:49:59 +00:00
Alexander V. Chernikov	8ca99aecf7	Fix various NOINET* builds broken by `145bf6c0af`. Reported by: mjg, bdragon	2021-02-12 20:36:20 +00:00
Alexander V. Chernikov	145bf6c0af	Fix blackhole/reject routes. Traditionally *BSD routing stack required to supply some interface data for blackhole/reject routes. This lead to varieties of hacks in routing daemons when inserting such routes. With the recent routeing stack changes, gateway sockaddr without RTF_GATEWAY started to be treated differently, purely as link identifier. This change broke net/bird, which installs blackhole routes with 127.0.0.1 gateway without RTF_GATEWAY flags. Fix this by automatically constructing necessary gateway data at rtsock level if RTF_REJECT/RTF_BLACKHOLE is set. Reported by: Marek Zarychta <zarychtam at plan-b.pwste.edu.pl> Reviewed by: donner MFC after: 1 week	2021-02-11 23:08:55 +00:00
Alexander V. Chernikov	d68cf57b7f	Refactor rt_addrmsg() and rt_routemsg(). Summary: * Refactor rt_addrmsg(): make V_rt_add_addr_allfibs decision locally. * Fix rt_routemsg() and multipath by accepting nexthop instead of interface pointer. * Refactor rtsock_routemsg(): avoid accessing rtentry fields directly. * Simplify in_addprefix() by moving prefix search to a separate function. Reviewers: #network Subscribers: imp, ae, bz Differential Revision: https://reviews.freebsd.org/D28011	2021-01-07 19:38:19 +00:00
Ryan Libby	2fb4a03d55	rtsock: quiet -Wunused-variable in LINT-NOIP kernels Fixup after r368769 / `d68fb8d978`. Reported by: mjg Reviewed by: melifaro Sponsored by: Dell EMC Isilon Differential Revision: https://reviews.freebsd.org/D27730	2020-12-24 12:34:18 -08:00
Mark Johnston	92be2847e8	rtsock: Avoid copying uninitialized padding bytes When copying sockaddrs out to userspace, we pad them to a multiple of the platform alignment (sizeof(long)). However, some sockaddr sizes, such as struct sockaddr_dl, are not an integer multiple of the alignment, so we may end up copying out uninitialized bytes. Fix this by always bouncing through a pre-zeroed sockaddr_storage. Reported by: KASAN Reviewed by: melifaro MFC after: 3 days Sponsored by: The FreeBSD Foundation Differential Revision: https://reviews.freebsd.org/D27729	2020-12-23 11:16:40 -05:00
Alexander V. Chernikov	d68fb8d978	Switch direct rt fields access in rtsock.c to newly-create field acessors. rtsock code was build around the assumption that each rtentry record in the system radix tree is a ready-to-use sockaddr. This assumptions turned out to be not quite true: * masks have their length tweaked, so we have rtsock_fix_netmask() hack * IPv6 addresses have their scope embedded, so we have another explicit deembedding hack. Change the code to decouple rtentry internals from rtsock code using newly-created rtentry accessors. This will allow to eventually eliminate both of the hacks and change rtentry dst/mask format. Differential Revision: https://reviews.freebsd.org/D27451	2020-12-18 22:00:57 +00:00
Alexander V. Chernikov	d1d941c5b9	Remove RADIX_MPATH config option. ROUTE_MPATH is the new config option controlling new multipath routing implementation. Remove the last pieces of RADIX_MPATH-related code and the config option. Reviewed by: glebius Differential Revision: https://reviews.freebsd.org/D27244	2020-11-29 19:43:33 +00:00
Alexander V. Chernikov	1b95005e95	Fix route flags update during RTM_CHANGE. Nexthop lookup was not consireding rt_flags when doing structure comparison, which lead to an original nexthop selection when changing flags. Fix the case by adding rt_flags field into comparison and rearranging nhop_priv fields to allow for efficient matching. Fix `route change X/Y flags` case - recent changes disallowed specifying RTF_GATEWAY flag without actual gateway. It turns out, route(8) fills in RTF_GATEWAY by default, unless -interface flag is specified. Fix regression by clearing RTF_GATEWAY flag instead of failing. Fix route flag reporting in RTM_CHANGE messages by explicitly updating rtm_flags after operation competion. Add IPv4/IPv6 tests for flag-only route changes.	2020-10-04 13:24:58 +00:00
Alexander V. Chernikov	9c584fa4bc	Remove ROUTE_MPATH-related warnings introduced in r366390. Reported by: mjg	2020-10-03 14:37:54 +00:00
Alexander V. Chernikov	fedeb08b6a	Introduce scalable route multipath. This change is based on the nexthop objects landed in D24232. The change introduces the concept of nexthop groups. Each group contains the collection of nexthops with their relative weights and a dataplane-optimized structure to enable efficient nexthop selection. Simular to the nexthops, nexthop groups are immutable. Dataplane part gets compiled during group creation and is basically an array of nexthop pointers, compiled w.r.t their weights. With this change, `rt_nhop` field of `struct rtentry` contains either nexthop or nexthop group. They are distinguished by the presense of NHF_MULTIPATH flag. All dataplane lookup functions returns pointer to the nexthop object, leaving nexhop groups details inside routing subsystem. User-visible changes: The change is intended to be backward-compatible: all non-mpath operations should work as before with ROUTE_MPATH and net.route.multipath=1. All routes now comes with weight, default weight is 1, maximum is 2^24-1. Current maximum multipath group width is statically set to 64. This will become sysctl-tunable in the followup changes. Using functionality: * Recompile kernel with ROUTE_MPATH * set net.route.multipath to 1 route add -6 2001:db8::/32 2001:db8::2 -weight 10 route add -6 2001:db8::/32 2001:db8::3 -weight 20 netstat -6On Nexthop groups data Internet6: GrpIdx NhIdx Weight Slots Gateway Netif Refcnt 1 ------- ------- ------- --------------------------------------- --------- 1 13 10 1 2001:db8::2 vlan2 14 20 2 2001:db8::3 vlan2 Next steps: * Land outbound hashing for locally-originated routes ( D26523 ). * Fix net/bird multipath (net/frr seems to work fine) * Add ROUTE_MPATH to GENERIC * Set net.route.multipath=1 by default Tested by: olivier Reviewed by: glebius Relnotes: yes Differential Revision: https://reviews.freebsd.org/D26449	2020-10-03 10:47:17 +00:00
Alexander V. Chernikov	2259a03020	Rework part of routing code to reduce difference to D26449. * Split rt_setmetrics into get_info_weight() and rt_set_expire_info(), as these two can be applied at different entities and at different times. * Start filling route weight in route change notifications * Pass flowid to UDP/raw IP route lookups * Rework nd6_subscription_cb() and sysctl_dumpentry() to prepare for the fact that rtentry can contain multiple nexthops. Differential Revision: https://reviews.freebsd.org/D26497	2020-09-21 20:02:26 +00:00
Ed Maste	4fa9815a3d	rtsock.c: remove extraneous space Submitted by: Neel Chauhan <neel AT neelc DOT org> Differential Revision: https://reviews.freebsd.org/D26249	2020-09-05 16:13:36 +00:00
Mateusz Guzik	662c13053f	net: clean up empty lines in .c and .h files	2020-09-01 21:19:14 +00:00
Alexander V. Chernikov	a624ca3dff	Move net/route/shared.h definitions to net/route/route_var.h. No functional changes. net/route/shared.h was created in the inital phases of nexthop conversion. It was intended to serve the same purpose as route_var.h - share definitions of functions and structures between the routing subsystem components. At that time route_var.h was included by many files external to the routing subsystem, which largerly defeats its purpose. As currently this is not the case anymore and amount of route_var.h includes is roughly the same as shared.h, retire the latter in favour of the former.	2020-08-28 22:50:20 +00:00
Alexander V. Chernikov	592d300e34	Remove RT_LOCK mutex from rte. rtentry lock traditionally served 2 purposed: first was protecting refcounts, the second was assuring consistent field access/changes. Since route nexthop introduction, the need for the former disappeared and the need for the latter reduced. To be more precise, the following rte field are mutable: rt_nhop (nexthop pointer, updated with RIB_WLOCK, passed in rib_cmd_info) rte_flags (only RTF_HOST and RTF_UP, where RTF_UP gets changed at rte removal) rt_weight (relative weight, updated with RIB_WLOCK, passed in rib_cmd_info) rt_expire (time when rte deletion is scheduled, updated with RIB_WLOCK) rt_chain (deletion chain pointer, updated with RIB_WLOCK) All of them are updated under RIB_WLOCK, so the only remaining concern is the reading. rt_nhop and rt_weight (addressed in this review) are read under rib lock and stored in the rib_cmd_info, so the caller has no problem with consitency. rte_flags is currently read unlocked in rtsock reporting (however the scope is only RTF_UP flag, which is pretty static). rt_expire is currently read unlocked in rtsock reporting. rt_chain accesses are safe, as this is only used at route deletion. rt_expire and rte_flags reads will be dealt in a separate reviews soon. Differential Revision: https://reviews.freebsd.org/D26162	2020-08-24 20:23:34 +00:00
Alexander V. Chernikov	93bfd365d2	Rename rt_flags to rte_flags && reduce number of rt_nhop accesses. No functional changes. Most of the routing flags are stored in the netxtop instead of rtentry. Rename rt->rt_flags to rt->rte_flags to simplify reading/modifying code checking routing flags. In the new multipath code, rt->rt_nhop may actually point to nexthop group instead of nhop. To ease transition, reduce the amount of rt->rt_nhop->... accesses. Differential Revision: https://reviews.freebsd.org/D26156	2020-08-22 19:30:56 +00:00
Alexander V. Chernikov	bec053ffe0	Make net.inet6.ip6.deembed_scopeid behaviour default & remove sysctl. Submitted by: Neel Chauhan <neel AT neelc DOT org> Differential Revision: https://reviews.freebsd.org/D25637	2020-08-15 11:37:44 +00:00
Alexander V. Chernikov	a287a973e3	Switch rtsock code to using newly-create rib_action() KPI call. This simplifies the code and allows to further split rtentry and nexthop, removing one of the blockers for multipath code introduction, described in D24141. Reviewed by: ae Differential Revision: https://reviews.freebsd.org/D25192	2020-06-10 07:46:22 +00:00
Alexander V. Chernikov	2bbab0af6d	Use epoch(9) for rtentries to simplify control plane operations. Currently the only reason of refcounting rtentries is the need to report the rtable operation details immediately after the execution. Delaying rtentry reclamation allows to stop refcounting and simplify the code. Additionally, this change allows to reimplement rib_lookup_info(), which is used by some of the customers to get the matching prefix along with nexthops, in more efficient way. The change keeps per-vnet rtzone uma zone. It adds nh_vnet field to nhop_priv to be able to reliably set curvnet even during vnet teardown. Rest of the reference counting code will be removed in the D24867 . Differential Revision: https://reviews.freebsd.org/D24866	2020-05-23 10:21:02 +00:00
Alexander V. Chernikov	656442a718	Embed dst sockaddr into rtentry and remove rte packet counter Currently each rtentry has dst&gateway allocated separately from another zone, bloating cache accesses. Current 'struct rtentry' has 12 "mandatory" radix pointers in the beginning, leaving 4 usable pointers/32 bytes in the first 2 cache lines (amd64). Fields needed for the datapath are destination sockaddr and rt_nhop. So far it doesn't look like there is other routable addressing protocol other than IPv4/IPv6/MPLS, which uses keys longer than 20 bytes. With that in mind, embed dst into struct rtentry, making the first 24 bytes of rtentry within 128 bytes. That is enough to make IPv6 address within first 128 bytes. It is still pretty easy to add code for supporting separately-allocated dst, however it doesn't make a lot of sense in having such code without a use case. As rS359823 moved the gateway to the nexthop structure, the dst embedding change removes the need for any additional allocations done by rt_setgate(). Lastly, as a part of cleanup, remove counter(9) allocation code, as this field is not used in packet processing anymore. Reviewed by: ae Differential Revision: https://reviews.freebsd.org/D24669	2020-05-08 21:06:10 +00:00
Alexander V. Chernikov	8c61eb2107	Convert more rtentry field accesses into nhop fields accesses. Continue routing subsystem conversion to nhop objects defined in r359823. Use fields from nhop structure instead of "struct rtentry" fields. This is one of the last changes prior to removing rt_ifp, rt_ifa, rt_gateway and rt_mtu from struct rtentry. Differential Revision: https://reviews.freebsd.org/D24609	2020-04-29 21:54:09 +00:00
Alexander V. Chernikov	fc88ecd31a	Move route-specific ddb commands to route/route_ddb.c Currently functionality resides in rtsock.c, which is a controlling interface, partially external to the routing subsystem. Additionally, DDB-supporting functionality is > 100SLOC, which deserves a separate file. Given that, move this functionality to a newly-created net/route/ subdir. Reviewed by: cem Differential Revision: https://reviews.freebsd.org/D24561	2020-04-28 20:00:17 +00:00
Alexander V. Chernikov	e7d8af4f65	Move route_temporal.c and route_var.h to net/route. Nexthop objects implementation, defined in r359823, introduced sys/net/route directory intended to hold all routing-related code. Move recently-introduced route_temporal.c and private route_var.h header there. Differential Revision: https://reviews.freebsd.org/D24597	2020-04-28 19:14:09 +00:00
Alexander V. Chernikov	aaad3c4fca	Convert rtentry field accesses into nhop field accesses. One of the goals of the new routing KPI defined in r359823 is to entirely hide`struct rtentry` from the consumers. It will allow to improve routing subsystem internals and deliver more features much faster. This commit is mostly mechanical change to eliminate direct struct rtentry field accesses. The only notable difference is AF_LINK gateway encoding. AF_LINK gw is used in routing stack for operations with interface routes and host loopback routes. In the former case it indicates _some_ non-NULL gateway, as the interface is the same as in rt_ifp in kernel and rtm_ifindex in rtsock reporting. In the latter case the interface index inside gateway was used by the IPv6 datapath to verify address scope for link-local interfaces. Kernel uses struct sockaddr_dl for this type of gateway. This structure allows for specifying rich interface data, such as mac address and interface name. However, this results in relatively large structure size - 52 bytes. Routing stack fils in only 2 fields - sdl_index and sdl_type, which reside in the first 8 bytes of the structure. In the new KPI, struct nhop_object tries to be cache-efficient, hence embodies gateway address inside the structure. In the AF_LINK case it stores stortened version of the structure - struct sockaddr_dl_short, which occupies 16 bytes. After D24340 changes, the data inside AF_LINK gateway will not be used in the kernel at all, leaving rtsock as the only potential concern. The difference in rtsock reporting: (old) got message of size 240 on Thu Apr 16 03:12:13 2020 RTM_ADD: Add Route: len 240, pid: 0, seq 0, errno 0, flags:<UP,DONE,PINNED> locks: inits: sockaddrs: <DST,GATEWAY,NETMASK> 10.0.0.0 link#5 255.255.255.0 (new) got message of size 200 on Sun Apr 19 09:46:32 2020 RTM_ADD: Add Route: len 200, pid: 0, seq 0, errno 0, flags:<UP,DONE,PINNED> locks: inits: sockaddrs: <DST,GATEWAY,NETMASK> 10.0.0.0 link#5 255.255.255.0 Note 40 bytes different (52-16 + alignment). However, gateway is still a valid AF_LINK gateway with proper data filled in. It is worth noting that these particular messages (interface routes) are mostly ignored by routing daemons: * bird/quagga/frr uses RTM_NEWADDR and ignores prefix route addition messages. * quagga/frr ignores routes without gateway More detailed overview on how rtsock messages are used by the routing daemons to reconstruct the kernel view, can be found in D22974. Differential Revision: https://reviews.freebsd.org/D24519	2020-04-23 08:04:20 +00:00
Adrian Chadd	2538a4b1b4	Remove an duplicate definition of nhops_dump_sysctl() One of the source files included both nhop.h and shared.h, leading to this clashing. Tested with: mips-gcc cross toolchain	2020-04-16 23:28:47 +00:00
Alexander V. Chernikov	a666325282	Introduce nexthop objects and new routing KPI. This is the foundational change for the routing subsytem rearchitecture. More details and goals are available in https://reviews.freebsd.org/D24141 . This patch introduces concept of nexthop objects and new nexthop-based routing KPI. Nexthops are objects, containing all necessary information for performing the packet output decision. Output interface, mtu, flags, gw address goes there. For most of the cases, these objects will serve the same role as the struct rtentry is currently serving. Typically there will be low tens of such objects for the router even with multiple BGP full-views, as these objects will be shared between routing entries. This allows to store more information in the nexthop. New KPI: struct nhop_object fib4_lookup(uint32_t fibnum, struct in_addr dst, uint32_t scopeid, uint32_t flags, uint32_t flowid); struct nhop_object fib6_lookup(uint32_t fibnum, const struct in6_addr dst6, uint32_t scopeid, uint32_t flags, uint32_t flowid); These 2 function are intended to replace all all flavours of <in_\|in6_>rtalloc[1]<_ign><_fib>, mpath functions and the previous fib[46]-generation functions. Upon successful lookup, they return nexthop object which is guaranteed to exist within current NET_EPOCH. If longer lifetime is desired, one can specify NHR_REF as a flag and get a referenced version of the nexthop. Reference semantic closely resembles rtentry one, allowing sed-style conversion. Additionally, another 2 functions are introduced to support uRPF functionality inside variety of our firewalls. Their primary goal is to hide the multipath implementation details inside the routing subsystem, greatly simplifying firewalls implementation: int fib4_lookup_urpf(uint32_t fibnum, struct in_addr dst, uint32_t scopeid, uint32_t flags, const struct ifnet src_if); int fib6_lookup_urpf(uint32_t fibnum, const struct in6_addr dst6, uint32_t scopeid, uint32_t flags, const struct ifnet src_if); All functions have a separate scopeid argument, paving way to eliminating IPv6 scope embedding and allowing to support IPv4 link-locals in the future. Structure changes: * rtentry gets new 'rt_nhop' pointer, slightly growing the overall size. * rib_head gets new 'rnh_preadd' callback pointer, slightly growing overall sz. Old KPI: During the transition state old and new KPI will coexists. As there are another 4-5 decent-sized conversion patches, it will probably take a couple of weeks. To support both KPIs, fields not required by the new KPI (most of rtentry) has to be kept, resulting in the temporary size increase. Once conversion is finished, rtentry will notably shrink. More details: * architectural overview: https://reviews.freebsd.org/D24141 * list of the next changes: https://reviews.freebsd.org/D24232 Reviewed by: ae,glebius(initial version) Differential Revision: https://reviews.freebsd.org/D24232	2020-04-12 14:30:00 +00:00
Pawel Biernacki	7029da5c36	Mark more nodes as CTLFLAG_MPSAFE or CTLFLAG_NEEDGIANT (17 of many) r357614 added CTLFLAG_NEEDGIANT to make it easier to find nodes that are still not MPSAFE (or already are but aren’t properly marked). Use it in preparation for a general review of all nodes. This is non-functional change that adds annotations to SYSCTL_NODE and SYSCTL_PROC nodes using one of the soon-to-be-required flags. Mark all obvious cases as MPSAFE. All entries that haven't been marked as MPSAFE before are by default marked as NEEDGIANT Approved by: kib (mentor, blanket) Commented by: kib, gallatin, melifaro Differential Revision: https://reviews.freebsd.org/D23718	2020-02-26 14:26:36 +00:00
Alexander V. Chernikov	e02d3fe70c	Fix rtsock route message generation for interface addresses. Reviewed by: olivier MFC after: 1 month Differential Revision: https://reviews.freebsd.org/D22974	2020-01-07 21:16:30 +00:00
Alexander V. Chernikov	d930203192	Fix NOINET6 build broken by r356236. MFC after: 2 weeks	2019-12-31 17:57:12 +00:00
Alexander V. Chernikov	c83dda362e	Split gigantic rtsock route_output() into smaller functions. Amount of changes to the original code has been intentionally minimised to ease diffing. The changes are mostly mechanical, with the following exceptions: * lltable handler is now called directly based of RTF_LLINFO flag presense. * "report" logic for updating rtm in RTM_GET/RTM_DELETE has been simplified, fixing several potential use-after-free cases in rt_addrinfo. * llable asserts has been replaced with error-returning, preventing kernel crashes when lltable gw af family is invalid (root required). MFC after: 2 weeks Differential Revision: https://reviews.freebsd.org/D22864	2019-12-31 17:26:53 +00:00
Gleb Smirnoff	8a9a28c455	sysctl_rtsock() has all necessary locking and doesn't need Giant to run. While here add description.	2019-11-07 19:06:18 +00:00
Gleb Smirnoff	b8a6e03fac	Widen NET_EPOCH coverage. When epoch(9) was introduced to network stack, it was basically dropped in place of existing locking, which was mutexes and rwlocks. For the sake of performance mutex covered areas were as small as possible, so became epoch covered areas. However, epoch doesn't introduce any contention, it just delays memory reclaim. So, there is no point to minimise epoch covered areas in sense of performance. Meanwhile entering/exiting epoch also has non-zero CPU usage, so doing this less often is a win. Not the least is also code maintainability. In the new paradigm we can assume that at any stage of processing a packet, we are inside network epoch. This makes coding both input and output path way easier. On output path we already enter epoch quite early - in the ip_output(), in the ip6_output(). This patch does the same for the input path. All ISR processing, network related callouts, other ways of packet injection to the network stack shall be performed in net_epoch. Any leaf function that walks network configuration now asserts epoch. Tricky part is configuration code paths - ioctls, sysctls. They also call into leaf functions, so some need to be changed. This patch would introduce more epoch recursions (see EPOCH_TRACE) than we had before. They will be cleaned up separately, as several of them aren't trivial. Note, that unlike a lock recursion the epoch recursion is safe and just wastes a bit of resources. Reviewed by: gallatin, hselasky, cy, adrian, kristof Differential Revision: https://reviews.freebsd.org/D19111	2019-10-07 22:40:05 +00:00
Li-Wen Hsu	4835262b68	Fix build for the platforms where db_expr_t is not long Sponsored by: The FreeBSD Foundation	2019-09-10 08:51:11 +00:00
Conrad Meyer	0b12ab8111	Appease Clang false-positive Werrors in r352112 Reported by: bcran	2019-09-10 01:56:47 +00:00
Conrad Meyer	8b6acd2b51	ddb(4): Add 'show route <dest>' and 'show routetable [<af>]' These commands show the route resolved for a specified destination, or print out the entire routing table for a given address family (or all families, if none is explicitly provided). Discussed with: emaste Differential Revision: https://reviews.freebsd.org/D21510	2019-09-09 22:54:27 +00:00
Michael Tuexen	e6481fd4c4	When sending a routing message, don't allow the user to set the RTF_RNH_LOCKED flag in rtm_flags, since this flag is used only internally. Reported by: syzbot+65c676f5248a13753ea0@syzkaller.appspotmail.com Reviewed by: ae@ MFC after: 1 week Differential Revision: https://reviews.freebsd.org/D19898	2019-04-14 10:18:14 +00:00
Gleb Smirnoff	cc31f9821c	Remove recursive NET_EPOCH_ENTER() from sysctl_ifmalist(), missed in r342872.	2019-01-11 00:45:22 +00:00
Gleb Smirnoff	a68cc38879	Mechanical cleanup of epoch(9) usage in network stack. - Remove macros that covertly create epoch_tracker on thread stack. Such macros a quite unsafe, e.g. will produce a buggy code if same macro is used in embedded scopes. Explicitly declare epoch_tracker always. - Unmask interface list IFNET_RLOCK_NOSLEEP(), interface address list IF_ADDR_RLOCK() and interface AF specific data IF_AFDATA_RLOCK() read locking macros to what they actually are - the net_epoch. Keeping them as is is very misleading. They all are named FOO_RLOCK(), while they no longer have lock semantics. Now they allow recursion and what's more important they now no longer guarantee protection against their companion WLOCK macros. Note: INP_HASH_RLOCK() has same problems, but not touched by this commit. This is non functional mechanical change. The only functionally changed functions are ni6_addrs() and ni6_store_addrs(), where we no longer enter epoch recursively. Discussed with: jtl, gallatin	2019-01-09 01:11:19 +00:00
Andrey V. Elsukov	a716ad4a35	Fix possible panic during ifnet detach in rtsock. The panic can happen, when some application does dump of routing table using sysctl interface. To prevent this, set IFF_DYING flag in if_detach_internal() function, when ifnet under lock is removed from the chain. In sysctl_rtsock() take IFNET_RLOCK_NOSLEEP() to prevent ifnet detach during routes enumeration. In case, if some interface was detached in the time before we take the lock, add the check, that ifnet is not DYING. This prevents access to memory that could be freed after ifnet is unlinked. PR: 227720, 230498, 233306 Reviewed by: bz, eugen MFC after: 1 week Sponsored by: Yandex LLC Differential Revision: https://reviews.freebsd.org/D18338	2018-11-27 09:04:06 +00:00

1 2 3 4 5 ...

313 Commits