Use route destination sockaddr when the gateway is eiter AF_LINK or
has the different family (IPv4 over IPv6). This change ensures
the nexthop IFA has the same family as the destination.
Reported by: Dmitriy Smirnov <fox@sage.su>
Tested by: Dmitriy Smirnov <fox@sage.su>
MFC after: 3 days
Use already-existing RTM_F_PREFIX rtm_flag to indicate that the
request assumes exact-prefix lookup instead of the
longest-prefix-match.
MFC after: 2 weeks
Use full-featured ifa_ifwithroute() to guess route ifa/ifp
instead of ifa_ifwithnet(). This change makes the route addition
logic closer to the rt_getifa_fib() used by rtsock.
Reported by: glebius
Tested by: glebius
Differential Revision: https://reviews.freebsd.org/D39335
MFC after: 2 weeks
This change does the following:
Base Netlink KPIs (ability to register the family, parse and/or
write a Netlink message) are always present in the kernel. Specifically,
* Implementation of genetlink family/group registration/removal,
some base accessors (netlink_generic_kpi.c, 260 LoC) are compiled in
unconditionally.
* Basic TLV parser functions (netlink_message_parser.c, 507 LoC) are
compiled in unconditionally.
* Glue functions (netlink<>rtsock), malloc/core sysctl definitions
(netlink_glue.c, 259 LoC) are compiled in unconditionally.
* The rest of the KPI _functions_ are defined in the netlink_glue.c,
but their implementation calls a pointer to either the stub function
or the actual function, depending on whether the module is loaded or not.
This approach allows to have only 1k LoC out of ~3.7k LoC (current
sys/netlink implementation) in the kernel, which will not grow further.
It also allows for the generic netlink kernel customers to load
successfully without requiring Netlink module and operate correctly
once Netlink module is loaded.
Reviewed by: imp
MFC after: 2 weeks
Differential Revision: https://reviews.freebsd.org/D39269
This change allow to open Netlink sockets in the non-vnet jails, even for
unpriviledged processes.
The security model largely follows the existing one. To be more specific:
* by default, every `NETLINK_ROUTE` command is **NOT** allowed in non-VNET
jail UNLESS `RTNL_F_ALLOW_NONVNET_JAIL` flag is specified in the command
handler.
* All notifications are **disabled** for non-vnet jails (requests to
subscribe for the notifications are ignored). This will change to be more
fine-grained model once the first netlink provider requiring this gets
committed.
* Listing interfaces (RTM_GETLINK) is **allowed** w/o limits (**including**
interfaces w/o any addresses attached to the jail). The value of this is
questionable, but it follows the existing approach.
* Listing ARP/NDP neighbours is **forbidden**. This is a **change** from the
current approach - currently we list static ARP/ND entries belonging to the
addresses attached to the jail.
* Listing interface addresses is **allowed**, but the addresses are filtered
to match only ones attached to the jail.
* Listing routes is **allowed**, but the routes are filtered to provide only
host routes matching the addresses attached to the jail.
* By default, every `NETLINK_GENERIC` command is **allowed** in non-VNET jail
(as sub-families may be unrelated to network at all).
It is the goal of the family author to implement the restriction if
necessary.
Differential Revision: https://reviews.freebsd.org/D39206
MFC after: 1 month
* Make nhop_set_blackhole() set all necessary properties for the
nexthop
* Make nexthops blackhole/reject based on the rtm_type netlink
property instead of using rtflags.
Reported by: Marek Zarychta <zarychtam@plan-b.pwste.edu.pl>
MFC after: 3 days
Currently kernel assumes that IPv6 gateway address is in "embedded"
form - that is, for the link-local IPv6 addresses, interface index
is embedded in bytes 2 and 3 of the address.
Fix address embedding in netlink by wrapping nhop_set_gw() in the
netlink-specific nl_set_nexthop_gw(), which does such embedding
automatically.
Reported by: Marek Zarychta <zarychtam@plan-b.pwste.edu.pl>
MFC after: 3 days
The current code missed interface addition when reallocating
temporary buffer.
Tweak the code to perform the reallocation first and add
interface afterwards unconditionally.
Reported by: Marek Zarychta <zarychtam@plan-b.pwste.edu.pl>
MFC after: 3 days
Netlink customers rely on admin and operational state when
working with interfaces. The current implementation retuns
"unknown" operstate for all interface types except IFT_ETHER
and IFT_LOOP.
This change updates the code to fetch vlan operstate in the same way
as for the ether interfaces. For the rest of the interface types,
operstate is now mapped to the admin state.
Reported by: Marek Zarychta <zarychtam@plan-b.pwste.edu.pl>
MFC after: 3 days
Some operations like interface creation may need to return metadata
- in this case, interface name - back to the caller if the operation
is successful.
This change implements attaching an `NLMSGERR_ATTR_COOKIE` nla to the
operation reply message via `nlmsg_report_cookie()`.
Additionally, on successful interface creation, interface index and
interface name are returned in the `IFLA_NEW_IFINDEX` and `IFLA_IFNAME
TLVs, encapsulated in the `NLMSGERR_ATTR_COOKIE`.
Reviewed By: pauamma
Differential Revision: https://reviews.freebsd.org/D38283
MFC after: 1 week
Add support for the scenario when user adds/deletes paths for a single
prefix one-by-one, all with different weights.
This change adds a new FreeBSD-specific RTA attribute, NL_RTA_WEIGHT.
When dumping non-multipath routes, this attribute is added if the
route weight is not RT_DEFAULT_WEIGHT.
When adding a new route, this attribute is parsed as a relative path
weight.
MFC after: 2 weeks
The prevailing pattern seems to be to simply initialize all fields to
zero. Without this, it's possible to trigger a branch on uninitialized
memory, specifically, when testing nw->ignore_limit in
nlmsg_refill_buffer().
Initialize the writer structure in a couple of functions where this is
necessary.
Reported by: KMSAN
Reviewed by: melifaro
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D38213
Output the proper attributes for IPv4/IPvv6 ifaddrs:
* IFA_ADDRESS contains local address in every case except p2p,
in that case it contains the peer address
* IFA_LOCAL contains local address. It is always present in IPv4,
or in IPv6/p2p.
* IFA_BROADCAST contains the network broadcast address (if any)
Reported by: Adam Wood <aswood@gmail.com>
Tested by: Adam Wood <aswood@gmail.com>
* Separate interface creation from interface modification code
* Support setting some interface attributes (ifdescr, mtu, up/down, promisc)
* Improve interaction with the cloners requiring to parse/write custom
interface attributes
* Add bitmask-based way of checking if the attribute is present in the
message
* Don't use multipart RTM_GETLINK replies when searching for the
specific interface names
* Use ENODEV instead of ENOENT in case of failed RTM_GETLINK search
* Add python netlink test helpers
* Add some netlink interface tests
Differential Revision: https://reviews.freebsd.org/D37668
* Add link-state change notifications by subscribing to ifnet_link_event.
In the Linux netlink model, link state is reported in 2 places: first is
the IFLA_OPERSTATE, which stores state per RFC2863.
The second is an IFF_LOWER_UP interface flag. As many applications rely
on the latter, reserve 1 bit from if_flags, named as IFF_NETLINK_1.
This flag is mapped to IFF_LOWER_UP in the netlink headers. This is done
to avoid making applications think this flag is actually
supported / presented in non-netlink outputs.
* Add flag change notifications, by hooking into rt_ifmsg().
In the netlink model, notification should include the bitmask for the
change flags. Update rt_ifmsg() to include such bitmask.
Differential Revision: https://reviews.freebsd.org/D37597
Store user-supplied source protocol in the nexthops and nexthop groups.
Protocol specification help routing daemons like bird to quickly
identify self-originated routes after the crash or restart.
Example:
```
10.2.0.0/24 via 10.0.0.2 dev vtnet0 proto bird
10.3.0.0/24 proto bird
nexthop via 10.0.0.2 dev vtnet0 weight 3
nexthop via 10.0.0.3 dev vtnet0 weight 4
```
For some of these Clang produced a warning that "a function declaration
without a prototype is deprecated in all versions of C". In other cases
the function defintion used () which did not match the header
declaration, which used (void).
Sponsored by: The FreeBSD Foundation
By adding missing ifdefs for INET and INET6 when building LINT-NOIP .
Differential Revision: https://reviews.freebsd.org/D36731
Sponsored by: NVIDIA Networking
Netlinks is a communication protocol currently used in Linux kernel to modify,
read and subscribe for nearly all networking state. Interfaces, addresses, routes,
firewall, fibs, vnets, etc are controlled via netlink.
It is async, TLV-based protocol, providing 1-1 and 1-many communications.
The current implementation supports the subset of NETLINK_ROUTE
family. To be more specific, the following is supported:
* Dumps:
- routes
- nexthops / nexthop groups
- interfaces
- interface addresses
- neighbors (arp/ndp)
* Notifications:
- interface arrival/departure
- interface address arrival/departure
- route addition/deletion
* Modifications:
- adding/deleting routes
- adding/deleting nexthops/nexthops groups
- adding/deleting neghbors
- adding/deleting interfaces (basic support only)
* Rtsock interaction
- route events are bridged both ways
The implementation also supports the NETLINK_GENERIC family framework.
Implementation notes:
Netlink is implemented via loadable/unloadable kernel module,
not touching many kernel parts.
Each netlink socket uses dedicated taskqueue to support async operations
that can sleep, such as interface creation. All message processing is
performed within these taskqueues.
Compatibility:
Most of the Netlink data models specified above maps to FreeBSD concepts
nicely. Unmodified ip(8) binary correctly works with
interfaces, addresses, routes, nexthops and nexthop groups. Some
software such as net/bird require header-only modifications to compile
and work with FreeBSD netlink.
Reviewed by: imp
Differential Revision: https://reviews.freebsd.org/D36002
MFC after: 2 months