Add support for the scenario where a user adds/deletes paths for a single
prefix one-by-one, all with different weights.
This change adds a new FreeBSD-specific RTA attribute, NL_RTA_WEIGHT.
When dumping non-multipath routes, this attribute is added if the
route weight is not RT_DEFAULT_WEIGHT.
When adding a new route, this attribute is parsed as a relative path
weight.
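A minimal sketch of how such an attribute could be encoded on the wire,
assuming the weight is carried as a 32-bit unsigned integer in the usual
netlink TLV layout (u16 length, u16 type, payload padded to 4 bytes).
The attribute number below is a placeholder, not the real NL_RTA_WEIGHT
value, and pack_u32_attr() is a hypothetical helper:

```python
import struct

# Placeholder attribute type: the real NL_RTA_WEIGHT value comes from the
# FreeBSD netlink headers and is NOT reproduced here.
PLACEHOLDER_NL_RTA_WEIGHT = 100


def nla_align(length: int) -> int:
    """Round a length up to the 4-byte netlink attribute alignment."""
    return (length + 3) & ~3


def pack_u32_attr(attr_type: int, value: int) -> bytes:
    """Encode one attribute: u16 length, u16 type, then a host-order u32."""
    payload = struct.pack("=I", value)
    header = struct.pack("=HH", 4 + len(payload), attr_type)
    attr = header + payload
    # Pad with zero bytes up to the next 4-byte boundary (none needed for u32).
    return attr + bytes(nla_align(len(attr)) - len(attr))


# A relative path weight of 3, ready to append to a route message.
weight_attr = pack_u32_attr(PLACEHOLDER_NL_RTA_WEIGHT, 3)
```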
MFC after: 2 weeks
The prevailing pattern seems to be to simply initialize all fields to
zero. Without this, it's possible to trigger a branch on uninitialized
memory, specifically, when testing nw->ignore_limit in
nlmsg_refill_buffer().
Initialize the writer structure in a couple of functions where this is
necessary.
Reported by: KMSAN
Reviewed by: melifaro
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D38213
Some existing applications set up their Netlink socket with
SOCK_DGRAM instead of SOCK_RAW. Update the manpage to clarify
that the default way of creating the socket should be with
SOCK_RAW. Update the code to support both SOCK_RAW and SOCK_DGRAM.
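A small userland sketch of the two socket types, written in Python for
brevity. Whether AF_NETLINK is exposed depends on the Python build and
the operating system, so the hypothetical helper below returns None
rather than failing when netlink is unavailable:

```python
import socket


def try_netlink_socket(sock_type: int):
    """Return an open NETLINK_ROUTE socket of the given type, or None."""
    family = getattr(socket, "AF_NETLINK", None)
    if family is None:
        return None  # this Python build has no netlink support
    # NETLINK_ROUTE is protocol 0; fall back to that if the name is missing.
    protocol = getattr(socket, "NETLINK_ROUTE", 0)
    try:
        return socket.socket(family, sock_type, protocol)
    except OSError:
        return None  # the running kernel refused this combination


# SOCK_RAW is the documented default; SOCK_DGRAM is accepted for compatibility.
for name in ("SOCK_RAW", "SOCK_DGRAM"):
    sock = try_netlink_socket(getattr(socket, name))
    print(name, "ok" if sock is not None else "unavailable")
    if sock is not None:
        sock.close()
```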
Reviewed By: pauamma
Differential Revision: https://reviews.freebsd.org/D38075
This file is indented with a mixture of tabs and spaces. No functional
change intended.
Reviewed by: melifaro
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D38100
Some users of nlmsg_reserve_object() and nlmsg_reserve_data() are not
careful to fully initialize pad and reserved fields, allowing
uninitialized bytes to leak to userspace. For example, dump_nhgrp()
doesn't set nhm->resvd = 0.
Meanwhile, nlmsg_get_ns_buf() and nlmsg_get_ns_lbuf() zero-initialize
the buffer, so nlmsg_get_ns_mbuf() is inconsistent. Let's just make
them all behave the same here.
Reported by: KMSAN
Reviewed by: melifaro
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D38098
Some apps try to provide only the non-zero part of the required message
header instead of the full one. It happens when fetching routes or
interface addresses, where the first header byte is the family.
This behavior is "illegal" under the "strict" Netlink socket option,
however there are many applications out there doing things in the
"old" way.
Support this use case by copying the provided bytes into a temporary
zero-filled header and running the parser on this header instead.
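The idea can be sketched in userland terms as follows, using the
12-byte struct rtmsg layout as the example header. This is only an
illustration of the zero-extension step, not the kernel code;
parse_short_rtmsg() is a hypothetical name:

```python
import struct

# struct rtmsg: eight u8 fields (family, dst_len, src_len, tos, table,
# protocol, scope, type) followed by a u32 flags field.
RTMSG_FORMAT = "=BBBBBBBBI"
RTMSG_SIZE = struct.calcsize(RTMSG_FORMAT)


def parse_short_rtmsg(payload: bytes) -> tuple:
    """Parse a possibly truncated struct rtmsg by zero-extending it."""
    header = bytearray(RTMSG_SIZE)         # temporary zero-filled header
    provided = payload[:RTMSG_SIZE]
    header[:len(provided)] = provided      # copy only the bytes supplied
    return struct.unpack(RTMSG_FORMAT, bytes(header))


# An "old-style" request that supplies only rtm_family (AF_INET is 2).
fields = parse_short_rtmsg(b"\x02")
```

Every field the sender omitted is parsed as zero, which matches what a
full header with only the family set would have contained.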
Reported by: Goran Mekić <meka@tilda.center>
Output the proper attributes for IPv4/IPv6 ifaddrs:
* IFA_ADDRESS contains the local address in every case except p2p,
where it contains the peer address
* IFA_LOCAL contains the local address. It is always present in IPv4,
or in IPv6/p2p.
* IFA_BROADCAST contains the network broadcast address (if any)
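The selection rules above can be restated as a short sketch. The
function name and the string-based family argument are illustrative
assumptions, not part of the change:

```python
def ifaddr_attrs(family, local, peer=None, broadcast=None):
    """Return the attributes to emit for one interface address.

    family is "inet" or "inet6"; a non-None peer marks a p2p address.
    """
    is_p2p = peer is not None
    # IFA_ADDRESS: peer address on p2p links, local address otherwise.
    attrs = {"IFA_ADDRESS": peer if is_p2p else local}
    # IFA_LOCAL: always for IPv4, and for IPv6 only on p2p links.
    if family == "inet" or is_p2p:
        attrs["IFA_LOCAL"] = local
    # IFA_BROADCAST: only when the network has a broadcast address.
    if broadcast is not None:
        attrs["IFA_BROADCAST"] = broadcast
    return attrs


v4 = ifaddr_attrs("inet", "192.0.2.1", broadcast="192.0.2.255")
v6 = ifaddr_attrs("inet6", "2001:db8::1")
v6_p2p = ifaddr_attrs("inet6", "2001:db8::1", peer="2001:db8::2")
```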
Reported by: Adam Wood <aswood@gmail.com>
Tested by: Adam Wood <aswood@gmail.com>
* Separate interface creation from interface modification code
* Support setting some interface attributes (ifdescr, mtu, up/down, promisc)
* Improve interaction with cloners that need to parse/write custom
interface attributes
* Add bitmask-based way of checking if the attribute is present in the
message
* Don't use multipart RTM_GETLINK replies when searching for a
specific interface name
* Use ENODEV instead of ENOENT in case of failed RTM_GETLINK search
* Add python netlink test helpers
* Add some netlink interface tests
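As an illustration of the bitmask-based presence check mentioned in the
list above, here is a minimal sketch. The helper names are hypothetical
and the kernel implementation may differ; the IFLA_* numbers are the
standard rtnetlink values:

```python
IFLA_ADDRESS = 1
IFLA_IFNAME = 3
IFLA_MTU = 4


def attr_bitmask(attr_types):
    """Build a bitmask with one bit set per attribute type seen."""
    mask = 0
    for attr_type in attr_types:
        mask |= 1 << attr_type
    return mask


def has_attr(mask, attr_type):
    """Check whether an attribute type was present in the message."""
    return bool(mask & (1 << attr_type))


# A message that carried only the interface name and the MTU.
seen = attr_bitmask([IFLA_IFNAME, IFLA_MTU])
```

A parser can then tell "attribute absent" apart from "attribute present
with a zero value", which matters when only some fields are being
modified.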
Differential Revision: https://reviews.freebsd.org/D37668
* Add link-state change notifications by subscribing to ifnet_link_event.
In the Linux netlink model, link state is reported in 2 places: first is
the IFLA_OPERSTATE, which stores state per RFC2863.
The second is the IFF_LOWER_UP interface flag. As many applications rely
on the latter, reserve 1 bit from if_flags, named IFF_NETLINK_1.
This flag is mapped to IFF_LOWER_UP in the netlink headers. This is done
to avoid making applications think this flag is actually
supported / present in non-netlink outputs.
* Add flag change notifications, by hooking into rt_ifmsg().
In the netlink model, a notification should include the bitmask of the
changed flags. Update rt_ifmsg() to include such a bitmask.
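A sketch of how a changed-flags bitmask can be derived, assuming it is
simply the XOR of the old and new flag words. The helper name is
hypothetical; IFF_UP and IFF_PROMISC use their customary values:

```python
IFF_UP = 0x1
IFF_PROMISC = 0x100


def changed_flags(old_flags: int, new_flags: int) -> int:
    """Return a mask with a bit set for every flag that differs."""
    return old_flags ^ new_flags


# The interface was up; promiscuous mode is then switched on.
mask = changed_flags(IFF_UP, IFF_UP | IFF_PROMISC)
```

A listener can test the mask to learn which flags changed without
keeping its own copy of the previous state.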
Differential Revision: https://reviews.freebsd.org/D37597
Store user-supplied source protocol in the nexthops and nexthop groups.
Protocol specification helps routing daemons like bird to quickly
identify self-originated routes after a crash or restart.
Example:
```
10.2.0.0/24 via 10.0.0.2 dev vtnet0 proto bird
10.3.0.0/24 proto bird
nexthop via 10.0.0.2 dev vtnet0 weight 3
nexthop via 10.0.0.3 dev vtnet0 weight 4
```
Netlink has a confirmation/error reporting mechanism for the sent
messages. The kernel explicitly acks each message if requested (NLM_F_ACK)
or if message processing results in an error.
Similarly, for multipart messages - typically dumps, where each message
represents a single object like an interface or a route - another
message, NLMSG_DONE is used to indicate the end of dump and the
resulting status.
As a result, a successful dump ends with both NLMSG_DONE and NLMSG_ERROR
messages.
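For reference, the acknowledgement format can be sketched in userland
terms: an ack is an NLMSG_ERROR message whose payload is an int error
code (0 on success) followed by the header of the original request. The
helper names below are hypothetical and the ack is built locally rather
than received from a kernel:

```python
import struct

NLMSG_ERROR = 2
NLM_F_REQUEST = 0x1
NLM_F_ACK = 0x4
RTM_GETLINK = 18

# struct nlmsghdr: u32 len, u16 type, u16 flags, u32 seq, u32 pid.
NLMSGHDR = struct.Struct("=IHHII")


def build_request(msg_type: int, seq: int, payload: bytes = b"") -> bytes:
    """Build a request that asks the kernel for an explicit ack."""
    flags = NLM_F_REQUEST | NLM_F_ACK
    length = NLMSGHDR.size + len(payload)
    return NLMSGHDR.pack(length, msg_type, flags, seq, 0) + payload


def build_ack(request: bytes, error: int = 0) -> bytes:
    """Build the ack a responder would send: error code + original header."""
    _, _, _, seq, pid = NLMSGHDR.unpack_from(request)
    body = struct.pack("=i", error) + request[:NLMSGHDR.size]
    return NLMSGHDR.pack(NLMSGHDR.size + len(body), NLMSG_ERROR, 0, seq, pid) + body


def parse_ack(message: bytes):
    """Return (sequence number, error code) from an NLMSG_ERROR message."""
    _, msg_type, _, seq, _ = NLMSGHDR.unpack_from(message)
    if msg_type != NLMSG_ERROR:
        raise ValueError("not an NLMSG_ERROR message")
    (error,) = struct.unpack_from("=i", message, NLMSGHDR.size)
    return seq, error


request = build_request(RTM_GETLINK, seq=7)
ack = build_ack(request)
```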
RFC 3549 does not say anything specific about such case.
Linux adopted an optimisation which suppresses the NLMSG_ERROR message
when NLMSG_DONE is already sent. Certain libraries/applications like
libnl depend on such behavior.
Suppress sending NLMSG_ERROR if NLMSG_DONE is already sent, by
setting the newly added 'suppress_ack' flag in the writer and checking
this flag when generating the ack.
This change restores libnl compatibility.
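The suppression logic can be modelled roughly as below. This is a
simplified userland model, not the kernel code: SketchWriter and
finish_request() are hypothetical names, and only the 'suppress_ack'
flag itself comes from this change:

```python
NLMSG_ERROR = 2
NLMSG_DONE = 3
NLM_F_ACK = 0x4
RTM_NEWLINK = 16


class SketchWriter:
    """Records emitted message types and tracks ack suppression."""

    def __init__(self):
        self.sent = []
        self.suppress_ack = False

    def write(self, msg_type: int) -> None:
        self.sent.append(msg_type)
        if msg_type == NLMSG_DONE:
            # NLMSG_DONE already carries the final status of the dump.
            self.suppress_ack = True


def finish_request(writer: SketchWriter, request_flags: int, error: int) -> None:
    """Emit an ack if one is due and has not been made redundant."""
    ack_requested = bool(request_flags & NLM_F_ACK)
    ack_due = ack_requested or error != 0
    if ack_due and not writer.suppress_ack:
        writer.write(NLMSG_ERROR)


# A dump: two objects, then NLMSG_DONE; the trailing ack is suppressed.
dump = SketchWriter()
for msg_type in (RTM_NEWLINK, RTM_NEWLINK, NLMSG_DONE):
    dump.write(msg_type)
finish_request(dump, NLM_F_ACK, 0)

# A non-dump request with NLM_F_ACK still gets its NLMSG_ERROR ack.
single = SketchWriter()
finish_request(single, NLM_F_ACK, 0)
```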
Before:
```
~ nl-link-list
Error: Unable to allocate link cache: Message sequence number mismatch
```
After:
```
~ nl-link-list
vtnet0 ether 52:54:00:14:e3:19 <broadcast,multicast,up,running>
lo0 ieee1394 <loopback,multicast,up,running>
```
Reviewed by: bapt,pauamma
Tested by: bapt
Differential Revision: https://reviews.freebsd.org/D37565
For some of these Clang produced a warning that "a function declaration
without a prototype is deprecated in all versions of C". In other cases
the function definition used () which did not match the header
declaration, which used (void).
Sponsored by: The FreeBSD Foundation
By adding missing ifdefs for INET and INET6 when building LINT-NOIP.
Differential Revision: https://reviews.freebsd.org/D36731
Sponsored by: NVIDIA Networking
Netlink is a communication protocol currently used in the Linux kernel to modify,
read and subscribe to nearly all networking state. Interfaces, addresses, routes,
firewall, fibs, vnets, etc. are controlled via netlink.
It is an async, TLV-based protocol, providing 1-1 and 1-many communications.
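To illustrate the TLV encoding, the sketch below walks a buffer of
netlink attributes, each consisting of a u16 length, a u16 type and a
payload padded to a 4-byte boundary. parse_attrs() is a hypothetical
helper and the sample buffer is built by hand:

```python
import struct

IFLA_IFNAME = 3
IFLA_MTU = 4


def parse_attrs(buf: bytes):
    """Split a buffer into (type, payload) pairs, stopping on bad input."""
    attrs = []
    offset = 0
    while offset + 4 <= len(buf):
        length, attr_type = struct.unpack_from("=HH", buf, offset)
        if length < 4 or offset + length > len(buf):
            break  # malformed attribute: do not read past the buffer
        attrs.append((attr_type, buf[offset + 4:offset + length]))
        offset += (length + 3) & ~3  # advance to the next 4-byte boundary
    return attrs


# Two attributes: an interface name and a 32-bit MTU.
sample = (
    struct.pack("=HH", 8, IFLA_IFNAME) + b"lo0\x00"
    + struct.pack("=HHI", 8, IFLA_MTU, 1500)
)
parsed = parse_attrs(sample)
```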
The current implementation supports the subset of NETLINK_ROUTE
family. To be more specific, the following is supported:
* Dumps:
- routes
- nexthops / nexthop groups
- interfaces
- interface addresses
- neighbors (arp/ndp)
* Notifications:
- interface arrival/departure
- interface address arrival/departure
- route addition/deletion
* Modifications:
- adding/deleting routes
- adding/deleting nexthops/nexthop groups
- adding/deleting neighbors
- adding/deleting interfaces (basic support only)
* Rtsock interaction
- route events are bridged both ways
The implementation also supports the NETLINK_GENERIC family framework.
Implementation notes:
Netlink is implemented via a loadable/unloadable kernel module,
not touching many kernel parts.
Each netlink socket uses a dedicated taskqueue to support async operations
that can sleep, such as interface creation. All message processing is
performed within these taskqueues.
Compatibility:
Most of the Netlink data models specified above map to FreeBSD concepts
nicely. An unmodified ip(8) binary works correctly with
interfaces, addresses, routes, nexthops and nexthop groups. Some
software, such as net/bird, requires header-only modifications to compile
and work with FreeBSD netlink.
Reviewed by: imp
Differential Revision: https://reviews.freebsd.org/D36002
MFC after: 2 months