rttrash (unused but not yet delete entries) were eliminated
during routing rework. Remove reading these symbols from the kernel.
PR: 254681
Reported by: rashey@superbox.pl
MFC after: immediately
This is the foundational change for the routing subsytem rearchitecture.
More details and goals are available in https://reviews.freebsd.org/D24141 .
This patch introduces concept of nexthop objects and new nexthop-based
routing KPI.
Nexthops are objects, containing all necessary information for performing
the packet output decision. Output interface, mtu, flags, gw address goes
there. For most of the cases, these objects will serve the same role as
the struct rtentry is currently serving.
Typically there will be low tens of such objects for the router even with
multiple BGP full-views, as these objects will be shared between routing
entries. This allows to store more information in the nexthop.
New KPI:
struct nhop_object *fib4_lookup(uint32_t fibnum, struct in_addr dst,
uint32_t scopeid, uint32_t flags, uint32_t flowid);
struct nhop_object *fib6_lookup(uint32_t fibnum, const struct in6_addr *dst6,
uint32_t scopeid, uint32_t flags, uint32_t flowid);
These 2 function are intended to replace all all flavours of
<in_|in6_>rtalloc[1]<_ign><_fib>, mpath functions and the previous
fib[46]-generation functions.
Upon successful lookup, they return nexthop object which is guaranteed to
exist within current NET_EPOCH. If longer lifetime is desired, one can
specify NHR_REF as a flag and get a referenced version of the nexthop.
Reference semantic closely resembles rtentry one, allowing sed-style conversion.
Additionally, another 2 functions are introduced to support uRPF functionality
inside variety of our firewalls. Their primary goal is to hide the multipath
implementation details inside the routing subsystem, greatly simplifying
firewalls implementation:
int fib4_lookup_urpf(uint32_t fibnum, struct in_addr dst, uint32_t scopeid,
uint32_t flags, const struct ifnet *src_if);
int fib6_lookup_urpf(uint32_t fibnum, const struct in6_addr *dst6, uint32_t scopeid,
uint32_t flags, const struct ifnet *src_if);
All functions have a separate scopeid argument, paving way to eliminating IPv6 scope
embedding and allowing to support IPv4 link-locals in the future.
Structure changes:
* rtentry gets new 'rt_nhop' pointer, slightly growing the overall size.
* rib_head gets new 'rnh_preadd' callback pointer, slightly growing overall sz.
Old KPI:
During the transition state old and new KPI will coexists. As there are another 4-5
decent-sized conversion patches, it will probably take a couple of weeks.
To support both KPIs, fields not required by the new KPI (most of rtentry) has to be
kept, resulting in the temporary size increase.
Once conversion is finished, rtentry will notably shrink.
More details:
* architectural overview: https://reviews.freebsd.org/D24141
* list of the next changes: https://reviews.freebsd.org/D24232
Reviewed by: ae,glebius(initial version)
Differential Revision: https://reviews.freebsd.org/D24232
Mainly focus on files that use BSD 3-Clause license.
The Software Package Data Exchange (SPDX) group provides a specification
to make it easier for automated tools to detect and summarize well known
opensource licenses. We are gradually adopting the specification, noting
that the tags are considered only advisory and do not, in any way,
superceed or replace the license texts.
Special thanks to Wind River for providing access to "The Duke of
Highlander" tool: an older (2014) run over FreeBSD tree was useful as a
starting point.
messages before accessing message fields that may not be present,
removing dead/duplicate/misleading code along the way.
Document the message format for each routing socket message in
route.h.
Fix a bug in usr.bin/netstat introduced in r287351 that resulted in
pointer computation with essentially random 16-bit offsets and
dereferencing of the results.
Reviewed by: ae
MFC after: 1 month
Differential Revision: https://reviews.freebsd.org/D10330
Renumber cluase 4 to 3, per what everybody else did when BSD granted
them permission to remove clause 3. My insistance on keeping the same
numbering for legal reasons is too pedantic, so give up on that point.
Submitted by: Jan Schaumann <jschauma@stevens.edu>
Pull Request: https://github.com/freebsd/freebsd/pull/96
Expand inet6name() line buffer to NI_MAXHOST and use strlcpy/snprintf
in various places.
Reported by: Anton Yuzhaninov <citrin citrin ru>
MFC after: 3 days
Differential Revision: https://reviews.freebsd.org/D8916
In the case the width is less than 0, we are returning an uninitialized
value. For practical purposes the return value is ignored but initialize
it to avoid trouble.
CID: 1341619
routepr() (-r flag). It is too narrow to show an IPv6 prefix
in most cases.
- Accept "local" as a synonym of "unix" in protocol family name.
- Show a prefix length in CIDR notation when name resolution failed in
netname().
- Make routename() and netname() AF-independent and remove
unnecessary typecasting from struct sockaddr.
- Use getnameinfo(3) to format L2 addr in intpr().
- Fix a bug which showed "Address" when -A flag is specfied in pr_rthdr().
- Replace cryptic GETSA() macro with SA_SIZE().
- Fix declarations shadowing local variables with the same names.
- Add more static, remove unused header files and variables.
MFC after: 1 week
in 'netstat -r'.
The netstat/route.c was the last abuser of struct ifnet and struct
rtentry in the tree. With this change if_var.h can become kernel
only include, _WANT_RTENTRY can go away and projects/ifnet and
projects/routing can go forward.
Differential Revision: https://reviews.freebsd.org/D2242
Reviewed by: melifaro, gnn
Sponsored by: Nginx, Inc.
Sponsored by: Netflix
Obtained from: Phil Shafer <phil@juniper.net>
Ported to -current by: alfred@ (mostly), Kim Shrier
Formatting: marcel@
Sponsored by: Juniper Networks, Inc.
- Increase WID_IF_DEFAULT() from 6 to 8 (the default for AF_INET6) because
we have interfaces with longer names than 6 chars like epairN{a,b}.
- Style fixes.
AppleTalk was a network transport protocol for Apple Macintosh devices
in 80s and then 90s. Starting with Mac OS X in 2000 the AppleTalk was
a legacy protocol and primary networking protocol is TCP/IP. The last
Mac OS X release to support AppleTalk happened in 2009. The same year
routing equipment vendors (namely Cisco) end their support.
Thus, AppleTalk won't be supported in FreeBSD 11.0-RELEASE.
IPX was a network transport protocol in Novell's NetWare network operating
system from late 80s and then 90s. The NetWare itself switched to TCP/IP
as default transport in 1998. Later, in this century the Novell Open
Enterprise Server became successor of Novell NetWare. The last release
that claimed to still support IPX was OES 2 in 2007. Routing equipment
vendors (e.g. Cisco) discontinued support for IPX in 2011.
Thus, IPX won't be supported in FreeBSD 11.0-RELEASE.
- Use counter(9) for rt_pksent (former rt_rmx.rmx_pksent). This
removes another cache trashing ++ from packet forwarding path.
- Create zini/fini methods for the rtentry UMA zone. Via initialize
mutex and counter in them.
- Fix reporting of rmx_pksent to routing socket.
- Fix netstat(1) to report "Use" both in kvm(3) and sysctl(3) mode.
The change is mostly targeted for stable/10 merge. For head,
rt_pksent is expected to just disappear.
Discussed with: melifaro
Sponsored by: Netflix
Sponsored by: Nginx, Inc.
necessary symbols needed per subsystem. Main kvm(3) init is now delayed
as much as possbile. This finally fixes performance issues reported in
kern/167204.
Some non-working code (ng_socket.ko symbol addresses calculation) removed.
Some global variables eliminated.
PR: kern/167204
MFC after: 4 weeks
instead of peeking inside in-kernel radix via kget.
This permits us to change kernel structures without breaking userland.
Additionally, this change provide more reliable and faster output.
`Refs` and `Use` fields available in IPv4 by default (and via -W
for other families) were removed. `Refs` is radix-specific thing
which is not informative for users. `Use` field value is handy sometimes,
but a) current API does not support it and b) I'm not sure we will
support per-rte pcpu counters in near future.
Old method of retrieving data is still supported (either by defining
NewTree=0 or running netstat with -A). However, Refs/Use fields are
hidden.
Sponsored by: Yandex LLC
MFC after: 4 weeks
PR: kern/167204
libkvm digging in kernel memory. This is possible since r231506 made
getifaddrs(3) to supply if_data for each ifaddr.
The pros of this change is that now netstat(1) doesn't know about kernel
struct ifnet and struct ifaddr. And these structs are about to change
significantly in head soon. New netstat binary will work well with 10.0
and any future kernel.
The cons is that now it isn't possible to obtain interface statistics
from a vmcore.
Functions intpr() and sidewaysintpr() were rewritten from scratch.
The output of netstat(1) has underwent the following changes:
1) The MTU is not printed for protocol addresses, since it has no notion.
Dash is printed instead. If there would be a strong desire to return
previous output, it is doable.
2) Output interface queue drops are not printed. Currently this data isn't
available to userland via any API. We plan to drop 'struct ifqueue' from
'struct ifnet' very soon, so old kvm(3) access to queue drops is soon
to be broken, too. The plan is that drivers would handle their queues
theirselves and a new field in if_data would be updated in case of drops.
3) In-kernel reference count for multicast addresses isn't printed. I doubt
that anyone used it. Anyway, netstat(1) is sysadmin tool, not kernel
debugger.
Sponsored by: Netflix
Sponsored by: Nginx, Inc.
userland via routing socket or sysctl. This eliminates the following
KAME-specific sin6_scope_id handling routine from each userland utility:
sin6.sin6_scope_id = ntohs(*(u_int16_t *)&sin6.sin6_addr.s6_addr[2]);
This behavior can be controlled by net.inet6.ip6.deembed_scopeid. This is
set to 1 by default (sin6_scope_id will be filled in the kernel).
Reviewed by: bz
stack from the output of `netstat -ani'.
- The node-local multicast address in the output of `netstat -rn'
should be handled as well.
Spotted by: Bernd Walter <ticso__at__cicely7.cicely.de>
is in accordance with the information provided at
ftp://ftp.cs.berkeley.edu/pub/4bsd/README.Impt.License.Change
Also add $FreeBSD$ to a few files to keep svn happy.
Discussed with: imp, rwatson
otherwise sign extension leads to unlikely values when in the negative
range of the signed short structure fields that hold the statistics.
The type used to hold routing statistics is arguably also incorrect.
MFC after: 3 days
an accessor function to get the correct rnh pointer back.
Update netstat to get the correct pointer using kvm_read()
as well.
This not only fixes the ABI problem depending on the kernel
option but also permits the tunable to overwrite the kernel
option at boot time up to MAXFIBS, enlarging the number of
FIBs without having to recompile. So people could just use
GENERIC now.
Reviewed by: julian, rwatson, zec
X-MFC: not possible
1. separating L2 tables (ARP, NDP) from the L3 routing tables
2. removing as much locking dependencies among these layers as
possible to allow for some parallelism in the search operations
3. simplify the logic in the routing code,
The most notable end result is the obsolescent of the route
cloning (RTF_CLONING) concept, which translated into code reduction
in both IPv4 ARP and IPv6 NDP related modules, and size reduction in
struct rtentry{}. The change in design obsoletes the semantics of
RTF_CLONING, RTF_WASCLONE and RTF_LLINFO routing flags. The userland
applications such as "arp" and "ndp" have been modified to reflect
those changes. The output from "netstat -r" shows only the routing
entries.
Quite a few developers have contributed to this project in the
past: Glebius Smirnoff, Luigi Rizzo, Alessandro Cerri, and
Andre Oppermann. And most recently:
- Kip Macy revised the locking code completely, thus completing
the last piece of the puzzle, Kip has also been conducting
active functional testing
- Sam Leffler has helped me improving/refactoring the code, and
provided valuable reviews
- Julian Elischer setup the perforce tree for me and has helped
me maintaining that branch before the svn conversion