Commit Graph

61 Commits

Author SHA1 Message Date
melifaro
1a7f90755f Introduce scalable route multipath.
This change is based on the nexthop objects landed in D24232.

The change introduces the concept of nexthop groups.
Each group contains the collection of nexthops with their
 relative weights and a dataplane-optimized structure to enable
 efficient nexthop selection.

Simular to the nexthops, nexthop groups are immutable. Dataplane part
 gets compiled during group creation and is basically an array of
 nexthop pointers, compiled w.r.t their weights.

With this change, `rt_nhop` field of `struct rtentry` contains either
 nexthop or nexthop group. They are distinguished by the presense of
 NHF_MULTIPATH flag.
All dataplane lookup functions returns pointer to the nexthop object,
leaving nexhop groups details inside routing subsystem.

User-visible changes:

The change is intended to be backward-compatible: all non-mpath operations
 should work as before with ROUTE_MPATH and net.route.multipath=1.

All routes now comes with weight, default weight is 1, maximum is 2^24-1.

Current maximum multipath group width is statically set to 64.
 This will become sysctl-tunable in the followup changes.

Using functionality:
* Recompile kernel with ROUTE_MPATH
* set net.route.multipath to 1

route add -6 2001:db8::/32 2001:db8::2 -weight 10
route add -6 2001:db8::/32 2001:db8::3 -weight 20

netstat -6On

Nexthop groups data

Internet6:
GrpIdx  NhIdx     Weight   Slots                                 Gateway     Netif  Refcnt
1         ------- ------- ------- --------------------------------------- ---------       1
              13      10       1                             2001:db8::2     vlan2
              14      20       2                             2001:db8::3     vlan2

Next steps:
* Land outbound hashing for locally-originated routes ( D26523 ).
* Fix net/bird multipath (net/frr seems to work fine)
* Add ROUTE_MPATH to GENERIC
* Set net.route.multipath=1 by default

Tested by:	olivier
Reviewed by:	glebius
Relnotes:	yes
Differential Revision:	https://reviews.freebsd.org/D26449
2020-10-03 10:47:17 +00:00
melifaro
4eda536b2e Introduce nexthop objects and new routing KPI.
This is the foundational change for the routing subsytem rearchitecture.
 More details and goals are available in https://reviews.freebsd.org/D24141 .

This patch introduces concept of nexthop objects and new nexthop-based
 routing KPI.

Nexthops are objects, containing all necessary information for performing
 the packet output decision. Output interface, mtu, flags, gw address goes
 there. For most of the cases, these objects will serve the same role as
 the struct rtentry is currently serving.
Typically there will be low tens of such objects for the router even with
 multiple BGP full-views, as these objects will be shared between routing
 entries. This allows to store more information in the nexthop.

New KPI:

struct nhop_object *fib4_lookup(uint32_t fibnum, struct in_addr dst,
  uint32_t scopeid, uint32_t flags, uint32_t flowid);
struct nhop_object *fib6_lookup(uint32_t fibnum, const struct in6_addr *dst6,
  uint32_t scopeid, uint32_t flags, uint32_t flowid);

These 2 function are intended to replace all all flavours of
 <in_|in6_>rtalloc[1]<_ign><_fib>, mpath functions  and the previous
 fib[46]-generation functions.

Upon successful lookup, they return nexthop object which is guaranteed to
 exist within current NET_EPOCH. If longer lifetime is desired, one can
 specify NHR_REF as a flag and get a referenced version of the nexthop.
 Reference semantic closely resembles rtentry one, allowing sed-style conversion.

Additionally, another 2 functions are introduced to support uRPF functionality
 inside variety of our firewalls. Their primary goal is to hide the multipath
 implementation details inside the routing subsystem, greatly simplifying
 firewalls implementation:

int fib4_lookup_urpf(uint32_t fibnum, struct in_addr dst, uint32_t scopeid,
  uint32_t flags, const struct ifnet *src_if);
int fib6_lookup_urpf(uint32_t fibnum, const struct in6_addr *dst6, uint32_t scopeid,
  uint32_t flags, const struct ifnet *src_if);

All functions have a separate scopeid argument, paving way to eliminating IPv6 scope
 embedding and allowing to support IPv4 link-locals in the future.

Structure changes:
 * rtentry gets new 'rt_nhop' pointer, slightly growing the overall size.
 * rib_head gets new 'rnh_preadd' callback pointer, slightly growing overall sz.

Old KPI:
During the transition state old and new KPI will coexists. As there are another 4-5
 decent-sized conversion patches, it will probably take a couple of weeks.
To support both KPIs, fields not required by the new KPI (most of rtentry) has to be
 kept, resulting in the temporary size increase.
Once conversion is finished, rtentry will notably shrink.

More details:
* architectural overview: https://reviews.freebsd.org/D24141
* list of the next changes: https://reviews.freebsd.org/D24232

Reviewed by:	ae,glebius(initial version)
Differential Revision:	https://reviews.freebsd.org/D24232
2020-04-12 14:30:00 +00:00
bz
a0dcb7af20 After inpcb route caching was put back in place there is no need for
flowtable anymore (as flowtable was never considered to be useful in
the forwarding path).

Reviewed by:		np
Differential Revision:	https://reviews.freebsd.org/D11448
2017-07-27 13:03:36 +00:00
bdrewery
2ab2ea6fbd Replace DPSRCS that work fine in SRCS.
This is so that 'make depend' is not a required build step in these
files.

DPSRCS is overall unneeded.  DPSRCS already contains SRCS, so anything
which can safely be in SRCS should be.  DPSRCS is mostly just a way to
generate files that should not be linked into the final PROG/LIB.  For
headers and grammars it is safe for them to be in SRCS since they will
be excluded during linking and installation.

The only remaining uses of DPSRCS are for generating .c or .o files that
must be built before 'make depend' can run 'mkdep' on the SRCS c files
list.  A semi-proper example is in tests/sys/kern/acct/Makefile where a
checked-in .c file has an #include on a generated .c file.  The
generated .c file should not be linked into the final PROG though since
it is #include'd.  The more proper way here is just to build/link it in
though without DPSRCS.  Another example is in sys/modules/linux/Makefile
where a shell script runs to parse a DPSRCS .o file that should not be
linked into the module.  Beyond those, the need for DPSRCS is largely
unneeded, redundant, and forces 'make depend' to be ran.  Generally,
these Makefiles should avoid the need for DPSRCS and define proper
dependencies for their files as well.

An example of an improper usage and why this matters is in usr.bin/netstat.
nl_defs.h was only in DPSRCS and so was not generated during 'make all',
but only during 'make depend'.  The files including it lacked proper
depenencies on it, which forced running 'make depend' to workaround that
bug.  The 'make depend' target should mostly be used for incremental build
help, not to produce a working build.  This specific example was broken in
the meta build until r287905 since it does not run 'make depend'.

The gnu/lib/libreadline/readline case is fine since bsd.lib.mk has 'OBJS:
SRCS:M*.h' when there is no .depend file.

Sponsored by:	EMC / Isilon Storage Division
MFC after:	1 week
2015-11-25 20:38:17 +00:00
hrs
07e5aa0065 Simplify kvm symbol resolution and error handling. The symbol table
nl_symbols will eventually be organized into several modules depending
on MK_* variables.
2015-09-02 18:51:36 +00:00
marcel
02ffac2cca Upgrade libxo to 0.4.5.
Local changes incorporated by 0.4.5: r284340
Local changes retained: r276260, r282117

Obtained from:	https://github.com/Juniper/libxo
2015-08-24 16:26:20 +00:00
marcel
0ea1b83e37 Convert netstat to use libxo.
Obtained from:  Phil Shafer <phil@juniper.net>
Ported to -current by: alfred@ (mostly), Kim Shrier
Formatting: marcel@
Sponsored by:   Juniper Networks, Inc.
2015-02-21 23:47:20 +00:00
bapt
8d6c7a49a6 Convert to usr.bin/ to LIBADD
Reduce overlinking
2014-11-25 14:29:10 +00:00
imp
2118f42afd Use src.opts.mk in preference to bsd.own.mk except where we need stuff
from the latter.
2014-05-06 04:22:01 +00:00
glebius
80e85e32a5 Remove AppleTalk support.
AppleTalk was a network transport protocol for Apple Macintosh devices
in 80s and then 90s. Starting with Mac OS X in 2000 the AppleTalk was
a legacy protocol and primary networking protocol is TCP/IP. The last
Mac OS X release to support AppleTalk happened in 2009. The same year
routing equipment vendors (namely Cisco) end their support.

Thus, AppleTalk won't be supported in FreeBSD 11.0-RELEASE.
2014-03-14 06:29:43 +00:00
glebius
d494babace Remove IPX support.
IPX was a network transport protocol in Novell's NetWare network operating
system from late 80s and then 90s. The NetWare itself switched to TCP/IP
as default transport in 1998. Later, in this century the Novell Open
Enterprise Server became successor of Novell NetWare. The last release
that claimed to still support IPX was OES 2 in 2007. Routing equipment
vendors (e.g. Cisco) discontinued support for IPX in 2011.

Thus, IPX won't be supported in FreeBSD 11.0-RELEASE.
2014-03-14 02:58:48 +00:00
glebius
9d7706f9f4 o Revamp API between flowtable and netinet, netinet6.
- ip_output() and ip_output6() simply call flowtable_lookup(),
    passing mbuf and address family. That's the only code under
    #ifdef FLOWTABLE in the protocols code now.
o Revamp statistics gathering and export.
  - Remove hand made pcpu stats, and utilize counter(9).
  - Snapshot of statistics is available via 'netstat -rs'.
  - All sysctls are moved into net.flowtable namespace, since
    spreading them over net.inet isn't correct.
o Properly separate at compile time INET and INET6 parts.
o General cleanup.
  - Remove chain of multiple flowtables. We simply have one for
    IPv4 and one for IPv6.
  - Flowtables are allocated in flowtable.c, symbols are static.
  - With proper argument to SYSINIT() we no longer need flowtable_ready.
  - Hash salt doesn't need to be per-VNET.
  - Removed rudimentary debugging, which use quite useless in dtrace era.

The runtime behavior of flowtable shouldn't be changed by this commit.

Sponsored by:	Netflix
Sponsored by:	Nginx, Inc.
2014-02-07 15:18:23 +00:00
glebius
b0bc7b1d54 Make userland tools honor WITHOUT_PF build option.
Tested by:	dt71@gmx.com
2013-10-29 17:38:13 +00:00
tuexen
8843fcf6f3 Allow netstat to be build if INET is not defined in the kernel.
Thanks to Garrett Cooper for reporting the issue.

MFC after: 3 days
X-MFC: 238501
2012-07-16 06:43:04 +00:00
dim
d692d62069 After r232745, which makes sure __bswap16(), ntohs() and htons() return
__uint16_t, we can partially undo r228668.

Note the remark "Work around a clang false positive with format string
warnings and ntohs macros (see LLVM PR 11313)" was actually incorrect.

Before r232745, on some arches, the ntohs() macros did in fact return
int, not uint16_t, so clang was right in warning about the %hu format
string.

MFC after:	2 weeks
2012-03-09 20:50:15 +00:00
dim
0d1f91e8e1 Define several extra macros in bsd.sys.mk and sys/conf/kern.pre.mk, to
get rid of testing explicitly for clang (using ${CC:T:Mclang}) in
individual Makefiles.

Instead, use the following extra macros, for use with clang:
- NO_WERROR.clang       (disables -Werror)
- NO_WCAST_ALIGN.clang  (disables -Wcast-align)
- NO_WFORMAT.clang	(disables -Wformat and friends)
- CLANG_NO_IAS		(disables integrated assembler)
- CLANG_OPT_SMALL	(adds flags for extra small size optimizations)

As a side effect, this enables setting CC/CXX/CPP in src.conf instead of
make.conf!  For clang, use the following:

CC=clang
CXX=clang++
CPP=clang-cpp

MFC after:	2 weeks
2012-02-28 18:30:18 +00:00
dim
5cdde91b9a Revert r228650, and work around the clang false positive with printf
formats in usr.bin/netstat/atalk.c by conditionally adding NO_WFORMAT to
the Makefile instead.

MFC after:	1 week
2011-12-17 22:32:00 +00:00
jeff
5115240a6c - Merge in OFED 1.5.3 from projects/ofed/head 2011-03-21 09:58:24 +00:00
rwatson
d193f17354 Teach netstat(1) to print out netisr statistics when given the -Q argument.
Currently supports only reporting on live systems via sysctl, kmem support
needs to be edded.

MFC after:	1 week
Sponsored by:	Juniper Networks
2010-02-22 15:57:36 +00:00
bms
2b08fb6e8b Now that ifmcstat(8) does not suck, retire host-mode netstat -g.
This change will not be back-ported.
2009-02-15 16:16:38 +00:00
sam
9c3d2ffcdf add new build knobs and jigger some existing controls to improve
control over the result of buildworld and installworld; this especially
helps packaging systems such as nanobsd

Reviewed by:	various (posted to arch)
MFC after:	1 month
2008-09-21 22:02:26 +00:00
jb
5582e69034 These are the things that the tinderbox has problems with because it
doesn't use the default CFLAGS which contain -fno-strict-aliasing.

Until the code is cleaned up, just add -fno-strict-aliasing to the
CFLAGS of these for the tinderboxes' sake, allowing the rest of the
tree to have -Werror enabled again.
2007-11-20 02:07:30 +00:00
gnn
f5875f045c Commit IPv6 support for FAST_IPSEC to the tree.
This commit includes all remaining changes for the time being including
user space updates.

Submitted by:    bz
Approved by:    re
2007-07-01 12:08:08 +00:00
ceri
72252c5d47 Backout mess mistakenly committed with manpage update. 2007-06-10 06:18:04 +00:00
ceri
03bd6740ae Document SCTP support. 2007-06-10 06:11:03 +00:00
rrs
af285a5d35 Adds support for SCTP. 2007-06-09 13:44:09 +00:00
yar
59fab84bab - Achieve WARNS=3 by using sparse initializers or avoiding initializers at all.
- Fix a nlist initialization: it should be terminated by a NULL entry.
- Constify.
- Catch an unused parameter.

Tested on:	i386 amd64 ia64
2006-07-28 16:16:40 +00:00
yar
e1db503689 Achieve WARNS=2 by using uintmax_t to pass around 64-bit quantities,
including to printf().  Using uintmax_t is also robust to further
extensions in both the C language and the bitwidth of kernel counters.

Tested on:	i386 amd64 ia64
2006-07-28 16:09:19 +00:00
yar
796fd4097a Avoid useless work: Do not build inet6.c if INET6 support is off.
This also avoids pretending that netstat includes inet6.c in the
output from ident(1).
2006-07-28 11:09:21 +00:00
ru
388e590f95 Reimplementation of world/kernel build options. For details, see:
http://lists.freebsd.org/pipermail/freebsd-current/2006-March/061725.html

The src.conf(5) manpage is to follow in a few days.

Brought to you by:	imp, jhb, kris, phk, ru (all bugs are mine)
2006-03-17 18:54:44 +00:00
kbyanc
980b33f224 Add support for printing IPSEC protocol stats if the kernel was compiled
with FAST_IPSEC rather than the KAME IPSEC stack.

Note that the output of "netstat -s -p ipsec" differs depending on which
stack is compiled into the kernel since they each keep different stats.
This delta also adds the "esp", "ah", and "ipcomp" protocol stats, which
are also available when the kernel is compiled with the FAST_IPSEC stack
(e.g. "netstat -s -p esp").

Submitted by:	Matt Titus <titus at nttmcl dot com>
MFC after:	3 days
2005-12-28 20:36:55 +00:00
csjp
ba6ab73cea Merge bpfstat's functionality into the netstat(1) utility. This adds
a -B option which causes bpf peers to be printed. This option can be
used in conjunction with -I if information about specific interfaces
is desired. This is similar to what NetBSD added to their version of
netstat.

$ netstat -B
  Pid  Netif  Flags      Recv      Drop     Match Sblen Hblen Command
 1137    lo0 p--s--         0         0         0     0     0 tcpdump
  205   sis0 -ifs-l     37331         0         1     0     0 dhclient
$

$ netstat -I lo0 -B
  Pid  Netif  Flags      Recv      Drop     Match Sblen Hblen Command
 1174    lo0 p--s--         0         0         0     0     0 tcpdump
$

-Add bpf.c which stores all the code for retrieving and parsing bpf
 related statistics.
-Modify main.c to add support for the -B option and hook it into the
 program logic.
-Add bpf.c to the build.
-Document this new functionality in the man page and bump the revision
 date.
-Add prototype for bpf_stats function.
2005-09-07 17:35:16 +00:00
phk
0d974f75a8 Don't include -lipx twice. 2005-08-05 20:13:09 +00:00
phk
267e10d4ec Make IPX support depend on NO_IPX 2005-08-05 18:45:49 +00:00
rwatson
20e476a513 Modify "netstat -mb" to use libmemstat(3) when acting on a live system,
with a number of positive benefits:

- Start using UMA(9) statistics for mbufs and clusters, which avoids
  using the mbuf allocator statistics which suffer from races under
  load on SMP.  This should eliminate "negative" mbuf counts in
  netstat -mb.

- We are now able to track cached (free) mbufs and clusters and count
  it towards memory allocated by the network stack.

- We are now also able to track memory allocated to mbuf tags since
  libmemstat(3) can also query malloc(9).  We don't print this except
  as part of the total (for now - #if 0).

- We are now able to track mbuf/cluster/packet allocation failures,
  although they are not currently printed (#if 0).

- Don't print out sfbuf statistics when running on a kernel core, as
  currently that code is able only to query sysctl for statistics.

MFC after:	1 week
2005-07-18 08:34:15 +00:00
delphij
89e5680369 According to style.Makefile(5):
WARNS?= should appear before CFLAGS

Reviewed by:	ru
2005-01-23 12:29:46 +00:00
delphij
99a68ccb1f Make sure that we don't define INET6 when NO_INET6 is defined.
Without this change, when running netstat with a kernel without
INET6 built in, you will get a complain at the end of "netstat -s"
output.

X-MFC:		NO_INET6 was called "NOINET6" on RELENG_5
2005-01-22 19:35:48 +00:00
ru
5db2b9d5b3 For variables that are only checked with defined(), don't provide
any fake value.
2004-10-24 15:33:08 +00:00
bms
7bb9972b9e Sort SRCS in Makefile and document -g option additions.
Nudged by:	ru
2004-03-25 09:07:26 +00:00
bms
af30ef4118 Teach netstat(1) how to print the multicast group memberships present
within the running system.

Sponsored by:	Ralf the Wonder Llama
2004-03-25 08:43:59 +00:00
bde
870b768c6b Fixed missing declaration of pluralies(). This showed up as strange
printf format warnings for inet6.c (pluralies() was implicit int, but
the context requires a "char *").

Added WARNS?=2 to the Makefile so that such errors don't come back.
Added NO_WERROR?= to the Makefile because I haven't checked that setting
WARNS doesn't uncover more bugs except on i386's.
2003-12-29 04:41:38 +00:00
peter
6467f119a1 Kill #ifdef NS and some leftover #ifdef ISO code. Re-pack the nlist[]
array, it isn't likely to find any ARPAnet IMP drivers in FreeBSD.
2003-03-05 19:20:29 +00:00
markm
9302f8d66b Remove GCC-specific flags and commented out cruft. 2002-04-28 12:14:10 +00:00
dwmalone
d9613ea383 Style improvements recommended by Bruce as a follow up to some
of the recent WARNS commits. The idea is:

1) FreeBSD id tags should follow vendor tags.
2) Vendor tags should not be compiled (though copyrights probably should).
3) There should be no blank line between including cdefs and __FBSDIF.
2001-12-10 21:13:08 +00:00
assar
ea6e16bc20 add the option -S for printing port numbers symbolically but addresses
numerically.  clean up the CFLAGS in Makefile.
2001-06-15 00:25:44 +00:00
brian
dd004da290 MAXHOSTNAMELEN includes space for a NUL.
Don't roll our own version of trimdomain(), use the one in libutil.

Not objected to by: freebsd-audit
2001-03-14 20:51:26 +00:00
itojun
77ac5d68c9 sync with latest kame netstat. basically, more statistics 2000-07-04 16:26:46 +00:00
shin
417b54f8df IPv6 multicast routing.
kernel IPv6 multicast routing support.
  pim6 dense mode daemon
  pim6 sparse mode daemon
  netstat support of IPv6 multicast routing statistics

  Merging to the current and testing with other existing multicast routers
  is done by Tatsuya Jinmei <jinmei@kame.net>, who writes and maintainances
  the base code in KAME distribution.

  Make world check and kernel build check was also successful.
2000-01-28 05:10:56 +00:00
shin
9b5932fc47 libipsec and IPsec related apps. (and some KAME related man pages)
Reviewed by: freebsd-arch, cvs-committers
Obtained from: KAME project
2000-01-06 12:40:54 +00:00
shin
8c2ccb59ca Getaddrinfo(), getnameinfo(), and etc support in libc/net.
Several udp and raw apps IPv6 support.

Reviewed by: freebsd-arch, cvs-committers
Obtained from: KAME project
1999-12-28 02:37:14 +00:00