Commit Graph

61 Commits

Author SHA1 Message Date
Alexander V. Chernikov
fedeb08b6a Introduce scalable route multipath.
This change is based on the nexthop objects landed in D24232.

The change introduces the concept of nexthop groups.
Each group contains the collection of nexthops with their
 relative weights and a dataplane-optimized structure to enable
 efficient nexthop selection.

Simular to the nexthops, nexthop groups are immutable. Dataplane part
 gets compiled during group creation and is basically an array of
 nexthop pointers, compiled w.r.t their weights.

With this change, `rt_nhop` field of `struct rtentry` contains either
 nexthop or nexthop group. They are distinguished by the presense of
 NHF_MULTIPATH flag.
All dataplane lookup functions returns pointer to the nexthop object,
leaving nexhop groups details inside routing subsystem.

User-visible changes:

The change is intended to be backward-compatible: all non-mpath operations
 should work as before with ROUTE_MPATH and net.route.multipath=1.

All routes now comes with weight, default weight is 1, maximum is 2^24-1.

Current maximum multipath group width is statically set to 64.
 This will become sysctl-tunable in the followup changes.

Using functionality:
* Recompile kernel with ROUTE_MPATH
* set net.route.multipath to 1

route add -6 2001:db8::/32 2001:db8::2 -weight 10
route add -6 2001:db8::/32 2001:db8::3 -weight 20

netstat -6On

Nexthop groups data

Internet6:
GrpIdx  NhIdx     Weight   Slots                                 Gateway     Netif  Refcnt
1         ------- ------- ------- --------------------------------------- ---------       1
              13      10       1                             2001:db8::2     vlan2
              14      20       2                             2001:db8::3     vlan2

Next steps:
* Land outbound hashing for locally-originated routes ( D26523 ).
* Fix net/bird multipath (net/frr seems to work fine)
* Add ROUTE_MPATH to GENERIC
* Set net.route.multipath=1 by default

Tested by:	olivier
Reviewed by:	glebius
Relnotes:	yes
Differential Revision:	https://reviews.freebsd.org/D26449
2020-10-03 10:47:17 +00:00
Alexander V. Chernikov
a666325282 Introduce nexthop objects and new routing KPI.
This is the foundational change for the routing subsytem rearchitecture.
 More details and goals are available in https://reviews.freebsd.org/D24141 .

This patch introduces concept of nexthop objects and new nexthop-based
 routing KPI.

Nexthops are objects, containing all necessary information for performing
 the packet output decision. Output interface, mtu, flags, gw address goes
 there. For most of the cases, these objects will serve the same role as
 the struct rtentry is currently serving.
Typically there will be low tens of such objects for the router even with
 multiple BGP full-views, as these objects will be shared between routing
 entries. This allows to store more information in the nexthop.

New KPI:

struct nhop_object *fib4_lookup(uint32_t fibnum, struct in_addr dst,
  uint32_t scopeid, uint32_t flags, uint32_t flowid);
struct nhop_object *fib6_lookup(uint32_t fibnum, const struct in6_addr *dst6,
  uint32_t scopeid, uint32_t flags, uint32_t flowid);

These 2 function are intended to replace all all flavours of
 <in_|in6_>rtalloc[1]<_ign><_fib>, mpath functions  and the previous
 fib[46]-generation functions.

Upon successful lookup, they return nexthop object which is guaranteed to
 exist within current NET_EPOCH. If longer lifetime is desired, one can
 specify NHR_REF as a flag and get a referenced version of the nexthop.
 Reference semantic closely resembles rtentry one, allowing sed-style conversion.

Additionally, another 2 functions are introduced to support uRPF functionality
 inside variety of our firewalls. Their primary goal is to hide the multipath
 implementation details inside the routing subsystem, greatly simplifying
 firewalls implementation:

int fib4_lookup_urpf(uint32_t fibnum, struct in_addr dst, uint32_t scopeid,
  uint32_t flags, const struct ifnet *src_if);
int fib6_lookup_urpf(uint32_t fibnum, const struct in6_addr *dst6, uint32_t scopeid,
  uint32_t flags, const struct ifnet *src_if);

All functions have a separate scopeid argument, paving way to eliminating IPv6 scope
 embedding and allowing to support IPv4 link-locals in the future.

Structure changes:
 * rtentry gets new 'rt_nhop' pointer, slightly growing the overall size.
 * rib_head gets new 'rnh_preadd' callback pointer, slightly growing overall sz.

Old KPI:
During the transition state old and new KPI will coexists. As there are another 4-5
 decent-sized conversion patches, it will probably take a couple of weeks.
To support both KPIs, fields not required by the new KPI (most of rtentry) has to be
 kept, resulting in the temporary size increase.
Once conversion is finished, rtentry will notably shrink.

More details:
* architectural overview: https://reviews.freebsd.org/D24141
* list of the next changes: https://reviews.freebsd.org/D24232

Reviewed by:	ae,glebius(initial version)
Differential Revision:	https://reviews.freebsd.org/D24232
2020-04-12 14:30:00 +00:00
Bjoern A. Zeeb
ae69ad884d After inpcb route caching was put back in place there is no need for
flowtable anymore (as flowtable was never considered to be useful in
the forwarding path).

Reviewed by:		np
Differential Revision:	https://reviews.freebsd.org/D11448
2017-07-27 13:03:36 +00:00
Bryan Drewery
114350b9de Replace DPSRCS that work fine in SRCS.
This is so that 'make depend' is not a required build step in these
files.

DPSRCS is overall unneeded.  DPSRCS already contains SRCS, so anything
which can safely be in SRCS should be.  DPSRCS is mostly just a way to
generate files that should not be linked into the final PROG/LIB.  For
headers and grammars it is safe for them to be in SRCS since they will
be excluded during linking and installation.

The only remaining uses of DPSRCS are for generating .c or .o files that
must be built before 'make depend' can run 'mkdep' on the SRCS c files
list.  A semi-proper example is in tests/sys/kern/acct/Makefile where a
checked-in .c file has an #include on a generated .c file.  The
generated .c file should not be linked into the final PROG though since
it is #include'd.  The more proper way here is just to build/link it in
though without DPSRCS.  Another example is in sys/modules/linux/Makefile
where a shell script runs to parse a DPSRCS .o file that should not be
linked into the module.  Beyond those, the need for DPSRCS is largely
unneeded, redundant, and forces 'make depend' to be ran.  Generally,
these Makefiles should avoid the need for DPSRCS and define proper
dependencies for their files as well.

An example of an improper usage and why this matters is in usr.bin/netstat.
nl_defs.h was only in DPSRCS and so was not generated during 'make all',
but only during 'make depend'.  The files including it lacked proper
depenencies on it, which forced running 'make depend' to workaround that
bug.  The 'make depend' target should mostly be used for incremental build
help, not to produce a working build.  This specific example was broken in
the meta build until r287905 since it does not run 'make depend'.

The gnu/lib/libreadline/readline case is fine since bsd.lib.mk has 'OBJS:
SRCS:M*.h' when there is no .depend file.

Sponsored by:	EMC / Isilon Storage Division
MFC after:	1 week
2015-11-25 20:38:17 +00:00
Hiroki Sato
81dacd8beb Simplify kvm symbol resolution and error handling. The symbol table
nl_symbols will eventually be organized into several modules depending
on MK_* variables.
2015-09-02 18:51:36 +00:00
Marcel Moolenaar
d1a0d267b7 Upgrade libxo to 0.4.5.
Local changes incorporated by 0.4.5: r284340
Local changes retained: r276260, r282117

Obtained from:	https://github.com/Juniper/libxo
2015-08-24 16:26:20 +00:00
Marcel Moolenaar
ade9ccfe21 Convert netstat to use libxo.
Obtained from:  Phil Shafer <phil@juniper.net>
Ported to -current by: alfred@ (mostly), Kim Shrier
Formatting: marcel@
Sponsored by:   Juniper Networks, Inc.
2015-02-21 23:47:20 +00:00
Baptiste Daroussin
3e11bd9e2a Convert to usr.bin/ to LIBADD
Reduce overlinking
2014-11-25 14:29:10 +00:00
Warner Losh
c6063d0da8 Use src.opts.mk in preference to bsd.own.mk except where we need stuff
from the latter.
2014-05-06 04:22:01 +00:00
Gleb Smirnoff
45c203fce2 Remove AppleTalk support.
AppleTalk was a network transport protocol for Apple Macintosh devices
in 80s and then 90s. Starting with Mac OS X in 2000 the AppleTalk was
a legacy protocol and primary networking protocol is TCP/IP. The last
Mac OS X release to support AppleTalk happened in 2009. The same year
routing equipment vendors (namely Cisco) end their support.

Thus, AppleTalk won't be supported in FreeBSD 11.0-RELEASE.
2014-03-14 06:29:43 +00:00
Gleb Smirnoff
2c284d9395 Remove IPX support.
IPX was a network transport protocol in Novell's NetWare network operating
system from late 80s and then 90s. The NetWare itself switched to TCP/IP
as default transport in 1998. Later, in this century the Novell Open
Enterprise Server became successor of Novell NetWare. The last release
that claimed to still support IPX was OES 2 in 2007. Routing equipment
vendors (e.g. Cisco) discontinued support for IPX in 2011.

Thus, IPX won't be supported in FreeBSD 11.0-RELEASE.
2014-03-14 02:58:48 +00:00
Gleb Smirnoff
5d6d7e756b o Revamp API between flowtable and netinet, netinet6.
- ip_output() and ip_output6() simply call flowtable_lookup(),
    passing mbuf and address family. That's the only code under
    #ifdef FLOWTABLE in the protocols code now.
o Revamp statistics gathering and export.
  - Remove hand made pcpu stats, and utilize counter(9).
  - Snapshot of statistics is available via 'netstat -rs'.
  - All sysctls are moved into net.flowtable namespace, since
    spreading them over net.inet isn't correct.
o Properly separate at compile time INET and INET6 parts.
o General cleanup.
  - Remove chain of multiple flowtables. We simply have one for
    IPv4 and one for IPv6.
  - Flowtables are allocated in flowtable.c, symbols are static.
  - With proper argument to SYSINIT() we no longer need flowtable_ready.
  - Hash salt doesn't need to be per-VNET.
  - Removed rudimentary debugging, which use quite useless in dtrace era.

The runtime behavior of flowtable shouldn't be changed by this commit.

Sponsored by:	Netflix
Sponsored by:	Nginx, Inc.
2014-02-07 15:18:23 +00:00
Gleb Smirnoff
3e4d5cd37b Make userland tools honor WITHOUT_PF build option.
Tested by:	dt71@gmx.com
2013-10-29 17:38:13 +00:00
Michael Tuexen
3dcc856b6c Allow netstat to be build if INET is not defined in the kernel.
Thanks to Garrett Cooper for reporting the issue.

MFC after: 3 days
X-MFC: 238501
2012-07-16 06:43:04 +00:00
Dimitry Andric
6aa83769fd After r232745, which makes sure __bswap16(), ntohs() and htons() return
__uint16_t, we can partially undo r228668.

Note the remark "Work around a clang false positive with format string
warnings and ntohs macros (see LLVM PR 11313)" was actually incorrect.

Before r232745, on some arches, the ntohs() macros did in fact return
int, not uint16_t, so clang was right in warning about the %hu format
string.

MFC after:	2 weeks
2012-03-09 20:50:15 +00:00
Dimitry Andric
07b202a847 Define several extra macros in bsd.sys.mk and sys/conf/kern.pre.mk, to
get rid of testing explicitly for clang (using ${CC:T:Mclang}) in
individual Makefiles.

Instead, use the following extra macros, for use with clang:
- NO_WERROR.clang       (disables -Werror)
- NO_WCAST_ALIGN.clang  (disables -Wcast-align)
- NO_WFORMAT.clang	(disables -Wformat and friends)
- CLANG_NO_IAS		(disables integrated assembler)
- CLANG_OPT_SMALL	(adds flags for extra small size optimizations)

As a side effect, this enables setting CC/CXX/CPP in src.conf instead of
make.conf!  For clang, use the following:

CC=clang
CXX=clang++
CPP=clang-cpp

MFC after:	2 weeks
2012-02-28 18:30:18 +00:00
Dimitry Andric
d88ccef562 Revert r228650, and work around the clang false positive with printf
formats in usr.bin/netstat/atalk.c by conditionally adding NO_WFORMAT to
the Makefile instead.

MFC after:	1 week
2011-12-17 22:32:00 +00:00
Jeff Roberson
aa0a1e58f0 - Merge in OFED 1.5.3 from projects/ofed/head 2011-03-21 09:58:24 +00:00
Robert Watson
0153eb6688 Teach netstat(1) to print out netisr statistics when given the -Q argument.
Currently supports only reporting on live systems via sysctl, kmem support
needs to be edded.

MFC after:	1 week
Sponsored by:	Juniper Networks
2010-02-22 15:57:36 +00:00
Bruce M Simpson
57e9cb8cda Now that ifmcstat(8) does not suck, retire host-mode netstat -g.
This change will not be back-ported.
2009-02-15 16:16:38 +00:00
Sam Leffler
690f477d75 add new build knobs and jigger some existing controls to improve
control over the result of buildworld and installworld; this especially
helps packaging systems such as nanobsd

Reviewed by:	various (posted to arch)
MFC after:	1 month
2008-09-21 22:02:26 +00:00
John Birrell
0aad0f2282 These are the things that the tinderbox has problems with because it
doesn't use the default CFLAGS which contain -fno-strict-aliasing.

Until the code is cleaned up, just add -fno-strict-aliasing to the
CFLAGS of these for the tinderboxes' sake, allowing the rest of the
tree to have -Werror enabled again.
2007-11-20 02:07:30 +00:00
George V. Neville-Neil
8409aedfa6 Commit IPv6 support for FAST_IPSEC to the tree.
This commit includes all remaining changes for the time being including
user space updates.

Submitted by:    bz
Approved by:    re
2007-07-01 12:08:08 +00:00
Ceri Davies
f18f2fc7fd Backout mess mistakenly committed with manpage update. 2007-06-10 06:18:04 +00:00
Ceri Davies
664fd46b84 Document SCTP support. 2007-06-10 06:11:03 +00:00
Randall Stewart
74fd40c90c Adds support for SCTP. 2007-06-09 13:44:09 +00:00
Yaroslav Tykhiy
096146f88b - Achieve WARNS=3 by using sparse initializers or avoiding initializers at all.
- Fix a nlist initialization: it should be terminated by a NULL entry.
- Constify.
- Catch an unused parameter.

Tested on:	i386 amd64 ia64
2006-07-28 16:16:40 +00:00
Yaroslav Tykhiy
7b95a1ebbd Achieve WARNS=2 by using uintmax_t to pass around 64-bit quantities,
including to printf().  Using uintmax_t is also robust to further
extensions in both the C language and the bitwidth of kernel counters.

Tested on:	i386 amd64 ia64
2006-07-28 16:09:19 +00:00
Yaroslav Tykhiy
b7dd94d5e6 Avoid useless work: Do not build inet6.c if INET6 support is off.
This also avoids pretending that netstat includes inet6.c in the
output from ident(1).
2006-07-28 11:09:21 +00:00
Ruslan Ermilov
e1fe3dba5c Reimplementation of world/kernel build options. For details, see:
http://lists.freebsd.org/pipermail/freebsd-current/2006-March/061725.html

The src.conf(5) manpage is to follow in a few days.

Brought to you by:	imp, jhb, kris, phk, ru (all bugs are mine)
2006-03-17 18:54:44 +00:00
Kelly Yancey
100b98db75 Add support for printing IPSEC protocol stats if the kernel was compiled
with FAST_IPSEC rather than the KAME IPSEC stack.

Note that the output of "netstat -s -p ipsec" differs depending on which
stack is compiled into the kernel since they each keep different stats.
This delta also adds the "esp", "ah", and "ipcomp" protocol stats, which
are also available when the kernel is compiled with the FAST_IPSEC stack
(e.g. "netstat -s -p esp").

Submitted by:	Matt Titus <titus at nttmcl dot com>
MFC after:	3 days
2005-12-28 20:36:55 +00:00
Christian S.J. Peron
6b463eed3a Merge bpfstat's functionality into the netstat(1) utility. This adds
a -B option which causes bpf peers to be printed. This option can be
used in conjunction with -I if information about specific interfaces
is desired. This is similar to what NetBSD added to their version of
netstat.

$ netstat -B
  Pid  Netif  Flags      Recv      Drop     Match Sblen Hblen Command
 1137    lo0 p--s--         0         0         0     0     0 tcpdump
  205   sis0 -ifs-l     37331         0         1     0     0 dhclient
$

$ netstat -I lo0 -B
  Pid  Netif  Flags      Recv      Drop     Match Sblen Hblen Command
 1174    lo0 p--s--         0         0         0     0     0 tcpdump
$

-Add bpf.c which stores all the code for retrieving and parsing bpf
 related statistics.
-Modify main.c to add support for the -B option and hook it into the
 program logic.
-Add bpf.c to the build.
-Document this new functionality in the man page and bump the revision
 date.
-Add prototype for bpf_stats function.
2005-09-07 17:35:16 +00:00
Poul-Henning Kamp
a00553b3d3 Don't include -lipx twice. 2005-08-05 20:13:09 +00:00
Poul-Henning Kamp
9cc22e5c89 Make IPX support depend on NO_IPX 2005-08-05 18:45:49 +00:00
Robert Watson
c8e6b6899a Modify "netstat -mb" to use libmemstat(3) when acting on a live system,
with a number of positive benefits:

- Start using UMA(9) statistics for mbufs and clusters, which avoids
  using the mbuf allocator statistics which suffer from races under
  load on SMP.  This should eliminate "negative" mbuf counts in
  netstat -mb.

- We are now able to track cached (free) mbufs and clusters and count
  it towards memory allocated by the network stack.

- We are now also able to track memory allocated to mbuf tags since
  libmemstat(3) can also query malloc(9).  We don't print this except
  as part of the total (for now - #if 0).

- We are now able to track mbuf/cluster/packet allocation failures,
  although they are not currently printed (#if 0).

- Don't print out sfbuf statistics when running on a kernel core, as
  currently that code is able only to query sysctl for statistics.

MFC after:	1 week
2005-07-18 08:34:15 +00:00
Xin LI
7f1a765333 According to style.Makefile(5):
WARNS?= should appear before CFLAGS

Reviewed by:	ru
2005-01-23 12:29:46 +00:00
Xin LI
980b4f7474 Make sure that we don't define INET6 when NO_INET6 is defined.
Without this change, when running netstat with a kernel without
INET6 built in, you will get a complain at the end of "netstat -s"
output.

X-MFC:		NO_INET6 was called "NOINET6" on RELENG_5
2005-01-22 19:35:48 +00:00
Ruslan Ermilov
a35d88931c For variables that are only checked with defined(), don't provide
any fake value.
2004-10-24 15:33:08 +00:00
Bruce M Simpson
1d2a7e07d7 Sort SRCS in Makefile and document -g option additions.
Nudged by:	ru
2004-03-25 09:07:26 +00:00
Bruce M Simpson
9fcc066d3e Teach netstat(1) how to print the multicast group memberships present
within the running system.

Sponsored by:	Ralf the Wonder Llama
2004-03-25 08:43:59 +00:00
Bruce Evans
aa54e1ecc5 Fixed missing declaration of pluralies(). This showed up as strange
printf format warnings for inet6.c (pluralies() was implicit int, but
the context requires a "char *").

Added WARNS?=2 to the Makefile so that such errors don't come back.
Added NO_WERROR?= to the Makefile because I haven't checked that setting
WARNS doesn't uncover more bugs except on i386's.
2003-12-29 04:41:38 +00:00
Peter Wemm
ab54ea99de Kill #ifdef NS and some leftover #ifdef ISO code. Re-pack the nlist[]
array, it isn't likely to find any ARPAnet IMP drivers in FreeBSD.
2003-03-05 19:20:29 +00:00
Mark Murray
2d3f94bf1b Remove GCC-specific flags and commented out cruft. 2002-04-28 12:14:10 +00:00
David Malone
9f5b04e925 Style improvements recommended by Bruce as a follow up to some
of the recent WARNS commits. The idea is:

1) FreeBSD id tags should follow vendor tags.
2) Vendor tags should not be compiled (though copyrights probably should).
3) There should be no blank line between including cdefs and __FBSDIF.
2001-12-10 21:13:08 +00:00
Assar Westerlund
65ea0024ba add the option -S for printing port numbers symbolically but addresses
numerically.  clean up the CFLAGS in Makefile.
2001-06-15 00:25:44 +00:00
Brian Somers
d121b55666 MAXHOSTNAMELEN includes space for a NUL.
Don't roll our own version of trimdomain(), use the one in libutil.

Not objected to by: freebsd-audit
2001-03-14 20:51:26 +00:00
Jun-ichiro itojun Hagino
32cd1d9601 sync with latest kame netstat. basically, more statistics 2000-07-04 16:26:46 +00:00
Yoshinobu Inoue
0fea3d5165 IPv6 multicast routing.
kernel IPv6 multicast routing support.
  pim6 dense mode daemon
  pim6 sparse mode daemon
  netstat support of IPv6 multicast routing statistics

  Merging to the current and testing with other existing multicast routers
  is done by Tatsuya Jinmei <jinmei@kame.net>, who writes and maintainances
  the base code in KAME distribution.

  Make world check and kernel build check was also successful.
2000-01-28 05:10:56 +00:00
Yoshinobu Inoue
9a4365d0e0 libipsec and IPsec related apps. (and some KAME related man pages)
Reviewed by: freebsd-arch, cvs-committers
Obtained from: KAME project
2000-01-06 12:40:54 +00:00
Yoshinobu Inoue
7d56d3747c Getaddrinfo(), getnameinfo(), and etc support in libc/net.
Several udp and raw apps IPv6 support.

Reviewed by: freebsd-arch, cvs-committers
Obtained from: KAME project
1999-12-28 02:37:14 +00:00