Commit Graph

6 Commits

Author SHA1 Message Date
Marko Zec
43880c511c [fib_algo][dxr] Split unused range chunk list in multiple buckets
Traversing a single list of unused range chunks in search for a block
of optimal size was suboptimal.

The experience with real-world BGP workloads has shown that on average
unused range chunks are tiny, mostly in length from 1 to 4 or 5, when
DXR is configured with K = 20 which is the current default (D16X4R).

Therefore, introduce a limited amount of buckets to accomodate descriptors
of empty blocks of fixed (small) size, so that those can be found in O(1)
time.  If no empty chunks of the requested size can be found in fixed-size
buckets, the search continues in an unsorted list of empty chunks of
variable lengths, which should only happen infrequently.

This change should permit us to manage significantly more empty range
chunks without sacrifying the speed of incremental range table updating.

MFC after:	3 days
2021-09-25 06:29:48 +02:00
Marko Zec
2ac039f7be [fib_algo][dxr] Merge adjacent empty range table chunks.
MFC after:	3 days
2021-09-20 06:30:45 +02:00
Marko Zec
eb3148cc4d [fib algo][dxr] Fix division by zero.
A division by zero would occur if DXR would be activated on a vnet
with no IP addresses configured on any interfaces.

PR:		257965
MFC after:	3 days
Reported by:	Raul Munoz
2021-09-16 16:34:05 +02:00
Marko Zec
b51f8bae57 [fib algo][dxr] Optimize trie updating.
Don't rebuild in vain trie parts unaffected by accumulated incremental
RIB updates.

PR:		257965
Tested by:	Konrad Kreciwilk
MFC after:	3 days
2021-09-15 22:42:49 +02:00
Marko Zec
442c8a245e [fib algo][dxr] Fix undefined behavior.
The result of shifting uint32_t by 32 (or more) is undefined: fix it.
2021-09-15 22:42:48 +02:00
Marko Zec
2aca58e16f Introduce DXR as an IPv4 longest prefix matching / FIB module
DXR maintains compressed lookup structures with a trivial search
procedure.  A two-stage trie is indexed by the more significant bits of
the search key (IPv4 address), while the remaining bits are used for
finding the next hop in a sorted array.  The tradeoff between memory
footprint and search speed depends on the split between the trie and
the remaining binary search.  The default of 20 bits of the key being
used for trie indexing yields good performance (see below) with
footprints of around 2.5 Bytes per prefix with current BGP snapshots.

Rebuilding lookup structures takes some time, which is compensated for by
batching several RIB change requests into a single FIB update, i.e. FIB
synchronization with the RIB may be delayed for a fraction of a second.
RIB to FIB synchronization, next-hop table housekeeping, and lockless
lookup capability is provided by the FIB_ALGO infrastructure.

DXR works well on modern CPUs with several MBytes of caches, especially
in VMs, where is outperforms other currently available IPv4 FIB
algorithms by a large margin.

Synthetic single-thread LPM throughput test method:

kldload test_lookup; kldload dpdk_lpm4; kldload fib_dxr
sysctl net.route.test.run_lps_rnd=N
sysctl net.route.test.run_lps_seq=N

where N is the number of randomly generated keys (IPv4 addresses) which
should be chosen so that each test iteration runs for several seconds.

Each reported score represents the best of three runs, in million
lookups per second (MLPS), for two bechmarks (RND & SEQ) with two FIBs:

host: single interface address, local subnet route + default route
BGP: snapshot from linx.routeviews.org, 887957 prefixes, 496 next hops

Bhyve VM on an Intel(R) Xeon(R) CPU E5-2670 0 @ 2.60 GHz:
inet.algo         host, RND    host, SEQ    BGP, RND    BGP, SEQ
bsearch4             40.6         20.2         N/A         N/A
radix4                7.8          3.8         1.2         0.6
radix4_lockless      18.0          9.0         1.6         0.8
dpdk_lpm4            14.4          5.0        14.6         5.0
dxr                  70.3         34.7        43.0        19.5

Intel(R) Core(TM) i5-5300U CPU @ 2.30 GHz:
inet.algo         host, RND    host, SEQ    BGP, RND    BGP, SEQ
bsearch4             47.0         23.1         N/A         N/A
radix4                8.5          4.2         1.9         1.0
radix4_lockless      19.2          9.5         2.5         1.2
dpdk_lpm4            31.2          9.4        31.6         9.3
dxr                  84.9         41.4        51.7        23.6

Intel(R) Core(TM) i7-4771 CPU @ 3.50 GHz:
inet.algo         host, RND    host, SEQ    BGP, RND    BGP, SEQ
bsearch4             59.5         29.4         N/A         N/A
radix4               10.8          5.5         2.5         1.3
radix4_lockless      24.7         12.0         3.1         1.6
dpdk_lpm4            29.1          9.0        30.2         9.1
dxr                 101.3         49.9        69.8        32.5

AMD Ryzen 7 3700X 8-Core Processor @ 3.60 GHz:
inet.algo         host, RND    host, SEQ    BGP, RND    BGP, SEQ
bsearch4             70.8         35.4         N/A         N/A
radix4               14.4          7.2         2.8         1.4
radix4_lockless      30.2         15.1         3.7         1.8
dpdk_lpm4            29.9          9.0        30.0         8.9
dxr                 163.3         81.5        99.5        44.4

AMD Ryzen 5 5600X 6-Core Processor @ 3.70 GHz:
inet.algo         host, RND    host, SEQ    BGP, RND    BGP, SEQ
bsearch4             93.6         46.7         N/A         N/A
radix4               18.9          9.3         4.3         2.1
radix4_lockless      37.2         18.6         5.3         2.7
dpdk_lpm4            51.8         15.1        51.6        14.9
dxr                 218.2        103.3       114.0        49.0

Reviewed by:	melifaro
MFC after:	1 week
Differential Revision: https://reviews.freebsd.org/D29821
2021-05-05 13:45:52 +02:00