Commit Graph

2291 Commits

Author SHA1 Message Date
Andrey V. Elsukov
ccd69bd573 Ignore IPv6 NA and drop IPv6 NS when BACKUP CARP address is used
When the system acts as CARP BACKUP, ignore received IPv6 Neighbor Advertisements
to ensure that the neighbor cache is not changed.
Also do not send IPv6 Neighbor Solicitations from the CARP BACKUP source address.
Such packets can confuse a network switch, causing it to detect MAC address
flapping.

Obtained from:	Yandex LLC
MFC after:	2 weeks
Sponsored by:	Yandex LLC
Differential Revision:	    https://reviews.freebsd.org/D36649
2022-10-06 20:01:16 +03:00
Gleb Smirnoff
fcb3f813f3 netinet*: remove PRC_ constants and streamline ICMP processing
In the original design of the network stack the protocol control
input method pr_ctlinput was used to notify the protocols about two very
different kinds of events: internal system events and receipt of
ICMP messages from outside.  These events were coded with PRC_ codes.
Today these methods are removed from the protosw(9) and are isolated
to IPv4 and IPv6 stacks and are called only from icmp*_input().  The
PRC_ codes now just create a shim layer between ICMP codes and errors
or actions taken by protocols.

- Change ipproto_ctlinput_t to pass just pointer to ICMP header.  This
  allows protocols to not deduce it from the internal IP header.
- Change ip6proto_ctlinput_t to pass just struct ip6ctlparam pointer.
  It has all the information needed by the protocols.  In the structure,
  change ip6c_finaldst fields to sockaddr_in6.  The reason is that
  icmp6_input() already has this address wrapped in sockaddr, and the
  protocols want this address as sockaddr.
- For UDP tunneling control input, as well as for IPSEC control input,
  change the prototypes to accept a transparent union of either ICMP
  header pointer or struct ip6ctlparam pointer.
- In icmp_input() and icmp6_input() do only validation of ICMP header and
  count bad packets.  The translation of ICMP codes to errors/actions is
  done by protocols.
- Provide icmp_errmap() and icmp6_errmap() as substitutes for inetctlerrmap,
  inet6ctlerrmap arrays.
- In protocol ctlinput methods either trust what icmp_errmap() recommends,
  or do our own logic based on the ICMP header.

Differential revision:	https://reviews.freebsd.org/D36731
2022-10-03 20:53:04 -07:00
Gleb Smirnoff
c0fc81e913 netinet*: remove dead code from TCP, UDP, SCTP control input
Now these functions are called only from icmp*_input().  The pointer
to the ICMP data is never NULL and cmd has a limited set of values.

In the past the functions were demultiplexing control messages from
ICMP layer, as well as internally generated events.  In the latter
case the pointer to IP would be NULL.

Differential revision:	https://reviews.freebsd.org/D36729
2022-10-03 20:53:04 -07:00
Gleb Smirnoff
53807a8a27 netinet*: use sparse C99 initializer for inetctlerrmap
and mark those PRC_* codes that are used.  The rest are dead code.
This is not a functional change, but an illustrative one that makes review
of the following changes easier.
2022-10-03 20:53:04 -07:00
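
As an illustration of the sparse C99 initializer mentioned in 53807a8a27, here is a minimal sketch. The `EV_*` codes and `example_errmap` are hypothetical stand-ins and not the kernel's PRC_* constants or its inetctlerrmap; the point is only that designated initializers name the slots in use and leave every other slot zero.

```c
#include <errno.h>

/* Hypothetical event codes, standing in for the PRC_* constants. */
enum {
	EV_MSGSIZE = 5,
	EV_UNREACH_NET = 8,
	EV_UNREACH_HOST = 9,
	EV_UNREACH_PORT = 11,
	EV_NCMDS = 22		/* table size */
};

/*
 * Sparse table: only the codes that are actually used get an entry.
 * Every slot not named here is implicitly initialized to 0.
 */
const int example_errmap[EV_NCMDS] = {
	[EV_MSGSIZE] = EMSGSIZE,
	[EV_UNREACH_NET] = EHOSTUNREACH,
	[EV_UNREACH_HOST] = EHOSTUNREACH,
	[EV_UNREACH_PORT] = ECONNREFUSED,
};
```

Compared with a positional initializer, this form makes the unused codes visible at a glance, because they simply have no line in the table.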
Gleb Smirnoff
43d39ca7e5 netinet*: de-void control input IP protocol methods
After decoupling of protosw(9) and IP wire protocols in 78b1fc05b2 for
IPv4 we got vector ip_ctlprotox[] that is executed only and only from
icmp_input() and respectively for IPv6 we got ip6_ctlprotox[] executed
only and only from icmp6_input().  This allows to use protocol specific
argument types in these methods instead of struct sockaddr and void.

Reviewed by:		melifaro
Differential revision:	https://reviews.freebsd.org/D36727
2022-10-03 20:53:04 -07:00
Gleb Smirnoff
46ddeb6be8 netinet6: retire ip6protosw.h
The netinet/ipprotosw.h and netinet6/ip6protosw.h were KAME relics, with
the former removed in f0ffb944d2 in 2001 and the latter survived until
today.  It has been reduced down to only one useful declaration that
moves to ip6_var.h

Reviewed by:		melifaro
Differential revision:	https://reviews.freebsd.org/D36726
2022-10-03 20:53:04 -07:00
Gleb Smirnoff
24b96f35b9 netinet*: move ipproto_register() and co to ip_var.h and ip6_var.h
This is a FreeBSD KPI and belongs in a private header, not netinet/in.h.

Reviewed by:		melifaro
Differential revision:	https://reviews.freebsd.org/D36723
2022-10-03 20:53:04 -07:00
Alexander V. Chernikov
e437991fc9 netinet6: factor interface addition code to the dedicated function
Summary:
Move SIOCAIFADDR_IN6 (current "primary" ioctl to add an IPv6
 interface address) handling code to the dedicated in6_addifaddr()
 function and make it a part of KPI. This allows in-kernel users to
 add/delete interface addresses without relying on the ioctl interface.

Subscribers: imp, ae, glebius

Differential Revision: https://reviews.freebsd.org/D36713
2022-09-27 13:23:34 +00:00
Sébastien BINI
64cce803c4 Correct IPv6 MLD group state string table
MLD_REPORTING_MEMBER was missing

MFC after:	1 week
Differential Revision:	https://reviews.freebsd.org/D36311
2022-09-19 09:01:36 -04:00
Gordon Bergling
bcb2341c7d netinet6: Remove a double word in a source code comment
- s/to to/to/

MFC after:	3 days
2022-09-10 13:01:44 +02:00
Mateusz Guzik
dda6376b04 net: employ newly added pfil_mbuf_{in,out} where appropriate
Reviewed by:	glebius
Sponsored by:	Rubicon Communications, LLC ("Netgate")
Differential Revision:	https://reviews.freebsd.org/D36454
2022-09-08 16:21:08 +00:00
Mateusz Guzik
14c9a2dbfb net: retire PFIL_FWD
It is now unused and not having it allows further clean ups.

Reviewed by:	cy, glebius, kp
Sponsored by:	Rubicon Communications, LLC ("Netgate")
Differential Revision:	https://reviews.freebsd.org/D36452
2022-09-07 10:04:31 +00:00
Mateusz Guzik
223a73a1c4 net: remove stale altq_input reference
Code setting it was removed in:
commit 325fab802e
Author: Eric van Gyzen <vangyzen@FreeBSD.org>
Date:   Tue Dec 4 23:46:43 2018 +0000

    altq: remove ALTQ3_COMPAT code

Reviewed by:	glebius, kp
Sponsored by:	Rubicon Communications, LLC ("Netgate")
Differential Revision:	https://reviews.freebsd.org/D36471
2022-09-07 10:03:12 +00:00
Alexander V. Chernikov
db98b42050 netinet6: call lle_event eventhandler after updating state
Fix nd6_na_input() eventhandler call: run eventhandler after lle
 state transition.

Old behaviour (as seen by event handler):
 * fe80::5054:ff:fe8c:63e9 dev vtnet0 lladdr 52:54:00:8c:63:e9 INCOMPLETE
New behaviour:
* fe80::5054:ff:fe8c:63e9 dev vtnet0 lladdr 52:54:00:8c:63:e9 REACHABLE

MFC after: 2 weeks
2022-09-05 13:01:27 +00:00
Gleb Smirnoff
74ed2e8ab2 raw ip: fix regression with multicast and RSVP
With 61f7427f02 raw sockets protosw has wildcard pr_protocol.  Protocol
of a specific pcb is stored in inp_ip_p.

Reviewed by:		karels
Reported by:		karels
Differential revision:	https://reviews.freebsd.org/D36429
Fixes:			61f7427f02
2022-09-02 12:17:09 -07:00
Gleb Smirnoff
61f7427f02 protosw: cleanup protocols that existed merely to provide pr_input
Since 4.4BSD the protosw was used to implement socket types created
by socket(2) syscall and at the same time to demultiplex incoming IPv4
datagrams (later copied to IPv6).  This story ended with 78b1fc05b2.

The entries (e.g. IPPROTO_ICMP) in inetsw that were added to catch
packets in ip_input() would also be returned by pffindproto()
if the user calls socket(AF_INET, SOCK_RAW, IPPROTO_ICMP).  Thus, for raw
sockets to work correctly, all the entries were pointing at raw_usrreq,
differing only in the value of pr_protocol.

With 78b1fc05b2 all these entries are no longer needed, as ip_protox
is independent of protosw.  Any socket syscall requesting SOCK_RAW type
would end up with rip_protosw.  And this protosw has its pr_protocol
set to 0, allowing to mark socket with any protocol.

For IPv6 raw socket the change required two small fixes:
o Validate user provided protocol value
o Always use protocol number stored in inp in rip6_attach, instead
  of protosw value, which is now always 0.

Differential revision:	https://reviews.freebsd.org/D36380
2022-08-30 15:09:21 -07:00
Gleb Smirnoff
8624f4347e divert: declare PF_DIVERT domain and stop abusing PF_INET
The divert(4) is not a protocol of IPv4.  It is a socket to
intercept packets from ipfw(4) to userland and re-inject them
back.  It can divert and re-inject IPv4 and IPv6 packets today,
but potentially it is not limited to these two protocols.  The
IPPROTO_DIVERT does not belong to known IP protocols, it
doesn't even fit into u_char.  I guess, the implementation of
divert(4) was done the way it is done basically because it was
easier to do it this way, back when protocols for sockets were
intertwined with IP protocols and domains were statically
compiled in.

Moving divert(4) out of inetsw accomplished two important things:

1) IPDIVERT is getting much closer to being independent of INET.
   This will be finalized in following changes.
2) Now divert socket no longer aliases with raw IPv4 socket.
   Domain/proto selection code won't need a hack for SOCK_RAW and
   multiple entries in inetsw implementing different flavors of
   raw socket can merge into one without requirement of raw IPv4
   being the last member of dom_protosw.

Differential revision:	https://reviews.freebsd.org/D36379
2022-08-30 15:09:21 -07:00
Alexander V. Chernikov
177f04d57f routing: constantify @rc in rib_decompose_notification().
Clarify the @rc immutability by explicitly marking @rc const.

MFC after:	2 weeks
2022-08-29 18:12:24 +00:00
Alexander V. Chernikov
7b3440fc30 Revert "routing: install prefix and loopback routes using new nhop-based KPI."
Temporarily revert the commit to unblock testing.

This reverts commit a1b59379db.
2022-08-29 16:20:42 +00:00
Alexander V. Chernikov
6d4f6e4c70 routing: make rib_add_redirect() use new nhop-based KPI
MFC after:		1 month
Differential Revision:	https://reviews.freebsd.org/D36169
2022-08-29 10:23:26 +00:00
Alexander V. Chernikov
835a611e68 routing: make IPv6 defrouter code use new nhop-based KPI.
MFC after:		1 month
Differential Revision:	https://reviews.freebsd.org/D36168
2022-08-29 10:08:47 +00:00
Alexander V. Chernikov
a1b59379db routing: install prefix and loopback routes using new nhop-based KPI.
Construct the desired nexthops directly instead of using the
 "translation" layer in form of filling rt_addrinfo data.
Simplify V_rt_add_addr_allfibs handling by using recently-added
 rib_copy_route() to propagate the routes to the non-primary address
 fibs.

MFC after:	1 month
Differential Revision:	https://reviews.freebsd.org/D36166
2022-08-29 10:07:58 +00:00
Alexander V. Chernikov
8036234c72 netinet6: fix SIOCSPFXFLUSH_IN6 by skipping manually-configured prefixes
Summary:
Currently netinet6/ code allocates IPv6 prefixes (nd_prefix) for
 both manually-assigned addresses and advertised prefixes. As a result,
 prefixes for manually-assigned addresses can be seen in `ndp -p` list
 and be cleared via `ndp -P`. The latter relies on the SIOCSPFXFLUSH_IN6
 ioctl to clear the prefix list.
The original intent of the SIOCSPFXFLUSH_IN6 was to clear prefixes
 originated from the advertising routers:

```
1998-09-02  JINMEI, Tatuya  <jinmei@isl.rdc.toshiba.co.jp>
	* nd6.c (nd6_ioctl): added 2 new ioctls; SIOCSRTRFLUSH_IN6 and
	SIOCSPFXFLUSH_IN6. The former is to flush all default routers
	in the default router list, and the latter is to flush all the
	prefixes and the addresses derived from them in the prefix list.
```

Restore the intent by marking prefixes derived from the RA messages
with newly-added ndpr_flags.ra_derived flag and skip prefixes not marked
 with such flag during deletion and listing.

Differential Revision: https://reviews.freebsd.org/D36312
MFC after:	2 weeks
2022-08-24 13:59:13 +00:00
Gleb Smirnoff
6080e073dc ip6_input: explicitly include <sys/eventhandler.h>
On most architectures/kernels it was included implicitly, but powerpc
MPC85XX got broken.

Fixes:	81a34d374e
2022-08-17 14:54:46 -07:00
Gleb Smirnoff
e7d02be19d protosw: refactor protosw and domain static declaration and load
o Assert that every protosw has pr_attach.  Now this structure is
  only for socket protocols declarations and nothing else.
o Merge struct pr_usrreqs into struct protosw.  This was suggested
  in 1996 by wollman@ (see 7b187005d1), and later reiterated
  in 2006 by rwatson@ (see 6fbb9cf860).
o Make struct domain hold a variable sized array of protosw pointers.
  For most protocols these pointers are initialized statically.
  Those domains that may have loadable protocols have spacers. IPv4
  and IPv6 have 8 spacers each (andre@ dff3237ee5).
o For inetsw and inet6sw leave a comment noting that many protosw
  entries very likely are dead code.
o Refactor pf_proto_[un]register() into protosw_[un]register().
o Isolate pr_*_notsupp() methods into uipc_domain.c

Reviewed by:		melifaro
Differential revision:	https://reviews.freebsd.org/D36232
2022-08-17 11:50:32 -07:00
Gleb Smirnoff
81a34d374e protosw: retire pr_drain and use EVENTHANDLER(9) directly
The method was called for two different conditions: 1) the VM layer is
low on pages or 2) one of UMA zones of mbuf allocator exhausted.
This changes 2) into a new event handler, but all affected network
subsystems are modified to subscribe to both, so this change shall not
bring functional changes under different low memory situations.

There were three subsystems still using pr_drain: TCP, SCTP and frag6.
The latter had its protosw entry for the only reason to register its
pr_drain method.

Reviewed by:		tuexen, melifaro
Differential revision:	https://reviews.freebsd.org/D36164
2022-08-17 11:50:31 -07:00
Gleb Smirnoff
a0d7d2476f frag6: use callout(9) directly instead of pr_slowtimo
Reviewed by:		melifaro
Differential revision:	https://reviews.freebsd.org/D36162
2022-08-17 11:50:31 -07:00
Gleb Smirnoff
b730de8bad mld6: use callout(9) directly instead of pr_slowtimo, pr_fasttimo
While here remove recursive network epoch entry in mld_fasttimo_vnet(),
as this function is already in epoch.

Reviewed by:		melifaro
Differential revision:	https://reviews.freebsd.org/D36161
2022-08-17 11:50:31 -07:00
Gleb Smirnoff
6c452841ef tcp: use callout(9) directly instead of pr_slowtimo
Modern TCP stacks use multiple callouts per tcpcb, and a global
callout is an ancient artifact.  However it is still used to garbage
collect compressed timewait entries.

Reviewed by:		melifaro, tuexen
Differential revision:	https://reviews.freebsd.org/D36159
2022-08-17 11:50:31 -07:00
Gleb Smirnoff
78b1fc05b2 protosw: separate pr_input and pr_ctlinput out of protosw
The protosw KPI historically has implemented two quite orthogonal
things: protocols that implement a certain kind of socket, and
protocols that are IPv4/IPv6 protocol.  These two things do not
make one-to-one correspondence. The pr_input and pr_ctlinput methods
were utilized only in IP protocols.  This strange duality required
IP protocols that don't have a socket to declare protosw, e.g.
carp(4).  On the other hand developers of socket protocols thought
that they need to define pr_input/pr_ctlinput always, which led to
strange dead code, e.g. div_input() or sdp_ctlinput().

With this change pr_input and pr_ctlinput as part of protosw disappear
and IPv4/IPv6 get their private single level protocol switch table
ip_protox[] and ip6_protox[] respectively, pointing at array of
ipproto_input_t functions.  The pr_ctlinput that was used for
control input coming from the network (ICMP, ICMPv6) is now represented
by ip_ctlprotox[] and ip6_ctlprotox[].

ipproto_register() becomes the only official way to register in the
table.  Those protocols that were always static and unlikely anybody
is interested in making them loadable, are now registered by ip_init(),
ip6_init().  An IP protocol that considers itself unloadable shall
register itself within its own private SYSINIT().

Reviewed by:		tuexen, melifaro
Differential revision:	https://reviews.freebsd.org/D36157
2022-08-17 11:50:31 -07:00
Gleb Smirnoff
489482e276 ipsec: isolate knowledge about protocols that are last header
Retire PR_LASTHDR protosw flag.

Reviewed by:		ae
Differential revision:	https://reviews.freebsd.org/D36155
2022-08-17 08:24:28 -07:00
Gleb Smirnoff
c93db4abf4 udp: call UDP methods from UDP over IPv6 directly
Both UDP and UDP Lite use same methods on sockets.  Both UDP over IPv4
and over IPv6 use same methods.  Don't pretend that methods can switch
and remove this unneeded complexity.

Reviewed by:		melifaro
Differential revision:	https://reviews.freebsd.org/D36154
2022-08-16 12:40:36 -07:00
Gleb Smirnoff
f277746e13 protosw: change prototype for pr_control
For some reason protosw.h is used during world compilation and userland
is not aware of caddr_t, a relic from the first version of C.  Broken
buildworld is a good reason to get rid of yet another caddr_t in the kernel.

Fixes:	886fc1e804
2022-08-12 12:08:18 -07:00
Gleb Smirnoff
948f31d7b0 netinet: do not broadcast PRC_REDIRECT_HOST on ICMP redirect
This is an expensive and useless call.  It has been useless since Alexander
melifaro@ moved the forwarding table to nexthops with passive invalidation.
What happens now is that the cached route in an inpcb gets invalidated
on next ip_output().

These were the last users of pfctlinput(), so garbage collect it.

Reviewed by:		melifaro
Differential revision:	https://reviews.freebsd.org/D36156
2022-08-12 08:31:29 -07:00
Alexander V. Chernikov
9d16275c65 netinet6: simplify defrouter_select_fib()
* factor out underlying llentry check into a separate function and use it consistently
* enter epoch once instead of per-router enter/exit
* don't execute body with fibnum = `RT_ALL_FIBS`

Differential Revision: https://reviews.freebsd.org/D35523
MFC after:	2 weeks
2022-08-12 11:43:37 +00:00
Gleb Smirnoff
e0b405003a raw ip6: merge rip6_output() into rip6_send()
While here remove some code that was compat legacy back in 2005, added
in a1f7e5f8ee.

Reviewed by:		melifaro
Differential revision:	https://reviews.freebsd.org/D36128
2022-08-11 09:19:37 -07:00
Gleb Smirnoff
8c77967ecc protosw: retire pr_output method
The only place to execute this method was raw_usend(). Only those
protocols that used raw socket were able to actually enter that method.
All pr_output assignments being deleted by this commit were dead code
for many years.

Reviewed by:		melifaro
Differential revision:	https://reviews.freebsd.org/D36126
2022-08-11 09:19:37 -07:00
Gleb Smirnoff
c7a62c925c inpcb: gather v4/v6 handling code into in_pcballoc() from protocols
Reviewed by:		rrs, tuexen
Differential revision:	https://reviews.freebsd.org/D36062
2022-08-10 11:09:34 -07:00
Alexander V. Chernikov
f998535a66 netinet6: allow ND entries creation for all directly-reachable
destinations.

The current assumption is that kernel-handled rtadv prefixes along with
 the interface address prefixes are the only prefixes considered in
 the ND neighbor eligibility code.
Change this by allowing any non-gateway routes to be eligible. This
 will allow DHCPv6-controlled routes to be correctly handled by
 the ND code.
Refactor nd6_is_new_addr_neighbor() to enable more deterministic
 performance in "found" case and remove non-needed
 V_rt_add_addr_allfibs handling logic.

Reviewed By: kbowling
Differential Revision: https://reviews.freebsd.org/D23695
MFC after:	1 month
2022-08-10 14:19:19 +00:00
Gordon Bergling
cd33039749 inet6(4): Fix a typo in a source code comment
- s/Unreachablity/Unreachability/

MFC after:	3 days
2022-08-07 14:20:52 +02:00
Alexander V. Chernikov
08bb0873ca routing: fix panic for p2p interfaces after 800c68469b.
Reported by:	cy
MFC after:	1 month
2022-08-03 08:21:08 +00:00
Alexander V. Chernikov
ae6bfd12c8 routing: refactor private KPI
* Make nhgrp_get_nhops() return const struct weightened_nhop to
 indicate that the list is immutable
* Make nhgrp_get_group() return the actual group, instead of
 group+weight.

MFC after:	2 weeks
2022-08-01 10:02:12 +00:00
Alexander V. Chernikov
800c68469b routing: add nhop(9) kpi.
Differential Revision: https://reviews.freebsd.org/D35985
MFC after:	1 month
2022-08-01 08:52:26 +00:00
Kornel Dulęba
82042465c3 icmp6: Improve validation of PMTU
Currently we accept any pmtu between IPV6_MMTU(1280B) and the link mtu.
In some network topologies this could allow a bad actor to perform a DoS attack.
Contrary to IPv4, in IPv6 oversized packets are dropped, and an ICMP
PACKET_TOO_BIG message is sent back to the sender.
After receiving an ICMPv6 packet with pmtu bigger than the
current one the victim will start sending frames that will be dropped
by a router with reduced MTU.
Although it will eventually receive another message with correct pmtu,
an attacker can still just inject their spoofed packets frequently
enough to overwrite the correct value.
This issue is described in detail in RFC8201, section 6.
Fix this by checking the current pmtu, and accepting the new one only
if it's smaller.

Approved by:	mw(mentor)
Reviewed by:	tuexen
MFC after:	1 week
Sponsored by:	Stormshield
Obtained from:	Semihalf
Differential Revision: https://reviews.freebsd.org/D35871
2022-07-27 16:09:56 +02:00
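
The acceptance rule described in 82042465c3 can be sketched as a small predicate. This is an illustration under stated assumptions, not the code from the commit: `pmtu_update_acceptable()` is a hypothetical name, and rejecting values below IPV6_MMTU outright is an assumption made here to keep the sketch short.

```c
#include <stdbool.h>
#include <stdint.h>

#define IPV6_MMTU 1280	/* minimum IPv6 link MTU, as quoted in the commit message */

/*
 * Hypothetical helper: decide whether the MTU advertised in a
 * Packet Too Big message may replace the cached path MTU.
 */
bool
pmtu_update_acceptable(uint32_t advertised, uint32_t current_pmtu)
{
	/* Assumption of this sketch: values below the IPv6 minimum are rejected. */
	if (advertised < IPV6_MMTU)
		return (false);
	/* Only a decrease is accepted, so a spoofed larger value is ignored. */
	return (advertised < current_pmtu);
}
```

With a cached path MTU of 1400, an advertised value of 1500 is ignored, while 1300 is accepted.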
Dimitry Andric
50207b2de9 Adjust function definition in nd6.c to avoid clang 15 warnings
With clang 15, the following -Werror warning is produced:

    sys/netinet6/nd6.c:247:12: error: a function declaration without a prototype is deprecated in all versions of C [-Werror,-Wstrict-prototypes]
    nd6_destroy()
               ^
                void

This is because nd6_destroy() is declared with a (void) argument list, but
defined with an empty argument list. Make the definition match the
declaration.

MFC after:	3 days
2022-07-26 21:25:09 +02:00
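
The kind of mismatch fixed in 50207b2de9 can be shown with a stand-alone sketch. `example_destroy()` and `example_destroyed` are hypothetical names used only for illustration; the block shows the corrected form, in which the definition repeats the `(void)` from the declaration.

```c
/* Declaration, as it would appear in a header: takes no arguments. */
void example_destroy(void);

/* Flag set by the function so its effect can be observed. */
int example_destroyed;

/*
 * Definition with an explicit (void) parameter list.  Writing
 * "example_destroy()" here instead would define a function without a
 * prototype, which is what -Wstrict-prototypes complains about.
 */
void
example_destroy(void)
{
	example_destroyed = 1;
}
```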
Alexander V. Chernikov
50fa27e795 netinet6: fix interface handling for loopback traffic
Currently, processing of IPv6 local traffic is partially broken:
 link-local connection fails and global unicast connect() takes
 3 seconds to complete.
This happens due to the combination of multiple factors.
IPv6 code passes original interface "origifp" when passing
traffic via loopback to retain the scope that is mandatory for the
correct handling of link-local traffic. First problem is that the logic
of passing source interface is not working correctly for TCP connections,
resulting in passing "origifp" on the first 2 connection attempts and
lo0 on the subsequent ones. Second problem is that source address
validation logic skips its checks iff the source interface is loopback,
which doesn't cover "origifp" case.
More detailed description is available at https://reviews.freebsd.org/D35732

Fix the first problem by untangling&simplifying ifp/origifp logic.
Fix the second problem by switching source address validation check to
using M_LOOP mbuf flag instead of interface type.

PR:		265089
Reviewed by:	ae, bz(previous version)
Differential Revision:	https://reviews.freebsd.org/D35732
MFC after:	2 weeks
2022-07-10 12:47:47 +00:00
Alexander V. Chernikov
2756774c3f netinet6: simplify selectroute()
Effectively selectroute() addresses two different cases:
 providing interface info for multicast destinations and providing
 nexthop data for unicast ones. Current implementation intertwines
 handling of both cases, especially in the error handling part.
Factor out all route lookup logic in a separate function,
 lookup_route() to simplify the code.
Ensure consistent KPI: no error means *retifp is set, and vice versa.

Differential Revision: https://reviews.freebsd.org/D35711
MFC after:	2 weeks
2022-07-08 11:27:16 +00:00
Alexander V. Chernikov
81a235ecde netinet6: factor out cached route lookups from selectroute().
Currently selectroute() contains two nearly-identical versions of
 the route lookup logic - one for original destination and another
for the case when IPV6_NEXTHOP option was set on the socket.

Factor out handling these route lookups into a separate function to
 improve readability.
This change also fixes handling of link-local IPV6_NEXTHOPs.

Differential Revision: https://reviews.freebsd.org/D35710
MFC after:	2 weeks
2022-07-08 08:58:55 +00:00
Alexander V. Chernikov
0ed7253785 netinet6: perform out-of-bounds check for loX multicast statistics
Currently, some per-mbuf multicast statistics are stored in
 the per-interface ip6stat.ip6s_m2m[] array of size 32 (IP6S_M2MMAX).
Check that loopback ifindex falls within 0..IP6S_M2MMAX-1 range to
 avoid silent data corruption. The latter can happen with large
 number of VNETs.

Reviewed by:	glebius
Differential Revision: https://reviews.freebsd.org/D35715
MFC after:	2 weeks
2022-07-05 11:44:30 +00:00
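
The range check described in 0ed7253785 can be sketched as follows. Only the array size IP6S_M2MMAX (32) comes from the commit message; `struct m2m_stats` and `m2m_count()` are hypothetical names, and the real counters live in ip6stat rather than in a struct like this one.

```c
#include <stdint.h>

#define IP6S_M2MMAX 32	/* array size quoted in the commit message */

/* Hypothetical stand-in for the per-interface statistics array. */
struct m2m_stats {
	uint64_t m2m[IP6S_M2MMAX];
};

/*
 * Bump the counter for ifindex only when the index is inside the array.
 * Returns 1 if the counter was updated and 0 if the index was out of range.
 */
int
m2m_count(struct m2m_stats *st, unsigned int ifindex)
{
	if (ifindex >= IP6S_M2MMAX)
		return (0);	/* would write past the array, so skip it */
	st->m2m[ifindex]++;
	return (1);
}
```

Because the index is unsigned, a single comparison against IP6S_M2MMAX covers the whole valid range.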
Mark Johnston
a14465e1b9 rip6: Fix a lock order reversal in rip6_bind()
See also commit 71a1539e37.

Reported by:	syzbot+9b461b6a07a83cc10daa@syzkaller.appspotmail.com
Reported by:	syzbot+b6ce0aec16f5fdab3282@syzkaller.appspotmail.com
Reviewed by:	glebius
MFC after:	1 week
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D35472
2022-06-14 12:00:59 -04:00
Kristof Provost
a7f20faa07 netinet6: fix panic on kldunload pfsync
Commit d6cd20cc5 ("netinet6: fix ndp proxying") caused us to panic when
unloading pfsync:

	Fatal trap 12: page fault while in kernel mode
	cpuid = 19; apic id = 38
	fault virtual address	= 0x20
	fault code		= supervisor read data, page not present
	instruction pointer	= 0x20:0xffffffff80dfe7f4
	stack pointer	        = 0x28:0xfffffe015d4f8ac0
	frame pointer	        = 0x28:0xfffffe015d4f8ae0
	code segment		= base 0x0, limit 0xfffff, type 0x1b
				= DPL 0, pres 1, long 1, def32 0, gran 1
	processor eflags	= interrupt enabled, resume, IOPL = 0
	current process		= 5477 (kldunload)
	trap number		= 12
	panic: page fault
	cpuid = 19
	time = 1654023100
	KDB: stack backtrace:
	db_trace_self_wrapper() at db_trace_self_wrapper+0x2b/frame 0xfffffe015d4f8880
	vpanic() at vpanic+0x17f/frame 0xfffffe015d4f88d0
	panic() at panic+0x43/frame 0xfffffe015d4f8930
	trap_fatal() at trap_fatal+0x387/frame 0xfffffe015d4f8990
	trap_pfault() at trap_pfault+0xab/frame 0xfffffe015d4f89f0
	calltrap() at calltrap+0x8/frame 0xfffffe015d4f89f0
	--- trap 0xc, rip = 0xffffffff80dfe7f4, rsp = 0xfffffe015d4f8ac0, rbp = 0xfffffe015d4f8ae0 ---
	in6_purge_proxy_ndp() at in6_purge_proxy_ndp+0x14/frame 0xfffffe015d4f8ae0
	if_purgeaddrs() at if_purgeaddrs+0x24/frame 0xfffffe015d4f8b90
	if_detach_internal() at if_detach_internal+0x1c2/frame 0xfffffe015d4f8bf0
	if_detach() at if_detach+0x71/frame 0xfffffe015d4f8c20
	pfsync_clone_destroy() at pfsync_clone_destroy+0x1dd/frame 0xfffffe015d4f8c70
	if_clone_destroyif() at if_clone_destroyif+0x239/frame 0xfffffe015d4f8cc0
	if_clone_detach() at if_clone_detach+0xc8/frame 0xfffffe015d4f8cf0
	vnet_pfsync_uninit() at vnet_pfsync_uninit+0xda/frame 0xfffffe015d4f8d10
	vnet_deregister_sysuninit() at vnet_deregister_sysuninit+0x85/frame 0xfffffe015d4f8d40
	linker_file_sysuninit() at linker_file_sysuninit+0x147/frame 0xfffffe015d4f8d70
	linker_file_unload() at linker_file_unload+0x269/frame 0xfffffe015d4f8db0
	kern_kldunload() at kern_kldunload+0x18d/frame 0xfffffe015d4f8e00
	amd64_syscall() at amd64_syscall+0x12e/frame 0xfffffe015d4f8f30
	fast_syscall_common() at fast_syscall_common+0xf8/frame 0xfffffe015d4f8f30
	--- syscall (444, FreeBSD ELF64, sys_kldunloadf), rip = 0x1601eab28cba, rsp = 0x1601e9c363f8, rbp = 0x1601e9c36c50 ---

This happens because ifp->if_afdata[AF_INET6] is NULL. Check for this,
just as we already do in a few other places.
See also c139b3c19b ("arp/nd: Cope with late calls to
iflladdr_event").

Reviewed by:	melifaro
Sponsored by:	Rubicon Communications, LLC ("Netgate")
Differential Revision:	https://reviews.freebsd.org/D35374
2022-06-01 09:26:15 +02:00
Arseny Smalyuk
d18b4bec98 netinet6: Fix mbuf leak in NDP
Mbufs leak when manually removing incomplete NDP records with pending packets via ndp -d.
It happens because lltable_drop_entry_queue() relies on the `la_numheld`
counter when dropping NDP entries (lles). It turned out NDP code never
increased `la_numheld`, so the actual free never happened.

Fix the issue by introducing unified lltable_append_entry_queue(),
common for both ARP and NDP code, properly addressing packet queue
maintenance.

Reviewed By: melifaro
Differential Revision: https://reviews.freebsd.org/D35365
MFC after:	2 weeks
2022-05-31 21:06:14 +00:00
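
The bookkeeping problem described in d18b4bec98 can be modelled in user space. This is a simplified sketch and not the lltable code: `struct pkt`, `struct entry`, `entry_append()` and `entry_drop_queue()` are hypothetical, and the drop routine only unlinks packets instead of freeing mbufs. It shows why a drop loop driven by a held-packet counter releases nothing unless every append path maintains that counter.

```c
#include <stddef.h>

/* Hypothetical packet and neighbor-entry types. */
struct pkt {
	struct pkt *next;
};

struct entry {
	struct pkt *hold;	/* queue of packets waiting for resolution */
	int numheld;		/* number of packets on the queue */
};

/* Single append path: links the packet and keeps the counter in sync. */
void
entry_append(struct entry *e, struct pkt *p)
{
	struct pkt **tail = &e->hold;

	while (*tail != NULL)
		tail = &(*tail)->next;
	p->next = NULL;
	*tail = p;
	e->numheld++;
}

/* Drop loop driven by the counter; returns how many packets were dropped. */
int
entry_drop_queue(struct entry *e)
{
	int dropped = 0;

	while (e->numheld > 0 && e->hold != NULL) {
		e->hold = e->hold->next;	/* a real implementation frees here */
		e->numheld--;
		dropped++;
	}
	return (dropped);
}
```

If a caller linked packets onto `hold` without touching `numheld`, the loop above would exit immediately and the queued packets would be left behind, which is the shape of the leak the commit describes.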
KUROSAWA Takahiro
d6cd20cc5c netinet6: fix ndp proxying
We could insert proxy NDP entries by the ndp command, but the host
with proxy ndp entries had not responded to Neighbor Solicitations.
Change the following points for proxy NDP to work as expected:
* join solicited-node multicast addresses for proxy NDP entries
  in order to receive Neighbor Solicitations.
* look up proxy NDP entries not on the routing table but on the
  link-level address table when receiving Neighbor Solicitations.

Reviewed By: melifaro
Differential Revision: https://reviews.freebsd.org/D35307
MFC after:	2 weeks
2022-05-30 10:53:33 +00:00
KUROSAWA Takahiro
77001f9b6d lltable: introduce the llt_post_resolved callback
In order to decrease ifdef INET/INET6s in the lltable implementation,
introduce the llt_post_resolved callback and implement protocol-dependent
code in the protocol-dependent part.

Reviewed By: melifaro
Differential Revision: https://reviews.freebsd.org/D35322
MFC after:	2 weeks
2022-05-30 10:53:33 +00:00
Dmitry Chagin
31d1b816fe sysent: Get rid of bogus sys/sysent.h include.
Where appropriate hide sysent.h under proper condition.

MFC after:	2 weeks
2022-05-28 20:52:17 +03:00
Gleb Smirnoff
6890b58814 sockbuf: improve sbcreatecontrol()
o Constify memory pointer.  Make length unsigned.
o Make it never fail with M_WAITOK and assert that length is sane.
2022-05-17 10:10:42 -07:00
Gleb Smirnoff
b46667c63e sockbuf: merge two versions of sbcreatecontrol() into one
No functional change.
2022-05-17 10:10:42 -07:00
Gleb Smirnoff
808b7d80e0 mbuf: remove PH_vt alias for mbuf packet header persistent shared data
Mechanical sed change s/PH_vt\.vt_nrecs/vt_nrecs/g
2022-05-13 13:32:43 -07:00
Kristof Provost
797b94504f udp6: allow udp_tun_func_t() to indicate it did not eat the packet
Implement the same filter feature we implemented for UDP over IPv6 in
742e7210d. This was missed in that commit.

Pointed out by:	markj
Sponsored by:	Rubicon Communications, LLC ("Netgate")
2022-04-22 16:55:23 +02:00
Mark Johnston
5d691ab4f0 mld6: Ensure that mld_domifattach() always succeeds
mld_domifattach() does a memory allocation under the global MLD mutex
and so can fail, but no error handling prevents a null pointer
dereference in this case.  The mutex is only needed when updating the
global softc list; the allocation and static initialization of the softc
does not require this mutex.  So, reduce the scope of the mutex and use
M_WAITOK for the allocation.

PR:		261457
MFC after:	1 week
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D34943
2022-04-21 13:23:59 -04:00
John Baldwin
a98bb75f80 netinet6: Use __diagused for variables only used in KASSERT().
2022-04-13 16:08:19 -07:00
Kristof Provost
742e7210d0 udp: allow udp_tun_func_t() to indicate it did not eat the packet
Allow udp tunnel functions to indicate they have not taken ownership of
the packet, and that normal UDP processing should continue.

This is especially useful for scenarios where the kernel has taken
ownership of a socket that was originally created by userspace. It
allows the tunnel function to pass through certain packets for userspace
processing.

The primary user of this is if_ovpn, when it receives messages from
unknown peers (which might be a new client).

Reviewed by:	tuexen
Sponsored by:	Rubicon Communications, LLC ("Netgate")
Differential Revision:	https://reviews.freebsd.org/D34883
2022-04-12 10:04:59 +02:00
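
The control flow introduced by 742e7210d0 can be sketched in a reduced form. The types and names below (`struct packet`, `tun_func_t`, `example_tunnel()`, `deliver()`) are hypothetical, and the real udp_tun_func_t takes different arguments. The sketch only shows the idea that the tunnel callback reports whether it consumed the packet.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical packet type carrying just a peer identifier. */
struct packet {
	int peer_id;
};

/* Hypothetical callback type: returns true if it took ownership of the packet. */
typedef bool (*tun_func_t)(struct packet *);

/* Example tunnel: consumes packets from the one peer it knows (id 42). */
bool
example_tunnel(struct packet *p)
{
	return (p->peer_id == 42);
}

/*
 * Returns 1 when the tunnel consumed the packet and 0 when normal
 * processing (e.g. delivery to a userspace socket) should continue.
 */
int
deliver(struct packet *p, tun_func_t tun)
{
	if (tun != NULL && tun(p))
		return (1);
	return (0);
}
```

A packet from an unknown peer falls through to the normal path, which matches the if_ovpn use case mentioned in the commit message.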
Andrey V. Elsukov
7d98cc096b Fix ipfw fwd that doesn't work in some cases
For IPv4 use dst pointer as destination address in fib4_lookup().
It keeps destination address from IPv4 header and can be changed
when PACKET_TAG_IPFORWARD tag was set by packet filter.

For IPv6 override destination address with address from dst_sa.sin6_addr,
that was set from PACKET_TAG_IPFORWARD tag.

Reviewed by:	eugen
MFC after:	1 week
PR:		256828, 261697, 255705
Differential Revision: https://reviews.freebsd.org/D34732
2022-04-11 14:16:43 +03:00
Mark Johnston
990a6d18b0 net: Fix memory leaks in lltable_calc_llheader() error paths
Also convert raw epoch_call() calls to lltable_free_entry() calls, no
functional change intended.  There's no need to asynchronously free the
LLEs in that case to begin with, but we might as well use the lltable
interfaces consistently.

Noticed by code inspection; I believe lltable_calc_llheader() failures
do not generally happen in practice.

Reviewed by:	bz
MFC after:	1 week
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D34832
2022-04-08 11:47:25 -04:00
Mark Johnston
dd91d84486 net: Fix LLE lock leaks
Historically, lltable_try_set_entry_addr() would release the LLE lock
upon failure.  After some refactoring, it no longer does so, but
consumers were not adjusted accordingly.

Also fix a leak that can occur if lltable_calc_llheader() fails in the
ARP code, but I suspect that such a failure can only occur due to a code
bug.

Reviewed by:	bz, melifaro
Reported by:	pho
Fixes:		0b79b007eb ("[lltable] Restructure nd6 code.")
MFC after:	3 days
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D34831
2022-04-08 11:46:19 -04:00
John Baldwin
bf73b06771 ip6_mroute: Mark a variable only used in a debug trace as unused. 2022-04-06 16:45:29 -07:00
John Baldwin
8b6ccfb6c7 multicast code: Quiet unused warnings for variables used for KTR traces.
For nallow and nblock, move the variables under #ifdef KTR.

For return values from functions logged in KTR traces, mark the
variables as __unused rather than having to #ifdef the assignment of
the function return value.
2022-04-06 16:45:28 -07:00
Warner Losh
c7761ca93e pim6_input: eliminate write only variable rc
Sponsored by:		Netflix
2022-04-04 22:30:52 -06:00
Gordon Bergling
c55ecce1c1 netinet6: Fix a typo in a source code comment
- s/maping/mapping/

MFC after:	3 days
2022-03-28 19:32:10 +02:00
Andrew Gallatin
9ba117960e Fix a memory leak when ip_output_send() returns EAGAIN due to send tag issues
When ip_output_send() returns EAGAIN due to issues with send tags (route
change, lagg failover, etc), it must free the mbuf. This is because
ip_output_send() was written as a wrapper/replacement for a direct
call to  if_output(), and the contract with if_output() has
historically been that it owns the mbufs once called. When
ip_output_send() failed to free mbufs, it violated this assumption
and led to leaked mbufs.

This was noticed when using NIC TLS in combination with hardware
rate-limited connections. When seeing lots of NIC output drops
triggered ratelimit send tag changes, we noticed we were leaking
ktls_sessions, send tags and mbufs. This was due to ip_output_send()
leaking mbufs which held references to ktls_sessions, which in
turn held references to send tags.

Many thanks to jhb, rrs, hselasky and markj for their help in
debugging this.

Sponsored by: Netflix
Differential Revision: https://reviews.freebsd.org/D34054
Reviewed by: hselasky, jhb, rrs
MFC after: 2 weeks
2022-01-27 10:34:34 -05:00
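
Note on 9ba117960e: the contract described above (the send routine owns the buffer once called, so it must free it on every path, including the EAGAIN path) can be sketched in userspace C. All names here (`struct buf`, `buf_alloc()`, `buf_free()`, `send_sketch()`, `live_bufs`) are hypothetical; the counter exists only to make a leak observable.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

/* Hypothetical buffer type standing in for an mbuf chain. */
struct buf {
	char data[64];
};

static int live_bufs;	/* outstanding allocations, to make leaks visible */

static struct buf *
buf_alloc(void)
{
	struct buf *b = calloc(1, sizeof(*b));

	if (b != NULL)
		live_bufs++;
	return (b);
}

static void
buf_free(struct buf *b)
{
	assert(b != NULL);
	free(b);
	live_bufs--;
}

/*
 * Hypothetical send routine.  It owns b once called, so the caller
 * must not touch b afterwards, whatever the return value is.
 */
static int
send_sketch(struct buf *b, int tag_valid)
{
	if (!tag_valid) {
		buf_free(b);	/* the error path frees the buffer too */
		return (EAGAIN);
	}
	buf_free(b);		/* pretend it was transmitted and released */
	return (0);
}
```

With this convention a caller that sees EAGAIN can simply retry with a new buffer; nothing is left to clean up.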
Thomas Steen Rasmussen
bc6abdd97e nd6: use CARP link level address in SLLAO for NS sent out
When sending an NS, check if we are using an IPv6 CARP address
and if we do, then put proper CARP link level address into
ND_OPT_SOURCE_LINKADDR option and also put PACKET_TAG_CARP tag
on the packet.  The latter will enforce CARP link level address
at the data link layer too, which might be necessary for broken
implementations.
The code really follows what NA sending code has been doing since
introduction of carp(4).  While here, bring to style(9) the whole
block of code.

PR:			193280
Differential revision:	https://reviews.freebsd.org/D33858
2022-01-24 21:02:47 -08:00
Gleb Smirnoff
644ca0846d domains: make domain_init() initialize only global state
Now that each module handles its global and VNET initialization
itself, there is no VNET related stuff left to do in domain_init().

Differential revision:	https://reviews.freebsd.org/D33541
2022-01-03 10:15:22 -08:00
Gleb Smirnoff
89128ff3e4 protocols: init with standard SYSINIT(9) or VNET_SYSINIT
The historical BSD network stack loop that rolls over domains and
over protocols has no advantages over more modern SYSINIT(9).
While doing the sweep, split global and per-VNET initializers.

Getting rid of pr_init allows to achieve several things:
o Get rid of ifdef's that protect against double foo_init() when
  both INET and INET6 are compiled in.
o Isolate initializers statically to the module they init.
o Makes code easier to understand and maintain.

Reviewed by:		melifaro
Differential revision:	https://reviews.freebsd.org/D33537
2022-01-03 10:15:21 -08:00
Michael Tuexen
657fcf5807 udp6: remove assignments not being used
MFC after:	3 days
Sponsored by:	Netflix, Inc.
2022-01-01 19:25:47 +01:00
Michael Tuexen
2de2ae331b sctp: improve sctp_pathmtu_adjustment()
Allow the resending of DATA chunks to be controlled by the caller,
which allows retiring sctp_mtu_size_reset() in a separate commit.
Also improve the computation of the overhead and use 32-bit integers
consistently.
Thanks to Timo Voelker for pointing me to the code.

MFC after:	3 days
2021-12-30 15:16:05 +01:00
Alexander V. Chernikov
ff3a85d324 [lltable] Add per-family lltable getters.
Introduce a new function, lltable_get(), to retrieve lltable pointer
 for the specified interface and family.
Use it to avoid all-iftable list traversal when adding or deleting
 ARP/ND records.

Differential Revision: https://reviews.freebsd.org/D33660
MFC after:	2 weeks
2021-12-29 20:57:15 +00:00
Gleb Smirnoff
a057769205 in_pcb: use jenkins hash over the entire IPv6 (or IPv4) address
The intent is to provide more entropy than can be provided
by just the 32-bits of the IPv6 address which overlaps with
6to4 tunnels.  This is needed to mitigate potential algorithmic
complexity attacks from attackers who can control large
numbers of IPv6 addresses.

Together with:		gallatin
Reviewed by:		dwmalone, rscheff
Differential revision:	https://reviews.freebsd.org/D33254
2021-12-26 10:47:28 -08:00
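
Note on a057769205: the idea of hashing the whole address rather than one 32-bit word can be illustrated with a small userspace sketch. It uses Bob Jenkins' one-at-a-time hash as a stand-in, which is not necessarily the Jenkins variant the kernel calls, and `last32()` is a hypothetical model of a single-word key (the choice of the last word is for illustration only).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Bob Jenkins' one-at-a-time hash over an arbitrary byte string. */
static uint32_t
oaat_hash(const uint8_t *key, size_t len)
{
	uint32_t h = 0;

	assert(key != NULL);
	for (size_t i = 0; i < len; i++) {
		h += key[i];
		h += h << 10;
		h ^= h >> 6;
	}
	h += h << 3;
	h ^= h >> 11;
	h += h << 15;
	return (h);
}

/* Hypothetical single-word key: only the last 32 bits of the address. */
static uint32_t
last32(const uint8_t addr[16])
{
	return ((uint32_t)addr[12] << 24 | (uint32_t)addr[13] << 16 |
	    (uint32_t)addr[14] << 8 | (uint32_t)addr[15]);
}
```

Two addresses that differ only in their prefix always collide under `last32()`, so an attacker who controls many prefixes can fill one bucket at will. Hashing all 16 bytes makes the key depend on every byte of the address.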
Gleb Smirnoff
eb8dcdeac2 jail: network epoch protection for IP address lists
Now struct prison has two pointers (IPv4 and IPv6) of struct
prison_ip type.  Each points into epoch context, address count
and variable size array of addresses.  These structures are
freed with network epoch deferred free and are not edited in
place, instead a new structure is allocated and set.

While here, the change also generalizes a lot (but not enough)
of IPv4 and IPv6 processing. E.g. address family agnostic helpers
for kern_jail_set() are provided, that reduce v4-v6 copy-paste.

The fast-path prison_check_ip[46]_locked() is also generalized
into prison_ip_check() that can be executed with network epoch
protection only.

Reviewed by:		jamie
Differential revision:	https://reviews.freebsd.org/D33339
2021-12-26 10:45:50 -08:00
Mateusz Guzik
71a1539e37 inet6: fix a LOR between rip and rawinp
Running sys/netpfil/pf/fragmentation v6 results in:

lock order reversal:
 1st 0xfffffe00050429a8 rip (rip, sleep mutex) @ /usr/src/sys/netinet6/raw_ip6.c:803
 2nd 0xfffff8009491e1d0 rawinp (rawinp, rw) @ /usr/src/sys/netinet6/raw_ip6.c:804
lock order rawinp -> rip established at:
0xffffffff8068e26a at witness_lock_order_add+0x28a
0xffffffff8068d087 at witness_checkorder+0x627
0xffffffff805a9f05 at __mtx_lock_flags+0x205
0xffffffff808102e4 at in_pcballoc+0x204
0xffffffff808d53c6 at rip6_attach+0x116
0xffffffff806dc4e8 at socreate+0x368
0xffffffff806eaedc at kern_socket+0xfc
0xffffffff806eadcd at sys_socket+0x2d
0xffffffff80abc774 at syscallenter+0x5c4
0xffffffff80abbeeb at amd64_syscall+0x1b
0xffffffff80a8044b at fast_syscall_common+0xf8
lock order rip -> rawinp attempted at:
0xffffffff8068dc2a at witness_checkorder+0x11ca
0xffffffff805d1b7f at _rw_wlock_cookie+0x18f
0xffffffff808d596c at rip6_connect+0x19c
0xffffffff806e0842 at soconnectat+0x142
0xffffffff806ebe36 at kern_connectat+0x136
0xffffffff806ebcdf at sys_connect+0x4f
0xffffffff80abc774 at syscallenter+0x5c4
0xffffffff80abbeeb at amd64_syscall+0x1b
0xffffffff80a8044b at fast_syscall_common+0xf8

Reviewed by:	glebius
Fixes:	de2d47842e ("SMR protection for inpcbs")
Sponsored by:	Rubicon Communications, LLC ("Netgate")
Differential Revision:	https://reviews.freebsd.org/D33508
2021-12-19 14:43:04 +00:00
Kristof Provost
9f5432d5e5 netinet6: ip6_setpktopt() requires NET_EPOCH
ip6_setpktopt() can call ifnet_byindex() which requires epoch. Mark the
function as requiring NET_EPOCH, and ensure we enter it prior to calling
it.

Reported-by: syzbot+92526116441688fea8a3@syzkaller.appspotmail.com
Reviewed by:	glebius
Sponsored by:	Rubicon Communications, LLC ("Netgate")
Differential Revision:	https://reviews.freebsd.org/D33462
2021-12-17 17:30:36 +01:00
Gleb Smirnoff
185e659c40 inpcb: use locked variant of prison_check_ip*()
The pcb lookup always happens in the network epoch and in SMR section.
We can't block on a mutex due to the latter.  Right now this patch opens
up a race.  But soon that will be addressed by D33339.

Reviewed by:		markj, jamie
Differential revision:	https://reviews.freebsd.org/D33340
Fixes:			de2d47842e
2021-12-14 09:38:52 -08:00
Gleb Smirnoff
e3044071de in6p_set_multicast_if(): fix malloc(M_WAITOK) with epoch
Fixes:	d74b7baeb0
2021-12-06 14:33:23 -08:00
Gleb Smirnoff
d74b7baeb0 ifnet_byindex() actually requires network epoch
Sweep over potentially unsafe calls to ifnet_byindex() and wrap them
in epoch.  Most of the code touched remains unsafe, as the returned
pointer is being used after epoch exit.  Mark that with a comment.

Validate the index argument inside the function, reducing argument
validation requirement from the callers and making V_if_index
private to if.c.

Reviewed by:		melifaro
Differential revision:	https://reviews.freebsd.org/D33263
2021-12-06 09:32:31 -08:00
Cy Schubert
db0ac6ded6 Revert "wpa: Import wpa_supplicant/hostapd commit 14ab4a816"
This reverts commit 266f97b5e9, reversing
changes made to a10253cffe.

A mismerge of a merge to catch up to main resulted in files being
committed which should not have been.
2021-12-02 14:45:04 -08:00
Cy Schubert
266f97b5e9 wpa: Import wpa_supplicant/hostapd commit 14ab4a816
This is the November update to vendor/wpa committed upstream 2021-11-26.

MFC after:      1 month
2021-12-02 13:35:14 -08:00
Gleb Smirnoff
de2d47842e SMR protection for inpcbs
With introduction of epoch(9) synchronization to network stack the
inpcb database became protected by the network epoch together with
static network data (interfaces, addresses, etc).  However, inpcb
aren't static in nature, they are created and destroyed all the
time, which creates some traffic on the epoch(9) garbage collector.

Fairly new feature of uma(9) - Safe Memory Reclamation allows to
safely free memory in page-sized batches, with virtually zero
overhead compared to uma_zfree().  However, unlike epoch(9), it
puts stricter requirement on the access to the protected memory,
needing the critical(9) section to access it.  Details:

- The database is already built on CK lists, thanks to epoch(9).
- For write access nothing is changed.
- For a lookup in the database SMR section is now required.
  Once the desired inpcb is found we need to transition from SMR
  section to r/w lock on the inpcb itself, with a check that inpcb
  isn't yet freed.  This requires some complexity, since SMR section
  itself is a critical(9) section.  The complexity is hidden from
  KPI users in inp_smr_lock().
- For an inpcb list traversal (a pcblist sysctl, or broadcast
  notification) also a new KPI is provided, that hides internals of
  the database - inp_next(struct inp_iterator *).

Reviewed by:		rrs
Differential revision:	https://reviews.freebsd.org/D33022
2021-12-02 10:48:48 -08:00
Gleb Smirnoff
565655f4e3 inpcb: reduce some aliased functions after removal of PCBGROUP.
Reviewed by:		rrs
Differential revision:	https://reviews.freebsd.org/D33021
2021-12-02 10:48:48 -08:00
Gleb Smirnoff
93c67567e0 Remove "options PCBGROUP"
With upcoming changes to the inpcb synchronisation it is going to be
broken. Even its current status after the move of PCB synchronization
to the network epoch is very questionable.

This experimental feature was sponsored by Juniper but ended never to
be used in Juniper and doesn't exist in their source tree [sjg@, stevek@,
jtl@]. In the past (AFAIK, pre-epoch times) it was tried out at Netflix
[gallatin@, rrs@] with no positive result and at Yandex [ae@, melifaro@].

I'm up to resurrecting it back if there is any interest from anybody.

Reviewed by:		rrs
Differential revision:	https://reviews.freebsd.org/D33020
2021-12-02 10:48:48 -08:00
Gleb Smirnoff
1cec1c5831 Allow to compile RSS without PCBGROUP.
Reviewed by:		rrs
Differential revision:	https://reviews.freebsd.org/D33019
2021-12-02 10:48:48 -08:00
Gordon Bergling
3cf59750eb netinet6: Fix a typo in a sysctl description
- remove a double 'a'

MFC after:	3 days
2021-11-30 07:24:44 +01:00
Mark Johnston
44775b163b netinet: Remove unneeded mb_unmapped_to_ext() calls
in_cksum_skip() now handles unmapped mbufs on platforms where they're
permitted.

Reviewed by:	glebius, jhb
MFC after:	1 week
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D33097
2021-11-24 13:31:16 -05:00
Kristof Provost
19dc644511 if_stf: add 6rd support
Implement IPv6 Rapid Deployment (RFC5969) on top of the existing 6to4
(RFC3056) if_stf code.

PR:		253328
Reviewed by:	hrs
Obtained from:	pfSense
Sponsored by:	Rubicon Communications, LLC ("Netgate")
Differential Revision:	https://reviews.freebsd.org/D33037
2021-11-20 19:29:01 +01:00
Gleb Smirnoff
3850d1837b in6_rmx: remove unnecessary TCP includes 2021-11-18 00:54:29 -08:00
Gleb Smirnoff
1817be481b Add net.inet6.ip6.source_address_validation
Drop packets arriving from the network that have our source IPv6
address.  If maliciously crafted they can create evil effects
like an RST exchange between two of our listening TCP ports.
Such packets just can't be legitimate.  Enable the tunable
by default.  Long time due for a modern Internet host.

Reviewed by:		melifaro, donner, kp
Differential revision:	https://reviews.freebsd.org/D32915
2021-11-12 09:01:40 -08:00
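
Note on 1817be481b: the check described above (drop a packet received from the network if its source address is one of our own) can be sketched in userspace C. This is a simplified model: `is_local_addr()` and `should_drop()` are hypothetical helpers, the flat array stands in for the kernel's local address lookup, and the real code may apply further conditions (for example, for loopback traffic) that are not modeled here.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <arpa/inet.h>

/* Return true if src matches one of the nlocal configured addresses. */
static bool
is_local_addr(const struct in6_addr *src, const struct in6_addr *local,
    size_t nlocal)
{
	assert(src != NULL && (local != NULL || nlocal == 0));
	for (size_t i = 0; i < nlocal; i++)
		if (memcmp(src, &local[i], sizeof(*src)) == 0)
			return (true);
	return (false);
}

/*
 * Decide whether an incoming packet should be dropped.  The enabled
 * flag models the tunable; it is on by default per the commit message.
 */
static bool
should_drop(const struct in6_addr *src, const struct in6_addr *local,
    size_t nlocal, bool enabled)
{
	return (enabled && is_local_addr(src, local, nlocal));
}
```

With 2001:db8::1 configured locally, an incoming packet claiming that source is dropped while one from 2001:db8::2 is accepted; turning the flag off accepts both.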
Gleb Smirnoff
9c89392f12 Add in_localip_fib(), in6_localip_fib().
Check if given address/FIB exists locally.

Reviewed by:		melifaro
Differential revision:	https://reviews.freebsd.org/D32913
2021-11-12 08:59:42 -08:00
Gleb Smirnoff
3ea9a7cf7b blackhole(4): disable for locally originated TCP/UDP packets
In most cases blackholing for locally originated packets is undesired,
leads to different kind of lags and delays. Provide sysctls to enforce
it, e.g. for debugging purposes.

Reviewed by:		rrs
Differential revision:	https://reviews.freebsd.org/D32718
2021-11-03 13:02:44 -07:00
Roy Marples
5c5340108e net: Allow binding of unspecified address without address existence
Previously in_pcbbind_setup returned EADDRNOTAVAIL for empty
V_in_ifaddrhead (i.e., no IPv4 addresses configured) and in6_pcbbind
did the same for empty V_in6_ifaddrhead (no IPv6 addresses).

An equivalent test has existed since 4.4-Lite.  It was presumably done
to avoid extra work (assuming the address isn't going to be found
later).

In normal system operation *_ifaddrhead will not be empty: they will
at least have the loopback address(es).  In practice no work will be
avoided.

Further, this case caused net/dhcpd to fail when run early in boot
before assignment of any addresses.  It should be possible to bind the
unspecified address even if no addresses have been configured yet, so
just remove the tests.

The now-removed "XXX broken" comments were added in 59562606b9,
which converted the ifaddr lists to TAILQs.  As far as I (emaste) can
tell the brokenness is the issue described above, not some aspect of
the TAILQ conversion.

PR:		253166
Reviewed by:	ae, bz, donner, emaste, glebius
MFC after:	1 month
Differential Revision:	https://reviews.freebsd.org/D32563
2021-10-20 19:25:51 -04:00
Gleb Smirnoff
0f617ae48a Add in_pcb_var.h for KPIs that are private to in_pcb.c and in6_pcb.c. 2021-10-18 10:19:57 -07:00
Gleb Smirnoff
147f018a72 Move in6_pcbsetport() to in6_pcb.c
This function was originally carved out of in6_pcbbind(), which
is in in6_pcb.c. This function also uses KPI private to the PCB
database - in_pcb_lport().
2021-10-18 10:19:03 -07:00
Mark Johnston
2d5c48eccd sctp: Tighten up locking around sctp_aloc_assoc()
All callers of sctp_aloc_assoc() mark the PCB as connected after a
successful call (for one-to-one-style sockets).  In all cases this is
done without the PCB lock, so the PCB's flags can be corrupted.  We also
do not atomically check whether a one-to-one-style socket is a listening
socket, which violates various assumptions in solisten_proto().

We need to hold the PCB lock across all of sctp_aloc_assoc() to fix
this.  In order to do that without introducing lock order reversals, we
have to hold the global info lock as well.

So:
- Convert sctp_aloc_assoc() so that the inp and info locks are
  consistently held.  It returns with the association lock held, as
  before.
- Fix an apparent bug where we failed to remove an association from a
  global hash if sctp_add_remote_addr() fails.
- sctp_select_a_tag() is called when initializing an association, and it
  acquires the global info lock.  To avoid lock recursion, push locking
  into its callers.
- Introduce sctp_aloc_assoc_connected(), which atomically checks for a
  listening socket and sets SCTP_PCB_FLAGS_CONNECTED.

There is still one edge case in sctp_process_cookie_new() where we do
not update PCB/socket state correctly.

Reviewed by:	tuexen
MFC after:	1 week
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D31908
2021-09-11 10:15:21 -04:00