Commit Graph

664 Commits

Author SHA1 Message Date
Gleb Smirnoff
e7d02be19d protosw: refactor protosw and domain static declaration and load
o Assert that every protosw has pr_attach.  Now this structure is
  only for socket protocols declarations and nothing else.
o Merge struct pr_usrreqs into struct protosw.  This was suggested
  in 1996 by wollman@ (see 7b187005d1), and later reiterated
  in 2006 by rwatson@ (see 6fbb9cf860).
o Make struct domain hold a variable sized array of protosw pointers.
  For most protocols these pointers are initialized statically.
  Those domains that may have loadable protocols have spacers. IPv4
  and IPv6 have 8 spacers each (andre@ dff3237ee5).
o For inetsw and inet6sw leave a comment noting that many protosw
  entries very likely are dead code.
o Refactor pf_proto_[un]register() into protosw_[un]register().
o Isolate pr_*_notsupp() methods into uipc_domain.c

Reviewed by:		melifaro
Differential revision:	https://reviews.freebsd.org/D36232
2022-08-17 11:50:32 -07:00
Alexander V. Chernikov
d8b42ddcac rtsock: subscribe to ifnet eventhandlers instead of direct calls.
Stop treating rtsock as a "special" consumer and use already-provided
 ifaddr arrival/departure notifications.

MFC after:	2 weeks

Test Plan:
```
21:05 [0] m@devel0 route -n monitor

-> ifconfig vtnet0.2 create

got message of size 24 on Tue Aug  9 21:05:44 2022
RTM_IFANNOUNCE: interface arrival/departure: len 24, if# 3, what: arrival

got message of size 168 on Tue Aug  9 21:05:54 2022
RTM_IFINFO: iface status change: len 168, if# 3, link: up, flags:<BROADCAST,RUNNING,SIMPLEX,MULTICAST>

-> ifconfig vtnet0.2 destroy

got message of size 24 on Tue Aug  9 21:05:54 2022
RTM_IFANNOUNCE: interface arrival/departure: len 24, if# 3, what: departure

```

Reviewed By: glebius
Differential Revision: https://reviews.freebsd.org/D36095
MFC after:	2 weeks
2022-08-11 20:36:59 +00:00
Gleb Smirnoff
b8103ca76d netinet: get interface event notifications directly via EVENTHANDLER(9)
The old mechanism of getting them via domains/protocols control input
is a relict from the previous century, when nothing like EVENTHANDLER(9)
existed yet.  Retire PRC_IFDOWN/PRC_IFUP as netinet was the only one
to use them.

Reviewed by:		melifaro
Differential revision:	https://reviews.freebsd.org/D36116
2022-08-11 09:19:36 -07:00
Zhenlei Huang
150486f6a9 Introduce and use the NET_EPOCH_DRAIN_CALLBACKS() macro
Reviewed by:	melifao, kp
Differential Revision:	https://reviews.freebsd.org/D35968
2022-07-29 21:21:10 +02:00
KUROSAWA Takahiro
d6cd20cc5c netinet6: fix ndp proxying
We could insert proxy NDP entries by the ndp command, but the host
with proxy ndp entries had not responded to Neighbor Solicitations.
Change the following points for proxy NDP to work as expected:
* join solicited-node multicast addresses for proxy NDP entries
  in order to receive Neighbor Solicitations.
* look up proxy NDP entries not on the routing table but on the
  link-level address table when receiving Neighbor Solicitations.

Reviewed By: melifaro
Differential Revision: https://reviews.freebsd.org/D35307
MFC after:	2 weeks
2022-05-30 10:53:33 +00:00
Konstantin Belousov
051e7d78b0 Kernel-side infrastructure to implement nvlist-based set/get ifcaps
Reviewed by:	hselasky, jhb, kp (previous version)
Sponsored by:	NVIDIA Networking
MFC after:	3 weeks
Differential revision:	https://reviews.freebsd.org/D32551
2022-05-24 23:59:32 +03:00
Kristof Provost
868bf82153 if: avoid interface destroy race
When we destroy an interface while the jail containing it is being
destroyed we risk seeing a race between if_vmove() and the destruction
code, which results in us trying to move a destroyed interface.

Protect against this by using the ifnet_detach_sxlock to also covert
if_vmove() (and not just detach).

PR:		262829
MFC after:	3 weeks
Differential Revision:	https://reviews.freebsd.org/D34704
2022-05-06 13:55:08 +02:00
Gleb Smirnoff
4d7a1361ef ifnet/mbuf: provide KPI to serialize/restore m->m_pkthdr.rcvif
Supplement ifindex table with generation count and use it to
serialize & restore an ifnet pointer.

Reviewed by:		kp
Differential revision:	https://reviews.freebsd.org/D33266
Fun note:		git show e6abef0918

(cherry picked from commit e1882428dc)
2022-05-05 14:38:07 -04:00
Gleb Smirnoff
80e60e236d ifnet: make if_index global
Now that ifindex is static to if.c we can unvirtualize it.  For lifetime
of an ifnet its index never changes.  To avoid leaking foreign interfaces
the net.link.generic.system.ifcount sysctl and the ifnet_byindex() KPI
filter their returned value on curvnet.  Since if_vmove() no longer
changes the if_index, inline ifindex_alloc() and ifindex_free() into
if_alloc() and if_free() respectively.

API wise the only change is that now minimum interface index can be
greater than 1.  The holes in interface indexes were always allowed.

Reviewed by:		kp
Differential revision:	https://reviews.freebsd.org/D33672

(cherry picked from commit 91f44749c6)
2022-05-05 14:38:07 -04:00
Marko Zec
d461deeaa4 VNET: Revert "ifnet: make if_index global"
This reverts commit 91f44749c6.

Devirtualization of V_if_index and V_ifindex_table was rushed into
the tree lacking proper context, discussion, and declaration of intent,
so I'm backing it out as harmful to VNET on the following grounds:

1) The change repurposed the decades-old and stable if_index KBI for
new, unclear goals which were omitted from the commit note.

2) The change opened up a new resource exhaustion vector where any vnet
could starve the system of ifnet indices, including vnet0.

3) To circumvent the newly introduced problem of separating ifnets
belonging to different vnets from the globalized ifindex_table, the
author introduced sysctl_ifcount() which does a linear traversal over
the (potentially huge) global ifnet list just to return a simple upper
bound on existing ifnet indices.

4) The change effectively led to nonuniform ifnet index allocation
among vnets.

5) The commit note clearly stated that the patch changed the implicit
if_index ABI contract where ifnet indices were assumed to be starting
from one.  The commit note also included a correct observation that
holes in interface indices were always allowed, but failed to declare
that the userland-observable ifindex tables could now include huge
empty spans even under modest operating conditions.

6) The author had an earlier proposal in the works which did not
affect per-vnet ifnet lists (D33265) but which he abandoned without
providing the rationale behind his decision to do so, at the expense
of sacrificing the vnet isolation contract and if_index ABI / KBI.

Furthermore, the author agreed to back out his changes himself and
to follow up with a proposal for a less intrusive alternative, but
later silently declined to act.  Therefore, I decided to resolve the
status-quo by backing this out myself.  This in no way precludes a
future proposal aiming to mitigate ifnet-removal related system
crashes or panics to be accepted, provided it would not unnecessarily
compromise the goal of as strict as possible isolation between vnets.

Obtained from: github.com/glebius/FreeBSD/commits/backout-ifindex
2022-05-03 19:27:57 +02:00
Marko Zec
6c741ffbfa Revert "mbuf: do not restore dying interfaces"
This reverts commit 703e533da5.

Revert "ifnet/mbuf: provide KPI to serialize/restore m->m_pkthdr.rcvif"

This reverts commit e1882428dc.

Obtained from: github.com/glebius/FreeBSD/commits/backout-ifindex
2022-05-03 19:11:40 +02:00
Gordon Bergling
1a15a383a6 net: Fix a typo in a source code comment
- s/peform/perform/

MFC after:	3 days
2022-04-09 11:37:57 +02:00
Gleb Smirnoff
964b8f8b99 ifnet: garbage collect unused function ifaddr_byindex().
Last use was removed in 5adea417d4.
2022-01-28 09:51:52 -08:00
Gleb Smirnoff
e1882428dc ifnet/mbuf: provide KPI to serialize/restore m->m_pkthdr.rcvif
Supplement ifindex table with generation count and use it to
serialize & restore an ifnet pointer.

Reviewed by:		kp
Differential revision:	https://reviews.freebsd.org/D33266
Fun note:		git show e6abef0918
2022-01-26 21:58:50 -08:00
Gleb Smirnoff
91f44749c6 ifnet: make if_index global
Now that ifindex is static to if.c we can unvirtualize it.  For lifetime
of an ifnet its index never changes.  To avoid leaking foreign interfaces
the net.link.generic.system.ifcount sysctl and the ifnet_byindex() KPI
filter their returned value on curvnet.  Since if_vmove() no longer
changes the if_index, inline ifindex_alloc() and ifindex_free() into
if_alloc() and if_free() respectively.

API wise the only change is that now minimum interface index can be
greater than 1.  The holes in interface indexes were always allowed.

Reviewed by:		kp
Differential revision:	https://reviews.freebsd.org/D33672
2022-01-26 21:58:44 -08:00
Gleb Smirnoff
54712fc423 if_vmove: improve restoration in cloner's ifgroup membership
* Do a single call into if_clone.c instead of two.  The cloner
  can't disappear since the interface sits on its list.
* Make restoration smarter - check that cloner with same name
  exists in the new vnet.

Differential revision:	https://reviews.freebsd.org/D33941
2022-01-24 21:06:59 -08:00
Ryan Stone
5adea417d4 Fix ifa refcount leak in ifa_ifwithnet()
In 4f6c66cc9c, ifa_ifwithnet() was changed to no longer
ifa_ref() the returned ifaddr, and instead the caller was required
to stay in the net_epoch for as long as they wanted the ifaddr
to remain valid.  However, this missed the case where an AF_LINK
lookup would call ifaddr_byindex(), which still does ifa_ref()
the ifaddr.  This would cause a refcount leak.

Fix this by inlining the relevant parts of ifaddr_byindex() here,
with the ifa_ref() call removed.  This also avoids an unnecessary
entry and exit from the net_epoch for this case.

I've audited all in-tree consumers of ifa_ifwithnet() that could
possibly perform an AF_LINK lookup and confirmed that none of them
will expect the ifaddr to have a reference that they need to
release.

MFC after: 2 months
Sponsored by: Dell Inc
Differential Revision:	https://reviews.freebsd.org/D28705
Reviewed by: melifaro
2022-01-06 15:04:24 -05:00
Mateusz Guzik
e735fa3212 net/if.c: plug set-but-not-unused vars
Sponsored by:	Rubicon Communications, LLC ("Netgate")
2021-12-09 20:39:40 +00:00
Gleb Smirnoff
7e0bba4d80 ifnet: make V_if_index static to if.c
This requires moving net.link.generic sysctl declaration from if_mib.c
to if.c.  Ideally if_mib.c needs just to be merged to if.c, but they
have different license texts.

Differential revision:	https://reviews.freebsd.org/D33263
2021-12-06 09:32:31 -08:00
Gleb Smirnoff
d74b7baeb0 ifnet_byindex() actually requires network epoch
Sweep over potentially unsafe calls to ifnet_byindex() and wrap them
in epoch.  Most of the code touched remains unsafe, as the returned
pointer is being used after epoch exit.  Mark that with a comment.

Validate the index argument inside the function, reducing argument
validation requirement from the callers and making V_if_index
private to if.c.

Reviewed by:		melifaro
Differential revision:	https://reviews.freebsd.org/D33263
2021-12-06 09:32:31 -08:00
Gleb Smirnoff
7b40b00fad ifnet: merge ifindex_alloc(), ifnet_setbyindex(), if_grow() and call magic
Now it is possible to just merge all this complexity into single
linear function.  Note that IFNET_WLOCK() is a sleepable lock, so
we can M_WAITOK and epoch_wait_preempt().

Reviewed by:		melifaro, bz, kp
Differential revision:	https://reviews.freebsd.org/D33262
2021-12-06 09:32:31 -08:00
Gleb Smirnoff
6ff4cac2ee ifnet: initial if_grow() shall always succeed
So let's just call malloc() directly.  This also avoids hidden
doubling of default V_if_indexlim.

Reviewed by:		melifaro, bz, kp
Differential revision:	https://reviews.freebsd.org/D33261
2021-12-06 09:32:31 -08:00
Gleb Smirnoff
450394af27 ifnet: use ck_pr(3) store & load setting ifnet pointer in ifindex
The lockless access to the array is protected by the network epoch.

Reviewed by:		bz, kp
Differential revision:	https://reviews.freebsd.org/D33260
2021-12-06 09:32:30 -08:00
Gleb Smirnoff
8062e5759c ifnet: allocate index at the end of if_alloc_domain()
Now that if_alloc_domain() never fails and actually doesn't
expose ifnet to outside we can eliminate IFNET_HOLD and two
step index allocation.

Reviewed by:		kp
Differential revision:	https://reviews.freebsd.org/D33259
2021-12-06 09:32:30 -08:00
Cy Schubert
db0ac6ded6 Revert "wpa: Import wpa_supplicant/hostapd commit 14ab4a816"
This reverts commit 266f97b5e9, reversing
changes made to a10253cffe.

A mismerge of a merge to catch up to main resulted in files being
committed which should not have been.
2021-12-02 14:45:04 -08:00
Cy Schubert
266f97b5e9 wpa: Import wpa_supplicant/hostapd commit 14ab4a816
This is the November update to vendor/wpa committed upstream 2021-11-26.

MFC after:      1 month
2021-12-02 13:35:14 -08:00
Gleb Smirnoff
9e93d2b335 ifnet: enable & fix if_debug build
Fixes:	ce40632a31
2021-12-02 10:59:43 -08:00
Gleb Smirnoff
3bc40f39fd if_free: add a comment explaining why ifindex_free() is performed here 2021-11-22 19:59:27 -08:00
Gleb Smirnoff
fe499a8452 ifnet: merge if_destroy() and if_free_internal() into one
New function has more meaningful name if_free_deferred() and has
its header comment fixed to reflect reality.  NFC
2021-11-22 19:53:12 -08:00
Gleb Smirnoff
4787572d05 ifnet: make if_alloc_domain() never fail
The last consumer of if_com_alloc() is firewire.  It never fails
to allocate.  Most likely the if_com_alloc() KPI will go away
together with if_fwip(), less likely new consumers of if_com_alloc()
will be added, but they would need to follow the no fail KPI.
2021-11-22 19:49:57 -08:00
Gleb Smirnoff
1e3ca25d92 ifnet: make if_alloc_domain() static 2021-11-22 19:49:57 -08:00
Gleb Smirnoff
ce40632a31 ifnet: append if_debug.c to if.c
With this change if_index can become static.  There is nothing
that if_debug.c would want to isolate from if.c.  Potentially
if.c wants to share everything with if_debug.c.

Move Bjoern's copyright to if.c.

Reviewed by:	bz
2021-11-22 19:49:57 -08:00
Gleb Smirnoff
8a6f38c8ac ifnet: garbage collect drbr_*_drv().
They were left in 62d76917b8 but after years proved not to be useful.
2021-11-22 19:49:57 -08:00
Mateusz Guzik
5a4e46f6ec net: whack "set but not used" warnings in net/if.c
Sponsored by:	Rubicon Communications, LLC ("Netgate")
2021-11-14 17:15:08 +00:00
Mark Johnston
b1e6a792d6 net: Enter a net epoch around protocol if_up/down notifications
When traversing a list of interface addresses, we need to be in a net
epoch section, and protocol ctlinput routines need a stable reference to
the address.

Reported by:	syzbot+3219af764ead146a3a4e@syzkaller.appspotmail.com
Reviewed by:	kp, melifaro
MFC after:	1 week
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D31889
2021-09-10 09:07:40 -04:00
Kristof Provost
01ad0c0079 net: disallow MTU changes on bridge member interfaces
if_bridge member interfaces should always have the same MTU as the
bridge itself, so disallow MTU changes on interfaces that are part of an
if_bridge.

Reviewed by:	donner
Sponsored by:	Rubicon Communications, LLC ("Netgate")
Differential Revision:	https://reviews.freebsd.org/D31304
2021-07-28 22:03:30 +02:00
Rozhuk Ivan
4fb3e0bb94 devctl: add RENAME devctl event for IFNET
Add devd event on network iface rename.

Reviewed by:		imp@,asomers@
Sponsored by:		Netflix
Differential Revision:	https://reviews.freebsd.org/D30839
2021-06-23 10:20:58 -06:00
Mark Johnston
ad22ba2b9f if: Remove unnecessary validation in the SIOCSIFNAME handler
A successful copyinstr() call guarantees that the returned string is
nul-terminated.  Furthermore, the removed check would harmlessly compare
an uninitialized byte with '\0' if the new name is shorter than
IFNAMESIZ - 1.

Reported by:	KMSAN
MFC after:	1 week
Sponsored by:	The FreeBSD Foundation
2021-05-12 12:52:06 -04:00
John Baldwin
ed93deba11 Remove a write-only variable.
While refactoring an earlier series of changes during review, the
'saved_data' variable stopped being used at the bottom of if_ioctl().

Suggested by:	brooks
Reviewed by:	brooks, imp, kib
Fixes:		d17e0940f7 Rework compat shims in ifioctl().
Sponsored by:	DARPA
Differential Revision:	https://reviews.freebsd.org/D30197
2021-05-11 14:56:23 -07:00
John Baldwin
9c87db4b3c Group all compat shim structures together to consolidate #ifdef's.
Reviewed by:	brooks, kib
Obtained from:	CheriBSD
Sponsored by:	DARPA
Differential Revision:	https://reviews.freebsd.org/D29894
2021-05-05 13:59:09 -07:00
John Baldwin
01e9cbc4c5 Use thunks for compat ioctls using struct ifgroupreq.
Reviewed by:	brooks, kib
Obtained from:	CheriBSD
Sponsored by:	DARPA
Differential Revision:	https://reviews.freebsd.org/D29893
2021-05-05 13:59:00 -07:00
John Baldwin
d61d98f4ed Add freebsd32 compat shims for SIOC[GS]DRVSPEC.
Reviewed by:	brooks, kib
Obtained from:	CheriBSD
Sponsored by:	DARPA
Differential Revision:	https://reviews.freebsd.org/D29892
2021-05-05 13:58:50 -07:00
John Baldwin
d17e0940f7 Rework compat shims in ifioctl().
Centralize logic for handling compat ioctls into two blocks of code at
the start and end of the ioctl routine.  This avoids the conversion
logic being spread out both in multiple blocks in ifioctl as well as
various helper functions.

Reviewed by:	brooks, kib
Obtained from:	CheriBSD
Sponsored by:	DARPA
Differential Revision:	https://reviews.freebsd.org/D29891
2021-05-05 13:58:23 -07:00
Mark Johnston
8e8f1cc9bb Re-enable network ioctls in capability mode
This reverts a portion of 274579831b ("capsicum: Limit socket
operations in capability mode") as at least rtsol and dhcpcd rely on
being able to configure network interfaces while in capability mode.

Reported by:	bapt, Greg V
Sponsored by:	The FreeBSD Foundation
2021-04-23 09:22:49 -04:00
Mark Johnston
274579831b capsicum: Limit socket operations in capability mode
Capsicum did not prevent certain privileged networking operations,
specifically creation of raw sockets and network configuration ioctls.
However, these facilities can be used to circumvent some of the
restrictions that capability mode is supposed to enforce.

Add capability mode checks to disallow network configuration ioctls and
creation of sockets other than PF_LOCAL and SOCK_DGRAM/STREAM/SEQPACKET
internet sockets.

Reviewed by:	oshogbo
Discussed with:	emaste
Reported by:	manu
Sponsored by:	The FreeBSD Foundation
MFC after:	2 weeks
Differential Revision:	https://reviews.freebsd.org/D29423
2021-04-07 14:32:56 -04:00
Adrian Chadd
25bfa44860 Add device and ifnet logging methods, similar to device_printf / if_printf
* device_printf() is effectively a printf
* if_printf() is effectively a LOG_INFO

This allows subsystems to log device/netif stuff using different log levels,
rather than having to invent their own way to prefix unit/netif  names.

Differential Revision: https://reviews.freebsd.org/D29320
Reviewed by: imp
2021-03-22 00:02:34 +00:00
Tai-hwa Liang
092f3f0812 net: fixing a memory leak in if_deregister_com_alloc()
Drain the callbacks upon if_deregister_com_alloc() such that the
if_com_free[type] won't be nullified before if_destroy().

Taking fwip(4) as an example, before this fix, kldunload if_fwip will
go through the following:

  1. fwip_detach()
  2. if_free() -> schedule if_destroy() through NET_EPOCH_CALL
  3. fwip_detach() returns
  4. firewire_modevent(MOD_UNLOAD) -> if_deregister_com_alloc()
  5. kernel complains about:
	Warning: memory type fw_com leaked memory on destroy (1 allocations, 64 bytes leaked).
  6. EPOCH runs if_destroy() -> if_free_internal()i

By this time, if_com_free[if_alloctype] is NULL since it's already
nullified by if_deregister_com_alloc(); hence, firewire_free() won't
have a chance to release the allocated fw_com.

Reviewed by:	hselasky, glebius
MFC after:	2 weeks
2021-03-06 14:43:16 +00:00
Alexander V. Chernikov
7563019bc6 Add if_try_ref() to simplify refcount handling inside epoch.
When we have an ifp pointer and the code is running inside epoch,
 epoch guarantees the pointer will not be freed.
However, the following case can still happen:

* in thread 1 we drop to refcount=0 for ifp and schedule its deletion.
* in thread 2 we use this ifp and reference it
* destroy callout kicks in
* unhappy user reports a bug

This can happen with the current implementation of ifnet_byindex_ref(),
 as we're not holding any locks preventing ifnet deletion by a parallel thread.

To address it, add if_try_ref(), allowing to return failure when
 referencing ifp with refcount=0.
Additionally, enforce existing if_ref() is with KASSERT to provide a
 cleaner error in such scenarios.

Finally, fix ifnet_byindex_ref() by using if_try_ref() and returning NULL
 if the latter fails.

MFC after:	2 weeks
Differential Revision:	https://reviews.freebsd.org/D28836
2021-02-22 23:37:59 +00:00
Alexander V. Chernikov
600eade2fb Add ifa_try_ref() to simplify ifa handling inside epoch.
More and more code migrates from lock-based protection to the NET_EPOCH
 umbrella. It requires some logic changes, including, notably, refcount
 handling.

When we have an `ifa` pointer and we're running inside epoch we're
 guaranteed that this pointer will not be freed.
However, the following case can still happen:
 * in thread 1 we drop to 0 refcount for ifa and schedule its deletion.
 * in thread 2 we use this ifa and reference it
 * destroy callout kicks in
 * unhappy user reports bug

To address it, new `ifa_try_ref()` function is added, allowing to return
 failure when we try to reference `ifa` with 0 refcount.
Additionally, existing `ifa_ref()` is enforced with `KASSERT` to provide
 cleaner error in such scenarious.

Reviewed By: rstone, donner
Differential Revision: https://reviews.freebsd.org/D28639
MFC after:	1 week
2021-02-16 20:14:50 +00:00
Kristof Provost
6d2a10d96f Widen ifnet_detach_sxlock coverage
Widen the ifnet_detach_sxlock to cover the entire vnet sysuninit code.
This ensures that we can't end up having the vnet_sysuninit free the UDP
pcb while the detach code is running and trying to purge the UDP pcb.

MFC after:	1 week
Differential Revision:	https://reviews.freebsd.org/D28530
2021-02-11 16:12:29 +01:00