Commit Graph

25 Commits

Author SHA1 Message Date
Warner Losh
685dc743dc sys: Remove $FreeBSD$: one-line .c pattern
Remove /^[\s*]*__FBSDID\("\$FreeBSD\$"\);?\s*\n/
2023-08-16 11:54:36 -06:00
Justin Hibbits
2c2b37ad25 ifnet/API: Move struct ifnet definition to a <net/if_private.h>
Hide the ifnet structure definition, no user serviceable parts inside,
it's a netstack implementation detail.  Include it temporarily in
<net/if_var.h> until all drivers are updated to use the accessors
exclusively.

Reviewed by:	glebius
Sponsored by:	Juniper Networks, Inc.
Differential Revision: https://reviews.freebsd.org/D38046
2023-01-24 14:36:30 -05:00
Zhenlei Huang
43f8c763cd if_me: Use dedicated network privilege
Separate if_me privileges from if_gif.

Reviewed by:		kp
Differential Revision:	https://reviews.freebsd.org/D36691
2022-10-15 17:05:36 +02:00
Gleb Smirnoff
c8ee75f231 Use network epoch to protect local IPv4 addresses hash.
The modification to the hash are already naturally locked by
in_control_sx.  Convert the hash lists to CK lists. Remove the
in_ifaddr_rmlock. Assert the network epoch where necessary.

Most cases when the hash lookup is done the epoch is already entered.
Cover a few cases, that need entering the epoch, which mostly is
initial configuration of tunnel interfaces and multicast addresses.

Reviewed by:		melifaro
Differential revision:	https://reviews.freebsd.org/D32584
2021-10-22 14:40:53 -07:00
Zhenlei Huang
62e1a437f3 routing: Allow using IPv6 next-hops for IPv4 routes (RFC 5549).
Implement kernel support for RFC 5549/8950.

* Relax control plane restrictions and allow specifying IPv6 gateways
 for IPv4 routes. This behavior is controlled by the
 net.route.rib_route_ipv6_nexthop sysctl (on by default).

* Always pass final destination in ro->ro_dst in ip_forward().

* Use ro->ro_dst to exract packet family inside if_output() routines.
 Consistently use RO_GET_FAMILY() macro to handle ro=NULL case.

* Pass extracted family to nd6_resolve() to get the LLE with proper encap.
 It leverages recent lltable changes committed in c541bd368f.

Presence of the functionality can be checked using ipv4_rfc5549_support feature(3).
Example usage:
  route add -net 192.0.0.0/24 -inet6 fe80::5054:ff:fe14:e319%vtnet0

Differential Revision: https://reviews.freebsd.org/D30398
MFC after:	2 weeks
2021-08-22 22:56:08 +00:00
Andrey V. Elsukov
dd4490fdab Add if_reassing method to all tunneling interfaces.
After r339550 tunneling interfaces have started handle appearing and
disappearing of ingress IP address on the host system.
When such interfaces are moving into VNET jail, they lose ability to
properly handle ifaddr_event_ext event. And this leads to need to
reconfigure tunnel to make it working again.

Since moving an interface into VNET jail leads to removing of all IP
addresses, it looks consistent, that tunnel configuration should also
be cleared. This is what will do if_reassing method.

Reported by:	John W. O'Brien <john saltant com>
MFC after:	1 week
2020-06-03 13:02:31 +00:00
Ed Maste
aeb665b538 remove extraneous double ;s in sys/ 2020-03-30 16:04:25 +00:00
Pawel Biernacki
7029da5c36 Mark more nodes as CTLFLAG_MPSAFE or CTLFLAG_NEEDGIANT (17 of many)
r357614 added CTLFLAG_NEEDGIANT to make it easier to find nodes that are
still not MPSAFE (or already are but aren’t properly marked).
Use it in preparation for a general review of all nodes.

This is non-functional change that adds annotations to SYSCTL_NODE and
SYSCTL_PROC nodes using one of the soon-to-be-required flags.

Mark all obvious cases as MPSAFE.  All entries that haven't been marked
as MPSAFE before are by default marked as NEEDGIANT

Approved by:	kib (mentor, blanket)
Commented by:	kib, gallatin, melifaro
Differential Revision:	https://reviews.freebsd.org/D23718
2020-02-26 14:26:36 +00:00
Gleb Smirnoff
97168be809 Mechanically substitute assertion of in_epoch(net_epoch_preempt) to
NET_EPOCH_ASSERT(). NFC
2020-01-15 05:45:27 +00:00
Gleb Smirnoff
b8a6e03fac Widen NET_EPOCH coverage.
When epoch(9) was introduced to network stack, it was basically
dropped in place of existing locking, which was mutexes and
rwlocks. For the sake of performance mutex covered areas were
as small as possible, so became epoch covered areas.

However, epoch doesn't introduce any contention, it just delays
memory reclaim. So, there is no point to minimise epoch covered
areas in sense of performance. Meanwhile entering/exiting epoch
also has non-zero CPU usage, so doing this less often is a win.

Not the least is also code maintainability. In the new paradigm
we can assume that at any stage of processing a packet, we are
inside network epoch. This makes coding both input and output
path way easier.

On output path we already enter epoch quite early - in the
ip_output(), in the ip6_output().

This patch does the same for the input path. All ISR processing,
network related callouts, other ways of packet injection to the
network stack shall be performed in net_epoch. Any leaf function
that walks network configuration now asserts epoch.

Tricky part is configuration code paths - ioctls, sysctls. They
also call into leaf functions, so some need to be changed.

This patch would introduce more epoch recursions (see EPOCH_TRACE)
than we had before. They will be cleaned up separately, as several
of them aren't trivial. Note, that unlike a lock recursion the
epoch recursion is safe and just wastes a bit of resources.

Reviewed by:	gallatin, hselasky, cy, adrian, kristof
Differential Revision:	https://reviews.freebsd.org/D19111
2019-10-07 22:40:05 +00:00
Andrey V. Elsukov
8796e291f8 Add the check that current VNET is ready and access to srchash is allowed.
This change is similar to r339646. The callback that checks for appearing
and disappearing of tunnel ingress address can be called during VNET
teardown. To prevent access to already freed memory, add check to the
callback and epoch_wait() call to be sure that callback has finished its
work.

MFC after:	20 days
2018-10-23 13:11:45 +00:00
Andrey V. Elsukov
df49ca9fd3 Add handling for appearing/disappearing of ingress addresses to if_me(4).
* register handler for ingress address appearing/disappearing;
* add new srcaddr hash table for fast softc lookup by srcaddr;
* when srcaddr disappears, clear IFF_DRV_RUNNING flag from interface,
  and set it otherwise;

MFC after:	1 month
Sponsored by:	Yandex LLC
2018-10-21 18:18:37 +00:00
Gleb Smirnoff
0a27163f8d Fix exiting an epoch(9) we never entered. May happen only with MAC. 2018-10-21 12:39:00 +00:00
Andrew Turner
5f901c92a8 Use the new VNET_DEFINE_STATIC macro when we are defining static VNET
variables.

Reviewed by:	bz
Sponsored by:	DARPA, AFRL
Differential Revision:	https://reviews.freebsd.org/D16147
2018-07-24 16:35:52 +00:00
Andrey V. Elsukov
98a8fdf6da Deduplicate the code.
Add generic function if_tunnel_check_nesting() that does check for
allowed nesting level for tunneling interfaces and also does loop
detection. Use it in gif(4), gre(4) and me(4) interfaces.

Differential Revision:	https://reviews.freebsd.org/D16162
2018-07-09 11:03:28 +00:00
Matt Macy
6573d7580b epoch(9): allow preemptible epochs to compose
- Add tracker argument to preemptible epochs
- Inline epoch read path in kernel and tied modules
- Change in_epoch to take an epoch as argument
- Simplify tfb_tcp_do_segment to not take a ti_locked argument,
  there's no longer any benefit to dropping the pcbinfo lock
  and trying to do so just adds an error prone branchfest to
  these functions
- Remove cases of same function recursion on the epoch as
  recursing is no longer free.
- Remove the the TAILQ_ENTRY and epoch_section from struct
  thread as the tracker field is now stack or heap allocated
  as appropriate.

Tested by: pho and Limelight Networks
Reviewed by: kbowling at llnw dot com
Sponsored by: Limelight Networks
Differential Revision: https://reviews.freebsd.org/D16066
2018-07-04 02:47:16 +00:00
Andrey V. Elsukov
6e081509db Add NULL pointer check.
encap_lookup_t method can be invoked by IP encap subsytem even if none
of gif/gre/me interfaces are exist. Hash tables are allocated on demand,
when first interface is created. So, make NULL pointer check before
doing access to hash table.

PR:		229378
2018-06-28 11:39:27 +00:00
Andrey V. Elsukov
9597ff83b8 Add missing BPF_MTAP2() for outbound packets. 2018-06-14 15:04:30 +00:00
Andrey V. Elsukov
2addcba7d5 Convert if_me(4) driver to use encap_lookup_t method and be lockless on
data path.
2018-06-14 14:53:24 +00:00
Andrey V. Elsukov
6d8fdfa9d5 Rework IP encapsulation handling code.
Currently it has several disadvantages:
- it uses single mutex to protect internal structures. It is used by
  data- and control- path, thus there are no parallelism at all.
- it uses single list to keep encap handlers for both INET and INET6
  families.
- struct encaptab keeps unneeded information (src, dst, masks, protosw),
  that isn't used by code in the source tree.
- matches are prioritized and when many tunneling interfaces are
  registered, encapcheck handler of each interface is invoked for each
  packet. The search takes O(n) for n interfaces. All this work is done
  with exclusive lock held.

What this patch includes:
- the datapath is converted to be lockless using epoch(9) KPI.
- struct encaptab now linked using CK_LIST.
- all unused fields removed from struct encaptab. Several new fields
  addedr: min_length is the minimum packet length, that encapsulation
  handler expects to see; exact_match is maximum number of bits, that
  can return an encapsulation handler, when it wants to consume a packet.
- IPv6 and IPv4 handlers are stored in separate lists;
- added new "encap_lookup_t" method, that will be used later. It is
  targeted to speedup lookup of needed interface, when gif(4)/gre(4) have
  many interfaces.
- the need to use protosw structure is eliminated. The only pr_input
  method was used from this structure, so I don't see the need to keep
  using it.
- encap_input_t method changed to avoid using mbuf tags to store softc
  pointer. Now it is passed directly trough encap_input_t method.
  encap_getarg() funtions is removed.
- all sockaddr structures and code that uses them removed. We don't have
  any code in the tree that uses them. All consumers use encap_attach_func()
  method, that relies on invoking of encapcheck() to determine the needed
  handler.
- introduced struct encap_config, it contains parameters of encap handler
  that is going to be registered by encap_attach() function.
- encap handlers are stored in lists ordered by exact_match value, thus
  handlers that need more bits to match will be checked first, and if
  encapcheck method returns exact_match value, the search will be stopped.
- all current consumers changed to use new KPI.

Reviewed by:	mmacy
Sponsored by:	Yandex LLC
Differential Revision:	https://reviews.freebsd.org/D15617
2018-06-05 20:51:01 +00:00
Andrey V. Elsukov
c00bf73076 Do not adjust interface MTU automatically. Leave this task to the system
administrator.

This restores the behavior that was prior to r274246.

No objection from:	#network
MFC after:	2 weeks
Differential Revision:	https://reviews.freebsd.org/D10215
2017-04-11 08:56:18 +00:00
Hiroki Sato
f1aaad0cd9 Add IFCAP_LINKSTATE support. 2015-10-03 09:15:23 +00:00
Andrey V. Elsukov
eccfe69a5c Add new socket ioctls SIOC[SG]TUNFIB to set FIB number of encapsulated
packets on tunnel interfaces. Add support of these ioctls to gre(4),
gif(4) and me(4) interfaces. For incoming packets M_SETFIB() should use
if_fib value from ifnet structure, use proper value in gre(4) and me(4).

Differential Revision:	https://reviews.freebsd.org/D2462
No objection from:	#network
MFC after:	2 weeks
Sponsored by:	Yandex LLC
2015-05-12 07:37:27 +00:00
Andrey V. Elsukov
b347bc3b7a Pass mtag argument into m_tag_locate() to continue the search from
the last found mtag.
2015-05-06 14:02:57 +00:00
Andrey V. Elsukov
f325335caf Overhaul if_gre(4).
Split it into two modules: if_gre(4) for GRE encapsulation and
if_me(4) for minimal encapsulation within IP.

gre(4) changes:
* convert to if_transmit;
* rework locking: protect access to softc with rmlock,
  protect from concurrent ioctls with sx lock;
* correct interface accounting for outgoing datagramms (count only payload size);
* implement generic support for using IPv6 as delivery header;
* make implementation conform to the RFC 2784 and partially to RFC 2890;
* add support for GRE checksums - calculate for outgoing datagramms and check
  for inconming datagramms;
* add support for sending sequence number in GRE header;
* remove support of cached routes. This fixes problem, when gre(4) doesn't
  work at system startup. But this also removes support for having tunnels with
  the same addresses for inner and outer header.
* deprecate support for various GREXXX ioctls, that doesn't used in FreeBSD.
  Use our standard ioctls for tunnels.

me(4):
* implementation conform to RFC 2004;
* use if_transmit;
* use the same locking model as gre(4);

PR:		164475
Differential Revision:	D1023
No objections from:	net@
Relnotes:	yes
Sponsored by:	Yandex LLC
2014-11-07 19:13:19 +00:00