freebsd-dev

Author	SHA1	Message	Date
Justin Hibbits	2c2b37ad25	ifnet/API: Move struct ifnet definition to a <net/if_private.h> Hide the ifnet structure definition, no user serviceable parts inside, it's a netstack implementation detail. Include it temporarily in <net/if_var.h> until all drivers are updated to use the accessors exclusively. Reviewed by: glebius Sponsored by: Juniper Networks, Inc. Differential Revision: https://reviews.freebsd.org/D38046	2023-01-24 14:36:30 -05:00
Alexander V. Chernikov	91ebcbe02a	if_clone: migrate some consumers to the new KPI. Convert most of the cloner customers who require custom params to the new if_clone KPI. Reviewed by: kp Differential Revision: https://reviews.freebsd.org/D36636 MFC after: 2 weeks	2022-09-22 12:30:09 +00:00
Mateusz Guzik	497240def8	Retire clone_drain_lock It is only ever xlocked in drain_dev_clone_events and the only consumer of that routine does not need it -- eventhandler code already makes sure the relevant callback is no longer running. Reviewed by: kib Differential Revision: https://reviews.freebsd.org/D36268	2022-08-20 09:44:05 +00:00
Zhenlei Huang	62e1a437f3	routing: Allow using IPv6 next-hops for IPv4 routes (RFC 5549). Implement kernel support for RFC 5549/8950. * Relax control plane restrictions and allow specifying IPv6 gateways for IPv4 routes. This behavior is controlled by the net.route.rib_route_ipv6_nexthop sysctl (on by default). * Always pass final destination in ro->ro_dst in ip_forward(). * Use ro->ro_dst to extract packet family inside if_output() routines. Consistently use RO_GET_FAMILY() macro to handle ro=NULL case. * Pass extracted family to nd6_resolve() to get the LLE with proper encap. It leverages recent lltable changes committed in `c541bd368f`. Presence of the functionality can be checked using ipv4_rfc5549_support feature(3). Example usage: route add -net 192.0.0.0/24 -inet6 fe80::5054:ff:fe14:e319%vtnet0 Differential Revision: https://reviews.freebsd.org/D30398 MFC after: 2 weeks	2021-08-22 22:56:08 +00:00
Kyle Evans	51221b68fb	tuntap: clean up cc --analyze One complaint of a dead-store, smack it with a __diagused.	2021-07-21 19:14:43 -05:00
Alexander V. Chernikov	a6b7689718	Remove redundant rtinit() calls from tuntap. The removed code iterates over if_addrhead and tries to remove routes for each ifa. This is exactly what if_purgeaddrs() does, and if_purgeaddrs() is already called at the end. Reviewed by: glebius MFC after: 2 weeks Differential revision: https://reviews.freebsd.org/D28106	2021-01-13 10:03:15 +00:00
Mateusz Guzik	662c13053f	net: clean up empty lines in .c and .h files	2020-09-01 21:19:14 +00:00
Kyle Evans	cef5fc74c2	tuntap: drop redundant if_mtu assignment in tuncreate ether_ifattach will immediately clobber if_mtu with ETHERMTU anyways, just let it happen. MFC after: 1 week	2020-07-16 15:02:11 +00:00
Pawel Biernacki	7029da5c36	Mark more nodes as CTLFLAG_MPSAFE or CTLFLAG_NEEDGIANT (17 of many) r357614 added CTLFLAG_NEEDGIANT to make it easier to find nodes that are still not MPSAFE (or already are but aren’t properly marked). Use it in preparation for a general review of all nodes. This is a non-functional change that adds annotations to SYSCTL_NODE and SYSCTL_PROC nodes using one of the soon-to-be-required flags. Mark all obvious cases as MPSAFE. All entries that haven't been marked as MPSAFE before are by default marked as NEEDGIANT. Approved by: kib (mentor, blanket) Commented by: kib, gallatin, melifaro Differential Revision: https://reviews.freebsd.org/D23718	2020-02-26 14:26:36 +00:00
Gleb Smirnoff	af614b8e04	tap(4) calls ether_input() in context of write(2). Enter network epoch here. The tun(4) side doesn't need this, as netisr code will take care.	2020-01-23 01:38:51 +00:00
Kyle Evans	f7810883d4	tuntap(4): Fix NOINET build after r353741 Shuffle headers around to more appropriate #ifdef OPTION blocks (INET vs. INET6) -- double checked LINT-{NOINET,NOINET6,NOIP}, all seem good. Reported by: cem	2019-10-23 02:15:15 +00:00
Kyle Evans	200abb43c0	tuntap(4): properly declare if_tun and if_tap modules Simply adding MODULE_VERSION does not do the trick, because the modules haven't been declared. This should actually fix modfind/kldstat, which r351229 aimed and failed to do. This should make vm-bhyve do the right thing again when using the ports version, rather than the latest version not in ports. MFC after: 3 days	2019-10-22 00:18:16 +00:00
Kyle Evans	3d5013337a	tuntap(4): restrict scope of net.link.tap.user_open slightly net.link.tap.user_open has historically allowed non-root users to do devfs cloning and open /dev/tap* nodes based on permissions. Loosen this up to make it only allow users to do devfs cloning -- we no longer check it in tunopen. This allows tap devices to be created that can actually be opened by a user, rather than swiftly restricting them to root because the magic sysctl has not been set. The sysctl has not yet been completely deprecated, because more thought is needed for how to handle the devfs cloning case. There is not an easy suitable replacement for the sysctl there, and more care needs to be placed in determining whether that's OK or not. PR: 200185	2019-10-21 14:38:11 +00:00
Kyle Evans	6025077704	tuntap(4): use cdevpriv w/ dtor for last close instead of d_close cdevpriv dtors will be called when the reference count on the associated struct file drops to 0, while d_close can be unreliable for cleaning up state at "last close" for a number of reasons. As far as tunclose/tundtor is concerned the difference is minimal, so make the switch.	2019-10-20 22:55:47 +00:00
Kyle Evans	6869d530c7	tuntap(4): Use make_dev_s to avoid si_drv1 race This allows us to avoid some dance in tunopen for dealing with the possibility of dev->si_drv1 being NULL as it's set prior to the devfs node being created in all cases. There's still the possibility that the tun device hasn't been fully initialized, since that's done after the devfs node was created. Alleviate this by returning ENXIO if we're not to that point of tuncreate yet. This work is what sparked r353128, full initialization of cloned devices w/ specified make_dev_args.	2019-10-20 22:39:40 +00:00
Kyle Evans	486c0b2269	tuntap(4): break out after setting TUN_DSTADDR This is now the only flag we set in this loop, terminate early.	2019-10-20 21:06:25 +00:00
Kyle Evans	6041d76e0c	tuntap(4): Drop TUN_IASET This flag appears to have been effectively unused since introduction to if_tun(4) -- drop it now.	2019-10-20 21:03:48 +00:00
Vincenzo Maffione	f8bc74e2f4	tap: add support for virtio-net offloads This patch is part of an effort to make bhyve networking (in particular TCP) faster. The key strategy to enhance TCP throughput is to let the whole packet datapath work with TSO/LRO packets (up to 64KB each), so that the per-packet overhead is amortized over a large number of bytes. This capability is supported in the guest by means of the vtnet(4) driver, which is able to handle TSO/LRO packets leveraging the virtio-net header (see struct virtio_net_hdr and struct virtio_net_hdr_mrg_rxbuf). A bhyve VM exchanges packets with the host through a network backend, which can be vale(4) or if_tap(4). While vale(4) supports TSO/LRO packets, if_tap(4) does not. This patch extends if_tap(4) with the ability to understand the virtio-net header, so that a tapX interface can process TSO/LRO packets. A couple of ioctl commands have been added to configure and probe the virtio-net header. Once the virtio-net header is set, the tapX interface acquires all the IFCAP capabilities necessary for TSO/LRO. Reviewed by: kevans Differential Revision: https://reviews.freebsd.org/D21263	2019-10-18 21:53:27 +00:00
Gleb Smirnoff	73c96bbeac	Don't use if_maddr_rlock() in tuntap(4), use epoch(9) directly instead.	2019-10-10 23:51:14 +00:00
Gleb Smirnoff	b8a6e03fac	Widen NET_EPOCH coverage. When epoch(9) was introduced to network stack, it was basically dropped in place of existing locking, which was mutexes and rwlocks. For the sake of performance mutex covered areas were as small as possible, so became epoch covered areas. However, epoch doesn't introduce any contention, it just delays memory reclaim. So, there is no point to minimise epoch covered areas in sense of performance. Meanwhile entering/exiting epoch also has non-zero CPU usage, so doing this less often is a win. Not the least is also code maintainability. In the new paradigm we can assume that at any stage of processing a packet, we are inside network epoch. This makes coding both input and output path way easier. On output path we already enter epoch quite early - in the ip_output(), in the ip6_output(). This patch does the same for the input path. All ISR processing, network related callouts, other ways of packet injection to the network stack shall be performed in net_epoch. Any leaf function that walks network configuration now asserts epoch. Tricky part is configuration code paths - ioctls, sysctls. They also call into leaf functions, so some need to be changed. This patch would introduce more epoch recursions (see EPOCH_TRACE) than we had before. They will be cleaned up separately, as several of them aren't trivial. Note, that unlike a lock recursion the epoch recursion is safe and just wastes a bit of resources. Reviewed by: gallatin, hselasky, cy, adrian, kristof Differential Revision: https://reviews.freebsd.org/D19111	2019-10-07 22:40:05 +00:00
Kyle Evans	291287667c	tuntap(4): loosen up tunclose restrictions Realistically, this cannot work. We don't allow the tun to be opened twice, so it must be done via fd passing, fork, dup, some mechanism like these. Applications demonstrably do not enforce strict ordering when they're handing off tun devices, so the parent closing before the child will easily leave the tun/tap device in a bad state where it can't be destroyed and a confused user because they did nothing wrong. Concede that we can't leave the tun/tap device in this kind of state because of software not playing the TUNSIFPID game, but it is still good to find and fix this kind of thing to keep ifconfig(8) up-to-date and help ensure good discipline in tun handling. MFC after: 3 days	2019-10-04 13:43:07 +00:00
Kyle Evans	59997c3c46	if_tuntap: create /dev aliases when a tuntap device gets renamed Currently, if you do: $ ifconfig tun0 create $ ifconfig tun0 name wg0 $ ls -l /dev | egrep 'wg|tun' You will see tun0, but no wg0. In fact, it's slightly more annoying to make the association between the new name and the old name in order to open the device (if it hadn't been opened during the rename). Register an eventhandler for ifnet_arrival_events and catch interface renames. We can determine if the ifnet is a tun easily enough from the if_dname, which matches the cdevsw.d_name from the associated tuntap_driver. Some locking dance is required because renames don't require the device to be opened, so it could go away in the middle of handling the ioctl, but as soon as we've verified this isn't the case we can attempt to busy the tun and either bail out if the tun device is dying, or we can proceed with the rename. We only create these aliases on a best-effort basis. Renaming a tun device to "usbctl", which doesn't exist as an ifnet but does as a /dev, is clearly not that disastrous, but we can't and won't create a /dev for that.	2019-10-03 17:54:00 +00:00
Kyle Evans	c4cad1549e	if_tuntap: add a busy/unbusy mechanism, replace destroy OPEN check A future commit will create device aliases when a tuntap device is renamed so that it's still easily found in /dev after the rename. Said mechanism will want to keep the tun alive long enough to either realize that it's about to go away or complete the alias creation, even if the alias is about to get destroyed. While we're introducing it, using it to prevent open devices from going away makes plenty of sense and keeps the logic on waking up tun_destroy clean, so we don't have multiple places trying to cv_broadcast unless it's still in use elsewhere.	2019-10-03 17:46:27 +00:00
Kyle Evans	5c4eed8601	tuntap: belatedly add MODULE_VERSION for if_tun and if_tap When tun/tap were merged, appropriate MODULE_VERSION should have been added for things like modfind(2) to continue to do the right thing with the old names. Reported by: jhb	2019-08-19 19:01:59 +00:00
Vincenzo Maffione	b5b83671ea	if_tuntap: minor improvements Rewrite a loop to avoid duplicating the exit condition. Simplify mask processing in tunpoll(). Fix minor typos. Reviewed by: kevans, markj MFC after: 2 weeks Differential Revision: https://reviews.freebsd.org/D21302	2019-08-19 17:23:22 +00:00
Kyle Evans	0dbac71f19	if_tuntap(4): Add TUNGIFNAME This effectively just moves TAPGIFNAME into common ioctl territory. MFC after: 3 days	2019-07-25 22:23:34 +00:00
Conrad Meyer	e2e050c8ef	Extract eventhandler declarations to sys/_eventhandler.h This allows replacing "sys/eventhandler.h" includes with "sys/_eventhandler.h" in other header files (e.g., sys/{bus,conf,cpu}.h) and reduces header pollution substantially. EVENTHANDLER_DECLARE and EVENTHANDLER_LIST_DECLAREs were moved out of .c files into appropriate headers (e.g., sys/proc.h, powernv/opal.h). As a side effect of reduced header pollution, many .c files and headers no longer contain needed definitions. The remainder of the patch addresses adding appropriate includes to fix those files. LOCK_DEBUG and LOCK_FILE_LINE_ARG are moved to sys/_lock.h, as required by sys/mutex.h since r326106 (but silently protected by header pollution prior to this change). No functional change (intended). Of course, any out of tree modules that relied on header pollution for sys/eventhandler.h, sys/lock.h, or sys/mutex.h inclusion need to be fixed. __FreeBSD_version has been bumped.	2019-05-20 00:38:23 +00:00
Kyle Evans	db226f0d8e	tuntap: Defer clearing if_softc until after if_detach r346670 added an sx to close a race between the ifioctl handler and interface destruction. Unfortunately, it clears if_softc immediately after the interface is closed, but before if_detach has been invoked. Any time before detachment, an interface that's part of a bridge may still receive traffic that's pushed through tunstart/tunstart_l2 and promptly lead to a panic because if_softc is now NULL. Fix it by deferring the clearing of if_softc until after the interface has detached and thus been removed from the bridge. if_softc still gets cleared in case another thread has already entered the ioctl handler before it's replaced with ifdead_ioctl. Reported by: markj MFC after: 3 days	2019-05-14 20:32:29 +00:00
Kyle Evans	81b3b91e6b	tuntap: Improve style No functional change. tun_flags of the tuntap_driver was renamed to ident_flags to reflect the fact that it's a subset of the tun_flags that identifies a tuntap device. This maps more easily (visually) to the TUN_DRIVER_IDENT_MASK that masks off the bits of tun_flags that are applicable to tuntap driver ident. This is a purely cosmetic change.	2019-05-11 04:18:06 +00:00
Kyle Evans	16760d8e28	tuntap: Don't down tap interfaces if LINK0 is set	2019-05-09 18:54:29 +00:00
Kyle Evans	a6fa049545	tuntap: Properly detach tap ifp	2019-05-09 14:06:24 +00:00
Kyle Evans	251a32b5b2	tun/tap: merge and rename to `tuntap` tun(4) and tap(4) share the same general management interface and have a lot in common. Bugs exist in tap(4) that have been fixed in tun(4), and vice-versa. Let's reduce the maintenance requirements by merging them together and using flags to differentiate between the three interface types (tun, tap, vmnet). This fixes a couple of tap(4)/vmnet(4) issues right out of the gate: - tap devices may no longer be destroyed while they're open [0] - VIMAGE issues already addressed in tun by kp [0] emaste had removed an easy-panic-button in r240938 due to devdrn blocking. A naive glance over this leads me to believe that this isn't quite complete -- destroy_devl will only block while executing d_* functions, but doesn't block the device from being destroyed while a process has it open. The latter is the intent of the condvar in tun, so this is "fixed" (for certain definitions of the word -- it wasn't really broken in tap, it just wasn't quite ideal). ifconfig(8) also grew the ability to map an interface name to a kld, so that `ifconfig {tun,tap}0` can continue to autoload the correct module, and `ifconfig vmnet0 create` will now autoload the correct module. This is a low overhead addition. (MFC commentary) This may get MFC'd if many bugs in tun(4)/tap(4) are discovered after this, and how critical they are. Changes after this are likely easily MFC'd without taking this merge, but the merge will be easier. I have no plans to do this MFC as of now. Reviewed by: bcr (manpages), tuexen (testing, syzkaller/packetdrill) Input also from: melifaro Relnotes: yes Differential Revision: https://reviews.freebsd.org/D20044	2019-05-08 02:32:11 +00:00

32 Commits