Commit Graph

259 Commits

Author SHA1 Message Date
Vincenzo Maffione
dd6ab49a9a netmap: add a tunable for the maximum number of VALE switches
The new dev.netmap.max_bridges sysctl tunable can be set in
loader.conf(5) to change the default maximum number of VALE
switches that can be created. Current defaults is 8.

MFC after:	2 weeks
2022-03-06 17:29:44 +00:00
Vincenzo Maffione
09a1893398 netmap: fix refcount bug in netmap allocator
Symptom: when a single extmem memory region is provided to netmap
multiple times, for multiple interfaces, the memory region is
never released by netmap once all the existing file descriptors
are closed.

Fix the relevant condition in netmap_mem_drop(): release the memory
when the last user of netmap_adapter is gone, rather then when
the last user of netmap_mem_d is gone.

MFC after:	2 weeks
2022-03-06 16:39:16 +00:00
Vincenzo Maffione
3e3314a8b7 netmap: fix uint32_t overflow in pool size calculation
MFC after:	1 week
2021-09-26 13:56:33 +00:00
Vincenzo Maffione
6127ce9d91 netmap: monitor: support offsets in copy mode 2021-09-26 13:52:16 +00:00
Vincenzo Maffione
98399ab06f netmap: import changes from upstream
- make sure rings are disabled during resets
 - introduce netmap_update_hostrings_mode(), with support
   for multiple host rings
 - always initialize ni_bufs_head in netmap_if
      ni_bufs_head was not properly initialized when no external buffers were
      requestedx and contained the ni_bufs_head from the last request. This
      was causing spurious buffer frees when alternating between apps that
      used external buffers and apps that did not use them.
 - check na validitity under lock on detach
 - netmap_mem: fix leak on error path
 - nm_dispatch: fix compilation on Raspberry Pi

MFC after:	2 weeks
2021-08-22 09:31:05 +00:00
Vincenzo Maffione
f4a54f4333 netmap: use safer defaults for hwbuf_len
We must make sure that incoming packets will never overflow the netmap
buffers, even when the user is using the offset feature. In the typical
scenario, the netmap buffer is 2KiB and, with an MTU of 1500, there are
~500 bytes available for user offsets.

Unfortunately, some NICs accept incoming packets even when they are
larger then the MTU. This means that the only way to stop DMA from
overflowing the netmap buffers, when offsets are allowed, is to choose
a hardware buffer length which is smaller than the netmap buffer
length. For most NICs and for 2KiB netmap buffers, this means 1024
bytes, which is unconveniently small.

The current code will select the small hardware buf size even when
offsets are not     in use. The main purpose of this change is to
fix this bug by returning to the normal behavior for the no-offsets
case.

At the same time, the patch pushes the handling of the offset case
to the lower level driver code, so that it can be made NIC-specific
(in future patches).
2021-04-18 13:39:15 +00:00
Cy Schubert
b51f459a20 wpa: Import wpa_supplicant/hostapd commit f91680c15
This is the April update to vendor/wpa committed upstream
2021/04/07.

This is MFV efec822389.

Suggested by:		philip
Reviewed by:		philip
MFC after:		2 months
Differential Revision:	https://reviews.freebsd.org/D29744
2021-04-17 07:21:12 -07:00
Vincenzo Maffione
13c4641188 netmap: make sure rings are disabled during resets
Explicitly disable ring synchronization before calling
callbacks that may result in a hardware reset.

Before this patch we relied on capturing the down/up events which,
however, may not be issued by all drivers.
2021-04-17 14:02:47 +00:00
Vincenzo Maffione
70275a6735 netmap: don't use linux type struct device *
Such type cannot be used in code that is in common between
FreeBSD and Linux. Use the FreeBSD type instead.

MFC after:	3 days
Reported by:	markj
Differential Revision:	https://reviews.freebsd.org/D29677
2021-04-11 21:13:01 +00:00
Vincenzo Maffione
172c5eb272 netmap: vtnet: remove unused variable
Reported by:	bdragon
2021-04-09 19:33:41 +00:00
Vincenzo Maffione
15dc713ceb netmap: vtnet: add support for netmap offsets
Follow-up change to a6d768d845.
This change adds support for netmap offsets.
2021-04-07 21:32:20 +00:00
Vincenzo Maffione
45c67e8f6b netmap: several typo fixes
No functional changes intended.
2021-04-02 07:01:20 +00:00
Vincenzo Maffione
66671ae589 netmap: fix typo bug in netmap_compute_buf_len 2021-04-02 06:47:28 +00:00
Vincenzo Maffione
660a47cb99 netmap: monitor: add a flag to distinguish packet direction
The netmap monitor intercepts any TX/RX packets on the monitored
port. However, before this change there was no way to tell
whether an intercepted packet was being transmitted or received
on the monitored port.
A TXMON flag in the netmap slot has been added for this purpose.
2021-03-29 16:32:54 +00:00
Vincenzo Maffione
a6d768d845 netmap: add kernel support for the "offsets" feature
This feature enables applications to ask netmap to transmit or
receive packets starting at a user-specified offset from the
beginning of the netmap buffer. This is meant to ease those
packet manipulation operations such as pushing or popping packet
headers, that may be useful to implement software switches,
routers and other packet processors.
To use the feature, drivers (e.g., iflib, vtnet, etc.) must have
explicit support. This change does not add support for any driver,
but introduces the necessary kernel changes. However, offsets support
is already included for VALE ports and pipes.
2021-03-29 16:29:01 +00:00
Vincenzo Maffione
ee7ffaa2e6 netmap: fix issues in nm_os_extmem_create()
- Call vm_object_reference() before vm_map_lookup_done().
- Use vm_mmap_to_errno() to convert vm_map_* return values to errno.
- Fix memory leak of e->obj.

Reported by:	markj
Reviewed by:	markj
MFC after:	1 week
2021-03-20 17:15:50 +00:00
Vincenzo Maffione
0ab5902e8a netmap: fix memory leak in NETMAP_REQ_PORT_INFO_GET
The netmap_ioctl() function has a reference counting bug in case of
NETMAP_REQ_PORT_INFO_GET command. When `hdr->nr_name[0] == '\0'`,
the function does not decrease the refcount of "nmd", which is
increased by netmap_mem_find(), causing a refcount leak.

Reported by:	Xiyu Yang <sherllyyang00@gmail.com>
Submitted by:	Carl Smith <carl.smith@alliedtelesis.co.nz>
MFC after: 3 days
PR:	254311
2021-03-15 17:39:18 +00:00
Mark Johnston
fef8450971 netmap: Stop printing a line to the dmesg in netmap_init()
netmap is compiled into the kernel by default so initialization was
always reported, and netmap uses a formatting convention not used in the
rest of the kernel.

Reviewed by:	vmaffione
MFC after:	1 week
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D29099
2021-03-05 18:07:47 -05:00
Vincenzo Maffione
ee0005f11f netmap: simplify parameter passing
Changes imported from the netmap github.
2021-01-24 21:59:02 +00:00
Vincenzo Maffione
3005e10ddb netmap: vtnet: fix RX initialization after netmap_reset()
At device reset, we must not publish those netmap receive buffers
that are owned by userspace (nm_kr_rxspace).

MFC after:	1 week
2021-01-11 21:38:32 +00:00
Vincenzo Maffione
55f0ad5fde netmap: restore hwofs and support it in iflib
Restore the hwofs functionality temporarily disabled by
7ba6ecf216 to prevent issues with iflib.
This patch brings the necessary changes to iflib to
enable howfs to allow interface restarts without
disrupting netmap applications actively using its
rings.
After this change, it becomes possible for multiple
non-cooperating netmap applications to use non-overlapping
subsets of the available netmap rings without clashing
with each other.

PR:		252453
MFC after:	1 week
2021-01-10 22:51:15 +00:00
Vincenzo Maffione
bb714db6d3 netmap: vtnet: enable/disable krings on any interface reinit
See 3d65fd97e8 for a detailed explanation.

PR:             252453
MFC after:      1 week
2021-01-10 14:10:09 +00:00
Vincenzo Maffione
9ac59d42c0 netmap: vtnet: stop krings during interface reset
Similarly to what done for iflib in 1d238b07d5,
this patch prevents access to the krings during the interface
reset triggered by netmap_register().

MFC after:	1 week
2021-01-09 22:34:52 +00:00
Vincenzo Maffione
7ba6ecf216 netmap: refactor netmap_reset
The netmap_reset() function is meant to be called by the driver
when they initialize (or re-initialize) a hardware ring.
However, since the introduction of support for opening (in
netmap mode) a subset of the available rings, netmap_reset()
may be called multiple times on actively used rings, causing
both kring and netmap ring to transition to an inconsistent
state.
This changes improves the situation by resetting all the
indices fields of the kring to 0, as expected after the
reinitialization of a hardware ring.

PR:	    252518
MFC after:  1 week
2021-01-09 22:07:24 +00:00
Vincenzo Maffione
1d238b07d5 netmap: iflib: stop krings during interface reset
When different processes open separate subsets of the
available rings of a same netmap interface, a device
reset may be performed while one of the processes
is actively using some rings (e.g., caused by another
process executing a nmport_open()).
With this patch, such situation will cause the
active process to get a POLLERR, so that it can
have a chance to detect the situation.
We also guarantee that no process is running a txsync
or rxsync (ioctl or poll) while an iflib device reset
is in progress.

PR:	    252453
MFC after:  1 week
2021-01-09 21:01:46 +00:00
Vincenzo Maffione
174f809da5 netmap: fix mutex double unlock bug
https://github.com/luigirizzo/netmap/pull/733

Submitted by:	 brian90013
MFC after:	3 days
2020-10-22 20:21:11 +00:00
Vincenzo Maffione
b7d6913862 netmap: use FreeBSD guards for epoch calls
EPOCH calls are FreeBSD specific. Use guards to protect these, so
that the code can compile under Linux.

MFC after:	1 week
2020-08-24 20:28:21 +00:00
Vincenzo Maffione
ff48ef48ac netmap: fix parsing of legacy nmr->nr_ringid
Code was checking for NETMAP_{SW,HW}_RING in req->nr_ringid which
had already been masked by NETMAP_RING_MASK. Therefore, the comparisons
always failed and set NR_REG_ALL_NIC. Check against the original nmr
structure.

Submitted by:	bpoole@packetforensics.com
Reported by:	bpoole@packetforensics.com
Reviewed by:	giuseppe.lettieri@unipi.it
Approved by:	vmaffione
MFC after:	1 week
2020-08-18 08:03:28 +00:00
Vincenzo Maffione
16f224b5f8 netmap: vtnet: fix races in vtnet_netmap_reg()
The nm_register callback needs to call nm_set_native_flags()
or nm_clear_native_flags() once the device has been stopped.
However, in the current implementation this is not true,
as the device is stopped by vtnet_init_locked(). This causes
race conditions where the driver crashes as soon as it
dequeues netmap buffers assuming they are mbufs (or the other
way around).
To fix the issue, we extend vtnet_init_locked() with a second
argument that, if not zero, will set/clear the netmap flags.
This results in a huge simplification of the nm_register
callback itself.
Also, use netmap_reset() to check if a ring is going to be
re-initialized in netmap mode.

MFC after:	1 week
2020-06-14 20:47:31 +00:00
Vincenzo Maffione
6682323732 netmap: introduce netmap_kring_on()
This function returns NULL if the ring identified by
queue id and direction is in netmap mode. Otherwise
return the corresponding kring.
Use this function to replace vtnet_netmap_queue_on().

MFC after:	1 week
2020-06-11 20:35:28 +00:00
Vincenzo Maffione
e8c07b1246 netmap: vtnet: clean up rxsync disabled logs
MFC after:	1 week
2020-06-03 17:47:32 +00:00
Vincenzo Maffione
1b6d5a80a6 netmap: vtnet: fix race condition in rxsync
This change prevents a race that happens when rxsync dequeues
N-1 rx packets (with N being the size of the netmap rx ring).
In this situation, the loop exits without re-enabling the
rx interrupts, thus causing the VQ to stall.

MFC after:	1 week
2020-06-03 17:46:21 +00:00
Vincenzo Maffione
2d769e25b1 netmap: vtnet: add vtnrx_nm_refill index to receive queues
The new index tracks the next netmap slot that is going
to be enqueued into the virtqueue. The index is necessary
to prevent the receive VQ and the netmap rx ring from going
out of sync, considering that we never enqueue N slots, but
at most N-1. This change fixes a bug that causes the VQ
and the netmap ring to go out of sync after N-1 packets
have been received.

MFC after:	1 week
2020-06-03 17:42:17 +00:00
Vincenzo Maffione
06f6997eb5 netmap: vale: fix disabled logs
MFC after:	1 week
2020-06-03 05:49:19 +00:00
Vincenzo Maffione
81d2cade1c netmap: vtnet: remove leftover memory barriers
MFC after:	1 week
2020-06-03 05:48:42 +00:00
Vincenzo Maffione
9ec71596c0 netmap: if_vtnet: avoid netmap ring wraparound
netmap assumes the one "slot" is left unused to distinguish
the empty ring and full ring conditions. This assumption was
violated by vtnet_netmap_rxq_populate().

MFC after:	1 week
2020-06-01 16:14:29 +00:00
Vincenzo Maffione
36f2d67026 netmap: if_vtnet: replace vtnet_free_used()
The functionality contained in this function is duplicated,
as it is already available in vtnet_txq_free_mbufs()
and vtnet_rxq_free_mbufs().

MFC after:	1 week
2020-06-01 16:12:09 +00:00
Vincenzo Maffione
c9de157d36 netmap: vtnet: fix RX virtqueue initialization bug
The vtnet_netmap_rxq_populate() function erroneously assumed
that kring->nr_hwcur = 0, i.e. the kring was in the initial
state. However, this is not always the case: for example,
when a vtnet reinit is triggered by some changes in the
interface flags or capenable.
This patch changes the behaviour of vtnet_netmap_kring_refill()
so that it always starts publishing the netmap buffers starting
from the current value of kring->nr_hwcur.

MFC after:	1 week
2020-06-01 16:10:44 +00:00
Pawel Biernacki
7029da5c36 Mark more nodes as CTLFLAG_MPSAFE or CTLFLAG_NEEDGIANT (17 of many)
r357614 added CTLFLAG_NEEDGIANT to make it easier to find nodes that are
still not MPSAFE (or already are but aren’t properly marked).
Use it in preparation for a general review of all nodes.

This is non-functional change that adds annotations to SYSCTL_NODE and
SYSCTL_PROC nodes using one of the soon-to-be-required flags.

Mark all obvious cases as MPSAFE.  All entries that haven't been marked
as MPSAFE before are by default marked as NEEDGIANT

Approved by:	kib (mentor, blanket)
Commented by:	kib, gallatin, melifaro
Differential Revision:	https://reviews.freebsd.org/D23718
2020-02-26 14:26:36 +00:00
Gleb Smirnoff
6c3e93cb5a Use NET_TASK_INIT() and NET_GROUPTASK_INIT() for drivers that process
incoming packets in taskqueue context.

Reviewed by:	hselasky
Differential Revision:	https://reviews.freebsd.org/D23518
2020-02-11 18:57:07 +00:00
Vincenzo Maffione
723180da59 netmap: improve netmap(4) and vale(4) man pages
Clean up obsolete sysctl descriptions and add missing ones.

PR:		243838
Reviewed by:	bcr
MFC after:	1 week
Differential Revision:	https://reviews.freebsd.org/D23546
2020-02-07 19:26:26 +00:00
Vincenzo Maffione
de27b30340 netmap_mem_unmap: fix NULL pointer dereference
MFC after:	3 days
2020-01-26 21:34:46 +00:00
Gleb Smirnoff
a44700782e In netmap() call ether_input() within the network epoch. 2020-01-23 01:35:02 +00:00
Vincenzo Maffione
2ec213aba4 netmap: disable passthrough with no hypervisor support
The netmap passthrough subsystem requires proper support in the
hypervisor. In particular, two PCI device ids (from the Red Hat
PCI vendor id 0x1b36) need to be assigned to the two netmap
virtual devices. We then disable these devices until the ids have
not been assigned, in order to avoid conflicts with other
virtual devices emulated by upstream QEMU.

PR:	241774
MFC after:	3 days
2020-01-13 21:47:23 +00:00
Jeff Roberson
3cf3b4e641 Make page busy state deterministic on free. Pages must be xbusy when
removed from objects including calls to free.  Pages must not be xbusy
when freed and not on an object.  Strengthen assertions to match these
expectations.  In practice very little code had to change busy handling
to meet these rules but we can now make stronger guarantees to busy
holders and avoid conditionally dropping busy in free.

Refine vm_page_remove() and vm_page_replace() semantics now that we have
stronger guarantees about busy state.  This removes redundant and
potentially problematic code that has proliferated.

Discussed with:	markj
Reviewed by:	kib
Differential Revision:	https://reviews.freebsd.org/D22822
2019-12-22 06:56:44 +00:00
Vincenzo Maffione
c7c7805531 add valectl to the system commands
The valectl(4) program is used to manage vale(4) switches.
Add it to the system commands so that it can be used right away.
This program was previously called vale-ctl, and stored in
tools/tools/netmap

Reviewed by:	hrs, bcr, lwhsu, kevans
MFC after:	1 week
Differential Revision:	https://reviews.freebsd.org/D22146
2019-10-31 21:01:34 +00:00
Vincenzo Maffione
484456b2d8 netmap: enter NET_EPOCH on generic txsync
After r353292, netmap generic adapter on if_vlan interfaces panics on
asserting the NET_EPOCH. In more detail, this happens when
nm_os_generic_xmit_frame() is called, that is in the generic txsync
routine.
Fix the issue by entering the NET_EPOCH during the generic txsync.
We amortize the cost of entering/exiting over a whole batch of
transmissions.

PR:		241489
Reported by:	Aleksandr Fedorov <aleksandr.fedorov@itglobal.com>
2019-10-28 19:00:27 +00:00
Vincenzo Maffione
760fa2ab5d netmap: minor misc improvements
- use ring->head rather than ring->cur in lb(8)
 - use strlcat() rather than strncat()
 - fix bandwidth computation in pkt-gen(8)

MFC after:	1 week
2019-10-20 14:15:45 +00:00
Vincenzo Maffione
f8bc74e2f4 tap: add support for virtio-net offloads
This patch is part of an effort to make bhyve networking (in particular TCP)
faster. The key strategy to enhance TCP throughput is to let the whole packet
datapath work with TSO/LRO packets (up to 64KB each), so that the per-packet
overhead is amortized over a large number of bytes.
This capability is supported in the guest by means of the vtnet(4) driver,
which is able to handle TSO/LRO packets leveraging the virtio-net header
(see struct virtio_net_hdr and struct virtio_net_hdr_mrg_rxbuf).
A bhyve VM exchanges packets with the host through a network backend,
which can be vale(4) or if_tap(4).
While vale(4) supports TSO/LRO packets, if_tap(4) does not.
This patch extends if_tap(4) with the ability to understand the virtio-net
header, so that a tapX interface can process TSO/LRO packets.
A couple of ioctl commands have been added to configure and probe the
virtio-net header. Once the virtio-net header is set, the tapX interface
acquires all the IFCAP capabilities necessary for TSO/LRO.

Reviewed by:	kevans
Differential Revision:	https://reviews.freebsd.org/D21263
2019-10-18 21:53:27 +00:00
Jeff Roberson
0012f373e4 (4/6) Protect page valid with the busy lock.
Atomics are used for page busy and valid state when the shared busy is
held.  The details of the locking protocol and valid and dirty
synchronization are in the updated vm_page.h comments.

Reviewed by:    kib, markj
Tested by:      pho
Sponsored by:   Netflix, Intel
Differential Revision:        https://reviews.freebsd.org/D21594
2019-10-15 03:45:41 +00:00