Otherwise it breaks when offloading like checksum or TSO are used,
because second (encapsulated) ip_output() processing passes fragments of
the encapsulated packet down to the hardware interface.
Diagnosed by: hselasky
Reviewed by: np
Sponsored by: Nvidia Networking / Mellanox Technologies
MFC after: 1 week
Differential revision: https://reviews.freebsd.org/D29501
In some settings offload might calculate hash from decapsulated packet.
Reserve a bit in packet header rsstype to indicate that.
Add m_adj_decap() that acts similarly to m_adj, but also either clear
flowid if it is not marked as inner, or transfer it to the decapsulated
header, clearing inner indicator. It depends on the internals of m_adj()
that reuses the argument packet header for the result.
Use m_adj_decap() for decapsulating vxlan(4) and gif(4) input packets.
Reviewed by: ae, hselasky, np
Sponsored by: Nvidia Networking / Mellanox Technologies
MFC after: 1 week
Differential Revision: https://reviews.freebsd.org/D28773
'ps' is not a word - rather, it is a utility with its own manual page.
As every other utility referenced in the file has it, append the
relevant manual section that ps(1) can be found in.
While here, also wordsmith a sentence to avoid awkward phrasing, and fix
a typo.
Pointy hat to: me
Reported by: danfe, brueffer, maxim
The POWER7 subword atomics were not using the correct instructions for
byte and halfword stores in the atomic_fcmpset code.
This only affects builds with custom CFLAGS that have explicitly enabled
ISA_206_ATOMICS.
--Eliminate a big ifdef that encompassed all currently-supported
architectures except mips and powerpc32. This applied to the case
in which we've allocated a superpage but the pager-populated range
is insufficient for a superpage mapping. For platforms that don't
support superpages the check should be inexpensive as we shouldn't
get a superpage in the first place. Make the normal-page fallback
logic identical for all platforms and provide a simple implementation
of pmap_ps_enabled() for MIPS and Book-E/AIM32 powerpc.
--Apply the logic for handling pmap_enter() failure if a superpage
mapping can't be supported due to additional protection policy.
Use KERN_PROTECTION_FAILURE instead of KERN_FAILURE for this case,
and note Intel PKU on amd64 as the first example of such protection
policy.
Reviewed by: kib, markj, bdragon
Differential Revision: https://reviews.freebsd.org/D29439
ipi_msg_count is inaccessible outside this file and is never read.
It was introduced in the original SMP support code in r178628 and was never
actually used anywhere.
Remove it to slightly improve IPI performance.
Submitted by: jhibbits
MFC after: 1 week
The NFSv4.1 (and 4.2 on 13) server incorrectly binds
a new TCP connection to the back channel when first
used by an RPC with a Sequence op in it (almost all of them).
RFC5661 specifies that only the fore channel should be bound.
This was done because early clients (including FreeBSD)
did not do the required BindConnectionToSession RPC.
Unfortunately, this breaks the Linux client when the
"nconnects" mount option is used, since the server
may do a callback on the incorrect TCP connection.
This patch converts the server behaviour to that
required by the RFC. It also makes the server test/indicate
failure of the back channel more aggressively.
Until this patch is applied to the server, the
"nconnects" mount option is not recommended for a Linux
NFSv4.1/4.2 client mount to the FreeBSD server.
Reported by: bcodding@redhat.com
Tested by: bcodding@redhat.com
PR: 254560
MFC after: 1 week
In emacs mode, force ^R to backware search the history
This behaviour is the default in emacs mode for most of the other shells
Note: Note that this can still be overridden via $EDITRC, ~/.editrc or a
bind command after set -o emacs.
MFC after: 1 week
Approved by: jilles
Reviewed by: jilles, arichardson, pstef
Differential Revision: https://reviews.freebsd.org/D29494
Now that superpages for HPT MMU has landed, finish implementation of
pmap_mincore by adding support for superpages.
Submitted by: Fernando Eckhardt Valle <fernando.valle@eldorado.org.br>
Reviewed by: bdragon, luporl
MFC after: 1 week
Sponsored by: Eldorado Research Institute (eldorado.org.br)
Differential Revision: https://reviews.freebsd.org/D29230
This is required for small CCBs support, where we need to track
whether the CCB was allocated from an UMA zone or not. There are
no (intended) functional changes with the current source.
Reviewed By: imp
Sponsored by: NetApp, Inc.
Sponsored by: Klara, Inc.
Differential Revision: https://reviews.freebsd.org/D29484
The RCS IDs have been retired as of the move to git, so on 14-CURRENT
and 13.0-STABLE this fortune returns the following.
This fortune brought to you by:
$FreeBSD$
While faintly amusing the first time, this might just cause confusion
for folks, and in addition it's not the most useful of tips, so doesn't
add much.
Therefore it seems prudent to get rid of it.
MFC: Not to 11-STABLE or 12-STABLE.
There are now options for specifying the debug port via tunable
(hw.fdt.dbgport) and device tree (/chosen/freebsd-dbgpath), so this can
be usefully included in GENERIC.
MFC after: 1 week
Differential Revision: https://reviews.freebsd.org/D29152
The two functions in linux/inetdevice.h are highly FreeBSD/ifnet
specific. This is a result of struct net_device being mapped to
struct ifnet.
The only known consumer of these functions are two files in the
ofed/infiniband code.
As a first step of cleaning up copy linux/inetdevice.h to
rdma/ib_addr_freebsd.h. (It stayed a separate file to preserve
copyright and license of the original file; otherwise it could be
merged into ib_addr.h where more EPOCH/vnet/.. are already used).
Slightly rename the function to not conflict with LinuxKPI
in the future.
Remove the three last, now unneeded includes of inetdevice.h and
zap linux/inetdevice.h to an empty header file with only the forward
include to netdevice.h remaining.
Sponsored-by: The FreeBSD Foundation
MFC-after: 2 weeks
Reviewed-by: hselasky, kib
X-D-R: D29366 (extracted as further cleanup)
Differential Revision: https://reviews.freebsd.org/D29434
The remote protocol allows for implementations to report more specific
reasons for the break in execution back to the client [1]. This is
entirely optional, so it is only implemented for amd64, arm64, and i386
at the moment.
[1] https://sourceware.org/gdb/current/onlinedocs/gdb/Stop-Reply-Packets.html
Reviewed by: jhb
MFC after: 3 weeks
Sponsored by: NetApp, Inc.
Sponsored by: Klara, Inc.
NetApp PR: 51
Differential Revision: https://reviews.freebsd.org/D29174
Handle the 'z' and 'Z' remote packets for manipulating hardware
watchpoints.
This could be expanded quite easily to support hardware or software
breakpoints as well.
https://sourceware.org/gdb/onlinedocs/gdb/Packets.html
Reviewed by: cem, markj
MFC after: 3 weeks
Sponsored by: NetApp, Inc.
Sponsored by: Klara, Inc.
NetApp PR: 51
Differential Revision: https://reviews.freebsd.org/D29173
Instead of polling nleft[i] (without appropriate memory barriers!) and
using sleep() to detect the exit just call pthread_join() on all threads.
Also replace the use of a mutex that guarding the increments with atomic
fetch_add. This should reduce the runtime of this test on SMP systems.
Finally, remove all the debug printfs unless DEBUG_OUTPUT is set in
the environment.
Test Plan: still fails sometimes on qemu (but maybe less often?)
Reviewed By: jhb
Differential Revision: https://reviews.freebsd.org/D29390
LLVM12 complains if you change the symbol binding:
error: mfs_root_end changed binding to STB_WEAK [-Werror,-Winline-asm]
error: mfs_root changed binding to STB_WEAK [-Werror,-Winline-asm]
LLVM12 complains if you change the symbol binding:
`error: _longjmp changed binding to STB_GLOBAL`
In this case LLVM actually ignored the weak directive and used the
later .global, but GNU as would mark the symbol as weak.
None of the other architectures mark the libsa _setjmp as weak so
just drop this directive.
We are inspecting PCBs of divert sockets under NET_EPOCH section,
but PCB could be already detached and we should check INP_FREED flag
when we took INP_RLOCK.
PR: 254478
MFC after: 1 week
Differential Revision: https://reviews.freebsd.org/D29420
The old prototype requires callers to inspect flags of each descriptors
to get the starting position of host-writable iovecs.
vq_getchain() is changed to return a virtio request with the number of
host-readable iovecs and host-writable iovecs instead. Callers can avoid
boilerplate code of getting the start offset of host-writable iovecs.
Sponsored by: The FreeBSD Foundation
MFC after: 3 weeks
Reviewed by: afedorov
Approved by: philip (mentor)
Differential Revision: https://reviews.freebsd.org/D29433
In busy-wait mode (BUSYWAIT defined), NIOCTXSYNC should be
performed after packets have been moved to the TX ring
(rather than before).
Before the change, moved packets may stall for an indefinite
time in the TX ring.
MFC after: 1 week
On an INVARIANTS kernel on 32-bit Book-E, we were panicing when running
the libproc tests. This was caused by extra pv entries being generated
accidentally by the pmap icache invalidation code.
Use the same VA (i.e. 0) when freeing the temporary mapping, instead of
some arbitrary address within the zero page.
Failure to do this was causing kernel-side icache syncing to leak
PVE entries when invalidating icache for a non page-aligned address, which
would later result in pages erroneously showing up as mapped to vm_page.
This bug was introduced in r347354 in 2019.
Reviewed by: jhibbits (in irc)
Sponsored by: Tag1 Consulting, Inc.
The current code has the limit of 127 nexthop groups due to the
wrongly-checked bitmask_copy() return value.
PR: 254303
Reported by: Aleks <a.ivanov at veesp.com>
MFC after: 1 day
This patch replaces a couple of while() loops with LIST_FOREACH() loops.
While here, declare a couple of variables "bool".
I think LIST_FOREACH() is preferred and makes the code more readable.
This also prepares the code for future changes to use a hash table of
lists for open searches via file handle.
This patch should not result in a semantics change.
MFC after: 2 weeks
Implements bus_map_resource and bus_unmap_resource DEVMETHODs to be
used by powerpc targets. This is identical to the amd64 code.
Required by virtio-modern.
Reviewed by: bryanv
Sponsored by: Eldorado Research Institute (eldorado.org.br)
MFC after: 1 week
Differential Revision: https://reviews.freebsd.org/D28012
If a delegation for a file has been acquired, the "oneopenown" option
was ignored when the local open was issued. This could result in multiple
openowners/opens for a file, that would be transferred to the server
when the delegation was recalled.
This would not be serious, but could result in more than one openowner.
Since the Amazon/EFS does not issue delegations, this probably never
occurs in practice.
Spotted during code inspection.
This small patch fixes the code so that it checks for "oneopenown"
when doing client local opens on a delegation.
MFC after: 2 weeks
The previous commit added the handler to parse the command line
options for virtio-scsi devices but forgot to set the correct function
pointer to point to the handler.
Reported by: vangyzen
Reviewed by: vangyzen
Fixes: 621b509048
Differential Revision: https://reviews.freebsd.org/D29438
The companion libnetmap changes for the "offsets" kernel support added
in a6d768d845. This includes code to parse the "@offset=NNN"
option that can be appended to the port name by any nmport_* application.
Example:
# pkt-gen -i 'netmap:em0@offset=16'
The netmap monitor intercepts any TX/RX packets on the monitored
port. However, before this change there was no way to tell
whether an intercepted packet was being transmitted or received
on the monitored port.
A TXMON flag in the netmap slot has been added for this purpose.
This feature enables applications to ask netmap to transmit or
receive packets starting at a user-specified offset from the
beginning of the netmap buffer. This is meant to ease those
packet manipulation operations such as pushing or popping packet
headers, that may be useful to implement software switches,
routers and other packet processors.
To use the feature, drivers (e.g., iflib, vtnet, etc.) must have
explicit support. This change does not add support for any driver,
but introduces the necessary kernel changes. However, offsets support
is already included for VALE ports and pipes.
Use the new kdb variants. Print more specific error messages.
Reviewed by: jhb, markj
MFC after: 3 weeks
Sponsored by: NetApp, Inc.
Sponsored by: Klara, Inc.
Differential Revision: https://reviews.freebsd.org/D29156
Implement wrappers around the existing debug_monitor interface, to be
consumed by MI kernel debugger code.
For now, the various db_printf() calls in this code remain. In the
future, they could be converted to printf() or removed altogether, to
properly decouple the DDB and GDB options.
Reviewed by: jhb, kib, markj
MFC after: 3 weeks
Sponsored by: NetApp, Inc.
Sponsored by: Klara, Inc.
Differential Revision: https://reviews.freebsd.org/D29155
Add wrappers around the debug_monitor interface, to be consumed by MI
kernel debugger code. Update dbg_setup_watchpoint() and
dbg_remove_watchpoint() to return specific error codes, not just -1.
Reviewed by: jhb, kib, markj
MFC after: 3 weeks
Sponsored by: NetApp, Inc.
Sponsored by: Klara, Inc.
Differential Revision: https://reviews.freebsd.org/D29155
Add wrappers around the dbreg interface that can be consumed by MI
kernel debugger code. The dbreg functions themselves are updated to
return error codes, not just -1. dbreg_set_watchpoint() is extended to
accept access bits as an argument.
Reviewed by: jhb, kib, markj
MFC after: 3 weeks
Sponsored by: NetApp, Inc.
Sponsored by: Klara, Inc.
Differential Revision: https://reviews.freebsd.org/D29155
This basically mirrors what already exists in ddb, but provides a
slightly improved interface. It allows the caller to specify the
watchpoint access type, and returns more specific error codes to
differentiate failure cases.
This will be used to support hardware watchpoints in gdb(4).
Stubs are provided for architectures lacking hardware watchpoint logic
(mips, powerpc, riscv), while other architectures are added individually
in follow-up commits.
Reviewed by: jhb, kib, markj
MFC after: 3 weeks
Sponsored by: NetApp, Inc.
Sponsored by: Klara, Inc.
Differential Revision: https://reviews.freebsd.org/D29155
This per-driver callback is invoked by netmap when it wants
to align the number of TX/RX netmap rings and/or the number of
TX/RX netmap slots to the actual state configured in the hardware.
The alignment happens when netmap mode is switched on (with no
active netmap file descriptors for that netmap port), or when
collecting netmap port information.
MFC after: 1 week
Without this patch, sh can autocomplete file names but not commands from
$PATH. Use libedit's facility to execute custom function for autocomplete,
but yield to the library's standard autocomplete function when cursor is
not at position 0.
Reviewed by: bapt
Differential Revision: https://reviews.freebsd.org/D29361