LRO bypasses normal ip_input()/tcp_input() and lacks several checks
that are present in the normal path. Without these checks, it
is possible to trigger assertions added in b0ccf53f24.
Reviewed by: glebius, rrs
Sponsored by: Netflix
The quasi-LRU still gets in the way for example when doing an
incremental bzImage build, with vnode_list lock being at the
top of the profile. Apply further damage control by trylocking.
Note the entire mechanism desperately wants to be reaped out in favor
of something(tm) which both scales in a multicore setting and provides
sensible replacement policy.
With this change almost everything vfs disappears from the on-CPU
flamegraph; what is left is tons of contention in the VM.
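
The trylock idea can be illustrated with a small userland sketch. This is
not the vfs code: lru_list, lru_requeue_try() and the pthread mutex are
stand-ins assumed here for the global vnode list and its lock. Requeueing
is treated as a hint, so a thread that finds the lock busy skips the
update instead of waiting for it.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a globally locked quasi-LRU list. */
struct lru_node {
	struct lru_node *prev, *next;
};

struct lru_list {
	pthread_mutex_t lock;
	struct lru_node head;	/* sentinel; head.prev is the tail */
};

static void
lru_init(struct lru_list *l)
{
	pthread_mutex_init(&l->lock, NULL);
	l->head.prev = l->head.next = &l->head;
}

/* Caller must hold the lock, or be the only user of the list. */
static void
lru_insert_tail(struct lru_list *l, struct lru_node *n)
{
	n->prev = l->head.prev;
	n->next = &l->head;
	l->head.prev->next = n;
	l->head.prev = n;
}

/*
 * Move n to the tail, but only if the lock is free right now.
 * Returns false when the update was skipped because of contention.
 */
static bool
lru_requeue_try(struct lru_list *l, struct lru_node *n)
{
	assert(n != &l->head);
	if (pthread_mutex_trylock(&l->lock) != 0)
		return (false);
	n->prev->next = n->next;
	n->next->prev = n->prev;
	lru_insert_tail(l, n);
	pthread_mutex_unlock(&l->lock);
	return (true);
}
```

A skipped requeue only makes the ordering slightly less accurate, which is
acceptable for a list that is already only approximately LRU.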
Turns out it is very rarely triggered, making a per-cpu
counter a waste.
Examples from real-life boxes:
uptime		counter
135 days	847
138 days	2190
141 days	1
We previously attempted to emit Rock Ridge NM records only when the name
represented by the Rock Ridge extensions would actually differ. We would
omit the record for an all-upper-case directory name; however, Linux (and
perhaps other operating systems) maps names with no NM record to
lowercase.
This affected only directories, as file names have an implicit ";1"
version number appended and thus always differ. To solve this, just emit NM
records for all entries other than DOT and DOTDOT.
We could continue to omit the NM record for directories that would avoid
mapping (for example, one named 1234.567) but this does not seem worth
the complexity.
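
The resulting rule can be sketched as a single predicate. The helper name
wants_nm_record() is hypothetical, and identifying DOT and DOTDOT by
comparing the entry name against "." and ".." is an assumption made for
this sketch; it is not taken from the actual change.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper: decide whether a directory entry gets a Rock
 * Ridge NM (alternate name) record.  Every entry gets one except the
 * two special entries for the directory itself and its parent.
 */
static bool
wants_nm_record(const char *name)
{
	assert(name != NULL);
	return (strcmp(name, ".") != 0 && strcmp(name, "..") != 0);
}
```

Under this rule a name such as 1234.567 also gets an NM record, even
though it would not be remapped without one.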
PR: 203531
Reported by: Thomas Schmitt <scdbackup@gmx.net>
Reviewed by: kevans
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D39258
Ubuntu has no `/sys/subsystem' in sysfs. To be compatible with Ubuntu,
stop creating 'subsystem' in the Linux compatibility module.
In addition, the presence of `/sys/subsystem' causes failures in some
Linux udev use cases. The udev source code has a function named
`scan_devices_all' that scans `/sys/subsystem' if it exists. Since
there is currently nothing in `/sys/subsystem', it returns an empty
result, which causes some use cases to fail.
Reviewed by: dchagin
Differential Revision: https://reviews.freebsd.org/D38885
MFC after: 1 month
X-MFC with: IfAPI
Add myself as a doc committer, mentored by carlavilla and dbaio.
Approved by: carlavilla (mentor)
Differential Revision: https://reviews.freebsd.org/D39254
This is a userland-only pointer that isn't relevant to the kernel and
doesn't belong in the ioctl structure shared between userland and the
kernel. For the kernel, the old structure for the ioctl is still
supported under COMPAT_FREEBSD13.
This changes vm_snapshot_req() in libvmmapi to accept an explicit
vmctx argument.
It also changes vm_snapshot_guest2host_addr to take an explicit vmctx
argument. As part of this change, move the declaration for this
function and its wrapper macro from vmm_snapshot.h to snapshot.h as it
is a userland-only API.
Reviewed by: corvink, markj
Differential Revision: https://reviews.freebsd.org/D38125
This replaces the 'struct vm, int vcpuid' tuple passed to most API
calls and is similar to the changes recently made in vmm(4) in the
kernel.
struct vcpu is an opaque type managed by libvmmapi. For now it stores
a pointer to the VM context and an integer id.
As an immediate effect this removes the divergence between the kernel
and userland for the instruction emulation code introduced by the
recent vmm(4) changes.
Since this is a major change to the vmmapi API, bump VMMAPI_VERSION to
0x200 (2.0) and the shared library major version.
While here (and since the major version is bumped), remove unused
vcpu argument from vm_setup_pptdev_msi*().
Add new functions vm_suspend_all_cpus() and vm_resume_all_cpus() for
use by the debug server. The underlying ioctl (which uses a vcpuid of
-1) remains unchanged, but the userlevel API now uses separate
functions for global CPU suspend/resume.
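
As an illustration of the opaque-handle pattern described above, here is a
minimal sketch. All names (demo_vmctx, demo_vcpu, demo_vcpu_open() and so
on) are hypothetical and are not the libvmmapi API; the sketch only shows
a handle that stores a context pointer and an integer id.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical names throughout; this is not the libvmmapi API. */
struct demo_vmctx {
	const char *name;
};

/* In a real library only a forward declaration would be public. */
struct demo_vcpu {
	struct demo_vmctx *ctx;	/* owning VM context */
	int vcpuid;		/* index of this vCPU within the VM */
};

static struct demo_vcpu *
demo_vcpu_open(struct demo_vmctx *ctx, int vcpuid)
{
	struct demo_vcpu *vcpu;

	vcpu = malloc(sizeof(*vcpu));
	if (vcpu == NULL)
		return (NULL);
	vcpu->ctx = ctx;
	vcpu->vcpuid = vcpuid;
	return (vcpu);
}

static void
demo_vcpu_close(struct demo_vcpu *vcpu)
{
	free(vcpu);
}

/* Per-vCPU calls take the handle instead of a (context, id) pair. */
static int
demo_vcpu_id(const struct demo_vcpu *vcpu)
{
	assert(vcpu != NULL);
	return (vcpu->vcpuid);
}

static struct demo_vmctx *
demo_vcpu_ctx(const struct demo_vcpu *vcpu)
{
	assert(vcpu != NULL);
	return (vcpu->ctx);
}
```

Because callers never see the structure layout, fields can be added to
the handle later without another API change.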
Reviewed by: corvink, markj
Differential Revision: https://reviews.freebsd.org/D38124
The formats for pmcstat(8)'s human-readable output are not part of its
user interface definition, and may change in the future. Highlight
this in its manual page.
Approved by: gnn (mentor)
Differential Revision: https://reviews.freebsd.org/D39249
The ix number for the fdescfs root is 1, while any fd vnode has an ix
value of at least 3.
Reviewed by: markj
Tested by: pho
Sponsored by: The FreeBSD Foundation
MFC after: 1 week
Differential revision: https://reviews.freebsd.org/D39207
It is already referenced by the VOP_LOOKUP() caller, otherwise vdrop()
after vn_lock() is invalid anyway.
Reviewed by: markj
Tested by: pho
Sponsored by: The FreeBSD Foundation
MFC after: 1 week
Differential revision: https://reviews.freebsd.org/D39207
Just owning the interlock is not enough for vget() to operate on the
vnode race-free with vgone(); the vnode should be held. Use
vget_prep()/vget_finish() to avoid vholding the vnode explicitly, and
drop LK_INTERLOCK.
Reviewed by: markj
Tested by: pho
Sponsored by: The FreeBSD Foundation
MFC after: 1 week
Differential revision: https://reviews.freebsd.org/D39207
bfb202c455 addresses the CTRL-EVENT-SCAN-FAILED. Upstream d807e289d
caused a FreeBSD regression in driver_bsd.c, which this rc.d patch
worked around. As of bfb202c455 this workaround is no longer needed.
Reviewed by: bz (for wireless)
MFC after: 10 days
X-MFC with: bfb202c455
Differential Revision: https://reviews.freebsd.org/D39257
ip6_input() and ip6_destroy() both directly reference ifnet members.
This file was missed in 3d0d5b21.
Fixes: 3d0d5b21 ("IfAPI: Explicitly include <net/if_private.h>...")
Sponsored by: Juniper Networks, Inc.
Summary:
Because of the intricacies of this code it wasn't purely scripted, but
instead hand-mechanical.
Reviewed by: hselasky
Sponsored by: Juniper Networks, Inc.
Differential Revision: https://reviews.freebsd.org/D38560
We can't do anything if ip6_output() fails, other than discard the
packet which ip6_output() already does for us.
Mark the return value as ignored.
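
A minimal sketch of the idiom, using hypothetical stand-ins (demo_pkt,
demo_output(), demo_send()) rather than the real ip6_output() call: the
callee consumes the packet whether or not it succeeds, so the caller
casts the result to void to document that ignoring it is deliberate.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical packet; "consumed" means sent or already discarded. */
struct demo_pkt {
	bool consumed;
};

/* Stand-in output routine that takes ownership of the packet. */
static int
demo_output(struct demo_pkt *p, bool fail)
{
	assert(p != NULL);
	p->consumed = true;
	return (fail ? ENOBUFS : 0);
}

static void
demo_send(struct demo_pkt *p, bool fail)
{
	/* Nothing useful can be done with an error here, so say so. */
	(void)demo_output(p, fail);
}
```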
Reported by: emaste, Coverity
Sponsored by: Rubicon Communications, LLC (Netgate)
Commit 8923de5905 ("ice(4): Update to 1.37.7-k", 2023-02-13)
unintentionally overwrote the change made in commit 52f45d8ace ("net:
iflib: let the drivers use isc_capenable", 2021-12-28).
Signed-off-by: Eric Joyner <erj@FreeBSD.org>
Reported by: jhibbits@
MFC after: 3 days
Sponsored by: Intel Corporation
The variable oicmp, which holds the original ("quoted packet") ICMP
packet in a structured way, did not have a copy of the original ICMP
packet obtained from the raw data.
The code was accidentally removed in 20b4130314. Bring it back.
Reported by: Coverity Scan, cy
Reviewed by: cy
CID: 1506960 (UNINIT)
Fixes: 20b4130314
MFC after: 1 week
Differential Revision: https://reviews.freebsd.org/D39233
This reverts the state to our old supplicant logic setting or clearing
IFF_UP if needed. In addition this adds logging for the cases in which
we do (not) change the interface state.
Depending on testing this seems to help bringing WiFi up or not log
any needed changes (which would be the expected wpa_supplicant logic
now). People should look out for ``(changed)`` log entries (at least
if debugging the issue; this way we will at least have data points).
There is a hypothesis still being pondered that the entire IFF_UP toggling
only exploits a race in net80211 (for further discussions, more
debugging, and alternative solutions see D38508 and D38753).
That may also explain why the changes to the rc startup script [1]
only helped partially for some people to no longer see the
continuous CTRL-EVENT-SCAN-FAILED.
It is highly likely that we will want further changes and until
we know for sure that people are seeing ``(changed)`` events
this should stay local. Should we need to upstream this we'll
likely need #ifdef __FreeBSD__ around this code.
[1] 5fcdc19a81 and
d06d7eb091
Sponsored by: The FreeBSD Foundation
MFC after: 10 days
Reviewed by: cy, enweiwu (earlier)
Differential Revision: https://reviews.freebsd.org/D38807
This entails:
- Marking some obvious candidates for __nosanitizeaddress
- Similar trap frame markings as amd64, for similar reasons
- Shadow map implementation
The shadow map implementation is roughly similar to what was done on
amd64, with some exceptions. Attempting to use available space at
preinit_map_va + PMAP_PREINIT_MAPPING_SIZE (up to the end of that range,
as depicted in the physmap) results in odd failures, so we instead
search the physmap for free regions that we can carve out, fragmenting
the shadow map as necessary to try and fit as much as we need for the
initial kernel map. pmap_bootstrap_san() is thus called after
pmap_bootstrap(), which still included some technically reserved areas
of the memory map that needed to be included in the DMAP.
The odd failure noted above may be a bug, but I haven't investigated it
all that much.
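
The carving strategy can be sketched in userland as follows. This is a
simplified illustration, not the pmap code: demo_range and demo_carve()
are hypothetical, ranges are consumed from their start, and no alignment
or page-size constraints are modelled.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical half-open physical range [start, end). */
struct demo_range {
	uint64_t start, end;
};

/*
 * Carve up to "need" bytes out of the available ranges, front to back,
 * recording each fragment in out[].  Returns the number of fragments
 * used; *left receives the number of bytes that could not be satisfied.
 */
static size_t
demo_carve(struct demo_range *avail, size_t navail, uint64_t need,
    struct demo_range *out, size_t nout, uint64_t *left)
{
	size_t i, n;

	assert(out != NULL && left != NULL);
	n = 0;
	for (i = 0; i < navail && need > 0 && n < nout; i++) {
		uint64_t size = avail[i].end - avail[i].start;
		uint64_t take = size < need ? size : need;

		if (take == 0)
			continue;
		out[n].start = avail[i].start;
		out[n].end = avail[i].start + take;
		avail[i].start += take;	/* shrink the donor range */
		need -= take;
		n++;
	}
	*left = need;
	return (n);
}
```

When no single range is large enough, the request is spread across
several ranges, which is what fragments the resulting map.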
Initial work by mhorne with additional fixes from kevans and markj.
Reviewed by: andrew, markj
Sponsored by: Juniper Networks, Inc.
Sponsored by: Klara, Inc.
Differential Revision: https://reviews.freebsd.org/D36701
The boot catalog pointer is a DWord, but we previously populated it via
cd9660_bothendian_dword which overwrote four unused bytes following it.
See El Torito 1.0 (1995) Figure 7 for details.
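
The difference between the two encodings can be shown with a short
sketch. The helpers put_le32() and put_both32() are hypothetical names
for this example, and it assumes the DWord is stored little-endian. A
both-endian field carries the value twice and therefore occupies eight
bytes, four more than the DWord has room for.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Little-endian 32-bit value: writes exactly 4 bytes. */
static void
put_le32(uint8_t *buf, uint32_t v)
{
	assert(buf != NULL);
	buf[0] = v & 0xff;
	buf[1] = (v >> 8) & 0xff;
	buf[2] = (v >> 16) & 0xff;
	buf[3] = (v >> 24) & 0xff;
}

/* Both-endian 32-bit value: little-endian, then big-endian, 8 bytes. */
static void
put_both32(uint8_t *buf, uint32_t v)
{
	put_le32(buf, v);
	buf[4] = (v >> 24) & 0xff;
	buf[5] = (v >> 16) & 0xff;
	buf[6] = (v >> 8) & 0xff;
	buf[7] = v & 0xff;
}
```

Writing a 4-byte field with put_both32() clobbers the four bytes that
follow it, while put_le32() leaves them untouched.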
PR: 203531
Reported by: Coverity Scan
Reported by: Thomas Schmitt <scdbackup@gmx.net>
Reviewed by: kevans
CID: 977470
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D39231
Consistent with 9fd0d9b16e, KERN_TLS is
not supported on kernels without any INET support.
Reviewed by: gallatin, hselasky
MFC after: 1 week
Differential Revision: https://reviews.freebsd.org/D39232
When requesting a superblock read for the sole purpose of getting
the parameters needed to find if backup parameters have been stored,
specify UFS_NOCSUM as only the base superblock is needed. This
change reduces the number of checks that the superblock must pass.
MFC after: 1 week
These were kept for ABI reasons. Remove them and bump __FreeBSD_version
so debuggers can be updated to use the new layout.
Reviewed by: jhb
Sponsored by: Arm Ltd
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D35378
The size of the spsr field in struct reg has changed. Mask out the bits
that userspace doesn't know about, as they may be invalid.
While here, add a comment explaining why we don't need compat support in
set_regs.
Sponsored by: Arm Ltd
It was previously possible for the fault address register to get
clobbered before it was saved. This small window occurred when an
additional exception was encountered inside the exception handler,
overwriting the previous value.
Commit f29942229d ("Read the arm64 far early in el0 exceptions")
patched this issue, but avoided changing the trapframe since this could
be considered a KBI change in FreeBSD 13.
Revert the above fix and save the fault address in the trapframe
instead. This saves the fault address even earlier in the exception
handling process, and is a more robust and simple fix.
Reviewed by: andrew, jhb, jrtc27
Sponsored by: Arm Ltd
Differential Revision: https://reviews.freebsd.org/D38984
For the Exception Syndrome Register, ESR_ELx, the upper 32 bits were
previously unused, but now may contain additional exception info as of
Armv8.7 (FEAT_LS64).
Extend ESR from u32->u64 in exception handling code to support this. In
addition, also extend Saved Program Status Register SPSR_ELx in the same
way to allow for future extensions.
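
To see why the wider type matters, consider the following sketch. The
DEMO_* macros and helper functions are hypothetical, and the field
positions (EC in bits 31:26, ISS2 starting at bit 32 with an assumed
width of five bits) are assumptions of this example rather than
definitions taken from the kernel headers.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed field positions; names are hypothetical. */
#define DEMO_ESR_EC_SHIFT	26
#define DEMO_ESR_EC_MASK	0x3fULL
#define DEMO_ESR_ISS2_SHIFT	32
#define DEMO_ESR_ISS2_MASK	0x1fULL

/* Build a syndrome value from an exception class and ISS2 field. */
static uint64_t
demo_esr_make(uint64_t ec, uint64_t iss2)
{
	assert(ec <= DEMO_ESR_EC_MASK);
	assert(iss2 <= DEMO_ESR_ISS2_MASK);
	return ((ec << DEMO_ESR_EC_SHIFT) | (iss2 << DEMO_ESR_ISS2_SHIFT));
}

static unsigned int
demo_esr_ec(uint64_t esr)
{
	return ((unsigned int)((esr >> DEMO_ESR_EC_SHIFT) & DEMO_ESR_EC_MASK));
}

static unsigned int
demo_esr_iss2(uint64_t esr)
{
	return ((unsigned int)((esr >> DEMO_ESR_ISS2_SHIFT) &
	    DEMO_ESR_ISS2_MASK));
}
```

Storing the value in a 32-bit variable keeps the exception class intact
but silently drops everything above bit 31.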
Reviewed by: andrew
Sponsored by: Arm Ltd
Differential Revision: https://reviews.freebsd.org/D38983