More and more code migrates from lock-based protection to the NET_EPOCH
umbrella. It requires some logic changes, including, notably, refcount
handling.
When we have an `ifa` pointer and we're running inside epoch we're
guaranteed that this pointer will not be freed.
However, the following case can still happen:
* in thread 1 we drop to 0 refcount for ifa and schedule its deletion.
* in thread 2 we use this ifa and reference it
* destroy callout kicks in
* unhappy user reports bug
To address it, new `ifa_try_ref()` function is added, allowing to return
failure when we try to reference `ifa` with 0 refcount.
Additionally, existing `ifa_ref()` is enforced with `KASSERT` to provide
cleaner error in such scenarious.
Reviewed By: rstone, donner
Differential Revision: https://reviews.freebsd.org/D28639
MFC after: 1 week
It fixes loopback route installation for the interfaces
in the different fibs using the same prefix.
Reviewed By: donner
PR: 189088
Differential Revision: https://reviews.freebsd.org/D28673
MFC after: 1 week
jail_remove(2) includes a loop that sends SIGKILL to all processes
in a jail, but skips processes in PRS_NEW state. Thus it is possible
the a process in mid-fork(2) during jail removal can survive the jail
being removed.
Add a prison flag PR_REMOVE, which is checked before the new process
returns. If the jail is being removed, the process will then exit.
Also check this flag in jail_attach(2) which has a similar issue.
Reported by: trasz
Approved by: kib
MFC after: 3 days
iflib_init_locked() assumes that iflib_stop() has been called, however,
it is not called for media changes.
iflib_if_init_locked() calls stop then init, so fixes the problem.
PR: 253473
MFC after: 3 days
Reviewed by: markj
Sponsored by: Juniper Networks, Inc., Klara, Inc.
Differential Revision: https://reviews.freebsd.org/D28667
Do not attempt to add MODINFOMD_MODULEP to the kernel medatada on
arches that don't have it defined.
This fixes the build for arches different than amd64 after
7d3259775c.
Sponsored by: Citrix Systems R&D
Reported by: lwhsu, arichardson
linux_shared_page_init() creates an object and grabs and maps a single
page to back the VDSO. When destroying the VDSO object, we failed to
destroy the mapping and free KVA. Fix this.
Reviewed by: kib
MFC after: 1 week
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D28696
FreeBSD when running as a dom0 under Xen is not supposed to access the
run time services directly, and instead should proxy the calls through
Xen using an hypercall interface that exposes access to selected run
time services.
Implement the efirt interface on top of the Xen provided hypercalls.
Sponsored by: Citrix Systems R&D
Reviewed by: kib
Differential revision: https://reviews.freebsd.org/D28621
Introduce a set of hooks for MI EFI public functions, so that a new
implementation can be done. This will be used to implement the Xen PV
EFI interface that's used when running FreeBSD as a Xen dom0 from UEFI
firmware. Also make the efi_status_to_errno non-static since it will
be used to evaluate status return values from the PV interface.
No functional change indented.
Sponsored by: Citrix Systems R&D
Reviewed by: kib, imp
Differential revision: https://reviews.freebsd.org/D28620
Allow setting the bootmethod variable from the Xen PVH entry point, in
order to be able to correctly set the underlying firmware mode when
booted as a dom0.
Move the bootmethod variable to be defined in x86/cpu_machdep.c
instead so it can be shared by both i386 and amd64.
Sponsored by: Citrix Systems R&D
Reviewed by: kib
Differential revision: https://reviews.freebsd.org/D28619
Add some basic multiboot2 infrastructure to the EFI loader in order to
be capable of booting a FreeBSD/Xen dom0 when booted from UEFI.
Only a very limited subset of the multiboot2 protocol is implemented
in order to support enough to boot into Xen, the implementation
doesn't intend to be a full multiboot2 capable implementation.
Such multiboot2 functionality is hooked up into the amd64 EFI loader,
which is the only architecture that supports Xen dom0 on FreeBSD.
The options to boot a FreeBSD/Xen dom0 system are exactly the same as
on BIOS, and requires setting the xen_kernel and xen_cmdline options
in loader.conf.
Sponsored by: Citrix Systems R&D
Reviewed by: tsoome, imp
Differential revision: https://reviews.freebsd.org/D28497
This mirrors the functionality of the BIOS amd64 bi_load function,
that stashes the absolute address of the module metadata. This is
required for booting as a Xen dom0 that does relocate the modulep and
the loaded modules, and thus requires adjusting the offset.
No functional change introduced, further patches will make use of this
functionality for Xen dom0 loading.
Sponsored by: Citrix Systems R&D
Reviewed by: imp
Differential revision: https://reviews.freebsd.org/D28496
Xen requires that UEFI BootServices are enabled in order to boot, so
introduce a new parameter to bi_load in order to select whether BS
should be exited.
No functional change introduced in this patch, as all current users of
bi_load request BS to be exited. Further changes will make use of this
functionality.
Note the memory map is still appended to the kernel metadata, even
when it could be modified by further calls to the Boot Services, as it
will be used to detect if the kernel has been booted from UEFI.
Sponsored by: Citrix Systems R&D
Reviewed by: tsoome, imp
Differential revision: https://reviews.freebsd.org/D28495
- improved pipe calculation which does not degrade under heavy loss
- engaging in Loss Recovery earlier under adverse conditions
- Rescue Retransmission in case some of the trailing packets of a request got lost
All above changes are toggled with the sysctl "rfc6675_pipe" (disabled by default).
Reviewers: #transport, tuexen, lstewart, slavash, jtl, hselasky, kib, rgrimes, chengc_netapp.com, thj, #manpages, kbowling, #netapp, rscheff
Reviewed By: #transport
Subscribers: imp, melifaro
MFC after: 2 weeks
Sponsored by: NetApp, Inc.
Differential Revision: https://reviews.freebsd.org/D18985
Citing Kirk:
The previous code [before 8563de2f27 -- kib] did not call
vnode_pager_setsize() but worked because later in ffs_snapshot() it
does a UFS_WRITE() to output the snaplist. Previously the UFS_WRITE()
allocated the extra block at the end of the file which caused it to do
the needed vnode_pager_setsize(). But the new code had already allocated
the extra block, so UFS_WRITE() did not extend the size and thus did not
do the vnode_pager_setsize().
PR: 253158
Reported by: Harald Schmalzbauer <bugzilla.freebsd@omnilan.de>
Reviewed by: mckusick
Tested by: cy
Sponsored by: The FreeBSD Foundation
MFC after: 1 week
If uio_offset is past end of the object size, calculated resid is negative.
Delegate handling this case to the locked read, as any other non-trivial
situation.
PR: 253158
Reported by: Harald Schmalzbauer <bugzilla.freebsd@omnilan.de>
Tested by: cy
Sponsored by: The FreeBSD Foundation
MFC after: 1 week
Since 4581cefc1e
ATF opens the results file on startup. This fixes problems like
capsicumized tests not being able to open the file on exit.
However, this test closes all file descriptors just to check that
socketpair returns fd 3+4 and thereby also closes the ATF results file.
This then results in an EBADF when writing the result so the test is
reported as broken.
While system calls that create new file descriptors (must?) use the lowest
available file descriptor number, it does not seem useful to test this
property here. Drop the check for FD==3/4 to unbreak the testsuite.
We could also try to re-open the results file in ATF if we get a EBADF
error, but that will fail when running under Capsicum.
Reviewed By: cem
Differential Revision: https://reviews.freebsd.org/D28683
This applies musl commit b02eed9c4841913d690a2d0029737d72615384fe by
Szabolcs Nagy and updates the tests accordingly. This also allows
removing an XFAIL from the test.
musl commit message:
complex: fix ctanh(+-0+i*nan) and ctanh(+-0+-i*inf)
These cases were incorrect in C11 as described by
http://www.open-std.org/jtc1/sc22/wg14/www/docs/n1886.htm
PR: 217528
Reviewed By: dim
MFC after: 1 week
Differential Revision: https://reviews.freebsd.org/D28578
Revision r334749 Added some C++ code to libpmc. It didn't change the ABI,
but it did introduce a dependency on libc++. Nobody noticed because every
program that in the base system that uses libpmc is also C++.
Reported-by: Dom Dwyer <dom@itsallbroken.com>
Reviewed By: vangyzen
MFC after: 3 days
Differential Revision: https://reviews.freebsd.org/D28550
Currently ip6_input() calls in6ifa_ifwithaddr() for
every local packet, in order to check if the target ip
belongs to the local ifa in proper state and increase
its counters.
in6ifa_ifwithaddr() references found ifa.
With epoch changes, both `ip6_input()` and all other current callers
of `in6ifa_ifwithaddr()` do not need this reference
anymore, as epoch provides stability guarantee.
Given that, update `in6ifa_ifwithaddr()` to allow
it to return ifa without referencing it, while preserving
option for getting referenced ifa if so desired.
MFC after: 1 week
Differential Revision: https://reviews.freebsd.org/D28648
in6_selectsrc() may call fib6_lookup() in some cases, which requires
epoch. Wrap in6_selectsrc* calls into epoch inside its users.
Mark it as requiring epoch by adding NET_EPOCH_ASSERT().
MFC after: 1 weeek
Differential Revision: https://reviews.freebsd.org/D28647
While I was there:
- Fix some typos
- Fix an excessive argument "indent" reported by mandoc -Tlint
- Replace a dead link with the one suggested by
https://www.uefi.org/uefi
Submitted by: linimon (in part)
Reviewed by: bcr
MFC after: 1 week
Differential Revision: https://reviews.freebsd.org/D27774
Preserve more space for swap devise names.
Prevent line overflow with long devise name.
Don't draw a bar when swap is not used at all.
Simplify and optimize code.
Change the label to end at end of 100%.
PR: 251655
MFC after: 2 weeks
Differential Revision: https://reviews.freebsd.org/D27496
When locating the anonymous memory region for a vm_map with ASLR
enabled, we try to keep the slid base address aligned on a superpage
boundary to minimize pagetable fragmentation and maximize the potential
usage of superpage mappings. We can't (portably) do this if superpages
have been disabled by loader tunable and pagesizes[1] is 0, and it
would be less beneficial in that case anyway.
PR: 253511
Reported by: johannes@jo-t.de
MFC after: 1 week
Reviewed by: kib
Differential Revision: https://reviews.freebsd.org/D28678
Currently the struct has a 4 byte padding stemming from 3 ints.
1. prio comfortably fits in short, unfortunately there is no dedicated
type for it and plumbing it throughout the codebase is not worth it
right now, instead an assert is added which covers also flags for
safety
2. lk_exslpfail can in principle exceed u_short, but the count is
already not considered reliable and it only ever gets modified
straight to 0. In other words it can be incrementing with an upper
bound of USHRT_MAX
With these in place struct lock shrinks from 48 to 40 bytes.
Reviewed by: kib (previous version)
Differential Revision: https://reviews.freebsd.org/D28680
From openzfs-master 0ae184a6b commit message:
If we do not write any buffers to the cache device and the evict hand
has not advanced do not update the cache device header.
Cherry-picked from openzfs 0ae184a6ba
Patch Author: George Amanakis <gamanakis@gmail.com>
MFC after: 3 days
Reviewed by: delphij
Differential Revision: https://reviews.freebsd.org/D28682
From openzfs-master 62d4287f2 commit message:
When scrubbing, (non-sequential) resilvering, or correcting a checksum
error using RAIDZ parity, ZFS should heal any incorrect RAIDZ parity by
overwriting it. For example, if P disks are silently corrupted (P being
the number of failures tolerated; e.g. RAIDZ2 has P=2), `zpool scrub`
should detect and heal all the bad state on these disks, including
parity. This way if there is a subsequent failure we are fully
protected.
With RAIDZ2 or RAIDZ3, a block can have silent damage to a parity
sector, and also damage (silent or known) to a data sector. In this
case the parity should be healed but it is not.
Cherry-picked from openzfs 62d4287f27
Patch Author: Matthew Ahrens <matthew.ahrens@delphix.com>
MFC after: 3 days
Reviewed by: delphij
Differential Revision: https://reviews.freebsd.org/D28681
Currently when peer information is displayed with `ifconfig wgN peer ..`
or `ifconfig wgN peer-list`, the netmask of the first `allowed-ips` will
be used as the netmask of all CIDR in `allowed-ips`. For example, if
the list is `192.168.1.0/24, 172.16.0.0/16`, it will display as
`192.168.1.0/24, 172.16.0.0/24`. While this does not affect the actual
functionality, it is very confusing.
Submitted by: Michael Chiu <nyan -at- myuji.xyz>
Reviewed by: grehan
Differential Revision: https://reviews.freebsd.org/D28655
MFC after: 1 day
It was reported that getdirentries(2) was
returning dirents with d_off set to 0 for an NFS
mount.
This is believed to be correct behaviour at
this time (it may change for some NFS mounts
in the future), but is inconsistent with what the
getdirentries(2) man page says.
This patch fixes the man page.
This is a content change.
PR: 253428
Reviewed by: asomers
MFC after: 2 weeks
Differential Revision: https://reviews.freebsd.org/D28664
We use ascii box chars with serial console because we do not know
if terminal can draw unixode box chars. Same problem is about userboot
console.
MFC after: 5 days
Locking the second lock which causes the LOR, can be skipped because
the code updating the shared variables is always executing from the
same USB thread.
lock order reversal:
1st 0xfffff80005cc3840 pcm7:play:dsp7.p0 (pcm play channel, sleep mutex)
@ usb_transfer.c:2342
2nd 0xfffff80005cc3860 pcm7:record:dsp7.r0 (pcm record channel, sleep mutex)
@ uaudio.c:2317
lock order pcm record channel -> pcm play channel established at:
witness_checkorder+0x461
__mtx_lock_flags+0x98
dsp_mmap_single+0x151
vm_mmap_cdev+0x65
devfs_mmap_f+0x143
kern_mmap_req+0x594
sys_mmap+0x46
amd64_syscall+0x12e
fast_syscall_common+0xf8
lock order pcm play channel -> pcm record channel attempted at:
witness_checkorder+0xd82
__mtx_lock_flags+0x98
uaudio_chan_play_callback+0xeb
usbd_callback_wrapper+0x7ec
usb_command_wrapper+0x7e
usb_callback_proc+0x8e
usb_process+0xf3
fork_exit+0x80
fork_trampoline+0xe
Found by: Stefan Ehmann <shoesoft@gmx.net>
MFC after: 1 week
Sponsored by: Mellanox Technologies // NVIDIA Networking
The veriexec option is redundant, mac_veriexec is sufficient.
MFC after: 1 week
#
# 72 columns --|
#
# Uncomment and complete these metadata fields, as appropriate:
#
# PR: <If and which Problem Report is related.>
# Reported by: <If someone else reported the issue.>
# Reviewed by: <If someone else reviewed your modification.>
# Approved by: <If you needed approval for this commit.>
# Obtained from: <If the change is from a third party.>
# MFC after: <N [day[s]|week[s]|month[s]]. Request a reminder email>
# MFH: <Ports tree branch name. Request approval for merge.>
# Relnotes: <Set to 'yes' for mention in release notes.>
# Security: <Vulnerability reference (one per line) or description.>
# Sponsored by: <If the change was sponsored by an organization.>
# Pull Request: <https://github.com/freebsd/<repo>/pull/###>
# Differential Revision: <https://reviews.freebsd.org/D###>
#
# "Pull Request" and "Differential Revision" require the *full* GitHub or
# Phabricator URL. The commit author should be set appropriately, using
# `git commit --author` if someone besides the committer sent in the change.
#
# Uncomment and complete these metadata fields, as appropriate:
#
# PR:
# Reported by: <If someone else reported the issue.>
# Reviewed by: <If someone else reviewed your modification.>
# Approved by: <If you needed approval for this commit.>
# Obtained from: <If the change is from a third party.>
# MFC after: <N [day[s]|week[s]|month[s]]. Request a reminder email>
# MFH: <Ports tree branch name. Request approval for merge.>
# Relnotes: <Set to 'yes' for mention in release notes.>
# Security: <Vulnerability reference (one per line) or description.>
# Sponsored by: <If the change was sponsored by an organization.>
# Pull Request: <https://github.com/freebsd/<repo>/pull/###>
# Differential Revision: <https://reviews.freebsd.org/D###>
#
# "Pull Request" and "Differential Revision" require the *full* GitHub or
# Phabricator URL. The commit author should be set appropriately, using
# `git commit --author` if someone besides the committer sent in the change.
#
Use ISS for SEG.SEQ when sending a SYN-ACK segment in response to
an SYN segment received in the SYN-SENT state on a socket having
the IPPROTO_TCP level socket option TCP_NOOPT enabled.
Reviewed by: rscheff
Sponsored by: Netflix, Inc.
Differential Revision: https://reviews.freebsd.org/D28656
The only place where in6_ifawithifp() is used is ip6_output(),
which uses the returned ifa to bump traffic counters.
Given ifa stability guarantees is provided by epoch, do not refcount ifa.
This eliminates 2 atomic ops from IPv6 fast path.
Reviewed By: rstone
MFC after: 1 week
Differential Revision: https://reviews.freebsd.org/D28649