The calculation of Maxmem was skipping the last phys_avail segment,
because of a wrong stop condition.
This was detected when using QEMU/PowerNV with Radix MMU and low
memory (2G). In this case opal_pci would allocate a DMA window that
was too small to cover all physical memory, resulting in reading all
zeroes from disk when using memory that was not inside the allocated
window.
Reviewed by: jhibbits
Sponsored by: Instituto de Pesquisas Eldorado (eldorado.org.br)
Differential Revision: https://reviews.freebsd.org/D33449
MFC after: 2 weeks
To quote the manual:
The pointer passed in as name and type is saved rather than the data
it points to. The data pointed to must remain stable until the mutex
is destroyed.
It seems that the type is actually copied, but the name is stored as
a pointer indeed.
mmc_cam_sim_alloc used a name stored on stack.
So, a corrupt mutex name would be reported.
For example:
lock order reversal: (sleepable after non-sleepable)
1st 0xd7285b20 <8A><C0><C0>P@<C1><D0>P@<C1>^D^A (aw_mmc_sim, sleep mutex) @ sys/cam/cam_xpt.c:2804
This change moves the name to struct mmc_sim.
Also, that name is used as the sim name as well.
Unused mtx_name variable is removed too.
The name buffer is reduced to 16 characters.
Reviewed by: manu, bz
MFC after: 10 days
Differential Revision: https://reviews.freebsd.org/D33412
- maximum number of bytes that can be sent is 32, not 8;
- previous interface required callers to bump sc->msg->len in addition
to setting sc->tx_slave_addr;
- because of the above there was an issue with writing one too many bytes
because sc->cnt is not advanced when the slave address is written;
- the inetraction between outer and inner loops was confusing as the former
was bounded on the number of bytes to write and the counter was
incremented by one, but the inner loop advanced four bytes at a time;
- the return value was incorrect in the tx_slave_addr case; one call place
had to use its own (and incorrect in some cases) notion of the write
lenth.
All of the above issues should be fixed.
Some sanity asserts are added.
All callers use the return value to program RK_I2C_MTXCNT.
iic_msg::len no longer needs to be hacked.
A constant is added to reflect the maximum number of octets that can be
sent or received in one go (they are the same).
MFC after: 1 week
No need to mask a uint8_t with 0xff, the mask covers the whole type.
Explcitly cast to uint32_t before bit shifting instead of relying on
the implicit promotion to signed int.
MFC after: 1 week
Previously the driver would happily talk to addresses with no device
returning some garbage for reads and sending bits into the void for writes.
MFC after: 1 week
Previously the code would decalre the transfer complete after sending
first 31 bytes (plus the slave address) of a larger I2C write transfer.
That was tested using a large write to an EEPROM with 32-byte write page
size and a 2-byte address type. Such a transaction needed to send 34
bytes, 2 bytes for an offset and 32 bytes of actual data.
MFC after: 1 week
The scanning code uses Giant to coordinate its accesses to newbus as
well as to synchronize a little state within hyperv's vmbus. Switch to
the new bus_topo_* functions instead of referring to Giant explicitly.
Sponsored by: Netflix
Reviewed by: jhb
Differential Revision: https://reviews.freebsd.org/D31840
Remove more of the pccard infrasturcture. CardBus Yenta driver (cbb)
still references the remaining bits. It needs some additiona work to
remove 16-bit support still, so it remains.
Sponsored by: Netflix
With the mac_priority(4) realtime policy active, users and processes in
the realtime group may promote existing threads and processes to
realtime scheduling priority. Extend the privileges granted to
PRIV_SCHED_SETPOLICY which allows explicit creation of new realtime
threads.
One use case of this is when the pthread scheduling policy is set to
SCHED_RR or SCHED_FIFO via pthread_attr_setschedpolicy(...) before
calling pthread_create(...). I ran into this when testing audio software
with realtime threads, particularly audio/ardour6.
MFC after: 1 week
Differential revision: https://reviews.freebsd.org/D33393
The integration with RLIMIT_STACK is still causing problems for some
programs such as lang/sdcc and syzkaller's executor. Until this is
resolved by some work currently in progress, disable the stack gap by
default.
PR: 260303
Reviewed by: kib, emaste
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D33438
vm_map_wire() works by calling vm_fault(VM_FAULT_WIRE) on each page in
the rage. (For largepage mappings, it calls vm_fault() once per large
page.)
A pager's populate method may return more than one page to be mapped.
If VM_FAULT_WIRE is also specified, we'd wire each page in the run, not
just the fault page. Consider an object with two pages mapped in a
vm_map_entry, and suppose vm_map_wire() is called on the entry. Then,
the first vm_fault() would allocate and wire both pages, and the second
would encounter a valid page upon lookup and wire it again in the
regular fault handler. So the second page is wired twice and will be
leaked when the object is destroyed.
Fix the problem by modify vm_fault_populate() to wire only the fault
page. Also modify the error handler for pmap_enter(psind=1) to not test
fs->wired, since it must be false.
PR: 260347
Reviewed by: alc, kib
MFC after: 2 weeks
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D33416
Note that support for TLS 1.3 receive offload in OpenSSL is still an
open pull request in active development. However, potential changes
to that pull request should not affect the kernel interface.
Reviewed by: hselasky
Sponsored by: Netflix
Differential Revision: https://reviews.freebsd.org/D33007
It appears at least on QLE2694L cards 3rd and 4th ports follow the
same NVRAM addressing logic as the first two. In lack of proper
documentation this guess is as good as it can be.
MFC after: 1 week
Sponsored by: iXsystems, Inc.
Summary:
The linux device tree documentation for this states that
for v1 voltages are required, but for v2 voltages are optional.
So, handle that here - if there's no regulator/supply provided
for a v1 opmode then error out; but keep it optional for v2.
Then just don't both doing any regulator calls if it's not configured.
This isn't the best/final solution - mmel@ has suggested that
this should be flipped around a bit and print warnings if
we get an opp-microvolt property but we don't have a regulator.
Subscribers: imp
Reviewed by: mmel, jrtc27, manu
Test Plan: * IPQ4018, with no voltage tables; the freq set is called appropriately.
Differential Revision: https://reviews.freebsd.org/D33140
The pcb lookup always happens in the network epoch and in SMR section.
We can't block on a mutex due to the latter. Right now this patch opens
up a race. But soon that will be addressed by D33339.
Reviewed by: markj, jamie
Differential revision: https://reviews.freebsd.org/D33340
Fixes: de2d47842e
Suggested by: Michael Pratt <mpratt@google.com>
Reviewed by: Michael Pratt <mpratt@google.com>, emaste
Sponsored by: The FreeBSD Foundation
MFC after: 3 days
Differential revision: https://reviews.freebsd.org/D33423
As with arm and riscv fix return fbt probes on arm64. arg0 should be
the offset within the function of the return instruction and arg1
should be the return value.
Reviewed by: kp, markj
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D33440
When changing memory properties in the arm64 pmap we need to keep both
the kernel address and DMAP mappings in sync.
To keep the kernel and DMAP memory in sync we recurse when updating the
former to also update the latter. There was insuffucuent checking around
this recursion. It would check if the virtual address is not within the
DMAP region, but not if the physical address is covered.
Add the missing check as without it the recursion may return an error.
Sponsored by: The FreeBSD Foundation
As reported in PR#260343, nfs_allocate() did not check
the filesize rlimit. This patch adds that check.
PR: 260343
Reviewed by: asomers
MFC after: 2 weeks
Differential Revision: https://reviews.freebsd.org/D33422
This patch decrements maxcnt by the appropriate
number of bytes during parsing and checks to see
if there is data remaining. If not, it just returns
from nfsrv_flexlayouterr() without further processing.
This prevents the tl pointer from running off the end
of the error data pointed at by layp, if there are
flaws in the data.
Reported by: rtm@lcs.mit.edu
Tested by: rtm@lcs.mit.edu
PR: 260293
MFC after: 2 weeks
1. Doorbell interrupt status may arrive lately when doorbell interrupt on
ARC-1886.
2. System boot up hung when ARC-1886 with no volume created or no device
attached.
Many thanks to Areca for continuing to support FreeBSD.
MFC after: 2 weeks
As with sha256 add support for accelerated sha512 support to libmd on
arm64. This depends on clang 13+ to build as this is the first release
with the needed intrinsics. Gcc should also support them, however from
a currently unknown release.
Reviewed by: cem
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D33373
The fr_info struct contains a summary of a packet. One of its fields
is a pointer to the ifnet struct the packet arrived on. It is pointed
to by a void* because ipfilter supports multiple O/Ses. Unfortunately
this makes it difficult it examine with DTrace. Defining fin_ifp as a
pointer to an ifnet struct makes the struct it points to using a DTrace
script possible.
MFC after: 1 week
To quote the manual:
The pointer passed in as name and type is saved rather than the data
it points to. The data pointed to must remain stable until the mutex
is destroyed.
It seems that the type is actually copied, but the name is stored as
a pointer indeed.
mmc_cam_sim_alloc used a name stored on stack.
So, a corrupt mutex name would be reported.
For example:
lock order reversal: (sleepable after non-sleepable)
1st 0xd7285b20 <8A><C0><C0>P@<C1><D0>P@<C1>^D^A (aw_mmc_sim, sleep mutex) @ /usr/devel/git/orange/sys/cam/cam_xpt.c:2804
This change moves the name to struct mmc_sim.
Also, that name is used as the sim name as well.
Unused mtx_name variable is removed too.
This seems to be required even if the watchdog is accessed via the common
MMIO space.
Tested on:
- Ryzen 3 3200U APU;
- Ryzen 7 5800X CPU with X570 chipset.
MFC after: 2 weeks
For pNFS mounts to mirrored Flexible File layout pNFS servers,
the "must_commit" component in the nfsclwritedsdorpc
structure must be checked and the "must_commit" argument passed
into nfscl_doiods() must be updated. Technically, only writes to
the DS with a writeverf change must be redone, but since this
occurrence will be rare, the must_commit argument to nfscl_doiosd()
is set to 1, so all writes to all DSs will be redone.
This bug would affect few, since use of mirrored pNFS servers
is rare and "writeverf" rarely changes. Normally "writeverf"
only changes when a NFS server reboots.
MFC after: 2 weeks
and stop recalculating alignment for PIE base, which was off by one
power of two.
Suggested and reviewed by: markj
Tested by: pho
Sponsored by: The FreeBSD Foundation
MFC after: 1 week
Differential revision: https://reviews.freebsd.org/D33359
Only accept at most superpage alignment, or if the arch does not have
superpages supported, artificially limit it to PAGE_SIZE * 1024.
This is somewhat arbitrary, and e.g. could change what binaries do
we accept between native i386 vs. amd64 ia32 with superpages disabled,
but I do not believe the difference there is affecting anybody with
real (useful) binaries.
Reported and tested by: pho
Reviewed by: markj
Sponsored by: The FreeBSD Foundation
MFC after: 1 week
Differential revision: https://reviews.freebsd.org/D33359
Invalid (artificial) layout of the loadable ELF segments might result in
triggering the assertion. This means that the file should not be
executed, regardless of the kernel debug mode. Change calling
conventions for rnd_elf{32,64} helpers to allow returning an error, and
abort activation with ENOEXEC if its invariants are broken.
Reported and tested by: pho
Reviewed by: markj
Sponsored by: The FreeBSD Foundation
MFC after: 1 week
Differential revision: https://reviews.freebsd.org/D33359
Without this patch, the KASSERT(must_commit == 0,..) can be
triggered by the writeverf in the Direct I/O write reply changing.
This is not a situation that should cause a panic(). Correct
handling is to ignore the change in "writeverf" for Direct
I/O, since it is done with NFSWRITE_FILESYNC.
This patch modifies the semantics of the "must_commit"
argument slightly, allowing an initial value of 2 to indicate
that a change in "writeverf" should be ignored.
It also fixes the KASSERT()s.
This bug would affect few, since Direct I/O is not enabled
by default and "writeverf" rarely changes. Normally "writeverf"
only changes when a NFS server reboots, however I found the
bug when testing against a Linux 5.15.1 kernel nfsd, which
replied to a NFSWRITE_FILESYNC write with a "writeverf" of all
0x0 bytes.
MFC after: 2 weeks
Introduce new MSGBUF_WRAP flag, indicating that buffer has wrapped
at least once and does not keep zeroes from the last msgbuf_clear().
It allows msgbuf_peekbytes() to return only real data, not requiring
every consumer to trim the leading zeroes after doing pointless copy.
The most visible effect is that kern.msgbuf sysctl now always returns
proper zero-terminated string, not only after the first buffer wrap.
MFC after: 1 week
Sponsored by: iXsystems, Inc.
In pmap_ts_referenced we read the virtual address from pv->pv_va,
but then continue to use the pv struct to get the same value later
in the function.
Use the virtual address value we initially read rather than loading
from the pv struct each time.
Add an idletime user group that allows non-root users to run processes
with idle scheduling priority. Privileges are granted by a MAC policy in
the mac_priority module. For this purpose, the kernel privilege
PRIV_SCHED_IDPRIO was added to sys/priv.h (kernel module ABI change).
Deprecate the system wide sysctl(8) knob
security.bsd.unprivileged_idprio which lets any user run idle priority
processes, regardless of context. While the knob is still working, it is
marked as deprecated in the description and in the man pages.
MFC after: 2 weeks
Differential revision: https://reviews.freebsd.org/D33338
The privilege allows the holder to assign idle priority type to thread
or process.
MFC after: 2 weeks
Differential revision: https://reviews.freebsd.org/D33338
Rather than picking up Giant at the start of these sysctl handlers, push
acquiring Giant down a layer to protect walking the children and also
to add/delete children for the reconfigure.
Sponsored by: Netflix
Reviewed by: mav, jhb
Differential Revision: https://reviews.freebsd.org/D33057
Add bus_topo_assert() and implmement it as GIANT_REQUIRED for the
moment. This will allow us to change more easily to a newbus-specific
lock int he future.
Sponsored by: Netflix
Reviewed by: wulf, mav, jhb
Differential Revision: https://reviews.freebsd.org/D31833
Mark the sysctls MPSAFE and pickup the bus topo lock while processing
them.
Sponsored by: Netflix
Reviewed by: mav, jhb
Differential Revision: https://reviews.freebsd.org/D31832
Create a wrapper for newbus to take giant and for busses to take it too.
bus_topo_lock() should be called before interacting with newbus routines
and unlocked with bus_topo_unlock(). If you need the topology lock for
some reason, bus_topo_mtx() will provide that.
Sponsored by: Netflix
Reviewed by: mav
Differential Revision: https://reviews.freebsd.org/D31831
There were two places where the client code did not check
for a parse error return from nfsrv_getattrbits().
This patch fixes both of these cases.
Reported by: rtm@lcs.mit.edu
Tested by: rtm@lcs.mit.edu
PR: 260272
MFC after: 2 weeks
The sanity check for tag length in a callback request
was broken in two ways:
It checked for a negative value, but not a large positive
value.
It did not set taglen to -1, to indicate to the code that
it should not be used.
This patch fixes both of these issues.
Reported by: rtm@lcs.mit.edu
Tested by: rtm@lcs.mit.edu
PR: 260266
MFC after: 2 weeks
The header exports the following:
- Definition of struct tcb.
- Helpers to get/set the tcb for the current thread.
- TLS_TCB_SIZE (size of TCB)
- TLS_TCB_ALIGN (alignment of TCB)
- TLS_VARIANT_I or TLS_VARIANT_II
- TLS_DTV_OFFSET (bias of pointers in dtv[])
- TLS_TP_OFFSET (bias of "thread pointer" relative to TCB)
Note that TLS_TP_OFFSET does not account for if the unbiased thread
pointer points to the start of the TCB (arm and x86) or the end of the
TCB (MIPS, PowerPC, and RISC-V).
Note also that for amd64, the struct tcb does not include the unused
tcb_spare field included in the current structure in libthr. libthr
does not use this field, and the existing calls in libc and rtld that
allocate a TCB for amd64 assume it is the size of 3 Elf_Addr's (and
thus do not allocate room for tcb_spare).
A <sys/_tls_variant_i.h> header is used by architectures using
Variant I TLS which uses a common struct tcb.
Reviewed by: kib (older version of x86/tls.h), jrtc27
Sponsored by: The University of Cambridge, Google Inc.
Differential Revision: https://reviews.freebsd.org/D33351
This is the more standard name for the bias of dtv pointers used on
other platforms. This also fixes a few other places that were using
the wrong bias previously on MIPS such as dlpi_tls_data in struct
dl_phdr_info and the recently added __libc_tls_get_addr().
Reviewed by: kib, jrtc27
Sponsored by: The University of Cambridge, Google Inc.
Differential Revision: https://reviews.freebsd.org/D33346
All of the request handlers no longer modify session state, so remove
the mutex limiting operations to one per session. In addition, change
the pointer to the session state passed to process callbacks to const.
Suggested by: mjg
Reviewed by: mjg, markj
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D33317
Only pre-allocate auth contexts when a session-wide key is provided or
for sessions without keys. For sessions with per-operation keys,
always initialize the on-stack context directly rather than
initializing the session context in swcr_authprepare (now removed) and
then copying that session context into the on-stack context.
This approach permits parallel auth operations without needing a
serializing lock. In addition, the previous code assumed that auth
sessions always provided an initial key unlike cipher sessions which
assume either an initial key or per-op keys.
While here, fix the Blake2 auth transforms to function like other auth
transforms where Setkey is invoked after Init rather than before.
Reviewed by: markj
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D33316
As is done with authentication contexts, allocate cipher contexts on
the stack while completing requests. This permits safely dispatching
concurrent requests on a single session. The cipher context in the
session is now only allocated when a session key is provided during
session setup to serve as a template to initialize the on-stack
context similar to auth operations.
Reviewed by: markj
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D33198
The cipher context isn't always a key schedule, so use a more generic
name.
Reviewed by: markj
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D33197
Extend struct enc_xform to add new members to handle auth operations
for AEAD ciphers. In particular, AEAD operations in cryptosoft no
longer use a struct auth_hash. Instead, the setkey and reinit methods
of struct enc_xform are responsible for initializing both the cipher
and auth state.
Reviewed by: markj
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D33196
Previously, these values were only cleared in AES_GMAC_Init(), so a
second set of operations could reuse the final hash as the initial
hash. Currently this bug does not trigger in cryptosoft as existing
GMAC and GCM operations always use an on-stack auth context
initialized from a template context.
Reviewed by: markj
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D33315
This centralizes the check for valid nonce lengths for AES-GCM.
While here, remove some duplicate checks for valid AES-GCM tag lengths
from ccp(4) and ccr(4).
Reviewed by: markj
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D33194
It is always valid for crp_payload_output_start to be 0. However, if
an output buffer is empty (e.g. a decryption request with a tag but an
empty payload), the existing assertion failed since 0 is not less than
0.
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D33193
(Rest is from the README that came with the firmware)
Version : 1.26.4.0
Date : 12/02/2021
Fixes
-----
BASE:
- Fixed error on setting 25G speed on 100G copper with multiple FEC set in
firmware commands.
- Handle link of unknown optics modules by enabling module tx unconditionally.
- Fixed link not coming up for 25G CRS phys. Firmware incorrectly tried to
bring up the link in RS-FEC but as per IEEE spec, it must be BASER FEC.
- Fixed an issue where firmware doesn't automatically retry next FEC if driver
asks to bring up the link using RS-FEC and link doesn't come up.
Obtained from: Chelsio Communications
MFC after: 1 month
Sponsored by: Chelsio Communications
The commit at hand happens to break userspace build as the header ins
included by sbin/gbde/gbde.c and the __diagused macro is not provided to
userspace.
Revert until this gets sorted out.
This reverts commit 26e837e2d4.
With various firmware files used by graphics and wireless drivers
we are exceeding the current 32 character module name (file path
in kldxref) length.
In order to overcome this issue bump it to the maximum path length
for the next version.
To be able to MFC provide backward compat support for another version
of the struct as the offsets for the second half change due to the
array size increase.
MAXMODNAME being defined to MAXPATHLEN needs param.h to be
included first. With only 7 modules (or LinuxKPI module.h) not
doing that adjust them rather than including param.h in module.h [1].
Reported by: Greg V (greg unrelenting.technology)
Sponsored by: The FreeBSD Foundation
Suggested by: imp [1]
MFC after: 10 days
Reviewed by: imp (and others to different level)
Differential Revision: https://reviews.freebsd.org/D32383
The only requirement for SV_DSO_SIG here is that the flags match between
the source and target sysentvec.
The current assertion is too strict and fails on powerpc64, the only
other architecture than amd64 that uses this function, which doesn't
implement sigtramp in a VDSO.
Reviewed by: kib
Differential Revision: https://reviews.freebsd.org/D33355
- Enable local MCEs on capable Intel CPUs. It delivers exceptions
only to the affected CPU instead of global broadcast, requiring a lot
of synchronization between CPUs. AMD always deliver MCEs locally.
- Make MCE handler process only uncorrected errors, while CMCI and
polling only corrected. It reduces synchronization problems between
them and is explicitly recommended by the documentation.
- Add minimal support for uncorrected software recoverable errors
on Intel CPUs. It allows to avoid kernel panics in case uncorrected
errors do not affect current operation, like ones found during scrub
or write. Such errors are only logged, postponing the panic until
the corrupted data will actually be needed (that may never happen).
- Reduce polling period from 1 hour to 5 minutes.
MFC after: 2 weeks
GCC 9 doesn't define a SHA256_Transform symbol when the stub just wraps
SHA256_Transform_c resulting in an undefined symbol for
_libmd_SHA256_Transform in libmd.so.
Discussed with: andrew, jrtc27
Reviewed by: emaste
Differential Revision: https://reviews.freebsd.org/D31945
allowing linking to static symbols from other files. Default the new
settings to true, delaying the change of the kernel linker behavior
for other day.
Suggested by: emaste
PR: 207898
Reviewed by: emaste, markj
Sponsored by: The FreeBSD Foundation
MFC after: 1 week
Differential revision: https://reviews.freebsd.org/D32878
In particular, this prevents resolving locals from other files.
To access debug symbol tables, add LINKER_LOOKUP_DEBUG_SYMBOL and
LINKER_DEBUG_SYMBOL_VALUES kobj methods, which are allowed to use
any types of present symbols in all tables.
PR: 207898
Reviewed by: emaste, markj
Sponsored by: The FreeBSD Foundation
MFC after: 1 week
Differential revision: https://reviews.freebsd.org/D32878
Use nitems instead and just use a magic `8` for the size of the args
array. MAXARGS was rarely used (only in arm64 code) and is an overly
generic name to polute the namespace with.
Requested by: kib in D33308