68f57679d6 Fixed another class of integer overflows, but introduced a
boundary condition for 2-4s in ns conversion, 2-~4000s in us conversions
and 2-~4,000,000s in ms conversions. This was because we bogusly used
SBT_1S for the notion of 1 second, instead of the appropriate power of
10. To fix, just use the appropriate power of 10, which avoids these
overflows.
This caused some sleeps in ZFS to be on the order of an hour.
MFC: 1 day
PR: 263073
Sponsored by: Netflix
Reviewed by: asomers
Differential Revision: https://reviews.freebsd.org/D34790
Use -Wno-unused-but-set-variable for kernel builds with clang13.
To turn this warning back on, set the following in src.conf:
WITH_SET_BUT_NOTUSED_KERNEL_WARNINGS=
Reviewed by: mjg, imp
Differential Revision: https://reviews.freebsd.org/D34784
miivar.h includes opt_platform.h. Make sure all the drivers that use the
miibus_if.h interface file have opt_platform.h as well. While some of
these may not, strictly speaking, need it, it's easier to include it
universally for miibus.
Sponsored by: Netflix
We read TX_STATUS1 to acknowledge a TX interrupt. We don't use this
value for anything, so mark it as unused. We may be able to eliminate
this read, but since this hardware is poorly documented and difficult to
test, I'm leaving the read in place.
Sponsored by: Netflix
bbp_atten is write-only: we don't use it later in
bwi_rf_calc_nrssi_slope_11b. However, we may also be able to eliminate
the read of this value from the device. However, The documationation for
this hardware is thing and I can't test this card so I've left the read
of this register in case it's required.
Sponsored by: Netflix
Add three hooks to the livedump process: before, after, and for each
block of dumped data. This allows, for example, quiescing the system
before the dump begins or protecting data of interest to ensure its
consistency in the final output.
Reviewed by: markj, kib (previous version)
Reviewed by: debdrup (manpages)
Reviewed by: Pau Amma <pauamma@gundo.com> (manpages)
MFC after: 3 weeks
Sponsored by: Juniper Networks, Inc.
Sponsored by: Klara, Inc.
Differential Revision: https://reviews.freebsd.org/D34067
This dumper can instantiate and write the dump's contents to a
file-backed vnode.
Unlike existing disk or network dumpers, the vnode dumper should not be
invoked during a system panic, and therefore is not added to the global
dumper_configs list. Instead, the vnode dumper is constructed ad-hoc
when a live dump is requested using the new ioctl on /dev/mem. This is
similar in spirit to a kgdb session against the live system via
/dev/mem.
As described briefly in the mem(4) man page, live dumps are not
guaranteed to result in a usuable output file, but offer some debugging
value where forcefully panicing a system to dump its memory is not
desirable/feasible.
A future change to savecore(8) will add an option to save a live dump.
Reviewed by: markj, Pau Amma <pauamma@gundo.com> (manpages)
Discussed with: kib
MFC after: 3 weeks
Sponsored by: Juniper Networks, Inc.
Sponsored by: Klara, Inc.
Differential Revision: https://reviews.freebsd.org/D33813
Add a new function, dumper_create(), to allocate a dumper.
dumper_insert() will call this function and retains the existing
behaviour.
This is desirable for performing live dumps of the system. Here, there
is a need to allocate and configure a dumper structure that is invoked
outside of the typical debugger context. Therefore, it should be
excluded from the list of panic-time dumpers.
free_single_dumper() is made public and renamed to dumper_destroy().
Reviewed by: kib, markj
MFC after: 1 week
Sponsored by: Juniper Networks, Inc.
Sponsored by: Klara, Inc.
Differential Revision: https://reviews.freebsd.org/D34068
For IO_APPEND VOP_WRITE()s, the code first does a
Getattr RPC to acquire the file's size, before it
can do the Write RPC.
Although NFS does not have an append write operation,
an NFSv4 compound can use a Verify operation to check
that the client's notion of the file's size is
correct before doing the Write operation.
This patch prepares the NFSv4 client for such an
RPC, which will be added in a future commit.
This patch does not cause any semantics change.
firmware in shutdown
If controller reset is in progress, at same time if system shutdown is
issued then corresponding shutdown function in driver will be invoked
where driver is waiting 15 seconds to complete the controller reset.
If the reset is not complteted within that time frame driver will go
ahead and fire cache flush and shutdown DCMDs which will end up
accessing the the queues which are not initialized due to undergoing
reset leads to FMU error in firmware.
Fix:
In shutdown function, if controller reset is not finished within 15
seconds than driver will return to the OS without firing any DCMDs.
Reviewed by: imp
PR: 261375
There is an additional MPT command allocation for R1 fp command which
will lead to MPT command unavailablity in case of rigorous R1 FP IOs.
Remove additional MPT command allocation for R1 FP.
Reviewed by: imp
PR: 261377
Move io_mapping_create_wc to .c because it encodes the size of struct
io_mapping so we move this from the client module to the linuxkpi
module.
Sponsored by: Netflix
Reviewed by: hselasky, emaste
Differential Revision: https://reviews.freebsd.org/D34776
lkpi_pcim_iomap_devres_find encodes the size of struct pcim_iomap_devres
in the code, so move from .h to .c to move from client driver to
linuxkpi module.
Sponsored by: Netflix
Reviewed by: hselasky
Differential Revision: https://reviews.freebsd.org/D34775
pci_alloc_irq_vectors encodes the size of struct msix_entry
into its code. Move from .h to .c to move this knowledge from
client modules to linuxkpi module.
Sponsored by: Netflix
Reviewed by: hselasky
Differential Revision: https://reviews.freebsd.org/D34774
Both pci_request_region and _lkpi_pci_iomap encode the size of struct
pci_mmio_region into their code. Move from .h to .c files to move that
knowledge from the client drivers into the linuxkpi module.
Sponsored by: Netflix
Reviewed by: hselasky
Differential Revision: https://reviews.freebsd.org/D34773
lkpi_pci_devres_get_alloc encodes the struct pci_devres into its
code. Move from .h file to .c file to move this knowledge into linuxkpi
module.
Sponsored by: Netflix
Reviewed by: hselasky, emaste
Differential Revision: https://reviews.freebsd.org/D34772
Move cdev_alloc into linux_compat.c since it encodes the size of struct
linux_cdev into the client modules otherwise.
Sponsored by: Netflix
Reviewed by: hselasky, emaste
Differential Revision: https://reviews.freebsd.org/D34771
class_create encodes the size of struct class into the generated
code. Move from .h file to .c file to move this knowledge from the
client modules that call this into the linuxkpi module.
Sponsored by: Netflix
Reviewed by: hselasky, emaste
Differential Revision: https://reviews.freebsd.org/D34769
device_create_groups_vargs encodes the size of struct device. Move
definition from .h to .c to move this size into the linuxkpi module
rather than encoding it in all client driver modules.
Sponsored by: Netflix
Reviewed by: hselasky, emaste
Differential Revision: https://reviews.freebsd.org/D34768
kobject_create knows the size of struct kobject. Move it to
linux_compat.c so this knowledge is confined to the loadable module and
not the clients.
Sponsored by: Netflix
Reviewed by: hselasky, emaste
Differential Revision: https://reviews.freebsd.org/D34767
LinuxKPI is asking for single-segment mappings. Some (wireless) drivers
are using this to map multi-pages and our busdma framework is not very
friendly to that as single-segments [D31823]. Add a counter so we can
track when this happens to gather more information.
Sponsored by: The FreeBSD Foundation
Reviewed by: hselasky
MFC after: 3 days
Differential Revision: https://reviews.freebsd.org/D34715
This is a temporary hack for zlib to make sure that the library
still builds when building with Z_SOLO (used in kernel and loader),
as zlib is depending on limits.h which is only available in STDC
case.
PR: kern/262977
MFC after: 3 days
As of today, using 'kldload miibus' is not equivalent to using 'device
miibus' in a kernel config. Newly introduced PHY drivers (DP83822,
DP83867, VSCPHY) and source files/PHY driver for FDT-enabled kernels
are missing. Without including them, kernel modules using any function
from dev/mii/mii_fdt.c refuse to load. Additionally, miivar.h directly
includes opt_platform.h.
Add the missing sources to the module build, with the FDT-only files
gated behind an OPT_FDT check. Maintain the alphabetical listing of
SRCS, but move the required header files to a separate line to improve
readability.
Reviewed by: mhorne, mindal@semihalf.com
MFC after: 1 week
Differential Revision: https://reviews.freebsd.org/D34256
For development, building a driver as kernel module is both convenient
and a time saver (no need for reboot on some change, testing it requires
just kldunload and kldload, a matter of seconds). For some special
cases, it may be even desirable to postpone initializing the network
interface after some action is done (loading a FPGA bitstream may be
required for Zynq/ZynqMP based hardware as an example).
Building is limited to ARM, ARM64 and RISC-V architectures (for Zynq,
ZynqMP, PolarFire Soc based boards, and HiFive based boards are known to
use CGEM at the moment).
Reviewed by: mhorne
MFC after: 1 week
Differential Revision: https://reviews.freebsd.org/D34687
In some cases vn_open_cred overwrites cn_flags, effectively nullifying
initialisation done in NDINIT. This will have to be fixed.
In the meantime make sure the flag is passed.
Reported by: jenkins
Noted by: Mathieu <sigsys@gmail.com>
When trying to avoid a CARP demotion during a pfsync service restart, I
noticed that a non-default value for the net.pfsync.carp_demotion_factor
sysctl was not being applied during the demotion. The CARP was always
demoted by 240.
After investigating, I realized that the sysctl was using VNET_NAME()
without the CTLFLAG_VNET.
PR: 262983
Reviewed by: kp
MFC after: 1 week
Differential Revision: https://reviews.freebsd.org/D34737
As in 4a22cd6c4e nf and rss should be
signed and not unsigned. Change the types in the header and while
here change a magic number to a define as done elsewhere (value does
not change).
When calculating c_rssi we need to make it relative so subtract nf.
And while here improve the debug output.
This will hopefully fix ifconfig wlanN list scan S:N output which
tools use to chose a BSSID and help net80211 internal calculations.
Sponsored by: The FreeBSD Foundation
MFC after: 3 days
This fixes userspace RDMA applications that would fail due to mmap
failure. The driver's mmap routine would succeed but later the
linux_compat.c mmap routine would fail because vma->vm_private_data
wasn't set properly.
This is catch-up with b633e08c70.
Reported by: Veeresh @ Chelsio
Sponsored by: Chelsio Communications
In the changes to get hystart++ into cubic an inadvertent line
was removed in the conditional to figure out if you need to exit
hystart++ back to slowstart. The line of course is the most crucial
one (the others are valid but not critical) i.e. is the new rtt
less than the point where we entered hystart++. Without the line
we end up bouncing in and out of CSS.
Reported By: Reese Enghardt
Sponsored By: Netflix Inc.
There is a case where rack will get stuck when it has outstanding data and
the peer collapses the rwnd down to 0. This leaves the session hung if
the rwnd update is not received. You can test this with the packet drill script
below. Without this fix it will be stuck and hang. With it we retransmit everything.
This also fixes the mtu retransmit case so we don't go into recovery when
the mtu is changed to a smaller value.
Reviewed by: tuexen
Sponsored by: Netflix Inc.
Differential Revision: https://reviews.freebsd.org/D34573
Add cap_lo and cap_hi sysctl to each nvme drive. This publishes the raw
capabilities of the drive. Now we can only discover these with
bootverbose.
Sponsored by: Netflix
Setting MPS in the CC should be a power of 2 number (it specifies the
page size of the host is 2^(12+MPS)), so adjust the calcuation. There is
no functional change because we do not support any architecutres != 4k
pages (yet). Other changes are needed for architectures with 16k or 64k
pages, especially when the underlying NVMe drive doesn't support that
page size (Most drives support a range that's small, and many only
support 4k), but let's at least do this calculation correctly. 12 - 12
is just as much 0 as 4096 >> 13 is :)
Sponsored by: Netflix
Reviewed by: mav
Differential Revision: https://reviews.freebsd.org/D34707
Add man pages for rtw88 and rtw88fw. Install a copy of the firmware
license file and hook up the driver and firmware modules to the build.
MFC after: 2 weeks
Sponsored by: The FreeBSD Foundation
Relnotes: yes
Import the most recent versions of the firmware images for the
rtw88 driver.
This is based on linux-firmware at 681281e49fb6778831370e5d94e6e1d97f0752d6.
The license of the firmware matches the previous rtwnfw(4) firmware
files (modulo a Copyright year) and you can find a copy in
sys/contrib/dev/rtw88fw/LICENCE.rtlwifi_firmware.txt.
Add build infrastructure to create the .ko files but do not yet hook
it up to the build until all parts are in the tree.
Approved by: core (imp)
MFC after: 2 weeks
Import rtw88 based on wireless-testing at
5d5d68bcff1f7ff27ba0b938a4df5849849b47e3 with adjustments for FreeBSD.
While our version of the driver has knowledge about the incapablity
of DMA above 4GB we do see errors if people have more than that
often already showing when laoding firmware.
The problem for that is currently believed to be outside this driver
so importing it anyway for now.
Given the lack of full license texts on non-local files this is
imported under the draft policy for handling SPDX files (D29226). [1]
Approved by: core (imp) [1]
MFC after: 2 weeks
FreeBSD detects serial ports twice: First, very early in the boot
process, in order to obtain a usable console; and second, during
the device probe/attach process. When a UART is discovered during
device probing, FreeBSD attempts to determine whether it is a
device which was already being used as a console; without this,
the console doesn't work in userland.
Unfortunately it's possible for a UART to be mapped to a different
location in memory when it is discovered on a bus than it has when
it is announced via the ACPI SPCR table; this breaks the matching
process, which relies on comparing bus addresses.
To address this, we introduce a concept of "unique" serial devices,
i.e. devices which are guaranteed to be present *only once* on any
system. If we discover one of these during device probing, we can
match it to a same-PCI-vendor-and-device-numbers console which was
announced via the ACPI SPCR table, regardless of the differing bus
addresses.
At present, the only unique serial device is the "Amazon PCI serial
device" (vendor 0x1d0f, device 0x8250) found in some EC2 instances.
This unbreaks the serial console on those systems.
Reviewed by: imp
MFC after: 3 days
Sponsored by: https://www.patreon.com/cperciva
Differential Revision: https://reviews.freebsd.org/D34703
In vm_phys_alloc_queues_contig, in the case that a sequence of
max-order blocks are sought to fulfill an allocation, a sequence is
ruled out if it does not have enough max-order blocks to satisfy the
allocation. However, there may be smaller blocks of free memory that
follow the last max-order block in the sequence, and they may be big
enough to complete the allocation request, so check for that
possibility before giving up on that block sequence.
Reviewed by: markj
Tested by: pho
Differential Revision: https://reviews.freebsd.org/D34724
stge_attach() could fail at line 464, sc->sc_spec remains NULL when
calling stge_detach(), thus bus_release_resources() at line 704 will
trigger null pointer dereference. We need to check the nulliness before
calling bus_release_resources().
PR: 258420
Reviewed by: markj
MFC after: 1 week
Differential Revision: https://reviews.freebsd.org/D34629
To avoid annoyng messages from LTP test suites add the simple
implementation of /proc/self/oom_score_adj which is do nothing.
Reviewed by: emaste
Differential revision: https://reviews.freebsd.org/D34710
MFC after: 2 weeks
Historically 32-bit Linuxulator under amd64 emulated the real i386
behavior. Since 3d8dd983 the old i386 Linux world can't be used under
amd64 Linuxulator as it don't know anything about amd64 machine (which
is returned now by newuname() syscall). So, add a knob to allow to swith
the behavior and use i386 Linux binaries on amd64.
Set knob to the new behavior as I think this is common to the modern
Linux distros.
Reviewed by: Pau Amma (doc), emaste
Differential revision: https://reviews.freebsd.org/D34708
MFC after: 2 weeks
Avoid calculating d_off value as it is specific to the underlying filesystem
and can be used by others API, like lseek(), seekdir() as input offset.
Differential revision: https://reviews.freebsd.org/D31551
MFC after: 2 weeks
Since Linux 5.4, if id is zero, then wait for any child that is in the same
process grop as the caller's process group.
Differential revision: https://reviews.freebsd.org/D31567
MFC after: 2 weeks
As FreeBSD does not have __WALL option bit analogue explicitly set all
possible option bits to emulate Linux __WALL wait option bit.
Reviewed by: emaste
Differential revision: https://reviews.freebsd.org/D31555
MFC after: 2 weeks
Compiling another driver on i386 revealed two problems:
- ieee80211_tx_info.status.status_driver_data space needs to be
calculated. While a pointer is 32bit vm_paddr_t is 64 bit on i386
so we didn't fit more than one of these in but needed more space.
- the arguments to ieee80211_txq_get_depth() are expected to
unsigned long and not uint64_t.
No user noticable changes.
Sponsored by: The FreeBSD Foundation
MFC after: 3 days
In the search for contiguous pages, as each page segment is examined,
check to see if the free list set for the next page segment differs
from the set for the current segment, and avoid a pointless search if
they do not differ.
Discussed with: alc
Reviewed by: markj
Tested by: pho
Differential Revision: https://reviews.freebsd.org/D33947
Diff reduction between mpr and mps.
Fixes: e2997a03b7 ("Diagnostic buffer fixes for the mps(4)...")
MFC after: 3 days
Sponsored by: The FreeBSD Foundation
The limit may later be updated by the "set limit" directive in pf.conf.
UMA does not permit a limit to be set on a zone after any items have
been allocated from a zone.
Other UMA zones used by pf do not appear to be susceptible to this
problem: they either set a limit at zone creation time or never set one
at all.
PR: 260406
Reviewed by: kp
MFC after: 3 days
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D34713
In vmbus_pcib_prepopulate_bars(), after writing all 1's to the
avialable device bars, those without being configured by device driver
are also set to its initialized values. However, this could cause
weird problem which results to device failure. The issue has been
reported to happen on LSI 9211-8i HBA card for DDA access on Hyper-V.
Writing back the orignal BAR values seem to work around this problem.
Reported by: Alexander Motin <mavbsd@gmail.com>
Tested by: Mathias Kraut <krautmaster@gmail.com>
Fixes: 75412a521f Hyper-V: vPCI: Prepopulate device bars
MFC after: 1 month
32-bit architectures other than i386 have 64-bit time_t which results
in a struct timespec with 12 bytes for tv_sec and tv_nsec, and 4 bytes
of padding. Zero the padding holes in struct stat32 and struct
freebsd11_stat32.
i386 has 32-bit time_t; struct timespec is 8 bytes and has no padding.
Found by inspection, prompted by a report by Reno Robert of Trend Micro
Zero Day Initiative. The originally reported issue (ZDI-CAN-14538) is
already fixed in all supported FreeBSD versions (it was addressed
incidentally as part of the 64-bit inode project).
Reviewed by: markj
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D34709
Possibility to do it was always a bug, but it runs into crashes
since recent introduction of a per-ruleset RB tree.
Reviewed by: kp
Sponsored by: Rubicon Communications, LLC ("Netgate")
Reported by: syzbot+665b700afc6f69f1766a@syzkaller.appspotmail.com
Variants of for_each_sg/for_each_sg_dma_page but they operate on sgtable
structs.
Needed by drm v5.10
MFC after: 1 week
Sponsored by: Beckhoff Automation GmbH & Co. KG
*_CFG_PAGE ioctl handlers in the mpr, mps, and mpt drivers allocated a
buffer of a caller-specified size, but copied to it a fixed size header.
Add checks that the size is at least the required minimum.
Note that the device nodes are owned by root:operator with 0640
permissions so the ioctls are not available to unprivileged users.
This change includes suggestions from scottl, markj and mav.
Two of the mpt cases were reported by Lucas Leong (@_wmliang_) of
Trend Micro Zero Day Initiative; scottl reported the third case in mpt.
Same issue found in mpr and mps after discussion with imp.
Reported by: Lucas Leong (@_wmliang_), Trend Micro Zero Day Initiative
Reviewed by: imp, mav
MFC after: 3 days
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D34692
Commit 867c27c23a modified the NFS client so that
it did IO_APPEND writes directly to the NFS server
bypassing the buffer cache, via a call to
nfs_directio_write(). Unfortunately, this (very old)
function assumed that the uio iov was for user space
addresses. As such, a IO_APPEND VOP_WRITE() that
was for system space, such as ktrace(1) does, would
write bogus data.
This patch fixes nfs_directio_write() so that it
handles kernel space uio iovs.
Reported by: bz
Tested by: bz
MFC after: 2 weeks
This more clearly differentiates system call arguments from integer
registers and return values. On current architectures it has no effect,
but on architectures where pointers are not integers (CHERI) and may
not even share registers (CHERI-MIPS) it is necessiary to differentiate
between system call arguments (syscallarg_t) and integer register values
(register_t).
Obtained from: CheriBSD
Reviewed by: imp, kib
Differential Revision: https://reviews.freebsd.org/D33780
Some filesystems do not fill out certain optional vattr fields. To
ensure that they do not get copied out to userspace uninitialized, use
VATTR_NULL to provide default values.
Reported by: KMSAN
MFC after: 2 weeks
Sponsored by: The FreeBSD Foundation
Some new AMD systems provide a HPET MMIO region smaller than the 1KB
specified, and a correspondingly small number of timers. Handle this in
the HPET driver rather than requiring a 1KB window. This allows the
HPET driver to attach on such systems.
PR: 262638
Reviewed by: markj
MFC after: 1 month
with md5 sum used as key.
This gets rid of the quadratic rule traversal when "keep_counters" is
set.
Reviewed by: kp
Sponsored by: Rubicon Communications, LLC ("Netgate")
Makes it cheaper to compare rules when "keep_counters" is set.
This also sets up keeping them in a RB tree.
Reviewed by: kp
Sponsored by: Rubicon Communications, LLC ("Netgate")