Boottrace is a facility for capturing trace events during boot and
shutdown. This includes kernel initialization, as well as rc. It has
been used by NetApp internally for several years, for catching and
diagnosing slow devices or subsystems. It is driven from userspace by
sysctl interface, and the output is a human-readable log of events
(kern.boottrace.log).
This commit adds the core boottrace functionality implementing these
interfaces. Adding the trace annotations themselves to kernel and
userland will happen in follow-up commits. A future commit will also add
a boottrace(4) man page.
For now, boottrace is unconditionally compiled into the kernel but
disabled by default. It can be enabled by setting the
kern.boottrace.enabled tunable to 1 in loader.conf(5).
There is an existing boot-time event tracing facility, which can be
compiled into the kernel with 'options TSLOG'. While there is some
functional overlap between this and boottrace, they are distinct. TSLOG
is suitable for generating detailed timing information and flamegraphs,
and has been used to great success by cperciva@ to diagnose and reduce
the overall system boot time. Boottrace aims to more quickly provide an
overview of timing and resource usage of the boot (and shutdown) process
to a sysadmin who requires this knowledge.
Sponsored by: NetApp, Inc.
Sponsored by: Klara, Inc.
X-NetApp-PR: #23
Differential Revision: https://reviews.freebsd.org/D30184
mmc_helper have an hard dependency on gpio_if.h
gpio(4) isn't in the default x86 kernel and none of the x86
sd/mmc drivers uses mmc_helper so just add a dependency on gpio.
Fixes: 85b3794cee ("files: Make ext_resources non-optional")
EXT_RESOURCES have been introduced in 12-CURRENT and all supported
releases have it enabled in their kernel config.
MFC after: 1 month
Differential Revision: https://reviews.freebsd.org/D33834
Some IIC multifunction devices may have multiple I2C addresses per chip, but
only the primary address is listed in the DT (e.g. MAX776200). In this case,
the sub-devices for the secondary addresses must be created manually with
fixed OFW parameters (node, name, compatibility string, IIC address).
Add a bus method to the ofw_iicbus interface that does this.
MFC after: 4 weeks
Deleting hack.c cause the kernel to always be out of date:
$ make kernel
make: /usr/src/sys/amd64/compile/GENERIC/.depend.hack.pico, 1:
ignoring stale .depend for hack.c
:> hack.c
cc -shared -O2 -pipe ... -nostdlib hack.c -o hack.pico
rm -f hack.c
MAKE="make" sh ../../../conf/newvers.sh "-R" GENERIC
cc -c -O2 -pipe ... -std=iso9899:1999 -Werror vers.c
ctfconvert -L VERSION -g vers.o
linking kernel.full
Keeping hack.c in the compile directory causes no harm,
so there's no reason to delete it.
Also rename the file to "force-dyamic-hack.c" so it is
clear what the hack is aboug.
Reviewed by: sjg
Differential Revision: https://reviews.freebsd.org/D34281
GCC is more pedantic than clang about warning when a function doesn't
handle undefined enum values (see GCC bug 87950). Clang's warning
gives a more pragmatic coverage and should find any real bugs, so
disable the warning for GCC rather than adding __unreachable
annotations to appease GCC.
Reviewed by: imp, emaste
Differential Revision: https://reviews.freebsd.org/D34147
Parts of zstd, used in openzfs and other places, trigger a new clang 14
-Werror warning:
```
sys/contrib/zstd/lib/decompress/huf_decompress.c:889:25: error: use of bitwise '&' with boolean operands [-Werror,-Wbitwise-instead-of-logical]
(BIT_reloadDStreamFast(&bitD1) == BIT_DStream_unfinished)
^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
```
While the warning is benign, it should ideally be fixed upstream and
then vendor-imported, but for now silence it selectively.
MFC after: 3 days
clang doesn't implement it, and Linux doesn't enforce it. As a
result, new instances keep cropping up both in FreeBSD's code and in
upstream sources from vendors.
Reviewed by: emaste
Differential Revision: https://reviews.freebsd.org/D34144
Reduce the burden to maintain correct and
extensible ECN related code across multiple
stacks and codepaths.
Formally no functional change.
Incidentially this establishes correct
ECN operation in one instance.
Reviewed By: rrs, #transport
Sponsored by: NetApp, Inc.
Differential Revision: https://reviews.freebsd.org/D34162
Reduce the burden to maintain correct and
extensible ECN related code across multiple
stacks and codepaths.
Formally no functional change.
Incidentially this establishes correct
ECN operation in one instance.
Reviewed By: rrs, #transport
Sponsored by: NetApp, Inc.
Differential Revision: https://reviews.freebsd.org/D34162
TLS RX support is modeled after TLS TX support. The basic structures and layouts
are almost identical, except that the send tag created filters RX traffic and
not TX traffic.
The TLS RX tag keeps track of past TLS records up to a certain limit,
approximately 1 Gbyte of TCP data. TLS records of same length are joined
into a single database record.
Regularly the HW is queried for TLS RX progress information. The TCP sequence
number gotten from the HW is then matches against the database of TLS TCP
sequence number records and lengths. If a match is found a static params WQE
is queued on the IQ and the hardware should immediately resume decrypting TLS
data until the next non-sequential TCP packet arrives.
Offloading TLS RX data is supported for untagged, prio-tagged, and
regular VLAN traffic.
MFC after: 1 week
Sponsored by: NVIDIA Networking
This change adds convenience functions to setup a flow steering rule based on
a TCP socket. The helper function gets all the address information from the
socket and returns a steering rule, to be used with HW TLS RX offload.
MFC after: 1 week
Sponsored by: NVIDIA Networking
Internal send queues are regular sendqueues which are reserved for WQE commands
towards the hardware and firmware. These queues typically carry resync
information for ongoing TLS RX connections and when changing schedule queues
for rate limited connections.
The internal queue, IQ, code is more or less a stripped down copy
of the existing SQ managing code with exception of:
1) An optional single segment memory buffer which can be read or
written as a whole by the hardware, may be provided.
2) An optional completion callback for all transmit operations, may
be provided.
3) Does not support mbufs.
MFC after: 1 week
Sponsored by: NVIDIA Networking
Add i2c support to linuxkpi. This is needed by drm-kmod.
For every i2c_adapter added by i2c_add_adapter we add a child to the
device named "lkpi_iic". This child handle the conversion between
Linux i2c_msgs to FreeBSD iic_msgs.
For every i2c_adapter added by i2c_bit_add_bus we add a child to the
device named "lkpi_iicbb". This child handle the conversion between
Linux i2c_msgs to FreeBSD iic_msgs.
With the help of iic(4), this expose the i2c controller to userspace
allowing a user to query DDC information from a monitor.
e.g.: i2c -f /dev/iic0 -a 0x28 -c 128 -d r
will query the standard EDID from the monitor if plugged.
The bitbang part (lkpi_iicbb) isn't tested at all for now as I don't have
compatible hardware (all my hardware have native i2c controller).
Tested on: Intel (SandyBridge, Skylake, ApolloLake)
Tested on: AMD (Picasso, Polaris (amd64 and arm64))
MFC after: 1 month
Reviewed by: hselasky
Sponsored by: Beckhoff Automation GmbH & Co. KG
Differential Revision: https://reviews.freebsd.org/D33053
This is intended to be used with forthcoming ice(4) driver version 1.34.2.
Signed-off-by: Eric Joyner <erj@FreeBSD.org>
Sponsored by: Intel Corporation
options IPSEC is already documented as requiring 'device crypto' and
duplicating the dependencies is harder to read and not always
consistent.
Reviewed by: markj
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D33990
This adds a wrapper around libsodium's curve25519 support matching
Linux's curve25519 API. The intended use case for this is WireGuard.
Note that this is not integrated with OCF as it is not related to
symmetric operations on data.
Reviewed by: markj
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D33935
These ciphers are now supported via OCF or 'struct enc_xform'.
Reviewed by: markj
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D33889
This is a synchronous software API which wraps the existing software
implementation shared with OCF. Note that this will not currently
use optimized backends (such as ossl(4)) but may be appropriate for
operations on small buffers.
Reviewed by: markj
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D33524
All supported Xen instances by FreeBSD provide a local APIC
implementation, so there's no need to replace the native local APIC
implementation anymore.
Leave just the ipi_vectored hook in order to be able to override it
with an implementation based on event channels if the underlying local
APIC is not virtualized by hardware. Note the hook cannot use ifuncs,
because at the point where ifuncs are resolved the kernel doesn't yet
know whether it will benefit from using the optimization.
Sponsored by: Citrix Systems R&D
Reviewed by: kib
Differential revision: https://reviews.freebsd.org/D33917
Prior to this commit, the TSC and local APIC frequencies were calibrated
at boot time by measuring the clocks before and after a one-second sleep.
This was simple and effective, but had the disadvantage of *requiring a
one-second sleep*.
Rather than making two clock measurements (before and after sleeping) we
now perform many measurements; and rather than simply subtracting the
starting count from the ending count, we calculate a best-fit regression
between the target clock and the reference clock (for which the current
best available timecounter is used). While we do this, we keep track
of an estimate of the uncertainty in the regression slope (aka. the ratio
of clock speeds), and stop measuring when we believe the uncertainty is
less than 1 PPM.
In order to avoid the risk of aliasing resulting from the data-gathering
loop synchronizing with (a multiple of) the frequency of the reference
clock, we add some additional spinning depending upon the iteration number.
For numerical stability and simplicity of implementation, we make use of
floating-point arithmetic for the statistical calculations.
On the author's Dell laptop, this reduces the time spent in calibration
from 2000 ms to 29 ms; on an EC2 c5.xlarge instance, it is reduced from
2000 ms to 2.5 ms.
Reviewed by: bde (previous version), kib
MFC after: 1 month
Sponsored by: https://www.patreon.com/cperciva
Differential Revision: https://reviews.freebsd.org/D33802
Pointer authentication allows userspace to add instructions to insert
a Pointer Authentication Code (PAC) into a register based on an address
and modifier and check if the PAC is correct. If the check fails it will
either return an invalid address or fault to the kernel.
As many of these instructions are a NOP when disabled and in earlier
revisions of the architecture this can be used, for example, to sign
the return address before pushing it to the stack making Return-oriented
programming (ROP) attack more difficult on hardware that supports them.
The kernel manages five 128 bit signing keys: 2 instruction keys, 2 data
keys, and a generic key. The instructions then use one of these when
signing the registers. Instructions that use the first four store the
PAC in the register being signed, however the instructions that use the
generic key store the PAC in a separate register.
Currently all userspace threads share all the keys within a process
with a new set of userspace keys being generated when executing a new
process. This means a forked child will share its keys with its parent
until it calls an appropriate exec system call.
In the kernel we allow the use of one of the instruction keys, the ia
key. This will be used to sign return addresses in function calls.
Unlike userspace each kernel thread has its own randomly generated.
Thread0 has a static key as does the early code on secondary CPUs.
This should be safe as there is minimal user interaction with these
threads, however we could generate random keys when the Armv8.5
Random number generation instructions are present.
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D31261
For consistancy with the kernel linker script also use ${MACHINE} for
finding the kernel module linker script. As we currently only use this
for amd64 and i386 this is a no-op, but I'm planning on using this with
arm64 where ${MACHINE} != ${MACHINE_ARCH}.
Reviewed by: markj, kib, imp
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D33841
Add driver for Marvell, Armada-37xx peripheral clock.
Register clocks for various peripheral devices in
north bridge or south bridge domain. Dump clock's
domain while verbose boot.
Reviewed by:
Obtained from: Semihalf
Differential revision: https://reviews.freebsd.org/D32294
Driver for tbg clocks. Read reference frequency from parent
and modify it depending on parameters read from register.
Reviewed by: manu
Obtained from: Semihalf
Differential revision: https://reviews.freebsd.org/D32293
Driver registers new clock device. Clock frequency is set depending
on tenth bit's value obtained from syscon register. Full information
about the clock is dumped if bootverbose is enabled.
Driver was tested on EspressoBin.
Reviewed by: manu
Obtained from: Semihalf
Differential revision: https://reviews.freebsd.org/D32292
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
CHANGES
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
Version : 1.26.6.0
Date : 01/03/2022
================================================================================
Fixes
-----
BASE:
- Fixed one module eeprom read failure.
- Fixed an issue with speed selection when 40G and 25G are advertised and
supported.
- Fixed a random traffic hang when T5 receives invalid ets BW in dcbx
messages from a switch.
- Fixed very long link up time with few switches.
================================================================================
Obtained from: Chelsio Communications
MFC after: 2 weeks
Sponsored by: Chelsio Communications
Expand on the terse comments for where each of these files is used.
Reviewed by: emaste
Sponsored by: Netflix
Differential Revision: https://reviews.freebsd.org/D33716
Remove sys/mips as the next step of decomissioning mips from the tree.
Remove mips special cases from the kernel make files. Remove the mips
specific linker scripts.
Sponsored by: Netflix
Add 802.11 compat code for mac80211 and to a minimal degree cfg80211.
This allows us to compile and use basic functionality of wireless
drivers such as iwlwifi.
This is a constant work in progress but having it in the tree will
allow others to test and more easy to track changes and avoid having
snapshots no longer applying to branches.
Sponsored by: The FreeBSD Foundation
MFC after: 3 days
Import a netdevice update complementing the last remaining bits of
the old ifnet derived implementation. Along add a (for now) task
based NAPI implementation.
This is the minimal set of chnages which are needed for the initial
support of wireless drivers. The NAPI implementation has an option to
still switch to "direct dispatch" as it had been used by these drivers
before not relying on a deferred context along with some printf tracing.
This has been helpful in the last weeks for debugging and will be
cleaned once we have had broader testing and are sure this is fine as-is.
Should we need a more time-sensitive or load-sensitive response
in the future we can always switch to something more sophisticated.
Sponsored by: The FreeBSD Foundation
MFC after: 3 days
X-Differential Revision: D33075 (abandoned without feedback a while ago)
This is a work-in-progress implementation of sk_buff compat code
used for wireless drivers only currently.
Bring in this version of the code as it has proven to be good enough
to have packets going for a few months.
The current implementation has several drawbacks including the need
for us to copy data between sk_buffs and mbufs.
Do not rely on the internals of this implementation. They are highly
likely to change as we will improve the integration to FreeBSD mbufs.
Sponsored by: The FreeBSD Foundation
MFC after: 3 days
readelf is not a bootstrap tool and so cannot be relied upon to exist.
On macOS there is no system readelf, and even on Linux or FreeBSD where
it does exist, BUILD_WITH_STRICT_TMPPATH builds won't be able to use it.
Instead of making it a bootstrap tool, just use nm as that suffices and
already is a bootstrap tool.
Fixes: 28482babd0 ("arm64: Use new arm_kernel_boothdr script for generating booti images.")
Reviewed by: emaste, mmel
MFC after: 1 week
Differential Revision: https://reviews.freebsd.org/D32734
This will be used by the vdso signal trampoline on arm64.
While here fix the license as this part of locore.S to correct the
copyright owner.
Sponsored by: The FreeBSD Foundation