If a counter more than overflows just as we read it on switch out then,
if using sampling mode, we will negate this small value to give a huge
reload count, and if we later switch back in that context we will
validate that value against pm_reloadcount and panic an INVARIANTS
kernel with:
panic: [pmc,1470] pmcval outside of expected range cpu=2 ri=16 pmcval=fffff292 pm_reloadcount=10000
or similar. Presumably in a non-INVARIANTS kernel we will instead just
use the provided value as the reload count, which would lead to the
overflow not happing for a very long time (e.g. 78 minutes for a 48-bit
counter incrementing at an averate rate of 1GHz).
Instead, clamp the reload count to 0 (which corresponds precisely to the
value we would compute if it had just overflowed and no more), which
will result in hwpmc using the full original reload count again. This is
the approach used by core for Intel (for both fixed and programmable
counters).
As part of this, armv7 and arm64 are made conceptually simpler; rather
than skipping modifying the overflow count for sampling mode counters so
it's always kept as ~0, those special cases are removed so it's always
applicable and the concatentation of it and the hardware counter can
always be viewed as a 64-bit counter, which also makes them look more
like other architectures.
Whilst here, fix an instance of UB (shifting a 1 into the sign bit) for
amd in its sign-extension code.
Reviewed by: andrew, mhorne, kib
MFC after: 1 week
Differential Revision: https://reviews.freebsd.org/D33654
With __diagused, these warnings were still emitted since INVARIANTS was
defined but TWS_DEBUG was not.
Fixes: a21f086a3316 ("Fix "set but not used" in the tws driver")
Differential Revision: https://reviews.freebsd.org/D33784
Admin queues almost always have several ASYNC_EVENT_REQUEST outstanding.
They have no timeouts, but their presence in qpair->outstanding_tr caused
useless timeout callout rearming twice a second.
While there, relax timeout callout period from 0.5s to 0.5-1s to improve
aggregation. Command timeouts are measured in seconds, so we don't need
to be precise here.
Reviewed by: imp
MFC after: 2 weeks
Differential Revision: https://reviews.freebsd.org/D33781
It is one of the few remaining Giant-locked callouts. It would be
good to remove it, not mentioning that polling itself is not good.
If this cause keyboard/mouse freezes on some hardware, please set
loader tunable hw.atkbd.hz=1 as workaround and report the issue.
Submitted by: imp, jhb
This eliminates error messages like this from the driver when running at
50Gbps with 100G cables:
[3726] cc0: l1cfg failed: 71
[4407] cc0: l1cfg failed: 71
Note that link comes up anyway with or without this change.
Reported by: Suhas Lokesha @ Chelsio
MFC after: 1 week
Sponsored by: Chelsio Communications
In my understanding this is only needed to workaround lost interrupts.
I was thinking to remove it completely, but the comment about edge-
triggered interrupt may be true and needs deeper investigation. ~1Hz
should be often enough to handle the supposedly rare loss cases, but
rare enough to not appear in top. Add sysctl hw.atkbd.hz to tune it.
MFC after: 1 month
The saf1761 OTG support was only for mips targets (BERI?). Retire it.
Sponsored by: Netflix
Reviewed by: brooks
Differential Revision: https://reviews.freebsd.org/D33706
This matches icl_conn_task_setup() which passes the PDU and avoids the
need for a layering violation in cxgbei to fetch the request PDU from
the ctl_io.
Reviewed by: mav
Sponsored by: Chelsio Communications
Differential Revision: https://reviews.freebsd.org/D33746
Previously the driver duplicated code from cryptosoft.c to handle
certain edge case AES-CCM and AES-GCM requests. However, this
approach has a few downsides:
1) It only uses "plain" software and not accelerated software since it
uses enc_xform directly.
2) It performs the operation synchronously even though the caller
believes it is invoking an async driver. This was fine for the
original use case of requests with only AAD and no payload that
execute quickly, but is a bit more disingenuous for large requests
which fall back due to exceeding the size of a firmware work
request (e.g. due to large scatter/gather lists).
3) It has required several updates since ccr(4) was added to the tree.
Instead, allocate a software session for AES-CCM and AES-GCM sessions
and dispatch a cloned request asynchronusly to the software session.
Reviewed by: markj
Sponsored by: Chelsio Communications
Differential Revision: https://reviews.freebsd.org/D33608
Make neta use frequency obtained from clock, instead of
hardcoded one. Use default frequency in case of clock
device failure. Remove unnecessary function calls
to obtain frequency and use cached one instead.
Reviewed by: manu
Obtained from: Semihalf
Differential revision: https://reviews.freebsd.org/D32336
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
CHANGES
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
Version : 1.26.6.0
Date : 01/03/2022
================================================================================
Fixes
-----
BASE:
- Fixed one module eeprom read failure.
- Fixed an issue with speed selection when 40G and 25G are advertised and
supported.
- Fixed a random traffic hang when T5 receives invalid ets BW in dcbx
messages from a switch.
- Fixed very long link up time with few switches.
================================================================================
Obtained from: Chelsio Communications
MFC after: 2 weeks
Sponsored by: Chelsio Communications
This fixes a driver panic during stats collection when a port's id does
not match its tx channel. The bug affected only the T580 card running
with a non-default VPD.
Reported by: Suhas Lokesha @ Chelsio
MFC after: 1 week
Sponsored by: Chelsio Communications
Now that each module handles its global and VNET initialization
itself, there is no VNET related stuff left to do in domain_init().
Differential revision: https://reviews.freebsd.org/D33541
The historical BSD network stack loop that rolls over domains and
over protocols has no advantages over more modern SYSINIT(9).
While doing the sweep, split global and per-VNET initializers.
Getting rid of pr_init allows to achieve several things:
o Get rid of ifdef's that protect against double foo_init() when
both INET and INET6 are compiled in.
o Isolate initializers statically to the module they init.
o Makes code easier to understand and maintain.
Reviewed by: melifaro
Differential revision: https://reviews.freebsd.org/D33537
Define simple functions for alignment and boundary checks and use them
everywhere instead of having slightly different implementations
scattered about. Define them in vm_extern.h and use them where
possible where vm_extern.h is included.
Reviewed by: kib, markj
Differential Revision: https://reviews.freebsd.org/D33685
The introduction of <sched.h> improved compatibility with some 3rd
party software, but caused the configure scripts of some ports to
assume that they were run in a GLIBC compatible environment.
Parts of sched.h were made conditional on -D_WITH_CPU_SET_T being
added to ports, but there still were compatibility issues due to
invalid assumptions made in autoconfigure scripts.
The differences between the FreeBSD version of macros like CPU_AND,
CPU_OR, etc. and the GLIBC versions was in the number of arguments:
FreeBSD used a 2-address scheme (one source argument is also used as
the destination of the operation), while GLIBC uses a 3-adderess
scheme (2 source operands and a separately passed destination).
The GLIBC scheme provides a super-set of the functionality of the
FreeBSD macros, since it does not prevent passing the same variable
as source and destination arguments. In code that wanted to preserve
both source arguments, the FreeBSD macros required a temporary copy of
one of the source arguments.
This patch set allows to unconditionally provide functions and macros
expected by 3rd party software written for GLIBC based systems, but
breaks builds of externally maintained sources that use any of the
following macros: CPU_AND, CPU_ANDNOT, CPU_OR, CPU_XOR.
One contributed driver (contrib/ofed/libmlx5) has been patched to
support both the old and the new CPU_OR signatures. If this commit
is merged to -STABLE, the version test will have to be extended to
cover more ranges.
Ports that have added -D_WITH_CPU_SET_T to build on -CURRENT do
no longer require that option.
The FreeBSD version has been bumped to 1400046 to reflect this
incompatible change.
Reviewed by: kib
MFC after: 2 weeks
Relnotes: yes
Differential Revision: https://reviews.freebsd.org/D33451
Match igb(4) as in f7926a6d0c10. From Vincenzo, this check is redundant
to setup providing us an IGC_RXD_STAT_VP bit and would make for an
unexpected condition if IFCAP_VLAN_HWTAGGING were not set but the tag
was stripped, which would be passed up the stack breaking isolation.
PR: 260068
Approved by: vmaffione
MFC after: 1 month
sge->ctrlq is not always allocated during attach (eg. if firmware
initialization fails) and detach should be able to deal with this.
MFC after: 1 week
Sponsored by: Chelsio Communications
The logic that sets iri_vtag and M_VLANTAG does not handle the
case where the 802.11q VLAN tag is 0. Fix this issue across
the iflib drivers. While there, also improve and align the
VLAN tag check extraction, by moving it outside the RX descriptor
loop, eliminating a local variable and additional checks.
PR: 260068
Reviewed by: kbowling, gallatin
Reported by: erj
MFC after: 1 month
Differential Revision: https://reviews.freebsd.org/D33156
Since isc_capenable (private copy of ifp->if_capenable) is
now synchronized to if_capenable, use it in the drivers
when checking the IFCAP_* bits.
This results in better cache usage and avoids indirection
through the ifp pointer.
PR: 260068
Reviewed by: kbowling, gallatin
MFC after: 1 week
Differential Revision: https://reviews.freebsd.org/D33156
This adds some very simple DWC3 glue for the IPQ4018/IPQ4019.
Other chipsets introduce reset line iteration, some further
clock line iteration and some customisations; I'll look at adding
those later.
This is enough to finally bring up USB 3.0 on my IPQ4018 ASUS
RT-58U router.
The Qualcomm TCSR is some top level glue between multiple IP blocks,
both for doing configuration of said IP blocks, some IPC between
them (mostly between multiple execution environments - eg trustzone
and non-TZ), and interrupt status bits for them.
However, for the IPQ4018/IPQ4019, it only is used as a small subset
of IP block configuration. As for what it actually gets used as
for other Qualcomm chipsets? Well, that'll have to wait.
It's a bit of a mess in linux and openwrt. See, every different
SoC support branch ends up with some different TCSR code for it.
So instead, I'm going to land a single TCSR driver that I'm going
to use for the IPQ4018/IPQ4019. When I add the next chipset, I'll
figure out how to organise things so there's a single TCSR driver
that works for multiple platforms.
The Qualcomm Universal Peripherals Engine (QUP) is a unified SPI and I2C
peripheral that ships with a variety of Qualcomm SoCs.
It supports three transfer modes - single PIO, block PIO and DMA.
This driver only supports the single PIO mode, which is enough to
bootstrap the rest of the SPI NAND/NOR support and means I can do
things like read the Wifi calibration data from NOR. It has some
hardware support code for the other transfer modes as well as
some support for split transfers (ie, transfers with no read or
write phase), but I haven't yet implemented those.
This driver is based on four sources - the linux driver, the u-boot
driver, some initial work done for APQ8064 by mmel@, and the APQ8064
Technical Reference Manual which is surprisingly free and open to
read. The linux and u-boot drivers approach a variety of things
completely differently, from how PIO is done, the hardware support
for re-ordering bytes in a transfer word and how the CS lines
are used.
Tested:
* IPQ4018, SPI to NAND/NOR flash, PIO only
Summary: I've tested this with cpufreq_dt, SPI and USB. They all seem to work fine.
Test Plan: * IPQ4018, boot
Subscribers: imp, andrew
Differential Revision: https://reviews.freebsd.org/D33665
These clock nodes are used by the IPQ4018/IPQ4019 and derivatives.
They're also used by other 32 and 64 bit qualcomm parts; so it's
best to put these nodes here in a single qcom_clk driver and add
to it as we grow new Qualcomm SoC support.
Tested:
* IPQ4018, boot
Differential Revision: https://reviews.freebsd.org/D33665
Summary:
The existing call can only really be used for a node wishing to
configure its parent, but as we don't pass in a pointer to the freq,
we can't set it to what it would be for a DRY_RUN pass.
So for clock nodes that wish to try setting parent frequencies to see
which would be the best for its own target frequency, we really do need
a way to call in and pass in a flag /and/ a pointer to freq so it can be
updated for us as the clock tree is recursed through.
Reviewers: manu
Approved by: manu
Subscribers: imp
Differential Revision: https://reviews.freebsd.org/D33445
The advanced TCP stacks (bbr, rack) may decide to drop a TCP connection
when they do output on it. The default stack never does this, thus
existing framework expects tcp_output() always to return locked and
valid tcpcb.
Provide KPI extension to satisfy demands of advanced stacks. If the
output method returns negative error code, it means that caller must
call tcp_drop().
In tcp_var() provide three inline methods to call tcp_output():
- tcp_output() is a drop-in replacement for the default stack, so that
default stack can continue using it internally without modifications.
For advanced stacks it would perform tcp_drop() and unlock and report
that with negative error code.
- tcp_output_unlock() handles the negative code and always converts
it to positive and always unlocks.
- tcp_output_nodrop() just calls the method and leaves the responsibility
to drop on the caller.
Sweep over the advanced stacks and use new KPI instead of using HPTS
delayed drop queue for that.
Reviewed by: rrs, tuexen
Differential revision: https://reviews.freebsd.org/D33370