These are the changes since the last update (copy-pasted from the
release notes for Chelsio Unified Wire v3.18.0.0):
====================
Version : 1.27.3.0
Date : 04/07/2023
Fixes
-----
BASE:
- Fixed a hang if a module eeprom read gives invalid data.
- KR backplane no-fec link problem fixed.
OFLD:
- iscsi ddp errors fixed.
- iwarp connection abort in rare cases causing NIC traffic hang fixed.
ENHANCEMENTS
------------
BASE:
- Cisco GLC-TE 1G modules support added.
====================
Version : 1.27.1.0
Date : 12/02/2022
Fixes
-----
BASE:
- memwrite dsgl cannot be used for T5.
OFLD:
- Enabled FCoE in SO adapters.
- TOE-TLS crash fixed.
- iscsi hang fixed.
MFC after: 2 weeks
Sponsored by: Chelsio Communications
Have more accurate comments. While #if, #else, etc are copied to the
header files, lines that don't start with # are not. And #include files
are only output to sysinc (which winds up at the front of init_sysent.c
which seems a bit odd). This is all radically undocumented, and likely
has drifted somewhat from 4.4BSD and what other systems do (they've
drifted too, fwiw).
Sponsored by: Netflix
luacheck pointed out two minor issues: line isn't declared as a global,
so declare it local. Also remove an unused parameter.
Suggested by: kevans
Sponsored by: Netflix
x["y"] can be written as x.y, which looks better and is a more typical
Lua idiom.
Sponsored by: Netflix
Reviewed by: kevans
Differential Revision: https://reviews.freebsd.org/D39709
This change touches both kernel and netstat(1), but either of the changes
will fix printing pcb addresses with -A.
The thing is that historically netstat(1) treated TCP differently, and
printed tcpcb address instead of inpcb address. This is not documented
anywhere! With e68b379244 these two addresses became the same. It is
highly likely they will be the same for a long time, but it might be they
will start to differ again in a far future. My proposal is to stop
treating TCP differently with netstat(1) and right now is a good opportunity
to do that, since there will be no behavior change at all. The kernel
change to tcp_inptoxtp() will go into stable/14 to make it compatible with
netstat(1) binary from stable/13. We can drop it later, probably together
with in_ppcb pointer from inpcb. The in_ppcb in xinpcb will stay for size
compatibility.
Reviewed by: tuexen, rrs
Differential Revision: https://reviews.freebsd.org/D39736
Add DPAA2 console support for MC and AIOP (latter untested) for FDT
systems. ACPI systems are prepared but need some proper bus function
in order to get the address from MC (and likely a file splitup then).
This will come at a later stage once other ACPI/FDT bus parts are
cleared up.
The work was originally done in July 2022 and finally switched to
bus_space[1] lately to be ready for main.
Suggested by: andrew [1]
Reviewed by: dsl
MFC after: 2 weeks
Differential Revision: https://reviews.freebsd.org/D38592
dtrace_instr_size() is needed by the forthcoming RISC-V port of kinst,
as well as by libdtrace in D38825 for both amd64 and RISC-V.
Reviewed by: markj, mhorne
MFC after: 1 week
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D39489
Callers are specifying uint8_t anyway and this slightly reduces
dependencies on compatibility typedefs. No functional change intended.
Reviewed by: markj, mhorne
MFC after: 1 week
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D39490
Now that the inp_cred pointer is accessed only while the inpcb lock is
held, we can avoid deferring a crfree() call when freeing an inpcb.
This fixes a problem introduced when inpcb hash tables started being
synchronized with SMR: the credential reference previously could not be
released until all lockless readers have drained, and there is no
mechanism to explicitly purge cached, freed UMA items. Thus, ucred
references could linger indefinitely, and since ucreds hold a jail
reference, the jail would linger indefinitely as well. This manifests
as jails getting stuck in the DYING state.
Discussed with: glebius
Tested by: glebius
Sponsored by: Klara, Inc.
Sponsored by: Modirum MDPay
Differential Revision: https://reviews.freebsd.org/D38573
The SMR-protected inpcb lookup algorithm currently has to check whether
a matching inpcb belongs to a jail, in order to prioritize jailed
bound sockets. To do this it has to maintain a ucred reference, and for
this to be safe, the reference can't be released until the UMA
destructor is called, and this will not happen within any bounded time
period.
Changing SMR to periodically recycle garbage is not trivial. Instead,
let's implement SMR-synchronized lookup without needing to dereference
inp_cred. This will allow the inpcb code to free the inp_cred reference
immediately when a PCB is freed, ensuring that ucred (and thus jail)
references are released promptly.
Commit 220d892129 ("inpcb: immediately return matching pcb on lookup")
gets us part of the way there. This patch goes further to handle
lookups of unconnected sockets. Here, the strategy is to maintain a
well-defined order of items within a hash chain so that a wild lookup
can simply return the first match and preserve existing semantics. This
makes insertion of listening sockets more complicated in order to make
lookup simpler, which seems like the right tradeoff anyway given that
bind() is already a fairly expensive operation and lookups are more
common.
In particular, when inserting an unconnected socket, in_pcbinhash() now
keeps the following ordering:
- jailed sockets before non-jailed sockets,
- specified local addresses before unspecified local addresses.
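The ordering rule above can be sketched in userland C. This is only an
illustration, not the in_pcbinhash() code: the hyp_* names, the
simplified PCB structure, and the choice of jail status as the primary
sort key are assumptions made for the example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, heavily simplified stand-in for an unconnected inpcb. */
struct hyp_pcb {
	bool		 jailed;	/* owned by a jail? */
	bool		 laddr_set;	/* bound to a specific local address? */
	uint32_t	 laddr;		/* local address, if laddr_set */
	struct hyp_pcb	*next;		/* hash chain linkage */
};

/*
 * Lower rank sorts earlier in the chain.  Assumption: jail status is the
 * primary key and "specific local address" is the secondary key.
 */
int
hyp_pcb_rank(const struct hyp_pcb *p)
{
	return ((p->jailed ? 0 : 2) + (p->laddr_set ? 0 : 1));
}

/* Insert while keeping the chain sorted by rank. */
void
hyp_pcb_insert(struct hyp_pcb **head, struct hyp_pcb *p)
{
	struct hyp_pcb **pp;

	assert(head != NULL && p != NULL);
	for (pp = head; *pp != NULL; pp = &(*pp)->next) {
		if (hyp_pcb_rank(*pp) > hyp_pcb_rank(p))
			break;
	}
	p->next = *pp;
	*pp = p;
}

/* Because the chain is ordered, a wild lookup returns the first match. */
struct hyp_pcb *
hyp_pcb_lookup_wild(struct hyp_pcb *head, uint32_t laddr)
{
	struct hyp_pcb *p;

	for (p = head; p != NULL; p = p->next) {
		if (!p->laddr_set || p->laddr == laddr)
			return (p);
	}
	return (NULL);
}
```

With this layout the cost of choosing the preferred socket is paid at
insertion time, so the lookup does not need to compare candidates.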
Most of the change adds a separate SMR-based lookup path for inpcb hash
lookups. When a match is found, we try to lock the inpcb and
re-validate its connection info. In the common case, this works well
and we can simply return the inpcb. If this fails, typically because
something is concurrently modifying the inpcb, we go to the slow path,
which performs a serialized lookup.
Note, I did not touch lbgroup lookup, since there the credential
reference is formally synchronized by net_epoch, not SMR. In
particular, lbgroups are rarely allocated or freed.
I think it is possible to simplify in_pcblookup_hash_wild_locked() now,
but I didn't do it in this patch.
Discussed with: glebius
Tested by: glebius
Sponsored by: Klara, Inc.
Sponsored by: Modirum MDPay
Differential Revision: https://reviews.freebsd.org/D38572
These functions will get some additional callers in future revisions.
No functional change intended.
Discussed with: glebius
Tested by: glebius
Sponsored by: Modirum MDPay
Sponsored by: Klara, Inc.
Differential Revision: https://reviews.freebsd.org/D38571
Currently we use a single hash table per PCB database for connected and
bound PCBs. Since we started using net_epoch to synchronize hash table
lookups, there's been a bug, noted in a comment above in_pcbrehash():
connecting a socket can cause an inpcb to move between hash chains, and
this can cause a concurrent lookup to follow the wrong linkage pointers.
I believe this could cause rare, spurious ECONNREFUSED errors in the
worst case.
Address the problem by introducing a second hash table and adding more
linkage pointers to struct inpcb. Now the database has one table each
for connected and unconnected sockets.
When inserting an inpcb into the hash table, in_pcbinhash() now looks at
the foreign address of the inpcb to figure out which table to use. This
ensures that queue linkage pointers are stable until the socket is
disconnected, so the problem described above goes away. There is also a
small benefit in that in_pcblookup_*() can now search just one of the
two possible hash buckets.
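As a rough illustration of the table selection described above (not the
actual in_pcbinhash() code), the decision can be modelled as below. The
hyp_* names and the use of 0 as the "no foreign address" value are
assumptions for this sketch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for an unspecified foreign address. */
#define	HYP_ADDR_ANY	0u

enum hyp_table {
	HYP_TABLE_UNCONNECTED,	/* bound or listening, no peer yet */
	HYP_TABLE_CONNECTED	/* foreign address is set */
};

struct hyp_conninfo {
	uint32_t	faddr;	/* foreign address */
	uint16_t	fport;	/* foreign port */
};

/*
 * Pick the hash table from the foreign address alone, so a PCB stays in
 * the same table (and on the same chain) until it is disconnected.
 */
enum hyp_table
hyp_choose_table(const struct hyp_conninfo *ci)
{
	assert(ci != NULL);
	return (ci->faddr == HYP_ADDR_ANY ?
	    HYP_TABLE_UNCONNECTED : HYP_TABLE_CONNECTED);
}
```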
I also made the "rehash" parameter of in(6)_pcbconnect() unused. This
parameter seems confusing and it is simpler to let the inpcb code figure
out what to do using the existing INP_INHASHLIST flag.
UDP sockets pose a special problem since they can be connected and
disconnected multiple times during their lifecycle. To handle this, the
patch plugs a hole in the inpcb structure and uses it to store an SMR
sequence number. When an inpcb is disconnected - an operation which
requires the global PCB database hash lock - the write sequence number
is advanced, and in order to reconnect, the connecting thread must wait
for readers to drain before reusing the inpcb's hash chain linkage
pointers.
raw_ip (ab)uses the hash table without using the corresponding
accessors. Since there are now two hash tables, it arbitrarily uses the
"connected" table for all of its PCBs. This will be addressed in some
way in the future.
inp iterators which specify a hash bucket will only visit connected
PCBs. This is not really correct, but nothing in the tree uses that
functionality except raw_ip, which as mentioned above places all of its
PCBs in the "connected" table and so is unaffected.
Discussed with: glebius
Tested by: glebius
Sponsored by: Klara, Inc.
Sponsored by: Modirum MDPay
Differential Revision: https://reviews.freebsd.org/D38569
Move a KASSERT out of a function and make it a CTASSERT with
appropriate comments.
Add skeleton implementations of two TKIP functions, still left TODO,
initializing variables with dummy values to quieten compiler warnings. It is
unclear to me if we should still ever properly implement TKIP
compat code at this point. If so the current code gives a good
idea what needs to be done in addition to allocating references
to real state along with keyconf.
Sponsored by: The FreeBSD Foundation
MFC after: 3 days
Quieten some more (valid) gcc warnings and disable dead code.
There are more warnings, some probably a compiler problem, the
other related to firmware structs which I do not want to adjust
just locally. Leave a comment to revisit after a next driver
update.
Sponsored by: The FreeBSD Foundation
MFC after: 3 days
Rather than using ACCESS_ONCE() in READ_ONCE() add a missing cast
to const in order to satisfy -Wcast-qual by gcc.
Sadly we cannot do the same to WRITE_ONCE() which still is very
noisy.
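A minimal sketch of the idea, assuming a GNU-compatible compiler for
__typeof__. HYP_READ_ONCE and hyp_load_counter are hypothetical names
for illustration and are not the LinuxKPI definitions.

```c
#include <assert.h>

/*
 * Read through a const volatile pointer.  Keeping "const" in the cast
 * means no qualifier is dropped when x itself is const, which is the
 * kind of cast gcc flags when qualifier-cast warnings are enabled.
 */
#define	HYP_READ_ONCE(x)	(*(const volatile __typeof__(x) *)&(x))

/* Example caller reading through a pointer to const. */
int
hyp_load_counter(const int *counter)
{
	assert(counter != 0);
	return (HYP_READ_ONCE(*counter));
}
```

The same trick does not carry over to a write macro, since the target of
a store cannot be const.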
Sponsored by: The FreeBSD Foundation
MFC after: 3 days
Reviewed by: hselasky
Differential Revision: https://reviews.freebsd.org/D39706
This change is required to support interface renaming via Netlink.
No functional changes intended.
Reviewed by: zlei
Differential Revision: https://reviews.freebsd.org/D39692
MFC after: 2 weeks
We are asserting that two values from different enums are the same.
gcc warns about these. Cast the values to (int) to avoid the warning.
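For illustration, the pattern looks roughly like this. The two enums and
the conversion helper are hypothetical and stand in for the real types.

```c
#include <assert.h>

/* Two hypothetical enums that are expected to stay in sync. */
enum hyp_fw_band  { HYP_FW_BAND_2GHZ = 0,  HYP_FW_BAND_5GHZ = 1 };
enum hyp_drv_band { HYP_DRV_BAND_2GHZ = 0, HYP_DRV_BAND_5GHZ = 1 };

/*
 * Comparing values of different enum types directly can trigger an
 * enum-comparison warning; casting both sides to int states the intent.
 */
_Static_assert((int)HYP_FW_BAND_2GHZ == (int)HYP_DRV_BAND_2GHZ,
    "2GHz band values must match");
_Static_assert((int)HYP_FW_BAND_5GHZ == (int)HYP_DRV_BAND_5GHZ,
    "5GHz band values must match");

/* With the values checked at compile time, conversion is a plain cast. */
enum hyp_drv_band
hyp_fw_to_drv_band(enum hyp_fw_band band)
{
	return ((enum hyp_drv_band)(int)band);
}
```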
Sponsored by: The FreeBSD Foundation
MFC after: 3 days
Harmonize sk_buff_head and sk_buff further and fix -Warray-bounds
warnings reported by gcc. At the same time simplify some code by
re-using other functions or factoring some code out.
Sponsored by: The FreeBSD Foundation
MFC after: 3 days
A signed one-bit wide bit-field can take only the values 0 and -1. Clang 16
introduced a warning that "implicit truncation from 'int' to a one-bit
wide bit-field changes value from 1 to -1". Fix by using c99 bool.
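The difference can be seen in a small standalone example. The struct and
function names are hypothetical; the -1 result for the signed field is
what GCC and Clang produce (the conversion is implementation-defined in
ISO C).

```c
#include <assert.h>
#include <stdbool.h>

/* Before: a signed one-bit field can only represent 0 and -1. */
struct hyp_old_state {
	signed int	active : 1;
};

/* After: a bool bit-field holds 0 and 1 as intended. */
struct hyp_new_state {
	bool		active : 1;
};

/* Store v in the signed field and read it back. */
int
hyp_old_store(int v)
{
	struct hyp_old_state s = { 0 };

	s.active = v;		/* 1 is truncated to -1 with GCC/Clang */
	return (s.active);
}

/* Store v in the bool field and read it back. */
int
hyp_new_store(int v)
{
	struct hyp_new_state s = { false };

	s.active = (v != 0);
	return (s.active);
}
```

Code that tests the old field with "if (s.active)" still works, but a
comparison such as "s.active == 1" never matches.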
Reported by: Clang
Reviewed by: emaste, wulf
MFC after: 3 days
Differential Revision: https://reviews.freebsd.org/D39665
Two vfs.cache.stats names are fixed:
- s/.dotdothis/.dotdothits/
- s/.posszaps/.poszaps/
Signed-off-by: Igor Ostapenko <pm@igoro.pro>
[mjg: massaged the header a little bit]
When doing request level BB logging, hybrid_bw_log() does not have proper screening to minimize logging
when point level logging is in use. Let's fix it properly so that you have to have the proper knobs set to get the
noisier logging.
Reviewed by: tuexen
Sponsored by: Netflix Inc
Differential Revision: https://reviews.freebsd.org/D39699
It turns out that the check to see if we can do output is in the wrong place. We need
to jump off to the compressed acks before handling that case, since th is NULL in the
compressed ack case, which is handled differently anyway.
Reviewed by: tuexen
Sponsored by: Netflix Inc
Differential Revision: https://reviews.freebsd.org/D39690
holds some nice stats about why/how the connection ended. Though with the current code it does not
come out without accounting due to the placement of the ifdefs. Also we need to make sure the stack's
fini has run before calling in from tcp_subr so we get all logs the stack may make at its ending.
Reviewed by: rscheff
Sponsored by: Netflix Inc
Differential Revision: https://reviews.freebsd.org/D39693
t4_dump_stag to dump hw state for a known STAG.
t4_dump_all_stag to dump hw state for all valid STAGs. This routine
walks the entire STAG region looking for valid entries and this can take
a while for some configurations.
MFC after: 1 week
Sponsored by: Chelsio Communications
struct dpaa2_cmd is no longer malloc'ed, but can be allocated on stack
and initialized with DPAA2_CMD_INIT() on demand. Drivers stopped caching
their DPAA2 command objects (and associated tokens) in the software
contexts in order to avoid using them concurrently.
Reviewed by: bz
Approved by: bz (mentor)
MFC after: 3 weeks
Differential Revision: https://reviews.freebsd.org/D39509
We need to make the syncache aware of the flag and not do ECN if it's set. Note that this
is not 100% foolproof but is the best we can do (i.e. it's still possible that you can get into a
situation where the peer tries to do ECN).
Reviewed by: tuexen, glebius, rscheff
Sponsored by: Netflix Inc
Differential Revision: https://reviews.freebsd.org/D39672
Both pf_rules_lock and pf_ioctl_lock only ever affect one vnet, so
there's no point in having these locks affect other vnets.
(In fact, the only lock in pf that can affect multiple vnets is
pf_end_lock.)
That's especially important for the rules lock, because taking the write
lock suspends all network traffic until it's released. This will reduce
the impact a vnet running pf can have on other vnets, and improve
concurrency on machines running multiple pf-enabled vnets.
Reviewed by: zlei
Sponsored by: Rubicon Communications, LLC ("Netgate")
Differential Revision: https://reviews.freebsd.org/D39658
The explanation from https://reviews.freebsd.org/D39637 by stevek:
The "use_xsave" variable is a global and that is only supposed to be
initialized early before scheduling gets started. However, with the way
the ifuncs for "fpusave" and "fpurestore" are implemented, the value
could be changed at runtime when scheduling is active if "use_xsave"
was set to 0 by the tunable. This leaves a window of opportunity where
"use_xsave" gets re-initialized to 1 and a context switch could occur
with a thread that was not set up to be able to use xsave functionality.
This can lead to a "privileged instruction fault".
The fix is to protect "use_xsave" from being initialized more than once.
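The shape of such a guard can be sketched as follows. This is not the
amd64 code: hyp_use_xsave, hyp_xsave_init_done, and hyp_fpu_init_xsave()
are hypothetical names, and the CPU and tunable inputs are passed as
plain parameters for the sake of a standalone example.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for the global and its one-time guard. */
int		hyp_use_xsave = 1;
static bool	hyp_xsave_init_done;

/*
 * Decide once whether xsave is used.  Later calls, for example from a
 * resolver that runs after scheduling has started, leave the value
 * alone, so it cannot flip back to 1 underneath running threads.
 */
void
hyp_fpu_init_xsave(bool cpu_has_xsave, bool tunable_enable)
{
	if (hyp_xsave_init_done)
		return;
	hyp_xsave_init_done = true;
	hyp_use_xsave = (cpu_has_xsave && tunable_enable) ? 1 : 0;
}
```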
Reported and reviewed by: stevek
Sponsored by: The FreeBSD Foundation
MFC after: 1 week
Differential revision: https://reviews.freebsd.org/D39660
Since syncer vnode vector does not provide a fallback to the default
one, its VOP_GETWRITEMOUNT() implementation implicitly returned
EOPNOTSUPP, which means that syncer ignored suspension.
Reported and tested by: pho
Sponsored by: The FreeBSD Foundation
MFC after: 1 week
Ensure MAC modules are inserted in order that they are registered.
Reviewed by: markj
Obtained from: Juniper Networks, Inc.
Differential Revision: https://reviews.freebsd.org/D39589
These bits are obsolete since 58aa35d429.
This change reverts part of 9ba2b298df as
well as effectively bd3d9826d7, i.e. the
SBus-related modifications. This also gets rid of a nasty hack required
as bus_{read,write}_N(9) doesn't really fit bus_space_subregion(9).
The original idea behind calling into the bridge driver was to have the
logic deciding whether tuning is actually required for a particular bus
timing in a given slot as well as doing the sanity checking only on the
controller layer which also generally is better suited for these due to
say SDHCI_SDR50_NEEDS_TUNING. On another thought, not every such driver
should need to check whether tuning is required at all, though, and not
everything is SDHCI in the first place.
Adjust sdhci{,_fsl_fdt}(4) accordingly, but keep sdhci_generic_tune() a
bit cautious still.
Gleb has noticed there were some inconsistencies in the way the inp_hpts_calls flag was being used. One
such inconsistency results in a bug when we can't allocate enough sendmap entries to entertain a call to
rack_output(): basically a timer won't get started like it should. Also, in cleaning this up I find that the
"no_output" side of input needs to be adjusted to make sure we don't try to re-pace too quickly outside
the hpts assurance of 250 useconds.
Another thing here is we end up with duplicate calls to tcp_output() which we should not. If packets go
from hpts for processing the input side of tcp will call the output side of tcp on the last packet if it is needed.
This means that when that occurs a second call to tcp_output would be made that is not needed and if pacing
is going on may be harmful.
Let's fix all this and explicitly state the contract that hpts is making with transports that care about the
flag.
Reviewed by: tuexen, glebius
Sponsored by: Netflix Inc
Differential Revision: https://reviews.freebsd.org/D39653
Using .ALLSRC may get additional arguments that we may not want
and could cause the objcopy to fail.
Reviewed by: emaste
Obtained from: Juniper Networks, Inc.
Differential Revision: https://reviews.freebsd.org/D39639