Commit Graph

137074 Commits

Author SHA1 Message Date
Justin Hibbits
a6ca7519f8 powerpc64: Optimize radix trap handling a little more
Summary:
Since PCPU can live in a GPR for a while longer, let it, rather than
re-getting it in yet another register.  MFSPR is an expensive operation,
12 clock latency on POWER9, so the fewer operations we need, the better.

Since the check is tightly coupled to the fetch, by reducing the number
of fetch+check, we reduce the stalls, and improve the performance
marginally.  Buildworld was measured at a ~5-7% improvement on a single
run.

Reviewed By: nwhitehorn
Differential Revision: https://reviews.freebsd.org/D30003
2021-04-30 19:58:11 -05:00
Marcin Wojtas
e245ee2774 gicv3_its: Flush cache after allocating ITT memory
It has to be zeroed before committing it to device.
We do that by allocating it with M_ZERO, but there was no
memory barrier or cache flush to ensure its sees it zeroed.
This fixes MSIX on LS1028A SoC.

Submitted by: Kornel Duleba <mindal@semihalf.com>
Reviewed by: andrew
Obtained from: Semihalf
Sponsored by: Alstom Group
Differential Revision: https://reviews.freebsd.org/D30033
2021-05-01 00:58:26 +02:00
Eric van Gyzen
2f32a971b7 Wait longer for a previous IPI to be sent
When sending an IPI, if a previous IPI is still pending delivery,
native_lapic_ipi_vectored() waits for the previous IPI to be sent.
We've seen a few inexplicable panics with the current timeout of 50 ms.
Increase the timeout to 1 second and make it tunable.

No hardware specification mentions a timeout in this case; I checked
the Intel SDM, Intel MP spec, and Intel x2APIC spec.  Linux and illumos
wait forever.  In Linux, see __default_send_IPI_shortcut() in
arch/x86/kernel/apic/ipi.c.  In illumos, see apic_send_ipi() in
usr/src/uts/i86pc/io/pcplusmp/apic_common.c.  However, misbehaving hardware
could hang the system if we wait forever.

Reviewed by:	mav kib
MFC after:	1 week
Sponsored by:	Dell EMC Isilon
Differential Revision: https://reviews.freebsd.org/D29942
2021-04-30 13:32:29 -05:00
Konstantin Belousov
619fe09586 ioccom: define ioctl cmd value that can never be valid
Its use is for cases where some filler is needed for cmd, or we need an
indication that there were no cmd supplied, and so on.

Reviewed by:	jhb
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
Differential revision:	https://reviews.freebsd.org/D29935
2021-04-30 17:43:45 +03:00
Konstantin Belousov
2082565798 O_PATH: disable kqfilter for fifos
Filter on fifos is real filter for the object, and not a filesystem
events filter like EVFILT_VNODE.

Reported by:	markj using syzkaller
Reviewed by:	markj
Sponsored by:	The FreeBSD Foundation
MFC after:	3 days
2021-04-30 17:43:45 +03:00
Konstantin Belousov
72a42ec63b amd64: disable LA57 by default for now
A testing on the real hardware uncovered an issue, and since I do not have
access to the machine, disable until the bug can be fixed.

Reported by:	"Pieper, Jeffrey E" <jeffrey.e.pieper@intel.com>
Sponsored by:	The FreeBSD Foundation
MFC after:	3 days
2021-04-30 17:43:45 +03:00
Konstantin Belousov
21fc6a2a10 amd64: invalidate TLB between page table update and access
When setting up trampoline mapping for LA57 switcher, it is possible
that TLB still has some random mapping at that address.

Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
2021-04-30 17:43:45 +03:00
Michael Tuexen
e010d20032 sctp: update the vtag for INIT and INIT-ACK chunks
This is needed in case of responding with an ABORT to an INIT-ACK.
2021-04-30 13:33:16 +02:00
Marcin Wojtas
cd945dc08a iflib: Take iri_pad into account when processing small frames
Drivers can specify padding of received frames with iri_pad field.
This can be used to enforce ip alignment by hardware.
Iflib ignored that padding when processing small frames,
which rendered this feature inoperable.
I found it while writing a driver for a NIC that can ip align
received packets. Note that this doesn't change behavior of existing
drivers as they all set iri_pad to 0.

Submitted by: Kornel Duleba <mindal@semihalf.com>
Reviewed by: gallatin
Obtained from: Semihalf
Sponsored by: Alstom Group
Differential Revision: https://reviews.freebsd.org/D30009
2021-04-30 12:46:17 +02:00
Michael Tuexen
eb79855920 sctp: fix SCTP_PEER_ADDR_PARAMS socket option
Ignore spp_pathmtu if it is 0, when setting the IPPROTO_SCTP level
socket option SCTP_PEER_ADDR_PARAMS as required by RFC 6458.

MFC after:	1 week
2021-04-30 12:31:09 +02:00
Kristof Provost
055c55abef pf: Fix IP checksum on reassembly
If we reassemble a packet we modify the IP header (to set the length and
remove the fragment offset information), but we failed to update the
checksum. On certain setups (mostly where we did not re-fragment again
afterwards) this could lead to us sending out packets with incorrect
checksums.

PR:		255432
MFC after:	1 week
Sponsored by:	Rubicon Communications, LLC ("Netgate")
Differential Revision:	https://reviews.freebsd.org/D30026
2021-04-30 08:19:46 +02:00
Michael Tuexen
eecdf5220b sctp: use RTO.Initial of 1 second as specified in RFC 4960bis 2021-04-30 00:45:56 +02:00
Michael Tuexen
9de7354bb8 sctp: improve consistency in handling chunks with wrong size
Just skip the chunk, if no other handling is required by the
specification.
2021-04-28 18:11:06 +02:00
Mark Johnston
20e3b9d8bd kasan: Use vm_offset_t for the first parameter to kasan_shadow_map()
No functional change intended.

Sponsored by:	The FreeBSD Foundation
2021-04-29 11:39:02 -04:00
Yinlong Lu
ee8b757a94 ipmi: support getting address from EFI
The original implementation only supports getting the address from legacy
BIOS (by searching for the SMBIOS_SIG pattern in a fixed address space).

Try to get the SMBIOS table from EFI through efirt (EFI Runtime Services)
firstly.  Continue to search in the legacy BIOS if a NULL address is
returned from EFI.

By this way the ipmi function supports both legacy BIOS and UEFI systems.

Reviewed by:	dab, vangyzen
MFC after:	1 week
Sponsored by:	Dell EMC Isilon
Differential Revision: https://reviews.freebsd.org/D30007
2021-04-29 05:20:58 -05:00
Kristof Provost
eaabed8ac4 pf: Trivial typo fix
PV -> PF

MFC after:	1 week
Sponsored by:	Rubicon Communications, LLC ("Netgate")
2021-04-29 15:25:07 +02:00
Navdeep Parhar
b9820bca18 cxgbe(4): Do not panic when tx is called with invalid checksum requests.
There is no need to panic in if_transmit if the checksums requested are
inconsistent with the frame being transmitted.  This typically indicates
that the kernel and driver were built with different INET/INET6 options,
or there is some other kernel bug.  The driver should just throw away
the requests that it doesn't understand and move on.

MFC after:	1 week
Sponsored by:	Chelsio Communications
2021-04-28 14:04:53 -07:00
Alexander V. Chernikov
41ce0e34ea [fib algo] Update fib_gen counter under FIB_MOD_LOCK.
MFC after:	3 days
2021-04-28 20:23:03 +00:00
Mateusz Guzik
074abaccfa cache: remove incomplete lockless lockout support during resize
This is already properly handled thanks to 2 step hash replacement.
2021-04-28 19:53:25 +00:00
Kevin Bowling
fdbcd35a75 ixgbe: Improve device name strings
This is just clerical work to ease bug triage and may be used to set
expectations around the ability for anyone in the community to perform
testing and development on older parts.

Approved by:	erj
MFC after:	1 month
Differential Revision:	https://reviews.freebsd.org/D29876
2021-04-28 10:29:59 -07:00
Kristof Provost
6b146f3b9b pf: Error tracing SDTs
Add additional DTrace static trace points to facilitate debugging
failing pf ioctl calls.

MFC after:	1 week
Sponsored by:	Rubicon Communications, LLC ("Netgate")
2021-04-28 17:19:10 +02:00
Neel Chauhan
341da0077e Bump __FreeBSD_version for commits efe7f12 and 9781105
These commits have added new APIs to linuxkpi.
2021-04-28 08:07:05 -07:00
Neel Chauhan
9781105bea linuxkpi: Introduce tasklet_disable_nosync()
This is needed for the drm-kmod 5.5 update.

Reviewed by:		hselasky (src)
MFC after:		3 days
Differential Revision:	https://reviews.freebsd.org/D30024
2021-04-28 08:05:57 -07:00
Neel Chauhan
efe7f12cd3 linuxkpi: Implement rcu_replace_pointer() macro
This is needed for the drm-kmod 5.5 update.

Reviewed by:		hselasky (src)
MFC after:		3 days
Differential Revision:	https://reviews.freebsd.org/D30025
2021-04-28 08:04:52 -07:00
Mark Johnston
d1e9441583 pipe: Avoid calling selrecord() on a closing pipe
pipe_poll() may add the calling thread to the selinfo lists of both ends
of a pipe.  It is ok to do this for the local end, since we know we hold
a reference on the file and so the local end is not closed.  It is not
ok to do this for the remote end, which may already be closed and have
called seldrain().  In this scenario, when the polling thread wakes up,
it may end up referencing a freed selinfo.

Guard the selrecord() call appropriately.

Reviewed by:	kib
Reported by:	syzkaller+KASAN
MFC after:	1 week
Differential Revision:	https://reviews.freebsd.org/D30016
2021-04-28 10:43:29 -04:00
Richard Scheffenegger
48be5b976e tcp: stop spurious rescue retransmissions and potential asserts
Reported by: pho@
MFC after: 3 days
Reviewed By: tuexen, #transport
Sponsored by: NetApp, Inc.
Differential Revision: https://reviews.freebsd.org/D29970
2021-04-28 15:01:10 +02:00
Thomas Munro
3aaaa2efde poll(2): Add POLLRDHUP.
Teach poll(2) to support Linux-style POLLRDHUP events for sockets, if
requested.  Triggered when the remote peer shuts down writing or closes
its end.

Reviewed by:	kib
MFC after:	1 month
Differential Revision:	https://reviews.freebsd.org/D29757
2021-04-28 23:00:31 +12:00
Alexander V. Chernikov
f9668e42b4 Add rib_walk_from() wrapper for selective rib tree traversal.
Provide wrapper for the rnh_walktree_from() rib callback.
As currently `struct rib_head` is considered internal to the
 routing subsystem, this wrapper is necessary to maintain isolation
 from the external code.

Differential Revision: https://reviews.freebsd.org/D29971
MFC after:	1 week
2021-04-28 08:09:45 +00:00
Navdeep Parhar
83b5cda106 cxgbe(4): Add support for NIC suspend/resume and live reset.
Add suspend/resume callbacks to the driver and a live reset built around
them.  This commit covers the basic NIC and future commits will expand
this functionality to other stateful parts of the chip.  Suspend and
resume operate on the chip (the t?nex nexus device) and affect all its
ports.  It is not possible to suspend/resume or reset individual ports.
All these operations can be performed on a running NIC.  A reset will
look like a link bounce to the networking stack.

Here are some ways to exercise this functionality:

 /* Manual suspend and resume. */
 # devctl suspend t6nex0
 # devctl resume t6nex0

 /* Manual reset. */
 # devctl reset t6nex0

 /* Manual reset with driver sysctl. */
 # sysctl dev.t6nex.0.reset=1

 /* Automatic adapter reset on any fatal error. */
 # hw.cxgbe.reset_on_fatal_err=1

Suspend disables the adapter (DMA, interrupts, and the port PHYs) and
marks the hardware as unavailable to the driver.  All ifnets associated
with the adapter are still visible to the kernel but operations that
require hardware interaction will fail with ENXIO.  All ifnets report
link-down while the adapter is suspended.

Resume will reattach to the card, reconfigure it as before, and recreate
the queues servicing the existing ifnets.  The ifnets are able to send
and receive traffic as soon as the link comes back up.

Reset is roughly the same as a suspend and a resume with at least one of
these events in between: D0->D3Hot->D0, FLR, PCIe link retrain.

MFC after:	1 month
Relnotes:	yes
Sponsored by:	Chelsio Communications
2021-04-27 22:48:51 -07:00
Rick Macklem
f6fec55fe3 nfscl: add check for NULL clp and forced dismounts to nfscl_delegreturnvp()
Commit aad780464f added a function called nfscl_delegreturnvp()
to return delegations during the NFS VOP_RECLAIM().
The function erroneously assumed that nm_clp would
be non-NULL. It will be NULL for NFSV4.0 mounts until
a regular file is opened. It will also be NULL during
vflush() in nfs_unmount() for a forced dismount.

This patch adds a check for clp == NULL to fix this.

Also, since it makes no sense to call nfscl_delegreturnvp()
during a forced dismount, the patch adds a check for that
case and does not do the call during forced dismounts.

PR:	255436
Reported by:	ish@amail.plala.or.jp
MFC after:	2 weeks
2021-04-27 17:30:16 -07:00
Rick Macklem
db8c27f499 nfsd: fix a NFSv4.1 Linux client mount stuck in CLOSE_WAIT
It was reported that a NFSv4.1 Linux client mount against
a FreeBSD12 server was hung, with the TCP connection in
CLOSE_WAIT state on the server.
When a NFSv4.1/4.2 mount is done and the back channel is
bound to the TCP connection, the soclose() is delayed until
a new TCP connection is bound to the back channel, due to
a reference count being held on the SVCXPRT structure in
the krpc for the socket. Without the soclose() call, the socket
will remain in CLOSE_WAIT and this somehow caused the Linux
client to hang.

This patch adds calls to soshutdown(.., SHUT_WR) that
are performed when the server side krpc sees that the
socket is no longer usable.  Since this can be done
before the back channel is bound to a new TCP connection,
it allows the TCP connection to proceed to CLOSED state.

PR:	254590
Reported by:	jbreitman@tildenparkcapital.com
Reviewed by:	tuexen
Comments by:	kevans
MFC after:	2 weeks
Differential Revision:	https://reviews.freebsd.org/D29526
2021-04-27 15:32:35 -07:00
Kevin Bowling
eea55de7b1 e1000: Rework em_msi_link interrupt filter
* Fix 82574 Link Status Changes, carrying the OTHER mask bit around as
  needed.
* Move igb-class LSC re-arming out of FAST back into the handler.
* Clarify spurious/other interrupt re-arms in FAST.

In MSI-X mode, 82574 and igb-class devices use an interrupt filter to
handle Link Status Changes. We want to do LSC re-arms in the handler
to take advantage of autoclear (EIAC) single shot behavior.

82574 uses 'Other' in ICR and IMS for LSC interrupt types when in MSI-X
mode, so we need to set and re-arm the 'Other' bit during attach and
after ICR reads in the FAST handler if not an LSC or after handling on
LSC due to autoclearing.

This work was primarily done to address the referenced PR, but inspired
some clarification and improvement for igb-class devices once the
intentions of previous bug fix attempts became clearer.

PR:		211219
Reported by:	Alexey <aserp3@gmail.com>
Tested by:	kbowling (I210 lagg), markj (I210)
Approved by:	markj
MFC after:	1 month
Differential Revision:	https://reviews.freebsd.org/D29943
2021-04-27 15:29:39 -07:00
Alexander V. Chernikov
8a0d57baec [fib algo] Delay algo init at fib growth to to allow to reliably use rib KPI.
Currently, most of the rib(9) KPI does not use rnh pointers, using
 fibnum and family parameters to determine the rib pointer instead.
This works well except for the case when we initialize new rib pointers
 during fib growth.
In that case, there is no mapping between fib/family and the new rib,
 as an entirely new rib pointer array is populated.

Address this by delaying fib algo initialization till after switching
 to the new pointer array and updating the number of fibs.
Set datapath pointer to the dummy function, so the potential callers
 won't crash the kernel in the brief moment when the rib exists, but
 no fib algo is attached.

This change allows to avoid creating duplicates of existing rib functions,
 with altered signature.

Differential Revision: https://reviews.freebsd.org/D29969
MFC after:	1 week
2021-04-27 22:10:08 +00:00
Brandon Bergren
6e1abda231 riscv: Remove old qemu compatibility code
During early qemu development, the /soc node was marked as compatible
with "riscv-virtio-soc" instead of "simple-bus".

This was changed in qemu 53f54508dae6 in Sep 2018, and predates the
baseline required qemu version (5.0) for riscv by a wide margin.

The generic simplebus code handles attachment in all cases nowadays.

Sponsored by:	Tag1 Consulting, Inc.
Reviewed by:	jrtc27, mhorne
Differential Revision:	https://reviews.freebsd.org/D30011
2021-04-27 16:22:04 -05:00
Ruslan Bukin
f17c4e38f5 Move IOMMU code to a separate pmap module and switch ARM System MMU
driver to use it.

Add ARM Mali Txxx (Midgard), Gxx (Bifrost) GPU page management code.

Sponsored by: UKRI
2021-04-27 19:16:09 +01:00
Emmanuel Vadot
f77d8d1011 dwc: Use mii_fdt function
Use the helper function to get phy mode and configure dwc accordingly.

Reviewed by:	ian
2021-04-27 19:07:33 +02:00
Emmanuel Vadot
80020d7888 mmccam: probe*: Style(9) 2021-04-27 19:03:16 +02:00
Emmanuel Vadot
e017c1c92c mmcprobe_done: Style(9) 2021-04-27 19:03:09 +02:00
Emmanuel Vadot
7cbdf8a05d dwmmc: Add \n to a debug printf 2021-04-27 19:01:09 +02:00
Emmanuel Vadot
f1cc48e5da mmc: dwmmc: Convert driver to use the mmc_sim interface
A lot more generic cam related things are done in mmc_sim so this simplify
the driver a lot.

Differential Revision:	https://reviews.freebsd.org/D27487
Reviewed by:	kibab
2021-04-27 19:00:47 +02:00
Emmanuel Vadot
2671bdb540 allwinner: aw_mmc: Convert driver to use the mmc_sim interface
A lot more generic cam related things are done in mmc_sim so this simplify
the driver a lot.

Differential Revision:	https://reviews.freebsd.org/D27486
Reviewed by:	imp
2021-04-27 19:00:42 +02:00
Emmanuel Vadot
47bde7925b mmccam: Add mmc_sim, a generic sim for mmc driver to use
This adds a generic sim that abstract a lot of what needs to be implemented
in a driver for mmccam support.
A new interface with three methods is added :

 - mmc_sim_get_tran_settings: Use to get what the controller supports in term
   of capabilities, freq etc ...
 - mmc_sim_set_tran_settings: Use to change the speed/freq/etc of the
   sdcard host controller
 - mmc_sim_cam_request: Used for MMCIO requests

Differential Revision:	https://reviews.freebsd.org/D27485
Reviewed by:	kibab
2021-04-27 19:00:38 +02:00
Ruslan Bukin
4c1ecf5502 Consider the broken card detect flag that comes from 'broken-cd;'
dts property.

This fixes operation on Intel Stratix 10 devices.

Tested on Terasic DE10-Pro.

Reviewed by: manu
Sponsored by: UKRI
Differential revision: https://reviews.freebsd.org/D29999
2021-04-27 12:19:05 +01:00
Michael Tuexen
059ec2225c sctp: cleanup verification of INIT and INIT-ACK chunks 2021-04-27 12:45:43 +02:00
Alexander V. Chernikov
439d087d0b [fib algo] always commit static routes synchronously.
Modular fib lookup framework features logic that allows
 route update batching for the algorithms that cannot easily
 apply the routing change without rebuilding. As a result,
 dataplane lookups may return old data until the the sync
 takes place. With the default sync timeout of 50ms, it is
 possible that new binary like ping(8) executed exactly after
 route(8) will still use the old fib data.

To address some aspects of the problem, framework executes
 all rtable changes without RTF_GATEWAY synchronously.

To fix the aforementioned problem, this diff extends sync
 execution for all RTF_STATIC routes (e.g. ones maintained by
 route(8).
This fixes a bunch of tests in the networking space.

Reported by:	ci, arichardson
MFC after:	2 weeks
2021-04-27 08:31:40 +00:00
Alexander V. Chernikov
25682e6a49 Fix rtsock sockaddr alignment.
b31fbebeb3 introduced alloc_sockaddr_aligned() which, in fact,
 failed to produce aligned addresses.

Reported by:	Oskar Holmlund <oskar.holmlund at yahoo.com>
MFC after:	immediately
2021-04-27 08:04:19 +00:00
Alexander V. Chernikov
bc5ef45aec Fix drace CTF for the rib_head.
33cb3cb2e3 introduced an `rib_head` structure field under the
FIB_ALGO define. This may be problematic for the CTF, as some
 of the files including `route_var.h` do not have `fib_algo`
 defined.

Make dtrace happy by making the field unconditional.

Suggested by:	markj
2021-04-27 07:47:53 +00:00
Rick Macklem
f5ff282bc0 nfscl: fix the handling of NFSERR_DELAY for Open/LayoutGet RPCs
For a pNFS mount, the NFSv4.1/4.2 client uses compound RPCs that
have both Open and LayoutGet operations in them.
If the pNFS server were tp reply NFSERR_DELAY for one of these
compounds, the retry after a delay cannot be handled by
newnfs_request(), since there is a reference held on the open
state for the Open operation in them.

Fix this by adding these RPCs to the "don't do delay here"
list in newnfs_request().

This patch is only needed if the mount is using pNFS (the "pnfs"
mount option) and probably only matters if the MDS server
is issuing delegations as well as pNFS layouts.

Found by code inspection.

MFC after:	2 weeks
2021-04-26 17:48:21 -07:00
Rick Macklem
61aea7fa3c param.h: bump __FreeBSD_version for commit 8759773148
Commit 8759773148 changed the internal KPI between the
nfsd and nfscommon modules, so both need to be rebuilt
from sources.
2021-04-26 16:35:18 -07:00
Rick Macklem
8759773148 nfsd: fix the slot sequence# when a callback fails
Commit 4281bfec36 patched the server so that the
callback session slot would be free'd for reuse when
a callback attempt fails.
However, this can often result in the sequence# for
the session slot to be advanced such that the client
end will reply NFSERR_SEQMISORDERED.

To avoid the NFSERR_SEQMISORDERED client reply,
this patch negates the sequence# advance for the
case where the callback has failed.
The common case is a failed back channel, where
the callback cannot be sent to the client, and
not advancing the sequence# is correct for this
case.  For the uncommon case where the client's
reply to the callback is lost, not advancing the
sequence# will indicate to the client that the
next callback is a retry and not a new callback.
But, since the FreeBSD server always sets "csa_cachethis"
false in the callback sequence operation, a retry
and a new callback should be handled the same way
by the client, so this should not matter.

Until you have this patch in your NFSv4.1/4.2 server,
you should consider avoiding the use of delegations.
Even with this patch, interoperation with the
Linux NFSv4.1/4.2 client in kernel versions prior
to 5.3 can result in frequent 15second delays if
delegations are enabled.  This occurs because, for
kernels prior to 5.3, the Linux client does a TCP
reconnect every time it sees multiple concurrent
callbacks and then it takes 15seconds to recover
the back channel after doing so.

MFC after:	2 weeks
2021-04-26 16:24:10 -07:00