Commit Graph

109 Commits

Author SHA1 Message Date
Mark Johnston
96ad26eefb Remove free_domain() and uma_zfree_domain().
These functions were introduced before UMA started ensuring that freed
memory gets placed in domain-local caches.  They no longer serve any
purpose since UMA now provides their functionality by default.  Remove
them to simplyify the kernel memory allocator interfaces a bit.

Reviewed by:	cem, kib
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D25937
2020-08-04 13:58:36 +00:00
Pawel Biernacki
7029da5c36 Mark more nodes as CTLFLAG_MPSAFE or CTLFLAG_NEEDGIANT (17 of many)
r357614 added CTLFLAG_NEEDGIANT to make it easier to find nodes that are
still not MPSAFE (or already are but aren’t properly marked).
Use it in preparation for a general review of all nodes.

This is non-functional change that adds annotations to SYSCTL_NODE and
SYSCTL_PROC nodes using one of the soon-to-be-required flags.

Mark all obvious cases as MPSAFE.  All entries that haven't been marked
as MPSAFE before are by default marked as NEEDGIANT

Approved by:	kib (mentor, blanket)
Commented by:	kib, gallatin, melifaro
Differential Revision:	https://reviews.freebsd.org/D23718
2020-02-26 14:26:36 +00:00
Alexander Motin
8acd3f126a Don't spin on cleanup_lock if we are not interrupt.
If somebody else holds that lock, it will likely do the work for us.
If it won't, then we return here later and retry.

Under heavy load it allows to avoid lock congestion between interrupt and
polling threads.

MFC after:	1 week
Sponsored by:	iXsystems, Inc.
2019-12-31 04:16:52 +00:00
Alexander Motin
7280125e81 Add ioat_get_domain() to ioat(4) KPI.
This allows NUMA-aware consumers to reduce inter-domain traffic.

MFC after:	1 week
Sponsored by:	iXsystems, Inc.
2019-11-19 02:09:04 +00:00
Alexander Motin
348efb140e Initialize *comp_update with valid value.
I've noticed that sometimes with enabled DMAR initial write from device
to this address is somehow getting delayed, triggering assertion due to
zero default being invalid.

MFC after:	2 weeks
Sponsored by:	iXsystems, Inc.
2019-11-15 23:01:09 +00:00
Alexander Motin
1f4a469d36 Cleanup address range checks in ioat(4).
- Deduce allowed address range for bus_dma(9) from the hardware version.
Different versions (CPU generations) have different documented limits.
 - Remove difference between address ranges for src/dst and crc.  At least
docs for few recent generations of CPUs do not mention anything like that,
while older are already limited with above limits.
 - Remove address assertions from arguments.  While I do not think the
addresses out of allowed ranges should realistically happen there due to
the platforms physical address limitations, there is now bus_dma(9) to
make sure of that, preferably via IOMMU.
 - Since crc now has the same address range as src/dst, remove crc_dmamap,
reusing dst2_dmamap instead.

Discussed with:	cem
MFC after:	2 weeks
Sponsored by:	iXsystems, Inc.
2019-11-15 22:47:59 +00:00
Alexander Motin
3eb70a09f4 Pass more reasonable WAIT flags to bus_dma(9) calls.
MFC after:	2 weeks
2019-11-14 04:39:48 +00:00
Eric van Gyzen
83b82cd48c Add CTLFLAG_STATS to the dev.ioat.N.stats sysctl OIDs
Refer to r353111.

MFC after:	2 weeks
Sponsored by:	Dell EMC Isilon
2019-10-09 12:14:10 +00:00
Alexander Motin
630d9800a1 Replace argument checks with assertions.
Those functions are used by kernel, and we can't check all possible argument
errors in production kernel.  Plus according to docs many of those errors
are checked by hardware.  Assertions should just help with code debugging.

MFC after:	2 weeks
2019-09-27 02:09:20 +00:00
Alexander Motin
657dc81d90 Improve ioat(4) NUMA-awareness.
Allocate ioat->ring memory from the device domain.
Schedule ioat->poll_timer to the first CPU of the device domain.

According to pcm-numa tool from intel-pcm port, this reduces number of
remote DRAM accesses while copying data by 75%.  And unless it is a noise,
I've noticed some speed improvement when copying data to other domain.

MFC after:	1 week
Sponsored by:	iXsystems, Inc.
2019-09-19 22:15:57 +00:00
Conrad Meyer
e2e050c8ef Extract eventfilter declarations to sys/_eventfilter.h
This allows replacing "sys/eventfilter.h" includes with "sys/_eventfilter.h"
in other header files (e.g., sys/{bus,conf,cpu}.h) and reduces header
pollution substantially.

EVENTHANDLER_DECLARE and EVENTHANDLER_LIST_DECLAREs were moved out of .c
files into appropriate headers (e.g., sys/proc.h, powernv/opal.h).

As a side effect of reduced header pollution, many .c files and headers no
longer contain needed definitions.  The remainder of the patch addresses
adding appropriate includes to fix those files.

LOCK_DEBUG and LOCK_FILE_LINE_ARG are moved to sys/_lock.h, as required by
sys/mutex.h since r326106 (but silently protected by header pollution prior
to this change).

No functional change (intended).  Of course, any out of tree modules that
relied on header pollution for sys/eventhandler.h, sys/lock.h, or
sys/mutex.h inclusion need to be fixed.  __FreeBSD_version has been bumped.
2019-05-20 00:38:23 +00:00
Tycho Nightingale
b80b32a2ab ioat(4) should use bus_dma(9) for the operation source and destination
addresses

Reviewed by:	cem
Sponsored by:	Dell EMC Isilon
Differential Revision:	https://reviews.freebsd.org/D19725
2019-04-02 19:08:06 +00:00
Tycho Nightingale
a8d9ee9c50 ioatcontrol(8) could exercise 8k-aligned copy with page-break, crc and
crc-copy modes.

Reviewed by:	cem
Sponsored by:	Dell EMC Isilon
Differential Revision:	https://reviews.freebsd.org/D19780
2019-04-02 19:06:25 +00:00
Alexander Motin
2f03a95fd2 Fix few issues in ioat(4) driver.
- Do not explicitly count active descriptors.  It allows hardware reset
to happen while device is still referenced, plus simplifies locking.
 - Do not stop/start callout each time the queue becomes empty.  Let it
run to completion and rearm if needed, that is much cheaper then to touch
it every time, plus also simplifies locking.
 - Decouple submit and cleanup locks, making driver reentrant.
 - Avoid memory mapped status register read on every interrupt.
 - Improve locking during device attach/detach.
 - Remove some no longer used variables.

Reviewed by:	cem
MFC after:	1 week
Sponsored by:	iXsystems, Inc.
Differential Revision:	https://reviews.freebsd.org/D19231
2019-02-21 16:47:36 +00:00
Conrad Meyer
236b57fe36 ioat(4): Set __result_use_check on ioat_acquire_reserve
Even M_WAITOK callers must check for failure.  For example, if the device is
quiescing, either due to automatic error-recovery induced reset, or due to
administrative detach, the routine will return ENXIO and the acquire
reference will not be held.  So, there is no mode in which it is safe to
assume the routine succeeds without checking.

Sponsored by:	Dell EMC Isilon
2019-01-17 23:21:02 +00:00
Warner Losh
329e817fcc Reapply, with minor tweaks, r338025, from the original commit:
Remove unused and easy to misuse PNP macro parameter

Inspired by r338025, just remove the element size parameter to the
MODULE_PNP_INFO macro entirely.  The 'table' parameter is now required to
have correct pointer (or array) type.  Since all invocations of the macro
already had this property and the emitted PNP data continues to include the
element size, there is no functional change.

Mostly done with the coccinelle 'spatch' tool:

  $ cat modpnpsize0.cocci
    @normaltables@
    identifier b,c;
    expression a,d,e;
    declarer MODULE_PNP_INFO;
    @@
     MODULE_PNP_INFO(a,b,c,d,
    -sizeof(d[0]),
     e);

    @singletons@
    identifier b,c,d;
    expression a;
    declarer MODULE_PNP_INFO;
    @@
     MODULE_PNP_INFO(a,b,c,&d,
    -sizeof(d),
     1);

  $ rg -l MODULE_PNP_INFO -- sys | \
    xargs spatch --in-place --sp-file modpnpsize0.cocci

(Note that coccinelle invokes diff(1) via a PATH search and expects diff to
tolerate the -B flag, which BSD diff does not.  So I had to link gdiff into
PATH as diff to use spatch.)

Tinderbox'd (-DMAKE_JUST_KERNELS).
Approved by: re (glen)
2018-09-26 17:12:14 +00:00
Conrad Meyer
b8e771e97a Back out r338035 until Warner is finished churning GSoC PNP patches
I was not aware Warner was making or planning to make forward progress in
this area and have since been informed of that.

It's easy to apply/reapply when churn dies down.
2018-08-19 00:46:22 +00:00
Conrad Meyer
faa319436f Remove unused and easy to misuse PNP macro parameter
Inspired by r338025, just remove the element size parameter to the
MODULE_PNP_INFO macro entirely.  The 'table' parameter is now required to
have correct pointer (or array) type.  Since all invocations of the macro
already had this property and the emitted PNP data continues to include the
element size, there is no functional change.

Mostly done with the coccinelle 'spatch' tool:

  $ cat modpnpsize0.cocci
    @normaltables@
    identifier b,c;
    expression a,d,e;
    declarer MODULE_PNP_INFO;
    @@
     MODULE_PNP_INFO(a,b,c,d,
    -sizeof(d[0]),
     e);

    @singletons@
    identifier b,c,d;
    expression a;
    declarer MODULE_PNP_INFO;
    @@
     MODULE_PNP_INFO(a,b,c,&d,
    -sizeof(d),
     1);

  $ rg -l MODULE_PNP_INFO -- sys | \
    xargs spatch --in-place --sp-file modpnpsize0.cocci

(Note that coccinelle invokes diff(1) via a PATH search and expects diff to
tolerate the -B flag, which BSD diff does not.  So I had to link gdiff into
PATH as diff to use spatch.)

Tinderbox'd (-DMAKE_JUST_KERNELS).
2018-08-19 00:22:21 +00:00
Warner Losh
d2064cf030 Use '#' rather than some made up name for fields we want to ignore. 2017-12-22 17:53:27 +00:00
Conrad Meyer
08f16d0c01 ioat(4): Add Skylake Xeon PCI-ID
SKX IOAT is just another 3.2 version of the CBDMA engine.

Submitted by:	Deepak Veliath <deepak.veliath AT isilon.com>
Sponsored by:	Dell EMC Isilon
2017-12-05 18:48:58 +00:00
Conrad Meyer
a64bf59c49 Add PNP metadata to a few drivers
An eventual devd(8) or other component should be able to scan buses and
automatically load drivers that match device ids described in this metadata.

Reviewed by:	imp
Sponsored by:	Dell EMC Isilon
Differential Revision:	https://reviews.freebsd.org/D12364
2017-09-14 15:34:45 +00:00
Andriy Gapon
1a1406212b ioat: don't specify inline for function with variable argument list
Modern GCC and Clang simply ignore the qualifier, while the old base GCC
produces a warning (treated as an error in the kernel build).

Approved by:	cem
MFC after:	5 days
2017-03-04 12:51:57 +00:00
Conrad Meyer
e2a65c0031 ioat(4): Compile on i386
Truncate BUS_SPACE_MAXADDR_40BIT to essentially BUS_SPACE_MAXADDR_32BIT on
platforms with 32-bit bus_addr_t (i.e., i386).

PR:		215034
Reported by:	ngie@
Sponsored by:	Dell EMC Isilon
2016-12-04 04:04:57 +00:00
Conrad Meyer
58a639b77c ioat(4): Fix 'bogus completion_pending' KASSERT
Fix ioat_release to only set is_completion_pending if DMAs were actually
queued.  Otherwise, the spurious flag could trigger an assert in the
reset path on INVARIANTS kernels.

Reviewed by:	bdrewery, Suraj Raju @ Isilon
Sponsored by:	Dell EMC Isilon
2016-11-30 21:59:52 +00:00
Conrad Meyer
3a37091931 ioat(4): Fix race between process_events and reset_hw
In the case where a hardware error is detected during
ioat_process_events, hardware may advance (by one descriptor, probably)
and a subsequent ioat_process_events may race the intended ioat_reset_hw
followup.  In that case, the second process_events would observe a
completion update that does not match the software "last_seen" status,
and attempt to successfully complete already-failed descriptors.

Guard against this race with the resetting_cleanup flag.

Reviewed by:	bdrewery, markj
Sponsored by:	Dell EMC Isilon
2016-11-11 20:09:54 +00:00
Conrad Meyer
ea9e23edf3 ioat(4): Read CHANSTS register for suspended/halted checks
The device doesn't accurately update the CHANCMP address with the device state
when the device is suspended or halted.  So, read the CHANSTS register to check
for those states.

We still need to read the CHANCMP address for the last completed descriptor.

Sponsored by:	Dell EMC Isilon
2016-11-02 23:18:16 +00:00
Conrad Meyer
8e269d996a ioat(4): Allocate contiguous descriptors
This allows us to make strong assertions about descriptor address
validity.  Additionally, future generations of the ioat(4) hardware will
require contiguous descriptors.

Reviewed by:	markj
Sponsored by:	Dell EMC Isilon
2016-11-01 19:18:54 +00:00
Conrad Meyer
a0992979e9 ioat(4): Simplify by removing dynamic scaling
This paves the way for a contiguous descriptor array.

A contiguous descriptor array has the benefit that we can make strong
assertions about whether an address is a valid descriptor or not.  The
other benefit is that future generations of I/OAT hardware will require
a contiguous descriptor array anyway.  The downside is that after system
boot, big chunks of contiguous memory is much harder to find.  So
dynamic scaling after boot is basically impossible.

Reviewed by:	markj
Sponsored by:	Dell EMC Isilon
2016-11-01 19:18:52 +00:00
Conrad Meyer
0d0f264099 ioat(4): Use memory completion rather than device register
The CHANSTS register is a split 64-bit register on CBDMA units before
hardware v3.3.  If a torn read happens during ioat_process_events(),
software cannot know when to stop completing descriptors correctly.

So, just use the device-pushed main memory channel status instead.

Remove the ioat_get_active() seatbelt as well.  It does nothing if the
completion address is valid.

Sponsored by:	Dell EMC Isilon
2016-10-28 23:53:37 +00:00
Conrad Meyer
76305bb865 ioat(4): Add failpoint for delay() in ioat_release
Sponsored by:	Dell EMC Isilon
2016-10-28 23:53:36 +00:00
Conrad Meyer
5a8af401ae ioat(4): Assert the submit lock in ioat_submit_single
Sponsored by:	Dell EMC Isilon
2016-10-28 23:53:35 +00:00
Conrad Meyer
05e4cffbb0 ioat(4): Add additional tracing
These probes help track down driver bugs.

Sponsored by:	Dell EMC Isilon
2016-10-28 23:53:33 +00:00
Conrad Meyer
f44abf1fe7 ioat(4): Start poll timer when descriptors are released to HW
Rather than when the software creates the descriptors.

Sponsored by:	Dell EMC Isilon
2016-09-11 20:15:41 +00:00
Conrad Meyer
dc46505973 ioat(4): De-spam ioat_process_events KTR logs
Sponsored by:	Dell EMC Isilon
2016-09-11 20:14:19 +00:00
Conrad Meyer
985efa77cc ioat(4): Despam relatively common hardware reset messages
Reported by:	ngie@
2016-09-01 23:56:02 +00:00
Conrad Meyer
ee5642bb35 ioat(4): Add additional CTR tracing during reset 2016-08-29 20:51:34 +00:00
Conrad Meyer
1bbc06b85b ioat(4): Don't "complete" DMA descriptors prematurely
In r304602, I mistakenly removed the ioat_process_events check that we weren't
processing events before the hardware had completed the descriptor
("last_seen").  Reinstate that logic.

Keep the defensive loop condition and additionally make sure we've actually
completed a descriptor before blindly chasing the ring around.

In reset, queue and finish the startup command before allowing any event
processing or submission to occur.  Avoid potential missed callouts by
requeueing the poll later.
2016-08-29 20:46:33 +00:00
Conrad Meyer
c114e74120 ioat(4): Allow callouts to be scheduled after hw reset
is_completion_pending governs whether or not a callout will be scheduled
when new work is queued on the IOAT device.  If true, a callout is
already scheduled, so we do not need a new one.  If false, we schedule
one and set it true.  Because resetting the hardware completed all
outstanding work but failed to clear is_completion_pending, no new
callout could be scheduled after a reset with pending work.

This resulted in a driver hang for polled-only work.
2016-08-22 14:51:09 +00:00
Conrad Meyer
0283c0f581 ioat(4): Don't process events past queue head
Fix a race where the completion routine could overrun the active ring
area in some situations.
2016-08-22 14:51:07 +00:00
Conrad Meyer
520f6023de ioat(4): Log channel number in CTR events 2016-08-05 02:56:31 +00:00
Conrad Meyer
1be25cc928 ioat(4): Check ring links at grow/shrink in INVARIANTS 2016-07-12 21:57:05 +00:00
Conrad Meyer
6d41e1ed15 ioat(4): Add KTR trace for ioat_reset_hw 2016-07-12 21:57:02 +00:00
Conrad Meyer
bd17dd691e ioat(4): Enhance KTR logging for descriptor completions 2016-07-12 21:57:00 +00:00
Conrad Meyer
3b188a672f ioat(4): Assert against ring underflow 2016-07-12 21:56:57 +00:00
Conrad Meyer
08a6b96aa5 ioat_reserve_space: Recheck quiescing flag after dropping submit lock
Fix a minor bound check error while here (ring can only hold 1 <<
MAX_ORDER - 1 entries).
2016-07-12 21:56:55 +00:00
Conrad Meyer
fe19e76ced ioat(4): Remove force_hw_error sysctl; it does not work reliably 2016-07-12 21:56:52 +00:00
Conrad Meyer
f8253f1a39 ioat(4): Export HW capabilities to consumers 2016-07-12 21:56:49 +00:00
Conrad Meyer
25ad958599 ioat(4): Submitters pick up a shovel if queue is too full
Before attempting to grow the ring.
2016-07-12 21:56:46 +00:00
Conrad Meyer
c62db3954c ioat(4): Don't shrink ring if active 2016-07-12 21:56:34 +00:00
Conrad Meyer
35b7a45ef2 ioat(4): Print some more useful information about the ring from ddb "show ioat" 2016-07-12 21:52:26 +00:00