The new load_ma implementation can cause invalid pointer dereferences when
used with certain drivers; back it out until the cause is found:
Fatal trap 12: page fault while in kernel mode
cpuid = 11; apic id = 03
fault virtual address = 0x30
fault code = supervisor read data, page not present
instruction pointer = 0x20:0xffffffff808a2d22
stack pointer = 0x28:0xfffffe07cc737710
frame pointer = 0x28:0xfffffe07cc737790
code segment = base 0x0, limit 0xfffff, type 0x1b
= DPL 0, pres 1, long 1, def32 0, gran 1
processor eflags = interrupt enabled, resume, IOPL = 0
current process = 13 (g_down)
trap number = 12
panic: page fault
cpuid = 11
KDB: stack backtrace:
#0 0xffffffff80641647 at kdb_backtrace+0x67
#1 0xffffffff80606762 at vpanic+0x182
#2 0xffffffff806067e3 at panic+0x43
#3 0xffffffff8084eef1 at trap_fatal+0x351
#4 0xffffffff8084f0e4 at trap_pfault+0x1e4
#5 0xffffffff8084e82f at trap+0x4bf
#6 0xffffffff80830d57 at calltrap+0x8
#7 0xffffffff8063beab at _bus_dmamap_load_ccb+0x1fb
#8 0xffffffff8063bc51 at bus_dmamap_load_ccb+0x91
#9 0xffffffff8042dcad at ata_dmaload+0x11d
#10 0xffffffff8042df7e at ata_begin_transaction+0x7e
#11 0xffffffff8042c18e at ataaction+0x9ce
#12 0xffffffff802a220f at xpt_run_devq+0x5bf
#13 0xffffffff802a17ad at xpt_action_default+0x94d
#14 0xffffffff802c0024 at adastart+0x8b4
#15 0xffffffff802a2e93 at xpt_run_allocq+0x193
#16 0xffffffff802c0735 at adastrategy+0xf5
#17 0xffffffff80554206 at g_disk_start+0x426
Uptime: 2m29s
Add a new flag for DMA operations, DMA_NO_WAIT. It behaves much like
other NOWAIT flags -- if queueing an operation would sleep, abort and
return NULL instead.
When growing the internal descriptor ring, the memory allocation is
performed outside of all locks. A lock-protected flag is used to avoid
duplicated work. Threads that cannot sleep and attempt to queue
operations when the descriptor ring is full allocate a larger ring with
M_NOWAIT, or bail if that fails.
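The ring-growth scheme described above can be sketched in userland C. This is only an illustration under stated assumptions, not the ioat(4) code: sk_ring, sk_ring_queue() and SK_NO_WAIT are hypothetical names, pthread mutexes stand in for kernel locks, calloc() stands in for malloc(9) with M_NOWAIT or M_WAITOK, and an errno value is returned where the commit describes returning NULL.

```c
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

#define SK_NO_WAIT 0x1          /* hypothetical stand-in for DMA_NO_WAIT */

struct sk_ring {
    pthread_mutex_t lock;
    pthread_cond_t  cv;         /* signalled when a resize attempt ends */
    int            *slots;
    size_t          size;       /* capacity, assumed >= 1 */
    size_t          used;
    bool            resizing;   /* lock-protected: a resize is in flight */
};

static int
sk_ring_init(struct sk_ring *r, size_t size)
{
    r->slots = calloc(size, sizeof(*r->slots));
    if (r->slots == NULL)
        return (ENOMEM);
    r->size = size;
    r->used = 0;
    r->resizing = false;
    pthread_mutex_init(&r->lock, NULL);
    pthread_cond_init(&r->cv, NULL);
    return (0);
}

/* Queue one value; returns 0, or EAGAIN/ENOMEM when it has to bail. */
static int
sk_ring_queue(struct sk_ring *r, int val, int flags)
{
    int *bigger;
    size_t newsize;

    pthread_mutex_lock(&r->lock);
    while (r->used == r->size) {
        if (r->resizing) {
            /* Someone else is growing the ring; do not duplicate work. */
            if (flags & SK_NO_WAIT) {
                pthread_mutex_unlock(&r->lock);
                return (EAGAIN);
            }
            pthread_cond_wait(&r->cv, &r->lock);
            continue;
        }
        r->resizing = true;
        newsize = r->size * 2;
        pthread_mutex_unlock(&r->lock);

        /* The allocation happens with no locks held. */
        bigger = calloc(newsize, sizeof(*bigger));

        pthread_mutex_lock(&r->lock);
        r->resizing = false;
        pthread_cond_broadcast(&r->cv);
        if (bigger == NULL) {
            pthread_mutex_unlock(&r->lock);
            return (ENOMEM);
        }
        memcpy(bigger, r->slots, r->used * sizeof(*bigger));
        free(r->slots);
        r->slots = bigger;
        r->size = newsize;
    }
    r->slots[r->used++] = val;
    pthread_mutex_unlock(&r->lock);
    return (0);
}
```

In this sketch a caller passing SK_NO_WAIT never blocks on the condition variable: it either grows the ring itself or gives up when another thread already holds the resizing flag.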
ioat_reserve_space() could become an external API if it is important to
callers that they have room for a sequence of operations, or that those
operations succeed each other directly in the hardware ring.
This patch splits the internal head index (->head) from the hardware's
head-of-chain (DMACOUNT) register (->hw_head). In the future, for
simplicity's sake, we could drop the 'ring' array entirely and just use
a linked list (with head and tail pointers rather than indices).
Suggested by: WITNESS
Sponsored by: EMC / Isilon Storage Division
Internal busses (thus ECAM access) should be mapped to
all values from 0 to 143.
Obtained from: Semihalf
Sponsored by: The FreeBSD Foundation
Differential revision: https://reviews.freebsd.org/D3753
When allocating a resource with an unspecified range, read the BAR
values that were already configured (by UEFI or other firmware).
This is necessary to make VNIC VFs work and to allow them to be
properly allocated.
Obtained from: Semihalf
Sponsored by: The FreeBSD Foundation
Differential revision: https://reviews.freebsd.org/D3752
The assertion used here was invalid. If the current thread holds any of
the locks, we never want to recurse on them.
Obtained from: Semihalf
Submitted by: Bartosz Szczepanek <bsz@semihalf.com>
Differential revision: https://reviews.freebsd.org/D3903
Add e6000sw driver supporting Marvell 88E6352, 88E6172, 88E6176 switches.
It needs to be attached to an mdio interface that exports SMI access
functionality. e6000sw supports port-based VLAN configuration, per-port
media changing, and access to PHY and switch registers.
e6000sw attaches miibuses and PHY drivers as children. Instead of the
typical callout-based tick, a kthread-based tick is used. Combined with SX
locks, this allows MDIO read/write calls to sleep. This is intentional,
because the hardware requires long delays in SMI read/write procedures,
which cannot be handled by busy-waiting.
Reviewed by: adrian
Obtained from: Semihalf
Submitted by: Bartosz Szczepanek <bsz@semihalf.com>
Differential revision: https://reviews.freebsd.org/D3902
This commit introduces support for etherswitch devices that use SMI as
a way of accessing their registers. The SMI register is located in the
address space of mge; access to it is exported through the MDIO interface.
Attachment functions were enhanced to ensure proper initialisation
in both cases: 1) PHYs attached directly to mge, 2) PHYs attached to a
switch device and the switch attached to mge. Attachment of an etherswitch
device depends on a dts entry with the compatible="mrvl,sw" property. If
none is found, the typical PHY attachment procedure follows.
When a switch is attached, the PHYs' status and configuration are
accessible via etherswitchcfg, and ifconfig shows always-up,
non-configurable mge interfaces.
Because there may be simultaneous accesses to the SMI registers
(e.g. from a PHY attached to one mge instance and a switch attached
to the other), an SMI access interlock was added. It is an SX lock
because the ability to sleep is necessary -- busy-waiting would result
in poor performance due to the long delays required by the hardware.
The underlying switch driver is obliged to use sleepable locks as well.
Reviewed by: adrian
Obtained from: Semihalf
Submitted by: Bartosz Szczepanek <bsz@semihalf.com>
Differential revision: https://reviews.freebsd.org/D3900
is 0. Without this change it was sleeping for one tick. Maybe not a big
deal, but it makes the share/dtrace/blocking script report it.
Reviewed by: jhb
Differential Revision: https://reviews.freebsd.org/D3814
Sponsored by: Wheel Systems, http://wheelsystems.com
IPv4 packets (when it should return FALSE). This happens because PF_ANEQ()
doesn't stop if the first 32 bits of IPv4 packets are equal and goes on to
check the next 3*32 bits (as for an IPv6 packet). Those bits contain garbage
and as a result PF_ANEQ() wrongly returns TRUE.
Fix: Check if the packet is of AF_INET type and, if it is, compare only the
first 32 bits of data.
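A minimal sketch of the comparison described in the fix, written as a plain C function rather than the actual PF_ANEQ() macro. The sk_addr type and sk_addr_neq() are hypothetical stand-ins; the only assumption taken from the text is that an address occupies four 32-bit words and that only the first word is meaningful for AF_INET.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <sys/socket.h>         /* AF_INET, AF_INET6 */

/* Hypothetical stand-in for a pf address: four 32-bit words. */
struct sk_addr {
    uint32_t addr32[4];
};

/* Returns true when a and b differ for the given address family. */
static bool
sk_addr_neq(const struct sk_addr *a, const struct sk_addr *b, int af)
{
    int i;

    /* IPv4: only the first word is valid; the rest may be garbage. */
    if (af == AF_INET)
        return (a->addr32[0] != b->addr32[0]);

    /* Otherwise (IPv6): compare all four words. */
    for (i = 0; i < 4; i++) {
        if (a->addr32[i] != b->addr32[i])
            return (true);
    }
    return (false);
}
```

With this shape, two IPv4 addresses whose first word matches compare as equal no matter what the remaining three words hold.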
PR: 204005
Submitted by: Miłosz Kaniewski
We need to reset the chancmp and chainaddr MMIO registers to bring the
device back to a working state.
Name the chanerr bits while we're here.
Sponsored by: EMC / Isilon Storage Division
We only need to borrow a mutex for the drain sleep and the 0->1
transition, so just reuse an existing one for now.
The wchan is arbitrary. Using refcount itself would have required
__DEVOLATILE(), so use the lock's address instead.
Different uses are tagged by kind, although we only do anything with
that information in INVARIANTS builds.
Sponsored by: EMC / Isilon Storage Division
Callers should have acquired this lock when they invoked ioat_acquire()
before issuing operations. Assert it is held.
Sponsored by: EMC / Isilon Storage Division
This is still the worst possible way to allocate memory if it will ever
be under pressure, but at least it won't deadlock.
Suggested by: WITNESS
Sponsored by: EMC / Isilon Storage Division
Pull out the timer callout delay into IOAT_INTR_TIMO and shorten it
considerably (5s -> 100ms). Single operations do not take 5-10 seconds
and when interrupts aren't working, waiting 100ms sucks a lot less than
5s.
Sponsored by: EMC / Isilon Storage Division
I couldn't test arge0->arge1 bridging, only arge0 VLAN bridging.
The DIR-825C1 only hooks up arge0 to the switch GMAC0 and so
you need to abuse VLANs to test.
Tested:
* DIR-825C1 (AR9344)
the temporary file to vers.c at the end of the script
The previous logic wrote out to vers.c multiple times, so the file
could be incorrectly interpreted as being completely written out
after one of the echo calls with recursive make, when in reality it
was only partially written.
Also, in the event the build was interrupted when creating vers.c
(small race window), it would have a leftover file that needed to
be cleaned up before resuming the build.
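The idea of writing everything to a temporary file and publishing it in one step can be sketched in C. The script in question is a shell script, so this is only an illustration of the technique; sk_write_atomically() and its parameters are hypothetical, and it assumes the temporary file and the target live on the same filesystem so that rename() replaces the target in a single step.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Write all lines to tmp_path, then rename it over path. Readers never
 * see a partially written path; on failure the temporary file is removed.
 */
static int
sk_write_atomically(const char *path, const char *tmp_path,
    const char *const *lines, size_t nlines)
{
    FILE *fp;
    size_t i;

    fp = fopen(tmp_path, "w");
    if (fp == NULL)
        return (-1);
    for (i = 0; i < nlines; i++) {
        if (fputs(lines[i], fp) == EOF) {
            fclose(fp);
            remove(tmp_path);
            return (-1);
        }
    }
    if (fclose(fp) != 0) {
        remove(tmp_path);
        return (-1);
    }
    /* Publish the finished file under its final name. */
    if (rename(tmp_path, path) != 0) {
        remove(tmp_path);
        return (-1);
    }
    return (0);
}
```

A build that is interrupted before the rename() leaves at most the temporary file behind, never a truncated target.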
MFC after: 3 weeks
Sponsored by: EMC / Isilon Storage Division
pager. It is enough to execute VOP_BMAP() once to obtain both the
disk block address for the requested page, and the before/after limits
for the contiguous run. The clipping of the vm_page_t array passed to
vnode_pager_generic_getpages() and the disk address for the first
page in the clipped array can be deduced from the call results.
While there, remove some noise (like if (1) {...}) and adjust nearby
code.
Reviewed by: alc
Discussed with: glebius
Tested by: pho
Sponsored by: The FreeBSD Foundation
MFC after: 3 weeks
ordered with the MFENCE instruction. Similar weak guarantees are also
specified by the AMD APM vol. 3 rev. 3.22. x86 pmap methods
pmap_invalidate_cache_range() and pmap_invalidate_cache_pages() braced
the CLFLUSH loop with MFENCE both before and after the loop.
In revision 56 of the SDM, Intel stated that all existing
implementations of CLFLUSH are strict: execution of CLFLUSH instructions
is ordered WRT other CLFLUSH instructions and writes. The strict
behaviour is also made architectural.
A new instruction CLFLUSHOPT (which was documented for some time in
the Instruction Set Extensions Programming Reference) provides the
weak behaviour which was previously attributed to CLFLUSH.
Use CLFLUSHOPT when available. When CLFLUSH is used on Intel CPUs, do
not execute MFENCE before and after the flushing loop.
Reviewed by: alc
Sponsored by: The FreeBSD Foundation
is a dcache invalidate to point of coherency just like dcache_inv_poc(), but
a slightly different version specific to dma operations. Elaborate the
comment about how and why it's different.
24xx and newer chips now support the full 8-byte LUN address space.
Older FC chips may support up to 16K LUNs when firmware allows.
Tested in both initiator and target modes for 23xx, 24xx and 25xx.