Using unmapped IO is really beneficial when running inside of a VM,
since it avoids IPIs to other vCPUs in order to invalidate the
mappings.
This patch adds unmapped IO support to blkfront. The following tests
results have been obtained when running on a Xen host without HAP:
PVHVM
3165.84 real 6354.17 user 4483.32 sys
PVHVM with unmapped IO
2099.46 real 4624.52 user 2967.38 sys
This is because when running using shadow page tables TLB flushes and
range invalidations are much more expensive, so using unmapped IO
provides a very important performance boost.
Sponsored by: Citrix Systems R&D
MFC after: 2 weeks
X-MFC-with: r289834
The implementation of bus_dmamap_load_ma_triv currently calls
_bus_dmamap_load_phys on each page that is part of the passed in buffer.
Since each page is treated as an individual buffer, the resulting behaviour
is different from the behaviour of _bus_dmamap_load_buffer. This breaks
certain drivers, like Xen blkfront.
If an unmapped buffer of size 4096 that starts at offset 13 into the first
page is passed to the current _bus_dmamap_load_ma implementation (so the ma
array contains two pages), the result is that two segments are created, one
with a size of 4083 and the other with size 13 (because two independant
calls to _bus_dmamap_load_phys are performed, one for each physical page).
If the same is done with a mapped buffer and calling _bus_dmamap_load_buffer
the result is that only one segment is created, with a size of 4096.
This patch relegates the usage of bus_dmamap_load_ma_triv in x86 bounce
buffer code to drivers requesting BUS_DMA_KEEP_PG_OFFSET and implements
_bus_dmamap_load_ma so that it's behaviour is the same as the mapped version
(_bus_dmamap_load_buffer). This patch only modifies the x86 bounce buffer
code, other arches are left untouched.
Reviewed by: kib, jah
Differential Revision: https://reviews.freebsd.org/D888
Sponsored by: Citrix Systems R&D
From the (now removed) comment:
* It is unclear in some cases if the bit is implementation defined.
* The Foundation Model and QEMU disagree on if the IL bit should
* be set when we are in a data fault from the same EL and the ISV
* bit (bit 24) is also set.
Instead of adding even more special cases just remove the assertion.
Approved by: andrew
Sponsored by: The FreeBSD Foundation
These are not target-specific modules, so the logic to build them should
be common. This also enables them for arm64.
Sponsored by: The FreeBSD Foundation
FC port database code already notifies CAM about all devices. Additional
full scan is just a waste of time, that by definition won't find anything
that is not present in port database.
registers (they are used for different purposes).
- Wrap R92C_MSR modifications into urtwn_set_mode().
Reviewed by: kevlo
Approved by: adrian (mentor)
Differential Revision: https://reviews.freebsd.org/D3838
* Add a comment about the parameters I should support, stolen shamelessly
from iwn(4);
* Implement the rate bit for the raw transmit path;
* Print out the host-order versions of each of the transmit bits, so
I have a hope in heck of debugging why things are going wrong.
This still doesn't fix 5GHz in the office but that's likely due to a lot
of other configuration parameters being 2GHz-specific. That'll come next.
Tested:
* AR9170 + AR9103 (2/5GHz) 2x2, 5GHz association
Replace custom Linux-like logging with a thin shim around
device_printf(), when the softc is available.
In ioat_test, shim around printf(9) instead.
Sponsored by: EMC / Isilon Storage Division
This should export all of the same information as the Linux ntb_hw_intel
debugfs info file, but with a bit more structure, in the sysctl tree
rooted at 'dev.ntb_hw.<N>.debug_info'.
Raw registers are marked as OPAQUE because reading them on some hardware
revisions may cause a hard lockup (NTB errata). They can be read with
'sysctl -x dev.ntb_hw.<N>.debug_info.registers'. On Xeon platforms,
some additional registers are available under 'registers.xeon_stats' and
'registers.xeon_hw_err'. They are exported as big-endian values so that
the 'sysctl -x' output is legible.
Shrink the feature mask to 32 bits so we can use the %b formatter in
'debug_info.features'.
Sponsored by: EMC / Isilon Storage Division
linux_syscallnames[] from linux_* to linux32_* to avoid conflicts with
linux64.ko. While here, add support for linux64 binaries to systrace.
- Update NOPROTO entries in amd64/linux/syscalls.master to match the
main table to fix systrace build.
- Add a special case for union l_semun arguments to the systrace
generation.
- The systrace_linux32 module now only builds the systrace_linux32.ko.
module on amd64.
- Add a new systrace_linux module that builds on both i386 and amd64.
For i386 it builds the existing systrace_linux.ko. For amd64 it
builds a systrace_linux.ko for 64-bit binaries.
Reviewed by: markj
Differential Revision: https://reviews.freebsd.org/D3954
linux: fix handling of out-of-bounds syscall attempts
Due to an off by one the code would read an entry past the table, as
opposed to the last entry which contains the nosys handler.
Don't run the selftest until after we've enabled bus mastering, or the
DMA engine can't copy anything for our test.
Create the ioat_test device on attach, if so tuned. Destroy the
ioat_test device on teardown.
Replace deprecated 'CALLOUT_MPSAFE' with correct '1' in callout_init().
Sponsored by: EMC / Isilon Storage Division
Similar to r286787 for x86, this treats userspace buffers the same as unmapped buffers and no longer borrows the UVA for sync operations.
Submitted by: Svatopluk Kraus <onwahe@gmail.com> (earlier revision)
Tested by: Svatopluk Kraus
Differential Revision: https://reviews.freebsd.org/D3869
It turns out that it is pretty easy to make CloudABI work on ARM64. We
essentially only need to copy over the sysvec from AMD64 and ensure that
we use ARM64 specific registers.
As there is an overlap between function argument and return registers,
we do need to extend cloudabi64_schedtail() to only set its values if
we're actually forking. Not when we're creating a new thread.
Reviewed by: kib
Differential Revision: https://reviews.freebsd.org/D3917
For CloudABI we need to initialize the registers of new threads
differently based on whether the thread got created through a fork or
through simple thread creation.
Add a flag, TDP_FORKING, that is set by do_fork() and cleared by
fork_exit(). This can be tested against in schedtail.
Reviewed by: kib
Differential Revision: https://reviews.freebsd.org/D3973
In order to make it easier to support CloudABI on ARM64, move out all of
the bits from the AMD64 cloudabi_sysvec.c into a new file
cloudabi_module.c that would otherwise remain identical. This reduces
the AMD64 specific code to just ~160 lines.
Reviewed by: kib
Differential Revision: https://reviews.freebsd.org/D3974
This is an AR9331 part based on the AP121 reference design but with
32MB RAM. Yes, it has 4MB flash and it has no USB, so clever hacks
are required to get it up and working.
But boot/work it does.
This part seems to work bug-free with single byte TX/RX buffer alignment.
This drops the CPU requirement to bridge 100mbit iperf from 100% CPU
to ~ 50% CPU.
Tested:
* AP121 (AR9330) SoC, highly magic netbooted kernel + USB rootfs
due to 4mb flash, 16mb RAM; doing bridging between arge0 and arge1.
Notes:
* Yes, I likely can also turn this on for the AR934x SoC family now.
But since hardware design apparently follows similar branching
strategies to software design, I'll go and make sure all the AR934x's
that made it out into shipping products work before I flip it on.
The test logic now preallocates memory before running the test.
The buffer size is now configurable. Post-copy verification is
configurable. The number of copies to chain into one transaction (one
interrupt) is configurable.
A 'duration' mode is added, which repeats the test until the duration
has elapsed, reporting the B/s and transactions completed.
ioatcontrol.8 has been updated to document the new arguments.
Initial limits (on this particular Broadwell-DE) (and when the
interrupts are working) seem to be: 256 interrupts/sec or ~6 GB/s,
whichever limit is more restrictive.
Unfortunately, it seems the interrupt-reset handling on Broadwell isn't
working as intended. That will be fixed in a later commit.
Sponsored by: EMC / Isilon Storage Division
The FDT bindings for eeprom parts don't include any metadata about the
device other than the part name encoded in the compatible property.
Instead, a driver is required to have a compiled-in table of information
about the various parts (page size, device capacity, addressing scheme). So
much for FDT being an abstract description of hardware characteristics, huh?
In addition to the FDT-specific changes, this also switches to using the
newer iicbus_transfer_excl() mechanism which holds bus ownership for the
duration of the transfer. Previously this code held the bus across all
the transfers needed to complete the user's IO request, which could be
up to 128KB of data which might occupy the bus for 10-20 seconds. Now the
bus will be released and re-aquired between every page-sized (8-256 byte)
transfer, making this driver a much nicer citizen on the i2c bus.
The hint-based configuration mechanism is still in place for non-FDT systems.
Michal Meloun contributed some of the code for these changes.
while holding exclusive ownership of the bus. This is the routine most
slave drivers should use unless they have a need to acquire and hold the
bus across a series of related operations that involves multiple transfers.