Import portions of the PowerPC OF PCI implementation into
new file "ofw_pci.c", common for other platforms. The files ofw_pci.c and
ofw_pci.h from sys/powerpc/ofw no longer exist. All required declarations
are moved to sys/dev/ofw/ofw_pci.h.
This creates a new ofw_pci_write_ivar() function and modifies
ofw_pci_nranges(), ofw_pci_read_ivar(), ofw_pci_route_interrupt() methods.
Most functions contain existing ppc implementations in the majority
unchanged. Now there is no need to have multiple identical copies
of methods for various architectures.
Submitted by: Marcin Mazurek <mma@semihalf.com>
Obtained from: Semihalf
Sponsored by: Annapurna Labs
Reviewed by: jhibbits, mmel
Differential Revision: https://reviews.freebsd.org/D4879
Resource list for devices that are not ofwbus descendants, but
got to ofwbus method via bus_generic_release_resource() call chain,
cannot be found using BUS_GET_RESOURCE_LIST() used by ofwbus.
In that case, changing device's resource list should be avoided
(will not contain resource list prepared by ofw or simplebus).
Pointy-hat to: zbb
Reviewed by: wma
Obtained from: Semihalf
Sponsored by: Cavium
Differential Revision: https://reviews.freebsd.org/D5304
So one spinlock is avoided, which would be potentially dangerous for
virtual machine, if the spinlock holder was scheduled out by the host,
as noted by royger.
Old spinlock based txdesc list is still kept around, so we could have
a safe fallback.
No performance regression nor improvement is observed.
Reviewed by: adrian
Approved by: adrian (mentor)
MFC after: 1 week
Sponsored by: Microsoft OSTC
Differential Revision: https://reviews.freebsd.org/D5290
This paves the way for upcoming vRSS stuffs and eases more code cleanup.
Reviewed by: adrian
Approved by: adrian (mentor)
MFC after: 1 week
Sponsored by: Microsoft OSTC
Differential Revision: https://reviews.freebsd.org/D5283
Performance stays same; so no need to use fast taskqueue here.
Suggested by: royger
Reviewed by: adrian
Approved by: adrian (mentor)
MFC after: 1 week
Sponsored by: Microsoft OSTC
Differential Revision: https://reviews.freebsd.org/D5282
This also eases experiment on the non-fast taskqueue.
Reviewed by: adrian, Jun Su <junsu microsoft com>
Approved by: adrian (mentor)
MFC after: 1 week
Sponsored by: Microsoft OSTC
Differential Revision: https://reviews.freebsd.org/D5276
This paves the way for upcoming vRSS stuffs and eases more code cleanup.
Reviewed by: adrian
Approved by: adrian (mentor)
MFC after: 1 week
Sponsored by: Microsoft OSTC
Differential Revision: https://reviews.freebsd.org/D5275
And use SYSCTL+CTLFLAG_RDTUN for them.
Suggested by: adrian
Reviewed by: adrian, Hongjiang Zhang <honzhan microsoft com>
Approved by: adrian (mentor)
MFC after: 1 week
Sponsored by: Microsoft OSTC
Differential Revision: https://reviews.freebsd.org/D5274
This one gives the best performance so far.
Reviewed by: adrian
Approved by: adrian (mentor)
MFC after: 1 week
Sponsored by: Microsoft OSTC
Differential Revision: https://reviews.freebsd.org/D5273
It is off by default. This eases further experimenting on this driver.
Reviewed by: adrian
Approved by: adrian (mentor)
MFC after: 1 week
Sponsored by: Microsoft OSTC
Differential Revision: https://reviews.freebsd.org/D5272
Set TCP ACK append limit to 1, i.e. aggregate 2 ACKs at most. Aggregating
anything more than 2 hurts TCP sending performance in hyperv. This
significantly improves the TCP sending performance when the number of
concurrent connetion is low (2~8). And it greatly stabilizes the TCP
sending performance in other cases.
Set TCP data segments aggregation length limit to 37500. Without this
limitation, hn(4) could aggregate ~45 TCP data segments for each
connection (even at 64 or more connections) before dispatching them to
socket code; large aggregation slows down ACK sending and eventually
hurts/destabilizes TCP reception performance. This setting stabilizes
and improves TCP reception performance for >4 concurrent connections
significantly.
Make them sysctls so they could be adjusted.
Reviewed by: adrian, gallatin (previous version), hselasky (previous version)
Approved by: adrian (mentor)
MFC after: 1 week
Sponsored by: Microsoft OSTC
Differential Revision: https://reviews.freebsd.org/D5185
for all struct bio you get back from g_{new,alloc}_bio. Temporary
bios that you create on the stack or elsewhere should use this before
first use of the bio, and between uses of the bio. At the moment, it
is nothing more than a wrapper around bzero, but that may change in
the future. The wrapper also removes one place where we encode the
size of struct bio in the KBI.
will allow for code that uses the old fdt_get_range and fdt_regsize
functions to find a range, map it, access, then unmap to replace this, up
to and including the map, with a call to OF_decode_addr.
As this function should only be used in the early boot code the unmap is
mostly do document we no longer need the mapping as it's a no-op, at least
on arm.
Reviewed by: jhibbits
Sponsored by: ABT Systems Ltd
Differential Revision: https://reviews.freebsd.org/D5258
Marvell twsi part, however uses different register locations, as such split
the existing driver into Marvell and Allwinner attachments.
While here clean a few style issues.
Submitted by: Emmanuel Vadot <manu@bidouilliste.com>
Differential Revision: https://reviews.freebsd.org/D4846
This patch comes from Dave Jiang's Linux tree, davejiang/ntb. It hasn't
been accepted into Linus' tree, so I do not have an authoritative SHA1
to point at. Original commit log:
=====================================================================
A hardware errata causes the NTB to hang when heavy bi-directional
traffic in addition to the usage of BAR0/1 (where the registers reside,
including the doorbell registers to trigger interrupts).
This workaround is only available on Haswell and Broadwell platform.
The workaround is to enable split BAR in the BIOS to allow the 64bit
BAR4 to be split into two 32bit BAR4 and BAR5. The BAR4 shall be pointed
to LAPIC region of the remote host. We will bypass the db mechanism and
directly trigger the MSIX interrupts. The offsets and vectors are
exchanged during transport scratch pad negotiation. The scratch pads are
now overloaded in order to allow the exchange of the information. This
gets around using the doorbell and prevents the lockup with additional
pcode changes in BIOS.
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
=====================================================================
Notable changes in the FreeBSD version of this patch:
* The MSIX BAR is configurable, like hw.ntb.b2b_mw_idx (msix_mw_idx).
The Linux version of the patch only uses BAR4.
* MSIX negotiation aborts if the link goes down.
Obtained from: Linux (Dual BSD/GPL driver)
Sponsored by: EMC / Isilon Storage Division
The I/OAT HW reset process may sleep, so it is invalid to perform a
channel reset from the software interrupt thread.
Sponsored by: EMC / Isilon Storage Division
supported, use full-width aliases MSRs for writes. This fixes the
"[pmc,X] negative increment" assertion on the context switch when
clipped counter value is sign-extended.
Add definitions for the MSR IA32_PERF_CAPABILITIES needed to detect
the feature.
PR: 207068
Submitted by: joss.upton@yahoo.com
MFC after: 2 weeks
nvme(4) issues a SET_NUM_QUEUES command during device
initialization to ensure enough I/O queues exists for each
of the MSI-X vectors we have allocated. The SET_NUM_QUEUES
command is then issued again during nvme_ctrlr_start(), to
ensure that is properly set after any controller reset.
At least one NVMe drive exists which fails this second
SET_NUM_QUEUES command during device initialization. So
change nvme_ctrlr_start() to only issue its SET_NUM_QUEUES
command when it is coming out of a reset - avoiding the
duplicate SET_NUM_QUEUES during device initialization.
Reported by: gallatin
MFC after: 3 days
Sponsored by: Intel
xn_ifp is allocated in create_netdev with if_alloc(IFT_ETHER).
According to the current arrangement it can't be NULL.
Coverity ID: 1349805
Submitted by: Wei Liu <wei.liu2@citrix.com>
Reviewed by: royger
Sponsored by: Citrix Systems R&D
Differential revision: https://reviews.freebsd.org/D5252
The variable error is assigned to 0 before entering the switch.
Assigning error to 0 before break pointless rewrites the real error
value that should be returned.
Coverity ID: 1304974
Submitted by: Wei Liu <wei.liu2@citrix.com>
Reviewed by: royger
Sponsored by: Citrix Systems R&D
Differential revision: https://reviews.freebsd.org/D5250
Replace the hw.ntb.enable_writecombine tunable with
hw.ntb.default_mw_pat. It can be set with several specific numerical
values to select a caching type. Any bogus value is treated as
Uncacheable (UC).
The ntb_mw_set_wc() KPI has removed the restriction that the selected
mode must be one of UC, WC, or WB.
Sponsored by: EMC / Isilon Storage Division