the coming ibcore and mlx5ib updates in order to support traffic redirection
to so-called raw ethernet QPs.
Remove unused E-switch related routines and files while at it.
Sponsored by: Mellanox Technologies
MFC after: 1 week
Set IPOIB_FLAG_INITIALIZED on dev_open and clear it on dev_stop to
avoid a race between ipoib load and the underlying device driver.
The device module must dispatch the IB_EVENT_PORT_ACTIVE event before the
ipoib module is loaded. Otherwise, the flush will fail since nothing has set
IPOIB_FLAG_INITIALIZED.
Submitted by: Slava Shwartsman <slavash@mellanox.com>
Sponsored by: Mellanox Technologies
MFC after: 1 week
* Check TLB1 in all mapdev cases, in case the memattr matches an existing
mapping (doesn't need to be MAP_DEFAULT).
* Fix mapping where the starting address is not a multiple of the widest size
base. For instance, it will now properly map 0xffffef000, size 0x11000 using
2 TLB entries, basing it at 0x****f000, instead of 0x***00000.
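A minimal userspace sketch of the splitting described above, assuming entry
sizes are 4 KB times a power of four and that each entry must be naturally
aligned. The names plan_tlb_chunks() and struct tlb_chunk are hypothetical and
are not taken from the pmap code:

```c
#include <assert.h>
#include <stdint.h>

/* One planned mapping entry: base address and size in bytes. */
struct tlb_chunk {
	uint64_t base;
	uint64_t size;
};

/*
 * Split [start, start + len) into naturally aligned chunks whose sizes are
 * 4 KB times a power of four. Returns the number of chunks written to
 * out[], or -1 if more than "max" chunks would be needed.
 */
static int
plan_tlb_chunks(uint64_t start, uint64_t len, struct tlb_chunk *out, int max)
{
	const uint64_t min_size = 4096;
	int n = 0;

	/* This sketch only handles page-aligned regions. */
	assert((start % min_size) == 0 && (len % min_size) == 0);

	while (len > 0) {
		uint64_t size = min_size;

		/* Grow by 4x while the chunk still fits and stays aligned. */
		while (size <= len / 4 && (start % (size * 4)) == 0)
			size *= 4;
		if (n == max)
			return (-1);
		out[n].base = start;
		out[n].size = size;
		n++;
		start += size;
		len -= size;
	}
	return (n);
}
```

With the address and size quoted above, this plan yields a 4 KB entry based at
the unaligned start followed by one 64 KB entry, two entries in total.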
MFC after: 2 weeks
Ever since r143063, machine/atomic.h requires cdefs.h. So, include it
first. Weak support: style(9) tells us to include cdefs.h first.
Argument against: since code that includes systm.h still compiles,
compilation units that include systm.h must already include cdefs.h. So, an
argument could be made that the cdefs.h include could just be removed
entirely. That is maybe a bigger change and not one I am interested in
bikeshedding.
Universe compiles.
Sponsored by: Dell EMC Isilon
This introduces a facility to EVENTHANDLER(9) for explicitly defining a
reference to an event handler list. This is useful since previously all
invokers of events had to do a locked traversal of the global list of
event handler lists in order to find the appropriate event handler list.
By keeping a pointer to the appropriate list an invoker can avoid this
traversal completely. The pointer is initialized with SYSINIT(9) during
the eventhandler stage. Users registering interest in events do not need
to know if the event is backed by such a list, since the list is added
to the global list of lists. As with lists that are not pre-defined it
is safe to register for the events before the list has been created.
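The difference between the two invocation paths can be modelled in userspace
as follows. This is a toy model only: struct ev_list, ev_find_list() and the
other names are hypothetical stand-ins, not the EVENTHANDLER(9) interfaces,
and locking is represented by a simple lookup counter:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MAX_LISTS 8

/* Toy stand-in for an event handler list; not the kernel structure. */
struct ev_list {
	const char *name;
	int invocations;
};

static struct ev_list all_lists[MAX_LISTS];	/* global list of lists */
static int nlists;
static int lookups;	/* counts name-based traversals */

/* Find a list by name, creating it if needed (the slow, "locked" path). */
static struct ev_list *
ev_find_list(const char *name)
{
	lookups++;
	for (int i = 0; i < nlists; i++)
		if (strcmp(all_lists[i].name, name) == 0)
			return (&all_lists[i]);
	assert(nlists < MAX_LISTS);
	all_lists[nlists].name = name;
	all_lists[nlists].invocations = 0;
	return (&all_lists[nlists++]);
}

/* Old style: every invocation resolves the list by name first. */
static void
ev_invoke_by_name(const char *name)
{
	ev_find_list(name)->invocations++;
}

/* New style: the caller holds a pointer that was resolved once at init. */
static void
ev_invoke_direct(struct ev_list *list)
{
	list->invocations++;
}
```

An invoker that resolves its pointer once pays for a single traversal, while
registrants can still find the same list by name through the global list.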
This converts the process_* and thread_* events to using the new
facility, as these are events whose locked traversals end up showing up
significantly in ports build workflows (and presumably other workflows
with many short lived threads/procs). It may be advantageous to convert
other events to using the new facility.
The el_flags field is now unused, but leave it be so that this revision
can be MFC'd.
Reviewed by: bdrewery, markj, mjg
Approved by: rstone (mentor)
In collaboration with: ian
MFC after: 4 weeks
Sponsored by: Dell EMC Isilon
Differential Revision: https://reviews.freebsd.org/D12814
Else the IPv6 address matching might fail. This change adds support for both
embedded and non-embedded IPv6 scope IDs when passing an IPv6 link-local socket
address to RDMA. Prior to this change only global IPv6 addresses would work
with RDMA.
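For illustration, the two representations of a link-local scope ID can be
converted into each other as sketched below. This assumes the KAME-style
convention of storing the zone index in bytes 2-3 of the address; the helper
names scope_embed() and scope_recover() are hypothetical and are not the
functions used by the RDMA code:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/socket.h>
#include <netinet/in.h>

/*
 * Move the scope ID from sin6_scope_id into bytes 2-3 of a link-local
 * address (the "embedded" form). Other addresses are left untouched.
 */
static void
scope_embed(struct sockaddr_in6 *sin6)
{
	assert(sin6 != NULL);
	if (!IN6_IS_ADDR_LINKLOCAL(&sin6->sin6_addr))
		return;
	sin6->sin6_addr.s6_addr[2] = (sin6->sin6_scope_id >> 8) & 0xff;
	sin6->sin6_addr.s6_addr[3] = sin6->sin6_scope_id & 0xff;
	sin6->sin6_scope_id = 0;
}

/* Inverse: pull an embedded scope ID back out into sin6_scope_id. */
static void
scope_recover(struct sockaddr_in6 *sin6)
{
	uint16_t zone;

	assert(sin6 != NULL);
	if (!IN6_IS_ADDR_LINKLOCAL(&sin6->sin6_addr))
		return;
	zone = (uint16_t)((sin6->sin6_addr.s6_addr[2] << 8) |
	    sin6->sin6_addr.s6_addr[3]);
	if (zone == 0)
		return;		/* nothing embedded */
	sin6->sin6_scope_id = zone;
	sin6->sin6_addr.s6_addr[2] = 0;
	sin6->sin6_addr.s6_addr[3] = 0;
}
```

Normalizing both sides of a comparison to one form is what lets a link-local
address match regardless of how the caller supplied the scope ID.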
Sponsored by: Mellanox Technologies
MFC after: 1 week
1) Fail to resolve RDMA address if rtalloc1() returns the loopback
device, lo0, as the gateway interface. Currently RDMA loopback is
not supported.
2) Use ip_dev_find() and ip6_dev_find() to look up network interfaces
with matching IPv4 and IPv6 addresses, respectively.
3) In addr_resolve() make sure the "ifa" pointer is always set, also when
the "ifp" is NULL. Else a NULL pointer access might happen trying to
read from the "ifa" pointer later on.
4) In rdma_addr_find_dmac_by_grh() make sure the "bound_dev_if" field
gets set properly instead of passing the scope ID through the IPv6
socket address structure. This is more in line with upstream OFED
in Linux.
5) In rdma_addr_find_smac_by_sgid() there is no need to pass the
scope ID for IPv6. Either it is stored in the "bound_dev_if" field
or ip6_dev_find() will find the correct network device regardless
of the scope ID.
Sponsored by: Mellanox Technologies
MFC after: 1 week
illumos/illumos-gate@2729521654
https://www.illumos.org/issues/7531
I found that some buffers that could be L2ARC eligible are not flagged
as such, leading to some performance impact. As a test I ran the same IO
workload 10 times in a row. It is a metadata-only workload (file
listing). l2arc_noprefetch=0.
Reviewed by: Matthew Ahrens <mahrens@delphix.com>
Reviewed by: Prakash Surya <prakash.surya@delphix.com>
Approved by: Dan McDonald <danmcd@joyent.com>
Author: benrubson <ben.rubson@gmail.com>
MFC after: 8 days
illumos/illumos-gate@f37ae9a714
https://www.illumos.org/issues/8713
If we're creating a pool with version >= SPA_VERSION_DSL_SCRUB (v11) we need to
account for additional space needed by the origin dataset which will also be
snapshotted: "poolname"+"/"+"$ORIGIN"+"@"+"$ORIGIN".
Enforce this limit in pool_namecheck().
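The length rule above can be sketched as a standalone check. The constant
values below (a 256-byte dataset name buffer and the "$ORIGIN" directory
name) are assumptions made for this sketch, and pool_name_fits() is a
hypothetical helper rather than the actual pool_namecheck() code:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Assumed values for this sketch; the real ones live in the ZFS headers. */
#define SKETCH_MAX_DATASET_NAME_LEN	256
#define SKETCH_ORIGIN_DIR_NAME		"$ORIGIN"

/*
 * Return true if "pool" leaves room for "<pool>/$ORIGIN@$ORIGIN" plus the
 * terminating NUL inside a dataset name buffer.
 */
static bool
pool_name_fits(const char *pool)
{
	/* One byte each for '/' and '@', plus two copies of the name. */
	size_t extra = 2 + 2 * strlen(SKETCH_ORIGIN_DIR_NAME);

	assert(pool != NULL);
	return (strlen(pool) + extra < SKETCH_MAX_DATASET_NAME_LEN);
}
```

Under these assumptions a pool name may be at most 239 characters long.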
Reviewed by: Prakash Surya <prakash.surya@delphix.com>
Reviewed by: Matthew Ahrens <mahrens@delphix.com>
Approved by: Dan McDonald <danmcd@joyent.com>
Author: loli10K <ezomori.nozomu@gmail.com>
MFC after: 1 week
With the current state of the AENQ handlers in the ENA driver, only
implemented handlers should be indicated.
Submitted by: Michal Krawczyk <mk@semihalf.com>
Reviewed by: byenduri_gmail.com
Obtained from: Semihalf
Sponsored by: Amazon, Inc.
Differential Revision: https://reviews.freebsd.org/D12872
Using only 1 descriptor on RX could be an issue if the system is low
on resources and cannot provide the driver with large chunks of
contiguous memory.
Submitted by: Michal Krawczyk <mk@semihalf.com>
Reviewed by: byenduri_gmail.com
Obtained from: Semihalf
Sponsored by: Amazon, Inc.
Differential Revision: https://reviews.freebsd.org/D12871
The device now provides the driver with the maximum MTU value it
can handle.
The function setting the MTU for the interface was simplified and reworked
to follow these changes.
Submitted by: Michal Krawczyk <mk@semihalf.com>
Reviewed by: byenduri_gmail.com
Obtained from: Semihalf
Sponsored by: Amazon, Inc.
Differential Revision: https://reviews.freebsd.org/D12870
The maximum number of io_cq was assumed to be the same as the maximum number
of io_sq indicated by the device working in normal mode (without LLQ).
This is not always true, especially when LLQ is enabled.
Fix it.
Submitted by: Michal Krawczyk <mk@semihalf.com>
Reviewed by: byenduri_gmail.com
Obtained from: Semihalf
Sponsored by: Amazon, Inc.
Differential Revision: https://reviews.freebsd.org/D12869
The driver was printing out a lot of information upon failure, which
is not necessarily of interest to the user.
Changing the logging level required rebuilding the driver with the proper
flags. A sysctl was added, so the level can now be changed dynamically
using a bitmask.
Levels of printouts were adjusted with the end user in mind rather than
debugging purposes.
More verbose messages were added to align the driver with the Linux driver.
Fix the build error introduced by r325506 by casting csum_flags to
uint64_t.
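A bitmask-controlled log level of the kind described above might look like
the following userspace sketch. The level names, their bit values and the
sk_trace() helper are hypothetical; the driver's real macros and sysctl name
are not shown in this message:

```c
#include <assert.h>
#include <stdarg.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical level bits; the driver's real names and values may differ. */
#define SK_LOG_ALERT	(1u << 0)
#define SK_LOG_WARNING	(1u << 1)
#define SK_LOG_INFO	(1u << 2)
#define SK_LOG_DBG	(1u << 3)

/* Runtime-tunable mask; in a driver this would be backed by a sysctl. */
static unsigned int sk_log_level = SK_LOG_ALERT | SK_LOG_WARNING;

/* Print only if "level" is enabled; returns the number of bytes printed. */
static int
sk_trace(unsigned int level, const char *fmt, ...)
{
	va_list ap;
	int n;

	assert(fmt != NULL);
	if ((level & sk_log_level) == 0)
		return (0);
	va_start(ap, fmt);
	n = vprintf(fmt, ap);
	va_end(ap);
	return (n);
}
```

Because each level is a separate bit, any combination of message classes can
be enabled at run time without rebuilding.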
Submitted by: Michal Krawczyk <mk@semihalf.com>
Reviewed by: byenduri_gmail.com
Obtained from: Semihalf
Sponsored by: Amazon, Inc.
Differential Revision: https://reviews.freebsd.org/D12868
Also keep the calculated vm_page_alloc_contig() flags in the variable
to not re-evaluate it on the loop iteration.
Noted by: alc
Sponsored by: The FreeBSD Foundation
This bug wasn't impacting anything, because both enums have
the same value, but it could cause a problem if the API changes.
Submitted by: Michal Krawczyk <mk@semihalf.com>
Reviewed by: byenduri_gmail.com
Obtained from: Semihalf
Sponsored by: Amazon, Inc.
Differential Revision: https://reviews.freebsd.org/D12867
The gcc compiler is more sensitive when a variable has a value
assigned but is not used anywhere afterwards.
Submitted by: Michal Krawczyk <mk@semihalf.com>
Reviewed by: rlibby
Obtained from: Semihalf
Sponsored by: Amazon, Inc.
Differential Revision: https://reviews.freebsd.org/D12866
The previous way of checking for DF was not valid.
When DF is enabled, the DF bit should be 1.
The original way of checking it was wrong in 2 ways: first of all, it
was not checking for a single bit; secondly, it was checking for 0.
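A correct single-bit test along these lines can be sketched as follows,
assuming the usual IPv4 header layout in which the DF flag is 0x4000 in the
16-bit flags/fragment-offset field. The helper name ip_df_is_set() is
hypothetical and this is not the driver's code:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <arpa/inet.h>

#define SK_IP_DF 0x4000	/* "don't fragment" flag, host byte order */

/*
 * Return true if DF is set in an IPv4 flags/fragment-offset field given
 * in network byte order. Only the DF bit is examined, and "set" means
 * the masked value is non-zero.
 */
static bool
ip_df_is_set(uint16_t ip_off_net)
{
	return ((ntohs(ip_off_net) & SK_IP_DF) != 0);
}
```

The mask isolates the one bit of interest, so other flags or a non-zero
fragment offset do not affect the result.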
Submitted by: Michal Krawczyk <mk@semihalf.com>
Reviewed by: byenduri_gmail.com
Obtained from: Semihalf
Sponsored by: Amazon, Inc.
Differential Revision: https://reviews.freebsd.org/D12865
Remove unused macros and fields; some of them were only initialized
and never used afterwards.
Implement minor style fixes and add required comments.
While here, add the missing TX completion counter, which existed
but was mistakenly left unused.
Submitted by: Michal Krawczyk <mk@semihalf.com>
Obtained from: Semihalf
Sponsored by: Amazon, Inc.
Differential Revision: https://reviews.freebsd.org/D12864
The situation where some of the MSI-x vectors were not configured properly
was not handled correctly. Now, the driver reduces the number of queues to
reflect the number of existing and properly configured MSI-x vectors.
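The queue reduction can be illustrated with a small helper. This sketch
assumes one vector is reserved for management and each I/O queue needs one
vector of its own; the constant and the function io_queues_for_vectors() are
hypothetical:

```c
#include <assert.h>

/* Assumption: one vector is reserved for management traffic. */
#define SK_MGMT_VECTORS 1

/*
 * Return how many I/O queues can be served when only "granted_vectors"
 * MSI-x vectors were configured successfully.
 */
static int
io_queues_for_vectors(int wanted_queues, int granted_vectors)
{
	int usable = granted_vectors - SK_MGMT_VECTORS;

	assert(wanted_queues > 0);
	if (usable <= 0)
		return (0);	/* no vector left for I/O queues */
	return (usable < wanted_queues ? usable : wanted_queues);
}
```

For example, a request for 8 queues with only 5 working vectors would be
reduced to 4 queues under these assumptions.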
Submitted by: Michal Krawczyk <mk@semihalf.com>
Reviewed by: byenduri_gmail.com
Obtained from: Semihalf
Sponsored by: Amazon, Inc.
Differential Revision: https://reviews.freebsd.org/D12863
A few counters were imported from the Linux driver and never used,
because of differences between the Linux and FreeBSD APIs.
Queue stops and resumes are no longer supported by the driver, and
the counters were being incremented for events that did not occur.
Submitted by: Michal Krawczyk <mk@semihalf.com>
Reviewed by: rlibby
Obtained from: Semihalf
Sponsored by: Amazon, Inc.
Differential Revision: https://reviews.freebsd.org/D12862
The driver was using it in only a few places, so the rest of the code
was covered with those statements.
Submitted by: Michal Krawczyk <mk@semihalf.com>
Reviewed by: rlibby
Obtained from: Semihalf
Sponsored by: Amazon, Inc.
Differential Revision: https://reviews.freebsd.org/D12861
* Change all conditional checks in "if" statements to boolean expressions
* Initialize variables with too complex values outside the declaration
* Fix indentation
* Move code associated with sysctls to the ena_sysctl.c file
* For consistency, remove unnecessary "return" from void functions
* Use the if_getdrvflags() function instead of accessing the variable directly
Submitted by: Michal Krawczyk <mk@semihalf.com>
Obtained from: Semihalf
Sponsored by: Amazon, Inc.
Differential Revision: https://reviews.freebsd.org/D12860
Some goto labels were renamed for consistency, and a few error handling
routines were reworked.
The drbr_free() call must be locked in case the code changes in the
future. For now, it should never be an issue, because the drbr is
flushed in the ena_down() call, and the lock is required only when there
are some mbufs inside.
Submitted by: Michal Krawczyk <mk@semihalf.com>
Reviewed by: byenduri_gmail.com
Obtained from: Semihalf
Sponsored by: Amazon, Inc.
Differential Revision: https://reviews.freebsd.org/D12859
Under heavy load, when the interrupt handling routine was slowed down,
memory corruption could occur, because resources were destroyed while an
interrupt was still being handled.
Submitted by: Michal Krawczyk <mk@semihalf.com>
Reviewed by: byenduri_gmail.com
Obtained from: Semihalf
Sponsored by: Amazon, Inc.
Differential Revision: https://reviews.freebsd.org/D12858
Pure cosmetic change for better readability of the driver.
Submitted by: Michal Krawczyk <mk@semihalf.com>
Reviewed by: byenduri_gmail.com
Obtained from: Semihalf
Sponsored by: Amazon, Inc.
Differential Revision: https://reviews.freebsd.org/D12857
When the Rx ring is full and the driver fails to allocate Rx mbufs,
the ring could be stalled.
Keep alive checks the Rx ring state every second, and if it is full
for two cycles, it triggers the rx_cleanup routine in another thread.
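The two-cycle rule can be sketched as a small counter that is advanced once
per keep-alive tick. The structure and function names here are hypothetical,
and the "ring_full" argument stands for whatever condition the driver uses to
decide that the ring is in the state described above:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define SK_STALL_LIMIT 2	/* consecutive checks before cleanup */

struct rx_watch {
	int full_checks;	/* consecutive ticks the ring was full */
	int cleanups_scheduled;	/* how often cleanup was triggered */
};

/*
 * Called once per keep-alive tick. Returns true when the cleanup routine
 * should be scheduled, i.e. the ring was seen full on SK_STALL_LIMIT
 * consecutive ticks.
 */
static bool
rx_watch_tick(struct rx_watch *w, bool ring_full)
{
	assert(w != NULL);
	if (!ring_full) {
		w->full_checks = 0;	/* progress was made, start over */
		return (false);
	}
	if (++w->full_checks < SK_STALL_LIMIT)
		return (false);
	w->full_checks = 0;
	w->cleanups_scheduled++;
	return (true);
}
```

Requiring two consecutive observations avoids triggering the cleanup for a
ring that is only momentarily full.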
Submitted by: Michal Krawczyk <mk@semihalf.com>
Reviewed by: byenduri_gmail.com
Obtained from: Semihalf
Sponsored by: Amazon, Inc.
Differential Revision: https://reviews.freebsd.org/D12856
The RX out-of-order completion feature allows RX descriptors to be
completed out of order by keeping track of all free descriptors in
a separate array.
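One way to track free descriptors in a separate array is sketched below. The
ring size, field names and helper functions are hypothetical and chosen only
to illustrate the idea; they do not reproduce the driver's data structures:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SK_RING_SIZE 4

struct rx_ring {
	uint16_t free_ids[SK_RING_SIZE];	/* IDs of unused buffers */
	uint16_t next_to_use;	/* slot holding the next free ID */
	uint16_t next_to_clean;	/* slot receiving the next completed ID */
	uint16_t in_flight;	/* buffers currently owned by the device */
};

static void
rx_ring_init(struct rx_ring *r)
{
	assert(r != NULL);
	for (uint16_t i = 0; i < SK_RING_SIZE; i++)
		r->free_ids[i] = i;
	r->next_to_use = r->next_to_clean = r->in_flight = 0;
}

/* Take a free buffer ID to post to the device; -1 if none is free. */
static int
rx_post(struct rx_ring *r)
{
	uint16_t id;

	if (r->in_flight == SK_RING_SIZE)
		return (-1);
	id = r->free_ids[r->next_to_use];
	r->next_to_use = (r->next_to_use + 1) % SK_RING_SIZE;
	r->in_flight++;
	return (id);
}

/* The device completed buffer "id"; completions may come in any order. */
static void
rx_complete(struct rx_ring *r, uint16_t id)
{
	assert(r->in_flight > 0 && id < SK_RING_SIZE);
	r->free_ids[r->next_to_clean] = id;
	r->next_to_clean = (r->next_to_clean + 1) % SK_RING_SIZE;
	r->in_flight--;
}
```

Because the array stores IDs rather than relying on ring position, a buffer
completed early becomes reusable immediately, whatever its original slot.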
Submitted by: Michal Krawczyk <mk@semihalf.com>
Reviewed by: byenduri_gmail.com
Obtained from: Semihalf
Sponsored by: Amazon, Inc.
Differential Revision: https://reviews.freebsd.org/D12855
There's no way currently to automatically prevent the bad .OBJDIR from being
created but it can at least be prevented from being used. Passing
WITHOUT_AUTO_OBJ=yes or MK_AUTO_OBJ=no or -DNO_OBJ in will prevent it.
Reported by: jeffr
Sponsored by: Dell EMC Isilon
Differential Revision: https://reviews.freebsd.org/D12989
DTS files switched from clocks under /clocks to a ccu (Clock Controller Unit)
a while ago.
Restore A13 functionality by adding a clock driver for it.
Almost all clocks are handled; the missing ones are mostly video-related
clocks.
Tested On: A13 Olinuxino
bsd.init.mk ends up including defs.mk so the per-arch options must be
set before including defs.mk, or else the global defaults will be
used and the per-arch ones will be ignored.
Although better, note that the usage of MK_FDT before the inclusion of
bsd.init.mk is incorrect but doesn't lead to build errors. This
circular dependency must be broken in order for this to work
correctly.
Reviewed by: imp
Sponsored by: Citrix Systems R&D
Compared to the previous version, v0.16, there are a couple of minor
changes:
- CLOUDABI_AT_PID: Process identifiers for CloudABI processes.
Initially, BSD process identifiers weren't exposed inside the runtime,
due to them being pretty much useless inside of a cluster computing
environment. When jobs are scheduled across systems, the BSD process
number doesn't act as an identifier. Even on individual systems they
may recycle relatively quickly.
With this change, the kernel will now generate a UUIDv4 when executing
a process. These UUIDs can be obtained within the process using
program_getpid(). Right now, FreeBSD will not attempt to store this
value. This should of course happen at some point in time, so that it
may be printed by administration tools.
- Removal of some unused structure members for polling.
With the polling framework being simplified/redesigned, it turns out
some of the structure fields were not used by the C library. We can
remove these to keep things nice and tidy.
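As background for the UUIDv4 mentioned above, the sketch below shows how 16
random bytes are turned into a version 4 UUID by fixing the version and
variant bits, following the RFC 4122 layout. The function names are
hypothetical and this is not the kernel's implementation; the caller is
assumed to supply the random bytes:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Stamp version 4 and the RFC 4122 variant onto 16 random bytes. */
static void
uuid4_from_random(uint8_t uuid[16])
{
	assert(uuid != NULL);
	uuid[6] = (uuid[6] & 0x0f) | 0x40;	/* version 4 */
	uuid[8] = (uuid[8] & 0x3f) | 0x80;	/* variant 10xx */
}

/* Format as the usual 8-4-4-4-12 hex string; "out" needs 37 bytes. */
static void
uuid4_format(const uint8_t u[16], char out[37])
{
	snprintf(out, 37,
	    "%02x%02x%02x%02x-%02x%02x-%02x%02x-%02x%02x-"
	    "%02x%02x%02x%02x%02x%02x",
	    u[0], u[1], u[2], u[3], u[4], u[5], u[6], u[7],
	    u[8], u[9], u[10], u[11], u[12], u[13], u[14], u[15]);
}
```

Only six bits are fixed, so the remaining 122 bits stay random, which is why
such identifiers remain usable when jobs are scheduled across systems.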
Obtained from: https://github.com/NuxiNL/cloudabi
- Fix clearing of the doorbell queue buffer for ADAPTER_TYPE_B
- Fix releasing of memory resources when detaching the device
- Add support for ARC-1216, 1226 SAS 12Gb controllers
- Declare some functions as static
- Change checking of dword read/write for the IOP rqbuffer.
Many thanks to Areca for continuing to support FreeBSD.
Submitted by: 黃清隆 <ching2048 areca com tw>
MFC after: 2 weeks
similar to the kernel memory allocator.
This simplifies NUMA allocation because the domain will be known at wait
time and races between failure and sleeping are eliminated. This also
reduces boilerplate code and simplifies callers.
A wait primitive is supplied for uma zones for similar reasons. This
eliminates some non-specific VM_WAIT calls in favor of more explicit
sleeps that may be satisfied without new pages.
Reviewed by: alc, kib, markj
Tested by: pho
Sponsored by: Netflix, Dell/EMC Isilon