Most often, hv_vmbus_post_message() doesn't fail. However, it fails
intermittently when GPADLs of large shared memory is to be established
with the host, e.g. on the hn(4) attach path: a GPADL of 15MB sendbuf
is created, for which lots of messages will be flooded to the host.
The host side tries to throttle the message rate by returning
HV_STATUS_INSUFFICIENT_BUFFERS.
Before this commit, we do several retries for failed messages, but the
delay between each retry is pretty/too low, which will cause sporadic
message posting failure. We now use large delay (>=1ms) between each
retry to fix the message posting failure.
Submitted by: Dexuan Cui <decui microsoft com>
Reviewed by: sephe
MFC after: 1 week
Sponsored by: Microsoft OSTC
Differential Revision: https://reviews.freebsd.org/D5715
This mainly used to improve ACK timeliness when multiple RX rings
are enabled.
This value gives the best performance in both Azure and Hyper-V
environment, w/ both 10Ge and 40Ge using non-{INVARIANTS,WITNESS}
kernel.
MFC after: 1 week
Sponsored by: Microsoft OSTC
Differential Revision: https://reviews.freebsd.org/D5691
Submitted by: Jun Su <junsu microsoft com>
Reviewed by: Dexuan Cui <decui microsoft com>, sephe
MFC after: 1 week
Sponsored by: Microsoft OSTC
Differential Revision: https://reviews.freebsd.org/D5215
This gets rid of the per-cpu SWIs.
Submitted by: Jun Su <junsu microsoft com>
Reviewed by: Dexuan Cui <decui microsoft com>, sephe
MFC after: 1 week
Sponsored by: Microsoft OSTC
Differential Revision: https://reviews.freebsd.org/D5215
Using the same message slot as the other types of the messages has
the side effect that the event timer message could be deferred to
the swi threads to run (lacking of trapframe and the original code
didn't even handle that, so the event timer was actually broken).
As of this commit we use an independent message slot for event timer,
so that we could handle all of event timer messages in the interrupt
handler directly. Note, the message slot for event timer is still
bind to the same interrupt vector as the other types of messages.
Submitted by: Jun Su <junsu microsoft com>
Reviewed by: sephe
Discussed with: Jun Su <junsu microsoft com>, Dexuan Cui <decui microsoft com>
MFC after: 1 week
Sponsored by: Microsoft OSTC
Differential Revision: https://reviews.freebsd.org/D5696
Submitted by: Ju Sun <junsu microsoft com>
Reviewed by: Dexuan Cui <decui microsoft com>, sephe
MFC after: 1 week
Sponsored by: Microsoft OSTC
Differential Revision: https://reviews.freebsd.org/D5651
So that functions shared w/ attach path could use if_printf().
While I'm here, remove unnecessary if_dunit and if_dname assignment.
MFC after: 1 week
Sponsored by: Microsoft OSTC
Differential Revision: https://reviews.freebsd.org/D5576
Each channel contains one RX ring and one TX ring. And we
try to distribute the channels to different evenly.
Note: Currently we don't have enough information to extract
the RSS type and RSS hash value from the received packets.
This greatly improves the TX/RX performance for 8 virtual CPU
Hyper-V over 10Ge: it can max out 10Ge for TCP when multiple
RX/TX rings are enabled.
This almost doubles the TX/RX performance for locally connected
Hyper-Vs: was 6Gbps w/ 128 TCP streams, now 11Gbps w/ multiple
RX/TX rings enabled.
It is not enabled by default; it will be switched on after more
tests.
Collaborated with: Hongjiang Zhang <honzhan microsoft com>
MFC after: 2 week
Sponsored by: Microsoft OSTC
And since the host may not being able to allocate the # of rings
requested by us, save the # of rings allocated by the host in the
ring_inuse counters; use ring_inuse counters for run time operation.
This paves the way for the upcoming vRSS support.
MFC after: 1 week
Sponsored by: Microsoft OSTC
And use it for cpu0 assignment; it does not sound right to assume that
cpu0 maps to vcpu0. And this factored out function will be exposed to
drivers, if driver specific CPU binding is needed, e.g. hn(4).
Move default cpu select after saving channel offer message. This makes
sure that all useful information of the channel has been setup.
MFC after: 1 week
Sponsored by: Microsoft OSTC
Differential Revision: https://reviews.freebsd.org/D5504
The renamed function create a sysctl tree for channel, and many
non-statistics nodes exists, so don't claim it only adds sysctl
nodes for statistics.
MFC after: 1 week
Sponsored by: Microsoft OSTC
Differential Revision: https://reviews.freebsd.org/D5503
taskqueue_enqueue() was changed to support both fast and non-fast
taskqueues 10 years ago in r154167. It has been a compat shim ever
since. It's time for the compat shim to go.
Submitted by: Howard Su <howard0su@gmail.com>
Reviewed by: sephe
Differential Revision: https://reviews.freebsd.org/D5131
So that the host could dispatch the TX done back to this TX ring's
owner channel
MFC after: 1 week
Sponsored by: Microsoft OSTC
Differential Revision: https://reviews.freebsd.org/D5498
It would serve as a debug tool, if the shared buffer ring's indices
stopped updating.
Submitted by: HongJiang Zhang <honzhan microsoft com>
Reviewed by: sephe, Jun Su <junsu microsoft com>
Modified by: sephe
MFC after: 1 week
Sponsored by: Microsoft OSTC
Differential Revision: https://reviews.freebsd.org/D5402
sfence only makes sure about the store-store order, which is not
sufficient here. Use atomic_thread_fence_seq_cst() as suggested
jhb and kib (a locked op in the nutshell, which should have the
Reviewed by: jhb, kib, Jun Su <junsu microsoft com>
MFC after: 1 week
Sponsored by: Microsoft OSTC
Differential Revision: https://reviews.freebsd.org/D5436
Chimney sending buffer still needs conversion, which will be done
along with the upcoming vRSS support.
MFC after: 1 week
Sponsored by: Microsoft OSTC
Differential Revision: https://reviews.freebsd.org/D5457
This fixes the TX/RX ring selection for TX/RX done.
MFC after: 1 week
Sponsored by: Microsoft OSTC
Differential Revision: https://reviews.freebsd.org/D5454
This is preamble to associate the TX/RX rings to their channel.
While I'm here, revoke unused netvsc_recv_rollup.
MFC after: 1 week
Sponsored by: Microsoft OSTC
Differential Revision: https://reviews.freebsd.org/D5453
This is the preamble to pass channel back to hn(4) upon TX/RX done.
Reviewed by: Hongjiang Zhang <honzhan microsoft com>
MFC after: 1 week
Sponsored by: Microsoft OSTC
Differential Revision: https://reviews.freebsd.org/D5452
And unregister hv_device only for primary channels, who own the hv_device.
MFC after: 1 week
Sponsored by: Microsoft OSTC
Differential Revision: https://reviews.freebsd.org/D5451
Split heartbeat, shutdown and timesync out of utils code
and name them properly.
Submitted by: Jun Su <junsu microsoft com>
Reviewed by: adrian, sephe, Hongjiang Zhang <honzhan microsoft com>
MFC after: 1 week
Sponsored by: Microsoft OSTC
Differential Revision: https://reviews.freebsd.org/D5216
IF_PREPEND promises out-of-order packet sending when the TX desc list
is depleted. It was overlooked and copied blindly when the transmission
path was partially rewritten.
MFC after: 1 week
Sponsored by: Microsoft OSTC
Differential Revision: https://reviews.freebsd.org/D5386
It will be shared w/ the upcoming ifnet.if_transmit method
implementation.
No functional change.
MFC after: 1 week
Sponsored by: Microsoft OSTC
Differential Revision: https://reviews.freebsd.org/D5385
need to include it explicitly when <vm/vm_param.h> is already included.
Suggested by: alc
Reviewed by: alc
Differential Revision: https://reviews.freebsd.org/D5379