Submitted by: Jun Su <junsu microsoft com>
Reviewed by: jhb, kib, sephe
Sponsored by: Microsoft OSTC
Differential Revision: https://reviews.freebsd.org/D5910
8 gives the best performance in both Azure and local Hyper-V on both
10Ge and 40Ge. More rings are still allowed by manual configuration.
Reviewed by: Dexuan Cui <decui microsoft com>
MFC after: 1 week
Sponsored by: Microsoft OSTC
Differential Revision: https://reviews.freebsd.org/D5879
This time we make sure that the TIME_REF_COUNT MSR exists.
Submitted by: Jun Su <junsu microsoft com>
Reviewed by: sephe, Dexuan Cui <decui microsoft com>
MFC after: 1 week
Sponsored by: Microsoft OSTC
Features bits will be used to detect devices, e.g. timers, which
do not have corresponding event channels.
Submitted by: Jun Su <junsu microsoft com>
Reviewed by: sephe, Dexuan Cui <decui microsoft com>
Rearranged by: sephe
MFC after: 1 week
Sponsored by: Microsoft OSTC
Use vm_guest == VM_GUEST_HV is not enough to determine whether FreeBSD
is running on Hyper-V or not. What a mess.
Reported by: smokehydration tutanota com
Sponsored by: Microsoft OSTC
Suggested by: jhb
Reviewed by: Dexuan Cui <decui microsoft com>, Jun Su <junsu microsoft com>
Sponsored by: Microsoft OSTC
Differential Revision: https://reviews.freebsd.org/D5850
First of all sema_post() can't be called w/ spinlock, and the channel
message queue processing is not on hot code path, i.e. spinlock is not
necessary.
Submitted by: Jun Su <junsu microsoft com>
Reviewed by: sephe, Dexuan Cui <decui microsoft com>
MFC after: 1 week
Sponsored by: Microsoft OSTC
Differential Revision: https://reviews.freebsd.org/D5812
Since atomic_thread_fence_seq_cst() will become compiler fence on UP kernel.
Reviewed by: kib, Dexuan Cui <decui microsoft com>
MFC after: 1 week
Sponsored by: Microsoft OSTC
Differential Revision: https://reviews.freebsd.org/D5852
And factor out tcp_lro_rx_done, which deduplicates the same logic with
netinet/tcp_lro.c
Reviewed by: gallatin (1st version), hps, zbb, np, Dexuan Cui <decui microsoft com>
Sponsored by: Microsoft OSTC
Differential Revision: https://reviews.freebsd.org/D5725
The i8254 simulation in Hyper-V is kinda broken and is not available
in Generation 2 Hyper-V VMs, so Hyper-V timer must be registered early
enough so that it can be used to do the TSC freq calibration.
This fixes the notorious warning like this:
calcru: runtime went backwards from 50 usec to 25 usec for pid 0 (kernel)
Submitted by: Dexuan Cui <decui microsoft com>
Reviewed by: kib, sephe
Tested by: kib, sephe
MFC after: 1 week
Sponsored by: Microsoft OSTC
Differential Revision: https://reviews.freebsd.org/D5778
Using one taskqueue does not work, since the EOM MSR must be written
on the msg's owner CPU.
Noticed by: Jun Su <junsu microsoft com>
Discussed with: Jun Su <junsu microsoft com>, Dexuan Cui <decui microsoft com>
MFC after: 1 week
Sponsored by: Microsoft OSTC
Most often, hv_vmbus_post_message() doesn't fail. However, it fails
intermittently when GPADLs of large shared memory is to be established
with the host, e.g. on the hn(4) attach path: a GPADL of 15MB sendbuf
is created, for which lots of messages will be flooded to the host.
The host side tries to throttle the message rate by returning
HV_STATUS_INSUFFICIENT_BUFFERS.
Before this commit, we do several retries for failed messages, but the
delay between each retry is pretty/too low, which will cause sporadic
message posting failure. We now use large delay (>=1ms) between each
retry to fix the message posting failure.
Submitted by: Dexuan Cui <decui microsoft com>
Reviewed by: sephe
MFC after: 1 week
Sponsored by: Microsoft OSTC
Differential Revision: https://reviews.freebsd.org/D5715
This mainly used to improve ACK timeliness when multiple RX rings
are enabled.
This value gives the best performance in both Azure and Hyper-V
environment, w/ both 10Ge and 40Ge using non-{INVARIANTS,WITNESS}
kernel.
MFC after: 1 week
Sponsored by: Microsoft OSTC
Differential Revision: https://reviews.freebsd.org/D5691
Submitted by: Jun Su <junsu microsoft com>
Reviewed by: Dexuan Cui <decui microsoft com>, sephe
MFC after: 1 week
Sponsored by: Microsoft OSTC
Differential Revision: https://reviews.freebsd.org/D5215
This gets rid of the per-cpu SWIs.
Submitted by: Jun Su <junsu microsoft com>
Reviewed by: Dexuan Cui <decui microsoft com>, sephe
MFC after: 1 week
Sponsored by: Microsoft OSTC
Differential Revision: https://reviews.freebsd.org/D5215
Using the same message slot as the other types of the messages has
the side effect that the event timer message could be deferred to
the swi threads to run (lacking of trapframe and the original code
didn't even handle that, so the event timer was actually broken).
As of this commit we use an independent message slot for event timer,
so that we could handle all of event timer messages in the interrupt
handler directly. Note, the message slot for event timer is still
bind to the same interrupt vector as the other types of messages.
Submitted by: Jun Su <junsu microsoft com>
Reviewed by: sephe
Discussed with: Jun Su <junsu microsoft com>, Dexuan Cui <decui microsoft com>
MFC after: 1 week
Sponsored by: Microsoft OSTC
Differential Revision: https://reviews.freebsd.org/D5696
Submitted by: Ju Sun <junsu microsoft com>
Reviewed by: Dexuan Cui <decui microsoft com>, sephe
MFC after: 1 week
Sponsored by: Microsoft OSTC
Differential Revision: https://reviews.freebsd.org/D5651
So that functions shared w/ attach path could use if_printf().
While I'm here, remove unnecessary if_dunit and if_dname assignment.
MFC after: 1 week
Sponsored by: Microsoft OSTC
Differential Revision: https://reviews.freebsd.org/D5576
Each channel contains one RX ring and one TX ring. And we
try to distribute the channels to different evenly.
Note: Currently we don't have enough information to extract
the RSS type and RSS hash value from the received packets.
This greatly improves the TX/RX performance for 8 virtual CPU
Hyper-V over 10Ge: it can max out 10Ge for TCP when multiple
RX/TX rings are enabled.
This almost doubles the TX/RX performance for locally connected
Hyper-Vs: was 6Gbps w/ 128 TCP streams, now 11Gbps w/ multiple
RX/TX rings enabled.
It is not enabled by default; it will be switched on after more
tests.
Collaborated with: Hongjiang Zhang <honzhan microsoft com>
MFC after: 2 week
Sponsored by: Microsoft OSTC
And since the host may not being able to allocate the # of rings
requested by us, save the # of rings allocated by the host in the
ring_inuse counters; use ring_inuse counters for run time operation.
This paves the way for the upcoming vRSS support.
MFC after: 1 week
Sponsored by: Microsoft OSTC
And use it for cpu0 assignment; it does not sound right to assume that
cpu0 maps to vcpu0. And this factored out function will be exposed to
drivers, if driver specific CPU binding is needed, e.g. hn(4).
Move default cpu select after saving channel offer message. This makes
sure that all useful information of the channel has been setup.
MFC after: 1 week
Sponsored by: Microsoft OSTC
Differential Revision: https://reviews.freebsd.org/D5504
The renamed function create a sysctl tree for channel, and many
non-statistics nodes exists, so don't claim it only adds sysctl
nodes for statistics.
MFC after: 1 week
Sponsored by: Microsoft OSTC
Differential Revision: https://reviews.freebsd.org/D5503