Commit Graph

197 Commits

Author SHA1 Message Date
Sepherosa Ziehau
222c454d33 hyperv: Revert r297481
Use vm_guest == VM_GUEST_HV is not enough to determine whether FreeBSD
is running on Hyper-V or not.  What a mess.

Reported by:	smokehydration tutanota com
Sponsored by:	Microsoft OSTC
2016-04-08 09:20:46 +00:00
Sepherosa Ziehau
e43d20993a hyperv: Use lapic_{alloc,free}_ipi to allocate private interrupt vector
Suggested by:	jhb
Reviewed by:	Dexuan Cui <decui microsoft com>, Jun Su <junsu microsoft com>
Sponsored by:	Microsoft OSTC
Differential Revision:	https://reviews.freebsd.org/D5850
2016-04-07 07:12:57 +00:00
Sepherosa Ziehau
a121087069 hyperv: Typo in r297634
Noticed by:	hiren
MFC after:	1 week
Sponsored by:	Microsoft OSTC
2016-04-07 05:56:22 +00:00
Sepherosa Ziehau
aec810d4c5 hyperv/vmbus: Use default mtx for channel message queue
First of all sema_post() can't be called w/ spinlock, and the channel
message queue processing is not on hot code path, i.e. spinlock is not
necessary.

Submitted by:	Jun Su <junsu microsoft com>
Reviewed by:	sephe, Dexuan Cui <decui microsoft com>
MFC after:	1 week
Sponsored by:	Microsoft OSTC
Differential Revision:	https://reviews.freebsd.org/D5812
2016-04-07 05:45:49 +00:00
Sepherosa Ziehau
d8258c4498 hyperv: Use mb() instead of atomic_thread_fence_seq_cst()
Since atomic_thread_fence_seq_cst() will become compiler fence on UP kernel.

Reviewed by:	kib, Dexuan Cui <decui microsoft com>
MFC after:	1 week
Sponsored by:	Microsoft OSTC
Differential Revision:	https://reviews.freebsd.org/D5852
2016-04-07 05:31:22 +00:00
Sepherosa Ziehau
6dd38b8716 tcp/lro: Use tcp_lro_flush_all in device drivers to avoid code duplication
And factor out tcp_lro_rx_done, which deduplicates the same logic with
netinet/tcp_lro.c

Reviewed by:	gallatin (1st version), hps, zbb, np, Dexuan Cui <decui microsoft com>
Sponsored by:	Microsoft OSTC
Differential Revision:	https://reviews.freebsd.org/D5725
2016-04-01 06:28:33 +00:00
Sepherosa Ziehau
04c247a1d0 hyperv: Register Hyper-V timer early enough for TSC freq calibration
The i8254 simulation in Hyper-V is kinda broken and is not available
in Generation 2 Hyper-V VMs, so Hyper-V timer must be registered early
enough so that it can be used to do the TSC freq calibration.

This fixes the notorious warning like this:
calcru: runtime went backwards from 50 usec to 25 usec for pid 0 (kernel)

Submitted by:	Dexuan Cui <decui microsoft com>
Reviewed by:	kib, sephe
Tested by:	kib, sephe
MFC after:	1 week
Sponsored by:	Microsoft OSTC
Differential Revision:	https://reviews.freebsd.org/D5778
2016-04-01 06:17:57 +00:00
Sepherosa Ziehau
4793a1da1c hyperv/vmbus: Create per-cpu fast taskqueue for msg handling
Using one taskqueue does not work, since the EOM MSR must be written
on the msg's owner CPU.

Noticed by:	Jun Su <junsu microsoft com>
Discussed with:	Jun Su <junsu microsoft com>, Dexuan Cui <decui microsoft com>
MFC after:	1 week
Sponsored by:	Microsoft OSTC
2016-03-24 02:15:23 +00:00
Sepherosa Ziehau
0fc8db6eed hyperv/utils: Allow hint to disable individual utility
Reviewed by:	kib, Dexuan Cui <decui microsoft com>
MFC after:	1 week
Sponsored by:	Microsoft OSTC
Differential Revision:	https://reviews.freebsd.org/D5714
2016-03-24 01:12:28 +00:00
Sepherosa Ziehau
25a40f12b0 hyperv/vmbus: use a better retry method in hv_vmbus_post_message()
Most often, hv_vmbus_post_message() doesn't fail.  However, it fails
intermittently when GPADLs of large shared memory is to be established
with the host, e.g. on the hn(4) attach path: a GPADL of 15MB sendbuf
is created, for which lots of messages will be flooded to the host.
The host side tries to throttle the message rate by returning
HV_STATUS_INSUFFICIENT_BUFFERS.

Before this commit, we do several retries for failed messages, but the
delay between each retry is pretty/too low, which will cause sporadic
message posting failure.  We now use large delay (>=1ms) between each
retry to fix the message posting failure.

Submitted by:	Dexuan Cui <decui microsoft com>
Reviewed by:	sephe
MFC after:	1 week
Sponsored by:	Microsoft OSTC
Differential Revision:	https://reviews.freebsd.org/D5715
2016-03-24 00:40:41 +00:00
Sepherosa Ziehau
2e4dba97bd hyperv/hn: When short of mbufs on the RX path, don't spam the console.
Instead, increase the IQDROPS counter.

MFC after:	1 week
Sponsored by:	Microsoft OSTC
Differential Revision:	https://reviews.freebsd.org/D5693
2016-03-22 07:08:47 +00:00
Sepherosa Ziehau
59526d4ac7 hyperv/hn: Factor out hn_set_lro_lenlim()
MFC after:	1 week
Sponsored by:	Microsoft OSTC
Differential Revision:	https://reviews.freebsd.org/D5692
2016-03-22 06:42:24 +00:00
Sepherosa Ziehau
16ac3db526 hyperv/hn: Reduce TCP segment aggregation limit for multiple RX rings
This mainly used to improve ACK timeliness when multiple RX rings
are enabled.

This value gives the best performance in both Azure and Hyper-V
environment, w/ both 10Ge and 40Ge using non-{INVARIANTS,WITNESS}
kernel.

MFC after:	1 week
Sponsored by:	Microsoft OSTC
Differential Revision:	https://reviews.freebsd.org/D5691
2016-03-22 06:31:39 +00:00
Sepherosa Ziehau
1f0dfd9918 hyperv/vmbus: Remove NULL check for taskqueue_create_fast(M_WAITOK)
Submitted by:	Jun Su <junsu microsoft com>
Reviewed by:	Dexuan Cui <decui microsoft com>, sephe
MFC after:	1 week
Sponsored by:	Microsoft OSTC
Differential Revision:	https://reviews.freebsd.org/D5215
2016-03-22 06:23:09 +00:00
Sepherosa Ziehau
595a21e69e hyperv/vmbus: Use taskqueue_fast for non-performance critical messages
This gets rid of the per-cpu SWIs.

Submitted by:	Jun Su <junsu microsoft com>
Reviewed by:	Dexuan Cui <decui microsoft com>, sephe
MFC after:	1 week
Sponsored by:	Microsoft OSTC
Differential Revision:	https://reviews.freebsd.org/D5215
2016-03-22 06:13:27 +00:00
Sepherosa Ziehau
1948628a54 hyperv/evttimer: Use an independent message slot so that it can work
Using the same message slot as the other types of the messages has
the side effect that the event timer message could be deferred to
the swi threads to run (lacking of trapframe and the original code
didn't even handle that, so the event timer was actually broken).

As of this commit we use an independent message slot for event timer,
so that we could handle all of event timer messages in the interrupt
handler directly.  Note, the message slot for event timer is still
bind to the same interrupt vector as the other types of messages.

Submitted by:	Jun Su <junsu microsoft com>
Reviewed by:	sephe
Discussed with: Jun Su <junsu microsoft com>, Dexuan Cui <decui microsoft com>
MFC after:	1 week
Sponsored by:	Microsoft OSTC
Differential Revision:	https://reviews.freebsd.org/D5696
2016-03-22 05:48:51 +00:00
Sepherosa Ziehau
b6ba8b778a hyperv/vmbus: Implement bus_child_pnpinfo_str method
Submitted by:	Jun Su <junsu microsoft com>
Sponsored by:	Microsoft OSTC
Differential Revision:	https://reviews.freebsd.org/D5669
2016-03-21 07:16:30 +00:00
Sepherosa Ziehau
57a339543d hyperv: Factor out snprinf_hv_guid()
Submitted by:	Ju Sun <junsu microsoft com>
Reviewed by:	Dexuan Cui <decui microsoft com>, sephe
MFC after:	1 week
Sponsored by:	Microsoft OSTC
Differential Revision:	https://reviews.freebsd.org/D5651
2016-03-21 06:54:21 +00:00
Sepherosa Ziehau
f9fbf67e74 hyperv/hn: Make the # of TX rings configurable.
Rename the tunables to avoid confusion.

MFC after:	1 week
Sponsored by:	Microsoft OSTC
Differential Revision:	https://reviews.freebsd.org/D5578
2016-03-10 02:37:47 +00:00
Sepherosa Ziehau
9e76da0054 hyperv/hn: Factor out hn_channel_attach
MFC after:	1 week
Sponsored by:	Microsoft OSTC
Differential Revision:	https://reviews.freebsd.org/D5577
2016-03-10 02:28:01 +00:00
Sepherosa Ziehau
431b98ddc3 hyperv/hn: Move if_initname to an earlier place
So that functions shared w/ attach path could use if_printf().

While I'm here, remove unnecessary if_dunit and if_dname assignment.

MFC after:	1 week
Sponsored by:	Microsoft OSTC
Differential Revision:	https://reviews.freebsd.org/D5576
2016-03-10 02:13:42 +00:00
Sepherosa Ziehau
05c0884ee1 hyperv/hn: Add per-TX ring stats for # of transmitted packets
MFC after:	2 weeks
Sponsored by:	Microsoft OSTC
2016-03-04 07:07:42 +00:00
Sepherosa Ziehau
7dda664075 hyperv/hn: Pass channel to send done callbacks.
Mainly to strigent the data packet send done check.

MFC after:	2 weeks
Sponsored by:	Microsoft OSTC
2016-03-04 07:00:37 +00:00
Sepherosa Ziehau
21640df213 hyperv/hn: Add multiple channel support, a.k.a. vRSS
Each channel contains one RX ring and one TX ring.  And we
try to distribute the channels to different evenly.

Note: Currently we don't have enough information to extract
the RSS type and RSS hash value from the received packets.

This greatly improves the TX/RX performance for 8 virtual CPU
Hyper-V over 10Ge: it can max out 10Ge for TCP when multiple
RX/TX rings are enabled.

This almost doubles the TX/RX performance for locally connected
Hyper-Vs: was 6Gbps w/ 128 TCP streams, now 11Gbps w/ multiple
RX/TX rings enabled.

It is not enabled by default; it will be switched on after more
tests.

Collaborated with:	Hongjiang Zhang <honzhan microsoft com>
MFC after:	2 week
Sponsored by:	Microsoft OSTC
2016-03-04 06:52:11 +00:00
Sepherosa Ziehau
3951eba165 hyperv/hn: Make # of rings configurable
And since the host may not being able to allocate the # of rings
requested by us, save the # of rings allocated by the host in the
ring_inuse counters; use ring_inuse counters for run time operation.

This paves the way for the upcoming vRSS support.

MFC after:	1 week
Sponsored by:	Microsoft OSTC
2016-03-02 05:24:55 +00:00
Sepherosa Ziehau
33c6a6670f hyperv/hn: Fix typo in comment
MFC after:	1 week
Sponsored by:	Microsoft OSTC
2016-03-02 03:19:59 +00:00
Sepherosa Ziehau
6c03f94bb3 hyperv/hn: Make read buffer per-channel
Submitted by:	Hongjiang Zhang <honzhan microsoft com>
Reorganized by:	sephe
MFC after:	1 week
Sponsored by:	Microsoft OSTC
2016-03-02 03:07:31 +00:00
Sepherosa Ziehau
2ad2701c2a hyperv/hn: Pass channel to hv_nv_on_receive_completion()
While I'm here, staticize this function.

Submitted by:	Hongjiang Zhang <honzhan microsoft com>
Modified by:	sephe
MFC after:	1 week
Sponsored by:	Microsoft OSTC
2016-03-02 02:27:13 +00:00
Sepherosa Ziehau
9e5c6042f9 hyperv/chan: Factor out the vcpu setting
And use it for cpu0 assignment; it does not sound right to assume that
cpu0 maps to vcpu0.  And this factored out function will be exposed to
drivers, if driver specific CPU binding is needed, e.g. hn(4).

Move default cpu select after saving channel offer message. This makes
sure that all useful information of the channel has been setup.

MFC after:	1 week
Sponsored by:	Microsoft OSTC
Differential Revision:	https://reviews.freebsd.org/D5504
2016-03-02 01:40:47 +00:00
Sepherosa Ziehau
990aae810b hyperv/chan: Function renaming; no functional change
The renamed function create a sysctl tree for channel, and many
non-statistics nodes exists, so don't claim it only adds sysctl
nodes for statistics.

MFC after:	1 week
Sponsored by:	Microsoft OSTC
Differential Revision:	https://reviews.freebsd.org/D5503
2016-03-02 01:33:30 +00:00
Sepherosa Ziehau
87d697796a hyperv/chan: Add sysctl node to check whether monitor is allocated or not
MFC after:	1 week
Sponsored by:	Microsoft OSTC
Differential Revision:	https://reviews.freebsd.org/D5502
2016-03-02 01:26:05 +00:00
John Baldwin
cbc4d2db75 Remove taskqueue_enqueue_fast().
taskqueue_enqueue() was changed to support both fast and non-fast
taskqueues 10 years ago in r154167.  It has been a compat shim ever
since.  It's time for the compat shim to go.

Submitted by:	Howard Su <howard0su@gmail.com>
Reviewed by:	sephe
Differential Revision:	https://reviews.freebsd.org/D5131
2016-03-01 17:47:32 +00:00
Sepherosa Ziehau
b431335d25 hyperv/channel: Nuke useless stack variable
MFC after:	1 week
Sponsored by:	Microsoft OSTC
Differential Revision:	https://reviews.freebsd.org/D5499
2016-03-01 05:15:26 +00:00
Sepherosa Ziehau
4663ccd32e hyperv/hn: Set hash per-packet-info for each packet transmission
So that the host could dispatch the TX done back to this TX ring's
owner channel

MFC after:	1 week
Sponsored by:	Microsoft OSTC
Differential Revision:	https://reviews.freebsd.org/D5498
2016-03-01 04:59:18 +00:00
Sepherosa Ziehau
6512e69278 hyperv/channel: Add sysctl node for channel owner cpu
And add sysctl node for sub-channel's channel id.

MFC after:	1 week
Sponsored by:	Microsoft OSTC
Differential Revision:	https://reviews.freebsd.org/D5489
2016-02-29 09:14:55 +00:00
Sepherosa Ziehau
bd2f53618c hyperv/hn: Utilize mbuf flowid
MFC after:	1 week
Sponsored by:	Microsoft OSTC
Differential Revision:	https://reviews.freebsd.org/D5488
2016-02-29 09:05:33 +00:00
Sepherosa Ziehau
f4e778c14d hyperv/hn: Put LRO aggregation limit settings under FreeBSD version check
This simplifies MFC to 10-stable

MFC after:	1 week
Sponsored by:	Microsoft OSTC
Differential Revision:	https://reviews.freebsd.org/D5487
2016-02-29 08:53:53 +00:00
Sepherosa Ziehau
27a185fc4a hyperv/hn: Switch to if_transmit by default after r296178
MFC after:	1 week
Sponsored by:	Microsoft OSTC
Differential Revision:	https://reviews.freebsd.org/D5485
2016-02-29 08:45:07 +00:00
Sepherosa Ziehau
77394b18ab hyperv/channel: Add debug sysctl nodes for channel indices
It would serve as a debug tool, if the shared buffer ring's indices
stopped updating.

Submitted by:	HongJiang Zhang <honzhan microsoft com>
Reviewed by:	sephe, Jun Su <junsu microsoft com>
Modified by:	sephe
MFC after:	1 week
Sponsored by:	Microsoft OSTC
Differential Revision:	https://reviews.freebsd.org/D5402
2016-02-29 05:24:29 +00:00
Sepherosa Ziehau
5fa2e9dd39 hyperv: Use proper fence function to keep store-load order for msgs
sfence only makes sure about the store-store order, which is not
sufficient here.  Use atomic_thread_fence_seq_cst() as suggested
jhb and kib (a locked op in the nutshell, which should have the

Reviewed by:	jhb, kib, Jun Su <junsu microsoft com>
MFC after:	1 week
Sponsored by:	Microsoft OSTC
Differential Revision:	https://reviews.freebsd.org/D5436
2016-02-29 04:58:40 +00:00
Sepherosa Ziehau
5f609a0812 hyperv/hn: Make transmission path channel aware
Chimney sending buffer still needs conversion, which will be done
along with the upcoming vRSS support.

MFC after:	1 week
Sponsored by:	Microsoft OSTC
Differential Revision:	https://reviews.freebsd.org/D5457
2016-02-26 09:50:35 +00:00
Sepherosa Ziehau
c10a0b9e69 hyperv/hn: Remove the useless num_outstanding_sends
We rely on taskqueue draining now.

MFC after:	1 week
Sponsored by:	Microsoft OSTC
Differential Revision:	https://reviews.freebsd.org/D5456
2016-02-26 09:45:48 +00:00
Sepherosa Ziehau
3f919777b6 hyperv/hn: Associate TX/RX ring with channel
This fixes the TX/RX ring selection for TX/RX done.

MFC after:	1 week
Sponsored by:	Microsoft OSTC
Differential Revision:	https://reviews.freebsd.org/D5454
2016-02-26 09:41:00 +00:00
Sepherosa Ziehau
6619838c04 hyperv/hn: Pass channel to TX/RX done
This is preamble to associate the TX/RX rings to their channel.

While I'm here, revoke unused netvsc_recv_rollup.

MFC after:	1 week
Sponsored by:	Microsoft OSTC
Differential Revision:	https://reviews.freebsd.org/D5453
2016-02-26 09:35:45 +00:00
Sepherosa Ziehau
654bf1ec96 hyperv/hn: Pass channel as the channel callback argument
This is the preamble to pass channel back to hn(4) upon TX/RX done.

Reviewed by:	Hongjiang Zhang <honzhan microsoft com>
MFC after:	1 week
Sponsored by:	Microsoft OSTC
Differential Revision:	https://reviews.freebsd.org/D5452
2016-02-26 09:29:50 +00:00
Sepherosa Ziehau
b96a7ad189 hyperv: Always set device for channels
And unregister hv_device only for primary channels, who own the hv_device.

MFC after:	1 week
Sponsored by:	Microsoft OSTC
Differential Revision:	https://reviews.freebsd.org/D5451
2016-02-26 09:23:17 +00:00
Sepherosa Ziehau
a14df6ad64 hyperv: Remove useless channel inbound_lock
It serves no purpose.

Reviewed by:	Hongjiang Zhang <honzhan microsoft com>
MFC after:	1 week
Sponsored by:	Microsoft OSTC
Differential Revision:	https://reviews.freebsd.org/D5450
2016-02-26 09:17:31 +00:00
Sepherosa Ziehau
9021e53b29 hyperv: Use atomic_fetchadd_int to get GPADL id.
Reviewed by:	Hongjiang Zhang <honzhan microsoft com>
MFC after:	1 week
Sponsored by:	Microsoft OSTC
Differential Revision:	https://reviews.freebsd.org/D5439
2016-02-26 02:26:19 +00:00
Sepherosa Ziehau
f7d33dfef2 hyperv: Wait 5 seconds for hyperv result, instead of 500ms
This addresses various devices (network, stoarge) attach failure.

Reported by:	Hongxiong Xian <v-hoxian microsoft com>
Tested by:	Hongxiong Xian <v-hoxian microsoft com>
MFC after:	1 week
Sponsored by:	Microsoft OSTC
Differential Revision:	https://reviews.freebsd.org/D5435
2016-02-25 09:27:50 +00:00
Sepherosa Ziehau
fd458696b3 hyperv/hn: Hold the TX ring lock then drain TX desc buf_ring
Reported by:	Hongxiong Xian <v-hoxian microsoft com>
MFC after:	1 week
Sponsored by:	Microsoft OSTC
2016-02-25 07:03:10 +00:00
Sepherosa Ziehau
ea134429d0 hyperv/hn: Implement ifnet.if_transmit method
It will be turned on by default later.

MFC after:	1 week
Sponsored by:	Microsoft OSTC
Differential Revision:	https://reviews.freebsd.org/D5415
2016-02-25 03:21:25 +00:00
Sepherosa Ziehau
600d84765f hyperv/vmbus: Use free(9) for interrupt page; it is allocated by malloc(9)
MFC after:	1 week
Sponsored by:	Microsoft OSTC
Differential Revision:	https://reviews.freebsd.org/D5417
2016-02-24 08:54:50 +00:00
Sepherosa Ziehau
0bc2abddc8 hyperv/utils: Code rearrange and cleanup
Split heartbeat, shutdown and timesync out of utils code
and name them properly.

Submitted by:	Jun Su <junsu microsoft com>
Reviewed by:	adrian, sephe, Hongjiang Zhang <honzhan microsoft com>
MFC after:	1 week
Sponsored by:	Microsoft OSTC
Differential Revision:	https://reviews.freebsd.org/D5216
2016-02-24 05:01:18 +00:00
Sepherosa Ziehau
1b3ce7e39f hyperv/stor: Fix print format
Detected by:	PVS Static Analysis
MFC after:	1 week
Sponsored by:	Microsoft OSTC
Differential Revision:	https://reviews.freebsd.org/D5388
2016-02-23 09:29:45 +00:00
Sepherosa Ziehau
b87683cb78 hyperv/hn: Use IFQ_DRV_PREPEND instead of IF_PREPEND
IF_PREPEND promises out-of-order packet sending when the TX desc list
is depleted. It was overlooked and copied blindly when the transmission
path was partially rewritten.

MFC after:	1 week
Sponsored by:	Microsoft OSTC
Differential Revision:	https://reviews.freebsd.org/D5386
2016-02-23 09:25:20 +00:00
Sepherosa Ziehau
a0a3cf18ec hyperv/hn: Factor out hn_send_pkt() from hn_start_locked()
It will be shared w/ the upcoming ifnet.if_transmit method
implementation.

No functional change.

MFC after:	1 week
Sponsored by:	Microsoft OSTC
Differential Revision:	https://reviews.freebsd.org/D5385
2016-02-23 09:20:33 +00:00
Svatopluk Kraus
35a0bc1260 As <machine/vmparam.h> is included from <vm/vm_param.h>, there is no
need to include it explicitly when <vm/vm_param.h> is already included.

Suggested by:	alc
Reviewed by:	alc
Differential Revision:	https://reviews.freebsd.org/D5379
2016-02-22 09:08:04 +00:00
Sepherosa Ziehau
ed3960349b hyperv/hn: Add TX method for txeof processing.
Preamble to implement ifnet.if_transmit method.

Reviewed by:	adrian
Approved by:	adrian (mentor)
MFC after:	1 week
Sponsored by:	Microsoft OSTC
Differential Revision:	https://reviews.freebsd.org/D5346
2016-02-22 06:28:18 +00:00
Sepherosa Ziehau
5cb0a42c8c hyperv/hn: Staticize and rename packet TX done function
It is only used in hv_netvsc_drv_freebsd.c; and rename it to hn_tx_done()
mainly to reserve "xmit" for ifnet.if_transmit implement.

While I'm here, remove unapplied comment.

Reviewed by:	adrian
Approved by:	adrian (mentor)
MFC after:	1 week
Sponsored by:	Microsoft OSTC
Differential Revision:	https://reviews.freebsd.org/D5345
2016-02-22 06:22:47 +00:00
Sepherosa Ziehau
009a73c4ba hyperv/hn: Rename TX related function and struct fields a bit
Preamble to implement the ifnet.if_transmit method.

Reviewed by:	adrian
Approved by:	adrian (mentor)
MFC after:	1 week
Sponsored by:	Microsoft OSTC
Differential Revision:	https://reviews.freebsd.org/D5344
2016-02-22 06:17:26 +00:00
Sepherosa Ziehau
563d00ab5c hyperv/hn: Free the txdesc buf_ring when the TX ring is destroyed
Reviewed by:	adrian
Approved by:	adrian (mentor)
MFC after:	1 week
Sponsored by:	Microsoft OSTC
Differential Revision:	https://reviews.freebsd.org/D5318
2016-02-19 05:13:56 +00:00
Sepherosa Ziehau
ea2f3d1776 hyperv/hn: Enable IP header checksum offloading for WIN8 (WinServ2012)
Tested on Windows Server 2012.

Reviewed by:	adrian
Approved by:	adrian (mentor)
MFC after:	1 week
Sponsored by:	Microsoft OSTC
Differential Revision:	https://reviews.freebsd.org/D5317
2016-02-19 05:08:44 +00:00
Sepherosa Ziehau
5788ded618 hyperv/hn: Add option to bind TX taskqueues to the specified CPU
It will be used to help tracking host side transmission ring selection
issue; and it will be turned on by default, once we have concrete result.

Reviewed by:	adrian, Jun Su <junsu microsoft com>
Approved by:	adrian (mento)
MFC after:	1 week
Sponsored by:	Microsoft OSTC
Differential Revision:	https://reviews.freebsd.org/D5316
2016-02-19 05:03:17 +00:00
Sepherosa Ziehau
57fb9b3fd1 hyperv/hn: Use buf_ring for txdesc list
So one spinlock is avoided, which would be potentially dangerous for
virtual machine, if the spinlock holder was scheduled out by the host,
as noted by royger.

Old spinlock based txdesc list is still kept around, so we could have
a safe fallback.

No performance regression nor improvement is observed.

Reviewed by:	adrian
Approved by:	adrian (mentor)
MFC after:	1 week
Sponsored by:	Microsoft OSTC
Differential Revision:	https://reviews.freebsd.org/D5290
2016-02-18 07:44:14 +00:00
Sepherosa Ziehau
dbfb4eba54 hyperv/hn: Split TX ring data structure out of softc
This paves the way for upcoming vRSS stuffs and eases more code cleanup.

Reviewed by:	adrian
Approved by:	adrian (mentor)
MFC after:	1 week
Sponsored by:	Microsoft OSTC
Differential Revision:	https://reviews.freebsd.org/D5283
2016-02-18 07:37:59 +00:00
Sepherosa Ziehau
b8f2d59daf hyperv/hn: Use non-fast taskqueue for transmission
Performance stays same; so no need to use fast taskqueue here.

Suggested by:	royger
Reviewed by:	adrian
Approved by:	adrian (mentor)
MFC after:	1 week
Sponsored by:	Microsoft OSTC
Differential Revision:	https://reviews.freebsd.org/D5282
2016-02-18 07:28:45 +00:00
Sepherosa Ziehau
4d8e8cb113 hyperv/hn: Use taskqueue_enqueue()
This also eases experiment on the non-fast taskqueue.

Reviewed by:	adrian, Jun Su <junsu microsoft com>
Approved by:	adrian (mentor)
MFC after:	1 week
Sponsored by:	Microsoft OSTC
Differential Revision:	https://reviews.freebsd.org/D5276
2016-02-18 07:23:05 +00:00
Sepherosa Ziehau
17ab6c4f17 hyperv/hn: Split RX ring data structure out of softc
This paves the way for upcoming vRSS stuffs and eases more code cleanup.

Reviewed by:	adrian
Approved by:	adrian (mentor)
MFC after:	1 week
Sponsored by:	Microsoft OSTC
Differential Revision:	https://reviews.freebsd.org/D5275
2016-02-18 07:16:31 +00:00
Sepherosa Ziehau
f9c55d1c83 hyperv/hn: Change global tunable prefix to hw.hn
And use SYSCTL+CTLFLAG_RDTUN for them.

Suggested by:	adrian
Reviewed by:	adrian, Hongjiang Zhang <honzhan microsoft com>
Approved by:	adrian (mentor)
MFC after:	1 week
Sponsored by:	Microsoft OSTC
Differential Revision:	https://reviews.freebsd.org/D5274
2016-02-18 07:06:44 +00:00
Sepherosa Ziehau
01610d88da hyperv/hn: Always do transmission scheduling.
This one gives the best performance so far.

Reviewed by:	adrian
Approved by:	adrian (mentor)
MFC after:	1 week
Sponsored by:	Microsoft OSTC
Differential Revision:	https://reviews.freebsd.org/D5273
2016-02-18 07:00:47 +00:00
Sepherosa Ziehau
58f5a606fa hyperv/hn: Add option to allow sharing TX taskq between hn instances
It is off by default.  This eases further experimenting on this driver.

Reviewed by:	adrian
Approved by:	adrian (mentor)
MFC after:	1 week
Sponsored by:	Microsoft OSTC
Differential Revision:	https://reviews.freebsd.org/D5272
2016-02-18 06:55:05 +00:00
Sepherosa Ziehau
aa7f74113e hyperv/hn: Set the TCP ACK/data segment aggregation limit
Set TCP ACK append limit to 1, i.e. aggregate 2 ACKs at most.  Aggregating
anything more than 2 hurts TCP sending performance in hyperv.  This
significantly improves the TCP sending performance when the number of
concurrent connetion is low (2~8).  And it greatly stabilizes the TCP
sending performance in other cases.

Set TCP data segments aggregation length limit to 37500.  Without this
limitation, hn(4) could aggregate ~45 TCP data segments for each
connection (even at 64 or more connections) before dispatching them to
socket code; large aggregation slows down ACK sending and eventually
hurts/destabilizes TCP reception performance.  This setting stabilizes
and improves TCP reception performance for >4 concurrent connections
significantly.

Make them sysctls so they could be adjusted.

Reviewed by:	adrian, gallatin (previous version), hselasky (previous version)
Approved by:	adrian (mentor)
MFC after:	1 week
Sponsored by:	Microsoft OSTC
Differential Revision:	https://reviews.freebsd.org/D5185
2016-02-18 04:59:37 +00:00
Sepherosa Ziehau
ffc6b639c2 hyperv/hn: Fix typo in comment
Noticed by:	avos
Reviewed by:	adrian, avos, Hongjiang Zhang <honzhan microsoft com>
Approved by:	adrian
MFC after:	1 week
Sponsored by:	Microsoft OSTC
Differential Revision:	https://reviews.freebsd.org/D5199
2016-02-14 02:28:59 +00:00
Sepherosa Ziehau
3fd8cd9ce4 hyperv: Use malloc for page allocation.
We will eventually convert them to use busdma.

Submitted by:	Jun Su <junsu microsoft com>
Reviewed by:	adrian, sephe, Dexuan Cui <decui microsoft com>
Approved by:	adrian (mentor)
MFC after:	1 week
Sponsored by:	Microsoft OSTC
Differential Revision:	https://reviews.freebsd.org/D5087
2016-02-05 07:29:11 +00:00
Sepherosa Ziehau
27cc90ebb1 hyperv: Use WAITOK in the places where we can wait
And convert rndis non-hot path spinlock to mutex.

Submitted by:	Jun Su <junsu microsoft com>
Reviewed by:	adrian, sephe
Approved by:	adrian (mentor)
MFC after:	1 week
Sponsored by:	Microsoft OSTC
Differential Revision:	https://reviews.freebsd.org/D5081
2016-02-05 07:20:31 +00:00
Sepherosa Ziehau
f11ef33f0d hyperv: Use standard taskqueue instead of hv_work_queue
HyperV code was ported from Linux.  There is an implementation of
work queue called hv_work_queue.  In FreeBSD, taskqueue could be
used for the same purpose.  Convert all the consumer of hv_work_queue
to use taskqueue, and remove work queue implementation.

Submitted by:	Jun Su <junsu microsoft com>
Reviewed by:	adrian, Hongjiang Zhang <honzhan microsoft com>
Approved by:	adrian (mentor)
MFC after:	1 week
Sponsored by:	Microsoft OSTC
Differential Revision:	https://reviews.freebsd.org/D4963
2016-02-05 07:09:58 +00:00
Sepherosa Ziehau
a7f84cedee hyperv/hn: Add an option to always do transmission scheduling
It is off by default. This eases more experiment on hn(4).

Reviewed by:	adrian, Hongjiang Zhang <honzhan microsoft com>
Approved by:	adrian (mentor)
MFC after:	1 week
Sponsored by:	Microsoft OSTC
Differential Revision:	https://reviews.freebsd.org/D5175
2016-02-05 05:50:53 +00:00
Sepherosa Ziehau
62c328f0c4 hyperv/hn: Move LRO flush to the channel processing rollup
This significantly increases LRO aggregation ratio when there are
large amount of connections (improves reception performance a lot).

Reviewed by:	adrian
Approved by:	adrian (mentor)
MFC after:	1 week
Sponsored by:	Microsoft OSTC
Differential Revision:	https://reviews.freebsd.org/D5167
2016-02-05 05:44:31 +00:00
Sepherosa Ziehau
e35e485b04 hyperv/hn: Increase LRO entry count to 128 by default
hn(4) only has one RX ring currently, so default 8 LRO entries
are too small.

Reviewed by:	adrian
Approved by:	adrian (mentor)
MFC after:	1 week
Sponsored by:	Microsoft OSTC
Differential Revision:	https://reviews.freebsd.org/D5166
2016-02-05 05:38:01 +00:00
Sepherosa Ziehau
1e4bb37d22 hyperv/hn: Recover half of the chimney sending space
We lost half of the chimney sending space, because we mis-used
ffs() on a 64 bits mask, where ffsl() should be used.

While I'm here:
- Use system atomic operation instead.
- Stringent chimney sending index assertion.

Reviewed by:	adrian
Approved by:	adrian (mentor)
MFC after:	1 week
Sponsored by:	Microsoft OSTC
Differential Revision:	https://reviews.freebsd.org/D5159
2016-02-05 05:31:31 +00:00
Sepherosa Ziehau
b256e94549 hyperv/hn: Factor out hn_encap() from hn_start_locked()
It will be shared w/ upcoming ifnet.if_transmit implementaion.

No functional changes.

Reviewed by:	adrian
Approved by:	adrian (mentor)
MFC after:	1 week
Sponsored by:	Microsoft OSTC
Differential Revision:	https://reviews.freebsd.org/D5158
2016-02-05 05:25:11 +00:00
Sepherosa Ziehau
58d6fc930e hyperv/hn: Obey IFCAP_RXCSUM configure
Reviewed by:	adrian
Approved by:	adrian (mentor)
MFC after:	1 week
Sponsored by:	Microsoft OSTC
Differential Revision:	https://reviews.freebsd.org/D5104
2016-02-05 05:17:48 +00:00
Sepherosa Ziehau
b8109bd09e hyperv/hn: Add sysctls to trust host side UDP and IP csum verification
Reviewed by:	adrian, Hongjiang Zhang <honzhan microsoft com>
Approved by:	adrian (mentor)
MFC after:	1 week
Sponsored by:	Microsoft OSTC
Differential Revision:	https://reviews.freebsd.org/D5103
2016-02-05 05:12:30 +00:00
Sepherosa Ziehau
51ae346f9d hyperv/hn: Enable UDP RXCSUM
Reviewed by:	adrian
Approved by:	adrian (mentor)
MFC after:	1 week
Sponsored by:	Microsoft OSTC
Differential Revision:	https://reviews.freebsd.org/D5102
2016-02-05 05:06:14 +00:00
Sepherosa Ziehau
b0fde7e820 hyperv/hn: Enable IP header checksum offloading
So that:
- TCP/IP stack will not do unnecessary IP header checksum for TSO
  packets.
- Reduce guest load for non-TSO IP packets.

Reviewed by:	adrian
Approved by:	adrian (mentor)
MFC after:	1 week
Sponsored by:	Microsoft OSTC
Differential Revision:	https://reviews.freebsd.org/D5099
2016-02-05 05:01:02 +00:00
Sepherosa Ziehau
74506a55d6 hyperv/hn: Reorganize TX csum offloading
- For non-TSO offloading, we don't need to access mbuf to know
  which csum offloading is requested, we can just use the
  CSUM_{IP,TCP,UDP} in the csum_flags.
- For TSO offloading, we still can depend on CSUM_{TSO4,TSO6}
  in the csum_flags to tell whether the TSO packet is an IPv4
  TSO packet or an IPv6 TSO packet.

This streamlines csum offloading handling (remove the two goto)
and allows us the nuke the unnecessary get_transport_proto_type().

Reviewed by:	adrian, Hongjiang Zhang <honzhan microsoft com>
Approved by:	adrian (mentor)
MFC after:	1 week
Sponsored by:	Microsoft OSTC
Differential Revision:	https://reviews.freebsd.org/D5098
2016-02-05 04:10:04 +00:00
Sepherosa Ziehau
82db5a8905 hyperv/hn: Avoid duplicate csum features settings
- Record csum features in softc, so we don't need to duplicate the
  logic from attach path to ioctl path.
- Protect if_capenable and if_hwassist changes by main lock.
- Prefer turn on/off bits in if_hwassist explicitly instead of using
  XOR.

Reviewed by:	adrian, Hongjiang Zhang <honzhan microsoft com>
Approved by:	adrian (mentor)
MFC after:	1 week
Sponsored by:	Microsoft OSTC
Differential Revision:	https://reviews.freebsd.org/D5085
2016-02-05 04:03:50 +00:00
Sepherosa Ziehau
f70c7ffe5e hyperv/stor: Fix the NULL pointer dereference
Reported by:	Netapp
Submitted by:	Hongjiang Zhang <honzhan microsoft com>
Reviewed by:	adrian, sephe, Dexuan Cui <decui microsoft com>
Approved by:	adrian (mentor)
MFC after:	1 week
Sponsored by:	Microsoft OSTC
Differential Revision:	https://reviews.freebsd.org/D5097
2016-02-05 03:46:53 +00:00
Sepherosa Ziehau
0eff2e2ea1 hyperv/vmbus: Event handling code refactor.
- Use taskqueue instead of swi for event handling.
- Scan the interrupt flags in filter
- Disable ringbuffer interrupt mask in filter to ensure no unnecessary
  interrupts.

Submitted by:		Jun Su <junsu microsoft com>
Reviewed by:		adrian, sephe, Dexuan <decui microsoft com>
Approved by:		adrian (mentor)
MFC after:		2 weeks
Sponsored by:		Microsoft OSTC
Differential Revision:	https://reviews.freebsd.org/D4920
2016-01-27 03:53:30 +00:00
Sepherosa Ziehau
719d2f1ad5 hyperv/hn: Improve sending performance
- Avoid main lock contention by trylock for if_start, if that fails,
  schedule TX taskqueue for if_start
- Don't do direct sending if the packet to be sent is large, e.g.
  TSO packet.

This change gives me stable 9.1Gbps TCP sending performance w/ TSO
over a 10Gbe directly connected network (the performance fluctuated
between 4Gbps and 9Gbps before this commit). It also improves non-
TSO TCP sending performance a lot.

Reviewed by:		adrian, royger
Approved by:		adrian (mentor)
Sponsored by:		Microsoft OSTC
Differential Revision:	https://reviews.freebsd.org/D5074
2016-01-26 09:42:13 +00:00
Sepherosa Ziehau
51f6f18c88 hyperv/vmbus: Avoid extra copy of page information.
The page information array could contain up to 32 elements (i.e. 512B).
And on network side w/ TSO, 11+ (176B+) elements, i.e. ~44K TSO packet,
in the page information array is quite common.

This saves us some cpu cycles.

Reviewed by:		adrian, delphij
Approved by:		adrian (mentor)
Sponsored by:		Microsoft OSTC
Differential Revision:	https://reviews.freebsd.org/D4992
2016-01-25 05:33:18 +00:00
Sepherosa Ziehau
dc1418432b hyperv/hn: Trust host TCP segment checksum verification by default.
According to all available information, VMSWITCH always does the
TCP segment checksum verification before sending the segment to
guest.

Reviewed by:		adrian, delphij, Hongjiang Zhang <honzhan microsoft com>
Approved by:		adrian (mentor)
Sponsored by:		Microsoft OSTC
Differential Revision:	https://reviews.freebsd.org/D4991
2016-01-25 05:25:39 +00:00
Sepherosa Ziehau
7ea161b0ec hyperv/hn: Remove unnecessary zeroing out the netvsc_packet
All used fields are setup one by one, so there is no need to zero
out this large struct.

While I'm here, move the stack variable near its usage.

Reviewed by:		adrian, delphij, Jun Su <junsu microsoft com>
Approved by:		adrian (mentor)
Sponsored by:		Microsoft OSTC
Differential Revision:	https://reviews.freebsd.org/D4978
2016-01-25 05:18:57 +00:00
Sepherosa Ziehau
4d9e79a3be hyperv/hn: Use m_copydata for chimney sending.
While I'm here, move stack variables near their usage.

Reviewed by:		adrian, delphij, Jun Su <junsu microsoft com>
Approved by:		adrian (mentor)
Sponsored by:		Microsoft OSTC
Differential Revision:	https://reviews.freebsd.org/D4977
2016-01-25 05:12:00 +00:00
Sepherosa Ziehau
391ad73b70 hyperv/hn: Partly rework transmission path
- Avoid unnecessary malloc/free on transmission path.
- busdma(9)-fy transmission path.
- Properly handle IFF_DRV_OACTIVE.  This should fix the network
  stalls reported by many.
- Properly setup TSO parameters.
- Properly handle bpf(4) tapping.  This 5 times the performance
  during TCP sending test, when there is one bpf(4) attached.
- Allow size of chimney sending be tuned on a running system.
  Default value still needs more test to determine.

Reviewed by:		adrian, delphij
Approved by:		adrian (mentor)
Sponsored by:		Microsoft OSTC
Differential Revision:	https://reviews.freebsd.org/D4972
2016-01-25 05:01:32 +00:00
Sepherosa Ziehau
1b44593c51 hyperv/stor: Verify returned inquiry data before further dispatching
Windows 10 and Window 2016 will return all zero inquiry data for
non-existing slots.  If we dispatched them, then a lot of useless
(0 sized) disks would be created.  So we verify the returned inquiry
data (valid type, non-empty vendor/product/revision etc.), before
further dispatching.

Minor white space cleanup and wording fix.

Submitted by:		Hongjiang Zhang <honzhan microsoft com>
Reviewed by:		adrian, sephe, Jun Su <junsu microsoft com>
Approved by:		adrian (mentor)
Modified by:		sephe
Sponsored by:		Microsoft OSTC
Differential Revision:	https://reviews.freebsd.org/D4928
2016-01-22 09:06:40 +00:00
Sepherosa Ziehau
a601c86751 hyperv/vmbus: Lookup channel through id table
Vmbus event handler will need to find the channel by its relative
id, when software interrupt for event happens.  The original lookup
searches the channel list, which is not very efficient.  We now
create a table indexed by the channel relative id to speed up
the channel lookup.

Submitted by:		Hongjiang Zhang <honzhan microsoft com>
Reviewed by:		delphij, adrain, sephe, Dexuan Cui <decui microsoft com>
Approved by:		adrian (mentor)
Sponsored by:		Microsoft OSTC
Differential Revision:	https://reviews.freebsd.org/D4802
2016-01-22 07:29:31 +00:00
Hans Petter Selasky
e936121d31 Add optimizing LRO wrapper:
- Add optimizing LRO wrapper which pre-sorts all incoming packets
  according to the hash type and flowid. This prevents exhaustion of
  the LRO entries due to too many connections at the same time.
  Testing using a larger number of higher bandwidth TCP connections
  showed that the incoming ACK packet aggregation rate increased from
  ~1.3:1 to almost 3:1. Another test showed that for a number of TCP
  connections greater than 16 per hardware receive ring, where 8 TCP
  connections was the LRO active entry limit, there was a significant
  improvement in throughput due to being able to fully aggregate more
  than 8 TCP stream. For very few very high bandwidth TCP streams, the
  optimizing LRO wrapper will add CPU usage instead of reducing CPU
  usage. This is expected. Network drivers which want to use the
  optimizing LRO wrapper needs to call "tcp_lro_queue_mbuf()" instead
  of "tcp_lro_rx()" and "tcp_lro_flush_all()" instead of
  "tcp_lro_flush()". Further the LRO control structure must be
  initialized using "tcp_lro_init_args()" passing a non-zero number
  into the "lro_mbufs" argument.

- Make LRO statistics 64-bit. Previously 32-bit integers were used for
  statistics which can be prone to wrap-around. Fix this while at it
  and update all SYSCTL's which expose LRO statistics.

- Ensure all data is freed when destroying a LRO control structures,
  especially leftover LRO entries.

- Reduce number of memory allocations needed when setting up a LRO
  control structure by precomputing the total amount of memory needed.

- Add own memory allocation counter for LRO.

- Bump the FreeBSD version to force recompilation of all KLDs due to
  change of the LRO control structure size.

Sponsored by:	Mellanox Technologies
Reviewed by:	gallatin, sbruno, rrs, gnn, transport
Tested by:	Netflix
Differential Revision:	https://reviews.freebsd.org/D4914
2016-01-19 15:33:28 +00:00
Sepherosa Ziehau
dd7a7dd6af hyperv: set receive buffer size according to NVSP protocol version
If the NVSP protocol version is not greater than NVSP_PROTOCOL_VERSION_2,
then the recv buffer size is 15MB, otherwise the buffer size is 16MB.

Submitted by:		Hongjiang Zhang <honzhan microsoft com>
Reviewed by:		royger, Dexuan Cui <decui microsoft com>, adrian
Approved by:		adrian (mentor)
Sponsored by:		Microsoft OSTC
Differential Revision:	https://reviews.freebsd.org/D4814
2016-01-14 03:16:29 +00:00
Sepherosa Ziehau
1952924333 hyperv: add interrupt counters
Submitted by:		Howard Su <howard0su gmail com>
Reviewed by:		royger, Dexuan Cui <decui microsoft com>, adrian
Approved by:		adrian (mentor)
Sponsored by:		Microsoft OSTC
Differential Revision:	https://reviews.freebsd.org/D4693
2016-01-14 03:11:35 +00:00