Commit Graph

239 Commits

Author SHA1 Message Date
Sepherosa Ziehau
5cb0a42c8c hyperv/hn: Staticize and rename packet TX done function
It is only used in hv_netvsc_drv_freebsd.c; and rename it to hn_tx_done()
mainly to reserve "xmit" for ifnet.if_transmit implement.

While I'm here, remove unapplied comment.

Reviewed by:	adrian
Approved by:	adrian (mentor)
MFC after:	1 week
Sponsored by:	Microsoft OSTC
Differential Revision:	https://reviews.freebsd.org/D5345
2016-02-22 06:22:47 +00:00
Sepherosa Ziehau
009a73c4ba hyperv/hn: Rename TX related function and struct fields a bit
Preamble to implement the ifnet.if_transmit method.

Reviewed by:	adrian
Approved by:	adrian (mentor)
MFC after:	1 week
Sponsored by:	Microsoft OSTC
Differential Revision:	https://reviews.freebsd.org/D5344
2016-02-22 06:17:26 +00:00
Sepherosa Ziehau
563d00ab5c hyperv/hn: Free the txdesc buf_ring when the TX ring is destroyed
Reviewed by:	adrian
Approved by:	adrian (mentor)
MFC after:	1 week
Sponsored by:	Microsoft OSTC
Differential Revision:	https://reviews.freebsd.org/D5318
2016-02-19 05:13:56 +00:00
Sepherosa Ziehau
ea2f3d1776 hyperv/hn: Enable IP header checksum offloading for WIN8 (WinServ2012)
Tested on Windows Server 2012.

Reviewed by:	adrian
Approved by:	adrian (mentor)
MFC after:	1 week
Sponsored by:	Microsoft OSTC
Differential Revision:	https://reviews.freebsd.org/D5317
2016-02-19 05:08:44 +00:00
Sepherosa Ziehau
5788ded618 hyperv/hn: Add option to bind TX taskqueues to the specified CPU
It will be used to help tracking host side transmission ring selection
issue; and it will be turned on by default, once we have concrete result.

Reviewed by:	adrian, Jun Su <junsu microsoft com>
Approved by:	adrian (mento)
MFC after:	1 week
Sponsored by:	Microsoft OSTC
Differential Revision:	https://reviews.freebsd.org/D5316
2016-02-19 05:03:17 +00:00
Sepherosa Ziehau
57fb9b3fd1 hyperv/hn: Use buf_ring for txdesc list
So one spinlock is avoided, which would be potentially dangerous for
virtual machine, if the spinlock holder was scheduled out by the host,
as noted by royger.

Old spinlock based txdesc list is still kept around, so we could have
a safe fallback.

No performance regression nor improvement is observed.

Reviewed by:	adrian
Approved by:	adrian (mentor)
MFC after:	1 week
Sponsored by:	Microsoft OSTC
Differential Revision:	https://reviews.freebsd.org/D5290
2016-02-18 07:44:14 +00:00
Sepherosa Ziehau
dbfb4eba54 hyperv/hn: Split TX ring data structure out of softc
This paves the way for upcoming vRSS stuffs and eases more code cleanup.

Reviewed by:	adrian
Approved by:	adrian (mentor)
MFC after:	1 week
Sponsored by:	Microsoft OSTC
Differential Revision:	https://reviews.freebsd.org/D5283
2016-02-18 07:37:59 +00:00
Sepherosa Ziehau
b8f2d59daf hyperv/hn: Use non-fast taskqueue for transmission
Performance stays same; so no need to use fast taskqueue here.

Suggested by:	royger
Reviewed by:	adrian
Approved by:	adrian (mentor)
MFC after:	1 week
Sponsored by:	Microsoft OSTC
Differential Revision:	https://reviews.freebsd.org/D5282
2016-02-18 07:28:45 +00:00
Sepherosa Ziehau
4d8e8cb113 hyperv/hn: Use taskqueue_enqueue()
This also eases experiment on the non-fast taskqueue.

Reviewed by:	adrian, Jun Su <junsu microsoft com>
Approved by:	adrian (mentor)
MFC after:	1 week
Sponsored by:	Microsoft OSTC
Differential Revision:	https://reviews.freebsd.org/D5276
2016-02-18 07:23:05 +00:00
Sepherosa Ziehau
17ab6c4f17 hyperv/hn: Split RX ring data structure out of softc
This paves the way for upcoming vRSS stuffs and eases more code cleanup.

Reviewed by:	adrian
Approved by:	adrian (mentor)
MFC after:	1 week
Sponsored by:	Microsoft OSTC
Differential Revision:	https://reviews.freebsd.org/D5275
2016-02-18 07:16:31 +00:00
Sepherosa Ziehau
f9c55d1c83 hyperv/hn: Change global tunable prefix to hw.hn
And use SYSCTL+CTLFLAG_RDTUN for them.

Suggested by:	adrian
Reviewed by:	adrian, Hongjiang Zhang <honzhan microsoft com>
Approved by:	adrian (mentor)
MFC after:	1 week
Sponsored by:	Microsoft OSTC
Differential Revision:	https://reviews.freebsd.org/D5274
2016-02-18 07:06:44 +00:00
Sepherosa Ziehau
01610d88da hyperv/hn: Always do transmission scheduling.
This one gives the best performance so far.

Reviewed by:	adrian
Approved by:	adrian (mentor)
MFC after:	1 week
Sponsored by:	Microsoft OSTC
Differential Revision:	https://reviews.freebsd.org/D5273
2016-02-18 07:00:47 +00:00
Sepherosa Ziehau
58f5a606fa hyperv/hn: Add option to allow sharing TX taskq between hn instances
It is off by default.  This eases further experimenting on this driver.

Reviewed by:	adrian
Approved by:	adrian (mentor)
MFC after:	1 week
Sponsored by:	Microsoft OSTC
Differential Revision:	https://reviews.freebsd.org/D5272
2016-02-18 06:55:05 +00:00
Sepherosa Ziehau
aa7f74113e hyperv/hn: Set the TCP ACK/data segment aggregation limit
Set TCP ACK append limit to 1, i.e. aggregate 2 ACKs at most.  Aggregating
anything more than 2 hurts TCP sending performance in hyperv.  This
significantly improves the TCP sending performance when the number of
concurrent connetion is low (2~8).  And it greatly stabilizes the TCP
sending performance in other cases.

Set TCP data segments aggregation length limit to 37500.  Without this
limitation, hn(4) could aggregate ~45 TCP data segments for each
connection (even at 64 or more connections) before dispatching them to
socket code; large aggregation slows down ACK sending and eventually
hurts/destabilizes TCP reception performance.  This setting stabilizes
and improves TCP reception performance for >4 concurrent connections
significantly.

Make them sysctls so they could be adjusted.

Reviewed by:	adrian, gallatin (previous version), hselasky (previous version)
Approved by:	adrian (mentor)
MFC after:	1 week
Sponsored by:	Microsoft OSTC
Differential Revision:	https://reviews.freebsd.org/D5185
2016-02-18 04:59:37 +00:00
Sepherosa Ziehau
ffc6b639c2 hyperv/hn: Fix typo in comment
Noticed by:	avos
Reviewed by:	adrian, avos, Hongjiang Zhang <honzhan microsoft com>
Approved by:	adrian
MFC after:	1 week
Sponsored by:	Microsoft OSTC
Differential Revision:	https://reviews.freebsd.org/D5199
2016-02-14 02:28:59 +00:00
Sepherosa Ziehau
3fd8cd9ce4 hyperv: Use malloc for page allocation.
We will eventually convert them to use busdma.

Submitted by:	Jun Su <junsu microsoft com>
Reviewed by:	adrian, sephe, Dexuan Cui <decui microsoft com>
Approved by:	adrian (mentor)
MFC after:	1 week
Sponsored by:	Microsoft OSTC
Differential Revision:	https://reviews.freebsd.org/D5087
2016-02-05 07:29:11 +00:00
Sepherosa Ziehau
27cc90ebb1 hyperv: Use WAITOK in the places where we can wait
And convert rndis non-hot path spinlock to mutex.

Submitted by:	Jun Su <junsu microsoft com>
Reviewed by:	adrian, sephe
Approved by:	adrian (mentor)
MFC after:	1 week
Sponsored by:	Microsoft OSTC
Differential Revision:	https://reviews.freebsd.org/D5081
2016-02-05 07:20:31 +00:00
Sepherosa Ziehau
f11ef33f0d hyperv: Use standard taskqueue instead of hv_work_queue
HyperV code was ported from Linux.  There is an implementation of
work queue called hv_work_queue.  In FreeBSD, taskqueue could be
used for the same purpose.  Convert all the consumer of hv_work_queue
to use taskqueue, and remove work queue implementation.

Submitted by:	Jun Su <junsu microsoft com>
Reviewed by:	adrian, Hongjiang Zhang <honzhan microsoft com>
Approved by:	adrian (mentor)
MFC after:	1 week
Sponsored by:	Microsoft OSTC
Differential Revision:	https://reviews.freebsd.org/D4963
2016-02-05 07:09:58 +00:00
Sepherosa Ziehau
a7f84cedee hyperv/hn: Add an option to always do transmission scheduling
It is off by default. This eases more experiment on hn(4).

Reviewed by:	adrian, Hongjiang Zhang <honzhan microsoft com>
Approved by:	adrian (mentor)
MFC after:	1 week
Sponsored by:	Microsoft OSTC
Differential Revision:	https://reviews.freebsd.org/D5175
2016-02-05 05:50:53 +00:00
Sepherosa Ziehau
62c328f0c4 hyperv/hn: Move LRO flush to the channel processing rollup
This significantly increases LRO aggregation ratio when there are
large amount of connections (improves reception performance a lot).

Reviewed by:	adrian
Approved by:	adrian (mentor)
MFC after:	1 week
Sponsored by:	Microsoft OSTC
Differential Revision:	https://reviews.freebsd.org/D5167
2016-02-05 05:44:31 +00:00
Sepherosa Ziehau
e35e485b04 hyperv/hn: Increase LRO entry count to 128 by default
hn(4) only has one RX ring currently, so default 8 LRO entries
are too small.

Reviewed by:	adrian
Approved by:	adrian (mentor)
MFC after:	1 week
Sponsored by:	Microsoft OSTC
Differential Revision:	https://reviews.freebsd.org/D5166
2016-02-05 05:38:01 +00:00
Sepherosa Ziehau
1e4bb37d22 hyperv/hn: Recover half of the chimney sending space
We lost half of the chimney sending space, because we mis-used
ffs() on a 64 bits mask, where ffsl() should be used.

While I'm here:
- Use system atomic operation instead.
- Stringent chimney sending index assertion.

Reviewed by:	adrian
Approved by:	adrian (mentor)
MFC after:	1 week
Sponsored by:	Microsoft OSTC
Differential Revision:	https://reviews.freebsd.org/D5159
2016-02-05 05:31:31 +00:00
Sepherosa Ziehau
b256e94549 hyperv/hn: Factor out hn_encap() from hn_start_locked()
It will be shared w/ upcoming ifnet.if_transmit implementaion.

No functional changes.

Reviewed by:	adrian
Approved by:	adrian (mentor)
MFC after:	1 week
Sponsored by:	Microsoft OSTC
Differential Revision:	https://reviews.freebsd.org/D5158
2016-02-05 05:25:11 +00:00
Sepherosa Ziehau
58d6fc930e hyperv/hn: Obey IFCAP_RXCSUM configure
Reviewed by:	adrian
Approved by:	adrian (mentor)
MFC after:	1 week
Sponsored by:	Microsoft OSTC
Differential Revision:	https://reviews.freebsd.org/D5104
2016-02-05 05:17:48 +00:00
Sepherosa Ziehau
b8109bd09e hyperv/hn: Add sysctls to trust host side UDP and IP csum verification
Reviewed by:	adrian, Hongjiang Zhang <honzhan microsoft com>
Approved by:	adrian (mentor)
MFC after:	1 week
Sponsored by:	Microsoft OSTC
Differential Revision:	https://reviews.freebsd.org/D5103
2016-02-05 05:12:30 +00:00
Sepherosa Ziehau
51ae346f9d hyperv/hn: Enable UDP RXCSUM
Reviewed by:	adrian
Approved by:	adrian (mentor)
MFC after:	1 week
Sponsored by:	Microsoft OSTC
Differential Revision:	https://reviews.freebsd.org/D5102
2016-02-05 05:06:14 +00:00
Sepherosa Ziehau
b0fde7e820 hyperv/hn: Enable IP header checksum offloading
So that:
- TCP/IP stack will not do unnecessary IP header checksum for TSO
  packets.
- Reduce guest load for non-TSO IP packets.

Reviewed by:	adrian
Approved by:	adrian (mentor)
MFC after:	1 week
Sponsored by:	Microsoft OSTC
Differential Revision:	https://reviews.freebsd.org/D5099
2016-02-05 05:01:02 +00:00
Sepherosa Ziehau
74506a55d6 hyperv/hn: Reorganize TX csum offloading
- For non-TSO offloading, we don't need to access mbuf to know
  which csum offloading is requested, we can just use the
  CSUM_{IP,TCP,UDP} in the csum_flags.
- For TSO offloading, we still can depend on CSUM_{TSO4,TSO6}
  in the csum_flags to tell whether the TSO packet is an IPv4
  TSO packet or an IPv6 TSO packet.

This streamlines csum offloading handling (remove the two goto)
and allows us the nuke the unnecessary get_transport_proto_type().

Reviewed by:	adrian, Hongjiang Zhang <honzhan microsoft com>
Approved by:	adrian (mentor)
MFC after:	1 week
Sponsored by:	Microsoft OSTC
Differential Revision:	https://reviews.freebsd.org/D5098
2016-02-05 04:10:04 +00:00
Sepherosa Ziehau
82db5a8905 hyperv/hn: Avoid duplicate csum features settings
- Record csum features in softc, so we don't need to duplicate the
  logic from attach path to ioctl path.
- Protect if_capenable and if_hwassist changes by main lock.
- Prefer turn on/off bits in if_hwassist explicitly instead of using
  XOR.

Reviewed by:	adrian, Hongjiang Zhang <honzhan microsoft com>
Approved by:	adrian (mentor)
MFC after:	1 week
Sponsored by:	Microsoft OSTC
Differential Revision:	https://reviews.freebsd.org/D5085
2016-02-05 04:03:50 +00:00
Sepherosa Ziehau
f70c7ffe5e hyperv/stor: Fix the NULL pointer dereference
Reported by:	Netapp
Submitted by:	Hongjiang Zhang <honzhan microsoft com>
Reviewed by:	adrian, sephe, Dexuan Cui <decui microsoft com>
Approved by:	adrian (mentor)
MFC after:	1 week
Sponsored by:	Microsoft OSTC
Differential Revision:	https://reviews.freebsd.org/D5097
2016-02-05 03:46:53 +00:00
Sepherosa Ziehau
0eff2e2ea1 hyperv/vmbus: Event handling code refactor.
- Use taskqueue instead of swi for event handling.
- Scan the interrupt flags in filter
- Disable ringbuffer interrupt mask in filter to ensure no unnecessary
  interrupts.

Submitted by:		Jun Su <junsu microsoft com>
Reviewed by:		adrian, sephe, Dexuan <decui microsoft com>
Approved by:		adrian (mentor)
MFC after:		2 weeks
Sponsored by:		Microsoft OSTC
Differential Revision:	https://reviews.freebsd.org/D4920
2016-01-27 03:53:30 +00:00
Sepherosa Ziehau
719d2f1ad5 hyperv/hn: Improve sending performance
- Avoid main lock contention by trylock for if_start, if that fails,
  schedule TX taskqueue for if_start
- Don't do direct sending if the packet to be sent is large, e.g.
  TSO packet.

This change gives me stable 9.1Gbps TCP sending performance w/ TSO
over a 10Gbe directly connected network (the performance fluctuated
between 4Gbps and 9Gbps before this commit). It also improves non-
TSO TCP sending performance a lot.

Reviewed by:		adrian, royger
Approved by:		adrian (mentor)
Sponsored by:		Microsoft OSTC
Differential Revision:	https://reviews.freebsd.org/D5074
2016-01-26 09:42:13 +00:00
Sepherosa Ziehau
51f6f18c88 hyperv/vmbus: Avoid extra copy of page information.
The page information array could contain up to 32 elements (i.e. 512B).
And on network side w/ TSO, 11+ (176B+) elements, i.e. ~44K TSO packet,
in the page information array is quite common.

This saves us some cpu cycles.

Reviewed by:		adrian, delphij
Approved by:		adrian (mentor)
Sponsored by:		Microsoft OSTC
Differential Revision:	https://reviews.freebsd.org/D4992
2016-01-25 05:33:18 +00:00
Sepherosa Ziehau
dc1418432b hyperv/hn: Trust host TCP segment checksum verification by default.
According to all available information, VMSWITCH always does the
TCP segment checksum verification before sending the segment to
guest.

Reviewed by:		adrian, delphij, Hongjiang Zhang <honzhan microsoft com>
Approved by:		adrian (mentor)
Sponsored by:		Microsoft OSTC
Differential Revision:	https://reviews.freebsd.org/D4991
2016-01-25 05:25:39 +00:00
Sepherosa Ziehau
7ea161b0ec hyperv/hn: Remove unnecessary zeroing out the netvsc_packet
All used fields are setup one by one, so there is no need to zero
out this large struct.

While I'm here, move the stack variable near its usage.

Reviewed by:		adrian, delphij, Jun Su <junsu microsoft com>
Approved by:		adrian (mentor)
Sponsored by:		Microsoft OSTC
Differential Revision:	https://reviews.freebsd.org/D4978
2016-01-25 05:18:57 +00:00
Sepherosa Ziehau
4d9e79a3be hyperv/hn: Use m_copydata for chimney sending.
While I'm here, move stack variables near their usage.

Reviewed by:		adrian, delphij, Jun Su <junsu microsoft com>
Approved by:		adrian (mentor)
Sponsored by:		Microsoft OSTC
Differential Revision:	https://reviews.freebsd.org/D4977
2016-01-25 05:12:00 +00:00
Sepherosa Ziehau
391ad73b70 hyperv/hn: Partly rework transmission path
- Avoid unnecessary malloc/free on transmission path.
- busdma(9)-fy transmission path.
- Properly handle IFF_DRV_OACTIVE.  This should fix the network
  stalls reported by many.
- Properly setup TSO parameters.
- Properly handle bpf(4) tapping.  This 5 times the performance
  during TCP sending test, when there is one bpf(4) attached.
- Allow size of chimney sending be tuned on a running system.
  Default value still needs more test to determine.

Reviewed by:		adrian, delphij
Approved by:		adrian (mentor)
Sponsored by:		Microsoft OSTC
Differential Revision:	https://reviews.freebsd.org/D4972
2016-01-25 05:01:32 +00:00
Sepherosa Ziehau
1b44593c51 hyperv/stor: Verify returned inquiry data before further dispatching
Windows 10 and Window 2016 will return all zero inquiry data for
non-existing slots.  If we dispatched them, then a lot of useless
(0 sized) disks would be created.  So we verify the returned inquiry
data (valid type, non-empty vendor/product/revision etc.), before
further dispatching.

Minor white space cleanup and wording fix.

Submitted by:		Hongjiang Zhang <honzhan microsoft com>
Reviewed by:		adrian, sephe, Jun Su <junsu microsoft com>
Approved by:		adrian (mentor)
Modified by:		sephe
Sponsored by:		Microsoft OSTC
Differential Revision:	https://reviews.freebsd.org/D4928
2016-01-22 09:06:40 +00:00
Sepherosa Ziehau
a601c86751 hyperv/vmbus: Lookup channel through id table
Vmbus event handler will need to find the channel by its relative
id, when software interrupt for event happens.  The original lookup
searches the channel list, which is not very efficient.  We now
create a table indexed by the channel relative id to speed up
the channel lookup.

Submitted by:		Hongjiang Zhang <honzhan microsoft com>
Reviewed by:		delphij, adrain, sephe, Dexuan Cui <decui microsoft com>
Approved by:		adrian (mentor)
Sponsored by:		Microsoft OSTC
Differential Revision:	https://reviews.freebsd.org/D4802
2016-01-22 07:29:31 +00:00
Hans Petter Selasky
e936121d31 Add optimizing LRO wrapper:
- Add optimizing LRO wrapper which pre-sorts all incoming packets
  according to the hash type and flowid. This prevents exhaustion of
  the LRO entries due to too many connections at the same time.
  Testing using a larger number of higher bandwidth TCP connections
  showed that the incoming ACK packet aggregation rate increased from
  ~1.3:1 to almost 3:1. Another test showed that for a number of TCP
  connections greater than 16 per hardware receive ring, where 8 TCP
  connections was the LRO active entry limit, there was a significant
  improvement in throughput due to being able to fully aggregate more
  than 8 TCP stream. For very few very high bandwidth TCP streams, the
  optimizing LRO wrapper will add CPU usage instead of reducing CPU
  usage. This is expected. Network drivers which want to use the
  optimizing LRO wrapper needs to call "tcp_lro_queue_mbuf()" instead
  of "tcp_lro_rx()" and "tcp_lro_flush_all()" instead of
  "tcp_lro_flush()". Further the LRO control structure must be
  initialized using "tcp_lro_init_args()" passing a non-zero number
  into the "lro_mbufs" argument.

- Make LRO statistics 64-bit. Previously 32-bit integers were used for
  statistics which can be prone to wrap-around. Fix this while at it
  and update all SYSCTL's which expose LRO statistics.

- Ensure all data is freed when destroying a LRO control structures,
  especially leftover LRO entries.

- Reduce number of memory allocations needed when setting up a LRO
  control structure by precomputing the total amount of memory needed.

- Add own memory allocation counter for LRO.

- Bump the FreeBSD version to force recompilation of all KLDs due to
  change of the LRO control structure size.

Sponsored by:	Mellanox Technologies
Reviewed by:	gallatin, sbruno, rrs, gnn, transport
Tested by:	Netflix
Differential Revision:	https://reviews.freebsd.org/D4914
2016-01-19 15:33:28 +00:00
Sepherosa Ziehau
dd7a7dd6af hyperv: set receive buffer size according to NVSP protocol version
If the NVSP protocol version is not greater than NVSP_PROTOCOL_VERSION_2,
then the recv buffer size is 15MB, otherwise the buffer size is 16MB.

Submitted by:		Hongjiang Zhang <honzhan microsoft com>
Reviewed by:		royger, Dexuan Cui <decui microsoft com>, adrian
Approved by:		adrian (mentor)
Sponsored by:		Microsoft OSTC
Differential Revision:	https://reviews.freebsd.org/D4814
2016-01-14 03:16:29 +00:00
Sepherosa Ziehau
1952924333 hyperv: add interrupt counters
Submitted by:		Howard Su <howard0su gmail com>
Reviewed by:		royger, Dexuan Cui <decui microsoft com>, adrian
Approved by:		adrian (mentor)
Sponsored by:		Microsoft OSTC
Differential Revision:	https://reviews.freebsd.org/D4693
2016-01-14 03:11:35 +00:00
Sepherosa Ziehau
99781cb353 hyperv: implement an event timer
Submitted by:		Howard Su <howard0su@gmail.com>
Reviewed by:		delphij, royger, adrian
Approved by:		adrian (mentor)
Sponsored by:		Microsoft OSTC
Differential Revision:	https://reviews.freebsd.org/D4676
2016-01-14 03:05:10 +00:00
Sepherosa Ziehau
358d08b83b hyperv: remove unused vmbus definitions
We don't need them at all.

Submitted by:		Dexuan Cui <decui microsoft com>
Sponsored by:		Microsoft OSTC
Reviewed by:		royger, adrian, delphij
Approved by:		adrian (mentor)
Differential Revision:	https://reviews.freebsd.org/D4595
2016-01-14 02:55:28 +00:00
Sepherosa Ziehau
69a53a7a3a hyperv: use x86 generic code to do the hypervisor detection
This is first step to move the generic part of HV code into kernel instead
of module, so that it is possible to use hypercall to implement some other
paravirtualization code in the kernel.

Submitted by:		Howard Su <howard0su@gmail.com>
Reviewed by:		royger, delphij, adrian
Approved by:		adrian (mentor)
Sponsored by:		Microsoft OSTC
Differential Revision:	https://reviews.freebsd.org/D3072
2016-01-14 02:50:13 +00:00
Sepherosa Ziehau
5de888779e hyperv/hn: Unbreak LINT-NOIP
Reported by:	bz
Approved by:	adrain (mentor)
Sponsored by:	Microsoft OSTC
2016-01-14 02:32:50 +00:00
Sepherosa Ziehau
4f8f2d4274 hyperv/hn: Removed unused netvsc_init()
Submitted by:		Dexuan Cui <decui microsoft com>
Reviewed by:		me, adrian, royger,
			Hongjiang Zhang <honzhan microsoft com>
Approved by:		adrian (mentor)
Sponsored by:		Microsoft OSTC
Differential Revision:	https://reviews.freebsd.org/D4594
2016-01-12 01:55:57 +00:00
Sepherosa Ziehau
c48d20d7c7 hyperv/hn: Avoid mbuf cluster allocation, if the packet is small.
This one mainly avoids mbuf cluster allocation for TCP ACKs during
TCP sending tests.  And it gives me ~200Mbps improvement (4.7Gbps
-> 4.9Gbps), when running iperf3 TCP sending test w/ 16 connections.

While I'm here, nuke the unnecessary zeroing out pkthdr.csum_flags.

Reviewed by:		adrain
Approved by:		adrian (mentor)
Sponsored by:		Microsoft OSTC
Differential Revision:	https://reviews.freebsd.org/D4853
2016-01-12 01:50:56 +00:00
Sepherosa Ziehau
da949700f2 hyperv/hn: Implement SIOC[SG]IFMEDIA support
Many applications and kernel modules (e.g. bridge) rely on the ifmedia
status report; give them what they want.

Submitted by:		Dexuan Cui <decui microsoft com>
Reviewed by:		Jun Su <junsu microsoftc com>, me, adrian
Modified by:		me (minor)
Original differential:	https://reviews.freebsd.org/D4611
Differential Revision:	https://reviews.freebsd.org/D4852
Approved by:		adrian (mentor)
Sponsored by:		Microsoft OSTC
2016-01-12 01:41:34 +00:00
Sepherosa Ziehau
39863fbd98 hyperv/hn: Implement LRO
- Implement the LRO using tcp_lro APIs, and LRO is enabled by default.
- Add several stats sysctl nodes.
- Check IP/TCP length before sending the packet to tcp_lro_rx(), if host
  does not provide RX csum information (*); and add an option through
  sysctl to always trust host TCP segment csum checks (default is off).
- Add sysctl to control the LRO entry depth; it is disabled by default.
  It is used to avoid holding too much TCP segments in driver.  Limiting
  the LRO entry depth helps a lot in a one/two streams RX test.

This one 3x the RX performance on my local test (3Gbps -> 10Gbps), and
~2x the RX performance over a directly connected 40Ge network (5Gbps ->
9Gbps).

(*) It seems the host stops supplying csum information, once the network
load is high.  This still needs investigation...

Reviewed by:		Hongjiang Zhang <honzhan microsoft com>,
			Dexuan Cui <decui microsoft com>,
			Jun Su <junsu microsoft com>,
			delphij
Tested by:		me (local),
			Hongjiang Zhang <honzhan microsoft com>
			(directly connected 40Ge)
Approved by:		delphij (mentor), adrian (mentor, no objection)
With feedback from:	delphij, Hongjiang Zhang <honzhan microsoft com>
Sponsored by:		Microsoft OSTC
Differential Revision:	https://reviews.freebsd.org/D4824
2016-01-12 01:30:51 +00:00
Xin LI
fcf8d36c46 hyperv: vmbus: run non-blocking message handlers in vmbus_msg_swintr()
We'll remove the per-channel control_work_queue because it can't properly
do serialization of message handling, e.g., when there are 2 NIC devices,
vmbus_channel_on_offer() -> hv_queue_work_item() has a race condition:
for an SMP VM, vmbus_channel_process_offer() can run concurrently on
different CPUs and if the second NIC's
vmbus_channel_process_offer() -> hv_vmbus_child_device_register() runs
first, the second NIC's name will be hn0 and the first NIC's name will
be hn1!

We can fix the race condition by removing the per-channel control_work_queue
and run all the message handlers in the global
hv_vmbus_g_connection.work_queue -- we'll do this in the next patch.

With the coming next patch, we have to run the non-blocking handlers
directly in the kernel thread vmbus_msg_swintr(), because the special
handling of sub-channel: when a sub-channel (e.g., of the storvsc driver)
is received and being handled in vmbus_channel_on_offer() running on the
global hv_vmbus_g_connection.work_queue, vmbus_channel_process_offer()
invokes channel->sc_creation_callback, i.e., storvsc_handle_sc_creation,
and the callback will invoke hv_vmbus_channel_open() -> hv_vmbus_post_message
and expect a further reply from the host, but the handling of the further
messag can't be done because the current message's handling hasn't finished
yet; as result, hv_vmbus_channel_open() -> sema_timedwait() will time out
and th device can't work.

Also renamed the handler type from hv_pfn_channel_msg_handler to
vmbus_msg_handler: the 'pfn' and 'channel' in the old name make no sense.

Submitted by:	Dexuan Cui <decui microsoft com>
Reviewed by:	royger
MFC after:	2 weeks
Differential Revision:	https://reviews.freebsd.org/D4596
2015-12-29 08:19:43 +00:00
Xin LI
47f175b846 hyperv: vmbus: remove the per-channel control_work_queue
Now vmbus_channel_on_offer() -> vmbus_channel_process_offer() can
safely run on the global hv_vmbus_g_connection.work_queue now.

We remove the per-channel control_work_queue to achieve the proper
serialization of the message handling.

I removed the bogus TODO in vmbus_channel_on_offer(): a vmbus offer
can only come from the parent partition, i.e., the host.

PR:		kern/205156
Submitted by:	Dexuan Cui <decui microsoft com>
Reviewed by:	Howard Su <howard0su gmail com>, delphij
MFC after:	2 weeks
Differential Revision:	https://reviews.freebsd.org/D4597
2015-12-29 07:54:55 +00:00
Enji Cooper
05ff465524 Remove redundant vmbus_select_outgoing_channel declaration already handled
in include/hyperv.h

This unbreaks the gcc 4.2.1 kernel build of hyperv

Differential Revision: https://reviews.freebsd.org/D4684
MFC after: 3 days
Reviewed by: royger
Sponsored by: EMC / Isilon Storage Division
2015-12-23 17:37:30 +00:00
Roger Pau Monné
f3b0813aee hyperv/kvp: wake up the daemon if it's sleeping due to poll()
Without the patch, there is a race condition: when poll() is invoked(),
if kvp_globals.daemon_busy is false, the daemon won't be timely
woke up, because hv_kvp_send_msg_to_daemon() can't wake up the daemon
in this case.

Submitted by:           Dexuan Cui <decui@microsoft.com>
Sponsored by:		Microsoft OSTC
Reviewed by:		delphij, royger
MFC after:		3 days
Differential Revision:	https://reviews.freebsd.org/D4258
2015-12-15 11:20:20 +00:00
Wei Hu
b17df20c78 Ignore the inbound checksum flags when doing packet forwarding in netvsc driver.
PR: 20363
Submitted by: whu
Reviewed by: royger, whu
Approved by: royger
MFC after: 1 week
Relnotes: No
Sponsored by: Microsoft OSTC
Differential Revision:  https://reviews.freebsd.org/D4131
2015-11-22 05:26:13 +00:00
Wei Hu
5f302628d0 Do not enable UDP checksum offloading when running on the Hyper-V on
Windows Server 2012 and earlier hosts.

Submitted by: whu
Reviewed by: royger
Approved by: royger
MFC after: 3 days
Relnotes: No
Sponsored by: Microsoft OSTC
Differential Revision:  https://reviews.freebsd.org/D3086
2015-07-22 05:05:01 +00:00
Bjoern A. Zeeb
de3587025a Fix compilation without INET6 and without INET and INET6 after
offload support was introduced in r284746.

While here also fix the ioctl() handler for IPv4 added in r279819,
which was never compiled in given opt_inet.h was not included.
2015-06-27 12:37:09 +00:00
Wei Hu
5efed58bdd TSO and checksum offloading support for Netvsc driver on Hyper-V.
Submitted by:	whu
Reviewed by:	royger
Approved by:	royger
MFC after:	1 week
Relnotes:	yes
Sponsored by:	Microsoft OSTC
Differential Revision:	https://reviews.freebsd.org/D2517
2015-06-24 06:01:29 +00:00
Jung-uk Kim
fd90e2ed54 CALLOUT_MPSAFE has lost its meaning since r141428, i.e., for more than ten
years for head.  However, it is continuously misused as the mpsafe argument
for callout_init(9).  Deprecate the flag and clean up callout_init() calls
to make them more consistent.

Differential Revision:	https://reviews.freebsd.org/D2613
Reviewed by:	jhb
MFC after:	2 weeks
2015-05-22 17:05:21 +00:00
Wei Hu
17b8760445 Add support for SCSI disk hot add and remove. Also add padding according to
the requirement of different hypervisor releases.

Submitted by:	whu
Reviewed by:	royger
Approved by:	royger
MFC after:	1 week
Sponsored by:	Microsoft OSTC
Differential Revision:	https://reviews.freebsd.org/D2512
2015-05-18 10:31:23 +00:00
Wei Hu
da2f98a1cf Microsoft vmbus, storage and other related driver enhancements for HyperV.
- Vmbus multi channel support.
    - Vector interrupt support.
    - Signal optimization.
    - Storvsc driver performance improvement.
    - Scatter and gather support for storvsc driver.
    - Minor bug fix for KVP driver.
Thanks royger, jhb and delphij from FreeBSD community for the reviews
and comments. Also thanks Hovy Xu from NetApp for the contributions to
the storvsc driver.

PR:     195238
Submitted by:   whu
Reviewed by:    royger, jhb, delphij
Approved by:    royger
MFC after:      2 weeks
Relnotes:       yes
Sponsored by:   Microsoft OSTC
2015-04-29 10:12:34 +00:00
Xin LI
394e3bd088 Fix CARP when in use in a HyperV environment:
- Bump link state when stopping or starting the interface;
 - Don't handle SIOCGIFADDR specially, similar to r277103.

This change is based on a previous revision from Andy Zhang
(Microsoft) who did the diagnostic work and many thanks to
them for their help in supporting the HyperV work.

PR:		kern/187203
MFC after:	2 weeks
2015-03-09 20:11:16 +00:00
Steven Hartland
85c9dd9d89 Prevent overflow issues in timeout processing
Previously, any timeout value for which (timeout * hz) will overflow the
signed integer, will give weird results, since callout(9) routines will
convert negative values of ticks to '1'. For unsigned integer overflow we
will get sufficiently smaller timeout values than expected.

Switch from callout_reset, which requires conversion to int based ticks
to callout_reset_sbt to avoid this.

Also correct isci to correctly resolve ccb timeout.

This was based on the original work done by Eygene Ryabinkin
<rea@freebsd.org> back in 5 Aug 2011 which used a macro to help avoid
the overlow.

Differential Revision:	https://reviews.freebsd.org/D1157
Reviewed by:	mav, davide
MFC after:	1 month
Sponsored by:	Multiplay
2014-11-21 21:01:24 +00:00
Gleb Smirnoff
833e8dc5ab Remove struct arpcom. It is unused by most interface types, that allocate
it, except Ethernet, where it carried ng_ether(4) pointer.
For now carry the pointer in if_l2com directly.

Sponsored by:	Netflix
Sponsored by:	Nginx, Inc.
2014-11-07 15:14:10 +00:00
Xin LI
20e27addbf Return BUS_PROBE_DEFAULT instead of BUS_PROBE_VENDOR or 0 for in-tree
driver.  This change was verified by Microsoft.
2014-10-24 06:27:45 +00:00
Glen Barber
c379cfaf3c Fix an issue where a FreeBSD virtual machine provisioned in
the Microsoft Azure service does not recognize the second
attached disk on the system.

Submitted by:	kyliel@Microsoft
Patched by:	weh@Microsoft
PR:		194376
MFC after:	3 days
X-MFC-10.1:	yes, ASAP
Sponsored by:	The FreeBSD Foundation
2014-10-21 19:36:20 +00:00
Gleb Smirnoff
c8dfaf382f Mechanically convert to if_inc_counter(). 2014-09-19 03:51:26 +00:00
Xin LI
e72055b7fe Import HyperV Key-Value Pair (KVP) driver and daemon code by Microsoft,
many thanks for their continued support of FreeBSD.

While I'm there, also implement a new build knob, WITHOUT_HYPERV to
disable building and installing of the HyperV utilities when necessary.

The HyperV utilities are only built for i386 and amd64 targets.

This is a stable/10 candidate for inclusion with 10.1-RELEASE.

Submitted by:	Wei Hu <weh microsoft com>
MFC after:	1 week
2014-09-13 02:15:31 +00:00
Gleb Smirnoff
1bffa9511f Use define from if_var.h to access a field inside struct if_data,
that resides in struct ifnet.

Sponsored by:	Nginx, Inc.
2014-08-30 19:55:54 +00:00
Warner Losh
3e5a6bd16d Make some unwise casts. On i386 these casts wind up being safe. Rather
than disturb the API, go with these casts to shut gcc up.
2014-04-05 22:42:00 +00:00
Xin LI
17eadb912e Hide a few messages under bootverbose.
Reviewed by:	Abhishek Gupta
MFC after:	2 weeks
2014-03-14 00:47:46 +00:00
Alexander Motin
c70d3b7abb Minor fix to r262789.
MFC after:	6 days
2014-03-06 12:37:25 +00:00
Alexander Motin
54332e0629 Remove custom bus scanner code and fix use of CAM's default scanner.
This fixes kernel panic during boot, caused by incompatibility of recent
CAM locking changes and this bus scanner code.

Submitted by:	Microsoft
MFC after:	1 week
2014-03-05 16:42:33 +00:00
Gleb Smirnoff
b8c83a1957 Another round of removing historical mbuf(9) allocator flags.
They are breeding! New ones arouse since last round.

Sponsored by:	Nginx, Inc.
2014-01-16 13:44:47 +00:00
Pawel Jakub Dawidek
ca5b58f5b1 Fix missing new line after:
Netvsc initializing...

during boot.
2013-12-10 17:16:13 +00:00
Xin LI
49796c6f56 Don't reference pointer before testing whether it is
NULL.

Submitted by:	Clement Lecigne <clecigne google com>
Reviewed by:	grehan
MFC after:	3 days
2013-10-29 22:42:30 +00:00
Nathan Whitehorn
feeec74df7 More BUS_PROBE_NOWILDCARD sweeping. Some devices here (if_ath_ahb and siba)
resist easy conversion since they implement a great deal of their attach
logic inside probe(). Some of this could be fixed by moving it to attach(),
but some requires something more subtle than BUS_PROBE_NOWILDCARD.
2013-10-29 14:19:42 +00:00
Gleb Smirnoff
c3322cb91c Include necessary headers that now are available due to pollution
via if_var.h.

Sponsored by:	Netflix
Sponsored by:	Nginx, Inc.
2013-10-28 07:29:16 +00:00
Justin T. Gibbs
352830e2fa Centralize the detection logic for the Hyper-V hypervisor.
Submitted by:	Roger Pau Monné
Sponsored by:	Citrix Systems R&D
Reviewed by:	gibbs, grehan
Approved by:	re (gjb)

sys/sys/systm.h:
 * Add a new VM_GUEST type, VM_GUEST_HV (HyperV guest).

sys/dev/hyperv/vmbus/hv_vmbus_drv_freebsd.c:
sys/dev/hyperv/vmbus/hv_hv.c:
sys/dev/hyperv/stordisengage/hv_ata_pci_disengage.c:
 * Set vm_guest to VM_GUEST_HV and use that on other HyperV related
   devices instead of cloning the cpuid hypervisor check.
 * Cleanup the vmbus_identify function.
2013-10-13 02:41:30 +00:00
Peter Grehan
c831112179 Fix a lock-order reversal in the net driver by dropping the lock
and holding a reference prior to calling further into the hyperv
stack.

Added missing FreeBSD idents.

Submitted by:	Microsoft hyperv dev team
Approved by:	re@ (gjb)
2013-10-12 00:32:34 +00:00
Peter Grehan
ddc4c1e797 Fix vmbus channel memory leak where incorrect length parameter was
being passed to contigfree().

Submitted by:	Microsoft hyperv dev team
Approved by:	re@ (glebius)
2013-10-11 21:30:27 +00:00
Peter Grehan
99a5e5a7b4 Allow the legacy CDROM device to be accessed in a FreeBSD guest, while
still using enlightened drivers for other block devices.

Submitted by:	Microsoft hyperv dev team, mav@
Approved by:	re@
2013-10-10 22:46:49 +00:00
Dimitry Andric
785e09b08f In sys/dev/hyperv, fix a number of gcc warnings about usage of anonymous
union members in strict C99, by giving them names.  While here, add some
FreeBSD keywords where they were missing.

Approved by:	re (gjb)
Reviewed by:	grehan
2013-10-10 16:25:53 +00:00
Justin T. Gibbs
bf57e9793a Correct panic caused by attaching both Xen PV and HyperV virtualization
aware drivers on Xen hypervisors that advertise support for some
HyperV features.

x86/xen/hvm.c:
	When running in HVM mode on a Xen hypervisor, set vm_guest
	to VM_GUEST_XEN so other virtualization aware components in
	the FreeBSD kernel can detect this mode is active.

dev/hyperv/vmbus/hv_hv.c:
	Use vm_guest to ignore Xen's HyperV emulation when Xen is
	detected and Xen PV drivers are active.

Reported by:	Shanker Balan
Submitted by:	Roger Pau Monné
Sponsored by:	Citrix Systems R&D
Reviewed by:	gibbs
Approved by:	re (Xen blanket)
2013-10-05 19:51:09 +00:00
Peter Grehan
4a67483f2e Reorder the hypervisor presence test to avoid claiming ATA disks
on non hyperv systems.

Reviewed by:	neel, abgupta at microsoft dot com
Approved by:	re@ (hrs)
2013-09-19 02:34:52 +00:00
Peter Grehan
2ee2dc6fd6 Revert the kvp code - there's still some work that
needs to be done for that.

Discussed with:	Microsoft hyper-v devs
2013-09-09 19:27:44 +00:00
Peter Grehan
d940bfec8c Latest update from Microsoft.
Obtained from:	Microsoft Hyper-v dev team
2013-09-09 08:07:46 +00:00
Peter Grehan
672ed870a7 IFC @ r253862
- change the SI_SUB_RUN_SCHEDULER sysinits in hv_utilc and
hv_netvsc_drv_freebsd.c to SI_SUB_KTHREAD_IDLE, since the
former is no longer in FreeBSD.
  The use of these SYSINITs can probably be removed.
2013-08-01 22:09:57 +00:00
Peter Grehan
cc759c1995 Microsoft have changed their policy on how the hyper-v code will
be pulled into FreeBSD. From now, FreeBSD will be considered the
upstream repo.

First step: move the drivers away from the contrib area and into
the base system.

A follow-on commit will include the drivers in the amd64 GENERIC kernel.
2013-07-17 06:30:23 +00:00