Commit Graph

360 Commits

Author SHA1 Message Date
whu
42011dc102 hyperv/vmbus: Fix the wrong size in ndis_offload structure
Submitted by:	whu
MFC after:	2 weeks
Sponsored by:	Microsoft
2019-07-09 08:21:14 +00:00
whu
b16c552f6e hyperv/vmbus: Update VMBus version 4.0 and 5.0 support.
Add VMBus protocol version 4.0. and 5.0 to support Windows 10 and newer HyperV hosts.

For VMBus 4.0 and newer HyperV, the netvsc gpadl teardown must be done after vmbus close.

Submitted by:	whu
MFC after:	2 weeks
Sponsored by:	Microsoft
2019-07-09 07:24:18 +00:00
whu
9de81ae09f Do not trop UDP traffic when TXCSUM_IPV6 flag is on
PR:		231797
Submitted by:	whu
Reviewed by:	dexuan
Obtained from:	Kevin Morse
MFC after:	3 days
Sponsored by:	Microsoft
Differential Revision:	https://bugs.freebsd.org/bugzilla/attachment.cgi?id=198333&action=diff
2018-10-22 11:23:51 +00:00
dexuan
754fc2f721 hyperv/hn: Fix panic in hypervisor code upon device detach event
Submitted by:	hselasky
Reviewed by:	dexuan
MFC after:	3 days
Differential Revision:	https://reviews.freebsd.org/D16139
2018-07-17 21:05:08 +00:00
vangyzen
fbaec7d1e3 if_hn: fix use of uninitialized variable
omcast was used without being initialized in the non-multicast case.
The only effect was that the interface's multicast output counter could be
incorrect.

Reported by:	Coverity
CID:		1379662
MFC after:	3 days
Sponsored by:	Dell EMC
2018-05-26 14:14:56 +00:00
mmacy
7aeac9ef18 ifnet: Replace if_addr_lock rwlock with epoch + mutex
Run on LLNW canaries and tested by pho@

gallatin:
Using a 14-core, 28-HTT single socket E5-2697 v3 with a 40GbE MLX5
based ConnectX 4-LX NIC, I see an almost 12% improvement in received
packet rate, and a larger improvement in bytes delivered all the way
to userspace.

When the host receiving 64 streams of netperf -H $DUT -t UDP_STREAM -- -m 1,
I see, using nstat -I mce0 1 before the patch:

InMpps OMpps  InGbs  OGbs err TCP Est %CPU syscalls csw     irq GBfree
4.98   0.00   4.42   0.00 4235592     33   83.80 4720653 2149771   1235 247.32
4.73   0.00   4.20   0.00 4025260     33   82.99 4724900 2139833   1204 247.32
4.72   0.00   4.20   0.00 4035252     33   82.14 4719162 2132023   1264 247.32
4.71   0.00   4.21   0.00 4073206     33   83.68 4744973 2123317   1347 247.32
4.72   0.00   4.21   0.00 4061118     33   80.82 4713615 2188091   1490 247.32
4.72   0.00   4.21   0.00 4051675     33   85.29 4727399 2109011   1205 247.32
4.73   0.00   4.21   0.00 4039056     33   84.65 4724735 2102603   1053 247.32

After the patch

InMpps OMpps  InGbs  OGbs err TCP Est %CPU syscalls csw     irq GBfree
5.43   0.00   4.20   0.00 3313143     33   84.96 5434214 1900162   2656 245.51
5.43   0.00   4.20   0.00 3308527     33   85.24 5439695 1809382   2521 245.51
5.42   0.00   4.19   0.00 3316778     33   87.54 5416028 1805835   2256 245.51
5.42   0.00   4.19   0.00 3317673     33   90.44 5426044 1763056   2332 245.51
5.42   0.00   4.19   0.00 3314839     33   88.11 5435732 1792218   2499 245.52
5.44   0.00   4.19   0.00 3293228     33   91.84 5426301 1668597   2121 245.52

Similarly, netperf reports 230Mb/s before the patch, and 270Mb/s after the patch

Reviewed by:	gallatin
Sponsored by:	Limelight Networks
Differential Revision:	https://reviews.freebsd.org/D15366
2018-05-18 20:13:34 +00:00
sephe
74acc5b16c hyperv/hn: Enable transparent VF by default.
MFC after:	3 days
Sponsored by:	Microsoft
2017-10-11 05:28:51 +00:00
sephe
6366e7bab0 hyperv/hn: Workaround erroneous hash type observed on WS2016 for VF.
The background was described in r324489.

MFC after:	3 days
Sponsored by:	Microsoft
2017-10-11 05:15:49 +00:00
sephe
be6e031010 hyperv/hn: Workaround erroneous hash type observed on WS2016.
Background:
- UDP 4-tuple hash type is unconditionally enabled in Hyper-V on WS2016,
  which is _not_ affected by NDIS_OBJTYPE_RSS_PARAMS.
- Non-fragment UDP/IPv4 datagrams' hash type is delivered to VM as
  TCP_IPV4.

Currently this erroneous behavior only applies to WS2016/Windows10.

Force l3/l4 protocol check, if the RXed packet's hash type is TCP_IPV4,
and the Hyper-V is running on WS2016/Windows10.  If the RXed packet is
UDP datagram, adjust mbuf hash type to UDP_IPV4.

MFC after:	3 days
Sponsored by:	Microsoft
2017-10-10 08:32:03 +00:00
sephe
e521cc6466 hyperv/hn: Fix options RSS building
Reported by:	np
MFC after:	1 week
Sponsored by:	Microsoft
2017-10-05 13:22:14 +00:00
sephe
11859da3f9 hyperv/hn: Unbreak i386 building.
Reported by:	cy
MFC after:	1 week
Sponsored by:	Microsoft
2017-09-28 07:02:56 +00:00
sephe
b69ef721ad hyperv/hn: Fix UDP checksum offload issue in Azure.
UDP checksum offload does not work in Azure if following conditions are
met:
- sizeof(IP hdr + UDP hdr + payload) > 1420.
- IP_DF is not set in IP hdr

Use software checksum for UDP datagrams falling into this category.

Add two tunables to disable UDP/IPv4 and UDP/IPv6 checksum offload, in
case something unexpected happened.

MFC after:	1 week
Sponsored by:	Microsoft
Differential Revision:	https://reviews.freebsd.org/D12429
2017-09-27 05:44:50 +00:00
sephe
8fc41247cb hyperv/hn: Set tcp header offset for CSUM/LSO offloading.
No observable effect; better safe than sorry.

MFC after:	1 week
Sponsored by:	Microsoft
Differential Revision:	https://reviews.freebsd.org/D12417
2017-09-27 04:42:40 +00:00
sephe
74e7b4226c hyperv/hn: Incease max supported MTU
MFC after:	1 week
Sponsored by:	Microsoft
Differential Revision:	https://reviews.freebsd.org/D12365
2017-09-19 06:46:00 +00:00
sephe
284eaa3f70 hyperv/hn: Fix MTU setting
- Add size of an ethernet header to the value configured to NVS.  This
  does not seem to have any effects if MTU is 1500, but fix hypervisor
  side's setting if MTU > 1500.
- Override the MTU setting according to the view from the hypervisor
  side.

MFC after:	1 week
Sponsored by:	Microsoft
Differential Revision:	https://reviews.freebsd.org/D12352
2017-09-19 06:38:57 +00:00
sephe
becc8347a8 hyperv/hn: Apply VF's RSS setting
Since in Azure SYN and SYN|ACK go through the synthetic parts while the
rest of the same TCP flow goes through the VF, apply VF's RSS settings
to synthetic parts to have a consistent hash value/type for the same TCP
flow.

MFC after:	1 week
Sponsored by:	Microsoft
Differential Revision:	https://reviews.freebsd.org/D12333
2017-09-19 06:29:38 +00:00
sephe
ab9d242c8d hyperv/hn: Log RSS capabilities mask.
This helps to detect when UDP hash types can be supported.

MFC after:	3 days
Sponsored by:	Microsoft
Differential Revision:	https://reviews.freebsd.org/D12177
2017-09-05 06:20:02 +00:00
sephe
464b6d2488 hyperv/hn: Implement SIOCGIFRSS{KEY,HASH}.
The conditional compiling in the review request is removed, since
these IOCTLs will be available in stable/10 and stable/11.

Reviewed by:	gallatin
MFC after:	1 week
Sponsored by:	Microsoft
Differential Revision:	https://reviews.freebsd.org/D12175
2017-09-05 06:05:48 +00:00
sephe
6a2d79f56b hyperv: Update copyright for the files changed in 2017
MFC after:	3 days
Sponsored by:	Microsoft
Differential Revision:	https://reviews.freebsd.org/D11982
2017-08-14 06:00:50 +00:00
sephe
a4ceea128a hyperv/hn: Re-set datapath after synthetic parts reattached.
Do this even for non-transparent mode VF. Better safe than sorry.

MFC after:	3 days
Sponsored by:	Microsoft
Differential Revision:	https://reviews.freebsd.org/D11981
2017-08-14 05:55:16 +00:00
sephe
3142e35944 hyperv/hn: Minor cleanup
MFC after:	3 days
Sponsored by:	Microsoft
Differential Revision:	https://reviews.freebsd.org/D11979
2017-08-14 05:46:50 +00:00
sephe
744451beef hyperv/hn: Fix/enhance receiving path when VF is activated.
- Update hn(4)'s stats properly for non-transparent mode VF.
- Allow BPF tapping to hn(4) for non-transparent mode VF.
- Don't setup mbuf hash, if 'options RSS' is set.
  In Azure, when VF is activated, TCP SYN and SYN|ACK go through hn(4)
  while the rest of segments and ACKs belonging to the same TCP 4-tuple
  go through the VF.  So don't setup mbuf hash, if a VF is activated
  and 'options RSS' is not enabled.  hn(4) and the VF may use neither
  the same RSS hash key nor the same RSS hash function, so the hash
  value for packets belonging to the same flow could be different!
- Disable LRO.
  hn(4) will only receive broadcast packets, multicast packets, TCP
  SYN and SYN|ACK (in Azure), LRO is useless for these packet types.
  For non-transparent, we definitely _cannot_ enable LRO at all, since
  the LRO flush will use hn(4) as the receiving interface; i.e.
  hn_ifp->if_input(hn_ifp, m).

While I'm here, remove unapplied comment and minor style change.

MFC after:	3 days
Sponsored by:	Microsoft
Differential Revision:	https://reviews.freebsd.org/D11978
2017-08-14 05:40:52 +00:00
sephe
19ec4cf79d hyperv/hn: Update VF's ibytes properly under transparent VF mode.
While, I'm here add comment about why updating VF's imcast stat is
not necessary.

MFC after:	3 days
Sponsored by:	Microsoft
Differential Revision:	https://reviews.freebsd.org/D11948
2017-08-14 05:30:02 +00:00
sephe
4b42c6e8a4 hyperv/hn: Implement transparent mode network VF.
How network VF works with hn(4) on Hyper-V in transparent mode:

- Each network VF has a cooresponding hn(4).
- The network VF and the it's cooresponding hn(4) have the same hardware
  address.
- Once the network VF is attached, the cooresponding hn(4) waits several
  seconds to make sure that the network VF attach routing completes, then:
  o  Set the intersection of the network VF's if_capabilities and the
     cooresponding hn(4)'s if_capabilities to the cooresponding hn(4)'s
     if_capabilities.  And adjust the cooresponding hn(4) if_capable and
     if_hwassist accordingly. (*)
  o  Make sure that the cooresponding hn(4)'s TSO parameters meet the
     constraints posed by both the network VF and the cooresponding hn(4).
     (*)
  o  The network VF's if_input is overridden.  The overriding if_input
     changes the input packet's rcvif to the cooreponding hn(4).  The
     network layers are tricked into thinking that all packets are
     neceived by the cooresponding hn(4).
  o  If the cooresponding hn(4) was brought up, bring up the network VF.
     The transmission dispatched to the cooresponding hn(4) are
     redispatched to the network VF.
  o  Bringing down the cooresponding hn(4) also brings down the network
     VF.
  o  All IOCTLs issued to the cooresponding hn(4) are pass-through'ed to
     the network VF; the cooresponding hn(4) changes its internal state
     if necessary.
  o  The media status of the cooresponding hn(4) solely relies on the
     network VF.
  o  If there are multicast filters on the cooresponding hn(4), allmulti
     will be enabled on the network VF. (**)
- Once the network VF is detached.  Undo all damages did to the
  cooresponding hn(4) in the above item.

NOTE:
No operation should be issued directly to the network VF, if the
network VF transparent mode is enabled.  The network VF transparent mode
can be enabled by setting tunable hw.hn.vf_transparent to 1.  The network
VF transparent mode is _not_ enabled by default, as of this commit.

The benefit of the network VF transparent mode is that the network VF
attachment and detachment are transparent to all network layers; e.g. live
migration detaches and reattaches the network VF.

The major drawbacks of the network VF transparent mode:
- The netmap(4) support is lost, even if the VF supports it.
- ALTQ does not work, since if_start method cannot be properly supported.

(*)
These decisions were made so that things will not be messed up too much
during the transition period.

(**)
This does _not_ need to go through the fancy multicast filter management
stuffs like what vlan(4) has, at least currently:
- As of this write, multicast does not work in Azure.
- As of this write, multicast packets go through the cooresponding hn(4).

MFC after:	3 days
Sponsored by:	Microsoft
Differential Revision:	https://reviews.freebsd.org/D11803
2017-08-09 05:59:45 +00:00
sephe
571da5685c hyperv/hn: Add comment about ether_ifattach event subscription.
MFC after:	3 days
Sponsored by:	Microsoft
Differential Revision:	https://reviews.freebsd.org/D11710
2017-08-01 02:55:43 +00:00
sephe
f1555b6b44 hyperv/hn: Renaming and minor cleanup
This prepares for the upcoming transparent VF support.

MFC after:	3 days
Sponsored by:	Microsoft
Differential Revision:	https://reviews.freebsd.org/D11708
2017-08-01 02:45:54 +00:00
sephe
f305deb3bc hyperv/hn: Ignore LINK_SPEED_CHANGE status.
This status will be reported if the backend NIC is wireless; it's not
useful.  Due to the high frequency of the reporting, this could be
pretty annoying; ignore it.

MFC after:	3 days
Sponsored by:	Microsoft
Differential Revision:	https://reviews.freebsd.org/D11651
2017-07-24 04:00:43 +00:00
sephe
e57775e637 hyperv/hn: Export VF list and VF-HN mapping
The VF-HN map will be used later on to implement "transparent VF".

MFC after:	3 days
Sponsored by:	Microsoft
Differential Revision:	https://reviews.freebsd.org/D11618
2017-07-24 03:52:32 +00:00
sephe
91a23f2d4a hyperv/hn: Use channel0, i.e. TX ring0, for TCP SYN/SYN|ACK.
Hyper-V hot channel effect:
Operation latency on hot channel is only _half_ of the operation
latency on cold channels.

This commit takes the advantage of the above Hyper-V host channel
effect, and can reduce more than 75% latency and more than 50%
latency stdev, i.e. lower and more stable/predictable latency,
for various types of web server workloads.

MFC after:	3 days
Sponsored by:	Microsoft
2017-04-24 07:52:27 +00:00
sephe
27e51e2575 hyperv/hn: Fixat RNDIS rxfilter after the successful RNDIS init.
Under certain conditions on certain versions of Hyper-V, the RNDIS
rxfilter is _not_ zero on the hypervisor side after the successful
RNDIS initialization, which breaks the assumption of any following
code (well, it breaks the RNDIS API contract actually).  Clear the
RNDIS rxfilter explicitly, drain packets sneaking through, and drain
the interrupt taskqueues scheduled due to the stealth packets.

Reported by:	dexuan@
MFC after:	3 days
Sponsored by:	Microsoft
Differential Revision:	https://reviews.freebsd.org/D10230
2017-04-05 08:25:22 +00:00
sephe
bf0a3b4be6 hyperv/hn: Misaligned chimney sending buffers should not be used
MFC after:	1 week
Sponsored by:	Microsoft
Differential Revision:	https://reviews.freebsd.org/D9714
2017-03-01 09:05:12 +00:00
sephe
357eaaacb0 hyperv/hn: Make sure that RNDIS packet message is at least 4B aligned.
MFC after:	1 week
Sponsored by:	Microsoft
Differential Revision:	https://reviews.freebsd.org/D9713
2017-03-01 08:50:41 +00:00
sephe
f72c601aa2 hyperv/hn: Simplify RNDIS packet total length calculation.
MFC after:	1 week
Sponsored by:	Microsoft
Differential Revision:	https://reviews.freebsd.org/D9712
2017-03-01 08:24:17 +00:00
sephe
2e63fa0867 hyperv/hn: Simplify RNDIS packet data offset calculation.
MFC after:	1 week
Sponsored by:	Microsoft
Differential Revision:	https://reviews.freebsd.org/D9699
2017-02-28 09:50:34 +00:00
dexuan
76cd3d1328 hyperv/hn: add devctl_notify for VF_UP/DOWN events
Reviewed by:	sephe
Approved by:	sephe (mentor)
MFC after:	2 weeks
Sponsored by:	Microsoft
Differential Revision:	https://reviews.freebsd.org/D9102
2017-01-24 09:27:13 +00:00
dexuan
66d60a9912 hyperv/hn: add a sysctl name for the VF interface
This makes it easier for the userland script to find the releated
VF interface.

Reviewed by:	sephe
Approved by:	sephe (mentor)
MFC after:	2 weeks
Sponsored by:	Microsoft
Differential Revision:	https://reviews.freebsd.org/D9101
2017-01-24 09:25:42 +00:00
dexuan
0d205f947a hyperv/hn: add the support for VF drivers (SR-IOV)
Hyper-V's NIC SR-IOV implementation needs a Hyper-V synthetic NIC and
a VF NIC to work together (both NICs have the same MAC address), mainly to
support seamless live migration.

When the VF device becomes UP (or DOWN), the synthetic NIC driver needs
to switch the data path from the synthetic NIC to the VF (or the opposite).

Note: multicast/broadcast packets are still received through the synthetic
NIC and we need to inject the packets through the VF interface (if the VF is
UP), even if the synthetic NIC is DOWN (so we need to force the rxfilter
to be NDIS_PACKET_TYPE_PROMISCUOUS, when the VF is UP).

Reviewed by:	sephe
Approved by:	sephe (mentor)
MFC after:	2 weeks
Sponsored by:	Microsoft
Differential Revision:	https://reviews.freebsd.org/D8964
2017-01-24 09:24:14 +00:00
dexuan
e6a11a0e4e hyperv/hn: remove the MTU and IFF_DRV_RUNNING checking in hn_rxpkt()
It's unnecessary because the upper nework stack does the same checking.

In the case of Hyper-V SR-IOV, we need to remove the checking because
1) multicast/broadcast packets are still received through the synthetic
NIC and we need to inject the packets through the VF interface;
2) we must inject the packets even if the synthetic NIC is down, or has
a different MTU from the VF device.

Reviewed by:	sephe
Approved by:	sephe (mentor)
MFC after:	2 weeks
Sponsored by:	Microsoft
Differential Revision:	https://reviews.freebsd.org/D8962
2017-01-24 09:15:36 +00:00
dexuan
efde53fa0c hyperv/hn: remember the channel pointer in struct hn_rx_ring
This will be used by the coming NIC SR-IOV patch.

Reviewed by:	sephe
Approved by:	sephe (mentor)
MFC after:	2 weeks
Sponsored by:	Microsoft
Differential Revision:	https://reviews.freebsd.org/D8909
2017-01-24 09:09:53 +00:00
sephe
39facbac7a hyperv/hn: Factor out function to set rxfilter.
MFC after:	1 week
Sponsored by:	Microsoft
Differential Revision:	https://reviews.freebsd.org/D8928
2016-12-28 04:47:17 +00:00
sephe
5aa4a394c9 hyperv/hn: Function renaming; no functional changes.
MFC after:	1 week
Sponsored by:	Microsoft
Differential Revision:	https://reviews.freebsd.org/D8908
2016-12-28 04:35:52 +00:00
sephe
c9b5e5f0a8 hyperv/hn: Consolidate hn_{suspend,resume}
MFC after:	1 week
Sponsored by:	Microsoft
Differential Revision:	https://reviews.freebsd.org/D8907
2016-12-28 03:19:59 +00:00
sephe
e3c83e4556 hyperv/hn: Add polling support
MFC after:	1 week
Sponsored by:	Microsoft
Differential Revision:	https://reviews.freebsd.org/D8739
2016-12-12 05:18:03 +00:00
sephe
d00d549125 hyperv/hn: Add 'options RSS' support.
Reviewed by:	adrian
MFC after:	1 week
Sponsored by:	Microsoft
Differential Revision:	https://reviews.freebsd.org/D8676
2016-12-01 05:37:29 +00:00
sephe
95dbf6d04a hyperv/hn: Don't hold txdesc, if no BPFs are attached.
MFC after:	1 week
Sponsored by:	Microsoft
Differential Revision:	https://reviews.freebsd.org/D8675
2016-12-01 03:39:34 +00:00
sephe
a3a423047c hyperv/hn: Add HN_DEBUG kernel option.
If bufring is used for per-TX ring descs, don't update "available"
counter, which is only used to help debugging.

MFC after:	1 week
Sponsored by:	Microsoft
Differential Revision:	https://reviews.freebsd.org/D8674
2016-12-01 03:27:16 +00:00
sephe
8a84dc6205 hyperv/hn: Allow TX to share event taskqueues.
MFC after:	1 week
Sponsored by:	Microsoft
Differential Revision:	https://reviews.freebsd.org/D8659
2016-11-30 07:54:28 +00:00
sephe
06bb1b5f32 hyperv/hn: Allow multiple TX taskqueues.
MFC after:	1 week
Sponsored by:	Microsoft
Differential Revision:	https://reviews.freebsd.org/D8655
2016-11-30 05:28:39 +00:00
sephe
738c61bc0e hyperv/hn: Nuke the unused TX taskqueue CPU binding tunable.
It was an experimental tunable, and is now deemed to be road blocker
for further changes.  Time to retire it.

MFC after:	1 week
Sponsored by:	Microsoft
Differential Revision:	https://reviews.freebsd.org/D8654
2016-11-30 05:11:59 +00:00
sephe
8d4258a0ce hyperv/hn: Simplify RSS indirect table fixup API
MFC after:	1 week
Sponsored by:	Microsoft
Differential Revision:	https://reviews.freebsd.org/D8630
2016-11-28 06:40:26 +00:00