Commit Graph

364 Commits

Author SHA1 Message Date
Pawel Biernacki
7029da5c36 Mark more nodes as CTLFLAG_MPSAFE or CTLFLAG_NEEDGIANT (17 of many)
r357614 added CTLFLAG_NEEDGIANT to make it easier to find nodes that are
still not MPSAFE (or already are but aren’t properly marked).
Use it in preparation for a general review of all nodes.

This is non-functional change that adds annotations to SYSCTL_NODE and
SYSCTL_PROC nodes using one of the soon-to-be-required flags.

Mark all obvious cases as MPSAFE.  All entries that haven't been marked
as MPSAFE before are by default marked as NEEDGIANT

Approved by:	kib (mentor, blanket)
Commented by:	kib, gallatin, melifaro
Differential Revision:	https://reviews.freebsd.org/D23718
2020-02-26 14:26:36 +00:00
Gleb Smirnoff
e87c494015 Although most of the NIC drivers are epoch ready, due to peer pressure
switch over to opt-in instead of opt-out for epoch.

Instead of IFF_NEEDSEPOCH, provide IFF_KNOWSEPOCH. If driver marks
itself with IFF_KNOWSEPOCH, then ether_input() would not enter epoch
when processing its packets.

Now this will create recursive entrance in epoch in >90% network
drivers, but will guarantee safeness of the transition.

Mark several tested drivers as IFF_KNOWSEPOCH.

Reviewed by:		hselasky, jeff, bz, gallatin
Differential Revision:	https://reviews.freebsd.org/D23674
2020-02-24 21:07:30 +00:00
Gleb Smirnoff
0921628ddc Introduce flag IFF_NEEDSEPOCH that marks Ethernet interfaces that
supposedly may call into ether_input() without network epoch.

They all need to be reviewed before 13.0-RELEASE.  Some may need
be fixed.  The flag is not planned to be used in the kernel for
a long time.
2020-01-23 01:41:09 +00:00
Hans Petter Selasky
a3b413af0a Fix spelling.
PR:		242891
MFC after:	1 week
Sponsored by:	Mellanox Technologies
2019-12-30 09:22:52 +00:00
Wei Hu
23a499203c hyperv/vmbus: Fix the wrong size in ndis_offload structure
Submitted by:	whu
MFC after:	2 weeks
Sponsored by:	Microsoft
2019-07-09 08:21:14 +00:00
Wei Hu
ace5ce7e70 hyperv/vmbus: Update VMBus version 4.0 and 5.0 support.
Add VMBus protocol version 4.0. and 5.0 to support Windows 10 and newer HyperV hosts.

For VMBus 4.0 and newer HyperV, the netvsc gpadl teardown must be done after vmbus close.

Submitted by:	whu
MFC after:	2 weeks
Sponsored by:	Microsoft
2019-07-09 07:24:18 +00:00
Wei Hu
f0319886ca Do not trop UDP traffic when TXCSUM_IPV6 flag is on
PR:		231797
Submitted by:	whu
Reviewed by:	dexuan
Obtained from:	Kevin Morse
MFC after:	3 days
Sponsored by:	Microsoft
Differential Revision:	https://bugs.freebsd.org/bugzilla/attachment.cgi?id=198333&action=diff
2018-10-22 11:23:51 +00:00
Dexuan Cui
d76fb49fd8 hyperv/hn: Fix panic in hypervisor code upon device detach event
Submitted by:	hselasky
Reviewed by:	dexuan
MFC after:	3 days
Differential Revision:	https://reviews.freebsd.org/D16139
2018-07-17 21:05:08 +00:00
Eric van Gyzen
7898a1f4c4 if_hn: fix use of uninitialized variable
omcast was used without being initialized in the non-multicast case.
The only effect was that the interface's multicast output counter could be
incorrect.

Reported by:	Coverity
CID:		1379662
MFC after:	3 days
Sponsored by:	Dell EMC
2018-05-26 14:14:56 +00:00
Matt Macy
d7c5a620e2 ifnet: Replace if_addr_lock rwlock with epoch + mutex
Run on LLNW canaries and tested by pho@

gallatin:
Using a 14-core, 28-HTT single socket E5-2697 v3 with a 40GbE MLX5
based ConnectX 4-LX NIC, I see an almost 12% improvement in received
packet rate, and a larger improvement in bytes delivered all the way
to userspace.

When the host receiving 64 streams of netperf -H $DUT -t UDP_STREAM -- -m 1,
I see, using nstat -I mce0 1 before the patch:

InMpps OMpps  InGbs  OGbs err TCP Est %CPU syscalls csw     irq GBfree
4.98   0.00   4.42   0.00 4235592     33   83.80 4720653 2149771   1235 247.32
4.73   0.00   4.20   0.00 4025260     33   82.99 4724900 2139833   1204 247.32
4.72   0.00   4.20   0.00 4035252     33   82.14 4719162 2132023   1264 247.32
4.71   0.00   4.21   0.00 4073206     33   83.68 4744973 2123317   1347 247.32
4.72   0.00   4.21   0.00 4061118     33   80.82 4713615 2188091   1490 247.32
4.72   0.00   4.21   0.00 4051675     33   85.29 4727399 2109011   1205 247.32
4.73   0.00   4.21   0.00 4039056     33   84.65 4724735 2102603   1053 247.32

After the patch

InMpps OMpps  InGbs  OGbs err TCP Est %CPU syscalls csw     irq GBfree
5.43   0.00   4.20   0.00 3313143     33   84.96 5434214 1900162   2656 245.51
5.43   0.00   4.20   0.00 3308527     33   85.24 5439695 1809382   2521 245.51
5.42   0.00   4.19   0.00 3316778     33   87.54 5416028 1805835   2256 245.51
5.42   0.00   4.19   0.00 3317673     33   90.44 5426044 1763056   2332 245.51
5.42   0.00   4.19   0.00 3314839     33   88.11 5435732 1792218   2499 245.52
5.44   0.00   4.19   0.00 3293228     33   91.84 5426301 1668597   2121 245.52

Similarly, netperf reports 230Mb/s before the patch, and 270Mb/s after the patch

Reviewed by:	gallatin
Sponsored by:	Limelight Networks
Differential Revision:	https://reviews.freebsd.org/D15366
2018-05-18 20:13:34 +00:00
Sepherosa Ziehau
78e46963b6 hyperv/hn: Enable transparent VF by default.
MFC after:	3 days
Sponsored by:	Microsoft
2017-10-11 05:28:51 +00:00
Sepherosa Ziehau
6f12c42e8b hyperv/hn: Workaround erroneous hash type observed on WS2016 for VF.
The background was described in r324489.

MFC after:	3 days
Sponsored by:	Microsoft
2017-10-11 05:15:49 +00:00
Sepherosa Ziehau
db76829b2b hyperv/hn: Workaround erroneous hash type observed on WS2016.
Background:
- UDP 4-tuple hash type is unconditionally enabled in Hyper-V on WS2016,
  which is _not_ affected by NDIS_OBJTYPE_RSS_PARAMS.
- Non-fragment UDP/IPv4 datagrams' hash type is delivered to VM as
  TCP_IPV4.

Currently this erroneous behavior only applies to WS2016/Windows10.

Force l3/l4 protocol check, if the RXed packet's hash type is TCP_IPV4,
and the Hyper-V is running on WS2016/Windows10.  If the RXed packet is
UDP datagram, adjust mbuf hash type to UDP_IPV4.

MFC after:	3 days
Sponsored by:	Microsoft
2017-10-10 08:32:03 +00:00
Sepherosa Ziehau
b7d499f4d4 hyperv/hn: Fix options RSS building
Reported by:	np
MFC after:	1 week
Sponsored by:	Microsoft
2017-10-05 13:22:14 +00:00
Sepherosa Ziehau
352035746f hyperv/hn: Unbreak i386 building.
Reported by:	cy
MFC after:	1 week
Sponsored by:	Microsoft
2017-09-28 07:02:56 +00:00
Sepherosa Ziehau
2be266caf2 hyperv/hn: Fix UDP checksum offload issue in Azure.
UDP checksum offload does not work in Azure if following conditions are
met:
- sizeof(IP hdr + UDP hdr + payload) > 1420.
- IP_DF is not set in IP hdr

Use software checksum for UDP datagrams falling into this category.

Add two tunables to disable UDP/IPv4 and UDP/IPv6 checksum offload, in
case something unexpected happened.

MFC after:	1 week
Sponsored by:	Microsoft
Differential Revision:	https://reviews.freebsd.org/D12429
2017-09-27 05:44:50 +00:00
Sepherosa Ziehau
c49d47daf3 hyperv/hn: Set tcp header offset for CSUM/LSO offloading.
No observable effect; better safe than sorry.

MFC after:	1 week
Sponsored by:	Microsoft
Differential Revision:	https://reviews.freebsd.org/D12417
2017-09-27 04:42:40 +00:00
Sepherosa Ziehau
d74831eea2 hyperv/hn: Incease max supported MTU
MFC after:	1 week
Sponsored by:	Microsoft
Differential Revision:	https://reviews.freebsd.org/D12365
2017-09-19 06:46:00 +00:00
Sepherosa Ziehau
eb2fe04416 hyperv/hn: Fix MTU setting
- Add size of an ethernet header to the value configured to NVS.  This
  does not seem to have any effects if MTU is 1500, but fix hypervisor
  side's setting if MTU > 1500.
- Override the MTU setting according to the view from the hypervisor
  side.

MFC after:	1 week
Sponsored by:	Microsoft
Differential Revision:	https://reviews.freebsd.org/D12352
2017-09-19 06:38:57 +00:00
Sepherosa Ziehau
642ec226bb hyperv/hn: Apply VF's RSS setting
Since in Azure SYN and SYN|ACK go through the synthetic parts while the
rest of the same TCP flow goes through the VF, apply VF's RSS settings
to synthetic parts to have a consistent hash value/type for the same TCP
flow.

MFC after:	1 week
Sponsored by:	Microsoft
Differential Revision:	https://reviews.freebsd.org/D12333
2017-09-19 06:29:38 +00:00
Sepherosa Ziehau
7f1e5ebba8 hyperv/hn: Log RSS capabilities mask.
This helps to detect when UDP hash types can be supported.

MFC after:	3 days
Sponsored by:	Microsoft
Differential Revision:	https://reviews.freebsd.org/D12177
2017-09-05 06:20:02 +00:00
Sepherosa Ziehau
8c068aa51e hyperv/hn: Implement SIOCGIFRSS{KEY,HASH}.
The conditional compiling in the review request is removed, since
these IOCTLs will be available in stable/10 and stable/11.

Reviewed by:	gallatin
MFC after:	1 week
Sponsored by:	Microsoft
Differential Revision:	https://reviews.freebsd.org/D12175
2017-09-05 06:05:48 +00:00
Sepherosa Ziehau
93b4e111bb hyperv: Update copyright for the files changed in 2017
MFC after:	3 days
Sponsored by:	Microsoft
Differential Revision:	https://reviews.freebsd.org/D11982
2017-08-14 06:00:50 +00:00
Sepherosa Ziehau
d0cd8231e0 hyperv/hn: Re-set datapath after synthetic parts reattached.
Do this even for non-transparent mode VF. Better safe than sorry.

MFC after:	3 days
Sponsored by:	Microsoft
Differential Revision:	https://reviews.freebsd.org/D11981
2017-08-14 05:55:16 +00:00
Sepherosa Ziehau
c2d50b263f hyperv/hn: Minor cleanup
MFC after:	3 days
Sponsored by:	Microsoft
Differential Revision:	https://reviews.freebsd.org/D11979
2017-08-14 05:46:50 +00:00
Sepherosa Ziehau
a97fff1913 hyperv/hn: Fix/enhance receiving path when VF is activated.
- Update hn(4)'s stats properly for non-transparent mode VF.
- Allow BPF tapping to hn(4) for non-transparent mode VF.
- Don't setup mbuf hash, if 'options RSS' is set.
  In Azure, when VF is activated, TCP SYN and SYN|ACK go through hn(4)
  while the rest of segments and ACKs belonging to the same TCP 4-tuple
  go through the VF.  So don't setup mbuf hash, if a VF is activated
  and 'options RSS' is not enabled.  hn(4) and the VF may use neither
  the same RSS hash key nor the same RSS hash function, so the hash
  value for packets belonging to the same flow could be different!
- Disable LRO.
  hn(4) will only receive broadcast packets, multicast packets, TCP
  SYN and SYN|ACK (in Azure), LRO is useless for these packet types.
  For non-transparent, we definitely _cannot_ enable LRO at all, since
  the LRO flush will use hn(4) as the receiving interface; i.e.
  hn_ifp->if_input(hn_ifp, m).

While I'm here, remove unapplied comment and minor style change.

MFC after:	3 days
Sponsored by:	Microsoft
Differential Revision:	https://reviews.freebsd.org/D11978
2017-08-14 05:40:52 +00:00
Sepherosa Ziehau
3bed4e54f8 hyperv/hn: Update VF's ibytes properly under transparent VF mode.
While, I'm here add comment about why updating VF's imcast stat is
not necessary.

MFC after:	3 days
Sponsored by:	Microsoft
Differential Revision:	https://reviews.freebsd.org/D11948
2017-08-14 05:30:02 +00:00
Sepherosa Ziehau
9c6cae2431 hyperv/hn: Implement transparent mode network VF.
How network VF works with hn(4) on Hyper-V in transparent mode:

- Each network VF has a cooresponding hn(4).
- The network VF and the it's cooresponding hn(4) have the same hardware
  address.
- Once the network VF is attached, the cooresponding hn(4) waits several
  seconds to make sure that the network VF attach routing completes, then:
  o  Set the intersection of the network VF's if_capabilities and the
     cooresponding hn(4)'s if_capabilities to the cooresponding hn(4)'s
     if_capabilities.  And adjust the cooresponding hn(4) if_capable and
     if_hwassist accordingly. (*)
  o  Make sure that the cooresponding hn(4)'s TSO parameters meet the
     constraints posed by both the network VF and the cooresponding hn(4).
     (*)
  o  The network VF's if_input is overridden.  The overriding if_input
     changes the input packet's rcvif to the cooreponding hn(4).  The
     network layers are tricked into thinking that all packets are
     neceived by the cooresponding hn(4).
  o  If the cooresponding hn(4) was brought up, bring up the network VF.
     The transmission dispatched to the cooresponding hn(4) are
     redispatched to the network VF.
  o  Bringing down the cooresponding hn(4) also brings down the network
     VF.
  o  All IOCTLs issued to the cooresponding hn(4) are pass-through'ed to
     the network VF; the cooresponding hn(4) changes its internal state
     if necessary.
  o  The media status of the cooresponding hn(4) solely relies on the
     network VF.
  o  If there are multicast filters on the cooresponding hn(4), allmulti
     will be enabled on the network VF. (**)
- Once the network VF is detached.  Undo all damages did to the
  cooresponding hn(4) in the above item.

NOTE:
No operation should be issued directly to the network VF, if the
network VF transparent mode is enabled.  The network VF transparent mode
can be enabled by setting tunable hw.hn.vf_transparent to 1.  The network
VF transparent mode is _not_ enabled by default, as of this commit.

The benefit of the network VF transparent mode is that the network VF
attachment and detachment are transparent to all network layers; e.g. live
migration detaches and reattaches the network VF.

The major drawbacks of the network VF transparent mode:
- The netmap(4) support is lost, even if the VF supports it.
- ALTQ does not work, since if_start method cannot be properly supported.

(*)
These decisions were made so that things will not be messed up too much
during the transition period.

(**)
This does _not_ need to go through the fancy multicast filter management
stuffs like what vlan(4) has, at least currently:
- As of this write, multicast does not work in Azure.
- As of this write, multicast packets go through the cooresponding hn(4).

MFC after:	3 days
Sponsored by:	Microsoft
Differential Revision:	https://reviews.freebsd.org/D11803
2017-08-09 05:59:45 +00:00
Sepherosa Ziehau
f41e0df406 hyperv/hn: Add comment about ether_ifattach event subscription.
MFC after:	3 days
Sponsored by:	Microsoft
Differential Revision:	https://reviews.freebsd.org/D11710
2017-08-01 02:55:43 +00:00
Sepherosa Ziehau
962f035786 hyperv/hn: Renaming and minor cleanup
This prepares for the upcoming transparent VF support.

MFC after:	3 days
Sponsored by:	Microsoft
Differential Revision:	https://reviews.freebsd.org/D11708
2017-08-01 02:45:54 +00:00
Sepherosa Ziehau
40905afa0f hyperv/hn: Ignore LINK_SPEED_CHANGE status.
This status will be reported if the backend NIC is wireless; it's not
useful.  Due to the high frequency of the reporting, this could be
pretty annoying; ignore it.

MFC after:	3 days
Sponsored by:	Microsoft
Differential Revision:	https://reviews.freebsd.org/D11651
2017-07-24 04:00:43 +00:00
Sepherosa Ziehau
499c3e1739 hyperv/hn: Export VF list and VF-HN mapping
The VF-HN map will be used later on to implement "transparent VF".

MFC after:	3 days
Sponsored by:	Microsoft
Differential Revision:	https://reviews.freebsd.org/D11618
2017-07-24 03:52:32 +00:00
Sepherosa Ziehau
cc0c6ebc14 hyperv/hn: Use channel0, i.e. TX ring0, for TCP SYN/SYN|ACK.
Hyper-V hot channel effect:
Operation latency on hot channel is only _half_ of the operation
latency on cold channels.

This commit takes the advantage of the above Hyper-V host channel
effect, and can reduce more than 75% latency and more than 50%
latency stdev, i.e. lower and more stable/predictable latency,
for various types of web server workloads.

MFC after:	3 days
Sponsored by:	Microsoft
2017-04-24 07:52:27 +00:00
Sepherosa Ziehau
b3b75d9c84 hyperv/hn: Fixat RNDIS rxfilter after the successful RNDIS init.
Under certain conditions on certain versions of Hyper-V, the RNDIS
rxfilter is _not_ zero on the hypervisor side after the successful
RNDIS initialization, which breaks the assumption of any following
code (well, it breaks the RNDIS API contract actually).  Clear the
RNDIS rxfilter explicitly, drain packets sneaking through, and drain
the interrupt taskqueues scheduled due to the stealth packets.

Reported by:	dexuan@
MFC after:	3 days
Sponsored by:	Microsoft
Differential Revision:	https://reviews.freebsd.org/D10230
2017-04-05 08:25:22 +00:00
Sepherosa Ziehau
920adec330 hyperv/hn: Misaligned chimney sending buffers should not be used
MFC after:	1 week
Sponsored by:	Microsoft
Differential Revision:	https://reviews.freebsd.org/D9714
2017-03-01 09:05:12 +00:00
Sepherosa Ziehau
7675868a04 hyperv/hn: Make sure that RNDIS packet message is at least 4B aligned.
MFC after:	1 week
Sponsored by:	Microsoft
Differential Revision:	https://reviews.freebsd.org/D9713
2017-03-01 08:50:41 +00:00
Sepherosa Ziehau
8fe90f73ae hyperv/hn: Simplify RNDIS packet total length calculation.
MFC after:	1 week
Sponsored by:	Microsoft
Differential Revision:	https://reviews.freebsd.org/D9712
2017-03-01 08:24:17 +00:00
Sepherosa Ziehau
9130c4f75b hyperv/hn: Simplify RNDIS packet data offset calculation.
MFC after:	1 week
Sponsored by:	Microsoft
Differential Revision:	https://reviews.freebsd.org/D9699
2017-02-28 09:50:34 +00:00
Dexuan Cui
33408a34c4 hyperv/hn: add devctl_notify for VF_UP/DOWN events
Reviewed by:	sephe
Approved by:	sephe (mentor)
MFC after:	2 weeks
Sponsored by:	Microsoft
Differential Revision:	https://reviews.freebsd.org/D9102
2017-01-24 09:27:13 +00:00
Dexuan Cui
40d60d6ee1 hyperv/hn: add a sysctl name for the VF interface
This makes it easier for the userland script to find the releated
VF interface.

Reviewed by:	sephe
Approved by:	sephe (mentor)
MFC after:	2 weeks
Sponsored by:	Microsoft
Differential Revision:	https://reviews.freebsd.org/D9101
2017-01-24 09:25:42 +00:00
Dexuan Cui
5bdfd3fd36 hyperv/hn: add the support for VF drivers (SR-IOV)
Hyper-V's NIC SR-IOV implementation needs a Hyper-V synthetic NIC and
a VF NIC to work together (both NICs have the same MAC address), mainly to
support seamless live migration.

When the VF device becomes UP (or DOWN), the synthetic NIC driver needs
to switch the data path from the synthetic NIC to the VF (or the opposite).

Note: multicast/broadcast packets are still received through the synthetic
NIC and we need to inject the packets through the VF interface (if the VF is
UP), even if the synthetic NIC is DOWN (so we need to force the rxfilter
to be NDIS_PACKET_TYPE_PROMISCUOUS, when the VF is UP).

Reviewed by:	sephe
Approved by:	sephe (mentor)
MFC after:	2 weeks
Sponsored by:	Microsoft
Differential Revision:	https://reviews.freebsd.org/D8964
2017-01-24 09:24:14 +00:00
Dexuan Cui
c927d68136 hyperv/hn: remove the MTU and IFF_DRV_RUNNING checking in hn_rxpkt()
It's unnecessary because the upper nework stack does the same checking.

In the case of Hyper-V SR-IOV, we need to remove the checking because
1) multicast/broadcast packets are still received through the synthetic
NIC and we need to inject the packets through the VF interface;
2) we must inject the packets even if the synthetic NIC is down, or has
a different MTU from the VF device.

Reviewed by:	sephe
Approved by:	sephe (mentor)
MFC after:	2 weeks
Sponsored by:	Microsoft
Differential Revision:	https://reviews.freebsd.org/D8962
2017-01-24 09:15:36 +00:00
Dexuan Cui
3ab0fea134 hyperv/hn: remember the channel pointer in struct hn_rx_ring
This will be used by the coming NIC SR-IOV patch.

Reviewed by:	sephe
Approved by:	sephe (mentor)
MFC after:	2 weeks
Sponsored by:	Microsoft
Differential Revision:	https://reviews.freebsd.org/D8909
2017-01-24 09:09:53 +00:00
Sepherosa Ziehau
f1b0a43ff6 hyperv/hn: Factor out function to set rxfilter.
MFC after:	1 week
Sponsored by:	Microsoft
Differential Revision:	https://reviews.freebsd.org/D8928
2016-12-28 04:47:17 +00:00
Sepherosa Ziehau
c08f7b2c28 hyperv/hn: Function renaming; no functional changes.
MFC after:	1 week
Sponsored by:	Microsoft
Differential Revision:	https://reviews.freebsd.org/D8908
2016-12-28 04:35:52 +00:00
Sepherosa Ziehau
87f8129d28 hyperv/hn: Consolidate hn_{suspend,resume}
MFC after:	1 week
Sponsored by:	Microsoft
Differential Revision:	https://reviews.freebsd.org/D8907
2016-12-28 03:19:59 +00:00
Sepherosa Ziehau
6c1204df36 hyperv/hn: Add polling support
MFC after:	1 week
Sponsored by:	Microsoft
Differential Revision:	https://reviews.freebsd.org/D8739
2016-12-12 05:18:03 +00:00
Sepherosa Ziehau
34d68912be hyperv/hn: Add 'options RSS' support.
Reviewed by:	adrian
MFC after:	1 week
Sponsored by:	Microsoft
Differential Revision:	https://reviews.freebsd.org/D8676
2016-12-01 05:37:29 +00:00
Sepherosa Ziehau
8e7d313625 hyperv/hn: Don't hold txdesc, if no BPFs are attached.
MFC after:	1 week
Sponsored by:	Microsoft
Differential Revision:	https://reviews.freebsd.org/D8675
2016-12-01 03:39:34 +00:00
Sepherosa Ziehau
85e4ae1e13 hyperv/hn: Add HN_DEBUG kernel option.
If bufring is used for per-TX ring descs, don't update "available"
counter, which is only used to help debugging.

MFC after:	1 week
Sponsored by:	Microsoft
Differential Revision:	https://reviews.freebsd.org/D8674
2016-12-01 03:27:16 +00:00