Commit Graph

50 Commits

Author SHA1 Message Date
hselasky
456370e530 Add optimizing LRO wrapper:
- Add optimizing LRO wrapper which pre-sorts all incoming packets
  according to the hash type and flowid. This prevents exhaustion of
  the LRO entries due to too many connections at the same time.
  Testing using a larger number of higher bandwidth TCP connections
  showed that the incoming ACK packet aggregation rate increased from
  ~1.3:1 to almost 3:1. Another test showed that for a number of TCP
  connections greater than 16 per hardware receive ring, where 8 TCP
  connections was the LRO active entry limit, there was a significant
  improvement in throughput due to being able to fully aggregate more
  than 8 TCP stream. For very few very high bandwidth TCP streams, the
  optimizing LRO wrapper will add CPU usage instead of reducing CPU
  usage. This is expected. Network drivers which want to use the
  optimizing LRO wrapper needs to call "tcp_lro_queue_mbuf()" instead
  of "tcp_lro_rx()" and "tcp_lro_flush_all()" instead of
  "tcp_lro_flush()". Further the LRO control structure must be
  initialized using "tcp_lro_init_args()" passing a non-zero number
  into the "lro_mbufs" argument.

- Make LRO statistics 64-bit. Previously 32-bit integers were used for
  statistics which can be prone to wrap-around. Fix this while at it
  and update all SYSCTL's which expose LRO statistics.

- Ensure all data is freed when destroying a LRO control structures,
  especially leftover LRO entries.

- Reduce number of memory allocations needed when setting up a LRO
  control structure by precomputing the total amount of memory needed.

- Add own memory allocation counter for LRO.

- Bump the FreeBSD version to force recompilation of all KLDs due to
  change of the LRO control structure size.

Sponsored by:	Mellanox Technologies
Reviewed by:	gallatin, sbruno, rrs, gnn, transport
Tested by:	Netflix
Differential Revision:	https://reviews.freebsd.org/D4914
2016-01-19 15:33:28 +00:00
sephe
796a32d5d8 hyperv: set receive buffer size according to NVSP protocol version
If the NVSP protocol version is not greater than NVSP_PROTOCOL_VERSION_2,
then the recv buffer size is 15MB, otherwise the buffer size is 16MB.

Submitted by:		Hongjiang Zhang <honzhan microsoft com>
Reviewed by:		royger, Dexuan Cui <decui microsoft com>, adrian
Approved by:		adrian (mentor)
Sponsored by:		Microsoft OSTC
Differential Revision:	https://reviews.freebsd.org/D4814
2016-01-14 03:16:29 +00:00
sephe
8453ef2fe2 hyperv: add interrupt counters
Submitted by:		Howard Su <howard0su gmail com>
Reviewed by:		royger, Dexuan Cui <decui microsoft com>, adrian
Approved by:		adrian (mentor)
Sponsored by:		Microsoft OSTC
Differential Revision:	https://reviews.freebsd.org/D4693
2016-01-14 03:11:35 +00:00
sephe
8d45cbc0b6 hyperv: implement an event timer
Submitted by:		Howard Su <howard0su@gmail.com>
Reviewed by:		delphij, royger, adrian
Approved by:		adrian (mentor)
Sponsored by:		Microsoft OSTC
Differential Revision:	https://reviews.freebsd.org/D4676
2016-01-14 03:05:10 +00:00
sephe
929b65cffc hyperv: remove unused vmbus definitions
We don't need them at all.

Submitted by:		Dexuan Cui <decui microsoft com>
Sponsored by:		Microsoft OSTC
Reviewed by:		royger, adrian, delphij
Approved by:		adrian (mentor)
Differential Revision:	https://reviews.freebsd.org/D4595
2016-01-14 02:55:28 +00:00
sephe
a3d3d84a95 hyperv: use x86 generic code to do the hypervisor detection
This is first step to move the generic part of HV code into kernel instead
of module, so that it is possible to use hypercall to implement some other
paravirtualization code in the kernel.

Submitted by:		Howard Su <howard0su@gmail.com>
Reviewed by:		royger, delphij, adrian
Approved by:		adrian (mentor)
Sponsored by:		Microsoft OSTC
Differential Revision:	https://reviews.freebsd.org/D3072
2016-01-14 02:50:13 +00:00
sephe
8767bf77c4 hyperv/hn: Unbreak LINT-NOIP
Reported by:	bz
Approved by:	adrain (mentor)
Sponsored by:	Microsoft OSTC
2016-01-14 02:32:50 +00:00
sephe
0517d6cfae hyperv/hn: Removed unused netvsc_init()
Submitted by:		Dexuan Cui <decui microsoft com>
Reviewed by:		me, adrian, royger,
			Hongjiang Zhang <honzhan microsoft com>
Approved by:		adrian (mentor)
Sponsored by:		Microsoft OSTC
Differential Revision:	https://reviews.freebsd.org/D4594
2016-01-12 01:55:57 +00:00
sephe
b0c6076d09 hyperv/hn: Avoid mbuf cluster allocation, if the packet is small.
This one mainly avoids mbuf cluster allocation for TCP ACKs during
TCP sending tests.  And it gives me ~200Mbps improvement (4.7Gbps
-> 4.9Gbps), when running iperf3 TCP sending test w/ 16 connections.

While I'm here, nuke the unnecessary zeroing out pkthdr.csum_flags.

Reviewed by:		adrain
Approved by:		adrian (mentor)
Sponsored by:		Microsoft OSTC
Differential Revision:	https://reviews.freebsd.org/D4853
2016-01-12 01:50:56 +00:00
sephe
fa64dde96e hyperv/hn: Implement SIOC[SG]IFMEDIA support
Many applications and kernel modules (e.g. bridge) rely on the ifmedia
status report; give them what they want.

Submitted by:		Dexuan Cui <decui microsoft com>
Reviewed by:		Jun Su <junsu microsoftc com>, me, adrian
Modified by:		me (minor)
Original differential:	https://reviews.freebsd.org/D4611
Differential Revision:	https://reviews.freebsd.org/D4852
Approved by:		adrian (mentor)
Sponsored by:		Microsoft OSTC
2016-01-12 01:41:34 +00:00
sephe
b493955d70 hyperv/hn: Implement LRO
- Implement the LRO using tcp_lro APIs, and LRO is enabled by default.
- Add several stats sysctl nodes.
- Check IP/TCP length before sending the packet to tcp_lro_rx(), if host
  does not provide RX csum information (*); and add an option through
  sysctl to always trust host TCP segment csum checks (default is off).
- Add sysctl to control the LRO entry depth; it is disabled by default.
  It is used to avoid holding too much TCP segments in driver.  Limiting
  the LRO entry depth helps a lot in a one/two streams RX test.

This one 3x the RX performance on my local test (3Gbps -> 10Gbps), and
~2x the RX performance over a directly connected 40Ge network (5Gbps ->
9Gbps).

(*) It seems the host stops supplying csum information, once the network
load is high.  This still needs investigation...

Reviewed by:		Hongjiang Zhang <honzhan microsoft com>,
			Dexuan Cui <decui microsoft com>,
			Jun Su <junsu microsoft com>,
			delphij
Tested by:		me (local),
			Hongjiang Zhang <honzhan microsoft com>
			(directly connected 40Ge)
Approved by:		delphij (mentor), adrian (mentor, no objection)
With feedback from:	delphij, Hongjiang Zhang <honzhan microsoft com>
Sponsored by:		Microsoft OSTC
Differential Revision:	https://reviews.freebsd.org/D4824
2016-01-12 01:30:51 +00:00
delphij
850d0994e4 hyperv: vmbus: run non-blocking message handlers in vmbus_msg_swintr()
We'll remove the per-channel control_work_queue because it can't properly
do serialization of message handling, e.g., when there are 2 NIC devices,
vmbus_channel_on_offer() -> hv_queue_work_item() has a race condition:
for an SMP VM, vmbus_channel_process_offer() can run concurrently on
different CPUs and if the second NIC's
vmbus_channel_process_offer() -> hv_vmbus_child_device_register() runs
first, the second NIC's name will be hn0 and the first NIC's name will
be hn1!

We can fix the race condition by removing the per-channel control_work_queue
and run all the message handlers in the global
hv_vmbus_g_connection.work_queue -- we'll do this in the next patch.

With the coming next patch, we have to run the non-blocking handlers
directly in the kernel thread vmbus_msg_swintr(), because the special
handling of sub-channel: when a sub-channel (e.g., of the storvsc driver)
is received and being handled in vmbus_channel_on_offer() running on the
global hv_vmbus_g_connection.work_queue, vmbus_channel_process_offer()
invokes channel->sc_creation_callback, i.e., storvsc_handle_sc_creation,
and the callback will invoke hv_vmbus_channel_open() -> hv_vmbus_post_message
and expect a further reply from the host, but the handling of the further
messag can't be done because the current message's handling hasn't finished
yet; as result, hv_vmbus_channel_open() -> sema_timedwait() will time out
and th device can't work.

Also renamed the handler type from hv_pfn_channel_msg_handler to
vmbus_msg_handler: the 'pfn' and 'channel' in the old name make no sense.

Submitted by:	Dexuan Cui <decui microsoft com>
Reviewed by:	royger
MFC after:	2 weeks
Differential Revision:	https://reviews.freebsd.org/D4596
2015-12-29 08:19:43 +00:00
delphij
1e469c5590 hyperv: vmbus: remove the per-channel control_work_queue
Now vmbus_channel_on_offer() -> vmbus_channel_process_offer() can
safely run on the global hv_vmbus_g_connection.work_queue now.

We remove the per-channel control_work_queue to achieve the proper
serialization of the message handling.

I removed the bogus TODO in vmbus_channel_on_offer(): a vmbus offer
can only come from the parent partition, i.e., the host.

PR:		kern/205156
Submitted by:	Dexuan Cui <decui microsoft com>
Reviewed by:	Howard Su <howard0su gmail com>, delphij
MFC after:	2 weeks
Differential Revision:	https://reviews.freebsd.org/D4597
2015-12-29 07:54:55 +00:00
ngie
41d598fddd Remove redundant vmbus_select_outgoing_channel declaration already handled
in include/hyperv.h

This unbreaks the gcc 4.2.1 kernel build of hyperv

Differential Revision: https://reviews.freebsd.org/D4684
MFC after: 3 days
Reviewed by: royger
Sponsored by: EMC / Isilon Storage Division
2015-12-23 17:37:30 +00:00
royger
7330ea8733 hyperv/kvp: wake up the daemon if it's sleeping due to poll()
Without the patch, there is a race condition: when poll() is invoked(),
if kvp_globals.daemon_busy is false, the daemon won't be timely
woke up, because hv_kvp_send_msg_to_daemon() can't wake up the daemon
in this case.

Submitted by:           Dexuan Cui <decui@microsoft.com>
Sponsored by:		Microsoft OSTC
Reviewed by:		delphij, royger
MFC after:		3 days
Differential Revision:	https://reviews.freebsd.org/D4258
2015-12-15 11:20:20 +00:00
whu
a7bfea84af Ignore the inbound checksum flags when doing packet forwarding in netvsc driver.
PR: 20363
Submitted by: whu
Reviewed by: royger, whu
Approved by: royger
MFC after: 1 week
Relnotes: No
Sponsored by: Microsoft OSTC
Differential Revision:  https://reviews.freebsd.org/D4131
2015-11-22 05:26:13 +00:00
whu
eb8aae544f Do not enable UDP checksum offloading when running on the Hyper-V on
Windows Server 2012 and earlier hosts.

Submitted by: whu
Reviewed by: royger
Approved by: royger
MFC after: 3 days
Relnotes: No
Sponsored by: Microsoft OSTC
Differential Revision:  https://reviews.freebsd.org/D3086
2015-07-22 05:05:01 +00:00
bz
69a4e9c704 Fix compilation without INET6 and without INET and INET6 after
offload support was introduced in r284746.

While here also fix the ioctl() handler for IPv4 added in r279819,
which was never compiled in given opt_inet.h was not included.
2015-06-27 12:37:09 +00:00
whu
71f5e47790 TSO and checksum offloading support for Netvsc driver on Hyper-V.
Submitted by:	whu
Reviewed by:	royger
Approved by:	royger
MFC after:	1 week
Relnotes:	yes
Sponsored by:	Microsoft OSTC
Differential Revision:	https://reviews.freebsd.org/D2517
2015-06-24 06:01:29 +00:00
jkim
318c4f97e6 CALLOUT_MPSAFE has lost its meaning since r141428, i.e., for more than ten
years for head.  However, it is continuously misused as the mpsafe argument
for callout_init(9).  Deprecate the flag and clean up callout_init() calls
to make them more consistent.

Differential Revision:	https://reviews.freebsd.org/D2613
Reviewed by:	jhb
MFC after:	2 weeks
2015-05-22 17:05:21 +00:00
whu
e1077ab7dc Add support for SCSI disk hot add and remove. Also add padding according to
the requirement of different hypervisor releases.

Submitted by:	whu
Reviewed by:	royger
Approved by:	royger
MFC after:	1 week
Sponsored by:	Microsoft OSTC
Differential Revision:	https://reviews.freebsd.org/D2512
2015-05-18 10:31:23 +00:00
whu
0b81a49657 Microsoft vmbus, storage and other related driver enhancements for HyperV.
- Vmbus multi channel support.
    - Vector interrupt support.
    - Signal optimization.
    - Storvsc driver performance improvement.
    - Scatter and gather support for storvsc driver.
    - Minor bug fix for KVP driver.
Thanks royger, jhb and delphij from FreeBSD community for the reviews
and comments. Also thanks Hovy Xu from NetApp for the contributions to
the storvsc driver.

PR:     195238
Submitted by:   whu
Reviewed by:    royger, jhb, delphij
Approved by:    royger
MFC after:      2 weeks
Relnotes:       yes
Sponsored by:   Microsoft OSTC
2015-04-29 10:12:34 +00:00
delphij
6ec28e3ef5 Fix CARP when in use in a HyperV environment:
- Bump link state when stopping or starting the interface;
 - Don't handle SIOCGIFADDR specially, similar to r277103.

This change is based on a previous revision from Andy Zhang
(Microsoft) who did the diagnostic work and many thanks to
them for their help in supporting the HyperV work.

PR:		kern/187203
MFC after:	2 weeks
2015-03-09 20:11:16 +00:00
smh
dd63bf99a2 Prevent overflow issues in timeout processing
Previously, any timeout value for which (timeout * hz) will overflow the
signed integer, will give weird results, since callout(9) routines will
convert negative values of ticks to '1'. For unsigned integer overflow we
will get sufficiently smaller timeout values than expected.

Switch from callout_reset, which requires conversion to int based ticks
to callout_reset_sbt to avoid this.

Also correct isci to correctly resolve ccb timeout.

This was based on the original work done by Eygene Ryabinkin
<rea@freebsd.org> back in 5 Aug 2011 which used a macro to help avoid
the overlow.

Differential Revision:	https://reviews.freebsd.org/D1157
Reviewed by:	mav, davide
MFC after:	1 month
Sponsored by:	Multiplay
2014-11-21 21:01:24 +00:00
glebius
6306f79560 Remove struct arpcom. It is unused by most interface types, that allocate
it, except Ethernet, where it carried ng_ether(4) pointer.
For now carry the pointer in if_l2com directly.

Sponsored by:	Netflix
Sponsored by:	Nginx, Inc.
2014-11-07 15:14:10 +00:00
delphij
df9d3cfbe5 Return BUS_PROBE_DEFAULT instead of BUS_PROBE_VENDOR or 0 for in-tree
driver.  This change was verified by Microsoft.
2014-10-24 06:27:45 +00:00
gjb
7f34195e3d Fix an issue where a FreeBSD virtual machine provisioned in
the Microsoft Azure service does not recognize the second
attached disk on the system.

Submitted by:	kyliel@Microsoft
Patched by:	weh@Microsoft
PR:		194376
MFC after:	3 days
X-MFC-10.1:	yes, ASAP
Sponsored by:	The FreeBSD Foundation
2014-10-21 19:36:20 +00:00
glebius
b3b337a80e Mechanically convert to if_inc_counter(). 2014-09-19 03:51:26 +00:00
delphij
edc7ea3a5d Import HyperV Key-Value Pair (KVP) driver and daemon code by Microsoft,
many thanks for their continued support of FreeBSD.

While I'm there, also implement a new build knob, WITHOUT_HYPERV to
disable building and installing of the HyperV utilities when necessary.

The HyperV utilities are only built for i386 and amd64 targets.

This is a stable/10 candidate for inclusion with 10.1-RELEASE.

Submitted by:	Wei Hu <weh microsoft com>
MFC after:	1 week
2014-09-13 02:15:31 +00:00
glebius
9b93b159b3 Use define from if_var.h to access a field inside struct if_data,
that resides in struct ifnet.

Sponsored by:	Nginx, Inc.
2014-08-30 19:55:54 +00:00
imp
4ba44b5d1b Make some unwise casts. On i386 these casts wind up being safe. Rather
than disturb the API, go with these casts to shut gcc up.
2014-04-05 22:42:00 +00:00
delphij
0d3fbce5f2 Hide a few messages under bootverbose.
Reviewed by:	Abhishek Gupta
MFC after:	2 weeks
2014-03-14 00:47:46 +00:00
mav
5b661a59bd Minor fix to r262789.
MFC after:	6 days
2014-03-06 12:37:25 +00:00
mav
e91188eee2 Remove custom bus scanner code and fix use of CAM's default scanner.
This fixes kernel panic during boot, caused by incompatibility of recent
CAM locking changes and this bus scanner code.

Submitted by:	Microsoft
MFC after:	1 week
2014-03-05 16:42:33 +00:00
glebius
529a68392a Another round of removing historical mbuf(9) allocator flags.
They are breeding! New ones arouse since last round.

Sponsored by:	Nginx, Inc.
2014-01-16 13:44:47 +00:00
pjd
fe4a871aa6 Fix missing new line after:
Netvsc initializing...

during boot.
2013-12-10 17:16:13 +00:00
delphij
ebba0d128d Don't reference pointer before testing whether it is
NULL.

Submitted by:	Clement Lecigne <clecigne google com>
Reviewed by:	grehan
MFC after:	3 days
2013-10-29 22:42:30 +00:00
nwhitehorn
3e9a158ce7 More BUS_PROBE_NOWILDCARD sweeping. Some devices here (if_ath_ahb and siba)
resist easy conversion since they implement a great deal of their attach
logic inside probe(). Some of this could be fixed by moving it to attach(),
but some requires something more subtle than BUS_PROBE_NOWILDCARD.
2013-10-29 14:19:42 +00:00
glebius
f469ae1d45 Include necessary headers that now are available due to pollution
via if_var.h.

Sponsored by:	Netflix
Sponsored by:	Nginx, Inc.
2013-10-28 07:29:16 +00:00
gibbs
c8b87aac87 Centralize the detection logic for the Hyper-V hypervisor.
Submitted by:	Roger Pau Monné
Sponsored by:	Citrix Systems R&D
Reviewed by:	gibbs, grehan
Approved by:	re (gjb)

sys/sys/systm.h:
 * Add a new VM_GUEST type, VM_GUEST_HV (HyperV guest).

sys/dev/hyperv/vmbus/hv_vmbus_drv_freebsd.c:
sys/dev/hyperv/vmbus/hv_hv.c:
sys/dev/hyperv/stordisengage/hv_ata_pci_disengage.c:
 * Set vm_guest to VM_GUEST_HV and use that on other HyperV related
   devices instead of cloning the cpuid hypervisor check.
 * Cleanup the vmbus_identify function.
2013-10-13 02:41:30 +00:00
grehan
e4009f5516 Fix a lock-order reversal in the net driver by dropping the lock
and holding a reference prior to calling further into the hyperv
stack.

Added missing FreeBSD idents.

Submitted by:	Microsoft hyperv dev team
Approved by:	re@ (gjb)
2013-10-12 00:32:34 +00:00
grehan
35b7675f72 Fix vmbus channel memory leak where incorrect length parameter was
being passed to contigfree().

Submitted by:	Microsoft hyperv dev team
Approved by:	re@ (glebius)
2013-10-11 21:30:27 +00:00
grehan
7abe80694b Allow the legacy CDROM device to be accessed in a FreeBSD guest, while
still using enlightened drivers for other block devices.

Submitted by:	Microsoft hyperv dev team, mav@
Approved by:	re@
2013-10-10 22:46:49 +00:00
dim
ee09e09e18 In sys/dev/hyperv, fix a number of gcc warnings about usage of anonymous
union members in strict C99, by giving them names.  While here, add some
FreeBSD keywords where they were missing.

Approved by:	re (gjb)
Reviewed by:	grehan
2013-10-10 16:25:53 +00:00
gibbs
716c2031c7 Correct panic caused by attaching both Xen PV and HyperV virtualization
aware drivers on Xen hypervisors that advertise support for some
HyperV features.

x86/xen/hvm.c:
	When running in HVM mode on a Xen hypervisor, set vm_guest
	to VM_GUEST_XEN so other virtualization aware components in
	the FreeBSD kernel can detect this mode is active.

dev/hyperv/vmbus/hv_hv.c:
	Use vm_guest to ignore Xen's HyperV emulation when Xen is
	detected and Xen PV drivers are active.

Reported by:	Shanker Balan
Submitted by:	Roger Pau Monné
Sponsored by:	Citrix Systems R&D
Reviewed by:	gibbs
Approved by:	re (Xen blanket)
2013-10-05 19:51:09 +00:00
grehan
0804d47dcd Reorder the hypervisor presence test to avoid claiming ATA disks
on non hyperv systems.

Reviewed by:	neel, abgupta at microsoft dot com
Approved by:	re@ (hrs)
2013-09-19 02:34:52 +00:00
grehan
8294851167 Revert the kvp code - there's still some work that
needs to be done for that.

Discussed with:	Microsoft hyper-v devs
2013-09-09 19:27:44 +00:00
grehan
182a56e295 Latest update from Microsoft.
Obtained from:	Microsoft Hyper-v dev team
2013-09-09 08:07:46 +00:00
grehan
fd121a9c7b IFC @ r253862
- change the SI_SUB_RUN_SCHEDULER sysinits in hv_utilc and
hv_netvsc_drv_freebsd.c to SI_SUB_KTHREAD_IDLE, since the
former is no longer in FreeBSD.
  The use of these SYSINITs can probably be removed.
2013-08-01 22:09:57 +00:00
grehan
c8195f5331 Microsoft have changed their policy on how the hyper-v code will
be pulled into FreeBSD. From now, FreeBSD will be considered the
upstream repo.

First step: move the drivers away from the contrib area and into
the base system.

A follow-on commit will include the drivers in the amd64 GENERIC kernel.
2013-07-17 06:30:23 +00:00