302 Commits

Author SHA1 Message Date
Alexander Motin
fdd86701e5 Don't set bits that should be zero for SATA devices.
The old value made Linux treat the device as a PATA device behind a SATA bridge.

MFC after:	2 weeks
2015-04-20 19:11:27 +00:00
Alexander Motin
910280e539 Report link as up if tap device is not specified (black hole).
MFC after:	2 weeks
2015-04-20 14:55:01 +00:00
Alexander Motin
f2c58daab8 Report link as up only if we managed to open tap device.
It would be cool to report tap device status, but it has no such API.

MFC after:	2 weeks
2015-04-20 14:23:18 +00:00
Alexander Motin
d9a6698393 Disable RX/TX queues notifications when not needed.
This reduces CPU load and doubles iperf throughput, reaching 2-3Gbit/s.

MFC after:	2 weeks
Sponsored by:	iXsystems, Inc.
2015-04-20 10:29:42 +00:00
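The d9a6698393 message does not say how the notifications are suppressed. As a rough illustration only: the legacy virtio ring layout lets the device side set VRING_USED_F_NO_NOTIFY in the used ring flags to ask the guest not to kick a queue that is already being drained. The struct and function names below are hypothetical and the ring is reduced to two counters, so this is a sketch of the general pattern under those assumptions, not the code from the commit.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Flag value from the legacy virtio ring layout. */
#define VRING_USED_F_NO_NOTIFY 1

/* Hypothetical, heavily reduced view of one queue. */
struct vq_sketch {
	uint16_t used_flags;	/* flags the device side publishes to the guest */
	uint16_t avail_idx;	/* index the guest has published up to */
	uint16_t last_avail;	/* index the device side has consumed up to */
};

/* Drain everything pending and return how many entries were handled. */
unsigned
vq_drain(struct vq_sketch *vq)
{
	unsigned handled = 0;

	assert(vq != NULL);
	do {
		/* Busy: the guest need not kick us for new entries. */
		vq->used_flags |= VRING_USED_F_NO_NOTIFY;
		while (vq->last_avail != vq->avail_idx) {
			vq->last_avail++;	/* stand-in for real processing */
			handled++;
		}
		/* Going idle: ask for kicks again (real code needs a barrier). */
		vq->used_flags &= (uint16_t)~VRING_USED_F_NO_NOTIFY;
		/* Re-check for entries added just before the flag was cleared. */
	} while (vq->last_avail != vq->avail_idx);

	return (handled);
}
```

Fewer guest kicks mean fewer VM exits, which is consistent with the lower CPU load the message reports.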
Alexander Motin
bb1524af0c Work around a problem with bhyve virtual disks on top of GEOM providers.
GEOM does not support scatter/gather lists in its I/Os.  Such requests
are cut into pieces by physio(), which may be problematic if those pieces
are not multiples of the provider's sector size.  If such a case is detected,
move the data through a temporary sequential buffer.

MFC after:	2 weeks
2015-04-18 20:10:19 +00:00
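The bb1524af0c workaround can be pictured roughly as follows: check whether every scatter/gather segment is a whole number of sectors, and if not, copy the data into one contiguous buffer before issuing the I/O. The helper names iov_sector_aligned() and iov_gather() are hypothetical and are not taken from the bhyve source; this is a minimal sketch of the write direction only.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>
#include <sys/uio.h>

/* Return 1 if every segment length is a multiple of sectsz, else 0. */
int
iov_sector_aligned(const struct iovec *iov, int iovcnt, size_t sectsz)
{
	assert(sectsz != 0);
	for (int i = 0; i < iovcnt; i++)
		if (iov[i].iov_len % sectsz != 0)
			return (0);
	return (1);
}

/*
 * Copy all segments into one freshly allocated contiguous buffer.
 * The total length is stored in *lenp; the caller frees the result.
 */
void *
iov_gather(const struct iovec *iov, int iovcnt, size_t *lenp)
{
	size_t total = 0, off = 0;
	char *buf;

	for (int i = 0; i < iovcnt; i++)
		total += iov[i].iov_len;
	buf = malloc(total != 0 ? total : 1);
	if (buf == NULL)
		return (NULL);
	for (int i = 0; i < iovcnt; i++) {
		memcpy(buf + off, iov[i].iov_base, iov[i].iov_len);
		off += iov[i].iov_len;
	}
	*lenp = total;
	return (buf);
}
```

A caller would use the original iovec when iov_sector_aligned() returns 1 and fall back to the gathered buffer otherwise; a read would do the reverse copy after the I/O completes.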
Alexander Motin
0990a33089 Make virtual AHCI more careful with I/O lengths.
MFC after:	2 weeks
2015-04-17 20:20:55 +00:00
Neel Natu
77afcadd51 If the number of guest vcpus is less than '1' then flag it as an error.
MFC after:	1 week
2015-04-16 20:11:49 +00:00
Tycho Nightingale
3b65fbe4d1 Prior to aborting due to an ioport error, it is always interesting to
see what the guest's %rip is.

Reviewed by:	grehan
2015-04-15 18:49:03 +00:00
Baptiste Daroussin
ea4a4d8a2e Fix overlinking in bhyve:
libvmmapi is what actually needs to be linked against libutil, not bhyve or bhyveload.
2015-04-09 21:38:40 +00:00
Tycho Nightingale
703e4974aa Prior to aborting due to an instruction emulation error, it is always
interesting to see what the guest's %rip and instruction bytes are.

Reviewed by:	grehan
2015-04-01 20:36:07 +00:00
Peter Grehan
fed2d5edfc Move legacy interrupt allocation for virtio devices to common code.
There are a number of assumptions about legacy interrupts always
being available in virtio so don't allow back-ends to make the
decision to support them.

This fixes the issue seen with virtio-rnd on OpenBSD. MSI-x vectors
were not being used, and the virtio-rnd backend wasn't allocating a
legacy interrupt resulting in a bhyve assert and guest exit.

Reported by:	Julian Hsiao, madoka at nyanisore dot net
Reviewed by:	neel
MFC after:	1 week
2015-03-27 01:58:44 +00:00
Alexander Motin
8187174a9b Add missing variable initialization.
Reported by:	Coverity
CID:		1288938
MFC after:	3 days
2015-03-20 16:05:13 +00:00
Alexander Motin
cb5c792950 Report that we may have write cache, and that we do support FLUSH.
The FreeBSD guest driver does not use that legacy flag, but Linux seems to.

MFC after:	2 weeks
2015-03-16 20:13:25 +00:00
Alexander Motin
54b7bb7626 Increase S/G list size from 32 to 33 entries.
32 entries are not enough for the worst case of a misaligned 128KB request,
which made FreeBSD chunk large requests into odd pieces.

MFC after:	2 weeks
2015-03-16 09:15:59 +00:00
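The arithmetic behind the 54b7bb7626 change can be checked with a small helper. It assumes 4 KB pages and one S/G entry per page touched, neither of which the message states explicitly; sg_segments() is a hypothetical name used only for this sketch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Number of page-sized segments touched by a buffer of len bytes that
 * starts at byte address addr.  page_size must be a power of two.
 */
size_t
sg_segments(uint64_t addr, size_t len, size_t page_size)
{
	uint64_t first, last;

	assert(page_size != 0 && (page_size & (page_size - 1)) == 0);
	if (len == 0)
		return (0);
	first = addr / page_size;		/* page holding the first byte */
	last = (addr + len - 1) / page_size;	/* page holding the last byte */
	return ((size_t)(last - first + 1));
}
```

Under those assumptions an aligned 128 KB buffer touches 32 pages, while the same buffer starting at any non-page-aligned address spills into a 33rd.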
Alexander Motin
e365f36c32 Pre-allocate one extra request per processing thread.
Processing threads call callbacks before freeing requests.  As a result,
new requests may arrive before old ones are freed.

MFC after:	2 weeks
2015-03-15 22:44:53 +00:00
Alexander Motin
811a355f1a According to Linux and QEMU, an s/n as long as the buffer is not zero-terminated.
This makes the same s/n be reported for both virtio and AHCI drivers.

MFC after:	2 weeks
2015-03-15 17:45:16 +00:00
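The fixed-width field semantics described in 811a355f1a match what strncpy() does: shorter strings are NUL-padded, and a string exactly as long as the field fills it with no terminator. The 20-byte width and the function names below are assumptions made for this sketch, not values quoted from the commit.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define SERIAL_FIELD_LEN 20	/* assumed field width, for illustration */

/* Fill the fixed-width field; no terminator if sn fills it completely. */
void
fill_serial(char field[SERIAL_FIELD_LEN], const char *sn)
{
	assert(sn != NULL);
	/* strncpy pads with NULs when sn is short, truncates when it is long. */
	strncpy(field, sn, SERIAL_FIELD_LEN);
}

/* Length of the serial stored in the field, never reading past its end. */
size_t
serial_len(const char field[SERIAL_FIELD_LEN])
{
	const char *nul = memchr(field, '\0', SERIAL_FIELD_LEN);

	return (nul != NULL ? (size_t)(nul - field) : SERIAL_FIELD_LEN);
}
```

A reader of such a field has to bound its scan by the field width, as serial_len() does, rather than rely on a terminator being present.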
Alexander Motin
f2e62de7d9 Close potential race on blockif_close().
Reported by:	vangyzen
MFC after:	2 weeks
2015-03-15 16:18:03 +00:00
Alexander Motin
7315946b80 Fix networking problem after r280026.
I missed that the network driver sometimes returns a taken request back to
the available queue without processing it.  Add a new helper function for that case.

Reported by:	flo
MFC after:	2 weeks
2015-03-15 16:09:39 +00:00
Alexander Motin
e72d4950e1 Give AHCI disk serial based on backing file path same as for virtio block.
It is still not good that they may intersect on different hosts, but that
is better than intersecting on the same host.

MFC after:	2 weeks
2015-03-15 15:29:03 +00:00
Alexander Motin
066a8f1411 Rewrite virtio block device driver to work asynchronously and use the block
I/O interface.

Asynchronous operation, based on the r280026 change, avoids blocking the virtual
CPU during I/O processing, which on slow/busy storage can take seconds.
Use of the recently improved block I/O interface allows processing multiple
requests at the same time, which improves random I/O performance on wide storage.

Benchmarks of virtual disk, backed by ZVOL on RAID10 pool of 4 HDDs, show
~3.5 times random read performance improvements, while no degradation on
linear I/O.  Guest CPU usage during test dropped from 100% to almost zero.

MFC after:	2 weeks
2015-03-15 14:57:11 +00:00
Alexander Motin
fdb7e97f87 Modify virtqueue helpers added in r253440 to allow queuing.
The original virtqueue design allows queued and out-of-order processing, but
the helpers added in r253440 assume only direct, blocking, in-order processing.
That could be fine for network, etc., but it is a huge limitation for storage
devices.
2015-03-15 11:37:07 +00:00
Alexander Motin
7e8e553940 Block delete capability for read-only devices.
Submitted by:	neel
MFC after:	2 weeks
2015-03-15 08:09:56 +00:00
Alexander Motin
79565afed8 Give block I/O interface multiple (8) execution threads.
On parallel random I/O this allows better utilization of wide storage pools.
To avoid confusing the prefetcher on linear I/O, consecutive requests are executed
sequentially, following the same logic as was earlier implemented in CTL.

Benchmarks of virtual AHCI disk, backed by ZVOL on RAID10 pool of 4 HDDs,
show ~3.5 times random read performance improvements, while no degradation
on linear I/O.

MFC after:	2 weeks
2015-03-14 21:15:45 +00:00
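One way to picture the "consecutive requests are executed sequentially" rule from 79565afed8 is a check made when a request is queued: if it starts exactly where a pending or in-flight request ends, it waits for that request instead of going to another thread. The struct and function names are hypothetical, and the real bhyve and CTL logic may differ in detail.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical minimal description of a block request. */
struct blk_req {
	uint64_t off;	/* starting byte offset */
	uint64_t len;	/* length in bytes */
};

/*
 * Return true if next begins exactly at the end of any request that is
 * already queued or in flight, i.e. it continues a linear stream.
 */
bool
must_wait(const struct blk_req *busy, size_t nbusy, const struct blk_req *next)
{
	assert(next != NULL);
	for (size_t i = 0; i < nbusy; i++)
		if (busy[i].off + busy[i].len == next->off)
			return (true);
	return (false);
}
```

Random requests never match this test and can spread across all worker threads, while a linear stream stays ordered so the backing store's prefetcher still sees sequential access.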
Alexander Motin
df57ec4933 Add checksums to identify data and NCQ command error log.
MFC after:	2 weeks
2015-03-14 14:06:37 +00:00
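For the identify data checksum mentioned in df57ec4933, the ATA layout as I understand it puts the signature 0xA5 in byte 510 and a checksum in byte 511 chosen so that all 512 bytes sum to zero modulo 256. The function names are hypothetical and the sketch covers only the identify block; the commit's actual code is not reproduced here.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define IDENTIFY_SIZE 512

/* Write the signature and checksum bytes into a 512-byte identify block. */
void
ata_set_checksum(uint8_t buf[IDENTIFY_SIZE])
{
	uint8_t sum = 0;

	assert(buf != NULL);
	buf[510] = 0xA5;			/* integrity word signature */
	for (size_t i = 0; i < 511; i++)
		sum += buf[i];
	buf[511] = (uint8_t)(0x100 - sum);	/* two's complement of the sum */
}

/* Return 1 if the signature is present and all bytes sum to zero. */
int
ata_checksum_ok(const uint8_t buf[IDENTIFY_SIZE])
{
	uint8_t sum = 0;

	for (size_t i = 0; i < IDENTIFY_SIZE; i++)
		sum += buf[i];
	return (buf[510] == 0xA5 && sum == 0);
}
```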
Alexander Motin
b441dabf7e Slightly polish virtual AHCI CD reporting.
MFC after:	2 weeks
2015-03-14 12:18:26 +00:00
Alexander Motin
fb329df8e4 Fix NOP and IDLE commands for virtual AHCI disks.
MFC after:	2 weeks
2015-03-14 10:38:25 +00:00
Alexander Motin
1fcb801948 Add support for NCQ variant of DSM TRIM for virtual AHCI disks.
The code is not really tested yet due to lack of initiator support.

Requested by:	imp
MFC after:	2 weeks
2015-03-14 09:46:43 +00:00
Alexander Motin
9009f43407 Improve NCQ errors reporting for virtual AHCI disks.
While this implementation is still not perfect, the previous one was simply broken.

MFC after:	2 weeks
2015-03-14 08:45:54 +00:00
Alexander Motin
dcd0c998a9 Remove incorrect SERR register setting.
At this point we have nothing to report through that register.

MFC after:	2 weeks
2015-03-13 21:01:25 +00:00
Alexander Motin
9463f47b3a Change prdbc value reporting.
MFC after:	2 weeks
2015-03-13 20:56:17 +00:00
Alexander Motin
295e61d6a3 Polish AHCI disk identify data and fix speed negotiation.
MFC after:	2 weeks
2015-03-13 20:14:35 +00:00
Alexander Motin
5f6b63de7a Add support for PIO variants of READ/WRITE commands for AHCI disks.
AHCI API hides all PIO specifics, so this functionality is almost free.

MFC after:	2 weeks
2015-03-13 18:35:38 +00:00
Alexander Motin
f7c5bc2cfe Use ahci_write_fis_d2h() for commands completion.
MFC after:	2 weeks
2015-03-13 18:04:07 +00:00
Alexander Motin
0b9d25c935 Add DSM TRIM command support for virtual AHCI disks.
It works only for virtual disks backed by ZVOLs and raw devices supporting
BIO_DELETE.  Virtual disks backed by files won't report this capability.

MFC after:	2 weeks
Relnotes:	yes
2015-03-13 16:43:52 +00:00
Alexander Motin
f5f4836d62 Add variable initialization missed by me and clang.
Reported by:	grehan
MFC after:	2 weeks
2015-03-05 20:29:18 +00:00
Alexander Motin
371f1d88b6 Fix error translation broken in r279658.
Reported by:	grehan
MFC after:	2 weeks
2015-03-05 20:24:34 +00:00
Alexander Motin
2d678f1f4f Implement cache flush for ahci-hd and for virtio-blk over device.
MFC after:	2 weeks
2015-03-05 15:29:18 +00:00
Alexander Motin
d951589ddb Add check for absent stripe size to r279652.
MFC after:	2 weeks
2015-03-05 13:52:30 +00:00
Alexander Motin
94682383d9 Report logical/physical sector sizes for virtual SATA disk.
MFC after:	2 weeks
2015-03-05 12:21:12 +00:00
Alexander Motin
297c4868dd Add support for TOPOLOGY feature of virtio block device.
Passing through physical block size/offset from underlying storage allows
guest to manage proper data and I/O alignment to improve performance.

MFC after:	2 weeks
2015-03-05 10:40:45 +00:00
Neel Natu
12f91c70a3 Emulate MSR 0xC0011024 when running on AMD processors.
OpenBSD guests test bit 0 of this MSR to detect whether the workaround for
erratum 721 has been applied.

Reported by:	Jason Tubnor (jason@tubnor.net)
MFC after:	1 week
2015-02-24 05:15:40 +00:00
Neel Natu
c974767896 Add "-u" option to bhyve(8) to indicate that the RTC should maintain UTC time.
The default remains localtime for compatibility with the original device model
in bhyve(8). This is required for OpenBSD guests which assume that the RTC
keeps UTC time.

Reviewed by:	grehan
Pointed out by:	Jason Tubnor (jason@tubnor.net)
MFC after:	2 weeks
2015-02-24 02:04:16 +00:00
Peter Grehan
65392c66a5 Don't close a block context if it couldn't be opened,
for example if the backing file doesn't exist,
avoiding a null deref.

Reviewed by:	neel
MFC after:	1 week.
2015-02-23 22:31:39 +00:00
Neel Natu
d087a39935 Simplify instruction restart logic in bhyve.
Keep track of the next instruction to be executed by the vcpu as 'nextrip'.
As a result the VM_RUN ioctl no longer takes the %rip where a vcpu should
start execution.

Also, instruction restart happens implicitly via 'vm_inject_exception()' or
explicitly via 'vm_restart_instruction()'. The APIs behave identically in
both kernel and userspace contexts. The main beneficiary is the instruction
emulation code that executes in both contexts.

bhyve(8) VM exit handlers now treat 'vmexit->rip' and 'vmexit->inst_length'
as readonly:
- Restarting an instruction is now done by calling 'vm_restart_instruction()'
  as opposed to setting 'vmexit->inst_length' to 0 (e.g. emulate_inout())
- Resuming vcpu at an arbitrary %rip is now done by setting VM_REG_GUEST_RIP
  as opposed to changing 'vmexit->rip' (e.g. vmexit_task_switch())

Differential Revision:	https://reviews.freebsd.org/D1526
Reviewed by:		grehan
MFC after:		2 weeks
2015-01-18 03:08:30 +00:00
Neel Natu
0dafa5cd4b Replace bhyve's minimal RTC emulation with a fully featured one in vmm.ko.
The new RTC emulation supports all interrupt modes: periodic, update ended
and alarm. It is also capable of maintaining the date/time and NVRAM contents
across virtual machine reset. Also, the date/time fields can now be modified
by the guest.

Since bhyve now emulates both the PIT and the RTC there is no need for
"Legacy Replacement Routing" in the HPET so get rid of it.

The RTC device state can be inspected via bhyvectl as follows:
bhyvectl --vm=vm --get-rtc-time
bhyvectl --vm=vm --set-rtc-time=<unix_time_secs>
bhyvectl --vm=vm --rtc-nvram-offset=<offset> --get-rtc-nvram
bhyvectl --vm=vm --rtc-nvram-offset=<offset> --set-rtc-nvram=<value>

Reviewed by:	tychon
Discussed with:	grehan
Differential Revision:	https://reviews.freebsd.org/D1385
MFC after:	2 weeks
2014-12-30 22:19:34 +00:00
Baptiste Daroussin
c6db8143ed Convert usr.sbin to LIBADD
Reduce overlinking
2014-11-25 16:57:27 +00:00
Edward Tomasz Napierala
aca4343c62 Fix improper .Fx macro usage.
Differential Revision:	https://reviews.freebsd.org/D1158
Reviewed by:	wblock@
MFC after:	1 month
Sponsored by:	The FreeBSD Foundation
2014-11-19 18:19:21 +00:00
Tycho Nightingale
48a9d8f214 To allow a request to be submitted from within the callback routine of
a completing one, increase the total by 1 but don't advertise it.

Reviewed by:	grehan
2014-11-09 21:08:52 +00:00
Tycho Nightingale
ae45750d6c Improve the ability to cancel an in-flight request by using an
interrupt, via SIGCONT, to force the read or write system call to
return prematurely.

Reviewed by:	grehan
2014-11-04 01:06:33 +00:00
Tycho Nightingale
26bf96112b If the start bit, PxCMD.ST, is cleared and nothing is in-flight then
PxCI, PxSACT, PxCMD.CCS and PxCMD.CR should be 0.

Reviewed by:	grehan
2014-11-03 12:55:31 +00:00
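The rule stated in 26bf96112b can be sketched against a reduced port register set. The bit positions follow the AHCI PxCMD layout as I recall it (ST in bit 0, CCS in bits 12:8, CR in bit 15); the struct, macro and function names are hypothetical and are not the identifiers used in bhyve.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define P_CMD_ST	0x00000001u	/* start bit */
#define P_CMD_CCS_MASK	0x00001f00u	/* current command slot */
#define P_CMD_CR	0x00008000u	/* command list running */

/* Hypothetical reduced view of one AHCI port. */
struct port_sketch {
	uint32_t cmd;	/* PxCMD */
	uint32_t ci;	/* PxCI */
	uint32_t sact;	/* PxSACT */
};

/* Clear the state that must read as 0 once the port is stopped and idle. */
void
port_stop_cleanup(struct port_sketch *p, unsigned inflight)
{
	assert(p != NULL);
	if ((p->cmd & P_CMD_ST) == 0 && inflight == 0) {
		p->ci = 0;
		p->sact = 0;
		p->cmd &= ~(P_CMD_CR | P_CMD_CCS_MASK);
	}
}
```

The registers are left untouched while ST is still set or while any command is in flight, which matches the two conditions in the message.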