292 Commits

Author SHA1 Message Date
tychon
0b17a7a512 Prior to aborting due to an instruction emulation error, it is always
interesting to see what the guest's %rip and instruction bytes are.

Reviewed by:	grehan
2015-04-01 20:36:07 +00:00
grehan
d96cec83fc Move legacy interrupt allocation for virtio devices to common code.
There are a number of assumptions about legacy interrupts always
being available in virtio so don't allow back-ends to make the
decision to support them.

This fixes the issue seen with virtio-rnd on OpenBSD. MSI-x vectors
were not being used, and the virtio-rnd backend wasn't allocating a
legacy interrupt resulting in a bhyve assert and guest exit.

Reported by:	Julian Hsiao, madoka at nyanisore dot net
Reviewed by:	neel
MFC after:	1 week
2015-03-27 01:58:44 +00:00
mav
54b9845962 Add missing variable initialization.
Reported by:	Coverity
CID:		1288938
MFC after:	3 days
2015-03-20 16:05:13 +00:00
mav
55b7ea0246 Report that we may have write cache, and that we do support FLUSH.
The FreeBSD guest driver does not use that legacy flag, but Linux seems to.

MFC after:	2 weeks
2015-03-16 20:13:25 +00:00
mav
cfdd687fd7 Increase S/G list size from 32 to 33 entries.
32 entries are not enough for the worst case of a misaligned 128KB request,
which made FreeBSD chunk large requests into odd pieces.

MFC after:	2 weeks
2015-03-16 09:15:59 +00:00
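The 32-versus-33 arithmetic can be checked with a short sketch. It assumes 4KB pages and one S/G entry per page touched; the helper name `sg_segments` is hypothetical and is not taken from the bhyve sources.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper: count how many page-sized S/G segments a buffer
 * touches, given its starting byte offset and length.
 */
static size_t
sg_segments(uint64_t offset, uint64_t len, uint64_t page_size)
{
	assert(page_size > 0 && len > 0);

	uint64_t first = offset / page_size;             /* first page touched */
	uint64_t last = (offset + len - 1) / page_size;  /* last page touched */

	return (size_t)(last - first + 1);
}
```

Under these assumptions, a page-aligned 128KB request touches 32 pages, while the same request starting 512 bytes into a page spills into a 33rd.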
mav
2b87ed684f Pre-allocate one extra request per processing thread.
Processing threads call callbacks before freeing requests.  As a result,
new requests may arrive before old ones are freed.

MFC after:	2 weeks
2015-03-15 22:44:53 +00:00
mav
0a32d97912 According to Linux and QEMU, a s/n as long as the buffer is not zero-terminated.
This makes the same s/n be reported for both virtio and AHCI drivers.

MFC after:	2 weeks
2015-03-15 17:45:16 +00:00
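A minimal sketch of the fixed-width serial field semantics described above. The 20-byte `SERIAL_LEN` and both helper names are assumptions made for illustration, not the bhyve code: a serial shorter than the field is zero-padded, and one that fills the field carries no terminator.

```c
#include <string.h>

#define SERIAL_LEN 20	/* assumed field width, for illustration only */

/* Copy a serial into a fixed-width field; no NUL is stored if it fills it. */
static void
fill_serial(char dst[SERIAL_LEN], const char *src)
{
	size_t n = strlen(src);

	if (n > SERIAL_LEN)
		n = SERIAL_LEN;		/* truncate overlong serials */
	memset(dst, 0, SERIAL_LEN);	/* zero-pad short serials */
	memcpy(dst, src, n);
}

/* Reader side: compute the length without assuming a terminator exists. */
static size_t
serial_length(const char field[SERIAL_LEN])
{
	const char *nul = memchr(field, 0, SERIAL_LEN);

	return (nul != NULL ? (size_t)(nul - field) : SERIAL_LEN);
}
```

A consumer of such a field has to bound its reads by the field width, as `serial_length` does, rather than call `strlen` on it.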
mav
72856e7d90 Close potential race on blockif_close().
Reported by:	vangyzen
MFC after:	2 weeks
2015-03-15 16:18:03 +00:00
mav
2088070eaf Fix networking problem after r280026.
I missed that the network driver sometimes returns a taken request to the
available queue without processing it.  Add a new helper function for that case.

Reported by:	flo
MFC after:	2 weeks
2015-03-15 16:09:39 +00:00
mav
cd363583ce Give AHCI disks a serial based on the backing file path, same as for virtio block.
It is still not good that they may intersect on different hosts, but that
is better than intersecting on the same host.

MFC after:	2 weeks
2015-03-15 15:29:03 +00:00
mav
15ba37b7de Rewrite virtio block device driver to work asynchronously and use the block
I/O interface.

Asynchronous operation, based on the r280026 change, avoids blocking the virtual
CPU during I/O processing, which can take seconds on slow or busy storage.
Use of the recently improved block I/O interface allows processing multiple
requests at the same time, which improves random I/O performance on wide storage.

Benchmarks of a virtual disk, backed by a ZVOL on a RAID10 pool of 4 HDDs, show
a ~3.5 times random read performance improvement, with no degradation on
linear I/O.  Guest CPU usage during the test dropped from 100% to almost zero.

MFC after:	2 weeks
2015-03-15 14:57:11 +00:00
mav
42641f98a6 Modify virtqueue helpers added in r253440 to allow queuing.
The original virtqueue design allows queued and out-of-order processing, but
the helpers added in r253440 assume only direct, blocking, in-order processing.
That could be fine for network, etc., but it is a huge limitation for storage
devices.
2015-03-15 11:37:07 +00:00
mav
476187cac8 Block delete capability for read-only devices.
Submitted by:	neel
MFC after:	2 weeks
2015-03-15 08:09:56 +00:00
mav
eb63aed246 Give block I/O interface multiple (8) execution threads.
On parallel random I/O this allows better utilization of wide storage pools.
To avoid confusing the prefetcher on linear I/O, consecutive requests are executed
sequentially, following the same logic that was implemented earlier in CTL.

Benchmarks of a virtual AHCI disk, backed by a ZVOL on a RAID10 pool of 4 HDDs,
show a ~3.5 times random read performance improvement, with no degradation
on linear I/O.

MFC after:	2 weeks
2015-03-14 21:15:45 +00:00
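One way to read the "consecutive requests are executed sequentially" rule is sketched below. The `pending_req` structure and `must_wait` function are hypothetical names for illustration and do not reproduce the actual block I/O code: a new request is held back when it starts exactly where a queued or running request ends, and is otherwise free to go to any idle worker thread.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical record of a request that is queued or being executed. */
struct pending_req {
	uint64_t offset;	/* starting byte offset on the backing store */
	uint64_t length;	/* request length in bytes */
};

/*
 * Return true if the new request directly follows one that is still
 * pending, in which case it should wait so that linear I/O reaches the
 * backing store in order.
 */
static bool
must_wait(const struct pending_req *pending, size_t n, uint64_t new_offset)
{
	for (size_t i = 0; i < n; i++) {
		if (pending[i].offset + pending[i].length == new_offset)
			return (true);
	}
	return (false);
}
```

Random requests rarely start where another pending request ends, so they run in parallel, while a linear stream collapses back into sequential execution.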
mav
efa8369c49 Add checksums to identify data and NCQ command error log.
MFC after:	2 weeks
2015-03-14 14:06:37 +00:00
mav
75e831bbaa Slightly polish virtual AHCI CD reporting.
MFC after:	2 weeks
2015-03-14 12:18:26 +00:00
mav
dd99a4abcb Fix NOP and IDLE commands for virtual AHCI disks.
MFC after:	2 weeks
2015-03-14 10:38:25 +00:00
mav
36590090f2 Add support for NCQ variant of DSM TRIM for virtual AHCI disks.
The code is not really tested yet due to lack of initiator support.

Requested by:	imp
MFC after:	2 weeks
2015-03-14 09:46:43 +00:00
mav
9ac55a7e33 Improve NCQ error reporting for virtual AHCI disks.
While this implementation is still not perfect, the previous one was just broken.

MFC after:	2 weeks
2015-03-14 08:45:54 +00:00
mav
48507f436f Remove incorrect SERR register setting.
At this point we have nothing to report through that register.

MFC after:	2 weeks
2015-03-13 21:01:25 +00:00
mav
358865b66d Change prdbc value reporting.
MFC after:	2 weeks
2015-03-13 20:56:17 +00:00
mav
9d7a73f956 Polish AHCI disk identify data and fix speed negotiation.
MFC after:	2 weeks
2015-03-13 20:14:35 +00:00
mav
eded307e2f Add support for PIO variants of READ/WRITE commands for AHCI disks.
AHCI API hides all PIO specifics, so this functionality is almost free.

MFC after:	2 weeks
2015-03-13 18:35:38 +00:00
mav
7435136f27 Use ahci_write_fis_d2h() for command completion.
MFC after:	2 weeks
2015-03-13 18:04:07 +00:00
mav
ec9fb407ff Add DSM TRIM command support for virtual AHCI disks.
It works only for virtual disks backed by ZVOLs and raw devices supporting
BIO_DELETE.  Virtual disks backed by files won't report this capability.

MFC after:	2 weeks
Relnotes:	yes
2015-03-13 16:43:52 +00:00
mav
a0ef792dfe Add variable initialization missed by me and clang.
Reported by:	grehan
MFC after:	2 weeks
2015-03-05 20:29:18 +00:00
mav
710980f2ff Fix error translation broken in r279658.
Reported by:	grehan
MFC after:	2 weeks
2015-03-05 20:24:34 +00:00
mav
a34b7ad0d2 Implement cache flush for ahci-hd and for virtio-blk over device.
MFC after:	2 weeks
2015-03-05 15:29:18 +00:00
mav
52d1cb2338 Add check for absent stripe size to r279652.
MFC after:	2 weeks
2015-03-05 13:52:30 +00:00
mav
02e846756e Report logical/physical sector sizes for virtual SATA disk.
MFC after:	2 weeks
2015-03-05 12:21:12 +00:00
mav
214386b6e7 Add support for TOPOLOGY feature of virtio block device.
Passing through physical block size/offset from underlying storage allows
guest to manage proper data and I/O alignment to improve performance.

MFC after:	2 weeks
2015-03-05 10:40:45 +00:00
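The virtio block topology fields express the physical block size as a power-of-two exponent relative to the logical block size. The sketch below shows that conversion only; the function name is hypothetical, and treating an unknown (zero) physical size as "same as logical" is an assumption made here.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper: number of logical blocks per physical block,
 * expressed as a log2 exponent.  Returns 0 when the physical size is
 * unknown (0) or not larger than the logical size.
 */
static uint8_t
physical_block_exp(uint32_t logical, uint32_t physical)
{
	assert(logical > 0);

	uint8_t exp = 0;

	if (physical <= logical)
		return (0);
	for (uint32_t ratio = physical / logical; ratio > 1; ratio >>= 1)
		exp++;
	return (exp);
}
```

For example, a backing store with 4096-byte physical and 512-byte logical sectors yields an exponent of 3, which lets the guest align its I/O to 8-sector boundaries.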
neel
16ee311597 Emulate MSR 0xC0011024 when running on AMD processors.
OpenBSD guests test bit 0 of this MSR to detect whether the workaround for
erratum 721 has been applied.

Reported by:	Jason Tubnor (jason@tubnor.net)
MFC after:	1 week
2015-02-24 05:15:40 +00:00
neel
b341fa888c Add "-u" option to bhyve(8) to indicate that the RTC should maintain UTC time.
The default remains localtime for compatibility with the original device model
in bhyve(8). This is required for OpenBSD guests which assume that the RTC
keeps UTC time.

Reviewed by:	grehan
Pointed out by:	Jason Tubnor (jason@tubnor.net)
MFC after:	2 weeks
2015-02-24 02:04:16 +00:00
grehan
ceecb9ed38 Don't close a block context if it couldn't be opened,
for example if the backing file doesn't exist,
avoiding a null deref.

Reviewed by:	neel
MFC after:	1 week
2015-02-23 22:31:39 +00:00
neel
d9f07f9841 Simplify instruction restart logic in bhyve.
Keep track of the next instruction to be executed by the vcpu as 'nextrip'.
As a result the VM_RUN ioctl no longer takes the %rip where a vcpu should
start execution.

Also, instruction restart happens implicitly via 'vm_inject_exception()' or
explicitly via 'vm_restart_instruction()'. The APIs behave identically in
both kernel and userspace contexts. The main beneficiary is the instruction
emulation code that executes in both contexts.

bhyve(8) VM exit handlers now treat 'vmexit->rip' and 'vmexit->inst_length'
as readonly:
- Restarting an instruction is now done by calling 'vm_restart_instruction()'
  as opposed to setting 'vmexit->inst_length' to 0 (e.g. emulate_inout())
- Resuming vcpu at an arbitrary %rip is now done by setting VM_REG_GUEST_RIP
  as opposed to changing 'vmexit->rip' (e.g. vmexit_task_switch())

Differential Revision:	https://reviews.freebsd.org/D1526
Reviewed by:		grehan
MFC after:		2 weeks
2015-01-18 03:08:30 +00:00
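The 'nextrip' bookkeeping described above can be modelled roughly as follows. This is a simplified sketch with hypothetical names (`vcpu_sketch` and the `sketch_*` functions), not the vmm.ko implementation: the exit information is left untouched, and only the resume address changes.

```c
#include <stdint.h>

/* Hypothetical, simplified per-vcpu state. */
struct vcpu_sketch {
	uint64_t rip;		/* %rip of the instruction that exited */
	int	 inst_length;	/* length of that instruction in bytes */
	uint64_t nextrip;	/* where the vcpu resumes on the next run */
};

/* On a VM exit, default to resuming after the exiting instruction. */
static void
sketch_record_exit(struct vcpu_sketch *v, uint64_t rip, int inst_length)
{
	v->rip = rip;
	v->inst_length = inst_length;
	v->nextrip = rip + (uint64_t)inst_length;
}

/* Restart: resume at the same instruction; exit info stays read-only. */
static void
sketch_restart_instruction(struct vcpu_sketch *v)
{
	v->nextrip = v->rip;
}

/* Resume at an arbitrary address, as setting the guest %rip would. */
static void
sketch_set_rip(struct vcpu_sketch *v, uint64_t new_rip)
{
	v->nextrip = new_rip;
}
```

In this model an exit handler never edits `rip` or `inst_length`; it either does nothing (continue after the instruction), asks for a restart, or sets a new resume address.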
neel
7aa6460c48 Replace bhyve's minimal RTC emulation with a fully featured one in vmm.ko.
The new RTC emulation supports all interrupt modes: periodic, update ended
and alarm. It is also capable of maintaining the date/time and NVRAM contents
across virtual machine reset. Also, the date/time fields can now be modified
by the guest.

Since bhyve now emulates both the PIT and the RTC there is no need for
"Legacy Replacement Routing" in the HPET so get rid of it.

The RTC device state can be inspected via bhyvectl as follows:
bhyvectl --vm=vm --get-rtc-time
bhyvectl --vm=vm --set-rtc-time=<unix_time_secs>
bhyvectl --vm=vm --rtc-nvram-offset=<offset> --get-rtc-nvram
bhyvectl --vm=vm --rtc-nvram-offset=<offset> --set-rtc-nvram=<value>

Reviewed by:	tychon
Discussed with:	grehan
Differential Revision:	https://reviews.freebsd.org/D1385
MFC after:	2 weeks
2014-12-30 22:19:34 +00:00
bapt
a191ba5195 Convert usr.sbin to LIBADD
Reduce overlinking
2014-11-25 16:57:27 +00:00
trasz
5e025caad4 Fix improper .Fx macro usage.
Differential Revision:	https://reviews.freebsd.org/D1158
Reviewed by:	wblock@
MFC after:	1 month
Sponsored by:	The FreeBSD Foundation
2014-11-19 18:19:21 +00:00
tychon
8e07e96dc6 To allow a request to be submitted from within the callback routine of
a completing one, increase the total by 1 but don't advertise it.

Reviewed by:	grehan
2014-11-09 21:08:52 +00:00
tychon
c9b1ef1cfd Improve the ability to cancel an in-flight request by using an
interrupt, via SIGCONT, to force the read or write system call to
return prematurely.

Reviewed by:	grehan
2014-11-04 01:06:33 +00:00
tychon
dc557cbaff If the start bit, PxCMD.ST, is cleared and nothing is in-flight then
PxCI, PxSACT, PxCMD.CCS and PxCMD.CR should be 0.

Reviewed by:	grehan
2014-11-03 12:55:31 +00:00
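The rule stated in this commit can be sketched as below. The bit positions follow the AHCI port register layout (ST is bit 0, CCS is bits 12:8, CR is bit 15); the `port_sketch` structure and function name are hypothetical and do not mirror the emulation's real data structures.

```c
#include <stdint.h>

#define P_CMD_ST	0x00000001u	/* PxCMD.ST, bit 0 */
#define P_CMD_CCS_MASK	0x00001f00u	/* PxCMD.CCS, bits 12:8 */
#define P_CMD_CR	0x00008000u	/* PxCMD.CR, bit 15 */

/* Hypothetical, simplified view of one AHCI port. */
struct port_sketch {
	uint32_t cmd;		/* PxCMD */
	uint32_t ci;		/* PxCI */
	uint32_t sact;		/* PxSACT */
	uint32_t inflight;	/* bitmask of slots still being processed */
};

/* If the port is stopped and idle, clear the command-list state. */
static void
port_check_stopped(struct port_sketch *p)
{
	if ((p->cmd & P_CMD_ST) == 0 && p->inflight == 0) {
		p->ci = 0;
		p->sact = 0;
		p->cmd &= ~(P_CMD_CCS_MASK | P_CMD_CR);
	}
}
```

The check deliberately leaves the registers alone while any slot is still in flight, so the state is only reset once outstanding commands have drained.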
neel
a98b036706 Add a comment explaining the intent behind the I/O reservation [0x72-0x77].
2014-10-26 21:17:44 +00:00
neel
eb58530f9b Move the ACPI PM timer emulation into vmm.ko.
This reduces variability during timer calibration by keeping the emulation
"close" to the guest. Additionally having all timer emulations in the kernel
will ease the transition to a per-VM clock source (as opposed to using the
host's uptime to keep track of time).

Discussed with:	grehan
2014-10-26 04:44:28 +00:00
neel
dd2febd6f2 IFC @r273214
2014-10-20 02:57:30 +00:00
neel
13e9198693 Don't advertise the "OS visible workarounds" feature in cpuid.80000001H:ECX.
bhyve doesn't emulate the MSRs needed to support this feature at this time.

Don't expose any model-specific RAS and performance monitoring features in
cpuid leaf 80000007H.

Emulate a few more MSRs for AMD: TSEG base address, TSEG address mask and
BIOS signature and P-state related MSRs.

This eliminates all the unimplemented MSRs accessed by Linux/x86_64 kernels
2.6.32, 3.10.0 and 3.17.0.
2014-10-19 21:38:58 +00:00
neel
170f49cd34 Don't advertise the Instruction Based Sampling feature because it requires
emulating a large number of MSRs.

Ignore writes to a couple more AMD-specific MSRs and return 0 on read.

This further reduces the unimplemented MSRs accessed by a Linux guest on boot.
2014-10-17 06:23:04 +00:00
neel
b194fc5a2b Hide extended PerfCtr MSRs on AMD processors by clearing bits 23, 24 and 28 in
CPUID.80000001H:ECX.

Handle accesses to PerfCtrX and PerfEvtSelX MSRs by ignoring writes and
returning 0 on reads.

This further reduces the number of unimplemented MSRs hit by a Linux guest
during boot.
2014-10-17 03:04:38 +00:00
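A rough sketch of the two techniques named in this commit: masking feature bits out of a CPUID result, and handling a range of MSRs by ignoring writes and returning 0 on reads. All function names are hypothetical, and the MSR range used here (the legacy AMD PerfEvtSel0-3 and PerfCtr0-3 registers at 0xC0010000 through 0xC0010007) is an assumption that should be checked against AMD's documentation.

```c
#include <stdbool.h>
#include <stdint.h>

/* Bits 23, 24 and 28 of CPUID.80000001H:ECX, as named in the commit. */
#define HIDDEN_ECX_BITS	((1u << 23) | (1u << 24) | (1u << 28))

/* Clear the hidden feature bits from the ECX value shown to the guest. */
static uint32_t
mask_perfctr_features(uint32_t ecx)
{
	return (ecx & ~HIDDEN_ECX_BITS);
}

/* Assumed MSR range for the legacy performance counter registers. */
#define MSR_PERF_FIRST	0xC0010000u
#define MSR_PERF_LAST	0xC0010007u

static bool
is_perf_msr(uint32_t msr)
{
	return (msr >= MSR_PERF_FIRST && msr <= MSR_PERF_LAST);
}

/* Read handler: returns true if handled; handled reads yield 0. */
static bool
sketch_rdmsr(uint32_t msr, uint64_t *val)
{
	if (!is_perf_msr(msr))
		return (false);
	*val = 0;
	return (true);
}

/* Write handler: returns true if handled; the value is discarded. */
static bool
sketch_wrmsr(uint32_t msr, uint64_t val)
{
	(void)val;	/* writes are ignored */
	return (is_perf_msr(msr));
}
```

Handling the accesses this way keeps a guest that probes these registers from hitting an unimplemented-MSR path, without pretending the counters actually count.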
neel
7108d4bd75 Emulate the "Hardware Configuration" MSR when running on an AMD host.
This gets rid of the "TSC doesn't count with P0 frequency!" message when
booting a Linux guest.

Tested on an "AMD Opteron 6320" courtesy of Ben Perrault.
2014-10-16 19:27:26 +00:00
neel
bfa9f55e70 IFC @r272887
2014-10-10 23:52:56 +00:00
neel
f443215307 Support Intel-specific MSRs that are accessed when booting Linux in bhyve:
- MSR_PLATFORM_INFO
- MSR_TURBO_RATIO_LIMITx
- MSR_RAPL_POWER_UNIT

Reviewed by:	grehan
MFC after:	1 week
2014-10-09 19:13:33 +00:00