Commit Graph

329 Commits

Author SHA1 Message Date
delphij
45de056048 Use strlcpy() instead of strncpy() because subsequent mkstemps expects
the string be nul-terminated.

Reviewed by:	neel
MFC after:	2 weeks
Differential Revision:	https://reviews.freebsd.org/D3685
2015-09-17 18:11:26 +00:00
grehan
9d08caeba8 Add simple (no-op) emulations for the CHECK_POWER_MODE,
READ_VERIFY and READ_VERIFY_EXT commands.

Reviewed by:	mav
2015-08-17 05:59:36 +00:00
mav
a4e4af47b0 Another small typo.
MFC after:	3 days
2015-08-11 09:00:27 +00:00
mav
f292c98681 Fix minor typo.
MFC after:	3 days
2015-08-11 08:58:00 +00:00
brueffer
94936bfa1d Manpage cleanup.
- new sentence -> new line
- fix manpage references
- fix macro usage
- fix a typo

MFC after:	1 week
2015-08-07 10:48:52 +00:00
neel
0a74489e33 Always assert DCD and DSR in bhyve's uart emulation.
The /etc/ttys entry for a serial console in FreeBSD/x86 is as follows:
ttyu0   "/usr/libexec/getty 3wire"      vt100   onifconsole secure

The initial terminal type passed to getty(8) is "3wire" which sets the
CLOCAL flag. However reset(1) clears this flag and any programs that try
to open the terminal will hang waiting for DCD to be asserted.

Fix this by always asserting DCD and DSR in the emulated uart.

The following discussion on virtualization@ has more details:
https://lists.freebsd.org/pipermail/freebsd-virtualization/2015-June/003666.html

Reported by: jmg
Discussed with: grehan
2015-07-06 19:33:29 +00:00
sjg
4beac78e55 Updated depends 2015-07-03 06:11:54 +00:00
jmg
2f2647539a add SO_REUSEADDR when starting debug port, lets you still bind when
a TIME_WAIT socket is still around...

Reviewed by:	grehan
Review:		https://reviews.freebsd.org/D2875
2015-06-20 07:49:08 +00:00
neel
8c70d6c7af Restructure memory allocation in bhyve to support "devmem".
devmem is used to represent MMIO devices like the boot ROM or a VESA framebuffer
where doing a trap-and-emulate for every access is impractical. devmem is a
hybrid of system memory (sysmem) and emulated device models.

devmem is mapped in the guest address space via nested page tables similar
to sysmem. However the address range where devmem is mapped may be changed
by the guest at runtime (e.g. by reprogramming a PCI BAR). Also devmem is
usually mapped RO or RW as compared to RWX mappings for sysmem.

Each devmem segment is named (e.g. "bootrom") and this name is used to
create a device node for the devmem segment (e.g. /dev/vmm/testvm.bootrom).
The device node supports mmap(2) and this decouples the host mapping of
devmem from its mapping in the guest address space (which can change).

Reviewed by:	tychon
Discussed with:	grehan
Differential Revision:	https://reviews.freebsd.org/D2762
MFC after:	4 weeks
2015-06-18 06:00:17 +00:00
neel
3f2b4fc770 Fix non-deterministic delays when accessing a vcpu that was in "running" or
"sleeping" state. This is done by forcing the vcpu to transition to "idle"
by returning to userspace with an exit code of VM_EXITCODE_REQIDLE.

MFC after:      2 weeks
2015-05-28 17:37:01 +00:00
tychon
ab4eab33be The 'hostbridge' device exists to allow guests to infer msi/msix
capablity by advertising pcie capability.

Since the 'hostbridge' device isn't a true pci-to-pci bridge, and
doesn't actaully use the bridge configuration space layout, change
the header-type from type 1 to type 0 to avoid confusion.

Reviewed by:	neel
2015-05-21 20:11:52 +00:00
grehan
5432c8344b Temporarily revert r282922 which bumped the max descriptors.
While there is no issued with the number of descriptors in
a virtio indirect descriptor, it's a guest's choice as to
whether indirect descriptors are used. For the case where
they aren't, the virtio block ring size is still 64 which
is less than the now reported max_segs of 67. This results
in an assertion in recent Linux guests even though it was
benign since they were using indirect descs.

The intertwined relationship between virtio ring size,
max seg size and blockif queue size will be addressed
in an upcoming commit, at which point the max descriptors
will again be bumped up to 67.
2015-05-21 04:19:22 +00:00
grehan
288de166c4 Bump the size of the blockif scatter-gather list to 67.
The Windows virtio driver ignores the advertized seg_max
field and assumes the host can accept up to 67 segments
in indirect descriptors, triggering an assert in the bhyve
process.

No objection from:	mav
Reviewed by:	neel
Reported and tested by:	Leon Dang (ldang@nahannisys.com)
MFC after:	2 weeks
2015-05-14 21:08:48 +00:00
grehan
cecc5885ac Set the subvendor field in config space to the vendor ID.
This is required by the Windows virtio drivers to correctly
match a device.

Submitted by:	Leon Dang (ldang@nahannisys.com)
MFC after:	2 weeks
2015-05-13 17:38:07 +00:00
neel
0bccd59f0c Allow configuration of the sector size advertised to the guest.
The default behavior is to infer the logical and physical sector sizes from
the block device backend. However older versions of Windows only work with
specific logical/physical combinations:
- Vista and Windows 7:	512/512
- Windows 7 SP1:	512/512 or 512/4096

For this reason allow the sector size to be specified using the following
block device option: sectorsize=logical[/physical]

Reported by:	Leon Dang (ldang@nahannisys.com)
Reviewed by:	grehan
MFC after:	2 weeks
2015-05-12 00:30:39 +00:00
grehan
cbd45b5dc8 Handling indirect descriptors is a capability of the host and
not one that needs to be negotiated. Use the host capabilities
field and not the negotiated field when verifying that indirect
descriptors are supported.

Found with the Redhat Windows viostor driver, which clears
the indirect capability in the negotiated caps and then starts
using them.

Reported and tested by: Leon Dang (ldang@nahannisys.com)
MFC after:   2 weeks
2015-05-11 21:24:10 +00:00
neel
7fdf1e8707 Allow byte reads of AHCI registers.
This is needed to support Windows guests that use byte reads to access certain
AHCI registers (e.g. PxTFD.Status and PxTFD.Error).

Reviewed by:	grehan, mav
Reported by:	Leon Dang (ldang@nahannisys.com)
Differential Revision:	https://reviews.freebsd.org/D2469
MFC after:	2 weeks
2015-05-07 18:35:15 +00:00
mav
eb1da25535 Add memory barrier to r281764.
While race at this point may cause only a single packet delay and so was
not really reproduced, it is better to not have it at all.

MFC after:	1 week
2015-05-06 18:04:31 +00:00
neel
7776059e98 Deprecate the 3-way return values from vm_gla2gpa() and vm_copy_setup().
Prior to this change both functions returned 0 for success, -1 for failure
and +1 to indicate that an exception was injected into the guest.

The numerical value of ERESTART also happens to be -1 so when these functions
returned -1 it had to be translated to a positive errno value to prevent the
VM_RUN ioctl from being inadvertently restarted. This made it easy to introduce
bugs when writing emulation code.

Fix this by adding an 'int *guest_fault' parameter and setting it to '1' if
an exception was delivered to the guest. The return value is 0 or EFAULT so
no additional translation is needed.

Reviewed by:	tychon
MFC after:	2 weeks
Differential Revision:	https://reviews.freebsd.org/D2428
2015-05-06 16:25:20 +00:00
mav
6b7074e362 Reimplement queue freeze on error, added in r282429:
It is not required to use CLO to recover from task file error, it should
be enough to do only stop/start, that does not clear the PxTFD.STS.ERR.

MFC after:	13 days
2015-05-06 09:59:19 +00:00
mav
8e9264e9c5 Implement in-order execution of non-NCQ commands.
Using status updates in r282364, block queue on BSY, DRQ or ERR bits set.
This can be a performance penalization for non-NCQ commands, but it is
required for proper error recovery and standard compliance.

MFC after:	2 weeks
2015-05-04 19:55:01 +00:00
mav
4cd4238e92 Implement basic PxTFD.STS.BSY reporting.
MFC after:	2 weeks
2015-05-03 07:43:58 +00:00
mav
843dbc5981 Initialize PxCMD on reset and make its read-only bits such.
MFC after:	2 weeks
2015-05-02 16:11:29 +00:00
mav
9b3d347cfb Handle ATA_SEND_FPDMA_QUEUED as NCQ in ahci_port_stop().
MFC after:	1 week
2015-05-02 14:43:37 +00:00
neel
9d0c86225f Advertise an additional memory BAR in the "dummy" device emulation.
This is useful for testing the MOVS emulation when both the source and
destination addresses are in the MMIO space.

MFC after:	1 week
2015-05-02 03:25:24 +00:00
neel
0151349b05 Implement the century byte in the RTC. Some guests require this field to be
properly set.

Reported by:	Leon Dang (ldang@nahannisys.com)
MFC after:	2 weeks
2015-04-28 23:44:47 +00:00
neel
fe295d11a1 Don't allow guest to modify readonly bits in the PCI config 'status' register.
Reported by:	Leon Dang (ldang@nahannisys.com)
MFC after:	2 weeks
2015-04-24 19:15:38 +00:00
jhb
e4683250d1 Reassign copyright statements on several files from Advanced
Computing Technologies LLC to Hudson River Trading LLC.

Approved by:	Hudson River Trading LLC (who owns ACT LLC)
MFC after:	1 week
2015-04-23 14:22:20 +00:00
mav
28d6acc050 Don't set bits that should be zero for SATA devices.
Old value made Linux think that it is PATA device with SATA bridge.

MFC after:	2 weeks
2015-04-20 19:11:27 +00:00
mav
b8bb630aaa Report link as up if tap device is not specified (black hole).
MFC after:	2 weeks
2015-04-20 14:55:01 +00:00
mav
f62a422e61 Report link as up only if we managed to open tap device.
It would be cool to report tap device status, but it has no such API.

MFC after:	2 weeks
2015-04-20 14:23:18 +00:00
mav
0c201fb9ee Disable RX/TX queues notifications when not needed.
This reduces CPU load and doubles iperf throughput, reaching 2-3Gbit/s.

MFC after:	2 weeks
Sponsored by:	iXsystems, Inc.
2015-04-20 10:29:42 +00:00
mav
22312966ef Workaround bhyve virtual disks operation on top of GEOM providers.
GEOM does not support scatter/gather lists in its I/Os.  Such requests
are cut in pieces by physio(), that may be problematic, if those pieces
are not multiple of provider's sector size.  If such case is detected,
move the data through temporary sequential buffer.

MFC after:	2 weeks
2015-04-18 20:10:19 +00:00
mav
2ed1ecf65a Make virtual AHCI more careful with I/O lengths.
MFC after:	2 weeks
2015-04-17 20:20:55 +00:00
neel
e8382eebf0 If the number of guest vcpus is less than '1' then flag it as an error.
MFC after:	1 week
2015-04-16 20:11:49 +00:00
tychon
98c2a4bca4 Prior to aborting due to an ioport error, it is always interesting to
see what the guest's %rip is.

Reviewed by:	grehan
2015-04-15 18:49:03 +00:00
bapt
2ef8ecc842 Fix overlinking in bhyve:
libvmmapi is actually needed to be linked to libutil, not bhyve nor bhyveload
2015-04-09 21:38:40 +00:00
tychon
0b17a7a512 Prior to aborting due to an instruction emulation error, it is always
interesting to see what the guest's %rip and instruction bytes are.

Reviewed by:	grehan
2015-04-01 20:36:07 +00:00
grehan
d96cec83fc Move legacy interrupt allocation for virtio devices to common code.
There are a number of assumptions about legacy interrupts always
being available in virtio so don't allow back-ends to make the
decision to support them.

This fixes the issue seen with virtio-rnd on OpenBSD. MSI-x vectors
were not being used, and the virtio-rnd backend wasn't allocating a
legacy interrupt resulting in a bhyve assert and guest exit.

Reported by:	Julian Hsiao, madoka at nyanisore dot net
Reviewed by:	neel
MFC after:	1 week
2015-03-27 01:58:44 +00:00
mav
54b9845962 Add missing variable initialization.
Reported by:	Coverity
CID:		1288938
MFC after:	3 days
2015-03-20 16:05:13 +00:00
mav
55b7ea0246 Report that we may have write cache, and that we do support FLUSH.
FreeBSD guest driver does not use that legacy flag, but Linux seems does.

MFC after:	2 weeks
2015-03-16 20:13:25 +00:00
mav
cfdd687fd7 Increase S/G list size of 32 to 33 entries.
32 entries are not enough for the worst case of misaligned 128KB request,
that made FreeBSD to chunk large quests in odd pieces.

MFC after:	2 weeks
2015-03-16 09:15:59 +00:00
mav
2b87ed684f Pre-allocate one extra request per processing thread.
Processing threads call callbacks before freeing requests.  As result,
new requests may arrive before old ones are freed.

MFC after:	2 weeks
2015-03-15 22:44:53 +00:00
mav
0a32d97912 According to Linux and QEMU, s/n equal to buffer is not zero-terminated.
This makes same s/n reported for both virtio and AHCI drivers.

MFC after:	2 weeks
2015-03-15 17:45:16 +00:00
mav
72856e7d90 Close potential race on blockif_close().
Reported by:	vangyzen
MFC after:	2 weeks
2015-03-15 16:18:03 +00:00
mav
2088070eaf Fix networking problem after r280026.
I've missed that network driver sometimes returns taken request back to
available queue without processing.  Add new helper function for that case.

Reported by:	flo
MFC after:	2 weeks
2015-03-15 16:09:39 +00:00
mav
cd363583ce Give AHCI disk serial based on backing file path same as for virtio block.
It is still not good that they may intersect on different hosts, but that
is better then intersecting on the same host.

MFC after:	2 weeks
2015-03-15 15:29:03 +00:00
mav
15ba37b7de Rewrite virtio block device driver to work asynchronously and use the block
I/O interface.

Asynchronous operation, based on r280026 change, allows to not block virtual
CPU during I/O processing, that on slow/busy storage can take seconds.
Use of recently improved block I/O interface allows to process multiple
requests same time, that improves random I/O performance on wide storages.

Benchmarks of virtual disk, backed by ZVOL on RAID10 pool of 4 HDDs, show
~3.5 times random read performance improvements, while no degradation on
linear I/O.  Guest CPU usage during test dropped from 100% to almost zero.

MFC after:	2 weeks
2015-03-15 14:57:11 +00:00
mav
42641f98a6 Modify virtqueue helpers added in r253440 to allow queuing.
Original virtqueue design allows queued and out-of-order processing, but
helpers added in r253440 suppose only direct blocking in-order one.
It could be fine for network, etc., but it is a huge limitation for storage
devices.
2015-03-15 11:37:07 +00:00
mav
476187cac8 Block delete capability for read-only devices.
Submitted by:	neel
MFC after:	2 weeks
2015-03-15 08:09:56 +00:00