Import virtio base, PCI front-end, and net/block/balloon drivers.

Tested on Qemu/KVM, VirtualBox, and BHyVe.

Currently built as modules-only on i386/amd64. Man pages not yet hooked
up, pending review.

Submitted by:	Bryan Venteicher  bryanv at daemoninthecloset dot org
Reviewed by:	bz
MFC after:	4 weeks or so
Peter Grehan 2011-11-18 05:43:43 +00:00
parent ef27340c5b
commit 10b59a9b4a
27 changed files with 8190 additions and 0 deletions

share/man/man4/virtio.4 Normal file

@ -0,0 +1,91 @@
.\" Copyright (c) 2011 Bryan Venteicher
.\" All rights reserved.
.\"
.\" Redistribution and use in source and binary forms, with or without
.\" modification, are permitted provided that the following conditions
.\" are met:
.\" 1. Redistributions of source code must retain the above copyright
.\" notice, this list of conditions and the following disclaimer.
.\" 2. Redistributions in binary form must reproduce the above copyright
.\" notice, this list of conditions and the following disclaimer in the
.\" documentation and/or other materials provided with the distribution.
.\"
.\" THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND
.\" ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
.\" IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
.\" ARE DISCLAIMED. IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE
.\" FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
.\" DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS
.\" OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
.\" HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
.\" LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY
.\" OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
.\" SUCH DAMAGE.
.\"
.\" $FreeBSD$
.\"
.Dd July 4, 2011
.Dt VIRTIO 4
.Os
.Sh NAME
.Nm virtio
.Nd VirtIO Device Support
.Sh SYNOPSIS
To compile VirtIO device support into the kernel, place the following lines
in your kernel configuration file:
.Bd -ragged -offset indent
.Cd "device virtio"
.Cd "device virtio_pci"
.Ed
.Pp
Alternatively, to load VirtIO support as modules at boot time, place the
following lines in
.Xr loader.conf 5 :
.Bd -literal -offset indent
virtio_load="YES"
virtio_pci_load="YES"
.Ed
.Sh DESCRIPTION
VirtIO is a specification for para-virtualized I/O in a virtual machine (VM).
Traditionally, the hypervisor emulated real devices such as an Ethernet
interface or disk controller to provide the VM with I/O. This emulation is
often inefficient.
.Pp
VirtIO defines an interface for efficient I/O between the hypervisor and VM.
The
.Xr virtio 4
module provides a shared memory transport called a virtqueue.
The
.Xr virtio_pci 4
device driver represents an emulated PCI device that the hypervisor makes
available to the VM. This device provides the probing, configuration, and
interrupt notifications needed to interact with the hypervisor.
.Fx
supports the following VirtIO devices:
.Bl -hang -offset indent -width xxxxxxxx
.It Nm Ethernet
An emulated Ethernet device is provided by the
.Xr vtnet 4
device driver.
.It Nm Block
An emulated disk controller is provided by the
.Xr virtio_blk 4
device driver.
.It Nm Balloon
A pseudo-device to allow the VM to release memory back to the hypervisor is
provided by the
.Xr virtio_balloon 4
device driver.
.El
.Sh SEE ALSO
.Xr vtnet 4 ,
.Xr virtio_blk 4 ,
.Xr virtio_balloon 4
.Sh HISTORY
Support for VirtIO first appeared in
.Fx 9.0 .
.Sh AUTHORS
.An -nosplit
.Fx
support for VirtIO was first added by
.An Bryan Venteicher Aq bryanv@daemoninthecloset.org .

@ -0,0 +1,64 @@
.\" Copyright (c) 2011 Bryan Venteicher
.\" All rights reserved.
.\"
.\" Redistribution and use in source and binary forms, with or without
.\" modification, are permitted provided that the following conditions
.\" are met:
.\" 1. Redistributions of source code must retain the above copyright
.\" notice, this list of conditions and the following disclaimer.
.\" 2. Redistributions in binary form must reproduce the above copyright
.\" notice, this list of conditions and the following disclaimer in the
.\" documentation and/or other materials provided with the distribution.
.\"
.\" THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND
.\" ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
.\" IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
.\" ARE DISCLAIMED. IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE
.\" FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
.\" DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS
.\" OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
.\" HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
.\" LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY
.\" OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
.\" SUCH DAMAGE.
.\"
.\" $FreeBSD$
.\"
.Dd July 4, 2011
.Dt VIRTIO_BALLOON 4
.Os
.Sh NAME
.Nm virtio_balloon
.Nd VirtIO Memory Balloon driver
.Sh SYNOPSIS
To compile this driver into the kernel,
place the following line in your
kernel configuration file:
.Bd -ragged -offset indent
.Cd "device virtio_balloon"
.Ed
.Pp
Alternatively, to load the driver as a
module at boot time, place the following line in
.Xr loader.conf 5 :
.Bd -literal -offset indent
virtio_balloon_load="YES"
.Ed
.Sh DESCRIPTION
The
.Nm
device driver provides support for VirtIO memory balloon devices.
.Pp
At the request of the hypervisor, the memory balloon allows the guest
to give memory allocated to it back to the hypervisor so that it can
be made available to other guests. The hypervisor can later
signal the balloon to return the memory to the guest.
.Sh SEE ALSO
.Xr virtio 4
.Sh HISTORY
The
.Nm
driver was written by
.An Bryan Venteicher Aq bryanv@daemoninthecloset.org .
It first appeared in
.Fx 9.0 .

@ -0,0 +1,70 @@
.\" Copyright (c) 2011 Bryan Venteicher
.\" All rights reserved.
.\"
.\" Redistribution and use in source and binary forms, with or without
.\" modification, are permitted provided that the following conditions
.\" are met:
.\" 1. Redistributions of source code must retain the above copyright
.\" notice, this list of conditions and the following disclaimer.
.\" 2. Redistributions in binary form must reproduce the above copyright
.\" notice, this list of conditions and the following disclaimer in the
.\" documentation and/or other materials provided with the distribution.
.\"
.\" THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND
.\" ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
.\" IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
.\" ARE DISCLAIMED. IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE
.\" FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
.\" DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS
.\" OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
.\" HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
.\" LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY
.\" OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
.\" SUCH DAMAGE.
.\"
.\" $FreeBSD$
.\"
.Dd July 4, 2011
.Dt VIRTIO_BLK 4
.Os
.Sh NAME
.Nm virtio_blk
.Nd VirtIO Block driver
.Sh SYNOPSIS
To compile this driver into the kernel,
place the following line in your
kernel configuration file:
.Bd -ragged -offset indent
.Cd "device virtio_blk"
.Ed
.Pp
Alternatively, to load the driver as a
module at boot time, place the following line in
.Xr loader.conf 5 :
.Bd -literal -offset indent
virtio_blk_load="YES"
.Ed
.Sh DESCRIPTION
The
.Nm
device driver provides support for VirtIO block devices.
.Sh LOADER TUNABLES
Tunables can be set at the
.Xr loader 8
prompt before booting the kernel or stored in
.Xr loader.conf 5 .
.Bl -tag -width "xxxxxx"
.It Va hw.vtblk.no_ident
This tunable disables retrieving the device identification string
from the hypervisor. The default value is 0.
.El
.Sh SEE ALSO
.Xr virtio 4
.Sh HISTORY
The
.Nm
driver was written by
.An Bryan Venteicher Aq bryanv@daemoninthecloset.org .
It first appeared in
.Fx 9.0 .

share/man/man4/vtnet.4 Normal file

@ -0,0 +1,98 @@
.\" Copyright (c) 2011 Bryan Venteicher
.\" All rights reserved.
.\"
.\" Redistribution and use in source and binary forms, with or without
.\" modification, are permitted provided that the following conditions
.\" are met:
.\" 1. Redistributions of source code must retain the above copyright
.\" notice, this list of conditions and the following disclaimer.
.\" 2. Redistributions in binary form must reproduce the above copyright
.\" notice, this list of conditions and the following disclaimer in the
.\" documentation and/or other materials provided with the distribution.
.\"
.\" THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND
.\" ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
.\" IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
.\" ARE DISCLAIMED. IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE
.\" FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
.\" DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS
.\" OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
.\" HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
.\" LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY
.\" OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
.\" SUCH DAMAGE.
.\"
.\" $FreeBSD$
.\"
.Dd July 4, 2011
.Dt VTNET 4
.Os
.Sh NAME
.Nm vtnet
.Nd VirtIO Ethernet driver
.Sh SYNOPSIS
To compile this driver into the kernel,
place the following line in your
kernel configuration file:
.Bd -ragged -offset indent
.Cd "device if_vtnet"
.Ed
.Pp
Alternatively, to load the driver as a
module at boot time, place the following line in
.Xr loader.conf 5 :
.Bd -literal -offset indent
if_vtnet_load="YES"
.Ed
.Sh DESCRIPTION
The
.Nm
device driver provides support for VirtIO Ethernet devices.
.Pp
If the hypervisor advertises the appropriate features, the
.Nm
driver supports TCP/UDP checksum offload for both transmit and receive,
TCP segmentation offload (TSO), TCP large receive offload (LRO),
hardware VLAN tag stripping/insertion, a multicast hash filter, and
Jumbo Frames (up to 9216 bytes), which can be
configured via the interface MTU setting.
Selecting an MTU larger than 1500 bytes with the
.Xr ifconfig 8
utility configures the adapter to receive and transmit Jumbo Frames.
.Pp
For more information on configuring this device, see
.Xr ifconfig 8 .
.Sh LOADER TUNABLES
Tunables can be set at the
.Xr loader 8
prompt before booting the kernel or stored in
.Xr loader.conf 5 .
.Bl -tag -width "xxxxxx"
.It Va hw.vtnet.csum_disable
This tunable disables receive and send checksum offload. The default
value is 0.
.It Va hw.vtnet.tso_disable
This tunable disables TSO. The default value is 0.
.It Va hw.vtnet.lro_disable
This tunable disables LRO. The default value is 0.
.El
.Sh SEE ALSO
.Xr arp 4 ,
.Xr netintro 4 ,
.Xr ng_ether 4 ,
.Xr vlan 4 ,
.Xr virtio 4 ,
.Xr ifconfig 8
.Sh HISTORY
The
.Nm
driver was written by
.An Bryan Venteicher Aq bryanv@daemoninthecloset.org .
It first appeared in
.Fx 9.0 .
.Sh CAVEATS
The
.Nm
driver only supports LRO when the hypervisor advertises the
mergeable buffer feature.

@ -0,0 +1,569 @@
/*-
* Copyright (c) 2011, Bryan Venteicher <bryanv@daemoninthecloset.org>
* All rights reserved.
*
* Redistribution and use in source and binary forms, with or without
* modification, are permitted provided that the following conditions
* are met:
* 1. Redistributions of source code must retain the above copyright
* notice unmodified, this list of conditions, and the following
* disclaimer.
* 2. Redistributions in binary form must reproduce the above copyright
* notice, this list of conditions and the following disclaimer in the
* documentation and/or other materials provided with the distribution.
*
* THIS SOFTWARE IS PROVIDED BY THE AUTHOR ``AS IS'' AND ANY EXPRESS OR
* IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES
* OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED.
* IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR ANY DIRECT, INDIRECT,
* INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT
* NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
* DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
* THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
* (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF
* THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
*/
/* Driver for VirtIO memory balloon devices. */
#include <sys/cdefs.h>
__FBSDID("$FreeBSD$");
#include <sys/param.h>
#include <sys/systm.h>
#include <sys/kernel.h>
#include <sys/endian.h>
#include <sys/kthread.h>
#include <sys/malloc.h>
#include <sys/module.h>
#include <sys/sglist.h>
#include <sys/sysctl.h>
#include <sys/lock.h>
#include <sys/mutex.h>
#include <sys/queue.h>
#include <vm/vm.h>
#include <vm/vm_page.h>
#include <machine/bus.h>
#include <machine/resource.h>
#include <sys/bus.h>
#include <sys/rman.h>
#include <dev/virtio/virtio.h>
#include <dev/virtio/virtqueue.h>
#include <dev/virtio/balloon/virtio_balloon.h>
#include "virtio_if.h"
struct vtballoon_softc {
device_t vtballoon_dev;
struct mtx vtballoon_mtx;
uint64_t vtballoon_features;
uint32_t vtballoon_flags;
#define VTBALLOON_FLAG_DETACH 0x01
struct virtqueue *vtballoon_inflate_vq;
struct virtqueue *vtballoon_deflate_vq;
uint32_t vtballoon_desired_npages;
uint32_t vtballoon_current_npages;
TAILQ_HEAD(,vm_page) vtballoon_pages;
struct proc *vtballoon_kproc;
uint32_t *vtballoon_page_frames;
int vtballoon_timeout;
};
static struct virtio_feature_desc vtballoon_feature_desc[] = {
{ VIRTIO_BALLOON_F_MUST_TELL_HOST, "MustTellHost" },
{ VIRTIO_BALLOON_F_STATS_VQ, "StatsVq" },
{ 0, NULL }
};
static int vtballoon_probe(device_t);
static int vtballoon_attach(device_t);
static int vtballoon_detach(device_t);
static int vtballoon_config_change(device_t);
static void vtballoon_negotiate_features(struct vtballoon_softc *);
static int vtballoon_alloc_virtqueues(struct vtballoon_softc *);
static int vtballoon_vq_intr(void *);
static void vtballoon_inflate(struct vtballoon_softc *, int);
static void vtballoon_deflate(struct vtballoon_softc *, int);
static void vtballoon_send_page_frames(struct vtballoon_softc *,
struct virtqueue *, int);
static void vtballoon_pop(struct vtballoon_softc *);
static void vtballoon_stop(struct vtballoon_softc *);
static vm_page_t
vtballoon_alloc_page(struct vtballoon_softc *);
static void vtballoon_free_page(struct vtballoon_softc *, vm_page_t);
static int vtballoon_sleep(struct vtballoon_softc *);
static void vtballoon_thread(void *);
static void vtballoon_add_sysctl(struct vtballoon_softc *);
/* Features desired/implemented by this driver. */
#define VTBALLOON_FEATURES 0
/* Timeout between retries when the balloon needs inflating. */
#define VTBALLOON_LOWMEM_TIMEOUT hz
/*
* Maximum number of pages we'll request to inflate or deflate
* the balloon in one virtqueue request. Both Linux and NetBSD
* have settled on 256, doing up to 1MB at a time.
*/
#define VTBALLOON_PAGES_PER_REQUEST 256
#define VTBALLOON_MTX(_sc) &(_sc)->vtballoon_mtx
#define VTBALLOON_LOCK_INIT(_sc, _name) mtx_init(VTBALLOON_MTX((_sc)), _name, \
"VirtIO Balloon Lock", MTX_SPIN)
#define VTBALLOON_LOCK(_sc) mtx_lock_spin(VTBALLOON_MTX((_sc)))
#define VTBALLOON_UNLOCK(_sc) mtx_unlock_spin(VTBALLOON_MTX((_sc)))
#define VTBALLOON_LOCK_DESTROY(_sc) mtx_destroy(VTBALLOON_MTX((_sc)))
static device_method_t vtballoon_methods[] = {
/* Device methods. */
DEVMETHOD(device_probe, vtballoon_probe),
DEVMETHOD(device_attach, vtballoon_attach),
DEVMETHOD(device_detach, vtballoon_detach),
/* VirtIO methods. */
DEVMETHOD(virtio_config_change, vtballoon_config_change),
{ 0, 0 }
};
static driver_t vtballoon_driver = {
"vtballoon",
vtballoon_methods,
sizeof(struct vtballoon_softc)
};
static devclass_t vtballoon_devclass;
DRIVER_MODULE(virtio_balloon, virtio_pci, vtballoon_driver,
vtballoon_devclass, 0, 0);
MODULE_VERSION(virtio_balloon, 1);
MODULE_DEPEND(virtio_balloon, virtio, 1, 1, 1);
static int
vtballoon_probe(device_t dev)
{
if (virtio_get_device_type(dev) != VIRTIO_ID_BALLOON)
return (ENXIO);
device_set_desc(dev, "VirtIO Balloon Adapter");
return (BUS_PROBE_DEFAULT);
}
static int
vtballoon_attach(device_t dev)
{
struct vtballoon_softc *sc;
int error;
sc = device_get_softc(dev);
sc->vtballoon_dev = dev;
VTBALLOON_LOCK_INIT(sc, device_get_nameunit(dev));
TAILQ_INIT(&sc->vtballoon_pages);
vtballoon_add_sysctl(sc);
virtio_set_feature_desc(dev, vtballoon_feature_desc);
vtballoon_negotiate_features(sc);
sc->vtballoon_page_frames = malloc(VTBALLOON_PAGES_PER_REQUEST *
sizeof(uint32_t), M_DEVBUF, M_NOWAIT | M_ZERO);
if (sc->vtballoon_page_frames == NULL) {
error = ENOMEM;
device_printf(dev,
"cannot allocate page frame request array\n");
goto fail;
}
error = vtballoon_alloc_virtqueues(sc);
if (error) {
device_printf(dev, "cannot allocate virtqueues\n");
goto fail;
}
error = virtio_setup_intr(dev, INTR_TYPE_MISC);
if (error) {
device_printf(dev, "cannot setup virtqueue interrupts\n");
goto fail;
}
error = kproc_create(vtballoon_thread, sc, &sc->vtballoon_kproc,
0, 0, "virtio_balloon");
if (error) {
device_printf(dev, "cannot create balloon kproc\n");
goto fail;
}
virtqueue_enable_intr(sc->vtballoon_inflate_vq);
virtqueue_enable_intr(sc->vtballoon_deflate_vq);
fail:
if (error)
vtballoon_detach(dev);
return (error);
}
static int
vtballoon_detach(device_t dev)
{
struct vtballoon_softc *sc;
sc = device_get_softc(dev);
if (sc->vtballoon_kproc != NULL) {
VTBALLOON_LOCK(sc);
sc->vtballoon_flags |= VTBALLOON_FLAG_DETACH;
wakeup_one(sc);
msleep_spin(sc->vtballoon_kproc, VTBALLOON_MTX(sc),
"vtbdth", 0);
VTBALLOON_UNLOCK(sc);
sc->vtballoon_kproc = NULL;
}
if (device_is_attached(dev)) {
vtballoon_pop(sc);
vtballoon_stop(sc);
}
if (sc->vtballoon_page_frames != NULL) {
free(sc->vtballoon_page_frames, M_DEVBUF);
sc->vtballoon_page_frames = NULL;
}
VTBALLOON_LOCK_DESTROY(sc);
return (0);
}
static int
vtballoon_config_change(device_t dev)
{
struct vtballoon_softc *sc;
sc = device_get_softc(dev);
VTBALLOON_LOCK(sc);
wakeup_one(sc);
VTBALLOON_UNLOCK(sc);
return (1);
}
static void
vtballoon_negotiate_features(struct vtballoon_softc *sc)
{
device_t dev;
uint64_t features;
dev = sc->vtballoon_dev;
features = virtio_negotiate_features(dev, VTBALLOON_FEATURES);
sc->vtballoon_features = features;
}
static int
vtballoon_alloc_virtqueues(struct vtballoon_softc *sc)
{
device_t dev;
struct vq_alloc_info vq_info[2];
int nvqs;
dev = sc->vtballoon_dev;
nvqs = 2;
VQ_ALLOC_INFO_INIT(&vq_info[0], 0, vtballoon_vq_intr, sc,
&sc->vtballoon_inflate_vq, "%s inflate", device_get_nameunit(dev));
VQ_ALLOC_INFO_INIT(&vq_info[1], 0, vtballoon_vq_intr, sc,
&sc->vtballoon_deflate_vq, "%s deflate", device_get_nameunit(dev));
return (virtio_alloc_virtqueues(dev, 0, nvqs, vq_info));
}
static int
vtballoon_vq_intr(void *xsc)
{
struct vtballoon_softc *sc;
sc = xsc;
VTBALLOON_LOCK(sc);
wakeup_one(sc);
VTBALLOON_UNLOCK(sc);
return (1);
}
static void
vtballoon_inflate(struct vtballoon_softc *sc, int npages)
{
struct virtqueue *vq;
vm_page_t m;
int i;
vq = sc->vtballoon_inflate_vq;
m = NULL;
if (npages > VTBALLOON_PAGES_PER_REQUEST)
npages = VTBALLOON_PAGES_PER_REQUEST;
KASSERT(npages > 0, ("balloon doesn't need inflating?"));
for (i = 0; i < npages; i++) {
if ((m = vtballoon_alloc_page(sc)) == NULL)
break;
sc->vtballoon_page_frames[i] =
VM_PAGE_TO_PHYS(m) >> VIRTIO_BALLOON_PFN_SHIFT;
KASSERT(m->queue == PQ_NONE, ("allocated page on queue"));
TAILQ_INSERT_TAIL(&sc->vtballoon_pages, m, pageq);
}
if (i > 0)
vtballoon_send_page_frames(sc, vq, i);
if (m == NULL)
sc->vtballoon_timeout = VTBALLOON_LOWMEM_TIMEOUT;
}
static void
vtballoon_deflate(struct vtballoon_softc *sc, int npages)
{
TAILQ_HEAD(, vm_page) free_pages;
struct virtqueue *vq;
vm_page_t m;
int i;
vq = sc->vtballoon_deflate_vq;
TAILQ_INIT(&free_pages);
if (npages > VTBALLOON_PAGES_PER_REQUEST)
npages = VTBALLOON_PAGES_PER_REQUEST;
KASSERT(npages > 0, ("balloon doesn't need deflating?"));
for (i = 0; i < npages; i++) {
m = TAILQ_FIRST(&sc->vtballoon_pages);
KASSERT(m != NULL, ("no more pages to deflate"));
sc->vtballoon_page_frames[i] =
VM_PAGE_TO_PHYS(m) >> VIRTIO_BALLOON_PFN_SHIFT;
TAILQ_REMOVE(&sc->vtballoon_pages, m, pageq);
TAILQ_INSERT_TAIL(&free_pages, m, pageq);
}
if (i > 0) {
/* Always tell host first before freeing the pages. */
vtballoon_send_page_frames(sc, vq, i);
while ((m = TAILQ_FIRST(&free_pages)) != NULL) {
TAILQ_REMOVE(&free_pages, m, pageq);
vtballoon_free_page(sc, m);
}
}
KASSERT((TAILQ_EMPTY(&sc->vtballoon_pages) &&
sc->vtballoon_current_npages == 0) ||
(!TAILQ_EMPTY(&sc->vtballoon_pages) &&
sc->vtballoon_current_npages != 0), ("balloon empty?"));
}
static void
vtballoon_send_page_frames(struct vtballoon_softc *sc, struct virtqueue *vq,
int npages)
{
struct sglist sg;
struct sglist_seg segs[1];
void *c;
int error;
sglist_init(&sg, 1, segs);
error = sglist_append(&sg, sc->vtballoon_page_frames,
npages * sizeof(uint32_t));
KASSERT(error == 0, ("error adding page frames to sglist"));
error = virtqueue_enqueue(vq, vq, &sg, 1, 0);
KASSERT(error == 0, ("error enqueuing page frames to virtqueue"));
/*
* Inflate and deflate operations are done synchronously. The
* interrupt handler will wake us up.
*/
VTBALLOON_LOCK(sc);
virtqueue_notify(vq);
while ((c = virtqueue_dequeue(vq, NULL)) == NULL)
msleep_spin(sc, VTBALLOON_MTX(sc), "vtbspf", 0);
VTBALLOON_UNLOCK(sc);
KASSERT(c == vq, ("unexpected balloon operation response"));
}
static void
vtballoon_pop(struct vtballoon_softc *sc)
{
while (!TAILQ_EMPTY(&sc->vtballoon_pages))
vtballoon_deflate(sc, sc->vtballoon_current_npages);
}
static void
vtballoon_stop(struct vtballoon_softc *sc)
{
virtqueue_disable_intr(sc->vtballoon_inflate_vq);
virtqueue_disable_intr(sc->vtballoon_deflate_vq);
virtio_stop(sc->vtballoon_dev);
}
static vm_page_t
vtballoon_alloc_page(struct vtballoon_softc *sc)
{
vm_page_t m;
m = vm_page_alloc(NULL, 0, VM_ALLOC_NORMAL | VM_ALLOC_WIRED |
VM_ALLOC_NOOBJ);
if (m != NULL)
sc->vtballoon_current_npages++;
return (m);
}
static void
vtballoon_free_page(struct vtballoon_softc *sc, vm_page_t m)
{
vm_page_unwire(m, 0);
vm_page_free(m);
sc->vtballoon_current_npages--;
}
static uint32_t
vtballoon_desired_size(struct vtballoon_softc *sc)
{
uint32_t desired;
desired = virtio_read_dev_config_4(sc->vtballoon_dev,
offsetof(struct virtio_balloon_config, num_pages));
return (le32toh(desired));
}
static void
vtballoon_update_size(struct vtballoon_softc *sc)
{
virtio_write_dev_config_4(sc->vtballoon_dev,
offsetof(struct virtio_balloon_config, actual),
htole32(sc->vtballoon_current_npages));
}
static int
vtballoon_sleep(struct vtballoon_softc *sc)
{
int rc, timeout;
uint32_t current, desired;
rc = 0;
current = sc->vtballoon_current_npages;
VTBALLOON_LOCK(sc);
for (;;) {
if (sc->vtballoon_flags & VTBALLOON_FLAG_DETACH) {
rc = 1;
break;
}
desired = vtballoon_desired_size(sc);
sc->vtballoon_desired_npages = desired;
/*
* If given, use non-zero timeout on the first time through
* the loop. On subsequent times, timeout will be zero so
* we will reevaluate the desired size of the balloon and
* break out to retry if needed.
*/
timeout = sc->vtballoon_timeout;
sc->vtballoon_timeout = 0;
if (current > desired)
break;
if (current < desired && timeout == 0)
break;
msleep_spin(sc, VTBALLOON_MTX(sc), "vtbslp", timeout);
}
VTBALLOON_UNLOCK(sc);
return (rc);
}
static void
vtballoon_thread(void *xsc)
{
struct vtballoon_softc *sc;
uint32_t current, desired;
sc = xsc;
for (;;) {
if (vtballoon_sleep(sc) != 0)
break;
current = sc->vtballoon_current_npages;
desired = sc->vtballoon_desired_npages;
if (desired != current) {
if (desired > current)
vtballoon_inflate(sc, desired - current);
else
vtballoon_deflate(sc, current - desired);
vtballoon_update_size(sc);
}
}
kproc_exit(0);
}
static void
vtballoon_add_sysctl(struct vtballoon_softc *sc)
{
device_t dev;
struct sysctl_ctx_list *ctx;
struct sysctl_oid *tree;
struct sysctl_oid_list *child;
dev = sc->vtballoon_dev;
ctx = device_get_sysctl_ctx(dev);
tree = device_get_sysctl_tree(dev);
child = SYSCTL_CHILDREN(tree);
SYSCTL_ADD_UINT(ctx, child, OID_AUTO, "desired",
CTLFLAG_RD, &sc->vtballoon_desired_npages, sizeof(uint32_t),
"Desired balloon size in pages");
SYSCTL_ADD_UINT(ctx, child, OID_AUTO, "current",
CTLFLAG_RD, &sc->vtballoon_current_npages, sizeof(uint32_t),
"Current balloon size in pages");
}
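
Reviewer note on the balloon driver above: the sizing decisions made by vtballoon_thread(), vtballoon_inflate() and vtballoon_deflate() can be summarised in a small user-space sketch. This is not kernel code and is not part of the commit. The helper names balloon_step_pages() and phys_to_balloon_pfn() are hypothetical, and the range assertion is an addition for illustration. Only the two constants are taken from the sources in this diff.

```c
#include <assert.h>
#include <stdint.h>

/* Values copied from the driver and from virtio_balloon.h in this diff. */
#define VTBALLOON_PAGES_PER_REQUEST 256
#define VIRTIO_BALLOON_PFN_SHIFT    12

/*
 * Hypothetical helper: signed number of pages to move in one request.
 * Positive means inflate, negative means deflate, zero means idle.
 */
static int
balloon_step_pages(uint32_t current, uint32_t desired)
{
    uint32_t delta;

    if (desired == current)
        return (0);
    delta = desired > current ? desired - current : current - desired;
    /* The driver clamps every request to the per-request maximum. */
    if (delta > VTBALLOON_PAGES_PER_REQUEST)
        delta = VTBALLOON_PAGES_PER_REQUEST;
    return (desired > current ? (int)delta : -(int)delta);
}

/* Hypothetical helper: physical address to the 32-bit PFN sent to the host. */
static uint32_t
phys_to_balloon_pfn(uint64_t paddr)
{
    /* Added check (not in the driver): the PFN must fit in 32 bits. */
    assert((paddr >> VIRTIO_BALLOON_PFN_SHIFT) <= UINT32_MAX);
    return ((uint32_t)(paddr >> VIRTIO_BALLOON_PFN_SHIFT));
}
```

With a current size of 0 pages and a desired size of 1000 pages, the first step moves 256 pages, which is 1MB with 4KB pages and matches the comment above VTBALLOON_PAGES_PER_REQUEST.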

@ -0,0 +1,41 @@
/*
* This header is BSD licensed so anyone can use the definitions to implement
* compatible drivers/servers.
*
* $FreeBSD$
*/
#ifndef _VIRTIO_BALLOON_H
#define _VIRTIO_BALLOON_H
#include <sys/types.h>
/* Feature bits. */
#define VIRTIO_BALLOON_F_MUST_TELL_HOST 0x1 /* Tell before reclaiming pages */
#define VIRTIO_BALLOON_F_STATS_VQ 0x2 /* Memory stats virtqueue */
/* Size of a PFN in the balloon interface. */
#define VIRTIO_BALLOON_PFN_SHIFT 12
struct virtio_balloon_config {
/* Number of pages host wants Guest to give up. */
uint32_t num_pages;
/* Number of pages we've actually got in balloon. */
uint32_t actual;
};
#define VIRTIO_BALLOON_S_SWAP_IN 0 /* Amount of memory swapped in */
#define VIRTIO_BALLOON_S_SWAP_OUT 1 /* Amount of memory swapped out */
#define VIRTIO_BALLOON_S_MAJFLT 2 /* Number of major faults */
#define VIRTIO_BALLOON_S_MINFLT 3 /* Number of minor faults */
#define VIRTIO_BALLOON_S_MEMFREE 4 /* Total amount of free memory */
#define VIRTIO_BALLOON_S_MEMTOT 5 /* Total amount of memory */
#define VIRTIO_BALLOON_S_NR 6
struct virtio_balloon_stat {
uint16_t tag;
uint64_t val;
} __packed;
#endif /* _VIRTIO_BALLOON_H */
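
Reviewer note on virtio_balloon.h above: because struct virtio_balloon_stat is declared __packed, the 64-bit value follows the 16-bit tag with no padding. The user-space sketch below checks that layout at compile time. It assumes a GCC or Clang style packed attribute in place of the kernel's __packed macro, and balloon_pages_to_bytes() is a hypothetical helper added only to show the effect of the PFN shift.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define VIRTIO_BALLOON_PFN_SHIFT 12

/* User-space copy of the structure; __packed is spelled out explicitly. */
struct virtio_balloon_stat {
    uint16_t tag;
    uint64_t val;
} __attribute__((packed));

/* Packed layout: 2-byte tag immediately followed by the 8-byte value. */
static_assert(sizeof(struct virtio_balloon_stat) == 10, "unexpected padding");
static_assert(offsetof(struct virtio_balloon_stat, val) == 2, "val misplaced");

/* Hypothetical helper: balloon pages to bytes (4KB balloon page size). */
static uint64_t
balloon_pages_to_bytes(uint32_t npages)
{
    return ((uint64_t)npages << VIRTIO_BALLOON_PFN_SHIFT);
}
```

Without the packed attribute a typical 64-bit ABI would pad the structure to 16 bytes, so the two static assertions would fail to compile.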

File diff suppressed because it is too large.

@ -0,0 +1,106 @@
/*
* This header is BSD licensed so anyone can use the definitions to implement
* compatible drivers/servers.
*
* $FreeBSD$
*/
#ifndef _VIRTIO_BLK_H
#define _VIRTIO_BLK_H
#include <sys/types.h>
/* Feature bits */
#define VIRTIO_BLK_F_BARRIER 0x0001 /* Does host support barriers? */
#define VIRTIO_BLK_F_SIZE_MAX 0x0002 /* Indicates maximum segment size */
#define VIRTIO_BLK_F_SEG_MAX 0x0004 /* Indicates maximum # of segments */
#define VIRTIO_BLK_F_GEOMETRY 0x0010 /* Legacy geometry available */
#define VIRTIO_BLK_F_RO 0x0020 /* Disk is read-only */
#define VIRTIO_BLK_F_BLK_SIZE 0x0040 /* Block size of disk is available*/
#define VIRTIO_BLK_F_SCSI 0x0080 /* Supports scsi command passthru */
#define VIRTIO_BLK_F_FLUSH 0x0200 /* Cache flush command support */
#define VIRTIO_BLK_F_TOPOLOGY 0x0400 /* Topology information is available */
#define VIRTIO_BLK_ID_BYTES 20 /* ID string length */
struct virtio_blk_config {
/* The capacity (in 512-byte sectors). */
uint64_t capacity;
/* The maximum segment size (if VIRTIO_BLK_F_SIZE_MAX) */
uint32_t size_max;
/* The maximum number of segments (if VIRTIO_BLK_F_SEG_MAX) */
uint32_t seg_max;
/* geometry of the device (if VIRTIO_BLK_F_GEOMETRY) */
struct virtio_blk_geometry {
uint16_t cylinders;
uint8_t heads;
uint8_t sectors;
} geometry;
/* block size of device (if VIRTIO_BLK_F_BLK_SIZE) */
uint32_t blk_size;
/* the next 4 entries are guarded by VIRTIO_BLK_F_TOPOLOGY */
/* exponent for physical block per logical block. */
uint8_t physical_block_exp;
/* alignment offset in logical blocks. */
uint8_t alignment_offset;
/* minimum I/O size without performance penalty in logical blocks. */
uint16_t min_io_size;
/* optimal sustained I/O size in logical blocks. */
uint32_t opt_io_size;
} __packed;
/*
* Command types
*
* Usage is a bit tricky as some bits are used as flags and some are not.
*
* Rules:
* VIRTIO_BLK_T_OUT may be combined with VIRTIO_BLK_T_SCSI_CMD or
* VIRTIO_BLK_T_BARRIER. VIRTIO_BLK_T_FLUSH is a command of its own
* and may not be combined with any of the other flags.
*/
/* These two define direction. */
#define VIRTIO_BLK_T_IN 0
#define VIRTIO_BLK_T_OUT 1
/* This bit says it's a scsi command, not an actual read or write. */
#define VIRTIO_BLK_T_SCSI_CMD 2
/* Cache flush command */
#define VIRTIO_BLK_T_FLUSH 4
/* Get device ID command */
#define VIRTIO_BLK_T_GET_ID 8
/* Barrier before this op. */
#define VIRTIO_BLK_T_BARRIER 0x80000000
/* ID string length */
#define VIRTIO_BLK_ID_BYTES 20
/* This is the first element of the read scatter-gather list. */
struct virtio_blk_outhdr {
/* VIRTIO_BLK_T* */
uint32_t type;
/* io priority. */
uint32_t ioprio;
/* Sector (ie. 512 byte offset) */
uint64_t sector;
};
struct virtio_scsi_inhdr {
uint32_t errors;
uint32_t data_len;
uint32_t sense_len;
uint32_t residual;
};
/* And this is the final byte of the write scatter-gather list. */
#define VIRTIO_BLK_S_OK 0
#define VIRTIO_BLK_S_IOERR 1
#define VIRTIO_BLK_S_UNSUPP 2
#endif /* _VIRTIO_BLK_H */
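
Reviewer note on virtio_blk.h above: the command-type rules in the header comment can be illustrated with a user-space sketch. The constants and struct virtio_blk_outhdr are copied from the header. The helpers blk_make_write_hdr() and blk_type_flush_ok() are hypothetical names introduced here, and they encode only the rules that the comment states (OUT may carry the BARRIER flag, FLUSH must stand alone). How the real block driver builds its requests is not shown in this diff.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Command types copied from virtio_blk.h in this diff. */
#define VIRTIO_BLK_T_IN       0
#define VIRTIO_BLK_T_OUT      1
#define VIRTIO_BLK_T_SCSI_CMD 2
#define VIRTIO_BLK_T_FLUSH    4
#define VIRTIO_BLK_T_GET_ID   8
#define VIRTIO_BLK_T_BARRIER  0x80000000

struct virtio_blk_outhdr {
    uint32_t type;    /* VIRTIO_BLK_T* */
    uint32_t ioprio;  /* I/O priority */
    uint64_t sector;  /* offset in 512-byte sectors */
};

/* Hypothetical helper: header for a write at a sector-aligned byte offset. */
static struct virtio_blk_outhdr
blk_make_write_hdr(uint64_t byte_offset, bool barrier)
{
    struct virtio_blk_outhdr hdr;

    assert(byte_offset % 512 == 0);
    hdr.type = VIRTIO_BLK_T_OUT;
    if (barrier)
        hdr.type |= VIRTIO_BLK_T_BARRIER;
    hdr.ioprio = 0;
    hdr.sector = byte_offset / 512;
    return (hdr);
}

/* Hypothetical helper: FLUSH may not be combined with any other bit. */
static bool
blk_type_flush_ok(uint32_t type)
{
    if ((type & VIRTIO_BLK_T_FLUSH) == 0)
        return (true);
    return (type == VIRTIO_BLK_T_FLUSH);
}
```

A write at byte offset 4096 therefore targets sector 8, and requesting a barrier sets the top bit of the type field alongside VIRTIO_BLK_T_OUT.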

File diff suppressed because it is too large.

@ -0,0 +1,240 @@
/*-
* Copyright (c) 2011, Bryan Venteicher <bryanv@daemoninthecloset.org>
* All rights reserved.
*
* Redistribution and use in source and binary forms, with or without
* modification, are permitted provided that the following conditions
* are met:
* 1. Redistributions of source code must retain the above copyright
* notice unmodified, this list of conditions, and the following
* disclaimer.
* 2. Redistributions in binary form must reproduce the above copyright
* notice, this list of conditions and the following disclaimer in the
* documentation and/or other materials provided with the distribution.
*
* THIS SOFTWARE IS PROVIDED BY THE AUTHOR ``AS IS'' AND ANY EXPRESS OR
* IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES
* OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED.
* IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR ANY DIRECT, INDIRECT,
* INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT
* NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
* DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
* THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
* (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF
* THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
*
* $FreeBSD$
*/
#ifndef _IF_VTNETVAR_H
#define _IF_VTNETVAR_H
struct vtnet_statistics {
unsigned long mbuf_alloc_failed;
unsigned long rx_frame_too_large;
unsigned long rx_enq_replacement_failed;
unsigned long rx_mergeable_failed;
unsigned long rx_csum_bad_ethtype;
unsigned long rx_csum_bad_start;
unsigned long rx_csum_bad_ipproto;
unsigned long rx_csum_bad_offset;
unsigned long rx_csum_failed;
unsigned long rx_csum_offloaded;
unsigned long rx_task_rescheduled;
unsigned long tx_csum_offloaded;
unsigned long tx_tso_offloaded;
unsigned long tx_csum_bad_ethtype;
unsigned long tx_tso_bad_ethtype;
unsigned long tx_task_rescheduled;
};
struct vtnet_softc {
device_t vtnet_dev;
struct ifnet *vtnet_ifp;
struct mtx vtnet_mtx;
uint32_t vtnet_flags;
#define VTNET_FLAG_LINK 0x0001
#define VTNET_FLAG_SUSPENDED 0x0002
#define VTNET_FLAG_CTRL_VQ 0x0004
#define VTNET_FLAG_CTRL_RX 0x0008
#define VTNET_FLAG_VLAN_FILTER 0x0010
#define VTNET_FLAG_TSO_ECN 0x0020
#define VTNET_FLAG_MRG_RXBUFS 0x0040
#define VTNET_FLAG_LRO_NOMRG 0x0080
struct virtqueue *vtnet_rx_vq;
struct virtqueue *vtnet_tx_vq;
struct virtqueue *vtnet_ctrl_vq;
int vtnet_hdr_size;
int vtnet_tx_size;
int vtnet_rx_size;
int vtnet_rx_process_limit;
int vtnet_rx_mbuf_size;
int vtnet_rx_mbuf_count;
int vtnet_if_flags;
int vtnet_watchdog_timer;
uint64_t vtnet_features;
struct taskqueue *vtnet_tq;
struct task vtnet_rx_intr_task;
struct task vtnet_tx_intr_task;
struct task vtnet_cfgchg_task;
struct vtnet_statistics vtnet_stats;
struct callout vtnet_tick_ch;
eventhandler_tag vtnet_vlan_attach;
eventhandler_tag vtnet_vlan_detach;
struct ifmedia vtnet_media;
/*
* Fake media type; the host does not provide us with
* any real media information.
*/
#define VTNET_MEDIATYPE (IFM_ETHER | IFM_1000_T | IFM_FDX)
char vtnet_hwaddr[ETHER_ADDR_LEN];
/*
* During reset, the host's VLAN filtering table is lost. The
* array below is used to restore all the VLANs configured on
* this interface after a reset.
*/
#define VTNET_VLAN_SHADOW_SIZE (4096 / 32)
int vtnet_nvlans;
uint32_t vtnet_vlan_shadow[VTNET_VLAN_SHADOW_SIZE];
char vtnet_mtx_name[16];
};
/*
* When mergeable buffers are not negotiated, the vtnet_rx_header structure
* below is placed at the beginning of the mbuf data. The 4 bytes of padding
* both keep the VirtIO header and the data non-contiguous and keep the
* frame's payload 4-byte aligned.
*
* When mergeable buffers are negotiated, the host puts the VirtIO header in
* the beginning of the first mbuf's data.
*/
#define VTNET_RX_HEADER_PAD 4
struct vtnet_rx_header {
struct virtio_net_hdr vrh_hdr;
char vrh_pad[VTNET_RX_HEADER_PAD];
} __packed;
/*
* For each outgoing frame, the vtnet_tx_header below is allocated from
* the vtnet_tx_header_zone.
*/
struct vtnet_tx_header {
union {
struct virtio_net_hdr hdr;
struct virtio_net_hdr_mrg_rxbuf mhdr;
} vth_uhdr;
struct mbuf *vth_mbuf;
};
/*
* The VirtIO specification does not place a limit on the number of MAC
* addresses the guest driver may request to be filtered. In practice,
* the host is constrained by available resources. To simplify this driver,
* impose a reasonably high limit of MAC addresses we will filter before
* falling back to promiscuous or all-multicast modes.
*/
#define VTNET_MAX_MAC_ENTRIES 128
struct vtnet_mac_table {
uint32_t nentries;
uint8_t macs[VTNET_MAX_MAC_ENTRIES][ETHER_ADDR_LEN];
} __packed;
struct vtnet_mac_filter {
struct vtnet_mac_table vmf_unicast;
uint32_t vmf_pad; /* Make tables non-contiguous. */
struct vtnet_mac_table vmf_multicast;
};
/*
* The MAC filter table is malloc(9)'d when needed. Ensure it will
* always fit in one segment.
*/
CTASSERT(sizeof(struct vtnet_mac_filter) <= PAGE_SIZE);
#define VTNET_WATCHDOG_TIMEOUT 5
#define VTNET_CSUM_OFFLOAD (CSUM_TCP | CSUM_UDP | CSUM_SCTP)
/* Features desired/implemented by this driver. */
#define VTNET_FEATURES \
(VIRTIO_NET_F_MAC | \
VIRTIO_NET_F_STATUS | \
VIRTIO_NET_F_CTRL_VQ | \
VIRTIO_NET_F_CTRL_RX | \
VIRTIO_NET_F_CTRL_VLAN | \
VIRTIO_NET_F_CSUM | \
VIRTIO_NET_F_HOST_TSO4 | \
VIRTIO_NET_F_HOST_TSO6 | \
VIRTIO_NET_F_HOST_ECN | \
VIRTIO_NET_F_GUEST_CSUM | \
VIRTIO_NET_F_GUEST_TSO4 | \
VIRTIO_NET_F_GUEST_TSO6 | \
VIRTIO_NET_F_GUEST_ECN | \
VIRTIO_NET_F_MRG_RXBUF | \
VIRTIO_RING_F_INDIRECT_DESC)
/*
* The VIRTIO_NET_F_GUEST_TSO[46] features permit the host to send us
* frames larger than 1514 bytes. We do not yet support software LRO
* via tcp_lro_rx().
*/
#define VTNET_LRO_FEATURES (VIRTIO_NET_F_GUEST_TSO4 | \
VIRTIO_NET_F_GUEST_TSO6 | VIRTIO_NET_F_GUEST_ECN)
#define VTNET_MAX_MTU 65536
#define VTNET_MAX_RX_SIZE 65550
/*
* Used to preallocate the Vq indirect descriptors. The first segment
* is reserved for the header.
*/
#define VTNET_MIN_RX_SEGS 2
#define VTNET_MAX_RX_SEGS 34
#define VTNET_MAX_TX_SEGS 34
/*
* Assert we can receive and transmit the maximum with regular
* size clusters.
*/
CTASSERT(((VTNET_MAX_RX_SEGS - 1) * MCLBYTES) >= VTNET_MAX_RX_SIZE);
CTASSERT(((VTNET_MAX_TX_SEGS - 1) * MCLBYTES) >= VTNET_MAX_MTU);
/*
* Determine how many mbufs are in each receive buffer. For LRO without
* mergeable descriptors, we must allocate an mbuf chain large enough to
* hold both the vtnet_rx_header and the maximum receivable data.
*/
#define VTNET_NEEDED_RX_MBUFS(_sc) \
(((_sc)->vtnet_flags & VTNET_FLAG_LRO_NOMRG) == 0 ? 1 : \
howmany(sizeof(struct vtnet_rx_header) + VTNET_MAX_RX_SIZE, \
(_sc)->vtnet_rx_mbuf_size))
#define VTNET_MTX(_sc) (&(_sc)->vtnet_mtx)
#define VTNET_LOCK(_sc) mtx_lock(VTNET_MTX((_sc)))
#define VTNET_UNLOCK(_sc) mtx_unlock(VTNET_MTX((_sc)))
#define VTNET_LOCK_DESTROY(_sc) mtx_destroy(VTNET_MTX((_sc)))
#define VTNET_LOCK_ASSERT(_sc) mtx_assert(VTNET_MTX((_sc)), MA_OWNED)
#define VTNET_LOCK_ASSERT_NOTOWNED(_sc) \
mtx_assert(VTNET_MTX((_sc)), MA_NOTOWNED)
#define VTNET_LOCK_INIT(_sc) do { \
snprintf((_sc)->vtnet_mtx_name, sizeof((_sc)->vtnet_mtx_name), \
"%s", device_get_nameunit((_sc)->vtnet_dev)); \
mtx_init(VTNET_MTX((_sc)), (_sc)->vtnet_mtx_name, \
"VTNET Core Lock", MTX_DEF); \
} while (0)
#endif /* _IF_VTNETVAR_H */
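
The VLAN shadow array in struct vtnet_softc holds 4096 / 32 words, which is one bit per possible VLAN ID. The following is a minimal user-space sketch of how such a bitmap could be maintained and replayed after a reset. The bit layout (word = tag / 32, bit = tag % 32) and all helper names are assumptions made for illustration; the driver's own code is in if_vtnet.c, whose diff is not shown on this page.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define VLAN_SHADOW_SIZE (4096 / 32)	/* mirrors VTNET_VLAN_SHADOW_SIZE */

/* Hypothetical stand-in for the two VLAN fields of struct vtnet_softc. */
struct vlan_shadow {
	int nvlans;
	uint32_t words[VLAN_SHADOW_SIZE];
};

/* Record a VLAN ID. Assumed layout: word = tag / 32, bit = tag % 32. */
static void
vlan_shadow_add(struct vlan_shadow *vs, uint16_t tag)
{
	uint32_t bit = 1U << (tag % 32);

	assert(tag < 4096);
	if ((vs->words[tag / 32] & bit) == 0) {
		vs->words[tag / 32] |= bit;
		vs->nvlans++;
	}
}

static int
vlan_shadow_isset(const struct vlan_shadow *vs, uint16_t tag)
{
	assert(tag < 4096);
	return ((int)((vs->words[tag / 32] >> (tag % 32)) & 1U));
}

/*
 * Walk the bitmap and invoke add() once per recorded VLAN ID, as a
 * reset path might do to reprogram the host. Returns the number of
 * IDs visited; add may be NULL to only count.
 */
static int
vlan_shadow_replay(const struct vlan_shadow *vs, void (*add)(uint16_t))
{
	int count = 0;

	for (int tag = 0; tag < 4096; tag++) {
		if (!vlan_shadow_isset(vs, (uint16_t)tag))
			continue;
		if (add != NULL)
			add((uint16_t)tag);
		count++;
	}
	return (count);
}
```

Adding the same ID twice leaves the count unchanged, so nvlans stays in step with the number of set bits.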

@ -0,0 +1,138 @@
/*
* This header is BSD licensed so anyone can use the definitions to implement
* compatible drivers/servers.
*
* $FreeBSD$
*/
#ifndef _VIRTIO_NET_H
#define _VIRTIO_NET_H
#include <sys/types.h>
/* The feature bitmap for virtio net */
#define VIRTIO_NET_F_CSUM 0x00001 /* Host handles pkts w/ partial csum */
#define VIRTIO_NET_F_GUEST_CSUM 0x00002 /* Guest handles pkts w/ partial csum*/
#define VIRTIO_NET_F_MAC 0x00020 /* Host has given MAC address. */
#define VIRTIO_NET_F_GSO 0x00040 /* Host handles pkts w/ any GSO type */
#define VIRTIO_NET_F_GUEST_TSO4 0x00080 /* Guest can handle TSOv4 in. */
#define VIRTIO_NET_F_GUEST_TSO6 0x00100 /* Guest can handle TSOv6 in. */
#define VIRTIO_NET_F_GUEST_ECN 0x00200 /* Guest can handle TSO[6] w/ ECN in.*/
#define VIRTIO_NET_F_GUEST_UFO 0x00400 /* Guest can handle UFO in. */
#define VIRTIO_NET_F_HOST_TSO4 0x00800 /* Host can handle TSOv4 in. */
#define VIRTIO_NET_F_HOST_TSO6 0x01000 /* Host can handle TSOv6 in. */
#define VIRTIO_NET_F_HOST_ECN 0x02000 /* Host can handle TSO[6] w/ ECN in. */
#define VIRTIO_NET_F_HOST_UFO 0x04000 /* Host can handle UFO in. */
#define VIRTIO_NET_F_MRG_RXBUF 0x08000 /* Host can merge receive buffers. */
#define VIRTIO_NET_F_STATUS 0x10000 /* virtio_net_config.status available*/
#define VIRTIO_NET_F_CTRL_VQ 0x20000 /* Control channel available */
#define VIRTIO_NET_F_CTRL_RX 0x40000 /* Control channel RX mode support */
#define VIRTIO_NET_F_CTRL_VLAN 0x80000 /* Control channel VLAN filtering */
#define VIRTIO_NET_F_CTRL_RX_EXTRA 0x100000 /* Extra RX mode control support */
#define VIRTIO_NET_S_LINK_UP 1 /* Link is up */
struct virtio_net_config {
/* The config defining mac address (if VIRTIO_NET_F_MAC) */
uint8_t mac[ETHER_ADDR_LEN];
/* See VIRTIO_NET_F_STATUS and VIRTIO_NET_S_* above */
uint16_t status;
} __packed;
/*
* This is the first element of the scatter-gather list. If you don't
* specify GSO or CSUM features, you can simply ignore the header.
*/
struct virtio_net_hdr {
#define VIRTIO_NET_HDR_F_NEEDS_CSUM 1 /* Use csum_start,csum_offset*/
uint8_t flags;
#define VIRTIO_NET_HDR_GSO_NONE 0 /* Not a GSO frame */
#define VIRTIO_NET_HDR_GSO_TCPV4 1 /* GSO frame, IPv4 TCP (TSO) */
#define VIRTIO_NET_HDR_GSO_UDP 3 /* GSO frame, IPv4 UDP (UFO) */
#define VIRTIO_NET_HDR_GSO_TCPV6 4 /* GSO frame, IPv6 TCP */
#define VIRTIO_NET_HDR_GSO_ECN 0x80 /* TCP has ECN set */
uint8_t gso_type;
uint16_t hdr_len; /* Ethernet + IP + tcp/udp hdrs */
uint16_t gso_size; /* Bytes to append to hdr_len per frame */
uint16_t csum_start; /* Position to start checksumming from */
uint16_t csum_offset; /* Offset after that to place checksum */
};
/*
* This is the version of the header to use when the MRG_RXBUF
* feature has been negotiated.
*/
struct virtio_net_hdr_mrg_rxbuf {
struct virtio_net_hdr hdr;
uint16_t num_buffers; /* Number of merged rx buffers */
};
/*
* Control virtqueue data structures
*
* The control virtqueue expects a header in the first sg entry
* and an ack/status response in the last entry. Data for the
* command goes in between.
*/
struct virtio_net_ctrl_hdr {
uint8_t class;
uint8_t cmd;
} __packed;
typedef uint8_t virtio_net_ctrl_ack;
#define VIRTIO_NET_OK 0
#define VIRTIO_NET_ERR 1
/*
* Control the RX mode, i.e., promiscuous, allmulti, etc.
* All commands require an "out" sg entry containing a 1 byte
* state value, zero = disable, non-zero = enable. Commands
* 0 and 1 are supported with the VIRTIO_NET_F_CTRL_RX feature.
* Commands 2-5 are added with VIRTIO_NET_F_CTRL_RX_EXTRA.
*/
#define VIRTIO_NET_CTRL_RX 0
#define VIRTIO_NET_CTRL_RX_PROMISC 0
#define VIRTIO_NET_CTRL_RX_ALLMULTI 1
#define VIRTIO_NET_CTRL_RX_ALLUNI 2
#define VIRTIO_NET_CTRL_RX_NOMULTI 3
#define VIRTIO_NET_CTRL_RX_NOUNI 4
#define VIRTIO_NET_CTRL_RX_NOBCAST 5
/*
* Control the MAC filter table.
*
* The MAC filter table is managed by the hypervisor; the guest should
* assume the size is infinite. Filtering should be considered
* non-perfect, i.e., based on hypervisor resources, the guest may
* receive packets from sources not specified in the filter list.
*
* In addition to the class/cmd header, the TABLE_SET command requires
* two out scatterlists. Each contains a 4 byte count of entries followed
* by a concatenated byte stream of the ETH_ALEN MAC addresses. The
* first sg list contains unicast addresses, the second is for multicast.
* This functionality is present if the VIRTIO_NET_F_CTRL_RX feature
* is available.
*/
struct virtio_net_ctrl_mac {
uint32_t entries;
uint8_t macs[][ETHER_ADDR_LEN];
} __packed;
#define VIRTIO_NET_CTRL_MAC 1
#define VIRTIO_NET_CTRL_MAC_TABLE_SET 0
/*
* Control VLAN filtering
*
* The VLAN filter table is controlled via a simple ADD/DEL interface.
* VLAN IDs not added may be filtered by the hypervisor. Del is the
* opposite of add. Both commands expect an out entry containing a 2
* byte VLAN ID. VLAN filtering is available with the
* VIRTIO_NET_F_CTRL_VLAN feature bit.
*/
#define VIRTIO_NET_CTRL_VLAN 2
#define VIRTIO_NET_CTRL_VLAN_ADD 0
#define VIRTIO_NET_CTRL_VLAN_DEL 1
#endif /* _VIRTIO_NET_H */
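
The control virtqueue comments above describe each request as a class/cmd header in the first entry, command data in between, and an ack byte written by the host in the last entry. A minimal sketch of that layout for a VLAN add request follows. The struct seg and struct vlan_cmd types and the build_vlan_add() helper are hypothetical, introduced only for illustration; a real driver would describe the segments with sglist(9).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define VIRTIO_NET_CTRL_VLAN		2
#define VIRTIO_NET_CTRL_VLAN_ADD	0
#define VIRTIO_NET_ERR			1

struct virtio_net_ctrl_hdr {
	uint8_t class;
	uint8_t cmd;
};

/* Hypothetical segment descriptor standing in for an sglist entry. */
struct seg {
	void *addr;
	size_t len;
	int device_writable;	/* 0: read by host, 1: written by host */
};

/* Hypothetical container keeping the three pieces alive together. */
struct vlan_cmd {
	struct virtio_net_ctrl_hdr hdr;
	uint16_t tag;
	uint8_t ack;
};

/* Fill in the request and describe it as header, data, ack. */
static int
build_vlan_add(struct vlan_cmd *c, uint16_t tag, struct seg segs[3])
{
	assert(tag < 4096);
	c->hdr.class = VIRTIO_NET_CTRL_VLAN;
	c->hdr.cmd = VIRTIO_NET_CTRL_VLAN_ADD;
	c->tag = tag;
	c->ack = VIRTIO_NET_ERR;	/* pessimistic until the host replies */

	segs[0] = (struct seg){ &c->hdr, sizeof(c->hdr), 0 };
	segs[1] = (struct seg){ &c->tag, sizeof(c->tag), 0 };
	segs[2] = (struct seg){ &c->ack, sizeof(c->ack), 1 };
	return (3);
}
```

Presetting the ack to VIRTIO_NET_ERR is a design choice of this sketch: if the host never writes the last entry, the request reads as failed rather than as a stale success.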

File diff suppressed because it is too large

@ -0,0 +1,64 @@
/*
* Copyright IBM Corp. 2007
*
* Authors:
* Anthony Liguori <aliguori@us.ibm.com>
*
* This header is BSD licensed so anyone can use the definitions to implement
* compatible drivers/servers.
*
* $FreeBSD$
*/
#ifndef _VIRTIO_PCI_H
#define _VIRTIO_PCI_H
/* VirtIO PCI vendor/device ID. */
#define VIRTIO_PCI_VENDORID 0x1AF4
#define VIRTIO_PCI_DEVICEID_MIN 0x1000
#define VIRTIO_PCI_DEVICEID_MAX 0x103F
/* VirtIO ABI version, this must match exactly. */
#define VIRTIO_PCI_ABI_VERSION 0
/*
* VirtIO Header, located in BAR 0.
*/
#define VIRTIO_PCI_HOST_FEATURES 0 /* host's supported features (32bit, RO)*/
#define VIRTIO_PCI_GUEST_FEATURES 4 /* guest's supported features (32, RW) */
#define VIRTIO_PCI_QUEUE_PFN 8 /* physical address of VQ (32, RW) */
#define VIRTIO_PCI_QUEUE_NUM 12 /* number of ring entries (16, RO) */
#define VIRTIO_PCI_QUEUE_SEL 14 /* current VQ selection (16, RW) */
#define VIRTIO_PCI_QUEUE_NOTIFY 16 /* notify host regarding VQ (16, RW) */
#define VIRTIO_PCI_STATUS 18 /* device status register (8, RW) */
#define VIRTIO_PCI_ISR 19 /* interrupt status register, reading
* also clears the register (8, RO) */
/* Only if MSIX is enabled: */
#define VIRTIO_MSI_CONFIG_VECTOR 20 /* configuration change vector (16, RW) */
#define VIRTIO_MSI_QUEUE_VECTOR 22 /* vector for selected VQ notifications
(16, RW) */
/* The bit of the ISR which indicates a device has an interrupt. */
#define VIRTIO_PCI_ISR_INTR 0x1
/* The bit of the ISR which indicates a device configuration change. */
#define VIRTIO_PCI_ISR_CONFIG 0x2
/* Vector value used to disable MSI for queue. */
#define VIRTIO_MSI_NO_VECTOR 0xFFFF
/*
* The remaining space is defined by each driver as the per-driver
* configuration space.
*/
#define VIRTIO_PCI_CONFIG(sc) \
(((sc)->vtpci_flags & VIRTIO_PCI_FLAG_MSIX) ? 24 : 20)
/*
* How many bits to shift physical queue address written to QUEUE_PFN.
* 12 is historical, and due to x86 page size.
*/
#define VIRTIO_PCI_QUEUE_ADDR_SHIFT 12
/* The alignment to use between consumer and producer parts of vring. */
#define VIRTIO_PCI_VRING_ALIGN 4096
#endif /* _VIRTIO_PCI_H */
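
Two pieces of arithmetic in this header can be illustrated in isolation: the device-specific configuration starts at offset 24 when MSI-X is enabled and at 20 otherwise, and the value written to QUEUE_PFN is the ring's physical address shifted right by 12. The sketch below is a user-space illustration only; the helper names are hypothetical, and the assertions reflect the 32-bit width of the register and the fact that the shift would otherwise discard low address bits.

```c
#include <assert.h>
#include <stdint.h>

#define VIRTIO_PCI_QUEUE_ADDR_SHIFT	12

/* Offset of the per-device configuration space within BAR 0. */
static int
virtio_pci_config_offset(int msix_enabled)
{
	/* The two 16-bit MSI-X vector registers occupy bytes 20..23. */
	return (msix_enabled ? 24 : 20);
}

/* Value to write to VIRTIO_PCI_QUEUE_PFN for a ring at paddr. */
static uint32_t
virtio_pci_queue_pfn(uint64_t paddr)
{
	uint64_t pfn = paddr >> VIRTIO_PCI_QUEUE_ADDR_SHIFT;

	/* Low 12 bits must be clear or the shift would lose them. */
	assert((paddr & ((1ULL << VIRTIO_PCI_QUEUE_ADDR_SHIFT) - 1)) == 0);
	/* QUEUE_PFN is a 32-bit register. */
	assert(pfn <= UINT32_MAX);
	return ((uint32_t)pfn);
}
```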

283
sys/dev/virtio/virtio.c Normal file

@ -0,0 +1,283 @@
/*-
* Copyright (c) 2011, Bryan Venteicher <bryanv@daemoninthecloset.org>
* All rights reserved.
*
* Redistribution and use in source and binary forms, with or without
* modification, are permitted provided that the following conditions
* are met:
* 1. Redistributions of source code must retain the above copyright
* notice unmodified, this list of conditions, and the following
* disclaimer.
* 2. Redistributions in binary form must reproduce the above copyright
* notice, this list of conditions and the following disclaimer in the
* documentation and/or other materials provided with the distribution.
*
* THIS SOFTWARE IS PROVIDED BY THE AUTHOR ``AS IS'' AND ANY EXPRESS OR
* IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES
* OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED.
* IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR ANY DIRECT, INDIRECT,
* INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT
* NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
* DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
* THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
* (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF
* THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
*/
#include <sys/cdefs.h>
__FBSDID("$FreeBSD$");
#include <sys/param.h>
#include <sys/systm.h>
#include <sys/kernel.h>
#include <sys/malloc.h>
#include <sys/module.h>
#include <sys/sbuf.h>
#include <machine/bus.h>
#include <machine/resource.h>
#include <machine/_inttypes.h>
#include <sys/bus.h>
#include <sys/rman.h>
#include <dev/virtio/virtio.h>
#include <dev/virtio/virtqueue.h>
#include "virtio_bus_if.h"
static int virtio_modevent(module_t, int, void *);
static const char *virtio_feature_name(uint64_t, struct virtio_feature_desc *);
static struct virtio_ident {
uint16_t devid;
char *name;
} virtio_ident_table[] = {
{ VIRTIO_ID_NETWORK, "Network" },
{ VIRTIO_ID_BLOCK, "Block" },
{ VIRTIO_ID_CONSOLE, "Console" },
{ VIRTIO_ID_ENTROPY, "Entropy" },
{ VIRTIO_ID_BALLOON, "Balloon" },
{ VIRTIO_ID_IOMEMORY, "IOMemory" },
{ VIRTIO_ID_9P, "9P Transport" },
{ 0, NULL }
};
/* Device independent features. */
static struct virtio_feature_desc virtio_common_feature_desc[] = {
{ VIRTIO_F_NOTIFY_ON_EMPTY, "NotifyOnEmpty" },
{ VIRTIO_RING_F_INDIRECT_DESC, "RingIndirect" },
{ VIRTIO_RING_F_EVENT_IDX, "EventIdx" },
{ VIRTIO_F_BAD_FEATURE, "BadFeature" },
{ 0, NULL }
};
const char *
virtio_device_name(uint16_t devid)
{
struct virtio_ident *ident;
for (ident = virtio_ident_table; ident->name != NULL; ident++) {
if (ident->devid == devid)
return (ident->name);
}
return (NULL);
}
int
virtio_get_device_type(device_t dev)
{
uintptr_t devtype;
devtype = -1;
BUS_READ_IVAR(device_get_parent(dev), dev,
VIRTIO_IVAR_DEVTYPE, &devtype);
return ((int) devtype);
}
void
virtio_set_feature_desc(device_t dev,
struct virtio_feature_desc *feature_desc)
{
BUS_WRITE_IVAR(device_get_parent(dev), dev,
VIRTIO_IVAR_FEATURE_DESC, (uintptr_t) feature_desc);
}
void
virtio_describe(device_t dev, const char *msg,
uint64_t features, struct virtio_feature_desc *feature_desc)
{
struct sbuf sb;
uint64_t val;
char *buf;
const char *name;
int n;
if ((buf = malloc(512, M_TEMP, M_NOWAIT)) == NULL) {
device_printf(dev, "%s features: 0x%"PRIx64"\n", msg,
features);
return;
}
sbuf_new(&sb, buf, 512, SBUF_FIXEDLEN);
sbuf_printf(&sb, "%s features: 0x%"PRIx64, msg, features);
for (n = 0, val = 1ULL << 63; val != 0; val >>= 1) {
/*
* BAD_FEATURE is used to detect broken Linux clients
* and therefore is not applicable to FreeBSD.
*/
if (((features & val) == 0) || val == VIRTIO_F_BAD_FEATURE)
continue;
if (n++ == 0)
sbuf_cat(&sb, " <");
else
sbuf_cat(&sb, ",");
name = NULL;
if (feature_desc != NULL)
name = virtio_feature_name(val, feature_desc);
if (name == NULL)
name = virtio_feature_name(val,
virtio_common_feature_desc);
if (name == NULL)
sbuf_printf(&sb, "0x%"PRIx64, val);
else
sbuf_cat(&sb, name);
}
if (n > 0)
sbuf_cat(&sb, ">");
#if __FreeBSD_version < 900020
sbuf_finish(&sb);
if (sbuf_overflowed(&sb) == 0)
#else
if (sbuf_finish(&sb) == 0)
#endif
device_printf(dev, "%s\n", sbuf_data(&sb));
sbuf_delete(&sb);
free(buf, M_TEMP);
}
static const char *
virtio_feature_name(uint64_t val, struct virtio_feature_desc *feature_desc)
{
int i;
for (i = 0; feature_desc[i].vfd_val != 0; i++)
if (val == feature_desc[i].vfd_val)
return (feature_desc[i].vfd_str);
return (NULL);
}
/*
* VirtIO bus method wrappers.
*/
uint64_t
virtio_negotiate_features(device_t dev, uint64_t child_features)
{
return (VIRTIO_BUS_NEGOTIATE_FEATURES(device_get_parent(dev),
child_features));
}
int
virtio_alloc_virtqueues(device_t dev, int flags, int nvqs,
struct vq_alloc_info *info)
{
return (VIRTIO_BUS_ALLOC_VIRTQUEUES(device_get_parent(dev), flags,
nvqs, info));
}
int
virtio_setup_intr(device_t dev, enum intr_type type)
{
return (VIRTIO_BUS_SETUP_INTR(device_get_parent(dev), type));
}
int
virtio_with_feature(device_t dev, uint64_t feature)
{
return (VIRTIO_BUS_WITH_FEATURE(device_get_parent(dev), feature));
}
void
virtio_stop(device_t dev)
{
VIRTIO_BUS_STOP(device_get_parent(dev));
}
int
virtio_reinit(device_t dev, uint64_t features)
{
return (VIRTIO_BUS_REINIT(device_get_parent(dev), features));
}
void
virtio_reinit_complete(device_t dev)
{
VIRTIO_BUS_REINIT_COMPLETE(device_get_parent(dev));
}
void
virtio_read_device_config(device_t dev, bus_size_t offset, void *dst, int len)
{
VIRTIO_BUS_READ_DEVICE_CONFIG(device_get_parent(dev),
offset, dst, len);
}
void
virtio_write_device_config(device_t dev, bus_size_t offset, void *src, int len)
{
VIRTIO_BUS_WRITE_DEVICE_CONFIG(device_get_parent(dev),
offset, src, len);
}
static int
virtio_modevent(module_t mod, int type, void *unused)
{
int error;
error = 0;
switch (type) {
case MOD_LOAD:
case MOD_QUIESCE:
case MOD_UNLOAD:
case MOD_SHUTDOWN:
break;
default:
error = EOPNOTSUPP;
break;
}
return (error);
}
static moduledata_t virtio_mod = {
"virtio",
virtio_modevent,
0
};
DECLARE_MODULE(virtio, virtio_mod, SI_SUB_DRIVERS, SI_ORDER_FIRST);
MODULE_VERSION(virtio, 1);
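
The formatting done by virtio_describe() can be tried outside the kernel. The sketch below is a simplified user-space approximation, not the function above: it uses snprintf(3) instead of sbuf(9), takes a single descriptor table, and omits the BAD_FEATURE skip and the fallback to the common table. The names describe_features(), feature_name(), and append() are hypothetical, and the feature strings used in any call are chosen by the caller.

```c
#include <assert.h>
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

struct feature_desc {
	uint64_t val;
	const char *str;
};

/* Look up a single feature bit; the table ends with a zero value. */
static const char *
feature_name(uint64_t val, const struct feature_desc *desc)
{
	for (int i = 0; desc[i].val != 0; i++)
		if (desc[i].val == val)
			return (desc[i].str);
	return (NULL);
}

/* Append s to the NUL-terminated buf, truncating at len bytes. */
static void
append(char *buf, size_t len, const char *s)
{
	size_t used = strlen(buf);

	if (used < len - 1)
		snprintf(buf + used, len - used, "%s", s);
}

/* Format "<msg> features: 0x... <A,B,...>", highest bit first. */
static void
describe_features(char *buf, size_t len, const char *msg, uint64_t features,
    const struct feature_desc *desc)
{
	const char *name;
	char tmp[32];
	uint64_t val;
	int n = 0;

	assert(len > 0);
	snprintf(buf, len, "%s features: 0x%" PRIx64, msg, features);
	for (val = 1ULL << 63; val != 0; val >>= 1) {
		if ((features & val) == 0)
			continue;
		append(buf, len, n++ == 0 ? " <" : ",");
		name = feature_name(val, desc);
		if (name == NULL) {
			/* Unknown bits are printed as their hex value. */
			snprintf(tmp, sizeof(tmp), "0x%" PRIx64, val);
			name = tmp;
		}
		append(buf, len, name);
	}
	if (n > 0)
		append(buf, len, ">");
}
```

Because the loop starts at bit 63 and shifts right, names appear in descending bit order.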

130
sys/dev/virtio/virtio.h Normal file

@ -0,0 +1,130 @@
/*
* This header is BSD licensed so anyone can use the definitions to implement
* compatible drivers/servers.
*
* $FreeBSD$
*/
#ifndef _VIRTIO_H_
#define _VIRTIO_H_
#include <sys/types.h>
struct vq_alloc_info;
/* VirtIO device IDs. */
#define VIRTIO_ID_NETWORK 0x01
#define VIRTIO_ID_BLOCK 0x02
#define VIRTIO_ID_CONSOLE 0x03
#define VIRTIO_ID_ENTROPY 0x04
#define VIRTIO_ID_BALLOON 0x05
#define VIRTIO_ID_IOMEMORY 0x06
#define VIRTIO_ID_9P 0x09
/* Status byte for guest to report progress. */
#define VIRTIO_CONFIG_STATUS_RESET 0x00
#define VIRTIO_CONFIG_STATUS_ACK 0x01
#define VIRTIO_CONFIG_STATUS_DRIVER 0x02
#define VIRTIO_CONFIG_STATUS_DRIVER_OK 0x04
#define VIRTIO_CONFIG_STATUS_FAILED 0x80
/*
* Generate an interrupt when the virtqueue ring is
* completely used, even if interrupts have been suppressed.
*/
#define VIRTIO_F_NOTIFY_ON_EMPTY (1 << 24)
/*
* The guest should never negotiate this feature; it
* is used to detect faulty drivers.
*/
#define VIRTIO_F_BAD_FEATURE (1 << 30)
/*
* Some VirtIO feature bits (currently bits 28 through 31) are
* reserved for the transport being used (eg. virtio_ring), the
* rest are per-device feature bits.
*/
#define VIRTIO_TRANSPORT_F_START 28
#define VIRTIO_TRANSPORT_F_END 32
/*
* Maximum number of virtqueues per device.
*/
#define VIRTIO_MAX_VIRTQUEUES 8
/*
* Each virtqueue indirect descriptor list must be physically contiguous.
* To allow us to malloc(9) each list individually, limit the number
* supported to what will fit in one page. With 4KB pages, this is a limit
* of 256 descriptors. If there is ever a need for more, we can switch to
* contigmalloc(9) for the larger allocations, similar to what
* bus_dmamem_alloc(9) does.
*
* Note the sizeof(struct vring_desc) is 16 bytes.
*/
#define VIRTIO_MAX_INDIRECT ((int) (PAGE_SIZE / 16))
/*
* VirtIO instance variables indices.
*/
#define VIRTIO_IVAR_DEVTYPE 1
#define VIRTIO_IVAR_FEATURE_DESC 2
struct virtio_feature_desc {
uint64_t vfd_val;
char *vfd_str;
};
const char *virtio_device_name(uint16_t devid);
int virtio_get_device_type(device_t dev);
void virtio_set_feature_desc(device_t dev,
struct virtio_feature_desc *feature_desc);
void virtio_describe(device_t dev, const char *msg,
uint64_t features, struct virtio_feature_desc *feature_desc);
/*
* VirtIO Bus Methods.
*/
uint64_t virtio_negotiate_features(device_t dev, uint64_t child_features);
int virtio_alloc_virtqueues(device_t dev, int flags, int nvqs,
struct vq_alloc_info *info);
int virtio_setup_intr(device_t dev, enum intr_type type);
int virtio_with_feature(device_t dev, uint64_t feature);
void virtio_stop(device_t dev);
int virtio_reinit(device_t dev, uint64_t features);
void virtio_reinit_complete(device_t dev);
/*
* Read/write a variable amount from the device specific (i.e., network)
* configuration region. This region is encoded in the same endianness as
* the guest.
*/
void virtio_read_device_config(device_t dev, bus_size_t offset,
void *dst, int length);
void virtio_write_device_config(device_t dev, bus_size_t offset,
void *src, int length);
/* Inlined device specific read/write functions for common lengths. */
#define VIRTIO_RDWR_DEVICE_CONFIG(size, type) \
static inline type \
__CONCAT(virtio_read_dev_config_,size)(device_t dev, \
bus_size_t offset) \
{ \
type val; \
virtio_read_device_config(dev, offset, &val, sizeof(type)); \
return (val); \
} \
\
static inline void \
__CONCAT(virtio_write_dev_config_,size)(device_t dev, \
bus_size_t offset, type val) \
{ \
virtio_write_device_config(dev, offset, &val, sizeof(type)); \
}
VIRTIO_RDWR_DEVICE_CONFIG(1, uint8_t);
VIRTIO_RDWR_DEVICE_CONFIG(2, uint16_t);
VIRTIO_RDWR_DEVICE_CONFIG(4, uint32_t);
#endif /* _VIRTIO_H_ */
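
The VIRTIO_RDWR_DEVICE_CONFIG macro stamps out one typed read/write pair per width on top of the variable-length accessors. A user-space sketch of the same pattern follows, with a byte array standing in for the device configuration region. The array, the unprefixed function names, and the CONCAT helper are assumptions for illustration; the kernel version takes a device_t and goes through the bus methods instead.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for the device-specific configuration region. */
static uint8_t fake_config[8];

static void
read_device_config(size_t offset, void *dst, size_t len)
{
	assert(offset + len <= sizeof(fake_config));
	memcpy(dst, fake_config + offset, len);
}

static void
write_device_config(size_t offset, const void *src, size_t len)
{
	assert(offset + len <= sizeof(fake_config));
	memcpy(fake_config + offset, src, len);
}

#define CONCAT1(a, b)	a ## b
#define CONCAT(a, b)	CONCAT1(a, b)

/* Generate read_dev_config_<size>() and write_dev_config_<size>(). */
#define RDWR_DEVICE_CONFIG(size, type)					\
static inline type							\
CONCAT(read_dev_config_, size)(size_t offset)				\
{									\
	type val;							\
	read_device_config(offset, &val, sizeof(type));		\
	return (val);							\
}									\
									\
static inline void							\
CONCAT(write_dev_config_, size)(size_t offset, type val)		\
{									\
	write_device_config(offset, &val, sizeof(type));		\
}

RDWR_DEVICE_CONFIG(1, uint8_t)
RDWR_DEVICE_CONFIG(2, uint16_t)
RDWR_DEVICE_CONFIG(4, uint32_t)
```

With accessors like these, reading a 16-bit field such as the status word that follows the 6-byte MAC in struct virtio_net_config becomes a single call with offset 6.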

@ -0,0 +1,92 @@
#-
# Copyright (c) 2011, Bryan Venteicher <bryanv@daemoninthecloset.org>
# All rights reserved.
#
# Redistribution and use in source and binary forms, with or without
# modification, are permitted provided that the following conditions
# are met:
# 1. Redistributions of source code must retain the above copyright
# notice, this list of conditions and the following disclaimer.
# 2. Redistributions in binary form must reproduce the above copyright
# notice, this list of conditions and the following disclaimer in the
# documentation and/or other materials provided with the distribution.
#
# THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND
# ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
# IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
# ARE DISCLAIMED. IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE
# FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
# DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS
# OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
# HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
# LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY
# OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
# SUCH DAMAGE.
#
# $FreeBSD$
#include <sys/bus.h>
#include <machine/bus.h>
INTERFACE virtio_bus;
HEADER {
struct vq_alloc_info;
};
METHOD uint64_t negotiate_features {
device_t dev;
uint64_t child_features;
};
METHOD int with_feature {
device_t dev;
uint64_t feature;
};
METHOD int alloc_virtqueues {
device_t dev;
int flags;
int nvqs;
struct vq_alloc_info *info;
};
HEADER {
#define VIRTIO_ALLOC_VQS_DISABLE_MSIX 0x1
};
METHOD int setup_intr {
device_t dev;
enum intr_type type;
};
METHOD void stop {
device_t dev;
};
METHOD int reinit {
device_t dev;
uint64_t features;
};
METHOD void reinit_complete {
device_t dev;
};
METHOD void notify_vq {
device_t dev;
uint16_t queue;
};
METHOD void read_device_config {
device_t dev;
bus_size_t offset;
void *dst;
int len;
};
METHOD void write_device_config {
device_t dev;
bus_size_t offset;
void *src;
int len;
};

@ -0,0 +1,43 @@
#-
# Copyright (c) 2011, Bryan Venteicher <bryanv@daemoninthecloset.org>
# All rights reserved.
#
# Redistribution and use in source and binary forms, with or without
# modification, are permitted provided that the following conditions
# are met:
# 1. Redistributions of source code must retain the above copyright
# notice, this list of conditions and the following disclaimer.
# 2. Redistributions in binary form must reproduce the above copyright
# notice, this list of conditions and the following disclaimer in the
# documentation and/or other materials provided with the distribution.
#
# THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND
# ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
# IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
# ARE DISCLAIMED. IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE
# FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
# DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS
# OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
# HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
# LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY
# OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
# SUCH DAMAGE.
#
# $FreeBSD$
#include <sys/bus.h>
INTERFACE virtio;
CODE {
static int
virtio_default_config_change(device_t dev)
{
/* Return that we've handled the change. */
return (1);
}
};
METHOD int config_change {
device_t dev;
} DEFAULT virtio_default_config_change;

@ -0,0 +1,119 @@
/*
* This header is BSD licensed so anyone can use the definitions
* to implement compatible drivers/servers.
*
* Copyright Rusty Russell IBM Corporation 2007.
*/
/* $FreeBSD$ */
#ifndef VIRTIO_RING_H
#define VIRTIO_RING_H
#include <sys/types.h>
/* This marks a buffer as continuing via the next field. */
#define VRING_DESC_F_NEXT 1
/* This marks a buffer as write-only (otherwise read-only). */
#define VRING_DESC_F_WRITE 2
/* This means the buffer contains a list of buffer descriptors. */
#define VRING_DESC_F_INDIRECT 4
/* The Host uses this in used->flags to advise the Guest: don't kick me
* when you add a buffer. It's unreliable, so it's simply an
* optimization. Guest will still kick if it's out of buffers. */
#define VRING_USED_F_NO_NOTIFY 1
/* The Guest uses this in avail->flags to advise the Host: don't
* interrupt me when you consume a buffer. It's unreliable, so it's
* simply an optimization. */
#define VRING_AVAIL_F_NO_INTERRUPT 1
/* VirtIO ring descriptors: 16 bytes.
* These can chain together via "next". */
struct vring_desc {
/* Address (guest-physical). */
uint64_t addr;
/* Length. */
uint32_t len;
/* The flags as indicated above. */
uint16_t flags;
/* We chain unused descriptors via this, too. */
uint16_t next;
};
struct vring_avail {
uint16_t flags;
uint16_t idx;
uint16_t ring[0];
};
/* uint32_t is used here for ids for padding reasons. */
struct vring_used_elem {
/* Index of start of used descriptor chain. */
uint32_t id;
/* Total length of the descriptor chain which was written to. */
uint32_t len;
};
struct vring_used {
uint16_t flags;
uint16_t idx;
struct vring_used_elem ring[0];
};
struct vring {
unsigned int num;
struct vring_desc *desc;
struct vring_avail *avail;
struct vring_used *used;
};
/* The standard layout for the ring is a continuous chunk of memory which
* looks like this. We assume num is a power of 2.
*
* struct vring {
* // The actual descriptors (16 bytes each)
* struct vring_desc desc[num];
*
* // A ring of available descriptor heads with free-running index.
* __u16 avail_flags;
* __u16 avail_idx;
* __u16 available[num];
*
* // Padding to the next align boundary.
* char pad[];
*
* // A ring of used descriptor heads with free-running index.
* __u16 used_flags;
* __u16 used_idx;
* struct vring_used_elem used[num];
* };
*
* NOTE: for VirtIO PCI, align is 4096.
*/
static inline int
vring_size(unsigned int num, unsigned long align)
{
int size;
size = num * sizeof(struct vring_desc);
size += sizeof(struct vring_avail) + (num * sizeof(uint16_t));
size = (size + align - 1) & ~(align - 1);
size += sizeof(struct vring_used) +
(num * sizeof(struct vring_used_elem));
return (size);
}
static inline void
vring_init(struct vring *vr, unsigned int num, uint8_t *p,
unsigned long align)
{
vr->num = num;
vr->desc = (struct vring_desc *) p;
vr->avail = (struct vring_avail *) (p +
num * sizeof(struct vring_desc));
vr->used = (void *)
(((unsigned long) &vr->avail->ring[num] + align-1) & ~(align-1));
}
#endif /* VIRTIO_RING_H */
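
As a worked example of the layout computed by vring_size() and vring_init(), the sketch below derives the offsets of the available and used rings for a given entry count and alignment. It is a user-space illustration with hypothetical names (struct ring_layout, compute_ring_layout()); the 4-byte header size corresponds to the two uint16_t fields, flags and idx, that precede each ring.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

struct vring_desc {
	uint64_t addr;
	uint32_t len;
	uint16_t flags;
	uint16_t next;
};

struct vring_used_elem {
	uint32_t id;
	uint32_t len;
};

#define VRING_HDR_SIZE	4	/* flags + idx, both uint16_t */

/* Hypothetical result type: byte offsets from the start of the ring. */
struct ring_layout {
	size_t avail_off;
	size_t used_off;
	size_t total;
};

static struct ring_layout
compute_ring_layout(unsigned int num, size_t align)
{
	struct ring_layout l;

	/* Both values must be powers of 2 for the mask arithmetic. */
	assert(num != 0 && (num & (num - 1)) == 0);
	assert(align != 0 && (align & (align - 1)) == 0);

	/* Descriptor table first, then the available ring. */
	l.avail_off = num * sizeof(struct vring_desc);
	l.used_off = l.avail_off + VRING_HDR_SIZE + num * sizeof(uint16_t);
	/* The used ring starts at the next align boundary. */
	l.used_off = (l.used_off + align - 1) & ~(align - 1);
	l.total = l.used_off + VRING_HDR_SIZE +
	    num * sizeof(struct vring_used_elem);
	return (l);
}
```

For 256 entries and the PCI alignment of 4096, the descriptor table takes 4096 bytes, the available ring ends at byte 4612, the used ring starts at 8192, and the whole ring occupies 10244 bytes.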

755
sys/dev/virtio/virtqueue.c Normal file

@ -0,0 +1,755 @@
/*-
* Copyright (c) 2011, Bryan Venteicher <bryanv@daemoninthecloset.org>
* All rights reserved.
*
* Redistribution and use in source and binary forms, with or without
* modification, are permitted provided that the following conditions
* are met:
* 1. Redistributions of source code must retain the above copyright
* notice unmodified, this list of conditions, and the following
* disclaimer.
* 2. Redistributions in binary form must reproduce the above copyright
* notice, this list of conditions and the following disclaimer in the
* documentation and/or other materials provided with the distribution.
*
* THIS SOFTWARE IS PROVIDED BY THE AUTHOR ``AS IS'' AND ANY EXPRESS OR
* IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES
* OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED.
* IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR ANY DIRECT, INDIRECT,
* INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT
* NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
* DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
* THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
* (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF
* THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
*/
/*
* Implements the virtqueue interface as basically described
* in the original VirtIO paper.
*/
#include <sys/cdefs.h>
__FBSDID("$FreeBSD$");
#include <sys/param.h>
#include <sys/systm.h>
#include <sys/kernel.h>
#include <sys/malloc.h>
#include <sys/sglist.h>
#include <vm/vm.h>
#include <vm/pmap.h>
#include <machine/cpu.h>
#include <machine/bus.h>
#include <machine/atomic.h>
#include <machine/resource.h>
#include <sys/bus.h>
#include <sys/rman.h>
#include <dev/virtio/virtio.h>
#include <dev/virtio/virtqueue.h>
#include <dev/virtio/virtio_ring.h>
#include "virtio_bus_if.h"
struct virtqueue {
device_t vq_dev;
char vq_name[VIRTQUEUE_MAX_NAME_SZ];
uint16_t vq_queue_index;
uint16_t vq_nentries;
uint32_t vq_flags;
#define VIRTQUEUE_FLAG_INDIRECT 0x0001
int vq_alignment;
int vq_ring_size;
void *vq_ring_mem;
int vq_max_indirect_size;
int vq_indirect_mem_size;
virtqueue_intr_t *vq_intrhand;
void *vq_intrhand_arg;
struct vring vq_ring;
uint16_t vq_free_cnt;
uint16_t vq_queued_cnt;
/*
* Head of the free chain in the descriptor table. If
* there are no free descriptors, this will be set to
* VQ_RING_DESC_CHAIN_END.
*/
uint16_t vq_desc_head_idx;
/*
* Last consumed descriptor in the used table,
* trails vq_ring.used->idx.
*/
uint16_t vq_used_cons_idx;
struct vq_desc_extra {
void *cookie;
struct vring_desc *indirect;
vm_paddr_t indirect_paddr;
uint16_t ndescs;
} vq_descx[0];
};
/*
* The maximum virtqueue size is 2^15. Use that value as the end of
* descriptor chain terminator since it will never be a valid index
* in the descriptor table. This is used to verify we are correctly
* handling vq_free_cnt.
*/
#define VQ_RING_DESC_CHAIN_END 32768
#define VQASSERT(_vq, _exp, _msg, ...) \
KASSERT((_exp),("%s: %s - "_msg, __func__, (_vq)->vq_name, \
##__VA_ARGS__))
#define VQ_RING_ASSERT_VALID_IDX(_vq, _idx) \
VQASSERT((_vq), (_idx) < (_vq)->vq_nentries, \
"invalid ring index: %d, max: %d", (_idx), \
(_vq)->vq_nentries)
#define VQ_RING_ASSERT_CHAIN_TERM(_vq) \
VQASSERT((_vq), (_vq)->vq_desc_head_idx == \
VQ_RING_DESC_CHAIN_END, "full ring terminated " \
"incorrectly: head idx: %d", (_vq)->vq_desc_head_idx)
static int	virtqueue_init_indirect(struct virtqueue *vq, int);
static void	virtqueue_free_indirect(struct virtqueue *vq);
static void	virtqueue_init_indirect_list(struct virtqueue *,
		    struct vring_desc *);

static void	vq_ring_init(struct virtqueue *);
static void	vq_ring_update_avail(struct virtqueue *, uint16_t);
static uint16_t	vq_ring_enqueue_segments(struct virtqueue *,
		    struct vring_desc *, uint16_t, struct sglist *, int, int);
static int	vq_ring_use_indirect(struct virtqueue *, int);
static void	vq_ring_enqueue_indirect(struct virtqueue *, void *,
		    struct sglist *, int, int);
static void	vq_ring_notify_host(struct virtqueue *, int);
static void	vq_ring_free_chain(struct virtqueue *, uint16_t);
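uint64_t
virtqueue_filter_features(uint64_t features)
{
	uint64_t mask;

	/*
	 * Keep every feature bit below the transport-reserved range,
	 * plus the one ring feature handled here (indirect descriptors).
	 * Build the mask in 64 bits so the shift cannot overflow an int.
	 */
	mask = (1ULL << VIRTIO_TRANSPORT_F_START) - 1;
	mask |= VIRTIO_RING_F_INDIRECT_DESC;

	return (features & mask);
}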
int
virtqueue_alloc(device_t dev, uint16_t queue, uint16_t size, int align,
    vm_paddr_t highaddr, struct vq_alloc_info *info, struct virtqueue **vqp)
{
	struct virtqueue *vq;
	int error;

	*vqp = NULL;
	error = 0;

	if (size == 0) {
		device_printf(dev,
		    "virtqueue %d (%s) does not exist (size is zero)\n",
		    queue, info->vqai_name);
		return (ENODEV);
	} else if (!powerof2(size)) {
		device_printf(dev,
		    "virtqueue %d (%s) size is not a power of 2: %d\n",
		    queue, info->vqai_name, size);
		return (ENXIO);
	} else if (info->vqai_maxindirsz > VIRTIO_MAX_INDIRECT) {
		device_printf(dev, "virtqueue %d (%s) requested too many "
		    "indirect descriptors: %d, max %d\n",
		    queue, info->vqai_name, info->vqai_maxindirsz,
		    VIRTIO_MAX_INDIRECT);
		return (EINVAL);
	}

	vq = malloc(sizeof(struct virtqueue) +
	    size * sizeof(struct vq_desc_extra), M_DEVBUF, M_NOWAIT | M_ZERO);
	if (vq == NULL) {
		device_printf(dev, "cannot allocate virtqueue\n");
		return (ENOMEM);
	}

	vq->vq_dev = dev;
	strlcpy(vq->vq_name, info->vqai_name, sizeof(vq->vq_name));
	vq->vq_queue_index = queue;
	vq->vq_alignment = align;
	vq->vq_nentries = size;
	vq->vq_free_cnt = size;
	vq->vq_intrhand = info->vqai_intr;
	vq->vq_intrhand_arg = info->vqai_intr_arg;

	if (info->vqai_maxindirsz > 1) {
		error = virtqueue_init_indirect(vq, info->vqai_maxindirsz);
		if (error)
			goto fail;
	}

	vq->vq_ring_size = round_page(vring_size(size, align));
	vq->vq_ring_mem = contigmalloc(vq->vq_ring_size, M_DEVBUF,
	    M_NOWAIT | M_ZERO, 0, highaddr, PAGE_SIZE, 0);
	if (vq->vq_ring_mem == NULL) {
		device_printf(dev,
		    "cannot allocate memory for virtqueue ring\n");
		error = ENOMEM;
		goto fail;
	}

	vq_ring_init(vq);
	virtqueue_disable_intr(vq);

	*vqp = vq;

fail:
	if (error)
		virtqueue_free(vq);

	return (error);
}
static int
virtqueue_init_indirect(struct virtqueue *vq, int indirect_size)
{
	device_t dev;
	struct vq_desc_extra *dxp;
	int i, size;

	dev = vq->vq_dev;

	if (VIRTIO_BUS_WITH_FEATURE(dev, VIRTIO_RING_F_INDIRECT_DESC) == 0) {
		/*
		 * Indirect descriptors requested by the driver but not
		 * negotiated. Return zero to keep the initialization
		 * going: we'll run fine without.
		 */
		if (bootverbose)
			device_printf(dev, "virtqueue %d (%s) requested "
			    "indirect descriptors but not negotiated\n",
			    vq->vq_queue_index, vq->vq_name);
		return (0);
	}

	size = indirect_size * sizeof(struct vring_desc);
	vq->vq_max_indirect_size = indirect_size;
	vq->vq_indirect_mem_size = size;
	vq->vq_flags |= VIRTQUEUE_FLAG_INDIRECT;

	for (i = 0; i < vq->vq_nentries; i++) {
		dxp = &vq->vq_descx[i];

		dxp->indirect = malloc(size, M_DEVBUF, M_NOWAIT);
		if (dxp->indirect == NULL) {
			device_printf(dev, "cannot allocate indirect list\n");
			return (ENOMEM);
		}

		dxp->indirect_paddr = vtophys(dxp->indirect);
		virtqueue_init_indirect_list(vq, dxp->indirect);
	}

	return (0);
}
static void
virtqueue_free_indirect(struct virtqueue *vq)
{
	struct vq_desc_extra *dxp;
	int i;

	for (i = 0; i < vq->vq_nentries; i++) {
		dxp = &vq->vq_descx[i];

		if (dxp->indirect == NULL)
			break;

		free(dxp->indirect, M_DEVBUF);
		dxp->indirect = NULL;
		dxp->indirect_paddr = 0;
	}

	vq->vq_flags &= ~VIRTQUEUE_FLAG_INDIRECT;
	vq->vq_indirect_mem_size = 0;
}
static void
virtqueue_init_indirect_list(struct virtqueue *vq,
    struct vring_desc *indirect)
{
	int i;

	bzero(indirect, vq->vq_indirect_mem_size);

	for (i = 0; i < vq->vq_max_indirect_size - 1; i++)
		indirect[i].next = i + 1;
	indirect[i].next = VQ_RING_DESC_CHAIN_END;
}
int
virtqueue_reinit(struct virtqueue *vq, uint16_t size)
{
	struct vq_desc_extra *dxp;
	int i;

	if (vq->vq_nentries != size) {
		device_printf(vq->vq_dev,
		    "%s: '%s' changed size; old=%hu, new=%hu\n",
		    __func__, vq->vq_name, vq->vq_nentries, size);
		return (EINVAL);
	}

	/* Warn if the virtqueue was not properly cleaned up. */
	if (vq->vq_free_cnt != vq->vq_nentries) {
		device_printf(vq->vq_dev,
		    "%s: warning, '%s' virtqueue not empty, "
		    "leaking %d entries\n", __func__, vq->vq_name,
		    vq->vq_nentries - vq->vq_free_cnt);
	}

	vq->vq_desc_head_idx = 0;
	vq->vq_used_cons_idx = 0;
	vq->vq_queued_cnt = 0;
	vq->vq_free_cnt = vq->vq_nentries;

	/* To be safe, reset all our allocated memory. */
	bzero(vq->vq_ring_mem, vq->vq_ring_size);
	for (i = 0; i < vq->vq_nentries; i++) {
		dxp = &vq->vq_descx[i];
		dxp->cookie = NULL;
		dxp->ndescs = 0;
		if (vq->vq_flags & VIRTQUEUE_FLAG_INDIRECT)
			virtqueue_init_indirect_list(vq, dxp->indirect);
	}

	vq_ring_init(vq);
	virtqueue_disable_intr(vq);

	return (0);
}
void
virtqueue_free(struct virtqueue *vq)
{

	if (vq->vq_free_cnt != vq->vq_nentries) {
		device_printf(vq->vq_dev, "%s: freeing non-empty virtqueue, "
		    "leaking %d entries\n", vq->vq_name,
		    vq->vq_nentries - vq->vq_free_cnt);
	}

	if (vq->vq_flags & VIRTQUEUE_FLAG_INDIRECT)
		virtqueue_free_indirect(vq);

	if (vq->vq_ring_mem != NULL) {
		contigfree(vq->vq_ring_mem, vq->vq_ring_size, M_DEVBUF);
		vq->vq_ring_size = 0;
		vq->vq_ring_mem = NULL;
	}

	free(vq, M_DEVBUF);
}
vm_paddr_t
virtqueue_paddr(struct virtqueue *vq)
{

	return (vtophys(vq->vq_ring_mem));
}
int
virtqueue_size(struct virtqueue *vq)
{

	return (vq->vq_nentries);
}
int
virtqueue_empty(struct virtqueue *vq)
{

	return (vq->vq_nentries == vq->vq_free_cnt);
}
int
virtqueue_full(struct virtqueue *vq)
{

	return (vq->vq_free_cnt == 0);
}
void
virtqueue_notify(struct virtqueue *vq)
{

	vq->vq_queued_cnt = 0;
	vq_ring_notify_host(vq, 0);
}
int
virtqueue_nused(struct virtqueue *vq)
{
	uint16_t used_idx, nused;

	/*
	 * used->idx is a free-running 16-bit counter, so it may have
	 * wrapped past zero since vq_used_cons_idx was last advanced.
	 */
	used_idx = vq->vq_ring.used->idx;
	if (used_idx >= vq->vq_used_cons_idx)
		nused = used_idx - vq->vq_used_cons_idx;
	else
		nused = UINT16_MAX - vq->vq_used_cons_idx +
		    used_idx + 1;
	VQASSERT(vq, nused <= vq->vq_nentries, "used more than available");

	return (nused);
}
int
virtqueue_intr(struct virtqueue *vq)
{

	if (vq->vq_intrhand == NULL ||
	    vq->vq_used_cons_idx == vq->vq_ring.used->idx)
		return (0);

	vq->vq_intrhand(vq->vq_intrhand_arg);

	return (1);
}
int
virtqueue_enable_intr(struct virtqueue *vq)
{

	/*
	 * Enable interrupts, making sure we get the latest
	 * index of what's already been consumed.
	 */
	vq->vq_ring.avail->flags &= ~VRING_AVAIL_F_NO_INTERRUPT;

	mb();

	/*
	 * Additional items may have been consumed in the time since
	 * we last checked and enabled interrupts above. Let our
	 * caller know so it processes the new entries.
	 */
	if (vq->vq_used_cons_idx != vq->vq_ring.used->idx)
		return (1);

	return (0);
}
void
virtqueue_disable_intr(struct virtqueue *vq)
{

	/*
	 * Note this is only considered a hint to the host.
	 */
	vq->vq_ring.avail->flags |= VRING_AVAIL_F_NO_INTERRUPT;
}
int
virtqueue_enqueue(struct virtqueue *vq, void *cookie, struct sglist *sg,
    int readable, int writable)
{
	struct vq_desc_extra *dxp;
	int needed;
	uint16_t head_idx, idx;

	needed = readable + writable;

	VQASSERT(vq, cookie != NULL, "enqueuing with no cookie");
	VQASSERT(vq, needed == sg->sg_nseg,
	    "segment count mismatch, %d, %d", needed, sg->sg_nseg);
	VQASSERT(vq,
	    needed <= vq->vq_nentries || needed <= vq->vq_max_indirect_size,
	    "too many segments to enqueue: %d, %d/%d", needed,
	    vq->vq_nentries, vq->vq_max_indirect_size);

	if (needed < 1)
		return (EINVAL);
	if (vq->vq_free_cnt == 0)
		return (ENOSPC);

	if (vq_ring_use_indirect(vq, needed)) {
		vq_ring_enqueue_indirect(vq, cookie, sg, readable, writable);
		return (0);
	} else if (vq->vq_free_cnt < needed)
		return (EMSGSIZE);

	head_idx = vq->vq_desc_head_idx;
	VQ_RING_ASSERT_VALID_IDX(vq, head_idx);
	dxp = &vq->vq_descx[head_idx];

	VQASSERT(vq, dxp->cookie == NULL,
	    "cookie already exists for index %d", head_idx);
	dxp->cookie = cookie;
	dxp->ndescs = needed;

	idx = vq_ring_enqueue_segments(vq, vq->vq_ring.desc, head_idx,
	    sg, readable, writable);

	vq->vq_desc_head_idx = idx;
	vq->vq_free_cnt -= needed;
	if (vq->vq_free_cnt == 0)
		VQ_RING_ASSERT_CHAIN_TERM(vq);
	else
		VQ_RING_ASSERT_VALID_IDX(vq, idx);

	vq_ring_update_avail(vq, head_idx);

	return (0);
}
void *
virtqueue_dequeue(struct virtqueue *vq, uint32_t *len)
{
	struct vring_used_elem *uep;
	void *cookie;
	uint16_t used_idx, desc_idx;

	if (vq->vq_used_cons_idx == vq->vq_ring.used->idx)
		return (NULL);

	used_idx = vq->vq_used_cons_idx++ & (vq->vq_nentries - 1);
	uep = &vq->vq_ring.used->ring[used_idx];

	mb();
	desc_idx = (uint16_t) uep->id;
	if (len != NULL)
		*len = uep->len;

	vq_ring_free_chain(vq, desc_idx);

	cookie = vq->vq_descx[desc_idx].cookie;
	VQASSERT(vq, cookie != NULL, "no cookie for index %d", desc_idx);
	vq->vq_descx[desc_idx].cookie = NULL;

	return (cookie);
}
void *
virtqueue_poll(struct virtqueue *vq, uint32_t *len)
{
	void *cookie;

	while ((cookie = virtqueue_dequeue(vq, len)) == NULL)
		cpu_spinwait();

	return (cookie);
}
void *
virtqueue_drain(struct virtqueue *vq, int *last)
{
	void *cookie;
	int idx;

	cookie = NULL;
	idx = *last;

	while (idx < vq->vq_nentries && cookie == NULL) {
		if ((cookie = vq->vq_descx[idx].cookie) != NULL) {
			vq->vq_descx[idx].cookie = NULL;
			/* Free chain to keep free count consistent. */
			vq_ring_free_chain(vq, idx);
		}
		idx++;
	}

	*last = idx;

	return (cookie);
}
void
virtqueue_dump(struct virtqueue *vq)
{

	if (vq == NULL)
		return;

	printf("VQ: %s - size=%d; free=%d; used=%d; queued=%d; "
	    "desc_head_idx=%d; avail.idx=%d; used_cons_idx=%d; "
	    "used.idx=%d; avail.flags=0x%x; used.flags=0x%x\n",
	    vq->vq_name, vq->vq_nentries, vq->vq_free_cnt,
	    virtqueue_nused(vq), vq->vq_queued_cnt, vq->vq_desc_head_idx,
	    vq->vq_ring.avail->idx, vq->vq_used_cons_idx,
	    vq->vq_ring.used->idx, vq->vq_ring.avail->flags,
	    vq->vq_ring.used->flags);
}
static void
vq_ring_init(struct virtqueue *vq)
{
	struct vring *vr;
	char *ring_mem;
	int i, size;

	ring_mem = vq->vq_ring_mem;
	size = vq->vq_nentries;
	vr = &vq->vq_ring;

	vring_init(vr, size, ring_mem, vq->vq_alignment);

	for (i = 0; i < size - 1; i++)
		vr->desc[i].next = i + 1;
	vr->desc[i].next = VQ_RING_DESC_CHAIN_END;
}
static void
vq_ring_update_avail(struct virtqueue *vq, uint16_t desc_idx)
{
	uint16_t avail_idx;

	/*
	 * Place the head of the descriptor chain into the next slot and make
	 * it usable to the host. The chain is made available now rather than
	 * deferring to virtqueue_notify() in the hopes that if the host is
	 * currently running on another CPU, we can keep it processing the new
	 * descriptor.
	 */
	avail_idx = vq->vq_ring.avail->idx & (vq->vq_nentries - 1);
	vq->vq_ring.avail->ring[avail_idx] = desc_idx;

	mb();
	vq->vq_ring.avail->idx++;

	/* Keep pending count until virtqueue_notify() for debugging. */
	vq->vq_queued_cnt++;
}
static uint16_t
vq_ring_enqueue_segments(struct virtqueue *vq, struct vring_desc *desc,
    uint16_t head_idx, struct sglist *sg, int readable, int writable)
{
	struct sglist_seg *seg;
	struct vring_desc *dp;
	int i, needed;
	uint16_t idx;

	needed = readable + writable;

	for (i = 0, idx = head_idx, seg = sg->sg_segs;
	     i < needed;
	     i++, idx = dp->next, seg++) {
		VQASSERT(vq, idx != VQ_RING_DESC_CHAIN_END,
		    "premature end of free desc chain");

		dp = &desc[idx];
		dp->addr = seg->ss_paddr;
		dp->len = seg->ss_len;
		dp->flags = 0;

		if (i < needed - 1)
			dp->flags |= VRING_DESC_F_NEXT;
		if (i >= readable)
			dp->flags |= VRING_DESC_F_WRITE;
	}

	return (idx);
}
static int
vq_ring_use_indirect(struct virtqueue *vq, int needed)
{

	if ((vq->vq_flags & VIRTQUEUE_FLAG_INDIRECT) == 0)
		return (0);

	if (vq->vq_max_indirect_size < needed)
		return (0);

	if (needed < 2)
		return (0);

	return (1);
}
static void
vq_ring_enqueue_indirect(struct virtqueue *vq, void *cookie,
    struct sglist *sg, int readable, int writable)
{
	struct vring_desc *dp;
	struct vq_desc_extra *dxp;
	int needed;
	uint16_t head_idx;

	needed = readable + writable;
	VQASSERT(vq, needed <= vq->vq_max_indirect_size,
	    "enqueuing too many indirect descriptors");

	head_idx = vq->vq_desc_head_idx;
	VQ_RING_ASSERT_VALID_IDX(vq, head_idx);
	dp = &vq->vq_ring.desc[head_idx];
	dxp = &vq->vq_descx[head_idx];

	VQASSERT(vq, dxp->cookie == NULL,
	    "cookie already exists for index %d", head_idx);
	dxp->cookie = cookie;
	dxp->ndescs = 1;

	dp->addr = dxp->indirect_paddr;
	dp->len = needed * sizeof(struct vring_desc);
	dp->flags = VRING_DESC_F_INDIRECT;

	vq_ring_enqueue_segments(vq, dxp->indirect, 0,
	    sg, readable, writable);

	vq->vq_desc_head_idx = dp->next;
	vq->vq_free_cnt--;
	if (vq->vq_free_cnt == 0)
		VQ_RING_ASSERT_CHAIN_TERM(vq);
	else
		VQ_RING_ASSERT_VALID_IDX(vq, vq->vq_desc_head_idx);

	vq_ring_update_avail(vq, head_idx);
}
static void
vq_ring_notify_host(struct virtqueue *vq, int force)
{

	mb();

	if (force ||
	    (vq->vq_ring.used->flags & VRING_USED_F_NO_NOTIFY) == 0)
		VIRTIO_BUS_NOTIFY_VQ(vq->vq_dev, vq->vq_queue_index);
}
static void
vq_ring_free_chain(struct virtqueue *vq, uint16_t desc_idx)
{
	struct vring_desc *dp;
	struct vq_desc_extra *dxp;

	VQ_RING_ASSERT_VALID_IDX(vq, desc_idx);
	dp = &vq->vq_ring.desc[desc_idx];
	dxp = &vq->vq_descx[desc_idx];

	if (vq->vq_free_cnt == 0)
		VQ_RING_ASSERT_CHAIN_TERM(vq);

	vq->vq_free_cnt += dxp->ndescs;
	dxp->ndescs--;

	if ((dp->flags & VRING_DESC_F_INDIRECT) == 0) {
		while (dp->flags & VRING_DESC_F_NEXT) {
			VQ_RING_ASSERT_VALID_IDX(vq, dp->next);
			dp = &vq->vq_ring.desc[dp->next];
			dxp->ndescs--;
		}
	}
	VQASSERT(vq, dxp->ndescs == 0, "failed to free entire desc chain");

	/*
	 * We must append the existing free chain, if any, to the end of
	 * the newly freed chain. If the virtqueue was completely used,
	 * then the head would be VQ_RING_DESC_CHAIN_END (asserted above).
	 */
	dp->next = vq->vq_desc_head_idx;
	vq->vq_desc_head_idx = desc_idx;
}
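
Illustrative note (not part of the commit): the free-descriptor bookkeeping in virtqueue.c can be hard to follow from the kernel code alone. The sketch below is a user-space toy model, assuming a fixed eight-entry ring and tracking only the `next` links. The names `toy_ring`, `toy_init`, `toy_alloc`, `toy_free`, and `toy_nused` are hypothetical and do not exist in the driver. It mirrors how vq_ring_init() links the free chain, how an enqueue takes descriptors from the head, how vq_ring_free_chain() splices a chain back, and how the 16-bit used index wraps.

```c
#include <assert.h>
#include <stdint.h>

#define TOY_NENTRIES	8	/* hypothetical ring size (power of two) */
#define TOY_CHAIN_END	32768	/* same value as VQ_RING_DESC_CHAIN_END */

struct toy_ring {
	uint16_t next[TOY_NENTRIES];	/* stands in for vring_desc.next */
	uint16_t head;			/* head of the free chain */
	uint16_t free_cnt;		/* number of free descriptors */
};

/* Link every descriptor into one free chain, like vq_ring_init(). */
static void
toy_init(struct toy_ring *r)
{
	int i;

	for (i = 0; i < TOY_NENTRIES - 1; i++)
		r->next[i] = i + 1;
	r->next[i] = TOY_CHAIN_END;
	r->head = 0;
	r->free_cnt = TOY_NENTRIES;
}

/*
 * Take n descriptors from the head of the free chain. Returns the index
 * of the first one, or TOY_CHAIN_END if fewer than n are free.
 */
static uint16_t
toy_alloc(struct toy_ring *r, uint16_t n)
{
	uint16_t first, idx, i;

	if (n == 0 || r->free_cnt < n)
		return (TOY_CHAIN_END);
	first = idx = r->head;
	for (i = 0; i < n; i++)
		idx = r->next[idx];
	r->head = idx;
	r->free_cnt -= n;
	return (first);
}

/* Splice an n-descriptor chain back in front of the free chain. */
static void
toy_free(struct toy_ring *r, uint16_t first, uint16_t n)
{
	uint16_t tail, i;

	assert(n >= 1);
	tail = first;
	for (i = 1; i < n; i++)
		tail = r->next[tail];
	r->next[tail] = r->head;	/* old free chain follows the tail */
	r->head = first;
	r->free_cnt += n;
}

/* Entries published but not yet consumed; indices wrap modulo 2^16. */
static uint16_t
toy_nused(uint16_t used_idx, uint16_t cons_idx)
{
	return ((uint16_t)(used_idx - cons_idx));
}
```

The single unsigned subtraction in `toy_nused` gives the same result as the two-branch computation in virtqueue_nused(); the driver's explicit form is arguably easier to audit, so this is only an alternative spelling, not a suggested change.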

@ -0,0 +1,98 @@
/*-
* Copyright (c) 2011, Bryan Venteicher <bryanv@daemoninthecloset.org>
* All rights reserved.
*
* Redistribution and use in source and binary forms, with or without
* modification, are permitted provided that the following conditions
* are met:
* 1. Redistributions of source code must retain the above copyright
* notice unmodified, this list of conditions, and the following
* disclaimer.
* 2. Redistributions in binary form must reproduce the above copyright
* notice, this list of conditions and the following disclaimer in the
* documentation and/or other materials provided with the distribution.
*
* THIS SOFTWARE IS PROVIDED BY THE AUTHOR ``AS IS'' AND ANY EXPRESS OR
* IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES
* OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED.
* IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR ANY DIRECT, INDIRECT,
* INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT
* NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
* DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
* THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
* (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF
* THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
*
* $FreeBSD$
*/
#ifndef _VIRTIO_VIRTQUEUE_H
#define _VIRTIO_VIRTQUEUE_H
#include <sys/types.h>
struct virtqueue;
struct sglist;
/* Support for indirect buffer descriptors. */
#define VIRTIO_RING_F_INDIRECT_DESC (1 << 28)
/* The guest publishes the used index for which it expects an interrupt
* at the end of the avail ring. Host should ignore the avail->flags field.
* The host publishes the avail index for which it expects a kick
* at the end of the used ring. Guest should ignore the used->flags field.
*/
#define VIRTIO_RING_F_EVENT_IDX (1 << 29)
/* Device callback for a virtqueue interrupt. */
typedef int virtqueue_intr_t(void *);
#define VIRTQUEUE_MAX_NAME_SZ 32
/* One for each virtqueue the device wishes to allocate. */
struct vq_alloc_info {
	char		vqai_name[VIRTQUEUE_MAX_NAME_SZ];
	int		vqai_maxindirsz;
	virtqueue_intr_t *vqai_intr;
	void		*vqai_intr_arg;
	struct virtqueue **vqai_vq;
};

#define VQ_ALLOC_INFO_INIT(_i,_nsegs,_intr,_arg,_vqp,_str,...) do { \
	snprintf((_i)->vqai_name, VIRTQUEUE_MAX_NAME_SZ, _str, \
	    ##__VA_ARGS__); \
	(_i)->vqai_maxindirsz = (_nsegs); \
	(_i)->vqai_intr = (_intr); \
	(_i)->vqai_intr_arg = (_arg); \
	(_i)->vqai_vq = (_vqp); \
} while (0)
uint64_t virtqueue_filter_features(uint64_t features);
int virtqueue_alloc(device_t dev, uint16_t queue, uint16_t size,
    int align, vm_paddr_t highaddr, struct vq_alloc_info *info,
    struct virtqueue **vqp);
void *virtqueue_drain(struct virtqueue *vq, int *last);
void virtqueue_free(struct virtqueue *vq);
int virtqueue_reinit(struct virtqueue *vq, uint16_t size);
int virtqueue_intr(struct virtqueue *vq);
int virtqueue_enable_intr(struct virtqueue *vq);
void virtqueue_disable_intr(struct virtqueue *vq);
/* Get physical address of the virtqueue ring. */
vm_paddr_t virtqueue_paddr(struct virtqueue *vq);
int virtqueue_full(struct virtqueue *vq);
int virtqueue_empty(struct virtqueue *vq);
int virtqueue_size(struct virtqueue *vq);
int virtqueue_nused(struct virtqueue *vq);
void virtqueue_notify(struct virtqueue *vq);
void virtqueue_dump(struct virtqueue *vq);
int virtqueue_enqueue(struct virtqueue *vq, void *cookie,
    struct sglist *sg, int readable, int writable);
void *virtqueue_dequeue(struct virtqueue *vq, uint32_t *len);
void *virtqueue_poll(struct virtqueue *vq, uint32_t *len);
#endif /* _VIRTIO_VIRTQUEUE_H */
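
Illustrative note (not part of the commit): the feature filtering declared above can be tried outside the kernel. The following is a minimal sketch, assuming the transport-reserved feature range starts at bit 28. That value is an assumption here, since VIRTIO_TRANSPORT_F_START is defined in virtio.h, which is not shown in this section. `TOY_TRANSPORT_F_START` and `toy_filter_features` are hypothetical names.

```c
#include <assert.h>
#include <stdint.h>

/* Same values as in virtqueue.h above. */
#define VIRTIO_RING_F_INDIRECT_DESC	(1 << 28)
#define VIRTIO_RING_F_EVENT_IDX		(1 << 29)

/* Assumed start of the transport-reserved bits (see lead-in). */
#define TOY_TRANSPORT_F_START		28

/*
 * Keep the feature bits below the transport range, plus the indirect
 * descriptor bit; every other transport bit is cleared.
 */
static uint64_t
toy_filter_features(uint64_t features)
{
	uint64_t mask;

	mask = (UINT64_C(1) << TOY_TRANSPORT_F_START) - 1;
	mask |= VIRTIO_RING_F_INDIRECT_DESC;

	return (features & mask);
}
```

With these assumptions, a host offering bit 0, the indirect descriptor bit, and the event index bit would be filtered down to bit 0 and the indirect descriptor bit, because this queue code does not handle VIRTIO_RING_F_EVENT_IDX.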

@ -317,6 +317,7 @@ SUBDIR= ${_3dfx} \
usb \
utopia \
${_vesa} \
${_virtio} \
vge \
vkbd \
${_vpo} \
@ -537,6 +538,7 @@ _padlock= padlock
_s3= s3
_twa= twa
_vesa= vesa
_virtio= virtio
_x86bios= x86bios
.elif ${MACHINE} == "pc98"
_canbepm= canbepm
@ -636,6 +638,7 @@ _sppp= sppp
_tpm= tpm
_twa= twa
_vesa= vesa
_virtio= virtio
_vxge= vxge
_x86bios= x86bios
_wi= wi

@ -0,0 +1,28 @@
#
# $FreeBSD$
#
# Redistribution and use in source and binary forms, with or without
# modification, are permitted provided that the following conditions
# are met:
# 1. Redistributions of source code must retain the above copyright
# notice, this list of conditions and the following disclaimer.
# 2. Redistributions in binary form must reproduce the above copyright
# notice, this list of conditions and the following disclaimer in the
# documentation and/or other materials provided with the distribution.
#
# THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND
# ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
# IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
# ARE DISCLAIMED. IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE
# FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
# DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS
# OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
# HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
# LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY
# OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
# SUCH DAMAGE.
#
SUBDIR= virtio pci network block balloon
.include <bsd.subdir.mk>

@ -0,0 +1,36 @@
#
# $FreeBSD$
#
# Redistribution and use in source and binary forms, with or without
# modification, are permitted provided that the following conditions
# are met:
# 1. Redistributions of source code must retain the above copyright
# notice, this list of conditions and the following disclaimer.
# 2. Redistributions in binary form must reproduce the above copyright
# notice, this list of conditions and the following disclaimer in the
# documentation and/or other materials provided with the distribution.
#
# THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND
# ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
# IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
# ARE DISCLAIMED. IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE
# FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
# DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS
# OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
# HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
# LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY
# OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
# SUCH DAMAGE.
#
.PATH: ${.CURDIR}/../../../dev/virtio/balloon
KMOD= virtio_balloon
SRCS= virtio_balloon.c
SRCS+= virtio_bus_if.h virtio_if.h
SRCS+= bus_if.h device_if.h
MFILES= kern/bus_if.m kern/device_if.m \
dev/virtio/virtio_bus_if.m dev/virtio/virtio_if.m
.include <bsd.kmod.mk>

@ -0,0 +1,36 @@
#
# $FreeBSD$
#
# Redistribution and use in source and binary forms, with or without
# modification, are permitted provided that the following conditions
# are met:
# 1. Redistributions of source code must retain the above copyright
# notice, this list of conditions and the following disclaimer.
# 2. Redistributions in binary form must reproduce the above copyright
# notice, this list of conditions and the following disclaimer in the
# documentation and/or other materials provided with the distribution.
#
# THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND
# ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
# IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
# ARE DISCLAIMED. IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE
# FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
# DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS
# OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
# HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
# LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY
# OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
# SUCH DAMAGE.
#
.PATH: ${.CURDIR}/../../../dev/virtio/block
KMOD= virtio_blk
SRCS= virtio_blk.c
SRCS+= virtio_bus_if.h virtio_if.h
SRCS+= bus_if.h device_if.h
MFILES= kern/bus_if.m kern/device_if.m \
dev/virtio/virtio_bus_if.m dev/virtio/virtio_if.m
.include <bsd.kmod.mk>

@ -0,0 +1,36 @@
#
# $FreeBSD$
#
# Redistribution and use in source and binary forms, with or without
# modification, are permitted provided that the following conditions
# are met:
# 1. Redistributions of source code must retain the above copyright
# notice, this list of conditions and the following disclaimer.
# 2. Redistributions in binary form must reproduce the above copyright
# notice, this list of conditions and the following disclaimer in the
# documentation and/or other materials provided with the distribution.
#
# THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND
# ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
# IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
# ARE DISCLAIMED. IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE
# FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
# DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS
# OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
# HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
# LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY
# OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
# SUCH DAMAGE.
#
.PATH: ${.CURDIR}/../../../dev/virtio/network
KMOD= if_vtnet
SRCS= if_vtnet.c
SRCS+= virtio_bus_if.h virtio_if.h
SRCS+= bus_if.h device_if.h
MFILES= kern/bus_if.m kern/device_if.m \
dev/virtio/virtio_bus_if.m dev/virtio/virtio_if.m
.include <bsd.kmod.mk>

@ -0,0 +1,36 @@
#
# $FreeBSD$
#
# Redistribution and use in source and binary forms, with or without
# modification, are permitted provided that the following conditions
# are met:
# 1. Redistributions of source code must retain the above copyright
# notice, this list of conditions and the following disclaimer.
# 2. Redistributions in binary form must reproduce the above copyright
# notice, this list of conditions and the following disclaimer in the
# documentation and/or other materials provided with the distribution.
#
# THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND
# ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
# IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
# ARE DISCLAIMED. IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE
# FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
# DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS
# OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
# HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
# LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY
# OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
# SUCH DAMAGE.
#
.PATH: ${.CURDIR}/../../../dev/virtio/pci
KMOD= virtio_pci
SRCS= virtio_pci.c
SRCS+= virtio_bus_if.h virtio_if.h
SRCS+= bus_if.h device_if.h pci_if.h
MFILES= kern/bus_if.m kern/device_if.m dev/pci/pci_if.m \
dev/virtio/virtio_bus_if.m dev/virtio/virtio_if.m
.include <bsd.kmod.mk>

@ -0,0 +1,38 @@
#
# $FreeBSD$
#
# Redistribution and use in source and binary forms, with or without
# modification, are permitted provided that the following conditions
# are met:
# 1. Redistributions of source code must retain the above copyright
# notice, this list of conditions and the following disclaimer.
# 2. Redistributions in binary form must reproduce the above copyright
# notice, this list of conditions and the following disclaimer in the
# documentation and/or other materials provided with the distribution.
#
# THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND
# ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
# IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
# ARE DISCLAIMED. IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE
# FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
# DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS
# OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
# HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
# LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY
# OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
# SUCH DAMAGE.
#
.PATH: ${.CURDIR}/../../../dev/virtio
KMOD= virtio
SRCS= virtio.c virtqueue.c
SRCS+= virtio_bus_if.c virtio_bus_if.h
SRCS+= virtio_if.c virtio_if.h
SRCS+= bus_if.h device_if.h
MFILES= kern/bus_if.m kern/device_if.m \
dev/virtio/virtio_bus_if.m dev/virtio/virtio_if.m
.include <bsd.kmod.mk>