Commit Graph

18156 Commits

Author SHA1 Message Date
Vincenzo Maffione
0ff7076bdb bhyve: abstraction for network backends
Bhyve can currently emulate two virtual NICs, namely virtio-net and e1000,
and connect to the host network through two backends, namely tap and netmap.
However, there is no interface between virtual NIC functionalities and
backend functionalities. As a result, the backend code is duplicated between
the two virtual NIC implementations and also within the same virtual NIC.
Also, e1000 cannot currently use netmap as a backend.
This patch introduces a network backend API between virtio-net/e1000 and
tap/netmap, to improve code reuse and add missing functionalities.
Virtual NICs and backends can negotiate virtio-net features, such as checksum
offload and TSO. If the backend supports the features, it will propagate this
information to the guest, so that the latter can make use of them. Currently,
only netmap VALE ports support the features, but support should be added to
tap in the future.

Reviewed by:	jhb, bryanv
MFC after:	2 weeks
Differential Revision:	https://reviews.freebsd.org/D20659
2019-07-07 12:15:24 +00:00
Sean Chittenden
55c94d640f bhyve/audio: don't leak resources on failed initialization.
Coverity CID:	1402793
Approved by:	markj, jhb, bhyve
Differential Revision:	https://reviews.freebsd.org/D20841
2019-07-03 17:24:24 +00:00
Warren Block
a9258f9b7f Correct name of vmm(4) pptdevs variable.
Reported by:	nwolff@ixsystems.com
2019-07-02 14:53:51 +00:00
John Baldwin
7aa24c6006 Use __FBSDID() and sort #includes.
No functional change.
2019-06-27 21:45:40 +00:00
Ed Maste
9349d37845 bhyve: avoid theoretical stack buffer overflow from integer overflow
Use the proper size_t type to match strlen's return type.  This is not
exploitable in practice as this parses command line arguments, which
are limited to well below 2^31 bytes.

This is a minimal change to address the reported issue; hda_parse_config
and the rest of this file will benefit from further review.

Reported by:	Fakhri Zulkifli
Reviewed by:	jhb, markj
MFC after:	3 days
Sponsored by:	The FreeBSD Foundation
2019-06-25 19:06:43 +00:00
Alexander Motin
4ae6e084f0 Fix strsep_quote() on strings without quotes.
For strings without quotes and escapes dstptr and srcptr are equal, so
zeroing *dstptr before checking *srcptr is not a good idea.  In practice
it means that in -maproot=65534:65533 everything after the colon is lost.

The problem was there since r293305, but before r346976 it was covered by
improper strsep_quote() usage.

PR:		238725
MFC after:	3 days
Sponsored by:	iXsystems, Inc.
2019-06-25 17:00:53 +00:00
Hans Petter Selasky
9efd65a9d2 Fix parsing of corrupt data in usbdump(8). Check that the transfer
type array lookup is within bounds to avoid segfault.

PR:		238801
MFC after:	3 days
Sponsored by:	Mellanox Technologies
2019-06-25 13:15:29 +00:00
Marcelo Araujo
3e21da8ad1 Add SPDX tags to bhyve(8) HD Audio device.
Reviewed by:	bcran
Differential Revision:	https://reviews.freebsd.org/D20750
2019-06-25 06:24:56 +00:00
Warner Losh
36f9f044cd Replay r349336 by scottl accidentally reverted by r349352
Add a section about the HD Audio module support
2019-06-25 06:14:11 +00:00
Warner Losh
6b021cc2dd Replay r349335 by scottl accidentally reverted by r349352
Add the PCI HDAudio device model from the 2016 GSoC.  Detailed information
can be found at

https://wiki.freebsd.org/SummerOfCode2016/HDAudioEmulationForBhyve

This commit has evolved from the original work to include Capsicum
integration.  As part of that, it only opens the host audio devices once
and leaves them open, instead of opening and closing them on each guest
access.  Thanks to Peter Grehan and Marcelo Araujo for their help in
bringing the work forward and providing some of the final techncial push.

Submitted by:	Alex Teaca <iateaca@freebsd.org>
Differential Revision:	D7840, D12419
2019-06-25 06:14:05 +00:00
Warner Losh
f5a95d9a07 Remove NAND and NANDFS support
NANDFS has been broken for years. Remove it. The NAND drivers that
remain are for ancient parts that are no longer relevant. They are
polled, have terrible performance and just for ancient arm
hardware. NAND parts have evolved significantly from this early work
and little to none of it would be relevant should someone need to
update to support raw nand. This code has been off by default for
years and has violated the vnode protocol leading to panics since it
was committed.

Numerous posts to arch@ and other locations have found no actual users
for this software.

Relnotes:	Yes
No Objection From: arch@
Differential Revision: https://reviews.freebsd.org/D20745
2019-06-25 04:50:09 +00:00
Warner Losh
73701bbe9d kbdcontrol -h prints two error messages.
We loop through getopt(3) twice. Once for -P args and once for the
rest. Catch '?' and print usage when that happens.
2019-06-24 21:05:14 +00:00
Scott Long
0a944371e8 Add a section about the HD Audio module support 2019-06-24 19:42:32 +00:00
Scott Long
7e3c742061 Add the PCI HDAudio device model from the 2016 GSoC. Detailed information
can be found at

https://wiki.freebsd.org/SummerOfCode2016/HDAudioEmulationForBhyve

This commit has evolved from the original work to include Capsicum
integration.  As part of that, it only opens the host audio devices once
and leaves them open, instead of opening and closing them on each guest
access.  Thanks to Peter Grehan and Marcelo Araujo for their help in
bringing the work forward and providing some of the final techncial push.

Submitted by:	Alex Teaca <iateaca@freebsd.org>
Differential Revision:	D7840, D12419
2019-06-24 19:31:32 +00:00
Eric van Gyzen
db2114b4b8 bhyve: Fix vtscsi maximum segment config
The seg_max value reported to the guest should be two less than the
host's maximum, in order to leave room for the request and the
response.  This is analogous to r347033 for virtio_block.

We hit the "too many segments to enqueue" assertion on OneFS because
we increase MAXPHYS to 256 KB.

Reviewed by:	bryanv
Discussed with:	cem jhb rgrimes
MFC after:	1 week
Sponsored by:	Dell EMC Isilon
Differential Revision:	https://reviews.freebsd.org/D20529
2019-06-21 18:57:33 +00:00
Shteryana Shopova
9a8070808e No need for each bsnmpd(1) module to open connection to syslog
bsnmpd(1) main does that early on init and the connection is available
to all loaded modules

Event:		Vienna Hackathon 2019
PR:		233431 , 221487
MFC after:	2 weeks
2019-06-21 07:45:58 +00:00
Shteryana Shopova
65a184e091 Unbreak snmp_pf(3) after the changes introduced in r338209
PR:		237011
Event:		Vienna Hackathon 2019
MFC after:	2 weeks
2019-06-21 07:29:02 +00:00
Mark Johnston
ab877e64d0 Make zlib encoding messages idempotent.
Otherwise duplicate messages can trigger a reinitialization of the
compression stream while the update thread is running.  Also ensure
that the stream is initialized before the update thread may attempt
to use it.

PR:		238333
Reviewed by:	cem, rgrimes
MFC after:	3 days
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D20673
2019-06-19 16:09:20 +00:00
Vincenzo Maffione
5c2b348a54 bhyve: vtnet: fix locking on receive
The vsc_rx_ready and the RX virtqueue is protected by the rx_mtx lock.
However, pci_vtnet_ping_rxq() (currently called only once after each
device reset) accesses those without acquiring the lock.

Reviewed by:	markj
MFC after:	2 weeks
Differential Revision:	https://reviews.freebsd.org/D20609
2019-06-18 17:51:30 +00:00
Ian Lepore
0309916a88 Oops, it seems I left out the word 'cycle', fix it.
Reported by:	rpokala@
2019-06-18 02:27:30 +00:00
Ian Lepore
26f3ca615d Rearrange the argument checking and processing so that enable and disable
can be combined with configuring the period and duty cycle (the same ioctl
sets all 3 values at once, so there's no reason to require the user to run
the program twice to get all 3 things set).
2019-06-18 01:15:00 +00:00
Ian Lepore
123570fb92 Explain the relationship between PWM hardware channels being controlled and
pwmc(4) device filenames.  Also, use uppercase PWM when the term is being
used as an acronym, and expand the acronym where it's first used.
2019-06-18 00:17:10 +00:00
Ian Lepore
780c3de886 Remove everything related to channels from the pwmc public interface, now
that there is a pwmc(4) instance per channel and the channel number is
maintained as a driver ivar rather than being passed in from userland.
2019-06-18 00:11:00 +00:00
Ian Lepore
060e638845 Put periods at the ends of argument descriptions. Explain the relationship
between the period and duty arguments.
2019-06-17 16:50:58 +00:00
Ian Lepore
7d763870e4 Follow changes in the pwmc(4) driver in relation to device filenames.
The driver now names its cdev nodes pwmcX.Y where X is unit number and
Y is the channel within that unit.  Change the default device name from
pwmc0 to pwmc0.0.  The driver now puts cdev files and label aliases in
the /dev/pwm directory, so allow the user to provide unqualified names
with -f and automatically prepend the /dev/pwm part for them.

Update the examples in the manpage to show the new device name format
and location within /dev/pwm.
2019-06-17 16:43:33 +00:00
Edward Tomasz Napierala
83743daead In iostat(8) output, skip the decimal point and the fractional part
for tps >= 100 and MB/s >= 1000, to prevent them for widening too much.

MFC after:	2 weeks
2019-06-16 17:32:05 +00:00
Ian Lepore
6cdbe2bf20 Make pwm channel numbers unsigned. 2019-06-15 23:02:09 +00:00
Ian Lepore
71fb373934 Move/rename the sys/pwm.h header file to dev/pwm/pwmc.h. The file contains
ioctl definitions and related datatypes that allow userland control of pwm
hardware via the pwmc device.  The new name and location better reflects its
assocation with a single device driver.
2019-06-15 19:46:59 +00:00
Vincenzo Maffione
4f7c3b7be5 bhyve: move common code to net_utils.c
Both virtio_net and e82545 network frontends have code to validate and
generate MAC addresses. These functionalities are replicated in the two
files, so we move them in a separate compilation unit.

Reviewed by:	rgrimes, bryanv, imp, kevans
MFC after:	2 weeks
Differential Revision:	https://reviews.freebsd.org/D20626
2019-06-13 17:39:32 +00:00
John Baldwin
0026d8ccb7 Remove a spurious break when setting up a 64-bit memory BAR.
This was causing 'enbit' to not be initialized in this case.

CID:		1401924
Reported by:	Coverity
MFC after:	1 week
2019-06-12 16:49:01 +00:00
Vincenzo Maffione
17e9052ca8 bhyve: virtio: introduce vq_kick_enable() and vq_kick_disable()
The VirtIO standard supports two schemes for notification suppression:
a notification enable bit and a more sophisticated one (event_idx) that
also supports delayed notifications. Currently bhyve fully supports
only the first scheme. This patch hides the notification suppression
internals by means of two inline routines, vq_kick_enable() and
vq_kick_disable(), and makes the code more readable.
Moreover, further improve readability by replacing the call to mb()
with a call to atomic_thread_fence_seq_cst(), which is already used
in virtio.c

Reviewed by:	pmooney_pfmooney.com, bryanv
MFC after:	2 weeks
Differential Revision:	https://reviews.freebsd.org/D20581
2019-06-11 15:52:41 +00:00
Vincenzo Maffione
f3b1307e01 bhyve: vtnet: simplify thread synchronization
On vtnet device reset it is necessary to wait for threads to stop TX and
RX processing. However, the rx_in_progress variable (used for to wait for
RX processing to stop) is actually useless, and can be removed. Acquiring
and releasing the RX lock is enough to synchronize correctly. Moreover,
it is possible to reset the device while holding both TX and RX locks, so
that the "resetting" variable becomes unnecessary for the RX thread, and
can be protected by the TX lock (instead of being volatile).

Reviewed by:	jhb, markj
MFC after:	2 weeks
Differential Revision:	https://reviews.freebsd.org/D20543
2019-06-09 12:41:21 +00:00
Chuck Tuffli
c3308a9469 Fix uninitialized variable in camdd
gcc  builds were failing because of this uninitialized warning.

Reported by:	bz, imp
Approved by:	imp (mentor)
Discussed with:	ken
Pointy hat:	chuck
2019-06-09 02:06:31 +00:00
Chuck Tuffli
2d9be22831 Add NVMe support to camdd(8)
Reviewed by:	ken
Approved by:	ken (mentor)
MFC after:	1 week
Differential Review: https://reviews.freebsd.org/D12141
2019-06-08 17:17:17 +00:00
Chuck Tuffli
129f93c5a7 bhyve: Add PCIe Integrated Endpoint capability
The NVMe CAM driver reports the PCIe Link Capability and Status for
devices. For emulated bhyve NVMe devices, this looks like:

nda0: nvme version 1.3 x63 (max x63) lanes PCIe Gen15 (max Gen15) link

The driver outputs this because the emulated device doesn't include the
PCIe Capability structure. The NVMe specification requires these
registers, so the fix is to add this set of capability registers to the
emulated device.

Note that PCI Express devices that are integrated into the Root Complex
(i.e. Bus 0x0) do not have to support the Link Capability or Status
registers. Windows will fail to start (i.e. Code 10) devices that appear
to be part of the Root Complex but report being a PCI Express Endpoint.
So also add a check to pci_emul_add_pciecap() to check if the device is
integrated and change the device type.

Reviewed by:	imp, ken, araujo, jhb, rgrimes
Approved by:	imp (mentor), ken (mentor), jhb (maintainer)
MFC after:	1 week
Differential Revision: https://reviews.freebsd.org/D19904
2019-06-07 17:09:49 +00:00
John Baldwin
5628267505 Keep the shadow PCIR_COMMAND synced with the real one for pass through.
This ensures that bhyve properly recognizes when decoding is disabled
for BARs on passthru devices.  To properly handle writes to the
register, export a pci_emul_cmd_changed function from pci_emul.c that
the pass through device model invokes for config writes that change
PCIR_COMMAND.

Reviewed by:	rgrimes
MFC after:	2 weeks
Differential Revision:	https://reviews.freebsd.org/D20531
2019-06-07 15:53:27 +00:00
John Baldwin
2729c9bbc7 Enable memory and I/O decoding in PCI devices on demand.
Rather than uncoditionally setting the MEMEN and PORTEN bits in
PCIR_COMMAND for PCI devices, set the respective bit when the first
BAR of a given type is added to the device.  This more closely matches
what firmware does on bare metal.

BUSMASTEREN is still set unconditionally.  Eventually this bit should
move into the device models as not all device models need this set.

Reviewed by:	rgrimes
MFC after:	2 weeks
Differential Revision:	https://reviews.freebsd.org/D20530
2019-06-07 15:48:12 +00:00
John Baldwin
4db23c7455 Use parse_integer to avoid sign extension.
Coverity warned about gdb_write_mem sign extending the result of
parse_byte shifted left by 24 bits when generating a 32-bit memory
write value for MMIO.  Simplify the code by using parse_integer
instead of unrolled parse_byte calls.

CID:		1401600
Reviewed by:	cem
MFC after:	1 month
Differential Revision:	https://reviews.freebsd.org/D20508
2019-06-05 23:37:50 +00:00
John Baldwin
24be3f513f Don't simulate PBA access if the PBA is in a separate BAR.
bhyve has to virtualize the MSI-X table to trap reads and writes to
that table and map those to virtual interrupts that it maps real host
interrupts on to.  For the pending-bit-array (PBA), bhyve passes
accesses from the guest directly to the hardware.

bhyve's virtualization of the MSI-X table is done by intercepting all
reads and writes to the BAR holding the MSI-X table.  However, if the
PBA is stored in the same BAR as the MSI-X table, accesses to the PBA
portion of this BAR have to be forwarded to the real BAR.

However, in the case that the PBA was stored in a separate BAR and
it's offset in that separate BAR overlapped with the portion of the
MSI-X table BAR that the table used, the handlers for the table BAR
would incorrectly think that some accesses were PBA reads and writes.
This caused a crash in bhyve when it indirected a NULL pointer.  Fix
this case by never trying to handle PBA access if the PBA lives in a
separate BAR.

Reported by:	gallatin
Tested by:	gallatin
Reviewed by:	markj, Patrick Mooney
MFC after:	1 week
Differential Revision:	https://reviews.freebsd.org/D20523
2019-06-05 19:29:02 +00:00
Conrad Meyer
09a3675d96 daemon(8): Don't block SIGTERM during restart delay
I believe this was introduced in the original '-r' commit, r231911 (2012).
At the time, the scope was limited to a 1 second sleep.  r332518 (2018)
added '-R', which increased the potential duration of the affected interval
(from 1 to N seconds) by permitting arbitrary restart intervals.

Instead, handle SIGTERM normally during restart-sleep, when the monitored
process is not running, and shut down promptly.

(I noticed this behavior when debugging a child process that exited quickly
under the 'daemon -r -R 30' environment.  'kill <daemonpid>' had no
immediate effect and the monitor process slept until the next restart
attempt.  This was annoying.)

Reviewed by:	allanjude, imp, markj
Differential Revision:	https://reviews.freebsd.org/D20509
2019-06-04 16:07:01 +00:00
John Baldwin
beb388db08 Emulate the AMD MSR_LS_CFG MSR used for various Ryzen errata.
Writes are ignored and reads always return zero.

Submitted by:	José Albornoz <jojo@eljojo.net> (write-only version)
Reviewed by:	Patrick Mooney, cem
MFC after:	2 weeks
Differential Revision:	https://reviews.freebsd.org/D19506
2019-06-03 23:17:35 +00:00
Rick Macklem
c2ec111378 r348590 had mention of "-I" in a comment that no longer applied to the patch.
Take "-I" out of the comment line, since the patch no longer uses the "-I"
option.

MFC after:	1 month
2019-06-03 23:07:46 +00:00
Rick Macklem
0f0869bca3 Modify mountd so that it incrementally updates the kernel exports upon a reload.
Without this patch, mountd would delete/load all exports from the exports
file(s) when it receives a SIGHUP. This works fine for small exports file(s),
but can take several seconds to do when there are large numbers (10000+) of
exported file systems. Most of this time is spent doing the system calls
that delete/export each of these file systems. When the "-S" option
has been specified (the default these days), the nfsd threads are suspended
for several seconds while the reload is done.

This patch changes mountd so that it only does system calls for file systems
where the exports have been changed/added/deleted as compared to the exports
done for the previous load/reload of the exports file(s).
Basically, when SIGHUP is posted to mountd, it saves the exportlist structures
from the previous load and creates a new set of structures from the current
exports file(s). Then it compares the current with the previous and only does
system calls for cases that have been changed/added/deleted.
The nfsd threads do not need to be suspended until the comparison step is
being done. This results in a suspension period of milliseconds for a server
with 10000+ exported file systems.

There is some code using a LOGDEBUG() macro that allow runtime debugging
output via syslog(LOG_DEBUG,...) that can be enabled by creating a file
called /var/log/mountd.debug. This code is expected to be replaced with
code that uses dtrace by cy@ in the near future, once issues w.r.t. dtrace
in stable/12 have been resolved.

The patch should not change the usage of the exports file(s), but improves
the performance of reloading large exports file(s) where there are only a
small number of changes done to the file(s).

Tested by:	pen@lysator.liu.se
PR:		237860
Reviewed by:	kib
MFC after:	1 month
Relnotes:	yes
Differential Revision:	https://reviews.freebsd.org/D20487
2019-06-03 22:58:51 +00:00
Mark Johnston
9afd281187 rpc.yppasswdd: Fix dirname(3) usage after r305952.
PR:		234972
Submitted by:	Edward Fuhr <edward.fuhr@us.fujitsu.com> (original)
MFC after:	3 days
2019-06-03 16:51:07 +00:00
Maxim Sobolev
5ec57af4b2 Fix several places where tool name has been hardcoded:
install -> ${INSTALL}
    mtree -> ${MTREE_CMD}
    services_mkdb -> ${SERVICES_MKDB_CMD}
    cap_mkdb -> ${CAP_MKDB_CMD}
    pwd_mkdb -> ${PWD_MKDB_CMD}
    kldxref -> ${KLDXREF_CMD}

If you do custom FreeBSD builds you may want to override those
in some cases.

Sponsored by:	Sippy Software, Inc.
2019-06-02 23:38:19 +00:00
John Baldwin
df61066e8b Whitespace cleanups, no functional change. 2019-05-31 18:00:44 +00:00
Rick Macklem
46a6b5c451 Replace a single linked list with a hash table of lists.
mountd.c uses a single linked list of "struct exportlist" structures,
where there is one of these for each exported file system on the NFS server.
This list gets long if there are a large number of file systems exported and
the list must be searched for each line in the exports file(s) when
SIGHUP causes the exports file(s) to be reloaded.
A simple benchmark that traverses SLIST() elements and compares two 32bit
fields in the structure for equal (which is what the search is)
appears to take a couple of nsec. So, for a server with 72000 exported file
systems, this can take about 5sec during reload of the exports file(s).
By replacing the single linked list with a hash table with a target of
10 elements per list, the time should be reduced to less than 1msec.
Peter Errikson (who has a server with 72000+ exported file systems) ran
a test program using 5 hashes to see how they worked.
fnv_32_buf(fsid,..., 0)
fnv_32_buf(fsid,..., FNV1_32_INIT)
hash32_buf(fsid,..., 0)
hash32_buf(fsid,..., HASHINIT)
- plus simply using the low order bits of fsid.val[0].
The first three behaved about equally well, with the first one being
slightly better than the others.
It has an average variation of about 4.5% about the target list length
and that is what this patch uses.
Peter Errikson also tested this hash table version and found that the
performance wasn't measurably improved by a larger hash table, so a
load factor of 10 appears adequate.

Tested by:	pen@lysator.liu.se (with other patches)
PR:		237860
MFC after:	1 month
2019-05-31 01:28:48 +00:00
Alexander Motin
b627cd1c20 Pass data pointers to the driver in way in expects.
Probably due to historical reasons the driver uses In/Out arguments in
odd way.  While this tool still never uses Out arguments to see that,
make the code to not trigger EINVAL in possible future uses.

MFC after:	2 weeks
2019-05-30 15:07:39 +00:00
Dmitry Chagin
c5afec6e89 Complete LOCAL_PEERCRED support. Cache pid of the remote process in the
struct xucred. Do not bump XUCRED_VERSION as struct layout is not changed.

PR:		215202
Reviewed by:	tijl
MFC after:	1 week
Differential Revision:	https://reviews.freebsd.org/D20415
2019-05-30 14:24:26 +00:00
Conrad Meyer
9c1fa7a429 kldxref(8): Sort MDT_MODULE info first in linker.hints output
MDT_MODULE info is required to be ordered before any other MDT metadata for
a given kld because it serves as an implicit record boundary between
distinct klds for linker.hints consumers.  kldxref(8) has previously relied
on the assumption that MDT_MODULE was ordered relative to other module
metadata in kld objects by source code ordering.

However, C does not require implementations to emit file scope objects in
any particular order, and it seems that GCC 6.4.0 and/or binutils 2.32 ld
may reorder emitted objects with respect to source code ordering.

So: just take two passes over a given .ko's module metadata, scanning for
the MDT_MODULE on the first pass and the other metadata on subsequent
passes.  It's not super expensive and not exactly a performance-critical
piece of code.  This ensures MDT_MODULE is always ordered before
MDT_PNP_INFO and other MDTs, regardless of compiler/linker movement.  As a
fringe benefit, it removes the requirement that care be taken to always
order MODULE_PNP_INFO after DRIVER_MODULE in source code.

Reviewed by:	emaste, imp
Differential Revision:	https://reviews.freebsd.org/D20405
2019-05-27 17:33:20 +00:00