the system.
Together with lm75(4) this module allows easy temperature monitoring over
SNMP, specially for embedded systems.
Manual page reviewed by: brueffer (D128)
to 16. This is arbitrary and is used to ensure that a vcpu goes back into
the vm_run() loop to process interrupts or rendezvous events in a timely
fashion.
Found with: Coverity Scan
CID: 1216436
it implicitly in vmm.ko.
Add ioctl VM_GET_CPUS to get the current set of 'active' and 'suspended' cpus
and display them via /usr/sbin/bhyvectl using the "--get-active-cpus" and
"--get-suspended-cpus" options.
This is in preparation for being able to reset virtual machine state without
having to destroy and recreate it.
Put the superblock in the correct possition for UFS2... There is a bug
in FFS that if we don't put it here (for UFS2), it will forcefully
relocate the superblock, and I believe cause data loss..
I have a fix for that, but w/ how many releases are broken, we won't be
able to switch to the better _FLOPPY (block 0) for this for a while..
o Teach vidcontrol(1) to distinct which virtual terminal system is running now.
o Load vt(4) fonts from different location.
o Add $FreeBSD$ tag for path.h.
Tested by: Claude Buisson <clbuisson@orange.fr>
MFC after: 7 days
Sponsored by: The FreeBSD Foundation
fault on the destination buffer.
Prior to this change a page fault would be detected in vm_copyout(). This
was done after the I/O port access was done. If the I/O port access had
side-effects (e.g. reading the uart FIFO) then restarting the instruction
would result in incorrect behavior.
Fix this by validating the guest linear address before doing the I/O port
emulation. If the validation results in a page fault exception being injected
into the guest then the instruction can now be restarted without any
side-effects.
API function 'vie_calculate_gla()'.
While the current implementation is simplistic it forms the basis of doing
segmentation checks if the guest is in 32-bit protected mode.
of the guest linear address space. These APIs in turn use a new ioctl
'VM_GLA2GPA' to convert the guest linear address to guest physical.
Use the new copyin/copyout APIs when emulating ins/outs instruction in
bhyve(8).
'struct vm_guest_paging'.
Check for canonical addressing in vmm_gla2gpa() and inject a protection
fault into the guest if a violation is detected.
If the page table walk is restarted in vmm_gla2gpa() then reset 'ptpphys' to
point to the root of the page tables.
the UART FIFO.
The emulation is constrained in a number of ways: 64-bit only, doesn't check
for all exception conditions, limited to i/o ports emulated in userspace.
Some of these constraints will be relaxed in followup commits.
Requested by: grehan
Reviewed by: tychon (partially and a much earlier version)
an embedded newline appearing within the options string surrounded by
double-quotes. Rework the logic that goes into setting dataset options on
the root pool dataset while we're here -- added two new variables (which
can be altered via scripting) ZFSBOOT_POOL_CREATE_OPTIONS and also
ZFSBOOT_BOOT_POOL_CREATE_OPTIONS for setting pool/dataset attributes at
the time of pool creation. The former is for setting options on the root
pool (zroot) and the latter is for setting options on the optional separate
boot pool (bootpool) implicitly enabled when using either GELI or MBR. The
default value for the root pool variable (ZFSBOOT_POOL_CREATE_OPTIONS) is
"-O compress=lz4 -O atime=off" and the default value for separate boot pool
variable (ZFSBOOT_BOOT_POOL_CREATE_OPTIONS) is NULL (no additional options
for the separate boot pool dataset).
Reviewed by: allanjude
MFC after: 7 days
X-MFC-with: r266107-266109
default for newsyslog(8).
The /usr/local/etc/newsyslog.conf.d will give packages an opportunity to
install a default configuration to handle their own log files.
MFC after: 2 weeks
Relnotes: yes
In one case generating callgraph output from a 24MB system-wide sampling
data file took 17.4 seconds on average. Profiling showed pmcstat
spending a lot of time in strcmp, due to hash collisions.
Replacing the XOR-only hash with FNV-1a reduces the run time for my
test by 40%.
Replace usage of "prison" with "jail", since that term has mostly dropped
out of use. Note once at the beginning that the "prison" term is equivalent,
but do not use it otherwise. [1]
Some grammar issues.
Some mdoc formatting fixes.
Consistently use \(em for em dashes, with spaces around it.
Avoid contractions.
Prefer ssh to telnet.
PR: docs/176832 [1]
Approved by: hrs (mentor)
the legacy 8259A PICs.
- Implement an ICH-comptabile PCI interrupt router on the lpc device with
8 steerable pins configured via config space access to byte-wide
registers at 0x60-63 and 0x68-6b.
- For each configured PCI INTx interrupt, route it to both an I/O APIC
pin and a PCI interrupt router pin. When a PCI INTx interrupt is
asserted, ensure that both pins are asserted.
- Provide an initial routing of PCI interrupt router (PIRQ) pins to
8259A pins (ISA IRQs) and initialize the interrupt line config register
for the corresponding PCI function with the ISA IRQ as this matches
existing hardware.
- Add a global _PIC method for OSPM to select the desired interrupt routing
configuration.
- Update the _PRT methods for PCI bridges to provide both APIC and legacy
PRT tables and return the appropriate table based on the configured
routing configuration. Note that if the lpc device is not configured, no
routing information is provided.
- When the lpc device is enabled, provide ACPI PCI link devices corresponding
to each PIRQ pin.
- Add a VMM ioctl to adjust the trigger mode (edge vs level) for 8259A
pins via the ELCR.
- Mark the power management SCI as level triggered.
- Don't hardcode the number of elements in Packages in the source for
the DSDT. iasl(8) will fill in the actual number of elements, and
this makes it simpler to generate a Package with a variable number of
elements.
Reviewed by: tycho
It starts off being used to track the grammar for the number of disks
(singular vs plural) and then it is reused as the list of available disks.
Replace the variable with disks_grammar and move 'disk' and 'disks' to
msg_ vars so they can be translated in the future.
Submitted by: Allan Jude <freebsd@allanjude.com>
Reviewed by: roberto
MFC after: 2 weeks
Sponsored by: ScaleEngine Inc.
Set compress=lz4 for the entire pool, removing it from the individual
datasets
Remove exec=no from /usr/src, breaks the test suite.
Submitted by: Allan Jude <freebsd@allanjude.com>
Reviewed by: roberto
MFC after: 2 weeks
Sponsored by: ScaleEngine Inc.
encryption for swap, and optional gmirror for swap (which can be combined)
Submitted by: Allan Jude <freebsd@allanjude.com>
Requested By: roberto
Sponsored By: ScaleEngine Inc.
MFC after: 2 weeks
This has not added a lot of value when debugging bhyve issues while greatly
increasing the time and space required to store the core file.
Passing the "-C" option to bhyve(8) will change the default and dump guest
memory in the core dump.
Requested by: grehan
Reviewed by: grehan
Failing to do this will cause the kevent(2) notification to trigger
continuously and the bhyve(8) mevent thread will hog the cpu until the
characters on the backend tty device are drained.
Also, make the uart backend file descriptor non-blocking to avoid a
select(2) before every byte read from that backend.
Reviewed by: grehan
a 'hostcpu'. The new format of the argument string is "vcpu:hostcpu".
This allows pinning a subset of the vcpus if desired.
It also allows pinning a vcpu to more than a single 'hostcpu'.
Submitted by: novel (initial version)
However, if the original knote had been disabled then it is not automatically
re-enabled.
Fix this by using EV_ADD to create an mevent and EV_ENABLE to enable it.
Adding a kevent for the first time implicitly enables it so existing callers
of mevent_add() don't need to change.
Reviewed by: grehan
because there isn't a standard way to relay this information to the guest OS.
Add a command line option "-Y" to bhyve(8) to inhibit MPtable generation.
If the virtual machine is using PCI devices on buses other than 0 then it can
still use ACPI tables to convey this information to the guest.
Discussed with: grehan@
to sleep permanently by executing a HLT with interrupts disabled.
When this condition is detected the guest with be suspended with a reason of
VM_SUSPEND_HALT and the bhyve(8) process will exit.
Tested by executing "halt" inside a RHEL7-beta guest.
Discussed with: grehan@
Reviewed by: jhb@, tychon@
Omit "too many sections" warnings if the ELF file is not dynamically
linked (and is therefore skipped anyway), and otherwise output it only
once. An errant core file would previously cause kldxref to output a
number of warnings.
Also introduce a MAXSEGS #define and replace literal 2 with it, to make
comparisons clear.
Reviewed by: kib
Sponsored by: The FreeBSD Foundation
the 'HLT' instruction. This condition was detected by 'vm_handle_hlt()' and
converted into the SPINDOWN_CPU exitcode . The bhyve(8) process would exit
the vcpu thread in response to a SPINDOWN_CPU and when the last vcpu was
spun down it would reset the virtual machine via vm_suspend(VM_SUSPEND_RESET).
This functionality was broken in r263780 in a way that made it impossible
to kill the bhyve(8) process because it would loop forever in
vm_handle_suspend().
Unbreak this by removing the code to spindown vcpus. Thus a 'halt' from
a Linux guest will appear to be hung but this is consistent with the
behavior on bare metal. The guest can be rebooted by using the bhyvectl
options '--force-reset' or '--force-poweroff'.
Reviewed by: grehan@
by adding an argument to the VM_SUSPEND ioctl that specifies how the virtual
machine should be suspended, viz. VM_SUSPEND_RESET or VM_SUSPEND_POWEROFF.
The disposition of VM_SUSPEND is also made available to the exit handler
via the 'u.suspended' member of 'struct vm_exit'.
This capability is exposed via the '--force-reset' and '--force-poweroff'
arguments to /usr/sbin/bhyvectl.
Discussed with: grehan@
according to the method outlined in the AHCI spec.
Tested with FreeBSD 9/10/11 with MSI disabled,
and also NetBSD/amd64 (lightly).
Reviewed by: neel, tychon
MFC after: 3 weeks
a sysctl to determine what firmware is in use. This sysctl does not exist
yet, so the following blocks are in front of the wheels:
- I've provisionally called this "hw.platform" after the equivalent thing
on PPC
- The logic to check the sysctl is short-circuited to always choose BIOS.
There's a comment in the top of the file about how to turn this off.
If IA64 acquired a boot1.efifat-like thing (probably with very few
modifications), the same code could be adapted there.
Ignore writes, and return 0xff's, on config accesses when not set.
Behaviour now matches that seen on h/w.
Found with a NetBSD/amd64 guest.
Reviewed by: tychon
MFC after: 3 weeks
GEOM support (thereby adding GEOM support to the disk selection
menu of bsdinstall(8)'s `zfsboot' module updated herein).
MFC after: 1 week
X-MFC-with: 264840
different things from this commit:
+ More devices. Devices that were previously ignored are now present.
+ Faster device scanning. "There is no try, only Do" -- f_device_try()
is no longer the basis of device scanning as GEOM provides [nearly]
all devices (doesn't provide network devices).
+ More information available as non-root. Usually you have to be root
to do things like taste filesystems, and that limits the amount of
information available to non-root users; with GEOM, we see all even
running unprivileged as the brunt of information (except for so-
called ``dangerously dedicated'' file systems) is represented by the
`kern.geom.confxml' sysctl(8) MIB.
NB: Only really useful for external scripts that use the API and run as
non-root; where this code is used in bsdconfig(8) and bsdinstall(8)
you are running as root so can detect even ``dangerously dedicated''
file systems that are not present in GEOM; e.g., no PART class for
a DOS filesystem written directly to disk without partition table).
+ No more use of legacy tools such as diskinfo(8) to get disk capacity
or fdisk(8) to see partitions.
MFC after: 1 week
in not showing the most recent event by default.
- When the stop even is hit, break out of the outer loop to stop fetching
more events.
MFC after: 1 week
Status and Control register at port 0x61.
Be more conservative about "catching up" callouts that were supposed
to fire in the past by skipping an interrupt if it was
scheduled too far in the past.
Restore the PIT ACPI DSDT entries and add an entry for NMISC too.
Approved by: neel (co-mentor)
and normal mode; this makes it possible to compile with the former
by default, but use it only when neccessary. That's especially
important for the userland part.
Sponsored by: The FreeBSD Foundation
needed it to be already enabled, because listening in proxy mode
requires it; however, it's conf_apply() that opens pidfiles,
so it resulted in port being enabled before pidfile was opened.
This was not so bad, but it was also disabled when pidfile couldn't
be opened due to ctld already running; this means that starting
second ctld instance screwed up the first.
Sponsored by: The FreeBSD Foundation
that the slightly older dialog(1) requires --separate-output when using the
--checklist widget to force response to produce unquoted values (whereas in
stable/10 --checklist widget without --separate-output will only quote the
checklist labels in the response if the label is multi-word (contains any
whitespace).
Since these enhancements (see revisions 263956 and 264437) were developed
originally on 10, the --separate-output option was omitted. When merged to
stable/9, we (Allan Jude) and I found during testing that the "always-
quoting" of the response was causing things like struct interpolation to
fail (`f_struct device_$dev' would produce `f_struct device_\"da0\"' for
example -- literal quotes inherited from dialog(1) --checklist response).
To see the behavior, execute the following on stable/9 versus stable/10:
dialog --checklist disks: 0 0 0 da0 "" off da1 "" off
Check both items and hit enter. On stable/10, the response is:
da0 da1
On stable/9 the response is:
"da0" "da1"
If you use the --separate-output option, the response is the same for both:
da0
da1
So applying --separate-output on every platform until either one of two
things occurs 1) dialog(1,3) gets synchronized between stable/9, higher or
2) we drop support for stable/9.
MFC after: 3 days
Reviewed by: Allan Jude
compare.
Because of the change to find in SVN r253886, the entire temproot would be
deleted if it became empty, leading to a confusing message "*** FATAL ERROR:
The temproot directory ${TEMPROOT} has disappeared!"
Note that mergemaster does not do anything useful in this situation anyway
(e.g. put IGNORE_FILES="/etc/group /etc/master.passwd" in
/etc/mergemaster.rc and run mergemaster -p).
As noted in that commit, add -mindepth 1.
PR: bin/188485
Submitted by: David Boyd
MFC after: 1 week
and finish the job. ncurses is now the only Makefile in the tree that
uses it since it wasn't a simple mechanical change, and will be
addressed in a future commit.
0xff. Some guests may attempt to read from this port to identify
psuedo-PNP ISA devices. (The ie(4) driver in FreeBSD/i386 is one
example.)
Reviewed by: grehan
execution to a emumation program via parsing of ELF header information.
With this kernel module and userland tool, poudriere is able to build
ports packages via the QEMU userland tools (or another emulator program)
in a different architecture chroot, e.g. TARGET=mips TARGET_ARCH=mips
I'm not connecting this to GENERIC for obvious reasons, but this should
allow the kernel module to be built by default and enable the building
of the userland tool (which automatically loads the kernel module).
Submitted by: sson@
Reviewed by: jhb@
Call through to /dev/random synchronously to fill
virtio buffers with RNG data.
Tested with FreeBSD-CURRENT and Ubuntu guests.
Submitted by: Leon Dang
Discussed with: markm
MFC after: 3 weeks
Sponsored by: Nahanni Systems
Teach pciconf how to print out the status (enabled/disabled) of the ARI
capability on PCI Root Complexes and Downstream Ports.
MFC after: 2 months
Sponsored by: Sandvine Inc.
from any context i.e., it is not required to be called from a vcpu thread. The
ioctl simply sets a state variable 'vm->suspend' to '1' and returns.
The vcpus inspect 'vm->suspend' in the run loop and if it is set to '1' the
vcpu breaks out of the loop with a reason of 'VM_EXITCODE_SUSPENDED'. The
suspend handler waits until all 'vm->active_cpus' have transitioned to
'vm->suspended_cpus' before returning to userspace.
Discussed with: grehan
all the SUBDIR entries in parallel, instead of serially. Apply this
option to a selected number of Makefiles, which can greatly speed up the
build on multi-core machines, when using make -j.
This can be extended to more Makefiles later on, whenever they are
verified to work correctly with parallel building.
I tested this on a 24-core machine, with make -j48 buildworld (N = 6):
before stddev after stddev
======= ====== ======= ======
real time 1741.1 16.5 959.8 2.7
user time 12468.7 16.4 14393.0 16.8
sys time 1825.0 54.8 2110.6 22.8
(user+sys)/real 8.2 17.1
E.g. the build was approximately 45% faster in real time. On machines
with less cores, or with lower -j settings, the speedup will not be as
impressive. But at least you can now almost max out a machine with
buildworld!
Submitted by: jilles
MFC after: 2 weeks
The impact of this bug is that you cannot build a kernel if both of the
following are true:
1) The kernel config file is in a non-default location
2) The kernel config file uses the "include" statement from config(5).
usr.sbin/config/main.c
usr.sbin/config/config.8
usr.sbin/config/config.h
usr.sbin/config/lang.l
Added a "-I path" option to config(8). By analogy to cc(1), it adds
an extra path in which the "include" statement will search for
files.
Makefile.inc1
Pass "-I ${KERNCONFDIR}" to config(8).
PR: kern/187712
Reviewed by: will, imp (previous version)
MFC after: 3 weeks
Sponsored by: Spectra Logic Corporation
This change was originally going to only migrate the usr.sbin tests but, as
it turns out, the usr.sbin/sa/ tests require files from usr.bin/lastcomm/
so it's better to just also migrate the latter at the same time. The other
usr.bin tests will be moved separately.
To make these tests work within the test suite, some of them have required
changes to prevent modifying the source directory and instead just rely on
the current directory for file manipulation.
IPX was a network transport protocol in Novell's NetWare network operating
system from late 80s and then 90s. The NetWare itself switched to TCP/IP
as default transport in 1998. Later, in this century the Novell Open
Enterprise Server became successor of Novell NetWare. The last release
that claimed to still support IPX was OES 2 in 2007. Routing equipment
vendors (e.g. Cisco) discontinued support for IPX in 2011.
Thus, IPX won't be supported in FreeBSD 11.0-RELEASE.
taking a variable to set need to make sure they protect their locals; if
$var_to_set positional argument coincides with a local the expected call
to `setvar' will fail to reach outside of the function's namespace. When
such collisions are experienced (as I did in the rewrite of usermgmt) the
solution is to append a full or abbreviated version of the function name
to the local (ultimately eliminating collisions). This is rarely needed
and only occurs when you have a lot of like-named functions that pass
very similar $var_to_set positional arguments to each other (such as-is
the case with an expansive library such as `dialog.subr').
is not associated with user "username". E.g., user "foo" has primary group
"wheel" and is unassociated with group "foo", yet userdel would delete the
group "foo" when deleting user "foo" (despite the fact that user "foo" is
not associated with group "foo" in any way).
Patch committed with minor style(9) changes.
PR: bin/169471
Submitted by: Alexander Pyhalov <apyhalov@gmail.com>
New ioctls VM_ISA_ASSERT_IRQ, VM_ISA_DEASSERT_IRQ and VM_ISA_PULSE_IRQ
can be used to manipulate the pic, and optionally the ioapic, pin state.
Reviewed by: jhb, neel
Approved by: neel (co-mentor)
This fixes the issue of bhyve appearing to halt when using
nmdm ports for the console, until a connection is made to
the other end.
bhyveload already does this.
Reported by: Many.
MFC after: 3 weeks.
new command line options -W, to enable it when needed.
On my tests this change by almost ten times improves rpcbind performance.
No objections: many, net@
processor-specific VMCS or VMCB. The pending exception will be delivered right
before entering the guest.
The order of event injection into the guest is:
- hardware exception
- NMI
- maskable interrupt
In the Intel VT-x case, a pending NMI or interrupt will enable the interrupt
window-exiting and inject it as soon as possible after the hardware exception
is injected. Also since interrupts are inherently asynchronous, injecting
them after the hardware exception should not affect correctness from the
guest perspective.
Rename the unused ioctl VM_INJECT_EVENT to VM_INJECT_EXCEPTION and restrict
it to only deliver x86 hardware exceptions. This new ioctl is now used to
inject a protection fault when the guest accesses an unimplemented MSR.
Discussed with: grehan, jhb
Reviewed by: jhb
what btxldr expects (.set MEM_DATA,start+0x1000 in btxldr.S).
This makes resulting ELF binaries bootable with grub, gptboot and boot2.
PR: 153801
Submitted by: Gleb Kurtsou <gleb.kurtsou at gmail.com>
Tested by: Ruben Kerkhof <ruben at rubenkerkhof.com>
Glanced at by: jhb, peter
MFC after: 1 month
'-m <file>' spits out the given stream into <file> (eg, /dev/stdout).
However, it only resolves the first symbol; it doesn't parse the entire
callgraph. If it fails to lookup then it doesn't print anything.
'-a' instead does a symbol and file:line lookup for each address in each
callgraph and will happily print the address itself with no lookup
information if it couldn't look things up.
This makes it much easier to pull out individual records from a
pmc data file and look at the callgraph information without having to
hand-decode the addresses.
Sponsored by: Netflix, Inc.
Modelled after the i386 zfsloader. However, with no
2nd stage zfsboot to search for a bootable dataset,
attempt a ZFS boot if there is more than one ZFS
dataset found during the disk probe.
sys/boot/userboot/zfs
- build the ZFS boot library
sys/boot/userboot/userboot/
conf.c
- Add the ZFS pool and filesystem tables
devicename.c
- correctly format ZFS devices
main.c
- increase the size of the libstand malloc pool
to account for the increased usage from ZFS buffers
- probe for a ZFS dataset, and if one is
found, attempt to boot from it.
usr.sbin/bhyveload/bhyveload.c
- allow multiple invocations of the '-d' option
to specify multiple disks e.g. a raidz set.
Up to 32 disks are supported.
Tested with various combinations of GPT, MBR, single
and multiple disks, RAID-Z, mirrors.
Reviewed by: neel
Discussed with: avg
Tested by: Michael Dexter and others
MFC after: 3 weeks
simplify the implementation of the x2APIC virtualization assist in VT-x.
Prior to this change the vlapic allowed the guest to change its mode from
xAPIC to x2APIC. We don't allow that any more and the vlapic mode is locked
when the virtual machine is created. This is not very constraining because
operating systems already have to deal with BIOS setting up the APIC in
x2APIC mode at boot.
Fix a bug in the CPUID emulation where the x2APIC capability was leaking
from the host to the guest.
Ignore MMIO reads and writes to the vlapic in x2APIC mode. Similarly, ignore
MSR accesses to the vlapic when it is in xAPIC mode.
The default configuration of the vlapic is xAPIC. The "-x" option to bhyve(8)
can be used to change the mode to x2APIC instead.
Discussed with: grehan@
the non-standard zero capability list terminator. Instead, track
the start and end of the most recently added capability and use that
to adjust the previous capability's next pointer when a capability is
added and to determine the range of config registers belonging to
PCI capability registers.
Reviewed by: neel
NB: If the zfsboot variables ($ZFSBOOT_*) are set, a script is
assumed to want zfsboot module instead of scriptedpart module.
Submitted by: Loïc Brarda <loic.brarda@cern.ch>
Reviewed by: nwhitehorn@
MFC after: 3 days
This is done by representing each bus as root PCI device in ACPI. The device
implements the _BBN method to return the PCI bus number to the guest OS.
Each PCI bus keeps track of the resources that is decodes for devices
configured on the bus: i/o, mmio (32-bit) and mmio (64-bit). These windows
are advertised to the guest via the _CRS object of the root device.
Bus 0 is treated specially since it consumes the I/O ports to access the
PCI config space [0xcf8-0xcff]. It also decodes the legacy I/O ports that
are consumed by devices on the LPC bus. For this reason the LPC bridge can
be configured only on bus 0.
The bus number can be specified using the following command line option
to bhyve(8): "-s <bus>:<slot>:<func>,<emul>[,<config>]"
Discussed with: grehan@
Reviewed by: jhb@
the IDENTIFY DEVICE and IDENTIFY PACKET DEVICE commands.
Also, provide an indication a "D2H Register FIS" occurred during a SET FEATURES
command.
Approved by: grehan (co-mentor)
this could lead to the -n option effectively being ignored (in case
ac_line happened to be 0 aka SRC_AC), or other undefined behaviour.
PR: 169779
Submitted by: Alex Gonzalez <loox at e-shell.net>
Reviewed by: jhb
MFC after: 2 weeks