Previously the sizes were inferred indirectly based on the size of the mappings
at 0 and 4GB respectively. This works fine as long as the size of the allocation is
identical to the size of the mapping in the guest's address space. However, if
the mapping is disjoint then this assumption falls apart (e.g., due to the
legacy BIOS hole between 640KB and 1MB).
running at the same time causing problems w/ wifi not working..
the patch will be submitted upstream... The next step if someone wants
to push it upstream is to break os_unix.c up so that all these other
utilities don't need libutil..
Reviewed by: rpaulo
separate argument structure with added level_type field for
CPUID_CPUID_COUNT request.
Reviewed by: attilio (previous version)
Sponsored by: The FreeBSD Foundation
MFC after: 2 weeks
performing cpuid calls.
Add also a new way to specify the level type to cpucontrol(8) as
reported in the manpage.
Sponsored by: EMC / Isilon storage division
Reviewed by: bdrewery, gcooper
Tested by: bdrewery
Before this it was impossible to use all 16 bytes of the serial number, and
the client always got the serial number NUL-terminated, which is not required.
MFC after: 2 weeks
This is currently an opt-in build flag. Once ASLR support is ready and stable
it should be changed to opt-out and be enabled by default along with ASLR.
Each application Makefile uses opt-out to ensure that ASLR will be enabled by
default in new directories when the system is compiled with PIE/ASLR. [2]
Mark known build failures as NO_PIE for now.
The only known runtime failure was rtld.
[1] http://www.bsdcan.org/2014/schedule/events/452.en.html
Submitted by: Shawn Webb <lattera@gmail.com>
Discussed between: des@ and Shawn Webb [2]
register pairs where two 32-bit registers make up a larger logical
size. Support those accesses by splitting the quad-word into two
double-words.
Reviewed by: grehan
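As a rough illustration of the split described above, the following C sketch
performs one 64-bit write as two 32-bit register writes. The names fake_regs,
reg_write32 and reg_write64 are hypothetical and are not taken from the
committed code; the sketch also assumes the low double-word lives at the lower
register index.

```c
#include <stdint.h>

/* Hypothetical register file made of 32-bit registers. */
static uint32_t fake_regs[4];

/* Hypothetical 32-bit register write, standing in for the device model. */
void
reg_write32(unsigned int idx, uint32_t val)
{
	fake_regs[idx] = val;
}

/* Emulate a quad-word write as two double-word writes (low half first). */
void
reg_write64(unsigned int idx, uint64_t val)
{
	reg_write32(idx, (uint32_t)(val & 0xffffffffu));
	reg_write32(idx + 1, (uint32_t)(val >> 32));
}
```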
remove the now-redundant checks for RELEASE_CRUNCH. This originally
was defined for building smaller sysinstall images, but was later also
used by picobsd builds for a similar purpose. Now that we've moved
away from sysinstall, picobsd is the only remaining consumer of this
interface. Adding these two options reduces the RELEASE_CRUNCH
special cases in the tree by half.
the system.
Together with lm75(4) this module allows easy temperature monitoring over
SNMP, especially for embedded systems.
Manual page reviewed by: brueffer (D128)
to 16. This is arbitrary and is used to ensure that a vcpu goes back into
the vm_run() loop to process interrupts or rendezvous events in a timely
fashion.
Found with: Coverity Scan
CID: 1216436
it implicitly in vmm.ko.
Add ioctl VM_GET_CPUS to get the current set of 'active' and 'suspended' cpus
and display them via /usr/sbin/bhyvectl using the "--get-active-cpus" and
"--get-suspended-cpus" options.
This is in preparation for being able to reset virtual machine state without
having to destroy and recreate it.
Put the superblock in the correct position for UFS2... There is a bug
in FFS that if we don't put it here (for UFS2), it will forcefully
relocate the superblock, and I believe cause data loss..
I have a fix for that, but w/ how many releases are broken, we won't be
able to switch to the better _FLOPPY (block 0) for this for a while..
o Teach vidcontrol(1) to distinguish which virtual terminal system is running now.
o Load vt(4) fonts from a different location.
o Add $FreeBSD$ tag for path.h.
Tested by: Claude Buisson <clbuisson@orange.fr>
MFC after: 7 days
Sponsored by: The FreeBSD Foundation
fault on the destination buffer.
Prior to this change a page fault would be detected in vm_copyout(). This
was done after the I/O port access was done. If the I/O port access had
side-effects (e.g. reading the uart FIFO) then restarting the instruction
would result in incorrect behavior.
Fix this by validating the guest linear address before doing the I/O port
emulation. If the validation results in a page fault exception being injected
into the guest then the instruction can now be restarted without any
side-effects.
API function 'vie_calculate_gla()'.
While the current implementation is simplistic it forms the basis of doing
segmentation checks if the guest is in 32-bit protected mode.
of the guest linear address space. These APIs in turn use a new ioctl
'VM_GLA2GPA' to convert the guest linear address to guest physical.
Use the new copyin/copyout APIs when emulating ins/outs instruction in
bhyve(8).
'struct vm_guest_paging'.
Check for canonical addressing in vmm_gla2gpa() and inject a protection
fault into the guest if a violation is detected.
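A minimal sketch of a canonical-address check, assuming 48-bit virtual
addresses (bits 63:47 must all be equal). The function name gla_is_canonical
is hypothetical and is not the helper used in vmm_gla2gpa().

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Assumes 48-bit virtual addresses: an address is canonical when
 * bits 63..47 are either all zeroes or all ones.
 */
bool
gla_is_canonical(uint64_t gla)
{
	uint64_t upper = gla >> 47;	/* the 17 bits 63..47 */

	return (upper == 0 || upper == 0x1ffff);
}
```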
If the page table walk is restarted in vmm_gla2gpa() then reset 'ptpphys' to
point to the root of the page tables.
the UART FIFO.
The emulation is constrained in a number of ways: 64-bit only, doesn't check
for all exception conditions, limited to i/o ports emulated in userspace.
Some of these constraints will be relaxed in followup commits.
Requested by: grehan
Reviewed by: tychon (partially and a much earlier version)
an embedded newline appearing within the options string surrounded by
double-quotes. Rework the logic that goes into setting dataset options on
the root pool dataset while we're here -- added two new variables (which
can be altered via scripting) ZFSBOOT_POOL_CREATE_OPTIONS and also
ZFSBOOT_BOOT_POOL_CREATE_OPTIONS for setting pool/dataset attributes at
the time of pool creation. The former is for setting options on the root
pool (zroot) and the latter is for setting options on the optional separate
boot pool (bootpool) implicitly enabled when using either GELI or MBR. The
default value for the root pool variable (ZFSBOOT_POOL_CREATE_OPTIONS) is
"-O compress=lz4 -O atime=off" and the default value for separate boot pool
variable (ZFSBOOT_BOOT_POOL_CREATE_OPTIONS) is NULL (no additional options
for the separate boot pool dataset).
Reviewed by: allanjude
MFC after: 7 days
X-MFC-with: r266107-266109
default for newsyslog(8).
The /usr/local/etc/newsyslog.conf.d will give packages an opportunity to
install a default configuration to handle their own log files.
MFC after: 2 weeks
Relnotes: yes
In one case generating callgraph output from a 24MB system-wide sampling
data file took 17.4 seconds on average. Profiling showed pmcstat
spending a lot of time in strcmp, due to hash collisions.
Replacing the XOR-only hash with FNV-1a reduces the run time for my
test by 40%.
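For reference, a sketch of 32-bit FNV-1a next to a simple XOR-only hash. The
xor_hash function is only a hypothetical stand-in to show why XOR alone
collides easily (it ignores byte order); it is not the hash that pmcstat used,
and fnv1a_32 is not copied from the committed code.

```c
#include <stdint.h>

/* Hypothetical XOR-only hash: any permutation of the same bytes collides. */
uint32_t
xor_hash(const char *s)
{
	uint32_t h = 0;

	while (*s != '\0')
		h ^= (uint8_t)*s++;
	return (h);
}

/* 32-bit FNV-1a: XOR in each byte, then multiply by the FNV prime. */
uint32_t
fnv1a_32(const char *s)
{
	uint32_t h = 2166136261u;	/* FNV offset basis */

	while (*s != '\0') {
		h ^= (uint8_t)*s++;
		h *= 16777619u;		/* FNV prime */
	}
	return (h);
}
```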
Replace usage of "prison" with "jail", since that term has mostly dropped
out of use. Note once at the beginning that the "prison" term is equivalent,
but do not use it otherwise. [1]
Some grammar issues.
Some mdoc formatting fixes.
Consistently use \(em for em dashes, with spaces around it.
Avoid contractions.
Prefer ssh to telnet.
PR: docs/176832 [1]
Approved by: hrs (mentor)
the legacy 8259A PICs.
- Implement an ICH-compatible PCI interrupt router on the lpc device with
8 steerable pins configured via config space access to byte-wide
registers at 0x60-63 and 0x68-6b.
- For each configured PCI INTx interrupt, route it to both an I/O APIC
pin and a PCI interrupt router pin. When a PCI INTx interrupt is
asserted, ensure that both pins are asserted.
- Provide an initial routing of PCI interrupt router (PIRQ) pins to
8259A pins (ISA IRQs) and initialize the interrupt line config register
for the corresponding PCI function with the ISA IRQ as this matches
existing hardware.
- Add a global _PIC method for OSPM to select the desired interrupt routing
configuration.
- Update the _PRT methods for PCI bridges to provide both APIC and legacy
PRT tables and return the appropriate table based on the configured
routing configuration. Note that if the lpc device is not configured, no
routing information is provided.
- When the lpc device is enabled, provide ACPI PCI link devices corresponding
to each PIRQ pin.
- Add a VMM ioctl to adjust the trigger mode (edge vs level) for 8259A
pins via the ELCR.
- Mark the power management SCI as level triggered.
- Don't hardcode the number of elements in Packages in the source for
the DSDT. iasl(8) will fill in the actual number of elements, and
this makes it simpler to generate a Package with a variable number of
elements.
Reviewed by: tycho
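The register layout mentioned in the first item (8 byte-wide registers at
0x60-63 and 0x68-6b) can be sketched as a pin-to-offset mapping. The function
name pirq_config_offset is hypothetical, and the assumption that pins 0-3 map
to 0x60-63 and pins 4-7 to 0x68-6b is mine, not stated in the commit.

```c
/*
 * Map a PIRQ pin index (0..7) to its config-space register offset.
 * Assumed layout: pins 0..3 at 0x60..0x63, pins 4..7 at 0x68..0x6b.
 * Returns -1 for an out-of-range pin.
 */
int
pirq_config_offset(int pin)
{
	if (pin < 0 || pin > 7)
		return (-1);
	return (pin < 4 ? 0x60 + pin : 0x68 + (pin - 4));
}
```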
It starts off being used to track the grammar for the number of disks
(singular vs plural) and then it is reused as the list of available disks.
Replace the variable with disks_grammar and move 'disk' and 'disks' to
msg_ vars so they can be translated in the future.
Submitted by: Allan Jude <freebsd@allanjude.com>
Reviewed by: roberto
MFC after: 2 weeks
Sponsored by: ScaleEngine Inc.
Set compress=lz4 for the entire pool, removing it from the individual
datasets
Remove exec=no from /usr/src, breaks the test suite.
Submitted by: Allan Jude <freebsd@allanjude.com>
Reviewed by: roberto
MFC after: 2 weeks
Sponsored by: ScaleEngine Inc.
encryption for swap, and optional gmirror for swap (which can be combined)
Submitted by: Allan Jude <freebsd@allanjude.com>
Requested By: roberto
Sponsored By: ScaleEngine Inc.
MFC after: 2 weeks
This has not added a lot of value when debugging bhyve issues while greatly
increasing the time and space required to store the core file.
Passing the "-C" option to bhyve(8) will change the default and dump guest
memory in the core dump.
Requested by: grehan
Reviewed by: grehan
Failing to do this will cause the kevent(2) notification to trigger
continuously and the bhyve(8) mevent thread will hog the cpu until the
characters on the backend tty device are drained.
Also, make the uart backend file descriptor non-blocking to avoid a
select(2) before every byte read from that backend.
Reviewed by: grehan
a 'hostcpu'. The new format of the argument string is "vcpu:hostcpu".
This allows pinning a subset of the vcpus if desired.
It also allows pinning a vcpu to more than a single 'hostcpu'.
Submitted by: novel (initial version)
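A small sketch of how a "vcpu:hostcpu" argument string could be parsed. The
function name parse_pincpu and its error convention are hypothetical; bhyve(8)
itself may parse and validate the argument differently.

```c
#include <stdio.h>

/*
 * Parse "vcpu:hostcpu" into two non-negative integers.
 * Returns 0 on success, -1 on malformed input or trailing garbage.
 */
int
parse_pincpu(const char *opt, int *vcpu, int *hostcpu)
{
	char trailing;

	/* Exactly two conversions must succeed; a third means extra text. */
	if (sscanf(opt, "%d:%d%c", vcpu, hostcpu, &trailing) != 2)
		return (-1);
	if (*vcpu < 0 || *hostcpu < 0)
		return (-1);
	return (0);
}
```

Calling such a parser once per occurrence of the option is one way to pin only
a subset of the vcpus, or to give one vcpu several host cpus.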
However, if the original knote had been disabled then it is not automatically
re-enabled.
Fix this by using EV_ADD to create an mevent and EV_ENABLE to enable it.
Adding a kevent for the first time implicitly enables it so existing callers
of mevent_add() don't need to change.
Reviewed by: grehan
because there isn't a standard way to relay this information to the guest OS.
Add a command line option "-Y" to bhyve(8) to inhibit MPtable generation.
If the virtual machine is using PCI devices on buses other than 0 then it can
still use ACPI tables to convey this information to the guest.
Discussed with: grehan@
to sleep permanently by executing a HLT with interrupts disabled.
When this condition is detected the guest will be suspended with a reason of
VM_SUSPEND_HALT and the bhyve(8) process will exit.
Tested by executing "halt" inside a RHEL7-beta guest.
Discussed with: grehan@
Reviewed by: jhb@, tychon@
Omit "too many sections" warnings if the ELF file is not dynamically
linked (and is therefore skipped anyway), and otherwise output it only
once. An errant core file would previously cause kldxref to output a
number of warnings.
Also introduce a MAXSEGS #define and replace literal 2 with it, to make
comparisons clear.
Reviewed by: kib
Sponsored by: The FreeBSD Foundation
the 'HLT' instruction. This condition was detected by 'vm_handle_hlt()' and
converted into the SPINDOWN_CPU exitcode. The bhyve(8) process would exit
the vcpu thread in response to a SPINDOWN_CPU and when the last vcpu was
spun down it would reset the virtual machine via vm_suspend(VM_SUSPEND_RESET).
This functionality was broken in r263780 in a way that made it impossible
to kill the bhyve(8) process because it would loop forever in
vm_handle_suspend().
Unbreak this by removing the code to spindown vcpus. Thus a 'halt' from
a Linux guest will appear to be hung but this is consistent with the
behavior on bare metal. The guest can be rebooted by using the bhyvectl
options '--force-reset' or '--force-poweroff'.
Reviewed by: grehan@
by adding an argument to the VM_SUSPEND ioctl that specifies how the virtual
machine should be suspended, viz. VM_SUSPEND_RESET or VM_SUSPEND_POWEROFF.
The disposition of VM_SUSPEND is also made available to the exit handler
via the 'u.suspended' member of 'struct vm_exit'.
This capability is exposed via the '--force-reset' and '--force-poweroff'
arguments to /usr/sbin/bhyvectl.
Discussed with: grehan@
according to the method outlined in the AHCI spec.
Tested with FreeBSD 9/10/11 with MSI disabled,
and also NetBSD/amd64 (lightly).
Reviewed by: neel, tychon
MFC after: 3 weeks
a sysctl to determine what firmware is in use. This sysctl does not exist
yet, so the following blocks are in front of the wheels:
- I've provisionally called this "hw.platform" after the equivalent thing
on PPC
- The logic to check the sysctl is short-circuited to always choose BIOS.
There's a comment in the top of the file about how to turn this off.
If IA64 acquired a boot1.efifat-like thing (probably with very few
modifications), the same code could be adapted there.
Ignore writes, and return 0xff's, on config accesses when not set.
Behaviour now matches that seen on h/w.
Found with a NetBSD/amd64 guest.
Reviewed by: tychon
MFC after: 3 weeks
GEOM support (thereby adding GEOM support to the disk selection
menu of bsdinstall(8)'s `zfsboot' module updated herein).
MFC after: 1 week
X-MFC-with: 264840
different things from this commit:
+ More devices. Devices that were previously ignored are now present.
+ Faster device scanning. "There is no try, only Do" -- f_device_try()
is no longer the basis of device scanning as GEOM provides [nearly]
all devices (doesn't provide network devices).
+ More information available as non-root. Usually you have to be root
to do things like taste filesystems, and that limits the amount of
information available to non-root users; with GEOM, we see everything even
when running unprivileged, as the bulk of the information (except for
so-called ``dangerously dedicated'' file systems) is represented by the
`kern.geom.confxml' sysctl(8) MIB.
NB: Only really useful for external scripts that use the API and run as
non-root; where this code is used in bsdconfig(8) and bsdinstall(8)
you are running as root so can detect even ``dangerously dedicated''
file systems that are not present in GEOM (e.g., no PART class for
a DOS filesystem written directly to disk without a partition table).
+ No more use of legacy tools such as diskinfo(8) to get disk capacity
or fdisk(8) to see partitions.
MFC after: 1 week
in not showing the most recent event by default.
- When the stop event is hit, break out of the outer loop to stop fetching
more events.
MFC after: 1 week
Status and Control register at port 0x61.
Be more conservative about "catching up" callouts that were supposed
to fire in the past by skipping an interrupt if it was
scheduled too far in the past.
Restore the PIT ACPI DSDT entries and add an entry for NMISC too.
Approved by: neel (co-mentor)
and normal mode; this makes it possible to compile with the former
by default, but use it only when necessary. That's especially
important for the userland part.
Sponsored by: The FreeBSD Foundation
needed it to be already enabled, because listening in proxy mode
requires it; however, it's conf_apply() that opens pidfiles,
so it resulted in port being enabled before pidfile was opened.
This was not so bad, but it was also disabled when pidfile couldn't
be opened due to ctld already running; this means that starting
second ctld instance screwed up the first.
Sponsored by: The FreeBSD Foundation
that the slightly older dialog(1) requires --separate-output when using the
--checklist widget to force response to produce unquoted values (whereas in
stable/10 --checklist widget without --separate-output will only quote the
checklist labels in the response if the label is multi-word (contains any
whitespace)).
Since these enhancements (see revisions 263956 and 264437) were developed
originally on 10, the --separate-output option was omitted. When merged to
stable/9, Allan Jude and I found during testing that the "always-quoting"
of the response was causing things like struct interpolation to
fail (`f_struct device_$dev' would produce `f_struct device_\"da0\"' for
example -- literal quotes inherited from dialog(1) --checklist response).
To see the behavior, execute the following on stable/9 versus stable/10:
dialog --checklist disks: 0 0 0 da0 "" off da1 "" off
Check both items and hit enter. On stable/10, the response is:
da0 da1
On stable/9 the response is:
"da0" "da1"
If you use the --separate-output option, the response is the same for both:
da0
da1
So apply --separate-output on every platform until one of two things
occurs: 1) dialog(1,3) gets synchronized between stable/9 and higher, or
2) we drop support for stable/9.
MFC after: 3 days
Reviewed by: Allan Jude
compare.
Because of the change to find in SVN r253886, the entire temproot would be
deleted if it became empty, leading to a confusing message "*** FATAL ERROR:
The temproot directory ${TEMPROOT} has disappeared!"
Note that mergemaster does not do anything useful in this situation anyway
(e.g. put IGNORE_FILES="/etc/group /etc/master.passwd" in
/etc/mergemaster.rc and run mergemaster -p).
As noted in that commit, add -mindepth 1.
PR: bin/188485
Submitted by: David Boyd
MFC after: 1 week
and finish the job. ncurses is now the only Makefile in the tree that
uses it since it wasn't a simple mechanical change, and will be
addressed in a future commit.
0xff. Some guests may attempt to read from this port to identify
pseudo-PNP ISA devices. (The ie(4) driver in FreeBSD/i386 is one
example.)
Reviewed by: grehan
execution to an emulation program via parsing of ELF header information.
With this kernel module and userland tool, poudriere is able to build
ports packages via the QEMU userland tools (or another emulator program)
in a different architecture chroot, e.g. TARGET=mips TARGET_ARCH=mips
I'm not connecting this to GENERIC for obvious reasons, but this should
allow the kernel module to be built by default and enable the building
of the userland tool (which automatically loads the kernel module).
Submitted by: sson@
Reviewed by: jhb@
Call through to /dev/random synchronously to fill
virtio buffers with RNG data.
Tested with FreeBSD-CURRENT and Ubuntu guests.
Submitted by: Leon Dang
Discussed with: markm
MFC after: 3 weeks
Sponsored by: Nahanni Systems
Teach pciconf how to print out the status (enabled/disabled) of the ARI
capability on PCI Root Complexes and Downstream Ports.
MFC after: 2 months
Sponsored by: Sandvine Inc.
from any context i.e., it is not required to be called from a vcpu thread. The
ioctl simply sets a state variable 'vm->suspend' to '1' and returns.
The vcpus inspect 'vm->suspend' in the run loop and if it is set to '1' the
vcpu breaks out of the loop with a reason of 'VM_EXITCODE_SUSPENDED'. The
suspend handler waits until all 'vm->active_cpus' have transitioned to
'vm->suspended_cpus' before returning to userspace.
Discussed with: grehan
all the SUBDIR entries in parallel, instead of serially. Apply this
option to a selected number of Makefiles, which can greatly speed up the
build on multi-core machines, when using make -j.
This can be extended to more Makefiles later on, whenever they are
verified to work correctly with parallel building.
I tested this on a 24-core machine, with make -j48 buildworld (N = 6):
                  before  stddev    after  stddev
                 =======  ======  =======  ======
real time         1741.1    16.5    959.8     2.7
user time        12468.7    16.4  14393.0    16.8
sys time          1825.0    54.8   2110.6    22.8
(user+sys)/real      8.2             17.1
I.e., the build was approximately 45% faster in real time. On machines
with fewer cores, or with lower -j settings, the speedup will not be as
impressive. But at least you can now almost max out a machine with
buildworld!
Submitted by: jilles
MFC after: 2 weeks
The impact of this bug is that you cannot build a kernel if both of the
following are true:
1) The kernel config file is in a non-default location
2) The kernel config file uses the "include" statement from config(5).
usr.sbin/config/main.c
usr.sbin/config/config.8
usr.sbin/config/config.h
usr.sbin/config/lang.l
Added a "-I path" option to config(8). By analogy to cc(1), it adds
an extra path in which the "include" statement will search for
files.
Makefile.inc1
Pass "-I ${KERNCONFDIR}" to config(8).
PR: kern/187712
Reviewed by: will, imp (previous version)
MFC after: 3 weeks
Sponsored by: Spectra Logic Corporation
This change was originally going to only migrate the usr.sbin tests but, as
it turns out, the usr.sbin/sa/ tests require files from usr.bin/lastcomm/
so it's better to just also migrate the latter at the same time. The other
usr.bin tests will be moved separately.
To make these tests work within the test suite, some of them have required
changes to prevent modifying the source directory and instead just rely on
the current directory for file manipulation.
IPX was a network transport protocol in Novell's NetWare network operating
system in the late 80s and the 90s. NetWare itself switched to TCP/IP
as the default transport in 1998. Later, in this century, Novell Open
Enterprise Server became the successor of Novell NetWare. The last release
that claimed to still support IPX was OES 2 in 2007. Routing equipment
vendors (e.g. Cisco) discontinued support for IPX in 2011.
Thus, IPX won't be supported in FreeBSD 11.0-RELEASE.
taking a variable to set need to make sure they protect their locals; if
$var_to_set positional argument coincides with a local the expected call
to `setvar' will fail to reach outside of the function's namespace. When
such collisions are experienced (as I did in the rewrite of usermgmt) the
solution is to append a full or abbreviated version of the function name
to the local (ultimately eliminating collisions). This is rarely needed
and only occurs when you have a lot of like-named functions that pass
very similar $var_to_set positional arguments to each other (such as is
the case with an expansive library such as `dialog.subr').
is not associated with user "username". E.g., user "foo" has primary group
"wheel" and is unassociated with group "foo", yet userdel would delete the
group "foo" when deleting user "foo" (despite the fact that user "foo" is
not associated with group "foo" in any way).
Patch committed with minor style(9) changes.
PR: bin/169471
Submitted by: Alexander Pyhalov <apyhalov@gmail.com>
New ioctls VM_ISA_ASSERT_IRQ, VM_ISA_DEASSERT_IRQ and VM_ISA_PULSE_IRQ
can be used to manipulate the pic, and optionally the ioapic, pin state.
Reviewed by: jhb, neel
Approved by: neel (co-mentor)
This fixes the issue of bhyve appearing to halt when using
nmdm ports for the console, until a connection is made to
the other end.
bhyveload already does this.
Reported by: Many.
MFC after: 3 weeks.
new command line options -W, to enable it when needed.
In my tests this change improves rpcbind performance almost tenfold.
No objections: many, net@
processor-specific VMCS or VMCB. The pending exception will be delivered right
before entering the guest.
The order of event injection into the guest is:
- hardware exception
- NMI
- maskable interrupt
In the Intel VT-x case, a pending NMI or interrupt will enable the interrupt
window-exiting and inject it as soon as possible after the hardware exception
is injected. Also since interrupts are inherently asynchronous, injecting
them after the hardware exception should not affect correctness from the
guest perspective.
Rename the unused ioctl VM_INJECT_EVENT to VM_INJECT_EXCEPTION and restrict
it to only deliver x86 hardware exceptions. This new ioctl is now used to
inject a protection fault when the guest accesses an unimplemented MSR.
Discussed with: grehan, jhb
Reviewed by: jhb
what btxldr expects (.set MEM_DATA,start+0x1000 in btxldr.S).
This makes resulting ELF binaries bootable with grub, gptboot and boot2.
PR: 153801
Submitted by: Gleb Kurtsou <gleb.kurtsou at gmail.com>
Tested by: Ruben Kerkhof <ruben at rubenkerkhof.com>
Glanced at by: jhb, peter
MFC after: 1 month
'-m <file>' spits out the given stream into <file> (eg, /dev/stdout).
However, it only resolves the first symbol; it doesn't parse the entire
callgraph. If it fails to lookup then it doesn't print anything.
'-a' instead does a symbol and file:line lookup for each address in each
callgraph and will happily print the address itself with no lookup
information if it couldn't look things up.
This makes it much easier to pull out individual records from a
pmc data file and look at the callgraph information without having to
hand-decode the addresses.
Sponsored by: Netflix, Inc.
Modelled after the i386 zfsloader. However, with no
2nd stage zfsboot to search for a bootable dataset,
attempt a ZFS boot if there is more than one ZFS
dataset found during the disk probe.
sys/boot/userboot/zfs
- build the ZFS boot library
sys/boot/userboot/userboot/
conf.c
- Add the ZFS pool and filesystem tables
devicename.c
- correctly format ZFS devices
main.c
- increase the size of the libstand malloc pool
to account for the increased usage from ZFS buffers
- probe for a ZFS dataset, and if one is
found, attempt to boot from it.
usr.sbin/bhyveload/bhyveload.c
- allow multiple invocations of the '-d' option
to specify multiple disks e.g. a raidz set.
Up to 32 disks are supported.
Tested with various combinations of GPT, MBR, single
and multiple disks, RAID-Z, mirrors.
Reviewed by: neel
Discussed with: avg
Tested by: Michael Dexter and others
MFC after: 3 weeks
simplify the implementation of the x2APIC virtualization assist in VT-x.
Prior to this change the vlapic allowed the guest to change its mode from
xAPIC to x2APIC. We don't allow that any more and the vlapic mode is locked
when the virtual machine is created. This is not very constraining because
operating systems already have to deal with BIOS setting up the APIC in
x2APIC mode at boot.
Fix a bug in the CPUID emulation where the x2APIC capability was leaking
from the host to the guest.
Ignore MMIO reads and writes to the vlapic in x2APIC mode. Similarly, ignore
MSR accesses to the vlapic when it is in xAPIC mode.
The default configuration of the vlapic is xAPIC. The "-x" option to bhyve(8)
can be used to change the mode to x2APIC instead.
Discussed with: grehan@
the non-standard zero capability list terminator. Instead, track
the start and end of the most recently added capability and use that
to adjust the previous capability's next pointer when a capability is
added and to determine the range of config registers belonging to
PCI capability registers.
Reviewed by: neel
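The bookkeeping described above can be sketched roughly as follows. The struct
cfg_space type, the add_capability and is_cap_register functions, and the
choice of 0x40 as the first capability offset are all hypothetical and only
illustrate the idea; the capabilities pointer at 0x34 and the capabilities-list
bit (0x10) in the status register at 0x06 are standard PCI config space fields.

```c
#include <stdint.h>
#include <string.h>

#define	CFG_SIZE	256
#define	CAP_START	0x40	/* assumed first offset used for capabilities */
#define	PCIR_STATUS	0x06	/* status register (low byte) */
#define	PCIM_CAPLIST	0x10	/* "capabilities list" status bit */
#define	PCIR_CAP_PTR	0x34	/* capabilities pointer */

/* Hypothetical per-device state; all fields start out zeroed. */
struct cfg_space {
	uint8_t	data[CFG_SIZE];
	int	prevcap;	/* start of the most recently added capability */
	int	capend;		/* offset just past it, 0 if there is none */
};

/* Append a capability; returns its offset, or -1 if it does not fit. */
int
add_capability(struct cfg_space *cs, const uint8_t *cap, int len)
{
	int capoff;

	/* Keep capabilities 4-byte aligned. */
	capoff = cs->capend != 0 ? (cs->capend + 3) & ~3 : CAP_START;
	if (len < 2 || capoff + len > CFG_SIZE)
		return (-1);

	memcpy(&cs->data[capoff], cap, (size_t)len);
	cs->data[capoff + 1] = 0;	/* new tail: next pointer is 0 */

	if (cs->capend == 0) {
		/* First capability: hook it up to the list head. */
		cs->data[PCIR_CAP_PTR] = (uint8_t)capoff;
		cs->data[PCIR_STATUS] |= PCIM_CAPLIST;
	} else {
		/* Adjust the previous capability's next pointer. */
		cs->data[cs->prevcap + 1] = (uint8_t)capoff;
	}
	cs->prevcap = capoff;
	cs->capend = capoff + len;
	return (capoff);
}

/* True if 'offset' falls inside the tracked capability range. */
int
is_cap_register(const struct cfg_space *cs, int offset)
{
	return (cs->capend != 0 && offset >= CAP_START && offset < cs->capend);
}
```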
NB: If the zfsboot variables ($ZFSBOOT_*) are set, a script is
assumed to want zfsboot module instead of scriptedpart module.
Submitted by: Loïc Brarda <loic.brarda@cern.ch>
Reviewed by: nwhitehorn@
MFC after: 3 days
This is done by representing each bus as root PCI device in ACPI. The device
implements the _BBN method to return the PCI bus number to the guest OS.
Each PCI bus keeps track of the resources that it decodes for devices
configured on the bus: i/o, mmio (32-bit) and mmio (64-bit). These windows
are advertised to the guest via the _CRS object of the root device.
Bus 0 is treated specially since it consumes the I/O ports to access the
PCI config space [0xcf8-0xcff]. It also decodes the legacy I/O ports that
are consumed by devices on the LPC bus. For this reason the LPC bridge can
be configured only on bus 0.
The bus number can be specified using the following command line option
to bhyve(8): "-s <bus>:<slot>:<func>,<emul>[,<config>]"
Discussed with: grehan@
Reviewed by: jhb@
the IDENTIFY DEVICE and IDENTIFY PACKET DEVICE commands.
Also, provide an indication that a "D2H Register FIS" occurred during a SET FEATURES
command.
Approved by: grehan (co-mentor)
this could lead to the -n option effectively being ignored (in case
ac_line happened to be 0 aka SRC_AC), or other undefined behaviour.
PR: 169779
Submitted by: Alex Gonzalez <loox at e-shell.net>
Reviewed by: jhb
MFC after: 2 weeks
a dummy handler to make it interrupt an ioctl(2) or select(2).
This makes those short-lived ctld(8) zombies disappear.
Sponsored by: The FreeBSD Foundation
It doesn't change visible behaviour, as previously auth-group "default"
wasn't redefinable, so by default access was always denied.
Sponsored by: The FreeBSD Foundation
a dummy handler to make it interrupt an ioctl(2) or select(2).
This makes those short-lived iscsid(8) zombies disappear.
Sponsored by: The FreeBSD Foundation
- Similar to the hack for bootinfo32.c in userboot, define
_MACHINE_ELF_WANT_32BIT in the load_elf32 file handlers in userboot.
This allows userboot to load 32-bit kernels and modules.
- Copy the SMAP generation code out of bootinfo64.c and into its own
file so it can be shared with bootinfo32.c to pass an SMAP to the i386
kernel.
- Use uint32_t instead of u_long when aligning module metadata in
bootinfo32.c in userboot, as otherwise the metadata used 64-bit
alignment which corrupted the layout.
- Populate the basemem and extmem members of the bootinfo struct passed
to 32-bit kernels.
- Fix the 32-bit stack in userboot to start at the top of the stack
instead of the bottom so that there is room to grow before the
kernel switches to its own stack.
- Push a fake return address onto the 32-bit stack in addition to the
arguments normally passed to exec() in the loader. This return
address is needed to convince recover_bootinfo() in the 32-bit
locore code that it is being invoked from a "new" boot block.
- Add a routine to libvmmapi to set up a 32-bit flat mode register state
including a GDT and TSS that is able to start the i386 kernel and
update bhyveload to use it when booting an i386 kernel.
- Use the guest register state to determine the CPU's current instruction
mode (32-bit vs 64-bit) and paging mode (flat, 32-bit, PAE, or long
mode) in the instruction emulation code. Update the gla2gpa() routine
used when fetching instructions to handle flat mode, 32-bit paging, and
PAE paging in addition to long mode paging. Don't look for a REX
prefix when the CPU is in 32-bit mode, and use the detected mode to
enable the existing 32-bit mode code when decoding the mod r/m byte.
Reviewed by: grehan, neel
MFC after: 1 month
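To illustrate the alignment item above: on an LP64 host u_long is 8 bytes, so
rounding metadata sizes up to sizeof(u_long) pads more than a 32-bit consumer
expects. The helper name roundup_pow2 is hypothetical, and uint64_t is used
here only as a stand-in for u_long on such a host.

```c
#include <stddef.h>
#include <stdint.h>

/* Round 'x' up to a multiple of 'align', which must be a power of two. */
size_t
roundup_pow2(size_t x, size_t align)
{
	return ((x + align - 1) & ~(align - 1));
}
```

With a 10-byte record, 4-byte alignment yields 12 while 8-byte alignment
yields 16, so the two sides would disagree about where the next record starts.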
only if the specified option is NOT specified.' Bump version because
old config won't be able to cope with files* files that have this
construct in them.
did this only with the inner loop for the token parsing, and not the
outer loop which was understandable enough when the extra layers of
looping went away...
its proper location. Otherwise you could have 'file.c standard pci'
without an error. This construct isn't in our tree, and has no well
defined meaning.
performance by epsilon.
(Translation: eliminate bogus macros that hid 'returns' making it hard
to read and moved a block of code inline rather than at the end of the
function where it was effectively a 'gosub' kind of goto).
r261266:
Add a jail parameter, allow.kmem, which lets jailed processes access
/dev/kmem and related devices (i.e. grants PRIV_IO and PRIV_KMEM_WRITE).
This in conjunction with changing the drm driver's permission check from
PRIV_DRIVER to PRIV_KMEM_WRITE will allow a jailed Xorg server.
commit 6b569451b92c48ccf1768da32e7e89189e1aa253
Author: Brooks Davis <brooks@one-eyed-alien.net>
Date: Mon Jan 27 22:50:46 2014 +0000
Always install nmtree as mtree.
For compatibility, link mtree to nmtree.
X-MFC after: never
Sponsored by: DARPA, AFRL
commit c1acf022c533c5ae27e0cd556977eafe3f5959eb
Author: Brooks Davis <brooks@one-eyed-alien.net>
Date: Fri Jan 17 21:46:44 2014 +0000
Add an option WITHOUT_NCURSESW to suppress building and linking to
libncursesw. While wide character support is useful we'd like to
only need one ncurses library on embedded systems.
MFC after: 4 weeks
Sponsored by: DARPA, AFRL
the virtio backends.
- Add a new ioctl to export the count of pins on the I/O APIC from vmm
to the hypervisor.
- Use pins on the I/O APIC >= 16 for PCI interrupts leaving 0-15 for
ISA interrupts.
- Populate the MP Table with I/O interrupt entries for any PCI INTx
interrupts.
- Create a _PRT table under the PCI root bridge in ACPI to route any
PCI INTx interrupts appropriately.
- Track which INTx interrupts are in use per-slot so that functions
that share a slot attempt to distribute their INTx interrupts across
the four available pins.
- Implicitly mask INTx interrupts if either MSI or MSI-X is enabled
and when the INTx DIS bit is set in a function's PCI command register.
Either assert or deassert the associated I/O APIC pin when the
state of one of those conditions changes.
- Add INTx support to the virtio backends.
- Always advertise the MSI capability in the virtio backends.
Submitted by: neel (7)
Reviewed by: neel
MFC after: 2 weeks
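The implicit INTx masking rule above can be sketched as a small predicate.
This is only an illustration, not the bhyve code: struct intx_state and
intx_pin_asserted() are hypothetical names, and the only value taken from
the PCI specification is the Interrupt Disable bit (bit 10) of the command
register.

```c
#include <stdbool.h>
#include <stdint.h>

/* Interrupt Disable bit (bit 10) of the PCI command register. */
#define CMD_INTX_DISABLE 0x0400

/* Hypothetical per-function state, for illustration only. */
struct intx_state {
	bool     pending;       /* device model wants INTx asserted */
	bool     msi_enabled;   /* MSI capability enabled by the guest */
	bool     msix_enabled;  /* MSI-X capability enabled by the guest */
	uint16_t command;       /* PCI command register */
};

/*
 * Level that should be driven on the I/O APIC pin: asserted only when
 * the function has an interrupt pending and INTx is not masked.
 */
static bool
intx_pin_asserted(const struct intx_state *s)
{
	bool masked = s->msi_enabled || s->msix_enabled ||
	    (s->command & CMD_INTX_DISABLE) != 0;

	return (s->pending && !masked);
}
```

Re-evaluating the predicate whenever one of the inputs changes gives the
assert/deassert transitions described in the list.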
/dev/kmem and related devices (i.e. grants PRIV_IO and PRIV_KMEM_WRITE).
This in conjunction with changing the drm driver's permission check from
PRIV_DRIVER to PRIV_KMEM_WRITE will allow a jailed Xorg server.
Submitted by: netchild
MFC after: 1 week
which causes EINVAL to be returned from nanosleep(), which in turn causes
a loop in cron_sleep() and makes all cron jobs start about 30 seconds early
(which e.g. delays logfile rotation by newsyslog by 1 hour).
Use the simple and proven calculations from the kernel's timespecsub() instead.
MFC after: 3 days
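For reference, a minimal sketch of a timespecsub()-style subtraction.
ts_sub() is a hypothetical helper name, not the function used in cron; the
point is that the borrow keeps tv_nsec within [0, 999999999], which is the
range nanosleep() accepts without returning EINVAL.

```c
#include <time.h>

/*
 * Hypothetical helper modelled on the kernel's timespecsub():
 * computes a -= b and normalizes the result. Both inputs are
 * assumed to be normalized already.
 */
static void
ts_sub(struct timespec *a, const struct timespec *b)
{
	a->tv_sec -= b->tv_sec;
	a->tv_nsec -= b->tv_nsec;
	if (a->tv_nsec < 0) {
		/* Borrow one second so tv_nsec is non-negative again. */
		a->tv_sec--;
		a->tv_nsec += 1000000000L;
	}
}
```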
- Store the length of each read-only VPD value since not all values are
guaranteed to be ASCII values (though most are).
- Add a new pciio ioctl to fetch VPD for a single PCI device. The values
are returned as a list of variable length records, one for the device
name and each keyword.
- Add a new -V flag to pciconf's list mode which displays VPD data for
each device.
MFC after: 1 week
device name instead of just the selector.
- Accept an optional device argument to -l to restrict the output to only
listing details about a single device. This is mostly useful in
conjunction with other flags like -e or -c to allow a user to query
details about a single device.
MFC after: 1 week
in the one-line comment associated with the dumpdev setting was not present
for the case where the user deselects the dumpdev service (restoring
pre-r256348 behaviour).
MFC After: 3 days
if it was above 4GB. This was seen with CentOS 6.5 guests with
large RAM, since the block drivers are loaded late in the
boot sequence and end up allocating descriptor memory from
high addresses.
Reported by: Michael Dexter
MFC after: 3 days
only allows basic username/password config, and does not provide the
ability to set any of the other WPA options. Regardless, this is
generally sufficient to associate.
Perhaps in the future this could allow full configuring (e.g. being able
to set "anonymous identity", and perhaps some of the more obscure WPA
options), though perhaps that will better belong in bsdconfig when that
grows wlan config ability.
MFC after: 1 week
LPC devices. Among other things, the LPC serial ports now appear as
ACPI devices.
- Move the info for the top-level PCI bus into the PCI emulation code and
add ResourceProducer entries for the memory ranges decoded by the bus
for memory BARs.
- Add a framework to allow each PCI emulation driver to optionally write
an entry into the DSDT under the \_SB_.PCI0 namespace. The LPC driver
uses this to write a node for the LPC bus (\_SB_.PCI0.ISA).
- Add a linker set to allow any LPC devices to write entries into the
DSDT below the LPC node.
- Move the existing DSDT block for the RTC to the RTC driver.
- Add DSDT nodes for the AT PIC, the 8254 ISA timer, and the LPC UART
devices.
- Add a "SuperIO" device under the LPC node to claim "system resources"
along with a linker set to allow various drivers to add IO or memory
ranges that should be claimed as a system resource.
- Add system resource entries for the extended RTC IO range, the registers
used for ACPI power management, the ELCR, PCI interrupt routing register,
and post data register.
- Add various helper routines for generating DSDT entries.
Reviewed by: neel (earlier version)
hides the setjmp/longjmp semantics of VM enter/exit. vmx_enter_guest() is used
to enter guest context and vmx_exit_guest() is used to transition back into
host context.
Fix a longstanding race where a vcpu interrupt notification might be ignored
if it happens after vmx_inject_interrupts() but before host interrupts are
disabled in vmx_resume/vmx_launch. We now call vmx_inject_interrupts() with
host interrupts disabled to prevent this.
Suggested by: grehan@
i.e. the POSIX:5.6.1 st_ino field, which can be used to detect hard links
in the file system. This is also the default in mkisofs(8) and according to
its man page, no system only being able to cope with Rock Ridge version 1.10
is known to exist.
PR: 185138
Submitted by: Kurt Lidl
MFC after: 1 week
to SIGTERM when ACPI is enabled. Sending SIGTERM to the hypervisor when an
ACPI-aware OS is running will now trigger a soft-off allowing for a graceful
shutdown of the guest.
- Move constants for ACPI-related registers to acpi.h.
- Implement an SMI_CMD register with commands to enable and disable ACPI.
Currently the only change when ACPI is enabled is to enable the virtual
power button via SIGTERM.
- Implement a fixed-feature power button when ACPI is enabled by asserting
PWRBTN_STS in PM1_EVT when SIGTERM is received.
- Add support for EVFILT_SIGNAL events to mevent.
- Implement support for the ACPI system command interrupt (SCI) and assert
it when needed based on the values in PM1_EVT. Mark the SCI as active-low
and level triggered in the MADT and MP Table.
- Mark PCI interrupts in the MP Table as active-low in addition to level
triggered.
Reviewed by: neel
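The power button and SCI behaviour described above could be modelled roughly
as follows. This is a simplified sketch, not the bhyve implementation:
struct pm1_evt and both functions are hypothetical, and the status and
enable halves of PM1_EVT are assumed to use matching bit positions
(bit 8 for the power button, per the ACPI specification).

```c
#include <stdbool.h>
#include <stdint.h>

#define PM1_PWRBTN_STS 0x0100  /* power button status, bit 8 */
#define PM1_PWRBTN_EN  0x0100  /* power button enable, bit 8 */

/* Hypothetical model of the PM1 event register pair. */
struct pm1_evt {
	uint16_t status;  /* PM1_STS half */
	uint16_t enable;  /* PM1_EN half */
};

/* Called when SIGTERM is received: latch the power button event. */
static void
power_button_pressed(struct pm1_evt *r)
{
	r->status |= PM1_PWRBTN_STS;
}

/* The SCI is level triggered: asserted while an enabled event is pending. */
static bool
sci_asserted(const struct pm1_evt *r)
{
	return ((r->status & r->enable) != 0);
}
```

With this model the guest deasserts the SCI by clearing the status bit or
by disabling the event.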
- Implement the PM1_EVT and PM1_CNT registers required by ACPI.
The PM1_EVT register is mostly a dummy as bhyve doesn't support any
of the hardware-initiated events. The only bits of PM1_CNT that are
implemented are the sleep request bits (SLP_EN and SLP_TYP) which
request a graceful power off for S5. In particular, for S5, bhyve
exits with a non-zero value which terminates the loop in vmrun.sh.
- Emulate the Reset Control register at I/O port 0xcf9 and advertise
it as the reset register via ACPI.
- Advertise an _S5 package.
- Extend the in/out interface to allow an in/out handler to request
that the hypervisor trigger a reset or power-off.
- While here, note that all vCPUs in a guest support C1 ("hlt").
Reviewed by: neel (earlier version)
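A sketch of how a write to PM1_CNT might be checked for an S5 request.
The bit positions (SLP_TYP in bits 10-12, SLP_EN in bit 13) follow the ACPI
specification; is_s5_request() and its s5_typ parameter are hypothetical,
with s5_typ standing for whatever value the advertised _S5 package carries.

```c
#include <stdbool.h>
#include <stdint.h>

#define PM1_SLP_TYP_MASK  0x1c00  /* SLP_TYP field, bits 10-12 */
#define PM1_SLP_TYP_SHIFT 10
#define PM1_SLP_EN        0x2000  /* SLP_EN, bit 13 */

/*
 * Returns true when the value written to PM1_CNT requests the sleep
 * state whose type code is s5_typ (the value from the _S5 package).
 */
static bool
is_s5_request(uint16_t pm1_cnt, unsigned s5_typ)
{
	unsigned typ = (pm1_cnt & PM1_SLP_TYP_MASK) >> PM1_SLP_TYP_SHIFT;

	return ((pm1_cnt & PM1_SLP_EN) != 0 && typ == s5_typ);
}
```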
- Add a generic routine to trigger an LVT interrupt that supports both
fixed and NMI delivery modes.
- Add an ioctl and bhyvectl command to trigger local interrupts inside a
guest. In particular, a global NMI similar to that raised by SERR# or
PERR# can be simulated by asserting LINT1 on all vCPUs.
- Extend the LVT table in the vCPU local APIC to support CMCI.
- Flesh out the local APIC error reporting a bit to cache errors and
report them via ESR when ESR is written to. Add support for asserting
the error LVT when an error occurs. Raise illegal vector errors when
attempting to signal an invalid vector for an interrupt or when sending
an IPI.
- Ignore writes to reserved bits in LVT entries.
- Export table entries in the MADT and MP Table advertising the stock x86
config of LINT0 set to ExtInt and LINT1 wired to NMI.
Reviewed by: neel (earlier version)
state before the requested state transition. This guarantees that there is
exactly one ioctl() operating on a vcpu at any point in time and prevents
unintended state transitions.
More details available here:
http://lists.freebsd.org/pipermail/freebsd-virtualization/2013-December/001825.html
Reviewed by: grehan
Reported by: Markiyan Kushnir (markiyan.kushnir at gmail.com)
MFC after: 3 days
location of /etc/rc.local on the install media is more appropriate as it
knows serial vs. non-serial and can also do the change earlier (so that
even the initial Install dialog can benefit from the change).
MFC after: 3 days
installation to 3-4+ (depending on vdev type) vdevs would result in odd
error messages where the zpool `create' command appeared to repeat itself
(an artifact of printf when you supply too many arguments -- caused by
neglecting to properly quote the multi-word expansion of $*vdevs when
creating the pool(s)). Example error below (taken from bsdinstall_log):
DEBUG: zfs_create_boot: Creating root pool...
DEBUG: zfs_create_boot: zpool create -o altroot=/mnt -m none -f "zroot" \
ada0p3.nop ada1p3.nopzpool create ada2p3.nop "ada3p3.nop"
DEBUG: zfs_create_boot: retval=1 <output below>
cannot open 'ada1p3.nopzpool': no such GEOM provider
DEBUG: Running installation step: hostname
rm: /tmp/bsdinstall_etc/fstab: No such file or directory
The two lines are unrelated, and the rm is spurious. Let's add `-f' to
that rm(1) so it doesn't confuse us when debugging an install.
MFC after: 3 days
should not have used DIALOG_CANCEL because dialog.subr wasn't included to
define it. The effect of the error was that you could not cancel the
partition dialog. Discovered by checking bsdinstall_log for something else.
MFC after: 3 days
callers treat the MSI 'addr' and 'data' fields as opaque and also lets
bhyve implement multiple destination modes: physical, flat and clustered.
Submitted by: Tycho Nightingale (tycho.nightingale@pluribusnetworks.com)
Reviewed by: grehan@
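For illustration, a sketch of how the 'addr' and 'data' fields of an x86 MSI
message break down. The field layout follows the Intel architecture
definition; struct msi_info and msi_decode() are hypothetical names and this
is not the interface added by the change.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical decoded form of an x86 MSI message. */
struct msi_info {
	uint8_t dest;           /* destination ID, addr bits 19:12 */
	bool    logical;        /* destination mode, addr bit 2 */
	uint8_t vector;         /* data bits 7:0 */
	uint8_t delivery_mode;  /* data bits 10:8 */
};

/* Returns false if addr is outside the 0xFEExxxxx MSI window. */
static bool
msi_decode(uint64_t addr, uint64_t data, struct msi_info *out)
{
	if ((addr >> 20) != 0xfee)
		return (false);
	out->dest = (uint8_t)((addr >> 12) & 0xff);
	out->logical = ((addr >> 2) & 1) != 0;
	out->vector = (uint8_t)(data & 0xff);
	out->delivery_mode = (uint8_t)((data >> 8) & 0x7);
	return (true);
}
```

Whether a logical destination is interpreted as flat or clustered is decided
by the local APIC's destination format, not by the message itself.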
required in such a case). But don't prevent the user from pointing the
gun at his/her foot -- you can disable 4k alignment after enabling geli).
MFC after: 3 days
and subsequent headaches caused by multiple pools with the same name.
Specifically, blast away any labels on the designated swap partition.
Problem was when you install to a given layout *with* swap and then turn
around and re-install the same layout *without* swap (we weren't doing a
labelclear for the swap device, so would end up with an "UNAVAIL" status
zroot pool that may only exist in the pool cache).
MFC after: 3 days
+ For GPT, always provision zfs# partition after swap [for resizability]
+ For MBR, always use a boot pool to reliably place root vdevs at EOD
NB: Fixes edge-cases where MBR combination failed boot (e.g. swap-less)
+ Generalize boot pool logic so it can be used for any scheme (namely MBR)
+ Update existing comments and some whitespace fixes
+ Change some variable names to make reading/debugging the code easier
in zfs_create_boot() (namely prepend zroot_ or bootpool_ to property)
+ Because zroot vdevs are at EOD, no longer need to calculate partsize
(vdev consumes remaining space after allocating swap)
+ Optimize processing of disks -- no reason to loop over the disks 3-4
separate times when we can logically use a single loop to do everything
Discussed on: -stable
MFC after: 3 days
+ De-obfuscate debugging to show actual values
+ Change graid(8) syntax; s/destroy/delete/ [destroy is not valid syntax]
+ Log commands that were previously quiet
+ Added some new comments and updated some existing ones
+ Add missing local for `disk' used in zfs_create_boot()
+ Use $disks instead of multiply-expanding $* in zfs_create_boot()
+ Pedantically unset variable holding geli(8) passphrase after use
+ Pedantically add double-quotes around zpool names and zfs datasets
+ Fix quotation expansion for zpool_cache entries of loader.conf(5)
+ Some limited whitespace changes
MFC after: 3 days
https://communities.vmware.com/thread/107230
https://communities.vmware.com/docs/DOC-11677
Basically, ignore the ``function 62'' and ``function 63'' interpretations
of the left/right command key when we're in the lengthiest portion of the
installation (initiated by the `auto' module).
The net effect is that you can now (once you've started the installer from
the media) escape the VM without prematurely terminating the current action
due to a spurious escape sequence.
MFC after: 3 days
installation is cdrom. This enables bsdconfig(8) to make use
of the on-disc pkg(8) repository configuration, which fixes
package selection and installation from the dvd installer.
MFC after: 3 days
M-MFC-With: r259426
X-MFC-Before: -RC3
Sponsored by: The FreeBSD Foundation
===
DEBUG: Running installation step: services
local: Not in a function
/usr/libexec/bsdinstall/services: cannot create : Read-only file system
/usr/libexec/bsdinstall/services: /tmp/bsdinstall/etc/rc.conf.services: \
Permission denied
===
The `local: Not in a function' is obvious, and was introduced by myself in
SVN revision 256348.
The latter two are caused by the attempt to use "\" to continue the line
after using the ">>" redirect. This appears to attempt to write a file with
the name " " in the current directory and subsequently attempts to execute
the file that was originally intended for writing (which is not executable;
hence the `Permission denied'). That was introduced in SVN r228192 about
2 years ago, apparently unnoticed until I started going over the debug
outputs very carefully.
MFC after: 3 days
This will read the REPOS_DIR env/config setting (default is /etc/pkg
and /usr/local/etc/pkg/repos) and use the last enabled repository.
This can be changed in the environment using a comma-separated list,
or in /usr/local/etc/pkg.conf with JSON array syntax of:
REPOS_DIR: ["/etc/pkg", "/usr/local/etc/pkg/repos"]
Approved by: bapt
MFC after: 1 week
would either exit on assertion, or, if assertions are not enabled,
fail to authenticate the target.
MFC after: 2 days
Sponsored by: The FreeBSD Foundation
The An macro is used for authors while the Ar macro is used for arguments.
AFAIK mcast-addr and ifname are not authors.
PR: docs/184649
Submitted by: cnst++
MFC After: 3 days
after attempting to install to encrypted ZFS root (caused by a typo in a
variable name -- ZFSBOOT_BOOT_FSNAME -> ZFSBOOT_BOOTFS_NAME).
MFC after: 3 days
vcpu and destroy its thread context. Also modify the 'HLT' processing to ignore
pending interrupts in the IRR if interrupts have been disabled by the guest.
The interrupt cannot be injected into the guest in any case so resuming it
is futile.
With this change "halt" from a Linux guest works correctly.
Reviewed by: grehan@
Tested by: Tycho Nightingale (tycho.nightingale@pluribusnetworks.com)
+ Remove UNAME_P=$(...) from startup/misc -- already supplied by common.subr
+ Use f_getvar instead of $(eval echo \$$var) -- f_getvar is sub-shell free
+ Add `-e' and `-k var' options to f_eval_catch -- increasing use-cases
+ Use f_eval_catch to display errors on failure -- reducing duplicated code
+ Use f_eval_catch when we need output from a command -- improving debugging
+ Optimize f_isinteger of strings.subr for performance -- now sub-shell free
+ Improve error checking on pidfiles -- using f_eval_catch and f_isinteger
+ Use $var_to_set arg of f_ifconfig_{inet,netmask} -- eliminate sub-shells
+ Use f_sprintf instead of $(printf ...) -- consolidate sub-shells
+ Use $var_to_set arg of f_route_get_default -- eliminate sub-shells
+ Add f_count to replace $(set -- ...;echo $#) -- eliminate sub-shells
+ Add f_count_ifs to replace $(IFS=x;set -- ...;echo $#) -- no sub-shells
+ Replace var="$var${var:+ }..." in loops with var="$var ..." with a follow-
up var="${var# }" to trim leading whitespace -- optimize loops
+ Use $var_to_set arg of f_resolv_conf_nameservers -- eliminate sub-shells
+ Comments for the f_eval_catch function
+ Remove a duplicate `local ... desc ...' in f_device_get_all of device.subr
+ Use $var_to_set arg of f_device_capacity -- eliminate sub-shells
+ Whitespace fixes in f_dialog_init of dialog.subr
+ Optimize f_inet_atoi of media/tcpip.subr for performance -- sub-shell free
+ In several cases, send stderr to /dev/null -- clean up runtime execution
+ Change f_err of common.subr to go to program stderr not terminal stderr,
allowing redirection of output from functions that use f_err
+ Disable debugging when using f_getvar to get variable argument to
f_startup_rcconf_map_expand of startup/rcconf.subr
+ Use f_replace_all instead of $(echo ... | tr | sed) -- performance
+ Add a $var_to_set option to f_index_{file,menusel_{command,keyword}} of
common.subr -- centralize sub-shells
shifts into the sign bit. Instead use (1U << 31) which gets the
expected result.
This fix is not ideal as it assumes a 32 bit int, but does fix the issue
for most cases.
A similar change was made in OpenBSD.
Discussed with: -arch, rdivacky
Reviewed by: cperciva
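A minimal sketch of the idiom, with a hypothetical helper name. Using a
fixed-width unsigned type here is an assumption that goes one step beyond
the change described above, since it also avoids relying on a 32 bit int.

```c
#include <stdint.h>

/*
 * Hypothetical helper: return a mask with only bit n set, for n in
 * 0..31. UINT32_C(1) is unsigned, so setting bit 31 never shifts a
 * one into the sign bit of a signed int.
 */
static inline uint32_t
bit32(unsigned n)
{
	return (UINT32_C(1) << n);
}
```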
requires process descriptors to work and having PROCDESC in GENERIC
seems not enough, especially since we hope to have more and more consumers
in the base.
MFC after: 3 days
commit level triggered interrupts would work as long as the pin was not shared
among multiple interrupt sources.
The vlapic now keeps track of level triggered interrupts in the trigger mode
register and will forward the EOI for a level triggered interrupt to the
vioapic. The vioapic in turn uses the EOI to sample the level on the pin and
re-inject the vector if the pin is still asserted.
The vhpet is the first consumer of level triggered interrupts and advertises
that it can generate interrupts on pins 20 through 23 of the vioapic.
Discussed with: grehan@
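The EOI handling for level triggered interrupts can be pictured with a toy
model like the one below. Everything here is hypothetical (the struct, the
function name and the pin count of 24, chosen only because pins up to 23 are
mentioned above); it shows the decision made on EOI, not the actual
vlapic/vioapic code.

```c
#include <stdbool.h>

#define MODEL_NPINS 24  /* assumed pin count for this sketch */

/* Hypothetical, heavily simplified I/O APIC state. */
struct vioapic_model {
	bool asserted[MODEL_NPINS];  /* current level on each pin */
	int  vector[MODEL_NPINS];    /* vector programmed for each pin */
};

/*
 * Called when the local APIC forwards the EOI for a level triggered
 * vector: sample the pins routed to that vector and report whether
 * the vector has to be injected again.
 */
static bool
eoi_needs_reinject(const struct vioapic_model *v, int vec)
{
	for (int pin = 0; pin < MODEL_NPINS; pin++) {
		if (v->vector[pin] == vec && v->asserted[pin])
			return (true);
	}
	return (false);
}
```

Because every pin routed to the vector is sampled, a pin shared by several
sources keeps interrupting until all of them have deasserted it.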
would exceed the maximum size. This can be a difficult problem to diagnose
if one is, for instance, using -s with a fixed size in a script and the bsize
calculated for a filesystem image changes, necessitating a re-rounding of the
image size or a hand-setting of the bsize. Previously one would get a
cryptic message about how the size exceeded the maximum size, which normally
only happens if the contents of the image are larger than specified.
bhyveload: introduce the -c <device> parameter
to select a tty for output (or "stdio")
bhyve: allow the puc and lpc-com backends to
accept a tty in addition to "stdio"
When used in conjunction with the null-modem device,
nmdm(4), this allows attach/detach to the guest console
and multiple concurrent serial ports. kgdb on a serial
port is now functional.
Reviewed by: neel
Requested by: Almost everyone that has used bhyve
MFC after: 10.0
Table is 22 bits, with bit 31 being the interrupt-on-completion
bit.
OpenBSD and UEFI set this bit, resulting in large block i/o lengths
being sent to bhyve and coredumping the process. Fix by masking off
the relevant 22 bits when using the DBC field as a length.
Reviewed by: Zhixiang Yu
Discussed with: Tycho Nightingale (tycho.nightingale@pluribusnetworks.com)
MFC after: 10.0
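A sketch of the masking, using hypothetical macro and function names rather
than the ones in the AHCI emulation. Only the field layout stated above is
assumed: the length occupies the low 22 bits of DBC and bit 31 is the
interrupt-on-completion flag.

```c
#include <stdbool.h>
#include <stdint.h>

#define PRDT_DBC_MASK 0x003fffffu         /* low 22 bits: length field */
#define PRDT_DBC_IOC  (UINT32_C(1) << 31) /* interrupt on completion */

/* Extract only the length field, ignoring flag and reserved bits. */
static inline uint32_t
prdt_dbc_len(uint32_t dbc)
{
	return (dbc & PRDT_DBC_MASK);
}

/* True if the guest set the interrupt-on-completion bit. */
static inline bool
prdt_dbc_ioc(uint32_t dbc)
{
	return ((dbc & PRDT_DBC_IOC) != 0);
}
```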
actual value read by the guest from the device. The IOAPIC ID is now set to
zero in both MPtable/ACPI tables as well as in the ioapic device emulation.
Pointed out by: grehan@
bhyve supports a single timer block with 8 timers. The timers are all 32-bit
and capable of being operated in periodic mode. All timers support interrupt
delivery using MSI. Timers 0 and 1 also support legacy interrupt routing.
At the moment the timers are not connected to any ioapic pins but that will
be addressed in a subsequent commit.
This change is based on a patch from Tycho Nightingale (tycho.nightingale@pluribusnetworks.com).
to inject edge triggered legacy interrupts into the guest.
Start using the new API in device models that use edge triggered interrupts:
viz. the 8254 timer and the LPC/uart device emulation.
Submitted by: Tycho Nightingale (tycho.nightingale@pluribusnetworks.com)
`device.subr' framework (improving performance and reducing sub-shells). Next
improve the `device.subr' framework itself. Make use of the `flags' device
struct member for network interfaces to indicate if an interface is Active,
Wired Ethernet, or 802.11 Wireless. Functions have been added to make checks
against the `flags' bit-field quick and efficient. Last, add a function for
rescanning the network to update the device registers. Remove an unnecessary
local (ifn) while we're here (use already provided local `if').
nmtree.
The mtree output used by mergemaster in this case was clearly not meant for
computer consumption and an approach based on -f <file1> -f <file2> would
probably be a better idea, but this is a minimal change.
MFC after: 3 days
X-MFC-with: r258437
errors on re-entry for physical media). Also, while we're here, stop
ejecting the CDROM when we're done with it (but leave the functions for
later use so that we could perhaps -- from the installer standpoint -- use
it to eject the media after an install).
MFC after: 3 days