misconfiguration VM-exit.
An EPT misconfiguration is triggered when the processor encounters a PTE
that is writable but not readable (WR=10). On processors that require A/D
bit emulation PG_M and PG_A map to EPT_PG_WRITE and EPT_PG_READ respectively.
If the PTE is updated as in the following code snippet:
*pte |= PG_M;
*pte |= PG_A;
then it is possible for another processor to observe the PTE after the PG_M
(aka EPT_PG_WRITE) bit is set but before PG_A (aka EPT_PG_READ) bit is set.
This will trigger an EPT misconfiguration VM-exit on the other processor.
Reported by: rodrigc
Reviewed by: grehan
MFC after: 3 days
timecounter resolution is available, so ask for a 1 GHz frequency. It
won't actually get one that fast, but that'll get the fastest available
clock and use a divisor of 1 (probably 132 or 66mhz on current hardware).
Add support for AMD's nested page tables in pmap.c:
- Provide the correct bit mask for various bit fields in a PTE (e.g. valid bit)
for a pmap of type PT_RVI.
- Add a function 'pmap_type_guest(pmap)' that returns TRUE if the pmap is of
type PT_EPT or PT_RVI.
Add CPU_SET_ATOMIC_ACQ(num, cpuset):
This is used when activating a vcpu in the nested pmap. Using the 'acquire'
variant guarantees that the load of the 'pm_eptgen' will happen only after
the vcpu is activated in 'pm_active'.
Add defines for various AMD-specific MSRs.
Submitted by: Anish Gupta (akgupt3@gmail.com)
EWOULDBLOCK.
Do not print any message at errors. The errors are properly sent to upper
layers which should be able to deal with it, including printing the errors
when they need to.
The error message was quite annoying while scanning the i2c bus.
MFC after: 1 week
two.
nullfs and unionfs need to request suspension if underlying filesystem(s)
use it. Utilize mnt_kern_flag for this purpose.
This is a fixup for 273271.
No strong objections from: kib
Pointy hat to: mjg
MFC after: 2 weeks
This involves:
1. Have the loader pass the start and size of the .ctors section to the
kernel in 2 new metadata elements.
2. Have the linker backends look for and record the start and size of
the .ctors section in dynamically loaded modules.
3. Have the linker backends call the constructors as part of the final
work of initializing preloaded or dynamically loaded modules.
Note that LLVM appends the priority of the constructors to the name of
the .ctors section. Not so when compiling with GCC. The code currently
works for GCC and not for LLVM.
Submitted by: Dmitry Mikulin <dmitrym@juniper.net>
Obtained from: Juniper Networks, Inc.
vxlan creates a virtual LAN by encapsulating the inner Ethernet frame in
a UDP packet. This implementation is based on RFC7348.
Currently, the IPv6 support is not fully compliant with the specification:
we should be able to receive UPDv6 packets with a zero checksum, but we
need to support RFC6935 first. Patches for this should come soon.
Encapsulation protocols such as vxlan emphasize the need for the FreeBSD
network stack to support batching, GRO, and GSO. Each frame has to make
two trips through the network stack, and each frame will be at most MTU
sized. Performance suffers accordingly.
Some latest generation NICs have begun to support vxlan HW offloads that
we should also take advantage of. VIMAGE support should also be added soon.
Differential Revision: https://reviews.freebsd.org/D384
Reviewed by: gnn
Relnotes: yes
Before, the font was loaded and the window size recalculated, giving an
unusable terminal, even if the actual font didn't change.
Reported by: beeessdee@ruggedinbox.com
MFC after: 3 days
This fix a race where the threads waiting for the bus would wake up early
and still see bus as busy.
While here, give a better description to wmesg for the two use cases we
have (bus and io waiting).
MFC after: 1 week
bhyve doesn't emulate the MSRs needed to support this feature at this time.
Don't expose any model-specific RAS and performance monitoring features in
cpuid leaf 80000007H.
Emulate a few more MSRs for AMD: TSEG base address, TSEG address mask and
BIOS signature and P-state related MSRs.
This eliminates all the unimplemented MSRs accessed by Linux/x86_64 kernels
2.6.32, 3.10.0 and 3.17.0.
will now find the virtual to physical mapping for libkvm to use at
runtime. This makes PHYSADDR redundant, however keep it around to give
everyone a chance to update their libkvm.
MFC after: 1 week
physaddr. This should allow for a kernel where PHYSADDR and KERNPHYSADDR
are both undefined.
For now libkvm will use the old method of reading physaddr and kernaddr
to allow it to work with old kernels. This could be removed in the future
when enough time has passed.
Differential Revision: https://reviews.freebsd.org/D939
MFC after: 1 week
bus_new_pass() handler so it doesn't happen until BUS_PASS_CPU. This allows
the anatop driver to outbid the generic simplebus driver (which the FDT
data describes as compatible).
Some day when we handle power regulators, this driver may actually
become a functional simplebus and attach the regulators as children, as
described in the FDT data.
Increasingly, FDT data has the "simple-bus" compatible string on nodes
that have children, but we wouldn't consider them to be busses. If the
node lacks a ranges property then we will fail to attach successfully,
so fail to probe as well.
rather than u_char.
To try and play nice with the ABI, the u_char CPU ID values are clamped
at 254. The new fields now contain the full CPU ID, or -1 for no cpu.
Differential Revision: D955
Reviewed by: jhb, kib
Sponsored by: Norse Corp, Inc.
lose the contents of consecutive writes (that happens within two SD card
clock cycles).
This fixes the causes of instability during the SD card detection and
identification on Raspberry Pi (which happens at 400 kHz and so was much
more vulnerable to this issue).
Remove the previous workaround which clearly can't provide the same effect.
MFC after: 1 week
Relnotes: yes
to be present. Thsi creates a new per-SoC driver that handles probe and
setting/getting the gpio flags.
Differential Revision: https://reviews.freebsd.org/D943
Reviewed by: loos, rpaulo
MFC after: 1 week
The TI watchdog timer is present on BeagleBone's. Since 2014, U-Boot
has been booting the BeagleBone with the watchdog enabled. We need to
disable it on boot to avoid a spurious reset.
The timer isn't exactly precise, but it will do as a watchdog. This
is also a reflection of the watchdog(9) API.
In the future, we could handle interrupts, but the watchdog(9) API
needs to be a bit smarter before that can happen.
Differential Revision: https://reviews.freebsd.org/D965
Reviewed by: andrew
MFC after: 1 week
Relnotes: yes
Older binaries are still permitted to use these flags.
PR: 193961 (exp-run in ports)
Differential Revision: https://reviews.freebsd.org/D848
Reviewed by: kib
consistent with pmc_destroy_owner_descriptor(). Also be sure to destroy
PMCs if a process exits or execs without explicitly releasing them.
Reviewed by: bz, gnn
MFC after: 2 weeks
Sponsored by: EMC / Isilon Storage Division
Differential Revision: https://reviews.freebsd.org/D958
Previously, if no drivers attached at boot we would panic with
"vtbuf_fill_locked begin.tp_row 0 must be < screen height 0".
PR: 192248
Reviewed by: ray
MFC after: 3 days
Sponsored by: DARPA, AFRL
Differential Revision: https://reviews.freebsd.org/D954
Rename vmx_assym.s to vmx_assym.h to reflect that file's actual use
and update vmx_support.S's include to match. Add vmx_assym.h to the
SRCS to that it gets properly added to the dependency list. Add
vmx_support.S to SRCS as well, so it gets built and needs fewer
special-case goo. Remove now-redundant special-case goo. Finally,
vmx_genassym.o doesn't need to depend on a hand expanded ${_ILINKS}
explicitly, that's all taken care of by beforedepend.
With these items fixed, we no longer build vmm.ko every single time
through the modules on a KERNFAST build.
Sponsored by: Netflix
emulating a large number of MSRs.
Ignore writes to a couple more AMD-specific MSRs and return 0 on read.
This further reduces the unimplemented MSRs accessed by a Linux guest on boot.
entry and remove now-redunant dependencies. Add assym.s to
linux*_locore.s build, as it depends on it.
With this change, linux*.ko no longer builds every time through a
KERNFAST run.
Sponsored by: Netflix
vestige of a time when we needed to do that, but it is all handled by
beforedepend now. When we depend on the symlink, bmake will cause the
file to be rebuilt always.
With this change, dtrace.ko doesn't rebuild every time through a
KERNFAST run.
Sponsored by: Netfix
CPUID.80000001H:ECX.
Handle accesses to PerfCtrX and PerfEvtSelX MSRs by ignoring writes and
returning 0 on reads.
This further reduces the number of unimplemented MSRs hit by a Linux guest
during boot.
used to align partitions in gpart. We also try to align partitions by
stripe size when creating new media. Align these two concepts by
making fwsectors the same as the stripe size. Select a sensible number
of heads so we wind up with about 20 cylinders. This number was
selected to keep the rounding effects to a few percent while keeping
the number of cylinder groups low.
Sadly, it is not possible to make these numbers match the numbers used
by SD card readers. There apperas to be much variation between brands
so there's no one universal number. These numbers are also not aligned
to the stripe size, so some performance problems may still be present
when SD cards are created this way.
Also, these numbers will differ from the far less common SD to ATA
adapters, which present a different, but more uniform, set of numbers
that also happened to match the old defaults.
Nothing should change for current users. Any suboptimal performance
caused by misalignment will still be there. gpart will honor the
partitions that aren't on proper boudnaries, but editing the partition
tables may result in different alignments being used than before when
editing things natively.
Ideally, there'd be some way to override these values in the disk
subsystem by the user for the USB adapter use case where all "native"
notions of geometry disappear. This does not implement that.
Initialize CPUID.80000008H:ECX[7:0] with the number of logical processors in
the package. This fixes a panic during early boot in NetBSD 7.0 BETA.
Clear the Topology Extension feature bit from CPUID.80000001H:ECX since we
don't emulate leaves 0x8000001D and 0x8000001E. This fixes a divide by zero
panic in early boot in Centos 6.4.
Tested on an "AMD Opteron 6320" courtesy of Ben Perrault.
Reviewed by: grehan
in userland rename in-kernel getenv()/setenv() to kern_setenv()/kern_getenv().
This fixes a namespace collision with libc symbols.
Submitted by: kmacy
Tested by: make universe
sent incoming stream reset request was responded with failed
or denied.
Thanks to Peter Bostroem from Google for reporting the issue.
MFC after: 3 days
user nobody and/or setting group nogroup as owner of a file or directory.
Usually at the client side, if there is an username that is not in the
client's passwd database, some clients will send 'nobody@<your.dns.domain>'
in the wire and the NFSv4 server will treat it as an ERROR.
However, if you have a valid user nobody in your passwd database,
the NFSv4 server will treat it as a NFSERR_BADOWNER as its believes the
client doesn't has the username mapped.
Submitted by: Loic Blot <loic.blot@unix-experience.fr>
Reviewed by: rmacklem
Approved by: rmacklem
MFC after: 2 weeks
When processing async destroys ZFS would leak space every txg timeout
(5 seconds by default), if no writes occurred, until the pool is totally
full. At this point it would be unfixable without a pool recreation.
In addition if the machine was rebooted with the pool in this situation
would fail to import on boot, hanging indefinitely, as the import process
requires the ability to write data to the pool. Any attempts to query the
pool status during the hung import would not return as the import holds
the pool lock.
The only way to import such a pool would be to specify -o readonly=on
to the zpool import.
zdb -bb <pool> can be used to check for "deferred free" size which is where
this lost space will be counted.
MFC after: 3 days
Sponsored by: Multiplay
For compatibility, 'device windtunnel' is still supported, but one should use
'device adm1030' instead, and this has been updated in GENERIC and NOTES.
This fixes use-after-free, caused by geom_disk, completing same BIO twice
to save extra allocation, and getting BIO_DONE set after the first.
MFC after: 1 week
Mellanox hardware driver(s):
- Properly name an inclusion guard
- Fix compile warnings regarding unsigned enums
- Add two new sysctl nodes
- Remove all empty linux header files
- Make an error printout more verbose
- Use "mod_delayed_work()" instead of
cancelling and starting a timeout.
- Implement more Linux scatterlist
functions.
MFC after: 3 days
Sponsored by: Mellanox Technologies
read/write/poll/ioctl, call standard vnode filedescriptor fop. This
restores the special handling for terminals by calling the deadfs VOP,
instead of always returning ENXIO for destroyed devices or revoked
terminals.
Since destroyed (and not revoked) device would use devfs_specops VOP
vector, make dead_read/write/poll non-static and fill VOP table with
pointers to the functions, to instead of VOP_PANIC.
Noted and reviewed by: bde
Sponsored by: The FreeBSD Foundation
MFC after: 1 week
is interested in i/o state. Return POLLNVAL for invalid bits, similar
to poll_no_poll(). Note that POLLOUT must not be returned, since
POLLHUP is set.
Noted and reviewed by: bde
Sponsored by: The FreeBSD Foundation
MFC after: 1 week
have wildcards. This makes it possible for autofs(4) to avoid requesting
automountd(8) action on access to nonexistent nodes - unless wildcards
are actually used.
Note that this change breaks ABI for automountd(8).
Tested by: dhw@
MFC after: 1 month
Sponsored by: The FreeBSD Foundation
Summary:
If a user uses powerd, or doesn't want to use the cycles monitoring, they can
now suspend the monitoring thread.
While here, reorganize the added prototypes to match existing groupings.
Reviewers: nwhitehorn, #powerpc, rpaulo
Reviewed By: #powerpc, rpaulo
Differential Revision: https://reviews.freebsd.org/D944
X-MFC-with: r273009
MFC after: 3 weeks
He noticed issues setting this bit in SRRCTL after the queue was up,
so doing it from the sysctl handler isn't enough and may not actually
work correctly.
This commit doesn't remove the sysctl path or try to change its
behaviour. I'll talk with others about how to finish fixing that
before I tackle that.
PR: kern/194311
Submitted by: luigi
MFC after: 3 days
Sponsored by: Norse Corp, Inc