injected into the vcpu but the VM-entry interruption information field
already has the valid bit set.
Pointed out by: David Reed (david.reed@tidalscale.com)
The latest draft of SBC-3 tells: "A MAXIMUM UNMAP LBA COUNT field set to
a non-zero value indicates the maximum number of LBAs that may be unmapped
by an UNMAP command." To me it does not sound like that limit is set per
single descriptor, but rather per all command. And I have at least one
device that behaves exactly that way. This patch fixes the problem there.
MFC after: 1 week
Make comments match parameters
Add options for early printf so we get regression build testing on it.
Add preview of options for FDT support coming soon (I hope)
paper trail now, this patch is similar to one posted for one of the
preliminary versions of a new armv6 port. I took them and made them
more generic. Option not enabled by default since each board/port has
to provide its own eputc, and possibly do other things as well...
Use the bitwise negation instead of bogus boolean negation and move
the flag manipulation with the assignment.
Fix some grammatical errors introduced in the same change.
Reported by: bde
MFC after: 3 days
pim_input() properly.
While here, remove extra variable and incorrect condition
before m_pullup().
Reported by: Olivier Cochard-Labbé <olivier cochard.me>
Sponsored by: Nginx, Inc.
a timeout value of a single tick is given. With FreeBSD-10 and newer
the current system time is used as a starting point, and the minimum
callout time of a single tick will be guaranteed. This patch mostly
affect the DMA delay timeouts, which are typically in the range from
0.125 to 2ms.
MFC after: 1 week
via a software interrupt.
This is safe to do because the logical processor is already cognizant of the
NMI and further NMIs are blocked until the host's NMI handler executes "iret".
r260545 cleared the inode flags to fix corruption problems but
we still need to pass some EXT4 flags for the ext4 read-only
mode. None of these attributes has an equivalent in FreeBSD and
are uninteresting for the system utilities so they should be
innaccessible in ext2_getattrib().
Note: we also use EXT4_HUGE_FILE but we use it directly from the
dinode structure so it is not necessary to translate it,
Suggested by: bde
MFC after: 3 days
This is the first step needed to get the snapper codec working on those
machines.
The second step is to enable the corresponding I2S device and its clock.
Tested on machines where the snapper codec was already working, a G4 PowerBook
and a PowerMac9,1 with a Shasta based macio.
The PowerMac7,2/7,3 with a K2 based macio can now also play sound.
MFC after: 1 month
- Store the length of each read-only VPD value since not all values are
guaranteed to be ASCII values (though most are).
- Add a new pciio ioctl to fetch VPD for a single PCI device. The values
are returned as a list of variable length records, one for the device
name and each keyword.
- Add a new -V flag to pciconf's list mode which displays VPD data for
each device.
MFC after: 1 week
fence. Under system load, the CPU has been found to change the order
by which the stores are made visible. When the tag is made visible
before the other TLB values, other CPUs may use the invalid TLB values
and do bad things.
While here (i.e. not a fix) don't return errors from pmap_remove_vhpt()
to callers of pmap_remove_pte(). Those callers don't check the return
value and as such don't do what is needed to keep a consistent state.
More importantly, pmap_remove_vhpt() can't really have an error without
it indicating something unintended. Using KASSERT is therefore better.
PR: 182999, 183227
console, it calls the grab functions. These functions should turn off
the RX interrupts, and any others that interfere. This makes mountroot
prompt work again. If there's more generalized need other than
prompting, many of these routines should be expanded to do those new
things.
Should have been part of r260889, but waasn't due to command line typo.
Reviewed by: bde (with reservations)
console, it calls the grab functions. These functions should turn off
the RX interrupts, and any others that interfere. This makes mountroot
prompt work again. If there's more generalized need other than
prompting, many of these routines should be expanded to do those new
things.
Reviewed by: bde (with reservations)
* Set ia address/mask values BEFORE attaching to address lists.
Inet6 address assignment is not atomic, so the simplest way to
do this atomically is to fill in ia before attach.
* Validate irfa->ia_addr field before use (we permit ANY sockaddr in old code).
* Do some renamings:
in6_ifinit -> in6_notify_ifa (interaction with other subsystems is here)
in6_setup_ifa -> in6_broadcast_ifa (LLE/Multicast/DaD code)
in6_ifaddloop -> nd6_add_ifa_lle
in6_ifremloop -> nd6_rem_ifa_lle
* Split working with LLE and route announce code for last two.
Add temporary in6_newaddrmsg() function to mimic current rtsock behaviour.
* Call device SIOCSIFADDR handler IFF we're adding first address.
In IPv4 we have to call it on every address change since ARP record
is installed by arp_ifinit() which is called by given handler.
IPv6 stack, on the opposite is responsible to call nd6_add_ifa_lle() so
there is no reason to call SIOCSIFADDR often.
ia64 for the very first time. Only 9 years in the making...
Note that the vt/vga driver does not actually make sure there's
VGA hardware at the standard/legacy VGA I/O port and memory I/O
addresses. This can cause machine checks if the H/W does not have
a VGA controller.
of a syncache connection, copy it into the inp_flowid field.
Without this, an incoming TCP connection won't have an inp_flowid marked
until some data comes in, and this means that things like the per-CPU
TCP timer option will choose a different CPU for the timer work.
(It also means that if one grabbed the flowid via an ioctl from userland,
it won't be available until some data has been received.)
Sponsored by: Netflix, Inc.
callback providers. link_init_sdl() function can be used to
fill most of the parameters. Use caller stack instead of
allocation / freing memory for each request. Do not drop support
for extra-long (probably non-existing) link-layer protocols by
introducing link_alloc_sdl() (used by if_resolvemulti() callback)
and link_free_sdl() (used by caller).
Since this change breaks KBI, MFC requires slightly different approach
(link_init_sdl() auto-allocating buffer if necessary to handle cases
with unmodified if_resolvemulti() callers).
MFC after: 2 weeks
the Guest Interruptibility-state field. However, there isn't any way to
figure out which processors have this requirement.
So, inject a pending NMI only if NMI_BLOCKING, MOVSS_BLOCKING, STI_BLOCKING
are all clear. If any of these bits are set then enable "NMI window exiting"
and inject the NMI in the VM-exit handler.
1. Be consistent in the style of "act_delta" manipulations between the
inactive and active queue scans.
2. Explicitly compare to zero.
3. The deactivation of a page is based is based on its recent history
and not just the current call to vm_pageout_scan(). The variable
"act_delta" represents the current state of the page, and not its
history. Avoid possible confusion by not (ab)using "act_delta" for
the making the deactivation decision.
Submitted by: kib [1]
Reviewed by: kib [2,3]
with USB device detach when using character device handles. This also
includes LibUSB. It turns out that "usb_close()" cannot always get a
reference to clean up its USB transfers and such, if called during the
kernel USB device detach.
Analysis by: hselasky @
Reported by: Juergen Lock <nox@jelal.kn-bremen.de>
MFC after: 1 week
This is done to ensure that visited object IDs are always increasing.
Also, pass correct object ID to prefetch_dnode_metadata for
os_groupused_dnode.
Without this change we would hit an assert if traversal was paused on
a GROUPUSED object, which is unlikely but possible.
Apparently the same change was independently developed by Deplhix.
Reviewed by: Matthew Ahrens <mahrens@delphix.com>
MFC after: 10 days
Sponsored by: HybridCluster
This fires off a kqueue note (of type sendfile) to the configured kqfd
when the sendfile transaction has completed and the relevant memory
backing the transaction is no longer in use by this transaction.
This is analogous to SF_SYNC waiting for the mbufs to complete -
except now you don't have to wait.
Both SF_SYNC and SF_KQUEUE should work together, even if it
doesn't necessarily make any practical sense.
This is designed for use by applications which use backing cache/store
files (eg Varnish) or POSIX shared memory (not sure anything is using
it yet!) to know when a region of memory is free for re-use. Note
it doesn't mark the region as free overall - only free from this
transaction. The application developer still needs to track which
ranges are in the process of being recycled and wait until all
pending transactions are completed.
TODO:
* documentation, as always
Sponsored by: Netflix, Inc.
This is still under a bit of flux, as the final API hasn't been nailed
down. It's also unclear whether we should define the two new types in the
header or not - it may allow bad code to compile that shouldn't (ie,
since uintX's are defined, the developer may not include sys/types.h.)
Reviewed by: peter, imp, bde
Sponsored by: Netflix, Inc.
in the Guest Interruptibility-state VMCS field.
If we fail to do this then a subsequent VM-entry will fail because it is an
error to inject an NMI into the guest while "NMI Blocking" is turned on. This
is described in "Checks on Guest Non-Register State" in the Intel SDM.
Submitted by: David Reed (david.reed@tidalscale.com)
The bug was introduced in r256956 "Improve ZFS N-way mirror read
performance".
The code in vdev_mirror_dva_select erroneously considers already
tried DVAs for the next attempt. Thus, it is possible that a failing DVA
would be retried forever.
As a secondary effect, if the attempts fail with checksum error, then
checksum error reports are accumulated until the original request
ultimately fails or succeeds. But because retrying is going on indefinitely
the cheksum reports accumulation will effectively be a memory leak.
Reviewed by: gibbs
MFC after: 13 days
Sponsored by: HybridCluster
If we prematurely free the name buffer and it gets quickly recycled,
then zfs_rename may see data from another lookup or even unmapped memory
via cn_nameptr.
MFC after: 6 days
Sponsored by: HybridCluster
The bug was introduced in r256956 "Improve ZFS N-way mirror read
performance".
The code in vdev_mirror_dva_select erroneously considers already
tried DVAs for the next attempt. Thus, it is possible that a failing DVA
would be retried forever.
As a secondary effect, if the attempts fail with checksum error, then
checksum error reports are accumulated until the original request
ultimately fails or succeeds. But because retrying is going on indefinitely
the cheksum reports accumulation will effectively be a memory leak.
Reviewed by: gibbs
MFC after: 13 days
Sponsored by: HybridCluster
Otherwise we could run into the following deadlock.
A thread has a transaction open and assigned to a transaction group.
That would prevent the transaction group from be quiesced and synced.
The thread is blocked in getnewvnode_reserve waiting for a vnode to
a be reclaimed. vnlru thread is blocked trying to enter ZFS VOP because
a filesystem is suspended by an ongoing rollback or receive operation.
In its turn the operation is waiting for the current transaction group
to be synced.
zfs_zget is always used outside of active transactions, but zfs_mknode
is always used in a transaction context. Thus, we hoist
getnewvnode_reserve from zfs_mknode to its callers.
While there, assert that ZFS always calls getnewvnode while having
a vnode reserved.
Reported by: adrian
Tested by: adrian
MFC after: 17 days
Sponsored by: HybridCluster
Problem case:
Original lookup returns route with GW set, so gw points to
rte->rt_gateway.
After that we're changing dst and performing lookup another time.
Since fwd host is most probably directly reachable, resulting
rte does not contain rt_gateway, so gw is not set. Finally, we
end with packet transmitted to proper interface but wrong
link-layer address.
Found by: lstewart
Discussed with: ae,lstewart
MFC after: 2 weeks
Sponsored by: Yandex LLC
add separate rx/tx ring indexes
add ring specifier in nm_open device name
netmap.c, netmap_vale.c
more consistent errno numbers
netmap_generic.c
correctly handle failure in registering interfaces.
tools/tools/netmap/
massive cleanup of the example programs
(a lot of common code is now in netmap_user.h.)
nm_util.[ch] are going away soon.
pcap.c will also go when i commit the native netmap support for libpcap.
sure to clear the lower 12 bits. We're adding the translation
attributes to the physical address and non-zero bits in the first
12 bits would give us something unexpected, including invalid bit
values. Those trigger nested general protection faults.
We do not have to clear the region bits, because they are ignored
anyway, so we can replace an existing dep instruction with the one
we need.
This fixes GP faults for the swapper thread, as it's the only thread
that has a direct-mapped stack. Since the bug is in the nested TLB
fault handler, the frequency of hitting the GP is in the order of
hours/days under load.
non-modifier key press. This prevents so-called "ghost
keyboards" keeping modifier keys pressed while not
actually seen as a real keyboard.
MFC after: 2 weeks
can be initiated in the context of a vcpu thread or from the bhyve(8) control
process.
The first use of this functionality is to update the vlapic trigger-mode
register when the IOAPIC pin configuration is changed.
Prior to this change we would update the TMR in the virtual-APIC page at
the time of interrupt delivery. But this doesn't work with Posted Interrupts
because there is no way to program the EOI_exit_bitmap[] in the VMCS of
the target at the time of interrupt delivery.
Discussed with: grehan@
found in High Speed USB HUBs which translate from High Speed USB into
FULL or LOW speed USB. In some rare cases SPLIT transactions might get
lost, which might leave the TT in an unknown state. Whenever we detect
such an error try to issue either a clear TT buffer request, or if
that is not possible reset the whole TT.
MFC after: 1 week