Without clustering support we any way have only one group of permanently
active ports, but that gives us one more supported VMWare feature. ;)
Solaris' Comstar also reports it even when only one port is present.
This change fixes transient performance drops in some of my benchmarks,
vanishing as soon as I am trying to collect any stats from the scheduler.
It looks like reordered access to those variables sometimes caused loss of
IPI_PREEMPT, that delayed thread execution until some later interrupt.
MFC after: 3 days
spaces, rather than a split address, we actually can't check for being within
the kernel's address range. Instead, do what other backtraces do, and use
trapexit()/asttrapexit() as the stack sentinel.
MFC after: 3 weeks
In the fdt data we've written for ourselves, the interrupt properties
for GIC interrupts have just been a bare interrupt number. In standard
data that conforms to the published bindings, GIC interrupt properties
contain 3-tuples that describe the interrupt as shared vs private, the
interrupt number within the shared/private address space, and configuration
info such as level vs edge triggered.
The new gic_decode_fdt() function parses both types of data, based on the
#interrupt-cells property. Previously, each platform implemented a decode
routine and put a pointer to it into fdt_pic_table. Now they can just
list this function in their table instead if they use arm/gic.c.
path through the NFS clients' getpages functions.
Introduce vm_pager_free_nonreq(). This function can be used to eliminate
code that is duplicated in many getpages functions. Also, in contrast to
the code that currently appears in those getpages functions,
vm_pager_free_nonreq() avoids acquiring an exclusive object lock in one
case.
Reviewed by: kib
MFC after: 6 weeks
Sponsored by: EMC / Isilon Storage Division
The code had references to both intr_offset and intr_parent variable names
as referring to the parent interrupt node. The intr_parent variable
wasn't actually defined anywhere, but the only references to it were as
an argument to a macro that didn't use that argument in expansion, so
the undefined variable accidentally didn't cause an error.
The intr_parent name makes more sense in context, so change all occurrances
of intr_offset to intr_parent.
* vfs.zfs.vdev.async_write_active_min_dirty_percent
* vfs.zfs.vdev.async_write_active_max_dirty_percent
Added validation of min / max for ZFS sysctl
* vfs.zfs.dirty_data_max_percent
MFC after: 3 days
devq_openings counter lost its meaning after allocation queues has gone.
held counter is still meaningful, but problematic to update due to separate
locking of CCB allocation and queuing.
To fix that replace devq_openings counter with allocated counter. held is
now calculated on request as difference between number of allocated, queued
and active CCBs.
MFC after: 1 month
The changes should not modify the generated code.
The pager->pgo_putpages() method takes int flags as its fourth
argument, while vnode_pager_putpages() used boolean_t (which is
typedef'ed to int). The flags are from VM_PAGER_* namespace, while
vnode_pager_putpages() passed TRUE and OBJPC_SYNC to VOP_PUTPAGES(),
which both are numerically equal to VM_PAGER_PUT_SYNC.
Noted and reviewed by: alc (previous version)
Sponsored by: The FreeBSD Foundation
MFC after: 1 week
Prevent saturattion of the bus by constant polling which in
extreme cases can cause interface lockup. This makes FreeBSD
match similar case in the executive.
Correctly report hole at end of file.
When asked to find a hole, the DMU sees that there are no holes in the
object, and returns ESRCH. The ZPL interprets this as "no holes before
the end of the file", and therefore inserts the "virtual hole" at the
end of the file. Because DMU and ZPL have different ideas of where the
end of an object/file is, we will end up returning the end of file,
which is generally larger, instead of returning the end of object.
The fix is to handle the "virtual hole" in the DMU. If no hole is found,
the DMU will return a hole at the end of the file, rather than an error.
Illumos issue:
5139 SEEK_HOLE failed to report a hole at end of file
MFC after: 1 week
In zil_claim, don't issue warning if we get EBUSY (inconsistent) when
opening an objset, instead, ignore it silently.
Illumos issue:
5140 message about "%recv could not be opened" is printed when booting after crash
MFC after: 1 week
Add a new tunable/sysctl, vfs.zfs.free_max_blocks, which can be used to
limit how many blocks can be free'ed before a new transaction group is
created. The default is no limit (infinite), but we should probably have
a lower default, e.g. 100,000.
With this limit, we can guard against the case where ZFS could run out of
memory when destroying large numbers of blocks in a single transaction
group, as the entire DDT needs to be brought into memory.
Illumos issue:
5138 add tunable for maximum number of blocks freed in one txg
MFC after: 2 weeks
Enforce 4K as smallest indirect block size (previously the smallest
indirect block size was 1K but that was never used).
This makes some space estimates more accurate and uses less memory
for some data structures.
Illumos issue:
5141 zfs minimum indirect block size is 4K
MFC after: 2 weeks
It allows to bypass range checks between UNMAP and READ/WRITE commands,
which may introduce additional delays while waiting for UNMAP parameters.
READ and WRITE commands are always processed in safe order since their
range checks are almost free.
The current TSO limitation feature only takes the total number of
bytes in an mbuf chain into account and does not limit by the number
of mbufs in a chain. Some kinds of hardware is limited by two
factors. One is the fragment length and the second is the fragment
count. Both of these limits need to be taken into account when doing
TSO. Else some kinds of hardware might have to drop completely valid
mbuf chains because they cannot loaded into the given hardware's DMA
engine. The new way of doing TSO limitation has been made backwards
compatible as input from other FreeBSD developers and will use
defaults for values not set.
MFC after: 1 week
Sponsored by: Mellanox Technologies
Before this change UNMAP completely blocked other I/Os while running.
Now it blocks only colliding ones, slowing down others only due to ZFS
locks collisions.
Sponsored by: iXsystems, Inc.
many thanks for their continued support of FreeBSD.
While I'm there, also implement a new build knob, WITHOUT_HYPERV to
disable building and installing of the HyperV utilities when necessary.
The HyperV utilities are only built for i386 and amd64 targets.
This is a stable/10 candidate for inclusion with 10.1-RELEASE.
Submitted by: Wei Hu <weh microsoft com>
MFC after: 1 week
Huawei. It might appear as if the firmware is allocating memory blocks
according to the USB transfer size and if there is initially a lot of
data, like at the answering machine prompt, it simply dies without any
apparent reason. The simple workaround for this is to force a zero
length packet at hardware level after every 512 bytes of data. This
will force the other side to use smaller memory blocks aswell.
MFC after: 1 week
- Add invfo_rdwr() (for read and write), invfo_ioctl(), invfo_poll(),
and invfo_kqfilter() for use by file types that do not support the
respective operations. Home-grown versions of invfo_poll() were
universally broken (they returned an errno value, invfo_poll()
uses poll_no_poll() to return an appropriate event mask). Home-grown
ioctl routines also tended to return an incorrect errno (invfo_ioctl
returns ENOTTY).
- Use the invfo_*() functions instead of local versions for
unsupported file operations.
- Reorder fileops members to match the order in the structure definition
to make it easier to spot missing members.
- Add several missing methods to linuxfileops used by the OFED shim
layer: fo_write(), fo_truncate(), fo_kqfilter(), and fo_stat(). Most
of these used invfo_*(), but a dummy fo_stat() implementation was
added.
instead of breaking out of the loop and then immediately checking the loop
index so that if it was broken out of the proper value can be returned.
While here, use nitems().
nexus_alloc_resource() and don't set a bushandle.
nexus_activate_resource() will set a proper bushandle.
- Implement a proper nexus_release_resource().
- Fix ixppcib_activate_resource() to call rman_activate_resource()
before creating a mapping for the resource.
Tested by: jmg