The division added in r331732 meant that we wouldn't attempt a
background laundering until at least v_free_target - v_free_min clean
pages had been freed by the page daemon since the last laundering. If
the inactive queue is depleted but not completely empty (e.g., because
it contains busy pages), it can thus take a long time to meet this
threshold. Restore the pre-r331732 behaviour of using a non-zero
background laundering threshold if at least one inactive queue scan has
elapsed since the last attempt at background laundering.
Submitted by: tijl (original version)
Because I didn't see any reason not too.
I've been making some changes to the code and couldn't help but notice
that the i386 and am64 code was nearly identical.
MFC after: 17 days
If cpu_reset() is called on an AP and if it somehow fails to wake the
BSP, then it's better to attempt the reset on the AP than just sit there
spinning on an unusable and undebuggable system.
MFC after: 16 days
The processor is "parked" in a spin-loop already and that's sufficient
for the reset. There is nothing that stop_cpus() would add here, only
extra complexity and fragility.
The original processor does not need to enable interrupts now, in fact,
it must not do that.
MFC after: 2 weeks
single slab, but with alignment adjustment it won't. Again, when
there is only one item in a slab alignment can be ignored. See
previous revision of this file for more info.
PR: 227116
and zone has a large alignment. With alignment taken into
account uk_rsize will be greater than space in a slab. However,
since we have only one item per slab, it is always naturally
aligned.
Code that will panic before this change with 4k page:
z = uma_zcreate("test", 3984, NULL, NULL, NULL, NULL, 31, 0);
uma_zalloc(z, M_WAITOK);
A practical scenario to hit the panic is a machine with 56 CPUs
and 2 NUMA domains, which yields in zone size of 3984.
PR: 227116
MFC after: 2 weeks
per-cpu alloc and free of pages. The cache is filled with as few trips
to the phys allocator as possible by the use of a new
vm_phys_alloc_npages() function which allocates as many as N pages.
This code was originally by markj with the import function rewritten by
me.
Reviewed by: markj, kib
Tested by: pho
Sponsored by: Netflix, Dell/EMC Isilon
Differential Revision: https://reviews.freebsd.org/D14905
a cache of fully populated buckets. This will be used in a follow-on
commit.
The flag idea was originally from markj.
Reviewed by: markj, kib
Tested by: pho
Sponsored by: Netflix, Dell/EMC Isilon
This fixes a failure to resolve object file paths seen when buildkernel
(which sets MK_AUTO_OBJ=yes) and installkernel (which sets MK_AUTO_OBJ=no)
are run as separate steps. r329232 partially fixed this scenario by removing
MAKEOBJDIR, but it seems the AUTO_OBJ setting also needs to be on the same
page for the build and install steps.
Reviewed by: bdrewery
MFC after: 2 weeks
Differential Revision: https://reviews.freebsd.org/D14143
- The two types must be type-punnable for shared members of ifr_ifru.
This allows compatibility accessors to be shared.
- There must be no padding gap between ifr_name and ifr_ifru. This is
assumed in tcpdump's use of SIOCGIFFLAGS output which attempts to be
broadly portable. This is true for all current architectures, but very
large (256-bit) fat-pointers could violate this invariant.
Reviewed by: kib
Obtained from: CheriBSD
Sponsored by: DARPA, AFRL
Differential Revision: https://reviews.freebsd.org/D14910
We intend to remove support before FreeBSD 12 is branched.
Reviewed by: imp, emaste
MFC after: 3 days
Sponsored by: DARPA, AFRL
Differential Revision: https://reviews.freebsd.org/D14890
The common pratice in ethernet device drivers is to fall back to
ether_ioctl() to implement generic ioctls not implemented by the driver
and to fail if no handler exists.
Convert these drivers to follow that practice rather than calling
ether_ioctl() for specific cases.
vxge(4) aready had the default case, but it was only called on failure
to match.
Reviewed by: imp
Obtained from: CheriBSD
Sponsored by: DARPA, AFRL
Differential Revision: https://reviews.freebsd.org/D14895
- Make sure Giant is locked when calling PCI device methods.
Newbus currently requires this.
- Avoid unlocking Giant right before aquiring the sleepqueue lock.
This can save a task switch.
MFC after: 1 week
Sponsored by: Mellanox Technologies
Make sure the command completion handler is not called when the device is
in internal error state. This can easily trigger use after free situations.
MFC after: 3 days
Sponsored by: Mellanox Technologies
During health care IRQ resources will be reallocated.
Newbus requires that Giant is locked before accessing
these resources.
MFC after: 3 days
Sponsored by: Mellanox Technologies
Firmware dump collecting should be triggered in case firmware syndrome
with request for reset bit is set.
MFC after: 3 days
Submitted by: slavash@
Sponsored by: Mellanox Technologies
- Move the semaphore locking and unlocking to the same function.
- Flags are no longer needed if the reset and crdump will be done in the
same function.
MFC after: 3 days
Submitted by: slavash@
Sponsored by: Mellanox Technologies
- Move firmware dump prep and cleanup to init_one() and remove_one() so that
the init and cleanup will happen only upon driver reload.
- Add some prints to indicate firmware dump.
MFC after: 3 days
Submitted by: slavash@
Sponsored by: Mellanox Technologies
The old code checked for MLX5_CR_SPACE_DOMAIN which is irrelevant here.
However, if dev->vsec_addr would be 0, an access to wrong offset would
happen.
MFC after: 3 days
Submitted by: slavash@
Sponsored by: Mellanox Technologies
This fixes 32-bit compat (no ioctl command defintions are required
as struct ifreq is the same size). This is believed to be sufficent to
fully support ifconfig on 32-bit systems.
Reviewed by: kib
Obtained from: CheriBSD
MFC after: 1 week
Relnotes: yes
Sponsored by: DARPA, AFRL
Differential Revision: https://reviews.freebsd.org/D14900
These objects are defined earlier in the same file; an extern declaration
after definition is redundant.
Broken in r331766 (introduction of ocs_fc(4)).
Sponsored by: Dell EMC Isilon
The ocs_fc(4) driver supports the following hardware:
Emulex 16/8G FC GEN 5 HBAS
LPe15004 FC Host Bus Adapters
LPe160XX FC Host Bus Adapters
Emulex 32/16G FC GEN 6 HBAS
LPe3100X FC Host Bus Adapters
LPe3200X FC Host Bus Adapters
The driver supports target and initiator mode, and also supports FC-Tape.
Note that the driver only currently works on little endian platforms. It
is only included in the module build for amd64 and i386, and in GENERIC
on amd64 only.
Submitted by: Ram Kishore Vegesna <ram.vegesna@broadcom.com>
Reviewed by: mav
MFC after: 5 days
Relnotes: yes
Sponsored by: Broadcom
Differential Revision: https://reviews.freebsd.org/D11423
There are out of tree consumers of vm_map_min() and vm_map_max(), and
I believe there are consumers of vm_map_pmap(), although the later is
arguably less in the need of KBI-stable interface. For the consumers
benefit, make modules using this KPI not depended on the struct vm_map
layout.
Reviewed by: alc, markj
Sponsored by: The FreeBSD Foundation
MFC after: 1 week
Differential revision: https://reviews.freebsd.org/D14902
This makes it easier to catch lock accounting bugs, since the problem
is otherwise only detected upon a return to user mode (or never, for
kernel threads).
Reviewed by: cem
MFC after: 1 week
Sponsored by: Dell EMC Isilon
Differential Revision: https://reviews.freebsd.org/D14896
When de(4) was imported in 1997 the world was not ready for these ioctls.
In over 20 years that hasn't changed so it seems safe to assume their
time will never come.
Reviewed by: imp, jhb
Approved by: CheriBSD
MFC after: 1 week
Sponsored by: DARPA, AFRL
Differential Revision: https://reviews.freebsd.org/D14889
Rather than using the number of inactive queue scans as a metric for
how many clean pages are being freed by the page daemon, have the
page daemon keep a running counter of the number of pages it has freed,
and have the laundry thread use that when computing the background
laundering threshold.
Reviewed by: kib
Differential Revision: https://reviews.freebsd.org/D14884
Mostly this is a thin shim around existing code to integrate with enc_xform
and cryptosoft (+ cryptodev).
Expand the cryptodev buffer used to match that of Chacha20's native block
size as a performance enhancement for chacha20_xform_crypt_multi.
Add a new "interleave" allocation policy which stripes pages across
domains with a stride or width keeping contiguity within a multi-page
region.
Move the kernel to the dedicated numbered cpuset #2 making it possible
to assign kernel threads and memory policy separately from user. This
also eliminates the need for the complicated interrupt binding code.
Add a sysctl API for viewing and manipulating domainsets. Refactor some
of the cpuset_t manipulation code using the generic bitset type so that
it can be used for both. This probably belongs in a dedicated subr file.
Attempt to improve the include situation.
Reviewed by: kib
Discussed with: jhb (cpuset parts)
Tested by: pho (before review feedback)
Sponsored by: Netflix, Dell/EMC Isilon
Differential Revision: https://reviews.freebsd.org/D14839
9280 Assertion failure while running removal_with_ganging test with 4K devices
illumos/illumos-gate@243952c7ee
Reviewed by: George Wilson <george.wilson@delphix.com>
Reviewed by: John Kennedy <john.kennedy@delphix.com>
Approved by: Garrett D'Amore <garrett@damore.org>
Author: Matt Ahrens <Matt.Ahrens@delphix.com>
9188 increase size of dbuf cache to reduce indirect block decompression
illumos/illumos-gate@268bbb2a2f
With compressed ARC (6950) we use up to 25% of our CPU to decompress indirect
blocks, under a workload of random cached reads. To reduce this decompression
cost, we would like to increase the size of the dbuf cache so that more
indirect blocks can be stored uncompressed.
If we are caching entire large files of recordsize=8K, the indirect blocks
use 1/64th as much memory as the data blocks (assuming they have the same
compression ratio). We suggest making the dbuf cache be 1/32nd of all memory,
so that in this scenario we should be able to keep all the indirect blocks
decompressed in the dbuf cache. (We want it to be more than the 1/64th that
the indirect blocks would use because we need to cache other stuff in the
dbuf cache as well.)
In real world workloads, this won't help as dramatically as the example
above, but we think it's still worth it because the risk of decreasing
performance is low. The potential negative performance impact is that we
will be slightly reducing the size of the ARC (by ~3%).
Reviewed by: Dan Kimmel <dan.kimmel@delphix.com>
Reviewed by: Prashanth Sreenivasa <pks@delphix.com>
Reviewed by: Paul Dagnelie <pcd@delphix.com>
Reviewed by: Sanjay Nadkarni <sanjay.nadkarni@nexenta.com>
Reviewed by: Allan Jude <allanjude@freebsd.org>
Reviewed by: Igor Kozhukhov <igor@dilos.org>
Approved by: Garrett D'Amore <garrett@damore.org>
Author: George Wilson <george.wilson@delphix.com>
9321 arc_loan_compressed_buf() can increment arc_loaned_bytes by the wrong value
illumos/illumos-gate@9be12bd737
arc_loan_compressed_buf() increments arc_loaned_bytes by psize unconditionally
In the case of zfs_compressed_arc_enabled=0, when the buf is returned via
arc_return_buf(), if ARC_BUF_COMPRESSED(buf) is false, then arc_loaned_bytes
is decremented by lsize, not psize.
Switch to using arc_buf_size(buf), instead of psize, which will return
psize or lsize, depending on the result of ARC_BUF_COMPRESSED(buf).
Reviewed by: Matt Ahrens <matt@delphix.com>
Reviewed by: George Wilson <george.wilson@delphix.com>
Approved by: Garrett D'Amore <garrett@damore.org>
Author: Allan Jude <allanjude@freebsd.org>
9235 rename zpool_rewind_policy_t to zpool_load_policy_t
illumos/illumos-gate@5dafeea3eb
We want to be able to pass various settings during import/open of a pool,
which are not only related to rewind. Instead of adding a new policy and
duplicate a bunch of code, we should just rename rewind_policy to a more
generic term like load_policy.
For instance, we'd like to set spa->spa_import_flags from the nvlist,
rather from a flags parameter passed to spa_import as in some cases we want
those flags not only for the import case, but also for the open case. One
such flag could be ZFS_IMPORT_MISSING_LOG (as used in zdb) which would
allow zfs to open a pool when logs are missing.
Reviewed by: Matt Ahrens <matt@delphix.com>
Reviewed by: George Wilson <george.wilson@delphix.com>
Approved by: Robert Mustacchi <rm@joyent.com>
Author: Pavel Zakharov <pavel.zakharov@delphix.com>