The Tx interrupt is now kept disabled in the common case, only
enabled when the number of free descriptors in the queue falls
below a threshold. Transmitted frames are cleared from the VQ
before a subsequent transmit, or in the watchdog timer.
This was a very big performance improvement for an experimental
Netmap bhyve backend.
MFC after: 1 month
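
The Tx interrupt scheme described above can be sketched as a small
simulation. This is only an illustration under stated assumptions, not
the driver's code: the struct, the function names, the queue size and
the threshold value are all hypothetical.

```c
#include <assert.h>
#include <stdbool.h>

/* All names and values below are hypothetical, chosen for illustration. */
#define TXQ_SIZE        256   /* descriptors in the queue (assumed) */
#define TXQ_INTR_THRESH 32    /* enable the Tx interrupt below this */

struct txq {
	int  free;          /* descriptors available to the driver */
	int  completed;     /* descriptors the host has finished with */
	bool intr_enabled;  /* is the Tx interrupt currently enabled? */
};

static void
txq_init(struct txq *q)
{
	q->free = TXQ_SIZE;
	q->completed = 0;
	q->intr_enabled = false;
}

/* Simulated host side: mark n in-flight descriptors as transmitted. */
static void
txq_host_complete(struct txq *q, int n)
{
	assert(n >= 0 && n <= TXQ_SIZE - q->free - q->completed);
	q->completed += n;
}

/*
 * Clear transmitted frames from the queue. In the scheme described
 * above this runs before a transmit or from the watchdog timer, so no
 * interrupt is needed in the common case.
 */
static void
txq_reclaim(struct txq *q)
{
	q->free += q->completed;
	q->completed = 0;
	if (q->free >= TXQ_INTR_THRESH)
		q->intr_enabled = false;
}

/* Queue one frame; returns false if no descriptor is available. */
static bool
txq_transmit(struct txq *q)
{
	txq_reclaim(q);
	if (q->free == 0) {
		q->intr_enabled = true;
		return (false);
	}
	q->free--;
	/* Only ask for an interrupt once free descriptors run low. */
	if (q->free < TXQ_INTR_THRESH)
		q->intr_enabled = true;
	return (true);
}
```

In this sketch the interrupt stays off for as long as reclaiming on the
transmit path keeps the free count at or above the threshold.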
awareness.
* Introduce IP_BINDMULTI - indicating that it's okay to bind multiple
sockets on the same bind details.
Although the PCB code has been taught about this (see below) this patch
doesn't introduce the rest of the PCB changes necessary to distribute
lookups among multiple PCB entries in the global wildcard table.
* Introduce IP_RSS_LISTEN_BUCKET - placing a listen socket into the
given RSS bucket (and thus a single PCBGROUP hash.)
* Modify the PCB add path to be aware of IP_BINDMULTI:
+ Only allow further PCB entries to be added if the owner credentials
match and IP_BINDMULTI has been specified. I.e., only allow further
IP_BINDMULTI sockets to appear if the first bind() was IP_BINDMULTI.
* Teach the PCBGROUP code about IP_RSS_LISTEN_BUCKET marked PCB entries.
Instead of using the wildcard logic and hashing, these sockets are
simply placed into the PCBGROUP and _not_ in the wildcard hash.
* When doing a PCBGROUP lookup, also do a wildcard match.
This allows for an RSS bucket PCB entry to appear in a PCBGROUP
rather than having to exist in the wildcard list.
Tested:
* TCP IPv4 server testing with igb(4)
* TCP IPv4 server testing with ix(4)
TODO:
* The pcbgroup lookup code duplicates the wildcard and wildcard-PCB
logic. This could be refactored into a single function.
* This doesn't yet work for IPv6 (The PCBGROUP code in netinet6/ doesn't
yet know about this); nor does it yet fully work for UDP.
percentage of machines has a 16550. Disable it for pc98 since only a
tiny fraction of them have one. These changes save 293 bytes when
building with clang, but preserve the ability to build with serial if
you really want. We now have 92 bytes free (412 with the in-tree gcc).
Use reserved space for ZFS administrative commands.
We reserve 1/2^spa_slop_shift = 1/32 or 3.125% of pool space (or 32MB at
least) for system use. Most ZPL operations, e.g. write(2), creat(2), will
fail with ENOSPC if we fall below this.
Certain operations, e.g. file removal and most administrative actions,
are still permitted until half of the slop space is used. This allows
users to use these operations to free up space in the pool when the pool
is close to full but half of the slop space is still free.
A very restricted set of operations that free up space or change quota
are always permitted, regardless of the amount of free space.
MFC after: 2 weeks
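
The three tiers of the slop space reservation described above can be
sketched as follows. This is a minimal sketch, not the ZFS code: the
enum, the function names and the exact boundary comparisons are
assumptions; only the 1/32 fraction and the 32MB floor come from the
text.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define SLOP_SHIFT 5                      /* 1/2^5 = 1/32 of the pool */
#define SLOP_MIN   (32ULL * 1024 * 1024)  /* never reserve less than 32MB */

/* Hypothetical operation classes matching the three tiers in the text. */
enum op_class {
	OP_NORMAL,    /* e.g. write(2), creat(2) */
	OP_RESERVED,  /* e.g. file removal, most administrative actions */
	OP_ALWAYS     /* operations that free space or change quota */
};

/* Size of the reservation for a pool of the given size, in bytes. */
static uint64_t
slop_space(uint64_t pool_size)
{
	uint64_t slop = pool_size >> SLOP_SHIFT;

	return (slop > SLOP_MIN ? slop : SLOP_MIN);
}

/* Would an operation of class c be allowed with free_space bytes left? */
static bool
op_permitted(uint64_t pool_size, uint64_t free_space, enum op_class c)
{
	uint64_t slop = slop_space(pool_size);

	assert(free_space <= pool_size);
	switch (c) {
	case OP_NORMAL:
		/* Fails (ENOSPC in the real system) once we dip into slop. */
		return (free_space >= slop);
	case OP_RESERVED:
		/* May consume up to half of the slop space. */
		return (free_space >= slop / 2);
	case OP_ALWAYS:
		return (true);
	}
	return (false);
}
```

For a 1TB pool this reserves 32GB, so in this sketch ordinary writes
stop at 32GB free while removals keep working down to 16GB free.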
ptrace_set_pc() use the correct return to userspace using iret.
The signal return and PT_CONTINUE (which in fact uses the signal return
path) set the pcb flag already. setcontext(2) enforces iret return when
%rip is incorrect. Due to this, the change is redundant, but is made
to ensure that no path which modifies context forgets to set
PCB_FULL_IRET.
Inspired by: CVE-2014-4699
Reviewed by: jhb
Sponsored by: The FreeBSD Foundation
MFC after: 1 week
Refresh zpool list for each interval in order to produce fresh
output.
Illumos issue: 4966 zpool list iterator does not update output
MFC after: 2 weeks
amount of resident pages, in fact calculates the amount of installed
pte entries in the region. Resident pages which were not soft-faulted
yet are not counted.
Calculate the amount of resident pages by looking in the object chain
backing the region.
Add a knob to disable the residency calculation entirely. For large
sparse regions, both the previous and the updated algorithm run for too
long, while several introspection tools do not need the (advisory) RSS
value at all.
PR: kern/188911
Sponsored by: The FreeBSD Foundation
MFC after: 1 week
Explicitly mark file removal transactions as "presumed to result
in a net free of space" so they will not fail with ENOSPC.
Illumos issue: 4950 files sometimes can't be removed from a full
filesystem
MFC after: 2 weeks
statically linked into consumers (GDB and variants) in the base
system, and the shared library is no longer installed.
That also allows ports to use a modern version of readline.
PR: 162948
Reviewed by: emaste
Reviewed by: John Kennedy <john.kennedy@delphix.com>
Reviewed by: George Wilson <george.wilson@delphix.com>
Reviewed by: Christopher Siden <christopher.siden@delphix.com>
Reviewed by: Dan McDonald <danmcd@omniti.com>
Approved by: Garrett D'Amore <garrett@damore.org>
illumos/illumos-gate@7d46dc6ca6
4954 "zfs create" need not involve libshare if we are not sharing
4955 libshare's get_zfs_dataset need not sort the datasets
Reviewed by: George Wilson <george.wilson@delphix.com>
Reviewed by: Sebastien Roy <sebastien.roy@delphix.com>
Reviewed by: Dan McDonald <danmcd@omniti.com>
Reviewed by: Gordon Ross <gordon.ross@nexenta.com>
Approved by: Garrett D'Amore <garrett@damore.org>
illumos/illumos-gate@33cde0d0c2
Reviewed by: Adam Leventhal <adam.leventhal@delphix.com>
Reviewed by: George Wilson <george.wilson@delphix.com>
Reviewed by: Sebastien Roy <sebastien.roy@delphix.com>
Reviewed by: Boris Protopopov <bprotopopov@hotmail.com>
Approved by: Dan McDonald <danmcd@omniti.com>
illumos/illumos-gate@4bb7380495
The number of vm fictitious regions was limited to 8 by default, but
Xen will make heavy use of these kinds of regions in order to map
memory from foreign domains, so instead of increasing the default
number, change the implementation to use a red-black tree to track vm
fictitious ranges.
The public interface remains the same.
Sponsored by: Citrix Systems R&D
Reviewed by: kib, alc
Approved by: gibbs
vm/vm_phys.c:
- Replace the vm fictitious static array with a red-black tree.
- Use a rwlock instead of a mutex, since now we also need to take the
lock in vm_phys_fictitious_to_vm_page, and it can be shared.
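
A tree of non-overlapping ranges needs a comparator that treats a
lookup address as equal to the range containing it. The sketch below
shows one way such a comparator might look. It is an assumption, not
the code in vm/vm_phys.c: the struct and function names are
hypothetical, and bsearch(3) over a sorted array stands in for the
red-black tree so that the example stays short and self-contained.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical range record covering [start, end). */
struct fict_range {
	uint64_t start;
	uint64_t end;
};

/*
 * Order two ranges. Disjoint ranges sort by position; overlapping
 * ranges compare equal, which is what lets a one-byte key range match
 * the stored range that contains it.
 */
static int
fict_range_cmp(const void *a, const void *b)
{
	const struct fict_range *ra = a;
	const struct fict_range *rb = b;

	if (ra->end <= rb->start)
		return (-1);
	if (ra->start >= rb->end)
		return (1);
	return (0);
}

/*
 * Find the range containing pa, or NULL. The array must be sorted and
 * non-overlapping; a real implementation would walk the tree instead.
 */
static const struct fict_range *
fict_lookup(const struct fict_range *sorted, size_t n, uint64_t pa)
{
	struct fict_range key;

	assert(sorted != NULL && pa != UINT64_MAX);
	key.start = pa;
	key.end = pa + 1;
	return (bsearch(&key, sorted, n, sizeof(*sorted), fict_range_cmp));
}
```

Unlike the fixed 8-entry array, a lookup structured this way takes
logarithmic time in the number of ranges, which matters once many
foreign mappings are registered.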
With the move of atf-sh into /usr/libexec in r267181, some of the
tests in the integration_test program broke because they could not
execute atf-sh from the path any longer.
This slipped through because I do have a local atf installation in
my home directory that appears in my path, hence the tests could
still execute my own version.
Fix this by forcing /usr/libexec to appear at the beginning of the
path when attempting to execute atf-sh.
To make upgrading easy (and to avoid an unnecessary entry in UPDATING),
make integration_test depend on the Makefile so that a rebuild of the
shell script is triggered. This requires a hack in the *.test.mk files
to ensure the Makefile is not treated as a source to the generated
program. Ugly, I know, but I don't have a better way of doing this at
the moment. Will think of one once I address the TODO in the *.test.mk
files that suggests generalizing the file generation functionality.
PR: 191052
Reviewed by: Garrett Cooper
Although it is probably unwise to use this, POSIX is clear that leading
zeroes are permitted in positional parameters (and do not indicate octal).
Such positional parameters are checked for being unset and/or null
correctly, but their value is incorrectly expanded.
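
The POSIX point about leading zeroes can be illustrated with strtol(3).
This only shows the decimal-versus-octal distinction and is not a
reconstruction of the shell's actual bug or fix; both helper names are
hypothetical.

```c
#include <assert.h>
#include <stdlib.h>

/*
 * Hypothetical helper: parse the digits of a positional parameter
 * such as ${010}. Base 10 is given explicitly, so leading zeroes are
 * simply ignored.
 */
static long
positional_index(const char *digits)
{
	assert(digits != NULL);
	return (strtol(digits, NULL, 10));
}

/*
 * For contrast: base 0 lets strtol() pick the base from the prefix,
 * so a leading zero selects octal. This is the interpretation that
 * POSIX rules out for positional parameters.
 */
static long
positional_index_autobase(const char *digits)
{
	assert(digits != NULL);
	return (strtol(digits, NULL, 0));
}
```

With these helpers, "010" parses as 10 in the first function and as 8
in the second, so ${010} must refer to the tenth parameter.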
The test locale1.0 depends on locale support; it is meaningless without a
working LC_MESSAGES.
I added an OptionalObsoleteFiles.inc entry.
PR: 181151
Submitted by: Garrett Cooper (original version)
MFC after: 1 week
Sponsored by: EMC / Isilon Storage Division
While it is possible to create and write a file, modify its permissions,
etc. without ever doing a sync, it looks odd that one is required for
setting extended file attributes on ZFS. UFS does not sync there either.
Samba uses those extended attributes to store some of its data, and
doing so synchronously reduces file creation performance many times over
on systems without a SLOG device.
Reviewed by: delphij, jpaetzel, silence on fs@
MFC after: 2 weeks
Sponsored by: iXsystems, Inc.
That should make operation kinder to multi-initiator environments.
Without this, other initiators may find out that something bad happened
to their commands only via command timeout.