Increase the size of devd's client socket's send buffer from the
default (8k) to 128k. This prevents clients from getting
POLLHUPped during event storms. For example, during zpool creation,
the kernel emits a resource.fs.zfs.statechange event for every vdev
in the pool. A 128k buffer is large enough to hold the statechange
events for a pool with nearly 800 drives.
Reviewed by: ian, imp
Approved by: ken (mentor)
Sponsored by: Spectra Logic Corp
MFC after: 4 weeks
RFC 4402 specifies the implementation of the gss_pseudo_random()
function for the krb5 mechanism (and the C bindings therein).
The implementation uses a PRF+ function that concatenates the output
of individual krb5 pseudo-random operations produced with a counter
and seed. The original implementation of this function in Heimdal
incorrectly encoded the counter as a little-endian integer, but the
RFC specifies the counter encoding as big-endian. The implementation
initializes the counter to zero, so the first block of output (16 octets,
for the modern AES enctypes 17 and 18) is unchanged. (RFC 4402 specifies
that the counter should begin at 1, but both existing implementations
begin with zero and it looks like the standard will be re-issued, with
test vectors, to begin at zero.)
This is upstream's commit f85652af868e64811f2b32b815d4198e7f9017f6,
from 13 October, 2013:
% Fix krb5's gss_pseudo_random() (n is big-endian)
%
% The first enctype RFC3961 prf output length's bytes are correct because
% the little- and big-endian representations of unsigned zero are the
% same. The second block of output was wrong because the counter was not
% being encoded as big-endian.
%
% This change could break applications. But those applications would not
% have been interoperating with other implementations anyways (in
% particular: MIT's).
Approved by: hrs (mentor, src committer)
MFC after: 3 days
===
DEBUG: Running installation step: services
local: Not in a function
/usr/libexec/bsdinstall/services: cannot create : Read-only file system
/usr/libexec/bsdinstall/services: /tmp/bsdinstall/etc/rc.conf.services: \
Permission denied
===
The `local: Not in a function' is obvious, and was introduced by myself in
SVN revision 256348.
The latter two are caused by the attempt to use "\" to continue the line
after using the ">>" redirect. This appears to attempt to write a file with
the name " " in the current directory and subsequently attempts to execute
the file that was originally intended for writing (which is not executable;
hence the `Permission denied'). That was introduced in SVN r228192 about
2 years ago, apparently unnoticed until I started going over the debug
outputs very carefully.
MFC after: 3 days
This will read the REPOS_DIR env/config setting (default is /etc/pkg
and /usr/local/etc/pkg/repos) and use the last enabled repository.
This can be changed in the environment using a comma-separated list,
or in /usr/local/etc/pkg.conf with JSON array syntax of:
REPOS_DIR: ["/etc/pkg", "/usr/local/etc/pkg/repos"]
Approved by: bapt
MFC after: 1 week
Historically creation of device aliases created symbolic links using only
name of target device as a link target, not considering current directory.
Fix that by adding number of "../" chunks to the terget device name,
required to get out of the current directory to devfs root first.
MFC after: 1 month
receiving Zero Length Packets, ZLPs. See comment in code for more
information.
MFC after: 1 week
Reported by: Kohji Okuno <okuno.kohji@jp.panasonic.com>
When a da or ada device dissappears, outstanding IOs fail with
ENXIO, not EIO. The check for EIO was probably copied from Illumos,
where that is indeed the correct errno.
Without this change, pulling a busy drive from a zpool would usually
turn it into UNAVAIL, even though pulling an idle drive would turn
it into REMOVED. With this change, it is REMOVED every time.
Also, vdev_geom_io_intr shouldn't do zfs_post_remove, because that
results in devd getting two resource.fs.zfs.removed events. The
comment said that the event had to be sent directly instead of
through the async removal thread because "the DE engine is using
this information to discard prevoius I/O errors". However, the fact
that vdev_geom_io_intr was never actually sending the events until
now, and that vdev_geom_orphan never sent them at all, and that
vdev_geom_orphan usually gets called about 2 seconds after the
actual removal, means that FreeBSD's userland can cope with a late
event just fine.
Approved by: ken (mentor)
Sponsored by: Spectra Logic Corporation
MFC after: 4 weeks
In case of 4K allocation quantum that means for allocations up to 128K.
With growth of memory fragmentation these lists may grow to quite a large
sizes (tenths and hundreds of thousands items). Having in one list items
of different sizes in worst case may require full linear list traversal,
that may be very expensive. Having lists for items of single size means
that unless user specify some alignment or border requirements (that are
very rare cases) first item found on the list should satisfy the request.
While running SPEC NFS benchmark on top of ZFS on 24-core machine with
84GB RAM this change reduces CPU time spent in vmem_xalloc() from 8%
and lock congestion spinning around it from 20% to invisible levels.
And that all is by the cost of just 26 more pointers per vmem instance.
If at some point our kernel will start to actively use KVA allocations
with odd sizes above 128K, something may need to be done to bigger lists
also.
all the structures. While here, move a helper struct only used in
the kernel parser out of this header since it is not part of the MP
specification itself.
completely full, we'd not complete any of the mbufs due to the fence
post error (this creates a large leak). When this is fixed, we still
leak, but at a much smaller rate due to a race between ateintr and
atestart_locked as well as an asymmetry where atestart_locked is
called from elsewhere. Ensure that we free in-flight packets that
have completed there as well. Also remove needless check for NULL on
mb, checked earlier in the loop and simplify a redundant if.
MFC after: 3 days
This change is a proof of concept on how to easily integrate existing
tests from the tools/regression/ hierarchy into the /usr/tests/ test
suite and on how to adapt them to the new layout for src.
To achieve these goals, this change:
- Moves tests from tools/regression/bin/<tool>/ to bin/<tool>/tests/.
- Renames the previous regress.sh files to legacy_test.sh.
- Adds Makefiles to build and install the tests and all their supporting
data files into /usr/tests/bin/.
- Plugs the legacy_test test programs into the test suite using the new
TAP backend for Kyua (appearing in 0.8) so that the code of the test
programs does not have to change.
- Registers the new directories in the BSD.test.dist mtree file.
Reviewed by: freebsd-testing
Approved by: rpaulo (mentor)
This change fixes some cases where bsd.progs.mk would fail to handle
directories with SCRIPTS but no PROGS. In particular, "install" did
not handle such scripts nor dependent files when bsd.subdir.mk was
added to the mix.
This is "make tinderbox" clean.
Reviewed by: freebsd-testing
Approved by: rpaulo (mentor)
This file provides support to build test programs that comply with the
Test Anything Protocol. Its main goal is to support the painless
integration of existing tests from tools/regression/ into the Kyua-based
test suite.
Approved by: rpaulo (mentor)
When the guest is bringing up the APs in the x2APIC mode a write to the
ICR register will now trigger a return to userspace with an exitcode of
VM_EXITCODE_SPINUP_AP. This gets SMP guests working again with x2APIC.
Change the vlapic timer lock to be a spinlock because the vlapic can be
accessed from within a critical section (vm run loop) when guest is using
x2apic mode.
Reviewed by: grehan@
sys/cdefs.h. In particular, in case of COMPAT_43, param.h includes
sys/types.h, which includes sys/select.h, which includes
sys/_sigset.h. The _sigset.h customizes the provided definions based
on COMPAT_43, eliminating osigset_t if symbol is not defined. The
sys/proc.h is included after opt_compat.h and needs osigset_t.
Move opt_compat.h inclusion into the right place.
Sponsored by: The FreeBSD Foundation
the excess code in g_io_check(), bio_resid is also truncated by
g_io_deliver(). As result, bufdonebio() assigns truncated value to
the buffer b_resid field.
Use the residual bio_completed to calculate buffer b_resid from
b_bcount in bufdonebio(), instead of bio_resid, calculated from
bio_length in g_io_deliver().
The issue is seemingly caused by the code rearrange into g_io_check(),
which is not present in stable/10. The change still looks as the
useful change to have in 10 nevertheless.
Reported by: Stefan Hegnauer <stefan.hegnauer@gmx.ch>
Tested by: pho, Stefan Hegnauer <stefan.hegnauer@gmx.ch>
Sponsored by: The FreeBSD Foundation
MFC after: 1 week
the bio is unmapped, so we must map the bio pages into pbuf. This
works around the geom classes which do not follow the MAXPHYS limit on
the i/o size, since such classes do not know about unmapped bios
either.
Reported by: Paolo Pinto <paolo.pinto@netasq.com>
Sponsored by: The FreeBSD Foundation
MFC after: 1 week
Intel manual says: "If a transition is already in progress, transition to
a new value will subsequently take effect. Reads of IA32_PERF_CTL determine
the last targeted operating point." So seems it should be fine to just
trigger wanted transition and go. Linux does the same.
MFC after: 1 month
is actually sent by the remote node).
Otherwise it generated confusing "Negotiated protocol version 1" debug
messages when processing the second connection.
MFC after: 2 weeks
request back from the receive queue -- it might already be processed
by remote_recv_thread, which lead to crashes like below:
(primary) Unable to receive reply header: Connection reset by peer.
(primary) Unable to send request (Connection reset by peer):
WRITE(954662912, 131072).
(primary) Disconnected from kopusha:7772.
(primary) Increasing localcnt to 1.
(primary) Assertion failed: (old > 0), function refcnt_release,
file refcnt.h, line 62.
Taking the request back was not necessary (it would properly be
processed by the remote_recv_thread) and only complicated things.
MFC after: 2 weeks