MFV r258373:
4168 ztest assertion failure in dbuf_undirty
4169 verbatim import causes zdb to segfault
4170 zhack leaves pool in ACTIVE state
illumos/illumos-gate@7fdd916c47
Remove scary comment about this being a test key.
There has been no need to regenerate the signing key.
Early MFC as it is just a comment and needs to get into releng/10.0.
Approved by: bapt (mentor, implicit)
* Take into account that readlink() does not add a terminating '\0'.
* Do not match symlinks that are followed because of -H or -L. This is
explicitly documented in GNU find's info file and is like -type l.
* Fix matching symlinks in subdirectories when fts changes directories.
As before, symlinks of length PATH_MAX or more are not handled correctly.
(These can only be created on other operating systems.)
Also, avoid some readlink() calls on files that are obviously not symlinks
(because of fts(3) restrictions, not all of them).
PR: bin/185393
Submitted by: Ben Reser (parts, original version)
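
For illustration, the readlink() termination issue mentioned in the first item can be sketched as follows. This is a minimal sketch and not the find(1) code: the helper name read_link_target() and its truncation policy are assumptions made for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper (not taken from find(1)): read the target of the
 * symlink at `path` into `buf` and NUL-terminate it ourselves, because
 * readlink() returns a byte count and does not append '\0'.
 * Returns the target length, or -1 on error or if the target may have
 * been truncated to fit the buffer.
 */
ssize_t
read_link_target(const char *path, char *buf, size_t bufsize)
{
	ssize_t len;

	assert(path != NULL && buf != NULL);
	if (bufsize == 0)
		return (-1);

	/* Leave one byte free for the terminator we add below. */
	len = readlink(path, buf, bufsize - 1);
	if (len < 0)
		return (-1);

	/* A completely filled buffer may mean the target was cut short. */
	if ((size_t)len == bufsize - 1)
		return (-1);

	buf[len] = '\0';
	return (len);
}
```

A caller would pass a buffer at least one byte larger than the longest target it expects, and treat -1 as "not a usable symlink target".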
Fix a braino with r259730: we cannot currently use CFLAGS.gcc or
CFLAGS.clang in sys/conf/Makefile.arm, since the main kernel build does
not use <bsd.sys.mk>. So revert that particular change for now.
Pointy hat to: me
Noticed by: zbb
gcc: Fix optimization bug.
GCC-PR rtl-optimization/34628
* combine.c (try_combine): Stop and undo after the first combination
if an autoincrement side-effect on the first insn has effectively
been lost.
sbin/devd/devd.cc
Increase the size of devd's client socket's send buffer from the
default (8k) to 128k. This prevents clients from getting
POLLHUPped during event storms. For example, during zpool creation,
the kernel emits a resource.fs.zfs.statechange event for every vdev
in the pool. A 128k buffer is large enough to hold the statechange
events for a pool with nearly 800 drives.
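
A send buffer can be enlarged with the standard SO_SNDBUF socket option. The following is a rough sketch, not the devd.cc change itself: the helper name set_client_sndbuf() and the macro CLIENT_SNDBUF_SIZE are assumptions for this example, and the 128k value is taken from the description above.

```c
#include <sys/types.h>
#include <sys/socket.h>

/* Assumed name; the size is the 128k figure from the commit message. */
#define CLIENT_SNDBUF_SIZE	(128 * 1024)

/*
 * Hypothetical helper: request a larger send buffer on an existing
 * socket. Returns 0 on success, or -1 with errno set by setsockopt().
 * The kernel may round or cap the value it actually applies.
 */
int
set_client_sndbuf(int fd)
{
	int size = CLIENT_SNDBUF_SIZE;

	return (setsockopt(fd, SOL_SOCKET, SO_SNDBUF, &size, sizeof(size)));
}
```

The effective size can be read back with getsockopt() and the same option, which is useful because the system may adjust the requested value.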
MFC 259362
sbin/devd/devd.cc
Promoting the SIGINFO handler's log message from LOG_INFO to
LOG_NOTICE, and promoting the "Processing event ..." message from
LOG_DEBUG to LOG_INFO. Setting the logfile to LOG_NOTICE with this
change will have the same result as setting it to LOG_INFO without
this change. Setting it to LOG_INFO with this change will include
the useful "Processing event ..." messages that were previously at
LOG_DEBUG, without including useless messages like "Pushing table".
The intent of this change is that one can log "Processing event ..."
without logging "Pushing table" and related messages that are sent
for every event. The number of lines actually logged is reduced by
about 75% by making this change and setting syslog to LOG_INFO vs
setting syslog to LOG_DEBUG.
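
The effect of the level changes can be checked with the standard syslog(3) priority macros. This is a small sketch under the assumption that filtering behaves like a LOG_UPTO() mask; the helper passes_threshold() is hypothetical and is not part of devd.

```c
#include <syslog.h>

/*
 * Hypothetical helper: would a message logged at `msg_pri` pass a
 * threshold of `threshold`? Lower numeric priorities are more severe,
 * so a message passes when its bit is included in LOG_UPTO(threshold).
 */
int
passes_threshold(int msg_pri, int threshold)
{
	return ((LOG_MASK(msg_pri) & LOG_UPTO(threshold)) != 0);
}
```

With a threshold of LOG_INFO, a LOG_INFO message such as "Processing event ..." passes while a LOG_DEBUG message such as "Pushing table" does not. With a threshold of LOG_NOTICE, only the LOG_NOTICE message from the SIGINFO handler (and anything more severe) passes.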
etc/syslog.conf
Changing the recommended loglevel to notice instead of info.
sys/cddl/contrib/opensolaris/uts/common/fs/zfs/vdev_geom.c
When a da or ada device disappears, outstanding IOs fail with
ENXIO, not EIO. The check for EIO was probably copied from Illumos,
where that is indeed the correct errno.
Without this change, pulling a busy drive from a zpool would usually
turn it into UNAVAIL, even though pulling an idle drive would turn
it into REMOVED. With this change, it is REMOVED every time.
Also, vdev_geom_io_intr shouldn't do zfs_post_remove, because that
results in devd getting two resource.fs.zfs.removed events. The
comment said that the event had to be sent directly instead of
through the async removal thread because "the DE engine is using
this information to discard previous I/O errors". However, the fact
that vdev_geom_io_intr was never actually sending the events until
now, and that vdev_geom_orphan never sent them at all, and that
vdev_geom_orphan usually gets called about 2 seconds after the
actual removal, means that FreeBSD's userland can cope with a late
event just fine.
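
The errno distinction described above can be reduced to a one-line predicate. This is only an illustrative sketch and not the vdev_geom.c code; the function name io_error_means_device_gone() is an assumption.

```c
#include <errno.h>
#include <stdbool.h>

/*
 * Hypothetical predicate: decide whether a failed I/O indicates that the
 * device itself has gone away. Per the commit message, FreeBSD da/ada
 * devices fail outstanding I/O with ENXIO in that case, so a check for
 * EIO would not recognize the removal.
 */
bool
io_error_means_device_gone(int error)
{
	return (error == ENXIO);
}
```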
Use an RLOCK here instead of an RWLOCK - matching all the other calls
to lla_lookup().
This drastically reduces the very high lock contention when doing parallel
TCP throughput tests (> 1024 sockets) with IPv6.
MFC r260187:
lla_lookup() does modification only when LLE_CREATE is specified.
Thus we can use IF_AFDATA_RLOCK() instead of IF_AFDATA_LOCK() when doing
lla_lookup() without LLE_CREATE flag.
MFC r260217:
Add IF_AFDATA_WLOCK_ASSERT() in case lla_lookup() is called with
LLE_CREATE flag.
Prevent users from deactivating the last component of a mirror.
MFC r259929:
Add an ability to stop gmirror and clear its metadata in one command.
This fixes the problem of gmirror starting again right after being stopped.
The problem occurs when a gmirror component has a geom label of equal size.
E.g. gpt and gptid have the same size as the partition, diskid has the same
size as the entire disk. When gmirror's geom has been destroyed, glabel
creates its providers and this initiates a retaste.
Now "gmirror destroy" command is available. It destroys geom and also
erases gmirror's metadata.
PR: 184985
Add "resize" verb to gmirror(8) and such functionality to geom_mirror(4).
Now it is easy to expand the size of the mirror when all its components
are replaced. Also add g_resize method to geom_mirror class. It will write
updated metadata to new last sector, when parent provider is resized.
Split the last gcc-specific flags off into CFLAGS.gcc. This also
removes the need to use -Qunused-arguments for clang throughout the
tree.
MFC r260369:
Apply band-aid for 32-bit compat libs failures after r260334: put back
-Qunused-arguments for clang for now, until I can figure out a way to
make it unneeded in all scenarios. Sorry about the breakage.
Similar to r260020, only use -fms-extensions with gcc, for all other
modules which require this flag to compile. Use a GCC_MS_EXTENSIONS
variable, defined in kern.pre.mk, which can be used to easily supply the
flag (or not), depending on the compiler type.
MFC r260322:
In addition to r260102, also define GCC_MS_EXTENSIONS in bsd.sys.mk,
since kernel module builds do not use kern.pre.mk.
Add an OFW SPI compatible bus. Fix the spibus probe to return
BUS_PROBE_GENERIC and not BUS_PROBE_SPECIFIC (0) so the OFW SPI bus can
attach when enabled. Export the spibus devclass_t and driver_t
declarations.
Submitted by: ray
Approved by: adrian (mentor)
Implement automatic live resize support for GEOM MULTIPATH class.
In "manual" mode just automatically resize provider in any direction.
In "automatic" mode allow growth (with new metadata write); in case of
shrinking check if there is already valid metadata found at the new
location. This should allow easy transparent recovery if first resize
was done by mistake.
While there, unify metadata write code and fix minor memory leak.
Do not DELAY() for P-state transition unless we want to see the result.
Intel manual says: "If a transition is already in progress, transition to
a new value will subsequently take effect. Reads of IA32_PERF_CTL determine
the last targeted operating point." So seems it should be fine to just
trigger wanted transition and go. Linux does the same.
Update the description for pmap_remove_pages() to match the modern
times. Assert that the pmap passed to pmap_remove_pages() is only
active on current CPU.
Add a new sysctl / loader tunable kern.panic_reboot_wait_time which
defaults to PANIC_REBOOT_WAIT_TIME (a long-existing kernel config
setting). Use this now-variable value in place of the defined constant
to control how long the system waits after a panic before rebooting.
Fix several bugs in sctp_bindx():
* Set errno to EAFNOSUPPORT if an address is provided which is neither
AF_INET nor AF_INET6.
* Don't modify the arguments.
* Don't smash the stack when provided with a non-zero port.
* Handle the case correctly where the first address provided is
an IPv6 address.
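
The first two items can be illustrated with a read-only walk over a packed address array. This is a sketch under stated assumptions, not the libc sctp_bindx() implementation: check_bindx_families() is a hypothetical helper, and it assumes the array packs struct sockaddr_in and struct sockaddr_in6 entries back to back.

```c
#include <errno.h>
#include <stddef.h>
#include <string.h>
#include <sys/socket.h>
#include <netinet/in.h>

/*
 * Hypothetical validator: walk a packed array of `addrcnt` socket
 * addresses through a const pointer, so the caller's data is never
 * modified, and reject any family other than AF_INET or AF_INET6.
 * Returns 0 on success, or -1 with errno set to EAFNOSUPPORT.
 */
int
check_bindx_families(const void *addrs, int addrcnt)
{
	const char *p = addrs;
	sa_family_t family;
	int i;

	for (i = 0; i < addrcnt; i++) {
		/* Copy the family out; packed entries may be unaligned. */
		memcpy(&family, p + offsetof(struct sockaddr, sa_family),
		    sizeof(family));
		switch (family) {
		case AF_INET:
			p += sizeof(struct sockaddr_in);
			break;
		case AF_INET6:
			p += sizeof(struct sockaddr_in6);
			break;
		default:
			errno = EAFNOSUPPORT;
			return (-1);
		}
	}
	return (0);
}
```

Because the step size depends on each entry's family, an array whose first address is IPv6 is handled the same way as one that starts with IPv4.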
Apply vendor commits:
197e0ea Fix for TLS record tampering bug. (CVE-2013-4353).
3462896 For DTLS we might need to retransmit messages from the
previous session so keep a copy of write context in DTLS
retransmission buffers instead of replacing it after
sending CCS. (CVE-2013-6450).
ca98926 When deciding whether to use TLS 1.2 PRF and record hash
algorithms use the version number in the corresponding
SSL_METHOD structure instead of the SSL structure. The
SSL structure version is sometimes inaccurate.
Note: OpenSSL 1.0.2 and later effectively do this already.
(CVE-2013-6449).
Security: CVE-2013-4353
Security: CVE-2013-6449
Security: CVE-2013-6450
Bring back the old size of the kinfo_file structure to preserve ABI.
Keep only one uint64_t spare for further cap_rights_t expansion.
Add a comment clarifying that if the size of this structure changes,
a new sysctl MIB has to be allocated for it and the old structure has
to be returned by the old sysctl MIB.
Requested by: re
Don't check for fd limits in fdgrowtable_exp.
Callers already do that, and the additional check races with a process
decreasing its limits and can result in the table not growing at all, which
is currently not handled.
locking support for CAM
r256826:
Fix several target mode SIMs to not blindly clear ccb_h.flags field of
ATIO CCBs. Not all CCB flags there belong to them.
r256836:
Remove hard limit on number of BIOs handled with one ATA TRIM request.
r256843:
Merge CAM locking changes from the projects/camlock branch to radically
reduce lock congestion and improve SMP scalability of the SCSI/ATA stack,
preparing the ground for the upcoming GEOM direct dispatch support.
r256888:
Unconditionally acquire periph reference on CCB allocation failure.
r256895:
Fix memory and references leak due to unfreed path.
r256960:
Move CAM_UNQUEUED_INDEX setting to the last moment and under the periph lock.
This fixes race condition with cam_periph_ccbwait(), causing use-after-free.
r256975:
Minor (mostly cosmetic) addition to r256960.
r257054:
Some microoptimizations for da and ada drivers:
- Replace ordered_tag_count counter with single flag;
- From da remove outstanding_cmds counter, duplicating pending_ccbs list;
- From da_softc remove unused links field.
r257482:
Fix lock recursion, triggered by `smartctl -a /dev/adaX`.
r257501:
Make getenv_*() functions and respectively TUNABLE_*_FETCH() macros not
allocate memory and so not require sleepable environment. getenv() has
already used on-stack temporary storage, so just use it more rationally.
getenv_string() receives buffer as argument, so don't need another one.
r257914:
Some CAM locks polishing:
- Fix LOR and possible lock recursion when handling high-power commands.
Introduce new lock to protect left power quota and list of frozen devices.
- Correct locking around xpt periph creation.
- Remove the apparently never-used XPT_FLAG_OPEN xpt periph flag.
Again, Netflix assisted with testing the merge, but all of the credit goes
to Alexander and iX Systems.
Submitted by: mav
Sponsored by: iX Systems
r256603:
Introduce new function devstat_end_transaction_bio_bt(), adding new argument
to specify present time. Use this function to move binuptime() out of lock,
substantially reducing lock congestion when slow timecounter is used.
r256606:
Move g_io_deliver() out of the lock, as required for direct dispatch.
Move g_destroy_bio() out too to reduce lock scope even more.
r256607:
Fix passing uninitialized bio_resid argument to g_trace().
r256610:
Add unmapped I/O support to GEOM RAID.
r256830:
Restore BIO_UNMAPPED and BIO_TRANSIENT_MAPPING in biodone() when unmapping a
temporarily mapped buffer. That fixes a double unmap if biodone() is called
twice for the same BIO (but with different done methods).
r256880:
Merge GEOM direct dispatch changes from the projects/camlock branch.
When safety requirements are met, this avoids passing I/O requests to the
GEOM g_up/g_down threads by executing them directly in the caller's context.
That avoids CPU bottlenecks in the g_up/g_down threads and saves several
context switches per I/O.
r259247:
Fix bug introduced at r256607. We have to recalculate bp_resid here since
sizes of original and completed requests may differ due to end of media.
Testing of the stable/10 merge was done by Netflix, but all of the credit
goes to Alexander and iX Systems.
Submitted by: mav
Sponsored by: iX Systems
- Take BIO lock in biodone() only when there is no completion callback set
and so we should wake up thread waiting in biowait().
- Remove msleep() timeout from biowait(). It was added 11 years ago, when
there were no locks used, and it should not be needed any more.
Handle case when ACPI reports HPET device, but does not provide memory
resource for it. In such case take the address range from the HPET table.
This fixes hpet(4) driver attach on Asrock C2750D4I board.
Use relaxed (write-only) memory barriers when writing some of queue index
registers (for now on ISP2400+). We never read those registers back and
AFAIK their semantics do not require any immediate reaction on write.