Add a last-modified timestamp to each LRO entry and provide an interface
to flush all inactive entries. Drivers decide when to flush and what
the inactivity threshold should be.
Network drivers that process an rx queue to completion can enter a
livelock type situation when the rate at which packets are received
reaches equilibrium with the rate at which the rx thread is processing
them. When this happens the final LRO flush (normally when the rx
routine is done) does not occur. Pure ACKs and segments with total
payload < 64K can get stuck in an LRO entry. Symptoms are that TCP
tx-mostly connections' performance falls off a cliff during heavy,
unrelated rx on the interface.
Flushing only inactive LRO entries works better than any of these
alternates that I tried:
- don't LRO pure ACKs
- flush _all_ LRO entries periodically (every 'x' microseconds or every
'y' descriptors)
- stop rx processing in the driver periodically and schedule remaining
work for later.
Reviewed by: andre
UF_SYSTEM, UF_SPARSE, UF_OFFLINE, UF_REPARSE, UF_ARCHIVE, UF_READONLY,
and UF_HIDDEN.
Sort the file flags tmpfs supports alphabetically. tmpfs now
supports the same flags as UFS, with the exception of SF_SNAPSHOT.
Reported by: bdrewery, antoine
Sponsored by: Spectra Logic
- tom_uninit had to be reworked not to hold the adapter lock (a mutex)
around t4_deactivate_uld, which acquires the uld_list_lock.
- the ifc_match for the interface cloner that creates the tracer ifnet
had to be reworked as the kernel calls ifc_match with the global
if_cloners_mtx held.
allocations under low free-space conditions (-r254995), determine
that old block-preference search order used before -r249782 worked
a bit better. This change reverts to that block-preference search order.
MFC after: 2 weeks
I have 25TB Dell PERC 6 RAID5 array. When it becomes almost
full (10-20GB free), processes which write data to it start
eating 100% CPU and write speed drops below 1MB/sec (normally
to gives 400MB/sec). The revision at which it first became
apparent was http://svnweb.freebsd.org/changeset/base/249782.
The offending change reserved an area in each cylinder group to
store metadata. The new algorithm attempts to save this area for
metadata and allows its use for non-metadata only after all the
data areas have been exhausted. The size of the reserved area
defaults to half of minfree, so the filesystem reports full before
the data area can completely fill. However, in this report, the
filesystem has had minfree reduced to 1% thus forcing the metadata
area to be used for data. As the filesystem approached full, it
had only metadata areas left to allocate. The result was that
every block allocation had to scan summary data for 30,000 cylinder
groups before falling back to searching up to 30,000 metadata areas.
The fix is to give up on saving the metadata areas once the free
space reserve drops below 2%. The effect of this change is to use
the old algorithm of just accepting the first available block that
we find. Since most filesystems use the default 5% minfree, this
will have no effect on their operation. For those that want to push
to the limit, they will get their crappy block placements quickly.
Submitted by: Dmitry Sivachenko
Fix Tested by: Dmitry Sivachenko
PR: kern/181226
MFC after: 2 weeks
If we panic again shortly after boot (say, within 30 seconds), any core
dump we wrote out may be lost on reboot. In this situation, we really
want to keep that core file, as it may be the only way to have the issue
resolved. Call sync(8) after writing out the core file and running
crashinfo(8), in the hope that these will not be lost if we panic
again. sync(8) is only called in the case where there is a core dump
to be written out, so won't be called during normal boots.
Discovered by: Trying to debug an IPSEC panic
MFC after: 1 week
the passed vnode belongs to the same mount point (v_vfsp or also
known as v_mount in FreeBSD). This check prevents the code from
proceeding further on vnodes that do not belong to ZFS, for
instance, on UFS or NULLFS.
The recent change (merged as r254585) on upstream changes the
check of v_vfsp to instead check the znode's z_zfsvfs. On Illumos
this would work because when the vnode comes from lofs, the
VOP_REALVP() would give the right vnode, this is not true on
FreeBSD where our VOP_REALVP is a no-op, and as such tdvp is
not guaranteed to be a ZFS vnode, and will later trigger a
failed assertion when verifying the vnode.
This changeset modifies our local shims (zfs_freebsd_rename and
zfs_freebsd_link) to check if v_mount matches before proceeding
further.
Reported by: many
Diagnostic work by: avg
This removes the WITH_BSDCONFIG description alltogether, since this option
is removed.
At the same time, fix the WITHOUT_LIBCPLUSPLUS option that had gotten
inverted.
There are now six additional variables
weekly_status_security_enable
weekly_status_security_inline
weekly_status_security_output
monthly_status_security_enable
monthly_status_security_inline
monthly_status_security_output
alongside their existing daily counterparts. They all have the same
default values.
All other "daily_status_security_${scriptname}_${whatever}"
variables have been renamed to "security_status_${name}_${whatever}".
A compatibility shim has been introduced for the old variable names,
which we will be able to remove in 11.0-RELEASE.
"security_status_${name}_enable" is still a boolean but a new
"security_status_${name}_period" allows to define the period of
each script. The value is one of "daily" (the default for backward
compatibility), "weekly", "monthly" and "NO".
Note that when the security periodic scripts are run directly from
crontab(5) (as opposed to being called by daily or weekly periodic
scripts), they will run unless the test is explicitely disabled with a
"NO", either for in the "_enable" or the "_period" variable.
When the security output is not inlined, the mail subject has been
changed from "$host $arg run output" to "$host $arg $period run output".
For instance:
myfbsd security run output -> myfbsd security daily run output
I don't think this is considered as a stable API, but feel free to
correct me if I'm wrong.
Finally, I will rearrange periodic.conf(5) and default/periodic.conf
to put the security options in their own section. I left them in
place for this commit to make reviewing easier.
Reviewed by: hackers@
problems with the way MLEN, MHLEN, and struct mbuf are set up.
CTASSERT's are provided to detect such issues at compile time in the
future.
The #define MLEN and MHLEN calculation do not take actual compiler-
induced alignment and padding inside the complete struct mbuf into
account. Accordingly appropriate attention is required when changing
members of struct mbuf.
Ideally one would calculate MLEN as (MSIZE - sizeof(((struct mbuf *)0)->m_hdr)
but that doesn't work as the compiler refuses to operate on an as of
yet incomplete structure.
In particular ARM 32bit has more strict alignment requirements which
caused 4 bytes of padding between m_hdr and pkthdr in struct mbuf
because of the 64bit members in pkthdr. This wasn't picked up by MLEN
and MHLEN causing an overflow of the mbuf provided data storage by
overestimating its size.
I386 didn't show this problem because it handles unaligned access just
fine, albeit at a small performance penalty.
On 64bit architectures the struct mbuf layout is 64bit aligned in all
places.
Reported by: Thomas Skibo <ThomasSkibo-at-sbcglobal-dot-net>
Tested by: tuexen, ian, Thomas Skibo (extended patch)
Sponsored by: The FreeBSD Foundation
notify (enable spinup) required", instead of doing the normal
retries, poll for a change in status.
We will poll every half second for a minute for the status to
change.
Hitachi drives (and likely other SAS drives) return that ASC/ASCQ
when they are waiting to spin up. What it means is that they are
waiting for the SAS expander to send them the SAS
NOTIFY (ENABLE SPINUP) primitive.
That primitive is the mechanism expanders/enclosures use to
sequence drive spinup to avoid overloading power supplies.
Sponsored by: Spectra Logic
MFC after: 3 days
. Use integer literal constants instead of double literal constants.
* s_erff.c:
. Use integer literal constants instead of casting double literal
constants to float.
. Update the threshold values from those carried over from erf() to
values appropriate for float.
. New sets of polynomial coefficients for the rational approximations.
These coefficients have little, but positive, effect on the maximum
error in ULP in the four intervals, but do improve the overall
speed of execution.
. Remove redundant GET_FLOAT_WORD(ix,x) as hx already contained the
contents that is packed into ix.
. Update the mask that is used to zero-out lower-order bits in x in
the intervals [1.25, 2.857143] and [2.857143, 12]. In tests on
amd64, this change improves the maximum error in ULP from 6.27739
and 63.8095 to 3.16774 and 2.92095 on these intervals for erffc().
Reviewed by: bde
lib/libpam/modules/pam_passwdqc/Makefile:
Bump WARNS to 2.
contrib/pam_modules/pam_passwdqc/pam_passwdqc.c:
Bump _XOPEN_SOURCE and _XOPEN_VERSION from 500 to 600
so that vsnprint() is declared.
Use the two new union types (pam_conv_item_t and
pam_text_item_t) to resolve strict aliasing violations
caused by casts to comply with the pam_get_item() API taking
a "const void **" for all item types. Warnings are
generated for casts that create "type puns" (pointers of
conflicting sized types that are set to access the same
memory location) since these pointers may be used in ways
that violate C's strict aliasing rules. Casts to a new
type must be performed through a union in order to be
compliant, and access must be performed through only one
of the union's data types during the lifetime of the union
instance. Handle strict-aliasing warnings through pointer
assignments, which drastically simplifies this change.
Correct a CLANG "printf-like function with more arguments
than format" error.
Submitted by: gibbs
Sponsored by: Spectra Logic