far away from a ldr psuedo instruction. With this clang will place the
literal value here where it's close enough to be loaded.
MFC after: 1 week
Sponsored by: ABT Systems Ltd
In dsl_dataset_hold_obj() we used zap_contains(.., DS_FIELD_LARGE_BLOCKS)
to determine whether the extensible (zapifyed) dataset have large blocks.
The code expects the result be either 0 (found) or ENOENT (not found),
however reused the variable 'err' which later code expects to be 0.
Fix this by adopting similar code construct that is used later for
DS_FIELD_BOOKMARK_NAMES, which uses a temporary variable zaperr to catch
errors from zap_* rountines.
Reported by: Peter J. Creath (on FreeNAS; FreeNAS bug #6848)
Illumos issue: 5393 spurious failures from dsl_dataset_hold_obj()
Reviewed by: mahrens
Sponsored by: iXsystems, Inc.
X-MFC with: r274337
Some old libraries may be used even with newer binaries (specifically the
Nvidia driver libraries).
Differential Revision: https://reviews.freebsd.org/D1262
Reviewed by: kib
vp->v_vflag without taking vnode lock and without bypass. We do know
that vp is the lowest level in the stack, since the pointer is
obtained from the object' handle. Stale VV_TEXT flag read can only
happen if parallel execve() is performed and not yet activated the
image, since process takes reference for text mapping. In this case,
the execve() code manages the VV_TEXT flag on its own already.
It was observed that otherwise read-only sendfile(2) requires
exclusive vnode lock and contending on it on some loads for VV_TEXT
handling.
Reported by: glebius, scottl
Tested by: glebius, pho
Sponsored by: The FreeBSD Foundation
MFC after: 1 week
While without UNMAP support there is not much initiator can do about it,
the administrator still better be notified about the storage overflow.
MFC after: 2 weeks
Sponsored by: iXsystems, Inc.
Previously it was supported only for ZVOL-backed LUNs, but now should work
for file-backed LUNs too. Used value in this case is a space occupied by
the backing file, while available value is an available space on file
system. Pool thresholds are still not implemented in this case.
MFC after: 2 weeks
Sponsored by: iXsystems, Inc.
It is implemented for LUNs backed by ZVOLs in "dev" mode and files.
GEOM has no such API, so for LUNs backed by raw devices all LBAs will
be reported as mapped/unknown.
MFC after: 2 weeks
Sponsored by: iXsystems, Inc.
re-using a hardware propritary transfer descriptor, PTD, in USB host
mode. If the PTD's are recycled too quickly, it has been observed that
the hardware simply fails to schedule the requested job or resets
completely disconnecting all devices.
After recent optimizations this change is no longer blocked by CTL memory
consumption. Those limits are still not free, but much cheaper now.
MFC after: 1 week
Relnotes: yes
Sponsored by: iXsystems, Inc.
Abusing ability of major UAs cover minor ones we may not account UAs for
inactive ports. Allocate UAs storage for port and start accounting only
after some initiator from that port fetched its first POWER ON OCCURRED.
This reduces per-LUN CTL memory usage from >1MB to less then 100K.
MFC after: 1 month
The .text, .bss, and .data sections claimed 16-byte alignment, but were
only aligned to 8 by the linker script.
Discovered with strip(1) from elftoolchain, which performs validation
absent from the binutils strip(1).
Sponsored by: DARPA, AFRL
In configurations with many ports, like iSCSI, each LUN is typically
accessed only by limited subset of ports. Allocating that memory on
demand allows to reduce CTL memory usage from 5.3MB/LUN to 1.3MB/LUN.
MFC after: 1 month
Always include the card human readable name. We support ~270 cards and
at ~20 bytes each, this bloats things by only ~5k. Retain the
PCMCIA_CARD vs PCMCIA_CARD_D distinction, though, in case this is
intolerable.
WITNESS and INVARIANTS checking, which are known to have significant
performance impact on running systems. When benchmarking new features
this kernel should be used instead of the standard GENERIC.
This kernel configuration should never appear outside of the HEAD
of the FreeBSD tree.
interpreating NULLs as EOLs, but converting them to spaces.
SPC-4 does not tell that T10-based IDs should be NULL-terminated/padded.
And while it tells that it should include only ASCII chars (0x20-0x7F),
there are some USB sticks (SanDisk Ultra Fit), that have NULLs inside
the value. Treating NULLs as EOLs there made those LUN IDs non-unique.
MFC after: 1 week
from the FreeBSD network code. The flag is still kept around in the
"sys/mbuf.h" header file, but does no longer have any users. Instead
the "m_pkthdr.rsstype" field in the mbuf structure is now used to
decide the meaning of the "m_pkthdr.flowid" field. To modify the
"m_pkthdr.rsstype" field please use the existing "M_HASHTYPE_XXX"
macros as defined in the "sys/mbuf.h" header file.
This patch introduces new behaviour in the transmit direction.
Previously network drivers checked if "M_FLOWID" was set in "m_flags"
before using the "m_pkthdr.flowid" field. This check has now now been
replaced by checking if "M_HASHTYPE_GET(m)" is different from
"M_HASHTYPE_NONE". In the future more hashtypes will be added, for
example hashtypes for hardware dedicated flows.
"M_HASHTYPE_OPAQUE" indicates that the "m_pkthdr.flowid" value is
valid and has no particular type. This change removes the need for an
"if" statement in TCP transmit code checking for the presence of a
valid flowid value. The "if" statement mentioned above is now a direct
variable assignment which is then later checked by the respective
network drivers like before.
Additional notes:
- The SCTP code changes will be committed as a separate patch.
- Removal of the "M_FLOWID" flag will also be done separately.
- The FreeBSD version has been bumped.
MFC after: 1 month
Sponsored by: Mellanox Technologies
uma_reclaim(). Reclamation code must not see half-constructed or
destructed zones. Do this by bracing uma_zcreate() and uma_zdestroy()
into a shared-locked sx, and take the sx exclusively in uma_reclaim().
Usually zones are not created/destroyed during the system operation,
but tmpfs mounts do cause zone operations and exposed the bug.
Another solution could be to only expose a new keg on uma_kegs list
after the corresponding zone is fully constructed, and similar
treatment for the destruction. But it probably requires more risky
code rearrangement as well.
Reported and tested by: pho
Discussed with: avg
Sponsored by: The FreeBSD Foundation
MFC after: 2 weeks
- Add support for GEOM direct completion. Depending on the benchmark,
this tends to give a ~30% improvement w.r.t IOPs and BW.
- Remove an invariants check in the strategy routine. This assertion
is caught later on by an existing panic.
- Rename and resort various related functions to make more sense.
MFC after: 1 month
- Provide pru_ready function for TCP.
- Don't call tcp_output() from tcp_usr_send() if no ready data was put
into the socket buffer.
- In case of dropped connection don't try to m_freem() not ready data.
Sponsored by: Nginx, Inc.
Sponsored by: Netflix
Provide pru_ready for AF_LOCAL sockets. Local sockets sendsdata directly
to the receive buffer of the peer, thus pru_ready also works on the peer
socket.
Sponsored by: Netflix
Sponsored by: Nginx, Inc.
sending not ready data:
o Add new flag to pru_send() flags - PRUS_NOTREADY.
o Add new protocol method pru_ready().
Sponsored by: Nginx, Inc.
Sponsored by: Netflix
o Introduce a notion of "not ready" mbufs in socket buffers. These
mbufs are now being populated by some I/O in background and are
referenced outside. This forces following implications:
- An mbuf which is "not ready" can't be taken out of the buffer.
- An mbuf that is behind a "not ready" in the queue neither.
- If sockbet buffer is flushed, then "not ready" mbufs shouln't be
freed.
o In struct sockbuf the sb_cc field is split into sb_ccc and sb_acc.
The sb_ccc stands for ""claimed character count", or "committed
character count". And the sb_acc is "available character count".
Consumers of socket buffer API shouldn't already access them directly,
but use sbused() and sbavail() respectively.
o Not ready mbufs are marked with M_NOTREADY, and ready but blocked ones
with M_BLOCKED.
o New field sb_fnrdy points to the first not ready mbuf, to avoid linear
search.
o New function sbready() is provided to activate certain amount of mbufs
in a socket buffer.
A special note on SCTP:
SCTP has its own sockbufs. Unfortunately, FreeBSD stack doesn't yet
allow protocol specific sockbufs. Thus, SCTP does some hacks to make
itself compatible with FreeBSD: it manages sockbufs on its own, but keeps
sb_cc updated to inform the stack of amount of data in them. The new
notion of "not ready" data isn't supported by SCTP. Instead, only a
mechanical substitute is done: s/sb_cc/sb_ccc/.
A proper solution would be to take away struct sockbuf from struct
socket and allow protocols to implement their own socket buffers, like
SCTP already does. This was discussed with rrs@.
Sponsored by: Netflix
Sponsored by: Nginx, Inc.
Summary:
Revert the initial FBT-with-KDB changes for trap_subr*.S, and instead use the
db_trap filter function to handle dtrace trap filtering. With this, the MMU is
enabled by the support code, simplifying the codepath altogether.
Test Plan: Tested on my G4 PowerBook
Reviewers: #powerpc, nwhitehorn
Reviewed By: nwhitehorn
Differential Revision: https://reviews.freebsd.org/D1207
MFC after: 3 weeks
crowded as we now are at about 70k. Bump the limit to 1MB instead
which is still quite a reasonable limit and allows for future growth
of this file and possible future expansion to additional data.
MFC After: 2 weeks
recursion on mutex initialization.
The only places where the recursive acquire is performed are read and
write filters, since knlist, which uses the pipe pair mutex as lock,
is locked when filter is called.
The recursion was added in r93296, and consistent locking for
kn_fop->f_event() introduced in r133741.
Tested by: pho
Sponsored by: The FreeBSD Foundation
MFC after: 1 month
Call to the driver-specific ioctl used to process ioctl number
that will lead to the out-of-bounds access to the ioctl handler
array.
PR: 193367
Approved by: kib
MFC after: 1 week
Bump the default from 16 to 32, to accommodate kernel flamegraphs.
Bump the maximum from 32 to 128, to accommodate deep user stacks.
Reviewed by: gnn
MFC after: 1 month
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D1203
This allows one to make a kernel module to tune the
number of queues before the driver loads.
This is needed so that a module at SI_SUB_CPU can set
tunables for these drivers to take. Otherwise getenv
is called too early by the TUNABLE macros.
Reviewed by: smh
Phabric: https://reviews.freebsd.org/D1149
xform_ipip was used as fallback with low priority for IPIP
encapsulated packets that were decrypted. In some cases
it can decapsulate packets, that it shouldn't. This leads to situations,
when wrong configurations are magically working. Also it can propagate
wrong ingress interface and this can break security.
Now we redesigned the IPSEC code and IPIP encapsulation is called directly
from ipsec_output, and decapsulation is done in the ipsec_input with m_striphdr.
Differential Revision: https://reviews.freebsd.org/D1220
MFC after: 1 month
Sponsored by: Yandex LLC