interpreting NULLs as EOLs, but converting them to spaces.
SPC-4 does not say that T10-based IDs should be NULL-terminated/padded.
And while it says that they should include only ASCII chars (0x20-0x7F),
there are some USB sticks (SanDisk Ultra Fit), that have NULLs inside
the value. Treating NULLs as EOLs there made those LUN IDs non-unique.
MFC after: 1 week
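A minimal sketch of the conversion described above, assuming a fixed-length
identifier field. The helper name t10_id_copy() is hypothetical and is not
taken from the commit; it only shows embedded NULLs becoming spaces instead
of ending the value.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical helper: copy a fixed-length T10 identifier, replacing
 * embedded NUL bytes with spaces instead of treating the first NUL as
 * the end of the value.  dst must have room for len + 1 bytes.
 */
static void
t10_id_copy(char *dst, const unsigned char *src, size_t len)
{
	size_t i;

	assert(dst != NULL && src != NULL);
	for (i = 0; i < len; i++)
		dst[i] = (src[i] == '\0') ? ' ' : (char)src[i];
	dst[len] = '\0';
}
```

With this approach two IDs that differ only after an embedded NULL stay
distinct, which is the property the message is concerned with.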
from the FreeBSD network code. The flag is still kept around in the
"sys/mbuf.h" header file, but no longer has any users. Instead
the "m_pkthdr.rsstype" field in the mbuf structure is now used to
decide the meaning of the "m_pkthdr.flowid" field. To modify the
"m_pkthdr.rsstype" field please use the existing "M_HASHTYPE_XXX"
macros as defined in the "sys/mbuf.h" header file.
This patch introduces new behaviour in the transmit direction.
Previously network drivers checked if "M_FLOWID" was set in "m_flags"
before using the "m_pkthdr.flowid" field. This check has now been
replaced by checking if "M_HASHTYPE_GET(m)" is different from
"M_HASHTYPE_NONE". In the future more hashtypes will be added, for
example hashtypes for hardware dedicated flows.
"M_HASHTYPE_OPAQUE" indicates that the "m_pkthdr.flowid" value is
valid and has no particular type. This change removes the need for an
"if" statement in TCP transmit code checking for the presence of a
valid flowid value. The "if" statement mentioned above is now a direct
variable assignment which is then later checked by the respective
network drivers like before.
Additional notes:
- The SCTP code changes will be committed as a separate patch.
- Removal of the "M_FLOWID" flag will also be done separately.
- The FreeBSD version has been bumped.
MFC after: 1 month
Sponsored by: Mellanox Technologies
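A user-space sketch of the transmit-side check described above. The
structures and macro values below are simplified stand-ins, not the real
"sys/mbuf.h" definitions, and select_txq() is a hypothetical driver
helper used only for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Simplified stand-ins for the kernel definitions; values are illustrative. */
#define	M_HASHTYPE_NONE		0
#define	M_HASHTYPE_OPAQUE	255

struct pkthdr_sketch {
	uint32_t flowid;
	uint8_t  rsstype;
};

struct mbuf_sketch {
	struct pkthdr_sketch m_pkthdr;
};

#define	M_HASHTYPE_GET(m)	((m)->m_pkthdr.rsstype)
#define	M_HASHTYPE_SET(m, v)	((m)->m_pkthdr.rsstype = (v))

/*
 * Hypothetical driver-side queue selection: the flowid is trusted only
 * when the hash type is something other than M_HASHTYPE_NONE.
 */
static unsigned int
select_txq(const struct mbuf_sketch *m, unsigned int nqueues)
{
	assert(nqueues > 0);
	if (M_HASHTYPE_GET(m) != M_HASHTYPE_NONE)
		return (m->m_pkthdr.flowid % nqueues);
	return (0);	/* no valid flowid: fall back to queue 0 */
}
```

A sender that has a flowid of no particular type would set
M_HASHTYPE_OPAQUE before handing the packet to the driver.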
uma_reclaim(). Reclamation code must not see half-constructed or
destructed zones. Do this by bracing uma_zcreate() and uma_zdestroy()
into a shared-locked sx, and take the sx exclusively in uma_reclaim().
Usually zones are not created/destroyed during the system operation,
but tmpfs mounts do cause zone operations and exposed the bug.
Another solution could be to only expose a new keg on uma_kegs list
after the corresponding zone is fully constructed, and similar
treatment for the destruction. But it probably requires more risky
code rearrangement as well.
Reported and tested by: pho
Discussed with: avg
Sponsored by: The FreeBSD Foundation
MFC after: 2 weeks
- Add support for GEOM direct completion. Depending on the benchmark,
this tends to give a ~30% improvement w.r.t IOPs and BW.
- Remove an invariants check in the strategy routine. This assertion
is caught later on by an existing panic.
- Rename and resort various related functions to make more sense.
MFC after: 1 month
- Provide pru_ready function for TCP.
- Don't call tcp_output() from tcp_usr_send() if no ready data was put
into the socket buffer.
- In case of dropped connection don't try to m_freem() not ready data.
Sponsored by: Nginx, Inc.
Sponsored by: Netflix
Provide pru_ready for AF_LOCAL sockets. Local sockets send data directly
to the receive buffer of the peer, thus pru_ready also works on the peer
socket.
Sponsored by: Netflix
Sponsored by: Nginx, Inc.
sending not ready data:
o Add new flag to pru_send() flags - PRUS_NOTREADY.
o Add new protocol method pru_ready().
Sponsored by: Nginx, Inc.
Sponsored by: Netflix
o Introduce a notion of "not ready" mbufs in socket buffers. These
mbufs are now being populated by some I/O in background and are
referenced outside. This has the following implications:
- An mbuf which is "not ready" can't be taken out of the buffer.
- Neither can an mbuf that is queued behind a "not ready" one.
- If the socket buffer is flushed, then "not ready" mbufs shouldn't be
freed.
o In struct sockbuf the sb_cc field is split into sb_ccc and sb_acc.
The sb_ccc stands for "claimed character count", or "committed
character count". And the sb_acc is "available character count".
Consumers of the socket buffer API should not access them directly,
but use sbused() and sbavail() respectively.
o Not ready mbufs are marked with M_NOTREADY, and ready but blocked ones
with M_BLOCKED.
o New field sb_fnrdy points to the first not ready mbuf, to avoid linear
search.
o New function sbready() is provided to activate certain amount of mbufs
in a socket buffer.
A special note on SCTP:
SCTP has its own sockbufs. Unfortunately, FreeBSD stack doesn't yet
allow protocol specific sockbufs. Thus, SCTP does some hacks to make
itself compatible with FreeBSD: it manages sockbufs on its own, but keeps
sb_cc updated to inform the stack of amount of data in them. The new
notion of "not ready" data isn't supported by SCTP. Instead, only a
mechanical substitute is done: s/sb_cc/sb_ccc/.
A proper solution would be to take away struct sockbuf from struct
socket and allow protocols to implement their own socket buffers, like
SCTP already does. This was discussed with rrs@.
Sponsored by: Netflix
Sponsored by: Nginx, Inc.
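A simplified user-space model of the sb_ccc/sb_acc split described
above. The structures and the *_sketch functions are hypothetical, and
the counters are recomputed by walking the queue purely for clarity;
this is not how the kernel maintains them.

```c
#include <assert.h>
#include <stddef.h>

struct nr_mbuf {
	struct nr_mbuf *next;
	unsigned int len;
	int notready;		/* stands in for M_NOTREADY */
};

struct nr_sockbuf {
	struct nr_mbuf *head;
	unsigned int sb_ccc;	/* claimed: every queued byte */
	unsigned int sb_acc;	/* available: bytes ahead of the first
				   not ready mbuf */
};

static unsigned int
sbused_sketch(const struct nr_sockbuf *sb)
{
	return (sb->sb_ccc);
}

static unsigned int
sbavail_sketch(const struct nr_sockbuf *sb)
{
	return (sb->sb_acc);
}

/* Recompute both counters; data behind a not ready mbuf is blocked. */
static void
sb_recount_sketch(struct nr_sockbuf *sb)
{
	const struct nr_mbuf *m;
	int blocked = 0;

	assert(sb != NULL);
	sb->sb_ccc = sb->sb_acc = 0;
	for (m = sb->head; m != NULL; m = m->next) {
		if (m->notready)
			blocked = 1;
		sb->sb_ccc += m->len;
		if (!blocked)
			sb->sb_acc += m->len;
	}
}
```

In this model a ready mbuf queued behind a not ready one counts toward
sbused_sketch() but not toward sbavail_sketch(), matching the "ready but
blocked" case above.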
Summary:
Revert the initial FBT-with-KDB changes for trap_subr*.S, and instead use the
db_trap filter function to handle dtrace trap filtering. With this, the MMU is
enabled by the support code, simplifying the codepath altogether.
Test Plan: Tested on my G4 PowerBook
Reviewers: #powerpc, nwhitehorn
Reviewed By: nwhitehorn
Differential Revision: https://reviews.freebsd.org/D1207
MFC after: 3 weeks
crowded as we now are at about 70k. Bump the limit to 1MB instead
which is still quite a reasonable limit and allows for future growth
of this file and possible future expansion to additional data.
MFC After: 2 weeks
recursion on mutex initialization.
The only places where the recursive acquire is performed are read and
write filters, since knlist, which uses the pipe pair mutex as lock,
is locked when filter is called.
The recursion was added in r93296, and consistent locking for
kn_fop->f_event() introduced in r133741.
Tested by: pho
Sponsored by: The FreeBSD Foundation
MFC after: 1 month
Call to the driver-specific ioctl used to process ioctl number
that will lead to the out-of-bounds access to the ioctl handler
array.
PR: 193367
Approved by: kib
MFC after: 1 week
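The message does not show the fix itself. One common way to guard a
handler array against such an access is a range check before indexing,
sketched below with hypothetical handler and dispatcher names.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

typedef int (*ioctl_handler_t)(void *arg);

static int handler_a(void *arg) { (void)arg; return (0); }
static int handler_b(void *arg) { (void)arg; return (0); }

/* Hypothetical handler table indexed by ioctl number. */
static const ioctl_handler_t handlers[] = { handler_a, handler_b };
#define	NHANDLERS	(sizeof(handlers) / sizeof(handlers[0]))

/* Reject out-of-range numbers before they are used as an index. */
static int
dispatch_ioctl_sketch(unsigned long nr, void *arg)
{
	if (nr >= NHANDLERS)
		return (EINVAL);
	assert(handlers[nr] != NULL);
	return (handlers[nr](arg));
}
```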
Bump the default from 16 to 32, to accommodate kernel flamegraphs.
Bump the maximum from 32 to 128, to accommodate deep user stacks.
Reviewed by: gnn
MFC after: 1 month
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D1203
This allows one to make a kernel module to tune the
number of queues before the driver loads.
This is needed so that a module at SI_SUB_CPU can set
tunables for these drivers to take. Otherwise getenv
is called too early by the TUNABLE macros.
Reviewed by: smh
Phabric: https://reviews.freebsd.org/D1149
xform_ipip was used as fallback with low priority for IPIP
encapsulated packets that were decrypted. In some cases
it can decapsulate packets that it shouldn't. This leads to situations
where wrong configurations magically work. It can also propagate the
wrong ingress interface, which can break security.
Now we redesigned the IPSEC code and IPIP encapsulation is called directly
from ipsec_output, and decapsulation is done in the ipsec_input with m_striphdr.
Differential Revision: https://reviews.freebsd.org/D1220
MFC after: 1 month
Sponsored by: Yandex LLC
- Threads lifetime cycle, in particular, counting of the threads in
the process, and interlocking with process mutex and thread lock.
The main reason for this is that turnstile locks are after thread
locks, so you e.g. cannot unlock blockable mutex (think process
mutex) while owning thread lock.
- Virtual and profiling itimers, since the timers activation is done
from the clock interrupt context. Replace the p_slock by p_itimmtx
and PROC_ITIMLOCK().
- Profiling code (profil(2)), for similar reason. Replace the p_slock
by p_profmtx and PROC_PROFLOCK().
- Resource usage accounting. Need for the spinlock there is subtle,
my understanding is that spinlock blocks context switching for the
current thread, which prevents td_runtime and similar fields from
changing (updates are done at the mi_switch()). Replace the p_slock
by p_statmtx and PROC_STATLOCK().
The split is done mostly for code clarity, and should not affect
scalability.
Tested by: pho
Sponsored by: The FreeBSD Foundation
MFC after: 1 week
method needs pre-reset state of the ps_siginfo to correctly construct
signal frame.
Move sigdflt() call after the sv_sendsig() invocation in postsig().
Simultaneously extract common code from trapsignal() and postsig()
into new helper postsig_done().
Submitted by: rea
MFC after: 1 week
Records with target_mode == 1 are allocated from the end of portdb, so it
seems logical to start the search from the end rather than traverse the
whole array.
MFC after: 1 month
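A sketch of the reverse search, assuming a flat array of records. The
record layout and the function name are hypothetical; only the search
direction follows the message.

```c
#include <assert.h>
#include <stddef.h>

struct portdb_entry_sketch {
	int target_mode;
	unsigned int handle;
};

/*
 * Search from the end of the array, where target-mode records are
 * allocated, so that a match is normally found after a few steps.
 * Returns the index of the match or -1 if there is none.
 */
static int
find_target_entry(const struct portdb_entry_sketch *db, int n,
    unsigned int handle)
{
	int i;

	assert(db != NULL && n >= 0);
	for (i = n - 1; i >= 0; i--) {
		if (db[i].target_mode == 1 && db[i].handle == handle)
			return (i);
	}
	return (-1);
}
```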
Make CTL core and block backend set success status before initiating last
data move for read commands. Make CAM target and iSCSI frontends detect
such condition and send command status together with data. New I/O flag
allows skipping duplicate status sending on later fe_done() call.
For Fibre Channel this change saves one of three interrupts per read command,
increasing performance from 126K to 160K IOPS. For iSCSI this change saves
one of three PDUs per read command, increasing performance from 1M to 1.2M
IOPS.
MFC after: 1 month
Sponsored by: iXsystems, Inc.
Special thanks to Nicholas Esborn for the loaner router to get this
target bootstrapped.
Review: D777
Reviewed by: adrian
Sponsored by: Nicholas Esborn <nick@desert.net>
- add parentheses around macro parameters for consistent style
- remove redundant parentheses around an expression
- use tab before a line continuation symbol
Differential Revision: https://reviews.freebsd.org/D1161 (partial)
Reviewed by: markj
MFC after: 1 week
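Parenthesizing macro parameters is more than a style matter. A small
illustration with hypothetical macros (not the ones touched by this
change) shows how an unparenthesized parameter changes the result:

```c
#include <assert.h>

/* Parameter not parenthesized: a + b expands to (a + b * 2). */
#define	TWICE_UNSAFE(x)	(x * 2)
/* Parameter parenthesized: a + b expands to ((a + b) * 2). */
#define	TWICE_SAFE(x)	((x) * 2)

static int
twice_unsafe_of_sum(int a, int b)
{
	return (TWICE_UNSAFE(a + b));
}

static int
twice_safe_of_sum(int a, int b)
{
	return (TWICE_SAFE(a + b));
}
```

For a = 1 and b = 2 the first function returns 5 and the second
returns 6.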
a new per-device '%domain' sysctl node that returns the NUMA domain a
device is associated with if it is associated with one.
Note that this API is still a WIP and might change before 11.0 actually
ships.
Differential Revision: https://reviews.freebsd.org/D930
Reviewed by: kib, adrian
This was previously working by accident because BUSDMA_COHERENT_MEMORY has
always been set to strongly-ordered on arm. Now we're moving towards
normal-uncacheable (what might be called write-combining on other platforms)
and using the proper sync ops will be more important. Of course, that
opens the question of just what is the "proper" sync op for shared
concurrent dma access as opposed to accesses where the handoff of control
of the memory has well-defined sequence points that match the available
busdma sync operations.
Old allocator created significant lock congestion protecting its lists
of preallocated I/Os, while UMA provides much better SMP scalability.
The downside of UMA is lack of reliable preallocation, that could guarantee
successful allocation in non-sleepable environments. But careful code
review has shown that only the CAM target frontend really has that requirement.
Fix that by making that frontend preallocate and statically bind CTL I/O for
every ATIO/INOT it preallocates anyway. That avoids allocations
in hot I/O path. Other frontends either may sleep in allocation context
or can properly handle allocation errors.
On 40-core server with 6 ZVOL-backed LUNs and 7 iSCSI client connections
this change increases peak performance from ~700K to >1M IOPS! Yay! :)
MFC after: 1 month
Sponsored by: iXsystems, Inc.
If this feels like deja vu... the last time this was fixed in this file
only ARM_MMU_V6 was fixed, this time it's ARM_ARCH_V6 (and this time I
searched for other occurrences of pj4b in here).
to automatically set the armv6 option when MACHINE_ARCH is armv6. That
allows replacing ever-growing lists of cpu names as options to compile
a given file with either "optional armv6" or "optional !armv6".
using the VM_MIN_ADDRESS constant.
HardenedBSD redefines VM_MIN_ADDRESS to be 64K, which results in
bhyve VM startup failing. Guest memory is always assumed to start
at 0 so use the absolute value instead.
Reported by: Shawn Webb, lattera at gmail com
Reviewed by: neel, grehan
Obtained from: Oliver Pinter via HardenedBSD
23bd719ce1
MFC after: 1 week
ath kernel module:
sys/dev/ath/ath_hal/ar5212/ar5212_reset.c:2642:7: error: taking the absolute value of unsigned type 'unsigned int' has no effect [-Werror,-Wabsolute-value]
if (abs(lp[0] * EEP_SCALE - target) < EEP_DELTA) {
^
sys/dev/ath/ah_osdep.h:74:18: note: expanded from macro 'abs'
#define abs(_a) __builtin_abs(_a)
^
sys/dev/ath/ath_hal/ar5212/ar5212_reset.c:2642:7: note: remove the call to '__builtin_abs' since unsigned values cannot be negative
sys/dev/ath/ah_osdep.h:74:18: note: expanded from macro 'abs'
#define abs(_a) __builtin_abs(_a)
^
1 error generated.
This warning occurs because both lp[0] and target are unsigned, so the
subtraction expression is also unsigned, and calling abs() is a no-op.
However, the intention was to look at the absolute difference between
the two unsigned quantities. Introduce a small static function to
clarify what we're doing, and call that instead.
Reviewed by: adrian
MFC after: 3 days
Differential Revision: https://reviews.freebsd.org/D1212
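A sketch of such a helper, under the assumption that both operands are
32-bit unsigned values. The name abs_diff_u32() is hypothetical; the
message does not give the name of the function that was actually added.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Absolute difference of two unsigned values.  Subtracting the smaller
 * from the larger avoids the wrap-around that makes abs(a - b)
 * meaningless for unsigned operands.
 */
static inline uint32_t
abs_diff_u32(uint32_t a, uint32_t b)
{
	return (a > b ? a - b : b - a);
}
```

For example, abs_diff_u32(3, 5) is 2, whereas the unsigned expression
3u - 5u wraps to 4294967294 when unsigned int is 32 bits wide.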
isochronous endpoint descriptor used for the data transfers, hence the
synchronization feature might not be supposed to be supported [yet].
This makes seamless playback synced with the USB HOST clock work with
the DN32-USB module for Midas audio systems and possibly other similar
products from Klark Teknik.
MFC after: 1 week
o Provide a new VOP_GETPAGES_ASYNC(), which works like VOP_GETPAGES(), but
doesn't sleep. It returns immediately, and will execute the I/O done handler
function that must be supplied as argument.
o Provide VOP_GETPAGES_ASYNC() for the FFS, which uses vnode_pager.
o Extend pagertab to support pgo_getpages_async method, and implement this
method for vnode_pager.
Reviewed by: kib
Tested by: pho
Sponsored by: Netflix
Sponsored by: Nginx, Inc.
SAF1761 OTG driver. Currently the driver logic is very simple and
double buffering the USB transactions is not done. Also you need to
use an external high speed USB HUB for reliable FULL speed
outgoing ISOCHRONOUS traffic, because the internal one chokes on
so-called split transfers above 188 bytes.
establishing connection.
This is a workaround for the Chelsio TOE driver, which does not update the
socket buffer size in hardware after the connection is established. Unless
that is done beforehand, kernel code will get stuck attempting to
send/receive a full PDU at once.
MFC after: 1 week
I've missed that iscsi_outstanding_remove() frees the second pointer,
so it should no longer be used. And in fact we don't really need to.
MFC after: 2 weeks
During heavy reads data copying in icl_pdu_get_data() may consume a large
percentage of CPU time. Moving it out of the lock significantly reduces
lock hold time and, accordingly, lock congestion on read operations.
MFC after: 2 weeks
the first cacheline if the buffer start address is not on a cacheline
boundary. Normally a buffer which is not cacheline-aligned is bounced,
but a special rule applies for mbufs, which are always misaligned due to
the header. We know the cpu will not write to the header while dma is in
progress (so we've been told anyway), but it may have written to the
header shortly before starting a read, so we need to flush that write out
to memory before invalidating the whole buffer.
In collaboration with Michal Meloun and Svata Kraus.
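The alignment test behind this rule can be sketched as follows. The
64-byte line size is an assumption made for illustration (the real value
depends on the CPU), and both helper names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

#define	CACHE_LINE_SKETCH	((uintptr_t)64)	/* assumed line size */

/* Nonzero when the buffer does not start on a cacheline boundary. */
static int
buf_start_misaligned(uintptr_t addr)
{
	return ((addr & (CACHE_LINE_SKETCH - 1)) != 0);
}

/* Base address of the cacheline that contains the buffer start. */
static uintptr_t
first_line_base(uintptr_t addr)
{
	return (addr & ~(CACHE_LINE_SKETCH - 1));
}
```

When buf_start_misaligned() is true, the line at first_line_base() also
holds bytes that precede the buffer (the mbuf header in the case above),
which is why it has to be written back before the invalidate.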
rename it to ufs_dirhashreclaimpercent, as suggested
by jhb@. As an added bonus this avoids divide-by-zero
errors.
Requested by: jhb, markj
Reviewed by: jhb, markj
commit 6d3c4c09226ad6bdd662e3e52489ef294a6ce298
Add terasic_mtl vt(4) framebuffer driver
terasic_mtl can be built with syscons(4) and vt(4) attachments, selected
at compile time.
commit 33240259b47a7c990a5a88a19f133a5600432a4c
Clear terasic_mtl text buffer on attach
commit d188c2d2412953f949624aa35cd07082830943c9
Update terasic vt(4) driver for FreeBSD r269783
commit d1cc54eee852fa4fc9d359d5bb2171d24ec73369
Safety belt to ensure vt(4) fb parameters are correct
commit 76e6d468ef45711d7952786095fc4791289ebb4b
Improve terasic_mtl_vt fdt parsing
- Use OF_getencprop to avoid need for explicit endian handling
(submitted by ray@freebsd.org)
- Check for expected length and correct pointer type
commit 3e2524b8995ab66e8a9295e4c87cbc7126eeddf4
Correct device_printf usage
commit 9e53e3c8e0766414e25662c95b09cc51c92443b0
Switch framebuffer to match host endianness
Xorg and xf86-video-scfb work much better with a native-endian
framebuffer.
commit 0f49259d596321ed85288ac0e1fb4ee1c966df48
Switch DE4 to vt(4) and enable kbdmux
commit 5bc96ebc89db7d134ad478335090c8477c1677c7
Add missing \n in device_printf calls
Submitted by: emaste
Sponsored by: DARPA, AFRL
commit d0c7d235c09fc65dbdb278e7016a96f79c6a49cc
Make the Altera JTAG UART device driver slightly more forgiving of
the foibles of a sub-par hardware interface by increasing the timeout
for spotting JTAG polling from one to two seconds.
commit 19ed45a18832560dab967c179d83b71081c3a220
Update comment.
commit 8edfe803f033cc8e33229f99894c2b7496a44d5f
Add a comment about a device-driver race condition that could cause the BERI
pipeline to wedge awaiting JTAG in the event that both the low-level console
and the tty layer decide to write to the JTAG FIFO just before JTAG is
disconnected. Resolving this race is a bit tricky as it looks like there
isn't a way to 'give the character back' to the tty layer when we discover
the race. The easy fix is to drop the character, which we don't yet do, but
perhaps should as that is a better outcome than wedging the pipeline.
commit 2ea26cf579c9defcf31e413e7c9b0fbc159237fc
Add a comment about an inherent race with hardware in the Altera JTAG
UART's low-level console code.
Submitted by: rwatson
MFC after: 1 week
Sponsored by: DARPA, AFRL
Previously, any timeout value for which (timeout * hz) overflowed a signed
integer gave weird results, since callout(9) routines convert negative
tick values to '1'. With unsigned integer overflow the resulting timeout
was significantly smaller than expected.
Switch from callout_reset, which requires conversion to int based ticks
to callout_reset_sbt to avoid this.
Also correct isci to correctly resolve ccb timeout.
This was based on the original work done by Eygene Ryabinkin
<rea@freebsd.org> on 5 Aug 2011 which used a macro to help avoid
the overflow.
Differential Revision: https://reviews.freebsd.org/D1157
Reviewed by: mav, davide
MFC after: 1 month
Sponsored by: Multiplay
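A user-space sketch of why the tick-based conversion overflows and how a
64-bit fixed-point time value avoids it. It assumes the timeout is given
in milliseconds and models sbintime_t as a 32.32 fixed-point int64_t;
the type, macro and function names below are stand-ins, not the kernel
definitions.

```c
#include <assert.h>
#include <limits.h>
#include <stdint.h>

typedef int64_t sbintime_sketch_t;	/* stand-in for sbintime_t */
#define	SBT_1S_SKETCH	((sbintime_sketch_t)1 << 32)

/* Would timeout_ms * hz overflow a signed int before dividing by 1000? */
static int
ticks_would_overflow(uint32_t timeout_ms, int hz)
{
	assert(hz > 0);
	return ((int64_t)timeout_ms * hz > INT_MAX);
}

/*
 * Convert milliseconds to the fixed-point form.  Whole seconds and the
 * remainder are scaled separately so the intermediate products stay
 * within 64 bits for every 32-bit input.
 */
static sbintime_sketch_t
ms_to_sbt_sketch(uint32_t timeout_ms)
{
	return ((sbintime_sketch_t)(timeout_ms / 1000) * SBT_1S_SKETCH +
	    (sbintime_sketch_t)(timeout_ms % 1000) * SBT_1S_SKETCH / 1000);
}
```

With hz = 1000, a 50 minute timeout (3000000 ms) already exceeds the
range of a 32-bit int when multiplied by hz, while the fixed-point value
stays well inside the 64-bit range.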
- Dump an NT_X86_XSTATE note if XSAVE is in use. This note is designed
to match what Linux does in that 1) it dumps the entire XSAVE area
including the fxsave state, and 2) it stashes a copy of the current
xsave mask in the unused padding between the fxsave state and the
xstate header at the same location used by Linux.
- Teach readelf() to recognize NT_X86_XSTATE notes.
- Change PT_GET/SETXSTATE to take the entire XSAVE state instead of
only the extra portion. This avoids having to always make two
ptrace() calls to get or set the full XSAVE state.
- Add a PT_GET_XSTATE_INFO which returns the length of the current
XSTATE save area (so the size of the buffer needed for PT_GETXSTATE)
and the current XSAVE mask (%xcr0).
Differential Revision: https://reviews.freebsd.org/D1193
Reviewed by: kib
MFC after: 2 weeks
This change saves/restores the callee-saved MIPS floating point
registers as documented by the o32/n32/n64 spec ("MIPSpro N32
ABI Handbook", Table 2-1) for the _setjmp(3), _longjmp(3),
setjmp(3) and longjmp(3) C library functions. This is only
included when the C library is built with hardware floating point
support (or when "SOFTFLOAT" is not defined).
Submitted by: sson
MFC after: 1 month
Sponsored by: DARPA, AFRL
In this mode one head is in Active state, supporting all commands, while
another is in Standby state, supporting only minimal LUN discovery subset.
It is still incomplete since Standby state requires reservation support,
which is impossible to do right without having interlink between heads.
But it allows running some basic experiments.
made getmntinfo() return empty flags for smbfs filesystems when
called with MNT_WAIT. It's not visible with mount(8), since it uses
MNT_NOWAIT, but broke autounmount(8) operation.
PR: 195161
Differential Revision: https://reviews.freebsd.org/D1194
Reviewed by: kib@
MFC after: 1 month
Sponsored by: The FreeBSD Foundation
It is automatically set when -fPIC is passed to the compiler.
Reviewed by: dim, kib
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D1179
related cleanups:
- Require each driver to initialize a mutex in the scsi_low_softc that
is shared with the scsi_low code. This mutex is used for CAM SIMs,
timers, and interrupt handlers.
- Replace the osdep function switch with direct calls to the relevant
CAM functions and direct manipulation of timers via callout(9).
- Collapse the CAM-specific scsi_low_osdep_interface substructure
directly into scsi_low_softc.
- Use bus_*() instead of bus_space_*().
- Return BUS_PROBE_DEFAULT from probe routines instead of 0.
- No need to zero softcs.
- Pass 0ul and ~0ul instead of 0 and ~0 to bus_alloc_resource().
- Spell "dettach" as "detach".
- Remove unused 'dvname' variables.
- De-spl().
Tested by: no one
- Add a per-softc mutex as a driver lock.
- Use callout(9) instead of timeout(9).
- Set softc pointer in si_drv1 of cdev instead of looking softc
up via devclass in cdev methods.
Tested by: no one
- Don't recurse driver mutex.
- Don't hold driver mutex across fubyte/subyte.
- Replace fubyte/subyte loops with copyin/copyout calls.
- Use relatively sane locking in wl_ioctl().
- Use bus space accessors instead of in*()/out*().
- Use callout(9) instead of timeout(9).
- Stop watchdog timer in detach and don't hold mutex across
bus_teardown_intr().
- Use device_printf() and if_printf().
- De-spl().
Tested by: no one
node. Take this into account by searching until we find the range for the
root node.
Differential Revision: https://reviews.freebsd.org/D1160
Reviewed by: ian
Obtained from: ABT Systems Ltd
Sponsored by: The FreeBSD Foundation
Without this fix, the vnet was NULL and would crash.
This fix is similar to what was done inside the ioctl handler for PF.
Tested by:
(1) Boot a kernel with "options VIMAGE" enabled
(2) Type:
echo "map lo0 from 10.0.0.0/24 to ! 10.0.0.0/24 -> 127.0.0.1/32" > /etc/ipnat.rules ; service ipnat onerestart
PR: 176992
Differential Revision: https://reviews.freebsd.org/D1191
Reviewed by: cy
Summary:
Currently if there are problems finding a symbol, backtrace ends up printing
something like:
0xdeadbeef: at +0x12345
Which is pretty useless. This on its own should be fixed (retrieving symbols),
but aside from that, using db_printsym() is a better solution anyway. If it
can't find a valid symbol it prints the actual address, and it has the added
benefit that if it can find the symbol, it might be able to print the file and
line as well.
Test Plan: Tested on my G4 PowerBook
Reviewers: #powerpc, nwhitehorn
Reviewed By: nwhitehorn
Differential Revision: https://reviews.freebsd.org/D1173
MFC after: 3 weeks
Early UART should be released right after system console initialization is
completed. Otherwise, after cninit() both early and system console coexist,
which may lead to various issues (i.a. writing to unmapped early
UART address). This cannot be done in cninit_finish() since it can be
called late at the end of MI configuration.
Obtained from: Semihalf
Reviewed by: andrew
Sponsored by: The FreeBSD Foundation