Under enough load, the swi's can actually be preempted and migrated
to other currently free cores. When doing RSS experiments, this lead
to the per-CPU TCP timers not lining up any more with the RX CPU said
flows were ending up on, leading to increased lock contention.
Since there was a little pushback on flipping them on by default,
I've left the default at "don't pin."
The other less obvious problem here is that the default swi
is also the same as the destination swi for CPU #0. So if one
pins the swi on CPU #0, there's no default floating swi.
A nice future project would be to create a separate swi for
the "default" floating swi, as well as per-CPU swis that are
(optionally) pinned.
Tested:
* parallel TCP tests (2 x 1g unfortunately for now);
CPU: Intel(R) Xeon(R) CPU E5-2650
Note:
This is based on some initial investigation into RSS/TCP stack lock
contention on FreeBSD-HEAD whilst at Netflix in January 2014.
- Rework how we allocate and free USB host channels, so that we only
allocate a channel if there is a real packet going out on the USB
cable.
- Use BULK type for control data and status, due to instabilities in
the HW it appears.
- Split FIFO TX levels into one for the periodic FIFO and one for the
non-periodic FIFO.
- Use correct HFNUM mask when scheduling host transactions. The HFNUM
register does not count the full 16-bit range.
- Correct START/COMPLETION slot for TT transactions. For INTERRUPT and
ISOCHRONOUS type transactions the hardware always respects the ODDFRM
bit, which means we need to allocate multiple host channels when
processing such endpoints, to not miss any so-called complete split
opportunities.
- When doing ISOCHRONOUS OUT transfers through a TT send all data
payload in a single ALL-burst. This deacreases the likelyhood for
isochronous data underruns.
- Fixed unbalanced unlock in case of "dwc_otg_init_fifo()" failure.
- Increase interrupt priority.
MFC after: 2 weeks
have implemented the PIM_NOSCAN rescan functionality will have it
enabled.
This is a no-op for head.
Reviewed by: slm
Sponsored by: Spectra Logic Corporation
MFC after: 3 days
TLR is necessary for reliable communication with SAS tape drives.
This was broken by change 246713 in the mps(4) driver. It changed the
cm_data field for SCSI I/O requests to point to the CCB instead of the data
buffer. So, instead, look at the CCB's data pointer to determine whether
or not we're talking to a tape drive.
Also, take the residual into account to make sure that we don't go off the
end of the request.
MFC after: 3 days
Sponsored by: Spectra Logic Corporation
Remove some other ifdefs that came in with a copy/paste that mean basically
"if this processor supports multicore stuff", because if you're starting up
an AP core... it does.
* Don't use sysexits.h. Just exit 1 on error and 0 otherwise.
* Don't sacrifice precision by converting the output of clock_gettime() to a
double and then comparing the results. Instead, subtract the values of
the two clock_gettime() calls, then convert to double.
* Don't use CLOCK_MONOTONIC_PRECISE. It's an unportable synonym for
CLOCK_MONOTONIC.
* Use more appropriate names for some local variables.
* In the summary message, round elapsed time to the nearest microsecond.
Reported by: bde, jilles
MFC after: 3 days
X-MFC-With: 265472
The thread pool is used by libzfs to implement parallel disk scanning.
Without this change our dummy wrapper made `zpool import ZZZ` command to
scan all disks sequentially from the single thread when searching for pools.
This change makes it use two threads per CPU, same as in OpenSolaris.
On system with 200 HDDs this change reduces ZFS pool import time from 35
to 22 seconds.
been installed in the first place, and it must be removed ASAP or
weird build errors may start happening in the future if this file is
ever taken from the installed system. Add note to UPDATING.
Since radix has been ignoring sa_family in passed sockaddrs,
no one ever has bothered filling valid sa_family in netmasks.
Additionally, radix adjusts sa_len field in every netmask not to
compare zero bytes at all.
This leads us to rt_mask with sa_family of AF_UNSPEC (-1) and
arbitrary sa_len field (0 for default route, for example).
However, rtsock have been passing that rt_mask intact for ages,
requiring all rtsock consumers to make ther own local hacks.
We even have unfixed on in base:
do `route -n monitor` in one window and issue `route -n get addr`
for some directly-connected address. You will probably see the following:
got message of size 304 on Thu May 8 15:06:06 2014
RTM_GET: Report Metrics: len 304, pid: 30493, seq 1, errno 0, flags:<UP,DONE,PINNED>
locks: inits:
sockaddrs: <DST,GATEWAY,NETMASK,IFP,IFA>
10.0.0.0 link#1 (255) ffff ffff ff em0:8.0.27.c5.29.d4 10.0.0.92
_________________^^^^^^^^^^^^^^^^^^
after the change:
got message of size 312 on Thu May 8 15:44:07 2014
RTM_GET: Report Metrics: len 312, pid: 2895, seq 1, errno 0, flags:<UP,DONE,PINNED>
locks: inits:
sockaddrs: <DST,GATEWAY,NETMASK,IFP,IFA>
10.0.0.0 link#1 255.255.255.0 em0:8.0.27.c5.29.d4 10.0.0.92
_________________^^^^^^^^^^^^^^^^^^
Sponsored by: Yandex LLC
MFC after: 1 month
fail to attach to stripped binaries. With the _r_debug_postinit symbol,
dtrace(1) can now set a breakpoint in the victim process after it has
registered its DOF table(s) with the kernel. r_debug_state cannot be used
for this purpose since it is called before DOF is made available, in which
case dtrace(1) cannot create USDT probes before the program begins
execution.
MFC after: 2 weeks
with r265456 _r_debug_postinit can be used for RD_POSTINIT events. rtld(1)
uses r_debug_state for dl state transitions, so we use its address for
RD_DLACTIVITY events.
MFC after: 2 weeks
If e_shnum or e_shstrndx are at least SHN_LORESERVE (0xff00) then an
escape value is used to indicate that the actual value is found in one
of section 0's fields.
Sponsored by: DARPA, AFRL