Add separate software Tx queue limit for non-TCP traffic to make total
limit higher and avoid local drops of TCP packets because of no
backpressure.
There is no point to make non-TCP limit high since without backpressure
UDP stream easily overflows any sensible limit.
Split early drops statistics since it is better to have separate counter
for each drop reason to make it unabmiguous.
Add software Tx queue high watermark. The information is very useful to
understand how big queues grow under traffic load.
Sponsored by: Solarflare Communications, Inc.
Approved by: gnn (mentor)
Most likely is was just memory leak on the error handling path since
typically efsys_mem_t is filled in by zeros on allocation.
Sponsored by: Solarflare Communications, Inc.
Approved by: gnn (mentor)
sfxge_dma_alloc() calls bus_dmamem_alloc() with BUS_DMA_ZERO flag, so
allocated memory is already filled in by zeros
Sponsored by: Solarflare Communications, Inc.
Approved by: gnn (mentor)
Remove the first member alignment to cacheline since it is nop.
Use __aligned() for the whole structure to make sure that the structure
size is cacheline aligned.
Remove lock alignment to make the structure smaller and fit all members
used on event queue processing into one cacheline (128 bytes) on x86-64.
The lock is obtained as well from different context when event queue
statistics are retrived from sysctl context, but it is infrequent.
Reorder members to avoid padding and go in usage order on event
processing.
As the result all structure members used on event queue processing fit
into exactly one cacheline (128 byte) now.
Sponsored by: Solarflare Communications, Inc.
Approved by: gnn (mentor)
In fact the pointer is used only if more than one TXQ is processed in
one interrupt.
It is used (read-write) on completion path only.
Also it makes the first part of the structure smaller and it fits now
into one 128byte cache line. So, TXQ structure becomes 128 bytes smaller.
Sponsored by: Solarflare Communications, Inc.
Approved by: gnn (mentor)
doesn't get truncated to 32 bits.
Without this, 3x3 NICs transmitting at an MCS rate whose rix (rate
index) in the rate table is > 31 end up returning errors, as the
sample rate code doesn't think the rate is set in the rate table.
Tested:
* AR9380, STA, speaking 3x3 to an AP
Without this change a local attacker could trigger a panic by
tricking the kernel into accessing undefined kernel memory.
We would like to acknowledge Francisco Falcon from CORE Security
Technologies who discovered the issue and reported to the
FreeBSD Security Team.
More information can be found at CORE Security's advisory at:
http://www.coresecurity.com/content/freebsd-kernel-multiple-vulnerabilities
This is an errata candidate for releng/10.1 and releng/9.3. Earlier
releases are not affected.
Reported by: Francisco Falcon from CORE Security Technologies
Security: CVE-2014-0998
Reviewed by: dumbbell
MFC after: 3 days
Also, split power_suspend into power_suspend and power_suspend_early.
power_suspend_early is called before the userland is frozen.
power_suspend is called after the userland is frozen.
Currently only VT switching is hooked to power_suspend_early.
This is needed because switching away from X server requires its
cooperation, so obviously X server must not be frozen when that happens.
Freezing userland during ACPI suspend is useful because not all drivers
correctly handle suspension concurrent with other activity. This is
especially applicable to drivers ported from other operating systems
that suspend all software activity between placing drivers and hardware
into suspended state.
In particular drm2/radeon (radeonkms) depends on the described
procedure. The driver does not have any internal synchronization
between suspension activities and processing of userland requests.
Many thanks to kib for the code that allows to freeze and thaw all
userland threads.
Note that ideally we also need to park / inhibit (non-special) kernel
threads as well to ensure that they do not call into drivers.
MFC after: 17 days
suspend/resume
The goal is to avoid that the vt(4) resume happens before the video
display is resumed. The original patch was provided by Andriy Gapon.
This new patch registers the handlers in vt_upgrade(). This is done
once, thanks to the VDF_ASYNC flag. I abused this flag because it was
already abused by the keyboard allocation. The event handlers then call
the backend if it provides callbacks for suspend/resume.
Differential Revision: https://reviews.freebsd.org/D1004
On behalf of: dumbbell
MFC after: 2 weeks
Previously, the driver resets the device and abandon the requests that
are caught in flight when the dump was initiated. This was problematic
if the system is resumed after the dump is completed.
While that is probably not the typical action, it is simple to rework
the driver to very likely have the device usable after the dump without
making it more likely for the dump to fail. The in flight requests are
simply queued for completion once the dump is finished.
Requested by: markj
MFC after: 1 month
the no-longer existant sb_cc sockbuf member.
- Use sbavail() instead of sbused() in t4_soreceive_ddp() to match the
usage in soreceive_stream() on which it is based.
Discussed with: glebius (2)
for i386, and from the code inspection, nothing in the
arm/mips/sparc64 implementations depends on it.
Discussed with: imp, nwhitehorn
Sponsored by: The FreeBSD Foundation
MFC after: 3 weeks
The newer boards don't have the response field that indicates
whether the SCSI status byte is present. You have to just look to
see whether it is non-zero.
The code was looking to see whether the sense length was valid
before propagating the SCSI status byte (and sense information) up
the stack. With a status like Reservation Conflict, there is no
sense information, only the SCSI status byte. So it wasn't getting
correctly returned.
isp.c:
In isp_intr(), if we are on a 2400 or 2500 type board and
get a response, look at the actual contents of the
SCSI status value and set the RQSF_GOT_STATUS flag
accordingly so that return any SCSI status value we get. The
RQSF_GOT_SENSE flag will get set later on if there is
actual sense information returned.
Submitted by: ken
MFC after: 1 week
Sponsored by: Spectra Logic
MFSpectraBSD: 1112791 on 2015/01/15
If the user sends an XPT_RESET_DEV CCB, make sure to reset the
Fibre Channel Command Reference Number if we're running on a FC
controller.
We send a SCSI Target Reset when we get this CCB, and as a result
need to reset the CRN to 1 on the next command.
isp_freebsd.c:
In the XPT_RESET_DEV implementation in isp_action(), reset
the CRN if we're on a FC controller.
Submitted by: ken
MFC after: 1 week
Sponsored by: Spectra Logic
MFSpectraBSD: 1112787 on 2015/01/15
Fix SCSI status byte reporting on 4Gb and 8Gb Qlogic boards.
The newer boards don't have the response field that indicates
whether the SCSI status byte is present. You have to just look to
see whether it is non-zero.
The code was looking to see whether the sense length was valid
before propagating the SCSI status byte (and sense information) up
the stack. With a status like Reservation Conflict, there is no
sense information, only the SCSI status byte. So it wasn't getting
correctly returned.
isp.c:
In isp_intr(), if we are on a 2400 or 2500 type board and
get a response, look at the actual contents of the
SCSI status value and set the RQSF_GOT_STATUS flag
accordingly so that return any SCSI status value we get. The
RQSF_GOT_SENSE flag will get set later on if there is
actual sense information returned.
Submitted by: ken
MFC after: 1 week
Sponsored by: Spectra Logic
MFSpectraBSD: 1112791 on 2015/01/15
systems with more than 4GB of physical memory.
To remotely debug the system 'stealthy' which has a kernel
with this change installed and firewire properly configured:
% fwcontrol -m stealthy (or stealthy's firewire EUI64)
% kgdb kernel /dev/fwmem0.0
sys/dev/firewire/fwohci.c:
Rather than hard code the upper limit for hw based
automatic responses to remote DMA requests at 4GB,
program the hardware using Maxmem, the page number
one higher than the highest physical page detected
in the system.
While here, garbage collect more useless splfw()
calls.
Submitted by: gibbs
MFC after: 1 week
Sponsored by: Spectra Logic
MFSpectraBSD: 1110994 on 2015/01/06
asynchronous remote dma request (DMA request that the
hardware cannot automatically handle).
sys/dev/firewire/firewire.c
In fw_rcv(), add missing early return in the error
path for DMA requests to unregistered regions.
Submitted by: gibbs
MFC after: 1 week
Sponsored by: Spectra Logic
MFSpectraBSD: 1110993 on 2015/01/06
sys/boot/i386/libfirewire/firewire.c:
sys/dev/firewire/firewire.c:
Fix configuration ROM generation count wrapping logic
so that the generation count is never outside of
allowed limits (0x2 -> 0xF).
sys/dev/firewire/firewire.c:
In fw_xfer_unload(), xfer->fc may be NULL. Protect
against this before taking the fc lock.
Submitted by: gibbs
MFC after: 1 week
Sponsored by: Spectra Logic
MFSpectraBSD: 1110685 on 2015/01/05
sys/dev/firewire/firewire.c:
In fw_xfer_unload() expand lock coverage so that
the test for FWXF_INQ doesn't race with it being
cleared in another thread.
Submitted by: gibbs
MFC after: 1 week
Sponsored by: Spectra Logic
MFSpectraBSD: 1110207 on 2015/01/02
sys/dev/firewire/firewire.c:
In fw_xfer_unload(), clear the FWXF_INQ flag on the
xfer under protection of the FW_GMTX, after the
xfer is removeed from the tx/rx queue. Otherwise
it is possible for the xfer to be removed again
(corrupting the list or immediately panicing) from
another thread that has found this xfer in the
transaction label table.
Submitted by: gibbs
MFC after: 1 week
Sponsored by: Spectra Logic
MFSpectraBSD: 1110200 on 2015/01/02
as the cpu id on arm64 as it may use two cells. In it's place we can use
the device id.
It is expected we will use the reg data on arm64 to enable cores so we
still need to read and store it even if it is not yet used.
Differential Revision: https://reviews.freebsd.org/D1555
Reviewed by: nwhitehorn
Sponsored by: The FreeBSD Foundation
commit 4d93914ae3db4a897ead4b. Some related drm infrastructure
changes are imported as needed.
Biggest update is the rewrite of the i915 gem io to more closely
follow Linux model, althought the mechanism used by FreeBSD port is
different.
Sponsored by: The FreeBSD Foundation
MFC after: 2 month
every operation to retrieve the bs_cookie value almost nothing actually uses.
The bus_space struct contains a private data pointer (poorly named bs_cookie,
now renamed to bs_privdata) which is used only by a few old armv4 xscale
implementations. The bus_space functions were all defined to take this
value as the first parameter instead of the bus_space_tag_t, requiring all
the inline macro and function expansions to dereference the tag to pass it
to another function, which never uses it. Now all the functions take the tag
as the first parameter and retrieve the privdata if they need it.
Also fix a couple bus_space_unmap() implementations that were calling
kva_free() instead of pmap_unmapdev().
Discussed with: cognet
This makes Mac OS X happy when it returns back from suspending.
o Switch notify state after data is transferred, but not before.
o Consider there is also Super Speed mode.
o Do not set stall bit on any pipes in device mode as Mac OS X seems
don't support it.
In collaboration with: hselasky@
driver on Rockchip boards. It currently supports PIO mode
and dma mode needs external dma controller to be used.
Submitted by: jmcneill
Approved by: stas (mentor)
"MODULE_VERSION" macro definition. Remove the redefinition of the
"MODULE_VERSION" macro from the Linux kernel compatibility API.
MFC after: 1 month
Reported by: np@
Sponsored by: Mellanox Technologies
bits.
The motivation here is to eventually teach netisr and potentially
other networking subsystems a bit more about how RSS work queues / buckets
are configured so things have a hope of auto-configuring in the future.
* net/rss_config.[ch] takes care of the generic bits for doing
configuration, hash function selection, etc;
* topelitz.[ch] is now in net/ rather than netinet/;
* (and would be in libkern if it didn't directly include RSS_KEYSIZE;
that's a later thing to fix up.)
* netinet/in_rss.[ch] now just contains the IPv4 specific methods;
* and netinet/in6_rss.[ch] now just contains the IPv6 specific methods.
This should have no functional impact on anyone currently using
the RSS support.
Differential Revision: D1383
Reviewed by: gnn, jfv (intel driver bits)
accidentally enable non-existent states.
This bug was triggered if ACPI advertises the presence of a C2 state
which we fail to parse via acpi_PkgGas due to our lack of support for
FFixedHW resources, and causes an immediate panic when an attempt is
made to enter the (NULL) state.
One affected platform is the EC2 c4.8xlarge VM instance type; there
may be others.
MFC after: 1 week
Thanks to: jkim, @_msw_
sdhci controllers, such as the one on a Raspberry Pi, mishandle the signal
timing in high speed signaling mode, but run just fine in standard mode
with the bus running at frequencies between 25-50MHz (which shouldn't work).
This is the solution adopted by U-Boot and other OSes (linux and *BSD)
for the timeouts on Raspberry Pi boards with certain SD cards. Some
research shows that this quirk is also used on a few other boards, so the
fix is a generic quirk instead of being in the RPi-specific driver code.
This change is based on information discovered by Michal Meloun.
Required when communicating to Mac OS X USB host stack.
o Also don't set stall bit to TX pipe in device mode as seems Mac OS X
don't clears it as it should.
Discussed with: hselasky@
in prep for the next NF calibration pass.
Totally missing braces. Damn you C.
Submitted by: Sascha Wildner <swildner@dragonflybsd.org>
MFC after: 1 week
MCI bluetooth coexistence method for WB222.
The rest of MCI requires a bunch more work, including adding a DMA buffer
for the MCI hardware to bounce messages in/out of and handling MCI
interrupts. But the more important part here is telling the HAL
the btcoex is enabled and MCI is in use so it configures the correct
initial bluetooth parameters in the wireless NIC and configures
things like bluetooth traffic weights and such.
So, this at least gets the HAL to do some of the right things in
configuring the inital bluetooth coexistence stuff, but doesn't
actually do full btcoex. That'll take.. some effort.
Tested:
* AR9462 (WB222), STA mode
tree's /chosen node to provide out-of-band header fields of the FDT. This
emulation is not perfect without corresponding changes to ofw_fdt_nextprop(),
but is enough to enable lookup by memory-map-parsing code.
MFC after: 1 week
resume sometimes (but not others). On powerup, other wierd issues show
up (sometimes the card comes up, but with really bogus pci config
space stuff. There may be more, but given my experience of historical
fussiness, stick to what works and make more minimal changes to that.
go back through HASWELL, IVY_BRIDGE, IVY_BRIDGE_XEON and SANDY_BRIDGE
to straighten out all the missing PMCs. We also add a new pmc tool
pmcstudy, this allows one to run the various formulas from
the documents "Using Intel Vtune Amplifier XE on XXX Generation platforms" for
IB/SB and Haswell. The tool also allows one to postulate your own
formulas with any of the various PMC's. At some point I will enahance
this to work with Brendan Gregg's flame-graphs so we can flamegraph
various PMC interactions. Note the manual page also needs some
work (lots of work) but gnn has committed to help me with that ;-)
Reviewed by: gnn
MFC after:1 month
Sponsored by: Netflix Inc.
can suspend / resume and unload / load cbb and cardbus without errors
on my Lenovo T400, which wasn't possible before. Cards suspending
and resuming in the CardBus slot not yet tested.
o Enable memory cycles to the bridge early (as part of the new
cbb_pci_bridge_init). This fixes the Bad VCC errors which were
caused by the code accessing the device registers with this
cleared. The suspend / resume process clears it.
o Refactor suspend / resume into bus specific code (though the ISA
code is just stubbed). This isn't strictly necessary, but makes
the initializaiton code more uniform and should be more bullet
proof in the face of variant behavior among cardbus bridges.
o Fixup comments in the power-up sequence to reflect reality. These
comments were written for one regime of power-up, but not updated
as things were revised.
o Add a paranoid small delay (100ms) to cover noisy cards powering
down.
o Fix some debugging prints to be easier to grep from dmesg.
Sponsored by: Netflix
simultaneously detaching kernel drivers on the same USB device we can
get stuck in the "usb_wait_pending_ref_locked()" function because the
conditions needed for allowing detach are not met. The "destroy_dev()"
function waits for all system calls involving the given character
device to return. Character device system calls may lock the USB
enumeration lock, which is also held when "destroy_dev()" is
called. This can sometimes lead to a deadlock not noticed by
WITNESS. The current solution is to ensure the calling thread is the
only one holding the USB enumeration lock and prevent other threads
from getting refs while a USB device detach is ongoing. This turned
out not to be sufficient. To solve this deadlock we could use
"destroy_dev_sched()" to schedule the device destruction in the
background, but then we don't know when it is safe to free() the
private data of the character device. Instead a callback function is
executed by the USB explore process to kill off any leftover USB
character devices synchronously after the USB device explore code is
finished and the USB enumeration lock is no longer locked. This makes
porting easier and also ensures us that character devices must
eventually go away after a USB device detach.
While at it ensure that "flag_iserror" is only written when "priv_mtx"
is locked, which is protecting it.
MFC after: 5 days
Before this change, the current code handles SIOCGIFADDR the same
way with SIOCSIFADDR, which involves full arp_ifinit, et al. They
should be unnecessary for SIOCGIFADDR case.
Differential Revision: https://reviews.freebsd.org/D1508
Reviewed by: glebius
MFC after: 2 weeks
Instead of reusing the same reg parsing code, create one, common function
that puts reg contents to the resource list. Address cells and size cells
are passed rather than acquired here so that any bus can have different
default values.
Obtained from: Semihalf
Reviewed by: andrew, ian, nwhitehorn
Sponsored by: The FreeBSD Foundation
driver name and NIC driver softc via the device(9) tree,
instead of going dirty through the ifnet(9) layer.
Differential Revision: D1506
Reviewed by: imp, jhb
if_ixl to version 1.3.0, if_ixlv to version 1.2.0
- Major change in both drivers is to add RSS support
- In ixl fix some interface speed related issues, dual
speed was not changing correctly, KR/X media was not
displaying correctly (this has a workaround until a
more robust media handling is in place)
- Add a warning when using Dell NPAR and the speed is
less than 10G
- Wrap a queue hung message in IXL_DEBUG, as it is non-fatal,
and without tuning can display excessively
MFC after: 1 week
unnecessary filter configuration code in nge_init_locked().
While I'm here add a check for driver running state for multicast
filter handling. Also remove unnecessary assignment to error
variable since it is cleared in the function entry.
Suggested by: brad@OpenBSD.org
devices which don't support the synchronize cache SCSI command are
likely to also not support the prevent-allow medium removal SCSI
command.
PR: 185747
MFC after: 1 week