This may result in a bit of a throughput drop. However, any throughput
drop at this point should be investigated and root caused, as it's likely
because TX scheduling (all the way down to how preemption, scheduler work,
etc) is happening in a sub-optimal fashion.
This also makes it much more likely to be reloadable on a live machine.
Allocating 5120 TX ath_buf entries via contigmalloc is very unlikely
after a few hours of using X/Chromium.
dirty and murky past.
* Override the default cache line size to be something reasonable if
it's set to 0. Some NICs initialise with '0' (eg embedded ones)
and there are comments in the driver stating that various OSes (eg
older Linux ones) would incorrectly program things and 0 out this
register.
* Just default to overriding the latency timer. Every other driver
does this.
* Use a default cache line size of 32 bytes. It should be "reasonable
enough".
Obtained from: Linux ath9k, Atheros
a8af6270bd96be6ccd86f70b60fa6512b710e4f0
virtio_blk: Include function name in panic string
cbdb03a694b76c5253d7ae3a59b9995b9afbb67a
virtio_balloon: Do the notify outside of the lock
By the time we return from virtqueue_notify(), the descriptor
will be in the used ring so we shouldn't have to sleep.
10ba392e60692529a5cbc1e9987e4064e0128447
virtio: Use DEVMETHOD_END
80cbcc4d6552cac758be67f0c99c36f23ce62110
virtqueue: Add support for VIRTIO_F_RING_EVENT_IDX
This can be used to reduce the number of guest/host and
host/guest interrupts by delaying the interrupt until a
certain index value is reached.
Actual use by the network driver will come along later.
8fc465969acc0c58477153e4c3530390db436c02
virtqueue: Simplify virtqueue_nused()
Since the values just wrap naturally at UINT16_MAX, we
can just subtract the two values directly, rather than
doing 2's complement math.
a8aa22f25959e2767d006cd621b69050e7ffb0ae
virtio_blk: Remove debugging crud from 75dd732a
There seems to be an issue with Qemu (or FreeBSD VirtIO) that sets
the PCI register space for the device config to bogus values. This
only seems to happen after unloading and reloading the module.
d404800661cb2a9769c033f8a50b2133934501aa
virtio_blk: Use better variable name
75dd732a97743d96e7c63f7ced3c2169696dadd3
virtio_blk: Partially revert 92ba40e65
Just use the virtqueue to determine if any requests are
still inflight.
06661ed66b7a9efaea240f99f414c368f1bbcdc7
virtio_blk: error if allowed too few segments
Should never happen unless the host provides use with a
bogus seg_max value.
4b33e5085bc87a818433d7e664a0a2c8f56a1a89
virtio_blk: Sort function declarations
426b9f5cac892c9c64cc7631966461514f7e08c6
virtio_blk: Cleanup whitespace
617c23e12c61e3c2233d942db713c6b8ff0bd112
virtio_blk: Call disk_err() on error'd completed requests
081a5712d4b2e0abf273be4d26affcf3870263a9
virtio_blk: ASSERT the ready and inflight request queues are empty
a9be2631a4f770a84145c18ee03a3f103bed4ca8
virtio_blk: Simplify check for too many segments
At the cost of a small style violation.
e00ec09da014f2e60cc75542d0ab78898672d521
virtio_blk: Add beginnings of suspend/resume
Still not sure if we need to virtio_stop()/virtio_reinit()
the device before/after a suspend.
Don't start additional IO when marked as suspending.
47c71dc6ce8c238aa59ce8afd4bda5aa294bc884
virtio_blk: Panic when dealt an unhandled BIO cmd
1055544f90fb8c0cc6a2395f5b6104039606aafe
virtio_blk: Add VQ enqueue/dequeue wrappers
Wrapper functions managed the added/removing to the in-flight
list of requests.
Normally biodone() any completed IO when draining the virtqueue.
92ba40e65b3bb5e4acb9300ece711f1ea8f3f7f4
virtio_blk: Add in-flight list of requests
74f6d260e075443544522c0833dc2712dd93f49b
virtio_blk: Rename VTBLK_FLAG_DETACHING to VTBLK_FLAG_DETACH
7aa549050f6fc6551c09c6362ed6b2a0728956ef
virtio_blk: Finish all BIOs through vtblk_finish_bio()
Also properly set bio_resid in the case of errors. Most geom_disk
providers seem to do the same.
9eef6d0e6f7e5dd362f71ba097f2e2e4c3744882
Added function to translate VirtIO status to error code
ef06adc337f31e1129d6d5f26de6d8d1be27bcd2
Reset dumping flag when given unexpected parameters
393b3e390c644193a2e392220dcc6a6c50b212d9
Added missing VTBLK_LOCK() in dump handler
Obtained from: Bryan Venteicher bryanv at daemoninthecloset dot org
Contrarily to what i wrote in my previous commit, the 82599
does include the CRC in the length. The operating mode is
reset in ixgbe_init_locked() and so we need to hook into
the places where the two registers (HLREG0 and RDRXCTL) are
modified.
interface.
* Introduce a device hint, 'eeprom_firmware', which is the name of firmware
to lookup.
* If the lookup succeeds, take a copy of it and use it as the eeprom data.
This isn't enabled by default - you have to define ATH_EEPROM_FIRMWARE.
I'll add it to the configuration variables in a later commit.
TODO:
* just keep a firmware reference in ath_softc, and remove the need to
waste the extra memory in having sc_eepromdata be a malloc()ed block.
used in polled-mode. The callout invokes uart_intr, which rearms the timeout.
Implemented for bhyve, but generically useful for e.g. embedded bringup
when the interrupt controller hasn't been setup, or if it's not deemed
worthy to wire an interrupt line from a serial port.
Submitted by: neel
Reviewed by: marcel
Obtained from: NetApp
MFC after: 3 weeks
does not include the CRC irrespective of the setting
of CRCSTRIP. The 82599 data sheets (sec. 7.1.6) say differently.
Very strange. Need to check what happens on legacy descriptors,
but for the time being this restores functionality.
and make it easier to replace it with a different implementation.
On passing, also fix indentation.
NOTE: I know that #include "foo.c" is ugly, but the alternative
(add another entry to sys/conf/files, add a separate header with
structs and prototypes, and expose functions that are meant to
be private) looks even worse to me.
We need a more modular way to specify dependencies and build options.
portions were already reapplied in r233708:
- Use a dedicated task to handle deferred transmits from the if_transmit
method instead of reusing the existing per-queue interrupt task.
Reusing the per-queue interrupt task could result in both an interrupt
thread and the taskqueue thread trying to handle received packets on a
single queue resulting in out-of-order packet processing.
- Call ether_ifdetach() earlier in igb_detach().
- Drain tasks and free taskqueues during igb_detach().
MFC after: 1 week
- add a sysctl, dev.netmap.ix_crcstrip, to control whether ixgbe should
strip the CRC on received frames. Defaults to 0, which keeps the CRC.
and improves performance when receiving min-sized (64-byte) frames.
This matters because min-sized frames is one of the standard
benchmarks for switches and routers, some chipsets seem to issue
read-modify-write cycles for PCIe transactions that are not a
full cache line, and a min-sized frame triggers the bug, resulting
in reduced throughput -- 9.7 instead of 14.88 Mpps -- and heavy
bus load.
- for the time being, always look for incoming packets on a select/poll
even if there has not been an interrupt in the meantime. This is
only a temporary workaround for a probable race condition in keeping
track of rx interrupts.
Add a couple of diagnostic vars to help studying the problem.
values as in the Intel driver 3.8.21 for linux. The fact that it
is standard in the above driver suggests that it has no bad side
effects.
But of course there must be a reason for enabling features, not
just "it does not harm", so here it is a good one:
Prefetching enables full line rate even using a single queue (14.88
Mpps, compared to ~12 Mpps without prefetch). This in turn is
terribly useful when one wants to schedule traffic.
For obvious reasons the difference is only visible with netmap
or other high speed solutions, but presumably the advantage
should be in the order of a fraction of a microsecond when
starting transmission on an empty queue.
Discussed with Jack Vogel.
MFC after: 1 week
r228476 fixed superfluous link UP/DOWN messages but broke IPMI
access during boot. It's not clear why r228476 breaks IPMI and
should be revisited.
Reported by: Paul Guyot <paulguyot <> ieee dot org >
MFC after: 1 week
identical now that the bus spaces are unified under sys/x86.
Replace them with a single uart_cpu_x86.c.
o delete uart_cpu_i386.c
o move uart_cpu_amd64.c to uart_cpu_x86.c
o update files.amd64 and files.i386 accordingly.
problem where userspace apps such as smartctl fail due to CAM_REQUEUE_REQ
status getting returned when tagged commands are outstanding when smartctl
sends its I/O using the pass(4) interface.
Sponsored by: Intel
Found and tested by: Ravi Pokala <rpokala at panasas dot com>
Reviewed by: scottl
Approved by: scottl
MFC after: 1 week
add a FreeBSD_version check. It should work fine for compiling
on -HEAD, 9.x and 8.x.
* Conditionally compile the 11n options only when 11n is enabled.
The above changes allow the ath(4) driver to compile and run on
8.1-RELEASE (Hi old PC-BSD!) but with the 11n stuff disabled.
I've done a test against the net80211 and tools in 8.1-RELEASE.
The NIC used in testing is the AR2427 in an EEEPC.
Just to be clear - this change is to allow the -HEAD ath/hal/rate
code to run on 9.x _and_ 8.x with no source changes. However,
when running on earlier kernels, it should only be used for legacy
mode. (Don't define ATH_ENABLE_11N.)
damage which I committed when I had less clue about such things.
Don't ever put normal data frames on the mcast software queue.
Just put mcast frames there if needed.
Pass the txq decision into ath_tx_normal_setup(), as we've already made
the decision. Don't re-do it.
Whilst i'm here, add another random debugging statement.
This fixes bootp on if_smc, as bootp code perform SIOCSIFADDR
ioctl call immediately after sending the request (which causes
if_init being called) which causes the adapter to drop all the
packets received in the meantime.
call these after rate control selection is done.
The duration/protection code wasn't working - it expected the rix to
be valid. Unfortunately after I moved the rate control selection into
late in the process, the rix value isn't valid and thus the protection/
duration code would get things wrong.
HT frames are now correctly protected with an RTS and for the AR5416,
this involves having the aggregate frames be limited to 8K.
TODO:
* Fix up the DMA sync to occur just before the frame is queued to the
hardware. I'm adjusting the duration here but not doing the DMA
flush.
* Doubly/triply ensure that the aggregate frames are being limited to
the correct size, or the AR5416 will get unhappy when TXing RTS-protected
aggregates.
if any subframes in an aggregate have different protection from the
first frame in the formed aggregate, don't add that frame to the
aggregate.
This is likely a suboptimal method (I think we'll mostly be OK marking
frames that have seqno's with the same protection as normal data frames)
but I'll just be cautious for now.
This will be used by some upcoming code to ensure that aggregates
are enforced to be a certain size. The AR5416 has a limitation on
RTS protected aggregates (8KiB).
A BAR frame must be transmitted when an frame in an A-MPDU session fails
to transmit - it's retried too often, or it can't be cloned for
re-transmission. The BAR frame tells the remote side to advance the
left edge of the block-ack window (BAW) to a new value.
In order to do this:
* TX for that particular node/TID must be paused;
* The existing frames in the hardware queue needs to be completed, whether
they're TXed successfully or otherwise;
* The new left edge of the BAW is then communicated to the remote side
via a BAR frame;
* Once the BAR frame has been sucessfully TXed, aggregation can resume;
* If the BAR frame can't be successfully TXed, the aggregation session
is torn down.
This is a first pass that implements the above. What needs to be done/
tested:
* What happens during say, a channel reset / stuck beacon _and_ BAR
TX. It _should_ be correctly buffered and retried once the
reset has completed. But if a bgscan occurs (and they shouldn't,
grr) the BAR frame will be forcibly failed and the aggregation session
will be torn down.
Yes, another reason to disable bgscan until I've figured this out.
* There's way too much locking going on here. I'm going to do a couple
of further passes of sanitising and refactoring so the (re) locking
isn't so heavy. Right now I'm going for correctness, not speed.
* The BAR TX can fail if the hardware TX queue is full. Since there's
no "free" space kept for management frames, a full TX queue (from eg
an iperf test) can race with your ability to allocate ath_buf/mbufs
and cause issues. I'll knock this on the head with a subsequent
commit.
* I need to do some _much_ more thorough testing in hostap mode to ensure
that many concurrent traffic streams to different end nodes are correctly
handled. I'll find and squish whichever bugs show up here.
But, this is an important step to being able to flip on 802.11n by default.
The last issue (besides bug fixes, of course) is HT frame protection and
I'll address that in a subsequent commit.
Linux ath9k doesn't have this issue as it doesn't try queuing multi-
descriptor frames to the hardware.
Before, I was only setting the first and last descriptor in the final
frame correctly - and that was done by accident. The first descriptor in
the last sub-frame was being correctly updated by ath_tx_setds_11n();
the last descriptor in the last sub-frame was being correctly updated
by ath_buf_set_rate(). But both of those are "incorrect".
The correct behaviour is:
* AR_IsAggr is set for all descriptors for all subframes in an aggregate.
* AR_MoreAggr is set for all descriptors for all non-final sub-frames
in an aggregate.
Ie, all descriptors in the last sub-frame of an aggregate must have this
field set to 0.
I still need to do a couple of extra passes to ensure the pad delimiter
field is being correctly handled in all descriptors in the last sub-frame.
can be upgraded to MegaRAID mode, in which case mfi(4) should attach to
these based on the sub-vendor and -device ID instead (not currently done).
Therefore, let mpt_pci_probe() return BUS_PROBE_LOW_PRIORITY.
While it, let mpt_pci_probe() return BUS_PROBE_DEFAULT instead of 0 in
the default case.
MFC after: 3 days
First cut of new HW support from LSI and merge into FreeBSD.
Supports Drake Skinny and ThunderBolt cards.
MFhead_mfi r227574
Style
MFhead_mfi r227579
Use bus_addr_t instead of uintXX_t.
MFhead_mfi r227580
MSI support
MFhead_mfi r227612
More bus_addr_t and remove "#ifdef __amd64__".
MFhead_mfi r227905
Improved timeout support from Scott.
MFhead_mfi r228108
Make file.
MFhead_mfi r228208
Fixed botched merge of Skinny support and enhanced handling
in call back routine.
MFhead_mfi r228279
Remove superfluous !TAILQ_EMPTY() checks before TAILQ_FOREACH().
MFhead_mfi r228310
Move mfi_decode_evt() to taskqueue.
MFhead_mfi r228320
Implement MFI_DEBUG for 64bit S/G lists.
MFhead_mfi r231988
Restore structure layout by reverting the array header to
use [0] instead of [1].
MFhead_mfi r232412
Put wildcard pattern later in the match table.
MFhead_mfi r232413
Use lower case for hexadecimal numbers to match surrounding
style.
MFhead_mfi r232414
Add more Thunderbolt variants.
MFhead_mfi r232888
Don't act on events prior to boot or when shutting down.
Add hw.mfi.detect_jbod_change to enable or disable acting
on JBOD type of disks being added on insert and removed on
removing. Switch hw.mfi.msi to 1 by default since it works
better on newer cards.
MFhead_mfi r233016
Release driver lock before taking Giant when deleting children.
Use TAILQ_FOREACH_SAFE when items can be deleted. Make code a
little simplier to follow. Fix a couple more style issues.
MFhead_mfi r233620
Update mfi_spare/mfi_array with the actual number of elements
for array_ref and pd. Change these max. #define names to avoid
name space collisions. This will require an update to mfiutil
It avoids mfiutil having to do a magic calculation.
Add a note and #define to state that a "SYSTEM" disk is really
what the firmware calls a "JBOD" drive.
Thanks to the many that helped, LSI for the initial code drop,
mav, delphij, jhb, sbruno that all helped with code and testing.
sys/dev/isci/isci_task_request.c:198:7: error: case value not in enumerated type 'SCI_TASK_STATUS' (aka 'enum _SCI_TASK_STATUS') [-Werror,-Wswitch]
case SCI_FAILURE_TIMEOUT:
^
This is because the switch is done on a SCI_TASK_STATUS enum type, but
the SCI_FAILURE_TIMEOUT value belongs to SCI_STATUS instead.
Because the list of SCI_TASK_STATUS values cannot be modified at this
time, use the simplest way to get rid of this warning, which is to cast
the switch argument to int. No functional change.
Reviewed by: jimharris
MFC after: 3 days
- Do not define the foo_start() methods or set if_start in the ifnet if
multiq transmit is enabled. Also, set if_transmit and if_qflush before
ether_ifattach rather than after when multiq transmit is enabled. This
helps to ensure that the drivers never try to mix different transmit
methods.
- Properly restart transmit during resume. igb(4) was not restarting it
at all, and em(4) was restarting even if the link was down and was
calling the wrong method if multiq transmit was enabled.
- Remove all the 'more' handling for transmit completions. Transmit
completion processing does not have a processing limit, so it always
runs to completion and never has more work to do when it returns.
Instead, the previous code was returning 'true' anytime there were
packets in the queue that weren't still in the process of being
transmitted. The effect was that the driver would continuously
reschedule a task to process TX completions in effect running at 100%
CPU polling the hardware until it finished transmitting all of the
packets in the ring. Now it will just wait for the next TX completion
interrupt.
- Restart packet transmission when the link becomes active.
- Fix the MSI-X queue interrupt handlers to restart packet transmission if
there are pending packets in the relevant software queue (IFQ or buf_ring)
after processing TX completions. This is the root cause for the OACTIVE
hangs as if the MSI-X queue handler drained all the pending packets from
the TX ring, nothing would ever restart it. As such, remove some
previously-added workarounds to reschedule a task to poll the TX ring
anytime OACTIVE was set.
Tested by: sbruno
Reviewed by: jfv
MFC after: 1 week
This change also workarounds dhclient's link state handling bug by
not giving current link status.
Unlike other controllers, ale(4)'s PHY hibernation perfectly works
such that driver does not see a valid link if the controller is not
brought up. If dhclient(8) runs on ale(4) it will blindly waits
until link UP and then gives up after 10 seconds. Because
dhclient(8) still thinks interface got a valid link when IFM_AVALID
is not set for selected media, this change makes dhclient initiate
DHCP without waiting for link UP.
needs to defer link state handling.
While I'm here, mark IFF_DRV_RUNNING before changing media. If
link is established without any delay, that link state change
handling could be lost.
not disable it and it is even harmful as hselasky found out. Historically,
this code was originated from (OLDCARD) CardBus driver and later leaked into
PCI driver when CardBus was newbus'ified and refactored with PCI driver.
However, it is not really necessary even for CardBus.
Reviewed by: hselasky, imp, jhb
bridges. Rather than blindly enabling the windows on all of them, only
enable the window when an MSI interrupt is enabled for a device behind
the bridge, similar to what already happens for HT PCI-PCI bridges.
To implement this, each x86 Host-PCI bridge driver has to be able to
locate it's actual backing device on bus 0. For ACPI, use the _ADR
method to find the slot and function of the device. For the non-ACPI
case, the legacy(4) driver already scans bus 0 looking for Host-PCI
bridge devices. Now it saves the slot and function of each bridge that
it finds as ivars that the Host-PCI bridge driver can then use in its
pcib_map_msi() method.
This fixes machines where non-MSI interrupts were broken by the previous
round of HT MSI changes.
Tested by: bapt
MFC after: 1 week
Right now ath_txq_sched() is mainly called from the TX ath_tx_processq()
routine, which is (mostly) done as part of the taskqueue. It shouldn't
be called outside the taskqueue.
But now that I'm about to flip back on BAR TX, I'm going to start
stressing the ath_tx_tid_pause() and ath_tx_tid_resume() paths.
What I don't want to have happen is a reschedule of the TID traffic
_during_ the completion of TX frames.
Ideally I'd like to have a way to flag back up to the processing code
that the current hardware queue should be rechecked for software TID
queue frames. But for now, this should suffice for the BAR TX case.
I may eventually delete this code once I've brought some further
sanity to the general TX queue/completion path.
Don't disable BARs on any PCI display devices, because doing that can
sometimes cause the main memory bus to stop working, causing all
memory reads to return nothing but 0xFFFFFFFF, even though the memory
location was previously written. After a while a privileged
instruction fault will appear and then nothing more can be debugged.
The reason for this behaviour is unknown.
MFC after: 1 week
New kernel events can be added at various location for sampling or counting.
This will for example allow easy system profiling whatever the processor is
with known tools like pmcstat(8).
Simultaneous usage of software PMC and hardware PMC is possible, for example
looking at the lock acquire failure, page fault while sampling on
instructions.
Sponsored by: NETASQ
MFC after: 1 month
violation should be activated unless the system is cold-booted
after updating EEPROM.
The PCI protocol violation happens only when established link is
10Mbps so the workaround should be updated whenever link state
change is detected. Previously the workaround was activated only
when user checks current media status with ifconfig(8).
whether the checksum of EEPROM is valid or not. Because driver
heavily relies on EEPROM information when it selectively enables
features/workarounds, it would be helpful to know whether driver
sees valid EEPROM.
While I'm here remove all other EEPROM accesses since the entire
EEPROM is loaded at device attach time.
MFC after: 2 weeks
for 82550C. For 82550 controllers this change restores CPUSaver
microcode loading. Due to silicon bug on 82550 and 82550C with
server extension, these controllers seem to require CPUSaver
microcode to receive fragmented UDP datagrams. However the
microcode shouldn't be used on client featured 82550C as it locks
up the controller. In addition, client featured 82550C does not
have the silicon bug. Also clear temporary memory used for
microcode loading since the same memory area is used for other
commands.
While I'm here use 82550C in probe message instead of generic
82550.
Reported by: Andreas Longwitz <longwitz <> incore de>
Tested by: Andreas Longwitz <longwitz <> incore de>
MFC after: 2 weeks
- Make INITAFTERSUSPEND flag independent of HOOKRESUME flag.
- Automatically set INITAFTERSUSPEND flag when ALPS GlidePoint is detected.
- Always probe Synaptics Touchpad. Allow MOUSE_SYN_GETHWINFO ioctl and
automatically set INITAFTERSUSPEND flag when a supported device is detected,
regardless of "hw.psm.synaptics_support" tunable setting.
- Update psm(4) to reflect the above changes.
- Remove long-time defunct SYNCHACK flag while I am in the neighborhood.
MFC after: 1 month
- Do not cover error returned by pmc_core_initialize with the
result of pmc_uncore_initialize, fail right away.
- Give a user something to report instead failing silently
Reported by: Alexandr Kovalenko <never@nevermind.kiev.ua>
The flash commands and responses are little-endian and have to be
byte swapped on big-endian systems. However the raw read of data
need not be swapped.
Make the cfi_read and cfi_write do the swapping, and provide a
cfi_read_raw which does not byte swap for reading data from
flash.
Add a Simple polled driver iicoc for the OpenCores I2C controller. This
is used in Netlogic XLP processors.
Submitted by: Sreekanth M. S. (kanthms at netlogicmicro com)
The earlier version of the driver is sys/mips/rmi/dev/iic/ds1374u.c
Convert all references to ds1374u to ds1374, and use DEVMETHOD_END.
Also update the license header as Netlogic is now Broadcom.
within the BAW.
This regression was introduced in ane earlier commit by me to fix the
BAW seqno allocation-but-not-insertion-into-BAW race. Since it was only
ever using the to-be allocated sequence number, any frame retries
with the first frame in the BAW still in the software queue would
have constantly failed, as ni_txseqs[tid] would always be outside
the BAW.
TODO:
* Extract out the mostly common code here in the agg and non-agg ADDBA
case and stuff it into a single function.
PR: kern/166357
I see traffic stalls.
It turns out that the bug isn't because the first and last frame in the
BAW is in the software queue. It is more likely that it's because
the first frame in the BAW is still in the software queue and thus there's
no more room to allocate and do subsequent TX.
PR: kern/166357