24146 Commits

Author SHA1 Message Date
adrian
9593f94786 Fix the compile to work when IEEE80211_DEBUG isn't defined.
Sponsored by:	Hobnob, Inc.
2011-11-09 04:08:01 +00:00
adrian
778aebbf9d Fix the KTR option to compile by default - it was referencing
some unmerged interrupt status debugging code from my branch.

* Add ah_intrstate[8] which will have the record of the last
  call to ath_hal_getintr().
* Wrap the KTR code behind ATH_KTR_INTR_DEBUG.
* Add the HAL interrupt debugging behind AH_INTERRUPT_DEBUGGING.

This is only done for the AR5416 and later NICs but it will be
trivial to add to the earlier NICs if required.

Neither are enabled by default, although to minimise HAL binary
API differences, the ah_intrstate[] array is always compiled into
the ath_hal struct.
2011-11-08 22:50:28 +00:00
adrian
b9db55d70c Introduce TX aggregation and software TX queue management
for Atheros AR5416 and later wireless devices.

This is a very large commit - the complete history can be
found in the user/adrian/if_ath_tx branch.

Legacy (ie, pre-AR5416) devices also use the per-software
TXQ support and (in theory) can support non-aggregation
ADDBA sessions. However, the net80211 stack doesn't currently
support this.

In summary:

TX path:

* queued frames normally go onto a per-TID, per-node queue
* some special frames (eg ADDBA control frames) are thrown
  directly onto the relevant hardware queue so they can
  go out before any software queued frames are queued.
* Add methods to create, suspend, resume and tear down an
  aggregation session.
* Add in software retransmission of both normal and aggregate
  frames.
* Add in completion handling of aggregate frames, including
  parsing the block ack bitmap provided by the hardware.
* Write an aggregation function which can assemble frames into
  an aggregate based on the selected rate control and channel
  configuration.
* The per-TID queues are locked based on their target hardware
  TX queue. This matches what ath9k/atheros does, and thus
  simplified porting over some of the aggregation logic.
* When doing TX aggregation, stick the sequence number allocation
  in the TX path rather than net80211 TX path, and protect it
  by the TXQ lock.

Rate control:

* Delay rate control selection until the frame is about to
  be queued to the hardware, so retried frames can have their
  rate control choices changed. Frames with a static rate
  control selection have that applied before each TX, just
  to simplify the TX path (ie, not have "static" and "dynamic"
  rate control special cased.)
* Teach ath_rate_sample about aggregates - both completion and
  errors.
* Add an EWMA for tracking what the current "good" MCS rate is
  based on failure rates.

Misc:

* Introduce a bunch of dirty hacks and workarounds so TID mapping
  and net80211 frame inspection can be kept out of the net80211
  layer. Because of the way this code works (and it's from Atheros
  and Linux ath9k), there is a consistent, 1:1 mapping between
  TID and AC. So we need to ensure that frames going to a specific
  TID will _always_ end up on the right AC, and vice versa, or the
  completion/locking will simply get very confused. I plan on
  addressing this mess in the future.

Known issues:

* There is no BAR frame transmission just yet. A whole lot of
  tidying up needs to occur before BAR frame TX can occur in the
  "correct" place - ie, once the TID TX queue has been drained.

* Interface reset/purge/etc results in frames in the TX and RX
  queues being removed. This creates holes in the sequence numbers
  being assigned and the TX/RX AMPDU code (on either side) just
  hangs.

* There's no filtered frame support at the present moment, so
  stations going into power saving mode will simply have a number
  of frames dropped - likely resulting in a traffic "hang".

* Raw frame TX is going to just not function with 11n aggregation.
  Likely this needs to be modified to always override the sequence
  number if the frame is going into an aggregation session.
  However, general raw frame injection currently doesn't work in
  general in net80211, so let's just ignore this for now until
  this is sorted out.

* HT protection is just not implemented and won't be until the above
  is sorted out. In addition, the AR5416 has issues RTS protecting
  large aggregates (anything >8k), so the work around needs to be
  ported and tested. Thus, this will be put on hold until the above
  work is complete.

* The rate control module 'sample' is the only currently supported
  module; onoe/amrr haven't been tested and have likely bit rotted
  a little. I'll follow up with some commits to make them work again
  for non-11n rates, but they won't be updated to handle 11n and
  aggregation. If someone wishes to do so then they're welcome to
  send along patches.

* .. and "sample" doesn't really do a good job of 11n TX. Specifically,
  the metrics used (packet TX time and failure/success rates) isn't as
  useful for 11n. It's likely that it should be extended to take into
  account the aggregate throughput possible and then choose a rate
  which maximises that. Ie, it may be acceptable for a higher MCS rate
  with a higher failure to be used if it gives a more acceptable
  throughput/latency then a lower MCS rate @ a lower error rate.
  Again, patches will be gratefully accepted.

Because of this, ATH_ENABLE_11N is still not enabled by default.

Sponsored by:	Hobnob, Inc.
Obtained from:	Linux, Atheros
2011-11-08 22:43:13 +00:00
adrian
d1f6843e2d Add support to the TX descriptor printing code to follow ath_buf
chains. This allows for debugging of aggregate frames.

Sponsored by:	Hobnob, Inc.
2011-11-08 22:01:03 +00:00
adrian
d69e09b56f Make sure TXEOL is set on default queues. Otherwise we don't get an
interrupt on the completion of a TX queue and this can cause TX
hangs / timeout.

Sponsored by:	Hobnob, Inc.
2011-11-08 21:55:40 +00:00
adrian
65d79f5f4c Refactor out the TX buffer management and completion code in
preparation for TX aggregation.

* Add in logic which calls ath_buf bf->bf_comp if it's set.
  This allows for AMPDU (and RIFS, and FF, if someone desires) code
  to handle completion - which includes freeing subframes, retransmitting
  subframes, etc.

* Break out the buffer free, buffer busy/unbusy default completion handler
  code into separate functions. This allows bf_comp methods to free and
  unbusy each subframe ath_buf as required.

* Break out the statistics update code into a separate function, just
  to clean up the TX completion path a little.

Sponsored by:	Hobnob, Inc.
2011-11-08 21:49:33 +00:00
adrian
49cbf8b5e4 Change the descriptor logic to use bf_lastds to point to the last
descriptor, rather than using the maths involving bf_desc[bf_nseg - 1].

When doing TX aggregation, the status will be updated in the -final-
descriptor of the -final- subframe in an aggregate. Thus bf_lastds
may point to the last descriptor in a completely different ath_buf.

Sponsored by:	Hobnob, Inc.
2011-11-08 21:25:36 +00:00
adrian
7bcb6d191c Change ath_buf allocation to:
* Immediately return NULL if a buffer isn't available;
* Track the "buffers not available" count;
* Clear some fields used for tx aggregation;
* Add ath_buf_clone() which clones the majority of buffer state.
  This is needed when retransmission of a "busy" buffer is required.

Sponsored by:	Hobnob, Inc.
2011-11-08 21:13:05 +00:00
adrian
8c46d06cce Break out the TX DMA stop code into a separate function.
Sponsored by:	Hobnob, Inc.
2011-11-08 21:06:36 +00:00
adrian
227a997065 Add a 'vap' to ath_keyset().
Add some code (which is currently disabled) which modifies the group
multicast key cache behaviour. I haven't yet figured out what the
exact/correct behaviour is so I'm leaving it disabled. It's worth
investigating and "correcting", especially for future work with
mesh/ibss and encryption.

Sponsored by:	Hobnob, Inc.
2011-11-08 19:25:52 +00:00
adrian
7464bb4495 Some more various fixes, etc from my 11n branch.
* When doing software TX queue handling and flush, it's possible
  that the deletion of a VAP (eg a STA shutdown) will queue a
  "STA Disassociate" frame whilst the interface is being deleted.
  The VAP is then deleted, and the frame ends up being queued
  to a node that is freed before it can be TX'ed. Things go awry
  at this point.

  There's no way at the present to avoid freeing the underlying node
  when the vap is being deleted. It's too late in the game.

  I suspect the real fix is to make sure the frame is software
  queued with no completion information somehow, so it doesn't
  link back to a node whose underlying VAP has been freed.
  For now, we'll just have to do this.

* Add some comments showing what's going on.

* Move an instance of the ATH_LOCK() around to protect the interrupt
  set. I'll worry about changing that to a PCU lock later on once
  the 11n code is in the tree.

Sponsored by:	Hobnob, Inc.
2011-11-08 19:18:34 +00:00
adrian
ebf7f1ae73 Add KTR tracepoints to the ath driver, in order to debug TX, RX
and interrupt handling.

Sponsored by:	Hobnob, Inc.
2011-11-08 19:02:59 +00:00
adrian
913f44bdc8 In preparation for supporting 11n TX/RX properly, allow for TX queue draining
and interface resets to be marked as ATH_RESET_DEFAULT, ATH_RESET_FULL,
ATH_RESET_NOLOSS.

Currently a reset is still a reset - ie, all tx/rx frames in the hardware
queues are purged. This means that those frames will be lost to the 11n TX
and RX aggregation state tracking, breaking AMPDU sessions.

The (eventual) new semantics:

* ATH_RESET_DEFAULT:
      full reset, this is the default for reset situations
      which I haven't yet figured out what they should be.
* ATH_RESET_FULL:
      A full reset - for things such as channel changes.
* ATH_RESET_NOLOSS:
      Don't flush TX/RX queues - handle pending RX frames and leave TX
      frames where they are; restart TX DMA from where it was.
2011-11-08 18:56:52 +00:00
adrian
5b3fc4fe6e Break out the node cleanup and node free path, in preparation for
doing software TX queue management.

The software queued TX frames will be freed by the new cleanup
function.

Sponsored by:	Hobnob, Inc.
2011-11-08 18:48:26 +00:00
adrian
3f66d2db4f Preparation for correct 802.11n tx/rx handling.
* Change ath_rx_proc() to ath_rx_tasklet(); make that the taskqueue function.
  This way (eventually) ath_rx_proc() can be called from elsewhere in the
  packet reset/processing queue so frames aren't just "flushed" during
  interface resets/reconfigure. This breaks 802.11n RX aggregation tracking.
* Extend ath_tx_proc() to take a 'resched' flag, which marks whether to
  reschedule further RX PCU reads or not.
* Change ath_tx_processq() to take a "dosched" flag, which will eventually
  be used to indicate whether to reschedule the software TX scheduler.

Sponsored by:	Hobnob, Inc.
2011-11-08 18:45:15 +00:00
adrian
22b39990b1 Conditionally compile the PCI latency workaround; I think it's
only required for some earlier NICs.
2011-11-08 18:37:52 +00:00
yongari
ebf84a8f03 Retire 'options TI_PRIVATE_JUMBOS' and replace local jumbo
allocator with UMA backed jumbo allocator by default. Previously
ti(4) used sf_buf(9) interface for jumbo buffers but it was broken
at this moment such that enabling jumbo frame caused instant panic.
Due to the nature of sf_buf(9) it heavily relies on VM changes but
it seems ti(4) was not received much blessing from VM gurus.  I
don't understand VM magic and implications used in driver either.
Switching to UMA backed jumbo allocator like other network drivers
will make jumbo frame work on ti(4).
While I'm here, fully allocate all RX buffers. This means ti(4) now
uses 512 RX buffer and 1024 mini RX buffers.

To use sf_buf(9) interface for jumbo buffers, introduce a new
'options TI_SF_BUF_JUMBO'. If it is proven that sf_buf(9) is better
for jumbo buffers, interesting developers can fix the issue in
future.

ti(4) still needs more bus_dma(9) cleanups and should use separate
DMA tag/map for each ring(standard, jumbo, mini, command, event
etc) but it should work on all platforms except PAE.

Special thanks to Jay[1] who provided complete remote debugging
access.

Tested by:	Jay Borkenhagen <jayb <> braeburn dot org > [1]
2011-11-08 18:23:02 +00:00
adrian
6ecc40aefc Merge in some fixes from the if_ath_tx branch.
* Close down some of the kickpcu races, where the interrupt handler
  can and will run concurrently with the taskqueue.
* Close down the TXQ active/completed race between the interrupt
  handler and the concurrently running tx completion taskqueue
  function.
* Add some tx and rx interrupt count tracking, for debugging.
* Fix the kickpcu logic in ath_rx_proc() to not simply drain and
  restart the TX queue - instead, assume the hardware isn't
  (too) confused and just restart RX DMA. This may break on
  previous chipsets, so if it does I'll add a HAL flag and
  conditionally handle this (ie, for broken chipsets, I'll
  just restore the "stop PCU / flush things / restart PCU"
  logic.)
* Misc stuff

Sponsored by:	Hobnob, Inc.
2011-11-08 18:10:04 +00:00
adrian
324216748e Migrate the STAILQ lists to TAILQs.
A bunch of the 11n TX aggregation logic wants to traverse lists of buffers
in various ways. In order to provide O(1) behaviour in this instance,
use TAILQs.

This does blow out the memory footprint and CPU cycles slightly for some
of these operations. I may convert some of these back to STAILQs once
the rest of the software transmit queue handling has been stabilised.

Sponsored by:	Hobnob, Inc.
2011-11-08 17:08:12 +00:00
adrian
8f9fd74063 Some cosmetic fixes to ath_rate_sample.
* Use 64 bit integer types for the sample rate statistics.
  When TX'ing 11n aggregates, a 32 bit counter will overflow in a few
  hours due to the high packet throughput.

* Create a default label of "" rather than defaulting to "Mb" - that way
  if a rate hasn't yet been selected, it won't say "-1 Mb".

Sponsored by:	Hobnob, Inc.
2011-11-08 14:46:03 +00:00
adrian
476761abc0 Begin merging in some of my 802.11n TX aggregation driver changes.
* Add a PCU lock, which isn't currently used but will eventually be
  used to serialise some of the driver access.

* Add in all the software TX aggregation state, that's kept per-node
  and per-TID.

* Add in the software and aggregation state to ath_buf.

* Add in hooks to ath_softc for aggregation state and the (upcoming)
  aggregation TX state calls.

* Add / fix the HAL access macros.

Obtained from:	Linux, ath9k
Sponsored by:	Hobnob, Inc.
2011-11-08 02:12:11 +00:00
adrian
d5ee0678cc Bring over the 11n aggregation statistics struct from the if_ath_tx branch. 2011-11-08 01:35:44 +00:00
adrian
2b3224e3c9 Merge in ath rate flags and ath_rc_series from my 11n TX branch.
This is in preparation for 802.11n TX aggregation support.
2011-11-08 01:35:05 +00:00
yongari
89ad5455c3 Do not allow changing MTU to be less than the minimum. 2011-11-07 22:58:49 +00:00
yongari
711c65d39d If ti_chipinit() fails in ti_stop(), ignore the error and release
all allocated TX/RX buffer resources. If the interface is brought
to up again after the error, we will leak allocated TX/RX buffers.
2011-11-07 22:53:06 +00:00
yongari
990f68fb09 Show RX buffer allocation failure and do not blindly send alive
message to firmware. Probably the correct way for this error is to
send a TI_CMD_CODE_STACK_DOWN message to firmware and let firmware
handle the rest.
2011-11-07 22:47:25 +00:00
yongari
17a190b301 Mini ring is not available on Tigon 1 so do not create DMA maps for
mini ring on Tigon 1 to save resources.
2011-11-07 22:17:44 +00:00
yongari
ab2f26b102 Track which ring was updated in RX handler and update only modified
ring. This should reduce unnecessary register accesses.
2011-11-07 22:11:38 +00:00
yongari
f2e78d375a o Remove unnecessary controller reinitialization.
o Do not blindly UP controller when MTU is changed. Reinitialize
   controller only if driver is running.
 o Remove useless ti_stop() in ti_watchdog() since ti_init_locked()
   always invokes ti_stop().
2011-11-07 18:26:38 +00:00
yongari
1439814d21 Remove ti_unit member variable in softc.
While I'm here use PCIR_BAR macro.
2011-11-07 18:19:20 +00:00
ed
0c56cf839d Mark all SYSCTL_NODEs static that have no corresponding SYSCTL_DECLs.
The SYSCTL_NODE macro defines a list that stores all child-elements of
that node. If there's no SYSCTL_DECL macro anywhere else, there's no
reason why it shouldn't be static.
2011-11-07 15:43:11 +00:00
ed
e97eae1577 Mark MALLOC_DEFINEs static that have no corresponding MALLOC_DECLAREs.
This means that their use is restricted to a single C file.
2011-11-07 06:44:47 +00:00
hselasky
37aef6a488 Fix build breakage for 8-stable and older.
MFC after:	0 days
2011-11-06 23:26:26 +00:00
hselasky
a361ae513d Add support for new USB modem protocol.
PR:		usb/159919
MFC after:	1 week
2011-11-06 23:11:30 +00:00
marius
80a7f0f380 Freeze the device queue if a request didn't complete without error and
isn't already frozen.
2011-11-06 22:28:13 +00:00
ed
637fddd999 Remove MALLOC_DECLAREs of nonexisting malloc-pools.
After careful grepping, it seems none of these pools can be found in our
source tree. They are not in use, nor are they defined.
2011-11-06 20:16:50 +00:00
mjacob
3230b890d9 Implement the sysctl's for fibre channel that are listed in the man page.
MFC after:	3 days
2011-11-06 00:44:40 +00:00
hselasky
e58889997a Implement support for modem control lines.
Don't short terminate transmitted BULK data.
Assume that the chip reads one USB packet at a time.

PR:		usb/162307
MFC after:	3 days
2011-11-05 12:01:51 +00:00
yongari
8fee4feec4 Implement altq(4) support.
While I'm here fix a logic error in r227098 where it didn't
re-enable interrupts when TX queue is empty.
2011-11-04 23:34:54 +00:00
yongari
3d013de1b9 Because ti(4) drops a driver lock in RX handler, check whether
driver is still running before re-enabling interrupts.
2011-11-04 23:09:57 +00:00
yongari
3826036d91 Don't abuse if_hwassist and make sure enabling corresponding TX/RX
checksum offloading and VLAN hardware tag insertion/stripping from
the currently enabled hardware offloading capabilities.
Previously if_hwassist, which was initialized to TX/RX checksum
offloading, was blindly used to enable both TX and RX checksum
offloading such that disabling either TX or RX checksum offloading
was not possible.

ti(4) controllers support TX/RX checksum offloading with VLAN
tagging so announce TX/RX checksum offloading capability over VLAN
to vlan(4).

Make VLAN hardware tag insertion/stripping honors currently enabled
interface capability instead of blindly enabling VLAN hardware
tagging. This change allows disabling hardware support of VLAN tag.

Because ti(4) supports VLAN oversized frames, make network stack
know the capability by setting if_hdrlen.

While I'm here, rewrite SIOCSIFCAP handler and make sure to
reinitialize controller whenever TX/RX checksum offloading and VLAN
hardware tagging option is changed.  The requirement of controller
reinitialization comes from the limitation of Tigon I/II firmware.
Tigon I/II firmware requires all related RCBs should be
reinitialized whenever any of its hardware offloading capabilities
change.

vlan(4) is also notified whenever the parent interface's capability
changes such that it can correctly handle TX/RX checksum offloading
based on parent interface's enabled offloading capabilities.

RX checksum offloading handler was changed to make upper stack use
controller computed partial checksum value.  Previously, ti(4) just
set the computed value for any frames(IPv4, IPv6) and the value was
not used in upper stack because driver didn't set CSUM_DATA_VALID
such that upper network stack had to recompute checksum of TCP/UDP
packets. I have no idea how this was not noticed for a long time.
With this change, upper network stack does not have to fully
recompute the checksum such that calculating pseudo checksum based
on partial checksum is sufficient to know whether received packet's
checksum is correct or not. However, I don't know why ti(4) does
not have controller compute pseudo checksum as controller has
ability to do it. I'm just guessing enabling that feature could
trigger a firmware bug or could be slower than computing it on host
side so just leave it as it was.

In order not to produce false positives, ti(4) now checks whether
controller actually computed IP or TCP/UDP checksum by checking
ti_flags field.
2011-11-04 22:53:52 +00:00
yongari
70d11ac70e Don't clear upper 4bits from VLAN tag information. It's
responsibility of vlan(4) to extract VLAN id from the tag
information and vlan(4) correctly handles it.
2011-11-04 21:42:13 +00:00
yongari
1aaf09e416 Introduce ti_ifmedia_upd_locked() to use in driver initialization
and add missing driver lock for both ti_ifmedia_upd() and
ti_ifmedia_sts().
2011-11-04 21:30:46 +00:00
yongari
302be77fb3 Announce IFCAP_LINKSTATE capability and let network stack know link
state changes.  Hide superfluous link up/down message under
bootverbose since if_link_state_change(9) shows that information.
While I'm here, change baudrate with the resolved speed of the
established link instead of blindly setting it 1G. Unfortunately,
it seems there is no way to differentiate 10/100Mbps from
non-gigabit link so just assume we established a 100Mbps link if
current link is not a gigabit link.
2011-11-04 20:43:37 +00:00
yongari
f6e77a8a74 Make sure to unload loaded DMA area(descriptor, command, event ring). 2011-11-04 20:25:30 +00:00
yongari
9528202f6c s/u_intXX_t/uintXX_t/g 2011-11-04 19:12:07 +00:00
yongari
3766776654 Make ti(4) build with 'options TI_PRIVATE_JUMBOS'.
This was broken in r175872.

We have a UMA backed jumbo allocator and that is much better
implementation than having a local jumbo buffer allocator in
driver. This local allocator would be removed in near future but
fixing build before removal wouldn't be a bad idea.
2011-11-04 18:39:39 +00:00
yongari
e803959917 style.
No functional changes.
2011-11-04 18:28:10 +00:00
yongari
cf29881bd6 Use ANSI function definations. 2011-11-04 17:07:53 +00:00
adrian
89ec155d24 Call the correct chipset power routine when disabling the AR5416 and later NICs. 2011-11-04 13:32:13 +00:00