This patchset implements a new TX lock, covering both the per-VAP (and
thus per-node) TX locking and the serialisation through to the underlying
physical device.
This implements the hard requirement that frames to the underlying physical
device are scheduled to the underlying device in the same order that they
are processed at the VAP layer. This includes adding extra encapsulation
state (such as sequence numbers and CCMP IV numbers.) Any order mismatch
here will result in dropped packets at the receiver.
There are multiple transmit contexts from the upper protocol layers as well
as the "raw" interface via the management and BPF transmit paths.
All of these need to be correctly serialised or bad behaviour will result
under load.
The specifics:
* add a new TX IC lock - it will eventually just be used for serialisation
to the underlying physical device but for now it's used for both the
VAP encapsulation/serialisation and the physical device dispatch.
This lock is specifically non-recursive.
* Methodize the parent transmit, vap transmit and ic_raw_xmit function
pointers; use lock assertions in the parent/vap transmit routines.
* Add a lock assertion in ieee80211_encap() - the TX lock must be held
here to guarantee sensible behaviour.
* Refactor out the packet sending code from ieee80211_start() - now
ieee80211_start() is just a loop over the ifnet queue and it dispatches
each VAP packet send through ieee80211_start_pkt().
Yes, I will likely rename ieee80211_start_pkt() to something that
better reflects its status as a VAP packet transmit path. More on
that later.
* Add locking around the management and BAR TX sending - to ensure that
encapsulation and TX are done hand-in-hand.
* Add locking in the mesh code - again, to ensure that encapsulation
and mesh transmit are done hand-in-hand.
* Add locking around the power save queue and ageq handling, when
dispatching to the parent interface.
* Add locking around the WDS handoff.
* Add a note in the mesh dispatch code that the TX path needs to be
re-thought-out - right now it's doing a direct parent device transmit
rather than going via the vap layer. It may "work", but it's likely
incorrect (as it bypasses any possible per-node power save and
aggregation handling.)
Why not a per-VAP or per-node lock?
Because in order to ensure per-VAP ordering, we'd have to hold the
VAP lock across parent->if_transmit(). There are a few problems
with this:
* There's some state being setup during each driver transmit - specifically,
the encryption encap / CCMP IV setup. That should eventually be dragged
back into the encapsulation phase but for now it lives in the driver TX path.
This should be locked.
* Two drivers (ath, iwn) re-use the node->ni_txseqs array in order to
allocate sequence numbers when doing transmit aggregation. This should
also be locked.
* Drivers may have multiple frames queued already - so when one calls
if_transmit(), it may end up dispatching multiple frames for different
VAPs/nodes, each needing a different lock when handling that particular
end destination.
So to be "correct" locking-wise, we'd end up needing to grab a VAP or
node lock inside the driver TX path when setting up crypto / AMPDU sequence
numbers, and we may already _have_ a TX lock held - mostly for the same
destination vap/node, but sometimes it'll be for others. That could lead
to LORs and thus deadlocks.
So for now, I'm sticking with an IC TX lock. It has the advantage of
papering over the above and it also has the added advantage that I can
assert that it's being held when doing a parent device transmit.
I'll look at splitting the locks out a bit more later on.
General outstanding net80211 TX path issues / TODO:
* Look into separating out the VAP serialisation and the IC handoff.
It's going to be tricky as parent->if_transmit() doesn't give me the
opportunity to split queuing from driver dispatch. See above.
* Work with monthadar to fix up the mesh transmit path so it doesn't go via
the parent interface when retransmitting frames.
* Push the encryption handling back into the driver, if it's at all
architectually sane to do so. I know it's possible - it's what mac80211
in Linux does.
* Make ieee80211_raw_xmit() queue a frame into VAP or parent queue rather
than doing a short-cut direct into the driver. There are QoS issues
here - you do want your management frames to be encapsulated and pushed
onto the stack sooner than the (large, bursty) amount of data frames
that are queued. But there has to be a saner way to do this.
* Fragments are still broken - drivers need to be upgraded to an if_transmit()
implementation and then fragmentation handling needs to be properly fixed.
Tested:
* STA - AR5416, AR9280, Intel 5300 abgn wifi
* Hostap - AR5416, AR9160, AR9280
* Mesh - some testing by monthadar@, more to come.
since the former is defined everywhere. This cuts off some code not
necessary on non strict aligment arches.
Reviewed by: adrian
Sponsored by: Nginx, Inc.
There are some people who use the -HEAD net80211 and wireless drivers
on earlier FreeBSD versions in order to get the updated 802.11n support.
The previous if_clone API changes broke this.
net80211 devices and vaps.
* vnet sets vnet0 during kldload and device probe/attach, but not for
the hotplug event. Thus, plugging in a NIC causes things to panic.
So, add a CURVNET_SET(vnet0) for now during the attach phase, until
the hotplug code is taught to set CURVNET_SET(vnet0).
* there's also no implied detach vnet context - so teach the detach
path about ifp->if_vnet.
* When creating/deleting vaps, also set the vnet context appropriately.
These can be done at any time.
Now, the problems!
* ieee80211.c is supposed to be OS-portable code, with no OS-specific stuff
like vnet. That should be fixed.
* When the device hotplug code gets taught about CURVNET_SET(vnet0), the
device vnet set can go away; but the VAP vnet set still needs to be there.
* .. and there still is the question about potentially adding an implied
CURVNET_SET(ifp->if_vnet) on if_free(), since any/all devices may end up
being detached by a hotplug event in today's world. That's going to be
a topic of a subsequent commit.
now use function calls:
if_clone_simple()
if_clone_advanced()
to initialize a cloner, instead of macros that initialize if_clone
structure.
Discussed with: brooks, bz, 1 year ago
Because the code lacks all the GNU extensions to printf() format stuff,
the compiler doesn't helpfully tell us that I messed up in a previous
commit.
Pointy hat to: adrian, who likely only cares about this because he's the
only one who bothers flipping on net80211 debugging.
I found this useful when trying to debug the AR9160 STA RX filter issue -
I'd get crypto reply errors but it wasn't entirely clear which TID it
was for.
the parent wirless interface. If the user passed in a mac address or it was
autogenerated then flag this to avoid trashing it on update.
This will fix wlan+lagg in a post vap world.
vnet.h, we now use jails (rather than vimages) as the abstraction
for virtualization management, and what remained was specific to
virtual network stacks. Minor cleanups are done in the process,
and comments updated to reflect these changes.
Reviewed by: bz
Approved by: re (vimage blanket)
bridge do a better job.
o move ether_ifdetach to the top of ieee80211_detach
o do not clear if_softc at the top of ieee80211_detach; we no longer need
this because we are safeguarded against calls coming back through if_ioctl
o simplify the bpf tracker now that we don't null if_softc
This also fixes an issue where having a bpf consumer active when a vap
is destroyed would cause a crash because bpf referenced free'd memory.
Reviewed by: imp
o track # bpf taps on monitor mode vaps instead of # monitor mode vaps
o spam monitor mode taps on tx/rx
o fix ieee80211_radiotap_rx_all to dispatch frames only if the vap is up
o while here print radiotap (and superg) state in show com
o replace DLT_IEEE802_11 support in net80211 with DLT_IEEE802_11_RADIO
and remove explicit bpf support from wireless drivers; drivers now
use ieee80211_radiotap_attach to setup shared data structures that
hold the radiotap header for each packet tx/rx
o remove rx timestamp from the rx path; it was used only by the tdma support
for debugging and was mostly useless due to it being 32-bits and mostly
unavailable
o track DLT_IEEE80211_RADIO bpf attachments and maintain per-vap and
per-com state when there are active taps
o track the number of monitor mode vaps
o use bpf tap and monitor mode vap state to decide when to collect radiotap
state and dispatch frames; drivers no longer explicitly directly check
bpf state or use bpf calls to tap frames
o handle radiotap state updates on channel change in net80211; drivers
should not do this (unless they bypass net80211 which is almost always
a mistake)
o update various drivers to be more consistent/correct in handling radiotap
o update ral to include TSF in radiotap'd frames
o add promisc mode callback to wi
Reviewed by: cbzimmer, rpaulo, thompsa
previously always pointing to the default vnet context, to a
dynamically changing thread-local one. The currvnet context
should be set on entry to networking code via CURVNET_SET() macros,
and reverted to previous state via CURVNET_RESTORE(). Recursions
on curvnet are permitted, though strongly discuouraged.
This change should have no functional impact on nooptions VIMAGE
kernel builds, where CURVNET_* macros expand to whitespace.
The curthread->td_vnet (aka curvnet) variable's purpose is to be an
indicator of the vnet context in which the current network-related
operation takes place, in case we cannot deduce the current vnet
context from any other source, such as by looking at mbuf's
m->m_pkthdr.rcvif->if_vnet, sockets's so->so_vnet etc. Moreover, so
far curvnet has turned out to be an invaluable consistency checking
aid: it helps to catch cases when sockets, ifnets or any other
vnet-aware structures may have leaked from one vnet to another.
The exact placement of the CURVNET_SET() / CURVNET_RESTORE() macros
was a result of an empirical iterative process, whith an aim to
reduce recursions on CURVNET_SET() to a minimum, while still reducing
the scope of CURVNET_SET() to networking only operations - the
alternative would be calling CURVNET_SET() on each system call entry.
In general, curvnet has to be set in three typicall cases: when
processing socket-related requests from userspace or from within the
kernel; when processing inbound traffic flowing from device drivers
to upper layers of the networking stack, and when executing
timer-driven networking functions.
This change also introduces a DDB subcommand to show the list of all
vnet instances.
Approved by: julian (mentor)
o call ieee80211_encap in ieee80211_start so frames passed down to drivers
are already encapsulated
o remove ieee80211_encap calls in drivers
o fixup wi so it recreates the 802.3 head it requires from the 802.11
header contents
o move fast-frame aggregation from ath to net80211 (conditional on
IEEE80211_SUPPORT_SUPERG):
- aggregation is now done in ieee80211_start; it is enabled when the
packets/sec exceeds ieee80211_ffppsmin (net.wlan.ffppsmin) and frames
are held on a staging queue according to ieee80211_ffagemax
(net.wlan.ffagemax) to wait for a frame to combine with
- drivers must call back to age/flush the staging queue (ath does this
on tx done, at swba, and on rx according to the state of the tx queues
and/or the contents of the staging queue)
- remove fast-frame-related data structures from ath
- add ieee80211_ff_node_init and ieee80211_ff_node_cleanup to handle
per-node fast-frames state (we reuse 11n tx ampdu state)
o change ieee80211_encap calling convention to include an explicit vap
so frames coming through a WDS vap are recognized w/o setting M_WDS
With these changes any device able to tx/rx 3Kbyte+ frames can use fast-frames.
Reviewed by: thompsa, rpaulo, avatar, imp, sephe
o remove ic_myaddr from ieee80211com
o change ieee80211_ifattach to take the mac address of the physical device
and use that to setup the lladdr.
o replace all references to ic_myaddr in drivers by IF_LLADDR
o related cleanups (e.g. kill dead code)
PR: kern/133178
Reviewed by: thompsa, rpaulo
o add net80211 support for a tdma vap that is built on top of the
existing adhoc-demo support
o add tdma scheduling of frame transmission to the ath driver; it's
conceivable other devices might be capable of this too in which case
they can make use of the 802.11 protocol additions etc.
o add minor bits to user tools that need to know: ifconfig to setup and
configure, new statistics in athstats, and new debug mask bits
While the architecture can support >2 slots in a TDMA BSS the current
design is intended (and tested) for only 2 slots.
Sponsored by: Intel
o add IEEE80211_C_STA capability to indicate sta mode is supported
(was previously assumed) and mark drivers as capable
o add ieee80211_opcap array to map an opmode to the equivalent capability bit
o move IEEE80211_C_OPMODE definition to where capabilities are defined so it's
clear it should be kept in sync (on future additions)
o check device capabilities in clone create before trying to create a vap;
this makes driver checks unneeded
o make error codes return on failed clone request unique
o temporarily add console printfs on clone request failures to aid in
debugging; these will move under DIAGNOSTIC or similar before release
Note this includes changes to all drivers and moves some device firmware
loading to use firmware(9) and a separate module (e.g. ral). Also there
no longer are separate wlan_scan* modules; this functionality is now
bundled into the wlan module.
Supported by: Hobnob and Marvell
Reviewed by: many
Obtained from: Atheros (some bits)
o major overhaul of the way channels are handled: channels are now
fully enumerated and uniquely identify the operating characteristics;
these changes are visible to user applications which require changes
o make scanning support independent of the state machine to enable
background scanning and roaming
o move scanning support into loadable modules based on the operating
mode to enable different policies and reduce the memory footprint
on systems w/ constrained resources
o add background scanning in station mode (no support for adhoc/ibss
mode yet)
o significantly speedup sta mode scanning with a variety of techniques
o add roaming support when background scanning is supported; for now
we use a simple algorithm to trigger a roam: we threshold the rssi
and tx rate, if either drops too low we try to roam to a new ap
o add tx fragmentation support
o add first cut at 802.11n support: this code works with forthcoming
drivers but is incomplete; it's included now to establish a baseline
for other drivers to be developed and for user applications
o adjust max_linkhdr et. al. to reflect 802.11 requirements; this eliminates
prepending mbufs for traffic generated locally
o add support for Atheros protocol extensions; mainly the fast frames
encapsulation (note this can be used with any card that can tx+rx
large frames correctly)
o add sta support for ap's that beacon both WPA1+2 support
o change all data types from bsd-style to posix-style
o propagate noise floor data from drivers to net80211 and on to user apps
o correct various issues in the sta mode state machine related to handling
authentication and association failures
o enable the addition of sta mode power save support for drivers that need
net80211 support (not in this commit)
o remove old WI compatibility ioctls (wicontrol is officially dead)
o change the data structures returned for get sta info and get scan
results so future additions will not break user apps
o fixed tx rate is now maintained internally as an ieee rate and not an
index into the rate set; this needs to be extended to deal with
multi-mode operation
o add extended channel specifications to radiotap to enable 11n sniffing
Drivers:
o ath: add support for bg scanning, tx fragmentation, fast frames,
dynamic turbo (lightly tested), 11n (sniffing only and needs
new hal)
o awi: compile tested only
o ndis: lightly tested
o ipw: lightly tested
o iwi: add support for bg scanning (well tested but may have some
rough edges)
o ral, ural, rum: add suppoort for bg scanning, calibrate rssi data
o wi: lightly tested
This work is based on contributions by Atheros, kmacy, sephe, thompsa,
mlaier, kevlo, and others. Much of the scanning work was supported by
Atheros. The 11n work was supported by Marvell.
the state machine clocks to INIT, node references are not reclaimed.
Add a new routine ieee80211_drain_ifq that does this and use it
instead of IF_DRAIN.
Submitted by: Sepherosa Ziehau
Obtained from: DragonFly
MFC after: 1 month
we intend for the user to be able to unload them later via kldunload(2)
instead of calling linker_load_module() and then directly adjusting the
ref count on the linker file structure. This makes the resulting
consumer code simpler and cleaner and better hides the linker internals
making it possible to sanely lock the linker.
in the 802.11 layer: we send a directed probe request frame to the
current ap bmiss_max times (w/o answer) before scanning for a new ap.
MFC after: 2 weeks
Having an additional MT_HEADER mbuf type is superfluous and redundant
as nothing depends on it. It only adds a layer of confusion. The
distinction between header mbuf's and data mbuf's is solely done
through the m->m_flags M_PKTHDR flag.
Non-native code is not changed in this commit. For compatibility
MT_HEADER is mapped to MT_DATA.
Sponsored by: TCP/IP Optimization Fundraise 2005
Crypto changes:
o change driver/net80211 key_alloc api to return tx+rx key indices; a
driver can leave the rx key index set to IEEE80211_KEYIX_NONE or set
it to be the same as the tx key index (the former disables use of
the key index in building the keyix->node mapping table and is the
default setup for naive drivers by null_key_alloc)
o add cs_max_keyid to crypto state to specify the max h/w key index a
driver will return; this is used to allocate the key index mapping
table and to bounds check table loookups
o while here introduce ieee80211_keyix (finally) for the type of a h/w
key index
o change crypto notifiers for rx failures to pass the rx key index up
as appropriate (michael failure, replay, etc.)
Node table changes:
o optionally allocate a h/w key index to node mapping table for the
station table using the max key index setting supplied by drivers
(note the scan table does not get a map)
o defer node table allocation to lateattach so the driver has a chance
to set the max key id to size the key index map
o while here also defer the aid bitmap allocation
o add new ieee80211_find_rxnode_withkey api to find a sta/node entry
on frame receive with an optional h/w key index to use in checking
mapping table; also updates the map if it does a hash lookup and the
found node has a rx key index set in the unicast key; note this work
is separated from the old ieee80211_find_rxnode call so drivers do
not need to be aware of the new mechanism
o move some node table manipulation under the node table lock to close
a race on node delete
o add ieee80211_node_delucastkey to do the dirty work of deleting
unicast key state for a node (deletes any key and handles key map
references)
Ath driver:
o nuke private sc_keyixmap mechansim in favor of net80211 support
o update key alloc api
These changes close several race conditions for the ath driver operating
in ap mode. Other drivers should see no change. Station mode operation
for ath no longer uses the key index map but performance tests show no
noticeable change and this will be fixed when the scan table is eliminated
with the new scanning support.
Tested by: Michal Mertl, avatar, others
Reviewed by: avatar, others
MFC after: 2 weeks