The 802.11-2012 spec talks about this - section 10.1.3.2 - Beacon Generation
in Infrastructure Networks. So yes, we should be expecting beacons to be
going out in multiples of intval.
Silly adrian.
So:
* fix the FreeBSD APs that are sending beacons at incorrect TBTTs (target
beacon transmit time); and
* yes indeed we will have to wake up out of network sleep until we sync
a beacon.
ASMedia ASM1062 AHCI chips with some fancy firmware handling PMP inside
seems sometimes forgeting to set bits in PxIS, causing command timeouts.
Removal of this check fixes the issue by the theoretical cost of slightly
higher CPU usage in some odd cases, but this is what Linux does too.
MFC after: 1 month
Note: there was a merge conflict resolved by me.
illumos/illumos-gate@43297f973a43297f973ahttps://www.illumos.org/issues/3821
We recently had nodes with some of the latest zfs bits panic on us in a
rollback-heavy environment. The following is from my preliminary analysis:
Let's look at where we died:
> $C
ffffff01ea6b9a10 taskq_dispatch+0x3a(0, fffffffff7d20450, ffffff5551dea920, 1)
ffffff01ea6b9a60 zil_clean+0xce(ffffff4b7106c080, 7e0f1)
ffffff01ea6b9aa0 dsl_pool_sync_done+0x47(ffffff4313065680, 7e0f1)
ffffff01ea6b9b70 spa_sync+0x55f(ffffff4310c1d040, 7e0f1)
ffffff01ea6b9c20 txg_sync_thread+0x20f(ffffff4313065680)
ffffff01ea6b9c30 thread_start+8()
If we dig in we can find that this dataset corresponds to a zone:
> ffffff4b7106c080::print zilog_t zl_os->os_dsl_dataset->ds_dir->dd_myname
zl_os->os_dsl_dataset->ds_dir->dd_myname = [ "8ffce16a-13c2-4efa-a233-
9e378e89877b" ]
Okay so we have a null taskq pointer. That only happens during the calls to
zil_open and zil_close. If we poke around we can see that we're actually in
midst of a rollback:
> ::pgrep zfs | ::printf "0x%x %s\\n" proc_t . p_user.u_psargs
0xffffff43262800a0 zfs rollback zones/15714eb6-f5ea-469f-ac6d-
4b8ab06213c2@marlin_init
0xffffff54e22a1028 zfs rollback zones/8ffce16a-13c2-4efa-a233-
9e378e89877b@marlin_init
0xffffff4362f3a058 zfs rollback zones/0ddb8e49-ca7e-42e1-8fdc-
4ac4ba8fe9f8@marlin_init
0xffffff5748e8d020 zfs rollback zones/426357b5-832d-4430-953e-
10cd45ff8e9f@marlin_init
0xffffff436b867008 zfs rollback zones/8f36bf37-8a9c-4a44-995c-
6d1b2751e6f5@marlin_init
0xffffff4381ad4090 zfs rollback zones/6c8eca18-fbd6-46dd-ac24-
2ed45cd0da70@marlin_init
Reviewed by: Matthew Ahrens <mahrens@delphix.com>
Reviewed by: Dan Kimmel <dan.kimmel@delphix.com>
Reviewed by: Pavel Zakharov <pavel.zakharov@delphix.com>
Reviewed by: Andriy Gapon <avg@FreeBSD.org>
Approved by: Richard Lowe <richlowe@richlowe.net>
Author: George Wilson <george.wilson@delphix.com>
MFC after: 3 weeks
This was being done in the pre-AR9380 case, but not for AR9380 and later.
When powersave in STA mode is enabled, this may have lead to the transmit
completion code doing this:
* call the task, which doesn't wake up the hardware
* complete the frames, which doesn't touch the hardware
* schedule pending frames on the hardware queue, which DOES touch the
hardware, and this will be ignored
This would show up in the logs like this:
(with debugging enabled):
Nov 27 23:03:56 lovelace kernel: Q1[ 0] (nseg=1) (DS.V:0xfffffe011bd57300 DS.P:0x49b57300) I: 168cc117 L:00000000 F:0005
...
(in general, doesn't require debugging enabled):
Nov 27 23:03:56 lovelace kernel: ath_hal_reg_write: reg=0x00000804, val=0x49b57300, pm=2
That register is a EDMA TX FIFO register (queue 1), and the val is the descriptor
being written.
Whilst here, make sure the software queue gets kicked here.
Tested;
* AR9485, STA mode + powersave
Since hypervisor does not respond CHOPEN to a revoked channel.
MFC after: 1 week
Sponsored by: Microsoft
Differential Revision: https://reviews.freebsd.org/D8636
Just in case that no chimney sending buffer can be used.
MFC after: 1 week
Sponsored by: Microsoft
Differential Revision: https://reviews.freebsd.org/D8619
This bug has been bugging me for quite some time. I finally sat down
with enough coffee to figure it out.
The short of it - rounding up to the next intval multiple of the TSF value
only works if the AP is transmitting all its beacons on an interval of
the TSF. If it isn't - for example, doing staggered beacons on a multi-VAP
setup with a single hardware TSF - then weird things occur.
The long of it -
When powersave is enabled, the MAC and PHY are partially powered off.
They can't receive any packets (or transmit, for that matter.)
The target beacon timer programming will wake up the MAC/PHY just before
the beacon is supposed to be received (well, strictly speaking, at DTIM
so it can see the TIM - traffic information map - telling the STA whether
any traffic is there for it) and it happens automatically.
However, this relies on the target beacon time being programmed correctly.
If it isn't then the hardware will wake up and not hear any beacons -
and then it'll be asleep for said beacons. After enough of this, net80211
will give up and assume the AP went away.
This should fix both TSFOOR interrupts and disconnects from APs with powersave
enabled.
The annoying bit is that it only happens if APs stagger things or start
on a non-zero TSF. So, this would sometimes be fine and sometimes not be
fine.
What:
* I don't know (yet) why the code rounds up to the next intval.
For now, just disable rounding it and trust the value we get.
TODO:
* If we do see a beacon miss in STA mode then we should transition
out of sleep for a while so we can hear beacons to resync against.
I'd love a patch from someone to enable that particular behaviour.
Note - that doesn't require that net80211 brings the chip out of
sleep state - only that we wake the chip up through to full-on and
then let it go to sleep again when we've seen a beacon. The wifi
stack and AP can still completely just stay believing we're in sleep
mode.
Tested:
* AR9485, STA mode, powersave enabled
MFC after: 1 week
Relnotes: Yes
indent(1) treated the "L" in "L'a'" as if it were an identifier and forced
a space character after it, breaking valid code.
PR: 143090
MFC after: 2 weeks
Multi-line comments are always block comments in KNF. Restore properly,
handling the case when a long one-liner gets wrapped and becomes a
multi-line comment.
Obtained from: Piotr Stefaniak
- Set IEEE80211_FEXT_SCAN_OFFLOAD flag; firmware can send null data
frames when associated.
- Check IEEE80211_SCAN_ACTIVE scan flag instead of IEEE80211_F_ASCAN
ic flag; the last is never set since r170530.
- Eliminate software scan (net80211) <-> site_survey (driver) race:
* override ic_scan_curchan and ic_scan_mindwell pointers so net80211
will not try to finish scanning automatically;
* inform net80211 about current status via ieee80211_cancel_scan()
and ieee80211_scan_done();
* remove corresponding workaround from rsu_join_bss().
Now the driver can associate to an AP with hidden SSID.
Tested with Asus USB-N10.
the vnode is inactivated. This contradicts with the nullfs caching
which keeps upper vnode around, as consequence keeping the use
reference to lower vnode.
Add a filesystem flag to request nullfs to not cache when mounted over
that filesystem, and set the flag for nfs v4 mounts.
Reported by: asomers
Reviewed by: rmacklem
Tested by: asomers, rmacklem
Sponsored by: The FreeBSD Foundation
MFC after: 1 week
a new page of radix trie nodes to complete a vm_radix_insert() operation
that was requested by vm_page_cache(). Specifically, vm_page_cache()
already held the free page queue lock when UMA tried to acquire it through
a call to vm_page_alloc(). This code path no longer exists, so there is no
longer any reason to allow recursion on the free page queue mutex.
Improve nearby comments.
Reviewed by: kib, markj
Tested by: pho
Sponsored by: Dell EMC Isilon
Differential Revision: https://reviews.freebsd.org/D8628
- Defined an abstract NVRAM I/O API (bhnd_nvram_io), decoupling NVRAM/SPROM
parsing from the actual underlying NVRAM data provider (e.g. CFE firmware
devices).
- Defined an abstract NVRAM data API (bhnd_nvram_data), decoupling
higher-level NVRAM operations (indexed lookup, data conversion, etc) from
the underlying NVRAM file format parsing/serialization.
- Implemented a new high-level bhnd_nvram_store API, providing indexed
variable lookup, pending write tracking, etc on top of an arbitrary
bhnd_nvram_data instance.
- Migrated all bhnd(4) NVRAM device drivers to the common bhnd_nvram_store
API.
- Implemented a common bhnd_nvram_val API for parsing/encoding NVRAM
variable values, including applying format-specific behavior when
converting to/from the NVRAM string representations.
- Dropped the now unnecessary bhnd_nvram driver, and moved the
broadcom/mips-specific CFE NVRAM driver out into sys/mips/broadcom.
- Implemented a new nvram_map file format:
- Variable definitions are now defined separately from the SPROM
layout. This will also allow us to define CIS tuple NVRAM
mappings referencing the common NVRAM variable definitions.
- Variables can now be defined within arbitrary named groups.
- Textual descriptions and help information can be defined inline
for both variables and variable groups.
- Implemented a new, compact encoding of SPROM image layout
offsets.
- Source-level (but not build system) support for building the NVRAM file
format APIs (bhnd_nvram_io, bhnd_nvram_data, bhnd_nvram_store) as a
userspace library.
The new compact SPROM image layout encoding is loosely modeled on Apple
dyld compressed LINKEDIT symbol binding opcodes; it provides a compact
state-machine encoding of the mapping between NVRAM variables and the SPROM
image offset, mask, and shift instructions necessary to decode or encode
the SPROM variable data.
The compact encoding reduces the size of the generated SPROM layout data
from roughly 60KB to 3KB. The sequential nature SPROM layout opcode tables
also simplify iteration of the SPROM variables, as it's no longer
neccessary to iterate the full NVRAM variable definition table, but
instead simply scan the SPROM revision's layout opcode table.
Approved by: adrian (mentor)
Differential Revision: https://reviews.freebsd.org/D8645
As of r234483, vnode deactivation causes non-VPO_NOSYNC pages to be
laundered. This behaviour has two problems:
1. Dirty VPO_NOSYNC pages must be laundered before the vnode can be
reclaimed, and this work may be unfairly deferred to the vnlru process
or an unrelated application when the system is under vnode pressure.
2. Deactivation of a vnode with dirty VPO_NOSYNC pages requires a scan of
the corresponding VM object's memq for non-VPO_NOSYNC dirty pages; if
the laundry thread needs to launder pages from an unreferenced such
vnode, it will reactivate and deactivate the vnode with each laundering,
potentially resulting in a large number of expensive scans.
Therefore, ensure that all dirty pages are laundered upon deactivation,
i.e., when all maps of the vnode are removed and all references are
released.
Reviewed by: alc, kib
MFC after: 1 month
Differential Revision: https://reviews.freebsd.org/D8641