check_deferred_signal() returns twice, since handle_signal() emulates
the return from the normal signal handler by sigreturn(2)ing the
passed context. Second return is performed on the destroyed stack
frame, because __fillcontextx() has already returned. This causes
undefined and bad behaviour, usually the victim thread gets SIGSEGV.
Avoid nested frame and the need to return from it by doing direct call
to getcontext() in the check_deferred_signal() and using a new private
libc helper __fillcontextx2() to complement the context with the
extended CPU state if the deferred signal is still present.
The __fillcontextx() is now unused, but is kept to allow older
libthr.so to be used with the new libc.
Mark __fillcontextx() as returning twice [1].
Reported by: pgj
Pointy hat to: kib
Discussed with: dim
Tested by: pgj, dim
Suggested by: jilles [1]
MFC after: 1 week
value on purpose, but the ia32 context handling code is logically more
correct to use the _MC_IA32_HASFPXSTATE name for the flag.
Tested by: dim, pgj
Sponsored by: The FreeBSD Foundation
MFC after: 1 week
usermode context state is not changed by the get operation, and
get_mcontext() does not require full iret as well.
Tested by: dim, pgj
Sponsored by: The FreeBSD Foundation
MFC after: 1 week
context on return from the trap handler, re-enable the interrupts on
i386 and amd64. The trap return path have to disable interrupts since
the sequence of loading the machine state is not atomic. The trap()
function which transfers the control to the special handler would
enable the interrupt, but an iret loads the previous eflags with PSL_I
clear. Then, the special handler calls trap() on its own, which now
sees the original eflags with PSL_I set and does not enable
interrupts.
The end result is that signal delivery and process exiting code could
be executed with interrupts disabled, which is generally wrong and
triggers several assertions.
For amd64, the interrupts are enabled conditionally based on PSL_I in
the eflags of the outer frame, as it is already done for
doreti_iret_fault. For i386, the interrupts are enabled
unconditionally, the ast loop could have opened a window with
interrupts enabled just before the iret anyway.
Reported and tested by: pho
Sponsored by: The FreeBSD Foundation
MFC after: 1 week
controller hardware most likely present on UHCI chipsets aswell. The
bug manifests itself when issuing isochronous transfers and bulk
transfers towards the same device simultaneously. From time to time it
happens that either the completion IRQ was missing or that the
completion IRQ was happening before the ITD/SITD was completely
written back to memory. The workaround assumes that double buffered
isochronous transfers are used, and that a second interrupt is
generated at the beginning of the next isochronous transfer to
complete the previous one. Possibly skipping the interrupt at the last
isochronous frame is possible, but will then break single buffered
isochronous transfers. For now we can live with some extra interrupts.
MFC after: 1 week
and if queue mechanism; also fix up (non-11n) TX fragment handling.
This may result in a bit of a performance drop for now but I plan on
debugging and resolving this at a later stage.
Whilst here, fix the transmit path so fragment transmission works.
The TX fragmentation handling is a bit more special. In order to
correctly transmit TX fragments, there's a bunch of corner cases that
need to be handled:
* They must be transmitted back to back, in the same order..
* .. ie, you need to hold the TX lock whilst transmitting this
set of fragments rather than interleaving it with other MSDUs
destined to other nodes;
* The length of the next fragment is required when transmitting, in
order to correctly set the NAV field in the current frame to the
length of the next frame; which requires ..
* .. that we know the transmit duration of the next frame, which ..
* .. requires us to set the rate of all fragments to the same length,
or make the decision up-front, etc.
To facilitate this, I've added a new ath_buf field to describe the
length of the next fragment. This avoids having to keep the mbuf
chain together. This used to work before my 11n TX path work because
the ath_tx_start() routine would be handed a single mbuf with m_nextpkt
pointing to the next frame, and that would be maintained all the way
up to when the duration calculation was done. This doesn't hold
true any longer - the actual queuing may occur at any point in the
future (think ath_node TID software queuing) so this information
needs to be maintained.
Right now this does work for non-11n frames but it doesn't at all
enforce the same rate control decision for all frames in the fragment.
I plan on fixing this in a followup commit.
RTS/CTS has the same issue, I'll look at fixing this in a subsequent
commit.
Finaly, 11n fragment support requires the driver to have fully
decided what the rate scenario setup is - including 20/40MHz,
short/long GI, STBC, LDPC, number of streams, etc. Right now that
decision is (currently) made _after_ the NAV field value is updated.
I'll fix all of this in subsequent commits.
Tested:
* AR5416, STA, transmitting 11abg fragments
* AR5416, STA, 11n fragments work but the NAV field is incorrect for
the reasons above.
TODO:
* It would be nice to be able to queue mbufs per-node and per-TID so
we can only queue ath_buf entries when it's time to assemble frames
to send to the hardware.
But honestly, we should just do that level of software queue management
in net80211 rather than ath(4), so I'm going to leave this alone for now.
* More thorough AP, mesh and adhoc testing.
* Ensure that net80211 doesn't hand us fragmented frames when A-MPDU has
been negotiated, as we can't do software retransmission of fragments.
* .. set CLRDMASK when transmitting fragments, just to ensure.
__amd64__, and thus limited. Eliminate 2 trivial conditionals by
casting the 64-bit integral, holding an address, via (uintptr_t)
to (void *) and replace the last remaining check for __amd64__
with a check for __LP64__ instead.
It turns out that in C++11, char16_t and char32_t are built-in types;
language keywords. Just fix this by putting traditional _*_T_DECLARED
blocks around the definitions. We'll just predefine these in
<sys/_types.h>.
This also opens up the possibility to define char16_t in other header
files, if ever needed (e.g. if we would gain a <ctype.h> for
char16_t/char32_t).
When creating fragment frames, the header length should honour the
DATAPAD flag.
This fixes the fragments that are queued to the ath(4) driver but it
doesn't yet fix fragment transmission. That requires further changes
to the ath(4) transmit path. Well, strictly speaking, it requires
further changes to _all_ wifi driver transmit paths, but this is at least
a start.
Tested:
* AR5416, STA mode, w/ fragthreshold set to 256.
This prevents users from selecting a delete method which may cause
corruption e.g. MPS WS16 on pre P14 firmware.
Reviewed by: pjd (mentor)
Approved by: pjd (mentor)
MFC after: 2 days
Assign the rv variable a success code if the pager was not asked for
the page. Using an error code from the previous processed page caused
zeroing of the valid page, when e.g. the previous page was not
available in the pager.
Reported by: lstewart
Sponsored by: The FreeBSD Foundation
MFC after: 1 week
USDT probes exits. This was previously done with a callout; however, it is
possible to sleep while holding the DTrace mutexes, so a panic will occur
on INVARIANTS kernels if the callout handler can't immediately acquire one
of these mutexes. This panic will be frequently triggered on systems where
a USDT-enabled program (perl, for instance) is often run.
This revision changes the fasttrap cleanup mechanism so that a dedicated
thread is used instead of a callout. The old behaviour is otherwise
preserved.
Reviewed by: rpaulo
MFC after: 1 month
be a GOOD IDEA (TM).
Apparently MOST users set this (e.g. tcp and friends) but there are a few
users that just assume that it is a sensible value but then go on to read it.
These include SCTP, pf and the FLOWTABLE option (and maybe others).
avoid a dangling pointer and eventual panic on system shutdown.
Reported by: Ali <comnetboy at gmail.com>
Tested by: Ali <comnetboy at gmail.com>
MFC after: 1 week
ownership to the FreeBSD foundation for the years this file has been in
the FreeBSD repository.
This file was originally created by Juniper as part of upgrading to FreeBSD
4.10 (which had no MIPS support) and held functions found on other machines
It grew actual functionality over time. The functionaliy was copied from
other architectures and ported to MIPS on a as-needed basis.
Approved by: Mark Baushke (Juniper IP)
Approved by: Megan Sugiyama (Juniper legal)
Pointed out by: jmallett@
Requested by: core (jhb@)
pmap_enter_locked() implementation was very ambiguous and confusing.
Rearrange it so that each part of the mapping creation is separated.
Avoid walking through the redundant conditions.
Extract vector_page specific PTE setup from normal PTE setting.
Submitted by: Zbigniew Bodek <zbb@semihalf.com>
Sponsored by: The FreeBSD Foundation, Semihalf
Using PVF_MOD, PVF_REF and PVF_EXEC is redundant as we can get the proper
info from PTE bits.
When the mapping is marked as executable and has been referenced we assume
that it has been executed. Similarly, when the mapping is set to be writable
and is referenced, it must have been due to write access to it.
PVF_MOD and PVF_REF flags are kept just for pmap_clearbit() usage,
to pass the information on which bit should be cleared.
Submitted by: Zbigniew Bodek <zbb@semihalf.com>
Sponsored by: The FreeBSD Foundation, Semihalf
Use pmap_find_pv if needed instead of multiplying its code throughout
pmap-v6.
Avoid possible NULL pointer dereference in pmap_enter_locked()
When trying to get m->md.pv_memattr, make sure that m != NULL,
in particular that vector_page is set to be NULL.
Do not set PGA_REFERENCED flag in pmap_enter_pv().
On ARM any new page reference will result in either entering the new
mapping by calling pmap_enter, etc. or fixing-up the existing mapping in
pmap_fault_fixup().
Therefore we set PGA_REFERENCED flag in the earlier mentioned cases and
setting it later in pmap_enter_pv() is just waste of cycles.
Delete unused pm_pdir pointer from the pmap structure.
Rearrange brackets in the fault cause detection in trap.c
Place the brackets correctly in order to see course of the conditions
instantaneously.
Unify naming in pmap-v6.c and improve style
Use naming common for whole pmap and compatible with other pmaps,
improve style where possible:
pm -> pmap
pg -> m
opg -> om
*pt -> *ptep
*pte -> *ptep
*pde -> *pdep
Submitted by: Zbigniew Bodek <zbb@semihalf.com>
Sponsored by: The FreeBSD Foundation, Semihalf
bit in PTE.
Enable Access Flag in CPU control. With AF enabled each valid mapping
needs to have referenced bit in PTE set in order to be able to cache
it in the TLB.
AP[0] bit is to be used as reference flag.
All access permissions are encoded by AP[2:1] wherein AP[1] is in fact
"user enable" and AP[2](APX) is "write disable".
All mappings are always set to be valid. Reference emulation is performed
by setting/clearing reference flag in PTE.
md.pvh_attrs are no longer necessary however pv_flags are still being used
for now.
Marking vm_page as "dirty" or "referenced" is being performed on:
- page or flag fault servicing in pmap_fault_fixup(), basing on the fault
type
- vm_fault servicing in pmap_enter() according to the desired protections
and faulty access type
Redundant page marking has been removed as on ARM we know exactly when the
particular page is referenced or is going to be written.
Submitted by: Zbigniew Bodek <zbb@semihalf.com>
Sponsored by: The FreeBSD Foundation, Semihalf
xen/xenbus/xenbusb.c:
In xenbusb_probe_children(), do not modify the XenBus state of
devices for which we have no PV driver support. An emulated device
we do support may share this backend. Hide the node from XenBus
instead.
This prevents closing the vkbd device, which Qemu's emulated keyboard
device is using as the source for keyboard events.
Tested with qemu-xen-traditional, qemu-xen and qemu stubdomains, all
working as expected.
Submitted by: Roger Pau Monne <roger.pau@citrix.com>
Reviewed by: gibbs
MFC after: 1 week
dev/xen/netfront:
In netif_free(), properly stop the interface and drain any pending
timers prior to disconnecting from the backend device.
Remove all media and detach our interface object from the system
prior to deleting it.
PR: kern/176471
Submitted by: Roger Pau Monne <roger.pau@citrix.com>
Reviewed by: gibbs
MFC after: 1 week
from 1k to 20k The previous value was good 10 years ago, but not
anymore now.
More importantly, lots of good surprises:
polling is incredibly effective under virtualization, and not only
prevents livelock but also saves most of the VM exit overhead in
receive mode.
Using polling, a FreeBSD instance under qemu-kvm remains perfectly
responsive even when bombed with 10 Mpps over an emulated e1000,
and happily processes 1.7 Mpps through ipfw.
Note that some incompatibilities still remain: e.g. polling is not
(yet) compatible with netmap, and seems to freeze the guest when
kern.polling.idle_poll=1
MFC after: 3 days
an error. One could argue that returning a buffer even when it is
not valid is incorrect, but bread has always returned a buffer
valid or not.
Reviewed by: kib
MFC after: 2 weeks
the return value is NULL. Based on the returned flags, the
return value should never be inspected in the case where NULL
is returned, but it is good coding practice not to return a
pointer to freed memory.
Found by: Coverity Scan, CID 1006096
Reviewed by: kib
MFC after: 2 weeks
o Relax locking assertions for pmap_enter_object() and add them also
to architectures that currently don't have any
o Introduce VM_OBJECT_LOCK_DOWNGRADE() which is basically a downgrade
operation on the per-object rwlock
o Use all the mechanisms above to make vm_map_pmap_enter() to work
mostl of the times only with readlocks.
Sponsored by: EMC / Isilon storage division
Reviewed by: alc
The <uchar.h> header, part of C11, adds a small number of utility
functions for 16/32-bit "universal" characters, which may or may not be
UTF-16/32. As our wchar_t is already ISO 10646, simply add light-weight
wrappers around wcrtomb() and mbrtowc().
While there, also add (non-yet-standard) _l functions, similar to the
ones we already have for the other locale-dependent functions.
Reviewed by: theraven
traffic.
When transmitting non-aggregate traffic, we need to keep the hardware
busy whilst transmitting or small bursts in txdone/tx latency will
kill us.
This restores non-aggregate iperf performance, especially when doing
TDMA.
Tested:
* AR5416<->AR5416, TDMA
* AR5416 STA <-> AR9280 AP
of course.)
There's a few things that needed to happen:
* In case someone decides to set the beacon transmission rate to be
at an MCS rate, use the MCS-aware version of the duration calculation
to figure out how long the received beacon frame was.
* If TxOP enforcing is available on the hardware and we're doing TDMA,
enable it after a reset and set the TDMA guard interval to zero.
This seems to behave fine.
TODO:
* Although I haven't yet seen packet loss, the PHY errors that would be
triggered (specifically Transmit-Override-Receive) aren't enabled
by the 11n HAL. I'll have to do some work to enable these PHY errors
for debugging.
What broke:
* My recent changes to the TX queue handling has resulted in the driver
not keeping the hardware queue properly filled when doing non-aggregate
traffic. I have a patch to commit soon which fixes this situation
(albeit by reminding me about how my ath driver locking isn't working
out, sigh.)
So if you want to test this without updating to the next set of patches
that I commit, just bump the sysctl dev.ath.X.hwq_limit from 2 to 32.
Tested:
* AR5416 <-> AR5416, with ampdu disabled, HT40, 5GHz, MCS12+Short-GI.
I saw 30mbit/sec in both directions using a bidirectional UDP test.
the right type for the argument in syscalls.master. Also fix the
posix_fallocate(2) and posix_fadvise(2) compat32 syscalls on the
architectures which require padding of the 64bit argument.
Noted and reviewed by: jhb
Pointy hat to: kib
MFC after: 1 week
for the case when the nullfs vnode is not reclaimed. Otherwise, later
reclamation would not unlock the lower vnode.
Reported by: antoine
Tested by: pho
Sponsored by: The FreeBSD Foundation
MFC after: 1 week
registers also on other CPUs, besides the CPU which happens to execute
the ddb. The debugging registers are stored in the pcpu area,
together with the command which is executed by the IPI stop handler
upon resume.
Reviewed by: jhb
Sponsored by: The FreeBSD Foundation
MFC after: 1 week
a vm page, denoted either by an address of the struct vm_page, or, if
the '/p' modifier is specified, by a physical address of the
corresponding frame.
Reviewed by: jhb
Sponsored by: The FreeBSD Foundation
MFC after: 1 week
1. Common headers for fdt.h and ofw_machdep.h under x86/include
with indirections under i386/include and amd64/include.
2. New modinfo for loader provided FDT blob.
3. Common x86_init_fdt() called from hammer_time() on amd64 and
init386() on i386.
4. Split-off FDT specific low-level console functions from FDT
bus methods for the uart(4) driver. The low-level console
logic has been moved to uart_cpu_fdt.c and is used for arm,
mips & powerpc only. The FDT bus methods are shared across
all architectures.
5. Add dev/fdt/fdt_x86.c to hold the fdt_fixup_table[] and the
fdt_pic_table[] arrays. Both are empty right now.
FDT addresses are I/O ports on x86. Since the core FDT code does
not handle different address spaces, adding support for both I/O
ports and memory addresses requires some thought and discussion.
It may be better to use a compile-time option that controls this.
Obtained from: Juniper Networks, Inc.
I added this block, knowing that lint does not support _Thread_local.
When linting, we could argue that we don't care about TLS (yet). It
seems, however, that external pieces of software also sometimes do a
-Dlint, regex the output and compile it again.
Reported by: swills
apply to most jails but do apply to vnet jails. This includes adding
a new sysctl "security.jail.vnet" to identify vnet jails.
PR: conf/149050
Submitted by: mdodd
MFC after: 3 days
Group onboard mic and headphone mic jack together. Creates association that
will switch between microphone inputs depending on the state of the headphone
jack being connected to a live mic.
Fixes onboard mic not working at all on T520.
Tested on T520, T420.
Suspect X220 needs this too, untested on.
MFC after: 1 month
The list-based DMA engine has the following behaviour:
* When the DMA engine is in the init state, you can write the first
descriptor address to the QCU TxDP register and it will work.
* Then when it hits the end of the list (ie, it either hits a NULL
link pointer, OR it hits a descriptor with VEOL set) the QCU
stops, and the TxDP points to the last descriptor that was transmitted.
* Then when you want to transmit a new frame, you can then either:
+ write the head of the new list into TxDP, or
+ you write the head of the new list into the link pointer of the
last completed descriptor (ie, where TxDP points), then kick
TxE to restart transmission on that QCU>
* The hardware then will re-read the descriptor to pick up the link
pointer and then jump to that.
Now, the quirks:
* If you write a TxDP when there's been no previous TxDP (ie, it's 0),
it works.
* If you write a TxDP in any other instance, the TxDP write may actually
fail. Thus, when you start transmission, it will re-read the last
transmitted descriptor to get the link pointer, NOT just start a new
transmission.
So the correct thing to do here is:
* ALWAYS use the holding descriptor (ie, the last transmitted descriptor
that we've kept safe) and use the link pointer in _THAT_ to transmit
the next frame.
* NEVER write to the TxDP after you've done the initial write.
* .. also, don't do this whilst you're also resetting the NIC.
With this in mind, the following patch does basically the above.
* Since this encapsulates Sam's issues with the QCU behaviour w/ TDMA,
kill the TDMA special case and replace it with the above.
* Add a new TXQ flag - PUTRUNNING - which indicates that we've started
DMA.
* Clear that flag when DMA has been shutdown.
* Ensure that we're not restarting DMA with PUTRUNNING enabled.
* Fix the link pointer logic during TXQ drain - we should always ensure
the link pointer does point to something if there's a list of frames.
Having it be NULL as an indication that DMA has finished or during
a reset causes trouble.
Now, given all of this, i want to nuke axq_link from orbit. There's now HAL
methods to get and set the link pointer of a descriptor, so what we
should do instead is to update the right link pointer.
* If there's a holding descriptor and an empty TXQ list, set the
link pointer of said holding descriptor to the new frame.
* If there's a non-empty TXQ list, set the link pointer of the
last descriptor in the list to the new frame.
* Nuke axq_link from orbit.
Note:
* The AR9380 doesn't need this. FIFO TX writes are atomic. As long as
we don't append to a list of frames that we've already passed to the
hardware, all of the above doesn't apply. The holding descriptor stuff
is still needed to ensure the hardware can re-read a completed
descriptor to move onto the next one, but we restart DMA by pushing in
a new FIFO entry into the TX QCU. That doesn't require any real
gymnastics.
Tested:
* AR5210, AR5211, AR5212, AR5416, AR9380 - STA mode.
Most likely some non-USB compliant devices will choke on it
sooner or later. Clear stall is strictly speaking not needed.
If the first MIDI command sent or transmitted is lost, this
is not a big problem for us.
MFC after: 1 week
Set a valid alternate interface setting
when enumerating USB audio devices else
the device mentioned will not work like
expected.
PR: usb/178722
MFC after: 1 week
o The CP2101 and CP2102 do not support GPIO pin use at all, enforce this.
o Support reading the GPIO status on the second port of the CP2105. More
work is needed before the CP2105 GPIO pins can be used as outputs.
Hardware donated by: Silicon Labs
MFC after: 3 weeks
tree is used to maintain the object's collection of resident pages,
vm_page_lookup() no longer needs an exclusive lock.
Reviewed by: attilio
Sponsored by: EMC / Isilon Storage Division
doesn't match the actual hardware queue this frame is queued to.
I'm trying to ensure that the holding buffers are actually being queued
to the same TX queue as the holding buffer that they end up on.
I'm pretty sure this is all correct so if this complains, it'll be due
to some kind of subtle broken-ness that needs fixing.
This is only done for legacy hardware, not EDMA hardware.
Tested:
* AR5416 STA mode, very lightly
PS-POLL support.
This implements PS-POLL awareness i nthe
* Implement frame "leaking", which allows for a software queue
to be scheduled even though it's asleep
* Track whether a frame has been leaked or not
* Leak out a single non-AMPDU frame when transmitting aggregates
* Queue BAR frames if the node is asleep
* Direct-dispatch the rest of control and management frames.
This allows for things like re-association to occur (which involves
sending probe req/resp as well as assoc request/response) when
the node is asleep and then tries reassociating.
* Limit how many frames can set in the software node queue whilst
the node is asleep. net80211 is already buffering frames for us
so this is mostly just paranoia.
* Add a PS-POLL method which leaks out a frame if there's something
in the software queue, else it calls net80211's ps-poll routine.
Since the ath PS-POLL routine marks the node as having a single frame
to leak, either a software queued frame would leak, OR the next queued
frame would leak. The next queued frame could be something from the
net80211 power save queue, OR it could be a NULL frame from net80211.
TODO:
* Don't transmit further BAR frames (eg via a timeout) if the node is
currently asleep. Otherwise we may end up exhausting management frames
due to the lots of queued BAR frames.
I may just undo this bit later on and direct-dispatch BAR frames
even if the node is asleep.
* It would be nice to burst out a single A-MPDU frame if both ends
support this. I may end adding a FreeBSD IE soon to negotiate
this power save behaviour.
* I should make STAs timeout of power save mode if they've been in power
save for more than a handful of seconds. This way cards that get
"stuck" in power save mode don't stay there for the "inactivity" timeout
in net80211.
* Move the queue depth check into the driver layer (ath_start / ath_transmit)
rather than doing it in the TX path.
* There could be some naughty corner cases with ps-poll leaking.
Specifically, if net80211 generates a NULL data frame whilst another
transmitter sends a normal data frame out net80211 output / transmit,
we need to ensure that the NULL data frame goes out first.
This is one of those things that should occur inside the VAP/ic TX lock.
Grr, more investigations to do..
Tested:
* STA: AR5416, AR9280
* AP: AR5416, AR9280, AR9160
QLogic 8300 Series Adapters
Submitted by: David C Somayajulu (davidcs@freebsd.org) QLogic Corporation
Approved by: George Neville-Neil (gnn@freebsd.org)
PV entries are now roughly half the size.
Instead of using a shared UMA zone for 28 byte pv entries
(two 8-byte tailq nodes, a 4 byte pointer, a 4 byte address and 4 byte
flags), we allocate a page at a time per process.
This provides 252 pv entries per process (actually, per pmap address space)
and eliminates one of the 8-byte tailq entries since we now can track
per-process pv entries implicitly.
The pointer to the pmap can be eliminated by doing address arithmetic to
find the metadata on the page headers to find a single pointer shared by
all 252 entries. There is an 8-int bitmap for the freelist of those 252
entries.
When in serious low memory condition, allocation of another pv_chunk is
possible by freeing some pages in pmap_pv_reclaim().
Added pv_entry/pv_chunk related statistics to pmap.
pv_entry/pv_chunk statistics can be accessed via sysctl vm.pmap.
Ported PTE freelist of KVA allocation and maintenance from i386.
Using an idea from Stephan Uphoff, use the empty pte's that correspond
to the unused kva in the pv memory block to thread a freelist through.
This allows us to free pages that used to be used for pv entry chunks
since we can now track holes in the kva memory block.
As both ARM pmap.c and pmap-v6.c use the same header and pv_entry, pmap and
md_page structures are different, it was needed to separate code designed
for ARMv6/7 from the one for other ARMs.
Submitted by: Zbigniew Bodek <zbb@semihalf.com>
Reviewed by: alc
Sponsored by: The FreeBSD Foundation, Semihalf
Instead of doing all sorts of weird casting of constants to
pointer-pointers, simply use the standard C offsetof() macro to obtain
the offset of the respective fields in the structures.
Instead of only checking the __STDC_VERSION__, we can also use Clang's
__has_extension() to check for features specifically. This allows us to,
say, use Clang's native _Static_assert() instead of the typedef hack,
making the compiler error messages a lot more readable.
Reviewed by: theraven
* Move the node sleep/wake state under the TX lock rather than the
node lock. Let's leave the node lock protecting rate control only
for now.
* When reassociating, various state needs to be cleared. For example,
the aggregate session needs to be torn down, including any pending
aggregation negotiation and BAR TX waiting.
* .. and we need to do a "cleanup" pass since frames in the hardware
TX queue need to be transmitted.
Modify ath_tx_tid_cleanup() to be called with the TX lock held and push
frames into a completion list. This allows for the cleanup to be
done atomically for all TIDs in a node rather than grabbing and
releasing the TX lock each time.
freelist.
o Split the pool of free pages queues really by domain and not rely on
definition of VM_RAW_NFREELIST.
o For MAXMEMDOM > 1, wrap the RR allocation logic into a specific
function that is called when calculating the allocation domain.
The RR counter is kept, currently, per-thread.
In the future it is expected that such function evolves in a real
policy decision referee, based on specific informations retrieved by
per-thread and per-vm_object attributes.
o Add the concept of "probed domains" under the form of vm_ndomains.
It is responsibility for every architecture willing to support multiple
memory domains to correctly probe vm_ndomains along with mem_affinity
segments attributes. Those two values are supposed to remain always
consistent.
Please also note that vm_ndomains and td_dom_rr_idx are both int
because segments already store domains as int. Ideally u_int would
have much more sense. Probabilly this should be cleaned up in the
future.
o Apply RR domain selection also to vm_phys_zero_pages_idle().
Sponsored by: EMC / Isilon storage division
Partly obtained from: jeff
Reviewed by: alc
Tested by: jeff
bit available for a flag in the pointer. However, it felt more correct
to enforce natural alignment of the key pointer. Unfortunately on
32bit architectures 64bit integers are not always naturally aligned.
Change the assert to enforce only 32bit alignment of the 64bit key for
now to fix the build. A more correct fix would be to properly sort
the struct buf fields which definitely suffer from bloat due to padding.
vm_page_insert() so that (1) vm_radix_lookup_le() is never called while the
free page queues lock is held and (2) vm_radix_lookup_le() is called at most
once. This change reduces the average time that the free page queues lock
is held by vm_page_alloc() as well as vm_page_alloc()'s average overall
running time.
Sponsored by: EMC / Isilon Storage Division
users to guarantee that the output of DTrace scripts will be time-ordered.
This option is enabled by adding the line
#pragma D option temporal
to the beginning of a script, or by adding '-x temporal' to the arguments of
dtrace(1).
This change fixes a bug in the original port of the temporal option. This
bug was causing some assertions to fail, so they had been disabled; in this
revision the assertions are working properly and are enabled.
The DTrace version number has been bumped from 1.9.0 to 1.9.1 to reflect
the language change that's being introduced.
This change corresponds to part of illumos-gate commit e5803b76927480:
3021 option for time-ordered output from dtrace(1M)
Reviewed by: pfg
Obtained from: illumos
MFC after: 1 month
with any structure containing a uint64_t index. The tree code
auto-generates type safe wrappers.
- Eliminate the buf splay and replace it with pctrie. This is not only
significantly faster with large files but also allows for the possibility
of shared locking.
Reviewed by: alc, attilio
Sponsored by: EMC / Isilon Storage Division
EABI ARM kernels or clang-compiled ARM kernels.
This fixes a crash seen in clang-compiled ARM
kernels that include WITNESS.
This code could be easily modified to walk the stack
for current clang-generated code (including EABI)
but Andrew Turner has raised concerns that the
stack frame currently emitted by clang isn't actually
required by EABI so such a change might cause problems
down the road.
In case anyone wants to experiment, the change
to support current clang-compiled kernels
involves simply setting FR_RFP=0 and FR_SCP=1.
functions, reverse the numbering scheme for the levels. The highest
numbered level in the tree now appears near the root instead of the leaves.
Sponsored by: EMC / Isilon Storage Division
With "cached read" HDD testing and multiple ports busy on a SATA
host controller, 3726/3826 PMP will very rarely drop a deferred
R_OK that was intended for the host. Symptom will be all 5 drives
under test will timeout, get reset, and recover.
Submitted by: Rich Futyma <rich.futyma@sanmina.com>
MFC after: 2 weeks
null_hashget() obtains the reference on the nullfs vnode, which must
be dropped.
- Fix a wart which existed from the introduction of the nullfs
caching, do not unlock lower vnode in the nullfs_reclaim_lowervp().
It should be innocent, but now it is also formally safe. Inform the
nullfs_reclaim() about this using the NULLV_NOUNLOCK flag set on
nullfs inode.
- Add a callback to the upper filesystems for the lower vnode
unlinking. When inactivating a nullfs vnode, check if the lower
vnode was unlinked, indicated by nullfs flag NULLV_DROP or VV_NOSYNC
on the lower vnode, and reclaim upper vnode if so. This allows
nullfs to purge cached vnodes for the unlinked lower vnode, avoiding
excessive caching.
Reported by: G??ran L??wkrantz <goran.lowkrantz@ismobile.com>
Tested by: pho
Sponsored by: The FreeBSD Foundation
MFC after: 2 weeks
a non-loss reset.
When the drain functions are called, the holding descriptor and link pointers
are NULLed out.
But when the processq function is called during a non-loss reset, this
doesn't occur. So the next time a DMA occurs, it's chained to a descriptor
that no longer exists and the hardware gets angry.
Tested:
* AR5416, STA mode; use sysctl dev.ath.X.forcebstuck=1 to force a non-loss
reset.
TODO:
* Further AR9380 testing just to check that the behaviour for the EDMA
chips is sane.
PR: kern/178477
before using said node.
The "blessed" way here is to take a node reference before referencing
anything inside the node, otherwise the node can be freed between
the time the pointer is copied/dereferenced and the time the node contents
are used.
This mirrors fixes that I've done elsewhere in the net80211/driver
stack.
PR: kern/178470
the beaglebone-specific .dts file.
Add a new .dts for the BeagleBone Black with more memory,
slightly different pinmux initialization, and with mmchs1
configured (though the latter doesn't quite work yet).
options OCTEON_VENDOR_GEFES to enable support for these boards, to
match changes that GE publishes to the Octeon Simple Executive. Since
board types overlap with other boards, it is unlikely that we will
properly boot on other Octeon boards with OCTEON_VENDOR_GEFES enabled.
Tested extensively on the WANIC 6354, but I retained support for all
the other models. Some features need changes in the base kernel, and
those are in progress.
An array-type stat in vmm.ko is defined as follows:
VMM_STAT_ARRAY(IPIS_SENT, VM_MAXCPU, "ipis sent to vcpu");
It is incremented as follows:
vmm_stat_array_incr(vm, vcpuid, IPIS_SENT, array_index, 1);
And output of 'bhyvectl --get-stats' looks like:
ipis sent to vcpu[0] 3114
ipis sent to vcpu[1] 0
Reviewed by: grehan
Obtained from: NetApp
the Linux tree that they always include this chip in their FDT, so
make support for the ds1337 opt-out rather than opt-in. Now my boards
boot with the correct time.
Convert the structures to C99 style initialisation, which makes it
a lot easier to check that all of them are set and to generate a
derived template from them.
Sponsored by: DARPA, AFRL
MFC after: 2 weeks
assigned conflicting ranges to BARs then leaving the BARs alone could
result in one device stealing mmio accesses intended to go to a second
device. Prior to 233677 the PCI bus driver attempted to handle this case
by clearing the BAR to 0 depending on BARs based at 0 not decoding (which
is not guaranteed to be true). Now when a conflicting BAR is detected the
following steps are taken:
1) If hw.pci.realloc_bars (a new tunable) is enabled (default is enabled),
then ignore the current BAR setting from the firmware and attempt to
allocate a fresh resource range for the BAR.
2) If 1) failed (or was disabled), disable decoding for the relevant
BAR type (e.g. disable mem decoding for a memory BAR) and emit a
warning if booting verbose.
Tested by: Alex Keda <admin@lissyara.su>
MFC after: 1 week
to 63 bit positions.
Do not fill the save area and do not set the saved bit in the xstate
bit vector for the state which is not marked as enabled in xsave_mask.
Reported and tested by: Jim Ohlstein <jim@ohlste.in>
MFC after: 3 days
irrespective of the setting of lem_rx_process_limit, while
giving a chance to the taskqueue scheduler to act after
each chunk.
This makes lem_rxeof similar to the one in if_em.c and if_igb.c .
if_lem.c and if_em.c: add a sysctl to manually configure the
'itr' moderation register.
Approved by: Jack Vogel
locks. To support this, VNODE locks are created with the LK_IS_VNODE
flag. This flag is propagated down using the LO_IS_VNODE flag.
Note that WITNESS still records the LOR. Only the printing and the
optional entering into the kernel debugger is bypassed with the
WITNESS_NO_VNODE option.
all requested data was sent. The reason is that xfsize <= 0 condition
must not be tested at all if space == loopbytes. Otherwise, the done
is set to 1, and sendfile(2) is aborted too early.
Instead of moving the condition to exiting the inner loop after the
xfersize check, directly check for the completed transfer before the
testing of the available space in the socket buffer, and revert item 1
of r248830. It is arguably another bug to sleep waiting for socket
buffer space (or return EAGAIN for non-blocking socket) if all bytes
are already transferred.
Reported by: pho
Discussed with: scottl, gibbs
Tested by: scottl (stable/9 backport), pho
redefine such operations for different consumers.
This will be used when NUMA support will be finished and numaset
will need to be used.
Sponsored by: EMC / Isilon storage division
Obtained from: jeff
Reviewed by: alc
of "right".)
Flip back on the "always continue TX DMA using the holding descriptor"
code - by always setting ATH_BUF_BUSY and never setting axq_link to NULL.
Since the holding descriptor is accessed via txq->axq_link and _that_
is done behind the TXQ lock rather than the TX path lock, the holding
descriptor stuff itself needs to be behind the TXQ lock.
So, do the mental gymnastics needed to do this.
I've not seen any of the hardware failures that I was seeing when
I last tried to do this.
Tested:
* AR5416, STA mode
Until an ADM6996 driver shows up, this allows for the two switch
ports to be used.
Submitted by: Luiz Otavio O Souza <loos.br@gmail.com>
Reviewed by: ray
* Fix API changes;
* remove unused code;
* Allow some switches to be used that don't expose a set of PHY
registers for the CPU facing port (eg the ADM6996 for the Ubiquiti
Routerstation.)
Submitted by: Luiz Otavio O Souza <loos.br@gmail.com>
Reviewed by: ray
This adds a vlan capability field to etherswitch_info structure and some
definitions of ports flags.
It adds the support to global config parameters which right now is used
only to switch between the vlan modes, but it is intended to be extended
to support the setup of others parameters (STP, mirror, etc.).
Submitted by: Luiz Otavio O Souza <loos.br@gmail.com>
Reviewed by: ray
Use if_initbaudrate() to set baudrate.
Add IFCAP_LINKSTATE to if_capabilities.
Submitted by: David C Somayajulu <davidcs@freebsd.org>
Approved by: George Neville-Neil <gnn@freebsd.org>
do drain (flush_workqueue() in Linux terms) but instead returns true if
the work was removed before it is run, or false otherwise.
Simulate this by removing the taskqueue_drain() and return the value
derived from taskqueue_cancel()'s return value.
This would solve a witness warning caused by calling taskqueue_drain()
with a non-sleepable lock held, like:
taskqueue_drain with the following non-sleepable locks held:
exclusive rw lle (lle) r = 0 (0xfffffe001450b410) locked @
/usr/src/sys/netinet/in.c:1484
KDB: stack backtrace:
db_trace_self_wrapper() at db_trace_self_wrapper+0x2b/frame 0xffffff848d4f7690
kdb_backtrace() at kdb_backtrace+0x39/frame 0xffffff848d4f7740
witness_warn() at witness_warn+0x4a8/frame 0xffffff848d4f7800
taskqueue_drain() at taskqueue_drain+0x3a/frame 0xffffff848d4f7840
set_timeout() at set_timeout+0x4a/frame 0xffffff848d4f7860
netevent_callback() at netevent_callback+0x16/frame 0xffffff848d4f7870
arpintr() at arpintr+0x9b5/frame 0xffffff848d4f7930
This do not affect kernel without OFED compiled in.
Reported by: Garrett Cooper <yaneurabeya gmail com>
(who also tested an earlier version of this patch,
but bugs are mine)
MFC after: 2 weeks
I'm not sure why this is failing. The holding descriptor should be being
re-read when starting DMA of the next frame. Obviously something here
isn't totally correct.
I'll review the TX queue handling and see if I can figure out why this
is failing. I'll then re-revert this patch out and use the holding
descriptor again.