when running FreeBSD on QEMU emulating a Gumstix board.
While here remove the use of a magic number in the not-XScale version.
Pointed out by: kib
Reviewed by: stas
This is not strictly required with the current ABI but will be when we
switch to the ARM EABI. The aapcs requires the stack to be 4 byte aligned
at all times and 8 byte aligned when calling a public subroutine where the
current ABI only requires sp to be a multiple of 4.
programming using earlier cached values. This makes respective routines to
disappear from PMC top and reduces total number of active CPU cycles on idle
24-core system by 10%.
attributes (currently just BUS_DMA_NOCACHE):
- Don't call pmap_change_attr() on the returned address, instead use
kmem_alloc_contig() to ask the VM system for memory with the requested
attribute.
- As a result, always use kmem_alloc_contig() for non-default memory
attributes, even for sub-page allocations. This requires adjusting
bus_dmamem_free()'s logic for determining which free routine to use.
- For x86, add a new dummy bus_dmamap that is used for static DMA
buffers allocated via kmem_alloc_contig(). bus_dmamem_free() can then
use the map pointer to determine which free routine to use.
- For powerpc, add a new flag to the allocated map (bus_dmamem_alloc()
always creates a real map on powerpc) to indicate which free routine
should be used.
Note that the BUS_DMA_NOCACHE handling in powerpc is currently #ifdef'd out.
I have left it disabled but updated it to match x86.
Reviewed by: scottl
MFC after: 1 month
When forming aggregates, the last descriptor was now not being
correctly setup - instead, the "setuplasttxdesc" call was being
handed the first descriptor in the last subframe, rather than the
last descriptor in the last subframe.
This showed up as "bad series0 hwrate" messages, as the final
descriptor just didn't have any of the rate control information
squirreled away.
Tested:
* AR9280 STA -> 11n AP, iperf TCP
llentry_free() and arptimer():
o Use callout_init_rw() for lle timeout, this allows us safely
disestablish them.
- This allows us to simplify the arptimer() and make it
race safe.
o Consistently use ifp->if_afdata_lock to lock access to
linked lists in the lle hashes.
o Introduce new lle flag LLE_LINKED, which marks an entry that
is attached to the hash.
- Use LLE_LINKED to avoid double unlinking via consequent
calls to llentry_free().
- Mark lle with LLE_DELETED via |= operation istead of =,
so that other flags won't be lost.
o Make LLE_ADDREF(), LLE_REMREF() and LLE_FREE_LOCKED() more
consistent and provide more informative KASSERTs.
The patch is a collaborative work of all submitters and myself.
PR: kern/165863
Submitted by: Andrey Zonov <andrey zonov.org>
Submitted by: Ryan Stone <rysto32 gmail.com>
Submitted by: Eric van Gyzen <eric_van_gyzen dell.com>
system with sparse CPU IDs, you can have a valid CPU ID > mp_ncpus (e.g. if
you have two CPUs 0 and 4, with mp_maxid == 4 and mp_ncpus == 2).
Introduced at svn r235210
Submitted by: jhb@
Reviewed by: jfv@
should be paried with g_vfs_close(). Though g_vfs_close() is a wrapper
around g_wither_geom_close(), r206130 added the following test in
g_vfs_open():
if (bo->bo_private != vp)
return (EBUSY);
Which will cause a 'Device busy' error inside reiserfs_mountfs() if
the same file system is re-mounted again after umount or mounting failure:
(case 1, /dev/ad4s3 is not a valid REISERFS partition)
# mount -t reiserfs -o ro /dev/ad4s3 /mnt
mount: /dev/ad4s3: Invalid argument
# mount -t msdosfs -o ro /dev/ad4s3 /mnt
mount: /dev/ad4s3: Device busy
(case 2, /dev/ad4s3 is a valid REISERFS partition)
# mount -t reiserfs -o ro /dev/ad4s3 /mnt
# umount /mnt
# mount -t reiserfs -o ro /dev/ad4s3 /mnt
mount: /dev/ad4s3: Device busy
On the other hand, g_vfs_close() 'fixed' the above cases by doing an
extra step to keep 'sc->sc_bo->bo_private' and 'cp->private' pointers
synchronised.
Reviewed by: kib
MFC after: 1 month
As discussed on -current, inet_ntoa_r() is non standard,
has different arguments in userspace and kernel, and
almost unused (no clients in userspace, only
net/flowtable.c, net/if_llatbl.c, netinet/in_pcb.c, netinet/tcp_subr.c
in the kernel)
message for r238973:
Rdtsc instruction is not synchronized, it seems on some Intel cores it
can bypass even the locked instructions. As a result, rdtsc executed
on different cores may return unordered TSC values even when the rdtsc
appearance in the instruction sequences is provably ordered.
Similarly to what has been done in r238755 for TSC synchronization
test, add explicit fences right before rdtsc in the timecounters 'get'
functions. Intel recommends to use LFENCE, while AMD refers to
MFENCE. For VIA follow what Linux does and use LFENCE. With this
change, I see no reordered reads of TSC on Nehalem.
Change the rmb() to inlined CPUID in the SMP TSC synchronization test.
On i386, locked instruction is used for rmb(), and as noted earlier,
it is not enough. Since i386 machine may not support SSE2, do simplest
possible synchronization with CPUID.
MFC after: 1 week
Discussed with: avg, bde, jkim
- remove special handling of zero length transfers in mpi_pre_fw_upload();
- add missing MPS_CM_FLAGS_DATAIN flag in mpi_pre_fw_upload();
- move mps_user_setup_request() call into proper place;
- increase user command timeout from 30 to 60 seconds;
- avoid NULL dereference panic in case of firmware crash.
Set max DMA segment size to 24bit, as MPI SGE supports it.
Use mps_add_dmaseg() to add empty SGE instead of custom code.
Tune endianness safety.
Reviewed by: Desai, Kashyap <Kashyap.Desai@lsi.com>
Sponsored by: iXsystems, Inc.
PTE's PG_M and PG_RW bits but not the physical page frame. First,
only perform vm_page_dirty() on a managed vm_page when the PG_M bit is
being cleared. If the updated PTE continues to have PG_M set, then
there is no requirement to perform vm_page_dirty(). Second, flush the
mapping from the TLB when PG_M alone is cleared, not just when PG_M
and PG_RW are cleared. Otherwise, a stale TLB entry may stop PG_M
from being set again on the next store to the virtual page. However,
since the vm_page's dirty field already shows the physical page as
being dirty, no actual harm comes from the PG_M bit not being set.
Nonetheless, it is potentially confusing to someone expecting to see
the PTE change after a store to the virtual page.
enabled.
The legacy (pre-802.11n) hardware doesn't support this - although
the AR5212 era hardware supports MRR, it doesn't have all the bits
needed to support MRR + RTS/CTS. The AR5416 and later support
a packet duration and RTS/CTS flags per rate scenario, so we should
support it.
Tested:
* AR9280, STA
PR: kern/170302
adding any extension header, or rather before calling into IPsec
processing as we may send the packet and not return to IPv6 output
processing here.
PR: kern/170116
MFC After: 3 days
This allows my TI1510 cardbus/PCI bridge to work after a suspend/resume,
without having to unload/reload the cbb driver.
I've also tested this on stable/9. I'll MFC it shortly.
PR: kern/170058
Reviewed by: jhb
MFC after: 1 day
lock is obtained before the write count is increased during open() and the
lock is released after the write count is decreased during close().
The first change closes a race where an open() that will block with O_SHLOCK
or O_EXLOCK can increase the write count while it waits. If the process
holding the current lock on the file then tries to call exec() on the file
it has locked, it can fail with ETXTBUSY even though the advisory lock is
preventing other threads from succesfully completeing a writable open().
The second change closes a race where a read-only open() with O_SHLOCK or
O_EXLOCK may return successfully while the write count is non-zero due to
another descriptor that had the advisory lock and was blocking the open()
still being in the process of closing. If the process that completed the
open() then attempts to call exec() on the file it locked, it can fail with
ETXTBUSY even though the other process that held a write lock has closed
the file and released the lock.
Reviewed by: kib
MFC after: 1 month
code is called and remove it from ath_buf_set_rate().
For the legacy (non-11n API) TX routines, ath_hal_filltxdesc() takes care
of setting up the intermediary and final descriptors right, complete
with copying the rate control info into the final descriptor so the
rate modules can grab it.
The 11n version doesn't do this - ath_hal_chaintxdesc() doesn't
copy the rate control bits over, nor does it clear isaggr/moreaggr/
pad delimiters. So the call to setuplasttxdesc() is needed here.
So:
* legacy NICs - never call the 11n rate control stuff, so filltxdesc
copies the rate control info right;
* 11n NICs transmitting legacy or 11n non-aggregate frames -
ath_hal_set11nratescenario() is called to setup rate control and
then ath_hal_filltxdesc() chains them together - so the rate control
info is right;
* 11n aggregate frames - set11nratescenario() is called, then
ath_hal_chaintxdesc() is called to chain a list of aggregate and subframes
together. This requires a call to ath_hal_setuplasttxdesc() to complete
things.
Tested:
* AR9280 in station mode
TODO:
* I really should make sure that the descriptor contents get blanked
out correctly or garbage left over from aggregate frames may show
up in non-aggregate frames, leading to badness.
functions, for both legacy and 802.11n.
This will simplify supporting the EDMA chipsets as these two descriptor
setup functions can just be overridden in their entirety, hiding all of
the subtle differences in setting things up.
It's not a permanent solution, as eventually the AR5416 HAL should grow
similar versions of the 11n descriptor functions and then those can be
used.
TODO:
* Push the "clr11naggr" call into the legacy setds, just to ensure
that retried frames don't end up with the aggregate bits set
inappropriately;
* Remove the "setlasttxdesc" call from the 11n TX path and push it
into setds_11n.
* Ensure that setds_11n will work correctly for non-aggregate frames;
* .. and then when it does, just unconditionally call "setds_11n" for
11n NICs and "setds" for non-11n NICs.
For C1 and C2 states use cpu_ticks() to measure sleep time instead of much
slower ACPI timer. We can't do it for C3, as TSC may stop there. But it is
less important there as wake up latency is high any way.
For C1 and C2 states do not check/clear bus mastering activity status, as
it is important only for C3. As side effect it can make CPU enter C2 instead
of C3 if last BM activity was two sleeps back (unlike one before), but
that may be even good because of collecting more statistics. Premature BM
wakeup from C3, entered because of overestimation, can easily be worse then
entering C2 from both performance and power consumption points of view.
Together on dual Xeon E5645 system on sequential 512 bytes read test this
change makes cpu_idle_acpi() as fast as simplest cpu_idle_hlt() and only
few percents slower then cpu_idle_mwait(), while deeper states are still
actively used during idle periods.
To help with diagnostics, add C-state type into dev.cpu.X.cx_supported.
Sponsored by: iXsystems, Inc.