The correct ordering for non-aggregate TX is:
* call ath_hal_setuptxdesc() to setup the first TX descriptor complete
with the first TX rate/try count;
* call ath_hal_setupxtxdesc() to setup the multi-rate retry;
* .. or for 802.11n NICs, call ath_hal_set11nratescenario() for MRR and
802.11n flags;
* then call ath_hal_filltxdesc() to setup intermediary descriptors
in a multi-descriptor single frame.
The call to ath_hal_filltxdesc() routines seem to correctly (consistently?)
handle the intermediary descriptor flags, including copying the rate
control information to the final descriptor in the frame. That's used
by the rate control module rather than the hardware.
Tested:
* Only on AR9280 STA mode, however it should work on other chips in
both STA and AP mode.
wrapping.
The previous code was only wrapping descriptor "block" boundaries rather
than individual descriptors. It sounds equivalent but it isn't.
r238824 changed the descriptor allocation to enforce that an individual
descriptor doesn't wrap a 4KiB boundary rather than the whole block
of descriptors. Eg, for TX descriptors, they're allocated in blocks
of 10 descriptors for each ath_buf (for scatter/gather DMA.)
checking. This allows the FreeBSD 9.1 release process to move forward.
Work around the problem that loopback connections to local addresses
not on loopback interfaces and not on interfaces w/ IPv6 checksum offloading
enabled would not work.
A proper fix to allow us to disable the "checksum offload" on loopback
for testing, measurements, ... as we allow for IPv4 needs to put in
place later.
Reported by: tuexen, Matthew Seaman (m.seaman infracaninophile.co.uk)
Reported by: Mike Andrews (mandrews bit0.com), kib, ...
PR: kern/170070
MFC after: 1 day
X-MFC after: re approval
data pointer. This is a temp fix that resubmits the
command, adjusted, so that the backend can fetch the
data again.
Sponsored by: Spectralogic
MFC after: 1 month
MISC CHANGES
Add a new async event- ISP_TARGET_NOTIFY_ACK, that will guarantee
eventual delivery of a NOTIFY ACK. This is tons better than just
ignoring the return from isp_notify_ack and hoping for the best.
Clean up the lower level lun enable code to be a bit more sensible.
Fix a botch in isp_endcmd which was messing up the sense data.
Fix notify ack for SRR to use a sensible error code in the case
of a reject.
Clean up and make clear what kind of firmware we've loaded and
what capabilities it has.
-----------
FULL (252 byte) SENSE DATA
In CTIOs for the ISP, there's only a limimted amount of space
to load SENSE DATA for associated CHECK CONDITIONS (24 or 26
bytes). This makes it difficult to send full SENSE DATA that can
be up to 252 bytes.
Implement MODE 2 responses which have us build the FCP Response
in system memory which the ISP will put onto the wire directly.
On the initiator side, the same problem occurs in that a command
status response only has a limited amount of space for SENSE DATA.
This data is supplemented by status continuation responses that
the ISP pushes onto the response queue after the status response.
We now pull them all together so that full sense data can be
returned to the periph driver.
This is supported on 23XX, 24XX and 25XX cards.
This is also preparation for doing >16 byte CDBs.
-----------
FC TAPE
Implement full FC-TAPE on both initiator and target mode side. This
capability is driven by firmware loaded, board type, board NVRAM
settings, or hint configuration options to enable or disable. This
is supported for 23XX, 24XX and 25XX cards.
On the initiator side, we pretty much just have to generate a command
reference number for each command we send out. This is FCP-4 compliant
in that we do this per ITL nexus to generate the allowed 1 thru 255
CRN.
In order to support the target side of FC-TAPE, we now pay attention
to more of the PRLI word 3 parameters which will tell us whether
an initiator wants confirmed responses. While we're at it, we'll
pay attention to the initiator view too and report it.
On sending back CTIOs, we will notice whether the initiator wants
confirmed responses and we'll set up flags to do so.
If a response or data frame is lost the initiator sends us an SRR
(Sequence Retransmit Request) ELS which shows up as an SRR notify
and all outstanding CTIOs are nuked with SRR Received status. The
SRR notify contains the offset that the initiator wants us to restart
the data transfer from or to retransmit the response frame.
If the ISP driver still has the CCB around for which the data segment
or response applies, it will retransmit.
However, we typically don't know about a lost data frame until we
send the FCP Response and the initiator totes up counters for data
moved and notices missing segments. In this case we've already
completed the data CCBs already and sent themn back up to the periph
driver. Because there's no really clean mechanism yet in CAM to
handle this, a hack has been put into place to complete the CTIO
CCB with the CAM_MESSAGE_RECV status which will have a MODIFY DATA
POINTER extended message in it. The internal ISP target groks this
and ctl(8) will be modified to deal with this as well.
At any rate, the data is retransmitted and an an FCP response is
sent. The whole point here is to successfully complete a command
so that you don't have to depend on ULP (SCSI) to have to recover,
which in the case of tape is not really possible (hence the name
FC-TAPE).
Sponsored by: Spectralogic
MFC after: 1 month
Modify MIPS page table entry (PTE) initialisation so that cachability
bits are set only once, using is_cacheable_mem() to determine what
caching properties are required, rather than also unconditionally
setting PTE_C_CACHE in init_pte_prot(). As PTE_C_CACHE |
PTE_C_UNCACHED == PTE_C_CACHE, this meant that all userspace memory
mappings of device memory (incorrectly) used caching TLB entries.
This is arguably not quite what we want, even though it is (more)
consistent with the MIPS pmap design: PTE caching properties should
be derived from machine-independent page table attributes, but this
is a substantially more complex change as the MIPS pmap doesn't yet
know about page attributes, causing it to ignore requests by device
drivers that want uncached userspace memory mappings as they
describe memory-mapped FIFOs or shared memory with a device not
participating in the cache coherence scheme.
This fixes cacheability issues (specifically, undesired and
unrequested caching) seen in userspace memory mappings of Avalon SoC
bus device memory on BERI MIPS.
Discussed with: jmallett, alc
Sponsored by: DARPA, AFRL
MFC after: 3 days
The existing method for testing for MRR is to call the "SetupXTXDesc"
HAL method and see if it returns AH_TRUE or AH_FALSE. This capability
explicitly lists what number of multi-rate attempts are possible.
"1" means "one rate attempt supported".
all diskN aliases for providers (which more or less corresponds to how the
x86 version behaves) but instead probe only those listed in the boot-device
OFW environment variable. This has the following advantages:
- avoids otherwise unavoidable OFW warnings about failures to open disks
for which aliases exist but no actual hardware is connected
- avoids issues due to different diskN naming schemes
- aligns us with Solaris
MFC after: 3 days
* shuffle things around so things fall on natural padding boundaries;
* add a couple of new flags to specify LDPC and whether to switch to the
low power RX chain configuration after this TX has completed.
Obtained from: Qualcomm Atheros
Specifically, however:
* AR9280 and later support 1-stream STBC RX;
* AR9280 and AR9287 support 1-stream STBC TX.
The STBC support isn't announced (yet) via net80211 and it isn't at all
chosen by the rate control code, so there's no real consumer of this
yet.
Obtained from: Qualcomm Atheros
(future) TPC support in the AR9300 HAL.
This is effectively a no-op for the moment as (a) TPC isn't really
supported, (b) the AR9300 HAL isn't yet public, and (c) the existing
HAL code doesn't use these fields.
Obtained from: Qualcomm Atheros
buffers.
ath_descdma is now being used for things other than the classical
combination of ath_buf + ath_desc allocations. In this particular case,
don't try to free and blank out the ath_buf list if it's not passed in.
of buffers, only the number of descriptors.
This involves:
* Change the allocation function to not use nbuf at all;
* When calling it, pass in "nbuf * ndesc" to correctly update how many
descriptors are being allocated.
Whilst here, fix the descriptor allocation code to correctly allocate
a larger buffer size if the Merlin 4KB WAR is required. It overallocates
descriptors when allocating a block that doesn't ever have a 4KB boundary
being crossed, but that can be fixed at a later stage.
off da0s1a instead of ate0. Note that MMC/SD is slot B. Until I
switch over to NAND boot, dataflash booting will preclude having SD
cards inserted at boot, so this last bit is untested.
My SAM9260-EK not boots to multi-user prompt.
http://info.iet.unipi.it/~luigi/vale/
VALE lets you dynamically instantiate multiple software bridges
that talk the netmap API (and are *extremely* fast), so you can test
netmap applications without the need for high end hardware.
This is particularly useful as I am completing a netmap-aware
version of ipfw, and VALE provides an excellent testing platform.
Also, I also have netmap backends for qemu mostly ready for commit
to the port, and this too will let you interconnect virtual machines
at high speed without fiddling with bridges, tap or other slow solutions.
The API for applications is unchanged, so you can use the code
in tools/tools/netmap (which i will update soon) on the VALE ports.
This commit also syncs the code with the one in my internal repository,
so you will see some conditional code for other platforms.
The code should run mostly unmodified on stable/9 so people interested
in trying it can just copy sys/dev/netmap/ and sys/net/netmap*.h
from HEAD
VALE is joint work with my colleague Giuseppe Lettieri, and
is partly supported by the EU Projects CHANGE and OPENLAB
subdevice ahciem. Emulate SEMB SES device from AHCI LED interface to expose
it to users in form of ses(4) CAM device. If we ever see AHCI controllers
supporting SES of SAF-TE over I2C as described by specification, they should
fit well into this new picture.
Sponsored by: iXsystems, Inc.
list of supported devices with the union of:
NetBSD src/sys/dev/usb/uslsa.c 1.18
OpenBSD src/sys/dev/usb/uslcom.c 1.24
Linux source/drivers/usb/serial/cp210x.c HEAD
Remove duplicate JABLOTRON PC60B entry.
Note that some of the devices added here are multi-port devices. The
uslcom(4) driver currently only supports the first port on such devices.
Update the man page to reflect the full list of supported devices.
Remove two caveats from the CAVEATS section, as both listed caveats no
longer apply. Add a caveat about multi-port devices.
MFC after: 2 weeks
often modified directory created symbolic links points to - it cause
unnecessary full rebuilds each time make runs when directory is changed.
So do it only if symbolic link does not exists, which usually means that
objdir is clean anyway.
MFC after: 1 week
them alone.
Process the act_count updates for the held pages in the vm_pageout
loop over the inactive queue, instead of refusing to do anything with
such page.
Clarify the intent of the addl_page_shortage counter and change its
use for pages which are not processed in the loop according to the
description.
Reviewed by: alc
MFC after: 2 weeks
Update i2c devices to just include the eeprom.
Update dataflash chip select to be CS 1 (this doesn't work yet and
needs changes to at91_spi and the spibus infrastructure).
Fix typo in comment.
- The USLCOM_SET_BAUD_DIV command (0x01)
- The USLCOM_SET_BAUD_RATE command (0x13)
Devices based on the CP1204 will only accept the latter command, and ignore
the former. As the latter command works on all chips that this driver
supports, switch to always using it.
A slight confusion here is that the previously used command was incorrectly
named USLCOM_BAUD_RATE - even though we no longer use it, rename it to
USLCOM_SET_BAUD_DIV to closer match the name used in the datasheet.
This change reflects a similar change made in the Linux driver, which was
submitted by preston.fick at silabs.com, and has been tested on all of the
uslcom(4) devices I have to hand.
MFC after: 2 weeks
but not for IPv6. The current checks in nd6_nbr.c along with the
old version will result in ifa being NULL and subsequently the
packet will be dropped. This prevented NS/NA, from working and
with that IPv6.
Now return the ifa from the carp lookup function in two cases:
1) if the address matches, is a carp address, and we are MASTER
(as before),
2) if the address matches but it is not a carp address at all (new).
Reported by: Peter Wemm (new Y! FreeBSD cluster, eating our own dogfood)
Tested on: New Y! FreeBSD cluster machines
Reviewed by: glebius
one device (support for Motorola cables), this syncronises us with:
OpenBSD src/sys/dev/usb/uplcom.c 1.56
NetBSD src/sys/dev/usb/uplcom.c 1.73
Linux kernel.org HEAD
MFC after: 1 week
when used in qemu (and this driver is for non-PCIe cards,
so probably its largest use is in virtualized environments).
Approved by: Jack Vogel
MFC after: 3 days
Intel Architecture Manual specifies that rdtsc instruction is not serialized,
so without this change, TSC synchronization test would periodically fail,
resulting in use of HPET timecounter instead of TSC-low. This caused
severe performance degradation (40-50%) when running high IO/s workloads due to
HPET MMIO reads and GEOM stat collection.
Tests on Xeon E5-2600 (Sandy Bridge) 8C systems were seeing TSC synchronization
fail approximately 20% of the time.
Sponsored by: Intel
Reviewed by: kib
MFC after: 3 days
including the include directive.
Fix minor build issue corrected by converting yypush_buffer_state and
yypop_buffer_state to yy_set_buffer_state and a hard-coded 100-deep
stack. It was easier to fix it here than to import that support into
our flex.
The new tools and test hardness remain unsupported at the moment.
Just free inclomplete daemon cache instead to let it retry next time.
Premature ses_softc_cleanup() caused NULL dereference when freed softc
was accessed later.
The AR9300 and later descriptors are 128 bytes, however I'd like to make
sure that isn't used for earlier chips.
* Populate the TX descriptor length field in the softc with
sizeof(ath_desc)
* Use this field when allocating the TX descriptors
* Pre-AR93xx TX/RX descriptors will use the ath_desc size; newer ones will
query the HAL for these sizes.
The function keys on a Microsoft Natural Egronomic Keyboard 4000 have been
repurposed as "Help", "Undo", "Redo" etc., and a special "F Lock" key is
required to return them to their normal purpose.
This change enables the UQ_KBD_BOOTPROTO quirk for the MS Natural 4000
keyboard to get the keys working again. More extensive changes to the USB
keyboard infrastructure would be needed to fully support the "F Lock" mode
and the extended keys on this keyboard.
PR: usb/116947
Approved by: hselasky@
* Introduce TX DMA setup/teardown methods, mirroring what's done in
the RX path.
Although the TX DMA descriptor is setup via ath_desc_alloc() /
ath_desc_free(), there TX status descriptor ring will be allocated
in this path.
* Remove some of the TX EDMA capability probing from the RX path and
push it into the new TX EDMA path.
sized TX descriptor.
This is required for the AR93xx EDMA support which requires 128 byte
TX descriptors (which is significantly larger than the earlier
hardware.)
compatibility definitions are only needed for implementation of the
syscalls, they cause namespace pollution and are not useful for
applications.
Noted by: bde
MFC after: 1 week
SAs. For now allow same address family bundles. While discovered with
ESP and AH, which does not make a lot of sense, IPcomp could be a possible
problematic candidate.
PR: kern/164400
MFC after: 3 days
stopped threads. Implementation assumes that the thread's FPU context
is spilled into the PCB due to stop. This is mostly true, except when
FPU state for the thread is not initialized. Then the requests operate
on the garbage state which is currently left in the PCB, causing
confusion.
The situation is indeed observed after a signal delivery and before
#NM fault on execution of any FPU instruction in the signal handler,
since sendsig(9) drops FPU state for current thread, clearing
PCB_FPUINITDONE. When inspecting context state for the signal handler,
debugger sees the FPU state of the main program context instead of the
clear state supposed to be provided to handler.
Fix this by forcing clean FPU state in PCB user FPU save area by
performing getfpuregs(9) before accessing user FPU save area in
ptrace_machdep.c.
Note: this change will be merged to i386 kernel as well, where it is
much more important, since e.g. gdb on i386 uses PT_I386_GETXMMREGS to
inspect FPU context on CPUs that support SSE. Amd64 version of gdb
uses PT_GETFPREGS to inspect both 64 and 32 bit processes, which does
not exhibit the bug.
Reported by: bde
MFC after: 1 week
understands FPU hardware enough to catch SIGFPE and unmask exceptions
in control word, then it may as well properly handle return from
SIGFPE without causing an infinite loop of #MF exceptions due to
faulting instruction restart, when needed.
Clearing exceptions causes information loss for handlers which do
understand FPU hardware, and struct siginfo si_code member cannot be
considered adequate replacement for en_sw content due to translation.
Supposed reason for clearing the exceptions, which is IRQ13 handling
oddities, were never applicable to amd64.
Note: this change will be merged to i386 kernel as well, since we do
not support IRQ13 delivery of #MF notifications for some time.
Requested by: bde
MFC after: 1 week
Append '__' prefix to the tag of struct oflock, and put it under BSD
namespace. Structure is needed both by libc and kernel, thus cannot be
hidden under #ifdef _KERNEL.
Move a set of non-standard F_* and O_* constants into BSD namespace.
SUSv4 explicitely allows implemenation to pollute F_* and O_* names
after fcntl.h is included, but it costs us nothing to adhere
to the specification if exact POSIX compliance level is requested by
user code.
Change some spaces after #define to tabs.
Noted by and discussed with: bde
MFC after: 1 week
For now, the only module implement is 'sample', and that's only partially
implemented. The main issue here with reusing this structure in userland
is that it uses 'rix' everywhere, which requires the userland code to
have access to the current HAL rate table.
For now, this is a very large work in progress.
Specific details:
* The rate control information is per-node at the moment and wrapped
in a TLV, to ease parsing and backwards compatibility.
* .. but so I can be slack for now, the userland statistics are just
a copy of the kernel-land sample node state.
* However, for now use a temporary copy and change the rix entries
to dot11rate entries to make it slightly easier to eyeball.
Problems:
* The actual rate information table is unfortunately indexed by rix
and it doesn't contain a rate code. So the userland side of this
currently has no way to extract out a mapping.
TODO:
* Add a TLV payload to dump out the rate control table mapping so
'rix' can be turned into a dot11 / MCS rate.
* .. then remove the temporary copy.
amd64. It is implemented as __pure2 inline with non-volatile asm read
from pcpu, which allows a compiler to cache its results.
Convert most PCPU_GET(pcb) and curthread->td_pcb accesses into curpcb.
Note that __curthread() uses magic value 0 as an offsetof(struct pcpu,
pc_curthread). It seems to be done this way due to machine/pcpu.h
needs to be processed before sys/pcpu.h, because machine/pcpu.h
contributes machine-depended fields to the struct pcpu definition. As
result, machine/pcpu.h cannot use struct pcpu yet.
The __curpcb() also uses a magic constant instead of offsetof(struct
pcpu, pc_curpcb) for the same reason. The constants are now defined as
symbols and CTASSERTs are added to ensure that future KBI changes do
not break the code.
Requested and reviewed by: bde
MFC after: 3 weeks
data introduced in r236061. Using that flag doesn't make that much
sense on this case as the DMA maps using it are also created during
sym_pci_attach(). Moreover, due to the maxsegsz parameter used, doing
so may exhaust the bounce pages pool on architectures requiring
bounce pages. [1]
While at it, use a slightly more appropriate maxsegsz parameter.
PR: 169526
Submitted by: Mike Watters [1]
MFC after: 3 days
ELF parser. Specifically, do not allow note reader and interpreter
path comparision in the brandelf code to read past end of the page.
This may happen if specially crafter ELF image is activated.
Submitted by: Lukasz Wojcik <lukasz.wojcik zoho com>
MFC after: 3 days
TX descriptor link pointers.
This is required for the AR93xx and later chipsets.
The RX path is slightly different - the legacy RX path directly
accesses ath_desc->ds_link for now, however this isn't at all done
for EDMA (FIFO) RX.
Now, for those performing a little software archeology here:
This is all a bit sub-optimal. "struct ath_desc" is only really relevant
for the pre-AR93xx NICs - where ds_link and ds_data is always in the
same location.
The AR93xx and later NICs have different descriptor layouts altogether.
Now, for AR93xx and later NICs, you should never directly reference
ds_link and ds_data, as:
* the RX descriptors don't have either - the data is _after_ the RX
descriptor. They're just one large buffer. There's also no need for
a per-descriptor RX buffer size as they're all fixed sizes.
* the TX descriptors have 4 buffer and 4 length fields _and_ a link
pointer. Each frame takes up one TX FIFO pointer, but it can contain
multiple subframes (either multiple frames in a buffer, and/or
multiple frames in an aggregate/RIFS burst.)
* .. so, when TX frames are queued to a hardware queue, the link
pointer is ONLY for buffers in that frame/aggregate. The next frame
starts in a new FIFO pointer.
* Finally, descriptor completion status is in a different ring.
I'll write something up about that when its time to do so.
This was inspired by Linux ath9k and the reference driver but is a
reimplementation.
Obtained from: Linux ath9k, Qualcomm Atheros
The DMA FIFO chips (AR93xx and later) differ slightly to th elegacy
chips:
* The RX DMA descriptors don't have a ds_link field;
* The TX DMA descriptors have a ds_link field however at a different
offset.
This is a reimplementation based on what the reference driver and ath9k
does.
A subsequent commit will enable it in the TX and beacon paths.
Obtained from: Linux ath9k, Qualcomm Atheros
occurs using the SSE math processor. Update comments describing the
handling of the exception status bits in coprocessors control words.
Remove GET_FPU_CW and GET_FPU_SW macros which were used only once.
Prefer to use curpcb to access pcb_save over the longer path of
referencing pcb through the thread structure.
Based on the submission by: Ed Alley <wea llnl gov>
PR: amd64/169927
Reviewed by: bde
MFC after: 3 weeks
FDT-enabled targets were broken after r238043 that relies
on device up the hierarchy to properly setup interrupt.
nexus device for ARM platforms did job only partially:
setting handler but not unmasking interrupt. Unmasking
was performed by platform code.
Reviewed by: andrew@
it can't sleep, it can still move clean pages from the inactive queue to
the cache. Also, when a page is cached, there is no need to restart the
scan. The "next" page pointer held by vm_contig_launder() is still
valid. Finally, add a comment summarizing what vm_contig_grow_cache()
does based upon the value of "tries".
MFC after: 3 weeks
tcp_mtudisc(), which in its turn may call tcp_output(). Under certain
conditions (must admit they are very special) an infinite recursion can
happen.
To avoid recursion we can pass struct route to ip_output() and obtain
correct mtu. This allows us not to use tcp_mtudisc() but call tcp_mss_update()
directly.
PR: kern/155585
Submitted by: Andrey Zonov <andrey zonov.org> (original version of patch)
The AR9003 series NICs implement a separate RX error to signal that a
Keycache miss occured. The earlier NICs would not set the key index
valid bit.
I'll dig into the difference between "no key index bit set" and "keycache
miss".
VM_KMEM_MAX_SIZE.
The code was not taking into account the size of the kernel_map, which
the kmem_map is allocated from, so it could produce a sub-map size too
large to fit. The simplest solution is to ignore VM_KMEM_MAX entirely
and base the memguard map's size off the kernel_map's size, since this
is always relevant and always smaller.
Found by: Justin Hibbits
This behavior is recommended by RFC 4213 clause 3.2.
Sometimes fragmentation is the least evil.
For example, some Linux IPVS kernels forwards
ICMPv6 checksums to real servers incorrectly.
Reviewed by: hrs(previous version)
Approved by: kib(mentor)
MFC after: 1 week
mostly meets the guidelines set by the Intel SDM:
1. We use XRSTOR and XSAVE from the same CPL using the same linear
address for the store area
2. Contrary to the recommendations, we cannot zero the FPU save area
for a new thread, since fork semantic requires the copy of the
previous state. This advice seemingly contradicts to the advice
from the item 6.
3. We do use XSAVEOPT in the context switch code only, and the area
for XSAVEOPT already always contains the data saved by XSAVE.
4. We do not modify the save area between XRSTOR, when the area is
loaded into FPU context, and XSAVE. We always spit the fpu context
into save area and start emulation when directly writing into FPU
context.
5. We do not use segmented addressing to access save area, or rather,
always address it using %ds basing.
6. XSAVEOPT can be only executed in the area which was previously
loaded with XRSTOR, since context switch code checks for FPU use by
outgoing thread before saving, and thread which stopped emulation
forcibly get context loaded with XRSTOR.
7. The PCB cannot be paged out while FPU emulation is turned off, since
stack of the executing thread is never swapped out.
The context switch code is patched to issue XSAVEOPT instead of XSAVE
if supported. This approach eliminates one conditional in the context
switch code, which would be needed otherwise.
For user-visible machine context to have proper data, fpugetregs()
checks for unsaved extension blocks and manually copies pristine FPU
state into them, according to the description provided by CPUID leaf
0xd.
MFC after: 1 month
* wrap the RX proc calls in the RX refcount;
* call the DFS checking, fast frames staging and TX rescheduling if
required.
TODO:
* figure out if I can just make "do TX rescheduling" mean "schedule
TX taskqueue" ?
with fresh descriptors, before handling the frames.
Wrap it all in the RX locks.
Since the FIFO is very shallow (16 for HP, 128 for LP) it needs to be
drained and replenished very quickly. Ideally, I'll eventually move this
RX FIFO drain/fill into the interrupt handler, only deferring the actual
frame completion.
I was setting up the RX EDMA buffer to be 4096 bytes rather than the
RX data buffer portion. The hardware was likely getting very confused
and DMAing descriptor portions into places it shouldn't, leading to
memory corruption and occasional panics.
Whilst here, don't bother allocating descriptors for the RX EDMA case.
We don't use those descriptors. Instead, just allocate ath_buf entries.
adds an extra tick to account for the current partial clock tick. However,
that is not appropriate for a repeating timer when the exact tvtohz() value
should be used for subsequent intervals. Fix repeating callouts for
EVFILT_TIMER by subtracting 1 tick from the tvtohz() result similar to the
fix used in realitexpire() for interval timers.
While here, update a few comments to note that if the EVFILT_TIMER code
were to move out of kern_event.c, it should move to kern_time.c (where the
interval timer code it mimics lives) rather than kern_timeout.c.
MFC after: 1 month
... from a user-set persistent limit on the said level.
Allow to set the user-imposed limit below current deepest available level
as the available levels may be dynamically changed by ACPI platform
in both directions.
Allow "Cmax" as an input value for cx_lowest sysctls to mean that there
is not limit and OS can use all available C-states.
Retire global cpu_cx_count as it no longer serves any meaningful
purpose.
Reviewed by: jhb, gianni, sbruno
Tested by: sbruno, Vitaly Magerya <vmagerya@gmail.com>
MFC after: 2 weeks
the linker set of CPU modules. The newbus method, although clever,
had many flaws: it didn't really support multiple SoC, many of the
comments about order were just wrong, and it did a few things far too
late to be useful. delay and cpu_reset now work much earlier in the
boot process.
Renamed the kern.cam.ada.ada_send_ordered sysctl and tunable to
kern.cam.ada.send_ordered, more in line with the other da sysctls/tunables.
Suggested by: kib
kern.cam.da.send_ordered, more in line with the other da sysctls/tunables.
PR: 169765
Submitted by: Steven Hartland <steven.hartland@multiplay.co.uk>
Reviewed by: mav
having the CPU device that's a child of atmelarm that does stuff.
o Create a linker_set for the support fucntions for the SoCs.
o Rename soc_data to soc_info.
o Move the delay and reset function pointers to new soc_data struct
o Create elements for all known SoCs
o Add lookup of the SoC we found, and print a warning if it isn't one
we know about.
the upper levels notice. Otherwise we see commands silently failing leading
to data corruption. This mirrors dadone()
Submitted by: Andrew Boyer aboyer@averesystems.com
Reviewed by: scottl@freebsd.org
MFC after: 2 weeks
These probes are most useful when looking into the structures
they provide, which are listed in io.d. For example:
dtrace -n 'io:genunix::start { printf("%d\n", args[0]->bio_bcount); }'
Note that the I/O systems in FreeBSD and Solaris/Illumos are sufficiently
different that there is not a 1:1 mapping from scripts that work
with one to the other.
MFC after: 1 month
PCI:
- Properly handle interrupt fallback from MSIX to MSI to legacy.
The host may not have sufficient resources to support MSIX,
so we must be able to fallback to legacy interrupts.
- Add interface to get the (sub) vendor and device IDs.
- Rename flags to VTPCI_FLAG_* like other VirtIO drivers.
Block:
- No longer allocate vtblk_requests from separate UMA zone.
malloc(9) from M_DEVBUF is sufficient. Assert segment counts
at allocation.
- More verbose error and debug messages.
Network:
- Remove stray write once variable.
Virtqueue:
- Shuffle code around in preparation of converting the mb()s to
the appropriate atomic(9) operations.
- Only walk the descriptor chain when freeing if INVARIANTS is
defined since the result is only KASSERT()ed.
Submitted by: Bryan Venteicher (bryanv@daemoninthecloset.org)
If an error occurs when transmitting one mbuf in a chain of fragments,
free the subsequent fragments instead of leaking them.
Sponsored by: ADARA Networks
the FIFO.
I still see some corner cases where no RX occurs when it should be
occuring. It's quite possible that there's a subtle race condition
somewhere; or maybe I'm not programming the RX queues right.
There's also no locking here yet, so any reset/configuration path
state change (ie, enabling/disabling receive from the ioctl, net80211
taskqueue, etc) could quite possibly confuse things.
* For now, kickpcu should hopefully just do nothing - the PCU doesn't need
'kicking' for Osprey and later NICs. The PCU will just restart once
the next FIFO entry is pushed in.
* Teach "proc" about "dosched", so it can be used to just flush the
FIFO contents without adding new FIFO entries.
* .. and now, implement the RX "flush" routine.
* Re-initialise the FIFO contents if the FIFO is empty (the DP is NULL.)
When PCU RX is disabled (ie, writing RX_D to the RX configuration
register) then the FIFO will be completely emptied. If the software FIFO
is full, then no further descriptors are pushed into the FIFO and
things stall.
This all requires much, much more thorough stress testing.
debugger exited without calling ptrace(PT_DETACH), there is a time window
that the p_xthread may be pointing to non-existing thread, in practical,
this is not a problem because child process soon will be killed by parent
process.
This is inspired by ath9k and the reference driver, but it's a new
implementation of the RX FIFO handling.
This has some issues - notably the FIFO needs to be reprogrammed when
the chip is reset.
* Add a couple of RX errors;
* Add the spectral scan PHY error code;
* extend the RX flags to be a 16 bit field, rather than an 8 bit field;
* Add a new RX flag.
Obtained from: Qualcomm Atheros
on x86 and use that to implement stop_emulating() in the fpu/npx code.
Reimplement start_emulating() in the non-XEN case by using load_cr0() and
rcr0() instead of the 'lmsw' and 'smsw' instructions. Intel explicitly
discourages the use of 'lmsw' and 'smsw' on 80386 and later processors in
the description of these instructions in Volume 2 of the ADM.
Reviewed by: kib
MFC after: 1 month
switch to its vnet before calling ether_ifdetach(). Otherwise if the
second half resides in a different vnet, if_detach() silently fails
leaving a stale pointer in V_ifnet list, and the system crashes trying
to access this pointer later.
Another solution could be not to allow to destroy epair unless both
ends are in the home vnet.
Discussed with: bz
Tested by: delphij
The last 12 bits of the limit registers have to be set to 1. These
bits are not significant in bridge BARs and are 0 on read, but the
bits are valid in the swap limit register and needs to be set.
to attach to the process, it is surprising that the process is resumed
without inputting any gdb commands, however ptrace manual said:
The tracing process will see the newly-traced process stop and may
then control it as if it had been traced all along.
But the current code does not work in this way, unless traced process
received a signal later, it will continue to run as a background task.
To fix this problem, just send signal SIGSTOP to the traced process after
we resumed it, this works like that you are attaching to a running process,
it is not perfect but better than nothing.
The AR93xx and later chips support two RX FIFO queues - a high and low
priority queue.
For legacy chips, just assume the queues are high priority.
This is inspired by the reference driver but is a reimplementation of
the API and code.
AR93xx receive descriptors.
This isn't entirely complete - the AR93xx and later descriptors
don't have a link/buffer pointer; the descriptor contents just
start.
implementation of RFC 3484 for this purpose for a long time and "prefer_source"
was never implemented actually. ND6_IFF_PREFER_SOURCE macro is left intact.
Pass only FEXEC (instead of FREAD|FEXEC) in fgetvp_exec. _fget has to check for
!FWRITE anyway and may as well know about FREAD.
Make _fget code a bit more readable by converting permission checking from if()
to switch(). Assert that correct permission flags are passed.
In collaboration with: kib
Approved by: trasz (mentor)
MFC after: 6 days
X-MFC: with r238220
up and running to service interrupts. This is especially important
when the firmware has bound interrupts to CPUs, like for the SGI
Altix 350. We wake up APs at SI_SUB_CPU time and they sit and spin
until we unleash them, so there's nothing fundamentally different
from a MD perspective.
While here return EBADF for descriptors opened for writing (previously it was ETXTBSY).
Add fgetvp_exec function which performs appropriate checks.
PR: kern/169651
In collaboration with: kib
Approved by: trasz (mentor)
MFC after: 1 week
setting in the igb and em driver. This was necessitated by
a shared code change that I was given late in the game, a data
type changed from bool to int, in the last update I dealt with
it by a cast, but it was pointed out (thanks jhb) that there
was a potential problem with this. John suggested this safer
approach, and it is fine with me...
MFC after:2 days (to catch the 9.1 update)
inactive queue, unless busy page is found.
Dropping the mutex often should allow the other lock acquires to
proceed without waiting for whole inactive scan to finish. On machines
with lot of physical memory scan often need to iterate a lot before it
finishes or finds a page which requires laundring, causing high
latency for other lock waiters.
Suggested and reviewed by: alc
MFC after: 3 weeks
is an error set on the provider. With GEOM resizing, class can become
orphaned when it doesn't implement resize() method and the provider size
decreases.
Reviewed by: mav
Sponsored by: FreeBSD Foundation
although by default only C1 is enabled (cx_lowest=0) and enabling deeper
states goes through acpi_cpu_set_cx_lowest which re-evaluates cpu_non_c3
MFC after: 2 weeks
cpu_non_c3 is already evaluated in acpi_cpu_cx_cst and in
acpi_cpu_set_cx_lowest.
Besides acpi_cpu_cx_list is not protected by any locking.
As a result also move setting of cpu_can_deep_sleep to more appropriate
places.
MFC after: 2 weeks
before VM has been initialized. This includes:
1. Replacing pmap_steal_memory(),
2. Replace the handcrafted logic to allocate a naturally aligned VHPT,
3. Properly allocate the DPCPU for the BSP.
Ad 3: Appending the DPCPU to kernend worked as long as we wouldn't
cross into the next PBVM page. If we were to cross into the next
page, then there wouldn't be a PTE entry on the page table for it
and we would end up with a MCA following a page fault. As such,
this commit fixes MCAs occasionally seen.
arm platform. Add all the atmel boards to the ATMEL kernel for
testing purposes. Until boot loader arg parsing of baord type
is done, this won't actually be able to do the runtime selection.
100 bytes from the binary with silly tricks. Hope to get this small
enough to run on the models that have 4k SRAM. We are close compiled
for the at91rm9200, but still need to trim for the target.
correctly. We now iterate the EFI memory descriptors once and collect all
the information in a single pass. This includes:
1. The I/O port base address,
2. The PAL memory region. Have the physmem API track this.
3. Memory descriptors of memory we can't use, like bad memory, runtime
services code & data, etc. Have the physmem API track these.
4. memory descriptors of memory we can use or re-use, such as free
memory, boot time services code & data, loader code & data, etc.
These are added by the physmem API.
Since the PBVM page table and pages are in memory described as loader
data, inform the physmem API of chunks that need to be delated from the
available physical memory.
While here, remove Maxmem and replace it with the better named paddr_max.
Maxmem was defined as physmem, which is generally wrong. Now, paddr_max
is properly defined as the largesty physical address.
The upshot of all this is that:
1. We properly determine realmem.
2. We maximize physmem by re-using memory where possible.
3. We remove complexity from ia64_init() in machdep.c.
4. Remove confusion about realmem, physmem & Maxmem.
The new ia64_physmem_alloc() is to replace pmap_steal_memory() in pmap.c,
as well as replace the handcrafted allocation of the VHPT for the BSP in
pmap_bootstrap() in pmap.c. This is step 2 and addresses the manipulation
of phys_avail after it is being created.
Don't use Maxmem when the amount of memory is meant. Use realmem instead.
Maxmem is not only a MD variable, it represents the highest physical memory
address in use. On systems where memory is sparsely layed-out the highest
memory address and the amount of memory are not interchangeable. Scaling the
AGP aperture based on the actual amount of memory (= realmem) rather than
the available memory (= physmem) makes sure there's consistent behaviour
across architectures.
agp_i810.c:
While arguably the use of Maxmem can be considered correct, replace its use
with realmem anyway. agp_i810.c is specific to amd64, i386 & pc98, which
have a dense physical memory layout. Avoiding Maxmem here is done with an
eye on copy-n-paste behaviour in general and to avoid confusion caused by
using realmem in agp.c and Maxmem in agp_i810.c.
In both cases, remove the inclusion of md_var.h
Update DTrace disassembler accordingly. The code to treat the prefixes
as null prefixes was already in place.
Although in practice compilers seem to generate only cs-prefix for use
in long NOPs, the same treatment is applied to all of cs, ds, es, ss for
consistency.
Reported by: emaste
Tested by: emaste
Obtained from: Illumos commit 13442:4adbe6de60c8 (+ local changes)
MFC after: 5 days
- Add generic support for opcodes that are escape bytes used for
multi-byte opcodes (such as the 0x0f prefix). Use this to replace
the hard-coded 0x0f special case and add support for three-byte
opcodes that use the 0x0f38 prefix.
- Decode all Intel VMX instructions. invept and invvpid in particular are
three-byte opcodes that use the 0x0f38 escape prefix.
- Rework how the special 'SDEP' size flag works such that the default
instruction name (i_name) is the instruction when the data size
prefix (0x66) is not specified, and the alternate name in i_extra is
used when the prefix is included.
- Add a new 'ADEP' size flag similar to 'SDEP' except that it chooses
between i_name and i_extra based on the address size prefix (0x67).
Use this to fix the decoding for jrcxz vs jecxz which is determined
by the address size prefix, not the operand size prefix. Also, jcxz
is not possible in 64-bit mode, but jrcxz is the default instruction
for that opcode.
- Add support for handling instructions that have a mandatory 'rep'
prefix (this means not outputting the 'repe ' prefix until determining
if it is used as part of an opcode). Make 'pause' less of a special
case this way.
- Decode 'cmpxchg16b' and 'cdqe' which are variants of other instructions
but with a REX.W prefix.
MFC after: 1 month
"Reserved by Microsoft" in the standard PNP ID table, but has been seen
in the wild on at least one laptop.
PR: kern/169571
Submitted by: Matthias Apitz guru unixarea de
MFC after: 3 days
functions that manage PV entries. Specifically, remove the PV entry from
the containing PV list only after the corresponding PTE is destroyed.
Update the pmap's wired mapping count in pmap_enter() before the PV list
lock is acquired.
- Add a couple of new devices
- Flow control changes in shared and core code
- Bug fix to Flow Director for 82598
- Shared code sync to internal with required core change
Thanks to those helping in the testing and improvements to this driver!
MFC after:5 days