them alone.
Process the act_count updates for the held pages in the vm_pageout
loop over the inactive queue, instead of refusing to do anything with
such pages.
Clarify the intent of the addl_page_shortage counter and change its
use for pages which are not processed in the loop according to the
description.
Reviewed by: alc
MFC after: 2 weeks
Update i2c devices to just include the eeprom.
Update dataflash chip select to be CS 1 (this doesn't work yet and
needs changes to at91_spi and the spibus infrastructure).
Fix typo in comment.
- The USLCOM_SET_BAUD_DIV command (0x01)
- The USLCOM_SET_BAUD_RATE command (0x13)
Devices based on the CP1204 will only accept the latter command, and ignore
the former. As the latter command works on all chips that this driver
supports, switch to always using it.
A slight confusion here is that the previously used command was incorrectly
named USLCOM_BAUD_RATE - even though we no longer use it, rename it to
USLCOM_SET_BAUD_DIV to closer match the name used in the datasheet.
This change reflects a similar change made in the Linux driver, which was
submitted by preston.fick at silabs.com, and has been tested on all of the
uslcom(4) devices I have to hand.
MFC after: 2 weeks
but not for IPv6. The current checks in nd6_nbr.c along with the
old version will result in ifa being NULL and subsequently the
packet will be dropped. This prevented NS/NA from working, and
with that IPv6.
Now return the ifa from the carp lookup function in two cases:
1) if the address matches, is a carp address, and we are MASTER
(as before),
2) if the address matches but it is not a carp address at all (new).
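The two cases above can be sketched as a small predicate. This is only
an illustration: the struct, its fields and the function name are
hypothetical and do not mirror the actual carp code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for an interface address. */
struct sketch_ifa {
	bool is_carp;	/* address is managed by carp */
	bool is_master;	/* carp state is MASTER (only used if is_carp) */
};

/*
 * Return the ifa if the lookup should succeed, NULL otherwise.
 * 'matches' says whether the looked-up address compared equal.
 */
static const struct sketch_ifa *
sketch_carp_lookup(const struct sketch_ifa *ifa, bool matches)
{
	assert(ifa != NULL);
	if (!matches)
		return (NULL);
	if (ifa->is_carp)
		return (ifa->is_master ? ifa : NULL);	/* case 1 */
	return (ifa);					/* case 2 (new) */
}
```

A carp address in BACKUP state still yields NULL, as before; only the
non-carp match is new.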
Reported by: Peter Wemm (new Y! FreeBSD cluster, eating our own dogfood)
Tested on: New Y! FreeBSD cluster machines
Reviewed by: glebius
one device (support for Motorola cables), this synchronises us with:
OpenBSD src/sys/dev/usb/uplcom.c 1.56
NetBSD src/sys/dev/usb/uplcom.c 1.73
Linux kernel.org HEAD
MFC after: 1 week
when used in qemu (and this driver is for non-PCIe cards,
so probably its largest use is in virtualized environments).
Approved by: Jack Vogel
MFC after: 3 days
Intel Architecture Manual specifies that rdtsc instruction is not serialized,
so without this change, TSC synchronization test would periodically fail,
resulting in use of HPET timecounter instead of TSC-low. This caused
severe performance degradation (40-50%) when running high IO/s workloads due to
HPET MMIO reads and GEOM stat collection.
Tests on Xeon E5-2600 (Sandy Bridge) 8C systems were seeing TSC synchronization
fail approximately 20% of the time.
Sponsored by: Intel
Reviewed by: kib
MFC after: 3 days
including the include directive.
Fix minor build issue corrected by converting yypush_buffer_state and
yypop_buffer_state to yy_set_buffer_state and a hard-coded 100-deep
stack. It was easier to fix it here than to import that support into
our flex.
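A minimal sketch of such a fixed-depth stack, assuming the saved state
can be treated as an opaque pointer. All names here are hypothetical;
the caller would hand the popped value back to the scanner's own
buffer-switching routine.

```c
#include <assert.h>
#include <stddef.h>

#define SKETCH_STACK_DEPTH 100	/* the hard-coded depth mentioned above */

static void *sketch_stack[SKETCH_STACK_DEPTH];
static size_t sketch_depth;

/* Save a buffer; returns 0 on success, -1 if the stack is full. */
static int
sketch_push(void *buf)
{
	assert(buf != NULL);	/* NULL is reserved as the "empty" result */
	if (sketch_depth >= SKETCH_STACK_DEPTH)
		return (-1);
	sketch_stack[sketch_depth++] = buf;
	return (0);
}

/* Return the most recently saved buffer, or NULL if none is saved. */
static void *
sketch_pop(void)
{
	if (sketch_depth == 0)
		return (NULL);
	return (sketch_stack[--sketch_depth]);
}
```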
The new tools and test harness remain unsupported at the moment.
Just free incomplete daemon cache instead to let it retry next time.
Premature ses_softc_cleanup() caused NULL dereference when freed softc
was accessed later.
The AR9300 and later descriptors are 128 bytes, however I'd like to make
sure that isn't used for earlier chips.
* Populate the TX descriptor length field in the softc with
sizeof(ath_desc)
* Use this field when allocating the TX descriptors
* Pre-AR93xx TX/RX descriptors will use the ath_desc size; newer ones will
query the HAL for these sizes.
The function keys on a Microsoft Natural Ergonomic Keyboard 4000 have been
repurposed as "Help", "Undo", "Redo" etc., and a special "F Lock" key is
required to return them to their normal purpose.
This change enables the UQ_KBD_BOOTPROTO quirk for the MS Natural 4000
keyboard to get the keys working again. More extensive changes to the USB
keyboard infrastructure would be needed to fully support the "F Lock" mode
and the extended keys on this keyboard.
PR: usb/116947
Approved by: hselasky@
* Introduce TX DMA setup/teardown methods, mirroring what's done in
the RX path.
Although the TX DMA descriptor is set up via ath_desc_alloc() /
ath_desc_free(), the TX status descriptor ring will be allocated
in this path.
* Remove some of the TX EDMA capability probing from the RX path and
push it into the new TX EDMA path.
sized TX descriptor.
This is required for the AR93xx EDMA support which requires 128 byte
TX descriptors (which are significantly larger than those of the
earlier hardware).
compatibility definitions are only needed for implementation of the
syscalls, they cause namespace pollution and are not useful for
applications.
Noted by: bde
MFC after: 1 week
SAs. For now allow same address family bundles. While discovered with
ESP and AH, which does not make a lot of sense, IPcomp could be a possible
problematic candidate.
PR: kern/164400
MFC after: 3 days
stopped threads. Implementation assumes that the thread's FPU context
is spilled into the PCB due to stop. This is mostly true, except when
FPU state for the thread is not initialized. Then the requests operate
on the garbage state which is currently left in the PCB, causing
confusion.
The situation is indeed observed after a signal delivery and before
#NM fault on execution of any FPU instruction in the signal handler,
since sendsig(9) drops FPU state for current thread, clearing
PCB_FPUINITDONE. When inspecting context state for the signal handler,
debugger sees the FPU state of the main program context instead of the
clear state supposed to be provided to handler.
Fix this by forcing clean FPU state in PCB user FPU save area by
performing getfpuregs(9) before accessing user FPU save area in
ptrace_machdep.c.
Note: this change will be merged to i386 kernel as well, where it is
much more important, since e.g. gdb on i386 uses PT_I386_GETXMMREGS to
inspect FPU context on CPUs that support SSE. Amd64 version of gdb
uses PT_GETFPREGS to inspect both 64 and 32 bit processes, which does
not exhibit the bug.
Reported by: bde
MFC after: 1 week
understands FPU hardware enough to catch SIGFPE and unmask exceptions
in control word, then it may as well properly handle return from
SIGFPE without causing an infinite loop of #MF exceptions due to
faulting instruction restart, when needed.
Clearing exceptions causes information loss for handlers which do
understand FPU hardware, and struct siginfo si_code member cannot be
considered adequate replacement for en_sw content due to translation.
Supposed reason for clearing the exceptions, which is IRQ13 handling
oddities, were never applicable to amd64.
Note: this change will be merged to i386 kernel as well, since we do
not support IRQ13 delivery of #MF notifications for some time.
Requested by: bde
MFC after: 1 week
Add a '__' prefix to the tag of struct oflock, and put it under BSD
namespace. Structure is needed both by libc and kernel, thus cannot be
hidden under #ifdef _KERNEL.
Move a set of non-standard F_* and O_* constants into BSD namespace.
SUSv4 explicitly allows an implementation to pollute F_* and O_* names
after fcntl.h is included, but it costs us nothing to adhere
to the specification if exact POSIX compliance level is requested by
user code.
Change some spaces after #define to tabs.
Noted by and discussed with: bde
MFC after: 1 week
For now, the only module implemented is 'sample', and that's only partially
implemented. The main issue here with reusing this structure in userland
is that it uses 'rix' everywhere, which requires the userland code to
have access to the current HAL rate table.
For now, this is a very large work in progress.
Specific details:
* The rate control information is per-node at the moment and wrapped
in a TLV, to ease parsing and backwards compatibility.
* .. but so I can be slack for now, the userland statistics are just
a copy of the kernel-land sample node state.
* However, for now use a temporary copy and change the rix entries
to dot11rate entries to make it slightly easier to eyeball.
Problems:
* The actual rate information table is unfortunately indexed by rix
and it doesn't contain a rate code. So the userland side of this
currently has no way to extract out a mapping.
TODO:
* Add a TLV payload to dump out the rate control table mapping so
'rix' can be turned into a dot11 / MCS rate.
* .. then remove the temporary copy.
amd64. It is implemented as __pure2 inline with non-volatile asm read
from pcpu, which allows a compiler to cache its results.
Convert most PCPU_GET(pcb) and curthread->td_pcb accesses into curpcb.
Note that __curthread() uses magic value 0 as an offsetof(struct pcpu,
pc_curthread). It seems to be done this way because machine/pcpu.h
needs to be processed before sys/pcpu.h, since machine/pcpu.h
contributes machine-dependent fields to the struct pcpu definition. As
a result, machine/pcpu.h cannot use struct pcpu yet.
The __curpcb() also uses a magic constant instead of offsetof(struct
pcpu, pc_curpcb) for the same reason. The constants are now defined as
symbols and CTASSERTs are added to ensure that future KBI changes do
not break the code.
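The pattern of a named constant guarded by a compile-time assertion can
be sketched as follows. The struct layout and all names are hypothetical
(the real struct pcpu has many more fields), and C11 _Static_assert
stands in for CTASSERT.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical layout, for illustration only. */
struct sketch_pcpu {
	void *pc_curthread;
	void *pc_idlethread;
	void *pc_curpcb;
};

/* Constants usable before the struct definition is visible. */
#define SKETCH_OFFSET_CURTHREAD	0
#define SKETCH_OFFSET_CURPCB	(2 * sizeof(void *))

/* Fail the build if a layout change invalidates the constants. */
_Static_assert(offsetof(struct sketch_pcpu, pc_curthread) ==
    SKETCH_OFFSET_CURTHREAD, "pc_curthread offset changed");
_Static_assert(offsetof(struct sketch_pcpu, pc_curpcb) ==
    SKETCH_OFFSET_CURPCB, "pc_curpcb offset changed");

/* Read a pointer-sized field at a fixed byte offset from the base. */
static void *
sketch_read_field(const struct sketch_pcpu *pc, size_t off)
{
	assert(pc != NULL);
	return (*(void * const *)((const char *)pc + off));
}
```

If a field were inserted before pc_curpcb, the second assertion would
stop the build instead of letting the accessor read the wrong slot.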
Requested and reviewed by: bde
MFC after: 3 weeks
data introduced in r236061. Using that flag doesn't make that much
sense on this case as the DMA maps using it are also created during
sym_pci_attach(). Moreover, due to the maxsegsz parameter used, doing
so may exhaust the bounce pages pool on architectures requiring
bounce pages. [1]
While at it, use a slightly more appropriate maxsegsz parameter.
PR: 169526
Submitted by: Mike Watters [1]
MFC after: 3 days
ELF parser. Specifically, do not allow note reader and interpreter
path comparison in the brandelf code to read past end of the page.
This may happen if a specially crafted ELF image is activated.
Submitted by: Lukasz Wojcik <lukasz.wojcik zoho com>
MFC after: 3 days
TX descriptor link pointers.
This is required for the AR93xx and later chipsets.
The RX path is slightly different - the legacy RX path directly
accesses ath_desc->ds_link for now, however this isn't at all done
for EDMA (FIFO) RX.
Now, for those performing a little software archeology here:
This is all a bit sub-optimal. "struct ath_desc" is only really relevant
for the pre-AR93xx NICs - where ds_link and ds_data are always in the
same location.
The AR93xx and later NICs have different descriptor layouts altogether.
Now, for AR93xx and later NICs, you should never directly reference
ds_link and ds_data, as:
* the RX descriptors don't have either - the data is _after_ the RX
descriptor. They're just one large buffer. There's also no need for
a per-descriptor RX buffer size as they're all fixed sizes.
* the TX descriptors have 4 buffer and 4 length fields _and_ a link
pointer. Each frame takes up one TX FIFO pointer, but it can contain
multiple subframes (either multiple frames in a buffer, and/or
multiple frames in an aggregate/RIFS burst.)
* .. so, when TX frames are queued to a hardware queue, the link
pointer is ONLY for buffers in that frame/aggregate. The next frame
starts in a new FIFO pointer.
* Finally, descriptor completion status is in a different ring.
I'll write something up about that when it's time to do so.
This was inspired by Linux ath9k and the reference driver but is a
reimplementation.
Obtained from: Linux ath9k, Qualcomm Atheros
The DMA FIFO chips (AR93xx and later) differ slightly from the legacy
chips:
* The RX DMA descriptors don't have a ds_link field;
* The TX DMA descriptors have a ds_link field however at a different
offset.
This is a reimplementation based on what the reference driver and ath9k
do.
A subsequent commit will enable it in the TX and beacon paths.
Obtained from: Linux ath9k, Qualcomm Atheros
occurs using the SSE math processor. Update comments describing the
handling of the exception status bits in coprocessors control words.
Remove GET_FPU_CW and GET_FPU_SW macros which were used only once.
Prefer to use curpcb to access pcb_save over the longer path of
referencing pcb through the thread structure.
Based on the submission by: Ed Alley <wea llnl gov>
PR: amd64/169927
Reviewed by: bde
MFC after: 3 weeks
FDT-enabled targets were broken after r238043, which relies
on a device up the hierarchy to properly set up the interrupt.
The nexus device for ARM platforms did the job only partially:
it set the handler but did not unmask the interrupt. Unmasking
was performed by platform code.
Reviewed by: andrew@
it can't sleep, it can still move clean pages from the inactive queue to
the cache. Also, when a page is cached, there is no need to restart the
scan. The "next" page pointer held by vm_contig_launder() is still
valid. Finally, add a comment summarizing what vm_contig_grow_cache()
does based upon the value of "tries".
MFC after: 3 weeks
tcp_mtudisc(), which in its turn may call tcp_output(). Under certain
conditions (must admit they are very special) an infinite recursion can
happen.
To avoid recursion we can pass struct route to ip_output() and obtain
correct mtu. This allows us not to use tcp_mtudisc() but call tcp_mss_update()
directly.
PR: kern/155585
Submitted by: Andrey Zonov <andrey zonov.org> (original version of patch)
The AR9003 series NICs implement a separate RX error to signal that a
Keycache miss occurred. The earlier NICs would not set the key index
valid bit.
I'll dig into the difference between "no key index bit set" and "keycache
miss".
VM_KMEM_MAX_SIZE.
The code was not taking into account the size of the kernel_map, which
the kmem_map is allocated from, so it could produce a sub-map size too
large to fit. The simplest solution is to ignore VM_KMEM_MAX entirely
and base the memguard map's size off the kernel_map's size, since this
is always relevant and always smaller.
Found by: Justin Hibbits
This behavior is recommended by RFC 4213 clause 3.2.
Sometimes fragmentation is the least evil.
For example, some Linux IPVS kernels forward
ICMPv6 checksums to real servers incorrectly.
Reviewed by: hrs (previous version)
Approved by: kib (mentor)
MFC after: 1 week
mostly meets the guidelines set by the Intel SDM:
1. We use XRSTOR and XSAVE from the same CPL using the same linear
address for the store area
2. Contrary to the recommendations, we cannot zero the FPU save area
for a new thread, since fork semantic requires the copy of the
previous state. This advice seemingly contradicts the advice
in item 6.
3. We do use XSAVEOPT in the context switch code only, and the area
for XSAVEOPT already always contains the data saved by XSAVE.
4. We do not modify the save area between XRSTOR, when the area is
loaded into FPU context, and XSAVE. We always spill the fpu context
into save area and start emulation when directly writing into FPU
context.
5. We do not use segmented addressing to access save area, or rather,
always address it using %ds basing.
6. XSAVEOPT can be only executed in the area which was previously
loaded with XRSTOR, since context switch code checks for FPU use by
outgoing thread before saving, and thread which stopped emulation
forcibly gets context loaded with XRSTOR.
7. The PCB cannot be paged out while FPU emulation is turned off, since
stack of the executing thread is never swapped out.
The context switch code is patched to issue XSAVEOPT instead of XSAVE
if supported. This approach eliminates one conditional in the context
switch code, which would be needed otherwise.
For user-visible machine context to have proper data, fpugetregs()
checks for unsaved extension blocks and manually copies pristine FPU
state into them, according to the description provided by CPUID leaf
0xd.
MFC after: 1 month
* wrap the RX proc calls in the RX refcount;
* call the DFS checking, fast frames staging and TX rescheduling if
required.
TODO:
* figure out if I can just make "do TX rescheduling" mean "schedule
TX taskqueue" ?
with fresh descriptors, before handling the frames.
Wrap it all in the RX locks.
Since the FIFO is very shallow (16 for HP, 128 for LP) it needs to be
drained and replenished very quickly. Ideally, I'll eventually move this
RX FIFO drain/fill into the interrupt handler, only deferring the actual
frame completion.
I was setting up the RX EDMA buffer to be 4096 bytes rather than the
RX data buffer portion. The hardware was likely getting very confused
and DMAing descriptor portions into places it shouldn't, leading to
memory corruption and occasional panics.
Whilst here, don't bother allocating descriptors for the RX EDMA case.
We don't use those descriptors. Instead, just allocate ath_buf entries.
adds an extra tick to account for the current partial clock tick. However,
that is not appropriate for a repeating timer when the exact tvtohz() value
should be used for subsequent intervals. Fix repeating callouts for
EVFILT_TIMER by subtracting 1 tick from the tvtohz() result similar to the
fix used in realitexpire() for interval timers.
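As a rough model of the arithmetic (not the kernel's tvtohz(), which
also handles overflow and takes a struct timeval), assume the interval
is given in microseconds and hz divides one second evenly. The function
names are hypothetical.

```c
#include <assert.h>

/*
 * Simplified model of tvtohz(): round the interval up to whole ticks,
 * then add one tick for the partially elapsed current tick.
 */
static long
sketch_tvtohz(long usec, long hz)
{
	long tick_usec;

	assert(usec >= 0 && hz > 0 && hz <= 1000000);
	tick_usec = 1000000 / hz;
	return ((usec + tick_usec - 1) / tick_usec + 1);
}

/* Ticks to use when re-arming a repeating timer: drop the extra tick. */
static long
sketch_repeat_ticks(long usec, long hz)
{
	return (sketch_tvtohz(usec, hz) - 1);
}
```

With hz = 1000, a 10 ms interval gives 11 ticks for the first expiry
and 10 ticks for each repeat, so the period no longer drifts by one
tick per expiry.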
While here, update a few comments to note that if the EVFILT_TIMER code
were to move out of kern_event.c, it should move to kern_time.c (where the
interval timer code it mimics lives) rather than kern_timeout.c.
MFC after: 1 month
... from a user-set persistent limit on the said level.
Allow setting the user-imposed limit below the current deepest available
level, as the available levels may be dynamically changed by the ACPI
platform in both directions.
Allow "Cmax" as an input value for cx_lowest sysctls to mean that there
is no limit and the OS can use all available C-states.
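The accepted input forms could be parsed roughly as follows. This is a
sketch under assumptions: the function name, the zero-based index
convention and the "no limit" marker value are all hypothetical and are
not taken from the sysctl handler itself.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define SKETCH_CX_NOLIMIT	(-2)	/* hypothetical "no limit" marker */

/*
 * Parse "C<n>" (n >= 1) into a zero-based index, or "Cmax" into the
 * no-limit marker. Returns -1 on malformed input.
 */
static int
sketch_parse_cx(const char *s)
{
	char *end;
	long n;

	assert(s != NULL);
	if (strcmp(s, "Cmax") == 0)
		return (SKETCH_CX_NOLIMIT);
	if (s[0] != 'C' || s[1] < '1' || s[1] > '9')
		return (-1);
	n = strtol(s + 1, &end, 10);
	if (*end != '\0' || n > 100)
		return (-1);
	return ((int)n - 1);
}
```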
Retire global cpu_cx_count as it no longer serves any meaningful
purpose.
Reviewed by: jhb, gianni, sbruno
Tested by: sbruno, Vitaly Magerya <vmagerya@gmail.com>
MFC after: 2 weeks
the linker set of CPU modules. The newbus method, although clever,
had many flaws: it didn't really support multiple SoC, many of the
comments about order were just wrong, and it did a few things far too
late to be useful. delay and cpu_reset now work much earlier in the
boot process.
Renamed the kern.cam.ada.ada_send_ordered sysctl and tunable to
kern.cam.ada.send_ordered, more in line with the other da sysctls/tunables.
Suggested by: kib
kern.cam.da.send_ordered, more in line with the other da sysctls/tunables.
PR: 169765
Submitted by: Steven Hartland <steven.hartland@multiplay.co.uk>
Reviewed by: mav
having the CPU device that's a child of atmelarm that does stuff.
o Create a linker_set for the support functions for the SoCs.
o Rename soc_data to soc_info.
o Move the delay and reset function pointers to new soc_data struct
o Create elements for all known SoCs
o Add lookup of the SoC we found, and print a warning if it isn't one
we know about.
the upper levels notice. Otherwise we see commands silently failing leading
to data corruption. This mirrors dadone()
Submitted by: Andrew Boyer aboyer@averesystems.com
Reviewed by: scottl@freebsd.org
MFC after: 2 weeks
These probes are most useful when looking into the structures
they provide, which are listed in io.d. For example:
dtrace -n 'io:genunix::start { printf("%d\n", args[0]->bio_bcount); }'
Note that the I/O systems in FreeBSD and Solaris/Illumos are sufficiently
different that there is not a 1:1 mapping from scripts that work
with one to the other.
MFC after: 1 month
PCI:
- Properly handle interrupt fallback from MSIX to MSI to legacy.
The host may not have sufficient resources to support MSIX,
so we must be able to fallback to legacy interrupts.
- Add interface to get the (sub) vendor and device IDs.
- Rename flags to VTPCI_FLAG_* like other VirtIO drivers.
Block:
- No longer allocate vtblk_requests from separate UMA zone.
malloc(9) from M_DEVBUF is sufficient. Assert segment counts
at allocation.
- More verbose error and debug messages.
Network:
- Remove stray write once variable.
Virtqueue:
- Shuffle code around in preparation of converting the mb()s to
the appropriate atomic(9) operations.
- Only walk the descriptor chain when freeing if INVARIANTS is
defined since the result is only KASSERT()ed.
Submitted by: Bryan Venteicher (bryanv@daemoninthecloset.org)
If an error occurs when transmitting one mbuf in a chain of fragments,
free the subsequent fragments instead of leaking them.
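The cleanup can be sketched with a plain linked list standing in for
the mbuf chain. The struct, the callback type and the sample callback
are hypothetical; the assumption is that the transmit routine consumes
the fragment it is given whether or not it succeeds.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for one packet in a chain of fragments. */
struct sketch_frag {
	struct sketch_frag *next;
	int payload;
};

/* Transmit callback: consumes the fragment, returns 0 on success. */
typedef int (*sketch_xmit_fn)(struct sketch_frag *);

/* Sample callback: frees the fragment, fails on a negative payload. */
static int
sketch_xmit(struct sketch_frag *f)
{
	int error = (f->payload < 0);

	free(f);
	return (error);
}

/*
 * Send every fragment in the chain. On the first error, free the
 * fragments that were never handed to the callback instead of leaking
 * them. Returns the number of fragments freed here.
 */
static int
sketch_send_chain(struct sketch_frag *head, sketch_xmit_fn xmit)
{
	struct sketch_frag *next;
	int freed = 0;

	assert(xmit != NULL);
	for (; head != NULL; head = next) {
		next = head->next;
		head->next = NULL;
		if (xmit(head) != 0) {
			/* Error: release everything still queued. */
			for (head = next; head != NULL; head = next) {
				next = head->next;
				free(head);
				freed++;
			}
			break;
		}
	}
	return (freed);
}
```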
Sponsored by: ADARA Networks
the FIFO.
I still see some corner cases where no RX occurs when it should be
occurring. It's quite possible that there's a subtle race condition
somewhere; or maybe I'm not programming the RX queues right.
There's also no locking here yet, so any reset/configuration path
state change (ie, enabling/disabling receive from the ioctl, net80211
taskqueue, etc) could quite possibly confuse things.
* For now, kickpcu should hopefully just do nothing - the PCU doesn't need
'kicking' for Osprey and later NICs. The PCU will just restart once
the next FIFO entry is pushed in.
* Teach "proc" about "dosched", so it can be used to just flush the
FIFO contents without adding new FIFO entries.
* .. and now, implement the RX "flush" routine.
* Re-initialise the FIFO contents if the FIFO is empty (the DP is NULL.)
When PCU RX is disabled (ie, writing RX_D to the RX configuration
register) then the FIFO will be completely emptied. If the software FIFO
is full, then no further descriptors are pushed into the FIFO and
things stall.
This all requires much, much more thorough stress testing.
debugger exited without calling ptrace(PT_DETACH), there is a time window
during which p_xthread may point to a non-existent thread. In practice
this is not a problem, because the child process will soon be killed by
the parent process.
This is inspired by ath9k and the reference driver, but it's a new
implementation of the RX FIFO handling.
This has some issues - notably the FIFO needs to be reprogrammed when
the chip is reset.
* Add a couple of RX errors;
* Add the spectral scan PHY error code;
* extend the RX flags to be a 16 bit field, rather than an 8 bit field;
* Add a new RX flag.
Obtained from: Qualcomm Atheros
on x86 and use that to implement stop_emulating() in the fpu/npx code.
Reimplement start_emulating() in the non-XEN case by using load_cr0() and
rcr0() instead of the 'lmsw' and 'smsw' instructions. Intel explicitly
discourages the use of 'lmsw' and 'smsw' on 80386 and later processors in
the description of these instructions in Volume 2 of the ADM.
Reviewed by: kib
MFC after: 1 month
switch to its vnet before calling ether_ifdetach(). Otherwise if the
second half resides in a different vnet, if_detach() silently fails
leaving a stale pointer in V_ifnet list, and the system crashes trying
to access this pointer later.
Another solution could be not to allow to destroy epair unless both
ends are in the home vnet.
Discussed with: bz
Tested by: delphij
The last 12 bits of the limit registers have to be set to 1. These
bits are not significant in bridge BARs and are 0 on read, but the
bits are valid in the swap limit register and need to be set.
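In code, forcing those bits amounts to a single OR. The sketch assumes a
32-bit register value and reads "last 12 bits" as the low-order 12 bits;
the names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

#define SKETCH_LIMIT_LOW_MASK	0xfffu	/* the low 12 bits */

/* Return the value to program: the limit with its low 12 bits set. */
static uint32_t
sketch_swap_limit(uint32_t limit)
{
	return (limit | SKETCH_LIMIT_LOW_MASK);
}
```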
to attach to the process, it is surprising that the process is resumed
without inputting any gdb commands, however ptrace manual said:
The tracing process will see the newly-traced process stop and may
then control it as if it had been traced all along.
But the current code does not work this way: unless the traced process
receives a signal later, it will continue to run as a background task.
To fix this problem, just send SIGSTOP to the traced process after
we resume it. This works as if you were attaching to a running process;
it is not perfect but better than nothing.
The AR93xx and later chips support two RX FIFO queues - a high and low
priority queue.
For legacy chips, just assume the queues are high priority.
This is inspired by the reference driver but is a reimplementation of
the API and code.
AR93xx receive descriptors.
This isn't entirely complete - the AR93xx and later descriptors
don't have a link/buffer pointer; the descriptor contents just
start.
implementation of RFC 3484 for this purpose for a long time and "prefer_source"
was never implemented actually. ND6_IFF_PREFER_SOURCE macro is left intact.
Pass only FEXEC (instead of FREAD|FEXEC) in fgetvp_exec. _fget has to check for
!FWRITE anyway and may as well know about FREAD.
Make _fget code a bit more readable by converting permission checking from if()
to switch(). Assert that correct permission flags are passed.
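A switch-based check along these lines might look as follows. The flag
values and the function name are hypothetical stand-ins chosen for this
sketch, not the kernel's definitions.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical flag values for this sketch only. */
#define SK_FREAD	0x1
#define SK_FWRITE	0x2
#define SK_FEXEC	0x4

/*
 * 'f_flag' is how the descriptor was opened; 'needed' is what the
 * caller requires. Returns 0 if access is allowed, EBADF otherwise.
 */
static int
sketch_check_access(int f_flag, int needed)
{
	switch (needed) {
	case 0:
		return (0);
	case SK_FREAD:
	case SK_FWRITE:
		return ((f_flag & needed) != 0 ? 0 : EBADF);
	case SK_FEXEC:
		/* Readable or executable, and not open for writing. */
		if ((f_flag & (SK_FREAD | SK_FEXEC)) == 0 ||
		    (f_flag & SK_FWRITE) != 0)
			return (EBADF);
		return (0);
	default:
		/* Callers must pass exactly one of the cases above. */
		assert(!"invalid permission flags");
		return (EBADF);
	}
}
```

Because the exec case already knows about read access, a caller only
needs to pass the exec flag on its own.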
In collaboration with: kib
Approved by: trasz (mentor)
MFC after: 6 days
X-MFC: with r238220
up and running to service interrupts. This is especially important
when the firmware has bound interrupts to CPUs, like for the SGI
Altix 350. We wake up APs at SI_SUB_CPU time and they sit and spin
until we unleash them, so there's nothing fundamentally different
from a MD perspective.
While here return EBADF for descriptors opened for writing (previously it was ETXTBSY).
Add fgetvp_exec function which performs appropriate checks.
PR: kern/169651
In collaboration with: kib
Approved by: trasz (mentor)
MFC after: 1 week
setting in the igb and em driver. This was necessitated by
a shared code change that I was given late in the game, a data
type changed from bool to int, in the last update I dealt with
it by a cast, but it was pointed out (thanks jhb) that there
was a potential problem with this. John suggested this safer
approach, and it is fine with me...
MFC after: 2 days (to catch the 9.1 update)
inactive queue, unless busy page is found.
Dropping the mutex often should allow the other lock acquires to
proceed without waiting for whole inactive scan to finish. On machines
with a lot of physical memory, the scan often needs to iterate a lot
before it finishes or finds a page which requires laundering, causing
high latency for other lock waiters.
Suggested and reviewed by: alc
MFC after: 3 weeks
is an error set on the provider. With GEOM resizing, class can become
orphaned when it doesn't implement resize() method and the provider size
decreases.
Reviewed by: mav
Sponsored by: FreeBSD Foundation
although by default only C1 is enabled (cx_lowest=0) and enabling deeper
states goes through acpi_cpu_set_cx_lowest which re-evaluates cpu_non_c3
MFC after: 2 weeks
cpu_non_c3 is already evaluated in acpi_cpu_cx_cst and in
acpi_cpu_set_cx_lowest.
Besides acpi_cpu_cx_list is not protected by any locking.
As a result also move setting of cpu_can_deep_sleep to more appropriate
places.
MFC after: 2 weeks
before VM has been initialized. This includes:
1. Replacing pmap_steal_memory(),
2. Replace the handcrafted logic to allocate a naturally aligned VHPT,
3. Properly allocate the DPCPU for the BSP.
Ad 3: Appending the DPCPU to kernend worked as long as we wouldn't
cross into the next PBVM page. If we were to cross into the next
page, then there wouldn't be a PTE entry on the page table for it
and we would end up with a MCA following a page fault. As such,
this commit fixes MCAs occasionally seen.
arm platform. Add all the atmel boards to the ATMEL kernel for
testing purposes. Until boot loader arg parsing of board type
is done, this won't actually be able to do the runtime selection.
100 bytes from the binary with silly tricks. Hope to get this small
enough to run on the models that have 4k SRAM. We are close compiled
for the at91rm9200, but still need to trim for the target.
correctly. We now iterate the EFI memory descriptors once and collect all
the information in a single pass. This includes:
1. The I/O port base address,
2. The PAL memory region. Have the physmem API track this.
3. Memory descriptors of memory we can't use, like bad memory, runtime
services code & data, etc. Have the physmem API track these.
4. memory descriptors of memory we can use or re-use, such as free
memory, boot time services code & data, loader code & data, etc.
These are added by the physmem API.
Since the PBVM page table and pages are in memory described as loader
data, inform the physmem API of chunks that need to be deleted from the
available physical memory.
While here, remove Maxmem and replace it with the better named paddr_max.
Maxmem was defined as physmem, which is generally wrong. Now, paddr_max
is properly defined as the largest physical address.
The upshot of all this is that:
1. We properly determine realmem.
2. We maximize physmem by re-using memory where possible.
3. We remove complexity from ia64_init() in machdep.c.
4. Remove confusion about realmem, physmem & Maxmem.
The new ia64_physmem_alloc() is to replace pmap_steal_memory() in pmap.c,
as well as replace the handcrafted allocation of the VHPT for the BSP in
pmap_bootstrap() in pmap.c. This is step 2 and addresses the manipulation
of phys_avail after it is being created.
Don't use Maxmem when the amount of memory is meant. Use realmem instead.
Maxmem is not only a MD variable, it represents the highest physical memory
address in use. On systems where memory is sparsely laid out the highest
memory address and the amount of memory are not interchangeable. Scaling the
AGP aperture based on the actual amount of memory (= realmem) rather than
the available memory (= physmem) makes sure there's consistent behaviour
across architectures.
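The difference between the two quantities is easy to see with a sparse
layout. The range structure and helper names below are hypothetical and
only illustrate the distinction; they are not the kernel's variables.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical description of one populated physical memory range. */
struct sketch_range {
	uint64_t start;	/* first byte address */
	uint64_t size;	/* length in bytes */
};

/* Amount of memory: the sum of all range sizes. */
static uint64_t
sketch_amount(const struct sketch_range *r, size_t n)
{
	uint64_t total = 0;

	assert(r != NULL || n == 0);
	for (size_t i = 0; i < n; i++)
		total += r[i].size;
	return (total);
}

/* Highest address in use: the largest end address of any range. */
static uint64_t
sketch_highest(const struct sketch_range *r, size_t n)
{
	uint64_t max = 0;

	assert(r != NULL || n == 0);
	for (size_t i = 0; i < n; i++)
		if (r[i].start + r[i].size > max)
			max = r[i].start + r[i].size;
	return (max);
}
```

For two 1 GiB ranges at 0 and at 16 GiB, the amount is 2 GiB while the
highest address is 17 GiB, so sizing anything from the latter would be
off by a large factor.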
agp_i810.c:
While arguably the use of Maxmem can be considered correct, replace its use
with realmem anyway. agp_i810.c is specific to amd64, i386 & pc98, which
have a dense physical memory layout. Avoiding Maxmem here is done with an
eye on copy-n-paste behaviour in general and to avoid confusion caused by
using realmem in agp.c and Maxmem in agp_i810.c.
In both cases, remove the inclusion of md_var.h
Update DTrace disassembler accordingly. The code to treat the prefixes
as null prefixes was already in place.
Although in practice compilers seem to generate only cs-prefix for use
in long NOPs, the same treatment is applied to all of cs, ds, es, ss for
consistency.
Reported by: emaste
Tested by: emaste
Obtained from: Illumos commit 13442:4adbe6de60c8 (+ local changes)
MFC after: 5 days
- Add generic support for opcodes that are escape bytes used for
multi-byte opcodes (such as the 0x0f prefix). Use this to replace
the hard-coded 0x0f special case and add support for three-byte
opcodes that use the 0x0f38 prefix.
- Decode all Intel VMX instructions. invept and invvpid in particular are
three-byte opcodes that use the 0x0f38 escape prefix.
- Rework how the special 'SDEP' size flag works such that the default
instruction name (i_name) is the instruction when the data size
prefix (0x66) is not specified, and the alternate name in i_extra is
used when the prefix is included.
- Add a new 'ADEP' size flag similar to 'SDEP' except that it chooses
between i_name and i_extra based on the address size prefix (0x67).
Use this to fix the decoding for jrcxz vs jecxz which is determined
by the address size prefix, not the operand size prefix. Also, jcxz
is not possible in 64-bit mode, but jrcxz is the default instruction
for that opcode.
- Add support for handling instructions that have a mandatory 'rep'
prefix (this means not outputting the 'repe ' prefix until determining
if it is used as part of an opcode). Make 'pause' less of a special
case this way.
- Decode 'cmpxchg16b' and 'cdqe' which are variants of other instructions
but with a REX.W prefix.
MFC after: 1 month
"Reserved by Microsoft" in the standard PNP ID table, but has been seen
in the wild on at least one laptop.
PR: kern/169571
Submitted by: Matthias Apitz guru unixarea de
MFC after: 3 days
functions that manage PV entries. Specifically, remove the PV entry from
the containing PV list only after the corresponding PTE is destroyed.
Update the pmap's wired mapping count in pmap_enter() before the PV list
lock is acquired.
- Add a couple of new devices
- Flow control changes in shared and core code
- Bug fix to Flow Director for 82598
- Shared code sync to internal with required core change
Thanks to those helping in the testing and improvements to this driver!
MFC after: 5 days
shared code update and small changes in core required
Add support for new i210/i211 devices
Improve queue calculation based on mac type
MFC after: 5 days
natively rather than hand-assembled versions. For xgetbv/xsetbv, add a
wrapper API to deal with xcr* registers: rxcr() and load_xcr().
Reviewed by: kib
MFC after: 1 month
This will allow HAST to read directly from the local component without
even communicating with the userland daemon.
Sponsored by: Panzura, http://www.panzura.com
MFC after: 1 month
Before this change the IV-Key was used to generate encryption keys,
which was incorrect, but safe: in XTS mode this key was unused
anyway, and in CBC mode it was used differently to generate IVs,
so there is no risk of an IV colliding with an encryption key.
Bump version number and keep compatibility for older versions.
MFC after: 2 weeks
it skips FLOWTABLE lookup. However, the non-NULL ro has dual meaning
here: it may be supplied to provide route, and it may be supplied to
store and return to caller the route that ip_output()/ip6_output()
finds. In the latter case skipping FLOWTABLE lookup is a pessimisation.
The difference between a struct route filled by FLOWTABLE and one filled
by the rtalloc() family is that the former doesn't hold a reference on
its rtentry. The reference is held by the flow entry, which will release
it in the future. Thus, a route filled by FLOWTABLE shouldn't
be passed to the RTFREE() macro.
- Introduce new flag for struct route/route_in6, that marks route
not holding a reference on rtentry.
- Introduce new macro RO_RTFREE() that cleans up a struct route
depending on its kind.
- All callers to ip_output()/ip6_output() that do supply non-NULL
but empty route should use RO_RTFREE() to free results of
lookup.
- ip_output()/ip6_output() now do FLOWTABLE lookup always when
ro->ro_rt == NULL.
Tested by: tuexen (SCTP part)
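A minimal userland sketch of the idea behind RO_RTFREE(), for illustration
only. The *_sketch types, the RO_NOREF flag name and the rt_release() helper
are hypothetical stand-ins; the flag name and the exact cleanup steps in the
tree may differ.

```c
#include <stddef.h>

/* Hypothetical stand-in for the kernel's rtentry. */
struct rtentry_sketch {
	int refcnt;
};

#define	RO_NOREF	0x1	/* hypothetical: route holds no rtentry ref */

/* Hypothetical stand-in for struct route/route_in6. */
struct route_sketch {
	struct rtentry_sketch *ro_rt;
	int ro_flags;
};

/* Stand-in for RTFREE(): drop one reference on the rtentry. */
static void
rt_release(struct rtentry_sketch *rt)
{
	rt->refcnt--;
}

/*
 * Clean up a route depending on its kind: a route filled by a
 * reference-taking lookup drops its reference, a route filled by
 * the flow table only forgets the borrowed pointer.
 */
static void
ro_rtfree(struct route_sketch *ro)
{
	if (ro->ro_rt == NULL)
		return;
	if ((ro->ro_flags & RO_NOREF) == 0)
		rt_release(ro->ro_rt);
	ro->ro_rt = NULL;
	ro->ro_flags &= ~RO_NOREF;
}
```

A caller that passed a non-NULL but empty route would call ro_rtfree() on it
afterwards without needing to know which lookup path filled it.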
The new driver changed the size of the mfi_dcmd_frame structure in such a
way that a MFI_IOC_PASSTHRU ioctl from an old amd64 binary is treated as an
MFI_IOC_PASSTHRU32 ioctl in the new driver. As a result, the user pointer
is treated as the buffer length. mfi_user_command() doesn't have a bounds
check on the buffer length, so it passes a really big value to malloc()
which panics when it tries to exhaust the kmem_map. Fix this two ways:
- Only honor MFI_IOC_PASSTHRU32 if the binary has the SV_ILP32 flag set,
otherwise treat it as an unknown ioctl.
- Add a bounds check on the buffer length passed by the user. For now
it fails any user attempts to use a buffer larger than 1MB.
While here, fix a few other nits:
- Remove an unnecessary check for a NULL return from malloc(M_WAITOK).
- Use the ENOTTY errno for invalid ioctl commands instead of ENOENT.
MFC after: 3 days
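The two checks above might look roughly like this. It is a sketch under
stated assumptions: the command enum, the is_ilp32 argument (standing in for
the SV_ILP32 test) and check_passthru() are hypothetical, and returning
EINVAL for an oversized buffer is an assumption, as the text does not say
which errno the driver uses for that case.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

#define	SKETCH_MAX_BUF	(1024 * 1024)	/* 1MB cap, per the text above */

/* Hypothetical command codes standing in for the MFI_IOC_* values. */
enum sketch_cmd { CMD_PASSTHRU, CMD_PASSTHRU32, CMD_UNKNOWN };

/*
 * Validate a passthrough request before any allocation is attempted.
 * Returns 0 if the request may proceed, or an errno value.
 */
static int
check_passthru(enum sketch_cmd cmd, bool is_ilp32, size_t buf_len)
{
	switch (cmd) {
	case CMD_PASSTHRU32:
		/* Only honor the 32-bit variant for 32-bit binaries. */
		if (!is_ilp32)
			return (ENOTTY);
		/* FALLTHROUGH */
	case CMD_PASSTHRU:
		/* Bounds check the user-supplied length (assumed EINVAL). */
		if (buf_len > SKETCH_MAX_BUF)
			return (EINVAL);
		return (0);
	default:
		/* Invalid ioctl commands get ENOTTY rather than ENOENT. */
		return (ENOTTY);
	}
}
```

Either check alone stops a misinterpreted user pointer from reaching
malloc() as a length.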
- Add a note to the ipfw(8) man page about the rules no longer being
case sensitive.
- Fix some typos in the man page.
PR: docs/164772
Reviewed by: bz
Approved by: gabor (doc mentor, src committer)
MFC after: 2 weeks
The RX EDMA support requires a modified approach to the RX descriptor
handling.
Specifically:
* There are now two RX queues - high and low priority;
* The RX queues are implemented as FIFOs; they're now an array of pointers
to buffers;
* .. and the RX buffer and descriptor are in the same "buffer", rather than
being separate.
So to that end, this commit abstracts out most of the RX related functions
from the bulk of the driver. Notably, the RX DMA/buffer allocation isn't
updated, primarily because I haven't yet fleshed out what it should look
like.
Whilst I'm here, create a set of matching but mostly unimplemented EDMA
stubs.
Tested:
* AR9280, station mode
TODO:
* Thorough AP and other mode testing for non-EDMA chips;
* Figure out how to allocate RX buffers suitable for RX EDMA, including
correctly setting the mbuf length to compensate for the RX descriptor
and completion status area.
This driver does not yet handle multiple chip selects properly.
Note that the NAND infrastructure does not perform full page
reads or writes, which means that this driver cannot make use
of the hardware ECC that is otherwise present.
o Save and clear the LTESR register in the interrupt handler.
o In lbc_read_reg(), return the saved LTESR register value if applicable
(i.e. when the saved value is not invalid (read: ~0U)).
o In lbc_write_reg(), clear the bits in the saved register when it's
written to and when the saved value is not invalid.
o Also in lbc_write_reg(), the LTESR register is unlocked (in H/W) when
bit 1 of LTEATR is cleared. We use this to invalidate our saved LTESR
register value. Subsequent reads and writes go to H/W directly.
While here:
o In lbc_read_reg() & lbc_write_reg(), add some belts and suspenders to
catch when register offsets are out of range.
o In lbc_attach(), initialize completely and don't leave anything
for lbc_banks_enable().
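The saved-LTESR scheme above can be modelled in userland roughly as follows.
This is a sketch, not the driver code: struct lbc_sketch and the sketch_*()
helpers are hypothetical, the hardware register is simulated by a plain
field, write-one-to-clear semantics are assumed, and the LTEATR unlock is
reduced to a single function instead of decoding the register write.

```c
#include <stdint.h>

#define	LTESR_INVALID	(~0U)	/* "no saved value" marker, per the text */

struct lbc_sketch {
	uint32_t hw_ltesr;	/* simulated hardware register */
	uint32_t saved_ltesr;	/* value captured by the interrupt handler */
};

/* Interrupt handler: save the status, then clear it in hardware. */
static void
sketch_intr(struct lbc_sketch *sc)
{
	sc->saved_ltesr = sc->hw_ltesr;
	sc->hw_ltesr = 0;
}

/* Read: prefer the saved value while it is valid. */
static uint32_t
sketch_read_ltesr(const struct lbc_sketch *sc)
{
	if (sc->saved_ltesr != LTESR_INVALID)
		return (sc->saved_ltesr);
	return (sc->hw_ltesr);
}

/* Write: clear the written bits in whichever copy is current. */
static void
sketch_write_ltesr(struct lbc_sketch *sc, uint32_t val)
{
	if (sc->saved_ltesr != LTESR_INVALID)
		sc->saved_ltesr &= ~val;
	else
		sc->hw_ltesr &= ~val;
}

/* Unlock: forget the saved value so accesses go to hardware again. */
static void
sketch_unlock(struct lbc_sketch *sc)
{
	sc->saved_ltesr = LTESR_INVALID;
}
```

The point of the sentinel is that consumers keep using the same read and
write paths whether or not an interrupt has already consumed the status.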
of interrupts of direct children. Have the bus_config_intr and
bus_teardown_intr methods implemented by bus_generic_config_intr and
bus_generic_teardown_intr (resp) as we don't need to do anything
special ourselves.
This removes all the ``#ifdef $arch'' code that was there because powerpc
didn't have a proper nexus and people tend to copy and paste stuff.
methods so that MI drivers can depend on us doing the right thing instead
of having to go around us and call MD code directly. See the FDT code for
example (not for long though).
the aggressive pattern matching of the :C modifier. I tested build and
install in 2 phases, however with different solutions, resulting in the
breakage. Mea culpa.
The solution is to break out the all: target. This causes a few lines of
code duplication, but now the all: target works as it should, and the
other targets continue to work as they did before.
While I'm here, add a ===> header line to the start of each port build
to make it easier to find/more clear in the logs.
Note that setting the PTE_MODIFIED bit based on whether write is possible
is incorrect. We should set PTE_MODIFIED based on whether the access
is a write operation.
now fully encapsulates all accesses to f_offset, and extends f_offset
locking to other consumers that need it, in particular, to lseek() and
variants of getdirentries().
Ensure that on 32bit architectures f_offset, which is a 64bit quantity,
is always read and written under the mtxpool protection. This fixes an
apparently easy to trigger race where parallel lseek()s, or lseek() and
read/write, could corrupt the file offset.
The already broken ABI emulations, including iBCS and SysV, are not
converted (yet).
Tested by: pho
No objections from: jhb
MFC after: 3 weeks
names to shorten them. PulseAudio is reported to have problems with names
longer than 63 chars and at least in XMMS long names are inconvenient.
Reported by: hselasky
MFC after: 3 days
Adjust power_profile script to handle the new world order as well.
Some vendors are opting out of a C2 state and only defining C1 & C3. This
leads the acpi_cpu display to indicate that the machine supports C1 & C2
which is caused by the (mis)use of the index of the cx_state array as the
ACPI_STATE_CX value.
I.e. the code assumed that the state type of cx_state[i] could always
be derived from the index, as i + 1:
cx_state[2] == ACPI_STATE_C3
cx_state[1] == ACPI_STATE_C2
cx_state[0] == ACPI_STATE_C1
however, on certain machines this would lead to
cx_state[1] == ACPI_STATE_C3
cx_state[0] == ACPI_STATE_C1
This didn't break anything but led to a display of:
* dev.cpu.0.cx_supported: C1/1 C2/96
Instead of
* dev.cpu.0.cx_supported: C1/1 C3/96
MFC after: 2 weeks
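The fix amounts to printing the type stored in each entry rather than
deriving it from the array index. A minimal sketch, assuming a simplified
entry with only a type and a latency; struct cx_sketch and
format_cx_supported() are hypothetical names, not the acpi_cpu code.

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical, simplified C-state entry. */
struct cx_sketch {
	int type;	/* ACPI C-state number: 1 for C1, 3 for C3, ... */
	int trans_lat;	/* transition latency */
};

/*
 * Build the cx_supported string from the stored type of each entry,
 * never from the array index. Returns 0 on success, -1 if the buffer
 * is too small.
 */
static int
format_cx_supported(const struct cx_sketch *cx, size_t count, char *buf,
    size_t buflen)
{
	size_t i, off = 0;
	int n;

	if (buflen == 0)
		return (-1);
	buf[0] = '\0';
	for (i = 0; i < count; i++) {
		n = snprintf(buf + off, buflen - off, "%sC%d/%d",
		    i == 0 ? "" : " ", cx[i].type, cx[i].trans_lat);
		if (n < 0 || (size_t)n >= buflen - off)
			return (-1);
		off += (size_t)n;
	}
	return (0);
}
```

For a machine that defines only C1 and C3, the two-entry array then formats
with the second entry labelled C3, matching the corrected display above.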
bwrite(). VFS needs to know about EFAULT from uiomove() and does not
care much that partially filled block writeback after EFAULT was
successful. Early return without error causes a short write to be
reported to usermode.
Reported and tested by: andreast
MFC after: 3 weeks
Asus laptops. It is similar to acpi_asus(4), but uses the WMI interface
instead of a separate ACPI device.
On the Asus EeePC T101MT netbook it handles hotkeys and switches WLAN,
Bluetooth, LCD backlight, camera, card reader and touchpad on and off.
On the Asus UX31A ultrabook it handles hotkeys, switches WLAN, Bluetooth
and the Wireless LED on and off, controls keyboard backlight brightness,
and monitors temperature and fan speed. LCD brightness control doesn't
work yet for an unknown reason, possibly because it requires some video
card initialization.
Sponsored by: iXsystems, Inc.
important for those that use -DNO_CLEAN routinely, since it will prevent
installing stale stuff, and even more important when the port is upgraded
to a newer version. When the user doesn't use -DNO_CLEAN, this will create
an infinitesimal amount of extra work, but won't hurt anything.
This is necessary because the ports tree has flags that prevent the usual
'update the build if newer source files exist' logic from doing what it
would do in the base.
as an EDMA check function.
For the AR9003 and later NICs, different TX/RX DMA and descriptor handling
code will be conditional on the EDMA check.
Obtained from: Qualcomm Atheros
we need to pass BIO_DELETE requests down to providers that support
it. Also, we need to announce our support for BIO_DELETE to the upper
consumer. This requires:
- In g_mirror_start() return true for the "GEOM::candelete" request.
- In g_mirror_init_disk() probe the provider below for the
"GEOM::candelete" attribute, and mark the disk with a flag if it does
support BIO_DELETE.
- In g_mirror_register_request() distribute BIO_DELETE requests only
to those disks that do support it.
Note that we announce "GEOM::candelete" as true regardless of
whether we have TRIM-capable media down below or not. This is done
intentionally, because the upper consumer (usually UFS) requests the
attribute only once at mount time. If the user ever migrates the
mirror from HDDs to SSDs, TRIM would then start working without
remounting the filesystem.
Reviewed by: pjd
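The per-disk distribution step can be sketched as below. This is an
illustration only: struct disk_sketch, the DISK_CANDELETE flag name and
distribute_delete() are hypothetical, and the "request" is reduced to a
counter instead of a cloned bio.

```c
#include <stddef.h>

#define	DISK_CANDELETE	0x1	/* hypothetical: disk accepts BIO_DELETE */

/* Hypothetical, simplified view of a mirror component. */
struct disk_sketch {
	unsigned int flags;
	unsigned int deletes;	/* delete requests delivered to this disk */
};

/*
 * Hand a delete request only to the components that were marked as
 * supporting it when they were probed. Returns how many disks got it.
 */
static size_t
distribute_delete(struct disk_sketch *disks, size_t ndisks)
{
	size_t i, sent = 0;

	for (i = 0; i < ndisks; i++) {
		if ((disks[i].flags & DISK_CANDELETE) == 0)
			continue;
		disks[i].deletes++;
		sent++;
	}
	return (sent);
}
```

A return value of zero corresponds to the case where the mirror announced
support but no component underneath can currently make use of the request.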
running with multiple SoCs compiled in very well anyway, so this just
wastes space. As more and more SoCs arrive in the tree, it is better
to edit one master file that builds them all than many board files.
* Add a new ANI variable, for AR9003 and later chips;
* The AR9003 and later series chips support two RX queues now, so start
down the road of supporting that;
* Add some new TX queue types - uAPSD is possible on earlier chips,
but PAPRD is relevant to AR9003 and later.
Obtained from: Qualcomm Atheros, Linux ath9k
recent regression with ULE, causing processes to get stuck in getblk
as well as interrupt handler execution delays to rise above the command
timeout of mpt(4).
MFC after: 3 days
at the point that it calls get_pv_entry(). Thus, pmap_enter()'s PV list
lock pointer must be passed to get_pv_entry() for those rare occasions
when get_pv_entry() calls reclaim_pv_chunk().
Update some related comments.
across in_gif_output() and in6_gif_output() anyway, and once it is held
across those it might as well be held for the entire loop. This simplifies
the code and removes the need for the custom IFF_GIF_WANTED flag (which
belonged in the softc and not as an IFF_* flag anyway).
Tested by: Vincent Hoffman vince unsane co uk
According to the AMD manual the whole range from 0x09 to 0x1f are NOPs.
The Intel manual mentions only 0x1f. Use only the Intel one for now, as
it seems to be the one actually generated by compilers.
Use gdb mnemonic for the operation: "nopw".
[1] AMD64 Architecture Programmer's Manual
Volume 3: General-Purpose and System Instructions
[2] Software Optimization Guide for AMD Family 10h Processors
[3] Intel(R) 64 and IA-32 Architectures Software Developer’s Manual
Volume 2 (2A, 2B & 2C): Instruction Set Reference, A-Z
Tested by: Fabian Keil <freebsd-listen@fabiankeil.de> (earlier version)
MFC after: 3 days
relative to the start address (unless the start address is 0, which is
not the case).
This is currently not a problem because all powerpc architectures are
using loader(8) which passes metadata to the kernel including the
correct `endkernel' address. If we don't use loader(8), registers 4
and 5 will hold the size of the kernel ELF file, not its end address.
We fix that simply by adding `kernel_text' to `end' to compute
`endkernel'.
Discussed with: nathanw