Provide both __sync_*-style and __atomic_*-style functions that perform
the atomic operations on ARMv5 by using Restartable Atomic Sequences.
While there, clean up some pieces of code where it's sufficient to use
regular uint32_t to store register contents and don't need full reg_t's.
Also sync this back to the MIPS code.
hook functions into hhook points which register after the modules were loaded -
potentially useful during boot or if hhook points are dynamically registered.
MFC after: 1 week
only re-enable I/O when all reasons have cleared.
sys/dev/xen/blkfront/block.h:
In the block front driver softc, replace the boolean
XBDF_FROZEN flag with a count of commands and driver global
issues that freeze the I/O queue. So long xbd_qfrozen_cnt
is non-zero, I/O is halted.
Add flags to xbd_flags for tracking grant table entry and
free command resource shortages. Each of these classes can
increment xbd_qfrozen_cnt at most once.
Add a command flag (XBDCF_ASYNC_MAPPING) that is set whenever
the initial mapping attempt of a command fails with EINPROGRESS.
sys/dev/xen/blkfront/blkfront.c:
In xbd_queue_cb(), use new XBDCF_ASYNC_MAPPING flag to definitively
know if an async bus dmamap load has occurred.
Add xbd_freeze() and xbd_thaw() helper methods for managing
xbd_qfrozen_cnt and use them to implement all queue freezing logic.
Add missing "thaw" to restart I/O processing once grant references
become available.
Sponsored by: Spectra Logic Corporation
hhook_{add|remove}_hook_lookup() so that khelp (and other potential API
consumers) do not have to care when they attempt to (un)hook a particular hook
point identified by id and type.
Reviewed by: scottl
MFC after: 1 week
QEMU 1.4 made the descriptor requirement stricter - the size of buffer
descriptor must exactly match the number of MAC addresses provided.
PR: kern/178955
MFC after: 5 days
Move FreeBSD from interface version 0x00030204 to 0x00030208.
Updates are required to our grant table implementation before we
can bump this further.
sys/xen/hvm.h:
Replace the implementation of hvm_get_parameter(), formerly located
in sys/xen/interface/hvm/params.h. Linux has a similar file which
primarily stores this function.
sys/xen/xenstore/xenstore.c:
Include new xen/hvm.h header file to get hvm_get_parameter().
sys/amd64/include/xen/xen-os.h:
sys/i386/include/xen/xen-os.h:
Correctly protect function definition and variables from being
included into assembly files in xen-os.h
Xen memory barriers are now prefixed with "xen_" to avoid conflicts
with OS native primatives. Define Xen memory barriers in terms of
the native FreeBSD primatives.
Sponsored by: Spectra Logic Corporation
Reviewed by: Roger Pau Monné
Tested by: Roger Pau Monné
Obtained from: Roger Pau Monné (bug fixes)
KASSERT during TCP hhook registration at boot. Virtualised hook points only
require extra housekeeping and sanity checking when "options VIMAGE" is present.
Reported by: bdrewery,jh,dhw
Tested by: dhw
MFC after: 1 week
X-MFC with: 251732
scheme for defining inline command queuing functions.
Prefer enums to #defines.
sys/dev/xen/blkfront/block.h
Replace inline function generation performed by the
XBDQ_COMMAND_QUEUE() macro with single instances of each
inline function (init, enqueue, dequeue, remove). This was
made possible by using queue indexes instead of bit flags
in the command structure, and passing the index enum as
an argument to the functions.
Improve panic/assert messages in the queue functions.
Combine queue data and stats into a single data structure
and declare an array of them instead of each queue individually.
Convert command flags, softc state, and softc flags to enums.
sys/dev/xen/blkfront/blkfront.c
Mechanical adjustments for new queue api.
Sponsored by: Spectra Logic Corporation
MFC after: 1 week
This hasn't yet been tested as unfortunately the AR3012 I have doesn't
have the "real" firmware on it; it shipped with the cut down HCI firmware
that only understands enough to accept a new firmware image.
* Linux ath9k (GPIO constants)
NICs which have bluetooth coexistence enabled.
The WB225 NIC has the common antenna switch configuration set to 0x0 which
disables all external switch bit setting. This obviously won't work when
doing coexistence.
This value is a magic value from the windows .inf files. It _looks_ right
but I haven't yet verified it - unfortunately my AR9285+AR3012 BT combo
has an earlier BT device which doesn't actually _have_ firmware on it.
So I have to fix ath3kfw to handle loading in firmware into the newer
NICs before I can finish testing this.
This may not hold true for CUS198, which is another custom AR9485 board.
The bluetooth setup code actually does a channel lookup during setup,
even though we haven't yet programmed in a channel. Sigh.
Tested:
* WB225 (AR9485) + bluetooth
generic mii_phy_reset().
- Return the result of mii_mediachg() rather than blindly returning 0.
- on smsc(4), driver lock should be held to get current
mii_media_active/mii_media_status value.
Reviewed by: yongari
type and id, as compared to virtualised hook points which are now uniquely
identified by type, id and a vid (which for vimage is the pointer to the vnet
that the hhook resides in).
All hhook_head structs for both virtualised and non-virtualised hook points
coexist in hhook_head_list, and a separate list is maintained for hhook points
within each vnet to simplify some vimage-related housekeeping.
Reviewed by: scottl
MFC after: 1 week
* Add the LNA configuration table entries for AR933x/AR9485
* Add a chip-dependent LNA signal level delta in the startup path
* Add a TODO list for the stuff I haven't yet ported over but
I haven't.
Tested:
* AR9462 with LNA diversity enabled
In netif_free(), call ifmedia_removeall() after ether_ifdetach()
so that bpf listeners are detached, any link state processing
is completed, and there is no chance for external reference to media
information.
Suggested by: yongari
MFC after: 1 week
exact same (subsystem) device and vendor IDs. However, the reference
design for the OXu16PCI954 uses a 14.7456 MHz clock (as does the EXSYS
EX-41098-2 equipped with these), while at least the OX16PCI954 defaults
to a 1.8432 MHz one. According to the datasheets of these chips, the
only difference in PCI configuration space is that OXu16PCI954 have
a revision ID of 1 while the other two are at 0. So employ the latter
for determining the default clock rates of this family.
Note that one might think that the actual clock could be derived from
the Clock Prescaler Register (CPR) of these chips. Unfortunately, this
is not that case and its use and content are orthogonal to the frequency
of the crystal employed.
Tested with an EXSYS EX-41098-2, which identifies and attaches as:
pcib4@pci0:19:0:0: class=0x060400 card=0x02dd1014 chip=0x10801b21
rev=0x03 hdr=0x01
vendor = 'ASMedia Technology Inc.'
device = 'ASM1083/1085 PCIe to PCI Bridge'
class = bridge
subclass = PCI-PCI
puc0@pci0:20:4:0: class=0x070006 card=0x00001415 chip=0x95011415
rev=0x01 hdr=0x00
vendor = 'Oxford Semiconductor Ltd'
device = 'OX16PCI954 (Quad 16950 UART) function 0 (Uart)'
class = simple comms
subclass = UART
puc1@pci0:20:4:1: class=0x068000 card=0x00001415 chip=0x95111415
rev=0x01 hdr=0x00
vendor = 'Oxford Semiconductor Ltd'
device = 'OX16PCI954 (Quad 16950 UART) function 1 (8bit bus)'
class = bridge
puc2@pci0:20:8:0: class=0x070006 card=0x00001415 chip=0x95011415
rev=0x01 hdr=0x00
vendor = 'Oxford Semiconductor Ltd'
device = 'OX16PCI954 (Quad 16950 UART) function 0 (Uart)'
class = simple comms
subclass = UART
puc3@pci0:20:8:1: class=0x068000 card=0x00001415 chip=0x95111415
rev=0x01 hdr=0x00
vendor = 'Oxford Semiconductor Ltd'
device = 'OX16PCI954 (Quad 16950 UART) function 1 (8bit bus)'
class = bridge
pci20: <ACPI PCI bus> on pcib4
puc0: <Oxford Semiconductor OX16PCI954 UARTs> port 0x5000-0x501f,
0x5020-0x503f mem 0xc6000000-0xc6000fff,0xc6001000-0xc6001fff irq 16 at
device 4.0 on pci20
uart1: <16950 or compatible> at port 1 on puc0
uart2: <16950 or compatible> at port 2 on puc0
uart3: <16950 or compatible> at port 3 on puc0
uart4: <16950 or compatible> at port 4 on puc0
puc1: <Oxford Semiconductor OX9160/OX16PCI954 UARTs (function 1)> port
0x5040-0x505f,0x5060-0x507f mem 0xc6002000-0xc6002fff,0xc6003000-0xc6003fff
irq 16 at device 4.1 on pci20
puc2: <Oxford Semiconductor OX16PCI954 UARTs> port 0x5080-0x509f,
0x50a0-0x50bf mem 0xc6004000-0xc6004fff,0xc6005000-0xc6005fff irq 16 at
device 8.0 on pci20
uart5: <16950 or compatible> at port 1 on puc2
uart6: <16950 or compatible> at port 2 on puc2
uart7: <16950 or compatible> at port 3 on puc2
uart8: <16950 or compatible> at port 4 on puc2
puc3: <Oxford Semiconductor OX9160/OX16PCI954 UARTs (function 1)> port
0x50c0-0x50df,0x50e0-0x50ff mem 0xc6006000-0xc6006fff,0xc6007000-0xc6007fff
irq 16 at device 8.1 on pci20
MFC after: 2 weeks
check which variant we are on, and if it is a VFPv3 or v4, and has 32
double registers we save these. This fixes VFP support on Raspberry Pi.
While here clean fmrx and fmxr up to use the register names from vfp.h
as opposed to the raw register names.
bitmap using sys/bitset. This is much simpler, has lower space
overhead and is cheaper in most cases.
- Use a second bitmap for invariants asserts and improve the quality of
the asserts as well as the number of erroneous conditions that we will
catch.
- Drastically simplify sizing code. Special case refcnt zones since they
will be going away.
- Update stale comments.
Sponsored by: EMC / Isilon Storage Division
Basically the situation is as follows:
- When using Clang + armv6, we should not need any intrinsics. It should
support it, even though due to a target misconfiguration it does not.
We should fix this in Clang.
- When using Clang + noarmv6, provide __atomic_* functions that disable
interrupts.
- When using GCC + armv6, we can provide __sync_* intrinsics, similar to
what we did for MIPS. As ARM and MIPS are quite similar, simply base
this implementation on the one I did for MIPS.
- When using GCC + noarmv6, disable the interrupts, like we do for
Clang.
This implementation still lacks functions for noarmv6 userspace. To be
done.
- Define __SYNC_ATOMICS in case we're using the __sync_*() API. This is
not used by <stdatomic.h> itself, but may be useful for some of the
intrinsics code to determine whether it should build the
machine-dependent intrinsic functions.
- Make is_lock_free() work in kernelspace. For now, assume atomics in
kernelspace are always lock free. This is a quite reasonable
assumption, as we surely shouldn't implement the atomic fallbacks for
arbitrary sizes.
- When looping, check for the pending suspension. Otherwise, other
usermode thread which races with the looping one, could try to
prevent the process from stopping or exiting.
- Add missed checks for the faults from casuword*(). The code is
structured in a way which makes the loops exit if the specified
address is invalid, since both fuword() and casuword() return -1 on
the fault. But if the address is mapped readonly, the typical value
read by fuword() is different from -1, while casuword() returns -1.
Absent the checks for casuword() faults, this is interpreted as the
race with other thread and causes non-interruptible spinning in the
kernel.
Reported and tested by: pho
Sponsored by: The FreeBSD Foundation
MFC after: 2 weeks
Before this change state creating sequence was:
1) lock wire key hash
2) link state's wire key
3) unlock wire key hash
4) lock stack key hash
5) link state's stack key
6) unlock stack key hash
7) lock ID hash
8) link into ID hash
9) unlock ID hash
What could happen here is that other thread finds the state via key
hash lookup after 6), locks ID hash and does some processing of the
state. When the thread creating state unblocks, it finds the state
it was inserting already non-virgin.
Now we perform proper interlocking between key hash locks and ID hash
lock:
1) lock wire & stack hashes
2) link state's keys
3) lock ID hash
4) unlock wire & stack hashes
5) link into ID hash
6) unlock ID hash
To achieve that, the following hacking was performed in pf_state_key_attach():
- Key hash mutex is marked with MTX_DUPOK.
- To avoid deadlock on 2 key hash mutexes, we lock them in order determined
by their address value.
- pf_state_key_attach() had a magic to reuse a > FIN_WAIT_2 state. It unlinked
the conflicting state synchronously. In theory this could require locking
a third key hash, which we can't do now.
Now we do not remove the state immediately, instead we leave this task to
the purge thread. To avoid conflicts in a short period before state is
purged, we push to the very end of the TAILQ.
- On success, before dropping key hash locks, pf_state_key_attach() locks
ID hash and returns.
Tested by: Ian FREISLICH <ianf clue.co.za>
While the changes in r245820 are in line with the ext2 spec,
the code derived from UFS can use negative values so it is
better to relax some types to keep them as they were, and
somewhat more similar to UFS. While here clean some casts.
Some of the original types are still wrong and will require
more work.
Discussed with: bde
MFC after: 3 days
Only four specific ATA PIO commands transfer several sectors per DRQ block
(interrupt). All other ATA PIO commands transfer one sector or 512 bytes
at one time. Hardcode these exceptions in mvs(4) with ATA_CAM option.
This fixes timeout of READ LOG EXT command used by `smartctl -x /dev/adaX`.
Also it fixes timeout of DOWNLOAD_MICROCODE on `camcontrol fwdownload`.
The reference HAL pushes a config group parameter to the driver layer
to inform it which particular chip behaviour to implement.
This particular value tags it as an AR9285.
The AR9485 chip and AR933x SoC both implement LNA diversity.
There are a few extra things that need to happen before this can be
flipped on for those chips (mostly to do with setting up the different
bias values and LNA1/LNA2 RSSI differences) but the first stage is
putting this code into the driver layer so it can be reused.
This has the added benefit of making it easier to expose configuration
options and diagnostic information via the ioctl API. That's not yet
being done but it sure would be nice to do so.
Tested:
* AR9285, with LNA diversity enabled
* AR9285, with LNA diversity disabled in EEPROM
SPC-4 specification states that serial number may be property of device,
but not a specific logical unit. People reported about FC storages using
serial number in that way, making it unusable for purposes of LUN multipath
detection. SPC-4 states that designators associated with logical unit from
the VPD page 83h "Device Identification" should be used for that purpose.
Report first of them in the new attribute in such preference order: NAA,
EUI-64, T10 and SCSI name string.
While there, make GEOM DISK properly report GEOM::ident in XML output also
using d_getattr() method, if available. This fixes serial numbers reporting
for SCSI disks in `geom disk list` output and confxml.
Discussed with: gibbs, ken
Sponsored by: iXsystems, Inc.
MFC after: 2 weeks
d_getattr(). Since these drivers use disk(9) KPI and not directly GEOM, use
of that function means KPI layering violation, causing extra g_io_deliver()
call for the request.
While GEOM in general has provider opened while sending BIO_GETATTR,
GEOM DISK does not really need to open disk to read medium-unrelated
attributes for own use.
Proposed by: ken
ZFS event processing should work on R/O root filesystems
Illumos ZFS issues:
3749 zfs event processing should work on R/O root filesystems
MFC after: 2 weeks
linux_file structure and use it instead of directly accessing td_fpop
when destroying the linux_file structure. The td_fpop pointer is not
valid when a cdevpriv destructor is run, and the type-specific close
method has already been called, so f_vnode may not be valid (and the
vnode might have been recycled without our own reference).
Tested by: Julian Stecklina <jsteckli@os.inf.tu-dresden.de>
MFC after: 1 week
running state. fxp(4) requires controller reinitialization for the
following cases.
o RX lockup condition on i82557
o promiscuous mode change
o multicast filter change
o WOL configuration
o TSO/VLAN hardware tagging/checksum offloading configuration
o MAC reprogramming after speed/duplex/flow-control resolution
o Any events that result in MAC reprogramming(link UP/DOWN,
remote link partner's restart of auto-negotiation etc)
o Microcode loading/unloading
Apart from above cases which come from hardware limitation, upper
stack also blindly reinitializes controller whenever an IP address
is assigned. After r194573, fxp(4) no longer needs to reinitialize
the controller to program multicast filter after upping the
interface. So keeping track of driver running state should remove
all unnecessary controller reinitializations.
This change will also address endless controller reinitialization
triggered by dhclient(8).
Tested by: hrs, Alban Hertroys <haramrae@gmail.com>
about mount and unmount events. This is used by Juniper to implement a more
optimal implementation of NetBSD's veriexec.
Submitted by: stevek@juniper.net
Obtained from: Juniper Networks, Inc
when booting from ZFS turned out to also cause the boot path not being
adjusted if booting from CD-ROM with firmware versions that do not employ
the "cdrom" alias in that case. So shuffle the code around instead in order
to achieve the original intent. Ideally, we shouldn't fiddle with the boot
path when booting from UFS on a disk either; unfortunately, there doesn't
seem to be an universal way of telling disks and CD-ROMs apart, though. [1]
- Use NULL instead of 0 for pointers.
PR: 179289
MFC after: 1 week
This allows setting attributes on tables. One simply does not provide
an index in that case. Otherwise the entry corresponding the index has
the attribute set or unset.
Use this change to fix a relatively longstanding bug in our GPT scheme
that's the result of rev 198097 (relatively harmless) followed by rev
237057 (damaging). The damaging part being that our GPT scheme always
has the active flag set on the PMBR slice. This is in violation with
EFI. Existing EFI implementions for both x86 and ia64 reject the GPT.
As such, GPT disks created by us aren't usable under EFI because of
that.
After this change, GPT disks never have the active flag set on the PMBR
slice. In order to make the GPT disk bootable under some x86 BIOSes,
the reason of rev 198097, one must now set the active attribute on the
gpt table. The kernel will apply this to the PMBR slice For (S)ATA:
gpart set -a active ada0
To fix an existing GPT disk that has the active flag set in the PMBR,
and that does not need the flag, use (again for (S)ATA):
gpart unset -a active ada0
The EBR, MBR & PC98 schemes, which also impement at least 1 attribute,
now check to make sure the entry passed is valid. They do not have
attributes that apply to the table.
The superblock in ext2fs defines all the fields as unsigned but for
some reason the in-memory superblock was carrying e2fs_bpg and
e2fs_isize as signed.
We should preserve the specified types for consistency.
MFC after: 5 days
After pushing in my fix for the 2 byte functions, I realized that the
functions for 1 and 2 byte operations had become identical. Reduce the
code size by merging the functions for 1 and 2 byte operations together.
While there, slightly improve variable naming and comments.
This is based on the AR933x (Hornet) SoC from Qualcomm Atheros.
It's a much nicer board to do development on - 64MB RAM, 16MB flash.
The development board breaks out the GPIO pins, ethernet, serial (via
a USB<->RS232 chip), USB host and of course a small wifi antenna.
Everything but the wifi works thus far.
Even though I tested the 1-byte operations on arbitrarily aligned bytes,
it seems I did not do this for the 2-byte operations.
Create easy to read functions that are used to get/put bytes and
halfwords in words. To keep the compiler happy, explicitly read two
bytes into a union to obtain a 16-bit value.
Realtek RTL8188CU/RTL8192CU USB IEEE 802.11b/g/n wireless cards.
This driver requires microcode which is available in FreeBSD ports:
net/urtwn-firmware-kmod.
Hiren ported the urtwn(4) man page from OpenBSD and Glen just commited a port
for the firmware.
TODO:
- 802.11n support
- Stability fixes - the driver can sustain lots of traffic but has trouble
coping with simultaneous iperf sessions.
- fix debugging
MFC after: 2 months
Tested by: kevlo, hiren, gjb
To make <stdatomic.h> work on MIPS (and ARM) using GCC, we need to
provide implementations of the __sync_*() functions. I already added
these functions for 4 and 8 byte types to libcompiler-rt some time ago,
based on top of <machine/atomic.h>.
Unfortunately, <machine/atomic.h> only provides a subset of the features
needed to implement <stdatomic.h>. This means that in some cases we had
to do compare-and-exchange calls in loops, where a simple ll/sc would
suffice.
Also implement these functions for 1 and 2 byte types. MIPS only
provides ll/sc instructions for 4 and 8 byte types, but this is of
course no limitation. We can simply load 4 bytes and use some bitmask
tricks to modify only the bytes affected.
Discussed on: mips, arch
Tested with: QEMU
* Illumos ZFS issue #3805 arc shouldn't cache freed blocks
Quote from the Illumos issue:
ZFS should proactively evict freed blocks from the cache.
Even though these freed blocks will never be used again, and thus
will eventually be evicted, this causes us to use memory
inefficiently for 2 reasons:
1. A block that is freed has no chance of being accessed again, but
will be kept in memory preferentially to a block that was accessed
before it (and is thus older) but has not been freed and thus has
at least some chance of being accessed again.
2. We partition the ARC into several buckets:
user data that has been accessed only once (MRU)
metadata that has been accessed only once (MRU)
user data that has been accessed more than once (MFU)
metadata that has been accessed more than once (MFU)
The user data vs metadata split is somewhat arbitrary, and the
primary control on how much memory is used to cache data vs metadata
is to simply try to keep the proportion the same as it has been in the
past (each bucket "evicts against" itself). The secondary control is
to evict data before evicting metadata.
Because of this bucketing, we may end up with one bucket mostly
containing freed blocks that are very old, while another bucket has
more recently accessed, still-allocated blocks. Data in the useful
bucket (with still-allocated blocks) may be evicted in preference to
data in the useless bucket (with old, freed blocks).
On dcenter, we saw that the MFU metadata bucket was 230MB, while the
MFU data bucket was 27GB and the MRU metadata bucket was 256GB.
However, the vast majority of data in the MRU metadata bucket (256GB)
was freed blocks, and thus useless. Meanwhile, the MFU metadata bucket
(230MB) was constantly evicting useful blocks that will be soon needed.
The problem of cache segmentation is a larger problem that needs more
investigation. However, if we stop caching freed blocks, it should
reduce the impact of this more fundamental issue.
MFC after: 2 weeks
1) Only multi-TD isochronous transfers should use NORMAL
type after specific type as per XHCI specification.
2) BEI bit is only available in NORMAL and ISOCHRONOUS
TRB types. Don't use this bit for other types to avoid
hardware asserts. Reserved bits should be don't care
though ...
MFC after: 1 week
PR: usb/179342
* Stop pretending we support anything other than ELF by removing code
surrounded by #ifdef __ELF__ ... #endif.
* Remove _JB_MAGIC_SETJMP and _JB_MAGIC__SETJMP, they are defined in
setjmp.h, which is able to be included from asm.
* Fix the spelling of dependent.
* Rename END _END and add END and ASEND to complement ENTRY and ASENTRY
respectively
* Add macros to simplify accessing the Global Offset Table, some of these
will be used in the upcoming update to the setjmp functions.
the regular interrupt handler is not working properly or
in case of MSI interrupts which are not yet supported.
Remove interrupt setup code for FreeBSD versions older
than 700031.
MFC after: 1 week
PR: usb/179342
The "find node" function call will increase the node reference anyway;
so there's no reason to hold the node table lock during the MLME change.
The only reason I could think of is to stop overlapping mlme ioctls
from causing issues, but this should be fixed a different way.
This fixes a whole class of LORs that creep up when nodes are being
timed out or removed by hostapd.
Tested:
* AR5416, hostap, with nodes coming and going. No LORs or stability
issues were observed.
for the WB195 combo NIC - an AR9285 w/ an AR3011 USB bluetooth NIC.
The AR3011 is wired up using a 3-wire coexistence scheme to the AR9285.
The code in if_ath_btcoex.c sets up the initial hardware mapping
and coexistence configuration. There's nothing special about it -
it's static; it doesn't try to configure bluetooth / MAC traffic priorities
or try to figure out what's actually going on. It's enough to stop basic
bluetooth traffic from causing traffic stalls and diassociation from
the wireless network.
To use this code, you must have the above NIC. No, it won't work
for the AR9287+AR3012, nor the AR9485, AR9462 or AR955x combo cards.
Then you set a kernel hint before boot or before kldload, where 'X'
is the unit number of your AR9285 NIC:
# kenv hint.ath.X.btcoex_profile=wb195
This will then appear in your boot messages:
[100482] athX: Enabling WB195 BTCOEX
This code is going to evolve pretty quickly (well, depending upon my
spare time) so don't assume the btcoex API is going to stay stable.
In order to use the bluetooth side, you must also load in firmware using
ath3kfw and the binary firmware file (ath3k-1.fw in my case.)
Tested:
* AR9280, no interference
* WB195 - AR9285 + AR3011 combo; STA mode; basic bluetooth inquiries
were enough to cause traffic stalls and disassociations. This has
stopped with the btcoex profile code.
TODO:
* Importantly - the AR9285 needs ASPM disabled if bluetooth coexistence
is enabled. No, I don't know why. It's likely some kind of bug to do
with the AR3011 sending bluetooth coexistence signals whilst the device
is asleep. Since we don't actually sleep the MAC just yet, it shouldn't
be a problem. That said, to be totally correct:
+ ASPM should be disabled - upon attach and wakeup
+ The PCIe powersave HAL code should never be called
Look at what the ath9k driver does for inspiration.
* Add WB197 (AR9287+AR3012) support
* Add support for the AR9485, which is another combo like the AR9285
* The later NICs have a different signaling mechanism between the MAC
and the bluetooth device; I haven't even begun to experiment with
making that HAL code work. But it should be a lot more automatic.
* The hardware can do much more interesting traffic weighting with
bluetooth and wifi traffic. None of this is currently used.
Ideally someone would code up something to watch the bluetooth traffic
GPIO (via an interrupt) and then watch it go high/low; then figure out
what the bluetooth traffic is and adjust things appropriately.
* If I get the time I may add in some code to at least track this stuff
and expose statistics. But it's up to someone else to experiment with
the bluetooth coexistence support and add the interesting stuff (like
"real" detection of bulk, audio, etc bluetooth traffic patterns and
change wifi parameters appropriately - eg, maximum aggregate length,
transmit power, using quiet time to control TX duty cycle, etc.)
* Call the bluetooth setup function during the reset path, so the bluetooth
settings are actually initialised.
* Call the AR9285 diversity functions during bluetooth setup; so the AR9285
diversity and antenna configuration registers are correctly programmed
* Misc debugging info.
Tested:
* AR9285+AR3011 bluetooth combo; this code itself doesn't enable bluetooth
coexistence but it's part of what I'm currently using.
implemented as a 10 bits linear feedback shift register so only
lower 10 bits are valid.
Because this register is used to initialize random backoff interval
register only when resolved duplex is half-duplex, it wouldn't have
caused issues in these days.
Submitted by: Masanobu SAITOH <msaitoh@NetBSD.org>
Reporting link status in driver has a side-effect that makes mii(4)
check current link status. mii(4) will call link status change
callback when it sees link state change. Normally this wouldn't
have problems. However, ASF/IPMI firmware can actively access PHY
regardless of driver's running state such that reporting link
status for not-running interface can generate meaningless link
UP/DOWN messages.
This change also makes dhclient think driver got a valid link
regardless of link establishment so it will bypass dhclient's
initial link status check. I think that wouldn't be issue
though.
Tested by: Daniel Braniss <danny@cs.huji.ac.il>
* Illumos zfs issue #3137 L2ARC compression
Whether or not to compress buffers entering the L2ARC is
controlled by "compression" setting on the dataset, when
compression is not "off", L2ARC compression is enabled.
The compress method is always LZ4 for L2ARC when enabled
because it works best for the scenario.
MFC after: 2 weeks
Avoid to busy/unbusy a page in cases where there is no need to drop the
vm_obj lock, more nominally when the page is full valid after
vm_page_grab().
Sponsored by: EMC / Isilon storage division
Reviewed by: alc
the mask of a cpuset. Also, change the cpuset's mask before updating the
masks of all children. Previously changing a cpuset's mask first required
setting the mask to a super-set of both the old and new masks and then
changing it a second time to the new mask.
- Split the bqlock into bqclean and bqdirty locks.
- Only acquire the wakeup synchronization locks when we cross a
threshold requiring them.
- Restructure the way flushbufqueues() targets work so they are more
smp friendly and sane.
Reviewed by: kib
Discussed with: mckusick, attilio
Sponsored by: EMC / Isilon Storage Division
M vfs_bio.c
Now that I understand what's going on - and the RX antenna array maps
to what the receive LNA configuration actually is - I feel comfortable
in enabling this.
If people do have issues with this, there's enough debugging now available
that we have a chance to diagnose it without writing it up as 'weird
crap.'
Tested:
* AR9285 STA w/ diversity combining enabled in EEPROM
TODO:
* (More) testing in hostap mode
and controlling this form of antenna diversity) - print out the AR9285
antenna diversity configuration at attach time.
This will help track down and diagose if/when people have connectivity
issues on cards (eg if they connect a single antenna to LNA1, yet the
card has RX configured to only occur on LNA2.)
Tested:
* AR9285 w/ antenna diversity enabled in EEPROM;
* AR9285 w/ antenna diversity disabled in EEPROM; mapping only to a
single antenna (LNA1.)
from each batch flowing on the VALE switch
- feature: add glue for 'indirect' buffers on the sender side:
if a slot has NS_INDIRECT set, the netmap buffer contains pointer(s)
to the actual userspace buffers, which are accessed with copyin().
The feature is not finalised yet, as it will likely need to deal
with some iovec variant for proper scatter/gather support.
This will save one copy for clients (e.g. qemu) that cannot
use the netmap buffer directly.
A curiosity: on amd64 copyin() appears to be 10-15% faster than pkt_copy()
or bcopy() at least for sizes of 256 and greater.
the RX antenna field.
The AR9285/AR9485 use an LNA mixer to determine how to combine the signals
from the two antennas. This is encoded in the RSSI fields (ctl/ext) for
chain 2. So, let's use that here.
This maps RX antennas 0->3 to the RX mixer configuration used to
receive a frame. There's more that can be done but this is good enough
to diagnose if the hardware is doing "odd" things like trying to
receive frames on LNA2 (ie, antenna 2 or "alt" antenna) when there's
only one antenna connected.
Tested:
* AR9285, STA mode
for the RX path.
This is different to the div comb HAL flag, that says it actually
can use this for RX diversity (the "slow" diversity path implemented
but disabled in the AR9285 HAL code.)
Tested:
* AR9285, STA operation
swap_pager_copy() is invoked, otherwise there is no reason to do so.
This will eliminate the necessity to busy pages most of the times.
Sponsored by: EMC / Isilon storage division
Reviewed by: alc
sys/dev/mps/mps_user.c
Fix uninitialized memory reference in mps_read_config_page. It was
referencing a field (params->hdr.Ext.ExtPageType) that would only be
set when reading an Extended config page. The symptom was that
MPSIO_READ_CFG_PAGE ioctls would randomly fail with
MPI2_IOCSTATUS_CONFIG_INVALID_PAGE errors. The solution is to
determine whether an extended or an ordinary config page is requested
by looking at the PageType field, which should be available regardless.
Similarly, mps_user_read_extcfg_header and mps_user_read_extcfg_page,
which call mps_read_config_page, had to be fixed to always set the
PageType field. They were implicitly assuming that
mps_read_config_page always operated on Extended pages.
Reviewed by: ken
Approved by: ken (mentor)
MFC after: 3 days
In order to become independent of Coherency Fabric frequency, configure
Timer and Watchdog to operate in 25MHz mode.
Submitted by: Zbigniew Bodek <zbb@semihalf.com>
Copy the given range of mappings from the source map to the
destination map, thereby reducing the number of VM faults on fork.
Submitted by: Zbigniew Bodek <zbb@semihalf.com>
Sponsored by: The FreeBSD Foundation, Semihalf
* Grab the reset lock first, so any subsequent interrupt, TX, RX work
will fail
* Then shut down interrupts
* Then wait for TX/RX to finish running
At this point no further work will be running, so it's safe to do the
reset path code.
PR: kern/179232
The main problem here is that fast and driver RX diversity isn't actually
configured; I need to figure out why that is. That said, this makes
the single-antenna connected AR9285 and AR2427 (AR9285 w/ no 11n) work
correctly.
PR: kern/179269
Uncover some, previously reserved, fields that are used by Ext4.
These are currently unused but it is good to have them for future
reference.
Reviewed by: bde
MFC after: 3 days
of a lock within a single thread.
- Fix handling of interlocks in WITNESS by properly requiring the interlock
to be held exactly once if it is specified.
the vnode lock while iterating over the free vnode list. Instead of
yielding, pause for 1 tick. The change is reported to help in some
virtualized environments.
Submitted by: Roger Pau Monn? <roger.pau@citrix.com>
Discussed with: jilles
Tested by: pho
MFC after: 2 weeks
Xen host side can handle after defragmentation.
This prevents the driver from throwing away too long TSO chains and
improves the performance on Amazon AWS instances with 10GigE virtual
interfaces to the normally expected throughput.
Submitted by: cperciva (earlier version)
Reviewed by: cperciva
Tested by: cperciva
MFC after: 1 week
limited in the amount of data they can handle at once.
Drivers can set ifp->if_hw_tsomax before calling ether_ifattach() to
change the limit.
The lowest allowable size is IP_MAXPACKET / 8 (8192 bytes) as anything
less wouldn't be very useful anymore. The upper limit is still at
IP_MAXPACKET (65536 bytes). Raising it requires further auditing of
the IPv4/v6 code path's as the length field in the IP header would
overflow leading to confusion in firewalls and others packet handler on
the real size of the packet.
The placement into "struct ifnet" is a bit hackish but the best place
that was found. When the stack/driver boundary is updated it should
be handled in a better way.
Submitted by: cperciva (earlier version)
Reviewed by: cperciva
Tested by: cperciva
MFC after: 1 week (using spare struct members to preserve ABI)
cacos, cacosf, cacosh, cacoshf,
casin, casinf, casinh, casinhf,
catan, catanf, catanh, catanhf,
logl, log2l, log10l, log1pl
I am hoping kargl@ will commit expl and expm1l soon, in which case this
bump will cover those, too.
Requested by: danfe
space, fork(2) would cause shadowing of the physical object and
copying of the shared page into private copy, effectively preventing
updates for the exported timehands structure and stopping the clock.
Specify the maximum allowed permissions for the page to be read and
execute, preventing write from the user mode.
Reported and tested by: <huanghwh@yahoo.com>
Sponsored by: The FreeBSD Foundation
MFC after: 2 weeks
used as the estimation of size, to 32GB. This provides around 100K of
buffer headers and corresponding KVA for buffer map at the peak.
Sizing the cache larger is not useful, also resulting in the wasting
and exhausting of KVA for large machines.
Reported and tested by: bdrewery
Sponsored by: The FreeBSD Foundation
clearing the page's PGA_REFERENCED flag. Since we are typically
manipulating the page's act_count field when we are clearing its
PGA_REFERENCED flag, the page lock is already held everywhere that we clear
the PGA_REFERENCED flag. So, in fact, this revision only changes some
comments and an assertion. Nonetheless, it will enable later changes to
object locking in the pageout code.
Introduce vm_page_assert_locked(), which completely hides the implementation
details of the page lock from the caller, and use it in
vm_page_aflag_clear(). (The existing vm_page_lock_assert() could not be
used in vm_page_aflag_clear().) Over the coming weeks, I expect that we'll
either eliminate or replace the various uses of vm_page_lock_assert() with
vm_page_assert_locked().
Reviewed by: attilio
Sponsored by: EMC / Isilon Storage Division
dtrace_probe(). Arguments beyond these five must be obtained in an
architecture-specific way; this can be done through the getargval provider
method, and through dtrace_getarg() if getargval isn't overridden.
This change fixes two off-by-one bugs in the way these arguments are fetched
in FreeBSD's DTrace implementation. First, the SDT provider must set the
aframes parameter to 1 when creating a probe. The aframes parameter controls
the number of frames that dtrace_getarg() will step over in order to find
the frame containing the extra arguments. On FreeBSD, dtrace_getarg() is
called in SDT probe context via
dtrace_probe()->dtrace_dif_emulate()->dtrace_dif_variable->dtrace_getarg()
so aframes must be 3 since the arguments are in dtrace_probe()'s frame; it
was previously being called with a value of 2 instead. illumos uses a
different aframes value for SDT probes, but this is because illumos SDT
probes fire by triggering the #UD fault handler rather than calling
dtrace_probe() directly.
The second bug has to do with the way arguments are grabbed out
dtrace_probe()'s frame on amd64. The code currently jumps over the first
stack argument and retrieves the rest of them using a pointer into the
stack. This works on i386 because all of dtrace_probe()'s arguments will be
on the stack and the first argument is the probe ID, which should be
ignored. However, it is incorrect to ignore the first stack argument on
amd64, so we correct the pointer used to access the arguments.
MFC after: 2 weeks
seven arguments.
The original test uses Solaris' uadmin system call to trigger the test
probe; this change adds a sysctl to the dtrace_test module and gets the test
program to trigger the test probe via the sysctl handler.
The test is currently failing on amd64 because of some bugs in the way that
probe arguments beyond the first five are obtained - these bugs will be
fixed in a separate change.
lock instead of the object lock, there is no reason for vm_page_activate()
to assert that the object is locked for either read or write access.
(The "VPO_UNMANAGED" flag never changes after page allocation.)
Sponsored by: EMC / Isilon Storage Division
requires a pkthdr being present but that's not the case for either
_bus_dmamap_load_mbuf_sg() or bus_dmamap_load_mbuf_sg(9).
Reported by: sbruno
MFC after: 1 week
Remove local, and incorrect, definition for the value of an invalid
grant reference.
Extract ring cleanup code into xbd_free_ring() function for
symetry with xbd_alloc_ring(). This process also eliminated
an initialized but unused variable.
Sponsored by: Spectra Logic Corporation
MFC after: 1 week
o Group functions by by their functionality.
o Remove superfluous declarations.
o Remove more unused (#ifdef'd out) code.
Sponsored by: Spectra Logic Corporation
o This driver is the "xbd" driver, not the "blkfront", "blkif", "xbf", or
"xb" driver. Use the "xbd_" naming conventions for all functions,
structures, and constants.
o The prevailing convention for structure fields in this driver is to
prefix them with an abreviation of the structure type. Update
"recently added" fields to match this style.
o Remove unused data structures.
o Remove superfluous casts.
o Make a pass over the whole driver and bring it closer to
style(9) conformance.
Sponsored by: Spectra Logic Corporation
MFC after: 1 week
reason to inline the implementation of vm_page_lock_assert() in the
!KLD_MODULES case. Use the same implementation for both KLD_MODULES and
!KLD_MODULES.
Reviewed by: kib
- Use a shared bufobj lock in getblk() and inmem().
- Convert softdep's lk to rwlock to match the bufobj lock.
- Move INFREECNT to b_flags and protect it with the buf lock.
- Remove unnecessary locking around bremfree() and BKGRDINPROG.
Sponsored by: EMC / Isilon Storage Division
Discussed with: mckusick, kib, mdf
they are needed when porting some of the Solaris providers (ip, iscsi, and
tcp in particular).
dtrace_probe() only takes five arguments from the probe site, so we need to
add the appropriate cast to allow for more than five arguments. The extra
arguments are later copied out of dtrace_probe()'s stack frame by
dtrace_getarg() (or the provider-specific getarg method) as needed.
MFC after: 1 week
Actually, this may be further optimized for controller variants
supporting one-shot MSIs but I'm lacking the necessary hardware for
testing.
- Add some missing synchronization of the statistics and status DMA
maps.
MFC after: 1 week
change. Retest the ref_count and return from the function to not
execute the further code which assumes that ref_count == 1 if it is
not. Also, do not leak vnode lock if other thread cleared OBJ_TMPFS
flag meantime.
Reported by: bdrewery
Tested by: bdrewery, pho
Sponsored by: The FreeBSD Foundation
altered or actually needed there any longer.
- Honor errors passed to the DMA mapping callbacks.
- In bce_get_rx_buf(), do not reserve stack space for more DMA segments
than actually necessary.
- In bce_get_pg_buf(), take advantage of bus_dmamap_load_mbuf_sg(9).
- In bce_rx_intr(), remove a pointless check for an empty mbuf pointer
which can only happen in case of a severe programming error. Moreover,
recovering from that situation would require way more actions with header
splitting enabled (which it is by default).
- Fix VLAN tagging in the RX path; do not attach the VLAN tag twice if the
firmware has been told to keep it. [1]
Obtained from: OpenBSD [1]
MFC after: 1 week
patching at runtime actually const.
- Remove pointless softc members by employing the corresponding constants
directly.
- Remove pointless returns.
- Remove unnecessary inclusion of opt_device_polling.h.
- Replace an outdated and now bogus comment in bce_tick() with the
appropriate one.
MFC after: 1 week
- the VALE switch now support up to 254 destinations per switch,
unicast or broadcast (multicast goes to all ports).
- we can attach hw interfaces and the host stack to a VALE switch,
which means we will be able to use it more or less as a native bridge
(minor tweaks still necessary).
A 'vale-ctl' program is supplied in tools/tools/netmap
to attach/detach ports the switch, and list current configuration.
- the lookup function in the VALE switch can be reassigned to
something else, similar to the pf hooks. This will enable
attaching the firewall, or other processing functions (e.g. in-kernel
openvswitch) directly on the netmap port.
The internal API used by device drivers does not change.
Userspace applications should be recompiled because we
bump NETMAP_API as we now use some fields in the struct nmreq
that were previously ignored -- otherwise, data structures
are the same.
Manpages will be committed separately.
the actual PCI device which makes the request for DMA tag, instead of
some descendant of the PCI device, by creating a pass-through trampoline.
- Sprinkle const on tables.
- Use NULL instead of 0 for pointers.
- Take advantage of nitems().
MFC after: 1 week
Intel and Sharp flash power on with their blocks in a "locked" state.
Unlocked them before attempting to perform an erase or write action and
relock when the action is complete.
this file is in FreeBSD. There's formality to this that hasn't
happened and Juniper is perfectly fine with being the holder.
Discussed with: eadler, imp, jhb
truncated directory for some NFS servers. This turned out to
be because the size of a directory reported by an NFS server
can be smaller that the ufs-like directory created from the
RPC XDR in the client. This patch fixes the problem by changing
r248567 so that vnode_pager_setsize() is only done for regular files.
Reported and tested by: hartmut.brandt@dlr.de
Reviewed by: kib
MFC after: 1 week
It can now be accessed with a write lock on the object containing it OR
with a read lock on the object containing it along with the swhash_mtx.
o Remove some duplicate assertions for swap_pager_freespace() and
swap_pager_unswapped() but keep the object locking references for
documentation.
Sponsored by: EMC / Isilon storage division
Reviewed by: alc
Re-ordered SSD quirks alphabetically so they are easier to maintain.
Removed my email and PR reference from comments on each quirk.
Added quirks for more SSDs:
* Crucial M4
* Corsair Force GT
* Intel 520 Series
* Kingston E100 Series
* Samsung 830 Series
Reviewed by: pjd (mentor)
Approved by: pjd (mentor)
MFC after: 1 week
check_deferred_signal() returns twice, since handle_signal() emulates
the return from the normal signal handler by sigreturn(2)ing the
passed context. Second return is performed on the destroyed stack
frame, because __fillcontextx() has already returned. This causes
undefined and bad behaviour, usually the victim thread gets SIGSEGV.
Avoid nested frame and the need to return from it by doing direct call
to getcontext() in the check_deferred_signal() and using a new private
libc helper __fillcontextx2() to complement the context with the
extended CPU state if the deferred signal is still present.
The __fillcontextx() is now unused, but is kept to allow older
libthr.so to be used with the new libc.
Mark __fillcontextx() as returning twice [1].
Reported by: pgj
Pointy hat to: kib
Discussed with: dim
Tested by: pgj, dim
Suggested by: jilles [1]
MFC after: 1 week
value on purpose, but the ia32 context handling code is logically more
correct to use the _MC_IA32_HASFPXSTATE name for the flag.
Tested by: dim, pgj
Sponsored by: The FreeBSD Foundation
MFC after: 1 week
usermode context state is not changed by the get operation, and
get_mcontext() does not require full iret as well.
Tested by: dim, pgj
Sponsored by: The FreeBSD Foundation
MFC after: 1 week
context on return from the trap handler, re-enable the interrupts on
i386 and amd64. The trap return path have to disable interrupts since
the sequence of loading the machine state is not atomic. The trap()
function which transfers the control to the special handler would
enable the interrupt, but an iret loads the previous eflags with PSL_I
clear. Then, the special handler calls trap() on its own, which now
sees the original eflags with PSL_I set and does not enable
interrupts.
The end result is that signal delivery and process exiting code could
be executed with interrupts disabled, which is generally wrong and
triggers several assertions.
For amd64, the interrupts are enabled conditionally based on PSL_I in
the eflags of the outer frame, as it is already done for
doreti_iret_fault. For i386, the interrupts are enabled
unconditionally, the ast loop could have opened a window with
interrupts enabled just before the iret anyway.
Reported and tested by: pho
Sponsored by: The FreeBSD Foundation
MFC after: 1 week
controller hardware most likely present on UHCI chipsets aswell. The
bug manifests itself when issuing isochronous transfers and bulk
transfers towards the same device simultaneously. From time to time it
happens that either the completion IRQ was missing or that the
completion IRQ was happening before the ITD/SITD was completely
written back to memory. The workaround assumes that double buffered
isochronous transfers are used, and that a second interrupt is
generated at the beginning of the next isochronous transfer to
complete the previous one. Possibly skipping the interrupt at the last
isochronous frame is possible, but will then break single buffered
isochronous transfers. For now we can live with some extra interrupts.
MFC after: 1 week
and if queue mechanism; also fix up (non-11n) TX fragment handling.
This may result in a bit of a performance drop for now but I plan on
debugging and resolving this at a later stage.
Whilst here, fix the transmit path so fragment transmission works.
The TX fragmentation handling is a bit more special. In order to
correctly transmit TX fragments, there's a bunch of corner cases that
need to be handled:
* They must be transmitted back to back, in the same order..
* .. ie, you need to hold the TX lock whilst transmitting this
set of fragments rather than interleaving it with other MSDUs
destined to other nodes;
* The length of the next fragment is required when transmitting, in
order to correctly set the NAV field in the current frame to the
length of the next frame; which requires ..
* .. that we know the transmit duration of the next frame, which ..
* .. requires us to set the rate of all fragments to the same length,
or make the decision up-front, etc.
To facilitate this, I've added a new ath_buf field to describe the
length of the next fragment. This avoids having to keep the mbuf
chain together. This used to work before my 11n TX path work because
the ath_tx_start() routine would be handed a single mbuf with m_nextpkt
pointing to the next frame, and that would be maintained all the way
up to when the duration calculation was done. This doesn't hold
true any longer - the actual queuing may occur at any point in the
future (think ath_node TID software queuing) so this information
needs to be maintained.
Right now this does work for non-11n frames but it doesn't at all
enforce the same rate control decision for all frames in the fragment.
I plan on fixing this in a followup commit.
RTS/CTS has the same issue, I'll look at fixing this in a subsequent
commit.
Finaly, 11n fragment support requires the driver to have fully
decided what the rate scenario setup is - including 20/40MHz,
short/long GI, STBC, LDPC, number of streams, etc. Right now that
decision is (currently) made _after_ the NAV field value is updated.
I'll fix all of this in subsequent commits.
Tested:
* AR5416, STA, transmitting 11abg fragments
* AR5416, STA, 11n fragments work but the NAV field is incorrect for
the reasons above.
TODO:
* It would be nice to be able to queue mbufs per-node and per-TID so
we can only queue ath_buf entries when it's time to assemble frames
to send to the hardware.
But honestly, we should just do that level of software queue management
in net80211 rather than ath(4), so I'm going to leave this alone for now.
* More thorough AP, mesh and adhoc testing.
* Ensure that net80211 doesn't hand us fragmented frames when A-MPDU has
been negotiated, as we can't do software retransmission of fragments.
* .. set CLRDMASK when transmitting fragments, just to ensure.
__amd64__, and thus limited. Eliminate 2 trivial conditionals by
casting the 64-bit integral, holding an address, via (uintptr_t)
to (void *) and replace the last remaining check for __amd64__
with a check for __LP64__ instead.
It turns out that in C++11, char16_t and char32_t are built-in types;
language keywords. Just fix this by putting traditional _*_T_DECLARED
blocks around the definitions. We'll just predefine these in
<sys/_types.h>.
This also opens up the possibility to define char16_t in other header
files, if ever needed (e.g. if we would gain a <ctype.h> for
char16_t/char32_t).
When creating fragment frames, the header length should honour the
DATAPAD flag.
This fixes the fragments that are queued to the ath(4) driver but it
doesn't yet fix fragment transmission. That requires further changes
to the ath(4) transmit path. Well, strictly speaking, it requires
further changes to _all_ wifi driver transmit paths, but this is at least
a start.
Tested:
* AR5416, STA mode, w/ fragthreshold set to 256.
This prevents users from selecting a delete method which may cause
corruption e.g. MPS WS16 on pre P14 firmware.
Reviewed by: pjd (mentor)
Approved by: pjd (mentor)
MFC after: 2 days
Assign the rv variable a success code if the pager was not asked for
the page. Using an error code from the previous processed page caused
zeroing of the valid page, when e.g. the previous page was not
available in the pager.
Reported by: lstewart
Sponsored by: The FreeBSD Foundation
MFC after: 1 week
USDT probes exits. This was previously done with a callout; however, it is
possible to sleep while holding the DTrace mutexes, so a panic will occur
on INVARIANTS kernels if the callout handler can't immediately acquire one
of these mutexes. This panic will be frequently triggered on systems where
a USDT-enabled program (perl, for instance) is often run.
This revision changes the fasttrap cleanup mechanism so that a dedicated
thread is used instead of a callout. The old behaviour is otherwise
preserved.
Reviewed by: rpaulo
MFC after: 1 month
be a GOOD IDEA (TM).
Apparently MOST users set this (e.g. tcp and friends) but there are a few
users that just assume that it is a sensible value but then go on to read it.
These include SCTP, pf and the FLOWTABLE option (and maybe others).
avoid a dangling pointer and eventual panic on system shutdown.
Reported by: Ali <comnetboy at gmail.com>
Tested by: Ali <comnetboy at gmail.com>
MFC after: 1 week
ownership to the FreeBSD foundation for the years this file has been in
the FreeBSD repository.
This file was originally created by Juniper as part of upgrading to FreeBSD
4.10 (which had no MIPS support) and held functions found on other machines
It grew actual functionality over time. The functionaliy was copied from
other architectures and ported to MIPS on a as-needed basis.
Approved by: Mark Baushke (Juniper IP)
Approved by: Megan Sugiyama (Juniper legal)
Pointed out by: jmallett@
Requested by: core (jhb@)
pmap_enter_locked() implementation was very ambiguous and confusing.
Rearrange it so that each part of the mapping creation is separated.
Avoid walking through the redundant conditions.
Extract vector_page specific PTE setup from normal PTE setting.
Submitted by: Zbigniew Bodek <zbb@semihalf.com>
Sponsored by: The FreeBSD Foundation, Semihalf
Using PVF_MOD, PVF_REF and PVF_EXEC is redundant as we can get the proper
info from PTE bits.
When the mapping is marked as executable and has been referenced we assume
that it has been executed. Similarly, when the mapping is set to be writable
and is referenced, it must have been due to write access to it.
PVF_MOD and PVF_REF flags are kept just for pmap_clearbit() usage,
to pass the information on which bit should be cleared.
Submitted by: Zbigniew Bodek <zbb@semihalf.com>
Sponsored by: The FreeBSD Foundation, Semihalf
Use pmap_find_pv if needed instead of multiplying its code throughout
pmap-v6.
Avoid possible NULL pointer dereference in pmap_enter_locked()
When trying to get m->md.pv_memattr, make sure that m != NULL,
in particular that vector_page is set to be NULL.
Do not set PGA_REFERENCED flag in pmap_enter_pv().
On ARM any new page reference will result in either entering the new
mapping by calling pmap_enter, etc. or fixing-up the existing mapping in
pmap_fault_fixup().
Therefore we set PGA_REFERENCED flag in the earlier mentioned cases and
setting it later in pmap_enter_pv() is just waste of cycles.
Delete unused pm_pdir pointer from the pmap structure.
Rearrange brackets in the fault cause detection in trap.c
Place the brackets correctly in order to see course of the conditions
instantaneously.
Unify naming in pmap-v6.c and improve style
Use naming common for whole pmap and compatible with other pmaps,
improve style where possible:
pm -> pmap
pg -> m
opg -> om
*pt -> *ptep
*pte -> *ptep
*pde -> *pdep
Submitted by: Zbigniew Bodek <zbb@semihalf.com>
Sponsored by: The FreeBSD Foundation, Semihalf
bit in PTE.
Enable Access Flag in CPU control. With AF enabled each valid mapping
needs to have referenced bit in PTE set in order to be able to cache
it in the TLB.
AP[0] bit is to be used as reference flag.
All access permissions are encoded by AP[2:1] wherein AP[1] is in fact
"user enable" and AP[2](APX) is "write disable".
All mappings are always set to be valid. Reference emulation is performed
by setting/clearing reference flag in PTE.
md.pvh_attrs are no longer necessary however pv_flags are still being used
for now.
Marking vm_page as "dirty" or "referenced" is being performed on:
- page or flag fault servicing in pmap_fault_fixup(), basing on the fault
type
- vm_fault servicing in pmap_enter() according to the desired protections
and faulty access type
Redundant page marking has been removed as on ARM we know exactly when the
particular page is referenced or is going to be written.
Submitted by: Zbigniew Bodek <zbb@semihalf.com>
Sponsored by: The FreeBSD Foundation, Semihalf
xen/xenbus/xenbusb.c:
In xenbusb_probe_children(), do not modify the XenBus state of
devices for which we have no PV driver support. An emulated device
we do support may share this backend. Hide the node from XenBus
instead.
This prevents closing the vkbd device, which Qemu's emulated keyboard
device is using as the source for keyboard events.
Tested with qemu-xen-traditional, qemu-xen and qemu stubdomains, all
working as expected.
Submitted by: Roger Pau Monne <roger.pau@citrix.com>
Reviewed by: gibbs
MFC after: 1 week
dev/xen/netfront:
In netif_free(), properly stop the interface and drain any pending
timers prior to disconnecting from the backend device.
Remove all media and detach our interface object from the system
prior to deleting it.
PR: kern/176471
Submitted by: Roger Pau Monne <roger.pau@citrix.com>
Reviewed by: gibbs
MFC after: 1 week
from 1k to 20k The previous value was good 10 years ago, but not
anymore now.
More importantly, lots of good surprises:
polling is incredibly effective under virtualization, and not only
prevents livelock but also saves most of the VM exit overhead in
receive mode.
Using polling, a FreeBSD instance under qemu-kvm remains perfectly
responsive even when bombed with 10 Mpps over an emulated e1000,
and happily processes 1.7 Mpps through ipfw.
Note that some incompatibilities still remain: e.g. polling is not
(yet) compatible with netmap, and seems to freeze the guest when
kern.polling.idle_poll=1
MFC after: 3 days