- Use the new architecture-agnostic buffer pool manager that uses uma(9)
to manage a set of power-of-2 sized buffers for bus_dmamem_alloc().
- Create pools of buffers backed by both regular and uncacheable memory,
and use them to handle regular versus BUS_DMA_COHERENT allocations.
- Use uma(9) to manage a pool of bus_dmamap structs instead of local code
to manage a static list of 500 items (it took 3300 maps to get to
multi-user mode, so the static pool wasn't much of an optimization).
- Small BUS_DMA_COHERENT allocations no longer waste an entire page per
allocation, or set pages to uncached when they contain data other than
DMA buffers. There's no longer a need for drivers to work around the
inefficiency by allocing large buffers then sub-dividing them.
- Because we know the alignment and padding of buffers allocated by
bus_dmamem_alloc() (whether coherent or regular memory, and whether
obtained from the pool allocator or directly from the kernel) we
can avoid doing partial cacheline flushes on them.
- Add a fast-out to _bus_dma_could_bounce() (and some comments about
what the routine really does because the old misplaced comment was wrong).
- Everywhere the dma tag alignment is used, the interpretation is that
an alignment of 1 means no special alignment. If the tag is created
with an alignment argument of zero, store it in the tag as one, and
remove all the code scattered around that changed 0->1 at point of use.
- Remove stack-allocated arrays of segments, use a local array of two
segments within the tag struct, or dynamically allocate an array at first
use if nsegments > 2. On an arm system I tested, only 5 of 97 tags used
more than two segments. On my x86 desktop it was only 7 of 111 tags.
Submitted by: Ian Lepore <freebsd@damnhippie.dyndns.org>
manage a set of power-of-2 sized buffers for bus_dmamem_alloc().
This allows the caller to provide the back-end allocator uma allocator,
allowing full control of the memory pages backing the pool. For
convenience, it provides an optional builtin allocator that provides pages
allocated with the VM_MEMATTR_UNCACHEABLE attribute, for managing pools of
DMA buffers for BUS_DMA_COHERENT or BUS_DMA_NOCACHE.
This also allows the caller to specify a minimum alignment, and it ensures
that all buffers start on a boundary and have a length that's a multiple of
that value, to avoid using buffers that trigger partial cache line flushes.
Submitted by: Ian Lepore <freebsd@damnhippie.dyndns.org>
only returns name, but also vnode of corefile to use.
This simplifies the code and closes few races, especially in %I handling.
Reviewed by: kib
Obtained from: WHEEL Systems
- Use this new format to automatically handle syscalls and VOPs. This
changes the earlier format but is still human readable.
Sponsored by: EMC / Isilon Storage Division
support for their new RAID adapter ARC-1214.
Many thanks to Areca for continuing to support FreeBSD.
Submitted by: 黃清隆 Ching-Lung Huang <ching2048 areca com tw>
MFC after: 2 weeks
Starting with firmware v7.5, the "Read TouchPad Modes" ($01) and "Read
Capabilities" ($02) commands changed: previously constant bytes now
carry variable information.
We now compare those bytes to expected constants only for firmware prior
to v7.5.
Tested by: Zeus Panchenko <zeus@gnu.org.ua>
MFC after: 1 week
* The warning message was:
'warning error: format string is not a string literal';
* Changed how make_dev is called, now a string literal
for formatting is used;
Approved by: adrian (mentor)
calls and turn it on.
- Do not allow to call them inside jail. [1]
Pointed out by: trasz [1]
Reviewed by: avg
Approved by: kib (mentor)
MFC after: 1 week
caused by use of an invalid kgss_gssd_handle during an upcall to
the gssd daemon when it has exited. This patch seems to avoid the
crashes by holding a reference count on the kgss_gssd_handle until
the upcall is done. It also adds a new mutex kgss_gssd_lock used to
make manipulation of kgss_gssd_handle SMP safe.
Tested by: Illias A. Marinos, Herbert Poeckl
Reviewed by: jhb
MFC after: 2 weeks
LUNs for the virtual processor device. This removes lots of CAM warnings,
and follows similar recent changes to tws(4) and twa(4) drivers.
Also fix case where CAM_REQ_CMP was getting OR'd with CAM_DEV_NOT_THERE
in the nonexistent LUN case, resulting in different CAM status (CAM_UA_TERMIO)
getting reported to CAM. This issue existing previously, but was more subtle
because it changed CAM_SEL_TIMEOUT to CAM_CMD_TIMEOUT.
Sponsored by: Intel
Reported and tested by: Willem Jan Withagen <wjw@digiware.nl>
MFC after: 1 week
This fixed panic where we hold mutex (process lock) and try to obtain sleepable
lock (vnode lock in expand_name()). The panic could occur when %I was used
in kern.corefile.
Additionally we avoid expand_name() overhead when coredumps are disabled.
Obtained from: WHEEL Systems
This fixes panic when listing sysctls on INVARIANTS-enabled kernel while
having wbwd loaded.
This panic was not fatal, at worst one additional space was printed.
Also sbuf_trim() makes some sense even if drain function is set. The drain
function is called only when buffer is to be expanded. So we could still trim
existing buffer before drain is called. In this case it worked just fine - the
trailing space was correctly trimmed.
Obtained from: WHEEL Systems
MFC after: 1 week
For now use 256 buckets and fnv_hash function. Use xor'ed 32-bit
s6_addr32 parts of in6_addr structure as a hash key. Update
in6_localip and in6_is_addr_deprecated to use hash table for fastest
lookup.
Sponsored by: Yandex LLC
Discussed with: dwmalone, glebius, bz
set.
As the checks don't require vnet context, this is fixed by setting
vnet after the checks.
PR: kern/160541
Submitted by: Nikos Vassiliadis (slightly different approach)
implement the BSM audit trail format. Rename the kernel versions of the
files to match the userspace filenames so that it's easier to work out
what they correspond to, and therefore ensure they are kept in-sync.
Obtained from: TrustedBSD Project
yields, specify the user priority for the yield. Otherwise, a
higher-priority (kernel) thread could fall into the priority-inversion
with the thread owning the mutex lock.
On single-processor machines or UP kernels, do not loop adaptively
when the next vnode cannot be locked, instead yield unconditionally.
Restructure the iteration initializer and the iterator to remove code
duplication. Put the code to fetch and lock a vnode next to the
current marker, into the mnt_vnode_next_active() function, and use it
instead of repeating the loop.
Reported by: hrs, rmacklem
Tested by: pho
Sponsored by: The FreeBSD Foundation
MFC after: 3 days
was being copied from the wrong place. This patch fixes that.
This could cause access failures for mapped users, when the group
permissions were needed.
PR: 147998
Submitted by: Christopher Key (cjk32 at cam.ac.uk)
MFC after: 2 weeks
If virtio_setup_intr() failed during boot, we would hang in
taskqueue_free() -> taskqueue_terminate() for all the taskq
threads to terminate. This will never happen since the
scheduler is not running by this point.
Reported by: neel, grehan
Approved by: grehan (mentor)
Rather than trying to KASSERT for callers that invoke this on
IO tags, either do nothing (for write_8) or return ~0 (for read_8).
Using KASSERT here just makes bus.h too messy from both
polluting bus.h with systm.h (for any number of drivers that include
bus.h without first including systm.h) or ports that use bus.h
directly (i.e. libpciaccess) as reported by zeising@.
Also don't try to implement all of the other bus_space functions for
8 byte access since realistically only these two are needed for some
devices that expose 64-bit memory-mapped registers.
Put the amd64-specific functions here rather than sys/amd64/include/bus.h
so that we can keep this header unified for x86, as requested by mdf@
and tijl@.
Submitted by: Carl Delsey <carl.r.delsey@intel.com>
MFC after: 3 days
initialisation to be enabled (1) / disabled (0) defaults to enabled.
This is useful for devices which have a slow trim speed and are either
new or have otherwise already been wiped e.g. secure erase.
PR: kern/173116
Submitted by: Steven Hartland
Approved by: pjd (mentor)
making range consolidation much more effective particularly for small
deletes.
This reduces memory used by the free map as well as reducing the number
of bio requests down to geom required to process all deletes.
In tests this achieved a factor of 10 reduction of trim ranges / geom
call downs.
While I'm here correct the description of zio_vdev_io_start.
PR: kern/173254
Submitted by: Steven Hartland
Approved by: pjd (mentor)
date: 2009/03/31 01:21:29; author: dlg; state: Exp; lines: +9 -16
...
this also firms up some of the input parsing so it handles short frames a
bit better.
This actually fixes reading beyond mbuf data area in pfsync_input(), that
may happen at certain pfsync datagrams.
entering llentry_free(), and in case if we lose the race, we should simply
perform LLE_FREE_LOCKED(). Otherwise, if the race is lost by the thread
performing arptimer(), it will remove two references from the lle instead
of one.
Reported by: Ian FREISLICH <ianf clue.co.za>
Prior to r222417, setting `password' in loader.conf(5) did not prevent boot
but instead only prevented changes to boot options by prompting for password
if autoboot failed or the user interrupted the countdown sequence.
After r222417 the same machine with `password' set in loader.conf(5) would no
longer boot without _always_ entering the password.
This patch restores the old (8.x and older) functionality for password in
loader.conf(5) while adding a new bootlock_password feature to replace the
edge-case should anybody desire the regressed functionality (HINT: great for
PXE servers and/or private distributions).
loader.conf(5) was updated to be more clear with-respect to password setting
(previous text was misleading).
Documentation (loader.conf(5) and check-password.4th(8)) has been updated to
include notes on the new bootlock_password setting.
Special thanks to Alex Verbod for bringing this to my attention and helping to
refine the loader.conf(5) text.
PR: conf/170110
Submitted by: Vitaly Zakharov <ded3axap@gmail.com>
Reviewed by: Alexander Verbod <alexander.verbod@gmail.com>
but later after processing and freeing the tag, we need to jump back again
to the findpcb label. Since the fwd_tag pointer wasn't NULL we tried to
process and free the tag for second time.
Reported & tested by: Pawel Tyll <ptyll nitronet.pl>
MFC after: 3 days
I am not exactly sure about the naming due to lack of specs on AMD site,
but it is better to have some identification then none at all.
MFC after: 1 month
guest floating point state without having to know the
size of floating-point state.
Unstaticize fpurestore to allow the hypervisor to
save/restore guest state using fpusave/fpurestore
on the allocated FPU state area.
Reviewed by: kib
Obtained from: NetApp/bhyve
MFC after: 1 week
These must have been accidently copied from the if statement a few
lines later. Also remove parameter name from function prototype.
Approved by: grehan (mentor)
as r242694):
do better detection of when we have a better version of the tcp sequence
windows than our peer.
this resolves the last of the pfsync traffic storm issues ive been able to
produce, and therefore makes it possible to do usable active-active
statuful firewalls with pf.
This is an ongoing effort to provide runtime debug information
useful in the field that does not panic existing installations.
This gives us the flexibility needed when shipping images to a
potentially large audience with WITNESS enabled without worrying
about formerly non-fatal LORs hurting a release.
Sponsored by: iXsystems
In preparation for sysctl(8) growing the ability to only print
out boot/run-time tunables we need a way to differentiate between
RW sysctl nodes that tune a particular thing, or simply export
a stat that we want to allow the sysadmin to reset to 0 (or some
other value).
To do so, we add the CTLFLAG_STATS which should be OR'd into the
CTLFLAGs when exporting a "writable/resettable" statistic node via
sysctl.
kern_yield() is problematic than.
The owned mutex is the mount interlock, and it is in fact not needed
to guarantee the stability of the mount list of active vnodes, so fix
the the issue by only taking the mount interlock for MNT_REF and
MNT_REL operations.
While there, augment the unconditional yield by some amount of
spinning [1].
Reported and tested by: pho
Reviewed by: attilio
Submitted by: attilio [1]
MFC after: 3 days
cause kernel panics.
Add a flag to the bpf descriptor to indicate whether the hold buffer
is in use. In bpfread(), set the "hold buffer in use" flag before
dropping the descriptor lock during the call to bpf_uiomove().
Everywhere else the hold buffer is used or changed, wait while
the hold buffer is in use by bpfread(). Add a KASSERT in bpfread()
after re-acquiring the descriptor lock to assist uncovering any
additional hold buffer races.
an IBSS VAP to RUN.
An 11n IBSS was beaconing HTINFO/HTCAP IE's that didn't have any HT
information setup (like the HT TX/RX MCS bitmask.)
Tested:
* AR9280, IBSS - both a statically setup channel and a scanned channel
PR: kern/172955
hierarchy of the page table entries which map the specified address.
Reviewed by: alc (previous version)
Sponsored by: The FreeBSD Foundation
MFC after: 1 week
call. The function indicates a failure by the TRUE return value. To
be extra safe, assert that the return value from the following
vm_map_insert() indicates success.
Fix style issues in the nearby lines, reformulate the comment.
Reviewed by: alc (previous version)
Sponsored by: The FreeBSD Foundation
MFC after: 1 week
parameters in IBSSes.
IBSS was just being plainly ignored here even though aggressive mode
was 'on'.
This still doesn't fix the "why are the WME parameters reset upon
interface down/up" issue.
PR: kern/165969
is totally wrong.
If we parse the WME IE here, we'll be constantly updating the WME
configuration from each WME enabled IBSS node we see.
There's a separate issue where the WME configuration is blanked out
when the interface is brought up; the WME parameters aren't "sticky."
Also, ieee80211_init_neighbor() parses the ath IE, so doing it here
isn't required.
Sorry about the noise.
PR: kern/165969
The Adhoc support wasn't parsing and handling the ath specific and WME
IEs, thus the atheros vendor support and WME TXOP parameters aren't being
copied from the peer.
It copies the WME parameters from whichever adhoc node it decides to
associate to, rather than just having them be statically configured
per adhoc node. This may or may not be exactly "right", but it's certainly
going to be more convienent for people - they just have to ensure their
adhoc nodes are setup with correct WME parameters.
Since WME parameters aren't per-node but are configured on hardware TX
queues, if some nodes support WME and some don't - or perhaps, have
different WME parameters - things will get quite quirky.
So ensure that you configure your adhoc nodes with the same WME
parameters.
Secondly - the Atheros Vendor IE is parsed and operated on per-node, so
this should work out ok between nodes that do and don't do Atheros
extensions. Once you see a becaon from that node and you setup the
association state, it _should_ parse things correctly.
TODO:
* I do need to ensure that both adhoc setup paths are correctly updating
the IE stuff. Ie, if the adhoc node is created by a data frame instead
of a beacon frame, it'll come up with no WME/ath IE config. The next
beacon frame that it receives from that node will update the state.
I just need to sit down and better understand how that's suppose to
work in IBSS mode.
Tested:
* AR5416 <-> AR9280 - fast frames and the WME configuration both popped
up. (This is with a local HAL patch that enables the fast frames
capability on the AR5416 chipsets.)
PR: kern/165969
ctl_frontend_cam_sim.c: Coalesce cfcs_online() and cfcs_offline()
into a single function since these were
identical except for one line.
Make sure we hold the SIM lock around path
creation, and calling xpt_rescan().
scsi_ctl.c: In ctlfe_onoffline(), make sure we hold the
SIM lock around path creation and free
calls, as well as xpt_action().
In ctlfe_lun_enable(), hold the SIM lock
around path and peripheral operations that
require it.
Sponsored by: Spectra Logic Corporation
MFC after: 1 week
The stageqdepth (global, over all staging queues) was being kept
incorrectly. It was being incremented whenever things were added,
but only decremented during a flush. During active fast frames activity
it wasn't being decremented, resulting in it always having a non-zero
value during normal fast-frames operation.
It was only used when checking if the aging queue should be checked;
we may as well just defer to each of those staging queue counters (which
look correct, thankfully.)
Whilst I'm here, add locking assertions in the staging queue add/remove
functions. The current crash shows that the staging queue has one frame,
but only has a tail pointer set (the head pointer being set to NULL.)
I'd like to grab a few more crashes where these locking assertions are
in place so I can narrow down the issue between "somehow locking is
messed up and things are racy" and "the stage queue head/tail pointer
manipulation logic is subtly wrong."
Tested:
* AR5416 STA, AR5413 AP; with FastFrames enabled in the AR5416 HAL.
PR: kern/174283
Committed with changes to support the following from loader.conf(5):
+ console="vidconsole comconsole" (not just console="comconsole")
+ boot_serial="anything" (not just boot_serial="YES")
+ boot_multicons="anything" (unsupported in originally-submitted patch)
PR: conf/121064
Submitted by: koitsu
Reviewed by: gcooper, adrian (co-mentor)
Approved by: adrian (co-mentor)
pointers and leave the stage queue flush routine to just do nothing
(since both head and tail here will be NULL.)
This should quieten the "stageq empty" panic where the stageq itself
is empty, but it won't fix the second KASSERT() here "staging queue empty"
as that's likely a different underlying problem.
PR: kern/174283
similar changes had to be made in various places throughout the machine-
independent virtual memory layer to support the new vm object type.
However, in most of these places, it's actually not the type of the vm
object that matters to us but instead certain attributes of its pages.
For example, OBJT_DEVICE, OBJT_MGTDEVICE, and OBJT_SG objects contain
fictitious pages. In other words, in most of these places, we were
testing the vm object's type to determine if it contained fictitious (or
unmanaged) pages.
To both simplify the code in these places and make the addition of future
vm object types easier, this change introduces two new vm object flags
that describe attributes of the vm object's pages, specifically, whether
they are fictitious or unmanaged.
Reviewed and tested by: kib
to head. I don't think the NFS client behaviour will change unless
the new "minorversion=1" mount option is used. It includes basic
NFSv4.1 support plus support for pNFS using the Files Layout only.
All problems detecting during an NFSv4.1 Bakeathon testing event
in June 2012 have been resolved in this code and it has been tested
against the NFSv4.1 server available to me.
Although not reviewed, I believe that kib@ has looked at it.
while doing a copyout. That can cause a panic, because copyout
can trigger VM faults, and we can't handle VM faults while holding
a mutex.
The solution here is to malloc a separate buffer to hold the OOA
queue entries, so that we don't risk a VM fault while filling up
the buffer and we don't have to drop the lock. The other solution
would be to wire the user's memory while filling their buffer with
copyout, but that would have been a little more complex.
Also fix a debugging parenthesis issue in ctl_abort_task() pointed
out by Chuck Tuffli.
Sponsored by: Spectra Logic Corporation
MFC after: 1 week
drivers.
The bug occurrs when a userland process has the driver instance
open and the underlying device goes away. We get the devfs
callback that the device node has been destroyed, but not all of
the closes necessary to fully decrement the reference count on the
CAM peripheral.
The reason is that once devfs calls back and says the device has
been destroyed, it is moved off to deadfs, and devfs guarantees
that there will be no more open or close calls. So the solution
is to keep track of how many outstanding open calls there are on
the device, and just release that many references when we get the
callback from devfs.
scsi_pass.c,
scsi_enc.c,
scsi_enc_internal.h: Add an open count to the softc in these
drivers. Increment it on open and
decrement it on close.
When we get a devfs callback to say that
the device node has gone away, decrement
the peripheral reference count by the
number of still outstanding opens.
Make sure we don't access the peripheral
with cam_periph_unlock() after what might
be the final call to
cam_periph_release_locked(). The
peripheral might have been freed, and we
will be dereferencing freed memory.
scsi_ch.c,
scsi_sg.c: For the ch(4) and sg(4) drivers, add the
same changes described above, and in
addition, fix another bug that was
previously fixed in the pass(4) and enc(4)
drivers.
These drivers were calling destroy_dev()
from their cleanup routine, but that could
cause a deadlock because the cleanup
routine could be indirectly called from
the driver's close routine. This would
cause a deadlock, because the device node
is being held open by the active close
call, and can't be destroyed.
Sponsored by: Spectra Logic Corporation
MFC after: 1 week
are used by NFSv4.1 for callbacks. A backchannel is a connection
established by the client, but used for RPCs done by the server
on the client (callbacks). As a result, this patch mixes some
client side calls in the server side and vice versa. Some
definitions in the .c files were extracted out into a file called
krpc.h, so that they could be included in multiple .c files.
This code has been in projects/nfsv4.1-client for some time.
Although no one has given it a formal review, I believe kib@
has taken a look at it.
The problem was a race condition between the EDT traversal used by
things like 'camcontrol devlist', and CAM peripheral driver
removal.
The EDT traversal code holds the CAM topology lock, and wants
to show devices that have been invalidated. It acquires a
reference to the peripheral to make sure the peripheral it is
examining doesn't go away.
However, because the peripheral removal code in camperiphfree()
drops the CAM topology lock to call the peripheral's destructor
routine, we can run into a situation where the EDT traversal
increments the peripheral reference count after free process is
already in progress. At that point, the reference count is
ignored, because it was 0 when we started the process.
Fix this race by setting a flag, CAM_PERIPH_FREE, that I previously
added and checked in xptperiphtraverse() and xptpdperiphtravsere(),
but failed to use. If the EDT traversal code sees that flag,
it will know that the peripheral free process has already started,
and that it should not access that peripheral.
Also, fix an inconsistency in the locking between
xptpdperiphtraverse() and xptperiphtraverse(). They now both
hold the CAM topology lock while calling the peripheral traversal
function.
cam_xpt.c: Change xptperiphtraverse() to hold the CAM topology
lock across calls to the traversal function.
Take out the comment in xptpdperiphtraverse() that
referenced the locking inconsistency.
cam_periph.c: Set the CAM_PERIPH_FREE flag when we are in the
process of freeing a peripheral driver.
Sponsored by: Spectra Logic Corporation
MFC after: 1 week
- unp_zone: kern.ipc.maxsockets limit reached
- socket_zone: kern.ipc.maxsockets limit reached
- zone_mbuf: kern.ipc.nmbufs limit reached
- zone_clust: kern.ipc.nmbclusters limit reached
- zone_jumbop: kern.ipc.nmbjumbop limit reached
- zone_jumbo9: kern.ipc.nmbjumbo9 limit reached
- zone_jumbo16: kern.ipc.nmbjumbo16 limit reached
Note that those warnings are printed not often than every five minutes and can
be globally turned off by setting sysctl/tunable vm.zone_warnings to 0.
Discussed on: arch
Obtained from: WHEEL Systems
MFC after: 2 weeks
will be printed once the given zone becomes full and cannot allocate an
item. The warning will not be printed more often than every five minutes.
All UMA warnings can be globally turned off by setting sysctl/tunable
vm.zone_warnings to 0.
Discussed on: arch
Obtained from: WHEEL Systems
MFC after: 2 weeks
This is to allow debug images to be used without taking down the
system when non-fatal asserts are hit.
The following sysctls are added:
debug.kassert.warn_only: 1 = log, 0 = panic
debug.kassert.do_ktr: set to a ktr mask for logging via KTR
debug.kassert.do_log: 1 = log, 0 = quiet
debug.kassert.warnings: stats, number of kasserts hit
debug.kassert.log_panic_at:
number of kasserts before we actually panic, 0 = never
debug.kassert.log_pps_limit: pps limit for log messages
debug.kassert.log_mute_at: stop warning after N kasserts, 0 = never stop
debug.kassert.kassert: set this sysctl to trigger a kassert
Discussed with: scottl, gnn, marcel
Sponsored by: iXsystems
The XC900M acts as a Ubiquiti XR9 (and I _think_ SR9) by default;
it uses the same 900MHz<->2.4GHz downconverter mapping.
However it has an alternative frequency mapping which squeezes in a couple
more half/quarter rate channels. Since the default HAL doesn't support
fractional tuning (sub-1MHz) in 2.4GHz mode on the AR5413/AR5414, they
implement it using a jumper.
Datasheet: http://www.xagyl.com/download/XC900M_Datasheet.pdf
Thankyou to Xagyl Communications for the XC900M NICs and Edgar Martinez
for organising the donation.
Tested:
* XC900M <-> XC900M
* Ubiquiti XR9 <-> XC900M
TODO:
* Test against SR9 and GZ901 if possible (the IEEE channel<->frequency
mapping may not match up, thanks to the slightly different channels
involved)
EPROTONOSUPPORT if the address family is not supported.
- introduce pffinddomain() to find a domain by family and use it as
appropriate.
Reviewed by: glebius
id hash. If a state has been disconnected from id hash, its rule pointers
can no longer be dereferenced, and referenced memory can't be modified.
Thus, move rule statistics from pf_free_rule() to pf_unlink_rule() and
update them prior to releasing id hash slot lock.
Reported by: Ian FREISLICH <ianf cloudseed.co.za>