them out of the #if _BSD_VISIBLE block. Other headers may depend on
__bitcount(). The dependencies can be a header not specified by
POSIX, and then namespace restrictions by _XOPEN_SOURCE are not
applicable, as it was reported. Or, we might grow an implementation
of some POSIX facility using __bitcount(), which also should work.
Reported by: Jason Schulz <schulz.j@gmail.com>
Discussed with: jhb
Sponsored by: The FreeBSD Foundation
MFC after: 2 weeks
in the join message so the firmware would pick it up.
* Strip out the direct hardware fiddling for 40MHz mode - the firmware
we're using doesn't require it (the rtl8712su firmware does; it
is less 'fullmac' than what we're using.)
* Fix the mbuf handling during errors - rsu_tx shouldn't free mbufs;
it's up to the caller to do so. This brings it in line with
what other drivers do or should be doing.
Tested:
* RTL8712, HT40 channel, STA mode (during this commit)
belong to a vm object, they can't be paged out. Since they can't be paged
out, they are never enqueued in a paging queue. Nonetheless, passing
PQ_INACTIVE to vm_page_unwire() creates the appearance that these pages
are being enqueued in the inactive queue. As of r288122, we can avoid
this false impression by passing PQ_NONE.
Submitted by: kmacy (an earlier version)
Differential Revision: https://reviews.freebsd.org/D1674
Atheros.
Thanks to OpenBSD for providing a driver based on the original
Atheros open source driver circa 2008. This uses the early, pre-carl9170
atheros provided firmware.
It only supports 11bg at the moment. I've not tested it with 11a
(and so the TX rate control logic may be slightly wrong!) so if
you do have the dual-band version of this hardware please do let me know.
Tested:
* AR9170, TP-Link WN821N 2GHz.
TODO:
* Hook this up to a non-module build.
net80211 receive path. This allows drivers (notably USB right now, but
anything/everything!) to optionally defer bulk RX of 802.11 frames until
/outside/ of the driver lock(s), rather than doing:
UNLOCK(sc);
ieee80211_input*()
LOCK(sc);
.. which is really stupid.
The existing API is maintaned - if ieee80211_input() / ieee80211_input_all()
is called then the RSSI/NF values are used. If the MIMO versions are called
with a given rx status pointer then it's used. Else, it'll use whatever
is in the RX mbuf tag.
Revamp sbuf_put_byte() to sbuf_put_bytes() in the obvious fashion and
fixup callers.
Add a thin shim around sbuf_put_bytes() with the old ABI to avoid ugly
changes to some callers.
Reviewed by: jhb, markj
Obtained from: Dan Sledz
Sponsored by: EMC / Isilon Storage Division
Differential Revision: https://reviews.freebsd.org/D3717
are updated lockess, different CPUs write its own view of timecounter
state. The critical section is done for safety, callers of
tc_cpu_ticks() are supposed to already enter critical section, or to
own a spinlock.
The change fixes sporadical reports of too high values reported for
the (W)CPU on platforms that do not provide cpu ticker and use
tc_cpu_ticks(), in particular, arm*.
Diagnosed and reviewed by: jhb
Sponsored by: The FreeBSD Foundation
MFC after: 1 week
A change to a property on a dataset must be propagated to its descendants
in case that property is inherited. For datasets whose information is
not currently loaded into memory (e.g. a snapshot that isn't currently
mounted), there is nothing to do; the property change will take effect
the next time that dataset is loaded. To handle updates to datasets that
are in-core, ZFS registers a callback entry for each property of each
loaded dataset with the dsl directory that holds that dataset. There
is a dsl directory associated with each live dataset that references
both the live dataset and any snapshots of the live dataset. A property
change is effected by doing a traversal of the tree of dsl directories
for a pool, starting at the directory sourcing the change, and invoking
these callbacks.
The current implementation both registers and de-registers properties
individually for each loaded dataset. While registration for a property is
O(1) (insert into a list), de-registration is O(n) (search list and then
remove). The 'n' for de-registration, however, is not limited to the size
(number of snapshots + 1) of the dsl directory. The eviction portion
of the life cycle for the in core state of datasets is asynchronous,
which allows multiple copies of the dataset information to be in-core
at once. Only one of these copies is active at any time with the rest
going through tear down processing, but all copies contribute to the
cost of performing a dsl_prop_unregister().
One way to create multiple, in-flight copies of dataset information
is by performing "zfs list" operations from multiple threads
concurrently. In-core dataset information is loaded on demand and then
evicted when reference counts drops to zero. For datasets that are not
mounted, there is no persistent reference count to keep them resident.
So, a list operation will load them, compute the information required to
do the list operation, and then evict them. When performing this operation
from multiple threads it is possible that some of the in-core dataset
information will be reused, but also possible to lose the race and load
the dataset again, even while the same information is being torn down.
Compounding the performance issue further is a change made for illumos
issue 5056 which made dataset eviction single threaded. In environments
using automation to manage ZFS datasets, it is now possible to create
enough of a backlog of dataset evictions to consume excessive amounts
of kernel memory and to bog down the system.
The fix employed here is to make property de-registration O(1). With this
change in place, it is hoped that a single thread is more than sufficient
to handle eviction processing. If it isn't, the problem can be solved
by increasing the number of threads devoted to the eviction taskq.
sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dsl_dataset.c
sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dsl_dir.c:
sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dsl_prop.c:
sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sys/dsl_dataset.h:
sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sys/dsl_dir.h:
sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sys/dsl_prop.h:
Associate dsl property callback records with both the
dsl directory and the dsl dataset that is registering the
callback. Both connections are protected by the dsl directory's
"dd_lock".
When linking callbacks into a dsl directory, group them by
the property type. This helps reduce the space penalty for the
double association (the property name pointer is stored once
per dsl_dir instead of in each record) and reduces the number of
strcmp() calls required to do callback processing when updating
a single property. Property types are stored in a linked list
since currently ZFS registers a maximum of 10 property types
for each dataset.
Note that the property buckets/records associated with a dsl
directory are created on demand, but only freed when the dsl
directory is freed. Given the static nature of property types
and their small number, there is no benefit to freeing the few
bytes of memory used to represent the property record earlier.
When a property record becomes empty, the dsl directory is either
going to become unreferenced a little later in this thread of
execution, or there is a high chance that another dataset is
going to be loaded that would recreate the bucket anyway.
Replace dsl_prop_unregister() with dsl_prop_unregister_all().
All callers of dsl_prop_unregister() are trying to remove
all property registrations for a given dsl dataset anyway. By
changing the API, we can avoid doing any lookups of callbacks
by property type and just traverse the list of all callbacks
for the dataset and free each one.
sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dmu_objset.c:
sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_vfsops.c:
Replace use of dsl_prop_unregister() with the new
dsl_prop_unregister_all() API.
illumos/illumos-gate@03bad06fbb
Author: Justin Gibbs <gibbs@scsiguy.com>
Reviewed by: Matthew Ahrens <mahrens@delphix.com>
Reviewed by: Prakash Surya <prakash.surya@delphix.com>
Approved by: Dan McDonald <danmcd@omniti.com>
Illumos issue:
6171 dsl_prop_unregister() slows down dataset eviction
https://www.illumos.org/issues/6171
MFC after: 2 weeks
Refer to the usb_quirk(4) manual page for more details on how to use
this new feature.
Submitted by: Maxime Soule <btik-fbsd@scoubidou.com>
PR: 203249
MFC after: 2 weeks
* Don't free the mbuf in the tx path - it uses the transmit path now,
so the caller frees the mbuf.
* Don't decrement the node ref upon error - that's up to the caller to
do as well.
Tested:
* Intel 5300 3x3 wifi, station mode
Noticed by: <s3erios@gmail.com>
This avoids needing a large boot partition / file system in order to
accommodate multiple kernels, and provides consistency with userland
debug. This also simplifies the process of moving kernel debug files
to a separate package and installing them on demand.
In addition, change kernel debug file extension to .debug, to match
userland debug files.
When using the supported kernel installation method the
/usr/lib/debug/boot/kernel directory will be renamed (to kernel.old)
as is done with /boot/kernel.
Developers wishing to maintain the historical behavior of installing
debug files in /boot/kernel/ can set KERN_DEBUGDIR="" in src.conf(5).
Reviewed by: bdrewery, brooks, imp, markj
Relnotes: yes
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D1006
Skip a /dev/ prefix, if one is present, when checking for matching
device names for dump.
Suggested by: avg
Reviewed by: markj
Sponsored by: EMC / Isilon Storage Division
Differential Revision: https://reviews.freebsd.org/D3725
We allow to modify only few fields in mode pages now, but still it is
not good if they unexpectedly change during failover. Also this fixes
reporting of "Mode parameters changed" UAs on secondary node.
HA protocol requires strict version, parameters and configuration match.
Differences there may cause full set of problems up to kernel panic.
To avoid that, validate peer parameters on connect, and abort connection
immediately if some mismatch detected.
- Eliminate bogus page replacement that is inconsistently applied in the
invalidation loop in brelse. This has been a no-op in modern times as
biodone() is responsible for cleaning up after bogus pages. This
would've spammed the console with printfs at a minimum.
- Allow the compiler and human readers alike to reason about allocbuf()
by splitting it into constituent parts.
- Separate the VM manipulating and buf manipulating code in brelse() and
bufdone() so that the intentions are clear. This makes it evident that
there are several duplicated buf pages loops that will be consolidated
at a later time.
Reviewed by: kib
Tested by: pho
Sponsored by: EMC / Isilon Storage Division
reschedule. Right now arm_cpu_intr() does critical_exit() as the last
action, so the impact is not serious.
Remove duplicated interrupt disable in restore_registers macro, when
returning to usermode. The do_ast macro disabled interrupts for us.
Reviewed by: andrew
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D3714
queue and (2) returns a Boolean indicating whether the page's wire count
transitioned to zero.
Exploit this change in vfs_vmio_release() to avoid pointlessly enqueueing
a page that is about to be freed.
(An earlier version of this change was developed by attilio@ and kmacy@.
Any errors in this version are my own.)
Reviewed by: kib
Sponsored by: EMC / Isilon Storage Division
* Add a new method to control NIC poweron / network-sleep / power off;
* Add in A-MPDU TX negotiation support, but comment it out because it
does break TX traffic;
* blank out the tx buffer before sending a firmware message, just in case;
* go into network-sleep once associated;
TODO:
* figure out why ampdu negotiation isn't working and breaking TX traffic,
then enable it.
Some fullmac devices may rely on the stack starting it but not doing it.
Whilst here, remove a duplicate LE_* macro definition, thanks to
Andriy Voskoboinyk <s3erios@gmail.com>.
Changes to kern.pid_max mib after the boot can break this relation.
The maxfiles value was calculated by the MAXFILES formula based on
maxproc value, but this change decouples them, and MAXFILES now
references maxusers. Without manual tuning, the maxfiles default
value remains as it was prior to this commit. But for systems which
have tuned maxproc and rely on maxfiles to adjust, additional
reconfiguration is needed.
Reported by: rwatson
Reviewed by: emaste
Sponsored by: The FreeBSD Foundation
MFC after: 2 weeks
e.g., based on wrong "next header" assumptions (which does not have to point to
the upper layer protocol), or using hard-coded UDP instead of UDP or UDP-Lite
possibly switching protocols. Fix those cases for UDP-Lite to work correctly.
PR: 202788
Submitted by: Tiwei Bie (btw mail.ustc.edu.cn) [parts]
Reviewed by: gnn, Tiwei Bie (btw mail.ustc.edu.cn),
kevlo (earlier version)
MFC after: 2 weeks
Differential Revision: https://reviews.freebsd.org/D3686
c546f36aa8https://www.illumos.org/issues/6220
5408 introduced a memleak in l2arc, namely the member b_thawed gets leaked when
an arc_hdr is realloced from full to l2only.
Reviewed by: Saso Kiselkov <saso.kiselkov@nexenta.com>
Reviewed by: Simon Klinkert <simon.klinkert@gmail.com>
Reviewed by: George Wilson <george@delphix.com>
Approved by: Robert Mustacchi <rm@joyent.com>
Author: Arne Jansen <sensille@gmx.net>
function. The change is mostly mechanical with the following exception:
Last piece of nd6_resolve_slow() was refactored: ND6_LLINFO_PERMANENT
condition was removed as always-true, explicit ND6_LLINFO_NOSTATE ->
ND6_LLINFO_INCOMPLETE state transition was removed as duplicate.
Reviewed by: ae
Sponsored by: Yandex LLC
to do it directly.
Ensure that we re-queue starting transmit upon TX completion.
This solves two issues:
* It stops tx stalls - before this, if the transmit path filled the
mbuf queue then it'd never start another transmit.
* It enforces ordering - this is very required for 802.11n which
requires frames to be transmitted in the order they're queued.
Since everything remotely involved in USB has an unlock/thing/relock
pattern with that mutex, the only way to guarantee TX ordering is
to 100% defer it into a separate thread.
This now survives an iperf test and gets a reliable 30mbit/sec.
Correctly (I hope!) remove net80211 references before doing so.
Just doing a dumb mbufq drain isn't enough.
If enough traffic occurs and the mbuf queue fills up then transmit
stalls (which I'm not fixing in this commit!) but then the mbuf queue
stays full until the driver is removed. There's also the net80211
node refcounting leak.
This just ensures that during rsu_stop and detach the mbuf queue
is purged (and references!) so the queue-full situation can be
recovered from.
This code initializes the GMAC clock and sets the pin mux to rgmii.
It also override the if_dwc defaults to set the alternate descriptor type
and MII clock used on A20.
Tested on cubieboard2 and banana pi.
setup pieces and so (at least) transmit doesn't work.
It'll just fall back to being a straight HT20 device and negotiate
HT20 only.
Tested by: Idwer Vollering <vidwer@gmail.com>
file descriptor opened for complimentary access exists as well.
The implementation of the guarantee is done by counting the
generations of readers and writers opens. We return success and not
EINTR or ERESTART error, when the sleep for complimentary opening is
interrupted, but the generation was changed during the sleep.
Longer explanation: assume there are two threads, A doing open("fifo",
O_RDONLY) and B doing open("fifo", O_WRONLY), and no other threads
either trying to open the fifo, nor there are any file descriptors
referencing the fifo. Before the change, it was possible e.g. for for
thread A to return a valid file descriptor, while thread B returned
EINTR if a signal to B was delivered simultaneously with the wakeup
from A. After the change, in this situation both A::open() and
B::open() succeed and the signal is made "as if" it was noticed
slightly later. Note that the signal actual delivery is not changed,
it is done by ast on syscall return path, so signal handler is still
executed before first instruction after syscall.
See PR for the code demonstrating the issue.
PR: 203162
Reported by: Victor Stinner victor.stinner@gmail.com
Reviewed by: jilles
Tested by: bapt, pho
Sponsored by: The FreeBSD Foundation
MFC after: 1 week
should not assume that vm_pages_needed will remain set while it sleeps.
Other threads can clear vm_pages_needed by performing a sufficient
number of vm_page_free() calls, e.g., process termination. The effect
of this error was that vm_pageout_worker() would free and/or launder
pages when, in fact, there was no shortage of free pages.
Rewrite a nearby comment to describe all of the possible cases and not
just the most common case. The problem being that the comment made
the most common case seem like the only case.
Reviewed by: kib
MFC after: 1 week
Sponsored by: EMC / Isilon Storage Division
This also adds a newbus interface that allows a SoC to override the
following settings:
- if_dwc specific SoC initialization;
- if_dwc descriptor type;
- if_dwc MII clock.
This seems to be an old version of the hardware descriptors but it is
still in use in a few SoCs (namely Allwinner A20 and Amlogic at least).
Tested on Cubieboard2 and Banana pi.
Tested for regressions on Altera Cyclone by br@ (old version).
Obtained from: NetBSD
linkers no longer raise an error when undefined weak symbols are
found, but relocate as if the symbol value was 0. Note that we do not
repeat the mistake of userspace dynamic linker of making the symbol
lookup prefer non-weak symbol definition over the weak one, if both
are available. In fact, kernel linker uses the first definition
found, and ignores duplicates.
Signature of the elf_lookup() and elf_obj_lookup() functions changed
to split result/error code and the symbol address returned.
Otherwise, it is impossible to return zero address as the symbol
value, to MD relocation code. This explains the mechanical changes in
elf_machdep.c sources.
The powerpc64 R_PPC_JMP_SLOT handler did not checked error from the
lookup() call, the patch leaves the code as is (untested).
Reported by: glebius
Sponsored by: The FreeBSD Foundation
MFC after: 1 week
REPORT LUNS command is more related to target rather then specific LUN.
This node may be primary for LUNs for some reason unknown to another,
and command forwarded to another node won't be able to report them.
REQUEST SENSE is related to LUN, but in our implementation it reports
only UAs and CAs, that are stored locally rather then on primary node.
* Since new extries are now allocated explicitly, fill in
all the necessary fields for lle _before_ attaching it to the table.
* Remove ND6_LLINFO_INCOMPLETE check which was unused even in
first KAME merge (r53541).
* After that, the only new state that function can set, was
ND6_LLINFO_STALE. Given everything above, simplify logic besides
do_update and is_newentry.
* Fix nd_resolve() comment.
Previously, with serseq enabled, next command was unblocked only after
previous completed. With this change, for read operations, next command
is unblocked as soon as last media read completed. This is important
for frontends that actually wait for data move completion (like camtgt),
or when data are moved through the HA link, or especially when both.
Note that the mountlist manipulations are somewhat fragile, and not very
pretty. The reason for this is to avoid changing vfs_mountroot(), which
is (obviously) rather mission-critical, but not very well documented,
and thus hard to test properly. It might be possible to rework it to use
its own simple root mount mechanism instead of vfs_mountroot().
Reviewed by: kib@
MFC after: 1 month
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D2698
We should not call vm_fault(), or send a signal, with interrupts
disabled. MI kernel code is not prepared for such environment, not to
mention that this increases system latency, since code appears to be
executing as being under spinlock.
The FAR register for data aborts is read before the interrupts are
enabled, to avoid its corruption due to nested exception or context
switch.
Add asserts, similar to the checks done by other architectures, about
not taking page faults in non-sleepable contexts, rather than die with
late and somewhat confusing witness diagnostic.
Reviewed by: andrew
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D3669