As of r302092 libc uses pipe2() with a zero flags value instead of pipe().
Commit with regenerated files and implementation to follow.
Approved by: re (gjb)
Sponsored by: DARPA, AFRL
Differential Revision: https://reviews.freebsd.org/D6816
This will result in lock recursion and is more generally incorrect since
the completion handlers will just reinsert the BIOs into the queue we're
trying to drain.
Reviewed by: imp, ngie
Approved by: re (gjb)
MFC after: 3 weeks
Sponsored by: EMC / Isilon Storage Division
Differential Revision: https://reviews.freebsd.org/D6908
There is an order between covered vnode lock and allproc_lock, which
is established by calling mountcheckdirs() while owning the covered
vnode lock. mountcheckdirs() iterates over the processes, protected by
allproc_lock. This order is needed and seems to be not avoidable.
On the other hand, various VM daemons also need to iterate over all
processes, and they lock and unlock user maps. Since unlock of the
user map may trigger processing of the deferred map entries, it causes
vnode locking to occur. Or, when vmspace is freed, dropping references
on the vnode-backed object also lock vnodes. We get reverted order
comparing with the mount/unmount order.
For VM daemons, there is no need to own allproc_lock while we operate
on vmspaces. If the process is held, it serves as the marker for
allproc list, which allows to continue the iteration.
Add _PHOLD_LITE() macro, similar to _PHOLD(), but not causing swap-in
of the kernel stacks. It is used instead of _PHOLD() in vm code,
since e.g. calling faultin() in OOM conditions only exaggerates the
problem.
Modernize comment describing PHOLD.
Reported by: lists@yamagi.org
Tested by: pho (previous version)
Reviewed by: jhb
Sponsored by: The FreeBSD Foundation
MFC after: 3 week
Approved by: re (gjb)
Differential revision: https://reviews.freebsd.org/D6679
after the underlying device went away.
The problem was that callers who queue the GEOM resize provider
event didn't check to make sure that the provider had not been
withered. For the other equivalent case, g_new_provider_event(),
the code checks to see whether the provider has been withered
before queueing a g_new_provider_event() to the event thread.
In some cases, a resize provider event would come through after
the provider had been withered and all of the existing consumers
had been orphaned. When the resize event triggered a taste of
the provider, that would attach a new consumer to the now
withered provider. The wither washer (g_wither_washer() would
never be able to completely tear down the GEOM because of the
consumers that were hanging around.
The solution was to check the G_PF_WITHER provider flag before
queueing the g_resize_provider_event(), and add an assert to
g_resize_provider_event() to insure that it isn't called on a
withered provider.
sys/geom/geom_subr.c:
In g_resize_provider(), don't try to continue if the
G_PF_WITHER flag is set.
In g_resize_provider_event(), add an assert that the
G_PF_WITHER flag is not set.
In g_access(), if a provider has an error, print out the
name of the provider with the error.
Sponsored by: Spectra Logic
Approved by: re (marius)
MFC after: 3 days
some fields to match the order in the struct. Especially needed
if_pf_kif to do pf(4) VNET debugging.
Approved by: re (marius)
Obtained from: projects/vnet
MFC after: 1 week
Sponsored by: The FreeBSD Foundation
uncategorised reason. We need to read the fault address register before
enabling interrupts as the interrupt handler may cause this register to
change.
Approved by: re (marius, kib)
Obtained from: ABT Systems Ltd
Sponsored by: The FreeBSD Foundation
without VIMAGE support would dereference a NULL point unconditionally
leading to a panic. Wrap the entire VIMAGE related code with #ifdefs
rather than just the decision making part to save an extra bit of
resources.
Reported by: np
Sponsored by: The FreeBSD Foundation
MFC After: 13 days
Approved by: re (marius)
version of the XHCI specification. Make sure the code can handle the
maximum number of allowed scratch pages.
Submitted by: Shichun_Ma@Dell.com
Approved by: re (hrs)
MFC after: 1 week
File and disk-backed I/O requests store counts of read/written disk
blocks in each AIO job so that they can be charged to the thread that
completes an AIO request via aio_return() or aio_waitcomplete(). This
change extends AIO jobs to store counts of received/sent messages and
updates socket backends to set these counts accordingly. Note that
the socket backends are careful to only charge a single messages for
each AIO request even though a single request on a blocking socket might
invoke sosend or soreceive multiple times. This is to mimic the
resource accounting of synchronous read/write.
Adjust the UNIX socketpair AIO test to verify that the message resource
usage counts update accordingly for aio_read and aio_write.
Approved by: re (hrs)
Sponsored by: Chelsio Communications
Differential Revision: https://reviews.freebsd.org/D6911
The DPADD data in .depend will be redundant with what is in the .meta file.
Also extend NO_EXTRADEPEND support to bsd.prog.mk.
Approved by: re (blanket, META_MODE)
Sponsored by: EMC / Isilon Storage Division
device is gone.
The problem was that when disk_gone() is called, if the GEOM disk
creation process has not yet happened, the withering process
couldn't start.
We didn't record any state in the GEOM disk code, and so the d_gone()
callback to the da(4) driver never happened.
The solution is to track the state of the creation process, and
initiate the withering process from g_disk_create() if the disk is
being created.
This change does add fields to struct disk, and so I have bumped
DISK_VERSION.
geom_disk.c: Track where we are in the disk creation process,
and check to see whether our underlying disk has
gone away or not.
In disk_gone(), set a new d_goneflag variable that
g_disk_create() can check to see if it needs to
clean up the disk instance.
geom_disk.h: Add a mutex to struct disk (for internal use) disk
init level, and a gone flag.
Bump DISK_VERSION because the size of struct disk has
changed and fields have been added at the beginning.
Sponsored by: Spectra Logic
Approved by: re (marius)
fully-pessimized implementation that requires a type to be aligned to
its natural size.
On armv6+ the compiler might generate load-/store-multiple instructions
which require 4-byte alignment even though the source code is only
accessing individual uint32_t values in a way that doesn't require any
particular alignment at all. The compiler apparently feels free to
combine multiple accesses into a single instruction that requires a
more-strict alignment, and no set of compiler flags seems to disable
this behavior (at least in clang 3.8).
This fixes alignment faults on arm systems using wifi adapters. The
wifi code uses ALIGNED_POINTER(p, uint32_t) to decide whether it needs
to copy-align tcp headers. Because clang is combining several uint32_t
accesses into a single ldm instruction, we need to say that accessing a
uint32_t requires 4-byte alignment.
Approved by: re(gjb)
statistics. Marking is done by setting the OBJ_ACTIVE flag. The
flags change is locked, but the problem is that many parts of system
assume that vm object initialization ensures that no other code could
change the object, and thus performed lockless. The end result is
corrupted flags in vm objects, most visible is spurious OBJ_DEAD flag,
causing random hangs.
Avoid the active object marking, instead provide equally inexact but
immutable is_object_alive() definition for the object mapped state.
Avoid iterating over the processes mappings altogether by using
arguably improved definition of the paging thread as one which sleeps
on the v_free_count.
PR: 204764
Diagnosed by: pho
Tested by: pho (previous version)
Reviewed by: alc
Sponsored by: The FreeBSD Foundation
MFC after: 1 week
Approved by: re (gjb)
It turns out that getting decent performance requires stacking the TX
FIFO a little more aggressively.
* Ensure that when we complete a frame, we attempt to push a new frame
into the FIFO so TX is kept as active as it needs to be
* Be more aggressive about batching non-aggregate frames into a single
TX FIFO slot. This "fixes" TDMA performance (since we only get one
TX FIFO slot ungated per DMA beacon alert) but it does this by pushing
a whole lot of work into the TX FIFO slot.
I'm not /entirely/ pleased by this solution, but it does fix a whole bunch
of corner case issues in the transmit side and fix TDMA whilst I'm at it.
I'll go revisit transmit packet scheduling in ath(4) post 11.
Tested:
* AR9380, STA mode
* AR9580, hostap mode
* AR9380, TDMA client mode
Approved by: re (hrs)
than removing the network interfaces first. This change is rather larger
and convoluted as the ordering requirements cannot be separated.
Move the pfil(9) framework to SI_SUB_PROTO_PFIL, move Firewalls and
related modules to their own SI_SUB_PROTO_FIREWALL.
Move initialization of "physical" interfaces to SI_SUB_DRIVERS,
move virtual (cloned) interfaces to SI_SUB_PSEUDO.
Move Multicast to SI_SUB_PROTO_MC.
Re-work parts of multicast initialisation and teardown, not taking the
huge amount of memory into account if used as a module yet.
For interface teardown we try to do as many of them as we can on
SI_SUB_INIT_IF, but for some this makes no sense, e.g., when tunnelling
over a higher layer protocol such as IP. In that case the interface
has to go along (or before) the higher layer protocol is shutdown.
Kernel hhooks need to go last on teardown as they may be used at various
higher layers and we cannot remove them before we cleaned up the higher
layers.
For interface teardown there are multiple paths:
(a) a cloned interface is destroyed (inside a VIMAGE or in the base system),
(b) any interface is moved from a virtual network stack to a different
network stack ("vmove"), or (c) a virtual network stack is being shut down.
All code paths go through if_detach_internal() where we, depending on the
vmove flag or the vnet state, make a decision on how much to shut down;
in case we are destroying a VNET the individual protocol layers will
cleanup their own parts thus we cannot do so again for each interface as
we end up with, e.g., double-frees, destroying locks twice or acquiring
already destroyed locks.
When calling into protocol cleanups we equally have to tell them
whether they need to detach upper layer protocols ("ulp") or not
(e.g., in6_ifdetach()).
Provide or enahnce helper functions to do proper cleanup at a protocol
rather than at an interface level.
Approved by: re (hrs)
Obtained from: projects/vnet
Reviewed by: gnn, jhb
Sponsored by: The FreeBSD Foundation
MFC after: 2 weeks
Differential Revision: https://reviews.freebsd.org/D6747
1) Unload mbuf instead of descriptor in rtwn_tx_done().
2) Add more synchronization for device visible mappings before
touching the memory.
3) Improve watchdog timer logic.
Reported and tested by: mva
Approved by: re (gjb)
Remove frames from active/pending Tx queues and free related node
references when vap is destroyed to prevent various use-after-free
scenarios.
Reported and tested by: Aleksander Alekseev <afiskon@devzen.ru>
PR: 208632
Approved by: re (gjb)
Use MPI2_IOCSTATUS_MASK when checking IOCStatus to mask off the log bit, and
make a few more things endian-safe.
- Fix possible use of invalid pointer.
It was possible to use an invalid pointer to get the target ID value. To fix
this, initialize a local Target ID variable to an invalid value and change that
variable to a valid value only if the pointer to the Target ID is not NULL.
- No need to set the MPSSAS_SHUTDOWN flag because it's never used.
- done_ccb pointer can be used if it is NULL.
To prevent this, move check for done_ccb == NULL to before done_ccb is used in
mpssas_stop_unit_done().
- Disks can go missing until a reboot is done in some cases.
This is due to the DevHandle not being released, which causes the Firmware to
not allow that disk to be re-added.
Reviewed by: ken
Approved by: re (gjb), ken, scottl, ambrisko (mentors)
MFC after: 1 week
Differential Revision: https://reviews.freebsd.org/D6872
Among other things, this introduces the idea of DBA-gated queues that
aren't the CABQ. The TDMA support requires this.
Tested:
* AR9580 (hostap mode)
* AR9380 (sta mode)
Approved by: re (gjb)
This started showing up when doing lots of aggregate traffic. For TDMA it's
always no-ACK traffic and I didn't notice this, and I didn't notice it
when doing 11abg traffic as it didn't fail enough in a bad way to trigger
this.
This showed up as the fifo depth being < 0.
Eg:
Jun 19 09:23:07 gertrude kernel: ath0: ath_tx_edma_push_staging_list: queued 2 packets; depth=2, fifo depth=1
Jun 19 09:23:07 gertrude kernel: ath0: ath_edma_tx_processq: Q1, bf=0xfffffe000385f068, start=1, end=1
Jun 19 09:23:07 gertrude kernel: ath0: ath_edma_tx_processq: Q1: FIFO depth is now 0 (1)
Jun 19 09:23:07 gertrude kernel: ath0: ath_edma_tx_processq: Q1, bf=0xfffffe0003866fe8, start=0, end=1
Jun 19 09:23:07 gertrude kernel: ath0: ath_edma_tx_processq: Q1: FIFO depth is now -1 (0)
So, clear the flags before adding them to a TX queue, so if they're
re-added for the retransmit path it'll clear whatever they were and
not double-account the FIFOEND flag. Oops.
Tested:
* AR9380, STA mode, 11n iperf testing (~130mbit)
Approved by: re (delphij)
explains the plausible scenario), resulting in EDEADLK returned on the
local registration attempt. Handle this by re-trying the local op [1].
On unmount, local registration abort is indicated as EINTR, abort the nlm
call as well.
Reported and tested by: pho
Suggested and reviewed by: dfr (previous version, [1])
Sponsored by: The FreeBSD Foundation
MFC after: 1 week
Approved by: re (delphij)
Drop scan generation number and node table scan lock - the only place
where ni_scangen is checked is in ieee80211_timeout_stations() (and it
is used to prevent duplicate checking of the same node); node scan lock
protects only this variable + node table scan generation number.
This will fix (at least) next LOR (hostap mode):
lock order reversal:
1st 0xc175f84c urtwm0_scan_loc (urtwm0_scan_loc) @ /usr/src/sys/modules/wlan/../../net80211/ieee80211_node.c:2019
2nd 0xc175e018 urtwm0_com_lock (urtwm0_com_lock) @ /usr/src/sys/modules/wlan/../../net80211/ieee80211_node.c:2693
stack backtrace:
#0 0xa070d1c5 at witness_debugger+0x75
#1 0xa070d0f6 at witness_checkorder+0xd46
#2 0xa0694cce at __mtx_lock_flags+0x9e
#3 0xb03ad9ef at ieee80211_node_leave+0x12f
#4 0xb03afd13 at ieee80211_timeout_stations+0x483
#5 0xb03aa1c2 at ieee80211_node_timeout+0x42
#6 0xa06c6fa1 at softclock_call_cc+0x1e1
#7 0xa06c7518 at softclock+0xc8
#8 0xa06789ae at intr_event_execute_handlers+0x8e
#9 0xa0678fa0 at ithread_loop+0x90
#10 0xa0675fbe at fork_exit+0x7e
#11 0xa08af910 at fork_trampoline+0x8
In addition to the above:
* switch to ieee80211_iterate_nodes();
* do not assert that node table lock is held, while calling node_age();
that's not really needed (there are no resources, which can be protected
by this lock) + this fixes LOR/deadlock between ieee80211_timeout_stations()
and ieee80211_set_tim() (easy to reproduce in HOSTAP mode while
sending something to an STA with enabled power management).
Tested:
* (avos) urtwn0, hostap mode
* (adrian) AR9380, STA mode
* (adrian) AR9380, AR9331, AR9580, hostap mode
Notes:
* This changes the net80211 internals, so you have to recompile all of it
and the wifi drivers.
Submitted by: avos
Approved by: re (delphij)
Differential Revision: https://reviews.freebsd.org/D6833
It turns out the frame scheduling policies (eg DBA_GATED) operate on
a single TX FIFO entry. ASAP scheduling is fine; those frames always
go out.
DBA-gated sets the TX queue ready when the DBA timer fires, which triggers
a beacon transmit. Normally this is used for content-after-beacon queue
(CABQ) work, which needs to burst out immediately after a beacon.
(eg broadcast, multicast, etc frames.) This is a general policy that you
can use for any queue, and Sam's TDMA code uses it.
When DBA_GATED is used and something like say, an 11e TX burst window,
it only operates on a single TX FIFO entry. If you have a single frame
per TX FIFO entry and say, a 2.5ms long burst window (eg TDMA!) then it'll
only burst a single frame every 2.5ms. If there's no gating (eg ASAP) then
the burst window is fine, and multiple TX FIFO slots get used.
The CABQ code does pack in a list of frames (ie, the whole cabq) but
up until this commit, the normal TX queues didn't. It showed up when
I started to debug TDMA on the AR9380 and later.
This commit doesn't fix the TDMA case - that's still broken here, because
all I'm doing here is allowing 'some' frames to be bursting, but I'm
certainly not filling the whole TX FIFO slot entry with frames.
Doing that 'properly' kind of requires me to take into account how long
packets should take to transmit and say, doing 1.5 or something times that
per TX FIFO slot, as if you partially transmit a slot, when it's next
gated it'll just finish that TX FIFO slot, then not advance to the next
one.
Now, I /also/ think queuing a new packet restarts DMA, but you have to
push new frames into the TX FIFO. I need to experiment some more with
this because if it's really the case, I will be able to do TDMA support
without the egregious hacks I have in my local tree. Sam's TDMA code
for previous chips would just kick the TXE bit to push along DMA
again, but we can't do that for EDMA chips - we /have/ to push a new
frame into the TX FIFO to restart DMA. Ugh.
Tested:
* AR9380, STA mode
* AR9380, hostap mode
* AR9580, hostap mode
Approved by: re (gjb)
This allows IPv6 link local addresses (and other IPv6 functionality) to work.
PR: 210355
Submitted by: Steve Wahl and David Bright (both at Dell Inc.)
Reviewed by: cem, mav
Tested by: mav (on Intel hardware)
Approved by: re (kib)
MFC after: 5 days
Sponsored by: Dell Inc.
Differential Revision: https://reviews.freebsd.org/D6885
dropping the reference on mnt_cred. Prevent this by referencing the
temporal credentials before unlock.
Tested by: pho
Reviewed by: dfr
Sponsored by: The FreeBSD Foundation
MFC after: 1 week
Approved by: re (gjb)
Adopt the OpenBSD syntax for setting and filtering on VLAN PCP values. This
introduces two new keywords: 'set prio' to set the PCP value, and 'prio' to
filter on it.
Reviewed by: allanjude, araujo
Approved by: re (gjb)
Obtained from: OpenBSD (mostly)
Differential Revision: https://reviews.freebsd.org/D6786
This apparently puts ARC back under the limits after the vnode pressure
rework in r291244, in particular due to the kmem exhaustion.
Based on patch by: mckusick
Reviewed by: avg, mckusick
Tested by: allanjude, madpilot
Sponsored by: The FreeBSD Foundation
Approved by: re (gjb)
to mount points with the given filesystem type, specified by mount
vfs_ops pointer.
Based on patch by: mckusick
Reviewed by: avg, mckusick
Tested by: allanjude, madpilot
Sponsored by: The FreeBSD Foundation
Approved by: re (gjb)
reported by EFI implementation. This address comment on r301714.
Approved by: re (gjb), andrew (mentor)
Differential Revision: https://reviews.freebsd.org/D6787
Maps Sonics/OCP per-core address spaces to bcma(4)-compatible port/region
identifiers.
This permits the use of common address map identifiers in bhnd device
drivers, independent of the underlying interconnect type.
Approved by: re (gjb), adrian (mentor)
Differential Revision: https://reviews.freebsd.org/D6850
- Delete all chipc children on attachment failure.
- Added missing bhnd_nexus bhnd_bus_deactivate_resource implementation.
- Drop a CHIPC_UNLOCK() accidentally left behind after lifting
synchronization into the chipc region refcounting API.
- Fix re-allocation of chipc resources. Previously, the resource ID was
reset to -1 on release, preventing later re-allocation.
Approved by: re (gjb), adrian (mentor)
Differential Revision: https://reviews.freebsd.org/D6849
supported, e.g. CPUID or MSR, return ENODEV from the ioctl which needs
that feature.
Sponsored by: The FreeBSD Foundation
MFC after: 1 week
Approved by: re (hrs)
threads, to make it less confusing and using modern kernel terms.
Rename the functions to reflect current use of the functions, instead
of the historic KSE conventions:
cpu_set_fork_handler -> cpu_fork_kthread_handler (for kthreads)
cpu_set_upcall -> cpu_copy_thread (for forks)
cpu_set_upcall_kse -> cpu_set_upcall (for new threads creation)
Reviewed by: jhb (previous version)
Sponsored by: The FreeBSD Foundation
MFC after: 1 week
Approved by: re (hrs)
Differential revision: https://reviews.freebsd.org/D6731
reason for it in modern times. In the other case, expand the comment
stating instead of doubting.
Reviewed by: jhb
Sponsored by: The FreeBSD Foundation
MFC after: 1 week
Approved by: re (hrs)
X-Differential revision: https://reviews.freebsd.org/D6731
There is no reason to return non-zero value from zfs_probe_partition()
as that causes following partitions to not be probed for ZFS vdevs.
A particular scenario that I encountered is a GPT partitioned disk
where several partitions have freebsd-zfs type. A partition with a lower
index is used as a cache (l2arc) vdev and in that case case zfs_probe()
returned a non-zero status. That status was returned to ptable_iterate()
and caused it to abort the iteration. Because of that the subsequent
partitions were not probed and a root pool was not discovered resulting
in a boot failure.
While there fix the style for nearby return statements.
Approved by: re (kib)
This is a follow-up to r300343.
This is important for the OBJS_DEPEND_GUESS usage in
gnu/usr.bin/cc/cc_tools.
See comments for more details.
Approved by: re (implicit)
Sponsored by: EMC / Isilon Storage Division
Inserting a full mbuf with an external cluster into the socket buffer
resulted in sbspace() returning -MLEN. However, since sb_hiwat is
unsigned, the -MLEN value was converted to unsigned in comparisons. As a
result, the socket buffer was never autosized. Note that sb_lowat is signed
to permit direct comparisons with sbspace(), but sb_hiwat is unsigned.
Follow suit with what tcp_output() does and compare the value of sbused()
with sb_hiwat instead.
Approved by: re (gjb)
Sponsored by: Chelsio Communications
This reduces the size of kaiocb slightly. I've also added some generic
fields that other backends can use in place of the BIO-specific fields.
Change the socket and Chelsio DDP backends to use 'backend3' instead of
abusing _aiocb_private.status directly. This confines the use of
_aiocb_private to the AIO internals in vfs_aio.c.
Reviewed by: kib (earlier version)
Approved by: re (gjb)
Sponsored by: Chelsio Communications
Differential Revision: https://reviews.freebsd.org/D6547
we set MNTK_UNMOUNT flag on the mp. Otherwise parallel unmount which
wins race with us could dereference the covered vnode, and we are
left with the locked freed memory.
Reported and tested by: pho
Sponsored by: The FreeBSD Foundation
Approved by: re (gjb)
MFC after: 1 week
Release the hold on ep->com immediately after sending the RST. This
fixes a bug that sometimes leaves userspace iWARP tools hung when the
user presses ^C.
Submitted by: Krishnamraju Eraparaju @ Chelsio
Approved by: re (gjb@)
Sponsored by: Chelsio Communications
initialisation. This ensures it will complete before signalling to the boot
CPU it has booted. This fixes a race with the GIC where the arm_gic_map may
not be populated before it is used to bind interrupts leading to some
interrupts becoming bound to no CPUs.
Approved by: re (kib)
Sponsored by: ABT Systems Ltd
The SIOCSIFALIFETIME_IN6 provided by the kame project is unused,
it can't really be used safely and has been completely removed from
NetBSD and OpenBSD.
Obtained from: NetBSD (kern/35897)
PR: 210148 (exp-run)
Reviewed by: ae, hrs
Relnotes: yes
Approved by: re (glebius)
Differential Revision: https://reviews.freebsd.org/D5491
The change is in arc_buf_l2_cdata_free().
Without this we can trip the assertion in arc_hdr_realloc()
if INVARIANTS option is enabled.
Approved by: re (kib)
MFC after: 1 week
are no longer natural-alignment strict, there are still some restrictions.
FreeBSD network code assumes data is naturally-aligned or is running
on a platform with no restrictions; pointers are not annotated to
indicate the data pointed to may be packed or unaligned. The clang
optimizer can sometimes combine the load or store of a pair of adjacent
32-bit values into a single doubleword load/store, and that operation
requires at least 4-byte alignment. __NO_STRICT_ALIGNMENT can lead
to tcp headers being only 2-byte aligned.
Note that alignment faults remain disabled on armv6, this change reverts
only the defining of the symbol which leads to some overly-agressive code
shortcuts when building common/shared drivers and network code for arm.
Approved by: re(kib)
console warnings when pread(2) and pwrite(2) are used with full
system-call auditing enabled. We audit the same file-descriptor data
for these calls as we do read(2) and write(2).
Approved by: re (kib)
MFC after: 3 days
Sponsored by: DARPA, AFRL
does not cover the dynamically registered ficititious ranges, and
fictitious pages mappings are not promoted. Offer a dummy struct
md_page to fetch constant superpage pv list generation to satisfy
logic. Also, by initializing the pv_dummy pv_list to empty, we can
remove several explicit PG_FICTITIOUS tests.
Reported and tested by: Michael Butler <imb@protected-networks.net>
(previous version)
Reviewed by: alc
Sponsored by: The FreeBSD Foundation
MFC after: 1 week
Differential revision: https://reviews.freebsd.org/D6728
Approved by: re (hrs)
Right now, all modifications of the list are locked by sw_alloc_mtx.
But initial lookup of the object by the handle in swap_pager_alloc()
is not protected by sw_alloc_mtx, which means that
vm_pager_object_lookup() could follow freed pointer.
Create a new named swap object with the OBJT_SWAP type, instead
of OBJT_DEFAULT. With this change, swp_pager_meta_build() never need
to upgrade named OBJT_DEFAULT to OBJT_SWAP (in the other place, we do
not forbid for client code to create named OBJT_DEFAULT objects at
all).
That change allows to remove sw_alloc_mtx and make the list locked by
sw_alloc_sx lock. Update swap_pager_copy() to new locking mode.
Create helper swap_pager_alloc_init() to consolidate named and
anonymous swap objects creation, while a caller ensures that the
neccesary locks are held around the helper.
Reviewed by: alc
Sponsored by: The FreeBSD Foundation
MFC after: 2 weeks
Approved by: re (hrs)
When allocating a new mbuf or bus_dmamap_load()-ing it fails,
we can just keep the old mbuf since we are dropping that packet anyway.
Instead of doing bus_dmamap_create() and bus_dmamap_destroy() all the time,
create an extra bus_dmamap_t which we can use to safely try
bus_dmamap_load()-ing the new mbuf. On success we just swap the spare
bus_dmamap_t with the data->map of that ring entry.
Tested:
Tested with Intel AC7260, verified with vmstat -m that new kernel no
longer visibly leaks memory from the M_DEVBUF malloc type.
Before, leakage was 1KB every few seconds while ping(8)-ing over the wlan
connection.
Submitted by: Imre Vadasz <imre@vdsz.com>
Approved by: re@
Obtained from: DragonflyBSD.git cc440b26818b5dfdd9af504d71c1b0e6522b53ef
Differential Revision: https://reviews.freebsd.org/D6742
For DWC_GMAC_ALT_DESC implementations, the multicast hash table has only
64 entries. Instead of 8 registers starting at 0x500, a pair of registers
at 0x08 and 0x0c are used instead.
Approved by: re (hrs)
Submitted by: Guy Yur <guyyur@gmail.com>
the loop in linprocfs_doswaps() is useless.
List of the registered filesystems is protected by vfsconf_sx,
not by the Giant. Adjust linprocfs_dofilesystems() correspondingly.
Approved by: re (delphij), des (linprocfs maintainer)
Sponsored by: The FreeBSD Foundation
MFC after: 1 week
default. At least initially, the feature to support multiple TCP stacks is
aimed at supporting advanced use cases and TCP development, but it is not
necessarily aimed at a wide audience. Therefore, there is no need to build
and install the extra TCP stacks by default. Instead, the people who are
using or developing this functionality can add the extra option to build/
install the extra TCP stacks.
However, we do want to build the extra TCP stacks as part of test builds
(e.g. LINT or tinderbox) to ensure that developers who are testing their
changes will know that their changes do not break the additional TCP
stack modules.
After this change, a user will need to add WITH_EXTRA_TCP_STACKS=1 to
make.conf or the kernel config in order to build the extra TCP modules.
Differential Revision: https://reviews.freebsd.org/D6795
Reviewed by: sjg
Approved by: re (kib)
Similar to r300836, cl and ct will always be non-NULL as they're allocated
using the mem_alloc routines, which always use `malloc(..., M_WAITOK)`.
Deobfuscating the cleanup path fixes a leak where if cl was NULL and
ct was not, ct would not be free'd, and also removes a duplicate test for
cl not being NULL.
Approved by: re (gjb)
Differential Revision: https://reviews.freebsd.org/D6801
MFC after: 1 week
Reported by: Coverity
CID: 1229999
Reviewed by: cem
Sponsored by: EMC / Isilon Storage Division
Some later code I'll commit pushes lists of frames into the EDMA TX
FIFO, rather than a single frame at a time. The CABQ code already
pushes frame lists, but it turns out we should actually be doing it
in general or performance tanks. :(
Since key table is cleared on every device shutdown,
static WEP keys (which are set only once) need to be
reinstalled manually every time when device starts running.
Tested with RTL8188EU, STA (all ciphers) / IBSS (WPA-none) modes.
The allproc_lock lock used in the sysctl_kern_corefile function is initialized
in the procinit function which is called after setting sysctl values at boot.
That means if we set kern.corefile at boot we will be trying to use
lock with is uninitialized and machine will crash.
If we define kern.corefile as tunable instead of using CTFLAG_RWTUN we will
not call the sysctl_kern_corefile function and we will not use an uninitialized
lock. When machine will boot then we will start using function depending on
the lock.
Reviewed by: pjd
by holding allprison_lock exclusively (even if only for a moment before
downgrading) on all paths that call PR_METHOD_REMOVE. Since they may run
on a downgraded lock, it's still possible for them to run concurrently
with PR_METHOD_GET, which will need to use the prison lock.
r298930 removed the inittodr call, but it seems like this prevents
"calcru: runtime went backwards ..." messages from occasionally appearing
when resuming from migration.
Reported by: Karl Pielorz <kpielorz@tdx.co.uk>
Sponsored by: Citrix Systems R&D
Do not try to pass such frames; a correct frame cannot be smaller than
(the corresponding) header size.
(for wpi(4) an additional check was added in r289012).
PR: 144987
ticks are signed int and if statistics is not updated for a long time
(more than INT_MAX ticks, but less than UINT_MAX) difference becomes
negative and less than hz for a long time.
Other option to repeat is simply load driver (which initializes
timestamps to 0) when ticks are negative.
Sponsored by: Solarflare Communications, Inc.
MFC after: 1 week
Differential Revision: https://reviews.freebsd.org/D6777
Remove 'if_inc_counter(ifp, IFCOUNTER_OPACKETS, 1);' from raw xmit
and apbridge path; it will be incremented by ieee80211_tx_complete()
after packet transmission.
Noticed by: Imre Vadasz <imre@vdsz.com>
This fixes a warning that occurs in a number of files that use the
random_harvest_queue function.
Differential Revision: https://reviews.freebsd.org/D4229
Submitted by: stevek@juniper.net
Reviewed by: markm
Approved by: so
where we assumed TERM_EMU was defined but didn't check. Fix these by also
including them under the ifdefs.
As HO is called from loader we need a null implementation so loader.efi
doesn't need to know which version of libefi it is building against.
Sponsored by: ABT Systems Ltd
Changes:
- Fixed incorrect MIPS74k vendor ID in the bhnd core descriptor tables
- Fixed MIPS core driver's matching against MIPS/MIPS33 cores.
- Improved MIPS3302 core description.
- Enabled BUS_PASS_BUS on the bhnd nexus drivers to allow early probing
of the MIPS core.
- Enabled BUS_PASS_CPU on the MIPS core driver to ensure correct attach
order.
- Disabled matching of the MIPS core driver on non-SoC devices.
Reviewed by: Michael Zhilin <mizhka@gmail.com>
Approved by: adrian (mentor)
Differential Revision: https://reviews.freebsd.org/D6735
This is in preparation for some other TDMA fixes which will hopefully
end with having working TDMA.
But, it does avoid lots of read/modify/writes in the txq setup path.
* Allow readyTime to just be programmed in directly
* The beacon interval and all of the beacon timing sysctl's are in TU,
not TSF. So, we were doing the wrong math on the CAB programming
in the first place.
when the process credentials were not changed. This can happen if an
error occured trying to activate the setuid binary. And on error, if
new credentials were not yet assigned, they must be freed to not
create the leak.
Use oldcred == NULL as the predicate to detect credential
reassignment.
Reported and tested by: pho
Sponsored by: The FreeBSD Foundation
In the case where cam_iosched_init() fails, the ada and da softcs were leaked.
Instead, free them.
Reported by: Coverity
CID: 1356039
Sponsored by: EMC / Isilon Storage Division
If strlen(hostp) was zero, the stack array 'nam' would never be initialized
before being strdup()ed. Fix this by initializing it to the empty string.
It's possible some external condition makes this case impossible, in which
case, an assertion instead of this workaround is appropriate.
Introduced in r299848.
Reported by: Coverity
CID: 1355336
Sponsored by: EMC / Isilon Storage Division
Due to an accidental mismatch between allocation and release in the slow path
of iflib_if_transmit, if a caller passed 9-16 mbufs to the routine, the mbuf
array would be leaked.
Fix the mismatch by removing the magic numbers in favor of nitems() on the
stack array. According to mmacy, this leak is unlikely.
Reported by: Coverity
Discussed with: mmacy
CID: 1356040
Sponsored by: EMC / Isilon Storage Division
the exact CPU we are running on to set the cpu functions. Relax the check
to ignore the CPU revision. Even so this may still be too specific.
Reviewed by: mmel
Sponsored by: ABT Systems Ltd
Differential Revision: https://reviews.freebsd.org/D6504
Support for compression has been available from July 2007 but it
was never imported due to concerns with patents once held by
STAC/HiFn. The issues have clearly been resolved so bring it
in now.
Special thanks to Brett Glass for preserving the code and
pointing documentation for the expiration case.
Obtained from: mav (through Brett Glass)
Relnotes: yes
MFC after: 2 weeks
Differential Revision: https://reviews.freebsd.org/D6739
Ext2/3/4 manages generation numbers differently than UFS so adopt
some rules that should work well. When allocating a new inode,
make sure we generate a "good" random value specifically avoiding
zero.
Don't interfere with the numbers that are already generated in
the filesystem: ext2fs doesn't have the backwards compatibility
issues where there were no generation numbers.
Reviewed by: kevlo
MFC after: 1 week
This patch adds the missing pieces needed for device setup using the
mlx5en driver inside a virtual machine which is providing hardware
access through SR-IOV.
Sponsored by: Mellanox Technologies
MFC after: 1 week
- Validate the scheduling class against the actual limit (which is chip
specific) instead of a magic number.
- Return an error if an attempt is made to manipulate the tx queues of a
VI that hasn't been initialized.
Sponsored by: Chelsio Communications
to NULL to avoid it being mis-treated on a possible re-attach but also
to get a clean NULL pointer derefence in case of errors due to
unexpected race conditions elsewhere in the code, e.g., callouts.
Obtained from: projects/vnet
MFC after: 2 weeks
Sponsored by: The FreeBSD Foundation
panic string again if set, in case it scrolled out of the active
window. This avoids having to remember the symbol name.
Also add a show callout <addr> command to DDB in order to inspect
some struct callout fields in case of panics in the callout code.
This may help to see if there was memory corruption or to further
ease debugging problems.
Obtained from: projects/vnet
MFC after: 2 weeks
Sponsored by: The FreeBSD Foundation
Reviewed by: jhb (comment only on the show panic initally)
Differential Revision: https://reviews.freebsd.org/D4527
The SYSINIT runs at SI_SUB_KICK_SCHEDULER after the scheduler is fully
initialized and timers are working. This fixes booting in the
EARLY_AP_STARTUP case.
A couple of mostly cosmetic fixes for the final initialization of netfront:
- Switch to "connected" state before starting to kick the rings.
- Correctly use "rxq" in the initialization loop (previously rxq was not
updated in the loop, and netfront would kick np->rxq[N] several times).
- Declare and define xn_connect as static, it's not used outside of this
file.
Reviewed by: Wei Liu <wei.liu2@citrix.com>
Sponsored by: Citrix Systems R&D
Differential revision: https://reviews.freebsd.org/D6657
due to called functions (as in other parts of the stack, leave a comment).
Put around a lock the removal of the ifa from the list however to
reduce the possible race with other places.
Obtained from: projects/vnet
MFC after: 2 weeks
Sponsored by: The FreeBSD Foundation
generally cleanup code might still acquire it thus try to be
consistent destroying locks late.
Obtained from: projects/vnet
MFC after: 2 weeks
Sponsored by: The FreeBSD Foundation
family as an argument as well.
This will be used to cleanup individual protocols during VNET teardown.
Obtained from: projects/vnet
Sponsored by: The FreeBSD Foundation
In mediatek etherswitch support, functions mtkswitch_reg_write32_mt7621
and mtkswitch_reg_read32_mt7621 are called without locks held, so
lock assertions fail. Remove the lock assertions.
Sponsored by: Smartcom - Bulgaria AD
which refers to IEEE 802.1p class of service and maps to the frame
priority level.
Values in order of priority are: 1 (Background (lowest)),
0 (Best effort (default)), 2 (Excellent effort),
3 (Critical applications), 4 (Video, < 100ms latency),
5 (Video, < 10ms latency), 6 (Internetwork control) and
7 (Network control (highest)).
Example of usage:
root# ifconfig em0.1 create
root# ifconfig em0.1 vlanpcp 3
Note:
The review D801 includes the pf(4) part, but as discussed with kristof,
we won't commit the pf(4) bits for now.
The credits of the original code is from rwatson.
Differential Revision: https://reviews.freebsd.org/D801
Reviewed by: gnn, adrian, loos
Discussed with: rwatson, glebius, kristof
Tested by: many including Matthew Grooms <mgrooms__shrew.net>
Obtained from: pfSense
Relnotes: Yes
This is done because one has no point to have more channels since they
will be unused.
Submitted by: Ivan Malov <Ivan.Malov at oktetlabs.ru>
Reviewed by: gnn
Sponsored by: Solarflare Communications, Inc.
Differential Revision: https://reviews.freebsd.org/D6720
This change is needed because 'opt_rss.h' is included by multiple source
files and RSS macro is defined as 1 within the file during build process
if option RSS is enabled in the kernel.
Submitted by: Ivan Malov <Ivan.Malov at oktetlabs.ru>
Reviewed by: gnn
Sponsored by: Solarflare Communications, Inc.
Differential Revision: https://reviews.freebsd.org/D6718
And fix message processing; only channel messages are supported.
MFC after: 1 week
Sponsored by: Microsoft OSTC
Differential Revision: https://reviews.freebsd.org/D6706
This is a NOP.
The COMPAT_IA32 was renamed in r205014 to COMPAT_FREEBSD32 and
COMPAT_ARCH32 does not seem to have existed. Also remove some
leftovers from the sysent rework in r301404. Include
freebsd32_util.h for the freebsd32_sysent prototype.
X-MFC-With: r301404
Reported by: kib
MFC after: 3 days
Sponsored by: EMC / Isilon Storage Division
Do not try to change attributes for DMAP when working on a mapping
which is not covered by the DMAP. This was reported on real system
where a BAR of a device (NTB) was mapped outside the PCI window.
Reported and tested by: mav
Reviewed by: jhb, mav
Sponsored by: The FreeBSD Foundation
MFC after: 1 week
Differential revision: https://reviews.freebsd.org/D6668
p_sched is unused.
The struct td_sched is always co-allocated with the struct thread,
except for the thread0. Avoid useless indirection, instead calculate
td_sched location using simple pointer arithmetic in td_get_sched(9).
For thread0, which is statically allocated, create a structure to
emulate layout of the dynamic allocation.
Reviewed by: jhb (previous version)
Sponsored by: The FreeBSD Foundation
Differential revision: https://reviews.freebsd.org/D6711
BUS_MAP_INTR() is used to get an interrupt mapping data according
to provided hints. The hints could be modified afterwards, but only
if mapping data was allocated. This method is intended to be called
before BUS_ALLOC_RESOURCE().
An interrupt mapping data describes an interrupt - hardware number,
type, configuration, cpu binding, and whatever is needed to setup it.
(2) Introduce a method which allows storing of an additional data
in struct resource to be available for bus drivers. This method is
convenient in two ways:
- there is no need to rework existing bus drivers as they can simply
be extended to provide an additional data,
- there is no need to modify any existing bus methods as struct
resource is already passed to them as argument and thus stored data
is simply accessible by other bus drivers.
For now, implement this method only for INTRNG.
This is motivated by needs of modern SOCs where hardware initialization
is not straightforward and resources descriptions are complex, opaque
for everyone but provider, and may vary from SOC to SOC. Typical
situation is that one bus driver can fetch a resource description for
its child device, but it's opaque for this driver. Another bus driver
knows a provider for this kind of resource and can pass this resource
description to it. In fact, something like device IVARS would be
perfect for that if implemented generally enough. Unfortunatelly, IVARS
are usable only by their owners now. Only owner knows its IVARS layout,
thus other bus drivers are not able to use them.
Differential Revision: https://reviews.freebsd.org/D6632
This trips me up whenever I'm fooling around with partially supported
NICs that fail to fully attach or initialise - the firmware gets loaded
and references, but something fails - and the firmware references
aren't cleaned up.
After perusing the PHY-LP code (don't ask why; honest) I discovered that
it /has/ 5GHz support - but it's not ever used. I found one NIC - a
BCM4312 w/ pci id 0x4315 - which advertised dual-band PHY-LP support.
Turns out it works.
Whilst here, move up the support bit logging code so I can use it
to debug this.
Tested:
* BCM4312 (pci id 0x4315); 5GHz STA operation
On Medford, licenses are required to enable RX and event cut through and to
disable RX batching. To avoid the need for the driver to make decisions based on
the licensing state, the MC_CMD_INIT_EVQ has been extended to allow us to leave
the decision to the firmware. If the adapter is licensed for low-latency use,
the firmware will choose the optimal settings for latency, otherwise it will use
the best settings for throughput.
For Huntington we still need to choose the settings ourselves.
Submitted by: Mark Spender <mspender at solarflare.com>
Sponsored by: Solarflare Communications, Inc.
MFC after: 1 week
Differential Revision: https://reviews.freebsd.org/D6717
This seems to make 5G work better.
It doesn't fix powersave handling though, that still sees the PHY get
stuck during initial calibration and everything goes pear shaped.
I'll look into that later.
Tested:
* QCAFN222 NIC, STA mode, 5GHz
Obtained from: Linux ath9k
Turns out I wasn't even initialising or programming a lot of stuff
for the AR9462 2.1 chip. Oops.
This mostly gets it working. powersave scan results in some pretty
hilarious NFcal hangs and I don't see beacons reliably.
There are still some xlna gain tables missing that ath9k has; I'll
follow up with some fixes and then see if the QCAFN222 NIC I have
tests this path.
Tested:
* QCAFN222 NIC, STA mode, 2GHz and 5GHz
the graphics drivers can benefit from access to the lid handle for querying and getting notifications
Submitted by: kmacy
Differential Revision: https://reviews.freebsd.org/D6643