This change tries to fix the most obvious locking problems.
sbp_cam_scan_lun() is never called with the sbp lock held, so the lock
needs to be acquired internally (if it's needed at all).
Without this change a kernel with INVARIANTS panics when a firewire disk
is connected:
panic: mutex sbp not owned at /usr/src/sys/dev/firewire/sbp.c:967
KDB: stack backtrace:
db_trace_self_wrapper() at 0xffffffff80420bbb = db_trace_self_wrapper+0x2b/frame 0xfffffe0504df0930
kdb_backtrace() at 0xffffffff80670359 = kdb_backtrace+0x39/frame 0xfffffe0504df09e0
vpanic() at 0xffffffff8063986c = vpanic+0x14c/frame 0xfffffe0504df0a20
panic() at 0xffffffff806395b3 = panic+0x43/frame 0xfffffe0504df0a80
__mtx_assert() at 0xffffffff8061c40d = __mtx_assert+0xed/frame 0xfffffe0504df0ac0
sbp_cam_scan_lun() at 0xffffffff80474667 = sbp_cam_scan_lun+0x37/frame 0xfffffe0504df0af0
xpt_done_process() at 0xffffffff802aacfa = xpt_done_process+0x2da/frame 0xfffffe0504df0b30
xpt_done_td() at 0xffffffff802ac2e5 = xpt_done_td+0xd5/frame 0xfffffe0504df0b80
fork_exit() at 0xffffffff805ff72f = fork_exit+0xdf/frame 0xfffffe0504df0bf0
fork_trampoline() at 0xffffffff8082483e = fork_trampoline+0xe/frame
0xfffffe0504df0bf0
--- trap 0, rip = 0, rsp = 0, rbp = 0 ---
Also, I tried to reduce the scope of the sbp lock to avoid holding it
while doing bus_dma allocations.
The code badly needs some re-engineering. SBP really should implement
a CAM transport, so that it avoids control flow inversion when re-scanning
the bus. Also, the sbp lock seems to be too coarse.
Additionally, the commit includes some changes not related to locking.
- sbp_cam_scan_lun: restore CAM_DEV_QFREEZE before re-queueing the ccb
because xpt_setup_ccb resets ccb_h.flags
- sbp_post_busreset: call xpt_release_simq only if it's actually frozen
- don't place private SIMQ_FREEZED flag (sic, "freezed") into sim->flags,
use sbp->flags for that
- some style fixes and control flow enhancements
Reviewed by: sbruno
MFC after: 2 weeks
Differential Revision: https://reviews.freebsd.org/D9898
SBP-2 specification defined maximum CDB length as 12 bytes. Newer SBP-3
specification allows CDB of any size, but this driver is too old. Proper
solution would be to look on maximal ORB size supported by the target.
MFC after: 1 week
The sim_vid, hba_vid, and dev_name fields of struct ccb_pathinq are
fixed-length strings. AFAICT the only place they're read is in
sbin/camcontrol/camcontrol.c, which assumes they'll be null-terminated.
However, the kernel doesn't null-terminate them. A bunch of copy-pasted code
uses strncpy to write them, and doesn't guarantee null-termination. For at
least 4 drivers (mpr, mps, ciss, and hyperv), the hba_vid field actually
overflows. You can see the result by doing "camcontrol negotiate da0 -v".
This change null-terminates those fields everywhere they're set in the
kernel. It also shortens a few strings to ensure they'll fit within the
16-character field.
PR: 215474
Reported by: Coverity
CID: 1009997 1010000 1010001 1010002 1010003 1010004 1010005
CID: 1331519 1010006 1215097 1010007 1288967 1010008 1306000
CID: 1211924 1010009 1010010 1010011 1010012 1010013 1010014
CID: 1147190 1010017 1010016 1010018 1216435 1010020 1010021
CID: 1010022 1009666 1018185 1010023 1010025 1010026 1010027
CID: 1010028 1010029 1010030 1010031 1010033 1018186 1018187
CID: 1010035 1010036 1010042 1010041 1010040 1010039
Reviewed by: imp, sephe, slm
MFC after: 4 weeks
Sponsored by: Spectra Logic Corp
Differential Revision: https://reviews.freebsd.org/D9037
Differential Revision: https://reviews.freebsd.org/D9038
Zero can be confused for a potentially valid value.
For example, if I load and unload sbp driver I get a lot of messages
like the following:
fw_tl_free: the xfer is not in the queue (tlabel=0, flag=0x0)
send: dst=0x00 tl=0x00 rt=0 tcode=0x0 pri=0x0 src=0x000
recv: dst=0x01 tl=0x21 rt=1 tcode=0x1 pri=0x0 src=0xffc0
KDB: stack backtrace:
db_trace_self_wrapper() at db_trace_self_wrapper+0x2b/frame 0xfffffe04464407e0
fw_tl_free() at fw_tl_free+0x18d/frame 0xfffffe0446440820
fw_xfer_unload() at fw_xfer_unload+0xca/frame 0xfffffe0446440840
fw_xferlist_remove() at fw_xferlist_remove+0x2f/frame 0xfffffe0446440870
sbp_detach() at sbp_detach+0x1e0/frame 0xfffffe04464408e0
device_detach() at device_detach+0x80/frame 0xfffffe0446440900
devclass_driver_deleted() at devclass_driver_deleted+0x6a/frame 0xfffffe0446440940
devclass_delete_driver() at devclass_delete_driver+0x7d/frame 0xfffffe0446440980
driver_module_handler() at driver_module_handler+0xff/frame 0xfffffe04464409d0
module_unload() at module_unload+0x32/frame 0xfffffe04464409f0
linker_file_unload() at linker_file_unload+0x24b/frame 0xfffffe0446440a40
kern_kldunload() at kern_kldunload+0xbc/frame 0xfffffe0446440a70
amd64_syscall() at amd64_syscall+0x314/frame 0xfffffe0446440bf0
Xfast_syscall() at Xfast_syscall+0xfb/frame 0xfffffe0446440bf0
MFC after: 2 weeks
Please see section 5.15 of 1394 OHCI Specification.
If the register is not implemented, then the physical response unit is
limited to the first 4GB of the physical memory.
In that case the non-cooperative debugging over firewire (using /dev/fwmem)
can not be expected to work if a target has more RAM than that.
The method is described in gdb.4 and the Developer's Handbook.
It seems that most of the consumer hardware does not implement
PhysicalUpperBound register.
MFC after: 1 week
driver is (or behaves identically to) /dev/mem. Remove the D_MEM flag
from random drivers.
Note that currently the D_MEM flag does not affect any behaviour, but
this going to change in the next commit.
Noted and reviewed by: alc
Sponsored by: The FreeBSD Foundation
MFC after: 1 week
X-Differential revision: https://reviews.freebsd.org/D6149
systems with more than 4GB of physical memory.
To remotely debug the system 'stealthy' which has a kernel
with this change installed and firewire properly configured:
% fwcontrol -m stealthy (or stealthy's firewire EUI64)
% kgdb kernel /dev/fwmem0.0
sys/dev/firewire/fwohci.c:
Rather than hard code the upper limit for hw based
automatic responses to remote DMA requests at 4GB,
program the hardware using Maxmem, the page number
one higher than the highest physical page detected
in the system.
While here, garbage collect more useless splfw()
calls.
Submitted by: gibbs
MFC after: 1 week
Sponsored by: Spectra Logic
MFSpectraBSD: 1110994 on 2015/01/06
asynchronous remote dma request (DMA request that the
hardware cannot automatically handle).
sys/dev/firewire/firewire.c
In fw_rcv(), add missing early return in the error
path for DMA requests to unregistered regions.
Submitted by: gibbs
MFC after: 1 week
Sponsored by: Spectra Logic
MFSpectraBSD: 1110993 on 2015/01/06
sys/boot/i386/libfirewire/firewire.c:
sys/dev/firewire/firewire.c:
Fix configuration ROM generation count wrapping logic
so that the generation count is never outside of
allowed limits (0x2 -> 0xF).
sys/dev/firewire/firewire.c:
In fw_xfer_unload(), xfer->fc may be NULL. Protect
against this before taking the fc lock.
Submitted by: gibbs
MFC after: 1 week
Sponsored by: Spectra Logic
MFSpectraBSD: 1110685 on 2015/01/05
sys/dev/firewire/firewire.c:
In fw_xfer_unload() expand lock coverage so that
the test for FWXF_INQ doesn't race with it being
cleared in another thread.
Submitted by: gibbs
MFC after: 1 week
Sponsored by: Spectra Logic
MFSpectraBSD: 1110207 on 2015/01/02
sys/dev/firewire/firewire.c:
In fw_xfer_unload(), clear the FWXF_INQ flag on the
xfer under protection of the FW_GMTX, after the
xfer is removeed from the tx/rx queue. Otherwise
it is possible for the xfer to be removed again
(corrupting the list or immediately panicing) from
another thread that has found this xfer in the
transaction label table.
Submitted by: gibbs
MFC after: 1 week
Sponsored by: Spectra Logic
MFSpectraBSD: 1110200 on 2015/01/02
Previously, any timeout value for which (timeout * hz) will overflow the
signed integer, will give weird results, since callout(9) routines will
convert negative values of ticks to '1'. For unsigned integer overflow we
will get sufficiently smaller timeout values than expected.
Switch from callout_reset, which requires conversion to int based ticks
to callout_reset_sbt to avoid this.
Also correct isci to correctly resolve ccb timeout.
This was based on the original work done by Eygene Ryabinkin
<rea@freebsd.org> back in 5 Aug 2011 which used a macro to help avoid
the overlow.
Differential Revision: https://reviews.freebsd.org/D1157
Reviewed by: mav, davide
MFC after: 1 month
Sponsored by: Multiplay
Do not pass wrong alignment value to fwdma_malloc_multiseg and ultimately
to contigalloc. In addition to being wrong, this causes insta-panic in
certain cases due to safety assertion - the alignment is required to be
the power of two and the value we calculate here seldom is.
MFC after: 1 month
Commit my version of style(9) pass over the firewire code. Now that
other people have started changing the code carrying this is as a
local patch is not longer a viable option.
MFC after: 1 month
These changes prevent sysctl(8) from returning proper output,
such as:
1) no output from sysctl(8)
2) erroneously returning ENOMEM with tools like truss(1)
or uname(1)
truss: can not get etype: Cannot allocate memory
there is an environment variable which shall initialize the SYSCTL
during early boot. This works for all SYSCTL types both statically and
dynamically created ones, except for the SYSCTL NODE type and SYSCTLs
which belong to VNETs. A new flag, CTLFLAG_NOFETCH, has been added to
be used in the case a tunable sysctl has a custom initialisation
function allowing the sysctl to still be marked as a tunable. The
kernel SYSCTL API is mostly the same, with a few exceptions for some
special operations like iterating childrens of a static/extern SYSCTL
node. This operation should probably be made into a factored out
common macro, hence some device drivers use this. The reason for
changing the SYSCTL API was the need for a SYSCTL parent OID pointer
and not only the SYSCTL parent OID list pointer in order to quickly
generate the sysctl path. The motivation behind this patch is to avoid
parameter loading cludges inside the OFED driver subsystem. Instead of
adding special code to the OFED driver subsystem to post-load tunables
into dynamically created sysctls, we generalize this in the kernel.
Other changes:
- Corrected a possibly incorrect sysctl name from "hw.cbb.intr_mask"
to "hw.pcic.intr_mask".
- Removed redundant TUNABLE statements throughout the kernel.
- Some minor code rewrites in connection to removing not needed
TUNABLE statements.
- Added a missing SYSCTL_DECL().
- Wrapped two very long lines.
- Avoid malloc()/free() inside sysctl string handling, in case it is
called to initialize a sysctl from a tunable, hence malloc()/free() is
not ready when sysctls from the sysctl dataset are registered.
- Bumped FreeBSD version to indicate SYSCTL API change.
MFC after: 2 weeks
Sponsored by: Mellanox Technologies
The sbp_cam_detach_target can be called from sbp_post_explore function
on the first target that is not really attached and it was written with
the corresponding safety check in place to tolerate that. Unfortunately
the recent locking cleanup did add a locking assertion that tries to
dereference the target->sbp pointer unconditionally, which causes less
than desirable outcome. Since the assertion is useful, just initialize
the target sbp pointer once when sbp device is being initialized instead
of when the target is being attached. This makes assertion work in all
cases and fixes the crash on boot.
- Switch from timeout() to callout_*() for per-request timers.
- Use device_find_child() in the identify routine.
- Use device_printf() instead of passing device_get_nameunit() to
printf().
- Expand the SBP_LOCK coverage simplifying the locking.
- Uninline STAILQ_FOREACH_SAFE().
Tested by: sbruno
Remove old bits of data concat for 'ascii' field.
Remove special SIOCGIFSTATUS handling from if.c (which Coverity yells at).
Reported by: Coverity
Coverity CID: 1147174
MFC after: 2 weeks
shifts into the sign bit. Instead use (1U << 31) which gets the
expected result.
This fix is not ideal as it assumes a 32 bit int, but does fix the issue
for most cases.
A similar change was made in OpenBSD.
Discussed with: -arch, rdivacky
Reviewed by: cperciva
mostly by adjustments to debugging printf() format specifiers. For high
numbered LUNs, also switch to printing them in hex as per SAM-5.
MFC after: 2 weeks
to this event, adding if_var.h to files that do need it. Also, include
all includes that now are included due to implicit pollution via if_var.h
Sponsored by: Netflix
Sponsored by: Nginx, Inc.
original, this hides the contents of cam_compat.h from ktrace/kdump/truss,
avoiding problems there. There are no user-servicable parts in there, so
no need for those tools to be groping around in there.
Approved by: re
- Remove the timeout_ch field. It's been deprecated since FreeBSD 7.0;
MPSAFE drivers should be managing their own timeout storage. The
remaining non-MPSAFE drivers have been modified to also manage their own
storage, and should be considered for updating to MPSAFE (or removal)
during the FreeBSD 10.x lifecycle.
- Add fields related to soft timeouts and quality of service, to be used
in upcoming work.
- Add room for more flags in the CCB header and path_inq structures.
- Begin support for extended 64-bit LUNs.
- Bump the CAM version number to 0x18, but add compat shims. Tested with
camcontrol and smartctl.
Reviewed by: nathanw, ken, kib
Approved by: re
Obtained from: Netflix
dev_ref() in the clone handlers that still use it.
- Don't set SI_CHEAPCLONE flag, it's not used anywhere neither in devfs
(for anything real)
Reviewed by: kib
command register. The lazy BAR allocation code in FreeBSD sometimes
disables this bit when it detects a range conflict, and will re-enable
it on demand when a driver allocates the BAR. Thus, the bit is no longer
a reliable indication of capability, and should not be checked. This
results in the elimination of a lot of code from drivers, and also gives
the opportunity to simplify a lot of drivers to use a helper API to set
the busmaster enable bit.
This changes fixes some recent reports of disk controllers and their
associated drives/enclosures disappearing during boot.
Submitted by: jhb
Reviewed by: jfv, marius, achadd, achim
MFC after: 1 day
Stop abusing xpt_periph in random plases that really have no periph related
to CCB, for example, bus scanning. NULL value is fine in such cases and it
is correctly logged in debug messages as "noperiph". If at some point we
need some real XPT periphs (alike to pmpX now), quite likely they will be
per-bus, and not a single global instance as xpt_periph now.
sys/dev/firewire/firewire.c:
- fw_xfer_unload(): Since we are about to free this xfer, call fw_tl_free()
to remove the xfer from its tlabel's list, if it has a tlabel.
- In every occasion when a xfer is removed from a tlabel's list, reset
xfer->tl to -1 while holding fc->tlabel_lock, so that the xfer isn't
mis-identified as belonging to a tlabel.
This doesn't fix all the use-after-free problems for M_FWMEM, but is an
incremental towards that goal.
Reviewed by: kan, sbruno
Sponsored by: Spectra Logic
DragonFlyBSD, so it certainly doesn't need splsoftvm(). Remove it.
# I doubt this driver will now compile on older FreeBSD versions or DFBSD
# We should consider unifdefing it since that code seems unmaintained.
every architecture's busdma_machdep.c. It is done by unifying the
bus_dmamap_load_buffer() routines so that they may be called from MI
code. The MD busdma is then given a chance to do any final processing
in the complete() callback.
The cam changes unify the bus_dmamap_load* handling in cam drivers.
The arm and mips implementations are updated to track virtual
addresses for sync(). Previously this was done in a type specific
way. Now it is done in a generic way by recording the list of
virtuals in the map.
Submitted by: jeff (sponsored by EMC/Isilon)
Reviewed by: kan (previous version), scottl,
mjacob (isp(4), no objections for target mode changes)
Discussed with: ian (arm changes)
Tested by: marius (sparc64), mips (jmallet), isci(4) on x86 (jharris),
amd64 (Fabian Keil <freebsd-listen@fabiankeil.de>)
to attach to target capable HBAs that implement the old immediate
notify (XPT_IMMED_NOTIFY) and notify acknowledge (XPT_NOTIFY_ACK)
CCBs. The new API has been in place since SVN change 196008 in
2009.
The solution is two-fold: fix CTL to handle the responses from the
HBAs, and convert the HBA drivers in question to use the new API.
These drivers have not been tested with CTL, so how well they will
interoperate with CTL is unknown.
scsi_target.c: Update the userland target example code to use the
new immediate notify API.
scsi_ctl.c: Detect when an immediate notify CCB is returned
with CAM_REQ_INVALID or CAM_PROVIDE_FAIL status,
and just free it.
Fix a duplicate assignment.
aic79xx.c,
aic79xx_osm.c: Update the aic79xx driver to use the new API.
Target mode is not enabled on for this driver, so
the changes will have no practical effect.
aic7xxx.c,
aic7xxx_osm.c: Update the aic7xxx driver to use the new API.
sbp_targ.c: Update the firewire target code to work with the
new API.
mpt_cam.c: Update the mpt(4) driver to work with the new API.
Target mode is only enabled for Fibre Channel
mpt(4) devices.
MFC after: 3 days