James Smart [Tue, 7 May 2019 00:26:47 +0000 (17:26 -0700)]
scsi: lpfc: resolve lockdep warnings
There were a number of erroneous comments and incorrect older lockdep
checks that were causing a number of warnings.
Resolve the following:
- Inconsistent lock state warnings in lpfc_nvme_info_show().
- Fixed comments and code on sequences where ring lock is now held instead
of hbalock.
- Reworked calling sequences around lpfc_sli_iocbq_lookup(). Rather than
locking prior to the routine and having the routine guess which lock is
held, take the lock within the routine. The lockdep check becomes unnecessary.
- Fixed comments and removed erroneous hbalock checks.
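The reworked lookup sequence can be illustrated with a small user-space
sketch. Everything in it is hypothetical (struct ring, ring_lookup(), and
the atomic_flag spinlock standing in for the ring lock); it is not the lpfc
code, only the pattern of taking the lock inside the lookup routine so that
callers need neither a lock nor a lockdep assertion.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

#define RING_SLOTS 8

/* Hypothetical stand-in for a ring that owns its lock. */
struct ring {
	atomic_flag lock;          /* plays the role of the ring lock */
	int *slots[RING_SLOTS];    /* outstanding requests, indexed by tag */
};

static void ring_lock(struct ring *r)
{
	while (atomic_flag_test_and_set_explicit(&r->lock,
						 memory_order_acquire))
		;	/* spin until the flag was previously clear */
}

static void ring_unlock(struct ring *r)
{
	atomic_flag_clear_explicit(&r->lock, memory_order_release);
}

/*
 * The lookup takes and releases the lock itself, so there is nothing for
 * the routine to "guess" about what the caller holds.
 */
static int *ring_lookup(struct ring *r, unsigned int tag)
{
	int *req = NULL;

	assert(r != NULL);
	ring_lock(r);
	if (tag < RING_SLOTS) {
		req = r->slots[tag];
		r->slots[tag] = NULL;	/* entry is consumed by the lookup */
	}
	ring_unlock(r);
	return req;
}
```

A caller simply invokes ring_lookup(&ring, tag) without holding any lock.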
scsi: qedi: remove set but not used variables 'cdev' and 'udev'
Fixes gcc '-Wunused-but-set-variable' warning:
drivers/scsi/qedi/qedi_iscsi.c: In function 'qedi_ep_connect':
drivers/scsi/qedi/qedi_iscsi.c:813:23: warning: variable 'udev' set but not used [-Wunused-but-set-variable]
drivers/scsi/qedi/qedi_iscsi.c:812:18: warning: variable 'cdev' set but not used [-Wunused-but-set-variable]
The buggy address belongs to the variable:
__func__.67584+0x0/0xffffffffffffd520 [qedi]
Memory state around the buggy address:
 ffffffffc12b0980: fa fa fa fa 00 04 fa fa fa fa fa fa 00 00 05 fa
 ffffffffc12b0a00: fa fa fa fa 00 00 04 fa fa fa fa fa 00 05 fa fa
> ffffffffc12b0a80: fa fa fa fa 00 06 fa fa fa fa fa fa 00 02 fa fa
^
 ffffffffc12b0b00: fa fa fa fa 00 00 04 fa fa fa fa fa 00 00 03 fa
 ffffffffc12b0b80: fa fa fa fa 00 00 02 fa fa fa fa fa 00 00 04 fa
Currently the qedi_dbg_* family of functions can overrun the end of the
source string if it is less than the destination buffer length because of
the use of a fixed sized memcpy. Remove the memset/memcpy calls to nfunc
and just use func instead as it is always a null terminated string.
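As a rough illustration of the overrun described above: a fixed-size memcpy
reads the same number of bytes no matter how short the source string is,
whereas passing the string itself to a "%s" conversion stops at the
terminator. The helper name dbg_format() and its signature are hypothetical
and are not part of the driver.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * The problematic pattern copies a fixed number of bytes regardless of the
 * length of the source:
 *
 *     char nfunc[32];
 *     memset(nfunc, 0, sizeof(nfunc));
 *     memcpy(nfunc, func, sizeof(nfunc) - 1);   (always reads 31 bytes)
 *
 * If func is "f" (2 bytes including the terminator), 29 of the bytes read
 * lie beyond the end of the string.
 */

/* Hypothetical replacement: use func directly, it is NUL-terminated. */
static int dbg_format(char *out, size_t out_len, const char *func, int line)
{
	assert(out != NULL && func != NULL);
	return snprintf(out, out_len, "[%s:%d]", func, line);
}
```

With a 64-byte buffer, dbg_format(buf, sizeof(buf), "f", 7) writes "[f:7]"
and never touches memory past the one-character function name.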
Reported-by: Hulk Robot <[email protected]>
Fixes: ace7f46ba5fd ("scsi: qedi: Add QLogic FastLinQ offload iSCSI driver framework.")
Signed-off-by: YueHaibing <[email protected]>
Reviewed-by: Dan Carpenter <[email protected]>
Signed-off-by: Martin K. Petersen <[email protected]>
Quinn Tran [Mon, 6 May 2019 20:52:19 +0000 (13:52 -0700)]
scsi: qla2xxx: Add cleanup for PCI EEH recovery
During EEH error recovery testing it was discovered that the driver's
reset() callback partially frees resources used by the driver, leaving some
stale memory. After reset() is done, the driver's resume() callback uses
this old data, which results in an error that leaves the adapter disabled
due to a PCIe error.
This patch cleans up the EEH recovery code path and prevents the adapter
from getting disabled.
scsi: qla2xxx: Avoid that lockdep complains about unsafe locking in tcm_qla2xxx_close_session()
This patch avoids that lockdep reports the following warning:
=====================================================
WARNING: HARDIRQ-safe -> HARDIRQ-unsafe lock order detected
5.1.0-rc1-dbg+ #11 Tainted: G W
-----------------------------------------------------
rmdir/1478 [HC0[0]:SC0[0]:HE0:SE1] is trying to acquire:
00000000e7ac4607 (&(&k->k_lock)->rlock){+.+.}, at: klist_next+0x43/0x1d0
and this task is already holding:
00000000cf0baf5e (&(&ha->tgt.sess_lock)->rlock){-...}, at: tcm_qla2xxx_close_session+0x57/0xb0 [tcm_qla2xxx]
which would create a new lock dependency:
(&(&ha->tgt.sess_lock)->rlock){-...} -> (&(&k->k_lock)->rlock){+.+.}
but this new dependency connects a HARDIRQ-irq-safe lock:
(&(&ha->tgt.sess_lock)->rlock){-...}
scsi: qla2xxx: Avoid that qlt_send_resp_ctio() corrupts memory
The "(&ctio->u.status1.sense_data)[i]" where i >= 0 expressions in
qlt_send_resp_ctio() are probably typos and should have been
"(&ctio->u.status1.sense_data[4 * i])" instead. Instead of only fixing
these typos, modify the code for storing sense data such that it becomes
easy to read. This patch fixes a Coverity complaint about accessing an
array outside its bounds.
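The difference between the two expressions comes down to pointer
arithmetic: indexing a pointer to an array steps by the size of the whole
array, while indexing the array itself steps by one element. The sketch
below measures both offsets. The 24-byte buffer size and the names frame,
suspect_offset() and intended_offset() are assumptions made for
illustration, and the backing storage is deliberately larger than one
buffer so that the suspect form stays in bounds here.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SENSE_LEN 24	/* assumed buffer size, for illustration only */

/* Four buffers back to back, so index 0..3 of either form is in bounds. */
static uint8_t frame[4][SENSE_LEN];

/* Byte offset reached by the suspect form "(&sense_data)[i]". */
static ptrdiff_t suspect_offset(int i)
{
	uint8_t (*sense)[SENSE_LEN] = &frame[0];	/* pointer to array */

	assert(i >= 0 && i < 4);
	/* sense[i] steps over i whole arrays, i.e. SENSE_LEN * i bytes. */
	return (uint8_t *)sense[i] - (uint8_t *)frame;
}

/* Byte offset reached by the intended form "&sense_data[4 * i]". */
static ptrdiff_t intended_offset(int i)
{
	uint8_t *sense = frame[0];

	assert(i >= 0 && 4 * i < SENSE_LEN);
	return &sense[4 * i] - sense;
}
```

For i = 1 the suspect form lands 24 bytes in, already past the end of a
single 24-byte buffer, while the intended form lands 4 bytes in.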
Since fc_remote_port_delete() must be called with interrupts enabled, do
not disable interrupts when calling that function. Remove the locking calls
from around the put_sess() call. This is safe because the function that is
called when the final reference is dropped, qlt_unreg_sess(), grabs the
proper locks. This patch avoids that lockdep reports the following:
WARNING: HARDIRQ-safe -> HARDIRQ-unsafe lock order detected
kworker/2:1/62 [HC0[0]:SC0[0]:HE0:SE1] is trying to acquire:
0000000009e679b3 (&(&k->k_lock)->rlock){+.+.}, at: klist_next+0x43/0x1d0
and this task is already holding:
00000000a033b71c (&(&ha->tgt.sess_lock)->rlock){-...}, at: qla24xx_delete_sess_fn+0x55/0xf0 [qla2xxx_scst]
which would create a new lock dependency:
(&(&ha->tgt.sess_lock)->rlock){-...} -> (&(&k->k_lock)->rlock){+.+.}
but this new dependency connects a HARDIRQ-irq-safe lock:
(&(&ha->tgt.sess_lock)->rlock){-...}
scsi: qla2xxx: Introduce the dsd32 and dsd64 data structures
Introduce two structures for the (DMA address, length) combination instead
of using separate structure members for the DMA address and length. This
patch fixes several Coverity complaints about 'cur_dsd' being used to write
outside the bounds of structure members.
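A minimal sketch of what such (DMA address, length) pairs could look like
follows. The field names, the packed layout of the 64-bit variant, and the
helper append_dsd64() are assumptions for illustration; the driver's
little-endian types are replaced by plain fixed-width integers, and
__attribute__((packed)) is a GCC/Clang extension.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed layout: 32-bit DMA address followed by a 32-bit length. */
struct dsd32 {
	uint32_t address;
	uint32_t length;
};

/* Assumed layout: 64-bit DMA address followed by a 32-bit length. */
struct dsd64 {
	uint64_t address;
	uint32_t length;
} __attribute__((packed));

/*
 * Fill one descriptor and return a pointer to the next one. Because the
 * cursor has the type of the whole descriptor, advancing it cannot be
 * mistaken for writing past a single address or length member.
 */
static struct dsd64 *append_dsd64(struct dsd64 *dsd, uint64_t addr,
				  uint32_t len)
{
	assert(dsd != NULL);
	dsd->address = addr;
	dsd->length = len;
	return dsd + 1;
}
```

Walking a scatterlist then becomes a loop of cur = append_dsd64(cur, addr,
len) calls over an array of descriptors.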
scsi: qla2xxx: Fix race conditions in the code for aborting SCSI commands
In the *_done() functions, instead of returning early if sp->ref_count >=
2, only decrement sp->ref_count. In qla2xxx_eh_abort(), instead of deciding
what to do based on the value of sp->ref_count, decide which action to take
depending on the completion status of the firmware abort. Remove srb.cwaitq
and use srb.comp instead. In qla2x00_abort_srb(), call
isp_ops->abort_command() directly instead of calling qla2xxx_eh_abort().
scsi: qla2xxx: Make qla24xx_async_abort_cmd() static
Since qla24xx_async_abort_cmd() is only called from inside qla_init.c,
declare that function static. Reorder a few functions to avoid that any
forward declarations are needed.
scsi: qla2xxx: Remove unnecessary locking from the target code
All callbacks from the target core into the qla2xxx driver and also all I/O
completion functions are serialized per command. Since .cmd_sent_to_fw and
.trc_flags are only modified from inside these functions it is not
necessary to protect them with locking. Remove the superfluous locking.
Since the previous patch removed the only statement that sets
qla_tgt_cmd.released, remove the code that depends on that member variable
being set and the member variable itself.
scsi: qla2xxx: Complain if a command is released that is owned by the firmware
The previous patch guarantees that a command is only released after the
firmware has finished processing it. Hence complain if a command is
released that is owned by the firmware.
scsi: qla2xxx: target: Fix offline port handling and host reset handling
Remove the function qlt_abort_cmd_on_host_reset() because it can do the
following, all of which can cause a kernel crash:
- DMA unmapping while DMA is in progress.
- Call target_execute_cmd() while DMA is in progress.
- Call transport_generic_free_cmd() while the LIO core owns a command.
Instead of trying to abort a command asynchronously, set the 'aborted' flag
and handle the abort after the hardware has passed control back to the
tcm_qla2xxx driver.
scsi: qla2xxx: Fix abort handling in tcm_qla2xxx_write_pending()
Implementations of the .write_pending() callback functions must guarantee
that an appropriate LIO core callback function will be called immediately or
at a later time. Make sure that this guarantee is met for aborted SCSI
commands.
[mkp: typo]
Cc: Himanshu Madhani <[email protected]>
Cc: Giridhar Malavali <[email protected]>
Fixes: 694833ee00c4 ("scsi: tcm_qla2xxx: Do not allow aborted cmd to advance.") # v4.13.
Fixes: a07100e00ac4 ("qla2xxx: Fix TMR ABORT interaction issue between qla2xxx and TCM") # v4.5.
Signed-off-by: Bart Van Assche <[email protected]>
Acked-by: Himanshu Madhani <[email protected]>
Signed-off-by: Martin K. Petersen <[email protected]>
scsi: qla2xxx: Fix error handling in qlt_alloc_qfull_cmd()
The test "if (!cmd)" is not useful because it is guaranteed that cmd !=
NULL. Instead of testing the cmd pointer, rely on the tag to decide
whether or not command allocation failed.
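Why the pointer test is useless can be seen in a small sketch: when a
command is obtained by indexing a preallocated array with a tag, the
resulting address is never NULL, so only the tag can signal failure. The
pool, tag_get() and cmd_alloc() below are hypothetical stand-ins, not the
driver's allocator.

```c
#include <assert.h>
#include <stddef.h>

#define POOL_SIZE 4

struct cmd {
	int in_use;
};

static struct cmd pool[POOL_SIZE];

/* Hypothetical tag allocator: a free index, or -1 when exhausted. */
static int tag_get(void)
{
	for (int i = 0; i < POOL_SIZE; i++) {
		if (!pool[i].in_use) {
			pool[i].in_use = 1;
			return i;
		}
	}
	return -1;
}

static struct cmd *cmd_alloc(void)
{
	int tag = tag_get();

	/*
	 * Test the tag, not the pointer: &pool[tag] is an address inside
	 * the pool and can never compare equal to NULL.
	 */
	if (tag < 0)
		return NULL;
	return &pool[tag];
}
```

Once all POOL_SIZE entries are handed out, cmd_alloc() reports failure
through the negative tag.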
All qlt_send_term_imm_notif() callers pass '1' as second argument to this
function. Hence remove the (broken) code that depends on that second
argument having another value. Add a pr_debug() statement that prints rc to
avoid that the compiler would complain that rc has been set but is not
used.
scsi: qla2xxx: Make qla2x00_mem_free() easier to verify
Instead of clearing all freed pointers at the end of qla2x00_mem_free(),
clear freed pointers immediately after having freed the memory these
pointers point at.
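The pattern reads roughly as follows. The structure and member names are
hypothetical, and free() stands in for whichever deallocation routine
matches each buffer.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical container for two dynamically allocated buffers. */
struct hba_mem {
	void *ring;
	void *table;
};

static void hba_mem_free(struct hba_mem *m)
{
	assert(m != NULL);

	free(m->ring);
	m->ring = NULL;		/* cleared right next to the free it pairs with */

	free(m->table);
	m->table = NULL;
}
```

With each pointer cleared next to its free, a reviewer can check every pair
locally, and calling the function a second time is harmless because
free(NULL) is a no-op.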
scsi: qla2xxx: Avoid that Coverity complains about dereferencing a NULL rport pointer
Since Coverity cannot know that rport != NULL in qla2xxx_queuecommand() and
since there is code in that function that dereferences the rport pointer,
modify qla2xxx_queuecommand() such that it fails SCSI commands if rport ==
NULL.
scsi: qla2xxx: Move qla2x00_is_reserved_id() from qla_inline.h into qla_init.c
The previous patch moved all qla2x00_is_reserved_id() callers into
qla_init.c. Hence also move the qla2x00_is_reserved_id() definition into
qla_init.c.
Since all qla2x00_find_new_loop_id() calls occur in the same source file as
the definition of this function, move that function to just before its
first caller and declare it static. Convert the header above this function
into kernel-doc format.
Change one occurrence of "*(" into "()" and change one occurrence of
"lcoate" into "locate". Fix the reference to qla_tgt_handle_cmd_for_atio():
there has never been a function with that name.
Cc: Himanshu Madhani <[email protected]>
Cc: Giridhar Malavali <[email protected]>
Fixes: 75f8c1f693ee ("[SCSI] tcm_qla2xxx: Add >= 24xx series fabric module for target-core") # v3.5.
Fixes: 2d70c103fd2a ("[SCSI] qla2xxx: Add LLD target-mode infrastructure for >= 24xx series") # v3.5.
Signed-off-by: Bart Van Assche <[email protected]>
Acked-by: Himanshu Madhani <[email protected]>
Signed-off-by: Martin K. Petersen <[email protected]>
Pedro Sousa [Thu, 18 Apr 2019 19:13:34 +0000 (21:13 +0200)]
scsi: ufs: Fix RX_TERMINATION_FORCE_ENABLE define value
Fix RX_TERMINATION_FORCE_ENABLE define value from 0x0089 to 0x00A9
according to MIPI Alliance MPHY specification.
Fixes: e785060ea3a1 ("ufs: definitions for phy interface")
Signed-off-by: Pedro Sousa <[email protected]>
Signed-off-by: Martin K. Petersen <[email protected]>
drivers/scsi/qedf/qedf_els.c: In function 'qedf_process_els_compl':
drivers/scsi/qedf/qedf_els.c:149:20: warning: variable 'sc_cmd' set but not used [-Wunused-but-set-variable]
drivers/scsi/qedf/qedf_els.c:148:28: warning: variable 'task_ctx' set but not used [-Wunused-but-set-variable]
drivers/scsi/qedf/qedf_els.c: In function 'qedf_send_srr':
drivers/scsi/qedf/qedf_els.c:612:6: warning: variable 'sid' set but not used [-Wunused-but-set-variable]
scsi: qedi: Adjust termination and offload ramrod timers
Whenever an offload ramrod is issued, the firmware wants the driver to wait
for at most 5 secs; if it has not completed by then, the driver can initiate
further corrective action. Similarly, when a termination ramrod is issued,
irrespective of abortive or non-abortive termination, the driver should wait
for 60 sec * max TCP-RT timeout.
scsi: qedi: Abort ep termination if offload not scheduled
Sometimes during connection recovery, when ARP resolution fails and the
offload connection was not issued, the driver tries to flush pending offload
connection work that was never queued up.
scsi: qla2xxx: Fix device staying in blocked state
This patch fixes an issue reported by some customers: after a cable pull,
the devices disappear and the path seems to remain in the blocked state.
Once the device reappears, the driver does not seem to update the path to
online. The issue appears because the defer flag creates a race condition
when the same session reappears. Fix it by notifying SCSI-ML that the
device is lost when qlt_free_session_done() is called from
qlt_unreg_sess().
Colin Ian King [Fri, 12 Apr 2019 09:48:29 +0000 (10:48 +0100)]
scsi: qedf: remove memset/memcpy to nfunc and use func instead
Currently the qedf_dbg_* family of functions can overrun the end of the
source string if it is less than the destination buffer length because of
the use of a fixed sized memcpy. Remove the memset/memcpy calls to nfunc
and just use func instead as it is always a null terminated string.
Addresses-Coverity: ("Out-of-bounds access")
Fixes: 61d8658b4a43 ("scsi: qedf: Add QLogic FastLinQ offload FCoE driver framework.")
Signed-off-by: Colin Ian King <[email protected]>
Acked-by: Saurav Kashyap <[email protected]>
Signed-off-by: Martin K. Petersen <[email protected]>
Li Zhong [Mon, 15 Apr 2019 05:20:31 +0000 (13:20 +0800)]
scsi: core: map PQ=1, PDT=other values to SCSI_SCAN_TARGET_PRESENT
commit 84961f28e9d1 ("[SCSI] Don't add scsi_device for devices that return
PQ=1, PDT=0x1f") returns SCSI_SCAN_TARGET_PRESENT if inquiry returns PQ=1,
and PDT = 0x1f. However, according to the SCSI spec, setting PQ=1 and PDT
to the type the device is capable of supporting can also mean that the
device is not connected. E.g. we see an IBM/2145 return PQ=1 and PDT=0 for
a non-mapped LUN (details attached at the end).
This patch changes the check condition a bit, so that the check doesn't
require PDT to be 0x1f when PQ=1.
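The two fields live in byte 0 of the standard INQUIRY data: the peripheral
qualifier (PQ) in bits 7..5 and the peripheral device type (PDT) in bits
4..0. The sketch below contrasts the old and new conditions as described in
this message. The function names are hypothetical, and any additional
conditions in the real scan code are left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Byte 0 of standard INQUIRY data: PQ in bits 7..5, PDT in bits 4..0. */
static unsigned int inquiry_pq(uint8_t byte0)
{
	return byte0 >> 5;
}

static unsigned int inquiry_pdt(uint8_t byte0)
{
	return byte0 & 0x1f;
}

/* Old condition as described: PQ=1 and PDT=0x1f. */
static bool lun_absent_old(uint8_t byte0)
{
	return inquiry_pq(byte0) == 1 && inquiry_pdt(byte0) == 0x1f;
}

/* New condition as described: PQ=1, whatever the PDT. */
static bool lun_absent_new(uint8_t byte0)
{
	return inquiry_pq(byte0) == 1;
}
```

A byte 0 value of 0x20 (PQ=1, PDT=0), as in the IBM/2145 case, is treated
as "target present, no LUN" only by the new condition.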
Stanley Chu [Mon, 15 Apr 2019 12:23:38 +0000 (20:23 +0800)]
scsi: ufs: Print real incorrect request response code
If a UFS device responds with an unknown request response code, we cannot
tell what it was from the logs, because the code is replaced by
"DID_ERROR << 16" before the log is printed.
Fix this to provide precise request response code information for easier
issue breakdown.
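The ordering matters more than the details: the device's code has to be
logged before it is overwritten. A hedged sketch, with a hypothetical
helper and log text, and a local constant standing in for the SCSI
midlayer's DID_ERROR host byte:

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define DID_ERROR_SKETCH 0x07	/* stand-in for DID_ERROR */

/*
 * Hypothetical handler for an unrecognised response code: record the value
 * the device actually sent, then return the generic error result.
 */
static int handle_unknown_response(int rsp_code, char *log, size_t log_len)
{
	assert(log != NULL);

	/* Log first, while the original code is still available. */
	snprintf(log, log_len, "unexpected request response code = 0x%x",
		 rsp_code);

	/* Only now replace it with the generic host error. */
	return DID_ERROR_SKETCH << 16;
}
```

The returned result is the same as before; only the log now carries the
code the device sent.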
To support vlan and bridge devices first find route using ifindex 0, if
route is not found through net device associated with input scsi host then
find route using ifindex of net device.
Ming Lei [Fri, 12 Apr 2019 03:30:32 +0000 (11:30 +0800)]
scsi: core: don't hold device refcount in IO path
scsi_device's refcount is always grabbed in IO path.
Turns out it isn't necessary, because blk_cleanup_queue() will drain any
in-flight IOs, then cancel timeout/requeue work, and SCSI's requeue_work is
canceled too in __scsi_remove_device().
Also scsi_device won't go away until blk_cleanup_queue() is done.
So don't hold the refcount in IO path; the refcount hasn't been required
there since blk_queue_enter() / blk_queue_exit() were introduced in the
legacy block layer.
scsi: qla2xxx: Fix read offset in qla24xx_load_risc_flash()
This patch fixes a regression introduced by commit f8f97b0c5b7f ("scsi:
qla2xxx: Cleanups for NVRAM/Flash read/write path"), where the flash
read/write routine cleanup left out code, which resulted in a checksum
failure leading to a use-after-free during driver load.
The following stack trace is seen in the log file:
qla2xxx [0000:00:00.0]-0005: : QLogic Fibre Channel HBA Driver: 10.01.00.16-k.
qla2xxx [0000:00:0b.0]-001d: : Found an ISP2532 irq 11 iobase 0x0000000000f47f03.
qla2xxx [0000:00:0b.0]-00cd:8: ISP Firmware failed checksum.
qla2xxx [0000:00:0b.0]-00cf:8: Setup chip ****FAILED****.
qla2xxx [0000:00:0b.0]-00d6:8: Failed to initialize adapter - Adapter flags 2.
==================================================================
BUG: KASAN: use-after-free in __list_del_entry_valid+0x15/0xd0
Read of size 8 at addr ffff8880ca05a490 by task modprobe/857
Memory state around the buggy address:
 ffff8880ca05a380: fb fb fb fb fb fb fb fb fc fc fc fc fc fc fc fc
 ffff8880ca05a400: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
>ffff8880ca05a480: fb fb fb fb fb fb fb fb fc fc fc fc fc fc fc fc
^
 ffff8880ca05a500: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
 ffff8880ca05a580: 00 00 00 00 00 00 00 00 fc fc fc fc fc fc fc fc
==================================================================
Fixes: f8f97b0c5b7f ("scsi: qla2xxx: Cleanups for NVRAM/Flash read/write path")
Reported-by: Bart Van Assche <[email protected]>
Tested-by: Bart Van Assche <[email protected]>
Signed-off-by: Himanshu Madhani <[email protected]>
Signed-off-by: Martin K. Petersen <[email protected]>
scsi: qla2xxx: Remove qla_tgt_cmd.data_work and qla_tgt_cmd.data_work_free
The 'data_work' and 'data_work_free' member variables are set but never
used. Hence remove both member variables. See also commit 6bcbb3174caa
("qla2xxx: Fix incorrect tcm_qla2xxx_free_cmd use during TMR ABORT (v2)").
scsi: qla2xxx: Move the <linux/io-64-nonatomic-lo-hi.h> include directive
The <linux/io-64-nonatomic-lo-hi.h> header file is included because of the
readq() macro. Since that macro is only used in qla_nx.c, move that include
statement into qla_nx.c.
Improve source code readability by inserting spaces where these are
required according to the coding standard. This patch only inserts
whitespace and does not make any other changes.
Most but not all code in the qla2xxx driver uses tabs for indentation.
Make the qla2xxx code easier to read by using tabs consistently for
indentation. This patch improves conformance with the Linux kernel coding
style.
John Garry [Fri, 12 Apr 2019 08:57:55 +0000 (16:57 +0800)]
scsi: libsas: Inject revalidate event for root port event
According to the SAS spec, an expander device shall transmit BROADCAST
(CHANGE) from at least one phy in each expander port other than the
expander port that is the cause for transmitting BROADCAST (CHANGE).
As such, when the link is lost for a root PHY attached to an expander PHY,
we get no broadcast event.
This causes an issue for libsas, in that we will not revalidate the domain
for these events.
As a solution, when a root PHY is formed or deformed from a root port,
insert a broadcast event to trigger a domain revalidation.
John Garry [Fri, 12 Apr 2019 08:57:53 +0000 (16:57 +0800)]
scsi: libsas: Try to retain programmed min linkrate for SATA min pathway unmatch fixing
Currently for fixing the linkrate matching during discovery such that the
linkrate of a SATA PHY does not exceed min pathway to initiator, we set the
SATA PHY programmed min linkrate to the same value as the programmed max
linkrate.
This is unnecessary, and we should be able to keep the same programmed min
linkrate if it is already lower than this new max programmed linkrate.
This patch makes that change.
In effect, this will not make much difference since we generally will
negotiate a linkrate at the programmed max linkrate, and the programmed min
linkrate will have no impact.
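In other words, the programmed max linkrate is capped to the min pathway
rate, and the programmed min linkrate is lowered only when it would
otherwise exceed the new max. A sketch of that rule, with a hypothetical
enum, structure and helper (the enum values are for illustration and only
their ordering matters):

```c
#include <assert.h>

/* Illustrative linkrate codes; higher value means faster link. */
enum linkrate {
	RATE_1_5_GBPS = 8,
	RATE_3_0_GBPS,
	RATE_6_0_GBPS,
	RATE_12_0_GBPS,
};

struct phy_rates {
	enum linkrate min;	/* programmed min linkrate */
	enum linkrate max;	/* programmed max linkrate */
};

/*
 * Apply a new programmed max (the min pathway rate) while retaining the
 * programmed min whenever it is already at or below that new max.
 */
static struct phy_rates fix_min_pathway(struct phy_rates cur,
					enum linkrate new_max)
{
	struct phy_rates out;

	assert(cur.min <= cur.max);
	out.max = new_max;
	out.min = cur.min < new_max ? cur.min : new_max;
	return out;
}
```

A PHY programmed for 1.5G..12G with a 6G pathway ends up as 1.5G..6G rather
than 6G..6G.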
Luo Jiaxing [Thu, 11 Apr 2019 12:46:42 +0000 (20:46 +0800)]
scsi: hisi_sas: Don't hard reset disk during controller reset
In hisi_sas_init_device(), we added ops->hardreset() to clear the
affiliation of an STP target port or handle the [STP pending] state.
Function hisi_sas_init_device() will be called when a device is found or
during controller reset. At controller reset, we call
hisi_sas_init_device() to re-init the disks, so calling hardreset() is
unnecessary and it also will cause some delay at controller reset.
Xiaofei Tan [Thu, 11 Apr 2019 12:46:41 +0000 (20:46 +0800)]
scsi: hisi_sas: Support all RAS events with MSI interrupts
This patch switches all HW error handling from PCI AER to MSI interrupts
due to the non-standard PCI implementation. All HW errors which were being
reported through PCI AER can also be reported through MSI interrupts.
Do two things to complete the switch:
1. Notify FW to switch to MSI handling through ACPI DSM.
2. Add MSI handling for some hw errors, ECC errors and poison errors (we
also call some of them AXI reuser error). They were handled only through
PCI AER before.
For old FW reporting PCI AER events, the PCI AER handler will see that the
driver no longer supports AER, and will leave the device in the offlined
state, which is safe.
scsi: hisi_sas: allocate different SAS address for directly attached situation
In commit 8b8d66531555 ("scsi: hisi_sas: make SAS address of SATA disks
unique"), we ensured that each SATA disk in the system has a unique SAS
address, even if it is fake. That was for v2 hw.
John Garry [Thu, 11 Apr 2019 12:46:38 +0000 (20:46 +0800)]
scsi: hisi_sas: Fix for setting the PHY linkrate when disconnected
In commit efdcad62e7b8 ("scsi: hisi_sas: Set PHY linkrate when
disconnected"), we use the sas_phy_data.enable flag to track whether the
PHY was enabled or not, so that we know if we should set the PHY negotiated
linkrate at SAS_LINK_RATE_UNKNOWN or SAS_PHY_DISABLED.
However, it is not proper to use sas_phy_data.enable, since it is only set
when libsas attempts to set the PHY disabled/enabled; hence, it may not
even have an initial value.
As a solution to this problem, introduce hisi_sas_phy.enable to track
whether the PHY is enabled or not, so that we can set the negotiated
linkrate properly when the PHY comes down.
scsi: hisi_sas: Remedy inconsistent PHY down state in software
Currently there are two scenarios which may cause the PHY state held in
software to be inconsistent with the PHY state of the hardware (which is 0):
- The SAS wire is unplugged before get_phys_state during a SAS controller
reset. The phy down interrupts are then ignored; the phy state is 0
before the reset and is also 0 after the reset, so phy down is not
processed even though the SAS wire was unplugged;
- Later versions of v3 hw close the bus when a 2-bit ECC error occurs. If
the SAS wire is unplugged at that time, the phy down interrupts do not
occur either, and a host reset eventually follows. The phy state is
again 0 before and after the reset, so the same issue occurs.
To solve it, use hisi_sas_phy_down() directly in the rescan topology
function.