Commit Graph

18992 Commits

Author SHA1 Message Date
Ming Lei a0ba41317c scsi: core: Run queue in case of I/O resource contention failure
[ Upstream commit 3f0dcfbcd2 ]

I/O requests may be held in scheduler queue because of resource contention.
The starvation scenario was handled properly in the regular completion
path but we failed to account for it during I/O submission. This lead to
the hang captured below. Make sure we run the queue when resource
contention is encountered in the submission path.

[   39.054963] scsi 13:0:0:0: rejecting I/O to dead device
[   39.058700] scsi 13:0:0:0: rejecting I/O to dead device
[   39.087855] sd 13:0:0:1: [sdd] Synchronizing SCSI cache
[   39.088909] scsi 13:0:0:1: rejecting I/O to dead device
[   39.095351] scsi 13:0:0:1: rejecting I/O to dead device
[   39.096962] scsi 13:0:0:1: rejecting I/O to dead device
[  247.021859] INFO: task scsi-stress-rem:813 blocked for more than 122 seconds.
[  247.023258]       Not tainted 5.8.0-rc2 #8
[  247.024069] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[  247.025331] scsi-stress-rem D    0   813    802 0x00004000
[  247.025334] Call Trace:
[  247.025354]  __schedule+0x504/0x55f
[  247.027987]  schedule+0x72/0xa8
[  247.027991]  blk_mq_freeze_queue_wait+0x63/0x8c
[  247.027994]  ? do_wait_intr_irq+0x7a/0x7a
[  247.027996]  blk_cleanup_queue+0x4b/0xc9
[  247.028000]  __scsi_remove_device+0xf6/0x14e
[  247.028002]  scsi_remove_device+0x21/0x2b
[  247.029037]  sdev_store_delete+0x58/0x7c
[  247.029041]  kernfs_fop_write+0x10d/0x14f
[  247.031281]  vfs_write+0xa2/0xdf
[  247.032670]  ksys_write+0x6b/0xb3
[  247.032673]  do_syscall_64+0x56/0x82
[  247.034053]  entry_SYSCALL_64_after_hwframe+0x44/0xa9
[  247.034059] RIP: 0033:0x7f69f39e9008
[  247.036330] Code: Bad RIP value.
[  247.036331] RSP: 002b:00007ffdd8116498 EFLAGS: 00000246 ORIG_RAX: 0000000000000001
[  247.037613] RAX: ffffffffffffffda RBX: 0000000000000002 RCX: 00007f69f39e9008
[  247.039714] RDX: 0000000000000002 RSI: 000055cde92a0ab0 RDI: 0000000000000001
[  247.039715] RBP: 000055cde92a0ab0 R08: 000000000000000a R09: 00007f69f3a79e80
[  247.039716] R10: 000000000000000a R11: 0000000000000246 R12: 00007f69f3abb780
[  247.039717] R13: 0000000000000002 R14: 00007f69f3ab6740 R15: 0000000000000002

Link: https://lore.kernel.org/r/20200720025435.812030-1-ming.lei@redhat.com
Cc: linux-block@vger.kernel.org
Cc: Christoph Hellwig <hch@lst.de>
Reviewed-by: Bart Van Assche <bvanassche@acm.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Ming Lei <ming.lei@redhat.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2020-08-05 09:59:50 +02:00
Steve Schremmer 0ff9fce4ab scsi: dh: Add Fujitsu device to devinfo and dh lists
[ Upstream commit e094fd3460 ]

Add FUJITSU ETERNUS_AHB

Link: https://lore.kernel.org/r/DM6PR06MB5276CCA765336BD312C4282E8C660@DM6PR06MB5276.namprd06.prod.outlook.com
Signed-off-by: Steve Schremmer <steve.schremmer@netapp.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2020-07-29 10:18:27 +02:00
Johannes Thumshirn 3959567d87 scsi: mpt3sas: Fix error returns in BRM_status_show
[ Upstream commit 0fd181456a ]

BRM_status_show() has several error branches, but none of them record the
error in the error return.

Also while at it remove the manual mutex_unlock() of the pci_access_mutex
in case of an ongoing pci error recovery or host removal and jump to the
cleanup label instead.

Note: We can safely jump to out from here as io_unit_pg3 is initialized to
NULL and if it hasn't been allocated, kfree() skips the NULL pointer.

[mkp: compilation warning]

Link: https://lore.kernel.org/r/20200701131454.5255-1-johannes.thumshirn@wdc.com
Reviewed-by: Damien Le Moal <damien.lemoal@wdc.com>
Acked-by: Sreekanth Reddy <sreekanth.reddy@broadcom.com>
Signed-off-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2020-07-29 10:18:27 +02:00
Damien Le Moal 0eced76360 scsi: mpt3sas: Fix unlock imbalance
[ Upstream commit cb551b8dc0 ]

In BRM_status_show(), if the condition "!ioc->is_warpdrive" tested on entry
to the function is true, a "goto out" is called. This results in unlocking
ioc->pci_access_mutex without this mutex lock being taken.  This generates
the following splat:

[ 1148.539883] mpt3sas_cm2: BRM_status_show: BRM attribute is only for warpdrive
[ 1148.547184]
[ 1148.548708] =====================================
[ 1148.553501] WARNING: bad unlock balance detected!
[ 1148.558277] 5.8.0-rc3+ #827 Not tainted
[ 1148.562183] -------------------------------------
[ 1148.566959] cat/5008 is trying to release lock (&ioc->pci_access_mutex) at:
[ 1148.574035] [<ffffffffc070b7a3>] BRM_status_show+0xd3/0x100 [mpt3sas]
[ 1148.580574] but there are no more locks to release!
[ 1148.585524]
[ 1148.585524] other info that might help us debug this:
[ 1148.599624] 3 locks held by cat/5008:
[ 1148.607085]  #0: ffff92aea3e392c0 (&p->lock){+.+.}-{3:3}, at: seq_read+0x34/0x480
[ 1148.618509]  #1: ffff922ef14c4888 (&of->mutex){+.+.}-{3:3}, at: kernfs_seq_start+0x2a/0xb0
[ 1148.630729]  #2: ffff92aedb5d7310 (kn->active#224){.+.+}-{0:0}, at: kernfs_seq_start+0x32/0xb0
[ 1148.643347]
[ 1148.643347] stack backtrace:
[ 1148.655259] CPU: 73 PID: 5008 Comm: cat Not tainted 5.8.0-rc3+ #827
[ 1148.665309] Hardware name: HGST H4060-S/S2600STB, BIOS SE5C620.86B.02.01.0008.031920191559 03/19/2019
[ 1148.678394] Call Trace:
[ 1148.684750]  dump_stack+0x78/0xa0
[ 1148.691802]  lock_release.cold+0x45/0x4a
[ 1148.699451]  __mutex_unlock_slowpath+0x35/0x270
[ 1148.707675]  BRM_status_show+0xd3/0x100 [mpt3sas]
[ 1148.716092]  dev_attr_show+0x19/0x40
[ 1148.723664]  sysfs_kf_seq_show+0x87/0x100
[ 1148.731193]  seq_read+0xbc/0x480
[ 1148.737882]  vfs_read+0xa0/0x160
[ 1148.744514]  ksys_read+0x58/0xd0
[ 1148.751129]  do_syscall_64+0x4c/0xa0
[ 1148.757941]  entry_SYSCALL_64_after_hwframe+0x44/0xa9
[ 1148.766240] RIP: 0033:0x7f1230566542
[ 1148.772957] Code: Bad RIP value.
[ 1148.779206] RSP: 002b:00007ffeac1bcac8 EFLAGS: 00000246 ORIG_RAX: 0000000000000000
[ 1148.790063] RAX: ffffffffffffffda RBX: 0000000000020000 RCX: 00007f1230566542
[ 1148.800284] RDX: 0000000000020000 RSI: 00007f1223460000 RDI: 0000000000000003
[ 1148.810474] RBP: 00007f1223460000 R08: 00007f122345f010 R09: 0000000000000000
[ 1148.820641] R10: 0000000000000022 R11: 0000000000000246 R12: 0000000000000000
[ 1148.830728] R13: 0000000000000003 R14: 0000000000020000 R15: 0000000000020000

Fix this by returning immediately instead of jumping to the out label.

Link: https://lore.kernel.org/r/20200701085254.51740-1-damien.lemoal@wdc.com
Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Acked-by: Sreekanth Reddy <sreekanth.reddy@broadcom.com>
Signed-off-by: Damien Le Moal <damien.lemoal@wdc.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2020-07-29 10:18:27 +02:00
Tom Rix 806ffec1a9 scsi: scsi_transport_spi: Fix function pointer check
[ Upstream commit 5aee52c44d ]

clang static analysis flags several null function pointer problems.

drivers/scsi/scsi_transport_spi.c:374:1: warning: Called function pointer is null (null dereference) [core.CallAndMessage]
spi_transport_max_attr(offset, "%d\n");
^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Reviewing the store_spi_store_max macro

	if (i->f->set_##field)
		return -EINVAL;

should be

	if (!i->f->set_##field)
		return -EINVAL;

Link: https://lore.kernel.org/r/20200627133242.21618-1-trix@redhat.com
Reviewed-by: James Bottomley <jejb@linux.ibm.com>
Signed-off-by: Tom Rix <trix@redhat.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2020-07-29 10:18:26 +02:00
Chandrakanth Patil e9506de7b3 scsi: megaraid_sas: Remove undefined ENABLE_IRQ_POLL macro
commit 07d3f04550 upstream.

As the ENABLE_IRQ_POLL macro is undefined, the check for ENABLE_IRQ_POLL
macro in ISR will always be false. This leads to irq polling being
non-functional.

Remove ENABLE_IRQ_POLL check from ISR.

Link: https://lore.kernel.org/r/20200715120153.20512-1-chandrakanth.patil@broadcom.com
Fixes: a6ffd5bf68 ("scsi: megaraid_sas: Call disable_irq from process IRQ")
Cc: <stable@vger.kernel.org> # v5.3+
Signed-off-by: Chandrakanth Patil <chandrakanth.patil@broadcom.com>
Signed-off-by: Kashyap Desai <kashyap.desai@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-07-22 09:33:13 +02:00
Diego Elio Pettenò 69fbdbb4fa scsi: sr: remove references to BLK_DEV_SR_VENDOR, leave it enabled
[ Upstream commit 679b2ec8e0 ]

This kernel configuration is basically enabling/disabling sr driver quirks
detection. While these quirks are for fairly rare devices (very old CD
burners, and a glucometer), the additional detection of these models is a
very minimal amount of code.

The logic behind the quirks is always built into the sr driver.

This also removes the config from all the defconfig files that are enabling
this already.

Link: https://lore.kernel.org/r/20200223191144.726-1-flameeyes@flameeyes.com
Reviewed-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Diego Elio Pettenò <flameeyes@flameeyes.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2020-07-22 09:32:57 +02:00
SeongJae Park a3a86515d0 scsi: lpfc: Avoid another null dereference in lpfc_sli4_hba_unset()
[ Upstream commit 46da547e21 ]

Commit cdb42becdd ("scsi: lpfc: Replace io_channels for nvme and fcp with
general hdw_queues per cpu") has introduced static checker warnings for
potential null dereferences in 'lpfc_sli4_hba_unset()' and commit 1ffdd2c044
("scsi: lpfc: resolve static checker warning in lpfc_sli4_hba_unset") has
tried to fix it.  However, yet another potential null dereference is
remaining.  This commit fixes it.

This bug was discovered and resolved using Coverity Static Analysis
Security Testing (SAST) by Synopsys, Inc.

Link: https://lore.kernel.org/r/20200623084122.30633-1-sjpark@amazon.com
Fixes: 1ffdd2c044 ("scsi: lpfc: resolve static checker warning inlpfc_sli4_hba_unset")
Fixes: cdb42becdd ("scsi: lpfc: Replace io_channels for nvme and fcp with general hdw_queues per cpu")
Reviewed-by: James Smart <james.smart@broadcom.com>
Signed-off-by: SeongJae Park <sjpark@amazon.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2020-06-30 15:37:00 -04:00
Roman Bolshakov d3a251b847 scsi: qla2xxx: Keep initiator ports after RSCN
commit 632f24f09d upstream.

The driver performs SCR (state change registration) in all modes including
pure target mode.

For each RSCN, scan_needed flag is set in qla2x00_handle_rscn() for the
port mentioned in the RSCN and fabric rescan is scheduled. During the
rescan, GNN_FT handler, qla24xx_async_gnnft_done() deletes session of the
port that caused the RSCN.

In target mode, the session deletion has an impact on ATIO handler,
qlt_24xx_atio_pkt(). Target responds with SAM STATUS BUSY to I/O incoming
from the deleted session. qlt_handle_cmd_for_atio() and
qlt_handle_task_mgmt() return -EFAULT if they are not able to find session
of the command/TMF, and that results in invocation of qlt_send_busy():

  qlt_24xx_atio_pkt_all_vps: qla_target(0): type 6 ox_id 0014
  qla_target(0): Unable to send command to target, sending BUSY status

Such response causes command timeout on the initiator. Error handler thread
on the initiator will be spawned to abort the commands:

  scsi 23:0:0:0: tag#0 abort scheduled
  scsi 23:0:0:0: tag#0 aborting command
  qla2xxx [0000:af:00.0]-188c:23: Entered qla24xx_abort_command.
  qla2xxx [0000:af:00.0]-801c:23: Abort command issued nexus=23:0:0 -- 0 2003.

Command abort is rejected by target and fails (2003), error handler then
tries to perform DEVICE RESET and TARGET RESET but they're also doomed to
fail because TMFs are ignored for the deleted sessions.

Then initiator makes BUS RESET that resets the link via
qla2x00_full_login_lip(). BUS RESET succeeds and brings initiator port up,
SAN switch detects that and sends RSCN to the target port and it fails
again the same way as described above. It never goes out of the loop.

The change breaks the RSCN loop by keeping initiator sessions mentioned in
RSCN payload in all modes, including dual and pure target mode.

Link: https://lore.kernel.org/r/20200605144435.27023-1-r.bolshakov@yadro.com
Fixes: 2037ce49d3 ("scsi: qla2xxx: Fix stale session")
Cc: Quinn Tran <qutran@marvell.com>
Cc: Arun Easi <aeasi@marvell.com>
Cc: Nilesh Javali <njavali@marvell.com>
Cc: Bart Van Assche <bvanassche@acm.org>
Cc: Daniel Wagner <dwagner@suse.de>
Cc: Himanshu Madhani <himanshu.madhani@oracle.com>
Cc: Martin Wilck <mwilck@suse.com>
Cc: stable@vger.kernel.org # v5.4+
Reviewed-by: Daniel Wagner <dwagner@suse.de>
Reviewed-by: Shyam Sundar <ssundar@marvell.com>
Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com>
Signed-off-by: Roman Bolshakov <r.bolshakov@yadro.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-06-30 15:36:50 -04:00
Dinghao Liu cb5fe511b1 scsi: ufs-bsg: Fix runtime PM imbalance on error
[ Upstream commit a1e17eb03e ]

When ufs_bsg_alloc_desc_buffer() returns an error code, a pairing runtime
PM usage counter decrement is needed to keep the counter balanced.

Link: https://lore.kernel.org/r/20200522045932.31795-1-dinghao.liu@zju.edu.cn
Fixes: 74e5e468b6 (scsi: ufs-bsg: Wake the device before sending raw upiu commands)
Reviewed-by: Avri Altman <avri.altman@wdc.com>
Signed-off-by: Dinghao Liu <dinghao.liu@zju.edu.cn>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2020-06-24 17:50:47 +02:00
Christophe JAILLET 04ce5f6f0d scsi: acornscsi: Fix an error handling path in acornscsi_probe()
[ Upstream commit 42c76c9848 ]

'ret' is known to be 0 at this point.  Explicitly return -ENOMEM if one of
the 'ecardm_iomap()' calls fail.

Link: https://lore.kernel.org/r/20200530081622.577888-1-christophe.jaillet@wanadoo.fr
Fixes: e95a1b656a ("[ARM] rpc: acornscsi: update to new style ecard driver")
Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2020-06-24 17:50:43 +02:00
Can Guo bd410e6600 scsi: ufs: Don't update urgent bkops level when toggling auto bkops
[ Upstream commit be32acff43 ]

Urgent bkops level is used to compare against actual bkops status read from
UFS device. Urgent bkops level is set during initialization and might be
updated in exception event handler during runtime. But it should not be
updated to the actual bkops status every time when auto bkops is toggled.
Otherwise, if urgent bkops level is updated to 0, auto bkops shall always
be kept enabled.

Link: https://lore.kernel.org/r/1590632686-17866-1-git-send-email-cang@codeaurora.org
Fixes: 24366c2afb ("scsi: ufs: Recheck bkops level if bkops is disabled")
Reviewed-by: Stanley Chu <stanley.chu@mediatek.com>
Signed-off-by: Can Guo <cang@codeaurora.org>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2020-06-24 17:50:38 +02:00
Qiushi Wu 88d6783484 scsi: iscsi: Fix reference count leak in iscsi_boot_create_kobj
[ Upstream commit 0267ffce56 ]

kobject_init_and_add() takes reference even when it fails. If this
function returns an error, kobject_put() must be called to properly
clean up the memory associated with the object.

Link: https://lore.kernel.org/r/20200528201353.14849-1-wu000273@umn.edu
Reviewed-by: Lee Duncan <lduncan@suse.com>
Signed-off-by: Qiushi Wu <wu000273@umn.edu>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2020-06-24 17:50:37 +02:00
Jeffrey Hugo 39983e1886 scsi: ufs-qcom: Fix scheduling while atomic issue
[ Upstream commit 3be60b564d ]

ufs_qcom_dump_dbg_regs() uses usleep_range, a sleeping function, but can be
called from atomic context in the following flow:

ufshcd_intr -> ufshcd_sl_intr -> ufshcd_check_errors ->
ufshcd_print_host_regs -> ufshcd_vops_dbg_register_dump ->
ufs_qcom_dump_dbg_regs

This causes a boot crash on the Lenovo Miix 630 when the interrupt is
handled on the idle thread.

Fix the issue by switching to udelay().

Link: https://lore.kernel.org/r/20200525204125.46171-1-jeffrey.l.hugo@gmail.com
Fixes: 9c46b86762 ("scsi: ufs-qcom: dump additional testbus registers")
Reviewed-by: Bean Huo <beanhuo@micron.com>
Reviewed-by: Avri Altman <avri.altman@wdc.com>
Signed-off-by: Jeffrey Hugo <jeffrey.l.hugo@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2020-06-24 17:50:34 +02:00
Suganath Prabu S f34a3697a8 scsi: mpt3sas: Fix double free warnings
[ Upstream commit cbbfdb2a24 ]

Fix following warning from Smatch static analyser:

drivers/scsi/mpt3sas/mpt3sas_base.c:5256 _base_allocate_memory_pools()
warn: 'ioc->hpr_lookup' double freed

drivers/scsi/mpt3sas/mpt3sas_base.c:5256 _base_allocate_memory_pools()
warn: 'ioc->internal_lookup' double freed

Link: https://lore.kernel.org/r/20200508110738.30732-1-suganath-prabu.subramani@broadcom.com
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Suganath Prabu S <suganath-prabu.subramani@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2020-06-24 17:50:26 +02:00
Viacheslav Dubeyko 3201bb03c8 scsi: qla2xxx: Fix warning after FC target reset
[ Upstream commit f839544ccf ]

Currently, FC target reset finishes with the warning message:

[84010.596893] ------------[ cut here ]------------
[84010.596917] WARNING: CPU: 238 PID: 279973 at ../drivers/scsi/qla2xxx/qla_target.c:6644 qlt_enable_vha+0x1d0/0x260 [qla2xxx]
[84010.596918] Modules linked in: vrf af_packet 8021q garp mrp stp llc netlink_diag target_tatlin_tblock(OEX) dm_ec(OEX) ttln_rdma(OEX) dm_frontend(OEX) nvme_rdma nvmet tcm_qla2xxx iscsi_target_mod target_core_mod at24 nvmem_core pnv_php ipmi_watchdog ipmi_ssif vmx_crypto gf128mul crct10dif_vpmsum qla2xxx rpcrdma nvme_fc powernv_flash(X) nvme_fabrics uio_pdrv_genirq mtd rtc_opal(X) ibmpowernv(X) opal_prd(X) uio scsi_transport_fc i2c_opal(X) ses enclosure ipmi_poweroff ast i2c_algo_bit ttm bmc_mcu(OEX) drm_kms_helper syscopyarea sysfillrect sysimgblt fb_sys_fops drm drm_panel_orientation_quirks agpgart nfsd auth_rpcgss nfs_acl ipmi_powernv(X) lockd ipmi_devintf ipmi_msghandler grace dummy ext4 crc16 jbd2 mbcache sd_mod rdma_ucm ib_iser rdma_cm ib_umad iw_cm ib_ipoib libiscsi scsi_transport_iscsi ib_cm
[84010.596975]  configfs mlx5_ib ib_uverbs ib_core mlx5_core crc32c_vpmsum xhci_pci xhci_hcd mpt3sas(OEX) tg3 usbcore mlxfw tls raid_class libphy scsi_transport_sas devlink ptp pps_core nvme nvme_core sunrpc dm_mirror dm_region_hash dm_log sg dm_multipath dm_mod scsi_dh_rdac scsi_dh_emc scsi_dh_alua scsi_mod autofs4
[84010.597001] Supported: Yes, External
[84010.597004] CPU: 238 PID: 279973 Comm: bash Tainted: G           OE      4.12.14-197.29-default #1 SLE15-SP1
[84010.597006] task: c000000a104c0000 task.stack: c000000b52188000
[84010.597007] NIP: d00000001ffd7f78 LR: d00000001ffd7f6c CTR: c0000000001676c0
[84010.597008] REGS: c000000b5218b910 TRAP: 0700   Tainted: G           OE       (4.12.14-197.29-default)
[84010.597008] MSR: 900000010282b033 <SF,HV,VEC,VSX,EE,FP,ME,IR,DR,RI,LE,TM[E]>
[84010.597015]   CR: 48242424  XER: 00000000
[84010.597016] CFAR: d00000001ff45d08 SOFTE: 1
               GPR00: d00000001ffd7f6c c000000b5218bb90 d00000002001b228 0000000000000102
               GPR04: 0000000000000001 0000000000000001 00013d91ed0a5e2d 0000000000000000
               GPR08: c000000007793300 0000000000000000 0000000000000000 c000000a086e7818
               GPR12: 0000000000002200 c000000007793300 0000000000000000 000000012bc937c0
               GPR16: 000000012bbf7ed0 0000000000000000 000000012bc3dd10 0000000000000000
               GPR20: 000000012bc4db28 0000010036442810 000000012bc97828 000000012bc96c70
               GPR24: 00000100365b1550 0000000000000000 00000100363f3d80 c000000be20d3080
               GPR28: c000000bda7eae00 c000000be20db7e8 c000000be20d3778 c000000be20db7e8
[84010.597042] NIP [d00000001ffd7f78] qlt_enable_vha+0x1d0/0x260 [qla2xxx]
[84010.597051] LR [d00000001ffd7f6c] qlt_enable_vha+0x1c4/0x260 [qla2xxx]
[84010.597051] Call Trace:
[84010.597061] [c000000b5218bb90] [d00000001ffd7f6c] qlt_enable_vha+0x1c4/0x260 [qla2xxx] (unreliable)
[84010.597064] [c000000b5218bc20] [d000000009820b6c] tcm_qla2xxx_tpg_enable_store+0xc4/0x130 [tcm_qla2xxx]
[84010.597067] [c000000b5218bcb0] [d0000000185d0e68] configfs_write_file+0xd0/0x190 [configfs]
[84010.597072] [c000000b5218bd00] [c0000000003d0edc] __vfs_write+0x3c/0x1e0
[84010.597074] [c000000b5218bd90] [c0000000003d2ea8] vfs_write+0xd8/0x220
[84010.597076] [c000000b5218bde0] [c0000000003d4ddc] SyS_write+0x6c/0x110
[84010.597079] [c000000b5218be30] [c00000000000b188] system_call+0x3c/0x130
[84010.597080] Instruction dump:
[84010.597082] 7d0050a8 7d084b78 7d0051ad 40c2fff4 7fa3eb78 4bf73965 60000000 7fa3eb78
[84010.597086] 4bf6dcd9 60000000 2fa30000 419eff40 <0fe00000> 4bffff38 e95f0058 a12a0180
[84010.597090] ---[ end trace e32abaf6e6fee826 ]---

To reproduce:

echo 0x7fffffff > /sys/module/qla2xxx/parameters/logging
modprobe target_core_mod
modprobe tcm_qla2xxx
mkdir /sys/kernel/config/target/qla2xxx
mkdir /sys/kernel/config/target/qla2xxx/<port-name>
mkdir /sys/kernel/config/target/qla2xxx/<port-name>/tpgt_1
echo 1 > /sys/kernel/config/target/qla2xxx/<port-name>/tpgt_1/enable
echo 0 > /sys/kernel/config/target/qla2xxx/<port-name>/tpgt_1/enable
echo 1 > /sys/kernel/config/target/qla2xxx/<port-name>/tpgt_1/enable

SYSTEM START
kernel: pid 327:drivers/scsi/qla2xxx/qla_init.c:2174 qla2x00_initialize_adapter(): vha->flags.online 0x0
<...>
kernel: pid 327:drivers/scsi/qla2xxx/qla_os.c:3444 qla2x00_probe_one(): vha->flags.online 0x1

echo 1 > /sys/kernel/config/target/qla2xxx/21:00:00:24:ff:86:a6:2a/tpgt_1/enable
kernel: pid 348:drivers/scsi/qla2xxx/qla_init.c:6641 qla2x00_abort_isp_cleanup(): vha->flags.online 0x0, ISP_ABORT_NEEDED 0x0
<...>
kernel: pid 348:drivers/scsi/qla2xxx/qla_init.c:6998 qla2x00_restart_isp(): vha->flags.online 0x0

echo 0 > /sys/kernel/config/target/qla2xxx/21:00:00:24:ff:86:a6:2a/tpgt_1/enable
kernel: pid 348:drivers/scsi/qla2xxx/qla_init.c:6641 qla2x00_abort_isp_cleanup(): vha->flags.online 0x0, ISP_ABORT_NEEDED 0x0
<...>
kernel: pid 1404:drivers/scsi/qla2xxx/qla_os.c:1107 qla2x00_wait_for_hba_online(): base_vha->flags.online 0x0

echo 1 > /sys/kernel/config/target/qla2xxx/21:00:00:24:ff:86:a6:2a/tpgt_1/enable
kernel: pid 1404:drivers/scsi/qla2xxx/qla_os.c:1107 qla2x00_wait_for_hba_online(): base_vha->flags.online 0x0
kernel: -----------[ cut here ]-----------
kernel: WARNING: CPU: 1 PID: 1404 at drivers/scsi/qla2xxx/qla_target.c:6654 qlt_enable_vha+0x1e0/0x280 [qla2xxx]

The issue happens because no real ISP reset is executed.  The
qla2x00_abort_isp(scsi_qla_host_t *vha) function expects that
vha->flags.online will be not zero for ISP reset procedure.  This patch
sets vha->flags.online to 1 before calling ->abort_isp() for starting the
ISP reset.

Link: https://lore.kernel.org/r/1d7b21bf9f7676643239eb3d60eaca7cfa505cf0.camel@yadro.com
Reviewed-by: Roman Bolshakov <r.bolshakov@yadro.com>
Signed-off-by: Viacheslav Dubeyko <v.dubeiko@yadro.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2020-06-24 17:50:24 +02:00
Chad Dupuis cd3f4ed1c5 scsi: qedf: Fix crash when MFW calls for protocol stats while function is still probing
[ Upstream commit ad40f52560 ]

The MFW may make a call to qed and then to qedf for protocol statistics
while the function is still probing.  If this happens it's possible that
some members of the struct qedf_ctx may not be fully initialized which can
result in a NULL pointer dereference or general protection fault.

To prevent this, add a new flag call QEDF_PROBING and set it when the
__qedf_probe() function is active. Then in the qedf_get_protocol_tlv_data()
function we can check if the function is still probing and return
immediantely before any uninitialized structures can be touched.

Link: https://lore.kernel.org/r/20200416084314.18851-9-skashyap@marvell.com
Signed-off-by: Chad Dupuis <cdupuis@marvell.com>
Signed-off-by: Saurav Kashyap <skashyap@marvell.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2020-06-24 17:50:21 +02:00
Nilesh Javali f37623862e scsi: qedi: Do not flush offload work if ARP not resolved
[ Upstream commit 927527aea0 ]

For an unreachable target, offload_work is not initialized and the endpoint
state is set to OFLDCONN_NONE. This results in a WARN_ON due to the check
of the work function field being set to zero.

------------[ cut here ]------------
WARNING: CPU: 24 PID: 18587 at ../kernel/workqueue.c:3037 __flush_work+0x1c1/0x1d0
:
Hardware name: HPE ProLiant DL380 Gen10/ProLiant DL380 Gen10, BIOS U30 02/01/2020
RIP: 0010:__flush_work+0x1c1/0x1d0
Code: ba 6d 00 03 80 c9 f0 eb b6 48 c7 c7 20 ee 6c a4 e8 52 d3 04 00 0f 0b 31 c0 e9 d1 fe ff
ff 48 c7 c7 20 ee 6c a4 e8 3d d3 04 00 <0f> 0b 31 c0 e9 bc fe ff ff e8 11 f3 f
 00 31 f6
RSP: 0018:ffffac5a8cd47a80 EFLAGS: 00010282
RAX: 0000000000000024 RBX: ffff98d68c1fcaf0 RCX: 0000000000000000
RDX: 0000000000000000 RSI: ffff98ce9fd99898 RDI: ffff98ce9fd99898
RBP: ffff98d68c1fcbc0 R08: 00000000000006fa R09: 0000000000000001
R10: ffffac5a8cd47b50 R11: 0000000000000001 R12: 0000000000000000
R13: 000000000000489b R14: ffff98d68c1fc800 R15: ffff98d692132c00
FS:  00007f65f7f62280(0000) GS:ffff98ce9fd80000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007ffd2435e880 CR3: 0000000809334003 CR4: 00000000007606e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
PKRU: 55555554
Call Trace:
 ? class_create_release+0x40/0x40
 ? klist_put+0x2c/0x80
 qedi_ep_disconnect+0xdd/0x400 [qedi]
 iscsi_if_ep_disconnect.isra.20+0x59/0x70 [scsi_transport_iscsi]
 iscsi_if_rx+0x129b/0x1670 [scsi_transport_iscsi]
 ? __netlink_lookup+0xe7/0x160
 netlink_unicast+0x21d/0x300
 netlink_sendmsg+0x30f/0x430
 sock_sendmsg+0x5b/0x60
 ____sys_sendmsg+0x1e2/0x240
 ? copy_msghdr_from_user+0xd9/0x160
 ___sys_sendmsg+0x88/0xd0
 ? ___sys_recvmsg+0xa2/0xe0
 ? hrtimer_try_to_cancel+0x25/0x100
 ? do_nanosleep+0x9c/0x170
 ? __sys_sendmsg+0x5e/0xa0
 __sys_sendmsg+0x5e/0xa0
 do_syscall_64+0x60/0x1f0
 entry_SYSCALL_64_after_hwframe+0x49/0xbe
RIP: 0033:0x7f65f6f16107
Code: 64 89 02 48 c7 c0 ff ff ff ff eb b9 0f 1f 80 00 00 00 00 8b 05 aa d2 2b 00 48 63 d2 48
63 ff 85 c0 75 18 b8 2e 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 59 f3 c3 0f 1f 8
    0 00 00 00 00 53 48 89 f3 48
 RSP: 002b:00007ffd24367ca8 EFLAGS: 00000246 ORIG_RAX: 000000000000002e
 RAX: ffffffffffffffda RBX: 000055a7aeaaf110 RCX: 00007f65f6f16107
 RDX: 0000000000000000 RSI: 00007ffd24367cc0 RDI: 0000000000000003
 RBP: 0000000000000070 R08: 0000000000000000 R09: 0000000000000000
 R10: 000000000000075c R11: 0000000000000246 R12: 00007ffd24367cc0
 R13: 000055a7ae560008 R14: 00007ffd24367db0 R15: 0000000000000000
 ---[ end trace 54f499c05d41f8bb ]---

Only flush if the connection endpoint state if different from
OFLDCONN_NONE.

[mkp: clarified commit desc]

Link: https://lore.kernel.org/r/20200408064332.19377-5-mrangankar@marvell.com
Reviewed-by: Lee Duncan <lduncan@suse.com>
Signed-off-by: Nilesh Javali <njavali@marvell.com>
Signed-off-by: Manish Rangankar <mrangankar@marvell.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2020-06-24 17:50:20 +02:00
Tyrel Datwyler af1f72ad18 scsi: ibmvscsi: Don't send host info in adapter info MAD after LPM
[ Upstream commit 4919b33b63 ]

The adapter info MAD is used to send the client info and receive the host
info as a response. A persistent buffer is used and as such the client info
is overwritten after the response. During the course of a normal adapter
reset the client info is refreshed in the buffer in preparation for sending
the adapter info MAD.

However, in the special case of LPM where we reenable the CRQ instead of a
full CRQ teardown and reset we fail to refresh the client info in the
adapter info buffer. As a result, after Live Partition Migration (LPM) we
erroneously report the host's info as our own.

[mkp: typos]

Link: https://lore.kernel.org/r/20200603203632.18426-1-tyreld@linux.ibm.com
Signed-off-by: Tyrel Datwyler <tyreld@linux.ibm.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2020-06-24 17:50:19 +02:00
Simon Arlott 08d2ea2e23 scsi: sr: Fix sr_probe() missing deallocate of device minor
[ Upstream commit 6555781b3f ]

If the cdrom fails to be registered then the device minor should be
deallocated.

Link: https://lore.kernel.org/r/072dac4b-8402-4de8-36bd-47e7588969cd@0882a8b5-c6c3-11e9-b005-00805fc181fe
Signed-off-by: Simon Arlott <simon@octiron.net>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2020-06-24 17:50:19 +02:00
Xiyu Yang 296cec7050 scsi: lpfc: Fix lpfc_nodelist leak when processing unsolicited event
[ Upstream commit 7217e6e694 ]

In order to create or activate a new node, lpfc_els_unsol_buffer() invokes
lpfc_nlp_init() or lpfc_enable_node() or lpfc_nlp_get(), all of them will
return a reference of the specified lpfc_nodelist object to "ndlp" with
increased refcnt.

When lpfc_els_unsol_buffer() returns, local variable "ndlp" becomes
invalid, so the refcount should be decreased to keep refcount balanced.

The reference counting issue happens in one exception handling path of
lpfc_els_unsol_buffer(). When "ndlp" in DEV_LOSS, the function forgets to
decrease the refcnt increased by lpfc_nlp_init() or lpfc_enable_node() or
lpfc_nlp_get(), causing a refcnt leak.

Fix this issue by calling lpfc_nlp_put() when "ndlp" in DEV_LOSS.

Link: https://lore.kernel.org/r/1590416184-52592-1-git-send-email-xiyuyang19@fudan.edu.cn
Reviewed-by: Daniel Wagner <dwagner@suse.de>
Reviewed-by: James Smart <james.smart@broadcom.com>
Signed-off-by: Xiyu Yang <xiyuyang19@fudan.edu.cn>
Signed-off-by: Xin Tan <tanxin.ctf@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2020-06-24 17:50:17 +02:00
Dan Carpenter bee9cc3513 scsi: cxgb3i: Fix some leaks in init_act_open()
[ Upstream commit b6170a49c5 ]

There wasn't any clean up done if cxgb3_alloc_atid() failed and also the
original code didn't release "csk->l2t".

Link: https://lore.kernel.org/r/20200521121221.GA247492@mwanda
Fixes: 6f7efaabef ("[SCSI] cxgb3i: change cxgb3i to use libcxgbi")
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2020-06-24 17:50:16 +02:00
Luo Jiaxing ba5192db96 scsi: hisi_sas: Do not reset phy timer to wait for stray phy up
[ Upstream commit e16b9ed61e ]

We found out that after phy up, the hardware reports another oob interrupt
but did not follow a phy up interrupt:

oob ready -> phy up -> DEV found -> oob read -> wait phy up -> timeout

We run link reset when wait phy up timeout, and it send a normal disk into
reset processing. So we made some circumvention action in the code, so that
this abnormal oob interrupt will not start the timer to wait for phy up.

Link: https://lore.kernel.org/r/1589552025-165012-2-git-send-email-john.garry@huawei.com
Signed-off-by: Luo Jiaxing <luojiaxing@huawei.com>
Signed-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2020-06-24 17:50:15 +02:00
Viacheslav Dubeyko 70a2296d4b scsi: qla2xxx: Fix issue with adapter's stopping state
[ Upstream commit 803e45550b ]

The goal of the following command sequence is to restart the adapter.
However, the tgt_stop flag remains set, indicating that the adapter is
still in stopping state even after re-enabling it.

echo 0x7fffffff > /sys/module/qla2xxx/parameters/logging
modprobe target_core_mod
modprobe tcm_qla2xxx
mkdir /sys/kernel/config/target/qla2xxx
mkdir /sys/kernel/config/target/qla2xxx/<port-name>
mkdir /sys/kernel/config/target/qla2xxx/<port-name>/tpgt_1
echo 1 > /sys/kernel/config/target/qla2xxx/<port-name>/tpgt_1/enable
echo 0 > /sys/kernel/config/target/qla2xxx/<port-name>/tpgt_1/enable
echo 1 > /sys/kernel/config/target/qla2xxx/<port-name>/tpgt_1/enable

kernel: PID 1396:qla_target.c:1555 qlt_stop_phase1(): tgt_stop 0x0, tgt_stopped 0x0
kernel: qla2xxx [0001:00:02.0]-e803:1: PID 1396:qla_target.c:1567: Stopping target for host 1(c0000000033557e8)
kernel: PID 1396:qla_target.c:1579 qlt_stop_phase1(): tgt_stop 0x1, tgt_stopped 0x0
kernel: PID 1396:qla_target.c:1266 qlt_schedule_sess_for_deletion(): tgt_stop 0x1, tgt_stopped 0x0
kernel: qla2xxx [0001:00:02.0]-e801:1: PID 1396:qla_target.c:1316: Scheduling sess c00000002d5cd800 for deletion 21:00:00:24:ff:7f:35:c7
<skipped>
kernel: qla2xxx [0001:00:02.0]-290a:1: PID 340:qla_target.c:1187: qlt_unreg_sess sess c00000002d5cd800 for deletion 21:00:00:24:ff:7f:35:c7
<skipped>
kernel: qla2xxx [0001:00:02.0]-f801:1: PID 340:qla_target.c:1145: Unregistration of sess c00000002d5cd800 21:00:00:24:ff:7f:35:c7 finished fcp_cnt 0
kernel: PID 340:qla_target.c:1155 qlt_free_session_done(): tgt_stop 0x1, tgt_stopped 0x0
kernel: qla2xxx [0001:00:02.0]-4807:1: PID 346:qla_os.c:6329: ISP abort scheduled.
<skipped>
kernel: qla2xxx [0001:00:02.0]-28f1:1: PID 346:qla_os.c:3956: Mark all dev lost
kernel: PID 346:qla_target.c:1266 qlt_schedule_sess_for_deletion(): tgt_stop 0x1, tgt_stopped 0x0
kernel: qla2xxx [0001:00:02.0]-4808:1: PID 346:qla_os.c:6338: ISP abort end.
<skipped>
kernel: PID 1396:qla_target.c:6812 qlt_enable_vha(): tgt_stop 0x1, tgt_stopped 0x0
<skipped>
kernel: qla2xxx [0001:00:02.0]-4807:1: PID 346:qla_os.c:6329: ISP abort scheduled.
<skipped>
kernel: qla2xxx [0001:00:02.0]-4808:1: PID 346:qla_os.c:6338: ISP abort end.

qlt_handle_cmd_for_atio() rejects the request to send commands because the
adapter is in the stopping state:

kernel: PID 0:qla_target.c:4442 qlt_handle_cmd_for_atio(): tgt_stop 0x1, tgt_stopped 0x0
kernel: qla2xxx [0001:00:02.0]-3861:1: PID 0:qla_target.c:4447: New command while device c000000005314600 is shutting down
kernel: qla2xxx [0001:00:02.0]-e85f:1: PID 0:qla_target.c:5728: qla_target: Unable to send command to target

This patch calls qla_stop_phase2() in addition to qlt_stop_phase1() in
tcm_qla2xxx_tpg_enable_store() and tcm_qla2xxx_npiv_tpg_enable_store(). The
qlt_stop_phase1() marks adapter as stopping (tgt_stop == 0x1, tgt_stopped
== 0x0) but qlt_stop_phase2() marks adapter as stopped (tgt_stop == 0x0,
tgt_stopped == 0x1).

Link: https://lore.kernel.org/r/52be1e8a3537f6c5407eae3edd4c8e08a9545ea5.camel@yadro.com
Reviewed-by: Roman Bolshakov <r.bolshakov@yadro.com>
Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com>
Signed-off-by: Viacheslav Dubeyko <v.dubeiko@yadro.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2020-06-24 17:50:12 +02:00
Dan Carpenter 7631dc491a scsi: qedi: Check for buffer overflow in qedi_set_path()
[ Upstream commit 4a4c0cfb4b ]

Smatch complains that the "path_data->handle" variable is user controlled.
It comes from iscsi_set_path() so that seems possible.  It's harmless to
add a limit check.

The qedi->ep_tbl[] array has qedi->max_active_conns elements (which is
always ISCSI_MAX_SESS_PER_HBA (4096) elements).  The array is allocated in
the qedi_cm_alloc_mem() function.

Link: https://lore.kernel.org/r/20200428131939.GA696531@mwanda
Fixes: ace7f46ba5 ("scsi: qedi: Add QLogic FastLinQ offload iSCSI driver framework.")
Acked-by: Manish Rangankar <mrangankar@marvell.com>
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2020-06-24 17:50:11 +02:00
Johannes Thumshirn 545b86efa0 scsi: core: free sgtables in case command setup fails
[ Upstream commit 20a66f2bf2 ]

In case scsi_setup_fs_cmnd() fails we're not freeing the sgtables allocated
by scsi_init_io(), thus we leak the allocated memory.

Free the sgtables allocated by scsi_init_io() in case scsi_setup_fs_cmnd()
fails.

Technically scsi_setup_scsi_cmnd() does not suffer from this problem as it
can only fail if scsi_init_io() fails, so it does not have sgtables
allocated. But to maintain symmetry and as a measure of defensive
programming, free the sgtables on scsi_setup_scsi_cmnd() failure as well.
scsi_mq_free_sgtables() has safeguards against double-freeing of memory so
this is safe to do.

While we're at it, rename scsi_mq_free_sgtables() to scsi_free_sgtables().

Link: https://bugzilla.kernel.org/show_bug.cgi?id=205595
Link: https://lore.kernel.org/r/20200428104605.8143-2-johannes.thumshirn@wdc.com
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Daniel Wagner <dwagner@suse.de>
Reviewed-by: Hannes Reinecke <hare@suse.de>
Signed-off-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2020-06-24 17:50:10 +02:00
Dick Kennedy 221b26e01d scsi: lpfc: Fix negation of else clause in lpfc_prep_node_fc4type
commit f809da6db6 upstream.

Implementation of a previous patch added a condition to an if check that
always end up with the if test being true. Execution of the else clause was
inadvertently negated.  The additional condition check was incorrect and
unnecessary after the other modifications had been done in that patch.

Remove the check from the if series.

Link: https://lore.kernel.org/r/20200501214310.91713-5-jsmart2021@gmail.com
Fixes: b95b21193c ("scsi: lpfc: Fix loss of remote port after devloss due to lack of RPIs")
Cc: <stable@vger.kernel.org> # v5.4+
Reviewed-by: Hannes Reinecke <hare@suse.de>
Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-06-17 16:40:35 +02:00
Sumit Saxena 387d51f0c1 scsi: megaraid_sas: TM command refire leads to controller firmware crash
commit 6fd8525a70 upstream.

When TM command times out, driver invokes the controller reset. Post reset,
driver re-fires pended TM commands which leads to firmware crash.

Post controller reset, return pended TM commands back to OS.

Link: https://lore.kernel.org/r/20200508085242.23406-1-chandrakanth.patil@broadcom.com
Cc: stable@vger.kernel.org
Signed-off-by: Sumit Saxena <sumit.saxena@broadcom.com>
Signed-off-by: Chandrakanth Patil <chandrakanth.patil@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-06-17 16:40:35 +02:00
Can Guo 1857d7d776 scsi: pm: Balance pm_only counter of request queue during system resume
[ Upstream commit 05d18ae1cc ]

During system resume, scsi_resume_device() decreases a request queue's
pm_only counter if the scsi device was quiesced before. But after that, if
the scsi device's RPM status is RPM_SUSPENDED, the pm_only counter is still
held (non-zero). Current SCSI resume hook only sets the RPM status of the
scsi_device and its request queue to RPM_ACTIVE, but leaves the pm_only
counter unchanged. This may make the request queue's pm_only counter remain
non-zero after resume hook returns, hence those who are waiting on the
mq_freeze_wq would never be woken up. Fix this by calling
blk_post_runtime_resume() if a sdev's RPM status was RPM_SUSPENDED.

(struct request_queue)0xFFFFFF815B69E938
	pm_only = (counter = 2),
	rpm_status = 0,
	dev = 0xFFFFFF815B0511A0,

((struct device)0xFFFFFF815B0511A0)).power
	is_suspended = FALSE,
	runtime_status = RPM_ACTIVE,

(struct scsi_device)0xffffff815b051000
	request_queue = 0xFFFFFF815B69E938,
	sdev_state = SDEV_RUNNING,
	quiesced_by = 0x0,

B::v.f_/task_0xFFFFFF810C246940
-000|__switch_to(prev = 0xFFFFFF810C246940, next = 0xFFFFFF80A49357C0)
-001|context_switch(inline)
-001|__schedule(?)
-002|schedule()
-003|blk_queue_enter(q = 0xFFFFFF815B69E938, flags = 0)
-004|generic_make_request(?)
-005|submit_bio(bio = 0xFFFFFF80A8195B80)

Link: https://lore.kernel.org/r/1588740936-28846-1-git-send-email-cang@codeaurora.org
Reviewed-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: Can Guo <cang@codeaurora.org>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2020-06-07 13:18:50 +02:00
Xiang Chen a293045d38 scsi: hisi_sas: Check sas_port before using it
[ Upstream commit 8c39673d54 ]

Need to check the structure sas_port before using it.

Link: https://lore.kernel.org/r/1573551059-107873-2-git-send-email-john.garry@huawei.com
Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com>
Signed-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2020-06-07 13:18:49 +02:00
Can Guo d628f7a59e scsi: ufs: Release clock if DMA map fails
commit 17c7d35f14 upstream.

In queuecommand path, if DMA map fails, it bails out with clock held.  In
this case, release the clock to keep its usage paired.

[mkp: applied by hand]

Link: https://lore.kernel.org/r/0101016ed3d66395-1b7e7fce-b74d-42ca-a88a-4db78b795d3b-000000@us-west-2.amazonses.com
Reviewed-by: Bean Huo <beanhuo@micron.com>
Signed-off-by: Can Guo <cang@codeaurora.org>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
[EB: resolved cherry-pick conflict caused by newer kernels not having
 the clear_bit_unlock() line]
Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-06-07 13:18:48 +02:00
Ewan D. Milne 818657105a scsi: qla2xxx: Do not log message when reading port speed via sysfs
commit fb9024b064 upstream.

Calling ql_log() inside qla2x00_port_speed_show() is causing messages to be
output to the console for no particularly good reason.  The sysfs read
routine should just return the information to userspace.  The only reason
to log a message is when the port speed actually changes, and this already
occurs elsewhere.

Link: https://lore.kernel.org/r/20200504175416.15417-1-emilne@redhat.com
Fixes: 4910b524ac ("scsi: qla2xxx: Add support for setting port speed")
Cc: <stable@vger.kernel.org> # v5.1+
Reviewed-by: Lee Duncan <lduncan@suse.com>
Reviewed-by: Laurence Oberman <loberman@redhat.com>
Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com>
Signed-off-by: Ewan D. Milne <emilne@redhat.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-05-27 17:46:40 +02:00
Tyrel Datwyler acd3efa17d scsi: ibmvscsi: Fix WARN_ON during event pool release
[ Upstream commit b36522150e ]

While removing an ibmvscsi client adapter a WARN_ON like the following is
seen in the kernel log:

drmgr: drmgr: -r -c slot -s U9080.M9S.783AEC8-V11-C11 -w 5 -d 1
WARNING: CPU: 9 PID: 24062 at ../kernel/dma/mapping.c:311 dma_free_attrs+0x78/0x110
Supported: No, Unreleased kernel
CPU: 9 PID: 24062 Comm: drmgr Kdump: loaded Tainted: G               X 5.3.18-12-default
NIP:  c0000000001fa758 LR: c0000000001fa744 CTR: c0000000001fa6e0
REGS: c0000002173375d0 TRAP: 0700   Tainted: G               X (5.3.18-12-default)
MSR:  8000000000029033 <SF,EE,ME,IR,DR,RI,LE>  CR: 28088282  XER: 20000000
CFAR: c0000000001fbf0c IRQMASK: 1
GPR00: c0000000001fa744 c000000217337860 c00000000161ab00 0000000000000000
GPR04: 0000000000000000 c000011e12250000 0000000018010000 0000000000000000
GPR08: 0000000000000000 0000000000000001 0000000000000001 c0080000190f4fa8
GPR12: c0000000001fa6e0 c000000007fc2a00 0000000000000000 0000000000000000
GPR16: 0000000000000000 0000000000000000 0000000000000000 0000000000000000
GPR20: 0000000000000000 0000000000000000 0000000000000000 0000000000000000
GPR24: 000000011420e310 0000000000000000 0000000000000000 0000000018010000
GPR28: c00000000159de50 c000011e12250000 0000000000006600 c000011e5c994848
NIP [c0000000001fa758] dma_free_attrs+0x78/0x110
LR [c0000000001fa744] dma_free_attrs+0x64/0x110
Call Trace:
[c000000217337860] [000000011420e310] 0x11420e310 (unreliable)
[c0000002173378b0] [c0080000190f0280] release_event_pool+0xd8/0x120 [ibmvscsi]
[c000000217337930] [c0080000190f3f74] ibmvscsi_remove+0x6c/0x160 [ibmvscsi]
[c000000217337960] [c0000000000f3cac] vio_bus_remove+0x5c/0x100
[c0000002173379a0] [c00000000087a0a4] device_release_driver_internal+0x154/0x280
[c0000002173379e0] [c0000000008777cc] bus_remove_device+0x11c/0x220
[c000000217337a60] [c000000000870fc4] device_del+0x1c4/0x470
[c000000217337b10] [c0000000008712a0] device_unregister+0x30/0xa0
[c000000217337b80] [c0000000000f39ec] vio_unregister_device+0x2c/0x60
[c000000217337bb0] [c00800001a1d0964] dlpar_remove_slot+0x14c/0x250 [rpadlpar_io]
[c000000217337c50] [c00800001a1d0bcc] remove_slot_store+0xa4/0x110 [rpadlpar_io]
[c000000217337cd0] [c000000000c091a0] kobj_attr_store+0x30/0x50
[c000000217337cf0] [c00000000057c934] sysfs_kf_write+0x64/0x90
[c000000217337d10] [c00000000057be10] kernfs_fop_write+0x1b0/0x290
[c000000217337d60] [c000000000488c4c] __vfs_write+0x3c/0x70
[c000000217337d80] [c00000000048c648] vfs_write+0xd8/0x260
[c000000217337dd0] [c00000000048ca8c] ksys_write+0xdc/0x130
[c000000217337e20] [c00000000000b488] system_call+0x5c/0x70
Instruction dump:
7c840074 f8010010 f821ffb1 20840040 eb830218 7c8407b4 48002019 60000000
2fa30000 409e003c 892d0988 792907e0 <0b090000> 2fbd0000 419e0028 2fbc0000
---[ end trace 5955b3c0cc079942 ]---
rpadlpar_io: slot U9080.M9S.783AEC8-V11-C11 removed

This is tripped as a result of irqs being disabled during the call to
dma_free_coherent() by release_event_pool(). At this point in the code path
we have quiesced the adapter and it is overly paranoid to be holding the
host lock.

[mkp: fixed build warning reported by sfr]

Link: https://lore.kernel.org/r/1588027793-17952-1-git-send-email-tyreld@linux.ibm.com
Signed-off-by: Tyrel Datwyler <tyreld@linux.ibm.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2020-05-27 17:46:32 +02:00
Quinn Tran 2b52a61adb scsi: qla2xxx: Delete all sessions before unregister local nvme port
[ Upstream commit c48f849d3f ]

Delete all sessions before unregistering local nvme port.  This allows nvme
layer to decrement all active rport count down to zero.  Once the count is
down to zero, nvme would call qla to continue with the npiv port deletion.

PID: 27448  TASK: ffff9e34b777c1c0  CPU: 0   COMMAND: "qaucli"
 0 [ffff9e25e84abbd8] __schedule at ffffffff977858ca
 1 [ffff9e25e84abc68] schedule at ffffffff97785d79
 2 [ffff9e25e84abc78] schedule_timeout at ffffffff97783881
 3 [ffff9e25e84abd28] wait_for_completion at ffffffff9778612d
 4 [ffff9e25e84abd88] qla_nvme_delete at ffffffffc0e3024e [qla2xxx]
 5 [ffff9e25e84abda8] qla24xx_vport_delete at ffffffffc0e024b9 [qla2xxx]
 6 [ffff9e25e84abdf0] fc_vport_terminate at ffffffffc011c247 [scsi_transport_fc]
 7 [ffff9e25e84abe28] store_fc_host_vport_delete at ffffffffc011cd94 [scsi_transport_fc]
 8 [ffff9e25e84abe70] dev_attr_store at ffffffff974b376b
 9 [ffff9e25e84abe80] sysfs_kf_write at ffffffff972d9a92
10 [ffff9e25e84abe90] kernfs_fop_write at ffffffff972d907b
11 [ffff9e25e84abec8] vfs_write at ffffffff9724c790
12 [ffff9e25e84abf08] sys_write at ffffffff9724d55f
13 [ffff9e25e84abf50] system_call_fastpath at ffffffff97792ed2
    RIP: 00007fc0bd81a6fd  RSP: 00007ffff78d9648  RFLAGS: 00010202
    RAX: 0000000000000001  RBX: 0000000000000022  RCX: 00007ffff78d96e0
    RDX: 0000000000000022  RSI: 00007ffff78d94e0  RDI: 0000000000000008
    RBP: 00007ffff78d9440   R8: 0000000000000000   R9: 00007fc0bd48b2cd
    R10: 0000000000000017  R11: 0000000000000293  R12: 0000000000000000
    R13: 00005624e4dac840  R14: 00005624e4da9a10  R15: 0000000000000000
    ORIG_RAX: 0000000000000001  CS: 0033  SS: 002b

Link: https://lore.kernel.org/r/20200331104015.24868-4-njavali@marvell.com
Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com>
Signed-off-by: Quinn Tran <qutran@marvell.com>
Signed-off-by: Nilesh Javali <njavali@marvell.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2020-05-27 17:46:29 +02:00
Arun Easi d2430cb7f2 scsi: qla2xxx: Fix hang when issuing nvme disconnect-all in NPIV
[ Upstream commit 45a76264c2 ]

In NPIV environment, a NPIV host may use a queue pair created by base host
or other NPIVs, so the check for a queue pair created by this NPIV is not
correct, and can cause an abort to fail, which in turn means the NVME
command not returned.  This leads to hang in nvme_fc layer in
nvme_fc_delete_association() which waits for all I/Os to be returned, which
is seen as hang in the application.

Link: https://lore.kernel.org/r/20200331104015.24868-3-njavali@marvell.com
Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com>
Signed-off-by: Arun Easi <aeasi@marvell.com>
Signed-off-by: Nilesh Javali <njavali@marvell.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2020-05-27 17:46:28 +02:00
Wu Bo 2d6d0ce4de scsi: sg: add sg_remove_request in sg_write
commit 83c6f23900 upstream.

If the __copy_from_user function failed we need to call sg_remove_request
in sg_write.

Link: https://lore.kernel.org/r/610618d9-e983-fd56-ed0f-639428343af7@huawei.com
Acked-by: Douglas Gilbert <dgilbert@interlog.com>
Signed-off-by: Wu Bo <wubo40@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[groeck: Backport to v5.4.y and older kernels]
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-05-20 08:20:07 +02:00
Martin Wilck 1f5a216251 scsi: qla2xxx: check UNLOADING before posting async work
commit 5a263892d7 upstream.

qlt_free_session_done() tries to post async PRLO / LOGO, and waits for the
completion of these async commands. If UNLOADING is set, this is doomed to
timeout, because the async logout command will never complete.

The only way to avoid waiting pointlessly is to fail posting these commands
in the first place if the driver is in UNLOADING state.  In general,
posting any command should be avoided when the driver is UNLOADING.

With this patch, "rmmod qla2xxx" completes without noticeable delay.

Link: https://lore.kernel.org/r/20200421204621.19228-3-mwilck@suse.com
Fixes: 45235022da ("scsi: qla2xxx: Fix driver unload by shutting down chip")
Acked-by: Arun Easi <aeasi@marvell.com>
Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com>
Signed-off-by: Martin Wilck <mwilck@suse.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-05-06 08:15:11 +02:00
Martin Wilck faa8daca02 scsi: qla2xxx: set UNLOADING before waiting for session deletion
commit 856e152a3c upstream.

The purpose of the UNLOADING flag is to avoid port login procedures to
continue when a controller is in the process of shutting down.  It makes
sense to set this flag before starting session teardown.

Furthermore, use atomic test_and_set_bit() to avoid the shutdown being run
multiple times in parallel. In qla2x00_disable_board_on_pci_error(), the
test for UNLOADING is postponed until after the check for an already
disabled PCI board.

Link: https://lore.kernel.org/r/20200421204621.19228-2-mwilck@suse.com
Fixes: 45235022da ("scsi: qla2xxx: Fix driver unload by shutting down chip")
Reviewed-by: Arun Easi <aeasi@marvell.com>
Reviewed-by: Daniel Wagner <dwagner@suse.de>
Reviewed-by: Roman Bolshakov <r.bolshakov@yadro.com>
Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com>
Signed-off-by: Martin Wilck <mwilck@suse.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-05-06 08:15:11 +02:00
Kevin Barnett 3340d011cf scsi: smartpqi: fix problem with unique ID for physical device
[ Upstream commit 5b083b305b ]

Obtain the unique IDs from the RLL and RPL instead of VPD page 83h.

Link: https://lore.kernel.org/r/157048751833.11757.11996314786914610803.stgit@brunhilda
Reviewed-by: Scott Benesh <scott.benesh@microsemi.com>
Reviewed-by: Scott Teel <scott.teel@microsemi.com>
Signed-off-by: Kevin Barnett <kevin.barnett@microsemi.com>
Signed-off-by: Don Brace <don.brace@microsemi.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2020-04-29 16:33:06 +02:00
Murthy Bhat d867f27571 scsi: smartpqi: fix call trace in device discovery
[ Upstream commit b969261134 ]

Use sas_phy_delete rather than sas_phy_free which, according to
comments, should not be called for PHYs that have been set up
successfully.

Link: https://lore.kernel.org/r/157048748876.11757.17773443136670011786.stgit@brunhilda
Reviewed-by: Scott Benesh <scott.benesh@microsemi.com>
Reviewed-by: Scott Teel <scott.teel@microsemi.com>
Reviewed-by: Kevin Barnett <kevin.barnett@microsemi.com>
Signed-off-by: Murthy Bhat <Murthy.Bhat@microsemi.com>
Signed-off-by: Don Brace <don.brace@microsemi.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2020-04-29 16:33:05 +02:00
Kevin Barnett 8a20fb1c9a scsi: smartpqi: fix controller lockup observed during force reboot
[ Upstream commit 0530736e40 ]

Link: https://lore.kernel.org/r/157048748297.11757.3872221216800537383.stgit@brunhilda
Reviewed-by: Scott Benesh <scott.benesh@microsemi.com>
Reviewed-by: Scott Teel <scott.teel@microsemi.com>
Signed-off-by: Kevin Barnett <kevin.barnett@microsemi.com>
Signed-off-by: Don Brace <don.brace@microsemi.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2020-04-29 16:33:05 +02:00
Wu Bo 713ad9b9d3 scsi: iscsi: Report unbind session event when the target has been removed
[ Upstream commit 13e60d3ba2 ]

If the daemon is restarted or crashes while logging out of a session, the
unbind session event sent by the kernel is not processed and is lost.  When
the daemon starts again, the session can't be unbound because the daemon is
waiting for the event message. However, the kernel has already logged out
and the event will not be resent.

When iscsid restart is complete, logout session reports error:

Logging out of session [sid: 6, target: iqn.xxxxx, portal: xx.xx.xx.xx,3260]
iscsiadm: Could not logout of [sid: 6, target: iscsiadm -m node iqn.xxxxx, portal: xx.xx.xx.xx,3260].
iscsiadm: initiator reported error (9 - internal error)
iscsiadm: Could not logout of all requested sessions

Make sure the unbind event is emitted.

[mkp: commit desc and applied by hand since patch was mangled]

Link: https://lore.kernel.org/r/4eab1771-2cb3-8e79-b31c-923652340e99@huawei.com
Reviewed-by: Lee Duncan <lduncan@suse.com>
Signed-off-by: Wu Bo <wubo40@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2020-04-29 16:32:59 +02:00
Javed Hasan fb669262fd scsi: libfc: If PRLI rejected, move rport to PLOGI state
[ Upstream commit 45e544bfda ]

If PRLI reject code indicates "rejected status", move rport state machine
back to PLOGI state.

Link: https://lore.kernel.org/r/20200327060208.17104-2-skashyap@marvell.com
Signed-off-by: Javed Hasan <jhasan@marvell.com>
Signed-off-by: Saurav Kashyap <skashyap@marvell.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2020-04-29 16:32:58 +02:00
James Smart 8427b05a7a scsi: lpfc: Fix crash in target side cable pulls hitting WAIT_FOR_UNREG
[ Upstream commit 807e7353d8 ]

Kernel is crashing with the following stacktrace:

  BUG: unable to handle kernel NULL pointer dereference at
    00000000000005bc
  IP: lpfc_nvme_register_port+0x1a8/0x3a0 [lpfc]
  ...
  Call Trace:
  lpfc_nlp_state_cleanup+0x2b2/0x500 [lpfc]
  lpfc_nlp_set_state+0xd7/0x1a0 [lpfc]
  lpfc_cmpl_prli_prli_issue+0x1f7/0x450 [lpfc]
  lpfc_disc_state_machine+0x7a/0x1e0 [lpfc]
  lpfc_cmpl_els_prli+0x16f/0x1e0 [lpfc]
  lpfc_sli_sp_handle_rspiocb+0x5b2/0x690 [lpfc]
  lpfc_sli_handle_slow_ring_event_s4+0x182/0x230 [lpfc]
  lpfc_do_work+0x87f/0x1570 [lpfc]
  kthread+0x10d/0x130
  ret_from_fork+0x35/0x40

During target side fault injections, it is possible to hit the
NLP_WAIT_FOR_UNREG case in lpfc_nvme_remoteport_delete. A prior commit
fixed a rebind and delete race condition, but called lpfc_nlp_put
unconditionally. This triggered a deletion and the crash.

Fix by movng nlp_put to inside the NLP_WAIT_FOR_UNREG case, where the nlp
will be being unregistered/removed. Leave the reference if the flag isn't
set.

Link: https://lore.kernel.org/r/20200322181304.37655-8-jsmart2021@gmail.com
Fixes: b15bd3e621 ("scsi: lpfc: Fix nvme remoteport registration race conditions")
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2020-04-29 16:32:58 +02:00
James Smart 0c5733a962 scsi: lpfc: Fix crash after handling a pci error
[ Upstream commit 4cd7089130 ]

Injecting EEH on a 32GB card is causing kernel oops

The pci error handler is doing an IO flush and the offline code is also
doing an IO flush. When the 1st flush is complete the hdwq is destroyed
(freed), yet the second flush accesses the hdwq and crashes.

Added a check in lpfc_sli4_fush_io_rings to check both the HBA_IOQ_FLUSH
flag and the hdwq pointer to see if it is already set and not already
freed.

Link: https://lore.kernel.org/r/20200322181304.37655-6-jsmart2021@gmail.com
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2020-04-29 16:32:57 +02:00
James Smart 9d1062c4dd scsi: lpfc: Fix kasan slab-out-of-bounds error in lpfc_unreg_login
[ Upstream commit 38503943c8 ]

The following kasan bug was called out:

 BUG: KASAN: slab-out-of-bounds in lpfc_unreg_login+0x7c/0xc0 [lpfc]
 Read of size 2 at addr ffff889fc7c50a22 by task lpfc_worker_3/6676
 ...
 Call Trace:
 dump_stack+0x96/0xe0
 ? lpfc_unreg_login+0x7c/0xc0 [lpfc]
 print_address_description.constprop.6+0x1b/0x220
 ? lpfc_unreg_login+0x7c/0xc0 [lpfc]
 ? lpfc_unreg_login+0x7c/0xc0 [lpfc]
 __kasan_report.cold.9+0x37/0x7c
 ? lpfc_unreg_login+0x7c/0xc0 [lpfc]
 kasan_report+0xe/0x20
 lpfc_unreg_login+0x7c/0xc0 [lpfc]
 lpfc_sli_def_mbox_cmpl+0x334/0x430 [lpfc]
 ...

When processing the completion of a "Reg Rpi" login mailbox command in
lpfc_sli_def_mbox_cmpl, a call may be made to lpfc_unreg_login. The vpi is
extracted from the completing mailbox context and passed as an input for
the next. However, the vpi stored in the mailbox command context is an
absolute vpi, which for SLI4 represents both base + offset.  When used with
a non-zero base component, (function id > 0) this results in an
out-of-range access beyond the allocated phba->vpi_ids array.

Fix by subtracting the function's base value to get an accurate vpi number.

Link: https://lore.kernel.org/r/20200322181304.37655-2-jsmart2021@gmail.com
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2020-04-29 16:32:57 +02:00
Li Bin d8b667b45d scsi: sg: add sg_remove_request in sg_common_write
commit 849f8583e9 upstream.

If the dxfer_len is greater than 256M then the request is invalid and we
need to call sg_remove_request in sg_common_write.

Link: https://lore.kernel.org/r/1586777361-17339-1-git-send-email-huawei.libin@huawei.com
Fixes: f930c70436 ("scsi: sg: only check for dxfer_len greater than 256M")
Acked-by: Douglas Gilbert <dgilbert@interlog.com>
Signed-off-by: Li Bin <huawei.libin@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-04-23 10:36:16 +02:00
Can Guo 777c8c9f11 scsi: ufs: Fix ufshcd_hold() caused scheduling while atomic
commit c63d6099a7 upstream.

The async version of ufshcd_hold(async == true), which is only called in
queuecommand path as for now, is expected to work in atomic context, thus
it should not sleep or schedule out. When it runs into the condition that
clocks are ON but link is still in hibern8 state, it should bail out
without flushing the clock ungate work.

Fixes: f2a785ac23 ("scsi: ufshcd: Fix race between clk scaling and ungate work")
Link: https://lore.kernel.org/r/1581392451-28743-6-git-send-email-cang@codeaurora.org
Reviewed-by: Hongwu Su <hongwus@codeaurora.org>
Reviewed-by: Asutosh Das <asutoshd@codeaurora.org>
Reviewed-by: Bean Huo <beanhuo@micron.com>
Reviewed-by: Stanley Chu <stanley.chu@mediatek.com>
Signed-off-by: Can Guo <cang@codeaurora.org>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-04-21 09:04:51 +02:00
James Smart 484cc15ad0 scsi: lpfc: fix inlining of lpfc_sli4_cleanup_poll_list()
[ Upstream commit d480e57809 ]

Compilation can fail due to having an inline function reference where the
function body is not present.

Fix by removing the inline tag.

Fixes: 93a4d6f401 ("scsi: lpfc: Add registration for CPU Offline/Online events")

Link: https://lore.kernel.org/r/20191111230401.12958-4-jsmart2021@gmail.com
Reviewed-by: Ewan D. Milne <emilne@redhat.com>
Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2020-04-17 10:50:26 +02:00
James Smart ba74ab0c29 scsi: lpfc: Fix broken Credit Recovery after driver load
[ Upstream commit 835214f5d5 ]

When driver is set to enable bb credit recovery, the switch displayed the
setting as inactive.  If the link bounces, it switches to Active.

During link up processing, the driver currently does a MBX_READ_SPARAM
followed by a MBX_CONFIG_LINK. These mbox commands are queued to be
executed, one at a time and the completion is processed by the worker
thread.  Since the MBX_READ_SPARAM is done BEFORE the MBX_CONFIG_LINK, the
BB_SC_N bit is never set the the returned values. BB Credit recovery status
only gets set after the driver requests the feature in CONFIG_LINK, which
is done after the link up. Thus the ordering of READ_SPARAM needs to follow
the CONFIG_LINK.

Fix by reordering so that READ_SPARAM is done after CONFIG_LINK.  Added a
HBA_DEFER_FLOGI flag so that any FLOGI handling waits until after the
READ_SPARAM is done so that the proper BB credit value is set in the FLOGI
payload.

Fixes: 6bfb162082 ("scsi: lpfc: Fix configuration of BB credit recovery in service parameters")
Cc: <stable@vger.kernel.org> # v5.4+
Link: https://lore.kernel.org/r/20200128002312.16346-4-jsmart2021@gmail.com
Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2020-04-17 10:50:24 +02:00
James Smart 33ebae4f3b scsi: lpfc: Fix configuration of BB credit recovery in service parameters
[ Upstream commit 6bfb162082 ]

The driver today is reading service parameters from the firmware and then
overwriting the firmware-provided values with values of its own.  There are
some switch features that require preliminary FLOGI's that are
switch-specific and done prior to the actual fabric FLOGI for traffic.  The
fw will perform those FLOGIs and will revise the service parameters for the
features configured. As the driver later overwrites those values with its
own values, it misconfigures things like BBSCN use by doing so.

Correct by eliminating the driver-overwrite of firmware values. The driver
correctly re-reads the service parameters after each link up to obtain the
latest values from firmware.

Link: https://lore.kernel.org/r/20191105005708.7399-3-jsmart2021@gmail.com
Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2020-04-17 10:50:24 +02:00
James Smart 037b0b5521 scsi: lpfc: Fix Fabric hostname registration if system hostname changes
[ Upstream commit e3ba04c9ba ]

There are reports of multiple ports on the same system displaying different
hostnames in fabric FDMI displays.

Currently, the driver registers the hostname at initialization and obtains
the hostname via init_utsname()->nodename queried at the time the FC link
comes up. Unfortunately, if the machine hostname is updated after
initialization, such as via DHCP or admin command, the value registered
initially will be incorrect.

Fix by having the driver save the hostname that was registered with FDMI.
The driver then runs a heartbeat action that will check the hostname.  If
the name changes, reregister the FMDI data.

The hostname is used in RSNN_NN, FDMI RPA and FDMI RHBA.

Link: https://lore.kernel.org/r/20191218235808.31922-5-jsmart2021@gmail.com
Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2020-04-17 10:50:24 +02:00
James Smart f48e759352 scsi: lpfc: Add registration for CPU Offline/Online events
[ Upstream commit 93a4d6f401 ]

The recent affinitization didn't address cpu offlining/onlining.  If an
interrupt vector is shared and the low order cpu owning the vector is
offlined, as interrupts are managed, the vector is taken offline. This
causes the other CPUs sharing the vector will hang as they can't get io
completions.

Correct by registering callbacks with the system for Offline/Online
events. When a cpu is taken offline, its eq, which is tied to an interrupt
vector is found. If the cpu is the "owner" of the vector and if the
eq/vector is shared by other CPUs, the eq is placed into a polled mode.
Additionally, code paths that perform io submission on the "sharing CPUs"
will check the eq state and poll for completion after submission of new io
to a wq that uses the eq.

Similarly, when a cpu comes back online and owns an offlined vector, the eq
is taken out of polled mode and rearmed to start driving interrupts for eq.

Link: https://lore.kernel.org/r/20191105005708.7399-9-jsmart2021@gmail.com
Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2020-04-17 10:50:24 +02:00
Sreekanth Reddy 3457b2232e scsi: mpt3sas: Fix kernel panic observed on soft HBA unplug
commit cc41f11a21 upstream.

Generic protection fault type kernel panic is observed when user performs
soft (ordered) HBA unplug operation while IOs are running on drives
connected to HBA.

When user performs ordered HBA removal operation, the kernel calls PCI
device's .remove() call back function where driver is flushing out all the
outstanding SCSI IO commands with DID_NO_CONNECT host byte and also unmaps
sg buffers allocated for these IO commands.

However, in the ordered HBA removal case (unlike of real HBA hot removal),
HBA device is still alive and hence HBA hardware is performing the DMA
operations to those buffers on the system memory which are already unmapped
while flushing out the outstanding SCSI IO commands and this leads to
kernel panic.

Don't flush out the outstanding IOs from .remove() path in case of ordered
removal since HBA will be still alive in this case and it can complete the
outstanding IOs. Flush out the outstanding IOs only in case of 'physical
HBA hot unplug' where there won't be any communication with the HBA.

During shutdown also it is possible that HBA hardware can perform DMA
operations on those outstanding IO buffers which are completed with
DID_NO_CONNECT by the driver from .shutdown(). So same above fix is applied
in shutdown path as well.

It is safe to drop the outstanding commands when HBA is inaccessible such
as when permanent PCI failure happens, when HBA is in non-operational
state, or when someone does a real HBA hot unplug operation. Since driver
knows that HBA is inaccessible during these cases, it is safe to drop the
outstanding commands instead of waiting for SCSI error recovery to kick in
and clear these outstanding commands.

Link: https://lore.kernel.org/r/1585302763-23007-1-git-send-email-sreekanth.reddy@broadcom.com
Fixes: c666d3be99 ("scsi: mpt3sas: wait for and flush running commands on shutdown/unload")
Cc: stable@vger.kernel.org #v4.14.174+
Signed-off-by: Sreekanth Reddy <sreekanth.reddy@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-04-17 10:50:23 +02:00
James Smart e5b9c1027e scsi: lpfc: Fix lpfc_io_buf resource leak in lpfc_get_scsi_buf_s4 error path
commit 0ab384a49c upstream.

If a call to lpfc_get_cmd_rsp_buf_per_hdwq returns NULL (memory allocation
failure), a previously allocated lpfc_io_buf resource is leaked.

Fix by releasing the lpfc_io_buf resource in the failure path.

Fixes: d79c9e9d4b ("scsi: lpfc: Support dynamic unbounded SGL lists on G7 hardware.")
Cc: <stable@vger.kernel.org> # v5.4+
Link: https://lore.kernel.org/r/20200128002312.16346-3-jsmart2021@gmail.com
Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-04-17 10:50:19 +02:00
Stanley Chu 73a122c263 scsi: ufs: fix Auto-Hibern8 error detection
commit 5a244e0ea6 upstream.

Auto-Hibern8 may be disabled by some vendors or sysfs in runtime even if
Auto-Hibern8 capability is supported by host. If Auto-Hibern8 capability is
supported by host but not actually enabled, Auto-Hibern8 error shall not
happen.

To fix this, provide a way to detect if Auto-Hibern8 is actually enabled
first, and bypass Auto-Hibern8 disabling case in
ufshcd_is_auto_hibern8_error().

Fixes: 8217444039 ("scsi: ufs: Add error-handling of Auto-Hibernate")
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/r/20200129105251.12466-4-stanley.chu@mediatek.com
Reviewed-by: Bean Huo <beanhuo@micron.com>
Reviewed-by: Alim Akhtar <alim.akhtar@samsung.com>
Reviewed-by: Asutosh Das <asutoshd@codeaurora.org>
Reviewed-by: Can Guo <cang@codeaurora.org>
Signed-off-by: Stanley Chu <stanley.chu@mediatek.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-04-17 10:50:19 +02:00
James Smart 91bed1f1fb nvme-fc: Revert "add module to ops template to allow module references"
commit 8c5c660529 upstream.

The original patch was to resolve the lldd being able to be unloaded
while being used to talk to the boot device of the system. However, the
end result of the original patch is that any driver unload while a nvme
controller is live via the lldd is now being prohibited. Given the module
reference, the module teardown routine can't be called, thus there's no
way, other than manual actions to terminate the controllers.

Fixes: 863fbae929 ("nvme_fc: add module to ops template to allow module references")
Cc: <stable@vger.kernel.org> # v5.4+
Signed-off-by: James Smart <jsmart2021@gmail.com>
Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-04-17 10:50:10 +02:00
Martin K. Petersen b34e20c78f scsi: sd: Fix optimal I/O size for devices that change reported values
commit ea697a8bf5 upstream.

Some USB bridge devices will return a default set of characteristics during
initialization. And then, once an attached drive has spun up, substitute
the actual parameters reported by the drive. According to the SCSI spec,
the device should return a UNIT ATTENTION in case any reported parameters
change. But in this case the change is made silently after a small window
where default values are reported.

Commit a83da8a450 ("scsi: sd: Optimal I/O size should be a multiple of
physical block size") validated the reported optimal I/O size against the
physical block size to overcome problems with devices reporting nonsensical
transfer sizes. However, this validation did not account for the fact that
aforementioned devices will return default values during a brief window
during spin-up. The subsequent change in reported characteristics would
invalidate the checking that had previously been performed.

Unset a previously configured optimal I/O size should the sanity checking
fail on subsequent revalidate attempts.

Link: https://lore.kernel.org/r/33fb522e-4f61-1b76-914f-c9e6a3553c9b@gmail.com
Cc: Bryan Gurney <bgurney@redhat.com>
Cc: <stable@vger.kernel.org>
Reported-by: Bernhard Sulzer <micraft.b@gmail.com>
Tested-by: Bernhard Sulzer <micraft.b@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-04-01 11:02:01 +02:00
Wen Xiong 5ecb28b156 scsi: ipr: Fix softlockup when rescanning devices in petitboot
[ Upstream commit 394b61711f ]

When trying to rescan disks in petitboot shell, we hit the following
softlockup stacktrace:

Kernel panic - not syncing: System is deadlocked on memory
[  241.223394] CPU: 32 PID: 693 Comm: sh Not tainted 5.4.16-openpower1 #1
[  241.223406] Call Trace:
[  241.223415] [c0000003f07c3180] [c000000000493fc4] dump_stack+0xa4/0xd8 (unreliable)
[  241.223432] [c0000003f07c31c0] [c00000000007d4ac] panic+0x148/0x3cc
[  241.223446] [c0000003f07c3260] [c000000000114b10] out_of_memory+0x468/0x4c4
[  241.223461] [c0000003f07c3300] [c0000000001472b0] __alloc_pages_slowpath+0x594/0x6d8
[  241.223476] [c0000003f07c3420] [c00000000014757c] __alloc_pages_nodemask+0x188/0x1a4
[  241.223492] [c0000003f07c34a0] [c000000000153e10] alloc_pages_current+0xcc/0xd8
[  241.223508] [c0000003f07c34e0] [c0000000001577ac] alloc_slab_page+0x30/0x98
[  241.223524] [c0000003f07c3520] [c0000000001597fc] new_slab+0x138/0x40c
[  241.223538] [c0000003f07c35f0] [c00000000015b204] ___slab_alloc+0x1e4/0x404
[  241.223552] [c0000003f07c36c0] [c00000000015b450] __slab_alloc+0x2c/0x48
[  241.223566] [c0000003f07c36f0] [c00000000015b754] kmem_cache_alloc_node+0x9c/0x1b4
[  241.223582] [c0000003f07c3760] [c000000000218c48] blk_alloc_queue_node+0x34/0x270
[  241.223599] [c0000003f07c37b0] [c000000000226574] blk_mq_init_queue+0x2c/0x78
[  241.223615] [c0000003f07c37e0] [c0000000002ff710] scsi_mq_alloc_queue+0x28/0x70
[  241.223631] [c0000003f07c3810] [c0000000003005b8] scsi_alloc_sdev+0x184/0x264
[  241.223647] [c0000003f07c38a0] [c000000000300ba0] scsi_probe_and_add_lun+0x288/0xa3c
[  241.223663] [c0000003f07c3a00] [c000000000301768] __scsi_scan_target+0xcc/0x478
[  241.223679] [c0000003f07c3b20] [c000000000301c64] scsi_scan_channel.part.9+0x74/0x7c
[  241.223696] [c0000003f07c3b70] [c000000000301df4] scsi_scan_host_selected+0xe0/0x158
[  241.223712] [c0000003f07c3bd0] [c000000000303f04] store_scan+0x104/0x114
[  241.223727] [c0000003f07c3cb0] [c0000000002d5ac4] dev_attr_store+0x30/0x4c
[  241.223741] [c0000003f07c3cd0] [c0000000001dbc34] sysfs_kf_write+0x64/0x78
[  241.223756] [c0000003f07c3cf0] [c0000000001da858] kernfs_fop_write+0x170/0x1b8
[  241.223773] [c0000003f07c3d40] [c0000000001621fc] __vfs_write+0x34/0x60
[  241.223787] [c0000003f07c3d60] [c000000000163c2c] vfs_write+0xa8/0xcc
[  241.223802] [c0000003f07c3db0] [c000000000163df4] ksys_write+0x70/0xbc
[  241.223816] [c0000003f07c3e20] [c00000000000b40c] system_call+0x5c/0x68

As a part of the scan process Linux will allocate and configure a
scsi_device for each target to be scanned. If the device is not present,
then the scsi_device is torn down. As a part of scsi_device teardown a
workqueue item will be scheduled and the lockups we see are because there
are 250k workqueue items to be processed.  Accoding to the specification of
SIS-64 sas controller, max_channel should be decreased on SIS-64 adapters
to 4.

The patch fixes softlockup issue.

Thanks for Oliver Halloran's help with debugging and explanation!

Link: https://lore.kernel.org/r/1583510248-23672-1-git-send-email-wenxiong@linux.vnet.ibm.com
Signed-off-by: Wen Xiong <wenxiong@linux.vnet.ibm.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2020-04-01 11:01:54 +02:00
Igor Druzhinin 6c663120ec scsi: libfc: free response frame from GPN_ID
[ Upstream commit ff6993bb79 ]

fc_disc_gpn_id_resp() should be the last function using it so free it here
to avoid memory leak.

Link: https://lore.kernel.org/r/1579013000-14570-2-git-send-email-igor.druzhinin@citrix.com
Reviewed-by: Hannes Reinecke <hare@suse.de>
Signed-off-by: Igor Druzhinin <igor.druzhinin@citrix.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2020-03-21 08:11:50 +01:00
Tomas Henzl 26bfd300f7 scsi: megaraid_sas: silence a warning
[ Upstream commit 0e99b2c625 ]

Add a flag to DMA memory allocation to silence a warning.

This driver allocates DMA memory for IO frames. This allocation may exceed
MAX_ORDER pages for few megaraid_sas controllers (controllers with very
high queue depth). Consequently, the driver has logic to keep reducing the
controller queue depth until the DMA memory allocation succeeds.

On impacted megaraid_sas controllers there would be multiple DMA allocation
failures until driver settled on an allocation that fit. These failed DMA
allocation requests caused stack traces in system logs. These were not
harmful and this patch silences those warnings/stack traces.

[mkp: clarified commit desc]

Link: https://lore.kernel.org/r/20200204152413.7107-1-thenzl@redhat.com
Signed-off-by: Tomas Henzl <thenzl@redhat.com>
Acked-by: Sumit Saxena <sumit.saxena@broadcom.com>
Reviewed-by: Lee Duncan <lduncan@suse.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2020-03-12 13:00:12 +01:00
Nick Black 817faa4ed4 scsi: iscsi: Don't destroy session if there are outstanding connections
[ Upstream commit 54155ed419 ]

A faulty userspace that calls destroy_session() before destroying the
connections can trigger the failure.  This patch prevents the issue by
refusing to destroy the session if there are outstanding connections.

------------[ cut here ]------------
kernel BUG at mm/slub.c:306!
invalid opcode: 0000 [#1] SMP PTI
CPU: 1 PID: 1224 Comm: iscsid Not tainted 5.4.0-rc2.iscsi+ #7
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.12.0-1 04/01/2014
RIP: 0010:__slab_free+0x181/0x350
[...]
[ 1209.686056] RSP: 0018:ffffa93d4074fae0 EFLAGS: 00010246
[ 1209.686694] RAX: ffff934efa5ad800 RBX: 000000008010000a RCX: ffff934efa5ad800
[ 1209.687651] RDX: ffff934efa5ad800 RSI: ffffeb4041e96b00 RDI: ffff934efd402c40
[ 1209.688582] RBP: ffffa93d4074fb80 R08: 0000000000000001 R09: ffffffffbb5dfa26
[ 1209.689425] R10: ffff934efa5ad800 R11: 0000000000000001 R12: ffffeb4041e96b00
[ 1209.690285] R13: ffff934efa5ad800 R14: ffff934efd402c40 R15: 0000000000000000
[ 1209.691213] FS:  00007f7945dfb540(0000) GS:ffff934efda80000(0000) knlGS:0000000000000000
[ 1209.692316] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 1209.693013] CR2: 000055877fd3da80 CR3: 0000000077384000 CR4: 00000000000006e0
[ 1209.693897] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 1209.694773] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[ 1209.695631] Call Trace:
[ 1209.695957]  ? __wake_up_common_lock+0x8a/0xc0
[ 1209.696712]  iscsi_pool_free+0x26/0x40
[ 1209.697263]  iscsi_session_teardown+0x2f/0xf0
[ 1209.698117]  iscsi_sw_tcp_session_destroy+0x45/0x60
[ 1209.698831]  iscsi_if_rx+0xd88/0x14e0
[ 1209.699370]  netlink_unicast+0x16f/0x200
[ 1209.699932]  netlink_sendmsg+0x21a/0x3e0
[ 1209.700446]  sock_sendmsg+0x4f/0x60
[ 1209.700902]  ___sys_sendmsg+0x2ae/0x320
[ 1209.701451]  ? cp_new_stat+0x150/0x180
[ 1209.701922]  __sys_sendmsg+0x59/0xa0
[ 1209.702357]  do_syscall_64+0x52/0x160
[ 1209.702812]  entry_SYSCALL_64_after_hwframe+0x44/0xa9
[ 1209.703419] RIP: 0033:0x7f7946433914
[...]
[ 1209.706084] RSP: 002b:00007fffb99f2378 EFLAGS: 00000246 ORIG_RAX: 000000000000002e
[ 1209.706994] RAX: ffffffffffffffda RBX: 000055bc869eac20 RCX: 00007f7946433914
[ 1209.708082] RDX: 0000000000000000 RSI: 00007fffb99f2390 RDI: 0000000000000005
[ 1209.709120] RBP: 00007fffb99f2390 R08: 000055bc84fe9320 R09: 00007fffb99f1f07
[ 1209.710110] R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000038
[ 1209.711085] R13: 000055bc8502306e R14: 0000000000000000 R15: 0000000000000000
 Modules linked in:
 ---[ end trace a2d933ede7f730d8 ]---

Link: https://lore.kernel.org/r/20191226203148.2172200-1-krisman@collabora.com
Signed-off-by: Nick Black <nlb@google.com>
Co-developed-by: Salman Qazi <sqazi@google.com>
Signed-off-by: Salman Qazi <sqazi@google.com>
Co-developed-by: Junho Ryu <jayr@google.com>
Signed-off-by: Junho Ryu <jayr@google.com>
Co-developed-by: Khazhismel Kumykov <khazhy@google.com>
Signed-off-by: Khazhismel Kumykov <khazhy@google.com>
Co-developed-by: Gabriel Krisman Bertazi <krisman@collabora.com>
Signed-off-by: Gabriel Krisman Bertazi <krisman@collabora.com>
Reviewed-by: Lee Duncan <lduncan@suse.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2020-02-24 08:36:50 +01:00
Stanley Chu 12b685be50 scsi: ufs-mediatek: add apply_dev_quirks variant operation
[ Upstream commit ea92c32bd3 ]

Add vendor-specific variant callback "apply_dev_quirks" to MediaTek UFS
driver.

Cc: Alim Akhtar <alim.akhtar@samsung.com>
Cc: Asutosh Das <asutoshd@codeaurora.org>
Cc: Avri Altman <avri.altman@wdc.com>
Cc: Bart Van Assche <bvanassche@acm.org>
Cc: Bean Huo <beanhuo@micron.com>
Cc: Can Guo <cang@codeaurora.org>
Cc: Matthias Brugger <matthias.bgg@gmail.com>
Link: https://lore.kernel.org/r/1578726707-6596-3-git-send-email-stanley.chu@mediatek.com
Reviewed-by: Avri Altman <avri.altman@wdc.com>
Reviewed-by: Bean Huo <beanhuo@micron.com>
Signed-off-by: Stanley Chu <stanley.chu@mediatek.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2020-02-24 08:36:50 +01:00
Stanley Chu 4fa2dd4eeb scsi: ufs: pass device information to apply_dev_quirks
[ Upstream commit c40ad6b7fc ]

Pass UFS device information to vendor-specific variant callback
"apply_dev_quirks" because some platform vendors need to know such
information to apply special handling or quirks in specific devices.

At the same time, modify existing vendor implementations according to the
new interface for those vendor drivers which will be built-in or built as a
module alone with UFS core driver.

[mkp: clarified commit desc]

Cc: Alim Akhtar <alim.akhtar@samsung.com>
Cc: Asutosh Das <asutoshd@codeaurora.org>
Cc: Avri Altman <avri.altman@wdc.com>
Cc: Bart Van Assche <bvanassche@acm.org>
Cc: Bean Huo <beanhuo@micron.com>
Cc: Can Guo <cang@codeaurora.org>
Cc: Matthias Brugger <matthias.bgg@gmail.com>
Link: https://lore.kernel.org/r/1578726707-6596-2-git-send-email-stanley.chu@mediatek.com
Reviewed-by: Avri Altman <avri.altman@wdc.com>
Reviewed-by: Bean Huo <beanhuo@micron.com>
Signed-off-by: Stanley Chu <stanley.chu@mediatek.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2020-02-24 08:36:50 +01:00
James Smart ce591c9219 scsi: lpfc: Fix: Rework setting of fdmi symbolic node name registration
[ Upstream commit df9166bfa7 ]

This patch reworks the fdmi symbolic node name data for the following two
issues:

 - Correcting extraneous periods following the DV and HN fdmi data fields.

 - Avoiding buffer overflow issues when formatting the data.

The fix to the fist issue is to just remove the characters.

The fix to the second issue has all data being staged in temporary storage
before being moved to the real buffer.

Link: https://lore.kernel.org/r/20191218235808.31922-3-jsmart2021@gmail.com
Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2020-02-24 08:36:39 +01:00
Nathan Chancellor df0f4455a1 scsi: aic7xxx: Adjust indentation in ahc_find_syncrate
[ Upstream commit 4dbc96ad65 ]

Clang warns:

../drivers/scsi/aic7xxx/aic7xxx_core.c:2317:5: warning: misleading
indentation; statement is not part of the previous 'if'
[-Wmisleading-indentation]
                        if ((syncrate->sxfr_u2 & ST_SXFR) != 0)
                        ^
../drivers/scsi/aic7xxx/aic7xxx_core.c:2310:4: note: previous statement
is here
                        if (syncrate == &ahc_syncrates[maxsync])
                        ^
1 warning generated.

This warning occurs because there is a space amongst the tabs on this
line. Remove it so that the indentation is consistent with the Linux kernel
coding style and clang no longer warns.

This has been a problem since the beginning of git history hence no fixes
tag.

Link: https://github.com/ClangBuiltLinux/linux/issues/817
Link: https://lore.kernel.org/r/20191218014220.52746-1-natechancellor@gmail.com
Signed-off-by: Nathan Chancellor <natechancellor@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2020-02-24 08:36:38 +01:00
Can Guo f6ebbf46c3 scsi: ufs: Complete pending requests in host reset and restore path
[ Upstream commit 2df74b6985 ]

In UFS host reset and restore path, before probe, we stop and start the
host controller once. After host controller is stopped, the pending
requests, if any, are cleared from the doorbell, but no completion IRQ
would be raised due to the hba is stopped.  These pending requests shall be
completed along with the first NOP_OUT command (as it is the first command
which can raise a transfer completion IRQ) sent during probe.  Since the
OCSs of these pending requests are not SUCCESS (because they are not yet
literally finished), their UPIUs shall be dumped. When there are multiple
pending requests, the UPIU dump can be overwhelming and may lead to
stability issues because it is in atomic context.  Therefore, before probe,
complete these pending requests right after host controller is stopped and
silence the UPIU dump from them.

Link: https://lore.kernel.org/r/1574751214-8321-5-git-send-email-cang@qti.qualcomm.com
Reviewed-by: Alim Akhtar <alim.akhtar@samsung.com>
Reviewed-by: Bean Huo <beanhuo@micron.com>
Tested-by: Bean Huo <beanhuo@micron.com>
Signed-off-by: Can Guo <cang@codeaurora.org>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2020-02-24 08:36:38 +01:00
Bean Huo da268240fb scsi: ufs: Fix ufshcd_probe_hba() reture value in case ufshcd_scsi_add_wlus() fails
commit b9fc532021 upstream.

A non-zero error value likely being returned by ufshcd_scsi_add_wlus() in
case of failure of adding the WLs, but ufshcd_probe_hba() doesn't use this
value, and doesn't report this failure to upper caller.  This patch is to
fix this issue.

Fixes: 2a8fa60044 ("ufs: manually add well known logical units")
Link: https://lore.kernel.org/r/20200120130820.1737-2-huobean@gmail.com
Reviewed-by: Asutosh Das <asutoshd@codeaurora.org>
Reviewed-by: Alim Akhtar <alim.akhtar@samsung.com>
Reviewed-by: Stanley Chu <stanley.chu@mediatek.com>
Signed-off-by: Bean Huo <beanhuo@micron.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-02-14 16:34:08 -05:00
Asutosh Das 3c9edf5581 scsi: ufs: Recheck bkops level if bkops is disabled
commit 24366c2afb upstream.

bkops level should be rechecked upon receiving an exception.  Currently the
level is being cached and never updated.

Update bkops each time the level is checked.  Also do not use the cached
bkops level value if it is disabled and then enabled.

Fixes: afdfff59a0 (scsi: ufs: handle non spec compliant bkops behaviour by device)
Link: https://lore.kernel.org/r/1574751214-8321-2-git-send-email-cang@qti.qualcomm.com
Reviewed-by: Bean Huo <beanhuo@micron.com>
Reviewed-by: Alim Akhtar <alim.akhtar@samsung.com>
Tested-by: Alim Akhtar <alim.akhtar@samsung.com>
Signed-off-by: Asutosh Das <asutoshd@codeaurora.org>
Signed-off-by: Can Guo <cang@codeaurora.org>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-02-11 04:35:44 -08:00
Nathan Chancellor 5f8c02d322 scsi: qla4xxx: Adjust indentation in qla4xxx_mem_free
commit aa8679736a upstream.

Clang warns:

../drivers/scsi/qla4xxx/ql4_os.c:4148:3: warning: misleading
indentation; statement is not part of the previous 'if'
[-Wmisleading-indentation]
         if (ha->fw_dump)
         ^
../drivers/scsi/qla4xxx/ql4_os.c:4144:2: note: previous statement is
here
        if (ha->queues)
        ^
1 warning generated.

This warning occurs because there is a space after the tab on this
line.  Remove it so that the indentation is consistent with the Linux
kernel coding style and clang no longer warns.

Fixes: 068237c87c ("[SCSI] qla4xxx: Capture minidump for ISP82XX on firmware failure")
Link: https://github.com/ClangBuiltLinux/linux/issues/819
Link: https://lore.kernel.org/r/20191218015252.20890-1-natechancellor@gmail.com
Acked-by: Manish Rangankar <mrangankar@marvell.com>
Reviewed-by: Nick Desaulniers <ndesaulniers@google.com>
Signed-off-by: Nathan Chancellor <natechancellor@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-02-11 04:35:44 -08:00
Nathan Chancellor bdc7651e61 scsi: csiostor: Adjust indentation in csio_device_reset
commit a808a04c86 upstream.

Clang warns:

../drivers/scsi/csiostor/csio_scsi.c:1386:3: warning: misleading
indentation; statement is not part of the previous 'if'
[-Wmisleading-indentation]
         csio_lnodes_exit(hw, 1);
         ^
../drivers/scsi/csiostor/csio_scsi.c:1382:2: note: previous statement is
here
        if (*buf != '1')
        ^
1 warning generated.

This warning occurs because there is a space after the tab on this
line.  Remove it so that the indentation is consistent with the Linux
kernel coding style and clang no longer warns.

Fixes: a3667aaed5 ("[SCSI] csiostor: Chelsio FCoE offload driver")
Link: https://github.com/ClangBuiltLinux/linux/issues/818
Link: https://lore.kernel.org/r/20191218014726.8455-1-natechancellor@gmail.com
Signed-off-by: Nathan Chancellor <natechancellor@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-02-11 04:35:44 -08:00
Bart Van Assche 454db8d616 scsi: qla2xxx: Fix the endianness of the qla82xx_get_fw_size() return type
commit 3f5f7335e5 upstream.

Since qla82xx_get_fw_size() returns a number in CPU-endian format, change
its return type from __le32 into u32. This patch does not change any
functionality.

Fixes: 9c2b297572 ("[SCSI] qla2xxx: Support for loading Unified ROM Image (URI) format firmware file.")
Cc: Himanshu Madhani <hmadhani@marvell.com>
Cc: Quinn Tran <qutran@marvell.com>
Cc: Martin Wilck <mwilck@suse.com>
Cc: Daniel Wagner <dwagner@suse.de>
Cc: Roman Bolshakov <r.bolshakov@yadro.com>
Link: https://lore.kernel.org/r/20191219004905.39586-1-bvanassche@acm.org
Reviewed-by: Daniel Wagner <dwagner@suse.de>
Reviewed-by: Roman Bolshakov <r.bolshakov@yadro.com>
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-02-11 04:35:44 -08:00
Arun Easi 8d313c04b4 scsi: qla2xxx: Fix unbound NVME response length
commit 00fe717ee1 upstream.

On certain cases when response length is less than 32, NVME response data
is supplied inline in IOCB. This is indicated by some combination of state
flags. There was an instance when a high, and incorrect, response length
was indicated causing driver to overrun buffers. Fix this by checking and
limiting the response payload length.

Fixes: 7401bc18d1 ("scsi: qla2xxx: Add FC-NVMe command handling")
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/r/20200124045014.23554-1-hmadhani@marvell.com
Signed-off-by: Arun Easi <aeasi@marvell.com>
Signed-off-by: Himanshu Madhani <hmadhani@marvell.com>
Reviewed-by: Ewan D. Milne <emilne@redhat.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-02-11 04:35:32 -08:00
Quinn Tran 0b84591fdd scsi: qla2xxx: Fix stuck login session using prli_pend_timer
[ Upstream commit 8aaac2d7da ]

Session is stuck if driver sees FW has received a PRLI. Driver allows FW to
finish with processing of PRLI by checking back with FW at a later time to
see if the PRLI has finished. Instead, driver failed to push forward after
re-checking PRLI completion.

Fixes: ce0ba496dc ("scsi: qla2xxx: Fix stuck login session")
Cc: stable@vger.kernel.org # 5.3
Link: https://lore.kernel.org/r/20191217220617.28084-9-hmadhani@marvell.com
Signed-off-by: Quinn Tran <qutran@marvell.com>
Signed-off-by: Himanshu Madhani <hmadhani@marvell.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2020-02-11 04:35:27 -08:00
Quinn Tran 2b27acfde9 scsi: qla2xxx: Fix mtcp dump collection failure
commit 641e0efddc upstream.

MTCP dump failed due to MB Reg 10 was picking garbage data from stack
memory.

Fixes: 81178772b6 ("[SCSI] qla2xxx: Implemetation of mctp.")
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/r/20191217220617.28084-14-hmadhani@marvell.com
Signed-off-by: Quinn Tran <qutran@marvell.com>
Signed-off-by: Himanshu Madhani <hmadhani@marvell.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-02-11 04:35:23 -08:00
Anand Lodnoor edd15b1546 scsi: megaraid_sas: Do not initiate OCR if controller is not in ready state
commit 6d7537270e upstream.

Driver initiates OCR if a DCMD command times out. But there is a deadlock
if the driver attempts to invoke another OCR before the mutex lock
(reset_mutex) is released from the previous session of OCR.

This patch takes care of the above scenario using new flag
MEGASAS_FUSION_OCR_NOT_POSSIBLE to indicate if OCR is possible.

Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/r/1579000882-20246-9-git-send-email-anand.lodnoor@broadcom.com
Signed-off-by: Shivasharan S <shivasharan.srikanteshwara@broadcom.com>
Signed-off-by: Anand Lodnoor <anand.lodnoor@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-02-11 04:35:23 -08:00
Hannes Reinecke edb6590b1b scsi: fnic: do not queue commands during fwreset
[ Upstream commit 0e2209629f ]

When a link is going down the driver will be calling fnic_cleanup_io(),
which will traverse all commands and calling 'done' for each found command.
While the traversal is handled under the host_lock, calling 'done' happens
after the host_lock is being dropped.

As fnic_queuecommand_lck() is being called with the host_lock held, it
might well be that it will pick the command being selected for abortion
from the above routine and enqueue it for sending, but then 'done' is being
called on that very command from the above routine.

Which of course confuses the hell out of the scsi midlayer.

So fix this by not queueing commands when fnic_cleanup_io is active.

Link: https://lore.kernel.org/r/20200116102053.62755-1-hare@suse.de
Signed-off-by: Hannes Reinecke <hare@suse.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2020-02-05 21:22:49 +00:00
Bean Huo 7c3327816d scsi: ufs: delete redundant function ufshcd_def_desc_sizes()
[ Upstream commit 059efd847a ]

There is no need to call ufshcd_def_desc_sizes() in ufshcd_init(), since
descriptor lengths will be checked and initialized later in
ufshcd_init_desc_sizes().

Fixes: a4b0e8a4e92b1b(scsi: ufs: Factor out ufshcd_read_desc_param)
Link: https://lore.kernel.org/r/BN7PR08MB5684A3ACE214C3D4792CE729DB610@BN7PR08MB5684.namprd08.prod.outlook.com
Signed-off-by: Bean Huo <beanhuo@micron.com>
Acked-by: Avri Altman <avri.altman.wdc.com>
Reviewed-by: Can Guo <cang@codeaurora.org>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2020-01-26 10:01:05 +01:00
James Smart b1b105a633 scsi: lpfc: use hdwq assigned cpu for allocation
[ Upstream commit 4583a4f66b ]

Looking at the recent conversion from smp_processor_id() to
raw_smp_processor_id(), realized that the allocation should be based on the
cpu the hdwq is bound to, not the executing cpu.

Revise to pull cpu number from the hdwq

Fixes: 765ab6cdac ("scsi: lpfc: Fix a kernel warning triggered by lpfc_get_sgl_per_hdwq()")
Link: https://lore.kernel.org/r/20191116003847.6141-1-jsmart2021@gmail.com
Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2020-01-23 08:23:01 +01:00
Bart Van Assche 0ec3e3ba47 scsi: lpfc: Fix a kernel warning triggered by lpfc_get_sgl_per_hdwq()
commit 765ab6cdac upstream.

Fix the following kernel bug report:

BUG: using smp_processor_id() in preemptible [00000000] code: systemd-udevd/954

Fixes: d79c9e9d4b ("scsi: lpfc: Support dynamic unbounded SGL lists on G7 hardware.")
Link: https://lore.kernel.org/r/20191107052158.25788-2-bvanassche@acm.org
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
Reviewed-by: James Smart <james.smart@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-01-23 08:22:59 +01:00
James Smart 51a2104cc4 scsi: lpfc: Fix hdwq sgl locks and irq handling
commit a4c21acca2 upstream.

Many of the sgl-per-hdwq paths are locking with spin_lock_irq() and
spin_unlock_irq() and may unwittingly raising irq when it shouldn't. Hard
deadlocks were seen around lpfc_scsi_prep_cmnd().

Fix by converting the locks to irqsave/irqrestore.

Fixes: d79c9e9d4b ("scsi: lpfc: Support dynamic unbounded SGL lists on G7 hardware.")
Link: https://lore.kernel.org/r/20190922035906.10977-16-jsmart2021@gmail.com
Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-01-23 08:22:59 +01:00
James Smart 287a465e12 scsi: lpfc: Fix list corruption detected in lpfc_put_sgl_per_hdwq
commit 35a635af54 upstream.

In lpfc_release_io_buf, an lpfc_io_buf is returned to the 'available' pool
before any associated sgl or cmd and rsp buffers are returned via their
respective 'put' routines.  If xri rebalancing occurs and an lpfc_io_buf
structure is reused quickly, there may be a race condition between release
of old and association of new resources.

Re-ordered lpfc_release_io_buf to release sgl and cmd/rsp
buffer lists before releasing the lpfc_io_buf structure for re-use.

Fixes: d79c9e9d4b ("scsi: lpfc: Support dynamic unbounded SGL lists on G7 hardware.")
Link: https://lore.kernel.org/r/20190922035906.10977-17-jsmart2021@gmail.com
Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-01-23 08:22:59 +01:00
Bart Van Assche 240333d7b1 scsi: core: scsi_trace: Use get_unaligned_be*()
commit b1335f5b04 upstream.

This patch fixes an unintended sign extension on left shifts. From Colin
King: "Shifting a u8 left will cause the value to be promoted to an
integer. If the top bit of the u8 is set then the following conversion to
an u64 will sign extend the value causing the upper 32 bits to be set in
the result."

Fix this by using get_unaligned_be*() instead.

Fixes: bf81623542 ("[SCSI] add scsi trace core functions and put trace points")
Cc: Christoph Hellwig <hch@lst.de>
Cc: Hannes Reinecke <hare@suse.com>
Cc: Douglas Gilbert <dgilbert@interlog.com>
Link: https://lore.kernel.org/r/20191101211447.187151-1-bvanassche@acm.org
Reported-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-01-23 08:22:59 +01:00
Martin Wilck 6acefee17a scsi: qla2xxx: fix rports not being mark as lost in sync fabric scan
commit d341e9a8f2 upstream.

In qla2x00_find_all_fabric_devs(), fcport->flags & FCF_LOGIN_NEEDED is a
necessary condition for logging into new rports, but not for dropping lost
ones.

Fixes: 726b854870 ("qla2xxx: Add framework for async fabric discovery")
Link: https://lore.kernel.org/r/20191122221912.20100-2-martin.wilck@suse.com
Tested-by: David Bond <dbond@suse.com>
Signed-off-by: Martin Wilck <mwilck@suse.com>
Acked-by: Himanshu Madhani <hmadhani@marvell.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-01-23 08:22:59 +01:00
Huacai Chen fdad431431 scsi: qla2xxx: Fix qla2x00_request_irqs() for MSI
commit 45dc8f2d9c upstream.

Commit 4fa1834559 ("scsi: qla2xxx: Utilize pci_alloc_irq_vectors/
pci_free_irq_vectors calls.") use pci_alloc_irq_vectors() to replace
pci_enable_msi() but it didn't handle the return value correctly. This bug
make qla2x00 always fail to setup MSI if MSI-X fail, so fix it.

BTW, improve the log message of return value in qla2x00_request_irqs() to
avoid confusion.

Fixes: 4fa1834559 ("scsi: qla2xxx: Utilize pci_alloc_irq_vectors/pci_free_irq_vectors calls.")
Cc: Michael Hernandez <michael.hernandez@cavium.com>
Link: https://lore.kernel.org/r/1574314847-14280-1-git-send-email-chenhc@lemote.com
Signed-off-by: Huacai Chen <chenhc@lemote.com>
Acked-by: Himanshu Madhani <hmadhani@marvell.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-01-23 08:22:59 +01:00
John Garry c98dc7e9a0 scsi: scsi_transport_sas: Fix memory leak when removing devices
commit 82ea3e0e12 upstream.

Removing a non-host rphy causes a memory leak:

root@(none)$ echo 0 > /sys/devices/platform/HISI0162:01/host0/port-0:0/expander-0:0/port-0:0:10/phy-0:0:10/sas_phy/phy-0:0:10/enable
[   79.857888] hisi_sas_v2_hw HISI0162:01: dev[7:1] is gone
root@(none)$ echo scan > /sys/kernel/debug/kmemleak
[  131.656603] kmemleak: 3 new suspected memory leaks (see /sys/kernel/debug/kmemleak)
root@(none)$ more /sys/kernel/debug/kmemleak
unreferenced object 0xffff041da5c66000 (size 256):
  comm "kworker/u128:1", pid 549, jiffies 4294898543 (age 113.728s)
  hex dump (first 32 bytes):
    00 5e c6 a5 1d 04 ff ff 01 00 00 00 00 00 00 00  .^..............
    00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
  backtrace:
    [<(____ptrval____)>] kmem_cache_alloc+0x188/0x260
    [<(____ptrval____)>] bsg_setup_queue+0x48/0x1a8
    [<(____ptrval____)>] sas_rphy_add+0x108/0x2d0
    [<(____ptrval____)>] sas_probe_devices+0x168/0x208
    [<(____ptrval____)>] sas_discover_domain+0x660/0x9c8
    [<(____ptrval____)>] process_one_work+0x3f8/0x690
    [<(____ptrval____)>] worker_thread+0x70/0x6a0
    [<(____ptrval____)>] kthread+0x1b8/0x1c0
    [<(____ptrval____)>] ret_from_fork+0x10/0x18
unreferenced object 0xffff041d8c075400 (size 128):
  comm "kworker/u128:1", pid 549, jiffies 4294898543 (age 113.728s)
  hex dump (first 32 bytes):
    00 40 25 97 1d 00 ff ff 00 00 00 00 00 00 00 00  .@%.............
    00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
  backtrace:
    [<(____ptrval____)>] __kmalloc_node+0x1a8/0x2c8
    [<(____ptrval____)>] blk_mq_realloc_tag_set_tags.part.70+0x48/0xd8
    [<(____ptrval____)>] blk_mq_alloc_tag_set+0x1dc/0x530
    [<(____ptrval____)>] bsg_setup_queue+0xe8/0x1a8
    [<(____ptrval____)>] sas_rphy_add+0x108/0x2d0
    [<(____ptrval____)>] sas_probe_devices+0x168/0x208
    [<(____ptrval____)>] sas_discover_domain+0x660/0x9c8
    [<(____ptrval____)>] process_one_work+0x3f8/0x690
    [<(____ptrval____)>] worker_thread+0x70/0x6a0
    [<(____ptrval____)>] kthread+0x1b8/0x1c0
    [<(____ptrval____)>] ret_from_fork+0x10/0x18
unreferenced object 0xffff041da5c65e00 (size 256):
  comm "kworker/u128:1", pid 549, jiffies 4294898543 (age 113.728s)
  hex dump (first 32 bytes):
    00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
    00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
  backtrace:
    [<(____ptrval____)>] __kmalloc_node+0x1a8/0x2c8
    [<(____ptrval____)>] blk_mq_alloc_tag_set+0x254/0x530
    [<(____ptrval____)>] bsg_setup_queue+0xe8/0x1a8
    [<(____ptrval____)>] sas_rphy_add+0x108/0x2d0
    [<(____ptrval____)>] sas_probe_devices+0x168/0x208
    [<(____ptrval____)>] sas_discover_domain+0x660/0x9c8
    [<(____ptrval____)>] process_one_work+0x3f8/0x690
    [<(____ptrval____)>] worker_thread+0x70/0x6a0
    [<(____ptrval____)>] kthread+0x1b8/0x1c0
    [<(____ptrval____)>] ret_from_fork+0x10/0x18
root@(none)$

It turns out that we don't clean up the request queue fully for bsg
devices, as the blk mq tags for the request queue are not freed.

Fix by doing the queue removal in one place - in sas_rphy_remove() -
instead of unregistering the queue in sas_rphy_remove() and finally
cleaning up the queue in calling blk_cleanup_queue() from
sas_end_device_release() or sas_expander_release().

Function bsg_remove_queue() can handle a NULL pointer q, so remove the
precheck in sas_rphy_remove().

Fixes: 651a013649 ("scsi: scsi_transport_sas: switch to bsg-lib for SMP passthrough")
Link: https://lore.kernel.org/r/1574242755-94156-1-git-send-email-john.garry@huawei.com
Signed-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-01-23 08:22:58 +01:00
Xiang Chen 83d2e42a2d scsi: hisi_sas: Return directly if init hardware failed
commit 547fde8b5a upstream.

Need to return directly if init hardware failed.

Fixes: 73a4925d15 ("scsi: hisi_sas: Update all the registers after suspend and resume")
Link: https://lore.kernel.org/r/1573551059-107873-3-git-send-email-john.garry@huawei.com
Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com>
Signed-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-01-23 08:22:58 +01:00
James Smart e9866d5ed2 scsi: lpfc: fix: Coverity: lpfc_get_scsi_buf_s3(): Null pointer dereferences
commit 6f23f8c5c9 upstream.

Coverity reported the following:
2020-01-23 08:22:58 +01:00
Pan Bian be4e660bca scsi: bnx2i: fix potential use after free
commit 29d28f2b8d upstream.

The member hba->pcidev may be used after its reference is dropped. Move the
put function to where it is never used to avoid potential use after free
issues.

Fixes: a771718065 ("[SCSI] bnx2i: Removed the reference to the netdev->base_addr")
Link: https://lore.kernel.org/r/1573043541-19126-1-git-send-email-bianpan2016@163.com
Signed-off-by: Pan Bian <bianpan2016@163.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-01-23 08:22:58 +01:00
Pan Bian 43c5bca0c4 scsi: qla4xxx: fix double free bug
commit 3fe3d2428b upstream.

The variable init_fw_cb is released twice, resulting in a double free
bug. The call to the function dma_free_coherent() before goto is removed to
get rid of potential double free.

Fixes: 2a49a78ed3 ("[SCSI] qla4xxx: added IPv6 support.")
Link: https://lore.kernel.org/r/1572945927-27796-1-git-send-email-bianpan2016@163.com
Signed-off-by: Pan Bian <bianpan2016@163.com>
Acked-by: Manish Rangankar <mrangankar@marvell.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-01-23 08:22:58 +01:00
Xiang Chen 1331942cbc scsi: hisi_sas: Set the BIST init value before enabling BIST
commit 65a3b8bd56 upstream.

If set the BIST init value after enabling BIST, there may be still some few
error bits. According to the process, need to set the BIST init value
before enabling BIST.

Fixes: 97b151e758 ("scsi: hisi_sas: Add BIST support for phy loopback")
Link: https://lore.kernel.org/r/1571926105-74636-3-git-send-email-john.garry@huawei.com
Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com>
Signed-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-01-23 08:22:58 +01:00
Xiang Chen 1397d34930 scsi: hisi_sas: Don't create debugfs dump folder twice
commit 35160421b6 upstream.

Due to a merge error, we attempt to create 2x debugfs dump folders, which
fails:
[  861.101914] debugfs: Directory 'dump' with parent '0000:74:02.0'
already present!

This breaks the dump function.

To fix, remove the superfluous attempt to create the folder.

Fixes: 7ec7082c57 ("scsi: hisi_sas: Add hisi_sas_debugfs_alloc() to centralise allocation")
Link: https://lore.kernel.org/r/1571926105-74636-2-git-send-email-john.garry@huawei.com
Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com>
Signed-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-01-23 08:22:58 +01:00
Dan Carpenter 37a15afafb scsi: esas2r: unlock on error in esas2r_nvram_read_direct()
commit 906ca6353a upstream.

This error path is missing an unlock.

Fixes: 26780d9e12 ("[SCSI] esas2r: ATTO Technology ExpressSAS 6G SAS/SATA RAID Adapter Driver")
Link: https://lore.kernel.org/r/20191022102324.GA27540@mwanda
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-01-23 08:22:58 +01:00
Long Li f6e4244f8b scsi: storvsc: Correctly set number of hardware queues for IDE disk
commit 7b571c19d4 upstream.

Commit 0ed8810276 ("scsi: storvsc: setup 1:1 mapping between hardware
queue and CPU queue") introduced a regression for disks attached to
IDE. For these disks the host VSP only offers one VMBUS channel. Setting
multiple queues can overload the VMBUS channel and result in performance
drop for high queue depth workload on system with large number of CPUs.

Fix it by leaving the number of hardware queues to 1 (default value) for
IDE disks.

Fixes: 0ed8810276 ("scsi: storvsc: setup 1:1 mapping between hardware queue and CPU queue")
Link: https://lore.kernel.org/r/1578960516-108228-1-git-send-email-longli@linuxonhyperv.com
Reviewed-by: Ming Lei <ming.lei@redhat.com>
Signed-off-by: Long Li <longli@microsoft.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-01-23 08:22:38 +01:00
Arnd Bergmann 86ee2a91cb scsi: fnic: fix invalid stack access
commit 42ec15ceae upstream.

gcc -O3 warns that some local variables are not properly initialized:

drivers/scsi/fnic/vnic_dev.c: In function 'fnic_dev_hang_notify':
drivers/scsi/fnic/vnic_dev.c:511:16: error: 'a0' is used uninitialized in this function [-Werror=uninitialized]
  vdev->args[0] = *a0;
  ~~~~~~~~~~~~~~^~~~~
drivers/scsi/fnic/vnic_dev.c:691:6: note: 'a0' was declared here
  u64 a0, a1;
      ^~
drivers/scsi/fnic/vnic_dev.c:512:16: error: 'a1' is used uninitialized in this function [-Werror=uninitialized]
  vdev->args[1] = *a1;
  ~~~~~~~~~~~~~~^~~~~
drivers/scsi/fnic/vnic_dev.c:691:10: note: 'a1' was declared here
  u64 a0, a1;
          ^~
drivers/scsi/fnic/vnic_dev.c: In function 'fnic_dev_mac_addr':
drivers/scsi/fnic/vnic_dev.c:512:16: error: 'a1' is used uninitialized in this function [-Werror=uninitialized]
  vdev->args[1] = *a1;
  ~~~~~~~~~~~~~~^~~~~
drivers/scsi/fnic/vnic_dev.c:698:10: note: 'a1' was declared here
  u64 a0, a1;
          ^~

Apparently the code relies on the local variables occupying adjacent memory
locations in the same order, but this is of course not guaranteed.

Use an array of two u64 variables where needed to make it work correctly.

I suspect there is also an endianness bug here, but have not digged in deep
enough to be sure.

Fixes: 5df6d737dd ("[SCSI] fnic: Add new Cisco PCI-Express FCoE HBA")
Fixes: mmtom ("init/Kconfig: enable -O3 for all arches")
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/r/20200107201602.4096790-1-arnd@arndb.de
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-01-23 08:22:35 +01:00
Varun Prakash 2921fd9b30 scsi: libcxgbi: fix NULL pointer dereference in cxgbi_device_destroy()
[ Upstream commit 71482fde70 ]

If cxgb4i_ddp_init() fails then cdev->cdev2ppm will be NULL, so add a check
for NULL pointer before dereferencing it.

Link: https://lore.kernel.org/r/1576676731-3068-1-git-send-email-varun@chelsio.com
Signed-off-by: Varun Prakash <varun@chelsio.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2020-01-17 19:49:03 +01:00
Arnd Bergmann bdd308d667 scsi: sd: enable compat ioctls for sed-opal
commit 142b2ac82e upstream.

The sed_ioctl() function is written to be compatible between
32-bit and 64-bit processes, however compat mode is only
wired up for nvme, not for sd.

Add the missing call to sed_ioctl() in sd_compat_ioctl().

Fixes: d80210f25f ("sd: add support for TCG OPAL self encrypting disks")
Cc: linux-scsi@vger.kernel.org
Cc: "James E.J. Bottomley" <jejb@linux.ibm.com>
Cc: "Martin K. Petersen" <martin.petersen@oracle.com>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-01-17 19:48:51 +01:00
Can Guo 7d261f5fda scsi: ufs: Give an unique ID to each ufs-bsg
commit 8c850a0296 upstream.

Considering there can be multiple UFS hosts in SoC, give each ufs-bsg an
unique ID by appending the scsi host number to its device name.

Link: https://lore.kernel.org/r/0101016eca8dc9d7-d24468d3-04d2-4ef3-a906-abe8b8cbcd3d-000000@us-west-2.amazonses.com
Fixes: df032bf27a ("scsi: ufs: Add a bsg endpoint that supports UPIUs")
Signed-off-by: Can Guo <cang@codeaurora.org>
Reviewed-by: Avri Altman <avri.altman@wdc.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-01-17 19:48:45 +01:00
Xiang Chen 63154ee1cd scsi: sd: Clear sdkp->protection_type if disk is reformatted without PI
commit 465f4edaec upstream.

If an attached disk with protection information enabled is reformatted
to Type 0 the revalidation code does not clear the original protection
type and subsequent accesses will keep setting RDPROTECT/WRPROTECT.

Set the protection type to 0 if the disk reports PROT_EN=0 in READ
CAPACITY(16).

[mkp: commit desc]

Fixes: fe542396da ("[SCSI] sd: Ensure we correctly disable devices with unknown protection type")
Link: https://lore.kernel.org/r/1578532344-101668-1-git-send-email-chenxiang66@hisilicon.com
Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-01-17 19:48:42 +01:00
Dan Carpenter eba97de9f8 scsi: mpt3sas: Fix double free in attach error handling
commit ee560e7bba upstream.

The caller also calls _base_release_memory_pools() on error so it leads to
a number of double frees:

drivers/scsi/mpt3sas/mpt3sas_base.c:7207 mpt3sas_base_attach() warn: 'ioc->chain_dma_pool' double freed
drivers/scsi/mpt3sas/mpt3sas_base.c:7207 mpt3sas_base_attach() warn: 'ioc->hpr_lookup' double freed
drivers/scsi/mpt3sas/mpt3sas_base.c:7207 mpt3sas_base_attach() warn: 'ioc->internal_lookup' double freed
drivers/scsi/mpt3sas/mpt3sas_base.c:7207 mpt3sas_base_attach() warn: 'ioc->pcie_sgl_dma_pool' double freed
drivers/scsi/mpt3sas/mpt3sas_base.c:7207 mpt3sas_base_attach() warn: 'ioc->reply_dma_pool' double freed
drivers/scsi/mpt3sas/mpt3sas_base.c:7207 mpt3sas_base_attach() warn: 'ioc->reply_free_dma_pool' double freed
drivers/scsi/mpt3sas/mpt3sas_base.c:7207 mpt3sas_base_attach() warn: 'ioc->reply_post_free_array_dma_pool' double freed
drivers/scsi/mpt3sas/mpt3sas_base.c:7207 mpt3sas_base_attach() warn: 'ioc->reply_post_free_dma_pool' double freed
drivers/scsi/mpt3sas/mpt3sas_base.c:7207 mpt3sas_base_attach() warn: 'ioc->sense_dma_pool' double freed

Fixes: 74522a92bb ("scsi: mpt3sas: Optimize I/O memory consumption in driver.")
Link: https://lore.kernel.org/r/20191203093652.gyntgvnkw2udatyc@kili.mountain
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Acked-by: Sreekanth Reddy <sreekanth.reddy@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-01-17 19:48:21 +01:00
Navid Emamdoost 448fe0b67c scsi: bfa: release allocated memory in case of error
commit 0e62395da2 upstream.

In bfad_im_get_stats if bfa_port_get_stats fails, allocated memory needs to
be released.

Link: https://lore.kernel.org/r/20190910234417.22151-1-navid.emamdoost@gmail.com
Signed-off-by: Navid Emamdoost <navid.emamdoost@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Cc: Ben Hutchings <ben.hutchings@codethink.co.uk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-01-14 20:08:37 +01:00
James Smart 858f090696 scsi: lpfc: Fix rpi release when deleting vport
commit 97acd0019d upstream.

A prior use-after-free mailbox fix solved it's problem by null'ing a ndlp
pointer.  However, further testing has shown that this change causes a
later state change to occasionally be skipped, which results in a reference
count never being decremented thus the rpi is never released, which causes
a vport delete to never succeed.

Revise the fix in the prior patch to no longer null the ndlp. Instead the
RELEASE_RPI flag is set which will drive the release of the rpi.

Given the new code was added at a deep indentation level, refactor the code
block using a new routine that avoids the indentation issues.

Fixes: 	9b16406864 ("scsi: lpfc: Fix use-after-free mailbox cmd completion")
Link: https://lore.kernel.org/r/20190922035906.10977-6-jsmart2021@gmail.com
Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-01-09 10:20:01 +01:00
Bo Wu d45a917138 scsi: iscsi: Avoid potential deadlock in iscsi_if_rx func
[ Upstream commit bba340c79b ]

In iscsi_if_rx func, after receiving one request through
iscsi_if_recv_msg func, iscsi_if_send_reply will be called to try to
reply to the request in a do-while loop.  If the iscsi_if_send_reply
function keeps returning -EAGAIN, a deadlock will occur.

For example, a client only send msg without calling recvmsg func, then
it will result in the watchdog soft lockup.  The details are given as
follows:

	sock_fd = socket(AF_NETLINK, SOCK_RAW, NETLINK_ISCSI);
	retval = bind(sock_fd, (struct sock addr*) & src_addr, sizeof(src_addr);
	while (1) {
		state_msg = sendmsg(sock_fd, &msg, 0);
		//Note: recvmsg(sock_fd, &msg, 0) is not processed here.
	}
	close(sock_fd);

watchdog: BUG: soft lockup - CPU#7 stuck for 22s! [netlink_test:253305] Sample time: 4000897528 ns(HZ: 250) Sample stat:
curr: user: 675503481560, nice: 321724050, sys: 448689506750, idle: 4654054240530, iowait: 40885550700, irq: 14161174020, softirq: 8104324140, st: 0
deta: user: 0, nice: 0, sys: 3998210100, idle: 0, iowait: 0, irq: 1547170, softirq: 242870, st: 0 Sample softirq:
         TIMER:        992
         SCHED:          8
Sample irqstat:
         irq    2: delta       1003, curr:    3103802, arch_timer
CPU: 7 PID: 253305 Comm: netlink_test Kdump: loaded Tainted: G           OE
Hardware name: QEMU KVM Virtual Machine, BIOS 0.0.0 02/06/2015
pstate: 40400005 (nZcv daif +PAN -UAO)
pc : __alloc_skb+0x104/0x1b0
lr : __alloc_skb+0x9c/0x1b0
sp : ffff000033603a30
x29: ffff000033603a30 x28: 00000000000002dd
x27: ffff800b34ced810 x26: ffff800ba7569f00
x25: 00000000ffffffff x24: 0000000000000000
x23: ffff800f7c43f600 x22: 0000000000480020
x21: ffff0000091d9000 x20: ffff800b34eff200
x19: ffff800ba7569f00 x18: 0000000000000000
x17: 0000000000000000 x16: 0000000000000000
x15: 0000000000000000 x14: 0001000101000100
x13: 0000000101010000 x12: 0101000001010100
x11: 0001010101010001 x10: 00000000000002dd
x9 : ffff000033603d58 x8 : ffff800b34eff400
x7 : ffff800ba7569200 x6 : ffff800b34eff400
x5 : 0000000000000000 x4 : 00000000ffffffff
x3 : 0000000000000000 x2 : 0000000000000001
x1 : ffff800b34eff2c0 x0 : 0000000000000300 Call trace:
__alloc_skb+0x104/0x1b0
iscsi_if_rx+0x144/0x12bc [scsi_transport_iscsi]
netlink_unicast+0x1e0/0x258
netlink_sendmsg+0x310/0x378
sock_sendmsg+0x4c/0x70
sock_write_iter+0x90/0xf0
__vfs_write+0x11c/0x190
vfs_write+0xac/0x1c0
ksys_write+0x6c/0xd8
__arm64_sys_write+0x24/0x30
el0_svc_common+0x78/0x130
el0_svc_handler+0x38/0x78
el0_svc+0x8/0xc

Link: https://lore.kernel.org/r/EDBAAA0BBBA2AC4E9C8B6B81DEEE1D6915E3D4D2@dggeml505-mbx.china.huawei.com
Signed-off-by: Bo Wu <wubo40@huawei.com>
Reviewed-by: Zhiqiang Liu <liuzhiqiang26@huawei.com>
Reviewed-by: Lee Duncan <lduncan@suse.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2020-01-09 10:19:46 +01:00
Jason Yan 55c89290c7 scsi: libsas: stop discovering if oob mode is disconnected
[ Upstream commit f70267f379 ]

The discovering of sas port is driven by workqueue in libsas. When libsas
is processing port events or phy events in workqueue, new events may rise
up and change the state of some structures such as asd_sas_phy.  This may
cause some problems such as follows:

==>thread 1                       ==>thread 2

                                  ==>phy up
                                  ==>phy_up_v3_hw()
                                    ==>oob_mode = SATA_OOB_MODE;
                                  ==>phy down quickly
                                  ==>hisi_sas_phy_down()
                                    ==>sas_ha->notify_phy_event()
                                    ==>sas_phy_disconnected()
                                      ==>oob_mode = OOB_NOT_CONNECTED
==>workqueue wakeup
==>sas_form_port()
  ==>sas_discover_domain()
    ==>sas_get_port_device()
      ==>oob_mode is OOB_NOT_CONNECTED and device
         is wrongly taken as expander

This at last lead to the panic when libsas trying to issue a command to
discover the device.

[183047.614035] Unable to handle kernel NULL pointer dereference at
virtual address 0000000000000058
[183047.622896] Mem abort info:
[183047.625762]   ESR = 0x96000004
[183047.628893]   Exception class = DABT (current EL), IL = 32 bits
[183047.634888]   SET = 0, FnV = 0
[183047.638015]   EA = 0, S1PTW = 0
[183047.641232] Data abort info:
[183047.644189]   ISV = 0, ISS = 0x00000004
[183047.648100]   CM = 0, WnR = 0
[183047.651145] user pgtable: 4k pages, 48-bit VAs, pgdp =
00000000b7df67be
[183047.657834] [0000000000000058] pgd=0000000000000000
[183047.662789] Internal error: Oops: 96000004 [#1] SMP
[183047.667740] Process kworker/u16:2 (pid: 31291, stack limit =
0x00000000417c4974)
[183047.675208] CPU: 0 PID: 3291 Comm: kworker/u16:2 Tainted: G
W  OE 4.19.36-vhulk1907.1.0.h410.eulerosv2r8.aarch64 #1
[183047.687015] Hardware name: N/A N/A/Kunpeng Desktop Board D920S10,
BIOS 0.15 10/22/2019
[183047.695007] Workqueue: 0000:74:02.0_disco_q sas_discover_domain
[183047.700999] pstate: 20c00009 (nzCv daif +PAN +UAO)
[183047.705864] pc : prep_ata_v3_hw+0xf8/0x230 [hisi_sas_v3_hw]
[183047.711510] lr : prep_ata_v3_hw+0xb0/0x230 [hisi_sas_v3_hw]
[183047.717153] sp : ffff00000f28ba60
[183047.720541] x29: ffff00000f28ba60 x28: ffff8026852d7228
[183047.725925] x27: ffff8027dba3e0a8 x26: ffff8027c05fc200
[183047.731310] x25: 0000000000000000 x24: ffff8026bafa8dc0
[183047.736695] x23: ffff8027c05fc218 x22: ffff8026852d7228
[183047.742079] x21: ffff80007c2f2940 x20: ffff8027c05fc200
[183047.747464] x19: 0000000000f80800 x18: 0000000000000010
[183047.752848] x17: 0000000000000000 x16: 0000000000000000
[183047.758232] x15: ffff000089a5a4ff x14: 0000000000000005
[183047.763617] x13: ffff000009a5a50e x12: ffff8026bafa1e20
[183047.769001] x11: ffff0000087453b8 x10: ffff00000f28b870
[183047.774385] x9 : 0000000000000000 x8 : ffff80007e58f9b0
[183047.779770] x7 : 0000000000000000 x6 : 000000000000003f
[183047.785154] x5 : 0000000000000040 x4 : ffffffffffffffe0
[183047.790538] x3 : 00000000000000f8 x2 : 0000000002000007
[183047.795922] x1 : 0000000000000008 x0 : 0000000000000000
[183047.801307] Call trace:
[183047.803827]  prep_ata_v3_hw+0xf8/0x230 [hisi_sas_v3_hw]
[183047.809127]  hisi_sas_task_prep+0x750/0x888 [hisi_sas_main]
[183047.814773]  hisi_sas_task_exec.isra.7+0x88/0x1f0 [hisi_sas_main]
[183047.820939]  hisi_sas_queue_command+0x28/0x38 [hisi_sas_main]
[183047.826757]  smp_execute_task_sg+0xec/0x218
[183047.831013]  smp_execute_task+0x74/0xa0
[183047.834921]  sas_discover_expander.part.7+0x9c/0x5f8
[183047.839959]  sas_discover_root_expander+0x90/0x160
[183047.844822]  sas_discover_domain+0x1b8/0x1e8
[183047.849164]  process_one_work+0x1b4/0x3f8
[183047.853246]  worker_thread+0x54/0x470
[183047.856981]  kthread+0x134/0x138
[183047.860283]  ret_from_fork+0x10/0x18
[183047.863931] Code: f9407a80 528000e2 39409281 72a04002 (b9405800)
[183047.870097] kernel fault(0x1) notification starting on CPU 0
[183047.875828] kernel fault(0x1) notification finished on CPU 0
[183047.881559] Modules linked in: unibsp(OE) hns3(OE) hclge(OE)
hnae3(OE) mem_drv(OE) hisi_sas_v3_hw(OE) hisi_sas_main(OE)
[183047.892418] ---[ end trace 4cc26083fc11b783  ]---
[183047.897107] Kernel panic - not syncing: Fatal exception
[183047.902403] kernel fault(0x5) notification starting on CPU 0
[183047.908134] kernel fault(0x5) notification finished on CPU 0
[183047.913865] SMP: stopping secondary CPUs
[183047.917861] Kernel Offset: disabled
[183047.921422] CPU features: 0x2,a2a00a38
[183047.925243] Memory Limit: none
[183047.928372] kernel reboot(0x2) notification starting on CPU 0
[183047.934190] kernel reboot(0x2) notification finished on CPU 0
[183047.940008] ---[ end Kernel panic - not syncing: Fatal exception
]---

Fixes: 2908d778ab ("[SCSI] aic94xx: new driver")
Link: https://lore.kernel.org/r/20191206011118.46909-1-yanaijie@huawei.com
Reported-by: Gao Chuan <gaochuan4@huawei.com>
Reviewed-by: John Garry <john.garry@huawei.com>
Signed-off-by: Jason Yan <yanaijie@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2020-01-09 10:19:46 +01:00
Dan Carpenter 505eead5e5 scsi: iscsi: qla4xxx: fix double free in probe
[ Upstream commit fee92f2577 ]

On this error path we call qla4xxx_mem_free() and then the caller also
calls qla4xxx_free_adapter() which calls qla4xxx_mem_free().  It leads to a
couple double frees:

drivers/scsi/qla4xxx/ql4_os.c:8856 qla4xxx_probe_adapter() warn: 'ha->chap_dma_pool' double freed
drivers/scsi/qla4xxx/ql4_os.c:8856 qla4xxx_probe_adapter() warn: 'ha->fw_ddb_dma_pool' double freed

Fixes: afaf5a2d34 ("[SCSI] Initial Commit of qla4xxx")
Link: https://lore.kernel.org/r/20191203094421.hw7ex7qr3j2rbsmx@kili.mountain
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2020-01-09 10:19:46 +01:00
Roman Bolshakov 68204b46da scsi: qla2xxx: Ignore PORT UPDATE after N2N PLOGI
[ Upstream commit af22f0c7b0 ]

PORT UPDATE asynchronous event is generated on the host that issues PLOGI
ELS (in the case of higher WWPN). In that case, the event shouldn't be
handled as it sets unwanted DPC flags (i.e. LOOP_RESYNC_NEEDED) that
trigger link flap.

Ignore the event if the host has higher WWPN, but handle otherwise.

Cc: Quinn Tran <qutran@marvell.com>
Link: https://lore.kernel.org/r/20191125165702.1013-13-r.bolshakov@yadro.com
Acked-by: Himanshu Madhani <hmadhani@marvell.com>
Reviewed-by: Hannes Reinecke <hare@suse.de>
Tested-by: Hannes Reinecke <hare@suse.de>
Signed-off-by: Roman Bolshakov <r.bolshakov@yadro.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2020-01-09 10:19:46 +01:00
Roman Bolshakov 7497032b17 scsi: qla2xxx: Don't defer relogin unconditonally
[ Upstream commit dabc5ec915 ]

qla2x00_configure_local_loop sets RELOGIN_NEEDED bit and calls
qla24xx_fcport_handle_login to perform the login. This bit triggers a wake
up of DPC later after a successful login.

The deferred call is not needed if login succeeds, and it's set in
qla24xx_fcport_handle_login in case of errors, hence it should be safe to
drop.

Link: https://lore.kernel.org/r/20191125165702.1013-12-r.bolshakov@yadro.com
Acked-by: Himanshu Madhani <hmadhani@marvell.com>
Acked-by: Quinn Tran <qutran@marvell.com>
Reviewed-by: Hannes Reinecke <hare@suse.de>
Tested-by: Hannes Reinecke <hare@suse.de>
Signed-off-by: Roman Bolshakov <r.bolshakov@yadro.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2020-01-09 10:19:45 +01:00
Roman Bolshakov 944e01086e scsi: qla2xxx: Send Notify ACK after N2N PLOGI
[ Upstream commit 5e6b01d84b ]

qlt_handle_login schedules session for deletion even if a login is in
progress. That causes login bouncing, i.e. a few logins are made before it
settles down.

Complete the first login by sending Notify Acknowledge IOCB via
qlt_plogi_ack_unref if the session is pending login completion.

Fixes: 9cd883f07a ("scsi: qla2xxx: Fix session cleanup for N2N")
Cc: Krishna Kant <krishna.kant@purestorage.com>
Cc: Alexei Potashnik <alexei@purestorage.com>
Link: https://lore.kernel.org/r/20191125165702.1013-11-r.bolshakov@yadro.com
Acked-by: Quinn Tran <qutran@marvell.com>
Acked-by: Himanshu Madhani <hmadhani@marvell.com>
Reviewed-by: Hannes Reinecke <hare@suse.de>
Tested-by: Hannes Reinecke <hare@suse.de>
Signed-off-by: Roman Bolshakov <r.bolshakov@yadro.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2020-01-09 10:19:45 +01:00
Roman Bolshakov 610ab9d567 scsi: qla2xxx: Configure local loop for N2N target
[ Upstream commit fd1de5830a ]

qla2x00_configure_local_loop initializes PLOGI payload for PLOGI ELS using
Get Parameters mailbox command.

In the case when the driver is running in target mode, the topology is N2N
and the target port has higher WWPN, LOCAL_LOOP_UPDATE bit is cleared too
early and PLOGI payload is not initialized by the Get Parameters
command. That causes a failure of ELS IOCB carrying the PLOGI with 0x15 aka
Data Underrun error.

LOCAL_LOOP_UPDATE has to be set to initialize PLOGI payload.

Fixes: 48acad0990 ("scsi: qla2xxx: Fix N2N link re-connect")
Link: https://lore.kernel.org/r/20191125165702.1013-10-r.bolshakov@yadro.com
Acked-by: Quinn Tran <qutran@marvell.com>
Acked-by: Himanshu Madhani <hmadhani@marvell.com>
Reviewed-by: Hannes Reinecke <hare@suse.de>
Tested-by: Hannes Reinecke <hare@suse.de>
Signed-off-by: Roman Bolshakov <r.bolshakov@yadro.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2020-01-09 10:19:45 +01:00
Roman Bolshakov 97345ea931 scsi: qla2xxx: Fix PLOGI payload and ELS IOCB dump length
[ Upstream commit 0334cdea1f ]

The size of the buffer is hardcoded as 0x70 or 112 bytes, while the size of
ELS IOCB is 0x40 and the size of PLOGI payload returned by Get Parameters
command is 0x74.

Cc: Quinn Tran <qutran@marvell.com>
Link: https://lore.kernel.org/r/20191125165702.1013-9-r.bolshakov@yadro.com
Acked-by: Himanshu Madhani <hmadhani@marvell.com>
Reviewed-by: Hannes Reinecke <hare@suse.de>
Tested-by: Hannes Reinecke <hare@suse.de>
Signed-off-by: Roman Bolshakov <r.bolshakov@yadro.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2020-01-09 10:19:45 +01:00
Roman Bolshakov 7e56964da8 scsi: qla2xxx: Don't call qlt_async_event twice
[ Upstream commit 2c2f4bed9b ]

MBA_PORT_UPDATE generates duplicate log lines in target mode because
qlt_async_event is called twice. Drop the calls within the case as the
function will be called right after the switch statement.

Cc: Quinn Tran <qutran@marvell.com>
Link: https://lore.kernel.org/r/20191125165702.1013-8-r.bolshakov@yadro.com
Acked-by: Himanshu Madhani <hmadhani@marvel.com>
Reviewed-by: Hannes Reinecke <hare@suse.de>
Tested-by: Hannes Reinecke <hare@suse.de>
Acked-by: Himanshu Madhani <hmadhani@marvell.com>
Signed-off-by: Roman Bolshakov <r.bolshakov@yadro.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2020-01-09 10:19:44 +01:00
Roman Bolshakov 8ad67d0ec1 scsi: qla2xxx: Drop superfluous INIT_WORK of del_work
[ Upstream commit 600954e6f2 ]

del_work is already initialized inside qla2x00_alloc_fcport, there's no
need to overwrite it. Indeed, it might prevent complete traversal of
workqueue list.

Fixes: a01c77d2cb ("scsi: qla2xxx: Move session delete to driver work queue")
Cc: Quinn Tran <qutran@marvell.com>
Link: https://lore.kernel.org/r/20191125165702.1013-5-r.bolshakov@yadro.com
Acked-by: Himanshu Madhani <hmadhani@marvell.com>
Reviewed-by: Hannes Reinecke <hare@suse.de>
Tested-by: Hannes Reinecke <hare@suse.de>
Reviewed-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: Roman Bolshakov <r.bolshakov@yadro.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2020-01-09 10:19:44 +01:00
Quinn Tran 99a1bdf6ae scsi: qla2xxx: Use explicit LOGO in target mode
[ Upstream commit 86196a8fa8 ]

Target makes implicit LOGO on session teardown. LOGO ELS is not send on the
wire and initiator is not aware that target no longer wants talking to
it. Initiator keeps sending I/O requests, target responds with BA_RJT, they
time out and then initiator sends ABORT TASK (ABTS-LS).

Current behaviour incurs unneeded I/O timeout and can be fixed for some
initiators by making explicit LOGO on session deletion.

Link: https://lore.kernel.org/r/20191125165702.1013-3-r.bolshakov@yadro.com
Reviewed-by: Hannes Reinecke <hare@suse.de>
Tested-by: Hannes Reinecke <hare@suse.de>
Signed-off-by: Quinn Tran <qutran@marvell.com>
Signed-off-by: Himanshu Madhani <hmadhani@marvell.com>
Signed-off-by: Roman Bolshakov <r.bolshakov@yadro.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2020-01-09 10:19:44 +01:00
Bo Wu f327fb083c scsi: lpfc: Fix memory leak on lpfc_bsg_write_ebuf_set func
[ Upstream commit 9a1b0b9a6d ]

When phba->mbox_ext_buf_ctx.seqNum != phba->mbox_ext_buf_ctx.numBuf,
dd_data should be freed before return SLI_CONFIG_HANDLED.

When lpfc_sli_issue_mbox func return fails, pmboxq should be also freed in
job_error tag.

Link: https://lore.kernel.org/r/EDBAAA0BBBA2AC4E9C8B6B81DEEE1D6915E7A966@DGGEML525-MBS.china.huawei.com
Signed-off-by: Bo Wu <wubo40@huawei.com>
Reviewed-by: Zhiqiang Liu <liuzhiqiang26@huawei.com>
Reviewed-by: James Smart <james.smart@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2020-01-09 10:19:44 +01:00
James Smart 6b49a5a9eb nvme_fc: add module to ops template to allow module references
[ Upstream commit 863fbae929 ]

In nvme-fc: it's possible to have connected active controllers
and as no references are taken on the LLDD, the LLDD can be
unloaded.  The controller would enter a reconnect state and as
long as the LLDD resumed within the reconnect timeout, the
controller would resume.  But if a namespace on the controller
is the root device, allowing the driver to unload can be problematic.
To reload the driver, it may require new io to the boot device,
and as it's no longer connected we get into a catch-22 that
eventually fails, and the system locks up.

Fix this issue by taking a module reference for every connected
controller (which is what the core layer did to the transport
module). Reference is cleared when the controller is removed.

Acked-by: Himanshu Madhani <hmadhani@marvell.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Keith Busch <kbusch@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2020-01-09 10:19:41 +01:00
Anatol Pomazau 692b104e36 scsi: iscsi: Don't send data to unbound connection
[ Upstream commit 238191d65d ]

If a faulty initiator fails to bind the socket to the iSCSI connection
before emitting a command, for instance, a subsequent send_pdu, it will
crash the kernel due to a null pointer dereference in sock_sendmsg(), as
shown in the log below.  This patch makes sure the bind succeeded before
trying to use the socket.

BUG: kernel NULL pointer dereference, address: 0000000000000018
 #PF: supervisor read access in kernel mode
 #PF: error_code(0x0000) - not-present page
PGD 0 P4D 0
Oops: 0000 [#1] SMP PTI
CPU: 3 PID: 7 Comm: kworker/u8:0 Not tainted 5.4.0-rc2.iscsi+ #13
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.12.0-1 04/01/2014
[   24.158246] Workqueue: iscsi_q_0 iscsi_xmitworker
[   24.158883] RIP: 0010:apparmor_socket_sendmsg+0x5/0x20
[...]
[   24.161739] RSP: 0018:ffffab6440043ca0 EFLAGS: 00010282
[   24.162400] RAX: ffffffff891c1c00 RBX: ffffffff89d53968 RCX: 0000000000000001
[   24.163253] RDX: 0000000000000030 RSI: ffffab6440043d00 RDI: 0000000000000000
[   24.164104] RBP: 0000000000000030 R08: 0000000000000030 R09: 0000000000000030
[   24.165166] R10: ffffffff893e66a0 R11: 0000000000000018 R12: ffffab6440043d00
[   24.166038] R13: 0000000000000000 R14: 0000000000000000 R15: ffff9d5575a62e90
[   24.166919] FS:  0000000000000000(0000) GS:ffff9d557db80000(0000) knlGS:0000000000000000
[   24.167890] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[   24.168587] CR2: 0000000000000018 CR3: 000000007a838000 CR4: 00000000000006e0
[   24.169451] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[   24.170320] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[   24.171214] Call Trace:
[   24.171537]  security_socket_sendmsg+0x3a/0x50
[   24.172079]  sock_sendmsg+0x16/0x60
[   24.172506]  iscsi_sw_tcp_xmit_segment+0x77/0x120
[   24.173076]  iscsi_sw_tcp_pdu_xmit+0x58/0x170
[   24.173604]  ? iscsi_dbg_trace+0x63/0x80
[   24.174087]  iscsi_tcp_task_xmit+0x101/0x280
[   24.174666]  iscsi_xmit_task+0x83/0x110
[   24.175206]  iscsi_xmitworker+0x57/0x380
[   24.175757]  ? __schedule+0x2a2/0x700
[   24.176273]  process_one_work+0x1b5/0x360
[   24.176837]  worker_thread+0x50/0x3c0
[   24.177353]  kthread+0xf9/0x130
[   24.177799]  ? process_one_work+0x360/0x360
[   24.178401]  ? kthread_park+0x90/0x90
[   24.178915]  ret_from_fork+0x35/0x40
[   24.179421] Modules linked in:
[   24.179856] CR2: 0000000000000018
[   24.180327] ---[ end trace b4b7674b6df5f480 ]---

Signed-off-by: Anatol Pomazau <anatol@google.com>
Co-developed-by: Frank Mayhar <fmayhar@google.com>
Signed-off-by: Frank Mayhar <fmayhar@google.com>
Co-developed-by: Bharath Ravi <rbharath@google.com>
Signed-off-by: Bharath Ravi <rbharath@google.com>
Co-developed-by: Khazhimsel Kumykov <khazhy@google.com>
Signed-off-by: Khazhimsel Kumykov <khazhy@google.com>
Co-developed-by: Gabriel Krisman Bertazi <krisman@collabora.com>
Signed-off-by: Gabriel Krisman Bertazi <krisman@collabora.com>
Reviewed-by: Lee Duncan <lduncan@suse.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2020-01-04 19:18:17 +01:00
Can Guo affadd78b1 scsi: ufs: Fix up auto hibern8 enablement
[ Upstream commit 71d848b8d9 ]

Fix up possible unclocked register access to auto hibern8 register in
resume path and through sysfs entry. Meanwhile, enable auto hibern8 only
after device is fully initialized in probe path.

Link: https://lore.kernel.org/r/1573798172-20534-4-git-send-email-cang@codeaurora.org
Reviewed-by: Stanley Chu <stanley.chu@mediatek.com>
Signed-off-by: Can Guo <cang@codeaurora.org>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2020-01-04 19:18:17 +01:00
Finn Thain 026401a594 scsi: NCR5380: Add disconnect_mask module parameter
[ Upstream commit 0b7a223552 ]

Add a module parameter to inhibit disconnect/reselect for individual
targets. This gains compatibility with Aztec PowerMonster SCSI/SATA
adapters with buggy firmware. (No fix is available from the vendor.)

Apparently these adapters pass-through the product/vendor of the attached
SATA device. Since they can't be identified from the response to an INQUIRY
command, a device blacklist flag won't work.

Cc: Michael Schmitz <schmitzmic@gmail.com>
Link: https://lore.kernel.org/r/993b17545990f31f9fa5a98202b51102a68e7594.1573875417.git.fthain@telegraphics.com.au
Reviewed-and-tested-by: Michael Schmitz <schmitzmic@gmail.com>
Signed-off-by: Finn Thain <fthain@telegraphics.com.au>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2020-01-04 19:18:16 +01:00
Maurizio Lombardi 9d411fa766 scsi: scsi_debug: num_tgts must be >= 0
[ Upstream commit aa5334c4f3 ]

Passing the parameter "num_tgts=-1" will start an infinite loop that
exhausts the system memory

Link: https://lore.kernel.org/r/20191115163727.24626-1-mlombard@redhat.com
Signed-off-by: Maurizio Lombardi <mlombard@redhat.com>
Acked-by: Douglas Gilbert <dgilbert@interlog.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2020-01-04 19:18:15 +01:00
Subhash Jadavani 17e6ff3d46 scsi: ufs: Fix error handing during hibern8 enter
[ Upstream commit 6d303e4b19 ]

During clock gating (ufshcd_gate_work()), we first put the link hibern8 by
calling ufshcd_uic_hibern8_enter() and if ufshcd_uic_hibern8_enter()
returns success (0) then we gate all the clocks.  Now let’s zoom in to what
ufshcd_uic_hibern8_enter() does internally: It calls
__ufshcd_uic_hibern8_enter() and if failure is encountered, link recovery
shall put the link back to the highest HS gear and returns success (0) to
ufshcd_uic_hibern8_enter() which is the issue as link is still in active
state due to recovery!  Now ufshcd_uic_hibern8_enter() returns success to
ufshcd_gate_work() and hence it goes ahead with gating the UFS clock while
link is still in active state hence I believe controller would raise UIC
error interrupts. But when we service the interrupt, clocks might have
already been disabled!

This change fixes for this by returning failure from
__ufshcd_uic_hibern8_enter() if recovery succeeds as link is still not in
hibern8, upon receiving the error ufshcd_hibern8_enter() would initiate
retry to put the link state back into hibern8.

Link: https://lore.kernel.org/r/1573798172-20534-8-git-send-email-cang@codeaurora.org
Reviewed-by: Avri Altman <avri.altman@wdc.com>
Reviewed-by: Bean Huo <beanhuo@micron.com>
Signed-off-by: Subhash Jadavani <subhashj@codeaurora.org>
Signed-off-by: Can Guo <cang@codeaurora.org>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2020-01-04 19:18:15 +01:00
peter chang ed1e1d6731 scsi: pm80xx: Fix for SATA device discovery
[ Upstream commit ce21c63ee9 ]

Driver was missing complete() call in mpi_sata_completion which result in
SATA abort error handling timing out. That causes the device to be left in
the in_recovery state so subsequent commands sent to the device fail and
the OS removes access to it.

Link: https://lore.kernel.org/r/20191114100910.6153-2-deepak.ukey@microchip.com
Acked-by: Jack Wang <jinpu.wang@cloud.ionos.com>
Signed-off-by: peter chang <dpf@google.com>
Signed-off-by: Deepak Ukey <deepak.ukey@microchip.com>
Signed-off-by: Viswas G <Viswas.G@microchip.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2020-01-04 19:18:14 +01:00
Finn Thain f2ead37107 scsi: atari_scsi: sun3_scsi: Set sg_tablesize to 1 instead of SG_NONE
[ Upstream commit 79172ab20b ]

Since the scsi subsystem adopted the blk-mq API, a host with zero
sg_tablesize crashes with a NULL pointer dereference.

blk_queue_max_segments: set to minimum 1
scsi 0:0:0:0: Direct-Access     QEMU     QEMU HARDDISK    2.5+ PQ: 0 ANSI: 5
scsi target0:0:0: Beginning Domain Validation
scsi target0:0:0: Domain Validation skipping write tests
scsi target0:0:0: Ending Domain Validation
blk_queue_max_segments: set to minimum 1
scsi 0:0:1:0: Direct-Access     QEMU     QEMU HARDDISK    2.5+ PQ: 0 ANSI: 5
scsi target0:0:1: Beginning Domain Validation
scsi target0:0:1: Domain Validation skipping write tests
scsi target0:0:1: Ending Domain Validation
blk_queue_max_segments: set to minimum 1
scsi 0:0:2:0: CD-ROM            QEMU     QEMU CD-ROM      2.5+ PQ: 0 ANSI: 5
scsi target0:0:2: Beginning Domain Validation
scsi target0:0:2: Domain Validation skipping write tests
scsi target0:0:2: Ending Domain Validation
blk_queue_max_segments: set to minimum 1
blk_queue_max_segments: set to minimum 1
blk_queue_max_segments: set to minimum 1
blk_queue_max_segments: set to minimum 1
sr 0:0:2:0: Power-on or device reset occurred
sd 0:0:0:0: Power-on or device reset occurred
sd 0:0:1:0: Power-on or device reset occurred
sd 0:0:0:0: [sda] 10485762 512-byte logical blocks: (5.37 GB/5.00 GiB)
sd 0:0:0:0: [sda] Write Protect is off
sd 0:0:0:0: [sda] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
Unable to handle kernel NULL pointer dereference at virtual address (ptrval)
Oops: 00000000
Modules linked in:
PC: [<001cd874>] blk_mq_free_request+0x66/0xe2
SR: 2004  SP: (ptrval)  a2: 00874520
d0: 00000000    d1: 00000000    d2: 009ba800    d3: 00000000
d4: 00000000    d5: 08000002    a0: 0087be68    a1: 009a81e0
Process kworker/u2:2 (pid: 15, task=(ptrval))
Frame format=7 eff addr=0000007a ssw=0505 faddr=0000007a
wb 1 stat/addr/data: 0000 00000000 00000000
wb 2 stat/addr/data: 0000 00000000 00000000
wb 3 stat/addr/data: 0000 0000007a 00000000
push data: 00000000 00000000 00000000 00000000
Stack from 0087bd98:
        00000002 00000000 0087be72 009a7820 0087bdb4 001c4f6c 009a7820 0087bdd4
        0024d200 009a7820 0024d0dc 0087be72 009baa00 0087be68 009a5000 0087be7c
        00265d10 009a5000 0087be72 00000003 00000000 00000000 00000000 0087be68
        00000bb8 00000005 00000000 00000000 00000000 00000000 00265c56 00000000
        009ba60c 0036ddf4 00000002 ffffffff 009baa00 009ba600 009a50d6 0087be74
        00227ba0 009baa08 00000001 009baa08 009ba60c 0036ddf4 00000000 00000000
Call Trace: [<001c4f6c>] blk_put_request+0xe/0x14
 [<0024d200>] __scsi_execute+0x124/0x174
 [<0024d0dc>] __scsi_execute+0x0/0x174
 [<00265d10>] sd_revalidate_disk+0xba/0x1f02
 [<00265c56>] sd_revalidate_disk+0x0/0x1f02
 [<0036ddf4>] strlen+0x0/0x22
 [<00227ba0>] device_add+0x3da/0x604
 [<0036ddf4>] strlen+0x0/0x22
 [<00267e64>] sd_probe+0x30c/0x4b4
 [<0002da44>] process_one_work+0x0/0x402
 [<0022b978>] really_probe+0x226/0x354
 [<0022bc34>] driver_probe_device+0xa4/0xf0
 [<0002da44>] process_one_work+0x0/0x402
 [<0022bcd0>] __driver_attach_async_helper+0x50/0x70
 [<00035dae>] async_run_entry_fn+0x36/0x130
 [<0002db88>] process_one_work+0x144/0x402
 [<0002e1aa>] worker_thread+0x0/0x570
 [<0002e29a>] worker_thread+0xf0/0x570
 [<0002e1aa>] worker_thread+0x0/0x570
 [<003768d8>] schedule+0x0/0xb8
 [<0003f58c>] __init_waitqueue_head+0x0/0x12
 [<00033e92>] kthread+0xc2/0xf6
 [<000331e8>] kthread_parkme+0x0/0x4e
 [<003768d8>] schedule+0x0/0xb8
 [<00033dd0>] kthread+0x0/0xf6
 [<00002c10>] ret_from_kernel_thread+0xc/0x14
Code: 0280 0006 0800 56c0 4400 0280 0000 00ff <52b4> 0c3a 082b 0006 0013 6706 2042 53a8 00c4 4ab9 0047 3374 6640 202d 000c 670c
Disabling lock debugging due to kernel taint

Avoid this by setting sg_tablesize = 1.

Link: https://lore.kernel.org/r/4567bcae94523b47d6f3b77450ba305823bca479.1572656814.git.fthain@telegraphics.com.au
Reported-and-tested-by: Michael Schmitz <schmitzmic@gmail.com>
Reviewed-by: Michael Schmitz <schmitzmic@gmail.com>
References: commit 68ab2d76e4 ("scsi: cxlflash: Set sg_tablesize to 1 instead of SG_NONE")
Signed-off-by: Finn Thain <fthain@telegraphics.com.au>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2020-01-04 19:18:10 +01:00
Bean Huo e02c150d9b scsi: ufs: fix potential bug which ends in system hang
[ Upstream commit cfcbae3895 ]

In function __ufshcd_query_descriptor(), in the event of an error
happening, we directly goto out_unlock and forget to invaliate
hba->dev_cmd.query.descriptor pointer. This results in this pointer still
valid in ufshcd_copy_query_response() for other query requests which go
through ufshcd_exec_raw_upiu_cmd(). This will cause __memcpy() crash and
system hangs. Log as shown below:

Unable to handle kernel paging request at virtual address
ffff000012233c40
Mem abort info:
   ESR = 0x96000047
   Exception class = DABT (current EL), IL = 32 bits
   SET = 0, FnV = 0
   EA = 0, S1PTW = 0
Data abort info:
   ISV = 0, ISS = 0x00000047
   CM = 0, WnR = 1
swapper pgtable: 4k pages, 48-bit VAs, pgdp = 0000000028cc735c
[ffff000012233c40] pgd=00000000bffff003, pud=00000000bfffe003,
pmd=00000000ba8b8003, pte=0000000000000000
 Internal error: Oops: 96000047 [#2] PREEMPT SMP
 ...
 Call trace:
  __memcpy+0x74/0x180
  ufshcd_issue_devman_upiu_cmd+0x250/0x3c0
  ufshcd_exec_raw_upiu_cmd+0xfc/0x1a8
  ufs_bsg_request+0x178/0x3b0
  bsg_queue_rq+0xc0/0x118
  blk_mq_dispatch_rq_list+0xb0/0x538
  blk_mq_sched_dispatch_requests+0x18c/0x1d8
  __blk_mq_run_hw_queue+0xb4/0x118
  blk_mq_run_work_fn+0x28/0x38
  process_one_work+0x1ec/0x470
  worker_thread+0x48/0x458
  kthread+0x130/0x138
  ret_from_fork+0x10/0x1c
 Code: 540000ab a8c12027 a88120c7 a8c12027 (a88120c7)
 ---[ end trace 793e1eb5dff69f2d ]---
 note: kworker/0:2H[2054] exited with preempt_count 1

This patch is to move "descriptor = NULL" down to below the label
"out_unlock".

Fixes: d44a5f98bb49b2(ufs: query descriptor API)
Link: https://lore.kernel.org/r/20191112223436.27449-3-huobean@gmail.com
Reviewed-by: Alim Akhtar <alim.akhtar@samsung.com>
Reviewed-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: Bean Huo <beanhuo@micron.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2020-01-04 19:17:46 +01:00
Kars de Jong b0a72e78fc scsi: zorro_esp: Limit DMA transfers to 65536 bytes (except on Fastlane)
[ Upstream commit 02f7e9f351 ]

When using this driver on a Blizzard 1260, there were failures whenever DMA
transfers from the SCSI bus to memory of 65535 bytes were followed by a DMA
transfer of 1 byte. This caused the byte at offset 65535 to be overwritten
with 0xff. The Blizzard hardware can't handle single byte DMA transfers.

Besides this issue, limiting the DMA length to something that is not a
multiple of the page size is very inefficient on most file systems.

It seems this limit was chosen because the DMA transfer counter of the ESP
by default is 16 bits wide, thus limiting the length to 65535 bytes.
However, the value 0 means 65536 bytes, which is handled by the ESP and the
Blizzard just fine. It is also the default maximum used by esp_scsi when
drivers don't provide their own dma_length_limit() function.

The limit of 65536 bytes can be used by all boards except the Fastlane. The
old driver used a limit of 65532 bytes (0xfffc), which is reintroduced in
this patch.

Fixes: b7ded0e8b0 ("scsi: zorro_esp: Limit DMA transfers to 65535 bytes")
Link: https://lore.kernel.org/r/20191112175523.23145-1-jongk@linux-m68k.org
Signed-off-by: Kars de Jong <jongk@linux-m68k.org>
Reviewed-by: Finn Thain <fthain@telegraphics.com.au>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2020-01-04 19:17:37 +01:00
James Smart cd53b26c1b scsi: lpfc: fix: Coverity: lpfc_cmpl_els_rsp(): Null pointer dereferences
[ Upstream commit 6c6d59e0fe ]

Coverity reported the following:

*** CID 101747:  Null pointer dereferences  (FORWARD_NULL)
/drivers/scsi/lpfc/lpfc_els.c: 4439 in lpfc_cmpl_els_rsp()
4433     			kfree(mp);
4434     		}
4435     		mempool_free(mbox, phba->mbox_mem_pool);
4436     	}
4437     out:
4438     	if (ndlp && NLP_CHK_NODE_ACT(ndlp)) {
vvv     CID 101747:  Null pointer dereferences  (FORWARD_NULL)
vvv     Dereferencing null pointer "shost".
4439     		spin_lock_irq(shost->host_lock);
4440     		ndlp->nlp_flag &= ~(NLP_ACC_REGLOGIN | NLP_RM_DFLT_RPI);
4441     		spin_unlock_irq(shost->host_lock);
4442
4443     		/* If the node is not being used by another discovery thread,
4444     		 * and we are sending a reject, we are done with it.

Fix by adding a check for non-null shost in line 4438.
The scenario when shost is set to null is when ndlp is null.
As such, the ndlp check present was sufficient. But better safe
than sorry so add the shost check.

Reported-by: coverity-bot <keescook+coverity-bot@chromium.org>
Addresses-Coverity-ID: 101747 ("Null pointer dereferences")
Fixes: 2e0fef85e0 ("[SCSI] lpfc: NPIV: split ports")

CC: James Bottomley <James.Bottomley@SteelEye.com>
CC: "Gustavo A. R. Silva" <gustavo@embeddedor.com>
CC: linux-next@vger.kernel.org
Link: https://lore.kernel.org/r/20191111230401.12958-3-jsmart2021@gmail.com
Reviewed-by: Ewan D. Milne <emilne@redhat.com>
Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2020-01-04 19:17:36 +01:00
James Smart 11ff350c9b scsi: lpfc: Fix duplicate unreg_rpi error in port offline flow
[ Upstream commit 7cfd5639d9 ]

If the driver receives a login that is later then LOGO'd by the remote port
(aka ndlp), the driver, upon the completion of the LOGO ACC transmission,
will logout the node and unregister the rpi that is being used for the
node.  As part of the unreg, the node's rpi value is replaced by the
LPFC_RPI_ALLOC_ERROR value.  If the port is subsequently offlined, the
offline walks the nodes and ensures they are logged out, which possibly
entails unreg'ing their rpi values.  This path does not validate the node's
rpi value, thus doesn't detect that it has been unreg'd already.  The
replaced rpi value is then used when accessing the rpi bitmask array which
tracks active rpi values.  As the LPFC_RPI_ALLOC_ERROR value is not a valid
index for the bitmask, it may fault the system.

Revise the rpi release code to detect when the rpi value is the replaced
RPI_ALLOC_ERROR value and ignore further release steps.

Link: https://lore.kernel.org/r/20191105005708.7399-2-jsmart2021@gmail.com
Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2020-01-04 19:17:17 +01:00
James Smart 358b37c6c6 scsi: lpfc: Fix unexpected error messages during RSCN handling
[ Upstream commit 2332e6e475 ]

During heavy RCN activity and log_verbose = 0 we see these messages:

  2754 PRLI failure DID:521245 Status:x9/xb2c00, data: x0
  0231 RSCN timeout Data: x0 x3
  0230 Unexpected timeout, hba link state x5

This is due to delayed RSCN activity.

Correct by avoiding the timeout thus the messages by restarting the
discovery timeout whenever an rscn is received.

Filter PRLI responses such that severity depends on whether expected for
the configuration or not. For example, PRLI errors on a fabric will be
informational (they are expected), but Point-to-Point errors are not
necessarily expected so they are raised to an error level.

Link: https://lore.kernel.org/r/20191105005708.7399-5-jsmart2021@gmail.com
Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2020-01-04 19:17:16 +01:00
Bart Van Assche 872f801313 scsi: tracing: Fix handling of TRANSFER LENGTH == 0 for READ(6) and WRITE(6)
[ Upstream commit f6b8540f40 ]

According to SBC-2 a TRANSFER LENGTH field of zero means that 256 logical
blocks must be transferred. Make the SCSI tracing code follow SBC-2.

Fixes: bf81623542 ("[SCSI] add scsi trace core functions and put trace points")
Cc: Christoph Hellwig <hch@lst.de>
Cc: Hannes Reinecke <hare@suse.com>
Cc: Douglas Gilbert <dgilbert@interlog.com>
Link: https://lore.kernel.org/r/20191105215553.185018-1-bvanassche@acm.org
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2020-01-04 19:17:15 +01:00
Luo Jiaxing 266bde856c scsi: hisi_sas: Delete the debugfs folder of hisi_sas when the probe fails
[ Upstream commit cabe7c10c9 ]

Although if the debugfs initialization fails, we will delete the debugfs
folder of hisi_sas, but we did not consider the scenario where debugfs was
successfully initialized, but the probe failed for other reasons. We found
out that hisi_sas folder is still remain after the probe failed.

When probe fail, we should delete debugfs folder to avoid the above issue.

Link: https://lore.kernel.org/r/1571926105-74636-18-git-send-email-john.garry@huawei.com
Signed-off-by: Luo Jiaxing <luojiaxing@huawei.com>
Signed-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2020-01-04 19:16:49 +01:00
Xiang Chen e9eb98caa0 scsi: hisi_sas: Replace in_softirq() check in hisi_sas_task_exec()
[ Upstream commit 550c0d89d5 ]

For IOs from upper layer, preemption may be disabled as it may be called by
function __blk_mq_delay_run_hw_queue which will call get_cpu() (it disables
preemption). So if flags HISI_SAS_REJECT_CMD_BIT is set in function
hisi_sas_task_exec(), it may disable preempt twice after down() and up()
which will cause following call trace:

BUG: scheduling while atomic: fio/60373/0x00000002
Call trace:
dump_backtrace+0x0/0x150
show_stack+0x24/0x30
dump_stack+0xa0/0xc4
__schedule_bug+0x68/0x88
__schedule+0x4b8/0x548
schedule+0x40/0xd0
schedule_timeout+0x200/0x378
__down+0x78/0xc8
down+0x54/0x70
hisi_sas_task_exec.isra.10+0x598/0x8d8 [hisi_sas_main]
hisi_sas_queue_command+0x28/0x38 [hisi_sas_main]
sas_queuecommand+0x168/0x1b0 [libsas]
scsi_queue_rq+0x2ac/0x980
blk_mq_dispatch_rq_list+0xb0/0x550
blk_mq_do_dispatch_sched+0x6c/0x110
blk_mq_sched_dispatch_requests+0x114/0x1d8
__blk_mq_run_hw_queue+0xb8/0x130
__blk_mq_delay_run_hw_queue+0x1c0/0x220
blk_mq_run_hw_queue+0xb0/0x128
blk_mq_sched_insert_requests+0xdc/0x208
blk_mq_flush_plug_list+0x1b4/0x3a0
blk_flush_plug_list+0xdc/0x110
blk_finish_plug+0x3c/0x50
blkdev_direct_IO+0x404/0x550
generic_file_read_iter+0x9c/0x848
blkdev_read_iter+0x50/0x78
aio_read+0xc8/0x170
io_submit_one+0x1fc/0x8d8
__arm64_sys_io_submit+0xdc/0x280
el0_svc_common.constprop.0+0xe0/0x1e0
el0_svc_handler+0x34/0x90
el0_svc+0x10/0x14
...

To solve the issue, check preemptible() to avoid disabling preempt multiple
when flag HISI_SAS_REJECT_CMD_BIT is set.

Link: https://lore.kernel.org/r/1571926105-74636-5-git-send-email-john.garry@huawei.com
Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com>
Signed-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2020-01-04 19:16:48 +01:00
Dan Carpenter e98014e8d3 scsi: csiostor: Don't enable IRQs too early
[ Upstream commit d6c9b31ac3 ]

These are called with IRQs disabled from csio_mgmt_tmo_handler() so we
can't call spin_unlock_irq() or it will enable IRQs prematurely.

Fixes: a3667aaed5 ("[SCSI] csiostor: Chelsio FCoE offload driver")
Link: https://lore.kernel.org/r/20191019085913.GA14245@mwanda
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2020-01-04 19:16:47 +01:00
James Smart 00b111173e scsi: lpfc: Fix SLI3 hba in loop mode not discovering devices
[ Upstream commit feff8b3d84 ]

When operating in private loop mode, PLOGI exchanges are racing and the
driver tries to abort it's PLOGI. But the PLOGI abort ends up terminating
the login with the other end causing the other end to abort its PLOGI as
well. Discovery never fully completes.

Fix by disabling the PLOGI abort when private loop and letting the state
machine play out.

Link: https://lore.kernel.org/r/20191018211832.7917-5-jsmart2021@gmail.com
Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2020-01-04 19:16:45 +01:00
James Smart 41f66da6d4 scsi: lpfc: Fix hardlockup in lpfc_abort_handler
[ Upstream commit 91a52b617c ]

In lpfc_abort_handler, the lock acquire order is hbalock (irqsave),
buf_lock (irq) and ring_lock (irq).  The issue is that in two places the
locks are released out of order - the buf_lock and the hbalock - resulting
in the cpu preemption/lock flags getting restored out of order and
deadlocking the cpu.

Fix the unlock order by fully releasing the hbalocks as well.

CC: Zhangguanghui <zhang.guanghui@h3c.com>
Link: https://lore.kernel.org/r/20191018211832.7917-7-jsmart2021@gmail.com
Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2020-01-04 19:16:44 +01:00
James Smart 03d0de2da8 scsi: lpfc: Fix list corruption in lpfc_sli_get_iocbq
[ Upstream commit 15498dc1a5 ]

After study, it was determined there was a double free of a CT iocb during
execution of lpfc_offline_prep and lpfc_offline.  The prep routine issued
an abort for some CT iocbs, but the aborts did not complete fast enough for
a subsequent routine that waits for completion. Thus the driver proceeded
to lpfc_offline, which releases any pending iocbs. Unfortunately, the
completions for the aborts were then received which re-released the ct
iocbs.

Turns out the issue for why the aborts didn't complete fast enough was not
their time on the wire/in the adapter. It was the lpfc_work_done routine,
which requires the adapter state to be UP before it calls
lpfc_sli_handle_slow_ring_event() to process the completions. The issue is
the prep routine takes the link down as part of it's processing.

To fix, the following was performed:

 - Prevent the offline routine from releasing iocbs that have had aborts
   issued on them. Defer to the abort completions. Also means the driver
   fully waits for the completions.  Given this change, the recognition of
   "driver-generated" status which then releases the iocb is no longer
   valid. As such, the change made in the commit 296012285c is reverted.
   As recognition of "driver-generated" status is no longer valid, this
   patch reverts the changes made in
   commit 296012285c ("scsi: lpfc: Fix leak of ELS completions on adapter reset")

 - Modify lpfc_work_done to allow slow path completions so that the abort
   completions aren't ignored.

 - Updated the fdmi path to recognize a CT request that fails due to the
   port being unusable. This stops FDMI retries. FDMI will be restarted on
   next link up.

Link: https://lore.kernel.org/r/20190922035906.10977-14-jsmart2021@gmail.com
Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2020-01-04 19:16:28 +01:00
Sreekanth Reddy 08f9773f3d scsi: mpt3sas: Reject NVMe Encap cmnds to unsupported HBA
[ Upstream commit 77fd4f2c88 ]

If any faulty application issues an NVMe Encapsulated commands to HBA which
doesn't support NVMe protocol then driver should return the command as
invalid with the following message.

"HBA doesn't support NVMe. Rejecting NVMe Encapsulated request."

Otherwise below page fault kernel panic will be observed while building the
PRPs as there is no PRP pools allocated for the HBA which doesn't support
NVMe drives.

RIP: 0010:_base_build_nvme_prp+0x3b/0xf0 [mpt3sas]
Call Trace:
 _ctl_do_mpt_command+0x931/0x1120 [mpt3sas]
 _ctl_ioctl_main.isra.11+0xa28/0x11e0 [mpt3sas]
 ? prepare_to_wait+0xb0/0xb0
 ? tty_ldisc_deref+0x16/0x20
 _ctl_ioctl+0x1a/0x20 [mpt3sas]
 do_vfs_ioctl+0xaa/0x620
 ? vfs_read+0x117/0x140
 ksys_ioctl+0x67/0x90
 __x64_sys_ioctl+0x1a/0x20
 do_syscall_64+0x60/0x190
 entry_SYSCALL_64_after_hwframe+0x44/0xa9

[mkp: tweaked error string]

Link: https://lore.kernel.org/r/1568379890-18347-12-git-send-email-sreekanth.reddy@broadcom.com
Signed-off-by: Sreekanth Reddy <sreekanth.reddy@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2020-01-04 19:16:26 +01:00
James Smart a51f92387f scsi: lpfc: Fix locking on mailbox command completion
[ Upstream commit 07b8582430 ]

Symptoms were seen of the driver not having valid data for mailbox
commands. After debugging, the following sequence was found:

The driver maintains a port-wide pointer of the mailbox command that is
currently in execution. Once finished, the port-wide pointer is cleared
(done in lpfc_sli4_mq_release()). The next mailbox command issued will set
the next pointer and so on.

The mailbox response data is only copied if there is a valid port-wide
pointer.

In the failing case, it was seen that a new mailbox command was being
attempted in parallel with the completion.  The parallel path was seeing
the mailbox no long in use (flag check under lock) and thus set the port
pointer.  The completion path had cleared the active flag under lock, but
had not touched the port pointer.  The port pointer is cleared after the
lock is released. In this case, the completion path cleared the just-set
value by the parallel path.

Fix by making the calls that clear mbox state/port pointer while under
lock.  Also slightly cleaned up the error path.

Link: https://lore.kernel.org/r/20190922035906.10977-8-jsmart2021@gmail.com
Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2020-01-04 19:16:24 +01:00
Sreekanth Reddy dc1f146054 scsi: mpt3sas: Fix clear pending bit in ioctl status
[ Upstream commit 782b281883 ]

When user issues diag register command from application with required size,
and if driver unable to allocate the memory, then it will fail the register
command. While failing the register command, driver is not currently
clearing MPT3_CMD_PENDING bit in ctl_cmds.status variable which was set
before trying to allocate the memory. As this bit is set, subsequent
register command will be failed with BUSY status even when user wants to
register the trace buffer will less memory.

Clear MPT3_CMD_PENDING bit in ctl_cmds.status before returning the diag
register command with no memory status.

Link: https://lore.kernel.org/r/1568379890-18347-4-git-send-email-sreekanth.reddy@broadcom.com
Signed-off-by: Sreekanth Reddy <sreekanth.reddy@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2020-01-04 19:16:24 +01:00
James Smart fe35d5a4fa scsi: lpfc: Fix discovery failures when target device connectivity bounces
[ Upstream commit 3f97aed611 ]

An issue was seen discovering all SCSI Luns when a target device undergoes
link bounce.

The driver currently does not qualify the FC4 support on the target.
Therefore it will send a SCSI PRLI and an NVMe PRLI. The expectation is
that the target will reject the PRLI if it is not supported. If a PRLI
times out, the driver will retry. The driver will not proceed with the
device until both SCSI and NVMe PRLIs are resolved.  In the failure case,
the device is FCP only and does not respond to the NVMe PRLI, thus
initiating the wait/retry loop in the driver.  During that time, a RSCN is
received (device bounced) causing the driver to issue a GID_FT.  The GID_FT
response comes back before the PRLI mess is resolved and it prematurely
cancels the PRLI retry logic and leaves the device in a STE_PRLI_ISSUE
state. Discovery with the target never completes or resets.

Fix by resetting the node state back to STE_NPR_NODE when GID_FT completes,
thereby restarting the discovery process for the node.

Link: https://lore.kernel.org/r/20190922035906.10977-10-jsmart2021@gmail.com
Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2020-01-04 19:16:23 +01:00
James Smart 45331ad469 scsi: lpfc: Fix spinlock_irq issues in lpfc_els_flush_cmd()
[ Upstream commit d38b4a527f ]

While reviewing the CT behavior, issues with spinlock_irq were seen. The
driver should be using spinlock_irqsave/irqrestore in the els flush
routine.

Changed to spinlock_irqsave/irqrestore.

Link: https://lore.kernel.org/r/20190922035906.10977-15-jsmart2021@gmail.com
Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2020-01-04 19:16:22 +01:00
Michael Hernandez 2dfbb6448c scsi: qla2xxx: Fix incorrect SFUB length used for Secure Flash Update MB Cmd
commit c868907e1a upstream.

SFUB length should be in DWORDs when passed to FW.

Fixes: 3f006ac342 ("scsi: qla2xxx: Secure flash update support for ISP28XX")
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/r/20191203223657.22109-4-hmadhani@marvell.com
Signed-off-by: Michael Hernandez <mhernandez@marvell.com>
Signed-off-by: Himanshu Madhani <hmadhani@marvell.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-12-21 11:05:10 +01:00
Himanshu Madhani 69b0e7e76a scsi: qla2xxx: Correctly retrieve and interpret active flash region
commit 4e71dcae0c upstream.

ISP27XX/28XX supports multiple flash regions. This patch fixes issue where
active flash region was not interpreted correctly during secure flash
update process.

[mkp: typo]

Fixes: 5fa8774c7f ("scsi: qla2xxx: Add 28xx flash primary/secondary status/image mechanism")
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/r/20191203223657.22109-2-hmadhani@marvell.com
Signed-off-by: Michael Hernandez <mhernandez@marvell.com>
Signed-off-by: Himanshu Madhani <hmadhani@marvell.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-12-21 11:05:09 +01:00
Roman Bolshakov a82545b62e scsi: qla2xxx: Change discovery state before PLOGI
commit 58e39a2ce4 upstream.

When a port sends PLOGI, discovery state should be changed to login
pending, otherwise RELOGIN_NEEDED bit is set in
qla24xx_handle_plogi_done_event(). RELOGIN_NEEDED triggers another PLOGI,
and it never goes out of the loop until login timer expires.

Fixes: 8777e4314d ("scsi: qla2xxx: Migrate NVME N2N handling into state machine")
Fixes: 8b5292bcfc ("scsi: qla2xxx: Fix Relogin to prevent modifying scan_state flag")
Cc: Quinn Tran <qutran@marvell.com>
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/r/20191125165702.1013-6-r.bolshakov@yadro.com
Acked-by: Himanshu Madhani <hmadhani@marvell.com>
Reviewed-by: Hannes Reinecke <hare@suse.de>
Tested-by: Hannes Reinecke <hare@suse.de>
Signed-off-by: Roman Bolshakov <r.bolshakov@yadro.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-12-21 11:05:08 +01:00
Michael Hernandez f9daaba804 scsi: qla2xxx: Added support for MPI and PEP regions for ISP28XX
commit a530bf691f upstream.

This patch adds support for MPI/PEP region updates which is required with
secure flash updates for ISP28XX.

Fixes: 3f006ac342 ("scsi: qla2xxx: Secure flash update support for ISP28XX")
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/r/20191203223657.22109-3-hmadhani@marvell.com
Signed-off-by: Michael Hernandez <mhernandez@marvell.com>
Signed-off-by: Himanshu Madhani <hmadhani@marvell.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-12-21 11:05:07 +01:00
Roman Bolshakov 09f401b656 scsi: qla2xxx: Initialize free_work before flushing it
commit 4c86b037a6 upstream.

Target creation triggers a new BUG_ON introduced in in commit 4d43d395fe
("workqueue: Try to catch flush_work() without INIT_WORK().").  The BUG_ON
reveals an attempt to flush free_work in qla24xx_do_nack_work before it's
initialized in qlt_unreg_sess:

  WARNING: CPU: 7 PID: 211 at kernel/workqueue.c:3031 __flush_work.isra.38+0x40/0x2e0
  CPU: 7 PID: 211 Comm: kworker/7:1 Kdump: loaded Tainted: G            E     5.3.0-rc7-vanilla+ #2
  Workqueue: qla2xxx_wq qla2x00_iocb_work_fn [qla2xxx]
  NIP:  c000000000159620 LR: c0080000009d91b0 CTR: c0000000001598c0
  REGS: c000000005f3f730 TRAP: 0700   Tainted: G            E      (5.3.0-rc7-vanilla+)
  MSR:  800000000282b033 <SF,VEC,VSX,EE,FP,ME,IR,DR,RI,LE>  CR: 24002222  XER: 00000000
  CFAR: c0000000001598d0 IRQMASK: 0
  GPR00: c0080000009d91b0 c000000005f3f9c0 c000000001670a00 c0000003f8655ca8
  GPR04: c0000003f8655c00 000000000000ffff 0000000000000011 ffffffffffffffff
  GPR08: c008000000949228 0000000000000000 0000000000000001 c0080000009e7780
  GPR12: 0000000000002200 c00000003fff6200 c000000000161bc8 0000000000000004
  GPR16: c0000003f9d68280 0000000002000000 0000000000000005 0000000000000003
  GPR20: 0000000000000002 000000000000ffff 0000000000000000 fffffffffffffef7
  GPR24: c000000004f73848 c000000004f73838 c000000004f73f28 c000000005f3fb60
  GPR28: c000000004f73e48 c000000004f73c80 c000000004f73818 c0000003f9d68280
  NIP [c000000000159620] __flush_work.isra.38+0x40/0x2e0
  LR [c0080000009d91b0] qla24xx_do_nack_work+0x88/0x180 [qla2xxx]
  Call Trace:
  [c000000005f3f9c0] [c000000000159644] __flush_work.isra.38+0x64/0x2e0 (unreliable)
  [c000000005f3fa50] [c0080000009d91a0] qla24xx_do_nack_work+0x78/0x180 [qla2xxx]
  [c000000005f3fae0] [c0080000009496ec] qla2x00_do_work+0x604/0xb90 [qla2xxx]
  [c000000005f3fc40] [c008000000949cd8] qla2x00_iocb_work_fn+0x60/0xe0 [qla2xxx]
  [c000000005f3fc80] [c000000000157bb8] process_one_work+0x2c8/0x5b0
  [c000000005f3fd10] [c000000000157f28] worker_thread+0x88/0x660
  [c000000005f3fdb0] [c000000000161d64] kthread+0x1a4/0x1b0
  [c000000005f3fe20] [c00000000000b960] ret_from_kernel_thread+0x5c/0x7c
  Instruction dump:
  3d22001d 892966b1 7d908026 91810008 f821ff71 69290001 0b090000 2e290000
  40920200 e9230018 7d2a0074 794ad182 <0b0a0000> 2fa90000 419e01e8 7c0802a6
  ---[ end trace 5ccf335d4f90fcb8 ]---

Fixes: 1021f0bc2f ("scsi: qla2xxx: allow session delete to finish before create.")
Cc: Quinn Tran <qutran@marvell.com>
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/r/20191125165702.1013-4-r.bolshakov@yadro.com
Acked-by: Himanshu Madhani <hmadhani@marvell.com>
Reviewed-by: Hannes Reinecke <hare@suse.de>
Tested-by: Hannes Reinecke <hare@suse.de>
Reviewed-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: Roman Bolshakov <r.bolshakov@yadro.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-12-21 11:05:07 +01:00
Roman Bolshakov fe6e4d041c scsi: qla2xxx: Ignore NULL pointer in tcm_qla2xxx_free_mcmd
commit f2c9ee54a5 upstream.

If ABTS cannot be completed in target mode, the driver attempts to free
related management command and crashes:

  NIP [d000000019181ee8] tcm_qla2xxx_free_mcmd+0x40/0x80 [tcm_qla2xxx]
  LR [d00000001dc1e6f8] qlt_response_pkt+0x190/0xa10 [qla2xxx]
  Call Trace:
  [c000003fff27bb50] [c000003fff27bc10] 0xc000003fff27bc10 (unreliable)
  [c000003fff27bb70] [d00000001dc1e6f8] qlt_response_pkt+0x190/0xa10 [qla2xxx]
  [c000003fff27bc10] [d00000001dbc2be0] qla24xx_process_response_queue+0x5d8/0xbd0 [qla2xxx]
  [c000003fff27bd50] [d00000001dbc632c] qla24xx_msix_rsp_q+0x64/0x150 [qla2xxx]
  [c000003fff27bde0] [c000000000187200] __handle_irq_event_percpu+0x90/0x310
  [c000003fff27bea0] [c0000000001874b8] handle_irq_event_percpu+0x38/0x90
  [c000003fff27bee0] [c000000000187574] handle_irq_event+0x64/0xb0
  [c000003fff27bf10] [c00000000018cd38] handle_fasteoi_irq+0xe8/0x280
  [c000003fff27bf40] [c000000000185ccc] generic_handle_irq+0x4c/0x70
  [c000003fff27bf60] [c000000000016cec] __do_irq+0x7c/0x1d0
  [c000003fff27bf90] [c00000000002a530] call_do_irq+0x14/0x24
  [c00000207d2cba90] [c000000000016edc] do_IRQ+0x9c/0x130
  [c00000207d2cbae0] [c000000000008bf4] hardware_interrupt_common+0x114/0x120
  --- interrupt: 501 at arch_local_irq_restore+0x74/0x90
      LR = arch_local_irq_restore+0x74/0x90
  [c00000207d2cbdd0] [c0000000001c64fc] tick_broadcast_oneshot_control+0x4c/0x60 (unreliable)
  [c00000207d2cbdf0] [c0000000007ac840] cpuidle_enter_state+0xf0/0x450
  [c00000207d2cbe50] [c00000000016b81c] call_cpuidle+0x4c/0x90
  [c00000207d2cbe70] [c00000000016bc30] do_idle+0x2b0/0x330
  [c00000207d2cbec0] [c00000000016beec] cpu_startup_entry+0x3c/0x50
  [c00000207d2cbef0] [c00000000004a06c] start_secondary+0x63c/0x670
  [c00000207d2cbf90] [c00000000000aa6c] start_secondary_prolog+0x10/0x14

The crash can be triggered by ACL deletion when there's active I/O.

During ACL deletion, qla2xxx performs implicit LOGO that's invisible for
the initiator. Only the driver and firmware are aware of the logout.
Therefore the initiator continues to send SCSI commands and the target
always responds with SAM STATUS BUSY as it can't find the session.

The command times out after a while and initiator invokes ABORT TASK TMF
for the command. The TMF is mapped to ABTS-LS in FCP. The target can't find
session for S_ID originating ABTS-LS so it never allocates mcmd.  And since
N_Port handle was deleted after LOGO, it is no longer valid and ABTS
Response IOCB is returned from firmware with status 31. Then free_mcmd is
invoked on NULL pointer and the kernel crashes.

[ 7734.578642] qla2xxx [0000:00:0c.0]-e837:6: ABTS_RECV_24XX: instance 0
[ 7734.578644] qla2xxx [0000:00:0c.0]-f811:6: qla_target(0): task abort (s_id=1:2:0, tag=1209504, param=0)
[ 7734.578645] find_sess_by_s_id: 0x010200
[ 7734.578645] Unable to locate s_id: 0x010200
[ 7734.578646] qla2xxx [0000:00:0c.0]-f812:6: qla_target(0): task abort for non-existent session
[ 7734.578648] qla2xxx [0000:00:0c.0]-e806:6: Sending task mgmt ABTS response (ha=c0000000d5819000, atio=c0000000d3fd4700, status=4
[ 7734.578730] qla2xxx [0000:00:0c.0]-e838:6: ABTS_RESP_24XX: compl_status 31
[ 7734.578732] qla2xxx [0000:00:0c.0]-e863:6: qla_target(0): ABTS_RESP_24XX failed 31 (subcode 19:a)
[ 7734.578740] Unable to handle kernel paging request for data at address 0x00000200

Fixes: 6b0431d6fa ("scsi: qla2xxx: Fix out of order Termination and ABTS response")
Cc: Quinn Tran <qutran@marvell.com>
Cc: Bart Van Assche <bvanassche@acm.org>
Cc: Thomas Abraham <tabraham@suse.com>
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/r/20191125165702.1013-2-r.bolshakov@yadro.com
Acked-by: Himanshu Madhani <hmadhani@marvell.com>
Reviewed-by: Hannes Reinecke <hare@suse.de>
Tested-by: Hannes Reinecke <hare@suse.de>
Reviewed-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: Roman Bolshakov <r.bolshakov@yadro.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-12-21 11:05:05 +01:00
Bart Van Assche 80dfdacecf scsi: iscsi: Fix a potential deadlock in the timeout handler
commit 5480e299b5 upstream.

Some time ago the block layer was modified such that timeout handlers are
called from thread context instead of interrupt context. Make it safe to
run the iSCSI timeout handler in thread context. This patch fixes the
following lockdep complaint:

================================
WARNING: inconsistent lock state
5.5.1-dbg+ #11 Not tainted
--------------------------------
inconsistent {IN-SOFTIRQ-W} -> {SOFTIRQ-ON-W} usage.
kworker/7:1H/206 [HC0[0]:SC0[0]:HE1:SE1] takes:
ffff88802d9827e8 (&(&session->frwd_lock)->rlock){+.?.}, at: iscsi_eh_cmd_timed_out+0xa6/0x6d0 [libiscsi]
{IN-SOFTIRQ-W} state was registered at:
  lock_acquire+0x106/0x240
  _raw_spin_lock+0x38/0x50
  iscsi_check_transport_timeouts+0x3e/0x210 [libiscsi]
  call_timer_fn+0x132/0x470
  __run_timers.part.0+0x39f/0x5b0
  run_timer_softirq+0x63/0xc0
  __do_softirq+0x12d/0x5fd
  irq_exit+0xb3/0x110
  smp_apic_timer_interrupt+0x131/0x3d0
  apic_timer_interrupt+0xf/0x20
  default_idle+0x31/0x230
  arch_cpu_idle+0x13/0x20
  default_idle_call+0x53/0x60
  do_idle+0x38a/0x3f0
  cpu_startup_entry+0x24/0x30
  start_secondary+0x222/0x290
  secondary_startup_64+0xa4/0xb0
irq event stamp: 1383705
hardirqs last  enabled at (1383705): [<ffffffff81aace5c>] _raw_spin_unlock_irq+0x2c/0x50
hardirqs last disabled at (1383704): [<ffffffff81aacb98>] _raw_spin_lock_irq+0x18/0x50
softirqs last  enabled at (1383690): [<ffffffffa0e2efea>] iscsi_queuecommand+0x76a/0xa20 [libiscsi]
softirqs last disabled at (1383682): [<ffffffffa0e2e998>] iscsi_queuecommand+0x118/0xa20 [libiscsi]

other info that might help us debug this:
 Possible unsafe locking scenario:

       CPU0
       ----
  lock(&(&session->frwd_lock)->rlock);
  <Interrupt>
    lock(&(&session->frwd_lock)->rlock);

 *** DEADLOCK ***

2 locks held by kworker/7:1H/206:
 #0: ffff8880d57bf928 ((wq_completion)kblockd){+.+.}, at: process_one_work+0x472/0xab0
 #1: ffff88802b9c7de8 ((work_completion)(&q->timeout_work)){+.+.}, at: process_one_work+0x476/0xab0

stack backtrace:
CPU: 7 PID: 206 Comm: kworker/7:1H Not tainted 5.5.1-dbg+ #11
Hardware name: Bochs Bochs, BIOS Bochs 01/01/2011
Workqueue: kblockd blk_mq_timeout_work
Call Trace:
 dump_stack+0xa5/0xe6
 print_usage_bug.cold+0x232/0x23b
 mark_lock+0x8dc/0xa70
 __lock_acquire+0xcea/0x2af0
 lock_acquire+0x106/0x240
 _raw_spin_lock+0x38/0x50
 iscsi_eh_cmd_timed_out+0xa6/0x6d0 [libiscsi]
 scsi_times_out+0xf4/0x440 [scsi_mod]
 scsi_timeout+0x1d/0x20 [scsi_mod]
 blk_mq_check_expired+0x365/0x3a0
 bt_iter+0xd6/0xf0
 blk_mq_queue_tag_busy_iter+0x3de/0x650
 blk_mq_timeout_work+0x1af/0x380
 process_one_work+0x56d/0xab0
 worker_thread+0x7a/0x5d0
 kthread+0x1bc/0x210
 ret_from_fork+0x24/0x30

Fixes: 287922eb0b ("block: defer timeouts to a workqueue")
Cc: Christoph Hellwig <hch@lst.de>
Cc: Keith Busch <keith.busch@intel.com>
Cc: Lee Duncan <lduncan@suse.com>
Cc: Chris Leech <cleech@redhat.com>
Cc: <stable@vger.kernel.org>
Link: https://lore.kernel.org/r/20191209173457.187370-1-bvanassche@acm.org
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
Reviewed-by: Lee Duncan <lduncan@suse.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-12-21 11:05:04 +01:00
sheebab f306f06d9a scsi: ufs: Disable autohibern8 feature in Cadence UFS
commit d168001d14 upstream.

This patch disables autohibern8 feature in Cadence UFS.  The autohibern8
feature has issues due to which unexpected interrupt trigger is happening.
After the interrupt issue is sorted out, autohibern8 feature will be
re-enabled

Link: https://lore.kernel.org/r/1575367635-22662-1-git-send-email-sheebab@cadence.com
Cc: <stable@vger.kernel.org>
Signed-off-by: sheebab <sheebab@cadence.com>
Reviewed-by: Alim Akhtar <alim.akhtar@samsung.com>
Tested-by: Vignesh Raghavendra <vigneshr@ti.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-12-21 11:05:03 +01:00
Martin K. Petersen 44120fd4fd Revert "scsi: qla2xxx: Fix memory leak when sending I/O fails"
[ Upstream commit 5a993e507e ]

This reverts commit 2f856d4e8c.

This patch was found to introduce a double free regression. The issue
it originally attempted to address was fixed in patch
f45bca8c50 ("scsi: qla2xxx: Fix double scsi_done for abort path").

Link: https://lore.kernel.org/r/4BDE2B95-835F-43BE-A32C-2629D7E03E0A@marvell.com
Requested-by: Himanshu Madhani <hmadhani@marvell.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2019-12-17 19:56:45 +01:00
Bart Van Assche 26c9d7b181 scsi: qla2xxx: Fix a dma_pool_free() call
[ Upstream commit 162b805e38 ]

This patch fixes the following kernel warning:

DMA-API: qla2xxx 0000:00:0a.0: device driver frees DMA memory with different size [device address=0x00000000c7b60000] [map size=4088 bytes] [unmap size=512 bytes]
WARNING: CPU: 3 PID: 1122 at kernel/dma/debug.c:1021 check_unmap+0x4d0/0xbd0
CPU: 3 PID: 1122 Comm: rmmod Tainted: G           O      5.4.0-rc1-dbg+ #1
RIP: 0010:check_unmap+0x4d0/0xbd0
Call Trace:
 debug_dma_free_coherent+0x123/0x173
 dma_free_attrs+0x76/0xe0
 qla2x00_mem_free+0x329/0xc40 [qla2xxx_scst]
 qla2x00_free_device+0x170/0x1c0 [qla2xxx_scst]
 qla2x00_remove_one+0x4f0/0x6d0 [qla2xxx_scst]
 pci_device_remove+0xd5/0x1f0
 device_release_driver_internal+0x159/0x280
 driver_detach+0x8b/0xf2
 bus_remove_driver+0x9a/0x15a
 driver_unregister+0x51/0x70
 pci_unregister_driver+0x2d/0x130
 qla2x00_module_exit+0x1c/0xbc [qla2xxx_scst]
 __x64_sys_delete_module+0x22a/0x300
 do_syscall_64+0x6f/0x2e0
 entry_SYSCALL_64_after_hwframe+0x49/0xbe

Fixes: 3f006ac342 ("scsi: qla2xxx: Secure flash update support for ISP28XX") # v5.2-rc1~130^2~270.
Cc: Michael Hernandez <mhernandez@marvell.com>
Cc: Himanshu Madhani <hmadhani@marvell.com>
Link: https://lore.kernel.org/r/20191106044226.5207-3-bvanassche@acm.org
Reviewed-by: Martin Wilck <mwilck@suse.com>
Acked-by: Himanshu Madhani <hmadhani@marvell.com>
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2019-12-17 19:56:45 +01:00
Quinn Tran dea6ee7173 scsi: qla2xxx: Fix SRB leak on switch command timeout
[ Upstream commit af2a0c51b1 ]

when GPSC/GPDB switch command fails, driver just returns without doing a
proper cleanup. This patch fixes this memory leak by calling sp->free() in
the error path.

Link: https://lore.kernel.org/r/20191105150657.8092-4-hmadhani@marvell.com
Reviewed-by: Ewan D. Milne <emilne@redhat.com>
Signed-off-by: Quinn Tran <qutran@marvell.com>
Signed-off-by: Himanshu Madhani <hmadhani@marvell.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2019-12-17 19:56:44 +01:00
Arun Easi 402f719831 scsi: qla2xxx: Fix memory leak when sending I/O fails
commit 2f856d4e8c upstream.

On heavy loads, a memory leak of the srb_t structure is observed.  This
would make the qla2xxx_srbs cache gobble up memory.

Fixes: 219d27d714 ("scsi: qla2xxx: Fix race conditions in the code for aborting SCSI commands")
Cc: stable@vger.kernel.org # 5.2
Link: https://lore.kernel.org/r/20191105150657.8092-7-hmadhani@marvell.com
Reviewed-by: Ewan D. Milne <emilne@redhat.com>
Signed-off-by: Arun Easi <aeasi@marvell.com>
Signed-off-by: Himanshu Madhani <hmadhani@marvell.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-12-17 19:55:30 +01:00
Quinn Tran 31c1f45520 scsi: qla2xxx: Fix double scsi_done for abort path
commit f45bca8c50 upstream.

Current code assumes abort will remove the original command from the active
list where scsi_done will not be called. Instead, the eh_abort thread will
do the scsi_done. That is not the case.  Instead, we have a double
scsi_done calls triggering use after free.

Abort will tell FW to release the command from FW possesion. The original
command will return to ULP with error in its normal fashion via scsi_done.
eh_abort path would wait for the original command completion before
returning.  eh_abort path will not perform the scsi_done call.

Fixes: 219d27d714 ("scsi: qla2xxx: Fix race conditions in the code for aborting SCSI commands")
Cc: stable@vger.kernel.org # 5.2
Link: https://lore.kernel.org/r/20191105150657.8092-6-hmadhani@marvell.com
Reviewed-by: Ewan D. Milne <emilne@redhat.com>
Signed-off-by: Quinn Tran <qutran@marvell.com>
Signed-off-by: Arun Easi <aeasi@marvell.com>
Signed-off-by: Himanshu Madhani <hmadhani@marvell.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-12-17 19:55:29 +01:00
Quinn Tran 5cb5b67480 scsi: qla2xxx: Fix driver unload hang
commit dd322b7f3e upstream.

This patch fixes driver unload hang by removing msleep()

Fixes: d74595278f ("scsi: qla2xxx: Add multiple queue pair functionality.")
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/r/20191105150657.8092-5-hmadhani@marvell.com
Reviewed-by: Ewan D. Milne <emilne@redhat.com>
Signed-off-by: Quinn Tran <qutran@marvell.com>
Signed-off-by: Himanshu Madhani <hmadhani@marvell.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-12-17 19:55:28 +01:00
Quinn Tran b7abcc7df5 scsi: qla2xxx: Do command completion on abort timeout
commit 71c80b75ce upstream.

On switch, fabric and mgt command timeout, driver send Abort to tell FW to
return the original command.  If abort is timeout, then return both Abort
and original command for cleanup.

Fixes: 219d27d714 ("scsi: qla2xxx: Fix race conditions in the code for aborting SCSI commands")
Cc: stable@vger.kernel.org # 5.2
Link: https://lore.kernel.org/r/20191105150657.8092-3-hmadhani@marvell.com
Reviewed-by: Ewan D. Milne <emilne@redhat.com>
Signed-off-by: Quinn Tran <qutran@marvell.com>
Signed-off-by: Himanshu Madhani <hmadhani@marvell.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-12-17 19:55:27 +01:00
James Smart 64c8e5afcb scsi: lpfc: Fix bad ndlp ptr in xri aborted handling
commit 324e1c4020 upstream.

In cases where I/O may be aborted, such as driver unload or link bounces,
the system will crash based on a bad ndlp pointer.

Example:
  RIP: 0010:lpfc_sli4_abts_err_handler+0x15/0x140 [lpfc]
  ...
  lpfc_sli4_io_xri_aborted+0x20d/0x270 [lpfc]
  lpfc_sli4_sp_handle_abort_xri_wcqe.isra.54+0x84/0x170 [lpfc]
  lpfc_sli4_fp_handle_cqe+0xc2/0x480 [lpfc]
  __lpfc_sli4_process_cq+0xc6/0x230 [lpfc]
  __lpfc_sli4_hba_process_cq+0x29/0xc0 [lpfc]
  process_one_work+0x14c/0x390

Crash was caused by a bad ndlp address passed to I/O indicated by the XRI
aborted CQE.  The address was not NULL so the routine deferenced the ndlp
ptr. The bad ndlp also caused the lpfc_sli4_io_xri_aborted to call an
erroneous io handler.  Root cause for the bad ndlp was an lpfc_ncmd that
was aborted, put on the abort_io list, completed, taken off the abort_io
list, sent to lpfc_release_nvme_buf where it was put back on the abort_io
list because the lpfc_ncmd->flags setting LPFC_SBUF_XBUSY was not cleared
on the final completion.

Rework the exchange busy handling to ensure the flags are properly set for
both scsi and nvme.

Fixes: c490850a09 ("scsi: lpfc: Adapt partitioned XRI lists to efficient sharing")
Cc: <stable@vger.kernel.org> # v5.1+
Link: https://lore.kernel.org/r/20191018211832.7917-6-jsmart2021@gmail.com
Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-12-17 19:55:26 +01:00
Linus Torvalds 72d5ac679e SCSI fixes on 20191111
Three small changes: two in the core and one in the qla2xxx
 driver. The sg_tablesize fix affects a thinko in the migration to
 blk-mq of certain legacy drivers which could cause an oops and the sd
 core change should only affect zoned block devices which were wrongly
 suppressing error messages for reset all zones.
 
 Signed-off-by: James E.J. Bottomley <jejb@linux.ibm.com>
 -----BEGIN PGP SIGNATURE-----
 
 iJwEABMIAEQWIQTnYEDbdso9F2cI+arnQslM7pishQUCXcmURyYcamFtZXMuYm90
 dG9tbGV5QGhhbnNlbnBhcnRuZXJzaGlwLmNvbQAKCRDnQslM7pishV63APoCnP9P
 kJP1Bp1fd7f91FrFaxY7sKH9VZqbioUtwUhE9AD/f0o5/gg/5jSIM90GKXdZYVpt
 KIsIaQzVOPL3K7EaFDs=
 =77IM
 -----END PGP SIGNATURE-----

Merge tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi

Pull SCSI fixes from James Bottomley:
 "Three small changes: two in the core and one in the qla2xxx driver.

  The sg_tablesize fix affects a thinko in the migration to blk-mq of
  certain legacy drivers which could cause an oops and the sd core
  change should only affect zoned block devices which were wrongly
  suppressing error messages for reset all zones"

* tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi:
  scsi: core: Handle drivers which set sg_tablesize to zero
  scsi: qla2xxx: fix NPIV tear down process
  scsi: sd_zbc: Fix sd_zbc_complete()
2019-11-11 09:14:36 -08:00
Michael Schmitz 9393c8de62 scsi: core: Handle drivers which set sg_tablesize to zero
In scsi_mq_setup_tags(), cmd_size is calculated based on zero size for the
scatter-gather list in case the low level driver uses SG_NONE in its host
template.

cmd_size is passed on to the block layer for calculation of the request
size, and we've seen NULL pointer dereference errors from the block layer
in drivers where SG_NONE is used and a mq IO scheduler is active,
apparently as a consequence of this (see commit 68ab2d76e4 ("scsi:
cxlflash: Set sg_tablesize to 1 instead of SG_NONE"), and a recent patch by
Finn Thain converting the three m68k NFR5380 drivers to avoid setting
SG_NONE).

Try to avoid these errors by accounting for at least one sg list entry when
calculating cmd_size, regardless of whether the low level driver set a zero
sg_tablesize.

Tested on 030 m68k with the atari_scsi driver - setting sg_tablesize to
SG_NONE no longer results in a crash when loading this driver.

CC: Finn Thain <fthain@telegraphics.com.au>
Link: https://lore.kernel.org/r/1572922150-4358-1-git-send-email-schmitzmic@gmail.com
Signed-off-by: Michael Schmitz <schmitzmic@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-11-06 00:44:34 -05:00
Martin Wilck 8b1062d513 scsi: qla2xxx: fix NPIV tear down process
Fix two issues with commit f5187b7d1a ("scsi: qla2xxx: Optimize NPIV
tear down process"): a missing negation in a wait_event_timeout()
condition, and a missing loop end condition.

Fixes: f5187b7d1a ("scsi: qla2xxx: Optimize NPIV tear down process")
Link: https://lore.kernel.org/r/20191105145550.10268-1-martin.wilck@suse.com
Signed-off-by: Martin Wilck <mwilck@suse.com>
Acked-by: Himanshu Madhani <hmadhani@marvell.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-11-05 23:55:31 -05:00
Damien Le Moal edc1f5432f scsi: sd_zbc: Fix sd_zbc_complete()
The ILLEGAL REQUEST/INVALID FIELD IN CDB error generated by an attempt to
reset a conventional zone does not apply to the reset write pointer command
with the ALL bit set, that is, to REQ_OP_ZONE_RESET_ALL requests. Fix
sd_zbc_complete() to be quiet only in the case of REQ_OP_ZONE_RESET,
excluding REQ_OP_ZONE_RESET_ALL.

Since REQ_OP_ZONE_RESET is the only request handled by sd_zbc_complete(),
also simplify the code using a simple if statement.

[mkp: applied by hand]

Fixes: d81e9d4943 ("scsi: implement REQ_OP_ZONE_RESET_ALL")
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/r/20191027140549.26272-4-damien.lemoal@wdc.com
Signed-off-by: Damien Le Moal <damien.lemoal@wdc.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-11-05 23:17:53 -05:00
Linus Torvalds f83e148a41 SCSI fixes on 20191101
Nine changes, eight in drivers [ufs, target, lpfc x 2, qla2xxx x 4]
 and one core change in sd that fixes an I/O failure on DIF type 3
 devices.
 
 Signed-off-by: James E.J. Bottomley <jejb@linux.ibm.com>
 -----BEGIN PGP SIGNATURE-----
 
 iJwEABMIAEQWIQTnYEDbdso9F2cI+arnQslM7pishQUCXbzO+iYcamFtZXMuYm90
 dG9tbGV5QGhhbnNlbnBhcnRuZXJzaGlwLmNvbQAKCRDnQslM7pishYOpAP9/BCSY
 2TAFlli2rVQe+ZNjhHcE4Gj92HNPO7ZgvDQvWgD9F184tjG+1pntYGFutoso7Ak6
 QimtBw4AuYg9eDKJDKU=
 =bQRX
 -----END PGP SIGNATURE-----

Merge tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi

Pull SCSI fixes from James Bottomley:
 "Nine changes, eight in drivers [ufs, target, lpfc x 2, qla2xxx x 4]
  and one core change in sd that fixes an I/O failure on DIF type 3
  devices"

* tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi:
  scsi: qla2xxx: stop timer in shutdown path
  scsi: sd: define variable dif as unsigned int instead of bool
  scsi: target: cxgbit: Fix cxgbit_fw4_ack()
  scsi: qla2xxx: Fix partial flash write of MBI
  scsi: qla2xxx: Initialized mailbox to prevent driver load failure
  scsi: lpfc: Honor module parameter lpfc_use_adisc
  scsi: ufs-bsg: Wake the device before sending raw upiu commands
  scsi: lpfc: Check queue pointer before use
  scsi: qla2xxx: fixup incorrect usage of host_byte
2019-11-02 11:15:52 -07:00
Nicholas Piggin d3566abb1a scsi: qla2xxx: stop timer in shutdown path
In shutdown/reboot paths, the timer is not stopped:

  qla2x00_shutdown
  pci_device_shutdown
  device_shutdown
  kernel_restart_prepare
  kernel_restart
  sys_reboot

This causes lockups (on powerpc) when firmware config space access calls
are interrupted by smp_send_stop later in reboot.

Fixes: e30d175648 ("[SCSI] qla2xxx: Addition of shutdown callback handler.")
Link: https://lore.kernel.org/r/20191024063804.14538-1-npiggin@gmail.com
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Acked-by: Himanshu Madhani <hmadhani@marvell.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-10-28 21:58:01 -04:00
Linus Torvalds 1c4e395cf7 SCSI fixes on 20191025
Nine changes, eight to drivers (qla2xxx, hpsa, lpfc, alua, ch,
 53c710[x2], target) and one core change that tries to close a race
 between sysfs delete and module removal.
 
 Signed-off-by: James E.J. Bottomley <jejb@linux.ibm.com>
 -----BEGIN PGP SIGNATURE-----
 
 iJwEABMIAEQWIQTnYEDbdso9F2cI+arnQslM7pishQUCXbN1gSYcamFtZXMuYm90
 dG9tbGV5QGhhbnNlbnBhcnRuZXJzaGlwLmNvbQAKCRDnQslM7pishWUzAP4tB9Z+
 X5zfnMLmeAtSCnVwIgFX3/GVSFfzEmi+3VxfBQEA3nfs5AAJCPsaTk9z+jLtAKPk
 6uYoHwsyTHal19Ojt9g=
 =IOPn
 -----END PGP SIGNATURE-----

Merge tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi

Pull SCSI fixes from James Bottomley:
 "Nine changes, eight to drivers (qla2xxx, hpsa, lpfc, alua, ch,
  53c710[x2], target) and one core change that tries to close a race
  between sysfs delete and module removal"

* tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi:
  scsi: lpfc: remove left-over BUILD_NVME defines
  scsi: core: try to get module before removing device
  scsi: hpsa: add missing hunks in reset-patch
  scsi: target: core: Do not overwrite CDB byte 1
  scsi: ch: Make it possible to open a ch device multiple times again
  scsi: fix kconfig dependency warning related to 53C700_LE_ON_BE
  scsi: sni_53c710: fix compilation error
  scsi: scsi_dh_alua: handle RTPG sense code correctly during state transitions
  scsi: qla2xxx: fix a potential NULL pointer dereference
2019-10-25 20:11:33 -04:00
Xiang Chen 0cf9f4e547 scsi: sd: define variable dif as unsigned int instead of bool
Variable dif in function sd_setup_read_write_cmnd() is the return value of
function scsi_host_dif_capable() which returns dif capability of disks.  If
define it as bool, even for the disks which support DIF3, the function
still return dif=1, which causes IO error. So define variable dif as
unsigned int instead of bool.

Fixes: e249e42d27 ("scsi: sd: Clean up sd_setup_read_write_cmnd()")
Link: https://lore.kernel.org/r/1571725628-132736-1-git-send-email-chenxiang66@hisilicon.com
Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com>
Reviewed-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-10-24 20:34:10 -04:00
Quinn Tran 8d8b83f5be scsi: qla2xxx: Fix partial flash write of MBI
For new adapters with multiple flash regions to write to, current code
allows FW & Boot regions to be written, while other regions are blocked via
sysfs. The fix is to block all flash read/write through sysfs interface.

Fixes: e81d1bcbde ("scsi: qla2xxx: Further limit FLASH region write access from SysFS")
Cc: stable@vger.kernel.org # 5.2
Link: https://lore.kernel.org/r/20191022193643.7076-3-hmadhani@marvell.com
Signed-off-by: Quinn Tran <qutran@marvell.com>
Signed-off-by: Girish Basrur <gbasrur@marvell.com>
Signed-off-by: Himanshu Madhani <hmadhani@marvell.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-10-22 22:36:04 -04:00
Himanshu Madhani c2ff2a36ef scsi: qla2xxx: Initialized mailbox to prevent driver load failure
This patch fixes issue with Gen7 adapter in a blade environment where one
of the ports will not be detected by driver. Firmware expects mailbox 11 to
be set or cleared by driver for newer ISP.

Following message is seen in the log file:

[   18.810892] qla2xxx [0000:d8:00.0]-1820:1: **** Failed=102 mb[0]=4005 mb[1]=37 mb[2]=20 mb[3]=8
[   18.819596]  cmd=2 ****

[mkp: typos]

Link: https://lore.kernel.org/r/20191022193643.7076-2-hmadhani@marvell.com
Signed-off-by: Himanshu Madhani <hmadhani@marvell.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-10-22 22:34:46 -04:00
Daniel Wagner 0fd103ccfe scsi: lpfc: Honor module parameter lpfc_use_adisc
The initial lpfc_desc_set_adisc implementation in commit
dea3101e0a ("lpfc: add Emulex FC driver version 8.0.28") enabled ADISC if

	cfg_use_adisc && RSCN_MODE && FCP_2_DEVICE

In commit 92d7f7b0cd ("[SCSI] lpfc: NPIV: add NPIV support on top of
SLI-3") this changed to

	(cfg_use_adisc && RSC_MODE) || FCP_2_DEVICE

and later in commit ffc954936b ("[SCSI] lpfc 8.3.13: FC Discovery Fixes
and enhancements.") to

	(cfg_use_adisc && RSC_MODE) || (FCP_2_DEVICE && FCP_TARGET)

A customer reports that after a devloss, an ADISC failure is logged. It
turns out the ADISC flag is set even the user explicitly set lpfc_use_adisc
= 0.

[Sat Dec 22 22:55:58 2018] lpfc 0000:82:00.0: 2:(0):0203 Devloss timeout on WWPN 50:01:43:80:12:8e:40:20 NPort x05df00 Data: x82000000 x8 xa
[Sat Dec 22 23:08:20 2018] lpfc 0000:82:00.0: 2:(0):2755 ADISC failure DID:05DF00 Status:x9/x70000

[mkp: fixed Hannes' email]

Fixes: 92d7f7b0cd ("[SCSI] lpfc: NPIV: add NPIV support on top of SLI-3")
Cc: Dick Kennedy <dick.kennedy@broadcom.com>
Cc: James Smart <james.smart@broadcom.com>
Link: https://lore.kernel.org/r/20191022072112.132268-1-dwagner@suse.de
Reviewed-by: Hannes Reinecke <hare@suse.de>
Reviewed-by: James Smart <james.smart@broadcom.com>
Signed-off-by: Daniel Wagner <dwagner@suse.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-10-22 22:30:27 -04:00
Avri Altman 74e5e468b6 scsi: ufs-bsg: Wake the device before sending raw upiu commands
The scsi async probe process is calling blk_pm_runtime_init for each lun,
and then those request queues are monitored by the block layer pm
engine (blk-pm.c).  This is however, not the case for scsi-passthrough
queues, created by bsg_setup_queue().

So the ufs-bsg driver might send various commands, disregarding the pm
status of the device. This is wrong, regardless if its request queue is
pm-aware or not.

Fixes: df032bf27a (scsi: ufs: Add a bsg endpoint that supports UPIUs)
Link: https://lore.kernel.org/r/1570696267-8487-1-git-send-email-avri.altman@wdc.com
Reported-by: Yuliy Izrailov <yuliy.izrailov@wdc.com>
Signed-off-by: Avri Altman <avri.altman@wdc.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-10-18 18:02:16 -04:00
Daniel Wagner 535fb49e73 scsi: lpfc: Check queue pointer before use
The queue pointer might not be valid. The rest of the code checks the
pointer before accessing it. lpfc_sli4_process_missed_mbox_completions is
the only place where the check is missing.

Fixes: 657add4e5e ("scsi: lpfc: Fix poor use of hardware queues if fewer irq vectors")
Cc: James Smart <jsmart2021@gmail.com>
Link: https://lore.kernel.org/r/20191018162111.8798-1-dwagner@suse.de
Signed-off-by: Daniel Wagner <dwagner@suse.de>
Reviewed-by: James Smart <james.smart@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-10-18 17:59:27 -04:00
Hannes Reinecke 66cf50e65b scsi: qla2xxx: fixup incorrect usage of host_byte
DRIVER_ERROR is a a driver byte setting, not a host byte.  The qla2xxx
driver should rather return DID_ERROR here to be in line with the other
drivers.

Link: https://lore.kernel.org/r/20191018140458.108278-1-hare@suse.de
Signed-off-by: Hannes Reinecke <hare@suse.com>
Acked-by: Himanshu Madhani <hmadhani@marvell.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-10-18 17:25:19 -04:00
Linus Torvalds c3419fd6d3 Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
-----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEEZOpW2gUwxXeCmhkh7ulgGnXF3j0FAl2PeYQACgkQ7ulgGnXF
 3j3Zow/7BD/9Vai5zqDOFXSFnR5cFfcLoL2aYu13B5GjIYKZlhUW8ePC0jo0p0sV
 TSsIrOxv3RaeDLC5ISi+njsSJMspW5qGv8jrZb7xBn1zE2gcJ9YeVhb+tboW2rrr
 R03i7HInSQhdyKFMQS05IonRi8LphmTYKy3p8LifiiPoy4TsGcpw2tjQKicp0GxZ
 gWMOcMnx4sfUiivo0tys6UwUIACVqKOysXn4HGs8COFF4cdBHXJVkddZ5ZUO6hP+
 JInRdiKqDwycZyE6X/6Mj1B7tbmLVGH5mOX2Mx6dwUQkBIpsgJGGIxHRd9sUKv3r
 ltfZADn7CJGcTgwEFF1Fnn61pYXgx/m5M9ECop+w89CLNjWBVsRW1rLjIqiDIY6a
 pCmZQE/P95iHxLtAn6s1IkZoiXzU44tGQZaS3/8uGFxizx/ktWosokUdzC3PJee3
 1eHIXOGwJlu9dTSTR7YDid4s3pXHovlMlu1OTNp1ap8jHX7L5D2AM6xxlLJaXPhN
 zOJz6vcP5ZdVWqNq55jsB0dXDa76hrN2SUkpcwgJYKeU7qnRuGRq/jzjetSuRLvI
 jyaLY0VyKxHWk0/YgmU2gfW/sBYccJg6ONCPJ80R3KuO8VoMxUPZZiWWTb6bwVXn
 Mj2M6dWRIySVI1D8MR7sahGfeCqk7LkCE88DdTDoyaBHasbUSAk=
 =LmPS
 -----END PGP SIGNATURE-----

Merge tag 'mkp-scsi-postmerge' of git://git.kernel.org/pub/scm/linux/kernel/git/mkp/scsi

Pull scsi fixes from Martin Petersen:
 "These two commits were in a separate postmerge branch due to a
  dependency on changes merged for 5.4 in the block tree.

  They fix two issues in the intersection of the request cleanup changes
  from block (b7e9e1fb7a) and the request batching changes
  (8930a6c207) that were made to SCSI during the 5.4 cycle"

* tag 'mkp-scsi-postmerge' of git://git.kernel.org/pub/scm/linux/kernel/git/mkp/scsi:
  scsi: core: fix dh and multipathing for SCSI hosts without request batching
  scsi: core: fix missing .cleanup_rq for SCSI hosts without request batching
2019-10-18 08:08:53 -07:00
Hannes Reinecke 1052b41b25 scsi: lpfc: remove left-over BUILD_NVME defines
The BUILD_NVME define never got defined anywhere, causing NVMe commands to
be treated as SCSI commands when freeing the buffers.  This was causing a
stuck discovery and a horrible crash in lpfc_set_rrq_active() later on.

Link: https://lore.kernel.org/r/20191017150019.75769-1-hare@suse.de
Fixes: c00f62e6c5 ("scsi: lpfc: Merge per-protocol WQ/CQ pairs into single per-cpu pair")
Signed-off-by: Hannes Reinecke <hare@suse.com>
Reviewed-by: James Smart <james.smart@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-10-17 22:01:27 -04:00
Yufen Yu 77c301287e scsi: core: try to get module before removing device
We have a test case like block/001 in blktests, which will create a scsi
device by loading scsi_debug module and then try to delete the device by
sysfs interface. At the same time, it may remove the scsi_debug module.

And getting a invalid paging request BUG_ON as following:

[   34.625854] BUG: unable to handle page fault for address: ffffffffa0016bb8
[   34.629189] Oops: 0000 [#1] SMP PTI
[   34.629618] CPU: 1 PID: 450 Comm: bash Tainted: G        W         5.4.0-rc3+ #473
[   34.632524] RIP: 0010:scsi_proc_hostdir_rm+0x5/0xa0
[   34.643555] CR2: ffffffffa0016bb8 CR3: 000000012cd88000 CR4: 00000000000006e0
[   34.644545] Call Trace:
[   34.644907]  scsi_host_dev_release+0x6b/0x1f0
[   34.645511]  device_release+0x74/0x110
[   34.646046]  kobject_put+0x116/0x390
[   34.646559]  put_device+0x17/0x30
[   34.647041]  scsi_target_dev_release+0x2b/0x40
[   34.647652]  device_release+0x74/0x110
[   34.648186]  kobject_put+0x116/0x390
[   34.648691]  put_device+0x17/0x30
[   34.649157]  scsi_device_dev_release_usercontext+0x2e8/0x360
[   34.649953]  execute_in_process_context+0x29/0x80
[   34.650603]  scsi_device_dev_release+0x20/0x30
[   34.651221]  device_release+0x74/0x110
[   34.651732]  kobject_put+0x116/0x390
[   34.652230]  sysfs_unbreak_active_protection+0x3f/0x50
[   34.652935]  sdev_store_delete.cold.4+0x71/0x8f
[   34.653579]  dev_attr_store+0x1b/0x40
[   34.654103]  sysfs_kf_write+0x3d/0x60
[   34.654603]  kernfs_fop_write+0x174/0x250
[   34.655165]  __vfs_write+0x1f/0x60
[   34.655639]  vfs_write+0xc7/0x280
[   34.656117]  ksys_write+0x6d/0x140
[   34.656591]  __x64_sys_write+0x1e/0x30
[   34.657114]  do_syscall_64+0xb1/0x400
[   34.657627]  entry_SYSCALL_64_after_hwframe+0x44/0xa9
[   34.658335] RIP: 0033:0x7f156f337130

During deleting scsi target, the scsi_debug module have been removed. Then,
sdebug_driver_template belonged to the module cannot be accessd, resulting
in scsi_proc_hostdir_rm() BUG_ON.

To fix the bug, we add scsi_device_get() in sdev_store_delete() to try to
increase refcount of module, avoiding the module been removed.

Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/r/20191015130556.18061-1-yuyufen@huawei.com
Signed-off-by: Yufen Yu <yuyufen@huawei.com>
Reviewed-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-10-17 21:57:09 -04:00
Don Brace 134993456c scsi: hpsa: add missing hunks in reset-patch
Correct returning from reset before outstanding commands are completed
for the device.

Link: https://lore.kernel.org/r/157107623870.17997.11208813089704833029.stgit@brunhilda
Reviewed-by: Scott Benesh <scott.benesh@microsemi.com>
Reviewed-by: Kevin Barnett <kevin.barnett@microsemi.com>
Signed-off-by: Don Brace <don.brace@microsemi.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-10-17 21:56:55 -04:00
Linus Torvalds 8625732e77 SCSI fixes on 20191015
Five changes, two in drivers (qla2xxx, zfcp), one to MAINTAINERS
 (qla2xxx) and two in the core.  The last two are mostly about removing
 incorrect messages from the kernel log: the resid message is
 definitely wrong and the sync cache on protected drive problem is
 arguably wrong.
 
 Signed-off-by: James E.J. Bottomley <jejb@linux.ibm.com>
 -----BEGIN PGP SIGNATURE-----
 
 iJwEABMIAEQWIQTnYEDbdso9F2cI+arnQslM7pishQUCXaYZYCYcamFtZXMuYm90
 dG9tbGV5QGhhbnNlbnBhcnRuZXJzaGlwLmNvbQAKCRDnQslM7pishVuDAP9HBhGv
 dQ3FPA7gZ33rmsb8M1Q1NJ0GJuvFj2muh9CFYwD6AoJtVLivVZR75gojLLMqKpuf
 6EwRTaUZwYAoWeILNuA=
 =+iTy
 -----END PGP SIGNATURE-----

Merge tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi

Pull SCSI fixes from James Bottomley:
 "Five changes, two in drivers (qla2xxx, zfcp), one to MAINTAINERS
  (qla2xxx) and two in the core.

  The last two are mostly about removing incorrect messages from the
  kernel log: the resid message is definitely wrong and the sync cache
  on protected drive problem is arguably wrong"

* tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi:
  scsi: MAINTAINERS: Update qla2xxx driver
  scsi: zfcp: fix reaction on bit error threshold notification
  scsi: core: save/restore command resid for error handling
  scsi: qla2xxx: Remove WARN_ON_ONCE in qla2x00_status_cont_entry()
  scsi: sd: Ignore a failure to sync cache due to lack of authorization
2019-10-15 12:19:08 -07:00
Bart Van Assche 6a0990eaa7 scsi: ch: Make it possible to open a ch device multiple times again
Clearing ch->device in ch_release() is wrong because that pointer must
remain valid until ch_remove() is called. This patch fixes the following
crash the second time a ch device is opened:

BUG: kernel NULL pointer dereference, address: 0000000000000790
RIP: 0010:scsi_device_get+0x5/0x60
Call Trace:
 ch_open+0x4c/0xa0 [ch]
 chrdev_open+0xa2/0x1c0
 do_dentry_open+0x13a/0x380
 path_openat+0x591/0x1470
 do_filp_open+0x91/0x100
 do_sys_open+0x184/0x220
 do_syscall_64+0x5f/0x1a0
 entry_SYSCALL_64_after_hwframe+0x44/0xa9

Fixes: 085e56766f ("scsi: ch: add refcounting")
Cc: Hannes Reinecke <hare@suse.de>
Cc: <stable@vger.kernel.org>
Link: https://lore.kernel.org/r/20191009173536.247889-1-bvanassche@acm.org
Reported-by: Rob Turk <robtu@rtist.nl>
Suggested-by: Rob Turk <robtu@rtist.nl>
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-10-09 23:39:35 -04:00
Thomas Bogendoerfer 8cbf0c173a scsi: fix kconfig dependency warning related to 53C700_LE_ON_BE
When building a kernel with SCSI_SNI_53C710 enabled, Kconfig warns:

WARNING: unmet direct dependencies detected for 53C700_LE_ON_BE
  Depends on [n]: SCSI_LOWLEVEL [=y] && SCSI [=y] && SCSI_LASI700 [=n]
  Selected by [y]:
  - SCSI_SNI_53C710 [=y] && SCSI_LOWLEVEL [=y] && SNI_RM [=y] && SCSI [=y]

Add the missing depends SCSI_SNI_53C710 to 53C700_LE_ON_BE to fix it.

Link: https://lore.kernel.org/r/20191009151128.32411-1-tbogendoerfer@suse.de
Signed-off-by: Thomas Bogendoerfer <tbogendoerfer@suse.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-10-09 23:36:18 -04:00
Thomas Bogendoerfer 0ee6211408 scsi: sni_53c710: fix compilation error
Drop out memory dev_printk() with wrong device pointer argument.

[mkp: typo]

Link: https://lore.kernel.org/r/20191009151118.32350-1-tbogendoerfer@suse.de
Signed-off-by: Thomas Bogendoerfer <tbogendoerfer@suse.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-10-09 23:35:42 -04:00
Hannes Reinecke b6ce6fb121 scsi: scsi_dh_alua: handle RTPG sense code correctly during state transitions
Some arrays are not capable of returning RTPG data during state
transitioning, but rather return an 'LUN not accessible, asymmetric access
state transition' sense code. In these cases we can set the state to
'transitioning' directly and don't need to evaluate the RTPG data (which we
won't have anyway).

Link: https://lore.kernel.org/r/20191007135701.32389-1-hare@suse.de
Reviewed-by: Laurence Oberman <loberman@redhat.com>
Reviewed-by: Ewan D. Milne <emilne@redhat.com>
Reviewed-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: Hannes Reinecke <hare@suse.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-10-09 23:29:20 -04:00
Allen Pais 35a79a6351 scsi: qla2xxx: fix a potential NULL pointer dereference
alloc_workqueue is not checked for errors and as a result a potential
NULL dereference could occur.

Link: https://lore.kernel.org/r/1568824618-4366-1-git-send-email-allen.pais@oracle.com
Signed-off-by: Allen Pais <allen.pais@oracle.com>
Reviewed-by: Martin Wilck <mwilck@suse.com>
Acked-by: Himanshu Madhani <hmadhani@marvell.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-10-09 23:29:13 -04:00
Linus Torvalds 126195c972 SCSI fixes on 20191004
Twelve patches mostly small but obvious fixes or cosmetic but small
 updates.
 
 Signed-off-by: James E.J. Bottomley <jejb@linux.ibm.com>
 -----BEGIN PGP SIGNATURE-----
 
 iJwEABMIAEQWIQTnYEDbdso9F2cI+arnQslM7pishQUCXZgfWiYcamFtZXMuYm90
 dG9tbGV5QGhhbnNlbnBhcnRuZXJzaGlwLmNvbQAKCRDnQslM7pishaVOAQDnuANx
 QGEuQ1dZPALeZPOlEOsJzzpHPd3O+mQauIE96wD9FMypt/UKF9+fvlp4mCP+ya66
 0fz1kmTQIcAADdYaNYM=
 =aQi7
 -----END PGP SIGNATURE-----

Merge tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi

Pull SCSI fixes from James Bottomley:
 "Twelve patches mostly small but obvious fixes or cosmetic but small
  updates"

* tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi:
  scsi: qla2xxx: Fix Nport ID display value
  scsi: qla2xxx: Fix N2N link up fail
  scsi: qla2xxx: Fix N2N link reset
  scsi: qla2xxx: Optimize NPIV tear down process
  scsi: qla2xxx: Fix stale mem access on driver unload
  scsi: qla2xxx: Fix unbound sleep in fcport delete path.
  scsi: qla2xxx: Silence fwdump template message
  scsi: hisi_sas: Make three functions static
  scsi: megaraid: disable device when probe failed after enabled device
  scsi: storvsc: setup 1:1 mapping between hardware queue and CPU queue
  scsi: qedf: Remove always false 'tmp_prio < 0' statement
  scsi: ufs: skip shutdown if hba is not powered
  scsi: bnx2fc: Handle scope bits when array returns BUSY or TSF
2019-10-05 12:53:27 -07:00
Damien Le Moal 8f8fed0cdb scsi: core: save/restore command resid for error handling
When a non-passthrough command is terminated with CHECK CONDITION, request
sense is executed by hijacking the command descriptor. Since
scsi_eh_prep_cmnd() and scsi_eh_restore_cmnd() do not save/restore the
original command resid, the value returned on failure of the original
command is lost and replaced with the value set by the execution of the
request sense command. This value may in many instances be unaligned to the
device sector size, causing sd_done() to print a warning message about the
incorrect unaligned resid before the command is retried.

Fix this problem by saving the original command residual in struct
scsi_eh_save using scsi_eh_prep_cmnd() and restoring it in
scsi_eh_restore_cmnd(). In addition, to make sure that the request sense
command is executed with a correctly initialized command structure, also
reset the residual to 0 in scsi_eh_prep_cmnd() after saving the original
command value in struct scsi_eh_save.

Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/r/20191001074839.1994-1-damien.lemoal@wdc.com
Signed-off-by: Damien Le Moal <damien.lemoal@wdc.com>
Reviewed-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-10-03 21:43:04 -04:00
Daniel Wagner 9bc6157f5f scsi: qla2xxx: Remove WARN_ON_ONCE in qla2x00_status_cont_entry()
Commit 88263208dd ("scsi: qla2xxx: Complain if sp->done() is not called
from the completion path") introduced the WARN_ON_ONCE in
qla2x00_status_cont_entry(). The assumption was that there is only one
status continuations element. According to the firmware documentation it is
possible that multiple status continuations are emitted by the firmware.

Fixes: 88263208dd ("scsi: qla2xxx: Complain if sp->done() is not called from the completion path")
Link: https://lore.kernel.org/r/20190927073031.62296-1-dwagner@suse.de
Cc: Bart Van Assche <bvanassche@acm.org>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Reviewed-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: Daniel Wagner <dwagner@suse.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-09-30 23:53:02 -04:00
Oliver Neukum 21e3d6c811 scsi: sd: Ignore a failure to sync cache due to lack of authorization
I've got a report about a UAS drive enclosure reporting back Sense: Logical
unit access not authorized if the drive it holds is password protected.
While the drive is obviously unusable in that state as a mass storage
device, it still exists as a sd device and when the system is asked to
perform a suspend of the drive, it will be sent a SYNCHRONIZE CACHE. If
that fails due to password protection, the error must be ignored.

Cc: <stable@vger.kernel.org>
Link: https://lore.kernel.org/r/20190903101840.16483-1-oneukum@suse.com
Signed-off-by: Oliver Neukum <oneukum@suse.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-09-30 23:52:14 -04:00
Linus Torvalds 2e959dd87a for-5.4/post-2019-09-24
-----BEGIN PGP SIGNATURE-----
 
 iQJEBAABCAAuFiEEwPw5LcreJtl1+l5K99NY+ylx4KYFAl2J8xQQHGF4Ym9lQGtl
 cm5lbC5kawAKCRD301j7KXHgpujgD/94s9GGKN8JShxCpT0YNuWyyFF5gNlaimQU
 RSGAwnv2YUgEGNSUOPpcaj5FAYhTfYzbqoHlE+jytA2U5KXTOhc5Z85QV+TY4HPs
 I03xczYuYD/uX0QuF00zU2+6eV3lETELPiBARbfEQdHfm72iwurweHzlh4dfhbxW
 P7UA/cKixXWF2CH9wg5347Ll93nD24f2pi8BUyLJi/xpdlaRrN11Ii8AzNlRmq52
 VRxURuogl98W89F6EV2VhPGFgUEYHY2Ot7II2OqqV+jmjHDQW9y5hximzINOqkxs
 bQwo5J+WrDSPoqwl8+db2k7QQjAl1XKDAHmCwz+7J/BoOgZj8/M1FMBwzita+5x+
 UqxEYe7k+2G3w2zuhBrq03BypU8pwqFep/QI0cCCPaHs4J5QnkVOScEqd6iV/C3T
 FPvMvqDf7MrElghj4Qa2IZlh/CgqmLG5NUEz8E40cXkdiP+E+eK9ZY2Uwx2XhBrm
 7Gl+SpG5DxWqqJeRNVWjFwM4p5L+01NtwDbTjZ1rsf+mCW5cNsy/L9B4UpPz4HxW
 coAs0y/Ce+ZhCopIXZ4jLDBoTG9yoVg8EcyfaHKD2Zz0mUFxa2xm+LvXKeT49qqx
 xuodpKD3fiuM7h9Xgv+cDsmn8Rr8gSeXEGV7qzpudmkxbp6IVg/yG5hC/dM921GR
 EVrRtUIwdw==
 =aAPP
 -----END PGP SIGNATURE-----

Merge tag 'for-5.4/post-2019-09-24' of git://git.kernel.dk/linux-block

Pull more block updates from Jens Axboe:
 "Some later additions that weren't quite done for the first pull
  request, and also a few fixes that have arrived since.

  This contains:

   - Kill silly pktcdvd warning on attempting to register a non-scsi
     passthrough device (me)

   - Use symbolic constants for the block t10 protection types, and
     switch to handling it in core rather than in the drivers (Max)

   - libahci platform missing node put fix (Nishka)

   - Small series of fixes for BFQ (Paolo)

   - Fix possible nbd crash (Xiubo)"

* tag 'for-5.4/post-2019-09-24' of git://git.kernel.dk/linux-block:
  block: drop device references in bsg_queue_rq()
  block: t10-pi: fix -Wswitch warning
  pktcdvd: remove warning on attempting to register non-passthrough dev
  ata: libahci_platform: Add of_node_put() before loop exit
  nbd: fix possible page fault for nbd disk
  nbd: rename the runtime flags as NBD_RT_ prefixed
  block, bfq: push up injection only after setting service time
  block, bfq: increase update frequency of inject limit
  block, bfq: reduce upper bound for inject limit to max_rq_in_driver+1
  block, bfq: update inject limit only after injection occurred
  block: centralize PI remapping logic to the block layer
  block: use symbolic constants for t10_pi type
2019-09-24 16:31:50 -07:00
Quinn Tran 0aabb6b699 scsi: qla2xxx: Fix Nport ID display value
For N2N, the NPort ID is assigned by driver in the PLOGI ELS.  According to
FW Spec the byte order for SID is not the same as DID.

Link: https://lore.kernel.org/r/20190912180918.6436-8-hmadhani@marvell.com
Reviewed-by: Roman Bolshakov <r.bolshakov@yadro.com>
Tested-by: Roman Bolshakov <r.bolshakov@yadro.com>
Signed-off-by: Quinn Tran <qutran@marvell.com>
Signed-off-by: Himanshu Madhani <hmadhani@marvell.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-09-23 23:09:43 -04:00
Quinn Tran f3f1938bb6 scsi: qla2xxx: Fix N2N link up fail
During link up/bounce, qla driver would do command flush as part of
cleanup.  In this case, the flush can intefere with FW state.  This patch
allows FW to be in control of link up.

Link: https://lore.kernel.org/r/20190912180918.6436-7-hmadhani@marvell.com
Signed-off-by: Quinn Tran <qutran@marvell.com>
Signed-off-by: Himanshu Madhani <hmadhani@marvell.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-09-23 23:09:43 -04:00
Quinn Tran 7f2a398d59 scsi: qla2xxx: Fix N2N link reset
Fix stalled link recovery for N2N with FC-NVMe connection.

Link: https://lore.kernel.org/r/20190912180918.6436-6-hmadhani@marvell.com
Signed-off-by: Quinn Tran <qutran@marvell.com>
Signed-off-by: Himanshu Madhani <hmadhani@marvell.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-09-23 23:09:43 -04:00
Quinn Tran f5187b7d1a scsi: qla2xxx: Optimize NPIV tear down process
In the case of NPIV port is being torn down, this patch will set a flag to
indicate VPORT_DELETE. This would prevent relogin to be triggered.

Link: https://lore.kernel.org/r/20190912180918.6436-5-hmadhani@marvell.com
Signed-off-by: Quinn Tran <qutran@marvell.com>
Signed-off-by: Himanshu Madhani <hmadhani@marvell.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-09-23 23:09:43 -04:00
Quinn Tran fd5564ba54 scsi: qla2xxx: Fix stale mem access on driver unload
On driver unload, 'remove_one' thread was allowed to advance, while session
cleanup still lag behind.  This patch ensures session deletion will finish
before remove_one can advance.

Link: https://lore.kernel.org/r/20190912180918.6436-4-hmadhani@marvell.com
Signed-off-by: Quinn Tran <qutran@marvell.com>
Signed-off-by: Himanshu Madhani <hmadhani@marvell.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-09-23 23:09:43 -04:00
Quinn Tran c3b6a1d397 scsi: qla2xxx: Fix unbound sleep in fcport delete path.
There are instances, though rare, where a LOGO request cannot be sent out
and the thread in free session done can wait indefinitely. Fix this by
putting an upper bound to sleep.

Link: https://lore.kernel.org/r/20190912180918.6436-3-hmadhani@marvell.com
Signed-off-by: Quinn Tran <qutran@marvell.com>
Signed-off-by: Himanshu Madhani <hmadhani@marvell.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-09-23 23:09:42 -04:00
Himanshu Madhani 248a445adf scsi: qla2xxx: Silence fwdump template message
Print if fwdt template is present or not, only when
ql2xextended_error_logging is enabled.

Link: https://lore.kernel.org/r/20190912180918.6436-2-hmadhani@marvell.com
Signed-off-by: Himanshu Madhani <hmadhani@marvell.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-09-23 23:09:42 -04:00
YueHaibing 4b6b1bb686 scsi: hisi_sas: Make three functions static
Fix sparse warnings:

drivers/scsi/hisi_sas/hisi_sas_main.c:3686:6:
 warning: symbol 'hisi_sas_debugfs_release' was not declared. Should it be static?
drivers/scsi/hisi_sas/hisi_sas_main.c:3708:5:
 warning: symbol 'hisi_sas_debugfs_alloc' was not declared. Should it be static?
drivers/scsi/hisi_sas/hisi_sas_main.c:3799:6:
 warning: symbol 'hisi_sas_debugfs_bist_init' was not declared. Should it be static?

Link: https://lore.kernel.org/r/20190923054035.19036-1-yuehaibing@huawei.com
Reported-by: Hulk Robot <hulkci@huawei.com>
Signed-off-by: YueHaibing <yuehaibing@huawei.com>
Acked-by: John Garry <john.garry@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-09-23 23:09:42 -04:00
Xiang Chen 70054aa39a scsi: megaraid: disable device when probe failed after enabled device
For pci device, need to disable device when probe failed after enabled
device.

Link: https://lore.kernel.org/r/1567818450-173315-1-git-send-email-chenxiang66@hisilicon.com
Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com>
Reviewed-by: John Garry <john.garry@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-09-23 23:09:42 -04:00
Long Li 0ed8810276 scsi: storvsc: setup 1:1 mapping between hardware queue and CPU queue
storvsc doesn't use a dedicated hardware queue for a given CPU queue. When
issuing I/O, it selects returning CPU (hardware queue) dynamically based on
vmbus channel usage across all channels.

This patch advertises num_present_cpus() as number of hardware queues. This
will have upper layer setup 1:1 mapping between hardware queue and CPU
queue and avoid unnecessary locking when issuing I/O.

Link: https://lore.kernel.org/r/1567790660-48142-1-git-send-email-longli@linuxonhyperv.com
Signed-off-by: Long Li <longli@microsoft.com>
Reviewed-by: Michael Kelley <mikelley@microsoft.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-09-23 23:09:42 -04:00
Austin Kim 4b062e7cf9 scsi: qedf: Remove always false 'tmp_prio < 0' statement
Since tmp_prio is declared as u8, the following statement is always false.
   tmp_prio < 0

So remove 'always false' statement.

Link: https://lore.kernel.org/r/20190919075548.GA112801@LGEARND20B15
Signed-off-by: Austin Kim <austindh.kim@gmail.com>
Acked-by: Saurav Kashyap <skashyap@marvell.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-09-23 23:09:42 -04:00
Stanley Chu f51913eef2 scsi: ufs: skip shutdown if hba is not powered
In some cases, hba may go through shutdown flow without successful
initialization and then make system hang.

For example, if ufshcd_change_power_mode() gets error and leads to
ufshcd_hba_exit() to release resources of the host, future shutdown flow
may hang the system since the host register will be accessed in unpowered
state.

To solve this issue, simply add checking to skip shutdown for above kind of
situation.

Link: https://lore.kernel.org/r/1568780438-28753-1-git-send-email-stanley.chu@mediatek.com
Signed-off-by: Stanley Chu <stanley.chu@mediatek.com>
Acked-by: Bean Huo <beanhuo@micron.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-09-23 23:09:42 -04:00
Linus Torvalds 299d14d4c3 pci-v5.4-changes
-----BEGIN PGP SIGNATURE-----
 
 iQJIBAABCgAyFiEEgMe7l+5h9hnxdsnuWYigwDrT+vwFAl2JNVAUHGJoZWxnYWFz
 QGdvb2dsZS5jb20ACgkQWYigwDrT+vyTOA/9EZeyS7J+ZcOwihWz5vNijf0kfpKp
 /jZ9VF9nHjsL9Pw3/Fzha605Ssrtwcqge8g/sze9f0g/pxZk99lLHokE6dEOurEA
 GyKpNNMdiBol4YZMCsSoYji0MpwW0uMCuASPMiEwv2LxZ72A2Tu1RbgYLU+n4m1T
 fQldDTxsUMXc/OH/8SL8QDEh6o8qyDRhmSXFAOv8RGqN8N3iUwVwhQobKpwpmEvx
 ddzqWMS8f91qkhIKO7fgc9P4NI/7yI7kkF+wcdwtfiMO8Qkr4IdcdF7qwNVAtpKA
 A+sMRi59i2XxDTqRFx+wXXMa+rt+Pf1pucv77SO74xXWwpuXSxLVDYjULP1YQugK
 FTBo4SNmico/ts+n5cgm+CGMq2P2E29VYeqkI1Un6eDDvQnQlBgQdpdcBoadJ0rW
 y31OInjhRJC1ZK5bATKfCMbmB+VQxFsbyeUA7PBlrALyAmXZfw30iNxX9iHBhWqc
 myPNVEJJGp0cWTxGxMAU9MhelzeQxDAd+Eb44J5gv51bx0w9yqmZHECSDrOVdtYi
 HpOyI7E3Cb8m23BOHvCdB/v8igaYMZl08LUUJqu1S9mFclYyYVuOOIB04Yc2Qrx1
 3PHtT8TC47FbWuzKwo12RflzoAiNShJGw+tNKo6T1jC+r5jdbKWWtTnsoRqbSfaG
 rG5RJpB7EuQSP1Y=
 =/xB3
 -----END PGP SIGNATURE-----

Merge tag 'pci-v5.4-changes' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci

Pull PCI updates from Bjorn Helgaas:
 "Enumeration:

   - Consolidate _HPP/_HPX stuff in pci-acpi.c and simplify it
     (Krzysztof Wilczynski)

   - Fix incorrect PCIe device types and remove dev->has_secondary_link
     to simplify code that deals with upstream/downstream ports (Mika
     Westerberg)

   - After suspend, restore Resizable BAR size bits correctly for 1MB
     BARs (Sumit Saxena)

   - Enable PCI_MSI_IRQ_DOMAIN support for RISC-V (Wesley Terpstra)

  Virtualization:

   - Add ACS quirks for iProc PAXB (Abhinav Ratna), Amazon Annapurna
     Labs (Ali Saidi)

   - Move sysfs SR-IOV functions to iov.c (Kelsey Skunberg)

   - Remove group write permissions from sysfs sriov_numvfs,
     sriov_drivers_autoprobe (Kelsey Skunberg)

  Hotplug:

   - Simplify pciehp indicator control (Denis Efremov)

  Peer-to-peer DMA:

   - Allow P2P DMA between root ports for whitelisted bridges (Logan
     Gunthorpe)

   - Whitelist some Intel host bridges for P2P DMA (Logan Gunthorpe)

   - DMA map P2P DMA requests that traverse host bridge (Logan
     Gunthorpe)

  Amazon Annapurna Labs host bridge driver:

   - Add DT binding and controller driver (Jonathan Chocron)

  Hyper-V host bridge driver:

   - Fix hv_pci_dev->pci_slot use-after-free (Dexuan Cui)

   - Fix PCI domain number collisions (Haiyang Zhang)

   - Use instance ID bytes 4 & 5 as PCI domain numbers (Haiyang Zhang)

   - Fix build errors on non-SYSFS config (Randy Dunlap)

  i.MX6 host bridge driver:

   - Limit DBI register length (Stefan Agner)

  Intel VMD host bridge driver:

   - Fix config addressing issues (Jon Derrick)

  Layerscape host bridge driver:

   - Add bar_fixed_64bit property to endpoint driver (Xiaowei Bao)

   - Add CONFIG_PCI_LAYERSCAPE_EP to build EP/RC drivers separately
     (Xiaowei Bao)

  Mediatek host bridge driver:

   - Add MT7629 controller support (Jianjun Wang)

  Mobiveil host bridge driver:

   - Fix CPU base address setup (Hou Zhiqiang)

   - Make "num-lanes" property optional (Hou Zhiqiang)

  Tegra host bridge driver:

   - Fix OF node reference leak (Nishka Dasgupta)

   - Disable MSI for root ports to work around design problem (Vidya
     Sagar)

   - Add Tegra194 DT binding and controller support (Vidya Sagar)

   - Add support for sideband pins and slot regulators (Vidya Sagar)

   - Add PIPE2UPHY support (Vidya Sagar)

  Misc:

   - Remove unused pci_block_cfg_access() et al (Kelsey Skunberg)

   - Unexport pci_bus_get(), etc (Kelsey Skunberg)

   - Hide PM, VC, link speed, ATS, ECRC, PTM constants and interfaces in
     the PCI core (Kelsey Skunberg)

   - Clean up sysfs DEVICE_ATTR() usage (Kelsey Skunberg)

   - Mark expected switch fall-through (Gustavo A. R. Silva)

   - Propagate errors for optional regulators and PHYs (Thierry Reding)

   - Fix kernel command line resource_alignment parameter issues (Logan
     Gunthorpe)"

* tag 'pci-v5.4-changes' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci: (112 commits)
  PCI: Add pci_irq_vector() and other stubs when !CONFIG_PCI
  arm64: tegra: Add PCIe slot supply information in p2972-0000 platform
  arm64: tegra: Add configuration for PCIe C5 sideband signals
  PCI: tegra: Add support to enable slot regulators
  PCI: tegra: Add support to configure sideband pins
  PCI: vmd: Fix shadow offsets to reflect spec changes
  PCI: vmd: Fix config addressing when using bus offsets
  PCI: dwc: Add validation that PCIe core is set to correct mode
  PCI: dwc: al: Add Amazon Annapurna Labs PCIe controller driver
  dt-bindings: PCI: Add Amazon's Annapurna Labs PCIe host bridge binding
  PCI: Add quirk to disable MSI-X support for Amazon's Annapurna Labs Root Port
  PCI/VPD: Prevent VPD access for Amazon's Annapurna Labs Root Port
  PCI: Add ACS quirk for Amazon Annapurna Labs root ports
  PCI: Add Amazon's Annapurna Labs vendor ID
  MAINTAINERS: Add PCI native host/endpoint controllers designated reviewer
  PCI: hv: Use bytes 4 and 5 from instance ID as the PCI domain numbers
  dt-bindings: PCI: tegra: Add PCIe slot supplies regulator entries
  dt-bindings: PCI: tegra: Add sideband pins configuration entries
  PCI: tegra: Add Tegra194 PCIe support
  PCI: Get rid of dev->has_secondary_link flag
  ...
2019-09-23 19:16:01 -07:00
Laurence Oberman c9c5374937 scsi: bnx2fc: Handle scope bits when array returns BUSY or TSF
The qla2xxx driver had this issue as well when the newer array firmware
returned the retry_delay_timer in the fcp_rsp.  The bnx2fc is not handling
the masking of the scope bits either so the retry_delay_timestamp value
lands up being a large value added to the timer timestamp delaying I/O for
up to 27 Minutes.  This patch adds similar code to handle this to the
bnx2fc driver to avoid the huge delay.

Link: https://lore.kernel.org/r/1568210202-12794-1-git-send-email-loberman@redhat.com
Signed-off-by: Laurence Oberman <loberman@redhat.com>
Reported-by: David Jeffery <djeffery@redhat.com>
Reported-by: kbuild test robot <lkp@intel.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-09-23 22:13:30 -04:00
Steffen Maier 6b6fa7a5c8 scsi: core: fix dh and multipathing for SCSI hosts without request batching
This was missing from scsi_device_from_queue() due to the introduction of
another new scsi_mq_ops_no_commit of linux-next commit 8930a6c207 ("scsi:
core: add support for request batching") from Martin's scsi/5.4/scsi-queue
or James' scsi/misc.

Only devicehandler code seems to call scsi_device_from_queue():
*** drivers/scsi/scsi_dh.c:
scsi_dh_activate[255]          sdev = scsi_device_from_queue(q);
scsi_dh_set_params[302]        sdev = scsi_device_from_queue(q);
scsi_dh_attach[325]            sdev = scsi_device_from_queue(q);
scsi_dh_attached_handler_name[363] sdev = scsi_device_from_queue(q);

Fixes multipath tools follow-on errors:

$ multipath -v6
...
libdevmapper: ioctl/libdm-iface.c(1887): device-mapper: reload ioctl on mpatha  failed: No such device
...
mpatha: failed to load map, error 19
...

showing also as kernel messages:

device-mapper: table: 252:0: multipath: error attaching hardware handler
device-mapper: ioctl: error adding target to table

Signed-off-by: Steffen Maier <maier@linux.ibm.com>
Fixes: 8930a6c207 ("scsi: core: add support for request batching")
Cc: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-09-23 21:34:34 -04:00
Steffen Maier 82a9ac7130 scsi: core: fix missing .cleanup_rq for SCSI hosts without request batching
This was missing from scsi_mq_ops_no_commit of linux-next commit
8930a6c207 ("scsi: core: add support for request batching") from Martin's
scsi/5.4/scsi-queue or James' scsi/misc.

See also linux-next commit b7e9e1fb7a ("scsi: implement .cleanup_rq
callback") from block/for-next.

Signed-off-by: Steffen Maier <maier@linux.ibm.com>
Fixes: 8930a6c207 ("scsi: core: add support for request batching")
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Ming Lei <ming.lei@redhat.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-09-23 21:34:21 -04:00