summaryrefslogtreecommitdiff
path: root/drivers/scsi
AgeCommit message (Collapse)Author
2021-05-22scsi: lpfc: Fix illegal memory access on Abort IOCBsJames Smart
[ Upstream commit e1364711359f3ced054bda9920477c8bf93b74c5 ] In devloss timer handler and in backend calls to terminate remote port I/O, there is logic to walk through all active IOCBs and validate them to potentially trigger an abort request. This logic is causing illegal memory accesses which leads to a crash. Abort IOCBs, which may be on the list, do not have an associated lpfc_io_buf struct. The driver is trying to map an lpfc_io_buf struct on the IOCB and which results in a bogus address thus the issue. Fix by skipping over ABORT IOCBs (CLOSE IOCBs are ABORTS that don't send ABTS) in the IOCB scan logic. Link: https://lore.kernel.org/r/20210421234433.102079-1-jsmart2021@gmail.com Co-developed-by: Justin Tee <justin.tee@broadcom.com> Signed-off-by: Justin Tee <justin.tee@broadcom.com> Signed-off-by: James Smart <jsmart2021@gmail.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-05-19scsi: ufs: core: Narrow down fast path in system suspend pathCan Guo
[ Upstream commit ce4f62f9dd8cf43ac044045ed598a0b80ef33890 ] If spm_lvl is set to 0 or 1, when system suspend kicks start and HBA is runtime active, system suspend may just bail without doing anything (the fast path), leaving other contexts still running, e.g., clock gating and clock scaling. When system resume kicks start, concurrency can happen between ufshcd_resume() and these contexts, leading to various stability issues. Add a check against HBA's runtime state and allowing fast path only if HBA is runtime suspended, otherwise let system suspend go ahead call ufshcd_suspend(). This will guarantee that these contexts are stopped by either runtime suspend or system suspend. Link: https://lore.kernel.org/r/1619408921-30426-4-git-send-email-cang@codeaurora.org Fixes: 0b257734344a ("scsi: ufs: optimize system suspend handling") Reviewed-by: Daejun Park <daejun7.park@samsung.com> Signed-off-by: Can Guo <cang@codeaurora.org> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-05-19scsi: ufs: core: Cancel rpm_dev_flush_recheck_work during system suspendCan Guo
[ Upstream commit 637822e63b79ee8a729f7ba2645a26cf5a524ee4 ] During ufs system suspend, leaving rpm_dev_flush_recheck_work running or pending is risky because concurrency may happen between system suspend/resume and runtime resume routine. Fix this by cancelling rpm_dev_flush_recheck_work synchronously during system suspend. Link: https://lore.kernel.org/r/1619408921-30426-3-git-send-email-cang@codeaurora.org Fixes: 51dd905bd2f6 ("scsi: ufs: Fix WriteBooster flush during runtime suspend") Reviewed-by: Daejun Park <daejun7.park@samsung.com> Signed-off-by: Can Guo <cang@codeaurora.org> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-05-19scsi: ufs: core: Do not put UFS power into LPM if link is brokenCan Guo
[ Upstream commit 23043dd87b153d02eaf676e752d32429be5e5126 ] During resume, if link is broken due to AH8 failure, make sure ufshcd_resume() does not put UFS power back into LPM. Link: https://lore.kernel.org/r/1619408921-30426-2-git-send-email-cang@codeaurora.org Fixes: 4db7a2360597 ("scsi: ufs: Fix concurrency of error handler and other error recovery paths") Reviewed-by: Daejun Park <daejun7.park@samsung.com> Signed-off-by: Can Guo <cang@codeaurora.org> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-05-19scsi: qla2xxx: Prevent PRLI in target modeAnastasia Kovaleva
[ Upstream commit fcb16d9a8ecf1e9bfced0fc654ea4e2caa7517f4 ] In a case when the initiator in P2P mode by some circumstances does not send PRLI, the target, in a case when the target port's WWPN is less than initiator's, changes the discovery state in DSC_GNL. When gnl completes it sends PRLI to the initiator. Usually the initiator in P2P mode always sends PRLI. We caught this issue on Linux stable v5.4.6 https://www.spinics.net/lists/stable/msg458515.html. Fix this particular corner case in the behaviour of the P2P mod target login state machine. Link: https://lore.kernel.org/r/20210422153414.4022-1-a.kovaleva@yadro.com Fixes: a9ed06d4e640 ("scsi: qla2xxx: Allow PLOGI in target mode") Reviewed-by: Roman Bolshakov <r.bolshakov@yadro.com> Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com> Signed-off-by: Anastasia Kovaleva <a.kovaleva@yadro.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-05-14scsi: ibmvfc: Fix invalid state machine BUG_ON()Brian King
[ Upstream commit 15cfef8623a449d40d16541687afd58e78033be3 ] This fixes an issue hitting the BUG_ON() in ibmvfc_do_work(). When going through a host action of IBMVFC_HOST_ACTION_RESET, we change the action to IBMVFC_HOST_ACTION_TGT_DEL, then drop the host lock, and reset the CRQ, which changes the host state to IBMVFC_NO_CRQ. If, prior to setting the host state to IBMVFC_NO_CRQ, ibmvfc_init_host() is called, it can then end up changing the host action to IBMVFC_HOST_ACTION_INIT. If we then change the host state to IBMVFC_NO_CRQ, we will then hit the BUG_ON(). Make a couple of changes to avoid this. Leave the host action to be IBMVFC_HOST_ACTION_RESET or IBMVFC_HOST_ACTION_REENABLE until after we drop the host lock and reset or reenable the CRQ. Also harden the host state machine to ensure we cannot leave the reset / reenable state until we've finished processing the reset or reenable. Link: https://lore.kernel.org/r/20210413001009.902400-1-tyreld@linux.ibm.com Fixes: 73ee5d867287 ("[SCSI] ibmvfc: Fix soft lockup on resume") Signed-off-by: Brian King <brking@linux.vnet.ibm.com> [tyreld: added fixes tag] Signed-off-by: Tyrel Datwyler <tyreld@linux.ibm.com> [mkp: fix comment checkpatch warnings] Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-05-14scsi: sni_53c710: Add IRQ checkSergey Shtylyov
[ Upstream commit 1160d61bc51e87e509cfaf9da50a0060f67b6de4 ] The driver neglects to check the result of platform_get_irq()'s call and blithely passes the negative error codes to request_irq() (which takes *unsigned* IRQ #s), causing it to fail with -EINVAL (overridden by -ENODEV further below). Stop calling request_irq() with the invalid IRQ #s. Link: https://lore.kernel.org/r/8f4b8fa5-8251-b977-70a1-9099bcb4bb17@omprussia.ru Fixes: c27d85f3f3c5 ("[SCSI] SNI RM 53c710 driver") Signed-off-by: Sergey Shtylyov <s.shtylyov@omprussia.ru> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-05-14scsi: sun3x_esp: Add IRQ checkSergey Shtylyov
[ Upstream commit 14b321380eb333c82853d7d612d0995f05f88fdc ] The driver neglects to check the result of platform_get_irq()'s call and blithely passes the negative error codes to request_irq() (which takes *unsigned* IRQ #), causing it to fail with -EINVAL, overriding the real error code. Stop calling request_irq() with the invalid IRQ #s. Link: https://lore.kernel.org/r/363eb4c8-a3bf-4dc9-2a9e-90f349030a15@omprussia.ru Fixes: 0bb67f181834 ("[SCSI] sun3x_esp: convert to esp_scsi") Signed-off-by: Sergey Shtylyov <s.shtylyov@omprussia.ru> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-05-14scsi: jazz_esp: Add IRQ checkSergey Shtylyov
[ Upstream commit 38fca15c29db6ed06e894ac194502633e2a7d1fb ] The driver neglects to check the result of platform_get_irq()'s call and blithely passes the negative error codes to request_irq() (which takes *unsigned* IRQ #), causing it to fail with -EINVAL, overriding the real error code. Stop calling request_irq() with the invalid IRQ #s. Link: https://lore.kernel.org/r/594aa9ae-2215-49f6-f73c-33bd38989912@omprussia.ru Fixes: 352e921f0dd4 ("[SCSI] jazz_esp: converted to use esp_core") Signed-off-by: Sergey Shtylyov <s.shtylyov@omprussia.ru> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-05-14scsi: hisi_sas: Fix IRQ checksSergey Shtylyov
[ Upstream commit 6c11dc060427e07ca144eacaccd696106b361b06 ] Commit df2d8213d9e3 ("hisi_sas: use platform_get_irq()") failed to take into account that irq_of_parse_and_map() and platform_get_irq() have a different way of indicating an error: the former returns 0 and the latter returns a negative error code. Fix up the IRQ checks! Link: https://lore.kernel.org/r/810f26d3-908b-1d6b-dc5c-40019726baca@omprussia.ru Fixes: df2d8213d9e3 ("hisi_sas: use platform_get_irq()") Acked-by: John Garry <john.garry@huawei.com> Signed-off-by: Sergey Shtylyov <s.shtylyov@omprussia.ru> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-05-14scsi: ufs: ufshcd-pltfrm: Fix deferred probingSergey Shtylyov
[ Upstream commit 339c9b63cc7ce779ce45c675bf709cb58b807fc3 ] The driver overrides the error codes returned by platform_get_irq() to -ENODEV, so if it returns -EPROBE_DEFER, the driver would fail the probe permanently instead of the deferred probing. Propagate the error code upstream as it should have been done from the start... Link: https://lore.kernel.org/r/420364ca-614a-45e3-4e35-0e0653c7bc53@omprussia.ru Fixes: 2953f850c3b8 ("[SCSI] ufs: use devres functions for ufshcd") Signed-off-by: Sergey Shtylyov <s.shtylyov@omprussia.ru> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-05-14scsi: pm80xx: Fix potential infinite loopColin Ian King
[ Upstream commit 40fa7394a1ad5706e795823276f2e394cca145d0 ] The for-loop iterates with a u8 loop counter i and compares this with the loop upper limit of pm8001_ha->max_q_num which is a u32 type. There is a potential infinite loop if pm8001_ha->max_q_num is larger than the u8 loop counter. Fix this by making the loop counter the same type as pm8001_ha->max_q_num. [mkp: this is purely theoretical, max_q_num is currently limited to 64] Link: https://lore.kernel.org/r/20210407135840.494747-1-colin.king@canonical.com Fixes: 65df7d1986a1 ("scsi: pm80xx: Fix chip initialization failure") Addresses-Coverity: ("Infinite loop") Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com> Signed-off-by: Colin Ian King <colin.king@canonical.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-05-14scsi: pm80xx: Increase timeout for pm80xx mpi_uninit_check()Igor Pylypiv
[ Upstream commit 3f744a14f331f56703a9d74e86520db045f11831 ] The mpi_uninit_check() takes longer for inbound doorbell register to be cleared. Increase the timeout substantially so that the driver does not fail to load. Previously, the inbound doorbell wait time was mistakenly increased in the mpi_init_check() instead of mpi_uninit_check(). It is okay to leave the mpi_init_check() wait time as-is as these are timeout values and if there is a failure, waiting longer is not an issue. Link: https://lore.kernel.org/r/20210406180534.1924345-2-ipylypiv@google.com Fixes: e90e236250e9 ("scsi: pm80xx: Increase timeout for pm80xx mpi_uninit_check") Reviewed-by: Vishakha Channapattan <vishakhavc@google.com> Acked-by: Jack Wang <jinpu.wang@ionos.com> Signed-off-by: Igor Pylypiv <ipylypiv@google.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-05-11scsi: libfc: Fix a format specifierBart Van Assche
[ Upstream commit 90d6697810f06aceea9de71ad836a8c7669789cd ] Since the 'mfs' member has been declared as 'u32' in include/scsi/libfc.h, use the %u format specifier instead of %hu. This patch fixes the following clang compiler warning: warning: format specifies type 'unsigned short' but the argument has type 'u32' (aka 'unsigned int') [-Wformat] "lport->mfs:%hu\n", mfs, lport->mfs); ~~~ ^~~~~~~~~~ %u Link: https://lore.kernel.org/r/20210415220826.29438-8-bvanassche@acm.org Cc: Hannes Reinecke <hare@suse.de> Signed-off-by: Bart Van Assche <bvanassche@acm.org> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-05-11scsi: lpfc: Remove unsupported mbox PORT_CAPABILITIES logicJames Smart
[ Upstream commit b62232ba8caccaf1954e197058104a6478fac1af ] SLI-4 does not contain a PORT_CAPABILITIES mailbox command (only SLI-3 does, and SLI-3 doesn't use it), yet there are SLI-4 code paths that have code to issue the command. The command will always fail. Remove the code for the mailbox command and leave only the resulting "failure path" logic. Link: https://lore.kernel.org/r/20210412013127.2387-12-jsmart2021@gmail.com Co-developed-by: Justin Tee <justin.tee@broadcom.com> Signed-off-by: Justin Tee <justin.tee@broadcom.com> Signed-off-by: James Smart <jsmart2021@gmail.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-05-11scsi: lpfc: Fix error handling for mailboxes completed in MBX_POLL modeJames Smart
[ Upstream commit 304ee43238fed517faa123e034b593905b8679f8 ] In SLI-4, when performing a mailbox command with MBX_POLL, the driver uses the BMBX register to send the command rather than the MQ. A flag is set indicating the BMBX register is active and saves the mailbox job struct (mboxq) in the mbox_active element of the adapter. The routine then waits for completion or timeout. The mailbox job struct is not freed by the routine. In cases of timeout, the adapter will be reset. The lpfc_sli_mbox_sys_flush() routine will clean up the mbox in preparation for the reset. It clears the BMBX active flag and marks the job structure as MBX_NOT_FINISHED. But, it never frees the mboxq job structure. Expectation in both normal completion and timeout cases is that the issuer of the mbx command will free the structure. Unfortunately, not all calling paths are freeing the memory in cases of error. All calling paths were looked at and updated, if missing, to free the mboxq memory regardless of completion status. Link: https://lore.kernel.org/r/20210412013127.2387-7-jsmart2021@gmail.com Co-developed-by: Justin Tee <justin.tee@broadcom.com> Signed-off-by: Justin Tee <justin.tee@broadcom.com> Signed-off-by: James Smart <jsmart2021@gmail.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-05-11scsi: lpfc: Fix crash when a REG_RPI mailbox fails triggering a LOGO responseJames Smart
[ Upstream commit fffd18ec6579c2d9c72b212169259062fe747888 ] Fix a crash caused by a double put on the node when the driver completed an ACC for an unsolicted abort on the same node. The second put was executed by lpfc_nlp_not_used() and is wrong because the completion routine executes the nlp_put when the iocbq was released. Additionally, the driver is issuing a LOGO then immediately calls lpfc_nlp_set_state to put the node into NPR. This call does nothing. Remove the lpfc_nlp_not_used call and additional set_state in the completion routine. Remove the lpfc_nlp_set_state post issue_logo. Isn't necessary. Link: https://lore.kernel.org/r/20210412013127.2387-3-jsmart2021@gmail.com Co-developed-by: Justin Tee <justin.tee@broadcom.com> Signed-off-by: Justin Tee <justin.tee@broadcom.com> Signed-off-by: James Smart <jsmart2021@gmail.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-05-11scsi: scsi_dh_alua: Remove check for ASC 24h in alua_rtpg()Ewan D. Milne
[ Upstream commit bc3f2b42b70eb1b8576e753e7d0e117bbb674496 ] Some arrays return ILLEGAL_REQUEST with ASC 00h if they don't support the RTPG extended header so remove the check for INVALID FIELD IN CDB. Link: https://lore.kernel.org/r/20210331201154.20348-1-emilne@redhat.com Reviewed-by: Hannes Reinecke <hare@suse.de> Signed-off-by: Ewan D. Milne <emilne@redhat.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-05-11scsi: smartpqi: Add new PCI IDsKevin Barnett
[ Upstream commit 75fbeacca3ad30835e903002dba98dd909b4dfff ] Add support for newer hardware. Link: https://lore.kernel.org/r/161549386882.25025.2594251735886014958.stgit@brunhilda Reviewed-by: Scott Benesh <scott.benesh@microchip.com> Reviewed-by: Scott Teel <scott.teel@microchip.com> Acked-by: Martin Wilck <mwilck@suse.com> Signed-off-by: Kevin Barnett <kevin.barnett@microchip.com> Signed-off-by: Don Brace <don.brace@microchip.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-05-11scsi: smartpqi: Correct request leakage during reset operationsMurthy Bhat
[ Upstream commit b622a601a13ae5974c5b0aeecb990c224b8db0d9 ] While failing queued I/Os in TMF path, there was a request leak and hence stale entries in request pool with ref count being non-zero. In shutdown path we have a BUG_ON to catch stuck I/O either in firmware or in the driver. The stale requests caused a system crash. The I/O request pool leakage also lead to a significant performance drop. Link: https://lore.kernel.org/r/161549370379.25025.12793264112620796062.stgit@brunhilda Reviewed-by: Scott Teel <scott.teel@microchip.com> Reviewed-by: Scott Benesh <scott.benesh@microchip.com> Reviewed-by: Kevin Barnett <kevin.barnett@microchip.com> Signed-off-by: Murthy Bhat <Murthy.Bhat@microchip.com> Signed-off-by: Don Brace <don.brace@microchip.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-05-11scsi: smartpqi: Use host-wide tag spaceDon Brace
[ Upstream commit c6d3ee209b9e863c6251f72101511340451ca324 ] Correct SCSI midlayer sending more requests than exposed host queue depth causing firmware ASSERT and lockup issues by enabling host-wide tags. Note: This also results in better performance. Link: https://lore.kernel.org/r/161549369787.25025.8975999483518581619.stgit@brunhilda Suggested-by: Ming Lei <ming.lei@redhat.com> Suggested-by: John Garry <john.garry@huawei.com> Reviewed-by: Scott Benesh <scott.benesh@microchip.com> Reviewed-by: Scott Teel <scott.teel@microchip.com> Reviewed-by: Mike McGowen <mike.mcgowen@microchip.com> Reviewed-by: Kevin Barnett <kevin.barnett@microchip.com> Signed-off-by: Don Brace <don.brace@microchip.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-05-11scsi: qla2xxx: Fix use after free in bsgQuinn Tran
[ Upstream commit 2ce35c0821afc2acd5ee1c3f60d149f8b2520ce8 ] On bsg command completion, bsg_job_done() was called while qla driver continued to access the bsg_job buffer. bsg_job_done() would free up resources that ended up being reused by other task while the driver continued to access the buffers. As a result, driver was reading garbage data. localhost kernel: BUG: KASAN: use-after-free in sg_next+0x64/0x80 localhost kernel: Read of size 8 at addr ffff8883228a3330 by task swapper/26/0 localhost kernel: localhost kernel: CPU: 26 PID: 0 Comm: swapper/26 Kdump: loaded Tainted: G OE --------- - - 4.18.0-193.el8.x86_64+debug #1 localhost kernel: Hardware name: HP ProLiant DL360 Gen9/ProLiant DL360 Gen9, BIOS P89 08/12/2016 localhost kernel: Call Trace: localhost kernel: <IRQ> localhost kernel: dump_stack+0x9a/0xf0 localhost kernel: print_address_description.cold.3+0x9/0x23b localhost kernel: kasan_report.cold.4+0x65/0x95 localhost kernel: debug_dma_unmap_sg.part.12+0x10d/0x2d0 localhost kernel: qla2x00_bsg_sp_free+0xaf6/0x1010 [qla2xxx] Link: https://lore.kernel.org/r/20210329085229.4367-6-njavali@marvell.com Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com> Signed-off-by: Quinn Tran <qutran@marvell.com> Signed-off-by: Saurav Kashyap <skashyap@marvell.com> Signed-off-by: Nilesh Javali <njavali@marvell.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-05-11scsi: qla2xxx: Always check the return value of qla24xx_get_isp_stats()Bart Van Assche
[ Upstream commit a2b2cc660822cae08c351c7f6b452bfd1330a4f7 ] This patch fixes the following Coverity warning: CID 361199 (#1 of 1): Unchecked return value (CHECKED_RETURN) 3. check_return: Calling qla24xx_get_isp_stats without checking return value (as is done elsewhere 4 out of 5 times). Link: https://lore.kernel.org/r/20210320232359.941-7-bvanassche@acm.org Cc: Quinn Tran <qutran@marvell.com> Cc: Mike Christie <michael.christie@oracle.com> Cc: Himanshu Madhani <himanshu.madhani@oracle.com> Cc: Daniel Wagner <dwagner@suse.de> Cc: Lee Duncan <lduncan@suse.com> Reviewed-by: Daniel Wagner <dwagner@suse.de> Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com> Signed-off-by: Bart Van Assche <bvanassche@acm.org> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-05-11scsi: lpfc: Fix pt2pt connection does not recover after LOGOJames Smart
[ Upstream commit bd4f5100424d17d4e560d6653902ef8e49b2fc1f ] On a pt2pt setup, between 2 initiators, if one side issues a a LOGO, there is no relogin attempt. The FC specs are grey in this area on which port (higher wwn or not) is to re-login. As there is no spec guidance, unconditionally re-PLOGI after the logout to ensure a login is re-established. Link: https://lore.kernel.org/r/20210301171821.3427-8-jsmart2021@gmail.com Co-developed-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: James Smart <jsmart2021@gmail.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-05-11scsi: lpfc: Fix incorrect dbde assignment when building target abts wqeJames Smart
[ Upstream commit 9302154c07bff4e7f7f43c506a1ac84540303d06 ] The wqe_dbde field indicates whether a Data BDE is present in Words 0:2 and should therefore should be clear in the abts request wqe. By setting the bit we can be misleading fw into error cases. Clear the wqe_dbde field. Link: https://lore.kernel.org/r/20210301171821.3427-2-jsmart2021@gmail.com Co-developed-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: James Smart <jsmart2021@gmail.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-05-11scsi: mpt3sas: Block PCI config access from userspace during resetSreekanth Reddy
commit 3c8604691d2acc7b7d4795d9695070de9eaa5828 upstream. While diag reset is in progress there is short duration where all access to controller's PCI config space from the host needs to be blocked. This is due to a hardware limitation of the IOC controllers. Block all access to controller's config space from userland applications by calling pci_cfg_access_lock() while diag reset is in progress and unlocking it again after the controller comes back to ready state. Link: https://lore.kernel.org/r/20210330105137.20728-1-sreekanth.reddy@broadcom.com Cc: stable@vger.kernel.org #v5.4.108+ Signed-off-by: Sreekanth Reddy <sreekanth.reddy@broadcom.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-05-11scsi: qla2xxx: Fix crash in qla2xxx_mqueuecommand()Arun Easi
commit 6641df81ab799f28a5d564f860233dd26cca0d93 upstream. RIP: 0010:kmem_cache_free+0xfa/0x1b0 Call Trace: qla2xxx_mqueuecommand+0x2b5/0x2c0 [qla2xxx] scsi_queue_rq+0x5e2/0xa40 __blk_mq_try_issue_directly+0x128/0x1d0 blk_mq_request_issue_directly+0x4e/0xb0 Fix incorrect call to free srb in qla2xxx_mqueuecommand(), as srb is now allocated by upper layers. This fixes smatch warning of srb unintended free. Link: https://lore.kernel.org/r/20210329085229.4367-7-njavali@marvell.com Fixes: af2a0c51b120 ("scsi: qla2xxx: Fix SRB leak on switch command timeout") Cc: stable@vger.kernel.org # 5.5 Reported-by: Laurence Oberman <loberman@redhat.com> Reported-by: Dan Carpenter <dan.carpenter@oracle.com> Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com> Signed-off-by: Arun Easi <aeasi@marvell.com> Signed-off-by: Nilesh Javali <njavali@marvell.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-04-21scsi: libsas: Reset num_scatter if libata marks qc as NODATAJolly Shah
commit 176ddd89171ddcf661862d90c5d257877f7326d6 upstream. When the cache_type for the SCSI device is changed, the SCSI layer issues a MODE_SELECT command. The caching mode details are communicated via a request buffer associated with the SCSI command with data direction set as DMA_TO_DEVICE (scsi_mode_select()). When this command reaches the libata layer, as a part of generic initial setup, libata layer sets up the scatterlist for the command using the SCSI command (ata_scsi_qc_new()). This command is then translated by the libata layer into ATA_CMD_SET_FEATURES (ata_scsi_mode_select_xlat()). The libata layer treats this as a non-data command (ata_mselect_caching()), since it only needs an ATA taskfile to pass the caching on/off information to the device. It does not need the scatterlist that has been setup, so it does not perform dma_map_sg() on the scatterlist (ata_qc_issue()). Unfortunately, when this command reaches the libsas layer (sas_ata_qc_issue()), libsas layer sees it as a non-data command with a scatterlist. It cannot extract the correct DMA length since the scatterlist has not been mapped with dma_map_sg() for a DMA operation. When this partially constructed SAS task reaches pm80xx LLDD, it results in the following warning: "pm80xx_chip_sata_req 6058: The sg list address start_addr=0x0000000000000000 data_len=0x0end_addr_high=0xffffffff end_addr_low=0xffffffff has crossed 4G boundary" Update libsas to handle ATA non-data commands separately so num_scatter and total_xfer_len remain 0. Link: https://lore.kernel.org/r/20210318225632.2481291-1-jollys@google.com Fixes: 53de092f47ff ("scsi: libsas: Set data_dir as DMA_NONE if libata marks qc as NODATA") Tested-by: Luo Jiaxing <luojiaxing@huawei.com> Reviewed-by: John Garry <john.garry@huawei.com> Signed-off-by: Jolly Shah <jollys@google.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-04-21scsi: scsi_transport_srp: Don't block target in SRP_PORT_LOST stateMartin Wilck
[ Upstream commit 5cd0f6f57639c5afbb36100c69281fee82c95ee7 ] rport_dev_loss_timedout() sets the rport state to SRP_PORT_LOST and the SCSI target state to SDEV_TRANSPORT_OFFLINE. If this races with srp_reconnect_work(), a warning is printed: Mar 27 18:48:07 ictm1604s01h4 kernel: dev_loss_tmo expired for SRP port-18:1 / host18. Mar 27 18:48:07 ictm1604s01h4 kernel: ------------[ cut here ]------------ Mar 27 18:48:07 ictm1604s01h4 kernel: scsi_internal_device_block(18:0:0:100) failed: ret = -22 Mar 27 18:48:07 ictm1604s01h4 kernel: Call Trace: Mar 27 18:48:07 ictm1604s01h4 kernel: ? scsi_target_unblock+0x50/0x50 [scsi_mod] Mar 27 18:48:07 ictm1604s01h4 kernel: starget_for_each_device+0x80/0xb0 [scsi_mod] Mar 27 18:48:07 ictm1604s01h4 kernel: target_block+0x24/0x30 [scsi_mod] Mar 27 18:48:07 ictm1604s01h4 kernel: device_for_each_child+0x57/0x90 Mar 27 18:48:07 ictm1604s01h4 kernel: srp_reconnect_rport+0xe4/0x230 [scsi_transport_srp] Mar 27 18:48:07 ictm1604s01h4 kernel: srp_reconnect_work+0x40/0xc0 [scsi_transport_srp] Avoid this by not trying to block targets for rports in SRP_PORT_LOST state. Link: https://lore.kernel.org/r/20210401091105.8046-1-mwilck@suse.com Reviewed-by: Bart Van Assche <bvanassche@acm.org> Signed-off-by: Martin Wilck <mwilck@suse.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-04-14scsi: ufs: core: Fix wrong Task Tag used in task management request UPIUsCan Guo
[ Upstream commit 4b42d557a8add52b9a9924fb31e40a218aab7801 ] In __ufshcd_issue_tm_cmd(), it is not correct to use hba->nutrs + req->tag as the Task Tag in a TMR UPIU. Directly use req->tag as the Task Tag. Fixes: e293313262d3 ("scsi: ufs: Fix broken task management command implementation") Link: https://lore.kernel.org/r/1617262750-4864-3-git-send-email-cang@codeaurora.org Reviewed-by: Bart Van Assche <bvanassche@acm.org> Signed-off-by: Can Guo <cang@codeaurora.org> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-04-14scsi: ufs: core: Fix task management request completion timeoutCan Guo
[ Upstream commit 1235fc569e0bf541ddda0a1224d4c6fa6d914890 ] ufshcd_tmc_handler() calls blk_mq_tagset_busy_iter(fn = ufshcd_compl_tm()), but since blk_mq_tagset_busy_iter() only iterates over all reserved tags and requests which are not in IDLE state, ufshcd_compl_tm() never gets a chance to run. Thus, TMR always ends up with completion timeout. Fix it by calling blk_mq_start_request() in __ufshcd_issue_tm_cmd(). Link: https://lore.kernel.org/r/1617262750-4864-2-git-send-email-cang@codeaurora.org Fixes: 69a6c269c097 ("scsi: ufs: Use blk_{get,put}_request() to allocate and free TMFs") Reviewed-by: Bart Van Assche <bvanassche@acm.org> Signed-off-by: Can Guo <cang@codeaurora.org> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-04-14scsi: pm80xx: Fix chip initialization failureViswas G
commit 65df7d1986a1909a0869419919e7d9c78d70407e upstream. Inbound and outbound queues were not properly configured and that lead to MPI configuration failure. Fixes: 05c6c029a44d ("scsi: pm80xx: Increase number of supported queues") Cc: stable@vger.kernel.org # 5.10+ Link: https://lore.kernel.org/r/20210402054212.17834-1-Viswas.G@microchip.com.com Reported-and-tested-by: Ash Izat <ash@ai0.uk> Signed-off-by: Viswas G <Viswas.G@microchip.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-04-07scsi: qla2xxx: Fix broken #endif placementAlexey Dobriyan
[ Upstream commit 5999b9e5b1f8a2f5417b755130919b3ac96f5550 ] Only half of the file is under include guard because terminating #endif is placed too early. Link: https://lore.kernel.org/r/YE4snvoW1SuwcXAn@localhost.localdomain Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com> Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-04-07scsi: st: Fix a use after free in st_open()Lv Yunlong
[ Upstream commit c8c165dea4c8f5ad67b1240861e4f6c5395fa4ac ] In st_open(), if STp->in_use is true, STp will be freed by scsi_tape_put(). However, STp is still used by DEBC_printk() after. It is better to DEBC_printk() before scsi_tape_put(). Link: https://lore.kernel.org/r/20210311064636.10522-1-lyl2019@mail.ustc.edu.cn Acked-by: Kai Mäkisara <kai.makisara@kolumbus.fi> Signed-off-by: Lv Yunlong <lyl2019@mail.ustc.edu.cn> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-03-30scsi: mpt3sas: Fix error return code of mpt3sas_base_attach()Jia-Ju Bai
[ Upstream commit 3401ecf7fc1b9458a19d42c0e26a228f18ac7dda ] When kzalloc() returns NULL, no error return code of mpt3sas_base_attach() is assigned. To fix this bug, r is assigned with -ENOMEM in this case. Link: https://lore.kernel.org/r/20210308035241.3288-1-baijiaju1990@gmail.com Fixes: c696f7b83ede ("scsi: mpt3sas: Implement device_remove_in_progress check in IOCTL path") Reported-by: TOTE Robot <oslab@tsinghua.edu.cn> Signed-off-by: Jia-Ju Bai <baijiaju1990@gmail.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-03-30scsi: qedi: Fix error return code of qedi_alloc_global_queues()Jia-Ju Bai
[ Upstream commit f69953837ca5d98aa983a138dc0b90a411e9c763 ] When kzalloc() returns NULL to qedi->global_queues[i], no error return code of qedi_alloc_global_queues() is assigned. To fix this bug, status is assigned with -ENOMEM in this case. Link: https://lore.kernel.org/r/20210308033024.27147-1-baijiaju1990@gmail.com Fixes: ace7f46ba5fd ("scsi: qedi: Add QLogic FastLinQ offload iSCSI driver framework.") Reported-by: TOTE Robot <oslab@tsinghua.edu.cn> Acked-by: Manish Rangankar <mrangankar@marvell.com> Signed-off-by: Jia-Ju Bai <baijiaju1990@gmail.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-03-30scsi: Revert "qla2xxx: Make sure that aborted commands are freed"Bart Van Assche
[ Upstream commit 39c0c8553bfb5a3d108aa47f1256076d507605e3 ] Calling vha->hw->tgt.tgt_ops->free_cmd() from qlt_xmit_response() is wrong since the command for which a response is sent must remain valid until the SCSI target core calls .release_cmd(). It has been observed that the following scenario triggers a kernel crash: - qlt_xmit_response() calls qlt_check_reserve_free_req() - qlt_check_reserve_free_req() returns -EAGAIN - qlt_xmit_response() calls vha->hw->tgt.tgt_ops->free_cmd(cmd) - transport_handle_queue_full() tries to retransmit the response Fix this crash by reverting the patch that introduced it. Link: https://lore.kernel.org/r/20210320232359.941-2-bvanassche@acm.org Fixes: 0dcec41acb85 ("scsi: qla2xxx: Make sure that aborted commands are freed") Cc: Quinn Tran <qutran@marvell.com> Cc: Mike Christie <michael.christie@oracle.com> Reviewed-by: Daniel Wagner <dwagner@suse.de> Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com> Signed-off-by: Bart Van Assche <bvanassche@acm.org> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-03-30scsi: ufs: ufs-qcom: Disable interrupt in reset pathNitin Rawat
[ Upstream commit 4a791574a0ccf36eb3a0a46fbd71d2768df3eef9 ] Disable interrupt in reset path to flush pending IRQ handler in order to avoid possible NoC issues. Link: https://lore.kernel.org/r/1614145010-36079-3-git-send-email-cang@codeaurora.org Reviewed-by: Avri Altman <avri.altman@wdc.com> Signed-off-by: Nitin Rawat <nitirawa@codeaurora.org> Signed-off-by: Can Guo <cang@codeaurora.org> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-03-25scsi: isci: Pass gfp_t flags in isci_port_bc_change_received()Ahmed S. Darwish
[ Upstream commit 71dca5539fcf977aead0c9ea1962e70e78484b8e ] Use the new libsas event notifiers API, which requires callers to explicitly pass the gfp_t memory allocation flags. libsas sas_notify_port_event() is called from isci_port_bc_change_received(). Below is the context analysis for all of its call chains: host.c: sci_controller_error_handler(): atomic, irq handler (*) OR host.c: sci_controller_completion_handler(), atomic, tasklet (*) -> sci_controller_process_completions() -> sci_controller_event_completion() -> phy.c: sci_phy_event_handler() -> port.c: sci_port_broadcast_change_received() -> isci_port_bc_change_received() host.c: isci_host_init() (@) spin_lock_irq(isci_host::scic_lock) -> sci_controller_initialize(), atomic (*) -> port_config.c: sci_port_configuration_agent_initialize() -> sci_mpc_agent_validate_phy_configuration() -> port.c: sci_port_add_phy() -> sci_port_set_phy() -> phy.c: sci_phy_set_port() -> port.c: sci_port_broadcast_change_received() -> isci_port_bc_change_received() port_config.c: apc_agent_timeout(), atomic, timer callback (*) -> sci_apc_agent_configure_ports() -> port.c: sci_port_add_phy() -> sci_port_set_phy() -> phy.c: sci_phy_set_port() -> port.c: sci_port_broadcast_change_received() -> isci_port_bc_change_received() phy.c: enter SCI state: *SCI_PHY_STOPPED* # Cont. from [1] -> sci_phy_stopped_state_enter() -> host.c: sci_controller_link_down() -> ->link_down_handler() == port_config.c: sci_apc_agent_link_down() -> port.c: sci_port_remove_phy() -> sci_port_clear_phy() -> phy.c: sci_phy_set_port() -> port.c: sci_port_broadcast_change_received() -> isci_port_bc_change_received() phy.c: enter SCI state: *SCI_PHY_STARTING* # Cont. from [2] -> sci_phy_starting_state_enter() -> host.c: sci_controller_link_down() -> ->link_down_handler() == port_config.c: sci_apc_agent_link_down() -> port.c: sci_port_remove_phy() -> sci_port_clear_phy() -> phy.c: sci_phy_set_port() -> port.c: sci_port_broadcast_change_received() -> isci_port_bc_change_received() [1] Call chains for entering state: *SCI_PHY_STOPPED* ----------------------------------------------------- host.c: isci_host_init() (@) spin_lock_irq(isci_host::scic_lock) -> sci_controller_initialize(), atomic (*) -> phy.c: sci_phy_initialize() -> phy.c: sci_phy_link_layer_initialization() -> phy.c: sci_change_state(SCI_PHY_STOPPED) init.c: PCI ->remove() || PM_OPS ->suspend, process context (+) -> host.c: isci_host_deinit() -> sci_controller_stop_phys() -> phy.c: sci_phy_stop() -> sci_change_state(SCI_PHY_STOPPED) phy.c: isci_phy_control() spin_lock_irqsave(isci_host::scic_lock, ) -> sci_phy_stop(), atomic (*) -> sci_change_state(SCI_PHY_STOPPED) [2] Call chains for entering state: *SCI_PHY_STARTING* ------------------------------------------------------ phy.c: phy_sata_timeout(), atimer, timer callback (*) spin_lock_irqsave(isci_host::scic_lock, ) -> sci_change_state(SCI_PHY_STARTING) host.c: phy_startup_timeout(), atomic, timer callback (*) spin_lock_irqsave(isci_host::scic_lock, ) -> sci_controller_start_next_phy() -> sci_phy_start() -> sci_change_state(SCI_PHY_STARTING) host.c: isci_host_start() (@) spin_lock_irq(isci_host::scic_lock) -> sci_controller_start(), atomic (*) -> sci_controller_start_next_phy() -> sci_phy_start() -> sci_change_state(SCI_PHY_STARTING) phy.c: Enter SCI state *SCI_PHY_SUB_FINAL* # Cont. from [2A] -> sci_change_state(SCI_PHY_SUB_FINAL) -> sci_phy_starting_final_substate_enter() -> sci_change_state(SCI_PHY_READY) -> Enter SCI state: *SCI_PHY_READY* -> sci_phy_ready_state_enter() -> host.c: sci_controller_link_up() -> sci_controller_start_next_phy() -> sci_phy_start() -> sci_change_state(SCI_PHY_STARTING) phy.c: sci_phy_event_handler(), atomic, discussed earlier (*) -> sci_change_state(SCI_PHY_STARTING), 11 instances port.c: isci_port_perform_hard_reset() spin_lock_irqsave(isci_host::scic_lock, ) -> port.c: sci_port_hard_reset(), atomic (*) -> phy.c: sci_phy_reset() -> sci_change_state(SCI_PHY_RESETTING) -> enter SCI PHY state: *SCI_PHY_RESETTING* -> sci_phy_resetting_state_enter() -> sci_change_state(SCI_PHY_STARTING) [2A] Call chains for entering SCI state: *SCI_PHY_SUB_FINAL* ------------------------------------------------------------ host.c: power_control_timeout(), atomic, timer callback (*) spin_lock_irqsave(isci_host::scic_lock, ) -> phy.c: sci_phy_consume_power_handler() -> phy.c: sci_change_state(SCI_PHY_SUB_FINAL) host.c: sci_controller_error_handler(): atomic, irq handler (*) OR host.c: sci_controller_completion_handler(), atomic, tasklet (*) -> sci_controller_process_completions() -> sci_controller_unsolicited_frame() -> phy.c: sci_phy_frame_handler() -> sci_change_state(SCI_PHY_SUB_AWAIT_SAS_POWER) -> sci_phy_starting_await_sas_power_substate_enter() -> host.c: sci_controller_power_control_queue_insert() -> phy.c: sci_phy_consume_power_handler() -> sci_change_state(SCI_PHY_SUB_FINAL) -> sci_change_state(SCI_PHY_SUB_FINAL) -> sci_controller_event_completion() -> phy.c: sci_phy_event_handler() -> sci_phy_start_sata_link_training() -> sci_change_state(SCI_PHY_SUB_AWAIT_SATA_POWER) -> sci_phy_starting_await_sata_power_substate_enter -> host.c: sci_controller_power_control_queue_insert() -> phy.c: sci_phy_consume_power_handler() -> sci_change_state(SCI_PHY_SUB_FINAL) As can be seen from the "(*)" markers above, almost all the call-chains are atomic. The only exception, marked with "(+)", is a PCI ->remove() and PM_OPS ->suspend() cold path. Thus, pass GFP_ATOMIC to the libsas port event notifier. Note, the now-replaced libsas APIs used in_interrupt() to implicitly decide which memory allocation type to use. This was only partially correct, as it fails to choose the correct GFP flags when just preemption or interrupts are disabled. Such buggy code paths are marked with "(@)" in the call chains above. Link: https://lore.kernel.org/r/20210118100955.1761652-8-a.darwish@linutronix.de Fixes: 1c393b970e0f ("scsi: libsas: Use dynamic alloced work to avoid sas event lost") Cc: Artur Paszkiewicz <artur.paszkiewicz@intel.com> Reviewed-by: John Garry <john.garry@huawei.com> Signed-off-by: Ahmed S. Darwish <a.darwish@linutronix.de> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-03-25scsi: isci: Pass gfp_t flags in isci_port_link_up()Ahmed S. Darwish
[ Upstream commit 5ce7902902adb8d154d67ba494f06daa29360ef0 ] Use the new libsas event notifiers API, which requires callers to explicitly pass the gfp_t memory allocation flags. libsas sas_notify_port_event() is called from isci_port_link_up(). Below is the context analysis for all of its call chains: host.c: isci_host_init() (@) spin_lock_irq(isci_host::scic_lock) -> sci_controller_initialize(), atomic (*) -> port_config.c: sci_port_configuration_agent_initialize() -> sci_mpc_agent_validate_phy_configuration() -> port.c: sci_port_add_phy() -> sci_port_general_link_up_handler() -> sci_port_activate_phy() -> isci_port_link_up() port_config.c: apc_agent_timeout(), atomic, timer callback (*) -> sci_apc_agent_configure_ports() -> port.c: sci_port_add_phy() -> sci_port_general_link_up_handler() -> sci_port_activate_phy() -> isci_port_link_up() phy.c: enter SCI state: *SCI_PHY_SUB_FINAL* # Cont. from [1] -> phy.c: sci_phy_starting_final_substate_enter() -> phy.c: sci_change_state(SCI_PHY_READY) -> enter SCI state: *SCI_PHY_READY* -> phy.c: sci_phy_ready_state_enter() -> host.c: sci_controller_link_up() -> .link_up_handler() == port_config.c: sci_apc_agent_link_up() -> port.c: sci_port_link_up() -> (continue at [A]) == port_config.c: sci_mpc_agent_link_up() -> port.c: sci_port_link_up() -> (continue at [A]) port_config.c: mpc_agent_timeout(), atomic, timer callback (*) spin_lock_irqsave(isci_host::scic_lock, ) -> ->link_up_handler() == port_config.c: sci_apc_agent_link_up() -> port.c: sci_port_link_up() -> (continue at [A]) == port_config.c: sci_mpc_agent_link_up() -> port.c: sci_port_link_up() -> (continue at [A]) [A] port.c: sci_port_link_up() -> sci_port_activate_phy() -> isci_port_link_up() -> sci_port_general_link_up_handler() -> sci_port_activate_phy() -> isci_port_link_up() [1] Call chains for entering SCI state: *SCI_PHY_SUB_FINAL* ----------------------------------------------------------- host.c: power_control_timeout(), atomic, timer callback (*) spin_lock_irqsave(isci_host::scic_lock, ) -> phy.c: sci_phy_consume_power_handler() -> phy.c: sci_change_state(SCI_PHY_SUB_FINAL) host.c: sci_controller_error_handler(): atomic, irq handler (*) OR host.c: sci_controller_completion_handler(), atomic, tasklet (*) -> sci_controller_process_completions() -> sci_controller_unsolicited_frame() -> phy.c: sci_phy_frame_handler() -> sci_change_state(SCI_PHY_SUB_AWAIT_SAS_POWER) -> sci_phy_starting_await_sas_power_substate_enter() -> host.c: sci_controller_power_control_queue_insert() -> phy.c: sci_phy_consume_power_handler() -> sci_change_state(SCI_PHY_SUB_FINAL) -> sci_change_state(SCI_PHY_SUB_FINAL) -> sci_controller_event_completion() -> phy.c: sci_phy_event_handler() -> sci_phy_start_sata_link_training() -> sci_change_state(SCI_PHY_SUB_AWAIT_SATA_POWER) -> sci_phy_starting_await_sata_power_substate_enter -> host.c: sci_controller_power_control_queue_insert() -> phy.c: sci_phy_consume_power_handler() -> sci_change_state(SCI_PHY_SUB_FINAL) As can be seen from the "(*)" markers above, all the call-chains are atomic. Pass GFP_ATOMIC to libsas port event notifier. Note, the now-replaced libsas APIs used in_interrupt() to implicitly decide which memory allocation type to use. This was only partially correct, as it fails to choose the correct GFP flags when just preemption or interrupts are disabled. Such buggy code paths are marked with "(@)" in the call chains above. Link: https://lore.kernel.org/r/20210118100955.1761652-7-a.darwish@linutronix.de Fixes: 1c393b970e0f ("scsi: libsas: Use dynamic alloced work to avoid sas event lost") Cc: Artur Paszkiewicz <artur.paszkiewicz@intel.com> Reviewed-by: John Garry <john.garry@huawei.com> Signed-off-by: Ahmed S. Darwish <a.darwish@linutronix.de> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-03-25scsi: isci: Pass gfp_t flags in isci_port_link_down()Ahmed S. Darwish
[ Upstream commit 885ab3b8926fdf9cdd7163dfad99deb9b0662b39 ] Use the new libsas event notifiers API, which requires callers to explicitly pass the gfp_t memory allocation flags. sas_notify_phy_event() is exclusively called by isci_port_link_down(). Below is the context analysis for all of its call chains: port.c: port_timeout(), atomic, timer callback (*) spin_lock_irqsave(isci_host::scic_lock, ) -> port_state_machine_change(..., SCI_PORT_FAILED) -> enter SCI port state: *SCI_PORT_FAILED* -> sci_port_failed_state_enter() -> isci_port_hard_reset_complete() -> isci_port_link_down() port.c: isci_port_perform_hard_reset() spin_lock_irqsave(isci_host::scic_lock, ) -> port.c: sci_port_hard_reset(), atomic (*) -> phy.c: sci_phy_reset() -> sci_change_state(SCI_PHY_RESETTING) -> enter SCI PHY state: *SCI_PHY_RESETTING* -> sci_phy_resetting_state_enter() -> port.c: sci_port_deactivate_phy() -> isci_port_link_down() port.c: enter SCI port state: *SCI_PORT_READY* # Cont. from [1] -> sci_port_ready_state_enter() -> isci_port_hard_reset_complete() -> isci_port_link_down() phy.c: enter SCI state: *SCI_PHY_STOPPED* # Cont. from [2] -> sci_phy_stopped_state_enter() -> host.c: sci_controller_link_down() -> ->link_down_handler() == port_config.c: sci_apc_agent_link_down() -> port.c: sci_port_remove_phy() -> sci_port_deactivate_phy() -> isci_port_link_down() == port_config.c: sci_mpc_agent_link_down() -> port.c: sci_port_link_down() -> sci_port_deactivate_phy() -> isci_port_link_down() phy.c: enter SCI state: *SCI_PHY_STARTING* # Cont. from [3] -> sci_phy_starting_state_enter() -> host.c: sci_controller_link_down() -> ->link_down_handler() == port_config.c: sci_apc_agent_link_down() -> port.c: sci_port_remove_phy() -> isci_port_link_down() == port_config.c: sci_mpc_agent_link_down() -> port.c: sci_port_link_down() -> sci_port_deactivate_phy() -> isci_port_link_down() [1] Call chains for 'enter SCI port state: *SCI_PORT_READY*' ------------------------------------------------------------ host.c: isci_host_init() (@) spin_lock_irq(isci_host::scic_lock) -> sci_controller_initialize(), atomic (*) -> port_config.c: sci_port_configuration_agent_initialize() -> sci_mpc_agent_validate_phy_configuration() -> port.c: sci_port_add_phy() -> sci_port_general_link_up_handler() -> port_state_machine_change(, SCI_PORT_READY) -> enter port state *SCI_PORT_READY* host.c: isci_host_start() (@) spin_lock_irq(isci_host::scic_lock) -> host.c: sci_controller_start(), atomic (*) -> host.c: sci_port_start() -> port.c: port_state_machine_change(, SCI_PORT_READY) -> enter port state *SCI_PORT_READY* port_config.c: apc_agent_timeout(), atomic, timer callback (*) -> sci_apc_agent_configure_ports() -> port.c: sci_port_add_phy() -> sci_port_general_link_up_handler() -> port_state_machine_change(, SCI_PORT_READY) -> enter port state *SCI_PORT_READY* port_config.c: mpc_agent_timeout(), atomic, timer callback (*) spin_lock_irqsave(isci_host::scic_lock, ) -> ->link_up_handler() == port.c: sci_apc_agent_link_up() -> sci_port_general_link_up_handler() -> port_state_machine_change(, SCI_PORT_READY) -> enter port state *SCI_PORT_READY* == port.c: sci_mpc_agent_link_up() -> port.c: sci_port_link_up() -> sci_port_general_link_up_handler() -> port_state_machine_change(, SCI_PORT_READY) -> enter port state *SCI_PORT_READY* phy.c: enter SCI state: SCI_PHY_SUB_FINAL # Cont. from [1A] -> sci_phy_starting_final_substate_enter() -> sci_change_state(SCI_PHY_READY) -> enter SCI state: *SCI_PHY_READY* -> sci_phy_ready_state_enter() -> host.c: sci_controller_link_up() -> port_agent.link_up_handler() == port_config.c: sci_apc_agent_link_up() -> port.c: sci_port_link_up() -> sci_port_general_link_up_handler() -> port_state_machine_change(, SCI_PORT_READY) -> enter port state *SCI_PORT_READY* == port_config.c: sci_mpc_agent_link_up() -> port.c: sci_port_link_up() -> sci_port_general_link_up_handler() -> port_state_machine_change(, SCI_PORT_READY) -> enter port state *SCI_PORT_READY* [1A] Call chains for entering SCI state: *SCI_PHY_SUB_FINAL* ------------------------------------------------------------ host.c: power_control_timeout(), atomic, timer callback (*) spin_lock_irqsave(isci_host::scic_lock, ) -> phy.c: sci_phy_consume_power_handler() -> phy.c: sci_change_state(SCI_PHY_SUB_FINAL) host.c: sci_controller_error_handler(): atomic, irq handler (*) OR host.c: sci_controller_completion_handler(), atomic, tasklet (*) -> sci_controller_process_completions() -> sci_controller_unsolicited_frame() -> phy.c: sci_phy_frame_handler() -> sci_change_state(SCI_PHY_SUB_AWAIT_SAS_POWER) -> sci_phy_starting_await_sas_power_substate_enter() -> host.c: sci_controller_power_control_queue_insert() -> phy.c: sci_phy_consume_power_handler() -> sci_change_state(SCI_PHY_SUB_FINAL) -> sci_change_state(SCI_PHY_SUB_FINAL) -> sci_controller_event_completion() -> phy.c: sci_phy_event_handler() -> sci_phy_start_sata_link_training() -> sci_change_state(SCI_PHY_SUB_AWAIT_SATA_POWER) -> sci_phy_starting_await_sata_power_substate_enter -> host.c: sci_controller_power_control_queue_insert() -> phy.c: sci_phy_consume_power_handler() -> sci_change_state(SCI_PHY_SUB_FINAL) [2] Call chains for entering state: *SCI_PHY_STOPPED* ----------------------------------------------------- host.c: isci_host_init() (@) spin_lock_irq(isci_host::scic_lock) -> sci_controller_initialize(), atomic (*) -> phy.c: sci_phy_initialize() -> phy.c: sci_phy_link_layer_initialization() -> phy.c: sci_change_state(SCI_PHY_STOPPED) init.c: PCI ->remove() || PM_OPS ->suspend, process context (+) -> host.c: isci_host_deinit() -> sci_controller_stop_phys() -> phy.c: sci_phy_stop() -> sci_change_state(SCI_PHY_STOPPED) phy.c: isci_phy_control() spin_lock_irqsave(isci_host::scic_lock, ) -> sci_phy_stop(), atomic (*) -> sci_change_state(SCI_PHY_STOPPED) [3] Call chains for entering state: *SCI_PHY_STARTING* ------------------------------------------------------ phy.c: phy_sata_timeout(), atimer, timer callback (*) spin_lock_irqsave(isci_host::scic_lock, ) -> sci_change_state(SCI_PHY_STARTING) host.c: phy_startup_timeout(), atomic, timer callback (*) spin_lock_irqsave(isci_host::scic_lock, ) -> sci_controller_start_next_phy() -> sci_phy_start() -> sci_change_state(SCI_PHY_STARTING) host.c: isci_host_start() (@) spin_lock_irq(isci_host::scic_lock) -> sci_controller_start(), atomic (*) -> sci_controller_start_next_phy() -> sci_phy_start() -> sci_change_state(SCI_PHY_STARTING) phy.c: Enter SCI state *SCI_PHY_SUB_FINAL*, atomic, check above (*) -> sci_change_state(SCI_PHY_SUB_FINAL) -> sci_phy_starting_final_substate_enter() -> sci_change_state(SCI_PHY_READY) -> Enter SCI state: *SCI_PHY_READY* -> sci_phy_ready_state_enter() -> host.c: sci_controller_link_up() -> sci_controller_start_next_phy() -> sci_phy_start() -> sci_change_state(SCI_PHY_STARTING) phy.c: sci_phy_event_handler(), atomic, discussed earlier (*) -> sci_change_state(SCI_PHY_STARTING), 11 instances phy.c: enter SCI state: *SCI_PHY_RESETTING*, atomic, discussed (*) -> sci_phy_resetting_state_enter() -> sci_change_state(SCI_PHY_STARTING) As can be seen from the "(*)" markers above, almost all the call-chains are atomic. The only exception, marked with "(+)", is a PCI ->remove() and PM_OPS ->suspend() cold path. Thus, pass GFP_ATOMIC to the libsas phy event notifier. Note, The now-replaced libsas APIs used in_interrupt() to implicitly decide which memory allocation type to use. This was only partially correct, as it fails to choose the correct GFP flags when just preemption or interrupts are disabled. Such buggy code paths are marked with "(@)" in the call chains above. Link: https://lore.kernel.org/r/20210118100955.1761652-6-a.darwish@linutronix.de Fixes: 1c393b970e0f ("scsi: libsas: Use dynamic alloced work to avoid sas event lost") Cc: Artur Paszkiewicz <artur.paszkiewicz@intel.com> Reviewed-by: John Garry <john.garry@huawei.com> Signed-off-by: Ahmed S. Darwish <a.darwish@linutronix.de> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-03-25scsi: mvsas: Pass gfp_t flags to libsas event notifiersAhmed S. Darwish
[ Upstream commit feb18e900f0048001ff375dca639eaa327ab3c1b ] mvsas calls the non _gfp version of the libsas event notifiers API, leading to the buggy call chains below: mvsas/mv_sas.c: mvs_work_queue() [process context] spin_lock_irqsave(mvs_info::lock, ) -> libsas/sas_event.c: sas_notify_phy_event() -> sas_alloc_event() -> in_interrupt() = false -> invalid GFP_KERNEL allocation -> libsas/sas_event.c: sas_notify_port_event() -> sas_alloc_event() -> in_interrupt() = false -> invalid GFP_KERNEL allocation Use the new event notifiers API instead, which requires callers to explicitly pass the gfp_t memory allocation flags. Below are context analysis for the modified functions: => mvs_bytes_dmaed(): Since it is invoked from both process and atomic contexts, let its callers pass the gfp_t flags. Call chains: scsi_scan.c: do_scsi_scan_host() [has msleep()] -> shost->hostt->scan_start() -> [mvsas/mv_init.c: Scsi_Host::scsi_host_template .scan_start = mvs_scan_start()] -> mvsas/mv_sas.c: mvs_scan_start() -> mvs_bytes_dmaed(..., GFP_KERNEL) mvsas/mv_sas.c: mvs_work_queue() spin_lock_irqsave(mvs_info::lock,) -> mvs_bytes_dmaed(..., GFP_ATOMIC) mvsas/mv_64xx.c: mvs_64xx_isr() || mvsas/mv_94xx.c: mvs_94xx_isr() -> mvsas/mv_chips.h: mvs_int_full() -> mvsas/mv_sas.c: mvs_int_port() -> mvs_bytes_dmaed(..., GFP_ATOMIC); => mvs_work_queue(): Invoked from process context, but it calls all the libsas event notifier APIs under a spin_lock_irqsave(). Pass GFP_ATOMIC. Link: https://lore.kernel.org/r/20210118100955.1761652-5-a.darwish@linutronix.de Fixes: 1c393b970e0f ("scsi: libsas: Use dynamic alloced work to avoid sas event lost") Cc: Jason Yan <yanaijie@huawei.com> Reviewed-by: John Garry <john.garry@huawei.com> Signed-off-by: Ahmed S. Darwish <a.darwish@linutronix.de> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-03-25scsi: libsas: Introduce a _gfp() variant of event notifiersAhmed S. Darwish
[ Upstream commit c2d0f1a65ab9fbabebb463bf36f50ea8f4633386 ] sas_alloc_event() uses in_interrupt() to decide which allocation should be used. The usage of in_interrupt() in drivers is phased out and Linus clearly requested that code which changes behaviour depending on context should either be separated or the context be conveyed in an argument passed by the caller, which usually knows the context. The in_interrupt() check is also only partially correct, because it fails to choose the correct code path when just preemption or interrupts are disabled. For example, as in the following call chain: mvsas/mv_sas.c: mvs_work_queue() [process context] spin_lock_irqsave(mvs_info::lock, ) -> libsas/sas_event.c: sas_notify_phy_event() -> sas_alloc_event() -> in_interrupt() = false -> invalid GFP_KERNEL allocation -> libsas/sas_event.c: sas_notify_port_event() -> sas_alloc_event() -> in_interrupt() = false -> invalid GFP_KERNEL allocation Introduce sas_alloc_event_gfp(), sas_notify_port_event_gfp(), and sas_notify_phy_event_gfp(), which all behave like the non _gfp() variants but use a caller-passed GFP mask for allocations. For bisectability, all callers will be modified first to pass GFP context, then the non _gfp() libsas API variants will be modified to take a gfp_t by default. Link: https://lore.kernel.org/r/20210118100955.1761652-4-a.darwish@linutronix.de Fixes: 1c393b970e0f ("scsi: libsas: Use dynamic alloced work to avoid sas event lost") Cc: Jason Yan <yanaijie@huawei.com> Reviewed-by: John Garry <john.garry@huawei.com> Signed-off-by: Ahmed S. Darwish <a.darwish@linutronix.de> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-03-25scsi: libsas: Remove notifier indirectionJohn Garry
[ Upstream commit 121181f3f839c29d8dd9fdc3cc9babbdc74227f8 ] LLDDs report events to libsas with .notify_port_event and .notify_phy_event callbacks. These callbacks are fixed and so there is no reason why the functions cannot be called directly, so do that. This neatens the code slightly, makes it more obvious, and reduces function pointer usage, which is generally a good thing. Downside is that there are 2x more symbol exports. [a.darwish@linutronix.de: Remove the now unused "sas_ha" local variables] Link: https://lore.kernel.org/r/20210118100955.1761652-3-a.darwish@linutronix.de Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Jack Wang <jinpu.wang@cloud.ionos.com> Signed-off-by: John Garry <john.garry@huawei.com> Signed-off-by: Ahmed S. Darwish <a.darwish@linutronix.de> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-03-25scsi: pm8001: Neaten debug logging macros and usesJoe Perches
[ Upstream commit 1b5d2793283dcb97b401b3b2c02b8a94eee29af1 ] Every PM8001_<FOO>_DBG macro uses an internal call to pm8001_printk. Convert all uses of: PM8001_<FOO>_DBG(hba, pm8001_printk(fmt, ...)) to pm8001_dbg(hba, <FOO>, fmt, ...) so the visual complexity of each macro is reduced. The repetitive macro definitions are converted to a single pm8001_dbg and the level is concatenated using PM8001_##level##_LOGGING for the specific level test. Done with coccinelle, checkpatch and a little typing of the new macro definition. Miscellanea: - Coalesce formats - Realign arguments - Add missing terminating newlines to formats - Remove trailing spaces from formats - Change defective loop with printk(KERN_INFO... to emit a 16 byte hex block to %p16h Link: https://lore.kernel.org/r/49f36a93af7752b613d03c89a87078243567fd9a.1605914030.git.joe@perches.com Reported-by: kernel test robot <lkp@intel.com> Acked-by: Jack Wang <jinpu.wang@cloud.ionos.com> Signed-off-by: Joe Perches <joe@perches.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-03-25scsi: pm80xx: Fix pm8001_mpi_get_nvmd_resp() race conditionyuuzheng
[ Upstream commit 1f889b58716a5f5e3e4fe0e6742c1a4472f29ac1 ] A use-after-free or null-pointer error occurs when the 251-byte response data is copied from IOMB buffer to response message buffer in function pm8001_mpi_get_nvmd_resp(). After sending the command get_nvmd_data(), the caller begins to sleep by calling wait_for_complete() and waits for the wake-up from calling complete() in pm8001_mpi_get_nvmd_resp(). Due to unexpected events (e.g., interrupt), if response buffer gets freed before memcpy(), a use-after-free error will occur. To fix this, the complete() should be called after memcpy(). Link: https://lore.kernel.org/r/20201102165528.26510-5-Viswas.G@microchip.com.com Acked-by: Jack Wang <jinpu.wang@cloud.ionos.com> Signed-off-by: yuuzheng <yuuzheng@google.com> Signed-off-by: Viswas G <Viswas.G@microchip.com> Signed-off-by: Ruksar Devadi <Ruksar.devadi@microchip.com> Signed-off-by: Radha Ramachandran <radha@google.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-03-25scsi: pm80xx: Make running_req atomicViswas G
[ Upstream commit 4a2efd4b89fcaa6e9a7b4ce49a441afaacba00ea ] Incorrect value of the running_req was causing the driver unload to be stuck during the SAS lldd_dev_gone notification handling. During SATA I/O completion, for some error status values, the driver schedules the event handler and running_req is decremented from that. However, there are some other error status values (like IO_DS_IN_RECOVERY, IO_XFER_ERR_LAST_PIO_DATAIN_CRC_ERR) where the I/O has already been completed by fw/driver so running_req is not decremented. Also during NCQ error handling, driver itself will initiate READ_LOG_EXT and ABORT_ALL. When libsas/libata initiate READ_LOG_EXT (0x2F), driver increments running_req. This will be completed by the driver in pm80xx_chip_sata_req(), but running_req was not decremented. Link: https://lore.kernel.org/r/20201102165528.26510-3-Viswas.G@microchip.com.com Acked-by: Jack Wang <jinpu.wang@cloud.ionos.com> Signed-off-by: Viswas G <Viswas.G@microchip.com> Signed-off-by: Ruksar Devadi <Ruksar.devadi@microchip.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-03-25scsi: pm80xx: Make mpi_build_cmd locking consistentpeter chang
[ Upstream commit 7640e1eb8c5de33dafa6c68fd4389214ff9ec1f9 ] Driver submits all internal requests (like abort_task, event acknowledgment etc.) through inbound queue 0. While submitting those, driver does not acquire any lock and this may lead to a race when there is an I/O request coming in on CPU0 and submitted through inbound queue 0. To avoid this, lock acquisition has been moved to pm8001_mpi_build_cmd(). All command submission will go through this path. Link: https://lore.kernel.org/r/20201102165528.26510-2-Viswas.G@microchip.com.com Acked-by: Jack Wang <jinpu.wang@cloud.ionos.com> Signed-off-by: peter chang <dpf@google.com> Signed-off-by: Viswas G <Viswas.G@microchip.com> Signed-off-by: Ruksar Devadi <Ruksar.devadi@microchip.com> Signed-off-by: Radha Ramachandran <radha@google.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-03-25scsi: ufs: ufs-mediatek: Correct operator & -> &&dongjian
commit 0fdc7d5d8f3719950478cca452cf7f0f1355be10 upstream. The "lpm" and "->enabled" are all boolean. We should be using && rather than the bit operator. Link: https://lore.kernel.org/r/1615896915-148864-1-git-send-email-dj0227@163.com Fixes: 488edafb1120 ("scsi: ufs-mediatek: Introduce low-power mode for device power supply") Reviewed-by: Avri Altman <avri.altman@wdc.com> Signed-off-by: dongjian <dongjian@yulong.com> Signed-off-by: Yue Hu <huyue2@yulong.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-03-25scsi: myrs: Fix a double free in myrs_cleanup()Lv Yunlong
commit 2bb817712e2f77486d6ee17e7efaf91997a685f8 upstream. In myrs_cleanup(), cs->mmio_base will be freed twice by iounmap(). Link: https://lore.kernel.org/r/20210311063005.9963-1-lyl2019@mail.ustc.edu.cn Fixes: 77266186397c ("scsi: myrs: Add Mylex RAID controller (SCSI interface)") Reviewed-by: Hannes Reinecke <hare@suse.de> Signed-off-by: Lv Yunlong <lyl2019@mail.ustc.edu.cn> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>