From patchwork Thu Jan 7 07:47:09 2021
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Jaegeuk Kim
X-Patchwork-Id: 12003079
Return-Path:
X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on
	aws-us-west-2-korg-lkml-1.web.codeaurora.org
X-Spam-Level:
X-Spam-Status: No, score=-19.2 required=3.0 tests=BAYES_00,DKIMWL_WL_HIGH,
	DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,INCLUDES_CR_TRAILER,INCLUDES_PATCH,
	MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,USER_AGENT_GIT autolearn=ham
	autolearn_force=no version=3.4.0
Received: from mail.kernel.org (mail.kernel.org [198.145.29.99])
	by smtp.lore.kernel.org (Postfix) with ESMTP id C470CC433E9
	for ; Thu, 7 Jan 2021 07:48:11 +0000 (UTC)
Received: from vger.kernel.org (vger.kernel.org [23.128.96.18])
	by mail.kernel.org (Postfix) with ESMTP id 9C8022158C
	for ; Thu, 7 Jan 2021 07:48:11 +0000 (UTC)
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
	id S1726697AbhAGHr4 (ORCPT );
	Thu, 7 Jan 2021 02:47:56 -0500
Received: from mail.kernel.org ([198.145.29.99]:56650 "EHLO mail.kernel.org"
	rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP
	id S1726231AbhAGHrz (ORCPT );
	Thu, 7 Jan 2021 02:47:55 -0500
Received: by mail.kernel.org (Postfix) with ESMTPSA id 8A86D23123;
	Thu, 7 Jan 2021 07:47:14 +0000 (UTC)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org;
	s=k20201202; t=1610005634;
	bh=g8ipjYF6Aeyyb/h68s3nvQK8Gqa0WZ0vA4dkBmKypRI=;
	h=From:To:Cc:Subject:Date:In-Reply-To:References:From;
	b=uj2Im2xQJUon4pwA9ZhkQ99hPAIkhh1OsHjQpWTOOpXYf3bG+TQT0y7rPoPN/APYa
	 vvkzieeWi8ypgXH2fO/gm5rOwePxiNKlGauiwA/WWwEWAalElmhAVl0qqmzocXHZNG
	 3nyoT84qlUaSy+6j5C3OGVH5QNdYstYPmnH1iV4HskStC8H7zYuMuO+8P7Q8Bx4aH+
	 BeruQco6lUyCC7X+Qw7e7aeZMI28gnMSOy+CEq+kN81Sb5dQq6hGICJ/qwImX2fjkc
	 o/ULP4+1+lgoVbrQw3shmtV5tKaG8KJFDmZIFGKJxJdh8iL+abALKAEkRsFDAbn2rH
	 f8SzUhHtWlHfg==
From: Jaegeuk Kim
To: linux-kernel@vger.kernel.org, linux-scsi@vger.kernel.org,
	kernel-team@android.com
Cc: cang@codeaurora.org, alim.akhtar@samsung.com, avri.altman@wdc.com,
	bvanassche@acm.org, martin.petersen@oracle.com,
	stanley.chu@mediatek.com, Jaegeuk Kim
Subject: [PATCH v4 1/2] scsi: ufs: fix livelock of ufshcd_clear_ua_wluns
Date: Wed, 6 Jan 2021 23:47:09 -0800
Message-Id: <20210107074710.549309-2-jaegeuk@kernel.org>
X-Mailer: git-send-email 2.29.2.729.g45daf8777d-goog
In-Reply-To: <20210107074710.549309-1-jaegeuk@kernel.org>
References: <20210107074710.549309-1-jaegeuk@kernel.org>
MIME-Version: 1.0
Precedence: bulk
List-ID:
X-Mailing-List: linux-scsi@vger.kernel.org

When gate_work/ungate_work gets an error during hibern8_enter or exit,
 ufshcd_err_handler()
   ufshcd_scsi_block_requests()
   ufshcd_reset_and_restore()
     ufshcd_clear_ua_wluns() -> stuck
   ufshcd_scsi_unblock_requests()

In order to avoid it, ufshcd_clear_ua_wluns() can be called per recovery flows
such as suspend/resume, link_recovery, and error_handler.

Fixes: 1918651f2d7e ("scsi: ufs: Clear UAC for RPMB after ufshcd resets")
Signed-off-by: Jaegeuk Kim
Reviewed-by: Can Guo
---
 drivers/scsi/ufs/ufshcd.c | 19 ++++++++++++-------
 1 file changed, 12 insertions(+), 7 deletions(-)

diff --git a/drivers/scsi/ufs/ufshcd.c b/drivers/scsi/ufs/ufshcd.c
index bedb822a40a3..e6e7bdf99cd7 100644
--- a/drivers/scsi/ufs/ufshcd.c
+++ b/drivers/scsi/ufs/ufshcd.c
@@ -3996,6 +3996,8 @@ int ufshcd_link_recovery(struct ufs_hba *hba)
 	if (ret)
 		dev_err(hba->dev, "%s: link recovery failed, err %d",
 			__func__, ret);
+	else
+		ufshcd_clear_ua_wluns(hba);

 	return ret;
 }
@@ -6003,6 +6005,9 @@ static void ufshcd_err_handler(struct work_struct *work)
 	ufshcd_scsi_unblock_requests(hba);
 	ufshcd_err_handling_unprepare(hba);
 	up(&hba->eh_sem);
+
+	if (!err && needs_reset)
+		ufshcd_clear_ua_wluns(hba);
 }

 /**
@@ -6940,14 +6945,11 @@ static int ufshcd_host_reset_and_restore(struct ufs_hba *hba)
 	ufshcd_set_clk_freq(hba, true);

 	err = ufshcd_hba_enable(hba);
-	if (err)
-		goto out;

 	/* Establish the link again and restore the device */
-	err = ufshcd_probe_hba(hba, false);
 	if (!err)
-		ufshcd_clear_ua_wluns(hba);
-out:
+		err = ufshcd_probe_hba(hba, false);
+
 	if (err)
 		dev_err(hba->dev, "%s: Host init failed %d\n", __func__, err);
 	ufshcd_update_evt_hist(hba, UFS_EVT_HOST_RESET, (u32)err);
@@ -7718,6 +7720,8 @@ static int ufshcd_add_lus(struct ufs_hba *hba)
 	if (ret)
 		goto out;

+	ufshcd_clear_ua_wluns(hba);
+
 	/* Initialize devfreq after UFS device is detected */
 	if (ufshcd_is_clkscaling_supported(hba)) {
 		memcpy(&hba->clk_scaling.saved_pwr_info.info,
@@ -7919,8 +7923,6 @@ static void ufshcd_async_scan(void *data, async_cookie_t cookie)
 		pm_runtime_put_sync(hba->dev);
 		ufshcd_exit_clk_scaling(hba);
 		ufshcd_hba_exit(hba);
-	} else {
-		ufshcd_clear_ua_wluns(hba);
 	}
 }

@@ -8777,6 +8779,7 @@ static int ufshcd_suspend(struct ufs_hba *hba, enum ufs_pm_op pm_op)
 		ufshcd_resume_clkscaling(hba);
 	hba->clk_gating.is_suspended = false;
 	hba->dev_info.b_rpm_dev_flush_capable = false;
+	ufshcd_clear_ua_wluns(hba);
 	ufshcd_release(hba);
 out:
 	if (hba->dev_info.b_rpm_dev_flush_capable) {
@@ -8887,6 +8890,8 @@ static int ufshcd_resume(struct ufs_hba *hba, enum ufs_pm_op pm_op)
 		cancel_delayed_work(&hba->rpm_dev_flush_recheck_work);
 	}

+	ufshcd_clear_ua_wluns(hba);
+
 	/* Schedule clock gating in case of no access to UFS device yet */
 	ufshcd_release(hba);


From patchwork Thu Jan 7 07:47:10 2021
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Jaegeuk Kim
X-Patchwork-Id: 12003081
Return-Path:
X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on
	aws-us-west-2-korg-lkml-1.web.codeaurora.org
X-Spam-Level:
X-Spam-Status: No, score=-19.2 required=3.0 tests=BAYES_00,DKIMWL_WL_HIGH,
	DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,INCLUDES_CR_TRAILER,INCLUDES_PATCH,
	MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,USER_AGENT_GIT
	autolearn=unavailable autolearn_force=no version=3.4.0
Received: from mail.kernel.org (mail.kernel.org [198.145.29.99])
	by smtp.lore.kernel.org (Postfix) with ESMTP id AE8ECC43381
	for ; Thu, 7 Jan 2021 07:48:11 +0000 (UTC)
Received: from vger.kernel.org (vger.kernel.org [23.128.96.18])
	by mail.kernel.org (Postfix) with ESMTP id 7F1EA20705
	for ; Thu, 7 Jan 2021 07:48:11 +0000 (UTC)
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
	id S1726770AbhAGHr4 (ORCPT );
	Thu, 7 Jan 2021 02:47:56 -0500
Received: from mail.kernel.org ([198.145.29.99]:56676 "EHLO mail.kernel.org"
	rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP
	id S1726676AbhAGHr4 (ORCPT );
	Thu, 7 Jan 2021 02:47:56 -0500
Received: by mail.kernel.org (Postfix) with ESMTPSA id 3A81C2312C;
	Thu, 7 Jan 2021 07:47:15 +0000 (UTC)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org;
	s=k20201202; t=1610005635;
	bh=13x/aMrJHQQPVB8F9oQzX7H6jaKgP3Tkp3MjMmKNa3s=;
	h=From:To:Cc:Subject:Date:In-Reply-To:References:From;
	b=c+H+jVHRJaK28DXbJ2RyU5DEdVl+o3n1dBBkturx0xHVRLwP70WbTHp66O7v5ezlJ
	 6worKeObn+7lsAEDvA6I7BcDyetWF7evgwG6XTO6uol8YOFI7iXIj20HBfKnaGCQtp
	 trcGmxHu+uZH6lXSRDZPZ9bY2g2d6+eRvr0x1V9L/r0TLqrdx3Fr6oaZbZBAF8AvoN
	 BFjxHyOBCi+rdSROP98Qbha9Td5i1nsJvtMim5YvYxBXQ7hj6/j9TdVaAuvJgCU5Gr
	 WxL1rvKt/Nfcrvafx8rWlcU0zA+0FAREKrPTyuP9gnh3uwhvO0BXXRbCPVtFC1F+L4
	 OnDBdWG9xQ3PQ==
From: Jaegeuk Kim
To: linux-kernel@vger.kernel.org, linux-scsi@vger.kernel.org,
	kernel-team@android.com
Cc: cang@codeaurora.org, alim.akhtar@samsung.com, avri.altman@wdc.com,
	bvanassche@acm.org, martin.petersen@oracle.com,
	stanley.chu@mediatek.com, Jaegeuk Kim ,
	Jaegeuk Kim
Subject: [PATCH v4 2/2] scsi: ufs: handle LINERESET with correct tm_cmd
Date: Wed, 6 Jan 2021 23:47:10 -0800
Message-Id: <20210107074710.549309-3-jaegeuk@kernel.org>
X-Mailer: git-send-email 2.29.2.729.g45daf8777d-goog
In-Reply-To: <20210107074710.549309-1-jaegeuk@kernel.org>
References: <20210107074710.549309-1-jaegeuk@kernel.org>
MIME-Version: 1.0
Precedence: bulk
List-ID:
X-Mailing-List: linux-scsi@vger.kernel.org

From: Jaegeuk Kim

This fixes a warning caused by wrong reserve tag usage in __ufshcd_issue_tm_cmd.

WARNING: CPU: 7 PID: 7 at block/blk-core.c:630 blk_get_request+0x68/0x70
WARNING: CPU: 4 PID: 157 at block/blk-mq-tag.c:82 blk_mq_get_tag+0x438/0x46c

And, in ufshcd_err_handler(), we can avoid to send tm_cmd before aborting
outstanding commands by waiting a bit for IO completion like this.

__ufshcd_issue_tm_cmd: task management cmd 0x80 timed-out

Fixes: 69a6c269c097 ("scsi: ufs: Use blk_{get,put}_request() to allocate and free TMFs")
Fixes: 2355b66ed20c ("scsi: ufs: Handle LINERESET indication in err handler")
Signed-off-by: Jaegeuk Kim
---
 drivers/scsi/ufs/ufshcd.c | 35 +++++++++++++++++++++++++++++++----
 1 file changed, 31 insertions(+), 4 deletions(-)

diff --git a/drivers/scsi/ufs/ufshcd.c b/drivers/scsi/ufs/ufshcd.c
index e6e7bdf99cd7..340dd5e515dd 100644
--- a/drivers/scsi/ufs/ufshcd.c
+++ b/drivers/scsi/ufs/ufshcd.c
@@ -44,6 +44,9 @@
 /* Query request timeout */
 #define QUERY_REQ_TIMEOUT 1500 /* 1.5 seconds */

+/* LINERESET TIME OUT */
+#define LINERESET_IO_TIMEOUT_MS (30000) /* 30 sec */
+
 /* Task management command timeout */
 #define TM_CMD_TIMEOUT 100 /* msecs */

@@ -5826,6 +5829,7 @@ static void ufshcd_err_handler(struct work_struct *work)
 	int err = 0, pmc_err;
 	int tag;
 	bool needs_reset = false, needs_restore = false;
+	ktime_t start;

 	hba = container_of(work, struct ufs_hba, eh_work);

@@ -5911,6 +5915,22 @@ static void ufshcd_err_handler(struct work_struct *work)
 	}

 	hba->silence_err_logs = true;
+
+	/* Wait for IO completion for non-fatal errors to avoid aborting IOs */
+	start = ktime_get();
+	while (hba->outstanding_reqs) {
+		ufshcd_complete_requests(hba);
+		spin_unlock_irqrestore(hba->host->host_lock, flags);
+		schedule();
+		spin_lock_irqsave(hba->host->host_lock, flags);
+		if (ktime_to_ms(ktime_sub(ktime_get(), start)) >
+						LINERESET_IO_TIMEOUT_MS) {
+			dev_err(hba->dev, "%s: timeout, outstanding=0x%lx\n",
+					__func__, hba->outstanding_reqs);
+			break;
+		}
+	}
+
 	/* release lock as clear command might sleep */
 	spin_unlock_irqrestore(hba->host->host_lock, flags);
 	/* Clear pending transfer requests */
@@ -6302,9 +6322,13 @@ static irqreturn_t ufshcd_intr(int irq, void *__hba)
 		intr_status = ufshcd_readl(hba, REG_INTERRUPT_STATUS);
 	}

-	if (enabled_intr_status && retval == IRQ_NONE) {
-		dev_err(hba->dev, "%s: Unhandled interrupt 0x%08x\n",
-					__func__, intr_status);
+	if (enabled_intr_status && retval == IRQ_NONE &&
+				!ufshcd_eh_in_progress(hba)) {
+		dev_err(hba->dev, "%s: Unhandled interrupt 0x%08x (0x%08x, 0x%08x)\n",
+					__func__,
+					intr_status,
+					hba->ufs_stats.last_intr_status,
+					enabled_intr_status);
 		ufshcd_dump_regs(hba, 0, UFSHCI_REG_SPACE_SIZE, "host_regs: ");
 	}

@@ -6348,7 +6372,10 @@ static int __ufshcd_issue_tm_cmd(struct ufs_hba *hba,
 	 * Even though we use wait_event() which sleeps indefinitely,
 	 * the maximum wait time is bounded by %TM_CMD_TIMEOUT.
 	 */
-	req = blk_get_request(q, REQ_OP_DRV_OUT, BLK_MQ_REQ_RESERVED);
+	req = blk_get_request(q, REQ_OP_DRV_OUT, 0);
+	if (IS_ERR(req))
+		return PTR_ERR(req);
+
 	req->end_io_data = &wait;
 	free_slot = req->tag;
 	WARN_ON_ONCE(free_slot < 0 || free_slot >= hba->nutmrs);
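
Note on the wait loop added to ufshcd_err_handler() in patch 2/2: it keeps
completing requests and yielding until hba->outstanding_reqs drains or
LINERESET_IO_TIMEOUT_MS elapses. The bounded-poll pattern can be sketched in
userspace roughly as below. This is only an illustration, not driver code:
struct fake_hba, fake_complete_requests(), now_ms() and wait_for_io_drain()
are hypothetical names, sched_yield() stands in for schedule(), and the
host_lock handling is not modeled.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <sched.h>
#include <stdint.h>
#include <stdio.h>
#include <time.h>

/* Hypothetical stand-in for hba->outstanding_reqs: one bit per in-flight request. */
struct fake_hba {
	unsigned long outstanding_reqs;
};

/* Hypothetical stand-in for ufshcd_complete_requests(): retires the lowest set bit. */
static void fake_complete_requests(struct fake_hba *hba)
{
	hba->outstanding_reqs &= hba->outstanding_reqs - 1;
}

/* Monotonic clock in milliseconds (userspace analog of ktime_get()). */
static int64_t now_ms(void)
{
	struct timespec ts;

	clock_gettime(CLOCK_MONOTONIC, &ts);
	return (int64_t)ts.tv_sec * 1000 + ts.tv_nsec / 1000000;
}

/*
 * Poll until nothing is outstanding or timeout_ms has elapsed.
 * Returns 0 once drained, -1 if the deadline passes first.
 */
static int wait_for_io_drain(struct fake_hba *hba, int64_t timeout_ms,
			     void (*complete)(struct fake_hba *))
{
	int64_t start;

	assert(hba != NULL);
	start = now_ms();

	while (hba->outstanding_reqs) {
		if (complete)
			complete(hba);
		sched_yield();	/* let other work make progress */
		if (now_ms() - start > timeout_ms) {
			fprintf(stderr, "timeout, outstanding=0x%lx\n",
				hba->outstanding_reqs);
			return -1;
		}
	}
	return 0;
}
```

As in the patch, the elapsed-time check runs at the end of every pass, so a
request that never completes costs at most about one timeout period before
the caller moves on to aborting it.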