From patchwork Thu Feb 28 14:50:58 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: John Garry X-Patchwork-Id: 10833265 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 27A921399 for ; Thu, 28 Feb 2019 14:52:00 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 16ECA2F235 for ; Thu, 28 Feb 2019 14:52:00 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id 14DE42F257; Thu, 28 Feb 2019 14:52:00 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-7.9 required=2.0 tests=BAYES_00,MAILING_LIST_MULTI, RCVD_IN_DNSWL_HI autolearn=ham version=3.3.1 Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 7C2C72F258 for ; Thu, 28 Feb 2019 14:51:59 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1733151AbfB1Ov3 (ORCPT ); Thu, 28 Feb 2019 09:51:29 -0500 Received: from szxga06-in.huawei.com ([45.249.212.32]:46374 "EHLO huawei.com" rhost-flags-OK-OK-OK-FAIL) by vger.kernel.org with ESMTP id S1731361AbfB1Ov3 (ORCPT ); Thu, 28 Feb 2019 09:51:29 -0500 Received: from DGGEMS405-HUB.china.huawei.com (unknown [172.30.72.60]) by Forcepoint Email with ESMTP id ADEF69C18F2B0E4D68E6; Thu, 28 Feb 2019 22:51:25 +0800 (CST) Received: from localhost.localdomain (10.67.212.75) by DGGEMS405-HUB.china.huawei.com (10.3.19.205) with Microsoft SMTP Server id 14.3.408.0; Thu, 28 Feb 2019 22:51:19 +0800 From: John Garry To: , CC: , , , Xiang Chen , "John Garry" Subject: [PATCH 2/6] scsi: hisi_sas: Fix a timeout race of driver internal and SMP IO Date: Thu, 28 Feb 2019 22:50:58 +0800 Message-ID: <1551365462-128193-3-git-send-email-john.garry@huawei.com> X-Mailer: git-send-email 2.8.1 In-Reply-To: <1551365462-128193-1-git-send-email-john.garry@huawei.com> References: <1551365462-128193-1-git-send-email-john.garry@huawei.com> MIME-Version: 1.0 X-Originating-IP: [10.67.212.75] X-CFilter-Loop: Reflected Sender: linux-scsi-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-scsi@vger.kernel.org X-Virus-Scanned: ClamAV using ClamSMTP From: Xiang Chen For internal IO and SMP IO, there is a time-out timer for them. In the timer handler, it checks whether IO is done according to the flag task->task_state_lock. There is an issue which may cause system suspended: internal IO or SMP IO is sent, but at that time because of hardware exception (such as inject 2Bit ECC error), so IO is not completed and also not timeout. But, at that time, the SAS controller reset occurs to recover system. It will release the resource and set the status of IO to be SAS_TASK_STATE_DONE, so when IO timeout, it will never complete the completion of IO and wait for ever. [ 729.123632] Call trace: [ 729.126791] [] __switch_to+0x94/0xa8 [ 729.133106] [] __schedule+0x1e8/0x7fc [ 729.138975] [] schedule+0x34/0x8c [ 729.144401] [] schedule_timeout+0x1d8/0x3cc [ 729.150690] [] wait_for_common+0xdc/0x1a0 [ 729.157101] [] wait_for_completion+0x28/0x34 [ 729.165973] [] hisi_sas_internal_task_abort+0x2a0/0x424 [hisi_sas_test_main] [ 729.176447] [] hisi_sas_abort_task+0x244/0x2d8 [hisi_sas_test_main] [ 729.185258] [] sas_eh_handle_sas_errors+0x1c8/0x7b8 [ 729.192391] [] sas_scsi_recover_host+0x130/0x398 [ 729.199237] [] scsi_error_handler+0x148/0x5c0 [ 729.206009] [] kthread+0x10c/0x138 [ 729.211563] [] ret_from_fork+0x10/0x18 To solve the issue, callback function task_done of those IOs need to be called when on SAS controller reset. Signed-off-by: Xiang Chen Signed-off-by: John Garry --- drivers/scsi/hisi_sas/hisi_sas_main.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/drivers/scsi/hisi_sas/hisi_sas_main.c b/drivers/scsi/hisi_sas/hisi_sas_main.c index 923296653ed7..dd03dcbd3786 100644 --- a/drivers/scsi/hisi_sas/hisi_sas_main.c +++ b/drivers/scsi/hisi_sas/hisi_sas_main.c @@ -980,7 +980,8 @@ static void hisi_sas_do_release_task(struct hisi_hba *hisi_hba, struct sas_task spin_lock_irqsave(&task->task_state_lock, flags); task->task_state_flags &= ~(SAS_TASK_STATE_PENDING | SAS_TASK_AT_INITIATOR); - task->task_state_flags |= SAS_TASK_STATE_DONE; + if (!slot->is_internal && task->task_proto != SAS_PROTOCOL_SMP) + task->task_state_flags |= SAS_TASK_STATE_DONE; spin_unlock_irqrestore(&task->task_state_lock, flags); }