From patchwork Thu Nov 30 22:44:55 2017
X-Patchwork-Submitter: Bart Van Assche
X-Patchwork-Id: 10085741
From: Bart Van Assche
To: "Martin K . Petersen", "James E . J . Bottomley"
Cc: linux-scsi@vger.kernel.org, Bart Van Assche, Konstantin Khorenko,
 Stuart Hayes, Pavel Tikhomirov, Christoph Hellwig, Hannes Reinecke,
 Johannes Thumshirn, stable@vger.kernel.org
Subject: [PATCH 1/2] Ensure that the SCSI error handler gets woken up
Date: Thu, 30 Nov 2017 14:44:55 -0800
Message-Id: <20171130224456.23100-2-bart.vanassche@wdc.com>
X-Mailer: git-send-email 2.15.0
In-Reply-To: <20171130224456.23100-1-bart.vanassche@wdc.com>
References: <20171130224456.23100-1-bart.vanassche@wdc.com>
X-Mailing-List: linux-scsi@vger.kernel.org

If scsi_eh_scmd_add() is called concurrently with
scsi_host_queue_ready() while shost->host_blocked > 0 then it can
happen that neither function wakes up the SCSI error handler. Fix
this by making every function that decreases the host_busy counter
wake up the error handler if necessary and by protecting the
host_failed checks with the SCSI host lock.

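The missed wakeup can be replayed with a minimal single-threaded sketch.
This is not kernel code: every model_* name is hypothetical, the
host_eh_scheduled case and all locking and RCU are left out, and the only
behaviour assumed is that the error handler is woken when host_busy equals
host_failed. The "fixed" flag selects whether the decrement path re-checks
that condition, which is what this patch adds.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model; field names mirror struct Scsi_Host. */
struct model_host {
	int host_busy;      /* commands currently accounted to the host */
	int host_failed;    /* commands handed to the error handler */
	bool in_recovery;   /* stands in for scsi_host_in_recovery() */
	int wakeups;        /* how often the error handler was woken */
};

/* Wake the handler only once every busy command has failed. */
static void model_eh_wakeup(struct model_host *h)
{
	if (h->host_busy == h->host_failed)
		h->wakeups++;
}

/* Models scsi_eh_scmd_add(): enter recovery, count the failure, try to wake. */
static void model_eh_scmd_add(struct model_host *h)
{
	h->in_recovery = true;
	h->host_failed++;
	model_eh_wakeup(h);
}

/* Models the host_busy decrement; 'fixed' selects the patched behaviour. */
static void model_dec_host_busy(struct model_host *h, bool fixed)
{
	h->host_busy--;
	if (fixed && h->in_recovery && h->host_failed > 0)
		model_eh_wakeup(h);
}

/* Replays the problematic interleaving and returns the wakeup count. */
static int model_replay(bool fixed)
{
	struct model_host h = { .host_busy = 1 };  /* one command in flight */

	h.host_busy++;                   /* queue_ready path takes a slot */
	model_eh_scmd_add(&h);           /* busy=2, failed=1: no wakeup yet */
	model_dec_host_busy(&h, fixed);  /* busy=1, failed=1 */
	assert(h.host_busy == h.host_failed);
	return h.wakeups;
}
```

In this model, model_replay(false) leaves the handler asleep although
host_busy == host_failed, while model_replay(true) wakes it once.
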
Reported-by: Pavel Tikhomirov
Fixes: commit 746650160866 ("scsi: convert host_busy to atomic_t")
Signed-off-by: Bart Van Assche
Cc: Konstantin Khorenko
Cc: Stuart Hayes
Cc: Pavel Tikhomirov
Cc: Christoph Hellwig
Cc: Hannes Reinecke
Cc: Johannes Thumshirn
Cc:
---
 drivers/scsi/scsi_error.c |  8 +++++++-
 drivers/scsi/scsi_lib.c   | 39 ++++++++++++++++++++++++++++-----------
 2 files changed, 35 insertions(+), 12 deletions(-)

diff --git a/drivers/scsi/scsi_error.c b/drivers/scsi/scsi_error.c
index 5e89049e9b4e..b22a9a23c74c 100644
--- a/drivers/scsi/scsi_error.c
+++ b/drivers/scsi/scsi_error.c
@@ -233,19 +233,25 @@ static void scsi_eh_reset(struct scsi_cmnd *scmd)
 void scsi_eh_scmd_add(struct scsi_cmnd *scmd)
 {
 	struct Scsi_Host *shost = scmd->device->host;
+	enum scsi_host_state shost_state;
 	unsigned long flags;
 	int ret;
 
 	WARN_ON_ONCE(!shost->ehandler);
 
 	spin_lock_irqsave(shost->host_lock, flags);
+	shost_state = shost->shost_state;
 	if (scsi_host_set_state(shost, SHOST_RECOVERY)) {
 		ret = scsi_host_set_state(shost, SHOST_CANCEL_RECOVERY);
 		WARN_ON_ONCE(ret);
 	}
 	if (shost->eh_deadline != -1 && !shost->last_reset)
 		shost->last_reset = jiffies;
-
+	if (shost_state != shost->shost_state) {
+		spin_unlock_irqrestore(shost->host_lock, flags);
+		synchronize_rcu();
+		spin_lock_irqsave(shost->host_lock, flags);
+	}
 	scsi_eh_reset(scmd);
 	list_add_tail(&scmd->eh_entry, &shost->eh_cmd_q);
 	shost->host_failed++;
diff --git a/drivers/scsi/scsi_lib.c b/drivers/scsi/scsi_lib.c
index b6d3842b6809..7d18fb245d7d 100644
--- a/drivers/scsi/scsi_lib.c
+++ b/drivers/scsi/scsi_lib.c
@@ -318,22 +318,39 @@ static void scsi_init_cmd_errh(struct scsi_cmnd *cmd)
 		cmd->cmd_len = scsi_command_size(cmd->cmnd);
 }
 
-void scsi_device_unbusy(struct scsi_device *sdev)
+/*
+ * Decrement the host_busy counter and wake up the error handler if necessary.
+ * Avoid as follows that the error handler is not woken up if shost->host_busy
+ * == shost->host_failed: use synchronize_rcu() in scsi_eh_scmd_add() in
+ * combination with an RCU read lock in this function to ensure that this
+ * function in its entirety either finishes before scsi_eh_scmd_add()
+ * increases the host_failed counter or that it notices the shost state change
+ * made by scsi_eh_scmd_add().
+ */
+static void scsi_dec_host_busy(struct Scsi_Host *shost)
 {
-	struct Scsi_Host *shost = sdev->host;
-	struct scsi_target *starget = scsi_target(sdev);
 	unsigned long flags;
 
+	rcu_read_lock();
 	atomic_dec(&shost->host_busy);
-	if (starget->can_queue > 0)
-		atomic_dec(&starget->target_busy);
-
-	if (unlikely(scsi_host_in_recovery(shost) &&
-		     (shost->host_failed || shost->host_eh_scheduled))) {
+	if (unlikely(scsi_host_in_recovery(shost))) {
 		spin_lock_irqsave(shost->host_lock, flags);
-		scsi_eh_wakeup(shost);
+		if (shost->host_failed || shost->host_eh_scheduled)
+			scsi_eh_wakeup(shost);
 		spin_unlock_irqrestore(shost->host_lock, flags);
 	}
+	rcu_read_unlock();
+}
+
+void scsi_device_unbusy(struct scsi_device *sdev)
+{
+	struct Scsi_Host *shost = sdev->host;
+	struct scsi_target *starget = scsi_target(sdev);
+
+	scsi_dec_host_busy(shost);
+
+	if (starget->can_queue > 0)
+		atomic_dec(&starget->target_busy);
 
 	atomic_dec(&sdev->device_busy);
 }
@@ -1531,7 +1548,7 @@ static inline int scsi_host_queue_ready(struct request_queue *q,
 		list_add_tail(&sdev->starved_entry, &shost->starved_list);
 	spin_unlock_irq(shost->host_lock);
 out_dec:
-	atomic_dec(&shost->host_busy);
+	scsi_dec_host_busy(shost);
 	return 0;
 }
 
@@ -2017,7 +2034,7 @@ static blk_status_t scsi_queue_rq(struct blk_mq_hw_ctx *hctx,
 	return BLK_STS_OK;
 
 out_dec_host_busy:
-	atomic_dec(&shost->host_busy);
+	scsi_dec_host_busy(shost);
 out_dec_target_busy:
 	if (scsi_target(sdev)->can_queue > 0)
 		atomic_dec(&scsi_target(sdev)->target_busy);