From patchwork Tue Apr 2 16:16:23 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Bart Van Assche X-Patchwork-Id: 10882061 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 6CEA5139A for ; Tue, 2 Apr 2019 16:16:38 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 59A9C28740 for ; Tue, 2 Apr 2019 16:16:38 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id 4D60628751; Tue, 2 Apr 2019 16:16:38 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-7.9 required=2.0 tests=BAYES_00,MAILING_LIST_MULTI, RCVD_IN_DNSWL_HI autolearn=ham version=3.3.1 Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id D7F4428740 for ; Tue, 2 Apr 2019 16:16:37 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1731295AbfDBQQe (ORCPT ); Tue, 2 Apr 2019 12:16:34 -0400 Received: from mail-pf1-f196.google.com ([209.85.210.196]:42075 "EHLO mail-pf1-f196.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1729772AbfDBQQe (ORCPT ); Tue, 2 Apr 2019 12:16:34 -0400 Received: by mail-pf1-f196.google.com with SMTP id r15so6604737pfn.9 for ; Tue, 02 Apr 2019 09:16:33 -0700 (PDT) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=ZL+PwRaJ0gp4UgL4ljqIybMF/XelsIWECHX6ttNwzpE=; b=DOoe/OAI0jNFm/XIoPjAg6aXutfO/YHY20F8dbCN3dN3u+iiLE9c1RJSvH5WrIrVmC mrMIsRurxZ75ZaiLUdkQEzqLXCjMAEgodZS5WtMbKVKWy+4YpU6jxUZgrPGjqv7HQt+r y+1iy1dheRcS+p7JhNAElzovIikg2BkUyRGrFeV03BSOMe11wrYe/MJRSxCwNENZDBmO HShY99TKNo/Iv03IVj9B2aGlOLt72av0S+sZzMbvxHEuTwhv+z1MqfOSkZXGeTjSLGMH dbaLjdmtTEa5ylQ8Xt5LECAZAJTRE5vix2BN0sxSzOIKrg3IAyeZVeVWY/jxBOCsTrRJ Lwvw== X-Gm-Message-State: APjAAAVF9ke7xtl/f8xnn+yg645w5WoCc86eq9/sgN8tSCJzFcTRBZGr gmIwkJEn292Bb2wJvMRbBTM= X-Google-Smtp-Source: APXvYqwEWWfYeVRH6goaGSnjKR+rG4jYR5mOrj+7TjuPREblDGQS6FGfE6J0adioL13ANCGhSPNXmg== X-Received: by 2002:a63:c64a:: with SMTP id x10mr37239153pgg.12.1554221793232; Tue, 02 Apr 2019 09:16:33 -0700 (PDT) Received: from desktop-bart.svl.corp.google.com ([2620:15c:2cd:203:5cdc:422c:7b28:ebb5]) by smtp.gmail.com with ESMTPSA id 26sm19150473pfj.93.2019.04.02.09.16.31 (version=TLS1_2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128); Tue, 02 Apr 2019 09:16:32 -0700 (PDT) From: Bart Van Assche To: "Martin K . Petersen" , "James E . J . Bottomley" Cc: linux-scsi@vger.kernel.org, Christoph Hellwig , Bart Van Assche , Ming Lei , Hannes Reinecke , Johannes Thumshirn Subject: [PATCH v2 1/2] scsi: Avoid that .queuecommand() gets called for a quiesced SCSI device Date: Tue, 2 Apr 2019 09:16:23 -0700 Message-Id: <20190402161624.148359-2-bvanassche@acm.org> X-Mailer: git-send-email 2.20.GIT In-Reply-To: <20190402161624.148359-1-bvanassche@acm.org> References: <20190402161624.148359-1-bvanassche@acm.org> MIME-Version: 1.0 Sender: linux-scsi-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-scsi@vger.kernel.org X-Virus-Scanned: ClamAV using ClamSMTP Several SCSI transport and LLD drivers surround code that does not tolerate concurrent calls of .queuecommand() with scsi_target_block() / scsi_target_unblock(). These last two functions use blk_mq_quiesce_queue() / blk_mq_unquiesce_queue() for scsi-mq request queues to prevent concurrent .queuecommand() calls. However, that is not sufficient to prevent .queuecommand() calls from scsi_send_eh_cmnd(). Hence surround the .queuecommand() call from the SCSI error handler with code that avoids that .queuecommand() gets called in the quiesced state. Note: converting the .queuecommand() call in scsi_send_eh_cmnd() into code that calls blk_get_request() + blk_execute_rq() is not an option since scsi_send_eh_cmnd() must be able to make forward progress even if all requests have been allocated. Cc: Christoph Hellwig Cc: Ming Lei Cc: Hannes Reinecke Cc: Johannes Thumshirn Signed-off-by: Bart Van Assche --- drivers/scsi/scsi_error.c | 26 ++++++++++++++++++++++++-- 1 file changed, 24 insertions(+), 2 deletions(-) diff --git a/drivers/scsi/scsi_error.c b/drivers/scsi/scsi_error.c index 8e9680572b9f..d516dd1b824d 100644 --- a/drivers/scsi/scsi_error.c +++ b/drivers/scsi/scsi_error.c @@ -1054,7 +1054,7 @@ static int scsi_send_eh_cmnd(struct scsi_cmnd *scmd, unsigned char *cmnd, struct scsi_device *sdev = scmd->device; struct Scsi_Host *shost = sdev->host; DECLARE_COMPLETION_ONSTACK(done); - unsigned long timeleft = timeout; + unsigned long timeleft = timeout, delay; struct scsi_eh_save ses; const unsigned long stall_for = msecs_to_jiffies(100); int rtn; @@ -1065,7 +1065,29 @@ static int scsi_send_eh_cmnd(struct scsi_cmnd *scmd, unsigned char *cmnd, scsi_log_send(scmd); scmd->scsi_done = scsi_eh_done; - rtn = shost->hostt->queuecommand(shost, scmd); + + /* + * Lock sdev->state_mutex to avoid that scsi_device_quiesce() can + * change the SCSI device state after we have examined it and before + * .queuecommand() is called. + */ + mutex_lock(&sdev->state_mutex); + while (sdev->sdev_state == SDEV_BLOCK && timeleft > 0) { + mutex_unlock(&sdev->state_mutex); + SCSI_LOG_ERROR_RECOVERY(5, sdev_printk(KERN_DEBUG, sdev, + "%s: state %d <> %d\n", __func__, sdev->sdev_state, + SDEV_BLOCK)); + delay = min(timeleft, stall_for); + timeleft -= delay; + msleep(jiffies_to_msecs(delay)); + mutex_lock(&sdev->state_mutex); + } + if (sdev->sdev_state != SDEV_BLOCK) + rtn = shost->hostt->queuecommand(shost, scmd); + else + rtn = SCSI_MLQUEUE_DEVICE_BUSY; + mutex_unlock(&sdev->state_mutex); + if (rtn) { if (timeleft > stall_for) { scsi_eh_restore_cmnd(scmd, &ses); From patchwork Tue Apr 2 16:16:24 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Bart Van Assche X-Patchwork-Id: 10882059 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 8D585139A for ; Tue, 2 Apr 2019 16:16:37 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 77A7628740 for ; Tue, 2 Apr 2019 16:16:37 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id 6B01728751; Tue, 2 Apr 2019 16:16:37 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-7.9 required=2.0 tests=BAYES_00,MAILING_LIST_MULTI, RCVD_IN_DNSWL_HI autolearn=ham version=3.3.1 Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id E87E628740 for ; Tue, 2 Apr 2019 16:16:36 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1731339AbfDBQQg (ORCPT ); Tue, 2 Apr 2019 12:16:36 -0400 Received: from mail-pg1-f196.google.com ([209.85.215.196]:39188 "EHLO mail-pg1-f196.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1729063AbfDBQQf (ORCPT ); Tue, 2 Apr 2019 12:16:35 -0400 Received: by mail-pg1-f196.google.com with SMTP id k3so6810282pga.6 for ; Tue, 02 Apr 2019 09:16:34 -0700 (PDT) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=D0ZN95NFlX3guyxd7pBshp5Il+OQLzpJvs5qMI74W/E=; b=jxbBB14jOVnXBzM7IuyvyxCVztc5llDg9HZgd+MX+tecuL+uvgIsH+JBPfGhzkUEM0 Vbn3QLk5996vyGSsVp419P0oG7ChlVxJmdhkUxii6URfrBwlSFOEbOyoLUo7S6WPyTdw fPVYXsYjOiK14nML8eiv/ofxwa1IGKRmjIKlxih88tcHJn99Wb089IkUsHIELdPgwwzw N64f23/owzp1FTd390TWa+QGygotjNGFDh0EgyaC+D+lbicu4mGIQ3d85jFYdlh+SNzG ctmVPzvmgzkBDwZbrUiaXoMDulonws20RTbx7i94LlTg9zGX19tVq9DckrAaIDoCndT5 mH0Q== X-Gm-Message-State: APjAAAWAQ52P0qWuvmjA8C6xJ0C8WfyxFWiHqgXR/s4FdpHuuXkmqX2L cq7rxvF2OQ/U4LtIabDrvA8= X-Google-Smtp-Source: APXvYqz3n/sTbywb/C+kwfxSw9vee0sBXXjqMBcpMxC0x8zznCsmtxjXLvsFQR++uSz4k887mFRIzw== X-Received: by 2002:a63:6142:: with SMTP id v63mr67405273pgb.342.1554221794375; Tue, 02 Apr 2019 09:16:34 -0700 (PDT) Received: from desktop-bart.svl.corp.google.com ([2620:15c:2cd:203:5cdc:422c:7b28:ebb5]) by smtp.gmail.com with ESMTPSA id 26sm19150473pfj.93.2019.04.02.09.16.33 (version=TLS1_2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128); Tue, 02 Apr 2019 09:16:33 -0700 (PDT) From: Bart Van Assche To: "Martin K . Petersen" , "James E . J . Bottomley" Cc: linux-scsi@vger.kernel.org, Christoph Hellwig , Bart Van Assche , Hannes Reinecke , Jason Gunthorpe , Leon Romanovsky , Doug Ledford , Laurence Oberman Subject: [PATCH v2 2/2] RDMA/srp: Fix a sleep-in-invalid-context bug Date: Tue, 2 Apr 2019 09:16:24 -0700 Message-Id: <20190402161624.148359-3-bvanassche@acm.org> X-Mailer: git-send-email 2.20.GIT In-Reply-To: <20190402161624.148359-1-bvanassche@acm.org> References: <20190402161624.148359-1-bvanassche@acm.org> MIME-Version: 1.0 Sender: linux-scsi-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-scsi@vger.kernel.org X-Virus-Scanned: ClamAV using ClamSMTP The previous patch guarantees that srp_queuecommand() does not get invoked while reconnecting occurs. Hence remove the code from srp_queuecommand() that prevents command queueing while reconnecting. This patch avoids that the following can appear in the kernel log: BUG: sleeping function called from invalid context at kernel/locking/mutex.c:747 in_atomic(): 1, irqs_disabled(): 0, pid: 5600, name: scsi_eh_9 1 lock held by scsi_eh_9/5600: #0: (rcu_read_lock){....}, at: [<00000000cbb798c7>] __blk_mq_run_hw_queue+0xf1/0x1e0 Preemption disabled at: [<00000000139badf2>] __blk_mq_delay_run_hw_queue+0x78/0xf0 CPU: 9 PID: 5600 Comm: scsi_eh_9 Tainted: G W 4.15.0-rc4-dbg+ #1 Hardware name: Dell Inc. PowerEdge R720/0VWT90, BIOS 2.5.4 01/22/2016 Call Trace: dump_stack+0x67/0x99 ___might_sleep+0x16a/0x250 [ib_srp] __mutex_lock+0x46/0x9d0 srp_queuecommand+0x356/0x420 [ib_srp] scsi_dispatch_cmd+0xf6/0x3f0 scsi_queue_rq+0x4a8/0x5f0 blk_mq_dispatch_rq_list+0x73/0x440 blk_mq_sched_dispatch_requests+0x109/0x1a0 __blk_mq_run_hw_queue+0x131/0x1e0 __blk_mq_delay_run_hw_queue+0x9a/0xf0 blk_mq_run_hw_queue+0xc0/0x1e0 blk_mq_start_hw_queues+0x2c/0x40 scsi_run_queue+0x18e/0x2d0 scsi_run_host_queues+0x22/0x40 scsi_error_handler+0x18d/0x5f0 kthread+0x11c/0x140 ret_from_fork+0x24/0x30 Reviewed-by: Hannes Reinecke Reviewed-by: Christoph Hellwig Cc: Jason Gunthorpe Cc: Leon Romanovsky Cc: Doug Ledford Cc: Laurence Oberman Signed-off-by: Bart Van Assche --- drivers/infiniband/ulp/srp/ib_srp.c | 21 ++------------------- 1 file changed, 2 insertions(+), 19 deletions(-) diff --git a/drivers/infiniband/ulp/srp/ib_srp.c b/drivers/infiniband/ulp/srp/ib_srp.c index be9ddcad8f28..b7c5a35f7daa 100644 --- a/drivers/infiniband/ulp/srp/ib_srp.c +++ b/drivers/infiniband/ulp/srp/ib_srp.c @@ -2338,7 +2338,6 @@ static void srp_handle_qp_err(struct ib_cq *cq, struct ib_wc *wc, static int srp_queuecommand(struct Scsi_Host *shost, struct scsi_cmnd *scmnd) { struct srp_target_port *target = host_to_target(shost); - struct srp_rport *rport = target->rport; struct srp_rdma_ch *ch; struct srp_request *req; struct srp_iu *iu; @@ -2348,16 +2347,6 @@ static int srp_queuecommand(struct Scsi_Host *shost, struct scsi_cmnd *scmnd) u32 tag; u16 idx; int len, ret; - const bool in_scsi_eh = !in_interrupt() && current == shost->ehandler; - - /* - * The SCSI EH thread is the only context from which srp_queuecommand() - * can get invoked for blocked devices (SDEV_BLOCK / - * SDEV_CREATED_BLOCK). Avoid racing with srp_reconnect_rport() by - * locking the rport mutex if invoked from inside the SCSI EH. - */ - if (in_scsi_eh) - mutex_lock(&rport->mutex); scmnd->result = srp_chkready(target->rport); if (unlikely(scmnd->result)) @@ -2426,13 +2415,7 @@ static int srp_queuecommand(struct Scsi_Host *shost, struct scsi_cmnd *scmnd) goto err_unmap; } - ret = 0; - -unlock_rport: - if (in_scsi_eh) - mutex_unlock(&rport->mutex); - - return ret; + return 0; err_unmap: srp_unmap_data(scmnd, ch, req); @@ -2454,7 +2437,7 @@ static int srp_queuecommand(struct Scsi_Host *shost, struct scsi_cmnd *scmnd) ret = SCSI_MLQUEUE_HOST_BUSY; } - goto unlock_rport; + return ret; } /*