From patchwork Tue Mar 26 20:43:30 2019
X-Patchwork-Submitter: Bart Van Assche
X-Patchwork-Id: 10872167
From: Bart Van Assche
To: "Martin K . Petersen", "James E . J . Bottomley"
Cc: linux-scsi@vger.kernel.org, Christoph Hellwig, Bart Van Assche,
 Ming Lei, Hannes Reinecke, Johannes Thumshirn
Subject: [PATCH 1/2] scsi: Avoid that .queuecommand() gets called for a quiesced SCSI device
Date: Tue, 26 Mar 2019 13:43:30 -0700
Message-Id: <20190326204331.54352-2-bvanassche@acm.org>
In-Reply-To: <20190326204331.54352-1-bvanassche@acm.org>
References: <20190326204331.54352-1-bvanassche@acm.org>

Several SCSI transport and LLD drivers surround code that does not
tolerate concurrent calls of .queuecommand() with scsi_target_block() /
scsi_target_unblock(). These last two functions use
blk_mq_quiesce_queue() / blk_mq_unquiesce_queue() for scsi-mq request
queues to prevent concurrent .queuecommand() calls. However, that is
not sufficient to prevent .queuecommand() calls from scsi_send_eh_cmnd().
Hence surround the .queuecommand() call from the SCSI error handler with
code that avoids that .queuecommand() gets called in the quiesced state.

Note: converting the .queuecommand() call in scsi_send_eh_cmnd() into
code that calls blk_get_request() + blk_execute_rq() is not an option
since scsi_send_eh_cmnd() must be able to make forward progress even if
all requests have been allocated.

Cc: Christoph Hellwig
Cc: Ming Lei
Cc: Hannes Reinecke
Cc: Johannes Thumshirn
Signed-off-by: Bart Van Assche
Reviewed-by: Christoph Hellwig
---
 drivers/scsi/scsi_error.c | 26 ++++++++++++++++++++++++--
 1 file changed, 24 insertions(+), 2 deletions(-)

diff --git a/drivers/scsi/scsi_error.c b/drivers/scsi/scsi_error.c
index 8e9680572b9f..5c9b30251abd 100644
--- a/drivers/scsi/scsi_error.c
+++ b/drivers/scsi/scsi_error.c
@@ -1054,7 +1054,7 @@ static int scsi_send_eh_cmnd(struct scsi_cmnd *scmd, unsigned char *cmnd,
 	struct scsi_device *sdev = scmd->device;
 	struct Scsi_Host *shost = sdev->host;
 	DECLARE_COMPLETION_ONSTACK(done);
-	unsigned long timeleft = timeout;
+	unsigned long timeleft = timeout, delay;
 	struct scsi_eh_save ses;
 	const unsigned long stall_for = msecs_to_jiffies(100);
 	int rtn;
@@ -1065,7 +1065,29 @@ static int scsi_send_eh_cmnd(struct scsi_cmnd *scmd, unsigned char *cmnd,
 
 	scsi_log_send(scmd);
 	scmd->scsi_done = scsi_eh_done;
-	rtn = shost->hostt->queuecommand(shost, scmd);
+
+	/*
+	 * Lock sdev->state_mutex to avoid that scsi_device_quiesce() can
+	 * change the SCSI device state after we have examined it and before
+	 * .queuecommand() is called.
+	 */
+	mutex_lock(&sdev->state_mutex);
+	while (sdev->sdev_state == SDEV_QUIESCE && timeleft > 0) {
+		mutex_unlock(&sdev->state_mutex);
+		SCSI_LOG_ERROR_RECOVERY(5, sdev_printk(KERN_DEBUG, sdev,
+			"%s: state %d <> %d\n", __func__, sdev->sdev_state,
+			SDEV_QUIESCE));
+		delay = min(timeleft, stall_for);
+		timeleft -= delay;
+		msleep(jiffies_to_msecs(delay));
+		mutex_lock(&sdev->state_mutex);
+	}
+	if (sdev->sdev_state != SDEV_QUIESCE)
+		rtn = shost->hostt->queuecommand(shost, scmd);
+	else
+		rtn = SCSI_MLQUEUE_DEVICE_BUSY;
+	mutex_unlock(&sdev->state_mutex);
+
 	if (rtn) {
 		if (timeleft > stall_for) {
 			scsi_eh_restore_cmnd(scmd, &ses);

From patchwork Tue Mar 26 20:43:31 2019
X-Patchwork-Submitter: Bart Van Assche
X-Patchwork-Id: 10872169
From: Bart Van Assche
To: "Martin K . Petersen", "James E . J . Bottomley"
Cc: linux-scsi@vger.kernel.org, Christoph Hellwig, Bart Van Assche,
 Hannes Reinecke, Jason Gunthorpe, Leon Romanovsky, Doug Ledford,
 Laurence Oberman
Subject: [PATCH 2/2] RDMA/srp: Fix a sleep-in-invalid-context bug
Date: Tue, 26 Mar 2019 13:43:31 -0700
Message-Id: <20190326204331.54352-3-bvanassche@acm.org>
In-Reply-To: <20190326204331.54352-1-bvanassche@acm.org>
References: <20190326204331.54352-1-bvanassche@acm.org>

The previous patch guarantees that srp_queuecommand() does not get
invoked while reconnecting occurs. Hence remove the code from
srp_queuecommand() that prevents command queueing while reconnecting.
This patch avoids that the following can appear in the kernel log:

BUG: sleeping function called from invalid context at kernel/locking/mutex.c:747
in_atomic(): 1, irqs_disabled(): 0, pid: 5600, name: scsi_eh_9
1 lock held by scsi_eh_9/5600:
 #0: (rcu_read_lock){....}, at: [<00000000cbb798c7>] __blk_mq_run_hw_queue+0xf1/0x1e0
Preemption disabled at:
[<00000000139badf2>] __blk_mq_delay_run_hw_queue+0x78/0xf0
CPU: 9 PID: 5600 Comm: scsi_eh_9 Tainted: G W 4.15.0-rc4-dbg+ #1
Hardware name: Dell Inc. PowerEdge R720/0VWT90, BIOS 2.5.4 01/22/2016
Call Trace:
 dump_stack+0x67/0x99
 ___might_sleep+0x16a/0x250 [ib_srp]
 __mutex_lock+0x46/0x9d0
 srp_queuecommand+0x356/0x420 [ib_srp]
 scsi_dispatch_cmd+0xf6/0x3f0
 scsi_queue_rq+0x4a8/0x5f0
 blk_mq_dispatch_rq_list+0x73/0x440
 blk_mq_sched_dispatch_requests+0x109/0x1a0
 __blk_mq_run_hw_queue+0x131/0x1e0
 __blk_mq_delay_run_hw_queue+0x9a/0xf0
 blk_mq_run_hw_queue+0xc0/0x1e0
 blk_mq_start_hw_queues+0x2c/0x40
 scsi_run_queue+0x18e/0x2d0
 scsi_run_host_queues+0x22/0x40
 scsi_error_handler+0x18d/0x5f0
 kthread+0x11c/0x140
 ret_from_fork+0x24/0x30

Reviewed-by: Hannes Reinecke
Cc: Jason Gunthorpe
Cc: Leon Romanovsky
Cc: Doug Ledford
Cc: Laurence Oberman
Signed-off-by: Bart Van Assche
Reviewed-by: Christoph Hellwig
---
 drivers/infiniband/ulp/srp/ib_srp.c | 21 ++-------------------
 1 file changed, 2 insertions(+), 19 deletions(-)

diff --git a/drivers/infiniband/ulp/srp/ib_srp.c b/drivers/infiniband/ulp/srp/ib_srp.c
index be9ddcad8f28..b7c5a35f7daa 100644
--- a/drivers/infiniband/ulp/srp/ib_srp.c
+++ b/drivers/infiniband/ulp/srp/ib_srp.c
@@ -2338,7 +2338,6 @@ static void srp_handle_qp_err(struct ib_cq *cq, struct ib_wc *wc,
 static int srp_queuecommand(struct Scsi_Host *shost, struct scsi_cmnd *scmnd)
 {
 	struct srp_target_port *target = host_to_target(shost);
-	struct srp_rport *rport = target->rport;
 	struct srp_rdma_ch *ch;
 	struct srp_request *req;
 	struct srp_iu *iu;
@@ -2348,16 +2347,6 @@ static int srp_queuecommand(struct Scsi_Host *shost, struct scsi_cmnd *scmnd)
 	u32 tag;
 	u16 idx;
 	int len, ret;
-	const bool in_scsi_eh = !in_interrupt() && current == shost->ehandler;
-
-	/*
-	 * The SCSI EH thread is the only context from which srp_queuecommand()
-	 * can get invoked for blocked devices (SDEV_BLOCK /
-	 * SDEV_CREATED_BLOCK). Avoid racing with srp_reconnect_rport() by
-	 * locking the rport mutex if invoked from inside the SCSI EH.
-	 */
-	if (in_scsi_eh)
-		mutex_lock(&rport->mutex);
 
 	scmnd->result = srp_chkready(target->rport);
 	if (unlikely(scmnd->result))
@@ -2426,13 +2415,7 @@ static int srp_queuecommand(struct Scsi_Host *shost, struct scsi_cmnd *scmnd)
 		goto err_unmap;
 	}
 
-	ret = 0;
-
-unlock_rport:
-	if (in_scsi_eh)
-		mutex_unlock(&rport->mutex);
-
-	return ret;
+	return 0;
 
 err_unmap:
 	srp_unmap_data(scmnd, ch, req);
@@ -2454,7 +2437,7 @@ static int srp_queuecommand(struct Scsi_Host *shost, struct scsi_cmnd *scmnd)
 		ret = SCSI_MLQUEUE_HOST_BUSY;
 	}
 
-	goto unlock_rport;
+	return ret;
 }
 
 /*
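
Illustration only, not part of either patch: the wait loop that patch 1/2 adds to scsi_send_eh_cmnd() can be tried out in userspace with a rough sketch like the one below. Every name in it (struct device, submit_unless_quiesced(), the wait and submit callbacks) is hypothetical and merely stands in for the kernel objects; a pthread mutex plays the role of sdev->state_mutex, and the caller supplies the sleep function so the example stays deterministic.

```c
#include <pthread.h>

/* Hypothetical stand-ins for the kernel types; not the real SCSI structures. */
enum dev_state { DEV_RUNNING, DEV_QUIESCE };
enum { SUBMIT_OK = 0, SUBMIT_BUSY = 1 };

struct device {
	pthread_mutex_t state_mutex;	/* plays the role of sdev->state_mutex */
	enum dev_state state;
	int submitted;			/* commands handed to the submit callback */
};

/* Example submit callback: count the call and report success. */
static int count_submit(struct device *dev)
{
	dev->submitted++;
	return SUBMIT_OK;
}

/* Example wait callback that does nothing; a real caller would sleep. */
static void wait_noop(struct device *dev, long ms)
{
	(void)dev;
	(void)ms;
}

/* Example wait callback that simulates another thread resuming the device. */
static void wait_then_resume(struct device *dev, long ms)
{
	(void)ms;
	pthread_mutex_lock(&dev->state_mutex);
	dev->state = DEV_RUNNING;
	pthread_mutex_unlock(&dev->state_mutex);
}

/*
 * Wait in slices of at most stall_ms, up to timeout_ms in total, for the
 * device to leave the quiesced state. The state check and the submit call
 * happen under the same mutex, so the state cannot change in between.
 */
static int submit_unless_quiesced(struct device *dev, long timeout_ms,
				  long stall_ms,
				  void (*wait_ms)(struct device *, long),
				  int (*submit)(struct device *))
{
	long timeleft = timeout_ms, delay;
	int rtn;

	pthread_mutex_lock(&dev->state_mutex);
	while (dev->state == DEV_QUIESCE && timeleft > 0) {
		/* Drop the lock while waiting so the state can be changed. */
		pthread_mutex_unlock(&dev->state_mutex);
		delay = timeleft < stall_ms ? timeleft : stall_ms;
		timeleft -= delay;
		wait_ms(dev, delay);
		pthread_mutex_lock(&dev->state_mutex);
	}
	if (dev->state != DEV_QUIESCE)
		rtn = submit(dev);
	else
		rtn = SUBMIT_BUSY;
	pthread_mutex_unlock(&dev->state_mutex);

	return rtn;
}
```

With wait_noop a quiesced device stays quiesced, so the helper gives up once the timeout budget is used and returns SUBMIT_BUSY without calling the submit callback. With wait_then_resume the first wait flips the state and the command is submitted.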