From patchwork Wed Apr 3 10:26:04 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Ming Lei X-Patchwork-Id: 10883233 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 8E9321575 for ; Wed, 3 Apr 2019 10:26:22 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 79290289D0 for ; Wed, 3 Apr 2019 10:26:22 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id 6D616289D3; Wed, 3 Apr 2019 10:26:22 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-7.9 required=2.0 tests=BAYES_00,MAILING_LIST_MULTI, RCVD_IN_DNSWL_HI autolearn=ham version=3.3.1 Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 0D640289D0 for ; Wed, 3 Apr 2019 10:26:22 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1726316AbfDCK0V (ORCPT ); Wed, 3 Apr 2019 06:26:21 -0400 Received: from mx1.redhat.com ([209.132.183.28]:52964 "EHLO mx1.redhat.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1726199AbfDCK0V (ORCPT ); Wed, 3 Apr 2019 06:26:21 -0400 Received: from smtp.corp.redhat.com (int-mx03.intmail.prod.int.phx2.redhat.com [10.5.11.13]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mx1.redhat.com (Postfix) with ESMTPS id 098C93091747; Wed, 3 Apr 2019 10:26:21 +0000 (UTC) Received: from localhost (ovpn-8-25.pek2.redhat.com [10.72.8.25]) by smtp.corp.redhat.com (Postfix) with ESMTP id 11C21608AA; Wed, 3 Apr 2019 10:26:19 +0000 (UTC) From: Ming Lei To: Jens Axboe Cc: linux-block@vger.kernel.org, Ming Lei , Dongli Zhang , James Smart , Bart Van Assche , linux-scsi@vger.kernel.org, "Martin K . Petersen" , Christoph Hellwig , "James E . J . Bottomley" , jianchao wang Subject: [PATCH V3 1/6] blk-mq: grab .q_usage_counter when queuing request from plug code path Date: Wed, 3 Apr 2019 18:26:04 +0800 Message-Id: <20190403102609.18707-2-ming.lei@redhat.com> In-Reply-To: <20190403102609.18707-1-ming.lei@redhat.com> References: <20190403102609.18707-1-ming.lei@redhat.com> X-Scanned-By: MIMEDefang 2.79 on 10.5.11.13 X-Greylist: Sender IP whitelisted, not delayed by milter-greylist-4.5.16 (mx1.redhat.com [10.5.110.41]); Wed, 03 Apr 2019 10:26:21 +0000 (UTC) Sender: linux-block-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-block@vger.kernel.org X-Virus-Scanned: ClamAV using ClamSMTP Just like aio/io_uring, we need to grab 2 refcount for queuing one request, one is for submission, another is for completion. If the request isn't queued from plug code path, the refcount grabbed in generic_make_request() serves for submission. In theroy, this refcount should have been released after the sumission(async run queue) is done. But blk_freeze_queue() is working with blk_sync_queue() together for canceling async run queue work activities if hctx->run_work is scheduled with holding the refcount. However, if request is staggered into plug list, and finally queued from plug code path, the refcount in submission side is actually missed. And we may start to run queue after queue is removed, then kernel oops is triggered. Fixes the issue by grab .q_usage_counter before calling blk_mq_sched_insert_requests() in blk_mq_flush_plug_list(). This way is safe because the queue is absolutely alive before inserting request. Cc: Dongli Zhang Cc: James Smart Cc: Bart Van Assche Cc: linux-scsi@vger.kernel.org, Cc: Martin K . Petersen , Cc: Christoph Hellwig , Cc: James E . J . Bottomley , Cc: jianchao wang Signed-off-by: Ming Lei --- block/blk-mq.c | 6 ++++++ 1 file changed, 6 insertions(+) diff --git a/block/blk-mq.c b/block/blk-mq.c index 3ff3d7b49969..5b586affee09 100644 --- a/block/blk-mq.c +++ b/block/blk-mq.c @@ -1728,9 +1728,12 @@ void blk_mq_flush_plug_list(struct blk_plug *plug, bool from_schedule) if (rq->mq_hctx != this_hctx || rq->mq_ctx != this_ctx) { if (this_hctx) { trace_block_unplug(this_q, depth, !from_schedule); + + percpu_ref_get(&this_q->q_usage_counter); blk_mq_sched_insert_requests(this_hctx, this_ctx, &rq_list, from_schedule); + percpu_ref_put(&this_q->q_usage_counter); } this_q = rq->q; @@ -1749,8 +1752,11 @@ void blk_mq_flush_plug_list(struct blk_plug *plug, bool from_schedule) */ if (this_hctx) { trace_block_unplug(this_q, depth, !from_schedule); + + percpu_ref_get(&this_q->q_usage_counter); blk_mq_sched_insert_requests(this_hctx, this_ctx, &rq_list, from_schedule); + percpu_ref_put(&this_q->q_usage_counter); } } From patchwork Wed Apr 3 10:26:05 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Ming Lei X-Patchwork-Id: 10883239 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id F19891800 for ; Wed, 3 Apr 2019 10:26:25 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id E060B289D0 for ; Wed, 3 Apr 2019 10:26:25 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id D4906289D2; Wed, 3 Apr 2019 10:26:25 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-7.9 required=2.0 tests=BAYES_00,MAILING_LIST_MULTI, RCVD_IN_DNSWL_HI autolearn=unavailable version=3.3.1 Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 877CD289DD for ; Wed, 3 Apr 2019 10:26:25 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1726415AbfDCK0Y (ORCPT ); Wed, 3 Apr 2019 06:26:24 -0400 Received: from mx1.redhat.com ([209.132.183.28]:49388 "EHLO mx1.redhat.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1726213AbfDCK0Y (ORCPT ); Wed, 3 Apr 2019 06:26:24 -0400 Received: from smtp.corp.redhat.com (int-mx03.intmail.prod.int.phx2.redhat.com [10.5.11.13]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mx1.redhat.com (Postfix) with ESMTPS id 4F16F308212A; Wed, 3 Apr 2019 10:26:24 +0000 (UTC) Received: from localhost (ovpn-8-25.pek2.redhat.com [10.72.8.25]) by smtp.corp.redhat.com (Postfix) with ESMTP id 827DD608CF; Wed, 3 Apr 2019 10:26:23 +0000 (UTC) From: Ming Lei To: Jens Axboe Cc: linux-block@vger.kernel.org, Ming Lei , Dongli Zhang , James Smart , Bart Van Assche , linux-scsi@vger.kernel.org, "Martin K . Petersen" , Christoph Hellwig , "James E . J . Bottomley" , jianchao wang Subject: [PATCH V3 2/6] blk-mq: move cancel of requeue_work into blk_mq_release Date: Wed, 3 Apr 2019 18:26:05 +0800 Message-Id: <20190403102609.18707-3-ming.lei@redhat.com> In-Reply-To: <20190403102609.18707-1-ming.lei@redhat.com> References: <20190403102609.18707-1-ming.lei@redhat.com> X-Scanned-By: MIMEDefang 2.79 on 10.5.11.13 X-Greylist: Sender IP whitelisted, not delayed by milter-greylist-4.5.16 (mx1.redhat.com [10.5.110.42]); Wed, 03 Apr 2019 10:26:24 +0000 (UTC) Sender: linux-block-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-block@vger.kernel.org X-Virus-Scanned: ClamAV using ClamSMTP with holding queue's kobject refcount, it is safe for driver to schedule requeue. However, blk_mq_kick_requeue_list() may be called after blk_sync_queue() is done because of concurrent requeue activities, then requeue work may not be completed when freeing queue, and kernel oops is triggered. So moving the cancel of requeue_work into blk_mq_release() for avoiding race between requeue and freeing queue. Cc: Dongli Zhang Cc: James Smart Cc: Bart Van Assche Cc: linux-scsi@vger.kernel.org, Cc: Martin K . Petersen , Cc: Christoph Hellwig , Cc: James E . J . Bottomley , Cc: jianchao wang Signed-off-by: Ming Lei Reviewed-by: Bart Van Assche --- block/blk-core.c | 1 - block/blk-mq.c | 2 ++ 2 files changed, 2 insertions(+), 1 deletion(-) diff --git a/block/blk-core.c b/block/blk-core.c index 4673ebe42255..6583d67f3e34 100644 --- a/block/blk-core.c +++ b/block/blk-core.c @@ -237,7 +237,6 @@ void blk_sync_queue(struct request_queue *q) struct blk_mq_hw_ctx *hctx; int i; - cancel_delayed_work_sync(&q->requeue_work); queue_for_each_hw_ctx(q, hctx, i) cancel_delayed_work_sync(&hctx->run_work); } diff --git a/block/blk-mq.c b/block/blk-mq.c index 5b586affee09..b512ba0cb359 100644 --- a/block/blk-mq.c +++ b/block/blk-mq.c @@ -2626,6 +2626,8 @@ void blk_mq_release(struct request_queue *q) struct blk_mq_hw_ctx *hctx; unsigned int i; + cancel_delayed_work_sync(&q->requeue_work); + /* hctx kobj stays in hctx */ queue_for_each_hw_ctx(q, hctx, i) { if (!hctx) From patchwork Wed Apr 3 10:26:06 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Ming Lei X-Patchwork-Id: 10883241 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id D9D3E1575 for ; Wed, 3 Apr 2019 10:26:32 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id C0574289D0 for ; Wed, 3 Apr 2019 10:26:32 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id B40F1289D4; Wed, 3 Apr 2019 10:26:32 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-7.9 required=2.0 tests=BAYES_00,MAILING_LIST_MULTI, RCVD_IN_DNSWL_HI autolearn=ham version=3.3.1 Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 320B5289D0 for ; Wed, 3 Apr 2019 10:26:32 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1726444AbfDCK0b (ORCPT ); Wed, 3 Apr 2019 06:26:31 -0400 Received: from mx1.redhat.com ([209.132.183.28]:49442 "EHLO mx1.redhat.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1726442AbfDCK0b (ORCPT ); Wed, 3 Apr 2019 06:26:31 -0400 Received: from smtp.corp.redhat.com (int-mx07.intmail.prod.int.phx2.redhat.com [10.5.11.22]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mx1.redhat.com (Postfix) with ESMTPS id 74CCC3086203; Wed, 3 Apr 2019 10:26:30 +0000 (UTC) Received: from localhost (ovpn-8-25.pek2.redhat.com [10.72.8.25]) by smtp.corp.redhat.com (Postfix) with ESMTP id 85A601018A38; Wed, 3 Apr 2019 10:26:27 +0000 (UTC) From: Ming Lei To: Jens Axboe Cc: linux-block@vger.kernel.org, Ming Lei , Dongli Zhang , James Smart , Bart Van Assche , linux-scsi@vger.kernel.org, "Martin K . Petersen" , Christoph Hellwig , "James E . J . Bottomley" , jianchao wang , stable@vger.kernel.org Subject: [PATCH V3 3/6] blk-mq: free hw queue's resource in hctx's release handler Date: Wed, 3 Apr 2019 18:26:06 +0800 Message-Id: <20190403102609.18707-4-ming.lei@redhat.com> In-Reply-To: <20190403102609.18707-1-ming.lei@redhat.com> References: <20190403102609.18707-1-ming.lei@redhat.com> X-Scanned-By: MIMEDefang 2.84 on 10.5.11.22 X-Greylist: Sender IP whitelisted, not delayed by milter-greylist-4.5.16 (mx1.redhat.com [10.5.110.42]); Wed, 03 Apr 2019 10:26:30 +0000 (UTC) Sender: linux-block-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-block@vger.kernel.org X-Virus-Scanned: ClamAV using ClamSMTP Once blk_cleanup_queue() returns, tags shouldn't be used any more, because blk_mq_free_tag_set() may be called. Commit 45a9c9d909b2 ("blk-mq: Fix a use-after-free") fixes this issue exactly. However, that commit introduces another issue. Before 45a9c9d909b2, we are allowed to run queue during cleaning up queue if the queue's kobj refcount is held. After that commit, queue can't be run during queue cleaning up, otherwise oops can be triggered easily because some fields of hctx are freed by blk_mq_free_queue() in blk_cleanup_queue(). We have invented ways for addressing this kind of issue before, such as: 8dc765d438f1 ("SCSI: fix queue cleanup race before queue initialization is done") c2856ae2f315 ("blk-mq: quiesce queue before freeing queue") But still can't cover all cases, recently James reports another such kind of issue: https://marc.info/?l=linux-scsi&m=155389088124782&w=2 This issue can be quite hard to address by previous way, given scsi_run_queue() may run requeues for other LUNs. Fixes the above issue by freeing hctx's resources in its release handler, and this way is safe becasue tags isn't needed for freeing such hctx resource. This approach follows typical design pattern wrt. kobject's release handler. Cc: Dongli Zhang Cc: James Smart Cc: Bart Van Assche Cc: linux-scsi@vger.kernel.org, Cc: Martin K . Petersen , Cc: Christoph Hellwig , Cc: James E . J . Bottomley , Cc: jianchao wang Reported-by: James Smart Fixes: 45a9c9d909b2 ("blk-mq: Fix a use-after-free") Cc: stable@vger.kernel.org Signed-off-by: Ming Lei --- block/blk-core.c | 2 +- block/blk-mq-sysfs.c | 6 ++++++ block/blk-mq.c | 8 ++------ block/blk-mq.h | 2 +- 4 files changed, 10 insertions(+), 8 deletions(-) diff --git a/block/blk-core.c b/block/blk-core.c index 6583d67f3e34..20298aa5a77c 100644 --- a/block/blk-core.c +++ b/block/blk-core.c @@ -374,7 +374,7 @@ void blk_cleanup_queue(struct request_queue *q) blk_exit_queue(q); if (queue_is_mq(q)) - blk_mq_free_queue(q); + blk_mq_exit_queue(q); percpu_ref_exit(&q->q_usage_counter); diff --git a/block/blk-mq-sysfs.c b/block/blk-mq-sysfs.c index 3f9c3f4ac44c..4040e62c3737 100644 --- a/block/blk-mq-sysfs.c +++ b/block/blk-mq-sysfs.c @@ -10,6 +10,7 @@ #include #include +#include "blk.h" #include "blk-mq.h" #include "blk-mq-tag.h" @@ -33,6 +34,11 @@ static void blk_mq_hw_sysfs_release(struct kobject *kobj) { struct blk_mq_hw_ctx *hctx = container_of(kobj, struct blk_mq_hw_ctx, kobj); + + if (hctx->flags & BLK_MQ_F_BLOCKING) + cleanup_srcu_struct(hctx->srcu); + blk_free_flush_queue(hctx->fq); + sbitmap_free(&hctx->ctx_map); free_cpumask_var(hctx->cpumask); kfree(hctx->ctxs); kfree(hctx); diff --git a/block/blk-mq.c b/block/blk-mq.c index b512ba0cb359..afc9912e2e42 100644 --- a/block/blk-mq.c +++ b/block/blk-mq.c @@ -2259,12 +2259,7 @@ static void blk_mq_exit_hctx(struct request_queue *q, if (set->ops->exit_hctx) set->ops->exit_hctx(hctx, hctx_idx); - if (hctx->flags & BLK_MQ_F_BLOCKING) - cleanup_srcu_struct(hctx->srcu); - blk_mq_remove_cpuhp(hctx); - blk_free_flush_queue(hctx->fq); - sbitmap_free(&hctx->ctx_map); } static void blk_mq_exit_hw_queues(struct request_queue *q, @@ -2899,7 +2894,8 @@ struct request_queue *blk_mq_init_allocated_queue(struct blk_mq_tag_set *set, } EXPORT_SYMBOL(blk_mq_init_allocated_queue); -void blk_mq_free_queue(struct request_queue *q) +/* tags can _not_ be used after returning from blk_mq_exit_queue */ +void blk_mq_exit_queue(struct request_queue *q) { struct blk_mq_tag_set *set = q->tag_set; diff --git a/block/blk-mq.h b/block/blk-mq.h index d704fc7766f4..c421e3a16e36 100644 --- a/block/blk-mq.h +++ b/block/blk-mq.h @@ -37,7 +37,7 @@ struct blk_mq_ctx { struct kobject kobj; } ____cacheline_aligned_in_smp; -void blk_mq_free_queue(struct request_queue *q); +void blk_mq_exit_queue(struct request_queue *q); int blk_mq_update_nr_requests(struct request_queue *q, unsigned int nr); void blk_mq_wake_waiters(struct request_queue *q); bool blk_mq_dispatch_rq_list(struct request_queue *, struct list_head *, bool); From patchwork Wed Apr 3 10:26:07 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Ming Lei X-Patchwork-Id: 10883247 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id EB2731800 for ; Wed, 3 Apr 2019 10:26:38 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id DA0A0289C0 for ; Wed, 3 Apr 2019 10:26:38 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id CE2F5289D2; Wed, 3 Apr 2019 10:26:38 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-7.9 required=2.0 tests=BAYES_00,MAILING_LIST_MULTI, RCVD_IN_DNSWL_HI autolearn=unavailable version=3.3.1 Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 787FD289DD for ; Wed, 3 Apr 2019 10:26:38 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1726515AbfDCK0h (ORCPT ); Wed, 3 Apr 2019 06:26:37 -0400 Received: from mx1.redhat.com ([209.132.183.28]:53062 "EHLO mx1.redhat.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1726442AbfDCK0h (ORCPT ); Wed, 3 Apr 2019 06:26:37 -0400 Received: from smtp.corp.redhat.com (int-mx03.intmail.prod.int.phx2.redhat.com [10.5.11.13]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mx1.redhat.com (Postfix) with ESMTPS id 403EE3091761; Wed, 3 Apr 2019 10:26:37 +0000 (UTC) Received: from localhost (ovpn-8-25.pek2.redhat.com [10.72.8.25]) by smtp.corp.redhat.com (Postfix) with ESMTP id EDB37608BA; Wed, 3 Apr 2019 10:26:32 +0000 (UTC) From: Ming Lei To: Jens Axboe Cc: linux-block@vger.kernel.org, Ming Lei , Dongli Zhang , James Smart , Bart Van Assche , linux-scsi@vger.kernel.org, "Martin K . Petersen" , Christoph Hellwig , "James E . J . Bottomley" , jianchao wang Subject: [PATCH V3 4/6] blk-mq: move cancel of hctx->run_work into blk_mq_hw_sysfs_release Date: Wed, 3 Apr 2019 18:26:07 +0800 Message-Id: <20190403102609.18707-5-ming.lei@redhat.com> In-Reply-To: <20190403102609.18707-1-ming.lei@redhat.com> References: <20190403102609.18707-1-ming.lei@redhat.com> X-Scanned-By: MIMEDefang 2.79 on 10.5.11.13 X-Greylist: Sender IP whitelisted, not delayed by milter-greylist-4.5.16 (mx1.redhat.com [10.5.110.41]); Wed, 03 Apr 2019 10:26:37 +0000 (UTC) Sender: linux-block-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-block@vger.kernel.org X-Virus-Scanned: ClamAV using ClamSMTP hctx is always released after requeue is freed. With holding queue's kobject refcount, it is safe for driver to run queue, so one run queue might be scheduled after blk_sync_queue() is done. So moving the cancel of hctx->run_work into blk_mq_hw_sysfs_release() for avoiding run released queue. Cc: Dongli Zhang Cc: James Smart Cc: Bart Van Assche Cc: linux-scsi@vger.kernel.org, Cc: Martin K . Petersen , Cc: Christoph Hellwig , Cc: James E . J . Bottomley , Cc: jianchao wang Signed-off-by: Ming Lei Reviewed-by: Bart Van Assche --- block/blk-core.c | 8 -------- block/blk-mq-sysfs.c | 2 ++ 2 files changed, 2 insertions(+), 8 deletions(-) diff --git a/block/blk-core.c b/block/blk-core.c index 20298aa5a77c..ad17e999f79e 100644 --- a/block/blk-core.c +++ b/block/blk-core.c @@ -232,14 +232,6 @@ void blk_sync_queue(struct request_queue *q) { del_timer_sync(&q->timeout); cancel_work_sync(&q->timeout_work); - - if (queue_is_mq(q)) { - struct blk_mq_hw_ctx *hctx; - int i; - - queue_for_each_hw_ctx(q, hctx, i) - cancel_delayed_work_sync(&hctx->run_work); - } } EXPORT_SYMBOL(blk_sync_queue); diff --git a/block/blk-mq-sysfs.c b/block/blk-mq-sysfs.c index 4040e62c3737..25c0d0a6a556 100644 --- a/block/blk-mq-sysfs.c +++ b/block/blk-mq-sysfs.c @@ -35,6 +35,8 @@ static void blk_mq_hw_sysfs_release(struct kobject *kobj) struct blk_mq_hw_ctx *hctx = container_of(kobj, struct blk_mq_hw_ctx, kobj); + cancel_delayed_work_sync(&hctx->run_work); + if (hctx->flags & BLK_MQ_F_BLOCKING) cleanup_srcu_struct(hctx->srcu); blk_free_flush_queue(hctx->fq); From patchwork Wed Apr 3 10:26:08 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Ming Lei X-Patchwork-Id: 10883249 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 77DC31575 for ; Wed, 3 Apr 2019 10:26:44 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 602AF289D2 for ; Wed, 3 Apr 2019 10:26:44 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id 4DC25289D4; Wed, 3 Apr 2019 10:26:44 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-7.9 required=2.0 tests=BAYES_00,MAILING_LIST_MULTI, RCVD_IN_DNSWL_HI autolearn=ham version=3.3.1 Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id C4977289C0 for ; Wed, 3 Apr 2019 10:26:43 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1726442AbfDCK0n (ORCPT ); Wed, 3 Apr 2019 06:26:43 -0400 Received: from mx1.redhat.com ([209.132.183.28]:53662 "EHLO mx1.redhat.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1726424AbfDCK0n (ORCPT ); Wed, 3 Apr 2019 06:26:43 -0400 Received: from smtp.corp.redhat.com (int-mx01.intmail.prod.int.phx2.redhat.com [10.5.11.11]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mx1.redhat.com (Postfix) with ESMTPS id D11B9C1306F9; Wed, 3 Apr 2019 10:26:42 +0000 (UTC) Received: from localhost (ovpn-8-25.pek2.redhat.com [10.72.8.25]) by smtp.corp.redhat.com (Postfix) with ESMTP id CD74060147; Wed, 3 Apr 2019 10:26:39 +0000 (UTC) From: Ming Lei To: Jens Axboe Cc: linux-block@vger.kernel.org, Ming Lei , Dongli Zhang , James Smart , Bart Van Assche , linux-scsi@vger.kernel.org, "Martin K . Petersen" , Christoph Hellwig , "James E . J . Bottomley" , jianchao wang Subject: [PATCH V3 5/6] block: don't drain in-progress dispatch in blk_cleanup_queue() Date: Wed, 3 Apr 2019 18:26:08 +0800 Message-Id: <20190403102609.18707-6-ming.lei@redhat.com> In-Reply-To: <20190403102609.18707-1-ming.lei@redhat.com> References: <20190403102609.18707-1-ming.lei@redhat.com> X-Scanned-By: MIMEDefang 2.79 on 10.5.11.11 X-Greylist: Sender IP whitelisted, not delayed by milter-greylist-4.5.16 (mx1.redhat.com [10.5.110.31]); Wed, 03 Apr 2019 10:26:42 +0000 (UTC) Sender: linux-block-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-block@vger.kernel.org X-Virus-Scanned: ClamAV using ClamSMTP Now freeing hw queue resource is moved to hctx's release handler, we don't need to worry about the race between blk_cleanup_queue and run queue any more. So don't drain in-progress dispatch in blk_cleanup_queue(). This is basically revert of c2856ae2f315 ("blk-mq: quiesce queue before freeing queue"). Cc: Dongli Zhang Cc: James Smart Cc: Bart Van Assche Cc: linux-scsi@vger.kernel.org, Cc: Martin K . Petersen , Cc: Christoph Hellwig , Cc: James E . J . Bottomley , Cc: jianchao wang Signed-off-by: Ming Lei Reviewed-by: Bart Van Assche --- block/blk-core.c | 12 ------------ 1 file changed, 12 deletions(-) diff --git a/block/blk-core.c b/block/blk-core.c index ad17e999f79e..82b630da7e2c 100644 --- a/block/blk-core.c +++ b/block/blk-core.c @@ -338,18 +338,6 @@ void blk_cleanup_queue(struct request_queue *q) blk_queue_flag_set(QUEUE_FLAG_DEAD, q); - /* - * make sure all in-progress dispatch are completed because - * blk_freeze_queue() can only complete all requests, and - * dispatch may still be in-progress since we dispatch requests - * from more than one contexts. - * - * We rely on driver to deal with the race in case that queue - * initialization isn't done. - */ - if (queue_is_mq(q) && blk_queue_init_done(q)) - blk_mq_quiesce_queue(q); - /* for synchronous bio-based driver finish in-flight integrity i/o */ blk_flush_integrity(); From patchwork Wed Apr 3 10:26:09 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Ming Lei X-Patchwork-Id: 10883253 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 4FD8017E9 for ; Wed, 3 Apr 2019 10:26:48 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 36E6A28628 for ; Wed, 3 Apr 2019 10:26:48 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id 25B06289D4; Wed, 3 Apr 2019 10:26:48 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-7.9 required=2.0 tests=BAYES_00,MAILING_LIST_MULTI, RCVD_IN_DNSWL_HI autolearn=ham version=3.3.1 Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id A170C28628 for ; Wed, 3 Apr 2019 10:26:47 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1726544AbfDCK0r (ORCPT ); Wed, 3 Apr 2019 06:26:47 -0400 Received: from mx1.redhat.com ([209.132.183.28]:53628 "EHLO mx1.redhat.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1726387AbfDCK0r (ORCPT ); Wed, 3 Apr 2019 06:26:47 -0400 Received: from smtp.corp.redhat.com (int-mx05.intmail.prod.int.phx2.redhat.com [10.5.11.15]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mx1.redhat.com (Postfix) with ESMTPS id 3D33037E8F; Wed, 3 Apr 2019 10:26:46 +0000 (UTC) Received: from localhost (ovpn-8-25.pek2.redhat.com [10.72.8.25]) by smtp.corp.redhat.com (Postfix) with ESMTP id 7232E62A14; Wed, 3 Apr 2019 10:26:45 +0000 (UTC) From: Ming Lei To: Jens Axboe Cc: linux-block@vger.kernel.org, Ming Lei , Dongli Zhang , James Smart , Bart Van Assche , linux-scsi@vger.kernel.org, "Martin K . Petersen" , Christoph Hellwig , "James E . J . Bottomley" , jianchao wang Subject: [PATCH V3 6/6] SCSI: don't hold device refcount in IO path Date: Wed, 3 Apr 2019 18:26:09 +0800 Message-Id: <20190403102609.18707-7-ming.lei@redhat.com> In-Reply-To: <20190403102609.18707-1-ming.lei@redhat.com> References: <20190403102609.18707-1-ming.lei@redhat.com> X-Scanned-By: MIMEDefang 2.79 on 10.5.11.15 X-Greylist: Sender IP whitelisted, not delayed by milter-greylist-4.5.16 (mx1.redhat.com [10.5.110.29]); Wed, 03 Apr 2019 10:26:46 +0000 (UTC) Sender: linux-block-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-block@vger.kernel.org X-Virus-Scanned: ClamAV using ClamSMTP scsi_device's refcount is always grabbed in IO path. Turns out it isn't necessary, because blk_queue_cleanup() will drain any in-flight IOs, then cancel timeout/requeue work, and SCSI's requeue_work is canceled too in __scsi_remove_device(). Also scsi_device won't go away until blk_cleanup_queue() is done. So don't hold the refcount in IO path, especially the refcount isn't required in IO path since blk_queue_enter() / blk_queue_exit() is introduced in the legacy block layer. Cc: Dongli Zhang Cc: James Smart Cc: Bart Van Assche Cc: linux-scsi@vger.kernel.org, Cc: Martin K . Petersen , Cc: Christoph Hellwig , Cc: James E . J . Bottomley , Cc: jianchao wang Signed-off-by: Ming Lei --- drivers/scsi/scsi_lib.c | 30 ++++++------------------------ 1 file changed, 6 insertions(+), 24 deletions(-) diff --git a/drivers/scsi/scsi_lib.c b/drivers/scsi/scsi_lib.c index 601b9f1de267..6772d2ae1357 100644 --- a/drivers/scsi/scsi_lib.c +++ b/drivers/scsi/scsi_lib.c @@ -141,8 +141,6 @@ scsi_set_blocked(struct scsi_cmnd *cmd, int reason) static void scsi_mq_requeue_cmd(struct scsi_cmnd *cmd) { - struct scsi_device *sdev = cmd->device; - if (cmd->request->rq_flags & RQF_DONTPREP) { cmd->request->rq_flags &= ~RQF_DONTPREP; scsi_mq_uninit_cmd(cmd); @@ -150,7 +148,6 @@ static void scsi_mq_requeue_cmd(struct scsi_cmnd *cmd) WARN_ON_ONCE(true); } blk_mq_requeue_request(cmd->request, true); - put_device(&sdev->sdev_gendev); } /** @@ -190,18 +187,12 @@ static void __scsi_queue_insert(struct scsi_cmnd *cmd, int reason, bool unbusy) cmd->result = 0; /* - * Before a SCSI command is dispatched, - * get_device(&sdev->sdev_gendev) is called and the host, - * target and device busy counters are increased. Since - * requeuing a request causes these actions to be repeated and - * since scsi_device_unbusy() has already been called, - * put_device(&device->sdev_gendev) must still be called. Call - * put_device() after blk_mq_requeue_request() to avoid that - * removal of the SCSI device can start before requeueing has - * happened. + * Before a SCSI command is dispatched, the host, target and + * device busy counters are increased. Since requeuing a request + * causes these actions to be repeated and since scsi_device_unbusy() + * has already been called. */ blk_mq_requeue_request(cmd->request, true); - put_device(&device->sdev_gendev); } /* @@ -619,7 +610,6 @@ static bool scsi_end_request(struct request *req, blk_status_t error, blk_mq_run_hw_queues(q, true); percpu_ref_put(&q->q_usage_counter); - put_device(&sdev->sdev_gendev); return false; } @@ -1613,7 +1603,6 @@ static void scsi_mq_put_budget(struct blk_mq_hw_ctx *hctx) struct scsi_device *sdev = q->queuedata; atomic_dec(&sdev->device_busy); - put_device(&sdev->sdev_gendev); } static bool scsi_mq_get_budget(struct blk_mq_hw_ctx *hctx) @@ -1621,16 +1610,9 @@ static bool scsi_mq_get_budget(struct blk_mq_hw_ctx *hctx) struct request_queue *q = hctx->queue; struct scsi_device *sdev = q->queuedata; - if (!get_device(&sdev->sdev_gendev)) - goto out; - if (!scsi_dev_queue_ready(q, sdev)) - goto out_put_device; - - return true; + if (scsi_dev_queue_ready(q, sdev)) + return true; -out_put_device: - put_device(&sdev->sdev_gendev); -out: if (atomic_read(&sdev->device_busy) == 0 && !scsi_device_blocked(sdev)) blk_mq_delay_run_hw_queue(hctx, SCSI_QUEUE_DELAY); return false;