From patchwork Mon Apr 1 04:42:45 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Ming Lei X-Patchwork-Id: 10879103 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id E93E6184E for ; Mon, 1 Apr 2019 04:43:09 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id D99B228823 for ; Mon, 1 Apr 2019 04:43:09 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id B9775287E0; Mon, 1 Apr 2019 04:43:09 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-7.9 required=2.0 tests=BAYES_00,MAILING_LIST_MULTI, RCVD_IN_DNSWL_HI autolearn=ham version=3.3.1 Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 2A15D287E0 for ; Mon, 1 Apr 2019 04:43:09 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1726624AbfDAEnE (ORCPT ); Mon, 1 Apr 2019 00:43:04 -0400 Received: from mx1.redhat.com ([209.132.183.28]:53632 "EHLO mx1.redhat.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1725860AbfDAEnE (ORCPT ); Mon, 1 Apr 2019 00:43:04 -0400 Received: from smtp.corp.redhat.com (int-mx06.intmail.prod.int.phx2.redhat.com [10.5.11.16]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mx1.redhat.com (Postfix) with ESMTPS id 13EAF307CDD5; Mon, 1 Apr 2019 04:43:04 +0000 (UTC) Received: from localhost (ovpn-8-23.pek2.redhat.com [10.72.8.23]) by smtp.corp.redhat.com (Postfix) with ESMTP id 205975C219; Mon, 1 Apr 2019 04:43:00 +0000 (UTC) From: Ming Lei To: Jens Axboe Cc: linux-block@vger.kernel.org, Ming Lei , Dongli Zhang , James Smart , Bart Van Assche , linux-scsi@vger.kernel.org, "Martin K . Petersen" , Christoph Hellwig , "James E . J . Bottomley" , jianchao wang , stable@vger.kernel.org Subject: [PATCH V2 1/3] blk-mq: free hw queue's resource in hctx's release handler Date: Mon, 1 Apr 2019 12:42:45 +0800 Message-Id: <20190401044247.29881-2-ming.lei@redhat.com> In-Reply-To: <20190401044247.29881-1-ming.lei@redhat.com> References: <20190401044247.29881-1-ming.lei@redhat.com> X-Scanned-By: MIMEDefang 2.79 on 10.5.11.16 X-Greylist: Sender IP whitelisted, not delayed by milter-greylist-4.5.16 (mx1.redhat.com [10.5.110.49]); Mon, 01 Apr 2019 04:43:04 +0000 (UTC) Sender: linux-scsi-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-scsi@vger.kernel.org X-Virus-Scanned: ClamAV using ClamSMTP Once blk_cleanup_queue() returns, tags shouldn't be used any more, because blk_mq_free_tag_set() may be called. Commit 45a9c9d909b2 ("blk-mq: Fix a use-after-free") fixes this issue exactly. However, that commit introduces another issue. Before 45a9c9d909b2, we are allowed to run queue during cleaning up queue if the queue's kobj refcount is held. After that commit, queue can't be run during queue cleaning up, otherwise oops can be triggered easily because some fields of hctx are freed by blk_mq_free_queue() in blk_cleanup_queue(). We have invented ways for addressing this kind of issue before, such as: 8dc765d438f1 ("SCSI: fix queue cleanup race before queue initialization is done") c2856ae2f315 ("blk-mq: quiesce queue before freeing queue") But still can't cover all cases, recently James reports another such kind of issue: https://marc.info/?l=linux-scsi&m=155389088124782&w=2 This issue can be quite hard to address by previous way, given scsi_run_queue() may run requeues for other LUNs. Fixes the above issue by freeing hctx's resources in its release handler, and this way is safe becasue tags isn't needed for freeing such hctx resource. This approach follows typical design pattern wrt. kobject's release handler. Cc: Dongli Zhang Cc: James Smart Cc: Bart Van Assche Cc: linux-scsi@vger.kernel.org, Cc: Martin K . Petersen , Cc: Christoph Hellwig , Cc: James E . J . Bottomley , Cc: jianchao wang Reported-by: James Smart Fixes: 45a9c9d909b2 ("blk-mq: Fix a use-after-free") Cc: stable@vger.kernel.org Signed-off-by: Ming Lei --- block/blk-core.c | 2 +- block/blk-mq-sysfs.c | 6 ++++++ block/blk-mq.c | 8 ++------ block/blk-mq.h | 2 +- 4 files changed, 10 insertions(+), 8 deletions(-) diff --git a/block/blk-core.c b/block/blk-core.c index 4673ebe42255..b3bbf8a5110d 100644 --- a/block/blk-core.c +++ b/block/blk-core.c @@ -375,7 +375,7 @@ void blk_cleanup_queue(struct request_queue *q) blk_exit_queue(q); if (queue_is_mq(q)) - blk_mq_free_queue(q); + blk_mq_exit_queue(q); percpu_ref_exit(&q->q_usage_counter); diff --git a/block/blk-mq-sysfs.c b/block/blk-mq-sysfs.c index 3f9c3f4ac44c..4040e62c3737 100644 --- a/block/blk-mq-sysfs.c +++ b/block/blk-mq-sysfs.c @@ -10,6 +10,7 @@ #include #include +#include "blk.h" #include "blk-mq.h" #include "blk-mq-tag.h" @@ -33,6 +34,11 @@ static void blk_mq_hw_sysfs_release(struct kobject *kobj) { struct blk_mq_hw_ctx *hctx = container_of(kobj, struct blk_mq_hw_ctx, kobj); + + if (hctx->flags & BLK_MQ_F_BLOCKING) + cleanup_srcu_struct(hctx->srcu); + blk_free_flush_queue(hctx->fq); + sbitmap_free(&hctx->ctx_map); free_cpumask_var(hctx->cpumask); kfree(hctx->ctxs); kfree(hctx); diff --git a/block/blk-mq.c b/block/blk-mq.c index 70b210a308c4..05d11149e75a 100644 --- a/block/blk-mq.c +++ b/block/blk-mq.c @@ -2243,12 +2243,7 @@ static void blk_mq_exit_hctx(struct request_queue *q, if (set->ops->exit_hctx) set->ops->exit_hctx(hctx, hctx_idx); - if (hctx->flags & BLK_MQ_F_BLOCKING) - cleanup_srcu_struct(hctx->srcu); - blk_mq_remove_cpuhp(hctx); - blk_free_flush_queue(hctx->fq); - sbitmap_free(&hctx->ctx_map); } static void blk_mq_exit_hw_queues(struct request_queue *q, @@ -2881,7 +2876,8 @@ struct request_queue *blk_mq_init_allocated_queue(struct blk_mq_tag_set *set, } EXPORT_SYMBOL(blk_mq_init_allocated_queue); -void blk_mq_free_queue(struct request_queue *q) +/* tags can _not_ be used after returning from blk_mq_exit_queue */ +void blk_mq_exit_queue(struct request_queue *q) { struct blk_mq_tag_set *set = q->tag_set; diff --git a/block/blk-mq.h b/block/blk-mq.h index 0ed8e5a8729f..0a5d0893dae4 100644 --- a/block/blk-mq.h +++ b/block/blk-mq.h @@ -37,7 +37,7 @@ struct blk_mq_ctx { struct kobject kobj; } ____cacheline_aligned_in_smp; -void blk_mq_free_queue(struct request_queue *q); +void blk_mq_exit_queue(struct request_queue *q); int blk_mq_update_nr_requests(struct request_queue *q, unsigned int nr); void blk_mq_wake_waiters(struct request_queue *q); bool blk_mq_dispatch_rq_list(struct request_queue *, struct list_head *, bool); From patchwork Mon Apr 1 04:42:46 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Ming Lei X-Patchwork-Id: 10879105 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 2D4E813B5 for ; Mon, 1 Apr 2019 04:43:10 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 164852882E for ; Mon, 1 Apr 2019 04:43:10 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id 0A494287E0; Mon, 1 Apr 2019 04:43:10 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-7.9 required=2.0 tests=BAYES_00,MAILING_LIST_MULTI, RCVD_IN_DNSWL_HI autolearn=unavailable version=3.3.1 Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 94C422881E for ; Mon, 1 Apr 2019 04:43:09 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1726656AbfDAEnI (ORCPT ); Mon, 1 Apr 2019 00:43:08 -0400 Received: from mx1.redhat.com ([209.132.183.28]:57858 "EHLO mx1.redhat.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1725860AbfDAEnH (ORCPT ); Mon, 1 Apr 2019 00:43:07 -0400 Received: from smtp.corp.redhat.com (int-mx06.intmail.prod.int.phx2.redhat.com [10.5.11.16]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mx1.redhat.com (Postfix) with ESMTPS id 6633F83F3C; Mon, 1 Apr 2019 04:43:07 +0000 (UTC) Received: from localhost (ovpn-8-23.pek2.redhat.com [10.72.8.23]) by smtp.corp.redhat.com (Postfix) with ESMTP id 8780B5C219; Mon, 1 Apr 2019 04:43:06 +0000 (UTC) From: Ming Lei To: Jens Axboe Cc: linux-block@vger.kernel.org, Ming Lei , Dongli Zhang , James Smart , Bart Van Assche , linux-scsi@vger.kernel.org, "Martin K . Petersen" , Christoph Hellwig , "James E . J . Bottomley" , jianchao wang Subject: [PATCH V2 2/3] block: don't drain in-progress dispatch in blk_cleanup_queue() Date: Mon, 1 Apr 2019 12:42:46 +0800 Message-Id: <20190401044247.29881-3-ming.lei@redhat.com> In-Reply-To: <20190401044247.29881-1-ming.lei@redhat.com> References: <20190401044247.29881-1-ming.lei@redhat.com> X-Scanned-By: MIMEDefang 2.79 on 10.5.11.16 X-Greylist: Sender IP whitelisted, not delayed by milter-greylist-4.5.16 (mx1.redhat.com [10.5.110.27]); Mon, 01 Apr 2019 04:43:07 +0000 (UTC) Sender: linux-scsi-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-scsi@vger.kernel.org X-Virus-Scanned: ClamAV using ClamSMTP Now freeing hw queue resource is moved to hctx's release handler, we don't need to worry about the race between blk_cleanup_queue and run queue any more. So don't drain in-progress dispatch in blk_cleanup_queue(). This is basically revert of c2856ae2f315 ("blk-mq: quiesce queue before freeing queue"). Cc: Dongli Zhang Cc: James Smart Cc: Bart Van Assche Cc: linux-scsi@vger.kernel.org, Cc: Martin K . Petersen , Cc: Christoph Hellwig , Cc: James E . J . Bottomley , Cc: jianchao wang Signed-off-by: Ming Lei --- block/blk-core.c | 12 ------------ 1 file changed, 12 deletions(-) diff --git a/block/blk-core.c b/block/blk-core.c index b3bbf8a5110d..491dc0295778 100644 --- a/block/blk-core.c +++ b/block/blk-core.c @@ -347,18 +347,6 @@ void blk_cleanup_queue(struct request_queue *q) blk_queue_flag_set(QUEUE_FLAG_DEAD, q); - /* - * make sure all in-progress dispatch are completed because - * blk_freeze_queue() can only complete all requests, and - * dispatch may still be in-progress since we dispatch requests - * from more than one contexts. - * - * We rely on driver to deal with the race in case that queue - * initialization isn't done. - */ - if (queue_is_mq(q) && blk_queue_init_done(q)) - blk_mq_quiesce_queue(q); - /* for synchronous bio-based driver finish in-flight integrity i/o */ blk_flush_integrity(); From patchwork Mon Apr 1 04:42:47 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Ming Lei X-Patchwork-Id: 10879111 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 7A246184E for ; Mon, 1 Apr 2019 04:43:14 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 6934A287E0 for ; Mon, 1 Apr 2019 04:43:14 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id 5D34D28827; Mon, 1 Apr 2019 04:43:14 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-7.9 required=2.0 tests=BAYES_00,MAILING_LIST_MULTI, RCVD_IN_DNSWL_HI autolearn=unavailable version=3.3.1 Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 06A42287E8 for ; Mon, 1 Apr 2019 04:43:14 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1726762AbfDAEnN (ORCPT ); Mon, 1 Apr 2019 00:43:13 -0400 Received: from mx1.redhat.com ([209.132.183.28]:45328 "EHLO mx1.redhat.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1725860AbfDAEnN (ORCPT ); Mon, 1 Apr 2019 00:43:13 -0400 Received: from smtp.corp.redhat.com (int-mx04.intmail.prod.int.phx2.redhat.com [10.5.11.14]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mx1.redhat.com (Postfix) with ESMTPS id D99B5308A946; Mon, 1 Apr 2019 04:43:12 +0000 (UTC) Received: from localhost (ovpn-8-23.pek2.redhat.com [10.72.8.23]) by smtp.corp.redhat.com (Postfix) with ESMTP id 0BB9C5D9C8; Mon, 1 Apr 2019 04:43:09 +0000 (UTC) From: Ming Lei To: Jens Axboe Cc: linux-block@vger.kernel.org, Ming Lei , Dongli Zhang , James Smart , Bart Van Assche , linux-scsi@vger.kernel.org, "Martin K . Petersen" , Christoph Hellwig , "James E . J . Bottomley" , jianchao wang Subject: [PATCH V2 3/3] SCSI: don't grab queue usage counter before run queue Date: Mon, 1 Apr 2019 12:42:47 +0800 Message-Id: <20190401044247.29881-4-ming.lei@redhat.com> In-Reply-To: <20190401044247.29881-1-ming.lei@redhat.com> References: <20190401044247.29881-1-ming.lei@redhat.com> X-Scanned-By: MIMEDefang 2.79 on 10.5.11.14 X-Greylist: Sender IP whitelisted, not delayed by milter-greylist-4.5.16 (mx1.redhat.com [10.5.110.41]); Mon, 01 Apr 2019 04:43:13 +0000 (UTC) Sender: linux-scsi-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-scsi@vger.kernel.org X-Virus-Scanned: ClamAV using ClamSMTP Now freeing hw queue resource is moved to hctx's release handler, we don't need to worry about the possible race between blk_cleanup_queue and run queue. So don't grab the queue usage counter before run queue in scsi_end_request(). This is basically revert of 8dc765d438f1 ("SCSI: fix queue cleanup race before queue initialization is done"). Cc: Dongli Zhang Cc: James Smart Cc: Bart Van Assche Cc: linux-scsi@vger.kernel.org, Cc: Martin K . Petersen , Cc: Christoph Hellwig , Cc: James E . J . Bottomley , Cc: jianchao wang Signed-off-by: Ming Lei --- drivers/scsi/scsi_lib.c | 7 ------- 1 file changed, 7 deletions(-) diff --git a/drivers/scsi/scsi_lib.c b/drivers/scsi/scsi_lib.c index 601b9f1de267..18bf341d1236 100644 --- a/drivers/scsi/scsi_lib.c +++ b/drivers/scsi/scsi_lib.c @@ -604,12 +604,6 @@ static bool scsi_end_request(struct request *req, blk_status_t error, */ scsi_mq_uninit_cmd(cmd); - /* - * queue is still alive, so grab the ref for preventing it - * from being cleaned up during running queue. - */ - percpu_ref_get(&q->q_usage_counter); - __blk_mq_end_request(req, error); if (scsi_target(sdev)->single_lun || @@ -618,7 +612,6 @@ static bool scsi_end_request(struct request *req, blk_status_t error, else blk_mq_run_hw_queues(q, true); - percpu_ref_put(&q->q_usage_counter); put_device(&sdev->sdev_gendev); return false; }