From patchwork Sun Apr 25 08:57:46 2021
X-Patchwork-Submitter: Ming Lei
X-Patchwork-Id: 12223097
X-Mailing-List: linux-scsi@vger.kernel.org
From: Ming Lei
To: linux-nvme@lists.infradead.org, linux-scsi@vger.kernel.org, Jens Axboe,
    linux-block@vger.kernel.org, "Martin K. Petersen", Christoph Hellwig
Cc: Bart Van Assche, Khazhy Kumykov, Shin'ichiro Kawasaki, Hannes Reinecke,
    John Garry, David Jeffery, Ming Lei
Subject: [PATCH 1/8] Revert "blk-mq: Fix races between
 blk_mq_update_nr_hw_queues() and iterating over tags"
Date: Sun, 25 Apr 2021 16:57:46 +0800
Message-Id: <20210425085753.2617424-2-ming.lei@redhat.com>
In-Reply-To: <20210425085753.2617424-1-ming.lei@redhat.com>
References: <20210425085753.2617424-1-ming.lei@redhat.com>

This reverts commit ac81d1ffd022b432d24fe79adf2d31f81a4acdc3.

Signed-off-by: Ming Lei
---
 block/blk-mq-tag.c | 39 ---------------------------------------
 1 file changed, 39 deletions(-)

diff --git a/block/blk-mq-tag.c b/block/blk-mq-tag.c
index b0e0f074a864..39d5c9190a6b 100644
--- a/block/blk-mq-tag.c
+++ b/block/blk-mq-tag.c
@@ -376,31 +376,6 @@ void blk_mq_all_tag_iter_atomic(struct blk_mq_tags *tags, busy_tag_iter_fn *fn,
 	__blk_mq_all_tag_iter(tags, fn, priv, BT_TAG_ITER_STATIC_RQS);
 }
 
-/*
- * Iterate over all request queues in a tag set, find the first queue with a
- * non-zero usage count, increment its usage count and return the pointer to
- * that request queue. This prevents that blk_mq_update_nr_hw_queues() will
- * modify @set->nr_hw_queues while iterating over tags since
- * blk_mq_update_nr_hw_queues() only modifies @set->nr_hw_queues while the
- * usage count of all queues associated with a tag set is zero.
- */
-static struct request_queue *
-blk_mq_get_any_tagset_queue(struct blk_mq_tag_set *set)
-{
-	struct request_queue *q;
-
-	rcu_read_lock();
-	list_for_each_entry_rcu(q, &set->tag_list, tag_set_list) {
-		if (percpu_ref_tryget(&q->q_usage_counter)) {
-			rcu_read_unlock();
-			return q;
-		}
-	}
-	rcu_read_unlock();
-
-	return NULL;
-}
-
 /**
  * blk_mq_tagset_busy_iter - iterate over all started requests in a tag set
  * @tagset:	Tag set to iterate over.
@@ -416,22 +391,15 @@ blk_mq_get_any_tagset_queue(struct blk_mq_tag_set *set)
 void blk_mq_tagset_busy_iter(struct blk_mq_tag_set *tagset,
 		busy_tag_iter_fn *fn, void *priv)
 {
-	struct request_queue *q;
 	int i;
 
 	might_sleep();
 
-	q = blk_mq_get_any_tagset_queue(tagset);
-	if (!q)
-		return;
-
 	for (i = 0; i < tagset->nr_hw_queues; i++) {
 		if (tagset->tags && tagset->tags[i])
 			__blk_mq_all_tag_iter(tagset->tags[i], fn, priv,
 				BT_TAG_ITER_STARTED | BT_TAG_ITER_MAY_SLEEP);
 	}
-
-	blk_queue_exit(q);
 }
 EXPORT_SYMBOL(blk_mq_tagset_busy_iter);
 
@@ -450,20 +418,13 @@ EXPORT_SYMBOL(blk_mq_tagset_busy_iter);
 void blk_mq_tagset_busy_iter_atomic(struct blk_mq_tag_set *tagset,
 		busy_tag_iter_fn *fn, void *priv)
 {
-	struct request_queue *q;
 	int i;
 
-	q = blk_mq_get_any_tagset_queue(tagset);
-	if (!q)
-		return;
-
 	for (i = 0; i < tagset->nr_hw_queues; i++) {
 		if (tagset->tags && tagset->tags[i])
 			__blk_mq_all_tag_iter(tagset->tags[i], fn, priv,
 					      BT_TAG_ITER_STARTED);
 	}
-
-	blk_queue_exit(q);
 }
 EXPORT_SYMBOL(blk_mq_tagset_busy_iter_atomic);

From patchwork Sun Apr 25 08:57:47 2021
X-Patchwork-Submitter: Ming Lei
X-Patchwork-Id: 12223099
X-Mailing-List: linux-scsi@vger.kernel.org
From: Ming Lei
To: linux-nvme@lists.infradead.org, linux-scsi@vger.kernel.org, Jens Axboe,
    linux-block@vger.kernel.org, "Martin K. Petersen", Christoph Hellwig
Cc: Bart Van Assche, Khazhy Kumykov, Shin'ichiro Kawasaki, Hannes Reinecke,
    John Garry, David Jeffery, Ming Lei
Subject: [PATCH 2/8] Revert "blk-mq: Make it safe to use RCU to iterate
 over blk_mq_tag_set.tag_list"
Date: Sun, 25 Apr 2021 16:57:47 +0800
Message-Id: <20210425085753.2617424-3-ming.lei@redhat.com>
In-Reply-To: <20210425085753.2617424-1-ming.lei@redhat.com>
References: <20210425085753.2617424-1-ming.lei@redhat.com>

This reverts commit a8a6ac7ad3fb6b84b933ca1ea5110998bdaeee17.

Signed-off-by: Ming Lei
---
 block/blk-mq.c | 8 ++------
 1 file changed, 2 insertions(+), 6 deletions(-)

diff --git a/block/blk-mq.c b/block/blk-mq.c
index 7d2ea6357c7d..8b59f6b4ec8e 100644
--- a/block/blk-mq.c
+++ b/block/blk-mq.c
@@ -2947,7 +2947,7 @@ static void blk_mq_del_queue_tag_set(struct request_queue *q)
 	struct blk_mq_tag_set *set = q->tag_set;
 
 	mutex_lock(&set->tag_list_lock);
-	list_del_rcu(&q->tag_set_list);
+	list_del(&q->tag_set_list);
 	if (list_is_singular(&set->tag_list)) {
 		/* just transitioned to unshared */
 		set->flags &= ~BLK_MQ_F_TAG_QUEUE_SHARED;
@@ -2955,11 +2955,7 @@ static void blk_mq_del_queue_tag_set(struct request_queue *q)
 		blk_mq_update_tag_set_shared(set, false);
 	}
 	mutex_unlock(&set->tag_list_lock);
-	/*
-	 * Calling synchronize_rcu() and INIT_LIST_HEAD(&q->tag_set_list) is
-	 * not necessary since blk_mq_del_queue_tag_set() is only called from
-	 * blk_cleanup_queue().
-	 */
+	INIT_LIST_HEAD(&q->tag_set_list);
 }
 
 static void blk_mq_add_queue_tag_set(struct blk_mq_tag_set *set,

From patchwork Sun Apr 25 08:57:48 2021
X-Patchwork-Submitter: Ming Lei
X-Patchwork-Id: 12223101
X-Mailing-List: linux-scsi@vger.kernel.org
From: Ming Lei
To: linux-nvme@lists.infradead.org, linux-scsi@vger.kernel.org, Jens Axboe,
    linux-block@vger.kernel.org, "Martin K. Petersen", Christoph Hellwig
Cc: Bart Van Assche, Khazhy Kumykov, Shin'ichiro Kawasaki, Hannes Reinecke,
    John Garry, David Jeffery, Ming Lei
Subject: [PATCH 3/8] Revert "blk-mq: Fix races between iterating over
 requests and freeing requests"
Date: Sun, 25 Apr 2021 16:57:48 +0800
Message-Id: <20210425085753.2617424-4-ming.lei@redhat.com>
In-Reply-To: <20210425085753.2617424-1-ming.lei@redhat.com>
References: <20210425085753.2617424-1-ming.lei@redhat.com>

This reverts commit 5ba3f5a6ca7ee2dffcae7fab25a1a1053e3264cb.

Signed-off-by: Ming Lei
---
 block/blk-core.c   | 34 +-------------------------------
 block/blk-mq-tag.c | 51 ++++++----------------------------------------
 block/blk-mq-tag.h |  4 +---
 block/blk-mq.c     | 21 ++++-----------------
 block/blk-mq.h     |  1 -
 block/blk.h        |  2 --
 block/elevator.c   |  1 -
 7 files changed, 12 insertions(+), 102 deletions(-)

diff --git a/block/blk-core.c b/block/blk-core.c
index ca7f833e25a8..9bcdae93f6d4 100644
--- a/block/blk-core.c
+++ b/block/blk-core.c
@@ -279,36 +279,6 @@ void blk_dump_rq_flags(struct request *rq, char *msg)
 }
 EXPORT_SYMBOL(blk_dump_rq_flags);
 
-/**
- * blk_mq_wait_for_tag_iter - wait for preexisting tag iteration functions to finish
- * @set: Tag set to wait on.
- *
- * Waits for preexisting calls of blk_mq_all_tag_iter(),
- * blk_mq_tagset_busy_iter() and also for their atomic variants to finish
- * accessing hctx->tags->rqs[]. New readers may start while this function is
- * in progress or after this function has returned. Use this function to make
- * sure that hctx->tags->rqs[] changes have become globally visible.
- *
- * Waits for preexisting blk_mq_queue_tag_busy_iter(q, fn, priv) calls to
- * finish accessing requests associated with other request queues than 'q'.
- */
-void blk_mq_wait_for_tag_iter(struct blk_mq_tag_set *set)
-{
-	struct blk_mq_tags *tags;
-	int i;
-
-	if (set->tags) {
-		for (i = 0; i < set->nr_hw_queues; i++) {
-			tags = set->tags[i];
-			if (!tags)
-				continue;
-			down_write(&tags->iter_rwsem);
-			up_write(&tags->iter_rwsem);
-		}
-	}
-	synchronize_rcu();
-}
-
 /**
  * blk_sync_queue - cancel any pending callbacks on a queue
  * @q: the queue
@@ -442,10 +412,8 @@ void blk_cleanup_queue(struct request_queue *q)
 	 * it is safe to free requests now.
 	 */
 	mutex_lock(&q->sysfs_lock);
-	if (q->elevator) {
-		blk_mq_wait_for_tag_iter(q->tag_set);
+	if (q->elevator)
 		blk_mq_sched_free_requests(q);
-	}
 	mutex_unlock(&q->sysfs_lock);
 
 	percpu_ref_exit(&q->q_usage_counter);
diff --git a/block/blk-mq-tag.c b/block/blk-mq-tag.c
index 39d5c9190a6b..d8eaa38a1bd1 100644
--- a/block/blk-mq-tag.c
+++ b/block/blk-mq-tag.c
@@ -209,24 +209,14 @@ static bool bt_iter(struct sbitmap *bitmap, unsigned int bitnr, void *data)
 	if (!reserved)
 		bitnr += tags->nr_reserved_tags;
-	rcu_read_lock();
-	/*
-	 * The request 'rq' points at is protected by an RCU read lock until
-	 * its queue pointer has been verified and by q_usage_count while the
-	 * callback function is being invoked. See also the
-	 * percpu_ref_tryget() and blk_queue_exit() calls in
-	 * blk_mq_queue_tag_busy_iter().
-	 */
-	rq = rcu_dereference(tags->rqs[bitnr]);
+	rq = tags->rqs[bitnr];
+
 	/*
 	 * We can hit rq == NULL here, because the tagging functions
 	 * test and set the bit before assigning ->rqs[].
 	 */
-	if (rq && rq->q == hctx->queue && rq->mq_hctx == hctx) {
-		rcu_read_unlock();
+	if (rq && rq->q == hctx->queue && rq->mq_hctx == hctx)
 		return iter_data->fn(hctx, rq, iter_data->data, reserved);
-	}
-	rcu_read_unlock();
 	return true;
 }
 
@@ -264,17 +254,11 @@ struct bt_tags_iter_data {
 	unsigned int flags;
 };
 
-/* Include reserved tags. */
 #define BT_TAG_ITER_RESERVED	(1 << 0)
-/* Only include started requests. */
 #define BT_TAG_ITER_STARTED	(1 << 1)
-/* Iterate over tags->static_rqs[] instead of tags->rqs[]. */
 #define BT_TAG_ITER_STATIC_RQS	(1 << 2)
-/* The callback function may sleep. */
-#define BT_TAG_ITER_MAY_SLEEP	(1 << 3)
 
-static bool __bt_tags_iter(struct sbitmap *bitmap, unsigned int bitnr,
-			   void *data)
+static bool bt_tags_iter(struct sbitmap *bitmap, unsigned int bitnr, void *data)
 {
 	struct bt_tags_iter_data *iter_data = data;
 	struct blk_mq_tags *tags = iter_data->tags;
@@ -291,8 +275,7 @@ static bool __bt_tags_iter(struct sbitmap *bitmap, unsigned int bitnr,
 	if (iter_data->flags & BT_TAG_ITER_STATIC_RQS)
 		rq = tags->static_rqs[bitnr];
 	else
-		rq = rcu_dereference_check(tags->rqs[bitnr],
-					   lockdep_is_held(&tags->iter_rwsem));
+		rq = tags->rqs[bitnr];
 	if (!rq)
 		return true;
 	if ((iter_data->flags & BT_TAG_ITER_STARTED) &&
@@ -301,25 +284,6 @@ static bool __bt_tags_iter(struct sbitmap *bitmap, unsigned int bitnr,
 	return iter_data->fn(rq, iter_data->data, reserved);
 }
 
-static bool bt_tags_iter(struct sbitmap *bitmap, unsigned int bitnr, void *data)
-{
-	struct bt_tags_iter_data *iter_data = data;
-	struct blk_mq_tags *tags = iter_data->tags;
-	bool res;
-
-	if (iter_data->flags & BT_TAG_ITER_MAY_SLEEP) {
-		down_read(&tags->iter_rwsem);
-		res = __bt_tags_iter(bitmap, bitnr, data);
-		up_read(&tags->iter_rwsem);
-	} else {
-		rcu_read_lock();
-		res = __bt_tags_iter(bitmap, bitnr, data);
-		rcu_read_unlock();
-	}
-
-	return res;
-}
-
 /**
  * bt_tags_for_each - iterate over the requests in a tag map
  * @tags:	Tag map to iterate over.
@@ -393,12 +357,10 @@ void blk_mq_tagset_busy_iter(struct blk_mq_tag_set *tagset,
 {
 	int i;
 
-	might_sleep();
-
 	for (i = 0; i < tagset->nr_hw_queues; i++) {
 		if (tagset->tags && tagset->tags[i])
 			__blk_mq_all_tag_iter(tagset->tags[i], fn, priv,
-				BT_TAG_ITER_STARTED | BT_TAG_ITER_MAY_SLEEP);
+					      BT_TAG_ITER_STARTED);
 	}
 }
 EXPORT_SYMBOL(blk_mq_tagset_busy_iter);
@@ -582,7 +544,6 @@ struct blk_mq_tags *blk_mq_init_tags(unsigned int total_tags,
 	tags->nr_tags = total_tags;
 	tags->nr_reserved_tags = reserved_tags;
-	init_rwsem(&tags->iter_rwsem);
 
 	if (blk_mq_is_sbitmap_shared(flags))
 		return tags;
diff --git a/block/blk-mq-tag.h b/block/blk-mq-tag.h
index d1d73d7cc7df..0290c308ece9 100644
--- a/block/blk-mq-tag.h
+++ b/block/blk-mq-tag.h
@@ -17,11 +17,9 @@ struct blk_mq_tags {
 	struct sbitmap_queue __bitmap_tags;
 	struct sbitmap_queue __breserved_tags;
 
-	struct request __rcu **rqs;
+	struct request **rqs;
 	struct request **static_rqs;
 	struct list_head page_list;
-
-	struct rw_semaphore iter_rwsem;
 };
 
 extern struct blk_mq_tags *blk_mq_init_tags(unsigned int nr_tags,
diff --git a/block/blk-mq.c b/block/blk-mq.c
index 8b59f6b4ec8e..79c01b1f885c 100644
--- a/block/blk-mq.c
+++ b/block/blk-mq.c
@@ -496,10 +496,8 @@ static void __blk_mq_free_request(struct request *rq)
 	blk_crypto_free_request(rq);
 	blk_pm_mark_last_busy(rq);
 	rq->mq_hctx = NULL;
-	if (rq->tag != BLK_MQ_NO_TAG) {
+	if (rq->tag != BLK_MQ_NO_TAG)
 		blk_mq_put_tag(hctx->tags, ctx, rq->tag);
-		rcu_assign_pointer(hctx->tags->rqs[rq->tag], NULL);
-	}
 	if (sched_tag != BLK_MQ_NO_TAG)
 		blk_mq_put_tag(hctx->sched_tags, ctx, sched_tag);
 	blk_mq_sched_restart(hctx);
@@ -841,20 +839,9 @@ EXPORT_SYMBOL(blk_mq_delay_kick_requeue_list);
 struct request *blk_mq_tag_to_rq(struct blk_mq_tags *tags, unsigned int tag)
 {
-	struct request *rq;
-
 	if (tag < tags->nr_tags) {
-		/*
-		 * Freeing tags happens with the request queue frozen so the
-		 * rcu dereference below is protected by the request queue
-		 * usage count. We can only verify that usage count after
-		 * having read the request pointer.
-		 */
-		rq = rcu_dereference_check(tags->rqs[tag], true);
-		WARN_ON_ONCE(IS_ENABLED(CONFIG_PROVE_RCU) && rq &&
-			     percpu_ref_is_zero(&rq->q->q_usage_counter));
-		prefetch(rq);
-		return rq;
+		prefetch(tags->rqs[tag]);
+		return tags->rqs[tag];
 	}
 
 	return NULL;
@@ -1125,7 +1112,7 @@ static bool blk_mq_get_driver_tag(struct request *rq)
 		rq->rq_flags |= RQF_MQ_INFLIGHT;
 		__blk_mq_inc_active_requests(hctx);
 	}
-	rcu_assign_pointer(hctx->tags->rqs[rq->tag], rq);
+	hctx->tags->rqs[rq->tag] = rq;
 	return true;
 }
diff --git a/block/blk-mq.h b/block/blk-mq.h
index 9ccb1818303b..3616453ca28c 100644
--- a/block/blk-mq.h
+++ b/block/blk-mq.h
@@ -226,7 +226,6 @@ static inline void __blk_mq_put_driver_tag(struct blk_mq_hw_ctx *hctx,
 					   struct request *rq)
 {
 	blk_mq_put_tag(hctx->tags, rq->mq_ctx, rq->tag);
-	rcu_assign_pointer(hctx->tags->rqs[rq->tag], NULL);
 	rq->tag = BLK_MQ_NO_TAG;
 
 	if (rq->rq_flags & RQF_MQ_INFLIGHT) {
diff --git a/block/blk.h b/block/blk.h
index d88b0823738c..529233957207 100644
--- a/block/blk.h
+++ b/block/blk.h
@@ -185,8 +185,6 @@ bool blk_bio_list_merge(struct request_queue *q, struct list_head *list,
 void blk_account_io_start(struct request *req);
 void blk_account_io_done(struct request *req, u64 now);
 
-void blk_mq_wait_for_tag_iter(struct blk_mq_tag_set *set);
-
 /*
  * Internal elevator interface
  */
diff --git a/block/elevator.c b/block/elevator.c
index aae9cff6d5ae..7c486ce858e0 100644
--- a/block/elevator.c
+++ b/block/elevator.c
@@ -201,7 +201,6 @@ static void elevator_exit(struct request_queue *q, struct elevator_queue *e)
 {
 	lockdep_assert_held(&q->sysfs_lock);
 
-	blk_mq_wait_for_tag_iter(q->tag_set);
 	blk_mq_sched_free_requests(q);
 	__elevator_exit(q, e);
 }

From patchwork Sun Apr 25 08:57:49 2021
X-Patchwork-Submitter: Ming Lei
X-Patchwork-Id: 12223103
X-Mailing-List: linux-scsi@vger.kernel.org
From: Ming Lei
To: linux-nvme@lists.infradead.org, linux-scsi@vger.kernel.org, Jens Axboe,
    linux-block@vger.kernel.org, "Martin K. Petersen", Christoph Hellwig
Cc: Bart Van Assche, Khazhy Kumykov, Shin'ichiro Kawasaki, Hannes Reinecke,
    John Garry, David Jeffery, Ming Lei
Subject: [PATCH 4/8] Revert "blk-mq: Introduce atomic variants of
 blk_mq_(all_tag|tagset_busy)_iter"
Date: Sun, 25 Apr 2021 16:57:49 +0800
Message-Id: <20210425085753.2617424-5-ming.lei@redhat.com>
In-Reply-To: <20210425085753.2617424-1-ming.lei@redhat.com>
References: <20210425085753.2617424-1-ming.lei@redhat.com>

This reverts commit 5d39098af969f222253036b1b2e7ffc57c734570.

Signed-off-by: Ming Lei
---
 block/blk-mq-tag.c        | 38 +++++---------------------------------
 block/blk-mq-tag.h        |  2 +-
 block/blk-mq.c            |  2 +-
 drivers/scsi/hosts.c      | 16 ++++++++--------
 drivers/scsi/ufs/ufshcd.c |  4 ++--
 include/linux/blk-mq.h    |  2 --
 6 files changed, 17 insertions(+), 47 deletions(-)

diff --git a/block/blk-mq-tag.c b/block/blk-mq-tag.c
index d8eaa38a1bd1..2a37731e8244 100644
--- a/block/blk-mq-tag.c
+++ b/block/blk-mq-tag.c
@@ -322,19 +322,18 @@ static void __blk_mq_all_tag_iter(struct blk_mq_tags *tags,
 }
 
 /**
- * blk_mq_all_tag_iter_atomic - iterate over all requests in a tag map
+ * blk_mq_all_tag_iter - iterate over all requests in a tag map
  * @tags:	Tag map to iterate over.
  * @fn:		Pointer to the function that will be called for each
  *		request. @fn will be called as follows: @fn(rq, @priv,
  *		reserved) where rq is a pointer to a request. 'reserved'
  *		indicates whether or not @rq is a reserved request. Return
- *		true to continue iterating tags, false to stop. Must not
- *		sleep.
+ *		true to continue iterating tags, false to stop.
  * @priv:	Will be passed as second argument to @fn.
  *
- * Does not sleep.
+ * Caller has to pass the tag map from which requests are allocated.
  */
-void blk_mq_all_tag_iter_atomic(struct blk_mq_tags *tags, busy_tag_iter_fn *fn,
+void blk_mq_all_tag_iter(struct blk_mq_tags *tags, busy_tag_iter_fn *fn,
 		void *priv)
 {
 	__blk_mq_all_tag_iter(tags, fn, priv, BT_TAG_ITER_STATIC_RQS);
@@ -349,8 +348,6 @@ void blk_mq_all_tag_iter(struct blk_mq_tags *tags, busy_tag_iter_fn *fn,
  *		indicates whether or not @rq is a reserved request. Return
  *		true to continue iterating tags, false to stop.
  * @priv:	Will be passed as second argument to @fn.
- *
- * May sleep.
  */
 void blk_mq_tagset_busy_iter(struct blk_mq_tag_set *tagset,
 		busy_tag_iter_fn *fn, void *priv)
@@ -365,31 +362,6 @@ void blk_mq_tagset_busy_iter(struct blk_mq_tag_set *tagset,
 }
 EXPORT_SYMBOL(blk_mq_tagset_busy_iter);
 
-/**
- * blk_mq_tagset_busy_iter_atomic - iterate over all started requests in a tag set
- * @tagset:	Tag set to iterate over.
- * @fn:		Pointer to the function that will be called for each started
- *		request. @fn will be called as follows: @fn(rq, @priv,
- *		reserved) where rq is a pointer to a request. 'reserved'
- *		indicates whether or not @rq is a reserved request. Return
- *		true to continue iterating tags, false to stop. Must not sleep.
- * @priv:	Will be passed as second argument to @fn.
- *
- * Does not sleep.
- */
-void blk_mq_tagset_busy_iter_atomic(struct blk_mq_tag_set *tagset,
-		busy_tag_iter_fn *fn, void *priv)
-{
-	int i;
-
-	for (i = 0; i < tagset->nr_hw_queues; i++) {
-		if (tagset->tags && tagset->tags[i])
-			__blk_mq_all_tag_iter(tagset->tags[i], fn, priv,
-					      BT_TAG_ITER_STARTED);
-	}
-}
-EXPORT_SYMBOL(blk_mq_tagset_busy_iter_atomic);
-
 static bool blk_mq_tagset_count_completed_rqs(struct request *rq,
 		void *data, bool reserved)
 {
@@ -412,7 +384,7 @@ void blk_mq_tagset_wait_completed_request(struct blk_mq_tag_set *tagset)
 	while (true) {
 		unsigned count = 0;
 
-		blk_mq_tagset_busy_iter_atomic(tagset,
+		blk_mq_tagset_busy_iter(tagset,
 				blk_mq_tagset_count_completed_rqs, &count);
 		if (!count)
 			break;
diff --git a/block/blk-mq-tag.h b/block/blk-mq-tag.h
index 0290c308ece9..7d3e6b333a4a 100644
--- a/block/blk-mq-tag.h
+++ b/block/blk-mq-tag.h
@@ -43,7 +43,7 @@ extern void blk_mq_tag_resize_shared_sbitmap(struct blk_mq_tag_set *set,
 extern void blk_mq_tag_wakeup_all(struct blk_mq_tags *tags, bool);
 void blk_mq_queue_tag_busy_iter(struct request_queue *q, busy_iter_fn *fn,
 		void *priv);
-void blk_mq_all_tag_iter_atomic(struct blk_mq_tags *tags, busy_tag_iter_fn *fn,
+void blk_mq_all_tag_iter(struct blk_mq_tags *tags, busy_tag_iter_fn *fn,
 		void *priv);
 
 static inline struct sbq_wait_state *bt_wait_ptr(struct sbitmap_queue *bt,
diff --git a/block/blk-mq.c b/block/blk-mq.c
index 79c01b1f885c..927189a55575 100644
--- a/block/blk-mq.c
+++ b/block/blk-mq.c
@@ -2484,7 +2484,7 @@ static bool blk_mq_hctx_has_requests(struct blk_mq_hw_ctx *hctx)
 		.hctx	= hctx,
 	};
 
-	blk_mq_all_tag_iter_atomic(tags, blk_mq_has_request, &data);
+	blk_mq_all_tag_iter(tags, blk_mq_has_request, &data);
 	return data.has_rq;
 }
diff --git a/drivers/scsi/hosts.c b/drivers/scsi/hosts.c
index f8863aa88642..697c09ef259b 100644
--- a/drivers/scsi/hosts.c
+++ b/drivers/scsi/hosts.c
@@ -573,8 +573,8 @@ int scsi_host_busy(struct Scsi_Host *shost)
 {
 	int cnt = 0;
 
-	blk_mq_tagset_busy_iter_atomic(&shost->tag_set,
-				       scsi_host_check_in_flight, &cnt);
+	blk_mq_tagset_busy_iter(&shost->tag_set,
+				scsi_host_check_in_flight, &cnt);
 	return cnt;
 }
 EXPORT_SYMBOL(scsi_host_busy);
@@ -672,8 +672,8 @@ static bool complete_all_cmds_iter(struct request *rq, void *data, bool rsvd)
  */
 void scsi_host_complete_all_commands(struct Scsi_Host *shost, int status)
 {
-	blk_mq_tagset_busy_iter_atomic(&shost->tag_set, complete_all_cmds_iter,
-				       &status);
+	blk_mq_tagset_busy_iter(&shost->tag_set, complete_all_cmds_iter,
+				&status);
 }
 EXPORT_SYMBOL_GPL(scsi_host_complete_all_commands);
@@ -694,11 +694,11 @@ static bool __scsi_host_busy_iter_fn(struct request *req, void *priv,
 /**
  * scsi_host_busy_iter - Iterate over all busy commands
  * @shost:	Pointer to Scsi_Host.
- * @fn:		Function to call on each busy command. Must not sleep.
+ * @fn:		Function to call on each busy command
  * @priv:	Data pointer passed to @fn
  *
  * If locking against concurrent command completions is required
- * it has to be provided by the caller.
+ * ithas to be provided by the caller
  **/
 void scsi_host_busy_iter(struct Scsi_Host *shost,
			 bool (*fn)(struct scsi_cmnd *, void *, bool),
@@ -709,7 +709,7 @@ void scsi_host_busy_iter(struct Scsi_Host *shost,
 		.priv = priv,
 	};
 
-	blk_mq_tagset_busy_iter_atomic(&shost->tag_set,
-				       __scsi_host_busy_iter_fn, &iter_data);
+	blk_mq_tagset_busy_iter(&shost->tag_set, __scsi_host_busy_iter_fn,
+				&iter_data);
 }
 EXPORT_SYMBOL_GPL(scsi_host_busy_iter);
diff --git a/drivers/scsi/ufs/ufshcd.c b/drivers/scsi/ufs/ufshcd.c
index 6d2f8f18e2a3..c86760788c72 100644
--- a/drivers/scsi/ufs/ufshcd.c
+++ b/drivers/scsi/ufs/ufshcd.c
@@ -1380,7 +1380,7 @@ static bool ufshcd_any_tag_in_use(struct ufs_hba *hba)
 	struct request_queue *q = hba->cmd_queue;
 	int busy = 0;
 
-	blk_mq_tagset_busy_iter_atomic(q->tag_set, ufshcd_is_busy, &busy);
+	blk_mq_tagset_busy_iter(q->tag_set, ufshcd_is_busy, &busy);
 	return busy;
 }
@@ -6269,7 +6269,7 @@ static irqreturn_t ufshcd_tmc_handler(struct ufs_hba *hba)
 		.pending = ufshcd_readl(hba, REG_UTP_TASK_REQ_DOOR_BELL),
 	};
 
-	blk_mq_tagset_busy_iter_atomic(q->tag_set, ufshcd_compl_tm, &ci);
+	blk_mq_tagset_busy_iter(q->tag_set, ufshcd_compl_tm, &ci);
 	return ci.ncpl ? IRQ_HANDLED : IRQ_NONE;
 }
diff --git a/include/linux/blk-mq.h b/include/linux/blk-mq.h
index dfa0114a49fd..2c473c9b8990 100644
--- a/include/linux/blk-mq.h
+++ b/include/linux/blk-mq.h
@@ -526,8 +526,6 @@ void blk_mq_run_hw_queues(struct request_queue *q, bool async);
 void blk_mq_delay_run_hw_queues(struct request_queue *q, unsigned long msecs);
 void blk_mq_tagset_busy_iter(struct blk_mq_tag_set *tagset,
 		busy_tag_iter_fn *fn, void *priv);
-void blk_mq_tagset_busy_iter_atomic(struct blk_mq_tag_set *tagset,
-		busy_tag_iter_fn *fn, void *priv);
 void blk_mq_tagset_wait_completed_request(struct blk_mq_tag_set *tagset);
 void blk_mq_freeze_queue(struct request_queue *q);
 void blk_mq_unfreeze_queue(struct request_queue *q);

From patchwork Sun Apr 25 08:57:50 2021
X-Patchwork-Submitter: Ming Lei
X-Patchwork-Id: 12223105
us-smtp-delivery-124.mimecast.com ([170.10.133.124]:43937 "EHLO us-smtp-delivery-124.mimecast.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S229762AbhDYI7M (ORCPT ); Sun, 25 Apr 2021 04:59:12 -0400 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1619341112; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=nlKUIawhhdfke5e91nvF9IFytrIJPg/rQRLeCbz+53s=; b=HGi34WtQk+Nty+oQyu2EMrY2JpOV6wYsnTNl/I+6GSI3pXSjNlcdbSuli+MZVU1bfGAMeT sNTrJhcGiH/zcEm9fp5LLyGD8UPekGYpH9sgz03sjlyynoz+jPng4I1sOjUqLlR2aUunPR v2zdjsFNU273JZYMZYbCQqq7VLrNSBQ= Received: from mimecast-mx01.redhat.com (mimecast-mx01.redhat.com [209.132.183.4]) (Using TLS) by relay.mimecast.com with ESMTP id us-mta-330-liBRZltwP2y6AS6-HBKHbA-1; Sun, 25 Apr 2021 04:58:31 -0400 X-MC-Unique: liBRZltwP2y6AS6-HBKHbA-1 Received: from smtp.corp.redhat.com (int-mx01.intmail.prod.int.phx2.redhat.com [10.5.11.11]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mimecast-mx01.redhat.com (Postfix) with ESMTPS id 7F23F1898297; Sun, 25 Apr 2021 08:58:29 +0000 (UTC) Received: from localhost (ovpn-13-143.pek2.redhat.com [10.72.13.143]) by smtp.corp.redhat.com (Postfix) with ESMTP id CCA7F690F5; Sun, 25 Apr 2021 08:58:28 +0000 (UTC) From: Ming Lei To: linux-nvme@lists.infradead.org, linux-scsi@vger.kernel.org, Jens Axboe , linux-block@vger.kernel.org, "Martin K . 
Petersen" , Christoph Hellwig Cc: Bart Van Assche , Khazhy Kumykov , Shin'ichiro Kawasaki , Hannes Reinecke , John Garry , David Jeffery , Ming Lei Subject: [PATCH 5/8] blk-mq: blk_mq_complete_request_locally Date: Sun, 25 Apr 2021 16:57:50 +0800 Message-Id: <20210425085753.2617424-6-ming.lei@redhat.com> In-Reply-To: <20210425085753.2617424-1-ming.lei@redhat.com> References: <20210425085753.2617424-1-ming.lei@redhat.com> MIME-Version: 1.0 X-Scanned-By: MIMEDefang 2.79 on 10.5.11.11 Precedence: bulk List-ID: X-Mailing-List: linux-scsi@vger.kernel.org Add blk_mq_complete_request_locally() for completing request via blk_mq_tagset_busy_iter(), so that we can avoid request UAF related with queue releasing, or request freeing. Signed-off-by: Ming Lei --- block/blk-mq.c | 16 ++++++++++++++++ include/linux/blk-mq.h | 1 + 2 files changed, 17 insertions(+) diff --git a/block/blk-mq.c b/block/blk-mq.c index 927189a55575..e3d1067b10c3 100644 --- a/block/blk-mq.c +++ b/block/blk-mq.c @@ -681,6 +681,22 @@ void blk_mq_complete_request(struct request *rq) } EXPORT_SYMBOL(blk_mq_complete_request); +/** + * blk_mq_complete_request_locally - end I/O on a request locally + * @rq: the request being processed + * + * Description: + * Complete a request by calling the ->complete_rq directly, + * and it is usually used in error handling via + * blk_mq_tagset_busy_iter(). 
+ **/ +void blk_mq_complete_request_locally(struct request *rq) +{ + WRITE_ONCE(rq->state, MQ_RQ_COMPLETE); + rq->q->mq_ops->complete(rq); +} +EXPORT_SYMBOL(blk_mq_complete_request_locally); + static void hctx_unlock(struct blk_mq_hw_ctx *hctx, int srcu_idx) __releases(hctx->srcu) { diff --git a/include/linux/blk-mq.h b/include/linux/blk-mq.h index 2c473c9b8990..f630bf9e497e 100644 --- a/include/linux/blk-mq.h +++ b/include/linux/blk-mq.h @@ -511,6 +511,7 @@ void blk_mq_kick_requeue_list(struct request_queue *q); void blk_mq_delay_kick_requeue_list(struct request_queue *q, unsigned long msecs); void blk_mq_complete_request(struct request *rq); bool blk_mq_complete_request_remote(struct request *rq); +void blk_mq_complete_request_locally(struct request *rq); bool blk_mq_queue_stopped(struct request_queue *q); void blk_mq_stop_hw_queue(struct blk_mq_hw_ctx *hctx); void blk_mq_start_hw_queue(struct blk_mq_hw_ctx *hctx);

From patchwork Sun Apr 25 08:57:51 2021 X-Patchwork-Submitter: Ming Lei X-Patchwork-Id: 12223107 From: Ming Lei To: linux-nvme@lists.infradead.org, linux-scsi@vger.kernel.org, Jens Axboe , linux-block@vger.kernel.org, "Martin K . Petersen" , Christoph Hellwig Cc: Bart Van Assche , Khazhy Kumykov , Shin'ichiro Kawasaki , Hannes Reinecke , John Garry , David Jeffery , Ming Lei Subject: [PATCH 6/8] block: drivers: complete request locally from blk_mq_tagset_busy_iter Date: Sun, 25 Apr 2021 16:57:51 +0800 Message-Id: <20210425085753.2617424-7-ming.lei@redhat.com> In-Reply-To: <20210425085753.2617424-1-ming.lei@redhat.com> References: <20210425085753.2617424-1-ming.lei@redhat.com>

It can be a bit hard for drivers to avoid request UAF between normal completion and completion via blk_mq_tagset_busy_iter() if async completion is done in blk_mq_tagset_busy_iter(). Because request->tag is only freed after .mq_ops->complete() is called, request->tag may still be valid after blk_mq_complete_request() returns from the normal completion path, so this request is still visible in blk_mq_tagset_busy_iter().

This patch itself can't avoid such request UAF completely. We will grab a request reference in the next patch when walking requests via blk_mq_tagset_busy_iter() for fixing such races, which is why we have to convert to blk_mq_complete_request_locally() first.
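To see why completing locally closes the window, here is a minimal userspace sketch of the idea (every type and name below is an illustrative stand-in, not the kernel API): a "local" completion marks the request complete and runs the driver's completion callback synchronously in the iterating context, so the completion has fully finished before blk_mq_tagset_busy_iter() advances to the next tag, instead of being deferred to another CPU where it could race with request freeing.

```c
#include <stdatomic.h>

/* Illustrative userspace stand-ins; only the control flow mirrors
 * blk_mq_complete_request_locally() from this series. */
enum mq_rq_state { MQ_RQ_IDLE, MQ_RQ_IN_FLIGHT, MQ_RQ_COMPLETE };

struct request {
	atomic_int state;
	void (*complete)(struct request *rq); /* driver completion hook */
	int completed;                        /* set by the callback */
};

/* A made-up driver callback standing in for the driver's ->complete(). */
static void drv_complete(struct request *rq)
{
	rq->completed = 1;
}

/* Mark the request complete and call the driver hook directly in the
 * caller's context -- no IPI or softirq deferral -- so when this
 * function returns, the completion has already run to the end. */
static void complete_request_locally(struct request *rq)
{
	atomic_store(&rq->state, MQ_RQ_COMPLETE);
	rq->complete(rq);
}
```

The scsi_mq_done() hunk in this patch applies the same split: error completions, which the error handler may also reach through the tag iterator, take the synchronous path, while the fast path keeps the normal (possibly remote) completion.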
Signed-off-by: Ming Lei --- drivers/block/mtip32xx/mtip32xx.c | 2 +- drivers/block/nbd.c | 2 +- drivers/nvme/host/core.c | 2 +- drivers/scsi/scsi_lib.c | 6 +++++- 4 files changed, 8 insertions(+), 4 deletions(-) diff --git a/drivers/block/mtip32xx/mtip32xx.c b/drivers/block/mtip32xx/mtip32xx.c index 3be0dbc674bd..05f5e36ee608 100644 --- a/drivers/block/mtip32xx/mtip32xx.c +++ b/drivers/block/mtip32xx/mtip32xx.c @@ -3748,7 +3748,7 @@ static bool mtip_no_dev_cleanup(struct request *rq, void *data, bool reserv) struct mtip_cmd *cmd = blk_mq_rq_to_pdu(rq); cmd->status = BLK_STS_IOERR; - blk_mq_complete_request(rq); + blk_mq_complete_request_locally(rq); return true; } diff --git a/drivers/block/nbd.c b/drivers/block/nbd.c index 4ff71b579cfc..3dcf3288efa8 100644 --- a/drivers/block/nbd.c +++ b/drivers/block/nbd.c @@ -809,7 +809,7 @@ static bool nbd_clear_req(struct request *req, void *data, bool reserved) cmd->status = BLK_STS_IOERR; mutex_unlock(&cmd->lock); - blk_mq_complete_request(req); + blk_mq_complete_request_locally(req); return true; } diff --git a/drivers/nvme/host/core.c b/drivers/nvme/host/core.c index 0896e21642be..a605954477da 100644 --- a/drivers/nvme/host/core.c +++ b/drivers/nvme/host/core.c @@ -381,7 +381,7 @@ bool nvme_cancel_request(struct request *req, void *data, bool reserved) nvme_req(req)->status = NVME_SC_HOST_ABORTED_CMD; nvme_req(req)->flags |= NVME_REQ_CANCELLED; - blk_mq_complete_request(req); + blk_mq_complete_request_locally(req); return true; } EXPORT_SYMBOL_GPL(nvme_cancel_request); diff --git a/drivers/scsi/scsi_lib.c b/drivers/scsi/scsi_lib.c index c289991ffaed..7cbaee282b6d 100644 --- a/drivers/scsi/scsi_lib.c +++ b/drivers/scsi/scsi_lib.c @@ -1568,7 +1568,11 @@ static void scsi_mq_done(struct scsi_cmnd *cmd) if (unlikely(test_and_set_bit(SCMD_STATE_COMPLETE, &cmd->state))) return; trace_scsi_dispatch_cmd_done(cmd); - blk_mq_complete_request(cmd->request); + + if (unlikely(host_byte(cmd->result) != DID_OK)) + 
blk_mq_complete_request_locally(cmd->request); + else + blk_mq_complete_request(cmd->request); } static void scsi_mq_put_budget(struct request_queue *q)

From patchwork Sun Apr 25 08:57:52 2021 X-Patchwork-Submitter: Ming Lei X-Patchwork-Id: 12223109 From: Ming Lei To: linux-nvme@lists.infradead.org, linux-scsi@vger.kernel.org, Jens Axboe , linux-block@vger.kernel.org, "Martin K . Petersen" , Christoph Hellwig Cc: Bart Van Assche , Khazhy Kumykov , Shin'ichiro Kawasaki , Hannes Reinecke , John Garry , David Jeffery , Ming Lei Subject: [PATCH 7/8] blk-mq: grab rq->refcount before calling ->fn in blk_mq_tagset_busy_iter Date: Sun, 25 Apr 2021 16:57:52 +0800 Message-Id: <20210425085753.2617424-8-ming.lei@redhat.com> In-Reply-To: <20210425085753.2617424-1-ming.lei@redhat.com> References: <20210425085753.2617424-1-ming.lei@redhat.com>

Grab rq->refcount before calling ->fn in blk_mq_tagset_busy_iter(); this prevents the request from being re-used while ->fn is running. The approach is the same as what we do when handling timeouts.

Fix request UAF related to completion races or queue releasing:
- If one rq is referenced before rq->q is frozen, then the queue won't be frozen before the request is released during iteration.
- If one rq is referenced after rq->q is frozen, refcount_inc_not_zero() will return false, and we won't iterate over this request.
However, one request UAF is still not covered: refcount_inc_not_zero() may read one freed request; that will be handled in the next patch.

Signed-off-by: Ming Lei --- block/blk-mq-tag.c | 14 +++++++++++--- block/blk-mq.c | 14 +++++++++----- block/blk-mq.h | 1 + 3 files changed, 21 insertions(+), 8 deletions(-) diff --git a/block/blk-mq-tag.c b/block/blk-mq-tag.c index 2a37731e8244..489d2db89856 100644 --- a/block/blk-mq-tag.c +++ b/block/blk-mq-tag.c @@ -264,6 +264,7 @@ static bool bt_tags_iter(struct sbitmap *bitmap, unsigned int bitnr, void *data) struct blk_mq_tags *tags = iter_data->tags; bool reserved = iter_data->flags & BT_TAG_ITER_RESERVED; struct request *rq; + bool ret; if (!reserved) bitnr += tags->nr_reserved_tags; @@ -276,12 +277,15 @@ static bool bt_tags_iter(struct sbitmap *bitmap, unsigned int bitnr, void *data) rq = tags->static_rqs[bitnr]; else rq = tags->rqs[bitnr]; - if (!rq) + if (!rq || !refcount_inc_not_zero(&rq->ref)) return true; if ((iter_data->flags & BT_TAG_ITER_STARTED) && !blk_mq_request_started(rq)) - return true; - return iter_data->fn(rq, iter_data->data, reserved); + ret = true; + else + ret = iter_data->fn(rq, iter_data->data, reserved); + blk_mq_put_rq_ref(rq); + return ret; } /** @@ -348,6 +352,10 @@ void blk_mq_all_tag_iter(struct blk_mq_tags *tags, busy_tag_iter_fn *fn, * indicates whether or not @rq is a reserved request. Return * true to continue iterating tags, false to stop. * @priv: Will be passed as second argument to @fn. + * + * We grab one request reference before calling @fn and release it after + * @fn returns. So far we don't support passing the request reference to + * a new context in @fn.
*/ void blk_mq_tagset_busy_iter(struct blk_mq_tag_set *tagset, busy_tag_iter_fn *fn, void *priv) diff --git a/block/blk-mq.c b/block/blk-mq.c index e3d1067b10c3..9a4d520740a1 100644 --- a/block/blk-mq.c +++ b/block/blk-mq.c @@ -925,6 +925,14 @@ static bool blk_mq_req_expired(struct request *rq, unsigned long *next) return false; } +void blk_mq_put_rq_ref(struct request *rq) +{ + if (is_flush_rq(rq, rq->mq_hctx)) + rq->end_io(rq, 0); + else if (refcount_dec_and_test(&rq->ref)) + __blk_mq_free_request(rq); +} + static bool blk_mq_check_expired(struct blk_mq_hw_ctx *hctx, struct request *rq, void *priv, bool reserved) { @@ -958,11 +966,7 @@ static bool blk_mq_check_expired(struct blk_mq_hw_ctx *hctx, if (blk_mq_req_expired(rq, next)) blk_mq_rq_timed_out(rq, reserved); - if (is_flush_rq(rq, hctx)) - rq->end_io(rq, 0); - else if (refcount_dec_and_test(&rq->ref)) - __blk_mq_free_request(rq); - + blk_mq_put_rq_ref(rq); return true; } diff --git a/block/blk-mq.h b/block/blk-mq.h index 3616453ca28c..143afe42c63a 100644 --- a/block/blk-mq.h +++ b/block/blk-mq.h @@ -47,6 +47,7 @@ void blk_mq_add_to_requeue_list(struct request *rq, bool at_head, void blk_mq_flush_busy_ctxs(struct blk_mq_hw_ctx *hctx, struct list_head *list); struct request *blk_mq_dequeue_from_ctx(struct blk_mq_hw_ctx *hctx, struct blk_mq_ctx *start); +void blk_mq_put_rq_ref(struct request *rq); /* * Internal helpers for allocating/freeing the request map

From patchwork Sun Apr 25 08:57:53 2021 X-Patchwork-Submitter: Ming Lei X-Patchwork-Id: 12223111 From: Ming Lei To: linux-nvme@lists.infradead.org, linux-scsi@vger.kernel.org, Jens Axboe , linux-block@vger.kernel.org, "Martin K . Petersen" , Christoph Hellwig Cc: Bart Van Assche , Khazhy Kumykov , Shin'ichiro Kawasaki , Hannes Reinecke , John Garry , David Jeffery , Ming Lei Subject: [PATCH 8/8] blk-mq: clear stale request in tags->rqs[] before freeing one request pool Date: Sun, 25 Apr 2021 16:57:53 +0800 Message-Id: <20210425085753.2617424-9-ming.lei@redhat.com> In-Reply-To: <20210425085753.2617424-1-ming.lei@redhat.com> References: <20210425085753.2617424-1-ming.lei@redhat.com>

refcount_inc_not_zero() in bt_tags_iter() may still read one freed request. Fix the issue with the following approach:

1) hold a per-tags spinlock when reading ->rqs[tag] and calling refcount_inc_not_zero() in bt_tags_iter();

2) clear stale requests referred to via ->rqs[tag] before freeing the request pool; the per-tags spinlock is held while clearing the stale ->rqs[tag] entries.

So after we have cleared the stale requests, bt_tags_iter() won't observe a freed request any more, and the clearing will also wait for pending request references.

The idea of clearing ->rqs[] is borrowed from John Garry's previous patch and a recent patch from David Jeffery.
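The interlock above can be sketched in plain userspace C (a hypothetical model, not the kernel code: a pthread mutex stands in for the per-tags spinlock, an atomic int for refcount_t, and iter_get()/clear_mapping() are made-up names). The iterator reads ->rqs[tag] and grabs its reference inside the lock; the pool-freeing path clears the stale pointer under the same lock, so the iterator either pins a still-live request or observes NULL:

```c
#include <pthread.h>
#include <stdatomic.h>
#include <stddef.h>

struct request { atomic_int ref; };

struct tags {
	pthread_mutex_t lock;   /* models the per-tags spinlock */
	struct request *rqs[4]; /* models tags->rqs[] */
};

/* Models refcount_inc_not_zero(): pin the request only if its
 * reference count has not already dropped to zero. */
static int ref_get_not_zero(struct request *rq)
{
	int v = atomic_load(&rq->ref);

	while (v != 0)
		if (atomic_compare_exchange_weak(&rq->ref, &v, v + 1))
			return 1;
	return 0;
}

/* Iterator side (as in bt_tags_iter() after this patch): read
 * ->rqs[tag] and take the reference while holding the lock. */
static struct request *iter_get(struct tags *t, unsigned int tag)
{
	struct request *rq;

	pthread_mutex_lock(&t->lock);
	rq = t->rqs[tag];
	if (rq && !ref_get_not_zero(rq))
		rq = NULL;
	pthread_mutex_unlock(&t->lock);
	return rq;
}

/* Pool-freeing side (as in blk_mq_clear_rq_mapping()): clear the stale
 * pointer under the same lock before the memory is released. */
static void clear_mapping(struct tags *t, unsigned int tag)
{
	pthread_mutex_lock(&t->lock);
	t->rqs[tag] = NULL;
	pthread_mutex_unlock(&t->lock);
}
```

Once clear_mapping() has run for a slot, no later iteration can reach the freed request through it; and because clear_mapping() itself waits on the lock, an iterator that already read the pointer has either pinned the request or given up before the memory goes away.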
Cc: John Garry Cc: David Jeffery Signed-off-by: Ming Lei --- block/blk-mq-tag.c | 9 ++++++++- block/blk-mq-tag.h | 3 +++ block/blk-mq.c | 39 ++++++++++++++++++++++++++++++++++----- 3 files changed, 45 insertions(+), 6 deletions(-) diff --git a/block/blk-mq-tag.c b/block/blk-mq-tag.c index 489d2db89856..402a7860f6c9 100644 --- a/block/blk-mq-tag.c +++ b/block/blk-mq-tag.c @@ -265,10 +265,12 @@ static bool bt_tags_iter(struct sbitmap *bitmap, unsigned int bitnr, void *data) bool reserved = iter_data->flags & BT_TAG_ITER_RESERVED; struct request *rq; bool ret; + unsigned long flags; if (!reserved) bitnr += tags->nr_reserved_tags; + spin_lock_irqsave(&tags->lock, flags); /* * We can hit rq == NULL here, because the tagging functions * test and set the bit before assigning ->rqs[]. @@ -277,8 +279,12 @@ static bool bt_tags_iter(struct sbitmap *bitmap, unsigned int bitnr, void *data) rq = tags->static_rqs[bitnr]; else rq = tags->rqs[bitnr]; - if (!rq || !refcount_inc_not_zero(&rq->ref)) + if (!rq || !refcount_inc_not_zero(&rq->ref)) { + spin_unlock_irqrestore(&tags->lock, flags); return true; + } + spin_unlock_irqrestore(&tags->lock, flags); + if ((iter_data->flags & BT_TAG_ITER_STARTED) && !blk_mq_request_started(rq)) ret = true; @@ -524,6 +530,7 @@ struct blk_mq_tags *blk_mq_init_tags(unsigned int total_tags, tags->nr_tags = total_tags; tags->nr_reserved_tags = reserved_tags; + spin_lock_init(&tags->lock); if (blk_mq_is_sbitmap_shared(flags)) return tags; diff --git a/block/blk-mq-tag.h b/block/blk-mq-tag.h index 7d3e6b333a4a..f942a601b5ef 100644 --- a/block/blk-mq-tag.h +++ b/block/blk-mq-tag.h @@ -20,6 +20,9 @@ struct blk_mq_tags { struct request **rqs; struct request **static_rqs; struct list_head page_list; + + /* used to clear rqs[] before one request pool is freed */ + spinlock_t lock; }; extern struct blk_mq_tags *blk_mq_init_tags(unsigned int nr_tags, diff --git a/block/blk-mq.c b/block/blk-mq.c index 9a4d520740a1..1f913f786c1f 100644 --- a/block/blk-mq.c +++ 
b/block/blk-mq.c @@ -2307,6 +2307,38 @@ blk_qc_t blk_mq_submit_bio(struct bio *bio) return BLK_QC_T_NONE; } +static size_t order_to_size(unsigned int order) +{ + return (size_t)PAGE_SIZE << order; +} + +/* called before freeing request pool in @tags */ +static void blk_mq_clear_rq_mapping(struct blk_mq_tag_set *set, + struct blk_mq_tags *tags, unsigned int hctx_idx) +{ + struct blk_mq_tags *drv_tags = set->tags[hctx_idx]; + struct page *page; + unsigned long flags; + + spin_lock_irqsave(&drv_tags->lock, flags); + list_for_each_entry(page, &tags->page_list, lru) { + unsigned long start = (unsigned long)page_address(page); + unsigned long end = start + order_to_size(page->private); + int i; + + for (i = 0; i < set->queue_depth; i++) { + struct request *rq = drv_tags->rqs[i]; + unsigned long rq_addr = (unsigned long)rq; + + if (rq_addr >= start && rq_addr < end) { + WARN_ON_ONCE(refcount_read(&rq->ref) != 0); + cmpxchg(&drv_tags->rqs[i], rq, NULL); + } + } + } + spin_unlock_irqrestore(&drv_tags->lock, flags); +} + void blk_mq_free_rqs(struct blk_mq_tag_set *set, struct blk_mq_tags *tags, unsigned int hctx_idx) { @@ -2325,6 +2357,8 @@ void blk_mq_free_rqs(struct blk_mq_tag_set *set, struct blk_mq_tags *tags, } } + blk_mq_clear_rq_mapping(set, tags, hctx_idx); + while (!list_empty(&tags->page_list)) { page = list_first_entry(&tags->page_list, struct page, lru); list_del_init(&page->lru); @@ -2384,11 +2418,6 @@ struct blk_mq_tags *blk_mq_alloc_rq_map(struct blk_mq_tag_set *set, return tags; } -static size_t order_to_size(unsigned int order) -{ - return (size_t)PAGE_SIZE << order; -} - static int blk_mq_init_request(struct blk_mq_tag_set *set, struct request *rq, unsigned int hctx_idx, int node) {