From patchwork Thu Dec 3 01:26:36 2020
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Ming Lei
X-Patchwork-Id: 11947409
Return-Path:
X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on
 aws-us-west-2-korg-lkml-1.web.codeaurora.org
X-Spam-Level:
X-Spam-Status: No, score=-17.2 required=3.0 tests=BAYES_00,DKIMWL_WL_HIGH,
 DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,HEADER_FROM_DIFFERENT_DOMAINS,
 INCLUDES_CR_TRAILER,INCLUDES_PATCH,MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,
 URIBL_BLOCKED autolearn=ham autolearn_force=no version=3.4.0
Received: from mail.kernel.org (mail.kernel.org [198.145.29.99])
 by smtp.lore.kernel.org (Postfix) with ESMTP id 15797C64E7C
 for ; Thu, 3 Dec 2020 01:28:31 +0000 (UTC)
Received: from vger.kernel.org (vger.kernel.org [23.128.96.18])
 by mail.kernel.org (Postfix) with ESMTP id B877222228
 for ; Thu, 3 Dec 2020 01:28:30 +0000 (UTC)
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
 id S1729206AbgLCB2a (ORCPT ); Wed, 2 Dec 2020 20:28:30 -0500
Received: from us-smtp-delivery-124.mimecast.com ([216.205.24.124]:60211
 "EHLO us-smtp-delivery-124.mimecast.com" rhost-flags-OK-OK-OK-OK)
 by vger.kernel.org with ESMTP id S1726167AbgLCB2a (ORCPT );
 Wed, 2 Dec 2020 20:28:30 -0500
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com;
 s=mimecast20190719; t=1606958823;
 h=from:from:reply-to:subject:subject:date:date:message-id:message-id:
 to:to:cc:cc:mime-version:mime-version:
 content-transfer-encoding:content-transfer-encoding:
 in-reply-to:in-reply-to:references:references;
 bh=EDimu4Rkgk5+WAwUIHzTdVZRQCclc6B61tJQP9FjUgU=;
 b=BZBrDyE5RiXEAiFe1EisZE5mu3M2Df650f7q4irGxYw96f2TZVf7JF+1FPeEG90PkK4Lf4
 8+3WyMG5lWrsRBuWmp3gCtg3wyly6BmH0ULgAhJn21iWMpJFO+CQPtTtImpwCcvR9KdCkr
 7c4xLK+N2QtqHcRMV3DTjGtU8p5M5Yk=
Received: from mimecast-mx01.redhat.com (mimecast-mx01.redhat.com
 [209.132.183.4]) (Using TLS) by relay.mimecast.com with ESMTP id
 us-mta-521-Y6YzRUrgPPC0NqH5EoFW6w-1;
 Wed, 02 Dec 2020 20:27:02 -0500
X-MC-Unique: Y6YzRUrgPPC0NqH5EoFW6w-1
Received: from smtp.corp.redhat.com
 (int-mx04.intmail.prod.int.phx2.redhat.com [10.5.11.14])
 (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits))
 (No client certificate requested)
 by mimecast-mx01.redhat.com (Postfix) with ESMTPS id A026D393A6;
 Thu, 3 Dec 2020 01:27:00 +0000 (UTC)
Received: from localhost (ovpn-12-87.pek2.redhat.com [10.72.12.87])
 by smtp.corp.redhat.com (Postfix) with ESMTP id BB9B15D9CA;
 Thu, 3 Dec 2020 01:26:53 +0000 (UTC)
From: Ming Lei
To: Jens Axboe
Cc: linux-block@vger.kernel.org, Ming Lei, Kashyap Desai, Qian Cai,
 Sumit Saxena, John Garry, Bart Van Assche, Hannes Reinecke
Subject: [PATCH V2 1/3] blk-mq: add new API of blk_mq_hctx_set_fq_lock_class
Date: Thu, 3 Dec 2020 09:26:36 +0800
Message-Id: <20201203012638.543321-2-ming.lei@redhat.com>
In-Reply-To: <20201203012638.543321-1-ming.lei@redhat.com>
References: <20201203012638.543321-1-ming.lei@redhat.com>
MIME-Version: 1.0
X-Scanned-By: MIMEDefang 2.79 on 10.5.11.14
Precedence: bulk
List-ID:
X-Mailing-List: linux-block@vger.kernel.org

flush_end_io() may be called recursively from some drivers, such as
nvme-loop, so lockdep may complain about 'possible recursive locking'.
Commit b3c6a5997541 ("block: Fix a lockdep complaint triggered by request
queue flushing") tried to address this issue by assigning a dynamically
allocated lock class to each flush queue. That solution adds a
synchronize_rcu() call to each hctx's release handler and causes a horrible
SCSI MQ probe delay (more than half an hour on megaraid_sas).

Add a new API, blk_mq_hctx_set_fq_lock_class(), for these drivers, so that a
driver-specific lock class is enough to avoid the lockdep warning of
'possible recursive locking'.

Tested-by: Kashyap Desai
Reported-by: Qian Cai
Cc: Sumit Saxena
Cc: John Garry
Cc: Kashyap Desai
Cc: Bart Van Assche
Cc: Hannes Reinecke
Signed-off-by: Ming Lei
Reviewed-by: Hannes Reinecke
---
 block/blk-flush.c      | 25 +++++++++++++++++++++++++
 include/linux/blk-mq.h |  3 +++
 2 files changed, 28 insertions(+)

diff --git a/block/blk-flush.c b/block/blk-flush.c
index 9507dcdd5881..bf51588762d8 100644
--- a/block/blk-flush.c
+++ b/block/blk-flush.c
@@ -490,3 +490,28 @@ void blk_free_flush_queue(struct blk_flush_queue *fq)
 	kfree(fq->flush_rq);
 	kfree(fq);
 }
+
+/*
+ * Allow a driver to set its own lock class on fq->mq_flush_lock to
+ * avoid a lockdep complaint.
+ *
+ * flush_end_io() may be called recursively from some drivers, such as
+ * nvme-loop, so lockdep may complain 'possible recursive locking' because
+ * all 'struct blk_flush_queue' instances share the same mq_flush_lock lock
+ * class key. We need to assign a different lock class to these drivers'
+ * fq->mq_flush_lock to avoid the lockdep warning.
+ *
+ * Using a dynamically allocated lock class key for each 'blk_flush_queue'
+ * instance is overkill, and worse, it introduces a horrible boot delay
+ * because synchronize_rcu() is implied in lockdep_unregister_key(), which
+ * is called for each hctx release. SCSI probing may synchronously create and
+ * destroy lots of MQ request_queues for non-existent devices, and some robot
+ * test kernels always enable lockdep. More than half an hour has been
+ * observed for SCSI MQ probe with a per-fq lock class.
+ */
+void blk_mq_hctx_set_fq_lock_class(struct blk_mq_hw_ctx *hctx,
+		struct lock_class_key *key)
+{
+	lockdep_set_class(&hctx->fq->mq_flush_lock, key);
+}
+EXPORT_SYMBOL_GPL(blk_mq_hctx_set_fq_lock_class);
diff --git a/include/linux/blk-mq.h b/include/linux/blk-mq.h
index 794b2a33a2c3..5f639240760e 100644
--- a/include/linux/blk-mq.h
+++ b/include/linux/blk-mq.h
@@ -5,6 +5,7 @@
 #include
 #include
 #include
+#include

 struct blk_mq_tags;
 struct blk_flush_queue;
@@ -594,5 +595,7 @@ static inline void blk_mq_cleanup_rq(struct request *rq)
 }

 blk_qc_t blk_mq_submit_bio(struct bio *bio);
+void blk_mq_hctx_set_fq_lock_class(struct blk_mq_hw_ctx *hctx,
+		struct lock_class_key *key);

 #endif

From patchwork Thu Dec 3 01:26:37 2020
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Ming Lei
X-Patchwork-Id: 11947411
Return-Path:
X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on
 aws-us-west-2-korg-lkml-1.web.codeaurora.org
X-Spam-Level:
X-Spam-Status: No, score=-17.2 required=3.0 tests=BAYES_00,DKIMWL_WL_HIGH,
 DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,HEADER_FROM_DIFFERENT_DOMAINS,
 INCLUDES_CR_TRAILER,INCLUDES_PATCH,MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,
 URIBL_BLOCKED autolearn=ham autolearn_force=no version=3.4.0
Received: from mail.kernel.org (mail.kernel.org [198.145.29.99])
 by smtp.lore.kernel.org (Postfix) with ESMTP id 0F2E0C6369E
 for ; Thu, 3 Dec 2020 01:28:44 +0000 (UTC)
Received: from vger.kernel.org (vger.kernel.org [23.128.96.18])
 by mail.kernel.org (Postfix) with ESMTP id A40D822228
 for ; Thu, 3 Dec 2020 01:28:43 +0000 (UTC)
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
 id S1726222AbgLCB2n (ORCPT ); Wed, 2 Dec 2020 20:28:43 -0500
Received: from us-smtp-delivery-124.mimecast.com ([63.128.21.124]:45825
 "EHLO us-smtp-delivery-124.mimecast.com" rhost-flags-OK-OK-OK-OK)
 by vger.kernel.org with ESMTP id S1726167AbgLCB2m (ORCPT );
 Wed, 2 Dec 2020 20:28:42 -0500
DKIM-Signature:
 v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com;
 s=mimecast20190719; t=1606958835;
 h=from:from:reply-to:subject:subject:date:date:message-id:message-id:
 to:to:cc:cc:mime-version:mime-version:
 content-transfer-encoding:content-transfer-encoding:
 in-reply-to:in-reply-to:references:references;
 bh=DdlIhQrVcQSjcRRu6rCOZzDSNzJA1THUuCHTl6bEpAg=;
 b=QBPiHKghvMxmSD0iQM+afnRE7FLp1dIfBgKx2bHieUvYVjLtdV4O7RntwKnkH4Cu7i2sZC
 VtB54m5NNE0n64MJP+1dn/tE2zHXJcuXH5mc9bK65wCSZm2+bMqTfiRS5jAeHH6/G+oDhP
 WxpgDk798mre3kJJCXMyz3DYAxgWHJU=
Received: from mimecast-mx01.redhat.com (mimecast-mx01.redhat.com
 [209.132.183.4]) (Using TLS) by relay.mimecast.com with ESMTP id
 us-mta-94-ptaMkaECOsifWyO4caZBwg-1; Wed, 02 Dec 2020 20:27:12 -0500
X-MC-Unique: ptaMkaECOsifWyO4caZBwg-1
Received: from smtp.corp.redhat.com
 (int-mx05.intmail.prod.int.phx2.redhat.com [10.5.11.15])
 (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits))
 (No client certificate requested)
 by mimecast-mx01.redhat.com (Postfix) with ESMTPS id 8FBA8185E48F;
 Thu, 3 Dec 2020 01:27:10 +0000 (UTC)
Received: from localhost (ovpn-12-87.pek2.redhat.com [10.72.12.87])
 by smtp.corp.redhat.com (Postfix) with ESMTP id 14F125D6BA;
 Thu, 3 Dec 2020 01:27:02 +0000 (UTC)
From: Ming Lei
To: Jens Axboe
Cc: linux-block@vger.kernel.org, Ming Lei, Kashyap Desai, Qian Cai,
 Christoph Hellwig, Sumit Saxena, John Garry, Bart Van Assche,
 Hannes Reinecke
Subject: [PATCH V2 2/3] nvme-loop: use blk_mq_hctx_set_fq_lock_class to set
 loop's lock class
Date: Thu, 3 Dec 2020 09:26:37 +0800
Message-Id: <20201203012638.543321-3-ming.lei@redhat.com>
In-Reply-To: <20201203012638.543321-1-ming.lei@redhat.com>
References: <20201203012638.543321-1-ming.lei@redhat.com>
MIME-Version: 1.0
X-Scanned-By: MIMEDefang 2.79 on 10.5.11.15
Precedence: bulk
List-ID:
X-Mailing-List: linux-block@vger.kernel.org

Set nvme-loop's lock class via blk_mq_hctx_set_fq_lock_class() to avoid the
lockdep 'possible recursive locking' warning. Then we can remove the
dynamically allocated lock
class for each flush queue, and finally avoid the horrible SCSI probe delay.

This approach may not cover the case where one nvme-loop device is backed by
another nvme-loop device. However, in practice people seldom test that way.
Even if someone does, the result is just a false-positive recursive locking
report, not a real deadlock.

Tested-by: Kashyap Desai
Reported-by: Qian Cai
Reviewed-by: Christoph Hellwig
Cc: Sumit Saxena
Cc: John Garry
Cc: Kashyap Desai
Cc: Bart Van Assche
Cc: Hannes Reinecke
Signed-off-by: Ming Lei
Reviewed-by: Hannes Reinecke
---
 drivers/nvme/target/loop.c | 10 ++++++++++
 1 file changed, 10 insertions(+)

diff --git a/drivers/nvme/target/loop.c b/drivers/nvme/target/loop.c
index f6d81239be21..07806016c09d 100644
--- a/drivers/nvme/target/loop.c
+++ b/drivers/nvme/target/loop.c
@@ -211,6 +211,8 @@ static int nvme_loop_init_request(struct blk_mq_tag_set *set,
 			(set == &ctrl->tag_set) ? hctx_idx + 1 : 0);
 }

+static struct lock_class_key loop_hctx_fq_lock_key;
+
 static int nvme_loop_init_hctx(struct blk_mq_hw_ctx *hctx, void *data,
 		unsigned int hctx_idx)
 {
@@ -219,6 +221,14 @@ static int nvme_loop_init_hctx(struct blk_mq_hw_ctx *hctx, void *data,

 	BUG_ON(hctx_idx >= ctrl->ctrl.queue_count);

+	/*
+	 * flush_end_io() can be called recursively for us, so use our own
+	 * lock class key to avoid lockdep's possible recursive locking
+	 * warning. Then we can remove the dynamically allocated lock class
+	 * for each flush queue, which may cause a horrible boot delay.
+	 */
+	blk_mq_hctx_set_fq_lock_class(hctx, &loop_hctx_fq_lock_key);
+
 	hctx->driver_data = queue;
 	return 0;
 }

From patchwork Thu Dec 3 01:26:38 2020
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Ming Lei
X-Patchwork-Id: 11947413
Return-Path:
X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on
 aws-us-west-2-korg-lkml-1.web.codeaurora.org
X-Spam-Level:
X-Spam-Status: No, score=-17.2 required=3.0 tests=BAYES_00,DKIMWL_WL_HIGH,
 DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,HEADER_FROM_DIFFERENT_DOMAINS,
 INCLUDES_CR_TRAILER,INCLUDES_PATCH,MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,
 URIBL_BLOCKED autolearn=ham autolearn_force=no version=3.4.0
Received: from mail.kernel.org (mail.kernel.org [198.145.29.99])
 by smtp.lore.kernel.org (Postfix) with ESMTP id 30191C6369E
 for ; Thu, 3 Dec 2020 01:28:53 +0000 (UTC)
Received: from vger.kernel.org (vger.kernel.org [23.128.96.18])
 by mail.kernel.org (Postfix) with ESMTP id C891D22228
 for ; Thu, 3 Dec 2020 01:28:52 +0000 (UTC)
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
 id S1729233AbgLCB2w (ORCPT ); Wed, 2 Dec 2020 20:28:52 -0500
Received: from us-smtp-delivery-124.mimecast.com ([216.205.24.124]:39195
 "EHLO us-smtp-delivery-124.mimecast.com" rhost-flags-OK-OK-OK-OK)
 by vger.kernel.org with ESMTP id S1729218AbgLCB2w (ORCPT );
 Wed, 2 Dec 2020 20:28:52 -0500
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com;
 s=mimecast20190719; t=1606958845;
 h=from:from:reply-to:subject:subject:date:date:message-id:message-id:
 to:to:cc:cc:mime-version:mime-version:
 content-transfer-encoding:content-transfer-encoding:
 in-reply-to:in-reply-to:references:references;
 bh=GX61fRXro/iN1f/QllGwrAqHqUKtdtZYtR9z3U2BCOw=;
 b=iCk4uGmiUb65u8NSreXPomdVeYAq7ZOIOrp2JbVK2M3fk7dEoalfbo36EQm0A3gmRS98MG
 ZZ4aKZtcqENLSEi3pWj/37WdnzmEi6rq7EPaCEvJlZRyyUwhYCHYvsqMHdZw9kgPgUzAPx
 GVUotrkLXM4kJApx47AxsM7zftr+mm8=
Received: from mimecast-mx01.redhat.com
 (mimecast-mx01.redhat.com [209.132.183.4]) (Using TLS)
 by relay.mimecast.com with ESMTP id us-mta-425-hQudFP02PWWwrqXiwM_qzA-1;
 Wed, 02 Dec 2020 20:27:24 -0500
X-MC-Unique: hQudFP02PWWwrqXiwM_qzA-1
Received: from smtp.corp.redhat.com
 (int-mx08.intmail.prod.int.phx2.redhat.com [10.5.11.23])
 (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits))
 (No client certificate requested)
 by mimecast-mx01.redhat.com (Postfix) with ESMTPS id 57FB510066FD;
 Thu, 3 Dec 2020 01:27:22 +0000 (UTC)
Received: from localhost (ovpn-12-87.pek2.redhat.com [10.72.12.87])
 by smtp.corp.redhat.com (Postfix) with ESMTP id 67FC919D9C;
 Thu, 3 Dec 2020 01:27:13 +0000 (UTC)
From: Ming Lei
To: Jens Axboe
Cc: linux-block@vger.kernel.org, Ming Lei, Kashyap Desai, Qian Cai,
 Christoph Hellwig, Sumit Saxena, John Garry, Bart Van Assche,
 Hannes Reinecke
Subject: [PATCH V2 3/3] Revert "block: Fix a lockdep complaint triggered by
 request queue flushing"
Date: Thu, 3 Dec 2020 09:26:38 +0800
Message-Id: <20201203012638.543321-4-ming.lei@redhat.com>
In-Reply-To: <20201203012638.543321-1-ming.lei@redhat.com>
References: <20201203012638.543321-1-ming.lei@redhat.com>
MIME-Version: 1.0
X-Scanned-By: MIMEDefang 2.84 on 10.5.11.23
Precedence: bulk
List-ID:
X-Mailing-List: linux-block@vger.kernel.org

This reverts commit b3c6a59975415bde29cfd76ff1ab008edbf614a9.

Now that nvme-loop's own lock class avoids the lockdep 'possible recursive
locking' warning, there is no need for a dynamically allocated lock class
key, so revert commit b3c6a5997541 ("block: Fix a lockdep complaint
triggered by request queue flushing").

This fixes the horrible SCSI probe delay on megaraid_sas, where the whole
probe was reported to take more than half an hour.

Tested-by: Kashyap Desai
Reported-by: Qian Cai
Reviewed-by: Christoph Hellwig
Cc: Sumit Saxena
Cc: John Garry
Cc: Kashyap Desai
Cc: Bart Van Assche
Cc: Hannes Reinecke
Signed-off-by: Ming Lei
Reviewed-by: Hannes Reinecke
---
 block/blk-flush.c | 5 -----
 block/blk.h       | 1 -
 2 files changed, 6 deletions(-)

diff --git a/block/blk-flush.c b/block/blk-flush.c
index bf51588762d8..996d5d03dade 100644
--- a/block/blk-flush.c
+++ b/block/blk-flush.c
@@ -69,7 +69,6 @@
 #include
 #include
 #include
-#include

 #include "blk.h"
 #include "blk-mq.h"
@@ -469,9 +468,6 @@ struct blk_flush_queue *blk_alloc_flush_queue(int node, int cmd_size,
 	INIT_LIST_HEAD(&fq->flush_queue[1]);
 	INIT_LIST_HEAD(&fq->flush_data_in_flight);

-	lockdep_register_key(&fq->key);
-	lockdep_set_class(&fq->mq_flush_lock, &fq->key);
-
 	return fq;

 fail_rq:
@@ -486,7 +482,6 @@ void blk_free_flush_queue(struct blk_flush_queue *fq)
 	if (!fq)
 		return;

-	lockdep_unregister_key(&fq->key);
 	kfree(fq->flush_rq);
 	kfree(fq);
 }
diff --git a/block/blk.h b/block/blk.h
index 98f0b1ae2641..d23d018fd2cd 100644
--- a/block/blk.h
+++ b/block/blk.h
@@ -25,7 +25,6 @@ struct blk_flush_queue {
 	struct list_head flush_data_in_flight;
 	struct request *flush_rq;

-	struct lock_class_key key;
 	spinlock_t mq_flush_lock;
 };