
[v2] blk-mq: Remove 'running from the wrong CPU' warning

Message ID 20201130101921.52754-1-dwagner@suse.de (mailing list archive)
State New, archived
Series [v2] blk-mq: Remove 'running from the wrong CPU' warning

Commit Message

Daniel Wagner Nov. 30, 2020, 10:19 a.m. UTC
It's guaranteed that no request is in flight when a hctx is going
offline. This warning is only triggered when the wq's CPU is
hot-plugged and blk-mq is not yet synced up.

As this state is temporary and the request is still processed
correctly, it is better to remove the warning, since this is the
fast path.

Suggested-by: Ming Lei <ming.lei@redhat.com>
Signed-off-by: Daniel Wagner <dwagner@suse.de>
---

v2:
  - remove the warning as suggested by Ming
v1:
  - initial version
    https://lore.kernel.org/linux-block/20201126095152.19151-1-dwagner@suse.de/

 block/blk-mq.c | 25 -------------------------
 1 file changed, 25 deletions(-)

Comments

Christoph Hellwig Nov. 30, 2020, 5:17 p.m. UTC | #1
On Mon, Nov 30, 2020 at 11:19:21AM +0100, Daniel Wagner wrote:
> It's guaranteed that no request is in flight when a hctx is going
> offline. This warning is only triggered when the wq's CPU is
> hot-plugged and blk-mq is not yet synced up.
> 
> As this state is temporary and the request is still processed
> correctly, it is better to remove the warning, since this is the
> fast path.
> 
> Suggested-by: Ming Lei <ming.lei@redhat.com>
> Signed-off-by: Daniel Wagner <dwagner@suse.de>

Looks good,

Reviewed-by: Christoph Hellwig <hch@lst.de>
Ming Lei Dec. 1, 2020, 2:09 a.m. UTC | #2
On Mon, Nov 30, 2020 at 11:19:21AM +0100, Daniel Wagner wrote:
> It's guaranteed that no request is in flight when a hctx is going
> offline. This warning is only triggered when the wq's CPU is
> hot-plugged and blk-mq is not yet synced up.
> 
> As this state is temporary and the request is still processed
> correctly, it is better to remove the warning, since this is the
> fast path.
> 
> Suggested-by: Ming Lei <ming.lei@redhat.com>
> Signed-off-by: Daniel Wagner <dwagner@suse.de>
> ---
> 
> v2:
>   - remove the warning as suggested by Ming
> v1:
>   - initial version
>     https://lore.kernel.org/linux-block/20201126095152.19151-1-dwagner@suse.de/
> 
>  block/blk-mq.c | 25 -------------------------
>  1 file changed, 25 deletions(-)
> 
> diff --git a/block/blk-mq.c b/block/blk-mq.c
> index 55bcee5dc032..7e6761804f86 100644
> --- a/block/blk-mq.c
> +++ b/block/blk-mq.c
> @@ -1495,31 +1495,6 @@ static void __blk_mq_run_hw_queue(struct blk_mq_hw_ctx *hctx)
>  {
>  	int srcu_idx;
>  
> -	/*
> -	 * We should be running this queue from one of the CPUs that
> -	 * are mapped to it.
> -	 *
> -	 * There are at least two related races now between setting
> -	 * hctx->next_cpu from blk_mq_hctx_next_cpu() and running
> -	 * __blk_mq_run_hw_queue():
> -	 *
> -	 * - hctx->next_cpu is found offline in blk_mq_hctx_next_cpu(),
> -	 *   but later it becomes online, then this warning is harmless
> -	 *   at all
> -	 *
> -	 * - hctx->next_cpu is found online in blk_mq_hctx_next_cpu(),
> -	 *   but later it becomes offline, then the warning can't be
> -	 *   triggered, and we depend on blk-mq timeout handler to
> -	 *   handle dispatched requests to this hctx
> -	 */
> -	if (!cpumask_test_cpu(raw_smp_processor_id(), hctx->cpumask) &&
> -		cpu_online(hctx->next_cpu)) {
> -		printk(KERN_WARNING "run queue from wrong CPU %d, hctx %s\n",
> -			raw_smp_processor_id(),
> -			cpumask_empty(hctx->cpumask) ? "inactive": "active");
> -		dump_stack();
> -	}
> -
>  	/*
>  	 * We can't run the queue inline with ints disabled. Ensure that
>  	 * we catch bad users of this early.
> -- 
> 2.16.4
> 

Reviewed-by: Ming Lei <ming.lei@redhat.com>

Thanks,
Ming
Daniel Wagner Dec. 16, 2020, 3:35 p.m. UTC | #3
On Mon, Nov 30, 2020 at 05:17:48PM +0000, Christoph Hellwig wrote:
> On Mon, Nov 30, 2020 at 11:19:21AM +0100, Daniel Wagner wrote:
> > It's guaranteed that no request is in flight when a hctx is going
> > offline. This warning is only triggered when the wq's CPU is
> > hot-plugged and blk-mq is not yet synced up.
> > 
> > As this state is temporary and the request is still processed
> > correctly, it is better to remove the warning, since this is the
> > fast path.
> > 
> > Suggested-by: Ming Lei <ming.lei@redhat.com>
> > Signed-off-by: Daniel Wagner <dwagner@suse.de>
> 
> Looks good,
> 
> Reviewed-by: Christoph Hellwig <hch@lst.de>

Jens, any chance you queue this one up?

Thanks,
Daniel
Jens Axboe Dec. 16, 2020, 3:47 p.m. UTC | #4
On 11/30/20 3:19 AM, Daniel Wagner wrote:
> It's guaranteed that no request is in flight when a hctx is going
> offline. This warning is only triggered when the wq's CPU is
> hot-plugged and blk-mq is not yet synced up.
> 
> As this state is temporary and the request is still processed
> correctly, it is better to remove the warning, since this is the
> fast path.

Applied, thanks.

Patch

diff --git a/block/blk-mq.c b/block/blk-mq.c
index 55bcee5dc032..7e6761804f86 100644
--- a/block/blk-mq.c
+++ b/block/blk-mq.c
@@ -1495,31 +1495,6 @@  static void __blk_mq_run_hw_queue(struct blk_mq_hw_ctx *hctx)
 {
 	int srcu_idx;
 
-	/*
-	 * We should be running this queue from one of the CPUs that
-	 * are mapped to it.
-	 *
-	 * There are at least two related races now between setting
-	 * hctx->next_cpu from blk_mq_hctx_next_cpu() and running
-	 * __blk_mq_run_hw_queue():
-	 *
-	 * - hctx->next_cpu is found offline in blk_mq_hctx_next_cpu(),
-	 *   but later it becomes online, then this warning is harmless
-	 *   at all
-	 *
-	 * - hctx->next_cpu is found online in blk_mq_hctx_next_cpu(),
-	 *   but later it becomes offline, then the warning can't be
-	 *   triggered, and we depend on blk-mq timeout handler to
-	 *   handle dispatched requests to this hctx
-	 */
-	if (!cpumask_test_cpu(raw_smp_processor_id(), hctx->cpumask) &&
-		cpu_online(hctx->next_cpu)) {
-		printk(KERN_WARNING "run queue from wrong CPU %d, hctx %s\n",
-			raw_smp_processor_id(),
-			cpumask_empty(hctx->cpumask) ? "inactive": "active");
-		dump_stack();
-	}
-
 	/*
 	 * We can't run the queue inline with ints disabled. Ensure that
 	 * we catch bad users of this early.