
blk-mq: make synchronous hw_queue runs RT friendly

Message ID 20211213054425.28121-1-dave@stgolabs.net (mailing list archive)
State New, archived
Series blk-mq: make synchronous hw_queue runs RT friendly

Commit Message

Davidlohr Bueso Dec. 13, 2021, 5:44 a.m. UTC
Preemption is disabled for the synchronous part of __blk_mq_delay_run_hw_queue()
to ensure that the hw queue runs on the correct CPU. This does not play
well with PREEMPT_RT, as regular spinlocks can be taken in this region (such as
the hctx->lock), triggering scheduling-while-atomic scenarios.

Introduce regions that mark the start and end of such a case and allow RT to
disable migration instead. While this better documents what is actually
occurring (the concern is CPU locality, not preemption), doing so for the
regular non-RT case can be too expensive. Similarly, instead of relying on
preemption or migration tricks, the task could also be affined to the valid
cpumask, but that too would be unnecessarily expensive.

Signed-off-by: Davidlohr Bueso <dbueso@suse.de>
---
 block/blk-mq.c | 32 ++++++++++++++++++++++++++++----
 1 file changed, 28 insertions(+), 4 deletions(-)

Comments

Christoph Hellwig Dec. 13, 2021, 1:07 p.m. UTC | #1
> +#ifndef CONFIG_PREEMPT_RT

Please don't add these silly inverted ifdefs.

> +static inline void blk_mq_start_sync_run_hw_queue(void)
> +{
> +	preempt_disable();
> +}
> +static inline void blk_mq_end_sync_run_hw_queue(void)
> +{
> +	preempt_enable();
> +}
> +#else
> +static inline void blk_mq_start_sync_run_hw_queue(void)
> +{
> +	migrate_disable();
> +}
> +static inline void blk_mq_end_sync_run_hw_queue(void)
> +{
> +	migrate_enable();
> +}
> +#endif

But more importantly:  why isn't migrate_disable/enable doing the right
thing for !PREEMPT_RT to avoid this mess?
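[Editor's note: the non-inverted arrangement being asked for would read as
below; a sketch of the style fix only, behaviorally identical to the posted
patch:]

```c
#ifdef CONFIG_PREEMPT_RT
static inline void blk_mq_start_sync_run_hw_queue(void)
{
	migrate_disable();
}
static inline void blk_mq_end_sync_run_hw_queue(void)
{
	migrate_enable();
}
#else
static inline void blk_mq_start_sync_run_hw_queue(void)
{
	preempt_disable();
}
static inline void blk_mq_end_sync_run_hw_queue(void)
{
	preempt_enable();
}
#endif
```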
Sebastian Andrzej Siewior Dec. 13, 2021, 1:52 p.m. UTC | #2
On 2021-12-13 05:07:04 [-0800], Christoph Hellwig wrote:
> But more importantly:  why isn't migrate_disable/enable doing the right
> thing for !PREEMPT_RT to avoid this mess?

Thank you for asking the question.

Sebastian
Davidlohr Bueso Dec. 13, 2021, 7:05 p.m. UTC | #3
On Mon, 13 Dec 2021, Christoph Hellwig wrote:
>But more importantly:  why isn't migrate_disable/enable doing the right
>thing for !PREEMPT_RT to avoid this mess?

Please see Peter's description of the situation in af449901b84.

While I'm not at all a fan of sprinkling migrate_disabling around code,
I didn't want to add any overhead for the common case. If this, however,
were not an issue (if most cases are async runs, for example) the ideal
solution I think would be to just pin current to the hctx->cpumask.

Thanks,
Davidlohr
Christoph Hellwig Dec. 14, 2021, 8:08 a.m. UTC | #4
On Mon, Dec 13, 2021 at 11:05:29AM -0800, Davidlohr Bueso wrote:
> On Mon, 13 Dec 2021, Christoph Hellwig wrote:
> > But more importantly:  why isn't migrate_disable/enable doing the right
> > thing for !PREEMPT_RT to avoid this mess?
> 
> Please see Peter's description of the situation in af449901b84.

That explains why migrate_disable is a bad idea in PREEMPT_RT, not why it
can't do something sensible for !PREEMPT_RT…

> 
> While I'm not at all a fan of sprinkling migrate_disabling around code,
> I didn't want to add any overhead for the common case. If this, however,
> were not an issue (if most cases are async runs, for example) the ideal
> solution I think would be to just pin current to the hctx->cpumask.

sync running is the performance case.

Patch

diff --git a/block/blk-mq.c b/block/blk-mq.c
index 8874a63ae952..d44b851fffba 100644
--- a/block/blk-mq.c
+++ b/block/blk-mq.c
@@ -1841,6 +1841,30 @@ static int blk_mq_hctx_next_cpu(struct blk_mq_hw_ctx *hctx)
 	return next_cpu;
 }
 
+/*
+ * Mark regions to ensure that a synchronous hardware queue
+ * runs on a correct CPU.
+ */
+#ifndef CONFIG_PREEMPT_RT
+static inline void blk_mq_start_sync_run_hw_queue(void)
+{
+	preempt_disable();
+}
+static inline void blk_mq_end_sync_run_hw_queue(void)
+{
+	preempt_enable();
+}
+#else
+static inline void blk_mq_start_sync_run_hw_queue(void)
+{
+	migrate_disable();
+}
+static inline void blk_mq_end_sync_run_hw_queue(void)
+{
+	migrate_enable();
+}
+#endif
+
 /**
  * __blk_mq_delay_run_hw_queue - Run (or schedule to run) a hardware queue.
  * @hctx: Pointer to the hardware queue to run.
@@ -1857,14 +1881,14 @@ static void __blk_mq_delay_run_hw_queue(struct blk_mq_hw_ctx *hctx, bool async,
 		return;
 
 	if (!async && !(hctx->flags & BLK_MQ_F_BLOCKING)) {
-		int cpu = get_cpu();
-		if (cpumask_test_cpu(cpu, hctx->cpumask)) {
+		blk_mq_start_sync_run_hw_queue();
+		if (cpumask_test_cpu(smp_processor_id(), hctx->cpumask)) {
 			__blk_mq_run_hw_queue(hctx);
-			put_cpu();
+			blk_mq_end_sync_run_hw_queue();
 			return;
 		}
 
-		put_cpu();
+		blk_mq_end_sync_run_hw_queue();
 	}
 
 	kblockd_mod_delayed_work_on(blk_mq_hctx_next_cpu(hctx), &hctx->run_work,