
[3/8] block/mq-deadline: skip expensive merge lookups if contended

Message ID 20240123174021.1967461-4-axboe@kernel.dk (mailing list archive)
State New, archived
Series [1/8] block/mq-deadline: pass in queue directly to dd_insert_request()

Commit Message

Jens Axboe Jan. 23, 2024, 5:34 p.m. UTC
We do several stages of merging in the block layer - the one most
likely to succeed is also the cheapest: merging directly in the
per-task plug when IO is submitted. Getting merges outside of that is
a lot less likely, but IO schedulers may still maintain internal data
structures to facilitate merge lookups outside of the plug.

Make mq-deadline skip these expensive merge lookups if the queue lock
is already contended. The likelihood of getting a merge here is not
very high, so skipping the attempt in the (also unlikely) event that
the queue lock is already contended should not be a problem.

A perf diff of a random read/write workload with 4 threads doing IO,
comparing expensive merges turned on and off, shows:

    25.00%    +61.94%  [kernel.kallsyms]  [k] queued_spin_lock_slowpath

where we almost quadruple the lock contention by attempting these
expensive merges.

Signed-off-by: Jens Axboe <axboe@kernel.dk>
---
 block/mq-deadline.c | 14 +++++++++++++-
 1 file changed, 13 insertions(+), 1 deletion(-)
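
[Editor's note: a comparison like the perf diff above can be produced
with perf record plus perf diff. The invocation below is illustrative
only, not taken from the posting; the fio parameters and device path
are placeholders for the "random read/write, 4 threads" workload
described in the commit message.]

    # baseline kernel: expensive merge lookups always attempted
    perf record -o perf.data.old -- fio --name=randrw --rw=randrw \
        --numjobs=4 --ioengine=io_uring --direct=1 --filename=/dev/nvme0n1
    # patched kernel: merge lookups skipped under contention
    perf record -o perf.data -- fio --name=randrw --rw=randrw \
        --numjobs=4 --ioengine=io_uring --direct=1 --filename=/dev/nvme0n1
    # perf diff defaults to comparing perf.data.old against perf.data
    perf diff perf.data.old perf.data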

Comments

Johannes Thumshirn Jan. 24, 2024, 9:31 a.m. UTC | #1
Looks good,
Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Christoph Hellwig Jan. 24, 2024, 9:32 a.m. UTC | #2
On Tue, Jan 23, 2024 at 10:34:15AM -0700, Jens Axboe wrote:
> We do several stages of merging in the block layer - the one most
> likely to succeed is also the cheapest: merging directly in the
> per-task plug when IO is submitted. Getting merges outside of that is
> a lot less likely, but IO schedulers may still maintain internal data
> structures to facilitate merge lookups outside of the plug.
> 
> Make mq-deadline skip these expensive merge lookups if the queue lock
> is already contended. The likelihood of getting a merge here is not
> very high, so skipping the attempt in the (also unlikely) event that
> the queue lock is already contended should not be a problem.

I'm curious if you tried benchmarking just removing these extra
merges entirely?
Jens Axboe Jan. 24, 2024, 3:02 p.m. UTC | #3
On 1/24/24 2:32 AM, Christoph Hellwig wrote:
> On Tue, Jan 23, 2024 at 10:34:15AM -0700, Jens Axboe wrote:
>> We do several stages of merging in the block layer - the one most
>> likely to succeed is also the cheapest: merging directly in the
>> per-task plug when IO is submitted. Getting merges outside of that is
>> a lot less likely, but IO schedulers may still maintain internal data
>> structures to facilitate merge lookups outside of the plug.
>>
>> Make mq-deadline skip these expensive merge lookups if the queue lock
>> is already contended. The likelihood of getting a merge here is not
>> very high, so skipping the attempt in the (also unlikely) event that
>> the queue lock is already contended should not be a problem.
> 
> I'm curious if you tried benchmarking just removing these extra
> merges entirely?

We tried removing this many years ago, and the issue is generally
threadpools doing related IO across threads, which hence won't merge
in the per-task plug. It's a pretty stupid case, but I'm willing to
bet we'd get regressions reported on rotational storage if we just
skipped it entirely.

Alternatively we could make it dependent on whether the device is
rotational, but it seems cleaner to me to just keep it generally
enabled and skip it if we're contended.
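
[Editor's note: for illustration, a hypothetical version of that
rotational-dependent alternative might look like the sketch below. It
is not part of the posted series; it reuses the dd_bio_merge() shape
from the patch and the existing blk_queue_nonrot() helper, purely as a
sketch of the idea.]

static bool dd_bio_merge(struct request_queue *q, struct bio *bio,
		unsigned int nr_segs)
{
	struct deadline_data *dd = q->elevator->elevator_data;
	struct request *free = NULL;
	bool ret;

	/*
	 * Hypothetical variant: rotational devices benefit most from
	 * scheduler-level merging, so always take the lock there, and
	 * only skip the lookup under contention on non-rotational
	 * devices.
	 */
	if (blk_queue_nonrot(q)) {
		if (!spin_trylock(&dd->lock))
			return false;
	} else {
		spin_lock(&dd->lock);
	}

	ret = blk_mq_sched_try_merge(q, bio, nr_segs, &free);
	spin_unlock(&dd->lock);

	if (free)
		blk_mq_free_request(free);
	return ret;
}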

Patch

diff --git a/block/mq-deadline.c b/block/mq-deadline.c
index 79bc3b6784b3..740b94f36cac 100644
--- a/block/mq-deadline.c
+++ b/block/mq-deadline.c
@@ -800,7 +800,19 @@  static bool dd_bio_merge(struct request_queue *q, struct bio *bio,
 	struct request *free = NULL;
 	bool ret;
 
-	spin_lock(&dd->lock);
+	/*
+	 * bio merging is called for every bio queued, and it's very easy
+	 * to run into contention because of that. If we fail getting
+	 * the dd lock, just skip this merge attempt. For related IO, the
+	 * plug will be the successful merging point. If we get here, we
+	 * already failed doing the obvious merge. Chances of actually
+	 * getting a merge off this path is a lot slimmer, so skipping an
+	 * occasional lookup that will most likely not succeed anyway should
+	 * not be a problem.
+	 */
+	if (!spin_trylock(&dd->lock))
+		return false;
+
 	ret = blk_mq_sched_try_merge(q, bio, nr_segs, &free);
 	spin_unlock(&dd->lock);
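
[Editor's note: trylock-and-skip is a common way to make optional work
yield under contention. As a general illustration of the same idea
outside the kernel, here is a minimal userspace sketch using POSIX
threads; all names are illustrative and this is not kernel code.]

#include <pthread.h>
#include <stdbool.h>

static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;

/*
 * Run optional work only if the lock is free; if another thread
 * already holds it, skip rather than queue up behind the holder.
 * This mirrors the spin_trylock() logic in dd_bio_merge() above.
 */
static bool try_optional_work(void (*work)(void))
{
	if (pthread_mutex_trylock(&lock) != 0)
		return false;	/* contended: skip the attempt */
	work();
	pthread_mutex_unlock(&lock);
	return true;
}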