
[1/2] block: Do not merge recursively in elv_attempt_insert_merge()

Message ID 20210520223353.11561-2-jack@suse.cz (mailing list archive)
State New, archived
Series block: Fix deadlock when merging requests with BFQ

Commit Message

Jan Kara May 20, 2021, 10:33 p.m. UTC
Most of the merging happens at the bio level. There should not be much
merging happening at the request level anymore. Furthermore, if we
back-merged a request into the previous one, the chances of being able to
merge the result into an even earlier request are slim - that could succeed
only if requests were inserted in 2 1 3 order. Merging more requests in
elv_attempt_insert_merge() will be difficult to handle once we want to
return the requests that need to be freed to the caller of
blk_mq_sched_try_insert_merge(). So just remove the possibility of
merging multiple requests in elv_attempt_insert_merge().

Signed-off-by: Jan Kara <jack@suse.cz>
---
 block/elevator.c | 19 +++++--------------
 1 file changed, 5 insertions(+), 14 deletions(-)

Comments

Ming Lei May 21, 2021, 12:42 a.m. UTC | #1
On Fri, May 21, 2021 at 12:33:52AM +0200, Jan Kara wrote:
> Most of the merging happens at the bio level. There should not be much
> merging happening at the request level anymore. Furthermore, if we
> back-merged a request into the previous one, the chances of being able to
> merge the result into an even earlier request are slim - that could succeed
> only if requests were inserted in 2 1 3 order. Merging more requests in

Right, but some workloads do have this kind of pattern.

Take qemu IO emulation as an example: it can often be thought of as a single
job doing native aio and direct io with a high queue depth. The IOs originate
from one VM but may come from multiple jobs inside the VM, so bio merging may
not hit much because of IO emulation timing (virtio-scsi/blk's MQ, or IO can
be interleaved from multiple jobs via the SQ transport), but request merging
can really make a difference; see the recent patch in the following link:

https://lore.kernel.org/linux-block/3f61e939-d95a-1dd1-6870-e66795cfc1b1@suse.de/T/#t

> elv_attempt_insert_merge() will be difficult to handle once we want to
> return the requests that need to be freed to the caller of
> blk_mq_sched_try_insert_merge(). So just remove the possibility of
> merging multiple requests in elv_attempt_insert_merge().

This way will cause regression.


Thanks, 
Ming
Jan Kara May 21, 2021, 11:53 a.m. UTC | #2
On Fri 21-05-21 08:42:16, Ming Lei wrote:
> On Fri, May 21, 2021 at 12:33:52AM +0200, Jan Kara wrote:
> > Most of the merging happens at the bio level. There should not be much
> > merging happening at the request level anymore. Furthermore, if we
> > back-merged a request into the previous one, the chances of being able to
> > merge the result into an even earlier request are slim - that could succeed
> > only if requests were inserted in 2 1 3 order. Merging more requests in
> 
> Right, but some workloads do have this kind of pattern.
> 
> Take qemu IO emulation as an example: it can often be thought of as a single
> job doing native aio and direct io with a high queue depth. The IOs originate
> from one VM but may come from multiple jobs inside the VM, so bio merging may
> not hit much because of IO emulation timing (virtio-scsi/blk's MQ, or IO can
> be interleaved from multiple jobs via the SQ transport), but request merging
> can really make a difference; see the recent patch in the following link:
> 
> https://lore.kernel.org/linux-block/3f61e939-d95a-1dd1-6870-e66795cfc1b1@suse.de/T/#t

Oh, request merging definitely does make a difference. But the elevator
hash & merge logic I'm modifying here is used only by BFQ and MQ-DEADLINE
AFAICT. And these IO schedulers will already call blk_mq_sched_try_merge()
from their .bio_merge handler, which gets called from blk_mq_submit_bio().
So all the merging that can happen in the code I remove should have already
happened. Or am I missing something?
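
For reference, mq-deadline's handler is roughly the following (paraphrased
from memory, so take it as a sketch rather than the exact code):

static bool dd_bio_merge(struct request_queue *q, struct bio *bio,
		unsigned int nr_segs)
{
	struct deadline_data *dd = q->elevator->elevator_data;
	struct request *free = NULL;
	bool ret;

	/* try to merge the bio into a request already in the scheduler queue */
	spin_lock(&dd->lock);
	ret = blk_mq_sched_try_merge(q, bio, nr_segs, &free);
	spin_unlock(&dd->lock);

	/* a request emptied by the merge is handed back to be freed */
	if (free)
		blk_mq_free_request(free);

	return ret;
}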

								Honza
Ming Lei May 21, 2021, 1:12 p.m. UTC | #3
On Fri, May 21, 2021 at 01:53:54PM +0200, Jan Kara wrote:
> On Fri 21-05-21 08:42:16, Ming Lei wrote:
> > On Fri, May 21, 2021 at 12:33:52AM +0200, Jan Kara wrote:
> > > Most of the merging happens at the bio level. There should not be much
> > > merging happening at the request level anymore. Furthermore, if we
> > > back-merged a request into the previous one, the chances of being able to
> > > merge the result into an even earlier request are slim - that could succeed
> > > only if requests were inserted in 2 1 3 order. Merging more requests in
> > 
> > Right, but some workloads do have this kind of pattern.
> > 
> > Take qemu IO emulation as an example: it can often be thought of as a single
> > job doing native aio and direct io with a high queue depth. The IOs originate
> > from one VM but may come from multiple jobs inside the VM, so bio merging may
> > not hit much because of IO emulation timing (virtio-scsi/blk's MQ, or IO can
> > be interleaved from multiple jobs via the SQ transport), but request merging
> > can really make a difference; see the recent patch in the following link:
> > 
> > https://lore.kernel.org/linux-block/3f61e939-d95a-1dd1-6870-e66795cfc1b1@suse.de/T/#t
> 
> Oh, request merging definitely does make a difference. But the elevator
> hash & merge logic I'm modifying here is used only by BFQ and MQ-DEADLINE
> AFAICT. And these IO schedulers will already call blk_mq_sched_try_merge()
> from their .bio_merge handler, which gets called from blk_mq_submit_bio().
> So all the merging that can happen in the code I remove should have already
> happened. Or am I missing something?

There might be at least two reasons:

1) when .bio_merge() is called, some requests are still held in the plug
list, so the bio may not get merged into requests in the scheduler queue;
when the plug list is flushed and those requests are inserted into the
scheduler queue, we have to try to merge them further

2) only blk_mq_sched_try_insert_merge() is capable of doing aggressive
request merging: when req A is merged into req B, the function will keep
trying to merge req B with other in-queue requests until no further merge
can be done (see the loop sketched below); neither blk_mq_sched_try_merge()
nor blk_attempt_plug_merge() can do such aggressive request merging.
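
The loop in question, i.e. the aggressive back-merge in
elv_attempt_insert_merge() that this patch removes, is roughly:

	/*
	 * Keep back-merging: after rq has been merged into __rq, the grown
	 * request may itself be mergeable with an even earlier request.
	 */
	while (1) {
		__rq = elv_rqhash_find(q, blk_rq_pos(rq));
		if (!__rq || !blk_attempt_req_merge(q, __rq, rq))
			break;

		/* The merged request could be merged with others, try again */
		ret = true;
		rq = __rq;
	}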



Thanks,
Ming
Jan Kara May 21, 2021, 1:44 p.m. UTC | #4
On Fri 21-05-21 21:12:14, Ming Lei wrote:
> On Fri, May 21, 2021 at 01:53:54PM +0200, Jan Kara wrote:
> > On Fri 21-05-21 08:42:16, Ming Lei wrote:
> > > On Fri, May 21, 2021 at 12:33:52AM +0200, Jan Kara wrote:
> > > > Most of the merging happens at the bio level. There should not be much
> > > > merging happening at the request level anymore. Furthermore, if we
> > > > back-merged a request into the previous one, the chances of being able to
> > > > merge the result into an even earlier request are slim - that could succeed
> > > > only if requests were inserted in 2 1 3 order. Merging more requests in
> > > 
> > > Right, but some workloads do have this kind of pattern.
> > > 
> > > Take qemu IO emulation as an example: it can often be thought of as a single
> > > job doing native aio and direct io with a high queue depth. The IOs originate
> > > from one VM but may come from multiple jobs inside the VM, so bio merging may
> > > not hit much because of IO emulation timing (virtio-scsi/blk's MQ, or IO can
> > > be interleaved from multiple jobs via the SQ transport), but request merging
> > > can really make a difference; see the recent patch in the following link:
> > > 
> > > https://lore.kernel.org/linux-block/3f61e939-d95a-1dd1-6870-e66795cfc1b1@suse.de/T/#t
> > 
> > Oh, request merging definitely does make a difference. But the elevator
> > hash & merge logic I'm modifying here is used only by BFQ and MQ-DEADLINE
> > AFAICT. And these IO schedulers will already call blk_mq_sched_try_merge()
> > from their .bio_merge handler, which gets called from blk_mq_submit_bio().
> > So all the merging that can happen in the code I remove should have already
> > happened. Or am I missing something?
> 
> There might be at least two reasons:
> 
> 1) when .bio_merge() is called, some requests are still held in the plug
> list, so the bio may not get merged into requests in the scheduler queue;
> when the plug list is flushed and those requests are inserted into the
> scheduler queue, we have to try to merge them further

Oh, right, I forgot that the plug list already stores requests, not bios.

> 2) only blk_mq_sched_try_insert_merge() is capable of doing aggressive
> request merging: when req A is merged into req B, the function will keep
> trying to merge req B with other in-queue requests until no further merge
> can be done; neither blk_mq_sched_try_merge() nor blk_attempt_plug_merge()
> can do such aggressive request merging.

Yes, fair point. I was thinking only about a few requests, but if the
request sequence is like 0 2 4 6 ... 2n 1 3 5 7 ... 2n+1, then bio merging
will result in 'n' requests while request merging will be able to get it
down to 1 request.
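
To make that concrete, here is a toy userspace model (hypothetical code, not
the kernel's; find_backmerge() just stands in for the elevator hash lookup)
that inserts one-sector requests in the order 0 2 4 6 1 3 5 7 and only ever
attempts back-merges, either once per insertion or recursively:

#include <stdio.h>
#include <stdbool.h>

#define MAX_REQS 16

struct toy_req {
	int pos;	/* first sector */
	int len;	/* number of sectors */
};

static struct toy_req reqs[MAX_REQS];
static int nr_reqs;

/* Find a queued request that ends exactly where @pos starts. */
static int find_backmerge(int pos)
{
	for (int i = 0; i < nr_reqs; i++)
		if (reqs[i].pos + reqs[i].len == pos)
			return i;
	return -1;
}

static void insert(int pos, bool recursive)
{
	int cur = nr_reqs;

	/* queue a new one-sector request */
	reqs[cur].pos = pos;
	reqs[cur].len = 1;
	nr_reqs++;

	for (;;) {
		int prev = find_backmerge(reqs[cur].pos);

		if (prev < 0)
			return;
		/* back-merge reqs[cur] into reqs[prev] and drop reqs[cur] */
		reqs[prev].len += reqs[cur].len;
		nr_reqs--;
		reqs[cur] = reqs[nr_reqs];
		if (prev == nr_reqs)
			prev = cur;	/* prev was the slot we just moved */
		cur = prev;
		if (!recursive)
			return;
		/* otherwise retry: the grown request may back-merge again */
	}
}

int main(void)
{
	static const int order[] = { 0, 2, 4, 6, 1, 3, 5, 7 };

	for (int recursive = 0; recursive <= 1; recursive++) {
		nr_reqs = 0;
		for (unsigned int i = 0; i < 8; i++)
			insert(order[i], recursive);
		printf("%s back-merge: %d request(s) left\n",
		       recursive ? "recursive" : "single", nr_reqs);
	}
	return 0;
}

With this ordering the single-merge variant is left with four requests while
the recursive variant collapses everything into one request covering sectors
0-7.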

I'll keep the recursive merge and pass back a list of requests to free
instead. Thanks for the explanations!

								Honza

Patch

diff --git a/block/elevator.c b/block/elevator.c
index 440699c28119..098f4bd226f5 100644
--- a/block/elevator.c
+++ b/block/elevator.c
@@ -350,12 +350,11 @@  enum elv_merge elv_merge(struct request_queue *q, struct request **req,
  * we can append 'rq' to an existing request, so we can throw 'rq' away
  * afterwards.
  *
- * Returns true if we merged, false otherwise
+ * Returns true if we merged, false otherwise.
  */
 bool elv_attempt_insert_merge(struct request_queue *q, struct request *rq)
 {
 	struct request *__rq;
-	bool ret;
 
 	if (blk_queue_nomerges(q))
 		return false;
@@ -369,21 +368,13 @@  bool elv_attempt_insert_merge(struct request_queue *q, struct request *rq)
 	if (blk_queue_noxmerges(q))
 		return false;
 
-	ret = false;
 	/*
 	 * See if our hash lookup can find a potential backmerge.
 	 */
-	while (1) {
-		__rq = elv_rqhash_find(q, blk_rq_pos(rq));
-		if (!__rq || !blk_attempt_req_merge(q, __rq, rq))
-			break;
-
-		/* The merged request could be merged with others, try again */
-		ret = true;
-		rq = __rq;
-	}
-
-	return ret;
+	__rq = elv_rqhash_find(q, blk_rq_pos(rq));
+	if (!__rq || !blk_attempt_req_merge(q, __rq, rq))
+		return false;
+	return true;
 }
 
 void elv_merged_request(struct request_queue *q, struct request *rq,