diff mbox series

[2/2] blk-flush: don't need to end rq twice for non post-flush requests

Message ID 20230710064705.1847287-2-chengming.zhou@linux.dev (mailing list archive)
State New, archived
Headers show
Series [1/2] blk-flush: fix rq->flush.seq for post-flush requests | expand

Commit Message

Chengming Zhou July 10, 2023, 6:47 a.m. UTC
From: Chengming Zhou <zhouchengming@bytedance.com>

Now we unconditionally blk_rq_init_flush() to replace rq->end_io to
make rq return twice back to the flush state machine for post-flush.

Obviously, non post-flush requests don't need it, they don't need to
end request twice, so they don't need to replace rq->end_io callback.
And the same for requests with the FUA bit on hardware with FUA support.

So we move blk_rq_init_flush() to REQ_FSEQ_DATA stage and only replace
rq->end_io if it needs post-flush. Otherwise, it can end like normal
request and doesn't need to return back to the flush state machine.

There are also some other good points:
1. all requests on hardware with FUA support won't have post-flush, so
   all of them don't need to end twice.

2. non post-flush requests won't have RQF_FLUSH_SEQ rq_flags set, so
   they can merge like normal requests.

3. we don't account non post-flush requests in flush_data_in_flight,
   since there is no point to defer pending flush for these requests.

Signed-off-by: Chengming Zhou <zhouchengming@bytedance.com>
---
 block/blk-flush.c | 22 ++++++++++++++++++----
 1 file changed, 18 insertions(+), 4 deletions(-)

Comments

Christoph Hellwig July 10, 2023, 1:33 p.m. UTC | #1
On Mon, Jul 10, 2023 at 02:47:05PM +0800, chengming.zhou@linux.dev wrote:
> From: Chengming Zhou <zhouchengming@bytedance.com>
> 
> Now we unconditionally blk_rq_init_flush() to replace rq->end_io to
> make rq return twice back to the flush state machine for post-flush.
> 
> Obviously, non post-flush requests don't need it, they don't need to
> end request twice, so they don't need to replace rq->end_io callback.
> And the same for requests with the FUA bit on hardware with FUA support.
> 
> So we move blk_rq_init_flush() to REQ_FSEQ_DATA stage and only replace
> rq->end_io if it needs post-flush. Otherwise, it can end like normal
> request and doesn't need to return back to the flush state machine.

I really like the idea behind this optimization, but I kinda hate
adding more magic to the already way too magic flush sequence.

I wonder if a better idea would be to kill the flush sequence entirely,
and just split the flush_queue into a preflush and a postflush queue.
This would remove a field from struct request and lead to more readable
code as well.
Chengming Zhou July 25, 2023, 1:15 p.m. UTC | #2
On 2023/7/10 21:33, Christoph Hellwig wrote:
> On Mon, Jul 10, 2023 at 02:47:05PM +0800, chengming.zhou@linux.dev wrote:
>> From: Chengming Zhou <zhouchengming@bytedance.com>
>>
>> Now we unconditionally blk_rq_init_flush() to replace rq->end_io to
>> make rq return twice back to the flush state machine for post-flush.
>>
>> Obviously, non post-flush requests don't need it, they don't need to
>> end request twice, so they don't need to replace rq->end_io callback.
>> And the same for requests with the FUA bit on hardware with FUA support.
>>
>> So we move blk_rq_init_flush() to REQ_FSEQ_DATA stage and only replace
>> rq->end_io if it needs post-flush. Otherwise, it can end like normal
>> request and doesn't need to return back to the flush state machine.
> 
> I really like the idea behind this optimization, but I kinda hate
> adding more magic to the already way too magic flush sequence.

Yes, agree.

> 
> I wonder if a better idea would be to kill the flush sequence entirely,
> and just split the flush_queue into a preflush and a postflush queue.
> This would remove a field from struct request and lead to more readable
> code as well.

I have thought about this for some time, it seems feasible. So I implement it
today and test it using blktests, it works.

I will send the patchset soon.

Thanks.
diff mbox series

Patch

diff --git a/block/blk-flush.c b/block/blk-flush.c
index 094a6adb2718..1b92654e8757 100644
--- a/block/blk-flush.c
+++ b/block/blk-flush.c
@@ -93,6 +93,7 @@  enum {
 
 static void blk_kick_flush(struct request_queue *q,
 			   struct blk_flush_queue *fq, blk_opf_t flags);
+static void blk_rq_init_flush(struct request *rq);
 
 static inline struct blk_flush_queue *
 blk_get_flush_queue(struct request_queue *q, struct blk_mq_ctx *ctx)
@@ -187,7 +188,15 @@  static void blk_flush_complete_seq(struct request *rq,
 		break;
 
 	case REQ_FSEQ_DATA:
-		fq->flush_data_in_flight++;
+		/*
+		 * Only for requests that need post-flush,
+		 * we need to do rq->end_io replacement trick
+		 * to return back to the flush state machine.
+		 */
+		if (!(rq->flush.seq & REQ_FSEQ_POSTFLUSH)) {
+			blk_rq_init_flush(rq);
+			fq->flush_data_in_flight++;
+		}
 		spin_lock(&q->requeue_lock);
 		list_move_tail(&rq->queuelist, &q->flush_list);
 		spin_unlock(&q->requeue_lock);
@@ -202,7 +211,13 @@  static void blk_flush_complete_seq(struct request *rq,
 		 * normal completion and end it.
 		 */
 		list_del_init(&rq->queuelist);
-		blk_flush_restore_request(rq);
+		/*
+		 * Only for requests that had rq->end_io replaced,
+		 * we need to restore rq->end_io and make it a normal
+		 * request before the second end.
+		 */
+		if (rq->rq_flags & RQF_FLUSH_SEQ)
+			blk_flush_restore_request(rq);
 		blk_mq_end_request(rq, error);
 		break;
 
@@ -389,7 +404,6 @@  static enum rq_end_io_ret mq_flush_data_end_io(struct request *rq,
 
 static void blk_rq_init_flush(struct request *rq)
 {
-	rq->flush.seq = 0;
 	rq->rq_flags |= RQF_FLUSH_SEQ;
 	rq->flush.saved_end_io = rq->end_io; /* Usually NULL */
 	rq->end_io = mq_flush_data_end_io;
@@ -424,6 +438,7 @@  bool blk_insert_flush(struct request *rq)
 	 * the request accounting.
 	 */
 	rq->cmd_flags |= REQ_SYNC;
+	rq->flush.seq = 0;
 
 	switch (policy) {
 	case 0:
@@ -458,7 +473,6 @@  bool blk_insert_flush(struct request *rq)
 		 * Mark the request as part of a flush sequence and submit it
 		 * for further processing to the flush state machine.
 		 */
-		blk_rq_init_flush(rq);
 		spin_lock_irq(&fq->mq_flush_lock);
 		blk_flush_complete_seq(rq, fq, REQ_FSEQ_ACTIONS & ~policy, 0);
 		spin_unlock_irq(&fq->mq_flush_lock);