[0/6] blk-mq: optimize the queue_rqs() support

Message ID 20230824144403.2135739-1-chengming.zhou@linux.dev (mailing list archive)

Message

Chengming Zhou Aug. 24, 2023, 2:43 p.m. UTC
From: Chengming Zhou <zhouchengming@bytedance.com>

The current queue_rqs() support has the limitation that it can't work on
shared tags queues, which is resolved by patches 1-3. We move the accounting
of active requests to where we really allocate the driver tag.

This is clearer and matches the unaccounting side, which now happens
when we put the driver tag. We can also remove RQF_MQ_INFLIGHT, which
was used to avoid double accounting of flush requests.
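
A minimal sketch of the symmetric accounting described above, assuming a toy model rather than the real blk-mq structures. All names here (`toy_hctx`, `toy_rq`, `toy_get_driver_tag`, `toy_put_driver_tag`, `TOY_NO_TAG`) are hypothetical, and the bump allocator never recycles tags. The point is only that when the counter moves together with the driver tag, holding a tag already tells us whether the request was counted, so a separate per-request flag is not needed.

```c
#include <assert.h>
#include <stdbool.h>

#define TOY_NO_TAG (-1)

/* Hypothetical stand-in for a hardware queue context. */
struct toy_hctx {
	int nr_active;      /* requests currently holding a driver tag */
	int next_free_tag;  /* trivial bump allocator, illustration only */
	int nr_tags;        /* total number of driver tags */
};

/* Hypothetical stand-in for a request. */
struct toy_rq {
	int tag;            /* TOY_NO_TAG while no driver tag is held */
};

/* Account the request at the moment it actually obtains a driver tag. */
static bool toy_get_driver_tag(struct toy_hctx *hctx, struct toy_rq *rq)
{
	if (rq->tag != TOY_NO_TAG)
		return true;    /* already holds a tag, already counted */
	if (hctx->next_free_tag >= hctx->nr_tags)
		return false;   /* out of tags, nothing to account */
	rq->tag = hctx->next_free_tag++;
	hctx->nr_active++;
	return true;
}

/* Unaccount on the matching put; a second put is a no-op. */
static void toy_put_driver_tag(struct toy_hctx *hctx, struct toy_rq *rq)
{
	if (rq->tag == TOY_NO_TAG)
		return;
	assert(hctx->nr_active > 0);
	rq->tag = TOY_NO_TAG;
	hctx->nr_active--;
}
```

In this model a request that is asked for a tag twice, or put twice, changes the counter only once in each direction.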

Another problem is that a driver that supports queue_rqs() has to
set the inflight request table by itself, which is resolved in patch 4.

Patch 5 fixes a potential race which may cause a false
timeout because of the reordering of rq->state and rq->deadline.
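
To illustrate the kind of ordering problem meant here, the following is a generic release/acquire sketch in C11 atomics, not the code from patch 5. The types and functions (`toy_rq`, `toy_start_request`, `toy_req_expired`) are hypothetical, and jiffies wraparound is ignored. The idea is that a timeout checker that observes the in-flight state must also observe the deadline written before it. Otherwise it could compare against a stale deadline and report a false timeout.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

enum toy_state { TOY_IDLE = 0, TOY_IN_FLIGHT = 1 };

/* Hypothetical request with only the two fields of interest. */
struct toy_rq {
	atomic_ulong deadline;
	atomic_int state;
};

/* Submission side: publish the deadline first, then the state. */
static void toy_start_request(struct toy_rq *rq, unsigned long deadline)
{
	assert(atomic_load_explicit(&rq->state, memory_order_relaxed) == TOY_IDLE);
	atomic_store_explicit(&rq->deadline, deadline, memory_order_relaxed);
	/* Release: the deadline store cannot be reordered after this store. */
	atomic_store_explicit(&rq->state, TOY_IN_FLIGHT, memory_order_release);
}

/* Timeout side: read the state first, then the deadline. */
static bool toy_req_expired(struct toy_rq *rq, unsigned long now)
{
	unsigned long deadline;

	/* Acquire: pairs with the release store in toy_start_request(). */
	if (atomic_load_explicit(&rq->state, memory_order_acquire) != TOY_IN_FLIGHT)
		return false;
	deadline = atomic_load_explicit(&rq->deadline, memory_order_relaxed);
	return now >= deadline;
}
```

Without the release/acquire pairing (or equivalent barriers), the checker could see TOY_IN_FLIGHT together with the old deadline value.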

Patch 6 adds queue_rqs() support to null_blk, which showed a
3.6% IOPS improvement in fio/t/io_uring benchmark on my test VM.
We also use it for testing queue_rqs() on shared tags queues.
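
As a rough illustration of why a batched submission hook can help, here is a toy comparison of per-request and per-list submission. Everything in it is an assumption made for the example: the `toy_dev` cost model with a "doorbell" counter and the functions `toy_queue_rq` and `toy_queue_rqs` are hypothetical and do not describe null_blk or where the measured improvement comes from.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical request, linked into a singly linked list. */
struct toy_rq {
	struct toy_rq *next;
};

/* Hypothetical device with a simple cost model. */
struct toy_dev {
	unsigned int submitted;  /* requests handed to the device */
	unsigned int doorbells;  /* device notifications issued */
};

/* One request per call: one notification per request. */
static void toy_queue_rq(struct toy_dev *dev, struct toy_rq *rq)
{
	assert(dev && rq);
	dev->submitted++;
	dev->doorbells++;
}

/* Whole list per call: consume the list, then notify once. */
static void toy_queue_rqs(struct toy_dev *dev, struct toy_rq **rqlist)
{
	unsigned int n = 0;

	assert(dev && rqlist);
	while (*rqlist) {
		struct toy_rq *rq = *rqlist;

		*rqlist = rq->next;  /* pop the head of the list */
		rq->next = NULL;
		n++;
	}
	dev->submitted += n;
	if (n)
		dev->doorbells++;
}
```

In this model, three requests cost three notifications through `toy_queue_rq()` but only one through `toy_queue_rqs()`.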

Thanks for review!

Chengming Zhou (6):
  blk-mq: account active requests when get driver tag
  blk-mq: remove RQF_MQ_INFLIGHT
  blk-mq: support batched queue_rqs() on shared tags queue
  blk-mq: update driver tags request table when start request
  blk-mq: fix potential reorder of request state and deadline
  block/null_blk: add queue_rqs() support

 block/blk-flush.c             | 11 ++-----
 block/blk-mq-debugfs.c        |  1 -
 block/blk-mq.c                | 53 ++++++++++++++------------------
 block/blk-mq.h                | 57 ++++++++++++++++++++++++-----------
 drivers/block/null_blk/main.c | 20 ++++++++++++
 drivers/block/virtio_blk.c    |  2 --
 drivers/nvme/host/pci.c       |  1 -
 include/linux/blk-mq.h        |  2 --
 8 files changed, 84 insertions(+), 63 deletions(-)

Comments

Bart Van Assche Aug. 24, 2023, 5:02 p.m. UTC | #1
Hi Jens and Christoph,

This patch series would be simplified significantly if the code for
fair tag allocation were removed first
(https://lore.kernel.org/linux-block/20230103195337.158625-1-bvanassche@acm.org/,
January 2023). It has been proposed to improve fair tag sharing, but the
complexity of the proposed alternative is scary
(https://lore.kernel.org/linux-block/20230618160738.54385-1-yukuai1@huaweicloud.com/,
June 2023).

Does everyone agree with removing the code for fair tag sharing, code
that significantly hurts the performance of UFS devices and that did
not exist in the legacy block layer?

Thanks,

Bart.

Chengming Zhou Aug. 25, 2023, 8:24 a.m. UTC | #2
On 2023/8/25 01:02, Bart Van Assche wrote:
> Hi Jens and Christoph,
> 
> This patch series would be simplified significantly if the code for
> fair tag allocation were removed first
> (https://lore.kernel.org/linux-block/20230103195337.158625-1-bvanassche@acm.org/, January 2023).
> It has been proposed to improve fair tag sharing, but the complexity of
> the proposed alternative is scary
> (https://lore.kernel.org/linux-block/20230618160738.54385-1-yukuai1@huaweicloud.com/, June 2023).
>
> Does everyone agree with removing the code for fair tag sharing, code
> that significantly hurts the performance of UFS devices and that did
> not exist in the legacy block layer?
> 

Hi Bart, thanks for the references!

I don't know the details of the UFS performance problem, but I feel
it may be caused by the overly lazy queue idle handling, which is
currently done only in the queue timeout work.

Another problem may be the wakeup batch algorithm, which is too subtle
and has caused some IO hang problems in the past.

So yes, we should improve it, although I don't have a good idea for now
and need to do some tests and analysis.

As for removing all this code, I can't say, given my limited knowledge.
It was introduced to provide relatively fair tag sharing between queues
and to avoid starvation. And the proposed alternative looks too complex to me.

Thanks.

Bart Van Assche Aug. 27, 2023, 12:45 a.m. UTC | #3
On 8/25/23 01:24, Chengming Zhou wrote:
> I don't know the details of the UFS performance problem, but I feel
> it may be caused by the overly lazy queue idle handling, which is
> currently done only in the queue timeout work.

Hi Chengming,

The root cause of the UFS performance problem is the fair sharing
algorithm itself: the active queue count is only reduced after
the request queue timeout has expired. This is way too slow. The last time
it was proposed to remove that algorithm, Yu Kuai promised to replace it
with a better one. Since progress on the replacement has stalled,
I'm asking again whether everyone agrees to remove the fairness
algorithm.
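
To make the effect concrete, here is a small sketch of a per-queue tag limit under fair sharing. It is a simplified model and not the kernel's code. The helper names `toy_fair_depth` and `toy_may_queue` are hypothetical, and the rounded-up division with a minimum share of 4 tags is an assumption chosen for the example.

```c
#include <assert.h>

/*
 * Per-queue share of a shared tag set: total tags divided by the
 * number of queues counted as active, rounded up, with an assumed
 * floor of 4 tags.
 */
static unsigned int toy_fair_depth(unsigned int total_tags,
				   unsigned int active_queues)
{
	unsigned int share;

	assert(total_tags > 0);
	if (active_queues == 0)
		return total_tags;
	share = (total_tags + active_queues - 1) / active_queues;
	return share < 4 ? 4 : share;
}

/* A queue may allocate another tag only while below its share. */
static int toy_may_queue(unsigned int total_tags,
			 unsigned int active_queues,
			 unsigned int in_use_by_queue)
{
	return in_use_by_queue < toy_fair_depth(total_tags, active_queues);
}
```

With 32 tags and 8 queues still counted as active, a single busy queue is capped at 4 tags in this model, even if the other 7 queues have been idle for a while. The cap only rises once the active count drops, which per the explanation above happens only after the request queue timeout has expired.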

Thanks,

Bart.

Chengming Zhou Sept. 2, 2023, 3 p.m. UTC | #4
Hello, gentle ping.

Thanks.
