
RE: [PATCH] Revert "block/mq-deadline: use correct way to throttling write requests"

Message ID cf8127b0fa594169a71f3257326e5bec@BJMBX02.spreadtrum.com (mailing list archive)
State New, archived
Series RE: [PATCH] Revert "block/mq-deadline: use correct way to throttling write requests"

Commit Message

Zhiguo Niu March 14, 2024, 1:03 a.m. UTC
Hi Bart,

Just as mentioned in the original patch, with "dd->async_depth = max(1UL, 3 * q->nr_requests / 4);" this limiting method does not appear to have any throttling effect, because tag allocation is based on the sbitmap word depth, not on the whole nr_requests.
Right?
Thanks!

For write requests, when we allocate a tag from sched_tags,
data->shallow_depth is passed down to sbitmap_find_bit();
see the following code:

nr = sbitmap_find_bit_in_word(&sb->map[index],
			      min_t(unsigned int,
				    __map_depth(sb, index),
				    depth),
			      alloc_hint, wrap);

The smaller of data->shallow_depth and __map_depth(sb, index)
will be used as the maximum range when allocating bits.
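
Stated differently (a rough pseudo-C sketch; "word_depth" is just a name for
__map_depth(sb, index), not a kernel identifier):

	effective_range = min(data->shallow_depth, word_depth);
	/* word_depth <= (1 << sb->shift), so shallow_depth only throttles
	 * anything when it is smaller than the per-word depth */

So a shallow depth larger than 1 << sb->shift cannot restrict allocation at all.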

For an mmc device (one hw queue, deadline I/O scheduler):
q->nr_requests = sched_tags = 128, so according to the previous
calculation method, dd->async_depth = data->shallow_depth = 96.
The platform is 64-bit with 8 CPUs, so sched_tags.bitmap_tags.sb.shift = 5
and sb.maps[] = 32/32/32/32. Since 32 is smaller than 96, whether it is a
read or a write I/O, tags can be allocated over the full per-word range
every time, so there is no throttling effect.
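
To make the arithmetic above concrete, here is a small user-space sketch
(not kernel code; the values are hard-coded from this mmc example):

	#include <stdio.h>

	int main(void)
	{
		unsigned int nr_requests = 128;			/* sched_tags depth */
		unsigned int shift = 5;				/* sched_tags.bitmap_tags.sb.shift */
		unsigned int word_depth = 1U << shift;		/* 32 bits per sb.maps[] word */
		unsigned int async_depth = 3 * nr_requests / 4;	/* 96, from 3 * q->nr_requests / 4 */
		unsigned int effective = async_depth < word_depth ? async_depth : word_depth;

		/* prints: async_depth=96 word_depth=32 effective=32, i.e. no throttling */
		printf("async_depth=%u word_depth=%u effective=%u\n",
		       async_depth, word_depth, effective);
		return 0;
	}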

-----Original Message-----
From: Bart Van Assche <bvanassche@acm.org>
Sent: March 14, 2024 5:42
To: Jens Axboe <axboe@kernel.dk>
Cc: linux-block@vger.kernel.org; Christoph Hellwig <hch@lst.de>; Bart Van Assche <bvanassche@acm.org>; stable@vger.kernel.org; Damien Le Moal <dlemoal@kernel.org>; Harshit Mogalapalli <harshit.m.mogalapalli@oracle.com>; 牛志国 (Zhiguo Niu) <Zhiguo.Niu@unisoc.com>
Subject: [PATCH] Revert "block/mq-deadline: use correct way to throttling write requests"


The code "max(1U, 3 * (1U << shift)  / 4)" comes from the Kyber I/O scheduler. The Kyber I/O scheduler maintains one internal queue per hwq and hence derives its async_depth from the number of hwq tags. Using this approach for the mq-deadline scheduler is wrong since the mq-deadline scheduler maintains one internal queue for all hwqs combined. Hence this revert.

Cc: stable@vger.kernel.org
Cc: Damien Le Moal <dlemoal@kernel.org>
Cc: Harshit Mogalapalli <harshit.m.mogalapalli@oracle.com>
Cc: Zhiguo Niu <Zhiguo.Niu@unisoc.com>
Fixes: d47f9717e5cf ("block/mq-deadline: use correct way to throttling write requests")
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
---
 block/mq-deadline.c | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

Comments

Bart Van Assche March 14, 2024, 5:08 p.m. UTC | #1
On 3/13/24 18:03, 牛志国 (Zhiguo Niu) wrote:
> Just as mentioned in the original patch, with "dd->async_depth = max(1UL, 3 * q->nr_requests / 4);" this limiting method does not appear to have any throttling effect, because tag allocation is based on the sbitmap word depth, not on the whole nr_requests.
> Right?
> Thanks!
> 
> For write requests, when we allocate a tag from sched_tags,
> data->shallow_depth is passed down to sbitmap_find_bit();
> see the following code:
> 
> nr = sbitmap_find_bit_in_word(&sb->map[index],
> 			      min_t(unsigned int,
> 				    __map_depth(sb, index),
> 				    depth),
> 			      alloc_hint, wrap);
> 
> The smaller of data->shallow_depth and __map_depth(sb, index)
> will be used as the maximum range when allocating bits.
> 
> For an mmc device (one hw queue, deadline I/O scheduler):
> q->nr_requests = sched_tags = 128, so according to the previous
> calculation method, dd->async_depth = data->shallow_depth = 96.
> The platform is 64-bit with 8 CPUs, so sched_tags.bitmap_tags.sb.shift = 5
> and sb.maps[] = 32/32/32/32. Since 32 is smaller than 96, whether it is a
> read or a write I/O, tags can be allocated over the full per-word range
> every time, so there is no throttling effect.
Whether or not the code in my patch effectively performs throttling,
we need this revert to be merged. The patch that is being reverted
("block/mq-deadline: use correct way to throttling write requests")
ended up in Greg KH's stable branches. Hence, the first step is to
revert that patch and tag it with "Cc: stable" such that the revert
lands in the stable branches.

Thanks,

Bart.
Jens Axboe March 14, 2024, 7:31 p.m. UTC | #2
On 3/14/24 11:08 AM, Bart Van Assche wrote:
> On 3/13/24 18:03, 牛志国 (Zhiguo Niu) wrote:
>> Just as mentioned in the original patch, with "dd->async_depth = max(1UL, 3 * q->nr_requests / 4);" this limiting method does not appear to have any throttling effect, because tag allocation is based on the sbitmap word depth, not on the whole nr_requests.
>> Right?
>> Thanks!
>>
>> For write requests, when we allocate a tag from sched_tags,
>> data->shallow_depth is passed down to sbitmap_find_bit();
>> see the following code:
>>
>> nr = sbitmap_find_bit_in_word(&sb->map[index],
>>             min_t(unsigned int,
>>                   __map_depth(sb, index),
>>                   depth),
>>             alloc_hint, wrap);
>>
>> The smaller of data->shallow_depth and __map_depth(sb, index)
>> will be used as the maximum range when allocating bits.
>>
>> For an mmc device (one hw queue, deadline I/O scheduler):
>> q->nr_requests = sched_tags = 128, so according to the previous
>> calculation method, dd->async_depth = data->shallow_depth = 96.
>> The platform is 64-bit with 8 CPUs, so sched_tags.bitmap_tags.sb.shift = 5
>> and sb.maps[] = 32/32/32/32. Since 32 is smaller than 96, whether it is a
>> read or a write I/O, tags can be allocated over the full per-word range
>> every time, so there is no throttling effect.
> Whether or not the code in my patch effectively performs throttling,
> we need this revert to be merged. The patch that is being reverted
> ("block/mq-deadline: use correct way to throttling write requests")
> ended up in Greg KH's stable branches. Hence, the first step is to
> revert that patch and tag it with "Cc: stable" such that the revert
> lands in the stable branches.

Indeed, no amount of arguing is going to change that fact. Zhiguo, it
caused a regression. Rather than argue on why the change is correct,
it'd be much more productive to figure out a future solution.

Patch

diff --git a/block/mq-deadline.c b/block/mq-deadline.c
index f958e79277b8..02a916ba62ee 100644
--- a/block/mq-deadline.c
+++ b/block/mq-deadline.c
@@ -646,9 +646,8 @@ static void dd_depth_updated(struct blk_mq_hw_ctx *hctx)
        struct request_queue *q = hctx->queue;
        struct deadline_data *dd = q->elevator->elevator_data;
        struct blk_mq_tags *tags = hctx->sched_tags;
-       unsigned int shift = tags->bitmap_tags.sb.shift;

-       dd->async_depth = max(1U, 3 * (1U << shift)  / 4);
+       dd->async_depth = max(1UL, 3 * q->nr_requests / 4);

        sbitmap_queue_min_shallow_depth(&tags->bitmap_tags, dd->async_depth);
 }