[-next,1/2] block: fix that blk_time_get_ns() doesn't update time after schedule

Message ID 20240411032349.3051233-2-yukuai1@huaweicloud.com (mailing list archive)
State New
Series block: fix cached time in plug

Commit Message

Yu Kuai April 11, 2024, 3:23 a.m. UTC
From: Yu Kuai <yukuai3@huawei.com>

While monitoring the throttle time of IO from iocost, it was found that
this time is always zero after the io_schedule() call in
ioc_rqos_throttle(), for example with the following debug patch:

+       printk("%s-%d: %s enter %llu\n", current->comm, current->pid, __func__, blk_time_get_ns());
        while (true) {
                set_current_state(TASK_UNINTERRUPTIBLE);
                if (wait.committed)
                        break;
                io_schedule();
        }
+       printk("%s-%d: %s exit  %llu\n", current->comm, current->pid, __func__, blk_time_get_ns());

It can be observed that blk_time_get_ns() always returns the same time:

[ 1068.096579] fio-1268: ioc_rqos_throttle enter 1067901962288
[ 1068.272587] fio-1268: ioc_rqos_throttle exit  1067901962288
[ 1068.274389] fio-1268: ioc_rqos_throttle enter 1067901962288
[ 1068.472690] fio-1268: ioc_rqos_throttle exit  1067901962288
[ 1068.474485] fio-1268: ioc_rqos_throttle enter 1067901962288
[ 1068.672656] fio-1268: ioc_rqos_throttle exit  1067901962288
[ 1068.674451] fio-1268: ioc_rqos_throttle enter 1067901962288
[ 1068.872655] fio-1268: ioc_rqos_throttle exit  1067901962288

I think the root cause is that 'PF_BLOCK_TS' is always cleared
by blk_flush_plug() before schedule(), hence blk_plug_invalidate_ts()
will never be called:

blk_time_get_ns
 plug->cur_ktime = ktime_get_ns();
 current->flags |= PF_BLOCK_TS;

io_schedule:
 io_schedule_prepare
  blk_flush_plug
   __blk_flush_plug
    /* the flag is cleared, while time is not */
    current->flags &= ~PF_BLOCK_TS;
 schedule
 sched_update_worker
  /* the flag is not set, hence plug->cur_ktime is not cleared */
  if (tsk->flags & PF_BLOCK_TS)
   blk_plug_invalidate_ts()

blk_time_get_ns
 /* got the time stashed before schedule */
 return plug->cur_ktime;

Fix the problem by clearing cached time in __blk_flush_plug().
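
For illustration only (this is not part of the patch), the sequence above
can be modeled in plain userspace C. Everything named *_model, fake_clock_ns
and the with_fix switch is a hypothetical stand-in introduced for this
sketch; only PF_BLOCK_TS and cur_ktime are names taken from the trace, and
the flag value used here is arbitrary. The model assumes, as the trace
suggests, that a cached time of 0 means "nothing cached".

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Arbitrary bit for the model; the real flag value lives in the kernel. */
#define PF_BLOCK_TS 0x1u

/* Hypothetical stand-in for the plug: only the cached timestamp. */
struct plug_model {
	uint64_t cur_ktime;	/* 0 means "no cached value" */
};

/* Hypothetical stand-in for the task: flags plus its plug. */
struct task_model {
	unsigned int flags;
	struct plug_model *plug;
};

/* Stands in for ktime_get_ns(); advanced by hand while "sleeping". */
static uint64_t fake_clock_ns;

/* Models blk_time_get_ns(): cache the clock on first use, set the flag. */
static uint64_t model_time_get_ns(struct task_model *t)
{
	if (!t->plug->cur_ktime) {
		t->plug->cur_ktime = fake_clock_ns;
		t->flags |= PF_BLOCK_TS;
	}
	return t->plug->cur_ktime;
}

/* Models __blk_flush_plug(): the flag is always cleared; the cached
 * time is cleared only when the fix is applied. */
static void model_flush_plug(struct task_model *t, bool with_fix)
{
	if (with_fix)
		t->plug->cur_ktime = 0;
	t->flags &= ~PF_BLOCK_TS;
}

/* Models io_schedule(): flush the plug, sleep, then run the
 * post-schedule check that only invalidates when the flag is set. */
static void model_io_schedule(struct task_model *t, bool with_fix,
			      uint64_t slept_ns)
{
	model_flush_plug(t, with_fix);
	fake_clock_ns += slept_ns;
	if (t->flags & PF_BLOCK_TS)
		t->plug->cur_ktime = 0;
}

/* Returns the time the model observes across one 500 ns sleep. */
static uint64_t model_elapsed(bool with_fix)
{
	struct plug_model plug = { 0 };
	struct task_model task = { 0, &plug };
	uint64_t enter, exit_ts;

	fake_clock_ns = 1000;
	enter = model_time_get_ns(&task);
	model_io_schedule(&task, with_fix, 500);
	exit_ts = model_time_get_ns(&task);
	return exit_ts - enter;
}

/* Self-check: stale time without the fix, real elapsed time with it. */
void model_self_check(void)
{
	assert(model_elapsed(false) == 0);
	assert(model_elapsed(true) == 500);
}
```

In this model, model_elapsed(false) sees the stale cached value and
reports 0 elapsed time, matching the debug output above, while
model_elapsed(true) reports the full 500 ns.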

Fixes: 06b23f92af87 ("block: update cached timestamp post schedule/preemption")
Signed-off-by: Yu Kuai <yukuai3@huawei.com>
---
 block/blk-core.c | 1 +
 1 file changed, 1 insertion(+)

Comments

Jens Axboe April 11, 2024, 4:44 p.m. UTC | #1
On 4/10/24 9:23 PM, Yu Kuai wrote:
> diff --git a/block/blk-core.c b/block/blk-core.c
> index a16b5abdbbf5..e317d7bc0696 100644
> --- a/block/blk-core.c
> +++ b/block/blk-core.c
> @@ -1195,6 +1195,7 @@ void __blk_flush_plug(struct blk_plug *plug, bool from_schedule)
>  	if (unlikely(!rq_list_empty(plug->cached_rq)))
>  		blk_mq_free_plug_rqs(plug);
>  
> +	plug->cur_ktime = 0;
>  	current->flags &= ~PF_BLOCK_TS;
>  }

We can just use blk_plug_invalidate_ts() here, but not really important.
I think this one should go into 6.9, and patch 2 should go into 6.10,
however.
Yu Kuai April 12, 2024, 1:24 a.m. UTC | #2
Hi,

On 2024/04/12 0:44, Jens Axboe wrote:
> On 4/10/24 9:23 PM, Yu Kuai wrote:
>> diff --git a/block/blk-core.c b/block/blk-core.c
>> index a16b5abdbbf5..e317d7bc0696 100644
>> --- a/block/blk-core.c
>> +++ b/block/blk-core.c
>> @@ -1195,6 +1195,7 @@ void __blk_flush_plug(struct blk_plug *plug, bool from_schedule)
>>   	if (unlikely(!rq_list_empty(plug->cached_rq)))
>>   		blk_mq_free_plug_rqs(plug);
>>   
>> +	plug->cur_ktime = 0;
>>   	current->flags &= ~PF_BLOCK_TS;
>>   }
> 
> We can just use blk_plug_invalidate_ts() here, but not really important.
> I think this one should go into 6.9, and patch 2 should go into 6.10,
> however.

This sounds great! Do you want me to update and send them separately?

Thanks,
Kuai

Jens Axboe April 12, 2024, 2:33 p.m. UTC | #3
On 4/11/24 7:24 PM, Yu Kuai wrote:
> Hi,
> 
> On 2024/04/12 0:44, Jens Axboe wrote:
>> On 4/10/24 9:23 PM, Yu Kuai wrote:
>>> diff --git a/block/blk-core.c b/block/blk-core.c
>>> index a16b5abdbbf5..e317d7bc0696 100644
>>> --- a/block/blk-core.c
>>> +++ b/block/blk-core.c
>>> @@ -1195,6 +1195,7 @@ void __blk_flush_plug(struct blk_plug *plug, bool from_schedule)
>>>       if (unlikely(!rq_list_empty(plug->cached_rq)))
>>>           blk_mq_free_plug_rqs(plug);
>>>   +    plug->cur_ktime = 0;
>>>       current->flags &= ~PF_BLOCK_TS;
>>>   }
>>
>> We can just use blk_plug_invalidate_ts() here, but not really important.
>> I think this one should go into 6.9, and patch 2 should go into 6.10,
>> however.
> 
> This sounds great! Do you want me to update and send them separately?

I've applied 1/2 separately, so just resend 2/2 when -rc4 has been
tagged and I'll get that one queued for 6.10.

Patch

diff --git a/block/blk-core.c b/block/blk-core.c
index a16b5abdbbf5..e317d7bc0696 100644
--- a/block/blk-core.c
+++ b/block/blk-core.c
@@ -1195,6 +1195,7 @@  void __blk_flush_plug(struct blk_plug *plug, bool from_schedule)
 	if (unlikely(!rq_list_empty(plug->cached_rq)))
 		blk_mq_free_plug_rqs(plug);
 
+	plug->cur_ktime = 0;
 	current->flags &= ~PF_BLOCK_TS;
 }