sched/deadline: Add missing update_rq_clock() in dl_task_timer()

Message ID 1488865888-15894-1-git-send-email-wanpeng.li@hotmail.com (mailing list archive)
State New, archived

Commit Message

Wanpeng Li March 7, 2017, 5:51 a.m. UTC
From: Wanpeng Li <wanpeng.li@hotmail.com>

The following warning can be triggered by hot-unplugging the CPU
on which an active SCHED_DEADLINE task is running:

 ------------[ cut here ]------------
 WARNING: CPU: 7 PID: 0 at kernel/sched/sched.h:833 replenish_dl_entity+0x71e/0xc40
 rq->clock_update_flags < RQCF_ACT_SKIP
 CPU: 7 PID: 0 Comm: swapper/7 Tainted: G    B           4.11.0-rc1+ #24
 Hardware name: LENOVO ThinkCentre M8500t-N000/SHARKBAY, BIOS FBKTC1AUS 02/16/2016
 Call Trace:
  <IRQ>
  dump_stack+0x85/0xc4
  __warn+0x172/0x1b0
  warn_slowpath_fmt+0xb4/0xf0
  ? __warn+0x1b0/0x1b0
  ? debug_check_no_locks_freed+0x2c0/0x2c0
  ? cpudl_set+0x3d/0x2b0
  replenish_dl_entity+0x71e/0xc40
  enqueue_task_dl+0x2ea/0x12e0
  ? dl_task_timer+0x777/0x990
  ? __hrtimer_run_queues+0x270/0xa50
  dl_task_timer+0x316/0x990
  ? enqueue_task_dl+0x12e0/0x12e0
  ? enqueue_task_dl+0x12e0/0x12e0
  __hrtimer_run_queues+0x270/0xa50
  ? hrtimer_cancel+0x20/0x20
  ? hrtimer_interrupt+0x119/0x600
  hrtimer_interrupt+0x19c/0x600
  ? trace_hardirqs_off+0xd/0x10
  local_apic_timer_interrupt+0x74/0xe0
  smp_apic_timer_interrupt+0x76/0xa0
  apic_timer_interrupt+0x93/0xa0

Once the DL timer fires and the current rq is offline, the DL task is
migrated to a suitable later-deadline rq. The clock of the new rq must
be updated before the task is enqueued there. Fix this by updating the
rq clock after taking the new rq's lock.

Cc: Juri Lelli <juri.lelli@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Matt Fleming <matt@codeblueprint.co.uk>
Signed-off-by: Wanpeng Li <wanpeng.li@hotmail.com>
---
 kernel/sched/deadline.c | 1 +
 1 file changed, 1 insertion(+)
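
Note (not part of the commit): the warning above comes from the scheduler's
debug check that the rq clock has been updated since the rq lock was last
pinned. What follows is only a minimal userspace sketch of that bookkeeping,
assuming the flag values used around v4.11. struct model_rq and every
model_*() helper are hypothetical names for illustration, not kernel APIs,
and the "stale" starting state is my reading of how the new rq can end up
without the updated flag.

```c
#include <stdbool.h>

/* Flag values as used in kernel/sched/sched.h around v4.11 (assumption). */
#define RQCF_REQ_SKIP 0x01
#define RQCF_ACT_SKIP 0x02
#define RQCF_UPDATED  0x04

/* Hypothetical stand-in for struct rq; only the fields the check needs. */
struct model_rq {
	unsigned int clock_update_flags;
	unsigned long long clock;
};

/* Pinning the rq lock forgets any earlier clock update. */
static void model_rq_pin_lock(struct model_rq *rq)
{
	rq->clock_update_flags &= (RQCF_REQ_SKIP | RQCF_ACT_SKIP);
}

/* Updating the clock records that fact in the flags. */
static void model_update_rq_clock(struct model_rq *rq, unsigned long long now)
{
	rq->clock = now;
	rq->clock_update_flags |= RQCF_UPDATED;
}

/* Mirrors the warning condition: true means no warning would fire. */
static bool model_clock_is_fresh(const struct model_rq *rq)
{
	return rq->clock_update_flags >= RQCF_ACT_SKIP;
}

/*
 * Models the tail of the offline-migration branch: the new rq's lock is
 * held, then the task is enqueued, and the enqueue path reads the rq clock.
 */
static bool model_enqueue_on_new_rq(struct model_rq *new_rq, bool with_fix,
				    unsigned long long now)
{
	if (with_fix)
		model_update_rq_clock(new_rq, now);
	return model_clock_is_fresh(new_rq);
}
```

For a model_rq whose lock was pinned after its last clock update,
model_enqueue_on_new_rq(&rq, false, now) returns false (the warning case),
while the same call with with_fix set returns true.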

Comments

Matt Fleming March 7, 2017, 1:35 p.m. UTC | #1
On Mon, 06 Mar, at 09:51:28PM, Wanpeng Li wrote:
> From: Wanpeng Li <wanpeng.li@hotmail.com>
> 
> The following warning can be triggered by hot-unplugging the CPU
> on which an active SCHED_DEADLINE task is running:
> 
>  ------------[ cut here ]------------
>  WARNING: CPU: 7 PID: 0 at kernel/sched/sched.h:833 replenish_dl_entity+0x71e/0xc40
>  rq->clock_update_flags < RQCF_ACT_SKIP
>  CPU: 7 PID: 0 Comm: swapper/7 Tainted: G    B           4.11.0-rc1+ #24
 
[...]

> diff --git a/kernel/sched/deadline.c b/kernel/sched/deadline.c
> index 99b2c33..c6db3fd 100644
> --- a/kernel/sched/deadline.c
> +++ b/kernel/sched/deadline.c
> @@ -638,6 +638,7 @@ static enum hrtimer_restart dl_task_timer(struct hrtimer *timer)
>  		lockdep_unpin_lock(&rq->lock, rf.cookie);
>  		rq = dl_task_offline_migration(rq, p);
>  		rf.cookie = lockdep_pin_lock(&rq->lock);
> +		update_rq_clock(rq);
>  
>  		/*
>  		 * Now that the task has been migrated to the new RQ and we

Yeah, I guess the reason we can't use the rq_repin_lock() function is
because of all the DL double rq locking going on inside of
dl_task_offline_migration().

I'd definitely like someone else to verify, but this looks OK to me.

Reviewed-by: Matt Fleming <matt@codeblueprint.co.uk>
Wanpeng Li March 15, 2017, 7:53 a.m. UTC | #2
Ping, :)
2017-03-07 13:51 GMT+08:00 Wanpeng Li <kernellwp@gmail.com>:
> From: Wanpeng Li <wanpeng.li@hotmail.com>
>
> The following warning can be triggered by hot-unplugging the CPU
> on which an active SCHED_DEADLINE task is running:
>
>  ------------[ cut here ]------------
>  WARNING: CPU: 7 PID: 0 at kernel/sched/sched.h:833 replenish_dl_entity+0x71e/0xc40
>  rq->clock_update_flags < RQCF_ACT_SKIP
>  CPU: 7 PID: 0 Comm: swapper/7 Tainted: G    B           4.11.0-rc1+ #24
>  Hardware name: LENOVO ThinkCentre M8500t-N000/SHARKBAY, BIOS FBKTC1AUS 02/16/2016
>  Call Trace:
>   <IRQ>
>   dump_stack+0x85/0xc4
>   __warn+0x172/0x1b0
>   warn_slowpath_fmt+0xb4/0xf0
>   ? __warn+0x1b0/0x1b0
>   ? debug_check_no_locks_freed+0x2c0/0x2c0
>   ? cpudl_set+0x3d/0x2b0
>   replenish_dl_entity+0x71e/0xc40
>   enqueue_task_dl+0x2ea/0x12e0
>   ? dl_task_timer+0x777/0x990
>   ? __hrtimer_run_queues+0x270/0xa50
>   dl_task_timer+0x316/0x990
>   ? enqueue_task_dl+0x12e0/0x12e0
>   ? enqueue_task_dl+0x12e0/0x12e0
>   __hrtimer_run_queues+0x270/0xa50
>   ? hrtimer_cancel+0x20/0x20
>   ? hrtimer_interrupt+0x119/0x600
>   hrtimer_interrupt+0x19c/0x600
>   ? trace_hardirqs_off+0xd/0x10
>   local_apic_timer_interrupt+0x74/0xe0
>   smp_apic_timer_interrupt+0x76/0xa0
>   apic_timer_interrupt+0x93/0xa0
>
> Once the DL timer fires and the current rq is offline, the DL task is
> migrated to a suitable later-deadline rq. The clock of the new rq must
> be updated before the task is enqueued there. Fix this by updating the
> rq clock after taking the new rq's lock.
>
> Cc: Juri Lelli <juri.lelli@arm.com>
> Cc: Peter Zijlstra <peterz@infradead.org>
> Cc: Ingo Molnar <mingo@kernel.org>
> Cc: Thomas Gleixner <tglx@linutronix.de>
> Cc: Matt Fleming <matt@codeblueprint.co.uk>
> Signed-off-by: Wanpeng Li <wanpeng.li@hotmail.com>
> ---
>  kernel/sched/deadline.c | 1 +
>  1 file changed, 1 insertion(+)
>
> diff --git a/kernel/sched/deadline.c b/kernel/sched/deadline.c
> index 99b2c33..c6db3fd 100644
> --- a/kernel/sched/deadline.c
> +++ b/kernel/sched/deadline.c
> @@ -638,6 +638,7 @@ static enum hrtimer_restart dl_task_timer(struct hrtimer *timer)
>                 lockdep_unpin_lock(&rq->lock, rf.cookie);
>                 rq = dl_task_offline_migration(rq, p);
>                 rf.cookie = lockdep_pin_lock(&rq->lock);
> +               update_rq_clock(rq);
>
>                 /*
>                  * Now that the task has been migrated to the new RQ and we
> --
> 2.7.4
>
Daniel Bristot de Oliveira March 15, 2017, 10:44 a.m. UTC | #3
On 03/15/2017 08:53 AM, Wanpeng Li wrote:
> Ping, :)
> 2017-03-07 13:51 GMT+08:00 Wanpeng Li <kernellwp@gmail.com>:
>> From: Wanpeng Li <wanpeng.li@hotmail.com>
>>
>> The following warning can be triggered by hot-unplugging the CPU
>> on which an active SCHED_DEADLINE task is running:
>>
>>  ------------[ cut here ]------------
>>  WARNING: CPU: 7 PID: 0 at kernel/sched/sched.h:833 replenish_dl_entity+0x71e/0xc40
>>  rq->clock_update_flags < RQCF_ACT_SKIP
>>  CPU: 7 PID: 0 Comm: swapper/7 Tainted: G    B           4.11.0-rc1+ #24
>>  Hardware name: LENOVO ThinkCentre M8500t-N000/SHARKBAY, BIOS FBKTC1AUS 02/16/2016
>>  Call Trace:
>>   <IRQ>
>>   dump_stack+0x85/0xc4
>>   __warn+0x172/0x1b0
>>   warn_slowpath_fmt+0xb4/0xf0
>>   ? __warn+0x1b0/0x1b0
>>   ? debug_check_no_locks_freed+0x2c0/0x2c0
>>   ? cpudl_set+0x3d/0x2b0
>>   replenish_dl_entity+0x71e/0xc40
>>   enqueue_task_dl+0x2ea/0x12e0
>>   ? dl_task_timer+0x777/0x990
>>   ? __hrtimer_run_queues+0x270/0xa50
>>   dl_task_timer+0x316/0x990
>>   ? enqueue_task_dl+0x12e0/0x12e0
>>   ? enqueue_task_dl+0x12e0/0x12e0
>>   __hrtimer_run_queues+0x270/0xa50
>>   ? hrtimer_cancel+0x20/0x20
>>   ? hrtimer_interrupt+0x119/0x600
>>   hrtimer_interrupt+0x19c/0x600
>>   ? trace_hardirqs_off+0xd/0x10
>>   local_apic_timer_interrupt+0x74/0xe0
>>   smp_apic_timer_interrupt+0x76/0xa0
>>   apic_timer_interrupt+0x93/0xa0
>>
>> Once the DL timer fires and the current rq is offline, the DL task is
>> migrated to a suitable later-deadline rq. The clock of the new rq must
>> be updated before the task is enqueued there. Fix this by updating the
>> rq clock after taking the new rq's lock.
>>
>> Cc: Juri Lelli <juri.lelli@arm.com>
>> Cc: Peter Zijlstra <peterz@infradead.org>
>> Cc: Ingo Molnar <mingo@kernel.org>
>> Cc: Thomas Gleixner <tglx@linutronix.de>
>> Cc: Matt Fleming <matt@codeblueprint.co.uk>
>> Signed-off-by: Wanpeng Li <wanpeng.li@hotmail.com>

Reviewed-by: Daniel Bristot de Oliveira <bristot@redhat.com>

-- Daniel
Juri Lelli March 15, 2017, 11:03 a.m. UTC | #4
Hi,

On 06/03/17 21:51, Wanpeng Li wrote:
> From: Wanpeng Li <wanpeng.li@hotmail.com>
> 
> The following warning can be triggered by hot-unplugging the CPU
> on which an active SCHED_DEADLINE task is running:
> 
>  ------------[ cut here ]------------
>  WARNING: CPU: 7 PID: 0 at kernel/sched/sched.h:833 replenish_dl_entity+0x71e/0xc40
>  rq->clock_update_flags < RQCF_ACT_SKIP
>  CPU: 7 PID: 0 Comm: swapper/7 Tainted: G    B           4.11.0-rc1+ #24
>  Hardware name: LENOVO ThinkCentre M8500t-N000/SHARKBAY, BIOS FBKTC1AUS 02/16/2016
>  Call Trace:
>   <IRQ>
>   dump_stack+0x85/0xc4
>   __warn+0x172/0x1b0
>   warn_slowpath_fmt+0xb4/0xf0
>   ? __warn+0x1b0/0x1b0
>   ? debug_check_no_locks_freed+0x2c0/0x2c0
>   ? cpudl_set+0x3d/0x2b0
>   replenish_dl_entity+0x71e/0xc40
>   enqueue_task_dl+0x2ea/0x12e0
>   ? dl_task_timer+0x777/0x990
>   ? __hrtimer_run_queues+0x270/0xa50
>   dl_task_timer+0x316/0x990
>   ? enqueue_task_dl+0x12e0/0x12e0
>   ? enqueue_task_dl+0x12e0/0x12e0
>   __hrtimer_run_queues+0x270/0xa50
>   ? hrtimer_cancel+0x20/0x20
>   ? hrtimer_interrupt+0x119/0x600
>   hrtimer_interrupt+0x19c/0x600
>   ? trace_hardirqs_off+0xd/0x10
>   local_apic_timer_interrupt+0x74/0xe0
>   smp_apic_timer_interrupt+0x76/0xa0
>   apic_timer_interrupt+0x93/0xa0
> 
> Once the DL timer fires and the current rq is offline, the DL task is
> migrated to a suitable later-deadline rq. The clock of the new rq must
> be updated before the task is enqueued there. Fix this by updating the
> rq clock after taking the new rq's lock.
> 
> Cc: Juri Lelli <juri.lelli@arm.com>
> Cc: Peter Zijlstra <peterz@infradead.org>
> Cc: Ingo Molnar <mingo@kernel.org>
> Cc: Thomas Gleixner <tglx@linutronix.de>
> Cc: Matt Fleming <matt@codeblueprint.co.uk>
> Signed-off-by: Wanpeng Li <wanpeng.li@hotmail.com>
> ---
>  kernel/sched/deadline.c | 1 +
>  1 file changed, 1 insertion(+)
> 
> diff --git a/kernel/sched/deadline.c b/kernel/sched/deadline.c
> index 99b2c33..c6db3fd 100644
> --- a/kernel/sched/deadline.c
> +++ b/kernel/sched/deadline.c
> @@ -638,6 +638,7 @@ static enum hrtimer_restart dl_task_timer(struct hrtimer *timer)
>  		lockdep_unpin_lock(&rq->lock, rf.cookie);
>  		rq = dl_task_offline_migration(rq, p);
>  		rf.cookie = lockdep_pin_lock(&rq->lock);
> +		update_rq_clock(rq);

Looks good to me.

Acked-by: Juri Lelli <juri.lelli@arm.com>

Thanks,

- Juri

Patch

diff --git a/kernel/sched/deadline.c b/kernel/sched/deadline.c
index 99b2c33..c6db3fd 100644
--- a/kernel/sched/deadline.c
+++ b/kernel/sched/deadline.c
@@ -638,6 +638,7 @@ static enum hrtimer_restart dl_task_timer(struct hrtimer *timer)
 		lockdep_unpin_lock(&rq->lock, rf.cookie);
 		rq = dl_task_offline_migration(rq, p);
 		rf.cookie = lockdep_pin_lock(&rq->lock);
+		update_rq_clock(rq);
 
 		/*
 		 * Now that the task has been migrated to the new RQ and we