Message ID | 1488865888-15894-1-git-send-email-wanpeng.li@hotmail.com (mailing list archive) |
---|---|
State | New, archived |
On Mon, 06 Mar, at 09:51:28PM, Wanpeng Li wrote:
> From: Wanpeng Li <wanpeng.li@hotmail.com>
>
> The following warning can be triggered by hot-unplugging the CPU
> on which an active SCHED_DEADLINE task is running:
>
> ------------[ cut here ]------------
> WARNING: CPU: 7 PID: 0 at kernel/sched/sched.h:833 replenish_dl_entity+0x71e/0xc40
> rq->clock_update_flags < RQCF_ACT_SKIP
> CPU: 7 PID: 0 Comm: swapper/7 Tainted: G B 4.11.0-rc1+ #24

[...]

> diff --git a/kernel/sched/deadline.c b/kernel/sched/deadline.c
> index 99b2c33..c6db3fd 100644
> --- a/kernel/sched/deadline.c
> +++ b/kernel/sched/deadline.c
> @@ -638,6 +638,7 @@ static enum hrtimer_restart dl_task_timer(struct hrtimer *timer)
>  		lockdep_unpin_lock(&rq->lock, rf.cookie);
>  		rq = dl_task_offline_migration(rq, p);
>  		rf.cookie = lockdep_pin_lock(&rq->lock);
> +		update_rq_clock(rq);
>
>  		/*
>  		 * Now that the task has been migrated to the new RQ and we

Yeah, I guess the reason we can't use the rq_repin_lock() function is
because of all the DL double rq locking going on inside of
dl_task_offline_migration().

I'd definitely like someone else to verify, but this looks OK to me.

Reviewed-by: Matt Fleming <matt@codeblueprint.co.uk>
Ping, :)

2017-03-07 13:51 GMT+08:00 Wanpeng Li <kernellwp@gmail.com>:
> From: Wanpeng Li <wanpeng.li@hotmail.com>
>
> The following warning can be triggered by hot-unplugging the CPU
> on which an active SCHED_DEADLINE task is running:
>
> ------------[ cut here ]------------
> WARNING: CPU: 7 PID: 0 at kernel/sched/sched.h:833 replenish_dl_entity+0x71e/0xc40
> rq->clock_update_flags < RQCF_ACT_SKIP
> CPU: 7 PID: 0 Comm: swapper/7 Tainted: G B 4.11.0-rc1+ #24
> Hardware name: LENOVO ThinkCentre M8500t-N000/SHARKBAY, BIOS FBKTC1AUS 02/16/2016
> Call Trace:
>  <IRQ>
>  dump_stack+0x85/0xc4
>  __warn+0x172/0x1b0
>  warn_slowpath_fmt+0xb4/0xf0
>  ? __warn+0x1b0/0x1b0
>  ? debug_check_no_locks_freed+0x2c0/0x2c0
>  ? cpudl_set+0x3d/0x2b0
>  replenish_dl_entity+0x71e/0xc40
>  enqueue_task_dl+0x2ea/0x12e0
>  ? dl_task_timer+0x777/0x990
>  ? __hrtimer_run_queues+0x270/0xa50
>  dl_task_timer+0x316/0x990
>  ? enqueue_task_dl+0x12e0/0x12e0
>  ? enqueue_task_dl+0x12e0/0x12e0
>  __hrtimer_run_queues+0x270/0xa50
>  ? hrtimer_cancel+0x20/0x20
>  ? hrtimer_interrupt+0x119/0x600
>  hrtimer_interrupt+0x19c/0x600
>  ? trace_hardirqs_off+0xd/0x10
>  local_apic_timer_interrupt+0x74/0xe0
>  smp_apic_timer_interrupt+0x76/0xa0
>  apic_timer_interrupt+0x93/0xa0
>
> The DL task will be migrated to a suitable later-deadline rq once the DL
> timer fires and the current rq is offline. The rq clock of the new rq
> should be updated. This patch fixes it by updating the rq clock after
> taking the new rq's lock.
>
> Cc: Juri Lelli <juri.lelli@arm.com>
> Cc: Peter Zijlstra <peterz@infradead.org>
> Cc: Ingo Molnar <mingo@kernel.org>
> Cc: Thomas Gleixner <tglx@linutronix.de>
> Cc: Matt Fleming <matt@codeblueprint.co.uk>
> Signed-off-by: Wanpeng Li <wanpeng.li@hotmail.com>
> ---
>  kernel/sched/deadline.c | 1 +
>  1 file changed, 1 insertion(+)
>
> diff --git a/kernel/sched/deadline.c b/kernel/sched/deadline.c
> index 99b2c33..c6db3fd 100644
> --- a/kernel/sched/deadline.c
> +++ b/kernel/sched/deadline.c
> @@ -638,6 +638,7 @@ static enum hrtimer_restart dl_task_timer(struct hrtimer *timer)
>  		lockdep_unpin_lock(&rq->lock, rf.cookie);
>  		rq = dl_task_offline_migration(rq, p);
>  		rf.cookie = lockdep_pin_lock(&rq->lock);
> +		update_rq_clock(rq);
>
>  		/*
>  		 * Now that the task has been migrated to the new RQ and we
> --
> 2.7.4
>
On 03/15/2017 08:53 AM, Wanpeng Li wrote:
> Ping, :)
> 2017-03-07 13:51 GMT+08:00 Wanpeng Li <kernellwp@gmail.com>:
>> From: Wanpeng Li <wanpeng.li@hotmail.com>
>>
>> The following warning can be triggered by hot-unplugging the CPU
>> on which an active SCHED_DEADLINE task is running:
>>
>> ------------[ cut here ]------------
>> WARNING: CPU: 7 PID: 0 at kernel/sched/sched.h:833 replenish_dl_entity+0x71e/0xc40
>> rq->clock_update_flags < RQCF_ACT_SKIP
>> CPU: 7 PID: 0 Comm: swapper/7 Tainted: G B 4.11.0-rc1+ #24
>> Hardware name: LENOVO ThinkCentre M8500t-N000/SHARKBAY, BIOS FBKTC1AUS 02/16/2016
>> Call Trace:
>>  <IRQ>
>>  dump_stack+0x85/0xc4
>>  __warn+0x172/0x1b0
>>  warn_slowpath_fmt+0xb4/0xf0
>>  ? __warn+0x1b0/0x1b0
>>  ? debug_check_no_locks_freed+0x2c0/0x2c0
>>  ? cpudl_set+0x3d/0x2b0
>>  replenish_dl_entity+0x71e/0xc40
>>  enqueue_task_dl+0x2ea/0x12e0
>>  ? dl_task_timer+0x777/0x990
>>  ? __hrtimer_run_queues+0x270/0xa50
>>  dl_task_timer+0x316/0x990
>>  ? enqueue_task_dl+0x12e0/0x12e0
>>  ? enqueue_task_dl+0x12e0/0x12e0
>>  __hrtimer_run_queues+0x270/0xa50
>>  ? hrtimer_cancel+0x20/0x20
>>  ? hrtimer_interrupt+0x119/0x600
>>  hrtimer_interrupt+0x19c/0x600
>>  ? trace_hardirqs_off+0xd/0x10
>>  local_apic_timer_interrupt+0x74/0xe0
>>  smp_apic_timer_interrupt+0x76/0xa0
>>  apic_timer_interrupt+0x93/0xa0
>>
>> The DL task will be migrated to a suitable later-deadline rq once the DL
>> timer fires and the current rq is offline. The rq clock of the new rq
>> should be updated. This patch fixes it by updating the rq clock after
>> taking the new rq's lock.
>>
>> Cc: Juri Lelli <juri.lelli@arm.com>
>> Cc: Peter Zijlstra <peterz@infradead.org>
>> Cc: Ingo Molnar <mingo@kernel.org>
>> Cc: Thomas Gleixner <tglx@linutronix.de>
>> Cc: Matt Fleming <matt@codeblueprint.co.uk>
>> Signed-off-by: Wanpeng Li <wanpeng.li@hotmail.com>

Reviewed-by: Daniel Bristot de Oliveira <bristot@redhat.com>

-- 
Daniel
Hi,

On 06/03/17 21:51, Wanpeng Li wrote:
> From: Wanpeng Li <wanpeng.li@hotmail.com>
>
> The following warning can be triggered by hot-unplugging the CPU
> on which an active SCHED_DEADLINE task is running:
>
> ------------[ cut here ]------------
> WARNING: CPU: 7 PID: 0 at kernel/sched/sched.h:833 replenish_dl_entity+0x71e/0xc40
> rq->clock_update_flags < RQCF_ACT_SKIP
> CPU: 7 PID: 0 Comm: swapper/7 Tainted: G B 4.11.0-rc1+ #24
> Hardware name: LENOVO ThinkCentre M8500t-N000/SHARKBAY, BIOS FBKTC1AUS 02/16/2016
> Call Trace:
>  <IRQ>
>  dump_stack+0x85/0xc4
>  __warn+0x172/0x1b0
>  warn_slowpath_fmt+0xb4/0xf0
>  ? __warn+0x1b0/0x1b0
>  ? debug_check_no_locks_freed+0x2c0/0x2c0
>  ? cpudl_set+0x3d/0x2b0
>  replenish_dl_entity+0x71e/0xc40
>  enqueue_task_dl+0x2ea/0x12e0
>  ? dl_task_timer+0x777/0x990
>  ? __hrtimer_run_queues+0x270/0xa50
>  dl_task_timer+0x316/0x990
>  ? enqueue_task_dl+0x12e0/0x12e0
>  ? enqueue_task_dl+0x12e0/0x12e0
>  __hrtimer_run_queues+0x270/0xa50
>  ? hrtimer_cancel+0x20/0x20
>  ? hrtimer_interrupt+0x119/0x600
>  hrtimer_interrupt+0x19c/0x600
>  ? trace_hardirqs_off+0xd/0x10
>  local_apic_timer_interrupt+0x74/0xe0
>  smp_apic_timer_interrupt+0x76/0xa0
>  apic_timer_interrupt+0x93/0xa0
>
> The DL task will be migrated to a suitable later-deadline rq once the DL
> timer fires and the current rq is offline. The rq clock of the new rq
> should be updated. This patch fixes it by updating the rq clock after
> taking the new rq's lock.
>
> Cc: Juri Lelli <juri.lelli@arm.com>
> Cc: Peter Zijlstra <peterz@infradead.org>
> Cc: Ingo Molnar <mingo@kernel.org>
> Cc: Thomas Gleixner <tglx@linutronix.de>
> Cc: Matt Fleming <matt@codeblueprint.co.uk>
> Signed-off-by: Wanpeng Li <wanpeng.li@hotmail.com>
> ---
>  kernel/sched/deadline.c | 1 +
>  1 file changed, 1 insertion(+)
>
> diff --git a/kernel/sched/deadline.c b/kernel/sched/deadline.c
> index 99b2c33..c6db3fd 100644
> --- a/kernel/sched/deadline.c
> +++ b/kernel/sched/deadline.c
> @@ -638,6 +638,7 @@ static enum hrtimer_restart dl_task_timer(struct hrtimer *timer)
>  		lockdep_unpin_lock(&rq->lock, rf.cookie);
>  		rq = dl_task_offline_migration(rq, p);
>  		rf.cookie = lockdep_pin_lock(&rq->lock);
> +		update_rq_clock(rq);

Looks good to me.

Acked-by: Juri Lelli <juri.lelli@arm.com>

Thanks,

- Juri
diff --git a/kernel/sched/deadline.c b/kernel/sched/deadline.c
index 99b2c33..c6db3fd 100644
--- a/kernel/sched/deadline.c
+++ b/kernel/sched/deadline.c
@@ -638,6 +638,7 @@ static enum hrtimer_restart dl_task_timer(struct hrtimer *timer)
 		lockdep_unpin_lock(&rq->lock, rf.cookie);
 		rq = dl_task_offline_migration(rq, p);
 		rf.cookie = lockdep_pin_lock(&rq->lock);
+		update_rq_clock(rq);
 
 		/*
 		 * Now that the task has been migrated to the new RQ and we