sched/core: fix illegal RCU from offline CPUs

Message ID 20200112161752.10492-1-cai@lca.pw
State New

Commit Message

Qian Cai Jan. 12, 2020, 4:17 p.m. UTC
During CPU offlining, idle_task_exit() calls mmdrop() after idle entry
and after the subsequent call to cpuhp_report_idle_dead(). Once execution
passes the call to rcu_report_dead(), RCU ignores the CPU, so lockdep
complains when mmdrop() uses RCU through either memcg or debugobjects.
Fix it by running mmdrop() on another online CPU.

=============================
 WARNING: suspicious RCU usage
 -----------------------------
 kernel/workqueue.c:710 RCU or wq_pool_mutex should be held!

 other info that might help us debug this:

 RCU used illegally from offline CPU!
 rcu_scheduler_active = 2, debug_locks = 1
 2 locks held by swapper/37/0:
  #0: c0000000010af608 (rcu_read_lock){....}, at:
      percpu_ref_put_many+0x8/0x230
  #1: c0000000010af608 (rcu_read_lock){....}, at:
      __queue_work+0x7c/0xca0

 stack backtrace:
 Call Trace:
  dump_stack+0xf4/0x164 (unreliable)
  lockdep_rcu_suspicious+0x140/0x164
  get_work_pool+0x110/0x150
  __queue_work+0x1bc/0xca0
  queue_work_on+0x114/0x120
  css_release+0x9c/0xc0
  percpu_ref_put_many+0x204/0x230
  free_pcp_prepare+0x264/0x570
  free_unref_page+0x38/0xf0
  __mmdrop+0x21c/0x2c0
  idle_task_exit+0x170/0x1b0
  pnv_smp_cpu_kill_self+0x38/0x2e0
  cpu_die+0x48/0x64
  arch_cpu_idle_dead+0x30/0x50
  do_idle+0x2f4/0x470
  cpu_startup_entry+0x38/0x40
  start_secondary+0x7a8/0xa80
  start_secondary_resume+0x10/0x14

 =============================
 WARNING: suspicious RCU usage
 -----------------------------
 kernel/sched/core.c:562 suspicious rcu_dereference_check() usage!

 other info that might help us debug this:

 RCU used illegally from offline CPU!
 rcu_scheduler_active = 2, debug_locks = 1
 2 locks held by swapper/94/0:
  #0: c000201cc77dc118 (&base->lock){-.-.}, at:
      lock_timer_base+0x114/0x1f0
  #1: c0000000010af608 (rcu_read_lock){....}, at:
      get_nohz_timer_target+0x3c/0x2d0

 stack backtrace:
 Call Trace:
  dump_stack+0xf4/0x164 (unreliable)
  lockdep_rcu_suspicious+0x140/0x164
  get_nohz_timer_target+0x248/0x2d0
  add_timer+0x24c/0x470
  __queue_delayed_work+0x8c/0x110
  queue_delayed_work_on+0x128/0x130
  __debug_check_no_obj_freed+0x2ec/0x320
  free_pcp_prepare+0x1b4/0x570
  free_unref_page+0x38/0xf0
  __mmdrop+0x21c/0x2c0
  idle_task_exit+0x170/0x1b0
  pnv_smp_cpu_kill_self+0x38/0x2e0
  cpu_die+0x48/0x64
  arch_cpu_idle_dead+0x30/0x50
  do_idle+0x2f4/0x470
  cpu_startup_entry+0x38/0x40
  start_secondary+0x7a8/0xa80
  start_secondary_prolog+0x10/0x14

Signed-off-by: Qian Cai <cai@lca.pw>
---
 kernel/sched/core.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)
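
The hunk below hands mmdrop() to smp_call_function_single() through a
cast to void (*)(void *). In ISO C, calling a function through a pointer
of an incompatible type is undefined behaviour, so a small trampoline
with the exact callback signature may be the safer shape. The following
is only a userspace sketch of that idea, not kernel code: struct
mm_struct, mmdrop() and call_function_single() are simplified,
hypothetical stand-ins, and mmdrop_remote() is a name invented for
illustration.

```c
#include <assert.h>

/* Hypothetical stand-in for the kernel's struct mm_struct. */
struct mm_struct {
	int mm_count;	/* reference count; the kernel uses an atomic_t */
	int freed;	/* set once the last reference has been dropped */
};

/* Simplified model of mmdrop(): drop one reference, free on the last. */
static void mmdrop(struct mm_struct *mm)
{
	assert(mm->mm_count > 0);
	if (--mm->mm_count == 0)
		mm->freed = 1;
}

/* Trampoline with the exact callback type, so no function cast is needed. */
static void mmdrop_remote(void *info)
{
	mmdrop(info);
}

/*
 * Hypothetical stand-in for smp_call_function_single(): it simply runs
 * func(info) inline and ignores the target CPU.
 */
static int call_function_single(int cpu, void (*func)(void *), void *info)
{
	(void)cpu;
	func(info);
	return 0;
}
```

With such a trampoline the call site would pass mmdrop_remote and mm
directly, without a cast. Whether an IPI can be sent safely from a CPU
that is already this far into the offline path is a separate question
that this sketch does not address.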

Patch

diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index 90e4b00ace89..41fb49f3dfce 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -6194,7 +6194,8 @@  void idle_task_exit(void)
 		current->active_mm = &init_mm;
 		finish_arch_post_lock_switch();
 	}
-	mmdrop(mm);
+	smp_call_function_single(cpumask_first(cpu_online_mask),
+				(void (*)(void *))mmdrop, mm, 0);
 }
 
 /*