Message ID | 20220420204417.155194-1-quic_eberman@quicinc.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | arm64: paravirt: Disable IRQs during stolen_time_cpu_down_prepare | expand |
On 20.04.22 22:44, Elliot Berman wrote: > From: Prakruthi Deepak Heragu <quic_pheragu@quicinc.com> > > During hotplug, the stolen time data structure is unmapped and memset. > There is a possibility of the timer IRQ being triggered before memset > and stolen time is getting updated as part of this timer IRQ handler. This > causes the below crash in timer handler - > > [ 3457.473139][ C5] Unable to handle kernel paging request at virtual address ffffffc03df05148 > ... > [ 3458.154398][ C5] Call trace: > [ 3458.157648][ C5] para_steal_clock+0x30/0x50 > [ 3458.162319][ C5] irqtime_account_process_tick+0x30/0x194 > [ 3458.168148][ C5] account_process_tick+0x3c/0x280 > [ 3458.173274][ C5] update_process_times+0x5c/0xf4 > [ 3458.178311][ C5] tick_sched_timer+0x180/0x384 > [ 3458.183164][ C5] __run_hrtimer+0x160/0x57c > [ 3458.187744][ C5] hrtimer_interrupt+0x258/0x684 > [ 3458.192698][ C5] arch_timer_handler_virt+0x5c/0xa0 > [ 3458.198002][ C5] handle_percpu_devid_irq+0xdc/0x414 > [ 3458.203385][ C5] handle_domain_irq+0xa8/0x168 > [ 3458.208241][ C5] gic_handle_irq.34493+0x54/0x244 > [ 3458.213359][ C5] call_on_irq_stack+0x40/0x70 > [ 3458.218125][ C5] do_interrupt_handler+0x60/0x9c > [ 3458.223156][ C5] el1_interrupt+0x34/0x64 > [ 3458.227560][ C5] el1h_64_irq_handler+0x1c/0x2c > [ 3458.232503][ C5] el1h_64_irq+0x7c/0x80 > [ 3458.236736][ C5] free_vmap_area_noflush+0x108/0x39c > [ 3458.242126][ C5] remove_vm_area+0xbc/0x118 > [ 3458.246714][ C5] vm_remove_mappings+0x48/0x2a4 > [ 3458.251656][ C5] __vunmap+0x154/0x278 > [ 3458.255796][ C5] stolen_time_cpu_down_prepare+0xc0/0xd8 > [ 3458.261542][ C5] cpuhp_invoke_callback+0x248/0xc34 > [ 3458.266842][ C5] cpuhp_thread_fun+0x1c4/0x248 > [ 3458.271696][ C5] smpboot_thread_fn+0x1b0/0x400 > [ 3458.276638][ C5] kthread+0x17c/0x1e0 > [ 3458.280691][ C5] ret_from_fork+0x10/0x20 > > As a fix, disable the IRQs during hotplug until we unmap and memset the > stolen time structure. This will work for the call chain of your observed crash, but are you sure that para_steal_clock() can't be called from another cpu concurrently? In case you verified this can't happen, you can add my: Reviewed-by: Juergen Gross <jgross@suse.com> Otherwise you either need to use RCU for doing the memunmap(), or a lock to protect stolen_time_region. Juergen
On Thu, Apr 21, 2022 at 09:44:28AM +0200, Juergen Gross wrote: > On 20.04.22 22:44, Elliot Berman wrote: > > From: Prakruthi Deepak Heragu <quic_pheragu@quicinc.com> > > > > During hotplug, the stolen time data structure is unmapped and memset. > > There is a possibility of the timer IRQ being triggered before memset > > and stolen time is getting updated as part of this timer IRQ handler. This > > causes the below crash in timer handler - > > > > [ 3457.473139][ C5] Unable to handle kernel paging request at virtual address ffffffc03df05148 > > ... > > [ 3458.154398][ C5] Call trace: > > [ 3458.157648][ C5] para_steal_clock+0x30/0x50 > > [ 3458.162319][ C5] irqtime_account_process_tick+0x30/0x194 > > [ 3458.168148][ C5] account_process_tick+0x3c/0x280 > > [ 3458.173274][ C5] update_process_times+0x5c/0xf4 > > [ 3458.178311][ C5] tick_sched_timer+0x180/0x384 > > [ 3458.183164][ C5] __run_hrtimer+0x160/0x57c > > [ 3458.187744][ C5] hrtimer_interrupt+0x258/0x684 > > [ 3458.192698][ C5] arch_timer_handler_virt+0x5c/0xa0 > > [ 3458.198002][ C5] handle_percpu_devid_irq+0xdc/0x414 > > [ 3458.203385][ C5] handle_domain_irq+0xa8/0x168 > > [ 3458.208241][ C5] gic_handle_irq.34493+0x54/0x244 > > [ 3458.213359][ C5] call_on_irq_stack+0x40/0x70 > > [ 3458.218125][ C5] do_interrupt_handler+0x60/0x9c > > [ 3458.223156][ C5] el1_interrupt+0x34/0x64 > > [ 3458.227560][ C5] el1h_64_irq_handler+0x1c/0x2c > > [ 3458.232503][ C5] el1h_64_irq+0x7c/0x80 > > [ 3458.236736][ C5] free_vmap_area_noflush+0x108/0x39c > > [ 3458.242126][ C5] remove_vm_area+0xbc/0x118 > > [ 3458.246714][ C5] vm_remove_mappings+0x48/0x2a4 > > [ 3458.251656][ C5] __vunmap+0x154/0x278 > > [ 3458.255796][ C5] stolen_time_cpu_down_prepare+0xc0/0xd8 > > [ 3458.261542][ C5] cpuhp_invoke_callback+0x248/0xc34 > > [ 3458.266842][ C5] cpuhp_thread_fun+0x1c4/0x248 > > [ 3458.271696][ C5] smpboot_thread_fn+0x1b0/0x400 > > [ 3458.276638][ C5] kthread+0x17c/0x1e0 > > [ 3458.280691][ C5] ret_from_fork+0x10/0x20 > > > > As a fix, disable the IRQs during hotplug until we unmap and memset the > > stolen time structure. > > This will work for the call chain of your observed crash, but are > you sure that para_steal_clock() can't be called from another cpu > concurrently? Agreed, this needs checking as update_rq_clock() is called from all over the place. > In case you verified this can't happen, you can add my: > > Reviewed-by: Juergen Gross <jgross@suse.com> > > Otherwise you either need to use RCU for doing the memunmap(), or a > lock to protect stolen_time_region. Yes, I think RCU would make a lot of sense here, deferring the memunmap until there are no longer any readers. The reader is currently doing: if (!reg->kaddr) return 0; return le64_to_cpu(READ_ONCE(reg->kaddr->stolen_time)); so we'd also want an rcu_dereference() on reg->kaddr to avoid reading it twice. Will
On 4/21/2022 2:10 AM, Will Deacon wrote: > On Thu, Apr 21, 2022 at 09:44:28AM +0200, Juergen Gross wrote: >> On 20.04.22 22:44, Elliot Berman wrote: >>> From: Prakruthi Deepak Heragu <quic_pheragu@quicinc.com> >>> >>> During hotplug, the stolen time data structure is unmapped and memset. >>> There is a possibility of the timer IRQ being triggered before memset >>> and stolen time is getting updated as part of this timer IRQ handler. This >>> causes the below crash in timer handler - >>> >>> [ 3457.473139][ C5] Unable to handle kernel paging request at virtual address ffffffc03df05148 >>> ... >>> [ 3458.154398][ C5] Call trace: >>> [ 3458.157648][ C5] para_steal_clock+0x30/0x50 >>> [ 3458.162319][ C5] irqtime_account_process_tick+0x30/0x194 >>> [ 3458.168148][ C5] account_process_tick+0x3c/0x280 >>> [ 3458.173274][ C5] update_process_times+0x5c/0xf4 >>> [ 3458.178311][ C5] tick_sched_timer+0x180/0x384 >>> [ 3458.183164][ C5] __run_hrtimer+0x160/0x57c >>> [ 3458.187744][ C5] hrtimer_interrupt+0x258/0x684 >>> [ 3458.192698][ C5] arch_timer_handler_virt+0x5c/0xa0 >>> [ 3458.198002][ C5] handle_percpu_devid_irq+0xdc/0x414 >>> [ 3458.203385][ C5] handle_domain_irq+0xa8/0x168 >>> [ 3458.208241][ C5] gic_handle_irq.34493+0x54/0x244 >>> [ 3458.213359][ C5] call_on_irq_stack+0x40/0x70 >>> [ 3458.218125][ C5] do_interrupt_handler+0x60/0x9c >>> [ 3458.223156][ C5] el1_interrupt+0x34/0x64 >>> [ 3458.227560][ C5] el1h_64_irq_handler+0x1c/0x2c >>> [ 3458.232503][ C5] el1h_64_irq+0x7c/0x80 >>> [ 3458.236736][ C5] free_vmap_area_noflush+0x108/0x39c >>> [ 3458.242126][ C5] remove_vm_area+0xbc/0x118 >>> [ 3458.246714][ C5] vm_remove_mappings+0x48/0x2a4 >>> [ 3458.251656][ C5] __vunmap+0x154/0x278 >>> [ 3458.255796][ C5] stolen_time_cpu_down_prepare+0xc0/0xd8 >>> [ 3458.261542][ C5] cpuhp_invoke_callback+0x248/0xc34 >>> [ 3458.266842][ C5] cpuhp_thread_fun+0x1c4/0x248 >>> [ 3458.271696][ C5] smpboot_thread_fn+0x1b0/0x400 >>> [ 3458.276638][ C5] kthread+0x17c/0x1e0 >>> [ 3458.280691][ C5] ret_from_fork+0x10/0x20 >>> >>> As a fix, disable the IRQs during hotplug until we unmap and memset the >>> stolen time structure. >> >> This will work for the call chain of your observed crash, but are >> you sure that para_steal_clock() can't be called from another cpu >> concurrently? > > Agreed, this needs checking as update_rq_clock() is called from all over the > place. > >> In case you verified this can't happen, you can add my: >> >> Reviewed-by: Juergen Gross <jgross@suse.com> >> >> Otherwise you either need to use RCU for doing the memunmap(), or a >> lock to protect stolen_time_region. > > Yes, I think RCU would make a lot of sense here, deferring the memunmap > until there are no longer any readers. The reader is currently doing: > > if (!reg->kaddr) > return 0; > > return le64_to_cpu(READ_ONCE(reg->kaddr->stolen_time)); > > so we'd also want an rcu_dereference() on reg->kaddr to avoid reading it > twice. > > Will We have seen this particular stack many times in our testing, but not any other way. Will's comments are agreeable, I'll send out an updated patch. Thanks, Elliot
diff --git a/arch/arm64/kernel/paravirt.c b/arch/arm64/kernel/paravirt.c index 75fed4460407..fc05a08557e0 100644 --- a/arch/arm64/kernel/paravirt.c +++ b/arch/arm64/kernel/paravirt.c @@ -70,13 +70,16 @@ static u64 para_steal_clock(int cpu) static int stolen_time_cpu_down_prepare(unsigned int cpu) { struct pv_time_stolen_time_region *reg; + unsigned long flags; reg = this_cpu_ptr(&stolen_time_region); if (!reg->kaddr) return 0; + local_irq_save(flags); memunmap(reg->kaddr); memset(reg, 0, sizeof(*reg)); + local_irq_restore(flags); return 0; }