diff mbox series

arm64: paravirt: Disable IRQs during stolen_time_cpu_down_prepare

Message ID 20220420204417.155194-1-quic_eberman@quicinc.com (mailing list archive)
State New, archived
Headers show
Series arm64: paravirt: Disable IRQs during stolen_time_cpu_down_prepare | expand

Commit Message

Elliot Berman April 20, 2022, 8:44 p.m. UTC
From: Prakruthi Deepak Heragu <quic_pheragu@quicinc.com>

During hotplug, the stolen time data structure is unmapped and memset.
There is a possibility of the timer IRQ being triggered before memset
and stolen time is getting updated as part of this timer IRQ handler. This
causes the below crash in timer handler -

  [ 3457.473139][    C5] Unable to handle kernel paging request at virtual address ffffffc03df05148
  ...
  [ 3458.154398][    C5] Call trace:
  [ 3458.157648][    C5]  para_steal_clock+0x30/0x50
  [ 3458.162319][    C5]  irqtime_account_process_tick+0x30/0x194
  [ 3458.168148][    C5]  account_process_tick+0x3c/0x280
  [ 3458.173274][    C5]  update_process_times+0x5c/0xf4
  [ 3458.178311][    C5]  tick_sched_timer+0x180/0x384
  [ 3458.183164][    C5]  __run_hrtimer+0x160/0x57c
  [ 3458.187744][    C5]  hrtimer_interrupt+0x258/0x684
  [ 3458.192698][    C5]  arch_timer_handler_virt+0x5c/0xa0
  [ 3458.198002][    C5]  handle_percpu_devid_irq+0xdc/0x414
  [ 3458.203385][    C5]  handle_domain_irq+0xa8/0x168
  [ 3458.208241][    C5]  gic_handle_irq.34493+0x54/0x244
  [ 3458.213359][    C5]  call_on_irq_stack+0x40/0x70
  [ 3458.218125][    C5]  do_interrupt_handler+0x60/0x9c
  [ 3458.223156][    C5]  el1_interrupt+0x34/0x64
  [ 3458.227560][    C5]  el1h_64_irq_handler+0x1c/0x2c
  [ 3458.232503][    C5]  el1h_64_irq+0x7c/0x80
  [ 3458.236736][    C5]  free_vmap_area_noflush+0x108/0x39c
  [ 3458.242126][    C5]  remove_vm_area+0xbc/0x118
  [ 3458.246714][    C5]  vm_remove_mappings+0x48/0x2a4
  [ 3458.251656][    C5]  __vunmap+0x154/0x278
  [ 3458.255796][    C5]  stolen_time_cpu_down_prepare+0xc0/0xd8
  [ 3458.261542][    C5]  cpuhp_invoke_callback+0x248/0xc34
  [ 3458.266842][    C5]  cpuhp_thread_fun+0x1c4/0x248
  [ 3458.271696][    C5]  smpboot_thread_fn+0x1b0/0x400
  [ 3458.276638][    C5]  kthread+0x17c/0x1e0
  [ 3458.280691][    C5]  ret_from_fork+0x10/0x20

As a fix, disable the IRQs during hotplug until we unmap and memset the
stolen time structure.

Signed-off-by: Prakruthi Deepak Heragu <quic_pheragu@quicinc.com>
Signed-off-by: Elliot Berman <quic_eberman@quicinc.com>
---
 arch/arm64/kernel/paravirt.c | 3 +++
 1 file changed, 3 insertions(+)

Comments

Jürgen Groß April 21, 2022, 7:44 a.m. UTC | #1
On 20.04.22 22:44, Elliot Berman wrote:
> From: Prakruthi Deepak Heragu <quic_pheragu@quicinc.com>
> 
> During hotplug, the stolen time data structure is unmapped and memset.
> There is a possibility of the timer IRQ being triggered before memset
> and stolen time is getting updated as part of this timer IRQ handler. This
> causes the below crash in timer handler -
> 
>    [ 3457.473139][    C5] Unable to handle kernel paging request at virtual address ffffffc03df05148
>    ...
>    [ 3458.154398][    C5] Call trace:
>    [ 3458.157648][    C5]  para_steal_clock+0x30/0x50
>    [ 3458.162319][    C5]  irqtime_account_process_tick+0x30/0x194
>    [ 3458.168148][    C5]  account_process_tick+0x3c/0x280
>    [ 3458.173274][    C5]  update_process_times+0x5c/0xf4
>    [ 3458.178311][    C5]  tick_sched_timer+0x180/0x384
>    [ 3458.183164][    C5]  __run_hrtimer+0x160/0x57c
>    [ 3458.187744][    C5]  hrtimer_interrupt+0x258/0x684
>    [ 3458.192698][    C5]  arch_timer_handler_virt+0x5c/0xa0
>    [ 3458.198002][    C5]  handle_percpu_devid_irq+0xdc/0x414
>    [ 3458.203385][    C5]  handle_domain_irq+0xa8/0x168
>    [ 3458.208241][    C5]  gic_handle_irq.34493+0x54/0x244
>    [ 3458.213359][    C5]  call_on_irq_stack+0x40/0x70
>    [ 3458.218125][    C5]  do_interrupt_handler+0x60/0x9c
>    [ 3458.223156][    C5]  el1_interrupt+0x34/0x64
>    [ 3458.227560][    C5]  el1h_64_irq_handler+0x1c/0x2c
>    [ 3458.232503][    C5]  el1h_64_irq+0x7c/0x80
>    [ 3458.236736][    C5]  free_vmap_area_noflush+0x108/0x39c
>    [ 3458.242126][    C5]  remove_vm_area+0xbc/0x118
>    [ 3458.246714][    C5]  vm_remove_mappings+0x48/0x2a4
>    [ 3458.251656][    C5]  __vunmap+0x154/0x278
>    [ 3458.255796][    C5]  stolen_time_cpu_down_prepare+0xc0/0xd8
>    [ 3458.261542][    C5]  cpuhp_invoke_callback+0x248/0xc34
>    [ 3458.266842][    C5]  cpuhp_thread_fun+0x1c4/0x248
>    [ 3458.271696][    C5]  smpboot_thread_fn+0x1b0/0x400
>    [ 3458.276638][    C5]  kthread+0x17c/0x1e0
>    [ 3458.280691][    C5]  ret_from_fork+0x10/0x20
> 
> As a fix, disable the IRQs during hotplug until we unmap and memset the
> stolen time structure.

This will work for the call chain of your observed crash, but are
you sure that para_steal_clock() can't be called from another cpu
concurrently?

In case you verified this can't happen, you can add my:

Reviewed-by: Juergen Gross <jgross@suse.com>

Otherwise you either need to use RCU for doing the memunmap(), or a
lock to protect stolen_time_region.


Juergen
Will Deacon April 21, 2022, 9:10 a.m. UTC | #2
On Thu, Apr 21, 2022 at 09:44:28AM +0200, Juergen Gross wrote:
> On 20.04.22 22:44, Elliot Berman wrote:
> > From: Prakruthi Deepak Heragu <quic_pheragu@quicinc.com>
> > 
> > During hotplug, the stolen time data structure is unmapped and memset.
> > There is a possibility of the timer IRQ being triggered before memset
> > and stolen time is getting updated as part of this timer IRQ handler. This
> > causes the below crash in timer handler -
> > 
> >    [ 3457.473139][    C5] Unable to handle kernel paging request at virtual address ffffffc03df05148
> >    ...
> >    [ 3458.154398][    C5] Call trace:
> >    [ 3458.157648][    C5]  para_steal_clock+0x30/0x50
> >    [ 3458.162319][    C5]  irqtime_account_process_tick+0x30/0x194
> >    [ 3458.168148][    C5]  account_process_tick+0x3c/0x280
> >    [ 3458.173274][    C5]  update_process_times+0x5c/0xf4
> >    [ 3458.178311][    C5]  tick_sched_timer+0x180/0x384
> >    [ 3458.183164][    C5]  __run_hrtimer+0x160/0x57c
> >    [ 3458.187744][    C5]  hrtimer_interrupt+0x258/0x684
> >    [ 3458.192698][    C5]  arch_timer_handler_virt+0x5c/0xa0
> >    [ 3458.198002][    C5]  handle_percpu_devid_irq+0xdc/0x414
> >    [ 3458.203385][    C5]  handle_domain_irq+0xa8/0x168
> >    [ 3458.208241][    C5]  gic_handle_irq.34493+0x54/0x244
> >    [ 3458.213359][    C5]  call_on_irq_stack+0x40/0x70
> >    [ 3458.218125][    C5]  do_interrupt_handler+0x60/0x9c
> >    [ 3458.223156][    C5]  el1_interrupt+0x34/0x64
> >    [ 3458.227560][    C5]  el1h_64_irq_handler+0x1c/0x2c
> >    [ 3458.232503][    C5]  el1h_64_irq+0x7c/0x80
> >    [ 3458.236736][    C5]  free_vmap_area_noflush+0x108/0x39c
> >    [ 3458.242126][    C5]  remove_vm_area+0xbc/0x118
> >    [ 3458.246714][    C5]  vm_remove_mappings+0x48/0x2a4
> >    [ 3458.251656][    C5]  __vunmap+0x154/0x278
> >    [ 3458.255796][    C5]  stolen_time_cpu_down_prepare+0xc0/0xd8
> >    [ 3458.261542][    C5]  cpuhp_invoke_callback+0x248/0xc34
> >    [ 3458.266842][    C5]  cpuhp_thread_fun+0x1c4/0x248
> >    [ 3458.271696][    C5]  smpboot_thread_fn+0x1b0/0x400
> >    [ 3458.276638][    C5]  kthread+0x17c/0x1e0
> >    [ 3458.280691][    C5]  ret_from_fork+0x10/0x20
> > 
> > As a fix, disable the IRQs during hotplug until we unmap and memset the
> > stolen time structure.
> 
> This will work for the call chain of your observed crash, but are
> you sure that para_steal_clock() can't be called from another cpu
> concurrently?

Agreed, this needs checking as update_rq_clock() is called from all over the
place.

> In case you verified this can't happen, you can add my:
> 
> Reviewed-by: Juergen Gross <jgross@suse.com>
> 
> Otherwise you either need to use RCU for doing the memunmap(), or a
> lock to protect stolen_time_region.

Yes, I think RCU would make a lot of sense here, deferring the memunmap
until there are no longer any readers. The reader is currently doing:

	if (!reg->kaddr)
		return 0;

	return le64_to_cpu(READ_ONCE(reg->kaddr->stolen_time));

so we'd also want an rcu_dereference() on reg->kaddr to avoid reading it
twice.

Will
Elliot Berman April 22, 2022, 7:03 p.m. UTC | #3
On 4/21/2022 2:10 AM, Will Deacon wrote:
> On Thu, Apr 21, 2022 at 09:44:28AM +0200, Juergen Gross wrote:
>> On 20.04.22 22:44, Elliot Berman wrote:
>>> From: Prakruthi Deepak Heragu <quic_pheragu@quicinc.com>
>>>
>>> During hotplug, the stolen time data structure is unmapped and memset.
>>> There is a possibility of the timer IRQ being triggered before memset
>>> and stolen time is getting updated as part of this timer IRQ handler. This
>>> causes the below crash in timer handler -
>>>
>>>     [ 3457.473139][    C5] Unable to handle kernel paging request at virtual address ffffffc03df05148
>>>     ...
>>>     [ 3458.154398][    C5] Call trace:
>>>     [ 3458.157648][    C5]  para_steal_clock+0x30/0x50
>>>     [ 3458.162319][    C5]  irqtime_account_process_tick+0x30/0x194
>>>     [ 3458.168148][    C5]  account_process_tick+0x3c/0x280
>>>     [ 3458.173274][    C5]  update_process_times+0x5c/0xf4
>>>     [ 3458.178311][    C5]  tick_sched_timer+0x180/0x384
>>>     [ 3458.183164][    C5]  __run_hrtimer+0x160/0x57c
>>>     [ 3458.187744][    C5]  hrtimer_interrupt+0x258/0x684
>>>     [ 3458.192698][    C5]  arch_timer_handler_virt+0x5c/0xa0
>>>     [ 3458.198002][    C5]  handle_percpu_devid_irq+0xdc/0x414
>>>     [ 3458.203385][    C5]  handle_domain_irq+0xa8/0x168
>>>     [ 3458.208241][    C5]  gic_handle_irq.34493+0x54/0x244
>>>     [ 3458.213359][    C5]  call_on_irq_stack+0x40/0x70
>>>     [ 3458.218125][    C5]  do_interrupt_handler+0x60/0x9c
>>>     [ 3458.223156][    C5]  el1_interrupt+0x34/0x64
>>>     [ 3458.227560][    C5]  el1h_64_irq_handler+0x1c/0x2c
>>>     [ 3458.232503][    C5]  el1h_64_irq+0x7c/0x80
>>>     [ 3458.236736][    C5]  free_vmap_area_noflush+0x108/0x39c
>>>     [ 3458.242126][    C5]  remove_vm_area+0xbc/0x118
>>>     [ 3458.246714][    C5]  vm_remove_mappings+0x48/0x2a4
>>>     [ 3458.251656][    C5]  __vunmap+0x154/0x278
>>>     [ 3458.255796][    C5]  stolen_time_cpu_down_prepare+0xc0/0xd8
>>>     [ 3458.261542][    C5]  cpuhp_invoke_callback+0x248/0xc34
>>>     [ 3458.266842][    C5]  cpuhp_thread_fun+0x1c4/0x248
>>>     [ 3458.271696][    C5]  smpboot_thread_fn+0x1b0/0x400
>>>     [ 3458.276638][    C5]  kthread+0x17c/0x1e0
>>>     [ 3458.280691][    C5]  ret_from_fork+0x10/0x20
>>>
>>> As a fix, disable the IRQs during hotplug until we unmap and memset the
>>> stolen time structure.
>>
>> This will work for the call chain of your observed crash, but are
>> you sure that para_steal_clock() can't be called from another cpu
>> concurrently?
> 
> Agreed, this needs checking as update_rq_clock() is called from all over the
> place.
> 
>> In case you verified this can't happen, you can add my:
>>
>> Reviewed-by: Juergen Gross <jgross@suse.com>
>>
>> Otherwise you either need to use RCU for doing the memunmap(), or a
>> lock to protect stolen_time_region.
> 
> Yes, I think RCU would make a lot of sense here, deferring the memunmap
> until there are no longer any readers. The reader is currently doing:
> 
> 	if (!reg->kaddr)
> 		return 0;
> 
> 	return le64_to_cpu(READ_ONCE(reg->kaddr->stolen_time));
> 
> so we'd also want an rcu_dereference() on reg->kaddr to avoid reading it
> twice.
> 
> Will


We have seen this particular stack many times in our testing, but not 
any other way. Will's comments are agreeable, I'll send out an updated 
patch.

Thanks,
Elliot
diff mbox series

Patch

diff --git a/arch/arm64/kernel/paravirt.c b/arch/arm64/kernel/paravirt.c
index 75fed4460407..fc05a08557e0 100644
--- a/arch/arm64/kernel/paravirt.c
+++ b/arch/arm64/kernel/paravirt.c
@@ -70,13 +70,16 @@  static u64 para_steal_clock(int cpu)
 static int stolen_time_cpu_down_prepare(unsigned int cpu)
 {
 	struct pv_time_stolen_time_region *reg;
+	unsigned long flags;
 
 	reg = this_cpu_ptr(&stolen_time_region);
 	if (!reg->kaddr)
 		return 0;
 
+	local_irq_save(flags);
 	memunmap(reg->kaddr);
 	memset(reg, 0, sizeof(*reg));
+	local_irq_restore(flags);
 
 	return 0;
 }