
[v3,1/2] xen/arm: Defer request_irq on secondary CPUs after local_irq_enable

Message ID 20220507025434.1063710-2-Henry.Wang@arm.com (mailing list archive)
State New, archived
Series Adjustment after introducing ASSERT_ALLOC_CONTEXT

Commit Message

Henry Wang May 7, 2022, 2:54 a.m. UTC

With the enhanced ASSERT_ALLOC_CONTEXT, calling request_irq() before
local_irq_enable() on secondary cores will lead to

(XEN) Xen call trace:
(XEN) [<000000000021d86c>] alloc_xenheap_pages+0x74/0x194 (PC)
(XEN) [<000000000021d864>] alloc_xenheap_pages+0x6c/0x194 (LR)
(XEN) [<0000000000229e90>] xmalloc_tlsf.c#xmalloc_pool_get+0x1c/0x28
(XEN) [<000000000022a270>] xmem_pool_alloc+0x21c/0x448
(XEN) [<000000000022a8dc>] _xmalloc+0x8c/0x290
(XEN) [<000000000026b57c>] request_irq+0x40/0xb8
(XEN) [<0000000000272780>] init_timer_interrupt+0x74/0xcc
(XEN) [<000000000027212c>] start_secondary+0x1b4/0x238
(XEN) [<0000000084000200>] 0000000084000200
(XEN)
(XEN)
(XEN) ****************************************
(XEN) Panic on CPU 4:
(XEN) Assertion '!in_irq() && (local_irq_is_enabled() ||
num_online_cpus() <= 1)' failed at common/page_alloc.c:2212
(XEN) ****************************************

on systems where the xmalloc() pool is too small to satisfy the
requested size and has to be grown through alloc_xenheap_pages().

Moving the request_irq() calls past local_irq_enable() on secondary
cores makes sure the assertion condition in alloc_xenheap_pages(),
i.e. !in_irq() && local_irq_is_enabled(), is satisfied. The move is
also safe because the timer and GIC maintenance interrupts will not
be used until the CPU is fully online.
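
As an illustration only, the check can be modelled outside Xen. The
sketch below is a stand-alone C model, not hypervisor code: struct
alloc_ctx, alloc_ctx_ok() and check_alloc_ctx_cases() are hypothetical
names, and the online-CPU count of 4 is an assumed value chosen to match
a panic on CPU 4. It evaluates the full expression from the panic
message, including the num_online_cpus() <= 1 term.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical snapshot of the state the assertion inspects; not a Xen type. */
struct alloc_ctx {
    bool in_irq;          /* stands in for in_irq() */
    bool irqs_enabled;    /* stands in for local_irq_is_enabled() */
    unsigned int online;  /* stands in for num_online_cpus() */
};

/* Same boolean expression as the assertion quoted in the panic message. */
bool alloc_ctx_ok(const struct alloc_ctx *c)
{
    return !c->in_irq && (c->irqs_enabled || c->online <= 1);
}

/* Walk through three situations relevant to this patch. */
void check_alloc_ctx_cases(void)
{
    /* Boot CPU in early boot: IRQs masked, but it is the only CPU online. */
    struct alloc_ctx boot = { .in_irq = false, .irqs_enabled = false, .online = 1 };
    /* Secondary CPU before local_irq_enable(); assumed: 4 CPUs already online. */
    struct alloc_ctx early = { .in_irq = false, .irqs_enabled = false, .online = 4 };
    /* The same secondary CPU once local_irq_enable() has run. */
    struct alloc_ctx late = { .in_irq = false, .irqs_enabled = true, .online = 4 };

    assert(alloc_ctx_ok(&boot));
    assert(!alloc_ctx_ok(&early));  /* the context that triggered the panic */
    assert(alloc_ctx_ok(&late));    /* the context after the reordering */
}
```

In this model only the middle case fails, which corresponds to calling
request_irq() with interrupts still masked while other CPUs are already
online.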

Reported-by: Wei Chen <Wei.Chen@arm.com>
Suggested-by: Julien Grall <jgrall@amazon.com>
Signed-off-by: Henry Wang <Henry.Wang@arm.com>
---
v2 -> v3:
- No changes.
v1 -> v2:
- Explain why the moving of code is safe in the commit message and
add comments.
---
 xen/arch/arm/smpboot.c | 12 +++++++++---
 1 file changed, 9 insertions(+), 3 deletions(-)

Comments

Julien Grall May 15, 2022, 10:58 a.m. UTC | #1
Hi Henry,

On 07/05/2022 03:54, Henry Wang wrote:
> [...]

Reviewed-by: Julien Grall <jgrall@amazon.com>

Cheers,

Julien Grall May 16, 2022, 5:07 p.m. UTC | #2
On 15/05/2022 11:58, Julien Grall wrote:
> On 07/05/2022 03:54, Henry Wang wrote:
>> [...]
> 
> Reviewed-by: Julien Grall <jgrall@amazon.com>

I have committed this patch. The second patch will go in once 
"page_alloc: assert IRQs are enabled in heap alloc/free" has been 
re-committed.

Cheers,

Patch

diff --git a/xen/arch/arm/smpboot.c b/xen/arch/arm/smpboot.c
index 7bfd0a73a7..9bb32a301a 100644
--- a/xen/arch/arm/smpboot.c
+++ b/xen/arch/arm/smpboot.c
@@ -361,9 +361,6 @@  void start_secondary(void)
 
     init_secondary_IRQ();
 
-    init_maintenance_interrupt();
-    init_timer_interrupt();
-
     set_current(idle_vcpu[cpuid]);
 
     setup_cpu_sibling_map(cpuid);
@@ -380,6 +377,15 @@  void start_secondary(void)
     cpumask_set_cpu(cpuid, &cpu_online_map);
 
     local_irq_enable();
+
+    /*
+     * Calling request_irq() after local_irq_enable() on secondary cores
+     * will make sure the assertion condition in alloc_xenheap_pages(),
+     * i.e. !in_irq() && local_irq_is_enabled(), is satisfied.
+     */
+    init_maintenance_interrupt();
+    init_timer_interrupt();
+
     local_abort_enable();
 
     check_local_cpu_errata();
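
The effect of the reordering can be sketched with a small stand-alone C
model. This is an illustration under stated assumptions, not Xen code:
model_request_irq() and model_start_secondary() are hypothetical names,
each init_*_interrupt() call is reduced to a single modelled
request_irq(), and every modelled request_irq() is assumed to reach the
page allocator (in Xen that only happens when the xmalloc() pool has to
grow).

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model state; none of this is Xen code. */
static bool irqs_enabled;
static unsigned int online_cpus;
static unsigned int bad_allocs;

/* Counts allocations made in a context the assertion would reject. */
static void model_request_irq(void)
{
    if ( !(irqs_enabled || online_cpus <= 1) )
        bad_allocs++;
}

/*
 * Model one secondary CPU bring-up and return how many allocations
 * would have tripped the assertion. 'already_online' is the number of
 * CPUs online before this one (at least the boot CPU).
 */
unsigned int model_start_secondary(bool patched, unsigned int already_online)
{
    assert(already_online >= 1);

    irqs_enabled = false;
    online_cpus = already_online;
    bad_allocs = 0;

    if ( !patched )
    {
        model_request_irq();    /* init_maintenance_interrupt() */
        model_request_irq();    /* init_timer_interrupt() */
    }

    online_cpus++;              /* cpumask_set_cpu(cpuid, &cpu_online_map) */
    irqs_enabled = true;        /* local_irq_enable() */

    if ( patched )
    {
        model_request_irq();    /* init_maintenance_interrupt() */
        model_request_irq();    /* init_timer_interrupt() */
    }

    return bad_allocs;
}
```

With four CPUs already online, the old ordering yields two rejected
allocations in this model and the new ordering yields none. With only
the boot CPU online, the old ordering also passes, because of the
num_online_cpus() <= 1 term in the assertion.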