[v2,2/2] Xen/timer: Process softirq during dumping timer info

Message ID 1476259105-16448-3-git-send-email-tianyu.lan@intel.com (mailing list archive)
State New, archived

Commit Message

lan,Tianyu Oct. 12, 2016, 7:58 a.m. UTC
Dumping timer info may run for a long time on a large machine with
many physical CPUs. To avoid triggering the NMI watchdog, call
process_pending_softirqs() in the loop that dumps timer info.

Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Signed-off-by: Lan Tianyu <tianyu.lan@intel.com>
---
 xen/common/timer.c |    1 +
 1 files changed, 1 insertions(+), 0 deletions(-)
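
The reasoning in the commit message can be illustrated with a small user-space model. This is only a sketch under stated assumptions, not Xen code: it assumes the watchdog fires when too much work accumulates without pending deferred work being serviced, and every name in it (model_watchdog, model_work, model_process_pending, model_dump_all, WATCHDOG_LIMIT) is hypothetical.

```c
#include <stdbool.h>

/* Hypothetical limit: work units tolerated between two servicing points. */
#define WATCHDOG_LIMIT 100

/* Toy watchdog state; an assumption for illustration, not Xen's structure. */
struct model_watchdog {
    unsigned int since_service; /* work done since pending work was last serviced */
    bool tripped;               /* set once the limit has been exceeded */
};

/* Account for some work and trip the watchdog if the limit is exceeded. */
static void model_work(struct model_watchdog *wd, unsigned int units)
{
    wd->since_service += units;
    if (wd->since_service > WATCHDOG_LIMIT)
        wd->tripped = true;
}

/* Stand-in for servicing pending deferred work inside the loop. */
static void model_process_pending(struct model_watchdog *wd)
{
    wd->since_service = 0;
}

/*
 * Dump state for ncpus CPUs, each costing units_per_cpu work units.
 * Returns true if the toy watchdog tripped during the dump.
 */
static bool model_dump_all(unsigned int ncpus, unsigned int units_per_cpu,
                           bool service_each_iteration)
{
    struct model_watchdog wd = { 0, false };

    for (unsigned int i = 0; i < ncpus; i++) {
        if (service_each_iteration)
            model_process_pending(&wd);
        model_work(&wd, units_per_cpu);
    }
    return wd.tripped;
}
```

In this model, a dump over many CPUs trips the watchdog only when the per-iteration servicing call is left out, which is the effect the one-line change aims for.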

Comments

Wei Liu Oct. 21, 2016, 5:27 p.m. UTC | #1
On Wed, Oct 12, 2016 at 03:58:24PM +0800, Lan Tianyu wrote:
> Dumping timer info may run for a long time on a large machine with
> many physical CPUs. To avoid triggering the NMI watchdog, call
> process_pending_softirqs() in the loop that dumps timer info.
> 
> Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
> Signed-off-by: Lan Tianyu <tianyu.lan@intel.com>
> ---
>  xen/common/timer.c |    1 +
>  1 files changed, 1 insertions(+), 0 deletions(-)
> 
> diff --git a/xen/common/timer.c b/xen/common/timer.c
> index 29a60a9..ab6bca0 100644
> --- a/xen/common/timer.c
> +++ b/xen/common/timer.c
> @@ -530,6 +530,7 @@ static void dump_timerq(unsigned char key)
>      {
>          ts = &per_cpu(timers, i);
>  
> +        process_pending_softirqs();

This is causing issues on ARM (x86 has a similar issue):

Oct 20 01:43:31.410010 (XEN) Xen call trace:
Oct 20 01:43:31.410048 (XEN)    [<00233920>] process_pending_softirqs+0x34/0x5c (PC)
Oct 20 01:43:31.417990 (XEN)    [<00237c6c>] timer.c#dump_timerq+0x9c/0x1fc (LR)
Oct 20 01:43:31.418030 (XEN)    [<00218658>] handle_keypress+0xc0/0xf4
Oct 20 01:43:31.426001 (XEN)    [<002490c8>] console.c#__serial_rx+0x4c/0x9c
Oct 20 01:43:31.433970 (XEN)    [<00249b74>] console.c#serial_rx+0xcc/0xe4
Oct 20 01:43:31.434007 (XEN)    [<0024b6ec>] serial_rx_interrupt+0xcc/0xf8
Oct 20 01:43:31.441964 (XEN)    [<0024ae54>] exynos4210-uart.c#exynos4210_uart_interrupt+0xf8/0x160
Oct 20 01:43:31.450001 (XEN)    [<00256338>] do_IRQ+0x1a0/0x228
Oct 20 01:43:31.450040 (XEN)    [<00254074>] gic_interrupt+0x58/0xfc
Oct 20 01:43:31.457985 (XEN)    [<00260f98>] do_trap_irq+0x24/0x38
Oct 20 01:43:31.458022 (XEN)    [<00264970>] entry.o#return_from_trap+0/0x4
Oct 20 01:43:31.466010 (XEN)    [<0030a240>] 0030a240
Oct 20 01:43:31.466044 (XEN) 
Oct 20 01:43:31.466066 (XEN) 
Oct 20 01:43:31.466099 (XEN) ****************************************
Oct 20 01:43:31.473998 (XEN) Panic on CPU 0:
Oct 20 01:43:31.474029 (XEN) Assertion '!in_irq() && local_irq_is_enabled()' failed at softirq.c:57
Oct 20 01:43:31.481982 (XEN) ****************************************

See
http://logs.test-lab.xenproject.org/osstest/logs/101571/test-armhf-armhf-libvirt/serial-arndale-bluewater.log

I've reverted this patch in staging.

Wei.

>          printk("CPU%02d:\n", i);
>          spin_lock_irqsave(&ts->lock, flags);
>          for ( j = 1; j <= GET_HEAP_SIZE(ts->heap); j++ )
> -- 
> 1.7.1
>
lan,Tianyu Oct. 22, 2016, 3:52 a.m. UTC | #2
On 10/22/2016 1:27 AM, Wei Liu wrote:
> On Wed, Oct 12, 2016 at 03:58:24PM +0800, Lan Tianyu wrote:
>> Dumping timer info may run for a long time on a large machine with
>> many physical CPUs. To avoid triggering the NMI watchdog, call
>> process_pending_softirqs() in the loop that dumps timer info.
>>
>> Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
>> Signed-off-by: Lan Tianyu <tianyu.lan@intel.com>
>> ---
>>  xen/common/timer.c |    1 +
>>  1 files changed, 1 insertions(+), 0 deletions(-)
>>
>> diff --git a/xen/common/timer.c b/xen/common/timer.c
>> index 29a60a9..ab6bca0 100644
>> --- a/xen/common/timer.c
>> +++ b/xen/common/timer.c
>> @@ -530,6 +530,7 @@ static void dump_timerq(unsigned char key)
>>      {
>>          ts = &per_cpu(timers, i);
>>
>> +        process_pending_softirqs();
>
> This is causing issues on ARM (x86 has a similar issue):
>
> Oct 20 01:43:31.410010 (XEN) Xen call trace:
> Oct 20 01:43:31.410048 (XEN)    [<00233920>] process_pending_softirqs+0x34/0x5c (PC)
> Oct 20 01:43:31.417990 (XEN)    [<00237c6c>] timer.c#dump_timerq+0x9c/0x1fc (LR)
> Oct 20 01:43:31.418030 (XEN)    [<00218658>] handle_keypress+0xc0/0xf4
> Oct 20 01:43:31.426001 (XEN)    [<002490c8>] console.c#__serial_rx+0x4c/0x9c
> Oct 20 01:43:31.433970 (XEN)    [<00249b74>] console.c#serial_rx+0xcc/0xe4
> Oct 20 01:43:31.434007 (XEN)    [<0024b6ec>] serial_rx_interrupt+0xcc/0xf8
> Oct 20 01:43:31.441964 (XEN)    [<0024ae54>] exynos4210-uart.c#exynos4210_uart_interrupt+0xf8/0x160
> Oct 20 01:43:31.450001 (XEN)    [<00256338>] do_IRQ+0x1a0/0x228
> Oct 20 01:43:31.450040 (XEN)    [<00254074>] gic_interrupt+0x58/0xfc
> Oct 20 01:43:31.457985 (XEN)    [<00260f98>] do_trap_irq+0x24/0x38
> Oct 20 01:43:31.458022 (XEN)    [<00264970>] entry.o#return_from_trap+0/0x4
> Oct 20 01:43:31.466010 (XEN)    [<0030a240>] 0030a240
> Oct 20 01:43:31.466044 (XEN)
> Oct 20 01:43:31.466066 (XEN)
> Oct 20 01:43:31.466099 (XEN) ****************************************
> Oct 20 01:43:31.473998 (XEN) Panic on CPU 0:
> Oct 20 01:43:31.474029 (XEN) Assertion '!in_irq() && local_irq_is_enabled()' failed at softirq.c:57
> Oct 20 01:43:31.481982 (XEN) ****************************************
>
> See
> http://logs.test-lab.xenproject.org/osstest/logs/101571/test-armhf-armhf-libvirt/serial-arndale-bluewater.log
>
> I've reverted this patch in staging.
>
> Wei.

dump_timerq() and other non-irq keyhandlers should not run in irq context.
I have sent out a fix patch:

https://lists.xen.org/archives/html/xen-devel/2016-10/msg01391.html
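
The failure mode in the trace above, and the idea of keeping non-irq keyhandlers out of irq context, can be modelled in user space. The sketch below is an illustration under stated assumptions and is not the content of the linked fix: the flags, the one-slot deferred queue standing in for a tasklet, and every model_* name are hypothetical. Only the checked condition, '!in_irq() && local_irq_is_enabled()', is taken from the panic message.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for hypervisor state; not Xen's real definitions. */
static bool model_in_irq;               /* true while servicing an interrupt */
static bool model_irqs_enabled = true;  /* true while local interrupts are on */

/* Same condition as the failed assertion in the trace. */
static bool softirq_processing_allowed(void)
{
    return !model_in_irq && model_irqs_enabled;
}

typedef void (*keyhandler_fn)(unsigned char key);

struct model_keyhandler {
    keyhandler_fn fn;
    bool irq_safe;  /* handler may run directly in interrupt context */
};

static int dump_runs;     /* times the dump handler ran */
static int unsafe_calls;  /* times it ran where softirq processing is forbidden */

/* Model of a handler that wants to process softirqs while dumping. */
static void model_dump_timerq(unsigned char key)
{
    (void)key;
    if (!softirq_processing_allowed())
        unsafe_calls++;  /* the real hypervisor would panic at this point */
    dump_runs++;
}

/* One-slot deferred-work queue standing in for a tasklet. */
static const struct model_keyhandler *deferred;
static unsigned char deferred_key;

/* Run irq-safe handlers immediately; defer the rest when in irq context. */
static void model_handle_keypress(const struct model_keyhandler *h,
                                  unsigned char key)
{
    assert(h != NULL && h->fn != NULL);
    if (!model_in_irq || h->irq_safe) {
        h->fn(key);
    } else {
        deferred = h;
        deferred_key = key;
    }
}

/* Run the deferred handler, if any, once we are out of irq context. */
static void model_run_deferred(void)
{
    if (deferred != NULL && !model_in_irq) {
        const struct model_keyhandler *h = deferred;

        deferred = NULL;
        h->fn(deferred_key);
    }
}

/* Simulate a keypress arriving through a serial receive interrupt. */
static void model_serial_interrupt(const struct model_keyhandler *h,
                                   unsigned char key)
{
    model_in_irq = true;
    model_irqs_enabled = false;
    model_handle_keypress(h, key);
    model_irqs_enabled = true;
    model_in_irq = false;
    model_run_deferred();
}
```

With a handler marked irq_safe = false, the model runs the dump only after the simulated interrupt returns, so the precondition holds. Marking the same handler irq_safe = true reproduces the situation in the trace, where the handler runs inside the interrupt and the condition is violated.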

Patch

diff --git a/xen/common/timer.c b/xen/common/timer.c
index 29a60a9..ab6bca0 100644
--- a/xen/common/timer.c
+++ b/xen/common/timer.c
@@ -530,6 +530,7 @@  static void dump_timerq(unsigned char key)
     {
         ts = &per_cpu(timers, i);
 
+        process_pending_softirqs();
         printk("CPU%02d:\n", i);
         spin_lock_irqsave(&ts->lock, flags);
         for ( j = 1; j <= GET_HEAP_SIZE(ts->heap); j++ )