diff mbox series

[2/2] xen/x86: ioapic: Bail out from timer_irq_works() as soon as possible

Message ID 20230818134441.45586-3-julien@xen.org (mailing list archive)
State New, archived
Headers show
Series xen/x86: Optimize timer_irq_works() | expand

Commit Message

Julien Grall Aug. 18, 2023, 1:44 p.m. UTC
From: Julien Grall <jgrall@amazon.com>

Currently timer_irq_works() will wait the full 100ms before checking
that pit0_ticks has been incremented at least 4 times.

However, the bulk of the BIOS/platform should not have a buggy timer.
So waiting for the full 100ms is a bit harsh.

Rework the logic to only wait until 100ms passed or we saw more than
4 ticks. So now, in the good case, this will reduce the wait time
to ~50ms.

Signed-off-by: Julien Grall <jgrall@amazon.com>
---
 xen/arch/x86/io_apic.c | 30 +++++++++++++++++++-----------
 1 file changed, 19 insertions(+), 11 deletions(-)

Comments

Jan Beulich Sept. 7, 2023, 2:28 p.m. UTC | #1
On 18.08.2023 15:44, Julien Grall wrote:
> From: Julien Grall <jgrall@amazon.com>
> 
> Currently timer_irq_works() will wait the full 100ms before checking
> that pit0_ticks has been incremented at least 4 times.
> 
> However, the bulk of the BIOS/platform should not have a buggy timer.
> So waiting for the full 100ms is a bit harsh.
> 
> Rework the logic to only wait until 100ms passed or we saw more than
> 4 ticks. So now, in the good case, this will reduce the wait time
> to ~50ms.
> 
> Signed-off-by: Julien Grall <jgrall@amazon.com>

In principle this is all fine. There's a secondary aspect though which
may call for a slight rework of the patch.

> --- a/xen/arch/x86/io_apic.c
> +++ b/xen/arch/x86/io_apic.c
> @@ -1509,6 +1509,8 @@ static void __init setup_ioapic_ids_from_mpc(void)
>  static int __init timer_irq_works(void)
>  {
>      unsigned long t1, flags;
> +    /* Wait for maximum 10 ticks */
> +    unsigned long msec = (10 * 1000) / HZ;

(Minor remark: I don't think this needs to be unsigned long; unsigned
in will suffice.)

> @@ -1517,19 +1519,25 @@ static int __init timer_irq_works(void)
>  
>      local_save_flags(flags);
>      local_irq_enable();
> -    /* Let ten ticks pass... */
> -    mdelay((10 * 1000) / HZ);
> -    local_irq_restore(flags);
>  
> -    /*
> -     * Expect a few ticks at least, to be sure some possible
> -     * glue logic does not lock up after one or two first
> -     * ticks in a non-ExtINT mode.  Also the local APIC
> -     * might have cached one ExtINT interrupt.  Finally, at
> -     * least one tick may be lost due to delays.
> -     */
> -    if ( (ACCESS_ONCE(pit0_ticks) - t1) > 4 )
> +    while ( msec-- )
> +    {
> +        mdelay(1);
> +        /*
> +         * Expect a few ticks at least, to be sure some possible
> +         * glue logic does not lock up after one or two first
> +         * ticks in a non-ExtINT mode.  Also the local APIC
> +         * might have cached one ExtINT interrupt.  Finally, at
> +         * least one tick may be lost due to delays.
> +         */
> +        if ( (ACCESS_ONCE(pit0_ticks) - t1) <= 4 )
> +            continue;
> +
> +        local_irq_restore(flags);
>          return 1;
> +    }
> +
> +    local_irq_restore(flags);
>  
>      return 0;
>  }

While Andrew has a patch pending (not sure why it didn't go in yet)
to simplify local_irq_restore(), and while further it shouldn't really
need using here (local_irq_disable() ought to be fine), I can see that
you don't want to make such an adjustment here. But then I'd prefer if
we got away with just a single instance, adjusting the final return
statement accordingly (easiest would likely be to go from the value of
"msec").

Jan
Julien Grall Sept. 15, 2023, 2 p.m. UTC | #2
Hi Jan,

On 07/09/2023 15:28, Jan Beulich wrote:
> On 18.08.2023 15:44, Julien Grall wrote:
>> From: Julien Grall <jgrall@amazon.com>
>>
>> Currently timer_irq_works() will wait the full 100ms before checking
>> that pit0_ticks has been incremented at least 4 times.
>>
>> However, the bulk of the BIOS/platform should not have a buggy timer.
>> So waiting for the full 100ms is a bit harsh.
>>
>> Rework the logic to only wait until 100ms passed or we saw more than
>> 4 ticks. So now, in the good case, this will reduce the wait time
>> to ~50ms.
>>
>> Signed-off-by: Julien Grall <jgrall@amazon.com>
> 
> In principle this is all fine. There's a secondary aspect though which
> may call for a slight rework of the patch.
> 
>> --- a/xen/arch/x86/io_apic.c
>> +++ b/xen/arch/x86/io_apic.c
>> @@ -1509,6 +1509,8 @@ static void __init setup_ioapic_ids_from_mpc(void)
>>   static int __init timer_irq_works(void)
>>   {
>>       unsigned long t1, flags;
>> +    /* Wait for maximum 10 ticks */
>> +    unsigned long msec = (10 * 1000) / HZ;
> 
> (Minor remark: I don't think this needs to be unsigned long; unsigned
> in will suffice.)

You are right. I can switch to unsigned int.

> 
>> @@ -1517,19 +1519,25 @@ static int __init timer_irq_works(void)
>>   
>>       local_save_flags(flags);
>>       local_irq_enable();
>> -    /* Let ten ticks pass... */
>> -    mdelay((10 * 1000) / HZ);
>> -    local_irq_restore(flags);
>>   
>> -    /*
>> -     * Expect a few ticks at least, to be sure some possible
>> -     * glue logic does not lock up after one or two first
>> -     * ticks in a non-ExtINT mode.  Also the local APIC
>> -     * might have cached one ExtINT interrupt.  Finally, at
>> -     * least one tick may be lost due to delays.
>> -     */
>> -    if ( (ACCESS_ONCE(pit0_ticks) - t1) > 4 )
>> +    while ( msec-- )
>> +    {
>> +        mdelay(1);
>> +        /*
>> +         * Expect a few ticks at least, to be sure some possible
>> +         * glue logic does not lock up after one or two first
>> +         * ticks in a non-ExtINT mode.  Also the local APIC
>> +         * might have cached one ExtINT interrupt.  Finally, at
>> +         * least one tick may be lost due to delays.
>> +         */
>> +        if ( (ACCESS_ONCE(pit0_ticks) - t1) <= 4 )
>> +            continue;
>> +
>> +        local_irq_restore(flags);
>>           return 1;
>> +    }
>> +
>> +    local_irq_restore(flags);
>>   
>>       return 0;
>>   }
> 
> While Andrew has a patch pending (not sure why it didn't go in yet)
> to simplify local_irq_restore(), and while further it shouldn't really
> need using here (local_irq_disable() ought to be fine)

Skimming through the code, the last call of timer_irq_works() in 
check_timer() happens after the interrupts masking state have been restored:

local_irq_restore(flags);

if ( timer_irq_works() )
   ...


So I think timer_irq_works() can be called with interrupts enabled and 
therefore we can't use local_irq_disable().

> I can see that
> you don't want to make such an adjustment here. But then I'd prefer if
> we got away with just a single instance, adjusting the final return
> statement accordingly (easiest would likely be to go from the value of
> "msec").

I was thinking to use 'msec > 0' as a condition to determine if the test 
passed. However, it would consider a failure if it tooks 10ms for the 
test to pass.

I will see if I can rework the loop.

Cheers,
Jan Beulich Sept. 18, 2023, 10:42 a.m. UTC | #3
On 15.09.2023 16:00, Julien Grall wrote:
> Hi Jan,
> 
> On 07/09/2023 15:28, Jan Beulich wrote:
>> On 18.08.2023 15:44, Julien Grall wrote:
>>> From: Julien Grall <jgrall@amazon.com>
>>>
>>> Currently timer_irq_works() will wait the full 100ms before checking
>>> that pit0_ticks has been incremented at least 4 times.
>>>
>>> However, the bulk of the BIOS/platform should not have a buggy timer.
>>> So waiting for the full 100ms is a bit harsh.
>>>
>>> Rework the logic to only wait until 100ms passed or we saw more than
>>> 4 ticks. So now, in the good case, this will reduce the wait time
>>> to ~50ms.
>>>
>>> Signed-off-by: Julien Grall <jgrall@amazon.com>
>>
>> In principle this is all fine. There's a secondary aspect though which
>> may call for a slight rework of the patch.
>>
>>> --- a/xen/arch/x86/io_apic.c
>>> +++ b/xen/arch/x86/io_apic.c
>>> @@ -1509,6 +1509,8 @@ static void __init setup_ioapic_ids_from_mpc(void)
>>>   static int __init timer_irq_works(void)
>>>   {
>>>       unsigned long t1, flags;
>>> +    /* Wait for maximum 10 ticks */
>>> +    unsigned long msec = (10 * 1000) / HZ;
>>
>> (Minor remark: I don't think this needs to be unsigned long; unsigned
>> in will suffice.)
> 
> You are right. I can switch to unsigned int.
> 
>>
>>> @@ -1517,19 +1519,25 @@ static int __init timer_irq_works(void)
>>>   
>>>       local_save_flags(flags);
>>>       local_irq_enable();
>>> -    /* Let ten ticks pass... */
>>> -    mdelay((10 * 1000) / HZ);
>>> -    local_irq_restore(flags);
>>>   
>>> -    /*
>>> -     * Expect a few ticks at least, to be sure some possible
>>> -     * glue logic does not lock up after one or two first
>>> -     * ticks in a non-ExtINT mode.  Also the local APIC
>>> -     * might have cached one ExtINT interrupt.  Finally, at
>>> -     * least one tick may be lost due to delays.
>>> -     */
>>> -    if ( (ACCESS_ONCE(pit0_ticks) - t1) > 4 )
>>> +    while ( msec-- )
>>> +    {
>>> +        mdelay(1);
>>> +        /*
>>> +         * Expect a few ticks at least, to be sure some possible
>>> +         * glue logic does not lock up after one or two first
>>> +         * ticks in a non-ExtINT mode.  Also the local APIC
>>> +         * might have cached one ExtINT interrupt.  Finally, at
>>> +         * least one tick may be lost due to delays.
>>> +         */
>>> +        if ( (ACCESS_ONCE(pit0_ticks) - t1) <= 4 )
>>> +            continue;
>>> +
>>> +        local_irq_restore(flags);
>>>           return 1;
>>> +    }
>>> +
>>> +    local_irq_restore(flags);
>>>   
>>>       return 0;
>>>   }
>>
>> While Andrew has a patch pending (not sure why it didn't go in yet)
>> to simplify local_irq_restore(), and while further it shouldn't really
>> need using here (local_irq_disable() ought to be fine)
> 
> Skimming through the code, the last call of timer_irq_works() in 
> check_timer() happens after the interrupts masking state have been restored:
> 
> local_irq_restore(flags);
> 
> if ( timer_irq_works() )
>    ...
> 
> 
> So I think timer_irq_works() can be called with interrupts enabled and 
> therefore we can't use local_irq_disable().

Hmm, yes, you're right. That's inconsistent, but dealing with that is a
separate task.

Jan
diff mbox series

Patch

diff --git a/xen/arch/x86/io_apic.c b/xen/arch/x86/io_apic.c
index 4875bb97003f..0bd774962a68 100644
--- a/xen/arch/x86/io_apic.c
+++ b/xen/arch/x86/io_apic.c
@@ -1509,6 +1509,8 @@  static void __init setup_ioapic_ids_from_mpc(void)
 static int __init timer_irq_works(void)
 {
     unsigned long t1, flags;
+    /* Wait for maximum 10 ticks */
+    unsigned long msec = (10 * 1000) / HZ;
 
     if ( no_timer_check )
         return 1;
@@ -1517,19 +1519,25 @@  static int __init timer_irq_works(void)
 
     local_save_flags(flags);
     local_irq_enable();
-    /* Let ten ticks pass... */
-    mdelay((10 * 1000) / HZ);
-    local_irq_restore(flags);
 
-    /*
-     * Expect a few ticks at least, to be sure some possible
-     * glue logic does not lock up after one or two first
-     * ticks in a non-ExtINT mode.  Also the local APIC
-     * might have cached one ExtINT interrupt.  Finally, at
-     * least one tick may be lost due to delays.
-     */
-    if ( (ACCESS_ONCE(pit0_ticks) - t1) > 4 )
+    while ( msec-- )
+    {
+        mdelay(1);
+        /*
+         * Expect a few ticks at least, to be sure some possible
+         * glue logic does not lock up after one or two first
+         * ticks in a non-ExtINT mode.  Also the local APIC
+         * might have cached one ExtINT interrupt.  Finally, at
+         * least one tick may be lost due to delays.
+         */
+        if ( (ACCESS_ONCE(pit0_ticks) - t1) <= 4 )
+            continue;
+
+        local_irq_restore(flags);
         return 1;
+    }
+
+    local_irq_restore(flags);
 
     return 0;
 }