diff mbox series

[4/4] x86/APIC: restrict certain messages to BSP

Message ID 513e4f93-a8a0-ae72-abcc-aa28531eca97@suse.com (mailing list archive)
State New, archived
Headers show
Series x86: reduce errors in frequency calculations / unrelated cleanup | expand

Commit Message

Jan Beulich March 13, 2020, 9:26 a.m. UTC
All CPUs get an equal setting of EOI broadcast suppression; no need to
log one message per CPU, even if it's only in verbose APIC mode.

Only the BSP is eligible to possibly get ExtINT enabled; no need to log
that it gets disabled on all APs, even if - again - it's only in verbose
APIC mode.

Take the opportunity and introduce a "bsp" parameter to the function, to
stop using smp_processor_id() to tell BSP from APs.

Signed-off-by: Jan Beulich <jbeulich@suse.com>

Comments

Andrew Cooper March 13, 2020, 4:18 p.m. UTC | #1
On 13/03/2020 09:26, Jan Beulich wrote:
> All CPUs get an equal setting of EOI broadcast suppression; no need to
> log one message per CPU, even if it's only in verbose APIC mode.
>
> Only the BSP is eligible to possibly get ExtINT enabled; no need to log
> that it gets disabled on all APs, even if - again - it's only in verbose
> APIC mode.

How true is this in practice?

I know it is certainly the recommended configuration, but interrupt
remapping can definitely point ExtINT at arbitrary CPUs, and IIRC, the
MP spec simply says that "only one CPU should be configured to receive
ExtINT".

I know we definitely have bugs with ExiINT handling, but I haven't had
time to debug further.

~Andrew
Jan Beulich March 16, 2020, 9:10 a.m. UTC | #2
On 13.03.2020 17:18, Andrew Cooper wrote:
> On 13/03/2020 09:26, Jan Beulich wrote:
>> All CPUs get an equal setting of EOI broadcast suppression; no need to
>> log one message per CPU, even if it's only in verbose APIC mode.
>>
>> Only the BSP is eligible to possibly get ExtINT enabled; no need to log
>> that it gets disabled on all APs, even if - again - it's only in verbose
>> APIC mode.
> 
> How true is this in practice?

I guess you read "eligible" as "in theory", whereas it is meant as "with
how our [and in particular this] code is working right now". Even if we
decided to switch the CPU to handle ExtINT, it still wouldn't need to be
one message per CPU - it would suffice to issue the message for the one
CPU getting ExtINT enabled.

Jan

> I know it is certainly the recommended configuration, but interrupt
> remapping can definitely point ExtINT at arbitrary CPUs, and IIRC, the
> MP spec simply says that "only one CPU should be configured to receive
> ExtINT".
> 
> I know we definitely have bugs with ExiINT handling, but I haven't had
> time to debug further.
> 
> ~Andrew
>
Jan Beulich April 2, 2020, 7:55 a.m. UTC | #3
On 16.03.2020 10:10, Jan Beulich wrote:
> On 13.03.2020 17:18, Andrew Cooper wrote:
>> On 13/03/2020 09:26, Jan Beulich wrote:
>>> All CPUs get an equal setting of EOI broadcast suppression; no need to
>>> log one message per CPU, even if it's only in verbose APIC mode.
>>>
>>> Only the BSP is eligible to possibly get ExtINT enabled; no need to log
>>> that it gets disabled on all APs, even if - again - it's only in verbose
>>> APIC mode.
>>
>> How true is this in practice?
> 
> I guess you read "eligible" as "in theory", whereas it is meant as "with
> how our [and in particular this] code is working right now". Even if we
> decided to switch the CPU to handle ExtINT, it still wouldn't need to be
> one message per CPU - it would suffice to issue the message for the one
> CPU getting ExtINT enabled.

Are you okay with the above explanation, and hence willing to give an
ack? If not, what alternative suggestion do you have to limit this
particular part of the log spam on very-many-CPU systems?

Jan
Andrew Cooper April 2, 2020, 5:55 p.m. UTC | #4
On 13/03/2020 09:26, Jan Beulich wrote:
> All CPUs get an equal setting of EOI broadcast suppression; no need to
> log one message per CPU, even if it's only in verbose APIC mode.
>
> Only the BSP is eligible to possibly get ExtINT enabled; no need to log
> that it gets disabled on all APs, even if - again - it's only in verbose
> APIC mode.
>
> Take the opportunity and introduce a "bsp" parameter to the function, to
> stop using smp_processor_id() to tell BSP from APs.

On further consideration, this is introducing a latent bug.

In a theoretical world where we could take the BSP offline, it is still
the CPU with the ID 0 which needs various of these things setting back up.

You could argue that we could move ExtINT/NMI handling to a different
CPU, and in this case, BSP still isn't the right distinction.  We'd want
something to signify "the processor which is the target of legacy
interrupts", as in such a case, it would specifically no longer be the
CPU we booted on.

OTOH, the adjustment for the NMI watchdog does look to be different. 
AFAICT, that is for deferring the watchdog setup until later in boot, at
which point "the BSP" is the appropriate distinction to use.  (That said
- I'm not sure why anything should need delaying.  I suspect this is
misplaced code to begin with.)

As for the messages being printed, I think that is fine to restrict to
the BSP.

A conversation on LKML has revealed why LVT0.MASK gets sampled - it is
to distinguish between the two virtual wire modes.  LVT0.MASK needs to
stay masked on the BSP if the firmware configured it like that, because
the PIC is wired through an IO-APIC pin which ultimately ends up
delivering an MSI ExtINT interrupt, rather than using the dedicates
sideband bus message to emulate the legacy ExtINT/LINT0 line.

~Andrew
Jan Beulich April 3, 2020, 7:32 a.m. UTC | #5
On 02.04.2020 19:55, Andrew Cooper wrote:
> On 13/03/2020 09:26, Jan Beulich wrote:
>> All CPUs get an equal setting of EOI broadcast suppression; no need to
>> log one message per CPU, even if it's only in verbose APIC mode.
>>
>> Only the BSP is eligible to possibly get ExtINT enabled; no need to log
>> that it gets disabled on all APs, even if - again - it's only in verbose
>> APIC mode.
>>
>> Take the opportunity and introduce a "bsp" parameter to the function, to
>> stop using smp_processor_id() to tell BSP from APs.
> 
> On further consideration, this is introducing a latent bug.
> 
> In a theoretical world where we could take the BSP offline, it is still
> the CPU with the ID 0 which needs various of these things setting back up.

No. If we took the BSP offline, no CPU with ID 0 would remain
(until such time that the ID would be re-used).

> You could argue that we could move ExtINT/NMI handling to a different
> CPU, and in this case, BSP still isn't the right distinction.  We'd want
> something to signify "the processor which is the target of legacy
> interrupts", as in such a case, it would specifically no longer be the
> CPU we booted on.

I see nothing wrong with calling this new "focus" processor the
BSP again. Post boot the significance of BSP, afaict, reduces
to aspects that aren't really tied to having been the processor
to lead bootup.

The important point is - if we were to allow offlining the BSP,
whichever other one would take its position would _not_ come
through setup_local_APIC(). The adjustments needs would need
doing by other means (quite likely by splitting out some of the
code into a new helper). Hence in setup_local_APIC() itself the
BSP / non-BSP distinction is correct.

> OTOH, the adjustment for the NMI watchdog does look to be different. 
> AFAICT, that is for deferring the watchdog setup until later in boot, at
> which point "the BSP" is the appropriate distinction to use.  (That said
> - I'm not sure why anything should need delaying.  I suspect this is
> misplaced code to begin with.)

Well, in any event - an unrelated aspect, and I think the switch
from smp_processor_id() to bsp is correct there as well.

> As for the messages being printed, I think that is fine to restrict to
> the BSP.

As per above I understand this isn't an ack for the patch. However,
I then don't understand what concrete changes you mean me to make
for it to stand a chance of getting your ack.

Jan
Roger Pau Monné May 14, 2020, 10:01 a.m. UTC | #6
On Fri, Mar 13, 2020 at 10:26:47AM +0100, Jan Beulich wrote:
> All CPUs get an equal setting of EOI broadcast suppression; no need to
> log one message per CPU, even if it's only in verbose APIC mode.
> 
> Only the BSP is eligible to possibly get ExtINT enabled; no need to log
> that it gets disabled on all APs, even if - again - it's only in verbose
> APIC mode.
> 
> Take the opportunity and introduce a "bsp" parameter to the function, to
> stop using smp_processor_id() to tell BSP from APs.
> 
> Signed-off-by: Jan Beulich <jbeulich@suse.com>

LGTM:

Reviewed-by: Roger Pau Monné <roger.pau@citrix.com>

AFAICT this doesn't introduce any functional change in APIC setup or
behavior, the only functional change is the log message reduction.
Might be good to add a note to that effect to make this clear, since
the change from smp_processor_id() -> bsp might make this not obvious.

> 
> --- a/xen/arch/x86/apic.c
> +++ b/xen/arch/x86/apic.c
> @@ -499,7 +499,7 @@ static void resume_x2apic(void)
>      __enable_x2apic();
>  }
>  
> -void setup_local_APIC(void)
> +void setup_local_APIC(bool bsp)
>  {
>      unsigned long oldvalue, value, maxlvt;
>      int i, j;
> @@ -598,8 +598,8 @@ void setup_local_APIC(void)
>      if ( directed_eoi_enabled )
>      {
>          value |= APIC_SPIV_DIRECTED_EOI;
> -        apic_printk(APIC_VERBOSE, "Suppress EOI broadcast on CPU#%d\n",
> -                    smp_processor_id());
> +        if ( bsp )
> +            apic_printk(APIC_VERBOSE, "Suppressing EOI broadcast\n");
>      }
>  
>      apic_write(APIC_SPIV, value);
> @@ -615,21 +615,22 @@ void setup_local_APIC(void)
>       * TODO: set up through-local-APIC from through-I/O-APIC? --macro
>       */
>      value = apic_read(APIC_LVT0) & APIC_LVT_MASKED;
> -    if (!smp_processor_id() && (pic_mode || !value)) {
> +    if (bsp && (pic_mode || !value)) {
>          value = APIC_DM_EXTINT;
>          apic_printk(APIC_VERBOSE, "enabled ExtINT on CPU#%d\n",
>                      smp_processor_id());
>      } else {
>          value = APIC_DM_EXTINT | APIC_LVT_MASKED;
> -        apic_printk(APIC_VERBOSE, "masked ExtINT on CPU#%d\n",
> -                    smp_processor_id());
> +        if (bsp)
> +            apic_printk(APIC_VERBOSE, "masked ExtINT on CPU#%d\n",
> +                        smp_processor_id());

You might want to also drop the CPU#%d from the above messages, since
they would only be printed for the BSP.

>      }
>      apic_write(APIC_LVT0, value);
>  
>      /*
>       * only the BP should see the LINT1 NMI signal, obviously.
>       */
> -    if (!smp_processor_id())
> +    if (bsp)
>          value = APIC_DM_NMI;
>      else
>          value = APIC_DM_NMI | APIC_LVT_MASKED;

This would be shorter as:

value = APIC_DM_NMI | (bsp ? 0 : APIC_LVT_MASKED);

Not specially trilled anyway.

Thanks, Roger.
Jan Beulich May 14, 2020, 12:31 p.m. UTC | #7
On 14.05.2020 12:01, Roger Pau Monné wrote:
> On Fri, Mar 13, 2020 at 10:26:47AM +0100, Jan Beulich wrote:
>> All CPUs get an equal setting of EOI broadcast suppression; no need to
>> log one message per CPU, even if it's only in verbose APIC mode.
>>
>> Only the BSP is eligible to possibly get ExtINT enabled; no need to log
>> that it gets disabled on all APs, even if - again - it's only in verbose
>> APIC mode.
>>
>> Take the opportunity and introduce a "bsp" parameter to the function, to
>> stop using smp_processor_id() to tell BSP from APs.
>>
>> Signed-off-by: Jan Beulich <jbeulich@suse.com>
> 
> LGTM:
> 
> Reviewed-by: Roger Pau Monné <roger.pau@citrix.com>

Thanks.

> AFAICT this doesn't introduce any functional change in APIC setup or
> behavior, the only functional change is the log message reduction.
> Might be good to add a note to that effect to make this clear, since
> the change from smp_processor_id() -> bsp might make this not obvious.

I've added "No functional change from this" to the last paragraph.

>> --- a/xen/arch/x86/apic.c
>> +++ b/xen/arch/x86/apic.c
>> @@ -499,7 +499,7 @@ static void resume_x2apic(void)
>>      __enable_x2apic();
>>  }
>>  
>> -void setup_local_APIC(void)
>> +void setup_local_APIC(bool bsp)
>>  {
>>      unsigned long oldvalue, value, maxlvt;
>>      int i, j;
>> @@ -598,8 +598,8 @@ void setup_local_APIC(void)
>>      if ( directed_eoi_enabled )
>>      {
>>          value |= APIC_SPIV_DIRECTED_EOI;
>> -        apic_printk(APIC_VERBOSE, "Suppress EOI broadcast on CPU#%d\n",
>> -                    smp_processor_id());
>> +        if ( bsp )
>> +            apic_printk(APIC_VERBOSE, "Suppressing EOI broadcast\n");
>>      }
>>  
>>      apic_write(APIC_SPIV, value);
>> @@ -615,21 +615,22 @@ void setup_local_APIC(void)
>>       * TODO: set up through-local-APIC from through-I/O-APIC? --macro
>>       */
>>      value = apic_read(APIC_LVT0) & APIC_LVT_MASKED;
>> -    if (!smp_processor_id() && (pic_mode || !value)) {
>> +    if (bsp && (pic_mode || !value)) {
>>          value = APIC_DM_EXTINT;
>>          apic_printk(APIC_VERBOSE, "enabled ExtINT on CPU#%d\n",
>>                      smp_processor_id());
>>      } else {
>>          value = APIC_DM_EXTINT | APIC_LVT_MASKED;
>> -        apic_printk(APIC_VERBOSE, "masked ExtINT on CPU#%d\n",
>> -                    smp_processor_id());
>> +        if (bsp)
>> +            apic_printk(APIC_VERBOSE, "masked ExtINT on CPU#%d\n",
>> +                        smp_processor_id());
> 
> You might want to also drop the CPU#%d from the above messages, since
> they would only be printed for the BSP.

I want to specifically keep them, so that once (if ever) we introduce
hot-unplug support for the BSP, the same or similar messages can be
used and matched against earlier ones in the log.

>>      }
>>      apic_write(APIC_LVT0, value);
>>  
>>      /*
>>       * only the BP should see the LINT1 NMI signal, obviously.
>>       */
>> -    if (!smp_processor_id())
>> +    if (bsp)
>>          value = APIC_DM_NMI;
>>      else
>>          value = APIC_DM_NMI | APIC_LVT_MASKED;
> 
> This would be shorter as:
> 
> value = APIC_DM_NMI | (bsp ? 0 : APIC_LVT_MASKED);

Indeed, at the expense of larger code churn. Seems like an at least
partially unrelated change to me (at risk of obscuring the actual
purpose of the change here).

Jan
Roger Pau Monné May 14, 2020, 1:12 p.m. UTC | #8
On Thu, May 14, 2020 at 02:31:33PM +0200, Jan Beulich wrote:
> On 14.05.2020 12:01, Roger Pau Monné wrote:
> > On Fri, Mar 13, 2020 at 10:26:47AM +0100, Jan Beulich wrote:
> >> --- a/xen/arch/x86/apic.c
> >> +++ b/xen/arch/x86/apic.c
> >> @@ -499,7 +499,7 @@ static void resume_x2apic(void)
> >>      __enable_x2apic();
> >>  }
> >>  
> >> -void setup_local_APIC(void)
> >> +void setup_local_APIC(bool bsp)
> >>  {
> >>      unsigned long oldvalue, value, maxlvt;
> >>      int i, j;
> >> @@ -598,8 +598,8 @@ void setup_local_APIC(void)
> >>      if ( directed_eoi_enabled )
> >>      {
> >>          value |= APIC_SPIV_DIRECTED_EOI;
> >> -        apic_printk(APIC_VERBOSE, "Suppress EOI broadcast on CPU#%d\n",
> >> -                    smp_processor_id());
> >> +        if ( bsp )
> >> +            apic_printk(APIC_VERBOSE, "Suppressing EOI broadcast\n");
> >>      }
> >>  
> >>      apic_write(APIC_SPIV, value);
> >> @@ -615,21 +615,22 @@ void setup_local_APIC(void)
> >>       * TODO: set up through-local-APIC from through-I/O-APIC? --macro
> >>       */
> >>      value = apic_read(APIC_LVT0) & APIC_LVT_MASKED;
> >> -    if (!smp_processor_id() && (pic_mode || !value)) {
> >> +    if (bsp && (pic_mode || !value)) {
> >>          value = APIC_DM_EXTINT;
> >>          apic_printk(APIC_VERBOSE, "enabled ExtINT on CPU#%d\n",
> >>                      smp_processor_id());
> >>      } else {
> >>          value = APIC_DM_EXTINT | APIC_LVT_MASKED;
> >> -        apic_printk(APIC_VERBOSE, "masked ExtINT on CPU#%d\n",
> >> -                    smp_processor_id());
> >> +        if (bsp)
> >> +            apic_printk(APIC_VERBOSE, "masked ExtINT on CPU#%d\n",
> >> +                        smp_processor_id());
> > 
> > You might want to also drop the CPU#%d from the above messages, since
> > they would only be printed for the BSP.
> 
> I want to specifically keep them, so that once (if ever) we introduce
> hot-unplug support for the BSP, the same or similar messages can be
> used and matched against earlier ones in the log.
> 
> >>      }
> >>      apic_write(APIC_LVT0, value);
> >>  
> >>      /*
> >>       * only the BP should see the LINT1 NMI signal, obviously.
> >>       */
> >> -    if (!smp_processor_id())
> >> +    if (bsp)
> >>          value = APIC_DM_NMI;
> >>      else
> >>          value = APIC_DM_NMI | APIC_LVT_MASKED;
> > 
> > This would be shorter as:
> > 
> > value = APIC_DM_NMI | (bsp ? 0 : APIC_LVT_MASKED);
> 
> Indeed, at the expense of larger code churn. Seems like an at least
> partially unrelated change to me (at risk of obscuring the actual
> purpose of the change here).

FTR, I'm happy with both of the above and my RB stands.

Roger.
diff mbox series

Patch

--- a/xen/arch/x86/apic.c
+++ b/xen/arch/x86/apic.c
@@ -499,7 +499,7 @@  static void resume_x2apic(void)
     __enable_x2apic();
 }
 
-void setup_local_APIC(void)
+void setup_local_APIC(bool bsp)
 {
     unsigned long oldvalue, value, maxlvt;
     int i, j;
@@ -598,8 +598,8 @@  void setup_local_APIC(void)
     if ( directed_eoi_enabled )
     {
         value |= APIC_SPIV_DIRECTED_EOI;
-        apic_printk(APIC_VERBOSE, "Suppress EOI broadcast on CPU#%d\n",
-                    smp_processor_id());
+        if ( bsp )
+            apic_printk(APIC_VERBOSE, "Suppressing EOI broadcast\n");
     }
 
     apic_write(APIC_SPIV, value);
@@ -615,21 +615,22 @@  void setup_local_APIC(void)
      * TODO: set up through-local-APIC from through-I/O-APIC? --macro
      */
     value = apic_read(APIC_LVT0) & APIC_LVT_MASKED;
-    if (!smp_processor_id() && (pic_mode || !value)) {
+    if (bsp && (pic_mode || !value)) {
         value = APIC_DM_EXTINT;
         apic_printk(APIC_VERBOSE, "enabled ExtINT on CPU#%d\n",
                     smp_processor_id());
     } else {
         value = APIC_DM_EXTINT | APIC_LVT_MASKED;
-        apic_printk(APIC_VERBOSE, "masked ExtINT on CPU#%d\n",
-                    smp_processor_id());
+        if (bsp)
+            apic_printk(APIC_VERBOSE, "masked ExtINT on CPU#%d\n",
+                        smp_processor_id());
     }
     apic_write(APIC_LVT0, value);
 
     /*
      * only the BP should see the LINT1 NMI signal, obviously.
      */
-    if (!smp_processor_id())
+    if (bsp)
         value = APIC_DM_NMI;
     else
         value = APIC_DM_NMI | APIC_LVT_MASKED;
@@ -663,7 +664,7 @@  void setup_local_APIC(void)
         printk("Leaving ESR disabled.\n");
     }
 
-    if (nmi_watchdog == NMI_LOCAL_APIC && smp_processor_id())
+    if (nmi_watchdog == NMI_LOCAL_APIC && !bsp)
         setup_apic_nmi_watchdog();
     apic_pm_activate();
 }
@@ -1474,7 +1475,7 @@  int __init APIC_init_uniprocessor (void)
     physids_clear(phys_cpu_present_map);
     physid_set(boot_cpu_physical_apicid, phys_cpu_present_map);
 
-    setup_local_APIC();
+    setup_local_APIC(true);
 
     if (nmi_watchdog == NMI_LOCAL_APIC)
         check_nmi_watchdog();
--- a/xen/arch/x86/smpboot.c
+++ b/xen/arch/x86/smpboot.c
@@ -191,7 +191,7 @@  static void smp_callin(void)
      */
     Dprintk("CALLIN, before setup_local_APIC().\n");
     x2apic_ap_setup();
-    setup_local_APIC();
+    setup_local_APIC(false);
 
     /* Save our processor parameters. */
     if ( !smp_store_cpu_info(cpu) )
@@ -1165,7 +1165,7 @@  void __init smp_prepare_cpus(void)
     verify_local_APIC();
 
     connect_bsp_APIC();
-    setup_local_APIC();
+    setup_local_APIC(true);
 
     if ( !skip_ioapic_setup && nr_ioapics )
         setup_IO_APIC();
--- a/xen/include/asm-x86/apic.h
+++ b/xen/include/asm-x86/apic.h
@@ -169,7 +169,7 @@  extern int verify_local_APIC (void);
 extern void cache_APIC_registers (void);
 extern void sync_Arb_IDs (void);
 extern void init_bsp_APIC (void);
-extern void setup_local_APIC (void);
+extern void setup_local_APIC(bool bsp);
 extern void init_apic_mappings (void);
 extern void smp_local_timer_interrupt (struct cpu_user_regs *regs);
 extern void setup_boot_APIC_clock (void);