diff mbox series

x86/io-apic: prevent early exit from i8259 loop detection

Message ID 20241217090045.6251-1-roger.pau@citrix.com (mailing list archive)
State New
Headers show
Series x86/io-apic: prevent early exit from i8259 loop detection | expand

Commit Message

Roger Pau Monné Dec. 17, 2024, 9 a.m. UTC
Avoid exiting early from the loop when a pin that could be connected to the
i8259 is found, as such early exit would leave the EOI handler translation
array only partially allocated and/or initialized.

Otherwise on systems with multiple IO-APICs and an unmasked ExtINT pin on
any IO-APIC that's no the last one the following NULL pointer dereference
triggers:

(XEN) Enabling APIC mode.  Using 2 I/O APICs
(XEN) ----[ Xen-4.20-unstable  x86_64  debug=y  Not tainted ]----
(XEN) CPU:    0
(XEN) RIP:    e008:[<ffff82d040328046>] __ioapic_write_entry+0x83/0x95
[...]
(XEN) Xen call trace:
(XEN)    [<ffff82d040328046>] R __ioapic_write_entry+0x83/0x95
(XEN)    [<ffff82d04027464b>] F amd_iommu_ioapic_update_ire+0x1ea/0x273
(XEN)    [<ffff82d0402755a1>] F iommu_update_ire_from_apic+0xa/0xc
(XEN)    [<ffff82d040328056>] F __ioapic_write_entry+0x93/0x95
(XEN)    [<ffff82d0403283c1>] F arch/x86/io_apic.c#clear_IO_APIC_pin+0x7c/0x10e
(XEN)    [<ffff82d040328480>] F arch/x86/io_apic.c#clear_IO_APIC+0x2d/0x61
(XEN)    [<ffff82d0404448b7>] F enable_IO_APIC+0x2e3/0x34f
(XEN)    [<ffff82d04044c9b0>] F smp_prepare_cpus+0x254/0x27a
(XEN)    [<ffff82d04044bec2>] F __start_xen+0x1ce1/0x23ae
(XEN)    [<ffff82d0402033ae>] F __high_start+0x8e/0x90
(XEN)
(XEN) Pagetable walk from 0000000000000000:
(XEN)  L4[0x000] = 000000007dbfd063 ffffffffffffffff
(XEN)  L3[0x000] = 000000007dbfa063 ffffffffffffffff
(XEN)  L2[0x000] = 000000007dbcc063 ffffffffffffffff
(XEN)  L1[0x000] = 0000000000000000 ffffffffffffffff
(XEN)
(XEN) ****************************************
(XEN) Panic on CPU 0:
(XEN) FATAL PAGE FAULT
(XEN) [error_code=0002]
(XEN) Faulting linear address: 0000000000000000
(XEN) ****************************************
(XEN)
(XEN) Reboot in five seconds...

Reported-by: Sergii Dmytruk <sergii.dmytruk@3mdeb.com>
Fixes: 86001b3970fe ('x86/io-apic: fix directed EOI when using AMD-Vi interrupt remapping')
Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
---
 xen/arch/x86/io_apic.c | 7 ++++---
 1 file changed, 4 insertions(+), 3 deletions(-)

Comments

Jan Beulich Dec. 17, 2024, 9:40 a.m. UTC | #1
On 17.12.2024 10:00, Roger Pau Monne wrote:
> Avoid exiting early from the loop when a pin that could be connected to the
> i8259 is found, as such early exit would leave the EOI handler translation
> array only partially allocated and/or initialized.
> 
> Otherwise on systems with multiple IO-APICs and an unmasked ExtINT pin on
> any IO-APIC that's no the last one the following NULL pointer dereference
> triggers:
> 
> (XEN) Enabling APIC mode.  Using 2 I/O APICs
> (XEN) ----[ Xen-4.20-unstable  x86_64  debug=y  Not tainted ]----
> (XEN) CPU:    0
> (XEN) RIP:    e008:[<ffff82d040328046>] __ioapic_write_entry+0x83/0x95
> [...]
> (XEN) Xen call trace:
> (XEN)    [<ffff82d040328046>] R __ioapic_write_entry+0x83/0x95
> (XEN)    [<ffff82d04027464b>] F amd_iommu_ioapic_update_ire+0x1ea/0x273
> (XEN)    [<ffff82d0402755a1>] F iommu_update_ire_from_apic+0xa/0xc
> (XEN)    [<ffff82d040328056>] F __ioapic_write_entry+0x93/0x95
> (XEN)    [<ffff82d0403283c1>] F arch/x86/io_apic.c#clear_IO_APIC_pin+0x7c/0x10e
> (XEN)    [<ffff82d040328480>] F arch/x86/io_apic.c#clear_IO_APIC+0x2d/0x61
> (XEN)    [<ffff82d0404448b7>] F enable_IO_APIC+0x2e3/0x34f
> (XEN)    [<ffff82d04044c9b0>] F smp_prepare_cpus+0x254/0x27a
> (XEN)    [<ffff82d04044bec2>] F __start_xen+0x1ce1/0x23ae
> (XEN)    [<ffff82d0402033ae>] F __high_start+0x8e/0x90
> (XEN)
> (XEN) Pagetable walk from 0000000000000000:
> (XEN)  L4[0x000] = 000000007dbfd063 ffffffffffffffff
> (XEN)  L3[0x000] = 000000007dbfa063 ffffffffffffffff
> (XEN)  L2[0x000] = 000000007dbcc063 ffffffffffffffff
> (XEN)  L1[0x000] = 0000000000000000 ffffffffffffffff
> (XEN)
> (XEN) ****************************************
> (XEN) Panic on CPU 0:
> (XEN) FATAL PAGE FAULT
> (XEN) [error_code=0002]
> (XEN) Faulting linear address: 0000000000000000
> (XEN) ****************************************
> (XEN)
> (XEN) Reboot in five seconds...
> 
> Reported-by: Sergii Dmytruk <sergii.dmytruk@3mdeb.com>
> Fixes: 86001b3970fe ('x86/io-apic: fix directed EOI when using AMD-Vi interrupt remapping')
> Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>

Hmm, considering the earlier change was backported, I'm even inclined to
delay 4.18.4 a little, for taking this one there as well.

> --- a/xen/arch/x86/io_apic.c
> +++ b/xen/arch/x86/io_apic.c
> @@ -1389,14 +1389,15 @@ void __init enable_IO_APIC(void)
>              /* If the interrupt line is enabled and in ExtInt mode
>               * I have found the pin where the i8259 is connected.
>               */
> -            if ((entry.mask == 0) && (entry.delivery_mode == dest_ExtINT)) {
> +            if ( ioapic_i8259.apic == -1 && entry.mask == 0 &&
> +                 entry.delivery_mode == dest_ExtINT )
> +            {
> +                ASSERT(ioapic_i8259.pin == -1);

I'm not sure of the value of this assertion. It is provable that ...

>                  ioapic_i8259.apic = apic;
>                  ioapic_i8259.pin  = pin;

... both fields are updated together (and not earlier on), and hence once
we've been here neither field will still be -1.

Looking around I further notice that it's generally ioapic_i8259.pin that
we check against -1, so I wonder whether - just for consistency - the if()
condition wouldn't better too use that one.

Preferably with these adjustments:
Reviewed-by: Jan Beulich <jbeulich@suse.com>

Jan
Roger Pau Monné Dec. 17, 2024, 10:13 a.m. UTC | #2
On Tue, Dec 17, 2024 at 10:40:31AM +0100, Jan Beulich wrote:
> On 17.12.2024 10:00, Roger Pau Monne wrote:
> > Avoid exiting early from the loop when a pin that could be connected to the
> > i8259 is found, as such early exit would leave the EOI handler translation
> > array only partially allocated and/or initialized.
> > 
> > Otherwise on systems with multiple IO-APICs and an unmasked ExtINT pin on
> > any IO-APIC that's no the last one the following NULL pointer dereference
> > triggers:
> > 
> > (XEN) Enabling APIC mode.  Using 2 I/O APICs
> > (XEN) ----[ Xen-4.20-unstable  x86_64  debug=y  Not tainted ]----
> > (XEN) CPU:    0
> > (XEN) RIP:    e008:[<ffff82d040328046>] __ioapic_write_entry+0x83/0x95
> > [...]
> > (XEN) Xen call trace:
> > (XEN)    [<ffff82d040328046>] R __ioapic_write_entry+0x83/0x95
> > (XEN)    [<ffff82d04027464b>] F amd_iommu_ioapic_update_ire+0x1ea/0x273
> > (XEN)    [<ffff82d0402755a1>] F iommu_update_ire_from_apic+0xa/0xc
> > (XEN)    [<ffff82d040328056>] F __ioapic_write_entry+0x93/0x95
> > (XEN)    [<ffff82d0403283c1>] F arch/x86/io_apic.c#clear_IO_APIC_pin+0x7c/0x10e
> > (XEN)    [<ffff82d040328480>] F arch/x86/io_apic.c#clear_IO_APIC+0x2d/0x61
> > (XEN)    [<ffff82d0404448b7>] F enable_IO_APIC+0x2e3/0x34f
> > (XEN)    [<ffff82d04044c9b0>] F smp_prepare_cpus+0x254/0x27a
> > (XEN)    [<ffff82d04044bec2>] F __start_xen+0x1ce1/0x23ae
> > (XEN)    [<ffff82d0402033ae>] F __high_start+0x8e/0x90
> > (XEN)
> > (XEN) Pagetable walk from 0000000000000000:
> > (XEN)  L4[0x000] = 000000007dbfd063 ffffffffffffffff
> > (XEN)  L3[0x000] = 000000007dbfa063 ffffffffffffffff
> > (XEN)  L2[0x000] = 000000007dbcc063 ffffffffffffffff
> > (XEN)  L1[0x000] = 0000000000000000 ffffffffffffffff
> > (XEN)
> > (XEN) ****************************************
> > (XEN) Panic on CPU 0:
> > (XEN) FATAL PAGE FAULT
> > (XEN) [error_code=0002]
> > (XEN) Faulting linear address: 0000000000000000
> > (XEN) ****************************************
> > (XEN)
> > (XEN) Reboot in five seconds...
> > 
> > Reported-by: Sergii Dmytruk <sergii.dmytruk@3mdeb.com>
> > Fixes: 86001b3970fe ('x86/io-apic: fix directed EOI when using AMD-Vi interrupt remapping')
> > Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
> 
> Hmm, considering the earlier change was backported, I'm even inclined to
> delay 4.18.4 a little, for taking this one there as well.
> 
> > --- a/xen/arch/x86/io_apic.c
> > +++ b/xen/arch/x86/io_apic.c
> > @@ -1389,14 +1389,15 @@ void __init enable_IO_APIC(void)
> >              /* If the interrupt line is enabled and in ExtInt mode
> >               * I have found the pin where the i8259 is connected.
> >               */
> > -            if ((entry.mask == 0) && (entry.delivery_mode == dest_ExtINT)) {
> > +            if ( ioapic_i8259.apic == -1 && entry.mask == 0 &&
> > +                 entry.delivery_mode == dest_ExtINT )
> > +            {
> > +                ASSERT(ioapic_i8259.pin == -1);
> 
> I'm not sure of the value of this assertion. It is provable that ...
> 
> >                  ioapic_i8259.apic = apic;
> >                  ioapic_i8259.pin  = pin;
> 
> ... both fields are updated together (and not earlier on), and hence once
> we've been here neither field will still be -1.

No strong opinion, as all asserts I've placed it here in case the
logic to set apic or pin would change.

> Looking around I further notice that it's generally ioapic_i8259.pin that
> we check against -1, so I wonder whether - just for consistency - the if()
> condition wouldn't better too use that one.

As with the above, no strong opinion really, it's true that most
checks tend to use ioapic_i8259.pin != -1 instead of the apic field,
so I can adjust to that.

> Preferably with these adjustments:
> Reviewed-by: Jan Beulich <jbeulich@suse.com>

Thanks, Roger.
diff mbox series

Patch

diff --git a/xen/arch/x86/io_apic.c b/xen/arch/x86/io_apic.c
index d9db2efc4f58..e4a88a2527d7 100644
--- a/xen/arch/x86/io_apic.c
+++ b/xen/arch/x86/io_apic.c
@@ -1389,14 +1389,15 @@  void __init enable_IO_APIC(void)
             /* If the interrupt line is enabled and in ExtInt mode
              * I have found the pin where the i8259 is connected.
              */
-            if ((entry.mask == 0) && (entry.delivery_mode == dest_ExtINT)) {
+            if ( ioapic_i8259.apic == -1 && entry.mask == 0 &&
+                 entry.delivery_mode == dest_ExtINT )
+            {
+                ASSERT(ioapic_i8259.pin == -1);
                 ioapic_i8259.apic = apic;
                 ioapic_i8259.pin  = pin;
-                goto found_i8259;
             }
         }
     }
- found_i8259:
     /* Look to see what if the MP table has reported the ExtINT */
     /* If we could not find the appropriate pin by looking at the ioapic
      * the i8259 probably is not connected the ioapic but give the