Message ID | E1rDOgx-00Dvkv-Bb@rmk-PC.armlinux.org.uk (mailing list archive) |
---|---|
State | Handled Elsewhere |
Series | ACPI/arm64: add support for virtual cpu hotplug |

Context | Check | Description |
---|---|---|
conchuod/vmtest-fixes-PR | fail | merge-conflict |
On Wed, 13 Dec 2023 12:50:23 +0000
Russell King (Oracle) <rmk+kernel@armlinux.org.uk> wrote:

> From: James Morse <james.morse@arm.com>
>
> gic_acpi_match_gicc() is only called via gic_acpi_count_gicr_regions().
> It should only count the number of enabled redistributors, but it
> also tries to sanity check the GICC entry, currently returning an
> error if the Enabled bit is set, but the gicr_base_address is zero.
>
> Adding support for the online-capable bit to the sanity check
> complicates it, for no benefit. The existing check implicitly
> depends on gic_acpi_count_gicr_regions() previous failing to find
> any GICR regions (as it is valid to have gicr_base_address of zero if
> the redistributors are described via a GICR entry).
>
> Instead of complicating the check, remove it. Failures that happen
> at this point cause the irqchip not to register, meaning no irqs
> can be requested. The kernel grinds to a panic() pretty quickly.
>
> Without the check, MADT tables that exhibit this problem are still
> caught by gic_populate_rdist(), which helpfully also prints what
> went wrong:
> | CPU4: mpidr 100 has no re-distributor!
>
> Signed-off-by: James Morse <james.morse@arm.com>
> Reviewed-by: Gavin Shan <gshan@redhat.com>
> Tested-by: Miguel Luis <miguel.luis@oracle.com>
> Tested-by: Vishnu Pajjuri <vishnu@os.amperecomputing.com>
> Tested-by: Jianyong Wu <jianyong.wu@arm.com>
> Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
> ---
>  drivers/irqchip/irq-gic-v3.c | 18 ++++++------------
>  1 file changed, 6 insertions(+), 12 deletions(-)
>
> diff --git a/drivers/irqchip/irq-gic-v3.c b/drivers/irqchip/irq-gic-v3.c
> index 98b0329b7154..ebecd4546830 100644
> --- a/drivers/irqchip/irq-gic-v3.c
> +++ b/drivers/irqchip/irq-gic-v3.c
> @@ -2420,21 +2420,15 @@ static int __init gic_acpi_match_gicc(union acpi_subtable_headers *header,
>
>  	/*
>  	 * If GICC is enabled and has valid gicr base address, then it means
> -	 * GICR base is presented via GICC
> +	 * GICR base is presented via GICC. The redistributor is only known to
> +	 * be accessible if the GICC is marked as enabled. If this bit is not
> +	 * set, we'd need to add the redistributor at runtime, which isn't
> +	 * supported.
>  	 */
> -	if (acpi_gicc_is_usable(gicc) && gicc->gicr_base_address) {
> +	if (gicc->flags & ACPI_MADT_ENABLED && gicc->gicr_base_address)

I was very vague in previous review. I think the reasons you are switching
from acpi_gicc_is_usable(gicc) to the gicc->flags & ACPI_MADT_ENABLED
need calling out, as I'm fairly sure that at this point in the series at
least, acpi_gicc_is_usable() is the same as current upstream:

static inline bool acpi_gicc_is_usable(struct acpi_madt_generic_interrupt *gicc)
{
	return gicc->flags & ACPI_MADT_ENABLED;
}

>  		acpi_data.enabled_rdists++;
> -		return 0;
> -	}
>
> -	/*
> -	 * It's perfectly valid firmware can pass disabled GICC entry, driver
> -	 * should not treat as errors, skip the entry instead of probe fail.
> -	 */
> -	if (!acpi_gicc_is_usable(gicc))
> -		return 0;
> -
> -	return -ENODEV;
> +	return 0;
>  }
>
>  static int __init gic_acpi_count_gicr_regions(void)

On Fri, Dec 15, 2023 at 04:33:01PM +0000, Jonathan Cameron wrote:
> On Wed, 13 Dec 2023 12:50:23 +0000
> Russell King (Oracle) <rmk+kernel@armlinux.org.uk> wrote:
>
> > From: James Morse <james.morse@arm.com>
> >
> > gic_acpi_match_gicc() is only called via gic_acpi_count_gicr_regions().
> > It should only count the number of enabled redistributors, but it
> > also tries to sanity check the GICC entry, currently returning an
> > error if the Enabled bit is set, but the gicr_base_address is zero.
> >
> > Adding support for the online-capable bit to the sanity check
> > complicates it, for no benefit. The existing check implicitly
> > depends on gic_acpi_count_gicr_regions() previous failing to find
> > any GICR regions (as it is valid to have gicr_base_address of zero if
> > the redistributors are described via a GICR entry).
> >
> > Instead of complicating the check, remove it. Failures that happen
> > at this point cause the irqchip not to register, meaning no irqs
> > can be requested. The kernel grinds to a panic() pretty quickly.
> >
> > Without the check, MADT tables that exhibit this problem are still
> > caught by gic_populate_rdist(), which helpfully also prints what
> > went wrong:
> > | CPU4: mpidr 100 has no re-distributor!
> >
> > Signed-off-by: James Morse <james.morse@arm.com>
> > Reviewed-by: Gavin Shan <gshan@redhat.com>
> > Tested-by: Miguel Luis <miguel.luis@oracle.com>
> > Tested-by: Vishnu Pajjuri <vishnu@os.amperecomputing.com>
> > Tested-by: Jianyong Wu <jianyong.wu@arm.com>
> > Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
> > ---
> >  drivers/irqchip/irq-gic-v3.c | 18 ++++++------------
> >  1 file changed, 6 insertions(+), 12 deletions(-)
> >
> > diff --git a/drivers/irqchip/irq-gic-v3.c b/drivers/irqchip/irq-gic-v3.c
> > index 98b0329b7154..ebecd4546830 100644
> > --- a/drivers/irqchip/irq-gic-v3.c
> > +++ b/drivers/irqchip/irq-gic-v3.c
> > @@ -2420,21 +2420,15 @@ static int __init gic_acpi_match_gicc(union acpi_subtable_headers *header,
> >
> >  	/*
> >  	 * If GICC is enabled and has valid gicr base address, then it means
> > -	 * GICR base is presented via GICC
> > +	 * GICR base is presented via GICC. The redistributor is only known to
> > +	 * be accessible if the GICC is marked as enabled. If this bit is not
> > +	 * set, we'd need to add the redistributor at runtime, which isn't
> > +	 * supported.
> >  	 */
> > -	if (acpi_gicc_is_usable(gicc) && gicc->gicr_base_address) {
> > +	if (gicc->flags & ACPI_MADT_ENABLED && gicc->gicr_base_address)
>
> I was very vague in previous review. I think the reasons you are switching
> from acpi_gicc_is_usable(gicc) to the gicc->flags & ACPI_MADT_ENABLED
> need calling out, as I'm fairly sure that at this point in the series at
> least, acpi_gicc_is_usable() is the same as current upstream:
>
> static inline bool acpi_gicc_is_usable(struct acpi_madt_generic_interrupt *gicc)
> {
> 	return gicc->flags & ACPI_MADT_ENABLED;
> }

In a previous patch adding acpi_gicc_is_usable() c54e52f84d7a ("arm64,
irqchip/gic-v3, ACPI: Move MADT GICC enabled check into a helper") this
was:

-	if ((gicc->flags & ACPI_MADT_ENABLED) && gicc->gicr_base_address) {
+	if (acpi_gicc_is_usable(gicc) && gicc->gicr_base_address) {

so effectively this is undoing that particular change, which raises in
my mind the question of why the change was made in the first place if
it's just going to be reverted in a later patch. (It gets reverted
because, in a following patch, acpi_gicc_is_usable() has an additional
condition added to it that isn't applicable here, which effectively
makes acpi_gicc_is_usable() return true if either ACPI_MADT_ENABLED _or_
ACPI_MADT_GICC_ONLINE_CAPABLE (as it is now known) is set.)

However, if ACPI_MADT_GICC_ONLINE_CAPABLE is set, does that actually
mean that the GICC is usable? I'm not sure it does. ACPI v6.5 says that
this bit indicates that the system supports enabling this processor
later. Is the GICC of a currently disabled processor "usable"...

Clearly, the intention of this change is not to count this GICC entry
if it is marked ACPI_MADT_GICC_ONLINE_CAPABLE, but I feel that isn't
described in the commit message.

Moreover, I am getting the feeling that there are _two_ changes going
on here - there's the change that's talked about in the commit message
(the complex validation that seems unnecessary) and then there's the
preparation for the change to acpi_gicc_is_usable() - which maybe
should be in the following patch where it would be less confusing.

Would you agree?

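As a rough illustration of the concern raised above, here is a minimal userspace sketch (not kernel code) of how the redistributor count could change once the helper also accepts online-capable entries. The `model_*` and `MODEL_*` names, the cut-down struct, the two predicate names and the flag bit values are all stand-ins assumed for this example; only the Enabled and online-capable meanings come from the thread.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed stand-ins for ACPI_MADT_ENABLED and ACPI_MADT_GICC_ONLINE_CAPABLE. */
#define MODEL_MADT_ENABLED        (1u << 0)
#define MODEL_GICC_ONLINE_CAPABLE (1u << 3)

/* Hypothetical cut-down GICC entry holding only the fields discussed here. */
struct model_gicc {
	uint32_t flags;
	uint64_t gicr_base_address;
};

/* The helper as quoted in the thread: Enabled bit only. */
static bool usable_now(const struct model_gicc *g)
{
	return (g->flags & MODEL_MADT_ENABLED) != 0;
}

/* The helper as described for the following patch: Enabled or online-capable. */
static bool usable_later(const struct model_gicc *g)
{
	return (g->flags & (MODEL_MADT_ENABLED | MODEL_GICC_ONLINE_CAPABLE)) != 0;
}

/* Count entries that pass the predicate and carry a non-zero GICR base. */
static unsigned int count_rdists(const struct model_gicc *entries, size_t n,
				 bool (*pred)(const struct model_gicc *))
{
	unsigned int count = 0;

	assert(entries != NULL || n == 0);
	for (size_t i = 0; i < n; i++)
		if (pred(&entries[i]) && entries[i].gicr_base_address)
			count++;
	return count;
}
```

Given one Enabled entry, one entry that is only online-capable and one with neither flag (all with a non-zero GICR base), `usable_now` counts one redistributor and `usable_later` counts two. That difference is presumably why the count in gic_acpi_match_gicc() wants the plain Enabled check.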
On Tue, 9 Jan 2024 19:27:20 +0000
"Russell King (Oracle)" <linux@armlinux.org.uk> wrote:

> On Fri, Dec 15, 2023 at 04:33:01PM +0000, Jonathan Cameron wrote:
> > On Wed, 13 Dec 2023 12:50:23 +0000
> > Russell King (Oracle) <rmk+kernel@armlinux.org.uk> wrote:
> >
> > > From: James Morse <james.morse@arm.com>
> > >
> > > gic_acpi_match_gicc() is only called via gic_acpi_count_gicr_regions().
> > > It should only count the number of enabled redistributors, but it
> > > also tries to sanity check the GICC entry, currently returning an
> > > error if the Enabled bit is set, but the gicr_base_address is zero.
> > >
> > > Adding support for the online-capable bit to the sanity check
> > > complicates it, for no benefit. The existing check implicitly
> > > depends on gic_acpi_count_gicr_regions() previous failing to find
> > > any GICR regions (as it is valid to have gicr_base_address of zero if
> > > the redistributors are described via a GICR entry).
> > >
> > > Instead of complicating the check, remove it. Failures that happen
> > > at this point cause the irqchip not to register, meaning no irqs
> > > can be requested. The kernel grinds to a panic() pretty quickly.
> > >
> > > Without the check, MADT tables that exhibit this problem are still
> > > caught by gic_populate_rdist(), which helpfully also prints what
> > > went wrong:
> > > | CPU4: mpidr 100 has no re-distributor!
> > >
> > > Signed-off-by: James Morse <james.morse@arm.com>
> > > Reviewed-by: Gavin Shan <gshan@redhat.com>
> > > Tested-by: Miguel Luis <miguel.luis@oracle.com>
> > > Tested-by: Vishnu Pajjuri <vishnu@os.amperecomputing.com>
> > > Tested-by: Jianyong Wu <jianyong.wu@arm.com>
> > > Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
> > > ---
> > >  drivers/irqchip/irq-gic-v3.c | 18 ++++++------------
> > >  1 file changed, 6 insertions(+), 12 deletions(-)
> > >
> > > diff --git a/drivers/irqchip/irq-gic-v3.c b/drivers/irqchip/irq-gic-v3.c
> > > index 98b0329b7154..ebecd4546830 100644
> > > --- a/drivers/irqchip/irq-gic-v3.c
> > > +++ b/drivers/irqchip/irq-gic-v3.c
> > > @@ -2420,21 +2420,15 @@ static int __init gic_acpi_match_gicc(union acpi_subtable_headers *header,
> > >
> > >  	/*
> > >  	 * If GICC is enabled and has valid gicr base address, then it means
> > > -	 * GICR base is presented via GICC
> > > +	 * GICR base is presented via GICC. The redistributor is only known to
> > > +	 * be accessible if the GICC is marked as enabled. If this bit is not
> > > +	 * set, we'd need to add the redistributor at runtime, which isn't
> > > +	 * supported.
> > >  	 */
> > > -	if (acpi_gicc_is_usable(gicc) && gicc->gicr_base_address) {
> > > +	if (gicc->flags & ACPI_MADT_ENABLED && gicc->gicr_base_address)
> >
> > I was very vague in previous review. I think the reasons you are switching
> > from acpi_gicc_is_usable(gicc) to the gicc->flags & ACPI_MADT_ENABLED
> > need calling out, as I'm fairly sure that at this point in the series at
> > least, acpi_gicc_is_usable() is the same as current upstream:
> >
> > static inline bool acpi_gicc_is_usable(struct acpi_madt_generic_interrupt *gicc)
> > {
> > 	return gicc->flags & ACPI_MADT_ENABLED;
> > }
>
> In a previous patch adding acpi_gicc_is_usable() c54e52f84d7a ("arm64,
> irqchip/gic-v3, ACPI: Move MADT GICC enabled check into a helper") this
> was:
>
> -	if ((gicc->flags & ACPI_MADT_ENABLED) && gicc->gicr_base_address) {
> +	if (acpi_gicc_is_usable(gicc) && gicc->gicr_base_address) {
>
> so effectively this is undoing that particular change, which raises in
> my mind the question of why the change was made in the first place if
> it's just going to be reverted in a later patch. (It gets reverted
> because, in a following patch, acpi_gicc_is_usable() has an additional
> condition added to it that isn't applicable here, which effectively
> makes acpi_gicc_is_usable() return true if either ACPI_MADT_ENABLED _or_
> ACPI_MADT_GICC_ONLINE_CAPABLE (as it is now known) is set.)

Ok. So maybe just call out that we are about to change the meaning of
acpi_gicc_is_usable(), so we need to partly revert the earlier patch that
made use of it everywhere.

Or perhaps introduce acpi_gicc_is_enabled(), which is called by
acpi_gicc_is_usable() along with the new conditions when they are added.
Though, as you say later, what does usable mean?

>
> However, if ACPI_MADT_GICC_ONLINE_CAPABLE is set, does that actually
> mean that the GICC is usable? I'm not sure it does. ACPI v6.5 says that
> this bit indicates that the system supports enabling this processor
> later. Is the GICC of a currently disabled processor "usable"...

I agree, this is confusing. acpi_gicc_may_be_usable()?

Or invert it in all places to give a cleaner meaning:
!acpi_gicc_never_usable().

Bit of a pain to change this throughout again, but maybe necessary to
avoid confusion in future.

>
> Clearly, the intention of this change is not to count this GICC entry
> if it is marked ACPI_MADT_GICC_ONLINE_CAPABLE, but I feel that isn't
> described in the commit message.

Agreed, though that only happens in the next patch, so it is easier to
describe there, or via a patch adding multiple, initially identical helper
functions that then diverge in the following patch?

Whilst a helper for this one location seems silly, it would let us put the
two helpers next to each other where the distinction is obvious.

>
> Moreover, I am getting the feeling that there are _two_ changes going
> on here - there's the change that's talked about in the commit message
> (the complex validation that seems unnecessary) and then there's the
> preparation for the change to acpi_gicc_is_usable() - which maybe
> should be in the following patch where it would be less confusing.

Agreed.

>
> Would you agree?
>

Yes, the move would help, as then it's obvious why this needs to change, and
that is separate from the naming question.

So in conclusion, I agree with everything you've called out on this one; it
is up to you to pick which solution cleans this up.

I think the options are:
1) Just move the change to the next patch where it's easier to describe.
   Leaves the odd 'usable' behind.
2) Rename usable() to something else, maybe inverting the logic, as
   !never is easier than now_or_maybe_later.
3) Possibly add another helper for this new case which starts as matching
   the existing one, but diverges in a later patch. (It should still not be
   in this patch, which as you observe is doing something else and I think
   is actually a bug fix anyway, albeit one that has never mattered for any
   shipping firmware.)

Jonathan

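The naming options floated above could look roughly like the following. This is a hedged userspace sketch, not a proposal for the actual patch: the three function names mirror the suggestions in the mail but are hypothetical here, and the struct and flag values are assumed stand-ins for the kernel definitions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed stand-ins for ACPI_MADT_ENABLED and ACPI_MADT_GICC_ONLINE_CAPABLE. */
#define MODEL_MADT_ENABLED        (1u << 0)
#define MODEL_GICC_ONLINE_CAPABLE (1u << 3)

/* Hypothetical cut-down GICC entry: only the flags matter for naming. */
struct model_gicc {
	uint32_t flags;
};

/* Narrow meaning: the CPU interface is enabled right now. */
static bool gicc_is_enabled(const struct model_gicc *g)
{
	assert(g != NULL);
	return (g->flags & MODEL_MADT_ENABLED) != 0;
}

/* Wider meaning: enabled now, or firmware says it can be enabled later. */
static bool gicc_may_be_usable(const struct model_gicc *g)
{
	return gicc_is_enabled(g) ||
	       (g->flags & MODEL_GICC_ONLINE_CAPABLE) != 0;
}

/* Inverted form: neither enabled nor online-capable. */
static bool gicc_never_usable(const struct model_gicc *g)
{
	return !gicc_may_be_usable(g);
}
```

Keeping the narrow and wide predicates side by side makes the distinction visible at the definition, which is the benefit argued for in the reply. Callers that count redistributors would use the narrow one.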
diff --git a/drivers/irqchip/irq-gic-v3.c b/drivers/irqchip/irq-gic-v3.c
index 98b0329b7154..ebecd4546830 100644
--- a/drivers/irqchip/irq-gic-v3.c
+++ b/drivers/irqchip/irq-gic-v3.c
@@ -2420,21 +2420,15 @@ static int __init gic_acpi_match_gicc(union acpi_subtable_headers *header,
 
 	/*
 	 * If GICC is enabled and has valid gicr base address, then it means
-	 * GICR base is presented via GICC
+	 * GICR base is presented via GICC. The redistributor is only known to
+	 * be accessible if the GICC is marked as enabled. If this bit is not
+	 * set, we'd need to add the redistributor at runtime, which isn't
+	 * supported.
 	 */
-	if (acpi_gicc_is_usable(gicc) && gicc->gicr_base_address) {
+	if (gicc->flags & ACPI_MADT_ENABLED && gicc->gicr_base_address)
 		acpi_data.enabled_rdists++;
-		return 0;
-	}
 
-	/*
-	 * It's perfectly valid firmware can pass disabled GICC entry, driver
-	 * should not treat as errors, skip the entry instead of probe fail.
-	 */
-	if (!acpi_gicc_is_usable(gicc))
-		return 0;
-
-	return -ENODEV;
+	return 0;
 }
 
 static int __init gic_acpi_count_gicr_regions(void)
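
The behavioural change in the diff above can be modelled outside the kernel. The following is a minimal sketch under stated assumptions: `match_gicc_old`, `match_gicc_new`, `struct model_gicc` and `MODEL_MADT_ENABLED` are hypothetical names for this example, and the counter is passed in by pointer instead of living in `acpi_data`.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed stand-in for ACPI_MADT_ENABLED. */
#define MODEL_MADT_ENABLED (1u << 0)

/* Hypothetical cut-down GICC entry holding only the fields the hunk reads. */
struct model_gicc {
	uint32_t flags;
	uint64_t gicr_base_address;
};

/* Before the patch: count, silently skip disabled entries, otherwise fail. */
static int match_gicc_old(const struct model_gicc *g, unsigned int *enabled_rdists)
{
	assert(g != NULL && enabled_rdists != NULL);

	if ((g->flags & MODEL_MADT_ENABLED) && g->gicr_base_address) {
		(*enabled_rdists)++;
		return 0;
	}

	/* A disabled entry is valid and is skipped. */
	if (!(g->flags & MODEL_MADT_ENABLED))
		return 0;

	/* Enabled, but no GICR base address: treated as an error. */
	return -ENODEV;
}

/* After the patch: only count, never fail. */
static int match_gicc_new(const struct model_gicc *g, unsigned int *enabled_rdists)
{
	assert(g != NULL && enabled_rdists != NULL);

	if ((g->flags & MODEL_MADT_ENABLED) && g->gicr_base_address)
		(*enabled_rdists)++;

	return 0;
}
```

In this model an Enabled entry with a zero `gicr_base_address` makes the old function return `-ENODEV`, while the new one returns 0 and leaves the counter untouched. Per the commit message, such tables would still be reported later by gic_populate_rdist().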