Message ID | 20241127152236.26122-1-farbere@amazon.com (mailing list archive) |
---|---|
State | New |
Series | [v2] arm64: kexec: Check if IRQ is already masked before masking |
On Wed, Nov 27 2024 at 15:22, Eliav Farber wrote:
> diff --git a/arch/arm/kernel/machine_kexec.c b/arch/arm/kernel/machine_kexec.c
> index 80ceb5bd2680..54d0bd1bd449 100644
> --- a/arch/arm/kernel/machine_kexec.c
> +++ b/arch/arm/kernel/machine_kexec.c
> @@ -142,11 +142,8 @@ static void machine_kexec_mask_interrupts(void)
>  		if (chip->irq_eoi && irqd_irq_inprogress(&desc->irq_data))
>  			chip->irq_eoi(&desc->irq_data);
>
> -		if (chip->irq_mask)
> -			chip->irq_mask(&desc->irq_data);
> -
> -		if (chip->irq_disable && !irqd_irq_disabled(&desc->irq_data))
> -			chip->irq_disable(&desc->irq_data);
> +		irq_set_status_flags(i, IRQ_DISABLE_UNLAZY);
> +		irq_disable(desc);

This is just wrong. If the interrupt was torn down, then its state is
deactivated and it was masked already. So the EOI handling and the
mask/disable dance are neither required nor make sense.

So this whole thing should be:

		chip = irq_desc_get_chip(desc);
-		if (!chip)
+		if (!chip || !irqd_is_started(&desc->irq_data))
			continue;

But what's worse is that we have 4 almost identical variants of the same
code. So instead of exposing core functionality and "fixing" up four
variants, can we please have a consolidated version of this function in
the core code:

		struct irq_chip *chip;
		int check_eoi = 1;

		chip = irq_desc_get_chip(desc);
		if (!chip || !irqd_is_started(&desc->irq_data))
			continue;

		if (IS_ENABLED(CONFIG_.....)) {
			/*
			 * Add a sensible comment which explains this.
			 */
			check_eoi = irq_set_irqchip_state(....);
		}

		if (check_eoi && ....)
			chip->irq_eoi(&desc->irq_data);

		irq_shutdown(desc);

No?

Thanks,

	tglx
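The control flow of the consolidated loop suggested above can be tried out in userspace with a small model. This is only a sketch: `struct mock_desc` and `mock_mask_all()` are hypothetical stand-ins, not kernel types or functions. The architecture-specific `irq_set_irqchip_state()` step is left out because the review leaves its arguments elided, and using the in-progress state as the EOI condition is an assumption taken from the existing per-architecture code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace stand-in for an interrupt descriptor. */
struct mock_desc {
	bool has_chip;		/* models irq_desc_get_chip() != NULL */
	bool started;		/* models irqd_is_started() */
	bool in_progress;	/* models irqd_irq_inprogress() */
	int eoi_calls;		/* how often the EOI hook would have run */
	int shutdown_calls;	/* how often irq_shutdown() would have run */
};

/*
 * Walk all descriptors once. Descriptors without a chip, or which were
 * never started (or already torn down), are skipped entirely, so no chip
 * callback is invoked on them. Returns the number of descriptors touched.
 */
static size_t mock_mask_all(struct mock_desc *descs, size_t n)
{
	size_t touched = 0;

	assert(descs || n == 0);

	for (size_t i = 0; i < n; i++) {
		struct mock_desc *d = &descs[i];

		if (!d->has_chip || !d->started)
			continue;

		/* Assumption: EOI only when a handler is still in flight. */
		if (d->in_progress) {
			d->eoi_calls++;
			d->in_progress = false;
		}

		/* Models irq_shutdown(): the interrupt is no longer started. */
		d->shutdown_calls++;
		d->started = false;
		touched++;
	}

	return touched;
}
```

Running the loop a second time over the same array touches nothing, which is the property the `irqd_is_started()` check is meant to provide.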
On Wed, Nov 27 2024 at 15:22, Eliav Farber wrote:

As a related note. The subject line is not really matching what the patch
does.

It wants to be split into a core change and one patch per architecture.

> This patch replaces the direct invocation of the irq_mask() and

git grep 'This patch' Documentation/process/

> irq_disable() hooks with simplified code that leverages the
> irq_disable() kernel infrastructure. This higher-level function checks
> the interrupt's state to prevent redundant operations. Additionally, the
> IRQ_DISABLE_UNLAZY status flag is set to ensure that, for interrupt
> chips lacking an irq_disable callback, the disable operation is handled
> using the lazy approach.

Not that it matters much anymore, but the last sentence does not make
sense:

  Set the UNLAZY flag so disable is handled using the LAZY approach ...

Thanks,

	tglx
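The contradiction in that last sentence is easier to see in a toy model. The sketch below assumes the usual meaning of the two terms: a lazy disable only records the disabled state and leaves the hardware alone, while an unlazy disable masks at the chip immediately. `struct mock_irq`, `mock_mask()` and `mock_irq_disable()` are hypothetical names for illustration and are not the kernel's implementation.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical userspace stand-in for one interrupt's state. */
struct mock_irq {
	bool disabled;		/* software "disabled" state */
	bool masked;		/* masked at the (mock) chip */
	bool has_disable_hook;	/* chip provides its own disable callback */
	bool unlazy;		/* models the IRQ_DISABLE_UNLAZY setting */
	int hw_writes;		/* number of times the chip was touched */
};

/* Mask at the chip unless the interrupt is already masked. */
static void mock_mask(struct mock_irq *irq)
{
	if (irq->masked)
		return;
	irq->hw_writes++;
	irq->masked = true;
}

/*
 * Disable an interrupt. Without a chip disable hook the hardware is only
 * touched when unlazy is set; otherwise the disable stays lazy.
 */
static void mock_irq_disable(struct mock_irq *irq)
{
	assert(irq);

	if (irq->disabled) {
		if (irq->unlazy)
			mock_mask(irq);
		return;
	}

	irq->disabled = true;
	if (irq->has_disable_hook) {
		irq->hw_writes++;
		irq->masked = true;
	} else if (irq->unlazy) {
		mock_mask(irq);
	}
}
```

In this model, setting `unlazy` is exactly what turns the lazy path off for chips without a disable hook, so "set UNLAZY to get the lazy approach" cannot be what the patch does.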
diff --git a/arch/arm/kernel/machine_kexec.c b/arch/arm/kernel/machine_kexec.c
index 80ceb5bd2680..54d0bd1bd449 100644
--- a/arch/arm/kernel/machine_kexec.c
+++ b/arch/arm/kernel/machine_kexec.c
@@ -142,11 +142,8 @@ static void machine_kexec_mask_interrupts(void)
 		if (chip->irq_eoi && irqd_irq_inprogress(&desc->irq_data))
 			chip->irq_eoi(&desc->irq_data);
 
-		if (chip->irq_mask)
-			chip->irq_mask(&desc->irq_data);
-
-		if (chip->irq_disable && !irqd_irq_disabled(&desc->irq_data))
-			chip->irq_disable(&desc->irq_data);
+		irq_set_status_flags(i, IRQ_DISABLE_UNLAZY);
+		irq_disable(desc);
 	}
 }
 
diff --git a/arch/arm64/kernel/machine_kexec.c b/arch/arm64/kernel/machine_kexec.c
index 82e2203d86a3..9b48d952df3e 100644
--- a/arch/arm64/kernel/machine_kexec.c
+++ b/arch/arm64/kernel/machine_kexec.c
@@ -230,11 +230,8 @@ static void machine_kexec_mask_interrupts(void)
 		    chip->irq_eoi)
 			chip->irq_eoi(&desc->irq_data);
 
-		if (chip->irq_mask)
-			chip->irq_mask(&desc->irq_data);
-
-		if (chip->irq_disable && !irqd_irq_disabled(&desc->irq_data))
-			chip->irq_disable(&desc->irq_data);
+		irq_set_status_flags(i, IRQ_DISABLE_UNLAZY);
+		irq_disable(desc);
 	}
 }
 
diff --git a/arch/powerpc/kexec/core.c b/arch/powerpc/kexec/core.c
index b8333a49ea5d..3489e50f5017 100644
--- a/arch/powerpc/kexec/core.c
+++ b/arch/powerpc/kexec/core.c
@@ -36,11 +36,8 @@ void machine_kexec_mask_interrupts(void) {
 		if (chip->irq_eoi && irqd_irq_inprogress(&desc->irq_data))
 			chip->irq_eoi(&desc->irq_data);
 
-		if (chip->irq_mask)
-			chip->irq_mask(&desc->irq_data);
-
-		if (chip->irq_disable && !irqd_irq_disabled(&desc->irq_data))
-			chip->irq_disable(&desc->irq_data);
+		irq_set_status_flags(i, IRQ_DISABLE_UNLAZY);
+		irq_disable(desc);
 	}
 }
 
diff --git a/arch/riscv/kernel/machine_kexec.c b/arch/riscv/kernel/machine_kexec.c
index 3c830a6f7ef4..a9df80e0602c 100644
--- a/arch/riscv/kernel/machine_kexec.c
+++ b/arch/riscv/kernel/machine_kexec.c
@@ -129,11 +129,8 @@ static void machine_kexec_mask_interrupts(void)
 		if (chip->irq_eoi && irqd_irq_inprogress(&desc->irq_data))
 			chip->irq_eoi(&desc->irq_data);
 
-		if (chip->irq_mask)
-			chip->irq_mask(&desc->irq_data);
-
-		if (chip->irq_disable && !irqd_irq_disabled(&desc->irq_data))
-			chip->irq_disable(&desc->irq_data);
+		irq_set_status_flags(i, IRQ_DISABLE_UNLAZY);
+		irq_disable(desc);
 	}
 }
 
diff --git a/include/linux/irq.h b/include/linux/irq.h
index fa711f80957b..176a7f671409 100644
--- a/include/linux/irq.h
+++ b/include/linux/irq.h
@@ -600,6 +600,8 @@ enum {
 
 #define IRQ_DEFAULT_INIT_FLAGS	ARCH_IRQ_INIT_FLAGS
 
+extern void irq_disable(struct irq_desc *desc);
+
 struct irqaction;
 extern int setup_percpu_irq(unsigned int irq, struct irqaction *new);
 extern void remove_percpu_irq(unsigned int irq, struct irqaction *act);
diff --git a/kernel/irq/internals.h b/kernel/irq/internals.h
index fe0272cd84a5..d9104d2b26b4 100644
--- a/kernel/irq/internals.h
+++ b/kernel/irq/internals.h
@@ -91,7 +91,6 @@ extern int irq_startup(struct irq_desc *desc, bool resend, bool force);
 extern void irq_shutdown(struct irq_desc *desc);
 extern void irq_shutdown_and_deactivate(struct irq_desc *desc);
 extern void irq_enable(struct irq_desc *desc);
-extern void irq_disable(struct irq_desc *desc);
 extern void irq_percpu_enable(struct irq_desc *desc, unsigned int cpu);
 extern void irq_percpu_disable(struct irq_desc *desc, unsigned int cpu);
 extern void mask_irq(struct irq_desc *desc);
During machine kexec, the function machine_kexec_mask_interrupts() is
responsible for disabling or masking all interrupts. While the irq_disable
hook ensures that an already-disabled IRQ is not disabled again, the
current implementation unconditionally invokes the irq_mask() function for
every interrupt descriptor, even when the interrupt is already masked.

A specific issue was observed in the crash kernel flow after unbinding a
device (prior to kexec) that used a GPIO as an IRQ source. The warning was
triggered by the gpiochip_disable_irq() function, which attempted to clear
the FLAG_IRQ_IS_ENABLED flag when FLAG_USED_AS_IRQ was not set:

```
void gpiochip_disable_irq(struct gpio_chip *gc, unsigned int offset)
{
	struct gpio_desc *desc = gpiochip_get_desc(gc, offset);

	if (!IS_ERR(desc) &&
	    !WARN_ON(!test_bit(FLAG_USED_AS_IRQ, &desc->flags)))
		clear_bit(FLAG_IRQ_IS_ENABLED, &desc->flags);
}
```

The issue emerged after commit a8173820f441 ("gpio: gpiolib: Allow GPIO
IRQs to lazy disable"), which introduced lazy disablement for GPIO IRQs by
replacing disable/enable hooks with mask/unmask hooks in some cases. While
irq_disable guarded against redundant operations, the unguarded irq_mask
in machine_kexec_mask_interrupts() led to warnings when invoked on
already-masked IRQs.

When a GPIO-IRQ-using driver is unbound, the IRQ is released, invoking
__irq_disable() and irq_state_set_masked(). A subsequent call to
machine_kexec_mask_interrupts() reinvoked chip->irq_mask(), leading to a
call chain that included gpiochip_irq_mask() and gpiochip_disable_irq().
Because FLAG_USED_AS_IRQ was cleared earlier, a warning was printed.

This patch replaces the direct invocation of the irq_mask() and
irq_disable() hooks with simplified code that leverages the irq_disable()
kernel infrastructure. This higher-level function checks the interrupt's
state to prevent redundant operations. Additionally, the
IRQ_DISABLE_UNLAZY status flag is set to ensure that, for interrupt chips
lacking an irq_disable callback, the disable operation is handled using
the lazy approach.

As part of this change, the irq_disable() declaration was moved from
kernel/irq/internals.h to include/linux/irq.h to make it accessible
outside the kernel/irq/ directory, as the former can only be included
within that directory.

Signed-off-by: Eliav Farber <farbere@amazon.com>
---
V1 -> V2:
- Implement an alternative solution by utilizing the kernel's
  irq_disable() infrastructure.
- Apply the fix to additional architectures, including ARM, PowerPC, and
  RISC-V.
---
 arch/arm/kernel/machine_kexec.c   | 7 ++-----
 arch/arm64/kernel/machine_kexec.c | 7 ++-----
 arch/powerpc/kexec/core.c         | 7 ++-----
 arch/riscv/kernel/machine_kexec.c | 7 ++-----
 include/linux/irq.h               | 2 ++
 kernel/irq/internals.h            | 1 -
 6 files changed, 10 insertions(+), 21 deletions(-)
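The failure sequence described in the commit message (unbind first, then an unconditional mask during kexec) can be reproduced in a userspace model. This is a minimal sketch under stated assumptions: `struct mock_gpio_irq` and every `mock_*` function are hypothetical stand-ins, the two booleans only imitate FLAG_USED_AS_IRQ and FLAG_IRQ_IS_ENABLED, and a counter replaces the WARN_ON().

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical userspace stand-in for one GPIO line used as an IRQ. */
struct mock_gpio_irq {
	bool used_as_irq;	/* imitates FLAG_USED_AS_IRQ */
	bool irq_enabled;	/* imitates FLAG_IRQ_IS_ENABLED */
	bool masked;		/* imitates the core's "masked" IRQ state */
	int warnings;		/* counts where the WARN_ON() would fire */
};

/* Imitates the chip mask hook ending up in gpiochip_disable_irq(). */
static void mock_chip_mask(struct mock_gpio_irq *irq)
{
	assert(irq);

	if (!irq->used_as_irq) {
		irq->warnings++;	/* line is no longer marked as an IRQ */
		return;
	}
	irq->irq_enabled = false;
}

/* Imitates driver unbind: the IRQ is masked, then the line is released. */
static void mock_unbind(struct mock_gpio_irq *irq)
{
	mock_chip_mask(irq);
	irq->masked = true;
	irq->used_as_irq = false;
}

/* Behaviour before the patch: the mask hook is called unconditionally. */
static void mock_kexec_mask_unguarded(struct mock_gpio_irq *irq)
{
	mock_chip_mask(irq);
	irq->masked = true;
}

/* Guarded behaviour: an already-masked interrupt is left alone. */
static void mock_kexec_mask_guarded(struct mock_gpio_irq *irq)
{
	if (irq->masked)
		return;
	mock_chip_mask(irq);
	irq->masked = true;
}

/* Run unbind followed by kexec masking; return the number of warnings. */
static int mock_run(bool guarded)
{
	struct mock_gpio_irq irq = { .used_as_irq = true, .irq_enabled = true };

	mock_unbind(&irq);
	if (guarded)
		mock_kexec_mask_guarded(&irq);
	else
		mock_kexec_mask_unguarded(&irq);

	return irq.warnings;
}
```

Calling `mock_run(false)` counts one warning and `mock_run(true)` counts none; the only difference between the two paths is the check of the masked state before the chip callback is invoked.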