diff mbox series

[7/7] PCI/AER: Fix the broken interrupt injection

Message ID 20200306130624.098374457@linutronix.de (mailing list archive)
State Not Applicable, archived
Headers show
Series genirq/PCI: Sanitize interrupt injection | expand

Commit Message

Thomas Gleixner March 6, 2020, 1:03 p.m. UTC
The AER error injection mechanism just blindly abuses generic_handle_irq()
which is really not meant for consumption by random drivers. The include of
linux/irq.h should have been a red flag in the first place. Driver code,
unless implementing interrupt chips or low level hypervisor functionality
has absolutely no business with that.

Invoking generic_handle_irq() from non interrupt handling context can have
nasty side effects at least on x86 due to the hardware trainwreck which
makes interrupt affinity changes a fragile beast. Sathyanarayanan triggered
a NULL pointer dereference in the low level APIC code that way. While the
particular pointer could be checked this would only paper over the issue
because there are other ways to trigger warnings or silently corrupt state.

Invoke the new irq_inject_interrupt() mechanism, which has the necessary
sanity checks in place and injects the interrupt via the irq_retrigger()
mechanism, which is at least halfways safe vs. the fragile x86 affinity
change mechanics.

It's safe on x86 as it does not corrupt state, but it still can cause a
premature completion of an interrupt affinity change causing the interrupt
line to become stale. Very unlikely, but possible.

For regular operations this is a non issue as AER error injection is meant
for debugging and testing and not for usage on production systems. People
using this should better know what they are doing.

Fixes: 390e2db82480 ("PCI/AER: Abstract AER interrupt handling")
Reported-by: sathyanarayanan.kuppuswamy@linux.intel.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
---
 drivers/pci/pcie/Kconfig      |    1 +
 drivers/pci/pcie/aer_inject.c |    6 ++----
 2 files changed, 3 insertions(+), 4 deletions(-)

Comments

Kuppuswamy Sathyanarayanan March 6, 2020, 6:32 p.m. UTC | #1
On 3/6/20 5:03 AM, Thomas Gleixner wrote:
> The AER error injection mechanism just blindly abuses generic_handle_irq()
> which is really not meant for consumption by random drivers. The include of
> linux/irq.h should have been a red flag in the first place. Driver code,
> unless implementing interrupt chips or low level hypervisor functionality
> has absolutely no business with that.
>
> Invoking generic_handle_irq() from non interrupt handling context can have
> nasty side effects at least on x86 due to the hardware trainwreck which
> makes interrupt affinity changes a fragile beast. Sathyanarayanan triggered
> a NULL pointer dereference in the low level APIC code that way. While the
> particular pointer could be checked this would only paper over the issue
> because there are other ways to trigger warnings or silently corrupt state.
>
> Invoke the new irq_inject_interrupt() mechanism, which has the necessary
> sanity checks in place and injects the interrupt via the irq_retrigger()
> mechanism, which is at least halfways safe vs. the fragile x86 affinity
> change mechanics.
>
> It's safe on x86 as it does not corrupt state, but it still can cause a
> premature completion of an interrupt affinity change causing the interrupt
> line to become stale. Very unlikely, but possible.
>
> For regular operations this is a non issue as AER error injection is meant
> for debugging and testing and not for usage on production systems. People
> using this should better know what they are doing.
It looks good to me.

Reviewed-by: Kuppuswamy Sathyanarayanan 
<sathyanarayanan.kuppuswamy@linux.intel.com>
Tested-by: Kuppuswamy Sathyanarayanan 
<sathyanarayanan.kuppuswamy@linux.intel.com>
>
> Fixes: 390e2db82480 ("PCI/AER: Abstract AER interrupt handling")
> Reported-by: sathyanarayanan.kuppuswamy@linux.intel.com
> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
> ---
>   drivers/pci/pcie/Kconfig      |    1 +
>   drivers/pci/pcie/aer_inject.c |    6 ++----
>   2 files changed, 3 insertions(+), 4 deletions(-)
>
> --- a/drivers/pci/pcie/Kconfig
> +++ b/drivers/pci/pcie/Kconfig
> @@ -34,6 +34,7 @@ config PCIEAER
>   config PCIEAER_INJECT
>   	tristate "PCI Express error injection support"
>   	depends on PCIEAER
> +	select GENERIC_IRQ_INJECTION
>   	help
>   	  This enables PCI Express Root Port Advanced Error Reporting
>   	  (AER) software error injector.
> --- a/drivers/pci/pcie/aer_inject.c
> +++ b/drivers/pci/pcie/aer_inject.c
> @@ -16,7 +16,7 @@
>   
>   #include <linux/module.h>
>   #include <linux/init.h>
> -#include <linux/irq.h>
> +#include <linux/interrupt.h>
>   #include <linux/miscdevice.h>
>   #include <linux/pci.h>
>   #include <linux/slab.h>
> @@ -468,9 +468,7 @@ static int aer_inject(struct aer_error_i
>   		}
>   		pci_info(edev->port, "Injecting errors %08x/%08x into device %s\n",
>   			 einj->cor_status, einj->uncor_status, pci_name(dev));
> -		local_irq_disable();
> -		generic_handle_irq(edev->irq);
> -		local_irq_enable();
> +		ret = irq_inject_interrupt(edev->irq);
>   	} else {
>   		pci_err(rpdev, "AER device not found\n");
>   		ret = -ENODEV;
>
Kuppuswamy Sathyanarayanan March 6, 2020, 7:29 p.m. UTC | #2
On 3/6/20 10:32 AM, Kuppuswamy Sathyanarayanan wrote:
>
> On 3/6/20 5:03 AM, Thomas Gleixner wrote:
>> The AER error injection mechanism just blindly abuses 
>> generic_handle_irq()
>> which is really not meant for consumption by random drivers. The 
>> include of
>> linux/irq.h should have been a red flag in the first place. Driver code,
>> unless implementing interrupt chips or low level hypervisor 
>> functionality
>> has absolutely no business with that.
>>
>> Invoking generic_handle_irq() from non interrupt handling context can 
>> have
>> nasty side effects at least on x86 due to the hardware trainwreck which
>> makes interrupt affinity changes a fragile beast. Sathyanarayanan 
>> triggered
>> a NULL pointer dereference in the low level APIC code that way. While 
>> the
>> particular pointer could be checked this would only paper over the issue
>> because there are other ways to trigger warnings or silently corrupt 
>> state.
>>
>> Invoke the new irq_inject_interrupt() mechanism, which has the necessary
>> sanity checks in place and injects the interrupt via the irq_retrigger()
>> mechanism, which is at least halfways safe vs. the fragile x86 affinity
>> change mechanics.
>>
>> It's safe on x86 as it does not corrupt state, but it still can cause a
>> premature completion of an interrupt affinity change causing the 
>> interrupt
>> line to become stale. Very unlikely, but possible.
>>
>> For regular operations this is a non issue as AER error injection is 
>> meant
>> for debugging and testing and not for usage on production systems. 
>> People
>> using this should better know what they are doing.
> It looks good to me.
>
> Reviewed-by: Kuppuswamy Sathyanarayanan 
> <sathyanarayanan.kuppuswamy@linux.intel.com>
> Tested-by: Kuppuswamy Sathyanarayanan 
> <sathyanarayanan.kuppuswamy@linux.intel.com>
>>
>> Fixes: 390e2db82480 ("PCI/AER: Abstract AER interrupt handling")
This patch is merged in v4.20 kernel. So this fix could be a candidate 
for stable fix.
>> Reported-by: sathyanarayanan.kuppuswamy@linux.intel.com
>> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
>> ---
>>   drivers/pci/pcie/Kconfig      |    1 +
>>   drivers/pci/pcie/aer_inject.c |    6 ++----
>>   2 files changed, 3 insertions(+), 4 deletions(-)
>>
>> --- a/drivers/pci/pcie/Kconfig
>> +++ b/drivers/pci/pcie/Kconfig
>> @@ -34,6 +34,7 @@ config PCIEAER
>>   config PCIEAER_INJECT
>>       tristate "PCI Express error injection support"
>>       depends on PCIEAER
>> +    select GENERIC_IRQ_INJECTION
>>       help
>>         This enables PCI Express Root Port Advanced Error Reporting
>>         (AER) software error injector.
>> --- a/drivers/pci/pcie/aer_inject.c
>> +++ b/drivers/pci/pcie/aer_inject.c
>> @@ -16,7 +16,7 @@
>>     #include <linux/module.h>
>>   #include <linux/init.h>
>> -#include <linux/irq.h>
>> +#include <linux/interrupt.h>
>>   #include <linux/miscdevice.h>
>>   #include <linux/pci.h>
>>   #include <linux/slab.h>
>> @@ -468,9 +468,7 @@ static int aer_inject(struct aer_error_i
>>           }
>>           pci_info(edev->port, "Injecting errors %08x/%08x into 
>> device %s\n",
>>                einj->cor_status, einj->uncor_status, pci_name(dev));
>> -        local_irq_disable();
>> -        generic_handle_irq(edev->irq);
>> -        local_irq_enable();
>> +        ret = irq_inject_interrupt(edev->irq);
>>       } else {
>>           pci_err(rpdev, "AER device not found\n");
>>           ret = -ENODEV;
>>
diff mbox series

Patch

--- a/drivers/pci/pcie/Kconfig
+++ b/drivers/pci/pcie/Kconfig
@@ -34,6 +34,7 @@  config PCIEAER
 config PCIEAER_INJECT
 	tristate "PCI Express error injection support"
 	depends on PCIEAER
+	select GENERIC_IRQ_INJECTION
 	help
 	  This enables PCI Express Root Port Advanced Error Reporting
 	  (AER) software error injector.
--- a/drivers/pci/pcie/aer_inject.c
+++ b/drivers/pci/pcie/aer_inject.c
@@ -16,7 +16,7 @@ 
 
 #include <linux/module.h>
 #include <linux/init.h>
-#include <linux/irq.h>
+#include <linux/interrupt.h>
 #include <linux/miscdevice.h>
 #include <linux/pci.h>
 #include <linux/slab.h>
@@ -468,9 +468,7 @@  static int aer_inject(struct aer_error_i
 		}
 		pci_info(edev->port, "Injecting errors %08x/%08x into device %s\n",
 			 einj->cor_status, einj->uncor_status, pci_name(dev));
-		local_irq_disable();
-		generic_handle_irq(edev->irq);
-		local_irq_enable();
+		ret = irq_inject_interrupt(edev->irq);
 	} else {
 		pci_err(rpdev, "AER device not found\n");
 		ret = -ENODEV;