x86/PCI: Don't alloc pcibios-irq when MSI is enabled

Message ID 1444386214-26319-1-git-send-email-joro@8bytes.org (mailing list archive)
State New, archived
Delegated to: Bjorn Helgaas

Commit Message

Joerg Roedel Oct. 9, 2015, 10:23 a.m. UTC
From: Joerg Roedel <jroedel@suse.de>

The pcibios-irq and MSI both use dev->irq to store the IRQ
number. While the MSI code checks for that and frees the
pcibios-irq before overwriting dev->irq, the
pcibios_alloc_irq function does not.

Usually this is not a problem, as the pcibios-irq is
allocated before probe time of the device and the MSI irq is
allocated from the driver's probe path.

But there are PCI devices handled by the core kernel and not
by a standard pci driver, like the AMD IOMMU for example.
For the AMD IOMMU a normal pci device driver does not make
sense, because a driver can be forcibly unbound from its
device, which is not a good idea for an IOMMU.

Nevertheless the PCI core code tries to match the PCI device
implementing the AMD IOMMU against drivers, and
allocates/frees a pcibios IRQ every time it tries out a new
driver. This overwrites the dev->irq field set by
pci_enable_msi() and sets it to 0 in the end (because the
probe fails and the pcibios-irq is freed again).

On suspend/resume this breaks the kernel, because the irq
descriptor for irq 0 is NULL.

Fix this by not allocating a pcibios-irq when MSI is
already active. This also has the benefit that a device
claimed by the core kernel cannot be probed by a pci driver
later.

Cc: Jiang Liu <jiang.liu@linux.intel.com>
Reported-by: Borislav Petkov <bp@alien8.de>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
---
 arch/x86/pci/common.c | 8 ++++++++
 1 file changed, 8 insertions(+)
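
The failure sequence described in the commit message can be illustrated with a small user-space model. This is only a sketch under stated assumptions: struct toy_pci_dev, the toy_* helpers, and TOY_LEGACY_IRQ are hypothetical stand-ins invented for illustration and are not kernel code. The model assumes, as the commit message says, that freeing the pcibios-irq leaves dev->irq at 0.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for struct pci_dev (hypothetical, not the kernel struct). */
struct toy_pci_dev {
	unsigned int irq;	/* shared by the pcibios-irq and the MSI irq */
	bool msi_enabled;	/* what pci_dev_msi_enabled() would report */
};

#define TOY_LEGACY_IRQ 11u	/* arbitrary legacy IRQ number for the model */

/* Models pcibios_alloc_irq(); 'guard' selects the patched behaviour. */
static int toy_alloc_irq(struct toy_pci_dev *dev, bool guard)
{
	assert(dev != NULL);
	if (guard && dev->msi_enabled)
		return -EBUSY;		/* patched: dev->irq is left alone */
	dev->irq = TOY_LEGACY_IRQ;	/* unpatched: overwrites the MSI irq */
	return 0;
}

/* Models pcibios_free_irq(): assumed to reset dev->irq to 0. */
static void toy_free_irq(struct toy_pci_dev *dev)
{
	assert(dev != NULL);
	dev->irq = 0;
}

/* Models one failed driver probe attempt: allocate, probe fails, free. */
static int toy_failed_probe(struct toy_pci_dev *dev, bool guard)
{
	int err = toy_alloc_irq(dev, guard);

	if (err < 0)
		return err;	/* probe never runs, so nothing is freed */
	toy_free_irq(dev);	/* probe failed: undo the allocation */
	return -ENODEV;
}

/* Returns dev->irq after core code enabled MSI and one probe attempt failed. */
static unsigned int toy_irq_after_probe(unsigned int msi_irq, bool guard)
{
	struct toy_pci_dev dev = { .irq = msi_irq, .msi_enabled = true };

	(void)toy_failed_probe(&dev, guard);
	return dev.irq;
}
```

In this model, toy_irq_after_probe(42, false) ends with irq 0, the state that breaks suspend/resume, while toy_irq_after_probe(42, true) keeps the MSI irq because the guarded allocation bails out early.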

Comments

Thomas Gleixner Oct. 9, 2015, 10:26 a.m. UTC | #1
On Fri, 9 Oct 2015, Joerg Roedel wrote:
> From: Joerg Roedel <jroedel@suse.de>
> 
> The pcibios-irq and MSI both use dev->irq to store the IRQ
> number. While the MSI code checks for that and frees the
> pcibios-irq before overwriting dev->irq, the
> pcibios_alloc_irq function does not.
> 
> Usually this is not a problem, as the pcibios-irq is
> allocated before probe time of the device and the MSI irq is
> allocated from the driver's probe path.
> 
> But there are PCI devices handled by the core kernel and not
> by a standard pci driver, like the AMD IOMMU for example.
> For the AMD IOMMU a normal pci device driver does not make
> sense, because a driver can be forcibly unbound from its
> device, which is not a good idea for an IOMMU.
> 
> Nevertheless the PCI core code tries to match the PCI device
> implementing the AMD IOMMU against drivers, and
> allocates/frees a pcibios IRQ every time it tries out a new
> driver. This overwrites the dev->irq field set by
> pci_enable_msi() and sets it to 0 in the end (because the
> probe fails and the pcibios-irq is freed again).
> 
> On suspend/resume this breaks the kernel, because the irq
> descriptor for irq 0 is NULL.
> 
> Fix this by not allocating a pcibios-irq when MSI is
> already active. This also has the benefit, that a device
> claimed by the core kernel can not be probed by a pci driver
> later.
> 
> Cc: Jiang Liu <jiang.liu@linux.intel.com>
> Reported-by: Borislav Petkov <bp@alien8.de>
> Signed-off-by: Joerg Roedel <jroedel@suse.de>

Reviewed-by: Thomas Gleixner <tglx@linutronix.de>

> ---
>  arch/x86/pci/common.c | 8 ++++++++
>  1 file changed, 8 insertions(+)
> 
> diff --git a/arch/x86/pci/common.c b/arch/x86/pci/common.c
> index dc78a4a..6254c06 100644
> --- a/arch/x86/pci/common.c
> +++ b/arch/x86/pci/common.c
> @@ -675,6 +675,14 @@ int pcibios_add_device(struct pci_dev *dev)
>  
>  int pcibios_alloc_irq(struct pci_dev *dev)
>  {
> +	/*
> +	 * If the PCI device was already claimed by core code and has
> +	 * MSI enabled, probing of the pcibios irq will overwrite
> +	 * dev->irq.  So bail out if MSI is already enabled.
> +	 */
> +	if (pci_dev_msi_enabled(dev))
> +		return -EBUSY;
> +
>  	return pcibios_enable_irq(dev);
>  }
>  
> -- 
> 1.9.1
> 
> 
--
To unsubscribe from this list: send the line "unsubscribe linux-pci" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Jiang Liu Oct. 9, 2015, 2:07 p.m. UTC | #2
Hi Joerg,
	I prepared this patchset yesterday but forgot to send it out.
I think it is still worth sending out for review, so here we go :)

---
Commit 991de2e59090 ("PCI, x86: Implement pcibios_alloc_irq() and
pcibios_free_irq()") breaks suspend/resume on some AMD platforms.
The root cause is:
1) The AMD IOMMU driver enables MSI on IOMMU PCI devices at an early boot
   stage.
2) The PCI driver binding code tries to allocate/free a PCI legacy IRQ when
   binding other PCI drivers to IOMMU PCI devices at a later boot stage,
   and corrupts the PCI MSI allocation information.
3) System resume breaks when restoring PCI MSI state from the damaged
   data.

We have tried one solution: detecting that a PCI device is in use by the
IOMMU driver by checking pci_msi_enabled(). But that is too specific; we
should instead prevent binding PCI drivers to PCI devices used by non-PCI
drivers.

Fortunately, we could prevent binding PCI drivers to PCI devices by setting
pci_dev->match_driver to false. If needed, we could implement a helper
function to manipulate pci_dev->match_driver.

Jiang Liu (2):
  iommu/amd: Prevent binding other PCI drivers to IOMMU PCI devices
  ACPI, PCI: Prevent binding other PCI drivers to IOAPIC PCI devices

 drivers/acpi/ioapic.c          |    7 +++++--
 drivers/iommu/amd_iommu_init.c |    3 +++
 2 files changed, 8 insertions(+), 2 deletions(-)
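
The match_driver approach from the cover letter above can be modelled in the same toy style. This is a sketch under stated assumptions: struct toy_claimed_dev and the toy_* helpers are hypothetical names for illustration only, and a failed attach attempt is assumed to leave dev->irq at 0, as described in the original commit message.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for struct pci_dev (hypothetical, not the kernel struct). */
struct toy_claimed_dev {
	unsigned int irq;	/* holds the MSI irq set up by core code */
	bool match_driver;	/* models pci_dev->match_driver */
};

/* Models the bus match step: match_driver == false means "never matches". */
static bool toy_bus_match(const struct toy_claimed_dev *dev)
{
	assert(dev != NULL);
	return dev->match_driver;
}

/* Models one attach attempt whose probe fails. */
static void toy_try_attach(struct toy_claimed_dev *dev)
{
	assert(dev != NULL);
	if (!toy_bus_match(dev))
		return;		/* no match: the probe path never runs */
	dev->irq = 0;		/* legacy IRQ allocated, probe fails, IRQ freed */
}

/* Returns dev->irq after 'drivers' attach attempts against the device. */
static unsigned int toy_irq_after_attach(unsigned int msi_irq,
					 bool match_driver,
					 unsigned int drivers)
{
	struct toy_claimed_dev dev = {
		.irq = msi_irq,
		.match_driver = match_driver,
	};

	for (unsigned int i = 0; i < drivers; i++)
		toy_try_attach(&dev);
	return dev.irq;
}
```

With match_driver cleared the MSI irq survives any number of attach attempts in this model; the protection depends on the code that claims the device remembering to clear the flag.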
Bjorn Helgaas Oct. 21, 2015, 4:23 p.m. UTC | #3
On Fri, Oct 09, 2015 at 12:23:34PM +0200, Joerg Roedel wrote:
> From: Joerg Roedel <jroedel@suse.de>
> 
> The pcibios-irq and MSI both use dev->irq to store the IRQ
> number. While the MSI code checks for that and frees the
> pcibios-irq before overwriting dev->irq, the
> pcibios_alloc_irq function does not.
> 
> Usually this is not a problem, as the pcibios-irq is
> allocated before probe time of the device and the MSI irq is
> allocated from the driver's probe path.
> 
> But there are PCI devices handled by the core kernel and not
> by a standard pci driver, like the AMD IOMMU for example.
> For the AMD IOMMU a normal pci device driver does not make
> sense, because a driver can be forcibly unbound from its
> device, which is not a good idea for an IOMMU.
> 
> Nevertheless the PCI core code tries to match the PCI device
> implementing the AMD IOMMU against drivers, and
> allocates/frees a pcibios IRQ every time it tries out a new
> driver. This overwrites the dev->irq field set by
> pci_enable_msi() and sets it to 0 in the end (because the
> probe fails and the pcibios-irq is freed again).
> 
> On suspend/resume this breaks the kernel, because the irq
> descriptor for irq 0 is NULL.
> 
> Fix this by not allocating a pcibios-irq when MSI is
> already active. This also has the benefit, that a device
> claimed by the core kernel can not be probed by a pci driver
> later.
> 
> Cc: Jiang Liu <jiang.liu@linux.intel.com>
> Reported-by: Borislav Petkov <bp@alien8.de>
> Signed-off-by: Joerg Roedel <jroedel@suse.de>

Applied with Thomas' reviewed-by to pci/msi for v4.4, thanks, Joerg!

> ---
>  arch/x86/pci/common.c | 8 ++++++++
>  1 file changed, 8 insertions(+)
> 
> diff --git a/arch/x86/pci/common.c b/arch/x86/pci/common.c
> index dc78a4a..6254c06 100644
> --- a/arch/x86/pci/common.c
> +++ b/arch/x86/pci/common.c
> @@ -675,6 +675,14 @@ int pcibios_add_device(struct pci_dev *dev)
>  
>  int pcibios_alloc_irq(struct pci_dev *dev)
>  {
> +	/*
> +	 * If the PCI device was already claimed by core code and has
> +	 * MSI enabled, probing of the pcibios irq will overwrite
> +	 * dev->irq.  So bail out if MSI is already enabled.
> +	 */
> +	if (pci_dev_msi_enabled(dev))
> +		return -EBUSY;
> +
>  	return pcibios_enable_irq(dev);
>  }
>  
> -- 
> 1.9.1
> 
Jiang Liu Oct. 23, 2015, 2:02 a.m. UTC | #4
On 2015/10/22 0:23, Bjorn Helgaas wrote:
> On Fri, Oct 09, 2015 at 12:23:34PM +0200, Joerg Roedel wrote:
>> From: Joerg Roedel <jroedel@suse.de>
>>
>> The pcibios-irq and MSI both use dev->irq to store the IRQ
>> number. While the MSI code checks for that and frees the
>> pcibios-irq before overwriting dev->irq, the
>> pcibios_alloc_irq function does not.
>>
>> Usually this is not a problem, as the pcibios-irq is
>> allocated before probe time of the device and the MSI irq is
>> allocated from the driver's probe path.
>>
>> But there are PCI devices handled by the core kernel and not
>> by a standard pci driver, like the AMD IOMMU for example.
>> For the AMD IOMMU a normal pci device driver does not make
>> sense, because a driver can be forcibly unbound from its
>> device, which is not a good idea for an IOMMU.
>>
>> Nevertheless the PCI core code tries to match the PCI device
>> implementing the AMD IOMMU against drivers, and
>> allocates/frees a pcibios IRQ every time it tries out a new
>> driver. This overwrites the dev->irq field set by
>> pci_enable_msi() and sets it to 0 in the end (because the
>> probe fails and the pcibios-irq is freed again).
>>
>> On suspend/resume this breaks the kernel, because the irq
>> descriptor for irq 0 is NULL.
>>
>> Fix this by not allocating a pcibios-irq when MSI is
>> already active. This also has the benefit, that a device
>> claimed by the core kernel can not be probed by a pci driver
>> later.
>>
>> Cc: Jiang Liu <jiang.liu@linux.intel.com>
>> Reported-by: Borislav Petkov <bp@alien8.de>
>> Signed-off-by: Joerg Roedel <jroedel@suse.de>
> 
> Applied with Thomas' reviewed-by to pci/msi for v4.4, thanks, Joerg!
Hi Bjorn,
	There's another patch already merged into the mainline kernel,
which solves this issue in another way by making use of the
pci_dev->match_driver flag. Please refer to:
cbbc00be2ce3 ("iommu/amd: Prevent binding other PCI drivers to IOMMU PCI
devices")
	So I think this patch is redundant now.
Thanks,
Gerry

> 
>> ---
>>  arch/x86/pci/common.c | 8 ++++++++
>>  1 file changed, 8 insertions(+)
>>
>> diff --git a/arch/x86/pci/common.c b/arch/x86/pci/common.c
>> index dc78a4a..6254c06 100644
>> --- a/arch/x86/pci/common.c
>> +++ b/arch/x86/pci/common.c
>> @@ -675,6 +675,14 @@ int pcibios_add_device(struct pci_dev *dev)
>>  
>>  int pcibios_alloc_irq(struct pci_dev *dev)
>>  {
>> +	/*
>> +	 * If the PCI device was already claimed by core code and has
>> +	 * MSI enabled, probing of the pcibios irq will overwrite
>> +	 * dev->irq.  So bail out if MSI is already enabled.
>> +	 */
>> +	if (pci_dev_msi_enabled(dev))
>> +		return -EBUSY;
>> +
>>  	return pcibios_enable_irq(dev);
>>  }
>>  
>> -- 
>> 1.9.1
>>
Joerg Roedel Oct. 23, 2015, 8:23 a.m. UTC | #5
On Fri, Oct 23, 2015 at 10:02:11AM +0800, Jiang Liu wrote:
> 	There's another patch already merged into mainstream kernel,
> which solves this issue in another way by making use of
> pci_dev->match_driver flag. Please refer to:
> cbbc00be2ce3 ("iommu/amd: Prevent binding other PCI drivers to IOMMU PCI
> devices")
> 	So I think this patch is redundant now.

Yeah, it fixes the same problem already fixed by your match_driver fix.
But I think this fix still makes sense, as it prevents the problem from
reappearing in the future with other drivers. Setting pci_dev->match_driver
could easily be forgotten when writing new code.


	Joerg

Patch

diff --git a/arch/x86/pci/common.c b/arch/x86/pci/common.c
index dc78a4a..6254c06 100644
--- a/arch/x86/pci/common.c
+++ b/arch/x86/pci/common.c
@@ -675,6 +675,14 @@ int pcibios_add_device(struct pci_dev *dev)
 
 int pcibios_alloc_irq(struct pci_dev *dev)
 {
+	/*
+	 * If the PCI device was already claimed by core code and has
+	 * MSI enabled, probing of the pcibios irq will overwrite
+	 * dev->irq.  So bail out if MSI is already enabled.
+	 */
+	if (pci_dev_msi_enabled(dev))
+		return -EBUSY;
+
 	return pcibios_enable_irq(dev);
 }