Message ID | 1422456683-797-1-git-send-email-marc.zyngier@arm.com (mailing list archive) |
---|---|
State | New, archived |
Delegated to: | Bjorn Helgaas |
Headers | show |
On 2015/1/28 22:51, Marc Zyngier wrote: > pcibios_update_irq writes an irq number into the config space > of a given PCI device, but ignores the fact that this number > is a virtual interrupt number, which might be a very different > value from what the underlying hardware is using. > > The obvious fix is to fetch the HW interrupt number from the > corresponding irq_data structure. This is slightly complicated > by the fact that this interrupt might be services by a stacked > domain. > > This has been tested on KVM with kvmtool. > > Reported-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com> > Tested-by: Andre Przywara <andre.przywara@arm.com> > Signed-off-by: Marc Zyngier <marc.zyngier@arm.com> > --- > drivers/pci/setup-irq.c | 12 ++++++++++-- > 1 file changed, 10 insertions(+), 2 deletions(-) > > diff --git a/drivers/pci/setup-irq.c b/drivers/pci/setup-irq.c > index 4e2d595..828cbc9 100644 > --- a/drivers/pci/setup-irq.c > +++ b/drivers/pci/setup-irq.c > @@ -15,11 +15,19 @@ > #include <linux/errno.h> > #include <linux/ioport.h> > #include <linux/cache.h> > +#include <linux/irq.h> > > void __weak pcibios_update_irq(struct pci_dev *dev, int irq) > { > - dev_dbg(&dev->dev, "assigning IRQ %02d\n", irq); > - pci_write_config_byte(dev, PCI_INTERRUPT_LINE, irq); > + struct irq_data *d; > + > + d = irq_get_irq_data(irq); > +#ifdef CONFIG_IRQ_DOMAIN_HIERARCHY > + while (d->parent_data) > + d = d->parent_data; > +#endif > + dev_dbg(&dev->dev, "assigning IRQ %02ld\n", d->hwirq); > + pci_write_config_byte(dev, PCI_INTERRUPT_LINE, d->hwirq); > } Hi Mark, Instead of modifying the common version, how about implementing an arch specific version? Arch may have different way to determine the irq number. Above implementation doesn't work with x86, for example. Regards! Gerry > > static void pdev_fixup_irq(struct pci_dev *dev, > -- To unsubscribe from this list: send the line "unsubscribe linux-pci" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Wed, Jan 28, 2015 at 9:27 AM, Marc Zyngier <marc.zyngier@arm.com> wrote: > Hi Gerry, > > On 28/01/15 15:21, Jiang Liu wrote: >> >> >> On 2015/1/28 22:51, Marc Zyngier wrote: >>> pcibios_update_irq writes an irq number into the config space >>> of a given PCI device, but ignores the fact that this number >>> is a virtual interrupt number, which might be a very different >>> value from what the underlying hardware is using. >>> >>> The obvious fix is to fetch the HW interrupt number from the >>> corresponding irq_data structure. This is slightly complicated >>> by the fact that this interrupt might be services by a stacked >>> domain. >>> >>> This has been tested on KVM with kvmtool. >>> >>> Reported-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com> >>> Tested-by: Andre Przywara <andre.przywara@arm.com> >>> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com> >>> --- >>> drivers/pci/setup-irq.c | 12 ++++++++++-- >>> 1 file changed, 10 insertions(+), 2 deletions(-) >>> >>> diff --git a/drivers/pci/setup-irq.c b/drivers/pci/setup-irq.c >>> index 4e2d595..828cbc9 100644 >>> --- a/drivers/pci/setup-irq.c >>> +++ b/drivers/pci/setup-irq.c >>> @@ -15,11 +15,19 @@ >>> #include <linux/errno.h> >>> #include <linux/ioport.h> >>> #include <linux/cache.h> >>> +#include <linux/irq.h> >>> >>> void __weak pcibios_update_irq(struct pci_dev *dev, int irq) >>> { >>> - dev_dbg(&dev->dev, "assigning IRQ %02d\n", irq); >>> - pci_write_config_byte(dev, PCI_INTERRUPT_LINE, irq); >>> + struct irq_data *d; >>> + >>> + d = irq_get_irq_data(irq); >>> +#ifdef CONFIG_IRQ_DOMAIN_HIERARCHY >>> + while (d->parent_data) >>> + d = d->parent_data; >>> +#endif >>> + dev_dbg(&dev->dev, "assigning IRQ %02ld\n", d->hwirq); >>> + pci_write_config_byte(dev, PCI_INTERRUPT_LINE, d->hwirq); >>> } >> Hi Mark, >> Instead of modifying the common version, how about >> implementing an arch specific version? Arch may have different >> way to determine the irq number. Above implementation doesn't >> work with x86, for example. > > If you look at the Makefile, this file is used on: > > obj-$(CONFIG_ALPHA) += setup-irq.o > obj-$(CONFIG_ARM) += setup-irq.o > obj-$(CONFIG_UNICORE32) += setup-irq.o > obj-$(CONFIG_SUPERH) += setup-irq.o > obj-$(CONFIG_MIPS) += setup-irq.o > obj-$(CONFIG_TILE) += setup-irq.o > obj-$(CONFIG_SPARC_LEON) += setup-irq.o > obj-$(CONFIG_M68K) += setup-irq.o > > x86 doesn't use that at all. Since you're looking at this, Marc, do you see a nice way to get rid of these arch dependencies in the Makefile and unify this a bit? We still have this pci_fixup_irqs() ugliness -- it's not really arch-specific at all, but it's called from arch code, and it uses for_each_pci_dev(), which obviously only works for things present at boot and not for things hot-added later. Bjorn -- To unsubscribe from this list: send the line "unsubscribe linux-pci" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Wed, Jan 28, 2015 at 8:51 AM, Marc Zyngier <marc.zyngier@arm.com> wrote: > pcibios_update_irq writes an irq number into the config space > of a given PCI device, but ignores the fact that this number > is a virtual interrupt number, which might be a very different > value from what the underlying hardware is using. > > The obvious fix is to fetch the HW interrupt number from the > corresponding irq_data structure. This is slightly complicated > by the fact that this interrupt might be services by a stacked > domain. > > This has been tested on KVM with kvmtool. > > Reported-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com> > Tested-by: Andre Przywara <andre.przywara@arm.com> > Signed-off-by: Marc Zyngier <marc.zyngier@arm.com> Jiang, are you OK with this patch as-is now, since it isn't used on x86? Marc, Lorenzo, I assume this actually fixes a bug. Can we get any more details about what happens when you hit the bug, and how you reproduced it (what platform, driver, etc.)? Bjorn > --- > drivers/pci/setup-irq.c | 12 ++++++++++-- > 1 file changed, 10 insertions(+), 2 deletions(-) > > diff --git a/drivers/pci/setup-irq.c b/drivers/pci/setup-irq.c > index 4e2d595..828cbc9 100644 > --- a/drivers/pci/setup-irq.c > +++ b/drivers/pci/setup-irq.c > @@ -15,11 +15,19 @@ > #include <linux/errno.h> > #include <linux/ioport.h> > #include <linux/cache.h> > +#include <linux/irq.h> > > void __weak pcibios_update_irq(struct pci_dev *dev, int irq) > { > - dev_dbg(&dev->dev, "assigning IRQ %02d\n", irq); > - pci_write_config_byte(dev, PCI_INTERRUPT_LINE, irq); > + struct irq_data *d; > + > + d = irq_get_irq_data(irq); > +#ifdef CONFIG_IRQ_DOMAIN_HIERARCHY > + while (d->parent_data) > + d = d->parent_data; > +#endif > + dev_dbg(&dev->dev, "assigning IRQ %02ld\n", d->hwirq); > + pci_write_config_byte(dev, PCI_INTERRUPT_LINE, d->hwirq); > } > > static void pdev_fixup_irq(struct pci_dev *dev, > -- > 2.1.4 > -- To unsubscribe from this list: send the line "unsubscribe linux-pci" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On 2015/2/2 23:57, Bjorn Helgaas wrote: > On Wed, Jan 28, 2015 at 8:51 AM, Marc Zyngier <marc.zyngier@arm.com> wrote: >> pcibios_update_irq writes an irq number into the config space >> of a given PCI device, but ignores the fact that this number >> is a virtual interrupt number, which might be a very different >> value from what the underlying hardware is using. >> >> The obvious fix is to fetch the HW interrupt number from the >> corresponding irq_data structure. This is slightly complicated >> by the fact that this interrupt might be services by a stacked >> domain. >> >> This has been tested on KVM with kvmtool. >> >> Reported-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com> >> Tested-by: Andre Przywara <andre.przywara@arm.com> >> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com> > > Jiang, are you OK with this patch as-is now, since it isn't used on x86? Sure, I'm OK. I missed the point that it's not used on x86 at all in previous review. > > Marc, Lorenzo, I assume this actually fixes a bug. Can we get any > more details about what happens when you hit the bug, and how you > reproduced it (what platform, driver, etc.)? > > Bjorn > >> --- >> drivers/pci/setup-irq.c | 12 ++++++++++-- >> 1 file changed, 10 insertions(+), 2 deletions(-) >> >> diff --git a/drivers/pci/setup-irq.c b/drivers/pci/setup-irq.c >> index 4e2d595..828cbc9 100644 >> --- a/drivers/pci/setup-irq.c >> +++ b/drivers/pci/setup-irq.c >> @@ -15,11 +15,19 @@ >> #include <linux/errno.h> >> #include <linux/ioport.h> >> #include <linux/cache.h> >> +#include <linux/irq.h> >> >> void __weak pcibios_update_irq(struct pci_dev *dev, int irq) >> { >> - dev_dbg(&dev->dev, "assigning IRQ %02d\n", irq); >> - pci_write_config_byte(dev, PCI_INTERRUPT_LINE, irq); >> + struct irq_data *d; >> + >> + d = irq_get_irq_data(irq); >> +#ifdef CONFIG_IRQ_DOMAIN_HIERARCHY >> + while (d->parent_data) >> + d = d->parent_data; >> +#endif >> + dev_dbg(&dev->dev, "assigning IRQ %02ld\n", d->hwirq); >> + pci_write_config_byte(dev, PCI_INTERRUPT_LINE, d->hwirq); >> } >> >> static void pdev_fixup_irq(struct pci_dev *dev, >> -- >> 2.1.4 >> -- To unsubscribe from this list: send the line "unsubscribe linux-pci" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On 28/01/15 15:43, Bjorn Helgaas wrote: > On Wed, Jan 28, 2015 at 9:27 AM, Marc Zyngier <marc.zyngier@arm.com> wrote: >> Hi Gerry, >> >> On 28/01/15 15:21, Jiang Liu wrote: >>> >>> >>> On 2015/1/28 22:51, Marc Zyngier wrote: >>>> pcibios_update_irq writes an irq number into the config space >>>> of a given PCI device, but ignores the fact that this number >>>> is a virtual interrupt number, which might be a very different >>>> value from what the underlying hardware is using. >>>> >>>> The obvious fix is to fetch the HW interrupt number from the >>>> corresponding irq_data structure. This is slightly complicated >>>> by the fact that this interrupt might be services by a stacked >>>> domain. >>>> >>>> This has been tested on KVM with kvmtool. >>>> >>>> Reported-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com> >>>> Tested-by: Andre Przywara <andre.przywara@arm.com> >>>> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com> >>>> --- >>>> drivers/pci/setup-irq.c | 12 ++++++++++-- >>>> 1 file changed, 10 insertions(+), 2 deletions(-) >>>> >>>> diff --git a/drivers/pci/setup-irq.c b/drivers/pci/setup-irq.c >>>> index 4e2d595..828cbc9 100644 >>>> --- a/drivers/pci/setup-irq.c >>>> +++ b/drivers/pci/setup-irq.c >>>> @@ -15,11 +15,19 @@ >>>> #include <linux/errno.h> >>>> #include <linux/ioport.h> >>>> #include <linux/cache.h> >>>> +#include <linux/irq.h> >>>> >>>> void __weak pcibios_update_irq(struct pci_dev *dev, int irq) >>>> { >>>> - dev_dbg(&dev->dev, "assigning IRQ %02d\n", irq); >>>> - pci_write_config_byte(dev, PCI_INTERRUPT_LINE, irq); >>>> + struct irq_data *d; >>>> + >>>> + d = irq_get_irq_data(irq); >>>> +#ifdef CONFIG_IRQ_DOMAIN_HIERARCHY >>>> + while (d->parent_data) >>>> + d = d->parent_data; >>>> +#endif >>>> + dev_dbg(&dev->dev, "assigning IRQ %02ld\n", d->hwirq); >>>> + pci_write_config_byte(dev, PCI_INTERRUPT_LINE, d->hwirq); >>>> } >>> Hi Mark, >>> Instead of modifying the common version, how about >>> implementing an arch specific version? Arch may have different >>> way to determine the irq number. Above implementation doesn't >>> work with x86, for example. >> >> If you look at the Makefile, this file is used on: >> >> obj-$(CONFIG_ALPHA) += setup-irq.o >> obj-$(CONFIG_ARM) += setup-irq.o >> obj-$(CONFIG_UNICORE32) += setup-irq.o >> obj-$(CONFIG_SUPERH) += setup-irq.o >> obj-$(CONFIG_MIPS) += setup-irq.o >> obj-$(CONFIG_TILE) += setup-irq.o >> obj-$(CONFIG_SPARC_LEON) += setup-irq.o >> obj-$(CONFIG_M68K) += setup-irq.o >> >> x86 doesn't use that at all. > > Since you're looking at this, Marc, do you see a nice way to get rid > of these arch dependencies in the Makefile and unify this a bit? We > still have this pci_fixup_irqs() ugliness -- it's not really > arch-specific at all, but it's called from arch code, and it uses > for_each_pci_dev(), which obviously only works for things present at > boot and not for things hot-added later. I can have a look at this in the next cycle - I'm a bit strapped for time just now. As for for_each_pci_dev(), I'm not completely clear about what it should be replaced for. Do we have some form of notifier for this? Thanks, M.
On Mon, Feb 2, 2015 at 10:15 AM, Marc Zyngier <marc.zyngier@arm.com> wrote: > On 28/01/15 15:43, Bjorn Helgaas wrote: >> On Wed, Jan 28, 2015 at 9:27 AM, Marc Zyngier <marc.zyngier@arm.com> wrote: >>> Hi Gerry, >>> >>> On 28/01/15 15:21, Jiang Liu wrote: >>>> >>>> >>>> On 2015/1/28 22:51, Marc Zyngier wrote: >>>>> pcibios_update_irq writes an irq number into the config space >>>>> of a given PCI device, but ignores the fact that this number >>>>> is a virtual interrupt number, which might be a very different >>>>> value from what the underlying hardware is using. >>>>> >>>>> The obvious fix is to fetch the HW interrupt number from the >>>>> corresponding irq_data structure. This is slightly complicated >>>>> by the fact that this interrupt might be services by a stacked >>>>> domain. >>>>> >>>>> This has been tested on KVM with kvmtool. >>>>> >>>>> Reported-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com> >>>>> Tested-by: Andre Przywara <andre.przywara@arm.com> >>>>> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com> >>>>> --- >>>>> drivers/pci/setup-irq.c | 12 ++++++++++-- >>>>> 1 file changed, 10 insertions(+), 2 deletions(-) >>>>> >>>>> diff --git a/drivers/pci/setup-irq.c b/drivers/pci/setup-irq.c >>>>> index 4e2d595..828cbc9 100644 >>>>> --- a/drivers/pci/setup-irq.c >>>>> +++ b/drivers/pci/setup-irq.c >>>>> @@ -15,11 +15,19 @@ >>>>> #include <linux/errno.h> >>>>> #include <linux/ioport.h> >>>>> #include <linux/cache.h> >>>>> +#include <linux/irq.h> >>>>> >>>>> void __weak pcibios_update_irq(struct pci_dev *dev, int irq) >>>>> { >>>>> - dev_dbg(&dev->dev, "assigning IRQ %02d\n", irq); >>>>> - pci_write_config_byte(dev, PCI_INTERRUPT_LINE, irq); >>>>> + struct irq_data *d; >>>>> + >>>>> + d = irq_get_irq_data(irq); >>>>> +#ifdef CONFIG_IRQ_DOMAIN_HIERARCHY >>>>> + while (d->parent_data) >>>>> + d = d->parent_data; >>>>> +#endif >>>>> + dev_dbg(&dev->dev, "assigning IRQ %02ld\n", d->hwirq); >>>>> + pci_write_config_byte(dev, PCI_INTERRUPT_LINE, d->hwirq); >>>>> } >>>> Hi Mark, >>>> Instead of modifying the common version, how about >>>> implementing an arch specific version? Arch may have different >>>> way to determine the irq number. Above implementation doesn't >>>> work with x86, for example. >>> >>> If you look at the Makefile, this file is used on: >>> >>> obj-$(CONFIG_ALPHA) += setup-irq.o >>> obj-$(CONFIG_ARM) += setup-irq.o >>> obj-$(CONFIG_UNICORE32) += setup-irq.o >>> obj-$(CONFIG_SUPERH) += setup-irq.o >>> obj-$(CONFIG_MIPS) += setup-irq.o >>> obj-$(CONFIG_TILE) += setup-irq.o >>> obj-$(CONFIG_SPARC_LEON) += setup-irq.o >>> obj-$(CONFIG_M68K) += setup-irq.o >>> >>> x86 doesn't use that at all. >> >> Since you're looking at this, Marc, do you see a nice way to get rid >> of these arch dependencies in the Makefile and unify this a bit? We >> still have this pci_fixup_irqs() ugliness -- it's not really >> arch-specific at all, but it's called from arch code, and it uses >> for_each_pci_dev(), which obviously only works for things present at >> boot and not for things hot-added later. > > I can have a look at this in the next cycle - I'm a bit strapped for > time just now. > > As for for_each_pci_dev(), I'm not completely clear about what it should > be replaced for. Do we have some form of notifier for this? I think this should be done somewhere in the enumeration path, e.g., maybe in pci_device_add(). Then we shouldn't need a for_each_pci_dev() loop or a notifier at all. Bjorn -- To unsubscribe from this list: send the line "unsubscribe linux-pci" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On 02/02/15 15:57, Bjorn Helgaas wrote: > On Wed, Jan 28, 2015 at 8:51 AM, Marc Zyngier <marc.zyngier@arm.com> wrote: >> pcibios_update_irq writes an irq number into the config space >> of a given PCI device, but ignores the fact that this number >> is a virtual interrupt number, which might be a very different >> value from what the underlying hardware is using. >> >> The obvious fix is to fetch the HW interrupt number from the >> corresponding irq_data structure. This is slightly complicated >> by the fact that this interrupt might be services by a stacked >> domain. >> >> This has been tested on KVM with kvmtool. >> >> Reported-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com> >> Tested-by: Andre Przywara <andre.przywara@arm.com> >> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com> > > Jiang, are you OK with this patch as-is now, since it isn't used on x86? > > Marc, Lorenzo, I assume this actually fixes a bug. Can we get any > more details about what happens when you hit the bug, and how you > reproduced it (what platform, driver, etc.)? It definitely fixes a bug. This has been found by running a KVM guest using kvmtool PCI emulation, where the following things happen: - Guest programs a virtual (bogus) interrupt number in the PCI device config space (virtio disk in this case) - kvmtool uses that interrupt number as is, without any other form of validation - Either the injection fails (because the interrupt is out of the range of the virtual interrupt controller) -> virtio PCI device goes dead - or the injection succeeds because this is a valid interrupt number, but signals an unrelated peripheral -> virtio PCI device goes dead. This can be trivially reproduced on any ARM PCI system that requires legacy interrupts (i.e. no MSI support), and that uses a GIC interrupt controller. Doing it in a VM is just much more convenient. Hope this helps, M.
On Wed, Jan 28, 2015 at 02:51:23PM +0000, Marc Zyngier wrote: > void __weak pcibios_update_irq(struct pci_dev *dev, int irq) > { > - dev_dbg(&dev->dev, "assigning IRQ %02d\n", irq); > - pci_write_config_byte(dev, PCI_INTERRUPT_LINE, irq); > + struct irq_data *d; > + > + d = irq_get_irq_data(irq); > +#ifdef CONFIG_IRQ_DOMAIN_HIERARCHY > + while (d->parent_data) > + d = d->parent_data; > +#endif > + dev_dbg(&dev->dev, "assigning IRQ %02ld\n", d->hwirq); > + pci_write_config_byte(dev, PCI_INTERRUPT_LINE, d->hwirq); I'm really not convinced about this being the correct thing to do. Let's take an older ARM system, such as a Footbridge based system with a PCI southbridge. Such a system has IRQs 0-15 as the PCI southbridge ISA interrupts. Then there are four PCI interrupts provided by the on-board Footbridge. Right now, PCI devices are programmed with the OS specific interrupt number - eg: 00:06.1 IDE interface: Contaq Microsystems 82c693 (prog-if 80 [Master]) Flags: medium devsel, IRQ 14 30: 00 00 00 00 00 00 00 00 00 00 00 00 0e 01 00 00 00:06.2 IDE interface: Contaq Microsystems 82c693 (prog-if 00 []) Flags: medium devsel, IRQ 15 30: 00 00 00 00 00 00 00 00 00 00 00 00 0f 02 00 00 00:06.3 USB Controller: Contaq Microsystems 82c693 (prog-if 10 [OHCI]) Flags: medium devsel, IRQ 12 30: 00 00 00 00 00 00 00 00 00 00 00 00 0c 01 00 00 00:07.0 Mass storage controller: Integrated Technology Express, Inc. IT/ITE8212 Dual channel ATA RAID controller (rev 13) Flags: bus master, 66MHz, medium devsel, latency 0, IRQ 24 30: 00 00 02 04 80 00 00 00 00 00 00 00 18 01 08 08 00:08.0 Ethernet controller: 3Com Corporation 3c905C-TX/TX-M [Tornado] (rev 74) Flags: bus master, medium devsel, latency 32, IRQ 22 30: 00 00 06 04 dc 00 00 00 00 00 00 00 16 01 0a 0a 00:09.0 VGA compatible controller: S3 Inc. 86c775/86c785 [Trio 64V2/DX or /GX] (rev 16) (prog-if 00 [VGA controller]) Flags: medium devsel, IRQ 21 30: 00 00 00 0c 00 00 00 00 00 00 00 00 15 01 00 00 What your change would mean is that the IRQs currently being programmed >= 16 would be programmed into with numbers with 16 removed from them. This means that legacy stuff (eg on the Southbridge which really do signal via the ISA IRQ controller) end up using the same number range as those which take PCI specific IRQs. This surely is not sane.
On Wednesday 28 January 2015 14:51:23 Marc Zyngier wrote: > void __weak pcibios_update_irq(struct pci_dev *dev, int irq) > { > - dev_dbg(&dev->dev, "assigning IRQ %02d\n", irq); > - pci_write_config_byte(dev, PCI_INTERRUPT_LINE, irq); > + struct irq_data *d; > + > + d = irq_get_irq_data(irq); > +#ifdef CONFIG_IRQ_DOMAIN_HIERARCHY > + while (d->parent_data) > + d = d->parent_data; > +#endif > + dev_dbg(&dev->dev, "assigning IRQ %02ld\n", d->hwirq); > + pci_write_config_byte(dev, PCI_INTERRUPT_LINE, d->hwirq); > } I'm puzzled by this. Why is it even important what we write into the config space? Isn't this just an interface between BIOS and OS for systems that rely on the interrupt numbers to be statically assigned before boot? Arnd -- To unsubscribe from this list: send the line "unsubscribe linux-pci" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Hi Russell, On 02/02/15 16:33, Russell King - ARM Linux wrote: > On Wed, Jan 28, 2015 at 02:51:23PM +0000, Marc Zyngier wrote: >> void __weak pcibios_update_irq(struct pci_dev *dev, int irq) >> { >> - dev_dbg(&dev->dev, "assigning IRQ %02d\n", irq); >> - pci_write_config_byte(dev, PCI_INTERRUPT_LINE, irq); >> + struct irq_data *d; >> + >> + d = irq_get_irq_data(irq); >> +#ifdef CONFIG_IRQ_DOMAIN_HIERARCHY >> + while (d->parent_data) >> + d = d->parent_data; >> +#endif >> + dev_dbg(&dev->dev, "assigning IRQ %02ld\n", d->hwirq); >> + pci_write_config_byte(dev, PCI_INTERRUPT_LINE, d->hwirq); > > I'm really not convinced about this being the correct thing to do. > > Let's take an older ARM system, such as a Footbridge based system with a > PCI southbridge. > > Such a system has IRQs 0-15 as the PCI southbridge ISA interrupts. Then > there are four PCI interrupts provided by the on-board Footbridge. > > Right now, PCI devices are programmed with the OS specific interrupt > number - eg: > > 00:06.1 IDE interface: Contaq Microsystems 82c693 (prog-if 80 [Master]) > Flags: medium devsel, IRQ 14 > 30: 00 00 00 00 00 00 00 00 00 00 00 00 0e 01 00 00 > > 00:06.2 IDE interface: Contaq Microsystems 82c693 (prog-if 00 []) > Flags: medium devsel, IRQ 15 > 30: 00 00 00 00 00 00 00 00 00 00 00 00 0f 02 00 00 > > 00:06.3 USB Controller: Contaq Microsystems 82c693 (prog-if 10 [OHCI]) > Flags: medium devsel, IRQ 12 > 30: 00 00 00 00 00 00 00 00 00 00 00 00 0c 01 00 00 > > 00:07.0 Mass storage controller: Integrated Technology Express, Inc. IT/ITE8212 > Dual channel ATA RAID controller (rev 13) > Flags: bus master, 66MHz, medium devsel, latency 0, IRQ 24 > 30: 00 00 02 04 80 00 00 00 00 00 00 00 18 01 08 08 > > 00:08.0 Ethernet controller: 3Com Corporation 3c905C-TX/TX-M [Tornado] (rev 74) > Flags: bus master, medium devsel, latency 32, IRQ 22 > 30: 00 00 06 04 dc 00 00 00 00 00 00 00 16 01 0a 0a > > 00:09.0 VGA compatible controller: S3 Inc. 86c775/86c785 [Trio 64V2/DX or /GX] (rev 16) (prog-if 00 [VGA controller]) > Flags: medium devsel, IRQ 21 > 30: 00 00 00 0c 00 00 00 00 00 00 00 00 15 01 00 00 > > What your change would mean is that the IRQs currently being programmed >> = 16 would be programmed into with numbers with 16 removed from them. > This means that legacy stuff (eg on the Southbridge which really do signal > via the ISA IRQ controller) end up using the same number range as those > which take PCI specific IRQs. > > This surely is not sane. I suppose this is ebsa285? I must confess I don't see how to distinguish the two cases (the GIC case uses a purely virtual number, and the footbridge case uses something that seems to be physical). A very easy fix would be to entirely contain this change within #ifdef CONFIG_IRQ_DOMAIN_HIERARCHY/#endif, but that doesn't fill me with confidence. What I don't get is how the hwirq field is set in this case. It probably isn't very useful (as there is no domain lookup), so I would have hoped to see irq == hwirq. Obviously, this is not the case. What am I missing? Thanks, M.
On Mon, Feb 02, 2015 at 06:08:17PM +0000, Marc Zyngier wrote: > Hi Russell, > > On 02/02/15 16:33, Russell King - ARM Linux wrote: > > What your change would mean is that the IRQs currently being programmed > >> = 16 would be programmed into with numbers with 16 removed from them. > > This means that legacy stuff (eg on the Southbridge which really do signal > > via the ISA IRQ controller) end up using the same number range as those > > which take PCI specific IRQs. > > > > This surely is not sane. > > I suppose this is ebsa285? I must confess I don't see how to distinguish > the two cases (the GIC case uses a purely virtual number, and the > footbridge case uses something that seems to be physical). Yep. > A very easy fix would be to entirely contain this change within #ifdef > CONFIG_IRQ_DOMAIN_HIERARCHY/#endif, but that doesn't fill me with > confidence. > > What I don't get is how the hwirq field is set in this case. It probably > isn't very useful (as there is no domain lookup), so I would have hoped > to see irq == hwirq. Obviously, this is not the case. What am I missing? Well, it depends how this register is to be interpreted. The PCI specs say that it's used to communicate the interrupt line routing information from POST to device drivers and operating systems. "The value in this register tells which input of the system interrupt controller(s) the device's interrupt pin is connected to." Note the plural there - which imples that any per-interrupt controller numbering scheme is quite certainly wrong. It also implies that there is a known, shared numberspace between POST and OS implementations on a platform which is used by PCI devices to describe how the PCI interrupts are wired up.
On 02/02/15 17:02, Arnd Bergmann wrote: > On Wednesday 28 January 2015 14:51:23 Marc Zyngier wrote: >> void __weak pcibios_update_irq(struct pci_dev *dev, int irq) >> { >> - dev_dbg(&dev->dev, "assigning IRQ %02d\n", irq); >> - pci_write_config_byte(dev, PCI_INTERRUPT_LINE, irq); >> + struct irq_data *d; >> + >> + d = irq_get_irq_data(irq); >> +#ifdef CONFIG_IRQ_DOMAIN_HIERARCHY >> + while (d->parent_data) >> + d = d->parent_data; >> +#endif >> + dev_dbg(&dev->dev, "assigning IRQ %02ld\n", d->hwirq); >> + pci_write_config_byte(dev, PCI_INTERRUPT_LINE, d->hwirq); >> } > > I'm puzzled by this. Why is it even important what we write into > the config space? Isn't this just an interface between BIOS and > OS for systems that rely on the interrupt numbers to be statically > assigned before boot? That's exactly what I thought until Lorenzo reported kvmtool falling over because of this write. Obviously, some platforms must actually require this (possibly for bridges that are not known by the firmware). Entirely removing that code solves my problem too, but that'd cannot be the right solution... Thanks, M.
On Tuesday 03 February 2015 10:38:25 Marc Zyngier wrote: > > That's exactly what I thought until Lorenzo reported kvmtool falling > over because of this write. Obviously, some platforms must actually > require this (possibly for bridges that are not known by the firmware). This sounds much like a bug in kvmtool. > Entirely removing that code solves my problem too, but that'd cannot be > the right solution... The comment in pdev_fixup_irq() says /* Always tell the device, so the driver knows what is the real IRQ to use; the device does not use it. */ which I read to mean that there are drivers that incorrectly use 'pci_read_config_byte(dev, PCI_INTERRUPT_LINE)' as the number they pass into request_irq, rather than using dev->irq. However, this means that your patch is actually wrong, because what the driver cares about is the virtual irq number (which request_irq expects), not the number relative to some interrupt controller. There is another comment in include/linux/pci.h that states /* * Instead of touching interrupt line and base address registers * directly, use the values stored here. They might be different! */ unsigned int irq; so apparently this has been a cause for problems in the past, and drivers that rely on the number are already broken. I also checked ancient kernel versions from the 2.1 days when the code was first added. And indeed at the time drivers used to read the word, but now none of them use the number for anything real, they were all fixed during linux-2.2 at the latest. Arnd -- To unsubscribe from this list: send the line "unsubscribe linux-pci" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On 03/02/15 11:31, Arnd Bergmann wrote: > On Tuesday 03 February 2015 10:38:25 Marc Zyngier wrote: >> >> That's exactly what I thought until Lorenzo reported kvmtool falling >> over because of this write. Obviously, some platforms must actually >> require this (possibly for bridges that are not known by the firmware). > > This sounds much like a bug in kvmtool. Lorenzo and I just came to a similar conclusion, given that the HW should never use that information. >> Entirely removing that code solves my problem too, but that'd cannot be >> the right solution... > > The comment in pdev_fixup_irq() says > > /* Always tell the device, so the driver knows what is > the real IRQ to use; the device does not use it. */ > > which I read to mean that there are drivers that incorrectly use > 'pci_read_config_byte(dev, PCI_INTERRUPT_LINE)' as the number > they pass into request_irq, rather than using dev->irq. > However, this means that your patch is actually wrong, because > what the driver cares about is the virtual irq number (which > request_irq expects), not the number relative to some interrupt > controller. Yes, I now realise that. That makes a lot more sense actually, because I was getting very confused about how the HW should interpret that number. Side question: In the probe-only case, should we still allow this write to happen? Thanks, M.
On Tuesday 03 February 2015 11:37:30 Marc Zyngier wrote: > Side question: In the probe-only case, should we still allow this write > to happen? No, my understanding is that PCI_PROBE_ONLY precisely means that we do not modify the config space and instead trust what is there to be sensible. Arnd -- To unsubscribe from this list: send the line "unsubscribe linux-pci" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
diff --git a/drivers/pci/setup-irq.c b/drivers/pci/setup-irq.c index 4e2d595..828cbc9 100644 --- a/drivers/pci/setup-irq.c +++ b/drivers/pci/setup-irq.c @@ -15,11 +15,19 @@ #include <linux/errno.h> #include <linux/ioport.h> #include <linux/cache.h> +#include <linux/irq.h> void __weak pcibios_update_irq(struct pci_dev *dev, int irq) { - dev_dbg(&dev->dev, "assigning IRQ %02d\n", irq); - pci_write_config_byte(dev, PCI_INTERRUPT_LINE, irq); + struct irq_data *d; + + d = irq_get_irq_data(irq); +#ifdef CONFIG_IRQ_DOMAIN_HIERARCHY + while (d->parent_data) + d = d->parent_data; +#endif + dev_dbg(&dev->dev, "assigning IRQ %02ld\n", d->hwirq); + pci_write_config_byte(dev, PCI_INTERRUPT_LINE, d->hwirq); } static void pdev_fixup_irq(struct pci_dev *dev,