Message ID | 20221121140048.659849460@linutronix.de (mailing list archive) |
---|---|
State | Handled Elsewhere |
Headers | show |
Series | genirq, irqchip: Convert ARM MSI handling to per device MSI domains | expand |
On Mon, 21 Nov 2022 14:39:36 +0000, Thomas Gleixner <tglx@linutronix.de> wrote: > > Most ARM(64) PCI/MSI domains mask and unmask in the parent domain after or > before the PCI mask/unmask operation takes place. So there are more than a > dozen of the same wrapper implementation all over the place. > > Don't make the same mistake with the new per device PCI/MSI domains and > provide a static key which lets the domain implementation enable this > sequence in the PCI/MSI code. > > Signed-off-by: Thomas Gleixner <tglx@linutronix.de> > Cc: Bjorn Helgaas <bhelgaas@google.com> > --- > drivers/pci/msi/irqdomain.c | 30 ++++++++++++++++++++++++++++++ > include/linux/msi.h | 2 ++ > 2 files changed, 32 insertions(+) > > --- a/drivers/pci/msi/irqdomain.c > +++ b/drivers/pci/msi/irqdomain.c > @@ -148,17 +148,45 @@ static void pci_device_domain_set_desc(m > arg->hwirq = desc->msi_index; > } > > +static DEFINE_STATIC_KEY_FALSE(pci_msi_mask_unmask_parent); > + > +/** > + * pci_device_msi_mask_unmask_parent_enable - Enable propagation of mask/unmask > + * to the parent interrupt chip > + * > + * For MSI parent interrupt domains which want to mask at the parent interrupt > + * chip too. > + */ > +void pci_device_msi_mask_unmask_parent_enable(void) > +{ > + static_branch_enable(&pci_msi_mask_unmask_parent); > +} > + > +static __always_inline void cond_mask_parent(struct irq_data *data) > +{ > + if (static_branch_unlikely(&pci_msi_mask_unmask_parent)) > + irq_chip_mask_parent(data); > +} > + > +static __always_inline void cond_unmask_parent(struct irq_data *data) > +{ > + if (static_branch_unlikely(&pci_msi_mask_unmask_parent)) > + irq_chip_unmask_parent(data); > +} > + > static void pci_mask_msi(struct irq_data *data) > { > struct msi_desc *desc = irq_data_get_msi_desc(data); > > pci_msi_mask(desc, BIT(data->irq - desc->irq)); > + cond_mask_parent(data); I find this a bit odd. If anything, I'd rather drop the masking at the PCI level and keep it local to the interrupt controller, because this is likely to be more universal than the equivalent PCI operation (think multi-MSI, for example, which cannot masks individual MSIs). Another thing is that the static key is a global state. Nothing says that masking one way or the other is a universal thing, specially when you have multiple interrupt controllers dealing with MSIs in different ways. For example, GICv3 can use both the ITS and the GICv3-MBI frame at the same time for different PCI RC. OK, they happen to deal with MSIs in the same way, but you hopefully get my point. Thanks, M.
On Thu, Nov 24 2022 at 13:04, Marc Zyngier wrote: > On Mon, 21 Nov 2022 14:39:36 +0000, >> static void pci_mask_msi(struct irq_data *data) >> { >> struct msi_desc *desc = irq_data_get_msi_desc(data); >> >> pci_msi_mask(desc, BIT(data->irq - desc->irq)); >> + cond_mask_parent(data); > > I find this a bit odd. If anything, I'd rather drop the masking at the > PCI level and keep it local to the interrupt controller, because this > is likely to be more universal than the equivalent PCI operation > (think multi-MSI, for example, which cannot masks individual MSIs). > > Another thing is that the static key is a global state. Nothing says > that masking one way or the other is a universal thing, specially when > you have multiple interrupt controllers dealing with MSIs in different > ways. For example, GICv3 can use both the ITS and the GICv3-MBI frame > at the same time for different PCI RC. OK, they happen to deal with > MSIs in the same way, but you hopefully get my point. I'm fine with dropping that. I did this because basically all of the various ARM PCI/MSI domain implementation have a copy of the same functions. Some of them have pointlessly the wrong order because copy & pasta is so wonderful.... So the alternative solution is to provide _ONE_ set of correct callbacks and let the domain initialization code override the irq chip callbacks of the default PCI/MSI template. Thanks, tglx
On Thu, 24 Nov 2022 13:17:00 +0000, Thomas Gleixner <tglx@linutronix.de> wrote: > > On Thu, Nov 24 2022 at 13:04, Marc Zyngier wrote: > > On Mon, 21 Nov 2022 14:39:36 +0000, > >> static void pci_mask_msi(struct irq_data *data) > >> { > >> struct msi_desc *desc = irq_data_get_msi_desc(data); > >> > >> pci_msi_mask(desc, BIT(data->irq - desc->irq)); > >> + cond_mask_parent(data); > > > > I find this a bit odd. If anything, I'd rather drop the masking at the > > PCI level and keep it local to the interrupt controller, because this > > is likely to be more universal than the equivalent PCI operation > > (think multi-MSI, for example, which cannot masks individual MSIs). > > > > Another thing is that the static key is a global state. Nothing says > > that masking one way or the other is a universal thing, specially when > > you have multiple interrupt controllers dealing with MSIs in different > > ways. For example, GICv3 can use both the ITS and the GICv3-MBI frame > > at the same time for different PCI RC. OK, they happen to deal with > > MSIs in the same way, but you hopefully get my point. > > I'm fine with dropping that. I did this because basically all of the > various ARM PCI/MSI domain implementation have a copy of the same > functions. Some of them have pointlessly the wrong order because copy & > pasta is so wonderful.... > > So the alternative solution is to provide _ONE_ set of correct callbacks > and let the domain initialization code override the irq chip callbacks > of the default PCI/MSI template. If the various irqchips can tell the core code whether they want things to be masked at the PCI level or at the irqchip level, this would be a move in the right direction. For the GIC, I'd definitely want things masked locally. What I'd like to get rid off is the double masking, as I agree it is on the "pretty dumb" side of things. Thanks, M.
On Thu, Nov 24 2022 at 13:38, Marc Zyngier wrote: > On Thu, 24 Nov 2022 13:17:00 +0000, > Thomas Gleixner <tglx@linutronix.de> wrote: >> > I find this a bit odd. If anything, I'd rather drop the masking at the >> > PCI level and keep it local to the interrupt controller, because this >> > is likely to be more universal than the equivalent PCI operation >> > (think multi-MSI, for example, which cannot masks individual MSIs). >> > >> > Another thing is that the static key is a global state. Nothing says >> > that masking one way or the other is a universal thing, specially when >> > you have multiple interrupt controllers dealing with MSIs in different >> > ways. For example, GICv3 can use both the ITS and the GICv3-MBI frame >> > at the same time for different PCI RC. OK, they happen to deal with >> > MSIs in the same way, but you hopefully get my point. >> >> I'm fine with dropping that. I did this because basically all of the >> various ARM PCI/MSI domain implementation have a copy of the same >> functions. Some of them have pointlessly the wrong order because copy & >> pasta is so wonderful.... >> >> So the alternative solution is to provide _ONE_ set of correct callbacks >> and let the domain initialization code override the irq chip callbacks >> of the default PCI/MSI template. > > If the various irqchips can tell the core code whether they want > things to be masked at the PCI level or at the irqchip level, this > would be a move in the right direction. For the GIC, I'd definitely > want things masked locally. > > What I'd like to get rid off is the double masking, as I agree it is > on the "pretty dumb" side of things. Not necessarily. It mitigates the problem of MSI interrupts which can't be masked because the implementers decided to spare the gates. MSI allows that as masking is opt-in... Let me think about it. Thanks, tglx
On Fri, Nov 25 2022 at 01:11, Thomas Gleixner wrote: > On Thu, Nov 24 2022 at 13:38, Marc Zyngier wrote: >> On Thu, 24 Nov 2022 13:17:00 +0000, >> Thomas Gleixner <tglx@linutronix.de> wrote: >>> > I find this a bit odd. If anything, I'd rather drop the masking at the >>> > PCI level and keep it local to the interrupt controller, because this >>> > is likely to be more universal than the equivalent PCI operation >>> > (think multi-MSI, for example, which cannot masks individual MSIs). >>> > >>> > Another thing is that the static key is a global state. Nothing says >>> > that masking one way or the other is a universal thing, specially when >>> > you have multiple interrupt controllers dealing with MSIs in different >>> > ways. For example, GICv3 can use both the ITS and the GICv3-MBI frame >>> > at the same time for different PCI RC. OK, they happen to deal with >>> > MSIs in the same way, but you hopefully get my point. >>> >>> I'm fine with dropping that. I did this because basically all of the >>> various ARM PCI/MSI domain implementation have a copy of the same >>> functions. Some of them have pointlessly the wrong order because copy & >>> pasta is so wonderful.... >>> >>> So the alternative solution is to provide _ONE_ set of correct callbacks >>> and let the domain initialization code override the irq chip callbacks >>> of the default PCI/MSI template. >> >> If the various irqchips can tell the core code whether they want >> things to be masked at the PCI level or at the irqchip level, this >> would be a move in the right direction. For the GIC, I'd definitely >> want things masked locally. >> >> What I'd like to get rid off is the double masking, as I agree it is >> on the "pretty dumb" side of things. > > Not necessarily. It mitigates the problem of MSI interrupts which can't > be masked because the implementers decided to spare the gates. MSI > allows that as masking is opt-in... > > Let me think about it. That really took a while to think about it :) We have the following cases on the PCI/MSI side: 1) The MSI[X] entry can be masked 2) The MSI[X] entry cannot be masked because hardware did not implement it, masking is globally disabled due to XEN, masking does not exist for this horrible virtual MSI hackery Now you said: "For the GIC, I'd definitely want things masked locally." I decoded this, that you want to have these interrupts masked at the GIC level too independent of #1 or #2 above. And then: "What I'd like to get rid off is the double masking." But relying on the GIC alone is not really a good thing IMO. There is no point to let some confused device send unwanted MSI messages around without a way to shut it up from the generic code via the regular mask/unmask callbacks. On the other hand for PCI/MSI[x] the mask/unmask operations are not in the hot path as PCI/MSI[x] are strictly edge. Mask/unmask is only happening on startup, shutdown and when an interrupt arrives after disable_irq() incremented the lazy disable counter. For regular interrupt handling mask/unmask is not involved. So to avoid that global key we can let the parent domain set a new flag, e.g. MSI_FLAG_PCI_MSI_MASK_PARENT, in msi_parent_ops::supported_flags and let the PCI/MSI core code query that information when the per device domain is created and select the appropriate template or fixup the callbacks after the domain is created. Does that address your concerns? Thanks, tglx
On Mon, 22 May 2023 15:19:39 +0100, Thomas Gleixner <tglx@linutronix.de> wrote: > > On Fri, Nov 25 2022 at 01:11, Thomas Gleixner wrote: > > On Thu, Nov 24 2022 at 13:38, Marc Zyngier wrote: > >> On Thu, 24 Nov 2022 13:17:00 +0000, > >> Thomas Gleixner <tglx@linutronix.de> wrote: > >>> > I find this a bit odd. If anything, I'd rather drop the masking at the > >>> > PCI level and keep it local to the interrupt controller, because this > >>> > is likely to be more universal than the equivalent PCI operation > >>> > (think multi-MSI, for example, which cannot masks individual MSIs). > >>> > > >>> > Another thing is that the static key is a global state. Nothing says > >>> > that masking one way or the other is a universal thing, specially when > >>> > you have multiple interrupt controllers dealing with MSIs in different > >>> > ways. For example, GICv3 can use both the ITS and the GICv3-MBI frame > >>> > at the same time for different PCI RC. OK, they happen to deal with > >>> > MSIs in the same way, but you hopefully get my point. > >>> > >>> I'm fine with dropping that. I did this because basically all of the > >>> various ARM PCI/MSI domain implementation have a copy of the same > >>> functions. Some of them have pointlessly the wrong order because copy & > >>> pasta is so wonderful.... > >>> > >>> So the alternative solution is to provide _ONE_ set of correct callbacks > >>> and let the domain initialization code override the irq chip callbacks > >>> of the default PCI/MSI template. > >> > >> If the various irqchips can tell the core code whether they want > >> things to be masked at the PCI level or at the irqchip level, this > >> would be a move in the right direction. For the GIC, I'd definitely > >> want things masked locally. > >> > >> What I'd like to get rid off is the double masking, as I agree it is > >> on the "pretty dumb" side of things. > > > > Not necessarily. It mitigates the problem of MSI interrupts which can't > > be masked because the implementers decided to spare the gates. MSI > > allows that as masking is opt-in... > > > > Let me think about it. > > That really took a while to think about it :) > > We have the following cases on the PCI/MSI side: > > 1) The MSI[X] entry can be masked > > 2) The MSI[X] entry cannot be masked because hardware did not implement > it, masking is globally disabled due to XEN, masking does not exist > for this horrible virtual MSI hackery And as a bonus the case of non-PCI MSIs, which are definitely a thing, and I'd like them to fit in the same model (because life is too short to do anything else). As for the Xen side, I hope to never have to care about it for the architecture I care about (I've long proclaimed Xen/arm64 dead and buried). > > Now you said: > > "For the GIC, I'd definitely want things masked locally." > > I decoded this, that you want to have these interrupts masked at the GIC > level too independent of #1 or #2 above. And then: > > "What I'd like to get rid off is the double masking." > > But relying on the GIC alone is not really a good thing IMO. There is no > point to let some confused device send unwanted MSI messages around > without a way to shut it up from the generic code via the regular > mask/unmask callbacks. I have a slightly different view of the problem. The device masking is somehow orthogonal with the masking at the GIC level: - can the interrupt be generated: this is a device property - can the interrupt be signalled: this is an interrupt controller property In a way, this is no different from your basic device, such as a timer: you need both the interrupt generation to be enabled at the timer level, and the interrupt signalling to be enabled (unmasked) at the irqchip level. Today, we conflate the two, because we have either: - devices that cannot selectively mask interrupts - interrupt controllers that are limited in what they can mask and this results in the terrible pattern that's all over the GIC-related stuff. > On the other hand for PCI/MSI[x] the mask/unmask operations are not in > the hot path as PCI/MSI[x] are strictly edge. Mask/unmask is only > happening on startup, shutdown and when an interrupt arrives after > disable_irq() incremented the lazy disable counter. > > For regular interrupt handling mask/unmask is not involved. > > So to avoid that global key we can let the parent domain set a new flag, > e.g. MSI_FLAG_PCI_MSI_MASK_PARENT, in msi_parent_ops::supported_flags > and let the PCI/MSI core code query that information when the per device > domain is created and select the appropriate template or fixup the > callbacks after the domain is created. > > Does that address your concerns? It does to a certain extent. But what I'd really like is that in the most common case where the interrupt controller is capable of masking MSIs, the PCI/MSI *enabling* becomes the responsibility of the PCI core code and not the IRQ code. The IRQ code should ideally only be concerned with the masking of the interrupt at the irqchip level, and not beyond that. And that'd solve the Xen problem by merely ignoring it. If we have HW out there that cannot mask MSIs at the interrupt controller level, then we'd have to fallback to device-side masking, which doesn't really work in general (MultiMSI being my favourite example). My gut feeling is that this is rare, but I'm pretty sure it exists. Thanks, M.
On Tue, May 23 2023 at 11:25, Marc Zyngier wrote: > On Mon, 22 May 2023 15:19:39 +0100, > Thomas Gleixner <tglx@linutronix.de> wrote: >> On the other hand for PCI/MSI[x] the mask/unmask operations are not in >> the hot path as PCI/MSI[x] are strictly edge. Mask/unmask is only >> happening on startup, shutdown and when an interrupt arrives after >> disable_irq() incremented the lazy disable counter. >> >> For regular interrupt handling mask/unmask is not involved. >> >> So to avoid that global key we can let the parent domain set a new flag, >> e.g. MSI_FLAG_PCI_MSI_MASK_PARENT, in msi_parent_ops::supported_flags >> and let the PCI/MSI core code query that information when the per device >> domain is created and select the appropriate template or fixup the >> callbacks after the domain is created. >> >> Does that address your concerns? > > It does to a certain extent. > > But what I'd really like is that in the most common case where the > interrupt controller is capable of masking MSIs, the PCI/MSI > *enabling* becomes the responsibility of the PCI core code and not the > IRQ code. > > The IRQ code should ideally only be concerned with the masking of the > interrupt at the irqchip level, and not beyond that. And that'd solve > the Xen problem by merely ignoring it. > > If we have HW out there that cannot mask MSIs at the interrupt > controller level, then we'd have to fallback to device-side masking, > which doesn't really work in general (MultiMSI being my favourite > example). My gut feeling is that this is rare, but I'm pretty sure it > exists. Sure. There are 3 parts involved: [Device]--->[PCI/MSI]---->[GIC] irqchip irqchip Controlling the interrupt machinery in the device happens at the device driver level and is conceptually independent of the interrupt manangement code. The device driver has no access to the PCI/MSI irqchip and all it can do is to enable/disable the source of the interrupt in the device. For the interrupt management code the job is to ensure that an interrupt can be prevented from disrupting the OS operation independent of the device driver correctness. As a matter of fact we know that PCI/MSI masking ranges from not possible over flaky to properly working. So we can't reliably prevent that a rougue device spams the PCIe bus with messages. Which means that we should utilize the fact that the next interrupt chip in the hierarchy can mask reliably. I wish I could disable individual vectors at the local APIC level on x86... Now the question is whether we want to make this conditional depending on what the PCI/MSI[X] hardware advertises or just keep it simple and do it unconditionally. Thanks, tglx
On Tue, 23 May 2023 14:05:56 +0100, Thomas Gleixner <tglx@linutronix.de> wrote: > > On Tue, May 23 2023 at 11:25, Marc Zyngier wrote: > > On Mon, 22 May 2023 15:19:39 +0100, > > Thomas Gleixner <tglx@linutronix.de> wrote: > >> On the other hand for PCI/MSI[x] the mask/unmask operations are not in > >> the hot path as PCI/MSI[x] are strictly edge. Mask/unmask is only > >> happening on startup, shutdown and when an interrupt arrives after > >> disable_irq() incremented the lazy disable counter. > >> > >> For regular interrupt handling mask/unmask is not involved. > >> > >> So to avoid that global key we can let the parent domain set a new flag, > >> e.g. MSI_FLAG_PCI_MSI_MASK_PARENT, in msi_parent_ops::supported_flags > >> and let the PCI/MSI core code query that information when the per device > >> domain is created and select the appropriate template or fixup the > >> callbacks after the domain is created. > >> > >> Does that address your concerns? > > > > It does to a certain extent. > > > > But what I'd really like is that in the most common case where the > > interrupt controller is capable of masking MSIs, the PCI/MSI > > *enabling* becomes the responsibility of the PCI core code and not the > > IRQ code. > > > > The IRQ code should ideally only be concerned with the masking of the > > interrupt at the irqchip level, and not beyond that. And that'd solve > > the Xen problem by merely ignoring it. > > > > If we have HW out there that cannot mask MSIs at the interrupt > > controller level, then we'd have to fallback to device-side masking, > > which doesn't really work in general (MultiMSI being my favourite > > example). My gut feeling is that this is rare, but I'm pretty sure it > > exists. > > Sure. There are 3 parts involved: > > [Device]--->[PCI/MSI]---->[GIC] > irqchip irqchip > > Controlling the interrupt machinery in the device happens at the device > driver level and is conceptually independent of the interrupt > manangement code. The device driver has no access to the PCI/MSI irqchip > and all it can do is to enable/disable the source of the interrupt in > the device. > > For the interrupt management code the job is to ensure that an interrupt > can be prevented from disrupting the OS operation independent of the > device driver correctness. > > As a matter of fact we know that PCI/MSI masking ranges from not > possible over flaky to properly working. So we can't reliably prevent > that a rougue device spams the PCIe bus with messages. > > Which means that we should utilize the fact that the next interrupt chip > in the hierarchy can mask reliably. I wish I could disable individual > vectors at the local APIC level on x86... > > Now the question is whether we want to make this conditional depending > on what the PCI/MSI[X] hardware advertises or just keep it simple and do > it unconditionally. I think this should be unconditional if the root irqchip (the GIC in this instance) is capable of it. So a suggestion where the root irqchip exposes its masking capability, which upon detection by the upper layer (whateverbusyouwant/MSI) makes it stop playing with its own device-level mask has my full support (and now breathe normally). Thanks, M.
--- a/drivers/pci/msi/irqdomain.c +++ b/drivers/pci/msi/irqdomain.c @@ -148,17 +148,45 @@ static void pci_device_domain_set_desc(m arg->hwirq = desc->msi_index; } +static DEFINE_STATIC_KEY_FALSE(pci_msi_mask_unmask_parent); + +/** + * pci_device_msi_mask_unmask_parent_enable - Enable propagation of mask/unmask + * to the parent interrupt chip + * + * For MSI parent interrupt domains which want to mask at the parent interrupt + * chip too. + */ +void pci_device_msi_mask_unmask_parent_enable(void) +{ + static_branch_enable(&pci_msi_mask_unmask_parent); +} + +static __always_inline void cond_mask_parent(struct irq_data *data) +{ + if (static_branch_unlikely(&pci_msi_mask_unmask_parent)) + irq_chip_mask_parent(data); +} + +static __always_inline void cond_unmask_parent(struct irq_data *data) +{ + if (static_branch_unlikely(&pci_msi_mask_unmask_parent)) + irq_chip_unmask_parent(data); +} + static void pci_mask_msi(struct irq_data *data) { struct msi_desc *desc = irq_data_get_msi_desc(data); pci_msi_mask(desc, BIT(data->irq - desc->irq)); + cond_mask_parent(data); } static void pci_unmask_msi(struct irq_data *data) { struct msi_desc *desc = irq_data_get_msi_desc(data); + cond_unmask_parent(data); pci_msi_unmask(desc, BIT(data->irq - desc->irq)); } @@ -195,10 +223,12 @@ static struct msi_domain_template pci_ms static void pci_mask_msix(struct irq_data *data) { pci_msix_mask(irq_data_get_msi_desc(data)); + cond_mask_parent(data); } static void pci_unmask_msix(struct irq_data *data) { + cond_unmask_parent(data); pci_msix_unmask(irq_data_get_msi_desc(data)); } --- a/include/linux/msi.h +++ b/include/linux/msi.h @@ -653,12 +653,14 @@ struct irq_domain *pci_msi_create_irq_do struct irq_domain *parent); u32 pci_msi_domain_get_msi_rid(struct irq_domain *domain, struct pci_dev *pdev); struct irq_domain *pci_msi_get_device_domain(struct pci_dev *pdev); +void pci_device_msi_mask_unmask_parent_enable(void); #else /* CONFIG_PCI_MSI */ static inline struct irq_domain *pci_msi_get_device_domain(struct pci_dev *pdev) { return NULL; } static inline void pci_write_msi_msg(unsigned int irq, struct msi_msg *msg) { } +static inline void pci_device_msi_mask_unmask_parent_enable(void) { } #endif /* !CONFIG_PCI_MSI */ #endif /* LINUX_MSI_H */
Most ARM(64) PCI/MSI domains mask and unmask in the parent domain after or before the PCI mask/unmask operation takes place. So there are more than a dozen of the same wrapper implementation all over the place. Don't make the same mistake with the new per device PCI/MSI domains and provide a static key which lets the domain implementation enable this sequence in the PCI/MSI code. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: Bjorn Helgaas <bhelgaas@google.com> --- drivers/pci/msi/irqdomain.c | 30 ++++++++++++++++++++++++++++++ include/linux/msi.h | 2 ++ 2 files changed, 32 insertions(+)