Message ID | 20250205151635.v2.1.Id60295bee6aacf44aa3664e702012cb4710529c3@changeid (mailing list archive) |
---|---|
State | Changes Requested |
Delegated to: | Krzysztof WilczyĆski |
Headers | show |
Series | [v2] PCI: dwc: Use level-triggered handler for MSI IRQs | expand |
On Wed, 05 Feb 2025 23:16:36 +0000, Brian Norris <briannorris@chromium.org> wrote: > > From: Brian Norris <briannorris@google.com> > > Per Synopsis's documentation [1], the msi_ctrl_int signal is > level-triggered, not edge-triggered. > > The use of handle_edge_trigger() was chosen in commit 7c5925afbc58 > ("PCI: dwc: Move MSI IRQs allocation to IRQ domains hierarchical API"), > which actually dropped preexisting use of handle_level_trigger(). > Looking at the patch history, this change was only made by request: > > Subject: Re: [PATCH v6 1/9] PCI: dwc: Add IRQ chained API support > https://lore.kernel.org/all/04d3d5b6-9199-218d-476f-c77d04b8d2e7@arm.com/ > > "Are you sure about this "handle_level_irq"? MSIs are definitely edge > triggered, not level." > > However, while the underlying MSI protocol is edge-triggered in a sense, > the DesignWare IP is actually level-triggered. You are confusing two things: - MSIs are edge triggered. No ifs, no buts. That's because you can't "unwrite" something. Even the so-called level-triggered MSIs are build on a pair of edges (one up, one down). - The DisgustWare IP multiplexes MSIs onto a single interrupt, and *latches* them, presenting a level sensitive signal *for the latch*. Not for the MSIs themselves. > > So, let's switch back to level-triggered. > > In many cases, the distinction doesn't really matter here, because this > signal is hidden behind another (chained) top-level IRQ which can be > masked on its own. But it's still wise to manipulate this interrupt line > according to its actual specification -- specifically, to mask it while > we handle it. The distinction absolutely matters, because you are not dealing with the actual MSIs, but with a latch. > > [1] From: > > DesignWare Cores PCI Express RP Controller Reference Manual > Version 6.00a, June 2022 > Section 2.89 MSI Interface Signals > > msi_ctrl_int is described as: > > "Asserted when an MSI interrupt is pending. De-asserted when there is > no MSI interrupt pending. > ... > Active State: High (level)" > > It also points at the databook for more info. One relevant excerpt from > the databook: > > DesignWare Cores PCI Express Controller Databook > Version 6.00a, June 2022 > Section 3.9.2.3 iMSI-RX: Integrated MSI Receiver [AXI Bridge] > > "When any status bit remains set, then msi_ctrl_int remains asserted. > The interrupt status register provides a status bit for up to 32 > interrupt vectors per Endpoint. When the decoded interrupt vector is > enabled but is masked, then the controller sets the corresponding bit > in interrupt status register but the it [sic] does not assert the > top-level controller output msi_ctrl_int. Key word: *output*. That is the level-triggered line. Not the MSIs, which are *input* signals to the mux. > > Signed-off-by: Brian Norris <briannorris@google.com> > Signed-off-by: Brian Norris <briannorris@chromium.org> > --- > > Changes in v2: > * add documentation references > > drivers/pci/controller/dwc/pcie-designware-host.c | 2 +- > 1 file changed, 1 insertion(+), 1 deletion(-) > > diff --git a/drivers/pci/controller/dwc/pcie-designware-host.c b/drivers/pci/controller/dwc/pcie-designware-host.c > index ffaded8f2df7..89a1207754d3 100644 > --- a/drivers/pci/controller/dwc/pcie-designware-host.c > +++ b/drivers/pci/controller/dwc/pcie-designware-host.c > @@ -198,7 +198,7 @@ static int dw_pcie_irq_domain_alloc(struct irq_domain *domain, > for (i = 0; i < nr_irqs; i++) > irq_domain_set_info(domain, virq + i, bit + i, > pp->msi_irq_chip, > - pp, handle_edge_irq, > + pp, handle_level_irq, > NULL, NULL); > > return 0; I don't buy this, at least not without further justification based on the programming model of the mux. It also breaks the semantics of interrupt being made pending while we were handling them (retrigger being one). M.
Hi Marc, First off, thanks for reviewing. I'm a bit unsure about some of this area, which is one reason I sent this patch. Maybe it could have been "RFC". (See also v1: https://lore.kernel.org/all/Z3ho7eJMWvAy3HHC@google.com/ I'm dealing with HW bugs that cause me to have to configure the output signal -- msi_ctrl_int -- as EDGE-triggered on my GIC. This is adjacent to that problem, but doesn't really solve it.) On Thu, Feb 06, 2025 at 09:04:00AM +0000, Marc Zyngier wrote: > On Wed, 05 Feb 2025 23:16:36 +0000, > Brian Norris <briannorris@chromium.org> wrote: > > > > From: Brian Norris <briannorris@google.com> > > > > Per Synopsis's documentation [1], the msi_ctrl_int signal is > > level-triggered, not edge-triggered. > > > > The use of handle_edge_trigger() was chosen in commit 7c5925afbc58 > > ("PCI: dwc: Move MSI IRQs allocation to IRQ domains hierarchical API"), > > which actually dropped preexisting use of handle_level_trigger(). > > Looking at the patch history, this change was only made by request: > > > > Subject: Re: [PATCH v6 1/9] PCI: dwc: Add IRQ chained API support > > https://lore.kernel.org/all/04d3d5b6-9199-218d-476f-c77d04b8d2e7@arm.com/ > > > > "Are you sure about this "handle_level_irq"? MSIs are definitely edge > > triggered, not level." > > > > However, while the underlying MSI protocol is edge-triggered in a sense, > > the DesignWare IP is actually level-triggered. > > You are confusing two things: > > - MSIs are edge triggered. No ifs, no buts. That's because you can't > "unwrite" something. Even the so-called level-triggered MSIs are > build on a pair of edges (one up, one down). > > - The DisgustWare IP multiplexes MSIs onto a single interrupt, and > *latches* them, presenting a level sensitive signal *for the > latch*. Not for the MSIs themselves. Indeed, I probably was a little confused, and distracted by my aforementioned HW bug, which can be at least partially mitigated by masking (which this patch does). I also didn't understand the original choices in various DW-based PCIe drivers, since their choice of handle_level_irq vs handle_edge_irq seemed a bit arbitrary. ... > It also breaks the semantics of > interrupt being made pending while we were handling them (retrigger > being one). What do you mean here? Are you referring to SW state (a la IRQS_PENDING), or HW state? For HW state, MSIs are accumulated in the STATUS register even when we're masked, so they'll "retrigger" when we're done handling. But I'm less clear about some of the IRQ framework semantics here. All in all, I'm OK with dropping this patch, but I'd like to understand a little more of what you think breaks here. Thanks again, Brian
On Thu, 06 Feb 2025 20:06:03 +0000, Brian Norris <briannorris@chromium.org> wrote: > > Hi Marc, > > First off, thanks for reviewing. I'm a bit unsure about some of this > area, which is one reason I sent this patch. Maybe it could have been > "RFC". RFC means nothing to me. Or rather, RFC is a sure indication that a patch can safely be ignored! ;-) My advise on this front is to either post patches as you have done, or not post it at all. > > (See also v1: https://lore.kernel.org/all/Z3ho7eJMWvAy3HHC@google.com/ > > I'm dealing with HW bugs that cause me to have to configure the output > signal -- msi_ctrl_int -- as EDGE-triggered on my GIC. This is adjacent > to that problem, but doesn't really solve it.) Configuring a level-triggered signal as edge is another recipe for disaster (a sure way to miss interrupts), but short of a description of your particular issue, I can't help on that. > > On Thu, Feb 06, 2025 at 09:04:00AM +0000, Marc Zyngier wrote: > > On Wed, 05 Feb 2025 23:16:36 +0000, > > Brian Norris <briannorris@chromium.org> wrote: > > > > > > From: Brian Norris <briannorris@google.com> > > > > > > Per Synopsis's documentation [1], the msi_ctrl_int signal is > > > level-triggered, not edge-triggered. > > > > > > The use of handle_edge_trigger() was chosen in commit 7c5925afbc58 > > > ("PCI: dwc: Move MSI IRQs allocation to IRQ domains hierarchical API"), > > > which actually dropped preexisting use of handle_level_trigger(). > > > Looking at the patch history, this change was only made by request: > > > > > > Subject: Re: [PATCH v6 1/9] PCI: dwc: Add IRQ chained API support > > > https://lore.kernel.org/all/04d3d5b6-9199-218d-476f-c77d04b8d2e7@arm.com/ > > > > > > "Are you sure about this "handle_level_irq"? MSIs are definitely edge > > > triggered, not level." > > > > > > However, while the underlying MSI protocol is edge-triggered in a sense, > > > the DesignWare IP is actually level-triggered. > > > > You are confusing two things: > > > > - MSIs are edge triggered. No ifs, no buts. That's because you can't > > "unwrite" something. Even the so-called level-triggered MSIs are > > build on a pair of edges (one up, one down). > > > > - The DisgustWare IP multiplexes MSIs onto a single interrupt, and > > *latches* them, presenting a level sensitive signal *for the > > latch*. Not for the MSIs themselves. > > Indeed, I probably was a little confused, and distracted by my > aforementioned HW bug, which can be at least partially mitigated by > masking (which this patch does). I also didn't understand the original > choices in various DW-based PCIe drivers, since their choice of > handle_level_irq vs handle_edge_irq seemed a bit arbitrary. > > ... > > > It also breaks the semantics of > > interrupt being made pending while we were handling them (retrigger > > being one). > > What do you mean here? Are you referring to SW state (a la > IRQS_PENDING), or HW state? For HW state, MSIs are accumulated in the > STATUS register even when we're masked, so they'll "retrigger" when > we're done handling. But I'm less clear about some of the IRQ framework > semantics here. IRQS_PENDING is indeed what indicates the SW-driven retrigger state, by which any part of the kernel can decide to reinject an *edge* interrupt if, for any reason, it needs to. This is actually how lazy masking is implemented, by not masking the interrupt, taking it (which is a "consume" operation), realising we were logically masked, masking it for good and marking it as SW-pending for later processing. Hence the while{} loop in handle_edge_irq(). Switching to level triggered removes most of that processing, since by definition, a level is not consumed when acking the interrupt. You need to talk to the end-point for the level to drop, so simply masking the interrupt is enough to get it back when unmasking it. HTH, M.
Hi Marc, On Fri, Feb 07, 2025 at 09:13:32AM +0000, Marc Zyngier wrote: > On Thu, 06 Feb 2025 20:06:03 +0000, > Brian Norris <briannorris@chromium.org> wrote: > > First off, thanks for reviewing. I'm a bit unsure about some of this > > area, which is one reason I sent this patch. Maybe it could have been > > "RFC". > > RFC means nothing to me. Or rather, RFC is a sure indication that a > patch can safely be ignored! ;-) My advise on this front is to either > post patches as you have done, or not post it at all. Haha, OK, I guess you're a Cunningham's Law proponent :) > > (See also v1: https://lore.kernel.org/all/Z3ho7eJMWvAy3HHC@google.com/ > > > > I'm dealing with HW bugs that cause me to have to configure the output > > signal -- msi_ctrl_int -- as EDGE-triggered on my GIC. This is adjacent > > to that problem, but doesn't really solve it.) > > Configuring a level-triggered signal as edge is another recipe for > disaster (a sure way to miss interrupts), I'm very well aware of that, but I'm not aware of great alternatives. > but short of a description > of your particular issue, I can't help on that. Sure, I'm not really asking for that, at least not in this forum. I'm just trying to color the background a bit, that I'm not trying to flip level/edge settings just for fun. > > On Thu, Feb 06, 2025 at 09:04:00AM +0000, Marc Zyngier wrote: > > > It also breaks the semantics of > > > interrupt being made pending while we were handling them (retrigger > > > being one). > > > > What do you mean here? Are you referring to SW state (a la > > IRQS_PENDING), or HW state? For HW state, MSIs are accumulated in the > > STATUS register even when we're masked, so they'll "retrigger" when > > we're done handling. But I'm less clear about some of the IRQ framework > > semantics here. > > IRQS_PENDING is indeed what indicates the SW-driven retrigger state, > by which any part of the kernel can decide to reinject an *edge* > interrupt if, for any reason, it needs to. Are you referring to check_irq_resend() and related code? Notably, the current patch doesn't actually change the result of irq_settings_is_level() -- the nested descriptor still retains its "edge" status -- so the "We do not resend level type interrupts" comment doesn't actually apply. But anyway, that still suggests my patch is probably the wrong way to go about things, as it further mixes up "edge" and "level" concepts. > This is actually how lazy masking is implemented, by not masking the > interrupt, taking it (which is a "consume" operation), realising we > were logically masked, masking it for good and marking it as > SW-pending for later processing. Hence the while{} loop in > handle_edge_irq(). Sure, I've familiarized myself with lazy masking. I don't think it causes any problem here; handle_level_irq simply non-lazily masks it, and we'll pick up the latched result (if any) again when we eventually unmask. > Switching to level triggered removes most of that processing, since by > definition, a level is not consumed when acking the interrupt. You > need to talk to the end-point for the level to drop, so simply masking > the interrupt is enough to get it back when unmasking it. Ack, thanks. Brian
diff --git a/drivers/pci/controller/dwc/pcie-designware-host.c b/drivers/pci/controller/dwc/pcie-designware-host.c index ffaded8f2df7..89a1207754d3 100644 --- a/drivers/pci/controller/dwc/pcie-designware-host.c +++ b/drivers/pci/controller/dwc/pcie-designware-host.c @@ -198,7 +198,7 @@ static int dw_pcie_irq_domain_alloc(struct irq_domain *domain, for (i = 0; i < nr_irqs; i++) irq_domain_set_info(domain, virq + i, bit + i, pp->msi_irq_chip, - pp, handle_edge_irq, + pp, handle_level_irq, NULL, NULL); return 0;