diff mbox series

PCI: dwc: Use level-triggered handler for MSI IRQs

Message ID 20241015141215.1.Id60295bee6aacf44aa3664e702012cb4710529c3@changeid (mailing list archive)
State Under Review
Delegated to: Manivannan Sadhasivam
Headers show
Series PCI: dwc: Use level-triggered handler for MSI IRQs | expand

Commit Message

Brian Norris Oct. 15, 2024, 9:12 p.m. UTC
From: Brian Norris <briannorris@google.com>

Per Synopsis's documentation, the msi_ctrl_int signal is
level-triggered, not edge-triggered.

The use of handle_edge_trigger() was chosen in commit 7c5925afbc58
("PCI: dwc: Move MSI IRQs allocation to IRQ domains hierarchical API"),
which actually dropped preexisting use of handle_level_trigger().
Looking at the patch history, this change was only made by request:

  Subject: Re: [PATCH v6 1/9] PCI: dwc: Add IRQ chained API support
  https://lore.kernel.org/all/04d3d5b6-9199-218d-476f-c77d04b8d2e7@arm.com/

  "Are you sure about this "handle_level_irq"? MSIs are definitely edge
   triggered, not level."

However, while the underlying MSI protocol is edge-triggered in a sense,
the DesignWare IP is actually level-triggered.

So, let's switch back to level-triggered.

In many cases, the distinction doesn't really matter here, because this
signal is hidden behind another (chained) top-level IRQ which can be
masked on its own. But it's still wise to manipulate this interrupt line
according to its actual specification -- specifically, to mask it while
we handle it.

Signed-off-by: Brian Norris <briannorris@google.com>
Signed-off-by: Brian Norris <briannorris@chromium.org>
---

 drivers/pci/controller/dwc/pcie-designware-host.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

Comments

manivannan.sadhasivam@linaro.org Dec. 30, 2024, 5:11 p.m. UTC | #1
On Tue, Oct 15, 2024 at 02:12:16PM -0700, Brian Norris wrote:
> From: Brian Norris <briannorris@google.com>
> 
> Per Synopsis's documentation, the msi_ctrl_int signal is
> level-triggered, not edge-triggered.
> 

Could you please quote the spec reference?

- Mani

> The use of handle_edge_trigger() was chosen in commit 7c5925afbc58
> ("PCI: dwc: Move MSI IRQs allocation to IRQ domains hierarchical API"),
> which actually dropped preexisting use of handle_level_trigger().
> Looking at the patch history, this change was only made by request:
> 
>   Subject: Re: [PATCH v6 1/9] PCI: dwc: Add IRQ chained API support
>   https://lore.kernel.org/all/04d3d5b6-9199-218d-476f-c77d04b8d2e7@arm.com/
> 
>   "Are you sure about this "handle_level_irq"? MSIs are definitely edge
>    triggered, not level."
> 
> However, while the underlying MSI protocol is edge-triggered in a sense,
> the DesignWare IP is actually level-triggered.
> 
> So, let's switch back to level-triggered.
> 
> In many cases, the distinction doesn't really matter here, because this
> signal is hidden behind another (chained) top-level IRQ which can be
> masked on its own. But it's still wise to manipulate this interrupt line
> according to its actual specification -- specifically, to mask it while
> we handle it.
> 
> Signed-off-by: Brian Norris <briannorris@google.com>
> Signed-off-by: Brian Norris <briannorris@chromium.org>
> ---
> 
>  drivers/pci/controller/dwc/pcie-designware-host.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/drivers/pci/controller/dwc/pcie-designware-host.c b/drivers/pci/controller/dwc/pcie-designware-host.c
> index 3e41865c7290..0fb79a26d05e 100644
> --- a/drivers/pci/controller/dwc/pcie-designware-host.c
> +++ b/drivers/pci/controller/dwc/pcie-designware-host.c
> @@ -198,7 +198,7 @@ static int dw_pcie_irq_domain_alloc(struct irq_domain *domain,
>  	for (i = 0; i < nr_irqs; i++)
>  		irq_domain_set_info(domain, virq + i, bit + i,
>  				    pp->msi_irq_chip,
> -				    pp, handle_edge_irq,
> +				    pp, handle_level_irq,
>  				    NULL, NULL);
>  
>  	return 0;
> -- 
> 2.47.0.rc1.288.g06298d1525-goog
>
Brian Norris Jan. 3, 2025, 1:43 a.m. UTC | #2
On Mon, Dec 30, 2024 at 10:41:45PM +0530, Manivannan Sadhasivam wrote:
> On Tue, Oct 15, 2024 at 02:12:16PM -0700, Brian Norris wrote:
> > From: Brian Norris <briannorris@google.com>
> > 
> > Per Synopsis's documentation, the msi_ctrl_int signal is
> > level-triggered, not edge-triggered.
> > 
> 
> Could you please quote the spec reference?

From the reference manual for msi_ctrl_int:

  "Asserted when an MSI interrupt is pending. De-asserted when there is
  no MSI interrupt pending.
  ...
  Active State: High (level)"

The reference manual also points at the databook for more info. One
relevant excerpt from the databook:

  "When any status bit remains set, then msi_ctrl_int remains asserted.
  The interrupt status register provides a status bit for up to 32
  interrupt vectors per Endpoint. When the decoded interrupt vector is
  enabled but is masked, then the controller sets the corresponding bit
  in interrupt status register but the it does not assert the top-level
  controller output msi_ctrl_int.

That's essentially a prose description of level-triggering, plus
32-vector multiplexing and masking.

Did you want a v2 with this included, or did you just want it noted
here?

(Side note: I think it doesn't really matter that much whether we use
the 'level' or 'edge' variant handlers here, at least if the parent
interrupt is configured correctly as level-triggered. We're not actually
in danger of a level-triggered interrupt flood or similar issue.)

Brian
Bjorn Helgaas Jan. 3, 2025, 5:49 p.m. UTC | #3
On Thu, Jan 02, 2025 at 05:43:26PM -0800, Brian Norris wrote:
> On Mon, Dec 30, 2024 at 10:41:45PM +0530, Manivannan Sadhasivam wrote:
> > On Tue, Oct 15, 2024 at 02:12:16PM -0700, Brian Norris wrote:
> > > From: Brian Norris <briannorris@google.com>
> > > 
> > > Per Synopsis's documentation, the msi_ctrl_int signal is
> > > level-triggered, not edge-triggered.
> > 
> > Could you please quote the spec reference?
> 
> From the reference manual for msi_ctrl_int:
> 
>   "Asserted when an MSI interrupt is pending. De-asserted when there is
>   no MSI interrupt pending.
>   ...
>   Active State: High (level)"
> 
> The reference manual also points at the databook for more info. One
> relevant excerpt from the databook:
> 
>   "When any status bit remains set, then msi_ctrl_int remains asserted.
>   The interrupt status register provides a status bit for up to 32
>   interrupt vectors per Endpoint. When the decoded interrupt vector is
>   enabled but is masked, then the controller sets the corresponding bit
>   in interrupt status register but the it does not assert the top-level
>   controller output msi_ctrl_int.

"the it" might be a transcription error?

> That's essentially a prose description of level-triggering, plus
> 32-vector multiplexing and masking.
> 
> Did you want a v2 with this included, or did you just want it noted
> here?

I think a v2 with citations (spec name, revision, section number)
would be helpful.  Including these quotes as well would be fine with
me.

> (Side note: I think it doesn't really matter that much whether we use
> the 'level' or 'edge' variant handlers here, at least if the parent
> interrupt is configured correctly as level-triggered. We're not actually
> in danger of a level-triggered interrupt flood or similar issue.)
> 
> Brian
Bjorn Helgaas Jan. 3, 2025, 5:58 p.m. UTC | #4
On Fri, Jan 03, 2025 at 11:49:57AM -0600, Bjorn Helgaas wrote:
> On Thu, Jan 02, 2025 at 05:43:26PM -0800, Brian Norris wrote:
> > On Mon, Dec 30, 2024 at 10:41:45PM +0530, Manivannan Sadhasivam wrote:
> > > On Tue, Oct 15, 2024 at 02:12:16PM -0700, Brian Norris wrote:
> > > > From: Brian Norris <briannorris@google.com>
> > > > 
> > > > Per Synopsis's documentation, the msi_ctrl_int signal is
> > > > level-triggered, not edge-triggered.
> > > 
> > > Could you please quote the spec reference?
> > 
> > From the reference manual for msi_ctrl_int:
> > 
> >   "Asserted when an MSI interrupt is pending. De-asserted when there is
> >   no MSI interrupt pending.
> >   ...
> >   Active State: High (level)"
> > 
> > The reference manual also points at the databook for more info. One
> > relevant excerpt from the databook:
> > 
> >   "When any status bit remains set, then msi_ctrl_int remains asserted.
> >   The interrupt status register provides a status bit for up to 32
> >   interrupt vectors per Endpoint. When the decoded interrupt vector is
> >   enabled but is masked, then the controller sets the corresponding bit
> >   in interrupt status register but the it does not assert the top-level
> >   controller output msi_ctrl_int.
> 
> "the it" might be a transcription error?
> 
> > That's essentially a prose description of level-triggering, plus
> > 32-vector multiplexing and masking.
> > 
> > Did you want a v2 with this included, or did you just want it noted
> > here?
> 
> I think a v2 with citations (spec name, revision, section number)
> would be helpful.  Including these quotes as well would be fine with
> me.

Oh, and it would be awesome if we can motivate this patch by mentioning
an actual problem it can avoid.

It sounds like there really *is* a problem at least in some
topologies, so I think we should describe that problem before
explaining why we haven't seen it yet.

> > (Side note: I think it doesn't really matter that much whether we use
> > the 'level' or 'edge' variant handlers here, at least if the parent
> > interrupt is configured correctly as level-triggered. We're not actually
> > in danger of a level-triggered interrupt flood or similar issue.)
> > 
> > Brian
Brian Norris Jan. 3, 2025, 10:47 p.m. UTC | #5
Hi Bjorn,

On Fri, Jan 03, 2025 at 11:58:39AM -0600, Bjorn Helgaas wrote:
> On Fri, Jan 03, 2025 at 11:49:57AM -0600, Bjorn Helgaas wrote:
> > On Thu, Jan 02, 2025 at 05:43:26PM -0800, Brian Norris wrote:
> > > On Mon, Dec 30, 2024 at 10:41:45PM +0530, Manivannan Sadhasivam wrote:
> > > > On Tue, Oct 15, 2024 at 02:12:16PM -0700, Brian Norris wrote:
> > > > > From: Brian Norris <briannorris@google.com>
> > > > > 
> > > > > Per Synopsis's documentation, the msi_ctrl_int signal is
> > > > > level-triggered, not edge-triggered.
> > > > 
> > > > Could you please quote the spec reference?
> > > 
> > > From the reference manual for msi_ctrl_int:
> > > 
> > >   "Asserted when an MSI interrupt is pending. De-asserted when there is
> > >   no MSI interrupt pending.
> > >   ...
> > >   Active State: High (level)"
> > > 
> > > The reference manual also points at the databook for more info. One
> > > relevant excerpt from the databook:
> > > 
> > >   "When any status bit remains set, then msi_ctrl_int remains asserted.
> > >   The interrupt status register provides a status bit for up to 32
> > >   interrupt vectors per Endpoint. When the decoded interrupt vector is
> > >   enabled but is masked, then the controller sets the corresponding bit
> > >   in interrupt status register but the it does not assert the top-level
> > >   controller output msi_ctrl_int.
> > 
> > "the it" might be a transcription error?

Nope, direct copy/paste quote. I unwisely chose not to include a
"[sic]".

> > > That's essentially a prose description of level-triggering, plus
> > > 32-vector multiplexing and masking.
> > > 
> > > Did you want a v2 with this included, or did you just want it noted
> > > here?
> > 
> > I think a v2 with citations (spec name, revision, section number)
> > would be helpful.  Including these quotes as well would be fine with
> > me.

For the record:

DesignWare Cores PCI Express RP Controller Reference Manual
Version 6.00a, June 2022
Section 2.89 MSI Interface Signals

and

DesignWare Cores PCI Express Controller Databook
Version 6.00a, June 2022
Section 3.9.2.3 iMSI-RX: Integrated MSI Receiver [AXI Bridge]

Sure, I can send v2.

> Oh, and it would be awesome if we can motivate this patch by mentioning
> an actual problem it can avoid.
> 
> It sounds like there really *is* a problem at least in some
> topologies, so I think we should describe that problem before
> explaining why we haven't seen it yet.

Yeah, that's probably a good idea ... I'm still working out the nature
of a problem I'm dealing with here, but it has to do with when (due to
HW bugs) I have to configure the parent interrupt (GIC) as
edge-triggered. It turns out this change alone doesn't resolve all my
problems, but:

(a) I was hoping to get feedback on whether this change is sensible
    regardless of the adjacent HW bug I'm dealing with (I think it is);
    and
(b) I don't think I have a great publishable explanation of my HW bug(s)
    yet.

I understand (b) is not really a great situation for public review and
would understand if that delays/defers any action here. But I'm also not
really an IRQ expert (though I have to dabble quite a lot) and am
interested in (a) still.

(If it helps, I can try to summarize the above in a commit message, even
if it's still a bit vague.)

Brian

> > > (Side note: I think it doesn't really matter that much whether we use
> > > the 'level' or 'edge' variant handlers here, at least if the parent
> > > interrupt is configured correctly as level-triggered. We're not actually
> > > in danger of a level-triggered interrupt flood or similar issue.)
manivannan.sadhasivam@linaro.org Jan. 4, 2025, 10:09 a.m. UTC | #6
On Fri, Jan 03, 2025 at 02:47:09PM -0800, Brian Norris wrote:

[...]

> > Oh, and it would be awesome if we can motivate this patch by mentioning
> > an actual problem it can avoid.
> > 
> > It sounds like there really *is* a problem at least in some
> > topologies, so I think we should describe that problem before
> > explaining why we haven't seen it yet.
> 
> Yeah, that's probably a good idea ... I'm still working out the nature
> of a problem I'm dealing with here, but it has to do with when (due to
> HW bugs) I have to configure the parent interrupt (GIC) as
> edge-triggered. It turns out this change alone doesn't resolve all my
> problems, but:
> 
> (a) I was hoping to get feedback on whether this change is sensible
>     regardless of the adjacent HW bug I'm dealing with (I think it is);
>     and

This patch alone (with your proposed commit message change) makes sense to me.
Since DWC MSI controller is a hierarchial interrupt controller and most of the
designs chose to use level triggered SPI to signal GIC, we never had to face an
issue. GIC would mask the interrupt first and then unmask it after handling. So
even if the DWC MSI is level/edge triggered, the behavior would mostly be the
same.

But we should model the interrupt as per the hardware design. So your patch is
doing the right thing.

> (b) I don't think I have a great publishable explanation of my HW bug(s)
>     yet.
> 

This is not a blocker for this patch IMO.

> I understand (b) is not really a great situation for public review and
> would understand if that delays/defers any action here. But I'm also not
> really an IRQ expert (though I have to dabble quite a lot) and am
> interested in (a) still.
> 
> (If it helps, I can try to summarize the above in a commit message, even
> if it's still a bit vague.)
> 

Your bug is not strictly relevant to this patch. Quoting the spec reference and
changing the commit message as per Bjorn's suggestion is enough.

- Mani
diff mbox series

Patch

diff --git a/drivers/pci/controller/dwc/pcie-designware-host.c b/drivers/pci/controller/dwc/pcie-designware-host.c
index 3e41865c7290..0fb79a26d05e 100644
--- a/drivers/pci/controller/dwc/pcie-designware-host.c
+++ b/drivers/pci/controller/dwc/pcie-designware-host.c
@@ -198,7 +198,7 @@  static int dw_pcie_irq_domain_alloc(struct irq_domain *domain,
 	for (i = 0; i < nr_irqs; i++)
 		irq_domain_set_info(domain, virq + i, bit + i,
 				    pp->msi_irq_chip,
-				    pp, handle_edge_irq,
+				    pp, handle_level_irq,
 				    NULL, NULL);
 
 	return 0;