Message ID | qcoqksikfvdqxk6stezbzc7l2br37ccgqswztzqejmhrkhbrwt@ta4npsm35mqk (mailing list archive) |
---|---|
State | Accepted |
Commit | 07bbe3fd0704ab47d365756a31f45a86e3b45c0a |
Series | [v2] arm64: dts: qcom: sa8540p-ride: disable pcie2a node |
On Tue, Jan 09, 2024 at 10:20:50AM -0500, Lucas Karpinski wrote:
> pcie2a and pcie3a both cause interrupt storms to occur. However, when
> both are enabled simultaneously, the two combined interrupt storms will
> lead to rcu stalls. Red Hat is the only company still using this board
> and since we still need pcie3a, just disable pcie2a.
>
> Signed-off-by: Lucas Karpinski <lkarpins@redhat.com>

Reviewed-by: Brian Masney <bmasney@redhat.com>

To elaborate further: Leaving both pcie2a and pcie3a enabled will lead
to rcu stalls and the board fails to boot when both are enabled. We
have the latest firmware that we've been able to get from QC.
Disabling one of the pcie nodes works around the boot issue. There's
nothing interesting on pcie2a on the development board, and pcie3a is
enabled because it has 10GB ethernet that works upstream.

The interrupt storm on pcie3a can still occur on this platform, however
that's a separate issue.

Brian
On Thu, Jan 11, 2024 at 09:02:41AM -0500, Brian Masney wrote:
> On Tue, Jan 09, 2024 at 10:20:50AM -0500, Lucas Karpinski wrote:
> > pcie2a and pcie3a both cause interrupt storms to occur. However, when
> > both are enabled simultaneously, the two combined interrupt storms will
> > lead to rcu stalls. Red Hat is the only company still using this board
> > and since we still need pcie3a, just disable pcie2a.
> >
> > Signed-off-by: Lucas Karpinski <lkarpins@redhat.com>
>
> Reviewed-by: Brian Masney <bmasney@redhat.com>
>
> To elaborate further: Leaving both pcie2a and pcie3a enabled will lead
> to rcu stalls and the board fails to boot when both are enabled. We
> have the latest firmware that we've been able to get from QC.
> Disabling one of the pcie nodes works around the boot issue. There's
> nothing interesting on pcie2a on the development board, and pcie3a is
> enabled because it has 10GB ethernet that works upstream.
>
> The interrupt storm on pcie3a can still occur on this platform, however
> that's a separate issue.

Related work-around to that in case anyone is interested in the paper
trail:

https://lore.kernel.org/all/89c13962f5502a89d48f1efb7a6203d155a7e18d.camel@redhat.com/
On Tue, Jan 09, 2024 at 10:20:50AM -0500, Lucas Karpinski wrote:
> pcie2a and pcie3a both cause interrupt storms to occur. However, when
> both are enabled simultaneously, the two combined interrupt storms will
> lead to rcu stalls. Red Hat is the only company still using this board
> and since we still need pcie3a, just disable pcie2a.
>

Why are there interrupt storms? What interrupt(s) is(are) involved? Do
you consider this a temporary fix? Are you okay with pcie3a misbehaving?

Regards,
Bjorn

> Signed-off-by: Lucas Karpinski <lkarpins@redhat.com>
> ---
> v2:
> - don't remove the entire pcie2a node, just set status to disabled.
> - update commit message.
>
>  arch/arm64/boot/dts/qcom/sa8540p-ride.dts | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/arch/arm64/boot/dts/qcom/sa8540p-ride.dts b/arch/arm64/boot/dts/qcom/sa8540p-ride.dts
> index b04f72ec097c..177b9dad6ff7 100644
> --- a/arch/arm64/boot/dts/qcom/sa8540p-ride.dts
> +++ b/arch/arm64/boot/dts/qcom/sa8540p-ride.dts
> @@ -376,14 +376,14 @@ &pcie2a {
>  	pinctrl-names = "default";
>  	pinctrl-0 = <&pcie2a_default>;
>
> -	status = "okay";
> +	status = "disabled";
>  };
>
>  &pcie2a_phy {
>  	vdda-phy-supply = <&vreg_l11a>;
>  	vdda-pll-supply = <&vreg_l3a>;
>
> -	status = "okay";
> +	status = "disabled";
>  };
>
>  &pcie3a {
> --
> 2.43.0
>
> Why are there interrupt storms? What interrupt(s) is(are) involved?

In the earlier link that Andrew mentioned, the DesignWare PCIe driver
uses a chained interrupt to demultiplex the downstream MSI interrupts.
This meant we couldn't identify the MSI interrupt source, so it is not
clear what is causing the hw to misbehave the way that it is.

> Do you consider this a temporary fix?

This will likely be a permanent fix. Qualcomm disabled pcie2a in their
downstream kernel as well, quite some time ago, so this may never be
actually fixed.

> Are you okay with pcie3a misbehaving?

Yes, it would be great if the underlying issue was addressed, but at
least the boards are usable with just pcie3a enabled and the nic will be
available.

Lucas

> > Signed-off-by: Lucas Karpinski <lkarpins@redhat.com>
> > ---
> > v2:
> > - don't remove the entire pcie2a node, just set status to disabled.
> > - update commit message.
> >
> >  arch/arm64/boot/dts/qcom/sa8540p-ride.dts | 4 ++--
> >  1 file changed, 2 insertions(+), 2 deletions(-)
> >
> > diff --git a/arch/arm64/boot/dts/qcom/sa8540p-ride.dts b/arch/arm64/boot/dts/qcom/sa8540p-ride.dts
> > index b04f72ec097c..177b9dad6ff7 100644
> > --- a/arch/arm64/boot/dts/qcom/sa8540p-ride.dts
> > +++ b/arch/arm64/boot/dts/qcom/sa8540p-ride.dts
> > @@ -376,14 +376,14 @@ &pcie2a {
> >  	pinctrl-names = "default";
> >  	pinctrl-0 = <&pcie2a_default>;
> >
> > -	status = "okay";
> > +	status = "disabled";
> >  };
> >
> >  &pcie2a_phy {
> >  	vdda-phy-supply = <&vreg_l11a>;
> >  	vdda-pll-supply = <&vreg_l3a>;
> >
> > -	status = "okay";
> > +	status = "disabled";
> >  };
> >
> >  &pcie3a {
> > --
> > 2.43.0
> >
>
On Tue, 09 Jan 2024 10:20:50 -0500, Lucas Karpinski wrote:
> pcie2a and pcie3a both cause interrupt storms to occur. However, when
> both are enabled simultaneously, the two combined interrupt storms will
> lead to rcu stalls. Red Hat is the only company still using this board
> and since we still need pcie3a, just disable pcie2a.
>
>

Applied, thanks!

[1/1] arm64: dts: qcom: sa8540p-ride: disable pcie2a node
      commit: 07bbe3fd0704ab47d365756a31f45a86e3b45c0a

Best regards,
diff --git a/arch/arm64/boot/dts/qcom/sa8540p-ride.dts b/arch/arm64/boot/dts/qcom/sa8540p-ride.dts
index b04f72ec097c..177b9dad6ff7 100644
--- a/arch/arm64/boot/dts/qcom/sa8540p-ride.dts
+++ b/arch/arm64/boot/dts/qcom/sa8540p-ride.dts
@@ -376,14 +376,14 @@ &pcie2a {
 	pinctrl-names = "default";
 	pinctrl-0 = <&pcie2a_default>;
 
-	status = "okay";
+	status = "disabled";
 };
 
 &pcie2a_phy {
 	vdda-phy-supply = <&vreg_l11a>;
 	vdda-pll-supply = <&vreg_l3a>;
 
-	status = "okay";
+	status = "disabled";
 };
 
 &pcie3a {
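For readers skimming the hunk, the two overrides would read roughly as follows once the patch is applied. This is only a sketch assembled from the diff context: the `&pcie2a` node has earlier properties above `pinctrl-names` that the hunk does not show, and they are marked with a placeholder comment rather than guessed.

```dts
/* arch/arm64/boot/dts/qcom/sa8540p-ride.dts, state after this patch */
&pcie2a {
	/* earlier properties of this node lie outside the hunk; not shown */
	pinctrl-names = "default";
	pinctrl-0 = <&pcie2a_default>;

	/* controller is no longer probed */
	status = "disabled";
};

&pcie2a_phy {
	vdda-phy-supply = <&vreg_l11a>;
	vdda-pll-supply = <&vreg_l3a>;

	/* PHY is disabled together with its controller */
	status = "disabled";
};
```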
pcie2a and pcie3a both cause interrupt storms to occur. However, when
both are enabled simultaneously, the two combined interrupt storms will
lead to rcu stalls. Red Hat is the only company still using this board
and since we still need pcie3a, just disable pcie2a.

Signed-off-by: Lucas Karpinski <lkarpins@redhat.com>
---
v2:
- don't remove the entire pcie2a node, just set status to disabled.
- update commit message.

 arch/arm64/boot/dts/qcom/sa8540p-ride.dts | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)
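
One practical consequence of the v2 approach (keeping the node and only setting its status) is that the `&pcie2a` and `&pcie2a_phy` overrides stay in the board file, so someone debugging the interrupt storm could flip them back locally. The fragment below is a hypothetical sketch, not part of the patch: it assumes it is appended after the overrides above in a private copy of the board file, relying on the usual devicetree rule that a later assignment to the same property replaces the earlier one. Per the thread, the board is reported to hit rcu stalls and fail to boot with both pcie2a and pcie3a enabled, so this would only make sense for investigation.

```dts
/*
 * Hypothetical local debugging fragment, NOT part of the applied patch.
 * Appended after the existing &pcie2a / &pcie2a_phy overrides, these
 * later assignments replace the status = "disabled" set by the patch.
 */
&pcie2a {
	status = "okay";
};

&pcie2a_phy {
	status = "okay";
};
```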