Message ID | 20210823154958.305677-1-bjorn.andersson@linaro.org (mailing list archive) |
---|---|
State | Not Applicable |
Headers | show |
Series | [v2,1/2] PCI: dwc: Perform host_init() before registering msi | expand |
On Mon, Aug 23, 2021 at 08:49:57AM -0700, Bjorn Andersson wrote: > On the Qualcomm sc8180x platform the bootloader does something related > to PCI that leaves a pending "msi" interrupt, which with the current > ordering often fires before init has a chance to enable the clocks that > are necessary for the interrupt handler to access the hardware. > > Move the host_init() call before the registration of the "msi" interrupt > handler to ensure the host driver has a chance to enable the clocks. > > The assignment of the bridge's ops and child_ops is moved along, because > at least the TI Keystone driver overwrites these in its host_init > callback. > > Signed-off-by: Bjorn Andersson <bjorn.andersson@linaro.org> > --- > > Changes since v1: > - New patch, instead of enabling resources in the qcom driver before jumping to > dw_pcie_host_init(), per Rob Herring's suggestion. > > .../pci/controller/dwc/pcie-designware-host.c | 19 ++++++++++--------- > 1 file changed, 10 insertions(+), 9 deletions(-) Reviewed-by: Rob Herring <robh@kernel.org>
On Mon, Aug 23, 2021 at 08:49:57AM -0700, Bjorn Andersson wrote: > On the Qualcomm sc8180x platform the bootloader does something related > to PCI that leaves a pending "msi" interrupt, which with the current > ordering often fires before init has a chance to enable the clocks that > are necessary for the interrupt handler to access the hardware. > > Move the host_init() call before the registration of the "msi" interrupt > handler to ensure the host driver has a chance to enable the clocks. Did you audit other drivers for similar issues? If they do, we should fix them all at once. > The assignment of the bridge's ops and child_ops is moved along, because > at least the TI Keystone driver overwrites these in its host_init > callback. > > Signed-off-by: Bjorn Andersson <bjorn.andersson@linaro.org> > --- > > Changes since v1: > - New patch, instead of enabling resources in the qcom driver before jumping to > dw_pcie_host_init(), per Rob Herring's suggestion. > > .../pci/controller/dwc/pcie-designware-host.c | 19 ++++++++++--------- > 1 file changed, 10 insertions(+), 9 deletions(-) > > diff --git a/drivers/pci/controller/dwc/pcie-designware-host.c b/drivers/pci/controller/dwc/pcie-designware-host.c > index d1d9b8344ec9..f4755f3a03be 100644 > --- a/drivers/pci/controller/dwc/pcie-designware-host.c > +++ b/drivers/pci/controller/dwc/pcie-designware-host.c > @@ -335,6 +335,16 @@ int dw_pcie_host_init(struct pcie_port *pp) > if (pci->link_gen < 1) > pci->link_gen = of_pci_get_max_link_speed(np); > > + /* Set default bus ops */ > + bridge->ops = &dw_pcie_ops; > + bridge->child_ops = &dw_child_pcie_ops; > + > + if (pp->ops->host_init) { > + ret = pp->ops->host_init(pp); > + if (ret) > + return ret; > + } > + > if (pci_msi_enabled()) { > pp->has_msi_ctrl = !(pp->ops->msi_host_init || > of_property_read_bool(np, "msi-parent") || > @@ -388,15 +398,6 @@ int dw_pcie_host_init(struct pcie_port *pp) > } > } > > - /* Set default bus ops */ > - bridge->ops = &dw_pcie_ops; > - bridge->child_ops = &dw_child_pcie_ops; > - > - if (pp->ops->host_init) { > - ret = pp->ops->host_init(pp); > - if (ret) > - goto err_free_msi; > - } > dw_pcie_iatu_detect(pci); > > dw_pcie_setup_rc(pp); > -- > 2.29.2 >
On Tue 24 Aug 12:05 PDT 2021, Bjorn Helgaas wrote: > On Mon, Aug 23, 2021 at 08:49:57AM -0700, Bjorn Andersson wrote: > > On the Qualcomm sc8180x platform the bootloader does something related > > to PCI that leaves a pending "msi" interrupt, which with the current > > ordering often fires before init has a chance to enable the clocks that > > are necessary for the interrupt handler to access the hardware. > > > > Move the host_init() call before the registration of the "msi" interrupt > > handler to ensure the host driver has a chance to enable the clocks. > > Did you audit other drivers for similar issues? If they do, we should > fix them all at once. > I only looked at the DesignWware drivers, in an attempt to find any issues the proposed reordering. The set of bugs causes by drivers registering interrupts before critical resources tends to be rather visible and I don't know if there's much value in speculatively "fixing" drivers. E.g. a quick look through the drivers I see a similar pattern in pci-tegra.c, but it's unlikely that they have the similar problem in practice and I have no way to validate that a change to the order would have a positive effect - or any side effects. Or am I misunderstanding your request? Regards, Bjorn > > The assignment of the bridge's ops and child_ops is moved along, because > > at least the TI Keystone driver overwrites these in its host_init > > callback. > > > > Signed-off-by: Bjorn Andersson <bjorn.andersson@linaro.org> > > --- > > > > Changes since v1: > > - New patch, instead of enabling resources in the qcom driver before jumping to > > dw_pcie_host_init(), per Rob Herring's suggestion. > > > > .../pci/controller/dwc/pcie-designware-host.c | 19 ++++++++++--------- > > 1 file changed, 10 insertions(+), 9 deletions(-) > > > > diff --git a/drivers/pci/controller/dwc/pcie-designware-host.c b/drivers/pci/controller/dwc/pcie-designware-host.c > > index d1d9b8344ec9..f4755f3a03be 100644 > > --- a/drivers/pci/controller/dwc/pcie-designware-host.c > > +++ b/drivers/pci/controller/dwc/pcie-designware-host.c > > @@ -335,6 +335,16 @@ int dw_pcie_host_init(struct pcie_port *pp) > > if (pci->link_gen < 1) > > pci->link_gen = of_pci_get_max_link_speed(np); > > > > + /* Set default bus ops */ > > + bridge->ops = &dw_pcie_ops; > > + bridge->child_ops = &dw_child_pcie_ops; > > + > > + if (pp->ops->host_init) { > > + ret = pp->ops->host_init(pp); > > + if (ret) > > + return ret; > > + } > > + > > if (pci_msi_enabled()) { > > pp->has_msi_ctrl = !(pp->ops->msi_host_init || > > of_property_read_bool(np, "msi-parent") || > > @@ -388,15 +398,6 @@ int dw_pcie_host_init(struct pcie_port *pp) > > } > > } > > > > - /* Set default bus ops */ > > - bridge->ops = &dw_pcie_ops; > > - bridge->child_ops = &dw_child_pcie_ops; > > - > > - if (pp->ops->host_init) { > > - ret = pp->ops->host_init(pp); > > - if (ret) > > - goto err_free_msi; > > - } > > dw_pcie_iatu_detect(pci); > > > > dw_pcie_setup_rc(pp); > > -- > > 2.29.2 > >
On Tue, Aug 24, 2021 at 01:15:49PM -0700, Bjorn Andersson wrote: > On Tue 24 Aug 12:05 PDT 2021, Bjorn Helgaas wrote: > > > On Mon, Aug 23, 2021 at 08:49:57AM -0700, Bjorn Andersson wrote: > > > On the Qualcomm sc8180x platform the bootloader does something related > > > to PCI that leaves a pending "msi" interrupt, which with the current > > > ordering often fires before init has a chance to enable the clocks that > > > are necessary for the interrupt handler to access the hardware. > > > > > > Move the host_init() call before the registration of the "msi" interrupt > > > handler to ensure the host driver has a chance to enable the clocks. > > > > Did you audit other drivers for similar issues? If they do, we should > > fix them all at once. > > I only looked at the DesignWware drivers, in an attempt to find any > issues the proposed reordering. > > The set of bugs causes by drivers registering interrupts before critical > resources tends to be rather visible and I don't know if there's much > value in speculatively "fixing" drivers. > > E.g. a quick look through the drivers I see a similar pattern in > pci-tegra.c, but it's unlikely that they have the similar problem in > practice and I have no way to validate that a change to the order would > have a positive effect - or any side effects. > > Or am I misunderstanding your request? That is exactly my request. I'm not sure if the potential issue you noticed in pci-tegra.c is similar to the one I mentioned here: https://lore.kernel.org/linux-pci/20210624224040.GA3567297@bjorn-Precision-5520/ but I am actually in favor of speculatively fixing drivers even though they're hard to test. Code like this tends to get copied to other places, and fixing several drivers sometimes exposes opportunities for refactoring and sharing code. Bjorn
On Tue 24 Aug 13:29 PDT 2021, Bjorn Helgaas wrote: > On Tue, Aug 24, 2021 at 01:15:49PM -0700, Bjorn Andersson wrote: > > On Tue 24 Aug 12:05 PDT 2021, Bjorn Helgaas wrote: > > > > > On Mon, Aug 23, 2021 at 08:49:57AM -0700, Bjorn Andersson wrote: > > > > On the Qualcomm sc8180x platform the bootloader does something related > > > > to PCI that leaves a pending "msi" interrupt, which with the current > > > > ordering often fires before init has a chance to enable the clocks that > > > > are necessary for the interrupt handler to access the hardware. > > > > > > > > Move the host_init() call before the registration of the "msi" interrupt > > > > handler to ensure the host driver has a chance to enable the clocks. > > > > > > Did you audit other drivers for similar issues? If they do, we should > > > fix them all at once. > > > > I only looked at the DesignWware drivers, in an attempt to find any > > issues the proposed reordering. > > > > The set of bugs causes by drivers registering interrupts before critical > > resources tends to be rather visible and I don't know if there's much > > value in speculatively "fixing" drivers. > > > > E.g. a quick look through the drivers I see a similar pattern in > > pci-tegra.c, but it's unlikely that they have the similar problem in > > practice and I have no way to validate that a change to the order would > > have a positive effect - or any side effects. > > > > Or am I misunderstanding your request? > > That is exactly my request. Okay. > I'm not sure if the potential issue you > noticed in pci-tegra.c is similar to the one I mentioned here: > > https://lore.kernel.org/linux-pci/20210624224040.GA3567297@bjorn-Precision-5520/ > As I still have the tegra driver open, I share your concern about the use of potentially uninitialized variables. The problem I was concerned about was however the same as in my patch and the rockchip one, that if the tegra hardware isn't clocked the pm_runtime_get_sync() (which would turn on power and clock) happens after setting up the msi chain handler... > but I am actually in favor of speculatively fixing drivers even though > they're hard to test. Code like this tends to get copied to other > places, and fixing several drivers sometimes exposes opportunities for > refactoring and sharing code. > Looking through the other cases mentioned in your reply above certainly gives a feeling that this problem has been inherited from driver to driver... I've added a ticket to my backlog to take a deeper look at this. Regards, Bjorn
[+Vidya] On Tue, Aug 24, 2021 at 02:23:14PM -0700, Bjorn Andersson wrote: > On Tue 24 Aug 13:29 PDT 2021, Bjorn Helgaas wrote: > > > On Tue, Aug 24, 2021 at 01:15:49PM -0700, Bjorn Andersson wrote: > > > On Tue 24 Aug 12:05 PDT 2021, Bjorn Helgaas wrote: > > > > > > > On Mon, Aug 23, 2021 at 08:49:57AM -0700, Bjorn Andersson wrote: > > > > > On the Qualcomm sc8180x platform the bootloader does something related > > > > > to PCI that leaves a pending "msi" interrupt, which with the current > > > > > ordering often fires before init has a chance to enable the clocks that > > > > > are necessary for the interrupt handler to access the hardware. > > > > > > > > > > Move the host_init() call before the registration of the "msi" interrupt > > > > > handler to ensure the host driver has a chance to enable the clocks. > > > > > > > > Did you audit other drivers for similar issues? If they do, we should > > > > fix them all at once. > > > > > > I only looked at the DesignWware drivers, in an attempt to find any > > > issues the proposed reordering. > > > > > > The set of bugs causes by drivers registering interrupts before critical > > > resources tends to be rather visible and I don't know if there's much > > > value in speculatively "fixing" drivers. > > > > > > E.g. a quick look through the drivers I see a similar pattern in > > > pci-tegra.c, but it's unlikely that they have the similar problem in > > > practice and I have no way to validate that a change to the order would > > > have a positive effect - or any side effects. > > > > > > Or am I misunderstanding your request? > > > > That is exactly my request. > > Okay. > > > I'm not sure if the potential issue you > > noticed in pci-tegra.c is similar to the one I mentioned here: > > > > https://lore.kernel.org/linux-pci/20210624224040.GA3567297@bjorn-Precision-5520/ > > > > As I still have the tegra driver open, I share your concern about the > use of potentially uninitialized variables. > > The problem I was concerned about was however the same as in my patch > and the rockchip one, that if the tegra hardware isn't clocked the > pm_runtime_get_sync() (which would turn on power and clock) happens > after setting up the msi chain handler... > > > but I am actually in favor of speculatively fixing drivers even though > > they're hard to test. Code like this tends to get copied to other > > places, and fixing several drivers sometimes exposes opportunities for > > refactoring and sharing code. > > > > Looking through the other cases mentioned in your reply above certainly > gives a feeling that this problem has been inherited from driver to > driver... > > I've added a ticket to my backlog to take a deeper look at this. Vidya, can you look into this please ? In the meantime I would merge this series. Thanks, Lorenzo > > Regards, > Bjorn
On Fri 08 Oct 10:48 PDT 2021, Lorenzo Pieralisi wrote: > [+Vidya] > > On Tue, Aug 24, 2021 at 02:23:14PM -0700, Bjorn Andersson wrote: > > On Tue 24 Aug 13:29 PDT 2021, Bjorn Helgaas wrote: > > > > > On Tue, Aug 24, 2021 at 01:15:49PM -0700, Bjorn Andersson wrote: > > > > On Tue 24 Aug 12:05 PDT 2021, Bjorn Helgaas wrote: > > > > > > > > > On Mon, Aug 23, 2021 at 08:49:57AM -0700, Bjorn Andersson wrote: > > > > > > On the Qualcomm sc8180x platform the bootloader does something related > > > > > > to PCI that leaves a pending "msi" interrupt, which with the current > > > > > > ordering often fires before init has a chance to enable the clocks that > > > > > > are necessary for the interrupt handler to access the hardware. > > > > > > > > > > > > Move the host_init() call before the registration of the "msi" interrupt > > > > > > handler to ensure the host driver has a chance to enable the clocks. > > > > > > > > > > Did you audit other drivers for similar issues? If they do, we should > > > > > fix them all at once. > > > > > > > > I only looked at the DesignWware drivers, in an attempt to find any > > > > issues the proposed reordering. > > > > > > > > The set of bugs causes by drivers registering interrupts before critical > > > > resources tends to be rather visible and I don't know if there's much > > > > value in speculatively "fixing" drivers. > > > > > > > > E.g. a quick look through the drivers I see a similar pattern in > > > > pci-tegra.c, but it's unlikely that they have the similar problem in > > > > practice and I have no way to validate that a change to the order would > > > > have a positive effect - or any side effects. > > > > > > > > Or am I misunderstanding your request? > > > > > > That is exactly my request. > > > > Okay. > > > > > I'm not sure if the potential issue you > > > noticed in pci-tegra.c is similar to the one I mentioned here: > > > > > > https://lore.kernel.org/linux-pci/20210624224040.GA3567297@bjorn-Precision-5520/ > > > > > > > As I still have the tegra driver open, I share your concern about the > > use of potentially uninitialized variables. > > > > The problem I was concerned about was however the same as in my patch > > and the rockchip one, that if the tegra hardware isn't clocked the > > pm_runtime_get_sync() (which would turn on power and clock) happens > > after setting up the msi chain handler... > > > > > but I am actually in favor of speculatively fixing drivers even though > > > they're hard to test. Code like this tends to get copied to other > > > places, and fixing several drivers sometimes exposes opportunities for > > > refactoring and sharing code. > > > > > > > Looking through the other cases mentioned in your reply above certainly > > gives a feeling that this problem has been inherited from driver to > > driver... > > > > I've added a ticket to my backlog to take a deeper look at this. > > Vidya, can you look into this please ? In the meantime I would merge > this series. > I would greatly appreciate that, as it unlocks 5G connectivity and NVME support on my laptop. I still intend to review the items Bjorn pointed out, but I haven't gotten there yet. Thanks, Bjorn > Thanks, > Lorenzo > > > > > Regards, > > Bjorn
On Mon, 23 Aug 2021 08:49:57 -0700, Bjorn Andersson wrote: > On the Qualcomm sc8180x platform the bootloader does something related > to PCI that leaves a pending "msi" interrupt, which with the current > ordering often fires before init has a chance to enable the clocks that > are necessary for the interrupt handler to access the hardware. > > Move the host_init() call before the registration of the "msi" interrupt > handler to ensure the host driver has a chance to enable the clocks. > > [...] Applied to pci/dwc, thanks! [1/2] PCI: dwc: Perform host_init() before registering msi https://git.kernel.org/lpieralisi/pci/c/7e919677bb [2/2] PCI: qcom: Add sc8180x compatible https://git.kernel.org/lpieralisi/pci/c/0e00fc858f Thanks, Lorenzo
diff --git a/drivers/pci/controller/dwc/pcie-designware-host.c b/drivers/pci/controller/dwc/pcie-designware-host.c index d1d9b8344ec9..f4755f3a03be 100644 --- a/drivers/pci/controller/dwc/pcie-designware-host.c +++ b/drivers/pci/controller/dwc/pcie-designware-host.c @@ -335,6 +335,16 @@ int dw_pcie_host_init(struct pcie_port *pp) if (pci->link_gen < 1) pci->link_gen = of_pci_get_max_link_speed(np); + /* Set default bus ops */ + bridge->ops = &dw_pcie_ops; + bridge->child_ops = &dw_child_pcie_ops; + + if (pp->ops->host_init) { + ret = pp->ops->host_init(pp); + if (ret) + return ret; + } + if (pci_msi_enabled()) { pp->has_msi_ctrl = !(pp->ops->msi_host_init || of_property_read_bool(np, "msi-parent") || @@ -388,15 +398,6 @@ int dw_pcie_host_init(struct pcie_port *pp) } } - /* Set default bus ops */ - bridge->ops = &dw_pcie_ops; - bridge->child_ops = &dw_child_pcie_ops; - - if (pp->ops->host_init) { - ret = pp->ops->host_init(pp); - if (ret) - goto err_free_msi; - } dw_pcie_iatu_detect(pci); dw_pcie_setup_rc(pp);
On the Qualcomm sc8180x platform the bootloader does something related to PCI that leaves a pending "msi" interrupt, which with the current ordering often fires before init has a chance to enable the clocks that are necessary for the interrupt handler to access the hardware. Move the host_init() call before the registration of the "msi" interrupt handler to ensure the host driver has a chance to enable the clocks. The assignment of the bridge's ops and child_ops is moved along, because at least the TI Keystone driver overwrites these in its host_init callback. Signed-off-by: Bjorn Andersson <bjorn.andersson@linaro.org> --- Changes since v1: - New patch, instead of enabling resources in the qcom driver before jumping to dw_pcie_host_init(), per Rob Herring's suggestion. .../pci/controller/dwc/pcie-designware-host.c | 19 ++++++++++--------- 1 file changed, 10 insertions(+), 9 deletions(-)