diff mbox series

[1/2] soc: fsl: qbman: qman_portal: defer probing when qman is not available

Message ID 20180823213600.23426-1-alexandre.belloni@bootlin.com (mailing list archive)
State New, archived

Commit Message

Alexandre Belloni Aug. 23, 2018, 9:35 p.m. UTC
If the qman driver (qman_ccsr) doesn't probe or fails to probe before
qman_portal, qm_ccsr_start will be either NULL or a stale pointer to an
unmapped page.

This leads to a crash when probing qman_portal, as the init_pcfg function
calls qman_liodn_fixup, which tries to read qman registers.

Assume that qman didn't probe when the pool mask is 0.

Signed-off-by: Alexandre Belloni <alexandre.belloni@bootlin.com>
---
 drivers/soc/fsl/qbman/qman_portal.c | 2 ++
 1 file changed, 2 insertions(+)

Comments

Roy Pledge Aug. 24, 2018, 2:52 p.m. UTC | #1
On 8/23/2018 5:36 PM, Alexandre Belloni wrote:
> If the qman driver (qman_ccsr) doesn't probe or fail to probe before
> qman_portal, qm_ccsr_start will be either NULL or a stale pointer to an
> unmapped page.
>
> This leads to a crash when probing  qman_portal as the init_pcfg function
> calls qman_liodn_fixup that tries to read qman registers.
>
> Assume that qman didn't probe when the pool mask is 0.
>
> Signed-off-by: Alexandre Belloni <alexandre.belloni@bootlin.com>
> ---
>  drivers/soc/fsl/qbman/qman_portal.c | 2 ++
>  1 file changed, 2 insertions(+)
>
> diff --git a/drivers/soc/fsl/qbman/qman_portal.c b/drivers/soc/fsl/qbman/qman_portal.c
> index a120002b630e..4fc80d2c8feb 100644
> --- a/drivers/soc/fsl/qbman/qman_portal.c
> +++ b/drivers/soc/fsl/qbman/qman_portal.c
> @@ -277,6 +277,8 @@ static int qman_portal_probe(struct platform_device *pdev)
>  	}
>  
>  	pcfg->pools = qm_get_pools_sdqcr();
> +	if (pcfg->pools == 0)
> +		return -EPROBE_DEFER;
>  
>  	spin_lock(&qman_lock);
>  	cpu = cpumask_next_zero(-1, &portal_cpus);

Reviewed-by: Roy Pledge <roy.pledge@nxp.com>
Leo Li Aug. 28, 2018, 10:49 p.m. UTC | #2
On Fri, Aug 24, 2018 at 9:54 AM Roy Pledge <roy.pledge@nxp.com> wrote:
>
> On 8/23/2018 5:36 PM, Alexandre Belloni wrote:
> > If the qman driver (qman_ccsr) doesn't probe or fail to probe before
> > qman_portal, qm_ccsr_start will be either NULL or a stale pointer to an
> > unmapped page.
> >
> > This leads to a crash when probing  qman_portal as the init_pcfg function
> > calls qman_liodn_fixup that tries to read qman registers.
> >
> > Assume that qman didn't probe when the pool mask is 0.
> >
> > Signed-off-by: Alexandre Belloni <alexandre.belloni@bootlin.com>

Series applied to the fix branch of soc/fsl.

> > ---
> >  drivers/soc/fsl/qbman/qman_portal.c | 2 ++
> >  1 file changed, 2 insertions(+)
> >
> > diff --git a/drivers/soc/fsl/qbman/qman_portal.c b/drivers/soc/fsl/qbman/qman_portal.c
> > index a120002b630e..4fc80d2c8feb 100644
> > --- a/drivers/soc/fsl/qbman/qman_portal.c
> > +++ b/drivers/soc/fsl/qbman/qman_portal.c
> > @@ -277,6 +277,8 @@ static int qman_portal_probe(struct platform_device *pdev)
> >       }
> >
> >       pcfg->pools = qm_get_pools_sdqcr();
> > +     if (pcfg->pools == 0)
> > +             return -EPROBE_DEFER;
> >
> >       spin_lock(&qman_lock);
> >       cpu = cpumask_next_zero(-1, &portal_cpus);
>
> Reviewed-by: Roy Pledge <roy.pledge@nxp.com>
>
>
Olof Johansson Sept. 25, 2018, 7:45 p.m. UTC | #3
Hi,


On Thu, Aug 23, 2018 at 11:36 PM Alexandre Belloni
<alexandre.belloni@bootlin.com> wrote:
>
> If the qman driver (qman_ccsr) doesn't probe or fail to probe before
> qman_portal, qm_ccsr_start will be either NULL or a stale pointer to an
> unmapped page.
>
> This leads to a crash when probing  qman_portal as the init_pcfg function
> calls qman_liodn_fixup that tries to read qman registers.
>
> Assume that qman didn't probe when the pool mask is 0.
>
> Signed-off-by: Alexandre Belloni <alexandre.belloni@bootlin.com>
> ---
>  drivers/soc/fsl/qbman/qman_portal.c | 2 ++
>  1 file changed, 2 insertions(+)
>
> diff --git a/drivers/soc/fsl/qbman/qman_portal.c b/drivers/soc/fsl/qbman/qman_portal.c
> index a120002b630e..4fc80d2c8feb 100644
> --- a/drivers/soc/fsl/qbman/qman_portal.c
> +++ b/drivers/soc/fsl/qbman/qman_portal.c
> @@ -277,6 +277,8 @@ static int qman_portal_probe(struct platform_device *pdev)
>         }
>
>         pcfg->pools = qm_get_pools_sdqcr();
> +       if (pcfg->pools == 0)
> +               return -EPROBE_DEFER;

This is quite late in the probe, after a bunch of resources have been claimed.

Note that the ioremaps above this are doing unwinds, and you'll end up
doing duplicate ioremaps if you come in and probe again.

You should probably unwind those allocations, or move them to devm_*
or do this check earlier in the function.
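
To make the concern concrete, here is a minimal userspace sketch, not the
driver code. fake_ioremap(), fake_iounmap(), pools_mask and both probe
functions are hypothetical stand-ins, and EPROBE_DEFER is defined locally
because userspace headers do not provide it. The first probe returns with
both mappings still live; the second releases them through an error label
before returning.

```c
#include <assert.h>
#include <errno.h>

#ifndef EPROBE_DEFER
#define EPROBE_DEFER 517 /* kernel-internal errno, absent from userspace headers */
#endif

/* Hypothetical stand-ins for ioremap()/iounmap(): they only count mappings. */
static int live_mappings;

static void *fake_ioremap(void)
{
	live_mappings++;
	return &live_mappings;
}

static void fake_iounmap(void *addr)
{
	(void)addr;
	assert(live_mappings > 0);
	live_mappings--;
}

/* Stand-in for qm_get_pools_sdqcr(): stays 0 until the simulated qman probes. */
static unsigned int pools_mask;

/* Mirrors the shape of the patch: defer late, without undoing the mappings. */
static int portal_probe_leaky(void)
{
	void *ce = fake_ioremap();
	void *ci = fake_ioremap();

	(void)ce;
	(void)ci;
	if (pools_mask == 0)
		return -EPROBE_DEFER; /* both mappings stay live */
	return 0;
}

/* Same late check, but the deferral path unwinds what was claimed. */
static int portal_probe_unwound(void)
{
	void *ce = fake_ioremap();
	void *ci = fake_ioremap();
	int err;

	if (pools_mask == 0) {
		err = -EPROBE_DEFER;
		goto err_unmap;
	}
	return 0; /* on success the mappings are kept on purpose */

err_unmap:
	fake_iounmap(ci);
	fake_iounmap(ce);
	return err;
}
```

With pools_mask left at 0, each call to portal_probe_leaky() adds two more
live mappings, while portal_probe_unwound() leaves the count unchanged.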


-Olof
Leo Li Sept. 25, 2018, 10:11 p.m. UTC | #4
On Tue, Sep 25, 2018 at 2:47 PM Olof Johansson <olof@lixom.net> wrote:
>
> Hi,
>
>
> On Thu, Aug 23, 2018 at 11:36 PM Alexandre Belloni
> <alexandre.belloni@bootlin.com> wrote:
> >
> > If the qman driver (qman_ccsr) doesn't probe or fail to probe before
> > qman_portal, qm_ccsr_start will be either NULL or a stale pointer to an
> > unmapped page.
> >
> > This leads to a crash when probing  qman_portal as the init_pcfg function
> > calls qman_liodn_fixup that tries to read qman registers.
> >
> > Assume that qman didn't probe when the pool mask is 0.
> >
> > Signed-off-by: Alexandre Belloni <alexandre.belloni@bootlin.com>
> > ---
> >  drivers/soc/fsl/qbman/qman_portal.c | 2 ++
> >  1 file changed, 2 insertions(+)
> >
> > diff --git a/drivers/soc/fsl/qbman/qman_portal.c b/drivers/soc/fsl/qbman/qman_portal.c
> > index a120002b630e..4fc80d2c8feb 100644
> > --- a/drivers/soc/fsl/qbman/qman_portal.c
> > +++ b/drivers/soc/fsl/qbman/qman_portal.c
> > @@ -277,6 +277,8 @@ static int qman_portal_probe(struct platform_device *pdev)
> >         }
> >
> >         pcfg->pools = qm_get_pools_sdqcr();
> > +       if (pcfg->pools == 0)
> > +               return -EPROBE_DEFER;
>
> This is quite late in the probe, after a bunch of resources have been claimed.
>
> Note that the ioremaps above this are doing unwinds, and you'll end up
> doing duplicate ioremaps if you come in and probe again.
>
> You should probably unwind those allocations, or move them to devm_*
> or do this check earlier in the function.

Hi Roy,

Is there a more straightforward indicator of whether qman has been
probed, so that we can check it at the beginning of the probe?

Regards,
Leo
Alexandre Belloni Sept. 26, 2018, 9:27 a.m. UTC | #5
On 25/09/2018 21:45:56+0200, Olof Johansson wrote:
> Hi,
> 
> 
> On Thu, Aug 23, 2018 at 11:36 PM Alexandre Belloni
> <alexandre.belloni@bootlin.com> wrote:
> >
> > If the qman driver (qman_ccsr) doesn't probe or fail to probe before
> > qman_portal, qm_ccsr_start will be either NULL or a stale pointer to an
> > unmapped page.
> >
> > This leads to a crash when probing  qman_portal as the init_pcfg function
> > calls qman_liodn_fixup that tries to read qman registers.
> >
> > Assume that qman didn't probe when the pool mask is 0.
> >
> > Signed-off-by: Alexandre Belloni <alexandre.belloni@bootlin.com>
> > ---
> >  drivers/soc/fsl/qbman/qman_portal.c | 2 ++
> >  1 file changed, 2 insertions(+)
> >
> > diff --git a/drivers/soc/fsl/qbman/qman_portal.c b/drivers/soc/fsl/qbman/qman_portal.c
> > index a120002b630e..4fc80d2c8feb 100644
> > --- a/drivers/soc/fsl/qbman/qman_portal.c
> > +++ b/drivers/soc/fsl/qbman/qman_portal.c
> > @@ -277,6 +277,8 @@ static int qman_portal_probe(struct platform_device *pdev)
> >         }
> >
> >         pcfg->pools = qm_get_pools_sdqcr();
> > +       if (pcfg->pools == 0)
> > +               return -EPROBE_DEFER;
> 
> This is quite late in the probe, after a bunch of resources have been claimed.
> 
> Note that the ioremaps above this are doing unwinds, and you'll end up
> doing duplicate ioremaps if you come in and probe again.
> 
> You should probably unwind those allocations, or move them to devm_*
> or do this check earlier in the function.
> 

The actual chance of that happening is quite small (this came
from a non-working DT), and I mainly wanted to avoid a crash so the
platform could still boot. I would think moving to devm_ would be the
right thing to do.
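
As a rough illustration of why devm_ helps here, the sketch below models
managed resources in userspace. It is only an assumption-laden toy:
struct fake_dev, fake_devm_ioremap(), fake_devres_release_all() and
fake_driver_probe() are hypothetical names that imitate the idea behind
devm_ioremap() and the driver core releasing managed resources when a
probe fails. None of it is the kernel implementation.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

#ifndef EPROBE_DEFER
#define EPROBE_DEFER 517 /* kernel-internal errno, absent from userspace headers */
#endif

#define MAX_RES 8

/* Hypothetical, heavily simplified model of a device with managed resources. */
struct fake_dev {
	void (*release[MAX_RES])(void *res);
	void *res[MAX_RES];
	int nres;
};

static int live_mappings; /* mappings that have not been released yet */

static void fake_unmap(void *res)
{
	(void)res;
	assert(live_mappings > 0);
	live_mappings--;
}

/* Models the idea of devm_ioremap(): map, and record how to undo it. */
static void *fake_devm_ioremap(struct fake_dev *dev)
{
	if (dev->nres == MAX_RES)
		return NULL;
	live_mappings++;
	dev->release[dev->nres] = fake_unmap;
	dev->res[dev->nres] = &live_mappings;
	dev->nres++;
	return &live_mappings;
}

/* Models the core dropping managed resources, newest first. */
static void fake_devres_release_all(struct fake_dev *dev)
{
	while (dev->nres > 0) {
		dev->nres--;
		dev->release[dev->nres](dev->res[dev->nres]);
	}
}

static unsigned int pools_mask; /* stand-in for qm_get_pools_sdqcr() */

static int portal_probe(struct fake_dev *dev)
{
	if (!fake_devm_ioremap(dev) || !fake_devm_ioremap(dev))
		return -ENOMEM;
	if (pools_mask == 0)
		return -EPROBE_DEFER; /* no explicit unwind in the driver */
	return 0;
}

/* Models the core's probe wrapper: a failed probe releases everything. */
static int fake_driver_probe(struct fake_dev *dev)
{
	int ret = portal_probe(dev);

	if (ret)
		fake_devres_release_all(dev);
	return ret;
}
```

In this model a deferred probe ends with no live mappings, and a later
retry starts from a clean state without any goto labels in the driver.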
Leo Li Sept. 26, 2018, 6:15 p.m. UTC | #6
On Wed, Sep 26, 2018 at 4:28 AM Alexandre Belloni
<alexandre.belloni@bootlin.com> wrote:
>
> On 25/09/2018 21:45:56+0200, Olof Johansson wrote:
> > Hi,
> >
> >
> > On Thu, Aug 23, 2018 at 11:36 PM Alexandre Belloni
> > <alexandre.belloni@bootlin.com> wrote:
> > >
> > > If the qman driver (qman_ccsr) doesn't probe or fail to probe before
> > > qman_portal, qm_ccsr_start will be either NULL or a stale pointer to an
> > > unmapped page.
> > >
> > > This leads to a crash when probing  qman_portal as the init_pcfg function
> > > calls qman_liodn_fixup that tries to read qman registers.
> > >
> > > Assume that qman didn't probe when the pool mask is 0.
> > >
> > > Signed-off-by: Alexandre Belloni <alexandre.belloni@bootlin.com>
> > > ---
> > >  drivers/soc/fsl/qbman/qman_portal.c | 2 ++
> > >  1 file changed, 2 insertions(+)
> > >
> > > diff --git a/drivers/soc/fsl/qbman/qman_portal.c b/drivers/soc/fsl/qbman/qman_portal.c
> > > index a120002b630e..4fc80d2c8feb 100644
> > > --- a/drivers/soc/fsl/qbman/qman_portal.c
> > > +++ b/drivers/soc/fsl/qbman/qman_portal.c
> > > @@ -277,6 +277,8 @@ static int qman_portal_probe(struct platform_device *pdev)
> > >         }
> > >
> > >         pcfg->pools = qm_get_pools_sdqcr();
> > > +       if (pcfg->pools == 0)
> > > +               return -EPROBE_DEFER;
> >
> > This is quite late in the probe, after a bunch of resources have been claimed.
> >
> > Note that the ioremaps above this are doing unwinds, and you'll end up
> > doing duplicate ioremaps if you come in and probe again.
> >
> > You should probably unwind those allocations, or move them to devm_*
> > or do this check earlier in the function.
> >
>
> The actual chance of having that happen is quite small (this was coming
> from a non working DT) and I mainly wanted to avoid a crash so the
> platform could still boot. I would think moving to devm_ would be the
> right thing to do.

Even if it is not failing with the upstream device trees, it is
still good to harden the driver against possible issues.  Moving to devm_
is definitely the right thing to do.  But I also think checking whether
qman has already probed should be the first thing to do, before starting
to allocate resources and rolling them back later.  We can probably
move the qm_get_pools_sdqcr() call to the beginning of the probe to
determine whether qman has probed, as it doesn't seem to depend on any of
the setup done before it right now.
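
A minimal userspace sketch of that ordering, under stated assumptions:
get_pools(), fake_ioremap() and portal_probe_check_first() are
hypothetical stand-ins, the nonzero mask value is arbitrary, and
EPROBE_DEFER is defined locally because userspace headers lack it. The
point is only that the deferral happens before anything is claimed, so
there is nothing to roll back.

```c
#include <assert.h>
#include <errno.h>

#ifndef EPROBE_DEFER
#define EPROBE_DEFER 517 /* kernel-internal errno, absent from userspace headers */
#endif

static int live_mappings;       /* counts simulated ioremap()s */
static unsigned int pools_mask; /* 0 until the simulated qman has probed */

/* Hypothetical stand-in for qm_get_pools_sdqcr(). */
static unsigned int get_pools(void)
{
	return pools_mask;
}

/* Hypothetical stand-in for ioremap(): it only counts the mapping. */
static void *fake_ioremap(void)
{
	live_mappings++;
	return &live_mappings;
}

static int portal_probe_check_first(void)
{
	unsigned int pools = get_pools();
	void *ce, *ci;

	/* Defer before claiming anything, so no unwind is required. */
	if (pools == 0)
		return -EPROBE_DEFER;

	ce = fake_ioremap();
	ci = fake_ioremap();
	(void)ce;
	(void)ci;
	assert(live_mappings >= 2);
	return 0;
}
```

Calling portal_probe_check_first() while pools_mask is 0 defers with no
mappings made; once pools_mask is set to a nonzero value, the retry
proceeds and maps both regions exactly once.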

Regards,
Leo
Leo Li Sept. 27, 2018, 7:24 p.m. UTC | #7
On Wed, Sep 26, 2018 at 1:15 PM Li Yang <leoyang.li@nxp.com> wrote:
>
> On Wed, Sep 26, 2018 at 4:28 AM Alexandre Belloni
> <alexandre.belloni@bootlin.com> wrote:
> >
> > On 25/09/2018 21:45:56+0200, Olof Johansson wrote:
> > > Hi,
> > >
> > >
> > > On Thu, Aug 23, 2018 at 11:36 PM Alexandre Belloni
> > > <alexandre.belloni@bootlin.com> wrote:
> > > >
> > > > If the qman driver (qman_ccsr) doesn't probe or fail to probe before
> > > > qman_portal, qm_ccsr_start will be either NULL or a stale pointer to an
> > > > unmapped page.
> > > >
> > > > This leads to a crash when probing  qman_portal as the init_pcfg function
> > > > calls qman_liodn_fixup that tries to read qman registers.
> > > >
> > > > Assume that qman didn't probe when the pool mask is 0.
> > > >
> > > > Signed-off-by: Alexandre Belloni <alexandre.belloni@bootlin.com>
> > > > ---
> > > >  drivers/soc/fsl/qbman/qman_portal.c | 2 ++
> > > >  1 file changed, 2 insertions(+)
> > > >
> > > > diff --git a/drivers/soc/fsl/qbman/qman_portal.c b/drivers/soc/fsl/qbman/qman_portal.c
> > > > index a120002b630e..4fc80d2c8feb 100644
> > > > --- a/drivers/soc/fsl/qbman/qman_portal.c
> > > > +++ b/drivers/soc/fsl/qbman/qman_portal.c
> > > > @@ -277,6 +277,8 @@ static int qman_portal_probe(struct platform_device *pdev)
> > > >         }
> > > >
> > > >         pcfg->pools = qm_get_pools_sdqcr();
> > > > +       if (pcfg->pools == 0)
> > > > +               return -EPROBE_DEFER;
> > >
> > > This is quite late in the probe, after a bunch of resources have been claimed.
> > >
> > > Note that the ioremaps above this are doing unwinds, and you'll end up
> > > doing duplicate ioremaps if you come in and probe again.
> > >
> > > You should probably unwind those allocations, or move them to devm_*
> > > or do this check earlier in the function.
> > >
> >
> > The actual chance of having that happen is quite small (this was coming
> > from a non working DT) and I mainly wanted to avoid a crash so the
> > platform could still boot. I would think moving to devm_ would be the
> > right thing to do.
>
> Even if it is not failing with the upstreamed device trees, it is
> still good to harden the driver for possible issues.  Moving to devm_
> is definitely a right thing to do.  But I also think checking if the
> qman is already probed should be the first thing to do before starting
> to allocate resources and etc and rolling back later.  Probably we can
> move the qm_get_pools_sdqcr() to the begining of the probe to
> determine if qman is probed as it doesn't seem to depend on any of the
> setups done right now.

I just found out that Laurentiu also included the following patches in his
SMMU patch series (although they are not necessarily related to SMMU), and
they fix the same problem.  I think they are more straightforward and
can deal with the case where qman failed to probe.  So we can take
these instead to fix this problem in 4.19.

https://patchwork.kernel.org/patch/10616021/
https://patchwork.kernel.org/patch/10616019/
https://patchwork.kernel.org/patch/10615971/

Regards,
Leo
Patch

diff --git a/drivers/soc/fsl/qbman/qman_portal.c b/drivers/soc/fsl/qbman/qman_portal.c
index a120002b630e..4fc80d2c8feb 100644
--- a/drivers/soc/fsl/qbman/qman_portal.c
+++ b/drivers/soc/fsl/qbman/qman_portal.c
@@ -277,6 +277,8 @@  static int qman_portal_probe(struct platform_device *pdev)
 	}
 
 	pcfg->pools = qm_get_pools_sdqcr();
+	if (pcfg->pools == 0)
+		return -EPROBE_DEFER;
 
 	spin_lock(&qman_lock);
 	cpu = cpumask_next_zero(-1, &portal_cpus);