diff mbox series

mailbox: imx: fix TXDB_V2 channel race condition

Message ID 20240524075632.1009044-1-peng.fan@oss.nxp.com (mailing list archive)
State New
Headers show
Series mailbox: imx: fix TXDB_V2 channel race condition | expand

Commit Message

Peng Fan (OSS) May 24, 2024, 7:56 a.m. UTC
From: Peng Fan <peng.fan@nxp.com>

Two TXDB_V2 channels are used between Linux and System Manager(SM).
Channel0 for normal TX, Channel 1 for notification completion.
The TXDB_V2 trigger logic is using imx_mu_xcr_rmw which uses
read/modify/update logic.

Note: clear MUB GSR BITs, the MUA side GCR BITs will also got cleared per
hardware design.
Channel0 Linux
read GCR->modify GCR->write GCR->M33 SM->read GSR----->clear GSR
                                                |-(1)-|
Channel1 Linux start in time slot(1)
read GCR->modify GCR->write GCR->M33 SM->read GSR->clear GSR
So Channel1 read GCR will read back the GCR that Channel0 wrote, because
M33 has not finish clear GSR, this means Channel1 GCR writing will
trigger Channel1 and Channel0 interrupt both which is wrong.

Channel0 will be freed(SCMI channel status set to FREE) in M33 SM when
processing the 1st Channel0 interrupt. So when 2nd interrupt trigger
(channel 0/1 trigger together), SM will see a freed Channel0, and report
protocol error.

To address the issue, not using read/modify/update logic, just use
write, because write 0 to GCR will be ignored. And after write MUA GCR,
wait the SM to clear MUB GSR by looping MUA GCR value.

Fixes: 5bfe4067d350 ("mailbox: imx: support channel type tx doorbell v2")
Reviewed-by: Ranjani Vaidyanathan <ranjani.vaidyanathan@nxp.com>
Signed-off-by: Peng Fan <peng.fan@nxp.com>
---

V1: This patch has got R-b inside NXP and could be directly applied to
    upstream linux, so I keep the R-b tag from Ranjani.

 drivers/mailbox/imx-mailbox.c | 10 +++++++++-
 1 file changed, 9 insertions(+), 1 deletion(-)

Comments

Peng Fan June 18, 2024, 3:13 p.m. UTC | #1
Hi Jassi,

> Subject: [PATCH] mailbox: imx: fix TXDB_V2 channel race condition

Ping..

Thanks,
Peng.

> 
> From: Peng Fan <peng.fan@nxp.com>
> 
> Two TXDB_V2 channels are used between Linux and System
> Manager(SM).
> Channel0 for normal TX, Channel 1 for notification completion.
> The TXDB_V2 trigger logic is using imx_mu_xcr_rmw which uses
> read/modify/update logic.
> 
> Note: clear MUB GSR BITs, the MUA side GCR BITs will also got cleared
> per hardware design.
> Channel0 Linux
> read GCR->modify GCR->write GCR->M33 SM->read GSR----->clear GSR
>                                                 |-(1)-|
> Channel1 Linux start in time slot(1)
> read GCR->modify GCR->write GCR->M33 SM->read GSR->clear GSR So
> Channel1 read GCR will read back the GCR that Channel0 wrote,
> because
> M33 has not finish clear GSR, this means Channel1 GCR writing will
> trigger Channel1 and Channel0 interrupt both which is wrong.
> 
> Channel0 will be freed(SCMI channel status set to FREE) in M33 SM
> when processing the 1st Channel0 interrupt. So when 2nd interrupt
> trigger (channel 0/1 trigger together), SM will see a freed Channel0,
> and report protocol error.
> 
> To address the issue, not using read/modify/update logic, just use write,
> because write 0 to GCR will be ignored. And after write MUA GCR, wait
> the SM to clear MUB GSR by looping MUA GCR value.
> 
> Fixes: 5bfe4067d350 ("mailbox: imx: support channel type tx doorbell
> v2")
> Reviewed-by: Ranjani Vaidyanathan <ranjani.vaidyanathan@nxp.com>
> Signed-off-by: Peng Fan <peng.fan@nxp.com>
> ---
> 
> V1: This patch has got R-b inside NXP and could be directly applied to
>     upstream linux, so I keep the R-b tag from Ranjani.
> 
>  drivers/mailbox/imx-mailbox.c | 10 +++++++++-
>  1 file changed, 9 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/mailbox/imx-mailbox.c b/drivers/mailbox/imx-
> mailbox.c index 5c1d09cad761..38abe07babdf 100644
> --- a/drivers/mailbox/imx-mailbox.c
> +++ b/drivers/mailbox/imx-mailbox.c
> @@ -224,6 +224,8 @@ static int imx_mu_generic_tx(struct
> imx_mu_priv *priv,
>  			     void *data)
>  {
>  	u32 *arg = data;
> +	u32 val;
> +	int ret;
> 
>  	switch (cp->type) {
>  	case IMX_MU_TYPE_TX:
> @@ -235,7 +237,13 @@ static int imx_mu_generic_tx(struct
> imx_mu_priv *priv,
>  		tasklet_schedule(&cp->txdb_tasklet);
>  		break;
>  	case IMX_MU_TYPE_TXDB_V2:
> -		imx_mu_xcr_rmw(priv, IMX_MU_GCR,
> IMX_MU_xCR_GIRn(priv->dcfg->type, cp->idx), 0);
> +		imx_mu_write(priv, IMX_MU_xCR_GIRn(priv->dcfg-
> >type, cp->idx),
> +			     priv->dcfg->xCR[IMX_MU_GCR]);
> +		ret = readl_poll_timeout(priv->base + priv->dcfg-
> >xCR[IMX_MU_GCR], val,
> +					 !(val &
> IMX_MU_xCR_GIRn(priv->dcfg->type, cp->idx)),
> +					 0, 1000);
> +		if (ret)
> +			dev_warn_ratelimited(priv->dev, "channel
> type: %d failure\n",
> +cp->type);
>  		break;
>  	default:
>  		dev_warn_ratelimited(priv->dev, "Send data on wrong
> channel type: %d\n", cp->type);
> --
> 2.37.1
Peng Fan July 5, 2024, 9:34 a.m. UTC | #2
Hi Jassi,

> Subject: RE: [PATCH] mailbox: imx: fix TXDB_V2 channel race condition
> 
> Hi Jassi,
> 
> > Subject: [PATCH] mailbox: imx: fix TXDB_V2 channel race condition
> 
> Ping..

Ping again..

Thanks,
Peng.

> 
> Thanks,
> Peng.
> 
> >
> > From: Peng Fan <peng.fan@nxp.com>
> >
> > Two TXDB_V2 channels are used between Linux and System
> Manager(SM).
> > Channel0 for normal TX, Channel 1 for notification completion.
> > The TXDB_V2 trigger logic is using imx_mu_xcr_rmw which uses
> > read/modify/update logic.
> >
> > Note: clear MUB GSR BITs, the MUA side GCR BITs will also got
> cleared
> > per hardware design.
> > Channel0 Linux
> > read GCR->modify GCR->write GCR->M33 SM->read GSR----->clear
> GSR
> >                                                 |-(1)-|
> > Channel1 Linux start in time slot(1)
> > read GCR->modify GCR->write GCR->M33 SM->read GSR->clear GSR
> So
> > Channel1 read GCR will read back the GCR that Channel0 wrote,
> because
> > M33 has not finish clear GSR, this means Channel1 GCR writing will
> > trigger Channel1 and Channel0 interrupt both which is wrong.
> >
> > Channel0 will be freed(SCMI channel status set to FREE) in M33 SM
> when
> > processing the 1st Channel0 interrupt. So when 2nd interrupt trigger
> > (channel 0/1 trigger together), SM will see a freed Channel0, and
> > report protocol error.
> >
> > To address the issue, not using read/modify/update logic, just use
> > write, because write 0 to GCR will be ignored. And after write MUA
> > GCR, wait the SM to clear MUB GSR by looping MUA GCR value.
> >
> > Fixes: 5bfe4067d350 ("mailbox: imx: support channel type tx
> doorbell
> > v2")
> > Reviewed-by: Ranjani Vaidyanathan <ranjani.vaidyanathan@nxp.com>
> > Signed-off-by: Peng Fan <peng.fan@nxp.com>
> > ---
> >
> > V1: This patch has got R-b inside NXP and could be directly applied to
> >     upstream linux, so I keep the R-b tag from Ranjani.
> >
> >  drivers/mailbox/imx-mailbox.c | 10 +++++++++-
> >  1 file changed, 9 insertions(+), 1 deletion(-)
> >
> > diff --git a/drivers/mailbox/imx-mailbox.c b/drivers/mailbox/imx-
> > mailbox.c index 5c1d09cad761..38abe07babdf 100644
> > --- a/drivers/mailbox/imx-mailbox.c
> > +++ b/drivers/mailbox/imx-mailbox.c
> > @@ -224,6 +224,8 @@ static int imx_mu_generic_tx(struct
> imx_mu_priv
> > *priv,
> >  			     void *data)
> >  {
> >  	u32 *arg = data;
> > +	u32 val;
> > +	int ret;
> >
> >  	switch (cp->type) {
> >  	case IMX_MU_TYPE_TX:
> > @@ -235,7 +237,13 @@ static int imx_mu_generic_tx(struct
> imx_mu_priv
> > *priv,
> >  		tasklet_schedule(&cp->txdb_tasklet);
> >  		break;
> >  	case IMX_MU_TYPE_TXDB_V2:
> > -		imx_mu_xcr_rmw(priv, IMX_MU_GCR,
> > IMX_MU_xCR_GIRn(priv->dcfg->type, cp->idx), 0);
> > +		imx_mu_write(priv, IMX_MU_xCR_GIRn(priv->dcfg-
> > >type, cp->idx),
> > +			     priv->dcfg->xCR[IMX_MU_GCR]);
> > +		ret = readl_poll_timeout(priv->base + priv->dcfg-
> > >xCR[IMX_MU_GCR], val,
> > +					 !(val &
> > IMX_MU_xCR_GIRn(priv->dcfg->type, cp->idx)),
> > +					 0, 1000);
> > +		if (ret)
> > +			dev_warn_ratelimited(priv->dev, "channel
> > type: %d failure\n",
> > +cp->type);
> >  		break;
> >  	default:
> >  		dev_warn_ratelimited(priv->dev, "Send data on wrong
> channel type:
> > %d\n", cp->type);
> > --
> > 2.37.1
>
diff mbox series

Patch

diff --git a/drivers/mailbox/imx-mailbox.c b/drivers/mailbox/imx-mailbox.c
index 5c1d09cad761..38abe07babdf 100644
--- a/drivers/mailbox/imx-mailbox.c
+++ b/drivers/mailbox/imx-mailbox.c
@@ -224,6 +224,8 @@  static int imx_mu_generic_tx(struct imx_mu_priv *priv,
 			     void *data)
 {
 	u32 *arg = data;
+	u32 val;
+	int ret;
 
 	switch (cp->type) {
 	case IMX_MU_TYPE_TX:
@@ -235,7 +237,13 @@  static int imx_mu_generic_tx(struct imx_mu_priv *priv,
 		tasklet_schedule(&cp->txdb_tasklet);
 		break;
 	case IMX_MU_TYPE_TXDB_V2:
-		imx_mu_xcr_rmw(priv, IMX_MU_GCR, IMX_MU_xCR_GIRn(priv->dcfg->type, cp->idx), 0);
+		imx_mu_write(priv, IMX_MU_xCR_GIRn(priv->dcfg->type, cp->idx),
+			     priv->dcfg->xCR[IMX_MU_GCR]);
+		ret = readl_poll_timeout(priv->base + priv->dcfg->xCR[IMX_MU_GCR], val,
+					 !(val & IMX_MU_xCR_GIRn(priv->dcfg->type, cp->idx)),
+					 0, 1000);
+		if (ret)
+			dev_warn_ratelimited(priv->dev, "channel type: %d failure\n", cp->type);
 		break;
 	default:
 		dev_warn_ratelimited(priv->dev, "Send data on wrong channel type: %d\n", cp->type);