[1/2] bus: mhi: host: Remove duplicate ee check for syserr

Message ID 1674597444-24543-2-git-send-email-quic_jhugo@quicinc.com (mailing list archive)
State Superseded
Series MHI host syserr fixes

Commit Message

Jeffrey Hugo Jan. 24, 2023, 9:57 p.m. UTC
If we detect a system error via intvec, we only process the syserr if the
current ee is different than the last observed ee.  The reason for this
check is to prevent bhie from running multiple times, but with the single
queue handling syserr, that is not possible.

The check can cause an issue with device recovery.  If PBL loads a bad SBL
via BHI, but that SBL hangs before notifying the host of an ee change,
then issuing soc_reset to crash the device and retry (after supplying a
fixed SBL) will not recover the device as the host will observe a PBL->PBL
transition and not process the syserr.  The device will be stuck until
either the driver is reloaded, or the host is rebooted.  Instead, remove
the check so that we can attempt to recover the device.

Signed-off-by: Jeffrey Hugo <quic_jhugo@quicinc.com>
Reviewed-by: Carl Vanderlip <quic_carlv@quicinc.com>
---
 drivers/bus/mhi/host/main.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)
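
Editorial note (not part of the commit): the effect of dropping the ee
comparison can be sketched in isolation. The snippet below is a simplified
model, not the driver code. The enum names and the two helper functions are
hypothetical stand-ins for MHI_PM_SYS_ERR_DETECT, the ee values, and the
condition in mhi_intvec_threaded_handler(); only the boolean logic is taken
from the one-line change in this patch.

```c
#include <stdbool.h>

/* Hypothetical, reduced stand-ins for the driver's PM state and ee enums. */
enum pm_state { PM_M0, PM_SYS_ERR_DETECT };
enum exec_env { EE_PBL, EE_SBL, EE_AMSS };

/*
 * Condition before the patch: the handler bails out ("goto exit_intvec")
 * when the PM state is not SYS_ERR_DETECT or when the ee is unchanged.
 * Returns true when the syserr would be processed.
 */
bool processes_syserr_old(enum pm_state pm, enum exec_env ee,
			  enum exec_env last_ee)
{
	return !(pm != PM_SYS_ERR_DETECT || ee == last_ee);
}

/*
 * Condition after the patch: only the PM state decides, so a PBL->PBL
 * observation no longer suppresses syserr processing.
 */
bool processes_syserr_new(enum pm_state pm)
{
	return pm == PM_SYS_ERR_DETECT;
}
```

In the recovery scenario from the commit message, the host last saw PBL and
sees PBL again after soc_reset. Under this model the old condition skips the
syserr for (PM_SYS_ERR_DETECT, EE_PBL, EE_PBL), while the new one processes
it.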

Comments

Manivannan Sadhasivam April 3, 2023, 5:37 a.m. UTC | #1
On Tue, Jan 24, 2023 at 02:57:23PM -0700, Jeffrey Hugo wrote:
> If we detect a system error via intvec, we only process the syserr if the
> current ee is different than the last observed ee.  The reason for this
> check is to prevent bhie from running multiple times, but with the single
> queue handling syserr, that is not possible.
> 
> The check can cause an issue with device recovery.  If PBL loads a bad SBL
> via BHI, but that SBL hangs before notifying the host of an ee change,
> then issuing soc_reset to crash the device and retry (after supplying a
> fixed SBL) will not recover the device as the host will observe a PBL->PBL
> transition and not process the syserr.  The device will be stuck until
> either the driver is reloaded, or the host is rebooted.  Instead, remove
> the check so that we can attempt to recover the device.
> 
> Signed-off-by: Jeffrey Hugo <quic_jhugo@quicinc.com>

Reviewed-by: Manivannan Sadhasivam <mani@kernel.org>

- Mani

> Reviewed-by: Carl Vanderlip <quic_carlv@quicinc.com>
> ---
>  drivers/bus/mhi/host/main.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/drivers/bus/mhi/host/main.c b/drivers/bus/mhi/host/main.c
> index df0fbfe..0c3a009 100644
> --- a/drivers/bus/mhi/host/main.c
> +++ b/drivers/bus/mhi/host/main.c
> @@ -503,7 +503,7 @@ irqreturn_t mhi_intvec_threaded_handler(int irq_number, void *priv)
>  	}
>  	write_unlock_irq(&mhi_cntrl->pm_lock);
>  
> -	if (pm_state != MHI_PM_SYS_ERR_DETECT || ee == mhi_cntrl->ee)
> +	if (pm_state != MHI_PM_SYS_ERR_DETECT)
>  		goto exit_intvec;
>  
>  	switch (ee) {
> -- 
> 2.7.4
>
Manivannan Sadhasivam April 3, 2023, 5:45 a.m. UTC | #2
On Mon, Apr 03, 2023 at 11:07:35AM +0530, Manivannan Sadhasivam wrote:
> On Tue, Jan 24, 2023 at 02:57:23PM -0700, Jeffrey Hugo wrote:
> > If we detect a system error via intvec, we only process the syserr if the
> > current ee is different than the last observed ee.  The reason for this
> > check is to prevent bhie from running multiple times, but with the single
> > queue handling syserr, that is not possible.
> > 
> > The check can cause an issue with device recovery.  If PBL loads a bad SBL
> > via BHI, but that SBL hangs before notifying the host of an ee change,
> > then issuing soc_reset to crash the device and retry (after supplying a
> > fixed SBL) will not recover the device as the host will observe a PBL->PBL
> > transition and not process the syserr.  The device will be stuck until
> > either the driver is reloaded, or the host is rebooted.  Instead, remove
> > the check so that we can attempt to recover the device.
> > 
> > Signed-off-by: Jeffrey Hugo <quic_jhugo@quicinc.com>
> 
> Reviewed-by: Manivannan Sadhasivam <mani@kernel.org>
> 

Forgot to add: this patch also needs a Fixes tag and should be backported.

- Mani

> - Mani
> 
> > Reviewed-by: Carl Vanderlip <quic_carlv@quicinc.com>
> > ---
> >  drivers/bus/mhi/host/main.c | 2 +-
> >  1 file changed, 1 insertion(+), 1 deletion(-)
> > 
> > diff --git a/drivers/bus/mhi/host/main.c b/drivers/bus/mhi/host/main.c
> > index df0fbfe..0c3a009 100644
> > --- a/drivers/bus/mhi/host/main.c
> > +++ b/drivers/bus/mhi/host/main.c
> > @@ -503,7 +503,7 @@ irqreturn_t mhi_intvec_threaded_handler(int irq_number, void *priv)
> >  	}
> >  	write_unlock_irq(&mhi_cntrl->pm_lock);
> >  
> > -	if (pm_state != MHI_PM_SYS_ERR_DETECT || ee == mhi_cntrl->ee)
> > +	if (pm_state != MHI_PM_SYS_ERR_DETECT)
> >  		goto exit_intvec;
> >  
> >  	switch (ee) {
> > -- 
> > 2.7.4
> > 
> 
> -- 
> மணிவண்ணன் சதாசிவம்
Patch

diff --git a/drivers/bus/mhi/host/main.c b/drivers/bus/mhi/host/main.c
index df0fbfe..0c3a009 100644
--- a/drivers/bus/mhi/host/main.c
+++ b/drivers/bus/mhi/host/main.c
@@ -503,7 +503,7 @@  irqreturn_t mhi_intvec_threaded_handler(int irq_number, void *priv)
 	}
 	write_unlock_irq(&mhi_cntrl->pm_lock);
 
-	if (pm_state != MHI_PM_SYS_ERR_DETECT || ee == mhi_cntrl->ee)
+	if (pm_state != MHI_PM_SYS_ERR_DETECT)
 		goto exit_intvec;
 
 	switch (ee) {