
[v2,RESEND] bus: mhi: core: Wait for ready state after reset

Message ID 1615408918-7242-1-git-send-email-jhugo@codeaurora.org (mailing list archive)
State Not Applicable, archived
Series [v2,RESEND] bus: mhi: core: Wait for ready state after reset

Commit Message

Jeffrey Hugo March 10, 2021, 8:41 p.m. UTC
After the device has signaled the end of reset by clearing the reset bit,
it will automatically reinit MHI and the internal device structures.  Once
that is done, the device will signal it has entered the ready state.

Signaling the ready state involves sending an interrupt (MSI) to the host,
which might cause IOMMU faults if it occurs at the wrong time.

If the controller is being powered down, and possibly removed, then the
reset flow would only wait for the end of reset, at which point the host
and device would start a race.  The host may complete its reset work and
remove the interrupt handler, which would cause the interrupt to be
disabled in the IOMMU.  If that occurs before the device signals the ready
state, then the IOMMU will fault since it blocked an interrupt.  While
harmless, the fault would look like a serious issue has occurred, so let's
silence it by making sure the device hits the ready state before the host
completes its reset processing.

Signed-off-by: Jeffrey Hugo <jhugo@codeaurora.org>
---
 drivers/bus/mhi/core/pm.c | 17 ++++++++++++++++-
 1 file changed, 16 insertions(+), 1 deletion(-)
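
The race described in the commit message can be sketched with a small
userspace model.  This is only an illustration under stated assumptions:
struct fake_dev, its tick counter, and teardown_races() are hypothetical
names and are not part of the MHI core.  The model assumes the READY
interrupt fires at tick ready_at and that the host removes its handler as
soon as its waits finish.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical device model: time is a tick counter, not real jiffies. */
struct fake_dev {
	unsigned int now;            /* current tick */
	unsigned int reset_clear_at; /* tick at which the reset bit clears */
	unsigned int ready_at;       /* tick at which READY is signaled (MSI) */
};

static bool dev_reset_done(const struct fake_dev *d)
{
	return d->now >= d->reset_clear_at;
}

static bool dev_ready(const struct fake_dev *d)
{
	return d->now >= d->ready_at;
}

/* Advance time until cond() holds or the tick budget runs out. */
static bool poll_until(struct fake_dev *d,
		       bool (*cond)(const struct fake_dev *),
		       unsigned int budget)
{
	while (!cond(d)) {
		if (budget-- == 0)
			return false; /* timed out */
		d->now++;
	}
	return true;
}

/*
 * Returns true if the READY interrupt would arrive after the host has
 * removed its handler, which is the case that shows up as an IOMMU fault.
 */
static bool teardown_races(struct fake_dev d, bool wait_for_ready,
			   unsigned int budget)
{
	unsigned int handler_removed_at;

	assert(d.reset_clear_at <= d.ready_at);

	poll_until(&d, dev_reset_done, budget);     /* old flow stops here */
	if (wait_for_ready)
		poll_until(&d, dev_ready, budget);  /* extra wait from this patch */

	handler_removed_at = d.now;
	return d.ready_at > handler_removed_at;
}
```

In this model, a device that clears reset at tick 3 and signals READY at
tick 5 loses the race when the host waits only for the end of reset, and
does not when the host also waits for READY.  If the device takes longer
than the budget to reach READY, the race is still possible, which is the
case the new warning reports.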

Comments

Hemant Kumar March 10, 2021, 8:56 p.m. UTC | #1
On 3/10/21 12:41 PM, Jeffrey Hugo wrote:
> After the device has signaled the end of reset by clearing the reset bit,
> it will automatically reinit MHI and the internal device structures.  Once
> That is done, the device will signal it has entered the ready state.
> 
> Signaling the ready state involves sending an interrupt (MSI) to the host
> which might cause IOMMU faults if it occurs at the wrong time.
> 
> If the controller is being powered down, and possibly removed, then the
> reset flow would only wait for the end of reset.  At which point, the host
> and device would start a race.  The host may complete its reset work, and
> remove the interrupt handler, which would cause the interrupt to be
> disabled in the IOMMU.  If that occurs before the device signals the ready
> state, then the IOMMU will fault since it blocked an interrupt.  While
> harmless, the fault would appear like a serious issue has occurred so let's
> silence it by making sure the device hits the ready state before the host
> completes its reset processing.
> 
> Signed-off-by: Jeffrey Hugo <jhugo@codeaurora.org>

Reviewed-by: Hemant Kumar <hemantk@codeaurora.org>
Manivannan Sadhasivam March 16, 2021, 6:14 a.m. UTC | #2
On Wed, Mar 10, 2021 at 01:41:58PM -0700, Jeffrey Hugo wrote:
> After the device has signaled the end of reset by clearing the reset bit,
> it will automatically reinit MHI and the internal device structures.  Once
> That is done, the device will signal it has entered the ready state.
> 
> Signaling the ready state involves sending an interrupt (MSI) to the host
> which might cause IOMMU faults if it occurs at the wrong time.
> 
> If the controller is being powered down, and possibly removed, then the
> reset flow would only wait for the end of reset.  At which point, the host
> and device would start a race.  The host may complete its reset work, and
> remove the interrupt handler, which would cause the interrupt to be
> disabled in the IOMMU.  If that occurs before the device signals the ready
> state, then the IOMMU will fault since it blocked an interrupt.  While
> harmless, the fault would appear like a serious issue has occurred so let's
> silence it by making sure the device hits the ready state before the host
> completes its reset processing.
> 
> Signed-off-by: Jeffrey Hugo <jhugo@codeaurora.org>
> ---
>  drivers/bus/mhi/core/pm.c | 17 ++++++++++++++++-
>  1 file changed, 16 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/bus/mhi/core/pm.c b/drivers/bus/mhi/core/pm.c
> index adb0e80..414da4f 100644
> --- a/drivers/bus/mhi/core/pm.c
> +++ b/drivers/bus/mhi/core/pm.c
> @@ -467,7 +467,7 @@ static void mhi_pm_disable_transition(struct mhi_controller *mhi_cntrl)
>  
>  	/* Trigger MHI RESET so that the device will not access host memory */
>  	if (!MHI_PM_IN_FATAL_STATE(mhi_cntrl->pm_state)) {
> -		u32 in_reset = -1;
> +		u32 in_reset = -1, ready = 0;
>  		unsigned long timeout = msecs_to_jiffies(mhi_cntrl->timeout_ms);
>  
>  		dev_dbg(dev, "Triggering MHI Reset in device\n");
> @@ -490,6 +490,21 @@ static void mhi_pm_disable_transition(struct mhi_controller *mhi_cntrl)
>  		 * hence re-program it
>  		 */
>  		mhi_write_reg(mhi_cntrl, mhi_cntrl->bhi, BHI_INTVEC, 0);
> +
> +		if (!MHI_IN_PBL(mhi_get_exec_env(mhi_cntrl))) {
> +			/* wait for ready to be set */
> +			ret = wait_event_timeout(mhi_cntrl->state_event,
> +						 mhi_read_reg_field(mhi_cntrl,
> +							mhi_cntrl->regs,
> +							MHISTATUS,
> +							MHISTATUS_READY_MASK,
> +							MHISTATUS_READY_SHIFT,
> +							&ready)
> +						 || ready, timeout);
> +			if (!ret || !ready)
> +				dev_warn(dev,
> +					"Device failed to enter READY state\n");

Wouldn't dev_err be more appropriate here, given that we might get an IOMMU
fault soon afterwards?

Thanks,
Mani

> +		}
>  	}
>  
>  	dev_dbg(dev,
> -- 
> Qualcomm Technologies, Inc. is a member of the
> Code Aurora Forum, a Linux Foundation Collaborative Project.
>
Jeffrey Hugo March 16, 2021, 7:28 p.m. UTC | #3
On 3/16/2021 12:14 AM, Manivannan Sadhasivam wrote:
> On Wed, Mar 10, 2021 at 01:41:58PM -0700, Jeffrey Hugo wrote:
>> After the device has signaled the end of reset by clearing the reset bit,
>> it will automatically reinit MHI and the internal device structures.  Once
>> That is done, the device will signal it has entered the ready state.
>>
>> Signaling the ready state involves sending an interrupt (MSI) to the host
>> which might cause IOMMU faults if it occurs at the wrong time.
>>
>> If the controller is being powered down, and possibly removed, then the
>> reset flow would only wait for the end of reset.  At which point, the host
>> and device would start a race.  The host may complete its reset work, and
>> remove the interrupt handler, which would cause the interrupt to be
>> disabled in the IOMMU.  If that occurs before the device signals the ready
>> state, then the IOMMU will fault since it blocked an interrupt.  While
>> harmless, the fault would appear like a serious issue has occurred so let's
>> silence it by making sure the device hits the ready state before the host
>> completes its reset processing.
>>
>> Signed-off-by: Jeffrey Hugo <jhugo@codeaurora.org>
>> ---
>>   drivers/bus/mhi/core/pm.c | 17 ++++++++++++++++-
>>   1 file changed, 16 insertions(+), 1 deletion(-)
>>
>> diff --git a/drivers/bus/mhi/core/pm.c b/drivers/bus/mhi/core/pm.c
>> index adb0e80..414da4f 100644
>> --- a/drivers/bus/mhi/core/pm.c
>> +++ b/drivers/bus/mhi/core/pm.c
>> @@ -467,7 +467,7 @@ static void mhi_pm_disable_transition(struct mhi_controller *mhi_cntrl)
>>   
>>   	/* Trigger MHI RESET so that the device will not access host memory */
>>   	if (!MHI_PM_IN_FATAL_STATE(mhi_cntrl->pm_state)) {
>> -		u32 in_reset = -1;
>> +		u32 in_reset = -1, ready = 0;
>>   		unsigned long timeout = msecs_to_jiffies(mhi_cntrl->timeout_ms);
>>   
>>   		dev_dbg(dev, "Triggering MHI Reset in device\n");
>> @@ -490,6 +490,21 @@ static void mhi_pm_disable_transition(struct mhi_controller *mhi_cntrl)
>>   		 * hence re-program it
>>   		 */
>>   		mhi_write_reg(mhi_cntrl, mhi_cntrl->bhi, BHI_INTVEC, 0);
>> +
>> +		if (!MHI_IN_PBL(mhi_get_exec_env(mhi_cntrl))) {
>> +			/* wait for ready to be set */
>> +			ret = wait_event_timeout(mhi_cntrl->state_event,
>> +						 mhi_read_reg_field(mhi_cntrl,
>> +							mhi_cntrl->regs,
>> +							MHISTATUS,
>> +							MHISTATUS_READY_MASK,
>> +							MHISTATUS_READY_SHIFT,
>> +							&ready)
>> +						 || ready, timeout);
>> +			if (!ret || !ready)
>> +				dev_warn(dev,
>> +					"Device failed to enter READY state\n");
> 
> Wouldn't dev_err be more appropriate here provided that we might get IOMMU fault
> anytime soon?

I suppose.  Didn't feel like a "true" error because nothing has 
actually failed, the chance of the IOMMU fault is low, and I couldn't 
enumerate what would be the expected action for the system user to take 
if they saw this as an error.

I don't have a particularly strong opinion one way or the other.  I 
figured warn was the more conservative option here.

Will change.

Patch

diff --git a/drivers/bus/mhi/core/pm.c b/drivers/bus/mhi/core/pm.c
index adb0e80..414da4f 100644
--- a/drivers/bus/mhi/core/pm.c
+++ b/drivers/bus/mhi/core/pm.c
@@ -467,7 +467,7 @@  static void mhi_pm_disable_transition(struct mhi_controller *mhi_cntrl)
 
 	/* Trigger MHI RESET so that the device will not access host memory */
 	if (!MHI_PM_IN_FATAL_STATE(mhi_cntrl->pm_state)) {
-		u32 in_reset = -1;
+		u32 in_reset = -1, ready = 0;
 		unsigned long timeout = msecs_to_jiffies(mhi_cntrl->timeout_ms);
 
 		dev_dbg(dev, "Triggering MHI Reset in device\n");
@@ -490,6 +490,21 @@  static void mhi_pm_disable_transition(struct mhi_controller *mhi_cntrl)
 		 * hence re-program it
 		 */
 		mhi_write_reg(mhi_cntrl, mhi_cntrl->bhi, BHI_INTVEC, 0);
+
+		if (!MHI_IN_PBL(mhi_get_exec_env(mhi_cntrl))) {
+			/* wait for ready to be set */
+			ret = wait_event_timeout(mhi_cntrl->state_event,
+						 mhi_read_reg_field(mhi_cntrl,
+							mhi_cntrl->regs,
+							MHISTATUS,
+							MHISTATUS_READY_MASK,
+							MHISTATUS_READY_SHIFT,
+							&ready)
+						 || ready, timeout);
+			if (!ret || !ready)
+				dev_warn(dev,
+					"Device failed to enter READY state\n");
+		}
 	}
 
 	dev_dbg(dev,
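
The wait condition and the follow-up check in the hunk above can be read
more easily in isolation.  The sketch below is a userspace stand-in, not
the kernel code: the demo_* names and the DEMO_READY_* values are
hypothetical, and it assumes the register read helper returns 0 on success
and nonzero on failure, as the "|| ready" form of the condition suggests.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Illustrative field layout only; not the real MHISTATUS definition. */
#define DEMO_READY_MASK  0x4u
#define DEMO_READY_SHIFT 2

/*
 * Stand-in for a register field read: returns 0 on success and stores the
 * extracted field in *out, or returns nonzero if the read itself failed.
 */
static int demo_read_reg_field(const uint32_t *reg, uint32_t mask,
			       unsigned int shift, uint32_t *out)
{
	assert(out != NULL);

	if (reg == NULL)
		return -1; /* models a failed bus read */

	*out = (*reg & mask) >> shift;
	return 0;
}

/*
 * Same shape as the condition passed to wait_event_timeout() in the patch:
 * the wait ends when the read fails OR when the ready field is set.
 */
static int demo_wait_cond(const uint32_t *reg, uint32_t *ready)
{
	return demo_read_reg_field(reg, DEMO_READY_MASK, DEMO_READY_SHIFT,
				   ready) || *ready;
}

/*
 * Same shape as the check after the wait.  ret is 0 on timeout.  A nonzero
 * ret alone is not enough, because the wait also ends on a read failure,
 * in which case ready is still 0.
 */
static int demo_failed_to_reach_ready(long ret, uint32_t ready)
{
	return !ret || !ready;
}
```

This is why the patch tests both values: "!ret" covers the timeout, and
"!ready" covers a wait that ended early because the register read failed.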