[v2,1/5] bus: mhi: core: Handle syserr during power_up

Message ID	1586278230-29565-2-git-send-email-jhugo@codeaurora.org (mailing list archive)
State	Superseded
Headers	show Return-Path: <SRS0=udBK=5X=vger.kernel.org=linux-arm-msm-owner@kernel.org> In-Reply-To: Message-Id: Date: Subject: Cc: To: From: Sender; bh=GwM28tXHyTVLLRgeS3lHKUJcxxN/qgERFpLmSBazGwI=; b=uqt7nVfa3X5GmSX0puQ5f55zE/oGLm+yGZTknjS2mMP/Q5TAm6gk1jbUcc6tZYxhZzlUh4Tj 6UlVC91m6eO7/DKrYuQLmb78NfjOI3/aqynqeOYSPYTRwCg8AIrh7SgdsfMdATcH2V+K5Q+n mze0LFWYyvAkRXLzUhaovmPoKJU= sender: jhugo) by smtp.codeaurora.org (Postfix) with ESMTPSA id C9EB2C433F2; Tue, 7 Apr 2020 16:51:24 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 smtp.codeaurora.org C9EB2C433F2 From: Jeffrey Hugo <jhugo@codeaurora.org> To: manivannan.sadhasivam@linaro.org, hemantk@codeaurora.org Cc: linux-arm-msm@vger.kernel.org, linux-kernel@vger.kernel.org, Jeffrey Hugo <jhugo@codeaurora.org> Subject: [PATCH v2 1/5] bus: mhi: core: Handle syserr during power_up Date: Tue, 7 Apr 2020 10:50:26 -0600 Message-Id: <1586278230-29565-2-git-send-email-jhugo@codeaurora.org> In-Reply-To: <1586278230-29565-1-git-send-email-jhugo@codeaurora.org> References: <1586278230-29565-1-git-send-email-jhugo@codeaurora.org> Sender: linux-arm-msm-owner@vger.kernel.org Precedence: bulk
Series	Misc MHI fixes \| expand [v2,0/5] Misc MHI fixes [v2,1/5] bus: mhi: core: Handle syserr during power_up [v2,2/5] bus: mhi: core: Make sure to powerdown if mhi_sync_power_up fails [v2,3/5] bus: mhi: core: Remove link_status() callback [v2,4/5] bus: mhi: core: Offload register accesses to the controller [v2,5/5] bus: mhi: core: Fix typo in comment

Jeffrey Hugo April 7, 2020, 4:50 p.m. UTC

The MHI device may be in the syserr state when we attempt to init it in
power_up().  Since we have no local state, the handling is simple -
reset the device and wait for it to transition out of the reset state.

Signed-off-by: Jeffrey Hugo <jhugo@codeaurora.org>
---
 drivers/bus/mhi/core/pm.c | 20 ++++++++++++++++++++
 1 file changed, 20 insertions(+)

Hemant Kumar April 10, 2020, 12:55 a.m. UTC | #1

On 4/7/20 9:50 AM, Jeffrey Hugo wrote:
> The MHI device may be in the syserr state when we attempt to init it in
> power_up().  Since we have no local state, the handling is simple -
> reset the device and wait for it to transition out of the reset state.
>
> Signed-off-by: Jeffrey Hugo <jhugo@codeaurora.org>
> ---
>   drivers/bus/mhi/core/pm.c | 20 ++++++++++++++++++++
>   1 file changed, 20 insertions(+)
>
> diff --git a/drivers/bus/mhi/core/pm.c b/drivers/bus/mhi/core/pm.c
> index 52690cb..3285c9e 100644
> --- a/drivers/bus/mhi/core/pm.c
> +++ b/drivers/bus/mhi/core/pm.c
> @@ -9,6 +9,7 @@
>   #include <linux/dma-direction.h>
>   #include <linux/dma-mapping.h>
>   #include <linux/interrupt.h>
> +#include <linux/iopoll.h>
>   #include <linux/list.h>
>   #include <linux/mhi.h>
>   #include <linux/module.h>
> @@ -760,6 +761,7 @@ static void mhi_deassert_dev_wake(struct mhi_controller *mhi_cntrl,
>   
>   int mhi_async_power_up(struct mhi_controller *mhi_cntrl)
>   {
> +	enum mhi_state state;
>   	enum mhi_ee_type current_ee;
>   	enum dev_st_transition next_state;
>   	struct device *dev = &mhi_cntrl->mhi_dev->dev;
> @@ -829,6 +831,24 @@ int mhi_async_power_up(struct mhi_controller *mhi_cntrl)
>   		goto error_bhi_offset;
>   	}
>   
> +	state = mhi_get_mhi_state(mhi_cntrl);
> +	if (state == MHI_STATE_SYS_ERR) {
> +		mhi_set_mhi_state(mhi_cntrl, MHI_STATE_RESET);
> +		ret = readl_poll_timeout(mhi_cntrl->regs + MHICTRL, val,
> +					 !(val & MHICTRL_RESET_MASK), 1000,
> +					 mhi_cntrl->timeout_ms * 1000);
can we use this instead of polling because MSI is configures and int_vec 
handler is registered

     wait_event_timeout(mhi_cntrl->state_event,
                MHI_PM_IN_FATAL_STATE(mhi_cntrl->pm_state) ||
               mhi_read_reg_field(mhi_cntrl, base, MHICTRL,
                           MHICTRL_RESET_MASK,
                           MHICTRL_RESET_SHIFT, &reset) || !reset ,
                msecs_to_jiffies(mhi_cntrl->timeout_ms));

1) In case of MHI_PM_IN_FATAL_STATE we would not be accessing MHI reg
2) Consistent with current MHI driver code.
> +		if (ret) {
> +			dev_info(dev, "Failed to reset MHI due to syserr state\n");
> +			goto error_bhi_offset;
> +		}
> +
> +		/*
> +		 * device cleares INTVEC as part of RESET processing,
> +		 * re-program it
> +		 */
> +		mhi_write_reg(mhi_cntrl, mhi_cntrl->bhi, BHI_INTVEC, 0);
> +	}
> +
>   	/* Transition to next state */
>   	next_state = MHI_IN_PBL(current_ee) ?
>   		DEV_ST_TRANSITION_PBL : DEV_ST_TRANSITION_READY;

Jeffrey Hugo April 10, 2020, 3:03 p.m. UTC | #2

On 4/9/2020 6:55 PM, Hemant Kumar wrote:
> 
> On 4/7/20 9:50 AM, Jeffrey Hugo wrote:
>> The MHI device may be in the syserr state when we attempt to init it in
>> power_up().  Since we have no local state, the handling is simple -
>> reset the device and wait for it to transition out of the reset state.
>>
>> Signed-off-by: Jeffrey Hugo <jhugo@codeaurora.org>
>> ---
>>   drivers/bus/mhi/core/pm.c | 20 ++++++++++++++++++++
>>   1 file changed, 20 insertions(+)
>>
>> diff --git a/drivers/bus/mhi/core/pm.c b/drivers/bus/mhi/core/pm.c
>> index 52690cb..3285c9e 100644
>> --- a/drivers/bus/mhi/core/pm.c
>> +++ b/drivers/bus/mhi/core/pm.c
>> @@ -9,6 +9,7 @@
>>   #include <linux/dma-direction.h>
>>   #include <linux/dma-mapping.h>
>>   #include <linux/interrupt.h>
>> +#include <linux/iopoll.h>
>>   #include <linux/list.h>
>>   #include <linux/mhi.h>
>>   #include <linux/module.h>
>> @@ -760,6 +761,7 @@ static void mhi_deassert_dev_wake(struct 
>> mhi_controller *mhi_cntrl,
>>   int mhi_async_power_up(struct mhi_controller *mhi_cntrl)
>>   {
>> +    enum mhi_state state;
>>       enum mhi_ee_type current_ee;
>>       enum dev_st_transition next_state;
>>       struct device *dev = &mhi_cntrl->mhi_dev->dev;
>> @@ -829,6 +831,24 @@ int mhi_async_power_up(struct mhi_controller 
>> *mhi_cntrl)
>>           goto error_bhi_offset;
>>       }
>> +    state = mhi_get_mhi_state(mhi_cntrl);
>> +    if (state == MHI_STATE_SYS_ERR) {
>> +        mhi_set_mhi_state(mhi_cntrl, MHI_STATE_RESET);
>> +        ret = readl_poll_timeout(mhi_cntrl->regs + MHICTRL, val,
>> +                     !(val & MHICTRL_RESET_MASK), 1000,
>> +                     mhi_cntrl->timeout_ms * 1000);
> can we use this instead of polling because MSI is configures and int_vec 
> handler is registered
> 
>      wait_event_timeout(mhi_cntrl->state_event,
>                 MHI_PM_IN_FATAL_STATE(mhi_cntrl->pm_state) ||
>                mhi_read_reg_field(mhi_cntrl, base, MHICTRL,
>                            MHICTRL_RESET_MASK,
>                            MHICTRL_RESET_SHIFT, &reset) || !reset ,
>                 msecs_to_jiffies(mhi_cntrl->timeout_ms));
> 
> 1) In case of MHI_PM_IN_FATAL_STATE we would not be accessing MHI reg
> 2) Consistent with current MHI driver code.

I'm not sure this works in the way you intend.

state_event is linked to the intvec, which is the BHI interrupt.  I 
don't see that the state_event is triggered in the MHI interrupt path 
(mhi_irq_handler).  So, if we are in the PBL EE, we would expect to see 
the BHI interrupt, but if we are in the AMSS EE, we would expect to see 
a MHI interrupt.

Now, for my concerned usecase, those two interrupts happen to be the 
same interrupt, so both will get triggered, but I don't expect that to 
be the same for all usecases.

So, with the solution I propose, we exit the wait (poll loop) as soon as 
we see the register change values.

With the solution you propose, if we only get the MHI interrupt, we'll 
have to wait out the entire timeout value, and then check the register. 
In this scenario, we are almost guaranteed to wait for longer than 
necessary.

Did I miss something?

>> +        if (ret) {
>> +            dev_info(dev, "Failed to reset MHI due to syserr state\n");
>> +            goto error_bhi_offset;
>> +        }
>> +
>> +        /*
>> +         * device cleares INTVEC as part of RESET processing,
>> +         * re-program it
>> +         */
>> +        mhi_write_reg(mhi_cntrl, mhi_cntrl->bhi, BHI_INTVEC, 0);
>> +    }
>> +
>>       /* Transition to next state */
>>       next_state = MHI_IN_PBL(current_ee) ?
>>           DEV_ST_TRANSITION_PBL : DEV_ST_TRANSITION_READY;
>

Bhaumik Bhatt April 10, 2020, 8:37 p.m. UTC | #3

Hi Jeff,

We will always have the mhi_intvec_handler registered and trigger your 
wake_up state event when you write MHI RESET. BHI INTVEC IRQ using 
mhi_cntrl->irq[0] is _not_ unregistered once you enter AMSS EE.

So, your below assumption is not true:
 >>>So, if we are in the PBL EE, we would expect to see the BHI 
interrupt, but if we are in the AMSS EE, we would expect to see a MHI 
interrupt.

At the start of mhi_async_power_up(), you've already registered for the 
BHI interrupt as we do setup for IRQ and it is only unregistered from 
power down if power up on the same cycle was a success.

On 4/10/20 8:03 AM, Jeffrey Hugo wrote:
> On 4/9/2020 6:55 PM, Hemant Kumar wrote:
>>
>> On 4/7/20 9:50 AM, Jeffrey Hugo wrote:
>>> The MHI device may be in the syserr state when we attempt to init it in
>>> power_up().  Since we have no local state, the handling is simple -
>>> reset the device and wait for it to transition out of the reset state.
>>>
>>> Signed-off-by: Jeffrey Hugo <jhugo@codeaurora.org>
>>> ---
>>>   drivers/bus/mhi/core/pm.c | 20 ++++++++++++++++++++
>>>   1 file changed, 20 insertions(+)
>>>
>>> diff --git a/drivers/bus/mhi/core/pm.c b/drivers/bus/mhi/core/pm.c
>>> index 52690cb..3285c9e 100644
>>> --- a/drivers/bus/mhi/core/pm.c
>>> +++ b/drivers/bus/mhi/core/pm.c
>>> @@ -9,6 +9,7 @@
>>>   #include <linux/dma-direction.h>
>>>   #include <linux/dma-mapping.h>
>>>   #include <linux/interrupt.h>
>>> +#include <linux/iopoll.h>
>>>   #include <linux/list.h>
>>>   #include <linux/mhi.h>
>>>   #include <linux/module.h>
>>> @@ -760,6 +761,7 @@ static void mhi_deassert_dev_wake(struct 
>>> mhi_controller *mhi_cntrl,
>>>   int mhi_async_power_up(struct mhi_controller *mhi_cntrl)
>>>   {
>>> +    enum mhi_state state;
>>>       enum mhi_ee_type current_ee;
>>>       enum dev_st_transition next_state;
>>>       struct device *dev = &mhi_cntrl->mhi_dev->dev;
>>> @@ -829,6 +831,24 @@ int mhi_async_power_up(struct mhi_controller 
>>> *mhi_cntrl)
>>>           goto error_bhi_offset;
>>>       }
>>> +    state = mhi_get_mhi_state(mhi_cntrl);
>>> +    if (state == MHI_STATE_SYS_ERR) {
>>> +        mhi_set_mhi_state(mhi_cntrl, MHI_STATE_RESET);
>>> +        ret = readl_poll_timeout(mhi_cntrl->regs + MHICTRL, val,
>>> +                     !(val & MHICTRL_RESET_MASK), 1000,
>>> +                     mhi_cntrl->timeout_ms * 1000);
>> can we use this instead of polling because MSI is configures and 
>> int_vec handler is registered
>>
>>      wait_event_timeout(mhi_cntrl->state_event,
>>                 MHI_PM_IN_FATAL_STATE(mhi_cntrl->pm_state) ||
>>                mhi_read_reg_field(mhi_cntrl, base, MHICTRL,
>>                            MHICTRL_RESET_MASK,
>>                            MHICTRL_RESET_SHIFT, &reset) || !reset ,
>>                 msecs_to_jiffies(mhi_cntrl->timeout_ms));
>>
>> 1) In case of MHI_PM_IN_FATAL_STATE we would not be accessing MHI reg
>> 2) Consistent with current MHI driver code.
>
> I'm not sure this works in the way you intend.
>
> state_event is linked to the intvec, which is the BHI interrupt. I 
> don't see that the state_event is triggered in the MHI interrupt path 
> (mhi_irq_handler).  So, if we are in the PBL EE, we would expect to 
> see the BHI interrupt, but if we are in the AMSS EE, we would expect 
> to see a MHI interrupt.
>
> Now, for my concerned usecase, those two interrupts happen to be the 
> same interrupt, so both will get triggered, but I don't expect that to 
> be the same for all usecases.
>
> So, with the solution I propose, we exit the wait (poll loop) as soon 
> as we see the register change values.
>
> With the solution you propose, if we only get the MHI interrupt, we'll 
> have to wait out the entire timeout value, and then check the 
> register. In this scenario, we are almost guaranteed to wait for 
> longer than necessary.
>
> Did I miss something?
>
>>> +        if (ret) {
>>> +            dev_info(dev, "Failed to reset MHI due to syserr 
>>> state\n");
>>> +            goto error_bhi_offset;
>>> +        }
>>> +
>>> +        /*
>>> +         * device cleares INTVEC as part of RESET processing,
>>> +         * re-program it
>>> +         */
>>> +        mhi_write_reg(mhi_cntrl, mhi_cntrl->bhi, BHI_INTVEC, 0);
>>> +    }
>>> +
>>>       /* Transition to next state */
>>>       next_state = MHI_IN_PBL(current_ee) ?
>>>           DEV_ST_TRANSITION_PBL : DEV_ST_TRANSITION_READY;
>>
>
>

Jeffrey Hugo April 10, 2020, 9:39 p.m. UTC | #4

On 4/10/2020 2:37 PM, Bhaumik Vasav Bhatt wrote:
> Hi Jeff,
> 
> We will always have the mhi_intvec_handler registered and trigger your 
> wake_up state event when you write MHI RESET. BHI INTVEC IRQ using 
> mhi_cntrl->irq[0] is _not_ unregistered once you enter AMSS EE.

I understand it is not unregistered.  However mhi_cntrl->irq[0] may be 
reserved for BHI, and thus only exercised by PBL EE.  Where as, 
mhi_cntrl->irq[1..N] may be only exercised by AMSS EE. 
mhi_intvec_handler is not called in response to mhi_cntrl->irq[1..N].

Additionally, I re-reviewed the MHI spec, and I don't see where the spec 
requires the device to issue an interrupt upon completion of the RESET 
request.

Under section 3.5, step 11 states -

"The host must poll for the value of the RESET bit to detect the 
completion of the reset procedure by the device (RESET is set to 0)."

So, by this, my proposed solution appears to be spec compliant, where as 
what Hemant proposed is not.

> 
> So, your below assumption is not true:
>  >>>So, if we are in the PBL EE, we would expect to see the BHI 
> interrupt, but if we are in the AMSS EE, we would expect to see a MHI 
> interrupt.
> 
> At the start of mhi_async_power_up(), you've already registered for the 
> BHI interrupt as we do setup for IRQ and it is only unregistered from 
> power down if power up on the same cycle was a success.

You seem to have misunderstood my point.  If the BHI irq is only for BHI 
activity, which is activity restricted to the PBL EE, and the MHI 
interrupt(s) are restricted to MHI activity, which for the purposes of 
this discussion only occur in the AMSS EE, then my assumption is 
correct.  When the device is in PBL EE, we should only observe BHI irqs, 
and when the device is in AMSS EE, we should only observe MHI irqs.

This is a statement of what IRQ lines the device is raising, and not a 
statement of what handlers the host has, or has not registered.

Again, if the BHI irq is only generated in the PBL EE, and we rely on 
the BHI irq for "sensing" the state_event - we will never see the 
state_event in the AMSS EE, unless the same IRQ line is used for both 
MHI and BHI (which is only a select set of usecases, and not universal).

> 
> On 4/10/20 8:03 AM, Jeffrey Hugo wrote:
>> On 4/9/2020 6:55 PM, Hemant Kumar wrote:
>>>
>>> On 4/7/20 9:50 AM, Jeffrey Hugo wrote:
>>>> The MHI device may be in the syserr state when we attempt to init it in
>>>> power_up().  Since we have no local state, the handling is simple -
>>>> reset the device and wait for it to transition out of the reset state.
>>>>
>>>> Signed-off-by: Jeffrey Hugo <jhugo@codeaurora.org>
>>>> ---
>>>>   drivers/bus/mhi/core/pm.c | 20 ++++++++++++++++++++
>>>>   1 file changed, 20 insertions(+)
>>>>
>>>> diff --git a/drivers/bus/mhi/core/pm.c b/drivers/bus/mhi/core/pm.c
>>>> index 52690cb..3285c9e 100644
>>>> --- a/drivers/bus/mhi/core/pm.c
>>>> +++ b/drivers/bus/mhi/core/pm.c
>>>> @@ -9,6 +9,7 @@
>>>>   #include <linux/dma-direction.h>
>>>>   #include <linux/dma-mapping.h>
>>>>   #include <linux/interrupt.h>
>>>> +#include <linux/iopoll.h>
>>>>   #include <linux/list.h>
>>>>   #include <linux/mhi.h>
>>>>   #include <linux/module.h>
>>>> @@ -760,6 +761,7 @@ static void mhi_deassert_dev_wake(struct 
>>>> mhi_controller *mhi_cntrl,
>>>>   int mhi_async_power_up(struct mhi_controller *mhi_cntrl)
>>>>   {
>>>> +    enum mhi_state state;
>>>>       enum mhi_ee_type current_ee;
>>>>       enum dev_st_transition next_state;
>>>>       struct device *dev = &mhi_cntrl->mhi_dev->dev;
>>>> @@ -829,6 +831,24 @@ int mhi_async_power_up(struct mhi_controller 
>>>> *mhi_cntrl)
>>>>           goto error_bhi_offset;
>>>>       }
>>>> +    state = mhi_get_mhi_state(mhi_cntrl);
>>>> +    if (state == MHI_STATE_SYS_ERR) {
>>>> +        mhi_set_mhi_state(mhi_cntrl, MHI_STATE_RESET);
>>>> +        ret = readl_poll_timeout(mhi_cntrl->regs + MHICTRL, val,
>>>> +                     !(val & MHICTRL_RESET_MASK), 1000,
>>>> +                     mhi_cntrl->timeout_ms * 1000);
>>> can we use this instead of polling because MSI is configures and 
>>> int_vec handler is registered
>>>
>>>      wait_event_timeout(mhi_cntrl->state_event,
>>>                 MHI_PM_IN_FATAL_STATE(mhi_cntrl->pm_state) ||
>>>                mhi_read_reg_field(mhi_cntrl, base, MHICTRL,
>>>                            MHICTRL_RESET_MASK,
>>>                            MHICTRL_RESET_SHIFT, &reset) || !reset ,
>>>                 msecs_to_jiffies(mhi_cntrl->timeout_ms));
>>>
>>> 1) In case of MHI_PM_IN_FATAL_STATE we would not be accessing MHI reg
>>> 2) Consistent with current MHI driver code.
>>
>> I'm not sure this works in the way you intend.
>>
>> state_event is linked to the intvec, which is the BHI interrupt. I 
>> don't see that the state_event is triggered in the MHI interrupt path 
>> (mhi_irq_handler).  So, if we are in the PBL EE, we would expect to 
>> see the BHI interrupt, but if we are in the AMSS EE, we would expect 
>> to see a MHI interrupt.
>>
>> Now, for my concerned usecase, those two interrupts happen to be the 
>> same interrupt, so both will get triggered, but I don't expect that to 
>> be the same for all usecases.
>>
>> So, with the solution I propose, we exit the wait (poll loop) as soon 
>> as we see the register change values.
>>
>> With the solution you propose, if we only get the MHI interrupt, we'll 
>> have to wait out the entire timeout value, and then check the 
>> register. In this scenario, we are almost guaranteed to wait for 
>> longer than necessary.
>>
>> Did I miss something?
>>
>>>> +        if (ret) {
>>>> +            dev_info(dev, "Failed to reset MHI due to syserr 
>>>> state\n");
>>>> +            goto error_bhi_offset;
>>>> +        }
>>>> +
>>>> +        /*
>>>> +         * device cleares INTVEC as part of RESET processing,
>>>> +         * re-program it
>>>> +         */
>>>> +        mhi_write_reg(mhi_cntrl, mhi_cntrl->bhi, BHI_INTVEC, 0);
>>>> +    }
>>>> +
>>>>       /* Transition to next state */
>>>>       next_state = MHI_IN_PBL(current_ee) ?
>>>>           DEV_ST_TRANSITION_PBL : DEV_ST_TRANSITION_READY;
>>>
>>
>>

Manivannan Sadhasivam April 13, 2020, 1:34 p.m. UTC | #5

On Fri, Apr 10, 2020 at 03:39:57PM -0600, Jeffrey Hugo wrote:
> On 4/10/2020 2:37 PM, Bhaumik Vasav Bhatt wrote:
> > Hi Jeff,
> > 
> > We will always have the mhi_intvec_handler registered and trigger your
> > wake_up state event when you write MHI RESET. BHI INTVEC IRQ using
> > mhi_cntrl->irq[0] is _not_ unregistered once you enter AMSS EE.
> 
> I understand it is not unregistered.  However mhi_cntrl->irq[0] may be
> reserved for BHI, and thus only exercised by PBL EE.  Where as,
> mhi_cntrl->irq[1..N] may be only exercised by AMSS EE. mhi_intvec_handler is
> not called in response to mhi_cntrl->irq[1..N].
> 
> Additionally, I re-reviewed the MHI spec, and I don't see where the spec
> requires the device to issue an interrupt upon completion of the RESET
> request.
> 
> Under section 3.5, step 11 states -
> 
> "The host must poll for the value of the RESET bit to detect the completion
> of the reset procedure by the device (RESET is set to 0)."
> 

If this is the scenario then we need to change all of the wait_event_timeout()
implementation for MHI RESET in current stack to polling.

Or the interrupt generation is not defined in spec (sheet) but part of the
existing implementation?

Thanks,
Mani

> So, by this, my proposed solution appears to be spec compliant, where as
> what Hemant proposed is not.
> 
> > 
> > So, your below assumption is not true:
> >  >>>So, if we are in the PBL EE, we would expect to see the BHI
> > interrupt, but if we are in the AMSS EE, we would expect to see a MHI
> > interrupt.
> > 
> > At the start of mhi_async_power_up(), you've already registered for the
> > BHI interrupt as we do setup for IRQ and it is only unregistered from
> > power down if power up on the same cycle was a success.
> 
> You seem to have misunderstood my point.  If the BHI irq is only for BHI
> activity, which is activity restricted to the PBL EE, and the MHI
> interrupt(s) are restricted to MHI activity, which for the purposes of this
> discussion only occur in the AMSS EE, then my assumption is correct.  When
> the device is in PBL EE, we should only observe BHI irqs, and when the
> device is in AMSS EE, we should only observe MHI irqs.
> 
> This is a statement of what IRQ lines the device is raising, and not a
> statement of what handlers the host has, or has not registered.
> 
> Again, if the BHI irq is only generated in the PBL EE, and we rely on the
> BHI irq for "sensing" the state_event - we will never see the state_event in
> the AMSS EE, unless the same IRQ line is used for both MHI and BHI (which is
> only a select set of usecases, and not universal).
> 
> > 
> > On 4/10/20 8:03 AM, Jeffrey Hugo wrote:
> > > On 4/9/2020 6:55 PM, Hemant Kumar wrote:
> > > > 
> > > > On 4/7/20 9:50 AM, Jeffrey Hugo wrote:
> > > > > The MHI device may be in the syserr state when we attempt to init it in
> > > > > power_up().  Since we have no local state, the handling is simple -
> > > > > reset the device and wait for it to transition out of the reset state.
> > > > > 
> > > > > Signed-off-by: Jeffrey Hugo <jhugo@codeaurora.org>
> > > > > ---
> > > > >   drivers/bus/mhi/core/pm.c | 20 ++++++++++++++++++++
> > > > >   1 file changed, 20 insertions(+)
> > > > > 
> > > > > diff --git a/drivers/bus/mhi/core/pm.c b/drivers/bus/mhi/core/pm.c
> > > > > index 52690cb..3285c9e 100644
> > > > > --- a/drivers/bus/mhi/core/pm.c
> > > > > +++ b/drivers/bus/mhi/core/pm.c
> > > > > @@ -9,6 +9,7 @@
> > > > >   #include <linux/dma-direction.h>
> > > > >   #include <linux/dma-mapping.h>
> > > > >   #include <linux/interrupt.h>
> > > > > +#include <linux/iopoll.h>
> > > > >   #include <linux/list.h>
> > > > >   #include <linux/mhi.h>
> > > > >   #include <linux/module.h>
> > > > > @@ -760,6 +761,7 @@ static void mhi_deassert_dev_wake(struct
> > > > > mhi_controller *mhi_cntrl,
> > > > >   int mhi_async_power_up(struct mhi_controller *mhi_cntrl)
> > > > >   {
> > > > > +    enum mhi_state state;
> > > > >       enum mhi_ee_type current_ee;
> > > > >       enum dev_st_transition next_state;
> > > > >       struct device *dev = &mhi_cntrl->mhi_dev->dev;
> > > > > @@ -829,6 +831,24 @@ int mhi_async_power_up(struct
> > > > > mhi_controller *mhi_cntrl)
> > > > >           goto error_bhi_offset;
> > > > >       }
> > > > > +    state = mhi_get_mhi_state(mhi_cntrl);
> > > > > +    if (state == MHI_STATE_SYS_ERR) {
> > > > > +        mhi_set_mhi_state(mhi_cntrl, MHI_STATE_RESET);
> > > > > +        ret = readl_poll_timeout(mhi_cntrl->regs + MHICTRL, val,
> > > > > +                     !(val & MHICTRL_RESET_MASK), 1000,
> > > > > +                     mhi_cntrl->timeout_ms * 1000);
> > > > can we use this instead of polling because MSI is configures and
> > > > int_vec handler is registered
> > > > 
> > > >      wait_event_timeout(mhi_cntrl->state_event,
> > > >                 MHI_PM_IN_FATAL_STATE(mhi_cntrl->pm_state) ||
> > > >                mhi_read_reg_field(mhi_cntrl, base, MHICTRL,
> > > >                            MHICTRL_RESET_MASK,
> > > >                            MHICTRL_RESET_SHIFT, &reset) || !reset ,
> > > >                 msecs_to_jiffies(mhi_cntrl->timeout_ms));
> > > > 
> > > > 1) In case of MHI_PM_IN_FATAL_STATE we would not be accessing MHI reg
> > > > 2) Consistent with current MHI driver code.
> > > 
> > > I'm not sure this works in the way you intend.
> > > 
> > > state_event is linked to the intvec, which is the BHI interrupt. I
> > > don't see that the state_event is triggered in the MHI interrupt
> > > path (mhi_irq_handler).  So, if we are in the PBL EE, we would
> > > expect to see the BHI interrupt, but if we are in the AMSS EE, we
> > > would expect to see a MHI interrupt.
> > > 
> > > Now, for my concerned usecase, those two interrupts happen to be the
> > > same interrupt, so both will get triggered, but I don't expect that
> > > to be the same for all usecases.
> > > 
> > > So, with the solution I propose, we exit the wait (poll loop) as
> > > soon as we see the register change values.
> > > 
> > > With the solution you propose, if we only get the MHI interrupt,
> > > we'll have to wait out the entire timeout value, and then check the
> > > register. In this scenario, we are almost guaranteed to wait for
> > > longer than necessary.
> > > 
> > > Did I miss something?
> > > 
> > > > > +        if (ret) {
> > > > > +            dev_info(dev, "Failed to reset MHI due to
> > > > > syserr state\n");
> > > > > +            goto error_bhi_offset;
> > > > > +        }
> > > > > +
> > > > > +        /*
> > > > > +         * device cleares INTVEC as part of RESET processing,
> > > > > +         * re-program it
> > > > > +         */
> > > > > +        mhi_write_reg(mhi_cntrl, mhi_cntrl->bhi, BHI_INTVEC, 0);
> > > > > +    }
> > > > > +
> > > > >       /* Transition to next state */
> > > > >       next_state = MHI_IN_PBL(current_ee) ?
> > > > >           DEV_ST_TRANSITION_PBL : DEV_ST_TRANSITION_READY;
> > > > 
> > > 
> > > 
> 
> 
> -- 
> Jeffrey Hugo
> Qualcomm Technologies, Inc. is a member of the
> Code Aurora Forum, a Linux Foundation Collaborative Project.

Jeffrey Hugo April 13, 2020, 2:01 p.m. UTC | #6

On 4/13/2020 7:34 AM, Manivannan Sadhasivam wrote:
> On Fri, Apr 10, 2020 at 03:39:57PM -0600, Jeffrey Hugo wrote:
>> On 4/10/2020 2:37 PM, Bhaumik Vasav Bhatt wrote:
>>> Hi Jeff,
>>>
>>> We will always have the mhi_intvec_handler registered and trigger your
>>> wake_up state event when you write MHI RESET. BHI INTVEC IRQ using
>>> mhi_cntrl->irq[0] is _not_ unregistered once you enter AMSS EE.
>>
>> I understand it is not unregistered.  However mhi_cntrl->irq[0] may be
>> reserved for BHI, and thus only exercised by PBL EE.  Where as,
>> mhi_cntrl->irq[1..N] may be only exercised by AMSS EE. mhi_intvec_handler is
>> not called in response to mhi_cntrl->irq[1..N].
>>
>> Additionally, I re-reviewed the MHI spec, and I don't see where the spec
>> requires the device to issue an interrupt upon completion of the RESET
>> request.
>>
>> Under section 3.5, step 11 states -
>>
>> "The host must poll for the value of the RESET bit to detect the completion
>> of the reset procedure by the device (RESET is set to 0)."
>>
> 
> If this is the scenario then we need to change all of the wait_event_timeout()
> implementation for MHI RESET in current stack to polling.
> 
> Or the interrupt generation is not defined in spec (sheet) but part of the
> existing implementation?

It probably could be considered part of the existing implementation, but 
I'd like to hear from Hemant/Bhaumik.  Wherever we end up, I'd like to 
have the spec match.

Manivannan Sadhasivam April 21, 2020, 6:08 a.m. UTC | #7

On Mon, Apr 13, 2020 at 08:01:36AM -0600, Jeffrey Hugo wrote:
> On 4/13/2020 7:34 AM, Manivannan Sadhasivam wrote:
> > On Fri, Apr 10, 2020 at 03:39:57PM -0600, Jeffrey Hugo wrote:
> > > On 4/10/2020 2:37 PM, Bhaumik Vasav Bhatt wrote:
> > > > Hi Jeff,
> > > > 
> > > > We will always have the mhi_intvec_handler registered and trigger your
> > > > wake_up state event when you write MHI RESET. BHI INTVEC IRQ using
> > > > mhi_cntrl->irq[0] is _not_ unregistered once you enter AMSS EE.
> > > 
> > > I understand it is not unregistered.  However mhi_cntrl->irq[0] may be
> > > reserved for BHI, and thus only exercised by PBL EE.  Where as,
> > > mhi_cntrl->irq[1..N] may be only exercised by AMSS EE. mhi_intvec_handler is
> > > not called in response to mhi_cntrl->irq[1..N].
> > > 
> > > Additionally, I re-reviewed the MHI spec, and I don't see where the spec
> > > requires the device to issue an interrupt upon completion of the RESET
> > > request.
> > > 
> > > Under section 3.5, step 11 states -
> > > 
> > > "The host must poll for the value of the RESET bit to detect the completion
> > > of the reset procedure by the device (RESET is set to 0)."
> > > 
> > 
> > If this is the scenario then we need to change all of the wait_event_timeout()
> > implementation for MHI RESET in current stack to polling.
> > 
> > Or the interrupt generation is not defined in spec (sheet) but part of the
> > existing implementation?
> 
> It probably could be considered part of the existing implementation, but I'd
> like to hear from Hemant/Bhaumik.  Wherever we end up, I'd like to have the
> spec match.

Hemant/Bhaumik, can you please share your thoughts?

Thanks,
Mani

> 
> -- 
> Jeffrey Hugo
> Qualcomm Technologies, Inc. is a member of the
> Code Aurora Forum, a Linux Foundation Collaborative Project.

Jeffrey Hugo April 22, 2020, 8:16 p.m. UTC | #8

On 4/21/2020 12:08 AM, Manivannan Sadhasivam wrote:
> On Mon, Apr 13, 2020 at 08:01:36AM -0600, Jeffrey Hugo wrote:
>> On 4/13/2020 7:34 AM, Manivannan Sadhasivam wrote:
>>> On Fri, Apr 10, 2020 at 03:39:57PM -0600, Jeffrey Hugo wrote:
>>>> On 4/10/2020 2:37 PM, Bhaumik Vasav Bhatt wrote:
>>>>> Hi Jeff,
>>>>>
>>>>> We will always have the mhi_intvec_handler registered and trigger your
>>>>> wake_up state event when you write MHI RESET. BHI INTVEC IRQ using
>>>>> mhi_cntrl->irq[0] is _not_ unregistered once you enter AMSS EE.
>>>>
>>>> I understand it is not unregistered.  However mhi_cntrl->irq[0] may be
>>>> reserved for BHI, and thus only exercised by PBL EE.  Where as,
>>>> mhi_cntrl->irq[1..N] may be only exercised by AMSS EE. mhi_intvec_handler is
>>>> not called in response to mhi_cntrl->irq[1..N].
>>>>
>>>> Additionally, I re-reviewed the MHI spec, and I don't see where the spec
>>>> requires the device to issue an interrupt upon completion of the RESET
>>>> request.
>>>>
>>>> Under section 3.5, step 11 states -
>>>>
>>>> "The host must poll for the value of the RESET bit to detect the completion
>>>> of the reset procedure by the device (RESET is set to 0)."
>>>>
>>>
>>> If this is the scenario then we need to change all of the wait_event_timeout()
>>> implementation for MHI RESET in current stack to polling.
>>>
>>> Or the interrupt generation is not defined in spec (sheet) but part of the
>>> existing implementation?
>>
>> It probably could be considered part of the existing implementation, but I'd
>> like to hear from Hemant/Bhaumik.  Wherever we end up, I'd like to have the
>> spec match.
> 
> Hemant/Bhaumik, can you please share your thoughts?

Sorry.  Hemant, Bhaumik, and I have had a few calls discussing this.  We 
are trying to come to a consensus on the expectation of the device 
behavior, so that we can propose the best solution.  Probably a few more 
days yet while we await for a bit of clarification.

[v2,1/5] bus: mhi: core: Handle syserr during power_up

Commit Message

Comments

Patch