diff mbox

[1/1] mmc: mmc: Relax checking for switch errors after HS200 switch

Message ID dd764c3d-6303-2d26-4201-d491957dc980@intel.com (mailing list archive)
State New, archived
Headers show

Commit Message

Adrian Hunter Dec. 1, 2016, 1:40 p.m. UTC
On 01/12/16 12:18, Ulf Hansson wrote:
> On 24 November 2016 at 13:02, Adrian Hunter <adrian.hunter@intel.com> wrote:
>> The JEDEC specification indicates CMD13 can be used after a HS200 switch
>> to check for errors. However in practice some boards experience CRC errors
>> in the CMD13 response. Consequently, for HS200, CRC errors are not a
>> reliable way to know the switch failed. If there really is a problem, we
>> would expect tuning will fail and the result ends up the same. So change
>> the error condition to ignore CRC errors in that case.
> 
> I agree, this seems like a good idea.
> 
> However...
> 
>>
>> Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
>> ---
>>  drivers/mmc/core/mmc.c | 15 ++++++++++++++-
>>  1 file changed, 14 insertions(+), 1 deletion(-)
>>
>> diff --git a/drivers/mmc/core/mmc.c b/drivers/mmc/core/mmc.c
>> index 3268fcd3378d..34d30e2a09ff 100644
>> --- a/drivers/mmc/core/mmc.c
>> +++ b/drivers/mmc/core/mmc.c
>> @@ -1223,7 +1223,12 @@ int mmc_hs400_to_hs200(struct mmc_card *card)
>>         mmc_set_timing(host, MMC_TIMING_MMC_HS200);
>>
>>         err = mmc_switch_status(card);
>> -       if (err)
>> +       /*
>> +        * For HS200, CRC errors are not a reliable way to know the switch
>> +        * failed. If there really is a problem, we would expect tuning will
>> +        * fail and the result ends up the same.
>> +        */
>> +       if (err && err != -EILSEQ)
> 
> I would rather change mmc_switch_status() to take a new parameter,
> which tells it about ignoring CRC errors.
> 
> The reason is simply that I think these changes should probably apply
> to HS400 as well, and then it just gets lots of duplicated code for
> the same error check.

In the HS200 case, we have not yet done tuning and are not yet at the HS200 frequency - so there are reasons to ignore CRC errors.  As well as the fact that there is hardware that is known to have that weakness.

The HS400 case is a little different because we have already done tuning (in HS200 mode) by that time.  After the final switch, both the card and host controller have reached their final operating point, and then CMD13 is sent - so CRC errors would be unexpected.

The switches on the way from HS200 to HS400 are either done:
	- in HS200 mode after tuning is done and at HS200 frequency
	- in a mode that does not require tuning
So again in those cases, CRC errors would be unexpected.

Which does not mean I am against relaxing those cases too, but the rational is different, and I have no strong opinion on it - other than the obvious: that returning a fatal error when the hardware can in fact work, is counterproductive.

> 
>>                 goto out_err;
>>
>>         mmc_set_bus_speed(card);
>> @@ -1387,6 +1392,14 @@ static int mmc_select_hs200(struct mmc_card *card)
>>
>>                 err = mmc_switch_status(card);
>>                 /*
>> +                * For HS200, CRC errors are not a reliable way to know the
>> +                * switch failed. If there really is a problem, we would expect
>> +                * tuning will fail and the result ends up the same.
>> +                */
>> +               if (err == -EILSEQ)
>> +                       err = 0;
>> +
>> +               /*
>>                  * mmc_select_timing() assumes timing has not changed if
>>                  * it is a switch error.
>>                  */
>> --
>> 1.9.1
>>
> 
> Moreover, and somewhat related to this change:
> 
> Do we want to change mmc_switch_status() to be able to tell
> mmc_send_status() about the number of command retries? Currently it
> defaults to MMC_CMD_RETRIES, but I guess we only want to send *one*
> CMD13?

Yeah, although there is always the possibility that the retries overcome sporadic CRC errors, which would otherwise be fatal.

I tend to think this patch should still be separate because it's rational is different.
However we could consider something like this:


--
To unsubscribe from this list: send the line "unsubscribe linux-mmc" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Comments

Ulf Hansson Dec. 1, 2016, 2:06 p.m. UTC | #1
On 1 December 2016 at 14:40, Adrian Hunter <adrian.hunter@intel.com> wrote:
> On 01/12/16 12:18, Ulf Hansson wrote:
>> On 24 November 2016 at 13:02, Adrian Hunter <adrian.hunter@intel.com> wrote:
>>> The JEDEC specification indicates CMD13 can be used after a HS200 switch
>>> to check for errors. However in practice some boards experience CRC errors
>>> in the CMD13 response. Consequently, for HS200, CRC errors are not a
>>> reliable way to know the switch failed. If there really is a problem, we
>>> would expect tuning will fail and the result ends up the same. So change
>>> the error condition to ignore CRC errors in that case.
>>
>> I agree, this seems like a good idea.
>>
>> However...
>>
>>>
>>> Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
>>> ---
>>>  drivers/mmc/core/mmc.c | 15 ++++++++++++++-
>>>  1 file changed, 14 insertions(+), 1 deletion(-)
>>>
>>> diff --git a/drivers/mmc/core/mmc.c b/drivers/mmc/core/mmc.c
>>> index 3268fcd3378d..34d30e2a09ff 100644
>>> --- a/drivers/mmc/core/mmc.c
>>> +++ b/drivers/mmc/core/mmc.c
>>> @@ -1223,7 +1223,12 @@ int mmc_hs400_to_hs200(struct mmc_card *card)
>>>         mmc_set_timing(host, MMC_TIMING_MMC_HS200);
>>>
>>>         err = mmc_switch_status(card);
>>> -       if (err)
>>> +       /*
>>> +        * For HS200, CRC errors are not a reliable way to know the switch
>>> +        * failed. If there really is a problem, we would expect tuning will
>>> +        * fail and the result ends up the same.
>>> +        */
>>> +       if (err && err != -EILSEQ)
>>
>> I would rather change mmc_switch_status() to take a new parameter,
>> which tells it about ignoring CRC errors.
>>
>> The reason is simply that I think these changes should probably apply
>> to HS400 as well, and then it just gets lots of duplicated code for
>> the same error check.
>
> In the HS200 case, we have not yet done tuning and are not yet at the HS200 frequency - so there are reasons to ignore CRC errors.  As well as the fact that there is hardware that is known to have that weakness.
>
> The HS400 case is a little different because we have already done tuning (in HS200 mode) by that time.  After the final switch, both the card and host controller have reached their final operating point, and then CMD13 is sent - so CRC errors would be unexpected.
>
> The switches on the way from HS200 to HS400 are either done:
>         - in HS200 mode after tuning is done and at HS200 frequency
>         - in a mode that does not require tuning
> So again in those cases, CRC errors would be unexpected.

Agree.

>
> Which does not mean I am against relaxing those cases too, but the rational is different, and I have no strong opinion on it - other than the obvious: that returning a fatal error when the hardware can in fact work, is counterproductive.

Okay, so I suggest we make the changes for HS200 first, then we can
look into changes for HS400 as a step on top.

Anyway, I would still prefer to make mmc_switch_status() take another
parameter, unless you insist on the current proposal.

>
>>
>>>                 goto out_err;
>>>
>>>         mmc_set_bus_speed(card);
>>> @@ -1387,6 +1392,14 @@ static int mmc_select_hs200(struct mmc_card *card)
>>>
>>>                 err = mmc_switch_status(card);
>>>                 /*
>>> +                * For HS200, CRC errors are not a reliable way to know the
>>> +                * switch failed. If there really is a problem, we would expect
>>> +                * tuning will fail and the result ends up the same.
>>> +                */
>>> +               if (err == -EILSEQ)
>>> +                       err = 0;
>>> +
>>> +               /*
>>>                  * mmc_select_timing() assumes timing has not changed if
>>>                  * it is a switch error.
>>>                  */
>>> --
>>> 1.9.1
>>>
>>
>> Moreover, and somewhat related to this change:
>>
>> Do we want to change mmc_switch_status() to be able to tell
>> mmc_send_status() about the number of command retries? Currently it
>> defaults to MMC_CMD_RETRIES, but I guess we only want to send *one*
>> CMD13?
>
> Yeah, although there is always the possibility that the retries overcome sporadic CRC errors, which would otherwise be fatal.

I was thinking that the card could have cleared the SWITCH STATUS
information when CRC errors gets detected. But I guess that depends on
*how* the CRC error is triggered, perhaps as you say, then it's better
to retry than not.

Okay, so let's keep using MMC_CMD_RETRIES for this case.

>
> I tend to think this patch should still be separate because it's rational is different.
> However we could consider something like this:
>
> diff --git a/drivers/mmc/core/mmc_ops.c b/drivers/mmc/core/mmc_ops.c
> index c0cfaaeb7637..d61a955b69fc 100644
> --- a/drivers/mmc/core/mmc_ops.c
> +++ b/drivers/mmc/core/mmc_ops.c
> @@ -54,7 +54,7 @@
>         0xff, 0x77, 0x77, 0xff, 0x77, 0xbb, 0xdd, 0xee,
>  };
>
> -int mmc_send_status(struct mmc_card *card, u32 *status)
> +int __mmc_send_status(struct mmc_card *card, u32 *status, int retries)
>  {
>         int err;
>         struct mmc_command cmd = {0};
> @@ -67,7 +67,7 @@ int mmc_send_status(struct mmc_card *card, u32 *status)
>                 cmd.arg = card->rca << 16;
>         cmd.flags = MMC_RSP_SPI_R2 | MMC_RSP_R1 | MMC_CMD_AC;
>
> -       err = mmc_wait_for_cmd(card->host, &cmd, MMC_CMD_RETRIES);
> +       err = mmc_wait_for_cmd(card->host, &cmd, retries);
>         if (err)
>                 return err;
>
> @@ -80,6 +80,11 @@ int mmc_send_status(struct mmc_card *card, u32 *status)
>         return 0;
>  }
>
> +int mmc_send_status(struct mmc_card *card, u32 *status)
> +{
> +       return __mmc_send_status(card, status, MMC_CMD_RETRIES);
> +}
> +
>  static int _mmc_select_card(struct mmc_host *host, struct mmc_card *card)
>  {
>         struct mmc_command cmd = {0};
> @@ -453,7 +458,9 @@ int mmc_switch_status(struct mmc_card *card)
>         u32 status;
>         int err;
>
> -       err = mmc_send_status(card, &status);
> +       err = __mmc_send_status(card, &status, 0);
> +       if (err == -EILSEQ)
> +               return 0;
>         if (err)
>                 return err;
>
>

Kind regards
Uffe
--
To unsubscribe from this list: send the line "unsubscribe linux-mmc" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
diff mbox

Patch

diff --git a/drivers/mmc/core/mmc_ops.c b/drivers/mmc/core/mmc_ops.c
index c0cfaaeb7637..d61a955b69fc 100644
--- a/drivers/mmc/core/mmc_ops.c
+++ b/drivers/mmc/core/mmc_ops.c
@@ -54,7 +54,7 @@ 
 	0xff, 0x77, 0x77, 0xff, 0x77, 0xbb, 0xdd, 0xee,
 };
 
-int mmc_send_status(struct mmc_card *card, u32 *status)
+int __mmc_send_status(struct mmc_card *card, u32 *status, int retries)
 {
 	int err;
 	struct mmc_command cmd = {0};
@@ -67,7 +67,7 @@  int mmc_send_status(struct mmc_card *card, u32 *status)
 		cmd.arg = card->rca << 16;
 	cmd.flags = MMC_RSP_SPI_R2 | MMC_RSP_R1 | MMC_CMD_AC;
 
-	err = mmc_wait_for_cmd(card->host, &cmd, MMC_CMD_RETRIES);
+	err = mmc_wait_for_cmd(card->host, &cmd, retries);
 	if (err)
 		return err;
 
@@ -80,6 +80,11 @@  int mmc_send_status(struct mmc_card *card, u32 *status)
 	return 0;
 }
 
+int mmc_send_status(struct mmc_card *card, u32 *status)
+{
+	return __mmc_send_status(card, status, MMC_CMD_RETRIES);
+}
+
 static int _mmc_select_card(struct mmc_host *host, struct mmc_card *card)
 {
 	struct mmc_command cmd = {0};
@@ -453,7 +458,9 @@  int mmc_switch_status(struct mmc_card *card)
 	u32 status;
 	int err;
 
-	err = mmc_send_status(card, &status);
+	err = __mmc_send_status(card, &status, 0);
+	if (err == -EILSEQ)
+		return 0;
 	if (err)
 		return err;