mmc: core: don't return 1 for max_discard

Message ID	1387405663-14253-1-git-send-email-swarren@wwwdotorg.org (mailing list archive)
State	New, archived
Headers	show Return-Path: <linux-mmc-owner@kernel.org> From: Stephen Warren <swarren@wwwdotorg.org> To: Chris Ball <cjb@laptop.org> Cc: linux-mmc@vger.kernel.org, linux-tegra@vger.kernel.org, Stephen Warren <swarren@nvidia.com>, Adrian Hunter <adrian.hunter@intel.com>, Dong Aisheng <dongas86@gmail.com>, Ulf Hansson <ulf.hansson@linaro.org>, Vladimir Zapolskiy <vz@mleia.com> Subject: [PATCH] mmc: core: don't return 1 for max_discard Date: Wed, 18 Dec 2013 15:27:43 -0700 Message-Id: <1387405663-14253-1-git-send-email-swarren@wwwdotorg.org> Sender: linux-mmc-owner@vger.kernel.org Precedence: bulk

Stephen Warren Dec. 18, 2013, 10:27 p.m. UTC

From: Stephen Warren <swarren@nvidia.com>

In mmc_do_calc_max_discard(), if only a single erase block can be
discarded within the host controller's timeout, don't allow discard
operations at all.

Previously, the code allowed sector-at-a-time discard (rather than
erase-block-at-a-time), which was chronically slow.

Without this patch, on the NVIDIA Tegra Cardhu board, the loops result
in qty == 1, which is immediately returned. This causes discard to
operate a single sector at a time, which is chronically slow. With this
patch in place, discard operates a single erase block at a time, which
is reasonably fast.

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Dong Aisheng <dongas86@gmail.com>
Cc: Ulf Hansson <ulf.hansson@linaro.org>
Cc: Vladimir Zapolskiy <vz@mleia.com>
Fixes: e056a1b5b67b "(mmc: queue: let host controllers specify maximum discard timeout")
Signed-off-by: Stephen Warren <swarren@nvidia.com>
---
 drivers/mmc/core/core.c | 21 +++++++++++++++++++--
 1 file changed, 19 insertions(+), 2 deletions(-)

Stephen Warren Dec. 18, 2013, 11 p.m. UTC | #1

On 12/18/2013 03:27 PM, Stephen Warren wrote:
> From: Stephen Warren <swarren@nvidia.com>
> 
> In mmc_do_calc_max_discard(), if only a single erase block can be
> discarded within the host controller's timeout, don't allow discard
> operations at all.
> 
> Previously, the code allowed sector-at-a-time discard (rather than
> erase-block-at-a-time), which was chronically slow.
> 
> Without this patch, on the NVIDIA Tegra Cardhu board, the loops result
> in qty == 1, which is immediately returned. This causes discard to
> operate a single sector at a time, which is chronically slow. With this
> patch in place, discard operates a single erase block at a time, which
> is reasonably fast.

Alternatively, is the real fix a revert of e056a1b5b67b "mmc: queue: let
host controllers specify maximum discard timeout", followed by:

> diff --git a/drivers/mmc/core/core.c b/drivers/mmc/core/core.c
> index 050eb262485c..35c5b5d86c99 100644
> --- a/drivers/mmc/core/core.c
> +++ b/drivers/mmc/core/core.c
> @@ -1950,7 +1950,6 @@ static int mmc_do_erase(struct mmc_card *card, unsigned int from,
>         cmd.opcode = MMC_ERASE;
>         cmd.arg = arg;
>         cmd.flags = MMC_RSP_SPI_R1B | MMC_RSP_R1B | MMC_CMD_AC;
> -       cmd.cmd_timeout_ms = mmc_erase_timeout(card, arg, qty);
>         err = mmc_wait_for_cmd(card->host, &cmd, 0);
>         if (err) {
>                 pr_err("mmc_erase: erase error %d, status %#x\n",
> @@ -1962,7 +1961,7 @@ static int mmc_do_erase(struct mmc_card *card, unsigned int from,
>         if (mmc_host_is_spi(card->host))
>                 goto out;
>  
> -       timeout = jiffies + msecs_to_jiffies(MMC_CORE_TIMEOUT_MS);
> +       timeout = jiffies + msecs_to_jiffies(mmc_erase_timeout(card, arg, qty));
>         do {
>                 memset(&cmd, 0, sizeof(struct mmc_command));
>                 cmd.opcode = MMC_SEND_STATUS;

That certainly also seems to solve the problem on my board...
--
To unsubscribe from this list: send the line "unsubscribe linux-mmc" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Vladimir Zapolskiy Dec. 19, 2013, 8:22 a.m. UTC | #2

On 12/19/13 00:00, Stephen Warren wrote:
> On 12/18/2013 03:27 PM, Stephen Warren wrote:
>> From: Stephen Warren<swarren@nvidia.com>
>>
>> In mmc_do_calc_max_discard(), if only a single erase block can be
>> discarded within the host controller's timeout, don't allow discard
>> operations at all.
>>
>> Previously, the code allowed sector-at-a-time discard (rather than
>> erase-block-at-a-time), which was chronically slow.
>>
>> Without this patch, on the NVIDIA Tegra Cardhu board, the loops result
>> in qty == 1, which is immediately returned. This causes discard to
>> operate a single sector at a time, which is chronically slow. With this
>> patch in place, discard operates a single erase block at a time, which
>> is reasonably fast.
>
> Alternatively, is the real fix a revert of e056a1b5b67b "mmc: queue: let
> host controllers specify maximum discard timeout", followed by:
>
>> diff --git a/drivers/mmc/core/core.c b/drivers/mmc/core/core.c
>> index 050eb262485c..35c5b5d86c99 100644
>> --- a/drivers/mmc/core/core.c
>> +++ b/drivers/mmc/core/core.c
>> @@ -1950,7 +1950,6 @@ static int mmc_do_erase(struct mmc_card *card, unsigned int from,
>>          cmd.opcode = MMC_ERASE;
>>          cmd.arg = arg;
>>          cmd.flags = MMC_RSP_SPI_R1B | MMC_RSP_R1B | MMC_CMD_AC;
>> -       cmd.cmd_timeout_ms = mmc_erase_timeout(card, arg, qty);
>>          err = mmc_wait_for_cmd(card->host,&cmd, 0);
>>          if (err) {
>>                  pr_err("mmc_erase: erase error %d, status %#x\n",
>> @@ -1962,7 +1961,7 @@ static int mmc_do_erase(struct mmc_card *card, unsigned int from,
>>          if (mmc_host_is_spi(card->host))
>>                  goto out;
>>
>> -       timeout = jiffies + msecs_to_jiffies(MMC_CORE_TIMEOUT_MS);
>> +       timeout = jiffies + msecs_to_jiffies(mmc_erase_timeout(card, arg, qty));
>>          do {
>>                  memset(&cmd, 0, sizeof(struct mmc_command));
>>                  cmd.opcode = MMC_SEND_STATUS;
>
> That certainly also seems to solve the problem on my board...

Personally I'd prefer this change.

Out of curiosity have you tried the approach I proposed as RFC with set
mmc_core.limit_erase_groups=0?

With best wishes,
Vladimir
--
To unsubscribe from this list: send the line "unsubscribe linux-mmc" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Dong Aisheng Dec. 19, 2013, 8:39 a.m. UTC | #3

On Thu, Dec 19, 2013 at 6:27 AM, Stephen Warren <swarren@wwwdotorg.org> wrote:
> From: Stephen Warren <swarren@nvidia.com>
>
> In mmc_do_calc_max_discard(), if only a single erase block can be
> discarded within the host controller's timeout, don't allow discard
> operations at all.
>
> Previously, the code allowed sector-at-a-time discard (rather than
> erase-block-at-a-time), which was chronically slow.
>
> Without this patch, on the NVIDIA Tegra Cardhu board, the loops result
> in qty == 1, which is immediately returned. This causes discard to
> operate a single sector at a time, which is chronically slow. With this
> patch in place, discard operates a single erase block at a time, which
> is reasonably fast.
>
> Cc: Adrian Hunter <adrian.hunter@intel.com>
> Cc: Dong Aisheng <dongas86@gmail.com>
> Cc: Ulf Hansson <ulf.hansson@linaro.org>
> Cc: Vladimir Zapolskiy <vz@mleia.com>
> Fixes: e056a1b5b67b "(mmc: queue: let host controllers specify maximum discard timeout")
> Signed-off-by: Stephen Warren <swarren@nvidia.com>
> ---
>  drivers/mmc/core/core.c | 21 +++++++++++++++++++--
>  1 file changed, 19 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/mmc/core/core.c b/drivers/mmc/core/core.c
> index 57a2b403bf8e..eb952ca634ea 100644
> --- a/drivers/mmc/core/core.c
> +++ b/drivers/mmc/core/core.c
> @@ -2150,8 +2150,25 @@ static unsigned int mmc_do_calc_max_discard(struct mmc_card *card,
>         if (!qty)
>                 return 0;
>
> -       if (qty == 1)
> -               return 1;
> +       /*
> +        * Discard operations may not be aligned to an erase block. In an
> +        * unaligned case, we need to issue 1 more discard operation to HW
> +        * than strictly calculated by:
> +        *     sectors_to_erase / sectors_per_discard_operation
> +        *
> +        * To account for this in the timeout calculations, we assume we can
> +        * actually discard one less erase block than fits into the HW
> +        * timeout. This explains the --qty below.
> +        *
> +        * When only a single discard block operation fits into the timeout,
> +        * disallow any discard operations at all. For example, discarding one
> +        * sector at a time is so chronically slow as to be useless. However,
> +        * make an exception for SD cards without an erase shift, since qty
> +        * isn't multiplied up by an erase block size in the code below for
> +        * that case.
> +        */
> +       if (qty == 1 && !(!card->erase_shift && mmc_card_sd(card)))
> +               return 0;
>

How about when qty == 2?
Erase 2 sectors may have no much difference from 1 sector per one time
for a SD card,
it may still chronically slow, i guess.
So i wonder it may not sovle the issues totally.

Regards
Dong Aisheng

>         /* Convert qty to sectors */
>         if (card->erase_shift)
> --
> 1.8.1.5
>
--
To unsubscribe from this list: send the line "unsubscribe linux-mmc" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Adrian Hunter Dec. 19, 2013, 9:01 a.m. UTC | #4

On 19/12/13 01:00, Stephen Warren wrote:
> On 12/18/2013 03:27 PM, Stephen Warren wrote:
>> From: Stephen Warren <swarren@nvidia.com>
>>
>> In mmc_do_calc_max_discard(), if only a single erase block can be
>> discarded within the host controller's timeout, don't allow discard
>> operations at all.
>>
>> Previously, the code allowed sector-at-a-time discard (rather than
>> erase-block-at-a-time), which was chronically slow.
>>
>> Without this patch, on the NVIDIA Tegra Cardhu board, the loops result
>> in qty == 1, which is immediately returned. This causes discard to
>> operate a single sector at a time, which is chronically slow. With this
>> patch in place, discard operates a single erase block at a time, which
>> is reasonably fast.
> 
> Alternatively, is the real fix a revert of e056a1b5b67b "mmc: queue: let
> host controllers specify maximum discard timeout", followed by:
> 
>> diff --git a/drivers/mmc/core/core.c b/drivers/mmc/core/core.c
>> index 050eb262485c..35c5b5d86c99 100644
>> --- a/drivers/mmc/core/core.c
>> +++ b/drivers/mmc/core/core.c
>> @@ -1950,7 +1950,6 @@ static int mmc_do_erase(struct mmc_card *card, unsigned int from,
>>         cmd.opcode = MMC_ERASE;
>>         cmd.arg = arg;
>>         cmd.flags = MMC_RSP_SPI_R1B | MMC_RSP_R1B | MMC_CMD_AC;
>> -       cmd.cmd_timeout_ms = mmc_erase_timeout(card, arg, qty);
>>         err = mmc_wait_for_cmd(card->host, &cmd, 0);
>>         if (err) {
>>                 pr_err("mmc_erase: erase error %d, status %#x\n",
>> @@ -1962,7 +1961,7 @@ static int mmc_do_erase(struct mmc_card *card, unsigned int from,
>>         if (mmc_host_is_spi(card->host))
>>                 goto out;
>>  
>> -       timeout = jiffies + msecs_to_jiffies(MMC_CORE_TIMEOUT_MS);
>> +       timeout = jiffies + msecs_to_jiffies(mmc_erase_timeout(card, arg, qty));
>>         do {
>>                 memset(&cmd, 0, sizeof(struct mmc_command));
>>                 cmd.opcode = MMC_SEND_STATUS;
> 
> That certainly also seems to solve the problem on my board...

But large erases will timeout when they should have been split into smaller
chunks.

A generic solution needs to be able to explain what happens when the host
controller *does* timeout.
--
To unsubscribe from this list: send the line "unsubscribe linux-mmc" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Dong Aisheng Dec. 19, 2013, 9:05 a.m. UTC | #5

On Thu, Dec 19, 2013 at 7:00 AM, Stephen Warren <swarren@wwwdotorg.org> wrote:
> On 12/18/2013 03:27 PM, Stephen Warren wrote:
>> From: Stephen Warren <swarren@nvidia.com>
>>
>> In mmc_do_calc_max_discard(), if only a single erase block can be
>> discarded within the host controller's timeout, don't allow discard
>> operations at all.
>>
>> Previously, the code allowed sector-at-a-time discard (rather than
>> erase-block-at-a-time), which was chronically slow.
>>
>> Without this patch, on the NVIDIA Tegra Cardhu board, the loops result
>> in qty == 1, which is immediately returned. This causes discard to
>> operate a single sector at a time, which is chronically slow. With this
>> patch in place, discard operates a single erase block at a time, which
>> is reasonably fast.
>
> Alternatively, is the real fix a revert of e056a1b5b67b "mmc: queue: let
> host controllers specify maximum discard timeout", followed by:
>
>> diff --git a/drivers/mmc/core/core.c b/drivers/mmc/core/core.c
>> index 050eb262485c..35c5b5d86c99 100644
>> --- a/drivers/mmc/core/core.c
>> +++ b/drivers/mmc/core/core.c
>> @@ -1950,7 +1950,6 @@ static int mmc_do_erase(struct mmc_card *card, unsigned int from,
>>         cmd.opcode = MMC_ERASE;
>>         cmd.arg = arg;
>>         cmd.flags = MMC_RSP_SPI_R1B | MMC_RSP_R1B | MMC_CMD_AC;
>> -       cmd.cmd_timeout_ms = mmc_erase_timeout(card, arg, qty);
>>         err = mmc_wait_for_cmd(card->host, &cmd, 0);
>>         if (err) {
>>                 pr_err("mmc_erase: erase error %d, status %#x\n",
>> @@ -1962,7 +1961,7 @@ static int mmc_do_erase(struct mmc_card *card, unsigned int from,
>>         if (mmc_host_is_spi(card->host))
>>                 goto out;
>>
>> -       timeout = jiffies + msecs_to_jiffies(MMC_CORE_TIMEOUT_MS);
>> +       timeout = jiffies + msecs_to_jiffies(mmc_erase_timeout(card, arg, qty));
>>         do {
>>                 memset(&cmd, 0, sizeof(struct mmc_command));
>>                 cmd.opcode = MMC_SEND_STATUS;
>
> That certainly also seems to solve the problem on my board...

Is the change you mean here only include above 3 lines changes?
If yes, it's strange to me how does this solve your issue?
It actually does not change the max_discard_to/max_discard_bytes.
It still discards only one sector one time if it does as before....

Regards
Dong Aisheng
--
To unsubscribe from this list: send the line "unsubscribe linux-mmc" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Vladimir Zapolskiy Dec. 19, 2013, 9:14 a.m. UTC | #6

On 12/19/13 10:01, Adrian Hunter wrote:
> On 19/12/13 01:00, Stephen Warren wrote:
>> On 12/18/2013 03:27 PM, Stephen Warren wrote:
>>> From: Stephen Warren<swarren@nvidia.com>
>>>
>>> In mmc_do_calc_max_discard(), if only a single erase block can be
>>> discarded within the host controller's timeout, don't allow discard
>>> operations at all.
>>>
>>> Previously, the code allowed sector-at-a-time discard (rather than
>>> erase-block-at-a-time), which was chronically slow.
>>>
>>> Without this patch, on the NVIDIA Tegra Cardhu board, the loops result
>>> in qty == 1, which is immediately returned. This causes discard to
>>> operate a single sector at a time, which is chronically slow. With this
>>> patch in place, discard operates a single erase block at a time, which
>>> is reasonably fast.
>>
>> Alternatively, is the real fix a revert of e056a1b5b67b "mmc: queue: let
>> host controllers specify maximum discard timeout", followed by:
>>
>>> diff --git a/drivers/mmc/core/core.c b/drivers/mmc/core/core.c
>>> index 050eb262485c..35c5b5d86c99 100644
>>> --- a/drivers/mmc/core/core.c
>>> +++ b/drivers/mmc/core/core.c
>>> @@ -1950,7 +1950,6 @@ static int mmc_do_erase(struct mmc_card *card, unsigned int from,
>>>          cmd.opcode = MMC_ERASE;
>>>          cmd.arg = arg;
>>>          cmd.flags = MMC_RSP_SPI_R1B | MMC_RSP_R1B | MMC_CMD_AC;
>>> -       cmd.cmd_timeout_ms = mmc_erase_timeout(card, arg, qty);
>>>          err = mmc_wait_for_cmd(card->host,&cmd, 0);
>>>          if (err) {
>>>                  pr_err("mmc_erase: erase error %d, status %#x\n",
>>> @@ -1962,7 +1961,7 @@ static int mmc_do_erase(struct mmc_card *card, unsigned int from,
>>>          if (mmc_host_is_spi(card->host))
>>>                  goto out;
>>>
>>> -       timeout = jiffies + msecs_to_jiffies(MMC_CORE_TIMEOUT_MS);
>>> +       timeout = jiffies + msecs_to_jiffies(mmc_erase_timeout(card, arg, qty));
>>>          do {
>>>                  memset(&cmd, 0, sizeof(struct mmc_command));
>>>                  cmd.opcode = MMC_SEND_STATUS;
>>
>> That certainly also seems to solve the problem on my board...
>
> But large erases will timeout when they should have been split into smaller
> chunks.
>
> A generic solution needs to be able to explain what happens when the host
> controller *does* timeout.

Please correct me, but if Data Timeout Error is disabled, then this is not
an issue for most of the host controllers.

With best wishes,
Vladimir
--
To unsubscribe from this list: send the line "unsubscribe linux-mmc" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Adrian Hunter Dec. 19, 2013, 9:42 a.m. UTC | #7

On 19/12/13 11:14, Vladimir Zapolskiy wrote:
> On 12/19/13 10:01, Adrian Hunter wrote:
>> On 19/12/13 01:00, Stephen Warren wrote:
>>> On 12/18/2013 03:27 PM, Stephen Warren wrote:
>>>> From: Stephen Warren<swarren@nvidia.com>
>>>>
>>>> In mmc_do_calc_max_discard(), if only a single erase block can be
>>>> discarded within the host controller's timeout, don't allow discard
>>>> operations at all.
>>>>
>>>> Previously, the code allowed sector-at-a-time discard (rather than
>>>> erase-block-at-a-time), which was chronically slow.
>>>>
>>>> Without this patch, on the NVIDIA Tegra Cardhu board, the loops result
>>>> in qty == 1, which is immediately returned. This causes discard to
>>>> operate a single sector at a time, which is chronically slow. With this
>>>> patch in place, discard operates a single erase block at a time, which
>>>> is reasonably fast.
>>>
>>> Alternatively, is the real fix a revert of e056a1b5b67b "mmc: queue: let
>>> host controllers specify maximum discard timeout", followed by:
>>>
>>>> diff --git a/drivers/mmc/core/core.c b/drivers/mmc/core/core.c
>>>> index 050eb262485c..35c5b5d86c99 100644
>>>> --- a/drivers/mmc/core/core.c
>>>> +++ b/drivers/mmc/core/core.c
>>>> @@ -1950,7 +1950,6 @@ static int mmc_do_erase(struct mmc_card *card,
>>>> unsigned int from,
>>>>          cmd.opcode = MMC_ERASE;
>>>>          cmd.arg = arg;
>>>>          cmd.flags = MMC_RSP_SPI_R1B | MMC_RSP_R1B | MMC_CMD_AC;
>>>> -       cmd.cmd_timeout_ms = mmc_erase_timeout(card, arg, qty);
>>>>          err = mmc_wait_for_cmd(card->host,&cmd, 0);
>>>>          if (err) {
>>>>                  pr_err("mmc_erase: erase error %d, status %#x\n",
>>>> @@ -1962,7 +1961,7 @@ static int mmc_do_erase(struct mmc_card *card,
>>>> unsigned int from,
>>>>          if (mmc_host_is_spi(card->host))
>>>>                  goto out;
>>>>
>>>> -       timeout = jiffies + msecs_to_jiffies(MMC_CORE_TIMEOUT_MS);
>>>> +       timeout = jiffies + msecs_to_jiffies(mmc_erase_timeout(card,
>>>> arg, qty));
>>>>          do {
>>>>                  memset(&cmd, 0, sizeof(struct mmc_command));
>>>>                  cmd.opcode = MMC_SEND_STATUS;
>>>
>>> That certainly also seems to solve the problem on my board...
>>
>> But large erases will timeout when they should have been split into smaller
>> chunks.
>>
>> A generic solution needs to be able to explain what happens when the host
>> controller *does* timeout.
> 
> Please correct me, but if Data Timeout Error is disabled, then this is not
> an issue for most of the host controllers.

That is a very good point.  My experience with SDHCI was that masking the
"Data Timeout Error Status Enable" and "Data Timeout Error Signal Enable
" bits did not disable the timeout i.e. the host controller would not
deliver a TC interrupt if the erase exceeded the timeout.

What happens on your board?

--
To unsubscribe from this list: send the line "unsubscribe linux-mmc" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Ulf Hansson Dec. 19, 2013, 10:26 a.m. UTC | #8

On 19 December 2013 10:42, Adrian Hunter <adrian.hunter@intel.com> wrote:
> On 19/12/13 11:14, Vladimir Zapolskiy wrote:
>> On 12/19/13 10:01, Adrian Hunter wrote:
>>> On 19/12/13 01:00, Stephen Warren wrote:
>>>> On 12/18/2013 03:27 PM, Stephen Warren wrote:
>>>>> From: Stephen Warren<swarren@nvidia.com>
>>>>>
>>>>> In mmc_do_calc_max_discard(), if only a single erase block can be
>>>>> discarded within the host controller's timeout, don't allow discard
>>>>> operations at all.
>>>>>
>>>>> Previously, the code allowed sector-at-a-time discard (rather than
>>>>> erase-block-at-a-time), which was chronically slow.
>>>>>
>>>>> Without this patch, on the NVIDIA Tegra Cardhu board, the loops result
>>>>> in qty == 1, which is immediately returned. This causes discard to
>>>>> operate a single sector at a time, which is chronically slow. With this
>>>>> patch in place, discard operates a single erase block at a time, which
>>>>> is reasonably fast.
>>>>
>>>> Alternatively, is the real fix a revert of e056a1b5b67b "mmc: queue: let
>>>> host controllers specify maximum discard timeout", followed by:
>>>>
>>>>> diff --git a/drivers/mmc/core/core.c b/drivers/mmc/core/core.c
>>>>> index 050eb262485c..35c5b5d86c99 100644
>>>>> --- a/drivers/mmc/core/core.c
>>>>> +++ b/drivers/mmc/core/core.c
>>>>> @@ -1950,7 +1950,6 @@ static int mmc_do_erase(struct mmc_card *card,
>>>>> unsigned int from,
>>>>>          cmd.opcode = MMC_ERASE;
>>>>>          cmd.arg = arg;
>>>>>          cmd.flags = MMC_RSP_SPI_R1B | MMC_RSP_R1B | MMC_CMD_AC;
>>>>> -       cmd.cmd_timeout_ms = mmc_erase_timeout(card, arg, qty);
>>>>>          err = mmc_wait_for_cmd(card->host,&cmd, 0);
>>>>>          if (err) {
>>>>>                  pr_err("mmc_erase: erase error %d, status %#x\n",
>>>>> @@ -1962,7 +1961,7 @@ static int mmc_do_erase(struct mmc_card *card,
>>>>> unsigned int from,
>>>>>          if (mmc_host_is_spi(card->host))
>>>>>                  goto out;
>>>>>
>>>>> -       timeout = jiffies + msecs_to_jiffies(MMC_CORE_TIMEOUT_MS);
>>>>> +       timeout = jiffies + msecs_to_jiffies(mmc_erase_timeout(card,
>>>>> arg, qty));
>>>>>          do {
>>>>>                  memset(&cmd, 0, sizeof(struct mmc_command));
>>>>>                  cmd.opcode = MMC_SEND_STATUS;
>>>>
>>>> That certainly also seems to solve the problem on my board...
>>>
>>> But large erases will timeout when they should have been split into smaller
>>> chunks.
>>>
>>> A generic solution needs to be able to explain what happens when the host
>>> controller *does* timeout.
>>
>> Please correct me, but if Data Timeout Error is disabled, then this is not
>> an issue for most of the host controllers.
>
> That is a very good point.  My experience with SDHCI was that masking the
> "Data Timeout Error Status Enable" and "Data Timeout Error Signal Enable
> " bits did not disable the timeout i.e. the host controller would not
> deliver a TC interrupt if the erase exceeded the timeout.
>
> What happens on your board?
>

I posted a response yesterday for "[PATCH] mmc: core: don't decrement
qty when calculating max_discard", related to this. Please have a
look.

I think the interesting case to consider here is how we can handle
busy detection timeouts that is bigger than what the host hw can
support.

Option 1)
Should we tell the host to disable the timeout in this case? That
potentially means hanging forever - if the card misbehaves. Like
omap_hsmmc does for erase commands. Maybe that is an okay limitation?

Option 2)
Use a R1 response instead if R1B to prevent the host from doing busy
detection. Then rely on the CMD13 to poll for completion instead.
Obviously we can then stop polling after some selected timeout is the
card don't complete it's operations.

Would be very interesting to know what option you prefer!?

Kind regards
Uffe
--
To unsubscribe from this list: send the line "unsubscribe linux-mmc" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Dong Aisheng Dec. 19, 2013, 11:18 a.m. UTC | #9

On Thu, Dec 19, 2013 at 6:26 PM, Ulf Hansson <ulf.hansson@linaro.org> wrote:
> On 19 December 2013 10:42, Adrian Hunter <adrian.hunter@intel.com> wrote:
>> On 19/12/13 11:14, Vladimir Zapolskiy wrote:
>>> On 12/19/13 10:01, Adrian Hunter wrote:
>>>> On 19/12/13 01:00, Stephen Warren wrote:
>>>>> On 12/18/2013 03:27 PM, Stephen Warren wrote:
>>>>>> From: Stephen Warren<swarren@nvidia.com>
>>>>>>
>>>>>> In mmc_do_calc_max_discard(), if only a single erase block can be
>>>>>> discarded within the host controller's timeout, don't allow discard
>>>>>> operations at all.
>>>>>>
>>>>>> Previously, the code allowed sector-at-a-time discard (rather than
>>>>>> erase-block-at-a-time), which was chronically slow.
>>>>>>
>>>>>> Without this patch, on the NVIDIA Tegra Cardhu board, the loops result
>>>>>> in qty == 1, which is immediately returned. This causes discard to
>>>>>> operate a single sector at a time, which is chronically slow. With this
>>>>>> patch in place, discard operates a single erase block at a time, which
>>>>>> is reasonably fast.
>>>>>
>>>>> Alternatively, is the real fix a revert of e056a1b5b67b "mmc: queue: let
>>>>> host controllers specify maximum discard timeout", followed by:
>>>>>
>>>>>> diff --git a/drivers/mmc/core/core.c b/drivers/mmc/core/core.c
>>>>>> index 050eb262485c..35c5b5d86c99 100644
>>>>>> --- a/drivers/mmc/core/core.c
>>>>>> +++ b/drivers/mmc/core/core.c
>>>>>> @@ -1950,7 +1950,6 @@ static int mmc_do_erase(struct mmc_card *card,
>>>>>> unsigned int from,
>>>>>>          cmd.opcode = MMC_ERASE;
>>>>>>          cmd.arg = arg;
>>>>>>          cmd.flags = MMC_RSP_SPI_R1B | MMC_RSP_R1B | MMC_CMD_AC;
>>>>>> -       cmd.cmd_timeout_ms = mmc_erase_timeout(card, arg, qty);
>>>>>>          err = mmc_wait_for_cmd(card->host,&cmd, 0);
>>>>>>          if (err) {
>>>>>>                  pr_err("mmc_erase: erase error %d, status %#x\n",
>>>>>> @@ -1962,7 +1961,7 @@ static int mmc_do_erase(struct mmc_card *card,
>>>>>> unsigned int from,
>>>>>>          if (mmc_host_is_spi(card->host))
>>>>>>                  goto out;
>>>>>>
>>>>>> -       timeout = jiffies + msecs_to_jiffies(MMC_CORE_TIMEOUT_MS);
>>>>>> +       timeout = jiffies + msecs_to_jiffies(mmc_erase_timeout(card,
>>>>>> arg, qty));
>>>>>>          do {
>>>>>>                  memset(&cmd, 0, sizeof(struct mmc_command));
>>>>>>                  cmd.opcode = MMC_SEND_STATUS;
>>>>>
>>>>> That certainly also seems to solve the problem on my board...
>>>>
>>>> But large erases will timeout when they should have been split into smaller
>>>> chunks.
>>>>
>>>> A generic solution needs to be able to explain what happens when the host
>>>> controller *does* timeout.
>>>
>>> Please correct me, but if Data Timeout Error is disabled, then this is not
>>> an issue for most of the host controllers.
>>
>> That is a very good point.  My experience with SDHCI was that masking the
>> "Data Timeout Error Status Enable" and "Data Timeout Error Signal Enable
>> " bits did not disable the timeout i.e. the host controller would not
>> deliver a TC interrupt if the erase exceeded the timeout.
>>
>> What happens on your board?
>>
>
> I posted a response yesterday for "[PATCH] mmc: core: don't decrement
> qty when calculating max_discard", related to this. Please have a
> look.
>
> I think the interesting case to consider here is how we can handle
> busy detection timeouts that is bigger than what the host hw can
> support.
>
> Option 1)
> Should we tell the host to disable the timeout in this case? That
> potentially means hanging forever - if the card misbehaves. Like
> omap_hsmmc does for erase commands. Maybe that is an okay limitation?
>
> Option 2)
> Use a R1 response instead if R1B to prevent the host from doing busy
> detection. Then rely on the CMD13 to poll for completion instead.
> Obviously we can then stop polling after some selected timeout is the
> card don't complete it's operations.
>

I proposed the same way before:
https://www.mail-archive.com/linux-mmc@vger.kernel.org/msg23757.html
Two concerns remain:

"1) not waiting for R1B seems a bit violation with standard spec.
Also it increase complexity on handling the R1B of the same command
for two different
cases: using hw timeout or polling status for CMD38.

2) In current implementation, the data size to erase will not exceed
the max_discard_bytes
which is calculated based on max_discard_to of host.
Then how do we specify max_discard_to if want to use polling? UNIT_MAX?
Will it be too long to affect other activities in the same system?"

That means we should erase a proper size of data in case it affects the system
a lot(e.g. two partitions on the same card, discard one partition may
cause the code
on another partition has no chance to run, it may be serious if the
file system is in it.).

Regards
Dong Aisheng

> Would be very interesting to know what option you prefer!?
>
> Kind regards
> Uffe
--
To unsubscribe from this list: send the line "unsubscribe linux-mmc" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Adrian Hunter Dec. 19, 2013, 12:28 p.m. UTC | #10

On 19/12/13 12:26, Ulf Hansson wrote:
> On 19 December 2013 10:42, Adrian Hunter <adrian.hunter@intel.com> wrote:
>> On 19/12/13 11:14, Vladimir Zapolskiy wrote:
>>> On 12/19/13 10:01, Adrian Hunter wrote:
>>>> On 19/12/13 01:00, Stephen Warren wrote:
>>>>> On 12/18/2013 03:27 PM, Stephen Warren wrote:
>>>>>> From: Stephen Warren<swarren@nvidia.com>
>>>>>>
>>>>>> In mmc_do_calc_max_discard(), if only a single erase block can be
>>>>>> discarded within the host controller's timeout, don't allow discard
>>>>>> operations at all.
>>>>>>
>>>>>> Previously, the code allowed sector-at-a-time discard (rather than
>>>>>> erase-block-at-a-time), which was chronically slow.
>>>>>>
>>>>>> Without this patch, on the NVIDIA Tegra Cardhu board, the loops result
>>>>>> in qty == 1, which is immediately returned. This causes discard to
>>>>>> operate a single sector at a time, which is chronically slow. With this
>>>>>> patch in place, discard operates a single erase block at a time, which
>>>>>> is reasonably fast.
>>>>>
>>>>> Alternatively, is the real fix a revert of e056a1b5b67b "mmc: queue: let
>>>>> host controllers specify maximum discard timeout", followed by:
>>>>>
>>>>>> diff --git a/drivers/mmc/core/core.c b/drivers/mmc/core/core.c
>>>>>> index 050eb262485c..35c5b5d86c99 100644
>>>>>> --- a/drivers/mmc/core/core.c
>>>>>> +++ b/drivers/mmc/core/core.c
>>>>>> @@ -1950,7 +1950,6 @@ static int mmc_do_erase(struct mmc_card *card,
>>>>>> unsigned int from,
>>>>>>          cmd.opcode = MMC_ERASE;
>>>>>>          cmd.arg = arg;
>>>>>>          cmd.flags = MMC_RSP_SPI_R1B | MMC_RSP_R1B | MMC_CMD_AC;
>>>>>> -       cmd.cmd_timeout_ms = mmc_erase_timeout(card, arg, qty);
>>>>>>          err = mmc_wait_for_cmd(card->host,&cmd, 0);
>>>>>>          if (err) {
>>>>>>                  pr_err("mmc_erase: erase error %d, status %#x\n",
>>>>>> @@ -1962,7 +1961,7 @@ static int mmc_do_erase(struct mmc_card *card,
>>>>>> unsigned int from,
>>>>>>          if (mmc_host_is_spi(card->host))
>>>>>>                  goto out;
>>>>>>
>>>>>> -       timeout = jiffies + msecs_to_jiffies(MMC_CORE_TIMEOUT_MS);
>>>>>> +       timeout = jiffies + msecs_to_jiffies(mmc_erase_timeout(card,
>>>>>> arg, qty));
>>>>>>          do {
>>>>>>                  memset(&cmd, 0, sizeof(struct mmc_command));
>>>>>>                  cmd.opcode = MMC_SEND_STATUS;
>>>>>
>>>>> That certainly also seems to solve the problem on my board...
>>>>
>>>> But large erases will timeout when they should have been split into smaller
>>>> chunks.
>>>>
>>>> A generic solution needs to be able to explain what happens when the host
>>>> controller *does* timeout.
>>>
>>> Please correct me, but if Data Timeout Error is disabled, then this is not
>>> an issue for most of the host controllers.
>>
>> That is a very good point.  My experience with SDHCI was that masking the
>> "Data Timeout Error Status Enable" and "Data Timeout Error Signal Enable
>> " bits did not disable the timeout i.e. the host controller would not
>> deliver a TC interrupt if the erase exceeded the timeout.
>>
>> What happens on your board?
>>
> 
> I posted a response yesterday for "[PATCH] mmc: core: don't decrement
> qty when calculating max_discard", related to this. Please have a
> look.
> 
> I think the interesting case to consider here is how we can handle
> busy detection timeouts that is bigger than what the host hw can
> support.
> 
> Option 1)
> Should we tell the host to disable the timeout in this case? That
> potentially means hanging forever - if the card misbehaves. Like
> omap_hsmmc does for erase commands. Maybe that is an okay limitation?

sdhci anyway has a 10 second timer to catch unresponsive host controllers.
I recently sent a patch to use the cmd_timeout_ms if it is bigger than 10
seconds.

	http://permalink.gmane.org/gmane.linux.kernel.mmc/23557

> 
> Option 2)
> Use a R1 response instead if R1B to prevent the host from doing busy
> detection. Then rely on the CMD13 to poll for completion instead.
> Obviously we can then stop polling after some selected timeout is the
> card don't complete it's operations.

It would be nice to avoid polling when the timeout can be supported. Also
the polling should be periodic.

> 
> Would be very interesting to know what option you prefer!?

At least 1 of the host controllers I have seen does not support disabling
the timeout - so option 1) might not work in all cases.  Although it is the
nicer option i.e. replace the hardware timeout with a software timeout.

So I would probably allow both options to co-exist.

> 
> Kind regards
> Uffe
> 
> 

--
To unsubscribe from this list: send the line "unsubscribe linux-mmc" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Ulf Hansson Dec. 19, 2013, 1:04 p.m. UTC | #11

On 19 December 2013 12:18, Dong Aisheng <dongas86@gmail.com> wrote:
> On Thu, Dec 19, 2013 at 6:26 PM, Ulf Hansson <ulf.hansson@linaro.org> wrote:
>> On 19 December 2013 10:42, Adrian Hunter <adrian.hunter@intel.com> wrote:
>>> On 19/12/13 11:14, Vladimir Zapolskiy wrote:
>>>> On 12/19/13 10:01, Adrian Hunter wrote:
>>>>> On 19/12/13 01:00, Stephen Warren wrote:
>>>>>> On 12/18/2013 03:27 PM, Stephen Warren wrote:
>>>>>>> From: Stephen Warren<swarren@nvidia.com>
>>>>>>>
>>>>>>> In mmc_do_calc_max_discard(), if only a single erase block can be
>>>>>>> discarded within the host controller's timeout, don't allow discard
>>>>>>> operations at all.
>>>>>>>
>>>>>>> Previously, the code allowed sector-at-a-time discard (rather than
>>>>>>> erase-block-at-a-time), which was chronically slow.
>>>>>>>
>>>>>>> Without this patch, on the NVIDIA Tegra Cardhu board, the loops result
>>>>>>> in qty == 1, which is immediately returned. This causes discard to
>>>>>>> operate a single sector at a time, which is chronically slow. With this
>>>>>>> patch in place, discard operates a single erase block at a time, which
>>>>>>> is reasonably fast.
>>>>>>
>>>>>> Alternatively, is the real fix a revert of e056a1b5b67b "mmc: queue: let
>>>>>> host controllers specify maximum discard timeout", followed by:
>>>>>>
>>>>>>> diff --git a/drivers/mmc/core/core.c b/drivers/mmc/core/core.c
>>>>>>> index 050eb262485c..35c5b5d86c99 100644
>>>>>>> --- a/drivers/mmc/core/core.c
>>>>>>> +++ b/drivers/mmc/core/core.c
>>>>>>> @@ -1950,7 +1950,6 @@ static int mmc_do_erase(struct mmc_card *card,
>>>>>>> unsigned int from,
>>>>>>>          cmd.opcode = MMC_ERASE;
>>>>>>>          cmd.arg = arg;
>>>>>>>          cmd.flags = MMC_RSP_SPI_R1B | MMC_RSP_R1B | MMC_CMD_AC;
>>>>>>> -       cmd.cmd_timeout_ms = mmc_erase_timeout(card, arg, qty);
>>>>>>>          err = mmc_wait_for_cmd(card->host,&cmd, 0);
>>>>>>>          if (err) {
>>>>>>>                  pr_err("mmc_erase: erase error %d, status %#x\n",
>>>>>>> @@ -1962,7 +1961,7 @@ static int mmc_do_erase(struct mmc_card *card,
>>>>>>> unsigned int from,
>>>>>>>          if (mmc_host_is_spi(card->host))
>>>>>>>                  goto out;
>>>>>>>
>>>>>>> -       timeout = jiffies + msecs_to_jiffies(MMC_CORE_TIMEOUT_MS);
>>>>>>> +       timeout = jiffies + msecs_to_jiffies(mmc_erase_timeout(card,
>>>>>>> arg, qty));
>>>>>>>          do {
>>>>>>>                  memset(&cmd, 0, sizeof(struct mmc_command));
>>>>>>>                  cmd.opcode = MMC_SEND_STATUS;
>>>>>>
>>>>>> That certainly also seems to solve the problem on my board...
>>>>>
>>>>> But large erases will timeout when they should have been split into smaller
>>>>> chunks.
>>>>>
>>>>> A generic solution needs to be able to explain what happens when the host
>>>>> controller *does* timeout.
>>>>
>>>> Please correct me, but if Data Timeout Error is disabled, then this is not
>>>> an issue for most of the host controllers.
>>>
>>> That is a very good point.  My experience with SDHCI was that masking the
>>> "Data Timeout Error Status Enable" and "Data Timeout Error Signal Enable
>>> " bits did not disable the timeout i.e. the host controller would not
>>> deliver a TC interrupt if the erase exceeded the timeout.
>>>
>>> What happens on your board?
>>>
>>
>> I posted a response yesterday for "[PATCH] mmc: core: don't decrement
>> qty when calculating max_discard", related to this. Please have a
>> look.
>>
>> I think the interesting case to consider here is how we can handle
>> busy detection timeouts that is bigger than what the host hw can
>> support.
>>
>> Option 1)
>> Should we tell the host to disable the timeout in this case? That
>> potentially means hanging forever - if the card misbehaves. Like
>> omap_hsmmc does for erase commands. Maybe that is an okay limitation?
>>
>> Option 2)
>> Use a R1 response instead if R1B to prevent the host from doing busy
>> detection. Then rely on the CMD13 to poll for completion instead.
>> Obviously we can then stop polling after some selected timeout is the
>> card don't complete it's operations.
>>
>
> I proposed the same way before:
> https://www.mail-archive.com/linux-mmc@vger.kernel.org/msg23757.html
> Two concerns remain:
>
> "1) not waiting for R1B seems a bit violation with standard spec.

R1B is the same as R1 but with _optional_ busy detection on DAT0. So I
don't think we will be violating the spec by only using CMD13.

> Also it increase complexity on handling the R1B of the same command
> for two different
> cases: using hw timeout or polling status for CMD38.

I have started to craft to together a patch for how __mmc_switch()
would be changed in this respect.

I don't think it will become that complicated, and more importantly -
for the ERASE commands, it will be simplifier since there are not the
special case were CMD13 is _not_ allowed, which is the case for
SWITCH.

>
> 2) In current implementation, the data size to erase will not exceed
> the max_discard_bytes
> which is calculated based on max_discard_to of host.

I suggest we should move away from using max_discard_to to calculate
max_discard. This just don't work smoothly since we can end up in
cases were we get too few blocks or not any at all. Like what happen
for Stephen in the Tegra case.

> Then how do we specify max_discard_to if want to use polling? UNIT_MAX?
> Will it be too long to affect other activities in the same system?"

I would prefer if we could select a fixed number of max
blocks/eraseblocks, then calculate the timeout based on that.

>
> That means we should erase a proper size of data in case it affects the system
> a lot(e.g. two partitions on the same card, discard one partition may
> cause the code
> on another partition has no chance to run, it may be serious if the
> file system is in it.).

So maybe the fixed numer of max blocks/eraseblocks should be just a few ones?

>
> Regards
> Dong Aisheng
>
>> Would be very interesting to know what option you prefer!?
>>
>> Kind regards
>> Uffe
--
To unsubscribe from this list: send the line "unsubscribe linux-mmc" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Ulf Hansson Dec. 19, 2013, 1:29 p.m. UTC | #12

On 19 December 2013 13:28, Adrian Hunter <adrian.hunter@intel.com> wrote:
> On 19/12/13 12:26, Ulf Hansson wrote:
>> On 19 December 2013 10:42, Adrian Hunter <adrian.hunter@intel.com> wrote:
>>> On 19/12/13 11:14, Vladimir Zapolskiy wrote:
>>>> On 12/19/13 10:01, Adrian Hunter wrote:
>>>>> On 19/12/13 01:00, Stephen Warren wrote:
>>>>>> On 12/18/2013 03:27 PM, Stephen Warren wrote:
>>>>>>> From: Stephen Warren<swarren@nvidia.com>
>>>>>>>
>>>>>>> In mmc_do_calc_max_discard(), if only a single erase block can be
>>>>>>> discarded within the host controller's timeout, don't allow discard
>>>>>>> operations at all.
>>>>>>>
>>>>>>> Previously, the code allowed sector-at-a-time discard (rather than
>>>>>>> erase-block-at-a-time), which was chronically slow.
>>>>>>>
>>>>>>> Without this patch, on the NVIDIA Tegra Cardhu board, the loops result
>>>>>>> in qty == 1, which is immediately returned. This causes discard to
>>>>>>> operate a single sector at a time, which is chronically slow. With this
>>>>>>> patch in place, discard operates a single erase block at a time, which
>>>>>>> is reasonably fast.
>>>>>>
>>>>>> Alternatively, is the real fix a revert of e056a1b5b67b "mmc: queue: let
>>>>>> host controllers specify maximum discard timeout", followed by:
>>>>>>
>>>>>>> diff --git a/drivers/mmc/core/core.c b/drivers/mmc/core/core.c
>>>>>>> index 050eb262485c..35c5b5d86c99 100644
>>>>>>> --- a/drivers/mmc/core/core.c
>>>>>>> +++ b/drivers/mmc/core/core.c
>>>>>>> @@ -1950,7 +1950,6 @@ static int mmc_do_erase(struct mmc_card *card,
>>>>>>> unsigned int from,
>>>>>>>          cmd.opcode = MMC_ERASE;
>>>>>>>          cmd.arg = arg;
>>>>>>>          cmd.flags = MMC_RSP_SPI_R1B | MMC_RSP_R1B | MMC_CMD_AC;
>>>>>>> -       cmd.cmd_timeout_ms = mmc_erase_timeout(card, arg, qty);
>>>>>>>          err = mmc_wait_for_cmd(card->host,&cmd, 0);
>>>>>>>          if (err) {
>>>>>>>                  pr_err("mmc_erase: erase error %d, status %#x\n",
>>>>>>> @@ -1962,7 +1961,7 @@ static int mmc_do_erase(struct mmc_card *card,
>>>>>>> unsigned int from,
>>>>>>>          if (mmc_host_is_spi(card->host))
>>>>>>>                  goto out;
>>>>>>>
>>>>>>> -       timeout = jiffies + msecs_to_jiffies(MMC_CORE_TIMEOUT_MS);
>>>>>>> +       timeout = jiffies + msecs_to_jiffies(mmc_erase_timeout(card,
>>>>>>> arg, qty));
>>>>>>>          do {
>>>>>>>                  memset(&cmd, 0, sizeof(struct mmc_command));
>>>>>>>                  cmd.opcode = MMC_SEND_STATUS;
>>>>>>
>>>>>> That certainly also seems to solve the problem on my board...
>>>>>
>>>>> But large erases will timeout when they should have been split into smaller
>>>>> chunks.
>>>>>
>>>>> A generic solution needs to be able to explain what happens when the host
>>>>> controller *does* timeout.
>>>>
>>>> Please correct me, but if Data Timeout Error is disabled, then this is not
>>>> an issue for most of the host controllers.
>>>
>>> That is a very good point.  My experience with SDHCI was that masking the
>>> "Data Timeout Error Status Enable" and "Data Timeout Error Signal Enable
>>> " bits did not disable the timeout i.e. the host controller would not
>>> deliver a TC interrupt if the erase exceeded the timeout.
>>>
>>> What happens on your board?
>>>
>>
>> I posted a response yesterday for "[PATCH] mmc: core: don't decrement
>> qty when calculating max_discard", related to this. Please have a
>> look.
>>
>> I think the interesting case to consider here is how we can handle
>> busy detection timeouts that is bigger than what the host hw can
>> support.
>>
>> Option 1)
>> Should we tell the host to disable the timeout in this case? That
>> potentially means hanging forever - if the card misbehaves. Like
>> omap_hsmmc does for erase commands. Maybe that is an okay limitation?
>
> sdhci anyway has a 10 second timer to catch unresponsive host controllers.
> I recently sent a patch to use the cmd_timeout_ms if it is bigger than 10
> seconds.
>
>         http://permalink.gmane.org/gmane.linux.kernel.mmc/23557
>

I see the reason behind your patch. Somehow, I don't like that host
drivers need to care about such things for specific commands.

The host driver should only tell it's maximum supported busy detection
timeout (max_discard_to) to the core layer, which should be needed
only of it supports MMC_CAP_WAIT_WHILE_BUSY.

Then the core layer should decide what to do depending on current
needed timeout.

BTW, do you know why sdhci haven't enabled MMC_CAP_WAIT_WHILE_BUSY. It
seems like it should be?

>>
>> Option 2)
>> Use a R1 response instead if R1B to prevent the host from doing busy
>> detection. Then rely on the CMD13 to poll for completion instead.
>> Obviously we can then stop polling after some selected timeout is the
>> card don't complete it's operations.
>
> It would be nice to avoid polling when the timeout can be supported. Also
> the polling should be periodic.

Agree!

>
>>
>> Would be very interesting to know what option you prefer!?
>
> At least 1 of the host controllers I have seen does not support disabling
> the timeout - so option 1) might not work in all cases.  Although it is the
> nicer option i.e. replace the hardware timeout with a software timeout.
>
> So I would probably allow both options to co-exist.

Thanks for input Adrian!

>
>>
>> Kind regards
>> Uffe
>>
>>
>
--
To unsubscribe from this list: send the line "unsubscribe linux-mmc" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Adrian Hunter Dec. 19, 2013, 1:49 p.m. UTC | #13

On 19/12/13 15:29, Ulf Hansson wrote:
> On 19 December 2013 13:28, Adrian Hunter <adrian.hunter@intel.com> wrote:
>> On 19/12/13 12:26, Ulf Hansson wrote:
>>> On 19 December 2013 10:42, Adrian Hunter <adrian.hunter@intel.com> wrote:
>>>> On 19/12/13 11:14, Vladimir Zapolskiy wrote:
>>>>> On 12/19/13 10:01, Adrian Hunter wrote:
>>>>>> On 19/12/13 01:00, Stephen Warren wrote:
>>>>>>> On 12/18/2013 03:27 PM, Stephen Warren wrote:
>>>>>>>> From: Stephen Warren<swarren@nvidia.com>
>>>>>>>>
>>>>>>>> In mmc_do_calc_max_discard(), if only a single erase block can be
>>>>>>>> discarded within the host controller's timeout, don't allow discard
>>>>>>>> operations at all.
>>>>>>>>
>>>>>>>> Previously, the code allowed sector-at-a-time discard (rather than
>>>>>>>> erase-block-at-a-time), which was chronically slow.
>>>>>>>>
>>>>>>>> Without this patch, on the NVIDIA Tegra Cardhu board, the loops result
>>>>>>>> in qty == 1, which is immediately returned. This causes discard to
>>>>>>>> operate a single sector at a time, which is chronically slow. With this
>>>>>>>> patch in place, discard operates a single erase block at a time, which
>>>>>>>> is reasonably fast.
>>>>>>>
>>>>>>> Alternatively, is the real fix a revert of e056a1b5b67b "mmc: queue: let
>>>>>>> host controllers specify maximum discard timeout", followed by:
>>>>>>>
>>>>>>>> diff --git a/drivers/mmc/core/core.c b/drivers/mmc/core/core.c
>>>>>>>> index 050eb262485c..35c5b5d86c99 100644
>>>>>>>> --- a/drivers/mmc/core/core.c
>>>>>>>> +++ b/drivers/mmc/core/core.c
>>>>>>>> @@ -1950,7 +1950,6 @@ static int mmc_do_erase(struct mmc_card *card,
>>>>>>>> unsigned int from,
>>>>>>>>          cmd.opcode = MMC_ERASE;
>>>>>>>>          cmd.arg = arg;
>>>>>>>>          cmd.flags = MMC_RSP_SPI_R1B | MMC_RSP_R1B | MMC_CMD_AC;
>>>>>>>> -       cmd.cmd_timeout_ms = mmc_erase_timeout(card, arg, qty);
>>>>>>>>          err = mmc_wait_for_cmd(card->host,&cmd, 0);
>>>>>>>>          if (err) {
>>>>>>>>                  pr_err("mmc_erase: erase error %d, status %#x\n",
>>>>>>>> @@ -1962,7 +1961,7 @@ static int mmc_do_erase(struct mmc_card *card,
>>>>>>>> unsigned int from,
>>>>>>>>          if (mmc_host_is_spi(card->host))
>>>>>>>>                  goto out;
>>>>>>>>
>>>>>>>> -       timeout = jiffies + msecs_to_jiffies(MMC_CORE_TIMEOUT_MS);
>>>>>>>> +       timeout = jiffies + msecs_to_jiffies(mmc_erase_timeout(card,
>>>>>>>> arg, qty));
>>>>>>>>          do {
>>>>>>>>                  memset(&cmd, 0, sizeof(struct mmc_command));
>>>>>>>>                  cmd.opcode = MMC_SEND_STATUS;
>>>>>>>
>>>>>>> That certainly also seems to solve the problem on my board...
>>>>>>
>>>>>> But large erases will timeout when they should have been split into smaller
>>>>>> chunks.
>>>>>>
>>>>>> A generic solution needs to be able to explain what happens when the host
>>>>>> controller *does* timeout.
>>>>>
>>>>> Please correct me, but if Data Timeout Error is disabled, then this is not
>>>>> an issue for most of the host controllers.
>>>>
>>>> That is a very good point.  My experience with SDHCI was that masking the
>>>> "Data Timeout Error Status Enable" and "Data Timeout Error Signal Enable
>>>> " bits did not disable the timeout i.e. the host controller would not
>>>> deliver a TC interrupt if the erase exceeded the timeout.
>>>>
>>>> What happens on your board?
>>>>
>>>
>>> I posted a response yesterday for "[PATCH] mmc: core: don't decrement
>>> qty when calculating max_discard", related to this. Please have a
>>> look.
>>>
>>> I think the interesting case to consider here is how we can handle
>>> busy detection timeouts that is bigger than what the host hw can
>>> support.
>>>
>>> Option 1)
>>> Should we tell the host to disable the timeout in this case? That
>>> potentially means hanging forever - if the card misbehaves. Like
>>> omap_hsmmc does for erase commands. Maybe that is an okay limitation?
>>
>> sdhci anyway has a 10 second timer to catch unresponsive host controllers.
>> I recently sent a patch to use the cmd_timeout_ms if it is bigger than 10
>> seconds.
>>
>>         http://permalink.gmane.org/gmane.linux.kernel.mmc/23557
>>
> 
> I see the reason behind your patch. Somehow, I don't like that host
> drivers need to care about such things for specific commands.

It is not for a specific command - the timer is used for all commands.

> 
> The host driver should only tell it's maximum supported busy detection
> timeout (max_discard_to) to the core layer, which should be needed
> only of it supports MMC_CAP_WAIT_WHILE_BUSY.
> 
> Then the core layer should decide what to do depending on current
> needed timeout.
> 
> BTW, do you know why sdhci haven't enabled MMC_CAP_WAIT_WHILE_BUSY. It
> seems like it should be?

Yes it should be.  Just an oversight.

> 
>>>
>>> Option 2)
>>> Use a R1 response instead if R1B to prevent the host from doing busy
>>> detection. Then rely on the CMD13 to poll for completion instead.
>>> Obviously we can then stop polling after some selected timeout is the
>>> card don't complete it's operations.
>>
>> It would be nice to avoid polling when the timeout can be supported. Also
>> the polling should be periodic.
> 
> Agree!
> 
>>
>>>
>>> Would be very interesting to know what option you prefer!?
>>
>> At least 1 of the host controllers I have seen does not support disabling
>> the timeout - so option 1) might not work in all cases.  Although it is the
>> nicer option i.e. replace the hardware timeout with a software timeout.
>>
>> So I would probably allow both options to co-exist.
> 
> Thanks for input Adrian!
> 
>>
>>>
>>> Kind regards
>>> Uffe
>>>
>>>
>>
> 
> 

--
To unsubscribe from this list: send the line "unsubscribe linux-mmc" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Stephen Warren Dec. 19, 2013, 7:08 p.m. UTC | #14

On 12/19/2013 01:39 AM, Dong Aisheng wrote:
> On Thu, Dec 19, 2013 at 6:27 AM, Stephen Warren <swarren@wwwdotorg.org> wrote:
>> From: Stephen Warren <swarren@nvidia.com>
>>
>> In mmc_do_calc_max_discard(), if only a single erase block can be
>> discarded within the host controller's timeout, don't allow discard
>> operations at all.
>>
>> Previously, the code allowed sector-at-a-time discard (rather than
>> erase-block-at-a-time), which was chronically slow.
>>
>> Without this patch, on the NVIDIA Tegra Cardhu board, the loops result
>> in qty == 1, which is immediately returned. This causes discard to
>> operate a single sector at a time, which is chronically slow. With this
>> patch in place, discard operates a single erase block at a time, which
>> is reasonably fast.
>>
>> Cc: Adrian Hunter <adrian.hunter@intel.com>
>> Cc: Dong Aisheng <dongas86@gmail.com>
>> Cc: Ulf Hansson <ulf.hansson@linaro.org>
>> Cc: Vladimir Zapolskiy <vz@mleia.com>
>> Fixes: e056a1b5b67b "(mmc: queue: let host controllers specify maximum discard timeout")
>> Signed-off-by: Stephen Warren <swarren@nvidia.com>
>> ---
>>  drivers/mmc/core/core.c | 21 +++++++++++++++++++--
>>  1 file changed, 19 insertions(+), 2 deletions(-)
>>
>> diff --git a/drivers/mmc/core/core.c b/drivers/mmc/core/core.c
>> index 57a2b403bf8e..eb952ca634ea 100644
>> --- a/drivers/mmc/core/core.c
>> +++ b/drivers/mmc/core/core.c
>> @@ -2150,8 +2150,25 @@ static unsigned int mmc_do_calc_max_discard(struct mmc_card *card,
>>         if (!qty)
>>                 return 0;
>>
>> -       if (qty == 1)
>> -               return 1;
>> +       /*
>> +        * Discard operations may not be aligned to an erase block. In an
>> +        * unaligned case, we need to issue 1 more discard operation to HW
>> +        * than strictly calculated by:
>> +        *     sectors_to_erase / sectors_per_discard_operation
>> +        *
>> +        * To account for this in the timeout calculations, we assume we can
>> +        * actually discard one less erase block than fits into the HW
>> +        * timeout. This explains the --qty below.
>> +        *
>> +        * When only a single discard block operation fits into the timeout,
>> +        * disallow any discard operations at all. For example, discarding one
>> +        * sector at a time is so chronically slow as to be useless. However,
>> +        * make an exception for SD cards without an erase shift, since qty
>> +        * isn't multiplied up by an erase block size in the code below for
>> +        * that case.
>> +        */
>> +       if (qty == 1 && !(!card->erase_shift && mmc_card_sd(card)))
>> +               return 0;
>>
> 
> How about when qty == 2?
> Erase 2 sectors may have no much difference from 1 sector per one time
> for a SD card,
> it may still chronically slow, i guess.
> So i wonder it may not sovle the issues totally.

When qty==2, the number of sectors gets multiplied by the number of
sectors in an erase block, so it isn't chronically slow. It's only slow
with qty==1 because without this patch, the multiplication gets skipped
and "1" returned rather then "1 << card->erase_shift".

--
To unsubscribe from this list: send the line "unsubscribe linux-mmc" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Stephen Warren Dec. 19, 2013, 7:11 p.m. UTC | #15

On 12/19/2013 02:01 AM, Adrian Hunter wrote:
> On 19/12/13 01:00, Stephen Warren wrote:
>> On 12/18/2013 03:27 PM, Stephen Warren wrote:
>>> From: Stephen Warren <swarren@nvidia.com>
>>>
>>> In mmc_do_calc_max_discard(), if only a single erase block can be
>>> discarded within the host controller's timeout, don't allow discard
>>> operations at all.
>>>
>>> Previously, the code allowed sector-at-a-time discard (rather than
>>> erase-block-at-a-time), which was chronically slow.
>>>
>>> Without this patch, on the NVIDIA Tegra Cardhu board, the loops result
>>> in qty == 1, which is immediately returned. This causes discard to
>>> operate a single sector at a time, which is chronically slow. With this
>>> patch in place, discard operates a single erase block at a time, which
>>> is reasonably fast.
>>
>> Alternatively, is the real fix a revert of e056a1b5b67b "mmc: queue: let
>> host controllers specify maximum discard timeout", followed by:
>>
>>> diff --git a/drivers/mmc/core/core.c b/drivers/mmc/core/core.c
>>> index 050eb262485c..35c5b5d86c99 100644
>>> --- a/drivers/mmc/core/core.c
>>> +++ b/drivers/mmc/core/core.c
>>> @@ -1950,7 +1950,6 @@ static int mmc_do_erase(struct mmc_card *card, unsigned int from,
>>>         cmd.opcode = MMC_ERASE;
>>>         cmd.arg = arg;
>>>         cmd.flags = MMC_RSP_SPI_R1B | MMC_RSP_R1B | MMC_CMD_AC;
>>> -       cmd.cmd_timeout_ms = mmc_erase_timeout(card, arg, qty);
>>>         err = mmc_wait_for_cmd(card->host, &cmd, 0);
>>>         if (err) {
>>>                 pr_err("mmc_erase: erase error %d, status %#x\n",
>>> @@ -1962,7 +1961,7 @@ static int mmc_do_erase(struct mmc_card *card, unsigned int from,
>>>         if (mmc_host_is_spi(card->host))
>>>                 goto out;
>>>  
>>> -       timeout = jiffies + msecs_to_jiffies(MMC_CORE_TIMEOUT_MS);
>>> +       timeout = jiffies + msecs_to_jiffies(mmc_erase_timeout(card, arg, qty));
>>>         do {
>>>                 memset(&cmd, 0, sizeof(struct mmc_command));
>>>                 cmd.opcode = MMC_SEND_STATUS;
>>
>> That certainly also seems to solve the problem on my board...
> 
> But large erases will timeout when they should have been split into smaller
> chunks.
> 
> A generic solution needs to be able to explain what happens when the host
> controller *does* timeout.

I thought the whole point of this discussion was that the timeout in the
first code segment above only applies to command submission, whereas
completion of the erase operation itself was determined by polling using
CMD13, which still uses mmc_erase_timeout(). As such, I don't /think/
that "large erases will timeout" here, unless I'm significantly
misunderstanding something?
--
To unsubscribe from this list: send the line "unsubscribe linux-mmc" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Stephen Warren Dec. 19, 2013, 7:15 p.m. UTC | #16

On 12/19/2013 02:05 AM, Dong Aisheng wrote:
> On Thu, Dec 19, 2013 at 7:00 AM, Stephen Warren <swarren@wwwdotorg.org> wrote:
>> On 12/18/2013 03:27 PM, Stephen Warren wrote:
>>> From: Stephen Warren <swarren@nvidia.com>
>>>
>>> In mmc_do_calc_max_discard(), if only a single erase block can be
>>> discarded within the host controller's timeout, don't allow discard
>>> operations at all.
>>>
>>> Previously, the code allowed sector-at-a-time discard (rather than
>>> erase-block-at-a-time), which was chronically slow.
>>>
>>> Without this patch, on the NVIDIA Tegra Cardhu board, the loops result
>>> in qty == 1, which is immediately returned. This causes discard to
>>> operate a single sector at a time, which is chronically slow. With this
>>> patch in place, discard operates a single erase block at a time, which
>>> is reasonably fast.
>>
>> Alternatively, is the real fix a revert of e056a1b5b67b "mmc: queue: let
>> host controllers specify maximum discard timeout", followed by:
>>
>>> diff --git a/drivers/mmc/core/core.c b/drivers/mmc/core/core.c
>>> index 050eb262485c..35c5b5d86c99 100644
>>> --- a/drivers/mmc/core/core.c
>>> +++ b/drivers/mmc/core/core.c
>>> @@ -1950,7 +1950,6 @@ static int mmc_do_erase(struct mmc_card *card, unsigned int from,
>>>         cmd.opcode = MMC_ERASE;
>>>         cmd.arg = arg;
>>>         cmd.flags = MMC_RSP_SPI_R1B | MMC_RSP_R1B | MMC_CMD_AC;
>>> -       cmd.cmd_timeout_ms = mmc_erase_timeout(card, arg, qty);
>>>         err = mmc_wait_for_cmd(card->host, &cmd, 0);
>>>         if (err) {
>>>                 pr_err("mmc_erase: erase error %d, status %#x\n",
>>> @@ -1962,7 +1961,7 @@ static int mmc_do_erase(struct mmc_card *card, unsigned int from,
>>>         if (mmc_host_is_spi(card->host))
>>>                 goto out;
>>>
>>> -       timeout = jiffies + msecs_to_jiffies(MMC_CORE_TIMEOUT_MS);
>>> +       timeout = jiffies + msecs_to_jiffies(mmc_erase_timeout(card, arg, qty));
>>>         do {
>>>                 memset(&cmd, 0, sizeof(struct mmc_command));
>>>                 cmd.opcode = MMC_SEND_STATUS;
>>
>> That certainly also seems to solve the problem on my board...
> 
> Is the change you mean here only include above 3 lines changes?
> If yes, it's strange to me how does this solve your issue?
> It actually does not change the max_discard_to/max_discard_bytes.
> It still discards only one sector one time if it does as before....

It's a revert of the patch which introduced max_discard_to, so all that
logic goes away completely, *then* the 3 lines above, which move the
timeout away from command submission (which is what I believe is
implemented by the controller's timeout HW) and to the CMD13 polling
operation (where we can implement whatever timeout we want, in the kernel).
--
To unsubscribe from this list: send the line "unsubscribe linux-mmc" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Adrian Hunter Dec. 20, 2013, 7:17 a.m. UTC | #17

On 19/12/13 21:11, Stephen Warren wrote:
> On 12/19/2013 02:01 AM, Adrian Hunter wrote:
>> On 19/12/13 01:00, Stephen Warren wrote:
>>> On 12/18/2013 03:27 PM, Stephen Warren wrote:
>>>> From: Stephen Warren <swarren@nvidia.com>
>>>>
>>>> In mmc_do_calc_max_discard(), if only a single erase block can be
>>>> discarded within the host controller's timeout, don't allow discard
>>>> operations at all.
>>>>
>>>> Previously, the code allowed sector-at-a-time discard (rather than
>>>> erase-block-at-a-time), which was chronically slow.
>>>>
>>>> Without this patch, on the NVIDIA Tegra Cardhu board, the loops result
>>>> in qty == 1, which is immediately returned. This causes discard to
>>>> operate a single sector at a time, which is chronically slow. With this
>>>> patch in place, discard operates a single erase block at a time, which
>>>> is reasonably fast.
>>>
>>> Alternatively, is the real fix a revert of e056a1b5b67b "mmc: queue: let
>>> host controllers specify maximum discard timeout", followed by:
>>>
>>>> diff --git a/drivers/mmc/core/core.c b/drivers/mmc/core/core.c
>>>> index 050eb262485c..35c5b5d86c99 100644
>>>> --- a/drivers/mmc/core/core.c
>>>> +++ b/drivers/mmc/core/core.c
>>>> @@ -1950,7 +1950,6 @@ static int mmc_do_erase(struct mmc_card *card, unsigned int from,
>>>>         cmd.opcode = MMC_ERASE;
>>>>         cmd.arg = arg;
>>>>         cmd.flags = MMC_RSP_SPI_R1B | MMC_RSP_R1B | MMC_CMD_AC;
>>>> -       cmd.cmd_timeout_ms = mmc_erase_timeout(card, arg, qty);
>>>>         err = mmc_wait_for_cmd(card->host, &cmd, 0);
>>>>         if (err) {
>>>>                 pr_err("mmc_erase: erase error %d, status %#x\n",
>>>> @@ -1962,7 +1961,7 @@ static int mmc_do_erase(struct mmc_card *card, unsigned int from,
>>>>         if (mmc_host_is_spi(card->host))
>>>>                 goto out;
>>>>  
>>>> -       timeout = jiffies + msecs_to_jiffies(MMC_CORE_TIMEOUT_MS);
>>>> +       timeout = jiffies + msecs_to_jiffies(mmc_erase_timeout(card, arg, qty));
>>>>         do {
>>>>                 memset(&cmd, 0, sizeof(struct mmc_command));
>>>>                 cmd.opcode = MMC_SEND_STATUS;
>>>
>>> That certainly also seems to solve the problem on my board...
>>
>> But large erases will timeout when they should have been split into smaller
>> chunks.
>>
>> A generic solution needs to be able to explain what happens when the host
>> controller *does* timeout.
> 
> I thought the whole point of this discussion was that the timeout in the
> first code segment above only applies to command submission, whereas
> completion of the erase operation itself was determined by polling using
> CMD13, which still uses mmc_erase_timeout(). As such, I don't /think/
> that "large erases will timeout" here, unless I'm significantly
> misunderstanding something?

Setting cmd_timeout_ms to zero just makes the sdhci driver choose the
highest timeout.

Currently the sdhci driver does not have a facility to disable the timeout,
and, as I noted in another thread, masking the "Data Timeout Error Status
Enable" and "Data Timeout Error Signal Enable" bits did not work for me.
What Ulf is suggesting is to change the response type from R1B to R1.

But there is also a need to avoid polling when it is not necessary and to
make it periodic.

Ulf said he is working on a patch to doing something similar with mmc_switch.

--
To unsubscribe from this list: send the line "unsubscribe linux-mmc" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

mmc: core: don't return 1 for max_discard

Commit Message

Comments

Patch