diff mbox series

[v2,2/2] mmc: sdhci-omap: Workaround errata regarding SDR104/HS200 tuning failures (i929)

Message ID 20181211142253.23747-3-faiz_abbas@ti.com (mailing list archive)
State New, archived
Headers show
Series Workaround errata i929 | expand

Commit Message

Faiz Abbas Dec. 11, 2018, 2:22 p.m. UTC
Errata i929 in certain OMAP5/DRA7XX/AM57XX silicon revisions
(SPRZ426D - November 2014 - Revised February 2018 [1]) mentions
unexpected tuning pattern errors. A small failure band may be present
in the tuning range which may be missed by the current algorithm.
Furthermore, the failure bands vary with temperature leading to
different optimum tuning values for different temperatures.

As suggested in the related Application Report (SPRACA9B - October 2017
- Revised July 2018 [2]), tuning should be done in two stages.
In stage 1, assign the optimum ratio in the maximum pass window for the
current temperature. In stage 2, if the chosen value is close to the
small failure band, move away from it in the appropriate direction.

References:
[1] http://www.ti.com/lit/pdf/sprz426
[2] http://www.ti.com/lit/pdf/SPRACA9

Signed-off-by: Faiz Abbas <faiz_abbas@ti.com>
Acked-by: Adrian Hunter <adrian.hunter@intel.com>
---
 drivers/mmc/host/Kconfig      |  2 +
 drivers/mmc/host/sdhci-omap.c | 90 ++++++++++++++++++++++++++++++++++-
 2 files changed, 91 insertions(+), 1 deletion(-)

Comments

Ulf Hansson Dec. 12, 2018, 9:19 a.m. UTC | #1
+ Thermal maintainers

On Tue, 11 Dec 2018 at 15:20, Faiz Abbas <faiz_abbas@ti.com> wrote:
>
> Errata i929 in certain OMAP5/DRA7XX/AM57XX silicon revisions
> (SPRZ426D - November 2014 - Revised February 2018 [1]) mentions
> unexpected tuning pattern errors. A small failure band may be present
> in the tuning range which may be missed by the current algorithm.
> Furthermore, the failure bands vary with temperature leading to
> different optimum tuning values for different temperatures.
>
> As suggested in the related Application Report (SPRACA9B - October 2017
> - Revised July 2018 [2]), tuning should be done in two stages.
> In stage 1, assign the optimum ratio in the maximum pass window for the
> current temperature. In stage 2, if the chosen value is close to the
> small failure band, move away from it in the appropriate direction.
>
> References:
> [1] http://www.ti.com/lit/pdf/sprz426
> [2] http://www.ti.com/lit/pdf/SPRACA9
>
> Signed-off-by: Faiz Abbas <faiz_abbas@ti.com>
> Acked-by: Adrian Hunter <adrian.hunter@intel.com>
> ---
>  drivers/mmc/host/Kconfig      |  2 +
>  drivers/mmc/host/sdhci-omap.c | 90 ++++++++++++++++++++++++++++++++++-
>  2 files changed, 91 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/mmc/host/Kconfig b/drivers/mmc/host/Kconfig
> index 5fa580cec831..d8f984483ab0 100644
> --- a/drivers/mmc/host/Kconfig
> +++ b/drivers/mmc/host/Kconfig
> @@ -977,6 +977,8 @@ config MMC_SDHCI_XENON
>  config MMC_SDHCI_OMAP
>         tristate "TI SDHCI Controller Support"
>         depends on MMC_SDHCI_PLTFM && OF
> +       select THERMAL
> +       select TI_SOC_THERMAL
>         help
>           This selects the Secure Digital Host Controller Interface (SDHCI)
>           support present in TI's DRA7 SOCs. The controller supports
> diff --git a/drivers/mmc/host/sdhci-omap.c b/drivers/mmc/host/sdhci-omap.c
> index f588ab679cb0..b75c55011fcb 100644
> --- a/drivers/mmc/host/sdhci-omap.c
> +++ b/drivers/mmc/host/sdhci-omap.c
> @@ -27,6 +27,7 @@
>  #include <linux/regulator/consumer.h>
>  #include <linux/pinctrl/consumer.h>
>  #include <linux/sys_soc.h>
> +#include <linux/thermal.h>
>
>  #include "sdhci-pltfm.h"
>
> @@ -286,15 +287,19 @@ static int sdhci_omap_execute_tuning(struct mmc_host *mmc, u32 opcode)
>         struct sdhci_host *host = mmc_priv(mmc);
>         struct sdhci_pltfm_host *pltfm_host = sdhci_priv(host);
>         struct sdhci_omap_host *omap_host = sdhci_pltfm_priv(pltfm_host);
> +       struct thermal_zone_device *thermal_dev;
>         struct device *dev = omap_host->dev;
>         struct mmc_ios *ios = &mmc->ios;
>         u32 start_window = 0, max_window = 0;
> +       bool single_point_failure = false;
>         bool dcrc_was_enabled = false;
>         u8 cur_match, prev_match = 0;
>         u32 length = 0, max_len = 0;
>         u32 phase_delay = 0;
> +       int temperature;
>         int ret = 0;
>         u32 reg;
> +       int i;
>
>         /* clock tuning is not needed for upto 52MHz */
>         if (ios->clock <= 52000000)
> @@ -304,6 +309,16 @@ static int sdhci_omap_execute_tuning(struct mmc_host *mmc, u32 opcode)
>         if (ios->timing == MMC_TIMING_UHS_SDR50 && !(reg & CAPA2_TSDR50))
>                 return 0;
>
> +       thermal_dev = thermal_zone_get_zone_by_name("cpu_thermal");

I couldn't find a corresponding call to a put function, like
"thermal_zone_put()" or whatever, which made me realize that the
thermal zone API is incomplete. Or depending on how you put it, it
lacks object reference counting, unless I am missing something.

For example, what happens if the thermal zone becomes unregistered
between this point and when you call thermal_zone_get_temp() a couple
of line below. I assume it's a known problem, but just wanted to point
it out.

> +       if (IS_ERR(thermal_dev)) {
> +               dev_err(dev, "Unable to get thermal zone for tuning\n");
> +               return PTR_ERR(thermal_dev);
> +       }
> +
> +       ret = thermal_zone_get_temp(thermal_dev, &temperature);
> +       if (ret)
> +               return ret;
> +

[...]

Anyway, I have applied this for next, thanks!

Kind regards
Uffe
Olof Johansson Jan. 2, 2019, 6:29 p.m. UTC | #2
Hi,


On Wed, Dec 12, 2018 at 1:20 AM Ulf Hansson <ulf.hansson@linaro.org> wrote:
>
> + Thermal maintainers
>
> On Tue, 11 Dec 2018 at 15:20, Faiz Abbas <faiz_abbas@ti.com> wrote:
> >
> > Errata i929 in certain OMAP5/DRA7XX/AM57XX silicon revisions
> > (SPRZ426D - November 2014 - Revised February 2018 [1]) mentions
> > unexpected tuning pattern errors. A small failure band may be present
> > in the tuning range which may be missed by the current algorithm.
> > Furthermore, the failure bands vary with temperature leading to
> > different optimum tuning values for different temperatures.
> >
> > As suggested in the related Application Report (SPRACA9B - October 2017
> > - Revised July 2018 [2]), tuning should be done in two stages.
> > In stage 1, assign the optimum ratio in the maximum pass window for the
> > current temperature. In stage 2, if the chosen value is close to the
> > small failure band, move away from it in the appropriate direction.
> >
> > References:
> > [1] http://www.ti.com/lit/pdf/sprz426
> > [2] http://www.ti.com/lit/pdf/SPRACA9
> >
> > Signed-off-by: Faiz Abbas <faiz_abbas@ti.com>
> > Acked-by: Adrian Hunter <adrian.hunter@intel.com>
> > ---
> >  drivers/mmc/host/Kconfig      |  2 +
> >  drivers/mmc/host/sdhci-omap.c | 90 ++++++++++++++++++++++++++++++++++-
> >  2 files changed, 91 insertions(+), 1 deletion(-)
> >
> > diff --git a/drivers/mmc/host/Kconfig b/drivers/mmc/host/Kconfig
> > index 5fa580cec831..d8f984483ab0 100644
> > --- a/drivers/mmc/host/Kconfig
> > +++ b/drivers/mmc/host/Kconfig
> > @@ -977,6 +977,8 @@ config MMC_SDHCI_XENON
> >  config MMC_SDHCI_OMAP
> >         tristate "TI SDHCI Controller Support"
> >         depends on MMC_SDHCI_PLTFM && OF
> > +       select THERMAL
> > +       select TI_SOC_THERMAL
> >         help
> >           This selects the Secure Digital Host Controller Interface (SDHCI)
> >           support present in TI's DRA7 SOCs. The controller supports
> > diff --git a/drivers/mmc/host/sdhci-omap.c b/drivers/mmc/host/sdhci-omap.c
> > index f588ab679cb0..b75c55011fcb 100644
> > --- a/drivers/mmc/host/sdhci-omap.c
> > +++ b/drivers/mmc/host/sdhci-omap.c
> > @@ -27,6 +27,7 @@
> >  #include <linux/regulator/consumer.h>
> >  #include <linux/pinctrl/consumer.h>
> >  #include <linux/sys_soc.h>
> > +#include <linux/thermal.h>
> >
> >  #include "sdhci-pltfm.h"
> >
> > @@ -286,15 +287,19 @@ static int sdhci_omap_execute_tuning(struct mmc_host *mmc, u32 opcode)
> >         struct sdhci_host *host = mmc_priv(mmc);
> >         struct sdhci_pltfm_host *pltfm_host = sdhci_priv(host);
> >         struct sdhci_omap_host *omap_host = sdhci_pltfm_priv(pltfm_host);
> > +       struct thermal_zone_device *thermal_dev;
> >         struct device *dev = omap_host->dev;
> >         struct mmc_ios *ios = &mmc->ios;
> >         u32 start_window = 0, max_window = 0;
> > +       bool single_point_failure = false;
> >         bool dcrc_was_enabled = false;
> >         u8 cur_match, prev_match = 0;
> >         u32 length = 0, max_len = 0;
> >         u32 phase_delay = 0;
> > +       int temperature;
> >         int ret = 0;
> >         u32 reg;
> > +       int i;
> >
> >         /* clock tuning is not needed for upto 52MHz */
> >         if (ios->clock <= 52000000)
> > @@ -304,6 +309,16 @@ static int sdhci_omap_execute_tuning(struct mmc_host *mmc, u32 opcode)
> >         if (ios->timing == MMC_TIMING_UHS_SDR50 && !(reg & CAPA2_TSDR50))
> >                 return 0;
> >
> > +       thermal_dev = thermal_zone_get_zone_by_name("cpu_thermal");
>
> I couldn't find a corresponding call to a put function, like
> "thermal_zone_put()" or whatever, which made me realize that the
> thermal zone API is incomplete. Or depending on how you put it, it
> lacks object reference counting, unless I am missing something.
>
> For example, what happens if the thermal zone becomes unregistered
> between this point and when you call thermal_zone_get_temp() a couple
> of line below. I assume it's a known problem, but just wanted to point
> it out.
>
> > +       if (IS_ERR(thermal_dev)) {
> > +               dev_err(dev, "Unable to get thermal zone for tuning\n");
> > +               return PTR_ERR(thermal_dev);
> > +       }
> > +
> > +       ret = thermal_zone_get_temp(thermal_dev, &temperature);
> > +       if (ret)
> > +               return ret;
> > +
>
> [...]
>
> Anyway, I have applied this for next, thanks!

This is throwing errors on builds of keystone_defconfig in next and mainline:

http://arm-soc.lixom.net/buildlogs/next/next-20190102/buildall.arm.keystone_defconfig.log.passed

WARNING: unmet direct dependencies detected for TI_SOC_THERMAL
  Depends on [n]: THERMAL [=y] && (ARCH_HAS_BANDGAP [=n] ||
COMPILE_TEST [=n]) && HAS_IOMEM [=y]
  Selected by [y]:
  - MMC_SDHCI_OMAP [=y] && MMC [=y] && MMC_SDHCI_PLTFM [=y] && OF [=y]

So, thermal depends on ARCH_HAS_BANDGAP, which keystone doesn't provide.

Selecting a major framework such as THERMAL from a driver config is
likely not the right solution anyway, especially since THERMAL does
provide stubbed out versions of the functions if it's not enabled.


-Olof
Eduardo Valentin Jan. 2, 2019, 7:56 p.m. UTC | #3
On Wed, Jan 02, 2019 at 10:29:31AM -0800, Olof Johansson wrote:
> Hi,
> 
> 
> On Wed, Dec 12, 2018 at 1:20 AM Ulf Hansson <ulf.hansson@linaro.org> wrote:
> >
> > + Thermal maintainers
> >
> > On Tue, 11 Dec 2018 at 15:20, Faiz Abbas <faiz_abbas@ti.com> wrote:
> > >
> > > Errata i929 in certain OMAP5/DRA7XX/AM57XX silicon revisions
> > > (SPRZ426D - November 2014 - Revised February 2018 [1]) mentions
> > > unexpected tuning pattern errors. A small failure band may be present
> > > in the tuning range which may be missed by the current algorithm.
> > > Furthermore, the failure bands vary with temperature leading to
> > > different optimum tuning values for different temperatures.
> > >
> > > As suggested in the related Application Report (SPRACA9B - October 2017
> > > - Revised July 2018 [2]), tuning should be done in two stages.
> > > In stage 1, assign the optimum ratio in the maximum pass window for the
> > > current temperature. In stage 2, if the chosen value is close to the
> > > small failure band, move away from it in the appropriate direction.
> > >
> > > References:
> > > [1] http://www.ti.com/lit/pdf/sprz426
> > > [2] http://www.ti.com/lit/pdf/SPRACA9
> > >
> > > Signed-off-by: Faiz Abbas <faiz_abbas@ti.com>
> > > Acked-by: Adrian Hunter <adrian.hunter@intel.com>
> > > ---
> > >  drivers/mmc/host/Kconfig      |  2 +
> > >  drivers/mmc/host/sdhci-omap.c | 90 ++++++++++++++++++++++++++++++++++-
> > >  2 files changed, 91 insertions(+), 1 deletion(-)
> > >
> > > diff --git a/drivers/mmc/host/Kconfig b/drivers/mmc/host/Kconfig
> > > index 5fa580cec831..d8f984483ab0 100644
> > > --- a/drivers/mmc/host/Kconfig
> > > +++ b/drivers/mmc/host/Kconfig
> > > @@ -977,6 +977,8 @@ config MMC_SDHCI_XENON
> > >  config MMC_SDHCI_OMAP
> > >         tristate "TI SDHCI Controller Support"
> > >         depends on MMC_SDHCI_PLTFM && OF
> > > +       select THERMAL
> > > +       select TI_SOC_THERMAL
> > >         help
> > >           This selects the Secure Digital Host Controller Interface (SDHCI)
> > >           support present in TI's DRA7 SOCs. The controller supports
> > > diff --git a/drivers/mmc/host/sdhci-omap.c b/drivers/mmc/host/sdhci-omap.c
> > > index f588ab679cb0..b75c55011fcb 100644
> > > --- a/drivers/mmc/host/sdhci-omap.c
> > > +++ b/drivers/mmc/host/sdhci-omap.c
> > > @@ -27,6 +27,7 @@
> > >  #include <linux/regulator/consumer.h>
> > >  #include <linux/pinctrl/consumer.h>
> > >  #include <linux/sys_soc.h>
> > > +#include <linux/thermal.h>
> > >
> > >  #include "sdhci-pltfm.h"
> > >
> > > @@ -286,15 +287,19 @@ static int sdhci_omap_execute_tuning(struct mmc_host *mmc, u32 opcode)
> > >         struct sdhci_host *host = mmc_priv(mmc);
> > >         struct sdhci_pltfm_host *pltfm_host = sdhci_priv(host);
> > >         struct sdhci_omap_host *omap_host = sdhci_pltfm_priv(pltfm_host);
> > > +       struct thermal_zone_device *thermal_dev;
> > >         struct device *dev = omap_host->dev;
> > >         struct mmc_ios *ios = &mmc->ios;
> > >         u32 start_window = 0, max_window = 0;
> > > +       bool single_point_failure = false;
> > >         bool dcrc_was_enabled = false;
> > >         u8 cur_match, prev_match = 0;
> > >         u32 length = 0, max_len = 0;
> > >         u32 phase_delay = 0;
> > > +       int temperature;
> > >         int ret = 0;
> > >         u32 reg;
> > > +       int i;
> > >
> > >         /* clock tuning is not needed for upto 52MHz */
> > >         if (ios->clock <= 52000000)
> > > @@ -304,6 +309,16 @@ static int sdhci_omap_execute_tuning(struct mmc_host *mmc, u32 opcode)
> > >         if (ios->timing == MMC_TIMING_UHS_SDR50 && !(reg & CAPA2_TSDR50))
> > >                 return 0;
> > >
> > > +       thermal_dev = thermal_zone_get_zone_by_name("cpu_thermal");
> >
> > I couldn't find a corresponding call to a put function, like
> > "thermal_zone_put()" or whatever, which made me realize that the
> > thermal zone API is incomplete. Or depending on how you put it, it
> > lacks object reference counting, unless I am missing something.
> >
> > For example, what happens if the thermal zone becomes unregistered
> > between this point and when you call thermal_zone_get_temp() a couple
> > of line below. I assume it's a known problem, but just wanted to point
> > it out.
> >

Yes, there is no ref counting. Specially because the get zones usages
were too specific, and mostly used in application cases that module
would not really be removed. Though not a good excuse, still, not very
problematic. Now, if the API is getting other usages, then refcounting
may be necessary.

> > > +       if (IS_ERR(thermal_dev)) {
> > > +               dev_err(dev, "Unable to get thermal zone for tuning\n");
> > > +               return PTR_ERR(thermal_dev);
> > > +       }
> > > +
> > > +       ret = thermal_zone_get_temp(thermal_dev, &temperature);
> > > +       if (ret)
> > > +               return ret;
> > > +
> >
> > [...]
> >
> > Anyway, I have applied this for next, thanks!
> 
> This is throwing errors on builds of keystone_defconfig in next and mainline:
> 
> http://arm-soc.lixom.net/buildlogs/next/next-20190102/buildall.arm.keystone_defconfig.log.passed
> 
> WARNING: unmet direct dependencies detected for TI_SOC_THERMAL
>   Depends on [n]: THERMAL [=y] && (ARCH_HAS_BANDGAP [=n] ||
> COMPILE_TEST [=n]) && HAS_IOMEM [=y]
>   Selected by [y]:
>   - MMC_SDHCI_OMAP [=y] && MMC [=y] && MMC_SDHCI_PLTFM [=y] && OF [=y]
> 
> So, thermal depends on ARCH_HAS_BANDGAP, which keystone doesn't provide.
> 
> Selecting a major framework such as THERMAL from a driver config is
> likely not the right solution anyway, especially since THERMAL does
> provide stubbed out versions of the functions if it's not enabled.

Yeah, that seams a bit up-side-down. Can you guys give a bit more of
context? Why do you need the cpu thermal zone ? From patch description,
looks like you want to have your own zone then apply different tuning
values based on temperature (range?). Why do you need to mess up with
cpu_thermal zone? Don't you have a bandgap in the mem controller for
this application?

> 
> 
> -Olof
Faiz Abbas Jan. 3, 2019, 6:01 a.m. UTC | #4
Hi Olof, Eduardo,

On 03/01/19 1:26 AM, Eduardo Valentin wrote:
> On Wed, Jan 02, 2019 at 10:29:31AM -0800, Olof Johansson wrote:
>> Hi,
>>
>>
>> On Wed, Dec 12, 2018 at 1:20 AM Ulf Hansson <ulf.hansson@linaro.org> wrote:
>>>
>>> + Thermal maintainers
>>>
>>> On Tue, 11 Dec 2018 at 15:20, Faiz Abbas <faiz_abbas@ti.com> wrote:
>>>>
>>>> Errata i929 in certain OMAP5/DRA7XX/AM57XX silicon revisions
>>>> (SPRZ426D - November 2014 - Revised February 2018 [1]) mentions
>>>> unexpected tuning pattern errors. A small failure band may be present
>>>> in the tuning range which may be missed by the current algorithm.
>>>> Furthermore, the failure bands vary with temperature leading to
>>>> different optimum tuning values for different temperatures.
>>>>
>>>> As suggested in the related Application Report (SPRACA9B - October 2017
>>>> - Revised July 2018 [2]), tuning should be done in two stages.
>>>> In stage 1, assign the optimum ratio in the maximum pass window for the
>>>> current temperature. In stage 2, if the chosen value is close to the
>>>> small failure band, move away from it in the appropriate direction.
>>>>
>>>> References:
>>>> [1] http://www.ti.com/lit/pdf/sprz426
>>>> [2] http://www.ti.com/lit/pdf/SPRACA9
>>>>
>>>> Signed-off-by: Faiz Abbas <faiz_abbas@ti.com>
>>>> Acked-by: Adrian Hunter <adrian.hunter@intel.com>
>>>> ---
>>>>  drivers/mmc/host/Kconfig      |  2 +
>>>>  drivers/mmc/host/sdhci-omap.c | 90 ++++++++++++++++++++++++++++++++++-
>>>>  2 files changed, 91 insertions(+), 1 deletion(-)
>>>>
>>>> diff --git a/drivers/mmc/host/Kconfig b/drivers/mmc/host/Kconfig
>>>> index 5fa580cec831..d8f984483ab0 100644
>>>> --- a/drivers/mmc/host/Kconfig
>>>> +++ b/drivers/mmc/host/Kconfig
>>>> @@ -977,6 +977,8 @@ config MMC_SDHCI_XENON
>>>>  config MMC_SDHCI_OMAP
>>>>         tristate "TI SDHCI Controller Support"
>>>>         depends on MMC_SDHCI_PLTFM && OF
>>>> +       select THERMAL
>>>> +       select TI_SOC_THERMAL
>>>>         help
>>>>           This selects the Secure Digital Host Controller Interface (SDHCI)
>>>>           support present in TI's DRA7 SOCs. The controller supports
>>>> diff --git a/drivers/mmc/host/sdhci-omap.c b/drivers/mmc/host/sdhci-omap.c
>>>> index f588ab679cb0..b75c55011fcb 100644
>>>> --- a/drivers/mmc/host/sdhci-omap.c
>>>> +++ b/drivers/mmc/host/sdhci-omap.c
>>>> @@ -27,6 +27,7 @@
>>>>  #include <linux/regulator/consumer.h>
>>>>  #include <linux/pinctrl/consumer.h>
>>>>  #include <linux/sys_soc.h>
>>>> +#include <linux/thermal.h>
>>>>
>>>>  #include "sdhci-pltfm.h"
>>>>
>>>> @@ -286,15 +287,19 @@ static int sdhci_omap_execute_tuning(struct mmc_host *mmc, u32 opcode)
>>>>         struct sdhci_host *host = mmc_priv(mmc);
>>>>         struct sdhci_pltfm_host *pltfm_host = sdhci_priv(host);
>>>>         struct sdhci_omap_host *omap_host = sdhci_pltfm_priv(pltfm_host);
>>>> +       struct thermal_zone_device *thermal_dev;
>>>>         struct device *dev = omap_host->dev;
>>>>         struct mmc_ios *ios = &mmc->ios;
>>>>         u32 start_window = 0, max_window = 0;
>>>> +       bool single_point_failure = false;
>>>>         bool dcrc_was_enabled = false;
>>>>         u8 cur_match, prev_match = 0;
>>>>         u32 length = 0, max_len = 0;
>>>>         u32 phase_delay = 0;
>>>> +       int temperature;
>>>>         int ret = 0;
>>>>         u32 reg;
>>>> +       int i;
>>>>
>>>>         /* clock tuning is not needed for upto 52MHz */
>>>>         if (ios->clock <= 52000000)
>>>> @@ -304,6 +309,16 @@ static int sdhci_omap_execute_tuning(struct mmc_host *mmc, u32 opcode)
>>>>         if (ios->timing == MMC_TIMING_UHS_SDR50 && !(reg & CAPA2_TSDR50))
>>>>                 return 0;
>>>>
>>>> +       thermal_dev = thermal_zone_get_zone_by_name("cpu_thermal");
>>>
>>> I couldn't find a corresponding call to a put function, like
>>> "thermal_zone_put()" or whatever, which made me realize that the
>>> thermal zone API is incomplete. Or depending on how you put it, it
>>> lacks object reference counting, unless I am missing something.
>>>
>>> For example, what happens if the thermal zone becomes unregistered
>>> between this point and when you call thermal_zone_get_temp() a couple
>>> of line below. I assume it's a known problem, but just wanted to point
>>> it out.
>>>
> 
> Yes, there is no ref counting. Specially because the get zones usages
> were too specific, and mostly used in application cases that module
> would not really be removed. Though not a good excuse, still, not very
> problematic. Now, if the API is getting other usages, then refcounting
> may be necessary.
> 
>>>> +       if (IS_ERR(thermal_dev)) {
>>>> +               dev_err(dev, "Unable to get thermal zone for tuning\n");
>>>> +               return PTR_ERR(thermal_dev);
>>>> +       }
>>>> +
>>>> +       ret = thermal_zone_get_temp(thermal_dev, &temperature);
>>>> +       if (ret)
>>>> +               return ret;
>>>> +
>>>
>>> [...]
>>>
>>> Anyway, I have applied this for next, thanks!
>>
>> This is throwing errors on builds of keystone_defconfig in next and mainline:
>>
>> http://arm-soc.lixom.net/buildlogs/next/next-20190102/buildall.arm.keystone_defconfig.log.passed
>>
>> WARNING: unmet direct dependencies detected for TI_SOC_THERMAL
>>   Depends on [n]: THERMAL [=y] && (ARCH_HAS_BANDGAP [=n] ||
>> COMPILE_TEST [=n]) && HAS_IOMEM [=y]
>>   Selected by [y]:
>>   - MMC_SDHCI_OMAP [=y] && MMC [=y] && MMC_SDHCI_PLTFM [=y] && OF [=y]
>>
>> So, thermal depends on ARCH_HAS_BANDGAP, which keystone doesn't provide.
>>
>> Selecting a major framework such as THERMAL from a driver config is
>> likely not the right solution anyway, especially since THERMAL does
>> provide stubbed out versions of the functions if it's not enabled.
> 
> Yeah, that seams a bit up-side-down. Can you guys give a bit more of
> context? Why do you need the cpu thermal zone ? From patch description,
> looks like you want to have your own zone then apply different tuning
> values based on temperature (range?). Why do you need to mess up with
> cpu_thermal zone? Don't you have a bandgap in the mem controller for
> this application?
> 

Thats correct. We don't have a bandgap in the MMC controller and thus we
have to use the cpu one to measure temperature.

THERMAL is critical for tuning. The interface is supposed to fail if we
can't get temperature. So IMO we should ensure that it is present.

I can fix this by "select TI_SOC_THERMAL if ARCH_HAS_BANDGAP" if you
guys agree.
Thanks,
Faiz
Olof Johansson Jan. 5, 2019, 12:57 a.m. UTC | #5
On Wed, Jan 2, 2019 at 9:58 PM Faiz Abbas <faiz_abbas@ti.com> wrote:
>
> Hi Olof, Eduardo,
>
> On 03/01/19 1:26 AM, Eduardo Valentin wrote:
> > On Wed, Jan 02, 2019 at 10:29:31AM -0800, Olof Johansson wrote:
> >> Hi,
> >>
> >>
> >> On Wed, Dec 12, 2018 at 1:20 AM Ulf Hansson <ulf.hansson@linaro.org> wrote:
> >>>
> >>> + Thermal maintainers
> >>>
> >>> On Tue, 11 Dec 2018 at 15:20, Faiz Abbas <faiz_abbas@ti.com> wrote:
> >>>>
> >>>> Errata i929 in certain OMAP5/DRA7XX/AM57XX silicon revisions
> >>>> (SPRZ426D - November 2014 - Revised February 2018 [1]) mentions
> >>>> unexpected tuning pattern errors. A small failure band may be present
> >>>> in the tuning range which may be missed by the current algorithm.
> >>>> Furthermore, the failure bands vary with temperature leading to
> >>>> different optimum tuning values for different temperatures.
> >>>>
> >>>> As suggested in the related Application Report (SPRACA9B - October 2017
> >>>> - Revised July 2018 [2]), tuning should be done in two stages.
> >>>> In stage 1, assign the optimum ratio in the maximum pass window for the
> >>>> current temperature. In stage 2, if the chosen value is close to the
> >>>> small failure band, move away from it in the appropriate direction.
> >>>>
> >>>> References:
> >>>> [1] http://www.ti.com/lit/pdf/sprz426
> >>>> [2] http://www.ti.com/lit/pdf/SPRACA9
> >>>>
> >>>> Signed-off-by: Faiz Abbas <faiz_abbas@ti.com>
> >>>> Acked-by: Adrian Hunter <adrian.hunter@intel.com>
> >>>> ---
> >>>>  drivers/mmc/host/Kconfig      |  2 +
> >>>>  drivers/mmc/host/sdhci-omap.c | 90 ++++++++++++++++++++++++++++++++++-
> >>>>  2 files changed, 91 insertions(+), 1 deletion(-)
> >>>>
> >>>> diff --git a/drivers/mmc/host/Kconfig b/drivers/mmc/host/Kconfig
> >>>> index 5fa580cec831..d8f984483ab0 100644
> >>>> --- a/drivers/mmc/host/Kconfig
> >>>> +++ b/drivers/mmc/host/Kconfig
> >>>> @@ -977,6 +977,8 @@ config MMC_SDHCI_XENON
> >>>>  config MMC_SDHCI_OMAP
> >>>>         tristate "TI SDHCI Controller Support"
> >>>>         depends on MMC_SDHCI_PLTFM && OF
> >>>> +       select THERMAL
> >>>> +       select TI_SOC_THERMAL
> >>>>         help
> >>>>           This selects the Secure Digital Host Controller Interface (SDHCI)
> >>>>           support present in TI's DRA7 SOCs. The controller supports
> >>>> diff --git a/drivers/mmc/host/sdhci-omap.c b/drivers/mmc/host/sdhci-omap.c
> >>>> index f588ab679cb0..b75c55011fcb 100644
> >>>> --- a/drivers/mmc/host/sdhci-omap.c
> >>>> +++ b/drivers/mmc/host/sdhci-omap.c
> >>>> @@ -27,6 +27,7 @@
> >>>>  #include <linux/regulator/consumer.h>
> >>>>  #include <linux/pinctrl/consumer.h>
> >>>>  #include <linux/sys_soc.h>
> >>>> +#include <linux/thermal.h>
> >>>>
> >>>>  #include "sdhci-pltfm.h"
> >>>>
> >>>> @@ -286,15 +287,19 @@ static int sdhci_omap_execute_tuning(struct mmc_host *mmc, u32 opcode)
> >>>>         struct sdhci_host *host = mmc_priv(mmc);
> >>>>         struct sdhci_pltfm_host *pltfm_host = sdhci_priv(host);
> >>>>         struct sdhci_omap_host *omap_host = sdhci_pltfm_priv(pltfm_host);
> >>>> +       struct thermal_zone_device *thermal_dev;
> >>>>         struct device *dev = omap_host->dev;
> >>>>         struct mmc_ios *ios = &mmc->ios;
> >>>>         u32 start_window = 0, max_window = 0;
> >>>> +       bool single_point_failure = false;
> >>>>         bool dcrc_was_enabled = false;
> >>>>         u8 cur_match, prev_match = 0;
> >>>>         u32 length = 0, max_len = 0;
> >>>>         u32 phase_delay = 0;
> >>>> +       int temperature;
> >>>>         int ret = 0;
> >>>>         u32 reg;
> >>>> +       int i;
> >>>>
> >>>>         /* clock tuning is not needed for upto 52MHz */
> >>>>         if (ios->clock <= 52000000)
> >>>> @@ -304,6 +309,16 @@ static int sdhci_omap_execute_tuning(struct mmc_host *mmc, u32 opcode)
> >>>>         if (ios->timing == MMC_TIMING_UHS_SDR50 && !(reg & CAPA2_TSDR50))
> >>>>                 return 0;
> >>>>
> >>>> +       thermal_dev = thermal_zone_get_zone_by_name("cpu_thermal");
> >>>
> >>> I couldn't find a corresponding call to a put function, like
> >>> "thermal_zone_put()" or whatever, which made me realize that the
> >>> thermal zone API is incomplete. Or depending on how you put it, it
> >>> lacks object reference counting, unless I am missing something.
> >>>
> >>> For example, what happens if the thermal zone becomes unregistered
> >>> between this point and when you call thermal_zone_get_temp() a couple
> >>> of line below. I assume it's a known problem, but just wanted to point
> >>> it out.
> >>>
> >
> > Yes, there is no ref counting. Specially because the get zones usages
> > were too specific, and mostly used in application cases that module
> > would not really be removed. Though not a good excuse, still, not very
> > problematic. Now, if the API is getting other usages, then refcounting
> > may be necessary.
> >
> >>>> +       if (IS_ERR(thermal_dev)) {
> >>>> +               dev_err(dev, "Unable to get thermal zone for tuning\n");
> >>>> +               return PTR_ERR(thermal_dev);
> >>>> +       }
> >>>> +
> >>>> +       ret = thermal_zone_get_temp(thermal_dev, &temperature);
> >>>> +       if (ret)
> >>>> +               return ret;
> >>>> +
> >>>
> >>> [...]
> >>>
> >>> Anyway, I have applied this for next, thanks!
> >>
> >> This is throwing errors on builds of keystone_defconfig in next and mainline:
> >>
> >> http://arm-soc.lixom.net/buildlogs/next/next-20190102/buildall.arm.keystone_defconfig.log.passed
> >>
> >> WARNING: unmet direct dependencies detected for TI_SOC_THERMAL
> >>   Depends on [n]: THERMAL [=y] && (ARCH_HAS_BANDGAP [=n] ||
> >> COMPILE_TEST [=n]) && HAS_IOMEM [=y]
> >>   Selected by [y]:
> >>   - MMC_SDHCI_OMAP [=y] && MMC [=y] && MMC_SDHCI_PLTFM [=y] && OF [=y]
> >>
> >> So, thermal depends on ARCH_HAS_BANDGAP, which keystone doesn't provide.
> >>
> >> Selecting a major framework such as THERMAL from a driver config is
> >> likely not the right solution anyway, especially since THERMAL does
> >> provide stubbed out versions of the functions if it's not enabled.
> >
> > Yeah, that seams a bit up-side-down. Can you guys give a bit more of
> > context? Why do you need the cpu thermal zone ? From patch description,
> > looks like you want to have your own zone then apply different tuning
> > values based on temperature (range?). Why do you need to mess up with
> > cpu_thermal zone? Don't you have a bandgap in the mem controller for
> > this application?
> >
>
> Thats correct. We don't have a bandgap in the MMC controller and thus we
> have to use the cpu one to measure temperature.
>
> THERMAL is critical for tuning. The interface is supposed to fail if we
> can't get temperature. So IMO we should ensure that it is present.
>
> I can fix this by "select TI_SOC_THERMAL if ARCH_HAS_BANDGAP" if you
> guys agree.

Building elaborate select statements is usually fragile, once
dependencies for TI_SOC_THERMAL changes you need to come back here to
fixup the select.

Supposedly this driver works on keystone (or does it?), it doesn't
actually need TI_SOC_THERMAL for basic functionality beyond tuning?
Or, at least, it needs to fall back to a reasonable behavior if it's
unavailable on keystone.

Having the driver print a warning and refuse to tune to higher speeds
is a reasonable way to do this, I think. That would carry to all
platforms, i.e. even the ones who have TI_SOC_THERMAL and
ARCH_HAS_BANDGAP, without adding the select.


-Olof
Faiz Abbas Jan. 7, 2019, 10:35 a.m. UTC | #6
Hi Olof,

On 05/01/19 6:27 AM, Olof Johansson wrote:
> On Wed, Jan 2, 2019 at 9:58 PM Faiz Abbas <faiz_abbas@ti.com> wrote:
>>
>> Hi Olof, Eduardo,
>>
>> On 03/01/19 1:26 AM, Eduardo Valentin wrote:
>>> On Wed, Jan 02, 2019 at 10:29:31AM -0800, Olof Johansson wrote:
>>>> Hi,
>>>>
>>>>
...
>>>>
>>>> This is throwing errors on builds of keystone_defconfig in next and mainline:
>>>>
>>>> http://arm-soc.lixom.net/buildlogs/next/next-20190102/buildall.arm.keystone_defconfig.log.passed
>>>>
>>>> WARNING: unmet direct dependencies detected for TI_SOC_THERMAL
>>>>   Depends on [n]: THERMAL [=y] && (ARCH_HAS_BANDGAP [=n] ||
>>>> COMPILE_TEST [=n]) && HAS_IOMEM [=y]
>>>>   Selected by [y]:
>>>>   - MMC_SDHCI_OMAP [=y] && MMC [=y] && MMC_SDHCI_PLTFM [=y] && OF [=y]
>>>>
>>>> So, thermal depends on ARCH_HAS_BANDGAP, which keystone doesn't provide.
>>>>
>>>> Selecting a major framework such as THERMAL from a driver config is
>>>> likely not the right solution anyway, especially since THERMAL does
>>>> provide stubbed out versions of the functions if it's not enabled.
>>>
>>> Yeah, that seams a bit up-side-down. Can you guys give a bit more of
>>> context? Why do you need the cpu thermal zone ? From patch description,
>>> looks like you want to have your own zone then apply different tuning
>>> values based on temperature (range?). Why do you need to mess up with
>>> cpu_thermal zone? Don't you have a bandgap in the mem controller for
>>> this application?
>>>
>>
>> Thats correct. We don't have a bandgap in the MMC controller and thus we
>> have to use the cpu one to measure temperature.
>>
>> THERMAL is critical for tuning. The interface is supposed to fail if we
>> can't get temperature. So IMO we should ensure that it is present.
>>
>> I can fix this by "select TI_SOC_THERMAL if ARCH_HAS_BANDGAP" if you
>> guys agree.
> 
> Building elaborate select statements is usually fragile, once
> dependencies for TI_SOC_THERMAL changes you need to come back here to
> fixup the select.
> 
> Supposedly this driver works on keystone (or does it?), it doesn't

Yes. This driver works on keystone.

> actually need TI_SOC_THERMAL for basic functionality beyond tuning?
> Or, at least, it needs to fall back to a reasonable behavior if it's
> unavailable on keystone.

The scenario is this. dra7 devices (omap2plus_defconfig) support tuning
which needs THERMAL and TI_SOC_THERMAL. Keystone devices don't need
thermal at all. The tuning part of the code will never be touched by the
keystone device.

In omap2plus_defconfig, THERMAL AND TI_SOC_THERMAL are modules by
default. I need them to be built-in because MMC is built-in.

> 
> Having the driver print a warning and refuse to tune to higher speeds
> is a reasonable way to do this, I think. That would carry to all
> platforms, i.e. even the ones who have TI_SOC_THERMAL and
> ARCH_HAS_BANDGAP, without adding the select.
> 

The MMC core today doesn't support a fallback to lower speeds when
tuning fails. The interface just fails. This means a bunch cards that
were working will now fail. Also, people building with olddefconfig will
get build errors because of THERMAL=m. Thus the Kconfig architecture
should automatically select the dependencies for SDHCI_OMAP in DRA7.

An "imply TI_SOC_THERMAL" is even better here. It won't force
TI_SOC_THERMAL for keystone and doesn't cause any warnings.

So I propose the following patch:

diff --git a/drivers/mmc/host/Kconfig b/drivers/mmc/host/Kconfig
index e26b8145efb3..bc61eefcb695 100644
--- a/drivers/mmc/host/Kconfig
+++ b/drivers/mmc/host/Kconfig
@@ -978,7 +978,7 @@ config MMC_SDHCI_OMAP
        tristate "TI SDHCI Controller Support"
        depends on MMC_SDHCI_PLTFM && OF
        select THERMAL
-       select TI_SOC_THERMAL
+       imply TI_SOC_THERMAL
        help
--

Thanks,
Faiz
diff mbox series

Patch

diff --git a/drivers/mmc/host/Kconfig b/drivers/mmc/host/Kconfig
index 5fa580cec831..d8f984483ab0 100644
--- a/drivers/mmc/host/Kconfig
+++ b/drivers/mmc/host/Kconfig
@@ -977,6 +977,8 @@  config MMC_SDHCI_XENON
 config MMC_SDHCI_OMAP
 	tristate "TI SDHCI Controller Support"
 	depends on MMC_SDHCI_PLTFM && OF
+	select THERMAL
+	select TI_SOC_THERMAL
 	help
 	  This selects the Secure Digital Host Controller Interface (SDHCI)
 	  support present in TI's DRA7 SOCs. The controller supports
diff --git a/drivers/mmc/host/sdhci-omap.c b/drivers/mmc/host/sdhci-omap.c
index f588ab679cb0..b75c55011fcb 100644
--- a/drivers/mmc/host/sdhci-omap.c
+++ b/drivers/mmc/host/sdhci-omap.c
@@ -27,6 +27,7 @@ 
 #include <linux/regulator/consumer.h>
 #include <linux/pinctrl/consumer.h>
 #include <linux/sys_soc.h>
+#include <linux/thermal.h>
 
 #include "sdhci-pltfm.h"
 
@@ -286,15 +287,19 @@  static int sdhci_omap_execute_tuning(struct mmc_host *mmc, u32 opcode)
 	struct sdhci_host *host = mmc_priv(mmc);
 	struct sdhci_pltfm_host *pltfm_host = sdhci_priv(host);
 	struct sdhci_omap_host *omap_host = sdhci_pltfm_priv(pltfm_host);
+	struct thermal_zone_device *thermal_dev;
 	struct device *dev = omap_host->dev;
 	struct mmc_ios *ios = &mmc->ios;
 	u32 start_window = 0, max_window = 0;
+	bool single_point_failure = false;
 	bool dcrc_was_enabled = false;
 	u8 cur_match, prev_match = 0;
 	u32 length = 0, max_len = 0;
 	u32 phase_delay = 0;
+	int temperature;
 	int ret = 0;
 	u32 reg;
+	int i;
 
 	/* clock tuning is not needed for upto 52MHz */
 	if (ios->clock <= 52000000)
@@ -304,6 +309,16 @@  static int sdhci_omap_execute_tuning(struct mmc_host *mmc, u32 opcode)
 	if (ios->timing == MMC_TIMING_UHS_SDR50 && !(reg & CAPA2_TSDR50))
 		return 0;
 
+	thermal_dev = thermal_zone_get_zone_by_name("cpu_thermal");
+	if (IS_ERR(thermal_dev)) {
+		dev_err(dev, "Unable to get thermal zone for tuning\n");
+		return PTR_ERR(thermal_dev);
+	}
+
+	ret = thermal_zone_get_temp(thermal_dev, &temperature);
+	if (ret)
+		return ret;
+
 	reg = sdhci_omap_readl(omap_host, SDHCI_OMAP_DLL);
 	reg |= DLL_SWT;
 	sdhci_omap_writel(omap_host, SDHCI_OMAP_DLL, reg);
@@ -321,6 +336,11 @@  static int sdhci_omap_execute_tuning(struct mmc_host *mmc, u32 opcode)
 
 	omap_host->is_tuning = true;
 
+	/*
+	 * Stage 1: Search for a maximum pass window ignoring any
+	 * any single point failures. If the tuning value ends up
+	 * near it, move away from it in stage 2 below
+	 */
 	while (phase_delay <= MAX_PHASE_DELAY) {
 		sdhci_omap_set_dll(omap_host, phase_delay);
 
@@ -328,10 +348,15 @@  static int sdhci_omap_execute_tuning(struct mmc_host *mmc, u32 opcode)
 		if (cur_match) {
 			if (prev_match) {
 				length++;
+			} else if (single_point_failure) {
+				/* ignore single point failure */
+				length++;
 			} else {
 				start_window = phase_delay;
 				length = 1;
 			}
+		} else {
+			single_point_failure = prev_match;
 		}
 
 		if (length > max_len) {
@@ -349,13 +374,76 @@  static int sdhci_omap_execute_tuning(struct mmc_host *mmc, u32 opcode)
 		goto tuning_error;
 	}
 
+	/*
+	 * Assign tuning value as a ratio of maximum pass window based
+	 * on temperature
+	 */
+	if (temperature < -20000)
+		phase_delay = min(max_window + 4 * max_len - 24,
+				  max_window +
+				  DIV_ROUND_UP(13 * max_len, 16) * 4);
+	else if (temperature < 20000)
+		phase_delay = max_window + DIV_ROUND_UP(9 * max_len, 16) * 4;
+	else if (temperature < 40000)
+		phase_delay = max_window + DIV_ROUND_UP(8 * max_len, 16) * 4;
+	else if (temperature < 70000)
+		phase_delay = max_window + DIV_ROUND_UP(7 * max_len, 16) * 4;
+	else if (temperature < 90000)
+		phase_delay = max_window + DIV_ROUND_UP(5 * max_len, 16) * 4;
+	else if (temperature < 120000)
+		phase_delay = max_window + DIV_ROUND_UP(4 * max_len, 16) * 4;
+	else
+		phase_delay = max_window + DIV_ROUND_UP(3 * max_len, 16) * 4;
+
+	/*
+	 * Stage 2: Search for a single point failure near the chosen tuning
+	 * value in two steps. First in the +3 to +10 range and then in the
+	 * +2 to -10 range. If found, move away from it in the appropriate
+	 * direction by the appropriate amount depending on the temperature.
+	 */
+	for (i = 3; i <= 10; i++) {
+		sdhci_omap_set_dll(omap_host, phase_delay + i);
+
+		if (mmc_send_tuning(mmc, opcode, NULL)) {
+			if (temperature < 10000)
+				phase_delay += i + 6;
+			else if (temperature < 20000)
+				phase_delay += i - 12;
+			else if (temperature < 70000)
+				phase_delay += i - 8;
+			else
+				phase_delay += i - 6;
+
+			goto single_failure_found;
+		}
+	}
+
+	for (i = 2; i >= -10; i--) {
+		sdhci_omap_set_dll(omap_host, phase_delay + i);
+
+		if (mmc_send_tuning(mmc, opcode, NULL)) {
+			if (temperature < 10000)
+				phase_delay += i + 12;
+			else if (temperature < 20000)
+				phase_delay += i + 8;
+			else if (temperature < 70000)
+				phase_delay += i + 8;
+			else if (temperature < 90000)
+				phase_delay += i + 10;
+			else
+				phase_delay += i + 12;
+
+			goto single_failure_found;
+		}
+	}
+
+single_failure_found:
 	reg = sdhci_omap_readl(omap_host, SDHCI_OMAP_AC12);
 	if (!(reg & AC12_SCLK_SEL)) {
 		ret = -EIO;
 		goto tuning_error;
 	}
 
-	phase_delay = max_window + 4 * (max_len >> 1);
 	sdhci_omap_set_dll(omap_host, phase_delay);
 
 	omap_host->is_tuning = false;