mbox series

[v2,0/3] mmc: Fixup HW reset for SDIO cards

Message ID 20191109103046.26445-1-ulf.hansson@linaro.org (mailing list archive)
Headers show
Series mmc: Fixup HW reset for SDIO cards | expand

Message

Ulf Hansson Nov. 9, 2019, 10:30 a.m. UTC
Changes in v2:
	- Add adaptations to the mwifiex driver.
	- Keep existing syncronous reset behaviour if the SDIO card has a single
	func driver.

It has turned out that it's not a good idea to try to power cycle and to
re-initialize the SDIO card, as currently done through mmc_hw_reset(). This
because there may be multiple SDIO funcs attached to the same SDIO card and
some of the others that didn't execute the call to mmc_hw_reset(), may then
simply experience an undefined behaviour.

The following patches in this series attempts to address this problem, by
reworking the mmc_hw_reset() behaviour for SDIO and by adopting the Marvel
mwifiex driver to these changes.

Note that, I don't have the HW at hand so the the code has only compile tested.
Test on HW is greatly appreciated!

Ulf Hansson (3):
  mwifiex: Re-work support for SDIO HW reset
  mmc: core: Drop check for mmc_card_is_removable() in mmc_rescan()
  mmc: core: Re-work HW reset for SDIO cards

 drivers/mmc/core/core.c                     | 12 +++-----
 drivers/mmc/core/core.h                     |  2 ++
 drivers/mmc/core/sdio.c                     | 28 ++++++++++++++++-
 drivers/mmc/core/sdio_bus.c                 |  9 +++++-
 drivers/net/wireless/marvell/mwifiex/main.c |  6 +++-
 drivers/net/wireless/marvell/mwifiex/main.h |  1 +
 drivers/net/wireless/marvell/mwifiex/sdio.c | 33 ++++++++++++++-------
 include/linux/mmc/card.h                    |  1 +
 8 files changed, 70 insertions(+), 22 deletions(-)

Comments

Tony Lindgren Nov. 11, 2019, 10:08 p.m. UTC | #1
Hi,

* Ulf Hansson <ulf.hansson@linaro.org> [191109 10:31]:
> Changes in v2:
> 	- Add adaptations to the mwifiex driver.
> 	- Keep existing syncronous reset behaviour if the SDIO card has a single
> 	func driver.
> 
> It has turned out that it's not a good idea to try to power cycle and to
> re-initialize the SDIO card, as currently done through mmc_hw_reset(). This
> because there may be multiple SDIO funcs attached to the same SDIO card and
> some of the others that didn't execute the call to mmc_hw_reset(), may then
> simply experience an undefined behaviour.
> 
> The following patches in this series attempts to address this problem, by
> reworking the mmc_hw_reset() behaviour for SDIO and by adopting the Marvel
> mwifiex driver to these changes.
> 
> Note that, I don't have the HW at hand so the the code has only compile tested.
> Test on HW is greatly appreciated!

Looks good to me. I tried testing this, but looks like at least on duovero
mwifiex_sdio is broken in v5.4-rc7:

# dmesg | grep mwifi
mwifiex_sdio mmc1:0001:1: poll card status failed, tries = 100
mwifiex_sdio mmc1:0001:1: FW download with helper:       poll status timeout @ 0
mwifiex_sdio mmc1:0001:1: prog_fw failed ret=0xffffffff
mwifiex_sdio mmc1:0001:1: info: _mwifiex_fw_dpc: unregister device

No idea when it broke and what might be the issue this time around.
Any ideas?

Regards,

Tony
Doug Anderson Nov. 12, 2019, 12:51 a.m. UTC | #2
Hi,

On Sat, Nov 9, 2019 at 2:31 AM Ulf Hansson <ulf.hansson@linaro.org> wrote:
>
> Changes in v2:
>         - Add adaptations to the mwifiex driver.
>         - Keep existing syncronous reset behaviour if the SDIO card has a single
>         func driver.
>
> It has turned out that it's not a good idea to try to power cycle and to
> re-initialize the SDIO card, as currently done through mmc_hw_reset(). This
> because there may be multiple SDIO funcs attached to the same SDIO card and
> some of the others that didn't execute the call to mmc_hw_reset(), may then
> simply experience an undefined behaviour.
>
> The following patches in this series attempts to address this problem, by
> reworking the mmc_hw_reset() behaviour for SDIO and by adopting the Marvel
> mwifiex driver to these changes.
>
> Note that, I don't have the HW at hand so the the code has only compile tested.
> Test on HW is greatly appreciated!
>
> Ulf Hansson (3):
>   mwifiex: Re-work support for SDIO HW reset
>   mmc: core: Drop check for mmc_card_is_removable() in mmc_rescan()
>   mmc: core: Re-work HW reset for SDIO cards
>
>  drivers/mmc/core/core.c                     | 12 +++-----
>  drivers/mmc/core/core.h                     |  2 ++
>  drivers/mmc/core/sdio.c                     | 28 ++++++++++++++++-
>  drivers/mmc/core/sdio_bus.c                 |  9 +++++-
>  drivers/net/wireless/marvell/mwifiex/main.c |  6 +++-
>  drivers/net/wireless/marvell/mwifiex/main.h |  1 +
>  drivers/net/wireless/marvell/mwifiex/sdio.c | 33 ++++++++++++++-------
>  include/linux/mmc/card.h                    |  1 +
>  8 files changed, 70 insertions(+), 22 deletions(-)

I put this on rk3288-veyron-jerry atop v5.4-rc7 and I could run my
test case for a while, AKA I got over 50 cycles of:

---

for i in $(seq 1000); do
  echo "LOOP $i --------"
  echo 1 > /sys/kernel/debug/mwifiex/mlan0/reset

  while true; do
    if ! ping6 -w15 -c1 "${GW}" >/dev/null 2>&1; then
      fail=$(( fail + 1 ))
      echo "Fail WiFi ${fail}"
      if [[ ${fail} == 3 ]]; then
        exit 1
      fi
    else
      fail=0
      break
    fi
  done

  hciconfig hci0 down
  sleep 1
  if ! hciconfig hci0 up; then
    echo "Fail BT"
    exit 1
  fi

done

---

NOTE: with no patches I couldn't even get my test case to pass w/out
the BT bits and I swear that used to work before.  ...but I didn't
debug since the end result (with full card hotplug) is happy-working
for me.  I'll still use it as further argument that (IMO) full unplug
/ plug of the card is better it uses more standard code paths and is
less likely to break.  ;-)

Tested-by: Douglas Anderson <dianders@chromium.org>
Ulf Hansson Nov. 12, 2019, 12:23 p.m. UTC | #3
On Mon, 11 Nov 2019 at 23:08, Tony Lindgren <tony@atomide.com> wrote:
>
> Hi,
>
> * Ulf Hansson <ulf.hansson@linaro.org> [191109 10:31]:
> > Changes in v2:
> >       - Add adaptations to the mwifiex driver.
> >       - Keep existing syncronous reset behaviour if the SDIO card has a single
> >       func driver.
> >
> > It has turned out that it's not a good idea to try to power cycle and to
> > re-initialize the SDIO card, as currently done through mmc_hw_reset(). This
> > because there may be multiple SDIO funcs attached to the same SDIO card and
> > some of the others that didn't execute the call to mmc_hw_reset(), may then
> > simply experience an undefined behaviour.
> >
> > The following patches in this series attempts to address this problem, by
> > reworking the mmc_hw_reset() behaviour for SDIO and by adopting the Marvel
> > mwifiex driver to these changes.
> >
> > Note that, I don't have the HW at hand so the the code has only compile tested.
> > Test on HW is greatly appreciated!
>
> Looks good to me. I tried testing this, but looks like at least on duovero
> mwifiex_sdio is broken in v5.4-rc7:
>
> # dmesg | grep mwifi
> mwifiex_sdio mmc1:0001:1: poll card status failed, tries = 100
> mwifiex_sdio mmc1:0001:1: FW download with helper:       poll status timeout @ 0
> mwifiex_sdio mmc1:0001:1: prog_fw failed ret=0xffffffff
> mwifiex_sdio mmc1:0001:1: info: _mwifiex_fw_dpc: unregister device
>
> No idea when it broke and what might be the issue this time around.
> Any ideas?

Sorry, no good idea.

Perhaps something on the mmc host level that has changed?

Kind regards
Uffe
Ulf Hansson Nov. 12, 2019, 12:27 p.m. UTC | #4
On Tue, 12 Nov 2019 at 01:51, Doug Anderson <dianders@chromium.org> wrote:
>
> Hi,
>
> On Sat, Nov 9, 2019 at 2:31 AM Ulf Hansson <ulf.hansson@linaro.org> wrote:
> >
> > Changes in v2:
> >         - Add adaptations to the mwifiex driver.
> >         - Keep existing syncronous reset behaviour if the SDIO card has a single
> >         func driver.
> >
> > It has turned out that it's not a good idea to try to power cycle and to
> > re-initialize the SDIO card, as currently done through mmc_hw_reset(). This
> > because there may be multiple SDIO funcs attached to the same SDIO card and
> > some of the others that didn't execute the call to mmc_hw_reset(), may then
> > simply experience an undefined behaviour.
> >
> > The following patches in this series attempts to address this problem, by
> > reworking the mmc_hw_reset() behaviour for SDIO and by adopting the Marvel
> > mwifiex driver to these changes.
> >
> > Note that, I don't have the HW at hand so the the code has only compile tested.
> > Test on HW is greatly appreciated!
> >
> > Ulf Hansson (3):
> >   mwifiex: Re-work support for SDIO HW reset
> >   mmc: core: Drop check for mmc_card_is_removable() in mmc_rescan()
> >   mmc: core: Re-work HW reset for SDIO cards
> >
> >  drivers/mmc/core/core.c                     | 12 +++-----
> >  drivers/mmc/core/core.h                     |  2 ++
> >  drivers/mmc/core/sdio.c                     | 28 ++++++++++++++++-
> >  drivers/mmc/core/sdio_bus.c                 |  9 +++++-
> >  drivers/net/wireless/marvell/mwifiex/main.c |  6 +++-
> >  drivers/net/wireless/marvell/mwifiex/main.h |  1 +
> >  drivers/net/wireless/marvell/mwifiex/sdio.c | 33 ++++++++++++++-------
> >  include/linux/mmc/card.h                    |  1 +
> >  8 files changed, 70 insertions(+), 22 deletions(-)
>
> I put this on rk3288-veyron-jerry atop v5.4-rc7 and I could run my
> test case for a while, AKA I got over 50 cycles of:
>
> ---
>
> for i in $(seq 1000); do
>   echo "LOOP $i --------"
>   echo 1 > /sys/kernel/debug/mwifiex/mlan0/reset
>
>   while true; do
>     if ! ping6 -w15 -c1 "${GW}" >/dev/null 2>&1; then
>       fail=$(( fail + 1 ))
>       echo "Fail WiFi ${fail}"
>       if [[ ${fail} == 3 ]]; then
>         exit 1
>       fi
>     else
>       fail=0
>       break
>     fi
>   done
>
>   hciconfig hci0 down
>   sleep 1
>   if ! hciconfig hci0 up; then
>     echo "Fail BT"
>     exit 1
>   fi
>
> done
>
> ---
>
> NOTE: with no patches I couldn't even get my test case to pass w/out
> the BT bits and I swear that used to work before.  ...but I didn't
> debug since the end result (with full card hotplug) is happy-working
> for me.  I'll still use it as further argument that (IMO) full unplug
> / plug of the card is better it uses more standard code paths and is
> less likely to break.  ;-)
>
> Tested-by: Douglas Anderson <dianders@chromium.org>

Thanks, I add this to the series and make a re-spin.

What do you think about tagging the patches for stable?

I guess there is a risk that we may "break" the other two users of
mmc_hw_reset(). But, as I said, in that case those needs to be fixed
anyways.

Kind regards
Uffe
Doug Anderson Nov. 12, 2019, 5:42 p.m. UTC | #5
Hi,

On Tue, Nov 12, 2019 at 4:28 AM Ulf Hansson <ulf.hansson@linaro.org> wrote:
>
> On Tue, 12 Nov 2019 at 01:51, Doug Anderson <dianders@chromium.org> wrote:
> >
> > Hi,
> >
> > On Sat, Nov 9, 2019 at 2:31 AM Ulf Hansson <ulf.hansson@linaro.org> wrote:
> > >
> > > Changes in v2:
> > >         - Add adaptations to the mwifiex driver.
> > >         - Keep existing syncronous reset behaviour if the SDIO card has a single
> > >         func driver.
> > >
> > > It has turned out that it's not a good idea to try to power cycle and to
> > > re-initialize the SDIO card, as currently done through mmc_hw_reset(). This
> > > because there may be multiple SDIO funcs attached to the same SDIO card and
> > > some of the others that didn't execute the call to mmc_hw_reset(), may then
> > > simply experience an undefined behaviour.
> > >
> > > The following patches in this series attempts to address this problem, by
> > > reworking the mmc_hw_reset() behaviour for SDIO and by adopting the Marvel
> > > mwifiex driver to these changes.
> > >
> > > Note that, I don't have the HW at hand so the the code has only compile tested.
> > > Test on HW is greatly appreciated!
> > >
> > > Ulf Hansson (3):
> > >   mwifiex: Re-work support for SDIO HW reset
> > >   mmc: core: Drop check for mmc_card_is_removable() in mmc_rescan()
> > >   mmc: core: Re-work HW reset for SDIO cards
> > >
> > >  drivers/mmc/core/core.c                     | 12 +++-----
> > >  drivers/mmc/core/core.h                     |  2 ++
> > >  drivers/mmc/core/sdio.c                     | 28 ++++++++++++++++-
> > >  drivers/mmc/core/sdio_bus.c                 |  9 +++++-
> > >  drivers/net/wireless/marvell/mwifiex/main.c |  6 +++-
> > >  drivers/net/wireless/marvell/mwifiex/main.h |  1 +
> > >  drivers/net/wireless/marvell/mwifiex/sdio.c | 33 ++++++++++++++-------
> > >  include/linux/mmc/card.h                    |  1 +
> > >  8 files changed, 70 insertions(+), 22 deletions(-)
> >
> > I put this on rk3288-veyron-jerry atop v5.4-rc7 and I could run my
> > test case for a while, AKA I got over 50 cycles of:
> >
> > ---
> >
> > for i in $(seq 1000); do
> >   echo "LOOP $i --------"
> >   echo 1 > /sys/kernel/debug/mwifiex/mlan0/reset
> >
> >   while true; do
> >     if ! ping6 -w15 -c1 "${GW}" >/dev/null 2>&1; then
> >       fail=$(( fail + 1 ))
> >       echo "Fail WiFi ${fail}"
> >       if [[ ${fail} == 3 ]]; then
> >         exit 1
> >       fi
> >     else
> >       fail=0
> >       break
> >     fi
> >   done
> >
> >   hciconfig hci0 down
> >   sleep 1
> >   if ! hciconfig hci0 up; then
> >     echo "Fail BT"
> >     exit 1
> >   fi
> >
> > done
> >
> > ---
> >
> > NOTE: with no patches I couldn't even get my test case to pass w/out
> > the BT bits and I swear that used to work before.  ...but I didn't
> > debug since the end result (with full card hotplug) is happy-working
> > for me.  I'll still use it as further argument that (IMO) full unplug
> > / plug of the card is better it uses more standard code paths and is
> > less likely to break.  ;-)
> >
> > Tested-by: Douglas Anderson <dianders@chromium.org>
>
> Thanks, I add this to the series and make a re-spin.
>
> What do you think about tagging the patches for stable?
>
> I guess there is a risk that we may "break" the other two users of
> mmc_hw_reset(). But, as I said, in that case those needs to be fixed
> anyways.

I'm not sure how to make that judgement call.  Certainly it would help
anyone using the Marvell case and the Marvell case was pretty broken
before.

How about this: if you can get a Tested-by from the other users then
I'd be good with a general CC: stable.  Otherwise, I'd be OK with a CC
to stable for 5.4, but I'd be a little hesitant to send it back to
older kernels (even though it certainly applies and fixes problems).
At least in the case of Chrome OS we already have a workable solution
for our 4.19 tree (my previous patches), and I'd guess anyone testing
on real hardware is either not seeing problems or has their own
private patches already.  If things have been sitting stable on 5.4
for a while and no problems were reported, then we could consider
going back further?

-Doug
Tony Lindgren Nov. 12, 2019, 6 p.m. UTC | #6
* Ulf Hansson <ulf.hansson@linaro.org> [191112 12:24]:
> On Mon, 11 Nov 2019 at 23:08, Tony Lindgren <tony@atomide.com> wrote:
> > Looks good to me. I tried testing this, but looks like at least on duovero
> > mwifiex_sdio is broken in v5.4-rc7:
> >
> > # dmesg | grep mwifi
> > mwifiex_sdio mmc1:0001:1: poll card status failed, tries = 100
> > mwifiex_sdio mmc1:0001:1: FW download with helper:       poll status timeout @ 0
> > mwifiex_sdio mmc1:0001:1: prog_fw failed ret=0xffffffff
> > mwifiex_sdio mmc1:0001:1: info: _mwifiex_fw_dpc: unregister device
> >
> > No idea when it broke and what might be the issue this time around.
> > Any ideas?
> 
> Sorry, no good idea.
> 
> Perhaps something on the mmc host level that has changed?

No idea, not even v4.19 is not behaving properly. I guess another
victim of eternal ongoing regressions.

Regards,

Tony
Ulf Hansson Nov. 13, 2019, 3:11 p.m. UTC | #7
On Tue, 12 Nov 2019 at 18:43, Doug Anderson <dianders@chromium.org> wrote:
>
> Hi,
>
> On Tue, Nov 12, 2019 at 4:28 AM Ulf Hansson <ulf.hansson@linaro.org> wrote:
> >
> > On Tue, 12 Nov 2019 at 01:51, Doug Anderson <dianders@chromium.org> wrote:
> > >
> > > Hi,
> > >
> > > On Sat, Nov 9, 2019 at 2:31 AM Ulf Hansson <ulf.hansson@linaro.org> wrote:
> > > >
> > > > Changes in v2:
> > > >         - Add adaptations to the mwifiex driver.
> > > >         - Keep existing syncronous reset behaviour if the SDIO card has a single
> > > >         func driver.
> > > >
> > > > It has turned out that it's not a good idea to try to power cycle and to
> > > > re-initialize the SDIO card, as currently done through mmc_hw_reset(). This
> > > > because there may be multiple SDIO funcs attached to the same SDIO card and
> > > > some of the others that didn't execute the call to mmc_hw_reset(), may then
> > > > simply experience an undefined behaviour.
> > > >
> > > > The following patches in this series attempts to address this problem, by
> > > > reworking the mmc_hw_reset() behaviour for SDIO and by adopting the Marvel
> > > > mwifiex driver to these changes.
> > > >
> > > > Note that, I don't have the HW at hand so the the code has only compile tested.
> > > > Test on HW is greatly appreciated!
> > > >
> > > > Ulf Hansson (3):
> > > >   mwifiex: Re-work support for SDIO HW reset
> > > >   mmc: core: Drop check for mmc_card_is_removable() in mmc_rescan()
> > > >   mmc: core: Re-work HW reset for SDIO cards
> > > >
> > > >  drivers/mmc/core/core.c                     | 12 +++-----
> > > >  drivers/mmc/core/core.h                     |  2 ++
> > > >  drivers/mmc/core/sdio.c                     | 28 ++++++++++++++++-
> > > >  drivers/mmc/core/sdio_bus.c                 |  9 +++++-
> > > >  drivers/net/wireless/marvell/mwifiex/main.c |  6 +++-
> > > >  drivers/net/wireless/marvell/mwifiex/main.h |  1 +
> > > >  drivers/net/wireless/marvell/mwifiex/sdio.c | 33 ++++++++++++++-------
> > > >  include/linux/mmc/card.h                    |  1 +
> > > >  8 files changed, 70 insertions(+), 22 deletions(-)
> > >
> > > I put this on rk3288-veyron-jerry atop v5.4-rc7 and I could run my
> > > test case for a while, AKA I got over 50 cycles of:
> > >
> > > ---
> > >
> > > for i in $(seq 1000); do
> > >   echo "LOOP $i --------"
> > >   echo 1 > /sys/kernel/debug/mwifiex/mlan0/reset
> > >
> > >   while true; do
> > >     if ! ping6 -w15 -c1 "${GW}" >/dev/null 2>&1; then
> > >       fail=$(( fail + 1 ))
> > >       echo "Fail WiFi ${fail}"
> > >       if [[ ${fail} == 3 ]]; then
> > >         exit 1
> > >       fi
> > >     else
> > >       fail=0
> > >       break
> > >     fi
> > >   done
> > >
> > >   hciconfig hci0 down
> > >   sleep 1
> > >   if ! hciconfig hci0 up; then
> > >     echo "Fail BT"
> > >     exit 1
> > >   fi
> > >
> > > done
> > >
> > > ---
> > >
> > > NOTE: with no patches I couldn't even get my test case to pass w/out
> > > the BT bits and I swear that used to work before.  ...but I didn't
> > > debug since the end result (with full card hotplug) is happy-working
> > > for me.  I'll still use it as further argument that (IMO) full unplug
> > > / plug of the card is better it uses more standard code paths and is
> > > less likely to break.  ;-)
> > >
> > > Tested-by: Douglas Anderson <dianders@chromium.org>
> >
> > Thanks, I add this to the series and make a re-spin.
> >
> > What do you think about tagging the patches for stable?
> >
> > I guess there is a risk that we may "break" the other two users of
> > mmc_hw_reset(). But, as I said, in that case those needs to be fixed
> > anyways.
>
> I'm not sure how to make that judgement call.  Certainly it would help
> anyone using the Marvell case and the Marvell case was pretty broken
> before.
>
> How about this: if you can get a Tested-by from the other users then
> I'd be good with a general CC: stable.  Otherwise, I'd be OK with a CC
> to stable for 5.4, but I'd be a little hesitant to send it back to
> older kernels (even though it certainly applies and fixes problems).

This makes sense, let's go with this!

> At least in the case of Chrome OS we already have a workable solution
> for our 4.19 tree (my previous patches), and I'd guess anyone testing
> on real hardware is either not seeing problems or has their own
> private patches already.  If things have been sitting stable on 5.4
> for a while and no problems were reported, then we could consider
> going back further?

Yep!

That said, the v3 series is queued up for next and by adding a stable
tag for 5.4+.

Kind regards
Uffe