diff mbox series

[v2,1/3] mwifiex: Re-work support for SDIO HW reset

Message ID 20191109103046.26445-2-ulf.hansson@linaro.org (mailing list archive)
State Not Applicable
Delegated to: Kalle Valo
Headers show
Series mmc: Fixup HW reset for SDIO cards | expand

Commit Message

Ulf Hansson Nov. 9, 2019, 10:30 a.m. UTC
The SDIO HW reset procedure in mwifiex_sdio_card_reset_work() is broken,
when the SDIO card is shared with another SDIO func driver. This is the
case when the Bluetooth btmrvl driver is being used in combination with
mwifiex. More precisely, when mwifiex_sdio_card_reset_work() runs to resets
the SDIO card, the btmrvl driver doesn't get notified about it. Beyond that
point, the btmrvl driver will fail to communicate with the SDIO card.

This is a generic problem for SDIO func drivers sharing an SDIO card, which
are about to be addressed in subsequent changes to the mmc core and the
mmc_hw_reset() interface. In principle, these changes means the
mmc_hw_reset() interface starts to return 1 if the are multiple drivers for
the SDIO card, as to indicate to the caller that the reset needed to be
scheduled asynchronously through a hotplug mechanism of the SDIO card.

Let's prepare the mwifiex driver to support the upcoming new behaviour of
mmc_hw_reset(), which means extending the mwifiex_sdio_card_reset_work() to
support the asynchronous SDIO HW reset path. This also means, we need to
allow the ->remove() callback to run, without waiting for the FW to be
loaded. Additionally, during system suspend, mwifiex_sdio_suspend() may be
called when a reset has been scheduled, but waiting to be executed. In this
scenario let's simply return -EBUSY to abort the suspend process, as to
allow the reset to be completed first.

Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
---
 drivers/net/wireless/marvell/mwifiex/main.c |  6 +++-
 drivers/net/wireless/marvell/mwifiex/main.h |  1 +
 drivers/net/wireless/marvell/mwifiex/sdio.c | 33 ++++++++++++++-------
 3 files changed, 28 insertions(+), 12 deletions(-)

Comments

Doug Anderson Nov. 12, 2019, 12:33 a.m. UTC | #1
Hi,

On Sat, Nov 9, 2019 at 2:31 AM Ulf Hansson <ulf.hansson@linaro.org> wrote:
>
> The SDIO HW reset procedure in mwifiex_sdio_card_reset_work() is broken,
> when the SDIO card is shared with another SDIO func driver. This is the
> case when the Bluetooth btmrvl driver is being used in combination with
> mwifiex. More precisely, when mwifiex_sdio_card_reset_work() runs to resets
> the SDIO card, the btmrvl driver doesn't get notified about it. Beyond that
> point, the btmrvl driver will fail to communicate with the SDIO card.
>
> This is a generic problem for SDIO func drivers sharing an SDIO card, which
> are about to be addressed in subsequent changes to the mmc core and the
> mmc_hw_reset() interface. In principle, these changes means the
> mmc_hw_reset() interface starts to return 1 if the are multiple drivers for
> the SDIO card, as to indicate to the caller that the reset needed to be
> scheduled asynchronously through a hotplug mechanism of the SDIO card.
>
> Let's prepare the mwifiex driver to support the upcoming new behaviour of
> mmc_hw_reset(), which means extending the mwifiex_sdio_card_reset_work() to
> support the asynchronous SDIO HW reset path. This also means, we need to
> allow the ->remove() callback to run, without waiting for the FW to be
> loaded. Additionally, during system suspend, mwifiex_sdio_suspend() may be
> called when a reset has been scheduled, but waiting to be executed. In this
> scenario let's simply return -EBUSY to abort the suspend process, as to
> allow the reset to be completed first.
>
> Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
> ---
>  drivers/net/wireless/marvell/mwifiex/main.c |  6 +++-
>  drivers/net/wireless/marvell/mwifiex/main.h |  1 +
>  drivers/net/wireless/marvell/mwifiex/sdio.c | 33 ++++++++++++++-------
>  3 files changed, 28 insertions(+), 12 deletions(-)
>
> diff --git a/drivers/net/wireless/marvell/mwifiex/main.c b/drivers/net/wireless/marvell/mwifiex/main.c
> index a9657ae6d782..dbdbdd6769a9 100644
> --- a/drivers/net/wireless/marvell/mwifiex/main.c
> +++ b/drivers/net/wireless/marvell/mwifiex/main.c
> @@ -76,6 +76,7 @@ static int mwifiex_register(void *card, struct device *dev,
>         *padapter = adapter;
>         adapter->dev = dev;
>         adapter->card = card;
> +       adapter->is_adapter_up = false;

Probably not needed.  The 'adapter' was kzalloc-ed a few lines above
and there's no need to re-init to 0.


> diff --git a/drivers/net/wireless/marvell/mwifiex/sdio.c b/drivers/net/wireless/marvell/mwifiex/sdio.c
> index 24c041dad9f6..2417c94c29c0 100644
> --- a/drivers/net/wireless/marvell/mwifiex/sdio.c
> +++ b/drivers/net/wireless/marvell/mwifiex/sdio.c
> @@ -444,6 +444,9 @@ static int mwifiex_sdio_suspend(struct device *dev)
>                 return 0;
>         }
>
> +       if (!adapter->is_adapter_up)
> +               return -EBUSY;

I'm moderately concerned that there might be cases where firmware
never got loaded but we could suspend/resume OK.  ...and now we never
will?  I'm not familiar enough with the code to know if this is a real
concern, so I guess we can do this and then see...


> @@ -2220,22 +2223,30 @@ static void mwifiex_sdio_card_reset_work(struct mwifiex_adapter *adapter)
>         struct sdio_func *func = card->func;
>         int ret;
>
> +       /* Prepare the adapter for the reset. */
>         mwifiex_shutdown_sw(adapter);
> +       clear_bit(MWIFIEX_IFACE_WORK_DEVICE_DUMP, &card->work_flags);
> +       clear_bit(MWIFIEX_IFACE_WORK_CARD_RESET, &card->work_flags);
>
> -       /* power cycle the adapter */
> +       /* Run a HW reset of the SDIO interface. */
>         sdio_claim_host(func);
> -       mmc_hw_reset(func->card->host);
> +       ret = mmc_hw_reset(func->card->host);
>         sdio_release_host(func);
>
> -       /* Previous save_adapter won't be valid after this. We will cancel
> -        * pending work requests.
> -        */
> -       clear_bit(MWIFIEX_IFACE_WORK_DEVICE_DUMP, &card->work_flags);
> -       clear_bit(MWIFIEX_IFACE_WORK_CARD_RESET, &card->work_flags);

I don't know enough about the clearing of these bits to confirm that
it's OK to move their clearing to be before the mmc_hw_reset().
Possibly +Brian Norris does?


I can't promise that I didn't miss anything, but to the best of my
knowledge this is good now:

Reviewed-by: Douglas Anderson <dianders@chromium.org>
Ulf Hansson Nov. 12, 2019, 12:13 p.m. UTC | #2
On Tue, 12 Nov 2019 at 01:33, Doug Anderson <dianders@chromium.org> wrote:
>
> Hi,
>
> On Sat, Nov 9, 2019 at 2:31 AM Ulf Hansson <ulf.hansson@linaro.org> wrote:
> >
> > The SDIO HW reset procedure in mwifiex_sdio_card_reset_work() is broken,
> > when the SDIO card is shared with another SDIO func driver. This is the
> > case when the Bluetooth btmrvl driver is being used in combination with
> > mwifiex. More precisely, when mwifiex_sdio_card_reset_work() runs to resets
> > the SDIO card, the btmrvl driver doesn't get notified about it. Beyond that
> > point, the btmrvl driver will fail to communicate with the SDIO card.
> >
> > This is a generic problem for SDIO func drivers sharing an SDIO card, which
> > are about to be addressed in subsequent changes to the mmc core and the
> > mmc_hw_reset() interface. In principle, these changes means the
> > mmc_hw_reset() interface starts to return 1 if the are multiple drivers for
> > the SDIO card, as to indicate to the caller that the reset needed to be
> > scheduled asynchronously through a hotplug mechanism of the SDIO card.
> >
> > Let's prepare the mwifiex driver to support the upcoming new behaviour of
> > mmc_hw_reset(), which means extending the mwifiex_sdio_card_reset_work() to
> > support the asynchronous SDIO HW reset path. This also means, we need to
> > allow the ->remove() callback to run, without waiting for the FW to be
> > loaded. Additionally, during system suspend, mwifiex_sdio_suspend() may be
> > called when a reset has been scheduled, but waiting to be executed. In this
> > scenario let's simply return -EBUSY to abort the suspend process, as to
> > allow the reset to be completed first.
> >
> > Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
> > ---
> >  drivers/net/wireless/marvell/mwifiex/main.c |  6 +++-
> >  drivers/net/wireless/marvell/mwifiex/main.h |  1 +
> >  drivers/net/wireless/marvell/mwifiex/sdio.c | 33 ++++++++++++++-------
> >  3 files changed, 28 insertions(+), 12 deletions(-)
> >
> > diff --git a/drivers/net/wireless/marvell/mwifiex/main.c b/drivers/net/wireless/marvell/mwifiex/main.c
> > index a9657ae6d782..dbdbdd6769a9 100644
> > --- a/drivers/net/wireless/marvell/mwifiex/main.c
> > +++ b/drivers/net/wireless/marvell/mwifiex/main.c
> > @@ -76,6 +76,7 @@ static int mwifiex_register(void *card, struct device *dev,
> >         *padapter = adapter;
> >         adapter->dev = dev;
> >         adapter->card = card;
> > +       adapter->is_adapter_up = false;
>
> Probably not needed.  The 'adapter' was kzalloc-ed a few lines above
> and there's no need to re-init to 0.

Right, let me re-spin and drop this.

>
>
> > diff --git a/drivers/net/wireless/marvell/mwifiex/sdio.c b/drivers/net/wireless/marvell/mwifiex/sdio.c
> > index 24c041dad9f6..2417c94c29c0 100644
> > --- a/drivers/net/wireless/marvell/mwifiex/sdio.c
> > +++ b/drivers/net/wireless/marvell/mwifiex/sdio.c
> > @@ -444,6 +444,9 @@ static int mwifiex_sdio_suspend(struct device *dev)
> >                 return 0;
> >         }
> >
> > +       if (!adapter->is_adapter_up)
> > +               return -EBUSY;
>
> I'm moderately concerned that there might be cases where firmware
> never got loaded but we could suspend/resume OK.  ...and now we never
> will?  I'm not familiar enough with the code to know if this is a real
> concern, so I guess we can do this and then see...

There is a completion variable that is used to make sure the firmware
is loaded, before the mwifiex driver runs ->suspend|remove(). This is
needed, because during ->probe() the FW will be loaded asynchronously,
hence both mwifiex_sdio_remove() and mwifiex_sdio_suspend(), may be
called while waiting for the FW to be loaded.

If a HW reset has been scheduled but not completed, which would be the
case if mmc_hw_reset() gets called after mmc_pm_notify() with a
PM_SUSPEND_PREPARE. This is because mmc_pm_notify() then disables the
rescan work, but then re-kicks/enables it at PM_POST_SUSPEND (after
the system has resumed).

Returning -EBUSY, should allow the mmc rescan work to be completed
when the system have resumed.

Of course, one could also considering using pm_wakeup_event(), in case
mmc_hw_reset() needed to schedule the reset, as to prevent the system
for suspending for a small amount of time. As to make sure the rescan
work, gets to run. But I am not sure that's needed here.

>
>
> > @@ -2220,22 +2223,30 @@ static void mwifiex_sdio_card_reset_work(struct mwifiex_adapter *adapter)
> >         struct sdio_func *func = card->func;
> >         int ret;
> >
> > +       /* Prepare the adapter for the reset. */
> >         mwifiex_shutdown_sw(adapter);
> > +       clear_bit(MWIFIEX_IFACE_WORK_DEVICE_DUMP, &card->work_flags);
> > +       clear_bit(MWIFIEX_IFACE_WORK_CARD_RESET, &card->work_flags);
> >
> > -       /* power cycle the adapter */
> > +       /* Run a HW reset of the SDIO interface. */
> >         sdio_claim_host(func);
> > -       mmc_hw_reset(func->card->host);
> > +       ret = mmc_hw_reset(func->card->host);
> >         sdio_release_host(func);
> >
> > -       /* Previous save_adapter won't be valid after this. We will cancel
> > -        * pending work requests.
> > -        */
> > -       clear_bit(MWIFIEX_IFACE_WORK_DEVICE_DUMP, &card->work_flags);
> > -       clear_bit(MWIFIEX_IFACE_WORK_CARD_RESET, &card->work_flags);
>
> I don't know enough about the clearing of these bits to confirm that
> it's OK to move their clearing to be before the mmc_hw_reset().
> Possibly +Brian Norris does?

That shouldn't matter, because we are running in the path of the
mwifiex_sdio_work(), as work from the system_wq. Unless I am mistaken,
only one work of the type mwifiex_sdio_work() can execute at the same
time. By clearing these bits, we want to cancel any potential recently
scheduled work. It should matter if that's done before or after
mmc_hw_reset().

Moreover, this change makes it more consistent with how the pcie
driver clears the bits.

>
>
> I can't promise that I didn't miss anything, but to the best of my
> knowledge this is good now:
>
> Reviewed-by: Douglas Anderson <dianders@chromium.org>

Thanks!

Finally, if you want to verify that the above system suspend path
works fine, you could change the call to "_mmc_detect_change(host, 0,
false)" in mmc_sdio_hw_reset(), into "_mmc_detect_change(host,
msecs_to_jiffies(30000), false)", in patch3.

This should leave you a 30s window of where you can try to system
suspend the platform, while also waiting for the scheduled reset to be
completed.

Kind regards
Uffe
Doug Anderson Nov. 12, 2019, 6:04 p.m. UTC | #3
Hi,

On Tue, Nov 12, 2019 at 4:14 AM Ulf Hansson <ulf.hansson@linaro.org> wrote:
>
> > > diff --git a/drivers/net/wireless/marvell/mwifiex/sdio.c b/drivers/net/wireless/marvell/mwifiex/sdio.c
> > > index 24c041dad9f6..2417c94c29c0 100644
> > > --- a/drivers/net/wireless/marvell/mwifiex/sdio.c
> > > +++ b/drivers/net/wireless/marvell/mwifiex/sdio.c
> > > @@ -444,6 +444,9 @@ static int mwifiex_sdio_suspend(struct device *dev)
> > >                 return 0;
> > >         }
> > >
> > > +       if (!adapter->is_adapter_up)
> > > +               return -EBUSY;
> >
> > I'm moderately concerned that there might be cases where firmware
> > never got loaded but we could suspend/resume OK.  ...and now we never
> > will?  I'm not familiar enough with the code to know if this is a real
> > concern, so I guess we can do this and then see...
>
> There is a completion variable that is used to make sure the firmware
> is loaded, before the mwifiex driver runs ->suspend|remove(). This is
> needed, because during ->probe() the FW will be loaded asynchronously,
> hence both mwifiex_sdio_remove() and mwifiex_sdio_suspend(), may be
> called while waiting for the FW to be loaded.
>
> If a HW reset has been scheduled but not completed, which would be the
> case if mmc_hw_reset() gets called after mmc_pm_notify() with a
> PM_SUSPEND_PREPARE. This is because mmc_pm_notify() then disables the
> rescan work, but then re-kicks/enables it at PM_POST_SUSPEND (after
> the system has resumed).
>
> Returning -EBUSY, should allow the mmc rescan work to be completed
> when the system have resumed.
>
> Of course, one could also considering using pm_wakeup_event(), in case
> mmc_hw_reset() needed to schedule the reset, as to prevent the system
> for suspending for a small amount of time. As to make sure the rescan
> work, gets to run. But I am not sure that's needed here.

I was more worried that we could get into a state where we'd return
EBUSY forever, but I think I've convinced myself that this isn't
possible.  If we fail to load things then the adapter variable will be
freed anyway.


> Finally, if you want to verify that the above system suspend path
> works fine, you could change the call to "_mmc_detect_change(host, 0,
> false)" in mmc_sdio_hw_reset(), into "_mmc_detect_change(host,
> msecs_to_jiffies(30000), false)", in patch3.
>
> This should leave you a 30s window of where you can try to system
> suspend the platform, while also waiting for the scheduled reset to be
> completed.

It worked.

https://pastebin.com/NdsvAdE8
Ulf Hansson Nov. 13, 2019, 3 p.m. UTC | #4
On Tue, 12 Nov 2019 at 19:05, Doug Anderson <dianders@chromium.org> wrote:
>
> Hi,
>
> On Tue, Nov 12, 2019 at 4:14 AM Ulf Hansson <ulf.hansson@linaro.org> wrote:
> >
> > > > diff --git a/drivers/net/wireless/marvell/mwifiex/sdio.c b/drivers/net/wireless/marvell/mwifiex/sdio.c
> > > > index 24c041dad9f6..2417c94c29c0 100644
> > > > --- a/drivers/net/wireless/marvell/mwifiex/sdio.c
> > > > +++ b/drivers/net/wireless/marvell/mwifiex/sdio.c
> > > > @@ -444,6 +444,9 @@ static int mwifiex_sdio_suspend(struct device *dev)
> > > >                 return 0;
> > > >         }
> > > >
> > > > +       if (!adapter->is_adapter_up)
> > > > +               return -EBUSY;
> > >
> > > I'm moderately concerned that there might be cases where firmware
> > > never got loaded but we could suspend/resume OK.  ...and now we never
> > > will?  I'm not familiar enough with the code to know if this is a real
> > > concern, so I guess we can do this and then see...
> >
> > There is a completion variable that is used to make sure the firmware
> > is loaded, before the mwifiex driver runs ->suspend|remove(). This is
> > needed, because during ->probe() the FW will be loaded asynchronously,
> > hence both mwifiex_sdio_remove() and mwifiex_sdio_suspend(), may be
> > called while waiting for the FW to be loaded.
> >
> > If a HW reset has been scheduled but not completed, which would be the
> > case if mmc_hw_reset() gets called after mmc_pm_notify() with a
> > PM_SUSPEND_PREPARE. This is because mmc_pm_notify() then disables the
> > rescan work, but then re-kicks/enables it at PM_POST_SUSPEND (after
> > the system has resumed).
> >
> > Returning -EBUSY, should allow the mmc rescan work to be completed
> > when the system have resumed.
> >
> > Of course, one could also considering using pm_wakeup_event(), in case
> > mmc_hw_reset() needed to schedule the reset, as to prevent the system
> > for suspending for a small amount of time. As to make sure the rescan
> > work, gets to run. But I am not sure that's needed here.
>
> I was more worried that we could get into a state where we'd return
> EBUSY forever, but I think I've convinced myself that this isn't
> possible.  If we fail to load things then the adapter variable will be
> freed anyway.
>
>
> > Finally, if you want to verify that the above system suspend path
> > works fine, you could change the call to "_mmc_detect_change(host, 0,
> > false)" in mmc_sdio_hw_reset(), into "_mmc_detect_change(host,
> > msecs_to_jiffies(30000), false)", in patch3.
> >
> > This should leave you a 30s window of where you can try to system
> > suspend the platform, while also waiting for the scheduled reset to be
> > completed.
>
> It worked.
>
> https://pastebin.com/NdsvAdE8

Great, thanks for confirming!

Kind regards
Uffe
diff mbox series

Patch

diff --git a/drivers/net/wireless/marvell/mwifiex/main.c b/drivers/net/wireless/marvell/mwifiex/main.c
index a9657ae6d782..dbdbdd6769a9 100644
--- a/drivers/net/wireless/marvell/mwifiex/main.c
+++ b/drivers/net/wireless/marvell/mwifiex/main.c
@@ -76,6 +76,7 @@  static int mwifiex_register(void *card, struct device *dev,
 	*padapter = adapter;
 	adapter->dev = dev;
 	adapter->card = card;
+	adapter->is_adapter_up = false;
 
 	/* Save interface specific operations in adapter */
 	memmove(&adapter->if_ops, if_ops, sizeof(struct mwifiex_if_ops));
@@ -631,6 +632,7 @@  static int _mwifiex_fw_dpc(const struct firmware *firmware, void *context)
 
 	mwifiex_drv_get_driver_version(adapter, fmt, sizeof(fmt) - 1);
 	mwifiex_dbg(adapter, MSG, "driver_version = %s\n", fmt);
+	adapter->is_adapter_up = true;
 	goto done;
 
 err_add_intf:
@@ -1469,6 +1471,7 @@  int mwifiex_shutdown_sw(struct mwifiex_adapter *adapter)
 	mwifiex_deauthenticate(priv, NULL);
 
 	mwifiex_uninit_sw(adapter);
+	adapter->is_adapter_up = false;
 
 	if (adapter->if_ops.down_dev)
 		adapter->if_ops.down_dev(adapter);
@@ -1730,7 +1733,8 @@  int mwifiex_remove_card(struct mwifiex_adapter *adapter)
 	if (!adapter)
 		return 0;
 
-	mwifiex_uninit_sw(adapter);
+	if (adapter->is_adapter_up)
+		mwifiex_uninit_sw(adapter);
 
 	if (adapter->irq_wakeup >= 0)
 		device_init_wakeup(adapter->dev, false);
diff --git a/drivers/net/wireless/marvell/mwifiex/main.h b/drivers/net/wireless/marvell/mwifiex/main.h
index 095837fba300..7703d2e5d2e0 100644
--- a/drivers/net/wireless/marvell/mwifiex/main.h
+++ b/drivers/net/wireless/marvell/mwifiex/main.h
@@ -1017,6 +1017,7 @@  struct mwifiex_adapter {
 
 	/* For synchronizing FW initialization with device lifecycle. */
 	struct completion *fw_done;
+	bool is_adapter_up;
 
 	bool ext_scan;
 	u8 fw_api_ver;
diff --git a/drivers/net/wireless/marvell/mwifiex/sdio.c b/drivers/net/wireless/marvell/mwifiex/sdio.c
index 24c041dad9f6..2417c94c29c0 100644
--- a/drivers/net/wireless/marvell/mwifiex/sdio.c
+++ b/drivers/net/wireless/marvell/mwifiex/sdio.c
@@ -444,6 +444,9 @@  static int mwifiex_sdio_suspend(struct device *dev)
 		return 0;
 	}
 
+	if (!adapter->is_adapter_up)
+		return -EBUSY;
+
 	mwifiex_enable_wake(adapter);
 
 	/* Enable the Host Sleep */
@@ -2220,22 +2223,30 @@  static void mwifiex_sdio_card_reset_work(struct mwifiex_adapter *adapter)
 	struct sdio_func *func = card->func;
 	int ret;
 
+	/* Prepare the adapter for the reset. */
 	mwifiex_shutdown_sw(adapter);
+	clear_bit(MWIFIEX_IFACE_WORK_DEVICE_DUMP, &card->work_flags);
+	clear_bit(MWIFIEX_IFACE_WORK_CARD_RESET, &card->work_flags);
 
-	/* power cycle the adapter */
+	/* Run a HW reset of the SDIO interface. */
 	sdio_claim_host(func);
-	mmc_hw_reset(func->card->host);
+	ret = mmc_hw_reset(func->card->host);
 	sdio_release_host(func);
 
-	/* Previous save_adapter won't be valid after this. We will cancel
-	 * pending work requests.
-	 */
-	clear_bit(MWIFIEX_IFACE_WORK_DEVICE_DUMP, &card->work_flags);
-	clear_bit(MWIFIEX_IFACE_WORK_CARD_RESET, &card->work_flags);
-
-	ret = mwifiex_reinit_sw(adapter);
-	if (ret)
-		dev_err(&func->dev, "reinit failed: %d\n", ret);
+	switch (ret) {
+	case 1:
+		dev_dbg(&func->dev, "SDIO HW reset asynchronous\n");
+		complete_all(adapter->fw_done);
+		break;
+	case 0:
+		ret = mwifiex_reinit_sw(adapter);
+		if (ret)
+			dev_err(&func->dev, "reinit failed: %d\n", ret);
+		break;
+	default:
+		dev_err(&func->dev, "SDIO HW reset failed: %d\n", ret);
+		break;
+	}
 }
 
 /* This function read/write firmware */