Message ID | 1479301749-14803-4-git-send-email-akarwar@marvell.com (mailing list archive) |
---|---|
State | Accepted |
Commit | 41efaf5824e7cb16c54bbec1273d86d80cdac283 |
Delegated to: | Kalle Valo |
Headers | show |
On Wed, Nov 16, 2016 at 06:39:08PM +0530, Amitkumar Karwar wrote: > From: Xinming Hu <huxm@marvell.com> > > Wait for firmware dump complete in card remove function. > For sdio interface, there are two diffenrent cases, > card reset trigger sdio_work and firmware dump trigger sdio_work. > Do code rearrangement for distinguish between these two cases. > > Signed-off-by: Xinming Hu <huxm@marvell.com> > Signed-off-by: Amitkumar Karwar <akarwar@marvell.com> > --- > v2: 1. Get rid of reset_triggered flag. Instead split the code and use > __mwifiex_sdio_remove() (Brian Norris/Dmitry Torokhov) > 2. "v1 4/5 mwifiex: firmware dump code rearrangement.." is dropped. So > rebased accordingly. > v3: same as [v2,5/5]. The improvement of 'moving pcie_work to card struct' > suggested by Brian is taken care in next patch. > --- > drivers/net/wireless/marvell/mwifiex/pcie.c | 6 +++++- > drivers/net/wireless/marvell/mwifiex/sdio.c | 15 ++++++++++++--- > 2 files changed, 17 insertions(+), 4 deletions(-) I've expressed my unhappiness with the SDIO card-reset code that already exists (and prevents doing sane code improvements on the rest of the driver), but the current change looks OK anyway. I haven't reviewed patch 1 as closely, but for 2 to 5: Reviewed-by: Brian Norris <briannorris@chromium.org>
Hi, On Wed, Nov 16, 2016 at 06:39:08PM +0530, Amitkumar Karwar wrote: > From: Xinming Hu <huxm@marvell.com> > > Wait for firmware dump complete in card remove function. > For sdio interface, there are two diffenrent cases, > card reset trigger sdio_work and firmware dump trigger sdio_work. > Do code rearrangement for distinguish between these two cases. On second review of the SDIO card reset code (which I'll repeat is quite ugly), you seem to be making a bad distinction here. What if there is a firmware dump happening concurrently with your card-reset handling? You *do* want to synchronize with the firmware dump before completing the card reset, or else you might be freeing up internal card resources that are still in use. See below. > > Signed-off-by: Xinming Hu <huxm@marvell.com> > Signed-off-by: Amitkumar Karwar <akarwar@marvell.com> > --- > v2: 1. Get rid of reset_triggered flag. Instead split the code and use > __mwifiex_sdio_remove() (Brian Norris/Dmitry Torokhov) > 2. "v1 4/5 mwifiex: firmware dump code rearrangement.." is dropped. So > rebased accordingly. > v3: same as [v2,5/5]. The improvement of 'moving pcie_work to card struct' > suggested by Brian is taken care in next patch. > --- > drivers/net/wireless/marvell/mwifiex/pcie.c | 6 +++++- > drivers/net/wireless/marvell/mwifiex/sdio.c | 15 ++++++++++++--- > 2 files changed, 17 insertions(+), 4 deletions(-) > > diff --git a/drivers/net/wireless/marvell/mwifiex/pcie.c b/drivers/net/wireless/marvell/mwifiex/pcie.c > index dd8f7aa..c8e69a4 100644 > --- a/drivers/net/wireless/marvell/mwifiex/pcie.c > +++ b/drivers/net/wireless/marvell/mwifiex/pcie.c > @@ -51,6 +51,9 @@ static int mwifiex_pcie_probe_of(struct device *dev) > return 0; > } > > +static void mwifiex_pcie_work(struct work_struct *work); > +static DECLARE_WORK(pcie_work, mwifiex_pcie_work); > + > static int > mwifiex_map_pci_memory(struct mwifiex_adapter *adapter, struct sk_buff *skb, > size_t size, int flags) > @@ -254,6 +257,8 @@ static void mwifiex_pcie_remove(struct pci_dev *pdev) > if (!adapter || !adapter->priv_num) > return; > > + cancel_work_sync(&pcie_work); > + > if (user_rmmod && !adapter->mfg_mode) { > mwifiex_deauthenticate_all(adapter); > > @@ -2722,7 +2727,6 @@ static void mwifiex_pcie_work(struct work_struct *work) > mwifiex_pcie_device_dump_work(save_adapter); > } > > -static DECLARE_WORK(pcie_work, mwifiex_pcie_work); > /* This function dumps FW information */ > static void mwifiex_pcie_device_dump(struct mwifiex_adapter *adapter) > { > diff --git a/drivers/net/wireless/marvell/mwifiex/sdio.c b/drivers/net/wireless/marvell/mwifiex/sdio.c > index 16d1d30..78f2cc9 100644 > --- a/drivers/net/wireless/marvell/mwifiex/sdio.c > +++ b/drivers/net/wireless/marvell/mwifiex/sdio.c > @@ -46,6 +46,9 @@ > */ > static u8 user_rmmod; > > +static void mwifiex_sdio_work(struct work_struct *work); > +static DECLARE_WORK(sdio_work, mwifiex_sdio_work); > + > static struct mwifiex_if_ops sdio_ops; > static unsigned long iface_work_flags; > > @@ -220,7 +223,7 @@ static int mwifiex_sdio_resume(struct device *dev) > * This function removes the interface and frees up the card structure. > */ > static void > -mwifiex_sdio_remove(struct sdio_func *func) > +__mwifiex_sdio_remove(struct sdio_func *func) > { > struct sdio_mmc_card *card; > struct mwifiex_adapter *adapter; > @@ -249,6 +252,13 @@ static int mwifiex_sdio_resume(struct device *dev) > mwifiex_remove_card(adapter); > } > > +static void > +mwifiex_sdio_remove(struct sdio_func *func) > +{ > + cancel_work_sync(&sdio_work); > + __mwifiex_sdio_remove(func); > +} > + > /* > * SDIO suspend. > * > @@ -2227,7 +2237,7 @@ static void mwifiex_recreate_adapter(struct sdio_mmc_card *card) > * discovered and initializes them from scratch. > */ > > - mwifiex_sdio_remove(func); > + __mwifiex_sdio_remove(func); ^^ So here, you're trying to avoid syncing with the card-reset work event, except that function will free up all your resources (including the static save_adapter). Thus, you're explicitly allowing a use-after-free error here. That seems unwise. Instead, you should actually retain the invariant that you're doing a full remove/reinitialize here, which includes doing the *same* cancel_work_sync() here in mwifiex_recreate_adapter() as you would in any other remove(). IOW, kill the __mwifiex_sdio_remove() and just call mwifiex_sdio_remove() as you were. That also means that you can do the same per-adapter cleanup in the following patch as you do for PCIe. Brian > > /* > * Normally, we would let the driver core take care of releasing these. > @@ -2568,7 +2578,6 @@ static void mwifiex_sdio_work(struct work_struct *work) > mwifiex_sdio_card_reset_work(save_adapter); > } > > -static DECLARE_WORK(sdio_work, mwifiex_sdio_work); > /* This function resets the card */ > static void mwifiex_sdio_card_reset(struct mwifiex_adapter *adapter) > { > -- > 1.9.1 >
Hi Brian, > From: Brian Norris [mailto:briannorris@chromium.org] > Sent: Monday, November 21, 2016 11:06 PM > To: Amitkumar Karwar > Cc: linux-wireless@vger.kernel.org; Cathy Luo; Nishant Sarmukadam; > rajatja@google.com; dmitry.torokhov@gmail.com; Xinming Hu > Subject: Re: [PATCH v3 4/5] mwifiex: wait firmware dump complete during > card remove process > > Hi, > > On Wed, Nov 16, 2016 at 06:39:08PM +0530, Amitkumar Karwar wrote: > > From: Xinming Hu <huxm@marvell.com> > > > > Wait for firmware dump complete in card remove function. > > For sdio interface, there are two diffenrent cases, card reset > trigger > > sdio_work and firmware dump trigger sdio_work. > > Do code rearrangement for distinguish between these two cases. > > On second review of the SDIO card reset code (which I'll repeat is > quite ugly), you seem to be making a bad distinction here. What if > there is a firmware dump happening concurrently with your card-reset > handling? > You *do* want to synchronize with the firmware dump before completing the > card reset, or else you might be freeing up internal card resources > that are still in use. See below. I ran some tests and observed that if same work function is scheduled by two threads, it won't have re-entrant calls. They will be executed one after another. In SDIO work function, we have SDIO card reset call after completing firmware dump. So firmware dump won't run concurrently with card-reset as per my understanding. > > > > > Signed-off-by: Xinming Hu <huxm@marvell.com> > > Signed-off-by: Amitkumar Karwar <akarwar@marvell.com> > > --- > > v2: 1. Get rid of reset_triggered flag. Instead split the code and > use > > __mwifiex_sdio_remove() (Brian Norris/Dmitry Torokhov) > > 2. "v1 4/5 mwifiex: firmware dump code rearrangement.." is > dropped. So > > rebased accordingly. > > v3: same as [v2,5/5]. The improvement of 'moving pcie_work to card > struct' > > suggested by Brian is taken care in next patch. > > --- > > drivers/net/wireless/marvell/mwifiex/pcie.c | 6 +++++- > > drivers/net/wireless/marvell/mwifiex/sdio.c | 15 ++++++++++++--- > > 2 files changed, 17 insertions(+), 4 deletions(-) > > > > diff --git a/drivers/net/wireless/marvell/mwifiex/pcie.c > > b/drivers/net/wireless/marvell/mwifiex/pcie.c > > index dd8f7aa..c8e69a4 100644 > > --- a/drivers/net/wireless/marvell/mwifiex/pcie.c > > +++ b/drivers/net/wireless/marvell/mwifiex/pcie.c > > @@ -51,6 +51,9 @@ static int mwifiex_pcie_probe_of(struct device > *dev) > > return 0; > > } > > > > +static void mwifiex_pcie_work(struct work_struct *work); static > > +DECLARE_WORK(pcie_work, mwifiex_pcie_work); > > + > > static int > > mwifiex_map_pci_memory(struct mwifiex_adapter *adapter, struct > sk_buff *skb, > > size_t size, int flags) > > @@ -254,6 +257,8 @@ static void mwifiex_pcie_remove(struct pci_dev > *pdev) > > if (!adapter || !adapter->priv_num) > > return; > > > > + cancel_work_sync(&pcie_work); > > + > > if (user_rmmod && !adapter->mfg_mode) { > > mwifiex_deauthenticate_all(adapter); > > > > @@ -2722,7 +2727,6 @@ static void mwifiex_pcie_work(struct > work_struct *work) > > mwifiex_pcie_device_dump_work(save_adapter); > > } > > > > -static DECLARE_WORK(pcie_work, mwifiex_pcie_work); > > /* This function dumps FW information */ static void > > mwifiex_pcie_device_dump(struct mwifiex_adapter *adapter) { diff > > --git a/drivers/net/wireless/marvell/mwifiex/sdio.c > > b/drivers/net/wireless/marvell/mwifiex/sdio.c > > index 16d1d30..78f2cc9 100644 > > --- a/drivers/net/wireless/marvell/mwifiex/sdio.c > > +++ b/drivers/net/wireless/marvell/mwifiex/sdio.c > > @@ -46,6 +46,9 @@ > > */ > > static u8 user_rmmod; > > > > +static void mwifiex_sdio_work(struct work_struct *work); static > > +DECLARE_WORK(sdio_work, mwifiex_sdio_work); > > + > > static struct mwifiex_if_ops sdio_ops; static unsigned long > > iface_work_flags; > > > > @@ -220,7 +223,7 @@ static int mwifiex_sdio_resume(struct device > *dev) > > * This function removes the interface and frees up the card > structure. > > */ > > static void > > -mwifiex_sdio_remove(struct sdio_func *func) > > +__mwifiex_sdio_remove(struct sdio_func *func) > > { > > struct sdio_mmc_card *card; > > struct mwifiex_adapter *adapter; > > @@ -249,6 +252,13 @@ static int mwifiex_sdio_resume(struct device > *dev) > > mwifiex_remove_card(adapter); > > } > > > > +static void > > +mwifiex_sdio_remove(struct sdio_func *func) { > > + cancel_work_sync(&sdio_work); > > + __mwifiex_sdio_remove(func); > > +} > > + > > /* > > * SDIO suspend. > > * > > @@ -2227,7 +2237,7 @@ static void mwifiex_recreate_adapter(struct > sdio_mmc_card *card) > > * discovered and initializes them from scratch. > > */ > > > > - mwifiex_sdio_remove(func); > > + __mwifiex_sdio_remove(func); > > ^^ So here, you're trying to avoid syncing with the card-reset work > event, except that function will free up all your resources (including > the static save_adapter). Thus, you're explicitly allowing a use-after- > free error here. That seems unwise. Even if firmware dump is triggered after card reset is started, it will execute after card reset is completed as discussed above. Only problem I can see is with "save_adapter". We can put new_adapter pointer in "save_adapter" at the end of mwifiex_recreate_adapter() to solve the issue. > > Instead, you should actually retain the invariant that you're doing a > full remove/reinitialize here, which includes doing the *same* > cancel_work_sync() here in mwifiex_recreate_adapter() as you would in > any other remove(). We are executing mwifiex_recreate_adapter() as a part of sdio_work(). Calling cancel_work_sync() here would cause deadlock. The API is supposed to waits until sdio_work() is finished. > > IOW, kill the __mwifiex_sdio_remove() and just call > mwifiex_sdio_remove() as you were. > > That also means that you can do the same per-adapter cleanup in the > following patch as you do for PCIe. Currently as sdio_work recreates "card", the work structure can't be moved inside card structure. Let me know your suggestions. Regards, Amitkumar
Hi Amit, On Thu, Nov 24, 2016 at 12:14:07PM +0000, Amitkumar Karwar wrote: > > From: Brian Norris [mailto:briannorris@chromium.org] > > Sent: Monday, November 21, 2016 11:06 PM > > To: Amitkumar Karwar > > Cc: linux-wireless@vger.kernel.org; Cathy Luo; Nishant Sarmukadam; > > rajatja@google.com; dmitry.torokhov@gmail.com; Xinming Hu > > Subject: Re: [PATCH v3 4/5] mwifiex: wait firmware dump complete during > > card remove process > > > > On Wed, Nov 16, 2016 at 06:39:08PM +0530, Amitkumar Karwar wrote: > > > From: Xinming Hu <huxm@marvell.com> > > > > > > Wait for firmware dump complete in card remove function. > > > For sdio interface, there are two diffenrent cases, card reset > > trigger > > > sdio_work and firmware dump trigger sdio_work. > > > Do code rearrangement for distinguish between these two cases. > > > > On second review of the SDIO card reset code (which I'll repeat is > > quite ugly), you seem to be making a bad distinction here. What if > > there is a firmware dump happening concurrently with your card-reset > > handling? > > You *do* want to synchronize with the firmware dump before completing the > > card reset, or else you might be freeing up internal card resources > > that are still in use. See below. > > I ran some tests and observed that if same work function is scheduled > by two threads, it won't have re-entrant calls. They will be executed > one after another. In SDIO work function, we have SDIO card reset call > after completing firmware dump. So firmware dump won't run > concurrently with card-reset as per my understanding. Ah, you're correct. It's somewhat obscure and potentially fragile, but correct AFAICT. As you noted though, you do still have a use-after-free bug, even if the concurrency isn't quite as high as I thought. > > > Signed-off-by: Xinming Hu <huxm@marvell.com> > > > Signed-off-by: Amitkumar Karwar <akarwar@marvell.com> > > > --- ... > > > --- a/drivers/net/wireless/marvell/mwifiex/sdio.c > > > +++ b/drivers/net/wireless/marvell/mwifiex/sdio.c > > > @@ -46,6 +46,9 @@ > > > */ > > > static u8 user_rmmod; > > > > > > +static void mwifiex_sdio_work(struct work_struct *work); static > > > +DECLARE_WORK(sdio_work, mwifiex_sdio_work); > > > + > > > static struct mwifiex_if_ops sdio_ops; static unsigned long > > > iface_work_flags; > > > > > > @@ -220,7 +223,7 @@ static int mwifiex_sdio_resume(struct device > > *dev) > > > * This function removes the interface and frees up the card > > structure. > > > */ > > > static void > > > -mwifiex_sdio_remove(struct sdio_func *func) > > > +__mwifiex_sdio_remove(struct sdio_func *func) > > > { > > > struct sdio_mmc_card *card; > > > struct mwifiex_adapter *adapter; > > > @@ -249,6 +252,13 @@ static int mwifiex_sdio_resume(struct device > > *dev) > > > mwifiex_remove_card(adapter); > > > } > > > > > > +static void > > > +mwifiex_sdio_remove(struct sdio_func *func) { > > > + cancel_work_sync(&sdio_work); > > > + __mwifiex_sdio_remove(func); > > > +} > > > + > > > /* > > > * SDIO suspend. > > > * > > > @@ -2227,7 +2237,7 @@ static void mwifiex_recreate_adapter(struct > > sdio_mmc_card *card) > > > * discovered and initializes them from scratch. > > > */ > > > > > > - mwifiex_sdio_remove(func); > > > + __mwifiex_sdio_remove(func); > > > > ^^ So here, you're trying to avoid syncing with the card-reset work > > event, except that function will free up all your resources (including > > the static save_adapter). Thus, you're explicitly allowing a use-after- > > free error here. That seems unwise. > > Even if firmware dump is triggered after card reset is started, it > will execute after card reset is completed as discussed above. Only > problem I can see is with "save_adapter". We can put new_adapter > pointer in "save_adapter" at the end of mwifiex_recreate_adapter() to > solve the issue. Ugh, yet another band-aid? You might do better to still cancel any pending work, just don't do it synchronously. i.e., do cancel_work() after you've removed the card. It doesn't make sense to do a FW dump on the "new" adapter when it was requested for the old one. > > Instead, you should actually retain the invariant that you're doing a > > full remove/reinitialize here, which includes doing the *same* > > cancel_work_sync() here in mwifiex_recreate_adapter() as you would in > > any other remove(). > > We are executing mwifiex_recreate_adapter() as a part of sdio_work(). Calling > cancel_work_sync() here would cause deadlock. The API is supposed to waits > until sdio_work() is finished. You are correct. So using the _sync() version would be wrong. > > > > IOW, kill the __mwifiex_sdio_remove() and just call > > mwifiex_sdio_remove() as you were. > > > > That also means that you can do the same per-adapter cleanup in the > > following patch as you do for PCIe. > > Currently as sdio_work recreates "card", the work structure can't be moved inside card structure. > Let me know your suggestions. If you address the TODO in mwifiex_create_adapter() instead, you can get past this problem: /* TODO mmc_hw_reset does not require destroying and re-probing the * whole adapter. Hence there was no need to for this rube-goldberg * design to reload the fw from an external workqueue. If we don't * destroy the adapter we could reload the fw from * mwifiex_main_work_queue directly. The "save_adapter" is an abomination that should be terminated swiftly, but it is perpetuated in part by the hacks noted in the TODO. So I'd recommend addressing the TODO ASAP, but in the meantime, a hack like my suggestion (cancel the FW dump work w/o synchronizing) or -- less preferably -- yours (manually set 'save_adapter' again) might be OK. I think I've asked elsewhere but didn't receive an answer: why is SDIO's mwifiex_recreate_adapter() so much different from PCIe's mwifiex_do_flr()? It seems like the latter should be refactored to remove some of the PCIe-specific cruft from main.c and then reused for SDIO. Brian
Hi Brian, > > > > * > > > > @@ -2227,7 +2237,7 @@ static void mwifiex_recreate_adapter(struct > > > sdio_mmc_card *card) > > > > * discovered and initializes them from scratch. > > > > */ > > > > > > > > - mwifiex_sdio_remove(func); > > > > + __mwifiex_sdio_remove(func); > > > > > > ^^ So here, you're trying to avoid syncing with the card-reset work > > > event, except that function will free up all your resources > > > (including the static save_adapter). Thus, you're explicitly > > > allowing a use-after- free error here. That seems unwise. > > > > Even if firmware dump is triggered after card reset is started, it > > will execute after card reset is completed as discussed above. Only > > problem I can see is with "save_adapter". We can put new_adapter > > pointer in "save_adapter" at the end of mwifiex_recreate_adapter() to > > solve the issue. > > Ugh, yet another band-aid? You might do better to still cancel any > pending work, just don't do it synchronously. i.e., do cancel_work() > after you've removed the card. It doesn't make sense to do a FW dump on > the "new" adapter when it was requested for the old one. I could not find async version of cancel_work(). We can fix this problem with below change at the end of mwifiex_sdio_work(). All pending work requests would be ignored. -------- @ -2571,6 +2571,8 @@ static void mwifiex_sdio_work(struct work_struct *work) if (test_and_clear_bit(MWIFIEX_IFACE_WORK_CARD_RESET, &iface_work_flags)) mwifiex_sdio_card_reset_work(save_adapter); + clear_bit(MWIFIEX_IFACE_WORK_DEVICE_DUMP, &iface_work_flags); + clear_bit(MWIFIEX_IFACE_WORK_CARD_RESET, &iface_work_flags); } ---------- > > > > Instead, you should actually retain the invariant that you're doing > > > a full remove/reinitialize here, which includes doing the *same* > > > cancel_work_sync() here in mwifiex_recreate_adapter() as you would > > > in any other remove(). > > > > We are executing mwifiex_recreate_adapter() as a part of sdio_work(). > > Calling > > cancel_work_sync() here would cause deadlock. The API is supposed to > > waits until sdio_work() is finished. > > You are correct. So using the _sync() version would be wrong. > > > > > > > IOW, kill the __mwifiex_sdio_remove() and just call > > > mwifiex_sdio_remove() as you were. > > > > > > That also means that you can do the same per-adapter cleanup in the > > > following patch as you do for PCIe. > > > > Currently as sdio_work recreates "card", the work structure can't be > moved inside card structure. > > Let me know your suggestions. > > If you address the TODO in mwifiex_create_adapter() instead, you can > get past this problem: > > /* TODO mmc_hw_reset does not require destroying and re-probing > the > * whole adapter. Hence there was no need to for this rube- > goldberg > * design to reload the fw from an external workqueue. If we > don't > * destroy the adapter we could reload the fw from > * mwifiex_main_work_queue directly. > > The "save_adapter" is an abomination that should be terminated swiftly, > but it is perpetuated in part by the hacks noted in the TODO. > > So I'd recommend addressing the TODO ASAP, but in the meantime, a hack > like my suggestion (cancel the FW dump work w/o synchronizing) or -- > less preferably -- yours (manually set 'save_adapter' again) might be > OK. > > I think I've asked elsewhere but didn't receive an answer: why is > SDIO's mwifiex_recreate_adapter() so much different from PCIe's > mwifiex_do_flr()? It seems like the latter should be refactored to > remove some of the PCIe-specific cruft from main.c and then reused for > SDIO. Our initial SDIO card reset implementation was based on MMC APIs where remove() and probe() would automatically get called by MMC subsystem after power cycle. https://www.spinics.net/lists/linux-wireless/msg98435.html Later it was improved by Andreas Fenkart by replacing those power cycle APIs with mmc_hw_reset(). For PCIe, function level reset is standard feature. We implemented ".reset_notify" handler which gets called after and before FLR. You are right. We can have SDIO's handling similar to PCIe and avoid destroying+recreating adapter/card. We have started working on this. We will get rid of global save_adapter, sdio_work etc. Meanwhile I will post above mentioned change if it looks good to you. Regards, Amitkumar
On Wed, Nov 30, 2016 at 12:39:11PM +0000, Amitkumar Karwar wrote: > > Ugh, yet another band-aid? You might do better to still cancel any > > pending work, just don't do it synchronously. i.e., do cancel_work() > > after you've removed the card. It doesn't make sense to do a FW dump on > > the "new" adapter when it was requested for the old one. > > I could not find async version of cancel_work(). cancel_work() *is* asynchronous. It does not synchronize with the last event, so you won't have the deadlock. (Remember: the synchronous version is cancel_work_sync().) > We can fix this problem with below change at the end of > mwifiex_sdio_work(). All pending work requests would be ignored. > > -------- > @ -2571,6 +2571,8 @@ static void mwifiex_sdio_work(struct work_struct *work) > if (test_and_clear_bit(MWIFIEX_IFACE_WORK_CARD_RESET, > &iface_work_flags)) > mwifiex_sdio_card_reset_work(save_adapter); > + clear_bit(MWIFIEX_IFACE_WORK_DEVICE_DUMP, &iface_work_flags); > + clear_bit(MWIFIEX_IFACE_WORK_CARD_RESET, &iface_work_flags); > } > ---------- I don't think that's exactly what you want. That might lose events, won't it? I'd rather this sort of hack go into mwifiex_recreate_adapter(), in between the remove() and probe() calls, where you don't expect any new events to trigger. And maybe include a comment as to why. > > I think I've asked elsewhere but didn't receive an answer: why is > > SDIO's mwifiex_recreate_adapter() so much different from PCIe's > > mwifiex_do_flr()? It seems like the latter should be refactored to > > remove some of the PCIe-specific cruft from main.c and then reused for > > SDIO. > > Our initial SDIO card reset implementation was based on MMC APIs where > remove() and probe() would automatically get called by MMC subsystem > after power cycle. > https://www.spinics.net/lists/linux-wireless/msg98435.html > Later it was improved by Andreas Fenkart by replacing those power > cycle APIs with mmc_hw_reset(). Right. > For PCIe, function level reset is standard feature. We implemented > ".reset_notify" handler which gets called after and before FLR. OK. > You are right. We can have SDIO's handling similar to PCIe and avoid > destroying+recreating adapter/card. So all in all, you're saying it's just an artifact of history, and there's no good reason they are so different? If so, then this looks like another instance where you would have done well to refactor and improve the existing mechanisms at the same time as you added new features (i.e., PCIe FLR). I've seen this problem already several times, where it seems development for your SDIO/PCIe/USB interface drivers occur almost in isolation. IMO, it'd do you well to notice these patterns while implementing features in the first place. The more code you can share, the fewer bugs you (or I) will have to chase down. > We have started working on this. We will get rid of global > save_adapter, sdio_work etc. Great! > Meanwhile I will post above mentioned change if it looks good to you. It's not perfect, but it's a start. Brian
Hi Brian, > From: Brian Norris [mailto:briannorris@chromium.org] > Sent: Thursday, December 01, 2016 12:04 AM > To: Amitkumar Karwar > Cc: linux-wireless@vger.kernel.org; Cathy Luo; Nishant Sarmukadam; > rajatja@google.com; dmitry.torokhov@gmail.com; Xinming Hu > Subject: Re: [PATCH v3 4/5] mwifiex: wait firmware dump complete during > card remove process > > On Wed, Nov 30, 2016 at 12:39:11PM +0000, Amitkumar Karwar wrote: > > > > Ugh, yet another band-aid? You might do better to still cancel any > > > pending work, just don't do it synchronously. i.e., do > cancel_work() > > > after you've removed the card. It doesn't make sense to do a FW > dump > > > on the "new" adapter when it was requested for the old one. > > > > I could not find async version of cancel_work(). > > cancel_work() *is* asynchronous. It does not synchronize with the last > event, so you won't have the deadlock. (Remember: the synchronous > version is cancel_work_sync().) My bad! What I meant is "I could not find async version of cancel_work_sync()" cancel_work() isn't available in http://lxr.free-electrons.com/source/kernel/workqueue.c Anyways, clear_bit() after remove() during card reset would address the problem. > > > We can fix this problem with below change at the end of > > mwifiex_sdio_work(). All pending work requests would be ignored. > > > > -------- > > @ -2571,6 +2571,8 @@ static void mwifiex_sdio_work(struct work_struct > *work) > > if (test_and_clear_bit(MWIFIEX_IFACE_WORK_CARD_RESET, > > &iface_work_flags)) > > mwifiex_sdio_card_reset_work(save_adapter); > > + clear_bit(MWIFIEX_IFACE_WORK_DEVICE_DUMP, &iface_work_flags); > > + clear_bit(MWIFIEX_IFACE_WORK_CARD_RESET, &iface_work_flags); > > } > > ---------- > > I don't think that's exactly what you want. That might lose events, > won't it? I'd rather this sort of hack go into > mwifiex_recreate_adapter(), in between the remove() and probe() calls, > where you don't expect any new events to trigger. And maybe include a > comment as to why. Right. I have just posted a patch for this. > > > > I think I've asked elsewhere but didn't receive an answer: why is > > > SDIO's mwifiex_recreate_adapter() so much different from PCIe's > > > mwifiex_do_flr()? It seems like the latter should be refactored to > > > remove some of the PCIe-specific cruft from main.c and then reused > > > for SDIO. > > > > Our initial SDIO card reset implementation was based on MMC APIs > where > > remove() and probe() would automatically get called by MMC subsystem > > after power cycle. > > https://www.spinics.net/lists/linux-wireless/msg98435.html > > Later it was improved by Andreas Fenkart by replacing those power > > cycle APIs with mmc_hw_reset(). > > Right. > > > For PCIe, function level reset is standard feature. We implemented > > ".reset_notify" handler which gets called after and before FLR. > > OK. > > > You are right. We can have SDIO's handling similar to PCIe and avoid > > destroying+recreating adapter/card. > > So all in all, you're saying it's just an artifact of history, and > there's no good reason they are so different? If so, then this looks > like another instance where you would have done well to refactor and > improve the existing mechanisms at the same time as you added new > features (i.e., PCIe FLR). I've seen this problem already several > times, where it seems development for your SDIO/PCIe/USB interface > drivers occur almost in isolation. IMO, it'd do you well to notice > these patterns while implementing features in the first place. The more > code you can share, the fewer bugs you (or I) will have to chase down. Thanks for your guidance. I'll follow this for future development. Regards, Amitkumar
Hi, On Thu, Dec 01, 2016 at 02:02:43PM +0000, Amitkumar Karwar wrote: > > > I could not find async version of cancel_work(). > > > > cancel_work() *is* asynchronous. It does not synchronize with the last > > event, so you won't have the deadlock. (Remember: the synchronous > > version is cancel_work_sync().) > > My bad! What I meant is "I could not find async version of cancel_work_sync()" > cancel_work() isn't available in http://lxr.free-electrons.com/source/kernel/workqueue.c It's in 4.9-rc1 (and it's available at the above link, at least by now). See: commit f72b8792d180948b4b3898374998f5ac8c02e539 Author: Jens Axboe <axboe@fb.com> Date: Wed Aug 24 15:51:50 2016 -0600 workqueue: add cancel_work() But anyway: > Anyways, clear_bit() after remove() during card reset would address the problem. Yes, I think that's OK. Brian
diff --git a/drivers/net/wireless/marvell/mwifiex/pcie.c b/drivers/net/wireless/marvell/mwifiex/pcie.c index dd8f7aa..c8e69a4 100644 --- a/drivers/net/wireless/marvell/mwifiex/pcie.c +++ b/drivers/net/wireless/marvell/mwifiex/pcie.c @@ -51,6 +51,9 @@ static int mwifiex_pcie_probe_of(struct device *dev) return 0; } +static void mwifiex_pcie_work(struct work_struct *work); +static DECLARE_WORK(pcie_work, mwifiex_pcie_work); + static int mwifiex_map_pci_memory(struct mwifiex_adapter *adapter, struct sk_buff *skb, size_t size, int flags) @@ -254,6 +257,8 @@ static void mwifiex_pcie_remove(struct pci_dev *pdev) if (!adapter || !adapter->priv_num) return; + cancel_work_sync(&pcie_work); + if (user_rmmod && !adapter->mfg_mode) { mwifiex_deauthenticate_all(adapter); @@ -2722,7 +2727,6 @@ static void mwifiex_pcie_work(struct work_struct *work) mwifiex_pcie_device_dump_work(save_adapter); } -static DECLARE_WORK(pcie_work, mwifiex_pcie_work); /* This function dumps FW information */ static void mwifiex_pcie_device_dump(struct mwifiex_adapter *adapter) { diff --git a/drivers/net/wireless/marvell/mwifiex/sdio.c b/drivers/net/wireless/marvell/mwifiex/sdio.c index 16d1d30..78f2cc9 100644 --- a/drivers/net/wireless/marvell/mwifiex/sdio.c +++ b/drivers/net/wireless/marvell/mwifiex/sdio.c @@ -46,6 +46,9 @@ */ static u8 user_rmmod; +static void mwifiex_sdio_work(struct work_struct *work); +static DECLARE_WORK(sdio_work, mwifiex_sdio_work); + static struct mwifiex_if_ops sdio_ops; static unsigned long iface_work_flags; @@ -220,7 +223,7 @@ static int mwifiex_sdio_resume(struct device *dev) * This function removes the interface and frees up the card structure. */ static void -mwifiex_sdio_remove(struct sdio_func *func) +__mwifiex_sdio_remove(struct sdio_func *func) { struct sdio_mmc_card *card; struct mwifiex_adapter *adapter; @@ -249,6 +252,13 @@ static int mwifiex_sdio_resume(struct device *dev) mwifiex_remove_card(adapter); } +static void +mwifiex_sdio_remove(struct sdio_func *func) +{ + cancel_work_sync(&sdio_work); + __mwifiex_sdio_remove(func); +} + /* * SDIO suspend. * @@ -2227,7 +2237,7 @@ static void mwifiex_recreate_adapter(struct sdio_mmc_card *card) * discovered and initializes them from scratch. */ - mwifiex_sdio_remove(func); + __mwifiex_sdio_remove(func); /* * Normally, we would let the driver core take care of releasing these. @@ -2568,7 +2578,6 @@ static void mwifiex_sdio_work(struct work_struct *work) mwifiex_sdio_card_reset_work(save_adapter); } -static DECLARE_WORK(sdio_work, mwifiex_sdio_work); /* This function resets the card */ static void mwifiex_sdio_card_reset(struct mwifiex_adapter *adapter) {