Message ID | 20130703015653.82651FAAD5@dev.laptop.org (mailing list archive) |
---|---|
State | Not Applicable, archived |
Headers | show |
Hi Daniel, > When the card is being removed, mwifiex_remove_card() immediately sets > surprise_removed to 1. This flag then causes the SDIO interrupt handler > to ignore all interrupts without even acking them. Since the card is removed, even if there is a pending interrupt received after card removal, it's not possible to ack it or disable interrupts. We cannot access hardware registers after card removal. Perhaps you are talking about unloading driver (rmmod mwifiex_sdio) instead? Thanks, Bing -- To unsubscribe from this list: send the line "unsubscribe linux-wireless" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Wed, Jul 3, 2013 at 12:41 PM, Bing Zhao <bzhao@marvell.com> wrote:
> Perhaps you are talking about unloading driver (rmmod mwifiex_sdio) instead?
Yes - or going into unpowered suspend, or powering down the computer.
All of those trigger the "remove card" path which is what I wanted to
refer to.
Thanks
Daniel
--
To unsubscribe from this list: send the line "unsubscribe linux-wireless" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Hi Daniel, > Make the driver continue to ack interrupts during shutdown to avoid > this. This is harder than it might sound. > > We must be careful not to act upon the interrupt, only ack it. > Otherwise, we end up setting int_status to something. And hw_status is > set to MWIFIEX_HW_STATUS_CLOSING for obvious reasons. We would hit the > following infinite loop in mwifiex_main_process: > > process_start: > do { > if ((adapter->hw_status == MWIFIEX_HW_STATUS_CLOSING) || > (adapter->hw_status == MWIFIEX_HW_STATUS_NOT_READY)) > break; > [...] > } while (true); > if ((adapter->int_status) || IS_CARD_RX_RCVD(adapter)) > goto process_start; > > We must also observe that ACKing interrupts involves reading a load > of data into the mp_regs buffer. The driver doesn't do much in the > way of ensuring that interrupts are disabled before freeing buffers > such as mp_regs, but we do need something here to make sure that we > don't get any interrupts after mp_regs is freed. > > This whole thing feels rather fragile, but I couldn't see a clean > way to do it, the driver seems a bit disorganised here. I would > welcome a review from the designers. Followings are the actions taken for driver unload. a) If device is connected, sync deauth command is sent. b) Auto deep sleep is cancelled by sending sync command c) Sync shutdown command is queued and HW_STATUS is changed to reset. d) Now no other command gets queued based on HW_STATUS. e) Wait for shutdown command response and handle it. f) Set surprise_removed flag which blocks SDIO interrupts As per our design, we don't send any command to firmware after SHUTDOWN command. Also, firmware doesn't send any interrupt after SHUTDOWN command response. The interrupt received after setting surprise_removed flag is unexpected. We need to get the exact procedure to replicate and figure out the root cause. By thy way, I will be OOO for approximately two weeks. Amitkumar Karwar will continue to work with you to debug the issue. Thanks, Bing -- To unsubscribe from this list: send the line "unsubscribe linux-wireless" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Wed, Jul 3, 2013 at 4:39 PM, Bing Zhao <bzhao@marvell.com> wrote: > Followings are the actions taken for driver unload. > > a) If device is connected, sync deauth command is sent. > b) Auto deep sleep is cancelled by sending sync command > c) Sync shutdown command is queued and HW_STATUS is changed to reset. > d) Now no other command gets queued based on HW_STATUS. > e) Wait for shutdown command response and handle it. > f) Set surprise_removed flag which blocks SDIO interrupts > > As per our design, we don't send any command to firmware after SHUTDOWN command. Also, firmware doesn't send any interrupt after SHUTDOWN command response. I cannot see how the driver behaviour matches the above description for when the card is removed as the system is suspended. For example I cannot see where the SHUTDOWN command gets sent in such cases. Also, at which point do we wait upon all async commands to complete before shutting down? I walked through all the driver code in the codepaths hit when the card is removed as the system is going into suspend, and I found no such point. (This is why interrupts then arrive later, because those commands are completing.) Daniel -- To unsubscribe from this list: send the line "unsubscribe linux-wireless" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Thu, Jul 4, 2013 at 8:32 AM, Daniel Drake <dsd@laptop.org> wrote: > I cannot see how the driver behaviour matches the above description > for when the card is removed as the system is suspended. For example I > cannot see where the SHUTDOWN command gets sent in such cases. > > Also, at which point do we wait upon all async commands to complete > before shutting down? I walked through all the driver code in the > codepaths hit when the card is removed as the system is going into > suspend, and I found no such point. (This is why interrupts then > arrive later, because those commands are completing.) Also, from a more general standpoint, I would say it is bad practice/design to leave an interrupt handler running but in a state where it does not ACK interrupts. You're just asking for trouble. If you really don't expect interrupts beyond a certain point, and bad things would happen if interrupts *were* to arrive for any valid or invalid reason, disable the interrupt and remove the handler at that point. Daniel -- To unsubscribe from this list: send the line "unsubscribe linux-wireless" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Hi Daniel, >> Followings are the actions taken for driver unload. >> >> a) If device is connected, sync deauth command is sent. >> b) Auto deep sleep is cancelled by sending sync command >> c) Sync shutdown command is queued and HW_STATUS is changed to reset. >> d) Now no other command gets queued based on HW_STATUS. >> e) Wait for shutdown command response and handle it. >> f) Set surprise_removed flag which blocks SDIO interrupts >> >> As per our design, we don't send any command to firmware after SHUTDOWN command. Also, firmware doesn't send any interrupt after SHUTDOWN command response. >I cannot see how the driver behaviour matches the above description >for when the card is removed as the system is suspended. For example I >cannot see where the SHUTDOWN command gets sent in such cases. This behavior (i.e. mwifiex_sdio_remove() handler) is for driver unload(rmmod mwifiex_sdio) or when the system is powering down. Our mwifiex_sdio_suspend() handler gets called when system is supended. It just informs firmware/hardware that host is going into sleep state by sending a command. The sleep is cancelled in mwifiex_sdio_resume() handler when sytem resumes. We support wake-on-lan feature. User can configure wol conditions(using 'ethtool wol' or 'iw wowlan') before system suspend. Firmware wakes up the host if Rx packet matching configured condition is received. I think, we should probably look into mwifiex_sdio_suspend() routine instead of mwifiex_sdio_remove() to debug interrupt storm issue. Please correct me if I am missing something. We have wait_event_interruptible() call in suspend handler to make sure that we don't return from the handler until we get host sleep activated event from firmware. Firmware is not supposed to send any event after this. We can't disable interrupts at this point, because we expect wakeup event from firmware upon packet arrival Can you please share system suspend logs with dynamic debug enabled for mwifiex? Thanks, Amitkumar Karwar -- To unsubscribe from this list: send the line "unsubscribe linux-wireless" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Hi Daniel, >Also, at which point do we wait upon all async commands to complete >before shutting down? All sync and async commands are queued into same queue. Sync and async type indicates if the thread just queues the command or it queues the command and waits until command response is received. Next command is sent to firmware only after receiving response of previous command. As we don't queue any command after SHUTDOWN command is queued, SHUTDOWN command is the last command sent to firmware. SHUTDOWN is a sync command. Hence remove handler thread waits until it's response. Please let me know if there are any doubts. Thanks, Amit -- To unsubscribe from this list: send the line "unsubscribe linux-wireless" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Fri, Jul 5, 2013 at 8:26 AM, Amitkumar Karwar <akarwar@marvell.com> wrote:
> I think, we should probably look into mwifiex_sdio_suspend() routine instead of mwifiex_sdio_remove() to debug interrupt storm issue. Please correct me if I am missing something.
The system is going into an unpowered suspend. This means that
mwifiex_sdio_suspend() returns -ENOSYS.
The card then gets removed by mwifiex_sdio_remove().
mwifiex_sdio_remove() calls mwifiex_remove_card() without having taken
any care to finish pending commands, etc. mwifiex_remove_card()
immediately sets surprise_removed which triggers the questionable
"ignore all interrupts" behaviour. If there were any async commands
pending, they complete now, with an interrupt. The interrupt doesn't
get acked, so it becomes an interrupt storm.
Hope that is clearer.
Daniel
--
To unsubscribe from this list: send the line "unsubscribe linux-wireless" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Hi Daniel, >mwifiex_sdio_remove() calls mwifiex_remove_card() without having taken >any care to finish pending commands, etc. mwifiex_remove_card() >immediately sets surprise_removed which triggers the questionable >"ignore all interrupts" behaviour. If there were any async commands >pending, they complete now, with an interrupt. The interrupt doesn't >get acked, so it becomes an interrupt storm. >Hope that is clearer. Thanks for the clarification. As per our current design, mwifiex_sdio_remove() will be called in following scenarios. 1) Card is powered off / card is unplugged As we are not allowed to interact with hardware (read/disable interrupts), other driver specific cleanup work is performed. 2) Driver is unloaded Here apart from cleanup work in case 1, we do send SHUTDOWN etc. command to firmware to perform hardware cleanup(this code is under if(user_rmmod) check). In your case, as card is powering off, is it ok to access hardware registers to ACK interrupts? If yes, we can disable interrupts before returning "-ENOSYS" in suspend handler. Modifying mwifiex_sdio_remove() to fix interrupt storm issue would break card unplugged scenario. Please let me know your thoughts. Thanks, Amit -- To unsubscribe from this list: send the line "unsubscribe linux-wireless" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Fri, Jul 5, 2013 at 9:24 AM, Amitkumar Karwar <akarwar@marvell.com> wrote: > In your case, as card is powering off, is it ok to access hardware registers to ACK interrupts? If yes, we can disable interrupts before returning "-ENOSYS" in suspend handler. You tell me - you know the hardware and the driver better than I do. I don't see any external reason why we are not allowed to communicate to the hardware here. The other suspend paths do run some communication. However, disabling interrupts at this point does seem like a bad solution to me, and something that would come back and bite us in the future. mwifiex_sdio_suspend() is not directly related to the cause of the problem. The problem that we should focus on is that mwifiex_remove_card() makes the driver enter a state where if an interrupt arrives, it will be handled badly. Therefore it should be mwifiex_remove_card() or something directly related to it (maybe the callsite, mwifiex_sdio_remove) that takes the necessary steps to make sure that interrupts will not arrive in future. > Modifying mwifiex_sdio_remove() to fix interrupt storm issue would break card unplugged scenario. Why would it break that scenario? Daniel -- To unsubscribe from this list: send the line "unsubscribe linux-wireless" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Hi Daniel, >I don't see any external reason why we are not allowed to communicate to >the hardware here. The other suspend paths do run some communication. The reason is mwifiex_sdio_remove() gets called even if user removes an externally connected wifi card. In that case the hardware is not present so there is no point in reading/writing hardware registers. If mwifiex_sdio_remove() is called when card is not powered off or unplugged, we do take care of hardware de-initialization using user_rmmod flag. Interrupt storm issue doesn't exist there. >However, disabling interrupts at this point does seem like a bad >solution to me, and something that would come back and bite us in the >future. mwifiex_sdio_suspend() is not directly related to the cause of >the problem. This approach looks correct to me. While returning failure in suspend handler, 'pm_flag & MMC_PM_KEEP_POWER' is false. Card is going to be powered off soon. There won't be any further communication with card, so disabling interrupts at this point of time is logical. When system is resumed, bus rescan etc. will happen and card will be redetected. Can you please try this change? Thanks and regards, Amitkumar Karwar -- To unsubscribe from this list: send the line "unsubscribe linux-wireless" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Fri, Jul 5, 2013 at 10:43 AM, Amitkumar Karwar <akarwar@marvell.com> wrote: > > Hi Daniel, > >>I don't see any external reason why we are not allowed to communicate to >>the hardware here. The other suspend paths do run some communication. > > The reason is mwifiex_sdio_remove() gets called even if user removes an externally connected wifi card. In that case the hardware is not present so there is no point in reading/writing hardware registers. There is no point in the one case that you mention. But there is a very good reason to do so in the case that I am talking about. So why not do it? There is no harm to do it if the hardware is not present, right? > This approach looks correct to me. While returning failure in suspend handler, 'pm_flag & MMC_PM_KEEP_POWER' is false. Card is going to be powered off soon. There won't be any further communication with card, so disabling interrupts at this point of time is logical. > When system is resumed, bus rescan etc. will happen and card will be redetected. This doesn't make much sense to me either. Why don't we just disable the interrupt handler at the point when interrupts can no longer be correctly handled by the driver? Daniel -- To unsubscribe from this list: send the line "unsubscribe linux-wireless" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Hi Daniel, >There is no point in the one case that you mention. But there is a >very good reason to do so in the case that I am talking about. So why >not do it? There is no harm to do it if the hardware is not present, >right? I got your point. We didn't test the behavior of sdio_read/sdio_write API's when hardware is not present. But they should just return an error. >This doesn't make much sense to me either. Why don't we just disable >the interrupt handler at the point when interrupts can no longer be >correctly handled by the driver? Agreed. Following change will fix the issue. Right? Instead of going for your earlier changes to ack interrupts until we disable them. @@ -152,6 +152,9 @@ mwifiex_sdio_remove(struct sdio_func *func) mwifiex_init_shutdown_fw(priv, MWIFIEX_FUNC_SHUTDOWN); } + /* Disable host interrupt mask register for SDIO */ + mwifiex_sdio_disable_host_int(adapter); + mwifiex_remove_card(card->adapter, &add_remove_card_sem); kfree(card); } Thanks, Amitkumar Karwar -- To unsubscribe from this list: send the line "unsubscribe linux-wireless" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Hi Daniel,
>We didn't test the behavior of sdio_read/sdio_write API's when hardware is >not present. But they should just return an error.
I ran some tests which confirmed that there is no harm in calling these API's when hardware is removed. They returned ENOMEDIUM error.
Can you please review and verify attached patch?
Thanks,
Amitkumar Karwar
On Sun, Jul 7, 2013 at 10:15 AM, Amitkumar Karwar <akarwar@marvell.com> wrote: > >>We didn't test the behavior of sdio_read/sdio_write API's when hardware is >not present. But they should just return an error. > > I ran some tests which confirmed that there is no harm in calling these API's when hardware is removed. They returned ENOMEDIUM error. > > Can you please review and verify attached patch? I gave it a quick test, didn't see any crashes. Thanks. However, I'm concerned that this patch is worsening the confusion in this part of the driver. Unless there is a reason to do otherwise, the norm here would be to tightly couple the hardware bits that enable interrupts to the Linux IRQ handler registration. i.e. sdio_claim_irq should be called immediately before mwifiex_sdio_enable_host_int, and sdio_release_irq should be called immediately after mwifiex_sdio_disable_host_int. Right now these 2 steps are very separate (sdio_release_irq is being called far too late, long after the point when the driver could sanely handle an interrupt). Although you would never expect an interrupt to arrive after mwifiex_sdio_disable_host_int() has been called, the cleanliness and readability has value here, and maybe damaged hardware would misbehave and fire an interrupt anyway, which would cause the driver to go wrong. Secondly, it should be symmetrical. If the core mwifiex driver is the thing that enables interrupts, it should be the component responsible for disabling them as well. With your patch, the core mwifiex driver enables interrupts, but it is up to the sdio sub-driver to disable them. Daniel -- To unsubscribe from this list: send the line "unsubscribe linux-wireless" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Hi Daniel, Thanks for your elaborative comments. We can go with your patch which fixes the issue in a cleaner way. I ran some tests and verified remove path with it. >Unless there is a reason to do otherwise, the norm here would be to >tightly couple the hardware bits that enable interrupts to the Linux >IRQ handler registration. i.e. sdio_claim_irq should be called >immediately before mwifiex_sdio_enable_host_int, and sdio_release_irq >should be called immediately after mwifiex_sdio_disable_host_int. >Right now these 2 steps are very separate (sdio_release_irq is being >called far too late, long after the point when the driver could sanely >handle an interrupt). Now if_ops.disable_int call is a bit closer to sdio_release_irq (in if_ops.unregister). >Although you would never expect an interrupt to arrive after >mwifiex_sdio_disable_host_int() has been called, the cleanliness and >readability has value here, and maybe damaged hardware would misbehave >and fire an interrupt anyway, which would cause the driver to go >wrong. mwifiex_main_process() which may go into an infinite loop is not called in this case now. >Secondly, it should be symmetrical. If the core mwifiex driver is the >thing that enables interrupts, it should be the component responsible >for disabling them as well. With your patch, the core mwifiex driver >enables interrupts, but it is up to the sdio sub-driver to disable >them. if_ops.disable_int call takes care of it. Regards, Amitkumar Karwar -- To unsubscribe from this list: send the line "unsubscribe linux-wireless" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Thu, Jul 11, 2013 at 8:17 AM, Amitkumar Karwar <akarwar@marvell.com> wrote: > Hi Daniel, > > Thanks for your elaborative comments. We can go with your patch which fixes the issue in a cleaner way. I ran some tests and verified remove path with it. Hmm. Now that I heard back from you and Bing, that interrupts at this point are totally unexpected, I would prefer to see the interrupt handler being disabled at the appropriate time, I think we can do better than my original patch. Let me see if I can find some time today/tomorrow to explore a better approach. Daniel -- To unsubscribe from this list: send the line "unsubscribe linux-wireless" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Thu, Jul 11, 2013 at 8:38 AM, Daniel Drake <dsd@laptop.org> wrote: > On Thu, Jul 11, 2013 at 8:17 AM, Amitkumar Karwar <akarwar@marvell.com> wrote: >> Hi Daniel, >> >> Thanks for your elaborative comments. We can go with your patch which fixes the issue in a cleaner way. I ran some tests and verified remove path with it. > > Hmm. Now that I heard back from you and Bing, that interrupts at this > point are totally unexpected, I would prefer to see the interrupt > handler being disabled at the appropriate time, I think we can do > better than my original patch. Let me see if I can find some time > today/tomorrow to explore a better approach. What about this one?
Hi Daniel, > Hmm. Now that I heard back from you and Bing, that interrupts at this > point are totally unexpected, I would prefer to see the interrupt > handler being disabled at the appropriate time, I think we can do > better than my original patch. Let me see if I can find some time > today/tomorrow to explore a better approach. >What about this one? The patch looks fine to me. Some comments 1) Unused HOST_INT_DISABLE macro can be removed now. 2) if_ops.disable_int() should be called before if_ops.unregister_dev() even in driver load failure code path. Otherwise we will miss to release sdio irq in this case. (You can consider applying attached changes on top of your patch for (1) and (2)) 3) Previously we used to have error handling for sdio_claim_irq(). Now we should check return status of if_ops.enable_int(). As I have couple of other patches to handle driver load failure paths correctly, I will take care of this separately. 4) I will create separate patch to avoid forward declaration of mwifiex_sdio_interrupt() by moving some code. Regards, Amitkumar Karwar
On Fri, Jul 12, 2013 at 7:14 AM, Amitkumar Karwar <akarwar@marvell.com> wrote: > Hi Daniel, > >> Hmm. Now that I heard back from you and Bing, that interrupts at this >> point are totally unexpected, I would prefer to see the interrupt >> handler being disabled at the appropriate time, I think we can do >> better than my original patch. Let me see if I can find some time >> today/tomorrow to explore a better approach. > >>What about this one? > > The patch looks fine to me. > Some comments > 1) Unused HOST_INT_DISABLE macro can be removed now. > 2) if_ops.disable_int() should be called before if_ops.unregister_dev() even in driver load failure code path. Otherwise we will miss to release sdio irq in this case. > (You can consider applying attached changes on top of your patch for (1) and (2)) > > 3) Previously we used to have error handling for sdio_claim_irq(). Now we should check return status of if_ops.enable_int(). As I have couple of other patches to handle driver load failure paths correctly, I will take care of this separately. > > 4) I will create separate patch to avoid forward declaration of mwifiex_sdio_interrupt() by moving some code. Thanks for looking at it, all your comments make sense. In your diff that you provided, I don't understand why you had to modify the mwifiex_add_card error path to disable the interrupt though. If there were any failures the interrupt handler will already have been disabled by your fixes to the mwifiex_fw_dpc() error handling path, right? Daniel -- To unsubscribe from this list: send the line "unsubscribe linux-wireless" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
diff --git a/drivers/net/wireless/mwifiex/init.c b/drivers/net/wireless/mwifiex/init.c index caaf4bd..0e656db 100644 --- a/drivers/net/wireless/mwifiex/init.c +++ b/drivers/net/wireless/mwifiex/init.c @@ -643,6 +643,11 @@ mwifiex_shutdown_drv(struct mwifiex_adapter *adapter) } } + /* Must be done before cleanup_if (in mwifiex_free_adapter) and can't + * be done in atomic context. */ + if (adapter->if_ops.disable_int) + adapter->if_ops.disable_int(adapter); + spin_lock(&adapter->mwifiex_lock); if (adapter->if_ops.data_complete) { diff --git a/drivers/net/wireless/mwifiex/main.h b/drivers/net/wireless/mwifiex/main.h index 3da73d3..5162e8c 100644 --- a/drivers/net/wireless/mwifiex/main.h +++ b/drivers/net/wireless/mwifiex/main.h @@ -601,6 +601,7 @@ struct mwifiex_if_ops { int (*register_dev) (struct mwifiex_adapter *); void (*unregister_dev) (struct mwifiex_adapter *); int (*enable_int) (struct mwifiex_adapter *); + int (*disable_int) (struct mwifiex_adapter *); int (*process_int_status) (struct mwifiex_adapter *); int (*host_to_card) (struct mwifiex_adapter *, u8, struct sk_buff *, struct mwifiex_tx_param *); diff --git a/drivers/net/wireless/mwifiex/sdio.c b/drivers/net/wireless/mwifiex/sdio.c index 5ee5ed0..25cfc30 100644 --- a/drivers/net/wireless/mwifiex/sdio.c +++ b/drivers/net/wireless/mwifiex/sdio.c @@ -948,7 +948,7 @@ static int mwifiex_check_fw_status(struct mwifiex_adapter *adapter, /* * This function reads the interrupt status from card. */ -static void mwifiex_interrupt_status(struct mwifiex_adapter *adapter) +static void mwifiex_process_interrupt(struct mwifiex_adapter *adapter) { struct sdio_mmc_card *card = adapter->card; u8 sdio_ireg; @@ -961,6 +961,9 @@ static void mwifiex_interrupt_status(struct mwifiex_adapter *adapter) return; } + if (adapter->surprise_removed) + return; + sdio_ireg = card->mp_regs[HOST_INTSTATUS_REG]; if (sdio_ireg) { /* @@ -975,6 +978,8 @@ static void mwifiex_interrupt_status(struct mwifiex_adapter *adapter) adapter->int_status |= sdio_ireg; spin_unlock_irqrestore(&adapter->int_lock, flags); } + + mwifiex_main_process(adapter); } /* @@ -997,14 +1002,10 @@ mwifiex_sdio_interrupt(struct sdio_func *func) } adapter = card->adapter; - if (adapter->surprise_removed) - return; - if (!adapter->pps_uapsd_mode && adapter->ps_state == PS_STATE_SLEEP) adapter->ps_state = PS_STATE_AWAKE; - mwifiex_interrupt_status(adapter); - mwifiex_main_process(adapter); + mwifiex_process_interrupt(adapter); } /* @@ -1957,6 +1958,7 @@ static struct mwifiex_if_ops sdio_ops = { .register_dev = mwifiex_register_dev, .unregister_dev = mwifiex_unregister_dev, .enable_int = mwifiex_sdio_enable_host_int, + .disable_int = mwifiex_sdio_disable_host_int, .process_int_status = mwifiex_process_int_status, .host_to_card = mwifiex_sdio_host_to_card, .wakeup = mwifiex_pm_wakeup_card,
When the card is being removed, mwifiex_remove_card() immediately sets surprise_removed to 1. This flag then causes the SDIO interrupt handler to ignore all interrupts without even acking them. If there are any async commands ongoing, it is very likely that interrupts will be received during this time. Since they are not acked (via the MP reg read in mwifiex_interrupt_status), it becomes an interrupt storm. This interrupt storm is undesirable and can cause problems for the bluetooth driver which also operates the 8787 SDIO card. Make the driver continue to ack interrupts during shutdown to avoid this. This is harder than it might sound. We must be careful not to act upon the interrupt, only ack it. Otherwise, we end up setting int_status to something. And hw_status is set to MWIFIEX_HW_STATUS_CLOSING for obvious reasons. We would hit the following infinite loop in mwifiex_main_process: process_start: do { if ((adapter->hw_status == MWIFIEX_HW_STATUS_CLOSING) || (adapter->hw_status == MWIFIEX_HW_STATUS_NOT_READY)) break; [...] } while (true); if ((adapter->int_status) || IS_CARD_RX_RCVD(adapter)) goto process_start; We must also observe that ACKing interrupts involves reading a load of data into the mp_regs buffer. The driver doesn't do much in the way of ensuring that interrupts are disabled before freeing buffers such as mp_regs, but we do need something here to make sure that we don't get any interrupts after mp_regs is freed. This whole thing feels rather fragile, but I couldn't see a clean way to do it, the driver seems a bit disorganised here. I would welcome a review from the designers. Signed-off-by: Daniel Drake <dsd@laptop.org> --- drivers/net/wireless/mwifiex/init.c | 5 +++++ drivers/net/wireless/mwifiex/main.h | 1 + drivers/net/wireless/mwifiex/sdio.c | 14 ++++++++------ 3 files changed, 14 insertions(+), 6 deletions(-)