mac80211: Clean up work-queues on disassociation.

Message ID 20130225102020.GA1735@redhat.com (mailing list archive)
State Not Applicable, archived

Commit Message

Stanislaw Gruszka Feb. 25, 2013, 10:20 a.m. UTC
On Tue, Feb 19, 2013 at 06:11:07PM -0800, greearb@candelatech.com wrote:
> From: Ben Greear <greearb@candelatech.com>
> 
> The monitor_work and beacon_connection_loss_work items were
> not being canceled on disassociation (and not on deletion
> either).  This leads to work-items trying to run after memory
> has been deleted.
> 
> I could not find a cleaner way to do this because the
> cancel_work_sync for these items must be done outside
> of the ifmgd->mtx.
> 
> In addition, re-order the quiesce code so that timers are
> always stopped before work-items are flushed.  This was
> not the problem I saw, but I think it may still be more
> correct.

I think this patch is quite complicated and a simpler solution
can be used. We stop timers on disassociation, and since
we nullify ifmgd->associated under ifmgd->mtx, the work procedures
will perform no action. The only thing we need to care about is
work still queued in the workqueue internals: we must stop it before
we remove the interface, so that the workqueue code does not use
our freed ifmgd->*_work data.

Regarding quiesce, I'm working on suspend/resume changes that make
this function unnecessary, so I'll remove it.

Below is an alternative fix proposal for the problem.
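
The reasoning above (clear ifmgd->associated under ifmgd->mtx so that a
late work callback does nothing) can be sketched in user space. This is
only a model under stated assumptions, not the mac80211 code: struct
mgd_model, model_init(), model_work_fn() and model_disassoc() are
hypothetical names, and a pthread mutex stands in for ifmgd->mtx.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical user-space stand-in for struct ieee80211_if_managed. */
struct mgd_model {
	pthread_mutex_t mtx;    /* models ifmgd->mtx */
	const void *associated; /* models ifmgd->associated, NULL if unassociated */
	int actions;            /* counts how often a work body really acted */
};

/* Set up the model in the "associated" state. */
static void model_init(struct mgd_model *m, const void *bss)
{
	assert(m != NULL);
	pthread_mutex_init(&m->mtx, NULL);
	m->associated = bss;
	m->actions = 0;
}

/* Models a work callback: it acts only while associated. */
static bool model_work_fn(struct mgd_model *m)
{
	bool acted = false;

	pthread_mutex_lock(&m->mtx);
	if (m->associated) {
		m->actions++;
		acted = true;
	}
	pthread_mutex_unlock(&m->mtx);
	return acted;
}

/* Models disassociation: clear the pointer under the same lock. */
static void model_disassoc(struct mgd_model *m)
{
	pthread_mutex_lock(&m->mtx);
	m->associated = NULL;
	pthread_mutex_unlock(&m->mtx);
}
```

Calling model_work_fn() after model_disassoc() returns false and leaves
the counter untouched. Note that this only holds while the structure
itself is still allocated, which is why pending work must also be
stopped before the interface is removed.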

Comments

Ben Greear Feb. 25, 2013, 4:55 p.m. UTC | #1
On 02/25/2013 02:20 AM, Stanislaw Gruszka wrote:
> On Tue, Feb 19, 2013 at 06:11:07PM -0800, greearb@candelatech.com wrote:
>> From: Ben Greear <greearb@candelatech.com>
>>
>> The monitor_work and beacon_connection_loss_work items were
>> not being canceled on disassociation (and not on deletion
>> either).  This leads to work-items trying to run after memory
>> has been deleted.
>>
>> I could not find a cleaner way to do this because the
>> cancel_work_sync for these items must be done outside
>> of the ifmgd->mtx.
>>
>> In addition, re-order the quiesce code so that timers are
>> always stopped before work-items are flushed.  This was
>> not the problem I saw, but I think it may still be more
>> correct.
>
> I think this patch is quite complicated and simpler solution
> can be used. We stop timers on disassociate, and since

I think my second patch was closer to what you have...

> +	/*
> +	 * We canceled timers during disassoc, but works still can be pending.
> +	 * Even if we they do not perform action when unassociated, we should
> +	 * assure we stop them, before freeing resources.
> +	 */

The comment is a bit misleading....as I saw in my testing, it could actually
crash the system because the entire station could be deleted by the time
the work-item tries to complete.

Thanks,
Ben
Stanislaw Gruszka Feb. 26, 2013, 3:55 p.m. UTC | #2
On Mon, Feb 25, 2013 at 08:55:53AM -0800, Ben Greear wrote:
> On 02/25/2013 02:20 AM, Stanislaw Gruszka wrote:
> >On Tue, Feb 19, 2013 at 06:11:07PM -0800, greearb@candelatech.com wrote:
> >>From: Ben Greear <greearb@candelatech.com>
> >>
> >>The monitor_work and beacon_connection_loss_work items were
> >>not being canceled on disassociation (and not on deletion
> >>either).  This leads to work-items trying to run after memory
> >>has been deleted.
> >>
> >>I could not find a cleaner way to do this because the
> >>cancel_work_sync for these items must be done outside
> >>of the ifmgd->mtx.
> >>
> >>In addition, re-order the quiesce code so that timers are
> >>always stopped before work-items are flushed.  This was
> >>not the problem I saw, but I think it may still be more
> >>correct.
> >
> >I think this patch is quite complicated and simpler solution
> >can be used. We stop timers on disassociate, and since
> 
> I think my second patch was closer to what you have...

Yes, I see, I missed your second patch.

> >+	/*
> >+	 * We canceled timers during disassoc, but works still can be pending.
> >+	 * Even if we they do not perform action when unassociated, we should
> >+	 * assure we stop them, before freeing resources.
> >+	 */
> 
> The comment is a bit misleading....as I saw in my testing, it could actually
> crash the system because the entire station could be deleted by the time
> the work-item tries to complete.

Not sure I understand. Unless I missed something, all work callbacks
check the ifmgd->associated pointer and quit immediately if it is NULL. So as
long as we do not free sdata, it is fine to have them pending after
disassociation. However, we should use ifmgd->mtx to protect ifmgd->associated
and ifmgd->bssid in ieee80211_beacon_connection_loss_work().

Stanislaw
Ben Greear Feb. 26, 2013, 4:51 p.m. UTC | #3
On 02/26/2013 07:55 AM, Stanislaw Gruszka wrote:
> On Mon, Feb 25, 2013 at 08:55:53AM -0800, Ben Greear wrote:
>> On 02/25/2013 02:20 AM, Stanislaw Gruszka wrote:
>>> On Tue, Feb 19, 2013 at 06:11:07PM -0800, greearb@candelatech.com wrote:
>>>> From: Ben Greear <greearb@candelatech.com>
>>>>
>>>> The monitor_work and beacon_connection_loss_work items were
>>>> not being canceled on disassociation (and not on deletion
>>>> either).  This leads to work-items trying to run after memory
>>>> has been deleted.
>>>>
>>>> I could not find a cleaner way to do this because the
>>>> cancel_work_sync for these items must be done outside
>>>> of the ifmgd->mtx.
>>>>
>>>> In addition, re-order the quiesce code so that timers are
>>>> always stopped before work-items are flushed.  This was
>>>> not the problem I saw, but I think it may still be more
>>>> correct.
>>>
>>> I think this patch is quite complicated and simpler solution
>>> can be used. We stop timers on disassociate, and since
>>
>> I think my second patch was closer to what you have...
>
> Yes, I see, I missed your second patch.
>
>>> +	/*
>>> +	 * We canceled timers during disassoc, but works still can be pending.
>>> +	 * Even if we they do not perform action when unassociated, we should
>>> +	 * assure we stop them, before freeing resources.
>>> +	 */
>>
>> The comment is a bit misleading....as I saw in my testing, it could actually
>> crash the system because the entire station could be deleted by the time
>> the work-item tries to complete.
>
> Not sure if I understand. If I did not missed something, all works callbacks
> check ifmgd->associated pointer and quit instantly if it is null. So as long
> we do not free sdata, is fine to have them pending after disassociate.

Right, but in the old code, there is nothing to keep you from
freeing sdata with work still pending.  Thus the crash I saw.


> However we should use ifmgd->mtx to protect ifmgd->associated and
> ifmgd->bssid in ieee80211_beacon_connection_loss_work()

Maybe so...I haven't looked at that code in any detail.

Thanks,
Ben

>
> Stanislaw
>
Patch

diff --git a/net/mac80211/mlme.c b/net/mac80211/mlme.c
index bdddb0b..15a24214 100644
--- a/net/mac80211/mlme.c
+++ b/net/mac80211/mlme.c
@@ -3992,6 +3992,17 @@  void ieee80211_mgd_stop(struct ieee80211_sub_if_data *sdata)
 {
 	struct ieee80211_if_managed *ifmgd = &sdata->u.mgd;
 
+	/*
+	 * We canceled timers during disassoc, but works can still be pending.
+	 * Even if they perform no action when unassociated, we must make
+	 * sure they are stopped before freeing resources.
+	 */
+	cancel_work_sync(&ifmgd->request_smps_work);
+	cancel_work_sync(&ifmgd->monitor_work);
+	cancel_work_sync(&ifmgd->beacon_connection_loss_work);
+	cancel_work_sync(&ifmgd->csa_connection_drop_work);
+	cancel_work_sync(&ifmgd->chswitch_work);
+
 	mutex_lock(&ifmgd->mtx);
 	if (ifmgd->assoc_data)
 		ieee80211_destroy_assoc_data(sdata, false);
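
As a rough user-space illustration of the ordering this patch aims for
(stop pending work first, then release the memory it lives in), consider
the sketch below. All model_* names are hypothetical and the "queue" is
single-threaded. model_cancel_work() only imitates the property that the
work is no longer pending on return; unlike the real cancel_work_sync(),
it does not wait for a callback that is already running.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical single-threaded model of a work item; not the kernel API. */
struct model_work {
	bool pending;
	void (*fn)(struct model_work *w);
};

/* Models an interface that embeds its work item, as sdata embeds ifmgd. */
struct model_iface {
	struct model_work monitor_work; /* must stay the first member */
	int runs;
};

static void model_monitor_fn(struct model_work *w)
{
	/* The work item is the first member, so this cast recovers the owner. */
	struct model_iface *ifc = (struct model_iface *)w;

	ifc->runs++;
}

static struct model_iface *model_iface_alloc(void)
{
	struct model_iface *ifc = calloc(1, sizeof(*ifc));

	if (ifc)
		ifc->monitor_work.fn = model_monitor_fn;
	return ifc;
}

static void model_queue_work(struct model_work *w)
{
	w->pending = true;
}

/* Runs the work if it is pending; returns true if it ran. */
static bool model_run_pending(struct model_work *w)
{
	if (!w->pending)
		return false;
	w->pending = false;
	w->fn(w);
	return true;
}

/* Stand-in for cancel_work_sync(): returns true if work was pending. */
static bool model_cancel_work(struct model_work *w)
{
	bool was_pending = w->pending;

	w->pending = false;
	return was_pending;
}

/* Teardown order: cancel first, and only then free the owner. */
static void model_iface_stop_and_free(struct model_iface *ifc)
{
	model_cancel_work(&ifc->monitor_work);
	assert(!ifc->monitor_work.pending);
	free(ifc);
}
```

If the free() came before the cancel, a later model_run_pending() would
dereference freed memory, which is the kind of crash described earlier
in the thread.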