driver_nl80211 broken again

Message ID 1252116503.2398.26.camel@maxim-laptop (mailing list archive)
State Not Applicable, archived
Commit Message

Maxim Levitsky Sept. 5, 2009, 2:08 a.m. UTC
On Mon, 2009-08-24 at 22:58 +0200, Johannes Berg wrote:
> On Mon, 2009-08-24 at 23:06 +0300, Maxim Levitsky wrote:
> 
> > This is typical output of iwconfig, after failure
> > (and I know that this output means trouble):
> 
> Hmm, thanks for the info and especially the log. Unfortunately, I can't
> reproduce this at all.
> 
> Can you run wpa_supplicant with timing info (add -t to the command line)
> and at the same time run "iw event -t" please?
> 
> johannes

I have finally got to the bottom of this, and it doesn't look good.
There are two bugs that overlap:


1 - when connecting again to the access point (same or another), 
wpa_supplicant does the following:

deassoc
auth
assoc

So it assumes that the deassoc command disconnects completely, but that is
no longer true.
Yet when I tried to make its disassociate function do both, it failed.
I used the following patch:




With this ugly hack, everything works just fine. 
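
To make the first bug concrete, here is a minimal sketch of the link state
described above, assuming a simplified three-state model. The enum and the
model_* function names are hypothetical and are not taken from mac80211 or
wpa_supplicant; the rule that a new authentication needs the idle state is
an assumption of the model, not a statement about the real code.

```c
#include <stdbool.h>

/* Hypothetical model of the station's link state; not the mac80211 types. */
enum link_state { LINK_IDLE, LINK_AUTHENTICATED, LINK_ASSOCIATED };

/* Disassociation drops the association but keeps the authentication. */
enum link_state model_disassoc(enum link_state s)
{
	return s == LINK_ASSOCIATED ? LINK_AUTHENTICATED : s;
}

/* Deauthentication tears down everything, whatever the previous state. */
enum link_state model_deauth(enum link_state s)
{
	(void)s;
	return LINK_IDLE;
}

/* Model assumption: a new auth/assoc cycle needs a fully idle link. */
bool model_can_start_auth(enum link_state s)
{
	return s == LINK_IDLE;
}
```

In this model, model_disassoc(LINK_ASSOCIATED) yields LINK_AUTHENTICATED, so
model_can_start_auth() stays false until model_deauth() is applied as well.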
-----------------------------------------------------------------------------------------------
2 - Independently of the above, ieee80211_set_disassoc
doesn't work right if deauth == false.

In that case, a work item is added to the station's work list and is
never removed:

	} else {
		struct ieee80211_mgd_work *wk = ifmgd->old_associate_work;

		wk->state = IEEE80211_MGD_STATE_IDLE;
		list_add(&wk->list, &ifmgd->work_list);
	}


ieee80211_sta_work just ignores IEEE80211_MGD_STATE_IDLE, so the
work item remains forever.

This breaks scanning, since __ieee80211_start_scan refuses to run
until ifmgd->work_list is empty.
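
The effect can be modelled in a few lines. This is only a sketch under stated
assumptions: the types and the model_* functions are hypothetical stand-ins,
not the mac80211 structures, and "work completed" is simulated by simply
unlinking the item.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the mac80211 types; only the shape matters. */
enum work_state { WORK_IDLE, WORK_AUTH, WORK_ASSOC };

struct work_item {
	enum work_state state;
	struct work_item *next;
};

struct mgd_model {
	struct work_item *work_list;	/* pending management work */
};

/* Walk the list: active items are unlinked, idle ones are skipped. */
void model_sta_work(struct mgd_model *m)
{
	struct work_item **p = &m->work_list;

	while (*p) {
		if ((*p)->state == WORK_IDLE) {
			p = &(*p)->next;	/* ignored, stays queued */
			continue;
		}
		*p = (*p)->next;		/* pretend the work completed */
	}
}

/* The scan may only run once the work list is empty. */
bool model_scan_can_run(const struct mgd_model *m)
{
	return m->work_list == NULL;
}
```

With one idle item queued, model_sta_work() never empties the list, so
model_scan_can_run() keeps returning false no matter how often the work
function runs.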



Best regards,
	Maxim Levitsky

--
To unsubscribe from this list: send the line "unsubscribe linux-wireless" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Comments

Johannes Berg Sept. 5, 2009, 1:07 p.m. UTC | #1
Hi Maxim,

Thanks for the analysis! I won't have time to look this weekend, and I'm
not sure I will early next week, and certainly not until the week after
then, but I'll leave your mail marked unread and will look later.

johannes
Johannes Berg Sept. 8, 2009, 3:29 p.m. UTC | #2
On Sat, 2009-09-05 at 05:08 +0300, Maxim Levitsky wrote:

> 1 - when connecting again to the access point (same or another), 
> wpa_supplicant does the following:
> 
> deassoc
> auth
> assoc
> 
> So it assumes that the deassoc command disconnects completely, but that is
> no longer true.
> Yet when I tried to make its disassociate function do both, it failed.
> I used the following patch:
> 
> 
> diff --git a/wpa_supplicant/wpa_supplicant.c b/wpa_supplicant/wpa_supplicant.c
> index c68dd82..50afeeb 100644
> --- a/wpa_supplicant/wpa_supplicant.c
> +++ b/wpa_supplicant/wpa_supplicant.c
> @@ -1278,8 +1278,10 @@ void wpa_supplicant_disassociate(struct wpa_supplicant *wpa_s,
>         if (!is_zero_ether_addr(wpa_s->bssid)) {
>                 if (wpa_s->drv_flags & WPA_DRIVER_FLAGS_USER_SPACE_MLME)
>                         ieee80211_sta_disassociate(wpa_s, reason_code);
> -               else
> +               else {
>                         wpa_drv_disassociate(wpa_s, wpa_s->bssid, reason_code);
> +                       wpa_drv_deauthenticate(wpa_s, wpa_s->bssid, reason_code);
> +               }
>                 addr = wpa_s->bssid;
>         }
>         wpa_clear_keys(wpa_s, addr);

Right, this is a known problem. I still think it should be handled in
wpa_s, but I'm not sure whether that patch should have worked or not.

> EAPOL: startWhen --> 0
> EAPOL: disable timer tick
> wpa_driver_nl80211_disassociate
> wpa_driver_nl80211_deauthenticate
> nl80211: MLME command failed: ret=-67 (Link has been severed)

Ok so it was called, but got -ENOLINK? That's rather odd. But I suspect
that it had already internally cleared the BSSID, so that it was asking
to deauth from 00:...:00 -- could you check that?
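
One way to check this would be a small debug helper called right before the
deauthenticate request. A minimal sketch, assuming the BSSID is available as
six raw octets; bssid_is_zero and bssid_format are hypothetical helper names
and are not part of wpa_supplicant or the kernel.

```c
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

#define BSSID_LEN 6

/* Returns true if all six octets of the address are zero. */
bool bssid_is_zero(const uint8_t addr[BSSID_LEN])
{
	static const uint8_t zero[BSSID_LEN];

	return memcmp(addr, zero, BSSID_LEN) == 0;
}

/* Formats addr as "xx:xx:xx:xx:xx:xx"; buf needs at least 18 bytes. */
void bssid_format(const uint8_t addr[BSSID_LEN], char *buf, size_t len)
{
	snprintf(buf, len, "%02x:%02x:%02x:%02x:%02x:%02x",
		 addr[0], addr[1], addr[2], addr[3], addr[4], addr[5]);
}
```

Logging the formatted address together with the bssid_is_zero() result at
the call site would show whether the request really targets 00:...:00.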

> 2 - Independently of the above, ieee80211_set_disassoc
> doesn't work right if deauth == false.
> 
> In that case, a work item is added to the station's work list and is
> never removed:
> 
> 	} else {
> 		struct ieee80211_mgd_work *wk = ifmgd->old_associate_work;
> 
> 		wk->state = IEEE80211_MGD_STATE_IDLE;
> 		list_add(&wk->list, &ifmgd->work_list);
> 	}
> 
> 
> ieee80211_sta_work just ignores IEEE80211_MGD_STATE_IDLE, so the
> work item remains forever.
> 
> This breaks scanning, since __ieee80211_start_scan refuses to run
> until ifmgd->work_list is empty.

That's intentional, that work item represents the authentication state
we still have -- the required cleanup should be done by cfg80211 or
wpa_supplicant.

Can you try to figure out what the parameters are that
wpa_drv_deauthenticate() is sending to the kernel, and why it's getting
-ENOLINK?

johannes
Maxim Levitsky Sept. 8, 2009, 8:54 p.m. UTC | #3
On Tue, 2009-09-08 at 17:29 +0200, Johannes Berg wrote: 
> On Sat, 2009-09-05 at 05:08 +0300, Maxim Levitsky wrote:
> 
> > 1 - when connecting again to the access point (same or another), 
> > wpa_supplicant does the following:
> > 
> > deassoc
> > auth
> > assoc
> > 
> > So it assumes that the deassoc command disconnects completely, but that is
> > no longer true.
> > Yet when I tried to make its disassociate function do both, it failed.
> > I used the following patch:
> > 
> > 
> > diff --git a/wpa_supplicant/wpa_supplicant.c b/wpa_supplicant/wpa_supplicant.c
> > index c68dd82..50afeeb 100644
> > --- a/wpa_supplicant/wpa_supplicant.c
> > +++ b/wpa_supplicant/wpa_supplicant.c
> > @@ -1278,8 +1278,10 @@ void wpa_supplicant_disassociate(struct wpa_supplicant *wpa_s,
> >         if (!is_zero_ether_addr(wpa_s->bssid)) {
> >                 if (wpa_s->drv_flags & WPA_DRIVER_FLAGS_USER_SPACE_MLME)
> >                         ieee80211_sta_disassociate(wpa_s, reason_code);
> > -               else
> > +               else {
> >                         wpa_drv_disassociate(wpa_s, wpa_s->bssid, reason_code);
> > +                       wpa_drv_deauthenticate(wpa_s, wpa_s->bssid, reason_code);
> > +               }
> >                 addr = wpa_s->bssid;
> >         }
> >         wpa_clear_keys(wpa_s, addr);
> 
> Right, this is a known problem. I still think it should be handled in
> wpa_s, but I'm not sure whether that patch should have worked or not.
> 
> > EAPOL: startWhen --> 0
> > EAPOL: disable timer tick
> > wpa_driver_nl80211_disassociate
> > wpa_driver_nl80211_deauthenticate
> > nl80211: MLME command failed: ret=-67 (Link has been severed)
> 
> Ok so it was called, but got -ENOLINK? That's rather odd. But I suspect
> that it had already internally cleared the BSSID, so that it was asking
> to deauth from 00:...:00 -- could you check that?

I'll figure that out, but note that wpa_drv_disassociate and
wpa_drv_deauthenticate are direct wrappers over the nl80211 calls.


> 
> > 2 - Independently of the above, ieee80211_set_disassoc
> > doesn't work right if deauth == false.
> > 
> > In that case, a work item is added to the station's work list and is
> > never removed:
> > 
> > 	} else {
> > 		struct ieee80211_mgd_work *wk = ifmgd->old_associate_work;
> > 
> > 		wk->state = IEEE80211_MGD_STATE_IDLE;
> > 		list_add(&wk->list, &ifmgd->work_list);
> > 	}
> > 
> > 
> > ieee80211_sta_work just ignores IEEE80211_MGD_STATE_IDLE, so the
> > work item remains forever.
> > 
> > This breaks scanning, since __ieee80211_start_scan refuses to run
> > until ifmgd->work_list is empty.
> 
> That's intentional, that work item represents the authentication state
> we still have -- the required cleanup should be done by cfg80211 or
> wpa_supplicant.

But isn't that too much?
It means that wpa_supplicant can lock up the device.



> 
> Can you try to figure out what the parameters are that
> wpa_drv_deauthenticate() is sending to the kernel, and why it's getting
> -ENOLINK?

Sure!
Will do very soon.


Best regards,
Maxim Levitsky


Patch

diff --git a/wpa_supplicant/wpa_supplicant.c b/wpa_supplicant/wpa_supplicant.c
index c68dd82..50afeeb 100644
--- a/wpa_supplicant/wpa_supplicant.c
+++ b/wpa_supplicant/wpa_supplicant.c
@@ -1278,8 +1278,10 @@  void wpa_supplicant_disassociate(struct wpa_supplicant *wpa_s,
        if (!is_zero_ether_addr(wpa_s->bssid)) {
                if (wpa_s->drv_flags & WPA_DRIVER_FLAGS_USER_SPACE_MLME)
                        ieee80211_sta_disassociate(wpa_s, reason_code);
-               else
+               else {
                        wpa_drv_disassociate(wpa_s, wpa_s->bssid, reason_code);
+                       wpa_drv_deauthenticate(wpa_s, wpa_s->bssid, reason_code);
+               }
                addr = wpa_s->bssid;
        }
        wpa_clear_keys(wpa_s, addr);


I got this.


EAPOL: startWhen --> 0
EAPOL: disable timer tick
wpa_driver_nl80211_disassociate
wpa_driver_nl80211_deauthenticate
nl80211: MLME command failed: ret=-67 (Link has been severed)
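
The ret=-67 in the log above is a negated errno value; on Linux, 67 is
ENOLINK. A small sketch for decoding such return codes, where ret_to_text is
a hypothetical helper name and not something wpa_supplicant provides:

```c
#include <errno.h>
#include <string.h>

/* Turn a negative kernel-style return code into its errno message text. */
const char *ret_to_text(int ret)
{
	return strerror(ret < 0 ? -ret : ret);
}
```

Calling ret_to_text(-ENOLINK) returns the C library's message for ENOLINK,
which is the text wpa_supplicant printed in parentheses.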



However, this "hack" did the trick:

diff --git a/net/mac80211/mlme.c b/net/mac80211/mlme.c
index 97a278a..60c4355 100644
--- a/net/mac80211/mlme.c
+++ b/net/mac80211/mlme.c
@@ -2561,7 +2561,7 @@  int ieee80211_mgd_disassoc(struct ieee80211_sub_if_data *sdata,
                return -ENOLINK;
        }
 
-       ieee80211_set_disassoc(sdata, false);
+       ieee80211_set_disassoc(sdata, true);
 
        mutex_unlock(&ifmgd->mtx);
 
diff --git a/net/wireless/mlme.c b/net/wireless/mlme.c
index 79d2eec..fec34a7 100644
--- a/net/wireless/mlme.c
+++ b/net/wireless/mlme.c
@@ -222,7 +222,7 @@  static void __cfg80211_send_disassoc(struct net_device *dev,
                for (i = 0; i < MAX_AUTH_BSSES; i++) {
                        if (wdev->authtry_bsses[i] || wdev->auth_bsses[i])
                                continue;
-                       wdev->auth_bsses[i] = wdev->current_bss;
+                       /*wdev->auth_bsses[i] = wdev->current_bss;*/
                        wdev->current_bss = NULL;
                        done = true;
                        cfg80211_sme_disassoc(dev, i);