Message ID | 20240717140009.2648885-1-prestwoj@gmail.com (mailing list archive) |
---|---|
State | New |
Headers | show |
Series | netdev: handle disconnect event during the 4-way handshake | expand |
Hi James, On 7/17/24 9:00 AM, James Prestwood wrote: > If a disconnect arrives during the 4-way handshake this would result > in netdev sending a disconnect event to station. If this is a > reassociation this case is unhandled in station and causes a hang as > it expects any connection failure to be handled via the reassociation > callback, not a random disconnect event. > > To handle this case the disconnect event can check if there is a > pending connect callback and call that instead of the event handler. > > Below are logs showing the "Unexpected disconnect event" which > prevents IWD from cleaning up its state and ultimately results in a > hang: > > Jul 16 18:16:13: src/station.c:station_transition_reassociate() > Jul 16 18:16:13: event: state, old: connected, new: roaming > Jul 16 18:16:13: src/wiphy.c:wiphy_radio_work_done() Work item 65 done > Jul 16 18:16:13: src/wiphy.c:wiphy_radio_work_next() Starting work item 66 > Jul 16 18:16:13: src/netdev.c:netdev_mlme_notify() MLME notification Del Station(20) > Jul 16 18:16:13: src/netdev.c:netdev_link_notify() event 16 on ifindex 6 > Jul 16 18:16:13: src/netdev.c:netdev_mlme_notify() MLME notification Deauthenticate(39) > Jul 16 18:16:13: src/netdev.c:netdev_deauthenticate_event() > Jul 16 18:16:13: src/netdev.c:netdev_mlme_notify() MLME notification New Station(19) > Jul 16 18:16:13: src/station.c:station_netdev_event() Associating > Jul 16 18:16:13: src/netdev.c:netdev_mlme_notify() MLME notification Authenticate(37) > Jul 16 18:16:13: src/netdev.c:netdev_authenticate_event() > Jul 16 18:16:13: src/netdev.c:netdev_mlme_notify() MLME notification Associate(38) > Jul 16 18:16:13: src/netdev.c:netdev_associate_event() > Jul 16 18:16:13: src/netdev.c:netdev_link_notify() event 16 on ifindex 6 > Jul 16 18:16:13: src/netdev.c:netdev_mlme_notify() MLME notification Connect(46) > Jul 16 18:16:13: src/netdev.c:netdev_connect_event() > Jul 16 18:16:13: src/netdev.c:netdev_connect_event() aborting and ignore_connect_event not set, proceed > Jul 16 18:16:13: src/netdev.c:netdev_connect_event() expect_connect_failure not set, proceed > Jul 16 18:16:13: src/netdev.c:parse_request_ies() > Jul 16 18:16:13: src/netdev.c:netdev_connect_event() Request / Response IEs parsed > Jul 16 18:16:13: src/netdev.c:netdev_get_oci() > Jul 16 18:16:13: src/netdev.c:netdev_link_notify() event 16 on ifindex 6 > Jul 16 18:16:13: src/netdev.c:netdev_link_notify() event 16 on ifindex 6 > Jul 16 18:16:13: src/netdev.c:netdev_link_notify() event 16 on ifindex 6 > Jul 16 18:16:13: src/netdev.c:netdev_get_oci_cb() Obtained OCI: freq: 5220, width: 3, center1: 5210, center2: 0 > Jul 16 18:16:13: src/eapol.c:eapol_start() > Jul 16 18:16:13: src/netdev.c:netdev_unicast_notify() Unicast notification Control Port Frame(129) > Jul 16 18:16:13: src/netdev.c:netdev_control_port_frame_event() > Jul 16 18:16:13: src/eapol.c:eapol_handle_ptk_1_of_4() ifindex=6 > Jul 16 18:16:13: src/netdev.c:netdev_mlme_notify() MLME notification Control Port TX Status(139) > Jul 16 18:16:14: src/netdev.c:netdev_mlme_notify() MLME notification Notify CQM(64) > Jul 16 18:16:14: src/netdev.c:netdev_cqm_event() Signal change event (above=1 signal=-60) > Jul 16 18:16:17: src/netdev.c:netdev_link_notify() event 16 on ifindex 6 > Jul 16 18:16:17: src/netdev.c:netdev_mlme_notify() MLME notification Del Station(20) > Jul 16 18:16:17: src/netdev.c:netdev_mlme_notify() MLME notification Deauthenticate(39) > Jul 16 18:16:17: src/netdev.c:netdev_deauthenticate_event() > Jul 16 18:16:17: src/netdev.c:netdev_mlme_notify() MLME notification Disconnect(48) > Jul 16 18:16:17: src/netdev.c:netdev_disconnect_event() > Jul 16 18:16:17: Received Deauthentication event, reason: 15, from_ap: true > Jul 16 18:16:17: src/wiphy.c:wiphy_radio_work_done() Work item 66 done > Jul 16 18:16:17: src/station.c:station_disconnect_event() 6 > Jul 16 18:16:17: Unexpected disconnect event > Jul 16 18:16:17: src/netdev.c:netdev_link_notify() event 16 on ifindex 6 > Jul 16 18:16:17: src/wiphy.c:wiphy_reg_notify() Notification of command Reg Change(36) > Jul 16 18:16:17: src/wiphy.c:wiphy_update_reg_domain() New reg domain country code for (global) is XX > --- > src/netdev.c | 15 ++++++++++++++- > 1 file changed, 14 insertions(+), 1 deletion(-) > <snip> > @@ -1310,9 +1312,20 @@ static void netdev_disconnect_event(struct l_genl_msg *msg, > > event_filter = netdev->event_filter; > event_data = netdev->user_data; > + connect_cb = netdev->connect_cb; > + user_data = netdev->user_data; > netdev_connect_free(netdev); > > - if (!event_filter) > + /* > + * If a deauth frame come in after the connect event during the 4-way > + * handshake we need to handle this via the connect callback since its > + * still effectively a connection failure. > + */ > + if (connect_cb) { > + connect_cb(netdev, NETDEV_RESULT_AUTHENTICATION_FAILED, > + &reason_code, user_data); > + return; > + } else if (!event_filter) > return; Can we use netdev_disconnected() instead? > > if (disconnect_by_ap) Regards, -Denis
Hi Alvin, On 7/17/24 12:35 PM, Alvin Šipraga wrote: > Hi James, > > On Wed, Jul 17, 2024 at 08:02:17AM GMT, James Prestwood wrote: >> + CC Alvin >> >> You had originally reported [1] a similar issue but it was unclear exactly >> what was happening at the time. I believe this is the same problem. This may >> not be the final version of a fix, but I wanted to CC you in case you still >> were seeing this and wanted to test it. >> >> [1] https://lore.kernel.org/iwd/20230403141927.235014-1-alvin@pqrs.dk/ > It's been a while since I dealt with this problem, and in the mean time > we have been using the original patch I sent to the list as seen in your > link. Unfortunately I don't recall how I reproduced this issue, but > maybe my colleague Pedro (+cc) remembers. > > Regardless, a static code analysis suggests that your patch yields the > same result, so I think this is good - one less patch for us to apply :) > > Do you have some easily communicable steps to reproduce this issue on > your end? I was able to write an autotest for it that took that code path. Its basically just as you thought from your previous thread. If a deauth frame comes in during the 4-way handshake or key settings during a reassociate roam it triggers that "Unexpected disconnect" path in station, causing IWD to hang up and not clean up its state. I just wanted to check in with you that its being fixed. Denis and I have been discussing it and my ML patch still isn't quite how he wants it, but we think we know how it should be handled. Thanks, James > > Thanks for following up on this! > > Kind regards, > Alvin
diff --git a/src/netdev.c b/src/netdev.c index 7a335894..c6be05b5 100644 --- a/src/netdev.c +++ b/src/netdev.c @@ -1270,6 +1270,8 @@ static void netdev_disconnect_event(struct l_genl_msg *msg, uint16_t reason_code = 0; bool disconnect_by_ap = false; netdev_event_func_t event_filter; + netdev_connect_cb_t connect_cb; + void *user_data; void *event_data; l_debug(""); @@ -1310,9 +1312,20 @@ static void netdev_disconnect_event(struct l_genl_msg *msg, event_filter = netdev->event_filter; event_data = netdev->user_data; + connect_cb = netdev->connect_cb; + user_data = netdev->user_data; netdev_connect_free(netdev); - if (!event_filter) + /* + * If a deauth frame come in after the connect event during the 4-way + * handshake we need to handle this via the connect callback since its + * still effectively a connection failure. + */ + if (connect_cb) { + connect_cb(netdev, NETDEV_RESULT_AUTHENTICATION_FAILED, + &reason_code, user_data); + return; + } else if (!event_filter) return; if (disconnect_by_ap)