station: fix hang/crash with FT and roam scans

Message ID 20230113192557.777947-1-prestwoj@gmail.com (mailing list archive)
State New
Series station: fix hang/crash with FT and roam scans

Checks

Context                                   Check    Description
tedd_an/pre-ci_am                         success  Success
prestwoj/iwd-alpine-ci-fetch              success  Fetch PR
prestwoj/iwd-ci-gitlint                   success  GitLint
prestwoj/iwd-ci-fetch                     success  Fetch PR
prestwoj/iwd-alpine-ci-makedistcheck      success  Make Distcheck
prestwoj/iwd-alpine-ci-incremental_build  success  Incremental build not run PASS
prestwoj/iwd-ci-makedistcheck             success  Make Distcheck
prestwoj/iwd-ci-incremental_build         success  Incremental build not run PASS
prestwoj/iwd-alpine-ci-build              success  Build - Configure
prestwoj/iwd-ci-build                     success  Build - Configure
prestwoj/iwd-alpine-ci-makecheck          success  Make Check
prestwoj/iwd-alpine-ci-makecheckvalgrind  success  Make Check w/Valgrind
prestwoj/iwd-ci-clang                     success  clang PASS
prestwoj/iwd-ci-makecheckvalgrind         success  Make Check w/Valgrind
prestwoj/iwd-ci-makecheck                 success  Make Check
prestwoj/iwd-ci-testrunner                success  test-runner PASS

Commit Message

James Prestwood Jan. 13, 2023, 7:25 p.m. UTC
A user reported hangs, and occasionally crashes, when roaming. This is
due to FT being started without canceling or preventing another roam
scan. While FT is in progress, the roam scan results come in with no
candidates, which triggers a disconnect mid-roam. As a result, station
either never transitions out of the roaming state or, in some cases,
crashes.

Fix this in a few ways. First, cancel the roam rearm timer when the
FT work starts. This prevents any roam scans from occurring until
after the FT work finishes. Second, don't end the FT work once
association starts; keep the work item open until after the roam.
---
 src/station.c | 17 +++++++++++++++--
 1 file changed, 15 insertions(+), 2 deletions(-)
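
The work-item lifetime described in the commit message can be sketched
in isolation. This is a minimal mock, not iwd code: struct mock_station,
work_insert(), work_done(), start_ft(), run_work() and the other helpers
are hypothetical stand-ins for the station state and the wiphy radio
work queue. It only assumes what the patch shows, namely that a work
callback returning false keeps its item open and that a non-zero id
means the item is still outstanding.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for a radio work item; id != 0 means outstanding. */
struct work_item {
	unsigned int id;
};

/* Hypothetical, heavily reduced station state. */
struct mock_station {
	struct work_item ft_work;
	bool roam_timer_armed;
	bool roaming;
};

static unsigned int next_id = 1;

static void work_insert(struct work_item *item)
{
	item->id = next_id++;
}

static void work_done(struct work_item *item)
{
	item->id = 0;
}

/* Starting FT: cancel the roam rearm timer, then queue the FT work. */
static void start_ft(struct mock_station *s)
{
	s->roam_timer_armed = false;
	work_insert(&s->ft_work);
}

/* Work callback: returning false keeps the item open after it runs. */
static bool ft_work_ready(struct mock_station *s)
{
	s->roaming = true;
	return false;
}

/* The mock queue only finishes an item when its callback returns true. */
static void run_work(struct mock_station *s)
{
	if (ft_work_ready(s))
		work_done(&s->ft_work);
}

/* Roam scans are refused while roaming or while FT work is outstanding. */
static bool cannot_roam(const struct mock_station *s)
{
	return s->roaming || s->ft_work.id != 0;
}

/* Roam finished (success or failure): release the work item here. */
static void roam_finished(struct mock_station *s)
{
	if (s->ft_work.id)
		work_done(&s->ft_work);

	s->roaming = false;
}

/* Walks through the sequence; returns 0 if every step behaves as expected. */
static int ft_work_demo(void)
{
	struct mock_station s = { .roam_timer_armed = true };

	start_ft(&s);
	assert(!s.roam_timer_armed);
	assert(cannot_roam(&s));	/* blocked before the state changes */

	run_work(&s);
	assert(s.ft_work.id != 0);	/* item kept open by returning false */
	assert(cannot_roam(&s));

	roam_finished(&s);
	assert(!cannot_roam(&s));	/* roam scans allowed again */

	return 0;
}
```

In this sketch, a roam scan requested at any point between start_ft()
and roam_finished() is refused, which is the window the patch closes.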
Patch

diff --git a/src/station.c b/src/station.c
index ad5ad724..cb0fd57c 100644
--- a/src/station.c
+++ b/src/station.c
@@ -1999,6 +1999,9 @@  static void station_roamed(struct station *station)
 
 	l_queue_clear(station->roam_bss_list, l_free);
 
+	if (station->ft_work.id)
+		wiphy_radio_work_done(station->wiphy, station->ft_work.id);
+
 	station_enter_state(station, STATION_STATE_CONNECTED);
 }
 
@@ -2023,6 +2026,9 @@  static void station_roam_failed(struct station *station)
 
 	l_queue_clear(station->roam_bss_list, l_free);
 
+	if (station->ft_work.id)
+		wiphy_radio_work_done(station->wiphy, station->ft_work.id);
+
 	/*
 	 * If we attempted a reassociation or a fast transition, and ended up
 	 * here then we are now disconnected.
@@ -2263,7 +2269,8 @@  try_next:
 	station->preparing_roam = false;
 	station_enter_state(station, STATION_STATE_FT_ROAMING);
 
-	return true;
+	/* Keep the work item until after FT finishes */
+	return false;
 
 assoc_failed:
 	station_roam_failed(station);
@@ -2292,6 +2299,11 @@  static bool station_fast_transition(struct station *station,
 	vendor_ies = network_info_get_extra_ies(info, bss, &iov_elems);
 	handshake_state_set_vendor_ies(hs, vendor_ies, iov_elems);
 
+	if (station->roam_trigger_timeout) {
+		l_timeout_remove(station->roam_trigger_timeout);
+		station->roam_trigger_timeout = NULL;
+	}
+
 	/* Both ft_action/ft_authenticate will gate the associate work item */
 	if ((hs->mde[4] & 1))
 		ft_action(netdev_get_ifindex(station->netdev),
@@ -2747,7 +2759,8 @@  static bool station_cannot_roam(struct station *station)
 
 	return disabled || station->preparing_roam ||
 				station->state == STATION_STATE_ROAMING ||
-				station->state == STATION_STATE_FT_ROAMING;
+				station->state == STATION_STATE_FT_ROAMING ||
+				station->ft_work.id;
 }
 
 #define WNM_REQUEST_MODE_PREFERRED_CANDIDATE_LIST	(1 << 0)