diff mbox series

station: check for roam timeout before rearming

Message ID 20240829112758.35494-1-prestwoj@gmail.com (mailing list archive)
State New
Headers show
Series station: check for roam timeout before rearming | expand

Checks

Context Check Description
tedd_an/pre-ci_am success Success
prestwoj/iwd-alpine-ci-fetch success Fetch PR
prestwoj/iwd-ci-gitlint success GitLint
prestwoj/iwd-ci-fetch success Fetch PR
prestwoj/iwd-alpine-ci-setupell success Prep - Setup ELL
prestwoj/iwd-ci-setupell success Prep - Setup ELL
prestwoj/iwd-ci-makedistcheck fail Make Distcheck Make FAIL: In file included from ./ell/ell.h:1, from ../../src/station.c:35: ../../src/station.c: In function ‘station_roam_timeout_rearm’: ../../src/station.c:3048:24: warning: implicit declaration of function ‘l_timeout_remaining’; did you mean ‘l_timeout_remove’? [-Wimplicit-function-declaration] 3048 | if (L_WARN_ON(!l_timeout_remaining(station->roam_trigger_timeout, | ^~~~~~~~~~~~~~~~~~~ ../../ell/util.h:38:29: note: in definition of macro ‘L_WARN_ON’ 38 | bool r = !!(condition); \ | ^~~~~~~~~ /usr/bin/ld: src/station.o: in function `station_roam_timeout_rearm': station.c:(.text+0x93f): undefined reference to `l_timeout_remaining' collect2: error: ld returned 1 exit status make[2]: *** [Makefile:2186: src/iwd] Error 1 make[2]: *** Waiting for unfinished jobs.... make[1]: *** [Makefile:1745: all] Error 2 make: *** [Makefile:3369: distcheck] Error 1
prestwoj/iwd-ci-incremental_build success Incremental build not run PASS
prestwoj/iwd-ci-build success Build - Configure
prestwoj/iwd-alpine-ci-makedistcheck fail Make Distcheck Make FAIL: In file included from ./ell/ell.h:1, from ../../src/station.c:35: ../../src/station.c: In function 'station_roam_timeout_rearm': ../../src/station.c:3048:24: warning: implicit declaration of function 'l_timeout_remaining'; did you mean 'l_timeout_remove'? [-Wimplicit-function-declaration] 3048 | if (L_WARN_ON(!l_timeout_remaining(station->roam_trigger_timeout, | ^~~~~~~~~~~~~~~~~~~ ../../ell/util.h:38:29: note: in definition of macro 'L_WARN_ON' 38 | bool r = !!(condition); \ | ^~~~~~~~~ /usr/lib/gcc/x86_64-alpine-linux-musl/11.2.1/../../../../x86_64-alpine-linux-musl/bin/ld: src/station.o: in function `station_roam_timeout_rearm': station.c:(.text+0xa7f): undefined reference to `l_timeout_remaining' collect2: error: ld returned 1 exit status make[2]: *** [Makefile:2187: src/iwd] Error 1 make[2]: *** Waiting for unfinished jobs.... make[1]: *** [Makefile:1746: all] Error 2 make: *** [Makefile:3370: distcheck] Error 1
prestwoj/iwd-alpine-ci-incremental_build success Incremental build not run PASS
prestwoj/iwd-ci-testrunner pending testrunner SKIP
prestwoj/iwd-ci-makecheck pending makecheck SKIP
prestwoj/iwd-ci-makecheckvalgrind fail Make FAIL: In file included from ./ell/ell.h:1, from src/station.c:35: src/station.c: In function ‘station_roam_timeout_rearm’: src/station.c:3048:24: error: implicit declaration of function ‘l_timeout_remaining’; did you mean ‘l_timeout_remove’? [-Werror=implicit-function-declaration] 3048 | if (L_WARN_ON(!l_timeout_remaining(station->roam_trigger_timeout, | ^~~~~~~~~~~~~~~~~~~ ./ell/util.h:38:29: note: in definition of macro ‘L_WARN_ON’ 38 | bool r = !!(condition); \ | ^~~~~~~~~ cc1: all warnings being treated as errors make[1]: *** [Makefile:2585: src/station.o] Error 1 make[1]: *** Waiting for unfinished jobs.... make: *** [Makefile:1745: all] Error 2
prestwoj/iwd-ci-clang fail Clang IWD - make FAIL: src/station.c:3048:17: error: implicit declaration of function 'l_timeout_remaining' is invalid in C99 [-Werror,-Wimplicit-function-declaration] if (L_WARN_ON(!l_timeout_remaining(station->roam_trigger_timeout, ^ src/station.c:3048:17: note: did you mean 'l_timeout_remove'? ./ell/timeout.h:33:6: note: 'l_timeout_remove' declared here void l_timeout_remove(struct l_timeout *timeout); ^ 1 error generated. make[1]: *** [Makefile:2585: src/station.o] Error 1 make[1]: *** Waiting for unfinished jobs.... make: *** [Makefile:1745: all] Error 2
prestwoj/iwd-alpine-ci-build success Build - Configure
prestwoj/iwd-alpine-ci-makecheckvalgrind fail Make FAIL: In file included from ./ell/ell.h:1, from src/station.c:35: src/station.c: In function 'station_roam_timeout_rearm': src/station.c:3048:24: error: implicit declaration of function 'l_timeout_remaining'; did you mean 'l_timeout_remove'? [-Werror=implicit-function-declaration] 3048 | if (L_WARN_ON(!l_timeout_remaining(station->roam_trigger_timeout, | ^~~~~~~~~~~~~~~~~~~ ./ell/util.h:38:29: note: in definition of macro 'L_WARN_ON' 38 | bool r = !!(condition); \ | ^~~~~~~~~ cc1: all warnings being treated as errors make[1]: *** [Makefile:2586: src/station.o] Error 1 make[1]: *** Waiting for unfinished jobs.... make: *** [Makefile:1746: all] Error 2
prestwoj/iwd-alpine-ci-makecheck pending makecheck SKIP

Commit Message

James Prestwood Aug. 29, 2024, 11:27 a.m. UTC
A user reported a crash which was due to the roam trigger timeout
being overwritten, followed by a disconnect. Post-disconnect the
timer would fire and result in a crash. Its not clear exactly where
the overwrite was happening but upon code inspection it could
happen in the following scenario:

1. Beacon loss event, start roam timeout
2. Signal low event, no check if timeout is running and the timeout
   gets overwritten.

The reported crash actually didn't appear to be from the above
scenario but something else, so this logic is being hardened and
improved

Now if a roam timeout already exists and trying to be rearmed IWD
will check the time remaining on the current timer and either keep
the active timer or reschedule it to the lesser of the two values
(current or new rearm time). This will avoid cases such as a long
roam timer being active (e.g. 60 seconds) followed by a beacon or
packet loss event which should trigger a more agressive roam
schedule.
---
 src/station.c | 48 +++++++++++++++++++++++++++++-------------------
 1 file changed, 29 insertions(+), 19 deletions(-)

Comments

Denis Kenzior Sept. 3, 2024, 3:18 p.m. UTC | #1
Hi James,

On 8/29/24 6:27 AM, James Prestwood wrote:
> A user reported a crash which was due to the roam trigger timeout
> being overwritten, followed by a disconnect. Post-disconnect the
> timer would fire and result in a crash. Its not clear exactly where
> the overwrite was happening but upon code inspection it could
> happen in the following scenario:
> 
> 1. Beacon loss event, start roam timeout
> 2. Signal low event, no check if timeout is running and the timeout
>     gets overwritten.
> 
> The reported crash actually didn't appear to be from the above
> scenario but something else, so this logic is being hardened and
> improved
> 
> Now if a roam timeout already exists and trying to be rearmed IWD
> will check the time remaining on the current timer and either keep
> the active timer or reschedule it to the lesser of the two values
> (current or new rearm time). This will avoid cases such as a long
> roam timer being active (e.g. 60 seconds) followed by a beacon or
> packet loss event which should trigger a more agressive roam
> schedule.
> ---
>   src/station.c | 48 +++++++++++++++++++++++++++++-------------------
>   1 file changed, 29 insertions(+), 19 deletions(-)
> 

Applied, thanks.

Regards,
-Denis
diff mbox series

Patch

diff --git a/src/station.c b/src/station.c
index 30a1232a..8b29df77 100644
--- a/src/station.c
+++ b/src/station.c
@@ -102,7 +102,6 @@  struct station {
 	struct l_queue *owe_hidden_scan_ids;
 
 	/* Roaming related members */
-	struct timespec roam_min_time;
 	struct l_timeout *roam_trigger_timeout;
 	uint32_t roam_scan_id;
 	uint8_t preauth_bssid[6];
@@ -1820,7 +1819,6 @@  static void station_roam_state_clear(struct station *station)
 	station->preparing_roam = false;
 	station->roam_scan_full = false;
 	station->signal_low = false;
-	station->roam_min_time.tv_sec = 0;
 	station->netconfig_after_roam = false;
 	station->last_roam_scan = 0;
 
@@ -3041,20 +3039,33 @@  static void station_roam_trigger_cb(struct l_timeout *timeout, void *user_data)
 
 static void station_roam_timeout_rearm(struct station *station, int seconds)
 {
-	struct timespec now, min_timeout;
+	uint64_t remaining;
 
-	clock_gettime(CLOCK_MONOTONIC, &now);
+	if (!station->roam_trigger_timeout)
+		goto new_timeout;
 
-	min_timeout = now;
-	min_timeout.tv_sec += seconds;
+	/* If we cant get the remaining time just create a new timer */
+	if (L_WARN_ON(!l_timeout_remaining(station->roam_trigger_timeout,
+						&remaining))) {
+		l_timeout_remove(station->roam_trigger_timeout);
+		goto new_timeout;
+	}
+
+	/* Our current timeout is less than the rearm, keep current */
+	if (l_time_before(remaining, seconds * L_USEC_PER_SEC)) {
+		l_debug("Keeping current roam timeout of %lu seconds",
+				l_time_to_secs(remaining));
+		return;
+	}
+
+	l_debug("Rescheduling roam timeout from %lu to %u seconds",
+			l_time_to_secs(remaining), seconds);
+	l_timeout_modify(station->roam_trigger_timeout, seconds);
 
-	if (station->roam_min_time.tv_sec < min_timeout.tv_sec ||
-			(station->roam_min_time.tv_sec == min_timeout.tv_sec &&
-			 station->roam_min_time.tv_nsec < min_timeout.tv_nsec))
-		station->roam_min_time = min_timeout;
+	return;
 
-	seconds = station->roam_min_time.tv_sec - now.tv_sec +
-		(station->roam_min_time.tv_nsec > now.tv_nsec ? 1 : 0);
+new_timeout:
+	l_debug("Arming new roam timer for %u seconds", seconds);
 
 	station->roam_trigger_timeout =
 		l_timeout_create(seconds, station_roam_trigger_cb,
@@ -3212,7 +3223,6 @@  static void station_ok_rssi(struct station *station)
 	station->roam_trigger_timeout = NULL;
 
 	station->signal_low = false;
-	station->roam_min_time.tv_sec = 0;
 }
 
 static void station_event_roamed(struct station *station, struct scan_bss *new)
@@ -3558,14 +3568,17 @@  static void station_packets_lost(struct station *station, uint32_t num_pkts)
 		l_debug("Too many roam attempts in %u second timeframe, "
 			"delaying roam", LOSS_ROAM_RATE_LIMIT);
 
-		if (station->roam_trigger_timeout)
-			return;
-
 		station_roam_timeout_rearm(station, LOSS_ROAM_RATE_LIMIT);
 
 		return;
 	}
 
+	if (station->roam_trigger_timeout) {
+		l_debug("canceling roam timer to roam immediately");
+		l_timeout_remove(station->roam_trigger_timeout);
+		station->roam_trigger_timeout = NULL;
+	}
+
 	station_start_roam(station);
 }
 
@@ -3578,9 +3591,6 @@  static void station_beacon_lost(struct station *station)
 
 	station_debug_event(station, "beacon-loss-roam");
 
-	if (station->roam_trigger_timeout)
-		return;
-
 	station_roam_timeout_rearm(station, LOSS_ROAM_RATE_LIMIT);
 }