diff mbox series

[v2,1/5] Remove hardcoded checkpoint interval checking

Message ID 20240118103019.12385-2-mateusz.kusiak@intel.com (mailing list archive)
State Awaiting Upstream
Delegated to: Mariusz Tkaczyk
Headers show
Series Fix checkpointing - invasive | expand

Commit Message

Mateusz Kusiak Jan. 18, 2024, 10:30 a.m. UTC
Mdmon assumes that kernel marks checkpoint every 1/16 of the volume size
and that the checkpoints are equal in size. This is not true, kernel may
mark checkpoints more frequently depending on several factors, including
sync speed. This results in checkpoints reported by mdadm --examine
falling behind the one reported by kernel.

Remove hardcoded checkpoint interval checking.

Signed-off-by: Mateusz Kusiak <mateusz.kusiak@intel.com>
---
 monitor.c | 22 ++++++----------------
 1 file changed, 6 insertions(+), 16 deletions(-)
diff mbox series

Patch

diff --git a/monitor.c b/monitor.c
index 4acec6783e6e..b8d9e8810b17 100644
--- a/monitor.c
+++ b/monitor.c
@@ -564,22 +564,10 @@  static int read_and_act(struct active_array *a, fd_set *fds)
 		}
 	}
 
-	/* Check for recovery checkpoint notifications.  We need to be a
-	 * minimum distance away from the last checkpoint to prevent
-	 * over checkpointing.  Note reshape checkpointing is handled
-	 * in the second branch.
+	/* Handle reshape checkpointing
 	 */
-	if (sync_completed > a->last_checkpoint &&
-	    sync_completed - a->last_checkpoint > a->info.component_size >> 4 &&
-	    a->curr_action > reshape) {
-		/* A (non-reshape) sync_action has reached a checkpoint.
-		 * Record the updated position in the metadata
-		 */
-		a->last_checkpoint = sync_completed;
-		a->container->ss->set_array_state(a, a->curr_state <= clean);
-	} else if ((a->curr_action == idle && a->prev_action == reshape) ||
-		   (a->curr_action == reshape &&
-		    sync_completed > a->last_checkpoint)) {
+	if ((a->curr_action == idle && a->prev_action == reshape) ||
+	    (a->curr_action == reshape && sync_completed > a->last_checkpoint)) {
 		/* Reshape has progressed or completed so we need to
 		 * update the array state - and possibly the array size
 		 */
@@ -607,8 +595,10 @@  static int read_and_act(struct active_array *a, fd_set *fds)
 		a->last_checkpoint = sync_completed;
 	}
 
-	if (sync_completed > a->last_checkpoint)
+	if (sync_completed > a->last_checkpoint) {
 		a->last_checkpoint = sync_completed;
+		a->container->ss->set_array_state(a, a->curr_state <= clean);
+	}
 
 	if (sync_completed >= a->info.component_size)
 		a->last_checkpoint = 0;