diff mbox series

[2/2] Wait for mdmon when it is stared via systemd

Message ID 20240507033856.2195-3-kinga.stefaniuk@intel.com (mailing list archive)
State Accepted
Series New timeout while waiting for mdmon

Checks

Context Check Description
mdraidci/vmtest-md-6-10-PR fail merge-conflict
mdraidci/vmtest-md-6-10-VM_Test-0 success Logs for Lint
mdraidci/vmtest-md-6-10-VM_Test-1 success Logs for ShellCheck
mdraidci/vmtest-md-6-10-VM_Test-2 success Logs for Unittests
mdraidci/vmtest-md-6-10-VM_Test-4 success Logs for build-kernel
mdraidci/vmtest-md-6-10-VM_Test-3 success Logs for Validate matrix.py
mdraidci/vmtest-md-6-10-VM_Test-5 success Logs for set-matrix
mdraidci/vmtest-md-6-10-VM_Test-7 success Logs for x86_64-gcc / build-release
mdraidci/vmtest-md-6-10-VM_Test-6 success Logs for x86_64-gcc / build / build for x86_64 with gcc
mdraidci/vmtest-md-6-10-VM_Test-8 fail Logs for x86_64-gcc / test (test_maps, false, 360) / test_maps on x86_64 with gcc
mdraidci/vmtest-md-6-10-VM_Test-9 fail Logs for x86_64-gcc / test (test_progs, false, 360) / test_progs on x86_64 with gcc
mdraidci/vmtest-md-6-10-VM_Test-10 fail Logs for x86_64-gcc / test (test_progs_no_alu32, false, 360) / test_progs_no_alu32 on x86_64 with gcc
mdraidci/vmtest-md-6-10-VM_Test-12 success Logs for x86_64-gcc / test (test_progs_parallel, true, 30) / test_progs_parallel on x86_64 with gcc
mdraidci/vmtest-md-6-10-VM_Test-11 success Logs for x86_64-gcc / test (test_progs_no_alu32_parallel, true, 30) / test_progs_no_alu32_parallel on x86_64 with gcc
mdraidci/vmtest-md-6-10-VM_Test-13 fail Logs for x86_64-gcc / test (test_verifier, false, 360) / test_verifier on x86_64 with gcc
mdraidci/vmtest-md-6-10-VM_Test-17 fail Logs for x86_64-llvm-17 / test (test_maps, false, 360) / test_maps on x86_64 with llvm-17
mdraidci/vmtest-md-6-10-VM_Test-23 success Logs for x86_64-llvm-18 / build-release / build for x86_64 with llvm-18 and -O2 optimization
mdraidci/vmtest-md-6-10-VM_Test-14 fail Logs for x86_64-gcc / veristat / veristat on x86_64 with gcc
mdraidci/vmtest-md-6-10-VM_Test-20 fail Logs for x86_64-llvm-17 / test (test_verifier, false, 360) / test_verifier on x86_64 with llvm-17
mdraidci/vmtest-md-6-10-VM_Test-15 success Logs for x86_64-llvm-17 / build / build for x86_64 with llvm-17
mdraidci/vmtest-md-6-10-VM_Test-18 fail Logs for x86_64-llvm-17 / test (test_progs, false, 360) / test_progs on x86_64 with llvm-17
mdraidci/vmtest-md-6-10-VM_Test-16 success Logs for x86_64-llvm-17 / build-release / build for x86_64 with llvm-17 and -O2 optimization
mdraidci/vmtest-md-6-10-VM_Test-19 fail Logs for x86_64-llvm-17 / test (test_progs_no_alu32, false, 360) / test_progs_no_alu32 on x86_64 with llvm-17
mdraidci/vmtest-md-6-10-VM_Test-21 success Logs for x86_64-llvm-17 / veristat
mdraidci/vmtest-md-6-10-VM_Test-25 fail Logs for x86_64-llvm-18 / test (test_progs, false, 360) / test_progs on x86_64 with llvm-18
mdraidci/vmtest-md-6-10-VM_Test-29 success Logs for x86_64-llvm-18 / veristat
mdraidci/vmtest-md-6-10-VM_Test-27 fail Logs for x86_64-llvm-18 / test (test_progs_no_alu32, false, 360) / test_progs_no_alu32 on x86_64 with llvm-18
mdraidci/vmtest-md-6-10-VM_Test-24 fail Logs for x86_64-llvm-18 / test (test_maps, false, 360) / test_maps on x86_64 with llvm-18
mdraidci/vmtest-md-6-10-VM_Test-28 fail Logs for x86_64-llvm-18 / test (test_verifier, false, 360) / test_verifier on x86_64 with llvm-18
mdraidci/vmtest-md-6-10-VM_Test-26 fail Logs for x86_64-llvm-18 / test (test_progs_cpuv4, false, 360) / test_progs_cpuv4 on x86_64 with llvm-18
mdraidci/vmtest-md-6-10-VM_Test-22 success Logs for x86_64-llvm-18 / build / build for x86_64 with llvm-18

Commit Message

Kinga Stefaniuk May 7, 2024, 3:38 a.m. UTC
When mdmon is started, it may need a few seconds before it is running.
Until now, mdadm did not wait for it. Introduce the wait_for_mdmon()
function, which waits up to 5 seconds for mdmon to start completely.

Signed-off-by: Kinga Stefaniuk <kinga.stefaniuk@intel.com>
---
 Assemble.c |  4 ++--
 Grow.c     |  7 ++++---
 mdadm.h    |  2 ++
 util.c     | 29 +++++++++++++++++++++++++++++
 4 files changed, 37 insertions(+), 5 deletions(-)
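
The commit message describes a bounded wait: check once, then poll at a fixed interval until a deadline passes. A minimal, self-contained sketch of that pattern follows. It is not the patch's code: wait_until_ready(), ready_fn and countdown_ready() are hypothetical names, and the sketch uses CLOCK_MONOTONIC where the patch uses time(0), so that wall-clock adjustments cannot stretch or shorten the wait.

```c
#define _POSIX_C_SOURCE 200809L
#include <stdbool.h>
#include <time.h>

/* Hypothetical predicate type: returns true once the awaited service is up. */
typedef bool (*ready_fn)(void *ctx);

/* Seconds on the monotonic clock, unaffected by wall-clock changes. */
static double monotonic_now(void)
{
	struct timespec ts;

	clock_gettime(CLOCK_MONOTONIC, &ts);
	return (double)ts.tv_sec + (double)ts.tv_nsec / 1e9;
}

/*
 * Poll ready() every interval_ms until it succeeds or timeout_sec elapses.
 * Returns 0 on success and -1 on timeout.
 */
int wait_until_ready(ready_fn ready, void *ctx, double timeout_sec,
		     long interval_ms)
{
	const struct timespec pause = {
		.tv_sec = interval_ms / 1000,
		.tv_nsec = (interval_ms % 1000) * 1000000L,
	};
	double deadline = monotonic_now() + timeout_sec;

	/* Fast path: do not sleep if the service is already up. */
	if (ready(ctx))
		return 0;

	while (monotonic_now() < deadline) {
		nanosleep(&pause, NULL);
		if (ready(ctx))
			return 0;
	}
	return -1;
}

/* Example predicate for illustration: reports ready on the N-th call. */
bool countdown_ready(void *ctx)
{
	int *remaining = ctx;

	return --(*remaining) <= 0;
}
```

With a 5 second timeout and a 200 ms interval, as in the patch, a caller would pass something like wait_until_ready(pred, ctx, 5.0, 200), where pred wraps the actual liveness check.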

Comments

Paul Menzel May 14, 2024, 9:17 a.m. UTC | #1
Dear Kinga,


Thank you for the patch. There is a small typo in the summary: star*t*ed.

Am 07.05.24 um 05:38 schrieb Kinga Stefaniuk:
> When mdmon is being started it may need few seconds to start.
> For now, we didn't wait for it. Introduce wait_for_mdmon()
> function, which waits up to 5 seconds for mdmon to start completely.
> 
> Signed-off-by: Kinga Stefaniuk <kinga.stefaniuk@intel.com>
> ---
>   Assemble.c |  4 ++--
>   Grow.c     |  7 ++++---
>   mdadm.h    |  2 ++
>   util.c     | 29 +++++++++++++++++++++++++++++
>   4 files changed, 37 insertions(+), 5 deletions(-)
> 
> diff --git a/Assemble.c b/Assemble.c
> index f6c5b99e25e2..9cb1747df0a3 100644
> --- a/Assemble.c
> +++ b/Assemble.c
> @@ -2175,8 +2175,8 @@ int assemble_container_content(struct supertype *st, int mdfd,
>   			if (!mdmon_running(st->container_devnm))
>   				start_mdmon(st->container_devnm);
>   			ping_monitor(st->container_devnm);
> -			if (mdmon_running(st->container_devnm) &&
> -			    st->update_tail == NULL)
> +			if (wait_for_mdmon(st->container_devnm) == MDADM_STATUS_SUCCESS &&
> +			    !st->update_tail)
>   				st->update_tail = &st->updates;
>   		}
>   
> diff --git a/Grow.c b/Grow.c
> index 074f19956e17..0e44fae4891e 100644
> --- a/Grow.c
> +++ b/Grow.c
> @@ -2085,7 +2085,7 @@ int Grow_reshape(char *devname, int fd,
>   			if (!mdmon_running(st->container_devnm))
>   				start_mdmon(st->container_devnm);
>   			ping_monitor(container);
> -			if (mdmon_running(st->container_devnm) == false) {
> +			if (wait_for_mdmon(st->container_devnm) != MDADM_STATUS_SUCCESS) {
>   				pr_err("No mdmon found. Grow cannot continue.\n");
>   				goto release;
>   			}
> @@ -3176,7 +3176,8 @@ static int reshape_array(char *container, int fd, char *devname,
>   			if (!mdmon_running(container))
>   				start_mdmon(container);
>   			ping_monitor(container);
> -			if (mdmon_running(container) && st->update_tail == NULL)
> +			if (wait_for_mdmon(container) == MDADM_STATUS_SUCCESS &&
> +			    !st->update_tail)
>   				st->update_tail = &st->updates;
>   		}
>   	}
> @@ -5140,7 +5141,7 @@ int Grow_continue_command(char *devname, int fd,
>   			start_mdmon(container);
>   		ping_monitor(container);
>   
> -		if (mdmon_running(container) == false) {
> +		if (wait_for_mdmon(container) != MDADM_STATUS_SUCCESS) {
>   			pr_err("No mdmon found. Grow cannot continue.\n");
>   			ret_val = 1;
>   			goto Grow_continue_command_exit;
> diff --git a/mdadm.h b/mdadm.h
> index af4c484afdf7..9b8fb3f6f8d8 100644
> --- a/mdadm.h
> +++ b/mdadm.h
> @@ -1769,6 +1769,8 @@ extern struct superswitch *version_to_superswitch(char *vers);
>   
>   extern int mdmon_running(const char *devnm);
>   extern int mdmon_pid(const char *devnm);
> +extern mdadm_status_t wait_for_mdmon(const char *devnm);
> +
>   extern int check_env(char *name);
>   extern __u32 random32(void);
>   extern void random_uuid(__u8 *buf);
> diff --git a/util.c b/util.c
> index 65056a19e2cd..df12cf2bb2b1 100644
> --- a/util.c
> +++ b/util.c
> @@ -1921,6 +1921,35 @@ int mdmon_running(const char *devnm)
>   	return 0;
>   }
>   
> +/*
> + * wait_for_mdmon() - Waits for mdmon within specified time.
> + * @devnm: Device for which mdmon should start.
> + *
> + * Function waits for mdmon to start. It may need few seconds
> + * to start, we set timeout to 5, it should be sufficient.
> + * Do not wait if mdmon has been started.
> + *
> + * Return: MDADM_STATUS_SUCCESS if mdmon is running, error code otherwise.
> + */
> +mdadm_status_t wait_for_mdmon(const char *devnm)
> +{
> +	const time_t mdmon_timeout = 5;
> +	time_t start_time = time(0);
> +
> +	if (mdmon_running(devnm))
> +		return MDADM_STATUS_SUCCESS;
> +
> +	pr_info("Waiting for mdmon to start\n");
> +	while (time(0) - start_time < mdmon_timeout) {
> +		sleep_for(0, MSEC_TO_NSEC(200), true);
> +		if (mdmon_running(devnm))
> +			return MDADM_STATUS_SUCCESS;
> +	};
> +
> +	pr_err("Timeout waiting for mdmon\n");

Please print the timeout limit.
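
One way the request above could be met is to put the limit into the message itself. The sketch below is an illustration only: format_mdmon_timeout_msg() and MDMON_TIMEOUT_SEC are hypothetical names, and the wording of the message is an assumption. Inside mdadm the same format string would presumably be passed to pr_err(), with the time_t value cast to a fixed integer type first.

```c
#include <stdio.h>

/* Mirrors the value of mdmon_timeout in the patch. */
#define MDMON_TIMEOUT_SEC 5L

/*
 * Write the timeout message, including the limit, into buf.
 * Returns the snprintf() result: the message length, excluding the NUL.
 */
int format_mdmon_timeout_msg(char *buf, size_t len, long timeout_sec)
{
	return snprintf(buf, len,
			"Timeout waiting for mdmon after %ld seconds\n",
			timeout_sec);
}
```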

> +	return MDADM_STATUS_ERROR;
> +}
> +
>   int start_mdmon(char *devnm)
>   {
>   	int i;

Doesn’t systemd have some interface sd_ on how to notify about a 
successful start?
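
For reference, the interface in question is sd_notify(3): a service reports readiness by sending the datagram "READY=1" to the AF_UNIX socket named in the NOTIFY_SOCKET environment variable. The sketch below shows that protocol without linking libsystemd. notify_service_manager() is a hypothetical helper name, and nothing in this thread says mdmon does this.

```c
#define _POSIX_C_SOURCE 200809L
#include <stddef.h>
#include <stdlib.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/un.h>
#include <unistd.h>

/*
 * Send a state string such as "READY=1" to the socket named by
 * $NOTIFY_SOCKET. Returns 1 if sent, 0 if no socket is configured,
 * -1 on error.
 */
int notify_service_manager(const char *state)
{
	const char *path = getenv("NOTIFY_SOCKET");
	struct sockaddr_un addr;
	socklen_t addr_len;
	size_t path_len;
	ssize_t sent;
	int fd;

	if (!path || !*path)
		return 0;

	path_len = strlen(path);
	if ((path[0] != '/' && path[0] != '@') ||
	    path_len >= sizeof(addr.sun_path))
		return -1;

	memset(&addr, 0, sizeof(addr));
	addr.sun_family = AF_UNIX;
	memcpy(addr.sun_path, path, path_len);
	/* A leading '@' denotes a Linux abstract socket: replace it with NUL. */
	if (path[0] == '@')
		addr.sun_path[0] = '\0';
	addr_len = offsetof(struct sockaddr_un, sun_path) + path_len;

	fd = socket(AF_UNIX, SOCK_DGRAM, 0);
	if (fd < 0)
		return -1;

	sent = sendto(fd, state, strlen(state), 0,
		      (struct sockaddr *)&addr, addr_len);
	close(fd);
	return sent == (ssize_t)strlen(state) ? 1 : -1;
}
```

A daemon started from a unit with Type=notify would call notify_service_manager("READY=1") once its initialization is complete. Outside systemd the variable is unset and the call is a no-op.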


Kind regards,

Paul
Kinga Stefaniuk May 14, 2024, 10:56 a.m. UTC | #2
On Tue, 14 May 2024 11:17:16 +0200
Paul Menzel <pmenzel@molgen.mpg.de> wrote:

> Dear Kinga,
> 
> 
> Thank you for the patch. There is a small typo in the summary:
> star*t*ed.
> 
> Am 07.05.24 um 05:38 schrieb Kinga Stefaniuk:
> > When mdmon is being started it may need few seconds to start.
> > For now, we didn't wait for it. Introduce wait_for_mdmon()
> > function, which waits up to 5 seconds for mdmon to start completely.
> > 
> > Signed-off-by: Kinga Stefaniuk <kinga.stefaniuk@intel.com>
> > ---
> >   Assemble.c |  4 ++--
> >   Grow.c     |  7 ++++---
> >   mdadm.h    |  2 ++
> >   util.c     | 29 +++++++++++++++++++++++++++++
> >   4 files changed, 37 insertions(+), 5 deletions(-)
> > 
> > diff --git a/Assemble.c b/Assemble.c
> > index f6c5b99e25e2..9cb1747df0a3 100644
> > --- a/Assemble.c
> > +++ b/Assemble.c
> > @@ -2175,8 +2175,8 @@ int assemble_container_content(struct supertype *st, int mdfd,
> >   			if (!mdmon_running(st->container_devnm))
> >   				start_mdmon(st->container_devnm);
> >   			ping_monitor(st->container_devnm);
> > -			if (mdmon_running(st->container_devnm) &&
> > -			    st->update_tail == NULL)
> > +			if (wait_for_mdmon(st->container_devnm) == MDADM_STATUS_SUCCESS &&
> > +			    !st->update_tail)
> >   				st->update_tail = &st->updates;
> >   		}
> >   
> > diff --git a/Grow.c b/Grow.c
> > index 074f19956e17..0e44fae4891e 100644
> > --- a/Grow.c
> > +++ b/Grow.c
> > @@ -2085,7 +2085,7 @@ int Grow_reshape(char *devname, int fd,
> >   			if (!mdmon_running(st->container_devnm))
> >   				start_mdmon(st->container_devnm);
> >   			ping_monitor(container);
> > -			if (mdmon_running(st->container_devnm) == false) {
> > +			if (wait_for_mdmon(st->container_devnm) != MDADM_STATUS_SUCCESS) {
> >   				pr_err("No mdmon found. Grow cannot continue.\n");
> >   				goto release;
> >   			}
> > @@ -3176,7 +3176,8 @@ static int reshape_array(char *container, int fd, char *devname,
> >   			if (!mdmon_running(container))
> >   				start_mdmon(container);
> >   			ping_monitor(container);
> > -			if (mdmon_running(container) && st->update_tail == NULL)
> > +			if (wait_for_mdmon(container) == MDADM_STATUS_SUCCESS &&
> > +			    !st->update_tail)
> >   				st->update_tail = &st->updates;
> >   		}
> >   	}
> > @@ -5140,7 +5141,7 @@ int Grow_continue_command(char *devname, int fd,
> >   			start_mdmon(container);
> >   		ping_monitor(container);
> >   
> > -		if (mdmon_running(container) == false) {
> > +		if (wait_for_mdmon(container) != MDADM_STATUS_SUCCESS) {
> >   			pr_err("No mdmon found. Grow cannot continue.\n");
> >   			ret_val = 1;
> >   			goto Grow_continue_command_exit;
> > diff --git a/mdadm.h b/mdadm.h
> > index af4c484afdf7..9b8fb3f6f8d8 100644
> > --- a/mdadm.h
> > +++ b/mdadm.h
> > @@ -1769,6 +1769,8 @@ extern struct superswitch *version_to_superswitch(char *vers);
> >   
> >   extern int mdmon_running(const char *devnm);
> >   extern int mdmon_pid(const char *devnm);
> > +extern mdadm_status_t wait_for_mdmon(const char *devnm);
> > +
> >   extern int check_env(char *name);
> >   extern __u32 random32(void);
> >   extern void random_uuid(__u8 *buf);
> > diff --git a/util.c b/util.c
> > index 65056a19e2cd..df12cf2bb2b1 100644
> > --- a/util.c
> > +++ b/util.c
> > @@ -1921,6 +1921,35 @@ int mdmon_running(const char *devnm)
> >   	return 0;
> >   }
> >   
> > +/*
> > + * wait_for_mdmon() - Waits for mdmon within specified time.
> > + * @devnm: Device for which mdmon should start.
> > + *
> > + * Function waits for mdmon to start. It may need few seconds
> > + * to start, we set timeout to 5, it should be sufficient.
> > + * Do not wait if mdmon has been started.
> > + *
> > + * Return: MDADM_STATUS_SUCCESS if mdmon is running, error code otherwise.
> > + */
> > +mdadm_status_t wait_for_mdmon(const char *devnm)
> > +{
> > +	const time_t mdmon_timeout = 5;
> > +	time_t start_time = time(0);
> > +
> > +	if (mdmon_running(devnm))
> > +		return MDADM_STATUS_SUCCESS;
> > +
> > +	pr_info("Waiting for mdmon to start\n");
> > +	while (time(0) - start_time < mdmon_timeout) {
> > +		sleep_for(0, MSEC_TO_NSEC(200), true);
> > +		if (mdmon_running(devnm))
> > +			return MDADM_STATUS_SUCCESS;
> > +	};
> > +
> > +	pr_err("Timeout waiting for mdmon\n");  
> 
> Please print the timeout limit.
> 
> > +	return MDADM_STATUS_ERROR;
> > +}
> > +
> >   int start_mdmon(char *devnm)
> >   {
> >   	int i;  
> 
> Doesn’t systemd have some interface sd_ on how to notify about a 
> successful start?
> 
> 
> Kind regards,
> 
> Paul
> 

Hi Paul,

mdadm has its own mechanism to verify if mdmon is running and using it
we are not limited only to systemd, so it's better to use this way.

Thanks,
Kinga
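
The reply does not spell out how mdadm's own check works. As an illustration of a check that does not depend on systemd, the sketch below assumes the daemon records its pid in a file and tests that pid with signal 0. read_pid_file() and process_is_alive() are hypothetical names for this example, not mdadm functions.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <signal.h>
#include <stdio.h>
#include <sys/types.h>

/* Read a pid from a pid file; returns -1 if the file is missing or malformed. */
pid_t read_pid_file(const char *path)
{
	FILE *f = fopen(path, "r");
	long pid = -1;

	if (!f)
		return -1;
	if (fscanf(f, "%ld", &pid) != 1 || pid <= 0)
		pid = -1;
	fclose(f);
	return (pid_t)pid;
}

/*
 * Signal 0 performs the existence and permission checks without
 * delivering a signal. EPERM means the process exists but belongs
 * to another user, so it still counts as alive.
 */
int process_is_alive(pid_t pid)
{
	if (pid <= 0)
		return 0;
	return kill(pid, 0) == 0 || errno == EPERM;
}
```

A check of this kind only shows that a process with the recorded pid exists. It says nothing about whether that process has finished initializing, which is one reason a caller may still want to poll with a timeout.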

Patch

diff --git a/Assemble.c b/Assemble.c
index f6c5b99e25e2..9cb1747df0a3 100644
--- a/Assemble.c
+++ b/Assemble.c
@@ -2175,8 +2175,8 @@  int assemble_container_content(struct supertype *st, int mdfd,
 			if (!mdmon_running(st->container_devnm))
 				start_mdmon(st->container_devnm);
 			ping_monitor(st->container_devnm);
-			if (mdmon_running(st->container_devnm) &&
-			    st->update_tail == NULL)
+			if (wait_for_mdmon(st->container_devnm) == MDADM_STATUS_SUCCESS &&
+			    !st->update_tail)
 				st->update_tail = &st->updates;
 		}
 
diff --git a/Grow.c b/Grow.c
index 074f19956e17..0e44fae4891e 100644
--- a/Grow.c
+++ b/Grow.c
@@ -2085,7 +2085,7 @@  int Grow_reshape(char *devname, int fd,
 			if (!mdmon_running(st->container_devnm))
 				start_mdmon(st->container_devnm);
 			ping_monitor(container);
-			if (mdmon_running(st->container_devnm) == false) {
+			if (wait_for_mdmon(st->container_devnm) != MDADM_STATUS_SUCCESS) {
 				pr_err("No mdmon found. Grow cannot continue.\n");
 				goto release;
 			}
@@ -3176,7 +3176,8 @@  static int reshape_array(char *container, int fd, char *devname,
 			if (!mdmon_running(container))
 				start_mdmon(container);
 			ping_monitor(container);
-			if (mdmon_running(container) && st->update_tail == NULL)
+			if (wait_for_mdmon(container) == MDADM_STATUS_SUCCESS &&
+			    !st->update_tail)
 				st->update_tail = &st->updates;
 		}
 	}
@@ -5140,7 +5141,7 @@  int Grow_continue_command(char *devname, int fd,
 			start_mdmon(container);
 		ping_monitor(container);
 
-		if (mdmon_running(container) == false) {
+		if (wait_for_mdmon(container) != MDADM_STATUS_SUCCESS) {
 			pr_err("No mdmon found. Grow cannot continue.\n");
 			ret_val = 1;
 			goto Grow_continue_command_exit;
diff --git a/mdadm.h b/mdadm.h
index af4c484afdf7..9b8fb3f6f8d8 100644
--- a/mdadm.h
+++ b/mdadm.h
@@ -1769,6 +1769,8 @@  extern struct superswitch *version_to_superswitch(char *vers);
 
 extern int mdmon_running(const char *devnm);
 extern int mdmon_pid(const char *devnm);
+extern mdadm_status_t wait_for_mdmon(const char *devnm);
+
 extern int check_env(char *name);
 extern __u32 random32(void);
 extern void random_uuid(__u8 *buf);
diff --git a/util.c b/util.c
index 65056a19e2cd..df12cf2bb2b1 100644
--- a/util.c
+++ b/util.c
@@ -1921,6 +1921,35 @@  int mdmon_running(const char *devnm)
 	return 0;
 }
 
+/*
+ * wait_for_mdmon() - Waits for mdmon within specified time.
+ * @devnm: Device for which mdmon should start.
+ *
+ * Function waits for mdmon to start. It may need few seconds
+ * to start, we set timeout to 5, it should be sufficient.
+ * Do not wait if mdmon has been started.
+ *
+ * Return: MDADM_STATUS_SUCCESS if mdmon is running, error code otherwise.
+ */
+mdadm_status_t wait_for_mdmon(const char *devnm)
+{
+	const time_t mdmon_timeout = 5;
+	time_t start_time = time(0);
+
+	if (mdmon_running(devnm))
+		return MDADM_STATUS_SUCCESS;
+
+	pr_info("Waiting for mdmon to start\n");
+	while (time(0) - start_time < mdmon_timeout) {
+		sleep_for(0, MSEC_TO_NSEC(200), true);
+		if (mdmon_running(devnm))
+			return MDADM_STATUS_SUCCESS;
+	}
+
+	pr_err("Timeout waiting for mdmon\n");
+	return MDADM_STATUS_ERROR;
+}
+
 int start_mdmon(char *devnm)
 {
 	int i;