diff mbox series

[V2] KVM: selftests: Take large C-state exit latency into consideration

Message ID 20240413053749.74313-1-zide.chen@intel.com (mailing list archive)
State New
Headers show
Series [V2] KVM: selftests: Take large C-state exit latency into consideration | expand

Commit Message

Zide Chen April 13, 2024, 5:37 a.m. UTC
Currently, the migration worker delays 1-10 us, assuming that one
KVM_RUN iteration only takes a few microseconds.  But if C-state exit
latencies are large enough, for example, hundreds or even thousands
of microseconds on server CPUs, it may happen that it's not able to
bring the target CPU out of C-state before the migration worker starts
to migrate it to the next CPU.

If the system workload is light, most CPUs could be at a certain level
of C-state, which may result in less successful migrations and fail the
migration/KVM_RUN ratio sanity check.

This patch adds a command line option to skip the sanity check in
this case.

Additionally, seems it's reasonable to randomize the length of usleep(),
other than delay in a fixed pattern.

V2:
- removed the busy loop implementation
- add the new "-s" option

Signed-off-by: Zide Chen <zide.chen@intel.com>
---
 tools/testing/selftests/kvm/rseq_test.c | 37 +++++++++++++++++++++++--
 1 file changed, 34 insertions(+), 3 deletions(-)

Comments

Sean Christopherson April 15, 2024, 11:23 p.m. UTC | #1
On Fri, Apr 12, 2024, Zide Chen wrote:
> Currently, the migration worker delays 1-10 us, assuming that one
> KVM_RUN iteration only takes a few microseconds.  But if C-state exit
> latencies are large enough, for example, hundreds or even thousands
> of microseconds on server CPUs, it may happen that it's not able to
> bring the target CPU out of C-state before the migration worker starts
> to migrate it to the next CPU.
> 
> If the system workload is light, most CPUs could be at a certain level
> of C-state, which may result in less successful migrations and fail the
> migration/KVM_RUN ratio sanity check.
> 
> This patch adds a command line option to skip the sanity check in
> this case.
> 
> Additionally, seems it's reasonable to randomize the length of usleep(),
> other than delay in a fixed pattern.

This belongs in a separate patch.  And while it's reasonable on the surface, I
doubt think it buys us anything, and it makes an already non-deterministic test
even less deterministic.  In other words, unless a random sleep time helps find
more bugs or finds the original bug faster, just drop the randomization.

> V2:
> - removed the busy loop implementation
> - add the new "-s" option

This belongs in the ignored part of the patch...
> 
> Signed-off-by: Zide Chen <zide.chen@intel.com>

...down here.

> ---
>  tools/testing/selftests/kvm/rseq_test.c | 37 +++++++++++++++++++++++--
>  1 file changed, 34 insertions(+), 3 deletions(-)
> 
> diff --git a/tools/testing/selftests/kvm/rseq_test.c b/tools/testing/selftests/kvm/rseq_test.c
> index 28f97fb52044..515cfa32a925 100644
> --- a/tools/testing/selftests/kvm/rseq_test.c
> +++ b/tools/testing/selftests/kvm/rseq_test.c
> @@ -150,7 +150,7 @@ static void *migration_worker(void *__rseq_tid)
>  		 * Use usleep() for simplicity and to avoid unnecessary kernel
>  		 * dependencies.
>  		 */
> -		usleep((i % 10) + 1);
> +		usleep((rand() % 10) + 1);
>  	}
>  	done = true;
>  	return NULL;
> @@ -186,12 +186,35 @@ static void calc_min_max_cpu(void)
>  		       "Only one usable CPU, task migration not possible");
>  }
>  
> +static void usage(const char *name)

Uber nit, "help()" is more common than "usage()".

> @@ -254,9 +279,15 @@ int main(int argc, char *argv[])
>  	 * getcpu() to stabilize.  A 2:1 migration:KVM_RUN ratio is a fairly
>  	 * conservative ratio on x86-64, which can do _more_ KVM_RUNs than
>  	 * migrations given the 1us+ delay in the migration task.
> +	 *
> +	 * Another reason why it may have small migration:KVM_RUN ratio is that,
> +	 * on systems with large C-state exit latency, it may happen quite often
> +	 * that the scheduler is not able to wake up the target CPU before the
> +	 * vCPU thread is scheduled to another CPU.
>  	 */
> -	TEST_ASSERT(i > (NR_TASK_MIGRATIONS / 2),
> -		    "Only performed %d KVM_RUNs, task stalled too much?", i);
> +	TEST_ASSERT(skip_sanity_check || i > (NR_TASK_MIGRATIONS / 2),
> +		    "Only performed %d KVM_RUNs, task stalled too much? "
> +		    "Try to turn off C-states or run it with the -s option", i);

I think it's worth explicitly telling the user how to reduce CPU wakeup latency.
Also, are C-states called that on other architectures?  E.g. maybe this to avoid
confusing the user?  Not a big deal, e.g. I've no objection whatsoever to the
comment, but it seems easy enough to avoid confusing the user.

		    "Try setting /dev/cpu_dma_latency to reduce CPU wakeup latency, "
		    "or run with -s to skip this sanity check", i);
Zide Chen April 16, 2024, 7:43 p.m. UTC | #2
On 4/15/2024 4:23 PM, Sean Christopherson wrote:
> On Fri, Apr 12, 2024, Zide Chen wrote:
>> Currently, the migration worker delays 1-10 us, assuming that one
>> KVM_RUN iteration only takes a few microseconds.  But if C-state exit
>> latencies are large enough, for example, hundreds or even thousands
>> of microseconds on server CPUs, it may happen that it's not able to
>> bring the target CPU out of C-state before the migration worker starts
>> to migrate it to the next CPU.
>>
>> If the system workload is light, most CPUs could be at a certain level
>> of C-state, which may result in less successful migrations and fail the
>> migration/KVM_RUN ratio sanity check.
>>
>> This patch adds a command line option to skip the sanity check in
>> this case.
>>
>> Additionally, seems it's reasonable to randomize the length of usleep(),
>> other than delay in a fixed pattern.
> 
> This belongs in a separate patch.  And while it's reasonable on the surface, I
> doubt think it buys us anything, and it makes an already non-deterministic test
> even less deterministic.  In other words, unless a random sleep time helps find
> more bugs or finds the original bug faster, just drop the randomization.

OK, I will drop it.
> 
>> V2:
>> - removed the busy loop implementation
>> - add the new "-s" option
> 
> This belongs in the ignored part of the patch...
>>
>> Signed-off-by: Zide Chen <zide.chen@intel.com>
> 
> ...down here.

My bad. Will move it here.

> 
>> ---
>>  tools/testing/selftests/kvm/rseq_test.c | 37 +++++++++++++++++++++++--
>>  1 file changed, 34 insertions(+), 3 deletions(-)
>>
>> diff --git a/tools/testing/selftests/kvm/rseq_test.c b/tools/testing/selftests/kvm/rseq_test.c
>> index 28f97fb52044..515cfa32a925 100644
>> --- a/tools/testing/selftests/kvm/rseq_test.c
>> +++ b/tools/testing/selftests/kvm/rseq_test.c
>> @@ -150,7 +150,7 @@ static void *migration_worker(void *__rseq_tid)
>>  		 * Use usleep() for simplicity and to avoid unnecessary kernel
>>  		 * dependencies.
>>  		 */
>> -		usleep((i % 10) + 1);
>> +		usleep((rand() % 10) + 1);
>>  	}
>>  	done = true;
>>  	return NULL;
>> @@ -186,12 +186,35 @@ static void calc_min_max_cpu(void)
>>  		       "Only one usable CPU, task migration not possible");
>>  }
>>  
>> +static void usage(const char *name)
> 
> Uber nit, "help()" is more common than "usage()".

OK, will do.

> 
>> @@ -254,9 +279,15 @@ int main(int argc, char *argv[])
>>  	 * getcpu() to stabilize.  A 2:1 migration:KVM_RUN ratio is a fairly
>>  	 * conservative ratio on x86-64, which can do _more_ KVM_RUNs than
>>  	 * migrations given the 1us+ delay in the migration task.
>> +	 *
>> +	 * Another reason why it may have small migration:KVM_RUN ratio is that,
>> +	 * on systems with large C-state exit latency, it may happen quite often
>> +	 * that the scheduler is not able to wake up the target CPU before the
>> +	 * vCPU thread is scheduled to another CPU.
>>  	 */
>> -	TEST_ASSERT(i > (NR_TASK_MIGRATIONS / 2),
>> -		    "Only performed %d KVM_RUNs, task stalled too much?", i);
>> +	TEST_ASSERT(skip_sanity_check || i > (NR_TASK_MIGRATIONS / 2),
>> +		    "Only performed %d KVM_RUNs, task stalled too much? "
>> +		    "Try to turn off C-states or run it with the -s option", i);
> 
> I think it's worth explicitly telling the user how to reduce CPU wakeup latency.

OK, I will add more details in the TEST_ASSERT() statement.

> Also, are C-states called that on other architectures?  E.g. maybe this to avoid
> confusing the user?  Not a big deal, e.g. I've no objection whatsoever to the
> comment, but it seems easy enough to avoid confusing the user.

Yes, I agree C-State is not the best term here. I can remove the word
C-state from this patch.

Initially I thought it's OK since C-State is an arch independent term
from ACPI spec. But in Linux, it's not adopted by other arch code other
than x86 and ARM.  Furthermore, the ARM Architecture Reference Manual
simply uses "low power mode" and doesn't adopt the term C-State.
diff mbox series

Patch

diff --git a/tools/testing/selftests/kvm/rseq_test.c b/tools/testing/selftests/kvm/rseq_test.c
index 28f97fb52044..515cfa32a925 100644
--- a/tools/testing/selftests/kvm/rseq_test.c
+++ b/tools/testing/selftests/kvm/rseq_test.c
@@ -150,7 +150,7 @@  static void *migration_worker(void *__rseq_tid)
 		 * Use usleep() for simplicity and to avoid unnecessary kernel
 		 * dependencies.
 		 */
-		usleep((i % 10) + 1);
+		usleep((rand() % 10) + 1);
 	}
 	done = true;
 	return NULL;
@@ -186,12 +186,35 @@  static void calc_min_max_cpu(void)
 		       "Only one usable CPU, task migration not possible");
 }
 
+static void usage(const char *name)
+{
+	puts("");
+	printf("usage: %s [-h] [-s]\n", name);
+	printf(" -s: skip the sanity check for successful KVM_RUN.\n");
+	puts("");
+	exit(0);
+}
+
 int main(int argc, char *argv[])
 {
 	int r, i, snapshot;
 	struct kvm_vm *vm;
 	struct kvm_vcpu *vcpu;
 	u32 cpu, rseq_cpu;
+	bool skip_sanity_check = false;
+	int opt;
+
+	while ((opt = getopt(argc, argv, "sh")) != -1) {
+		switch (opt) {
+		case 's':
+			skip_sanity_check = true;
+			break;
+		case 'h':
+		default:
+			usage(argv[0]);
+			break;
+		}
+	}
 
 	r = sched_getaffinity(0, sizeof(possible_mask), &possible_mask);
 	TEST_ASSERT(!r, "sched_getaffinity failed, errno = %d (%s)", errno,
@@ -199,6 +222,8 @@  int main(int argc, char *argv[])
 
 	calc_min_max_cpu();
 
+	srand(time(NULL));
+
 	r = rseq_register_current_thread();
 	TEST_ASSERT(!r, "rseq_register_current_thread failed, errno = %d (%s)",
 		    errno, strerror(errno));
@@ -254,9 +279,15 @@  int main(int argc, char *argv[])
 	 * getcpu() to stabilize.  A 2:1 migration:KVM_RUN ratio is a fairly
 	 * conservative ratio on x86-64, which can do _more_ KVM_RUNs than
 	 * migrations given the 1us+ delay in the migration task.
+	 *
+	 * Another reason why it may have small migration:KVM_RUN ratio is that,
+	 * on systems with large C-state exit latency, it may happen quite often
+	 * that the scheduler is not able to wake up the target CPU before the
+	 * vCPU thread is scheduled to another CPU.
 	 */
-	TEST_ASSERT(i > (NR_TASK_MIGRATIONS / 2),
-		    "Only performed %d KVM_RUNs, task stalled too much?", i);
+	TEST_ASSERT(skip_sanity_check || i > (NR_TASK_MIGRATIONS / 2),
+		    "Only performed %d KVM_RUNs, task stalled too much? "
+		    "Try to turn off C-states or run it with the -s option", i);
 
 	pthread_join(migration_thread, NULL);