[v6,1/3] KVM: fix steal clock warp during guest cpu hotplug

Message ID 1465813966-3116-2-git-send-email-wanpeng.li@hotmail.com (mailing list archive)
State New, archived

Commit Message

Wanpeng Li June 13, 2016, 10:32 a.m. UTC
From: Wanpeng Li <wanpeng.li@hotmail.com>

Sometimes, after CPU hotplug you can observe a spike in stolen time
(100%) followed by the CPU being marked as 100% idle when it's actually
busy with a CPU hog task.  The trace looks like the following:

cpuhp/1-12    [001] d.h1   167.461657: account_process_tick: steal = 1291385514, prev_steal_time = 0
cpuhp/1-12    [001] d.h1   167.461659: account_process_tick: steal_jiffies = 1291
<idle>-0     [001] d.h1   167.462663: account_process_tick: steal = 18732255, prev_steal_time = 1291000000
<idle>-0     [001] d.h1   167.462664: account_process_tick: steal_jiffies = 18446744072437

The sudden decrease of "steal" causes steal_jiffies to underflow.
The root cause is kvm_steal_time being reset to 0 after hot-plugging
back in a CPU.  Instead, the preexisting value can be used, which is
what the core scheduler code expects.

John Stultz also reported a similar issue after guest S3.

Suggested-by: Paolo Bonzini <pbonzini@redhat.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Radim Krčmář <rkrcmar@redhat.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Rik van Riel <riel@redhat.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: John Stultz <john.stultz@linaro.org>
Signed-off-by: Wanpeng Li <wanpeng.li@hotmail.com>
---
 arch/x86/kernel/kvm.c | 2 --
 1 file changed, 2 deletions(-)
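
The underflow described above can be reproduced outside the kernel. The following is a minimal user-space sketch, not the scheduler's actual code: `struct rq_model`, `account_steal` and `NSEC_PER_JIFFY` are hypothetical names, and the value of one jiffy (1000000 ns, i.e. HZ=1000) is an assumption inferred from the trace, where 1291385514 ns was accounted as 1291 jiffies.

```c
#include <stdint.h>

/* Assumption: HZ=1000, so one jiffy is 1,000,000 ns (inferred from the trace). */
#define NSEC_PER_JIFFY 1000000ULL

/* Hypothetical stand-in for the per-runqueue bookkeeping. */
struct rq_model {
	uint64_t prev_steal_time;	/* ns of steal time already accounted */
};

/*
 * Model of the accounting step seen in the trace: take the current steal
 * clock reading, subtract what was already accounted, convert the delta to
 * whole jiffies and remember how much was consumed.
 *
 * All arithmetic is unsigned 64-bit, so if steal_clock is smaller than
 * prev_steal_time the subtraction wraps around to a huge value.
 */
static uint64_t account_steal(struct rq_model *rq, uint64_t steal_clock)
{
	uint64_t delta = steal_clock - rq->prev_steal_time;
	uint64_t steal_jiffies = delta / NSEC_PER_JIFFY;

	rq->prev_steal_time += steal_jiffies * NSEC_PER_JIFFY;
	return steal_jiffies;
}
```

Feeding the two "steal" readings from the trace (1291385514, then 18732255) into account_steal() with prev_steal_time starting at 0 yields 1291 and then 18446744072437, matching the steal_jiffies values in the trace. The second result comes purely from the unsigned wrap-around when the steal clock goes backwards.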

Comments

Paolo Bonzini June 13, 2016, 10:44 a.m. UTC | #1
On 13/06/2016 12:32, Wanpeng Li wrote:
> From: Wanpeng Li <wanpeng.li@hotmail.com>
> 
> Sometimes, after CPU hotplug you can observe a spike in stolen time
> (100%) followed by the CPU being marked as 100% idle when it's actually
> busy with a CPU hog task.  The trace looks like the following:
> 
> cpuhp/1-12    [001] d.h1   167.461657: account_process_tick: steal = 1291385514, prev_steal_time = 0
> cpuhp/1-12    [001] d.h1   167.461659: account_process_tick: steal_jiffies = 1291
> <idle>-0     [001] d.h1   167.462663: account_process_tick: steal = 18732255, prev_steal_time = 1291000000
> <idle>-0     [001] d.h1   167.462664: account_process_tick: steal_jiffies = 18446744072437
> 
> The sudden decrease of "steal" causes steal_jiffies to underflow.
> The root cause is kvm_steal_time being reset to 0 after hot-plugging
> back in a CPU.  Instead, the preexisting value can be used, which is
> what the core scheduler code expects.
> 
> John Stultz also reported a similar issue after guest S3.
> 
> Suggested-by: Paolo Bonzini <pbonzini@redhat.com>
> Cc: Paolo Bonzini <pbonzini@redhat.com>
> Cc: Radim Krčmář <rkrcmar@redhat.com>
> Cc: Ingo Molnar <mingo@kernel.org>
> Cc: Peter Zijlstra (Intel) <peterz@infradead.org>
> Cc: Rik van Riel <riel@redhat.com>
> Cc: Thomas Gleixner <tglx@linutronix.de>
> Cc: Frederic Weisbecker <fweisbec@gmail.com>
> Cc: John Stultz <john.stultz@linaro.org>
> Signed-off-by: Wanpeng Li <wanpeng.li@hotmail.com>
> ---
>  arch/x86/kernel/kvm.c | 2 --
>  1 file changed, 2 deletions(-)
> 
> diff --git a/arch/x86/kernel/kvm.c b/arch/x86/kernel/kvm.c
> index eea2a6f..1ef5e48 100644
> --- a/arch/x86/kernel/kvm.c
> +++ b/arch/x86/kernel/kvm.c
> @@ -301,8 +301,6 @@ static void kvm_register_steal_time(void)
>  	if (!has_steal_clock)
>  		return;
>  
> -	memset(st, 0, sizeof(*st));
> -
>  	wrmsrl(MSR_KVM_STEAL_TIME, (slow_virt_to_phys(st) | KVM_MSR_ENABLED));
>  	pr_info("kvm-stealtime: cpu %d, msr %llx\n",
>  		cpu, (unsigned long long) slow_virt_to_phys(st));
> 

Because there's no cover letter, I guess I have to ack each patch
independently.

Acked-by: Paolo Bonzini <pbonzini@redhat.com>

Also, there's really no relation between patches 1-2 and 3...

Paolo
--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Peter Zijlstra June 13, 2016, 11:28 a.m. UTC | #2
On Mon, Jun 13, 2016 at 12:44:46PM +0200, Paolo Bonzini wrote:
> Because there's no cover letter, I guess I have to ack each patch
> independently.
> 
> Acked-by: Paolo Bonzini <pbonzini@redhat.com>

Thanks, I'll take the lot through the sched tree.
Wanpeng Li June 13, 2016, 11:31 a.m. UTC | #3
2016-06-13 18:44 GMT+08:00 Paolo Bonzini <pbonzini@redhat.com>:
> Because there's no cover letter, I guess I have to ack each patch
> independently.
>
> Acked-by: Paolo Bonzini <pbonzini@redhat.com>

Thanks to you and Rik for the review. Actually, there is a cover letter
for this version; it seems it was only sent to the mailing list and I
forgot to Cc the maintainers/reviewers.

Regards,
Wanpeng Li

Patch

diff --git a/arch/x86/kernel/kvm.c b/arch/x86/kernel/kvm.c
index eea2a6f..1ef5e48 100644
--- a/arch/x86/kernel/kvm.c
+++ b/arch/x86/kernel/kvm.c
@@ -301,8 +301,6 @@ static void kvm_register_steal_time(void)
 	if (!has_steal_clock)
 		return;
 
-	memset(st, 0, sizeof(*st));
-
 	wrmsrl(MSR_KVM_STEAL_TIME, (slow_virt_to_phys(st) | KVM_MSR_ENABLED));
 	pr_info("kvm-stealtime: cpu %d, msr %llx\n",
 		cpu, (unsigned long long) slow_virt_to_phys(st));