diff mbox

[2/2] KVM: x86: fix maintaining of kvm_clock stability on guest CPU hotplug

Message ID 1491466125-16988-3-git-send-email-dplotnikov@virtuozzo.com (mailing list archive)
State New, archived
Headers show

Commit Message

Denis Plotnikov April 6, 2017, 8:08 a.m. UTC
VCPU TSC synchronization is perfromed in kvm_write_tsc() when the TSC
value being set is within 1 second from the expected, as obtained by
extrapolating of the TSC in already synchronized VCPUs.

This is naturally achieved on all VCPUs at VM start and resume;
however on VCPU hotplug it is not: the newly added VCPU is created
with TSC == 0 while others are well ahead.

To compensate for that, consider host-initiated kvm_write_tsc() with
TSC == 0 a special case requiring synchronization regardless of the
current TSC on other VCPUs.

Signed-off-by: Denis Plotnikov <dplotnikov@virtuozzo.com>
Reviewed-by: Roman Kagan <rkagan@virtuozzo.com>
---
 arch/x86/kvm/x86.c | 26 +++++++++++++++++---------
 1 file changed, 17 insertions(+), 9 deletions(-)

Comments

Radim Krčmář April 6, 2017, 5:11 p.m. UTC | #1
2017-04-06 11:08+0300, Denis Plotnikov:
> VCPU TSC synchronization is perfromed in kvm_write_tsc() when the TSC
> value being set is within 1 second from the expected, as obtained by
> extrapolating of the TSC in already synchronized VCPUs.
> 
> This is naturally achieved on all VCPUs at VM start and resume;
> however on VCPU hotplug it is not: the newly added VCPU is created
> with TSC == 0 while others are well ahead.
> 
> To compensate for that, consider host-initiated kvm_write_tsc() with
> TSC == 0 a special case requiring synchronization regardless of the
> current TSC on other VCPUs.
> 
> Signed-off-by: Denis Plotnikov <dplotnikov@virtuozzo.com>
> Reviewed-by: Roman Kagan <rkagan@virtuozzo.com>
> ---

The idea goes well with current kvm clock.

> diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
> @@ -1455,15 +1455,23 @@ void kvm_write_tsc(struct kvm_vcpu *vcpu, struct msr_data *msr)
>  	elapsed = ns - kvm->arch.last_tsc_nsec;
>  
>  	if (vcpu->arch.virtual_tsc_khz) {
> -		u64 tsc_exp = kvm->arch.last_tsc_write +
> -					nsec_to_cycles(vcpu, elapsed);
> -		u64 tsc_hz = vcpu->arch.virtual_tsc_khz * 1000LL;
> -		/*
> -		 * Special case: TSC write with a small delta (1 second) of virtual
> -		 * cycle time against real time is interpreted as an attempt to
> -		 * synchronize the CPU.
> -		 */
> -		synchronizing = data < tsc_exp + tsc_hz && data > tsc_exp - tsc_hz;
> +		if ((data == 0) && msr->host_initiated) {
> +			/*
> +			* detection of vcpu initialization -- need to sync with other vCPUs
> +			* particularly helps to keep kvm_clock stable after CPU hotplug
> +			*/
> +			synchronizing = true;
> +		} else {
> +			u64 tsc_exp = kvm->arch.last_tsc_write +
> +						nsec_to_cycles(vcpu, elapsed);
> +			u64 tsc_hz = vcpu->arch.virtual_tsc_khz * 1000LL;
> +			/*
> +			 * Special case: TSC write with a small delta (1 second) of virtual
> +			 * cycle time against real time is interpreted as an attempt to
> +			 * synchronize the CPU.
> +			 */
> +			synchronizing = data < tsc_exp + tsc_hz && data > tsc_exp - tsc_hz;

I would have proposed to fix the bug in [1/2] myself if there weren't
too many overlong lines here.  Please wrap it, so the last element
doesn't start after 80 characters.  (And the first comment can also use
spaces before stars.)

Thanks.
Radim Krčmář April 6, 2017, 5:25 p.m. UTC | #2
2017-04-06 19:11+0200, Radim Krčmář:
> 2017-04-06 11:08+0300, Denis Plotnikov:
>> diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
>> @@ -1455,15 +1455,23 @@ void kvm_write_tsc(struct kvm_vcpu *vcpu, struct msr_data *msr)
>>  	elapsed = ns - kvm->arch.last_tsc_nsec;
>>  
>>  	if (vcpu->arch.virtual_tsc_khz) {
>> -		u64 tsc_exp = kvm->arch.last_tsc_write +
>> -					nsec_to_cycles(vcpu, elapsed);
>> -		u64 tsc_hz = vcpu->arch.virtual_tsc_khz * 1000LL;
>> -		/*
>> -		 * Special case: TSC write with a small delta (1 second) of virtual
>> -		 * cycle time against real time is interpreted as an attempt to
>> -		 * synchronize the CPU.
>> -		 */
>> -		synchronizing = data < tsc_exp + tsc_hz && data > tsc_exp - tsc_hz;
>> +		if ((data == 0) && msr->host_initiated) {

And while I'm nitpicking ... the code would also be nicer without the
parentheses around data.
diff mbox

Patch

diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 4c674eb..a4091ee 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -1455,15 +1455,23 @@  void kvm_write_tsc(struct kvm_vcpu *vcpu, struct msr_data *msr)
 	elapsed = ns - kvm->arch.last_tsc_nsec;
 
 	if (vcpu->arch.virtual_tsc_khz) {
-		u64 tsc_exp = kvm->arch.last_tsc_write +
-					nsec_to_cycles(vcpu, elapsed);
-		u64 tsc_hz = vcpu->arch.virtual_tsc_khz * 1000LL;
-		/*
-		 * Special case: TSC write with a small delta (1 second) of virtual
-		 * cycle time against real time is interpreted as an attempt to
-		 * synchronize the CPU.
-		 */
-		synchronizing = data < tsc_exp + tsc_hz && data > tsc_exp - tsc_hz;
+		if ((data == 0) && msr->host_initiated) {
+			/*
+			* detection of vcpu initialization -- need to sync with other vCPUs
+			* particularly helps to keep kvm_clock stable after CPU hotplug
+			*/
+			synchronizing = true;
+		} else {
+			u64 tsc_exp = kvm->arch.last_tsc_write +
+						nsec_to_cycles(vcpu, elapsed);
+			u64 tsc_hz = vcpu->arch.virtual_tsc_khz * 1000LL;
+			/*
+			 * Special case: TSC write with a small delta (1 second) of virtual
+			 * cycle time against real time is interpreted as an attempt to
+			 * synchronize the CPU.
+			 */
+			synchronizing = data < tsc_exp + tsc_hz && data > tsc_exp - tsc_hz;
+		}
 	}
 
 	/*