diff mbox

x86/hvm: do not set msr_tsc_adjust on hvm_set_guest_tsc_fixed

Message ID 1484956262-18692-1-git-send-email-joao.m.martins@oracle.com (mailing list archive)
State New, archived
Headers show

Commit Message

Joao Martins Jan. 20, 2017, 11:51 p.m. UTC
Commit 6e03363 ("x86: Implement TSC adjust feature for HVM guest")
implemented TSC_ADJUST MSR for hvm guests. Though while booting
an HVM guest the boot CPU would have a value set with delta_tsc -
guest tsc while secondary CPUS would have 0. For example one can
observe:
 $ xen-hvmctx 17 | grep tsc_adjust
 TSC_ADJUST: tsc_adjust ff9377dfef47fe66
 TSC_ADJUST: tsc_adjust 0
 TSC_ADJUST: tsc_adjust 0
 TSC_ADJUST: tsc_adjust 0

Upcoming Linux 4.10 now validates whether this MSR is correct and
adjusts them accordingly under the following conditions: values of < 0
(our case for CPU 0) or != 0 or values > 7FFFFFFF. In this conditions it
will force set to 0 and for the CPUs that the value doesn't match all
together. If this msr is not correct we would see messages such as:

[Firmware Bug]: TSC ADJUST: CPU0: -30517044286984129 force to 0

And on HVM guests supporting TSC_ADJUST (requiring at least Haswell
Intel) it won't boot.

Our current vCPU 0 values are incorrect and according to Intel SDM which on
section 17.15.3 states that "On RESET, the value of the IA32_TSC_ADJUST MSR is
 0." hence we should set it 0 and be consistent across multiple vCPUs. Perhaps
this MSR should be only changed by the guest which already happens through
hvm_set_guest_tsc_adjust(..) routines (see below). After this patch guests
running Linux 4.10 will see a valid IA32_TSC_ADJUST msr of value 0 for
all CPUs and are able to boot.

On the same section of the spec (17.15.3) it is also stated:
"If an execution of WRMSR to the IA32_TIME_STAMP_COUNTER MSR
   adds (or subtracts) value X from the TSC, the logical processor also
   adds (or subtracts) value X from the IA32_TSC_ADJUST MSR."

This suggests these MSRs values should only be changed through guest i.e.
throught write intercept msrs. We keep IA32_TSC MSR logic such that writes so
accomodate adjustments to TSC_ADJUST, hence no functional change in the
msr_tsc_adjust for IA32_TSC msr. Though, we do that in a separate routine
namely hvm_set_guest_tsc_msr instead of through hvm_set_guest_tsc(...).

Signed-off-by: Joao Martins <joao.m.martins@oracle.com>
---
Cc: Jan Beulich <jbeulich@suse.com>
Cc: Andrew Cooper <andrew.cooper3@citrix.com>
Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>

Even if this is not the right way to fix it, at least I wanted to
raise the current issue with TSC_ADJUST. AFAICT this is the behaviour since
Xen 4.3 or Xen 4.4 and this logic hasn't changed ever since. Probably because
this MSR hasn't been really used/checked in guests up until now. Let me know
you folks think.
---
 xen/arch/x86/hvm/hvm.c | 13 ++++++++++---
 1 file changed, 10 insertions(+), 3 deletions(-)

Comments

Joao Martins Jan. 22, 2017, 1:52 p.m. UTC | #1
On 01/20/2017 11:51 PM, Joao Martins wrote:
> Commit 6e03363 ("x86: Implement TSC adjust feature for HVM guest")
> implemented TSC_ADJUST MSR for hvm guests. Though while booting
> an HVM guest the boot CPU would have a value set with delta_tsc -
> guest tsc while secondary CPUS would have 0. For example one can
> observe:
>  $ xen-hvmctx 17 | grep tsc_adjust
>  TSC_ADJUST: tsc_adjust ff9377dfef47fe66
>  TSC_ADJUST: tsc_adjust 0
>  TSC_ADJUST: tsc_adjust 0
>  TSC_ADJUST: tsc_adjust 0
> 
> Upcoming Linux 4.10 now validates whether this MSR is correct and
> adjusts them accordingly under the following conditions: values of < 0
> (our case for CPU 0) or != 0 or values > 7FFFFFFF. In this conditions it
> will force set to 0 and for the CPUs that the value doesn't match all
> together. If this msr is not correct we would see messages such as:
> 
> [Firmware Bug]: TSC ADJUST: CPU0: -30517044286984129 force to 0
> 
> And on HVM guests supporting TSC_ADJUST (requiring at least Haswell
> Intel) it won't boot.
> 
> Our current vCPU 0 values are incorrect and according to Intel SDM which on
> section 17.15.3 states that "On RESET, the value of the IA32_TSC_ADJUST MSR is
>  0." hence we should set it 0 and be consistent across multiple vCPUs. Perhaps
> this MSR should be only changed by the guest which already happens through
> hvm_set_guest_tsc_adjust(..) routines (see below). After this patch guests
> running Linux 4.10 will see a valid IA32_TSC_ADJUST msr of value 0 for
> all CPUs and are able to boot.
> 
> On the same section of the spec (17.15.3) it is also stated:
> "If an execution of WRMSR to the IA32_TIME_STAMP_COUNTER MSR
>    adds (or subtracts) value X from the TSC, the logical processor also
>    adds (or subtracts) value X from the IA32_TSC_ADJUST MSR."
> 
I should probably append here the citation below too from the same 17.15.3
section of the manual, to justify the next paragraph wrt to TSC Adjust only
changing through guest.

"Unlike the TSC, the value of the IA32_TSC_ADJUST MSR changes only in response
to WRMSR (either to the MSR itself, or to the IA32_TIME_STAMP_COUNTER MSR). Its
value does not otherwise change as time elapses. Software seeking to adjust the
TSC can do so by using WRMSR to write the same value to the IA32_TSC_ADJUST MSR
on each logical processor."

> This suggests these MSRs values should only be changed through guest i.e.
> throught write intercept msrs. We keep IA32_TSC MSR logic such that writes so
> accomodate adjustments to TSC_ADJUST, hence no functional change in the
> msr_tsc_adjust for IA32_TSC msr. Though, we do that in a separate routine
> namely hvm_set_guest_tsc_msr instead of through hvm_set_guest_tsc(...).
> 
> Signed-off-by: Joao Martins <joao.m.martins@oracle.com>
> ---
> Cc: Jan Beulich <jbeulich@suse.com>
> Cc: Andrew Cooper <andrew.cooper3@citrix.com>
> Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
> 
> Even if this is not the right way to fix it, at least I wanted to
> raise the current issue with TSC_ADJUST. AFAICT this is the behaviour since
> Xen 4.3 or Xen 4.4 and this logic hasn't changed ever since. Probably because
> this MSR hasn't been really used/checked in guests up until now. Let me know
> you folks think.
> ---
>  xen/arch/x86/hvm/hvm.c | 13 ++++++++++---
>  1 file changed, 10 insertions(+), 3 deletions(-)
> 
> diff --git a/xen/arch/x86/hvm/hvm.c b/xen/arch/x86/hvm/hvm.c
> index 2ec0800..06a6007 100644
> --- a/xen/arch/x86/hvm/hvm.c
> +++ b/xen/arch/x86/hvm/hvm.c
> @@ -387,13 +387,20 @@ void hvm_set_guest_tsc_fixed(struct vcpu *v, u64 guest_tsc, u64 at_tsc)
>      }
>  
>      delta_tsc = guest_tsc - tsc;
> -    v->arch.hvm_vcpu.msr_tsc_adjust += delta_tsc
> -                          - v->arch.hvm_vcpu.cache_tsc_offset;
>      v->arch.hvm_vcpu.cache_tsc_offset = delta_tsc;
>  
>      hvm_funcs.set_tsc_offset(v, v->arch.hvm_vcpu.cache_tsc_offset, at_tsc);
>  }
>  
> +static void hvm_set_guest_tsc_msr(struct vcpu *v, u64 guest_tsc)
> +{
> +    uint64_t tsc_offset = v->arch.hvm_vcpu.cache_tsc_offset;
> +
> +    hvm_set_guest_tsc(v, guest_tsc);
> +    v->arch.hvm_vcpu.msr_tsc_adjust += v->arch.hvm_vcpu.cache_tsc_offset
> +                          - tsc_offset;
> +}
> +
>  void hvm_set_guest_tsc_adjust(struct vcpu *v, u64 tsc_adjust)
>  {
>      v->arch.hvm_vcpu.cache_tsc_offset += tsc_adjust
> @@ -3491,7 +3498,7 @@ int hvm_msr_write_intercept(unsigned int msr, uint64_t msr_content,
>          break;
>  
>      case MSR_IA32_TSC:
> -        hvm_set_guest_tsc(v, msr_content);
> +        hvm_set_guest_tsc_msr(v, msr_content);
>          break;
>  
>      case MSR_IA32_TSC_ADJUST:
>
Jan Beulich Jan. 24, 2017, 9:24 a.m. UTC | #2
>>> On 21.01.17 at 00:51, <joao.m.martins@oracle.com> wrote:
> Commit 6e03363 ("x86: Implement TSC adjust feature for HVM guest")
> implemented TSC_ADJUST MSR for hvm guests. Though while booting
> an HVM guest the boot CPU would have a value set with delta_tsc -
> guest tsc while secondary CPUS would have 0. For example one can
> observe:
>  $ xen-hvmctx 17 | grep tsc_adjust
>  TSC_ADJUST: tsc_adjust ff9377dfef47fe66
>  TSC_ADJUST: tsc_adjust 0
>  TSC_ADJUST: tsc_adjust 0
>  TSC_ADJUST: tsc_adjust 0
> 
> Upcoming Linux 4.10 now validates whether this MSR is correct and
> adjusts them accordingly under the following conditions: values of < 0
> (our case for CPU 0) or != 0 or values > 7FFFFFFF. In this conditions it
> will force set to 0 and for the CPUs that the value doesn't match all
> together. If this msr is not correct we would see messages such as:
> 
> [Firmware Bug]: TSC ADJUST: CPU0: -30517044286984129 force to 0
> 
> And on HVM guests supporting TSC_ADJUST (requiring at least Haswell
> Intel) it won't boot.
> 
> Our current vCPU 0 values are incorrect and according to Intel SDM which on
> section 17.15.3 states that "On RESET, the value of the IA32_TSC_ADJUST MSR is

Please use section titles rather than section numbers, as the latter
change not too infrequently.

>  0." hence we should set it 0 and be consistent across multiple vCPUs. Perhaps
> this MSR should be only changed by the guest which already happens through
> hvm_set_guest_tsc_adjust(..) routines (see below). After this patch guests
> running Linux 4.10 will see a valid IA32_TSC_ADJUST msr of value 0 for
> all CPUs and are able to boot.
> 
> On the same section of the spec (17.15.3) it is also stated:
> "If an execution of WRMSR to the IA32_TIME_STAMP_COUNTER MSR
>    adds (or subtracts) value X from the TSC, the logical processor also
>    adds (or subtracts) value X from the IA32_TSC_ADJUST MSR."
> 
> This suggests these MSRs values should only be changed through guest i.e.
> throught write intercept msrs. We keep IA32_TSC MSR logic such that writes 
> so
> accomodate adjustments to TSC_ADJUST, hence no functional change in the
> msr_tsc_adjust for IA32_TSC msr. Though, we do that in a separate routine
> namely hvm_set_guest_tsc_msr instead of through hvm_set_guest_tsc(...).
> 
> Signed-off-by: Joao Martins <joao.m.martins@oracle.com>

With the description extended as per your reply, and with the
formal issue above taken care of
Reviewed-by: Jan Beulich <jbeulich@suse.com>

Jan
diff mbox

Patch

diff --git a/xen/arch/x86/hvm/hvm.c b/xen/arch/x86/hvm/hvm.c
index 2ec0800..06a6007 100644
--- a/xen/arch/x86/hvm/hvm.c
+++ b/xen/arch/x86/hvm/hvm.c
@@ -387,13 +387,20 @@  void hvm_set_guest_tsc_fixed(struct vcpu *v, u64 guest_tsc, u64 at_tsc)
     }
 
     delta_tsc = guest_tsc - tsc;
-    v->arch.hvm_vcpu.msr_tsc_adjust += delta_tsc
-                          - v->arch.hvm_vcpu.cache_tsc_offset;
     v->arch.hvm_vcpu.cache_tsc_offset = delta_tsc;
 
     hvm_funcs.set_tsc_offset(v, v->arch.hvm_vcpu.cache_tsc_offset, at_tsc);
 }
 
+static void hvm_set_guest_tsc_msr(struct vcpu *v, u64 guest_tsc)
+{
+    uint64_t tsc_offset = v->arch.hvm_vcpu.cache_tsc_offset;
+
+    hvm_set_guest_tsc(v, guest_tsc);
+    v->arch.hvm_vcpu.msr_tsc_adjust += v->arch.hvm_vcpu.cache_tsc_offset
+                          - tsc_offset;
+}
+
 void hvm_set_guest_tsc_adjust(struct vcpu *v, u64 tsc_adjust)
 {
     v->arch.hvm_vcpu.cache_tsc_offset += tsc_adjust
@@ -3491,7 +3498,7 @@  int hvm_msr_write_intercept(unsigned int msr, uint64_t msr_content,
         break;
 
     case MSR_IA32_TSC:
-        hvm_set_guest_tsc(v, msr_content);
+        hvm_set_guest_tsc_msr(v, msr_content);
         break;
 
     case MSR_IA32_TSC_ADJUST: