Message ID | 20220421005645.56801-1-romanton@google.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | [v2] KVM: x86: Use current rather than snapshotted TSC frequency if it is constant | expand |
On Thu, 2022-04-21 at 00:56 +0000, Anton Romanov wrote: > Don't snapshot tsc_khz into per-cpu cpu_tsc_khz if the host TSC is > constant, in which case the actual TSC frequency will never change and thus > capturing TSC during initialization is unnecessary, KVM can simply use > tsc_khz. This value is snapshotted from > kvm_timer_init->kvmclock_cpu_online->tsc_khz_changed(NULL) > > On CPUs with constant TSC, but not a hardware-specified TSC frequency, > snapshotting cpu_tsc_khz and using that to set a VM's target TSC frequency > can lead to VM to think its TSC frequency is not what it actually is if > refining the TSC completes after KVM snapshots tsc_khz. The actual > frequency never changes, only the kernel's calculation of what that > frequency is changes. > > Ideally, KVM would not be able to race with TSC refinement, or would have > a hook into tsc_refine_calibration_work() to get an alert when refinement > is complete. Avoiding the race altogether isn't practical as refinement > takes a relative eternity; it's deliberately put on a work queue outside of > the normal boot sequence to avoid unnecessarily delaying boot. > > Adding a hook is doable, but somewhat gross due to KVM's ability to be > built as a module. And if the TSC is constant, which is likely the case > for every VMX/SVM-capable CPU produced in the last decade, the race can be > hit if and only if userspace is able to create a VM before TSC refinement > completes; refinement is slow, but not that slow. > > For now, punt on a proper fix, as not taking a snapshot can help some uses > cases and not taking a snapshot is arguably correct irrespective of the > race with refinement. > > Signed-off-by: Anton Romanov <romanton@google.com> > --- > v2: > fixed commit msg indentation > added WARN_ON_ONCE in kvm_hyperv_tsc_notifier > opened up condition in __get_kvmclock > arch/x86/kvm/x86.c | 27 +++++++++++++++++++++++---- > 1 file changed, 23 insertions(+), 4 deletions(-) > > diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c > index 547ba00ef64f..1043cfd26576 100644 > --- a/arch/x86/kvm/x86.c > +++ b/arch/x86/kvm/x86.c > @@ -2907,6 +2907,19 @@ static void kvm_update_masterclock(struct kvm *kvm) > kvm_end_pvclock_update(kvm); > } > > +/* > + * If kvm is built into kernel it is possible that tsc_khz saved into > + * per-cpu cpu_tsc_khz was yet unrefined value. If CPU provides CONSTANT_TSC it > + * doesn't make sense to snapshot it anyway so just return tsc_khz > + */ > +static unsigned long get_cpu_tsc_khz(void) > +{ > + if (static_cpu_has(X86_FEATURE_CONSTANT_TSC)) > + return tsc_khz; > + else > + return __this_cpu_read(cpu_tsc_khz); > +} > + > /* Called within read_seqcount_begin/retry for kvm->pvclock_sc. */ > static void __get_kvmclock(struct kvm *kvm, struct kvm_clock_data *data) > { > @@ -2917,7 +2930,8 @@ static void __get_kvmclock(struct kvm *kvm, struct kvm_clock_data *data) > get_cpu(); > > data->flags = 0; > - if (ka->use_master_clock && __this_cpu_read(cpu_tsc_khz)) { > + if (ka->use_master_clock && > + (static_cpu_has(X86_FEATURE_CONSTANT_TSC) || __this_cpu_read(cpu_tsc_khz))) > #ifdef CONFIG_X86_64 > struct timespec64 ts; > > @@ -2931,7 +2945,7 @@ static void __get_kvmclock(struct kvm *kvm, struct kvm_clock_data *data) > data->flags |= KVM_CLOCK_TSC_STABLE; > hv_clock.tsc_timestamp = ka->master_cycle_now; > hv_clock.system_time = ka->master_kernel_ns + ka->kvmclock_offset; > - kvm_get_time_scale(NSEC_PER_SEC, __this_cpu_read(cpu_tsc_khz) * 1000LL, > + kvm_get_time_scale(NSEC_PER_SEC, get_cpu_tsc_khz() * 1000LL, > &hv_clock.tsc_shift, > &hv_clock.tsc_to_system_mul); > data->clock = __pvclock_read_cycles(&hv_clock, data->host_tsc); > @@ -3049,7 +3063,7 @@ static int kvm_guest_time_update(struct kvm_vcpu *v) > > /* Keep irq disabled to prevent changes to the clock */ > local_irq_save(flags); > - tgt_tsc_khz = __this_cpu_read(cpu_tsc_khz); > + tgt_tsc_khz = get_cpu_tsc_khz(); > if (unlikely(tgt_tsc_khz == 0)) { > local_irq_restore(flags); > kvm_make_request(KVM_REQ_CLOCK_UPDATE, v); > @@ -8646,9 +8660,12 @@ static void tsc_khz_changed(void *data) > struct cpufreq_freqs *freq = data; > unsigned long khz = 0; > > + if (boot_cpu_has(X86_FEATURE_CONSTANT_TSC)) > + return; > + > if (data) > khz = freq->new; > - else if (!boot_cpu_has(X86_FEATURE_CONSTANT_TSC)) > + else > khz = cpufreq_quick_get(raw_smp_processor_id()); > if (!khz) > khz = tsc_khz; > @@ -8661,6 +8678,8 @@ static void kvm_hyperv_tsc_notifier(void) > struct kvm *kvm; > int cpu; > > + WARN_ON_ONCE(boot_cpu_has(X86_FEATURE_TSC_RELIABLE)); > + > mutex_lock(&kvm_lock); > list_for_each_entry(kvm, &vm_list, vm_list) > kvm_make_mclock_inprogress_request(kvm); Reviewed-by: Maxim Levitsky <mlevitsk@redhat.com> A question to AMD engineers: Is that really true that AMD cpu doesn't report TSC frequency somewhere (CPUID/msr)? It really sucks that we still have to measure it. Best regards, Maxim Levitsky
Hi Anton, Thank you for the patch! Perhaps something to improve: [auto build test WARNING on kvm/master] [also build test WARNING on mst-vhost/linux-next v5.18-rc3 next-20220420] [If your patch is applied to the wrong git tree, kindly drop us a note. And when submitting patch, we suggest to use '--base' as documented in https://git-scm.com/docs/git-format-patch] url: https://github.com/intel-lab-lkp/linux/commits/Anton-Romanov/KVM-x86-Use-current-rather-than-snapshotted-TSC-frequency-if-it-is-constant/20220421-090221 base: https://git.kernel.org/pub/scm/virt/kvm/kvm.git master config: x86_64-allyesconfig (https://download.01.org/0day-ci/archive/20220421/202204211558.rjkmRfSe-lkp@intel.com/config) compiler: gcc-11 (Debian 11.2.0-20) 11.2.0 reproduce (this is a W=1 build): # https://github.com/intel-lab-lkp/linux/commit/c60b3070bd6e7e804de118dac10002e4f5f714a6 git remote add linux-review https://github.com/intel-lab-lkp/linux git fetch --no-tags linux-review Anton-Romanov/KVM-x86-Use-current-rather-than-snapshotted-TSC-frequency-if-it-is-constant/20220421-090221 git checkout c60b3070bd6e7e804de118dac10002e4f5f714a6 # save the config file mkdir build_dir && cp config build_dir/.config make W=1 O=build_dir ARCH=x86_64 SHELL=/bin/bash arch/x86/kvm/ If you fix the issue, kindly add following tag as appropriate Reported-by: kernel test robot <lkp@intel.com> All warnings (new ones prefixed by >>): arch/x86/kvm/x86.c: In function '__get_kvmclock': arch/x86/kvm/x86.c:2936:17: error: expected expression before 'struct' 2936 | struct timespec64 ts; | ^~~~~~ >> arch/x86/kvm/x86.c:2933:9: warning: this 'if' clause does not guard... [-Wmisleading-indentation] 2933 | if (ka->use_master_clock && | ^~ arch/x86/kvm/x86.c:2938:17: note: ...this statement, but the latter is misleadingly indented as if it were guarded by the 'if' 2938 | if (kvm_get_walltime_and_clockread(&ts, &data->host_tsc)) { | ^~ arch/x86/kvm/x86.c:2938:53: error: 'ts' undeclared (first use in this function); did you mean 'tms'? 2938 | if (kvm_get_walltime_and_clockread(&ts, &data->host_tsc)) { | ^~ | tms arch/x86/kvm/x86.c:2938:53: note: each undeclared identifier is reported only once for each function it appears in arch/x86/kvm/x86.c: At top level: arch/x86/kvm/x86.c:2952:11: error: expected identifier or '(' before 'else' 2952 | } else { | ^~~~ In file included from include/linux/percpu.h:6, from include/linux/context_tracking_state.h:5, from include/linux/hardirq.h:5, from include/linux/kvm_host.h:7, from arch/x86/kvm/x86.c:19: include/linux/preempt.h:219:1: error: expected identifier or '(' before 'do' 219 | do { \ | ^~ include/linux/smp.h:268:33: note: in expansion of macro 'preempt_enable' 268 | #define put_cpu() preempt_enable() | ^~~~~~~~~~~~~~ arch/x86/kvm/x86.c:2956:9: note: in expansion of macro 'put_cpu' 2956 | put_cpu(); | ^~~~~~~ include/linux/preempt.h:223:3: error: expected identifier or '(' before 'while' 223 | } while (0) | ^~~~~ include/linux/smp.h:268:33: note: in expansion of macro 'preempt_enable' 268 | #define put_cpu() preempt_enable() | ^~~~~~~~~~~~~~ arch/x86/kvm/x86.c:2956:9: note: in expansion of macro 'put_cpu' 2956 | put_cpu(); | ^~~~~~~ arch/x86/kvm/x86.c:2957:1: error: expected identifier or '(' before '}' token 2957 | } | ^ vim +/if +2933 arch/x86/kvm/x86.c 2922 2923 /* Called within read_seqcount_begin/retry for kvm->pvclock_sc. */ 2924 static void __get_kvmclock(struct kvm *kvm, struct kvm_clock_data *data) 2925 { 2926 struct kvm_arch *ka = &kvm->arch; 2927 struct pvclock_vcpu_time_info hv_clock; 2928 2929 /* both __this_cpu_read() and rdtsc() should be on the same cpu */ 2930 get_cpu(); 2931 2932 data->flags = 0; > 2933 if (ka->use_master_clock && 2934 (static_cpu_has(X86_FEATURE_CONSTANT_TSC) || __this_cpu_read(cpu_tsc_khz))) 2935 #ifdef CONFIG_X86_64 2936 struct timespec64 ts; 2937 2938 if (kvm_get_walltime_and_clockread(&ts, &data->host_tsc)) { 2939 data->realtime = ts.tv_nsec + NSEC_PER_SEC * ts.tv_sec; 2940 data->flags |= KVM_CLOCK_REALTIME | KVM_CLOCK_HOST_TSC; 2941 } else 2942 #endif 2943 data->host_tsc = rdtsc(); 2944 2945 data->flags |= KVM_CLOCK_TSC_STABLE; 2946 hv_clock.tsc_timestamp = ka->master_cycle_now; 2947 hv_clock.system_time = ka->master_kernel_ns + ka->kvmclock_offset; 2948 kvm_get_time_scale(NSEC_PER_SEC, get_cpu_tsc_khz() * 1000LL, 2949 &hv_clock.tsc_shift, 2950 &hv_clock.tsc_to_system_mul); 2951 data->clock = __pvclock_read_cycles(&hv_clock, data->host_tsc); 2952 } else { 2953 data->clock = get_kvmclock_base_ns() + ka->kvmclock_offset; 2954 } 2955 2956 put_cpu(); 2957 } 2958
On Thu, 2022-04-21 at 16:09 +0800, kernel test robot wrote: > Hi Anton, > > Thank you for the patch! Perhaps something to improve: > > [auto build test WARNING on kvm/master] > [also build test WARNING on mst-vhost/linux-next v5.18-rc3 next-20220420] > [If your patch is applied to the wrong git tree, kindly drop us a note. > And when submitting patch, we suggest to use '--base' as documented in > https://git-scm.com/docs/git-format-patch] > > url: https://github.com/intel-lab-lkp/linux/commits/Anton-Romanov/KVM-x86-Use-current-rather-than-snapshotted-TSC-frequency-if-it-is-constant/20220421-090221 > base: https://git.kernel.org/pub/scm/virt/kvm/kvm.git master > config: x86_64-allyesconfig (https://download.01.org/0day-ci/archive/20220421/202204211558.rjkmRfSe-lkp@intel.com/config) > compiler: gcc-11 (Debian 11.2.0-20) 11.2.0 > reproduce (this is a W=1 build): > # https://github.com/intel-lab-lkp/linux/commit/c60b3070bd6e7e804de118dac10002e4f5f714a6 > git remote add linux-review https://github.com/intel-lab-lkp/linux > git fetch --no-tags linux-review Anton-Romanov/KVM-x86-Use-current-rather-than-snapshotted-TSC-frequency-if-it-is-constant/20220421-090221 > git checkout c60b3070bd6e7e804de118dac10002e4f5f714a6 > # save the config file > mkdir build_dir && cp config build_dir/.config > make W=1 O=build_dir ARCH=x86_64 SHELL=/bin/bash arch/x86/kvm/ > > If you fix the issue, kindly add following tag as appropriate > Reported-by: kernel test robot <lkp@intel.com> > > All warnings (new ones prefixed by >>): > > arch/x86/kvm/x86.c: In function '__get_kvmclock': > arch/x86/kvm/x86.c:2936:17: error: expected expression before 'struct' > 2936 | struct timespec64 ts; > | ^~~~~~ > > > arch/x86/kvm/x86.c:2933:9: warning: this 'if' clause does not guard... [-Wmisleading-indentation] > 2933 | if (ka->use_master_clock && The bot is right, looks like '{' got removed by a mistake, I didn't notice. Best regards, Maxim Levitsky > | ^~ > arch/x86/kvm/x86.c:2938:17: note: ...this statement, but the latter is misleadingly indented as if it were guarded by the 'if' > 2938 | if (kvm_get_walltime_and_clockread(&ts, &data->host_tsc)) { > | ^~ > arch/x86/kvm/x86.c:2938:53: error: 'ts' undeclared (first use in this function); did you mean 'tms'? > 2938 | if (kvm_get_walltime_and_clockread(&ts, &data->host_tsc)) { > | ^~ > | tms > arch/x86/kvm/x86.c:2938:53: note: each undeclared identifier is reported only once for each function it appears in > arch/x86/kvm/x86.c: At top level: > arch/x86/kvm/x86.c:2952:11: error: expected identifier or '(' before 'else' > 2952 | } else { > | ^~~~ > In file included from include/linux/percpu.h:6, > from include/linux/context_tracking_state.h:5, > from include/linux/hardirq.h:5, > from include/linux/kvm_host.h:7, > from arch/x86/kvm/x86.c:19: > include/linux/preempt.h:219:1: error: expected identifier or '(' before 'do' > 219 | do { \ > | ^~ > include/linux/smp.h:268:33: note: in expansion of macro 'preempt_enable' > 268 | #define put_cpu() preempt_enable() > | ^~~~~~~~~~~~~~ > arch/x86/kvm/x86.c:2956:9: note: in expansion of macro 'put_cpu' > 2956 | put_cpu(); > | ^~~~~~~ > include/linux/preempt.h:223:3: error: expected identifier or '(' before 'while' > 223 | } while (0) > | ^~~~~ > include/linux/smp.h:268:33: note: in expansion of macro 'preempt_enable' > 268 | #define put_cpu() preempt_enable() > | ^~~~~~~~~~~~~~ > arch/x86/kvm/x86.c:2956:9: note: in expansion of macro 'put_cpu' > 2956 | put_cpu(); > | ^~~~~~~ > arch/x86/kvm/x86.c:2957:1: error: expected identifier or '(' before '}' token > 2957 | } > | ^ > > > vim +/if +2933 arch/x86/kvm/x86.c > > 2922 > 2923 /* Called within read_seqcount_begin/retry for kvm->pvclock_sc. */ > 2924 static void __get_kvmclock(struct kvm *kvm, struct kvm_clock_data *data) > 2925 { > 2926 struct kvm_arch *ka = &kvm->arch; > 2927 struct pvclock_vcpu_time_info hv_clock; > 2928 > 2929 /* both __this_cpu_read() and rdtsc() should be on the same cpu */ > 2930 get_cpu(); > 2931 > 2932 data->flags = 0; > > 2933 if (ka->use_master_clock && > 2934 (static_cpu_has(X86_FEATURE_CONSTANT_TSC) || __this_cpu_read(cpu_tsc_khz))) > 2935 #ifdef CONFIG_X86_64 > 2936 struct timespec64 ts; > 2937 > 2938 if (kvm_get_walltime_and_clockread(&ts, &data->host_tsc)) { > 2939 data->realtime = ts.tv_nsec + NSEC_PER_SEC * ts.tv_sec; > 2940 data->flags |= KVM_CLOCK_REALTIME | KVM_CLOCK_HOST_TSC; > 2941 } else > 2942 #endif > 2943 data->host_tsc = rdtsc(); > 2944 > 2945 data->flags |= KVM_CLOCK_TSC_STABLE; > 2946 hv_clock.tsc_timestamp = ka->master_cycle_now; > 2947 hv_clock.system_time = ka->master_kernel_ns + ka->kvmclock_offset; > 2948 kvm_get_time_scale(NSEC_PER_SEC, get_cpu_tsc_khz() * 1000LL, > 2949 &hv_clock.tsc_shift, > 2950 &hv_clock.tsc_to_system_mul); > 2951 data->clock = __pvclock_read_cycles(&hv_clock, data->host_tsc); > 2952 } else { > 2953 data->clock = get_kvmclock_base_ns() + ka->kvmclock_offset; > 2954 } > 2955 > 2956 put_cpu(); > 2957 } > 2958 >
Hi Anton, Thank you for the patch! Yet something to improve: [auto build test ERROR on kvm/master] [also build test ERROR on mst-vhost/linux-next v5.18-rc3 next-20220421] [If your patch is applied to the wrong git tree, kindly drop us a note. And when submitting patch, we suggest to use '--base' as documented in https://git-scm.com/docs/git-format-patch] url: https://github.com/intel-lab-lkp/linux/commits/Anton-Romanov/KVM-x86-Use-current-rather-than-snapshotted-TSC-frequency-if-it-is-constant/20220421-090221 base: https://git.kernel.org/pub/scm/virt/kvm/kvm.git master config: i386-randconfig-a003 (https://download.01.org/0day-ci/archive/20220421/202204211712.vEZBWfWu-lkp@intel.com/config) compiler: gcc-11 (Debian 11.2.0-20) 11.2.0 reproduce (this is a W=1 build): # https://github.com/intel-lab-lkp/linux/commit/c60b3070bd6e7e804de118dac10002e4f5f714a6 git remote add linux-review https://github.com/intel-lab-lkp/linux git fetch --no-tags linux-review Anton-Romanov/KVM-x86-Use-current-rather-than-snapshotted-TSC-frequency-if-it-is-constant/20220421-090221 git checkout c60b3070bd6e7e804de118dac10002e4f5f714a6 # save the config file mkdir build_dir && cp config build_dir/.config make W=1 O=build_dir ARCH=i386 SHELL=/bin/bash arch/x86/kvm/ If you fix the issue, kindly add following tag as appropriate Reported-by: kernel test robot <lkp@intel.com> All error/warnings (new ones prefixed by >>): arch/x86/kvm/x86.c: In function '__get_kvmclock': >> arch/x86/kvm/x86.c:2933:9: warning: this 'if' clause does not guard... [-Wmisleading-indentation] 2933 | if (ka->use_master_clock && | ^~ arch/x86/kvm/x86.c:2945:17: note: ...this statement, but the latter is misleadingly indented as if it were guarded by the 'if' 2945 | data->flags |= KVM_CLOCK_TSC_STABLE; | ^~~~ arch/x86/kvm/x86.c: At top level: >> arch/x86/kvm/x86.c:2952:11: error: expected identifier or '(' before 'else' 2952 | } else { | ^~~~ In file included from include/linux/percpu.h:6, from include/linux/context_tracking_state.h:5, from include/linux/hardirq.h:5, from include/linux/kvm_host.h:7, from arch/x86/kvm/x86.c:19: include/linux/preempt.h:240:1: error: expected identifier or '(' before 'do' 240 | do { \ | ^~ include/linux/smp.h:268:33: note: in expansion of macro 'preempt_enable' 268 | #define put_cpu() preempt_enable() | ^~~~~~~~~~~~~~ arch/x86/kvm/x86.c:2956:9: note: in expansion of macro 'put_cpu' 2956 | put_cpu(); | ^~~~~~~ include/linux/preempt.h:243:3: error: expected identifier or '(' before 'while' 243 | } while (0) | ^~~~~ include/linux/smp.h:268:33: note: in expansion of macro 'preempt_enable' 268 | #define put_cpu() preempt_enable() | ^~~~~~~~~~~~~~ arch/x86/kvm/x86.c:2956:9: note: in expansion of macro 'put_cpu' 2956 | put_cpu(); | ^~~~~~~ >> arch/x86/kvm/x86.c:2957:1: error: expected identifier or '(' before '}' token 2957 | } | ^ vim +2952 arch/x86/kvm/x86.c c60b3070bd6e7e Anton Romanov 2022-04-21 2922 869b44211adc87 Paolo Bonzini 2021-09-16 2923 /* Called within read_seqcount_begin/retry for kvm->pvclock_sc. */ 869b44211adc87 Paolo Bonzini 2021-09-16 2924 static void __get_kvmclock(struct kvm *kvm, struct kvm_clock_data *data) 108b249c453dd7 Paolo Bonzini 2016-09-01 2925 { 108b249c453dd7 Paolo Bonzini 2016-09-01 2926 struct kvm_arch *ka = &kvm->arch; 8b953440645631 Paolo Bonzini 2016-11-16 2927 struct pvclock_vcpu_time_info hv_clock; 8b953440645631 Paolo Bonzini 2016-11-16 2928 e2c2206a18993b Wanpeng Li 2017-05-11 2929 /* both __this_cpu_read() and rdtsc() should be on the same cpu */ e2c2206a18993b Wanpeng Li 2017-05-11 2930 get_cpu(); e2c2206a18993b Wanpeng Li 2017-05-11 2931 869b44211adc87 Paolo Bonzini 2021-09-16 2932 data->flags = 0; c60b3070bd6e7e Anton Romanov 2022-04-21 @2933 if (ka->use_master_clock && c60b3070bd6e7e Anton Romanov 2022-04-21 2934 (static_cpu_has(X86_FEATURE_CONSTANT_TSC) || __this_cpu_read(cpu_tsc_khz))) c68dc1b577eabd Oliver Upton 2021-09-16 2935 #ifdef CONFIG_X86_64 c68dc1b577eabd Oliver Upton 2021-09-16 2936 struct timespec64 ts; c68dc1b577eabd Oliver Upton 2021-09-16 2937 c68dc1b577eabd Oliver Upton 2021-09-16 2938 if (kvm_get_walltime_and_clockread(&ts, &data->host_tsc)) { c68dc1b577eabd Oliver Upton 2021-09-16 2939 data->realtime = ts.tv_nsec + NSEC_PER_SEC * ts.tv_sec; c68dc1b577eabd Oliver Upton 2021-09-16 2940 data->flags |= KVM_CLOCK_REALTIME | KVM_CLOCK_HOST_TSC; c68dc1b577eabd Oliver Upton 2021-09-16 2941 } else c68dc1b577eabd Oliver Upton 2021-09-16 2942 #endif c68dc1b577eabd Oliver Upton 2021-09-16 2943 data->host_tsc = rdtsc(); c68dc1b577eabd Oliver Upton 2021-09-16 2944 869b44211adc87 Paolo Bonzini 2021-09-16 2945 data->flags |= KVM_CLOCK_TSC_STABLE; 869b44211adc87 Paolo Bonzini 2021-09-16 2946 hv_clock.tsc_timestamp = ka->master_cycle_now; 869b44211adc87 Paolo Bonzini 2021-09-16 2947 hv_clock.system_time = ka->master_kernel_ns + ka->kvmclock_offset; c60b3070bd6e7e Anton Romanov 2022-04-21 2948 kvm_get_time_scale(NSEC_PER_SEC, get_cpu_tsc_khz() * 1000LL, 8b953440645631 Paolo Bonzini 2016-11-16 2949 &hv_clock.tsc_shift, 8b953440645631 Paolo Bonzini 2016-11-16 2950 &hv_clock.tsc_to_system_mul); c68dc1b577eabd Oliver Upton 2021-09-16 2951 data->clock = __pvclock_read_cycles(&hv_clock, data->host_tsc); 55c0cefbdbdaca Oliver Upton 2021-09-16 @2952 } else { 55c0cefbdbdaca Oliver Upton 2021-09-16 2953 data->clock = get_kvmclock_base_ns() + ka->kvmclock_offset; 55c0cefbdbdaca Oliver Upton 2021-09-16 2954 } e2c2206a18993b Wanpeng Li 2017-05-11 2955 e2c2206a18993b Wanpeng Li 2017-05-11 2956 put_cpu(); 55c0cefbdbdaca Oliver Upton 2021-09-16 @2957 } e2c2206a18993b Wanpeng Li 2017-05-11 2958
Hi Anton,
Thank you for the patch! Perhaps something to improve:
[auto build test WARNING on kvm/master]
[also build test WARNING on mst-vhost/linux-next v5.18-rc3 next-20220421]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch]
url: https://github.com/intel-lab-lkp/linux/commits/Anton-Romanov/KVM-x86-Use-current-rather-than-snapshotted-TSC-frequency-if-it-is-constant/20220421-090221
base: https://git.kernel.org/pub/scm/virt/kvm/kvm.git master
config: x86_64-randconfig-a014 (https://download.01.org/0day-ci/archive/20220421/202204211846.Gtei0xp5-lkp@intel.com/config)
compiler: clang version 15.0.0 (https://github.com/llvm/llvm-project bac6cd5bf85669e3376610cfc4c4f9ca015e7b9b)
reproduce (this is a W=1 build):
wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross
chmod +x ~/bin/make.cross
# https://github.com/intel-lab-lkp/linux/commit/c60b3070bd6e7e804de118dac10002e4f5f714a6
git remote add linux-review https://github.com/intel-lab-lkp/linux
git fetch --no-tags linux-review Anton-Romanov/KVM-x86-Use-current-rather-than-snapshotted-TSC-frequency-if-it-is-constant/20220421-090221
git checkout c60b3070bd6e7e804de118dac10002e4f5f714a6
# save the config file
mkdir build_dir && cp config build_dir/.config
COMPILER_INSTALL_PATH=$HOME/0day COMPILER=clang make.cross W=1 O=build_dir ARCH=x86_64 SHELL=/bin/bash arch/x86/kvm/
If you fix the issue, kindly add following tag as appropriate
Reported-by: kernel test robot <lkp@intel.com>
All warnings (new ones prefixed by >>):
arch/x86/kvm/x86.c:2936:3: error: expected expression
struct timespec64 ts;
^
>> arch/x86/kvm/x86.c:2938:3: warning: misleading indentation; statement is not part of the previous 'if' [-Wmisleading-indentation]
if (kvm_get_walltime_and_clockread(&ts, &data->host_tsc)) {
^
arch/x86/kvm/x86.c:2933:2: note: previous statement is here
if (ka->use_master_clock &&
^
arch/x86/kvm/x86.c:2938:39: error: use of undeclared identifier 'ts'
if (kvm_get_walltime_and_clockread(&ts, &data->host_tsc)) {
^
arch/x86/kvm/x86.c:2939:21: error: use of undeclared identifier 'ts'
data->realtime = ts.tv_nsec + NSEC_PER_SEC * ts.tv_sec;
^
arch/x86/kvm/x86.c:2939:49: error: use of undeclared identifier 'ts'
data->realtime = ts.tv_nsec + NSEC_PER_SEC * ts.tv_sec;
^
arch/x86/kvm/x86.c:2952:4: error: expected identifier or '('
} else {
^
arch/x86/kvm/x86.c:2956:2: error: expected identifier or '('
put_cpu();
^
include/linux/smp.h:268:20: note: expanded from macro 'put_cpu'
#define put_cpu() preempt_enable()
^
include/linux/preempt.h:239:26: note: expanded from macro 'preempt_enable'
#define preempt_enable() \
^
arch/x86/kvm/x86.c:2956:2: error: expected identifier or '('
include/linux/smp.h:268:20: note: expanded from macro 'put_cpu'
#define put_cpu() preempt_enable()
^
include/linux/preempt.h:243:3: note: expanded from macro 'preempt_enable'
} while (0)
^
arch/x86/kvm/x86.c:2957:1: error: extraneous closing brace ('}')
}
^
1 warning and 8 errors generated.
vim +/if +2938 arch/x86/kvm/x86.c
c60b3070bd6e7e Anton Romanov 2022-04-21 2922
869b44211adc87 Paolo Bonzini 2021-09-16 2923 /* Called within read_seqcount_begin/retry for kvm->pvclock_sc. */
869b44211adc87 Paolo Bonzini 2021-09-16 2924 static void __get_kvmclock(struct kvm *kvm, struct kvm_clock_data *data)
108b249c453dd7 Paolo Bonzini 2016-09-01 2925 {
108b249c453dd7 Paolo Bonzini 2016-09-01 2926 struct kvm_arch *ka = &kvm->arch;
8b953440645631 Paolo Bonzini 2016-11-16 2927 struct pvclock_vcpu_time_info hv_clock;
8b953440645631 Paolo Bonzini 2016-11-16 2928
e2c2206a18993b Wanpeng Li 2017-05-11 2929 /* both __this_cpu_read() and rdtsc() should be on the same cpu */
e2c2206a18993b Wanpeng Li 2017-05-11 2930 get_cpu();
e2c2206a18993b Wanpeng Li 2017-05-11 2931
869b44211adc87 Paolo Bonzini 2021-09-16 2932 data->flags = 0;
c60b3070bd6e7e Anton Romanov 2022-04-21 2933 if (ka->use_master_clock &&
c60b3070bd6e7e Anton Romanov 2022-04-21 2934 (static_cpu_has(X86_FEATURE_CONSTANT_TSC) || __this_cpu_read(cpu_tsc_khz)))
c68dc1b577eabd Oliver Upton 2021-09-16 2935 #ifdef CONFIG_X86_64
c68dc1b577eabd Oliver Upton 2021-09-16 2936 struct timespec64 ts;
c68dc1b577eabd Oliver Upton 2021-09-16 2937
c68dc1b577eabd Oliver Upton 2021-09-16 @2938 if (kvm_get_walltime_and_clockread(&ts, &data->host_tsc)) {
c68dc1b577eabd Oliver Upton 2021-09-16 2939 data->realtime = ts.tv_nsec + NSEC_PER_SEC * ts.tv_sec;
c68dc1b577eabd Oliver Upton 2021-09-16 2940 data->flags |= KVM_CLOCK_REALTIME | KVM_CLOCK_HOST_TSC;
c68dc1b577eabd Oliver Upton 2021-09-16 2941 } else
c68dc1b577eabd Oliver Upton 2021-09-16 2942 #endif
c68dc1b577eabd Oliver Upton 2021-09-16 2943 data->host_tsc = rdtsc();
c68dc1b577eabd Oliver Upton 2021-09-16 2944
869b44211adc87 Paolo Bonzini 2021-09-16 2945 data->flags |= KVM_CLOCK_TSC_STABLE;
869b44211adc87 Paolo Bonzini 2021-09-16 2946 hv_clock.tsc_timestamp = ka->master_cycle_now;
869b44211adc87 Paolo Bonzini 2021-09-16 2947 hv_clock.system_time = ka->master_kernel_ns + ka->kvmclock_offset;
c60b3070bd6e7e Anton Romanov 2022-04-21 2948 kvm_get_time_scale(NSEC_PER_SEC, get_cpu_tsc_khz() * 1000LL,
8b953440645631 Paolo Bonzini 2016-11-16 2949 &hv_clock.tsc_shift,
8b953440645631 Paolo Bonzini 2016-11-16 2950 &hv_clock.tsc_to_system_mul);
c68dc1b577eabd Oliver Upton 2021-09-16 2951 data->clock = __pvclock_read_cycles(&hv_clock, data->host_tsc);
55c0cefbdbdaca Oliver Upton 2021-09-16 2952 } else {
55c0cefbdbdaca Oliver Upton 2021-09-16 2953 data->clock = get_kvmclock_base_ns() + ka->kvmclock_offset;
55c0cefbdbdaca Oliver Upton 2021-09-16 2954 }
e2c2206a18993b Wanpeng Li 2017-05-11 2955
e2c2206a18993b Wanpeng Li 2017-05-11 2956 put_cpu();
55c0cefbdbdaca Oliver Upton 2021-09-16 2957 }
e2c2206a18993b Wanpeng Li 2017-05-11 2958
Hi Anton, Thank you for the patch! Yet something to improve: [auto build test ERROR on kvm/master] [also build test ERROR on mst-vhost/linux-next v5.18-rc3 next-20220421] [If your patch is applied to the wrong git tree, kindly drop us a note. And when submitting patch, we suggest to use '--base' as documented in https://git-scm.com/docs/git-format-patch] url: https://github.com/intel-lab-lkp/linux/commits/Anton-Romanov/KVM-x86-Use-current-rather-than-snapshotted-TSC-frequency-if-it-is-constant/20220421-090221 base: https://git.kernel.org/pub/scm/virt/kvm/kvm.git master config: x86_64-randconfig-a015 (https://download.01.org/0day-ci/archive/20220421/202204212040.klu29ce8-lkp@intel.com/config) compiler: gcc-11 (Debian 11.2.0-20) 11.2.0 reproduce (this is a W=1 build): # https://github.com/intel-lab-lkp/linux/commit/c60b3070bd6e7e804de118dac10002e4f5f714a6 git remote add linux-review https://github.com/intel-lab-lkp/linux git fetch --no-tags linux-review Anton-Romanov/KVM-x86-Use-current-rather-than-snapshotted-TSC-frequency-if-it-is-constant/20220421-090221 git checkout c60b3070bd6e7e804de118dac10002e4f5f714a6 # save the config file mkdir build_dir && cp config build_dir/.config make W=1 O=build_dir ARCH=x86_64 SHELL=/bin/bash If you fix the issue, kindly add following tag as appropriate Reported-by: kernel test robot <lkp@intel.com> All errors (new ones prefixed by >>): arch/x86/kvm/x86.c: In function '__get_kvmclock': >> arch/x86/kvm/x86.c:2936:17: error: expected expression before 'struct' 2936 | struct timespec64 ts; | ^~~~~~ arch/x86/kvm/x86.c:2933:9: warning: this 'if' clause does not guard... [-Wmisleading-indentation] 2933 | if (ka->use_master_clock && | ^~ arch/x86/kvm/x86.c:2938:17: note: ...this statement, but the latter is misleadingly indented as if it were guarded by the 'if' 2938 | if (kvm_get_walltime_and_clockread(&ts, &data->host_tsc)) { | ^~ >> arch/x86/kvm/x86.c:2938:53: error: 'ts' undeclared (first use in this function); did you mean 'tms'? 2938 | if (kvm_get_walltime_and_clockread(&ts, &data->host_tsc)) { | ^~ | tms arch/x86/kvm/x86.c:2938:53: note: each undeclared identifier is reported only once for each function it appears in arch/x86/kvm/x86.c: At top level: arch/x86/kvm/x86.c:2952:11: error: expected identifier or '(' before 'else' 2952 | } else { | ^~~~ In file included from include/linux/percpu.h:6, from include/linux/context_tracking_state.h:5, from include/linux/hardirq.h:5, from include/linux/kvm_host.h:7, from arch/x86/kvm/x86.c:19: include/linux/preempt.h:240:1: error: expected identifier or '(' before 'do' 240 | do { \ | ^~ include/linux/smp.h:268:33: note: in expansion of macro 'preempt_enable' 268 | #define put_cpu() preempt_enable() | ^~~~~~~~~~~~~~ arch/x86/kvm/x86.c:2956:9: note: in expansion of macro 'put_cpu' 2956 | put_cpu(); | ^~~~~~~ include/linux/preempt.h:243:3: error: expected identifier or '(' before 'while' 243 | } while (0) | ^~~~~ include/linux/smp.h:268:33: note: in expansion of macro 'preempt_enable' 268 | #define put_cpu() preempt_enable() | ^~~~~~~~~~~~~~ arch/x86/kvm/x86.c:2956:9: note: in expansion of macro 'put_cpu' 2956 | put_cpu(); | ^~~~~~~ arch/x86/kvm/x86.c:2957:1: error: expected identifier or '(' before '}' token 2957 | } | ^ vim +/struct +2936 arch/x86/kvm/x86.c c60b3070bd6e7e Anton Romanov 2022-04-21 2922 869b44211adc87 Paolo Bonzini 2021-09-16 2923 /* Called within read_seqcount_begin/retry for kvm->pvclock_sc. */ 869b44211adc87 Paolo Bonzini 2021-09-16 2924 static void __get_kvmclock(struct kvm *kvm, struct kvm_clock_data *data) 108b249c453dd7 Paolo Bonzini 2016-09-01 2925 { 108b249c453dd7 Paolo Bonzini 2016-09-01 2926 struct kvm_arch *ka = &kvm->arch; 8b953440645631 Paolo Bonzini 2016-11-16 2927 struct pvclock_vcpu_time_info hv_clock; 8b953440645631 Paolo Bonzini 2016-11-16 2928 e2c2206a18993b Wanpeng Li 2017-05-11 2929 /* both __this_cpu_read() and rdtsc() should be on the same cpu */ e2c2206a18993b Wanpeng Li 2017-05-11 2930 get_cpu(); e2c2206a18993b Wanpeng Li 2017-05-11 2931 869b44211adc87 Paolo Bonzini 2021-09-16 2932 data->flags = 0; c60b3070bd6e7e Anton Romanov 2022-04-21 2933 if (ka->use_master_clock && c60b3070bd6e7e Anton Romanov 2022-04-21 2934 (static_cpu_has(X86_FEATURE_CONSTANT_TSC) || __this_cpu_read(cpu_tsc_khz))) c68dc1b577eabd Oliver Upton 2021-09-16 2935 #ifdef CONFIG_X86_64 c68dc1b577eabd Oliver Upton 2021-09-16 @2936 struct timespec64 ts; c68dc1b577eabd Oliver Upton 2021-09-16 2937 c68dc1b577eabd Oliver Upton 2021-09-16 @2938 if (kvm_get_walltime_and_clockread(&ts, &data->host_tsc)) { c68dc1b577eabd Oliver Upton 2021-09-16 2939 data->realtime = ts.tv_nsec + NSEC_PER_SEC * ts.tv_sec; c68dc1b577eabd Oliver Upton 2021-09-16 2940 data->flags |= KVM_CLOCK_REALTIME | KVM_CLOCK_HOST_TSC; c68dc1b577eabd Oliver Upton 2021-09-16 2941 } else c68dc1b577eabd Oliver Upton 2021-09-16 2942 #endif c68dc1b577eabd Oliver Upton 2021-09-16 2943 data->host_tsc = rdtsc(); c68dc1b577eabd Oliver Upton 2021-09-16 2944 869b44211adc87 Paolo Bonzini 2021-09-16 2945 data->flags |= KVM_CLOCK_TSC_STABLE; 869b44211adc87 Paolo Bonzini 2021-09-16 2946 hv_clock.tsc_timestamp = ka->master_cycle_now; 869b44211adc87 Paolo Bonzini 2021-09-16 2947 hv_clock.system_time = ka->master_kernel_ns + ka->kvmclock_offset; c60b3070bd6e7e Anton Romanov 2022-04-21 2948 kvm_get_time_scale(NSEC_PER_SEC, get_cpu_tsc_khz() * 1000LL, 8b953440645631 Paolo Bonzini 2016-11-16 2949 &hv_clock.tsc_shift, 8b953440645631 Paolo Bonzini 2016-11-16 2950 &hv_clock.tsc_to_system_mul); c68dc1b577eabd Oliver Upton 2021-09-16 2951 data->clock = __pvclock_read_cycles(&hv_clock, data->host_tsc); 55c0cefbdbdaca Oliver Upton 2021-09-16 2952 } else { 55c0cefbdbdaca Oliver Upton 2021-09-16 2953 data->clock = get_kvmclock_base_ns() + ka->kvmclock_offset; 55c0cefbdbdaca Oliver Upton 2021-09-16 2954 } e2c2206a18993b Wanpeng Li 2017-05-11 2955 e2c2206a18993b Wanpeng Li 2017-05-11 2956 put_cpu(); 55c0cefbdbdaca Oliver Upton 2021-09-16 2957 } e2c2206a18993b Wanpeng Li 2017-05-11 2958
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c index 547ba00ef64f..1043cfd26576 100644 --- a/arch/x86/kvm/x86.c +++ b/arch/x86/kvm/x86.c @@ -2907,6 +2907,19 @@ static void kvm_update_masterclock(struct kvm *kvm) kvm_end_pvclock_update(kvm); } +/* + * If kvm is built into kernel it is possible that tsc_khz saved into + * per-cpu cpu_tsc_khz was yet unrefined value. If CPU provides CONSTANT_TSC it + * doesn't make sense to snapshot it anyway so just return tsc_khz + */ +static unsigned long get_cpu_tsc_khz(void) +{ + if (static_cpu_has(X86_FEATURE_CONSTANT_TSC)) + return tsc_khz; + else + return __this_cpu_read(cpu_tsc_khz); +} + /* Called within read_seqcount_begin/retry for kvm->pvclock_sc. */ static void __get_kvmclock(struct kvm *kvm, struct kvm_clock_data *data) { @@ -2917,7 +2930,8 @@ static void __get_kvmclock(struct kvm *kvm, struct kvm_clock_data *data) get_cpu(); data->flags = 0; - if (ka->use_master_clock && __this_cpu_read(cpu_tsc_khz)) { + if (ka->use_master_clock && + (static_cpu_has(X86_FEATURE_CONSTANT_TSC) || __this_cpu_read(cpu_tsc_khz))) #ifdef CONFIG_X86_64 struct timespec64 ts; @@ -2931,7 +2945,7 @@ static void __get_kvmclock(struct kvm *kvm, struct kvm_clock_data *data) data->flags |= KVM_CLOCK_TSC_STABLE; hv_clock.tsc_timestamp = ka->master_cycle_now; hv_clock.system_time = ka->master_kernel_ns + ka->kvmclock_offset; - kvm_get_time_scale(NSEC_PER_SEC, __this_cpu_read(cpu_tsc_khz) * 1000LL, + kvm_get_time_scale(NSEC_PER_SEC, get_cpu_tsc_khz() * 1000LL, &hv_clock.tsc_shift, &hv_clock.tsc_to_system_mul); data->clock = __pvclock_read_cycles(&hv_clock, data->host_tsc); @@ -3049,7 +3063,7 @@ static int kvm_guest_time_update(struct kvm_vcpu *v) /* Keep irq disabled to prevent changes to the clock */ local_irq_save(flags); - tgt_tsc_khz = __this_cpu_read(cpu_tsc_khz); + tgt_tsc_khz = get_cpu_tsc_khz(); if (unlikely(tgt_tsc_khz == 0)) { local_irq_restore(flags); kvm_make_request(KVM_REQ_CLOCK_UPDATE, v); @@ -8646,9 +8660,12 @@ static void tsc_khz_changed(void *data) struct cpufreq_freqs *freq = data; unsigned long khz = 0; + if (boot_cpu_has(X86_FEATURE_CONSTANT_TSC)) + return; + if (data) khz = freq->new; - else if (!boot_cpu_has(X86_FEATURE_CONSTANT_TSC)) + else khz = cpufreq_quick_get(raw_smp_processor_id()); if (!khz) khz = tsc_khz; @@ -8661,6 +8678,8 @@ static void kvm_hyperv_tsc_notifier(void) struct kvm *kvm; int cpu; + WARN_ON_ONCE(boot_cpu_has(X86_FEATURE_TSC_RELIABLE)); + mutex_lock(&kvm_lock); list_for_each_entry(kvm, &vm_list, vm_list) kvm_make_mclock_inprogress_request(kvm);
Don't snapshot tsc_khz into per-cpu cpu_tsc_khz if the host TSC is constant, in which case the actual TSC frequency will never change and thus capturing TSC during initialization is unnecessary, KVM can simply use tsc_khz. This value is snapshotted from kvm_timer_init->kvmclock_cpu_online->tsc_khz_changed(NULL) On CPUs with constant TSC, but not a hardware-specified TSC frequency, snapshotting cpu_tsc_khz and using that to set a VM's target TSC frequency can lead to VM to think its TSC frequency is not what it actually is if refining the TSC completes after KVM snapshots tsc_khz. The actual frequency never changes, only the kernel's calculation of what that frequency is changes. Ideally, KVM would not be able to race with TSC refinement, or would have a hook into tsc_refine_calibration_work() to get an alert when refinement is complete. Avoiding the race altogether isn't practical as refinement takes a relative eternity; it's deliberately put on a work queue outside of the normal boot sequence to avoid unnecessarily delaying boot. Adding a hook is doable, but somewhat gross due to KVM's ability to be built as a module. And if the TSC is constant, which is likely the case for every VMX/SVM-capable CPU produced in the last decade, the race can be hit if and only if userspace is able to create a VM before TSC refinement completes; refinement is slow, but not that slow. For now, punt on a proper fix, as not taking a snapshot can help some uses cases and not taking a snapshot is arguably correct irrespective of the race with refinement. Signed-off-by: Anton Romanov <romanton@google.com> --- v2: fixed commit msg indentation added WARN_ON_ONCE in kvm_hyperv_tsc_notifier opened up condition in __get_kvmclock arch/x86/kvm/x86.c | 27 +++++++++++++++++++++++---- 1 file changed, 23 insertions(+), 4 deletions(-)