Message ID: 20220329134632.6064-1-likexu@tencent.com
State:      New, archived
Series:     KVM: x86/pmu: Update AMD PMC smaple period to fix guest NMI-watchdog
helped to Cc stable@ list...

On 2022/3/29 21:46, Like Xu wrote:
> From: Like Xu <likexu@tencent.com>
>
> NMI-watchdog is one of the favorite features of kernel developers,
> but it does not work in AMD guest even with vPMU enabled and worse,
> the system misrepresents this capability via /proc.
>
> This is a PMC emulation error. KVM does not pass the latest valid
> value to perf_event in time when guest NMI-watchdog is running, thus
> the perf_event corresponding to the watchdog counter will enter the
> old state at some point after the first guest NMI injection, forcing
> the hardware register PMC0 to be constantly written to 0x800000000001.
>
> Meanwhile, the running counter should accurately reflect its new value
> based on the latest coordinated pmc->counter (from vPMC's point of view)
> rather than the value written directly by the guest.
>
> Fixes: 168d918f2643 ("KVM: x86: Adjust counter sample period after a wrmsr")
> Reported-by: Dongli Cao <caodongli@kingsoft.com>
> Signed-off-by: Like Xu <likexu@tencent.com>
> ---
>  arch/x86/kvm/pmu.h           | 9 +++++++++
>  arch/x86/kvm/svm/pmu.c       | 1 +
>  arch/x86/kvm/vmx/pmu_intel.c | 8 ++------
>  3 files changed, 12 insertions(+), 6 deletions(-)

Recently I also met the "NMI watchdog not working on AMD guest" issue,
I have tested this patch locally and it helps.
Reviewed-by: Yanan Wang <wangyanan55@huawei.com>
Tested-by: Yanan Wang <wangyanan55@huawei.com>

Thanks,
Yanan

> diff --git a/arch/x86/kvm/pmu.h b/arch/x86/kvm/pmu.h
> index 7a7b8d5b775e..5e7e8d163b98 100644
> --- a/arch/x86/kvm/pmu.h
> +++ b/arch/x86/kvm/pmu.h
> @@ -140,6 +140,15 @@ static inline u64 get_sample_period(struct kvm_pmc *pmc, u64 counter_value)
>  	return sample_period;
>  }
>
> +static inline void pmc_update_sample_period(struct kvm_pmc *pmc)
> +{
> +	if (!pmc->perf_event || pmc->is_paused)
> +		return;
> +
> +	perf_event_period(pmc->perf_event,
> +			  get_sample_period(pmc, pmc->counter));
> +}
> +
>  void reprogram_gp_counter(struct kvm_pmc *pmc, u64 eventsel);
>  void reprogram_fixed_counter(struct kvm_pmc *pmc, u8 ctrl, int fixed_idx);
>  void reprogram_counter(struct kvm_pmu *pmu, int pmc_idx);
> diff --git a/arch/x86/kvm/svm/pmu.c b/arch/x86/kvm/svm/pmu.c
> index 24eb935b6f85..b14860863c39 100644
> --- a/arch/x86/kvm/svm/pmu.c
> +++ b/arch/x86/kvm/svm/pmu.c
> @@ -257,6 +257,7 @@ static int amd_pmu_set_msr(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
>  	pmc = get_gp_pmc_amd(pmu, msr, PMU_TYPE_COUNTER);
>  	if (pmc) {
>  		pmc->counter += data - pmc_read_counter(pmc);
> +		pmc_update_sample_period(pmc);
>  		return 0;
>  	}
>  	/* MSR_EVNTSELn */
> diff --git a/arch/x86/kvm/vmx/pmu_intel.c b/arch/x86/kvm/vmx/pmu_intel.c
> index efa172a7278e..e64046fbcdca 100644
> --- a/arch/x86/kvm/vmx/pmu_intel.c
> +++ b/arch/x86/kvm/vmx/pmu_intel.c
> @@ -431,15 +431,11 @@ static int intel_pmu_set_msr(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
>  			    !(msr & MSR_PMC_FULL_WIDTH_BIT))
>  				data = (s64)(s32)data;
>  			pmc->counter += data - pmc_read_counter(pmc);
> -			if (pmc->perf_event && !pmc->is_paused)
> -				perf_event_period(pmc->perf_event,
> -						  get_sample_period(pmc, data));
> +			pmc_update_sample_period(pmc);
>  			return 0;
>  		} else if ((pmc = get_fixed_pmc(pmu, msr))) {
>  			pmc->counter += data - pmc_read_counter(pmc);
> -			if (pmc->perf_event && !pmc->is_paused)
> -				perf_event_period(pmc->perf_event,
> -						  get_sample_period(pmc, data));
> +			pmc_update_sample_period(pmc);
>  			return 0;
>  		} else if ((pmc = get_gp_pmc(pmu, msr, MSR_P6_EVNTSEL0))) {
>  			if (data == pmc->eventsel)
On Thu, Mar 31, 2022 at 11:51:55AM +0800, wangyanan (Y) wrote:
> helped to Cc stable@ list...
<formletter>
This is not the correct way to submit patches for inclusion in the
stable kernel tree. Please read:
https://www.kernel.org/doc/html/latest/process/stable-kernel-rules.html
for how to do this properly.
</formletter>
On 2022/3/31 12:43, Greg KH wrote:
> On Thu, Mar 31, 2022 at 11:51:55AM +0800, wangyanan (Y) wrote:
>> helped to Cc stable@ list...
>
> <formletter>
>
> This is not the correct way to submit patches for inclusion in the
> stable kernel tree. Please read:
> https://www.kernel.org/doc/html/latest/process/stable-kernel-rules.html
> for how to do this properly.
>
> </formletter>

Ah, got it. Thank you for the reminder.
On Tue, Mar 29, 2022 at 6:46 AM Like Xu <like.xu.linux@gmail.com> wrote:
>
> From: Like Xu <likexu@tencent.com>
>
> NMI-watchdog is one of the favorite features of kernel developers,
> but it does not work in AMD guest even with vPMU enabled and worse,
> the system misrepresents this capability via /proc.
>
> This is a PMC emulation error. KVM does not pass the latest valid
> value to perf_event in time when guest NMI-watchdog is running, thus
> the perf_event corresponding to the watchdog counter will enter the
> old state at some point after the first guest NMI injection, forcing
> the hardware register PMC0 to be constantly written to 0x800000000001.
>
> Meanwhile, the running counter should accurately reflect its new value
> based on the latest coordinated pmc->counter (from vPMC's point of view)
> rather than the value written directly by the guest.
>
> Fixes: 168d918f2643 ("KVM: x86: Adjust counter sample period after a wrmsr")
> Reported-by: Dongli Cao <caodongli@kingsoft.com>
> Signed-off-by: Like Xu <likexu@tencent.com>

Reviewed-by: Jim Mattson <jmattson@google.com>
Minor nit: "sample" is misspelled in the subject line.
diff --git a/arch/x86/kvm/pmu.h b/arch/x86/kvm/pmu.h
index 7a7b8d5b775e..5e7e8d163b98 100644
--- a/arch/x86/kvm/pmu.h
+++ b/arch/x86/kvm/pmu.h
@@ -140,6 +140,15 @@ static inline u64 get_sample_period(struct kvm_pmc *pmc, u64 counter_value)
 	return sample_period;
 }
 
+static inline void pmc_update_sample_period(struct kvm_pmc *pmc)
+{
+	if (!pmc->perf_event || pmc->is_paused)
+		return;
+
+	perf_event_period(pmc->perf_event,
+			  get_sample_period(pmc, pmc->counter));
+}
+
 void reprogram_gp_counter(struct kvm_pmc *pmc, u64 eventsel);
 void reprogram_fixed_counter(struct kvm_pmc *pmc, u8 ctrl, int fixed_idx);
 void reprogram_counter(struct kvm_pmu *pmu, int pmc_idx);
diff --git a/arch/x86/kvm/svm/pmu.c b/arch/x86/kvm/svm/pmu.c
index 24eb935b6f85..b14860863c39 100644
--- a/arch/x86/kvm/svm/pmu.c
+++ b/arch/x86/kvm/svm/pmu.c
@@ -257,6 +257,7 @@ static int amd_pmu_set_msr(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
 	pmc = get_gp_pmc_amd(pmu, msr, PMU_TYPE_COUNTER);
 	if (pmc) {
 		pmc->counter += data - pmc_read_counter(pmc);
+		pmc_update_sample_period(pmc);
 		return 0;
 	}
 	/* MSR_EVNTSELn */
diff --git a/arch/x86/kvm/vmx/pmu_intel.c b/arch/x86/kvm/vmx/pmu_intel.c
index efa172a7278e..e64046fbcdca 100644
--- a/arch/x86/kvm/vmx/pmu_intel.c
+++ b/arch/x86/kvm/vmx/pmu_intel.c
@@ -431,15 +431,11 @@ static int intel_pmu_set_msr(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
 			    !(msr & MSR_PMC_FULL_WIDTH_BIT))
 				data = (s64)(s32)data;
 			pmc->counter += data - pmc_read_counter(pmc);
-			if (pmc->perf_event && !pmc->is_paused)
-				perf_event_period(pmc->perf_event,
-						  get_sample_period(pmc, data));
+			pmc_update_sample_period(pmc);
 			return 0;
 		} else if ((pmc = get_fixed_pmc(pmu, msr))) {
 			pmc->counter += data - pmc_read_counter(pmc);
-			if (pmc->perf_event && !pmc->is_paused)
-				perf_event_period(pmc->perf_event,
-						  get_sample_period(pmc, data));
+			pmc_update_sample_period(pmc);
 			return 0;
 		} else if ((pmc = get_gp_pmc(pmu, msr, MSR_P6_EVNTSEL0))) {
 			if (data == pmc->eventsel)