Message ID | 20220713122507.29236-5-likexu@tencent.com (mailing list archive) |
---|---|
State | New, archived |
Series | KVM: x86/pmu: Fix some corner cases including Intel PEBS |

"Don't" instead of "Not to".  Not is an adverb, not a verb itself.

On Wed, Jul 13, 2022, Like Xu wrote:
> From: Like Xu <likexu@tencent.com>
>
> The KVM accumulate an enabeld counter for at least INSTRUCTIONS or

Probably just "KVM" instead of "the KVM"?

s/enabeld/enabled

> BRANCH_INSTRUCTION hw event from any KVM emulated instructions,
> generating emulated overflow interrupt on counter overflow, which
> in theory should also happen when the PEBS counter overflows but
> it currently lacks this part of the underlying support (e.g. through
> software injection of records in the irq context or a lazy approach).
>
> In this case, KVM skips the injection of this BUFFER_OVF PMI (effectively
> dropping one PEBS record) and let the overflow counter move on. The loss
> of a single sample does not introduce a loss of accuracy, but is easily
> noticeable for certain specific instructions.
>
> This issue is expected to be addressed along with the issue
> of PEBS cross-mapped counters with a slow-path proposal.
>
> Fixes: 79f3e3b58386 ("KVM: x86/pmu: Reprogram PEBS event to emulate guest PEBS counter")
> Signed-off-by: Like Xu <likexu@tencent.com>
> ---
>  arch/x86/kvm/pmu.c | 11 ++++++++---
>  1 file changed, 8 insertions(+), 3 deletions(-)
>
> diff --git a/arch/x86/kvm/pmu.c b/arch/x86/kvm/pmu.c
> index 02f9e4f245bd..08ee0fed63d5 100644
> --- a/arch/x86/kvm/pmu.c
> +++ b/arch/x86/kvm/pmu.c
> @@ -106,9 +106,14 @@ static inline void __kvm_perf_overflow(struct kvm_pmc *pmc, bool in_pmi)
>  		return;
>
>  	if (pmc->perf_event && pmc->perf_event->attr.precise_ip) {
> -		/* Indicate PEBS overflow PMI to guest. */
> -		skip_pmi = __test_and_set_bit(GLOBAL_STATUS_BUFFER_OVF_BIT,
> -					      (unsigned long *)&pmu->global_status);
> +		if (!in_pmi) {
> +			/* The emulated instructions does not generate PEBS records. */

This needs a better comment.  IIUC, it's not that they don't generate records,
it's that KVM is _choosing_ to not generate records to hack around a different
bug(s).  If that's true a TODO or FIXME would also be nice.

> +			skip_pmi = true;
> +		} else {
> +			/* Indicate PEBS overflow PMI to guest. */
> +			skip_pmi = __test_and_set_bit(GLOBAL_STATUS_BUFFER_OVF_BIT,
> +						      (unsigned long *)&pmu->global_status);
> +		}
>  	} else {
>  		__set_bit(pmc->idx, (unsigned long *)&pmu->global_status);
>  	}
> --
> 2.37.0
>

On 21/7/2022 8:51 am, Sean Christopherson wrote:
> "Don't" instead of "Not to".  Not is an adverb, not a verb itself.
>
> On Wed, Jul 13, 2022, Like Xu wrote:
>> From: Like Xu <likexu@tencent.com>
>>
>> The KVM accumulate an enabeld counter for at least INSTRUCTIONS or
>
> Probably just "KVM" instead of "the KVM"?
>
> s/enabeld/enabled

Applied, thanks.

>
>> BRANCH_INSTRUCTION hw event from any KVM emulated instructions,
>> generating emulated overflow interrupt on counter overflow, which
>> in theory should also happen when the PEBS counter overflows but
>> it currently lacks this part of the underlying support (e.g. through
>> software injection of records in the irq context or a lazy approach).
>>
>> In this case, KVM skips the injection of this BUFFER_OVF PMI (effectively
>> dropping one PEBS record) and let the overflow counter move on. The loss
>> of a single sample does not introduce a loss of accuracy, but is easily
>> noticeable for certain specific instructions.
>>
>> This issue is expected to be addressed along with the issue
>> of PEBS cross-mapped counters with a slow-path proposal.
>>
>> Fixes: 79f3e3b58386 ("KVM: x86/pmu: Reprogram PEBS event to emulate guest PEBS counter")
>> Signed-off-by: Like Xu <likexu@tencent.com>
>> ---
>>  arch/x86/kvm/pmu.c | 11 ++++++++---
>>  1 file changed, 8 insertions(+), 3 deletions(-)
>>
>> diff --git a/arch/x86/kvm/pmu.c b/arch/x86/kvm/pmu.c
>> index 02f9e4f245bd..08ee0fed63d5 100644
>> --- a/arch/x86/kvm/pmu.c
>> +++ b/arch/x86/kvm/pmu.c
>> @@ -106,9 +106,14 @@ static inline void __kvm_perf_overflow(struct kvm_pmc *pmc, bool in_pmi)
>>  		return;
>>
>>  	if (pmc->perf_event && pmc->perf_event->attr.precise_ip) {
>> -		/* Indicate PEBS overflow PMI to guest. */
>> -		skip_pmi = __test_and_set_bit(GLOBAL_STATUS_BUFFER_OVF_BIT,
>> -					      (unsigned long *)&pmu->global_status);
>> +		if (!in_pmi) {
>> +			/* The emulated instructions does not generate PEBS records. */
>
> This needs a better comment.  IIUC, it's not that they don't generate records,
> it's that KVM is _choosing_ to not generate records to hack around a different
> bug(s).  If that's true a TODO or FIXME would also be nice.

Indeed, to understand more of the context, this part will look like this:

	if (!in_pmi) {
		/*
		 * TODO: KVM is currently _choosing_ to not generate records
		 * for emulated instructions, avoiding BUFFER_OVF PMI when
		 * there are no records. Strictly speaking, it should be done
		 * as well in the right context to improve sampling accuracy.
		 */
		skip_pmi = true;
	} else {
		/* Indicate PEBS overflow PMI to guest. */
		skip_pmi = __test_and_set_bit(GLOBAL_STATUS_BUFFER_OVF_BIT,
					      (unsigned long *)&pmu->global_status);
	}

, what do you think ?

>
>> +			skip_pmi = true;
>> +		} else {
>> +			/* Indicate PEBS overflow PMI to guest. */
>> +			skip_pmi = __test_and_set_bit(GLOBAL_STATUS_BUFFER_OVF_BIT,
>> +						      (unsigned long *)&pmu->global_status);
>> +		}
>>  	} else {
>>  		__set_bit(pmc->idx, (unsigned long *)&pmu->global_status);
>>  	}
>> --
>> 2.37.0
>>

diff --git a/arch/x86/kvm/pmu.c b/arch/x86/kvm/pmu.c
index 02f9e4f245bd..08ee0fed63d5 100644
--- a/arch/x86/kvm/pmu.c
+++ b/arch/x86/kvm/pmu.c
@@ -106,9 +106,14 @@ static inline void __kvm_perf_overflow(struct kvm_pmc *pmc, bool in_pmi)
 		return;

 	if (pmc->perf_event && pmc->perf_event->attr.precise_ip) {
-		/* Indicate PEBS overflow PMI to guest. */
-		skip_pmi = __test_and_set_bit(GLOBAL_STATUS_BUFFER_OVF_BIT,
-					      (unsigned long *)&pmu->global_status);
+		if (!in_pmi) {
+			/* The emulated instructions does not generate PEBS records. */
+			skip_pmi = true;
+		} else {
+			/* Indicate PEBS overflow PMI to guest. */
+			skip_pmi = __test_and_set_bit(GLOBAL_STATUS_BUFFER_OVF_BIT,
+						      (unsigned long *)&pmu->global_status);
+		}
 	} else {
 		__set_bit(pmc->idx, (unsigned long *)&pmu->global_status);
 	}
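
For readers who want to poke at the branch this patch touches outside the kernel, here is a minimal userspace sketch of the skip_pmi decision. It is only an illustration under stated assumptions, not the KVM implementation: struct sketch_pmu, sketch_test_and_set_bit(), sketch_overflow() and sketch_selftest() are hypothetical names, the is_pebs flag stands in for the precise_ip check, and bit 62 is assumed to match GLOBAL_STATUS_BUFFER_OVF_BIT. The only property borrowed from __test_and_set_bit() is that it sets the bit and returns its previous value.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumption: mirrors GLOBAL_STATUS_BUFFER_OVF_BIT; not taken from the patch. */
#define SKETCH_BUFFER_OVF_BIT 62

/* Hypothetical stand-in for the one kvm_pmu field the branch touches. */
struct sketch_pmu {
	uint64_t global_status;
};

/* Non-atomic test-and-set: set bit nr, return whether it was already set. */
static bool sketch_test_and_set_bit(unsigned int nr, uint64_t *word)
{
	uint64_t mask = UINT64_C(1) << nr;
	bool old = (*word & mask) != 0;

	*word |= mask;
	return old;
}

/*
 * Model of the patched branch.  Returns the value skip_pmi would take.
 *  - non-PEBS counter: set the counter's own status bit, never skip.
 *  - PEBS counter, overflow from emulation (!in_pmi): skip, leave
 *    global_status untouched (no record, so no BUFFER_OVF).
 *  - PEBS counter, overflow seen in PMI context: set BUFFER_OVF and skip
 *    only if it was already pending.
 */
static bool sketch_overflow(struct sketch_pmu *pmu, unsigned int idx,
			    bool is_pebs, bool in_pmi)
{
	if (!is_pebs) {
		pmu->global_status |= UINT64_C(1) << idx;
		return false;
	}

	if (!in_pmi)
		return true;

	return sketch_test_and_set_bit(SKETCH_BUFFER_OVF_BIT,
				       &pmu->global_status);
}

/* Walks the three cases in order; aborts via assert() on a mismatch. */
static void sketch_selftest(void)
{
	struct sketch_pmu pmu = { 0 };

	/* Emulated instruction overflows a PEBS counter: PMI is skipped. */
	assert(sketch_overflow(&pmu, 0, true, false));
	assert(pmu.global_status == 0);

	/* First PEBS overflow in PMI context: BUFFER_OVF set, PMI delivered. */
	assert(!sketch_overflow(&pmu, 0, true, true));
	assert(pmu.global_status == (UINT64_C(1) << SKETCH_BUFFER_OVF_BIT));

	/* Second one while BUFFER_OVF is still pending: PMI is skipped. */
	assert(sketch_overflow(&pmu, 0, true, true));

	/* Non-PEBS counter 3: its own bit is set, PMI is not skipped. */
	assert(!sketch_overflow(&pmu, 3, false, false));
}
```

Calling sketch_selftest() from a main() exercises all three paths. The real function also checks pmc->intr and queues the PMI after this branch; that part is left out of the sketch on purpose.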