[4/7] KVM: x86/pmu: Not to generate PEBS records for emulated instructions

Message ID 20220713122507.29236-5-likexu@tencent.com (mailing list archive)
State New, archived
Series KVM: x86/pmu: Fix some corner cases including Intel PEBS

Commit Message

Like Xu July 13, 2022, 12:25 p.m. UTC
From: Like Xu <likexu@tencent.com>

The KVM accumulate an enabeld counter for at least INSTRUCTIONS or
BRANCH_INSTRUCTION hw event from any KVM emulated instructions,
generating emulated overflow interrupt on counter overflow, which
in theory should also happen when the PEBS counter overflows but
it currently lacks this part of the underlying support (e.g. through
software injection of records in the irq context or a lazy approach).

In this case, KVM skips the injection of this BUFFER_OVF PMI (effectively
dropping one PEBS record) and let the overflow counter move on. The loss
of a single sample does not introduce a loss of accuracy, but is easily
noticeable for certain specific instructions.

This issue is expected to be addressed along with the issue
of PEBS cross-mapped counters with a slow-path proposal.

Fixes: 79f3e3b58386 ("KVM: x86/pmu: Reprogram PEBS event to emulate guest PEBS counter")
Signed-off-by: Like Xu <likexu@tencent.com>
---
 arch/x86/kvm/pmu.c | 11 ++++++++---
 1 file changed, 8 insertions(+), 3 deletions(-)

Comments

Sean Christopherson July 21, 2022, 12:51 a.m. UTC | #1
"Don't" instead of "Not to".  Not is an adverb, not a verb itself.

On Wed, Jul 13, 2022, Like Xu wrote:
> From: Like Xu <likexu@tencent.com>
> 
> The KVM accumulate an enabeld counter for at least INSTRUCTIONS or

Probably just "KVM" instead of "the KVM"?

s/enabeld/enabled

> BRANCH_INSTRUCTION hw event from any KVM emulated instructions,
> generating emulated overflow interrupt on counter overflow, which
> in theory should also happen when the PEBS counter overflows but
> it currently lacks this part of the underlying support (e.g. through
> software injection of records in the irq context or a lazy approach).
> 
> In this case, KVM skips the injection of this BUFFER_OVF PMI (effectively
> dropping one PEBS record) and let the overflow counter move on. The loss
> of a single sample does not introduce a loss of accuracy, but is easily
> noticeable for certain specific instructions.
> 
> This issue is expected to be addressed along with the issue
> of PEBS cross-mapped counters with a slow-path proposal.
> 
> Fixes: 79f3e3b58386 ("KVM: x86/pmu: Reprogram PEBS event to emulate guest PEBS counter")
> Signed-off-by: Like Xu <likexu@tencent.com>
> ---
>  arch/x86/kvm/pmu.c | 11 ++++++++---
>  1 file changed, 8 insertions(+), 3 deletions(-)
> 
> diff --git a/arch/x86/kvm/pmu.c b/arch/x86/kvm/pmu.c
> index 02f9e4f245bd..08ee0fed63d5 100644
> --- a/arch/x86/kvm/pmu.c
> +++ b/arch/x86/kvm/pmu.c
> @@ -106,9 +106,14 @@ static inline void __kvm_perf_overflow(struct kvm_pmc *pmc, bool in_pmi)
>  		return;
>  
>  	if (pmc->perf_event && pmc->perf_event->attr.precise_ip) {
> -		/* Indicate PEBS overflow PMI to guest. */
> -		skip_pmi = __test_and_set_bit(GLOBAL_STATUS_BUFFER_OVF_BIT,
> -					      (unsigned long *)&pmu->global_status);
> +		if (!in_pmi) {
> +			/* The emulated instructions does not generate PEBS records. */

This needs a better comment.  IIUC, it's not that they don't generate records,
it's that KVM is _choosing_ to not generate records to hack around a different
bug(s).  If that's true a TODO or FIXME would also be nice.

> +			skip_pmi = true;
> +		} else {
> +			/* Indicate PEBS overflow PMI to guest. */
> +			skip_pmi = __test_and_set_bit(GLOBAL_STATUS_BUFFER_OVF_BIT,
> +						      (unsigned long *)&pmu->global_status);
> +		}
>  	} else {
>  		__set_bit(pmc->idx, (unsigned long *)&pmu->global_status);
>  	}
> -- 
> 2.37.0
>
Like Xu July 21, 2022, 2:22 a.m. UTC | #2
On 21/7/2022 8:51 am, Sean Christopherson wrote:
> "Don't" instead of "Not to".  Not is an adverb, not a verb itself.
> 
> On Wed, Jul 13, 2022, Like Xu wrote:
>> From: Like Xu <likexu@tencent.com>
>>
>> The KVM accumulate an enabeld counter for at least INSTRUCTIONS or
> 
> Probably just "KVM" instead of "the KVM"?
> 
> s/enabeld/enabled

Applied, thanks.

> 
>> BRANCH_INSTRUCTION hw event from any KVM emulated instructions,
>> generating emulated overflow interrupt on counter overflow, which
>> in theory should also happen when the PEBS counter overflows but
>> it currently lacks this part of the underlying support (e.g. through
>> software injection of records in the irq context or a lazy approach).
>>
>> In this case, KVM skips the injection of this BUFFER_OVF PMI (effectively
>> dropping one PEBS record) and let the overflow counter move on. The loss
>> of a single sample does not introduce a loss of accuracy, but is easily
>> noticeable for certain specific instructions.
>>
>> This issue is expected to be addressed along with the issue
>> of PEBS cross-mapped counters with a slow-path proposal.
>>
>> Fixes: 79f3e3b58386 ("KVM: x86/pmu: Reprogram PEBS event to emulate guest PEBS counter")
>> Signed-off-by: Like Xu <likexu@tencent.com>
>> ---
>>   arch/x86/kvm/pmu.c | 11 ++++++++---
>>   1 file changed, 8 insertions(+), 3 deletions(-)
>>
>> diff --git a/arch/x86/kvm/pmu.c b/arch/x86/kvm/pmu.c
>> index 02f9e4f245bd..08ee0fed63d5 100644
>> --- a/arch/x86/kvm/pmu.c
>> +++ b/arch/x86/kvm/pmu.c
>> @@ -106,9 +106,14 @@ static inline void __kvm_perf_overflow(struct kvm_pmc *pmc, bool in_pmi)
>>   		return;
>>   
>>   	if (pmc->perf_event && pmc->perf_event->attr.precise_ip) {
>> -		/* Indicate PEBS overflow PMI to guest. */
>> -		skip_pmi = __test_and_set_bit(GLOBAL_STATUS_BUFFER_OVF_BIT,
>> -					      (unsigned long *)&pmu->global_status);
>> +		if (!in_pmi) {
>> +			/* The emulated instructions does not generate PEBS records. */
> 
> This needs a better comment.  IIUC, it's not that they don't generate records,
> it's that KVM is _choosing_ to not generate records to hack around a different
> bug(s).  If that's true a TODO or FIXME would also be nice.

Indeed. To give more of the context, this part will look like this:

		if (!in_pmi) {
			/*
			 * TODO: KVM is currently _choosing_ to not generate records
			 * for emulated instructions, avoiding BUFFER_OVF PMI when
			 * there are no records. Strictly speaking, it should be done
			 * as well in the right context to improve sampling accuracy.
			 */
			skip_pmi = true;
		} else {
			/* Indicate PEBS overflow PMI to guest. */
			skip_pmi = __test_and_set_bit(GLOBAL_STATUS_BUFFER_OVF_BIT,
						      (unsigned long *)&pmu->global_status);
		}

What do you think?

> 
>> +			skip_pmi = true;
>> +		} else {
>> +			/* Indicate PEBS overflow PMI to guest. */
>> +			skip_pmi = __test_and_set_bit(GLOBAL_STATUS_BUFFER_OVF_BIT,
>> +						      (unsigned long *)&pmu->global_status);
>> +		}
>>   	} else {
>>   		__set_bit(pmc->idx, (unsigned long *)&pmu->global_status);
>>   	}
>> -- 
>> 2.37.0
>>

Patch

diff --git a/arch/x86/kvm/pmu.c b/arch/x86/kvm/pmu.c
index 02f9e4f245bd..08ee0fed63d5 100644
--- a/arch/x86/kvm/pmu.c
+++ b/arch/x86/kvm/pmu.c
@@ -106,9 +106,14 @@  static inline void __kvm_perf_overflow(struct kvm_pmc *pmc, bool in_pmi)
 		return;
 
 	if (pmc->perf_event && pmc->perf_event->attr.precise_ip) {
-		/* Indicate PEBS overflow PMI to guest. */
-		skip_pmi = __test_and_set_bit(GLOBAL_STATUS_BUFFER_OVF_BIT,
-					      (unsigned long *)&pmu->global_status);
+		if (!in_pmi) {
+			/* The emulated instructions does not generate PEBS records. */
+			skip_pmi = true;
+		} else {
+			/* Indicate PEBS overflow PMI to guest. */
+			skip_pmi = __test_and_set_bit(GLOBAL_STATUS_BUFFER_OVF_BIT,
+						      (unsigned long *)&pmu->global_status);
+		}
 	} else {
 		__set_bit(pmc->idx, (unsigned long *)&pmu->global_status);
 	}
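
For readers following the thread, the decision the patch introduces can be modeled outside the kernel. The following is only a user-space sketch under stated assumptions, not the KVM code: the `model_*` names, the two structs, and the return-value convention are hypothetical, bit 62 is assumed for BUFFER_OVF, and the real function's other checks (such as whether the counter has interrupts enabled) are left out.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed position of BUFFER_OVF in the global status word (hypothetical name). */
#define MODEL_BUFFER_OVF_BIT 62

/* Hypothetical stand-in for struct kvm_pmc. */
struct model_pmc {
	int idx;      /* counter index, used as its global_status bit */
	bool precise; /* stands in for perf_event->attr.precise_ip != 0 */
};

/* Hypothetical stand-in for the PMU state that holds global_status. */
struct model_pmu {
	uint64_t global_status;
};

/* Set a bit and report whether it was already set beforehand. */
static bool model_test_and_set_bit(int bit, uint64_t *word)
{
	uint64_t mask = UINT64_C(1) << bit;
	bool was_set = (*word & mask) != 0;

	*word |= mask;
	return was_set;
}

/*
 * Model of the patched branch. Returns true when a PMI would still be
 * delivered to the guest, false when it is skipped.
 */
static bool model_overflow(struct model_pmu *pmu, const struct model_pmc *pmc,
			   bool in_pmi)
{
	bool skip_pmi = false;

	if (pmc->precise) {
		if (!in_pmi) {
			/* Overflow came from an emulated instruction: no record, no PMI. */
			skip_pmi = true;
		} else {
			/* Skip the PMI only if BUFFER_OVF was already pending. */
			skip_pmi = model_test_and_set_bit(MODEL_BUFFER_OVF_BIT,
							  &pmu->global_status);
		}
	} else {
		/* Non-PEBS counter: flag its own overflow bit. */
		pmu->global_status |= UINT64_C(1) << pmc->idx;
	}

	return !skip_pmi;
}
```

In this model, a precise counter that overflows with `in_pmi == false` leaves `global_status` untouched and delivers nothing, which corresponds to the dropped PEBS record described in the commit message. The same counter overflowing with `in_pmi == true` sets BUFFER_OVF and delivers a PMI only the first time.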