
[v3] perf/amd: Implement erratum #1292 workaround for F19h M00-0Fh

Message ID 20220203095841.7937-1-ravi.bangoria@amd.com (mailing list archive)
State New, archived

Commit Message

Ravi Bangoria Feb. 3, 2022, 9:58 a.m. UTC
Perf counters may overcount for a list of Retire Based Events. Implement
workaround for Zen3 Family 19 Model 00-0F processors as suggested in
Revision Guide[1]:

  To count the non-FP affected PMC events correctly:
    o Use Core::X86::Msr::PERF_CTL2 to count the events, and
    o Program Core::X86::Msr::PERF_CTL2[43] to 1b, and
    o Program Core::X86::Msr::PERF_CTL2[20] to 0b.

Note that the specified workaround applies only to counting events and
not to sampling events. Thus sampling events will continue to function
as is.

Although the issue exists on all previous Zen revisions, the workaround
is different and thus not included in this patch.

This patch needs Like's patch[2] to work on KVM guests.

[1] https://bugzilla.kernel.org/attachment.cgi?id=298241
[2] https://lore.kernel.org/lkml/20220117055703.52020-1-likexu@tencent.com

Signed-off-by: Ravi Bangoria <ravi.bangoria@amd.com>
---
v2: https://lore.kernel.org/r/20220202105158.7072-1-ravi.bangoria@amd.com
v2->v3:
  - Use EVENT_CONSTRAINT_RANGE() for continuous event codes.

 arch/x86/events/amd/core.c | 31 +++++++++++++++++++++++++++++++
 1 file changed, 31 insertions(+)
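
The register programming quoted in the commit message can be sketched as a small userspace C helper. This is only an illustration, not code from the patch: the macro and function names are hypothetical, and the only facts taken from the message are the counter index (PERF_CTL2) and the two bit positions (43 and 20). In the kernel's x86 perf headers, bit 20 corresponds to ARCH_PERFMON_EVENTSEL_INT, the overflow interrupt enable, which would be consistent with the workaround being limited to counting events.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical names; only the bit positions come from the commit message. */
#define ERRATUM_1292_COUNTER	2U
#define ERRATUM_1292_SET_BIT	(1ULL << 43)
#define ERRATUM_1292_CLEAR_BIT	(1ULL << 20)

/*
 * Return the PERF_CTL value to program for a counting event.
 * The workaround is only described for PERF_CTL2, so any other
 * counter index is treated as a caller bug in this sketch.
 */
static uint64_t erratum_1292_fixup(unsigned int counter, uint64_t perf_ctl)
{
	assert(counter == ERRATUM_1292_COUNTER);

	perf_ctl |= ERRATUM_1292_SET_BIT;	/* PERF_CTL2[43] = 1b */
	perf_ctl &= ~ERRATUM_1292_CLEAR_BIT;	/* PERF_CTL2[20] = 0b */
	return perf_ctl;
}
```

For example, a value with event code 0xC0 and bits 20 and 22 set would come back with bit 20 cleared, bit 43 set, and everything else unchanged.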

Comments

Peter Zijlstra Feb. 9, 2022, 12:51 p.m. UTC | #1
On Thu, Feb 03, 2022 at 03:28:41PM +0530, Ravi Bangoria wrote:
> Perf counter may overcount for a list of Retire Based Events. Implement
> workaround for Zen3 Family 19 Model 00-0F processors as suggested in
> Revision Guide[1]:
> 
>   To count the non-FP affected PMC events correctly:
>     o Use Core::X86::Msr::PERF_CTL2 to count the events, and
>     o Program Core::X86::Msr::PERF_CTL2[43] to 1b, and
>     o Program Core::X86::Msr::PERF_CTL2[20] to 0b.
> 
> Note that the specified workaround applies only to counting events and
> not to sampling events. Thus sampling event will continue functioning
> as is.
> 
> Although the issue exists on all previous Zen revisions, the workaround
> is different and thus not included in this patch.
> 
> This patch needs Like's patch[2] to make it work on kvm guest.
> 
> [1] https://bugzilla.kernel.org/attachment.cgi?id=298241
> [2] https://lore.kernel.org/lkml/20220117055703.52020-1-likexu@tencent.com
> 
> Signed-off-by: Ravi Bangoria <ravi.bangoria@amd.com>

Thanks!

Ravi Bangoria Feb. 10, 2022, 4:05 a.m. UTC | #2
On 09-Feb-22 6:21 PM, Peter Zijlstra wrote:
> On Thu, Feb 03, 2022 at 03:28:41PM +0530, Ravi Bangoria wrote:
>> Perf counter may overcount for a list of Retire Based Events. Implement
>> workaround for Zen3 Family 19 Model 00-0F processors as suggested in
>> Revision Guide[1]:
>>
>>   To count the non-FP affected PMC events correctly:
>>     o Use Core::X86::Msr::PERF_CTL2 to count the events, and
>>     o Program Core::X86::Msr::PERF_CTL2[43] to 1b, and
>>     o Program Core::X86::Msr::PERF_CTL2[20] to 0b.
>>
>> Note that the specified workaround applies only to counting events and
>> not to sampling events. Thus sampling event will continue functioning
>> as is.
>>
>> Although the issue exists on all previous Zen revisions, the workaround
>> is different and thus not included in this patch.
>>
>> This patch needs Like's patch[2] to make it work on kvm guest.
>>
>> [1] https://bugzilla.kernel.org/attachment.cgi?id=298241
>> [2] https://lore.kernel.org/lkml/20220117055703.52020-1-likexu@tencent.com
>>
>> Signed-off-by: Ravi Bangoria <ravi.bangoria@amd.com>
> 
> Thanks!

Peter, On subsequent tests, I found that this 'fix' is still not
optimal. Please drop this patch from your queue for now. Really
sorry for the noise.

Thanks,
Ravi

Peter Zijlstra Feb. 10, 2022, 8:46 a.m. UTC | #3
On Thu, Feb 10, 2022 at 09:35:14AM +0530, Ravi Bangoria wrote:

> Peter, On subsequent tests, I found that this 'fix' is still not
> optimal. Please drop this patch from your queue for now. Really
> sorry for the noise.

Just in time, and done.

Patch

diff --git a/arch/x86/events/amd/core.c b/arch/x86/events/amd/core.c
index 9687a8aef01c..124ec15851bc 100644
--- a/arch/x86/events/amd/core.c
+++ b/arch/x86/events/amd/core.c
@@ -874,6 +874,17 @@  amd_get_event_constraints_f15h(struct cpu_hw_events *cpuc, int idx,
 	}
 }
 
+/* Overcounting of Retire Based Events Erratum */
+static struct event_constraint retire_event_constraints[] __read_mostly = {
+	EVENT_CONSTRAINT_RANGE(0xC0, 0xC5, 0x4, AMD64_EVENTSEL_EVENT),
+	EVENT_CONSTRAINT_RANGE(0xC8, 0xCA, 0x4, AMD64_EVENTSEL_EVENT),
+	EVENT_CONSTRAINT(0xCC, 0x4, AMD64_EVENTSEL_EVENT),
+	EVENT_CONSTRAINT(0xD1, 0x4, AMD64_EVENTSEL_EVENT),
+	EVENT_CONSTRAINT(0x1000000C7, 0x4, AMD64_EVENTSEL_EVENT),
+	EVENT_CONSTRAINT(0x1000000D0, 0x4, AMD64_EVENTSEL_EVENT),
+	EVENT_CONSTRAINT_END
+};
+
 static struct event_constraint pair_constraint;
 
 static struct event_constraint *
@@ -881,10 +892,30 @@  amd_get_event_constraints_f17h(struct cpu_hw_events *cpuc, int idx,
 			       struct perf_event *event)
 {
 	struct hw_perf_event *hwc = &event->hw;
+	struct event_constraint *c;
 
 	if (amd_is_pair_event_code(hwc))
 		return &pair_constraint;
 
+	/*
+	 * Although 'Overcounting of Retire Based Events' erratum exists
+	 * for older generation cpus, workaround to set bit 43 works only
+	 * for Family 19h Model 00-0Fh as per the Revision Guide.
+	 */
+	if (boot_cpu_data.x86 == 0x19 && boot_cpu_data.x86_model <= 0xf) {
+		if (is_sampling_event(event))
+			goto out;
+
+		for_each_event_constraint(c, retire_event_constraints) {
+			if (constraint_match(c, event->hw.config)) {
+				event->hw.config |= (1ULL << 43);
+				event->hw.config &= ~(1ULL << 20);
+				return c;
+			}
+		}
+	}
+
+out:
 	return &unconstrained;
 }