diff mbox series

[v6,06/10] perf metricgroup: Fix metrics using aliases covering multiple PMUs

Message ID 1607080216-36968-7-git-send-email-john.garry@huawei.com (mailing list archive)
State New, archived
Headers show
Series perf pmu-events: Support event aliasing for system PMUs | expand

Commit Message

John Garry Dec. 4, 2020, 11:10 a.m. UTC
Support for metric expressions using aliases which cover multiple PMUs is
broken. Consider the following test metric expression:

"MetricExpr": "UNC_CBO_XSNP_RESPONSE.MISS_XCORE * UNC_CBO_XSNP_RESPONSE.MISS_EVICTION"

When used on my broadwell, "perf stat" gives:

unc_cbo_xsnp_response.miss_eviction -> uncore_cbox_1/umask=0x81,event=0x22/
unc_cbo_xsnp_response.miss_eviction -> uncore_cbox_0/umask=0x81,event=0x22/
unc_cbo_xsnp_response.miss_xcore -> uncore_cbox_1/umask=0x41,event=0x22/
unc_cbo_xsnp_response.miss_xcore -> uncore_cbox_0/umask=0x41,event=0x22/
Control descriptor is not initialized
unc_cbo_xsnp_response.miss_eviction: 3645925 1000850523 1000850523
unc_cbo_xsnp_response.miss_xcore: 106850 1000850523 1000850523

 Performance counter stats for 'system wide':

         3,645,925      unc_cbo_xsnp_response.miss_eviction # 389567086250.00 test_metric_inc
           106,850      unc_cbo_xsnp_response.miss_xcore

       1.000883096 seconds time elapsed


Notice that only the results from one PMU are included. Fix the logic of
find_evsel_group() to enable events which apply to multiple PMUs, by
checking if the event pmu_name matches that of the metric event.

With that, "perf stat" now gives:

unc_cbo_xsnp_response.miss_eviction -> uncore_cbox_1/umask=0x81,event=0x22/
unc_cbo_xsnp_response.miss_eviction -> uncore_cbox_0/umask=0x81,event=0x22/
unc_cbo_xsnp_response.miss_xcore -> uncore_cbox_1/umask=0x41,event=0x22/
unc_cbo_xsnp_response.miss_xcore -> uncore_cbox_0/umask=0x41,event=0x22/
Control descriptor is not initialized
unc_cbo_xsnp_response.miss_eviction: 4237983 1000904100 1000904100
unc_cbo_xsnp_response.miss_xcore: 218643 1000904100 1000904100
unc_cbo_xsnp_response.miss_eviction: 4254148 1000902629 1000902629
unc_cbo_xsnp_response.miss_xcore: 213352 1000902629 1000902629

 Performance counter stats for 'system wide':

         4,237,983      unc_cbo_xsnp_response.miss_eviction # 3668558131345.00 test_metric_inc
           218,643      unc_cbo_xsnp_response.miss_xcore
         4,254,148      unc_cbo_xsnp_response.miss_eviction
           213,352      unc_cbo_xsnp_response.miss_xcore

       1.000938151 seconds time elapsed


Signed-off-by: John Garry <john.garry@huawei.com>
Acked-by: Kajol Jain <kjain@linux.ibm.com>
---
 tools/perf/util/metricgroup.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

Comments

Arnaldo Carvalho de Melo Dec. 7, 2020, 5:19 p.m. UTC | #1
Em Fri, Dec 04, 2020 at 07:10:12PM +0800, John Garry escreveu:
> Support for metric expressions using aliases which cover multiple PMUs is
> broken. Consider the following test metric expression:
> 
> "MetricExpr": "UNC_CBO_XSNP_RESPONSE.MISS_XCORE * UNC_CBO_XSNP_RESPONSE.MISS_EVICTION"
> 
> When used on my broadwell, "perf stat" gives:
> 
> unc_cbo_xsnp_response.miss_eviction -> uncore_cbox_1/umask=0x81,event=0x22/
> unc_cbo_xsnp_response.miss_eviction -> uncore_cbox_0/umask=0x81,event=0x22/
> unc_cbo_xsnp_response.miss_xcore -> uncore_cbox_1/umask=0x41,event=0x22/
> unc_cbo_xsnp_response.miss_xcore -> uncore_cbox_0/umask=0x41,event=0x22/
> Control descriptor is not initialized
> unc_cbo_xsnp_response.miss_eviction: 3645925 1000850523 1000850523
> unc_cbo_xsnp_response.miss_xcore: 106850 1000850523 1000850523
> 
>  Performance counter stats for 'system wide':
> 
>          3,645,925      unc_cbo_xsnp_response.miss_eviction # 389567086250.00 test_metric_inc
>            106,850      unc_cbo_xsnp_response.miss_xcore
> 
>        1.000883096 seconds time elapsed
> 
> 
> Notice that only the results from one PMU are included. Fix the logic of
> find_evsel_group() to enable events which apply to multiple PMUs, by
> checking if the event pmu_name matches that of the metric event.
> 
> With that, "perf stat" now gives:
> 
> unc_cbo_xsnp_response.miss_eviction -> uncore_cbox_1/umask=0x81,event=0x22/
> unc_cbo_xsnp_response.miss_eviction -> uncore_cbox_0/umask=0x81,event=0x22/
> unc_cbo_xsnp_response.miss_xcore -> uncore_cbox_1/umask=0x41,event=0x22/
> unc_cbo_xsnp_response.miss_xcore -> uncore_cbox_0/umask=0x41,event=0x22/
> Control descriptor is not initialized
> unc_cbo_xsnp_response.miss_eviction: 4237983 1000904100 1000904100
> unc_cbo_xsnp_response.miss_xcore: 218643 1000904100 1000904100
> unc_cbo_xsnp_response.miss_eviction: 4254148 1000902629 1000902629
> unc_cbo_xsnp_response.miss_xcore: 213352 1000902629 1000902629
> 
>  Performance counter stats for 'system wide':
> 
>          4,237,983      unc_cbo_xsnp_response.miss_eviction # 3668558131345.00 test_metric_inc
>            218,643      unc_cbo_xsnp_response.miss_xcore
>          4,254,148      unc_cbo_xsnp_response.miss_eviction
>            213,352      unc_cbo_xsnp_response.miss_xcore
> 
>        1.000938151 seconds time elapsed
> 

Next time please try to provides a Fixes: tag to help with
backporting/stable@kernel.org work.

- Arnaldo
 
> Signed-off-by: John Garry <john.garry@huawei.com>
> Acked-by: Kajol Jain <kjain@linux.ibm.com>
> ---
>  tools/perf/util/metricgroup.c | 4 +++-
>  1 file changed, 3 insertions(+), 1 deletion(-)
> 
> diff --git a/tools/perf/util/metricgroup.c b/tools/perf/util/metricgroup.c
> index 81d201c8b833..b89160718c04 100644
> --- a/tools/perf/util/metricgroup.c
> +++ b/tools/perf/util/metricgroup.c
> @@ -279,7 +279,9 @@ static struct evsel *find_evsel_group(struct evlist *perf_evlist,
>  			 * when then group is left.
>  			 */
>  			if (!has_constraint &&
> -			    ev->leader != metric_events[i]->leader)
> +			    ev->leader != metric_events[i]->leader &&
> +			    !strcmp(ev->leader->pmu_name,
> +				    metric_events[i]->leader->pmu_name))
>  				break;
>  			if (!strcmp(metric_events[i]->name, ev->name)) {
>  				set_bit(ev->idx, evlist_used);
> -- 
> 2.26.2
>
John Garry Dec. 7, 2020, 6:02 p.m. UTC | #2
On 07/12/2020 17:19, Arnaldo Carvalho de Melo wrote:
> Next time please try to provides a Fixes: tag to help with
> backporting/stable@kernel.org  work.
> 

Hi Arnaldo,

I know you asked me this before Re. fixes tags ... but I don't know any 
cases of "metrics using aliases covering multiple PMUs" in mainline 
today (which this patch addresses). If there are some, then I guess that 
they are broken and we should seek them out.

Note that this topic was discussed here initially:
https://lore.kernel.org/linux-perf-users/CAP-5=fUy6FOszNRwJF6ZNpqQSSyrnLPV6GbkEcZMqAhUp3X0ZA@mail.gmail.com/

There has been much churn on the metric code recently, so prob not a 
straight stable backport; as such, I would prefer to know some metric 
was broken and verify we fix it.

If you have any better ideas to handle this, then please let me know.

Cheers,
John
diff mbox series

Patch

diff --git a/tools/perf/util/metricgroup.c b/tools/perf/util/metricgroup.c
index 81d201c8b833..b89160718c04 100644
--- a/tools/perf/util/metricgroup.c
+++ b/tools/perf/util/metricgroup.c
@@ -279,7 +279,9 @@  static struct evsel *find_evsel_group(struct evlist *perf_evlist,
 			 * when then group is left.
 			 */
 			if (!has_constraint &&
-			    ev->leader != metric_events[i]->leader)
+			    ev->leader != metric_events[i]->leader &&
+			    !strcmp(ev->leader->pmu_name,
+				    metric_events[i]->leader->pmu_name))
 				break;
 			if (!strcmp(metric_events[i]->name, ev->name)) {
 				set_bit(ev->idx, evlist_used);