diff mbox series

[bpf-next] bpftool: add {i,d}tlb_misses support for bpftool profile

Message ID 20201119042332.3343146-1-yhs@fb.com (mailing list archive)
State Superseded
Delegated to: BPF
Headers show
Series [bpf-next] bpftool: add {i,d}tlb_misses support for bpftool profile | expand

Checks

Context Check Description
netdev/cover_letter success Link
netdev/fixes_present success Link
netdev/patch_count success Link
netdev/tree_selection success Clearly marked for bpf-next
netdev/subject_prefix success Link
netdev/source_inline success Was 0 now: 0
netdev/verify_signedoff success Link
netdev/module_param success Was 0 now: 0
netdev/build_32bit success Errors and warnings before: 0 this patch: 0
netdev/kdoc success Errors and warnings before: 0 this patch: 0
netdev/verify_fixes success Link
netdev/checkpatch success total: 0 errors, 0 warnings, 0 checks, 48 lines checked
netdev/build_allmodconfig_warn success Errors and warnings before: 0 this patch: 0
netdev/header_inline success Link
netdev/stable success Stable not CCed

Commit Message

Yonghong Song Nov. 19, 2020, 4:23 a.m. UTC
Commit 47c09d6a9f67("bpftool: Introduce "prog profile" command")
introduced "bpftool prog profile" command which can be used
to profile bpf program with metrics like # of instructions,

This patch added support for itlb_misses and dtlb_misses.
During an internal bpf program performance evaluation,
I found these two metrics are also very useful. The following
is an example output:

 $ bpftool prog profile id 324 duration 3 cycles itlb_misses

           1885029 run_cnt
        5134686073 cycles
            306893 itlb_misses
 $ bpftool prog profile id 324 duration 3 cycles dtlb_misses

           1827382 run_cnt
        4943593648 cycles
           5975636 dtlb_misses
 $ bpftool prog profile id 324 duration 3 cycles llc_misses

           1836527 run_cnt
        5019612972 cycles
           4161041 llc_misses

From the above, we can see quite some dtlb misses, 3 dtlb misses
perf prog run. This might be something worth further investigation.

Cc: Song Liu <songliubraving@fb.com>
Signed-off-by: Yonghong Song <yhs@fb.com>
---
 tools/bpf/bpftool/prog.c | 32 ++++++++++++++++++++++++++++++--
 1 file changed, 30 insertions(+), 2 deletions(-)

Comments

Yonghong Song Nov. 19, 2020, 7:18 a.m. UTC | #1
On 11/18/20 9:03 PM, Song Liu wrote:
> 
>> On Nov 18, 2020, at 8:23 PM, Yonghong Song <yhs@fb.com> wrote:
> 
> [...]
> 
>>
>> Cc: Song Liu <songliubraving@fb.com>
>> Signed-off-by: Yonghong Song <yhs@fb.com>
>> ---
>> tools/bpf/bpftool/prog.c | 32 ++++++++++++++++++++++++++++++--
>> 1 file changed, 30 insertions(+), 2 deletions(-)
>>
>> diff --git a/tools/bpf/bpftool/prog.c b/tools/bpf/bpftool/prog.c
>> index acdb2c245f0a..e33f27b950a5 100644
>> --- a/tools/bpf/bpftool/prog.c
>> +++ b/tools/bpf/bpftool/prog.c
>> @@ -1717,11 +1717,39 @@ struct profile_metric {
>> 		.ratio_desc = "LLC misses per million insns",
>> 		.ratio_mul = 1e6,
>> 	},
>> +	{
>> +		.name = "itlb_misses",
>> +		.attr = {
>> +			.type = PERF_TYPE_HW_CACHE,
>> +			.config =
>> +				PERF_COUNT_HW_CACHE_ITLB |
>> +				(PERF_COUNT_HW_CACHE_OP_READ << 8) |
>> +				(PERF_COUNT_HW_CACHE_RESULT_MISS << 16),
>> +			.exclude_user = 1
>> +		},
>> +		.ratio_metric = 2,
>> +		.ratio_desc = "itlb misses per million insns",
>> +		.ratio_mul = 1e6,
>> +	},
>> +	{
>> +		.name = "dtlb_misses",
>> +		.attr = {
>> +			.type = PERF_TYPE_HW_CACHE,
>> +			.config =
>> +				PERF_COUNT_HW_CACHE_DTLB |
>> +				(PERF_COUNT_HW_CACHE_OP_READ << 8) |
>> +				(PERF_COUNT_HW_CACHE_RESULT_MISS << 16),
>> +			.exclude_user = 1
>> +		},
>> +		.ratio_metric = 2,
>> +		.ratio_desc = "dtlb misses per million insns",
>> +		.ratio_mul = 1e6,
>> +	},
>> };
>>
>> static __u64 profile_total_count;
>>
>> -#define MAX_NUM_PROFILE_METRICS 4
>> +#define MAX_NUM_PROFILE_METRICS 6
> 
> This change is not necessary. This is the max number of enabled metrics.
> We don't stop the perf counters, each enabled metric adds some error to
> the final reading. Therefore, we don't want to enable too many metrics
> in the same run. If we don't have a use case for more metrics in one run,
> let's keep the cap of 4 metrics.
> 
> If we do want to increase this, we should also increase MAX_NUM_MATRICS
> in profiler.bpf.c.

Thanks for explanation. I missed that we also need to change 
MAX_NUM_MATRICS to really make it work. Ideally it would be best if we
only need to change in one header and that header is shared by both
bpf program and non-bpf program but probably is a overkill here just
for a single macro.

Also, your concern makes sense. We do not want to enable too many
metrics as this will cause counter sharing and may sacrifice accuracy.
I will leave current max profile metrics 4 then.

Will respin v2.

> 
> Other than this,
> 
> Acked-by: Song Liu <songliubraving@fb.com>
> 
> Thanks!
>
diff mbox series

Patch

diff --git a/tools/bpf/bpftool/prog.c b/tools/bpf/bpftool/prog.c
index acdb2c245f0a..e33f27b950a5 100644
--- a/tools/bpf/bpftool/prog.c
+++ b/tools/bpf/bpftool/prog.c
@@ -1717,11 +1717,39 @@  struct profile_metric {
 		.ratio_desc = "LLC misses per million insns",
 		.ratio_mul = 1e6,
 	},
+	{
+		.name = "itlb_misses",
+		.attr = {
+			.type = PERF_TYPE_HW_CACHE,
+			.config =
+				PERF_COUNT_HW_CACHE_ITLB |
+				(PERF_COUNT_HW_CACHE_OP_READ << 8) |
+				(PERF_COUNT_HW_CACHE_RESULT_MISS << 16),
+			.exclude_user = 1
+		},
+		.ratio_metric = 2,
+		.ratio_desc = "itlb misses per million insns",
+		.ratio_mul = 1e6,
+	},
+	{
+		.name = "dtlb_misses",
+		.attr = {
+			.type = PERF_TYPE_HW_CACHE,
+			.config =
+				PERF_COUNT_HW_CACHE_DTLB |
+				(PERF_COUNT_HW_CACHE_OP_READ << 8) |
+				(PERF_COUNT_HW_CACHE_RESULT_MISS << 16),
+			.exclude_user = 1
+		},
+		.ratio_metric = 2,
+		.ratio_desc = "dtlb misses per million insns",
+		.ratio_mul = 1e6,
+	},
 };
 
 static __u64 profile_total_count;
 
-#define MAX_NUM_PROFILE_METRICS 4
+#define MAX_NUM_PROFILE_METRICS 6
 
 static int profile_parse_metrics(int argc, char **argv)
 {
@@ -2109,7 +2137,7 @@  static int do_help(int argc, char **argv)
 		"                 struct_ops | fentry | fexit | freplace | sk_lookup }\n"
 		"       ATTACH_TYPE := { msg_verdict | stream_verdict | stream_parser |\n"
 		"                        flow_dissector }\n"
-		"       METRIC := { cycles | instructions | l1d_loads | llc_misses }\n"
+		"       METRIC := { cycles | instructions | l1d_loads | llc_misses | itlb_misses | dtlb_misses }\n"
 		"       " HELP_SPEC_OPTIONS "\n"
 		"",
 		bin_name, argv[-2]);