[RFC,6/7] core/metricfs: expose x86-specific irq information through metricfs

Message ID 20200807212916.2883031-7-jwadams@google.com (mailing list archive)
State New, archived
Series metricfs metric file system and examples

Commit Message

Jonathan Adams Aug. 7, 2020, 9:29 p.m. UTC
Add metricfs support for displaying percpu irq counters for x86.
The top directory is /sys/kernel/debug/metricfs/irq_x86.
Then there is a subdirectory for each x86-specific irq counter.
For example:

    cat /sys/kernel/debug/metricfs/irq_x86/TLB/values

Signed-off-by: Jonathan Adams <jwadams@google.com>

---

jwadams@google.com: rebased to 5.8-pre6
	This is work originally done by another engineer at
	google, who would rather not have their name associated with
	this patchset. They're okay with me sending it under my name.
---
 arch/x86/kernel/irq.c | 80 +++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 80 insertions(+)

Comments

Thomas Gleixner Aug. 13, 2020, 10:11 a.m. UTC | #1
Jonathan Adams <jwadams@google.com> writes:

How is that related to core? The x86 subsys prefix is 'x86' and for this
particular thing it's 'x86/irq:'. That applies to the rest of the series
as well. 

> Add metricfs support for displaying percpu irq counters for x86.
> The top directory is /sys/kernel/debug/metricfs/irq_x86.
> Then there is a subdirectory for each x86-specific irq counter.
> For example:
>
>    cat /sys/kernel/debug/metricfs/irq_x86/TLB/values

What is 'TLB'? I'm not aware of any vector which is named TLB.

The changelog is pretty useless in providing any form of rationale for
this. It tells the WHAT, but not the WHY.

Also what does this file contain? Aggregates, one line per CPU or the
value of the random CPU of the day? I'm not going to dive into the macro
zoo to figure that out.

> jwadams@google.com: rebased to 5.8-pre6
> 	This is work originally done by another engineer at
> 	google, who would rather not have their name associated with
> 	this patchset. They're okay with me sending it under my name.

I can understand why they won't have their name associated with this.

> +#ifdef CONFIG_METRICFS
> +METRICFS_ITEM(NMI, __nmi_count, "Non-maskable interrupts");
> +#ifdef CONFIG_X86_LOCAL_APIC
> +METRICFS_ITEM(LOC, apic_timer_irqs, "Local timer interrupts");
> +METRICFS_ITEM(SPU, irq_spurious_count, "Spurious interrupts");
> +METRICFS_ITEM(PMI, apic_perf_irqs, "Performance monitoring interrupts");
> +METRICFS_ITEM(IWI, apic_irq_work_irqs, "IRQ work interrupts");
> +METRICFS_ITEM(RTR, icr_read_retry_count, "APIC ICR read retries");
> +#endif
....

So you are adding NR_CPUS * NR_DIRECT_VECTORS debugfs files which show
exactly the same information as /proc/interrupts, right? 

Aside of that _all_ of this information is available via tracepoints as
well.

That's NR_CPUS * 15 and incomplete because x86 has 23 of those directly
handled vectors which do not go through the irq core. So with just 15
and 256 CPUs that's 3840 files.

Impressive number especially without any information why this is useful
and provides value over the existing mechanisms to retrieve exactly the
same information.

The cover letter talks a lot about how Google finds this useful, but
that's not really a convincing argument for this metric failsystem
addon.

Thanks,

        tglx
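
For readers comparing the two interfaces, the per-CPU counts discussed above can already be read from a /proc/interrupts row in userspace. The following is a minimal sketch, not part of the patch: the helper name parse_irq_row and the sample row are hypothetical, and it assumes the usual layout of a short label, a colon, one decimal count per online CPU, and a trailing description that does not start with a digit.

```c
#include <ctype.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical helper (not part of the patch): parse one row in the
 * /proc/interrupts layout, for example
 *   "TLB:       1234       5678   TLB shootdowns"
 * If the row's label matches `label`, store up to `max` per-CPU counts
 * in `out` and return how many were stored. Return -1 if the label
 * does not match.
 */
static int parse_irq_row(const char *line, const char *label,
			 unsigned long long *out, size_t max)
{
	size_t len = strlen(label);
	size_t n = 0;
	const char *p = line;

	/* Labels are right-aligned, so skip leading padding. */
	while (isspace((unsigned char)*p))
		p++;
	if (strncmp(p, label, len) != 0 || p[len] != ':')
		return -1;
	p += len + 1;

	while (n < max) {
		char *end;

		while (isspace((unsigned char)*p))
			p++;
		/* Assumption: the description starts with a non-digit. */
		if (!isdigit((unsigned char)*p))
			break;
		out[n++] = strtoull(p, &end, 10);
		p = end;
	}
	return (int)n;
}
```

A caller would feed this each line read from /proc/interrupts (for instance with fgets) and stop at the first row whose label matches; on a two-CPU machine the sample row above yields two counts.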
Paolo Bonzini Aug. 13, 2020, 11:47 a.m. UTC | #2
On 13/08/20 12:11, Thomas Gleixner wrote:
>> Add metricfs support for displaying percpu irq counters for x86.
>> The top directory is /sys/kernel/debug/metricfs/irq_x86.
>> Then there is a subdirectory for each x86-specific irq counter.
>> For example:
>>
>>    cat /sys/kernel/debug/metricfs/irq_x86/TLB/values
> What is 'TLB'? I'm not aware of any vector which is named TLB.

There's a "TLB" entry in /proc/interrupts.

Paolo
Thomas Gleixner Aug. 13, 2020, 12:13 p.m. UTC | #3
Paolo Bonzini <pbonzini@redhat.com> writes:

> On 13/08/20 12:11, Thomas Gleixner wrote:
>>> Add metricfs support for displaying percpu irq counters for x86.
>>> The top directory is /sys/kernel/debug/metricfs/irq_x86.
>>> Then there is a subdirectory for each x86-specific irq counter.
>>> For example:
>>>
>>>    cat /sys/kernel/debug/metricfs/irq_x86/TLB/values
>> What is 'TLB'? I'm not aware of any vector which is named TLB.
>
> There's a "TLB" entry in /proc/interrupts.

It's TLB shootdowns and not TLB.

Thanks,

        tglx
Paolo Bonzini Aug. 13, 2020, 2:10 p.m. UTC | #4
On 13/08/20 14:13, Thomas Gleixner wrote:
>>>> Add metricfs support for displaying percpu irq counters for x86.
>>>> The top directory is /sys/kernel/debug/metricfs/irq_x86.
>>>> Then there is a subdirectory for each x86-specific irq counter.
>>>> For example:
>>>>
>>>>    cat /sys/kernel/debug/metricfs/irq_x86/TLB/values
>>> What is 'TLB'? I'm not aware of any vector which is named TLB.
>> There's a "TLB" entry in /proc/interrupts.
> It's TLB shootdowns and not TLB.

Yes but it's using the shortcut name on the left of the table.

> +METRICFS_ITEM(LOC, apic_timer_irqs, "Local timer interrupts");
> +METRICFS_ITEM(SPU, irq_spurious_count, "Spurious interrupts");
> +METRICFS_ITEM(PMI, apic_perf_irqs, "Performance monitoring interrupts");
> +METRICFS_ITEM(IWI, apic_irq_work_irqs, "IRQ work interrupts");
> +METRICFS_ITEM(RTR, icr_read_retry_count, "APIC ICR read retries");
> +#endif
> +METRICFS_ITEM(PLT, x86_platform_ipis, "Platform interrupts");
> +#ifdef CONFIG_SMP
> +METRICFS_ITEM(RES, irq_resched_count, "Rescheduling interrupts");
> +METRICFS_ITEM(CAL, irq_call_count, "Function call interrupts");
> +METRICFS_ITEM(TLB, irq_tlb_count, "TLB shootdowns");

Paolo
Thomas Gleixner Aug. 13, 2020, 2:21 p.m. UTC | #5
Paolo Bonzini <pbonzini@redhat.com> writes:
> On 13/08/20 14:13, Thomas Gleixner wrote:
>>>>>    cat /sys/kernel/debug/metricfs/irq_x86/TLB/values
>>>> What is 'TLB'? I'm not aware of any vector which is named TLB.
>>> There's a "TLB" entry in /proc/interrupts.
>> It's TLB shootdowns and not TLB.
>
> Yes but it's using the shortcut name on the left of the table.

Fair enough, that's the first column in /proc/interrupts. I totally
missed the explanation in the elaborate changelog.

Thanks,

        tglx
Patch

diff --git a/arch/x86/kernel/irq.c b/arch/x86/kernel/irq.c
index 181060247e3c..ffacbbc4066c 100644
--- a/arch/x86/kernel/irq.c
+++ b/arch/x86/kernel/irq.c
@@ -12,6 +12,7 @@ 
 #include <linux/delay.h>
 #include <linux/export.h>
 #include <linux/irq.h>
+#include <linux/metricfs.h>
 
 #include <asm/irq_stack.h>
 #include <asm/apic.h>
@@ -374,3 +375,82 @@  void fixup_irqs(void)
 	}
 }
 #endif
+
+#ifdef CONFIG_METRICFS
+#define METRICFS_ITEM(name, field, desc) \
+static void \
+metricfs_##name(struct metric_emitter *e, int cpu) \
+{ \
+	int64_t v = irq_stats(cpu)->field; \
+	METRIC_EMIT_PERCPU_INT(e, cpu, v); \
+} \
+METRIC_EXPORT_PERCPU_COUNTER(name, desc, metricfs_##name)
+
+METRICFS_ITEM(NMI, __nmi_count, "Non-maskable interrupts");
+#ifdef CONFIG_X86_LOCAL_APIC
+METRICFS_ITEM(LOC, apic_timer_irqs, "Local timer interrupts");
+METRICFS_ITEM(SPU, irq_spurious_count, "Spurious interrupts");
+METRICFS_ITEM(PMI, apic_perf_irqs, "Performance monitoring interrupts");
+METRICFS_ITEM(IWI, apic_irq_work_irqs, "IRQ work interrupts");
+METRICFS_ITEM(RTR, icr_read_retry_count, "APIC ICR read retries");
+#endif
+METRICFS_ITEM(PLT, x86_platform_ipis, "Platform interrupts");
+#ifdef CONFIG_SMP
+METRICFS_ITEM(RES, irq_resched_count, "Rescheduling interrupts");
+METRICFS_ITEM(CAL, irq_call_count, "Function call interrupts");
+METRICFS_ITEM(TLB, irq_tlb_count, "TLB shootdowns");
+#endif
+#ifdef CONFIG_X86_THERMAL_VECTOR
+METRICFS_ITEM(TRM, irq_thermal_count, "Thermal event interrupts");
+#endif
+#ifdef CONFIG_X86_MCE_THRESHOLD
+METRICFS_ITEM(THR, irq_threshold_count, "Threshold APIC interrupts");
+#endif
+#ifdef CONFIG_X86_MCE_AMD
+METRICFS_ITEM(DFR, irq_deferred_error_count, "Deferred Error APIC interrupts");
+#endif
+#ifdef CONFIG_HAVE_KVM
+METRICFS_ITEM(PIN, kvm_posted_intr_ipis, "Posted-interrupt notification event");
+METRICFS_ITEM(PIW, kvm_posted_intr_wakeup_ipis,
+	"Posted-interrupt wakeup event");
+#endif
+
+static int __init init_irq_metricfs(void)
+{
+	struct metricfs_subsys *subsys;
+
+	subsys = metricfs_create_subsys("irq_x86", NULL);
+
+	metric_init_NMI(subsys);
+#ifdef CONFIG_X86_LOCAL_APIC
+	metric_init_LOC(subsys);
+	metric_init_SPU(subsys);
+	metric_init_PMI(subsys);
+	metric_init_IWI(subsys);
+	metric_init_RTR(subsys);
+#endif
+	metric_init_PLT(subsys);
+#ifdef CONFIG_SMP
+	metric_init_RES(subsys);
+	metric_init_CAL(subsys);
+	metric_init_TLB(subsys);
+#endif
+#ifdef CONFIG_X86_THERMAL_VECTOR
+	metric_init_TRM(subsys);
+#endif
+#ifdef CONFIG_X86_MCE_THRESHOLD
+	metric_init_THR(subsys);
+#endif
+#ifdef CONFIG_X86_MCE_AMD
+	metric_init_DFR(subsys);
+#endif
+#ifdef CONFIG_HAVE_KVM
+	metric_init_PIN(subsys);
+	metric_init_PIW(subsys);
+#endif
+
+	return 0;
+}
+module_init(init_irq_metricfs);
+
+#endif
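
The patch names every counter twice, once in a METRICFS_ITEM() definition and once in a metric_init_*() call, under matching #ifdef ladders. One common way to keep two such lists in sync is an X-macro. The sketch below is userspace C for illustration only: struct irq_stats_demo, IRQ_COUNTERS, demo_metrics, and demo_read are hypothetical stand-ins, and nothing here uses the real metricfs API, whose macros are defined elsewhere in the series. The field nmi_count stands in for the kernel's __nmi_count, since double-underscore identifiers are reserved in userspace.

```c
#include <stdint.h>
#include <string.h>

/* Stand-in for the per-CPU irq statistics fields used by the patch. */
struct irq_stats_demo {
	unsigned int nmi_count;
	unsigned int irq_resched_count;
	unsigned int irq_tlb_count;
};

/* Single list of counters: X(short name, struct field, description). */
#define IRQ_COUNTERS(X) \
	X(NMI, nmi_count, "Non-maskable interrupts") \
	X(RES, irq_resched_count, "Rescheduling interrupts") \
	X(TLB, irq_tlb_count, "TLB shootdowns")

/* Expansion 1: one accessor per counter, similar to METRICFS_ITEM. */
#define DEFINE_GETTER(name, field, desc) \
static int64_t demo_get_##name(const struct irq_stats_demo *s) \
{ \
	return s->field; \
}
IRQ_COUNTERS(DEFINE_GETTER)

struct demo_metric {
	const char *name;
	const char *desc;
	int64_t (*get)(const struct irq_stats_demo *s);
};

/* Expansion 2: a registration table generated from the same list. */
#define TABLE_ENTRY(name, field, desc) { #name, desc, demo_get_##name },
static const struct demo_metric demo_metrics[] = {
	IRQ_COUNTERS(TABLE_ENTRY)
};

#define DEMO_NR_METRICS (sizeof(demo_metrics) / sizeof(demo_metrics[0]))

/* Look up a counter by its short name; returns -1 if it is unknown. */
static int64_t demo_read(const struct irq_stats_demo *s, const char *name)
{
	size_t i;

	for (i = 0; i < DEMO_NR_METRICS; i++)
		if (strcmp(demo_metrics[i].name, name) == 0)
			return demo_metrics[i].get(s);
	return -1;
}
```

Config-dependent counters would still need conditionally defined sub-lists, because an #ifdef cannot appear inside a macro body, so this approach reduces the duplication without removing the conditionals.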