[RFC,0/2] Add vcpu debugfs to record statstical data for every single

Message ID 1670331508-67322-1-git-send-email-chenxiang66@hisilicon.com (mailing list archive)

Message

chenxiang Dec. 6, 2022, 12:58 p.m. UTC
From: Xiang Chen <chenxiang66@hisilicon.com>

Currently the KVM debugfs only records statistical data summed over all vcpus,
but we often want to know the statistical data for a single vcpu, and there is
no debugfs for that. So add vcpu debugfs to record statistical data for every
single vcpu, and also enable vcpu debugfs for arm64.

After the change, the vcpu debugfs entries are as follows (we have 4 vcpus in
the VM):

[root@centos kvm]# cd 2025-14/
[root@centos 2025-14]# ls
blocking                halt_wait_hist             vcpu0
exits                   halt_wait_ns               vcpu1
halt_attempted_poll     halt_wakeup                vcpu2
halt_poll_fail_hist     hvc_exit_stat              vcpu3
halt_poll_fail_ns       mmio_exit_kernel           vgic-state
halt_poll_invalid       mmio_exit_user             wfe_exit_stat
halt_poll_success_hist  remote_tlb_flush           wfi_exit_stat
halt_poll_success_ns    remote_tlb_flush_requests
halt_successful_poll    signal_exits
[root@centos 2025-14]# cat exits
124689
[root@centos 2025-14]# cat vcpu0/exits
52966
[root@centos 2025-14]# cat vcpu1/exits
21549
[root@centos 2025-14]# cat vcpu2/exits
43864
[root@centos 2025-14]# cat vcpu3/exits
6572
[root@centos 2025-14]# ls vcpu0
blocking             halt_poll_invalid       halt_wait_ns      pid
exits                halt_poll_success_hist  halt_wakeup       signal_exits
halt_attempted_poll  halt_poll_success_ns    hvc_exit_stat     wfe_exit_stat
halt_poll_fail_hist  halt_successful_poll    mmio_exit_kernel  wfi_exit_stat
halt_poll_fail_ns    halt_wait_hist          mmio_exit_user
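
For illustration, the per-vcpu files could be consumed as follows. This is a minimal sketch, not part of the series: the helper name read_vcpu_stat is hypothetical, and the demo runs on a throwaway directory that mimics the layout and the exits values shown above rather than on a real debugfs mount.

```python
import tempfile
from pathlib import Path


def read_vcpu_stat(vm_dir, stat):
    """Return {vcpu_name: value} for one stat file under each vcpuN/ directory."""
    result = {}
    for vcpu_dir in sorted(Path(vm_dir).glob("vcpu[0-9]*")):
        stat_file = vcpu_dir / stat
        if stat_file.is_file():
            result[vcpu_dir.name] = int(stat_file.read_text().strip())
    return result


# Demo: build a temporary tree with the same shape as the listing above.
# The values are the ones printed by "cat vcpuN/exits" in the transcript.
with tempfile.TemporaryDirectory() as tmp:
    vm_dir = Path(tmp) / "2025-14"
    sample = {"vcpu0": 52966, "vcpu1": 21549, "vcpu2": 43864, "vcpu3": 6572}
    for name, value in sample.items():
        (vm_dir / name).mkdir(parents=True)
        (vm_dir / name / "exits").write_text(f"{value}\n")
    per_vcpu = read_vcpu_stat(vm_dir, "exits")

total = sum(per_vcpu.values())
print(per_vcpu, total)
```

The per-vcpu values add up to 124951, slightly more than the 124689 read from the VM-level exits file, presumably because the counters kept running between the individual reads.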

Xiang Chen (2):
  KVM: debugfs: Add vcpu debugfs to record statstical data for every
    single vcpu
  KVM: arm64: Enable __KVM_HAVE_ARCH_VCPU_DEBUGFS

 arch/arm64/include/asm/kvm_host.h |  1 +
 arch/arm64/kvm/arm.c              |  4 +++
 include/linux/kvm_host.h          |  2 ++
 virt/kvm/kvm_main.c               | 62 +++++++++++++++++++++++++++++++++++++--
 4 files changed, 67 insertions(+), 2 deletions(-)

Comments

Marc Zyngier Dec. 7, 2022, 8:21 a.m. UTC | #1
On Tue, 06 Dec 2022 12:58:26 +0000,
chenxiang <chenxiang66@hisilicon.com> wrote:
> 
> From: Xiang Chen <chenxiang66@hisilicon.com>
> 
> Currently the KVM debugfs only records statistical data summed over all vcpus,
> but we often want to know the statistical data for a single vcpu, and there is
> no debugfs for that. So add vcpu debugfs to record statistical data for every
> single vcpu, and also enable vcpu debugfs for arm64.
>
> After the change, the vcpu debugfs entries are as follows (we have 4 vcpus in
> the VM):
> 
> [root@centos kvm]# cd 2025-14/
> [root@centos 2025-14]# ls
> blocking                halt_wait_hist             vcpu0
> exits                   halt_wait_ns               vcpu1
> halt_attempted_poll     halt_wakeup                vcpu2
> halt_poll_fail_hist     hvc_exit_stat              vcpu3
> halt_poll_fail_ns       mmio_exit_kernel           vgic-state
> halt_poll_invalid       mmio_exit_user             wfe_exit_stat
> halt_poll_success_hist  remote_tlb_flush           wfi_exit_stat
> halt_poll_success_ns    remote_tlb_flush_requests
> halt_successful_poll    signal_exits
> [root@centos 2025-14]# cat exits
> 124689
> [root@centos 2025-14]# cat vcpu0/exits
> 52966
> [root@centos 2025-14]# cat vcpu1/exits
> 21549
> [root@centos 2025-14]# cat vcpu2/exits
> 43864
> [root@centos 2025-14]# cat vcpu3/exits
> 6572
> [root@centos 2025-14]# ls vcpu0
> blocking             halt_poll_invalid       halt_wait_ns      pid
> exits                halt_poll_success_hist  halt_wakeup       signal_exits
> halt_attempted_poll  halt_poll_success_ns    hvc_exit_stat     wfe_exit_stat
> halt_poll_fail_hist  halt_successful_poll    mmio_exit_kernel  wfi_exit_stat
> halt_poll_fail_ns    halt_wait_hist          mmio_exit_user

This is yet another example of "KVM doesn't give me the stats I want,
so let's pile more stats on top". This affects every user (counters
are not free), and hardly benefits anyone.

How about you instead add trace hooks that allow you to plumb your
own counters using BPF or another kernel module? This is what this stuff
is for, and we really don't need to create more ABI around that. At
least, the other stat-hungry folks out there would also be able to get
their own stuff, and normal users wouldn't be affected by it.
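
For illustration only, here is how such counting can live entirely outside KVM. This is a minimal sketch, assuming the kvm_exit trace event is enabled and its output is read in the default ftrace text format. The function name is hypothetical, and the sample lines (thread names, thread IDs, timestamps) are made up, with the event payload elided. Assuming each vcpu runs in its own thread, counting events per thread ID gives per-vcpu numbers.

```python
import re
from collections import Counter

# Default ftrace text layout: "TASK-PID [CPU] FLAGS TIMESTAMP: EVENT: payload"
LINE_RE = re.compile(
    r"^\s*(?P<task>.+)-(?P<pid>\d+)\s+\[(?P<cpu>\d+)\]\s+\S+\s+[\d.]+:\s+(?P<event>\w+):"
)


def count_events_per_thread(lines, event="kvm_exit"):
    """Count occurrences of one trace event, keyed by the thread ID that hit it."""
    counts = Counter()
    for line in lines:
        match = LINE_RE.match(line)
        if match and match.group("event") == event:
            counts[int(match.group("pid"))] += 1
    return dict(counts)


# Hypothetical sample input; real trace output carries an event-specific payload.
sample_trace = [
    "       CPU 0/KVM-2031  [002] d..1.  100.000001: kvm_exit: ...",
    "       CPU 1/KVM-2032  [003] d..1.  100.000002: kvm_exit: ...",
    "       CPU 0/KVM-2031  [002] d..1.  100.000003: kvm_entry: ...",
    "       CPU 0/KVM-2031  [002] d..1.  100.000004: kvm_exit: ...",
]

per_thread = count_events_per_thread(sample_trace)
print(per_thread)
```

Only the person running the trace pays for the counting; nothing is added to the stats that every user carries.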

Thanks,

	M.
chenxiang Dec. 7, 2022, 10:16 a.m. UTC | #2
On 2022/12/7 16:21, Marc Zyngier wrote:
> This is yet another example of "KVM doesn't give me the stats I want,
> so let's pile more stats on top". This affects every user (counters
> are not free), and hardly benefits anyone.

KVM already has debugfs stats, but they only record statistical data
for the VM as a whole, which does not help much when debugging. For
example, the exits file records the number of VM exits summed over all
vcpus. We previously hit an issue where the thread of one vcpu went
wrong and stopped taking VM exits while the other vcpus were normal.
We could not get anything useful from the current debugfs, because the
total number of exits still kept increasing.
Compared with the current debugfs, I think it is more useful to know
the statistical data for every vcpu, and it is of more benefit.
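
To make the scenario concrete, here is a minimal sketch of how per-vcpu counters would expose such a vcpu. The helper name stalled_vcpus is hypothetical; the "before" snapshot reuses the exits values from the cover letter, and the "after" snapshot is invented for the example.

```python
def stalled_vcpus(before, after):
    """Return the vcpus whose counter did not advance between two snapshots."""
    return sorted(name for name, count in before.items() if after.get(name) == count)


# First snapshot: the per-vcpu exits values from the cover letter.
before = {"vcpu0": 52966, "vcpu1": 21549, "vcpu2": 43864, "vcpu3": 6572}
# Second snapshot: hypothetical values where vcpu3 has stopped exiting.
after = {"vcpu0": 53410, "vcpu1": 21802, "vcpu2": 44120, "vcpu3": 6572}

stalled = stalled_vcpus(before, after)
total_delta = sum(after.values()) - sum(before.values())
print(stalled, total_delta)
```

The VM-wide total still grows between the two snapshots, which is why the aggregate file alone cannot point at the stuck vcpu.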

>
> How about you instead add trace hooks that allow you to plumb your
> own counters using BPF or another kernel module? This is what this stuff
> is for, and we really don't need to create more ABI around that. At
> least, the other stat-hungry folks out there would also be able to get
> their own stuff, and normal users wouldn't be affected by it.
>
> Thanks,
>
> 	M.
>
Marc Zyngier Dec. 7, 2022, 11:09 a.m. UTC | #3
On Wed, 07 Dec 2022 10:16:17 +0000,
"chenxiang (M)" <chenxiang66@hisilicon.com> wrote:
> 
> 
> 
> On 2022/12/7 16:21, Marc Zyngier wrote:
> > This is yet another example of "KVM doesn't give me the stats I want,
> > so let's pile more stats on top". This affects every user (counters
> > are not free), and hardly benefits anyone.
> 
> KVM already has debugfs stats, but they only record statistical data
> for the VM as a whole, which does not help much when debugging. For
> example, the exits file records the number of VM exits summed over all
> vcpus. We previously hit an issue where the thread of one vcpu went
> wrong and stopped taking VM exits while the other vcpus were normal.
> We could not get anything useful from the current debugfs, because the
> total number of exits still kept increasing.
> Compared with the current debugfs, I think it is more useful to know
> the statistical data for every vcpu, and it is of more benefit.

What is useful for you is useless for somebody else. And vice versa.

At the end of the day, this is *debug* stuff. Why should a normal user
pay a price for *debug* counters?

> > How about you instead add trace hooks that allow you to plumb your
> > own counters using BPF or another kernel module? This is what this stuff
> > is for, and we really don't need to create more ABI around that. At
> > least, the other stat-hungry folks out there would also be able to get
> > their own stuff, and normal users wouldn't be affected by it.

The answer to your problems is ^^^HERE^^^.

	M.