
expose per-vm statistics via debugfs

Message ID 5e93dcec0908190553k5b7e55aak9987e94b5bdb1893@mail.gmail.com (mailing list archive)
State New, archived

Commit Message

Ryota Ozaki Aug. 19, 2009, 12:53 p.m. UTC
Hi,

This patch makes kvm expose per-vm statistics, such as
the total number of vm exits, via debugfs.

kvm already collects these per-vm statistics, but it has
no interface to expose them to users. This patch creates,
under kvm's debugfs directory, a directory named after the
pid of each vm, containing the same files that the existing
debugfs interface exposes.
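
For example, assuming debugfs is mounted at /sys/kernel/debug
and a vm whose process has pid 4242 (a made-up pid, just for
illustration), the files would look like:

  /sys/kernel/debug/kvm/exits       <- total over all vms, as before
  /sys/kernel/debug/kvm/4242/exits  <- exits of the vm with pid 4242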

The per-vm statistics are useful for observing the
activity of vms (and, for example, for identifying
anomalous vms) in more detail than cpu usage, memory
usage, and network traffic allow. The patch also adds
no performance overhead, so it is suitable for online
use, e.g., dynamically adapting the resources assigned
to a vm based on the statistics.

Note that this patch requires a trivial modification to
the kvm_stat script. Once this patch is accepted, I will
send a patch for that as well.
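
For reference, here is a minimal userspace sketch (not part of
this patch) that reads one per-vm counter, again assuming
debugfs at /sys/kernel/debug and a vm with pid 4242:

  #include <stdio.h>

  int main(void)
  {
          unsigned long long exits;
          FILE *f = fopen("/sys/kernel/debug/kvm/4242/exits", "r");

          if (f == NULL)
                  return 1;
          /* each stat file prints a single "%llu\n" value */
          if (fscanf(f, "%llu", &exits) == 1)
                  printf("exits: %llu\n", exits);
          fclose(f);
          return 0;
  }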

Thanks,
  ozaki-r

Signed-off-by: Ryota Ozaki <ozaki.ryota@gmail.com>
---

From da68a9c008e1159f5cf075a331038148edbb0967 Mon Sep 17 00:00:00 2001
From: Ryota Ozaki <ozaki.ryota@gmail.com>
Date: Wed, 19 Aug 2009 21:25:16 +0900
Subject: [PATCH] expose per-vm statistics via debugfs

kvm already collects these per-vm statistics, but it has
no interface to expose them to users. This patch creates,
under kvm's debugfs directory, a directory named after the
pid of each vm, containing the same files that the existing
debugfs interface exposes.

The per-vm statistics are useful for observing the
activity of vms (and, for example, for identifying
anomalous vms) in more detail than cpu usage, memory
usage, and network traffic allow. The patch also adds
no performance overhead, so it is suitable for online
use, e.g., dynamically adapting the resources assigned
to a vm based on the statistics.
---
 arch/ia64/kvm/kvm-ia64.c |    3 +
 arch/powerpc/kvm/booke.c |    3 +
 arch/s390/kvm/kvm-s390.c |    3 +
 arch/x86/kvm/x86.c       |    7 ++-
 include/linux/kvm_host.h |   12 +++++-
 virt/kvm/kvm_main.c      |  106 ++++++++++++++++++++++++++++++++++++++++-----
 6 files changed, 116 insertions(+), 18 deletions(-)

Comments

Avi Kivity Aug. 20, 2009, 12:09 p.m. UTC | #1
On 08/19/2009 03:53 PM, Ryota Ozaki wrote:
> This patch makes kvm expose per-vm statistics, such as
> the total number of vm exits, via debugfs.
> [...]

My plan is to completely remove the current statistics in favour of 
tracepoints.  You can already display tracepoint statistics with 'perf 
stat' (see tools/perf); tracepoints have the advantage that they can be 
completely disabled at runtime and thus have no performance impact.
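
(For illustration: a hypothetical invocation once that conversion
lands, assuming an event named kvm:kvm_exit, might be

  $ perf stat -e kvm:kvm_exit -p <qemu-pid> sleep 10

to count the exits of one guest over ten seconds.)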
Ryota Ozaki Aug. 20, 2009, 12:42 p.m. UTC | #2
On Thu, Aug 20, 2009 at 9:09 PM, Avi Kivity <avi@redhat.com> wrote:
> On 08/19/2009 03:53 PM, Ryota Ozaki wrote:
>> [...]
>
> My plan is to completely remove the current statistics in favour of
> tracepoints.  You can already display tracepoint statistics with 'perf stat'
> (see tools/perf); tracepoints have the advantage that they can be completely
> disabled at runtime and thus have no performance impact.

Oh, I didn't know about that. I'll try it later.

BTW, is that feature feasible for my purpose? I mean, is the
overhead low enough even when it is enabled?

  ozaki-r

Avi Kivity Aug. 20, 2009, 12:49 p.m. UTC | #3
On 08/20/2009 03:42 PM, Ryota Ozaki wrote:
> Oh, I didn't know about that. I'll try it later.
>
> BTW, is that feature feasible for my purpose? I mean, is the
> overhead low enough even when it is enabled?
>

I think so.

Patch

diff --git a/arch/ia64/kvm/kvm-ia64.c b/arch/ia64/kvm/kvm-ia64.c
index 0ad09f0..593921f 100644
--- a/arch/ia64/kvm/kvm-ia64.c
+++ b/arch/ia64/kvm/kvm-ia64.c
@@ -68,6 +68,9 @@  struct kvm_stats_debugfs_item debugfs_entries[] = {
 	{ NULL }
 };

+size_t debugfs_entries_size =
+	ARRAY_SIZE(debugfs_entries);
+
 static unsigned long kvm_get_itc(struct kvm_vcpu *vcpu)
 {
 #if defined(CONFIG_IA64_SGI_SN2) || defined(CONFIG_IA64_GENERIC)
diff --git a/arch/powerpc/kvm/booke.c b/arch/powerpc/kvm/booke.c
index e7bf4d0..fdf4c79 100644
--- a/arch/powerpc/kvm/booke.c
+++ b/arch/powerpc/kvm/booke.c
@@ -56,6 +56,9 @@  struct kvm_stats_debugfs_item debugfs_entries[] = {
 	{ NULL }
 };

+size_t debugfs_entries_size =
+	ARRAY_SIZE(debugfs_entries);
+
 /* TODO: use vcpu_printf() */
 void kvmppc_dump_vcpu(struct kvm_vcpu *vcpu)
 {
diff --git a/arch/s390/kvm/kvm-s390.c b/arch/s390/kvm/kvm-s390.c
index 07ced89..5747a9a 100644
--- a/arch/s390/kvm/kvm-s390.c
+++ b/arch/s390/kvm/kvm-s390.c
@@ -71,6 +71,9 @@  struct kvm_stats_debugfs_item debugfs_entries[] = {
 	{ NULL }
 };

+size_t debugfs_entries_size =
+	ARRAY_SIZE(debugfs_entries);
+
 static unsigned long long *facilities;

 /* Section: not file related */
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 850cf56..609b4c4 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -74,8 +74,8 @@  static u64 __read_mostly efer_reserved_bits = 0xfffffffffffffafeULL;
 static u64 __read_mostly efer_reserved_bits = 0xfffffffffffffffeULL;
 #endif

-#define VM_STAT(x) offsetof(struct kvm, stat.x), KVM_STAT_VM
-#define VCPU_STAT(x) offsetof(struct kvm_vcpu, stat.x), KVM_STAT_VCPU
+#define VM_STAT(x) offsetof(struct kvm, stat.x), KVM_STAT_VM, NULL
+#define VCPU_STAT(x) offsetof(struct kvm_vcpu, stat.x), KVM_STAT_VCPU, NULL

 static void update_cr8_intercept(struct kvm_vcpu *vcpu);
 static int kvm_dev_ioctl_get_supported_cpuid(struct kvm_cpuid2 *cpuid,
@@ -123,6 +123,9 @@  struct kvm_stats_debugfs_item debugfs_entries[] = {
 	{ NULL }
 };

+size_t debugfs_entries_size =
+	ARRAY_SIZE(debugfs_entries);
+
 unsigned long segment_base(u16 selector)
 {
 	struct descriptor_table gdt;
diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h
index f814512..89077b5 100644
--- a/include/linux/kvm_host.h
+++ b/include/linux/kvm_host.h
@@ -46,6 +46,11 @@  struct kvm;
 struct kvm_vcpu;
 extern struct kmem_cache *kvm_vcpu_cache;

+struct kvm_debugfs {
+	struct dentry *dentry;
+	struct kvm_stats_debugfs_item *debugfs_entries;
+};
+
 /*
  * It would be nice to use something smarter than a linear search, TBD...
  * Thankfully we dont expect many devices to register (famous last words :),
@@ -176,6 +181,8 @@  struct kvm {
 	unsigned long mmu_notifier_seq;
 	long mmu_notifier_count;
 #endif
+
+	struct kvm_debugfs debugfs;
 };

 /* The guest did something we don't support. */
@@ -490,10 +497,13 @@  struct kvm_stats_debugfs_item {
 	int offset;
 	enum kvm_stat_kind kind;
 	struct dentry *dentry;
+	struct kvm *owner_kvm;
 };
 extern struct kvm_stats_debugfs_item debugfs_entries[];
-extern struct dentry *kvm_debugfs_dir;
+extern struct dentry *kvm_debugfs_root;

+extern size_t debugfs_entries_size;
+
 #ifdef KVM_ARCH_WANT_MMU_NOTIFIER
 static inline int mmu_notifier_retry(struct kvm_vcpu *vcpu, unsigned long mmu_seq)
 {
diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
index 1df4c04..1c3a69c 100644
--- a/virt/kvm/kvm_main.c
+++ b/virt/kvm/kvm_main.c
@@ -43,6 +43,7 @@ 
 #include <linux/swap.h>
 #include <linux/bitops.h>
 #include <linux/spinlock.h>
+#include <linux/proc_fs.h>

 #include <asm/processor.h>
 #include <asm/io.h>
@@ -81,7 +82,9 @@  EXPORT_SYMBOL_GPL(kvm_vcpu_cache);

 static __read_mostly struct preempt_ops kvm_preempt_ops;

-struct dentry *kvm_debugfs_dir;
+struct dentry *kvm_debugfs_root;
+static void kvm_debugfs_init(struct kvm *kvm);
+static void kvm_debugfs_destroy(struct kvm *kvm);

 static long kvm_vcpu_ioctl(struct file *file, unsigned int ioctl,
 			   unsigned long arg);
@@ -982,6 +985,7 @@  static struct kvm *kvm_create_vm(void)
 	mutex_init(&kvm->lock);
 	mutex_init(&kvm->irq_lock);
 	kvm_io_bus_init(&kvm->mmio_bus);
+	kvm_debugfs_init(kvm);
 	init_rwsem(&kvm->slots_lock);
 	atomic_set(&kvm->users_count, 1);
 	spin_lock(&kvm_lock);
@@ -1038,6 +1042,7 @@  static void kvm_destroy_vm(struct kvm *kvm)
 	list_del(&kvm->vm_list);
 	spin_unlock(&kvm_lock);
 	kvm_free_irq_routing(kvm);
+	kvm_debugfs_destroy(kvm);
 	kvm_io_bus_destroy(&kvm->pio_bus);
 	kvm_io_bus_destroy(&kvm->mmio_bus);
 #ifdef KVM_COALESCED_MMIO_PAGE_OFFSET
@@ -2590,33 +2595,48 @@  static struct notifier_block kvm_cpu_notifier = {
 	.priority = 20, /* must be > scheduler priority */
 };

-static int vm_stat_get(void *_offset, u64 *val)
+static int vm_stat_get(void *_item, u64 *val)
 {
-	unsigned offset = (long)_offset;
-	struct kvm *kvm;
+	struct kvm_stats_debugfs_item *item = _item;
+	unsigned offset = item->offset;
+	struct kvm *kvm = item->owner_kvm;

 	*val = 0;
 	spin_lock(&kvm_lock);
-	list_for_each_entry(kvm, &vm_list, vm_list)
-		*val += *(u32 *)((void *)kvm + offset);
+	if (kvm) {
+		/* per-vm value */
+		*val = *(u32 *)((void *)kvm + offset);
+	} else {
+		/* total value */
+		list_for_each_entry(kvm, &vm_list, vm_list)
+			*val += *(u32 *)((void *)kvm + offset);
+	}
 	spin_unlock(&kvm_lock);
 	return 0;
 }

 DEFINE_SIMPLE_ATTRIBUTE(vm_stat_fops, vm_stat_get, NULL, "%llu\n");

-static int vcpu_stat_get(void *_offset, u64 *val)
+static int vcpu_stat_get(void *_item, u64 *val)
 {
-	unsigned offset = (long)_offset;
-	struct kvm *kvm;
+	struct kvm_stats_debugfs_item *item = _item;
+	unsigned offset = item->offset;
+	struct kvm *kvm = item->owner_kvm;
 	struct kvm_vcpu *vcpu;
 	int i;

 	*val = 0;
 	spin_lock(&kvm_lock);
-	list_for_each_entry(kvm, &vm_list, vm_list)
+	if (kvm) {
+		/* per-vm value */
 		kvm_for_each_vcpu(i, vcpu, kvm)
 			*val += *(u32 *)((void *)vcpu + offset);
+	} else {
+		/* total value */
+		list_for_each_entry(kvm, &vm_list, vm_list)
+			kvm_for_each_vcpu(i, vcpu, kvm)
+				*val += *(u32 *)((void *)vcpu + offset);
+	}

 	spin_unlock(&kvm_lock);
 	return 0;
@@ -2633,11 +2653,13 @@  static void kvm_init_debug(void)
 {
 	struct kvm_stats_debugfs_item *p;

-	kvm_debugfs_dir = debugfs_create_dir("kvm", NULL);
-	for (p = debugfs_entries; p->name; ++p)
-		p->dentry = debugfs_create_file(p->name, 0444, kvm_debugfs_dir,
-						(void *)(long)p->offset,
+	kvm_debugfs_root = debugfs_create_dir("kvm", NULL);
+	for (p = debugfs_entries; p->name; ++p) {
+		p->owner_kvm = NULL;
+		p->dentry = debugfs_create_file(p->name, 0444, kvm_debugfs_root,
+						(void *)p,
 						stat_fops[p->kind]);
+	}
 }

 static void kvm_exit_debug(void)
@@ -2646,7 +2668,61 @@  static void kvm_exit_debug(void)

 	for (p = debugfs_entries; p->name; ++p)
 		debugfs_remove(p->dentry);
-	debugfs_remove(kvm_debugfs_dir);
+	debugfs_remove(kvm_debugfs_root);
+}
+
+/* old linux/proc_fs.h doesn't include this */
+#ifndef PROC_NUMBUF
+#define PROC_NUMBUF 13
+#endif
+
+static void kvm_debugfs_init(struct kvm *kvm)
+{
+	struct kvm_stats_debugfs_item *p;
+	struct kvm_debugfs *kvm_debugfs = &kvm->debugfs;
+	struct dentry *dir;
+	char pid[PROC_NUMBUF];
+	size_t item_size = sizeof(struct kvm_stats_debugfs_item)
+		* debugfs_entries_size;
+
+	sprintf(pid, "%d", pid_nr(task_pid(current)));
+	dir = debugfs_create_dir(pid, kvm_debugfs_root);
+	if (dir == NULL) {
+		return;
+	}
+
+	p = kmalloc(item_size, GFP_KERNEL);
+	if (p == NULL) {
+		debugfs_remove(dir);
+		return;
+	}
+	memcpy(p, debugfs_entries, item_size);
+
+	kvm_debugfs->dentry = dir;
+	kvm_debugfs->debugfs_entries = p;
+	for (; p->name; ++p) {
+		p->owner_kvm = kvm;
+		p->dentry = debugfs_create_file(p->name, 0444, dir,
+						(void *)p,
+						stat_fops[p->kind]);
+	}
+	return;
+}
+
+static void kvm_debugfs_destroy(struct kvm *kvm)
+{
+	struct kvm_stats_debugfs_item *p;
+	struct kvm_debugfs *debugfs = &kvm->debugfs;
+
+	/* kvm_debugfs_init() may have failed and left these unset */
+	if (debugfs->debugfs_entries == NULL)
+		return;
+
+	p = debugfs->debugfs_entries;
+	for (; p->name; ++p)
+		debugfs_remove(p->dentry);
+	debugfs_remove(debugfs->dentry);
+	kfree(debugfs->debugfs_entries);
 }

 static int kvm_suspend(struct sys_device *dev, pm_message_t state)