
[v2] KVM: trace the events of mmu_notifier

Message ID 50499123.70609@linux.vnet.ibm.com (mailing list archive)
State New, archived

Commit Message

Xiao Guangrong Sept. 7, 2012, 6:16 a.m. UTC
mmu_notifier is the interface that broadcasts mm events to KVM. The
tracepoints introduced in this patch trace all of these events, which is
very helpful for noticing and fixing bugs caused by the mm side.

Signed-off-by: Xiao Guangrong <xiaoguangrong@linux.vnet.ibm.com>
---
 include/trace/events/kvm.h |  129 ++++++++++++++++++++++++++++++++++++++++++++
 virt/kvm/kvm_main.c        |   19 +++++++
 2 files changed, 148 insertions(+), 0 deletions(-)
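
For reference, the new events are consumed like any other KVM tracepoint.
The snippet below is only an illustrative sketch (not part of the patch),
assuming tracefs is mounted at /sys/kernel/debug/tracing, the usual
location for this kernel version; it enables the invalidate_page event and
dumps the formatted trace buffer.

/* Illustrative only -- assumes the tracefs mount point above and the
 * event names defined by this patch (TRACE_SYSTEM "kvm").
 */
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

int main(void)
{
	int fd = open("/sys/kernel/debug/tracing/events/kvm/"
		      "kvm_mmu_notifier_invalidate_page/enable", O_WRONLY);
	char line[256];
	FILE *trace;

	if (fd < 0) {
		perror("enable");
		return 1;
	}
	if (write(fd, "1", 1) != 1)
		perror("write");
	close(fd);

	/* Formatted events show up in the "trace" file. */
	trace = fopen("/sys/kernel/debug/tracing/trace", "r");
	if (!trace) {
		perror("trace");
		return 1;
	}
	while (fgets(line, sizeof(line), trace))
		fputs(line, stdout);
	fclose(trace);
	return 0;
}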

Comments

Avi Kivity Sept. 10, 2012, 9:09 a.m. UTC | #1
On 09/07/2012 09:16 AM, Xiao Guangrong wrote:
> mmu_notifier is the interface that broadcasts mm events to KVM. The
> tracepoints introduced in this patch trace all of these events, which is
> very helpful for noticing and fixing bugs caused by the mm side.

There is nothing kvm specific here.  Perhaps this can be made generic
(with a mm parameter so we can filter by process).
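
As an illustration of that suggestion, a generic notifier tracepoint
carrying the mm could look roughly like the sketch below. This is purely
hypothetical and not part of any posted patch; the event name and the
choice to record the raw mm pointer (so traces can be filtered per process
with an "mm == ..." event filter) are assumptions.

/* Hypothetical sketch of a generic (non-KVM) mmu_notifier tracepoint,
 * e.g. in an include/trace/events/mmu_notifier.h style header.
 */
#include <linux/tracepoint.h>

TRACE_EVENT(mmu_notifier_invalidate_page,

	TP_PROTO(struct mm_struct *mm, unsigned long address),

	TP_ARGS(mm, address),

	TP_STRUCT__entry(
		__field(struct mm_struct *, mm)
		__field(unsigned long, address)
	),

	TP_fast_assign(
		__entry->mm = mm;
		__entry->address = address;
	),

	TP_printk("mm %p address %lx", __entry->mm, __entry->address)
);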
Xiao Guangrong Sept. 10, 2012, 9:26 a.m. UTC | #2
On 09/10/2012 05:09 PM, Avi Kivity wrote:
> On 09/07/2012 09:16 AM, Xiao Guangrong wrote:
>> mmu_notifier is the interface that broadcasts mm events to KVM. The
>> tracepoints introduced in this patch trace all of these events, which is
>> very helpful for noticing and fixing bugs caused by the mm side.
> 
> There is nothing kvm specific here.  Perhaps this can be made generic
> (with a mm parameter so we can filter by process).

Hmm, I would like to put these tracepoints under the mmu_lock, so we can
clearly see the sequence between the mm and the KVM MMU. That is useful
for detecting issues/races between them.

Xiao Guangrong Sept. 14, 2012, 5:59 a.m. UTC | #3
On 09/10/2012 05:26 PM, Xiao Guangrong wrote:
> On 09/10/2012 05:09 PM, Avi Kivity wrote:
>> On 09/07/2012 09:16 AM, Xiao Guangrong wrote:
>>> mmu_notifier is the interface that broadcasts mm events to KVM. The
>>> tracepoints introduced in this patch trace all of these events, which is
>>> very helpful for noticing and fixing bugs caused by the mm side.
>>
>> There is nothing kvm specific here.  Perhaps this can be made generic
>> (with a mm parameter so we can filter by process).
> 
> Hmm, I would like to put these tracepoints under the mmu_lock, so we can
> clearly see the sequence between the mm and the KVM MMU. That is useful
> for detecting issues/races between them.
> 

Ping...?

Avi Kivity Sept. 16, 2012, 9:58 a.m. UTC | #4
On 09/14/2012 08:59 AM, Xiao Guangrong wrote:
> On 09/10/2012 05:26 PM, Xiao Guangrong wrote:
>> On 09/10/2012 05:09 PM, Avi Kivity wrote:
>>> On 09/07/2012 09:16 AM, Xiao Guangrong wrote:
>>>> mmu_notifier is the interface that broadcasts mm events to KVM. The
>>>> tracepoints introduced in this patch trace all of these events, which is
>>>> very helpful for noticing and fixing bugs caused by the mm side.
>>>
>>> There is nothing kvm specific here.  Perhaps this can be made generic
>>> (with a mm parameter so we can filter by process).
>> 
>> Hmm, I would like to put these tracepoints under the mmu_lock, so we can
>> clearly see the sequence between the mm and the KVM MMU. That is useful
>> for detecting issues/races between them.
>> 
> 
> Ping...?

Sorry.  Yes, you are right, knowing the exact sequence is valuable.  Yet
it will be hard to associate these events with the MMU, since we don't
have gpas there.
Xiao Guangrong Sept. 18, 2012, 8:46 a.m. UTC | #5
On 09/16/2012 05:58 PM, Avi Kivity wrote:
> On 09/14/2012 08:59 AM, Xiao Guangrong wrote:
>> On 09/10/2012 05:26 PM, Xiao Guangrong wrote:
>>> On 09/10/2012 05:09 PM, Avi Kivity wrote:
>>>> On 09/07/2012 09:16 AM, Xiao Guangrong wrote:
>>>>> mmu_notifier is the interface that broadcasts mm events to KVM. The
>>>>> tracepoints introduced in this patch trace all of these events, which is
>>>>> very helpful for noticing and fixing bugs caused by the mm side.
>>>>
>>>> There is nothing kvm specific here.  Perhaps this can be made generic
>>>> (with a mm parameter so we can filter by process).
>>>
>>> Hmm, I would like to put these tracepoints under the mmu_lock, so we can
>>> clearly see the sequence between the mm and the KVM MMU. That is useful
>>> for detecting issues/races between them.
>>>
>>
>> Ping...?
> 
> Sorry.  Yes, you are right, knowing the exact sequence is valuable.  Yet
> it will be hard to associate these events with the MMU, since we don't
> have gpas there.
> 

Avi,

I have some patches in my local queue that use tracepoints instead of
rmap_printk; they can track rmap_add/rmap_remove/rmap_write_protect.

Although we cannot get the gfn directly from these mmu_notifier events, it
can be obtained from those later rmap events, so we can see what the MMU is
doing when the notifier events are triggered.
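
Those rmap patches are not included in this posting. Purely as a sketch of
how the gfn could then be correlated with the hva-based events above, such
a tracepoint might look roughly like this (the name and fields are
assumptions, not Xiao's actual patches):

/* Hypothetical sketch only -- illustrates recording the gfn at
 * rmap_add time so it can be matched against the notifier events.
 */
TRACE_EVENT(kvm_mmu_rmap_add,

	TP_PROTO(u64 gfn, u64 spte),

	TP_ARGS(gfn, spte),

	TP_STRUCT__entry(
		__field(u64, gfn)
		__field(u64, spte)
	),

	TP_fast_assign(
		__entry->gfn = gfn;
		__entry->spte = spte;
	),

	TP_printk("gfn %llx spte %llx", __entry->gfn, __entry->spte)
);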





Patch

diff --git a/include/trace/events/kvm.h b/include/trace/events/kvm.h
index 7ef9e75..5d082b7 100644
--- a/include/trace/events/kvm.h
+++ b/include/trace/events/kvm.h
@@ -309,6 +309,135 @@  TRACE_EVENT(

 #endif

+#if defined(CONFIG_MMU_NOTIFIER) && defined(KVM_ARCH_WANT_MMU_NOTIFIER)
+TRACE_EVENT(kvm_mmu_notifier_invalidate_page,
+
+	TP_PROTO(unsigned long hva),
+
+	TP_ARGS(hva),
+
+	TP_STRUCT__entry(
+		__field(unsigned long, hva)
+	),
+
+	TP_fast_assign(
+		__entry->hva = hva;
+	),
+
+	TP_printk("hva %lx", __entry->hva)
+);
+
+DECLARE_EVENT_CLASS(mmu_notifier_young_class,
+
+	TP_PROTO(unsigned long hva, int young),
+
+	TP_ARGS(hva, young),
+
+	TP_STRUCT__entry(
+		__field(unsigned long, hva)
+		__field(int, young)
+	),
+
+	TP_fast_assign(
+		__entry->hva = hva;
+		__entry->young = young;
+	),
+
+	TP_printk("hva %lx young %x", __entry->hva, __entry->young)
+);
+
+DEFINE_EVENT(mmu_notifier_young_class, kvm_mmu_notifier_clear_flush_young,
+
+	TP_PROTO(unsigned long hva, int young),
+
+	TP_ARGS(hva, young)
+);
+
+DEFINE_EVENT(mmu_notifier_young_class, kvm_mmu_notifier_test_young,
+
+	TP_PROTO(unsigned long hva, int young),
+
+	TP_ARGS(hva, young)
+);
+
+DECLARE_EVENT_CLASS(mmu_notifier_range_class,
+
+	TP_PROTO(unsigned long start, unsigned long end),
+
+	TP_ARGS(start, end),
+
+	TP_STRUCT__entry(
+		__field(unsigned long, start)
+		__field(unsigned long, end)
+	),
+
+	TP_fast_assign(
+		__entry->start = start;
+		__entry->end = end;
+	),
+
+	TP_printk("start %lx end %lx", __entry->start, __entry->end)
+);
+
+DEFINE_EVENT(mmu_notifier_range_class, kvm_mmu_notifier_invalidate_range_start,
+
+	TP_PROTO(unsigned long start, unsigned long end),
+
+	TP_ARGS(start, end)
+);
+
+DEFINE_EVENT(mmu_notifier_range_class, kvm_mmu_notifier_invalidate_range_end,
+
+	TP_PROTO(unsigned long start, unsigned long end),
+
+	TP_ARGS(start, end)
+);
+
+#define pte_bit(func, bit)	\
+	(pte_##func(__pte(__entry->pteval)) ? bit : '-')
+
+TRACE_EVENT(kvm_mmu_notifier_change_pte,
+
+	TP_PROTO(unsigned long hva, pte_t pte),
+
+	TP_ARGS(hva, pte),
+
+	TP_STRUCT__entry(
+		__field(unsigned long, hva)
+		__field(unsigned long long, pteval)
+		__field(pfn_t, pfn)
+		__field(bool, writable)
+	),
+
+	TP_fast_assign(
+		__entry->hva = hva;
+		__entry->pteval = (long long)pte_val(pte);
+	),
+
+	TP_printk("hva %lx pte %llx pfn %lx bits %c%c%c%c", __entry->hva,
+		  __entry->pteval, pte_pfn(__pte(__entry->pteval)),
+		  pte_bit(present, 'p'), pte_bit(write, 'w'),
+		  pte_bit(dirty, 'd'), pte_bit(young, 'a'))
+);
+
+TRACE_EVENT(kvm_mmu_notifier_release,
+
+	TP_PROTO(struct kvm *kvm),
+
+	TP_ARGS(kvm),
+
+	TP_STRUCT__entry(
+		__field(struct kvm *, kvm)
+	),
+
+	TP_fast_assign(
+		__entry->kvm = kvm;
+	),
+
+	TP_printk("kvm %p", __entry->kvm)
+);
+#endif
+
 #endif /* _TRACE_KVM_MAIN_H */

 /* This part must be outside protection */
diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
index 0cbc809..9604f4c 100644
--- a/virt/kvm/kvm_main.c
+++ b/virt/kvm/kvm_main.c
@@ -287,6 +287,8 @@  static void kvm_mmu_notifier_invalidate_page(struct mmu_notifier *mn,
 	idx = srcu_read_lock(&kvm->srcu);
 	spin_lock(&kvm->mmu_lock);

+	trace_kvm_mmu_notifier_invalidate_page(address);
+
 	kvm->mmu_notifier_seq++;
 	need_tlb_flush = kvm_unmap_hva(kvm, address) | kvm->tlbs_dirty;
 	/* we've to flush the tlb before the pages can be freed */
@@ -307,6 +309,9 @@  static void kvm_mmu_notifier_change_pte(struct mmu_notifier *mn,

 	idx = srcu_read_lock(&kvm->srcu);
 	spin_lock(&kvm->mmu_lock);
+
+	trace_kvm_mmu_notifier_change_pte(address, pte);
+
 	kvm->mmu_notifier_seq++;
 	kvm_set_spte_hva(kvm, address, pte);
 	spin_unlock(&kvm->mmu_lock);
@@ -323,6 +328,9 @@  static void kvm_mmu_notifier_invalidate_range_start(struct mmu_notifier *mn,

 	idx = srcu_read_lock(&kvm->srcu);
 	spin_lock(&kvm->mmu_lock);
+
+	trace_kvm_mmu_notifier_invalidate_range_start(start, end);
+
 	/*
 	 * The count increase must become visible at unlock time as no
 	 * spte can be established without taking the mmu_lock and
@@ -347,6 +355,9 @@  static void kvm_mmu_notifier_invalidate_range_end(struct mmu_notifier *mn,
 	struct kvm *kvm = mmu_notifier_to_kvm(mn);

 	spin_lock(&kvm->mmu_lock);
+
+	trace_kvm_mmu_notifier_invalidate_range_end(start, end);
+
 	/*
 	 * This sequence increase will notify the kvm page fault that
 	 * the page that is going to be mapped in the spte could have
@@ -379,6 +390,8 @@  static int kvm_mmu_notifier_clear_flush_young(struct mmu_notifier *mn,
 	if (young)
 		kvm_flush_remote_tlbs(kvm);

+	trace_kvm_mmu_notifier_clear_flush_young(address, young);
+
 	spin_unlock(&kvm->mmu_lock);
 	srcu_read_unlock(&kvm->srcu, idx);

@@ -395,6 +408,9 @@  static int kvm_mmu_notifier_test_young(struct mmu_notifier *mn,
 	idx = srcu_read_lock(&kvm->srcu);
 	spin_lock(&kvm->mmu_lock);
 	young = kvm_test_age_hva(kvm, address);
+
+	trace_kvm_mmu_notifier_test_young(address, young);
+
 	spin_unlock(&kvm->mmu_lock);
 	srcu_read_unlock(&kvm->srcu, idx);

@@ -408,6 +424,9 @@  static void kvm_mmu_notifier_release(struct mmu_notifier *mn,
 	int idx;

 	idx = srcu_read_lock(&kvm->srcu);
+
+	trace_kvm_mmu_notifier_release(kvm);
+
 	kvm_arch_flush_shadow(kvm);
 	srcu_read_unlock(&kvm->srcu, idx);
 }