Message ID: 20171130180546.4331-2-rkrcmar@redhat.com (mailing list archive)
State: New, archived
On 30/11/2017 19:05, Radim Krčmář wrote:
> Does roughly what kvm_mmu_notifier_invalidate_page did before.
>
> I am not certain why this would be needed.  It might mean that we have
> another bug with start/end or just that I missed something.

I don't think this is needed, because we don't have shared page tables.
My understanding is that without shared page tables, you can assume that
all page modifications go through invalidate_range_start/end.  With
shared page tables, there are additional TLB flushes to take care of,
which require invalidate_range.

Thanks,

Paolo

> Please try just [1/2] first and apply this one only if [1/2] still bugs,
> thanks!
> ---
>  virt/kvm/kvm_main.c | 24 ++++++++++++++++++++++++
>  1 file changed, 24 insertions(+)
>
> diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
> index b7f4689e373f..0825ea624f16 100644
> --- a/virt/kvm/kvm_main.c
> +++ b/virt/kvm/kvm_main.c
> @@ -342,6 +342,29 @@ static void kvm_mmu_notifier_change_pte(struct mmu_notifier *mn,
>  	srcu_read_unlock(&kvm->srcu, idx);
>  }
>
> +static void kvm_mmu_notifier_invalidate_range(struct mmu_notifier *mn,
> +					      struct mm_struct *mm,
> +					      unsigned long start,
> +					      unsigned long end)
> +{
> +	struct kvm *kvm = mmu_notifier_to_kvm(mn);
> +	int need_tlb_flush = 0, idx;
> +
> +	idx = srcu_read_lock(&kvm->srcu);
> +	spin_lock(&kvm->mmu_lock);
> +	kvm->mmu_notifier_seq++;
> +	need_tlb_flush = kvm_unmap_hva_range(kvm, start, end);
> +	need_tlb_flush |= kvm->tlbs_dirty;
> +	if (need_tlb_flush)
> +		kvm_flush_remote_tlbs(kvm);
> +
> +	spin_unlock(&kvm->mmu_lock);
> +
> +	kvm_arch_mmu_notifier_invalidate_range(kvm, start, end);
> +
> +	srcu_read_unlock(&kvm->srcu, idx);
> +}
> +
>  static void kvm_mmu_notifier_invalidate_range_start(struct mmu_notifier *mn,
>  						    struct mm_struct *mm,
>  						    unsigned long start,
> @@ -476,6 +499,7 @@ static void kvm_mmu_notifier_release(struct mmu_notifier *mn,
>  }
>
>  static const struct mmu_notifier_ops kvm_mmu_notifier_ops = {
> +	.invalidate_range	= kvm_mmu_notifier_invalidate_range,
> 	.invalidate_range_start	= kvm_mmu_notifier_invalidate_range_start,
> 	.invalidate_range_end	= kvm_mmu_notifier_invalidate_range_end,
> 	.clear_flush_young	= kvm_mmu_notifier_clear_flush_young,
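[To make the shared vs. non-shared distinction concrete, here is a minimal
sketch of the two kinds of mmu_notifier users Paolo is contrasting.  The
handler names (shadow_* and shared_pt_*) are made up for illustration;
only the mmu_notifier_ops callback signatures follow the actual 4.14-era
API.]

#include <linux/mmu_notifier.h>

/*
 * Case 1: a secondary MMU with its own shadow page tables (KVM).
 * It can zap its shadow entries in invalidate_range_start() and hold
 * off refaults until invalidate_range_end(), so every primary page
 * table modification is fully handled by the start/end pair.
 */
static void shadow_invalidate_range_start(struct mmu_notifier *mn,
					  struct mm_struct *mm,
					  unsigned long start,
					  unsigned long end)
{
	/* zap shadow PTEs covering [start, end) and block refaults */
}

static void shadow_invalidate_range_end(struct mmu_notifier *mn,
					struct mm_struct *mm,
					unsigned long start,
					unsigned long end)
{
	/* allow refaults again; they will see the new primary PTEs */
}

static const struct mmu_notifier_ops shadow_mmu_ops = {
	.invalidate_range_start	= shadow_invalidate_range_start,
	.invalidate_range_end	= shadow_invalidate_range_end,
};

/*
 * Case 2: a secondary MMU that walks the *same* page tables as the
 * CPU (e.g. an IOMMU doing shared virtual memory).  It cannot be
 * blocked in start/end -- the primary PTEs are its mappings -- so it
 * needs the extra ->invalidate_range TLB flush, issued while the PTE
 * change is being made.
 */
static void shared_pt_invalidate_range(struct mmu_notifier *mn,
				       struct mm_struct *mm,
				       unsigned long start,
				       unsigned long end)
{
	/* flush the device TLB for [start, end) */
}

static const struct mmu_notifier_ops shared_pt_mmu_ops = {
	.invalidate_range	= shared_pt_invalidate_range,
};

[KVM is in the first category, which is why Paolo argues the
->invalidate_range hook added by this patch is unnecessary.]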
On Fri, Dec 01, 2017 at 04:15:37PM +0100, Paolo Bonzini wrote:
> On 30/11/2017 19:05, Radim Krčmář wrote:
> > Does roughly what kvm_mmu_notifier_invalidate_page did before.
> >
> > I am not certain why this would be needed.  It might mean that we have
> > another bug with start/end or just that I missed something.
>
> I don't think this is needed, because we don't have shared page tables.
> My understanding is that without shared page tables, you can assume that
> all page modifications go through invalidate_range_start/end.  With
> shared page tables, there are additional TLB flushes to take care of,
> which require invalidate_range.

Agreed, invalidate_range is only ever needed if the secondary MMU
(i.e. KVM) shares the same pagetables as the primary MMU in the host.
Only in that case do we need a special secondary MMU invalidate in the
TLB gather before the page is freed, because there is no way to block
the secondary MMU from walking the host pagetables in
invalidate_range_start.

In the KVM case the secondary MMU always goes through the shadow
pagetables, so all shadow pagetable invalidates can happen in
invalidate_range_start, and patch 2/2 is not needed here.

Note that the host kernel could always have decided to call
invalidate_range_start/end and never to call invalidate_page, even
before invalidate_page was removed.  So in practice the problem could
only be noticed after the removal of invalidate_page, but in more
theoretical terms 1/2 is actually fixing a longstanding bug.  The
removal of invalidate_page just made the lack of a
kvm_arch_mmu_notifier_invalidate_page call in invalidate_range_start
more apparent.

Thanks,
Andrea
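[Andrea's point that all shadow pagetable invalidates can happen in
invalidate_range_start rests on the mmu_notifier_seq handshake that the
kvm->mmu_notifier_seq++ in the notifier callbacks feeds.  Below is a
simplified sketch of that retry pattern, loosely following the arch
page fault handlers of that era; the function shape and return codes
are illustrative, not the exact KVM code.]

#include <linux/kvm_host.h>

static int shadow_page_fault(struct kvm *kvm, gfn_t gfn)
{
	unsigned long mmu_seq;
	kvm_pfn_t pfn;

	/* sample the notifier sequence before translating the gfn */
	mmu_seq = kvm->mmu_notifier_seq;
	smp_rmb();

	pfn = gfn_to_pfn(kvm, gfn);	/* may sleep, may race with invalidates */

	spin_lock(&kvm->mmu_lock);
	if (mmu_notifier_retry(kvm, mmu_seq)) {
		/*
		 * An invalidate_range_start() is in flight, or the seq
		 * moved, between the translation and taking mmu_lock:
		 * pfn may be stale, so drop it and retry the fault
		 * instead of installing it into the shadow page tables.
		 */
		spin_unlock(&kvm->mmu_lock);
		kvm_release_pfn_clean(pfn);
		return -EAGAIN;		/* illustrative retry code */
	}
	/* ... safe to install pfn into the shadow page tables ... */
	spin_unlock(&kvm->mmu_lock);
	return 0;
}

[invalidate_range_start() raises mmu_notifier_count and
invalidate_range_end() bumps the seq again, so any fault racing with an
invalidate is forced down the retry path; that is why no separate
->invalidate_range hook is required for shadow page tables.]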
diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
index b7f4689e373f..0825ea624f16 100644
--- a/virt/kvm/kvm_main.c
+++ b/virt/kvm/kvm_main.c
@@ -342,6 +342,29 @@ static void kvm_mmu_notifier_change_pte(struct mmu_notifier *mn,
 	srcu_read_unlock(&kvm->srcu, idx);
 }
 
+static void kvm_mmu_notifier_invalidate_range(struct mmu_notifier *mn,
+					      struct mm_struct *mm,
+					      unsigned long start,
+					      unsigned long end)
+{
+	struct kvm *kvm = mmu_notifier_to_kvm(mn);
+	int need_tlb_flush = 0, idx;
+
+	idx = srcu_read_lock(&kvm->srcu);
+	spin_lock(&kvm->mmu_lock);
+	kvm->mmu_notifier_seq++;
+	need_tlb_flush = kvm_unmap_hva_range(kvm, start, end);
+	need_tlb_flush |= kvm->tlbs_dirty;
+	if (need_tlb_flush)
+		kvm_flush_remote_tlbs(kvm);
+
+	spin_unlock(&kvm->mmu_lock);
+
+	kvm_arch_mmu_notifier_invalidate_range(kvm, start, end);
+
+	srcu_read_unlock(&kvm->srcu, idx);
+}
+
 static void kvm_mmu_notifier_invalidate_range_start(struct mmu_notifier *mn,
 						    struct mm_struct *mm,
 						    unsigned long start,
@@ -476,6 +499,7 @@ static void kvm_mmu_notifier_release(struct mmu_notifier *mn,
 }
 
 static const struct mmu_notifier_ops kvm_mmu_notifier_ops = {
+	.invalidate_range	= kvm_mmu_notifier_invalidate_range,
 	.invalidate_range_start	= kvm_mmu_notifier_invalidate_range_start,
 	.invalidate_range_end	= kvm_mmu_notifier_invalidate_range_end,
 	.clear_flush_young	= kvm_mmu_notifier_clear_flush_young,