diff mbox series

KVM: arm64: Remove redundant check for S2FWB

Message ID 20210205044403.1559010-1-jingzhangos@google.com (mailing list archive)
State New, archived
Headers show
Series KVM: arm64: Remove redundant check for S2FWB | expand

Commit Message

Jing Zhang Feb. 5, 2021, 4:44 a.m. UTC
Remove redundant check for CPU feature S2FWB in dcache flush code
to save some CPU cycles for every memslot flush and unmapping.
And move the S2FWB check to outer functions to avoid future
redundancy and keep consistent with other usage like in
access_dcsw and kvm_arch_prepare_memory_region.

Signed-off-by: Jing Zhang <jingzhangos@google.com>
---
 arch/arm64/kvm/hyp/pgtable.c | 9 ++-------
 arch/arm64/kvm/mmu.c         | 3 ++-
 2 files changed, 4 insertions(+), 8 deletions(-)


base-commit: a8ac864a7d6dbc2fc43081b1eecd9e0183065d47

Comments

Marc Zyngier Feb. 5, 2021, 5:24 p.m. UTC | #1
Hi Jing,

On 2021-02-05 04:44, Jing Zhang wrote:
> Remove redundant check for CPU feature S2FWB in dcache flush code
> to save some CPU cycles for every memslot flush and unmapping.

What CPU cycles? This is only a static branch. Can you actually
measure the overhead? What does it represent in the face of
a full memslot unmapping?

Thanks,

         M.
Jing Zhang Feb. 5, 2021, 6:56 p.m. UTC | #2
Hi Marc,

Thanks for the comment.
On Fri, Feb 5, 2021 at 11:24 AM Marc Zyngier <maz@kernel.org> wrote:
>
> Hi Jing,
>
> On 2021-02-05 04:44, Jing Zhang wrote:
> > Remove redundant check for CPU feature S2FWB in dcache flush code
> > to save some CPU cycles for every memslot flush and unmapping.
>
> What CPU cycles? This is only a static branch. Can you actually
> measure the overhead? What does it represent in the face of
> a full memslot unmapping?
For CPU cycles, I mean CPU time spent for S2FWB check.
For memslot unmapping, there is actually no improvement, just move the
check to the stage2_unmap_walker since we removed the S2FWB check in
stage2_flush_dcache.
The saving is from the code path of memslot flush. The S2FWB check was
in stage2_flush_dcache, in which case, for a memslot flush, the check
was done for every page. Now it will save some CPU time if we do the
check at a higher level, like in kvm_toggle_cache, access_dcsw,
kvm_arch_prepare_memory_region.
The redundant check is as follows (Only the first check is necessary):
kvm_arch_prepare_memory_region -> S2FWB check -> stage2_flush_memslot
-> kvm_pgtable_stage2_flush -> S2FWB check -> stage2_flush_walker ->
S2FWB check -> __flush_dcache_area

>
> Thanks,
>
>          M.
> --
> Jazz is not dead. It just smells funny...
Will Deacon March 8, 2021, 4:49 p.m. UTC | #3
On Fri, Feb 05, 2021 at 04:44:03AM +0000, Jing Zhang wrote:
> Remove redundant check for CPU feature S2FWB in dcache flush code
> to save some CPU cycles for every memslot flush and unmapping.
> And move the S2FWB check to outer functions to avoid future
> redundancy and keep consistent with other usage like in
> access_dcsw and kvm_arch_prepare_memory_region.
> 
> Signed-off-by: Jing Zhang <jingzhangos@google.com>
> ---
>  arch/arm64/kvm/hyp/pgtable.c | 9 ++-------
>  arch/arm64/kvm/mmu.c         | 3 ++-
>  2 files changed, 4 insertions(+), 8 deletions(-)
> 
> diff --git a/arch/arm64/kvm/hyp/pgtable.c b/arch/arm64/kvm/hyp/pgtable.c
> index bdf8e55ed308..afd57564b1cb 100644
> --- a/arch/arm64/kvm/hyp/pgtable.c
> +++ b/arch/arm64/kvm/hyp/pgtable.c
> @@ -642,9 +642,6 @@ int kvm_pgtable_stage2_map(struct kvm_pgtable *pgt, u64 addr, u64 size,
>  
>  static void stage2_flush_dcache(void *addr, u64 size)
>  {
> -	if (cpus_have_const_cap(ARM64_HAS_STAGE2_FWB))
> -		return;
> -
>  	__flush_dcache_area(addr, size);
>  }
>  
> @@ -670,7 +667,8 @@ static int stage2_unmap_walker(u64 addr, u64 end, u32 level, kvm_pte_t *ptep,
>  
>  		if (page_count(virt_to_page(childp)) != 1)
>  			return 0;
> -	} else if (stage2_pte_cacheable(pte)) {
> +	} else if (stage2_pte_cacheable(pte) &&
> +			!cpus_have_const_cap(ARM64_HAS_STAGE2_FWB)) {
>  		need_flush = true;
>  	}
>  
> @@ -846,9 +844,6 @@ int kvm_pgtable_stage2_flush(struct kvm_pgtable *pgt, u64 addr, u64 size)
>  		.flags	= KVM_PGTABLE_WALK_LEAF,
>  	};
>  
> -	if (cpus_have_const_cap(ARM64_HAS_STAGE2_FWB))
> -		return 0;
> -
>  	return kvm_pgtable_walk(pgt, addr, size, &walker);
>  }

I think we should leave pgtable.c as it is: there's no benefit from this
change on the unmap path, and the other path involves the case where the
caller has asked for a flush and we can elide it.

> diff --git a/arch/arm64/kvm/mmu.c b/arch/arm64/kvm/mmu.c
> index 7d2257cc5438..53130ed23304 100644
> --- a/arch/arm64/kvm/mmu.c
> +++ b/arch/arm64/kvm/mmu.c
> @@ -1458,7 +1458,8 @@ void kvm_toggle_cache(struct kvm_vcpu *vcpu, bool was_enabled)
>  	 * If switching it off, need to clean the caches.
>  	 * Clean + invalidate does the trick always.
>  	 */
> -	if (now_enabled != was_enabled)
> +	if (now_enabled != was_enabled &&
> +			!cpus_have_const_cap(ARM64_HAS_STAGE2_FWB))
>  		stage2_flush_vm(vcpu->kvm);

This change looks fine, but I don't grok the justification in your follow-up
email. You say:

  | The saving is from the code path of memslot flush. The S2FWB check was
  | in stage2_flush_dcache, in which case, for a memslot flush, the check
  | was done for every page.

but I don't see where this is called for every page. It looks to me like it's
called for every pgd in the range, which is a very different kettle of frogs.

What am I missing?

Will
Jing Zhang March 8, 2021, 6:43 p.m. UTC | #4
Hi Will,

On Mon, Mar 8, 2021 at 10:49 AM Will Deacon <will@kernel.org> wrote:
>
> On Fri, Feb 05, 2021 at 04:44:03AM +0000, Jing Zhang wrote:
> > Remove redundant check for CPU feature S2FWB in dcache flush code
> > to save some CPU cycles for every memslot flush and unmapping.
> > And move the S2FWB check to outer functions to avoid future
> > redundancy and keep consistent with other usage like in
> > access_dcsw and kvm_arch_prepare_memory_region.
> >
> > Signed-off-by: Jing Zhang <jingzhangos@google.com>
> > ---
> >  arch/arm64/kvm/hyp/pgtable.c | 9 ++-------
> >  arch/arm64/kvm/mmu.c         | 3 ++-
> >  2 files changed, 4 insertions(+), 8 deletions(-)
> >
> > diff --git a/arch/arm64/kvm/hyp/pgtable.c b/arch/arm64/kvm/hyp/pgtable.c
> > index bdf8e55ed308..afd57564b1cb 100644
> > --- a/arch/arm64/kvm/hyp/pgtable.c
> > +++ b/arch/arm64/kvm/hyp/pgtable.c
> > @@ -642,9 +642,6 @@ int kvm_pgtable_stage2_map(struct kvm_pgtable *pgt, u64 addr, u64 size,
> >
> >  static void stage2_flush_dcache(void *addr, u64 size)
> >  {
> > -     if (cpus_have_const_cap(ARM64_HAS_STAGE2_FWB))
> > -             return;
> > -
> >       __flush_dcache_area(addr, size);
> >  }
> >
> > @@ -670,7 +667,8 @@ static int stage2_unmap_walker(u64 addr, u64 end, u32 level, kvm_pte_t *ptep,
> >
> >               if (page_count(virt_to_page(childp)) != 1)
> >                       return 0;
> > -     } else if (stage2_pte_cacheable(pte)) {
> > +     } else if (stage2_pte_cacheable(pte) &&
> > +                     !cpus_have_const_cap(ARM64_HAS_STAGE2_FWB)) {
> >               need_flush = true;
> >       }
> >
> > @@ -846,9 +844,6 @@ int kvm_pgtable_stage2_flush(struct kvm_pgtable *pgt, u64 addr, u64 size)
> >               .flags  = KVM_PGTABLE_WALK_LEAF,
> >       };
> >
> > -     if (cpus_have_const_cap(ARM64_HAS_STAGE2_FWB))
> > -             return 0;
> > -
> >       return kvm_pgtable_walk(pgt, addr, size, &walker);
> >  }
>
> I think we should leave pgtable.c as it is: there's no benefit from this
> change on the unmap path, and the other path involves the case where the
> caller has asked for a flush and we can elide it.
Agreed.
>
> > diff --git a/arch/arm64/kvm/mmu.c b/arch/arm64/kvm/mmu.c
> > index 7d2257cc5438..53130ed23304 100644
> > --- a/arch/arm64/kvm/mmu.c
> > +++ b/arch/arm64/kvm/mmu.c
> > @@ -1458,7 +1458,8 @@ void kvm_toggle_cache(struct kvm_vcpu *vcpu, bool was_enabled)
> >        * If switching it off, need to clean the caches.
> >        * Clean + invalidate does the trick always.
> >        */
> > -     if (now_enabled != was_enabled)
> > +     if (now_enabled != was_enabled &&
> > +                     !cpus_have_const_cap(ARM64_HAS_STAGE2_FWB))
> >               stage2_flush_vm(vcpu->kvm);
>
> This change looks fine, but I don't grok the justification in your follow-up
> email. You say:
>
>   | The saving is from the code path of memslot flush. The S2FWB check was
>   | in stage2_flush_dcache, in which case, for a memslot flush, the check
>   | was done for every page.
>
> but I don't see where this is called for every page. It looks to me like it's
> called for every pgd in the range, which is a very different kettle of frogs.
>
> What am I missing?
You are right. It is called for every pgd in the range instead of for
every page.
>
> Will

Thanks,
Jing
diff mbox series

Patch

diff --git a/arch/arm64/kvm/hyp/pgtable.c b/arch/arm64/kvm/hyp/pgtable.c
index bdf8e55ed308..afd57564b1cb 100644
--- a/arch/arm64/kvm/hyp/pgtable.c
+++ b/arch/arm64/kvm/hyp/pgtable.c
@@ -642,9 +642,6 @@  int kvm_pgtable_stage2_map(struct kvm_pgtable *pgt, u64 addr, u64 size,
 
 static void stage2_flush_dcache(void *addr, u64 size)
 {
-	if (cpus_have_const_cap(ARM64_HAS_STAGE2_FWB))
-		return;
-
 	__flush_dcache_area(addr, size);
 }
 
@@ -670,7 +667,8 @@  static int stage2_unmap_walker(u64 addr, u64 end, u32 level, kvm_pte_t *ptep,
 
 		if (page_count(virt_to_page(childp)) != 1)
 			return 0;
-	} else if (stage2_pte_cacheable(pte)) {
+	} else if (stage2_pte_cacheable(pte) &&
+			!cpus_have_const_cap(ARM64_HAS_STAGE2_FWB)) {
 		need_flush = true;
 	}
 
@@ -846,9 +844,6 @@  int kvm_pgtable_stage2_flush(struct kvm_pgtable *pgt, u64 addr, u64 size)
 		.flags	= KVM_PGTABLE_WALK_LEAF,
 	};
 
-	if (cpus_have_const_cap(ARM64_HAS_STAGE2_FWB))
-		return 0;
-
 	return kvm_pgtable_walk(pgt, addr, size, &walker);
 }
 
diff --git a/arch/arm64/kvm/mmu.c b/arch/arm64/kvm/mmu.c
index 7d2257cc5438..53130ed23304 100644
--- a/arch/arm64/kvm/mmu.c
+++ b/arch/arm64/kvm/mmu.c
@@ -1458,7 +1458,8 @@  void kvm_toggle_cache(struct kvm_vcpu *vcpu, bool was_enabled)
 	 * If switching it off, need to clean the caches.
 	 * Clean + invalidate does the trick always.
 	 */
-	if (now_enabled != was_enabled)
+	if (now_enabled != was_enabled &&
+			!cpus_have_const_cap(ARM64_HAS_STAGE2_FWB))
 		stage2_flush_vm(vcpu->kvm);
 
 	/* Caches are now on, stop trapping VM ops (until a S/W op) */