diff mbox

[3/6] kvm: arm64: Convert kvm_set_s2pte_readonly() from inline asm to cmpxchg()

Message ID 20170725135308.18173-4-catalin.marinas@arm.com (mailing list archive)
State New, archived
Headers show

Commit Message

Catalin Marinas July 25, 2017, 1:53 p.m. UTC
To take advantage of the LSE atomic instructions and also make the code
cleaner, convert the kvm_set_s2pte_readonly() function to use the more
generic cmpxchg().

Cc: Marc Zyngier <marc.zyngier@arm.com>
Cc: Christoffer Dall <christoffer.dall@linaro.org>
Cc: Will Deacon <will.deacon@arm.com>
Acked-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
---
 arch/arm64/include/asm/kvm_mmu.h | 20 ++++++++------------
 1 file changed, 8 insertions(+), 12 deletions(-)

Comments

Christoffer Dall Aug. 1, 2017, 11:16 a.m. UTC | #1
Hi Catalin,

On Tue, Jul 25, 2017 at 02:53:05PM +0100, Catalin Marinas wrote:
> To take advantage of the LSE atomic instructions and also make the code
> cleaner, convert the kvm_set_s2pte_readonly() function to use the more
> generic cmpxchg().
> 
> Cc: Marc Zyngier <marc.zyngier@arm.com>
> Cc: Christoffer Dall <christoffer.dall@linaro.org>
> Cc: Will Deacon <will.deacon@arm.com>
> Acked-by: Mark Rutland <mark.rutland@arm.com>
> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
> ---
>  arch/arm64/include/asm/kvm_mmu.h | 20 ++++++++------------
>  1 file changed, 8 insertions(+), 12 deletions(-)
> 
> diff --git a/arch/arm64/include/asm/kvm_mmu.h b/arch/arm64/include/asm/kvm_mmu.h
> index a89cc22abadc..40b3ea690826 100644
> --- a/arch/arm64/include/asm/kvm_mmu.h
> +++ b/arch/arm64/include/asm/kvm_mmu.h
> @@ -175,18 +175,14 @@ static inline pmd_t kvm_s2pmd_mkwrite(pmd_t pmd)
>  
>  static inline void kvm_set_s2pte_readonly(pte_t *pte)
>  {
> -	pteval_t pteval;
> -	unsigned long tmp;
> -
> -	asm volatile("//	kvm_set_s2pte_readonly\n"
> -	"	prfm	pstl1strm, %2\n"
> -	"1:	ldxr	%0, %2\n"
> -	"	and	%0, %0, %3		// clear PTE_S2_RDWR\n"
> -	"	orr	%0, %0, %4		// set PTE_S2_RDONLY\n"
> -	"	stxr	%w1, %0, %2\n"
> -	"	cbnz	%w1, 1b\n"
> -	: "=&r" (pteval), "=&r" (tmp), "+Q" (pte_val(*pte))
> -	: "L" (~PTE_S2_RDWR), "L" (PTE_S2_RDONLY));
> +	pteval_t old_pteval, pteval;
> +
> +	do {
> +		pteval = old_pteval = READ_ONCE(pte_val(*pte));
> +		pteval &= ~PTE_S2_RDWR;
> +		pteval |= PTE_S2_RDONLY;
> +	} while (cmpxchg_relaxed(&pte_val(*pte), old_pteval, pteval) !=
> +		 old_pteval);

I'm wondering if the READ_ONCE for old_pteval is strictly necessary, or
if that's really for the pteval.  Actually, I'm a little unsure whether
this is equivalent to

	old_pteval = READ_ONCE(pte_val(*pte));
	pteval = old_pteval;

or

	old_pteval = READ_ONCE(pte_val(*pte));
	pteval = READ_ONCE(pte_val(*pte));

I think it's the former, which I also think is correct, but the reason
I'm going down this road is that we have a use of cmpxchg() in the VGIC
code, which doesn't use READ_ONCE for the old value (~
vgic-mmio-v3.c:404), and I also found other occurences of this in the
kernel, so I'm wondering if the VGIC code is broken or we're being
overly careful here, or if this is necessary because hardware can update
the value behind our backs in this case?

In any case, for this patch:

Reviewed-by: Christoffer Dall <cdall@linaro.org>



>  }
>  
>  static inline bool kvm_s2pte_readonly(pte_t *pte)
Catalin Marinas Aug. 2, 2017, 9:15 a.m. UTC | #2
Hi Christoffer,

On Tue, Aug 01, 2017 at 01:16:18PM +0200, Christoffer Dall wrote:
> On Tue, Jul 25, 2017 at 02:53:05PM +0100, Catalin Marinas wrote:
> > +	pteval_t old_pteval, pteval;
> > +
> > +	do {
> > +		pteval = old_pteval = READ_ONCE(pte_val(*pte));
> > +		pteval &= ~PTE_S2_RDWR;
> > +		pteval |= PTE_S2_RDONLY;
> > +	} while (cmpxchg_relaxed(&pte_val(*pte), old_pteval, pteval) !=
> > +		 old_pteval);
> 
> I'm wondering if the READ_ONCE for old_pteval is strictly necessary, or
> if that's really for the pteval.  Actually, I'm a little unsure whether
> this is equivalent to
> 
> 	old_pteval = READ_ONCE(pte_val(*pte));
> 	pteval = old_pteval;
> 
> or
> 
> 	old_pteval = READ_ONCE(pte_val(*pte));
> 	pteval = READ_ONCE(pte_val(*pte));
> 
> I think it's the former, which I also think is correct,

I think so too.

> but the reason
> I'm going down this road is that we have a use of cmpxchg() in the VGIC
> code, which doesn't use READ_ONCE for the old value (~
> vgic-mmio-v3.c:404), and I also found other occurences of this in the
> kernel, so I'm wondering if the VGIC code is broken or we're being
> overly careful here, or if this is necessary because hardware can update
> the value behind our backs in this case?

We use it because the compiler may decide it's short on registers and
instead of saving old_pteval on the stack it reads it again from memory
just before cmpxchg, so we would miss any update to *pte done by the
hardware. In practice, I've never seen (on arm64) gcc generating two
loads to *pte without READ_ONCE but maybe I haven't tried hard enough.

We should probably fix the VGIC code as well as a precaution, just in
case the compiler tries to get smarter in the future.

> In any case, for this patch:
> 
> Reviewed-by: Christoffer Dall <cdall@linaro.org>

Thanks.
Christoffer Dall Aug. 2, 2017, 12:48 p.m. UTC | #3
On Wed, Aug 02, 2017 at 10:15:36AM +0100, Catalin Marinas wrote:
> Hi Christoffer,
> 
> On Tue, Aug 01, 2017 at 01:16:18PM +0200, Christoffer Dall wrote:
> > On Tue, Jul 25, 2017 at 02:53:05PM +0100, Catalin Marinas wrote:
> > > +	pteval_t old_pteval, pteval;
> > > +
> > > +	do {
> > > +		pteval = old_pteval = READ_ONCE(pte_val(*pte));
> > > +		pteval &= ~PTE_S2_RDWR;
> > > +		pteval |= PTE_S2_RDONLY;
> > > +	} while (cmpxchg_relaxed(&pte_val(*pte), old_pteval, pteval) !=
> > > +		 old_pteval);
> > 
> > I'm wondering if the READ_ONCE for old_pteval is strictly necessary, or
> > if that's really for the pteval.  Actually, I'm a little unsure whether
> > this is equivalent to
> > 
> > 	old_pteval = READ_ONCE(pte_val(*pte));
> > 	pteval = old_pteval;
> > 
> > or
> > 
> > 	old_pteval = READ_ONCE(pte_val(*pte));
> > 	pteval = READ_ONCE(pte_val(*pte));
> > 
> > I think it's the former, which I also think is correct,
> 
> I think so too.
> 
> > but the reason
> > I'm going down this road is that we have a use of cmpxchg() in the VGIC
> > code, which doesn't use READ_ONCE for the old value (~
> > vgic-mmio-v3.c:404), and I also found other occurences of this in the
> > kernel, so I'm wondering if the VGIC code is broken or we're being
> > overly careful here, or if this is necessary because hardware can update
> > the value behind our backs in this case?
> 
> We use it because the compiler may decide it's short on registers and
> instead of saving old_pteval on the stack it reads it again from memory
> just before cmpxchg, so we would miss any update to *pte done by the
> hardware. In practice, I've never seen (on arm64) gcc generating two
> loads to *pte without READ_ONCE but maybe I haven't tried hard enough.
> 
> We should probably fix the VGIC code as well as a precaution, just in
> case the compiler tries to get smarter in the future.
> 

Sounds like a plan, I'll cook up a patch.

Thanks,
-Christoffer
diff mbox

Patch

diff --git a/arch/arm64/include/asm/kvm_mmu.h b/arch/arm64/include/asm/kvm_mmu.h
index a89cc22abadc..40b3ea690826 100644
--- a/arch/arm64/include/asm/kvm_mmu.h
+++ b/arch/arm64/include/asm/kvm_mmu.h
@@ -175,18 +175,14 @@  static inline pmd_t kvm_s2pmd_mkwrite(pmd_t pmd)
 
 static inline void kvm_set_s2pte_readonly(pte_t *pte)
 {
-	pteval_t pteval;
-	unsigned long tmp;
-
-	asm volatile("//	kvm_set_s2pte_readonly\n"
-	"	prfm	pstl1strm, %2\n"
-	"1:	ldxr	%0, %2\n"
-	"	and	%0, %0, %3		// clear PTE_S2_RDWR\n"
-	"	orr	%0, %0, %4		// set PTE_S2_RDONLY\n"
-	"	stxr	%w1, %0, %2\n"
-	"	cbnz	%w1, 1b\n"
-	: "=&r" (pteval), "=&r" (tmp), "+Q" (pte_val(*pte))
-	: "L" (~PTE_S2_RDWR), "L" (PTE_S2_RDONLY));
+	pteval_t old_pteval, pteval;
+
+	do {
+		pteval = old_pteval = READ_ONCE(pte_val(*pte));
+		pteval &= ~PTE_S2_RDWR;
+		pteval |= PTE_S2_RDONLY;
+	} while (cmpxchg_relaxed(&pte_val(*pte), old_pteval, pteval) !=
+		 old_pteval);
 }
 
 static inline bool kvm_s2pte_readonly(pte_t *pte)