[v6,06/26] KVM: arm/arm64: Do not use kern_hyp_va() with kvm_vgic_global_state

Message ID 20180314165049.30105-7-marc.zyngier@arm.com (mailing list archive)
State New, archived
Headers show

Commit Message

Marc Zyngier March 14, 2018, 4:50 p.m. UTC
kvm_vgic_global_state is part of the read-only section, and is
usually accessed using a PC-relative address generation (adrp + add).

It is thus useless to use kern_hyp_va() on it, and actively problematic
if kern_hyp_va() becomes non-idempotent. On the other hand, there is
no way that the compiler is going to guarantee that such access is
always PC relative.

So let's bite the bullet and provide our own accessor.

Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
---
 arch/arm/include/asm/kvm_mmu.h   |  7 +++++++
 arch/arm64/include/asm/kvm_mmu.h | 20 ++++++++++++++++++++
 virt/kvm/arm/hyp/vgic-v2-sr.c    |  4 ++--
 3 files changed, 29 insertions(+), 2 deletions(-)

Comments

James Morse March 15, 2018, 7:16 p.m. UTC | #1
Hi Marc,

On 14/03/18 16:50, Marc Zyngier wrote:
> kvm_vgic_global_state is part of the read-only section, and is
> usually accessed using a PC-relative address generation (adrp + add).
> 
> It is thus useless to use kern_hyp_va() on it, and actively problematic
> if kern_hyp_va() becomes non-idempotent. On the other hand, there is
> no way that the compiler is going to guarantee that such access is
> always PC relative.
> 
> So let's bite the bullet and provide our own accessor.


> diff --git a/arch/arm/include/asm/kvm_mmu.h b/arch/arm/include/asm/kvm_mmu.h
> index de1b919404e4..a6808d2869f5 100644
> --- a/arch/arm/include/asm/kvm_mmu.h
> +++ b/arch/arm/include/asm/kvm_mmu.h
> @@ -28,6 +28,13 @@
>   */
>  #define kern_hyp_va(kva)	(kva)
>  
> +/* Resolving symbol addresses in a PC-relative way is easy... */

(So this is guaranteed on 32bit? I thought this was because 32bit uses the
kernel-VA's at HYP, so any way the compiler generates the address will work.)


> +#define hyp_symbol_addr(s)						\
> +	({								\
> +		typeof(s) *addr = &(s);					\
> +		addr;							\
> +	})
> +
>  /*
>   * KVM_MMU_CACHE_MIN_PAGES is the number of stage2 page table translation levels.
>   */

> diff --git a/arch/arm64/include/asm/kvm_mmu.h b/arch/arm64/include/asm/kvm_mmu.h
> index e3bc1d0a5e93..7120bf3f22c7 100644
> --- a/arch/arm64/include/asm/kvm_mmu.h
> +++ b/arch/arm64/include/asm/kvm_mmu.h
> @@ -110,6 +110,26 @@ static inline unsigned long __kern_hyp_va(unsigned long v)
>  
>  #define kern_hyp_va(v) 	((typeof(v))(__kern_hyp_va((unsigned long)(v))))
>  
> +/*
> + * Obtain the PC-relative address of a kernel symbol
> + * s: symbol
> + *
> + * The goal of this macro is to return a symbol's address based on a
> + * PC-relative computation, as opposed to loading the VA from a
> + * constant pool or something similar. This works well for HYP, as an
> + * absolute VA is guaranteed to be wrong. Only use this if trying to
> + * obtain the address of a symbol (i.e. not something you obtained by
> + * following a pointer).
> + */
> +#define hyp_symbol_addr(s)						\
> +	({								\
> +		typeof(s) *addr;					\
> +		asm volatile("adrp	%0, %1\n"			\
> +			     "add	%0, %0, :lo12:%1\n"		\
> +			     : "=r" (addr) : "S" (&s));			\
> +		addr;							\
> +	})

Why does this need to be volatile? I see gcc v4.9 generate this sequence twice
in __vgic_v2_save_state(). Can't it cache and reorder this sequence as it likes?

Either-way:
Reviewed-by: James Morse <james.morse@arm.com>


(I had a go at using 'Ush', to let the compiler schedule the adrp, but couldn't
get it to work.)

Thanks,

James
Marc Zyngier March 16, 2018, 9:31 a.m. UTC | #2
On 15/03/18 19:16, James Morse wrote:
> Hi Marc,
> 
> On 14/03/18 16:50, Marc Zyngier wrote:
>> kvm_vgic_global_state is part of the read-only section, and is
>> usually accessed using a PC-relative address generation (adrp + add).
>>
>> It is thus useless to use kern_hyp_va() on it, and actively problematic
>> if kern_hyp_va() becomes non-idempotent. On the other hand, there is
>> no way that the compiler is going to guarantee that such access is
>> always PC relative.
>>
>> So let's bite the bullet and provide our own accessor.
> 
> 
>> diff --git a/arch/arm/include/asm/kvm_mmu.h b/arch/arm/include/asm/kvm_mmu.h
>> index de1b919404e4..a6808d2869f5 100644
>> --- a/arch/arm/include/asm/kvm_mmu.h
>> +++ b/arch/arm/include/asm/kvm_mmu.h
>> @@ -28,6 +28,13 @@
>>   */
>>  #define kern_hyp_va(kva)	(kva)
>>  
>> +/* Resolving symbol addresses in a PC-relative way is easy... */
> 
> (So this is guaranteed on 32bit? I thought this was because 32bit uses the
> kernel-VA's at HYP, so any way the compiler generates the address will work.)

You're right, this comment is slightly idiotic. What I meant to convey
is that we don't need to provide PC-relative addresses on 32bit. I've
replaced it with:

/*
 * Contrary to arm64, there is no need to generate a PC-relative address
 */

> 
> 
>> +#define hyp_symbol_addr(s)						\
>> +	({								\
>> +		typeof(s) *addr = &(s);					\
>> +		addr;							\
>> +	})
>> +
>>  /*
>>   * KVM_MMU_CACHE_MIN_PAGES is the number of stage2 page table translation levels.
>>   */
> 
>> diff --git a/arch/arm64/include/asm/kvm_mmu.h b/arch/arm64/include/asm/kvm_mmu.h
>> index e3bc1d0a5e93..7120bf3f22c7 100644
>> --- a/arch/arm64/include/asm/kvm_mmu.h
>> +++ b/arch/arm64/include/asm/kvm_mmu.h
>> @@ -110,6 +110,26 @@ static inline unsigned long __kern_hyp_va(unsigned long v)
>>  
>>  #define kern_hyp_va(v) 	((typeof(v))(__kern_hyp_va((unsigned long)(v))))
>>  
>> +/*
>> + * Obtain the PC-relative address of a kernel symbol
>> + * s: symbol
>> + *
>> + * The goal of this macro is to return a symbol's address based on a
>> + * PC-relative computation, as opposed to loading the VA from a
>> + * constant pool or something similar. This works well for HYP, as an
>> + * absolute VA is guaranteed to be wrong. Only use this if trying to
>> + * obtain the address of a symbol (i.e. not something you obtained by
>> + * following a pointer).
>> + */
>> +#define hyp_symbol_addr(s)						\
>> +	({								\
>> +		typeof(s) *addr;					\
>> +		asm volatile("adrp	%0, %1\n"			\
>> +			     "add	%0, %0, :lo12:%1\n"		\
>> +			     : "=r" (addr) : "S" (&s));			\
>> +		addr;							\
>> +	})
> 
> Why does this need to be volatile? I see gcc v4.9 generate this sequence twice
> in __vgic_v2_save_state(). Can't it cache and reorder this sequence as it likes?

I think you're right. I tend to use "volatile" to prevent the compiler
from optimizing the asm away, but since we rely on the output of the
sequence, the compiler cannot remove it anyway.

> Either-way:
> Reviewed-by: James Morse <james.morse@arm.com>
> 
> 
> (I had a go at using 'Ush', to let the compiler schedule the adrp, but couldn't
> get it to work.)

I tried that as well at some point, but couldn't see how to use it. The
compiler was never happy with my use of the constraints, so I gave up
and did it my way...

	M.
Andrew Jones March 16, 2018, 11:35 a.m. UTC | #3
On Fri, Mar 16, 2018 at 09:31:57AM +0000, Marc Zyngier wrote:
> On 15/03/18 19:16, James Morse wrote:
> > 
> > (I had a go at using 'Ush', to let the compiler schedule the adrp, but couldn't
> > get it to work.)
> 
> I tried that as well at some point, but couldn't see how to use it. The
> compiler was never happy with my use of the constraints, so I gave up
> and did it my way...
>

What's 'Ush'? I tried to search for it, but came up short. I'm wondering
what things I can try (and fail) to use it on too :-)

Thanks,
drew
Ard Biesheuvel March 16, 2018, 11:38 a.m. UTC | #4
On 16 March 2018 at 11:35, Andrew Jones <drjones@redhat.com> wrote:
> On Fri, Mar 16, 2018 at 09:31:57AM +0000, Marc Zyngier wrote:
>> On 15/03/18 19:16, James Morse wrote:
>> >
>> > (I had a go at using 'Ush', to let the compiler schedule the adrp, but couldn't
>> > get it to work.)
>>
>> I tried that as well at some point, but couldn't see how to use it. The
>> compiler was never happy with my use of the constraints, so I gave up
>> and did it my way...
>>
>
> What's 'Ush'? I tried to search for it, but came up short. I'm wondering
> what things I can try (and fail) to use it on too :-)
>

https://gcc.gnu.org/onlinedocs/gccint/Machine-Constraints.html
Marc Zyngier March 16, 2018, 11:51 a.m. UTC | #5
On 16/03/18 11:38, Ard Biesheuvel wrote:
> On 16 March 2018 at 11:35, Andrew Jones <drjones@redhat.com> wrote:
>> On Fri, Mar 16, 2018 at 09:31:57AM +0000, Marc Zyngier wrote:
>>> On 15/03/18 19:16, James Morse wrote:
>>>>
>>>> (I had a go at using 'Ush', to let the compiler schedule the adrp, but couldn't
>>>> get it to work.)
>>>
>>> I tried that as well at some point, but couldn't see how to use it. The
>>> compiler was never happy with my use of the constraints, so I gave up
>>> and did it my way...
>>>
>>
>> What's 'Ush'? I tried to search for it, but came up short. I'm wondering
>> what things I can try (and fail) to use it on too :-)
>>
> 
> https://gcc.gnu.org/onlinedocs/gccint/Machine-Constraints.html

I was hoping that something like:

	asm("add %0, %1, :lo12:%2" : "=r" (addr) : "Ush" (&s), "S" (&s))

would do the right thing (generating an ADRP), but all I managed was to
get the compiler shouting at me (in a rather unhelpful manner).

I guess I could have looked into the GCC source code to find out how to
use this thing, but life is short.

	M.
Patch

diff --git a/arch/arm/include/asm/kvm_mmu.h b/arch/arm/include/asm/kvm_mmu.h
index de1b919404e4..a6808d2869f5 100644
--- a/arch/arm/include/asm/kvm_mmu.h
+++ b/arch/arm/include/asm/kvm_mmu.h
@@ -28,6 +28,13 @@ 
  */
 #define kern_hyp_va(kva)	(kva)
 
+/* Resolving symbol addresses in a PC-relative way is easy... */
+#define hyp_symbol_addr(s)						\
+	({								\
+		typeof(s) *addr = &(s);					\
+		addr;							\
+	})
+
 /*
  * KVM_MMU_CACHE_MIN_PAGES is the number of stage2 page table translation levels.
  */
diff --git a/arch/arm64/include/asm/kvm_mmu.h b/arch/arm64/include/asm/kvm_mmu.h
index e3bc1d0a5e93..7120bf3f22c7 100644
--- a/arch/arm64/include/asm/kvm_mmu.h
+++ b/arch/arm64/include/asm/kvm_mmu.h
@@ -110,6 +110,26 @@  static inline unsigned long __kern_hyp_va(unsigned long v)
 
 #define kern_hyp_va(v) 	((typeof(v))(__kern_hyp_va((unsigned long)(v))))
 
+/*
+ * Obtain the PC-relative address of a kernel symbol
+ * s: symbol
+ *
+ * The goal of this macro is to return a symbol's address based on a
+ * PC-relative computation, as opposed to loading the VA from a
+ * constant pool or something similar. This works well for HYP, as an
+ * absolute VA is guaranteed to be wrong. Only use this if trying to
+ * obtain the address of a symbol (i.e. not something you obtained by
+ * following a pointer).
+ */
+#define hyp_symbol_addr(s)						\
+	({								\
+		typeof(s) *addr;					\
+		asm volatile("adrp	%0, %1\n"			\
+			     "add	%0, %0, :lo12:%1\n"		\
+			     : "=r" (addr) : "S" (&s));			\
+		addr;							\
+	})
+
 /*
  * We currently only support a 40bit IPA.
  */
diff --git a/virt/kvm/arm/hyp/vgic-v2-sr.c b/virt/kvm/arm/hyp/vgic-v2-sr.c
index 4fe6e797e8b3..a6ca049d9651 100644
--- a/virt/kvm/arm/hyp/vgic-v2-sr.c
+++ b/virt/kvm/arm/hyp/vgic-v2-sr.c
@@ -26,7 +26,7 @@ 
 static void __hyp_text save_elrsr(struct kvm_vcpu *vcpu, void __iomem *base)
 {
 	struct vgic_v2_cpu_if *cpu_if = &vcpu->arch.vgic_cpu.vgic_v2;
-	int nr_lr = (kern_hyp_va(&kvm_vgic_global_state))->nr_lr;
+	int nr_lr = hyp_symbol_addr(kvm_vgic_global_state)->nr_lr;
 	u32 elrsr0, elrsr1;
 
 	elrsr0 = readl_relaxed(base + GICH_ELRSR0);
@@ -140,7 +140,7 @@  int __hyp_text __vgic_v2_perform_cpuif_access(struct kvm_vcpu *vcpu)
 		return -1;
 
 	rd = kvm_vcpu_dabt_get_rd(vcpu);
-	addr  = kern_hyp_va((kern_hyp_va(&kvm_vgic_global_state))->vcpu_base_va);
+	addr  = kern_hyp_va(hyp_symbol_addr(kvm_vgic_global_state)->vcpu_base_va);
 	addr += fault_ipa - vgic->vgic_cpu_base;
 
 	if (kvm_vcpu_dabt_iswrite(vcpu)) {