diff mbox series

[v2] KVM: arm64: Constrain the host to the maximum shared SVE VL with pKVM

Message ID 20240912-kvm-arm64-limit-guest-vl-v2-1-dd2c29cb2ac9@kernel.org (mailing list archive)
State New
Headers show
Series [v2] KVM: arm64: Constrain the host to the maximum shared SVE VL with pKVM | expand

Commit Message

Mark Brown Sept. 12, 2024, 11:39 a.m. UTC
When pKVM saves and restores the host floating point state on a SVE system
it programs the vector length in ZCR_EL2.LEN to be whatever the maximum VL
for the PE is but uses a buffer allocated with kvm_host_sve_max_vl, the
maximum VL shared by all PEs in the system. This means that if we run on a
system where the maximum VLs are not consistent we will overflow the buffer
on PEs which support larger VLs.

Since the host will not currently attempt to make use of non-shared VLs fix
this by explicitly setting the EL2 VL to be the maximum shared VL when we
save and restore. This will enforce the limit on host VL usage. Should we
wish to support asymmetric VLs this code will need to be updated along with
the required changes for the host, patches have previously been posted:

  https://lore.kernel.org/r/20240730-kvm-arm64-fix-pkvm-sve-vl-v6-0-cae8a2e0bd66@kernel.org

Fixes: b5b9955617bc ("KVM: arm64: Eagerly restore host fpsimd/sve state in pKVM")
Signed-off-by: Mark Brown <broonie@kernel.org>
---
Changes in v2:
- Update all places where we constrain the host VL, not just those where
  we save and restore host state.
- The value written to the register is 0 based, not 1 based.
- Link to v1: https://lore.kernel.org/r/20240910-kvm-arm64-limit-guest-vl-v1-1-54df40b95ffb@kernel.org
---
 arch/arm64/kvm/hyp/include/hyp/switch.h |  2 +-
 arch/arm64/kvm/hyp/nvhe/hyp-main.c      | 12 +++++++-----
 2 files changed, 8 insertions(+), 6 deletions(-)


---
base-commit: 7c626ce4bae1ac14f60076d00eafe71af30450ba
change-id: 20240910-kvm-arm64-limit-guest-vl-d5fba0c7cc7b

Best regards,

Comments

Fuad Tabba Sept. 12, 2024, 2:21 p.m. UTC | #1
Hi Mark,

On Thu, 12 Sept 2024 at 12:39, Mark Brown <broonie@kernel.org> wrote:
>
> When pKVM saves and restores the host floating point state on a SVE system
> it programs the vector length in ZCR_EL2.LEN to be whatever the maximum VL
> for the PE is but uses a buffer allocated with kvm_host_sve_max_vl, the
> maximum VL shared by all PEs in the system. This means that if we run on a
> system where the maximum VLs are not consistent we will overflow the buffer
> on PEs which support larger VLs.
>
> Since the host will not currently attempt to make use of non-shared VLs fix
> this by explicitly setting the EL2 VL to be the maximum shared VL when we
> save and restore. This will enforce the limit on host VL usage. Should we
> wish to support asymmetric VLs this code will need to be updated along with
> the required changes for the host, patches have previously been posted:
>
>   https://lore.kernel.org/r/20240730-kvm-arm64-fix-pkvm-sve-vl-v6-0-cae8a2e0bd66@kernel.org

Thanks for fixing this.

One part that you haven't changed is setting ZCR_EL2 during el2 setup:
arch/arm64/include/asm/el2_setup.h: .Linit_sve_ : lines 290/291

I guess at that point it's not straightforward to figure sve_max_vl.
Is there a window after el2 setup where we might actually get the VL
implied by ZCR_ELx_LEN_MASK, or would it always get set to
sve_vq_from_vl(kvm_host_sve_max_vl) - 1 ?

That said, this passes the sve-stress test now: tested on qemu using
nvhe, pKVM non-protected guest, pKVM protected guest (the latest with
patches not upstream yet).

Assuming that the current code in el2_setup.h is fine as it is:
Tested-by: Fuad Tabba <tabba@google.com>
Reviewed-by: Fuad Tabba <tabba@google.com>

Cheers,
/fuad



>
> Fixes: b5b9955617bc ("KVM: arm64: Eagerly restore host fpsimd/sve state in pKVM")
> Signed-off-by: Mark Brown <broonie@kernel.org>
> ---
> Changes in v2:
> - Update all places where we constrain the host VL, not just those where
>   we save and restore host state.
> - The value written to the register is 0 based, not 1 based.
> - Link to v1: https://lore.kernel.org/r/20240910-kvm-arm64-limit-guest-vl-v1-1-54df40b95ffb@kernel.org
> ---
>  arch/arm64/kvm/hyp/include/hyp/switch.h |  2 +-
>  arch/arm64/kvm/hyp/nvhe/hyp-main.c      | 12 +++++++-----
>  2 files changed, 8 insertions(+), 6 deletions(-)
>
> diff --git a/arch/arm64/kvm/hyp/include/hyp/switch.h b/arch/arm64/kvm/hyp/include/hyp/switch.h
> index f59ccfe11ab9..c2cfb4d6dc92 100644
> --- a/arch/arm64/kvm/hyp/include/hyp/switch.h
> +++ b/arch/arm64/kvm/hyp/include/hyp/switch.h
> @@ -339,7 +339,7 @@ static inline void __hyp_sve_save_host(void)
>         struct cpu_sve_state *sve_state = *host_data_ptr(sve_state);
>
>         sve_state->zcr_el1 = read_sysreg_el1(SYS_ZCR);
> -       write_sysreg_s(ZCR_ELx_LEN_MASK, SYS_ZCR_EL2);
> +       write_sysreg_s(sve_vq_from_vl(kvm_host_sve_max_vl) - 1, SYS_ZCR_EL2);
>         __sve_save_state(sve_state->sve_regs + sve_ffr_offset(kvm_host_sve_max_vl),
>                          &sve_state->fpsr,
>                          true);
> diff --git a/arch/arm64/kvm/hyp/nvhe/hyp-main.c b/arch/arm64/kvm/hyp/nvhe/hyp-main.c
> index f43d845f3c4e..dd1c6aa907a2 100644
> --- a/arch/arm64/kvm/hyp/nvhe/hyp-main.c
> +++ b/arch/arm64/kvm/hyp/nvhe/hyp-main.c
> @@ -33,7 +33,7 @@ static void __hyp_sve_save_guest(struct kvm_vcpu *vcpu)
>          */
>         sve_cond_update_zcr_vq(vcpu_sve_max_vq(vcpu) - 1, SYS_ZCR_EL2);
>         __sve_save_state(vcpu_sve_pffr(vcpu), &vcpu->arch.ctxt.fp_regs.fpsr, true);
> -       write_sysreg_s(ZCR_ELx_LEN_MASK, SYS_ZCR_EL2);
> +       write_sysreg_s(sve_vq_from_vl(kvm_host_sve_max_vl) - 1, SYS_ZCR_EL2);
>  }
>
>  static void __hyp_sve_restore_host(void)
> @@ -45,10 +45,11 @@ static void __hyp_sve_restore_host(void)
>          * the host. The layout of the data when saving the sve state depends
>          * on the VL, so use a consistent (i.e., the maximum) host VL.
>          *
> -        * Setting ZCR_EL2 to ZCR_ELx_LEN_MASK sets the effective length
> -        * supported by the system (or limited at EL3).
> +        * Note that this constrains the PE to the maximum shared VL
> +        * that was discovered, if we wish to use larger VLs this will
> +        * need to be revisited.
>          */
> -       write_sysreg_s(ZCR_ELx_LEN_MASK, SYS_ZCR_EL2);
> +       write_sysreg_s(sve_vq_from_vl(kvm_host_sve_max_vl) - 1, SYS_ZCR_EL2);
>         __sve_restore_state(sve_state->sve_regs + sve_ffr_offset(kvm_host_sve_max_vl),
>                             &sve_state->fpsr,
>                             true);
> @@ -479,7 +480,8 @@ void handle_trap(struct kvm_cpu_context *host_ctxt)
>         case ESR_ELx_EC_SVE:
>                 cpacr_clear_set(0, CPACR_ELx_ZEN);
>                 isb();
> -               sve_cond_update_zcr_vq(ZCR_ELx_LEN_MASK, SYS_ZCR_EL2);
> +               sve_cond_update_zcr_vq(sve_vq_from_vl(kvm_host_sve_max_vl) - 1,
> +                                      SYS_ZCR_EL2);
>                 break;
>         case ESR_ELx_EC_IABT_LOW:
>         case ESR_ELx_EC_DABT_LOW:
>
> ---
> base-commit: 7c626ce4bae1ac14f60076d00eafe71af30450ba
> change-id: 20240910-kvm-arm64-limit-guest-vl-d5fba0c7cc7b
>
> Best regards,
> --
> Mark Brown <broonie@kernel.org>
>
Mark Brown Sept. 12, 2024, 2:45 p.m. UTC | #2
On Thu, Sep 12, 2024 at 03:21:43PM +0100, Fuad Tabba wrote:

> One part that you haven't changed is setting ZCR_EL2 during el2 setup:
> arch/arm64/include/asm/el2_setup.h: .Linit_sve_ : lines 290/291

> I guess at that point it's not straightforward to figure sve_max_vl.
> Is there a window after el2 setup where we might actually get the VL
> implied by ZCR_ELx_LEN_MASK, or would it always get set to
> sve_vq_from_vl(kvm_host_sve_max_vl) - 1 ?

Yeah, at that point we have no idea what any other cores might look like
and there's just generally a bootstrapping issue - we need to see the
actual maximum VLs for the PEs to work out what the maximum VL for the
system is so we can tell KVM what to enforce.
Marc Zyngier Sept. 12, 2024, 2:56 p.m. UTC | #3
On Thu, 12 Sep 2024 15:45:27 +0100,
Mark Brown <broonie@kernel.org> wrote:
> 
> [1  <text/plain; us-ascii (7bit)>]
> On Thu, Sep 12, 2024 at 03:21:43PM +0100, Fuad Tabba wrote:
> 
> > One part that you haven't changed is setting ZCR_EL2 during el2 setup:
> > arch/arm64/include/asm/el2_setup.h: .Linit_sve_ : lines 290/291
> 
> > I guess at that point it's not straightforward to figure sve_max_vl.
> > Is there a window after el2 setup where we might actually get the VL
> > implied by ZCR_ELx_LEN_MASK, or would it always get set to
> > sve_vq_from_vl(kvm_host_sve_max_vl) - 1 ?
> 
> Yeah, at that point we have no idea what any other cores might look like
> and there's just generally a bootstrapping issue - we need to see the
> actual maximum VLs for the PEs to work out what the maximum VL for the
> system is so we can tell KVM what to enforce.

That's fine. By the time KVM initialises, all the "early" CPUs will be
on, and pKVM actively prevents any late onlining of CPUs.

The only gotcha would be if "something" snapshots the "large" VL in
between two CPU boots, but we really don't expect that sort of things.

	M.
diff mbox series

Patch

diff --git a/arch/arm64/kvm/hyp/include/hyp/switch.h b/arch/arm64/kvm/hyp/include/hyp/switch.h
index f59ccfe11ab9..c2cfb4d6dc92 100644
--- a/arch/arm64/kvm/hyp/include/hyp/switch.h
+++ b/arch/arm64/kvm/hyp/include/hyp/switch.h
@@ -339,7 +339,7 @@  static inline void __hyp_sve_save_host(void)
 	struct cpu_sve_state *sve_state = *host_data_ptr(sve_state);
 
 	sve_state->zcr_el1 = read_sysreg_el1(SYS_ZCR);
-	write_sysreg_s(ZCR_ELx_LEN_MASK, SYS_ZCR_EL2);
+	write_sysreg_s(sve_vq_from_vl(kvm_host_sve_max_vl) - 1, SYS_ZCR_EL2);
 	__sve_save_state(sve_state->sve_regs + sve_ffr_offset(kvm_host_sve_max_vl),
 			 &sve_state->fpsr,
 			 true);
diff --git a/arch/arm64/kvm/hyp/nvhe/hyp-main.c b/arch/arm64/kvm/hyp/nvhe/hyp-main.c
index f43d845f3c4e..dd1c6aa907a2 100644
--- a/arch/arm64/kvm/hyp/nvhe/hyp-main.c
+++ b/arch/arm64/kvm/hyp/nvhe/hyp-main.c
@@ -33,7 +33,7 @@  static void __hyp_sve_save_guest(struct kvm_vcpu *vcpu)
 	 */
 	sve_cond_update_zcr_vq(vcpu_sve_max_vq(vcpu) - 1, SYS_ZCR_EL2);
 	__sve_save_state(vcpu_sve_pffr(vcpu), &vcpu->arch.ctxt.fp_regs.fpsr, true);
-	write_sysreg_s(ZCR_ELx_LEN_MASK, SYS_ZCR_EL2);
+	write_sysreg_s(sve_vq_from_vl(kvm_host_sve_max_vl) - 1, SYS_ZCR_EL2);
 }
 
 static void __hyp_sve_restore_host(void)
@@ -45,10 +45,11 @@  static void __hyp_sve_restore_host(void)
 	 * the host. The layout of the data when saving the sve state depends
 	 * on the VL, so use a consistent (i.e., the maximum) host VL.
 	 *
-	 * Setting ZCR_EL2 to ZCR_ELx_LEN_MASK sets the effective length
-	 * supported by the system (or limited at EL3).
+	 * Note that this constrains the PE to the maximum shared VL
+	 * that was discovered, if we wish to use larger VLs this will
+	 * need to be revisited.
 	 */
-	write_sysreg_s(ZCR_ELx_LEN_MASK, SYS_ZCR_EL2);
+	write_sysreg_s(sve_vq_from_vl(kvm_host_sve_max_vl) - 1, SYS_ZCR_EL2);
 	__sve_restore_state(sve_state->sve_regs + sve_ffr_offset(kvm_host_sve_max_vl),
 			    &sve_state->fpsr,
 			    true);
@@ -479,7 +480,8 @@  void handle_trap(struct kvm_cpu_context *host_ctxt)
 	case ESR_ELx_EC_SVE:
 		cpacr_clear_set(0, CPACR_ELx_ZEN);
 		isb();
-		sve_cond_update_zcr_vq(ZCR_ELx_LEN_MASK, SYS_ZCR_EL2);
+		sve_cond_update_zcr_vq(sve_vq_from_vl(kvm_host_sve_max_vl) - 1,
+				       SYS_ZCR_EL2);
 		break;
 	case ESR_ELx_EC_IABT_LOW:
 	case ESR_ELx_EC_DABT_LOW: