Message ID | 20230131092504.2880505-2-maz@kernel.org (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | KVM: arm64: ARMv8.3/8.4 Nested Virtualization support | expand |
On Tue, Jan 31, 2023 at 09:23:56AM +0000, Marc Zyngier wrote: > diff --git a/Documentation/admin-guide/kernel-parameters.txt b/Documentation/admin-guide/kernel-parameters.txt > index 6cfa6e3996cf..b7b0704e360e 100644 > --- a/Documentation/admin-guide/kernel-parameters.txt > +++ b/Documentation/admin-guide/kernel-parameters.txt > @@ -2553,9 +2553,14 @@ > protected: nVHE-based mode with support for guests whose > state is kept private from the host. > > + nested: VHE-based mode with support for nested > + virtualization. Requires at least ARMv8.3 > + hardware. So we can't have protected + nested at the same time? ;) (I guess once you make the protected mode use VHE, this could be revisited) In the hope that this averts another post of the series: Acked-by: Catalin Marinas <catalin.marinas@arm.com>
On Tue, 31 Jan 2023 12:03:52 +0000, Catalin Marinas <catalin.marinas@arm.com> wrote: > > On Tue, Jan 31, 2023 at 09:23:56AM +0000, Marc Zyngier wrote: > > diff --git a/Documentation/admin-guide/kernel-parameters.txt b/Documentation/admin-guide/kernel-parameters.txt > > index 6cfa6e3996cf..b7b0704e360e 100644 > > --- a/Documentation/admin-guide/kernel-parameters.txt > > +++ b/Documentation/admin-guide/kernel-parameters.txt > > @@ -2553,9 +2553,14 @@ > > protected: nVHE-based mode with support for guests whose > > state is kept private from the host. > > > > + nested: VHE-based mode with support for nested > > + virtualization. Requires at least ARMv8.3 > > + hardware. > > So we can't have protected + nested at the same time? ;) (I guess once > you make the protected mode use VHE, this could be revisited) We could move the whole shadow S2 inside the protected hypervisor, but that's pretty complicated, as this mandates multiple S2 contexts per VMs. I'd really want to see a use case for it before I even try. On the other hand, debugging the protected hypervisor (or at least its VHE version) under NV is pretty fun. Makes the whole debug cycle incredibly short. > In the hope that this averts another post of the series: > > Acked-by: Catalin Marinas <catalin.marinas@arm.com> Thanks! But I'm afraid you'll definitely see more of this stuff. I can only hope it gets merge quicker than I add to it... M.
Hi Marc On 31/01/2023 09:23, Marc Zyngier wrote: > From: Jintack Lim <jintack.lim@linaro.org> > > Add a new ARM64_HAS_NESTED_VIRT feature to indicate that the > CPU has the ARMv8.3 nested virtualization capability, together > with the 'kvm-arm.mode=nested' command line option. > > This will be used to support nested virtualization in KVM. > > Reviewed-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> > Signed-off-by: Jintack Lim <jintack.lim@linaro.org> > Signed-off-by: Andre Przywara <andre.przywara@arm.com> > Signed-off-by: Christoffer Dall <christoffer.dall@arm.com> > [maz: moved the command-line option to kvm-arm.mode] Should this be separate kvm-arm mode ? Or can this be tied to is_kernel_in_hyp_mode() ? Given this mode (from my limited review) doesn't conflict with normal VHE mode (and RME support), adding this explicit mode could confuse the user. In case we need a command line to turn the NV mode on/off, we could always use the id-override and simply leave this out ? Cheers Suzuki > Signed-off-by: Marc Zyngier <maz@kernel.org> > --- > .../admin-guide/kernel-parameters.txt | 7 +++++- > arch/arm64/include/asm/kvm_host.h | 5 ++++ > arch/arm64/kernel/cpufeature.c | 25 +++++++++++++++++++ > arch/arm64/kvm/arm.c | 5 ++++ > arch/arm64/tools/cpucaps | 1 + > 5 files changed, 42 insertions(+), 1 deletion(-) > > diff --git a/Documentation/admin-guide/kernel-parameters.txt b/Documentation/admin-guide/kernel-parameters.txt > index 6cfa6e3996cf..b7b0704e360e 100644 > --- a/Documentation/admin-guide/kernel-parameters.txt > +++ b/Documentation/admin-guide/kernel-parameters.txt > @@ -2553,9 +2553,14 @@ > protected: nVHE-based mode with support for guests whose > state is kept private from the host. > > + nested: VHE-based mode with support for nested > + virtualization. Requires at least ARMv8.3 > + hardware. > + > Defaults to VHE/nVHE based on hardware support. Setting > mode to "protected" will disable kexec and hibernation > - for the host. > + for the host. "nested" is experimental and should be > + used with extreme caution. > > kvm-arm.vgic_v3_group0_trap= > [KVM,ARM] Trap guest accesses to GICv3 group-0 > diff --git a/arch/arm64/include/asm/kvm_host.h b/arch/arm64/include/asm/kvm_host.h > index b14a0199eba4..186f1b759763 100644 > --- a/arch/arm64/include/asm/kvm_host.h > +++ b/arch/arm64/include/asm/kvm_host.h > @@ -60,9 +60,14 @@ > enum kvm_mode { > KVM_MODE_DEFAULT, > KVM_MODE_PROTECTED, > + KVM_MODE_NV, > KVM_MODE_NONE, > }; > +#ifdef CONFIG_KVM > enum kvm_mode kvm_get_mode(void); > +#else > +static inline enum kvm_mode kvm_get_mode(void) { return KVM_MODE_NONE; }; > +#endif > > DECLARE_STATIC_KEY_FALSE(userspace_irqchip_in_use); > > diff --git a/arch/arm64/kernel/cpufeature.c b/arch/arm64/kernel/cpufeature.c > index a77315b338e6..3fc14ee86239 100644 > --- a/arch/arm64/kernel/cpufeature.c > +++ b/arch/arm64/kernel/cpufeature.c > @@ -1956,6 +1956,20 @@ static void cpu_copy_el2regs(const struct arm64_cpu_capabilities *__unused) > write_sysreg(read_sysreg(tpidr_el1), tpidr_el2); > } > > +static bool has_nested_virt_support(const struct arm64_cpu_capabilities *cap, > + int scope) > +{ > + if (kvm_get_mode() != KVM_MODE_NV) > + return false; > + > + if (!has_cpuid_feature(cap, scope)) { > + pr_warn("unavailable: %s\n", cap->desc); > + return false; > + } > + > + return true; > +} > + > #ifdef CONFIG_ARM64_PAN > static void cpu_enable_pan(const struct arm64_cpu_capabilities *__unused) > { > @@ -2215,6 +2229,17 @@ static const struct arm64_cpu_capabilities arm64_features[] = { > .matches = runs_at_el2, > .cpu_enable = cpu_copy_el2regs, > }, > + { > + .desc = "Nested Virtualization Support", > + .capability = ARM64_HAS_NESTED_VIRT, > + .type = ARM64_CPUCAP_SYSTEM_FEATURE, > + .matches = has_nested_virt_support, > + .sys_reg = SYS_ID_AA64MMFR2_EL1, > + .sign = FTR_UNSIGNED, > + .field_pos = ID_AA64MMFR2_EL1_NV_SHIFT, > + .field_width = 4, > + .min_field_value = ID_AA64MMFR2_EL1_NV_IMP, > + }, > { > .capability = ARM64_HAS_32BIT_EL0_DO_NOT_USE, > .type = ARM64_CPUCAP_SYSTEM_FEATURE, > diff --git a/arch/arm64/kvm/arm.c b/arch/arm64/kvm/arm.c > index 698787ed87e9..4e1943e3aa42 100644 > --- a/arch/arm64/kvm/arm.c > +++ b/arch/arm64/kvm/arm.c > @@ -2325,6 +2325,11 @@ static int __init early_kvm_mode_cfg(char *arg) > return 0; > } > > + if (strcmp(arg, "nested") == 0 && !WARN_ON(!is_kernel_in_hyp_mode())) { > + kvm_mode = KVM_MODE_NV; > + return 0; > + } > + > return -EINVAL; > } > early_param("kvm-arm.mode", early_kvm_mode_cfg); > diff --git a/arch/arm64/tools/cpucaps b/arch/arm64/tools/cpucaps > index dfeb2c51e257..1af77a3657f7 100644 > --- a/arch/arm64/tools/cpucaps > +++ b/arch/arm64/tools/cpucaps > @@ -31,6 +31,7 @@ HAS_GENERIC_AUTH_IMP_DEF > HAS_IRQ_PRIO_MASKING > HAS_LDAPR > HAS_LSE_ATOMICS > +HAS_NESTED_VIRT > HAS_NO_FPSIMD > HAS_NO_HW_PREFETCH > HAS_PAN
Hi Suzuki, On Tue, 31 Jan 2023 13:47:31 +0000, Suzuki K Poulose <suzuki.poulose@arm.com> wrote: > > Hi Marc > > On 31/01/2023 09:23, Marc Zyngier wrote: > > From: Jintack Lim <jintack.lim@linaro.org> > > > > Add a new ARM64_HAS_NESTED_VIRT feature to indicate that the > > CPU has the ARMv8.3 nested virtualization capability, together > > with the 'kvm-arm.mode=nested' command line option. > > > > This will be used to support nested virtualization in KVM. > > > > Reviewed-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> > > Signed-off-by: Jintack Lim <jintack.lim@linaro.org> > > Signed-off-by: Andre Przywara <andre.przywara@arm.com> > > Signed-off-by: Christoffer Dall <christoffer.dall@arm.com> > > [maz: moved the command-line option to kvm-arm.mode] > > Should this be separate kvm-arm mode ? Or can this be tied to > is_kernel_in_hyp_mode() ? Given this mode (from my limited > review) doesn't conflict with normal VHE mode (and RME support), > adding this explicit mode could confuse the user. What is exactly the objection here? NV is more or less a VHE++ mode, but is also completely experimental and incomplete. > In case we need a command line to turn the NV mode on/off, > we could always use the id-override and simply leave this out ? I really want an explicit user buy-in. There is absolutely no way this can be enabled by default, the risks are way too high. Just look at the x86 story: it took them 10 years to enable NV by default. I don't expect to do any better. Thanks, M.
Hi Marc, On 31/01/2023 14:00, Marc Zyngier wrote: > Hi Suzuki, > > On Tue, 31 Jan 2023 13:47:31 +0000, > Suzuki K Poulose <suzuki.poulose@arm.com> wrote: >> >> Hi Marc >> >> On 31/01/2023 09:23, Marc Zyngier wrote: >>> From: Jintack Lim <jintack.lim@linaro.org> >>> >>> Add a new ARM64_HAS_NESTED_VIRT feature to indicate that the >>> CPU has the ARMv8.3 nested virtualization capability, together >>> with the 'kvm-arm.mode=nested' command line option. >>> >>> This will be used to support nested virtualization in KVM. >>> >>> Reviewed-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> >>> Signed-off-by: Jintack Lim <jintack.lim@linaro.org> >>> Signed-off-by: Andre Przywara <andre.przywara@arm.com> >>> Signed-off-by: Christoffer Dall <christoffer.dall@arm.com> >>> [maz: moved the command-line option to kvm-arm.mode] >> >> Should this be separate kvm-arm mode ? Or can this be tied to >> is_kernel_in_hyp_mode() ? Given this mode (from my limited >> review) doesn't conflict with normal VHE mode (and RME support), >> adding this explicit mode could confuse the user. > > What is exactly the objection here? NV is more or less a VHE++ mode, > but is also completely experimental and incomplete. I am all in for making this an "optional", only enabled it when "I know what I want". kvm-arm.mode=nv kind of seems that the KVM driver is conditioned mainly for running NV (comparing with the other existing options for kvm-arm.mode). In reality, as you confirmed, NV is an *additional* capability of a VHE hypervisor. So it would be good to "opt" in for "nv" capability support. e.g, kvm-arm.nv=on Thinking more about it, either is fine. > >> In case we need a command line to turn the NV mode on/off, >> we could always use the id-override and simply leave this out ? > > I really want an explicit user buy-in. There is absolutely no way this > can be enabled by default, the risks are way too high. Just look at > the x86 story: it took them 10 years to enable NV by default. I don't > expect to do any better. Of course, I am with you on that. I realise that we may not be able to disable nv by default using idreg-override (i.e., without any kernel command line option). So we may need something outside of that, like the option mentioned above. Suzuki > > Thanks, > > M. >
On Tue, Jan 31, 2023 at 05:34:39PM +0000, Suzuki K Poulose wrote: > On 31/01/2023 14:00, Marc Zyngier wrote: [...] > > What is exactly the objection here? NV is more or less a VHE++ mode, > > but is also completely experimental and incomplete. > > I am all in for making this an "optional", only enabled it when "I know > what I want". > > kvm-arm.mode=nv kind of seems that the KVM driver is conditioned > mainly for running NV (comparing with the other existing options > for kvm-arm.mode). > > In reality, as you confirmed, NV is an *additional* capability > of a VHE hypervisor. So it would be good to "opt" in for "nv" capability > support. > > e.g, > > kvm-arm.nv=on > > Thinking more about it, either is fine. Marc, I'm curious, how do you plan to glue hVHE + NV together (if at all)? We may need two separate options for this so the user could separately configure NV for their hVHE KVM instance.
On Tue, 31 Jan 2023 17:34:39 +0000, Suzuki K Poulose <suzuki.poulose@arm.com> wrote: > > Hi Marc, > > On 31/01/2023 14:00, Marc Zyngier wrote: > > Hi Suzuki, > > > > On Tue, 31 Jan 2023 13:47:31 +0000, > > Suzuki K Poulose <suzuki.poulose@arm.com> wrote: > >> > >> Hi Marc > >> > >> On 31/01/2023 09:23, Marc Zyngier wrote: > >>> From: Jintack Lim <jintack.lim@linaro.org> > >>> > >>> Add a new ARM64_HAS_NESTED_VIRT feature to indicate that the > >>> CPU has the ARMv8.3 nested virtualization capability, together > >>> with the 'kvm-arm.mode=nested' command line option. > >>> > >>> This will be used to support nested virtualization in KVM. > >>> > >>> Reviewed-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> > >>> Signed-off-by: Jintack Lim <jintack.lim@linaro.org> > >>> Signed-off-by: Andre Przywara <andre.przywara@arm.com> > >>> Signed-off-by: Christoffer Dall <christoffer.dall@arm.com> > >>> [maz: moved the command-line option to kvm-arm.mode] > >> > >> Should this be separate kvm-arm mode ? Or can this be tied to > >> is_kernel_in_hyp_mode() ? Given this mode (from my limited > >> review) doesn't conflict with normal VHE mode (and RME support), > >> adding this explicit mode could confuse the user. > > > > What is exactly the objection here? NV is more or less a VHE++ mode, > > but is also completely experimental and incomplete. > > I am all in for making this an "optional", only enabled it when "I know > what I want". > > kvm-arm.mode=nv kind of seems that the KVM driver is conditioned > mainly for running NV (comparing with the other existing options > for kvm-arm.mode). > > In reality, as you confirmed, NV is an *additional* capability > of a VHE hypervisor. So it would be good to "opt" in for "nv" capability > support. > > e.g, > > kvm-arm.nv=on > > Thinking more about it, either is fine. We had something like this for a long while, but it gave the bad impression what NV and all the other options were orthogonal. But they aren't, really. NV+nVHE is not a thing (at least, as far as I am concerned), and I'm not even mentioning NV+protected. The same reasoning applies to 'protected'. We don't have kvm-arm.protected=on because it only makes sense with nVHE, so we have kvm-arm.mode=protected which is nVHE + weird stuff, just like NV is VHE + even weirder stuff. > >> In case we need a command line to turn the NV mode on/off, > >> we could always use the id-override and simply leave this out ? > > > > I really want an explicit user buy-in. There is absolutely no way this > > can be enabled by default, the risks are way too high. Just look at > > the x86 story: it took them 10 years to enable NV by default. I don't > > expect to do any better. > > Of course, I am with you on that. I realise that we may not be able > to disable nv by default using idreg-override (i.e., without any kernel > command line option). So we may need something outside of that, like > the option mentioned above. The override would indeed need the user to *add* something to disable NV, and we want the opposite. If we really want to allow support for something like NV + protected (to mention the worse possible case), I'd rather we have something like: kvm-arm.mode=protected,nested which would describe the expected behaviour in a compact way, without adding a ton of extra options. But to be honest, I'm not expecting to support any of that within the foreseeable future! M.
On Tue, 31 Jan 2023 20:04:15 +0000, Oliver Upton <oliver.upton@linux.dev> wrote: > > On Tue, Jan 31, 2023 at 05:34:39PM +0000, Suzuki K Poulose wrote: > > On 31/01/2023 14:00, Marc Zyngier wrote: > > [...] > > > > What is exactly the objection here? NV is more or less a VHE++ mode, > > > but is also completely experimental and incomplete. > > > > I am all in for making this an "optional", only enabled it when "I know > > what I want". > > > > kvm-arm.mode=nv kind of seems that the KVM driver is conditioned > > mainly for running NV (comparing with the other existing options > > for kvm-arm.mode). > > > > In reality, as you confirmed, NV is an *additional* capability > > of a VHE hypervisor. So it would be good to "opt" in for "nv" capability > > support. > > > > e.g, > > > > kvm-arm.nv=on > > > > Thinking more about it, either is fine. > > Marc, I'm curious, how do you plan to glue hVHE + NV together (if at > all)? We may need two separate options for this so the user could > separately configure NV for their hVHE KVM instance. I really don't plan to support them together. But if we wanted to support something, I'd rather express it as a composition of the various options, as I suggested to Suzuki earlier. Something along the lines of: kvm-arm.mode=hvhe,nested But the extra complexity is mind-boggling, frankly. And the result will suck terribly. NV is already exit-heavy, compared to single-level virtualisation. But now you double the cost of the exit by moving all the load/put work into the entry-exit phase. To give you an idea, an L2 guest under a hVHE L1 is ~30% slower than the same guest running under a VHE L1 with an exit-heavy workload (virtio-9p + hackbench). Making the L0 host hVHE would be even worse. We may be able to improve it to some extent, but it will always be the sucky option. Also, given where we are at the support stage (we basically offer an ARMv8.1 L1 environment), the impetus to support this sort of contraption is.. hrmm... low. I'd rather spend my energy on architecture correctness and feature-parity with single level. Thanks, M.
On Tue, Jan 31, 2023 at 09:26:56PM +0000, Marc Zyngier wrote: > On Tue, 31 Jan 2023 20:04:15 +0000, Oliver Upton <oliver.upton@linux.dev> wrote: > > Marc, I'm curious, how do you plan to glue hVHE + NV together (if at > > all)? We may need two separate options for this so the user could > > separately configure NV for their hVHE KVM instance. > > I really don't plan to support them together. But if we wanted to > support something, I'd rather express it as a composition of the > various options, as I suggested to Suzuki earlier. Something along the > lines of: > > kvm-arm.mode=hvhe,nested > > But the extra complexity is mind-boggling, frankly. And the result > will suck terribly. NV is already exit-heavy, compared to single-level > virtualisation. But now you double the cost of the exit by moving all > the load/put work into the entry-exit phase. Oh yeah, that wasn't a feature request, just wanted to pick your brain for a moment :) Hopefully nobody comes along asking for it in the future, heh.
diff --git a/Documentation/admin-guide/kernel-parameters.txt b/Documentation/admin-guide/kernel-parameters.txt index 6cfa6e3996cf..b7b0704e360e 100644 --- a/Documentation/admin-guide/kernel-parameters.txt +++ b/Documentation/admin-guide/kernel-parameters.txt @@ -2553,9 +2553,14 @@ protected: nVHE-based mode with support for guests whose state is kept private from the host. + nested: VHE-based mode with support for nested + virtualization. Requires at least ARMv8.3 + hardware. + Defaults to VHE/nVHE based on hardware support. Setting mode to "protected" will disable kexec and hibernation - for the host. + for the host. "nested" is experimental and should be + used with extreme caution. kvm-arm.vgic_v3_group0_trap= [KVM,ARM] Trap guest accesses to GICv3 group-0 diff --git a/arch/arm64/include/asm/kvm_host.h b/arch/arm64/include/asm/kvm_host.h index b14a0199eba4..186f1b759763 100644 --- a/arch/arm64/include/asm/kvm_host.h +++ b/arch/arm64/include/asm/kvm_host.h @@ -60,9 +60,14 @@ enum kvm_mode { KVM_MODE_DEFAULT, KVM_MODE_PROTECTED, + KVM_MODE_NV, KVM_MODE_NONE, }; +#ifdef CONFIG_KVM enum kvm_mode kvm_get_mode(void); +#else +static inline enum kvm_mode kvm_get_mode(void) { return KVM_MODE_NONE; }; +#endif DECLARE_STATIC_KEY_FALSE(userspace_irqchip_in_use); diff --git a/arch/arm64/kernel/cpufeature.c b/arch/arm64/kernel/cpufeature.c index a77315b338e6..3fc14ee86239 100644 --- a/arch/arm64/kernel/cpufeature.c +++ b/arch/arm64/kernel/cpufeature.c @@ -1956,6 +1956,20 @@ static void cpu_copy_el2regs(const struct arm64_cpu_capabilities *__unused) write_sysreg(read_sysreg(tpidr_el1), tpidr_el2); } +static bool has_nested_virt_support(const struct arm64_cpu_capabilities *cap, + int scope) +{ + if (kvm_get_mode() != KVM_MODE_NV) + return false; + + if (!has_cpuid_feature(cap, scope)) { + pr_warn("unavailable: %s\n", cap->desc); + return false; + } + + return true; +} + #ifdef CONFIG_ARM64_PAN static void cpu_enable_pan(const struct arm64_cpu_capabilities *__unused) { @@ -2215,6 +2229,17 @@ static const struct arm64_cpu_capabilities arm64_features[] = { .matches = runs_at_el2, .cpu_enable = cpu_copy_el2regs, }, + { + .desc = "Nested Virtualization Support", + .capability = ARM64_HAS_NESTED_VIRT, + .type = ARM64_CPUCAP_SYSTEM_FEATURE, + .matches = has_nested_virt_support, + .sys_reg = SYS_ID_AA64MMFR2_EL1, + .sign = FTR_UNSIGNED, + .field_pos = ID_AA64MMFR2_EL1_NV_SHIFT, + .field_width = 4, + .min_field_value = ID_AA64MMFR2_EL1_NV_IMP, + }, { .capability = ARM64_HAS_32BIT_EL0_DO_NOT_USE, .type = ARM64_CPUCAP_SYSTEM_FEATURE, diff --git a/arch/arm64/kvm/arm.c b/arch/arm64/kvm/arm.c index 698787ed87e9..4e1943e3aa42 100644 --- a/arch/arm64/kvm/arm.c +++ b/arch/arm64/kvm/arm.c @@ -2325,6 +2325,11 @@ static int __init early_kvm_mode_cfg(char *arg) return 0; } + if (strcmp(arg, "nested") == 0 && !WARN_ON(!is_kernel_in_hyp_mode())) { + kvm_mode = KVM_MODE_NV; + return 0; + } + return -EINVAL; } early_param("kvm-arm.mode", early_kvm_mode_cfg); diff --git a/arch/arm64/tools/cpucaps b/arch/arm64/tools/cpucaps index dfeb2c51e257..1af77a3657f7 100644 --- a/arch/arm64/tools/cpucaps +++ b/arch/arm64/tools/cpucaps @@ -31,6 +31,7 @@ HAS_GENERIC_AUTH_IMP_DEF HAS_IRQ_PRIO_MASKING HAS_LDAPR HAS_LSE_ATOMICS +HAS_NESTED_VIRT HAS_NO_FPSIMD HAS_NO_HW_PREFETCH HAS_PAN