
[v5,18/18] kvm: arm64: Allow tuning the physical address size for VM

Message ID 20180917104144.19188-19-suzuki.poulose@arm.com (mailing list archive)
State New, archived
Series kvm: arm64: Dynamic IPA and 52bit IPA

Commit Message

Suzuki K Poulose Sept. 17, 2018, 10:41 a.m. UTC
Allow specifying the physical address size limit for a new
VM via the kvm_type argument for the KVM_CREATE_VM ioctl. This
allows us to finalise the stage2 page table as early as possible
and hence perform the right checks on the memory slots
without complication. The size is ecnoded as Log2(PA_Size) in
bits[7:0] of the type field. For backward compatibility the
value 0 is reserved and implies 40bits. Also, lift the limit
of the IPA to host limit and allow lower IPA sizes (e.g, 32).

The userspace could check the extension KVM_CAP_ARM_VM_PHYS_SHIFT
for the availability of this feature. The cap check returns the
maximum limit for the physical address shift supported by the host.

Cc: Marc Zyngier <marc.zyngier@arm.com>
Cc: Christoffer Dall <cdall@kernel.org>
Cc: Peter Maydel <peter.maydell@linaro.org>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Radim Krčmář <rkrcmar@redhat.com>
Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>
---
Changes since v4:
 - Fold the introduction of the KVM_CAP_ARM_VM_PHYS_SHIFT to this
   patch to allow detection of the availability of the feature for
   userspace.
 - Document the API
 - Restrict the feature only to arm64.
Changes since V3:
 - Switch to a CAP, that can be checked via EXTENSIONS on KVM device
   fd, rather than a dedicated ioctl.
---
 Documentation/virtual/kvm/api.txt       |  8 ++++++++
 arch/arm64/include/asm/stage2_pgtable.h | 20 --------------------
 arch/arm64/kvm/reset.c                  | 20 ++++++++++++++++----
 include/uapi/linux/kvm.h                | 10 ++++++++++
 4 files changed, 34 insertions(+), 24 deletions(-)
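
The encoding described in the commit message can be sketched as follows.
The two macros mirror the definitions this patch adds to
include/uapi/linux/kvm.h; DEFAULT_IPA_BITS and ipa_bits_from_type() are
hypothetical names used only for this illustration, and the range check
against the host limit is deliberately left out here.

```c
#include <assert.h>
#include <stdint.h>

/* Mirrors the definitions this patch adds to include/uapi/linux/kvm.h. */
#define KVM_VM_TYPE_ARM_PHYS_SHIFT_MASK	0xffULL
#define KVM_VM_TYPE_ARM_PHYS_SHIFT(x)	((x) & KVM_VM_TYPE_ARM_PHYS_SHIFT_MASK)

/* Hypothetical name: the 40-bit default that a type value of 0 selects. */
#define DEFAULT_IPA_BITS	40

/* Only bits[7:0] of the machine type carry the size. */
static_assert(KVM_VM_TYPE_ARM_PHYS_SHIFT(0x130) == 0x30,
	      "bits above [7:0] are masked off");

/*
 * Hypothetical helper: return the IPA size in bits requested by a
 * machine type, or -1 if any bit outside [7:0] is set.
 */
static int ipa_bits_from_type(uint64_t type)
{
	uint64_t shift;

	if (type & ~KVM_VM_TYPE_ARM_PHYS_SHIFT_MASK)
		return -1;

	shift = KVM_VM_TYPE_ARM_PHYS_SHIFT(type);
	return shift ? (int)shift : DEFAULT_IPA_BITS;
}
```

Treating 0 as "40 bits" is what keeps existing userspace, which always
passes 0 as the machine type, working unchanged.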

Comments

Peter Maydell Sept. 17, 2018, 2:20 p.m. UTC | #1
On 17 September 2018 at 11:41, Suzuki K Poulose <suzuki.poulose@arm.com> wrote:
> Allow specifying the physical address size limit for a new
> VM via the kvm_type argument for the KVM_CREATE_VM ioctl. This
> allows us to finalise the stage2 page table as early as possible
> and hence perform the right checks on the memory slots
> without complication. The size is ecnoded as Log2(PA_Size) in
> bits[7:0] of the type field. For backward compatibility the
> value 0 is reserved and implies 40bits. Also, lift the limit
> of the IPA to host limit and allow lower IPA sizes (e.g, 32).
>
> The userspace could check the extension KVM_CAP_ARM_VM_PHYS_SHIFT
> for the availability of this feature. The cap check returns the
> maximum limit for the physical address shift supported by the host.
>
> Cc: Marc Zyngier <marc.zyngier@arm.com>
> Cc: Christoffer Dall <cdall@kernel.org>
> Cc: Peter Maydel <peter.maydell@linaro.org>

Typo: my surname has two "l"s in it.

thanks
-- PMM
Suzuki K Poulose Sept. 17, 2018, 2:43 p.m. UTC | #2
Hi Peter,

On 17/09/2018 15:20, Peter Maydell wrote:
> On 17 September 2018 at 11:41, Suzuki K Poulose <suzuki.poulose@arm.com> wrote:
>> Allow specifying the physical address size limit for a new
>> VM via the kvm_type argument for the KVM_CREATE_VM ioctl. This
>> allows us to finalise the stage2 page table as early as possible
>> and hence perform the right checks on the memory slots
>> without complication. The size is ecnoded as Log2(PA_Size) in
>> bits[7:0] of the type field. For backward compatibility the
>> value 0 is reserved and implies 40bits. Also, lift the limit
>> of the IPA to host limit and allow lower IPA sizes (e.g, 32).
>>
>> The userspace could check the extension KVM_CAP_ARM_VM_PHYS_SHIFT
>> for the availability of this feature. The cap check returns the
>> maximum limit for the physical address shift supported by the host.
>>
>> Cc: Marc Zyngier <marc.zyngier@arm.com>
>> Cc: Christoffer Dall <cdall@kernel.org>
>> Cc: Peter Maydel <peter.maydell@linaro.org>
> 
> Typo: my surname has two "l"s in it.

Sorry about that. I have fixed this locally. Btw, did you get
a chance to look at the patch itself? If you did, are you fine
with the API?

Suzuki
Peter Maydell Sept. 18, 2018, 1:55 a.m. UTC | #3
On 17 September 2018 at 11:41, Suzuki K Poulose <suzuki.poulose@arm.com> wrote:
> --- a/Documentation/virtual/kvm/api.txt
> +++ b/Documentation/virtual/kvm/api.txt
> @@ -122,6 +122,14 @@ the default trap & emulate implementation (which changes the virtual
>  memory layout to fit in user mode), check KVM_CAP_MIPS_VZ and use the
>  flag KVM_VM_MIPS_VZ.
>
> +To configure the physical address space size for a VM (IPA size) on arm64,
> +check KVM_CAP_ARM_VM_PHYS_SHIFT (which returns the maximum limit for the
> +IPA shift) and use KVM_VM_TYPE_ARM_PHYS_SHIFT(PHYS_SHIFT). Bits[7-0] of the
> +machine type has been reserved for specifying the PHYS_SHIFT.
> +The supported range is [32...IPA_LIMIT], where IPA_LIMIT could be
> +identified by checking KVM_CAP_ARM_VM_PHYS_SHIFT. For backward compatibility
> +a value of 0 selects 40bits.
> +

Given this as the API documentation, I don't think I could figure out
what I as a userspace user of it need to do without looking at the
kernel code. Could I ask you to expand it so that it is a bit less
terse and a bit more detailed? (For instance, what is a PHYS_SHIFT
and why do I have to specify it rather than just telling the kernel
I want a 48 bit guest address space?)

thanks
-- PMM
Suzuki K Poulose Sept. 18, 2018, 3:16 p.m. UTC | #4
Hi Peter,

On 18/09/2018 02:55, Peter Maydell wrote:
> On 17 September 2018 at 11:41, Suzuki K Poulose <suzuki.poulose@arm.com> wrote:
>> --- a/Documentation/virtual/kvm/api.txt
>> +++ b/Documentation/virtual/kvm/api.txt
>> @@ -122,6 +122,14 @@ the default trap & emulate implementation (which changes the virtual
>>   memory layout to fit in user mode), check KVM_CAP_MIPS_VZ and use the
>>   flag KVM_VM_MIPS_VZ.
>>
>> +To configure the physical address space size for a VM (IPA size) on arm64,
>> +check KVM_CAP_ARM_VM_PHYS_SHIFT (which returns the maximum limit for the
>> +IPA shift) and use KVM_VM_TYPE_ARM_PHYS_SHIFT(PHYS_SHIFT). Bits[7-0] of the
>> +machine type has been reserved for specifying the PHYS_SHIFT.
>> +The supported range is [32...IPA_LIMIT], where IPA_LIMIT could be
>> +identified by checking KVM_CAP_ARM_VM_PHYS_SHIFT. For backward compatibility
>> +a value of 0 selects 40bits.
>> +
> 
> Given this as the API documentation, I don't think I could figure out
> what I as a userspace user of it need to do without looking at the
> kernel code. Could I ask you to expand it so that it is a bit less
> terse and a bit more detailed? (For instance, what is a PHYS_SHIFT
> and why do I have to specify it rather than just telling the kernel
> I want a 48 bit guest address space?)

Thanks for the feedback.  I acknowledge that the documentation is not
quite clear for a userspace user. How about:

"To configure the physical address space size for a VM (IPA size) on arm64,
check KVM_CAP_ARM_VM_IPA_SIZE and use KVM_VM_TYPE_ARM_IPA_SIZE(IPA_Bits)
as the argument to KVM_CREATE_VM, where IPA_Bits is the maximum width of
any physical address used by the VM. The IPA_Bits is encoded in Bits[7-0]
of the machine type, and must be one of { 0, 32, ... , Host_IPA_Limit },
where :
1) IPA_Bits = 0 implies 40bits IPA (for backward compatibility)
2) Host_IPA_Limit is the maximum limit for IPA_Bits on the host, which is
    dependent on the CPU capability and the host kernel configuration.
    This can be detected by checking the extension KVM_CAP_ARM_VM_IPA_SIZE
"

note: I have renamed the KVM_CAP_ARM_VM_PHYS_SHIFT => KVM_CAP_ARM_VM_IPA_SIZE
above and can update the code, if the latter is better.

Suzuki
Peter Maydell Sept. 18, 2018, 3:36 p.m. UTC | #5
On 18 September 2018 at 16:16, Suzuki K Poulose <Suzuki.Poulose@arm.com> wrote:
> Hi Peter,
>
> On 18/09/2018 02:55, Peter Maydell wrote:
>>
>> On 17 September 2018 at 11:41, Suzuki K Poulose <suzuki.poulose@arm.com>
>> wrote:
>>>
>>> --- a/Documentation/virtual/kvm/api.txt
>>> +++ b/Documentation/virtual/kvm/api.txt
>>> @@ -122,6 +122,14 @@ the default trap & emulate implementation (which
>>> changes the virtual
>>>   memory layout to fit in user mode), check KVM_CAP_MIPS_VZ and use the
>>>   flag KVM_VM_MIPS_VZ.
>>>
>>> +To configure the physical address space size for a VM (IPA size) on
>>> arm64,
>>> +check KVM_CAP_ARM_VM_PHYS_SHIFT (which returns the maximum limit for the
>>> +IPA shift) and use KVM_VM_TYPE_ARM_PHYS_SHIFT(PHYS_SHIFT). Bits[7-0] of
>>> the
>>> +machine type has been reserved for specifying the PHYS_SHIFT.
>>> +The supported range is [32...IPA_LIMIT], where IPA_LIMIT could be
>>> +identified by checking KVM_CAP_ARM_VM_PHYS_SHIFT. For backward
>>> compatibility
>>> +a value of 0 selects 40bits.
>>> +
>>
>>
>> Given this as the API documentation, I don't think I could figure out
>> what I as a userspace user of it need to do without looking at the
>> kernel code. Could I ask you to expand it so that it is a bit less
>> terse and a bit more detailed? (For instance, what is a PHYS_SHIFT
>> and why do I have to specify it rather than just telling the kernel
>> I want a 48 bit guest address space?)
>
>
> Thanks for the feedback.  I acknowledge that the documentation is not
> quite clear for a userspace user. How about:
>
> "To configure the physical address space size for a VM (IPA size) on arm64,
> check KVM_CAP_ARM_VM_IPA_SIZE and use KVM_VM_TYPE_ARM_IPA_SIZE(IPA_Bits)
> as the argument to KVM_CREATE_VM, where IPA_Bits is the maximum width of
> any physical address used by the VM. The IPA_Bits is encoded in Bits[7-0]
> of the machine type, and must be one of { 0, 32, ... , Host_IPA_Limit },
> where :
> 1) IPA_Bits = 0 implies 40bits IPA (for backward compatibility)
> 2) Host_IPA_Limit is the maximum limit for IPA_Bits on the host, which is
>    dependent on the CPU capability and the host kernel configuration.
>    This can be detected by checking the extension KVM_CAP_ARM_VM_IPA_SIZE
> "

I think this is still somewhat confusing. In particular, you're
describing both the "ask the kernel what it supports" API and
the "tell the kernel what we want" API in a single sentence
("...check KVM_CAP_ARM_VM_IPA_SIZE and use KVM_VM_TYPE_ARM_IPA_SIZE...").
There isn't a length limit on documentation, so why not describe
them both clearly in separate sentences?

Also, can I use any IPA value between 32 and Host_IPA_Limit, or
are only certain values in that range supported (if so, which)?

> note: I have renamed the KVM_CAP_ARM_VM_PHYS_SHIFT =>
> KVM_CAP_ARM_VM_IPA_SIZE
> above and can update the code, if the latter is better.

I think the new name is definitely clearer, thanks.

-- PMM
Suzuki K Poulose Sept. 18, 2018, 4:27 p.m. UTC | #6
Hi Peter,

On 18/09/18 16:36, Peter Maydell wrote:
> On 18 September 2018 at 16:16, Suzuki K Poulose <Suzuki.Poulose@arm.com> wrote:
>> Hi Peter,
>>
>> On 18/09/2018 02:55, Peter Maydell wrote:
>>>
>>> On 17 September 2018 at 11:41, Suzuki K Poulose <suzuki.poulose@arm.com>
>>> wrote:
>>>>
>>>> --- a/Documentation/virtual/kvm/api.txt
>>>> +++ b/Documentation/virtual/kvm/api.txt
>>>> @@ -122,6 +122,14 @@ the default trap & emulate implementation (which
>>>> changes the virtual
>>>>    memory layout to fit in user mode), check KVM_CAP_MIPS_VZ and use the
>>>>    flag KVM_VM_MIPS_VZ.
>>>>
>>>> +To configure the physical address space size for a VM (IPA size) on
>>>> arm64,
>>>> +check KVM_CAP_ARM_VM_PHYS_SHIFT (which returns the maximum limit for the
>>>> +IPA shift) and use KVM_VM_TYPE_ARM_PHYS_SHIFT(PHYS_SHIFT). Bits[7-0] of
>>>> the
>>>> +machine type has been reserved for specifying the PHYS_SHIFT.
>>>> +The supported range is [32...IPA_LIMIT], where IPA_LIMIT could be
>>>> +identified by checking KVM_CAP_ARM_VM_PHYS_SHIFT. For backward
>>>> compatibility
>>>> +a value of 0 selects 40bits.
>>>> +
>>>
>>>
>>> Given this as the API documentation, I don't think I could figure out
>>> what I as a userspace user of it need to do without looking at the
>>> kernel code. Could I ask you to expand it so that it is a bit less
>>> terse and a bit more detailed? (For instance, what is a PHYS_SHIFT
>>> and why do I have to specify it rather than just telling the kernel
>>> I want a 48 bit guest address space?)
>>
>>
>> Thanks for the feedback.  I acknowledge that the documentation is not
>> quite clear for a userspace user. How about:
>>
>> "To configure the physical address space size for a VM (IPA size) on arm64,
>> check KVM_CAP_ARM_VM_IPA_SIZE and use KVM_VM_TYPE_ARM_IPA_SIZE(IPA_Bits)
>> as the argument to KVM_CREATE_VM, where IPA_Bits is the maximum width of
>> any physical address used by the VM. The IPA_Bits is encoded in Bits[7-0]
>> of the machine type, and must be one of { 0, 32, ... , Host_IPA_Limit },
>> where :
>> 1) IPA_Bits = 0 implies 40bits IPA (for backward compatibility)
>> 2) Host_IPA_Limit is the maximum limit for IPA_Bits on the host, which is
>>     dependent on the CPU capability and the host kernel configuration.
>>     This can be detected by checking the extension KVM_CAP_ARM_VM_IPA_SIZE
>> "
> 
> I think this is still somewhat confusing. In particular, you're
> describing both the "ask the kernel what it supports" API and
> the "tell the kernel what we want" API in a single sentence
> ("...check KVM_CAP_ARM_VM_IPA_SIZE and use KVM_VM_TYPE_ARM_IPA_SIZE...").
> There isn't a length limit on documentation, so why not describe
> them both clearly in separate sentences?

Sure, will do. I didn't want to hijack the KVM_CREATE_VM ioctl command
section to explain a kvm CAP.

> 
> Also, can I use any IPA value between 32 and Host_IPA_Limit, or
> are only certain values in that range supported (if so, which)?

Any value in the range is supported. To avoid confusion, I should probably
also add that the configured IPA size is independent of what the guest
observes in ID_AA64MMFR0_EL1[PARange]. So here it goes :

---

"On arm64, the physical address size for a VM (IPA Size limit) is limited
to 40bits by default. The limit can be configured if the host supports the
extension KVM_CAP_ARM_VM_IPA_SIZE. When supported, use
KVM_VM_TYPE_ARM_IPA_SIZE(IPA_Bits) to set the size in the machine type
identifier, where IPA_Bits is the maximum width of any physical
address used by the VM. The IPA_Bits is encoded in bits[7-0] of the
machine type identifier.

e.g, to configure a guest to use 48bit physical address size :

	vm_fd = ioctl(dev_fd, KVM_CREATE_VM, KVM_VM_TYPE_ARM_IPA_SIZE(48));

The requested size (IPA_Bits) must be :
   0 - Implies default 40bits (for backward compatibility)

   or

   N - Implies N bits, where N is a positive integer such that 32 <= N <= Host_IPA_Limit

Host_IPA_Limit is the maximum possible value for IPA_Bits on the host and
is dependent on the CPU capability and the kernel configuration. The limit can
be retrieved using KVM_CAP_ARM_VM_IPA_SIZE of the KVM_CHECK_EXTENSION
ioctl() at run-time.

Please note that configuring the IPA size does not affect the capability
exposed by the guest CPUs in ID_AA64MMFR0_EL1[PARange]. It only affects
the guest to host physical address (stage2) translations setup by the host.
"

Suzuki
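
The acceptance rule in the proposed text above can be sketched as a pair
of small helpers. This is only an illustration of the rule as worded in
the thread and as checked by kvm_arm_config_vm() in this patch; the
names IPA_BITS_MIN, IPA_BITS_DEFAULT, ipa_request_is_valid() and
effective_ipa_bits() are hypothetical, and host_ipa_limit stands for the
value userspace would obtain from the capability check.

```c
#include <assert.h>
#include <stdbool.h>

#define IPA_BITS_MIN		32	/* lowest size the proposed API accepts */
#define IPA_BITS_DEFAULT	40	/* what a request of 0 selects */

static_assert(IPA_BITS_MIN <= IPA_BITS_DEFAULT,
	      "the default must itself be a valid size");

/*
 * A request is valid if it is 0 (backward-compatible default) or lies
 * in [32, host_ipa_limit].
 */
static bool ipa_request_is_valid(unsigned int ipa_bits,
				 unsigned int host_ipa_limit)
{
	if (ipa_bits == 0)
		return true;

	return ipa_bits >= IPA_BITS_MIN && ipa_bits <= host_ipa_limit;
}

/* Size actually used for a valid request. */
static unsigned int effective_ipa_bits(unsigned int ipa_bits)
{
	return ipa_bits ? ipa_bits : IPA_BITS_DEFAULT;
}
```

Keeping the "ask" step (reading the limit) separate from the "tell" step
(validating and passing the request) follows the split Peter asks for in
the documentation.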
Peter Maydell Sept. 18, 2018, 5:15 p.m. UTC | #7
On 18 September 2018 at 17:27, Suzuki K Poulose <Suzuki.Poulose@arm.com> wrote:
> ---
>
> "On arm64, the physical address size for a VM (IPA Size limit) is limited
> to 40bits by default. The limit can be configured if the host supports the
> extension KVM_CAP_ARM_VM_IPA_SIZE. When supported, use
> KVM_VM_TYPE_ARM_IPA_SIZE(IPA_Bits) to set the size in the machine type
> identifier, where IPA_Bits is the maximum width of any physical
> address used by the VM. The IPA_Bits is encoded in bits[7-0] of the
> machine type identifier.
>
> e.g, to configure a guest to use 48bit physical address size :
>
>         vm_fd = ioctl(dev_fd, KVM_CREATE_VM, KVM_VM_TYPE_ARM_IPA_SIZE(48));
>
> The requested size (IPA_Bits) must be :
>   0 - Implies default 40bits (for backward compatibility)
>
>   or
>
>   N - Implies N bits, where N is a positive integer such that 32 <= N <=
> Host_IPA_Limit
>
> Host_IPA_Limit is the maximum possible value for IPA_Bits on the host and
> is dependent on the CPU capability and the kernel configuration. The limit
> can
> be retrieved using KVM_CAP_ARM_VM_IPA_SIZE of the KVM_CHECK_EXTENSION
> ioctl() at run-time.
>
> Please note that configuring the IPA size does not affect the capability
> exposed by the guest CPUs in ID_AA64MMFR0_EL1[PARange]. It only affects
> the guest to host physical address (stage2) translations setup by the host.
> "

Thanks, this is much clearer. The only bit I'm not sure about is that
last paragraph -- if I ask for a VM with a 48 bit address space why
don't we tell the guest that that's what it has ?

thanks
-- PMM
Suzuki K Poulose Sept. 19, 2018, 10:03 a.m. UTC | #8
On 09/18/2018 06:15 PM, Peter Maydell wrote:
> On 18 September 2018 at 17:27, Suzuki K Poulose <Suzuki.Poulose@arm.com> wrote:
>> ---
>>
>> "On arm64, the physical address size for a VM (IPA Size limit) is limited
>> to 40bits by default. The limit can be configured if the host supports the
>> extension KVM_CAP_ARM_VM_IPA_SIZE. When supported, use
>> KVM_VM_TYPE_ARM_IPA_SIZE(IPA_Bits) to set the size in the machine type
>> identifier, where IPA_Bits is the maximum width of any physical
>> address used by the VM. The IPA_Bits is encoded in bits[7-0] of the
>> machine type identifier.
>>
>> e.g, to configure a guest to use 48bit physical address size :
>>
>>          vm_fd = ioctl(dev_fd, KVM_CREATE_VM, KVM_VM_TYPE_ARM_IPA_SIZE(48));
>>
>> The requested size (IPA_Bits) must be :
>>    0 - Implies default 40bits (for backward compatibility)
>>
>>    or
>>
>>    N - Implies N bits, where N is a positive integer such that 32 <= N <=
>> Host_IPA_Limit
>>
>> Host_IPA_Limit is the maximum possible value for IPA_Bits on the host and
>> is dependent on the CPU capability and the kernel configuration. The limit
>> can
>> be retrieved using KVM_CAP_ARM_VM_IPA_SIZE of the KVM_CHECK_EXTENSION
>> ioctl() at run-time.
>>
>> Please note that configuring the IPA size does not affect the capability
>> exposed by the guest CPUs in ID_AA64MMFR0_EL1[PARange]. It only affects
>> the guest to host physical address (stage2) translations setup by the host.
>> "
> 
> Thanks, this is much clearer. The only bit I'm not sure about is that
> last paragraph -- if I ask for a VM with a 48 bit address space why
> don't we tell the guest that that's what it has ?

The point is that the IPA size is not a limit on the CPU's PA size. e.g, if
this guest were a nested hypervisor with a 48bit IPA, it could still
support a 52bit IPA nested guest within by simply looking up the CPU
PARange. The IPA size configuration here is simply a hint to the
hypervisor on where the memory banks would be kept. And this is certainly
true on real platforms (e.g, my Juno has 42bit PARange on A57, while
A53 reports 40bit, with PA max at 40bits).

Suzuki
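
To make the distinction concrete: the PA range a CPU advertises comes
from the PARange field (bits [3:0]) of ID_AA64MMFR0_EL1 and is decoded
independently of whatever IPA size was configured for the VM. A minimal
sketch of that decoding, using the architectural encodings for 32 to 52
bits; the names are hypothetical and any other encoding is reported as 0
(unknown) in this sketch.

```c
#include <assert.h>
#include <stdint.h>

/* PA size in bits for PARange encodings 0b0000 .. 0b0110. */
static const unsigned int parange_bits[] = { 32, 36, 40, 42, 44, 48, 52 };

#define PARANGE_ENTRIES	(sizeof(parange_bits) / sizeof(parange_bits[0]))

static_assert(PARANGE_ENTRIES == 7,
	      "one entry per encoding 0b0000..0b0110");

/*
 * Hypothetical helper: decode ID_AA64MMFR0_EL1.PARange (bits [3:0])
 * into a PA width in bits, or 0 for an encoding this table lacks.
 */
static unsigned int parange_to_pa_bits(uint64_t id_aa64mmfr0)
{
	uint64_t parange = id_aa64mmfr0 & 0xf;

	return parange < PARANGE_ENTRIES ? parange_bits[parange] : 0;
}
```

Nothing in this decoding takes the VM's IPA size as an input, which is
why a guest with a 48bit IPA can still see a larger PARange.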
Eric Auger Sept. 25, 2018, 10 a.m. UTC | #9
Hi Suzuki,
On 9/17/18 12:41 PM, Suzuki K Poulose wrote:
> Allow specifying the physical address size limit for a new
> VM via the kvm_type argument for the KVM_CREATE_VM ioctl. This
> allows us to finalise the stage2 page table as early as possible
> and hence perform the right checks on the memory slots
> without complication. The size is ecnoded as Log2(PA_Size) in
encoded
> bits[7:0] of the type field. For backward compatibility the
> value 0 is reserved and implies 40bits. Also, lift the limit
> of the IPA to host limit and allow lower IPA sizes (e.g, 32).
> 
> The userspace could check the extension KVM_CAP_ARM_VM_PHYS_SHIFT
> for the availability of this feature. The cap check returns the
> maximum limit for the physical address shift supported by the host.
> 
> Cc: Marc Zyngier <marc.zyngier@arm.com>
> Cc: Christoffer Dall <cdall@kernel.org>
> Cc: Peter Maydel <peter.maydell@linaro.org>
> Cc: Paolo Bonzini <pbonzini@redhat.com>
> Cc: Radim Krčmář <rkrcmar@redhat.com>
> Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>
> ---
> Changes since v4:
>  - Fold the introduction of the KVM_CAP_ARM_VM_PHYS_SHIFT to this
>    patch to allow detection of the availability of the feature for
>    userspace.
>  - Document the API
>  - Restrict the feature only to arm64.
> Changes since V3:
>  - Switch to a CAP, that can be checked via EXTENSIONS on KVM device
>    fd, rather than a dedicated ioctl.
> ---
>  Documentation/virtual/kvm/api.txt       |  8 ++++++++
>  arch/arm64/include/asm/stage2_pgtable.h | 20 --------------------
>  arch/arm64/kvm/reset.c                  | 20 ++++++++++++++++----
>  include/uapi/linux/kvm.h                | 10 ++++++++++
>  4 files changed, 34 insertions(+), 24 deletions(-)
> 
> diff --git a/Documentation/virtual/kvm/api.txt b/Documentation/virtual/kvm/api.txt
> index c664064f76fb..f860251ff27c 100644
> --- a/Documentation/virtual/kvm/api.txt
> +++ b/Documentation/virtual/kvm/api.txt
> @@ -122,6 +122,14 @@ the default trap & emulate implementation (which changes the virtual
>  memory layout to fit in user mode), check KVM_CAP_MIPS_VZ and use the
>  flag KVM_VM_MIPS_VZ.
>  
> +To configure the physical address space size for a VM (IPA size) on arm64,
> +check KVM_CAP_ARM_VM_PHYS_SHIFT (which returns the maximum limit for the
> +IPA shift) and use KVM_VM_TYPE_ARM_PHYS_SHIFT(PHYS_SHIFT). Bits[7-0] of the
> +machine type has been reserved for specifying the PHYS_SHIFT.
are reserved to pass the PHYS_SHIFT?
> +The supported range is [32...IPA_LIMIT], where IPA_LIMIT could be
s/could be/is
> +identified by checking KVM_CAP_ARM_VM_PHYS_SHIFT. For backward compatibility
> +a value of 0 selects 40bits.
> +
>  
>  4.3 KVM_GET_MSR_INDEX_LIST, KVM_GET_MSR_FEATURE_INDEX_LIST
>  
> diff --git a/arch/arm64/include/asm/stage2_pgtable.h b/arch/arm64/include/asm/stage2_pgtable.h
> index 6a56fdff0823..0b339f5a4a7c 100644
> --- a/arch/arm64/include/asm/stage2_pgtable.h
> +++ b/arch/arm64/include/asm/stage2_pgtable.h
> @@ -42,28 +42,8 @@
>   * the range (IPA_SHIFT, IPA_SHIFT - 4).
>   */
>  #define stage2_pgtable_levels(ipa)	ARM64_HW_PGTABLE_LEVELS((ipa) - 4)
> -#define STAGE2_PGTABLE_LEVELS		stage2_pgtable_levels(KVM_PHYS_SHIFT)
>  #define kvm_stage2_levels(kvm)		VTCR_EL2_LVLS(kvm->arch.vtcr)
>  
> -/*
> - * With all the supported VA_BITs and 40bit guest IPA, the following condition
> - * is always true:
> - *
> - *       STAGE2_PGTABLE_LEVELS <= CONFIG_PGTABLE_LEVELS
> - *
> - * We base our stage-2 page table walker helpers on this assumption and
> - * fall back to using the host version of the helper wherever possible.
> - * i.e, if a particular level is not folded (e.g, PUD) at stage2, we fall back
> - * to using the host version, since it is guaranteed it is not folded at host.
> - *
> - * If the condition breaks in the future, we can rearrange the host level
> - * definitions and reuse them for stage2. Till then...
> - */
> -#if STAGE2_PGTABLE_LEVELS > CONFIG_PGTABLE_LEVELS
> -#error "Unsupported combination of guest IPA and host VA_BITS."
> -#endif
> -
> -
>  /* stage2_pgdir_shift() is the size mapped by top-level stage2 entry for the VM */
>  #define stage2_pgdir_shift(kvm)		pt_levels_pgdir_shift(kvm_stage2_levels(kvm))
>  #define stage2_pgdir_size(kvm)		(1ULL << stage2_pgdir_shift(kvm))
> diff --git a/arch/arm64/kvm/reset.c b/arch/arm64/kvm/reset.c
> index 0393bb974b23..c9640159e11f 100644
> --- a/arch/arm64/kvm/reset.c
> +++ b/arch/arm64/kvm/reset.c
> @@ -89,6 +89,9 @@ int kvm_arch_dev_ioctl_check_extension(struct kvm *kvm, long ext)
>  	case KVM_CAP_VCPU_EVENTS:
>  		r = 1;
>  		break;
> +	case KVM_CAP_ARM_VM_PHYS_SHIFT:
> +		r = kvm_ipa_limit;
> +		break;
>  	default:
>  		r = 0;
>  	}
> @@ -190,16 +193,25 @@ int kvm_arm_config_vm(struct kvm *kvm, unsigned long type)
>  {
>  	u64 vtcr = VTCR_EL2_FLAGS;
>  	u64 parange;
> -	u8 lvls;
> +	u8 lvls, ipa_shift;
>  
> -	if (type)
> +	if (type & ~KVM_VM_TYPE_ARM_PHYS_SHIFT_MASK)
>  		return -EINVAL;
>  
> +	ipa_shift = KVM_VM_TYPE_ARM_PHYS_SHIFT(type);
> +	if (ipa_shift) {
> +		if (ipa_shift > kvm_ipa_limit ||
> +		    ipa_shift < 32)
> +			return -EINVAL;
> +	} else {
> +		ipa_shift = KVM_PHYS_SHIFT;
> +	}
> +
>  	/*
>  	 * Use a minimum 2 level page table to prevent splitting
>  	 * host PMD huge pages at stage2.
>  	 */
> -	lvls = stage2_pgtable_levels(KVM_PHYS_SHIFT);
> +	lvls = stage2_pgtable_levels(ipa_shift);
>  	if (lvls < 2)
>  		lvls = 2;
>  
> @@ -221,7 +233,7 @@ int kvm_arm_config_vm(struct kvm *kvm, unsigned long type)
>  		VTCR_EL2_VS_8BIT;
>  
>  	vtcr |= VTCR_EL2_LVLS_TO_SL0(lvls);
> -	vtcr |= VTCR_EL2_T0SZ(KVM_PHYS_SHIFT);
> +	vtcr |= VTCR_EL2_T0SZ(ipa_shift);
>  
>  	kvm->arch.vtcr = vtcr;
>  	return 0;
> diff --git a/include/uapi/linux/kvm.h b/include/uapi/linux/kvm.h
> index 07548de5c988..2a6b29c446db 100644
> --- a/include/uapi/linux/kvm.h
> +++ b/include/uapi/linux/kvm.h
> @@ -750,6 +750,15 @@ struct kvm_ppc_resize_hpt {
>  
>  #define KVM_S390_SIE_PAGE_OFFSET 1
>  
> +/*
> + * On arm64, machine type can be used to request the physical
> + * address size for the VM. Bits[7-0] has been reserved for the PA
s/has been reserved/are?

Thanks

Eric
> + * size shift (i.e, log2(PA_Size)). For backward compatibility,
> + * value 0 implies the default IPA size, 40bits.
> + */
> +#define KVM_VM_TYPE_ARM_PHYS_SHIFT_MASK	0xffULL
> +#define KVM_VM_TYPE_ARM_PHYS_SHIFT(x)		\
> +	((x) & KVM_VM_TYPE_ARM_PHYS_SHIFT_MASK)
>  /*
>   * ioctls for /dev/kvm fds:
>   */
> @@ -952,6 +961,7 @@ struct kvm_ppc_resize_hpt {
>  #define KVM_CAP_S390_HPAGE_1M 156
>  #define KVM_CAP_NESTED_STATE 157
>  #define KVM_CAP_ARM_INJECT_SERROR_ESR 158
> +#define KVM_CAP_ARM_VM_PHYS_SHIFT 159 /* returns maximum PA shift for a VM */
>  
>  #ifdef KVM_CAP_IRQ_ROUTING
>  
>
Suzuki K Poulose Sept. 25, 2018, 10:24 a.m. UTC | #10
On 09/25/2018 11:00 AM, Auger Eric wrote:
> Hi Suzuki,
> On 9/17/18 12:41 PM, Suzuki K Poulose wrote:
>> Allow specifying the physical address size limit for a new
>> VM via the kvm_type argument for the KVM_CREATE_VM ioctl. This
>> allows us to finalise the stage2 page table as early as possible
>> and hence perform the right checks on the memory slots
>> without complication. The size is ecnoded as Log2(PA_Size) in
> encoded

...

>>
>> diff --git a/Documentation/virtual/kvm/api.txt b/Documentation/virtual/kvm/api.txt
>> index c664064f76fb..f860251ff27c 100644
>> --- a/Documentation/virtual/kvm/api.txt
>> +++ b/Documentation/virtual/kvm/api.txt
>> @@ -122,6 +122,14 @@ the default trap & emulate implementation (which changes the virtual
>>   memory layout to fit in user mode), check KVM_CAP_MIPS_VZ and use the
>>   flag KVM_VM_MIPS_VZ.
>>   
>> +To configure the physical address space size for a VM (IPA size) on arm64,
>> +check KVM_CAP_ARM_VM_PHYS_SHIFT (which returns the maximum limit for the
>> +IPA shift) and use KVM_VM_TYPE_ARM_PHYS_SHIFT(PHYS_SHIFT). Bits[7-0] of the
>> +machine type has been reserved for specifying the PHYS_SHIFT.
> are reserved to pass the PHYS_SHIFT?
>> +The supported range is [32...IPA_LIMIT], where IPA_LIMIT could be
> s/could be/is



>> diff --git a/include/uapi/linux/kvm.h b/include/uapi/linux/kvm.h
>> index 07548de5c988..2a6b29c446db 100644
>> --- a/include/uapi/linux/kvm.h
>> +++ b/include/uapi/linux/kvm.h
>> @@ -750,6 +750,15 @@ struct kvm_ppc_resize_hpt {
>>   
>>   #define KVM_S390_SIE_PAGE_OFFSET 1
>>   
>> +/*
>> + * On arm64, machine type can be used to request the physical
>> + * address size for the VM. Bits[7-0] has been reserved for the PA
> s/has been reserved/are?

Thanks for spotting, fixed all the above.

Suzuki

Patch

diff --git a/Documentation/virtual/kvm/api.txt b/Documentation/virtual/kvm/api.txt
index c664064f76fb..f860251ff27c 100644
--- a/Documentation/virtual/kvm/api.txt
+++ b/Documentation/virtual/kvm/api.txt
@@ -122,6 +122,14 @@  the default trap & emulate implementation (which changes the virtual
 memory layout to fit in user mode), check KVM_CAP_MIPS_VZ and use the
 flag KVM_VM_MIPS_VZ.
 
+To configure the physical address space size for a VM (IPA size) on arm64,
+check KVM_CAP_ARM_VM_PHYS_SHIFT (which returns the maximum limit for the
+IPA shift) and use KVM_VM_TYPE_ARM_PHYS_SHIFT(PHYS_SHIFT). Bits[7-0] of the
+machine type has been reserved for specifying the PHYS_SHIFT.
+The supported range is [32...IPA_LIMIT], where IPA_LIMIT could be
+identified by checking KVM_CAP_ARM_VM_PHYS_SHIFT. For backward compatibility
+a value of 0 selects 40bits.
+
 
 4.3 KVM_GET_MSR_INDEX_LIST, KVM_GET_MSR_FEATURE_INDEX_LIST
 
diff --git a/arch/arm64/include/asm/stage2_pgtable.h b/arch/arm64/include/asm/stage2_pgtable.h
index 6a56fdff0823..0b339f5a4a7c 100644
--- a/arch/arm64/include/asm/stage2_pgtable.h
+++ b/arch/arm64/include/asm/stage2_pgtable.h
@@ -42,28 +42,8 @@ 
  * the range (IPA_SHIFT, IPA_SHIFT - 4).
  */
 #define stage2_pgtable_levels(ipa)	ARM64_HW_PGTABLE_LEVELS((ipa) - 4)
-#define STAGE2_PGTABLE_LEVELS		stage2_pgtable_levels(KVM_PHYS_SHIFT)
 #define kvm_stage2_levels(kvm)		VTCR_EL2_LVLS(kvm->arch.vtcr)
 
-/*
- * With all the supported VA_BITs and 40bit guest IPA, the following condition
- * is always true:
- *
- *       STAGE2_PGTABLE_LEVELS <= CONFIG_PGTABLE_LEVELS
- *
- * We base our stage-2 page table walker helpers on this assumption and
- * fall back to using the host version of the helper wherever possible.
- * i.e, if a particular level is not folded (e.g, PUD) at stage2, we fall back
- * to using the host version, since it is guaranteed it is not folded at host.
- *
- * If the condition breaks in the future, we can rearrange the host level
- * definitions and reuse them for stage2. Till then...
- */
-#if STAGE2_PGTABLE_LEVELS > CONFIG_PGTABLE_LEVELS
-#error "Unsupported combination of guest IPA and host VA_BITS."
-#endif
-
-
 /* stage2_pgdir_shift() is the size mapped by top-level stage2 entry for the VM */
 #define stage2_pgdir_shift(kvm)		pt_levels_pgdir_shift(kvm_stage2_levels(kvm))
 #define stage2_pgdir_size(kvm)		(1ULL << stage2_pgdir_shift(kvm))
diff --git a/arch/arm64/kvm/reset.c b/arch/arm64/kvm/reset.c
index 0393bb974b23..c9640159e11f 100644
--- a/arch/arm64/kvm/reset.c
+++ b/arch/arm64/kvm/reset.c
@@ -89,6 +89,9 @@  int kvm_arch_dev_ioctl_check_extension(struct kvm *kvm, long ext)
 	case KVM_CAP_VCPU_EVENTS:
 		r = 1;
 		break;
+	case KVM_CAP_ARM_VM_PHYS_SHIFT:
+		r = kvm_ipa_limit;
+		break;
 	default:
 		r = 0;
 	}
@@ -190,16 +193,25 @@  int kvm_arm_config_vm(struct kvm *kvm, unsigned long type)
 {
 	u64 vtcr = VTCR_EL2_FLAGS;
 	u64 parange;
-	u8 lvls;
+	u8 lvls, ipa_shift;
 
-	if (type)
+	if (type & ~KVM_VM_TYPE_ARM_PHYS_SHIFT_MASK)
 		return -EINVAL;
 
+	ipa_shift = KVM_VM_TYPE_ARM_PHYS_SHIFT(type);
+	if (ipa_shift) {
+		if (ipa_shift > kvm_ipa_limit ||
+		    ipa_shift < 32)
+			return -EINVAL;
+	} else {
+		ipa_shift = KVM_PHYS_SHIFT;
+	}
+
 	/*
 	 * Use a minimum 2 level page table to prevent splitting
 	 * host PMD huge pages at stage2.
 	 */
-	lvls = stage2_pgtable_levels(KVM_PHYS_SHIFT);
+	lvls = stage2_pgtable_levels(ipa_shift);
 	if (lvls < 2)
 		lvls = 2;
 
@@ -221,7 +233,7 @@  int kvm_arm_config_vm(struct kvm *kvm, unsigned long type)
 		VTCR_EL2_VS_8BIT;
 
 	vtcr |= VTCR_EL2_LVLS_TO_SL0(lvls);
-	vtcr |= VTCR_EL2_T0SZ(KVM_PHYS_SHIFT);
+	vtcr |= VTCR_EL2_T0SZ(ipa_shift);
 
 	kvm->arch.vtcr = vtcr;
 	return 0;
diff --git a/include/uapi/linux/kvm.h b/include/uapi/linux/kvm.h
index 07548de5c988..2a6b29c446db 100644
--- a/include/uapi/linux/kvm.h
+++ b/include/uapi/linux/kvm.h
@@ -750,6 +750,15 @@  struct kvm_ppc_resize_hpt {
 
 #define KVM_S390_SIE_PAGE_OFFSET 1
 
+/*
+ * On arm64, machine type can be used to request the physical
+ * address size for the VM. Bits[7-0] has been reserved for the PA
+ * size shift (i.e, log2(PA_Size)). For backward compatibility,
+ * value 0 implies the default IPA size, 40bits.
+ */
+#define KVM_VM_TYPE_ARM_PHYS_SHIFT_MASK	0xffULL
+#define KVM_VM_TYPE_ARM_PHYS_SHIFT(x)		\
+	((x) & KVM_VM_TYPE_ARM_PHYS_SHIFT_MASK)
 /*
  * ioctls for /dev/kvm fds:
  */
@@ -952,6 +961,7 @@  struct kvm_ppc_resize_hpt {
 #define KVM_CAP_S390_HPAGE_1M 156
 #define KVM_CAP_NESTED_STATE 157
 #define KVM_CAP_ARM_INJECT_SERROR_ESR 158
+#define KVM_CAP_ARM_VM_PHYS_SHIFT 159 /* returns maximum PA shift for a VM */
 
 #ifdef KVM_CAP_IRQ_ROUTING
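
The effect of ipa_shift on the stage2 configuration in
kvm_arm_config_vm() can be worked through with a small standalone
sketch. It assumes that ARM64_HW_PGTABLE_LEVELS(va_bits) is
((va_bits) - 4) / (PAGE_SHIFT - 3) and that the T0SZ field holds
64 - ipa_shift, as in the arm64 headers of this period; PAGE_SHIFT is
turned into a parameter so that several page sizes can be compared, and
the function names are reused here only for illustration.

```c
#include <assert.h>

/* Same formula as ARM64_HW_PGTABLE_LEVELS(), with PAGE_SHIFT as a parameter. */
static unsigned int hw_pgtable_levels(unsigned int va_bits,
				      unsigned int page_shift)
{
	return (va_bits - 4) / (page_shift - 3);
}

/*
 * stage2_pgtable_levels(ipa) from the patch, followed by the
 * "minimum 2 levels" clamp applied in kvm_arm_config_vm().
 */
static unsigned int stage2_pgtable_levels(unsigned int ipa_bits,
					  unsigned int page_shift)
{
	unsigned int lvls = hw_pgtable_levels(ipa_bits - 4, page_shift);

	return lvls < 2 ? 2 : lvls;
}

/* Value placed in the T0SZ field: the input address size is 2^(64 - T0SZ). */
static unsigned int stage2_t0sz(unsigned int ipa_bits)
{
	assert(ipa_bits <= 64);
	return 64 - ipa_bits;
}
```

With 4K pages (page_shift 12) this gives 3 levels for the default 40bit
IPA and 4 levels for 48 bits; with 64K pages (page_shift 16) a 52bit IPA
needs 3 levels, and a 32bit IPA is raised to 2 levels by the clamp.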