
[v6,29/43] arm64: RME: Always use 4k pages for realms

Message ID 20241212155610.76522-30-steven.price@arm.com
State New, archived
Series arm64: Support for Arm CCA in KVM

Commit Message

Steven Price Dec. 12, 2024, 3:55 p.m. UTC
Always split up huge pages to avoid the problems of managing huge
mappings. There are currently two issues:

1. The uABI for the VMM allows populating memory on 4k boundaries even
   if the underlying allocator (e.g. hugetlbfs) is using a larger page
   size. Using a memfd for private allocations will push this issue onto
   the VMM, as it will need to respect the granularity of the allocator.

2. The guest is able to request arbitrary ranges to be remapped as
   shared. Again, with a memfd approach it will be up to the VMM to deal
   with the complexity and either overmap (keep the huge mapping and add
   an additional 'overlapping' shared mapping) or reject the request as
   invalid due to the use of a huge page allocator.

For now, just break everything down to 4k pages in the RMM-controlled
stage 2.

Signed-off-by: Steven Price <steven.price@arm.com>
---
 arch/arm64/kvm/mmu.c | 4 ++++
 1 file changed, 4 insertions(+)
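
To make the second issue above concrete, here is a hypothetical VMM-side
sketch of the choice it describes. None of these names exist in KVM or in
any real VMM; they are placeholders for whatever per-region bookkeeping a
VMM actually keeps.

/*
 * Hypothetical sketch, illustrating point 2 of the commit message.
 * All names here are made up for illustration.
 */
#include <stdint.h>

struct region {
	uint64_t ipa_base;		/* guest IPA of the region */
	uint64_t size;
	uint64_t backing_page_size;	/* e.g. 2MB when hugetlbfs-backed */
};

enum share_result { SHARE_DONE, SHARE_OVERMAPPED, SHARE_REJECTED };

/* The guest asked for [ipa, ipa + len) to become shared. */
static enum share_result handle_share(const struct region *r,
				      uint64_t ipa, uint64_t len)
{
	uint64_t mask = r->backing_page_size - 1;

	/* Whole backing pages: the mapping can simply be converted. */
	if (!(ipa & mask) && !(len & mask))
		return SHARE_DONE;

	/*
	 * Sub-page request: either keep the huge private mapping and add
	 * an overlapping 4k shared alias ("overmap"), or refuse because a
	 * huge page allocator cannot hand out part of a page.
	 */
	return SHARE_OVERMAPPED;	/* or SHARE_REJECTED, by policy */
}

Breaking everything down to 4k granules in the stage 2 sidesteps this
choice inside KVM itself: any guest-requested range can be transitioned
without reference to the backing page size.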

Comments

Gavin Shan Feb. 2, 2025, 6:52 a.m. UTC | #1
On 12/13/24 1:55 AM, Steven Price wrote:
[...]
> diff --git a/arch/arm64/kvm/mmu.c b/arch/arm64/kvm/mmu.c
> index e88714903ce5..9ede143ccef1 100644
> --- a/arch/arm64/kvm/mmu.c
> +++ b/arch/arm64/kvm/mmu.c
> @@ -1603,6 +1603,10 @@ static int user_mem_abort(struct kvm_vcpu *vcpu, phys_addr_t fault_ipa,
>   	if (logging_active) {
>   		force_pte = true;
>   		vma_shift = PAGE_SHIFT;
> +	} else if (kvm_is_realm(kvm)) {
> +		// Force PTE level mappings for realms
> +		force_pte = true;
> +		vma_shift = PAGE_SHIFT;
>   	} else {
>   		vma_shift = get_vma_page_shift(vma, hva);
>   	}

Since a memory abort is specific to a vCPU rather than a VM, vcpu_is_rec()
instead of kvm_is_realm() is more accurate for the check. Besides, it looks
like a duplicate of the check added by "PATCH[20/43] arm64: RME: Runtime
faulting of memory", which is as below.

        /* FIXME: We shouldn't need to disable this for realms */
        if (vma_pagesize == PAGE_SIZE && !(force_pte || device || kvm_is_realm(kvm))) {
                                                                  ^^^^^^^^^^^^^^^^^
                                                                  Can be dropped now.

Thanks,
Gavin
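
Taken together, Gavin's two suggestions would leave the code looking
roughly like the sketch below. This is an illustration only, not a
revision that was posted in this thread:

	/* Sketch only: vcpu_is_rec() substituted for kvm_is_realm() */
	if (logging_active) {
		force_pte = true;
		vma_shift = PAGE_SHIFT;
	} else if (vcpu_is_rec(vcpu)) {
		/* Force PTE level mappings for realms */
		force_pte = true;
		vma_shift = PAGE_SHIFT;
	} else {
		vma_shift = get_vma_page_shift(vma, hva);
	}

	/*
	 * ...and, since force_pte is now always set for realms, the
	 * FIXME'd check loses its kvm_is_realm() term:
	 */
	if (vma_pagesize == PAGE_SIZE && !(force_pte || device)) {
		/* transparent hugepage adjustment, as before */
	}
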
Steven Price Feb. 7, 2025, 5:05 p.m. UTC | #2
On 02/02/2025 06:52, Gavin Shan wrote:
[...]
> 
> Since a memory abort is specific to a vCPU rather than a VM,
> vcpu_is_rec() instead of kvm_is_realm() is more accurate for the
> check. Besides, it looks like a duplicate of the check added by
> "PATCH[20/43] arm64: RME: Runtime faulting of memory", which is as
> below.
> 
>        /* FIXME: We shouldn't need to disable this for realms */
>        if (vma_pagesize == PAGE_SIZE && !(force_pte || device || kvm_is_realm(kvm))) {
>                                                                  ^^^^^^^^^^^^^^^^^
>                                                                  Can be dropped now.

Indeed, thanks for that - one less FIXME ;)

Thanks,
Steve


Patch

diff --git a/arch/arm64/kvm/mmu.c b/arch/arm64/kvm/mmu.c
index e88714903ce5..9ede143ccef1 100644
--- a/arch/arm64/kvm/mmu.c
+++ b/arch/arm64/kvm/mmu.c
@@ -1603,6 +1603,10 @@ static int user_mem_abort(struct kvm_vcpu *vcpu, phys_addr_t fault_ipa,
 	if (logging_active) {
 		force_pte = true;
 		vma_shift = PAGE_SHIFT;
+	} else if (kvm_is_realm(kvm)) {
+		// Force PTE level mappings for realms
+		force_pte = true;
+		vma_shift = PAGE_SHIFT;
 	} else {
 		vma_shift = get_vma_page_shift(vma, hva);
 	}
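
For context, a simplified (not verbatim) view of why the two assignments
above are enough, from slightly further down user_mem_abort():

	vma_pagesize = 1UL << vma_shift;	/* PAGE_SHIFT -> 4k pages */

	/*
	 * force_pte suppresses the transparent hugepage upgrade, so the
	 * mapping stays at PTE level and the RMM-controlled stage 2 only
	 * ever sees 4k granules.
	 */
	if (vma_pagesize == PAGE_SIZE && !(force_pte || device)) {
		/* may otherwise upgrade vma_pagesize to a THP mapping */
	}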