
[RFC,08/28] arm64: RME: Keep a spare page delegated to the RMM

Message ID 20230127112932.38045-9-steven.price@arm.com (mailing list archive)
State New, archived
Series arm64: Support for Arm CCA in KVM

Commit Message

Steven Price Jan. 27, 2023, 11:29 a.m. UTC
Pages can only be populated/destroyed on the RMM at the 4KB granule;
this requires creating the full depth of RTTs. However, if the pages
are going to be combined into a 2MB huge page the last level RTT is
only temporarily needed. Similarly, when freeing memory the huge page
must be temporarily split, requiring temporary use of the full depth
of RTTs.

To avoid a temporary allocation and delegation of a page for this
purpose, keep a spare delegated page around. In particular this avoids
the need for memory allocation while destroying the realm guest.

Signed-off-by: Steven Price <steven.price@arm.com>
---
 arch/arm64/include/asm/kvm_rme.h | 3 +++
 arch/arm64/kvm/rme.c             | 6 ++++++
 2 files changed, 9 insertions(+)
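
As a minimal sketch for readers of the archive (not part of this
patch), the spare page might be allocated and delegated lazily on
first use, along these lines. ensure_spare_page() is a hypothetical
helper; rmi_granule_delegate() is assumed to be the granule delegation
wrapper defined earlier in this series.

static int ensure_spare_page(struct realm *realm)
{
	phys_addr_t phys;
	void *virt;

	/* Nothing to do if a delegated spare page already exists */
	if (realm->spare_page != PHYS_ADDR_MAX)
		return 0;

	virt = (void *)__get_free_page(GFP_KERNEL);
	if (!virt)
		return -ENOMEM;

	phys = virt_to_phys(virt);
	if (rmi_granule_delegate(phys)) {
		/* Delegation failed: the page still belongs to the host */
		free_page((unsigned long)virt);
		return -ENXIO;
	}

	realm->spare_page = phys;
	return 0;
}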

Comments

Zhi Wang Feb. 13, 2023, 4:47 p.m. UTC | #1
On Fri, 27 Jan 2023 11:29:12 +0000
Steven Price <steven.price@arm.com> wrote:

> Pages can only be populated/destroyed on the RMM at the 4KB granule;
> this requires creating the full depth of RTTs. However, if the pages
> are going to be combined into a 2MB huge page the last level RTT is
> only temporarily needed. Similarly, when freeing memory the huge page
> must be temporarily split, requiring temporary use of the full depth
> of RTTs.
> 
> To avoid a temporary allocation and delegation of a page for this
> purpose, keep a spare delegated page around. In particular this avoids
> the need for memory allocation while destroying the realm guest.
> 
> Signed-off-by: Steven Price <steven.price@arm.com>
> ---
>  arch/arm64/include/asm/kvm_rme.h | 3 +++
>  arch/arm64/kvm/rme.c             | 6 ++++++
>  2 files changed, 9 insertions(+)
> 
> diff --git a/arch/arm64/include/asm/kvm_rme.h b/arch/arm64/include/asm/kvm_rme.h
> index 055a22accc08..a6318af3ed11 100644
> --- a/arch/arm64/include/asm/kvm_rme.h
> +++ b/arch/arm64/include/asm/kvm_rme.h
> @@ -21,6 +21,9 @@ struct realm {
>  	void *rd;
>  	struct realm_params *params;
>  
> +	/* A spare already delegated page */
> +	phys_addr_t spare_page;
> +
>  	unsigned long num_aux;
>  	unsigned int vmid;
>  	unsigned int ia_bits;
> diff --git a/arch/arm64/kvm/rme.c b/arch/arm64/kvm/rme.c
> index 9f8c5a91b8fc..0c9d70e4d9e6 100644
> --- a/arch/arm64/kvm/rme.c
> +++ b/arch/arm64/kvm/rme.c
> @@ -148,6 +148,7 @@ static int realm_create_rd(struct kvm *kvm)
>  	}
>  
>  	realm->rd = rd;
> +	realm->spare_page = PHYS_ADDR_MAX;
>  	realm->ia_bits = VTCR_EL2_IPA(kvm->arch.vtcr);
>  
>  	if (WARN_ON(rmi_rec_aux_count(rd_phys, &realm->num_aux))) {
> @@ -357,6 +358,11 @@ void kvm_destroy_realm(struct kvm *kvm)
>  		free_page((unsigned long)realm->rd);
>  		realm->rd = NULL;
>  	}
> +	if (realm->spare_page != PHYS_ADDR_MAX) {
> +		if (!WARN_ON(rmi_granule_undelegate(realm->spare_page)))
> +			free_page((unsigned long)phys_to_virt(realm->spare_page));

Will the page be leaked (not usable by the host or realms) if the
undelegate fails? If so, it would be better to at least add a comment.

> +		realm->spare_page = PHYS_ADDR_MAX;
> +	}
>  
>  	pgd_sz = kvm_pgd_pages(pgt->ia_bits, pgt->start_level);
>  	for (i = 0; i < pgd_sz; i++) {
Steven Price March 1, 2023, 11:55 a.m. UTC | #2
On 13/02/2023 16:47, Zhi Wang wrote:
> On Fri, 27 Jan 2023 11:29:12 +0000
> Steven Price <steven.price@arm.com> wrote:
> 
>> Pages can only be populated/destroyed on the RMM at the 4KB granule;
>> this requires creating the full depth of RTTs. However, if the pages
>> are going to be combined into a 2MB huge page the last level RTT is
>> only temporarily needed. Similarly, when freeing memory the huge page
>> must be temporarily split, requiring temporary use of the full depth
>> of RTTs.
>>
>> To avoid a temporary allocation and delegation of a page for this
>> purpose, keep a spare delegated page around. In particular this avoids
>> the need for memory allocation while destroying the realm guest.
>>
>> Signed-off-by: Steven Price <steven.price@arm.com>
>> ---
>>  arch/arm64/include/asm/kvm_rme.h | 3 +++
>>  arch/arm64/kvm/rme.c             | 6 ++++++
>>  2 files changed, 9 insertions(+)
>>
>> diff --git a/arch/arm64/include/asm/kvm_rme.h b/arch/arm64/include/asm/kvm_rme.h
>> index 055a22accc08..a6318af3ed11 100644
>> --- a/arch/arm64/include/asm/kvm_rme.h
>> +++ b/arch/arm64/include/asm/kvm_rme.h
>> @@ -21,6 +21,9 @@ struct realm {
>>  	void *rd;
>>  	struct realm_params *params;
>>  
>> +	/* A spare already delegated page */
>> +	phys_addr_t spare_page;
>> +
>>  	unsigned long num_aux;
>>  	unsigned int vmid;
>>  	unsigned int ia_bits;
>> diff --git a/arch/arm64/kvm/rme.c b/arch/arm64/kvm/rme.c
>> index 9f8c5a91b8fc..0c9d70e4d9e6 100644
>> --- a/arch/arm64/kvm/rme.c
>> +++ b/arch/arm64/kvm/rme.c
>> @@ -148,6 +148,7 @@ static int realm_create_rd(struct kvm *kvm)
>>  	}
>>  
>>  	realm->rd = rd;
>> +	realm->spare_page = PHYS_ADDR_MAX;
>>  	realm->ia_bits = VTCR_EL2_IPA(kvm->arch.vtcr);
>>  
>>  	if (WARN_ON(rmi_rec_aux_count(rd_phys, &realm->num_aux))) {
>> @@ -357,6 +358,11 @@ void kvm_destroy_realm(struct kvm *kvm)
>>  		free_page((unsigned long)realm->rd);
>>  		realm->rd = NULL;
>>  	}
>> +	if (realm->spare_page != PHYS_ADDR_MAX) {
>> +		if (!WARN_ON(rmi_granule_undelegate(realm->spare_page)))
>> +			free_page((unsigned long)phys_to_virt(realm->spare_page));
> 
> Will the page be leaked (not usable by the host or realms) if the
> undelegate fails? If so, it would be better to at least add a comment.

Yes - I'll add a comment.

In general, being unable to undelegate a page points to a programming
error in the host. The only reason the RMM should refuse the request
is if the page is in use by a Realm which the host has configured. So
the WARN_ON() is correct (there's a kernel bug) and the only sensible
course of action is to leak the page and limp on.

Thanks,

Steve

>> +		realm->spare_page = PHYS_ADDR_MAX;
>> +	}
>>  
>>  	pgd_sz = kvm_pgd_pages(pgt->ia_bits, pgt->start_level);
>>  	for (i = 0; i < pgd_sz; i++) {
>
Zhi Wang March 1, 2023, 8:50 p.m. UTC | #3
On Wed, 1 Mar 2023 11:55:37 +0000
Steven Price <steven.price@arm.com> wrote:

> On 13/02/2023 16:47, Zhi Wang wrote:
> > On Fri, 27 Jan 2023 11:29:12 +0000
> > Steven Price <steven.price@arm.com> wrote:
> > 
> >> Pages can only be populated/destroyed on the RMM at the 4KB granule;
> >> this requires creating the full depth of RTTs. However, if the pages
> >> are going to be combined into a 2MB huge page the last level RTT is
> >> only temporarily needed. Similarly, when freeing memory the huge page
> >> must be temporarily split, requiring temporary use of the full depth
> >> of RTTs.
> >>
> >> To avoid a temporary allocation and delegation of a page for this
> >> purpose, keep a spare delegated page around. In particular this avoids
> >> the need for memory allocation while destroying the realm guest.
> >>
> >> Signed-off-by: Steven Price <steven.price@arm.com>
> >> ---
> >>  arch/arm64/include/asm/kvm_rme.h | 3 +++
> >>  arch/arm64/kvm/rme.c             | 6 ++++++
> >>  2 files changed, 9 insertions(+)
> >>
> >> diff --git a/arch/arm64/include/asm/kvm_rme.h b/arch/arm64/include/asm/kvm_rme.h
> >> index 055a22accc08..a6318af3ed11 100644
> >> --- a/arch/arm64/include/asm/kvm_rme.h
> >> +++ b/arch/arm64/include/asm/kvm_rme.h
> >> @@ -21,6 +21,9 @@ struct realm {
> >>  	void *rd;
> >>  	struct realm_params *params;
> >>  
> >> +	/* A spare already delegated page */
> >> +	phys_addr_t spare_page;
> >> +
> >>  	unsigned long num_aux;
> >>  	unsigned int vmid;
> >>  	unsigned int ia_bits;
> >> diff --git a/arch/arm64/kvm/rme.c b/arch/arm64/kvm/rme.c
> >> index 9f8c5a91b8fc..0c9d70e4d9e6 100644
> >> --- a/arch/arm64/kvm/rme.c
> >> +++ b/arch/arm64/kvm/rme.c
> >> @@ -148,6 +148,7 @@ static int realm_create_rd(struct kvm *kvm)
> >>  	}
> >>  
> >>  	realm->rd = rd;
> >> +	realm->spare_page = PHYS_ADDR_MAX;
> >>  	realm->ia_bits = VTCR_EL2_IPA(kvm->arch.vtcr);
> >>  
> >>  	if (WARN_ON(rmi_rec_aux_count(rd_phys, &realm->num_aux))) {
> >> @@ -357,6 +358,11 @@ void kvm_destroy_realm(struct kvm *kvm)
> >>  		free_page((unsigned long)realm->rd);
> >>  		realm->rd = NULL;
> >>  	}
> >> +	if (realm->spare_page != PHYS_ADDR_MAX) {
> >> +		if (!WARN_ON(rmi_granule_undelegate(realm->spare_page)))
> >> +			free_page((unsigned long)phys_to_virt(realm->spare_page));
> > 
> > Will the page be leaked (not usable by the host or realms) if the
> > undelegate fails? If so, it would be better to at least add a comment.
> 
> Yes - I'll add a comment.
> 
> In general, being unable to undelegate a page points to a programming
> error in the host. The only reason the RMM should refuse the request
> is if the page is in use by a Realm which the host has configured. So
> the WARN_ON() is correct (there's a kernel bug) and the only sensible
> course of action is to leak the page and limp on.
>

It would be nice to add a summary of the above to the patch description.

Having a comment when leaking a page (which in practice means the page
can no longer be reclaimed by the VMM or used for a Realm) is nice.
TDX/SNP also have the problem of leaking pages for mysterious reasons.

Such leaking can get worse bit by bit on a long-running server, and
KVM will surely need a generic accounting interface for reporting the
numbers to userspace later. Having an explicit comment at this time
really makes that easier.
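
A purely illustrative sketch of such accounting, using a hypothetical
per-realm counter (nr_leaked_granules does not exist anywhere in this
series):

	if (realm->spare_page != PHYS_ADDR_MAX) {
		if (WARN_ON(rmi_granule_undelegate(realm->spare_page)))
			/* Hypothetical counter for a future stats interface */
			realm->nr_leaked_granules++;
		else
			free_page((unsigned long)phys_to_virt(realm->spare_page));
		realm->spare_page = PHYS_ADDR_MAX;
	}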
 
> Thanks,
> 
> Steve
> 
> >> +		realm->spare_page = PHYS_ADDR_MAX;
> >> +	}
> >>  
> >>  	pgd_sz = kvm_pgd_pages(pgt->ia_bits, pgt->start_level);
> >>  	for (i = 0; i < pgd_sz; i++) {
> > 
>

Patch

diff --git a/arch/arm64/include/asm/kvm_rme.h b/arch/arm64/include/asm/kvm_rme.h
index 055a22accc08..a6318af3ed11 100644
--- a/arch/arm64/include/asm/kvm_rme.h
+++ b/arch/arm64/include/asm/kvm_rme.h
@@ -21,6 +21,9 @@  struct realm {
 	void *rd;
 	struct realm_params *params;
 
+	/* A spare already delegated page */
+	phys_addr_t spare_page;
+
 	unsigned long num_aux;
 	unsigned int vmid;
 	unsigned int ia_bits;
diff --git a/arch/arm64/kvm/rme.c b/arch/arm64/kvm/rme.c
index 9f8c5a91b8fc..0c9d70e4d9e6 100644
--- a/arch/arm64/kvm/rme.c
+++ b/arch/arm64/kvm/rme.c
@@ -148,6 +148,7 @@  static int realm_create_rd(struct kvm *kvm)
 	}
 
 	realm->rd = rd;
+	realm->spare_page = PHYS_ADDR_MAX;
 	realm->ia_bits = VTCR_EL2_IPA(kvm->arch.vtcr);
 
 	if (WARN_ON(rmi_rec_aux_count(rd_phys, &realm->num_aux))) {
@@ -357,6 +358,11 @@  void kvm_destroy_realm(struct kvm *kvm)
 		free_page((unsigned long)realm->rd);
 		realm->rd = NULL;
 	}
+	if (realm->spare_page != PHYS_ADDR_MAX) {
+		if (!WARN_ON(rmi_granule_undelegate(realm->spare_page)))
+			free_page((unsigned long)phys_to_virt(realm->spare_page));
+		realm->spare_page = PHYS_ADDR_MAX;
+	}
 
 	pgd_sz = kvm_pgd_pages(pgt->ia_bits, pgt->start_level);
 	for (i = 0; i < pgd_sz; i++) {
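
For reference, the cleanup hunk with the comment Steven agreed to add
could read as follows; the wording is illustrative and not taken from
a posted revision.

	if (realm->spare_page != PHYS_ADDR_MAX) {
		/*
		 * Undelegation should only fail if there is a host bug,
		 * e.g. the page is still in use by a Realm the host has
		 * configured. In that case the granule stays delegated
		 * to the RMM and is deliberately leaked: neither the
		 * host nor any realm can use it again.
		 */
		if (!WARN_ON(rmi_granule_undelegate(realm->spare_page)))
			free_page((unsigned long)phys_to_virt(realm->spare_page));
		realm->spare_page = PHYS_ADDR_MAX;
	}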