diff mbox series

[v5,17/19] irqchip/gic-v3-its: Share ITS tables with a non-trusted hypervisor

Message ID 20240819131924.372366-18-steven.price@arm.com (mailing list archive)
State New
Headers show
Series arm64: Support for running as a guest in Arm CCA | expand

Commit Message

Steven Price Aug. 19, 2024, 1:19 p.m. UTC
Within a realm guest the ITS is emulated by the host. This means the
allocations must have been made available to the host by a call to
set_memory_decrypted(). Introduce an allocation function which performs
this extra call.

For the ITT use a custom genpool-based allocator that calls
set_memory_decrypted() for each page allocated, but then suballocates
the size needed for each ITT. Note that there is no mechanism
implemented to return pages from the genpool, but it is unlikely the
peak number of devices will so much larger than the normal level - so
this isn't expected to be an issue.

Co-developed-by: Suzuki K Poulose <suzuki.poulose@arm.com>
Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>
Tested-by: Will Deacon <will@kernel.org>
Signed-off-by: Steven Price <steven.price@arm.com>
---
Changes since v3:
 * Use BIT() macro.
 * Use a genpool based allocator in its_create_device() to avoid
   allocating a full page.
 * Fix subject to drop "realm" and use gic-v3-its.
 * Add error handling to ITS alloc/free.
Changes since v2:
 * Drop 'shared' from the new its_xxx function names as they are used
   for non-realm guests too.
 * Don't handle the NUMA_NO_NODE case specially - alloc_pages_node()
   should do the right thing.
 * Drop a pointless (void *) cast.
---
 drivers/irqchip/irq-gic-v3-its.c | 139 ++++++++++++++++++++++++++-----
 1 file changed, 116 insertions(+), 23 deletions(-)

Comments

Marc Zyngier Aug. 19, 2024, 2:27 p.m. UTC | #1
On Mon, 19 Aug 2024 14:19:22 +0100,
Steven Price <steven.price@arm.com> wrote:
> 
> Within a realm guest the ITS is emulated by the host. This means the
> allocations must have been made available to the host by a call to
> set_memory_decrypted(). Introduce an allocation function which performs
> this extra call.
> 
> For the ITT use a custom genpool-based allocator that calls
> set_memory_decrypted() for each page allocated, but then suballocates
> the size needed for each ITT. Note that there is no mechanism
> implemented to return pages from the genpool, but it is unlikely the
> peak number of devices will so much larger than the normal level - so
> this isn't expected to be an issue.
> 
> Co-developed-by: Suzuki K Poulose <suzuki.poulose@arm.com>
> Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>
> Tested-by: Will Deacon <will@kernel.org>
> Signed-off-by: Steven Price <steven.price@arm.com>
> ---
> Changes since v3:
>  * Use BIT() macro.
>  * Use a genpool based allocator in its_create_device() to avoid
>    allocating a full page.
>  * Fix subject to drop "realm" and use gic-v3-its.
>  * Add error handling to ITS alloc/free.
> Changes since v2:
>  * Drop 'shared' from the new its_xxx function names as they are used
>    for non-realm guests too.
>  * Don't handle the NUMA_NO_NODE case specially - alloc_pages_node()
>    should do the right thing.
>  * Drop a pointless (void *) cast.
> ---
>  drivers/irqchip/irq-gic-v3-its.c | 139 ++++++++++++++++++++++++++-----
>  1 file changed, 116 insertions(+), 23 deletions(-)

I think this patch and the next are pretty ripe, and shouldn't have to
wait much longer.

Can you please send them as a separate irqchip series, with the
relevant people on Cc (realistically, tglx and me), with a

Reviewed-by: Marc Zyngier <maz@kernel.org>

added to them?

Thanks,

	M.
Suzuki K Poulose Aug. 19, 2024, 2:51 p.m. UTC | #2
Hi Steven,

On 19/08/2024 14:19, Steven Price wrote:
> Within a realm guest the ITS is emulated by the host. This means the
> allocations must have been made available to the host by a call to
> set_memory_decrypted(). Introduce an allocation function which performs
> this extra call.
> 
> For the ITT use a custom genpool-based allocator that calls
> set_memory_decrypted() for each page allocated, but then suballocates
> the size needed for each ITT. Note that there is no mechanism
> implemented to return pages from the genpool, but it is unlikely the
> peak number of devices will so much larger than the normal level - so
> this isn't expected to be an issue.
> 

This may not be sufficient to make it future proof. We need to detect if
the GIC is private vs shared, before we make the allocation choice. 
Please see below :

> Co-developed-by: Suzuki K Poulose <suzuki.poulose@arm.com>
> Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>
> Tested-by: Will Deacon <will@kernel.org>
> Signed-off-by: Steven Price <steven.price@arm.com>
> ---
> Changes since v3:
>   * Use BIT() macro.
>   * Use a genpool based allocator in its_create_device() to avoid
>     allocating a full page.
>   * Fix subject to drop "realm" and use gic-v3-its.
>   * Add error handling to ITS alloc/free.
> Changes since v2:
>   * Drop 'shared' from the new its_xxx function names as they are used
>     for non-realm guests too.
>   * Don't handle the NUMA_NO_NODE case specially - alloc_pages_node()
>     should do the right thing.
>   * Drop a pointless (void *) cast.
> ---
>   drivers/irqchip/irq-gic-v3-its.c | 139 ++++++++++++++++++++++++++-----
>   1 file changed, 116 insertions(+), 23 deletions(-)
> 
> diff --git a/drivers/irqchip/irq-gic-v3-its.c b/drivers/irqchip/irq-gic-v3-its.c
> index 9b34596b3542..557214c774c3 100644
> --- a/drivers/irqchip/irq-gic-v3-its.c
> +++ b/drivers/irqchip/irq-gic-v3-its.c
> @@ -12,12 +12,14 @@
>   #include <linux/crash_dump.h>
>   #include <linux/delay.h>
>   #include <linux/efi.h>
> +#include <linux/genalloc.h>
>   #include <linux/interrupt.h>
>   #include <linux/iommu.h>
>   #include <linux/iopoll.h>
>   #include <linux/irqdomain.h>
>   #include <linux/list.h>
>   #include <linux/log2.h>
> +#include <linux/mem_encrypt.h>
>   #include <linux/memblock.h>
>   #include <linux/mm.h>
>   #include <linux/msi.h>
> @@ -27,6 +29,7 @@
>   #include <linux/of_pci.h>
>   #include <linux/of_platform.h>
>   #include <linux/percpu.h>
> +#include <linux/set_memory.h>
>   #include <linux/slab.h>
>   #include <linux/syscore_ops.h>
>   
> @@ -164,6 +167,7 @@ struct its_device {
>   	struct its_node		*its;
>   	struct event_lpi_map	event_map;
>   	void			*itt;
> +	u32			itt_sz;
>   	u32			nr_ites;
>   	u32			device_id;
>   	bool			shared;
> @@ -199,6 +203,81 @@ static DEFINE_IDA(its_vpeid_ida);
>   #define gic_data_rdist_rd_base()	(gic_data_rdist()->rd_base)
>   #define gic_data_rdist_vlpi_base()	(gic_data_rdist_rd_base() + SZ_128K)
>   
> +static struct page *its_alloc_pages_node(int node, gfp_t gfp,
> +					 unsigned int order)
> +{
> +	struct page *page;
> +	int ret = 0;
> +
> +	page = alloc_pages_node(node, gfp, order);
> +
> +	if (!page)
> +		return NULL;
> +
> +	ret = set_memory_decrypted((unsigned long)page_address(page),
> +				   1 << order);
> +	if (WARN_ON(ret))
> +		return NULL;
> +
> +	return page;
> +}
> +
> +static struct page *its_alloc_pages(gfp_t gfp, unsigned int order)
> +{
> +	return its_alloc_pages_node(NUMA_NO_NODE, gfp, order);
> +}
> +
> +static void its_free_pages(void *addr, unsigned int order)
> +{
> +	if (WARN_ON(set_memory_encrypted((unsigned long)addr, 1 << order)))
> +		return;
> +	free_pages((unsigned long)addr, order);
> +}
> +
> +static struct gen_pool *itt_pool;
> +
> +static void *itt_alloc_pool(int node, int size)
> +{
> +	unsigned long addr;
> +	struct page *page;
> +
> +	if (size >= PAGE_SIZE) {
> +		page = its_alloc_pages_node(node,
> +					    GFP_KERNEL | __GFP_ZERO,
> +					    get_order(size));
> +
> +		return page_address(page);
> +	}
> +
> +	do {
> +		addr = gen_pool_alloc(itt_pool, size);
> +		if (addr)
> +			break;
> +
> +		page = its_alloc_pages_node(node, GFP_KERNEL | __GFP_ZERO, 1);
> +		if (!page)
> +			break;
> +
> +		gen_pool_add(itt_pool, (unsigned long)page_address(page),
> +			     PAGE_SIZE, node);
> +	} while (!addr);
> +
> +	return (void *)addr;
> +}
> +
> +static void itt_free_pool(void *addr, int size)
> +{
> +	if (!addr)
> +		return;
> +
> +	if (size >= PAGE_SIZE) {
> +		its_free_pages(addr, get_order(size));
> +		return;
> +	}
> +
> +	gen_pool_free(itt_pool, (unsigned long)addr, size);
> +}
> +
>   /*
>    * Skip ITSs that have no vLPIs mapped, unless we're on GICv4.1, as we
>    * always have vSGIs mapped.
> @@ -2187,7 +2266,8 @@ static struct page *its_allocate_prop_table(gfp_t gfp_flags)
>   {
>   	struct page *prop_page;
>   
> -	prop_page = alloc_pages(gfp_flags, get_order(LPI_PROPBASE_SZ));
> +	prop_page = its_alloc_pages(gfp_flags,
> +				    get_order(LPI_PROPBASE_SZ));
>   	if (!prop_page)
>   		return NULL;
>   
> @@ -2198,8 +2278,8 @@ static struct page *its_allocate_prop_table(gfp_t gfp_flags)
>   
>   static void its_free_prop_table(struct page *prop_page)
>   {
> -	free_pages((unsigned long)page_address(prop_page),
> -		   get_order(LPI_PROPBASE_SZ));
> +	its_free_pages(page_address(prop_page),
> +		       get_order(LPI_PROPBASE_SZ));
>   }
>   
>   static bool gic_check_reserved_range(phys_addr_t addr, unsigned long size)
> @@ -2321,7 +2401,8 @@ static int its_setup_baser(struct its_node *its, struct its_baser *baser,
>   		order = get_order(GITS_BASER_PAGES_MAX * psz);
>   	}
>   
> -	page = alloc_pages_node(its->numa_node, GFP_KERNEL | __GFP_ZERO, order);
> +	page = its_alloc_pages_node(its->numa_node,
> +				    GFP_KERNEL | __GFP_ZERO, order);
>   	if (!page)
>   		return -ENOMEM;
>   
> @@ -2334,7 +2415,7 @@ static int its_setup_baser(struct its_node *its, struct its_baser *baser,
>   		/* 52bit PA is supported only when PageSize=64K */
>   		if (psz != SZ_64K) {
>   			pr_err("ITS: no 52bit PA support when psz=%d\n", psz);
> -			free_pages((unsigned long)base, order);
> +			its_free_pages(base, order);
>   			return -ENXIO;
>   		}
>   
> @@ -2390,7 +2471,7 @@ static int its_setup_baser(struct its_node *its, struct its_baser *baser,
>   		pr_err("ITS@%pa: %s doesn't stick: %llx %llx\n",
>   		       &its->phys_base, its_base_type_string[type],
>   		       val, tmp);
> -		free_pages((unsigned long)base, order);
> +		its_free_pages(base, order);
>   		return -ENXIO;
>   	}
>   
> @@ -2529,8 +2610,8 @@ static void its_free_tables(struct its_node *its)
>   
>   	for (i = 0; i < GITS_BASER_NR_REGS; i++) {
>   		if (its->tables[i].base) {
> -			free_pages((unsigned long)its->tables[i].base,
> -				   its->tables[i].order);
> +			its_free_pages(its->tables[i].base,
> +				       its->tables[i].order);
>   			its->tables[i].base = NULL;
>   		}
>   	}
> @@ -2796,7 +2877,8 @@ static bool allocate_vpe_l2_table(int cpu, u32 id)
>   
>   	/* Allocate memory for 2nd level table */
>   	if (!table[idx]) {
> -		page = alloc_pages(GFP_KERNEL | __GFP_ZERO, get_order(psz));
> +		page = its_alloc_pages(GFP_KERNEL | __GFP_ZERO,
> +				       get_order(psz));
>   		if (!page)
>   			return false;
>   
> @@ -2915,7 +2997,8 @@ static int allocate_vpe_l1_table(void)
>   
>   	pr_debug("np = %d, npg = %lld, psz = %d, epp = %d, esz = %d\n",
>   		 np, npg, psz, epp, esz);
> -	page = alloc_pages(GFP_ATOMIC | __GFP_ZERO, get_order(np * PAGE_SIZE));
> +	page = its_alloc_pages(GFP_ATOMIC | __GFP_ZERO,
> +			       get_order(np * PAGE_SIZE));
>   	if (!page)
>   		return -ENOMEM;
>   
> @@ -2961,8 +3044,8 @@ static struct page *its_allocate_pending_table(gfp_t gfp_flags)
>   {
>   	struct page *pend_page;
>   
> -	pend_page = alloc_pages(gfp_flags | __GFP_ZERO,
> -				get_order(LPI_PENDBASE_SZ));
> +	pend_page = its_alloc_pages(gfp_flags | __GFP_ZERO,
> +				    get_order(LPI_PENDBASE_SZ));
>   	if (!pend_page)
>   		return NULL;
>   
> @@ -2974,7 +3057,7 @@ static struct page *its_allocate_pending_table(gfp_t gfp_flags)
>   
>   static void its_free_pending_table(struct page *pt)
>   {
> -	free_pages((unsigned long)page_address(pt), get_order(LPI_PENDBASE_SZ));
> +	its_free_pages(page_address(pt), get_order(LPI_PENDBASE_SZ));
>   }
>   
>   /*
> @@ -3309,8 +3392,9 @@ static bool its_alloc_table_entry(struct its_node *its,
>   
>   	/* Allocate memory for 2nd level table */
>   	if (!table[idx]) {
> -		page = alloc_pages_node(its->numa_node, GFP_KERNEL | __GFP_ZERO,
> -					get_order(baser->psz));
> +		page = its_alloc_pages_node(its->numa_node,
> +					    GFP_KERNEL | __GFP_ZERO,
> +					    get_order(baser->psz));
>   		if (!page)
>   			return false;
>   
> @@ -3405,7 +3489,6 @@ static struct its_device *its_create_device(struct its_node *its, u32 dev_id,
>   	if (WARN_ON(!is_power_of_2(nvecs)))
>   		nvecs = roundup_pow_of_two(nvecs);
>   
> -	dev = kzalloc(sizeof(*dev), GFP_KERNEL);
>   	/*
>   	 * Even if the device wants a single LPI, the ITT must be
>   	 * sized as a power of two (and you need at least one bit...).
> @@ -3413,7 +3496,11 @@ static struct its_device *its_create_device(struct its_node *its, u32 dev_id,
>   	nr_ites = max(2, nvecs);
>   	sz = nr_ites * (FIELD_GET(GITS_TYPER_ITT_ENTRY_SIZE, its->typer) + 1);
>   	sz = max(sz, ITS_ITT_ALIGN) + ITS_ITT_ALIGN - 1;
> -	itt = kzalloc_node(sz, GFP_KERNEL, its->numa_node);
> +
> +	itt = itt_alloc_pool(its->numa_node, sz);
> +
> +	dev = kzalloc(sizeof(*dev), GFP_KERNEL);
> +
>   	if (alloc_lpis) {
>   		lpi_map = its_lpi_alloc(nvecs, &lpi_base, &nr_lpis);
>   		if (lpi_map)
> @@ -3425,9 +3512,9 @@ static struct its_device *its_create_device(struct its_node *its, u32 dev_id,
>   		lpi_base = 0;
>   	}
>   
> -	if (!dev || !itt ||  !col_map || (!lpi_map && alloc_lpis)) {
> +	if (!dev || !itt || !col_map || (!lpi_map && alloc_lpis)) {
>   		kfree(dev);
> -		kfree(itt);
> +		itt_free_pool(itt, sz);
>   		bitmap_free(lpi_map);
>   		kfree(col_map);
>   		return NULL;
> @@ -3437,6 +3524,7 @@ static struct its_device *its_create_device(struct its_node *its, u32 dev_id,
>   
>   	dev->its = its;
>   	dev->itt = itt;
> +	dev->itt_sz = sz;
>   	dev->nr_ites = nr_ites;
>   	dev->event_map.lpi_map = lpi_map;
>   	dev->event_map.col_map = col_map;
> @@ -3464,7 +3552,7 @@ static void its_free_device(struct its_device *its_dev)
>   	list_del(&its_dev->entry);
>   	raw_spin_unlock_irqrestore(&its_dev->its->lock, flags);
>   	kfree(its_dev->event_map.col_map);
> -	kfree(its_dev->itt);
> +	itt_free_pool(its_dev->itt, its_dev->itt_sz);
>   	kfree(its_dev);
>   }
>   
> @@ -5112,8 +5200,9 @@ static int __init its_probe_one(struct its_node *its)
>   		}
>   	}
>   
> -	page = alloc_pages_node(its->numa_node, GFP_KERNEL | __GFP_ZERO,
> -				get_order(ITS_CMD_QUEUE_SZ));
> +	page = its_alloc_pages_node(its->numa_node,
> +				    GFP_KERNEL | __GFP_ZERO,
> +				    get_order(ITS_CMD_QUEUE_SZ));
>   	if (!page) {
>   		err = -ENOMEM;
>   		goto out_unmap_sgir;
> @@ -5177,7 +5266,7 @@ static int __init its_probe_one(struct its_node *its)
>   out_free_tables:
>   	its_free_tables(its);
>   out_free_cmd:
> -	free_pages((unsigned long)its->cmd_base, get_order(ITS_CMD_QUEUE_SZ));
> +	its_free_pages(its->cmd_base, get_order(ITS_CMD_QUEUE_SZ));
>   out_unmap_sgir:
>   	if (its->sgir_base)
>   		iounmap(its->sgir_base);
> @@ -5663,6 +5752,10 @@ int __init its_init(struct fwnode_handle *handle, struct rdists *rdists,
>   	bool has_v4_1 = false;
>   	int err;
>   
> +	itt_pool = gen_pool_create(get_order(ITS_ITT_ALIGN), -1);
> +	if (!itt_pool)
> +		return -ENOMEM;
> +
>   	gic_rdists = rdists;
>   
>   	lpi_prop_prio = irq_prio;


How about something like this folded into this patch ? Or if this patch 
goes in independently, we could carry the following as part of the CCA
series.

diff --git a/drivers/irqchip/irq-gic-v3-its.c 
b/drivers/irqchip/irq-gic-v3-its.c
index 6f4ddf7faed1..f1a779b52210 100644
--- a/drivers/irqchip/irq-gic-v3-its.c
+++ b/drivers/irqchip/irq-gic-v3-its.c
@@ -209,7 +209,7 @@ static struct page *its_alloc_pages_node(int node, 
gfp_t gfp,

  	page = alloc_pages_node(node, gfp, order);

-	if (page)
+	if (gic_rdists->is_shared && page)
  		set_memory_decrypted((unsigned long)page_address(page),
  				     BIT(order));
  	return page;
@@ -222,7 +222,8 @@ static struct page *its_alloc_pages(gfp_t gfp, 
unsigned int order)

  static void its_free_pages(void *addr, unsigned int order)
  {
-	set_memory_encrypted((unsigned long)addr, BIT(order));
+	if (gic_rdists->is_shared)
+		set_memory_encrypted((unsigned long)addr, BIT(order));
  	free_pages((unsigned long)addr, order);
  }

diff --git a/drivers/irqchip/irq-gic-v3.c b/drivers/irqchip/irq-gic-v3.c
index 6fb276504bcc..48c6b2c8dd8c 100644
--- a/drivers/irqchip/irq-gic-v3.c
+++ b/drivers/irqchip/irq-gic-v3.c
@@ -2015,6 +2015,8 @@ static int __init gic_init_bases(phys_addr_t 
dist_phys_base,
  	typer = readl_relaxed(gic_data.dist_base + GICD_TYPER);
  	gic_data.rdists.gicd_typer = typer;

+	gic_data.rdists.is_shared = 
!arm64_is_iomem_private(gic_data.dist_phys_base,
+							    PAGE_SIZE);
  	gic_enable_quirks(readl_relaxed(gic_data.dist_base + GICD_IIDR),
  			  gic_quirks, &gic_data);

diff --git a/include/linux/irqchip/arm-gic-v3.h 
b/include/linux/irqchip/arm-gic-v3.h
index 728691365464..1edc33608d52 100644
--- a/include/linux/irqchip/arm-gic-v3.h
+++ b/include/linux/irqchip/arm-gic-v3.h
@@ -631,6 +631,7 @@ struct rdists {
  	bool			has_rvpeid;
  	bool			has_direct_lpi;
  	bool			has_vpend_valid_dirty;
+	bool			is_shared;
  };

  struct irq_domain;
Marc Zyngier Aug. 19, 2024, 3:24 p.m. UTC | #3
On Mon, 19 Aug 2024 15:51:00 +0100,
Suzuki K Poulose <suzuki.poulose@arm.com> wrote:
> 
> Hi Steven,
> 
> On 19/08/2024 14:19, Steven Price wrote:
> > Within a realm guest the ITS is emulated by the host. This means the
> > allocations must have been made available to the host by a call to
> > set_memory_decrypted(). Introduce an allocation function which performs
> > this extra call.
> > 
> > For the ITT use a custom genpool-based allocator that calls
> > set_memory_decrypted() for each page allocated, but then suballocates
> > the size needed for each ITT. Note that there is no mechanism
> > implemented to return pages from the genpool, but it is unlikely the
> > peak number of devices will so much larger than the normal level - so
> > this isn't expected to be an issue.
> > 
> 
> This may not be sufficient to make it future proof. We need to detect if
> the GIC is private vs shared, before we make the allocation
> choice. Please see below :

What do you mean by that? Do you foresee a *GICv3* implementation on
the realm side?

[...]

> How about something like this folded into this patch ? Or if this
> patch goes in independently, we could carry the following as part of
> the CCA
> series.
> 
> diff --git a/drivers/irqchip/irq-gic-v3-its.c
> b/drivers/irqchip/irq-gic-v3-its.c
> index 6f4ddf7faed1..f1a779b52210 100644
> --- a/drivers/irqchip/irq-gic-v3-its.c
> +++ b/drivers/irqchip/irq-gic-v3-its.c
> @@ -209,7 +209,7 @@ static struct page *its_alloc_pages_node(int node,
> gfp_t gfp,
> 
>  	page = alloc_pages_node(node, gfp, order);
> 
> -	if (page)
> +	if (gic_rdists->is_shared && page)
>  		set_memory_decrypted((unsigned long)page_address(page),
>  				     BIT(order));
>  	return page;
> @@ -222,7 +222,8 @@ static struct page *its_alloc_pages(gfp_t gfp,
> unsigned int order)
> 
>  static void its_free_pages(void *addr, unsigned int order)
>  {
> -	set_memory_encrypted((unsigned long)addr, BIT(order));
> +	if (gic_rdists->is_shared)
> +		set_memory_encrypted((unsigned long)addr, BIT(order));
>  	free_pages((unsigned long)addr, order);
>  }
> 
> diff --git a/drivers/irqchip/irq-gic-v3.c b/drivers/irqchip/irq-gic-v3.c
> index 6fb276504bcc..48c6b2c8dd8c 100644
> --- a/drivers/irqchip/irq-gic-v3.c
> +++ b/drivers/irqchip/irq-gic-v3.c
> @@ -2015,6 +2015,8 @@ static int __init gic_init_bases(phys_addr_t
> dist_phys_base,
>  	typer = readl_relaxed(gic_data.dist_base + GICD_TYPER);
>  	gic_data.rdists.gicd_typer = typer;
> 
> +	gic_data.rdists.is_shared =
> !arm64_is_iomem_private(gic_data.dist_phys_base,
> +							    PAGE_SIZE);

Why would you base the status of the RDs on that of the distributor?

>  	gic_enable_quirks(readl_relaxed(gic_data.dist_base + GICD_IIDR),
>  			  gic_quirks, &gic_data);
> 
> diff --git a/include/linux/irqchip/arm-gic-v3.h
> b/include/linux/irqchip/arm-gic-v3.h
> index 728691365464..1edc33608d52 100644
> --- a/include/linux/irqchip/arm-gic-v3.h
> +++ b/include/linux/irqchip/arm-gic-v3.h
> @@ -631,6 +631,7 @@ struct rdists {
>  	bool			has_rvpeid;
>  	bool			has_direct_lpi;
>  	bool			has_vpend_valid_dirty;
> +	bool			is_shared;
>  };
> 
>  struct irq_domain;
> 

I really don't like this.

If we have to go down the route of identifying whether the GIC needs
encryption or not based on the platform, then maybe we should bite the
bullet and treat it as a first class device, given that we expect
devices to be either realm or non-secure.

Thanks,

	M.
Suzuki K Poulose Aug. 19, 2024, 10:19 p.m. UTC | #4
Hi Marc

On 19/08/2024 16:24, Marc Zyngier wrote:
> On Mon, 19 Aug 2024 15:51:00 +0100,
> Suzuki K Poulose <suzuki.poulose@arm.com> wrote:
>>
>> Hi Steven,
>>
>> On 19/08/2024 14:19, Steven Price wrote:
>>> Within a realm guest the ITS is emulated by the host. This means the
>>> allocations must have been made available to the host by a call to
>>> set_memory_decrypted(). Introduce an allocation function which performs
>>> this extra call.
>>>
>>> For the ITT use a custom genpool-based allocator that calls
>>> set_memory_decrypted() for each page allocated, but then suballocates
>>> the size needed for each ITT. Note that there is no mechanism
>>> implemented to return pages from the genpool, but it is unlikely the
>>> peak number of devices will so much larger than the normal level - so
>>> this isn't expected to be an issue.
>>>
>>
>> This may not be sufficient to make it future proof. We need to detect if
>> the GIC is private vs shared, before we make the allocation
>> choice. Please see below :
> 
> What do you mean by that? Do you foresee a *GICv3* implementation on
> the realm side?

No, but it may be emulated in the Realm World (by a higher privileged 
component, with future RMM versions with Planes - Plane0) and this
"Realm guest" may run in a lesser privileged plane and must use
"protected" accesses to make sure the accesses are seen by the "Realm
world" emulator.

> 
> [...]
> 
>> How about something like this folded into this patch ? Or if this
>> patch goes in independently, we could carry the following as part of
>> the CCA
>> series.
>>
>> diff --git a/drivers/irqchip/irq-gic-v3-its.c
>> b/drivers/irqchip/irq-gic-v3-its.c
>> index 6f4ddf7faed1..f1a779b52210 100644
>> --- a/drivers/irqchip/irq-gic-v3-its.c
>> +++ b/drivers/irqchip/irq-gic-v3-its.c
>> @@ -209,7 +209,7 @@ static struct page *its_alloc_pages_node(int node,
>> gfp_t gfp,
>>
>>   	page = alloc_pages_node(node, gfp, order);
>>
>> -	if (page)
>> +	if (gic_rdists->is_shared && page)
>>   		set_memory_decrypted((unsigned long)page_address(page),
>>   				     BIT(order));
>>   	return page;
>> @@ -222,7 +222,8 @@ static struct page *its_alloc_pages(gfp_t gfp,
>> unsigned int order)
>>
>>   static void its_free_pages(void *addr, unsigned int order)
>>   {
>> -	set_memory_encrypted((unsigned long)addr, BIT(order));
>> +	if (gic_rdists->is_shared)
>> +		set_memory_encrypted((unsigned long)addr, BIT(order));
>>   	free_pages((unsigned long)addr, order);
>>   }
>>
>> diff --git a/drivers/irqchip/irq-gic-v3.c b/drivers/irqchip/irq-gic-v3.c
>> index 6fb276504bcc..48c6b2c8dd8c 100644
>> --- a/drivers/irqchip/irq-gic-v3.c
>> +++ b/drivers/irqchip/irq-gic-v3.c
>> @@ -2015,6 +2015,8 @@ static int __init gic_init_bases(phys_addr_t
>> dist_phys_base,
>>   	typer = readl_relaxed(gic_data.dist_base + GICD_TYPER);
>>   	gic_data.rdists.gicd_typer = typer;
>>
>> +	gic_data.rdists.is_shared =
>> !arm64_is_iomem_private(gic_data.dist_phys_base,
>> +							    PAGE_SIZE);
> 
> Why would you base the status of the RDs on that of the distributor?

We expect that, the GIC as a whole is either Realm or non-secure, but
not split (like most of the devices). The only reason for using rdists
is because thats shared and available with the ITS driver code. (and
was an easy hack). Happy to change this to something better.

> 
>>   	gic_enable_quirks(readl_relaxed(gic_data.dist_base + GICD_IIDR),
>>   			  gic_quirks, &gic_data);
>>
>> diff --git a/include/linux/irqchip/arm-gic-v3.h
>> b/include/linux/irqchip/arm-gic-v3.h
>> index 728691365464..1edc33608d52 100644
>> --- a/include/linux/irqchip/arm-gic-v3.h
>> +++ b/include/linux/irqchip/arm-gic-v3.h
>> @@ -631,6 +631,7 @@ struct rdists {
>>   	bool			has_rvpeid;
>>   	bool			has_direct_lpi;
>>   	bool			has_vpend_valid_dirty;
>> +	bool			is_shared;
>>   };
>>
>>   struct irq_domain;
>>
> 
> I really don't like this.
> 
> If we have to go down the route of identifying whether the GIC needs
> encryption or not based on the platform, then maybe we should bite the
> bullet and treat it as a first class device, given that we expect
> devices to be either realm or non-secure.

Agreed and that is exactly we would like. i.e., treat the GIC as either
Realm or NS (as a whole). Now, how do we make that decision is based on
whether GIC Distributor area is private or not. Like I mentioned above, 
we need a cleaner way of making this available in the ITS driver.

Thoughts ? Is that what you were hinting at ?

Suzuki


> Thanks,
> 
> 	M.
>
diff mbox series

Patch

diff --git a/drivers/irqchip/irq-gic-v3-its.c b/drivers/irqchip/irq-gic-v3-its.c
index 9b34596b3542..557214c774c3 100644
--- a/drivers/irqchip/irq-gic-v3-its.c
+++ b/drivers/irqchip/irq-gic-v3-its.c
@@ -12,12 +12,14 @@ 
 #include <linux/crash_dump.h>
 #include <linux/delay.h>
 #include <linux/efi.h>
+#include <linux/genalloc.h>
 #include <linux/interrupt.h>
 #include <linux/iommu.h>
 #include <linux/iopoll.h>
 #include <linux/irqdomain.h>
 #include <linux/list.h>
 #include <linux/log2.h>
+#include <linux/mem_encrypt.h>
 #include <linux/memblock.h>
 #include <linux/mm.h>
 #include <linux/msi.h>
@@ -27,6 +29,7 @@ 
 #include <linux/of_pci.h>
 #include <linux/of_platform.h>
 #include <linux/percpu.h>
+#include <linux/set_memory.h>
 #include <linux/slab.h>
 #include <linux/syscore_ops.h>
 
@@ -164,6 +167,7 @@  struct its_device {
 	struct its_node		*its;
 	struct event_lpi_map	event_map;
 	void			*itt;
+	u32			itt_sz;
 	u32			nr_ites;
 	u32			device_id;
 	bool			shared;
@@ -199,6 +203,81 @@  static DEFINE_IDA(its_vpeid_ida);
 #define gic_data_rdist_rd_base()	(gic_data_rdist()->rd_base)
 #define gic_data_rdist_vlpi_base()	(gic_data_rdist_rd_base() + SZ_128K)
 
+static struct page *its_alloc_pages_node(int node, gfp_t gfp,
+					 unsigned int order)
+{
+	struct page *page;
+	int ret = 0;
+
+	page = alloc_pages_node(node, gfp, order);
+
+	if (!page)
+		return NULL;
+
+	ret = set_memory_decrypted((unsigned long)page_address(page),
+				   1 << order);
+	if (WARN_ON(ret))
+		return NULL;
+
+	return page;
+}
+
+static struct page *its_alloc_pages(gfp_t gfp, unsigned int order)
+{
+	return its_alloc_pages_node(NUMA_NO_NODE, gfp, order);
+}
+
+static void its_free_pages(void *addr, unsigned int order)
+{
+	if (WARN_ON(set_memory_encrypted((unsigned long)addr, 1 << order)))
+		return;
+	free_pages((unsigned long)addr, order);
+}
+
+static struct gen_pool *itt_pool;
+
+static void *itt_alloc_pool(int node, int size)
+{
+	unsigned long addr;
+	struct page *page;
+
+	if (size >= PAGE_SIZE) {
+		page = its_alloc_pages_node(node,
+					    GFP_KERNEL | __GFP_ZERO,
+					    get_order(size));
+
+		return page_address(page);
+	}
+
+	do {
+		addr = gen_pool_alloc(itt_pool, size);
+		if (addr)
+			break;
+
+		page = its_alloc_pages_node(node, GFP_KERNEL | __GFP_ZERO, 1);
+		if (!page)
+			break;
+
+		gen_pool_add(itt_pool, (unsigned long)page_address(page),
+			     PAGE_SIZE, node);
+	} while (!addr);
+
+	return (void *)addr;
+}
+
+static void itt_free_pool(void *addr, int size)
+{
+	if (!addr)
+		return;
+
+	if (size >= PAGE_SIZE) {
+		its_free_pages(addr, get_order(size));
+		return;
+	}
+
+	gen_pool_free(itt_pool, (unsigned long)addr, size);
+}
+
 /*
  * Skip ITSs that have no vLPIs mapped, unless we're on GICv4.1, as we
  * always have vSGIs mapped.
@@ -2187,7 +2266,8 @@  static struct page *its_allocate_prop_table(gfp_t gfp_flags)
 {
 	struct page *prop_page;
 
-	prop_page = alloc_pages(gfp_flags, get_order(LPI_PROPBASE_SZ));
+	prop_page = its_alloc_pages(gfp_flags,
+				    get_order(LPI_PROPBASE_SZ));
 	if (!prop_page)
 		return NULL;
 
@@ -2198,8 +2278,8 @@  static struct page *its_allocate_prop_table(gfp_t gfp_flags)
 
 static void its_free_prop_table(struct page *prop_page)
 {
-	free_pages((unsigned long)page_address(prop_page),
-		   get_order(LPI_PROPBASE_SZ));
+	its_free_pages(page_address(prop_page),
+		       get_order(LPI_PROPBASE_SZ));
 }
 
 static bool gic_check_reserved_range(phys_addr_t addr, unsigned long size)
@@ -2321,7 +2401,8 @@  static int its_setup_baser(struct its_node *its, struct its_baser *baser,
 		order = get_order(GITS_BASER_PAGES_MAX * psz);
 	}
 
-	page = alloc_pages_node(its->numa_node, GFP_KERNEL | __GFP_ZERO, order);
+	page = its_alloc_pages_node(its->numa_node,
+				    GFP_KERNEL | __GFP_ZERO, order);
 	if (!page)
 		return -ENOMEM;
 
@@ -2334,7 +2415,7 @@  static int its_setup_baser(struct its_node *its, struct its_baser *baser,
 		/* 52bit PA is supported only when PageSize=64K */
 		if (psz != SZ_64K) {
 			pr_err("ITS: no 52bit PA support when psz=%d\n", psz);
-			free_pages((unsigned long)base, order);
+			its_free_pages(base, order);
 			return -ENXIO;
 		}
 
@@ -2390,7 +2471,7 @@  static int its_setup_baser(struct its_node *its, struct its_baser *baser,
 		pr_err("ITS@%pa: %s doesn't stick: %llx %llx\n",
 		       &its->phys_base, its_base_type_string[type],
 		       val, tmp);
-		free_pages((unsigned long)base, order);
+		its_free_pages(base, order);
 		return -ENXIO;
 	}
 
@@ -2529,8 +2610,8 @@  static void its_free_tables(struct its_node *its)
 
 	for (i = 0; i < GITS_BASER_NR_REGS; i++) {
 		if (its->tables[i].base) {
-			free_pages((unsigned long)its->tables[i].base,
-				   its->tables[i].order);
+			its_free_pages(its->tables[i].base,
+				       its->tables[i].order);
 			its->tables[i].base = NULL;
 		}
 	}
@@ -2796,7 +2877,8 @@  static bool allocate_vpe_l2_table(int cpu, u32 id)
 
 	/* Allocate memory for 2nd level table */
 	if (!table[idx]) {
-		page = alloc_pages(GFP_KERNEL | __GFP_ZERO, get_order(psz));
+		page = its_alloc_pages(GFP_KERNEL | __GFP_ZERO,
+				       get_order(psz));
 		if (!page)
 			return false;
 
@@ -2915,7 +2997,8 @@  static int allocate_vpe_l1_table(void)
 
 	pr_debug("np = %d, npg = %lld, psz = %d, epp = %d, esz = %d\n",
 		 np, npg, psz, epp, esz);
-	page = alloc_pages(GFP_ATOMIC | __GFP_ZERO, get_order(np * PAGE_SIZE));
+	page = its_alloc_pages(GFP_ATOMIC | __GFP_ZERO,
+			       get_order(np * PAGE_SIZE));
 	if (!page)
 		return -ENOMEM;
 
@@ -2961,8 +3044,8 @@  static struct page *its_allocate_pending_table(gfp_t gfp_flags)
 {
 	struct page *pend_page;
 
-	pend_page = alloc_pages(gfp_flags | __GFP_ZERO,
-				get_order(LPI_PENDBASE_SZ));
+	pend_page = its_alloc_pages(gfp_flags | __GFP_ZERO,
+				    get_order(LPI_PENDBASE_SZ));
 	if (!pend_page)
 		return NULL;
 
@@ -2974,7 +3057,7 @@  static struct page *its_allocate_pending_table(gfp_t gfp_flags)
 
 static void its_free_pending_table(struct page *pt)
 {
-	free_pages((unsigned long)page_address(pt), get_order(LPI_PENDBASE_SZ));
+	its_free_pages(page_address(pt), get_order(LPI_PENDBASE_SZ));
 }
 
 /*
@@ -3309,8 +3392,9 @@  static bool its_alloc_table_entry(struct its_node *its,
 
 	/* Allocate memory for 2nd level table */
 	if (!table[idx]) {
-		page = alloc_pages_node(its->numa_node, GFP_KERNEL | __GFP_ZERO,
-					get_order(baser->psz));
+		page = its_alloc_pages_node(its->numa_node,
+					    GFP_KERNEL | __GFP_ZERO,
+					    get_order(baser->psz));
 		if (!page)
 			return false;
 
@@ -3405,7 +3489,6 @@  static struct its_device *its_create_device(struct its_node *its, u32 dev_id,
 	if (WARN_ON(!is_power_of_2(nvecs)))
 		nvecs = roundup_pow_of_two(nvecs);
 
-	dev = kzalloc(sizeof(*dev), GFP_KERNEL);
 	/*
 	 * Even if the device wants a single LPI, the ITT must be
 	 * sized as a power of two (and you need at least one bit...).
@@ -3413,7 +3496,11 @@  static struct its_device *its_create_device(struct its_node *its, u32 dev_id,
 	nr_ites = max(2, nvecs);
 	sz = nr_ites * (FIELD_GET(GITS_TYPER_ITT_ENTRY_SIZE, its->typer) + 1);
 	sz = max(sz, ITS_ITT_ALIGN) + ITS_ITT_ALIGN - 1;
-	itt = kzalloc_node(sz, GFP_KERNEL, its->numa_node);
+
+	itt = itt_alloc_pool(its->numa_node, sz);
+
+	dev = kzalloc(sizeof(*dev), GFP_KERNEL);
+
 	if (alloc_lpis) {
 		lpi_map = its_lpi_alloc(nvecs, &lpi_base, &nr_lpis);
 		if (lpi_map)
@@ -3425,9 +3512,9 @@  static struct its_device *its_create_device(struct its_node *its, u32 dev_id,
 		lpi_base = 0;
 	}
 
-	if (!dev || !itt ||  !col_map || (!lpi_map && alloc_lpis)) {
+	if (!dev || !itt || !col_map || (!lpi_map && alloc_lpis)) {
 		kfree(dev);
-		kfree(itt);
+		itt_free_pool(itt, sz);
 		bitmap_free(lpi_map);
 		kfree(col_map);
 		return NULL;
@@ -3437,6 +3524,7 @@  static struct its_device *its_create_device(struct its_node *its, u32 dev_id,
 
 	dev->its = its;
 	dev->itt = itt;
+	dev->itt_sz = sz;
 	dev->nr_ites = nr_ites;
 	dev->event_map.lpi_map = lpi_map;
 	dev->event_map.col_map = col_map;
@@ -3464,7 +3552,7 @@  static void its_free_device(struct its_device *its_dev)
 	list_del(&its_dev->entry);
 	raw_spin_unlock_irqrestore(&its_dev->its->lock, flags);
 	kfree(its_dev->event_map.col_map);
-	kfree(its_dev->itt);
+	itt_free_pool(its_dev->itt, its_dev->itt_sz);
 	kfree(its_dev);
 }
 
@@ -5112,8 +5200,9 @@  static int __init its_probe_one(struct its_node *its)
 		}
 	}
 
-	page = alloc_pages_node(its->numa_node, GFP_KERNEL | __GFP_ZERO,
-				get_order(ITS_CMD_QUEUE_SZ));
+	page = its_alloc_pages_node(its->numa_node,
+				    GFP_KERNEL | __GFP_ZERO,
+				    get_order(ITS_CMD_QUEUE_SZ));
 	if (!page) {
 		err = -ENOMEM;
 		goto out_unmap_sgir;
@@ -5177,7 +5266,7 @@  static int __init its_probe_one(struct its_node *its)
 out_free_tables:
 	its_free_tables(its);
 out_free_cmd:
-	free_pages((unsigned long)its->cmd_base, get_order(ITS_CMD_QUEUE_SZ));
+	its_free_pages(its->cmd_base, get_order(ITS_CMD_QUEUE_SZ));
 out_unmap_sgir:
 	if (its->sgir_base)
 		iounmap(its->sgir_base);
@@ -5663,6 +5752,10 @@  int __init its_init(struct fwnode_handle *handle, struct rdists *rdists,
 	bool has_v4_1 = false;
 	int err;
 
+	itt_pool = gen_pool_create(get_order(ITS_ITT_ALIGN), -1);
+	if (!itt_pool)
+		return -ENOMEM;
+
 	gic_rdists = rdists;
 
 	lpi_prop_prio = irq_prio;