[v2] firmware: qcom_scm: Add a padded page to ensure DMA memory from lower 4GB

Message ID 1716564705-9929-1-git-send-email-quic_mojha@quicinc.com (mailing list archive)
State New

Commit Message

Mukesh Ojha May 24, 2024, 3:31 p.m. UTC
For SCM protection, memory allocation should be physically contiguous,
4K aligned, and non-cacheable to avoid XPU violations. This granularity
of protection applies from the secure world. Additionally, it's possible
that a 32-bit secure peripheral will access memory in SoCs like
sm8{4|5|6}50 for some remote processors. Therefore, memory allocation
needs to be done in the lower 4 GB range. To achieve this, Linux's CMA
pool can be used with dma_alloc APIs.

However, dma_alloc APIs will fall back to the buddy pool if the requested
size is less than or equal to PAGE_SIZE. It's also possible that the remote
processor's metadata blob size is less than a PAGE_SIZE. Even though the
DMA APIs align the requested memory size to PAGE_SIZE, they can still fall
back to the buddy allocator, which may fail if `CONFIG_ZONE_{DMA|DMA32}`
is disabled.

To address this issue, use an extra page as padding to ensure allocation
from the CMA region. Since this memory is temporary, it will be released
once the remote processor is up or in case of any failure.

Signed-off-by: Mukesh Ojha <quic_mojha@quicinc.com>
---
Changes in v2:
 - Described the issue more clearly in commit text.

 drivers/firmware/qcom/qcom_scm.c | 8 +++++---
 1 file changed, 5 insertions(+), 3 deletions(-)
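
The size arithmetic the patch relies on can be sketched in plain userspace C. This is only an illustration: the 4 KiB page size and the names `SKETCH_PAGE_SIZE`, `SKETCH_PAGE_ALIGN` and `padded_metadata_size` are assumptions made for the example and are not part of the patch. The macro mirrors the round-up that the kernel's PAGE_ALIGN() performs.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumption: 4 KiB pages. The kernel's PAGE_SIZE depends on the configuration. */
#define SKETCH_PAGE_SIZE ((size_t)4096)

/* Round x up to the next multiple of the page size (power-of-two round-up). */
#define SKETCH_PAGE_ALIGN(x) \
	(((x) + SKETCH_PAGE_SIZE - 1) & ~(SKETCH_PAGE_SIZE - 1))

/*
 * Hypothetical helper: add one page of padding, then page-align, as the
 * patch does with PAGE_ALIGN(size + PAGE_SIZE).
 */
size_t padded_metadata_size(size_t size)
{
	/* Guard against wrap-around in the additions below. */
	assert(size <= SIZE_MAX - 2 * SKETCH_PAGE_SIZE);

	return SKETCH_PAGE_ALIGN(size + SKETCH_PAGE_SIZE);
}
```

Under the 4 KiB assumption, a 100-byte metadata blob yields a request of 8192 bytes, and any non-zero size yields a request strictly larger than one page.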

Comments

Bjorn Andersson May 26, 2024, 8:46 p.m. UTC | #1
On Fri, May 24, 2024 at 09:01:45PM GMT, Mukesh Ojha wrote:
> For SCM protection, memory allocation should be physically contiguous,
> 4K aligned, and non-cacheable to avoid XPU violations. This granularity
> of protection applies from the secure world. Additionally, it's possible
> that a 32-bit secure peripheral will access memory in SoCs like
> sm8{4|5|6}50 for some remote processors. Therefore, memory allocation
> needs to be done in the lower 4 GB range. To achieve this, Linux's CMA
> pool can be used with dma_alloc APIs.
> 
> However, dma_alloc APIs will fall back to the buddy pool if the requested
> size is less than or equal to PAGE_SIZE. It's also possible that the remote
> processor's metadata blob size is less than a PAGE_SIZE. Even though the
> DMA APIs align the requested memory size to PAGE_SIZE, they can still fall
> back to the buddy allocator, which may fail if `CONFIG_ZONE_{DMA|DMA32}`
> is disabled.

Does "fail" here mean that the buddy heap returns a failure - in some
case where dma_alloc would have succeeded, or that it does give you
a PAGE_SIZE allocation which doesn't meet your requirements?

Given this, I find the behavior of dma_alloc unintuitive. Do we know if
there's a reason for the "equal to PAGE_SIZE" case you describe here?

> 
> To address this issue, use an extra page as padding to ensure allocation
> from the CMA region. Since this memory is temporary, it will be released
> once the remote processor is up or in case of any failure.
> 

Thanks for updating the commit message, this is good.

Regards,
Bjorn

> Signed-off-by: Mukesh Ojha <quic_mojha@quicinc.com>
> ---
> Changes in v2:
>  - Described the issue more clearly in commit text.
> 
>  drivers/firmware/qcom/qcom_scm.c | 8 +++++---
>  1 file changed, 5 insertions(+), 3 deletions(-)
> 
> diff --git a/drivers/firmware/qcom/qcom_scm.c b/drivers/firmware/qcom/qcom_scm.c
> index 520de9b5633a..0426972178a4 100644
> --- a/drivers/firmware/qcom/qcom_scm.c
> +++ b/drivers/firmware/qcom/qcom_scm.c
> @@ -538,6 +538,7 @@ static void qcom_scm_set_download_mode(bool enable)
>  int qcom_scm_pas_init_image(u32 peripheral, const void *metadata, size_t size,
>  			    struct qcom_scm_pas_metadata *ctx)
>  {
> +	size_t page_aligned_size;
>  	dma_addr_t mdata_phys;
>  	void *mdata_buf;
>  	int ret;
> @@ -555,7 +556,8 @@ int qcom_scm_pas_init_image(u32 peripheral, const void *metadata, size_t size,
>  	 * data blob, so make sure it's physically contiguous, 4K aligned and
>  	 * non-cachable to avoid XPU violations.
>  	 */
> -	mdata_buf = dma_alloc_coherent(__scm->dev, size, &mdata_phys,
> +	page_aligned_size = PAGE_ALIGN(size + PAGE_SIZE);
> +	mdata_buf = dma_alloc_coherent(__scm->dev, page_aligned_size, &mdata_phys,
>  				       GFP_KERNEL);
>  	if (!mdata_buf) {
>  		dev_err(__scm->dev, "Allocation of metadata buffer failed.\n");
> @@ -580,11 +582,11 @@ int qcom_scm_pas_init_image(u32 peripheral, const void *metadata, size_t size,
>  
>  out:
>  	if (ret < 0 || !ctx) {
> -		dma_free_coherent(__scm->dev, size, mdata_buf, mdata_phys);
> +		dma_free_coherent(__scm->dev, page_aligned_size, mdata_buf, mdata_phys);
>  	} else if (ctx) {
>  		ctx->ptr = mdata_buf;
>  		ctx->phys = mdata_phys;
> -		ctx->size = size;
> +		ctx->size = page_aligned_size;
>  	}
>  
>  	return ret ? : res.result[0];
> -- 
> 2.7.4
>

Mukesh Ojha May 29, 2024, 11:54 a.m. UTC | #2
On 5/27/2024 2:16 AM, Bjorn Andersson wrote:
> On Fri, May 24, 2024 at 09:01:45PM GMT, Mukesh Ojha wrote:
>> For SCM protection, memory allocation should be physically contiguous,
>> 4K aligned, and non-cacheable to avoid XPU violations. This granularity
>> of protection applies from the secure world. Additionally, it's possible
>> that a 32-bit secure peripheral will access memory in SoCs like
>> sm8{4|5|6}50 for some remote processors. Therefore, memory allocation
>> needs to be done in the lower 4 GB range. To achieve this, Linux's CMA
>> pool can be used with dma_alloc APIs.
>>
>> However, dma_alloc APIs will fall back to the buddy pool if the requested
>> size is less than or equal to PAGE_SIZE. It's also possible that the remote
>> processor's metadata blob size is less than a PAGE_SIZE. Even though the
>> DMA APIs align the requested memory size to PAGE_SIZE, they can still fall
>> back to the buddy allocator, which may fail if `CONFIG_ZONE_{DMA|DMA32}`
>> is disabled.
> 
> Does "fail" here mean that the buddy heap returns a failure - in some
> case where dma_alloc would have succeeded, or that it does give you
> a PAGE_SIZE allocation which doesn't meeting your requirements?

Yes, the buddy allocator will also try to allocate the memory, but it may
not be able to provide PAGE_SIZE of memory in the lower 4GB (for a 32-bit
capable device) if CONFIG_ZONE_{DMA|DMA32} is disabled. However, the DMA
allocation would have succeeded in that case if padding had been added so
that the requested size exceeds PAGE_SIZE.
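
A minimal userspace model of the size check described in this thread is sketched below. It is not the kernel's implementation: it only encodes the claim made here that a request of at most PAGE_SIZE (after page alignment) goes to the buddy allocator while a larger one can be served from CMA, and it ignores everything else that influences the real decision. The 4 KiB page size and the names `SKETCH_PAGE_SIZE`, `SKETCH_PAGE_ALIGN`, `enum alloc_source` and `pick_source` are assumptions made for the example.

```c
#include <stddef.h>

/* Assumption: 4 KiB pages. */
#define SKETCH_PAGE_SIZE ((size_t)4096)

/* Round x up to the next multiple of the page size. */
#define SKETCH_PAGE_ALIGN(x) \
	(((x) + SKETCH_PAGE_SIZE - 1) & ~(SKETCH_PAGE_SIZE - 1))

/* Hypothetical labels for where a request would be served from. */
enum alloc_source {
	SRC_BUDDY,	/* page allocator; may lie above 4GB without ZONE_DMA/DMA32 */
	SRC_CMA,	/* contiguous pool, assumed here to sit in the lower 4GB */
};

/*
 * Simplified model of the behavior discussed in the thread: requests that
 * fit in a single page fall back to the buddy allocator.
 */
enum alloc_source pick_source(size_t size)
{
	if (SKETCH_PAGE_ALIGN(size) <= SKETCH_PAGE_SIZE)
		return SRC_BUDDY;

	return SRC_CMA;
}
```

In this model, `pick_source(100)` and `pick_source(4096)` both select the buddy allocator, while `pick_source(100 + 4096)` selects CMA, which is the effect the extra page of padding is meant to have.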

> 
>  From this I do find the behavior of dma_alloc unintuitive, do we know if
> there's a reason for the "equal to PAGE_SIZE" case you describe here?

I am not a memory expert, but the reason I can think of is that
allocations <= PAGE_SIZE can anyway be requested outside the DMA coherent
APIs, with kmalloc and friends, and that could be why the DMA API falls
back to the buddy pool.

-Mukesh

Patch

diff --git a/drivers/firmware/qcom/qcom_scm.c b/drivers/firmware/qcom/qcom_scm.c
index 520de9b5633a..0426972178a4 100644
--- a/drivers/firmware/qcom/qcom_scm.c
+++ b/drivers/firmware/qcom/qcom_scm.c
@@ -538,6 +538,7 @@  static void qcom_scm_set_download_mode(bool enable)
 int qcom_scm_pas_init_image(u32 peripheral, const void *metadata, size_t size,
 			    struct qcom_scm_pas_metadata *ctx)
 {
+	size_t page_aligned_size;
 	dma_addr_t mdata_phys;
 	void *mdata_buf;
 	int ret;
@@ -555,7 +556,8 @@  int qcom_scm_pas_init_image(u32 peripheral, const void *metadata, size_t size,
 	 * data blob, so make sure it's physically contiguous, 4K aligned and
 	 * non-cachable to avoid XPU violations.
 	 */
-	mdata_buf = dma_alloc_coherent(__scm->dev, size, &mdata_phys,
+	page_aligned_size = PAGE_ALIGN(size + PAGE_SIZE);
+	mdata_buf = dma_alloc_coherent(__scm->dev, page_aligned_size, &mdata_phys,
 				       GFP_KERNEL);
 	if (!mdata_buf) {
 		dev_err(__scm->dev, "Allocation of metadata buffer failed.\n");
@@ -580,11 +582,11 @@  int qcom_scm_pas_init_image(u32 peripheral, const void *metadata, size_t size,
 
 out:
 	if (ret < 0 || !ctx) {
-		dma_free_coherent(__scm->dev, size, mdata_buf, mdata_phys);
+		dma_free_coherent(__scm->dev, page_aligned_size, mdata_buf, mdata_phys);
 	} else if (ctx) {
 		ctx->ptr = mdata_buf;
 		ctx->phys = mdata_phys;
-		ctx->size = size;
+		ctx->size = page_aligned_size;
 	}
 
 	return ret ? : res.result[0];