diff mbox series

arm64: dts: ti: k3-am62x-sk-common: Reserve 128MiB of global CMA

Message ID 20230803111455.811339-1-devarsht@ti.com (mailing list archive)
State New, archived
Headers show
Series arm64: dts: ti: k3-am62x-sk-common: Reserve 128MiB of global CMA | expand

Commit Message

Devarsh Thakkar Aug. 3, 2023, 11:14 a.m. UTC
Reserve 128MiB of global CMA which is also marked as re-usable
so that OS can also use the same if peripheral drivers are not using the
same.

AM62x supports multimedia components such as GPU, dual Display and Camera.
Assuming the worst-case scenario where all 3 are run in parallel below
is the calculation :

1) OV5640 camera sensor supports 1920x1080 resolution
-> 1920 width x 1080 height x 2 bytesperpixel x 8 buffers
   (default in yavta) : 32MiB

2) 1920x1200 Microtips LVDS panel supported
-> 1920 width x 1080 height x 4 bytesperpixel x 2 buffers :
   16 MiB

3) 1920x1080 HDMI display supported
-> 1920 width x 1080 height x 4 bytesperpixel x 2 buffers :
   15.82 MiB which is ~16 MiB

4) IMG GPU shares with display allocated buffers while rendering
   but in case some dedicated operation viz color conversion,
   keeping same window of ~16 MiB for GPU too.

Total is 80 MiB and adding 32 MiB for other peripherals and extra
16 MiB to keep as buffer for fragmentation thus rounding total to 128
MiB.

Signed-off-by: Devarsh Thakkar <devarsht@ti.com>
Acked-by: Darren Etheridge <detheridge@ti.com>
Signed-off-by: Vignesh Raghavendra <vigneshr@ti.com>
---
 arch/arm64/boot/dts/ti/k3-am62x-sk-common.dtsi | 8 ++++++++
 1 file changed, 8 insertions(+)

Comments

Nishanth Menon Aug. 5, 2023, 7:33 p.m. UTC | #1
On 16:44-20230803, Devarsh Thakkar wrote:
> Reserve 128MiB of global CMA which is also marked as re-usable
> so that OS can also use the same if peripheral drivers are not using the
> same.
> 
> AM62x supports multimedia components such as GPU, dual Display and Camera.
> Assuming the worst-case scenario where all 3 are run in parallel below
> is the calculation :
> 
> 1) OV5640 camera sensor supports 1920x1080 resolution
> -> 1920 width x 1080 height x 2 bytesperpixel x 8 buffers
>    (default in yavta) : 32MiB
> 
> 2) 1920x1200 Microtips LVDS panel supported
> -> 1920 width x 1080 height x 4 bytesperpixel x 2 buffers :
>    16 MiB
> 
> 3) 1920x1080 HDMI display supported
> -> 1920 width x 1080 height x 4 bytesperpixel x 2 buffers :
>    15.82 MiB which is ~16 MiB
> 
> 4) IMG GPU shares with display allocated buffers while rendering
>    but in case some dedicated operation viz color conversion,
>    keeping same window of ~16 MiB for GPU too.
> 
> Total is 80 MiB and adding 32 MiB for other peripherals and extra
> 16 MiB to keep as buffer for fragmentation thus rounding total to 128
> MiB.
> 
> Signed-off-by: Devarsh Thakkar <devarsht@ti.com>
> Acked-by: Darren Etheridge <detheridge@ti.com>
> Signed-off-by: Vignesh Raghavendra <vigneshr@ti.com>
> ---

I don't think this is right approach. There are other techniques
than having to do this (Andrew: please comment) and require drivers to
behave properly. I am esp concerned since there are platforms based on
am62x and just 256MB DDR.

>  arch/arm64/boot/dts/ti/k3-am62x-sk-common.dtsi | 8 ++++++++
>  1 file changed, 8 insertions(+)
> 
> diff --git a/arch/arm64/boot/dts/ti/k3-am62x-sk-common.dtsi b/arch/arm64/boot/dts/ti/k3-am62x-sk-common.dtsi
> index 34c8ffc553ec..9dd6e23ca9ca 100644
> --- a/arch/arm64/boot/dts/ti/k3-am62x-sk-common.dtsi
> +++ b/arch/arm64/boot/dts/ti/k3-am62x-sk-common.dtsi
> @@ -47,6 +47,14 @@ ramoops@9ca00000 {
>  			pmsg-size = <0x8000>;
>  		};
>  
> +		/* global cma region */
> +		linux,cma {
> +			compatible = "shared-dma-pool";
> +			reusable;
> +			size = <0x00 0x8000000>;
> +			linux,cma-default;
> +		};
> +
>  		secure_tfa_ddr: tfa@9e780000 {
>  			reg = <0x00 0x9e780000 0x00 0x80000>;
>  			alignment = <0x1000>;
> -- 
> 2.34.1
>
Devarsh Thakkar Aug. 7, 2023, 6:03 a.m. UTC | #2
Hi Nishanth,

Thanks for the review.

On 06/08/23 01:03, Nishanth Menon wrote:
> On 16:44-20230803, Devarsh Thakkar wrote:
>> Reserve 128MiB of global CMA which is also marked as re-usable
>> so that OS can also use the same if peripheral drivers are not using the
>> same.
>>
>> AM62x supports multimedia components such as GPU, dual Display and Camera.
>> Assuming the worst-case scenario where all 3 are run in parallel below
>> is the calculation :
>>
>> 1) OV5640 camera sensor supports 1920x1080 resolution
>> -> 1920 width x 1080 height x 2 bytesperpixel x 8 buffers
>>    (default in yavta) : 32MiB
>>
>> 2) 1920x1200 Microtips LVDS panel supported
>> -> 1920 width x 1080 height x 4 bytesperpixel x 2 buffers :
>>    16 MiB
>>
>> 3) 1920x1080 HDMI display supported
>> -> 1920 width x 1080 height x 4 bytesperpixel x 2 buffers :
>>    15.82 MiB which is ~16 MiB
>>
>> 4) IMG GPU shares with display allocated buffers while rendering
>>    but in case some dedicated operation viz color conversion,
>>    keeping same window of ~16 MiB for GPU too.
>>
>> Total is 80 MiB and adding 32 MiB for other peripherals and extra
>> 16 MiB to keep as buffer for fragmentation thus rounding total to 128
>> MiB.
>>
>> Signed-off-by: Devarsh Thakkar <devarsht@ti.com>
>> Acked-by: Darren Etheridge <detheridge@ti.com>
>> Signed-off-by: Vignesh Raghavendra <vigneshr@ti.com>
>> ---
> 
> I don't think this is right approach. There are other techniques
> than having to do this (Andrew: please comment) and require drivers to
> behave properly.

Sorry but I did not understand clearly the disadvantage of this approach.
Here we are reserving CMA and also marking it as re-usable so that in case
driver is not using it OS can use that region.

Also I see quite a few vendors already taking this approach :

$grep -r cma-default arch/arm64/boot/dts
arch/arm64/boot/dts/qcom/sc8280xp-lenovo-thinkpad-x13s.dts:276:
 linux,cma-default;
arch/arm64/boot/dts/qcom/sc8280xp-crd.dts:222:
linux,cma-default;
arch/arm64/boot/dts/freescale/imx8mp-tqma8mpql-mba8mpxl.dts:201:
         linux,cma-default;
arch/arm64/boot/dts/freescale/imx8ulp-evk.dts:32:
linux,cma-default;
arch/arm64/boot/dts/freescale/imx8-apalis-v1.1.dtsi:198:
 linux,cma-default;
arch/arm64/boot/dts/freescale/imx8mm-tqma8mqml.dtsi:48:
linux,cma-default;
arch/arm64/boot/dts/freescale/imx8dxl-evk.dts:50:
linux,cma-default;
arch/arm64/boot/dts/freescale/imx93-tqma9352.dtsi:24:
linux,cma-default;
arch/arm64/boot/dts/freescale/imx8mq-tqma8mq.dtsi:59:
linux,cma-default;
arch/arm64/boot/dts/freescale/imx8mn-tqma8mqnl.dtsi:46:
linux,cma-default;
arch/arm64/boot/dts/freescale/imx93-11x11-evk.dts:28:
linux,cma-default;
arch/arm64/boot/dts/hisilicon/hi6220-hikey.dts:67:
linux,cma-default;
arch/arm64/boot/dts/amlogic/meson-gx.dtsi:63:
linux,cma-default;
arch/arm64/boot/dts/amlogic/meson-g12-common.dtsi:121:
linux,cma-default;
arch/arm64/boot/dts/amlogic/meson-a1.dtsi:59:
linux,cma-default;


 I am esp concerned since there are platforms based on
> am62x and just 256MB DDR.
> 

The file "k3-am62x-sk-common.dtsi" refers DDR memory size as 2Gb[1] and so I
put CMA reservation in same file assuming all boards including this file have
2Gb.

But if there are some boards having lesser DDR and including this
k3-am62x-sk-common.dtsi and overriding memory node, I can put the CMA
reservation node in board specific file i.e. k3-am625-sk.dts in V2.

Kindly let me know if above is preferred approach.

[1]
https://gitlab.com/linux-kernel/linux-next/-/blob/next-20230807/arch/arm64/boot/dts/ti/k3-am62x-sk-common.dtsi?ref_type=tags#L33

Regards
Devarsh

>>  arch/arm64/boot/dts/ti/k3-am62x-sk-common.dtsi | 8 ++++++++
>>  1 file changed, 8 insertions(+)
>>
>> diff --git a/arch/arm64/boot/dts/ti/k3-am62x-sk-common.dtsi b/arch/arm64/boot/dts/ti/k3-am62x-sk-common.dtsi
>> index 34c8ffc553ec..9dd6e23ca9ca 100644
>> --- a/arch/arm64/boot/dts/ti/k3-am62x-sk-common.dtsi
>> +++ b/arch/arm64/boot/dts/ti/k3-am62x-sk-common.dtsi
>> @@ -47,6 +47,14 @@ ramoops@9ca00000 {
>>  			pmsg-size = <0x8000>;
>>  		};
>>  
>> +		/* global cma region */
>> +		linux,cma {
>> +			compatible = "shared-dma-pool";
>> +			reusable;
>> +			size = <0x00 0x8000000>;
>> +			linux,cma-default;
>> +		};
>> +
>>  		secure_tfa_ddr: tfa@9e780000 {
>>  			reg = <0x00 0x9e780000 0x00 0x80000>;
>>  			alignment = <0x1000>;
>> -- 
>> 2.34.1
>>
>
Francesco Dolcini Aug. 7, 2023, 6:46 a.m. UTC | #3
On Sat, Aug 05, 2023 at 02:33:55PM -0500, Nishanth Menon wrote:
> I am esp concerned since there are platforms based on am62x and just
> 256MB DDR.

On that regard, currently you have reserved memory areas just before
512MB DDR limit. Any plan to move those?

Francesco
Nishanth Menon Aug. 7, 2023, 12:07 p.m. UTC | #4
On 08:46-20230807, Francesco Dolcini wrote:
> On Sat, Aug 05, 2023 at 02:33:55PM -0500, Nishanth Menon wrote:
> > I am esp concerned since there are platforms based on am62x and just
> > 256MB DDR.
> 
> On that regard, currently you have reserved memory areas just before
> 512MB DDR limit. Any plan to move those?

Nope. Common min configuration we have seen is 512MB (why TI evms use
that definition). There are exceptional cases of 256MB boards, but in
such cases, the board dts should describe it.
Andrew Davis Aug. 7, 2023, 6 p.m. UTC | #5
On 8/7/23 1:03 AM, Devarsh Thakkar wrote:
> Hi Nishanth,
> 
> Thanks for the review.
> 
> On 06/08/23 01:03, Nishanth Menon wrote:
>> On 16:44-20230803, Devarsh Thakkar wrote:
>>> Reserve 128MiB of global CMA which is also marked as re-usable
>>> so that OS can also use the same if peripheral drivers are not using the
>>> same.
>>>
>>> AM62x supports multimedia components such as GPU, dual Display and Camera.
>>> Assuming the worst-case scenario where all 3 are run in parallel below
>>> is the calculation :
>>>
>>> 1) OV5640 camera sensor supports 1920x1080 resolution
>>> -> 1920 width x 1080 height x 2 bytesperpixel x 8 buffers
>>>     (default in yavta) : 32MiB
>>>
>>> 2) 1920x1200 Microtips LVDS panel supported
>>> -> 1920 width x 1080 height x 4 bytesperpixel x 2 buffers :
>>>     16 MiB
>>>
>>> 3) 1920x1080 HDMI display supported
>>> -> 1920 width x 1080 height x 4 bytesperpixel x 2 buffers :
>>>     15.82 MiB which is ~16 MiB
>>>
>>> 4) IMG GPU shares with display allocated buffers while rendering
>>>     but in case some dedicated operation viz color conversion,
>>>     keeping same window of ~16 MiB for GPU too.
>>>
>>> Total is 80 MiB and adding 32 MiB for other peripherals and extra
>>> 16 MiB to keep as buffer for fragmentation thus rounding total to 128
>>> MiB.
>>>
>>> Signed-off-by: Devarsh Thakkar <devarsht@ti.com>
>>> Acked-by: Darren Etheridge <detheridge@ti.com>
>>> Signed-off-by: Vignesh Raghavendra <vigneshr@ti.com>
>>> ---
>>
>> I don't think this is right approach. There are other techniques
>> than having to do this (Andrew: please comment) and require drivers to
>> behave properly.
> 
> Sorry but I did not understand clearly the disadvantage of this approach.
> Here we are reserving CMA and also marking it as re-usable so that in case
> driver is not using it OS can use that region.
> 

It isn't always that easy, many types of allocations can be pinned and
cannot be placed in this region. It still has cost.

> Also I see quite a few vendors already taking this approach :
> 

Just because others have gotten away with it doesn't mean it is correct :)

There are some cases when the DMA/CMA region needs to be in a specific
location as the hardware only supports some addresses (only some address
pins wired out, etc..). But general CMA size selection is a configuration
and so has no place in DT which should only be used to describe hardware.

Another issue I have is that this forces all users of these boards to
have this rather large carveout, even if they do not intend to use all
of these IP at the same time, or even at all.

Actually, upstream we don't support GPU yet, so you can't use all of
this carveout anyway.

Lastly, large CMA carveouts as in this case are masking a bigger issue,
there is hardware IP that cannot handle scatter-gather and there is
no system level IOMMU to help with that. This simply does not scale,
fragmentation can set in even with CMA in a running system, physically
contiguous allocations can still fail. As our devices grow in complexity
while still not having an IOMMU we would need to reserve ever increasingly
sized CMA areas.

This might sound like an ad absurdum argument, but we only need to look
at our current evil vendor tree to see where this leads [0]. Yes you
are reading the size right, 1.75GB(!) of CMA..

We need a better solution upstream, I'm not claiming I know what
that solution is (probably something involved in the memory allocation
to allow for more/larger movable pages). But for the above 3 reasons
this patch is not a viable solution.

[0] https://git.ti.com/cgit/ti-linux-kernel/ti-linux-kernel/tree/arch/arm64/boot/dts/ti/k3-am69-sk.dts?h=ti-linux-6.1.y#n48

Andrew

> $grep -r cma-default arch/arm64/boot/dts
> arch/arm64/boot/dts/qcom/sc8280xp-lenovo-thinkpad-x13s.dts:276:
>   linux,cma-default;
> arch/arm64/boot/dts/qcom/sc8280xp-crd.dts:222:
> linux,cma-default;
> arch/arm64/boot/dts/freescale/imx8mp-tqma8mpql-mba8mpxl.dts:201:
>           linux,cma-default;
> arch/arm64/boot/dts/freescale/imx8ulp-evk.dts:32:
> linux,cma-default;
> arch/arm64/boot/dts/freescale/imx8-apalis-v1.1.dtsi:198:
>   linux,cma-default;
> arch/arm64/boot/dts/freescale/imx8mm-tqma8mqml.dtsi:48:
> linux,cma-default;
> arch/arm64/boot/dts/freescale/imx8dxl-evk.dts:50:
> linux,cma-default;
> arch/arm64/boot/dts/freescale/imx93-tqma9352.dtsi:24:
> linux,cma-default;
> arch/arm64/boot/dts/freescale/imx8mq-tqma8mq.dtsi:59:
> linux,cma-default;
> arch/arm64/boot/dts/freescale/imx8mn-tqma8mqnl.dtsi:46:
> linux,cma-default;
> arch/arm64/boot/dts/freescale/imx93-11x11-evk.dts:28:
> linux,cma-default;
> arch/arm64/boot/dts/hisilicon/hi6220-hikey.dts:67:
> linux,cma-default;
> arch/arm64/boot/dts/amlogic/meson-gx.dtsi:63:
> linux,cma-default;
> arch/arm64/boot/dts/amlogic/meson-g12-common.dtsi:121:
> linux,cma-default;
> arch/arm64/boot/dts/amlogic/meson-a1.dtsi:59:
> linux,cma-default;
> 
> 
>   I am esp concerned since there are platforms based on
>> am62x and just 256MB DDR.
>>
> 
> The file "k3-am62x-sk-common.dtsi" refers DDR memory size as 2Gb[1] and so I
> put CMA reservation in same file assuming all boards including this file have
> 2Gb.
> 
> But if there are some boards having lesser DDR and including this
> k3-am62x-sk-common.dtsi and overriding memory node, I can put the CMA
> reservation node in board specific file i.e. k3-am625-sk.dts in V2.
> 
> Kindly let me know if above is preferred approach.
> 
> [1]
> https://gitlab.com/linux-kernel/linux-next/-/blob/next-20230807/arch/arm64/boot/dts/ti/k3-am62x-sk-common.dtsi?ref_type=tags#L33
> 
> Regards
> Devarsh
> 
>>>   arch/arm64/boot/dts/ti/k3-am62x-sk-common.dtsi | 8 ++++++++
>>>   1 file changed, 8 insertions(+)
>>>
>>> diff --git a/arch/arm64/boot/dts/ti/k3-am62x-sk-common.dtsi b/arch/arm64/boot/dts/ti/k3-am62x-sk-common.dtsi
>>> index 34c8ffc553ec..9dd6e23ca9ca 100644
>>> --- a/arch/arm64/boot/dts/ti/k3-am62x-sk-common.dtsi
>>> +++ b/arch/arm64/boot/dts/ti/k3-am62x-sk-common.dtsi
>>> @@ -47,6 +47,14 @@ ramoops@9ca00000 {
>>>   			pmsg-size = <0x8000>;
>>>   		};
>>>   
>>> +		/* global cma region */
>>> +		linux,cma {
>>> +			compatible = "shared-dma-pool";
>>> +			reusable;
>>> +			size = <0x00 0x8000000>;
>>> +			linux,cma-default;
>>> +		};
>>> +
>>>   		secure_tfa_ddr: tfa@9e780000 {
>>>   			reg = <0x00 0x9e780000 0x00 0x80000>;
>>>   			alignment = <0x1000>;
>>> -- 
>>> 2.34.1
>>>
>>
Devarsh Thakkar Aug. 8, 2023, 8:14 a.m. UTC | #6
Hi Andrew,

Thanks for the review.

On 07/08/23 23:30, Andrew Davis wrote:
> On 8/7/23 1:03 AM, Devarsh Thakkar wrote:
>> Hi Nishanth,
>>
>> Thanks for the review.
>>
>> On 06/08/23 01:03, Nishanth Menon wrote:
>>> On 16:44-20230803, Devarsh Thakkar wrote:
>>>> Reserve 128MiB of global CMA which is also marked as re-usable
>>>> so that OS can also use the same if peripheral drivers are not using the
>>>> same.
>>>>
>>>> AM62x supports multimedia components such as GPU, dual Display and Camera.
>>>> Assuming the worst-case scenario where all 3 are run in parallel below
>>>> is the calculation :
>>>>
>>>> 1) OV5640 camera sensor supports 1920x1080 resolution
>>>> -> 1920 width x 1080 height x 2 bytesperpixel x 8 buffers
>>>>     (default in yavta) : 32MiB
>>>>
>>>> 2) 1920x1200 Microtips LVDS panel supported
>>>> -> 1920 width x 1080 height x 4 bytesperpixel x 2 buffers :
>>>>     16 MiB
>>>>
>>>> 3) 1920x1080 HDMI display supported
>>>> -> 1920 width x 1080 height x 4 bytesperpixel x 2 buffers :
>>>>     15.82 MiB which is ~16 MiB
>>>>
>>>> 4) IMG GPU shares with display allocated buffers while rendering
>>>>     but in case some dedicated operation viz color conversion,
>>>>     keeping same window of ~16 MiB for GPU too.
>>>>
>>>> Total is 80 MiB and adding 32 MiB for other peripherals and extra
>>>> 16 MiB to keep as buffer for fragmentation thus rounding total to 128
>>>> MiB.
>>>>
>>>> Signed-off-by: Devarsh Thakkar <devarsht@ti.com>
>>>> Acked-by: Darren Etheridge <detheridge@ti.com>
>>>> Signed-off-by: Vignesh Raghavendra <vigneshr@ti.com>
>>>> ---
>>>
>>> I don't think this is right approach. There are other techniques
>>> than having to do this (Andrew: please comment) and require drivers to
>>> behave properly.
>>
>> Sorry but I did not understand clearly the disadvantage of this approach.
>> Here we are reserving CMA and also marking it as re-usable so that in case
>> driver is not using it OS can use that region.
>>
> 

Agreed, the examples where shared just for informational purpose.

> It isn't always that easy, many types of allocations can be pinned and
> cannot be placed in this region. It still has cost.
> 

I agree if some allocations are requested with non-moveable pages than cma
region can't serve those, but that looks more like a tradeoff to me as we also
want to support those drivers requiring CMA to use it on-demand and for
allocations requiring non-moveable pages they can be still served through
non-CMA region.


>> Also I see quite a few vendors already taking this approach :
>>
> 
> Just because others have gotten away with it doesn't mean it is correct :)
> 
> There are some cases when the DMA/CMA region needs to be in a specific
> location as the hardware only supports some addresses (only some address
> pins wired out, etc..). But general CMA size selection is a configuration
> and so has no place in DT which should only be used to describe hardware.
> 

But I do see the DT specification [0] mentioning this provision for CMA size
selection, so I don't think we are violating the DT specification while
choosing to set CMA size this way :


> Another issue I have is that this forces all users of these boards to
> have this rather large carveout, even if they do not intend to use all
> of these IP at the same time, or even at all.
> 
> Actually, upstream we don't support GPU yet, so you can't use all of
> this carveout anyway.
> 
> Lastly, large CMA carveouts as in this case are masking a bigger issue,
> there is hardware IP that cannot handle scatter-gather and there is
> no system level IOMMU to help with that. This simply does not scale,
> fragmentation can set in even with CMA in a running system, physically
> contiguous allocations can still fail. As our devices grow in complexity
> while still not having an IOMMU we would need to reserve ever increasingly
> sized CMA areas.
> 

Yes I think the CMA allocator tries to address the same scenarios and CMA
reservation is needed to support use-cases with those hardware IP's.

Also, I agree one magic value of CMA may not fit aptly all the users as some
might not intend to use these IPs at all but a default value for CMA can be
chosen to support basic use-cases supported by SoC,
and if the setting doesn't suit the user they can override this setting using
bootargs or some other method.


> This might sound like an ad absurdum argument, but we only need to look
> at our current evil vendor tree to see where this leads [0]. Yes you
> are reading the size right, 1.75GB(!) of CMA..
> 

Hmm I don't have much context on large CMA there, but that board has 32Gb RAM :)

> We need a better solution upstream, I'm not claiming I know what
> that solution is (probably something involved in the memory allocation
> to allow for more/larger movable pages). But for the above 3 reasons
> this patch is not a viable solution.
> 

Regarding movable pages, I think the CMA reservation uses pages allocated with
MIGRATE_CMA [1] flag which ensure that it uses only the movable pages so
reserved memory can be used by system if not used by driver and driver can
reclaim it back when it requires.

I am not sure what could be better solution than this, but I think camera and
display are primary use-cases for AM625 and base device-tree will be enabling
these peripherals by default and so the basic use-cases with them should also
by supported out-of-box. My suggestion is to go with this approach and if
somebody find better approach than this, we can update to it at that time.

The alternatives I see are having a shared dma pool between a set of
peripherals but that will require each driver using it to call
of_reserved_mem_device_init(dev) before starting allocation and prevent other
drivers (those not linked with this reserved region) from using CMA, and so I
did not prefer that approach and prefer the base approach as shared in this patch.

Kindly let me know your opinion.

[0]
https://devicetree-specification.readthedocs.io/en/latest/chapter3-devicenodes.html#:~:text=If%20a%20linux%2Ccma%2Ddefault
[1] https://elixir.bootlin.com/linux/latest/source/include/linux/mmzone.h#L62

Regards
Devarsh

> [0]
> https://git.ti.com/cgit/ti-linux-kernel/ti-linux-kernel/tree/arch/arm64/boot/dts/ti/k3-am69-sk.dts?h=ti-linux-6.1.y#n48
> 
> Andrew
> 
>> $grep -r cma-default arch/arm64/boot/dts
>> arch/arm64/boot/dts/qcom/sc8280xp-lenovo-thinkpad-x13s.dts:276:
>>   linux,cma-default;
>> arch/arm64/boot/dts/qcom/sc8280xp-crd.dts:222:
>> linux,cma-default;
>> arch/arm64/boot/dts/freescale/imx8mp-tqma8mpql-mba8mpxl.dts:201:
>>           linux,cma-default;
>> arch/arm64/boot/dts/freescale/imx8ulp-evk.dts:32:
>> linux,cma-default;
>> arch/arm64/boot/dts/freescale/imx8-apalis-v1.1.dtsi:198:
>>   linux,cma-default;
>> arch/arm64/boot/dts/freescale/imx8mm-tqma8mqml.dtsi:48:
>> linux,cma-default;
>> arch/arm64/boot/dts/freescale/imx8dxl-evk.dts:50:
>> linux,cma-default;
>> arch/arm64/boot/dts/freescale/imx93-tqma9352.dtsi:24:
>> linux,cma-default;
>> arch/arm64/boot/dts/freescale/imx8mq-tqma8mq.dtsi:59:
>> linux,cma-default;
>> arch/arm64/boot/dts/freescale/imx8mn-tqma8mqnl.dtsi:46:
>> linux,cma-default;
>> arch/arm64/boot/dts/freescale/imx93-11x11-evk.dts:28:
>> linux,cma-default;
>> arch/arm64/boot/dts/hisilicon/hi6220-hikey.dts:67:
>> linux,cma-default;
>> arch/arm64/boot/dts/amlogic/meson-gx.dtsi:63:
>> linux,cma-default;
>> arch/arm64/boot/dts/amlogic/meson-g12-common.dtsi:121:
>> linux,cma-default;
>> arch/arm64/boot/dts/amlogic/meson-a1.dtsi:59:
>> linux,cma-default;
>>
>>
>>   I am esp concerned since there are platforms based on
>>> am62x and just 256MB DDR.
>>>
>>
>> The file "k3-am62x-sk-common.dtsi" refers DDR memory size as 2Gb[1] and so I
>> put CMA reservation in same file assuming all boards including this file have
>> 2Gb.
>>
>> But if there are some boards having lesser DDR and including this
>> k3-am62x-sk-common.dtsi and overriding memory node, I can put the CMA
>> reservation node in board specific file i.e. k3-am625-sk.dts in V2.
>>
>> Kindly let me know if above is preferred approach.
>>
>> [1]
>> https://gitlab.com/linux-kernel/linux-next/-/blob/next-20230807/arch/arm64/boot/dts/ti/k3-am62x-sk-common.dtsi?ref_type=tags#L33
>>
>> Regards
>> Devarsh
>>
>>>>   arch/arm64/boot/dts/ti/k3-am62x-sk-common.dtsi | 8 ++++++++
>>>>   1 file changed, 8 insertions(+)
>>>>
>>>> diff --git a/arch/arm64/boot/dts/ti/k3-am62x-sk-common.dtsi
>>>> b/arch/arm64/boot/dts/ti/k3-am62x-sk-common.dtsi
>>>> index 34c8ffc553ec..9dd6e23ca9ca 100644
>>>> --- a/arch/arm64/boot/dts/ti/k3-am62x-sk-common.dtsi
>>>> +++ b/arch/arm64/boot/dts/ti/k3-am62x-sk-common.dtsi
>>>> @@ -47,6 +47,14 @@ ramoops@9ca00000 {
>>>>               pmsg-size = <0x8000>;
>>>>           };
>>>>   +        /* global cma region */
>>>> +        linux,cma {
>>>> +            compatible = "shared-dma-pool";
>>>> +            reusable;
>>>> +            size = <0x00 0x8000000>;
>>>> +            linux,cma-default;
>>>> +        };
>>>> +
>>>>           secure_tfa_ddr: tfa@9e780000 {
>>>>               reg = <0x00 0x9e780000 0x00 0x80000>;
>>>>               alignment = <0x1000>;
>>>> -- 
>>>> 2.34.1
>>>>
>>>
Devarsh Thakkar May 14, 2024, 2:29 p.m. UTC | #7
Hi Andrew, Vignesh, Nishanth,

On 08/08/23 13:44, Devarsh Thakkar wrote:
[...]
>>>>> Reserve 128MiB of global CMA which is also marked as re-usable
>>>>> so that OS can also use the same if peripheral drivers are not using the
>>>>> same.
>>>>>

I wanted to re-initiate discussion on this. Per discussion with few of the
maintainers in OSS 2023 Summit, the suggestion was either to have MMU in the
devices or use the CMA region as done with this patch.
There was not much traction for the idea/suggestion to have some updates in
CMA API to dynamically increase the CMA region per the requirement (for e.g.
move the MIGRATE_MOVABLE pages to CMA area dynamically). Also I see some
efforts in the past made on that direction for introducing CMA_AGGRESSIVE
Kconfig to allocate from ZONE_MOVEABLE but were not accepted [1] and
nevertheless it still requires CMA region to be available to begin with and
only gives a bit of extra margin for the CMA.

My vote is still to use global cma region with re-usable flag (as done with
this patch) and 128 MiB is well estimated per the use-cases AM62x EVM is meant
to cater (the vendor specific kernel has been using it) and is very much
needed to provided out-of-box experience as otherwise basic display and camera
use-cases don't work out-of-box. Besides I see many other boards [presumably
mmu-less] using a similar approach :

     $git grep linux,cma-default arch/ | wc
     36      72    2538


Kindly let us know your opinion on this if we are aligned to proceed with this
approach or not.

[1]:
https://lore.kernel.org/all/20141107070655.GA3486@bbox/
https://linux-mm.kvack.narkive.com/LCGSAqAp/patch-0-4-cma-aggressive-make-cma-memory-be-more-aggressive-about-allocation


Regards
Devarsh
diff mbox series

Patch

diff --git a/arch/arm64/boot/dts/ti/k3-am62x-sk-common.dtsi b/arch/arm64/boot/dts/ti/k3-am62x-sk-common.dtsi
index 34c8ffc553ec..9dd6e23ca9ca 100644
--- a/arch/arm64/boot/dts/ti/k3-am62x-sk-common.dtsi
+++ b/arch/arm64/boot/dts/ti/k3-am62x-sk-common.dtsi
@@ -47,6 +47,14 @@  ramoops@9ca00000 {
 			pmsg-size = <0x8000>;
 		};
 
+		/* global cma region */
+		linux,cma {
+			compatible = "shared-dma-pool";
+			reusable;
+			size = <0x00 0x8000000>;
+			linux,cma-default;
+		};
+
 		secure_tfa_ddr: tfa@9e780000 {
 			reg = <0x00 0x9e780000 0x00 0x80000>;
 			alignment = <0x1000>;