diff mbox

[v2,09/10] ARM: dts: sun7i-a20: Add Video Engine and reserved memory nodes

Message ID 20180419154536.17846-5-paul.kocialkowski@bootlin.com (mailing list archive)
State New, archived
Headers show

Commit Message

Paul Kocialkowski April 19, 2018, 3:45 p.m. UTC
This adds nodes for the Video Engine and the associated reserved memory
for the Allwinner A20. Up to 96 MiB of memory are dedicated to the VPU.

The VPU can only map the first 256 MiB of DRAM, so the reserved memory
pool has to be located in that area. Following Allwinner's decision in
downstream software, the last 96 MiB of the first 256 MiB of RAM are
reserved for this purpose.

Signed-off-by: Paul Kocialkowski <paul.kocialkowski@bootlin.com>
---
 arch/arm/boot/dts/sun7i-a20.dtsi | 31 +++++++++++++++++++++++++++++++
 1 file changed, 31 insertions(+)

Comments

Maxime Ripard April 20, 2018, 7:39 a.m. UTC | #1
On Thu, Apr 19, 2018 at 05:45:35PM +0200, Paul Kocialkowski wrote:
> This adds nodes for the Video Engine and the associated reserved memory
> for the Allwinner A20. Up to 96 MiB of memory are dedicated to the VPU.
> 
> The VPU can only map the first 256 MiB of DRAM, so the reserved memory
> pool has to be located in that area. Following Allwinner's decision in
> downstream software, the last 96 MiB of the first 256 MiB of RAM are
> reserved for this purpose.
> 
> Signed-off-by: Paul Kocialkowski <paul.kocialkowski@bootlin.com>
> ---
>  arch/arm/boot/dts/sun7i-a20.dtsi | 31 +++++++++++++++++++++++++++++++
>  1 file changed, 31 insertions(+)
> 
> diff --git a/arch/arm/boot/dts/sun7i-a20.dtsi b/arch/arm/boot/dts/sun7i-a20.dtsi
> index bd0cd3204273..cb6d82065dcf 100644
> --- a/arch/arm/boot/dts/sun7i-a20.dtsi
> +++ b/arch/arm/boot/dts/sun7i-a20.dtsi
> @@ -163,6 +163,20 @@
>  		reg = <0x40000000 0x80000000>;
>  	};
>  
> +	reserved-memory {
> +		#address-cells = <1>;
> +		#size-cells = <1>;
> +		ranges;
> +
> +		/* Address must be kept in the lower 256 MiBs of DRAM for VE. */
> +		ve_memory: cma@4a000000 {
> +			compatible = "shared-dma-pool";
> +			reg = <0x4a000000 0x6000000>;
> +			no-map;

I'm not sure why no-map is needed.

And I guess we could use alloc-ranges to make sure the region is in
the proper memory range, instead of hardcoding it.

> +			linux,cma-default;
> +		};
> +	};
> +
>  	timer {
>  		compatible = "arm,armv7-timer";
>  		interrupts = <GIC_PPI 13 (GIC_CPU_MASK_SIMPLE(4) | IRQ_TYPE_LEVEL_LOW)>,
> @@ -451,6 +465,23 @@
>  			};
>  		};
>  
> +		ve: video-engine@1c0e000 {
> +			compatible = "allwinner,sun4i-a10-video-engine";

We should have an A20-specific compatible here, at least, so that we
can deal with any quirk we might find down the road.

> +			reg = <0x01c0e000 0x1000>;
> +			memory-region = <&ve_memory>;

Since you made the CMA region the default one, you don't need to tie
it to that device in particular (and you can drop it being mandatory
from your binding as well).

> +
> +			clocks = <&ccu CLK_AHB_VE>, <&ccu CLK_VE>,
> +				 <&ccu CLK_DRAM_VE>;
> +			clock-names = "ahb", "mod", "ram";
> +
> +			assigned-clocks = <&ccu CLK_VE>;
> +			assigned-clock-rates = <320000000>;

This should be set from within the driver. If it's something that you
absolutely needed for the device to operate, you have no guarantee
that the clock rate won't change at any point in time after the device
probe, so that's not a proper solution.

And if it's not needed and can be adjusted depending on the
framerate/codec/resolution, then it shouldn't be in the DT either.

Don't you also need to map the SRAM on the A20?

Maxime
Paul Kocialkowski May 4, 2018, 7:49 a.m. UTC | #2
Hi,

On Fri, 2018-04-20 at 09:39 +0200, Maxime Ripard wrote:
> On Thu, Apr 19, 2018 at 05:45:35PM +0200, Paul Kocialkowski wrote:
> > This adds nodes for the Video Engine and the associated reserved
> > memory
> > for the Allwinner A20. Up to 96 MiB of memory are dedicated to the
> > VPU.
> > 
> > The VPU can only map the first 256 MiB of DRAM, so the reserved
> > memory
> > pool has to be located in that area. Following Allwinner's decision
> > in
> > downstream software, the last 96 MiB of the first 256 MiB of RAM are
> > reserved for this purpose.
> > 
> > Signed-off-by: Paul Kocialkowski <paul.kocialkowski@bootlin.com>
> > ---
> >  arch/arm/boot/dts/sun7i-a20.dtsi | 31
> > +++++++++++++++++++++++++++++++
> >  1 file changed, 31 insertions(+)
> > 
> > diff --git a/arch/arm/boot/dts/sun7i-a20.dtsi
> > b/arch/arm/boot/dts/sun7i-a20.dtsi
> > index bd0cd3204273..cb6d82065dcf 100644
> > --- a/arch/arm/boot/dts/sun7i-a20.dtsi
> > +++ b/arch/arm/boot/dts/sun7i-a20.dtsi
> > @@ -163,6 +163,20 @@
> >  		reg = <0x40000000 0x80000000>;
> >  	};
> >  
> > +	reserved-memory {
> > +		#address-cells = <1>;
> > +		#size-cells = <1>;
> > +		ranges;
> > +
> > +		/* Address must be kept in the lower 256 MiBs of
> > DRAM for VE. */
> > +		ve_memory: cma@4a000000 {
> > +			compatible = "shared-dma-pool";
> > +			reg = <0x4a000000 0x6000000>;
> > +			no-map;
> 
> I'm not sure why no-map is needed.

In fact, having no-map here would lead to reserving the area as cache-
coherent instead of contiguous and thus prevented dmabuf support.
Replacing it by "resuable" allows proper CMA reservation.

> And I guess we could use alloc-ranges to make sure the region is in
> the proper memory range, instead of hardcoding it.

As far as I could understand from the documentation, "alloc-ranges" is
used for dynamic allocation while only "reg" is used for static
allocation. We are currently going with static allocation and thus
reserve the whole 96 MiB. Is using dynamic allocation instead desirable
here?

> > +			linux,cma-default;
> > +		};
> > +	};
> > +
> >  	timer {
> >  		compatible = "arm,armv7-timer";
> >  		interrupts = <GIC_PPI 13 (GIC_CPU_MASK_SIMPLE(4) |
> > IRQ_TYPE_LEVEL_LOW)>,
> > @@ -451,6 +465,23 @@
> >  			};
> >  		};
> >  
> > +		ve: video-engine@1c0e000 {
> > +			compatible = "allwinner,sun4i-a10-video-
> > engine";
> 
> We should have an A20-specific compatible here, at least, so that we
> can deal with any quirk we might find down the road.

Yes that's a very good point.

> > +			reg = <0x01c0e000 0x1000>;
> > +			memory-region = <&ve_memory>;
> 
> Since you made the CMA region the default one, you don't need to tie
> it to that device in particular (and you can drop it being mandatory
> from your binding as well).

What if another driver (or the system) claims memory from that zone and
that the reserved memory ends up not being available for the VPU
anymore?

Acccording to the reserved-memory documentation, the reusable property
(that we need for dmabuf) puts a limitation that the device driver
owning the region must be able to reclaim it back.

How does that work out if the CMA region is not tied to a driver in
particular?

> > +
> > +			clocks = <&ccu CLK_AHB_VE>, <&ccu CLK_VE>,
> > +				 <&ccu CLK_DRAM_VE>;
> > +			clock-names = "ahb", "mod", "ram";
> > +
> > +			assigned-clocks = <&ccu CLK_VE>;
> > +			assigned-clock-rates = <320000000>;
> 
> This should be set from within the driver. If it's something that you
> absolutely needed for the device to operate, you have no guarantee
> that the clock rate won't change at any point in time after the device
> probe, so that's not a proper solution.
> 
> And if it's not needed and can be adjusted depending on the
> framerate/codec/resolution, then it shouldn't be in the DT either.

Yes, that makes sense.

> Don't you also need to map the SRAM on the A20?

That's a good point, there is currently no syscon handle for A20 (and
also A13). Maybe SRAM is muxed to the VE by default so it "just works"? 

I'll investigate on this side, also keeping in mind that the actual
solution is to use the SRAM controller driver (but that won't make it to
v3).

Cheers and thanks for the review!
Maxime Ripard May 4, 2018, 8:40 a.m. UTC | #3
On Fri, May 04, 2018 at 09:49:16AM +0200, Paul Kocialkowski wrote:
> > > +	reserved-memory {
> > > +		#address-cells = <1>;
> > > +		#size-cells = <1>;
> > > +		ranges;
> > > +
> > > +		/* Address must be kept in the lower 256 MiBs of
> > > DRAM for VE. */
> > > +		ve_memory: cma@4a000000 {
> > > +			compatible = "shared-dma-pool";
> > > +			reg = <0x4a000000 0x6000000>;
> > > +			no-map;
> > 
> > I'm not sure why no-map is needed.
> 
> In fact, having no-map here would lead to reserving the area as cache-
> coherent instead of contiguous and thus prevented dmabuf support.
> Replacing it by "resuable" allows proper CMA reservation.
> 
> > And I guess we could use alloc-ranges to make sure the region is in
> > the proper memory range, instead of hardcoding it.
> 
> As far as I could understand from the documentation, "alloc-ranges" is
> used for dynamic allocation while only "reg" is used for static
> allocation. We are currently going with static allocation and thus
> reserve the whole 96 MiB. Is using dynamic allocation instead desirable
> here?

I guess we could turn the question backward. Why do we need a static
allocation? This isn't a buffer that is always allocated on the same
area, but rather that we have a range available. So our constraint is
on the range, nothing else.

> > > +			reg = <0x01c0e000 0x1000>;
> > > +			memory-region = <&ve_memory>;
> > 
> > Since you made the CMA region the default one, you don't need to tie
> > it to that device in particular (and you can drop it being mandatory
> > from your binding as well).
> 
> What if another driver (or the system) claims memory from that zone and
> that the reserved memory ends up not being available for the VPU
> anymore?
> 
> Acccording to the reserved-memory documentation, the reusable property
> (that we need for dmabuf) puts a limitation that the device driver
> owning the region must be able to reclaim it back.
> 
> How does that work out if the CMA region is not tied to a driver in
> particular?

I'm not sure to get what you're saying. You have the property
linux,cma-default in your reserved region, so the behaviour you
described is what you explicitly asked for.

> 
> > > +
> > > +			clocks = <&ccu CLK_AHB_VE>, <&ccu CLK_VE>,
> > > +				 <&ccu CLK_DRAM_VE>;
> > > +			clock-names = "ahb", "mod", "ram";
> > > +
> > > +			assigned-clocks = <&ccu CLK_VE>;
> > > +			assigned-clock-rates = <320000000>;
> > 
> > This should be set from within the driver. If it's something that you
> > absolutely needed for the device to operate, you have no guarantee
> > that the clock rate won't change at any point in time after the device
> > probe, so that's not a proper solution.
> > 
> > And if it's not needed and can be adjusted depending on the
> > framerate/codec/resolution, then it shouldn't be in the DT either.
> 
> Yes, that makes sense.
> 
> > Don't you also need to map the SRAM on the A20?
> 
> That's a good point, there is currently no syscon handle for A20 (and
> also A13). Maybe SRAM is muxed to the VE by default so it "just works"? 
> 
> I'll investigate on this side, also keeping in mind that the actual
> solution is to use the SRAM controller driver (but that won't make it to
> v3).

The SRAM driver is available on the A20, so you should really use that
instead of a syscon.

Maxime
Paul Kocialkowski May 4, 2018, 8:47 a.m. UTC | #4
Hi,

On Fri, 2018-05-04 at 10:40 +0200, Maxime Ripard wrote:
> On Fri, May 04, 2018 at 09:49:16AM +0200, Paul Kocialkowski wrote:
> > > > +	reserved-memory {
> > > > +		#address-cells = <1>;
> > > > +		#size-cells = <1>;
> > > > +		ranges;
> > > > +
> > > > +		/* Address must be kept in the lower 256 MiBs
> > > > of
> > > > DRAM for VE. */
> > > > +		ve_memory: cma@4a000000 {
> > > > +			compatible = "shared-dma-pool";
> > > > +			reg = <0x4a000000 0x6000000>;
> > > > +			no-map;
> > > 
> > > I'm not sure why no-map is needed.
> > 
> > In fact, having no-map here would lead to reserving the area as
> > cache-
> > coherent instead of contiguous and thus prevented dmabuf support.
> > Replacing it by "resuable" allows proper CMA reservation.
> > 
> > > And I guess we could use alloc-ranges to make sure the region is
> > > in
> > > the proper memory range, instead of hardcoding it.
> > 
> > As far as I could understand from the documentation, "alloc-ranges"
> > is
> > used for dynamic allocation while only "reg" is used for static
> > allocation. We are currently going with static allocation and thus
> > reserve the whole 96 MiB. Is using dynamic allocation instead
> > desirable
> > here?
> 
> I guess we could turn the question backward. Why do we need a static
> allocation? This isn't a buffer that is always allocated on the same
> area, but rather that we have a range available. So our constraint is
> on the range, nothing else.

That makes sense, I will give it a shot with a range then.

> > > > +			reg = <0x01c0e000 0x1000>;
> > > > +			memory-region = <&ve_memory>;
> > > 
> > > Since you made the CMA region the default one, you don't need to
> > > tie
> > > it to that device in particular (and you can drop it being
> > > mandatory
> > > from your binding as well).
> > 
> > What if another driver (or the system) claims memory from that zone
> > and
> > that the reserved memory ends up not being available for the VPU
> > anymore?
> > 
> > Acccording to the reserved-memory documentation, the reusable
> > property
> > (that we need for dmabuf) puts a limitation that the device driver
> > owning the region must be able to reclaim it back.
> > 
> > How does that work out if the CMA region is not tied to a driver in
> > particular?
> 
> I'm not sure to get what you're saying. You have the property
> linux,cma-default in your reserved region, so the behaviour you
> described is what you explicitly asked for.

My point is that I don't see how the driver can claim back (part of) the
reserved area if the area is not explicitly attached to it.

Or is that mechanism made in a way that all drivers wishing to use the
reserved memory area can claim it back from the system, but there is no
priority (other than first-come first-served) for which drivers claims
it back in case two want to use the same reserved region (in a scenario
where there isn't enough memory to allow both drivers)?

> > > > +
> > > > +			clocks = <&ccu CLK_AHB_VE>, <&ccu
> > > > CLK_VE>,
> > > > +				 <&ccu CLK_DRAM_VE>;
> > > > +			clock-names = "ahb", "mod", "ram";
> > > > +
> > > > +			assigned-clocks = <&ccu CLK_VE>;
> > > > +			assigned-clock-rates = <320000000>;
> > > 
> > > This should be set from within the driver. If it's something that
> > > you
> > > absolutely needed for the device to operate, you have no guarantee
> > > that the clock rate won't change at any point in time after the
> > > device
> > > probe, so that's not a proper solution.
> > > 
> > > And if it's not needed and can be adjusted depending on the
> > > framerate/codec/resolution, then it shouldn't be in the DT either.
> > 
> > Yes, that makes sense.
> > 
> > > Don't you also need to map the SRAM on the A20?
> > 
> > That's a good point, there is currently no syscon handle for A20
> > (and
> > also A13). Maybe SRAM is muxed to the VE by default so it "just
> > works"? 
> > 
> > I'll investigate on this side, also keeping in mind that the actual
> > solution is to use the SRAM controller driver (but that won't make
> > it to
> > v3).
> 
> The SRAM driver is available on the A20, so you should really use that
> instead of a syscon.

The SRAM driver is indeed available for the A20, but still lacks support
for the VE in particular as far as I can see.

Cheers,
Paul Kocialkowski May 4, 2018, 8:54 a.m. UTC | #5
On Fri, 2018-05-04 at 10:47 +0200, Paul Kocialkowski wrote:
> > > > Don't you also need to map the SRAM on the A20?
> > > 
> > > That's a good point, there is currently no syscon handle for A20
> > > (and
> > > also A13). Maybe SRAM is muxed to the VE by default so it "just
> > > works"? 

I just checked on the manual and it appears that SRAM Area C1 is muxed
to the VE at reset, so we can probably keep things as-is until the SRAM
driver is ready to handle explicitly muxing that area to the VE.
Maxime Ripard May 4, 2018, 9:15 a.m. UTC | #6
On Fri, May 04, 2018 at 10:47:44AM +0200, Paul Kocialkowski wrote:
> > > > > +			reg = <0x01c0e000 0x1000>;
> > > > > +			memory-region = <&ve_memory>;
> > > > 
> > > > Since you made the CMA region the default one, you don't need to
> > > > tie
> > > > it to that device in particular (and you can drop it being
> > > > mandatory
> > > > from your binding as well).
> > > 
> > > What if another driver (or the system) claims memory from that zone
> > > and
> > > that the reserved memory ends up not being available for the VPU
> > > anymore?
> > > 
> > > Acccording to the reserved-memory documentation, the reusable
> > > property
> > > (that we need for dmabuf) puts a limitation that the device driver
> > > owning the region must be able to reclaim it back.
> > > 
> > > How does that work out if the CMA region is not tied to a driver in
> > > particular?
> > 
> > I'm not sure to get what you're saying. You have the property
> > linux,cma-default in your reserved region, so the behaviour you
> > described is what you explicitly asked for.
> 
> My point is that I don't see how the driver can claim back (part of) the
> reserved area if the area is not explicitly attached to it.
> 
> Or is that mechanism made in a way that all drivers wishing to use the
> reserved memory area can claim it back from the system, but there is no
> priority (other than first-come first-served) for which drivers claims
> it back in case two want to use the same reserved region (in a scenario
> where there isn't enough memory to allow both drivers)?

This is indeed what happens. Reusable is to let the system use the
reserved memory for things like caches that can easily be dropped when
a driver wants to use the memory in that reserved area. Once that
memory has been allocated, there's no claiming back, unless that
memory segment was freed of course.

Maxime
Paul Kocialkowski May 4, 2018, 12:04 p.m. UTC | #7
Hi,

On Fri, 2018-05-04 at 11:15 +0200, Maxime Ripard wrote:
> On Fri, May 04, 2018 at 10:47:44AM +0200, Paul Kocialkowski wrote:
> > > > > > +			reg = <0x01c0e000 0x1000>;
> > > > > > +			memory-region = <&ve_memory>;
> > > > > 
> > > > > Since you made the CMA region the default one, you don't need
> > > > > to
> > > > > tie
> > > > > it to that device in particular (and you can drop it being
> > > > > mandatory
> > > > > from your binding as well).
> > > > 
> > > > What if another driver (or the system) claims memory from that
> > > > zone
> > > > and
> > > > that the reserved memory ends up not being available for the VPU
> > > > anymore?
> > > > 
> > > > Acccording to the reserved-memory documentation, the reusable
> > > > property
> > > > (that we need for dmabuf) puts a limitation that the device
> > > > driver
> > > > owning the region must be able to reclaim it back.
> > > > 
> > > > How does that work out if the CMA region is not tied to a driver
> > > > in
> > > > particular?
> > > 
> > > I'm not sure to get what you're saying. You have the property
> > > linux,cma-default in your reserved region, so the behaviour you
> > > described is what you explicitly asked for.
> > 
> > My point is that I don't see how the driver can claim back (part of)
> > the
> > reserved area if the area is not explicitly attached to it.
> > 
> > Or is that mechanism made in a way that all drivers wishing to use
> > the
> > reserved memory area can claim it back from the system, but there is
> > no
> > priority (other than first-come first-served) for which drivers
> > claims
> > it back in case two want to use the same reserved region (in a
> > scenario
> > where there isn't enough memory to allow both drivers)?
> 
> This is indeed what happens. Reusable is to let the system use the
> reserved memory for things like caches that can easily be dropped when
> a driver wants to use the memory in that reserved area. Once that
> memory has been allocated, there's no claiming back, unless that
> memory segment was freed of course.

Thanks for the clarification. So in our case, perhaps the best fit would
be to make that area the default CMA pool so that we can be ensured that
the whole 96 MiB is available for the VPU and that no other consumer of
CMA will use it?

Cheers,
Maxime Ripard May 4, 2018, 1:40 p.m. UTC | #8
On Fri, May 04, 2018 at 02:04:38PM +0200, Paul Kocialkowski wrote:
> On Fri, 2018-05-04 at 11:15 +0200, Maxime Ripard wrote:
> > On Fri, May 04, 2018 at 10:47:44AM +0200, Paul Kocialkowski wrote:
> > > > > > > +			reg = <0x01c0e000 0x1000>;
> > > > > > > +			memory-region = <&ve_memory>;
> > > > > > 
> > > > > > Since you made the CMA region the default one, you don't need
> > > > > > to
> > > > > > tie
> > > > > > it to that device in particular (and you can drop it being
> > > > > > mandatory
> > > > > > from your binding as well).
> > > > > 
> > > > > What if another driver (or the system) claims memory from that
> > > > > zone
> > > > > and
> > > > > that the reserved memory ends up not being available for the VPU
> > > > > anymore?
> > > > > 
> > > > > Acccording to the reserved-memory documentation, the reusable
> > > > > property
> > > > > (that we need for dmabuf) puts a limitation that the device
> > > > > driver
> > > > > owning the region must be able to reclaim it back.
> > > > > 
> > > > > How does that work out if the CMA region is not tied to a driver
> > > > > in
> > > > > particular?
> > > > 
> > > > I'm not sure to get what you're saying. You have the property
> > > > linux,cma-default in your reserved region, so the behaviour you
> > > > described is what you explicitly asked for.
> > > 
> > > My point is that I don't see how the driver can claim back (part of)
> > > the
> > > reserved area if the area is not explicitly attached to it.
> > > 
> > > Or is that mechanism made in a way that all drivers wishing to use
> > > the
> > > reserved memory area can claim it back from the system, but there is
> > > no
> > > priority (other than first-come first-served) for which drivers
> > > claims
> > > it back in case two want to use the same reserved region (in a
> > > scenario
> > > where there isn't enough memory to allow both drivers)?
> > 
> > This is indeed what happens. Reusable is to let the system use the
> > reserved memory for things like caches that can easily be dropped when
> > a driver wants to use the memory in that reserved area. Once that
> > memory has been allocated, there's no claiming back, unless that
> > memory segment was freed of course.
> 
> Thanks for the clarification. So in our case, perhaps the best fit would
> be to make that area the default CMA pool so that we can be ensured that
> the whole 96 MiB is available for the VPU and that no other consumer of
> CMA will use it?

The best fit for what use case ? We already discussed this, and I
don't see any point in having two separate CMA regions. If you have a
reasonably sized region that will accomodate for both the VPU and
display engine, why would we want to split them?

Or did you have any experience of running out of buffers?

Maxime
Paul Kocialkowski May 4, 2018, 1:57 p.m. UTC | #9
On Fri, 2018-05-04 at 15:40 +0200, Maxime Ripard wrote:
> On Fri, May 04, 2018 at 02:04:38PM +0200, Paul Kocialkowski wrote:
> > On Fri, 2018-05-04 at 11:15 +0200, Maxime Ripard wrote:
> > > On Fri, May 04, 2018 at 10:47:44AM +0200, Paul Kocialkowski wrote:
> > > > > > > > +			reg = <0x01c0e000 0x1000>;
> > > > > > > > +			memory-region = <&ve_memory>;
> > > > > > > 
> > > > > > > Since you made the CMA region the default one, you don't
> > > > > > > need
> > > > > > > to
> > > > > > > tie
> > > > > > > it to that device in particular (and you can drop it being
> > > > > > > mandatory
> > > > > > > from your binding as well).
> > > > > > 
> > > > > > What if another driver (or the system) claims memory from
> > > > > > that
> > > > > > zone
> > > > > > and
> > > > > > that the reserved memory ends up not being available for the
> > > > > > VPU
> > > > > > anymore?
> > > > > > 
> > > > > > Acccording to the reserved-memory documentation, the
> > > > > > reusable
> > > > > > property
> > > > > > (that we need for dmabuf) puts a limitation that the device
> > > > > > driver
> > > > > > owning the region must be able to reclaim it back.
> > > > > > 
> > > > > > How does that work out if the CMA region is not tied to a
> > > > > > driver
> > > > > > in
> > > > > > particular?
> > > > > 
> > > > > I'm not sure to get what you're saying. You have the property
> > > > > linux,cma-default in your reserved region, so the behaviour
> > > > > you
> > > > > described is what you explicitly asked for.
> > > > 
> > > > My point is that I don't see how the driver can claim back (part
> > > > of)
> > > > the
> > > > reserved area if the area is not explicitly attached to it.
> > > > 
> > > > Or is that mechanism made in a way that all drivers wishing to
> > > > use
> > > > the
> > > > reserved memory area can claim it back from the system, but
> > > > there is
> > > > no
> > > > priority (other than first-come first-served) for which drivers
> > > > claims
> > > > it back in case two want to use the same reserved region (in a
> > > > scenario
> > > > where there isn't enough memory to allow both drivers)?
> > > 
> > > This is indeed what happens. Reusable is to let the system use the
> > > reserved memory for things like caches that can easily be dropped
> > > when
> > > a driver wants to use the memory in that reserved area. Once that
> > > memory has been allocated, there's no claiming back, unless that
> > > memory segment was freed of course.
> > 
> > Thanks for the clarification. So in our case, perhaps the best fit
> > would
> > be to make that area the default CMA pool so that we can be ensured
> > that
> > the whole 96 MiB is available for the VPU and that no other consumer
> > of
> > CMA will use it?
> 
> The best fit for what use case ? We already discussed this, and I
> don't see any point in having two separate CMA regions. If you have a
> reasonably sized region that will accomodate for both the VPU and
> display engine, why would we want to split them?

The use case I have in mind is boilerplate use of the VPU with the
display engine, say with DMAbuf.

It wasn't exactly clear in my memory whether we had decided that the CMA
pool we use for the VPU should also be used for other CMA consumers (I
realize that this is what we've been doing all along by having
linux,cma-default; though).

The fact that the memory region will accomodate for both the display
engine and the VPU is not straightforward IMO and I think this has to be
an explicit choice that we take. I was under the impression that we
chose the 96 MiB because that's what the Allwinner reference code does.
Does the reference code also use this pool for display?

I liked the idea of having 96 MiB fully reserved to the VPU because it
allows us to provide a limit on the use case, such as "this guarantees N
buffers for resolution foo in format bar". If the display engine also
uses it, then the limit also depends on how many GEM buffers are
allocated (unless I'm missing something).

> Or did you have any experience of running out of buffers?

Not yet, I haven't. I have no objection with making the reserved region
the default CMA pool and have other consumers use it too.

Cheers,
Maxime Ripard May 4, 2018, 3:44 p.m. UTC | #10
On Fri, May 04, 2018 at 03:57:48PM +0200, Paul Kocialkowski wrote:
> On Fri, 2018-05-04 at 15:40 +0200, Maxime Ripard wrote:
> > On Fri, May 04, 2018 at 02:04:38PM +0200, Paul Kocialkowski wrote:
> > > On Fri, 2018-05-04 at 11:15 +0200, Maxime Ripard wrote:
> > > > On Fri, May 04, 2018 at 10:47:44AM +0200, Paul Kocialkowski wrote:
> > > > > > > > > +			reg = <0x01c0e000 0x1000>;
> > > > > > > > > +			memory-region = <&ve_memory>;
> > > > > > > > 
> > > > > > > > Since you made the CMA region the default one, you don't
> > > > > > > > need
> > > > > > > > to
> > > > > > > > tie
> > > > > > > > it to that device in particular (and you can drop it being
> > > > > > > > mandatory
> > > > > > > > from your binding as well).
> > > > > > > 
> > > > > > > What if another driver (or the system) claims memory from
> > > > > > > that
> > > > > > > zone
> > > > > > > and
> > > > > > > that the reserved memory ends up not being available for the
> > > > > > > VPU
> > > > > > > anymore?
> > > > > > > 
> > > > > > > Acccording to the reserved-memory documentation, the
> > > > > > > reusable
> > > > > > > property
> > > > > > > (that we need for dmabuf) puts a limitation that the device
> > > > > > > driver
> > > > > > > owning the region must be able to reclaim it back.
> > > > > > > 
> > > > > > > How does that work out if the CMA region is not tied to a
> > > > > > > driver
> > > > > > > in
> > > > > > > particular?
> > > > > > 
> > > > > > I'm not sure to get what you're saying. You have the property
> > > > > > linux,cma-default in your reserved region, so the behaviour
> > > > > > you
> > > > > > described is what you explicitly asked for.
> > > > > 
> > > > > My point is that I don't see how the driver can claim back (part
> > > > > of)
> > > > > the
> > > > > reserved area if the area is not explicitly attached to it.
> > > > > 
> > > > > Or is that mechanism made in a way that all drivers wishing to
> > > > > use
> > > > > the
> > > > > reserved memory area can claim it back from the system, but
> > > > > there is
> > > > > no
> > > > > priority (other than first-come first-served) for which drivers
> > > > > claims
> > > > > it back in case two want to use the same reserved region (in a
> > > > > scenario
> > > > > where there isn't enough memory to allow both drivers)?
> > > > 
> > > > This is indeed what happens. Reusable is to let the system use the
> > > > reserved memory for things like caches that can easily be dropped
> > > > when
> > > > a driver wants to use the memory in that reserved area. Once that
> > > > memory has been allocated, there's no claiming back, unless that
> > > > memory segment was freed of course.
> > > 
> > > Thanks for the clarification. So in our case, perhaps the best fit
> > > would
> > > be to make that area the default CMA pool so that we can be ensured
> > > that
> > > the whole 96 MiB is available for the VPU and that no other consumer
> > > of
> > > CMA will use it?
> > 
> > The best fit for what use case ? We already discussed this, and I
> > don't see any point in having two separate CMA regions. If you have a
> > reasonably sized region that will accomodate for both the VPU and
> > display engine, why would we want to split them?
> 
> The use case I have in mind is boilerplate use of the VPU with the
> display engine, say with DMAbuf.
> 
> It wasn't exactly clear in my memory whether we had decided that the CMA
> pool we use for the VPU should also be used for other CMA consumers (I
> realize that this is what we've been doing all along by having
> linux,cma-default; though).
> 
> The fact that the memory region will accomodate for both the display
> engine and the VPU is not straightforward IMO and I think this has to be
> an explicit choice that we take. I was under the impression that we
> chose the 96 MiB because that's what the Allwinner reference code does.
> Does the reference code also use this pool for display?

Yes

> I liked the idea of having 96 MiB fully reserved to the VPU because it
> allows us to provide a limit on the use case, such as "this guarantees N
> buffers for resolution foo in format bar". If the display engine also
> uses it, then the limit also depends on how many GEM buffers are
> allocated (unless I'm missing something).

This also guarantees that you take away a fifth of the RAM on some
boards. If we had yet another CMA pool of 64MB as is the multi_v7
defconfig, that's a third of your RAM that's gone, possibly for no
particular reason.

If we make the math, let's say that we are running a system with 4
planes used in 1080p, with 4 Bpp, in double buffering (which is
already an unlikely setup). Let's add on top of that that we're
decoding a 1080p video with 8 buffers pre-allocated with 2Bpp (in
YUV422). Which really seems extreme now :)

And we're at 80MB. My guess is that your memory bus is going to be
dead before you need to use all that memory.

Maxime
diff mbox

Patch

diff --git a/arch/arm/boot/dts/sun7i-a20.dtsi b/arch/arm/boot/dts/sun7i-a20.dtsi
index bd0cd3204273..cb6d82065dcf 100644
--- a/arch/arm/boot/dts/sun7i-a20.dtsi
+++ b/arch/arm/boot/dts/sun7i-a20.dtsi
@@ -163,6 +163,20 @@ 
 		reg = <0x40000000 0x80000000>;
 	};
 
+	reserved-memory {
+		#address-cells = <1>;
+		#size-cells = <1>;
+		ranges;
+
+		/* Address must be kept in the lower 256 MiBs of DRAM for VE. */
+		ve_memory: cma@4a000000 {
+			compatible = "shared-dma-pool";
+			reg = <0x4a000000 0x6000000>;
+			no-map;
+			linux,cma-default;
+		};
+	};
+
 	timer {
 		compatible = "arm,armv7-timer";
 		interrupts = <GIC_PPI 13 (GIC_CPU_MASK_SIMPLE(4) | IRQ_TYPE_LEVEL_LOW)>,
@@ -451,6 +465,23 @@ 
 			};
 		};
 
+		ve: video-engine@1c0e000 {
+			compatible = "allwinner,sun4i-a10-video-engine";
+			reg = <0x01c0e000 0x1000>;
+			memory-region = <&ve_memory>;
+
+			clocks = <&ccu CLK_AHB_VE>, <&ccu CLK_VE>,
+				 <&ccu CLK_DRAM_VE>;
+			clock-names = "ahb", "mod", "ram";
+
+			assigned-clocks = <&ccu CLK_VE>;
+			assigned-clock-rates = <320000000>;
+
+			resets = <&ccu RST_VE>;
+
+			interrupts = <GIC_SPI 53 IRQ_TYPE_LEVEL_HIGH>;
+		};
+
 		mmc0: mmc@1c0f000 {
 			compatible = "allwinner,sun7i-a20-mmc";
 			reg = <0x01c0f000 0x1000>;