
Limiting the DMA zone in arm64

Message ID 0439cc08532849b1d0adb44a7b2cbc9ce5dceaf7.camel@suse.de (mailing list archive)
State New, archived

Commit Message

Nicolas Saenz Julienne July 11, 2019, 9:51 a.m. UTC
Hi,
I'm trying to bring up the new RPi4 on arm64, and running into issues with DMA
allocations. The device has up to 4GB of ram, but AFAIK only the first GB of
ram can be used for DMA: the DMA address range is 0xc0000000-0xffffffff which
is aliased to the first GB of memory 0x00000000-0x40000000.

This is solved in arm32 using a board file with '.dma_zone_size = SZ_1G'. But I
haven't found any similar mechanism for arm64. Any suggestions?

Just in case it helps understand the issue, I managed to get things going by
doing the following:

Nicolas

Comments

Jisheng Zhang July 11, 2019, 10:15 a.m. UTC | #1
On Thu, 11 Jul 2019 11:51:57 +0200
Nicolas Saenz Julienne wrote:

> Hi,
> I'm trying to bring up the new RPi4 on arm64, and running into issues with DMA
> allocations. The device has up to 4GB of ram, but AFAIK only the first GB of
> ram can be used for DMA: the DMA address range is 0xc0000000-0xffffffff which
> is aliased to the first GB of memory 0x00000000-0x40000000.
> 
> This is solved in arm32 using a board file with '.dma_zone_size = SZ_1G'. But I
> haven't found any similar mechanism for arm64. Any suggestions?

Maybe set up dma-ranges on the soc bus in the DT?

soc {
	compatible = "simple-bus";
	dma-ranges = <0 0x40000000 0 0x40000000 0x40000000>;
...
}


> 
> Just in case it helps understand the issue, I managed to get things going by
> doing the following:
> 
> diff --git a/arch/arm64/mm/init.c b/arch/arm64/mm/init.c
> index f3c795278def..ec3cb7b76a76 100644
> --- a/arch/arm64/mm/init.c
> +++ b/arch/arm64/mm/init.c
> @@ -407,7 +407,8 @@ void __init arm64_memblock_init(void)
> 
>         /* 4GB maximum for 32-bit only capable devices */
>         if (IS_ENABLED(CONFIG_ZONE_DMA32))
> -               arm64_dma_phys_limit = max_zone_dma_phys();
> +               arm64_dma_phys_limit = 0x40000000;
>         else
>                 arm64_dma_phys_limit = PHYS_MASK + 1;
> 
> Regards,
> Nicolas
Will Deacon July 11, 2019, 10:17 a.m. UTC | #2
Hi Nicolas,

[+Robin, Andrew and Marc since we've been playing with getting arm64 Linux
 up and running too]

On Thu, Jul 11, 2019 at 11:51:57AM +0200, Nicolas Saenz Julienne wrote:
> I'm trying to bring up the new RPi4 on arm64, and running into issues with DMA
> allocations. The device has up to 4GB of ram, but AFAIK only the first GB of
> ram can be used for DMA: the DMA address range is 0xc0000000-0xffffffff which
> is aliased to the first GB of memory 0x00000000-0x40000000.

Do you know for sure that these aliases are equivalent and so it's
inconsequential if we use the lower addresses for DMA? Also, does this
limitation apply to all DMA-capable peripherals, or just some of them?

> This is solved in arm32 using a board file with '.dma_zone_size = SZ_1G'. But I
> haven't found any similar mechanism for arm64. Any suggestions?
> 
> Just in case it helps understand the issue, I managed to get things going by
> doing the following:
> 
> diff --git a/arch/arm64/mm/init.c b/arch/arm64/mm/init.c
> index f3c795278def..ec3cb7b76a76 100644
> --- a/arch/arm64/mm/init.c
> +++ b/arch/arm64/mm/init.c
> @@ -407,7 +407,8 @@ void __init arm64_memblock_init(void)
>  
>         /* 4GB maximum for 32-bit only capable devices */
>         if (IS_ENABLED(CONFIG_ZONE_DMA32))
> -               arm64_dma_phys_limit = max_zone_dma_phys();
> +               arm64_dma_phys_limit = 0x40000000;
>         else
>                 arm64_dma_phys_limit = PHYS_MASK + 1;

My superficial understanding (mainly from talking to Robin, who actually
knows how this works), is that we'd need to extend our support for
dma-ranges in order to limit ZONE_DMA32 as you're proposing above. However,
this may not help for streaming DMA, where we need to force everything
above 1G through a bounce buffer and likely requires something weird like
a 30-bit DMA mask.
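
To make the 30-bit mask concrete, here is a minimal userspace sketch in C. `dma_bit_mask()` is modelled on the kernel's `DMA_BIT_MASK()` macro; `needs_bounce()` is a hypothetical helper, not a kernel function. It assumes the addresses are plain physical addresses (it ignores any bus offset) and that `addr + size` does not overflow.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Build a DMA mask covering the low `bits` address bits (1..64). */
static uint64_t dma_bit_mask(unsigned int bits)
{
	assert(bits >= 1 && bits <= 64);
	return bits == 64 ? ~0ULL : (1ULL << bits) - 1;
}

/*
 * Hypothetical helper: true when the buffer [addr, addr + size) is not
 * fully addressable under `mask` and would have to go through a bounce
 * buffer.
 */
static bool needs_bounce(uint64_t addr, uint64_t size, uint64_t mask)
{
	assert(size > 0);
	return addr + size - 1 > mask;
}
```

With a 30-bit mask, a page ending exactly at 1G is still reachable, while a page starting at 0x40000000 would need bouncing.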

Do you know how streaming DMA is handled in the 32-bit port for rpi4?

Will
Nicolas Saenz Julienne July 11, 2019, 10:17 a.m. UTC | #3
On Thu, 2019-07-11 at 10:15 +0000, Jisheng Zhang wrote:
> On Thu, 11 Jul 2019 11:51:57 +0200
> Nicolas Saenz Julienne wrote:
> 
> > Hi,
> > I'm trying to bring up the new RPi4 on arm64, and running into issues with
> > DMA
> > allocations. The device has up to 4GB of ram, but AFAIK only the first GB of
> > ram can be used for DMA: the DMA address range is 0xc0000000-0xffffffff
> > which
> > is aliased to the first GB of memory 0x00000000-0x40000000.
> > 
> > This is solved in arm32 using a board file with '.dma_zone_size = SZ_1G'.
> > But I
> > haven't found any similar mechanism for arm64. Any suggestions?
> 
> maybe setting up of the dma-ranges in the soc bus in DT?
> 
> soc {
> 	compatible = "simple-bus";
> 	dma-ranges = <0 0x40000000 0 0x40000000 0x40000000>;
> ...
> }

They are set up like this (in the RPi foundation downstream kernel):

soc {
	/* Emulate a contiguous 30-bit address range for DMA */
	dma-ranges = <0xc0000000  0x0 0x00000000  0x3c000000>;
	...
}
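
For illustration, the translation this entry describes can be sketched in userspace C. This assumes the cells read as <child-bus-address parent-address(2 cells) size>, i.e. one address cell on the soc bus and two on the parent; the struct and function names are made up for the example and are not kernel API.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* One dma-ranges entry: <child-bus-address parent-address size>. */
struct dma_range {
	uint64_t bus_start;	/* address as seen by the peripheral */
	uint64_t cpu_start;	/* matching CPU physical address */
	uint64_t size;		/* length of the window in bytes */
};

/* Values copied from the downstream dma-ranges shown above. */
static const struct dma_range rpi4_soc_range = {
	.bus_start = 0xc0000000ULL,
	.cpu_start = 0x00000000ULL,
	.size      = 0x3c000000ULL,
};

/*
 * Translate a CPU physical address to a bus address.
 * Returns false when the address lies outside the window.
 */
static bool phys_to_bus(const struct dma_range *r, uint64_t phys, uint64_t *bus)
{
	assert(r && bus);
	if (phys < r->cpu_start || phys - r->cpu_start >= r->size)
		return false;
	*bus = r->bus_start + (phys - r->cpu_start);
	return true;
}
```

Under that reading, physical 0x1000 maps to bus address 0xc0001000, and anything at or above physical 0x3c000000 has no bus address at all.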

> > Just in case it helps understand the issue, I managed to get things going by
> > doing the following:
> > 
> > diff --git a/arch/arm64/mm/init.c b/arch/arm64/mm/init.c
> > index f3c795278def..ec3cb7b76a76 100644
> > --- a/arch/arm64/mm/init.c
> > +++ b/arch/arm64/mm/init.c
> > @@ -407,7 +407,8 @@ void __init arm64_memblock_init(void)
> > 
> >         /* 4GB maximum for 32-bit only capable devices */
> >         if (IS_ENABLED(CONFIG_ZONE_DMA32))
> > -               arm64_dma_phys_limit = max_zone_dma_phys();
> > +               arm64_dma_phys_limit = 0x40000000;
> >         else
> >                 arm64_dma_phys_limit = PHYS_MASK + 1;
> > 
> > Regards,
> > Nicolas
Nicolas Saenz Julienne July 11, 2019, 11:10 a.m. UTC | #4
Hi Will, thanks for your response.

[+ Matthias and Phil who might be interested ]

On Thu, 2019-07-11 at 11:17 +0100, Will Deacon wrote:
> Hi Nicolas,
> 
> [+Robin, Andrew and Marc since we've been playing with getting arm64 Linux
>  up and running too]
> 
> On Thu, Jul 11, 2019 at 11:51:57AM +0200, Nicolas Saenz Julienne wrote:
> > I'm trying to bring up the new RPi4 on arm64, and running into issues with
> > DMA
> > allocations. The device has up to 4GB of ram, but AFAIK only the first GB of
> > ram can be used for DMA: the DMA address range is 0xc0000000-0xffffffff
> > which
> > is aliased to the first GB of memory 0x00000000-0x40000000.
> 
> Do you know for sure that these aliases are equivalent and so it's
> inconsequential if we use the lower addresses for DMA?

No, they are not exactly equivalent, see the 'dma-ranges' I posted on my other
reply. I was being overly generic to make the explanation simpler. The actual
size of the aliasing is smaller.

That said, I don't think using the lower addresses would work for DMA. I tested
some transfers and the offset is clearly being taken into account.

> Also, does this
> limitation apply to all DMA-capable peripherals, or just some of them?

I infer from '.dma_zone_size = SZ_1G' and dma-ranges that it's a device-wide
limitation. Maybe Phil can contradict me.

> > This is solved in arm32 using a board file with '.dma_zone_size = SZ_1G'.
> > But I
> > haven't found any similar mechanism for arm64. Any suggestions?
> > 
> > Just in case it helps understand the issue, I managed to get things going by
> > doing the following:
> > 
> > diff --git a/arch/arm64/mm/init.c b/arch/arm64/mm/init.c
> > index f3c795278def..ec3cb7b76a76 100644
> > --- a/arch/arm64/mm/init.c
> > +++ b/arch/arm64/mm/init.c
> > @@ -407,7 +407,8 @@ void __init arm64_memblock_init(void)
> >  
> >         /* 4GB maximum for 32-bit only capable devices */
> >         if (IS_ENABLED(CONFIG_ZONE_DMA32))
> > -               arm64_dma_phys_limit = max_zone_dma_phys();
> > +               arm64_dma_phys_limit = 0x40000000;
> >         else
> >                 arm64_dma_phys_limit = PHYS_MASK + 1;
> 
> My superficial understanding (mainly from talking to Robin, who actually
> knows how this works), is that we'd need to extend our support for
> dma-ranges in order to limit ZONE_DMA32 as you're proposing above.

Noted.

> However, this may not help for streaming DMA, where we need to force
> everything above 1G through a bounce buffer and likely requires something
> weird like a 30-bit DMA mask.
>
> Do you know how streaming DMA is handled in the 32-bit port for rpi4?

Not really, I'll have a look and come back to you.

Regards,
Nicolas
Phil Elwell July 11, 2019, 1:32 p.m. UTC | #5
Hi Nicolas et al.

On 11/07/2019 12:10, Nicolas Saenz Julienne wrote:
> Hi Will, thanks for your response.
> 
> [+ Matthias and Phil who might be interested ]
> 
> On Thu, 2019-07-11 at 11:17 +0100, Will Deacon wrote:
>> Hi Nicolas,
>>
>> [+Robin, Andrew and Marc since we've been playing with getting arm64 Linux
>>  up and running too]
>>
>> On Thu, Jul 11, 2019 at 11:51:57AM +0200, Nicolas Saenz Julienne wrote:
>>> I'm trying to bring up the new RPi4 on arm64, and running into issues with
>>> DMA
>>> allocations. The device has up to 4GB of ram, but AFAIK only the first GB of
>>> ram can be used for DMA: the DMA address range is 0xc0000000-0xffffffff
>>> which
>>> is aliased to the first GB of memory 0x00000000-0x40000000.
>>
>> Do you know for sure that these aliases are equivalent and so it's
>> inconsequential if we use the lower addresses for DMA?
> 
> No, they are not exactly equivalent, see the 'dma-ranges' I posted on my other
> reply. I was being overly generic to make the explanation simpler. The actual
> size of the aliasing is smaller.
> 
> That said, I don't think using the lower addresses would work for DMA. I tested
> some transfers and the offset is clearly being taken into account.
> 
>> Also, does this
>> limitation apply to all DMA-capable peripherals, or just some of them?
> 
> I infer from '.dma_zone_size = SZ_1G' and dma-ranges that it's a device-wide
> limitation. Maybe Phil can contradict me.

It is a limitation for one of the internal buses used by many of the peripherals.
Newer components including the ARM cores, PCIe(*), GENET, and the 40-bit DMA
channels, have a different view of the address space where RAM starts at 0, and
peripherals etc. are positioned above 0x4_00000000.

>>> This is solved in arm32 using a board file with '.dma_zone_size = SZ_1G'.
>>> But I
>>> haven't found any similar mechanism for arm64. Any suggestions?
>>>
>>> Just in case it helps understand the issue, I managed to get things going by
>>> doing the following:
>>>
>>> diff --git a/arch/arm64/mm/init.c b/arch/arm64/mm/init.c
>>> index f3c795278def..ec3cb7b76a76 100644
>>> --- a/arch/arm64/mm/init.c
>>> +++ b/arch/arm64/mm/init.c
>>> @@ -407,7 +407,8 @@ void __init arm64_memblock_init(void)
>>>  
>>>         /* 4GB maximum for 32-bit only capable devices */
>>>         if (IS_ENABLED(CONFIG_ZONE_DMA32))
>>> -               arm64_dma_phys_limit = max_zone_dma_phys();
>>> +               arm64_dma_phys_limit = 0x40000000;
>>>         else
>>>                 arm64_dma_phys_limit = PHYS_MASK + 1;
>>
>> My superficial understanding (mainly from talking to Robin, who actually
>> knows how this works), is that we'd need to extend our support for
>> dma-ranges in order to limit ZONE_DMA32 as you're proposing above.
> 
> Noted.
> 
>> However, this may not help for streaming DMA, where we need to force
>> everything above 1G through a bounce buffer and likely requires something
>> weird like a 30-bit DMA mask.
>>
>> Do you know how streaming DMA is handled in the 32-bit port for rpi4?
> 
> Not really, I'll have a look and come back to you.

Phil

(*) The wrapper around the PCIe block has a bug preventing it from accessing beyond
the first 3GB of RAM.

Patch

diff --git a/arch/arm64/mm/init.c b/arch/arm64/mm/init.c
index f3c795278def..ec3cb7b76a76 100644
--- a/arch/arm64/mm/init.c
+++ b/arch/arm64/mm/init.c
@@ -407,7 +407,8 @@  void __init arm64_memblock_init(void)
 
        /* 4GB maximum for 32-bit only capable devices */
        if (IS_ENABLED(CONFIG_ZONE_DMA32))
-               arm64_dma_phys_limit = max_zone_dma_phys();
+               arm64_dma_phys_limit = 0x40000000;
        else
                arm64_dma_phys_limit = PHYS_MASK + 1;
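
The hunk above hardcodes 1G for every arm64 platform, so it only serves as a demonstration. A more general version would have to derive the limit from the platform's addressing constraint. A rough userspace model of that calculation follows; `zone_dma_limit()` and its parameters are hypothetical and not kernel API, and how the kernel would learn `zone_bits` (for example from dma-ranges) is left open.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper: upper physical bound of a DMA zone that must stay
 * within `zone_bits` address bits, clamped to the end of RAM.
 */
static uint64_t zone_dma_limit(uint64_t dram_start, uint64_t dram_end,
			       unsigned int zone_bits)
{
	assert(zone_bits >= 1 && zone_bits < 64);
	assert(dram_start < dram_end);

	uint64_t window = 1ULL << zone_bits;
	/* Keep the bits above the window so RAM that starts high still gets a zone. */
	uint64_t offset = dram_start & ~(window - 1);
	uint64_t limit = offset + window;

	return limit < dram_end ? limit : dram_end;
}
```

With `zone_bits = 32` this models the 4GB bound mentioned in the code comment; with `zone_bits = 30` and 4GB of RAM starting at 0 it yields 0x40000000, the value hardcoded in the patch.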
