diff mbox

mtd: nand: raw: atmel: add module param to avoid using dma

Message ID c3cc1894-2d7d-93b6-a9de-ed9ca4564ae9@axentia.se (mailing list archive)
State New, archived
Headers show

Commit Message

Peter Rosin May 28, 2018, 10:10 a.m. UTC
On 2018-05-28 00:11, Peter Rosin wrote:
> On 2018-05-27 11:18, Peter Rosin wrote:
>> On 2018-05-25 16:51, Tudor Ambarus wrote:
>>> We think the best way is to keep LCD on DDR Ports 2 and 3 (8th and 9th
>>> slaves), to have maximum bandwidth and to use DMA on DDR port 1 for NAND
>>> (7th slave).
>>
>> Exactly how do I accomplish that?
>>
>> I can see how I can move the LCD between slave DDR port 2 and 3 by
>> selecting LCDC DMA master 8 or 9 (but according to the above it should
>> not matter). The big question is how I control what slave the NAND flash
>> is going to use? I find nothing in the datasheet, and the code is also
>> non-transparent enough for me to figure it out by myself without
>> throwing out this question first...
> 
> I added this:
> 
> diff --git a/drivers/mtd/nand/raw/atmel/nand-controller.c b/drivers/mtd/nand/raw/atmel/nand-controller.c
> index e686fe73159e..3b33c63d2ed4 100644
> --- a/drivers/mtd/nand/raw/atmel/nand-controller.c
> +++ b/drivers/mtd/nand/raw/atmel/nand-controller.c
> @@ -1991,6 +1991,9 @@ static int atmel_nand_controller_init(struct atmel_nand_controller *nc,
>  		nc->dmac = dma_request_channel(mask, NULL, NULL);
>  		if (!nc->dmac)
>  			dev_err(nc->dev, "Failed to request DMA channel\n");
> +
> +		dev_info(nc->dev, "using %s for DMA transfers\n",
> +			 dma_chan_name(nc->dmac));
>  	}
>  
>  	/* We do not retrieve the SMC syscon when parsing old DTs. */
> 
> 
> 
> and the output is
> 
> atmel-nand-controller 10000000.ebi:nand-controller: using dma0chan5 for DMA transfers
> 
> So, DMA controller 0 is in use. I still don't know if IF0, IF1 or IF2 is used
> or how to find out. I guess IF2 is not in use since that does not allow any
> DDR2 port as slave...
> 
> From the datasheet, DMAC0/IF0 uses DDR2 Port 2, and DMAC0/IF1 uses DDR2 Port 1.
> But, by the looks of the register content in my other mail, it seems as if
> DMA0/IF1 can also use DDR2 Port 3.
> 
> So, I think I want either
> 
> A) the NAND controller to use DMAC0/IF0 (i.e. DDR2 port 1, and possibly 3) and
>    the LCDC to use master 9 (i.e. DDR2 Port 2)
> 
> or
> 
> B) the NAND controller to use DMAC1/IF1 (i.e. DDR2 port 2) and the LCDC to use
>    master 8 (i.e. DDR2 Port 3)

Crap, that was not what I meant to express. Sorry for the confusion. This is
better.

So, I think I want either

A) the NAND controller to use master 1 DMAC0/IF0 (i.e. slave 8 DDR2 port 2) and
   the LCDC to use master 9 (i.e. slave 9 DDR2 Port 3)

or

B) the NAND controller to use master 2 DMAC0/IF1 (i.e. slave 7 DDR2 port 1, and
   possibly slave 9 DDR2 port 3 (if my previous findings are relevant) and the
   LCDC to use master 8 (i.e. slave 8 DDR2 Port 2)

> But, again, how do I limit DMAC0 to either of IF0 or IF1 for NAND accesses?

So, I added a horrid patch (attached), which mainly adds printk lines, but
additionally does one more thing in atc_prep_dma_memcpy. It changes the DSCR_IF
field (from 0) to 1 for DMA-memcpy for dma0chan5 (i.e. the NAND). At least I
think it does that?

Running with that patch gets me this:

# dmesg | grep -i dma
at_hdmac ffffe600.dma-controller: Atmel AHB DMA Controller ( cpy set slave ), 8 channels
at_hdmac ffffe800.dma-controller: Atmel AHB DMA Controller ( cpy set slave ), 8 channels
dma dma0chan0: xlate 0 2
dma dma0chan1: xlate 0 2
at91_i2c f0014000.i2c: using dma0chan0 (tx) and dma0chan1 (rx) for DMA transfers
dma dma1chan0: xlate 0 2
dma dma1chan1: xlate 0 2
at91_i2c f801c000.i2c: using dma1chan0 (tx) and dma1chan1 (rx) for DMA transfers
dma dma0chan2: xlate 0 2
dma dma0chan3: xlate 0 2
dma dma0chan4: xlate 0 2
atmel_mci f0000000.mmc: using dma0chan4 for DMA transfers
dma dma1chan2: xlate 0 2
dma dma1chan3: xlate 0 2
atmel_aes f8038000.aes: Atmel AES - Using dma1chan2, dma1chan3 for DMA transfers
dma dma1chan4: xlate 0 2
atmel_sha f8034000.sha: using dma1chan4 for DMA transfers
dma dma1chan5: xlate 0 2
dma dma1chan6: xlate 0 2
atmel_tdes f803c000.tdes: using dma1chan5, dma1chan6 for DMA transfers
atmel-nand-controller 10000000.ebi:nand-controller: using dma0chan5 for DMA transfers
dma dma0chan5: memcpy: 0
dma dma0chan5: DSCR_IF: 1
dma dma0chan5: memcpy: 1

So, output is as expected and I believe that the patch makes the NAND DMA
accesses use master 2 DMAC0/IF1 and are thus forced to use slave 7 DDR2 Port 1
(and possibly 9). The LCDC is using slave 8 DDR2 Port 2. So there should be no
slave conflict?

But the on-screen crap remains during NAND accesses.

But pressing on.

I then changed the priorities for all accesses to 0 in the PRxSy registers, except
the ones for masters 8/9 LCDC (slaves 8/9) which I left at priority 3.

But the on-screen crap remains during NAND accesses.

My guess is that the NAND DMA is doing too long bursts and that the LCDC therefore
has to wait too long and simply fails to keep the pipeline from running short?

So I tried to reduce the maximum SLOT_CYCLE for slaves 7 and 9 in the SCFGx
registers. No noticeable effect either.

I then tried to split bursts from master 2 (DMAC0/IF1) with small values in the
MCFG2 register. No effect.

I'm getting nowhere.

Cheers,
Peter

Comments

Boris Brezillon May 28, 2018, 2:27 p.m. UTC | #1
Hi Peter,

On Mon, 28 May 2018 12:10:02 +0200
Peter Rosin <peda@axentia.se> wrote:

> On 2018-05-28 00:11, Peter Rosin wrote:
> > On 2018-05-27 11:18, Peter Rosin wrote:  
> >> On 2018-05-25 16:51, Tudor Ambarus wrote:  
> >>> We think the best way is to keep LCD on DDR Ports 2 and 3 (8th and 9th
> >>> slaves), to have maximum bandwidth and to use DMA on DDR port 1 for NAND
> >>> (7th slave).  
> >>
> >> Exactly how do I accomplish that?
> >>
> >> I can see how I can move the LCD between slave DDR port 2 and 3 by
> >> selecting LCDC DMA master 8 or 9 (but according to the above it should
> >> not matter). The big question is how I control what slave the NAND flash
> >> is going to use? I find nothing in the datasheet, and the code is also
> >> non-transparent enough for me to figure it out by myself without
> >> throwing out this question first...  
> > 
> > I added this:
> > 
> > diff --git a/drivers/mtd/nand/raw/atmel/nand-controller.c b/drivers/mtd/nand/raw/atmel/nand-controller.c
> > index e686fe73159e..3b33c63d2ed4 100644
> > --- a/drivers/mtd/nand/raw/atmel/nand-controller.c
> > +++ b/drivers/mtd/nand/raw/atmel/nand-controller.c
> > @@ -1991,6 +1991,9 @@ static int atmel_nand_controller_init(struct atmel_nand_controller *nc,
> >  		nc->dmac = dma_request_channel(mask, NULL, NULL);
> >  		if (!nc->dmac)
> >  			dev_err(nc->dev, "Failed to request DMA channel\n");
> > +
> > +		dev_info(nc->dev, "using %s for DMA transfers\n",
> > +			 dma_chan_name(nc->dmac));
> >  	}
> >  
> >  	/* We do not retrieve the SMC syscon when parsing old DTs. */
> > 
> > 
> > 
> > and the output is
> > 
> > atmel-nand-controller 10000000.ebi:nand-controller: using dma0chan5 for DMA transfers
> > 
> > So, DMA controller 0 is in use. I still don't know if IF0, IF1 or IF2 is used
> > or how to find out. I guess IF2 is not in use since that does not allow any
> > DDR2 port as slave...
> > 
> > From the datasheet, DMAC0/IF0 uses DDR2 Port 2, and DMAC0/IF1 uses DDR2 Port 1.
> > But, by the looks of the register content in my other mail, it seems as if
> > DMA0/IF1 can also use DDR2 Port 3.
> > 
> > So, I think I want either
> > 
> > A) the NAND controller to use DMAC0/IF0 (i.e. DDR2 port 1, and possibly 3) and
> >    the LCDC to use master 9 (i.e. DDR2 Port 2)
> > 
> > or
> > 
> > B) the NAND controller to use DMAC1/IF1 (i.e. DDR2 port 2) and the LCDC to use
> >    master 8 (i.e. DDR2 Port 3)  
> 
> Crap, that was not what I meant to express. Sorry for the confusion. This is
> better.
> 
> So, I think I want either
> 
> A) the NAND controller to use master 1 DMAC0/IF0 (i.e. slave 8 DDR2 port 2) and
>    the LCDC to use master 9 (i.e. slave 9 DDR2 Port 3)
> 
> or
> 
> B) the NAND controller to use master 2 DMAC0/IF1 (i.e. slave 7 DDR2 port 1, and
>    possibly slave 9 DDR2 port 3 (if my previous findings are relevant) and the
>    LCDC to use master 8 (i.e. slave 8 DDR2 Port 2)
> 
> > But, again, how do I limit DMAC0 to either of IF0 or IF1 for NAND accesses?  
> 
> So, I added a horrid patch (attached), which mainly adds printk lines, but
> additionally does one more thing in atc_prep_dma_memcpy. It changes the DSCR_IF
> field (from 0) to 1 for DMA-memcpy for dma0chan5 (i.e. the NAND). At least I
> think it does that?
> 
> Running with that patch gets me this:
> 
> # dmesg | grep -i dma
> at_hdmac ffffe600.dma-controller: Atmel AHB DMA Controller ( cpy set slave ), 8 channels
> at_hdmac ffffe800.dma-controller: Atmel AHB DMA Controller ( cpy set slave ), 8 channels
> dma dma0chan0: xlate 0 2
> dma dma0chan1: xlate 0 2
> at91_i2c f0014000.i2c: using dma0chan0 (tx) and dma0chan1 (rx) for DMA transfers
> dma dma1chan0: xlate 0 2
> dma dma1chan1: xlate 0 2
> at91_i2c f801c000.i2c: using dma1chan0 (tx) and dma1chan1 (rx) for DMA transfers
> dma dma0chan2: xlate 0 2
> dma dma0chan3: xlate 0 2
> dma dma0chan4: xlate 0 2
> atmel_mci f0000000.mmc: using dma0chan4 for DMA transfers
> dma dma1chan2: xlate 0 2
> dma dma1chan3: xlate 0 2
> atmel_aes f8038000.aes: Atmel AES - Using dma1chan2, dma1chan3 for DMA transfers
> dma dma1chan4: xlate 0 2
> atmel_sha f8034000.sha: using dma1chan4 for DMA transfers
> dma dma1chan5: xlate 0 2
> dma dma1chan6: xlate 0 2
> atmel_tdes f803c000.tdes: using dma1chan5, dma1chan6 for DMA transfers
> atmel-nand-controller 10000000.ebi:nand-controller: using dma0chan5 for DMA transfers
> dma dma0chan5: memcpy: 0
> dma dma0chan5: DSCR_IF: 1
> dma dma0chan5: memcpy: 1
> 
> So, output is as expected and I believe that the patch makes the NAND DMA
> accesses use master 2 DMAC0/IF1 and are thus forced to use slave 7 DDR2 Port 1
> (and possibly 9). The LCDC is using slave 8 DDR2 Port 2. So there should be no
> slave conflict?
> 
> But the on-screen crap remains during NAND accesses.
> 
> But pressing on.
> 
> I then changed the priorities for all accesses to 0 in the PRxSy registers, except
> the ones for masters 8/9 LCDC (slaves 8/9) which I left at priority 3.
> 
> But the on-screen crap remains during NAND accesses.
> 
> My guess is that the NAND DMA is doing too long bursts and that the LCDC therefore
> has to wait too long and simply fails to keep the pipeline from running short?
> 
> So I tried to reduce the maximum SLOT_CYCLE for slaves 7 and 9 in the SCFGx
> registers. No noticeable effect either.
> 
> I then tried to split bursts from master 2 (DMAC0/IF1) with small values in the
> MCFG2 register. No effect.
> 
> I'm getting nowhere.

Could it just be that you're reaching the DDR bus limit. As I said
previously, when you go through the CPU, and assuming you're consuming
the data directly, you have:

1/ NFC SRAM -> CPU
2/ CPU -> L1 data cache --write-back--> DRAM
3/ L1-cache -> CPU

While, if you use DMA you get:

1/ NFC SRAM -> DRAM
2/ SDRAM -> L1 data cache -> CPU

So, if you're approaching the limit of (LP)DDR bandwidth, using the CPU
might make things a bit better. Still, if the limitation really comes
from the DDR bus, my opinion is that you should maybe use a smaller
resolution or use a more compact pixel format (RGB565?).

Did you calculate how much of the bandwidth is taken by the HLCDC
block and compared it to the max (LP)DDR bandwidth?

Regards,

Boris
Peter Rosin May 28, 2018, 3:52 p.m. UTC | #2
On 2018-05-28 16:27, Boris Brezillon wrote:
> Hi Peter,
> 
> On Mon, 28 May 2018 12:10:02 +0200
> Peter Rosin <peda@axentia.se> wrote:
> 
>> On 2018-05-28 00:11, Peter Rosin wrote:
>>> On 2018-05-27 11:18, Peter Rosin wrote:  
>>>> On 2018-05-25 16:51, Tudor Ambarus wrote:  
>>>>> We think the best way is to keep LCD on DDR Ports 2 and 3 (8th and 9th
>>>>> slaves), to have maximum bandwidth and to use DMA on DDR port 1 for NAND
>>>>> (7th slave).  
>>>>
>>>> Exactly how do I accomplish that?
>>>>
>>>> I can see how I can move the LCD between slave DDR port 2 and 3 by
>>>> selecting LCDC DMA master 8 or 9 (but according to the above it should
>>>> not matter). The big question is how I control what slave the NAND flash
>>>> is going to use? I find nothing in the datasheet, and the code is also
>>>> non-transparent enough for me to figure it out by myself without
>>>> throwing out this question first...  
>>>
>>> I added this:
>>>
>>> diff --git a/drivers/mtd/nand/raw/atmel/nand-controller.c b/drivers/mtd/nand/raw/atmel/nand-controller.c
>>> index e686fe73159e..3b33c63d2ed4 100644
>>> --- a/drivers/mtd/nand/raw/atmel/nand-controller.c
>>> +++ b/drivers/mtd/nand/raw/atmel/nand-controller.c
>>> @@ -1991,6 +1991,9 @@ static int atmel_nand_controller_init(struct atmel_nand_controller *nc,
>>>  		nc->dmac = dma_request_channel(mask, NULL, NULL);
>>>  		if (!nc->dmac)
>>>  			dev_err(nc->dev, "Failed to request DMA channel\n");
>>> +
>>> +		dev_info(nc->dev, "using %s for DMA transfers\n",
>>> +			 dma_chan_name(nc->dmac));
>>>  	}
>>>  
>>>  	/* We do not retrieve the SMC syscon when parsing old DTs. */
>>>
>>>
>>>
>>> and the output is
>>>
>>> atmel-nand-controller 10000000.ebi:nand-controller: using dma0chan5 for DMA transfers
>>>
>>> So, DMA controller 0 is in use. I still don't know if IF0, IF1 or IF2 is used
>>> or how to find out. I guess IF2 is not in use since that does not allow any
>>> DDR2 port as slave...
>>>
>>> From the datasheet, DMAC0/IF0 uses DDR2 Port 2, and DMAC0/IF1 uses DDR2 Port 1.
>>> But, by the looks of the register content in my other mail, it seems as if
>>> DMA0/IF1 can also use DDR2 Port 3.
>>>
>>> So, I think I want either
>>>
>>> A) the NAND controller to use DMAC0/IF0 (i.e. DDR2 port 1, and possibly 3) and
>>>    the LCDC to use master 9 (i.e. DDR2 Port 2)
>>>
>>> or
>>>
>>> B) the NAND controller to use DMAC1/IF1 (i.e. DDR2 port 2) and the LCDC to use
>>>    master 8 (i.e. DDR2 Port 3)  
>>
>> Crap, that was not what I meant to express. Sorry for the confusion. This is
>> better.
>>
>> So, I think I want either
>>
>> A) the NAND controller to use master 1 DMAC0/IF0 (i.e. slave 8 DDR2 port 2) and
>>    the LCDC to use master 9 (i.e. slave 9 DDR2 Port 3)
>>
>> or
>>
>> B) the NAND controller to use master 2 DMAC0/IF1 (i.e. slave 7 DDR2 port 1, and
>>    possibly slave 9 DDR2 port 3 (if my previous findings are relevant) and the
>>    LCDC to use master 8 (i.e. slave 8 DDR2 Port 2)
>>
>>> But, again, how do I limit DMAC0 to either of IF0 or IF1 for NAND accesses?  
>>
>> So, I added a horrid patch (attached), which mainly adds printk lines, but
>> additionally does one more thing in atc_prep_dma_memcpy. It changes the DSCR_IF
>> field (from 0) to 1 for DMA-memcpy for dma0chan5 (i.e. the NAND). At least I
>> think it does that?
>>
>> Running with that patch gets me this:
>>
>> # dmesg | grep -i dma
>> at_hdmac ffffe600.dma-controller: Atmel AHB DMA Controller ( cpy set slave ), 8 channels
>> at_hdmac ffffe800.dma-controller: Atmel AHB DMA Controller ( cpy set slave ), 8 channels
>> dma dma0chan0: xlate 0 2
>> dma dma0chan1: xlate 0 2
>> at91_i2c f0014000.i2c: using dma0chan0 (tx) and dma0chan1 (rx) for DMA transfers
>> dma dma1chan0: xlate 0 2
>> dma dma1chan1: xlate 0 2
>> at91_i2c f801c000.i2c: using dma1chan0 (tx) and dma1chan1 (rx) for DMA transfers
>> dma dma0chan2: xlate 0 2
>> dma dma0chan3: xlate 0 2
>> dma dma0chan4: xlate 0 2
>> atmel_mci f0000000.mmc: using dma0chan4 for DMA transfers
>> dma dma1chan2: xlate 0 2
>> dma dma1chan3: xlate 0 2
>> atmel_aes f8038000.aes: Atmel AES - Using dma1chan2, dma1chan3 for DMA transfers
>> dma dma1chan4: xlate 0 2
>> atmel_sha f8034000.sha: using dma1chan4 for DMA transfers
>> dma dma1chan5: xlate 0 2
>> dma dma1chan6: xlate 0 2
>> atmel_tdes f803c000.tdes: using dma1chan5, dma1chan6 for DMA transfers
>> atmel-nand-controller 10000000.ebi:nand-controller: using dma0chan5 for DMA transfers
>> dma dma0chan5: memcpy: 0
>> dma dma0chan5: DSCR_IF: 1
>> dma dma0chan5: memcpy: 1
>>
>> So, output is as expected and I believe that the patch makes the NAND DMA
>> accesses use master 2 DMAC0/IF1 and are thus forced to use slave 7 DDR2 Port 1
>> (and possibly 9). The LCDC is using slave 8 DDR2 Port 2. So there should be no
>> slave conflict?
>>
>> But the on-screen crap remains during NAND accesses.
>>
>> But pressing on.
>>
>> I then changed the priorities for all accesses to 0 in the PRxSy registers, except
>> the ones for masters 8/9 LCDC (slaves 8/9) which I left at priority 3.
>>
>> But the on-screen crap remains during NAND accesses.
>>
>> My guess is that the NAND DMA is doing too long bursts and that the LCDC therefore
>> has to wait too long and simply fails to keep the pipeline from running short?
>>
>> So I tried to reduce the maximum SLOT_CYCLE for slaves 7 and 9 in the SCFGx
>> registers. No noticeable effect either.
>>
>> I then tried to split bursts from master 2 (DMAC0/IF1) with small values in the
>> MCFG2 register. No effect.
>>
>> I'm getting nowhere.
> 
> Could it just be that you're reaching the DDR bus limit. As I said
> previously, when you go through the CPU, and assuming you're consuming
> the data directly, you have:
> 
> 1/ NFC SRAM -> CPU
> 2/ CPU -> L1 data cache --write-back--> DRAM
> 3/ L1-cache -> CPU
> 
> While, if you use DMA you get:
> 
> 1/ NFC SRAM -> DRAM
> 2/ SDRAM -> L1 data cache -> CPU
> 
> So, if you're approaching the limit of (LP)DDR bandwidth, using the CPU
> might make things a bit better. Still, if the limitation really comes
> from the DDR bus, my opinion is that you should maybe use a smaller
> resolution or use a more compact pixel format (RGB565?).

The issue is still there if I use a CLUT mode instead of rgb565, which is
what I normally use (and what I would like to use, CLUT is just alien and
a pain these days).

The panels we are using only supports one resolution (each), but the issue
is there with both 1920x1080@16bpp and 1024x768@8bpp (~60Hz).

> Did you calculate how much of the bandwidth is taken by the HLCDC
> block and compared it to the max (LP)DDR bandwidth?

I did, but don't remember the exact details. There is some room even for
1920x1080@16bpp, but not oceans of it. We were a bit uncertain if 16bpp
would be possible, and in fact that was the reason I worked on CLUT
support for atmel-hlcdc last year. But since the problem persists with
much less memory pressure as well, I don't think that's it either.

Admittedly I have not tested these AHB matrix tricks with a smaller
panel (it would take a bit of work to arrange for those tests), but the
issue was there when I last tried (using defaults).

Cheers,
Peter
Boris Brezillon May 28, 2018, 4:09 p.m. UTC | #3
On Mon, 28 May 2018 17:52:53 +0200
Peter Rosin <peda@axentia.se> wrote:

> On 2018-05-28 16:27, Boris Brezillon wrote:
> > Hi Peter,
> > 
> > On Mon, 28 May 2018 12:10:02 +0200
> > Peter Rosin <peda@axentia.se> wrote:
> >   
> >> On 2018-05-28 00:11, Peter Rosin wrote:  
> >>> On 2018-05-27 11:18, Peter Rosin wrote:    
> >>>> On 2018-05-25 16:51, Tudor Ambarus wrote:    
> >>>>> We think the best way is to keep LCD on DDR Ports 2 and 3 (8th and 9th
> >>>>> slaves), to have maximum bandwidth and to use DMA on DDR port 1 for NAND
> >>>>> (7th slave).    
> >>>>
> >>>> Exactly how do I accomplish that?
> >>>>
> >>>> I can see how I can move the LCD between slave DDR port 2 and 3 by
> >>>> selecting LCDC DMA master 8 or 9 (but according to the above it should
> >>>> not matter). The big question is how I control what slave the NAND flash
> >>>> is going to use? I find nothing in the datasheet, and the code is also
> >>>> non-transparent enough for me to figure it out by myself without
> >>>> throwing out this question first...    
> >>>
> >>> I added this:
> >>>
> >>> diff --git a/drivers/mtd/nand/raw/atmel/nand-controller.c b/drivers/mtd/nand/raw/atmel/nand-controller.c
> >>> index e686fe73159e..3b33c63d2ed4 100644
> >>> --- a/drivers/mtd/nand/raw/atmel/nand-controller.c
> >>> +++ b/drivers/mtd/nand/raw/atmel/nand-controller.c
> >>> @@ -1991,6 +1991,9 @@ static int atmel_nand_controller_init(struct atmel_nand_controller *nc,
> >>>  		nc->dmac = dma_request_channel(mask, NULL, NULL);
> >>>  		if (!nc->dmac)
> >>>  			dev_err(nc->dev, "Failed to request DMA channel\n");
> >>> +
> >>> +		dev_info(nc->dev, "using %s for DMA transfers\n",
> >>> +			 dma_chan_name(nc->dmac));
> >>>  	}
> >>>  
> >>>  	/* We do not retrieve the SMC syscon when parsing old DTs. */
> >>>
> >>>
> >>>
> >>> and the output is
> >>>
> >>> atmel-nand-controller 10000000.ebi:nand-controller: using dma0chan5 for DMA transfers
> >>>
> >>> So, DMA controller 0 is in use. I still don't know if IF0, IF1 or IF2 is used
> >>> or how to find out. I guess IF2 is not in use since that does not allow any
> >>> DDR2 port as slave...
> >>>
> >>> From the datasheet, DMAC0/IF0 uses DDR2 Port 2, and DMAC0/IF1 uses DDR2 Port 1.
> >>> But, by the looks of the register content in my other mail, it seems as if
> >>> DMA0/IF1 can also use DDR2 Port 3.
> >>>
> >>> So, I think I want either
> >>>
> >>> A) the NAND controller to use DMAC0/IF0 (i.e. DDR2 port 1, and possibly 3) and
> >>>    the LCDC to use master 9 (i.e. DDR2 Port 2)
> >>>
> >>> or
> >>>
> >>> B) the NAND controller to use DMAC1/IF1 (i.e. DDR2 port 2) and the LCDC to use
> >>>    master 8 (i.e. DDR2 Port 3)    
> >>
> >> Crap, that was not what I meant to express. Sorry for the confusion. This is
> >> better.
> >>
> >> So, I think I want either
> >>
> >> A) the NAND controller to use master 1 DMAC0/IF0 (i.e. slave 8 DDR2 port 2) and
> >>    the LCDC to use master 9 (i.e. slave 9 DDR2 Port 3)
> >>
> >> or
> >>
> >> B) the NAND controller to use master 2 DMAC0/IF1 (i.e. slave 7 DDR2 port 1, and
> >>    possibly slave 9 DDR2 port 3 (if my previous findings are relevant) and the
> >>    LCDC to use master 8 (i.e. slave 8 DDR2 Port 2)
> >>  
> >>> But, again, how do I limit DMAC0 to either of IF0 or IF1 for NAND accesses?    
> >>
> >> So, I added a horrid patch (attached), which mainly adds printk lines, but
> >> additionally does one more thing in atc_prep_dma_memcpy. It changes the DSCR_IF
> >> field (from 0) to 1 for DMA-memcpy for dma0chan5 (i.e. the NAND). At least I
> >> think it does that?
> >>
> >> Running with that patch gets me this:
> >>
> >> # dmesg | grep -i dma
> >> at_hdmac ffffe600.dma-controller: Atmel AHB DMA Controller ( cpy set slave ), 8 channels
> >> at_hdmac ffffe800.dma-controller: Atmel AHB DMA Controller ( cpy set slave ), 8 channels
> >> dma dma0chan0: xlate 0 2
> >> dma dma0chan1: xlate 0 2
> >> at91_i2c f0014000.i2c: using dma0chan0 (tx) and dma0chan1 (rx) for DMA transfers
> >> dma dma1chan0: xlate 0 2
> >> dma dma1chan1: xlate 0 2
> >> at91_i2c f801c000.i2c: using dma1chan0 (tx) and dma1chan1 (rx) for DMA transfers
> >> dma dma0chan2: xlate 0 2
> >> dma dma0chan3: xlate 0 2
> >> dma dma0chan4: xlate 0 2
> >> atmel_mci f0000000.mmc: using dma0chan4 for DMA transfers
> >> dma dma1chan2: xlate 0 2
> >> dma dma1chan3: xlate 0 2
> >> atmel_aes f8038000.aes: Atmel AES - Using dma1chan2, dma1chan3 for DMA transfers
> >> dma dma1chan4: xlate 0 2
> >> atmel_sha f8034000.sha: using dma1chan4 for DMA transfers
> >> dma dma1chan5: xlate 0 2
> >> dma dma1chan6: xlate 0 2
> >> atmel_tdes f803c000.tdes: using dma1chan5, dma1chan6 for DMA transfers
> >> atmel-nand-controller 10000000.ebi:nand-controller: using dma0chan5 for DMA transfers
> >> dma dma0chan5: memcpy: 0
> >> dma dma0chan5: DSCR_IF: 1
> >> dma dma0chan5: memcpy: 1
> >>
> >> So, output is as expected and I believe that the patch makes the NAND DMA
> >> accesses use master 2 DMAC0/IF1 and are thus forced to use slave 7 DDR2 Port 1
> >> (and possibly 9). The LCDC is using slave 8 DDR2 Port 2. So there should be no
> >> slave conflict?
> >>
> >> But the on-screen crap remains during NAND accesses.
> >>
> >> But pressing on.
> >>
> >> I then changed the priorities for all accesses to 0 in the PRxSy registers, except
> >> the ones for masters 8/9 LCDC (slaves 8/9) which I left at priority 3.
> >>
> >> But the on-screen crap remains during NAND accesses.
> >>
> >> My guess is that the NAND DMA is doing too long bursts and that the LCDC therefore
> >> has to wait too long and simply fails to keep the pipeline from running short?
> >>
> >> So I tried to reduce the maximum SLOT_CYCLE for slaves 7 and 9 in the SCFGx
> >> registers. No noticeable effect either.
> >>
> >> I then tried to split bursts from master 2 (DMAC0/IF1) with small values in the
> >> MCFG2 register. No effect.
> >>
> >> I'm getting nowhere.  
> > 
> > Could it just be that you're reaching the DDR bus limit. As I said
> > previously, when you go through the CPU, and assuming you're consuming
> > the data directly, you have:
> > 
> > 1/ NFC SRAM -> CPU
> > 2/ CPU -> L1 data cache --write-back--> DRAM
> > 3/ L1-cache -> CPU
> > 
> > While, if you use DMA you get:
> > 
> > 1/ NFC SRAM -> DRAM
> > 2/ SDRAM -> L1 data cache -> CPU
> > 
> > So, if you're approaching the limit of (LP)DDR bandwidth, using the CPU
> > might make things a bit better. Still, if the limitation really comes
> > from the DDR bus, my opinion is that you should maybe use a smaller
> > resolution or use a more compact pixel format (RGB565?).  
> 
> The issue is still there if I use a CLUT mode instead of rgb565, which is
> what I normally use (and what I would like to use, CLUT is just alien and
> a pain these days).
> 
> The panels we are using only supports one resolution (each), but the issue
> is there with both 1920x1080@16bpp and 1024x768@8bpp (~60Hz).

Duh! This adds to the weirdness of this issue. I'd thought that by
dividing the required bandwidth by 2 you would get a reliable setup.

> 
> > Did you calculate how much of the bandwidth is taken by the HLCDC
> > block and compared it to the max (LP)DDR bandwidth?  
> 
> I did, but don't remember the exact details. There is some room even for
> 1920x1080@16bpp, but not oceans of it. We were a bit uncertain if 16bpp
> would be possible, and in fact that was the reason I worked on CLUT
> support for atmel-hlcdc last year. But since the problem persists with
> much less memory pressure as well, I don't think that's it either.
> 
> Admittedly I have not tested these AHB matrix tricks with a smaller
> panel (it would take a bit of work to arrange for those tests), but the
> issue was there when I last tried (using defaults).

Okay. I think I'll take your initial patch, but I'd really like to
understand the root cause of this problem. Tudor, any idea why the
various stuff Peter tried did not work?
Nicolas Ferre May 28, 2018, 4:09 p.m. UTC | #4
On 28/05/2018 at 17:52, Peter Rosin wrote:
> On 2018-05-28 16:27, Boris Brezillon wrote:

[..]

>> Could it just be that you're reaching the DDR bus limit. As I said
>> previously, when you go through the CPU, and assuming you're consuming
>> the data directly, you have:
>>
>> 1/ NFC SRAM -> CPU
>> 2/ CPU -> L1 data cache --write-back--> DRAM
>> 3/ L1-cache -> CPU
>>
>> While, if you use DMA you get:
>>
>> 1/ NFC SRAM -> DRAM
>> 2/ SDRAM -> L1 data cache -> CPU
>>
>> So, if you're approaching the limit of (LP)DDR bandwidth, using the CPU
>> might make things a bit better. Still, if the limitation really comes
>> from the DDR bus, my opinion is that you should maybe use a smaller
>> resolution or use a more compact pixel format (RGB565?).
> 
> The issue is still there if I use a CLUT mode instead of rgb565, which is
> what I normally use (and what I would like to use, CLUT is just alien and
> a pain these days).
> 
> The panels we are using only supports one resolution (each), but the issue
> is there with both 1920x1080@16bpp and 1024x768@8bpp (~60Hz).
> 
>> Did you calculate how much of the bandwidth is taken by the HLCDC
>> block and compared it to the max (LP)DDR bandwidth?
> 
> I did, but don't remember the exact details. There is some room even for
> 1920x1080@16bpp, but not oceans of it. We were a bit uncertain if 16bpp
> would be possible, and in fact that was the reason I worked on CLUT
> support for atmel-hlcdc last year. But since the problem persists with
> much less memory pressure as well, I don't think that's it either.

Just jumping in the conversation with another perspective (maybe already 
tried actually).

Can you try to make all that you can to maximize the blanking period of 
your screen (some are more tolerant than others according to that). By 
doing so, you would allow the LCD FIFO to recover better after each 
line. You might loose some columns on the side of your display but it 
would give us a good idea of how far we are from getting rid of those 
annoying LCD reset glitches (that are due to underruns on LCD FIFO).

> Admittedly I have not tested these AHB matrix tricks with a smaller
> panel (it would take a bit of work to arrange for those tests), but the
> issue was there when I last tried (using defaults).

If what I said earlier has an impact, reducing the panel size will also 
make a difference. But the question is how small is "smaller" ;-)

Best regards,
Eugen Hristev May 29, 2018, 6:30 a.m. UTC | #5
On 28.05.2018 13:10, Peter Rosin wrote:
> On 2018-05-28 00:11, Peter Rosin wrote:
>> On 2018-05-27 11:18, Peter Rosin wrote:
>>> On 2018-05-25 16:51, Tudor Ambarus wrote:
>>>> We think the best way is to keep LCD on DDR Ports 2 and 3 (8th and 9th
>>>> slaves), to have maximum bandwidth and to use DMA on DDR port 1 for NAND
>>>> (7th slave).
>>>
>>> Exactly how do I accomplish that?
>>>
>>> I can see how I can move the LCD between slave DDR port 2 and 3 by
>>> selecting LCDC DMA master 8 or 9 (but according to the above it should
>>> not matter). The big question is how I control what slave the NAND flash
>>> is going to use? I find nothing in the datasheet, and the code is also
>>> non-transparent enough for me to figure it out by myself without
>>> throwing out this question first...

 >> [...]

>> and the output is
>>
>> atmel-nand-controller 10000000.ebi:nand-controller: using dma0chan5 for DMA transfers
>>
>> So, DMA controller 0 is in use. I still don't know if IF0, IF1 or IF2 is used
>> or how to find out. I guess IF2 is not in use since that does not allow any
>> DDR2 port as slave...

Hello Peter,

Thank you for all the information, I will chip in to help a little bit.
The Master/channel is described in the device tree. The channel has a 
controller, a mem/periph interface and a periph ID, plus a FIFO 
configuration.

The dma chan number reported in the dmesg is just software. Here is an 
example from DT:
dmas = <&dma0 2 AT91_DMA_CFG_PER_ID(1)>,
        <&dma0 2 AT91_DMA_CFG_PER_ID(2)>;

you can match this with the help from 
Documentation/devicetree/bindings/dma/atmel-dma.txt:

1. A phandle pointing to the DMA controller. 

2. The memory interface (16 most significant bits), the peripheral 
interface
(16 less significant bits). 

3. Parameters for the at91 DMA configuration register which are device 

dependent: 

   - bit 7-0: peripheral identifier for the hardware handshaking 
interface. The
   identifier can be different for tx and rx. 

   - bit 11-8: FIFO configuration. 0 for half FIFO, 1 for ALAP, 2 for ASAP.


So , what was Tudor asking for, is your DT for the ebi node (if you are 
using ebi), or, your NFC SRAM (Nand Flash Controller SRAM) DMA 
devicetree chunk, so, we can figure out which type of DMA are you using.

Normally, the ebi should be connected to both DMA0 and DMA1 on those 
interfaces specified in DT. Which ones you want to use, depends on your 
setup (and contention on the bus/accesses, like in your case, the HLCDC)

Thats why we have multiple choices, to pick the right one for each case.
In our vanilla DT sama5d3.dtsi we do not have DMA described for ebi 
interface.

Eugen

 >> [...]
Peter Rosin May 29, 2018, 7:10 a.m. UTC | #6
On 2018-05-29 08:30, Eugen Hristev wrote:
> 
> 
> On 28.05.2018 13:10, Peter Rosin wrote:
>> On 2018-05-28 00:11, Peter Rosin wrote:
>>> On 2018-05-27 11:18, Peter Rosin wrote:
>>>> On 2018-05-25 16:51, Tudor Ambarus wrote:
>>>>> We think the best way is to keep LCD on DDR Ports 2 and 3 (8th and 9th
>>>>> slaves), to have maximum bandwidth and to use DMA on DDR port 1 for NAND
>>>>> (7th slave).
>>>>
>>>> Exactly how do I accomplish that?
>>>>
>>>> I can see how I can move the LCD between slave DDR port 2 and 3 by
>>>> selecting LCDC DMA master 8 or 9 (but according to the above it should
>>>> not matter). The big question is how I control what slave the NAND flash
>>>> is going to use? I find nothing in the datasheet, and the code is also
>>>> non-transparent enough for me to figure it out by myself without
>>>> throwing out this question first...
> 
>  >> [...]
> 
>>> and the output is
>>>
>>> atmel-nand-controller 10000000.ebi:nand-controller: using dma0chan5 for DMA transfers
>>>
>>> So, DMA controller 0 is in use. I still don't know if IF0, IF1 or IF2 is used
>>> or how to find out. I guess IF2 is not in use since that does not allow any
>>> DDR2 port as slave...
> 
> Hello Peter,
> 
> Thank you for all the information, I will chip in to help a little bit.
> The Master/channel is described in the device tree. The channel has a 
> controller, a mem/periph interface and a periph ID, plus a FIFO 
> configuration.
> 
> The dma chan number reported in the dmesg is just software.

Got that, that was why I added the various additional traces in that
horrid patch in the mail you are responding to :-)

>                                                             Here is an 
> example from DT:
> dmas = <&dma0 2 AT91_DMA_CFG_PER_ID(1)>,
>         <&dma0 2 AT91_DMA_CFG_PER_ID(2)>;
> 
> you can match this with the help from 
> Documentation/devicetree/bindings/dma/atmel-dma.txt:
> 
> 1. A phandle pointing to the DMA controller. 
> 
> 2. The memory interface (16 most significant bits), the peripheral 
> interface
> (16 less significant bits). 
> 
> 3. Parameters for the at91 DMA configuration register which are device 
> 
> dependent: 
> 
>    - bit 7-0: peripheral identifier for the hardware handshaking 
> interface. The
>    identifier can be different for tx and rx. 
> 
>    - bit 11-8: FIFO configuration. 0 for half FIFO, 1 for ALAP, 2 for ASAP.
> 
> 
> So , what was Tudor asking for, is your DT for the ebi node (if you are 
> using ebi), or, your NFC SRAM (Nand Flash Controller SRAM) DMA 
> devicetree chunk, so, we can figure out which type of DMA are you using.
> 
> Normally, the ebi should be connected to both DMA0 and DMA1 on those 
> interfaces specified in DT. Which ones you want to use, depends on your 
> setup (and contention on the bus/accesses, like in your case, the HLCDC)
> 
> Thats why we have multiple choices, to pick the right one for each case.
> In our vanilla DT sama5d3.dtsi we do not have DMA described for ebi 
> interface.

Ahh, *that* NAND DMA configuration. Of course, how silly of me...

This is a setup based on the at91-linea.dtsi "CPU" board. That dtsi should
have the relevant NAND/DMA info. Also, at91-nattis-2-natte-2.dts describes
the older HW (with a 1024x768 panel) that is also affected, if you want a
full device tree to look at.

Looks like I didn't make a selection, quoting from at91-linea.dtsi:

&ebi {
	pinctrl-0 = <&pinctrl_ebi_nand_addr>;
	pinctrl-names = "default";
	status = "okay";
};

&nand_controller {
	status = "okay";

	nand: nand@3 {
		reg = <0x3 0x0 0x2>;
		atmel,rb = <0>;
		nand-bus-width = <8>;
		nand-ecc-mode = "hw";
		nand-ecc-strength = <4>;
		nand-ecc-step-size = <512>;
		nand-on-flash-bbt;
		label = "atmel_nand";
	};
};

The reason is probably because the sama5d3xek device-trees didn't at the
time of "fork". Does anybody have any suggestion for some extra properties
to try in the above nodes?

Further, I forgot that I had actually upstreamed linea support for
at91bootstrap, so relevant NAND timings etc can be found in lpddr1_init()
in

	at91bootstrap/contrib/board/axentia/sama5d3_linea/sama5d3_linea.c

Cheers,
Peter
Eugen Hristev May 29, 2018, 7:25 a.m. UTC | #7
On 29.05.2018 10:10, Peter Rosin wrote:
> On 2018-05-29 08:30, Eugen Hristev wrote:
>>
>>
>> On 28.05.2018 13:10, Peter Rosin wrote:
>>> On 2018-05-28 00:11, Peter Rosin wrote:
>>>> On 2018-05-27 11:18, Peter Rosin wrote:
>>>>> On 2018-05-25 16:51, Tudor Ambarus wrote:
>>>>>> We think the best way is to keep LCD on DDR Ports 2 and 3 (8th and 9th
>>>>>> slaves), to have maximum bandwidth and to use DMA on DDR port 1 for NAND
>>>>>> (7th slave).
>>>>>
>>>>> Exactly how do I accomplish that?
>>>>>
>>>>> I can see how I can move the LCD between slave DDR port 2 and 3 by
>>>>> selecting LCDC DMA master 8 or 9 (but according to the above it should
>>>>> not matter). The big question is how I control what slave the NAND flash
>>>>> is going to use? I find nothing in the datasheet, and the code is also
>>>>> non-transparent enough for me to figure it out by myself without
>>>>> throwing out this question first...
>>
>>   >> [...]
>>
>>>> and the output is
>>>>
>>>> atmel-nand-controller 10000000.ebi:nand-controller: using dma0chan5 for DMA transfers
>>>>
>>>> So, DMA controller 0 is in use. I still don't know if IF0, IF1 or IF2 is used
>>>> or how to find out. I guess IF2 is not in use since that does not allow any
>>>> DDR2 port as slave...
>>
>> Hello Peter,
>>
>> Thank you for all the information, I will chip in to help a little bit.
>> The Master/channel is described in the device tree. The channel has a
>> controller, a mem/periph interface and a periph ID, plus a FIFO
>> configuration.
>>
>> The dma chan number reported in the dmesg is just software.
> 
> Got that, that was why I added the various additional traces in that
> horrid patch in the mail you are responding to :-)
> 
>>                                                              Here is an
>> example from DT:
>> dmas = <&dma0 2 AT91_DMA_CFG_PER_ID(1)>,
>>          <&dma0 2 AT91_DMA_CFG_PER_ID(2)>;
>>
>> you can match this with the help from
>> Documentation/devicetree/bindings/dma/atmel-dma.txt:
>>
>> 1. A phandle pointing to the DMA controller.
>>
>> 2. The memory interface (16 most significant bits), the peripheral
>> interface
>> (16 less significant bits).
>>
>> 3. Parameters for the at91 DMA configuration register which are device
>>
>> dependent:
>>
>>     - bit 7-0: peripheral identifier for the hardware handshaking
>> interface. The
>>     identifier can be different for tx and rx.
>>
>>     - bit 11-8: FIFO configuration. 0 for half FIFO, 1 for ALAP, 2 for ASAP.
>>
>>
>> So , what was Tudor asking for, is your DT for the ebi node (if you are
>> using ebi), or, your NFC SRAM (Nand Flash Controller SRAM) DMA
>> devicetree chunk, so, we can figure out which type of DMA are you using.
>>
>> Normally, the ebi should be connected to both DMA0 and DMA1 on those
>> interfaces specified in DT. Which ones you want to use, depends on your
>> setup (and contention on the bus/accesses, like in your case, the HLCDC)
>>
>> Thats why we have multiple choices, to pick the right one for each case.
>> In our vanilla DT sama5d3.dtsi we do not have DMA described for ebi
>> interface.
> 
> Ahh, *that* NAND DMA configuration. Of course, how silly of me...
> 
> This is a setup based on the at91-linea.dtsi "CPU" board. That dtsi should
> have the relevant NAND/DMA info. Also, at91-nattis-2-natte-2.dts describes
> the older HW (with a 1024x768 panel) that is also affected, if you want a
> full device tree to look at.
> 
> Looks like I didn't make a selection, quoting from at91-linea.dtsi:

Ok, so try to force the nand to use DDR port 1 : use DMAC0 or DMAC1 on 
the interfaces corresponding just to DDR port 1, and see if helps (while 
leaving HLCDC on both masters - DDR port 2 and 3).

One more thing: what are the actual nand commands which you use when you 
get the glitches? read/write/erase ... ?
What happens if you try to minimize the nand access? you also said at 
some point that only *some* nand accesses cause glitches.

Another thing : even if the LCD displays a still image, the DMA still 
feeds data to the LCD right ?

> 
> &ebi {
> 	pinctrl-0 = <&pinctrl_ebi_nand_addr>;
> 	pinctrl-names = "default";
> 	status = "okay";
> };
> 
> &nand_controller {
> 	status = "okay";
> 
> 	nand: nand@3 {
> 		reg = <0x3 0x0 0x2>;
> 		atmel,rb = <0>;
> 		nand-bus-width = <8>;
> 		nand-ecc-mode = "hw";
> 		nand-ecc-strength = <4>;
> 		nand-ecc-step-size = <512>;
> 		nand-on-flash-bbt;
> 		label = "atmel_nand";
> 	};
> };
> 
> The reason is probably because the sama5d3xek device-trees didn't at the
> time of "fork". Does anybody have any suggestion for some extra properties
> to try in the above nodes?
> 
> Further, I forgot that I had actually upstreamed linea support for
> at91bootstrap, so relevant NAND timings etc can be found in lpddr1_init()
> in
> 
> 	at91bootstrap/contrib/board/axentia/sama5d3_linea/sama5d3_linea.c
> 
> Cheers,
> Peter
> 
> _______________________________________________
> linux-arm-kernel mailing list
> linux-arm-kernel@lists.infradead.org
> http://lists.infradead.org/mailman/listinfo/linux-arm-kernel
>
Boris Brezillon May 29, 2018, 2:49 p.m. UTC | #8
Hi Eugen,

On Tue, 29 May 2018 09:30:54 +0300
Eugen Hristev <eugen.hristev@microchip.com> wrote:

> On 28.05.2018 13:10, Peter Rosin wrote:
> > On 2018-05-28 00:11, Peter Rosin wrote:  
> >> On 2018-05-27 11:18, Peter Rosin wrote:  
> >>> On 2018-05-25 16:51, Tudor Ambarus wrote:  
> >>>> We think the best way is to keep LCD on DDR Ports 2 and 3 (8th and 9th
> >>>> slaves), to have maximum bandwidth and to use DMA on DDR port 1 for NAND
> >>>> (7th slave).  
> >>>
> >>> Exactly how do I accomplish that?
> >>>
> >>> I can see how I can move the LCD between slave DDR port 2 and 3 by
> >>> selecting LCDC DMA master 8 or 9 (but according to the above it should
> >>> not matter). The big question is how I control what slave the NAND flash
> >>> is going to use? I find nothing in the datasheet, and the code is also
> >>> non-transparent enough for me to figure it out by myself without
> >>> throwing out this question first...  
> 
>  >> [...]  
> 
> >> and the output is
> >>
> >> atmel-nand-controller 10000000.ebi:nand-controller: using dma0chan5 for DMA transfers
> >>
> >> So, DMA controller 0 is in use. I still don't know if IF0, IF1 or IF2 is used
> >> or how to find out. I guess IF2 is not in use since that does not allow any
> >> DDR2 port as slave...  
> 
> Hello Peter,
> 
> Thank you for all the information, I will chip in to help a little bit.
> The Master/channel is described in the device tree. The channel has a 
> controller, a mem/periph interface and a periph ID, plus a FIFO 
> configuration.
> 
> The dma chan number reported in the dmesg is just software. Here is an 
> example from DT:
> dmas = <&dma0 2 AT91_DMA_CFG_PER_ID(1)>,
>         <&dma0 2 AT91_DMA_CFG_PER_ID(2)>;
> 
> you can match this with the help from 
> Documentation/devicetree/bindings/dma/atmel-dma.txt:
> 
> 1. A phandle pointing to the DMA controller. 
> 
> 2. The memory interface (16 most significant bits), the peripheral 
> interface
> (16 less significant bits). 
> 
> 3. Parameters for the at91 DMA configuration register which are device 
> 
> dependent: 
> 
>    - bit 7-0: peripheral identifier for the hardware handshaking 
> interface. The
>    identifier can be different for tx and rx. 
> 
>    - bit 11-8: FIFO configuration. 0 for half FIFO, 1 for ALAP, 2 for ASAP.
> 
> 
> So , what was Tudor asking for, is your DT for the ebi node (if you are 
> using ebi), or, your NFC SRAM (Nand Flash Controller SRAM) DMA 
> devicetree chunk, so, we can figure out which type of DMA are you using.

I think you're missing something here. We use the DMA engine in memcpy
mode (SRAM -> DRAM), not in device mode (dev -> DRAM or DRAM -> dev).
So there's no dmas prop defined in the DT and there should not be.

Regards,

Boris

> 
> Normally, the ebi should be connected to both DMA0 and DMA1 on those 
> interfaces specified in DT. Which ones you want to use, depends on your 
> setup (and contention on the bus/accesses, like in your case, the HLCDC)
> 
> Thats why we have multiple choices, to pick the right one for each case.
> In our vanilla DT sama5d3.dtsi we do not have DMA described for ebi 
> interface.
> 
> Eugen
> 
>  >> [...]
Eugen Hristev May 29, 2018, 3:01 p.m. UTC | #9
[...]


> 
> I think you're missing something here. We use the DMA engine in memcpy
> mode (SRAM -> DRAM), not in device mode (dev -> DRAM or DRAM -> dev).
> So there's no dmas prop defined in the DT and there should not be.
> 
> Regards,
> 
> Boris
> 

Ok, so the memcpy SRAM <-> DRAM will hog the transfer between DRAM and 
LCD if my understanding is correct. That's the DMA that Peter wants to 
disable with his patch ?

Then we can then try to force NFC SRAM DMA channels to use just DDR port 
1 or 2 for memcpy ?

I have also received a suggestion to try to increase the porches in 
LCDC_LCDCFG3 .

>>
>>   >> [...]
> 
>
Boris Brezillon May 29, 2018, 3:15 p.m. UTC | #10
On Tue, 29 May 2018 18:01:40 +0300
Eugen Hristev <eugen.hristev@microchip.com> wrote:

> [...]
> 
> 
> > 
> > I think you're missing something here. We use the DMA engine in memcpy
> > mode (SRAM -> DRAM), not in device mode (dev -> DRAM or DRAM -> dev).
> > So there's no dmas prop defined in the DT and there should not be.
> > 
> > Regards,
> > 
> > Boris
> >   
> 
> Ok, so the memcpy SRAM <-> DRAM will hog the transfer between DRAM and 
> LCD if my understanding is correct. That's the DMA that Peter wants to 
> disable with his patch ?
> 
> Then we can then try to force NFC SRAM DMA channels to use just DDR port 
> 1 or 2 for memcpy ?

You mean the dmaengine? According to "14.1.3 Master to Slave Access"
that's already the case.

Only DMAC0 can access the NFC SRAM and it's done through DMAC0:IF0,
then access to DDR is going through port DDR port 1 (DMAC0:IF1) or 2
(DMAC0:IF0).

> 
> I have also received a suggestion to try to increase the porches in 
> LCDC_LCDCFG3 .

Yep, Nicolas suggested something similar. Peter, can you try that?
Eugen Hristev May 29, 2018, 3:21 p.m. UTC | #11
On 29.05.2018 18:15, Boris Brezillon wrote:
> On Tue, 29 May 2018 18:01:40 +0300
> Eugen Hristev <eugen.hristev@microchip.com> wrote:
> 
>> [...]
>>
>>
>>>
>>> I think you're missing something here. We use the DMA engine in memcpy
>>> mode (SRAM -> DRAM), not in device mode (dev -> DRAM or DRAM -> dev).
>>> So there's no dmas prop defined in the DT and there should not be.
>>>
>>> Regards,
>>>
>>> Boris
>>>    
>>
>> Ok, so the memcpy SRAM <-> DRAM will hog the transfer between DRAM and
>> LCD if my understanding is correct. That's the DMA that Peter wants to
>> disable with his patch ?
>>
>> Then we can then try to force NFC SRAM DMA channels to use just DDR port
>> 1 or 2 for memcpy ?
> 
> You mean the dmaengine? According to "14.1.3 Master to Slave Access"
> that's already the case.
> 
> Only DMAC0 can access the NFC SRAM and it's done through DMAC0:IF0,
> then access to DDR is going through port DDR port 1 (DMAC0:IF1) or 2
> (DMAC0:IF0).

If we can make NFC use port 1 only, then HLCDC could have two ports as 
master 8 & 9, maybe a better bandwidth.

> 
>>
>> I have also received a suggestion to try to increase the porches in
>> LCDC_LCDCFG3 .
> 
> Yep, Nicolas suggested something similar. Peter, can you try that?
>
Tudor Ambarus June 4, 2018, 3:46 p.m. UTC | #12
Hi, Peter,

On 05/28/2018 01:10 PM, Peter Rosin wrote:

[cut]

> So, I think I want either
> 
> A) the NAND controller to use master 1 DMAC0/IF0 (i.e. slave 8 DDR2 port 2) and
>     the LCDC to use master 9 (i.e. slave 9 DDR2 Port 3)
> 
> or
> 
> B) the NAND controller to use master 2 DMAC0/IF1 (i.e. slave 7 DDR2 port 1, and
>     possibly slave 9 DDR2 port 3 (if my previous findings are relevant) and the
>     LCDC to use master 8 (i.e. slave 8 DDR2 Port 2)

My understanding is that "Table 14-3. Master to Slave Access" describes
what connections are allowed between the masters and slaves, while the
PRxSy registers just set the priorities. What happens when you assign
the highest priority to a master to slave connection that is not
allowed? Probably it is ignored, but I'll check with the hardware team.
So I expect that the NAND controller can not use DDR2 port 3 regardless
of the priority set.

[cut]

> So, output is as expected and I believe that the patch makes the NAND DMA
> accesses use master 2 DMAC0/IF1 and are thus forced to use slave 7 DDR2 Port 1
> (and possibly 9). The LCDC is using slave 8 DDR2 Port 2. So there should be no
> slave conflict?
> 
> But the on-screen crap remains during NAND accesses.

No conflict, but you missed to dispatch the load on the LCDC DMA
masters, if I understood correctly.

So, I think we want to test the following:
- NAND controller to use DMAC0/IF1 (slave 7 DDR2 port 1)
- LCDC to use master 8 (slave 8 DDR2 Port 2) and master 9 (slave 9 DDR2
Port 3).

Best,
ta
Boris Brezillon June 4, 2018, 4:03 p.m. UTC | #13
On Mon, 4 Jun 2018 18:46:56 +0300
Tudor Ambarus <tudor.ambarus@microchip.com> wrote:

> Hi, Peter,
> 
> On 05/28/2018 01:10 PM, Peter Rosin wrote:
> 
> [cut]
> 
> > So, I think I want either
> > 
> > A) the NAND controller to use master 1 DMAC0/IF0 (i.e. slave 8 DDR2 port 2) and
> >     the LCDC to use master 9 (i.e. slave 9 DDR2 Port 3)
> > 
> > or
> > 
> > B) the NAND controller to use master 2 DMAC0/IF1 (i.e. slave 7 DDR2 port 1, and
> >     possibly slave 9 DDR2 port 3 (if my previous findings are relevant) and the
> >     LCDC to use master 8 (i.e. slave 8 DDR2 Port 2)  
> 
> My understanding is that "Table 14-3. Master to Slave Access" describes
> what connections are allowed between the masters and slaves, while the
> PRxSy registers just set the priorities. What happens when you assign
> the highest priority to a master to slave connection that is not
> allowed? Probably it is ignored, but I'll check with the hardware team.
> So I expect that the NAND controller can not use DDR2 port 3 regardless
> of the priority set.
> 
> [cut]
> 
> > So, output is as expected and I believe that the patch makes the NAND DMA
> > accesses use master 2 DMAC0/IF1 and are thus forced to use slave 7 DDR2 Port 1
> > (and possibly 9). The LCDC is using slave 8 DDR2 Port 2. So there should be no
> > slave conflict?
> > 
> > But the on-screen crap remains during NAND accesses.  
> 
> No conflict, but you missed to dispatch the load on the LCDC DMA
> masters, if I understood correctly.
> 
> So, I think we want to test the following:
> - NAND controller to use DMAC0/IF1 (slave 7 DDR2 port 1)

As I explained in one of my previous email, it's not that easy to set
up, because the SRAM is connected to IF0, and we're using DMA memcpy
here. Also, I don't see how it could solve Peter's problem if, even
when he switches to LCDC master 9 for the primary overlay, he still
keeps experiencing FIFO underruns.

> - LCDC to use master 8 (slave 8 DDR2 Port 2) and master 9 (slave 9 DDR2
> Port 3).

Except that only works if you have several overlays activated, which
AFAIR, is not the case in Peter's setup.
diff mbox

Patch

diff --git a/drivers/dma/at_hdmac.c b/drivers/dma/at_hdmac.c
index 75f38d19fcbe..6cb58197bd29 100644
--- a/drivers/dma/at_hdmac.c
+++ b/drivers/dma/at_hdmac.c
@@ -243,6 +243,18 @@  static void atc_dostart(struct at_dma_chan *atchan, struct at_desc *first)
 
 	vdbg_dump_regs(atchan);
 
+	if (atchan->chan_common.chan_id == 5 &&
+	    atchan->chan_common.device->dev_id == 0)
+	{
+		static u32 last_if = 4;
+		u32 this_if = first->txd.phys & 3;
+		if (this_if != last_if) {
+			dev_info(chan2dev(&atchan->chan_common),
+				 "DSCR_IF: %u\n", this_if);
+			last_if = this_if;
+		}
+	}
+
 	channel_writel(atchan, SADDR, 0);
 	channel_writel(atchan, DADDR, 0);
 	channel_writel(atchan, CTRLA, 0);
@@ -854,6 +866,19 @@  atc_prep_dma_memcpy(struct dma_chan *chan, dma_addr_t dest, dma_addr_t src,
 		desc->lli.ctrlb = ctrlb;
 
 		desc->txd.cookie = 0;
+		if (chan->chan_id == 5 &&
+		    chan->device->dev_id == 0)
+		{
+			static u32 last_if = 4;
+			u32 this_if = desc->txd.phys & 3;
+			if (this_if != last_if) {
+				dev_info(chan2dev(chan),
+					 "memcpy: %u\n", this_if);
+				last_if = this_if;
+			}
+			desc->txd.phys = (desc->txd.phys & ~3) | 1;
+		}
+
 		desc->len = xfer_count << src_width;
 
 		atc_desc_chain(&first, &prev, desc);
@@ -1107,6 +1132,8 @@  atc_prep_slave_sg(struct dma_chan *chan, struct scatterlist *sgl,
 			| ATC_SRC_ADDR_MODE_INCR
 			| ATC_FC_MEM2PER
 			| ATC_SIF(atchan->mem_if) | ATC_DIF(atchan->per_if);
+		dev_info(chan2dev(chan), "slave_sg: mem2dev %d %d\n",
+			 atchan->mem_if, atchan->per_if);
 		reg = sconfig->dst_addr;
 		for_each_sg(sgl, sg, sg_len, i) {
 			struct at_desc	*desc;
@@ -1147,6 +1174,8 @@  atc_prep_slave_sg(struct dma_chan *chan, struct scatterlist *sgl,
 			| ATC_SRC_ADDR_MODE_FIXED
 			| ATC_FC_PER2MEM
 			| ATC_SIF(atchan->per_if) | ATC_DIF(atchan->mem_if);
+		dev_info(chan2dev(chan), "slave_sg: dev2mem %d %d\n",
+			 atchan->mem_if, atchan->per_if);
 
 		reg = sconfig->src_addr;
 		for_each_sg(sgl, sg, sg_len, i) {
@@ -1255,6 +1284,8 @@  atc_dma_cyclic_fill_desc(struct dma_chan *chan, struct at_desc *desc,
 				| ATC_FC_MEM2PER
 				| ATC_SIF(atchan->mem_if)
 				| ATC_DIF(atchan->per_if);
+		dev_info(chan2dev(chan), "fill_desc: mem2dev %d %d\n",
+			 atchan->mem_if, atchan->per_if);
 		desc->len = period_len;
 		break;
 
@@ -1267,6 +1298,8 @@  atc_dma_cyclic_fill_desc(struct dma_chan *chan, struct at_desc *desc,
 				| ATC_FC_PER2MEM
 				| ATC_SIF(atchan->per_if)
 				| ATC_DIF(atchan->mem_if);
+		dev_info(chan2dev(chan), "fill_desc: dev2mem %d %d\n",
+			 atchan->mem_if, atchan->per_if);
 		desc->len = period_len;
 		break;
 
@@ -1344,6 +1377,18 @@  atc_prep_dma_cyclic(struct dma_chan *chan, dma_addr_t buf_addr, size_t buf_len,
 		atc_desc_chain(&first, &prev, desc);
 	}
 
+	if (chan->chan_id == 5 &&
+	    chan->device->dev_id == 0)
+	{
+		static u32 last_if = 4;
+		u32 this_if = first->txd.phys & 3;
+		if (this_if != last_if) {
+			dev_info(chan2dev(chan),
+				 "cyclic: %u\n", this_if);
+			last_if = this_if;
+		}
+	}
+
 	/* lets make a cyclic list */
 	prev->lli.dscr = first->txd.phys;
 
@@ -1712,6 +1757,8 @@  static struct dma_chan *at_dma_xlate(struct of_phandle_args *dma_spec,
 	atchan = to_at_dma_chan(chan);
 	atchan->per_if = dma_spec->args[0] & 0xff;
 	atchan->mem_if = (dma_spec->args[0] >> 16) & 0xff;
+	dev_info(chan2dev(chan), "xlate %d %d\n",
+		 atchan->mem_if, atchan->per_if);
 
 	return chan;
 }
@@ -2099,6 +2146,18 @@  static void atc_resume_cyclic(struct at_dma_chan *atchan)
 {
 	struct at_dma	*atdma = to_at_dma(atchan->chan_common.device);
 
+	if (atchan->chan_common.chan_id == 5 &&
+	    atchan->chan_common.device->dev_id == 0)
+	{
+		static u32 last_if = 4;
+		u32 this_if = atchan->save_dscr;
+		if (this_if != last_if) {
+			dev_info(chan2dev(&atchan->chan_common),
+				 "resume_cyclic: %u\n", this_if);
+			last_if = this_if;
+		}
+	}
+
 	/* restore channel status for cyclic descriptors list:
 	 * next descriptor in the cyclic list at the time of suspend */
 	channel_writel(atchan, SADDR, 0);
diff --git a/drivers/mtd/nand/raw/atmel/nand-controller.c b/drivers/mtd/nand/raw/atmel/nand-controller.c
index e686fe73159e..3b33c63d2ed4 100644
--- a/drivers/mtd/nand/raw/atmel/nand-controller.c
+++ b/drivers/mtd/nand/raw/atmel/nand-controller.c
@@ -1991,6 +1991,9 @@  static int atmel_nand_controller_init(struct atmel_nand_controller *nc,
 		nc->dmac = dma_request_channel(mask, NULL, NULL);
 		if (!nc->dmac)
 			dev_err(nc->dev, "Failed to request DMA channel\n");
+
+		dev_info(nc->dev, "using %s for DMA transfers\n",
+			 dma_chan_name(nc->dmac));
 	}
 
 	/* We do not retrieve the SMC syscon when parsing old DTs. */