[7/9] ARM: edma: Don't clear EMR of channel in edma_stop

Message ID 1375104595-16018-8-git-send-email-joelf@ti.com (mailing list archive)
State New, archived

Commit Message

Joel Fernandes July 29, 2013, 1:29 p.m. UTC
We certainly don't want error conditions to be cleared anywhere
as this will make us 'forget' about missed events. We depend on
knowing which events were missed in order to be able to reissue them.

This fixes a race condition where the EMR was being cleared
by the transfer completion interrupt handler.

Basically, what was happening was:

            Missed event
             |
             |
             V
SG1-SG2-SG3-Null
         \
          \__TC Interrupt (Almost same time as ARM is executing
TC interrupt handler, an event got missed and also forgotten
by clearing the EMR).

The EMR is ultimately being cleared by the Error interrupt
handler once it is handled so we don't have to do it in edma_stop.

Signed-off-by: Joel Fernandes <joelf@ti.com>
---
 arch/arm/common/edma.c |    1 -
 1 file changed, 1 deletion(-)

Comments

Sekhar Nori July 30, 2013, 8:29 a.m. UTC | #1
On Monday 29 July 2013 06:59 PM, Joel Fernandes wrote:
> We certainly don't want error conditions to be cleared anywhere

'anywhere' is a really loaded term.

> as this will make us 'forget' about missed events. We depend on
> knowing which events were missed in order to be able to reissue them.

> This fixes a race condition where the EMR was being cleared
> by the transfer completion interrupt handler.
> 
> Basically, what was happening was:
> 
>             Missed event
>              |
>              |
>              V
> SG1-SG2-SG3-Null
>          \
>           \__TC Interrupt (Almost same time as ARM is executing
> TC interrupt handler, an event got missed and also forgotten
> by clearing the EMR).

Sorry, but I don't see how edma_stop() is coming into picture in the race
you describe?

> The EMR is ultimately being cleared by the Error interrupt
> handler once it is handled so we don't have to do it in edma_stop.

This, I agree with. edma_clean_channel() is also there to re-initialize the
channel so doing it in edma_stop() certainly seems superfluous.

Thanks,
Sekhar

> 
> Signed-off-by: Joel Fernandes <joelf@ti.com>
> ---
>  arch/arm/common/edma.c |    1 -
>  1 file changed, 1 deletion(-)
> 
> diff --git a/arch/arm/common/edma.c b/arch/arm/common/edma.c
> index 10995b2..dec772e 100644
> --- a/arch/arm/common/edma.c
> +++ b/arch/arm/common/edma.c
> @@ -1328,7 +1328,6 @@ void edma_stop(unsigned channel)
>  		edma_shadow0_write_array(ctlr, SH_EECR, j, mask);
>  		edma_shadow0_write_array(ctlr, SH_ECR, j, mask);
>  		edma_shadow0_write_array(ctlr, SH_SECR, j, mask);
> -		edma_write_array(ctlr, EDMA_EMCR, j, mask);
>  
>  		pr_debug("EDMA: EER%d %08x\n", j,
>  				edma_shadow0_read_array(ctlr, SH_EER, j));
>
Joel Fernandes July 31, 2013, 5:05 a.m. UTC | #2
On 07/30/2013 03:29 AM, Sekhar Nori wrote:
> On Monday 29 July 2013 06:59 PM, Joel Fernandes wrote:
>> We certainly don't want error conditions to be cleared anywhere
> 
> 'anywhere' is a really loaded term.
> 
>> as this will make us 'forget' about missed events. We depend on
>> knowing which events were missed in order to be able to reissue them.
> 
>> This fixes a race condition where the EMR was being cleared
>> by the transfer completion interrupt handler.
>>
>> Basically, what was happening was:
>>
>>             Missed event
>>              |
>>              |
>>              V
>> SG1-SG2-SG3-Null
>>          \
>>           \__TC Interrupt (Almost same time as ARM is executing
>> TC interrupt handler, an event got missed and also forgotten
>> by clearing the EMR).
> 
> Sorry, but I don't see how edma_stop() is coming into picture in the race
> you describe?

In edma_callback function, for the case of DMA_COMPLETE (Transfer
completion interrupt), edma_stop() is called when all sets have been
processed. This had the effect of clearing the EMR.
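
As a rough illustration of that register effect, here is a simplified
user-space model of what edma_stop() does to one channel, before and
after this patch. It is only a sketch, not the driver code: the struct,
the function name and the clear_emr switch are made up for the example,
and the bank/mask arithmetic (32 channels per register word) is an
assumption about how the driver indexes the registers.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define MODEL_CHANNELS 64u	/* hypothetical: two 32-bit banks */

/* Hypothetical in-memory copy of the per-channel status registers. */
struct edma_regs_model {
	uint32_t eer[2];	/* event enable */
	uint32_t er[2];		/* event pending */
	uint32_t ser[2];	/* secondary event */
	uint32_t emr[2];	/* event missed */
};

/*
 * Model of edma_stop(). clear_emr = true mimics the old behaviour
 * (the EDMA_EMCR write), clear_emr = false mimics the patched code.
 */
void edma_stop_model(struct edma_regs_model *r, unsigned channel,
		     bool clear_emr)
{
	assert(channel < MODEL_CHANNELS);

	unsigned j = channel >> 5;			/* register bank */
	uint32_t mask = UINT32_C(1) << (channel & 0x1f);	/* bit in bank */

	r->eer[j] &= ~mask;	/* stands in for the SH_EECR write */
	r->er[j]  &= ~mask;	/* stands in for the SH_ECR write */
	r->ser[j] &= ~mask;	/* stands in for the SH_SECR write */
	if (clear_emr)
		r->emr[j] &= ~mask;	/* the EDMA_EMCR write being removed */
}
```

With clear_emr set to false, a missed-event bit that was set before the
call is still set afterwards, so the error handler can still see it.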

This causes two problems:

1.
The error interrupt may still be pending when the TC interrupt handler
clears the EMR.

The ARM then runs the error interrupt handler even though the EMR is
already clear. As a result, the following if condition in
dma_ccerr_handler will be true and IRQ_NONE is returned.

        if ((edma_read_array(ctlr, EDMA_EMR, 0) == 0) &&
            (edma_read_array(ctlr, EDMA_EMR, 1) == 0) &&
            (edma_read(ctlr, EDMA_QEMR) == 0) &&
            (edma_read(ctlr, EDMA_CCERR) == 0))
                return IRQ_NONE;

If this happens often enough, the IRQ subsystem disables the interrupt,
thinking it's spurious, which creates serious problems.

2.
If the above if condition is removed, the EMR is still 0, so
dma_ccerr_handler never calls the callback function. The event is then
forgotten: it is never triggered manually and the missed flag of the
channel is never set.
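
Both problems can be reproduced in a small user-space model. This is
only a sketch under stated assumptions: every name below is
hypothetical, only the EMR is modelled (not QEMR or CCERR), and
MODEL_SPURIOUS_LIMIT is an illustrative number, not the threshold the
IRQ core really uses.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-ins for the kernel's IRQ_NONE / IRQ_HANDLED. */
enum irq_result { MODEL_IRQ_NONE, MODEL_IRQ_HANDLED };

#define MODEL_SPURIOUS_LIMIT 3u	/* illustrative only */

struct err_model {
	uint32_t emr;		/* event-missed bits, one per channel */
	unsigned callbacks;	/* error callbacks delivered so far */
	unsigned unhandled;	/* consecutive unhandled interrupts */
	bool irq_disabled;	/* set once the line is treated as spurious */
};

/* Model of the error handler, with or without the "all clear" guard. */
enum irq_result ccerr_handler_model(struct err_model *m, bool keep_guard)
{
	if (keep_guard && m->emr == 0)
		return MODEL_IRQ_NONE;		/* problem 1 */

	while (m->emr) {
		uint32_t lowest = m->emr & (~m->emr + 1u);

		m->emr &= ~lowest;	/* stands in for the EMCR write */
		m->callbacks++;		/* one callback per missed event */
	}
	return MODEL_IRQ_HANDLED;	/* problem 2: maybe zero callbacks */
}

/* Model of the IRQ core delivering one error interrupt. */
void deliver_error_irq(struct err_model *m, bool keep_guard)
{
	assert(m != 0);

	if (m->irq_disabled)
		return;
	if (ccerr_handler_model(m, keep_guard) == MODEL_IRQ_NONE) {
		if (++m->unhandled >= MODEL_SPURIOUS_LIMIT)
			m->irq_disabled = true;
	} else {
		m->unhandled = 0;
	}
}
```

Delivering the interrupt repeatedly with emr == 0 and the guard in
place ends with irq_disabled set. Without the guard the interrupt
counts as handled, but callbacks stays at 0, so nothing ever learns
about the missed event.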

So about the race: if the TC interrupt handler executes before the error
interrupt handler, it can clear the EMR, which creates these problems.
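
The ordering can be written down as a tiny single-threaded simulation.
Again this is only a sketch: the struct and function names are
hypothetical, one channel and one 32-bit EMR word are modelled, and the
stop_clears_emr switch selects between the old edma_stop() behaviour
and the patched one.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

struct race_model {
	uint32_t emr;			/* event-missed bits */
	bool error_irq_pending;		/* error interrupt raised by hardware */
	bool missed_flag;		/* what the driver records for the channel */
};

/* Hardware side: an event arrives that cannot be serviced. */
static void hw_miss_event(struct race_model *m, unsigned channel)
{
	assert(channel < 32);
	m->emr |= UINT32_C(1) << channel;
	m->error_irq_pending = true;
}

/* TC interrupt handler: all sets are done, so it stops the channel. */
static void tc_irq_handler(struct race_model *m, unsigned channel,
			   bool stop_clears_emr)
{
	assert(channel < 32);
	if (stop_clears_emr)
		m->emr &= ~(UINT32_C(1) << channel);	/* old edma_stop() */
}

/* Error interrupt handler: reports and clears a missed event, if any. */
static void error_irq_handler(struct race_model *m, unsigned channel)
{
	assert(channel < 32);
	uint32_t bit = UINT32_C(1) << channel;

	m->error_irq_pending = false;
	if (m->emr & bit) {
		m->missed_flag = true;	/* driver now knows about the miss */
		m->emr &= ~bit;		/* handler clears the EMR itself */
	}
}

/* Runs the racy ordering: miss, then TC handler, then error handler. */
bool missed_event_survives(bool stop_clears_emr)
{
	struct race_model m = {0};
	const unsigned channel = 7;	/* arbitrary channel for the example */

	hw_miss_event(&m, channel);
	tc_irq_handler(&m, channel, stop_clears_emr);
	if (m.error_irq_pending)
		error_irq_handler(&m, channel);
	return m.missed_flag;
}
```

missed_event_survives(true) models the code before this patch and
returns false, i.e. the miss is lost. missed_event_survives(false)
models the patched code and returns true.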

>> The EMR is ultimately being cleared by the Error interrupt
>> handler once it is handled so we don't have to do it in edma_stop.
> 
> This, I agree with. edma_clean_channel() is also there to re-initialize the
> channel so doing it in edma_stop() certainly seems superfluous.

Sure.

Thanks,

-Joel
Sekhar Nori July 31, 2013, 9:35 a.m. UTC | #3
On Wednesday 31 July 2013 10:35 AM, Joel Fernandes wrote:
> On 07/30/2013 03:29 AM, Sekhar Nori wrote:
>> On Monday 29 July 2013 06:59 PM, Joel Fernandes wrote:
>>> We certainly don't want error conditions to be cleared anywhere
>>
>> 'anywhere' is a really loaded term.
>>
>>> as this will make us 'forget' about missed events. We depend on
>>> knowing which events were missed in order to be able to reissue them.
>>
>>> This fixes a race condition where the EMR was being cleared
>>> by the transfer completion interrupt handler.
>>>
>>> Basically, what was happening was:
>>>
>>>             Missed event
>>>              |
>>>              |
>>>              V
>>> SG1-SG2-SG3-Null
>>>          \
>>>           \__TC Interrupt (Almost same time as ARM is executing
>>> TC interrupt handler, an event got missed and also forgotten
>>> by clearing the EMR).
>>
>> Sorry, but I don't see how edma_stop() is coming into picture in the race
>> you describe?
> 
> In edma_callback function, for the case of DMA_COMPLETE (Transfer
> completion interrupt), edma_stop() is called when all sets have been
> processed. This had the effect of clearing the EMR.

Ah, thanks. I was missing the fact that the race comes into picture only
when using the DMA engine driver. I guess that should be mentioned
somewhere since it is not immediately obvious.

The patch looks good to me. So if you respin just this one with some
updated explanation based on what you wrote below, I will take it.

Thanks,
Sekhar
Joel Fernandes Aug. 1, 2013, 1:59 a.m. UTC | #4
On 07/31/2013 04:35 AM, Sekhar Nori wrote:
> On Wednesday 31 July 2013 10:35 AM, Joel Fernandes wrote:
>> On 07/30/2013 03:29 AM, Sekhar Nori wrote:
>>> On Monday 29 July 2013 06:59 PM, Joel Fernandes wrote:
>>>> We certainly don't want error conditions to be cleared anywhere
>>>
>>> 'anywhere' is a really loaded term.
>>>
>>>> as this will make us 'forget' about missed events. We depend on
>>>> knowing which events were missed in order to be able to reissue them.
>>>
>>>> This fixes a race condition where the EMR was being cleared
>>>> by the transfer completion interrupt handler.
>>>>
>>>> Basically, what was happening was:
>>>>
>>>>             Missed event
>>>>              |
>>>>              |
>>>>              V
>>>> SG1-SG2-SG3-Null
>>>>          \
>>>>           \__TC Interrupt (Almost same time as ARM is executing
>>>> TC interrupt handler, an event got missed and also forgotten
>>>> by clearing the EMR).
>>>
>>> Sorry, but I don't see how edma_stop() is coming into picture in the race
>>> you describe?
>>
>> In edma_callback function, for the case of DMA_COMPLETE (Transfer
>> completion interrupt), edma_stop() is called when all sets have been
>> processed. This had the effect of clearing the EMR.
> 
> Ah, thanks. I was missing the fact that the race comes into picture only
> when using the DMA engine driver. I guess that should be mentioned
> somewhere since it is not immediately obvious.
> 
> The patch looks good to me. So if you respin just this one with some
> updated explanation based on what you wrote below, I will take it.

Sure, I'll do that. Will you also be taking the trigger_channel patch?
I can send these two as a series since they both touch
arch/arm/common/edma.c.

Thanks,

-Joel




Patch

diff --git a/arch/arm/common/edma.c b/arch/arm/common/edma.c
index 10995b2..dec772e 100644
--- a/arch/arm/common/edma.c
+++ b/arch/arm/common/edma.c
@@ -1328,7 +1328,6 @@  void edma_stop(unsigned channel)
 		edma_shadow0_write_array(ctlr, SH_EECR, j, mask);
 		edma_shadow0_write_array(ctlr, SH_ECR, j, mask);
 		edma_shadow0_write_array(ctlr, SH_SECR, j, mask);
-		edma_write_array(ctlr, EDMA_EMCR, j, mask);
 
 		pr_debug("EDMA: EER%d %08x\n", j,
 				edma_shadow0_read_array(ctlr, SH_EER, j));