Message ID | 20220215044112.161634-1-vigneshr@ti.com (mailing list archive) |
---|---|
State | Changes Requested |
Headers | show |
Series | dmaengine: ti: k3-udma: Avoid false error msg on chan teardown | expand |
Hi Vignesh, On 15/02/2022 06:41, Vignesh Raghavendra wrote: > In cyclic mode, there is no additional descriptor pushed to collect > outstanding data on channel teardown. Therefore no need to wait for this > descriptor to come back. > > Without this terminating aplay cmd outputs false error msg like: > [ 116.402800] ti-bcdma 485c0100.dma-controller: chan1 teardown timeout! are you sure it is aplay? It is MEM_TO_DEV, we only use the flush descriptor for DEV_TO_MEM. MEM_TO_DEV can 'disconnect' from the peripheral to flush out the FIFO. I have not seen this on am654, j721e. I can not recall seeing this on the capture side either. The cyclic TR should be able to drain the DEV_TO_MEM by itself and the TR should terminate. > Signed-off-by: Vignesh Raghavendra <vigneshr@ti.com> > --- > drivers/dma/ti/k3-udma.c | 2 +- > 1 file changed, 1 insertion(+), 1 deletion(-) > > diff --git a/drivers/dma/ti/k3-udma.c b/drivers/dma/ti/k3-udma.c > index 9abb08d353ca0..c9a1b2f312603 100644 > --- a/drivers/dma/ti/k3-udma.c > +++ b/drivers/dma/ti/k3-udma.c > @@ -3924,7 +3924,7 @@ static void udma_synchronize(struct dma_chan *chan) > > vchan_synchronize(&uc->vc); > > - if (uc->state == UDMA_CHAN_IS_TERMINATING) { > + if (uc->state == UDMA_CHAN_IS_TERMINATING && !uc->cyclic) { > timeout = wait_for_completion_timeout(&uc->teardown_completed, > timeout); > if (!timeout) {
Hi Peter, On 21/02/22 1:42 am, Péter Ujfalusi wrote: > Hi Vignesh, > > On 15/02/2022 06:41, Vignesh Raghavendra wrote: >> In cyclic mode, there is no additional descriptor pushed to collect >> outstanding data on channel teardown. Therefore no need to wait for this >> descriptor to come back. >> >> Without this terminating aplay cmd outputs false error msg like: >> [ 116.402800] ti-bcdma 485c0100.dma-controller: chan1 teardown timeout! > > are you sure it is aplay? It is MEM_TO_DEV, we only use the flush > descriptor for DEV_TO_MEM. MEM_TO_DEV can 'disconnect' from the > peripheral to flush out the FIFO. > Yes, this is with aplay. You are right that MEM_TO_DEV should have worked w/o this patch. > I have not seen this on am654, j721e. I can not recall seeing this on > the capture side either. > I dont see it either > The cyclic TR should be able to drain the DEV_TO_MEM by itself and the > TR should terminate. > You are right. There seems to be a trobule with McASP + BCDMA on AM62 which needs more investigation. I see RT c0000000 peer RT 90000000 BCNT 5dc00, peer BCNT 46400 So there is some data stuck in pipe which prevents channel from disabling and TDCM being signaled. My guess is McASP is no longer requesting more data from PDMA. Any way to look at McASP FIFO state/ DMA req enable state? Wondering what else can prevent draining of data. One difference is that AM62 has ti,tlv320aic3106 codec (codec is the master) where J7 uses PCM. Regards Vignesh > >> Signed-off-by: Vignesh Raghavendra <vigneshr@ti.com> >> --- >> drivers/dma/ti/k3-udma.c | 2 +- >> 1 file changed, 1 insertion(+), 1 deletion(-) >> >> diff --git a/drivers/dma/ti/k3-udma.c b/drivers/dma/ti/k3-udma.c >> index 9abb08d353ca0..c9a1b2f312603 100644 >> --- a/drivers/dma/ti/k3-udma.c >> +++ b/drivers/dma/ti/k3-udma.c >> @@ -3924,7 +3924,7 @@ static void udma_synchronize(struct dma_chan *chan) >> >> vchan_synchronize(&uc->vc); >> >> - if (uc->state == UDMA_CHAN_IS_TERMINATING) { >> + if (uc->state == UDMA_CHAN_IS_TERMINATING && !uc->cyclic) { >> timeout = wait_for_completion_timeout(&uc->teardown_completed, >> timeout); >> if (!timeout) { >
On 28/02/22 2:52 pm, Vignesh Raghavendra wrote: > Hi Peter, > > On 21/02/22 1:42 am, Péter Ujfalusi wrote: >> Hi Vignesh, >> >> On 15/02/2022 06:41, Vignesh Raghavendra wrote: >>> In cyclic mode, there is no additional descriptor pushed to collect >>> outstanding data on channel teardown. Therefore no need to wait for this >>> descriptor to come back. >>> >>> Without this terminating aplay cmd outputs false error msg like: >>> [ 116.402800] ti-bcdma 485c0100.dma-controller: chan1 teardown timeout! >> >> are you sure it is aplay? It is MEM_TO_DEV, we only use the flush >> descriptor for DEV_TO_MEM. MEM_TO_DEV can 'disconnect' from the >> peripheral to flush out the FIFO. >> > > Yes, this is with aplay. You are right that MEM_TO_DEV should have > worked w/o this patch. > > >> I have not seen this on am654, j721e. I can not recall seeing this on >> the capture side either. >> > > I dont see it either > >> The cyclic TR should be able to drain the DEV_TO_MEM by itself and the >> TR should terminate. >> > > You are right. There seems to be a trobule with McASP + BCDMA on AM62 > which needs more investigation. I see > > RT c0000000 peer RT 90000000 > BCNT 5dc00, peer BCNT 46400 > > So there is some data stuck in pipe which prevents channel from > disabling and TDCM being signaled. My guess is McASP is no longer > requesting more data from PDMA. Any way to look at McASP FIFO state/ DMA > req enable state? Wondering what else can prevent draining of data. > > One difference is that AM62 has ti,tlv320aic3106 codec (codec is the > master) where J7 uses PCM. > I see couple of issues with DMA usage by McASP/sound: McASP TX FIFO events are disabled first and then DMA channel is stopped. This does not work for K3 SoCs as some data remains stuck in DMA pipe and channel never goes to disable state. I see .stop_dma_first flag in snd_soc_dai_link to force DMA to be stopped first, but I am not quite familiar on where to set this flag? Even so, snd_dmaengine_pcm_trigger() calls dmaengine_terminate_async() and does not call dmaengine_synchronize() before disabling McASP TX, so channel teardown would still be unsuccessful. Alternately, we could reduce dev_warn() in udma_synchronize() to dev_dbg() as channel is still recoverable via udma_reset_chan() which is done immediately after. There is a further dev_warn() message to indicate if channel refused to stop even after a reset? Regards Vignesh > Regards > Vignesh > > >> >>> Signed-off-by: Vignesh Raghavendra <vigneshr@ti.com> >>> --- >>> drivers/dma/ti/k3-udma.c | 2 +- >>> 1 file changed, 1 insertion(+), 1 deletion(-) >>> >>> diff --git a/drivers/dma/ti/k3-udma.c b/drivers/dma/ti/k3-udma.c >>> index 9abb08d353ca0..c9a1b2f312603 100644 >>> --- a/drivers/dma/ti/k3-udma.c >>> +++ b/drivers/dma/ti/k3-udma.c >>> @@ -3924,7 +3924,7 @@ static void udma_synchronize(struct dma_chan *chan) >>> >>> vchan_synchronize(&uc->vc); >>> >>> - if (uc->state == UDMA_CHAN_IS_TERMINATING) { >>> + if (uc->state == UDMA_CHAN_IS_TERMINATING && !uc->cyclic) { >>> timeout = wait_for_completion_timeout(&uc->teardown_completed, >>> timeout); >>> if (!timeout) { >>
Hi Vignesh, On 28/02/2022 13:18, Vignesh Raghavendra wrote: > > > On 28/02/22 2:52 pm, Vignesh Raghavendra wrote: >> Hi Peter, >> >> On 21/02/22 1:42 am, Péter Ujfalusi wrote: >>> Hi Vignesh, >>> >>> On 15/02/2022 06:41, Vignesh Raghavendra wrote: >>>> In cyclic mode, there is no additional descriptor pushed to collect >>>> outstanding data on channel teardown. Therefore no need to wait for this >>>> descriptor to come back. >>>> >>>> Without this terminating aplay cmd outputs false error msg like: >>>> [ 116.402800] ti-bcdma 485c0100.dma-controller: chan1 teardown timeout! >>> >>> are you sure it is aplay? It is MEM_TO_DEV, we only use the flush >>> descriptor for DEV_TO_MEM. MEM_TO_DEV can 'disconnect' from the >>> peripheral to flush out the FIFO. >>> >> >> Yes, this is with aplay. You are right that MEM_TO_DEV should have >> worked w/o this patch. >> >> >>> I have not seen this on am654, j721e. I can not recall seeing this on >>> the capture side either. >>> >> >> I dont see it either >> >>> The cyclic TR should be able to drain the DEV_TO_MEM by itself and the >>> TR should terminate. >>> >> >> You are right. There seems to be a trobule with McASP + BCDMA on AM62 >> which needs more investigation. I see >> >> RT c0000000 peer RT 90000000 >> BCNT 5dc00, peer BCNT 46400 In case of MEM_TO_DEV stop we set the peer DMA (PDMA) to flush and set the UDMA/BCDMA/PKTDMA to tdown. If the flush is set for the PDMA, it will (should) disconnect it's trigger from the peripheral and 'free run' to flush all data. Afaik the PDMA on am62 is the same as it is on j721e, no? >> So there is some data stuck in pipe which prevents channel from >> disabling and TDCM being signaled. My guess is McASP is no longer >> requesting more data from PDMA. Any way to look at McASP FIFO state/ DMA >> req enable state? Wondering what else can prevent draining of data. >> >> One difference is that AM62 has ti,tlv320aic3106 codec (codec is the >> master) where J7 uses PCM. >> > > I see couple of issues with DMA usage by McASP/sound: > > McASP TX FIFO events are disabled first and then DMA channel is stopped. > This does not work for K3 SoCs as some data remains stuck in DMA pipe > and channel never goes to disable state. You can not really stop the DMA first because the McASP would undeflow right away. The McASP FIFO should be in bypass mode for K3 devices, the PDMA can handle the feeding of McASP just fine. One of the reasons to have the FLUSH on the peer (PDMA) side is exactly this: to be able to drain the MEM_TO_DEV DMA FIFO to /dev/null even if the peripheral is long gone (disabled, even powered down). > I see .stop_dma_first flag in snd_soc_dai_link to force DMA to be > stopped first, but I am not quite familiar on where to set this flag? The stop_dma_first might work, but it is actually added to support pxa (I think?) where they have two separate DMAs on two side of an external FIFO. The DMAengine side need to be stopped first and then they have open coded DMA code to busy loop to wait for the other DMA (non DMAengine) to drain out the data. > Even so, snd_dmaengine_pcm_trigger() calls dmaengine_terminate_async() > and does not call dmaengine_synchronize() before disabling McASP TX, so > channel teardown would still be unsuccessful. We can not do a dmaengine_synchronize() in pcm.trigger as it is in atomic context and we can not sleep. > Alternately, we could reduce dev_warn() in udma_synchronize() to > dev_dbg() as channel is still recoverable via udma_reset_chan() which > is done immediately after. > There is a further dev_warn() message to indicate if channel refused to > stop even after a reset? The problem is that it is also possible that after a forced shutdown the channel is not going to work anymore (we have this issue with am65, if I recall right). I would consult with the hardware team to understand what is going on, make sure that the McASP AFIFO is disabled (evnums are 0). If it is really something in the hardware that behaves differently in am62 then add a quirk to handle it and implement a workaround. > > Regards > Vignesh > >> Regards >> Vignesh >> >> >>> >>>> Signed-off-by: Vignesh Raghavendra <vigneshr@ti.com> >>>> --- >>>> drivers/dma/ti/k3-udma.c | 2 +- >>>> 1 file changed, 1 insertion(+), 1 deletion(-) >>>> >>>> diff --git a/drivers/dma/ti/k3-udma.c b/drivers/dma/ti/k3-udma.c >>>> index 9abb08d353ca0..c9a1b2f312603 100644 >>>> --- a/drivers/dma/ti/k3-udma.c >>>> +++ b/drivers/dma/ti/k3-udma.c >>>> @@ -3924,7 +3924,7 @@ static void udma_synchronize(struct dma_chan *chan) >>>> >>>> vchan_synchronize(&uc->vc); >>>> >>>> - if (uc->state == UDMA_CHAN_IS_TERMINATING) { >>>> + if (uc->state == UDMA_CHAN_IS_TERMINATING && !uc->cyclic) { >>>> timeout = wait_for_completion_timeout(&uc->teardown_completed, >>>> timeout); >>>> if (!timeout) { >>>
Hello Vignesh, reviving this old email, since it seems that it raised a valid issue that is still not resolved. On Tue, Feb 15, 2022 at 10:11:12AM +0530, Vignesh Raghavendra wrote: > In cyclic mode, there is no additional descriptor pushed to collect > outstanding data on channel teardown. Therefore no need to wait for this > descriptor to come back. > > Without this terminating aplay cmd outputs false error msg like: > [ 116.402800] ti-bcdma 485c0100.dma-controller: chan1 teardown timeout! I am experiencing a similar issue on Toradex Verdin AM62: ``` kern :warn : [ 40.196661] ti-udma 485c0100.dma-controller: chan2 teardown timeout! kern :warn : [ 41.077013] kauditd_printk_skb: 14 callbacks suppressed kern :warn : [ 46.404672] ti-udma 485c0100.dma-controller: chan2 teardown timeout! kern :warn : [ 50.852643] ti-udma 485c0100.dma-controller: chan1 teardown timeout! kern :warn : [ 56.932709] ti-udma 485c0100.dma-controller: chan1 teardown timeout! kern :warn : [ 64.516797] ti-udma 485c0100.dma-controller: chan1 teardown timeout! kern :warn : [ 70.404724] ti-udma 485c0100.dma-controller: chan1 teardown timeout! kern :warn : [ 71.588933] ti-udma 485c0100.dma-controller: chan2 teardown timeout! kern :warn : [ 75.908818] ti-udma 485c0100.dma-controller: chan1 teardown timeout! kern :warn : [ 77.064761] ti-udma 485c0100.dma-controller: chan2 teardown timeout! ``` I would say just the same, since this happens everytime aplay terminates. Any idea to progress on this? Francesco
diff --git a/drivers/dma/ti/k3-udma.c b/drivers/dma/ti/k3-udma.c index 9abb08d353ca0..c9a1b2f312603 100644 --- a/drivers/dma/ti/k3-udma.c +++ b/drivers/dma/ti/k3-udma.c @@ -3924,7 +3924,7 @@ static void udma_synchronize(struct dma_chan *chan) vchan_synchronize(&uc->vc); - if (uc->state == UDMA_CHAN_IS_TERMINATING) { + if (uc->state == UDMA_CHAN_IS_TERMINATING && !uc->cyclic) { timeout = wait_for_completion_timeout(&uc->teardown_completed, timeout); if (!timeout) {
In cyclic mode, there is no additional descriptor pushed to collect outstanding data on channel teardown. Therefore no need to wait for this descriptor to come back. Without this terminating aplay cmd outputs false error msg like: [ 116.402800] ti-bcdma 485c0100.dma-controller: chan1 teardown timeout! Signed-off-by: Vignesh Raghavendra <vigneshr@ti.com> --- drivers/dma/ti/k3-udma.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)