ASoC: SOF: Intel: hda: unsolicited RIRB response

Message ID 1591883073-17190-1-git-send-email-brent.lu@intel.com
State New

Commit Message

Brent Lu June 11, 2020, 1:44 p.m. UTC
The loop implementation cannot solve the unsolicited response issue
because RIRBSTS is cleared after snd_hdac_bus_update_rirb() returns.
The next iteration then fails the status test against RIRB_INT_MASK
and skips all RIRB handling. Moreover, an unsolicited response can
always arrive during the last iteration, regardless of the number of
iterations.

Clear the RIRB interrupt before handling it so that an unsolicited
response can trigger another RIRB interrupt and be handled later.

Signed-off-by: Brent Lu <brent.lu@intel.com>
---
 sound/soc/sof/intel/hda-stream.c | 48 +++++++++++++++++-----------------------
 1 file changed, 20 insertions(+), 28 deletions(-)

Comments

Ranjani Sridharan June 11, 2020, 2:26 p.m. UTC | #1
On Thu, 2020-06-11 at 21:44 +0800, Brent Lu wrote:
> The loop implementation could not solve the unsolicited response
> issue because the RIRBSTS is cleared after leaving the
> snd_hdac_bus_update_rirb() function. So the next loop will fail the
> status test against the RIRB_INT_MASK and skip all the RIRB handling
> stuff. On the other hand, there alwasy could be unsolicited responses
> in the last loop regardless the number of loops.
> 
> Clear the RIRB interrupt before handling it so unsolicited response
> could trigger another RIRB interrupt to handle it later.
Hi Brent,

Thanks for the patch. Is this fix for a specific issue you're seeing?
If so, could you please give us some details about it?

Thanks,
Ranjani
Brent Lu June 11, 2020, 5:09 p.m. UTC | #2
> Hi Brent,
> 
> Thanks for the patch. Is this fix for a specific issue you're seeing?
> If so, could you please give us some details about it?
> 
> Thanks,
> Ranjani

Hi Ranjani,

This is reported on the GLK Chromebook 'Fleex': sometimes it cannot
output the audio stream to an external display. The kernel is the
Chrome v4.14 branch. Below are the reproduction steps provided by the
ODM, but I could reproduce the issue simply by running aplay or
cras_test_client, so I think it is not related to cable plug/unplug
handling.

What steps will reproduce the problem?
1. Play a YouTube video on the Chromebook and connect it to an external monitor with a Type C to DP dongle
2. Press the monitor power button to turn off the monitor
3. Press the monitor power button again to turn on the monitor
4. Continue to play the YouTube video and check audio playback
5. No sound comes out of the external monitor's built-in speaker after the monitor is turned back on

I added debug messages to print the RIRBWP register and realized that
a response can arrive between the read of RIRBWP in the
snd_hdac_bus_update_rirb() function and the interrupt clear in the
hda_dsp_stream_interrupt() function. The response is not handled, but
the interrupt is already cleared. This causes a timeout unless more
responses arrive in the RIRB.

[   69.173507] sof-audio-pci 0000:00:0e.0: snd_hdac_bus_get_response: addr 0x2
[   69.173567] sof-audio-pci 0000:00:0e.0: snd_hdac_bus_update_rirb: cmds 1 res 0 rp 21 wp 21
=> handle the response in slot 21
[   69.173570] sof-audio-pci 0000:00:0e.0: snd_hdac_bus_update_rirb: updated wp 22
=> new response in slot 22 but not handled
[   70.174089] sof-audio-pci 0000:00:0e.0: snd_hdac_bus_get_response: timeout, wp 22
[   70.174106] HDMI HDA Codec ehdaudio0D2: codec_read: fail to read codec
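
The ordering problem in the log above can be illustrated with a small user-space model. This is only a sketch under stated assumptions: `struct fake_rirb`, `hw_post_response()` and both handler functions are hypothetical names made up for illustration, and the "hardware" is reduced to a write pointer plus one latched status bit. It is not driver code.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical toy model of the controller state (not a kernel structure). */
struct fake_rirb {
	unsigned int wp;	/* hardware write pointer, stands in for RIRBWP */
	unsigned int rp;	/* last slot the driver has consumed */
	bool sts;		/* latched response bit, stands in for RIRBSTS */
};

/* Hardware posts one response: advance wp and latch the status bit. */
static void hw_post_response(struct fake_rirb *r)
{
	r->wp++;
	r->sts = true;
}

/* Driver consumes every slot up to the wp value it sampled earlier. */
static void drv_consume(struct fake_rirb *r, unsigned int sampled_wp)
{
	assert(sampled_wp <= r->wp);
	r->rp = sampled_wp;
}

/*
 * Old order: sample wp, handle, then clear the status bit.
 * Returns true if a response is left pending with no interrupt latched.
 */
static bool handler_clear_after(struct fake_rirb *r, bool late_response)
{
	unsigned int wp = r->wp;

	if (late_response)
		hw_post_response(r);	/* arrives after wp was sampled */
	drv_consume(r, wp);
	r->sts = false;			/* also wipes the late response's bit */

	return r->rp != r->wp && !r->sts;
}

/* New order: clear the status bit first, then sample wp and handle. */
static bool handler_clear_before(struct fake_rirb *r, bool late_response)
{
	r->sts = false;

	unsigned int wp = r->wp;

	if (late_response)
		hw_post_response(r);	/* re-latches the status bit */
	drv_consume(r, wp);

	return r->rp != r->wp && !r->sts;
}
```

Starting from `wp = 21` with the status bit set and one late response, `handler_clear_after()` ends with slot 22 unconsumed and no status bit set, which matches the timeout in the log. With the same input, `handler_clear_before()` also leaves slot 22 unconsumed, but the status bit is set again, so a later interrupt can pick it up.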

I found a commit addressing this issue and cherry-picked it to
Chrome v4.14, but the issue is still there. I think more loop
iterations do not help, because eventually a response will arrive
during the snd_hdac_bus_update_rirb() function and be left unhandled
in the last iteration.

commit 6297a0dc4c14a62bea5a9137ceef280cb7a80665
Author: Ranjani Sridharan <ranjani.sridharan@linux.intel.com>
Date:   Wed Jun 12 12:23:40 2019 -0500

    ASoC: SOF: Intel: hda: modify stream interrupt handler

    Modify the stream interrupt handler to always wake up the
    IRQ thread if the status register is valid. The IRQ thread
    performs the check for stream interrupts and RIRB interrupts
    in a loop to handle the case of missed interrupts when an
    unsolicited response from the codec is received just before the
    stream interrupt handler is completed.


Regards,
Brent
Takashi Iwai June 11, 2020, 5:59 p.m. UTC | #3
On Thu, 11 Jun 2020 19:09:08 +0200,
Lu, Brent wrote:
> 
> > Hi Brent,
> > 
> > Thanks for the patch. Is this fix for a specific issue you're seeing?
> > If so, could you please give us some details about it?
> > 
> > Thanks,
> > Ranjani
> 
> Hi Ranjani,
> 
> It's reported to happen on GLK Chromebook 'Fleex' that sometimes it
> cannot output the audio stream to external display. The kernel is
> Chrome v4.14 branch. Following is the reproduce step provided by
> ODM but I could reproduce it simply running aplay or cras_test_client
> so I think it's not about the cable plug/unplug handling.
> 
> What steps will reproduce the problem?
> 1.      Play YouTube video on Chromebook and connect it to external monitor with Type C to DP dongle
> 2.      Press monitor power button to turn off the monitor
> 3.      Press monitor power button again to turn on the monitor
> 4.      Continue to play YouTube video and check audio playback
> 5.      No sound comes out from built-in speaker of external monitor when turn on external monitor
> 
> I added debug messages to print the RIRBWP register and realize that
> response could come between the read of RIRBWP in the
> snd_hdac_bus_update_rirb() function and the interrupt clear in the
> hda_dsp_stream_interrupt() function. The response is not handled but
> the interrupt is already cleared. It will cause timeout unless more
> responses coming to RIRB.

Now I noticed that the legacy driver already addressed it recently via
commit 6d011d5057ff
    ALSA: hda: Clear RIRB status before reading WP

We should have checked SOF at the same time, too...


thanks,

Takashi
Pierre-Louis Bossart June 11, 2020, 6:01 p.m. UTC | #4
On 6/11/20 12:09 PM, Lu, Brent wrote:
>> Hi Brent,
>>
>> Thanks for the patch. Is this fix for a specific issue you're seeing?
>> If so, could you please give us some details about it?
>>
>> Thanks,
>> Ranjani
> 
> Hi Ranjani,
> 
> It's reported to happen on GLK Chromebook 'Fleex' that sometimes it
> cannot output the audio stream to external display. The kernel is
> Chrome v4.14 branch. Following is the reproduce step provided by
> ODM but I could reproduce it simply running aplay or cras_test_client
> so I think it's not about the cable plug/unplug handling.
> 
> What steps will reproduce the problem?
> 1.      Play YouTube video on Chromebook and connect it to external monitor with Type C to DP dongle
> 2.      Press monitor power button to turn off the monitor
> 3.      Press monitor power button again to turn on the monitor
> 4.      Continue to play YouTube video and check audio playback
> 5.      No sound comes out from built-in speaker of external monitor when turn on external monitor
> 
> I added debug messages to print the RIRBWP register and realize that
> response could come between the read of RIRBWP in the
> snd_hdac_bus_update_rirb() function and the interrupt clear in the
> hda_dsp_stream_interrupt() function. The response is not handled but
> the interrupt is already cleared. It will cause timeout unless more
> responses coming to RIRB.
> 
> [   69.173507] sof-audio-pci 0000:00:0e.0: snd_hdac_bus_get_response: addr 0x2
> [   69.173567] sof-audio-pci 0000:00:0e.0: snd_hdac_bus_update_rirb: cmds 1 res 0 rp 21 wp 21
> => handle the response in slot 21
> [   69.173570] sof-audio-pci 0000:00:0e.0: snd_hdac_bus_update_rirb: updated wp 22
> => new response in slot 22 but not handled
> [   70.174089] sof-audio-pci 0000:00:0e.0: snd_hdac_bus_get_response: timeout, wp 22
> [   70.174106] HDMI HDA Codec ehdaudio0D2: codec_read: fail to read codec
> 
> I found there is a commit addressing this issue and cherry-pick it to the
> Chrome v4.14 but the issue is still there. I think more loop does not help
> because eventually there will be response coming in the
> snd_hdac_bus_update_rirb() function and become unhandled response
> in the last loop.

IIRC the loop was added because on some versions of the hardware we seem 
to miss stream interrupts - and I believe this was inspired by the same 
solution with the legacy HDaudio driver.

Maybe we need to do something better for unsolicited responses but I'd 
be surprised if we can remove this loop - this sounds like asking for 
trouble.

> 
> commit 6297a0dc4c14a62bea5a9137ceef280cb7a80665
> Author: Ranjani Sridharan <ranjani.sridharan@linux.intel.com>
> Date:   Wed Jun 12 12:23:40 2019 -0500
> 
>      ASoC: SOF: Intel: hda: modify stream interrupt handler
> 
>      Modify the stream interrupt handler to always wake up the
>      IRQ thread if the status register is valid. The IRQ thread
>      performs the check for stream interrupts and RIRB interrupts
>      in a loop to handle the case of missed interrupts when an
>      unsolicited response from the codec is received just before the
>      stream interrupt handler is completed.
> 
> 
> Regards,
> Brent
>
Ranjani Sridharan June 11, 2020, 6:12 p.m. UTC | #5
On Thu, 2020-06-11 at 19:59 +0200, Takashi Iwai wrote:
> On Thu, 11 Jun 2020 19:09:08 +0200,
> Lu, Brent wrote:
> > 
> > > Hi Brent,
> > > 
> > > Thanks for the patch. Is this fix for a specific issue you're
> > > seeing?
> > > If so, could you please give us some details about it?
> > > 
> > > Thanks,
> > > Ranjani
> > 
> > Hi Ranjani,
> > 
> > It's reported to happen on GLK Chromebook 'Fleex' that sometimes it
> > cannot output the audio stream to external display. The kernel is
> > Chrome v4.14 branch. Following is the reproduce step provided by
> > ODM but I could reproduce it simply running aplay or
> > cras_test_client
> > so I think it's not about the cable plug/unplug handling.
> > 
> > What steps will reproduce the problem?
> > 1.      Play YouTube video on Chromebook and connect it to external
> > monitor with Type C to DP dongle
> > 2.      Press monitor power button to turn off the monitor
> > 3.      Press monitor power button again to turn on the monitor
> > 4.      Continue to play YouTube video and check audio playback
> > 5.      No sound comes out from built-in speaker of external
> > monitor when turn on external monitor
> > 
> > I added debug messages to print the RIRBWP register and realize
> > that
> > response could come between the read of RIRBWP in the
> > snd_hdac_bus_update_rirb() function and the interrupt clear in the
> > hda_dsp_stream_interrupt() function. The response is not handled
> > but
> > the interrupt is already cleared. It will cause timeout unless more
> > responses coming to RIRB.
> 
> Now I noticed that the legacy driver already addressed it recently
> via
> commit 6d011d5057ff
>     ALSA: hda: Clear RIRB status before reading WP
> 
> We should have checked SOF at the same time, too...

Thanks, Takashi. But the legacy driver doesn't remove the loop. The
loop added in the SOF driver was based on the legacy driver and is
there specifically to handle missed stream interrupts. Is there any
harm in keeping the loop?

Thanks,
Ranjani
Takashi Iwai June 11, 2020, 8:14 p.m. UTC | #6
On Thu, 11 Jun 2020 20:12:53 +0200,
Ranjani Sridharan wrote:
> 
> On Thu, 2020-06-11 at 19:59 +0200, Takashi Iwai wrote:
> > On Thu, 11 Jun 2020 19:09:08 +0200,
> > Lu, Brent wrote:
> > > 
> > > > Hi Brent,
> > > > 
> > > > Thanks for the patch. Is this fix for a specific issue you're
> > > > seeing?
> > > > If so, could you please give us some details about it?
> > > > 
> > > > Thanks,
> > > > Ranjani
> > > 
> > > Hi Ranjani,
> > > 
> > > It's reported to happen on GLK Chromebook 'Fleex' that sometimes it
> > > cannot output the audio stream to external display. The kernel is
> > > Chrome v4.14 branch. Following is the reproduce step provided by
> > > ODM but I could reproduce it simply running aplay or
> > > cras_test_client
> > > so I think it's not about the cable plug/unplug handling.
> > > 
> > > What steps will reproduce the problem?
> > > 1.      Play YouTube video on Chromebook and connect it to external
> > > monitor with Type C to DP dongle
> > > 2.      Press monitor power button to turn off the monitor
> > > 3.      Press monitor power button again to turn on the monitor
> > > 4.      Continue to play YouTube video and check audio playback
> > > 5.      No sound comes out from built-in speaker of external
> > > monitor when turn on external monitor
> > > 
> > > I added debug messages to print the RIRBWP register and realize
> > > that
> > > response could come between the read of RIRBWP in the
> > > snd_hdac_bus_update_rirb() function and the interrupt clear in the
> > > hda_dsp_stream_interrupt() function. The response is not handled
> > > but
> > > the interrupt is already cleared. It will cause timeout unless more
> > > responses coming to RIRB.
> > 
> > Now I noticed that the legacy driver already addressed it recently
> > via
> > commit 6d011d5057ff
> >     ALSA: hda: Clear RIRB status before reading WP
> > 
> > We should have checked SOF at the same time, too...
> 
> Thanks, Takashi. But the legacy driver but doesnt remove the loop. The
> loop added in the SOF driver was based on the legacy driver and
> specifically to handle missed stream interrupts. Is there any harm in
> keeping the loop?

A loop there might be safer to keep, indeed.  That's basically for a
different kind of race, and it can still happen theoretically.

Though, SOF uses a threaded interrupt, and it's interesting how
the behavior differs.  I can imagine that, if a threaded irq is running
while a new IRQ is re-triggered, the hard irq handler won't queue it
again.  But I might be wrong here; this needs some checks.


Takashi
Pierre-Louis Bossart June 11, 2020, 8:36 p.m. UTC | #7
>>>> I added debug messages to print the RIRBWP register and realize
>>>> that
>>>> response could come between the read of RIRBWP in the
>>>> snd_hdac_bus_update_rirb() function and the interrupt clear in the
>>>> hda_dsp_stream_interrupt() function. The response is not handled
>>>> but
>>>> the interrupt is already cleared. It will cause timeout unless more
>>>> responses coming to RIRB.
>>>
>>> Now I noticed that the legacy driver already addressed it recently
>>> via
>>> commit 6d011d5057ff
>>>      ALSA: hda: Clear RIRB status before reading WP
>>>
>>> We should have checked SOF at the same time, too...
>>
>> Thanks, Takashi. But the legacy driver but doesnt remove the loop. The
>> loop added in the SOF driver was based on the legacy driver and
>> specifically to handle missed stream interrupts. Is there any harm in
>> keeping the loop?
> 
> A loop there might be safer to keep, indeed.  That's basically for a
> difference kind of race, and it can still happen theoretically.
> 
> Though, SOF is with the threaded interrupt, and it's interesting how
> the behavior differs.  I can imagine that, if a thread irq is running
> while a new IRQ is re-triggered, the hard irq handler won't queue it
> again.  But I might be wrong here, need some checks.

IIRC we added this loop before merging all interrupt handling in one
thread. Somehow the MSI mode never worked reliably without this change,
so maybe we don't need this loop any longer.

I'd really prefer it if we didn't tie the RIRB handling change to this
loop change; removing the loop should only be done with *a lot of testing*.
Brent Lu June 11, 2020, 11:33 p.m. UTC | #8
> 
> IIRC we added this loop before merging all interrupt handling in one thread,
> somehow the MSI mode never worked reliably without this change, so
> maybe we don't need this loop any longer.
> 
> I'd really prefer it if we didn't tie the RIRB handing change to this loop change,
> removing the loop should only be done with *a lot of testing*.

I removed the loop because I thought it was there for the unsolicited
response; apparently it's not. I'd like to port commit 6d011d5057ff

    ALSA: hda: Clear RIRB status before reading WP

to the SOF driver and upload a version 2. Thanks.

Regards,
Brent
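
For readers following the thread, the direction agreed here (keep the loop, but acknowledge RIRBSTS before draining the ring) might look roughly like the user-space sketch below. It is not the actual version 2 patch. `struct mock_bus`, `mock_update_rirb()` and `mock_threaded_handler()` are hypothetical stand-ins, the `MOCK_*` constants are illustrative values, stream handling and locking are left out, and the status bits are cleared with a plain mask rather than the write-one-to-clear register access the real driver uses.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Illustrative values; the real driver takes these from the HDA headers. */
#define MOCK_RIRB_INT_RESPONSE	0x01
#define MOCK_RIRB_INT_MASK	0x05

/* Hypothetical stand-in for the controller state the handler touches. */
struct mock_bus {
	uint8_t rirbsts;	/* latched RIRB status bits */
	unsigned int wp;	/* slots written by the hardware */
	unsigned int rp;	/* slots consumed by the driver */
	unsigned int late;	/* responses that arrive while draining */
};

/*
 * Drain up to the write pointer sampled on entry. A response that
 * arrives after the sample advances wp and latches the status bit again.
 */
static void mock_update_rirb(struct mock_bus *bus)
{
	unsigned int wp = bus->wp;

	if (bus->late) {
		bus->late--;
		bus->wp++;
		bus->rirbsts |= MOCK_RIRB_INT_RESPONSE;
	}
	assert(bus->rp <= wp);
	bus->rp = wp;
}

/* Returns the number of loop passes that found RIRB work to do. */
static int mock_threaded_handler(struct mock_bus *bus)
{
	bool active = true;
	int passes = 0;

	for (int i = 0; i < 10 && active; i++) {
		uint8_t sts = bus->rirbsts;

		active = false;
		if (sts & MOCK_RIRB_INT_MASK) {
			active = true;
			passes++;
			/* acknowledge first so a late response re-latches */
			bus->rirbsts &= (uint8_t)~MOCK_RIRB_INT_MASK;
			if (sts & MOCK_RIRB_INT_RESPONSE)
				mock_update_rirb(bus);
		}
	}

	return passes;
}
```

In this model, one pending response plus one late arrival takes two passes: the first pass drains the sampled slot while the late response sets the status bit again, and the second pass drains the late slot. Because the acknowledge happens before the drain, the loop's status test can still see the late response, which is the point raised in the original commit message.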
Brent Lu June 12, 2020, 6:15 a.m. UTC | #9
> 
> Now I noticed that the legacy driver already addressed it recently via commit
> 6d011d5057ff
>     ALSA: hda: Clear RIRB status before reading WP
> 
> We should have checked SOF at the same time, too...
> 
> 
> thanks,
> 
> Takashi

Hi Takashi-san,

Yes, you are correct. I tested Chrome v5.4 on a CML Chromebook 'hatch' and
realized that SOF does not suffer from this issue because the 'sync write'
feature is enabled in hda_init. I could reproduce the issue soon after turning
it off. So I think it's still worth having this fix in case we need to disable
'sync write' someday.


Regards,
Brent

Patch

diff --git a/sound/soc/sof/intel/hda-stream.c b/sound/soc/sof/intel/hda-stream.c
index 7f65dcc..d21ac42 100644
--- a/sound/soc/sof/intel/hda-stream.c
+++ b/sound/soc/sof/intel/hda-stream.c
@@ -589,11 +589,10 @@  hda_dsp_set_bytes_transferred(struct hdac_stream *hstream, u64 buffer_size)
 	hstream->curr_pos += num_bytes;
 }
 
-static bool hda_dsp_stream_check(struct hdac_bus *bus, u32 status)
+static void hda_dsp_stream_check(struct hdac_bus *bus, u32 status)
 {
 	struct sof_intel_hda_dev *sof_hda = bus_to_sof_hda(bus);
 	struct hdac_stream *s;
-	bool active = false;
 	u32 sd_status;
 
 	list_for_each_entry(s, &bus->stream_list, list) {
@@ -605,7 +604,6 @@  static bool hda_dsp_stream_check(struct hdac_bus *bus, u32 status)
 
 			snd_hdac_stream_writeb(s, SD_STS, sd_status);
 
-			active = true;
 			if ((!s->substream && !s->cstream) ||
 			    !s->running ||
 			    (sd_status & SOF_HDA_CL_DMA_SD_INT_COMPLETE) == 0)
@@ -621,8 +619,6 @@  static bool hda_dsp_stream_check(struct hdac_bus *bus, u32 status)
 			}
 		}
 	}
-
-	return active;
 }
 
 irqreturn_t hda_dsp_stream_threaded_handler(int irq, void *context)
@@ -632,37 +628,33 @@  irqreturn_t hda_dsp_stream_threaded_handler(int irq, void *context)
 #if IS_ENABLED(CONFIG_SND_SOC_SOF_HDA)
 	u32 rirb_status;
 #endif
-	bool active;
 	u32 status;
-	int i;
 
-	/*
-	 * Loop 10 times to handle missed interrupts caused by
-	 * unsolicited responses from the codec
-	 */
-	for (i = 0, active = true; i < 10 && active; i++) {
-		spin_lock_irq(&bus->reg_lock);
+	spin_lock_irq(&bus->reg_lock);
 
-		status = snd_hdac_chip_readl(bus, INTSTS);
+	status = snd_hdac_chip_readl(bus, INTSTS);
 
-		/* check streams */
-		active = hda_dsp_stream_check(bus, status);
+	/* check streams */
+	hda_dsp_stream_check(bus, status);
 
-		/* check and clear RIRB interrupt */
+	/* check and clear RIRB interrupt */
 #if IS_ENABLED(CONFIG_SND_SOC_SOF_HDA)
-		if (status & AZX_INT_CTRL_EN) {
-			rirb_status = snd_hdac_chip_readb(bus, RIRBSTS);
-			if (rirb_status & RIRB_INT_MASK) {
-				active = true;
-				if (rirb_status & RIRB_INT_RESPONSE)
-					snd_hdac_bus_update_rirb(bus);
-				snd_hdac_chip_writeb(bus, RIRBSTS,
-						     RIRB_INT_MASK);
-			}
+	if (status & AZX_INT_CTRL_EN) {
+		rirb_status = snd_hdac_chip_readb(bus, RIRBSTS);
+		if (rirb_status & RIRB_INT_MASK) {
+			/*
+			 * clear current interrupt before reading RIRBWP
+			 * so unsolicited response could trigger another
+			 * interrupt
+			 */
+			snd_hdac_chip_writeb(bus, RIRBSTS, RIRB_INT_MASK);
+
+			if (rirb_status & RIRB_INT_RESPONSE)
+				snd_hdac_bus_update_rirb(bus);
 		}
-#endif
-		spin_unlock_irq(&bus->reg_lock);
 	}
+#endif
+	spin_unlock_irq(&bus->reg_lock);
 
 	return IRQ_HANDLED;
 }