[v3,2/2] ASoC: Intel: Add period size constraint on strago board

Message ID 1596198365-10105-3-git-send-email-brent.lu@intel.com (mailing list archive)
State New, archived
Series Add period size constraint for Atom Chromebook

Commit Message

Brent Lu July 31, 2020, 12:26 p.m. UTC
From: Yu-Hsuan Hsu <yuhsuan@chromium.org>

The CRAS server does not set the period size in hw_param so ALSA will
calculate a value for period size which is based on the buffer size
and other parameters. The value may not always be aligned with Atom's
dsp design so a constraint is added to make sure the board always has
a good value.

Cyan uses chtmax98090; the others (banon, celes, edgar, kefka...) use
rt5650.

Signed-off-by: Yu-Hsuan Hsu <yuhsuan@chromium.org>
Signed-off-by: Brent Lu <brent.lu@intel.com>
---
 sound/soc/intel/boards/cht_bsw_max98090_ti.c | 14 +++++++++++++-
 sound/soc/intel/boards/cht_bsw_rt5645.c      | 14 +++++++++++++-
 2 files changed, 26 insertions(+), 2 deletions(-)

Comments

Takashi Iwai July 31, 2020, 1:34 p.m. UTC | #1
On Fri, 31 Jul 2020 14:26:05 +0200,
Brent Lu wrote:
> 
> From: Yu-Hsuan Hsu <yuhsuan@chromium.org>
> 
> The CRAS server does not set the period size in hw_param so ALSA will
> calculate a value for period size which is based on the buffer size
> and other parameters. The value may not always be aligned with Atom's
> dsp design so a constraint is added to make sure the board always has
> a good value.
> 
> Cyan uses chtmax98090 and others(banon, celes, edgar, kefka...) use
> rt5650.
> 
> Signed-off-by: Yu-Hsuan Hsu <yuhsuan@chromium.org>
> Signed-off-by: Brent Lu <brent.lu@intel.com>
> ---
>  sound/soc/intel/boards/cht_bsw_max98090_ti.c | 14 +++++++++++++-
>  sound/soc/intel/boards/cht_bsw_rt5645.c      | 14 +++++++++++++-
>  2 files changed, 26 insertions(+), 2 deletions(-)
> 
> diff --git a/sound/soc/intel/boards/cht_bsw_max98090_ti.c b/sound/soc/intel/boards/cht_bsw_max98090_ti.c
> index 835e9bd..bf67254 100644
> --- a/sound/soc/intel/boards/cht_bsw_max98090_ti.c
> +++ b/sound/soc/intel/boards/cht_bsw_max98090_ti.c
> @@ -283,8 +283,20 @@ static int cht_codec_fixup(struct snd_soc_pcm_runtime *rtd,
>  
>  static int cht_aif1_startup(struct snd_pcm_substream *substream)
>  {
> -	return snd_pcm_hw_constraint_single(substream->runtime,
> +	int err;
> +
> +	/* Set period size to 240 to align with Atom design */
> +	err = snd_pcm_hw_constraint_minmax(substream->runtime,
> +			SNDRV_PCM_HW_PARAM_PERIOD_SIZE, 240, 240);
> +	if (err < 0)
> +		return err;

Again, is this fixed 240 a must?  Or is this also an alignment
issue?


thanks,

Takashi
Brent Lu Aug. 1, 2020, 8:58 a.m. UTC | #2
> 
> Again, is this fixed 240 is a must?  Or is this also an alignment issue?
Hi Takashi,

I think it's a must for Chromebooks. Google found this value works best
with their CRAS server running on their BSW products. They offered this
patch for their own Chromebooks.

> 
> 
> thanks,
> 
> Takashi
Takashi Iwai Aug. 1, 2020, 9:26 a.m. UTC | #3
On Sat, 01 Aug 2020 10:58:16 +0200,
Lu, Brent wrote:
> 
> > 
> > Again, is this fixed 240 is a must?  Or is this also an alignment issue?
> Hi Takashi,
> 
> I think it's a must for Chromebooks. Google found this value works best
> with their CRAS server running on their BSW products. They offered this
> patch for their own Chromebooks.

Hrm, but it's likely a worse choice on other sound backends.

Please double-check whether this fixed small period is a must, or it's
an alignment issue.


Takashi
Brent Lu Aug. 3, 2020, 1 p.m. UTC | #4
> > >
> > > Again, is this fixed 240 is a must?  Or is this also an alignment issue?
> > Hi Takashi,
> >
> > I think it's a must for Chromebooks. Google found this value works
> > best with their CRAS server running on their BSW products. They
> > offered this patch for their own Chromebooks.
> 
> Hrm, but it's likely a worse choice on other sound backends.
> 
> Please double-check whether this fixed small period is a must, or it's an
> alignment issue.
Hi Takashi,

I've double-checked with Google. It's a must for Chromebooks because of
their low-latency use case.


Regards,
Brent

> 
> 
> Takashi
Pierre-Louis Bossart Aug. 3, 2020, 3:13 p.m. UTC | #5
On 8/3/20 8:00 AM, Lu, Brent wrote:
>>>>
>>>> Again, is this fixed 240 is a must?  Or is this also an alignment issue?
>>> Hi Takashi,
>>>
>>> I think it's a must for Chromebooks. Google found this value works
>>> best with their CRAS server running on their BSW products. They
>>> offered this patch for their own Chromebooks.
>>
>> Hrm, but it's likely a worse choice on other sound backends.
>>
>> Please double-check whether this fixed small period is a must, or it's an
>> alignment issue.
> Hi Takashi,
> 
> I've double checked with google. It's a must for Chromebooks due to low
> latency use case.

I wonder if there's a misunderstanding here?

I believe Takashi's question was "is this a must to ONLY accept 240 
samples for the period size", there was no pushback on the value itself. 
Are those boards broken with e.g. 960 samples?
Brent Lu Aug. 3, 2020, 4:45 p.m. UTC | #6
> > Hi Takashi,
> >
> > I've double checked with google. It's a must for Chromebooks due to
> > low latency use case.
> 
> I wonder if there's a misunderstanding here?
> 
> I believe Takashi's question was "is this a must to ONLY accept 240 samples
> for the period size", there was no pushback on the value itself.
> Are those boards broken with e.g. 960 samples?

I've added the Google people so we can discuss this directly.

Hi Yuhsuan,
Would you explain why CRAS needs to use such a short period size? Thanks.


Regards,
Brent
Takashi Iwai Aug. 3, 2020, 4:56 p.m. UTC | #7
On Mon, 03 Aug 2020 18:45:29 +0200,
Lu, Brent wrote:
> 
> > > Hi Takashi,
> > >
> > > I've double checked with google. It's a must for Chromebooks due to
> > > low latency use case.
> > 
> > I wonder if there's a misunderstanding here?
> > 
> > I believe Takashi's question was "is this a must to ONLY accept 240 samples
> > for the period size", there was no pushback on the value itself.
> > Are those boards broken with e.g. 960 samples?
> 
> I've added google people to discuss directly.
> 
> Hi Yuhsuan,
> Would you explain why CRAS needs to use such short period size? Thanks.

To avoid further misunderstanding: it's fine that CRAS *uses* such a
short period.  It's often required for achieving a short latency.

However, the question is whether the driver has to allow *only* this
value to make it work.  IOW, if we don't have this constraint, what
actually happens?  If the driver gives the period size alignment,
wouldn't CRAS choose 240?


Takashi
Brent Lu Aug. 4, 2020, 4:33 a.m. UTC | #8
> 
> For avoid further misunderstanding: it's fine that CRAS *uses* such a short
> period.  It's often required for achieving a short latency.
> 
> However, the question is whether the driver can set *only* this value for
> making it working.  IOW, if we don't have this constraint, what actually
> happens?  If the driver gives the period size alignment, wouldn't CRAS
> choose 240?

It won't. Without the constraint it becomes 432. Actually, CRAS does not set
the period size explicitly, so the value depends on the constraint rules.

[   52.011146] sound pcmC1D0p: hw_param
[   52.011152] sound pcmC1D0p:   ACCESS 0x1
[   52.011155] sound pcmC1D0p:   FORMAT 0x4
[   52.011158] sound pcmC1D0p:   SUBFORMAT 0x1
[   52.011161] sound pcmC1D0p:   SAMPLE_BITS [16:16]
[   52.011164] sound pcmC1D0p:   FRAME_BITS [32:32]
[   52.011167] sound pcmC1D0p:   CHANNELS [2:2]
[   52.011170] sound pcmC1D0p:   RATE [48000:48000]
[   52.011173] sound pcmC1D0p:   PERIOD_TIME [9000:9000]
[   52.011176] sound pcmC1D0p:   PERIOD_SIZE [432:432]
[   52.011179] sound pcmC1D0p:   PERIOD_BYTES [1728:1728]
[   52.011182] sound pcmC1D0p:   PERIODS [474:474]
[   52.011185] sound pcmC1D0p:   BUFFER_TIME [4266000:4266000]
[   52.011188] sound pcmC1D0p:   BUFFER_SIZE [204768:204768]
[   52.011191] sound pcmC1D0p:   BUFFER_BYTES [819072:819072]
[   52.011194] sound pcmC1D0p:   TICK_TIME [0:0]

Regards,
Brent
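
As a cross-check of the hw_param dump above, the time, frame, and byte figures can be derived from one another. The following is a minimal sketch in plain userspace C (not kernel code); the helper names `frames_to_us`, `period_bytes`, and `buffer_frames` are hypothetical and exist only for this illustration.

```c
#include <stdint.h>

/* Duration in microseconds of `frames` frames at `rate_hz` frames per second. */
static uint64_t frames_to_us(uint64_t frames, uint64_t rate_hz)
{
	return frames * 1000000u / rate_hz;
}

/* Bytes in one period; frame_bits is the FRAME_BITS value (all channels). */
static uint64_t period_bytes(uint64_t frames, uint64_t frame_bits)
{
	return frames * frame_bits / 8;
}

/* Buffer size in frames when the buffer holds a whole number of periods. */
static uint64_t buffer_frames(uint64_t frames_per_period, uint64_t periods)
{
	return frames_per_period * periods;
}
```

With the values from the dump, `frames_to_us(432, 48000)` gives 9000, `period_bytes(432, 32)` gives 1728, and `buffer_frames(432, 474)` gives 204768, which match PERIOD_TIME, PERIOD_BYTES, and BUFFER_SIZE. In other words, the 432-frame period corresponds to a 9 ms period at 48 kHz.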

> 
> 
> Takashi
Pierre-Louis Bossart Aug. 4, 2020, 2:24 p.m. UTC | #9
On 8/3/20 11:33 PM, Lu, Brent wrote:
>>
>> For avoid further misunderstanding: it's fine that CRAS *uses* such a short
>> period.  It's often required for achieving a short latency.
>>
>> However, the question is whether the driver can set *only* this value for
>> making it working.  IOW, if we don't have this constraint, what actually
>> happens?  If the driver gives the period size alignment, wouldn't CRAS
>> choose 240?
> 
> It won't. Without the constraint it becomes 432. Actually CRAS does not set
> period size specifically so the value depends on the constraint rules.

I don't get this. If the platform driver already stated 240 and 960 
samples why would 432 be chosen? Doesn't this mean the constraint is not 
applied?

> [   52.011146] sound pcmC1D0p: hw_param
> [   52.011152] sound pcmC1D0p:   ACCESS 0x1
> [   52.011155] sound pcmC1D0p:   FORMAT 0x4
> [   52.011158] sound pcmC1D0p:   SUBFORMAT 0x1
> [   52.011161] sound pcmC1D0p:   SAMPLE_BITS [16:16]
> [   52.011164] sound pcmC1D0p:   FRAME_BITS [32:32]
> [   52.011167] sound pcmC1D0p:   CHANNELS [2:2]
> [   52.011170] sound pcmC1D0p:   RATE [48000:48000]
> [   52.011173] sound pcmC1D0p:   PERIOD_TIME [9000:9000]
> [   52.011176] sound pcmC1D0p:   PERIOD_SIZE [432:432]
> [   52.011179] sound pcmC1D0p:   PERIOD_BYTES [1728:1728]
> [   52.011182] sound pcmC1D0p:   PERIODS [474:474]
> [   52.011185] sound pcmC1D0p:   BUFFER_TIME [4266000:4266000]
> [   52.011188] sound pcmC1D0p:   BUFFER_SIZE [204768:204768]
> [   52.011191] sound pcmC1D0p:   BUFFER_BYTES [819072:819072]
> [   52.011194] sound pcmC1D0p:   TICK_TIME [0:0]
> 
> Regards,
> Brent
> 
>>
>>
>> Takashi
> 
>
Brent Lu Aug. 6, 2020, 4:41 p.m. UTC | #10
> 
> I don't get this. If the platform driver already stated 240 and 960 samples why
> would 432 be chosen? Doesn't this mean the constraint is not applied?

Hi Pierre,

Sorry for the late reply. I used the following constraints in the V3 patch, so any
period that is a multiple of 1 ms would be accepted.

+	/*
+	 * Make sure the period to be multiple of 1ms to align the
+	 * design of firmware. Apply same rule to buffer size to make
+	 * sure alsa could always find a value for period size
+	 * regardless the buffer size given by user space.
+	 */
+	snd_pcm_hw_constraint_step(substream->runtime, 0,
+			   SNDRV_PCM_HW_PARAM_PERIOD_SIZE, 48);
+	snd_pcm_hw_constraint_step(substream->runtime, 0,
+			   SNDRV_PCM_HW_PARAM_BUFFER_SIZE, 48);

Regards,
Brent
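
To see why a 48-frame step still admits 432, it helps to check a few candidate period sizes against the step. This is a minimal sketch of the arithmetic only, in plain userspace C. It does not reproduce how the ALSA core refines intervals, and the names `STEP_FRAMES`, `satisfies_step`, and `round_down_to_step` are hypothetical.

```c
#include <stdbool.h>

/* Frames in 1 ms at 48 kHz: the step value used in the quoted constraint. */
#define STEP_FRAMES 48u

/* True when `frames` is an exact multiple of `step`. */
static bool satisfies_step(unsigned int frames, unsigned int step)
{
	return step != 0 && frames % step == 0;
}

/* Largest multiple of `step` that does not exceed `frames`. */
static unsigned int round_down_to_step(unsigned int frames, unsigned int step)
{
	if (step == 0)
		return frames;
	return frames - frames % step;
}
```

Under this check, 240 (5 ms) and 432 (9 ms) both satisfy a step of `STEP_FRAMES`, while 256 does not, and `round_down_to_step(256, STEP_FRAMES)` is 240. A step constraint therefore narrows the choice to 1 ms multiples but does not by itself select 240.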

> 
> > [   52.011146] sound pcmC1D0p: hw_param
> > [   52.011152] sound pcmC1D0p:   ACCESS 0x1
> > [   52.011155] sound pcmC1D0p:   FORMAT 0x4
> > [   52.011158] sound pcmC1D0p:   SUBFORMAT 0x1
> > [   52.011161] sound pcmC1D0p:   SAMPLE_BITS [16:16]
> > [   52.011164] sound pcmC1D0p:   FRAME_BITS [32:32]
> > [   52.011167] sound pcmC1D0p:   CHANNELS [2:2]
> > [   52.011170] sound pcmC1D0p:   RATE [48000:48000]
> > [   52.011173] sound pcmC1D0p:   PERIOD_TIME [9000:9000]
> > [   52.011176] sound pcmC1D0p:   PERIOD_SIZE [432:432]
> > [   52.011179] sound pcmC1D0p:   PERIOD_BYTES [1728:1728]
> > [   52.011182] sound pcmC1D0p:   PERIODS [474:474]
> > [   52.011185] sound pcmC1D0p:   BUFFER_TIME [4266000:4266000]
> > [   52.011188] sound pcmC1D0p:   BUFFER_SIZE [204768:204768]
> > [   52.011191] sound pcmC1D0p:   BUFFER_BYTES [819072:819072]
> > [   52.011194] sound pcmC1D0p:   TICK_TIME [0:0]
> >
> > Regards,
> > Brent
> >
> >>
> >>
> >> Takashi
> >
> >
Pierre-Louis Bossart Aug. 10, 2020, 3:03 p.m. UTC | #11
On 8/6/20 11:41 AM, Lu, Brent wrote:
>>
>> I don't get this. If the platform driver already stated 240 and 960 samples why
>> would 432 be chosen? Doesn't this mean the constraint is not applied?
> 
> Hi Pierre,
> 
> Sorry for late reply. I used following constraints in V3 patch so any period which
> aligns 1ms would be accepted.
> 
> +	/*
> +	 * Make sure the period to be multiple of 1ms to align the
> +	 * design of firmware. Apply same rule to buffer size to make
> +	 * sure alsa could always find a value for period size
> +	 * regardless the buffer size given by user space.
> +	 */
> +	snd_pcm_hw_constraint_step(substream->runtime, 0,
> +			   SNDRV_PCM_HW_PARAM_PERIOD_SIZE, 48);
> +	snd_pcm_hw_constraint_step(substream->runtime, 0,
> +			   SNDRV_PCM_HW_PARAM_BUFFER_SIZE, 48);

432 samples is 9ms, I don't have a clue why/how CRAS might ask for this 
value.

It'd be a bit odd to add constraints just for the purpose of letting 
userspace select a sensible value.
Yu-Hsuan Hsu Aug. 10, 2020, 5:38 p.m. UTC | #12
On Mon, Aug 10, 2020 at 11:03 PM, Pierre-Louis Bossart
<pierre-louis.bossart@linux.intel.com> wrote:
>
>
>
> On 8/6/20 11:41 AM, Lu, Brent wrote:
> >>
> >> I don't get this. If the platform driver already stated 240 and 960 samples why
> >> would 432 be chosen? Doesn't this mean the constraint is not applied?
> >
> > Hi Pierre,
> >
> > Sorry for late reply. I used following constraints in V3 patch so any period which
> > aligns 1ms would be accepted.
> >
> > +     /*
> > +      * Make sure the period to be multiple of 1ms to align the
> > +      * design of firmware. Apply same rule to buffer size to make
> > +      * sure alsa could always find a value for period size
> > +      * regardless the buffer size given by user space.
> > +      */
> > +     snd_pcm_hw_constraint_step(substream->runtime, 0,
> > +                        SNDRV_PCM_HW_PARAM_PERIOD_SIZE, 48);
> > +     snd_pcm_hw_constraint_step(substream->runtime, 0,
> > +                        SNDRV_PCM_HW_PARAM_BUFFER_SIZE, 48);
>
> 432 samples is 9ms, I don't have a clue why/how CRAS might ask for this
> value.
>
> It'd be a bit odd to add constraints just for the purpose of letting
> userspace select a sensible value.

Sorry for the late reply. CRAS does not set the period size when using it.
The default period size is 256, which consumes samples too
quickly (about 49627 fps when the rate is 48000 fps) at the beginning
of playback.
Since CRAS writes samples at a fixed frequency, this triggers
underruns immediately.

According to Brent, the DSP uses a period of 240 regardless of the
hw_param. If the period size is 256, the DSP reads 256 samples each
time but consumes only 240 samples, until the DSP's ring buffer is
full. This behavior makes the samples in the kernel's ring buffer
drain quickly. (I am not sure whether this explanation is correct;
Brent needs to confirm it.)

Unfortunately, we cannot change the behavior of the DSP. After some
experiments, we found that the issue can be fixed by setting the period
size to 240. With the same frequency as the DSP, the samples are
consumed at a steady rate. Because anyone can trigger this issue when
using the driver without setting the period size, we think it is a
general issue that should be fixed in the kernel.

Thanks,
Yu-Hsuan
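
The effect of the rate mismatch described above can be estimated with simple arithmetic: if frames drain at 49627 fps while the writer supplies 48000 fps, the queue shrinks by the difference every second. The sketch below is plain userspace C for illustration only. The function name `ms_until_underrun` and the queued-frame count used in the example are assumptions, not values from this thread.

```c
/*
 * Milliseconds until a queue of `queued_frames` runs dry when frames are
 * drained at `drain_fps` and refilled at `write_fps`.
 * Returns -1 when the writer keeps up and the queue never empties.
 */
static long ms_until_underrun(long queued_frames, long drain_fps, long write_fps)
{
	long surplus = drain_fps - write_fps;	/* net frames lost per second */

	if (surplus <= 0)
		return -1;
	return queued_frames * 1000 / surplus;
}
```

For example, assuming a hypothetical 480 frames (10 ms) queued at start, `ms_until_underrun(480, 49627, 48000)` is 295, so the queue would empty in under a third of a second. With matched rates the function returns -1.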
Brent Lu Aug. 11, 2020, 2:16 a.m. UTC | #13
> 
> Sorry for the late reply. CRAS does not set the period size when using it.
> The default period size is 256, which consumes the samples quickly(about 49627
> fps when the rate is 48000 fps) at the beginning of the playback.
> Since CRAS write samples with the fixed frequency, it triggers underruns
> immediately.
> 
> According to Brent, the DSP is using 240 period regardless the hw_param. If the
> period size is 256, DSP will read 256 samples each time but only consume 240
> samples until the ring buffer of DSP is full. This behavior makes the samples in
> the ring buffer of kernel consumed quickly. (Not sure whether the explanation is
> correct. Need Brent to confirm it.)
> 
> Unfortunately, we can not change the behavior of DSP. After some experiments,
> we found that the issue can be fixed if we set the period size to 240. With the
> same frequency as the DSP, the samples are consumed stably. Because everyone
> can trigger this issue when using the driver without setting the period size, we
> think it is a general issue that should be fixed in the kernel.

I checked the code and just realized that CRAS does nothing but request the
maximum buffer size. As far as I know, the application needs to decide the buffer
time and period time so that ALSA can generate a hw_param structure with a proper
period size, rather than relying on a fixed constraint in the machine driver,
because the driver has no idea what latency you want.

You can use snd_pcm_hw_params_set_buffer_time_near() and
snd_pcm_hw_params_set_period_time_near() to get a proper configuration of
buffer and period parameters according to the latency requirement. In the CRAS
code, there is a UCM variable to support this: DmaPeriodMicrosecs. I tested it on
Celes and it looks quite promising. It seems to me that adding a constraint in the
machine driver is not necessary.

SectionDevice."Speaker".0 {
	Value {
		PlaybackPCM "hw:chtrt5650,0"
		DmaPeriodMicrosecs "5000"
...

[   52.434761] sound pcmC1D0p: hw_param
[   52.434767] sound pcmC1D0p:   ACCESS 0x1
[   52.434770] sound pcmC1D0p:   FORMAT 0x4
[   52.434772] sound pcmC1D0p:   SUBFORMAT 0x1
[   52.434776] sound pcmC1D0p:   SAMPLE_BITS [16:16]
[   52.434779] sound pcmC1D0p:   FRAME_BITS [32:32]
[   52.434782] sound pcmC1D0p:   CHANNELS [2:2]
[   52.434785] sound pcmC1D0p:   RATE [48000:48000]
[   52.434788] sound pcmC1D0p:   PERIOD_TIME [5000:5000]
[   52.434791] sound pcmC1D0p:   PERIOD_SIZE [240:240]
[   52.434794] sound pcmC1D0p:   PERIOD_BYTES [960:960]
[   52.434797] sound pcmC1D0p:   PERIODS [852:852]
[   52.434799] sound pcmC1D0p:   BUFFER_TIME [4260000:4260000]
[   52.434802] sound pcmC1D0p:   BUFFER_SIZE [204480:204480]
[   52.434805] sound pcmC1D0p:   BUFFER_BYTES [817920:817920]
[   52.434808] sound pcmC1D0p:   TICK_TIME [0:0]

Regards,
Brent
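
The relationship between the `DmaPeriodMicrosecs` value and the resulting PERIOD_SIZE in the dump above is a plain time-to-frames conversion. The following minimal sketch in userspace C shows that arithmetic. The helper name `us_to_frames` is hypothetical, and the code does not call alsa-lib or reproduce how CRAS applies the UCM value.

```c
#include <stdint.h>

/* Number of whole frames played in `time_us` microseconds at `rate_hz`. */
static uint64_t us_to_frames(uint64_t time_us, uint64_t rate_hz)
{
	return time_us * rate_hz / 1000000u;
}
```

At 48000 Hz, `us_to_frames(5000, 48000)` is 240, matching PERIOD_SIZE, and `us_to_frames(4260000, 48000)` is 204480, matching BUFFER_SIZE, which is 852 periods of 240 frames.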

> 
> Thanks,
> Yu-Hsuan
Yu-Hsuan Hsu Aug. 11, 2020, 2:29 a.m. UTC | #14
On Tue, Aug 11, 2020 at 10:17 AM, Lu, Brent <brent.lu@intel.com> wrote:
>
> >
> > Sorry for the late reply. CRAS does not set the period size when using it.
> > The default period size is 256, which consumes the samples quickly(about 49627
> > fps when the rate is 48000 fps) at the beginning of the playback.
> > Since CRAS write samples with the fixed frequency, it triggers underruns
> > immediately.
> >
> > According to Brent, the DSP is using 240 period regardless the hw_param. If the
> > period size is 256, DSP will read 256 samples each time but only consume 240
> > samples until the ring buffer of DSP is full. This behavior makes the samples in
> > the ring buffer of kernel consumed quickly. (Not sure whether the explanation is
> > correct. Need Brent to confirm it.)
> >
> > Unfortunately, we can not change the behavior of DSP. After some experiments,
> > we found that the issue can be fixed if we set the period size to 240. With the
> > same frequency as the DSP, the samples are consumed stably. Because everyone
> > can trigger this issue when using the driver without setting the period size, we
> > think it is a general issue that should be fixed in the kernel.
>
> I check the code and just realized CRAS does nothing but request maximum buffer
> size. As I know the application needs to decide the buffer time and period time so
> ALSA could generate a hw_param structure with proper period size instead of using
> fixed constraint in machine driver because driver has no idea about the latency you
> want.
>
> You can use snd_pcm_hw_params_set_buffer_time_near() and
> snd_pcm_hw_params_set_period_time_near() to get a proper configuration of
> buffer and period parameters according to the latency requirement. In the CRAS
> code, there is a UCM variable to support this: DmaPeriodMicrosecs. I tested it on
> Celes and it looks quite promising. It seems to me that adding constraint in machine
> driver is not necessary.
>
> SectionDevice."Speaker".0 {
>         Value {
>                 PlaybackPCM "hw:chtrt5650,0"
>                 DmaPeriodMicrosecs "5000"
> ...
>
> [   52.434761] sound pcmC1D0p: hw_param
> [   52.434767] sound pcmC1D0p:   ACCESS 0x1
> [   52.434770] sound pcmC1D0p:   FORMAT 0x4
> [   52.434772] sound pcmC1D0p:   SUBFORMAT 0x1
> [   52.434776] sound pcmC1D0p:   SAMPLE_BITS [16:16]
> [   52.434779] sound pcmC1D0p:   FRAME_BITS [32:32]
> [   52.434782] sound pcmC1D0p:   CHANNELS [2:2]
> [   52.434785] sound pcmC1D0p:   RATE [48000:48000]
> [   52.434788] sound pcmC1D0p:   PERIOD_TIME [5000:5000]
> [   52.434791] sound pcmC1D0p:   PERIOD_SIZE [240:240]
> [   52.434794] sound pcmC1D0p:   PERIOD_BYTES [960:960]
> [   52.434797] sound pcmC1D0p:   PERIODS [852:852]
> [   52.434799] sound pcmC1D0p:   BUFFER_TIME [4260000:4260000]
> [   52.434802] sound pcmC1D0p:   BUFFER_SIZE [204480:204480]
> [   52.434805] sound pcmC1D0p:   BUFFER_BYTES [817920:817920]
> [   52.434808] sound pcmC1D0p:   TICK_TIME [0:0]
>
> Regards,
> Brent
Hi Brent,

Yes, I know we can do it that way to fix the issue as well. As I mentioned
before, we wanted to fix it in the kernel because it is a real issue,
isn't it? Basically, a driver should work with any param it supports.
But in this case, anyone can trigger an underrun if he or she does not
set the period size to 240. If you still think it's not necessary, I can
modify UCM to make CRAS set the appropriate period size.

Thanks,
Yu-Hsuan

>
> >
> > Thanks,
> > Yu-Hsuan
Takashi Iwai Aug. 11, 2020, 7:43 a.m. UTC | #15
On Tue, 11 Aug 2020 04:29:24 +0200,
Yu-Hsuan Hsu wrote:
> 
> On Tue, Aug 11, 2020 at 10:17 AM, Lu, Brent <brent.lu@intel.com> wrote:
> >
> > >
> > > Sorry for the late reply. CRAS does not set the period size when using it.
> > > The default period size is 256, which consumes the samples quickly(about 49627
> > > fps when the rate is 48000 fps) at the beginning of the playback.
> > > Since CRAS write samples with the fixed frequency, it triggers underruns
> > > immediately.
> > >
> > > According to Brent, the DSP is using 240 period regardless the hw_param. If the
> > > period size is 256, DSP will read 256 samples each time but only consume 240
> > > samples until the ring buffer of DSP is full. This behavior makes the samples in
> > > the ring buffer of kernel consumed quickly. (Not sure whether the explanation is
> > > correct. Need Brent to confirm it.)
> > >
> > > Unfortunately, we can not change the behavior of DSP. After some experiments,
> > > we found that the issue can be fixed if we set the period size to 240. With the
> > > same frequency as the DSP, the samples are consumed stably. Because everyone
> > > can trigger this issue when using the driver without setting the period size, we
> > > think it is a general issue that should be fixed in the kernel.
> >
> > I check the code and just realized CRAS does nothing but request maximum buffer
> > size. As I know the application needs to decide the buffer time and period time so
> > ALSA could generate a hw_param structure with proper period size instead of using
> > fixed constraint in machine driver because driver has no idea about the latency you
> > want.
> >
> > You can use snd_pcm_hw_params_set_buffer_time_near() and
> > snd_pcm_hw_params_set_period_time_near() to get a proper configuration of
> > buffer and period parameters according to the latency requirement. In the CRAS
> > code, there is a UCM variable to support this: DmaPeriodMicrosecs. I tested it on
> > Celes and it looks quite promising. It seems to me that adding constraint in machine
> > driver is not necessary.
> >
> > SectionDevice."Speaker".0 {
> >         Value {
> >                 PlaybackPCM "hw:chtrt5650,0"
> >                 DmaPeriodMicrosecs "5000"
> > ...
> >
> > [   52.434761] sound pcmC1D0p: hw_param
> > [   52.434767] sound pcmC1D0p:   ACCESS 0x1
> > [   52.434770] sound pcmC1D0p:   FORMAT 0x4
> > [   52.434772] sound pcmC1D0p:   SUBFORMAT 0x1
> > [   52.434776] sound pcmC1D0p:   SAMPLE_BITS [16:16]
> > [   52.434779] sound pcmC1D0p:   FRAME_BITS [32:32]
> > [   52.434782] sound pcmC1D0p:   CHANNELS [2:2]
> > [   52.434785] sound pcmC1D0p:   RATE [48000:48000]
> > [   52.434788] sound pcmC1D0p:   PERIOD_TIME [5000:5000]
> > [   52.434791] sound pcmC1D0p:   PERIOD_SIZE [240:240]
> > [   52.434794] sound pcmC1D0p:   PERIOD_BYTES [960:960]
> > [   52.434797] sound pcmC1D0p:   PERIODS [852:852]
> > [   52.434799] sound pcmC1D0p:   BUFFER_TIME [4260000:4260000]
> > [   52.434802] sound pcmC1D0p:   BUFFER_SIZE [204480:204480]
> > [   52.434805] sound pcmC1D0p:   BUFFER_BYTES [817920:817920]
> > [   52.434808] sound pcmC1D0p:   TICK_TIME [0:0]
> >
> > Regards,
> > Brent
> Hi Brent,
> 
> Yes, I know we can do it to fix the issue as well. As I mentioned
> before, we wanted to fix it in kernel because it is a real issue,
> isn't it? Basically, a driver should work with any param it supports.
> But in this case, everyone can trigger underrun if he or she does not
> set the period size to 240. If you still think it's not necessary, I can
> modify UCM to make CRAS set the appropriate period size.

How exactly does it *not* work if you set a period size other than
240?

A hw_constraint to a fixed period size must really be an exception.
If you look at other drivers, you won't find any other doing that.
It already indicates that something is wrong.

Usually a fixed period size comes from a hardware limitation and is
defined in snd_pcm_hardware.  Or, sometimes it's an alignment issue.
If you need more than that, you should question what's really not
working.


Takashi
Yu-Hsuan Hsu Aug. 11, 2020, 8:25 a.m. UTC | #16
On Tue, Aug 11, 2020 at 3:43 PM, Takashi Iwai <tiwai@suse.de> wrote:
>
> On Tue, 11 Aug 2020 04:29:24 +0200,
> Yu-Hsuan Hsu wrote:
> >
> > On Tue, Aug 11, 2020 at 10:17 AM, Lu, Brent <brent.lu@intel.com> wrote:
> > >
> > > >
> > > > Sorry for the late reply. CRAS does not set the period size when using it.
> > > > The default period size is 256, which consumes the samples quickly(about 49627
> > > > fps when the rate is 48000 fps) at the beginning of the playback.
> > > > Since CRAS write samples with the fixed frequency, it triggers underruns
> > > > immediately.
> > > >
> > > > According to Brent, the DSP is using 240 period regardless the hw_param. If the
> > > > period size is 256, DSP will read 256 samples each time but only consume 240
> > > > samples until the ring buffer of DSP is full. This behavior makes the samples in
> > > > the ring buffer of kernel consumed quickly. (Not sure whether the explanation is
> > > > correct. Need Brent to confirm it.)
> > > >
> > > > Unfortunately, we can not change the behavior of DSP. After some experiments,
> > > > we found that the issue can be fixed if we set the period size to 240. With the
> > > > same frequency as the DSP, the samples are consumed stably. Because everyone
> > > > can trigger this issue when using the driver without setting the period size, we
> > > > think it is a general issue that should be fixed in the kernel.
> > >
> > > I check the code and just realized CRAS does nothing but request maximum buffer
> > > size. As I know the application needs to decide the buffer time and period time so
> > > ALSA could generate a hw_param structure with proper period size instead of using
> > > fixed constraint in machine driver because driver has no idea about the latency you
> > > want.
> > >
> > > You can use snd_pcm_hw_params_set_buffer_time_near() and
> > > snd_pcm_hw_params_set_period_time_near() to get a proper configuration of
> > > buffer and period parameters according to the latency requirement. In the CRAS
> > > code, there is a UCM variable to support this: DmaPeriodMicrosecs. I tested it on
> > > Celes and it looks quite promising. It seems to me that adding constraint in machine
> > > driver is not necessary.
> > >
> > > SectionDevice."Speaker".0 {
> > >         Value {
> > >                 PlaybackPCM "hw:chtrt5650,0"
> > >                 DmaPeriodMicrosecs "5000"
> > > ...
> > >
> > > [   52.434761] sound pcmC1D0p: hw_param
> > > [   52.434767] sound pcmC1D0p:   ACCESS 0x1
> > > [   52.434770] sound pcmC1D0p:   FORMAT 0x4
> > > [   52.434772] sound pcmC1D0p:   SUBFORMAT 0x1
> > > [   52.434776] sound pcmC1D0p:   SAMPLE_BITS [16:16]
> > > [   52.434779] sound pcmC1D0p:   FRAME_BITS [32:32]
> > > [   52.434782] sound pcmC1D0p:   CHANNELS [2:2]
> > > [   52.434785] sound pcmC1D0p:   RATE [48000:48000]
> > > [   52.434788] sound pcmC1D0p:   PERIOD_TIME [5000:5000]
> > > [   52.434791] sound pcmC1D0p:   PERIOD_SIZE [240:240]
> > > [   52.434794] sound pcmC1D0p:   PERIOD_BYTES [960:960]
> > > [   52.434797] sound pcmC1D0p:   PERIODS [852:852]
> > > [   52.434799] sound pcmC1D0p:   BUFFER_TIME [4260000:4260000]
> > > [   52.434802] sound pcmC1D0p:   BUFFER_SIZE [204480:204480]
> > > [   52.434805] sound pcmC1D0p:   BUFFER_BYTES [817920:817920]
> > > [   52.434808] sound pcmC1D0p:   TICK_TIME [0:0]
> > >
> > > Regards,
> > > Brent
> > Hi Brent,
> >
> > Yes, I know we can do it to fix the issue as well. As I mentioned
> > before, we wanted to fix it in kernel because it is a real issue,
> > isn't it? Basically, a driver should work with any param it supports.
> > But in this case, everyone can trigger underrun if he or she does not
> > set the period size to 240. If you still think it's not necessary, I can
> > modify UCM to make CRAS set the appropriate period size.
>
> How does it *not* work if you set other than period size 240, more
> exactly?
>
> The hw_constraint to a fixed period size must be really an exception.
> If you look at other drivers, you won't find any other doing such.
> It already indicates that something is wrong.
>
> Usually the fixed period size comes from the hardware limitation and
> defined in snd_pcm_hardware.  Or, sometimes it's an alignment issue.
> If you need more than that, you should doubt what's really not
> working.
>
>
> Takashi
Thanks Takashi,

As I mentioned before, if the period size is set to 256, the measured
sample consumption rate is around 49627 fps. That causes underruns
because the rate we set is 48000 fps. The same behavior also happens
with every other period size except 240.

Thanks,
Yu-Hsuan
Takashi Iwai Aug. 11, 2020, 8:39 a.m. UTC | #17
On Tue, 11 Aug 2020 10:25:22 +0200,
Yu-Hsuan Hsu wrote:
> 
> On Tue, Aug 11, 2020 at 3:43 PM, Takashi Iwai <tiwai@suse.de> wrote:
> >
> > On Tue, 11 Aug 2020 04:29:24 +0200,
> > Yu-Hsuan Hsu wrote:
> > >
> > > On Tue, Aug 11, 2020 at 10:17 AM, Lu, Brent <brent.lu@intel.com> wrote:
> > > >
> > > > >
> > > > > Sorry for the late reply. CRAS does not set the period size when using it.
> > > > > The default period size is 256, which consumes the samples quickly(about 49627
> > > > > fps when the rate is 48000 fps) at the beginning of the playback.
> > > > > Since CRAS write samples with the fixed frequency, it triggers underruns
> > > > > immediately.
> > > > >
> > > > > According to Brent, the DSP is using 240 period regardless the hw_param. If the
> > > > > period size is 256, DSP will read 256 samples each time but only consume 240
> > > > > samples until the ring buffer of DSP is full. This behavior makes the samples in
> > > > > the ring buffer of kernel consumed quickly. (Not sure whether the explanation is
> > > > > correct. Need Brent to confirm it.)
> > > > >
> > > > > Unfortunately, we can not change the behavior of DSP. After some experiments,
> > > > > we found that the issue can be fixed if we set the period size to 240. With the
> > > > > same frequency as the DSP, the samples are consumed stably. Because everyone
> > > > > can trigger this issue when using the driver without setting the period size, we
> > > > > think it is a general issue that should be fixed in the kernel.
> > > >
> > > > I check the code and just realized CRAS does nothing but request maximum buffer
> > > > size. As I know the application needs to decide the buffer time and period time so
> > > > ALSA could generate a hw_param structure with proper period size instead of using
> > > > fixed constraint in machine driver because driver has no idea about the latency you
> > > > want.
> > > >
> > > > You can use snd_pcm_hw_params_set_buffer_time_near() and
> > > > snd_pcm_hw_params_set_period_time_near() to get a proper configuration of
> > > > buffer and period parameters according to the latency requirement. In the CRAS
> > > > code, there is a UCM variable to support this: DmaPeriodMicrosecs. I tested it on
> > > > Celes and it looks quite promising. It seems to me that adding constraint in machine
> > > > driver is not necessary.
> > > >
> > > > SectionDevice."Speaker".0 {
> > > >         Value {
> > > >                 PlaybackPCM "hw:chtrt5650,0"
> > > >                 DmaPeriodMicrosecs "5000"
> > > > ...
> > > >
> > > > [   52.434761] sound pcmC1D0p: hw_param
> > > > [   52.434767] sound pcmC1D0p:   ACCESS 0x1
> > > > [   52.434770] sound pcmC1D0p:   FORMAT 0x4
> > > > [   52.434772] sound pcmC1D0p:   SUBFORMAT 0x1
> > > > [   52.434776] sound pcmC1D0p:   SAMPLE_BITS [16:16]
> > > > [   52.434779] sound pcmC1D0p:   FRAME_BITS [32:32]
> > > > [   52.434782] sound pcmC1D0p:   CHANNELS [2:2]
> > > > [   52.434785] sound pcmC1D0p:   RATE [48000:48000]
> > > > [   52.434788] sound pcmC1D0p:   PERIOD_TIME [5000:5000]
> > > > [   52.434791] sound pcmC1D0p:   PERIOD_SIZE [240:240]
> > > > [   52.434794] sound pcmC1D0p:   PERIOD_BYTES [960:960]
> > > > [   52.434797] sound pcmC1D0p:   PERIODS [852:852]
> > > > [   52.434799] sound pcmC1D0p:   BUFFER_TIME [4260000:4260000]
> > > > [   52.434802] sound pcmC1D0p:   BUFFER_SIZE [204480:204480]
> > > > [   52.434805] sound pcmC1D0p:   BUFFER_BYTES [817920:817920]
> > > > [   52.434808] sound pcmC1D0p:   TICK_TIME [0:0]
> > > >
> > > > Regards,
> > > > Brent
> > > Hi Brent,
> > >
> > > Yes, I know we can do it to fix the issue as well. As I mentioned
> > > before, we wanted to fix it in kernel because it is a real issue,
> > > isn't it? Basically, a driver should work with any param it supports.
> > > But in this case, everyone can trigger underrun if he or she does not
> > > set the period size to 240. If you still think it's not necessary, I can
> > > modify UCM to make CRAS set the appropriate period size.
> >
> > How does it *not* work if you set other than period size 240, more
> > exactly?
> >
> > The hw_constraint to a fixed period size must be really an exception.
> > If you look at other drivers, you won't find any other doing such.
> > It already indicates that something is wrong.
> >
> > Usually the fixed period size comes from the hardware limitation and
> > defined in snd_pcm_hardware.  Or, sometimes it's an alignment issue.
> > If you need more than that, you should doubt what's really not
> > working.
> >
> >
> > Takashi
> Thank Takashi,
> 
> As I mentioned before, if the period size is set to 256, the measured
> rate of sample-consuming will be around 49627 fps. It causes underrun
> because the rate we set is 48000 fps.

But this explanation sounds more like an alignment problem.
However...

> This behavior also happen on the
> other period rate except for 240.

... Why only 240?  That's the next logical question.
If you can explain it, that may be a solid reason to
introduce such a hw constraint.


Takashi
Yu-Hsuan Hsu Aug. 11, 2020, 9:35 a.m. UTC | #18
Takashi Iwai <tiwai@suse.de> wrote on Tue, Aug 11, 2020 at 4:39 PM:
>
> On Tue, 11 Aug 2020 10:25:22 +0200,
> Yu-Hsuan Hsu wrote:
> >
> > Takashi Iwai <tiwai@suse.de> wrote on Tue, Aug 11, 2020 at 3:43 PM:
> > >
> > > On Tue, 11 Aug 2020 04:29:24 +0200,
> > > Yu-Hsuan Hsu wrote:
> > > >
> > > > > Lu, Brent <brent.lu@intel.com> wrote on Tue, Aug 11, 2020 at 10:17 AM:
> > > > >
> > > > > >
> > > > > > Sorry for the late reply. CRAS does not set the period size when using it.
> > > > > > The default period size is 256, which consumes the samples quickly(about 49627
> > > > > > fps when the rate is 48000 fps) at the beginning of the playback.
> > > > > > Since CRAS write samples with the fixed frequency, it triggers underruns
> > > > > > > immediately.
> > > > > >
> > > > > > According to Brent, the DSP is using 240 period regardless the hw_param. If the
> > > > > > period size is 256, DSP will read 256 samples each time but only consume 240
> > > > > > samples until the ring buffer of DSP is full. This behavior makes the samples in
> > > > > > the ring buffer of kernel consumed quickly. (Not sure whether the explanation is
> > > > > > correct. Need Brent to confirm it.)
> > > > > >
> > > > > > Unfortunately, we can not change the behavior of DSP. After some experiments,
> > > > > > we found that the issue can be fixed if we set the period size to 240. With the
> > > > > > same frequency as the DSP, the samples are consumed stably. Because everyone
> > > > > > can trigger this issue when using the driver without setting the period size, we
> > > > > > think it is a general issue that should be fixed in the kernel.
> > > > >
> > > > > I check the code and just realized CRAS does nothing but request maximum buffer
> > > > > size. As I know the application needs to decide the buffer time and period time so
> > > > > ALSA could generate a hw_param structure with proper period size instead of using
> > > > > fixed constraint in machine driver because driver has no idea about the latency you
> > > > > want.
> > > > >
> > > > > You can use snd_pcm_hw_params_set_buffer_time_near() and
> > > > > snd_pcm_hw_params_set_period_time_near() to get a proper configuration of
> > > > > buffer and period parameters according to the latency requirement. In the CRAS
> > > > > code, there is a UCM variable to support this: DmaPeriodMicrosecs. I tested it on
> > > > > Celes and it looks quite promising. It seems to me that adding constraint in machine
> > > > > driver is not necessary.
> > > > >
> > > > > SectionDevice."Speaker".0 {
> > > > >         Value {
> > > > >                 PlaybackPCM "hw:chtrt5650,0"
> > > > >                 DmaPeriodMicrosecs "5000"
> > > > > ...
> > > > >
> > > > > [   52.434761] sound pcmC1D0p: hw_param
> > > > > [   52.434767] sound pcmC1D0p:   ACCESS 0x1
> > > > > [   52.434770] sound pcmC1D0p:   FORMAT 0x4
> > > > > [   52.434772] sound pcmC1D0p:   SUBFORMAT 0x1
> > > > > [   52.434776] sound pcmC1D0p:   SAMPLE_BITS [16:16]
> > > > > [   52.434779] sound pcmC1D0p:   FRAME_BITS [32:32]
> > > > > [   52.434782] sound pcmC1D0p:   CHANNELS [2:2]
> > > > > [   52.434785] sound pcmC1D0p:   RATE [48000:48000]
> > > > > [   52.434788] sound pcmC1D0p:   PERIOD_TIME [5000:5000]
> > > > > [   52.434791] sound pcmC1D0p:   PERIOD_SIZE [240:240]
> > > > > [   52.434794] sound pcmC1D0p:   PERIOD_BYTES [960:960]
> > > > > [   52.434797] sound pcmC1D0p:   PERIODS [852:852]
> > > > > [   52.434799] sound pcmC1D0p:   BUFFER_TIME [4260000:4260000]
> > > > > [   52.434802] sound pcmC1D0p:   BUFFER_SIZE [204480:204480]
> > > > > [   52.434805] sound pcmC1D0p:   BUFFER_BYTES [817920:817920]
> > > > > [   52.434808] sound pcmC1D0p:   TICK_TIME [0:0]
> > > > >
> > > > > Regards,
> > > > > Brent
> > > > Hi Brent,
> > > >
> > > > Yes, I know we can do it to fix the issue as well. As I mentioned
> > > > before, we wanted to fix it in kernel because it is a real issue,
> > > > isn't it? Basically, a driver should work with any param it supports.
> > > > But in this case, everyone can trigger underrun if he or she does not
> > > > set the period size to 240. If you still think it's not necessary, I can
> > > > modify UCM to make CRAS set the appropriate period size.
> > >
> > > How does it *not* work if you set other than period size 240, more
> > > exactly?
> > >
> > > The hw_constraint to a fixed period size must be really an exception.
> > > If you look at other drivers, you won't find any other doing such.
> > > It already indicates that something is wrong.
> > >
> > > Usually the fixed period size comes from the hardware limitation and
> > > defined in snd_pcm_hardware.  Or, sometimes it's an alignment issue.
> > > If you need more than that, you should doubt what's really not
> > > working.
> > >
> > >
> > > Takashi
> > Thank Takashi,
> >
> > As I mentioned before, if the period size is set to 256, the measured
> > rate of sample-consuming will be around 49627 fps. It causes underrun
> > because the rate we set is 48000 fps.
>
> But this explanation rather sounds like the alignment problem.
> However...
>
> > This behavior also happen on the
> > other period rate except for 240.
>
> ... Why only 240?  That's the next logical question.
> If you have a clarification for it, it may be the rigid reason to
> introduce such a hw constraint.
>
>
> Takashi

According to Brent, the DSP uses a period of 240 regardless of the
hw_params. If the period size is 256, the DSP reads 256 samples each
time but only consumes 240 of them until its own ring buffer is
full. This behavior drains the kernel ring buffer faster than the
nominal rate.

I am not sure whether this explanation is correct. Hi Brent, can you confirm it?

Thanks,
Yu-Hsuan
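
The explanation above can be written down as a toy model. This is only a sketch of the hypothesis, which is still unconfirmed at this point in the thread: it assumes the DSP wakes up once per 240 frames of playback time, fetches one full ALSA period from the kernel ring buffer, and plays 240 frames. The function names and the DSP buffer capacity are hypothetical and do not come from the firmware.

```c
#include <stdint.h>

/* Assumption from the thread: the DSP plays this many frames per wake-up. */
#define DSP_FRAMES_PER_TICK 240u

/*
 * Rate (frames per second) at which the kernel ring buffer appears to drain
 * while the DSP-internal buffer is still filling. One full ALSA period is
 * fetched per DSP tick, and a tick lasts DSP_FRAMES_PER_TICK / rate seconds.
 */
uint32_t apparent_drain_rate(uint32_t period_frames, uint32_t rate)
{
	return (uint32_t)((uint64_t)period_frames * rate / DSP_FRAMES_PER_TICK);
}

/*
 * Number of DSP ticks until an internal buffer of dsp_capacity frames
 * (a hypothetical value) is full. Returns 0 if the backlog never grows.
 */
uint32_t ticks_until_full(uint32_t period_frames, uint32_t dsp_capacity)
{
	uint32_t surplus;

	if (period_frames <= DSP_FRAMES_PER_TICK)
		return 0;
	surplus = period_frames - DSP_FRAMES_PER_TICK;
	return (dsp_capacity + surplus - 1) / surplus; /* round up */
}
```

In this model a 256-frame period drains the kernel buffer at 51200 fps while the DSP buffer fills, and a 240-frame period drains it at exactly 48000 fps. The model does not reproduce the measured 49627 fps; it only shows the direction of the effect.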
Mark Brown Aug. 11, 2020, 2:53 p.m. UTC | #19
On Tue, Aug 11, 2020 at 05:35:45PM +0800, Yu-Hsuan Hsu wrote:
> Takashi Iwai <tiwai@suse.de> wrote on Tue, Aug 11, 2020 at 4:39 PM:

> > ... Why only 240?  That's the next logical question.
> > If you have a clarification for it, it may be the rigid reason to
> > introduce such a hw constraint.

> According to Brent, the DSP is using 240 period regardless the
> hw_param. If the period size is 256, DSP will read 256 samples each
> time but only consume 240 samples until the ring buffer of DSP is
> full. This behavior makes the samples in the ring buffer of kernel
> consumed quickly.

> Not sure whether the explanation is correct. Hi Brent, can you confirm it?

This seems to be going round and round in circles.  Userspace lets the
kernel pick the period size.  If the period size isn't 240 (or a multiple
of it?), the DSP apparently doesn't pay proper attention to it, due to
internal hard coding in the DSP firmware which we can't change, so the
constraint logic needs to know about this DSP limitation.  It seems like
none of this is going to change without something new going into the
mix?  We at least need a new question to ask about the DSP firmware, I
think.
Pierre-Louis Bossart Aug. 11, 2020, 4:54 p.m. UTC | #20
>>> ... Why only 240?  That's the next logical question.
>>> If you have a clarification for it, it may be the rigid reason to
>>> introduce such a hw constraint.
> 
>> According to Brent, the DSP is using 240 period regardless the
>> hw_param. If the period size is 256, DSP will read 256 samples each
>> time but only consume 240 samples until the ring buffer of DSP is
>> full. This behavior makes the samples in the ring buffer of kernel
>> consumed quickly.
> 
>> Not sure whether the explanation is correct. Hi Brent, can you confirm it?
> 
> This seems to be going round and round in circles.  Userspace lets the
> kernel pick the period size, if the period size isn't 240 (or a multiple
> of it?) the DSP doesn't properly pay attention to that apparently due to
> internal hard coding in the DSP firmware which we can't change so the
> constraint logic needs to know about this DSP limitation - it seems like
> none of this is going to change without something new going into the
> mix?  We at least need a new question to ask about the DSP firmware I
> think.

I just tested aplay -Dhw: on a Cyan Chromebook with the Ubuntu kernel
5.4, and I see no issues with the 240 sample period. Same with 432, 960,
9600, etc.

I also tried, just for fun, what happens with 256 samples, and I don't see
any underflows thrown either, so I am wondering what exactly the problem
is? Something's not adding up. I would definitely favor periods that are a
multiple of 1ms, since it's the only case that was productized, but there's
got to be something else, a side effect of how CRAS programs the hw_params.

root@chrx:~# aplay -Dhw:0,0 --period-size=240 --buffer-size=480 -v 1.wav
Playing WAVE '1.wav' : Signed 16 bit Little Endian, Rate 48000 Hz, Stereo
Hardware PCM card 0 'chtmax98090' device 0 subdevice 0
Its setup is:
   stream       : PLAYBACK
   access       : RW_INTERLEAVED
   format       : S16_LE
   subformat    : STD
   channels     : 2
   rate         : 48000
   exact rate   : 48000 (48000/1)
   msbits       : 16
   buffer_size  : 480
   period_size  : 240
   period_time  : 5000
   tstamp_mode  : NONE
   tstamp_type  : MONOTONIC
   period_step  : 1
   avail_min    : 240
   period_event : 0
   start_threshold  : 480
   stop_threshold   : 480
   silence_threshold: 0
   silence_size : 0
   boundary     : 8646911284551352320
   appl_ptr     : 0
   hw_ptr       : 0

root@chrx:~# aplay -Dhw:0,0 --period-size=256 --buffer-size=512 -v 1.wav
Playing WAVE '1.wav' : Signed 16 bit Little Endian, Rate 48000 Hz, Stereo
Hardware PCM card 0 'chtmax98090' device 0 subdevice 0
Its setup is:
   stream       : PLAYBACK
   access       : RW_INTERLEAVED
   format       : S16_LE
   subformat    : STD
   channels     : 2
   rate         : 48000
   exact rate   : 48000 (48000/1)
   msbits       : 16
   buffer_size  : 512
   period_size  : 256
   period_time  : 5333
   tstamp_mode  : NONE
   tstamp_type  : MONOTONIC
   period_step  : 1
   avail_min    : 256
   period_event : 0
   start_threshold  : 512
   stop_threshold   : 512
   silence_threshold: 0
   silence_size : 0
   boundary     : 4611686018427387904
   appl_ptr     : 0
   hw_ptr       : 0
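
The period_time values in the two dumps above, and the "multiple of 1ms" preference, follow from simple arithmetic. A minimal sketch, with hypothetical helper names that are not part of ALSA:

```c
#include <stdbool.h>
#include <stdint.h>

/* Period duration in microseconds; integer division truncates the result. */
uint32_t period_time_us(uint32_t period_frames, uint32_t rate)
{
	return (uint32_t)((uint64_t)period_frames * 1000000u / rate);
}

/* True if the period covers a whole number of milliseconds at this rate. */
bool period_is_whole_ms(uint32_t period_frames, uint32_t rate)
{
	return ((uint64_t)period_frames * 1000u) % rate == 0;
}
```

At 48000 Hz this gives 5000 us for 240 frames and 5333 us for 256 frames, the same numbers aplay reports. The period sizes 240, 432, 960 and 9600 are all whole multiples of 1 ms (48 frames), while 256 is not.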
Mark Brown Aug. 11, 2020, 5:22 p.m. UTC | #21
On Tue, Aug 11, 2020 at 11:54:38AM -0500, Pierre-Louis Bossart wrote:

> > constraint logic needs to know about this DSP limitation - it seems like
> > none of this is going to change without something new going into the
> > mix?  We at least need a new question to ask about the DSP firmware I
> > think.

> I just tested aplay -Dhw: on a Cyan Chromebook with the Ubuntu kernel 5.4,
> and I see no issues with the 240 sample period. Same with 432, 960, 9600,
> etc.

> I also tried just for fun what happens with 256 samples, and I don't see any
> underflows thrown either, so I am wondering what exactly the problem is?
> Something's not adding up. I would definitively favor multiple of 1ms
> periods, since it's the only case that was productized, but there's got to
> me something a side effect of how CRAS programs the hw_params.

Is it something that goes wrong with longer playbacks possibly (eg,
someone watching a feature film or something)?
Yu-Hsuan Hsu Aug. 12, 2020, 3:09 a.m. UTC | #22
Mark Brown <broonie@kernel.org> wrote on Wed, Aug 12, 2020 at 1:22 AM:
>
> On Tue, Aug 11, 2020 at 11:54:38AM -0500, Pierre-Louis Bossart wrote:
>
> > > constraint logic needs to know about this DSP limitation - it seems like
> > > none of this is going to change without something new going into the
> > > mix?  We at least need a new question to ask about the DSP firmware I
> > > think.
>
> > I just tested aplay -Dhw: on a Cyan Chromebook with the Ubuntu kernel 5.4,
> > and I see no issues with the 240 sample period. Same with 432, 960, 9600,
> > etc.
>
> > I also tried just for fun what happens with 256 samples, and I don't see any
> > underflows thrown either, so I am wondering what exactly the problem is?
> > Something's not adding up. I would definitively favor multiple of 1ms
> > periods, since it's the only case that was productized, but there's got to
> > me something a side effect of how CRAS programs the hw_params.
>
> Is it something that goes wrong with longer playbacks possibly (eg,
> someone watching a feature film or something)?

Thanks for testing!

After doing some experiments, I think I can identify the problem more precisely.
1. aplay cannot reproduce this issue because it writes samples
immediately whenever there is some space in the buffer. However, you can
add --test-position to see how the delay grows with period size 256.
> aplay -Dhw:1,0 --period-size=256 --buffer-size=480 /dev/zero -d 1 -f dat --test-position
Playing raw data '/dev/zero' : Signed 16 bit Little Endian, Rate 48000 Hz, Stereo
Suspicious buffer position (1 total): avail = 0, delay = 2064, buffer = 512
Suspicious buffer position (2 total): avail = 0, delay = 2064, buffer = 512
Suspicious buffer position (3 total): avail = 0, delay = 2096, buffer = 512
...

2. Since many samples are moved to the DSP (delay), the measured rate of
the ring buffer is high. (I measured it with alsa_conformance_test,
which only tests the sampling rate in the kernel ring buffer, not in
the DSP.)

3. Since CRAS writes samples at a fixed frequency, this behavior
drains all samples from the ring buffer, which CRAS sees as an
underrun. (It seems that it is not a real underrun because avail
does not grow larger than the buffer size. Maybe CRAS should also take
the delay into account.)

4. Although it is not a real underrun, the large delay is still a
big problem. Can we apply the constraint to fix it? Or is there a
better idea?

Thanks,
Yu-Hsuan
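
To put the reported delay into perspective, the frame counts from the --test-position output can be converted to time. A minimal sketch; the helper names are hypothetical and the numbers are taken from the output above:

```c
#include <stdint.h>

/* Convert a frame count to microseconds at the given sample rate. */
uint64_t frames_to_us(uint64_t frames, uint32_t rate)
{
	return frames * 1000000u / rate;
}

/* How many whole buffer lengths the reported delay spans. */
uint64_t delay_in_buffers(uint64_t delay_frames, uint64_t buffer_frames)
{
	return delay_frames / buffer_frames;
}
```

With delay = 2064 and buffer = 512 at 48000 Hz, the delay corresponds to 43 ms of audio, while the whole ring buffer holds only about 10.7 ms. The delay is therefore slightly more than four buffer lengths.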
Takashi Iwai Aug. 12, 2020, 6:13 a.m. UTC | #23
On Wed, 12 Aug 2020 05:09:58 +0200,
Yu-Hsuan Hsu wrote:
> 
> Mark Brown <broonie@kernel.org> wrote on Wed, Aug 12, 2020 at 1:22 AM:
> >
> > On Tue, Aug 11, 2020 at 11:54:38AM -0500, Pierre-Louis Bossart wrote:
> >
> > > > constraint logic needs to know about this DSP limitation - it seems like
> > > > none of this is going to change without something new going into the
> > > > mix?  We at least need a new question to ask about the DSP firmware I
> > > > think.
> >
> > > I just tested aplay -Dhw: on a Cyan Chromebook with the Ubuntu kernel 5.4,
> > > and I see no issues with the 240 sample period. Same with 432, 960, 9600,
> > > etc.
> >
> > > I also tried just for fun what happens with 256 samples, and I don't see any
> > > underflows thrown either, so I am wondering what exactly the problem is?
> > > Something's not adding up. I would definitively favor multiple of 1ms
> > > periods, since it's the only case that was productized, but there's got to
> > > me something a side effect of how CRAS programs the hw_params.
> >
> > Is it something that goes wrong with longer playbacks possibly (eg,
> > someone watching a feature film or something)?
> 
> Thanks for testing!
> 
> After doing some experiments, I think I can identify the problem more precisely.
> 1. aplay can not reproduce this issue because it writes samples
> immediately when there are some space in the buffer. However, you can
> add --test-position to see how the delay grows with period size 256.
> > aplay -Dhw:1,0 --period-size=256 --buffer-size=480 /dev/zero -d 1 -f dat --test-position
> Playing raw data '/dev/zero' : Signed 16 bit Little Endian, Rate 48000
> Hz, Stereo
> Suspicious buffer position (1 total): avail = 0, delay = 2064, buffer = 512
> Suspicious buffer position (2 total): avail = 0, delay = 2064, buffer = 512
> Suspicious buffer position (3 total): avail = 0, delay = 2096, buffer = 512
> ...

Isn't this about the alignment of the buffer size against the period
size, not the period size itself?  i.e. in the example above, the
buffer size isn't a multiple of the period size, and the DSP can't
handle a period that straddles the end of the buffer.

If that's the problem (and it's an oft-seen restriction), the right
constraint is
  snd_pcm_hw_constraint_integer(runtime, SNDRV_PCM_HW_PARAM_PERIODS);


Takashi
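
The condition that the integer constraint on SNDRV_PCM_HW_PARAM_PERIODS enforces can be illustrated in plain userspace C. This is only a sketch of the arithmetic, not kernel code, and the helper name is hypothetical:

```c
#include <stdbool.h>
#include <stdint.h>

/* True if the buffer holds a whole number of periods. */
bool buffer_has_integer_periods(uint32_t buffer_frames, uint32_t period_frames)
{
	return period_frames != 0 && buffer_frames % period_frames == 0;
}
```

For the command line quoted above, a 480-frame buffer with a 256-frame period fails this check, while 512/256 and 480/240 both pass.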

> 2. Since many samples are moved to DSP(delay), the measured rate of
> the ring-buffer is high. (I measured it by alsa_conformance_test,
> which only test the sampling rate in the ring buffer of kernel not
> DSP)
> 
> 3. Since CRAS writes samples with a fixed frequency, this behavior
> will take all samples from the ring buffer, which is seen as underrun
> by CRAS. (It seems that it is not a real underrun because that avail
> does not grow larger than buffer size. Maybe CRAS should also take delay
> into account.)
> 
> 4. In spite of it is not a real underrun, the large delay is still a
> big problem. Can we apply the constraint to fix it? Or any better
> idea?
> 
> Thanks,
> Yu-Hsuan
>
Yu-Hsuan Hsu Aug. 12, 2020, 6:53 a.m. UTC | #24
Takashi Iwai <tiwai@suse.de> wrote on Wed, Aug 12, 2020 at 2:14 PM:
>
> On Wed, 12 Aug 2020 05:09:58 +0200,
> Yu-Hsuan Hsu wrote:
> >
> > Mark Brown <broonie@kernel.org> wrote on Wed, Aug 12, 2020 at 1:22 AM:
> > >
> > > On Tue, Aug 11, 2020 at 11:54:38AM -0500, Pierre-Louis Bossart wrote:
> > >
> > > > > constraint logic needs to know about this DSP limitation - it seems like
> > > > > none of this is going to change without something new going into the
> > > > > mix?  We at least need a new question to ask about the DSP firmware I
> > > > > think.
> > >
> > > > I just tested aplay -Dhw: on a Cyan Chromebook with the Ubuntu kernel 5.4,
> > > > and I see no issues with the 240 sample period. Same with 432, 960, 9600,
> > > > etc.
> > >
> > > > I also tried just for fun what happens with 256 samples, and I don't see any
> > > > underflows thrown either, so I am wondering what exactly the problem is?
> > > > Something's not adding up. I would definitively favor multiple of 1ms
> > > > periods, since it's the only case that was productized, but there's got to
> > > > me something a side effect of how CRAS programs the hw_params.
> > >
> > > Is it something that goes wrong with longer playbacks possibly (eg,
> > > someone watching a feature film or something)?
> >
> > Thanks for testing!
> >
> > After doing some experiments, I think I can identify the problem more precisely.
> > 1. aplay can not reproduce this issue because it writes samples
> > immediately when there are some space in the buffer. However, you can
> > add --test-position to see how the delay grows with period size 256.
> > > aplay -Dhw:1,0 --period-size=256 --buffer-size=480 /dev/zero -d 1 -f dat --test-position
> > Playing raw data '/dev/zero' : Signed 16 bit Little Endian, Rate 48000
> > Hz, Stereo
> > Suspicious buffer position (1 total): avail = 0, delay = 2064, buffer = 512
> > Suspicious buffer position (2 total): avail = 0, delay = 2064, buffer = 512
> > Suspicious buffer position (3 total): avail = 0, delay = 2096, buffer = 512
> > ...
>
> Isn't this about the alignment of the buffer size against the period
> size, not the period size itself?  i.e. in the example above, the
> buffer size isn't a multiple of period size, and DSP can't handle if
> the position overlaps the buffer size in a half way.
>
> If that's the problem (and it's an oft-seen restriction), the right
> constraint is
>   snd_pcm_hw_constraint_integer(runtime, SNDRV_PCM_HW_PARAM_PERIODS);
>
>
> Takashi
Oh, sorry for my typo. The issue happens no matter what buffer size is
set. Actually, even if I request 480, it is changed to 512
automatically.
Suspicious buffer position (1 total): avail = 0, delay = 2064, buffer = 512 <- this one is the buffer size
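
One way to read the 480 to 512 change is that the requested buffer size is rounded up to a whole number of periods. The sketch below shows only that arithmetic; it assumes round-up behavior for illustration, the helper name is hypothetical, and the actual ALSA parameter refinement may choose values differently.

```c
#include <stdint.h>

/* Round a requested buffer size up to the next whole multiple of the period. */
uint32_t round_up_to_period(uint32_t requested_frames, uint32_t period_frames)
{
	uint32_t periods;

	if (period_frames == 0)
		return requested_frames;
	periods = (requested_frames + period_frames - 1) / period_frames;
	return periods * period_frames;
}
```

With a 256-frame period a request for 480 frames becomes 512, which matches the buffer value printed by aplay. With a 240-frame period the same request stays at 480.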

>
> > 2. Since many samples are moved to DSP(delay), the measured rate of
> > the ring-buffer is high. (I measured it by alsa_conformance_test,
> > which only test the sampling rate in the ring buffer of kernel not
> > DSP)
> >
> > 3. Since CRAS writes samples with a fixed frequency, this behavior
> > will take all samples from the ring buffer, which is seen as underrun
> > by CRAS. (It seems that it is not a real underrun because that avail
> > does not grow larger than buffer size. Maybe CRAS should also take delay
> > into account.)
> >
> > 4. In spite of it is not a real underrun, the large delay is still a
> > big problem. Can we apply the constraint to fix it? Or any better
> > idea?
> >
> > Thanks,
> > Yu-Hsuan
> >
Takashi Iwai Aug. 12, 2020, 6:58 a.m. UTC | #25
On Wed, 12 Aug 2020 08:53:42 +0200,
Yu-Hsuan Hsu wrote:
> 
> Takashi Iwai <tiwai@suse.de> wrote on Wed, Aug 12, 2020 at 2:14 PM:
> >
> > On Wed, 12 Aug 2020 05:09:58 +0200,
> > Yu-Hsuan Hsu wrote:
> > >
> > > Mark Brown <broonie@kernel.org> wrote on Wed, Aug 12, 2020 at 1:22 AM:
> > > >
> > > > On Tue, Aug 11, 2020 at 11:54:38AM -0500, Pierre-Louis Bossart wrote:
> > > >
> > > > > > constraint logic needs to know about this DSP limitation - it seems like
> > > > > > none of this is going to change without something new going into the
> > > > > > mix?  We at least need a new question to ask about the DSP firmware I
> > > > > > think.
> > > >
> > > > > I just tested aplay -Dhw: on a Cyan Chromebook with the Ubuntu kernel 5.4,
> > > > > and I see no issues with the 240 sample period. Same with 432, 960, 9600,
> > > > > etc.
> > > >
> > > > > I also tried just for fun what happens with 256 samples, and I don't see any
> > > > > underflows thrown either, so I am wondering what exactly the problem is?
> > > > > Something's not adding up. I would definitively favor multiple of 1ms
> > > > > periods, since it's the only case that was productized, but there's got to
> > > > > me something a side effect of how CRAS programs the hw_params.
> > > >
> > > > Is it something that goes wrong with longer playbacks possibly (eg,
> > > > someone watching a feature film or something)?
> > >
> > > Thanks for testing!
> > >
> > > After doing some experiments, I think I can identify the problem more precisely.
> > > 1. aplay can not reproduce this issue because it writes samples
> > > immediately when there are some space in the buffer. However, you can
> > > add --test-position to see how the delay grows with period size 256.
> > > > aplay -Dhw:1,0 --period-size=256 --buffer-size=480 /dev/zero -d 1 -f dat --test-position
> > > Playing raw data '/dev/zero' : Signed 16 bit Little Endian, Rate 48000
> > > Hz, Stereo
> > > Suspicious buffer position (1 total): avail = 0, delay = 2064, buffer = 512
> > > Suspicious buffer position (2 total): avail = 0, delay = 2064, buffer = 512
> > > Suspicious buffer position (3 total): avail = 0, delay = 2096, buffer = 512
> > > ...
> >
> > Isn't this about the alignment of the buffer size against the period
> > size, not the period size itself?  i.e. in the example above, the
> > buffer size isn't a multiple of period size, and DSP can't handle if
> > the position overlaps the buffer size in a half way.
> >
> > If that's the problem (and it's an oft-seen restriction), the right
> > constraint is
> >   snd_pcm_hw_constraint_integer(runtime, SNDRV_PCM_HW_PARAM_PERIODS);
> >
> >
> > Takashi
> Oh sorry for my typo. The issue happens no matter what buffer size is
> set. Actually, even if I want to set 480, it will change to 512
> automatically.
> Suspicious buffer position (1 total): avail = 0, delay = 2064, buffer
> = 512 <-this one is the buffer size

OK, then it means that the buffer size alignment is already in place.

And this large delay won't happen if you use period size 240?


Takashi

> > > 2. Since many samples are moved to DSP(delay), the measured rate of
> > > the ring-buffer is high. (I measured it by alsa_conformance_test,
> > > which only test the sampling rate in the ring buffer of kernel not
> > > DSP)
> > >
> > > 3. Since CRAS writes samples with a fixed frequency, this behavior
> > > will take all samples from the ring buffer, which is seen as underrun
> > > by CRAS. (It seems that it is not a real underrun because that avail
> > > does not grow larger than buffer size. Maybe CRAS should also take delay
> > > into account.)
> > >
> > > 4. In spite of it is not a real underrun, the large delay is still a
> > > big problem. Can we apply the constraint to fix it? Or any better
> > > idea?
> > >
> > > Thanks,
> > > Yu-Hsuan
> > >
>
Yu-Hsuan Hsu Aug. 12, 2020, 7:43 a.m. UTC | #26
Takashi Iwai <tiwai@suse.de> wrote on Wed, Aug 12, 2020 at 2:58 PM:
>
> On Wed, 12 Aug 2020 08:53:42 +0200,
> Yu-Hsuan Hsu wrote:
> >
> > Takashi Iwai <tiwai@suse.de> wrote on Wed, Aug 12, 2020 at 2:14 PM:
> > >
> > > On Wed, 12 Aug 2020 05:09:58 +0200,
> > > Yu-Hsuan Hsu wrote:
> > > >
> > > > Mark Brown <broonie@kernel.org> wrote on Wed, Aug 12, 2020 at 1:22 AM:
> > > > >
> > > > > On Tue, Aug 11, 2020 at 11:54:38AM -0500, Pierre-Louis Bossart wrote:
> > > > >
> > > > > > > constraint logic needs to know about this DSP limitation - it seems like
> > > > > > > none of this is going to change without something new going into the
> > > > > > > mix?  We at least need a new question to ask about the DSP firmware I
> > > > > > > think.
> > > > >
> > > > > > I just tested aplay -Dhw: on a Cyan Chromebook with the Ubuntu kernel 5.4,
> > > > > > and I see no issues with the 240 sample period. Same with 432, 960, 9600,
> > > > > > etc.
> > > > >
> > > > > > I also tried just for fun what happens with 256 samples, and I don't see any
> > > > > > underflows thrown either, so I am wondering what exactly the problem is?
> > > > > > Something's not adding up. I would definitively favor multiple of 1ms
> > > > > > periods, since it's the only case that was productized, but there's got to
> > > > > > me something a side effect of how CRAS programs the hw_params.
> > > > >
> > > > > Is it something that goes wrong with longer playbacks possibly (eg,
> > > > > someone watching a feature film or something)?
> > > >
> > > > Thanks for testing!
> > > >
> > > > After doing some experiments, I think I can identify the problem more precisely.
> > > > 1. aplay can not reproduce this issue because it writes samples
> > > > immediately when there are some space in the buffer. However, you can
> > > > add --test-position to see how the delay grows with period size 256.
> > > > > aplay -Dhw:1,0 --period-size=256 --buffer-size=480 /dev/zero -d 1 -f dat --test-position
> > > > Playing raw data '/dev/zero' : Signed 16 bit Little Endian, Rate 48000
> > > > Hz, Stereo
> > > > Suspicious buffer position (1 total): avail = 0, delay = 2064, buffer = 512
> > > > Suspicious buffer position (2 total): avail = 0, delay = 2064, buffer = 512
> > > > Suspicious buffer position (3 total): avail = 0, delay = 2096, buffer = 512
> > > > ...
> > >
> > > Isn't this about the alignment of the buffer size against the period
> > > size, not the period size itself?  i.e. in the example above, the
> > > buffer size isn't a multiple of period size, and DSP can't handle if
> > > the position overlaps the buffer size in a half way.
> > >
> > > If that's the problem (and it's an oft-seen restriction), the right
> > > constraint is
> > >   snd_pcm_hw_constraint_integer(runtime, SNDRV_PCM_HW_PARAM_PERIODS);
> > >
> > >
> > > Takashi
> > Oh sorry for my typo. The issue happens no matter what buffer size is
> > set. Actually, even if I want to set 480, it will change to 512
> > automatically.
> > Suspicious buffer position (1 total): avail = 0, delay = 2064, buffer
> > = 512 <-this one is the buffer size
>
> OK, then it means that the buffer size alignment is already in place.
>
> And this large delay won't happen if you use period size 240?
>
>
> Takashi
Yes! If I set the period size to 240, it will not print "Suspicious
buffer position ..."

Yu-Hsuan

>
> > > > 2. Since many samples are moved to DSP(delay), the measured rate of
> > > > the ring-buffer is high. (I measured it by alsa_conformance_test,
> > > > which only test the sampling rate in the ring buffer of kernel not
> > > > DSP)
> > > >
> > > > 3. Since CRAS writes samples with a fixed frequency, this behavior
> > > > will take all samples from the ring buffer, which is seen as underrun
> > > > by CRAS. (It seems that it is not a real underrun because that avail
> > > > does not grow larger than buffer size. Maybe CRAS should also take delay
> > > > into account.)
> > > >
> > > > 4. In spite of it is not a real underrun, the large delay is still a
> > > > big problem. Can we apply the constraint to fix it? Or any better
> > > > idea?
> > > >
> > > > Thanks,
> > > > Yu-Hsuan
> > > >
> >
Takashi Iwai Aug. 12, 2020, 7:47 a.m. UTC | #27
On Wed, 12 Aug 2020 09:43:22 +0200,
Yu-Hsuan Hsu wrote:
> 
> Takashi Iwai <tiwai@suse.de> wrote on Wed, Aug 12, 2020 at 2:58 PM:
> >
> > On Wed, 12 Aug 2020 08:53:42 +0200,
> > Yu-Hsuan Hsu wrote:
> > >
> > > Takashi Iwai <tiwai@suse.de> wrote on Wed, Aug 12, 2020 at 2:14 PM:
> > > >
> > > > On Wed, 12 Aug 2020 05:09:58 +0200,
> > > > Yu-Hsuan Hsu wrote:
> > > > >
> > > > > Mark Brown <broonie@kernel.org> wrote on Wed, Aug 12, 2020 at 1:22 AM:
> > > > > >
> > > > > > On Tue, Aug 11, 2020 at 11:54:38AM -0500, Pierre-Louis Bossart wrote:
> > > > > >
> > > > > > > > constraint logic needs to know about this DSP limitation - it seems like
> > > > > > > > none of this is going to change without something new going into the
> > > > > > > > mix?  We at least need a new question to ask about the DSP firmware I
> > > > > > > > think.
> > > > > >
> > > > > > > I just tested aplay -Dhw: on a Cyan Chromebook with the Ubuntu kernel 5.4,
> > > > > > > and I see no issues with the 240 sample period. Same with 432, 960, 9600,
> > > > > > > etc.
> > > > > >
> > > > > > > I also tried just for fun what happens with 256 samples, and I don't see any
> > > > > > > underflows thrown either, so I am wondering what exactly the problem is?
> > > > > > > Something's not adding up. I would definitively favor multiple of 1ms
> > > > > > > periods, since it's the only case that was productized, but there's got to
> > > > > > > me something a side effect of how CRAS programs the hw_params.
> > > > > >
> > > > > > Is it something that goes wrong with longer playbacks possibly (eg,
> > > > > > someone watching a feature film or something)?
> > > > >
> > > > > Thanks for testing!
> > > > >
> > > > > After doing some experiments, I think I can identify the problem more precisely.
> > > > > > 1. aplay cannot reproduce this issue because it writes samples
> > > > > > immediately when there is some space in the buffer. However, you can
> > > > > add --test-position to see how the delay grows with period size 256.
> > > > > > aplay -Dhw:1,0 --period-size=256 --buffer-size=480 /dev/zero -d 1 -f dat --test-position
> > > > > Playing raw data '/dev/zero' : Signed 16 bit Little Endian, Rate 48000
> > > > > Hz, Stereo
> > > > > Suspicious buffer position (1 total): avail = 0, delay = 2064, buffer = 512
> > > > > Suspicious buffer position (2 total): avail = 0, delay = 2064, buffer = 512
> > > > > Suspicious buffer position (3 total): avail = 0, delay = 2096, buffer = 512
> > > > > ...
> > > >
> > > > Isn't this about the alignment of the buffer size against the period
> > > > size, not the period size itself?  i.e. in the example above, the
> > > > buffer size isn't a multiple of period size, and DSP can't handle if
> > > > the position overlaps the buffer size in a half way.
> > > >
> > > > If that's the problem (and it's an oft-seen restriction), the right
> > > > constraint is
> > > >   snd_pcm_hw_constraint_integer(runtime, SNDRV_PCM_HW_PARAM_PERIODS);
> > > >
> > > >
> > > > Takashi
> > > Oh sorry for my typo. The issue happens no matter what buffer size is
> > > set. Actually, even if I want to set 480, it will change to 512
> > > automatically.
> > > Suspicious buffer position (1 total): avail = 0, delay = 2064, buffer
> > > = 512 <-this one is the buffer size
> >
> > OK, then it means that the buffer size alignment is already in place.
> >
> > And this large delay won't happen if you use period size 240?
> >
> >
> > Takashi
> Yes! If I set the period size to 240, it will not print "Suspicious
> buffer position ..."

So it sounds like the DSP handles the delay report incorrectly.
That raises another question: the driver supports both SOF and
SST.  Is there a difference in behavior between the two wrt this
delay issue?


Takashi

> 
> Yu-Hsuan
> 
> >
> > > > > > 2. Since many samples are moved to the DSP (delay), the measured rate of
> > > > > > the ring buffer is high. (I measured it with alsa_conformance_test,
> > > > > > which only tests the sampling rate in the kernel ring buffer, not in
> > > > > > the DSP.)
> > > > > >
> > > > > > 3. Since CRAS writes samples at a fixed frequency, this behavior
> > > > > > will take all samples from the ring buffer, which CRAS sees as an
> > > > > > underrun. (It does not seem to be a real underrun, because avail
> > > > > > is not larger than the buffer size. Maybe CRAS should also take delay
> > > > > > into account.)
> > > > > >
> > > > > > 4. Although it is not a real underrun, the large delay is still a
> > > > > > big problem. Can we apply the constraint to fix it? Or is there a
> > > > > > better idea?
> > > > >
> > > > > Thanks,
> > > > > Yu-Hsuan
> > > > >
> > >
>
Pierre-Louis Bossart Aug. 12, 2020, 2:46 p.m. UTC | #28
>>>>>> After doing some experiments, I think I can identify the problem more precisely.
>>>>>> 1. aplay cannot reproduce this issue because it writes samples
>>>>>> immediately when there is some space in the buffer. However, you can
>>>>>> add --test-position to see how the delay grows with period size 256.
>>>>>>> aplay -Dhw:1,0 --period-size=256 --buffer-size=480 /dev/zero -d 1 -f dat --test-position
>>>>>> Playing raw data '/dev/zero' : Signed 16 bit Little Endian, Rate 48000
>>>>>> Hz, Stereo
>>>>>> Suspicious buffer position (1 total): avail = 0, delay = 2064, buffer = 512
>>>>>> Suspicious buffer position (2 total): avail = 0, delay = 2064, buffer = 512
>>>>>> Suspicious buffer position (3 total): avail = 0, delay = 2096, buffer = 512
>>>>>> ...
>>>>>
>>>>> Isn't this about the alignment of the buffer size against the period
>>>>> size, not the period size itself?  i.e. in the example above, the
>>>>> buffer size isn't a multiple of period size, and DSP can't handle if
>>>>> the position overlaps the buffer size in a half way.
>>>>>
>>>>> If that's the problem (and it's an oft-seen restriction), the right
>>>>> constraint is
>>>>>    snd_pcm_hw_constraint_integer(runtime, SNDRV_PCM_HW_PARAM_PERIODS);
>>>>>
>>>>>
>>>>> Takashi
>>>> Oh sorry for my typo. The issue happens no matter what buffer size is
>>>> set. Actually, even if I want to set 480, it will change to 512
>>>> automatically.
>>>> Suspicious buffer position (1 total): avail = 0, delay = 2064, buffer
>>>> = 512 <-this one is the buffer size
>>>
>>> OK, then it means that the buffer size alignment is already in place.
>>>
>>> And this large delay won't happen if you use period size 240?
>>>
>>>
>>> Takashi
>> Yes! If I set the period size to 240, it will not print "Suspicious
>> buffer position ..."
> 
> So it sounds like DSP handles the delay report incorrectly.
> Then it comes to another question: the driver supports both SOF and
> SST.  Is there the behavior difference between both DSPs wrt this
> delay issue?

I still don't get what the issue is. The two following cases work fine 
with the SST/Atom driver:

root@chrx:~# aplay -Dhw:0,0 --period-size=240 --buffer-size=480 /dev/zero -d 2 -f dat --test-position
Playing raw data '/dev/zero' : Signed 16 bit Little Endian, Rate 48000 Hz, Stereo
root@chrx:~# aplay -Dhw:0,0 --period-size=960 --buffer-size=4800 /dev/zero -d 2 -f dat --test-position
Playing raw data '/dev/zero' : Signed 16 bit Little Endian, Rate 48000 Hz, Stereo

The existing code has this:

	/* Make sure, that the period size is always even */
	snd_pcm_hw_constraint_step(substream->runtime, 0,
			   SNDRV_PCM_HW_PARAM_PERIODS, 2);

	return snd_pcm_hw_constraint_integer(runtime,
			 SNDRV_PCM_HW_PARAM_PERIODS);

and with the addition of period size being a multiple of 1ms all 
requirements should be met?
Takashi Iwai Aug. 12, 2020, 2:55 p.m. UTC | #29
On Wed, 12 Aug 2020 16:46:40 +0200,
Pierre-Louis Bossart wrote:
> 
> 
> >>>>>> After doing some experiments, I think I can identify the problem more precisely.
> >>>>>> 1. aplay cannot reproduce this issue because it writes samples
> >>>>>> immediately when there is some space in the buffer. However, you can
> >>>>>> add --test-position to see how the delay grows with period size 256.
> >>>>>>> aplay -Dhw:1,0 --period-size=256 --buffer-size=480 /dev/zero -d 1 -f dat --test-position
> >>>>>> Playing raw data '/dev/zero' : Signed 16 bit Little Endian, Rate 48000
> >>>>>> Hz, Stereo
> >>>>>> Suspicious buffer position (1 total): avail = 0, delay = 2064, buffer = 512
> >>>>>> Suspicious buffer position (2 total): avail = 0, delay = 2064, buffer = 512
> >>>>>> Suspicious buffer position (3 total): avail = 0, delay = 2096, buffer = 512
> >>>>>> ...
> >>>>>
> >>>>> Isn't this about the alignment of the buffer size against the period
> >>>>> size, not the period size itself?  i.e. in the example above, the
> >>>>> buffer size isn't a multiple of period size, and DSP can't handle if
> >>>>> the position overlaps the buffer size in a half way.
> >>>>>
> >>>>> If that's the problem (and it's an oft-seen restriction), the right
> >>>>> constraint is
> >>>>>    snd_pcm_hw_constraint_integer(runtime, SNDRV_PCM_HW_PARAM_PERIODS);
> >>>>>
> >>>>>
> >>>>> Takashi
> >>>> Oh sorry for my typo. The issue happens no matter what buffer size is
> >>>> set. Actually, even if I want to set 480, it will change to 512
> >>>> automatically.
> >>>> Suspicious buffer position (1 total): avail = 0, delay = 2064, buffer
> >>>> = 512 <-this one is the buffer size
> >>>
> >>> OK, then it means that the buffer size alignment is already in place.
> >>>
> >>> And this large delay won't happen if you use period size 240?
> >>>
> >>>
> >>> Takashi
> >> Yes! If I set the period size to 240, it will not print "Suspicious
> >> buffer position ..."
> >
> > So it sounds like DSP handles the delay report incorrectly.
> > Then it comes to another question: the driver supports both SOF and
> > SST.  Is there the behavior difference between both DSPs wrt this
> > delay issue?
> 
> I still don't get what the issue is. The two following cases work fine
> with the SST/Atom driver:
> 
> root@chrx:~# aplay -Dhw:0,0 --period-size=240 --buffer-size=480
> /dev/zero -d 2 -f dat --test-position
> Playing raw data '/dev/zero' : Signed 16 bit Little Endian, Rate 48000
> Hz, Stereo
> root@chrx:~# aplay -Dhw:0,0 --period-size=960 --buffer-size=4800
> /dev/zero -d 2 -f dat --test-position
> Playing raw data '/dev/zero' : Signed 16 bit Little Endian, Rate 48000
> Hz, Stereo

What happens with --period-size=256 --buffer-size=512 and --test-position?
Can you reproduce the problem on your side?

> The existing code has this:
> 
> 	/* Make sure, that the period size is always even */
> 	snd_pcm_hw_constraint_step(substream->runtime, 0,
> 			   SNDRV_PCM_HW_PARAM_PERIODS, 2);
> 
> 	return snd_pcm_hw_constraint_integer(runtime,
> 			 SNDRV_PCM_HW_PARAM_PERIODS);
> 
> and with the addition of period size being a multiple of 1ms all
> requirements should be met?

I also wonder what's really missing, too :)

BTW, I took a look back at the thread, and CRAS seems to be using a very
large buffer, namely:
[   52.434791] sound pcmC1D0p:   PERIOD_SIZE [240:240]
[   52.434802] sound pcmC1D0p:   BUFFER_SIZE [204480:204480]


Takashi
Pierre-Louis Bossart Aug. 12, 2020, 3:54 p.m. UTC | #30
On 8/12/20 9:55 AM, Takashi Iwai wrote:
> On Wed, 12 Aug 2020 16:46:40 +0200,
> Pierre-Louis Bossart wrote:
>>
>>
>>>>>>>> After doing some experiments, I think I can identify the problem more precisely.
>>>>>>>> 1. aplay cannot reproduce this issue because it writes samples
>>>>>>>> immediately when there is some space in the buffer. However, you can
>>>>>>>> add --test-position to see how the delay grows with period size 256.
>>>>>>>>> aplay -Dhw:1,0 --period-size=256 --buffer-size=480 /dev/zero -d 1 -f dat --test-position
>>>>>>>> Playing raw data '/dev/zero' : Signed 16 bit Little Endian, Rate 48000
>>>>>>>> Hz, Stereo
>>>>>>>> Suspicious buffer position (1 total): avail = 0, delay = 2064, buffer = 512
>>>>>>>> Suspicious buffer position (2 total): avail = 0, delay = 2064, buffer = 512
>>>>>>>> Suspicious buffer position (3 total): avail = 0, delay = 2096, buffer = 512
>>>>>>>> ...
>>>>>>>
>>>>>>> Isn't this about the alignment of the buffer size against the period
>>>>>>> size, not the period size itself?  i.e. in the example above, the
>>>>>>> buffer size isn't a multiple of period size, and DSP can't handle if
>>>>>>> the position overlaps the buffer size in a half way.
>>>>>>>
>>>>>>> If that's the problem (and it's an oft-seen restriction), the right
>>>>>>> constraint is
>>>>>>>     snd_pcm_hw_constraint_integer(runtime, SNDRV_PCM_HW_PARAM_PERIODS);
>>>>>>>
>>>>>>>
>>>>>>> Takashi
>>>>>> Oh sorry for my typo. The issue happens no matter what buffer size is
>>>>>> set. Actually, even if I want to set 480, it will change to 512
>>>>>> automatically.
>>>>>> Suspicious buffer position (1 total): avail = 0, delay = 2064, buffer
>>>>>> = 512 <-this one is the buffer size
>>>>>
>>>>> OK, then it means that the buffer size alignment is already in place.
>>>>>
>>>>> And this large delay won't happen if you use period size 240?
>>>>>
>>>>>
>>>>> Takashi
>>>> Yes! If I set the period size to 240, it will not print "Suspicious
>>>> buffer position ..."
>>>
>>> So it sounds like DSP handles the delay report incorrectly.
>>> Then it comes to another question: the driver supports both SOF and
>>> SST.  Is there the behavior difference between both DSPs wrt this
>>> delay issue?
>>
>> I still don't get what the issue is. The two following cases work fine
>> with the SST/Atom driver:
>>
>> root@chrx:~# aplay -Dhw:0,0 --period-size=240 --buffer-size=480
>> /dev/zero -d 2 -f dat --test-position
>> Playing raw data '/dev/zero' : Signed 16 bit Little Endian, Rate 48000
>> Hz, Stereo
>> root@chrx:~# aplay -Dhw:0,0 --period-size=960 --buffer-size=4800
>> /dev/zero -d 2 -f dat --test-position
>> Playing raw data '/dev/zero' : Signed 16 bit Little Endian, Rate 48000
>> Hz, Stereo
> 
> What if with --period-size=256 --buffer-size=512 and --test-position?
> Can you reproduce the problem in your side?

Yes indeed with the existing driver:

root@chrx:~# aplay -Dhw:0,0 --period-size=256 --buffer-size=512 /dev/zero -d 2 -f dat --test-position
Playing raw data '/dev/zero' : Signed 16 bit Little Endian, Rate 48000 Hz, Stereo
underrun!!! (at least 0.312 ms long)
underrun!!! (at least 0.326 ms long)
Suspicious buffer position (1 total): avail = 0, delay = 2064, buffer = 512
Suspicious buffer position (2 total): avail = 0, delay = 2064, buffer = 512
Suspicious buffer position (3 total): avail = 0, delay = 2080, buffer = 512
Suspicious buffer position (4 total): avail = 0, delay = 2080, buffer = 512
Suspicious buffer position (5 total): avail = 0, delay = 2096, buffer = 512
Suspicious buffer position (6 total): avail = 0, delay = 2096, buffer = 512

but the new constraint to force a 1ms step added in the patch1 should 
preclude this from happening.
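
For reference, the 1 ms rule can be checked in userspace with simple
arithmetic. This is only a sketch, not the kernel constraint from patch 1,
and the helper name period_is_ms_multiple() is made up for illustration:

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical helper (not a kernel or alsa-lib API): returns true if a
 * period of `frames` frames at `rate_hz` lasts a whole number of
 * milliseconds. 64-bit math avoids overflow in frames * 1000.
 */
static bool period_is_ms_multiple(uint32_t frames, uint32_t rate_hz)
{
	if (rate_hz == 0)
		return false;

	return ((uint64_t)frames * 1000u) % rate_hz == 0;
}
```

At 48000 Hz this accepts 240 and 960 frames and rejects 256 frames, which
matches the cases tried above.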

>> The existing code has this:
>>
>> 	/* Make sure, that the period size is always even */
>> 	snd_pcm_hw_constraint_step(substream->runtime, 0,
>> 			   SNDRV_PCM_HW_PARAM_PERIODS, 2);
>>
>> 	return snd_pcm_hw_constraint_integer(runtime,
>> 			 SNDRV_PCM_HW_PARAM_PERIODS);
>>
>> and with the addition of period size being a multiple of 1ms all
>> requirements should be met?
> 
> I also wonder what's really missing, too :)
> 
> BTW, I took a look back at the thread, and CRAS seems using a very
> large buffer, namely:
> [   52.434791] sound pcmC1D0p:   PERIOD_SIZE [240:240]
> [   52.434802] sound pcmC1D0p:   BUFFER_SIZE [204480:204480]

yes, that's 852 periods and 4.260 seconds. Never seen such values :-)
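
Both figures follow directly from the log above, assuming the 48000 Hz
rate used elsewhere in this thread. A small sketch of the arithmetic
(the helper names are hypothetical):

```c
#include <stdint.h>

/* Number of whole periods that fit in a buffer (both given in frames). */
static uint32_t periods_in_buffer(uint32_t buffer_frames, uint32_t period_frames)
{
	return period_frames ? buffer_frames / period_frames : 0;
}

/* Buffer duration in milliseconds at the given sample rate. */
static uint32_t buffer_duration_ms(uint32_t buffer_frames, uint32_t rate_hz)
{
	if (rate_hz == 0)
		return 0;

	return (uint32_t)(((uint64_t)buffer_frames * 1000u) / rate_hz);
}
```

With BUFFER_SIZE 204480 and PERIOD_SIZE 240 this gives 852 periods and
4260 ms.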
Brent Lu Aug. 12, 2020, 4:08 p.m. UTC | #31
> >
> > I also wonder what's really missing, too :)
> >
> > BTW, I took a look back at the thread, and CRAS seems using a very
> > large buffer, namely:
> > [   52.434791] sound pcmC1D0p:   PERIOD_SIZE [240:240]
> > [   52.434802] sound pcmC1D0p:   BUFFER_SIZE [204480:204480]
> 
> yes, that's 852 periods and 4.260 seconds. Never seen such values :-)

CRAS calls snd_pcm_hw_params_set_buffer_size_max() to use as large a
buffer as possible, so the period size ends up as an arbitrary number that
differs between platforms. On the Atom SST platform it happens to be 256,
and on the CML SOF platform it is 1056, for example.


Regards,
Brent
Pierre-Louis Bossart Aug. 12, 2020, 4:38 p.m. UTC | #32
On 8/12/20 11:08 AM, Lu, Brent wrote:
>>>
>>> I also wonder what's really missing, too :)
>>>
>>> BTW, I took a look back at the thread, and CRAS seems using a very
>>> large buffer, namely:
>>> [   52.434791] sound pcmC1D0p:   PERIOD_SIZE [240:240]
>>> [   52.434802] sound pcmC1D0p:   BUFFER_SIZE [204480:204480]
>>
>> yes, that's 852 periods and 4.260 seconds. Never seen such values :-)
> 
> CRAS calls snd_pcm_hw_params_set_buffer_size_max() to use as large
> buffer as possible. So the period size is an arbitrary number in different
> platforms. Atom SST platform happens to be 256, and CML SOF platform
> is 1056 for example.

OK, but earlier in this thread it was mentioned that values such as 432
are not suitable. The statement above seems to mean the actual period
value is a "don't care", so I don't quite see why this specific patch 2
restricting the value to 240 is necessary. Patch 1 is needed for sure;
patch 2 is where Takashi and I are not convinced.
Yu-Hsuan Hsu Aug. 13, 2020, 6:24 a.m. UTC | #33
Pierre-Louis Bossart <pierre-louis.bossart@linux.intel.com> wrote on
Thu, Aug 13, 2020 at 12:38 AM:
>
>
>
> On 8/12/20 11:08 AM, Lu, Brent wrote:
> >>>
> >>> I also wonder what's really missing, too :)
> >>>
> >>> BTW, I took a look back at the thread, and CRAS seems using a very
> >>> large buffer, namely:
> >>> [   52.434791] sound pcmC1D0p:   PERIOD_SIZE [240:240]
> >>> [   52.434802] sound pcmC1D0p:   BUFFER_SIZE [204480:204480]
> >>
> >> yes, that's 852 periods and 4.260 seconds. Never seen such values :-)
> >
> > CRAS calls snd_pcm_hw_params_set_buffer_size_max() to use as large
> > buffer as possible. So the period size is an arbitrary number in different
> > platforms. Atom SST platform happens to be 256, and CML SOF platform
> > is 1056 for example.
>
> ok, but earlier in this thread it was mentioned that values such as 432
> are not suitable. the statement above seems to mean the period actual
> value is a "don't care", so I don't quite see why this specific patch2
> restricting the value to 240 is necessary. Patch1 is needed for sure,
> Patch2 is where Takashi and I are not convinced.

I downloaded patch 1, but it does not work. After applying it, the
default period size changes to 320, and the same issue occurs with
period size 320. (This can be verified with aplay.)
Brent Lu Aug. 13, 2020, 7:55 a.m. UTC | #34
> > >
> > > CRAS calls snd_pcm_hw_params_set_buffer_size_max() to use as large
> > > buffer as possible. So the period size is an arbitrary number in
> > > different platforms. Atom SST platform happens to be 256, and CML
> > > SOF platform is 1056 for example.
> >
> > ok, but earlier in this thread it was mentioned that values such as
> > 432 are not suitable. the statement above seems to mean the period
> > actual value is a "don't care", so I don't quite see why this specific
> > patch2 restricting the value to 240 is necessary. Patch1 is needed for
> > sure,
> > Patch2 is where Takashi and I are not convinced.
> 
> I have downloaded the patch1 but it does not work. After applying patch1,
> the default period size changes to 320. However, it also has the same issue
> with period size 320. (It can be verified by aplay.)

The period_size is related to the audio latency, so it is decided by the
application according to the use case it is running. That's why there are
concerns about patch 2, and also why you cannot find similar constraints in
other machine drivers.

Another problem is the buffer size. Too large a buffer does not just waste
memory. It also creates problems for the memory allocator, since contiguous
pages are not always available. Using a small period_count like 2 or 4
should be sufficient for audio data transfer.

buffer_size = period_size * period_count * 1000000 / sample_rate;
snd_pcm_hw_params_set_buffer_time_near(mPcmDevice, params, &buffer_size, NULL);

And one more problem here: you need to decide period_size and period_count
first in order to calculate the buffer size...
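
To make the units explicit: the value passed to
snd_pcm_hw_params_set_buffer_time_near() is a time in microseconds, not a
frame count. A minimal sketch of the calculation, with a hypothetical
helper name and a 64-bit intermediate so the multiplication cannot
overflow:

```c
#include <stdint.h>

/*
 * Hypothetical helper: buffer time in microseconds for `period_count`
 * periods of `period_size` frames at `sample_rate` Hz. The result is the
 * kind of value one would hand to
 * snd_pcm_hw_params_set_buffer_time_near() as *val.
 */
static unsigned int buffer_time_us(unsigned int period_size,
				   unsigned int period_count,
				   unsigned int sample_rate)
{
	if (sample_rate == 0)
		return 0;

	return (unsigned int)(((uint64_t)period_size * period_count *
			       1000000u) / sample_rate);
}
```

For example, 2 periods of 240 frames at 48000 Hz come out as 10000 us,
and 4 periods as 20000 us.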


Regards,
Brent
Yu-Hsuan Hsu Aug. 13, 2020, 8:36 a.m. UTC | #35
Lu, Brent <brent.lu@intel.com> wrote on Thu, Aug 13, 2020 at 3:55 PM:
>
> > > >
> > > > CRAS calls snd_pcm_hw_params_set_buffer_size_max() to use as large
> > > > buffer as possible. So the period size is an arbitrary number in
> > > > different platforms. Atom SST platform happens to be 256, and CML
> > > > SOF platform is 1056 for example.
> > >
> > > ok, but earlier in this thread it was mentioned that values such as
> > > 432 are not suitable. the statement above seems to mean the period
> > > actual value is a "don't care", so I don't quite see why this specific
> > > patch2 restricting the value to 240 is necessary. Patch1 is needed for
> > > sure,
> > > Patch2 is where Takashi and I are not convinced.
> >
> > I have downloaded the patch1 but it does not work. After applying patch1,
> > the default period size changes to 320. However, it also has the same issue
> > with period size 320. (It can be verified by aplay.)
>
> The period_size is related to the audio latency so it's decided by application
> according to the use case it's running. That's why there are concerns about
> patch 2 and also you cannot find similar constraints in other machine driver.
You're right. However, the problem here is that the provided period size
does not work. As with 256, setting the period size to 320 also gives
users a large latency in the DSP ring buffer.

localhost ~ # aplay -Dhw:1,0 --period-size=320 --buffer-size=640 /dev/zero -d 1 -f dat --test-position
Playing raw data '/dev/zero' : Signed 16 bit Little Endian, Rate 48000 Hz, Stereo
Suspicious buffer position (1 total): avail = 0, delay = 2640, buffer = 640
Suspicious buffer position (2 total): avail = 0, delay = 2640, buffer = 640
Suspicious buffer position (3 total): avail = 0, delay = 2720, buffer = 640
...

>
> Another problem is the buffer size. Too large buffer is not just wasting memories.
> It also creates problems to memory allocator since continuous pages are not
> always there. Using a small period_count like 2 or 4 should be sufficient for audio
> data transfer.
>
> buffer_size = period_size * period_count * 1000000 / sample_rate;
> snd_pcm_hw_params_set_buffer_time_near(mPcmDevice, params, &buffer_size, NULL);
>
> And one more problem here: you need to decide period_size and period_count
> first in order to calculate the buffer size...
It's a good point. I will bring it up with our team and see whether we
can use a smaller buffer size. Thanks!
>
>
> Regards,
> Brent

Thanks,
Yu-Hsuan
Takashi Iwai Aug. 13, 2020, 8:45 a.m. UTC | #36
On Thu, 13 Aug 2020 10:36:57 +0200,
Yu-Hsuan Hsu wrote:
> 
> Lu, Brent <brent.lu@intel.com> wrote on Thu, Aug 13, 2020 at 3:55 PM:
> >
> > > > >
> > > > > CRAS calls snd_pcm_hw_params_set_buffer_size_max() to use as large
> > > > > buffer as possible. So the period size is an arbitrary number in
> > > > > different platforms. Atom SST platform happens to be 256, and CML
> > > > > SOF platform is 1056 for example.
> > > >
> > > > ok, but earlier in this thread it was mentioned that values such as
> > > > 432 are not suitable. the statement above seems to mean the period
> > > > actual value is a "don't care", so I don't quite see why this specific
> > > > patch2 restricting the value to 240 is necessary. Patch1 is needed for
> > > > sure,
> > > > Patch2 is where Takashi and I are not convinced.
> > >
> > > I have downloaded the patch1 but it does not work. After applying patch1,
> > > the default period size changes to 320. However, it also has the same issue
> > > with period size 320. (It can be verified by aplay.)
> >
> > The period_size is related to the audio latency so it's decided by application
> > according to the use case it's running. That's why there are concerns about
> > patch 2 and also you cannot find similar constraints in other machine driver.
> You're right. However, the problem here is the provided period size
> does not work. Like 256, setting the period size to 320 also makes
> users have big latency in the DSP ring buffer.
> 
> localhost ~ # aplay -Dhw:1,0 --period-size=320 --buffer-size=640
> /dev/zero -d 1 -f dat --test-position
> Playing raw data '/dev/zero' : Signed 16 bit Little Endian, Rate 48000
> Hz, Stereo
> Suspicious buffer position (1 total): avail = 0, delay = 2640, buffer = 640
> Suspicious buffer position (2 total): avail = 0, delay = 2640, buffer = 640
> Suspicious buffer position (3 total): avail = 0, delay = 2720, buffer = 640
> ...

It means that the delay value returned from the driver is bogus.
I suppose it comes from the pcm_delay value calculated in
sst_calc_tstamp(), but I haven't followed the code closely yet.  Maybe
checking the debug outputs can help to trace what's going wrong.


Takashi

> 
> >
> > Another problem is the buffer size. Too large buffer is not just wasting memories.
> > It also creates problems to memory allocator since continuous pages are not
> > always there. Using a small period_count like 2 or 4 should be sufficient for audio
> > data transfer.
> >
> > buffer_size = period_size * period_count * 1000000 / sample_rate;
> > snd_pcm_hw_params_set_buffer_time_near(mPcmDevice, params, &buffer_size, NULL);
> >
> > And one more problem here: you need to decide period_size and period_count
> > first in order to calculate the buffer size...
> It's a good point. I will bring it up to our team and see whether we
> can use the smaller buffer size. Thanks!
> >
> >
> > Regards,
> > Brent
> 
> Thanks,
> Yu-Hsuan
>
Pierre-Louis Bossart Aug. 13, 2020, 12:57 p.m. UTC | #37
On 8/13/20 3:45 AM, Takashi Iwai wrote:
> On Thu, 13 Aug 2020 10:36:57 +0200,
> Yu-Hsuan Hsu wrote:
>>
>> Lu, Brent <brent.lu@intel.com> wrote on Thu, Aug 13, 2020 at 3:55 PM:
>>>
>>>>>>
>>>>>> CRAS calls snd_pcm_hw_params_set_buffer_size_max() to use as large
>>>>>> buffer as possible. So the period size is an arbitrary number in
>>>>>> different platforms. Atom SST platform happens to be 256, and CML
>>>>>> SOF platform is 1056 for example.
>>>>>
>>>>> ok, but earlier in this thread it was mentioned that values such as
>>>>> 432 are not suitable. the statement above seems to mean the period
>>>>> actual value is a "don't care", so I don't quite see why this specific
>>>>> patch2 restricting the value to 240 is necessary. Patch1 is needed for
>>>>> sure,
>>>>> Patch2 is where Takashi and I are not convinced.
>>>>
>>>> I have downloaded the patch1 but it does not work. After applying patch1,
>>>> the default period size changes to 320. However, it also has the same issue
>>>> with period size 320. (It can be verified by aplay.)
>>>
>>> The period_size is related to the audio latency so it's decided by application
>>> according to the use case it's running. That's why there are concerns about
>>> patch 2 and also you cannot find similar constraints in other machine driver.
>> You're right. However, the problem here is the provided period size
>> does not work. Like 256, setting the period size to 320 also makes
>> users have big latency in the DSP ring buffer.
>>
>> localhost ~ # aplay -Dhw:1,0 --period-size=320 --buffer-size=640
>> /dev/zero -d 1 -f dat --test-position
>> Playing raw data '/dev/zero' : Signed 16 bit Little Endian, Rate 48000
>> Hz, Stereo
>> Suspicious buffer position (1 total): avail = 0, delay = 2640, buffer = 640
>> Suspicious buffer position (2 total): avail = 0, delay = 2640, buffer = 640
>> Suspicious buffer position (3 total): avail = 0, delay = 2720, buffer = 640
>> ...
> 
> It means that the delay value returned from the driver is bogus.
> I suppose it comes pcm_delay value calculated in sst_calc_tstamp(),
> but haven't followed the code closely yet.  Maybe checking the debug
> outputs can help to trace what's going wrong.

The problem is really that we add a constraint that the period size be a
multiple of 1ms, and it's not respected. 320 samples is not a valid
choice, so I don't get how it ends up being selected. There's a glitch in
the matrix here.
Yu-Hsuan Hsu Aug. 13, 2020, 5:15 p.m. UTC | #38
Pierre-Louis Bossart <pierre-louis.bossart@linux.intel.com> wrote on
Thu, Aug 13, 2020 at 8:57 PM:
>
>
>
> On 8/13/20 3:45 AM, Takashi Iwai wrote:
> > On Thu, 13 Aug 2020 10:36:57 +0200,
> > Yu-Hsuan Hsu wrote:
> >>
> >> Lu, Brent <brent.lu@intel.com> wrote on Thu, Aug 13, 2020 at 3:55 PM:
> >>>
> >>>>>>
> >>>>>> CRAS calls snd_pcm_hw_params_set_buffer_size_max() to use as large
> >>>>>> buffer as possible. So the period size is an arbitrary number in
> >>>>>> different platforms. Atom SST platform happens to be 256, and CML
> >>>>>> SOF platform is 1056 for example.
> >>>>>
> >>>>> ok, but earlier in this thread it was mentioned that values such as
> >>>>> 432 are not suitable. the statement above seems to mean the period
> >>>>> actual value is a "don't care", so I don't quite see why this specific
> >>>>> patch2 restricting the value to 240 is necessary. Patch1 is needed for
> >>>>> sure,
> >>>>> Patch2 is where Takashi and I are not convinced.
> >>>>
> >>>> I have downloaded the patch1 but it does not work. After applying patch1,
> >>>> the default period size changes to 320. However, it also has the same issue
> >>>> with period size 320. (It can be verified by aplay.)
> >>>
> >>> The period_size is related to the audio latency so it's decided by application
> >>> according to the use case it's running. That's why there are concerns about
> >>> patch 2 and also you cannot find similar constraints in other machine driver.
> >> You're right. However, the problem here is the provided period size
> >> does not work. Like 256, setting the period size to 320 also makes
> >> users have big latency in the DSP ring buffer.
> >>
> >> localhost ~ # aplay -Dhw:1,0 --period-size=320 --buffer-size=640
> >> /dev/zero -d 1 -f dat --test-position
> >> Playing raw data '/dev/zero' : Signed 16 bit Little Endian, Rate 48000
> >> Hz, Stereo
> >> Suspicious buffer position (1 total): avail = 0, delay = 2640, buffer = 640
> >> Suspicious buffer position (2 total): avail = 0, delay = 2640, buffer = 640
> >> Suspicious buffer position (3 total): avail = 0, delay = 2720, buffer = 640
> >> ...
> >
> > It means that the delay value returned from the driver is bogus.
> > I suppose it comes pcm_delay value calculated in sst_calc_tstamp(),
> > but haven't followed the code closely yet.  Maybe checking the debug
> > outputs can help to trace what's going wrong.
>
> the problem is really that we add a constraint that the period size be a
> multiple of 1ms, and it's not respected. 320 samples is not a valid
> choice, I don't get how it ends-up being selected? there's a glitch in
> the matrix here.
>
>
Oh, sorry, I applied the wrong patch. With the correct patch, the
default period size is 432.
With period size 432, running aplay with --test-position does not show
any errors. However, `cat /proc/asound/card1/pcm0p/sub0/status` shows
that the delay is around 3000.
Here are all the period sizes I have tried. All buffer sizes are set to
2 * period size.
period size: 192,  delay is a negative number. Not sure what happened.
period size: 240, delay is fixed at 960
period size: 288, delay is around 27XX
period size: 336, delay is around 27XX
period size: 384, delay is around 24XX (no errors from aplay)
period size: 432, delay is around 30XX (no errors from aplay)
period size: 480, delay is fixed at 3120 (no errors from aplay)
period size: 524, delay is around 31XX (no errors from aplay)

Not sure why the delay is around 50ms except for the period size 240.
Is it normal?
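
For comparison, the delays above can be converted to time. This sketch
assumes the status file reports the delay in frames and that the stream
runs at 48000 Hz, as in the aplay output earlier in the thread; the helper
name is made up:

```c
#include <stdint.h>

/*
 * Hypothetical helper: convert a delay given in frames to microseconds
 * at `rate_hz`. Assumes the delay is expressed in frames.
 */
static uint64_t frames_to_us(uint64_t frames, uint32_t rate_hz)
{
	if (rate_hz == 0)
		return 0;

	return frames * 1000000u / rate_hz;
}
```

Under these assumptions, 960 frames is 20 ms, 2400 frames is 50 ms and
3120 frames is 65 ms.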

Thanks,
Yu-Hsuan

Patch

diff --git a/sound/soc/intel/boards/cht_bsw_max98090_ti.c b/sound/soc/intel/boards/cht_bsw_max98090_ti.c
index 835e9bd..bf67254 100644
--- a/sound/soc/intel/boards/cht_bsw_max98090_ti.c
+++ b/sound/soc/intel/boards/cht_bsw_max98090_ti.c
@@ -283,8 +283,20 @@  static int cht_codec_fixup(struct snd_soc_pcm_runtime *rtd,
 
 static int cht_aif1_startup(struct snd_pcm_substream *substream)
 {
-	return snd_pcm_hw_constraint_single(substream->runtime,
+	int err;
+
+	/* Set period size to 240 to align with Atom design */
+	err = snd_pcm_hw_constraint_minmax(substream->runtime,
+			SNDRV_PCM_HW_PARAM_PERIOD_SIZE, 240, 240);
+	if (err < 0)
+		return err;
+
+	err = snd_pcm_hw_constraint_single(substream->runtime,
 			SNDRV_PCM_HW_PARAM_RATE, 48000);
+	if (err < 0)
+		return err;
+
+	return 0;
 }
 
 static int cht_max98090_headset_init(struct snd_soc_component *component)
diff --git a/sound/soc/intel/boards/cht_bsw_rt5645.c b/sound/soc/intel/boards/cht_bsw_rt5645.c
index b53c024..6e62f0d 100644
--- a/sound/soc/intel/boards/cht_bsw_rt5645.c
+++ b/sound/soc/intel/boards/cht_bsw_rt5645.c
@@ -414,8 +414,20 @@  static int cht_codec_fixup(struct snd_soc_pcm_runtime *rtd,
 
 static int cht_aif1_startup(struct snd_pcm_substream *substream)
 {
-	return snd_pcm_hw_constraint_single(substream->runtime,
+	int err;
+
+	/* Set period size to 240 to align with Atom design */
+	err = snd_pcm_hw_constraint_minmax(substream->runtime,
+			SNDRV_PCM_HW_PARAM_PERIOD_SIZE, 240, 240);
+	if (err < 0)
+		return err;
+
+	err = snd_pcm_hw_constraint_single(substream->runtime,
 			SNDRV_PCM_HW_PARAM_RATE, 48000);
+	if (err < 0)
+		return err;
+
+	return 0;
 }
 
 static const struct snd_soc_ops cht_aif1_ops = {