Message ID | 1596198365-10105-3-git-send-email-brent.lu@intel.com (mailing list archive) |
---|---|
State | New, archived |
Series | Add period size constraint for Atom Chromebook |
On Fri, 31 Jul 2020 14:26:05 +0200, Brent Lu wrote:
>
> From: Yu-Hsuan Hsu <yuhsuan@chromium.org>
>
> The CRAS server does not set the period size in hw_param so ALSA will
> calculate a value for period size which is based on the buffer size
> and other parameters. The value may not always be aligned with Atom's
> dsp design so a constraint is added to make sure the board always has
> a good value.
>
> Cyan uses chtmax98090 and others(banon, celes, edgar, kefka...) use
> rt5650.
>
> Signed-off-by: Yu-Hsuan Hsu <yuhsuan@chromium.org>
> Signed-off-by: Brent Lu <brent.lu@intel.com>
> ---
>  sound/soc/intel/boards/cht_bsw_max98090_ti.c | 14 +++++++++++++-
>  sound/soc/intel/boards/cht_bsw_rt5645.c      | 14 +++++++++++++-
>  2 files changed, 26 insertions(+), 2 deletions(-)
>
> diff --git a/sound/soc/intel/boards/cht_bsw_max98090_ti.c b/sound/soc/intel/boards/cht_bsw_max98090_ti.c
> index 835e9bd..bf67254 100644
> --- a/sound/soc/intel/boards/cht_bsw_max98090_ti.c
> +++ b/sound/soc/intel/boards/cht_bsw_max98090_ti.c
> @@ -283,8 +283,20 @@ static int cht_codec_fixup(struct snd_soc_pcm_runtime *rtd,
>
>  static int cht_aif1_startup(struct snd_pcm_substream *substream)
>  {
> -	return snd_pcm_hw_constraint_single(substream->runtime,
> +	int err;
> +
> +	/* Set period size to 240 to align with Atom design */
> +	err = snd_pcm_hw_constraint_minmax(substream->runtime,
> +			SNDRV_PCM_HW_PARAM_PERIOD_SIZE, 240, 240);
> +	if (err < 0)
> +		return err;

Again, is this fixed 240 a must? Or is this also an alignment issue?

thanks,

Takashi
> Again, is this fixed 240 a must? Or is this also an alignment issue?

Hi Takashi,

I think it's a must for Chromebooks. Google found this value works best
with their CRAS server running on their BSW products. They offered this
patch for their own Chromebooks.
On Sat, 01 Aug 2020 10:58:16 +0200, Lu, Brent wrote:
>
> > Again, is this fixed 240 a must? Or is this also an alignment issue?
>
> Hi Takashi,
>
> I think it's a must for Chromebooks. Google found this value works best
> with their CRAS server running on their BSW products. They offered this
> patch for their own Chromebooks.

Hrm, but it's likely a worse choice on other sound backends.

Please double-check whether this fixed small period is a must, or whether
it's an alignment issue.

Takashi
> Hrm, but it's likely a worse choice on other sound backends.
>
> Please double-check whether this fixed small period is a must, or whether
> it's an alignment issue.

Hi Takashi,

I've double-checked with Google. It's a must for Chromebooks due to the
low-latency use case.

Regards,
Brent
On 8/3/20 8:00 AM, Lu, Brent wrote:
> Hi Takashi,
>
> I've double-checked with Google. It's a must for Chromebooks due to the
> low-latency use case.

I wonder if there's a misunderstanding here?

I believe Takashi's question was "is this a must to ONLY accept 240 samples
for the period size"; there was no pushback on the value itself.
Are those boards broken with e.g. 960 samples?
> I wonder if there's a misunderstanding here?
>
> I believe Takashi's question was "is this a must to ONLY accept 240 samples
> for the period size"; there was no pushback on the value itself.
> Are those boards broken with e.g. 960 samples?

I've added the Google people so they can discuss this directly.

Hi Yuhsuan,

Would you explain why CRAS needs to use such a short period size? Thanks.

Regards,
Brent
On Mon, 03 Aug 2020 18:45:29 +0200, Lu, Brent wrote:
>
> I've added the Google people so they can discuss this directly.
>
> Hi Yuhsuan,
> Would you explain why CRAS needs to use such a short period size? Thanks.

To avoid further misunderstanding: it's fine that CRAS *uses* such a short
period. It's often required for achieving a short latency.

However, the question is whether the driver needs to set *only* this value
to make it work. IOW, if we don't have this constraint, what actually
happens? If the driver gives the period size alignment, wouldn't CRAS
choose 240?

Takashi
> However, the question is whether the driver needs to set *only* this value
> to make it work. IOW, if we don't have this constraint, what actually
> happens? If the driver gives the period size alignment, wouldn't CRAS
> choose 240?

It won't. Without the constraint it becomes 432. Actually, CRAS does not set
the period size explicitly, so the value depends on the constraint rules.

[ 52.011146] sound pcmC1D0p: hw_param
[ 52.011152] sound pcmC1D0p: ACCESS 0x1
[ 52.011155] sound pcmC1D0p: FORMAT 0x4
[ 52.011158] sound pcmC1D0p: SUBFORMAT 0x1
[ 52.011161] sound pcmC1D0p: SAMPLE_BITS [16:16]
[ 52.011164] sound pcmC1D0p: FRAME_BITS [32:32]
[ 52.011167] sound pcmC1D0p: CHANNELS [2:2]
[ 52.011170] sound pcmC1D0p: RATE [48000:48000]
[ 52.011173] sound pcmC1D0p: PERIOD_TIME [9000:9000]
[ 52.011176] sound pcmC1D0p: PERIOD_SIZE [432:432]
[ 52.011179] sound pcmC1D0p: PERIOD_BYTES [1728:1728]
[ 52.011182] sound pcmC1D0p: PERIODS [474:474]
[ 52.011185] sound pcmC1D0p: BUFFER_TIME [4266000:4266000]
[ 52.011188] sound pcmC1D0p: BUFFER_SIZE [204768:204768]
[ 52.011191] sound pcmC1D0p: BUFFER_BYTES [819072:819072]
[ 52.011194] sound pcmC1D0p: TICK_TIME [0:0]

Regards,
Brent
On 8/3/20 11:33 PM, Lu, Brent wrote:
> It won't. Without the constraint it becomes 432. Actually CRAS does not set
> period size specifically so the value depends on the constraint rules.

I don't get this. If the platform driver already stated 240 and 960 samples,
why would 432 be chosen? Doesn't this mean the constraint is not applied?
> I don't get this. If the platform driver already stated 240 and 960 samples,
> why would 432 be chosen? Doesn't this mean the constraint is not applied?

Hi Pierre,

Sorry for the late reply. I used the following constraints in the V3 patch,
so any period that is aligned to 1 ms would be accepted.

+	/*
+	 * Make sure the period to be multiple of 1ms to align the
+	 * design of firmware. Apply same rule to buffer size to make
+	 * sure alsa could always find a value for period size
+	 * regardless the buffer size given by user space.
+	 */
+	snd_pcm_hw_constraint_step(substream->runtime, 0,
+			SNDRV_PCM_HW_PARAM_PERIOD_SIZE, 48);
+	snd_pcm_hw_constraint_step(substream->runtime, 0,
+			SNDRV_PCM_HW_PARAM_BUFFER_SIZE, 48);

Regards,
Brent
On 8/6/20 11:41 AM, Lu, Brent wrote:
> Sorry for late reply. I used following constraints in V3 patch so any period which
> aligns 1ms would be accepted.
>
> +	snd_pcm_hw_constraint_step(substream->runtime, 0,
> +			SNDRV_PCM_HW_PARAM_PERIOD_SIZE, 48);
> +	snd_pcm_hw_constraint_step(substream->runtime, 0,
> +			SNDRV_PCM_HW_PARAM_BUFFER_SIZE, 48);

432 samples is 9 ms; I don't have a clue why/how CRAS might ask for this
value.

It'd be a bit odd to add constraints just for the purpose of letting
userspace select a sensible value.
On Mon, Aug 10, 2020 at 11:03 PM Pierre-Louis Bossart <pierre-louis.bossart@linux.intel.com> wrote:
>
> 432 samples is 9 ms; I don't have a clue why/how CRAS might ask for this
> value.
>
> It'd be a bit odd to add constraints just for the purpose of letting
> userspace select a sensible value.

Sorry for the late reply. CRAS does not set the period size when using the
driver. The default period size is 256, which consumes the samples quickly
(about 49627 fps when the rate is 48000 fps) at the beginning of the
playback. Since CRAS writes samples at a fixed frequency, this triggers
underruns immediately.

According to Brent, the DSP uses a period of 240 samples regardless of the
hw_param. If the period size is 256, the DSP will read 256 samples each time
but only consume 240 samples until the ring buffer of the DSP is full. This
behavior makes the samples in the kernel ring buffer get consumed quickly.
(Not sure whether the explanation is correct. Need Brent to confirm it.)

Unfortunately, we cannot change the behavior of the DSP. After some
experiments, we found that the issue can be fixed if we set the period size
to 240. With the same frequency as the DSP, the samples are consumed stably.
Because everyone can trigger this issue when using the driver without
setting the period size, we think it is a general issue that should be fixed
in the kernel.

Thanks,
Yu-Hsuan
> Unfortunately, we cannot change the behavior of the DSP. After some
> experiments, we found that the issue can be fixed if we set the period size
> to 240. With the same frequency as the DSP, the samples are consumed stably.
> Because everyone can trigger this issue when using the driver without
> setting the period size, we think it is a general issue that should be fixed
> in the kernel.

I checked the code and just realized that CRAS does nothing but request the
maximum buffer size. As far as I know, the application needs to decide the
buffer time and period time so that ALSA can generate a hw_param structure
with a proper period size, instead of relying on a fixed constraint in the
machine driver, because the driver has no idea about the latency you want.

You can use snd_pcm_hw_params_set_buffer_time_near() and
snd_pcm_hw_params_set_period_time_near() to get a proper configuration of
buffer and period parameters according to the latency requirement. In the
CRAS code, there is a UCM variable to support this: DmaPeriodMicrosecs. I
tested it on Celes and it looks quite promising. It seems to me that adding
a constraint in the machine driver is not necessary.

SectionDevice."Speaker".0 {
	Value {
		PlaybackPCM "hw:chtrt5650,0"
		DmaPeriodMicrosecs "5000"
		...

[ 52.434761] sound pcmC1D0p: hw_param
[ 52.434767] sound pcmC1D0p: ACCESS 0x1
[ 52.434770] sound pcmC1D0p: FORMAT 0x4
[ 52.434772] sound pcmC1D0p: SUBFORMAT 0x1
[ 52.434776] sound pcmC1D0p: SAMPLE_BITS [16:16]
[ 52.434779] sound pcmC1D0p: FRAME_BITS [32:32]
[ 52.434782] sound pcmC1D0p: CHANNELS [2:2]
[ 52.434785] sound pcmC1D0p: RATE [48000:48000]
[ 52.434788] sound pcmC1D0p: PERIOD_TIME [5000:5000]
[ 52.434791] sound pcmC1D0p: PERIOD_SIZE [240:240]
[ 52.434794] sound pcmC1D0p: PERIOD_BYTES [960:960]
[ 52.434797] sound pcmC1D0p: PERIODS [852:852]
[ 52.434799] sound pcmC1D0p: BUFFER_TIME [4260000:4260000]
[ 52.434802] sound pcmC1D0p: BUFFER_SIZE [204480:204480]
[ 52.434805] sound pcmC1D0p: BUFFER_BYTES [817920:817920]
[ 52.434808] sound pcmC1D0p: TICK_TIME [0:0]

Regards,
Brent
On Tue, Aug 11, 2020 at 10:17 AM Lu, Brent <brent.lu@intel.com> wrote:
>
> You can use snd_pcm_hw_params_set_buffer_time_near() and
> snd_pcm_hw_params_set_period_time_near() to get a proper configuration of
> buffer and period parameters according to the latency requirement. In the
> CRAS code, there is a UCM variable to support this: DmaPeriodMicrosecs. I
> tested it on Celes and it looks quite promising. It seems to me that adding
> a constraint in the machine driver is not necessary.

Hi Brent,

Yes, I know we can fix the issue that way as well. As I mentioned
before, we wanted to fix it in the kernel because it is a real issue,
isn't it? Basically, a driver should work with any param it supports.
But in this case, anyone can trigger an underrun by not setting
the period size to 240. If you still think it's not necessary, I can
modify UCM to make CRAS set the appropriate period size.

Thanks,
Yu-Hsuan
On Tue, 11 Aug 2020 04:29:24 +0200, Yu-Hsuan Hsu wrote:
>
> Hi Brent,
>
> Yes, I know we can fix the issue that way as well. As I mentioned
> before, we wanted to fix it in the kernel because it is a real issue,
> isn't it? Basically, a driver should work with any param it supports.
> But in this case, anyone can trigger an underrun by not setting
> the period size to 240. If you still think it's not necessary, I can
> modify UCM to make CRAS set the appropriate period size.

How exactly does it *not* work if you set a period size other than 240?

The hw_constraint to a fixed period size must really be an exception.
If you look at other drivers, you won't find any other doing this.
It already indicates that something is wrong.

Usually the fixed period size comes from a hardware limitation and is
defined in snd_pcm_hardware. Or, sometimes it's an alignment issue.
If you need more than that, you should doubt what's really not
working.

Takashi
On Tue, Aug 11, 2020 at 3:43 PM Takashi Iwai <tiwai@suse.de> wrote:
>
> How exactly does it *not* work if you set a period size other than 240?
>
> The hw_constraint to a fixed period size must really be an exception.
> If you look at other drivers, you won't find any other doing this.
> It already indicates that something is wrong.
>
> Usually the fixed period size comes from a hardware limitation and is
> defined in snd_pcm_hardware. Or, sometimes it's an alignment issue.
> If you need more than that, you should doubt what's really not
> working.

Thanks, Takashi.

As I mentioned before, if the period size is set to 256, the measured
sample consumption rate is around 49627 fps. That causes underruns
because the rate we set is 48000 fps. This behavior also happens with
the other period sizes; only 240 is unaffected.

Thanks,
Yu-Hsuan
On Tue, 11 Aug 2020 10:25:22 +0200, Yu-Hsuan Hsu wrote:
>
> As I mentioned before, if the period size is set to 256, the measured
> sample consumption rate is around 49627 fps. That causes underruns
> because the rate we set is 48000 fps.

But this explanation rather sounds like an alignment problem.
However...

> This behavior also happens with
> the other period sizes; only 240 is unaffected.

... Why only 240? That's the next logical question.
If you have a clarification for it, it may be a solid reason to
introduce such a hw constraint.

Takashi
Takashi Iwai <tiwai@suse.de> 於 2020年8月11日 週二 下午4:39寫道: > > On Tue, 11 Aug 2020 10:25:22 +0200, > Yu-Hsuan Hsu wrote: > > > > Takashi Iwai <tiwai@suse.de> 於 2020年8月11日 週二 下午3:43寫道: > > > > > > On Tue, 11 Aug 2020 04:29:24 +0200, > > > Yu-Hsuan Hsu wrote: > > > > > > > > Lu, Brent <brent.lu@intel.com> 於 2020年8月11日 週二 上午10:17寫道: > > > > > > > > > > > > > > > > > Sorry for the late reply. CRAS does not set the period size when using it. > > > > > > The default period size is 256, which consumes the samples quickly(about 49627 > > > > > > fps when the rate is 48000 fps) at the beginning of the playback. > > > > > > Since CRAS write samples with the fixed frequency, it triggers underruns > > > > > > immidiately. > > > > > > > > > > > > According to Brent, the DSP is using 240 period regardless the hw_param. If the > > > > > > period size is 256, DSP will read 256 samples each time but only consume 240 > > > > > > samples until the ring buffer of DSP is full. This behavior makes the samples in > > > > > > the ring buffer of kernel consumed quickly. (Not sure whether the explanation is > > > > > > correct. Need Brent to confirm it.) > > > > > > > > > > > > Unfortunately, we can not change the behavior of DSP. After some experiments, > > > > > > we found that the issue can be fixed if we set the period size to 240. With the > > > > > > same frequency as the DSP, the samples are consumed stably. Because everyone > > > > > > can trigger this issue when using the driver without setting the period size, we > > > > > > think it is a general issue that should be fixed in the kernel. > > > > > > > > > > I check the code and just realized CRAS does nothing but request maximum buffer > > > > > size. As I know the application needs to decide the buffer time and period time so > > > > > ALSA could generate a hw_param structure with proper period size instead of using > > > > > fixed constraint in machine driver because driver has no idea about the latency you > > > > > want. 
> > > > > > > > > > You can use snd_pcm_hw_params_set_buffer_time_near() and > > > > > snd_pcm_hw_params_set_period_time_near() to get a proper configuration of > > > > > buffer and period parameters according to the latency requirement. In the CRAS > > > > > code, there is a UCM variable to support this: DmaPeriodMicrosecs. I tested it on > > > > > Celes and it looks quite promising. It seems to me that adding constraint in machine > > > > > driver is not necessary. > > > > > > > > > > SectionDevice."Speaker".0 { > > > > > Value { > > > > > PlaybackPCM "hw:chtrt5650,0" > > > > > DmaPeriodMicrosecs "5000" > > > > > ... > > > > > > > > > > [ 52.434761] sound pcmC1D0p: hw_param > > > > > [ 52.434767] sound pcmC1D0p: ACCESS 0x1 > > > > > [ 52.434770] sound pcmC1D0p: FORMAT 0x4 > > > > > [ 52.434772] sound pcmC1D0p: SUBFORMAT 0x1 > > > > > [ 52.434776] sound pcmC1D0p: SAMPLE_BITS [16:16] > > > > > [ 52.434779] sound pcmC1D0p: FRAME_BITS [32:32] > > > > > [ 52.434782] sound pcmC1D0p: CHANNELS [2:2] > > > > > [ 52.434785] sound pcmC1D0p: RATE [48000:48000] > > > > > [ 52.434788] sound pcmC1D0p: PERIOD_TIME [5000:5000] > > > > > [ 52.434791] sound pcmC1D0p: PERIOD_SIZE [240:240] > > > > > [ 52.434794] sound pcmC1D0p: PERIOD_BYTES [960:960] > > > > > [ 52.434797] sound pcmC1D0p: PERIODS [852:852] > > > > > [ 52.434799] sound pcmC1D0p: BUFFER_TIME [4260000:4260000] > > > > > [ 52.434802] sound pcmC1D0p: BUFFER_SIZE [204480:204480] > > > > > [ 52.434805] sound pcmC1D0p: BUFFER_BYTES [817920:817920] > > > > > [ 52.434808] sound pcmC1D0p: TICK_TIME [0:0] > > > > > > > > > > Regards, > > > > > Brent > > > > Hi Brent, > > > > > > > > Yes, I know we can do it to fix the issue as well. As I mentioned > > > > before, we wanted to fix it in kernel because it is a real issue, > > > > isn't it? Basically, a driver should work with any param it supports. > > > > But in this case, everyone can trigger underrun if he or she does not > > > > the period size to 240. 
If you still think it's not necessary, I can > > > > modify UCM to make CRAS set the appropriate period size. > > > > > > How does it *not* work if you set other than period size 240, more > > > exactly? > > > > > > The hw_constraint to a fixed period size must be really an exception. > > > If you look at other drivers, you won't find any other doing such. > > > It already indicates that something is wrong. > > > > > > Usually the fixed period size comes from the hardware limitation and > > > defined in snd_pcm_hardware. Or, sometimes it's an alignment issue. > > > If you need more than that, you should doubt what's really not > > > working. > > > > > > > > > Takashi > > Thank Takashi, > > > > As I mentioned before, if the period size is set to 256, the measured > > rate of sample-consuming will be around 49627 fps. It causes underrun > > because the rate we set is 48000 fps. > > But this explanation rather sounds like the alignment problem. > However... > > > This behavior also happen on the > > other period rate except for 240. > > ... Why only 240? That's the next logical question. > If you have a clarification for it, it may be the rigid reason to > introduce such a hw constraint. > > > Takashi According to Brent, the DSP is using 240 period regardless the hw_param. If the period size is 256, DSP will read 256 samples each time but only consume 240 samples until the ring buffer of DSP is full. This behavior makes the samples in the ring buffer of kernel consumed quickly. Not sure whether the explanation is correct. Hi Brent, can you confirm it? Thanks, Yu-Hsuan
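For reference, the hw_params dump quoted above is internally consistent: PERIOD_BYTES, BUFFER_SIZE, BUFFER_BYTES and BUFFER_TIME all follow from RATE, FRAME_BITS, PERIOD_SIZE and PERIODS. The following is only a minimal sketch of that arithmetic in plain userspace C. The struct and helper names are hypothetical and are not part of ALSA or the kernel.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical container for this sketch; not an ALSA structure. */
struct pcm_geometry {
	uint32_t rate_hz;     /* RATE, frames per second */
	uint32_t frame_bits;  /* FRAME_BITS, bits per frame */
	uint32_t period_size; /* PERIOD_SIZE, frames per period */
	uint32_t periods;     /* PERIODS, periods per buffer */
};

/* Bytes in one period: frames per period times bytes per frame. */
static uint32_t period_bytes(const struct pcm_geometry *g)
{
	return g->period_size * (g->frame_bits / 8);
}

/* Buffer size in frames: frames per period times number of periods. */
static uint32_t buffer_size(const struct pcm_geometry *g)
{
	return g->period_size * g->periods;
}

/* Buffer size in bytes. */
static uint32_t buffer_bytes(const struct pcm_geometry *g)
{
	return buffer_size(g) * (g->frame_bits / 8);
}

/* Buffer duration in microseconds at the configured rate. */
static uint64_t buffer_time_us(const struct pcm_geometry *g)
{
	assert(g->rate_hz > 0);
	return (uint64_t)buffer_size(g) * 1000000u / g->rate_hz;
}
```

With the values from the dump (48000 Hz, 32 bits per frame, 240 frames per period, 852 periods), these helpers give 960 bytes per period, a buffer of 204480 frames or 817920 bytes, and a buffer time of 4260000 us.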
On Tue, Aug 11, 2020 at 05:35:45PM +0800, Yu-Hsuan Hsu wrote:
> On Tue, Aug 11, 2020 at 4:39 PM, Takashi Iwai <tiwai@suse.de> wrote:

> > ... Why only 240? That's the next logical question.
> > If you have a clarification for it, it may be the rigid reason to
> > introduce such a hw constraint.

> According to Brent, the DSP is using 240 period regardless of the
> hw_param. If the period size is 256, DSP will read 256 samples each
> time but only consume 240 samples until the ring buffer of DSP is
> full. This behavior makes the samples in the ring buffer of kernel
> consumed quickly.

> Not sure whether the explanation is correct. Hi Brent, can you confirm it?

This seems to be going round and round in circles. Userspace lets the
kernel pick the period size. If the period size isn't 240 (or a multiple
of it?), the DSP apparently doesn't pay proper attention to it, due to
internal hard coding in the DSP firmware which we can't change, so the
constraint logic needs to know about this DSP limitation - it seems like
none of this is going to change without something new going into the
mix? We at least need a new question to ask about the DSP firmware I
think.
>>> ... Why only 240? That's the next logical question.
>>> If you have a clarification for it, it may be the rigid reason to
>>> introduce such a hw constraint.
>
>> According to Brent, the DSP is using 240 period regardless of the
>> hw_param. If the period size is 256, DSP will read 256 samples each
>> time but only consume 240 samples until the ring buffer of DSP is
>> full. This behavior makes the samples in the ring buffer of kernel
>> consumed quickly.
>
>> Not sure whether the explanation is correct. Hi Brent, can you confirm it?
>
> This seems to be going round and round in circles. Userspace lets the
> kernel pick the period size, if the period size isn't 240 (or a multiple
> of it?) the DSP doesn't properly pay attention to that apparently due to
> internal hard coding in the DSP firmware which we can't change so the
> constraint logic needs to know about this DSP limitation - it seems like
> none of this is going to change without something new going into the
> mix? We at least need a new question to ask about the DSP firmware I
> think.

I just tested aplay -Dhw: on a Cyan Chromebook with the Ubuntu kernel 5.4,
and I see no issues with the 240 sample period. Same with 432, 960, 9600,
etc.

I also tried just for fun what happens with 256 samples, and I don't see any
underflows thrown either, so I am wondering what exactly the problem is?
Something's not adding up. I would definitely favor multiple of 1ms
periods, since it's the only case that was productized, but there's got to
be some side effect of how CRAS programs the hw_params.
root@chrx:~# aplay -Dhw:0,0 --period-size=240 --buffer-size=480 -v 1.wav
Playing WAVE '1.wav' : Signed 16 bit Little Endian, Rate 48000 Hz, Stereo
Hardware PCM card 0 'chtmax98090' device 0 subdevice 0
Its setup is:
  stream       : PLAYBACK
  access       : RW_INTERLEAVED
  format       : S16_LE
  subformat    : STD
  channels     : 2
  rate         : 48000
  exact rate   : 48000 (48000/1)
  msbits       : 16
  buffer_size  : 480
  period_size  : 240
  period_time  : 5000
  tstamp_mode  : NONE
  tstamp_type  : MONOTONIC
  period_step  : 1
  avail_min    : 240
  period_event : 0
  start_threshold  : 480
  stop_threshold   : 480
  silence_threshold: 0
  silence_size : 0
  boundary     : 8646911284551352320
  appl_ptr     : 0
  hw_ptr       : 0

root@chrx:~# aplay -Dhw:0,0 --period-size=256 --buffer-size=512 -v 1.wav
Playing WAVE '1.wav' : Signed 16 bit Little Endian, Rate 48000 Hz, Stereo
Hardware PCM card 0 'chtmax98090' device 0 subdevice 0
Its setup is:
  stream       : PLAYBACK
  access       : RW_INTERLEAVED
  format       : S16_LE
  subformat    : STD
  channels     : 2
  rate         : 48000
  exact rate   : 48000 (48000/1)
  msbits       : 16
  buffer_size  : 512
  period_size  : 256
  period_time  : 5333
  tstamp_mode  : NONE
  tstamp_type  : MONOTONIC
  period_step  : 1
  avail_min    : 256
  period_event : 0
  start_threshold  : 512
  stop_threshold   : 512
  silence_threshold: 0
  silence_size : 0
  boundary     : 4611686018427387904
  appl_ptr     : 0
  hw_ptr       : 0
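As a cross-check of the two runs above, the period_time values that aplay prints follow from period_size and the sample rate. This is a minimal sketch in plain C, assuming the value is truncated to whole microseconds as the output suggests. period_time_us() is a hypothetical helper for illustration, not an ALSA function.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Duration of one period in microseconds, truncated to an integer.
 * period_frames is the period size in frames, rate_hz the sample rate.
 */
static uint32_t period_time_us(uint32_t period_frames, uint32_t rate_hz)
{
	assert(rate_hz > 0);
	return (uint32_t)((uint64_t)period_frames * 1000000u / rate_hz);
}
```

At 48000 Hz this gives 5000 us for a 240-frame period and 5333 us for a 256-frame period, so only the first is a whole number of milliseconds.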
On Tue, Aug 11, 2020 at 11:54:38AM -0500, Pierre-Louis Bossart wrote:

> > constraint logic needs to know about this DSP limitation - it seems like
> > none of this is going to change without something new going into the
> > mix? We at least need a new question to ask about the DSP firmware I
> > think.

> I just tested aplay -Dhw: on a Cyan Chromebook with the Ubuntu kernel 5.4,
> and I see no issues with the 240 sample period. Same with 432, 960, 9600,
> etc.

> I also tried just for fun what happens with 256 samples, and I don't see any
> underflows thrown either, so I am wondering what exactly the problem is?
> Something's not adding up. I would definitely favor multiple of 1ms
> periods, since it's the only case that was productized, but there's got to
> be some side effect of how CRAS programs the hw_params.

Is it something that goes wrong with longer playbacks possibly (eg,
someone watching a feature film or something)?
On Wed, Aug 12, 2020 at 1:22 AM, Mark Brown <broonie@kernel.org> wrote:
>
> On Tue, Aug 11, 2020 at 11:54:38AM -0500, Pierre-Louis Bossart wrote:
>
> > > constraint logic needs to know about this DSP limitation - it
> > > seems like none of this is going to change without something
> > > new going into the mix? We at least need a new question to
> > > ask about the DSP firmware I think.
>
> > I just tested aplay -Dhw: on a Cyan Chromebook with the Ubuntu
> > kernel 5.4, and I see no issues with the 240 sample period.
> > Same with 432, 960, 9600, etc.
>
> > I also tried just for fun what happens with 256 samples, and I
> > don't see any underflows thrown either, so I am wondering what
> > exactly the problem is? Something's not adding up. I would
> > definitely favor multiple of 1ms periods, since it's the only
> > case that was productized, but there's got to be some side
> > effect of how CRAS programs the hw_params.
>
> Is it something that goes wrong with longer playbacks possibly
> (eg, someone watching a feature film or something)?

Thanks for testing!

After doing some experiments, I think I can identify the problem
more precisely.

1. aplay cannot reproduce this issue because it writes samples as
soon as there is some space in the buffer. However, you can add
--test-position to see how the delay grows with period size 256.

$ aplay -Dhw:1,0 --period-size=256 --buffer-size=480 /dev/zero -d 1 -f dat --test-position
Playing raw data '/dev/zero' : Signed 16 bit Little Endian, Rate 48000 Hz, Stereo
Suspicious buffer position (1 total): avail = 0, delay = 2064, buffer = 512
Suspicious buffer position (2 total): avail = 0, delay = 2064, buffer = 512
Suspicious buffer position (3 total): avail = 0, delay = 2096, buffer = 512
...

2. Since many samples have been moved to the DSP (delay), the
measured rate of the ring buffer is high. (I measured it with
alsa_conformance_test, which only tests the sampling rate in the
kernel ring buffer, not in the DSP.)

3. Since CRAS writes samples at a fixed frequency, this behavior
drains all samples from the ring buffer, which CRAS sees as an
underrun. (It seems that it is not a real underrun, because avail
does not exceed the buffer size. Maybe CRAS should also take the
delay into account.)

4. Although it is not a real underrun, the large delay is still a
big problem. Can we apply the constraint to fix it? Or is there a
better idea?

Thanks,
Yu-Hsuan
On Wed, 12 Aug 2020 05:09:58 +0200,
Yu-Hsuan Hsu wrote:
>
> On Wed, Aug 12, 2020 at 1:22 AM, Mark Brown <broonie@kernel.org> wrote:
> >
> > On Tue, Aug 11, 2020 at 11:54:38AM -0500, Pierre-Louis Bossart wrote:
> >
> > > > constraint logic needs to know about this DSP limitation - it
> > > > seems like none of this is going to change without something
> > > > new going into the mix? We at least need a new question to
> > > > ask about the DSP firmware I think.
> >
> > > I just tested aplay -Dhw: on a Cyan Chromebook with the Ubuntu
> > > kernel 5.4, and I see no issues with the 240 sample period.
> > > Same with 432, 960, 9600, etc.
> >
> > > I also tried just for fun what happens with 256 samples, and I
> > > don't see any underflows thrown either, so I am wondering what
> > > exactly the problem is? Something's not adding up. I would
> > > definitely favor multiple of 1ms periods, since it's the only
> > > case that was productized, but there's got to be some side
> > > effect of how CRAS programs the hw_params.
> >
> > Is it something that goes wrong with longer playbacks possibly
> > (eg, someone watching a feature film or something)?
>
> Thanks for testing!
>
> After doing some experiments, I think I can identify the problem
> more precisely.
>
> 1. aplay cannot reproduce this issue because it writes samples as
> soon as there is some space in the buffer. However, you can add
> --test-position to see how the delay grows with period size 256.
>
> $ aplay -Dhw:1,0 --period-size=256 --buffer-size=480 /dev/zero -d 1 -f dat --test-position
> Playing raw data '/dev/zero' : Signed 16 bit Little Endian, Rate 48000 Hz, Stereo
> Suspicious buffer position (1 total): avail = 0, delay = 2064, buffer = 512
> Suspicious buffer position (2 total): avail = 0, delay = 2064, buffer = 512
> Suspicious buffer position (3 total): avail = 0, delay = 2096, buffer = 512
> ...

Isn't this about the alignment of the buffer size against the
period size, not the period size itself? I.e., in the example
above, the buffer size isn't a multiple of the period size, and
the DSP can't handle it when the position crosses the end of the
buffer partway through a period.

If that's the problem (and it's an oft-seen restriction), the
right constraint is
  snd_pcm_hw_constraint_integer(runtime, SNDRV_PCM_HW_PARAM_PERIODS);


Takashi

> 2. Since many samples have been moved to the DSP (delay), the
> measured rate of the ring buffer is high. (I measured it with
> alsa_conformance_test, which only tests the sampling rate in the
> kernel ring buffer, not in the DSP.)
>
> 3. Since CRAS writes samples at a fixed frequency, this behavior
> drains all samples from the ring buffer, which CRAS sees as an
> underrun. (It seems that it is not a real underrun, because avail
> does not exceed the buffer size. Maybe CRAS should also take the
> delay into account.)
>
> 4. Although it is not a real underrun, the large delay is still a
> big problem. Can we apply the constraint to fix it? Or is there a
> better idea?
>
> Thanks,
> Yu-Hsuan
>
On Wed, Aug 12, 2020 at 2:14 PM, Takashi Iwai <tiwai@suse.de> wrote:
>
> On Wed, 12 Aug 2020 05:09:58 +0200,
> Yu-Hsuan Hsu wrote:
> >
> > On Wed, Aug 12, 2020 at 1:22 AM, Mark Brown <broonie@kernel.org> wrote:
> > >
> > > On Tue, Aug 11, 2020 at 11:54:38AM -0500, Pierre-Louis Bossart wrote:
> > >
> > > > > constraint logic needs to know about this DSP limitation - it
> > > > > seems like none of this is going to change without something
> > > > > new going into the mix? We at least need a new question to
> > > > > ask about the DSP firmware I think.
> > >
> > > > I just tested aplay -Dhw: on a Cyan Chromebook with the Ubuntu
> > > > kernel 5.4, and I see no issues with the 240 sample period.
> > > > Same with 432, 960, 9600, etc.
> > >
> > > > I also tried just for fun what happens with 256 samples, and I
> > > > don't see any underflows thrown either, so I am wondering what
> > > > exactly the problem is? Something's not adding up. I would
> > > > definitely favor multiple of 1ms periods, since it's the only
> > > > case that was productized, but there's got to be some side
> > > > effect of how CRAS programs the hw_params.
> > >
> > > Is it something that goes wrong with longer playbacks possibly
> > > (eg, someone watching a feature film or something)?
> >
> > Thanks for testing!
> >
> > After doing some experiments, I think I can identify the problem
> > more precisely.
> >
> > 1. aplay cannot reproduce this issue because it writes samples as
> > soon as there is some space in the buffer. However, you can add
> > --test-position to see how the delay grows with period size 256.
> >
> > $ aplay -Dhw:1,0 --period-size=256 --buffer-size=480 /dev/zero -d 1 -f dat --test-position
> > Playing raw data '/dev/zero' : Signed 16 bit Little Endian, Rate 48000 Hz, Stereo
> > Suspicious buffer position (1 total): avail = 0, delay = 2064, buffer = 512
> > Suspicious buffer position (2 total): avail = 0, delay = 2064, buffer = 512
> > Suspicious buffer position (3 total): avail = 0, delay = 2096, buffer = 512
> > ...
>
> Isn't this about the alignment of the buffer size against the
> period size, not the period size itself? I.e., in the example
> above, the buffer size isn't a multiple of the period size, and
> the DSP can't handle it when the position crosses the end of the
> buffer partway through a period.
>
> If that's the problem (and it's an oft-seen restriction), the
> right constraint is
>   snd_pcm_hw_constraint_integer(runtime, SNDRV_PCM_HW_PARAM_PERIODS);
>
>
> Takashi

Oh, sorry for my typo. The issue happens no matter what buffer
size is set. Actually, even if I try to set 480, it is changed to
512 automatically.
Suspicious buffer position (1 total): avail = 0, delay = 2064, buffer = 512 <- this one is the buffer size

> > 2. Since many samples have been moved to the DSP (delay), the
> > measured rate of the ring buffer is high. (I measured it with
> > alsa_conformance_test, which only tests the sampling rate in the
> > kernel ring buffer, not in the DSP.)
> >
> > 3. Since CRAS writes samples at a fixed frequency, this behavior
> > drains all samples from the ring buffer, which CRAS sees as an
> > underrun. (It seems that it is not a real underrun, because avail
> > does not exceed the buffer size. Maybe CRAS should also take the
> > delay into account.)
> >
> > 4. Although it is not a real underrun, the large delay is still a
> > big problem. Can we apply the constraint to fix it? Or is there a
> > better idea?
> >
> > Thanks,
> > Yu-Hsuan
> >
On Wed, 12 Aug 2020 08:53:42 +0200,
Yu-Hsuan Hsu wrote:
>
> On Wed, Aug 12, 2020 at 2:14 PM, Takashi Iwai <tiwai@suse.de> wrote:
> >
> > On Wed, 12 Aug 2020 05:09:58 +0200,
> > Yu-Hsuan Hsu wrote:
> > >
> > > On Wed, Aug 12, 2020 at 1:22 AM, Mark Brown <broonie@kernel.org> wrote:
> > > >
> > > > On Tue, Aug 11, 2020 at 11:54:38AM -0500, Pierre-Louis Bossart wrote:
> > > >
> > > > > > constraint logic needs to know about this DSP limitation - it
> > > > > > seems like none of this is going to change without something
> > > > > > new going into the mix? We at least need a new question to
> > > > > > ask about the DSP firmware I think.
> > > >
> > > > > I just tested aplay -Dhw: on a Cyan Chromebook with the Ubuntu
> > > > > kernel 5.4, and I see no issues with the 240 sample period.
> > > > > Same with 432, 960, 9600, etc.
> > > >
> > > > > I also tried just for fun what happens with 256 samples, and I
> > > > > don't see any underflows thrown either, so I am wondering what
> > > > > exactly the problem is? Something's not adding up. I would
> > > > > definitely favor multiple of 1ms periods, since it's the only
> > > > > case that was productized, but there's got to be some side
> > > > > effect of how CRAS programs the hw_params.
> > > >
> > > > Is it something that goes wrong with longer playbacks possibly
> > > > (eg, someone watching a feature film or something)?
> > >
> > > Thanks for testing!
> > >
> > > After doing some experiments, I think I can identify the problem
> > > more precisely.
> > >
> > > 1. aplay cannot reproduce this issue because it writes samples as
> > > soon as there is some space in the buffer. However, you can add
> > > --test-position to see how the delay grows with period size 256.
> > >
> > > $ aplay -Dhw:1,0 --period-size=256 --buffer-size=480 /dev/zero -d 1 -f dat --test-position
> > > Playing raw data '/dev/zero' : Signed 16 bit Little Endian, Rate 48000 Hz, Stereo
> > > Suspicious buffer position (1 total): avail = 0, delay = 2064, buffer = 512
> > > Suspicious buffer position (2 total): avail = 0, delay = 2064, buffer = 512
> > > Suspicious buffer position (3 total): avail = 0, delay = 2096, buffer = 512
> > > ...
> >
> > Isn't this about the alignment of the buffer size against the
> > period size, not the period size itself? I.e., in the example
> > above, the buffer size isn't a multiple of the period size, and
> > the DSP can't handle it when the position crosses the end of the
> > buffer partway through a period.
> >
> > If that's the problem (and it's an oft-seen restriction), the
> > right constraint is
> >   snd_pcm_hw_constraint_integer(runtime, SNDRV_PCM_HW_PARAM_PERIODS);
> >
> >
> > Takashi
>
> Oh, sorry for my typo. The issue happens no matter what buffer
> size is set. Actually, even if I try to set 480, it is changed to
> 512 automatically.
> Suspicious buffer position (1 total): avail = 0, delay = 2064, buffer = 512 <- this one is the buffer size

OK, then it means that the buffer size alignment is already in
place.

And this large delay won't happen if you use period size 240?


Takashi

> > > 2. Since many samples have been moved to the DSP (delay), the
> > > measured rate of the ring buffer is high. (I measured it with
> > > alsa_conformance_test, which only tests the sampling rate in the
> > > kernel ring buffer, not in the DSP.)
> > >
> > > 3. Since CRAS writes samples at a fixed frequency, this behavior
> > > drains all samples from the ring buffer, which CRAS sees as an
> > > underrun. (It seems that it is not a real underrun, because avail
> > > does not exceed the buffer size. Maybe CRAS should also take the
> > > delay into account.)
> > >
> > > 4. Although it is not a real underrun, the large delay is still a
> > > big problem. Can we apply the constraint to fix it? Or is there a
> > > better idea?
> > >
> > > Thanks,
> > > Yu-Hsuan
> > >
> >
On Wed, Aug 12, 2020 at 2:58 PM, Takashi Iwai <tiwai@suse.de> wrote:
>
> On Wed, 12 Aug 2020 08:53:42 +0200,
> Yu-Hsuan Hsu wrote:
> >
> > On Wed, Aug 12, 2020 at 2:14 PM, Takashi Iwai <tiwai@suse.de> wrote:
> > >
> > > On Wed, 12 Aug 2020 05:09:58 +0200,
> > > Yu-Hsuan Hsu wrote:
> > > >
> > > > On Wed, Aug 12, 2020 at 1:22 AM, Mark Brown <broonie@kernel.org> wrote:
> > > > >
> > > > > On Tue, Aug 11, 2020 at 11:54:38AM -0500, Pierre-Louis Bossart wrote:
> > > > >
> > > > > > > constraint logic needs to know about this DSP limitation - it
> > > > > > > seems like none of this is going to change without something
> > > > > > > new going into the mix? We at least need a new question to
> > > > > > > ask about the DSP firmware I think.
> > > > >
> > > > > > I just tested aplay -Dhw: on a Cyan Chromebook with the Ubuntu
> > > > > > kernel 5.4, and I see no issues with the 240 sample period.
> > > > > > Same with 432, 960, 9600, etc.
> > > > >
> > > > > > I also tried just for fun what happens with 256 samples, and I
> > > > > > don't see any underflows thrown either, so I am wondering what
> > > > > > exactly the problem is? Something's not adding up. I would
> > > > > > definitely favor multiple of 1ms periods, since it's the only
> > > > > > case that was productized, but there's got to be some side
> > > > > > effect of how CRAS programs the hw_params.
> > > > >
> > > > > Is it something that goes wrong with longer playbacks possibly
> > > > > (eg, someone watching a feature film or something)?
> > > >
> > > > Thanks for testing!
> > > >
> > > > After doing some experiments, I think I can identify the problem
> > > > more precisely.
> > > >
> > > > 1. aplay cannot reproduce this issue because it writes samples as
> > > > soon as there is some space in the buffer. However, you can add
> > > > --test-position to see how the delay grows with period size 256.
> > > >
> > > > $ aplay -Dhw:1,0 --period-size=256 --buffer-size=480 /dev/zero -d 1 -f dat --test-position
> > > > Playing raw data '/dev/zero' : Signed 16 bit Little Endian, Rate 48000 Hz, Stereo
> > > > Suspicious buffer position (1 total): avail = 0, delay = 2064, buffer = 512
> > > > Suspicious buffer position (2 total): avail = 0, delay = 2064, buffer = 512
> > > > Suspicious buffer position (3 total): avail = 0, delay = 2096, buffer = 512
> > > > ...
> > >
> > > Isn't this about the alignment of the buffer size against the
> > > period size, not the period size itself? I.e., in the example
> > > above, the buffer size isn't a multiple of the period size, and
> > > the DSP can't handle it when the position crosses the end of the
> > > buffer partway through a period.
> > >
> > > If that's the problem (and it's an oft-seen restriction), the
> > > right constraint is
> > >   snd_pcm_hw_constraint_integer(runtime, SNDRV_PCM_HW_PARAM_PERIODS);
> > >
> > >
> > > Takashi
> >
> > Oh, sorry for my typo. The issue happens no matter what buffer
> > size is set. Actually, even if I try to set 480, it is changed to
> > 512 automatically.
> > Suspicious buffer position (1 total): avail = 0, delay = 2064, buffer = 512 <- this one is the buffer size
>
> OK, then it means that the buffer size alignment is already in
> place.
>
> And this large delay won't happen if you use period size 240?
>
>
> Takashi

Yes! If I set the period size to 240, it will not print
"Suspicious buffer position ..."

Yu-Hsuan

> > > > 2. Since many samples have been moved to the DSP (delay), the
> > > > measured rate of the ring buffer is high. (I measured it with
> > > > alsa_conformance_test, which only tests the sampling rate in the
> > > > kernel ring buffer, not in the DSP.)
> > > >
> > > > 3. Since CRAS writes samples at a fixed frequency, this behavior
> > > > drains all samples from the ring buffer, which CRAS sees as an
> > > > underrun. (It seems that it is not a real underrun, because avail
> > > > does not exceed the buffer size. Maybe CRAS should also take the
> > > > delay into account.)
> > > >
> > > > 4. Although it is not a real underrun, the large delay is still a
> > > > big problem. Can we apply the constraint to fix it? Or is there a
> > > > better idea?
> > > >
> > > > Thanks,
> > > > Yu-Hsuan
> > > >
> > >
On Wed, 12 Aug 2020 09:43:22 +0200,
Yu-Hsuan Hsu wrote:
>
> On Wed, Aug 12, 2020 at 2:58 PM, Takashi Iwai <tiwai@suse.de> wrote:
> >
> > On Wed, 12 Aug 2020 08:53:42 +0200,
> > Yu-Hsuan Hsu wrote:
> > >
> > > On Wed, Aug 12, 2020 at 2:14 PM, Takashi Iwai <tiwai@suse.de> wrote:
> > > >
> > > > On Wed, 12 Aug 2020 05:09:58 +0200,
> > > > Yu-Hsuan Hsu wrote:
> > > > >
> > > > > On Wed, Aug 12, 2020 at 1:22 AM, Mark Brown <broonie@kernel.org> wrote:
> > > > > >
> > > > > > On Tue, Aug 11, 2020 at 11:54:38AM -0500, Pierre-Louis Bossart wrote:
> > > > > >
> > > > > > > > constraint logic needs to know about this DSP limitation - it
> > > > > > > > seems like none of this is going to change without something
> > > > > > > > new going into the mix? We at least need a new question to
> > > > > > > > ask about the DSP firmware I think.
> > > > > >
> > > > > > > I just tested aplay -Dhw: on a Cyan Chromebook with the Ubuntu
> > > > > > > kernel 5.4, and I see no issues with the 240 sample period.
> > > > > > > Same with 432, 960, 9600, etc.
> > > > > >
> > > > > > > I also tried just for fun what happens with 256 samples, and I
> > > > > > > don't see any underflows thrown either, so I am wondering what
> > > > > > > exactly the problem is? Something's not adding up. I would
> > > > > > > definitely favor multiple of 1ms periods, since it's the only
> > > > > > > case that was productized, but there's got to be some side
> > > > > > > effect of how CRAS programs the hw_params.
> > > > > >
> > > > > > Is it something that goes wrong with longer playbacks possibly
> > > > > > (eg, someone watching a feature film or something)?
> > > > >
> > > > > Thanks for testing!
> > > > >
> > > > > After doing some experiments, I think I can identify the problem
> > > > > more precisely.
> > > > >
> > > > > 1. aplay cannot reproduce this issue because it writes samples as
> > > > > soon as there is some space in the buffer. However, you can add
> > > > > --test-position to see how the delay grows with period size 256.
> > > > >
> > > > > $ aplay -Dhw:1,0 --period-size=256 --buffer-size=480 /dev/zero -d 1 -f dat --test-position
> > > > > Playing raw data '/dev/zero' : Signed 16 bit Little Endian, Rate 48000 Hz, Stereo
> > > > > Suspicious buffer position (1 total): avail = 0, delay = 2064, buffer = 512
> > > > > Suspicious buffer position (2 total): avail = 0, delay = 2064, buffer = 512
> > > > > Suspicious buffer position (3 total): avail = 0, delay = 2096, buffer = 512
> > > > > ...
> > > >
> > > > Isn't this about the alignment of the buffer size against the
> > > > period size, not the period size itself? I.e., in the example
> > > > above, the buffer size isn't a multiple of the period size, and
> > > > the DSP can't handle it when the position crosses the end of the
> > > > buffer partway through a period.
> > > >
> > > > If that's the problem (and it's an oft-seen restriction), the
> > > > right constraint is
> > > >   snd_pcm_hw_constraint_integer(runtime, SNDRV_PCM_HW_PARAM_PERIODS);
> > > >
> > > >
> > > > Takashi
> > >
> > > Oh, sorry for my typo. The issue happens no matter what buffer
> > > size is set. Actually, even if I try to set 480, it is changed to
> > > 512 automatically.
> > > Suspicious buffer position (1 total): avail = 0, delay = 2064, buffer = 512 <- this one is the buffer size
> >
> > OK, then it means that the buffer size alignment is already in
> > place.
> >
> > And this large delay won't happen if you use period size 240?
> >
> >
> > Takashi
>
> Yes! If I set the period size to 240, it will not print
> "Suspicious buffer position ..."

So it sounds like the DSP handles the delay report incorrectly.
That leads to another question: the driver supports both SOF and
SST. Is there a difference in behavior between the two DSPs wrt
this delay issue?


Takashi

>
> Yu-Hsuan
>
> > > > > 2. Since many samples have been moved to the DSP (delay), the
> > > > > measured rate of the ring buffer is high. (I measured it with
> > > > > alsa_conformance_test, which only tests the sampling rate in the
> > > > > kernel ring buffer, not in the DSP.)
> > > > >
> > > > > 3. Since CRAS writes samples at a fixed frequency, this behavior
> > > > > drains all samples from the ring buffer, which CRAS sees as an
> > > > > underrun. (It seems that it is not a real underrun, because avail
> > > > > does not exceed the buffer size. Maybe CRAS should also take the
> > > > > delay into account.)
> > > > >
> > > > > 4. Although it is not a real underrun, the large delay is still a
> > > > > big problem. Can we apply the constraint to fix it? Or is there a
> > > > > better idea?
> > > > >
> > > > > Thanks,
> > > > > Yu-Hsuan
> > > > >
> > > >
>>>>>> After doing some experiments, I think I can identify the problem
>>>>>> more precisely.
>>>>>> 1. aplay cannot reproduce this issue because it writes samples as
>>>>>> soon as there is some space in the buffer. However, you can add
>>>>>> --test-position to see how the delay grows with period size 256.
>>>>>> $ aplay -Dhw:1,0 --period-size=256 --buffer-size=480 /dev/zero -d 1 -f dat --test-position
>>>>>> Playing raw data '/dev/zero' : Signed 16 bit Little Endian, Rate 48000 Hz, Stereo
>>>>>> Suspicious buffer position (1 total): avail = 0, delay = 2064, buffer = 512
>>>>>> Suspicious buffer position (2 total): avail = 0, delay = 2064, buffer = 512
>>>>>> Suspicious buffer position (3 total): avail = 0, delay = 2096, buffer = 512
>>>>>> ...
>>>>>
>>>>> Isn't this about the alignment of the buffer size against the
>>>>> period size, not the period size itself? I.e., in the example
>>>>> above, the buffer size isn't a multiple of the period size, and
>>>>> the DSP can't handle it when the position crosses the end of the
>>>>> buffer partway through a period.
>>>>>
>>>>> If that's the problem (and it's an oft-seen restriction), the
>>>>> right constraint is
>>>>>   snd_pcm_hw_constraint_integer(runtime, SNDRV_PCM_HW_PARAM_PERIODS);
>>>>>
>>>>>
>>>>> Takashi
>>>> Oh, sorry for my typo. The issue happens no matter what buffer
>>>> size is set. Actually, even if I try to set 480, it is changed to
>>>> 512 automatically.
>>>> Suspicious buffer position (1 total): avail = 0, delay = 2064, buffer = 512 <- this one is the buffer size
>>>
>>> OK, then it means that the buffer size alignment is already in
>>> place.
>>>
>>> And this large delay won't happen if you use period size 240?
>>>
>>>
>>> Takashi
>> Yes! If I set the period size to 240, it will not print
>> "Suspicious buffer position ..."
>
> So it sounds like the DSP handles the delay report incorrectly.
> That leads to another question: the driver supports both SOF and
> SST. Is there a difference in behavior between the two DSPs wrt
> this delay issue?

I still don't get what the issue is. The two following cases work fine
with the SST/Atom driver:

root@chrx:~# aplay -Dhw:0,0 --period-size=240 --buffer-size=480 /dev/zero -d 2 -f dat --test-position
Playing raw data '/dev/zero' : Signed 16 bit Little Endian, Rate 48000 Hz, Stereo

root@chrx:~# aplay -Dhw:0,0 --period-size=960 --buffer-size=4800 /dev/zero -d 2 -f dat --test-position
Playing raw data '/dev/zero' : Signed 16 bit Little Endian, Rate 48000 Hz, Stereo

The existing code has this:

	/* Make sure, that the period size is always even */
	snd_pcm_hw_constraint_step(substream->runtime, 0,
				   SNDRV_PCM_HW_PARAM_PERIODS, 2);

	return snd_pcm_hw_constraint_integer(runtime,
					 SNDRV_PCM_HW_PARAM_PERIODS);

and with the addition of the period size being a multiple of 1ms, all
requirements should be met?
On Wed, 12 Aug 2020 16:46:40 +0200, Pierre-Louis Bossart wrote:
> >
> > So it sounds like DSP handles the delay report incorrectly.
> > Then it comes to another question: the driver supports both SOF and
> > SST. Is there the behavior difference between both DSPs wrt this
> > delay issue?
>
> I still don't get what the issue is. The two following cases work fine
> with the SST/Atom driver:
>
> root@chrx:~# aplay -Dhw:0,0 --period-size=240 --buffer-size=480
> /dev/zero -d 2 -f dat --test-position
> Playing raw data '/dev/zero' : Signed 16 bit Little Endian, Rate 48000
> Hz, Stereo
> root@chrx:~# aplay -Dhw:0,0 --period-size=960 --buffer-size=4800
> /dev/zero -d 2 -f dat --test-position
> Playing raw data '/dev/zero' : Signed 16 bit Little Endian, Rate 48000
> Hz, Stereo

What about --period-size=256 --buffer-size=512 with --test-position?
Can you reproduce the problem on your side?

> The existing code has this:
>
> /* Make sure, that the period size is always even */
> snd_pcm_hw_constraint_step(substream->runtime, 0,
> SNDRV_PCM_HW_PARAM_PERIODS, 2);
>
> return snd_pcm_hw_constraint_integer(runtime,
> SNDRV_PCM_HW_PARAM_PERIODS);
>
> and with the addition of period size being a multiple of 1ms all
> requirements should be met?

I also wonder what's really missing, too :)

BTW, I took a look back at the thread, and CRAS seems to be using a very
large buffer, namely:
[ 52.434791] sound pcmC1D0p: PERIOD_SIZE [240:240]
[ 52.434802] sound pcmC1D0p: BUFFER_SIZE [204480:204480]


Takashi
On 8/12/20 9:55 AM, Takashi Iwai wrote:
> On Wed, 12 Aug 2020 16:46:40 +0200,
> Pierre-Louis Bossart wrote:
>>>
>>> So it sounds like DSP handles the delay report incorrectly.
>>> Then it comes to another question: the driver supports both SOF and
>>> SST. Is there the behavior difference between both DSPs wrt this
>>> delay issue?
>>
>> I still don't get what the issue is. The two following cases work fine
>> with the SST/Atom driver:
>>
>> root@chrx:~# aplay -Dhw:0,0 --period-size=240 --buffer-size=480
>> /dev/zero -d 2 -f dat --test-position
>> Playing raw data '/dev/zero' : Signed 16 bit Little Endian, Rate 48000
>> Hz, Stereo
>> root@chrx:~# aplay -Dhw:0,0 --period-size=960 --buffer-size=4800
>> /dev/zero -d 2 -f dat --test-position
>> Playing raw data '/dev/zero' : Signed 16 bit Little Endian, Rate 48000
>> Hz, Stereo
>
> What if with --period-size=256 --buffer-size=512 and --test-position?
> Can you reproduce the problem in your side?

Yes indeed with the existing driver:

root@chrx:~# aplay -Dhw:0,0 --period-size=256 --buffer-size=512 /dev/zero -d 2 -f dat --test-position
Playing raw data '/dev/zero' : Signed 16 bit Little Endian, Rate 48000 Hz, Stereo
underrun!!! (at least 0.312 ms long)
underrun!!! (at least 0.326 ms long)
Suspicious buffer position (1 total): avail = 0, delay = 2064, buffer = 512
Suspicious buffer position (2 total): avail = 0, delay = 2064, buffer = 512
Suspicious buffer position (3 total): avail = 0, delay = 2080, buffer = 512
Suspicious buffer position (4 total): avail = 0, delay = 2080, buffer = 512
Suspicious buffer position (5 total): avail = 0, delay = 2096, buffer = 512
Suspicious buffer position (6 total): avail = 0, delay = 2096, buffer = 512

but the new constraint to force a 1ms step added in the patch1 should
preclude this from happening.
>> The existing code has this:
>>
>> /* Make sure, that the period size is always even */
>> snd_pcm_hw_constraint_step(substream->runtime, 0,
>> SNDRV_PCM_HW_PARAM_PERIODS, 2);
>>
>> return snd_pcm_hw_constraint_integer(runtime,
>> SNDRV_PCM_HW_PARAM_PERIODS);
>>
>> and with the addition of period size being a multiple of 1ms all
>> requirements should be met?
>
> I also wonder what's really missing, too :)
>
> BTW, I took a look back at the thread, and CRAS seems using a very
> large buffer, namely:
> [ 52.434791] sound pcmC1D0p: PERIOD_SIZE [240:240]
> [ 52.434802] sound pcmC1D0p: BUFFER_SIZE [204480:204480]

yes, that's 852 periods and 4.260 seconds. Never seen such values :-)
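The "852 periods and 4.260 seconds" figures follow directly from the logged PERIOD_SIZE and BUFFER_SIZE values at the 48000 Hz rate used in this thread. A minimal sketch of that arithmetic is below; the helper names `periods_in_buffer` and `buffer_duration_ms` are hypothetical and exist only for this illustration, they are not ALSA functions.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: number of whole periods that fit into a buffer.
 * Both sizes are in frames. */
uint32_t periods_in_buffer(uint32_t buffer_frames, uint32_t period_frames)
{
	assert(period_frames > 0);
	return buffer_frames / period_frames;
}

/* Hypothetical helper: buffer duration in milliseconds at rate_hz.
 * The 64-bit intermediate avoids overflow for large buffers. */
uint32_t buffer_duration_ms(uint32_t buffer_frames, uint32_t rate_hz)
{
	assert(rate_hz > 0);
	return (uint32_t)((uint64_t)buffer_frames * 1000u / rate_hz);
}
```

With the logged values, `periods_in_buffer(204480, 240)` gives 852 and `buffer_duration_ms(204480, 48000)` gives 4260, i.e. 4.260 seconds.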
> >
> > I also wonder what's really missing, too :)
> >
> > BTW, I took a look back at the thread, and CRAS seems using a very
> > large buffer, namely:
> > [ 52.434791] sound pcmC1D0p: PERIOD_SIZE [240:240]
> > [ 52.434802] sound pcmC1D0p: BUFFER_SIZE [204480:204480]
>
> yes, that's 852 periods and 4.260 seconds. Never seen such values :-)

CRAS calls snd_pcm_hw_params_set_buffer_size_max() to use as large a
buffer as possible. So the period size ends up being an arbitrary number
that differs between platforms. On the Atom SST platform it happens to be
256, and on the CML SOF platform it is 1056, for example.

Regards,
Brent
On 8/12/20 11:08 AM, Lu, Brent wrote:
>>>
>>> I also wonder what's really missing, too :)
>>>
>>> BTW, I took a look back at the thread, and CRAS seems using a very
>>> large buffer, namely:
>>> [ 52.434791] sound pcmC1D0p: PERIOD_SIZE [240:240]
>>> [ 52.434802] sound pcmC1D0p: BUFFER_SIZE [204480:204480]
>>
>> yes, that's 852 periods and 4.260 seconds. Never seen such values :-)
>
> CRAS calls snd_pcm_hw_params_set_buffer_size_max() to use as large
> buffer as possible. So the period size is an arbitrary number in different
> platforms. Atom SST platform happens to be 256, and CML SOF platform
> is 1056 for example.

OK, but earlier in this thread it was mentioned that values such as 432
are not suitable. The statement above seems to mean that the actual period
value is a "don't care", so I don't quite see why this specific patch2
restricting the value to 240 is necessary. Patch1 is needed for sure;
Patch2 is where Takashi and I are not convinced.
On Thu, Aug 13, 2020 at 12:38 AM, Pierre-Louis Bossart <pierre-louis.bossart@linux.intel.com> wrote:
>
>
>
> On 8/12/20 11:08 AM, Lu, Brent wrote:
> >>>
> >>> I also wonder what's really missing, too :)
> >>>
> >>> BTW, I took a look back at the thread, and CRAS seems using a very
> >>> large buffer, namely:
> >>> [ 52.434791] sound pcmC1D0p: PERIOD_SIZE [240:240]
> >>> [ 52.434802] sound pcmC1D0p: BUFFER_SIZE [204480:204480]
> >>
> >> yes, that's 852 periods and 4.260 seconds. Never seen such values :-)
> >
> > CRAS calls snd_pcm_hw_params_set_buffer_size_max() to use as large
> > buffer as possible. So the period size is an arbitrary number in different
> > platforms. Atom SST platform happens to be 256, and CML SOF platform
> > is 1056 for example.
>
> ok, but earlier in this thread it was mentioned that values such as 432
> are not suitable. the statement above seems to mean the period actual
> value is a "don't care", so I don't quite see why this specific patch2
> restricting the value to 240 is necessary. Patch1 is needed for sure,
> Patch2 is where Takashi and I are not convinced.

I have downloaded patch1, but it does not fix the issue. After applying
patch1, the default period size changes to 320. However, period size 320
shows the same issue. (It can be verified with aplay.)
> > >
> > > CRAS calls snd_pcm_hw_params_set_buffer_size_max() to use as large
> > > buffer as possible. So the period size is an arbitrary number in
> > > different platforms. Atom SST platform happens to be 256, and CML
> > > SOF platform is 1056 for example.
> >
> > ok, but earlier in this thread it was mentioned that values such as
> > 432 are not suitable. the statement above seems to mean the period
> > actual value is a "don't care", so I don't quite see why this specific
> > patch2 restricting the value to 240 is necessary. Patch1 is needed for
> > sure,
> > Patch2 is where Takashi and I are not convinced.
>
> I have downloaded the patch1 but it does not work. After applying patch1,
> the default period size changes to 320. However, it also has the same issue
> with period size 320. (It can be verified by aplay.)

The period_size is related to the audio latency, so it is decided by the
application according to the use case it is running. That's why there are
concerns about patch 2, and also why you cannot find similar constraints
in other machine drivers.

Another problem is the buffer size. A buffer that is too large does not
just waste memory. It also creates problems for the memory allocator,
since contiguous pages are not always available. Using a small
period_count like 2 or 4 should be sufficient for audio data transfer.

buffer_size = period_size * period_count * 1000000 / sample_rate;
snd_pcm_hw_params_set_buffer_time_near(mPcmDevice, params, &buffer_size, NULL);

And one more problem here: you need to decide period_size and period_count
first in order to calculate the buffer size...

Regards,
Brent
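The formula above yields a buffer duration in microseconds, which is the unit that snd_pcm_hw_params_set_buffer_time_near() takes for its value argument. A minimal sketch of just the arithmetic follows, without any alsa-lib calls; `buffer_time_us` is a hypothetical helper name chosen for this illustration, and the 64-bit intermediate is an assumption added here because the 32-bit product can overflow for large buffers.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper: buffer time in microseconds for period_count
 * periods of period_frames frames each, at rate_hz.
 * period_frames * period_count * 1000000 is computed in 64 bits,
 * since it can exceed 32 bits for large buffers.
 */
uint32_t buffer_time_us(uint32_t period_frames, uint32_t period_count,
			uint32_t rate_hz)
{
	assert(rate_hz > 0);
	return (uint32_t)((uint64_t)period_frames * period_count * 1000000u
			  / rate_hz);
}
```

For example, a period of 240 frames at 48000 Hz gives 10000 us with 2 periods and 20000 us with 4 periods, which is far below the 4.26-second buffer discussed earlier in the thread.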
On Thu, Aug 13, 2020 at 3:55 PM, Lu, Brent <brent.lu@intel.com> wrote:
>
> > > >
> > > > CRAS calls snd_pcm_hw_params_set_buffer_size_max() to use as large
> > > > buffer as possible. So the period size is an arbitrary number in
> > > > different platforms. Atom SST platform happens to be 256, and CML
> > > > SOF platform is 1056 for example.
> > >
> > > ok, but earlier in this thread it was mentioned that values such as
> > > 432 are not suitable. the statement above seems to mean the period
> > > actual value is a "don't care", so I don't quite see why this specific
> > > patch2 restricting the value to 240 is necessary. Patch1 is needed for
> > > sure,
> > > Patch2 is where Takashi and I are not convinced.
> >
> > I have downloaded the patch1 but it does not work. After applying patch1,
> > the default period size changes to 320. However, it also has the same issue
> > with period size 320. (It can be verified by aplay.)
>
> The period_size is related to the audio latency so it's decided by application
> according to the use case it's running. That's why there are concerns about
> patch 2 and also you cannot find similar constraints in other machine driver.

You're right. However, the problem here is that the provided period size
does not work. Like 256, setting the period size to 320 also gives users
a large latency in the DSP ring buffer.

localhost ~ # aplay -Dhw:1,0 --period-size=320 --buffer-size=640 /dev/zero -d 1 -f dat --test-position
Playing raw data '/dev/zero' : Signed 16 bit Little Endian, Rate 48000 Hz, Stereo
Suspicious buffer position (1 total): avail = 0, delay = 2640, buffer = 640
Suspicious buffer position (2 total): avail = 0, delay = 2640, buffer = 640
Suspicious buffer position (3 total): avail = 0, delay = 2720, buffer = 640
...

>
> Another problem is the buffer size. Too large buffer is not just wasting memories.
> It also creates problems to memory allocator since continuous pages are not
> always there. Using a small period_count like 2 or 4 should be sufficient for audio
> data transfer.
>
> buffer_size = period_size * period_count * 1000000 / sample_rate;
> snd_pcm_hw_params_set_buffer_time_near(mPcmDevice, params, &buffer_size, NULL);
>
> And one more problem here: you need to decide period_size and period_count
> first in order to calculate the buffer size...

It's a good point. I will bring it up with our team and see whether we
can use a smaller buffer size. Thanks!

>
>
> Regards,
> Brent

Thanks,
Yu-Hsuan
On Thu, 13 Aug 2020 10:36:57 +0200, Yu-Hsuan Hsu wrote:
>
> On Thu, Aug 13, 2020 at 3:55 PM, Lu, Brent <brent.lu@intel.com> wrote:
> >
> > > > >
> > > > > CRAS calls snd_pcm_hw_params_set_buffer_size_max() to use as large
> > > > > buffer as possible. So the period size is an arbitrary number in
> > > > > different platforms. Atom SST platform happens to be 256, and CML
> > > > > SOF platform is 1056 for example.
> > > >
> > > > ok, but earlier in this thread it was mentioned that values such as
> > > > 432 are not suitable. the statement above seems to mean the period
> > > > actual value is a "don't care", so I don't quite see why this specific
> > > > patch2 restricting the value to 240 is necessary. Patch1 is needed for
> > > > sure,
> > > > Patch2 is where Takashi and I are not convinced.
> > >
> > > I have downloaded the patch1 but it does not work. After applying patch1,
> > > the default period size changes to 320. However, it also has the same issue
> > > with period size 320. (It can be verified by aplay.)
> >
> > The period_size is related to the audio latency so it's decided by application
> > according to the use case it's running. That's why there are concerns about
> > patch 2 and also you cannot find similar constraints in other machine driver.
> You're right. However, the problem here is the provided period size
> does not work. Like 256, setting the period size to 320 also makes
> users have big latency in the DSP ring buffer.
>
> localhost ~ # aplay -Dhw:1,0 --period-size=320 --buffer-size=640
> /dev/zero -d 1 -f dat --test-position
> Playing raw data '/dev/zero' : Signed 16 bit Little Endian, Rate 48000
> Hz, Stereo
> Suspicious buffer position (1 total): avail = 0, delay = 2640, buffer = 640
> Suspicious buffer position (2 total): avail = 0, delay = 2640, buffer = 640
> Suspicious buffer position (3 total): avail = 0, delay = 2720, buffer = 640
> ...

It means that the delay value returned from the driver is bogus.
I suppose it comes from the pcm_delay value calculated in
sst_calc_tstamp(), but I haven't followed the code closely yet. Maybe
checking the debug outputs can help to trace what's going wrong.


Takashi

>
> >
> > Another problem is the buffer size. Too large buffer is not just wasting memories.
> > It also creates problems to memory allocator since continuous pages are not
> > always there. Using a small period_count like 2 or 4 should be sufficient for audio
> > data transfer.
> >
> > buffer_size = period_size * period_count * 1000000 / sample_rate;
> > snd_pcm_hw_params_set_buffer_time_near(mPcmDevice, params, &buffer_size, NULL);
> >
> > And one more problem here: you need to decide period_size and period_count
> > first in order to calculate the buffer size...
> It's a good point. I will bring it up to our team and see whether we
> can use the smaller buffer size. Thanks!
> >
> >
> > Regards,
> > Brent
>
> Thanks,
> Yu-Hsuan
>
On 8/13/20 3:45 AM, Takashi Iwai wrote:
> On Thu, 13 Aug 2020 10:36:57 +0200,
> Yu-Hsuan Hsu wrote:
>>
>> On Thu, Aug 13, 2020 at 3:55 PM, Lu, Brent <brent.lu@intel.com> wrote:
>>>
>>>>>>
>>>>>> CRAS calls snd_pcm_hw_params_set_buffer_size_max() to use as large
>>>>>> buffer as possible. So the period size is an arbitrary number in
>>>>>> different platforms. Atom SST platform happens to be 256, and CML
>>>>>> SOF platform is 1056 for example.
>>>>>
>>>>> ok, but earlier in this thread it was mentioned that values such as
>>>>> 432 are not suitable. the statement above seems to mean the period
>>>>> actual value is a "don't care", so I don't quite see why this specific
>>>>> patch2 restricting the value to 240 is necessary. Patch1 is needed for
>>>>> sure,
>>>>> Patch2 is where Takashi and I are not convinced.
>>>>
>>>> I have downloaded the patch1 but it does not work. After applying patch1,
>>>> the default period size changes to 320. However, it also has the same issue
>>>> with period size 320. (It can be verified by aplay.)
>>>
>>> The period_size is related to the audio latency so it's decided by application
>>> according to the use case it's running. That's why there are concerns about
>>> patch 2 and also you cannot find similar constraints in other machine driver.
>> You're right. However, the problem here is the provided period size
>> does not work. Like 256, setting the period size to 320 also makes
>> users have big latency in the DSP ring buffer.
>>
>> localhost ~ # aplay -Dhw:1,0 --period-size=320 --buffer-size=640
>> /dev/zero -d 1 -f dat --test-position
>> Playing raw data '/dev/zero' : Signed 16 bit Little Endian, Rate 48000
>> Hz, Stereo
>> Suspicious buffer position (1 total): avail = 0, delay = 2640, buffer = 640
>> Suspicious buffer position (2 total): avail = 0, delay = 2640, buffer = 640
>> Suspicious buffer position (3 total): avail = 0, delay = 2720, buffer = 640
>> ...
>
> It means that the delay value returned from the driver is bogus.
> I suppose it comes pcm_delay value calculated in sst_calc_tstamp(),
> but haven't followed the code closely yet. Maybe checking the debug
> outputs can help to trace what's going wrong.

The problem is really that we add a constraint that the period size be a
multiple of 1ms, and it's not respected. 320 samples is not a valid
choice; I don't get how it ends up being selected. There's a glitch in
the matrix here.
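To make the "multiple of 1ms" requirement concrete: at 48000 Hz one millisecond is 48 frames, so a period size satisfies the constraint only if it is divisible by 48. The check below is a minimal sketch of that test; `frames_per_ms` and `period_is_ms_multiple` are hypothetical names for this illustration and are not the code from patch1, and the sketch assumes a sample rate that is a whole multiple of 1000 Hz.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: frames per millisecond.
 * Assumes rate_hz is a whole multiple of 1000 Hz (true for 48000). */
uint32_t frames_per_ms(uint32_t rate_hz)
{
	assert(rate_hz >= 1000 && rate_hz % 1000 == 0);
	return rate_hz / 1000;
}

/* Hypothetical helper: true if the period is a whole number of
 * milliseconds at the given rate. */
bool period_is_ms_multiple(uint32_t period_frames, uint32_t rate_hz)
{
	return period_frames % frames_per_ms(rate_hz) == 0;
}
```

Under this check, 240, 432 and 960 frames pass at 48000 Hz, while 256 and 320 do not (320 = 6 * 48 + 32), which is why 320 should not be selectable once the 1ms constraint is actually in effect.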
On Thu, Aug 13, 2020 at 8:57 PM, Pierre-Louis Bossart <pierre-louis.bossart@linux.intel.com> wrote:
>
>
>
> On 8/13/20 3:45 AM, Takashi Iwai wrote:
> > On Thu, 13 Aug 2020 10:36:57 +0200,
> > Yu-Hsuan Hsu wrote:
> >>
> >> On Thu, Aug 13, 2020 at 3:55 PM, Lu, Brent <brent.lu@intel.com> wrote:
> >>>
> >>>>>>
> >>>>>> CRAS calls snd_pcm_hw_params_set_buffer_size_max() to use as large
> >>>>>> buffer as possible. So the period size is an arbitrary number in
> >>>>>> different platforms. Atom SST platform happens to be 256, and CML
> >>>>>> SOF platform is 1056 for example.
> >>>>>
> >>>>> ok, but earlier in this thread it was mentioned that values such as
> >>>>> 432 are not suitable. the statement above seems to mean the period
> >>>>> actual value is a "don't care", so I don't quite see why this specific
> >>>>> patch2 restricting the value to 240 is necessary. Patch1 is needed for
> >>>>> sure,
> >>>>> Patch2 is where Takashi and I are not convinced.
> >>>>
> >>>> I have downloaded the patch1 but it does not work. After applying patch1,
> >>>> the default period size changes to 320. However, it also has the same issue
> >>>> with period size 320. (It can be verified by aplay.)
> >>>
> >>> The period_size is related to the audio latency so it's decided by application
> >>> according to the use case it's running. That's why there are concerns about
> >>> patch 2 and also you cannot find similar constraints in other machine driver.
> >> You're right. However, the problem here is the provided period size
> >> does not work. Like 256, setting the period size to 320 also makes
> >> users have big latency in the DSP ring buffer.
> >>
> >> localhost ~ # aplay -Dhw:1,0 --period-size=320 --buffer-size=640
> >> /dev/zero -d 1 -f dat --test-position
> >> Playing raw data '/dev/zero' : Signed 16 bit Little Endian, Rate 48000
> >> Hz, Stereo
> >> Suspicious buffer position (1 total): avail = 0, delay = 2640, buffer = 640
> >> Suspicious buffer position (2 total): avail = 0, delay = 2640, buffer = 640
> >> Suspicious buffer position (3 total): avail = 0, delay = 2720, buffer = 640
> >> ...
> >
> > It means that the delay value returned from the driver is bogus.
> > I suppose it comes pcm_delay value calculated in sst_calc_tstamp(),
> > but haven't followed the code closely yet. Maybe checking the debug
> > outputs can help to trace what's going wrong.
>
> the problem is really that we add a constraint that the period size be a
> multiple of 1ms, and it's not respected. 320 samples is not a valid
> choice, I don't get how it ends-up being selected? there's a glitch in
> the matrix here.
>
>

Oh sorry, I applied the wrong patch. With the correct patch, the default
period size is 432. With period size 432, running aplay with
--test-position does not show any errors. However, from
`cat /proc/asound/card1/pcm0p/sub0/status` we can see that the delay is
around 3000.

Here are all the period sizes I have tried. All buffer sizes are set to
2 * period size.

period size: 192, delay is a negative number. Not sure what happened.
period size: 240, delay is fixed at 960
period size: 288, delay is around 27XX
period size: 336, delay is around 27XX
period size: 384, delay is around 24XX (no errors from aplay)
period size: 432, delay is around 30XX (no errors from aplay)
period size: 480, delay is fixed at 3120 (no errors from aplay)
period size: 524, delay is around 31XX (no errors from aplay)

Not sure why the delay is around 50ms except for period size 240. Is
that normal?

Thanks,
Yu-Hsuan
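The delay values in the list above are frame counts, so relating them to the "around 50ms" remark needs a conversion at the 48000 Hz rate used in these tests. A minimal sketch follows, assuming the reported delay is in frames as ALSA normally reports it; `frames_to_us` is a hypothetical helper name used only for this illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: convert a frame count to microseconds at rate_hz.
 * The 64-bit intermediate avoids overflow for large frame counts. */
uint32_t frames_to_us(uint32_t frames, uint32_t rate_hz)
{
	assert(rate_hz > 0);
	return (uint32_t)((uint64_t)frames * 1000000u / rate_hz);
}
```

At 48000 Hz, the fixed delay of 960 frames seen with period size 240 is 20 ms, 2400 frames is 50 ms, and the fixed 3120 frames seen with period size 480 is 65 ms, so the non-240 cases sit roughly in the 50 to 65 ms range.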
diff --git a/sound/soc/intel/boards/cht_bsw_max98090_ti.c b/sound/soc/intel/boards/cht_bsw_max98090_ti.c
index 835e9bd..bf67254 100644
--- a/sound/soc/intel/boards/cht_bsw_max98090_ti.c
+++ b/sound/soc/intel/boards/cht_bsw_max98090_ti.c
@@ -283,8 +283,20 @@ static int cht_codec_fixup(struct snd_soc_pcm_runtime *rtd,
 
 static int cht_aif1_startup(struct snd_pcm_substream *substream)
 {
-	return snd_pcm_hw_constraint_single(substream->runtime,
+	int err;
+
+	/* Set period size to 240 to align with Atom design */
+	err = snd_pcm_hw_constraint_minmax(substream->runtime,
+			SNDRV_PCM_HW_PARAM_PERIOD_SIZE, 240, 240);
+	if (err < 0)
+		return err;
+
+	err = snd_pcm_hw_constraint_single(substream->runtime,
 			SNDRV_PCM_HW_PARAM_RATE, 48000);
+	if (err < 0)
+		return err;
+
+	return 0;
 }
 
 static int cht_max98090_headset_init(struct snd_soc_component *component)
diff --git a/sound/soc/intel/boards/cht_bsw_rt5645.c b/sound/soc/intel/boards/cht_bsw_rt5645.c
index b53c024..6e62f0d 100644
--- a/sound/soc/intel/boards/cht_bsw_rt5645.c
+++ b/sound/soc/intel/boards/cht_bsw_rt5645.c
@@ -414,8 +414,20 @@ static int cht_codec_fixup(struct snd_soc_pcm_runtime *rtd,
 
 static int cht_aif1_startup(struct snd_pcm_substream *substream)
 {
-	return snd_pcm_hw_constraint_single(substream->runtime,
+	int err;
+
+	/* Set period size to 240 to align with Atom design */
+	err = snd_pcm_hw_constraint_minmax(substream->runtime,
+			SNDRV_PCM_HW_PARAM_PERIOD_SIZE, 240, 240);
+	if (err < 0)
+		return err;
+
+	err = snd_pcm_hw_constraint_single(substream->runtime,
 			SNDRV_PCM_HW_PARAM_RATE, 48000);
+	if (err < 0)
+		return err;
+
+	return 0;
 }
 
 static const struct snd_soc_ops cht_aif1_ops = {