ALSA: pcm: fix buffer_bytes max constrained by preallocated bytes issue

Message ID 20200116045318.5498-1-yang.jie@linux.intel.com (mailing list archive)
State New, archived
Series ALSA: pcm: fix buffer_bytes max constrained by preallocated bytes issue

Commit Message

Keyon Jie Jan. 16, 2020, 4:53 a.m. UTC
With today's code, we preallocate the DMA buffer for substreams at the
pcm_new() stage; substream->buffer_bytes_max and substream->dma_max then
hold, respectively, the size actually preallocated and the maximum size
to which the DMA buffer can be expanded at the hw_params() stage.

At the pcm_open() stage, the maximum constraint of HW_PARAM_BUFFER_BYTES
is set to substream->buffer_bytes_max and returned to user space as the
upper bound of the HW_PARAM_BUFFER_BYTES interval. As a result, the user
cannot choose any buffer-bytes larger than the preallocated buffer size,
and the buffer reallocation never actually happens.

Change the rule to use substream->dma_max as the maximum constraint of
HW_PARAM_BUFFER_BYTES, which fixes the issue described above.

Signed-off-by: Keyon Jie <yang.jie@linux.intel.com>
---
 sound/core/pcm_native.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)
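
The effect described in the commit message can be sketched in user space.
This is only a simplified model, not the kernel implementation: struct
interval, struct substream_model, clamp_max(), rule_current() and
rule_patched() are hypothetical names invented for illustration, and the
real rule operates on struct snd_interval inside the hw_params refinement.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified stand-in for the kernel's interval type (min/max only). */
struct interval {
	unsigned int min;
	unsigned int max;
};

/* Hypothetical, trimmed-down substream holding only the two limits discussed. */
struct substream_model {
	size_t buffer_bytes_max; /* limit applied by the existing rule */
	size_t dma_max;          /* "max" argument given at preallocation time */
};

/*
 * Narrow the upper bound of "requested" to at most "limit".
 * Returns 1 if the interval changed, 0 if it did not, and -1 if the
 * interval became empty (min > max).
 */
static int clamp_max(struct interval *requested, size_t limit)
{
	assert(requested != NULL);
	if (requested->max <= limit)
		return 0;
	requested->max = (unsigned int)limit;
	return requested->min > requested->max ? -1 : 1;
}

/* Existing behaviour: clamp to buffer_bytes_max (the preallocated size). */
static int rule_current(const struct substream_model *s, struct interval *i)
{
	return clamp_max(i, s->buffer_bytes_max);
}

/* Behaviour proposed by the patch: clamp to dma_max instead. */
static int rule_patched(const struct substream_model *s, struct interval *i)
{
	return clamp_max(i, s->dma_max);
}
```

With a 64 KiB preallocation and a 4 MiB dma_max, a 1 MiB request is cut
down to 64 KiB by rule_current() but passes rule_patched() unchanged.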

Comments

Takashi Iwai Jan. 16, 2020, 7:15 a.m. UTC | #1
On Thu, 16 Jan 2020 05:53:18 +0100,
Keyon Jie wrote:
> 
> With today's code, we preallocate DMA buffer for substreams at pcm_new()
> stage, and the substream->buffer_bytes_max and substream->dma_max will
> save as the actually preallocated buffer size and maximum size that the
> dma buffer can be expanded by at hw_params() state, correspondingly.

No, it's the other way round: the former, buffer_bytes_max, is the max
size defined by the driver (i.e. passed in snd_pcm_hardware) and the
latter, dma_max, is the max preallocation size (passed to the
preallocation helper).

> At pcm_open() stage, the maximum constraint of HW_PARAM_BUFFER_BYTES is
> set to substream->buffer_bytes_max and returned to user space as the max
> interval of the HW_PARAM_BUFFER_BYTES, this will lead to issue that user
> can't choose any buffer-bytes larger than the preallocated buffer size,
> and the buffer reallocation will never happen actually.
> 
> Here change to use substream->dma_max as the maximum constraint of the
> HW_PARAM_BUFFER_BYTES and fix the issue mentioned above.

I don't think the logic in the current code you're changing is wrong.
If there is a problem, it must be somewhere else.

Might it rather be the FIXME code found in
snd_pcm_hw_constraints_complete()?


thanks,

Takashi

> Signed-off-by: Keyon Jie <yang.jie@linux.intel.com>
> ---
>  sound/core/pcm_native.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/sound/core/pcm_native.c b/sound/core/pcm_native.c
> index c375c41496f8..326e921006e7 100644
> --- a/sound/core/pcm_native.c
> +++ b/sound/core/pcm_native.c
> @@ -2301,7 +2301,7 @@ static int snd_pcm_hw_rule_buffer_bytes_max(struct snd_pcm_hw_params *params,
>  	struct snd_interval t;
>  	struct snd_pcm_substream *substream = rule->private;
>  	t.min = 0;
> -	t.max = substream->buffer_bytes_max;
> +	t.max = substream->dma_max;
>  	t.openmin = 0;
>  	t.openmax = 0;
>  	t.integer = 1;
> -- 
> 2.20.1
>
Keyon Jie Jan. 16, 2020, 9:50 a.m. UTC | #2
On Thu, 2020-01-16 at 08:15 +0100, Takashi Iwai wrote:
> On Thu, 16 Jan 2020 05:53:18 +0100,
> Keyon Jie wrote:
> > With today's code, we preallocate DMA buffer for substreams at
> > pcm_new()
> > stage, and the substream->buffer_bytes_max and substream->dma_max
> > will
> > save as the actually preallocated buffer size and maximum size that
> > the
> > dma buffer can be expanded by at hw_params() state,
> > correspondingly.
> 
> No, it's other way round: the former, buffer_bytes_max, is the max
> size defined by the driver (i.e. passed in snd_pcm_hardware) and the
> latter, dma_max, is the max preallocation size (passed to
> preallocation helper).

Hi Takashi, thanks for your comment.

First of all, have you ever hit the issue I mentioned in the commit
message, that we can't set buffer_bytes larger than the preallocated DMA
bytes?

I found this issue on various kinds of platforms, not only on SOF/SoC
ones, but also on legacy HDA ones.

Secondly, I am not clear about the design intention of
substream->buffer_bytes_max and substream->dma_max. If it is as you
commented above, can you help answer my questions below, inline in the
code?

void snd_pcm_lib_preallocate_pages(struct snd_pcm_substream *substream,
				  int type, struct device *data,
				  size_t size, size_t max)

static void preallocate_pages(struct snd_pcm_substream *substream,
			      int type, struct device *data,
			      size_t size, size_t max, bool managed)
{
...
	if (substream->dma_buffer.bytes > 0)
		substream->buffer_bytes_max = substream->dma_buffer.bytes;
	/*
	 * Keyon: this is the actually allocated buffer size. What is the
	 * intention here, and why is it assigned to buffer_bytes_max, which
	 * is later used to constrain _HW_PARAM_BUFFER_BYTES?
	 */

	substream->dma_max = max;
	/*
	 * Keyon: it looks like this is where the *max* param is used, only
	 * if we don't define SND_VERBOSE_PROCFS? What relationship does it
	 * have with the preallocation itself?
	 */
...
}


> 
> > At pcm_open() stage, the maximum constraint of
> > HW_PARAM_BUFFER_BYTES is
> > set to substream->buffer_bytes_max and returned to user space as
> > the max
> > interval of the HW_PARAM_BUFFER_BYTES, this will lead to issue that
> > user
> > can't choose any buffer-bytes larger than the preallocated buffer
> > size,
> > and the buffer reallocation will never happen actually.
> > 
> > Here change to use substream->dma_max as the maximum constraint of
> > the
> > HW_PARAM_BUFFER_BYTES and fix the issue mentioned above.
> 
> I don't think the logic in the current code you're changing is wrong.
> If there is any, it must be something else.
> 
> This might be rather the FIXME code found in
> snd_pcm_hw_constraints_complete()?

I just tried removing the FIXME part of the code and it doesn't help: the
rule snd_pcm_hw_rule_buffer_bytes_max here limits the max of
SNDRV_PCM_HW_PARAM_BUFFER_BYTES, and this is returned to user space
(e.g. aplay) for the subsequent hw_params(). Is this intentional?

int snd_pcm_hw_constraints_complete(struct snd_pcm_substream *substream)
{
...
	err = snd_pcm_hw_rule_add(runtime, 0, SNDRV_PCM_HW_PARAM_BUFFER_BYTES,
				  snd_pcm_hw_rule_buffer_bytes_max, substream,
				  SNDRV_PCM_HW_PARAM_BUFFER_BYTES, -1);
	if (err < 0)
		return err;

	/* FIXME: remove */
	if (runtime->dma_bytes) {
		err = snd_pcm_hw_constraint_minmax(runtime,
				SNDRV_PCM_HW_PARAM_BUFFER_BYTES,
				0, runtime->dma_bytes);
		if (err < 0)
			return err;
	}

...
	return 0;
}

Thanks,
~Keyon
Takashi Iwai Jan. 16, 2020, 10:27 a.m. UTC | #3
On Thu, 16 Jan 2020 10:50:33 +0100,
Keyon Jie wrote:
> 
> On Thu, 2020-01-16 at 08:15 +0100, Takashi Iwai wrote:
> > On Thu, 16 Jan 2020 05:53:18 +0100,
> > Keyon Jie wrote:
> > > With today's code, we preallocate DMA buffer for substreams at
> > > pcm_new()
> > > stage, and the substream->buffer_bytes_max and substream->dma_max
> > > will
> > > save as the actually preallocated buffer size and maximum size that
> > > the
> > > dma buffer can be expanded by at hw_params() state,
> > > correspondingly.
> > 
> > No, it's other way round: the former, buffer_bytes_max, is the max
> > size defined by the driver (i.e. passed in snd_pcm_hardware) and the
> > latter, dma_max, is the max preallocation size (passed to
> > preallocation helper).
> 
> Hi Takashi, thanks for your comment.
> 
> First of all, have you ever hit issue I mentioned in the commit message
> that we can't set buffer_bytes larger than the preallocated dma bytes?
> 
> I found this issue in kinds of platforms, not only on SOF/SoC ones, but
> also on legacy HDA ones.
> 
> Secondly, I am not clear about the design intention of the substream-
> >buffer_bytes_max and substream->dma_max, if it is as you commented
> above, can you help answer my questions below inline the code?
> 
> void snd_pcm_lib_preallocate_pages(struct snd_pcm_substream *substream,
> 				  int type, struct device *data,
> 				  size_t size, size_t max)
> 
> static void preallocate_pages(struct snd_pcm_substream *substream,
> 			      int type, struct device *data,
> 			      size_t size, size_t max, bool managed)
> {
> ...
> 	if (substream->dma_buffer.bytes > 0)
> 		substream->buffer_bytes_max = substream-
> >dma_buffer.bytes;//Keyon: this is the actual allocated buffer bytes,
> what is the intention here and why it is assigned to buffer_bytes_max
> which will be used to constrain on the _HW_PARAM_BUFFER_BYTES later?
> 
> 	substream->dma_max = max; //Keyon: looks here it is where the
> *max* param used only if we don't define SND_VERBOSE_PROCFS? what
> relationship can we have with the preallocation itself?
> ...
> }

Oh, you're right, and I completely misread the patch.

Now I took a coffee and can tell you the story behind the scene.

I believe the current code intentionally limits the size to the
preallocated size.  This limitation was introduced to avoid trying to
allocate a larger buffer when a buffer has already been preallocated.
In the past, most hardware allocated contiguous pages for a buffer, and
the allocation of a large buffer was quite likely to fail.  This was
the reason for the buffer preallocation.  So, the driver wanted to tell
user-space the limit.  If users need an extra large buffer, they are
supposed to fiddle with the prealloc procfs (either setting zero to
clear the preallocation or setting a large enough buffer beforehand).

For SG-buffers, though, the limitation makes less sense than for
contiguous pages.  E.g. the patch below removes the limitation for
SG-buffers.  But changing this would definitely cause a behavior
difference, and I don't know whether it's a reasonable move -- I'm
afraid that apps would start hogging too much memory if the limitation
is gone.


thanks,

Takashi

---
diff --git a/sound/core/pcm_memory.c b/sound/core/pcm_memory.c
index d4702cc1d376..6a6c3469bbcd 100644
--- a/sound/core/pcm_memory.c
+++ b/sound/core/pcm_memory.c
@@ -96,6 +96,29 @@ void snd_pcm_lib_preallocate_free_for_all(struct snd_pcm *pcm)
 }
 EXPORT_SYMBOL(snd_pcm_lib_preallocate_free_for_all);
 
+/* set up substream->buffer_bytes_max, which is used in hw_constraint */
+static void set_buffer_bytes_max(struct snd_pcm_substream *substream,
+				 size_t size)
+{
+	substream->buffer_bytes_max = UINT_MAX;
+
+	if (!size)
+		return; /* no preallocation */
+
+	/* for SG-buffers, no limitation is needed */
+	switch (substream->dma_buffer.dev.type) {
+#ifdef CONFIG_SND_DMA_SGBUF
+	case SNDRV_DMA_TYPE_DEV_SG:
+	case SNDRV_DMA_TYPE_DEV_UC_SG:
+#endif
+	case SNDRV_DMA_TYPE_VMALLOC:
+		return;
+	}
+
+	/* for continuous buffers, limit to the preallocated size */
+	substream->buffer_bytes_max = size;
+}
+
 #ifdef CONFIG_SND_VERBOSE_PROCFS
 /*
  * read callback for prealloc proc file
@@ -156,10 +179,8 @@ static void snd_pcm_lib_preallocate_proc_write(struct snd_info_entry *entry,
 				buffer->error = -ENOMEM;
 				return;
 			}
-			substream->buffer_bytes_max = size;
-		} else {
-			substream->buffer_bytes_max = UINT_MAX;
 		}
+		set_buffer_bytes_max(substream, size);
 		if (substream->dma_buffer.area)
 			snd_dma_free_pages(&substream->dma_buffer);
 		substream->dma_buffer = new_dmab;
@@ -206,10 +227,8 @@ static void preallocate_pages(struct snd_pcm_substream *substream,
 
 	if (size > 0 && preallocate_dma && substream->number < maximum_substreams)
 		preallocate_pcm_pages(substream, size);
-
-	if (substream->dma_buffer.bytes > 0)
-		substream->buffer_bytes_max = substream->dma_buffer.bytes;
 	substream->dma_max = max;
+	set_buffer_bytes_max(substream, substream->dma_buffer.bytes);
 	if (max > 0)
 		preallocate_info_init(substream);
 	if (managed)
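
The decision made by set_buffer_bytes_max() in the patch above can be
mirrored in a small user-space model to see which combinations end up
limited. This is a sketch only: enum model_dma_type and
model_buffer_bytes_max() are hypothetical stand-ins for the
SNDRV_DMA_TYPE_* constants and the kernel helper, and the
CONFIG_SND_DMA_SGBUF conditional is left out.

```c
#include <limits.h>
#include <stddef.h>

/* Hypothetical stand-ins for the SNDRV_DMA_TYPE_* buffer types. */
enum model_dma_type {
	MODEL_DMA_CONTINUOUS,
	MODEL_DMA_DEV,
	MODEL_DMA_DEV_SG,
	MODEL_DMA_DEV_UC_SG,
	MODEL_DMA_VMALLOC,
};

/*
 * Return the value the proposed helper would store in buffer_bytes_max:
 * unlimited (UINT_MAX) without preallocation or for SG/vmalloc buffers,
 * otherwise the preallocated size.
 */
static size_t model_buffer_bytes_max(enum model_dma_type type,
				     size_t prealloc_bytes)
{
	if (prealloc_bytes == 0)
		return UINT_MAX; /* no preallocation */

	switch (type) {
	case MODEL_DMA_DEV_SG:
	case MODEL_DMA_DEV_UC_SG:
	case MODEL_DMA_VMALLOC:
		return UINT_MAX; /* SG-style buffers: no limitation */
	default:
		return prealloc_bytes; /* contiguous buffers: keep the limit */
	}
}
```

In this model only a contiguous buffer with a non-zero preallocation keeps
the preallocated size as its upper limit.
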
Keyon Jie Jan. 16, 2020, 11:25 a.m. UTC | #4
On Thu, 2020-01-16 at 11:27 +0100, Takashi Iwai wrote:
> On Thu, 16 Jan 2020 10:50:33 +0100,
> 
> Oh, you're right, and I completely misread the patch.
> 
> Now I took a coffee and can tell you the story behind the scene.
> 
> I believe the current code is intentionally limiting the size to the
> preallocated size.  This limitation was brought for not trying to
> allocate a larger buffer when the buffer has been preallocated.  In
> the past, most hardware allocated the continuous pages for a buffer
> and the allocation of a large buffer fails quite likely.  This was
> the
> reason of the buffer preallocation.  So, the driver wanted to tell
> the
> user-space the limit.  If user needs to have an extra large buffer,
> they are supposed to fiddle with prealloc procfs (either setting zero
> to clear the preallocation or setting a large enough buffer
> beforehand).

Thank you for sharing; this is interesting and new to me.

> 
> For SG-buffers, though, limitation makes less sense than continuous
> pages.  e.g. a patch below removes the limitation for SG-buffers.
> But changing this would definitely cause the behavior difference, and
> I don't know whether it's a reasonable move -- I'm afraid that apps
> would start hogging too much memory if the limitation is gone.

I just went through all the callers of snd_pcm_lib_preallocate_pages*().
For those using SNDRV_DMA_TYPE_DEV, some set the *size* equal to the
*max*, and some set the *max* to several times the *size*. IMHO, the
*max* values, rather than the *size* values, are what match the
hardware's limitation, aren't they?

In this case, I still think my patch handles all the SNDRV_DMA_TYPE_DEV
and TYPE_SG cases more gracefully: we still take the limitation set by
the specific driver, from the *max* param, and the test results look
very nice here. We take exactly what user space asked for as
buffer-bytes via aplay, as long as it fits the interval and
constraints.

What's your opinion about it?

> 
> 
> thanks,
> 
> Takashi
> 
> ---
> diff --git a/sound/core/pcm_memory.c b/sound/core/pcm_memory.c
> index d4702cc1d376..6a6c3469bbcd 100644
> --- a/sound/core/pcm_memory.c
> +++ b/sound/core/pcm_memory.c
> @@ -96,6 +96,29 @@ void snd_pcm_lib_preallocate_free_for_all(struct
> snd_pcm *pcm)
>  }
>  EXPORT_SYMBOL(snd_pcm_lib_preallocate_free_for_all);
>  
> +/* set up substream->buffer_bytes_max, which is used in
> hw_constraint */
> +static void set_buffer_bytes_max(struct snd_pcm_substream
> *substream,
> +				 size_t size)
> +{
> +	substream->buffer_bytes_max = UINT_MAX;
> +
> +	if (!size)
> +		return; /* no preallocation */
> +
> +	/* for SG-buffers, no limitation is needed */
> +	switch (substream->dma_buffer.dev.type) {
> +#ifdef CONFIG_SND_DMA_SGBUF
> +	case SNDRV_DMA_TYPE_DEV_SG:
> +	case SNDRV_DMA_TYPE_DEV_UC_SG:
> +#endif
> +	case SNDRV_DMA_TYPE_VMALLOC:
> +		return;
> +	}
> +
> +	/* for continuous buffers, limit to the preallocated size */
> +	substream->buffer_bytes_max = size;
> +}
> +
>  #ifdef CONFIG_SND_VERBOSE_PROCFS
>  /*
>   * read callback for prealloc proc file
> @@ -156,10 +179,8 @@ static void
> snd_pcm_lib_preallocate_proc_write(struct snd_info_entry *entry,
>  				buffer->error = -ENOMEM;

If we won't apply the user's change here for SG buffers, shouldn't we
also skip reallocating the DMA pages here?

Thanks,
~Keyon

>  				return;
>  			}
> -			substream->buffer_bytes_max = size;
> -		} else {
> -			substream->buffer_bytes_max = UINT_MAX;
>  		}
> +		set_buffer_bytes_max(substream, size);
>  		if (substream->dma_buffer.area)
>  			snd_dma_free_pages(&substream->dma_buffer);
>  		substream->dma_buffer = new_dmab;
> @@ -206,10 +227,8 @@ static void preallocate_pages(struct
> snd_pcm_substream *substream,
>  
>  	if (size > 0 && preallocate_dma && substream->number <
> maximum_substreams)
>  		preallocate_pcm_pages(substream, size);
> -
> -	if (substream->dma_buffer.bytes > 0)
> -		substream->buffer_bytes_max = substream-
> >dma_buffer.bytes;
>  	substream->dma_max = max;
> +	set_buffer_bytes_max(substream, substream->dma_buffer.bytes);
>  	if (max > 0)
>  		preallocate_info_init(substream);
>  	if (managed)
> _______________________________________________
> Alsa-devel mailing list
> Alsa-devel@alsa-project.org
> https://mailman.alsa-project.org/mailman/listinfo/alsa-devel
Takashi Iwai Jan. 16, 2020, 11:50 a.m. UTC | #5
On Thu, 16 Jan 2020 12:25:38 +0100,
Keyon Jie wrote:
> 
> On Thu, 2020-01-16 at 11:27 +0100, Takashi Iwai wrote:
> > On Thu, 16 Jan 2020 10:50:33 +0100,
> > 
> > Oh, you're right, and I completely misread the patch.
> > 
> > Now I took a coffee and can tell you the story behind the scene.
> > 
> > I believe the current code is intentionally limiting the size to the
> > preallocated size.  This limitation was brought for not trying to
> > allocate a larger buffer when the buffer has been preallocated.  In
> > the past, most hardware allocated the continuous pages for a buffer
> > and the allocation of a large buffer fails quite likely.  This was
> > the
> > reason of the buffer preallocation.  So, the driver wanted to tell
> > the
> > user-space the limit.  If user needs to have an extra large buffer,
> > they are supposed to fiddle with prealloc procfs (either setting zero
> > to clear the preallocation or setting a large enough buffer
> > beforehand).
> 
> Thank you for the sharing, it is interesting and knowledge learned to
> me.
> 
> > 
> > For SG-buffers, though, limitation makes less sense than continuous
> > pages.  e.g. a patch below removes the limitation for SG-buffers.
> > But changing this would definitely cause the behavior difference, and
> > I don't know whether it's a reasonable move -- I'm afraid that apps
> > would start hogging too much memory if the limitation is gone.
> 
> I just went through all invoking to snd_pcm_lib_preallocate_pages*(),
> for those SNDRV_DMA_TYPE_DEV, some of them set the *size* equal to the
> *max*, some set the *max* several times to the *size*, IMHO, the *max*s
> are matched to those hardware's limiatation, comparing to the *size*s,
> aren't they?
> 
> In this case, I still think my patch hanle all
> TYPE_DEV/SNDRV_DMA_TYPE_DEV/TYPE_SG/SNDRV_DMA_TYPE_DEV cases more
> gracefully, we will still take the limitation from the specific driver
> set, from the *max* param, and the test results looks very nice here,
> we will take what the user space wanted for buffer-bytes via aply
> exactly, as long as it is suitable for the interval and constraints.

Well, I have mixed feelings.  Certainly we'd need some better way to
allow a larger buffer allocation, especially for HDA.  OTOH, if the
buffer was preallocated, it's meant to be actually used.  That's the
point of the hw_constraint setup.

And now thinking again after another cup of coffee, I wonder why we do
preallocate for HDA at all.  For HD-audio, the allocation of any large
buffer would very likely succeed because of the SG-buffer.

So, just setting the preallocation size to 0 (but keeping everything
else) would work, e.g. something like below?  The help text needs
adjustment, but you can see the rough idea.


thanks,

Takashi

--- a/sound/hda/Kconfig
+++ b/sound/hda/Kconfig
@@ -21,9 +21,10 @@ config SND_HDA_EXT_CORE
        select SND_HDA_CORE
 
 config SND_HDA_PREALLOC_SIZE
-	int "Pre-allocated buffer size for HD-audio driver"
+	int "Pre-allocated buffer size for HD-audio driver" if !SND_DMA_SGBUF
 	range 0 32768
-	default 64
+	default 64 if !SND_DMA_SGBUF
+	default 0 if SND_DMA_SGBUF
 	help
 	  Specifies the default pre-allocated buffer-size in kB for the
 	  HD-audio driver.  A larger buffer (e.g. 2048) is preferred
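
For readers unfamiliar with how such a Kconfig value is consumed, the
conversion from the configured kB figure to a byte count can be sketched
as follows. This is an illustrative model only: the function and macro
names are hypothetical, and the 32 MiB cap is an assumption derived from
the "range 0 32768" line above rather than taken from the driver source.

```c
#include <stddef.h>

/* Assumed cap: 32768 kB, the upper end of the Kconfig range shown above. */
#define MODEL_MAX_PREALLOC_BYTES ((size_t)32768 * 1024)

/*
 * Convert a preallocation size given in kB (as in SND_HDA_PREALLOC_SIZE)
 * to bytes, clamped to the assumed maximum.  A result of 0 means that
 * no buffer is preallocated.
 */
static size_t model_prealloc_bytes(unsigned int kconfig_kb)
{
	size_t bytes = (size_t)kconfig_kb * 1024;

	return bytes > MODEL_MAX_PREALLOC_BYTES ? MODEL_MAX_PREALLOC_BYTES
						: bytes;
}
```

With the previous default of 64 this gives a 64 KiB preallocation; with
the proposed default of 0 for SG-buffer builds nothing is preallocated.
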
Jie, Yang Jan. 16, 2020, 2:14 p.m. UTC | #6
> -----Original Message-----
> From: Alsa-devel <alsa-devel-bounces@alsa-project.org> On Behalf Of
> Takashi Iwai
> Sent: Thursday, January 16, 2020 7:51 PM
> To: Keyon Jie <yang.jie@linux.intel.com>
> Cc: alsa-devel@alsa-project.org
> Subject: Re: [alsa-devel] [PATCH] ALSA: pcm: fix buffer_bytes max
> constrained by preallocated bytes issue
> 
> On Thu, 16 Jan 2020 12:25:38 +0100,
> Keyon Jie wrote:
> >
> > On Thu, 2020-01-16 at 11:27 +0100, Takashi Iwai wrote:
> > > On Thu, 16 Jan 2020 10:50:33 +0100,
> > >
> > > Oh, you're right, and I completely misread the patch.
> > >
> > > Now I took a coffee and can tell you the story behind the scene.
> > >
> > > I believe the current code is intentionally limiting the size to the
> > > preallocated size.  This limitation was brought for not trying to
> > > allocate a larger buffer when the buffer has been preallocated.  In
> > > the past, most hardware allocated the continuous pages for a buffer
> > > and the allocation of a large buffer fails quite likely.  This was
> > > the reason of the buffer preallocation.  So, the driver wanted to
> > > tell the user-space the limit.  If user needs to have an extra large
> > > buffer, they are supposed to fiddle with prealloc procfs (either
> > > setting zero to clear the preallocation or setting a large enough
> > > buffer beforehand).
> >
> > Thank you for the sharing, it is interesting and knowledge learned to
> > me.
> >
> > >
> > > For SG-buffers, though, limitation makes less sense than continuous
> > > pages.  e.g. a patch below removes the limitation for SG-buffers.
> > > But changing this would definitely cause the behavior difference,
> > > and I don't know whether it's a reasonable move -- I'm afraid that
> > > apps would start hogging too much memory if the limitation is gone.
> >
> > I just went through all invoking to snd_pcm_lib_preallocate_pages*(),
> > for those SNDRV_DMA_TYPE_DEV, some of them set the *size* equal to
> the
> > *max*, some set the *max* several times to the *size*, IMHO, the
> > *max*s are matched to those hardware's limiatation, comparing to the
> > *size*s, aren't they?
> >
> > In this case, I still think my patch hanle all
> > TYPE_DEV/SNDRV_DMA_TYPE_DEV/TYPE_SG/SNDRV_DMA_TYPE_DEV
> cases more
> > gracefully, we will still take the limitation from the specific driver
> > set, from the *max* param, and the test results looks very nice here,
> > we will take what the user space wanted for buffer-bytes via aply
> > exactly, as long as it is suitable for the interval and constraints.
> 
> Well, I have a mixed feeling.  Certainly we'd need some better way to allow a
> larger buffer allocation, especially for HDA.  OTOH, if the buffer was
> preallocated, it's meant to be used actually.  That's the point of the
> hw_constraint setup.

So if the buffer was preallocated, it won't be re-allocated at the
hw_params() stage. Doesn't this conflict with the re-allocation logic in
hw_params()?

> 
> And now thinking again after another cup of coffee, I wonder why we do
> preallocate for HDA at all.  For HD-audio, the allocation of any large buffer
> would succeed very likely because of SG-buffer.
> 
> So, just setting 0 to the preallocation size (but keeping else) would work, e.g.
> something like below?  The help text needs adjustment, but you can see the
> rough idea.

So, do you suggest not doing preallocation (or calling it with size 0)
for all drivers with TYPE_SG? I am fine if this is the recommended
method; I can try this on an SOF I2S platform to see if it works for the
very large buffer size we require.

Thanks,
~Keyon

> 
> 
> thanks,
> 
> Takashi
> 
> --- a/sound/hda/Kconfig
> +++ b/sound/hda/Kconfig
> @@ -21,9 +21,10 @@ config SND_HDA_EXT_CORE
>         select SND_HDA_CORE
> 
>  config SND_HDA_PREALLOC_SIZE
> -	int "Pre-allocated buffer size for HD-audio driver"
> +	int "Pre-allocated buffer size for HD-audio driver"
> if !SND_DMA_SGBUF
>  	range 0 32768
> -	default 64
> +	default 64 if !SND_DMA_SGBUF
> +	default 0 if SND_DMA_SGBUF
>  	help
>  	  Specifies the default pre-allocated buffer-size in kB for the
>  	  HD-audio driver.  A larger buffer (e.g. 2048) is preferred
Jie, Yang Jan. 16, 2020, 3:31 p.m. UTC | #7
> -----Original Message-----
> From: Jie, Yang
> Sent: Thursday, January 16, 2020 10:14 PM
> To: 'Takashi Iwai' <tiwai@suse.de>; Keyon Jie <yang.jie@linux.intel.com>
> Cc: alsa-devel@alsa-project.org
> Subject: RE: [alsa-devel] [PATCH] ALSA: pcm: fix buffer_bytes max
> constrained by preallocated bytes issue
> 
> > -----Original Message-----
> > From: Alsa-devel <alsa-devel-bounces@alsa-project.org> On Behalf Of
> > Takashi Iwai
> > Sent: Thursday, January 16, 2020 7:51 PM
> > To: Keyon Jie <yang.jie@linux.intel.com>
> > Cc: alsa-devel@alsa-project.org
> > Subject: Re: [alsa-devel] [PATCH] ALSA: pcm: fix buffer_bytes max
> > constrained by preallocated bytes issue
> >
> > On Thu, 16 Jan 2020 12:25:38 +0100,
> > Keyon Jie wrote:
> > >
> > > On Thu, 2020-01-16 at 11:27 +0100, Takashi Iwai wrote:
> > > > On Thu, 16 Jan 2020 10:50:33 +0100,
> > > >
> > > > Oh, you're right, and I completely misread the patch.
> > > >
> > > > Now I took a coffee and can tell you the story behind the scene.
> > > >
> > > > I believe the current code is intentionally limiting the size to
> > > > the preallocated size.  This limitation was brought for not trying
> > > > to allocate a larger buffer when the buffer has been preallocated.
> > > > In the past, most hardware allocated the continuous pages for a
> > > > buffer and the allocation of a large buffer fails quite likely.
> > > > This was the reason of the buffer preallocation.  So, the driver
> > > > wanted to tell the user-space the limit.  If user needs to have an
> > > > extra large buffer, they are supposed to fiddle with prealloc
> > > > procfs (either setting zero to clear the preallocation or setting
> > > > a large enough buffer beforehand).
> > >
> > > Thank you for the sharing, it is interesting and knowledge learned
> > > to me.
> > >
> > > >
> > > > For SG-buffers, though, limitation makes less sense than
> > > > continuous pages.  e.g. a patch below removes the limitation for SG-
> buffers.
> > > > But changing this would definitely cause the behavior difference,
> > > > and I don't know whether it's a reasonable move -- I'm afraid that
> > > > apps would start hogging too much memory if the limitation is gone.
> > >
> > > I just went through all invoking to
> > > snd_pcm_lib_preallocate_pages*(), for those SNDRV_DMA_TYPE_DEV,
> some
> > > of them set the *size* equal to
> > the
> > > *max*, some set the *max* several times to the *size*, IMHO, the
> > > *max*s are matched to those hardware's limiatation, comparing to the
> > > *size*s, aren't they?
> > >
> > > In this case, I still think my patch hanle all
> > > TYPE_DEV/SNDRV_DMA_TYPE_DEV/TYPE_SG/SNDRV_DMA_TYPE_DEV
> > cases more
> > > gracefully, we will still take the limitation from the specific
> > > driver set, from the *max* param, and the test results looks very
> > > nice here, we will take what the user space wanted for buffer-bytes
> > > via aply exactly, as long as it is suitable for the interval and constraints.
> >
> > Well, I have a mixed feeling.  Certainly we'd need some better way to
> > allow a larger buffer allocation, especially for HDA.  OTOH, if the
> > buffer was preallocated, it's meant to be used actually.  That's the
> > point of the hw_constraint setup.
> 
> So if the buffer was preallocated, it won't be re-allocated at hw_params()
> stage, is this conflict with the re-allocate logic in hw_params()?
> 
> >
> > And now thinking again after another cup of coffee, I wonder why we do
> > preallocate for HDA at all.  For HD-audio, the allocation of any large
> > buffer would succeed very likely because of SG-buffer.
> >
> > So, just setting 0 to the preallocation size (but keeping else) would work,
> e.g.
> > something like below?  The help text needs adjustment, but you can see
> > the rough idea.
> 
> So, do you suggest not doing preallocation(or calling it with 0 size) for all
> driver with TYPE_SG? I am fine if this is the recommended method, I can try
> this on SOF I2S platform to see if it can work as we required for very large
> buffer size.

I tried it and found that setting size 0 for the preallocation doesn't
work for me. I even tried setting the size as big as the max (which user
space may require for buffer-bytes), and it still doesn't work for me.

Thanks,
~Keyon

> 
> Thanks,
> ~Keyon
>
Takashi Iwai Jan. 16, 2020, 3:45 p.m. UTC | #8
On Thu, 16 Jan 2020 15:14:28 +0100,
Jie, Yang wrote:
> 
> > -----Original Message-----
> > From: Alsa-devel <alsa-devel-bounces@alsa-project.org> On Behalf Of
> > Takashi Iwai
> > Sent: Thursday, January 16, 2020 7:51 PM
> > To: Keyon Jie <yang.jie@linux.intel.com>
> > Cc: alsa-devel@alsa-project.org
> > Subject: Re: [alsa-devel] [PATCH] ALSA: pcm: fix buffer_bytes max
> > constrained by preallocated bytes issue
> > 
> > On Thu, 16 Jan 2020 12:25:38 +0100,
> > Keyon Jie wrote:
> > >
> > > On Thu, 2020-01-16 at 11:27 +0100, Takashi Iwai wrote:
> > > > On Thu, 16 Jan 2020 10:50:33 +0100,
> > > >
> > > > Oh, you're right, and I completely misread the patch.
> > > >
> > > > Now I took a coffee and can tell you the story behind the scene.
> > > >
> > > > I believe the current code is intentionally limiting the size to the
> > > > preallocated size.  This limitation was brought for not trying to
> > > > allocate a larger buffer when the buffer has been preallocated.  In
> > > > the past, most hardware allocated the continuous pages for a buffer
> > > > and the allocation of a large buffer fails quite likely.  This was
> > > > the reason of the buffer preallocation.  So, the driver wanted to
> > > > tell the user-space the limit.  If user needs to have an extra large
> > > > buffer, they are supposed to fiddle with prealloc procfs (either
> > > > setting zero to clear the preallocation or setting a large enough
> > > > buffer beforehand).
> > >
> > > Thank you for the sharing, it is interesting and knowledge learned to
> > > me.
> > >
> > > >
> > > > For SG-buffers, though, limitation makes less sense than continuous
> > > > pages.  e.g. a patch below removes the limitation for SG-buffers.
> > > > But changing this would definitely cause the behavior difference,
> > > > and I don't know whether it's a reasonable move -- I'm afraid that
> > > > apps would start hogging too much memory if the limitation is gone.
> > >
> > > I just went through all invoking to snd_pcm_lib_preallocate_pages*(),
> > > for those SNDRV_DMA_TYPE_DEV, some of them set the *size* equal to
> > the
> > > *max*, some set the *max* several times to the *size*, IMHO, the
> > > *max*s are matched to those hardware's limiatation, comparing to the
> > > *size*s, aren't they?
> > >
> > > In this case, I still think my patch hanle all
> > > TYPE_DEV/SNDRV_DMA_TYPE_DEV/TYPE_SG/SNDRV_DMA_TYPE_DEV
> > cases more
> > > gracefully, we will still take the limitation from the specific driver
> > > set, from the *max* param, and the test results looks very nice here,
> > > we will take what the user space wanted for buffer-bytes via aply
> > > exactly, as long as it is suitable for the interval and constraints.
> > 
> > Well, I have a mixed feeling.  Certainly we'd need some better way to allow a
> > larger buffer allocation, especially for HDA.  OTOH, if the buffer was
> > preallocated, it's meant to be used actually.  That's the point of the
> > hw_constraint setup.
> 
> So if the buffer was preallocated, it won't be re-allocated at hw_params() stage,
> is this conflict with the re-allocate logic in hw_params()?

If a larger buffer than the preallocated one is requested, the PCM
core tries to allocate another buffer while the preallocated one
is kept untouched.  That's because such a larger allocation is
supposed to be a one-off thing.  Normal usage should fit within the
preallocated buffer size.

That said, if a larger buffer is required too frequently, it makes no
sense to keep the preallocation.  Or, preallocate larger buffers
instead.
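
The behaviour described above can be sketched as a small user-space
model. It is not the PCM core code: struct dma_buf_model,
struct substream_model, model_malloc_pages() and model_free_pages() are
hypothetical names, and plain malloc() stands in for the DMA page
allocation.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for a DMA buffer descriptor. */
struct dma_buf_model {
	void *area;
	size_t bytes;
};

struct substream_model {
	struct dma_buf_model prealloc; /* buffer preallocated up front */
	struct dma_buf_model *in_use;  /* buffer chosen for the stream */
};

/*
 * Pick a buffer of at least "size" bytes: reuse the preallocated one if
 * it is large enough, otherwise allocate a separate one and leave the
 * preallocated buffer untouched.  Returns 0 on success, -1 on failure.
 */
static int model_malloc_pages(struct substream_model *s, size_t size)
{
	struct dma_buf_model *extra;

	assert(s != NULL && s->in_use == NULL);

	if (s->prealloc.area != NULL && s->prealloc.bytes >= size) {
		s->in_use = &s->prealloc;
		return 0;
	}

	extra = malloc(sizeof(*extra));
	if (extra == NULL)
		return -1;
	extra->area = malloc(size);
	if (extra->area == NULL) {
		free(extra);
		return -1;
	}
	extra->bytes = size;
	s->in_use = extra;
	return 0;
}

/* Release a one-off buffer; the preallocated buffer is never freed here. */
static void model_free_pages(struct substream_model *s)
{
	if (s->in_use != NULL && s->in_use != &s->prealloc) {
		free(s->in_use->area);
		free(s->in_use);
	}
	s->in_use = NULL;
}
```

In this model a request that fits reuses the preallocated buffer, while a
larger one gets a separate one-off allocation that is released again on
free.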

> > And now thinking again after another cup of coffee, I wonder why we do
> > preallocate for HDA at all.  For HD-audio, the allocation of any large buffer
> > would succeed very likely because of SG-buffer.
> > 
> > So, just setting 0 to the preallocation size (but keeping else) would work, e.g.
> > something like below?  The help text needs adjustment, but you can see the
> > rough idea.
> 
> So, do you suggest not doing preallocation(or calling it with 0 size) for all driver
> with TYPE_SG? I am fine if this is the recommended method, I can try this on SOF
> I2S platform to see if it can work as we required for very large buffer size.

This really depends on the use case, and I'm not yet sure whether no
preallocation is really recommended or not.  Without preallocation,
each PCM open involves a large number of page allocations, and
it makes it easier for users to hog resources.  It'll use
vmalloc addresses, which aren't unlimited, and may easily trigger OOM.


thanks,

Takashi
Takashi Iwai Jan. 16, 2020, 4:07 p.m. UTC | #9
On Thu, 16 Jan 2020 16:31:02 +0100,
Jie, Yang wrote:
> 
> > -----Original Message-----
> > From: Jie, Yang
> > Sent: Thursday, January 16, 2020 10:14 PM
> > To: 'Takashi Iwai' <tiwai@suse.de>; Keyon Jie <yang.jie@linux.intel.com>
> > Cc: alsa-devel@alsa-project.org
> > Subject: RE: [alsa-devel] [PATCH] ALSA: pcm: fix buffer_bytes max
> > constrained by preallocated bytes issue
> > 
> > > -----Original Message-----
> > > From: Alsa-devel <alsa-devel-bounces@alsa-project.org> On Behalf Of
> > > Takashi Iwai
> > > Sent: Thursday, January 16, 2020 7:51 PM
> > > To: Keyon Jie <yang.jie@linux.intel.com>
> > > Cc: alsa-devel@alsa-project.org
> > > Subject: Re: [alsa-devel] [PATCH] ALSA: pcm: fix buffer_bytes max
> > > constrained by preallocated bytes issue
> > >
> > > On Thu, 16 Jan 2020 12:25:38 +0100,
> > > Keyon Jie wrote:
> > > >
> > > > On Thu, 2020-01-16 at 11:27 +0100, Takashi Iwai wrote:
> > > > > On Thu, 16 Jan 2020 10:50:33 +0100,
> > > > >
> > > > > Oh, you're right, and I completely misread the patch.
> > > > >
> > > > > Now I took a coffee and can tell you the story behind the scene.
> > > > >
> > > > > > I believe the current code is intentionally limiting the size to
> > > > > > the preallocated size.  This limitation was introduced to avoid
> > > > > > allocating a larger buffer when a buffer has been preallocated.
> > > > > > In the past, most hardware allocated contiguous pages for a
> > > > > > buffer, and the allocation of a large buffer was quite likely to
> > > > > > fail.  This was the reason for the buffer preallocation.  So, the
> > > > > > driver wanted to tell user space the limit.  If users need an
> > > > > > extra large buffer, they are supposed to fiddle with the prealloc
> > > > > > procfs (either setting zero to clear the preallocation or setting
> > > > > > a large enough buffer beforehand).
> > > >
> > > > > Thank you for sharing this; it is interesting, and I learned
> > > > > something from it.
> > > >
> > > > >
> > > > > > For SG-buffers, though, the limitation makes less sense than for
> > > > > > contiguous pages.  E.g. a patch below removes the limitation for
> > > > > > SG-buffers.
> > > > > But changing this would definitely cause the behavior difference,
> > > > > and I don't know whether it's a reasonable move -- I'm afraid that
> > > > > apps would start hogging too much memory if the limitation is gone.
> > > >
> > > > > I just went through all the callers of
> > > > > snd_pcm_lib_preallocate_pages*().  For those using
> > > > > SNDRV_DMA_TYPE_DEV, some set the *size* equal to the *max*, and
> > > > > some set the *max* to several times the *size*.  IMHO, the *max*
> > > > > values, rather than the *size* values, match the hardware's
> > > > > limitations, don't they?
> > > > >
> > > > > In this case, I still think my patch handles all the
> > > > > TYPE_DEV/SNDRV_DMA_TYPE_DEV/TYPE_SG/SNDRV_DMA_TYPE_DEV cases more
> > > > > gracefully: we still take the limitation set by the specific
> > > > > driver, from the *max* param, and the test results look very
> > > > > nice here.  We take exactly what user space requested for
> > > > > buffer-bytes via aplay, as long as it fits the interval and
> > > > > constraints.
> > >
> > > Well, I have a mixed feeling.  Certainly we'd need some better way to
> > > allow a larger buffer allocation, especially for HDA.  OTOH, if the
> > > buffer was preallocated, it's meant to be used actually.  That's the
> > > point of the hw_constraint setup.
> > 
> > > So if the buffer was preallocated, it won't be re-allocated at the
> > > hw_params() stage.  Does this conflict with the re-allocation logic in
> > > hw_params()?
> > 
> > >
> > > And now thinking again after another cup of coffee, I wonder why we do
> > > preallocate for HDA at all.  For HD-audio, the allocation of any large
> > > buffer would succeed very likely because of SG-buffer.
> > >
> > > So, just setting 0 to the preallocation size (but keeping else) would work,
> > e.g.
> > > something like below?  The help text needs adjustment, but you can see
> > > the rough idea.
> > 
> > So, do you suggest not doing preallocation(or calling it with 0 size) for all
> > driver with TYPE_SG? I am fine if this is the recommended method, I can try
> > this on SOF I2S platform to see if it can work as we required for very large
> > buffer size.
> 
> I tried it and found that setting the preallocation size to 0 doesn't work
> for me.  I even tried setting the size as big as the max (which user space
> may require for buffer-bytes), and it still doesn't work for me.

How did you test it?  I quickly checked now on my machine, and it
seems to be working...

# echo 1024 > /proc/asound/card0/pcm0p/sub0/prealloc
# aplay -Dhw:0 -v --buffer-size=1048576 foo.wav
Hardware PCM card 0 'HDA Intel PCH' device 0 subdevice 0
Its setup is:
  stream       : PLAYBACK
  ....
  buffer_size  : 262144

# echo 0 > /proc/asound/card0/pcm0p/sub0/prealloc
# aplay -Dhw:0 -v --buffer-size=1048576 foo.wav
Hardware PCM card 0 'HDA Intel PCH' device 0 subdevice 0
Its setup is:
  stream       : PLAYBACK
  ....
  buffer_size  : 1048576
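
The two runs above are consistent with a simple cap, which the
following sketch reproduces.  It rests on assumptions not stated in the
log: that foo.wav is 16-bit stereo (4 bytes per frame), and that with a
prealloc value of 0 nothing else limits the request (in practice the
driver's own maximum still applies).  The prealloc procfs value is in
kilobytes.  The function name granted_frames is hypothetical.

```c
#include <assert.h>

/* Frames granted for a request, given the prealloc value in kB.
 * prealloc_kb == 0 models "no preallocation, so no cap from it". */
static unsigned long granted_frames(unsigned long prealloc_kb,
				    unsigned long requested_frames,
				    unsigned long bytes_per_frame)
{
	unsigned long cap_frames;

	assert(bytes_per_frame > 0);
	if (prealloc_kb == 0)
		return requested_frames;
	/* kB -> bytes -> frames */
	cap_frames = prealloc_kb * 1024UL / bytes_per_frame;
	return requested_frames < cap_frames ? requested_frames : cap_frames;
}
```

Under those assumptions, granted_frames(1024, 1048576, 4) gives 262144
and granted_frames(0, 1048576, 4) gives 1048576, matching the two
buffer_size values shown.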


Takashi
Pierre-Louis Bossart Jan. 16, 2020, 4:39 p.m. UTC | #10
>> So, do you suggest not doing preallocation(or calling it with 0 size) for all
>> driver with TYPE_SG? I am fine if this is the recommended method, I can try
>> this on SOF I2S platform to see if it can work as we required for very large
>> buffer size.

Keyon, for the rest of us to follow this patch, would you mind
clarifying what drives the need for a 'very large buffer size', and what
order of magnitude this very large size would be?

FWIW, we measured consistently on different Windows/Linux platforms,
maybe 10 years ago, that once you reach a buffer of 1s (384 kB) the
benefits of increasing the buffer size further are marginal in terms
of power consumption, while larger buffers generate all kinds of issues
with volume updates and deferred routing changes.

Thanks
-Pierre
Rajwa, Marcin Jan. 16, 2020, 5:25 p.m. UTC | #11
On 1/16/2020 5:39 PM, Pierre-Louis Bossart wrote:
>
>>> So, do you suggest not doing preallocation(or calling it with 0 
>>> size) for all
>>> driver with TYPE_SG? I am fine if this is the recommended method, I 
>>> can try
>>> this on SOF I2S platform to see if it can work as we required for 
>>> very large
>>> buffer size.
>
> Keyon, for the rest of us to follow this patch, would you mind 
> clarifying what drives the need for a 'very large buffer size', and 
> what order of magnitude this very large size would be.
>
> FWIW, we've measured consistently on different Windows/Linux 
> platforms, maybe 10 years ago, that once you reach a buffer of 1s (384 
> kB) the benefits from increasing that buffer size further are marginal 
> in terms of power consumption, and generate all kinds of issues with 
> volume updates and deferred routing changes.
>
We need a bigger buffer on the host side to compensate for the wake-up
time from d0ix to d0, which takes ~2 seconds on my setup. So, with
buffer sizes smaller than 2 seconds we overwrite data, since the FW
keeps copying while the host doesn't read until it's up and running again.
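
For reference, the byte size needed to hold a given duration of audio
follows directly from the stream parameters.  The sketch below is only
illustrative: the thread does not state the rate, channel count or
sample width in use, so the values in the note after the code are
assumptions, and buffer_bytes is a hypothetical helper.

```c
#include <assert.h>

/* Bytes needed to buffer `msec` milliseconds of PCM audio. */
static unsigned long long buffer_bytes(unsigned long long rate_hz,
				       unsigned int channels,
				       unsigned int bytes_per_sample,
				       unsigned long long msec)
{
	assert(channels > 0 && bytes_per_sample > 0);
	return rate_hz * channels * bytes_per_sample * msec / 1000ULL;
}
```

For example, buffer_bytes(48000, 2, 2, 2000) is 384000 bytes for two
seconds of 48 kHz 16-bit stereo, and doubling the sample container to
4 bytes doubles the result.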

> Thanks
> -Pierre
> _______________________________________________
> Alsa-devel mailing list
> Alsa-devel@alsa-project.org
> https://mailman.alsa-project.org/mailman/listinfo/alsa-devel
Pierre-Louis Bossart Jan. 16, 2020, 5:40 p.m. UTC | #12
>>>> So, do you suggest not doing preallocation(or calling it with 0 
>>>> size) for all
>>>> driver with TYPE_SG? I am fine if this is the recommended method, I 
>>>> can try
>>>> this on SOF I2S platform to see if it can work as we required for 
>>>> very large
>>>> buffer size.
>>
>> Keyon, for the rest of us to follow this patch, would you mind 
>> clarifying what drives the need for a 'very large buffer size', and 
>> what order of magnitude this very large size would be.
>>
>> FWIW, we've measured consistently on different Windows/Linux 
>> platforms, maybe 10 years ago, that once you reach a buffer of 1s (384 
>> kB) the benefits from increasing that buffer size further are marginal 
>> in terms of power consumption, and generate all kinds of issues with 
>> volume updates and deferred routing changes.
>>
> We need bigger buffer on host side to compensate the wake up time from 
> d0ix to d0 which takes ~2 seconds on my setup. So, with smaller buffer 
> sizes like < 2 seconds we overwrite data since FW keeps copying while 
> host doesn't read until it's up and running again.

Right, that's a valid case, but that's 256 kB, not 'very large' or 
likely to ever trigger an OOM case.
Takashi Iwai Jan. 16, 2020, 8:37 p.m. UTC | #13
On Thu, 16 Jan 2020 18:40:26 +0100,
Pierre-Louis Bossart wrote:
> 
> 
> >>>> So, do you suggest not doing preallocation(or calling it with 0
> >>>> size) for all
> >>>> driver with TYPE_SG? I am fine if this is the recommended method,
> >>>> I can try
> >>>> this on SOF I2S platform to see if it can work as we required for
> >>>> very large
> >>>> buffer size.
> >>
> >> Keyon, for the rest of us to follow this patch, would you mind
> >> clarifying what drives the need for a 'very large buffer size', and
> >> what order of magnitude this very large size would be.
> >>
> >> FWIW, we've measured consistently on different Windows/Linux
> >> platforms, maybe 10 years ago, that once you reach a buffer of 1s
> >> (384 kB) the benefits from increasing that buffer size further are
> >> marginal in terms of power consumption, and generate all kinds of
> >> issues with volume updates and deferred routing changes.
> >>
> > We need bigger buffer on host side to compensate the wake up time
> > > from d0ix to d0 which takes ~2 seconds on my setup. So, with
> > > smaller buffer sizes like < 2 seconds we overwrite data since FW
> > > keeps copying while host doesn't read until it's up and running
> > again.
> 
> Right, that's a valid case, but that's 256 kB, not 'very large' or
> likely to ever trigger an OOM case.

That size shouldn't matter, and would work even with the
preallocation.

My concern is that removing the limitation would allow the allocation
of too large sizes.  Even with the dma_max limit, it can go up to 32MB
of physical pages per stream for HDA.  Depending on the hardware setup,
there can be a lot of streams assigned (e.g. HDMI codecs) and
multiple codecs / controllers, and imagine that all those allocated
pages are pinned and can't be swapped out...


Takashi
Keyon Jie Jan. 17, 2020, 5:30 a.m. UTC | #14
On 2020/1/17 上午4:37, Takashi Iwai wrote:
> On Thu, 16 Jan 2020 18:40:26 +0100,
> Pierre-Louis Bossart wrote:
>>
>>
>>>>>> So, do you suggest not doing preallocation(or calling it with 0
>>>>>> size) for all
>>>>>> driver with TYPE_SG? I am fine if this is the recommended method,
>>>>>> I can try
>>>>>> this on SOF I2S platform to see if it can work as we required for
>>>>>> very large
>>>>>> buffer size.
>>>>
>>>> Keyon, for the rest of us to follow this patch, would you mind
>>>> clarifying what drives the need for a 'very large buffer size', and
>>>> what order of magnitude this very large size would be.
>>>>
>>>> FWIW, we've measured consistently on different Windows/Linux
>>>> platforms, maybe 10 years ago, that once you reach a buffer of 1s
>>>> (384 kB) the benefits from increasing that buffer size further are
>>>> marginal in terms of power consumption, and generate all kinds of
>>>> issues with volume updates and deferred routing changes.
>>>>
>>> We need bigger buffer on host side to compensate the wake up time
>>> from d0ix to d0 which takes ~2 seconds on my setup. So, with
>>> smaller buffer sizes like < 2 seconds we overwrite data since FW
>>> keeps copying while host doesn't read until it's up and running
>>> again.
>>
>> Right, that's a valid case, but that's 256 kB, not 'very large' or
>> likely to ever trigger an OOM case.
> 
> That size shouldn't matter, and would work even with the
> preallocation.
> 
> My concern is that removing the limitation would allow the allocation
> of too large sizes.  Even with dma_max limit, it can go up to 32MB
> physical pages per stream for HDA.  Depending on the hardware setup,
> there can be a lot of streams assignment (e.g. HDMI codecs) and
> multiple codecs / controllers, and imagine that all those allocated
> pages are pinned and can't be swapped out...

Hi Takashi, I get your concern here.  But if we switch to the dma_max
limit, we won't change the preallocated buffer; it will still be 64KB
for each stream.  User space can ask to re-allocate the buffer for each
stream up to 32MB, but the only buffers that are pinned and can't be
swapped out are the 64KB preallocated ones.  Am I wrong?

Thanks,
~Keyon

> 
> 
> Takashi
Keyon Jie Jan. 17, 2020, 5:37 a.m. UTC | #15
On 2020/1/17 上午1:40, Pierre-Louis Bossart wrote:
> 
>>>>> So, do you suggest not doing preallocation(or calling it with 0 
>>>>> size) for all
>>>>> driver with TYPE_SG? I am fine if this is the recommended method, I 
>>>>> can try
>>>>> this on SOF I2S platform to see if it can work as we required for 
>>>>> very large
>>>>> buffer size.
>>>
>>> Keyon, for the rest of us to follow this patch, would you mind 
>>> clarifying what drives the need for a 'very large buffer size', and 
>>> what order of magnitude this very large size would be.
>>>
>>> FWIW, we've measured consistently on different Windows/Linux 
>>> platforms, maybe 10 years ago, that once you reach a buffer of 1s 
>>> (384 kB) the benefits from increasing that buffer size further are 
>>> marginal in terms of power consumption, and generate all kinds of 
>>> issues with volume updates and deferred routing changes.
>>>
>> We need bigger buffer on host side to compensate the wake up time from 
>> d0ix to d0 which takes ~2 seconds on my setup. So, with smaller 
>> buffer sizes like < 2 seconds we overwrite data since FW keeps copying 
>> while host doesn't read until it's up and running again.
> 
> Right, that's a valid case, but that's 256 kB, not 'very large' or 
> likely to ever trigger an OOM case.

For S24_LE, it is 512KB.  The point is that if we can't re-allocate the
buffer at the hw_params() stage, then we need to follow a BKM (best known
method) of preallocating the largest DMA buffer that we claim to support
at pcm_new().  I think this is actually another kind of waste, with these
largest pinned buffers that can't be swapped out...

Thanks,
~Keyon

Takashi Iwai Jan. 17, 2020, 7:57 a.m. UTC | #16
On Fri, 17 Jan 2020 06:30:18 +0100,
Keyon Jie wrote:
> 
> On 2020/1/17 上午4:37, Takashi Iwai wrote:
> > On Thu, 16 Jan 2020 18:40:26 +0100,
> > Pierre-Louis Bossart wrote:
> >>
> >>
> >>>>>> So, do you suggest not doing preallocation(or calling it with 0
> >>>>>> size) for all
> >>>>>> driver with TYPE_SG? I am fine if this is the recommended method,
> >>>>>> I can try
> >>>>>> this on SOF I2S platform to see if it can work as we required for
> >>>>>> very large
> >>>>>> buffer size.
> >>>>
> >>>> Keyon, for the rest of us to follow this patch, would you mind
> >>>> clarifying what drives the need for a 'very large buffer size', and
> >>>> what order of magnitude this very large size would be.
> >>>>
> >>>> FWIW, we've measured consistently on different Windows/Linux
> >>>> platforms, maybe 10 years ago, that once you reach a buffer of 1s
> >>>> (384 kB) the benefits from increasing that buffer size further are
> >>>> marginal in terms of power consumption, and generate all kinds of
> >>>> issues with volume updates and deferred routing changes.
> >>>>
> >>> We need bigger buffer on host side to compensate the wake up time
> >>> from d0ix to d0 which takes ~2 seconds on my setup. So, with
> >>> smaller buffer sizes like < 2 seconds we overwrite data since FW
> >>> keeps copying while host doesn't read until it's up and running
> >>> again.
> >>
> >> Right, that's a valid case, but that's 256 kB, not 'very large' or
> >> likely to ever trigger an OOM case.
> >
> > That size shouldn't matter, and would work even with the
> > preallocation.
> >
> > My concern is that removing the limitation would allow the allocation
> > of too large sizes.  Even with dma_max limit, it can go up to 32MB
> > physical pages per stream for HDA.  Depending on the hardware setup,
> > there can be a lot of streams assignment (e.g. HDMI codecs) and
> > multiple codecs / controllers, and imagine that all those allocated
> > pages are pinned and can't be swapped out...
> 
> Hi Takashi, I get your concern here, but if we switch to use dma_max
> limit, we won't change the preallocated buffer, it will be still 64KB
> for each stream, user space can ask for re-allocate buffer for each
> stream up to 32MB, but those pinned and can't be swapped out ones are
> the 64KB preallocated ones only, am I wrong?

No, in general, all sound hardware buffers are pinned.


Takashi
Takashi Iwai Jan. 17, 2020, 8 a.m. UTC | #17
On Fri, 17 Jan 2020 06:37:16 +0100,
Keyon Jie wrote:
> 
> On 2020/1/17 上午1:40, Pierre-Louis Bossart wrote:
> >
> >>>>> So, do you suggest not doing preallocation(or calling it with 0
> >>>>> size) for all
> >>>>> driver with TYPE_SG? I am fine if this is the recommended
> >>>>> method, I can try
> >>>>> this on SOF I2S platform to see if it can work as we required
> >>>>> for very large
> >>>>> buffer size.
> >>>
> >>> Keyon, for the rest of us to follow this patch, would you mind
> >>> clarifying what drives the need for a 'very large buffer size',
> >>> and what order of magnitude this very large size would be.
> >>>
> >>> FWIW, we've measured consistently on different Windows/Linux
> >>> platforms, maybe 10 years ago, that once you reach a buffer of 1s
> >>> (384 kB) the benefits from increasing that buffer size further are
> >>> marginal in terms of power consumption, and generate all kinds of
> >>> issues with volume updates and deferred routing changes.
> >>>
> >> We need bigger buffer on host side to compensate the wake up time
> >> from d0ix to d0 which takes ~2 seconds on my setup. So, with
> >> smaller buffer sizes like < 2 seconds we overwrite data since FW
> >> keeps copying while host doesn't read until it's up and running
> >> again.
> >
> > Right, that's a valid case, but that's 256 kB, not 'very large' or
> > likely to ever trigger an OOM case.
> 
> For S24_LE, it is 512KB, the point is that if we can't re-allocate
> buffer at hw_params() stage, then we need follow a BKM that we have to
> preallocate the largest DMA buffer that we claim to support at
> > pcm_new(), I think this is actually another kind of waste with these
> largest pinned buffer that can't be swapped out...

Well, that's the case where you'd need a larger preallocation.
I guess many distros already set it to a higher value for PulseAudio.
The default 64kB is just for historical and compatibility reasons, and
we may extend it to 1MB or so now.


Takashi
Keyon Jie Jan. 17, 2020, 10:13 a.m. UTC | #18
On 2020/1/17 下午3:57, Takashi Iwai wrote:
> On Fri, 17 Jan 2020 06:30:18 +0100,
> Keyon Jie wrote:
>>
>> On 2020/1/17 上午4:37, Takashi Iwai wrote:
>>
>> Hi Takashi, I get your concern here, but if we switch to use dma_max
>> limit, we won't change the preallocated buffer, it will be still 64KB
>> for each stream, user space can ask for re-allocate buffer for each
>> stream up to 32MB, but those pinned and can't be swapped out ones are
>> the 64KB preallocated ones only, am I wrong?
> 
> No, in general, all sound hardware buffers are pinned.

Sorry, I must have been wrong here.  What I was focusing on is the
allocated SG DMA buffers; I am not sure if they are what you called
"hardware buffers" here.

My understanding was like this:

1. in pcm_new() stage, the device PCM driver should call 
snd_pcm_lib_preallocate_pages()->
	snd_pcm_lib_preallocate_pages()->
		preallocate_pcm_pages()
and then the substream->dma_buffer is initialized with the preallocated 
buffer.

2. in pcm_open() stage, the device PCM driver should call
snd_pcm_lib_malloc_pages()->
	snd_dma_alloc_pages() // if we need to reallocate a bigger buffer.
*The substream->dma_buffer won't be freed; Takashi, this is what I
thought you meant by "pinned" buffer.*  And the bigger buffers
reallocated via snd_dma_alloc_pages() will be freed at pcm_close(), per
my understanding?

Thanks,
~Keyon

> 
> 
> Takashi
Takashi Iwai Jan. 17, 2020, 10:30 a.m. UTC | #19
On Fri, 17 Jan 2020 11:13:31 +0100,
Keyon Jie wrote:
> 
> 
> 
> On 2020/1/17 下午3:57, Takashi Iwai wrote:
> > On Fri, 17 Jan 2020 06:30:18 +0100,
> > Keyon Jie wrote:
> >>
> >> On 2020/1/17 上午4:37, Takashi Iwai wrote:
> >>
> >> Hi Takashi, I get your concern here, but if we switch to use dma_max
> >> limit, we won't change the preallocated buffer, it will be still 64KB
> >> for each stream, user space can ask for re-allocate buffer for each
> >> stream up to 32MB, but those pinned and can't be swapped out ones are
> >> the 64KB preallocated ones only, am I wrong?
> >
> > No, in general, all sound hardware buffers are pinned.
> 
> Sorry, I must have been wrong here, what I was focusing on is those
> allocated SG DMA buffers, I am not sure if they are those you called
> "hardware buffers" here.
> 
> My understanding was like this:
> 
> 1. in pcm_new() stage, the device PCM driver should call
> snd_pcm_lib_preallocate_pages()->
> 	snd_pcm_lib_preallocate_pages()->
> 		preallocate_pcm_pages()
> and then the substream->dma_buffer is initialized with the
> preallocated buffer.
> 
> 2. in pcm_open() stage, the device PCM driver should call
> snd_pcm_lib_malloc_pages()->
> 	snd_dma_alloc_pages() //if we need to reallocate bigger
> buffer. *The substream->dma_buffer won't be freed, Takashi, this is
> what I thought you named "pinned" buffer.* And those reallocated
> bigger buffer via snd_dma_alloc_pages() will be freed at pcm_close()
> per my understanding?

What I meant by "pinned" is that the pages are not swapped out by
the swapper process, unlike user-space or anonymous pages.
So if you open all streams (say 16 streams) on a machine with 32MB
buffers, it'll cost half a GB.  And we have no restriction on
which users may do it, so any normal user who has access to the
sound device can easily consume half a GB of kernel-space pages.  For a
big server it's no problem, but for a small system, it's costly.
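
The half-GB figure is plain multiplication of the stream count by the
per-stream maximum.  A minimal sketch, taking 1MB as 2^20 bytes (an
assumption; the message does not say) and using a hypothetical helper
name:

```c
#include <assert.h>
#include <limits.h>

/* Total unswappable bytes if every stream takes its maximum buffer. */
static unsigned long long pinned_bytes(unsigned int streams,
				       unsigned long long bytes_per_stream)
{
	/* Guard against overflow of the product. */
	assert(streams == 0 || bytes_per_stream <= ULLONG_MAX / streams);
	return (unsigned long long)streams * bytes_per_stream;
}
```

With 16 streams at 32MB each, pinned_bytes(16, 32ULL << 20) is
536870912 bytes, i.e. 512MB.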


Takashi
Keyon Jie Jan. 17, 2020, 10:43 a.m. UTC | #20
On 2020/1/17 下午4:00, Takashi Iwai wrote:
> On Fri, 17 Jan 2020 06:37:16 +0100,
> Keyon Jie wrote:
>>
>> On 2020/1/17 上午1:40, Pierre-Louis Bossart wrote:
>>>
>>>>>>> So, do you suggest not doing preallocation(or calling it with 0
>>>>>>> size) for all
>>>>>>> driver with TYPE_SG? I am fine if this is the recommended
>>>>>>> method, I can try
>>>>>>> this on SOF I2S platform to see if it can work as we required
>>>>>>> for very large
>>>>>>> buffer size.
>>>>>
>>>>> Keyon, for the rest of us to follow this patch, would you mind
>>>>> clarifying what drives the need for a 'very large buffer size',
>>>>> and what order of magnitude this very large size would be.
>>>>>
>>>>> FWIW, we've measured consistently on different Windows/Linux
>>>>> platforms, maybe 10 years ago, that once you reach a buffer of 1s
>>>>> (384 kB) the benefits from increasing that buffer size further are
>>>>> marginal in terms of power consumption, and generate all kinds of
>>>>> issues with volume updates and deferred routing changes.
>>>>>
>>>> We need bigger buffer on host side to compensate the wake up time
>>>> from d0ix to d0 which takes ~2 seconds on my setup. So, with
>>>> smaller buffer sizes like < 2 seconds we overwrite data since FW
>>>> keeps copying while host doesn't read until it's up and running
>>>> again.
>>>
>>> Right, that's a valid case, but that's 256 kB, not 'very large' or
>>> likely to ever trigger an OOM case.
>>
>> For S24_LE, it is 512KB, the point is that if we can't re-allocate
>> buffer at hw_params() stage, then we need follow a BKM that we have to
>> preallocate the largest DMA buffer that we claim to support at
>> pcm_new(), I think this is actually another kind of waste with these
>> largest pinned buffer that can't be swapped out...
> 
> Well, that's the case you'd need a larger preallocation.
> I guess many distros already set it to a higher value for PulseAudio.
> The default 64kB is just from historical and compatibility reason, and
> we may extend it to 1MB or so now.

In SOF driver, we don't use kernel config item like 
CONFIG_SND_HDA_PREALLOC_SIZE for HDA, the code for it is:

	snd_pcm_lib_preallocate_pages(pcm->streams[stream].substream,
				      SNDRV_DMA_TYPE_DEV_SG, sdev->dev,
				le32_to_cpu(caps->buffer_size_min),
				le32_to_cpu(caps->buffer_size_max));

So the preallocated size is configured via topology file, that is 
caps->buffer_size_min, no chance for PulseAudio to reconfigure it.

So, it looks like we have to change it to this if we don't change the 
ALSA core:

	snd_pcm_lib_preallocate_pages(pcm->streams[stream].substream,
				      SNDRV_DMA_TYPE_DEV_SG, sdev->dev,
-				le32_to_cpu(caps->buffer_size_min),
+				le32_to_cpu(caps->buffer_size_max),
				le32_to_cpu(caps->buffer_size_max));


Thanks,
~Keyon

> 
> 
> Takashi
Keyon Jie Jan. 17, 2020, 10:56 a.m. UTC | #21
On 2020/1/17 下午6:30, Takashi Iwai wrote:
> On Fri, 17 Jan 2020 11:13:31 +0100,
> Keyon Jie wrote:
>>
>>
>>
>> On 2020/1/17 下午3:57, Takashi Iwai wrote:
>>> On Fri, 17 Jan 2020 06:30:18 +0100,
>>> Keyon Jie wrote:
>>>>
>>>> On 2020/1/17 上午4:37, Takashi Iwai wrote:
>>>>
>>>> Hi Takashi, I get your concern here, but if we switch to use dma_max
>>>> limit, we won't change the preallocated buffer, it will be still 64KB
>>>> for each stream, user space can ask for re-allocate buffer for each
>>>> stream up to 32MB, but those pinned and can't be swapped out ones are
>>>> the 64KB preallocated ones only, am I wrong?
>>>
>>> No, in general, all sound hardware buffers are pinned.
>>
>> Sorry, I must have been wrong here, what I was focusing on is those
>> allocated SG DMA buffers, I am not sure if they are those you called
>> "hardware buffers" here.
>>
>> My understanding was like this:
>>
>> 1. in pcm_new() stage, the device PCM driver should call
>> snd_pcm_lib_preallocate_pages()->
>> 	snd_pcm_lib_preallocate_pages()->
>> 		preallocate_pcm_pages()
>> and then the substream->dma_buffer is initialized with the
>> preallocated buffer.
>>
>> 2. in pcm_open() stage, the device PCM driver should call
>> snd_pcm_lib_malloc_pages()->
>> 	snd_dma_alloc_pages() //if we need to reallocate bigger
>> buffer. *The substream->dma_buffer won't be freed, Takashi, this is
>> what I thought you named "pinned" buffer.* And those reallocated
>> bigger buffer via snd_dma_alloc_pages() will be freed at pcm_close()
>> per my understanding?
> 
> What I meant as "pinned" is that the pages are not swapped out by
> swapper process like the user-space or anonymous pages.
> So if you open all streams (say 16 streams) on a machine with 32MB
> buffers, it'll cost a half GB.  And, we have no restriction about
> which user may do it, so all normal users who have the access to the
> sound device can consume a half GB kernel space pages easily.  For a
> big server it's no problem, but for a small system, it's costing.

Understood.  You are concerned about an intentional attack from user
space that consumes memory, and you propose that normal users should be
permitted to use only the default 64KB; if a larger buffer is required,
they should use the procfs expert mode.  Is my understanding correct?

Thanks,
~Keyon

> 
> 
> Takashi
>
Takashi Iwai Jan. 17, 2020, 11:12 a.m. UTC | #22
On Fri, 17 Jan 2020 11:43:24 +0100,
Keyon Jie wrote:
> 
> 
> 
> On 2020/1/17 下午4:00, Takashi Iwai wrote:
> > On Fri, 17 Jan 2020 06:37:16 +0100,
> > Keyon Jie wrote:
> >>
> >> On 2020/1/17 上午1:40, Pierre-Louis Bossart wrote:
> >>>
> >>>>>>> So, do you suggest not doing preallocation(or calling it with 0
> >>>>>>> size) for all
> >>>>>>> driver with TYPE_SG? I am fine if this is the recommended
> >>>>>>> method, I can try
> >>>>>>> this on SOF I2S platform to see if it can work as we required
> >>>>>>> for very large
> >>>>>>> buffer size.
> >>>>>
> >>>>> Keyon, for the rest of us to follow this patch, would you mind
> >>>>> clarifying what drives the need for a 'very large buffer size',
> >>>>> and what order of magnitude this very large size would be.
> >>>>>
> >>>>> FWIW, we've measured consistently on different Windows/Linux
> >>>>> platforms, maybe 10 years ago, that once you reach a buffer of 1s
> >>>>> (384 kB) the benefits from increasing that buffer size further are
> >>>>> marginal in terms of power consumption, and generate all kinds of
> >>>>> issues with volume updates and deferred routing changes.
> >>>>>
> >>>> We need bigger buffer on host side to compensate the wake up time
> >>>> from d0ix to d0 which takes ~2 seconds on my setup. So, with
> >>>> smaller buffer sizes like < 2 seconds we overwrite data since FW
> >>>> keeps copying while host doesn't read until it's up and running
> >>>> again.
> >>>
> >>> Right, that's a valid case, but that's 256 kB, not 'very large' or
> >>> likely to ever trigger an OOM case.
> >>
> >> For S24_LE, it is 512KB, the point is that if we can't re-allocate
> >> buffer at hw_params() stage, then we need follow a BKM that we have to
> >> preallocate the largest DMA buffer that we claim to support at
> >> pcm_new(), I think this is actually another kind of waste with these
> >> largest pinned buffer that can't be swapped out...
> >
> > Well, that's the case you'd need a larger preallocation.
> > I guess many distros already set it to a higher value for PulseAudio.
> > The default 64kB is just from historical and compatibility reason, and
> > we may extend it to 1MB or so now.
> 
> In SOF driver, we don't use kernel config item like
> CONFIG_SND_HDA_PREALLOC_SIZE for HDA, the code for it is:
> 
> 	snd_pcm_lib_preallocate_pages(pcm->streams[stream].substream,
> 				      SNDRV_DMA_TYPE_DEV_SG, sdev->dev,
> 				le32_to_cpu(caps->buffer_size_min),
> 				le32_to_cpu(caps->buffer_size_max));
> 
> So the preallocated size is configured via topology file, that is
> caps->buffer_size_min, no chance for PulseAudio to reconfigure it.
> 
> So, it looks like we have to change it to this if we don't change the
> ALSA core:
> 
> 	snd_pcm_lib_preallocate_pages(pcm->streams[stream].substream,
> 				      SNDRV_DMA_TYPE_DEV_SG, sdev->dev,
> -				le32_to_cpu(caps->buffer_size_min),
> +				le32_to_cpu(caps->buffer_size_max),
> 				le32_to_cpu(caps->buffer_size_max));

Yes, passing buffer_size_min for the preallocation sounds already
bad.  The default value should be sufficient for usual operations, not
the cost-cutting minimum.  Otherwise there is no merit of
preallocation.

Alternatively, we may pass 0 there, indicating no limitation, too.
But this would need a few other adjustments, e.g. snd_pcm_hardware
should have a lower buffer_bytes_max.


Takashi
Takashi Iwai Jan. 17, 2020, 11:15 a.m. UTC | #23
On Fri, 17 Jan 2020 11:56:48 +0100,
Keyon Jie wrote:
> 
> On 2020/1/17 6:30 PM, Takashi Iwai wrote:
> > On Fri, 17 Jan 2020 11:13:31 +0100,
> > Keyon Jie wrote:
> >>
> >>
> >>
> >> On 2020/1/17 3:57 PM, Takashi Iwai wrote:
> >>> On Fri, 17 Jan 2020 06:30:18 +0100,
> >>> Keyon Jie wrote:
> >>>>
> >>>> On 2020/1/17 4:37 AM, Takashi Iwai wrote:
> >>>>
> >>>> Hi Takashi, I get your concern here, but if we switch to use dma_max
> >>>> limit, we won't change the preallocated buffer, it will be still 64KB
> >>>> for each stream, user space can ask for re-allocate buffer for each
> >>>> stream up to 32MB, but those pinned and can't be swapped out ones are
> >>>> the 64KB preallocated ones only, am I wrong?
> >>>
> >>> No, in general, all sound hardware buffers are pinned.
> >>
> >> Sorry, I must have been wrong here, what I was focusing on is those
> >> allocated SG DMA buffers, I am not sure if they are those you called
> >> "hardware buffers" here.
> >>
> >> My understanding was like this:
> >>
> >> 1. in pcm_new() stage, the device PCM driver should call
> >> snd_pcm_lib_preallocate_pages()->
> >> 	snd_pcm_lib_preallocate_pages()->
> >> 		preallocate_pcm_pages()
> >> and then the substream->dma_buffer is initialized with the
> >> preallocated buffer.
> >>
> >> 2. in pcm_open() stage, the device PCM driver should call
> >> snd_pcm_lib_malloc_pages()->
> >> 	snd_dma_alloc_pages() //if we need to reallocate bigger
> >> buffer. *The substream->dma_buffer won't be freed, Takashi, this is
> >> what I thought you named "pinned" buffer.* And those reallocated
> >> bigger buffer via snd_dma_alloc_pages() will be freed at pcm_close()
> >> per my understanding?
> >
> > What I meant as "pinned" is that the pages are not swapped out by
> > swapper process like the user-space or anonymous pages.
> > So if you open all streams (say 16 streams) on a machine with 32MB
> > buffers, it'll cost a half GB.  And, we have no restriction about
> > which user may do it, so all normal users who have the access to the
> > sound device can consume a half GB kernel space pages easily.  For a
> > big server it's no problem, but for a small system, it's costing.
> 
> Understood, you are concerning about intentional attack from user
> space about memory consuming, you propose that normal user should be
> permitted to use the default 64KB only, if larger buffer required,
> please use proc fs expert mode, is my understanding correct?

Well, a normal user may want a 1MB or 2MB buffer, and that's not too
bad.  So most distros already set a larger preallocation for
HD-audio explicitly via CONFIG_SND_HDA_PREALLOC_SIZE without procfs
adjustment, I believe.  Then the system allows normal users buffers up
to the given size.
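
For scale, the worst-case figure from the quoted discussion (16 streams
with 32MB buffers costing half a GB) can be checked with a small
user-space helper.  This is only an illustrative sketch;
worst_case_pinned_bytes() is a hypothetical name, not a kernel function.

```c
#include <stdint.h>

/*
 * Worst-case unswappable memory if every open stream is given its
 * maximum buffer size.  Hypothetical helper for illustration only.
 */
static uint64_t worst_case_pinned_bytes(unsigned int streams,
					uint64_t max_buffer_bytes)
{
	return (uint64_t)streams * max_buffer_bytes;
}
```

With 16 streams, a 32MB maximum gives 512MB of pinned pages, while a 2MB
maximum gives 32MB.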


Takashi
Keyon Jie Jan. 19, 2020, 3:52 a.m. UTC | #24
On 2020/1/17 7:12 PM, Takashi Iwai wrote:
> On Fri, 17 Jan 2020 11:43:24 +0100,
> Keyon Jie wrote:
>>
>>
>>
>> On 2020/1/17 4:00 PM, Takashi Iwai wrote:
>>> On Fri, 17 Jan 2020 06:37:16 +0100,
>>> Keyon Jie wrote:
>>>>
>>>> On 2020/1/17 1:40 AM, Pierre-Louis Bossart wrote:
>>>>>
>>>>>>>>> So, do you suggest not doing preallocation(or calling it with 0
>>>>>>>>> size) for all
>>>>>>>>> driver with TYPE_SG? I am fine if this is the recommended
>>>>>>>>> method, I can try
>>>>>>>>> this on SOF I2S platform to see if it can work as we required
>>>>>>>>> for very large
>>>>>>>>> buffer size.
>>>>>>>
>>>>>>> Keyon, for the rest of us to follow this patch, would you mind
>>>>>>> clarifying what drives the need for a 'very large buffer size',
>>>>>>> and what order of magnitude this very large size would be.
>>>>>>>
>>>>>>> FWIW, we've measured consistently on different Windows/Linux
>>>>>>> platforms, maybe 10 years ago, that once you reach a buffer of 1s
>>>>>>> (384 kB) the benefits from increasing that buffer size further are
>>>>>>> marginal in terms of power consumption, and generate all kinds of
>>>>>>> issues with volume updates and deferred routing changes.
>>>>>>>
>>>>>> We need bigger buffer on host side to compensate the wake up time
>>>>>> from d0ix to d0 which takes ~2 seconds on my setup. So, with
>>>>>> smaller buffer sizes like < 2 seconds we overwrite data since FW
>>>>>> keeps copying while host doesn't read until it's up and running
>>>>>> again.
>>>>>
>>>>> Right, that's a valid case, but that's 256 kB, not 'very large' or
>>>>> likely to ever trigger an OOM case.
>>>>
>>>> For S24_LE, it is 512KB, the point is that if we can't re-allocate
>>>> buffer at hw_params() stage, then we need follow a BKM that we have to
>>>> preallocate the largest DMA buffer that we claim to support at
>>>> pcm_new(), I think this is actually another kind of waste with these
>>>> largest pinned buffer that can't be swapped out...
>>>
>>> Well, that's the case you'd need a larger preallocation.
>>> I guess many distros already set it to a higher value for PulseAudio.
>>> The default 64kB is just from historical and compatibility reason, and
>>> we may extend it to 1MB or so now.
>>
>> In SOF driver, we don't use kernel config item like
>> CONFIG_SND_HDA_PREALLOC_SIZE for HDA, the code for it is:
>>
>> 	snd_pcm_lib_preallocate_pages(pcm->streams[stream].substream,
>> 				      SNDRV_DMA_TYPE_DEV_SG, sdev->dev,
>> 				le32_to_cpu(caps->buffer_size_min),
>> 				le32_to_cpu(caps->buffer_size_max));
>>
>> So the preallocated size is configured via topology file, that is
>> caps->buffer_size_min, no chance for PulseAudio to reconfigure it.
>>
>> So, it looks like we have to change it to this if we don't change the
>> ALSA core:
>>
>> 	snd_pcm_lib_preallocate_pages(pcm->streams[stream].substream,
>> 				      SNDRV_DMA_TYPE_DEV_SG, sdev->dev,
>> -				le32_to_cpu(caps->buffer_size_min),
>> +				le32_to_cpu(caps->buffer_size_max),
>> 				le32_to_cpu(caps->buffer_size_max));
> 
> Yes, passing buffer_size_min for the preallocation sounds already
> bad.  The default value should be sufficient for usual operations, not
> the cost-cutting minimum.  Otherwise there is no merit of
> preallocation.
> 
> Alternatively, we may pass 0 there, indicating no limitation, too.
> But, this would need a bit other adjustment, e.g. snd_pcm_hardware
> should have lower buffer_bytes_max.

Thank you Takashi, then let's follow that and preallocate with
caps->buffer_size_max. As we don't specify any limitation in
snd_pcm_hardware today, we want to leave it configurable by each
specific topology file for different machines.

Thanks,
~Keyon

> 
> 
> Takashi
> _______________________________________________
> Alsa-devel mailing list
> Alsa-devel@alsa-project.org
> https://mailman.alsa-project.org/mailman/listinfo/alsa-devel
>
Takashi Iwai Jan. 19, 2020, 7:09 a.m. UTC | #25
On Sun, 19 Jan 2020 04:52:55 +0100,
Keyon Jie wrote:
> 
> 
> On 2020/1/17 7:12 PM, Takashi Iwai wrote:
> > On Fri, 17 Jan 2020 11:43:24 +0100,
> > Keyon Jie wrote:
> >>
> >> In SOF driver, we don't use kernel config item like
> >> CONFIG_SND_HDA_PREALLOC_SIZE for HDA, the code for it is:
> >>
> >> 	snd_pcm_lib_preallocate_pages(pcm->streams[stream].substream,
> >> 				      SNDRV_DMA_TYPE_DEV_SG, sdev->dev,
> >> 				le32_to_cpu(caps->buffer_size_min),
> >> 				le32_to_cpu(caps->buffer_size_max));
> >>
> >> So the preallocated size is configured via topology file, that is
> >> caps->buffer_size_min, no chance for PulseAudio to reconfigure it.
> >>
> >> So, it looks like we have to change it to this if we don't change the
> >> ALSA core:
> >>
> >> 	snd_pcm_lib_preallocate_pages(pcm->streams[stream].substream,
> >> 				      SNDRV_DMA_TYPE_DEV_SG, sdev->dev,
> >> -				le32_to_cpu(caps->buffer_size_min),
> >> +				le32_to_cpu(caps->buffer_size_max),
> >> 				le32_to_cpu(caps->buffer_size_max));
> >
> > Yes, passing buffer_size_min for the preallocation sounds already
> > bad.  The default value should be sufficient for usual operations, not
> > the cost-cutting minimum.  Otherwise there is no merit of
> > preallocation.
> >
> > Alternatively, we may pass 0 there, indicating no limitation, too.
> > But, this would need a bit other adjustment, e.g. snd_pcm_hardware
> > should have lower buffer_bytes_max.
> 
> Thank you Takashi, then let's follow it to pre-allocate with
> caps->buffer_size_max, as we don't specify any limitations in
> snd_pcm_hardware today, we want to leave it configurable to each
> specific topology file for different machines.

How big is caps->buffer_size_max?  Passing the value there means
actually trying to allocate the given size as default, and it'd be a
lot of waste if a too large value (e.g. 32MB) is passed there.

I think we can go for passing zero as default, which means skipping
preallocation.  In addition, we may add an upper limit of the total
amount of allocation per card, controlled in pcm_memory.c, for
example.  This logic can be applied to the legacy HDA, too.

This should be relatively easy, and I'll provide the patch in the next
week.
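
The per-card upper limit proposed above could be modeled roughly as
follows.  This is a user-space sketch under stated assumptions, not the
patch itself: struct card_model and the card_model_*() helpers are
hypothetical names, and locking is omitted (in the kernel the counter
would have to be protected against concurrent callers).

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-in for per-card accounting state. */
struct card_model {
	size_t total_alloc_bytes;	/* bytes currently accounted */
	size_t max_alloc_bytes;		/* budget; 0 means no limit */
};

static void card_model_init(struct card_model *card, size_t max_bytes)
{
	card->total_alloc_bytes = 0;
	card->max_alloc_bytes = max_bytes;
}

/* Account size bytes against the budget; returns 0 or -ENOMEM. */
static int card_model_reserve(struct card_model *card, size_t size)
{
	/* Written so that total + size cannot overflow size_t. */
	if (card->max_alloc_bytes &&
	    (size > card->max_alloc_bytes ||
	     card->total_alloc_bytes > card->max_alloc_bytes - size))
		return -ENOMEM;
	card->total_alloc_bytes += size;
	return 0;
}

/* Give back previously accounted bytes, clamping at zero. */
static void card_model_release(struct card_model *card, size_t size)
{
	if (size > card->total_alloc_bytes)
		size = card->total_alloc_bytes;
	card->total_alloc_bytes -= size;
}
```

The idea is that a request is refused once the running total for the
card would exceed the budget, regardless of which stream asks for it.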


Takashi
Keyon Jie Jan. 19, 2020, 8:11 a.m. UTC | #26
On 2020/1/19 3:09 PM, Takashi Iwai wrote:
> On Sun, 19 Jan 2020 04:52:55 +0100,
> Keyon Jie wrote:
>>
>>
>> On 2020/1/17 7:12 PM, Takashi Iwai wrote:
>>> On Fri, 17 Jan 2020 11:43:24 +0100,
>>> Keyon Jie wrote:
>>>>
>>>> In SOF driver, we don't use kernel config item like
>>>> CONFIG_SND_HDA_PREALLOC_SIZE for HDA, the code for it is:
>>>>
>>>> 	snd_pcm_lib_preallocate_pages(pcm->streams[stream].substream,
>>>> 				      SNDRV_DMA_TYPE_DEV_SG, sdev->dev,
>>>> 				le32_to_cpu(caps->buffer_size_min),
>>>> 				le32_to_cpu(caps->buffer_size_max));
>>>>
>>>> So the preallocated size is configured via topology file, that is
>>>> caps->buffer_size_min, no chance for PulseAudio to reconfigure it.
>>>>
>>>> So, it looks like we have to change it to this if we don't change the
>>>> ALSA core:
>>>>
>>>> 	snd_pcm_lib_preallocate_pages(pcm->streams[stream].substream,
>>>> 				      SNDRV_DMA_TYPE_DEV_SG, sdev->dev,
>>>> -				le32_to_cpu(caps->buffer_size_min),
>>>> +				le32_to_cpu(caps->buffer_size_max),
>>>> 				le32_to_cpu(caps->buffer_size_max));
>>>
>>> Yes, passing buffer_size_min for the preallocation sounds already
>>> bad.  The default value should be sufficient for usual operations, not
>>> the cost-cutting minimum.  Otherwise there is no merit of
>>> preallocation.
>>>
>>> Alternatively, we may pass 0 there, indicating no limitation, too.
>>> But, this would need a bit other adjustment, e.g. snd_pcm_hardware
>>> should have lower buffer_bytes_max.
>>
>> Thank you Takashi, then let's follow it to pre-allocate with
>> caps->buffer_size_max, as we don't specify any limitations in
>> snd_pcm_hardware today, we want to leave it configurable to each
>> specific topology file for different machines.
> 
> How big is caps->buffer_size_max?  Passing the value there means
> actually trying to allocate the given size as default, and it'd be a
> lot of waste if a too large value (e.g. 32MB) is passed there.

It varies for each stream: most of them are only 65536 bytes, whereas
the one for Wake-On-Voice might need a > 4 second buffer, which could
be up to about 1~2 MB, and another one for deep-buffer playback can be
up to about 8 MB.
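
As a rough guide to where such sizes come from, the buffer size is the
product of sample rate, channel count, bytes per sample and duration.
The sketch below is illustrative only: buffer_bytes() is a hypothetical
helper, and the 16 kHz / 4-channel and 48 kHz / 2-channel configurations
are assumptions picked because they reproduce the 256 kB, 512KB and
384 kB figures quoted earlier in the thread.  S24_LE samples occupy a
4-byte container, which is why it doubles the S16_LE size.

```c
#include <stddef.h>

/* Bytes each sample occupies in memory for a few ALSA formats. */
#define BYTES_S16_LE 2u
#define BYTES_S24_LE 4u	/* 24 valid bits stored in a 32-bit word */

/* Hypothetical helper: bytes needed to hold the given duration of audio. */
static size_t buffer_bytes(unsigned int rate_hz, unsigned int channels,
			   unsigned int bytes_per_sample,
			   unsigned int seconds)
{
	return (size_t)rate_hz * channels * bytes_per_sample * seconds;
}
```

For example, buffer_bytes(16000, 4, BYTES_S16_LE, 2) is 256000 and the
same stream in S24_LE is 512000.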

> 
> I think we can go for passing zero as default, which means skipping
> preallocation.  In addition, we may add an upper limit of the total

Just did an experiment and this works for me. I believe we still need to
call snd_pcm_set_managed_buffer() even though the preallocation is
skipped in this case, right?

> amount of allocation per card, controlled in pcm_memory.c, for
> example.  This logic can be applied to the legacy HDA, too.
> 
> This should be relatively easy, and I'll provide the patch in the next
> week.

OK, that's fine for me also, thank you.

~Keyon

> 
> 
> Takashi
Takashi Iwai Jan. 19, 2020, 9:04 a.m. UTC | #27
On Sun, 19 Jan 2020 09:11:17 +0100,
Keyon Jie wrote:
> 
> On 2020/1/19 3:09 PM, Takashi Iwai wrote:
> > On Sun, 19 Jan 2020 04:52:55 +0100,
> > Keyon Jie wrote:
> >>
> >>
> >> On 2020/1/17 7:12 PM, Takashi Iwai wrote:
> >>> On Fri, 17 Jan 2020 11:43:24 +0100,
> >>> Keyon Jie wrote:
> >>>>
> >>>> In SOF driver, we don't use kernel config item like
> >>>> CONFIG_SND_HDA_PREALLOC_SIZE for HDA, the code for it is:
> >>>>
> >>>> 	snd_pcm_lib_preallocate_pages(pcm->streams[stream].substream,
> >>>> 				      SNDRV_DMA_TYPE_DEV_SG, sdev->dev,
> >>>> 				le32_to_cpu(caps->buffer_size_min),
> >>>> 				le32_to_cpu(caps->buffer_size_max));
> >>>>
> >>>> So the preallocated size is configured via topology file, that is
> >>>> caps->buffer_size_min, no chance for PulseAudio to reconfigure it.
> >>>>
> >>>> So, it looks like we have to change it to this if we don't change the
> >>>> ALSA core:
> >>>>
> >>>> 	snd_pcm_lib_preallocate_pages(pcm->streams[stream].substream,
> >>>> 				      SNDRV_DMA_TYPE_DEV_SG, sdev->dev,
> >>>> -				le32_to_cpu(caps->buffer_size_min),
> >>>> +				le32_to_cpu(caps->buffer_size_max),
> >>>> 				le32_to_cpu(caps->buffer_size_max));
> >>>
> >>> Yes, passing buffer_size_min for the preallocation sounds already
> >>> bad.  The default value should be sufficient for usual operations, not
> >>> the cost-cutting minimum.  Otherwise there is no merit of
> >>> preallocation.
> >>>
> >>> Alternatively, we may pass 0 there, indicating no limitation, too.
> >>> But, this would need a bit other adjustment, e.g. snd_pcm_hardware
> >>> should have lower buffer_bytes_max.
> >>
> >> Thank you Takashi, then let's follow it to pre-allocate with
> >> caps->buffer_size_max, as we don't specify any limitations in
> >> snd_pcm_hardware today, we want to leave it configurable to each
> >> specific topology file for different machines.
> >
> > How big is caps->buffer_size_max?  Passing the value there means
> > actually trying to allocate the given size as default, and it'd be a
> > lot of waste if a too large value (e.g. 32MB) is passed there.
> 
> It varies for each stream, most of them are 65536 Bytes only, whereas
> one for Wake-On-Voice might need a > 4 Seconds buffer could be up to
> about 1~2MBytes, and another one for deep-buffer playback can be up to
> about 8MBytes.

Hm, so this varies so much depending on the use case?
I thought it comes from the topology file and it's essentially
consistent over various purposes.

> > I think we can go for passing zero as default, which means skipping
> > preallocation.  In addition, we may add an upper limit of the total
> 
> Just did an experiment and this works for me, I believe we still need
> to call snd_pcm_set_managed_buffer() though the preallocation is
> skipped in this, right?

No, snd_pcm_set_managed_buffer() is the new PCM preallocation API.
The old snd_pcm_lib_preallocate*() is almost gone.

> > amount of allocation per card, controlled in pcm_memory.c, for
> > example.  This logic can be applied to the legacy HDA, too.
> >
> > This should be relatively easy, and I'll provide the patch in the next
> > week.
> 
> OK, that's fine for me also, thank you.

Below is a quick hack for HDA.  We still need a certain amount of
preallocation for non-x86 systems that don't support SG-buffers, so
a bit of a trick is applied to Kconfig.

Totally untested, as usual.


thanks,

Takashi

---
diff --git a/include/sound/core.h b/include/sound/core.h
index 0e14b7a3e67b..ac8b692b69b4 100644
--- a/include/sound/core.h
+++ b/include/sound/core.h
@@ -120,6 +120,9 @@ struct snd_card {
 	int sync_irq;			/* assigned irq, used for PCM sync */
 	wait_queue_head_t remove_sleep;
 
+	size_t total_pcm_alloc_bytes;	/* total amount of allocated buffers */
+	struct mutex memory_mutex;	/* protection for the above */
+
 #ifdef CONFIG_PM
 	unsigned int power_state;	/* power state */
 	wait_queue_head_t power_sleep;
diff --git a/sound/core/init.c b/sound/core/init.c
index faa9f03c01ca..b02a99766351 100644
--- a/sound/core/init.c
+++ b/sound/core/init.c
@@ -211,6 +211,7 @@ int snd_card_new(struct device *parent, int idx, const char *xid,
 	INIT_LIST_HEAD(&card->ctl_files);
 	spin_lock_init(&card->files_lock);
 	INIT_LIST_HEAD(&card->files_list);
+	mutex_init(&card->memory_mutex);
 #ifdef CONFIG_PM
 	init_waitqueue_head(&card->power_sleep);
 #endif
diff --git a/sound/core/pcm_memory.c b/sound/core/pcm_memory.c
index d4702cc1d376..4883b0ccd475 100644
--- a/sound/core/pcm_memory.c
+++ b/sound/core/pcm_memory.c
@@ -27,6 +27,37 @@ MODULE_PARM_DESC(maximum_substreams, "Maximum substreams with preallocated DMA m
 
 static const size_t snd_minimum_buffer = 16384;
 
+static unsigned long max_alloc_per_card = 32UL * 1024UL * 1024UL;
+module_param(max_alloc_per_card, ulong, 0644);
+MODULE_PARM_DESC(max_alloc_per_card, "Max total allocation bytes per card.");
+
+static int do_alloc_pages(struct snd_card *card, int type, struct device *dev,
+			  size_t size, struct snd_dma_buffer *dmab)
+{
+	int err;
+
+	if (card->total_pcm_alloc_bytes + size > max_alloc_per_card)
+		return -ENOMEM;
+	err = snd_dma_alloc_pages(type, dev, size, dmab);
+	if (!err) {
+		mutex_lock(&card->memory_mutex);
+		card->total_pcm_alloc_bytes += dmab->bytes;
+		mutex_unlock(&card->memory_mutex);
+	}
+	return err;
+}
+
+static void do_free_pages(struct snd_card *card, struct snd_dma_buffer *dmab)
+{
+	if (!dmab->area)
+		return;
+	mutex_lock(&card->memory_mutex);
+	WARN_ON(card->total_pcm_alloc_bytes < dmab->bytes);
+	card->total_pcm_alloc_bytes -= dmab->bytes;
+	mutex_unlock(&card->memory_mutex);
+	snd_dma_free_pages(dmab);
+	dmab->area = NULL;
+}
 
 /*
  * try to allocate as the large pages as possible.
@@ -37,16 +68,15 @@ static const size_t snd_minimum_buffer = 16384;
 static int preallocate_pcm_pages(struct snd_pcm_substream *substream, size_t size)
 {
 	struct snd_dma_buffer *dmab = &substream->dma_buffer;
+	struct snd_card *card = substream->pcm->card;
 	size_t orig_size = size;
 	int err;
 
 	do {
-		if ((err = snd_dma_alloc_pages(dmab->dev.type, dmab->dev.dev,
-					       size, dmab)) < 0) {
-			if (err != -ENOMEM)
-				return err; /* fatal error */
-		} else
-			return 0;
+		err = do_alloc_pages(card, dmab->dev.type, dmab->dev.dev,
+				     size, dmab);
+		if (err != -ENOMEM)
+			return err;
 		size >>= 1;
 	} while (size >= snd_minimum_buffer);
 	dmab->bytes = 0; /* tell error */
@@ -62,10 +92,7 @@ static int preallocate_pcm_pages(struct snd_pcm_substream *substream, size_t siz
  */
 static void snd_pcm_lib_preallocate_dma_free(struct snd_pcm_substream *substream)
 {
-	if (substream->dma_buffer.area == NULL)
-		return;
-	snd_dma_free_pages(&substream->dma_buffer);
-	substream->dma_buffer.area = NULL;
+	do_free_pages(substream->pcm->card, &substream->dma_buffer);
 }
 
 /**
@@ -130,6 +157,7 @@ static void snd_pcm_lib_preallocate_proc_write(struct snd_info_entry *entry,
 					       struct snd_info_buffer *buffer)
 {
 	struct snd_pcm_substream *substream = entry->private_data;
+	struct snd_card *card = substream->pcm->card;
 	char line[64], str[64];
 	size_t size;
 	struct snd_dma_buffer new_dmab;
@@ -150,9 +178,10 @@ static void snd_pcm_lib_preallocate_proc_write(struct snd_info_entry *entry,
 		memset(&new_dmab, 0, sizeof(new_dmab));
 		new_dmab.dev = substream->dma_buffer.dev;
 		if (size > 0) {
-			if (snd_dma_alloc_pages(substream->dma_buffer.dev.type,
-						substream->dma_buffer.dev.dev,
-						size, &new_dmab) < 0) {
+			if (do_alloc_pages(card,
+					   substream->dma_buffer.dev.type,
+					   substream->dma_buffer.dev.dev,
+					   size, &new_dmab) < 0) {
 				buffer->error = -ENOMEM;
 				return;
 			}
@@ -161,7 +190,7 @@ static void snd_pcm_lib_preallocate_proc_write(struct snd_info_entry *entry,
 			substream->buffer_bytes_max = UINT_MAX;
 		}
 		if (substream->dma_buffer.area)
-			snd_dma_free_pages(&substream->dma_buffer);
+			do_free_pages(card, &substream->dma_buffer);
 		substream->dma_buffer = new_dmab;
 	} else {
 		buffer->error = -EINVAL;
@@ -346,6 +375,7 @@ struct page *snd_pcm_sgbuf_ops_page(struct snd_pcm_substream *substream, unsigne
  */
 int snd_pcm_lib_malloc_pages(struct snd_pcm_substream *substream, size_t size)
 {
+	struct snd_card *card = substream->pcm->card;
 	struct snd_pcm_runtime *runtime;
 	struct snd_dma_buffer *dmab = NULL;
 
@@ -374,9 +404,10 @@ int snd_pcm_lib_malloc_pages(struct snd_pcm_substream *substream, size_t size)
 		if (! dmab)
 			return -ENOMEM;
 		dmab->dev = substream->dma_buffer.dev;
-		if (snd_dma_alloc_pages(substream->dma_buffer.dev.type,
-					substream->dma_buffer.dev.dev,
-					size, dmab) < 0) {
+		if (do_alloc_pages(card,
+				   substream->dma_buffer.dev.type,
+				   substream->dma_buffer.dev.dev,
+				   size, dmab) < 0) {
 			kfree(dmab);
 			return -ENOMEM;
 		}
@@ -397,6 +428,7 @@ EXPORT_SYMBOL(snd_pcm_lib_malloc_pages);
  */
 int snd_pcm_lib_free_pages(struct snd_pcm_substream *substream)
 {
+	struct snd_card *card = substream->pcm->card;
 	struct snd_pcm_runtime *runtime;
 
 	if (PCM_RUNTIME_CHECK(substream))
@@ -406,7 +438,7 @@ int snd_pcm_lib_free_pages(struct snd_pcm_substream *substream)
 		return 0;
 	if (runtime->dma_buffer_p != &substream->dma_buffer) {
 		/* it's a newly allocated buffer.  release it now. */
-		snd_dma_free_pages(runtime->dma_buffer_p);
+		do_free_pages(card, runtime->dma_buffer_p);
 		kfree(runtime->dma_buffer_p);
 	}
 	snd_pcm_set_runtime_buffer(substream, NULL);
diff --git a/sound/hda/Kconfig b/sound/hda/Kconfig
index b0c88fe040ee..4ca6b09056f3 100644
--- a/sound/hda/Kconfig
+++ b/sound/hda/Kconfig
@@ -21,14 +21,16 @@ config SND_HDA_EXT_CORE
        select SND_HDA_CORE
 
 config SND_HDA_PREALLOC_SIZE
-	int "Pre-allocated buffer size for HD-audio driver"
+	int "Pre-allocated buffer size for HD-audio driver" if !SND_DMA_SGBUF
 	range 0 32768
-	default 64
+	default 0 if SND_DMA_SGBUF
+	default 64 if !SND_DMA_SGBUF
 	help
 	  Specifies the default pre-allocated buffer-size in kB for the
 	  HD-audio driver.  A larger buffer (e.g. 2048) is preferred
 	  for systems using PulseAudio.  The default 64 is chosen just
 	  for compatibility reasons.
+	  On x86 systems, the default is zero as we need no preallocation.
 
 	  Note that the pre-allocation size can be changed dynamically
 	  via a proc file (/proc/asound/card*/pcm*/sub*/prealloc), too.
Keyon Jie Jan. 19, 2020, 10:14 a.m. UTC | #28
On 2020/1/19 5:04 PM, Takashi Iwai wrote:
> On Sun, 19 Jan 2020 09:11:17 +0100,
> Keyon Jie wrote:
>> On 2020/1/19 3:09 PM, Takashi Iwai wrote:
>> It varies for each stream, most of them are 65536 Bytes only, whereas
>> one for Wake-On-Voice might need a > 4 Seconds buffer could be up to
>> about 1~2MBytes, and another one for deep-buffer playback can be up to
>> about 8MBytes.
> Hm, so this varies so much depending on the use case?
> I thought it comes from the topology file and it's essentially
> consistent over various purposes.

Yes, we add a different buffer_bytes_max limitation to each stream
depending on its use case. Basically we set it to only the maximum value
we claim to support; we don't want to waste any system memory.

> 
>>> I think we can go for passing zero as default, which means skipping
>>> preallocation.  In addition, we may add an upper limit of the total
>> Just did an experiment and this works for me, I believe we still need
>> to call snd_pcm_set_managed_buffer() though the preallocation is
>> skipped in this, right?
> No, snd_pcm_set_managed_buffer() is the new PCM preallocation API.
> The old snd_pcm_lib_preallocate*() is almost gone.

What I actually asked is: since the preallocation will be skipped
(by passing size=0), can we just not call
snd_pcm_set_managed_buffer() or snd_pcm_lib_preallocate*() in our SOF
PCM driver? I believe not (we still need the call to do the
initialization other than buffer allocation)?

> 
>>> amount of allocation per card, controlled in pcm_memory.c, for
>>> example.  This logic can be applied to the legacy HDA, too.
>>>
>>> This should be relatively easy, and I'll provide the patch in the next
>>> week.
>> OK, that's fine for me also, thank you.
> Below is a quick hack for HDA.  We still need the certain amount of
> preallocation for non-x86 systems that don't support SG-buffers, so
> a bit of trick is applied to Kconfig.
> 
> Totally untested, as usual.

Did a quick test (plus passing 0 size for preallocation in the SOF PCM
driver) and it works for my use case (no regression compared with not
applying this patch). Thank you.

Thanks,
~Keyon

> 
> 
> thanks,
> 
> Takashi
> 
> ---
> diff --git a/include/sound/core.h b/include/sound/core.h
> index 0e14b7a3e67b..ac8b692b69b4 100644
> --- a/include/sound/core.h
> +++ b/include/sound/core.h
> @@ -120,6 +120,9 @@ struct snd_card {
>   	int sync_irq;			/* assigned irq, used for PCM sync */
>   	wait_queue_head_t remove_sleep;
>   
> +	size_t total_pcm_alloc_bytes;	/* total amount of allocated buffers */
> +	struct mutex memory_mutex;	/* protection for the above */
> +
>   #ifdef CONFIG_PM
>   	unsigned int power_state;	/* power state */
>   	wait_queue_head_t power_sleep;
> diff --git a/sound/core/init.c b/sound/core/init.c
> index faa9f03c01ca..b02a99766351 100644
> --- a/sound/core/init.c
> +++ b/sound/core/init.c
> @@ -211,6 +211,7 @@ int snd_card_new(struct device *parent, int idx, const char *xid,
>   	INIT_LIST_HEAD(&card->ctl_files);
>   	spin_lock_init(&card->files_lock);
>   	INIT_LIST_HEAD(&card->files_list);
> +	mutex_init(&card->memory_mutex);
>   #ifdef CONFIG_PM
>   	init_waitqueue_head(&card->power_sleep);
>   #endif
> diff --git a/sound/core/pcm_memory.c b/sound/core/pcm_memory.c
> index d4702cc1d376..4883b0ccd475 100644
> --- a/sound/core/pcm_memory.c
> +++ b/sound/core/pcm_memory.c
> @@ -27,6 +27,37 @@ MODULE_PARM_DESC(maximum_substreams, "Maximum substreams with preallocated DMA m
>   
>   static const size_t snd_minimum_buffer = 16384;
>   
> +static unsigned long max_alloc_per_card = 32UL * 1024UL * 1024UL * 1024UL;
> +module_param(max_alloc_per_card, ulong, 0644);
> +MODULE_PARM_DESC(max_alloc_per_card, "Max total allocation bytes per card.");
> +
> +static int do_alloc_pages(struct snd_card *card, int type, struct device *dev,
> +			  size_t size, struct snd_dma_buffer *dmab)
> +{
> +	int err;
> +
> +	if (card->total_pcm_alloc_bytes + size > max_alloc_per_card)
> +		return -ENOMEM;
> +	err = snd_dma_alloc_pages(type, dev, size, dmab);
> +	if (!err) {
> +		mutex_lock(&card->memory_mutex);
> +		card->total_pcm_alloc_bytes += dmab->bytes;
> +		mutex_unlock(&card->memory_mutex);
> +	}
> +	return err;
> +}
> +
> +static void do_free_pages(struct snd_card *card, struct snd_dma_buffer *dmab)
> +{
> +	if (!dmab->area)
> +		return;
> +	mutex_lock(&card->memory_mutex);
> +	WARN_ON(card->total_pcm_alloc_bytes < dmab->bytes);
> +	card->total_pcm_alloc_bytes -= dmab->bytes;
> +	mutex_unlock(&card->memory_mutex);
> +	snd_dma_free_pages(dmab);
> +	dmab->area = NULL;
> +}
>   
>   /*
>    * try to allocate as the large pages as possible.
> @@ -37,16 +68,15 @@ static const size_t snd_minimum_buffer = 16384;
>   static int preallocate_pcm_pages(struct snd_pcm_substream *substream, size_t size)
>   {
>   	struct snd_dma_buffer *dmab = &substream->dma_buffer;
> +	struct snd_card *card = substream->pcm->card;
>   	size_t orig_size = size;
>   	int err;
>   
>   	do {
> -		if ((err = snd_dma_alloc_pages(dmab->dev.type, dmab->dev.dev,
> -					       size, dmab)) < 0) {
> -			if (err != -ENOMEM)
> -				return err; /* fatal error */
> -		} else
> -			return 0;
> +		err = do_alloc_pages(card, dmab->dev.type, dmab->dev.dev,
> +				     size, dmab);
> +		if (err != -ENOMEM)
> +			return err;
>   		size >>= 1;
>   	} while (size >= snd_minimum_buffer);
>   	dmab->bytes = 0; /* tell error */
> @@ -62,10 +92,7 @@ static int preallocate_pcm_pages(struct snd_pcm_substream *substream, size_t siz
>    */
>   static void snd_pcm_lib_preallocate_dma_free(struct snd_pcm_substream *substream)
>   {
> -	if (substream->dma_buffer.area == NULL)
> -		return;
> -	snd_dma_free_pages(&substream->dma_buffer);
> -	substream->dma_buffer.area = NULL;
> +	do_free_pages(substream->pcm->card, &substream->dma_buffer);
>   }
>   
>   /**
> @@ -130,6 +157,7 @@ static void snd_pcm_lib_preallocate_proc_write(struct snd_info_entry *entry,
>   					       struct snd_info_buffer *buffer)
>   {
>   	struct snd_pcm_substream *substream = entry->private_data;
> +	struct snd_card *card = substream->pcm->card;
>   	char line[64], str[64];
>   	size_t size;
>   	struct snd_dma_buffer new_dmab;
> @@ -150,9 +178,10 @@ static void snd_pcm_lib_preallocate_proc_write(struct snd_info_entry *entry,
>   		memset(&new_dmab, 0, sizeof(new_dmab));
>   		new_dmab.dev = substream->dma_buffer.dev;
>   		if (size > 0) {
> -			if (snd_dma_alloc_pages(substream->dma_buffer.dev.type,
> -						substream->dma_buffer.dev.dev,
> -						size, &new_dmab) < 0) {
> +			if (do_alloc_pages(card,
> +					   substream->dma_buffer.dev.type,
> +					   substream->dma_buffer.dev.dev,
> +					   size, &new_dmab) < 0) {
>   				buffer->error = -ENOMEM;
>   				return;
>   			}
> @@ -161,7 +190,7 @@ static void snd_pcm_lib_preallocate_proc_write(struct snd_info_entry *entry,
>   			substream->buffer_bytes_max = UINT_MAX;
>   		}
>   		if (substream->dma_buffer.area)
> -			snd_dma_free_pages(&substream->dma_buffer);
> +			do_free_pages(card, &substream->dma_buffer);
>   		substream->dma_buffer = new_dmab;
>   	} else {
>   		buffer->error = -EINVAL;
> @@ -346,6 +375,7 @@ struct page *snd_pcm_sgbuf_ops_page(struct snd_pcm_substream *substream, unsigne
>    */
>   int snd_pcm_lib_malloc_pages(struct snd_pcm_substream *substream, size_t size)
>   {
> +	struct snd_card *card = substream->pcm->card;
>   	struct snd_pcm_runtime *runtime;
>   	struct snd_dma_buffer *dmab = NULL;
>   
> @@ -374,9 +404,10 @@ int snd_pcm_lib_malloc_pages(struct snd_pcm_substream *substream, size_t size)
>   		if (! dmab)
>   			return -ENOMEM;
>   		dmab->dev = substream->dma_buffer.dev;
> -		if (snd_dma_alloc_pages(substream->dma_buffer.dev.type,
> -					substream->dma_buffer.dev.dev,
> -					size, dmab) < 0) {
> +		if (do_alloc_pages(card,
> +				   substream->dma_buffer.dev.type,
> +				   substream->dma_buffer.dev.dev,
> +				   size, dmab) < 0) {
>   			kfree(dmab);
>   			return -ENOMEM;
>   		}
> @@ -397,6 +428,7 @@ EXPORT_SYMBOL(snd_pcm_lib_malloc_pages);
>    */
>   int snd_pcm_lib_free_pages(struct snd_pcm_substream *substream)
>   {
> +	struct snd_card *card = substream->pcm->card;
>   	struct snd_pcm_runtime *runtime;
>   
>   	if (PCM_RUNTIME_CHECK(substream))
> @@ -406,7 +438,7 @@ int snd_pcm_lib_free_pages(struct snd_pcm_substream *substream)
>   		return 0;
>   	if (runtime->dma_buffer_p != &substream->dma_buffer) {
>   		/* it's a newly allocated buffer.  release it now. */
> -		snd_dma_free_pages(runtime->dma_buffer_p);
> +		do_free_pages(card, runtime->dma_buffer_p);
>   		kfree(runtime->dma_buffer_p);
>   	}
>   	snd_pcm_set_runtime_buffer(substream, NULL);
> diff --git a/sound/hda/Kconfig b/sound/hda/Kconfig
> index b0c88fe040ee..4ca6b09056f3 100644
> --- a/sound/hda/Kconfig
> +++ b/sound/hda/Kconfig
> @@ -21,14 +21,16 @@ config SND_HDA_EXT_CORE
>          select SND_HDA_CORE
>   
>   config SND_HDA_PREALLOC_SIZE
> -	int "Pre-allocated buffer size for HD-audio driver"
> +	int "Pre-allocated buffer size for HD-audio driver" if !SND_DMA_SGBUF
>   	range 0 32768
> -	default 64
> +	default 0 if SND_DMA_SGBUF
> +	default 64 if !SND_DMA_SGBUF
>   	help
>   	  Specifies the default pre-allocated buffer-size in kB for the
>   	  HD-audio driver.  A larger buffer (e.g. 2048) is preferred
>   	  for systems using PulseAudio.  The default 64 is chosen just
>   	  for compatibility reasons.
> +	  On x86 systems, the default is zero as we need no preallocation.
>   
>   	  Note that the pre-allocation size can be changed dynamically
>   	  via a proc file (/proc/asound/card*/pcm*/sub*/prealloc), too.
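The per-card accounting that the do_alloc_pages()/do_free_pages() wrappers in the quoted hunks introduce can be pictured with a small user-space model. This is only a sketch under assumptions: the helper bodies are not shown in this part of the thread, so `model_card`, `model_buffer`, `model_alloc_pages()` and `model_free_pages()` are hypothetical names, the limit is a plain field rather than whatever knob the final patch uses, and `malloc()` stands in for the real DMA allocator.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for the per-card bookkeeping. */
struct model_card {
	size_t total_alloc_bytes;	/* bytes currently allocated for PCM buffers */
	size_t max_alloc_bytes;		/* upper limit per card, 0 means "no limit" */
};

/* Hypothetical stand-in for a DMA buffer descriptor. */
struct model_buffer {
	void *area;
	size_t bytes;
};

/* Refuse the allocation if it would push the card over its limit. */
static int model_alloc_pages(struct model_card *card, size_t size,
			     struct model_buffer *buf)
{
	assert(card != NULL && buf != NULL);

	if (card->max_alloc_bytes &&
	    card->total_alloc_bytes + size > card->max_alloc_bytes)
		return -ENOMEM;

	buf->area = malloc(size);	/* stands in for the DMA allocation */
	if (!buf->area)
		return -ENOMEM;
	buf->bytes = size;
	card->total_alloc_bytes += size;
	return 0;
}

/* Give the bytes back to the card's budget and release the buffer. */
static void model_free_pages(struct model_card *card, struct model_buffer *buf)
{
	assert(card != NULL && buf != NULL);

	card->total_alloc_bytes -= buf->bytes;
	free(buf->area);
	buf->area = NULL;
	buf->bytes = 0;
}
```

With a 1 MiB limit, for instance, a 768 KiB request succeeds, a following 512 KiB request is refused with -ENOMEM, and it succeeds again once the first buffer has been freed.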
Takashi Iwai Jan. 19, 2020, 10:43 a.m. UTC | #29
On Sun, 19 Jan 2020 11:14:56 +0100,
Keyon Jie wrote:
> 
> On 2020/1/19 5:04 PM, Takashi Iwai wrote:
> > On Sun, 19 Jan 2020 09:11:17 +0100,
> > Keyon Jie wrote:
> >> On 2020/1/19 3:09 PM, Takashi Iwai wrote:
> >> It varies per stream: most of them are only 65536 bytes, whereas
> >> the one for Wake-On-Voice might need a buffer of more than 4 seconds,
> >> which could be up to about 1~2 MBytes, and another one for deep-buffer
> >> playback can be up to about 8 MBytes.
> > Hm, so this varies so much depending on the use case?
> > I thought it comes from the topology file and it's essentially
> > consistent over various purposes.
> 
> Yes, we apply a different buffer_bytes_max limit to each stream
> depending on its use case. Basically we set it only to the maximum
> value we claim to support, since we don't want to waste any system
> memory.
> 
> >
> >>> I think we can go for passing zero as default, which means skipping
> >>> preallocation.  In addition, we may add an upper limit of the total
> >> Just did an experiment and this works for me. I believe we still need
> >> to call snd_pcm_set_managed_buffer() even though the preallocation is
> >> skipped in this case, right?
> > No, snd_pcm_set_managed_buffer() is the new PCM preallocation API.
> > The old snd_pcm_lib_preallocate*() is almost gone.
> 
> What I actually asked is: since the preallocation will be skipped
> (by passing size=0), can we simply not call
> snd_pcm_set_managed_buffer() or snd_pcm_lib_preallocate*() in our SOF
> PCM driver? I believe not (we still need the call to do the
> initialization other than the buffer allocation)?

You still need to call it.  Otherwise the PCM core doesn't know what
kind of buffer type has to be allocated.

Basically snd_pcm_set_managed_buffer() or snd_pcm_lib_preallocate*()
does two things: it sets the buffer type and it sets up the
preallocation (default and max size).  The default size can be 0,
meaning that no default preallocation is performed.  The max can be 0
as well, i.e. no preallocation is needed at all for the buffers
(e.g. vmalloc buffers).  Meanwhile, the buffer type and its device
pointer are mandatory and can't be skipped.
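
The two jobs described above can be sketched as a small user-space model in C. It is not kernel code and makes no claim about the real implementation: `model_substream`, `model_set_managed_buffer()`, `model_release()` and the `MODEL_DMA_*` values are hypothetical stand-ins, and `malloc()` stands in for the DMA allocation.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-ins for the kernel's buffer types. */
enum model_dma_type {
	MODEL_DMA_UNKNOWN = 0,
	MODEL_DMA_DEV_SG,
	MODEL_DMA_VMALLOC
};

/* Hypothetical, heavily reduced view of a substream. */
struct model_substream {
	enum model_dma_type type;	/* job 1: which kind of buffer to allocate */
	void *dev;			/* job 1: device the buffer belongs to */
	void *prealloc;			/* job 2: default preallocated buffer, if any */
	size_t prealloc_bytes;
	size_t dma_max;			/* job 2: max preallocation size */
};

static void model_set_managed_buffer(struct model_substream *s,
				     enum model_dma_type type, void *dev,
				     size_t size, size_t max)
{
	assert(s != NULL);
	assert(type != MODEL_DMA_UNKNOWN);	/* the type cannot be skipped */

	/* Job 1: always record the buffer type and device. */
	s->type = type;
	s->dev = dev;

	/* Job 2: preallocate only when a non-zero default size is given. */
	s->prealloc = NULL;
	s->prealloc_bytes = 0;
	if (size > 0) {
		s->prealloc = malloc(size);
		if (s->prealloc)
			s->prealloc_bytes = size;
	}
	s->dma_max = max;
}

static void model_release(struct model_substream *s)
{
	free(s->prealloc);
	s->prealloc = NULL;
	s->prealloc_bytes = 0;
}
```

Passing size=0 in this model still records the type and device, which is the point made above: the call cannot be dropped even when nothing is preallocated.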

> >
> >>> amount of allocation per card, controlled in pcm_memory.c, for
> >>> example.  This logic can be applied to the legacy HDA, too.
> >>>
> >>> This should be relatively easy, and I'll provide the patch in the next
> >>> week.
> >> OK, that's fine for me also, thank you.
> > Below is a quick hack for HDA.  We still need a certain amount of
> > preallocation for non-x86 systems that don't support SG-buffers, so
> > a bit of a trick is applied to Kconfig.
> >
> > Totally untested, as usual.
> 
> Did a quick test (plus passing size 0 for preallocation in the SOF PCM
> driver) and it works for my use case (no regression compared to
> running without this patch). Thank you.

OK, will tidy up and submit later.


Takashi
Keyon Jie Jan. 20, 2020, 2:23 a.m. UTC | #30
On 2020/1/19 6:43 PM, Takashi Iwai wrote:
> On Sun, 19 Jan 2020 11:14:56 +0100,
> Keyon Jie wrote:
>>
>> On 2020/1/19 5:04 PM, Takashi Iwai wrote:
>>> On Sun, 19 Jan 2020 09:11:17 +0100,
>>> Keyon Jie wrote:
>>>> On 2020/1/19 3:09 PM, Takashi Iwai wrote:
>>>> It varies per stream: most of them are only 65536 bytes, whereas
>>>> the one for Wake-On-Voice might need a buffer of more than 4 seconds,
>>>> which could be up to about 1~2 MBytes, and another one for deep-buffer
>>>> playback can be up to about 8 MBytes.
>>> Hm, so this varies so much depending on the use case?
>>> I thought it comes from the topology file and it's essentially
>>> consistent over various purposes.
>>
>> Yes, we apply a different buffer_bytes_max limit to each stream
>> depending on its use case. Basically we set it only to the maximum
>> value we claim to support, since we don't want to waste any system
>> memory.
>>
>>>
>>>>> I think we can go for passing zero as default, which means skipping
>>>>> preallocation.  In addition, we may add an upper limit of the total
>>>> Just did an experiment and this works for me. I believe we still need
>>>> to call snd_pcm_set_managed_buffer() even though the preallocation is
>>>> skipped in this case, right?
>>> No, snd_pcm_set_managed_buffer() is the new PCM preallocation API.
>>> The old snd_pcm_lib_preallocate*() is almost gone.
>>
>> What I actually asked is: since the preallocation will be skipped
>> (by passing size=0), can we simply not call
>> snd_pcm_set_managed_buffer() or snd_pcm_lib_preallocate*() in our SOF
>> PCM driver? I believe not (we still need the call to do the
>> initialization other than the buffer allocation)?
> 
> You still need to call it.  Otherwise the PCM core doesn't know what
> kind of buffer type has to be allocated.
> 
> Basically snd_pcm_set_managed_buffer() or snd_pcm_lib_preallocate*()
> does two things: it sets the buffer type and it sets up the
> preallocation (default and max size).  The default size can be 0,
> meaning that no default preallocation is performed.  The max can be 0
> as well, i.e. no preallocation is needed at all for the buffers
> (e.g. vmalloc buffers).  Meanwhile, the buffer type and its device
> pointer are mandatory and can't be skipped.

Got it, thanks for the guidance, Takashi.

Thanks,
~Keyon

> 
>>>
>>>>> amount of allocation per card, controlled in pcm_memory.c, for
>>>>> example.  This logic can be applied to the legacy HDA, too.
>>>>>
>>>>> This should be relatively easy, and I'll provide the patch in the next
>>>>> week.
>>>> OK, that's fine for me also, thank you.
>>> Below is a quick hack for HDA.  We still need a certain amount of
>>> preallocation for non-x86 systems that don't support SG-buffers, so
>>> a bit of a trick is applied to Kconfig.
>>>
>>> Totally untested, as usual.
>>
>> Did a quick test (plus passing size 0 for preallocation in the SOF PCM
>> driver) and it works for my use case (no regression compared to
>> running without this patch). Thank you.
> 
> OK, will tidy up and submit later.
> 
> 
> Takashi

Patch

diff --git a/sound/core/pcm_native.c b/sound/core/pcm_native.c
index c375c41496f8..326e921006e7 100644
--- a/sound/core/pcm_native.c
+++ b/sound/core/pcm_native.c
@@ -2301,7 +2301,7 @@  static int snd_pcm_hw_rule_buffer_bytes_max(struct snd_pcm_hw_params *params,
 	struct snd_interval t;
 	struct snd_pcm_substream *substream = rule->private;
 	t.min = 0;
-	t.max = substream->buffer_bytes_max;
+	t.max = substream->dma_max;
 	t.openmin = 0;
 	t.openmax = 0;
 	t.integer = 1;
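
To illustrate why the value assigned to t.max matters, here is a minimal user-space sketch of a rule that narrows the upper bound of the buffer-bytes interval. It is a simplified model, not the kernel's interval code: `model_interval` and `model_refine_max()` are hypothetical names, and the open-bound and integer flags from the hunk above are left out.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical, reduced interval: closed bounds only. */
struct model_interval {
	unsigned int min;
	unsigned int max;
};

/*
 * Narrow iv->max to at most cap.
 * Returns 1 if the interval changed, 0 if it was already within the cap,
 * and -EINVAL if the cap would leave the interval empty.
 */
static int model_refine_max(struct model_interval *iv, unsigned int cap)
{
	assert(iv != NULL);

	if (cap < iv->min)
		return -EINVAL;
	if (iv->max <= cap)
		return 0;
	iv->max = cap;
	return 1;
}
```

In this model, a request range of 4096 to 8388608 bytes refined with a cap of 65536 ends up with max = 65536, so anything larger can no longer be chosen; refined with a cap of 8388608 it is left unchanged. Whichever field feeds the cap therefore decides the largest buffer user space can ask for.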