diff mbox

[1/2] ALSA: firewire: process packets in 'struct snd_pcm_ops.ack' callback

Message ID s5hvao7337g.wl-tiwai@suse.de (mailing list archive)
State New, archived
Headers show

Commit Message

Takashi Iwai June 7, 2017, 9:20 p.m. UTC
On Wed, 07 Jun 2017 07:59:20 +0200,
Takashi Iwai wrote:
> 
> On Wed, 07 Jun 2017 02:38:05 +0200,
> Takashi Sakamoto wrote:
> > 
> > In recent commit for ALSA PCM core, some arrangement is done for
> > 'struct snd_pcm_ops.ack' callback. This is called when appl_ptr is
> > explicitly moved in intermediate buffer for PCM frames, except for
> > some cases described later.
> > 
> > For drivers in ALSA firewire stack, usage of this callback has a merit to
> > reduce latency between time of PCM frame queueing and handling actual
> > packets in recent isochronous cycle, because no need to wait for software
> > IRQ context from isochronous context of OHCI 1394.
> > 
> > If this works well in a case that mapped page frame is used for the
> > intermediate buffer, user process should execute some commands for ioctl(2)
> > to tell the number of handled PCM frames in the intermediate buffer just
> > after handling them.
> 
> This is one thing that was raised in the discussion with Intel people,
> and my suggestion was to add a new flag to suppress the status/control
> mmap like pcm_file->no_compat_mmap.  Then alsa-lib falls back to the
> sync_ptr ioctl, and the driver can catch each appl_ptr update.

Now I considered this again, and concluded that a simple patch like
below should suffice.

Adding Intel people to Cc, who raised the issue originally.


thanks,

Takashi

-- 8< --
From: Takashi Iwai <tiwai@suse.de>
Subject: [PATCH] ALSA: pcm: Suppress status/control mmap when ack ops is
 present

The drivers using PCM ack ops require the notification whenever
appl_ptr is updated in general.  But when the PCM status/control page
is mmapped, this notification doesn't happen, per design, thus it's
not guaranteed to receive the fine-grained updates.

For improving the situation, this patch simply suppresses the PCM
status/control mmap when ack ops is defined.  At least, for all
existing drivers with ack, this should give more benefit.

Once when we really need the full optimization with status/control
mmap even using ack ops, we may reconsider the check, e.g. introducing
a new flag.  But, so far, this should be good enough.

Signed-off-by: Takashi Iwai <tiwai@suse.de>
---
 sound/core/pcm_native.c | 8 ++++++++
 1 file changed, 8 insertions(+)

Comments

Vinod Koul June 9, 2017, 3:42 a.m. UTC | #1
On Wed, Jun 07, 2017 at 11:20:03PM +0200, Takashi Iwai wrote:
> On Wed, 07 Jun 2017 07:59:20 +0200,
> Takashi Iwai wrote:
> > 
> > On Wed, 07 Jun 2017 02:38:05 +0200,
> > Takashi Sakamoto wrote:
> > > 
> > > In recent commit for ALSA PCM core, some arrangement is done for
> > > 'struct snd_pcm_ops.ack' callback. This is called when appl_ptr is
> > > explicitly moved in intermediate buffer for PCM frames, except for
> > > some cases described later.
> > > 
> > > For drivers in ALSA firewire stack, usage of this callback has a merit to
> > > reduce latency between time of PCM frame queueing and handling actual
> > > packets in recent isochronous cycle, because no need to wait for software
> > > IRQ context from isochronous context of OHCI 1394.
> > > 
> > > If this works well in a case that mapped page frame is used for the
> > > intermediate buffer, user process should execute some commands for ioctl(2)
> > > to tell the number of handled PCM frames in the intermediate buffer just
> > > after handling them.
> > 
> > This is one thing that was raised in the discussion with Intel people,
> > and my suggestion was to add a new flag to suppress the status/control
> > mmap like pcm_file->no_compat_mmap.  Then alsa-lib falls back to the
> > sync_ptr ioctl, and the driver can catch each appl_ptr update.
> 
> Now I considered this again, and concluded that a simple patch like
> below should suffice.
> 
> Adding Intel people to Cc, who raised the issue originally.
> 
> 
> thanks,
> 
> Takashi
> 
> -- 8< --
> From: Takashi Iwai <tiwai@suse.de>
> Subject: [PATCH] ALSA: pcm: Suppress status/control mmap when ack ops is
>  present
> 
> The drivers using PCM ack ops require the notification whenever
> appl_ptr is updated in general.  But when the PCM status/control page
> is mmapped, this notification doesn't happen, per design, thus it's
> not guaranteed to receive the fine-grained updates.
> 
> For improving the situation, this patch simply suppresses the PCM
> status/control mmap when ack ops is defined.  At least, for all
> existing drivers with ack, this should give more benefit.
> 
> Once when we really need the full optimization with status/control
> mmap even using ack ops, we may reconsider the check, e.g. introducing
> a new flag.  But, so far, this should be good enough.

Yes this makes sense and we tested it for us, looks good

Reveiwed-by: Vinod Koul <vinod.koul@intel.com>
Tested-by: Subhransu S. Prusty <subhransu.s.prusty@intel.com>

> 
> Signed-off-by: Takashi Iwai <tiwai@suse.de>
> ---
>  sound/core/pcm_native.c | 8 ++++++++
>  1 file changed, 8 insertions(+)
> 
> diff --git a/sound/core/pcm_native.c b/sound/core/pcm_native.c
> index 2bde07a4a87f..b993b0420411 100644
> --- a/sound/core/pcm_native.c
> +++ b/sound/core/pcm_native.c
> @@ -3189,6 +3189,10 @@ static int snd_pcm_mmap_status(struct snd_pcm_substream *substream, struct file
>  			       struct vm_area_struct *area)
>  {
>  	long size;
> +
> +	/* suppress status/control mmap when driver requires ack */
> +	if (substream->ops->ack)
> +		return -ENXIO;
>  	if (!(area->vm_flags & VM_READ))
>  		return -EINVAL;
>  	size = area->vm_end - area->vm_start;
> @@ -3225,6 +3229,10 @@ static int snd_pcm_mmap_control(struct snd_pcm_substream *substream, struct file
>  				struct vm_area_struct *area)
>  {
>  	long size;
> +
> +	/* suppress status/control mmap when driver requires ack */
> +	if (substream->ops->ack)
> +		return -ENXIO;
>  	if (!(area->vm_flags & VM_READ))
>  		return -EINVAL;
>  	size = area->vm_end - area->vm_start;
> -- 
> 2.13.0
>
Takashi Iwai June 9, 2017, 6:53 a.m. UTC | #2
On Fri, 09 Jun 2017 05:42:31 +0200,
Vinod Koul wrote:
> 
> On Wed, Jun 07, 2017 at 11:20:03PM +0200, Takashi Iwai wrote:
> > On Wed, 07 Jun 2017 07:59:20 +0200,
> > Takashi Iwai wrote:
> > > 
> > > On Wed, 07 Jun 2017 02:38:05 +0200,
> > > Takashi Sakamoto wrote:
> > > > 
> > > > In recent commit for ALSA PCM core, some arrangement is done for
> > > > 'struct snd_pcm_ops.ack' callback. This is called when appl_ptr is
> > > > explicitly moved in intermediate buffer for PCM frames, except for
> > > > some cases described later.
> > > > 
> > > > For drivers in ALSA firewire stack, usage of this callback has a merit to
> > > > reduce latency between time of PCM frame queueing and handling actual
> > > > packets in recent isochronous cycle, because no need to wait for software
> > > > IRQ context from isochronous context of OHCI 1394.
> > > > 
> > > > If this works well in a case that mapped page frame is used for the
> > > > intermediate buffer, user process should execute some commands for ioctl(2)
> > > > to tell the number of handled PCM frames in the intermediate buffer just
> > > > after handling them.
> > > 
> > > This is one thing that was raised in the discussion with Intel people,
> > > and my suggestion was to add a new flag to suppress the status/control
> > > mmap like pcm_file->no_compat_mmap.  Then alsa-lib falls back to the
> > > sync_ptr ioctl, and the driver can catch each appl_ptr update.
> > 
> > Now I considered this again, and concluded that a simple patch like
> > below should suffice.
> > 
> > Adding Intel people to Cc, who raised the issue originally.
> > 
> > 
> > thanks,
> > 
> > Takashi
> > 
> > -- 8< --
> > From: Takashi Iwai <tiwai@suse.de>
> > Subject: [PATCH] ALSA: pcm: Suppress status/control mmap when ack ops is
> >  present
> > 
> > The drivers using PCM ack ops require the notification whenever
> > appl_ptr is updated in general.  But when the PCM status/control page
> > is mmapped, this notification doesn't happen, per design, thus it's
> > not guaranteed to receive the fine-grained updates.
> > 
> > For improving the situation, this patch simply suppresses the PCM
> > status/control mmap when ack ops is defined.  At least, for all
> > existing drivers with ack, this should give more benefit.
> > 
> > Once when we really need the full optimization with status/control
> > mmap even using ack ops, we may reconsider the check, e.g. introducing
> > a new flag.  But, so far, this should be good enough.
> 
> Yes this makes sense and we tested it for us, looks good
> 
> Reveiwed-by: Vinod Koul <vinod.koul@intel.com>
> Tested-by: Subhransu S. Prusty <subhransu.s.prusty@intel.com>

OK, thanks.

If Sakamato-san is happy with this change, I'm going to merge it for
4.13.


Takashi


> > Signed-off-by: Takashi Iwai <tiwai@suse.de>
> > ---
> >  sound/core/pcm_native.c | 8 ++++++++
> >  1 file changed, 8 insertions(+)
> > 
> > diff --git a/sound/core/pcm_native.c b/sound/core/pcm_native.c
> > index 2bde07a4a87f..b993b0420411 100644
> > --- a/sound/core/pcm_native.c
> > +++ b/sound/core/pcm_native.c
> > @@ -3189,6 +3189,10 @@ static int snd_pcm_mmap_status(struct snd_pcm_substream *substream, struct file
> >  			       struct vm_area_struct *area)
> >  {
> >  	long size;
> > +
> > +	/* suppress status/control mmap when driver requires ack */
> > +	if (substream->ops->ack)
> > +		return -ENXIO;
> >  	if (!(area->vm_flags & VM_READ))
> >  		return -EINVAL;
> >  	size = area->vm_end - area->vm_start;
> > @@ -3225,6 +3229,10 @@ static int snd_pcm_mmap_control(struct snd_pcm_substream *substream, struct file
> >  				struct vm_area_struct *area)
> >  {
> >  	long size;
> > +
> > +	/* suppress status/control mmap when driver requires ack */
> > +	if (substream->ops->ack)
> > +		return -ENXIO;
> >  	if (!(area->vm_flags & VM_READ))
> >  		return -EINVAL;
> >  	size = area->vm_end - area->vm_start;
> > -- 
> > 2.13.0
> > 
> 
> -- 
> ~Vinod
>
Takashi Sakamoto June 9, 2017, 7:01 a.m. UTC | #3
On Jun 9 2017 15:53, Takashi Iwai wrote:
>>> From: Takashi Iwai <tiwai@suse.de>
>>> Subject: [PATCH] ALSA: pcm: Suppress status/control mmap when ack ops is
>>>   present
>>>
>>> The drivers using PCM ack ops require the notification whenever
>>> appl_ptr is updated in general.  But when the PCM status/control page
>>> is mmapped, this notification doesn't happen, per design, thus it's
>>> not guaranteed to receive the fine-grained updates.
>>>
>>> For improving the situation, this patch simply suppresses the PCM
>>> status/control mmap when ack ops is defined.  At least, for all
>>> existing drivers with ack, this should give more benefit.
>>>
>>> Once when we really need the full optimization with status/control
>>> mmap even using ack ops, we may reconsider the check, e.g. introducing
>>> a new flag.  But, so far, this should be good enough.
>>
>> Yes this makes sense and we tested it for us, looks good
>>
>> Reveiwed-by: Vinod Koul <vinod.koul@intel.com>
>> Tested-by: Subhransu S. Prusty <subhransu.s.prusty@intel.com>
> 
> OK, thanks.
> 
> If Sakamato-san is happy with this change, I'm going to merge it for
> 4.13.

I'm writing a long long message about my concern for this patch. I'm 
happy if you postpone application to your for-next branch, till next week.


Thanks

Takashi Sakamoto
Takashi Iwai June 9, 2017, 7:05 a.m. UTC | #4
On Fri, 09 Jun 2017 09:01:56 +0200,
Takashi Sakamoto wrote:
> 
> On Jun 9 2017 15:53, Takashi Iwai wrote:
> >>> From: Takashi Iwai <tiwai@suse.de>
> >>> Subject: [PATCH] ALSA: pcm: Suppress status/control mmap when ack ops is
> >>>   present
> >>>
> >>> The drivers using PCM ack ops require the notification whenever
> >>> appl_ptr is updated in general.  But when the PCM status/control page
> >>> is mmapped, this notification doesn't happen, per design, thus it's
> >>> not guaranteed to receive the fine-grained updates.
> >>>
> >>> For improving the situation, this patch simply suppresses the PCM
> >>> status/control mmap when ack ops is defined.  At least, for all
> >>> existing drivers with ack, this should give more benefit.
> >>>
> >>> Once when we really need the full optimization with status/control
> >>> mmap even using ack ops, we may reconsider the check, e.g. introducing
> >>> a new flag.  But, so far, this should be good enough.
> >>
> >> Yes this makes sense and we tested it for us, looks good
> >>
> >> Reveiwed-by: Vinod Koul <vinod.koul@intel.com>
> >> Tested-by: Subhransu S. Prusty <subhransu.s.prusty@intel.com>
> >
> > OK, thanks.
> >
> > If Sakamato-san is happy with this change, I'm going to merge it for
> > 4.13.
> 
> I'm writing a long long message about my concern for this patch.

Better to shorten, if you want to convince someone :)

> I'm
> happy if you postpone application to your for-next branch, till next
> week.

OK.


Takashi
Takashi Sakamoto June 12, 2017, 10:49 p.m. UTC | #5
Hi,

On Jun 8 2017 06:20, Takashi Iwai wrote:
> From: Takashi Iwai <tiwai@suse.de>
> Subject: [PATCH] ALSA: pcm: Suppress status/control mmap when ack ops is
>   present
> 
> The drivers using PCM ack ops require the notification whenever
> appl_ptr is updated in general.  But when the PCM status/control page
> is mmapped, this notification doesn't happen, per design, thus it's
> not guaranteed to receive the fine-grained updates.
> 
> For improving the situation, this patch simply suppresses the PCM
> status/control mmap when ack ops is defined.  At least, for all
> existing drivers with ack, this should give more benefit.
> 
> Once when we really need the full optimization with status/control
> mmap even using ack ops, we may reconsider the check, e.g. introducing
> a new flag.  But, so far, this should be good enough.
> 
> Signed-off-by: Takashi Iwai <tiwai@suse.de>
> ---
>   sound/core/pcm_native.c | 8 ++++++++
>   1 file changed, 8 insertions(+)
> 
> diff --git a/sound/core/pcm_native.c b/sound/core/pcm_native.c
> index 2bde07a4a87f..b993b0420411 100644
> --- a/sound/core/pcm_native.c
> +++ b/sound/core/pcm_native.c
> @@ -3189,6 +3189,10 @@ static int snd_pcm_mmap_status(struct snd_pcm_substream *substream, struct file
>   			       struct vm_area_struct *area)
>   {
>   	long size;
> +
> +	/* suppress status/control mmap when driver requires ack */
> +	if (substream->ops->ack)
> +		return -ENXIO;
>   	if (!(area->vm_flags & VM_READ))
>   		return -EINVAL;
>   	size = area->vm_end - area->vm_start;
> @@ -3225,6 +3229,10 @@ static int snd_pcm_mmap_control(struct snd_pcm_substream *substream, struct file
>   				struct vm_area_struct *area)
>   {
>   	long size;
> +
> +	/* suppress status/control mmap when driver requires ack */
> +	if (substream->ops->ack)
> +		return -ENXIO;
>   	if (!(area->vm_flags & VM_READ))
>   		return -EINVAL;
>   	size = area->vm_end - area->vm_start;

Let me evaluate this patch in a point of overhead at kernel/userspace 
interaction.

When considering about shape of ALSA PCM applications, I can show it by
below simple pseudo code:

```
while:
   poll(2)
   calculate how many PCM frames are available.
   memcpy(2)
```

To satisfy requirements of some drivers, we need to find a way to take
userspace applications to commit the number of handled PCM frames to
kernel space after the memcpy(2). For this purpose, in ALSA PCM
interface, SNDRV_PCM_IOCTL_SYNC_PTR is available.

```
while:
   poll(2)
   calculate how many PCM frames are available.
   memcpy(2)
   ioctl(2) with SNDRV_PCM_IOCTL_SYNC_PTR
```

Well, as an actual solution, this patch disallows userspace applications
to map page frame for the status/control data of PCM substream, because
alsa-lib has fallback mode for this case to utilize
SNDRV_PCM_IOCTL_SYNC_PTR. This is a similar situation to ARM platform.
As I described in a commit fccf53881e9b ("ALSA: pcm: add 'applptr' event
of tracepoint"), ALSA PCM core doesn't perform page frame mapping for
the status/control data[1].

In the commit message, I compared tracing data of tracepoints and log of
strace. In the case, the number of calls for ioctl(2) with
SNDRV_PCM_IOCTL_SYNC_PTR in a loop with a call of poll(2) is seven. One
of the sevel calls is actually used to commit the number of PCM frames.
There's 6 extra system calls. These 6 calls are just used to check
current status of runtime of PCM substream, and on cache coherent
architecture, they can be suppressed by the page frame mapping. The poll
loop repeats during runtime, thus it's important to analyze overhead of
the extra system calls.

I suggest 3 points to evaluate this patch.

1. CPU time for the system call
2. disabling kernel preemption and local IRQ
3. backward compatibility

Here, I describe my opinion to this patch for each of these points.

1. CPU time for the system call
System calls cost expensive than usual API calls. As long as
investigated on x86_64 architecture on perf-stat, additional 6 call of
ioctl(2) with SNDRV_PCM_IOCTL_SYNC_PTR increases 5,000-15,000
instructions in program execution. Recent hardware is enough fast. It
might not be so large disadvantage.

However, when drivers implement 'struct snd_pcm_ops.ack' callback, I
have different opinion; it's too frequent. In either drivers with
indirect layer and packet-oriented drivers, the callback is used to
process PCM frames in intermediate buffer for userspace. The frequent
call of the callback is not enough effective because the applications is
programmed with a manner to be accurate for actual time somehow, thus
they don't process more PCM frames in one poll loop.

2. disabling kernel preemption and local IRQ
In kernel land, service routine for SNDRV_PCM_IOCTL_SYNC_PTR partly runs
with acquired PCM stream lock, which disables kernel preemption and
local IRQ usually for running CPU. This lock is required because the PCM
stream is handled in any handler for hardware events, but this situation
can reduce realtime performance because no process or service routine for
interrupt can't use the CPU. If we have the other options, it's worth to
reconsider.

Intel SST-based drivers use nonatomic flag for the lock. In this case,
kernel preemption is enabled but IRQ is disabled. I think Intel
developers work is for their embedded solution. The care of realtime
performance might be good for their work, too.

3. backward compatibility
For backward compatibility, this patch is a simple solution. Even if
patched kernel runs with old version of relevant stuffs, tinyalsa or
alsa-lib, things work well as expected.

However, without this patch, things still work well. In either drivers
with indirect layer or packet-oriented drivers, queued PCM frames are
transferred according to any hardware events. For such drivers, commit
notification from applications just expands their probability to
improve performance; e.g. latency between queueing and transmission.

I think it better to back to your initial direction; adding info flag,
let stuffs in userspace parse it, perform the commit for the number of
handled PCM frames in 'snd_pcm_hw_mmap_commit()'.

[1] 
https://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound.git/commit/?id=fccf53881e9b564321326f62ed5f85130fa6e569


Regards

Takashi Sakamoto
Takashi Iwai June 13, 2017, 12:03 p.m. UTC | #6
On Tue, 13 Jun 2017 00:49:37 +0200,
Takashi Sakamoto wrote:
> 
> Hi,
> 
> On Jun 8 2017 06:20, Takashi Iwai wrote:
> > From: Takashi Iwai <tiwai@suse.de>
> > Subject: [PATCH] ALSA: pcm: Suppress status/control mmap when ack ops is
> >   present
> >
> > The drivers using PCM ack ops require the notification whenever
> > appl_ptr is updated in general.  But when the PCM status/control page
> > is mmapped, this notification doesn't happen, per design, thus it's
> > not guaranteed to receive the fine-grained updates.
> >
> > For improving the situation, this patch simply suppresses the PCM
> > status/control mmap when ack ops is defined.  At least, for all
> > existing drivers with ack, this should give more benefit.
> >
> > Once when we really need the full optimization with status/control
> > mmap even using ack ops, we may reconsider the check, e.g. introducing
> > a new flag.  But, so far, this should be good enough.
> >
> > Signed-off-by: Takashi Iwai <tiwai@suse.de>
> > ---
> >   sound/core/pcm_native.c | 8 ++++++++
> >   1 file changed, 8 insertions(+)
> >
> > diff --git a/sound/core/pcm_native.c b/sound/core/pcm_native.c
> > index 2bde07a4a87f..b993b0420411 100644
> > --- a/sound/core/pcm_native.c
> > +++ b/sound/core/pcm_native.c
> > @@ -3189,6 +3189,10 @@ static int snd_pcm_mmap_status(struct snd_pcm_substream *substream, struct file
> >   			       struct vm_area_struct *area)
> >   {
> >   	long size;
> > +
> > +	/* suppress status/control mmap when driver requires ack */
> > +	if (substream->ops->ack)
> > +		return -ENXIO;
> >   	if (!(area->vm_flags & VM_READ))
> >   		return -EINVAL;
> >   	size = area->vm_end - area->vm_start;
> > @@ -3225,6 +3229,10 @@ static int snd_pcm_mmap_control(struct snd_pcm_substream *substream, struct file
> >   				struct vm_area_struct *area)
> >   {
> >   	long size;
> > +
> > +	/* suppress status/control mmap when driver requires ack */
> > +	if (substream->ops->ack)
> > +		return -ENXIO;
> >   	if (!(area->vm_flags & VM_READ))
> >   		return -EINVAL;
> >   	size = area->vm_end - area->vm_start;
> 
> Let me evaluate this patch in a point of overhead at kernel/userspace
> interaction.
> 
> When considering about shape of ALSA PCM applications, I can show it by
> below simple pseudo code:
> 
> ```
> while:
>   poll(2)
>   calculate how many PCM frames are available.
>   memcpy(2)
> ```
> 
> To satisfy requirements of some drivers, we need to find a way to take
> userspace applications to commit the number of handled PCM frames to
> kernel space after the memcpy(2). For this purpose, in ALSA PCM
> interface, SNDRV_PCM_IOCTL_SYNC_PTR is available.
> 
> ```
> while:
>   poll(2)
>   calculate how many PCM frames are available.
>   memcpy(2)
>   ioctl(2) with SNDRV_PCM_IOCTL_SYNC_PTR
> ```
> 
> Well, as an actual solution, this patch disallows userspace applications
> to map page frame for the status/control data of PCM substream, because
> alsa-lib has fallback mode for this case to utilize
> SNDRV_PCM_IOCTL_SYNC_PTR. This is a similar situation to ARM platform.
> As I described in a commit fccf53881e9b ("ALSA: pcm: add 'applptr' event
> of tracepoint"), ALSA PCM core doesn't perform page frame mapping for
> the status/control data[1].
> 
> In the commit message, I compared tracing data of tracepoints and log of
> strace. In the case, the number of calls for ioctl(2) with
> SNDRV_PCM_IOCTL_SYNC_PTR in a loop with a call of poll(2) is seven. One
> of the sevel calls is actually used to commit the number of PCM frames.
> There's 6 extra system calls. These 6 calls are just used to check
> current status of runtime of PCM substream, and on cache coherent
> architecture, they can be suppressed by the page frame mapping. The poll
> loop repeats during runtime, thus it's important to analyze overhead of
> the extra system calls.
> 
> I suggest 3 points to evaluate this patch.
> 
> 1. CPU time for the system call
> 2. disabling kernel preemption and local IRQ
> 3. backward compatibility
> 
> Here, I describe my opinion to this patch for each of these points.
> 
> 1. CPU time for the system call
> System calls cost expensive than usual API calls. As long as
> investigated on x86_64 architecture on perf-stat, additional 6 call of
> ioctl(2) with SNDRV_PCM_IOCTL_SYNC_PTR increases 5,000-15,000
> instructions in program execution. Recent hardware is enough fast. It
> might not be so large disadvantage.
> 
> However, when drivers implement 'struct snd_pcm_ops.ack' callback, I
> have different opinion; it's too frequent. In either drivers with
> indirect layer and packet-oriented drivers, the callback is used to
> process PCM frames in intermediate buffer for userspace. The frequent
> call of the callback is not enough effective because the applications is
> programmed with a manner to be accurate for actual time somehow, thus
> they don't process more PCM frames in one poll loop.
> 
> 2. disabling kernel preemption and local IRQ
> In kernel land, service routine for SNDRV_PCM_IOCTL_SYNC_PTR partly runs
> with acquired PCM stream lock, which disables kernel preemption and
> local IRQ usually for running CPU. This lock is required because the PCM
> stream is handled in any handler for hardware events, but this situation
> can reduce realtime performance because no process or service routine for
> interrupt can't use the CPU. If we have the other options, it's worth to
> reconsider.
> 
> Intel SST-based drivers use nonatomic flag for the lock. In this case,
> kernel preemption is enabled but IRQ is disabled. I think Intel
> developers work is for their embedded solution. The care of realtime
> performance might be good for their work, too.
> 
> 3. backward compatibility
> For backward compatibility, this patch is a simple solution. Even if
> patched kernel runs with old version of relevant stuffs, tinyalsa or
> alsa-lib, things work well as expected.
> 
> However, without this patch, things still work well. In either drivers
> with indirect layer or packet-oriented drivers, queued PCM frames are
> transferred according to any hardware events. For such drivers, commit
> notification from applications just expands their probability to
> improve performance; e.g. latency between queueing and transmission.
> 
> I think it better to back to your initial direction; adding info flag,
> let stuffs in userspace parse it, perform the commit for the number of
> handled PCM frames in 'snd_pcm_hw_mmap_commit()'.

Thanks for the analysis.  Yes, the cost by the explicit calls is
known, and it's what was mentioned in the commit log as a further
optimization.  I bought this cost as good enough for better appl_ptr
accuracy, but you thought differently on that.

One thing to be noted is that user-space doesn't have to call sync_ptr
at all, and even no period IRQ is triggered depending on the setup
(e.g. PA prefers it).  This is the case we want to solve.  That is,
the situation is worse than you thought -- things don't work as
expected unless we enforce the sync_ptr notification from user-space.

Then, the question is how to enforce it.  The easiest option is to
disable status/control mmap.  That's how the patch was developed.
As an option,
1) Selectively disable mmap by a new flag, or
2) Selectively disable mmap by the presence of ack callback.

And (2) seems too aggressively applied, from your opinion.

Now you might thing another option:
3) Add a new PCM info flag and let alsa-lib behaving differently
   according to it

But this is no-go, since it doesn't work with the old alsa-lib.

So, IMO, we need to go back to (1), which is my original patch, as a
start.  It affects only the driver that sets (in this case, it's SKL+
driver) and it works with the old user-space, at least.

Then we can improve the performance in alsa-lib.  alsa-lib has some
inefficient implementation regarding the hwptr/appl_ptr update.  In
may places, it calls hwsync, then do avail_update that again calls
sync_ptr, for example.  I guess we can halves the amount of sync_ptr
calls by optimizing that.

Since the sync_ptr is used for all non-x86 architectures (i.e.
nowadays majority of devices in the market), the optimization is a
good benefit for all.  Worth to try, in anyway.


And yet another obvious optimization would be to allow only the status
mmap and disallow the control mmap.  Currently, both are paired and
the control mmap can't fail if the status mmap succeeds.  But, for
the case like this, we want to suppress only the control mmap.

Unfortunately, changing this behavior requires both alsa-lib and
kernel codes.  And keeping the behavior compatibility, we'd need to
introduce something new, e.g. an ioctl to set the compatible protocol
version from alsa-lib.  For now, alsa-lib inquires the protocol
version the kernel supports (SNDRV_PCM_IOCTL_PVERSION).  The newly
required one is the other way round, alsa-lib telling the supported
protocol version to kernel.  Then the kernel can know whether this
alsa-lib version accepts the status-only mmap, and gracefully returns
-ENXIO for the control mmap.


thanks,

Takashi
Takashi Sakamoto June 14, 2017, 2:34 p.m. UTC | #7
On Jun 13 2017 21:03, Takashi Iwai wrote:
> Thanks for the analysis.  Yes, the cost by the explicit calls is
> known, and it's what was mentioned in the commit log as a further
> optimization.  I bought this cost as good enough for better appl_ptr
> accuracy, but you thought differently on that.
>
> One thing to be noted is that user-space doesn't have to call sync_ptr
> at all, and even no period IRQ is triggered depending on the setup
> (e.g. PA prefers it).  This is the case we want to solve.  That is,
> the situation is worse than you thought -- things don't work as
> expected unless we enforce the sync_ptr notification from user-space.
>
> Then, the question is how to enforce it.  The easiest option is to
> disable status/control mmap.  That's how the patch was developed.
> As an option,
> 1) Selectively disable mmap by a new flag, or
> 2) Selectively disable mmap by the presence of ack callback.
>
> And (2) seems too aggressively applied, from your opinion.
>
> Now you might thing another option:
> 3) Add a new PCM info flag and let alsa-lib behaving differently
>     according to it
>
> But this is no-go, since it doesn't work with the old alsa-lib.
>
> So, IMO, we need to go back to (1), which is my original patch, as a
> start.  It affects only the driver that sets (in this case, it's SKL+
> driver) and it works with the old user-space, at least.
>
> Then we can improve the performance in alsa-lib.  alsa-lib has some
> inefficient implementation regarding the hwptr/appl_ptr update.  In
> may places, it calls hwsync, then do avail_update that again calls
> sync_ptr, for example.  I guess we can halves the amount of sync_ptr
> calls by optimizing that.
>
> Since the sync_ptr is used for all non-x86 architectures (i.e.
> nowadays majority of devices in the market), the optimization is a
> good benefit for all.  Worth to try, in anyway.
>
>
> And yet another obvious optimization would be to allow only the status
> mmap and disallow the control mmap.  Currently, both are paired and
> the control mmap can't fail if the status mmap succeeds.  But, for
> the case like this, we want to suppress only the control mmap.
>
> Unfortunately, changing this behavior requires both alsa-lib and
> kernel codes.  And keeping the behavior compatibility, we'd need to
> introduce something new, e.g. an ioctl to set the compatible protocol
> version from alsa-lib.  For now, alsa-lib inquires the protocol
> version the kernel supports (SNDRV_PCM_IOCTL_PVERSION).  The newly
> required one is the other way round, alsa-lib telling the supported
> protocol version to kernel.  Then the kernel can know whether this
> alsa-lib version accepts the status-only mmap, and gracefully returns
> -ENXIO for the control mmap.

Hm, I have no objections to the changes of both kernel/userspace, but
from different reasons. Therefore I have different solution for this issue.

I recognize this issue as a change of programming model for new design
of devices. (Advantages for drivers for audio and music units on IEEE
1394 bus and the others is its sub-effect, a minor bonus.)

Current ALSA PCM interface is designed based on an idea for data
transmission; hardware is unaware of how many PCM frames are already
queued or dequeued on PCM buffer in system memory. Hardware can transfer
PCM frames to a part of the buffer with un-dequeued PCM frames (at
capture) or from a part of the buffer without enough queued PCM frames
(at playback). This design doesn't require kernel stuffs to maintain the
application-side position on PCM buffer.

If my understanding is correct, Intel's recent hardware can aware of
the application-side pointer and maintain the data transmission,
according to relative position of the application-side and the
hardware-side on the PCM buffer. As Pierre-Louis said, this could
satisfy Intel's convenience; e.g. reduce power comsumption. I guess
that it can decrease frequency to drive hardware for the data
transmission, or do the data transmission as better timing.

This model of data transmission is new in this subsystem. I think it
reasonable to add enough stuffs in both update kernel/userspace and
update version of the interface for this design.

A several years ago, no-period-wakeup programming model was introduced,
and this subsystem got large changes for both kernel/user land. I
believe this issue also has the similar impact. In my taste, in this
case, it's better to give up to keep full backward compatibility, and
renew related stuffs. When working with old userspace stuffs, drivers
run with old mode (namely, run for current interface). The difference
of interface version is absorbed in alsa-lib as much as possible, then
application runs without large changes.


Regards

Takashi Sakamoto
Takashi Iwai June 14, 2017, 2:52 p.m. UTC | #8
On Wed, 14 Jun 2017 16:34:32 +0200,
Takashi Sakamoto wrote:
> 
> On Jun 13 2017 21:03, Takashi Iwai wrote:
> > Thanks for the analysis.  Yes, the cost by the explicit calls is
> > known, and it's what was mentioned in the commit log as a further
> > optimization.  I bought this cost as good enough for better appl_ptr
> > accuracy, but you thought differently on that.
> >
> > One thing to be noted is that user-space doesn't have to call sync_ptr
> > at all, and even no period IRQ is triggered depending on the setup
> > (e.g. PA prefers it).  This is the case we want to solve.  That is,
> > the situation is worse than you thought -- things don't work as
> > expected unless we enforce the sync_ptr notification from user-space.
> >
> > Then, the question is how to enforce it.  The easiest option is to
> > disable status/control mmap.  That's how the patch was developed.
> > As an option,
> > 1) Selectively disable mmap by a new flag, or
> > 2) Selectively disable mmap by the presence of ack callback.
> >
> > And (2) seems too aggressively applied, from your opinion.
> >
> > Now you might thing another option:
> > 3) Add a new PCM info flag and let alsa-lib behaving differently
> >     according to it
> >
> > But this is no-go, since it doesn't work with the old alsa-lib.
> >
> > So, IMO, we need to go back to (1), which is my original patch, as a
> > start.  It affects only the driver that sets (in this case, it's SKL+
> > driver) and it works with the old user-space, at least.
> >
> > Then we can improve the performance in alsa-lib.  alsa-lib has some
> > inefficient implementation regarding the hwptr/appl_ptr update.  In
> > may places, it calls hwsync, then do avail_update that again calls
> > sync_ptr, for example.  I guess we can halves the amount of sync_ptr
> > calls by optimizing that.
> >
> > Since the sync_ptr is used for all non-x86 architectures (i.e.
> > nowadays majority of devices in the market), the optimization is a
> > good benefit for all.  Worth to try, in anyway.
> >
> >
> > And yet another obvious optimization would be to allow only the status
> > mmap and disallow the control mmap.  Currently, both are paired and
> > the control mmap can't fail if the status mmap succeeds.  But, for
> > the case like this, we want to suppress only the control mmap.
> >
> > Unfortunately, changing this behavior requires both alsa-lib and
> > kernel codes.  And keeping the behavior compatibility, we'd need to
> > introduce something new, e.g. an ioctl to set the compatible protocol
> > version from alsa-lib.  For now, alsa-lib inquires the protocol
> > version the kernel supports (SNDRV_PCM_IOCTL_PVERSION).  The newly
> > required one is the other way round, alsa-lib telling the supported
> > protocol version to kernel.  Then the kernel can know whether this
> > alsa-lib version accepts the status-only mmap, and gracefully returns
> > -ENXIO for the control mmap.
> 
> Hm, I have no objections to the changes of both kernel/userspace, but
> from different reasons. Therefore I have different solution for this issue.
> 
> I recognize this issue as a change of programming model for new design
> of devices. (Advantages for drivers for audio and music units on IEEE
> 1394 bus and the others is its sub-effect, a minor bonus.)
> 
> Current ALSA PCM interface is designed based on an idea for data
> transmission; hardware is unaware of how many PCM frames are already
> queued or dequeued on PCM buffer in system memory. Hardware can transfer
> PCM frames to a part of the buffer with un-dequeued PCM frames (at
> capture) or from a part of the buffer without enough queued PCM frames
> (at playback). This design doesn't require kernel stuffs to maintain the
> application-side position on PCM buffer.
> 
> If my understanding is correct, Intel's recent hardware can aware of
> the application-side pointer and maintain the data transmission,
> according to relative position of the application-side and the
> hardware-side on the PCM buffer. As Pierre-Louis said, this could
> satisfy Intel's convenience; e.g. reduce power comsumption. I guess
> that it can decrease frequency to drive hardware for the data
> transmission, or do the data transmission as better timing.
> 
> This model of data transmission is new in this subsystem. I think it
> reasonable to add enough stuffs in both update kernel/userspace and
> update version of the interface for this design.
> 
> A several years ago, no-period-wakeup programming model was introduced,
> and this subsystem got large changes for both kernel/user land. I
> believe this issue also has the similar impact. In my taste, in this
> case, it's better to give up to keep full backward compatibility, and
> renew related stuffs. When working with old userspace stuffs, drivers
> run with old mode (namely, run for current interface). The difference
> of interface version is absorbed in alsa-lib as much as possible, then
> application runs without large changes.

Well...  I guess you're a bit overreacting to it.  There is no new
model in a big picture.  The appl_ptr has been always present, and it
should be referred at each moment it's updated.  That said, the
problem is purely in the kernel side implementation -- or more exactly
to say, it's about the kernel / user-space ABI.

The required change would break the ABI the current alsa-lib expects,
thus we need to update and enable the new ABI, conditionally for the
newer alsa-lib for an optimized path.  For the older alsa-lib, we keep
the old ABI, a bit less optimized, but still works well enough.
That's all.


thanks,

Takashi
Takashi Sakamoto June 15, 2017, 2:32 a.m. UTC | #9
On 2017年06月14日 23:52, Takashi Iwai wrote:
> On Wed, 14 Jun 2017 16:34:32 +0200,
> Takashi Sakamoto wrote:
>>
>> On Jun 13 2017 21:03, Takashi Iwai wrote:
>>> Thanks for the analysis.  Yes, the cost by the explicit calls is
>>> known, and it's what was mentioned in the commit log as a further
>>> optimization.  I bought this cost as good enough for better appl_ptr
>>> accuracy, but you thought differently on that.
>>>
>>> One thing to be noted is that user-space doesn't have to call sync_ptr
>>> at all, and even no period IRQ is triggered depending on the setup
>>> (e.g. PA prefers it).  This is the case we want to solve.  That is,
>>> the situation is worse than you thought -- things don't work as
>>> expected unless we enforce the sync_ptr notification from user-space.
>>>
>>> Then, the question is how to enforce it.  The easiest option is to
>>> disable status/control mmap.  That's how the patch was developed.
>>> As an option,
>>> 1) Selectively disable mmap by a new flag, or
>>> 2) Selectively disable mmap by the presence of ack callback.
>>>
>>> And (2) seems too aggressively applied, from your opinion.
>>>
>>> Now you might thing another option:
>>> 3) Add a new PCM info flag and let alsa-lib behaving differently
>>>      according to it
>>>
>>> But this is no-go, since it doesn't work with the old alsa-lib.
>>>
>>> So, IMO, we need to go back to (1), which is my original patch, as a
>>> start.  It affects only the driver that sets (in this case, it's SKL+
>>> driver) and it works with the old user-space, at least.
>>>
>>> Then we can improve the performance in alsa-lib.  alsa-lib has some
>>> inefficient implementation regarding the hwptr/appl_ptr update.  In
>>> may places, it calls hwsync, then do avail_update that again calls
>>> sync_ptr, for example.  I guess we can halves the amount of sync_ptr
>>> calls by optimizing that.
>>>
>>> Since the sync_ptr is used for all non-x86 architectures (i.e.
>>> nowadays majority of devices in the market), the optimization is a
>>> good benefit for all.  Worth to try, in anyway.
>>>
>>>
>>> And yet another obvious optimization would be to allow only the status
>>> mmap and disallow the control mmap.  Currently, both are paired and
>>> the control mmap can't fail if the status mmap succeeds.  But, for
>>> the case like this, we want to suppress only the control mmap.
>>>
>>> Unfortunately, changing this behavior requires both alsa-lib and
>>> kernel codes.  And keeping the behavior compatibility, we'd need to
>>> introduce something new, e.g. an ioctl to set the compatible protocol
>>> version from alsa-lib.  For now, alsa-lib inquires the protocol
>>> version the kernel supports (SNDRV_PCM_IOCTL_PVERSION).  The newly
>>> required one is the other way round, alsa-lib telling the supported
>>> protocol version to kernel.  Then the kernel can know whether this
>>> alsa-lib version accepts the status-only mmap, and gracefully returns
>>> -ENXIO for the control mmap.
>>
>> Hm, I have no objections to the changes of both kernel/userspace, but
>> from different reasons. Therefore I have different solution for this issue.
>>
>> I recognize this issue as a change of programming model for new design
>> of devices. (Advantages for drivers for audio and music units on IEEE
>> 1394 bus and the others is its sub-effect, a minor bonus.)
>>
>> Current ALSA PCM interface is designed based on an idea for data
>> transmission; hardware is unaware of how many PCM frames are already
>> queued or dequeued on PCM buffer in system memory. Hardware can transfer
>> PCM frames to a part of the buffer with un-dequeued PCM frames (at
>> capture) or from a part of the buffer without enough queued PCM frames
>> (at playback). This design doesn't require kernel stuffs to maintain the
>> application-side position on PCM buffer.
>>
>> If my understanding is correct, Intel's recent hardware can aware of
>> the application-side pointer and maintain the data transmission,
>> according to relative position of the application-side and the
>> hardware-side on the PCM buffer. As Pierre-Louis said, this could
>> satisfy Intel's convenience; e.g. reduce power comsumption. I guess
>> that it can decrease frequency to drive hardware for the data
>> transmission, or do the data transmission as better timing.
>>
>> This model of data transmission is new in this subsystem. I think it
>> reasonable to add enough stuffs in both update kernel/userspace and
>> update version of the interface for this design.
>>
>> A several years ago, no-period-wakeup programming model was introduced,
>> and this subsystem got large changes for both kernel/user land. I
>> believe this issue also has the similar impact. In my taste, in this
>> case, it's better to give up to keep full backward compatibility, and
>> renew related stuffs. When working with old userspace stuffs, drivers
>> run with old mode (namely, run for current interface). The difference
>> of interface version is absorbed in alsa-lib as much as possible, then
>> application runs without large changes.
> 
> Well...  I guess you're a bit overreacting to it.  There is no new
> model in a big picture.

I suggested to the idea to change kernel/userspace interface itself, not 
to your raw idea. In short, I'm still against the idea to disable 
mmap(2) for status/control data of PCM runtime.

As I declare in my former message[1], there's an apparent change about 
programming model, independent on architecture support for cache 
coherency. As I declare in my previous message[2], the way for data 
transmission on the new hardware is not expected in current ALSA 
implementation. This should be described in new version of interface 
(2.0.14).

> The appl_ptr has been always present, and it
> should be referred at each moment it's updated.

Actually it's not maintained in kernel space. Userspace applications do 
it, at least for operations under mmap(2) of PCM buffer. Clearly, 
current application-side position on the buffer is not cared in any 
service routine to handle hardware event.

As of 2017, we have some userspace implementation as consumers of ALSA 
PCM interface. If we add any change into the interface, we're 
responsible for notification of it via version change.

> That said, the
> problem is purely in the kernel side implementation -- or more exactly
> to say, it's about the kernel / user-space ABI.

Yes. We need to add changes into the way to use kernel/userspace 
interface for applications in this case.

> The required change would break the ABI the current alsa-lib expects,
> thus we need to update and enable the new ABI, conditionally for the
> newer alsa-lib for an optimized path.  For the older alsa-lib, we keep
> the old ABI, a bit less optimized, but still works well enough.
> That's all.

As I described, I have few interests in your raw idea. It's too early in 
this discussion.


[1] [alsa-devel] [PATCH 1/2] ALSA: firewire: process packets in 'struct 
snd_pcm_ops.ack' callback
http://mailman.alsa-project.org/pipermail/alsa-devel/2017-June/121699.html
[2] [alsa-devel] [PATCH 1/2] ALSA: firewire: process packets in 'struct 
snd_pcm_ops.ack' callback
http://mailman.alsa-project.org/pipermail/alsa-devel/2017-June/121794.html


Regards

Takashi Sakamoto
Takashi Iwai June 15, 2017, 8:48 a.m. UTC | #10
On Thu, 15 Jun 2017 04:32:28 +0200,
Takashi Sakamoto wrote:
> 
> On 2017年06月14日 23:52, Takashi Iwai wrote:
> > On Wed, 14 Jun 2017 16:34:32 +0200,
> > Takashi Sakamoto wrote:
> >>
> >> On Jun 13 2017 21:03, Takashi Iwai wrote:
> >>> Thanks for the analysis.  Yes, the cost by the explicit calls is
> >>> known, and it's what was mentioned in the commit log as a further
> >>> optimization.  I bought this cost as good enough for better appl_ptr
> >>> accuracy, but you thought differently on that.
> >>>
> >>> One thing to be noted is that user-space doesn't have to call sync_ptr
> >>> at all, and even no period IRQ is triggered depending on the setup
> >>> (e.g. PA prefers it).  This is the case we want to solve.  That is,
> >>> the situation is worse than you thought -- things don't work as
> >>> expected unless we enforce the sync_ptr notification from user-space.
> >>>
> >>> Then, the question is how to enforce it.  The easiest option is to
> >>> disable status/control mmap.  That's how the patch was developed.
> >>> As an option,
> >>> 1) Selectively disable mmap by a new flag, or
> >>> 2) Selectively disable mmap by the presence of ack callback.
> >>>
> >>> And (2) seems too aggressively applied, from your opinion.
> >>>
> >>> Now you might thing another option:
> >>> 3) Add a new PCM info flag and let alsa-lib behaving differently
> >>>      according to it
> >>>
> >>> But this is no-go, since it doesn't work with the old alsa-lib.
> >>>
> >>> So, IMO, we need to go back to (1), which is my original patch, as a
> >>> start.  It affects only the driver that sets (in this case, it's SKL+
> >>> driver) and it works with the old user-space, at least.
> >>>
> >>> Then we can improve the performance in alsa-lib.  alsa-lib has some
> >>> inefficient implementation regarding the hwptr/appl_ptr update.  In
> >>> may places, it calls hwsync, then do avail_update that again calls
> >>> sync_ptr, for example.  I guess we can halves the amount of sync_ptr
> >>> calls by optimizing that.
> >>>
> >>> Since the sync_ptr is used for all non-x86 architectures (i.e.
> >>> nowadays majority of devices in the market), the optimization is a
> >>> good benefit for all.  Worth to try, in anyway.
> >>>
> >>>
> >>> And yet another obvious optimization would be to allow only the status
> >>> mmap and disallow the control mmap.  Currently, both are paired and
> >>> the control mmap can't fail if the status mmap succeeds.  But, for
> >>> the case like this, we want to suppress only the control mmap.
> >>>
> >>> Unfortunately, changing this behavior requires both alsa-lib and
> >>> kernel codes.  And keeping the behavior compatibility, we'd need to
> >>> introduce something new, e.g. an ioctl to set the compatible protocol
> >>> version from alsa-lib.  For now, alsa-lib inquires the protocol
> >>> version the kernel supports (SNDRV_PCM_IOCTL_PVERSION).  The newly
> >>> required one is the other way round, alsa-lib telling the supported
> >>> protocol version to kernel.  Then the kernel can know whether this
> >>> alsa-lib version accepts the status-only mmap, and gracefully returns
> >>> -ENXIO for the control mmap.
> >>
> >> Hm, I have no objections to the changes of both kernel/userspace, but
> >> from different reasons. Therefore I have different solution for this issue.
> >>
> >> I recognize this issue as a change of programming model for new design
> >> of devices. (Advantages for drivers for audio and music units on IEEE
> >> 1394 bus and the others is its sub-effect, a minor bonus.)
> >>
> >> Current ALSA PCM interface is designed based on an idea for data
> >> transmission; hardware is unaware of how many PCM frames are already
> >> queued or dequeued on PCM buffer in system memory. Hardware can transfer
> >> PCM frames to a part of the buffer with un-dequeued PCM frames (at
> >> capture) or from a part of the buffer without enough queued PCM frames
> >> (at playback). This design doesn't require kernel stuffs to maintain the
> >> application-side position on PCM buffer.
> >>
> >> If my understanding is correct, Intel's recent hardware can aware of
> >> the application-side pointer and maintain the data transmission,
> >> according to relative position of the application-side and the
> >> hardware-side on the PCM buffer. As Pierre-Louis said, this could
> >> satisfy Intel's convenience; e.g. reduce power comsumption. I guess
> >> that it can decrease frequency to drive hardware for the data
> >> transmission, or do the data transmission as better timing.
> >>
> >> This model of data transmission is new in this subsystem. I think it
> >> reasonable to add enough stuffs in both update kernel/userspace and
> >> update version of the interface for this design.
> >>
> >> A several years ago, no-period-wakeup programming model was introduced,
> >> and this subsystem got large changes for both kernel/user land. I
> >> believe this issue also has the similar impact. In my taste, in this
> >> case, it's better to give up to keep full backward compatibility, and
> >> renew related stuffs. When working with old userspace stuffs, drivers
> >> run with old mode (namely, run for current interface). The difference
> >> of interface version is absorbed in alsa-lib as much as possible, then
> >> application runs without large changes.
> >
> > Well...  I guess you're a bit overreacting to it.  There is no new
> > model in a big picture.
> 
> I suggested to the idea to change kernel/userspace interface itself,
> not to your raw idea. In short, I'm still against the idea to disable
> mmap(2) for status/control data of PCM runtime.
> 
> As I declare in my former message[1], there's an apparent change about
> programming model, independent on architecture support for cache
> coherency. As I declare in my previous message[2], the way for data
> transmission on the new hardware is not expected in current ALSA
> implementation. This should be described in new version of interface
> (2.0.14).
> 
> > The appl_ptr has been always present, and it
> > should be referred at each moment it's updated.
> 
> Actually it's not maintained in kernel space. Userspace applications
> do it, at least for operations under mmap(2) of PCM buffer. Clearly,
> current application-side position on the buffer is not cared in any
> service routine to handle hardware event.
> 
> As of 2017, we have some userspace implementation as consumers of ALSA
> PCM interface. If we add any change into the interface, we're
> responsible for notification of it via version change.
> 
> > That said, the
> > problem is purely in the kernel side implementation -- or more exactly
> > to say, it's about the kernel / user-space ABI.
> 
> Yes. We need to add changes into the way to use kernel/userspace
> interface for applications in this case.
> 
> > The required change would break the ABI the current alsa-lib expects,
> > thus we need to update and enable the new ABI, conditionally for the
> > newer alsa-lib for an optimized path.  For the older alsa-lib, we keep
> > the old ABI, a bit less optimized, but still works well enough.
> > That's all.
> 
> As I described, I have few interests in your raw idea. It's too early
> in this discussion.

Well, I still don't understand (anyone else can?) what you mean as "a
change of programming model" at all.  It's becoming again like a
typical situation you falling often into while discussing with others;
your new term is too ambiguous and unique to parse without a
clarification.  PLEASE, clarify your idea in a more understandable
way, at best with a (pseudo-) code snippet.

And, I think you miss a few points, thus the argument was twisted.

- The primary goal is to achieve the notification of appl_ptr change
  to kernel.

- The kernel needs to work with the existing user-space.
  It's a MUST.

- appl_ptr or whatever a status *is* maintained in kernel -- or better
  to say, it's kept in both kernel and user-space.  (Otherwise why it
  becomes a problem now?)


So, if you have a better idea for achieving the goal without changing
the current ABI, please tell us.  It'd be really appreciated!


For the future development with the ABI extension, we may do implement
in a different level, yes.  Basically we can change everything.  But
this is not the thing we need to fix right now.

I'm open for any changes with the ABI extension.  It's certainly
exciting.  But, don't mix up this with the solution for the current
implementation.


thanks,

Takashi

> [1] [alsa-devel] [PATCH 1/2] ALSA: firewire: process packets in
> 'struct snd_pcm_ops.ack' callback
> http://mailman.alsa-project.org/pipermail/alsa-devel/2017-June/121699.html
> [2] [alsa-devel] [PATCH 1/2] ALSA: firewire: process packets in
> 'struct snd_pcm_ops.ack' callback
> http://mailman.alsa-project.org/pipermail/alsa-devel/2017-June/121794.html
> 
> 
> Regards
> 
> Takashi Sakamoto
>
Takashi Sakamoto June 15, 2017, 5:56 p.m. UTC | #11
Hi,

On Jun 15 2017 17:48, Takashi Iwai wrote:
 > Well, I still don't understand (anyone else can?) what you mean as "a
 > change of programming model" at all.

Here, I refer to my former message.

On Jun 13 2017 07:49, Takashi Sakamoto wrote:
 > Let me evaluate this patch in a point of overhead at kernel/userspace
 > interaction.
 >
 > When considering about shape of ALSA PCM applications, I can show it by
 > below simple pseudo code:
 >
 > ```
 > while:
 >    poll(2)
 >    calculate how many PCM frames are available.
 >    memcpy(2)
 > ```
 >
 > To satisfy requirements of some drivers, we need to find a way to take
 > userspace applications to commit the number of handled PCM frames to
 > kernel space after the memcpy(2). For this purpose, in ALSA PCM
 > interface, SNDRV_PCM_IOCTL_SYNC_PTR is available.
 >
 > ```
 > while:
 >    poll(2)
 >    calculate how many PCM frames are available.
 >    memcpy(2)
 >    ioctl(2) with SNDRV_PCM_IOCTL_SYNC_PTR
 > ```

Some devices/drivers request applications to perform this, due to their
design of hardware for data transmission.

 > It's becoming again like a typical situation you falling often into
 > while discussing with others; your new term is too ambiguous and
 > unique to parse without a clarification.  PLEASE, clarify your idea
 > in a more understandable way, at best with a (pseudo-) code snippet.

The above.

 > And, I think you miss a few points, thus the argument was twisted.
 >
 > - The primary goal is to achieve the notification of appl_ptr change
 >    to kernel.

It's for the purpose to support devices which have the new design for
data transmission. This is the reason that I agree with upgrading
version of PCM interface. I suggest adding new info flags for the
specific purpose.

Disabling mmap for status/control data of PCM substream is just to
support architectures to which ALSA PCM core judge non cache coherency.
It's not good to utilize it as a solution of this issue because of
abuse of interfaces.

 > - The kernel needs to work with the existing user-space.
 >    It's a MUST.

I described in a previous message.

On Jun 14 2017 23:34, Takashi Sakamoto wrote:
 > When working with old userspace stuffs, drivers run with old mode
 > (namely, run for current interface). The difference of interface
 > version is absorbed in alsa-lib as much as possible, then application
 > runs without large change

 > - appl_ptr or whatever a status *is* maintained in kernel -- or better
 >    to say, it's kept in both kernel and user-space.  (Otherwise why it
 >    becomes a problem now?)

This is not achieved in current implementation of ALSA PCM core yet.
Even if hardware generates any event to get current application-side
position on PCM buffer, it can't be achieved. There's no way for
hardwares. Therefore, it's unavoided to request userspace to have care
of the above pseudo code, when supporting devices with the new design.

 > So, if you have a better idea for achieving the goal without changing
 > the current ABI, please tell us.  It'd be really appreciated!
 >
 > For the future development with the ABI extension, we may do implement
 > in a different level, yes.  Basically we can change everything.  But
 > this is not the thing we need to fix right now.
 >
 > I'm open for any changes with the ABI extension.  It's certainly
 > exciting.  But, don't mix up this with the solution for the current
 > implementation.

Your idea to extend current ABI includes some jumps from current
position of discussion. At least, the reason of it is somehow based on
your idea to disable mmap for control/status data of PCM substream for
some devices/drivers. As already stated, I'm against it with several
reasons. On cache coherent architecture, there's no matter to consider
about it. Thus they're not matters to me at present.


Regards

Takashi Sakamoto
Takashi Iwai June 15, 2017, 7:06 p.m. UTC | #12
On Thu, 15 Jun 2017 19:56:18 +0200,
Takashi Sakamoto wrote:
> 
> Hi,
> 
> On Jun 15 2017 17:48, Takashi Iwai wrote:
> > Well, I still don't understand (anyone else can?) what you mean as "a
> > change of programming model" at all.
> 
> Here, I refer to my former message.
> 
> On Jun 13 2017 07:49, Takashi Sakamoto wrote:
> > Let me evaluate this patch in a point of overhead at kernel/userspace
> > interaction.
> >
> > When considering about shape of ALSA PCM applications, I can show it by
> > below simple pseudo code:
> >
> > ```
> > while:
> >    poll(2)
> >    calculate how many PCM frames are available.
> >    memcpy(2)
> > ```
> >
> > To satisfy requirements of some drivers, we need to find a way to take
> > userspace applications to commit the number of handled PCM frames to
> > kernel space after the memcpy(2). For this purpose, in ALSA PCM
> > interface, SNDRV_PCM_IOCTL_SYNC_PTR is available.
> >
> > ```
> > while:
> >    poll(2)
> >    calculate how many PCM frames are available.
> >    memcpy(2)
> >    ioctl(2) with SNDRV_PCM_IOCTL_SYNC_PTR
> > ```
> 
> Some devices/drivers request applications to perform this, due to their
> design of hardware for data transmission.

Yes, and it's the default behavior of all non-x86 platforms, too.

> > It's becoming again like a typical situation you falling often into
> > while discussing with others; your new term is too ambiguous and
> > unique to parse without a clarification.  PLEASE, clarify your idea
> > in a more understandable way, at best with a (pseudo-) code snippet.
> 
> The above.

So... what's the "new"?  That is what I don't understand...
It's the already existing model deployed without mmap.  Nothing new.

In other words, such a driver already works fine on non-x86
platforms.  It's "broken" only on x86 due to the forced mmap usage.


> > And, I think you miss a few points, thus the argument was twisted.
> >
> > - The primary goal is to achieve the notification of appl_ptr change
> >    to kernel.
> 
> It's for the purpose to support devices which have the new design for
> data transmission. This is the reason that I agree with upgrading
> version of PCM interface. I suggest adding new info flags for the
> specific purpose.
> 
> Disabling mmap for status/control data of PCM substream is just to
> support architectures to which ALSA PCM core judge non cache coherency.
> It's not good to utilize it as a solution of this issue because of
> abuse of interfaces.

No, it's no abuse.  The sync_ptr is the correct and designed API to
notify appl_ptr update from the user space to kernel.  For most of
device drivers on x86, this isn't requested strictly, thus it's not
done when the PCM control is mmapped.  We've been just lucky, so far.

That is, letting user-space to use the sync_ptr is the right
strategy from the API POV, and it can be achieved by the disablement
of mmap.  Unfortunately, we can't currently disable only PCM control
mmap, but we have to disable both control and status PCM mmaps.  It's
a trade-off, but still good enough for the driver requesting it (the
Intel one, which can achieve the deep sleep by that).


> > - The kernel needs to work with the existing user-space.
> >    It's a MUST.
> 
> I described in a previous message.

Let me repeat: the support of old user-space is a must.  This is the
fact.  Not faked.


> On Jun 14 2017 23:34, Takashi Sakamoto wrote:
> > When working with old userspace stuffs, drivers run with old mode
> > (namely, run for current interface). The difference of interface
> > version is absorbed in alsa-lib as much as possible, then application
> > runs without large change

... and how the kernel knows that a newer alsa-lib is running, BTW?

There is no such information right now, and it's why I suggested an
extension of ABI.


> > - appl_ptr or whatever a status *is* maintained in kernel -- or better
> >    to say, it's kept in both kernel and user-space.  (Otherwise why it
> >    becomes a problem now?)
> 
> This is not achieved in current implementation of ALSA PCM core yet.

Sigh, again the misunderstanding due to the ambiguous term usage.

The appl_ptr value is tracked in user-space and kernel sides -- so
they are "maintained" in both sides.  "Maintain a value" doesn't
define who triggers the value change.

> Even if hardware generates any event to get current application-side
> position on PCM buffer, it can't be achieved. There's no way for
> hardwares. Therefore, it's unavoided to request userspace to have care
> of the above pseudo code, when supporting devices with the new design.

Yes, the missing piece is the synchronization between the appl_ptr
values maintained in both user-space and kernel sides.  And it's only
on x86, and the simplest way to fix it is to disable the PCM control
mmap.

> > So, if you have a better idea for achieving the goal without changing
> > the current ABI, please tell us.  It'd be really appreciated!
> >
> > For the future development with the ABI extension, we may do implement
> > in a different level, yes.  Basically we can change everything.  But
> > this is not the thing we need to fix right now.
> >
> > I'm open for any changes with the ABI extension.  It's certainly
> > exciting.  But, don't mix up this with the solution for the current
> > implementation.
> 
> Your idea to extend current ABI includes some jumps from current
> position of discussion. At least, the reason of it is somehow based on
> your idea to disable mmap for control/status data of PCM substream for
> some devices/drivers. As already stated, I'm against it with several
> reasons. On cache coherent architecture, there's no matter to consider
> about it. Thus they're not matters to me at present.

It's fine that you say against it.  But then please state your idea
how to implement the appl_ptr notification for the existing
application ABI?  Without it, the argument is simply useless.


Also, think again which disadvantages would disabling the control mmap
(not the status mmap) have, when sync_ptr is mandatory.  Or better in
another form: suppose you need to use sync_ptr for appl_ptr update,
what merit would you have by keeping the PCM control mmap...?

Take a look at the content of snd_pcm_mmap_control.  It's only
appl_ptr and avail_min where the latter can be ignored mostly.
That is, disabling the PCM control mmap is effectively equivalent with
requesting the appl_ptr update via sync_ptr.

So, if we can disable only the PCM control mmap, the efficiency won't
be lost for kernel -> user-space PCM status update by keeping the PCM
status mmap.  It's the plan I mentioned as the ABI extension.


To recap:
- The disablement of control/status mmap is allowed by request from a
  driver to enforce the sync_ptr usage for appl_ptr update.

- The extension of ABI for further improvements; then we can keep mmap
  of PCM status while disabling the PCM control mmap.


thanks,

Takashi
Takashi Sakamoto June 16, 2017, 3 p.m. UTC | #13
On Jun 16 2017 04:06, Takashi Iwai wrote:
 >> Some devices/drivers request applications to perform this, due to their
 >> design of hardware for data transmission.
 >
 > Yes, and it's the default behavior of all non-x86 platforms, too.

Here, you have mixture of two items; architecture difference for cache
coherent functionality, and device feature for data transmission. These
two items are apparently different things.

In other words, devices can be supported independently of platform
architectures. Why should I apply the solution prepared for cache
coherency issue to add better support for the device with new feature
for its data transmission? On x86 platform (although including some
exceptions such as old ATOM processors), status/control data of PCM
substream can successfully be mapped to process' VMA of applications.
Why should it be disable it even if works well?

Here, I'll show a table with chang history of relevant codes in kernel
land, and information resources.

year | release | pcm   | relevant changes
      | version | iface |
==================================================================================
2000 | v0.5.0  | 1.0.0 | status/control/fragments data can be mapped 
into VMA[1]
2003 | v0.9.0  | 2.0.5 | status/control data can be mapped independently[2]
2004 | v1.0.5  | 2.0.7 | SYNC_PTR command was introduced for PCM ioctl 
operation[3]
2004 | v1.0.7  | 2.0.7 | mmap for status/control data is allowed for 
x86/ppc/alpha[4]

In note for v1.0.5 release, Jaroslav described[5]:

```
  - PCM midlevel
   - added SYNC_PTR ioctl (for problematic cache coherency archs)
```

The SYNC_PTR and architecture-dependent enabling of the mmap was 
originally introduced to solve cache coherent issue.

 >>> It's becoming again like a typical situation you falling often into
 >>> while discussing with others; your new term is too ambiguous and
 >>> unique to parse without a clarification.  PLEASE, clarify your idea
 >>> in a more understandable way, at best with a (pseudo-) code snippet.
 >>
 >> The above.
 >
 > So... what's the "new"?  That is what I don't understand...
 > It's the already existing model deployed without mmap.  Nothing new.

This subsystem has no drivers with the similar feature, thus it's new
design for data transmission. The design supports below things:

1. Hardware features:
1.1. Data transmission is done by direct media access (DMA).
1.2. Hardware cares of two points for its data transmission; one is
      hwptr and another is appl_ptr on PCM buffer dedicated for the
      data transmission.
1.3. (perhaps)The granurarity of data transmission can be differed,
      not fixed to the size of 'period'.

2. Drivers are designed with below items:
2.1. Return SNDRV_PCM_ACCESS_MMAP_XXX flag
2.2. Implement for 'struct snd_pcm_ops.ack' to tell appl_ptr to
      hardware.

3. Applications should perform according to below items:
3.1.  Operate with SNDRV_PCM_IOCTL_[READ|WRITE][N|I] to drive kernel
       land stuffs with changed appl_ptr.
3.2.  Or operate with SYNC_PTR to driver kernel land stuffs when
       handling any PCM frames.

As of v4.12-rc5, there's no stuffs to satisfy all of these items. It's 
clear that we're working to support devices with the new design.

 > In other words, such a driver already works fine on non-x86
 > platforms.  It's "broken" only on x86 due to the forced mmap usage.

It's better to distinguish two issues; architecture's support for
cache coherency and support for new type of device.

 >>> And, I think you miss a few points, thus the argument was twisted.
 >>>
 >>> - The primary goal is to achieve the notification of appl_ptr change
 >>>     to kernel.
 >>
 >> It's for the purpose to support devices which have the new design for
 >> data transmission. This is the reason that I agree with upgrading
 >> version of PCM interface. I suggest adding new info flags for the
 >> specific purpose.
 >>
 >> Disabling mmap for status/control data of PCM substream is just to
 >> support architectures to which ALSA PCM core judge non cache coherency.
 >> It's not good to utilize it as a solution of this issue because of
 >> abuse of interfaces.
 >
 > No, it's no abuse.  The sync_ptr is the correct and designed API to
 > notify appl_ptr update from the user space to kernel.  For most of
 > device drivers on x86, this isn't requested strictly, thus it's not
 > done when the PCM control is mmapped.  We've been just lucky, so far.

I described that it was originally designed to solve architecture's
support for cache coherency. Don't depend on extra bonus from it.

 > That is, letting user-space to use the sync_ptr is the right
 > strategy from the API POV, and it can be achieved by the disablement
 > of mmap.  Unfortunately, we can't currently disable only PCM control
 > mmap, but we have to disable both control and status PCM mmaps.

Usage of SYNC_PTR ioctl command and disabling mmap are not tight-coupled.
Both of them can be used in the same time if user land stuffs is writte
so.

 > It's a trade-off, but still good enough for the driver requesting it
 > (the Intel one, which can achieve the deep sleep by that).

What's the 'deep sleep'? Please explain about it when you introduce
new words into this discussion.

 >>> - The kernel needs to work with the existing user-space.
 >>>     It's a MUST.
 >>
 >> I described in a previous message.
 >
 > Let me repeat: the support of old user-space is a must.  This is the
 > fact.  Not faked.
 >
 >
 >> On Jun 14 2017 23:34, Takashi Sakamoto wrote:
 >>> When working with old userspace stuffs, drivers run with old mode
 >>> (namely, run for current interface). The difference of interface
 >>> version is absorbed in alsa-lib as much as possible, then application
 >>> runs without large change
 >
 > ... and how the kernel knows that a newer alsa-lib is running, BTW?
 >
 > There is no such information right now, and it's why I suggested an
 > extension of ABI.

Enough information is not provided yet. Furthermore, I have few
interests in it at present.

 >>> - appl_ptr or whatever a status *is* maintained in kernel -- or better
 >>>     to say, it's kept in both kernel and user-space.  (Otherwise why it
 >>>     becomes a problem now?)
 >>
 >> This is not achieved in current implementation of ALSA PCM core yet.
 >
 > Sigh, again the misunderstanding due to the ambiguous term usage.
 >
 > The appl_ptr value is tracked in user-space and kernel sides -- so
 > they are "maintained" in both sides.  "Maintain a value" doesn't
 > define who triggers the value change.

The important thing for this issue is 'how to deliver the position of
appl_ptr from applications to hardware'. In current implementation,
when applications operate for mapped PCM buffer, kernel land stuffs
don't have correct appl_ptr. Hardware is given ways to get appl_ptr
and drivers don't assist it because appl_ptr is in process's VMA of
applications.

 >> Even if hardware generates any event to get current application-side
 >> position on PCM buffer, it can't be achieved. There's no way for
 >> hardwares. Therefore, it's unavoided to request userspace to have care
 >> of the above pseudo code, when supporting devices with the new design.
 >
 > Yes, the missing piece is the synchronization between the appl_ptr
 > values maintained in both user-space and kernel sides.  And it's only
 > on x86, and the simplest way to fix it is to disable the PCM control
 > mmap.

It's not better to depend on bonus from architecture differences.
Provide enough information to userspace via interface.

 >>> So, if you have a better idea for achieving the goal without changing
 >>> the current ABI, please tell us.  It'd be really appreciated!
 >>>
 >>> For the future development with the ABI extension, we may do implement
 >>> in a different level, yes.  Basically we can change everything.  But
 >>> this is not the thing we need to fix right now.
 >>>
 >>> I'm open for any changes with the ABI extension.  It's certainly
 >>> exciting.  But, don't mix up this with the solution for the current
 >>> implementation.
 >>
 >> Your idea to extend current ABI includes some jumps from current
 >> position of discussion. At least, the reason of it is somehow based on
 >> your idea to disable mmap for control/status data of PCM substream for
 >> some devices/drivers. As already stated, I'm against it with several
 >> reasons. On cache coherent architecture, there's no matter to consider
 >> about it. Thus they're not matters to me at present.
 >
 > It's fine that you say against it.  But then please state your idea
 > how to implement the appl_ptr notification for the existing
 > application ABI?  Without it, the argument is simply useless.

I already prepared patches for my ideas several days ago. However, I 
have hesitation to post it because you have mixtures of two issues, 
which I described. As long as you adhere to the mixtures, my patches are 
not evaluated in a appropriate logic, thus I'm unwilling to post it. I'd 
not like to waste of my private time.

 > Also, think again which disadvantages would disabling the control mmap
 > (not the status mmap) have, when sync_ptr is mandatory.  Or better in
 > another form: suppose you need to use sync_ptr for appl_ptr update,
 > what merit would you have by keeping the PCM control mmap...?
 >
 > Take a look at the content of snd_pcm_mmap_control.  It's only
 > appl_ptr and avail_min where the latter can be ignored mostly.
 > That is, disabling the PCM control mmap is effectively equivalent with
 > requesting the appl_ptr update via sync_ptr.
 >
 > So, if we can disable only the PCM control mmap, the efficiency won't
 > be lost for kernel -> user-space PCM status update by keeping the PCM
 > status mmap.  It's the plan I mentioned as the ABI extension.
 >
 >
 > To recap:
 > - The disablement of control/status mmap is allowed by request from a
 >    driver to enforce the sync_ptr usage for appl_ptr update.

Not fair. The feature to disable mmap is just for issues from
architecture feature. It's not good to utilize it for the other kind of
issues. It surely puzzle developers in user space. When working for
x86 architecture, developers just consider about content on mapped
control/status data. We soulnd not change it.

 > - The extension of ABI for further improvements; then we can keep mmap
 >    of PCM status while disabling the PCM control mmap.

I'm not interested in issues directly relevant to current issue. I'd 
like to just focus on the current one.

[1] ftp://ftp.alsa-project.org/pub/driver/alsa-driver-0.5.0.tar.gz
[2] ftp://ftp.alsa-project.org/pub/driver/alsa-driver-0.9.0rc8.tar.bz2
[3] 
http://git.alsa-project.org/?p=alsa-driver.git;a=commitdiff;h=60300f540726;hp=a1bed841e5b5b2e6c3d5b4709efbb3aa72235e72
[4] 
http://git.alsa-project.org/?p=alsa-driver.git;a=commitdiff;h=6cd0240aa173;hp=4bc54bf2643d5563d9489ff3ce3e4dbb0d2d4a99
[5] https://lwn.net/Articles/87517/


Regards

Takashi Sakamoto
Takashi Sakamoto June 16, 2017, 3:08 p.m. UTC | #14
On Jun 17 2017 00:00, Takashi Sakamoto wrote:
> This subsystem has no drivers with the similar feature, thus it's new
> design for data transmission. The design supports below things:
> 
> 1. Hardware features:
> 1.1. Data transmission is done by direct media access (DMA).
> 1.2. Hardware cares of two points for its data transmission; one is
>       hwptr and another is appl_ptr on PCM buffer dedicated for the
>       data transmission.
> 1.3. (perhaps)The granurarity of data transmission can be differed,
>       not fixed to the size of 'period'.

I realized that 'granularity' (misspelled...) is not proper in this 
context. Here, I described intervals to start data transmissions by DMA. 
The transmission can be managed by pre-programmed descriptors on device 
register, or service routines for interrupts on drivers, and so on.


Regards

Takashi Sakamoto
Takashi Iwai June 16, 2017, 3:45 p.m. UTC | #15
On Fri, 16 Jun 2017 17:00:13 +0200,
Takashi Sakamoto wrote:
> 
> On Jun 16 2017 04:06, Takashi Iwai wrote:
> >> Some devices/drivers request applications to perform this, due to their
> >> design of hardware for data transmission.
> >
> > Yes, and it's the default behavior of all non-x86 platforms, too.
> 
> Here, you have mixture of two items; architecture difference for cache
> coherent functionality, and device feature for data transmission. These
> two items are apparently different things.
> 
> In other words, devices can be supported independently of platform
> architectures. Why should I apply the solution prepared for cache
> coherency issue to add better support for the device with new feature
> for its data transmission? On x86 platform (although including some
> exceptions such as old ATOM processors), status/control data of PCM
> substream can successfully be mapped to process' VMA of applications.
> Why should it be disable it even if works well?

Because it doesn't work well for the new feature :)

> Here, I'll show a table with chang history of relevant codes in kernel
> land, and information resources.
> 
> year | release | pcm   | relevant changes
>      | version | iface |
> ==================================================================================
> 2000 | v0.5.0  | 1.0.0 | status/control/fragments data can be mapped
> into VMA[1]
> 2003 | v0.9.0  | 2.0.5 | status/control data can be mapped independently[2]
> 2004 | v1.0.5  | 2.0.7 | SYNC_PTR command was introduced for PCM ioctl
> operation[3]
> 2004 | v1.0.7  | 2.0.7 | mmap for status/control data is allowed for
> x86/ppc/alpha[4]
> 
> In note for v1.0.5 release, Jaroslav described[5]:
> 
> ```
>  - PCM midlevel
>   - added SYNC_PTR ioctl (for problematic cache coherency archs)
> ```
> 
> The SYNC_PTR and architecture-dependent enabling of the mmap was
> originally introduced to solve cache coherent issue.
>
> >>> It's becoming again like a typical situation you falling often into
> >>> while discussing with others; your new term is too ambiguous and
> >>> unique to parse without a clarification.  PLEASE, clarify your idea
> >>> in a more understandable way, at best with a (pseudo-) code snippet.
> >>
> >> The above.
> >
> > So... what's the "new"?  That is what I don't understand...
> > It's the already existing model deployed without mmap.  Nothing new.
> 
> This subsystem has no drivers with the similar feature, thus it's new
> design for data transmission. The design supports below things:
> 
> 1. Hardware features:
> 1.1. Data transmission is done by direct media access (DMA).
> 1.2. Hardware cares of two points for its data transmission; one is
>      hwptr and another is appl_ptr on PCM buffer dedicated for the
>      data transmission.
> 1.3. (perhaps)The granurarity of data transmission can be differed,
>      not fixed to the size of 'period'.
> 
> 2. Drivers are designed with below items:
> 2.1. Return SNDRV_PCM_ACCESS_MMAP_XXX flag
> 2.2. Implement for 'struct snd_pcm_ops.ack' to tell appl_ptr to
>      hardware.
> 
> 3. Applications should perform according to below items:
> 3.1.  Operate with SNDRV_PCM_IOCTL_[READ|WRITE][N|I] to drive kernel
>       land stuffs with changed appl_ptr.
> 3.2.  Or operate with SYNC_PTR to driver kernel land stuffs when
>       handling any PCM frames.
> 
> As of v4.12-rc5, there's no stuffs to satisfy all of these items.

It works on non-x86 systems as is with 4.12.

> It's
> clear that we're working to support devices with the new design.

Yeah, I sort of understand you're sticking with "new design".  So, no
need to argue about it.  It doesn't matter whether it's really so
shiny new or not.  It's something we need to fix, after all.


> > In other words, such a driver already works fine on non-x86
> > platforms.  It's "broken" only on x86 due to the forced mmap usage.
> 
> It's better to distinguish two issues; architecture's support for
> cache coherency and support for new type of device.

Well, that's not really true.  Most of drivers worked so far with a
luck.  Fortunately, the hardware that doesn't have the mappable
hardware buffer can be woken up well enough via period setup, thus it
could synchronize appl_ptr that user-space already changed in the
past.  This, however, doesn't mean that the current x86 mmap is
perfectly working.

Imagine that you stop the stream at the middle of period chunk.  For
the hardware with the mapped h/w buffer, it can play up to the aborted
position.  OTOH, for the hardware without mapped buffer, it can't
reach at that position because the sync of appl_ptr was missing before
the period elapsed.  If the appl_ptr could have been notified via
sync_ptr, the driver could copy the data, and it could reach to the
aborted point.

So, currently it effectively enforces the BATCH style buffer although
it doesn't have to be so.  It's a known bug by the status / control
mmap, but we've ignored this just because such hardware are minor and
the problem itself is trivial.  But you see that the current code is
even not perfect for the existing hardware.


> >>> And, I think you miss a few points, thus the argument was twisted.
> >>>
> >>> - The primary goal is to achieve the notification of appl_ptr change
> >>>     to kernel.
> >>
> >> It's for the purpose to support devices which have the new design for
> >> data transmission. This is the reason that I agree with upgrading
> >> version of PCM interface. I suggest adding new info flags for the
> >> specific purpose.
> >>
> >> Disabling mmap for status/control data of PCM substream is just to
> >> support architectures to which ALSA PCM core judge non cache coherency.
> >> It's not good to utilize it as a solution of this issue because of
> >> abuse of interfaces.
> >
> > No, it's no abuse.  The sync_ptr is the correct and designed API to
> > notify appl_ptr update from the user space to kernel.  For most of
> > device drivers on x86, this isn't requested strictly, thus it's not
> > done when the PCM control is mmapped.  We've been just lucky, so far.
> 
> I described that it was originally designed to solve architecture's
> support for cache coherency. Don't depend on extra bonus from it.

Oh well...  Please stop such a fundamentalism argument.
The original purpose doesn't mean to limit its usage.  We aren't
arguing history or religion.


> > That is, letting user-space to use the sync_ptr is the right
> > strategy from the API POV, and it can be achieved by the disablement
> > of mmap.  Unfortunately, we can't currently disable only PCM control
> > mmap, but we have to disable both control and status PCM mmaps.
> 
> Usage of SYNC_PTR ioctl command and disabling mmap are not tight-coupled.
> Both of them can be used in the same time if user land stuffs is writte
> so.

Sure, it's a semantic difference.  I don't pursue this way if we are
allowed to extend ABI, either.  This will be the "phase 2".

Remember that my suggestion is for the "phase 1", to fix the stuff for
the existing applications without changing the ABI.


> > It's a trade-off, but still good enough for the driver requesting it
> > (the Intel one, which can achieve the deep sleep by that).
> 
> What's the 'deep sleep'? Please explain about it when you introduce
> new words into this discussion.

I thought Pierre (or other Intel people) already mentioned that.  Or
maybe it's in a different patchset.  In anyway...

The chip (DSP) can prefetch the data on the buffer and go to a deep
sleep mode.  It's the reason why appl_ptr update is needed.  Then we
can know how much data can be prefetched.  This should be the great
reduction of power, thus a slight increase of instructions would be
like a peanut.


> >>> - The kernel needs to work with the existing user-space.
> >>>     It's a MUST.
> >>
> >> I described in a previous message.
> >
> > Let me repeat: the support of old user-space is a must.  This is the
> > fact.  Not faked.
> >
> >
> >> On Jun 14 2017 23:34, Takashi Sakamoto wrote:
> >>> When working with old userspace stuffs, drivers run with old mode
> >>> (namely, run for current interface). The difference of interface
> >>> version is absorbed in alsa-lib as much as possible, then application
> >>> runs without large change
> >
> > ... and how the kernel knows that a newer alsa-lib is running, BTW?
> >
> > There is no such information right now, and it's why I suggested an
> > extension of ABI.
> 
> Enough information is not provided yet. Furthermore, I have few
> interests in it at present.

Right, not yet.  That's the point.  If we want to achieve the fix
without changing the user-space, you can't rely on this
not-yet-present information.


> >>> - appl_ptr or whatever a status *is* maintained in kernel -- or better
> >>>     to say, it's kept in both kernel and user-space.  (Otherwise why it
> >>>     becomes a problem now?)
> >>
> >> This is not achieved in current implementation of ALSA PCM core yet.
> >
> > Sigh, again the misunderstanding due to the ambiguous term usage.
> >
> > The appl_ptr value is tracked in user-space and kernel sides -- so
> > they are "maintained" in both sides.  "Maintain a value" doesn't
> > define who triggers the value change.
> 
> The important thing for this issue is 'how to deliver the position of
> appl_ptr from applications to hardware'. In current implementation,
> when applications operate for mapped PCM buffer, kernel land stuffs
> don't have correct appl_ptr. Hardware is given ways to get appl_ptr
> and drivers don't assist it because appl_ptr is in process's VMA of
> applications.

Yes.  That's what I called synchronization in the below.

> >> Even if hardware generates any event to get current application-side
> >> position on PCM buffer, it can't be achieved. There's no way for
> >> hardwares. Therefore, it's unavoided to request userspace to have care
> >> of the above pseudo code, when supporting devices with the new design.
> >
> > Yes, the missing piece is the synchronization between the appl_ptr
> > values maintained in both user-space and kernel sides.  And it's only
> > on x86, and the simplest way to fix it is to disable the PCM control
> > mmap.
> 
> It's not better to depend on bonus from architecture differences.
> Provide enough information to userspace via interface.
>
> >>> So, if you have a better idea for achieving the goal without changing
> >>> the current ABI, please tell us.  It'd be really appreciated!
> >>>
> >>> For the future development with the ABI extension, we may do implement
> >>> in a different level, yes.  Basically we can change everything.  But
> >>> this is not the thing we need to fix right now.
> >>>
> >>> I'm open for any changes with the ABI extension.  It's certainly
> >>> exciting.  But, don't mix up this with the solution for the current
> >>> implementation.
> >>
> >> Your idea to extend current ABI includes some jumps from current
> >> position of discussion. At least, the reason of it is somehow based on
> >> your idea to disable mmap for control/status data of PCM substream for
> >> some devices/drivers. As already stated, I'm against it with several
> >> reasons. On cache coherent architecture, there's no matter to consider
> >> about it. Thus they're not matters to me at present.
> >
> > It's fine that you say against it.  But then please state your idea
> > how to implement the appl_ptr notification for the existing
> > application ABI?  Without it, the argument is simply useless.
> 
> I already prepared patches for my ideas several days ago. However, I
> have hesitation to post it because you have mixtures of two issues,
> which I described. As long as you adhere to the mixtures, my patches
> are not evaluated in a appropriate logic, thus I'm unwilling to post
> it. I'd not like to waste of my private time.

Sakamoto-san, if you have the idea to fix the behavior, PLEASE PLEASE
show it at first!!!  This is greatly appreciated.

I meant here a solution that works with the currently existing
application as is -- i.e. change only in the kernel side.  If this can
be achieved without disabling mmap, it's very interesting.  Please let
me know.

Once we fix this for the existing applications, then we can go
further, a better implementation with the extension of ABI.


BTW, mentioning about waste of your private time doesn't sound so
impressive, honestly speaking.  I noticed it often appearing in your
patch changelogs.  Most of people (including me) spend private time
for developing Linux kernel stuff, too.  My own main job is not
maintaining ALSA, either.


thanks,

Takashi


> > Also, think again which disadvantages would disabling the control mmap
> > (not the status mmap) have, when sync_ptr is mandatory.  Or better in
> > another form: suppose you need to use sync_ptr for appl_ptr update,
> > what merit would you have by keeping the PCM control mmap...?
> >
> > Take a look at the content of snd_pcm_mmap_control.  It's only
> > appl_ptr and avail_min where the latter can be ignored mostly.
> > That is, disabling the PCM control mmap is effectively equivalent with
> > requesting the appl_ptr update via sync_ptr.
> >
> > So, if we can disable only the PCM control mmap, the efficiency won't
> > be lost for kernel -> user-space PCM status update by keeping the PCM
> > status mmap.  It's the plan I mentioned as the ABI extension.
> >
> >
> > To recap:
> > - The disablement of control/status mmap is allowed by request from a
> >    driver to enforce the sync_ptr usage for appl_ptr update.
> 
> Not fair. The feature to disable mmap is just for issues from
> architecture feature. It's not good to utilize it for the other kind of
> issues. It surely puzzle developers in user space. When working for
> x86 architecture, developers just consider about content on mapped
> control/status data. We soulnd not change it.
> 
> > - The extension of ABI for further improvements; then we can keep mmap
> >    of PCM status while disabling the PCM control mmap.
> 
> I'm not interested in issues directly relevant to current issue. I'd
> like to just focus on the current one.
> 
> [1] ftp://ftp.alsa-project.org/pub/driver/alsa-driver-0.5.0.tar.gz
> [2] ftp://ftp.alsa-project.org/pub/driver/alsa-driver-0.9.0rc8.tar.bz2
> [3]
> http://git.alsa-project.org/?p=alsa-driver.git;a=commitdiff;h=60300f540726;hp=a1bed841e5b5b2e6c3d5b4709efbb3aa72235e72
> [4]
> http://git.alsa-project.org/?p=alsa-driver.git;a=commitdiff;h=6cd0240aa173;hp=4bc54bf2643d5563d9489ff3ce3e4dbb0d2d4a99
> [5] https://lwn.net/Articles/87517/
> 
> 
> Regards
> 
> Takashi Sakamoto
>
Takashi Sakamoto June 18, 2017, 10:13 a.m. UTC | #16
Hi,

On Jun 17 2017 00:45, Takashi Iwai wrote:
 > On Fri, 16 Jun 2017 17:00:13 +0200,
 > Takashi Sakamoto wrote:
 >>
 >> On Jun 16 2017 04:06, Takashi Iwai wrote:
 >>>> Some devices/drivers request applications to perform this, due to 
their
 >>>> design of hardware for data transmission.
 >>>
 >>> Yes, and it's the default behavior of all non-x86 platforms, too.
 >>
 >> Here, you have mixture of two items; architecture difference for cache
 >> coherent functionality, and device feature for data transmission. These
 >> two items are apparently different things.
 >>
 >> In other words, devices can be supported independently of platform
 >> architectures. Why should I apply the solution prepared for cache
 >> coherency issue to add better support for the device with new feature
 >> for its data transmission? On x86 platform (although including some
 >> exceptions such as old ATOM processors), status/control data of PCM
 >> substream can successfully be mapped to process' VMA of applications.
 >> Why should it be disable it even if works well?
 >
 > Because it doesn't work well for the new feature :)

In this point, I cannot understand your insistence.

The page frame for status/control data of PCM substream is mapped into
process' VMA of application with _read-only_ attributes. In a point to
deliver the status/control data from kernel space to user space, it 
works well. On x86 platforms, this works fine exactly as the aim.

On the other hand, what we should achieve for current issue is to 
deliver information from applications to hardware. This is not relevant 
to the page frame mapping.

These two are apparently different issues. You intend to apply a
solution for the former as a solution for the latter, however it's
against original aim of the latter. To me this is in a gray zone to
agree with it.

 >>> So... what's the "new"?  That is what I don't understand...
 >>> It's the already existing model deployed without mmap.  Nothing new.
 >>
 >> This subsystem has no drivers with the similar feature, thus it's new
 >> design for data transmission. The design supports below things:
 >>
 >> 1. Hardware features:
 >> 1.1. Data transmission is done by direct media access (DMA).
 >> 1.2. Hardware cares of two points for its data transmission; one is
 >>       hwptr and another is appl_ptr on PCM buffer dedicated for the
 >>       data transmission.
 >> 1.3. (perhaps)The granurarity of data transmission can be differed,
 >>       not fixed to the size of 'period'.
 >>
 >> 2. Drivers are designed with below items:
 >> 2.1. Return SNDRV_PCM_ACCESS_MMAP_XXX flag
 >> 2.2. Implement for 'struct snd_pcm_ops.ack' to tell appl_ptr to
 >>       hardware.
 >>
 >> 3. Applications should perform according to below items:
 >> 3.1.  Operate with SNDRV_PCM_IOCTL_[READ|WRITE][N|I] to drive kernel
 >>        land stuffs with changed appl_ptr.
 >> 3.2.  Or operate with SYNC_PTR to driver kernel land stuffs when
 >>        handling any PCM frames.
 >>
 >> As of v4.12-rc5, there's no stuffs to satisfy all of these items.
 >
 > It works on non-x86 systems as is with 4.12.

It's a sub-effect, like a bonus, from a solution for issues of cache
coherency dependently of architecture, as I said.

 >>> In other words, such a driver already works fine on non-x86
 >>> platforms.  It's "broken" only on x86 due to the forced mmap usage.
 >>
 >> It's better to distinguish two issues; architecture's support for
 >> cache coherency and support for new type of device.
 >
 > Well, that's not really true.  Most of drivers worked so far with a
 > luck.  Fortunately, the hardware that doesn't have the mappable
 > hardware buffer can be woken up well enough via period setup, thus it
 > could synchronize appl_ptr that user-space already changed in the
 > past.  This, however, doesn't mean that the current x86 mmap is
 > perfectly working.
 >
 > Imagine that you stop the stream at the middle of period chunk.  For
 > the hardware with the mapped h/w buffer, it can play up to the aborted
 > position.  OTOH, for the hardware without mapped buffer, it can't
 > reach at that position because the sync of appl_ptr was missing before
 > the period elapsed.  If the appl_ptr could have been notified via
 > sync_ptr, the driver could copy the data, and it could reach to the
 > aborted point.
 >
 > So, currently it effectively enforces the BATCH style buffer although
 > it doesn't have to be so.  It's a known bug by the status / control
 > mmap, but we've ignored this just because such hardware are minor and
 > the problem itself is trivial.  But you see that the current code is
 > even not perfect for the existing hardware.

I'm a developer for drivers which use PCM buffer as intermediate buffer
for packet buffer. I understand what you explained correctly because
I've considered about it for recent several years.

But this is our of my concern about your patch.

 >>>>> And, I think you miss a few points, thus the argument was twisted.
 >>>>>
 >>>>> - The primary goal is to achieve the notification of appl_ptr change
 >>>>>      to kernel.
 >>>>
 >>>> It's for the purpose to support devices which have the new design for
 >>>> data transmission. This is the reason that I agree with upgrading
 >>>> version of PCM interface. I suggest adding new info flags for the
 >>>> specific purpose.
 >>>>
 >>>> Disabling mmap for status/control data of PCM substream is just to
 >>>> support architectures to which ALSA PCM core judge non cache 
coherency.
 >>>> It's not good to utilize it as a solution of this issue because of
 >>>> abuse of interfaces.
 >>>
 >>> No, it's no abuse.  The sync_ptr is the correct and designed API to
 >>> notify appl_ptr update from the user space to kernel.  For most of
 >>> device drivers on x86, this isn't requested strictly, thus it's not
 >>> done when the PCM control is mmapped.  We've been just lucky, so far.
 >>
 >> I described that it was originally designed to solve architecture's
 >> support for cache coherency. Don't depend on extra bonus from it.
 >
 > Oh well...  Please stop such a fundamentalism argument.
 > The original purpose doesn't mean to limit its usage.  We aren't
 > arguing history or religion.

If an issue were encapsulated only in kernel land or user land,
I would have no objection to your patch. I'm willing to review it.

However, in a current case, it relates to interaction between
kernel/user. In my opinion, it's better to avoid changing fundamental
meaning of features which are already exposed to one side. Therefore
on x86/ppc/alpha architectures, userspace applications (not only
alsa-lib API applications but also applications with any I/O library)
can use status/control data on own VMA. Else, not. This is independent
on peripheral devices.

My intention to continue this discussion is the above. The scope of
issues is different; one is architecture-dependent, another is
device-dependent, thus solution should be different. Your patch has a
potential to puzzle developers for user land, like 'why the page frame
is not mapped for my application even if on the same architecture? why
it's device dependent?'

 >>> It's a trade-off, but still good enough for the driver requesting it
 >>> (the Intel one, which can achieve the deep sleep by that).
 >>
 >> What's the 'deep sleep'? Please explain about it when you introduce
 >> new words into this discussion.
 >
 > I thought Pierre (or other Intel people) already mentioned that.  Or
 > maybe it's in a different patchset.  In anyway...
 >
 > The chip (DSP) can prefetch the data on the buffer and go to a deep
 > sleep mode.  It's the reason why appl_ptr update is needed.  Then we
 > can know how much data can be prefetched.  This should be the great
 > reduction of power, thus a slight increase of instructions would be
 > like a peanut.

In my mailbox, there's no message with the keyword. I guess that it was
introduced with the other expression. But anyway, it's mostly what I've
imagined. Thanks for your confirmation.

(I omitted some texts to focus on the main issue.)


Regards

Takashi Sakamoto
Takashi Iwai June 18, 2017, 12:29 p.m. UTC | #17
On Sun, 18 Jun 2017 12:13:54 +0200,
Takashi Sakamoto wrote:
> 
> Hi,
> 
> On Jun 17 2017 00:45, Takashi Iwai wrote:
> > On Fri, 16 Jun 2017 17:00:13 +0200,
> > Takashi Sakamoto wrote:
> >>
> >> On Jun 16 2017 04:06, Takashi Iwai wrote:
> >>>> Some devices/drivers request applications to perform this, due to 
> their
> >>>> design of hardware for data transmission.
> >>>
> >>> Yes, and it's the default behavior of all non-x86 platforms, too.
> >>
> >> Here, you have mixture of two items; architecture difference for cache
> >> coherent functionality, and device feature for data transmission. These
> >> two items are apparently different things.
> >>
> >> In other words, devices can be supported independently of platform
> >> architectures. Why should I apply the solution prepared for cache
> >> coherency issue to add better support for the device with new feature
> >> for its data transmission? On x86 platform (although including some
> >> exceptions such as old ATOM processors), status/control data of PCM
> >> substream can successfully be mapped to process' VMA of applications.
> >> Why should it be disable it even if works well?
> >
> > Because it doesn't work well for the new feature :)
> 
> In this point, I cannot understand your insistence.
> 
> The page frame for status/control data of PCM substream is mapped into
> process' VMA of application with _read-only_ attributes.

No.  The basic ides is that control mmap is write (user -> kernel)
while the status mmap is read (kernel -> user).  It's why there are
two distinct mmaps.

> In a point to
> deliver the status/control data from kernel space to user space, it
> works well. On x86 platforms, this works fine exactly as the aim.
> 
> On the other hand, what we should achieve for current issue is to
> deliver information from applications to hardware. This is not
> relevant to the page frame mapping.

There was an implicit assumption by the control mmap that the hardware
buffer management doesn't need the kernel notification.  It is a
problem even for some already existing drivers.  However, usually this
is a small matter, so we've ignored it.  That is, the problem has been
always there from the beginning.  Now it becomes clearer, no longer
negligible with a large buffer without interrupts, thus it requires
a resolution.

> These two are apparently different issues. You intend to apply a
> solution for the former as a solution for the latter, however it's
> against original aim of the latter. To me this is in a gray zone to
> agree with it.
> 
> >>> So... what's the "new"?  That is what I don't understand...
> >>> It's the already existing model deployed without mmap.  Nothing new.
> >>
> >> This subsystem has no drivers with the similar feature, thus it's new
> >> design for data transmission. The design supports below things:
> >>
> >> 1. Hardware features:
> >> 1.1. Data transmission is done by direct media access (DMA).
> >> 1.2. Hardware cares of two points for its data transmission; one is
> >>       hwptr and another is appl_ptr on PCM buffer dedicated for the
> >>       data transmission.
> >> 1.3. (perhaps)The granurarity of data transmission can be differed,
> >>       not fixed to the size of 'period'.
> >>
> >> 2. Drivers are designed with below items:
> >> 2.1. Return SNDRV_PCM_ACCESS_MMAP_XXX flag
> >> 2.2. Implement for 'struct snd_pcm_ops.ack' to tell appl_ptr to
> >>       hardware.
> >>
> >> 3. Applications should perform according to below items:
> >> 3.1.  Operate with SNDRV_PCM_IOCTL_[READ|WRITE][N|I] to drive kernel
> >>        land stuffs with changed appl_ptr.
> >> 3.2.  Or operate with SYNC_PTR to driver kernel land stuffs when
> >>        handling any PCM frames.
> >>
> >> As of v4.12-rc5, there's no stuffs to satisfy all of these items.
> >
> > It works on non-x86 systems as is with 4.12.
> 
> It's a sub-effect, like a bonus, from a solution for issues of cache
> coherency dependently of architecture, as I said.

It's not sub-effect.  Even x86 requires sync_ptr in the 32bit
compatible mode.  That said, the sync_ptr API is not only for
non-coherent architecture, but it's a general purpose API.


> >>> In other words, such a driver already works fine on non-x86
> >>> platforms.  It's "broken" only on x86 due to the forced mmap usage.
> >>
> >> It's better to distinguish two issues; architecture's support for
> >> cache coherency and support for new type of device.
> >
> > Well, that's not really true.  Most of drivers worked so far with a
> > luck.  Fortunately, the hardware that doesn't have the mappable
> > hardware buffer can be woken up well enough via period setup, thus it
> > could synchronize appl_ptr that user-space already changed in the
> > past.  This, however, doesn't mean that the current x86 mmap is
> > perfectly working.
> >
> > Imagine that you stop the stream at the middle of period chunk.  For
> > the hardware with the mapped h/w buffer, it can play up to the aborted
> > position.  OTOH, for the hardware without mapped buffer, it can't
> > reach at that position because the sync of appl_ptr was missing before
> > the period elapsed.  If the appl_ptr could have been notified via
> > sync_ptr, the driver could copy the data, and it could reach to the
> > aborted point.
> >
> > So, currently it effectively enforces the BATCH style buffer although
> > it doesn't have to be so.  It's a known bug by the status / control
> > mmap, but we've ignored this just because such hardware are minor and
> > the problem itself is trivial.  But you see that the current code is
> > even not perfect for the existing hardware.
> 
> I'm a developer for drivers which use PCM buffer as intermediate buffer
> for packet buffer. I understand what you explained correctly because
> I've considered about it for recent several years.
> 
> But this is our of my concern about your patch.
> 
> >>>>> And, I think you miss a few points, thus the argument was twisted.
> >>>>>
> >>>>> - The primary goal is to achieve the notification of appl_ptr change
> >>>>>      to kernel.
> >>>>
> >>>> It's for the purpose to support devices which have the new design for
> >>>> data transmission. This is the reason that I agree with upgrading
> >>>> version of PCM interface. I suggest adding new info flags for the
> >>>> specific purpose.
> >>>>
> >>>> Disabling mmap for status/control data of PCM substream is just to
> >>>> support architectures to which ALSA PCM core judge non cache 
> coherency.
> >>>> It's not good to utilize it as a solution of this issue because of
> >>>> abuse of interfaces.
> >>>
> >>> No, it's no abuse.  The sync_ptr is the correct and designed API to
> >>> notify appl_ptr update from the user space to kernel.  For most of
> >>> device drivers on x86, this isn't requested strictly, thus it's not
> >>> done when the PCM control is mmapped.  We've been just lucky, so far.
> >>
> >> I described that it was originally designed to solve architecture's
> >> support for cache coherency. Don't depend on extra bonus from it.
> >
> > Oh well...  Please stop such a fundamentalism argument.
> > The original purpose doesn't mean to limit its usage.  We aren't
> > arguing history or religion.
> 
> If an issue were encapsulated only in kernel land or user land,
> I would have no objection to your patch. I'm willing to review it.
> 
> However, in a current case, it relates to interaction between
> kernel/user. In my opinion, it's better to avoid changing fundamental
> meaning of features which are already exposed to one side. Therefore
> on x86/ppc/alpha architectures, userspace applications (not only
> alsa-lib API applications but also applications with any I/O library)
> can use status/control data on own VMA. Else, not. This is independent
> on peripheral devices.
> 
> My intention to continue this discussion is the above. The scope of
> issues is different; one is architecture-dependent, another is
> device-dependent, thus solution should be different. Your patch has a
> potential to puzzle developers for user land, like 'why the page frame
> is not mapped for my application even if on the same architecture? why
> it's device dependent?'
 
It's an optimized path, and the user-space *must* support the codes
with sync_ptr as fallback mandatorily.  The code path using the mmap
is rather optional, so to say.  It should be documented clearly,
indeed, for avoiding misunderstanding.


I hope the situation is clearer for you now.  If you have the
alternative fix for the issue with keeping the current ABI, please
please show up.  We need the fix, not endless discussions.


thanks,

Takashi
diff mbox

Patch

diff --git a/sound/core/pcm_native.c b/sound/core/pcm_native.c
index 2bde07a4a87f..b993b0420411 100644
--- a/sound/core/pcm_native.c
+++ b/sound/core/pcm_native.c
@@ -3189,6 +3189,10 @@  static int snd_pcm_mmap_status(struct snd_pcm_substream *substream, struct file
 			       struct vm_area_struct *area)
 {
 	long size;
+
+	/* suppress status/control mmap when driver requires ack */
+	if (substream->ops->ack)
+		return -ENXIO;
 	if (!(area->vm_flags & VM_READ))
 		return -EINVAL;
 	size = area->vm_end - area->vm_start;
@@ -3225,6 +3229,10 @@  static int snd_pcm_mmap_control(struct snd_pcm_substream *substream, struct file
 				struct vm_area_struct *area)
 {
 	long size;
+
+	/* suppress status/control mmap when driver requires ack */
+	if (substream->ops->ack)
+		return -ENXIO;
 	if (!(area->vm_flags & VM_READ))
 		return -EINVAL;
 	size = area->vm_end - area->vm_start;