alsa-lib:Make snd_atomic_write_* truly atomic

Message ID	570E2279.7060809@axis.com (mailing list archive)
State	New, archived
Headers	show Return-Path: <alsa-devel-bounces@alsa-project.org> Message-ID: <570E2279.7060809@axis.com> Date: Wed, 13 Apr 2016 12:42:01 +0200 From: Andreas x Larsson <andreas.x.larsson@axis.com> User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:31.0) Gecko/20100101 Icedove/31.8.0 MIME-Version: 1.0 To: <alsa-devel@alsa-project.org> Subject: [alsa-devel] [PATCH] alsa-lib:Make snd_atomic_write_* truly atomic Precedence: list Content-Transfer-Encoding: base64 Content-Type: text/plain; charset="utf-8"; Format="flowed" Errors-To: alsa-devel-bounces@alsa-project.org Sender: alsa-devel-bounces@alsa-project.org

Andreas x Larsson April 13, 2016, 10:42 a.m. UTC

Under some usecases a race condition appears inside
the snd_atomic_write_* functions. The 'begin' and 'end'
variables are updated with the ++ operator which is not
atomic but needs to be. This can be achieved with the
gcc atomic_* built-ins. Combined with __ATOMIC_SEQ_CST
as the memory model, memory barriers are introduced so
those can also be removed from the code.

Signed-off-by: Andreas x Larsson <andrelan@axis.com>
Signed-off-by: Arvid Nihlgård Lindell <arvidnl@axis.com>
---
  alsa-lib/include/iatomic.h | 6 ++----
  1 file changed, 2 insertions(+), 4 deletions(-)

  static __inline__ void snd_atomic_read_init(snd_atomic_read_t *r, 
snd_atomic_write_t *w)

Takashi Iwai April 13, 2016, 1:49 p.m. UTC | #1

On Wed, 13 Apr 2016 12:42:01 +0200,
Andreas x Larsson wrote:
> 
> Under some usecases a race condition appears inside
> the snd_atomic_write_* functions. The 'begin' and 'end'
> variables are updated with the ++ operator which is not
> atomic but needs to be. This can be achieved with the
> gcc atomic_* built-ins. Combined with __ATOMIC_SEQ_CST
> as the memory model, memory barriers are introduced so
> those can also be removed from the code.
> 
> Signed-off-by: Andreas x Larsson <andrelan@axis.com>
> Signed-off-by: Arvid Nihlgård Lindell <arvidnl@axis.com>

Isn't this dependent on gcc version?

And, if we use the gcc atomic operation, the whole macros should be
rewritten in another way -- or consider to drop the whole.  It's only
partly used in pcm_rate.c and pcm_plugin.c in anyway.


thanks,

Takashi

> ---
>   alsa-lib/include/iatomic.h | 6 ++----
>   1 file changed, 2 insertions(+), 4 deletions(-)
> 
> diff --git a/alsa-lib/include/iatomic.h b/alsa-lib/include/iatomic.h
> index acdd3e2..7fafe20 100644
> --- a/alsa-lib/include/iatomic.h
> +++ b/alsa-lib/include/iatomic.h
> @@ -140,14 +140,12 @@ static __inline__ void 
> snd_atomic_write_init(snd_atomic_write_t *w)
> 
>   static __inline__ void snd_atomic_write_begin(snd_atomic_write_t *w)
>   {
> -   w->begin++;
> -   wmb();
> +   __atomic_add_fetch(&w->begin, 1,__ATOMIC_SEQ_CST);
>   }
> 
>   static __inline__ void snd_atomic_write_end(snd_atomic_write_t *w)
>   {
> -   wmb();
> -   w->end++;
> +   __atomic_add_fetch(&w->end, 1, __ATOMIC_SEQ_CST);
>   }
> 
>   static __inline__ void snd_atomic_read_init(snd_atomic_read_t *r, 
> snd_atomic_write_t *w)
> -- 
> 2.1.4
> _______________________________________________
> Alsa-devel mailing list
> Alsa-devel@alsa-project.org
> http://mailman.alsa-project.org/mailman/listinfo/alsa-devel

Ricard Wanderlof May 12, 2016, 12:51 p.m. UTC | #2

On Wed, 13 Apr 2016, Takashi Iwai wrote:

> On Wed, 13 Apr 2016 12:42:01 +0200,
> Andreas x Larsson wrote:
> > 
> > Under some usecases a race condition appears inside
> > the snd_atomic_write_* functions. The 'begin' and 'end'
> > variables are updated with the ++ operator which is not
> > atomic but needs to be. This can be achieved with the
> > gcc atomic_* built-ins. Combined with __ATOMIC_SEQ_CST
> > as the memory model, memory barriers are introduced so
> > those can also be removed from the code.
> > 
> > Signed-off-by: Andreas x Larsson <andrelan@axis.com>
> > Signed-off-by: Arvid Nihlgård Lindell <arvidnl@axis.com>
> 
> Isn't this dependent on gcc version?

Do you mean that the problem is dependent on gcc version, or the solution? 
If the problem only occurs with certain gcc versions, but the _atomic_* 
functions are the proper solution in all cases, then I would think that is 
the way to go.

> And, if we use the gcc atomic operation, the whole macros should be 
> rewritten in another way -- or consider to drop the whole. It's only 
> partly used in pcm_rate.c and pcm_plugin.c in anyway.

Do you mean that the inline functions should be replaced with
#define macros, or that they should be omitted entirely and the resulting 
code be integrated into the relevant places in the .c files?

/Ricard

> 
> > ---
> >   alsa-lib/include/iatomic.h | 6 ++----
> >   1 file changed, 2 insertions(+), 4 deletions(-)
> > 
> > diff --git a/alsa-lib/include/iatomic.h b/alsa-lib/include/iatomic.h
> > index acdd3e2..7fafe20 100644
> > --- a/alsa-lib/include/iatomic.h
> > +++ b/alsa-lib/include/iatomic.h
> > @@ -140,14 +140,12 @@ static __inline__ void 
> > snd_atomic_write_init(snd_atomic_write_t *w)
> > 
> >   static __inline__ void snd_atomic_write_begin(snd_atomic_write_t *w)
> >   {
> > -   w->begin++;
> > -   wmb();
> > +   __atomic_add_fetch(&w->begin, 1,__ATOMIC_SEQ_CST);
> >   }
> > 
> >   static __inline__ void snd_atomic_write_end(snd_atomic_write_t *w)
> >   {
> > -   wmb();
> > -   w->end++;
> > +   __atomic_add_fetch(&w->end, 1, __ATOMIC_SEQ_CST);
> >   }
> > 
> >   static __inline__ void snd_atomic_read_init(snd_atomic_read_t *r, 
> > snd_atomic_write_t *w)
> > -- 
> > 2.1.4
> > _______________________________________________
> > Alsa-devel mailing list
> > Alsa-devel@alsa-project.org
> > http://mailman.alsa-project.org/mailman/listinfo/alsa-devel
> _______________________________________________
> Alsa-devel mailing list
> Alsa-devel@alsa-project.org
> http://mailman.alsa-project.org/mailman/listinfo/alsa-devel
>

Takashi Iwai May 12, 2016, 1:04 p.m. UTC | #3

On Thu, 12 May 2016 14:51:07 +0200,
Ricard Wanderlof wrote:
> 
> 
> On Wed, 13 Apr 2016, Takashi Iwai wrote:
> 
> > On Wed, 13 Apr 2016 12:42:01 +0200,
> > Andreas x Larsson wrote:
> > > 
> > > Under some usecases a race condition appears inside
> > > the snd_atomic_write_* functions. The 'begin' and 'end'
> > > variables are updated with the ++ operator which is not
> > > atomic but needs to be. This can be achieved with the
> > > gcc atomic_* built-ins. Combined with __ATOMIC_SEQ_CST
> > > as the memory model, memory barriers are introduced so
> > > those can also be removed from the code.
> > > 
> > > Signed-off-by: Andreas x Larsson <andrelan@axis.com>
> > > Signed-off-by: Arvid Nihlgård Lindell <arvidnl@axis.com>
> > 
> > Isn't this dependent on gcc version?
> 
> Do you mean that the problem is dependent on gcc version, or the solution? 
> If the problem only occurs with certain gcc versions, but the _atomic_* 
> functions are the proper solution in all cases, then I would think that is 
> the way to go.

I meant the solution, your patch.  Does it work with older version of
gcc, and any other compilers than gcc?

> > And, if we use the gcc atomic operation, the whole macros should be 
> > rewritten in another way -- or consider to drop the whole. It's only 
> > partly used in pcm_rate.c and pcm_plugin.c in anyway.
> 
> Do you mean that the inline functions should be replaced with
> #define macros, or that they should be omitted entirely and the resulting 
> code be integrated into the relevant places in the .c files?

No, no, I questioned about the usefulness of snd_atomic_*()
themselves.  Does it make sense to provide this thin wrapper?
And, above all, does it make sense at all to use this?  When looking
at the current code, it's used in only very limited places.


Takashi

> 
> /Ricard
> 
> > 
> > > ---
> > >   alsa-lib/include/iatomic.h | 6 ++----
> > >   1 file changed, 2 insertions(+), 4 deletions(-)
> > > 
> > > diff --git a/alsa-lib/include/iatomic.h b/alsa-lib/include/iatomic.h
> > > index acdd3e2..7fafe20 100644
> > > --- a/alsa-lib/include/iatomic.h
> > > +++ b/alsa-lib/include/iatomic.h
> > > @@ -140,14 +140,12 @@ static __inline__ void 
> > > snd_atomic_write_init(snd_atomic_write_t *w)
> > > 
> > >   static __inline__ void snd_atomic_write_begin(snd_atomic_write_t *w)
> > >   {
> > > -   w->begin++;
> > > -   wmb();
> > > +   __atomic_add_fetch(&w->begin, 1,__ATOMIC_SEQ_CST);
> > >   }
> > > 
> > >   static __inline__ void snd_atomic_write_end(snd_atomic_write_t *w)
> > >   {
> > > -   wmb();
> > > -   w->end++;
> > > +   __atomic_add_fetch(&w->end, 1, __ATOMIC_SEQ_CST);
> > >   }
> > > 
> > >   static __inline__ void snd_atomic_read_init(snd_atomic_read_t *r, 
> > > snd_atomic_write_t *w)
> > > -- 
> > > 2.1.4
> > > _______________________________________________
> > > Alsa-devel mailing list
> > > Alsa-devel@alsa-project.org
> > > http://mailman.alsa-project.org/mailman/listinfo/alsa-devel
> > _______________________________________________
> > Alsa-devel mailing list
> > Alsa-devel@alsa-project.org
> > http://mailman.alsa-project.org/mailman/listinfo/alsa-devel
> > 
> 
> -- 
> Ricard Wolf Wanderlöf                           ricardw(at)axis.com
> Axis Communications AB, Lund, Sweden            www.axis.com
> Phone +46 46 272 2016                           Fax +46 46 13 61 30
>

Ricard Wanderlof May 12, 2016, 2:17 p.m. UTC | #4

On Thu, 12 May 2016, Takashi Iwai wrote:

> On Thu, 12 May 2016 14:51:07 +0200,
> Ricard Wanderlof wrote:
> > > > 
> > > > Under some usecases a race condition appears inside
> > > > the snd_atomic_write_* functions. The 'begin' and 'end'
> > > > variables are updated with the ++ operator which is not
> > > > atomic but needs to be. This can be achieved with the
> > > > gcc atomic_* built-ins. Combined with __ATOMIC_SEQ_CST
> > > > as the memory model, memory barriers are introduced so
> > > > those can also be removed from the code.
> > > > 
> > > > Signed-off-by: Andreas x Larsson <andrelan@axis.com>
> > > > Signed-off-by: Arvid Nihlgård Lindell <arvidnl@axis.com>
> > > 
> > > Isn't this dependent on gcc version?
> > 
> > Do you mean that the problem is dependent on gcc version, or the solution? 
> > If the problem only occurs with certain gcc versions, but the _atomic_* 
> > functions are the proper solution in all cases, then I would think that is 
> > the way to go.
> 
> I meant the solution, your patch.  Does it work with older version of
> gcc, and any other compilers than gcc?

The code as it currently stands has macros like

#define mb()    __asm__ __volatile__ ("lock; addl $0,0(%%esp)": : :"memory")
#define rmb()   mb()
#define wmb()   __asm__ __volatile__ ("": : :"memory")

which surely are gcc specific. So surely it can't be a requirement that 
the patch work on other compilers as well?

It is relevant to verify which gcc versions support the builtins used in 
the patch however. I believe the kernel has requirements on which gcc 
versions it can be compiled with, but perhaps we need to be more lenient 
with userspace?

> > > And, if we use the gcc atomic operation, the whole macros should be 
> > > rewritten in another way -- or consider to drop the whole. It's only 
> > > partly used in pcm_rate.c and pcm_plugin.c in anyway.
> > 
> > Do you mean that the inline functions should be replaced with
> > #define macros, or that they should be omitted entirely and the resulting 
> > code be integrated into the relevant places in the .c files?
> 
> No, no, I questioned about the usefulness of snd_atomic_*()
> themselves.  Does it make sense to provide this thin wrapper?
> And, above all, does it make sense at all to use this?  When looking
> at the current code, it's used in only very limited places.

In total, the various snd_atomic_* macros are used ~20 times in pcm_rate.c 
and ~30 times in pcm_plugin.c . If the atomic functionality is indeed 
necessary (and it appears to be), it would seem worth while abstracting 
the nitty gritty using macros I think, rather than cluttering up the code.

But that seems to be a different issue altogether than what the patch is 
intended to solve.

However, one thing struck me. In our case we've seen the problem that the 
patch is intended to solve on the MIPS architecture. In this case we get
the following definitions for mb() et al:

/* Generic __sync_synchronize is available from gcc 4.1 */

#define mb() __sync_synchronize()
#define rmb() mb()
#define wmb() mb()

Could in fact the problem be that __sync_synchronize() is buggy on this 
platform, if the problem hasn't turned up anywhere else? One would expect 
otherwise that this issue would have come up much sooner.

/Ricard


> > > > ---
> > > >   alsa-lib/include/iatomic.h | 6 ++----
> > > >   1 file changed, 2 insertions(+), 4 deletions(-)
> > > > 
> > > > diff --git a/alsa-lib/include/iatomic.h b/alsa-lib/include/iatomic.h
> > > > index acdd3e2..7fafe20 100644
> > > > --- a/alsa-lib/include/iatomic.h
> > > > +++ b/alsa-lib/include/iatomic.h
> > > > @@ -140,14 +140,12 @@ static __inline__ void 
> > > > snd_atomic_write_init(snd_atomic_write_t *w)
> > > > 
> > > >   static __inline__ void snd_atomic_write_begin(snd_atomic_write_t *w)
> > > >   {
> > > > -   w->begin++;
> > > > -   wmb();
> > > > +   __atomic_add_fetch(&w->begin, 1,__ATOMIC_SEQ_CST);
> > > >   }
> > > > 
> > > >   static __inline__ void snd_atomic_write_end(snd_atomic_write_t *w)
> > > >   {
> > > > -   wmb();
> > > > -   w->end++;
> > > > +   __atomic_add_fetch(&w->end, 1, __ATOMIC_SEQ_CST);
> > > >   }
> > > > 
> > > >   static __inline__ void snd_atomic_read_init(snd_atomic_read_t *r, 
> > > > snd_atomic_write_t *w)
> > > > -- 
> > > > 2.1.4
> > > > _______________________________________________
> > > > Alsa-devel mailing list
> > > > Alsa-devel@alsa-project.org
> > > > http://mailman.alsa-project.org/mailman/listinfo/alsa-devel
> > > _______________________________________________
> > > Alsa-devel mailing list
> > > Alsa-devel@alsa-project.org
> > > http://mailman.alsa-project.org/mailman/listinfo/alsa-devel
> > > 
> > 
> > -- 
> > Ricard Wolf Wanderlöf                           ricardw(at)axis.com
> > Axis Communications AB, Lund, Sweden            www.axis.com
> > Phone +46 46 272 2016                           Fax +46 46 13 61 30
> > 
>

Takashi Iwai May 12, 2016, 2:47 p.m. UTC | #5

On Thu, 12 May 2016 16:17:47 +0200,
Ricard Wanderlof wrote:
> 
> 
> On Thu, 12 May 2016, Takashi Iwai wrote:
> 
> > On Thu, 12 May 2016 14:51:07 +0200,
> > Ricard Wanderlof wrote:
> > > > > 
> > > > > Under some usecases a race condition appears inside
> > > > > the snd_atomic_write_* functions. The 'begin' and 'end'
> > > > > variables are updated with the ++ operator which is not
> > > > > atomic but needs to be. This can be achieved with the
> > > > > gcc atomic_* built-ins. Combined with __ATOMIC_SEQ_CST
> > > > > as the memory model, memory barriers are introduced so
> > > > > those can also be removed from the code.
> > > > > 
> > > > > Signed-off-by: Andreas x Larsson <andrelan@axis.com>
> > > > > Signed-off-by: Arvid Nihlgård Lindell <arvidnl@axis.com>
> > > > 
> > > > Isn't this dependent on gcc version?
> > > 
> > > Do you mean that the problem is dependent on gcc version, or the solution? 
> > > If the problem only occurs with certain gcc versions, but the _atomic_* 
> > > functions are the proper solution in all cases, then I would think that is 
> > > the way to go.
> > 
> > I meant the solution, your patch.  Does it work with older version of
> > gcc, and any other compilers than gcc?
> 
> The code as it currently stands has macros like
> 
> #define mb()    __asm__ __volatile__ ("lock; addl $0,0(%%esp)": : :"memory")
> #define rmb()   mb()
> #define wmb()   __asm__ __volatile__ ("": : :"memory")
> 
> which surely are gcc specific. So surely it can't be a requirement that 
> the patch work on other compilers as well?

Well, it's not the requirement.  It's just a question.

> It is relevant to verify which gcc versions support the builtins used in 
> the patch however. I believe the kernel has requirements on which gcc 
> versions it can be compiled with, but perhaps we need to be more lenient 
> with userspace?

Yes.  And I bet that the kernel can be still built with a gcc version
that doesn't support the new atomic primitive, too.

> > > > And, if we use the gcc atomic operation, the whole macros should be 
> > > > rewritten in another way -- or consider to drop the whole. It's only 
> > > > partly used in pcm_rate.c and pcm_plugin.c in anyway.
> > > 
> > > Do you mean that the inline functions should be replaced with
> > > #define macros, or that they should be omitted entirely and the resulting 
> > > code be integrated into the relevant places in the .c files?
> > 
> > No, no, I questioned about the usefulness of snd_atomic_*()
> > themselves.  Does it make sense to provide this thin wrapper?
> > And, above all, does it make sense at all to use this?  When looking
> > at the current code, it's used in only very limited places.
> 
> In total, the various snd_atomic_* macros are used ~20 times in pcm_rate.c 
> and ~30 times in pcm_plugin.c . If the atomic functionality is indeed 
> necessary (and it appears to be), it would seem worth while abstracting 
> the nitty gritty using macros I think, rather than cluttering up the code.

In other words, it's used only there.  Most of other codes don't care
about it.  So, the usefulness of the atomic operation there is really
doubtful.

> But that seems to be a different issue altogether than what the patch is 
> intended to solve.
>
> However, one thing struck me. In our case we've seen the problem that the 
> patch is intended to solve on the MIPS architecture. In this case we get
> the following definitions for mb() et al:
> 
> /* Generic __sync_synchronize is available from gcc 4.1 */
> 
> #define mb() __sync_synchronize()
> #define rmb() mb()
> #define wmb() mb()
> 
> Could in fact the problem be that __sync_synchronize() is buggy on this 
> platform, if the problem hasn't turned up anywhere else? One would expect 
> otherwise that this issue would have come up much sooner.

Yeah, that's why I stated the primary question.  The atomic operation
there is mostly cosmetic, not really helpful.  In the whole other
operation, we don't guarantee the thread safety at all.  So,
implementing the atomic op only in a very small part of the whole code
appears like a superfluous optimization to me.

That said: my preferred option is to scratch the whole atomic thing.


thanks,

Takashi

> 
> /Ricard
> 
> 
> > > > > ---
> > > > >   alsa-lib/include/iatomic.h | 6 ++----
> > > > >   1 file changed, 2 insertions(+), 4 deletions(-)
> > > > > 
> > > > > diff --git a/alsa-lib/include/iatomic.h b/alsa-lib/include/iatomic.h
> > > > > index acdd3e2..7fafe20 100644
> > > > > --- a/alsa-lib/include/iatomic.h
> > > > > +++ b/alsa-lib/include/iatomic.h
> > > > > @@ -140,14 +140,12 @@ static __inline__ void 
> > > > > snd_atomic_write_init(snd_atomic_write_t *w)
> > > > > 
> > > > >   static __inline__ void snd_atomic_write_begin(snd_atomic_write_t *w)
> > > > >   {
> > > > > -   w->begin++;
> > > > > -   wmb();
> > > > > +   __atomic_add_fetch(&w->begin, 1,__ATOMIC_SEQ_CST);
> > > > >   }
> > > > > 
> > > > >   static __inline__ void snd_atomic_write_end(snd_atomic_write_t *w)
> > > > >   {
> > > > > -   wmb();
> > > > > -   w->end++;
> > > > > +   __atomic_add_fetch(&w->end, 1, __ATOMIC_SEQ_CST);
> > > > >   }
> > > > > 
> > > > >   static __inline__ void snd_atomic_read_init(snd_atomic_read_t *r, 
> > > > > snd_atomic_write_t *w)
> > > > > -- 
> > > > > 2.1.4
> > > > > _______________________________________________
> > > > > Alsa-devel mailing list
> > > > > Alsa-devel@alsa-project.org
> > > > > http://mailman.alsa-project.org/mailman/listinfo/alsa-devel
> > > > _______________________________________________
> > > > Alsa-devel mailing list
> > > > Alsa-devel@alsa-project.org
> > > > http://mailman.alsa-project.org/mailman/listinfo/alsa-devel
> > > > 
> > > 
> > > -- 
> > > Ricard Wolf Wanderlöf                           ricardw(at)axis.com
> > > Axis Communications AB, Lund, Sweden            www.axis.com
> > > Phone +46 46 272 2016                           Fax +46 46 13 61 30
> > > 
> > 
> 
> -- 
> Ricard Wolf Wanderlöf                           ricardw(at)axis.com
> Axis Communications AB, Lund, Sweden            www.axis.com
> Phone +46 46 272 2016                           Fax +46 46 13 61 30
>

Ricard Wanderlof May 12, 2016, 4:09 p.m. UTC | #6

On Thu, 12 May 2016, Takashi Iwai wrote:

> > The code as it currently stands has macros like
> > 
> > #define mb()    __asm__ __volatile__ ("lock; addl $0,0(%%esp)": : :"memory")
> > #define rmb()   mb()
> > #define wmb()   __asm__ __volatile__ ("": : :"memory")
> > 
> > which surely are gcc specific. So surely it can't be a requirement that 
> > the patch work on other compilers as well?
> 
> Well, it's not the requirement.  It's just a question.

Ok.

> > It is relevant to verify which gcc versions support the builtins used in 
> > the patch however. I believe the kernel has requirements on which gcc 
> > versions it can be compiled with, but perhaps we need to be more lenient 
> > with userspace?
> 
> Yes.  And I bet that the kernel can be still built with a gcc version
> that doesn't support the new atomic primitive, too.

Yes, that would need to be looked in to of course.

> > In total, the various snd_atomic_* macros are used ~20 times in pcm_rate.c 
> > and ~30 times in pcm_plugin.c . If the atomic functionality is indeed 
> > necessary (and it appears to be), it would seem worth while abstracting 
> > the nitty gritty using macros I think, rather than cluttering up the code.
> 
> In other words, it's used only there.  Most of other codes don't care
> about it.  So, the usefulness of the atomic operation there is really
> doubtful.
>
> [...]
>
> > One would expect otherwise that this issue would have come up much 
> > sooner.
> 
> Yeah, that's why I stated the primary question.  The atomic operation
> there is mostly cosmetic, not really helpful.  In the whole other
> operation, we don't guarantee the thread safety at all.  So,
> implementing the atomic op only in a very small part of the whole code
> appears like a superfluous optimization to me.
> 
> That said: my preferred option is to scratch the whole atomic thing.

That's a valid point, still it was once added in a specific commit 
(a02e742609bb0ed9610809fc62b5f927dd086e62), so one would think it was 
added for a reason. It was a long time ago now, 15 years it seems, so of 
course it could have been part of larger thread safety operation that was 
never completed or in the end deemed unnecessary - the commit message 
itself doesn't provide many helpful hints.

The fact that we have seen a race condition which is fixed by the patch 
under discussion indicates that it is in fact requred, although it could 
also be that the patch simply shifts the timing around so that the 
potential real problem is not visible.

/Ricard

Ricard Wanderlof May 13, 2016, 12:26 p.m. UTC | #7

On Thu, 12 May 2016, Takashi Iwai wrote:

> Yeah, that's why I stated the primary question.  The atomic operation
> there is mostly cosmetic, not really helpful.  In the whole other
> operation, we don't guarantee the thread safety at all.  So,
> implementing the atomic op only in a very small part of the whole code
> appears like a superfluous optimization to me.

The question is, if we decide to remove this atomic stuff, how do we 
verify that it actually causes no problems, on all architectures, etc?

/Ricard

Takashi Iwai May 13, 2016, 12:37 p.m. UTC | #8

On Fri, 13 May 2016 14:26:59 +0200,
Ricard Wanderlof wrote:
> 
> 
> On Thu, 12 May 2016, Takashi Iwai wrote:
> 
> > Yeah, that's why I stated the primary question.  The atomic operation
> > there is mostly cosmetic, not really helpful.  In the whole other
> > operation, we don't guarantee the thread safety at all.  So,
> > implementing the atomic op only in a very small part of the whole code
> > appears like a superfluous optimization to me.
> 
> The question is, if we decide to remove this atomic stuff, how do we 
> verify that it actually causes no problems, on all architectures, etc?

It's not what one can show in 100%, as you know.  And, it's the reason
why the code is still left over years...

IMO, we just need to make a clear statement about the thread safety of
alsa-lib.  For a single threaded op, it can't give any regression, of
course.


Takashi

Ricard Wanderlof May 17, 2016, 12:11 p.m. UTC | #9

On Fri, 13 May 2016, Takashi Iwai wrote:

> > > Yeah, that's why I stated the primary question.  The atomic operation
> > > there is mostly cosmetic, not really helpful.  In the whole other
> > > operation, we don't guarantee the thread safety at all.  So,
> > > implementing the atomic op only in a very small part of the whole code
> > > appears like a superfluous optimization to me.
> > 
> > The question is, if we decide to remove this atomic stuff, how do we 
> > verify that it actually causes no problems, on all architectures, etc?
> 
> It's not what one can show in 100%, as you know.  And, it's the reason
> why the code is still left over years...

We have come up upon this problem in conjunction with GStreamer. It 
appears that GStreamer uses one thread for the streaming and a separate 
one for the device management (open/close) on the same device handle, 
which triggers the problem. So it appears that GStreamer falsly is 
assuming some form of thread safety here.

The atomic stuff is so specific to a couple of files, it seems as if it 
was added for a specific purpose, such as this particular case. In the 
other hand, looking at the git history of include/iatomic.h, it's gone 
through a number of cleanups over the years, lately mostly to remove 
originally buggy code, and what is remaining I guess like you say is just 
what nobody dared remove.

On the other hand, as the code stands, I can't really understand how it 
enforces atomicity. The read and write memory barriers just guarantee 
ordering, but there's nothing to stop a context switch from occurring in 
the middle of a ++ operation. wmb() and rmb() might influence timing 
though so that code at least might behave differently with those 
operations in place.

> IMO, we just need to make a clear statement about the thread safety of
> alsa-lib.  For a single threaded op, it can't give any regression, of
> course.

Just removing the memory barriers should have absolutely no effect in the 
single threaded case I would have thought, especially since there are no 
I/O or other hardware related things (such as DMA descriptors) being poked 
at.

/Ricard

Takashi Iwai May 17, 2016, 12:21 p.m. UTC | #10

On Tue, 17 May 2016 14:11:43 +0200,
Ricard Wanderlof wrote:
> 
> 
> On Fri, 13 May 2016, Takashi Iwai wrote:
> 
> > > > Yeah, that's why I stated the primary question.  The atomic operation
> > > > there is mostly cosmetic, not really helpful.  In the whole other
> > > > operation, we don't guarantee the thread safety at all.  So,
> > > > implementing the atomic op only in a very small part of the whole code
> > > > appears like a superfluous optimization to me.
> > > 
> > > The question is, if we decide to remove this atomic stuff, how do we 
> > > verify that it actually causes no problems, on all architectures, etc?
> > 
> > It's not what one can show in 100%, as you know.  And, it's the reason
> > why the code is still left over years...
> 
> We have come up upon this problem in conjunction with GStreamer. It 
> appears that GStreamer uses one thread for the streaming and a separate 
> one for the device management (open/close) on the same device handle, 
> which triggers the problem. So it appears that GStreamer falsly is 
> assuming some form of thread safety here.
> 
> The atomic stuff is so specific to a couple of files, it seems as if it 
> was added for a specific purpose, such as this particular case. In the 
> other hand, looking at the git history of include/iatomic.h, it's gone 
> through a number of cleanups over the years, lately mostly to remove 
> originally buggy code, and what is remaining I guess like you say is just 
> what nobody dared remove.
> 
> On the other hand, as the code stands, I can't really understand how it 
> enforces atomicity. The read and write memory barriers just guarantee 
> ordering, but there's nothing to stop a context switch from occurring in 
> the middle of a ++ operation. wmb() and rmb() might influence timing 
> though so that code at least might behave differently with those 
> operations in place.

There is a manual lock loop with snd_atomic_read_wait(), as found in
pcm_plugin.c.  So, removing the stuff may potentially damage some apps
that wrongly assume the thread safety.

Now I've been considering this issue again, I'm inclined to suggest:
let's begin with deprecating the atomic stuff there, but opt-in via
via a configure option for a while, in case someone still really needs
it.

By making it optional, it's fine to take your patch and cleanup the
old code.  Then we don't have to think of the compatibility with the
old toolchain so much.


Takashi

Ricard Wanderlof May 17, 2016, 1:29 p.m. UTC | #11

On Tue, 17 May 2016, Takashi Iwai wrote:

> There is a manual lock loop with snd_atomic_read_wait(), as found in
> pcm_plugin.c.  So, removing the stuff may potentially damage some apps
> that wrongly assume the thread safety.

Yes, I think could be true. Certainly GStreamer seems to assume this.

> Now I've been considering this issue again, I'm inclined to suggest:
> let's begin with deprecating the atomic stuff there, but opt-in via
> via a configure option for a while, in case someone still really needs
> it.
> 
> By making it optional, it's fine to take your patch and cleanup the
> old code.  Then we don't have to think of the compatibility with the
> old toolchain so much.

I'm trying to consider how this would be done in practice. Basically, add 
a configuration option like '--with-atomic' with the codec in iatomic.h 
looking something like:

  static __inline__ void snd_atomic_write_begin(snd_atomic_write_t *w)
  {
+ #ifdef USE_ATOMIC
+   __atomic_add_fetch(&w->begin, 1,__ATOMIC_SEQ_CST);
+ #else
-   w->begin++;
+ #endif
-   wmb();
  }

Thus removing the (useless?) wmb() in any case, and going to w->begin++ in 
the non-atomic case, and __atomic_add_fetch() in the atomic case?

However, this does mean that those that need/expect atomic operations must 
explicitly add a new configuration option, but perhaps that is ok, as one 
could argue that the applications that need it are buggy to start with. 

Also, it would prohibit building the building of alsa-lib for such 
applications with gcc < 4.7 (when __atomic_add_fetch was introduced), but 
perhaps that's more of an incentive than anything else. :-)

/Ricard

Takashi Iwai May 17, 2016, 1:37 p.m. UTC | #12

On Tue, 17 May 2016 15:29:33 +0200,
Ricard Wanderlof wrote:
> 
> 
> On Tue, 17 May 2016, Takashi Iwai wrote:
> 
> > There is a manual lock loop with snd_atomic_read_wait(), as found in
> > pcm_plugin.c.  So, removing the stuff may potentially damage some apps
> > that wrongly assume the thread safety.
> 
> Yes, I think could be true. Certainly GStreamer seems to assume this.
> 
> > Now I've been considering this issue again, I'm inclined to suggest:
> > let's begin with deprecating the atomic stuff there, but opt-in via
> > via a configure option for a while, in case someone still really needs
> > it.
> > 
> > By making it optional, it's fine to take your patch and cleanup the
> > old code.  Then we don't have to think of the compatibility with the
> > old toolchain so much.
> 
> I'm trying to consider how this would be done in practice. Basically, add 
> a configuration option like '--with-atomic' with the codec in iatomic.h 
> looking something like:
> 
>   static __inline__ void snd_atomic_write_begin(snd_atomic_write_t *w)
>   {
> + #ifdef USE_ATOMIC
> +   __atomic_add_fetch(&w->begin, 1,__ATOMIC_SEQ_CST);
> + #else
> -   w->begin++;
> + #endif
> -   wmb();
>   }
> 
> Thus removing the (useless?) wmb() in any case, and going to w->begin++ in 
> the non-atomic case, and __atomic_add_fetch() in the atomic case?

No, my proposal is to make all snd_atomic_*() NOP unless a configure
option is passed.

> However, this does mean that those that need/expect atomic operations must 
> explicitly add a new configuration option, but perhaps that is ok, as one 
> could argue that the applications that need it are buggy to start with. 

Right.  We can spot out whether the application bugs, and yet some
excuse for breakage :)

> Also, it would prohibit building the building of alsa-lib for such 
> applications with gcc < 4.7 (when __atomic_add_fetch was introduced), but 
> perhaps that's more of an incentive than anything else. :-)

Yep.


Takashi

Ricard Wanderlof May 17, 2016, 3:28 p.m. UTC | #13

On Tue, 17 May 2016, Takashi Iwai wrote:

> No, my proposal is to make all snd_atomic_*() NOP unless a configure
> option is passed.

Ok, I must honestly say I hadn't studied the actual pcm_plugin.c code in 
great detail before (I didn't create the patch but was in the train of 
discussion at the time). I had just assumed that the w->begin and w->end 
variables were some form of counters used outside the atomic functions, 
but I can see now that they are not.

Looking at it now it appears that all this atomic stuff is trying to 
accomplish is to avoid the (sole) read in snd_pcm_plugin_status() from 
happening during one of the many potential writes in the other functions 
in the file, and the only reason for that in turn seems to be to get the 
acceses to *pcm->appl.ptr and *pcm->hw.ptr as well as other things needed 
for the return snd_pcm_status_t* consistent.

But that in turn means that fixing the atomicity of the w->begin and 
w->end accesses as proposed in the patch just glosses over that particular 
implementation issue; if indeed something is calling the functions doing 
the writes concurrently, something else is bound to get screwed up, unless 
by chance the two calls touch different variables, or the timing is such 
that two things aren't touched at the same time.

Right now it feels a bit uncomfortable to fix it in this way. It's just 
luck if it doesn't die somewhere else. Or am I missing something (highly 
likely)?

/Ricard

Takashi Iwai May 17, 2016, 3:37 p.m. UTC | #14

On Tue, 17 May 2016 17:28:43 +0200,
Ricard Wanderlof wrote:
> 
> 
> On Tue, 17 May 2016, Takashi Iwai wrote:
> 
> > No, my proposal is to make all snd_atomic_*() NOP unless a configure
> > option is passed.
> 
> Ok, I must honestly say I hadn't studied the actual pcm_plugin.c code in 
> great detail before (I didn't create the patch but was in the train of 
> discussion at the time). I had just assumed that the w->begin and w->end 
> variables were some form of counters used outside the atomic functions, 
> but I can see now that they are not.
> 
> Looking at it now it appears that all this atomic stuff is trying to 
> accomplish is to avoid the (sole) read in snd_pcm_plugin_status() from 
> happening during one of the many potential writes in the other functions 
> in the file, and the only reason for that in turn seems to be to get the 
> acceses to *pcm->appl.ptr and *pcm->hw.ptr as well as other things needed 
> for the return snd_pcm_status_t* consistent.
> 
> But that in turn means that fixing the atomicity of the w->begin and 
> w->end accesses as proposed in the patch just glosses over that particular 
> implementation issue; if indeed something is calling the functions doing 
> the writes concurrently, something else is bound to get screwed up, unless 
> by chance the two calls touch different variables, or the timing is such 
> that two things aren't touched at the same time.
> 
> Right now it feels a bit uncomfortable to fix it in this way. It's just 
> luck if it doesn't die somewhere else. Or am I missing something (highly 
> likely)?

Right, the current atomic ops usage is only covering the very light
locking scheme that might casually work for some cases.  But it never
works for heavy use, and a more proper way of concurrent accesses
would be needed in anyway.  That's the reason I proposed to drop off
these.  This makes things looking safe although they are not really
usable.

But how dangerous is this move?  I'm not 100% sure so far...


Takashi

alsa-lib:Make snd_atomic_write_* truly atomic

Commit Message

Comments

Patch