diff mbox

[14/15] futex: convert futex_pi_state.refcount to refcount_t

Message ID 20170904120009.ah2qu3lbgdqdgz6i@hirez.programming.kicks-ass.net (mailing list archive)
State New, archived
Headers show

Commit Message

Peter Zijlstra Sept. 4, 2017, noon UTC
On Mon, Sep 04, 2017 at 10:31:54AM +0000, Reshetova, Elena wrote:
> > > But can they make "fast" implementation on ARM that would give stronger
> > > memory guarantees?
> > 
> > Whatever for?
> 
> Well, maybe just by default when arch.-specific implementation is
> done. But I was just trying to speculate to understand. I will resend
> this one with new comment added. 

So the generic lib/refcount.c already has weak ordering. It doesn't make
sense for an arch specific implementation (on a weakly ordered machine)
to provide stronger guarantees (it would make things slower).

The weaker ordering of the refcount_t primitives is sufficient if we're
talking pure refcounts. If for some reason code relies on stronger
ordering there _SHOULD_ be a comment with describing the additional
ordering requirements.

But that's a fairly big 'should'. I can well imagine the comment not
being there. In fact, see below.

> Still not sure if I need to resend the whole series with updated
> commits or break this up by individual patches further for the
> separate merges. 

I've yet to look at the ones targeted at subsystems I do, I'm forever
and terminally behind on review :/

I called out the issue on futex in particular because it is fairly
tricky code that.

Now Thomas would like you to mention the fact that refcount_t doesn't
provide the exact same ordering as the atomic_t usages it replaces and
I think it would be good if you could hand-wave an argument on why the
futex code doesn't care.


Now, suppose we were to convert i_count to refcount_t (yes, I know, my
initial conversion wasn't well received), then we need to add
futex_get_inode() similar to futex_get_mm().

That is, smp_mb__{before,after}_atomic() works as expected and can be
used to fortify the implied barriers by refcount_t.

---
Subject: fs,inode: Add comment explaining additional ordering

Add a note to ihold() to document the ordering futex relies upon.

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
---
 fs/inode.c | 4 ++++
 1 file changed, 4 insertions(+)

Comments

Reshetova, Elena Sept. 14, 2017, 4:02 p.m. UTC | #1
Sorry for delayed reply. 

> On Mon, Sep 04, 2017 at 10:31:54AM +0000, Reshetova, Elena wrote:
> > > > But can they make "fast" implementation on ARM that would give stronger
> > > > memory guarantees?
> > >
> > > Whatever for?
> >
> > Well, maybe just by default when arch.-specific implementation is
> > done. But I was just trying to speculate to understand. I will resend
> > this one with new comment added.
> 
> So the generic lib/refcount.c already has weak ordering. It doesn't make
> sense for an arch specific implementation (on a weakly ordered machine)
> to provide stronger guarantees (it would make things slower).

Thank you for explaining this! Helps to understand a lot. 
> 
> The weaker ordering of the refcount_t primitives is sufficient if we're
> talking pure refcounts. If for some reason code relies on stronger
> ordering there _SHOULD_ be a comment with describing the additional
> ordering requirements.
> 
> But that's a fairly big 'should'. I can well imagine the comment not
> being there. In fact, see below.
> 
> > Still not sure if I need to resend the whole series with updated
> > commits or break this up by individual patches further for the
> > separate merges.
> 
> I've yet to look at the ones targeted at subsystems I do, I'm forever
> and terminally behind on review :/
> 
> I called out the issue on futex in particular because it is fairly
> tricky code that.
> 
> Now Thomas would like you to mention the fact that refcount_t doesn't
> provide the exact same ordering as the atomic_t usages it replaces and
> I think it would be good if you could hand-wave an argument on why the
> futex code doesn't care.

I think I can mention the ordering differences on all yet-to-be-merged
patches to make sure maintainers are aware. The problem with concrete
cases is that I don't usually have enough knowledge of code to understand 
for sure where it would matter or not. Previously I was even under impression
that it should not matter at all for the variables that we are converting since
they are classical refcounters, but your examples clearly show that it is not
*always* the case (but I think it is the case for most of patches). 
So, I am a bit confused on how to approach this. 
Either just put a statement to all patches and rely that maintainers certainly
know their code and can catch these things or do an analysis myself, but 
then I would need a bit of guidance on what is the reasonable heuristics on 
how check each refcounter. This goes really beyond my current
kernel knowledge, but I am happy to learn if somebody points me to smth
I can read/fill missing points.   

Best Regards,
Elena.


> 
> 
> Now, suppose we were to convert i_count to refcount_t (yes, I know, my
> initial conversion wasn't well received), then we need to add
> futex_get_inode() similar to futex_get_mm().
> 
> That is, smp_mb__{before,after}_atomic() works as expected and can be
> used to fortify the implied barriers by refcount_t.
> 
> ---
> Subject: fs,inode: Add comment explaining additional ordering
> 
> Add a note to ihold() to document the ordering futex relies upon.
> 
> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
> ---
>  fs/inode.c | 4 ++++
>  1 file changed, 4 insertions(+)
> 
> diff --git a/fs/inode.c b/fs/inode.c
> index 50370599e371..17192ba92fef 100644
> --- a/fs/inode.c
> +++ b/fs/inode.c
> @@ -395,6 +395,10 @@ void __iget(struct inode *inode)
>   */
>  void ihold(struct inode *inode)
>  {
> +	/*
> +	 * Note: futex.c:get_futex_key_refs() relies on this function
> +	 * implying an smp_mb().
> +	 */
>  	WARN_ON(atomic_inc_return(&inode->i_count) < 2);
>  }
>  EXPORT_SYMBOL(ihold);
Reshetova, Elena Oct. 20, 2017, 12:03 p.m. UTC | #2
Since I am not really sure what to do with this futex patch, I will drop it
from the new series that I am about to send now. 

Please let me know what exactly should I do with this patch, as I wrote 
previously I really don't understand. 

Best Regards,
Elena.

> Sorry for delayed reply.
> 
> > On Mon, Sep 04, 2017 at 10:31:54AM +0000, Reshetova, Elena wrote:
> > > > > But can they make "fast" implementation on ARM that would give stronger
> > > > > memory guarantees?
> > > >
> > > > Whatever for?
> > >
> > > Well, maybe just by default when arch.-specific implementation is
> > > done. But I was just trying to speculate to understand. I will resend
> > > this one with new comment added.
> >
> > So the generic lib/refcount.c already has weak ordering. It doesn't make
> > sense for an arch specific implementation (on a weakly ordered machine)
> > to provide stronger guarantees (it would make things slower).
> 
> Thank you for explaining this! Helps to understand a lot.
> >
> > The weaker ordering of the refcount_t primitives is sufficient if we're
> > talking pure refcounts. If for some reason code relies on stronger
> > ordering there _SHOULD_ be a comment with describing the additional
> > ordering requirements.
> >
> > But that's a fairly big 'should'. I can well imagine the comment not
> > being there. In fact, see below.
> >
> > > Still not sure if I need to resend the whole series with updated
> > > commits or break this up by individual patches further for the
> > > separate merges.
> >
> > I've yet to look at the ones targeted at subsystems I do, I'm forever
> > and terminally behind on review :/
> >
> > I called out the issue on futex in particular because it is fairly
> > tricky code that.
> >
> > Now Thomas would like you to mention the fact that refcount_t doesn't
> > provide the exact same ordering as the atomic_t usages it replaces and
> > I think it would be good if you could hand-wave an argument on why the
> > futex code doesn't care.
> 
> I think I can mention the ordering differences on all yet-to-be-merged
> patches to make sure maintainers are aware. The problem with concrete
> cases is that I don't usually have enough knowledge of code to understand
> for sure where it would matter or not. Previously I was even under impression
> that it should not matter at all for the variables that we are converting since
> they are classical refcounters, but your examples clearly show that it is not
> *always* the case (but I think it is the case for most of patches).
> So, I am a bit confused on how to approach this.
> Either just put a statement to all patches and rely that maintainers certainly
> know their code and can catch these things or do an analysis myself, but
> then I would need a bit of guidance on what is the reasonable heuristics on
> how check each refcounter. This goes really beyond my current
> kernel knowledge, but I am happy to learn if somebody points me to smth
> I can read/fill missing points.
> 
> Best Regards,
> Elena.
> 
> 
> >
> >
> > Now, suppose we were to convert i_count to refcount_t (yes, I know, my
> > initial conversion wasn't well received), then we need to add
> > futex_get_inode() similar to futex_get_mm().
> >
> > That is, smp_mb__{before,after}_atomic() works as expected and can be
> > used to fortify the implied barriers by refcount_t.
> >
> > ---
> > Subject: fs,inode: Add comment explaining additional ordering
> >
> > Add a note to ihold() to document the ordering futex relies upon.
> >
> > Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
> > ---
> >  fs/inode.c | 4 ++++
> >  1 file changed, 4 insertions(+)
> >
> > diff --git a/fs/inode.c b/fs/inode.c
> > index 50370599e371..17192ba92fef 100644
> > --- a/fs/inode.c
> > +++ b/fs/inode.c
> > @@ -395,6 +395,10 @@ void __iget(struct inode *inode)
> >   */
> >  void ihold(struct inode *inode)
> >  {
> > +	/*
> > +	 * Note: futex.c:get_futex_key_refs() relies on this function
> > +	 * implying an smp_mb().
> > +	 */
> >  	WARN_ON(atomic_inc_return(&inode->i_count) < 2);
> >  }
> >  EXPORT_SYMBOL(ihold);
Thomas Gleixner Oct. 20, 2017, 12:30 p.m. UTC | #3
On Fri, 20 Oct 2017, Reshetova, Elena wrote:

> Since I am not really sure what to do with this futex patch, I will drop it
> from the new series that I am about to send now. 
> 
> Please let me know what exactly should I do with this patch, as I wrote 
> previously I really don't understand. 

As Peter said:

> > > Now Thomas would like you to mention the fact that refcount_t doesn't
> > > provide the exact same ordering as the atomic_t usages it replaces and
> > > I think it would be good if you could hand-wave an argument on why the
> > > futex code doesn't care.

So if you cannot come with a halfways reasonable argument then at least
make it entirely clear that refcount_t is not the same as atomic_t in terms
of ordering guarantees and you think that it does not matter but
explicitely ask for help from the developers and maintainers to look at it.

Thanks,

	tglx
Reshetova, Elena Oct. 23, 2017, 7:36 a.m. UTC | #4
> On Fri, 20 Oct 2017, Reshetova, Elena wrote:
> 
> > Since I am not really sure what to do with this futex patch, I will drop it
> > from the new series that I am about to send now.
> >
> > Please let me know what exactly should I do with this patch, as I wrote
> > previously I really don't understand.
> 
> As Peter said:
> 
> > > > Now Thomas would like you to mention the fact that refcount_t doesn't
> > > > provide the exact same ordering as the atomic_t usages it replaces and
> > > > I think it would be good if you could hand-wave an argument on why the
> > > > futex code doesn't care.
> 
> So if you cannot come with a halfways reasonable argument then at least
> make it entirely clear that refcount_t is not the same as atomic_t in terms
> of ordering guarantees and you think that it does not matter but
> explicitely ask for help from the developers and maintainers to look at it.


There is another (I think better) proposal that came from Kees now: 
What if we change the default refcount_t to provide the strict memory
ordering like atomic_t?
I guess the reason why Peter intially went with *_relaxed() versions of compare and
exchange loop was performance. But now when we have x86 spec. fast
implementation, maybe it is ok to make the default FULL refcount slower?

What do people think of this?

Best Regards,
Elena.


> 
> Thanks,
> 
> 	tglx
>
diff mbox

Patch

diff --git a/fs/inode.c b/fs/inode.c
index 50370599e371..17192ba92fef 100644
--- a/fs/inode.c
+++ b/fs/inode.c
@@ -395,6 +395,10 @@  void __iget(struct inode *inode)
  */
 void ihold(struct inode *inode)
 {
+	/*
+	 * Note: futex.c:get_futex_key_refs() relies on this function
+	 * implying an smp_mb().
+	 */
 	WARN_ON(atomic_inc_return(&inode->i_count) < 2);
 }
 EXPORT_SYMBOL(ihold);