
[09/15] fold the call of retain_dentry() into fast_dput()

Message ID 20231101062104.2104951-9-viro@zeniv.linux.org.uk (mailing list archive)
State New, archived
Series [01/15] fast_dput(): having ->d_delete() is not reason to delay refcount decrement

Commit Message

Al Viro Nov. 1, 2023, 6:20 a.m. UTC
Calls of retain_dentry() happen immediately after getting false
from fast_dput(), and getting true from retain_dentry() is
treated the same way as a non-zero refcount would be treated by
fast_dput() - unlock the dentry and bugger off.

Doing that in fast_dput() itself is simpler.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
---
 fs/dcache.c | 13 ++++---------
 1 file changed, 4 insertions(+), 9 deletions(-)
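
For context, the calling pattern being folded looks like this before the
patch (a sketch of dput()'s slow path, reconstructed from the hunks below;
the surrounding loop is elided):

	rcu_read_lock();
	if (likely(fast_dput(dentry))) {
		rcu_read_unlock();
		return;
	}
	/* slow case: now with the dentry lock held */
	rcu_read_unlock();
	if (likely(retain_dentry(dentry))) {	/* same exit as refcount != 0 */
		spin_unlock(&dentry->d_lock);
		return;
	}
	dentry->d_lockref.count = 1;
	dentry = dentry_kill(dentry);

After the patch, a false return from fast_dput() already means "refcount
hit zero and retain_dentry() said no", so the retain_dentry() block above
disappears from the callers.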

Comments

Al Viro Nov. 1, 2023, 8:45 a.m. UTC | #1
On Wed, Nov 01, 2023 at 06:20:58AM +0000, Al Viro wrote:
> Calls of retain_dentry() happen immediately after getting false
> from fast_dput(), and getting true from retain_dentry() is
> treated the same way as a non-zero refcount would be treated by
> fast_dput() - unlock the dentry and bugger off.
> 
> Doing that in fast_dput() itself is simpler.

FWIW, I wonder if it would be better to reorganize it a bit -

// in some cases we can show that retain_dentry() would return true
// without having to take ->d_lock
< your comments regarding that part go here>
static inline bool lockless_retain_dentry(struct dentry *dentry)
{
	unsigned int d_flags;

	/* order the d_flags read after the refcount decrement */
	smp_rmb();
	d_flags = READ_ONCE(dentry->d_flags);
	d_flags &= DCACHE_REFERENCED | DCACHE_LRU_LIST | DCACHE_OP_DELETE |
			DCACHE_DISCONNECTED | DCACHE_DONTCACHE;

	/* Nothing to do? Dropping the reference was all we needed? */
	if (d_flags == (DCACHE_REFERENCED | DCACHE_LRU_LIST) &&
	    !d_unhashed(dentry))
		return true;
	return false;
}

and fast_dput() becomes

static inline bool fast_dput(struct dentry *dentry)
{
	int ret;

	// try to do decrement locklessly
	ret = lockref_put_return(&dentry->d_lockref);
	if (likely(ret >= 0)) {
		// could we show that full check would succeed?
		if (ret || lockless_retain_dentry(dentry))
			return true;
		// too bad, have to lock it and do full variant
		spin_lock(&dentry->d_lock);
	} else {
		// failed, no chance to avoid grabbing lock
		spin_lock(&dentry->d_lock);
		// underflow?  whine and back off - we are done
		if (WARN_ON_ONCE(dentry->d_lockref.count <= 0)) {
			spin_unlock(&dentry->d_lock);
			return true;
		}
		// decrement it under lock, then...
		dentry->d_lockref.count--;
	}
	// full check it is...
	if (dentry->d_lockref.count || retain_dentry(dentry)) {
		spin_unlock(&dentry->d_lock);
		return true;
	}
	return false;
}

Might be easier to follow that way...  Another thing: would you mind

#if USE_CMPXCHG_LOCKREF
extern int lockref_put_return(struct lockref *);
#else
static inline int lockref_put_return(struct lockref *l)
{
	return -1;
}
#endif

in include/linux/lockref.h?  Would be useful on DEBUG_SPINLOCK configs...
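
A sketch of what that fallback buys (illustration only, not part of the
proposal): callers keep a single code path, and configs without cmpxchg
lockrefs - DEBUG_SPINLOCK among them - simply always take the locked
branch:

	ret = lockref_put_return(&dentry->d_lockref);
	if (ret < 0) {
		/* stub returned -1: no lockless fast path, take the lock */
		spin_lock(&dentry->d_lock);
		...
	}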
Linus Torvalds Nov. 1, 2023, 5:30 p.m. UTC | #2
On Tue, 31 Oct 2023 at 22:45, Al Viro <viro@zeniv.linux.org.uk> wrote:
>
> On Wed, Nov 01, 2023 at 06:20:58AM +0000, Al Viro wrote:
> > Calls of retain_dentry() happen immediately after getting false
> > from fast_dput(), and getting true from retain_dentry() is
> > treated the same way as a non-zero refcount would be treated by
> > fast_dput() - unlock the dentry and bugger off.
> >
> > Doing that in fast_dput() itself is simpler.
>
> FWIW, I wonder if it would be better to reorganize it a bit -

Hmm. Yes. Except I don't love how the retaining logic is then duplicated.

Could we perhaps at least try to share the dentry flag tests between
the "real" retain_dentry() code and the lockless version?

> Another thing: would you mind
>
> #if USE_CMPXCHG_LOCKREF
> extern int lockref_put_return(struct lockref *);
> #else
> static inline int lockref_put_return(struct lockref *l)
> {
>         return -1;
> }
> #endif
>
> in include/linux/lockref.h?  Would be useful on DEBUG_SPINLOCK configs...

The above sounds like a good idea, not only for better code generation
for the debug case, but because it would have possibly made the erofs
misuse more obvious to people.

             Linus
Al Viro Nov. 1, 2023, 6:19 p.m. UTC | #3
On Wed, Nov 01, 2023 at 07:30:34AM -1000, Linus Torvalds wrote:
> On Tue, 31 Oct 2023 at 22:45, Al Viro <viro@zeniv.linux.org.uk> wrote:
> >
> > On Wed, Nov 01, 2023 at 06:20:58AM +0000, Al Viro wrote:
> > > Calls of retain_dentry() happen immediately after getting false
> > > from fast_dput(), and getting true from retain_dentry() is
> > > treated the same way as a non-zero refcount would be treated by
> > > fast_dput() - unlock the dentry and bugger off.
> > >
> > > Doing that in fast_dput() itself is simpler.
> >
> > FWIW, I wonder if it would be better to reorganize it a bit -
> 
> Hmm. Yes. Except I don't love how the retaining logic is then duplicated.

Er...  That change would be an equivalent transformation - the same duplication
is there right now...

> Could we perhaps at least try to share the dentry flag tests between
> the "real" retain_dentry() code and the lockless version?

Umm...  There are 3 groups:

DONTCACHE, DISCONNECTED - hard false
!LRU_LIST, !REFERENCED - not an obstacle to true, but need to take locks
OP_DELETE - can't tell without asking filesystem, which would need ->d_lock.

gcc-12 on x86 turns the series of ifs into
	movl	%edi, %eax
	andl	$32832, %eax
	cmpl	$32832, %eax
	jne	.L17
	andl	$168, %edi
	jne	.L17
instead of combining that into
	andl	$33000, %edi
	cmpl	$32832, %edi
	jne	.L17

OTOH, that's not much of a pessimization...  Up to you.
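
For reference, the two source forms behind that codegen (a sketch; the
first mirrors the separate per-group tests, the second folds all five
flags into one mask-and-compare):

	/* series of ifs */
	if ((d_flags & (DCACHE_REFERENCED | DCACHE_LRU_LIST)) !=
	    (DCACHE_REFERENCED | DCACHE_LRU_LIST))
		goto slow;
	if (d_flags & (DCACHE_OP_DELETE | DCACHE_DISCONNECTED |
		       DCACHE_DONTCACHE))
		goto slow;

	/* combined */
	if ((d_flags & (DCACHE_REFERENCED | DCACHE_LRU_LIST |
			DCACHE_OP_DELETE | DCACHE_DISCONNECTED |
			DCACHE_DONTCACHE)) !=
	    (DCACHE_REFERENCED | DCACHE_LRU_LIST))
		goto slow;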


> > Another thing: would you mind
> >
> > #if USE_CMPXCHG_LOCKREF
> > extern int lockref_put_return(struct lockref *);
> > #else
> > static inline int lockref_put_return(struct lockref *l)
> > {
> >         return -1;
> > }
> > #endif
> >
> > in include/linux/lockref.h?  Would be useful on DEBUG_SPINLOCK configs...
> 
> The above sounds like a good idea, not only for better code generation
> for the debug case, but because it would have possibly made the erofs
> misuse more obvious to people.

To make it even more obvious, perhaps rename it as well?  I.e.

/*
 * unlike the rest of these primitives, the one below does *not* contain
 * a fallback; caller needs to take care of handling that.
 */
#if USE_CMPXCHG_LOCKREF
extern int __lockref_put_return(struct lockref *);
#else
static inline int __lockref_put_return(struct lockref *l)
{
	return -1;
}
#endif

Patch

diff --git a/fs/dcache.c b/fs/dcache.c
index 30bebec591db..6f79d452af81 100644
--- a/fs/dcache.c
+++ b/fs/dcache.c
@@ -757,6 +757,8 @@  static struct dentry *dentry_kill(struct dentry *dentry)
  * Try to do a lockless dput(), and return whether that was successful.
  *
  * If unsuccessful, we return false, having already taken the dentry lock.
+ * In that case refcount is guaranteed to be zero and we have already
+ * decided that it's not worth keeping around.
  *
  * The caller needs to hold the RCU read lock, so that the dentry is
  * guaranteed to stay around even if the refcount goes down to zero!
@@ -842,7 +844,7 @@  static inline bool fast_dput(struct dentry *dentry)
 	 * don't need to do anything else.
 	 */
 locked:
-	if (dentry->d_lockref.count) {
+	if (dentry->d_lockref.count || retain_dentry(dentry)) {
 		spin_unlock(&dentry->d_lock);
 		return true;
 	}
@@ -889,12 +891,6 @@  void dput(struct dentry *dentry)
 
 		/* Slow case: now with the dentry lock held */
 		rcu_read_unlock();
-
-		if (likely(retain_dentry(dentry))) {
-			spin_unlock(&dentry->d_lock);
-			return;
-		}
-
 		dentry->d_lockref.count = 1;
 		dentry = dentry_kill(dentry);
 	}
@@ -920,8 +916,7 @@  void dput_to_list(struct dentry *dentry, struct list_head *list)
 		return;
 	}
 	rcu_read_unlock();
-	if (!retain_dentry(dentry))
-		to_shrink_list(dentry, list);
+	to_shrink_list(dentry, list);
 	spin_unlock(&dentry->d_lock);
 }
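
For reference, dput() as it reads with this applied (reconstructed from
the hunks above; assumes the surrounding code is otherwise unchanged):

void dput(struct dentry *dentry)
{
	while (dentry) {
		might_sleep();

		rcu_read_lock();
		if (likely(fast_dput(dentry))) {
			rcu_read_unlock();
			return;
		}

		/* Slow case: now with the dentry lock held */
		rcu_read_unlock();
		dentry->d_lockref.count = 1;
		dentry = dentry_kill(dentry);
	}
}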