diff mbox series

[07/22] shrink_dentry_list(): no need to check that dentry refcount is marked dead

Message ID 20231109062056.3181775-7-viro@zeniv.linux.org.uk (mailing list archive)
State New
Headers show
Series [01/22] struct dentry: get rid of randomize_layout idiocy | expand

Commit Message

Al Viro Nov. 9, 2023, 6:20 a.m. UTC
... we won't see DCACHE_MAY_FREE on anything that is *not* dead
and checking d_flags is just as cheap as checking refcount.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
---
 fs/dcache.c | 5 ++---
 1 file changed, 2 insertions(+), 3 deletions(-)

Comments

Christian Brauner Nov. 9, 2023, 1:53 p.m. UTC | #1
On Thu, Nov 09, 2023 at 06:20:41AM +0000, Al Viro wrote:
> ... we won't see DCACHE_MAY_FREE on anything that is *not* dead
> and checking d_flags is just as cheap as checking refcount.
> 
> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
> ---

Could also be a WARN_ON_ONCE() on d_lockref.count > 0 if DCACHE_MAY_FREE
is set but probably doesn't matter,
Reviewed-by: Christian Brauner <brauner@kernel.org>
Al Viro Nov. 9, 2023, 8:28 p.m. UTC | #2
On Thu, Nov 09, 2023 at 02:53:04PM +0100, Christian Brauner wrote:
> On Thu, Nov 09, 2023 at 06:20:41AM +0000, Al Viro wrote:
> > ... we won't see DCACHE_MAY_FREE on anything that is *not* dead
> > and checking d_flags is just as cheap as checking refcount.
> > 
> > Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
> > ---
> 
> Could also be a WARN_ON_ONCE() on d_lockref.count > 0 if DCACHE_MAY_FREE
> is set but probably doesn't matter,

>= 0, actually, but... TBH, in longer run I would rather represent the
empty husk state (instance just waiting for shrink_dentry_list() to remove
it from its list and free the sucker) not by a bit in ->d_flags, but
by a specific negative value in ->d_lockref.count.

After this series we have the following picture: all real instances come
from __alloc_dentry().  Possible states after that
  Busy <-> Retained -> Dying -> Freeing
                         |        ^
			 V        |
			 Husk ----/

Busy and Retained are live dentries, with positive and zero refcount
resp.; that's the pure refcounting land.  Eventually we get to
succesful lock_for_kill(), which leads to call of __dentry_kill().
That's where the state becomes Dying.  On the way out of __dentry_kill()
(after ->d_prune()/->d_iput()/->d_release()) we either switch to Freeing
(only RCU references remain, actual memory object freed by the end of it)
or Husk (the only non-RCU reference is that of a shrink list it's on).
Husk, in turn, switches to Freeing as soon as shrink_dentry_list() gets
around to it and takes it out of its shrink list.  If shrink_dentry_list()
picks an instance in Dying state, it quietly removes it from the shrink
list and leaves it for __dentry_kill() to deal with.

All transitions are under ->d_lock.  ->d_lockref.count for those is
positive in Busy, zero in Retained and -128 in Dying, Husk and Freeing.
Husk is distinguished by having DCACHE_MAY_FREE set.  Freeing has no
visible difference from Dying.

All refcount changes are under ->d_lock.  None of them should _ever_
change the negative values.  If the last part is easy to verify (right
now it's up to "no refcount overflows, all callers of dget_dlock() are
guaranteed to be dealing with Busy or Retained instances"), it might
make sense to use 3 specific negative values for Dying/Husk/Freeing.
What's more, it might make sense to deal with overflows by adding a
separate unsigned long __d_large_count_dont_you_ever_touch_that_directly;
and have the overflow switch to the 4th special negative number indicating
that real counter sits in there.

I'm not 100% convinced that this is the right way to handle that mess,
but it's an approach I'm at least keeping in mind.  Anyway, we need to
get the damn thing documented and understandable before dealing with
overflows becomes even remotely possible.  As it is, it's way too
subtle and reasoning about correctness is too convoluted and brittle.

PS: "documented" includes explicit description of states, their
representations and transitions between them, as well as the objects
associated with the instance in each of those, what references
are allowed in each state, etc.  And the things like in-lookup,
cursor, etc. - live dentries have sub-states as well...
diff mbox series

Patch

diff --git a/fs/dcache.c b/fs/dcache.c
index 1476f2d6e9ea..5371f32eb4bb 100644
--- a/fs/dcache.c
+++ b/fs/dcache.c
@@ -1186,11 +1186,10 @@  void shrink_dentry_list(struct list_head *list)
 		spin_lock(&dentry->d_lock);
 		rcu_read_lock();
 		if (!shrink_lock_dentry(dentry)) {
-			bool can_free = false;
+			bool can_free;
 			rcu_read_unlock();
 			d_shrink_del(dentry);
-			if (dentry->d_lockref.count < 0)
-				can_free = dentry->d_flags & DCACHE_MAY_FREE;
+			can_free = dentry->d_flags & DCACHE_MAY_FREE;
 			spin_unlock(&dentry->d_lock);
 			if (can_free)
 				dentry_free(dentry);