[ceph] locking fun with d_materialise_unique()

From: Sage Weil <sage@inktank.com>

Hi Al,

On Tue, 29 Jan 2013, Al Viro wrote:

> On Mon, Jan 28, 2013 at 09:20:34PM -0800, Sage Weil wrote:
> 
> > Yep, that is indeed a problem.  I think we just need to do the r_aborted 
> > and/or r_locked_dir check in the else if condition...
> > 
> > > I'm not sure if we are guaranteed that ceph_readdir_prepopulate() won't
> > > get to its splice_dentry() and d_delete() calls in similar situations -
> > > I hadn't checked that one yet.  If it isn't guaranteed, we have a problem
> > > there as well.
> > 
> > ...and the condition guarding readdir_prepopulate().  :)
> 
> > I think you're reading it correctly.  The main thing to keep in mind here 
> > is that we *do* need to call fill_inode() for the inode metadata on these 
> > requests to keep the mds and client state in sync.  The dentry state is 
> > safe to ignore.
> 
> You mean the parts under
>         if (rinfo->head->is_dentry) {
> and
>         if (rinfo->head->is_target) {
> in there?  Because there's fill_inode() called from readdir_prepopulate()
> and it's a lot more problematic than those two...

Yep, patch below.

> > It would be great to have the dir i_mutex rules summarized somewhere, even 
> > if it is just a copy of the below.  It took a fair bit of trial and error 
> > to infer what was going on when writing this code.  :)
> 
> Directory ->i_mutex rules are in part documented - "what VFS guarantees
> to hold" side is in Documentation/filesystems/directory-locking.  It's
> the other side ("what locks are expected to be held by callers of dcache.c
> functions") that is badly missing...
> 
> > Ping me when you've pushed that branch and I'll take a look...
> 
> To gitolite@ra.kernel.org:/pub/scm/linux/kernel/git/viro/vfs.git
>    01a88fa..4056362  master -> master
> 
> with tentative ceph patch in the very end.  Should be on git.kernel.org
> shortly...

We should drop teh mds_client.c hunk from your patch, and then do 
something like the below.  I'll put it in the ceph tree so we can do some 
basic testing.  Unfortunately, the abort case is a bit annoying to 
trigger.. we'll need to write a new test for it.

Thanks-
sage

From 80d70ef02d5277af8e1355db74fffe71f7217e76 Mon Sep 17 00:00:00 2001
From: Sage Weil <sage@inktank.com>
Date: Tue, 29 Jan 2013 13:01:15 -0800
Subject: [PATCH] ceph: prepopulate inodes only when request is aborted

If r_aborted is true, we do not hold the dir i_mutex, and cannot touch
the dcache.  However, we still need to update the inodes with the state
returned by the MDS.

Reported-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Sage Weil <sage@inktank.com>
---
 fs/ceph/inode.c |   36 ++++++++++++++++++++++++++++++++++++
 1 file changed, 36 insertions(+)

[ceph] locking fun with d_materialise_unique()

Commit Message

Comments

Patch