diff mbox series

[v2,18/20] fuse: Release file in process context

Message ID 20200807195526.426056-19-vgoyal@redhat.com (mailing list archive)
State New, archived
Headers show
Series virtiofs: Add DAX support | expand

Commit Message

Vivek Goyal Aug. 7, 2020, 7:55 p.m. UTC
fuse_file_put(sync) can be called with sync=true/false. If sync=true,
it waits for release request response and then calls iput() in the
caller's context. If sync=false, it does not wait for release request
response, frees the fuse_file struct immediately and req->end function
does the iput().

iput() can be a problem with DAX if called in req->end context. If this
is last reference to inode (VFS has let go its reference already), then
iput() will clean DAX mappings as well and send REMOVEMAPPING requests
and wait for completion. (All the the worker thread context which is
processing fuse replies from daemon on the host).

That means it blocks worker thread and it stops processing further
replies and system deadlocks.

So for now, force sync release of file in case of DAX inodes.

Signed-off-by: Vivek Goyal <vgoyal@redhat.com>
---
 fs/fuse/file.c | 14 +++++++++++++-
 1 file changed, 13 insertions(+), 1 deletion(-)

Comments

Miklos Szeredi Aug. 10, 2020, 8:29 a.m. UTC | #1
On Fri, Aug 7, 2020 at 9:55 PM Vivek Goyal <vgoyal@redhat.com> wrote:
>
> fuse_file_put(sync) can be called with sync=true/false. If sync=true,
> it waits for release request response and then calls iput() in the
> caller's context. If sync=false, it does not wait for release request
> response, frees the fuse_file struct immediately and req->end function
> does the iput().
>
> iput() can be a problem with DAX if called in req->end context. If this
> is last reference to inode (VFS has let go its reference already), then
> iput() will clean DAX mappings as well and send REMOVEMAPPING requests
> and wait for completion. (All the the worker thread context which is
> processing fuse replies from daemon on the host).
>
> That means it blocks worker thread and it stops processing further
> replies and system deadlocks.

Is this reasoning specific to DAX?  Seems to me this is a virtio-fs
specific issue.

Thanks,
Miklos
Vivek Goyal Aug. 10, 2020, 3:48 p.m. UTC | #2
On Mon, Aug 10, 2020 at 10:29:13AM +0200, Miklos Szeredi wrote:
> On Fri, Aug 7, 2020 at 9:55 PM Vivek Goyal <vgoyal@redhat.com> wrote:
> >
> > fuse_file_put(sync) can be called with sync=true/false. If sync=true,
> > it waits for release request response and then calls iput() in the
> > caller's context. If sync=false, it does not wait for release request
> > response, frees the fuse_file struct immediately and req->end function
> > does the iput().
> >
> > iput() can be a problem with DAX if called in req->end context. If this
> > is last reference to inode (VFS has let go its reference already), then
> > iput() will clean DAX mappings as well and send REMOVEMAPPING requests
> > and wait for completion. (All the the worker thread context which is
> > processing fuse replies from daemon on the host).
> >
> > That means it blocks worker thread and it stops processing further
> > replies and system deadlocks.
> 
> Is this reasoning specific to DAX?  Seems to me this is a virtio-fs
> specific issue.

I would think it is virtio-fs + DAX issues. virtio-fs without DAX does
not have this problem.

If making this conditional on DAX an issue, I am wondering, can
we now set args->may_block instead. Now virtiofs will do completion
in a separate worker thread if ->may_block is set. That means, 
worker will block till REMOVEMAPPING completes and then be woken
up.

Do let me know if you like setting args->may_block approach better.
I can give that a try.

Vivek
Vivek Goyal Aug. 10, 2020, 7:37 p.m. UTC | #3
On Mon, Aug 10, 2020 at 10:29:13AM +0200, Miklos Szeredi wrote:
> On Fri, Aug 7, 2020 at 9:55 PM Vivek Goyal <vgoyal@redhat.com> wrote:
> >
> > fuse_file_put(sync) can be called with sync=true/false. If sync=true,
> > it waits for release request response and then calls iput() in the
> > caller's context. If sync=false, it does not wait for release request
> > response, frees the fuse_file struct immediately and req->end function
> > does the iput().
> >
> > iput() can be a problem with DAX if called in req->end context. If this
> > is last reference to inode (VFS has let go its reference already), then
> > iput() will clean DAX mappings as well and send REMOVEMAPPING requests
> > and wait for completion. (All the the worker thread context which is
> > processing fuse replies from daemon on the host).
> >
> > That means it blocks worker thread and it stops processing further
> > replies and system deadlocks.
> 
> Is this reasoning specific to DAX?  Seems to me this is a virtio-fs
> specific issue.

Hi Miklos,

I am looking at this patch closely and maybe we don't need it, given
current code.

Reason being that looks like for virtiofs now we set ctx->destroy=true
and that sets fc->destroy. And if that's set, we don't schedule async req
in fuse_release_common();

fuse_release_common() {
  fuse_file_put(ff, ff->fc->destroy, isdir);
}

And if we don't schedule async file put, we don't run the risk of
blocking worker thread in evict_inode() with DAX enabled.

I ran blogbench multiple times with this patch reverted and I can't
reproduce the deadlock.

So we can drop this patch for now. If need be I will fix it as
post merge patch.

Thanks
Vivek
Miklos Szeredi Aug. 10, 2020, 7:39 p.m. UTC | #4
On Mon, Aug 10, 2020 at 9:37 PM Vivek Goyal <vgoyal@redhat.com> wrote:
>
> On Mon, Aug 10, 2020 at 10:29:13AM +0200, Miklos Szeredi wrote:
> > On Fri, Aug 7, 2020 at 9:55 PM Vivek Goyal <vgoyal@redhat.com> wrote:
> > >
> > > fuse_file_put(sync) can be called with sync=true/false. If sync=true,
> > > it waits for release request response and then calls iput() in the
> > > caller's context. If sync=false, it does not wait for release request
> > > response, frees the fuse_file struct immediately and req->end function
> > > does the iput().
> > >
> > > iput() can be a problem with DAX if called in req->end context. If this
> > > is last reference to inode (VFS has let go its reference already), then
> > > iput() will clean DAX mappings as well and send REMOVEMAPPING requests
> > > and wait for completion. (All the the worker thread context which is
> > > processing fuse replies from daemon on the host).
> > >
> > > That means it blocks worker thread and it stops processing further
> > > replies and system deadlocks.
> >
> > Is this reasoning specific to DAX?  Seems to me this is a virtio-fs
> > specific issue.
>
> Hi Miklos,
>
> I am looking at this patch closely and maybe we don't need it, given
> current code.
>
> Reason being that looks like for virtiofs now we set ctx->destroy=true
> and that sets fc->destroy. And if that's set, we don't schedule async req
> in fuse_release_common();
>
> fuse_release_common() {
>   fuse_file_put(ff, ff->fc->destroy, isdir);
> }
>
> And if we don't schedule async file put, we don't run the risk of
> blocking worker thread in evict_inode() with DAX enabled.
>
> I ran blogbench multiple times with this patch reverted and I can't
> reproduce the deadlock.
>
> So we can drop this patch for now. If need be I will fix it as
> post merge patch.

Okay.  Yeah, I was thinking along the lines of no-async-release-ever
for virtiofs, but apparently that's already done.

Thanks,
Miklos
diff mbox series

Patch

diff --git a/fs/fuse/file.c b/fs/fuse/file.c
index ecd2a42f6e30..605976a586c2 100644
--- a/fs/fuse/file.c
+++ b/fs/fuse/file.c
@@ -527,6 +527,7 @@  void fuse_release_common(struct file *file, bool isdir)
 	struct fuse_file *ff = file->private_data;
 	struct fuse_release_args *ra = ff->release_args;
 	int opcode = isdir ? FUSE_RELEASEDIR : FUSE_RELEASE;
+	bool sync = false;
 
 	fuse_prepare_release(fi, ff, file->f_flags, opcode);
 
@@ -546,8 +547,19 @@  void fuse_release_common(struct file *file, bool isdir)
 	 * Make the release synchronous if this is a fuseblk mount,
 	 * synchronous RELEASE is allowed (and desirable) in this case
 	 * because the server can be trusted not to screw up.
+	 *
+	 * For DAX, fuse server is trusted. So it should be fine to
+	 * do a sync file put. Doing async file put is creating
+	 * problems right now because when request finish, iput()
+	 * can lead to freeing of inode. That means it tears down
+	 * mappings backing DAX memory and sends REMOVEMAPPING message
+	 * to server and blocks for completion. Currently, waiting
+	 * in req->end context deadlocks the system as same worker thread
+	 * can't process REMOVEMAPPING reply it is waiting for.
 	 */
-	fuse_file_put(ff, ff->fc->destroy, isdir);
+	if (IS_DAX(file_inode(file)) || ff->fc->destroy)
+		sync = true;
+	fuse_file_put(ff, sync, isdir);
 }
 
 static int fuse_open(struct inode *inode, struct file *file)