[04/14] iomap: Call inode_dio_end() before generic_write_sync()

Message ID 20200924163922.2547-5-rgoldwyn@suse.de
State New, archived
Series BTRFS DIO inode locking/D_SYNC fix

Commit Message

Goldwyn Rodrigues Sept. 24, 2020, 4:39 p.m. UTC
From: Goldwyn Rodrigues <rgoldwyn@suse.com>

The iomap completion routine can deadlock with btrfs_fallocate() because
of the call to generic_write_sync().

P0                      P1
inode_lock()            fallocate(FALLOC_FL_ZERO_RANGE)
__iomap_dio_rw()        inode_lock()
                        <block>
<submits IO>
<completes IO>
inode_unlock()
                        <gets inode_lock()>
                        inode_dio_wait()
iomap_dio_complete()
  generic_write_sync()
    btrfs_file_fsync()
      inode_lock()
      <deadlock>

inode_dio_end() is used to notify the end of DIO data in order
to synchronize with truncate. Call inode_dio_end() before calling
generic_write_sync(), so filesystems can lock i_rwsem during a sync.

This matches the way it is done in fs/direct-io.c:dio_complete().

Signed-off-by: Goldwyn Rodrigues <rgoldwyn@suse.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Josef Bacik <josef@toxicpanda.com>
---
 fs/iomap/direct-io.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

Comments

Darrick J. Wong Sept. 26, 2020, 1:51 a.m. UTC | #1
On Thu, Sep 24, 2020 at 11:39:11AM -0500, Goldwyn Rodrigues wrote:
> From: Goldwyn Rodrigues <rgoldwyn@suse.com>
> 
> iomap complete routine can deadlock with btrfs_fallocate because of the
> call to generic_write_sync().
> 
> P0                      P1
> inode_lock()            fallocate(FALLOC_FL_ZERO_RANGE)
> __iomap_dio_rw()        inode_lock()
>                         <block>
> <submits IO>
> <completes IO>
> inode_unlock()
>                         <gets inode_lock()>
>                         inode_dio_wait()
> iomap_dio_complete()
>   generic_write_sync()
>     btrfs_file_fsync()
>       inode_lock()
>       <deadlock>
> 
> inode_dio_end() is used to notify the end of DIO data in order
> to synchronize with truncate. Call inode_dio_end() before calling
> generic_write_sync(), so filesystems can lock i_rwsem during a sync.
> 
> This matches the way it is done in fs/direct-io.c:dio_complete().
> 
> Signed-off-by: Goldwyn Rodrigues <rgoldwyn@suse.com>
> Reviewed-by: Christoph Hellwig <hch@lst.de>
> Reviewed-by: Josef Bacik <josef@toxicpanda.com>

Looks ok (at least with the fses that use either iomap or ye olde
directio) to me...

Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>

So, uh, do you want me to pull these two iomap patches in for 5.10?

--D

> ---
>  fs/iomap/direct-io.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/fs/iomap/direct-io.c b/fs/iomap/direct-io.c
> index b88dbfe15118..0ea88ae57cb2 100644
> --- a/fs/iomap/direct-io.c
> +++ b/fs/iomap/direct-io.c
> @@ -118,6 +118,7 @@ ssize_t iomap_dio_complete(struct iomap_dio *dio)
>  			dio_warn_stale_pagecache(iocb->ki_filp);
>  	}
>  
> +	inode_dio_end(file_inode(iocb->ki_filp));
>  	/*
>  	 * If this is a DSYNC write, make sure we push it to stable storage now
>  	 * that we've written data.
> @@ -125,7 +126,6 @@ ssize_t iomap_dio_complete(struct iomap_dio *dio)
>  	if (ret > 0 && (dio->flags & IOMAP_DIO_NEED_SYNC))
>  		ret = generic_write_sync(iocb, ret);
>  
> -	inode_dio_end(file_inode(iocb->ki_filp));
>  	kfree(dio);
>  
>  	return ret;
> -- 
> 2.26.2
>
David Sterba Sept. 28, 2020, 3:04 p.m. UTC | #2
On Fri, Sep 25, 2020 at 06:51:08PM -0700, Darrick J. Wong wrote:
> Looks ok (at least with the fses that use either iomap or ye olde
> directio) to me...
> 
> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
> 
> So, uh, do you want me to pull these two iomap patches in for 5.10?

That would be great, thanks. Once they land in 5.10-rc we'll be able to
base the rest on some master snapshot and target 5.11 for release.
Darrick J. Wong Sept. 28, 2020, 4:12 p.m. UTC | #3
On Mon, Sep 28, 2020 at 05:04:19PM +0200, David Sterba wrote:
> That would be great, thanks. Once they land in 5.10-rc we'll be able to
> base the rest on some master snapshot and target 5.11 for release.

Ok, done. :)

--D
Patch

diff --git a/fs/iomap/direct-io.c b/fs/iomap/direct-io.c
index b88dbfe15118..0ea88ae57cb2 100644
--- a/fs/iomap/direct-io.c
+++ b/fs/iomap/direct-io.c
@@ -118,6 +118,7 @@  ssize_t iomap_dio_complete(struct iomap_dio *dio)
 			dio_warn_stale_pagecache(iocb->ki_filp);
 	}
 
+	inode_dio_end(file_inode(iocb->ki_filp));
 	/*
 	 * If this is a DSYNC write, make sure we push it to stable storage now
 	 * that we've written data.
@@ -125,7 +126,6 @@  ssize_t iomap_dio_complete(struct iomap_dio *dio)
 	if (ret > 0 && (dio->flags & IOMAP_DIO_NEED_SYNC))
 		ret = generic_write_sync(iocb, ret);
 
-	inode_dio_end(file_inode(iocb->ki_filp));
 	kfree(dio);
 
 	return ret;
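
The effect of the reordering above can be illustrated with a small userspace
model. This is only a sketch and not kernel code: struct model_inode and all
model_*/complete_* names are hypothetical, the counter stands in for the
in-flight DIO count that inode_dio_wait() waits on, and the boolean stands in
for P1 holding the inode lock while it waits. The model is single-threaded and
only reports whether the sync step could take the lock.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy stand-in for the inode state involved; not the kernel's struct inode. */
struct model_inode {
	int dio_count;        /* models the in-flight direct I/O count */
	bool waiter_has_lock; /* P1 holds the inode lock inside inode_dio_wait() */
};

/* Models inode_dio_end(): drop the count; at zero the waiter can proceed. */
static void model_dio_end(struct model_inode *inode)
{
	assert(inode->dio_count > 0);
	if (--inode->dio_count == 0)
		inode->waiter_has_lock = false; /* P1 finishes and unlocks */
}

/*
 * Models an fsync implementation that takes the inode lock, as
 * btrfs_file_fsync() does in the trace above. Returns false when the
 * lock is still held by the waiter, i.e. the caller would block forever.
 */
static bool model_sync(const struct model_inode *inode)
{
	return !inode->waiter_has_lock;
}

/* Ordering before the patch: sync first, then signal the end of DIO. */
static bool complete_old_order(struct model_inode *inode)
{
	bool ok = model_sync(inode);

	model_dio_end(inode);
	return ok;
}

/* Ordering after the patch: signal the end of DIO, then sync. */
static bool complete_new_order(struct model_inode *inode)
{
	model_dio_end(inode);
	return model_sync(inode);
}
```

Starting from one in-flight DIO with the waiter holding the lock,
complete_old_order() reports that the sync step cannot take the lock, while
complete_new_order() reports that it can, which is the behaviour the commit
message describes.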