diff mbox

Btrfs: Fix deadlock between direct IO and fast fsync

Message ID 1482485418-4190-1-git-send-email-chandan@linux.vnet.ibm.com (mailing list archive)
State Accepted
Headers show

Commit Message

Chandan Rajendra Dec. 23, 2016, 9:30 a.m. UTC
The following deadlock is seen when executing generic/113 test,

 ---------------------------------------------------------+----------------------------------------------------
  Direct I/O task                                           Fast fsync task
 ---------------------------------------------------------+----------------------------------------------------
  btrfs_direct_IO
    __blockdev_direct_IO
     do_blockdev_direct_IO
      do_direct_IO
       btrfs_get_blocks_direct
        while (blocks needs to written)
         get_more_blocks (first iteration)
          btrfs_get_blocks_direct
           btrfs_create_dio_extent
             down_read(&BTRFS_I(inode) >dio_sem)
             Create and add extent map and ordered extent
             up_read(&BTRFS_I(inode) >dio_sem)
                                                            btrfs_sync_file
                                                              btrfs_log_dentry_safe
                                                               btrfs_log_inode_parent
                                                                btrfs_log_inode
                                                                 btrfs_log_changed_extents
                                                                  down_write(&BTRFS_I(inode) >dio_sem)
                                                                   Collect new extent maps and ordered extents
                                                                    wait for ordered extent completion
         get_more_blocks (second iteration)
          btrfs_get_blocks_direct
           btrfs_create_dio_extent
             down_read(&BTRFS_I(inode) >dio_sem)
 --------------------------------------------------------------------------------------------------------------

In the above description, Btrfs direct I/O code path has not yet started
submitting bios for file range covered by the initial ordered
extent. Meanwhile, The fast fsync task obtains the write semaphore and
waits for I/O on the ordered extent to get completed. However, the
Direct I/O task is now blocked on obtaining the read semaphore.

To resolve the deadlock, this commit modifies the Direct I/O code path
to obtain the read semaphore before invoking
__blockdev_direct_IO(). The semaphore is then given up after
__blockdev_direct_IO() returns. This allows the Direct I/O code to
complete I/O on all the ordered extents it creates.

Signed-off-by: Chandan Rajendra <chandan@linux.vnet.ibm.com>
---
 fs/btrfs/inode.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

Comments

Chandan Rajendra Dec. 23, 2016, 10:27 a.m. UTC | #1
On Friday, December 23, 2016 03:00:18 PM Chandan Rajendra wrote:
> The following deadlock is seen when executing generic/113 test,
> 
>  ---------------------------------------------------------+----------------------------------------------------
>   Direct I/O task                                           Fast fsync task
>  ---------------------------------------------------------+----------------------------------------------------
>   btrfs_direct_IO
>     __blockdev_direct_IO
>      do_blockdev_direct_IO
>       do_direct_IO
>        btrfs_get_blocks_direct
>         while (blocks needs to written)
>          get_more_blocks (first iteration)
>           btrfs_get_blocks_direct
>            btrfs_create_dio_extent
>              down_read(&BTRFS_I(inode) >dio_sem)
>              Create and add extent map and ordered extent
>              up_read(&BTRFS_I(inode) >dio_sem)
>                                                             btrfs_sync_file
>                                                               btrfs_log_dentry_safe
>                                                                btrfs_log_inode_parent
>                                                                 btrfs_log_inode
>                                                                  btrfs_log_changed_extents
>                                                                   down_write(&BTRFS_I(inode) >dio_sem)
>                                                                    Collect new extent maps and ordered extents
>                                                                     wait for ordered extent completion
>          get_more_blocks (second iteration)
>           btrfs_get_blocks_direct
>            btrfs_create_dio_extent
>              down_read(&BTRFS_I(inode) >dio_sem)
>  --------------------------------------------------------------------------------------------------------------
> 
> In the above description, Btrfs direct I/O code path has not yet started
> submitting bios for file range covered by the initial ordered
> extent. Meanwhile, The fast fsync task obtains the write semaphore and
> waits for I/O on the ordered extent to get completed. However, the
> Direct I/O task is now blocked on obtaining the read semaphore.
> 
> To resolve the deadlock, this commit modifies the Direct I/O code path
> to obtain the read semaphore before invoking
> __blockdev_direct_IO(). The semaphore is then given up after
> __blockdev_direct_IO() returns. This allows the Direct I/O code to
> complete I/O on all the ordered extents it creates.
>

Btw, I was able to reproduce the issue on kdave/for-next branch with "Merge
branch 'for-next-next-4.9-20161125' into for-next-20161125" as the topmost
commit. The issue cannot be reproduced yet on latest code available from
kdave/for-next branch.
Chandan Rajendra Dec. 23, 2016, 11:57 a.m. UTC | #2
On Friday, December 23, 2016 03:57:40 PM Chandan Rajendra wrote:
> On Friday, December 23, 2016 03:00:18 PM Chandan Rajendra wrote:
> > The following deadlock is seen when executing generic/113 test,
> > 
> >  ---------------------------------------------------------+----------------------------------------------------
> >   Direct I/O task                                           Fast fsync task
> >  ---------------------------------------------------------+----------------------------------------------------
> >   btrfs_direct_IO
> >     __blockdev_direct_IO
> >      do_blockdev_direct_IO
> >       do_direct_IO
> >        btrfs_get_blocks_direct
> >         while (blocks needs to written)
> >          get_more_blocks (first iteration)
> >           btrfs_get_blocks_direct
> >            btrfs_create_dio_extent
> >              down_read(&BTRFS_I(inode) >dio_sem)
> >              Create and add extent map and ordered extent
> >              up_read(&BTRFS_I(inode) >dio_sem)
> >                                                             btrfs_sync_file
> >                                                               btrfs_log_dentry_safe
> >                                                                btrfs_log_inode_parent
> >                                                                 btrfs_log_inode
> >                                                                  btrfs_log_changed_extents
> >                                                                   down_write(&BTRFS_I(inode) >dio_sem)
> >                                                                    Collect new extent maps and ordered extents
> >                                                                     wait for ordered extent completion
> >          get_more_blocks (second iteration)
> >           btrfs_get_blocks_direct
> >            btrfs_create_dio_extent
> >              down_read(&BTRFS_I(inode) >dio_sem)
> >  --------------------------------------------------------------------------------------------------------------
> > 
> > In the above description, Btrfs direct I/O code path has not yet started
> > submitting bios for file range covered by the initial ordered
> > extent. Meanwhile, The fast fsync task obtains the write semaphore and
> > waits for I/O on the ordered extent to get completed. However, the
> > Direct I/O task is now blocked on obtaining the read semaphore.
> > 
> > To resolve the deadlock, this commit modifies the Direct I/O code path
> > to obtain the read semaphore before invoking
> > __blockdev_direct_IO(). The semaphore is then given up after
> > __blockdev_direct_IO() returns. This allows the Direct I/O code to
> > complete I/O on all the ordered extents it creates.
> >
> 
> Btw, I was able to reproduce the issue on kdave/for-next branch with "Merge
> branch 'for-next-next-4.9-20161125' into for-next-20161125" as the topmost
> commit. The issue cannot be reproduced yet on latest code available from
> kdave/for-next branch.
> 
>

Maybe changes in upstream might have masked the issue in the recent
kdave/for-next branch. I say that because 'git bisect' resulted in the
following commit ...

e3597e6090ddf40904dce6d0a5a404e2c490cac6
Author:     Chris Mason <clm@fb.com>
AuthorDate: Tue Nov 1 12:54:45 2016 -0700
Commit:     Chris Mason <clm@fb.com>
CommitDate: Tue Nov 1 12:54:45 2016 -0700

Parent:     570dd45 btrfs: fix races on root_log_ctx lists
Parent:     9d1032c btrfs: fix WARNING in btrfs_select_ref_head()
Merged:     btrfs-next-for-linus-4.8 kdave-master linus-v4.7-rc6 local-v4.7-rc4
Containing: direct-io-fsync-deadlock kdave-for-next
Follows:    v4.8-rc8 (57)
Precedes:   next-20161219 (30006)

Merge branch 'for-4.9-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux into for-linus-4.9

5 files changed, 29 insertions(+), 9 deletions(-)
fs/btrfs/extent-tree.c |  3 +++
fs/btrfs/extent_io.c   |  8 ++++----
fs/btrfs/inode.c       | 13 +++++++++----
fs/btrfs/ioctl.c       |  5 +++++
fs/btrfs/relocation.c  |  9 ++++++++-
Liu Bo Dec. 24, 2016, 12:18 a.m. UTC | #3
On Fri, Dec 23, 2016 at 05:27:55PM +0530, Chandan Rajendra wrote:
> On Friday, December 23, 2016 03:57:40 PM Chandan Rajendra wrote:
> > On Friday, December 23, 2016 03:00:18 PM Chandan Rajendra wrote:
> > > The following deadlock is seen when executing generic/113 test,
> > > 
> > >  ---------------------------------------------------------+----------------------------------------------------
> > >   Direct I/O task                                           Fast fsync task
> > >  ---------------------------------------------------------+----------------------------------------------------
> > >   btrfs_direct_IO
> > >     __blockdev_direct_IO
> > >      do_blockdev_direct_IO
> > >       do_direct_IO
> > >        btrfs_get_blocks_direct
> > >         while (blocks needs to written)
> > >          get_more_blocks (first iteration)
> > >           btrfs_get_blocks_direct
> > >            btrfs_create_dio_extent
> > >              down_read(&BTRFS_I(inode) >dio_sem)
> > >              Create and add extent map and ordered extent
> > >              up_read(&BTRFS_I(inode) >dio_sem)
> > >                                                             btrfs_sync_file
> > >                                                               btrfs_log_dentry_safe
> > >                                                                btrfs_log_inode_parent
> > >                                                                 btrfs_log_inode
> > >                                                                  btrfs_log_changed_extents
> > >                                                                   down_write(&BTRFS_I(inode) >dio_sem)
> > >                                                                    Collect new extent maps and ordered extents
> > >                                                                     wait for ordered extent completion
> > >          get_more_blocks (second iteration)
> > >           btrfs_get_blocks_direct
> > >            btrfs_create_dio_extent
> > >              down_read(&BTRFS_I(inode) >dio_sem)
> > >  --------------------------------------------------------------------------------------------------------------
> > > 
> > > In the above description, Btrfs direct I/O code path has not yet started
> > > submitting bios for file range covered by the initial ordered
> > > extent. Meanwhile, The fast fsync task obtains the write semaphore and
> > > waits for I/O on the ordered extent to get completed. However, the
> > > Direct I/O task is now blocked on obtaining the read semaphore.
> > > 
> > > To resolve the deadlock, this commit modifies the Direct I/O code path
> > > to obtain the read semaphore before invoking
> > > __blockdev_direct_IO(). The semaphore is then given up after
> > > __blockdev_direct_IO() returns. This allows the Direct I/O code to
> > > complete I/O on all the ordered extents it creates.
> > >
> > 
> > Btw, I was able to reproduce the issue on kdave/for-next branch with "Merge
> > branch 'for-next-next-4.9-20161125' into for-next-20161125" as the topmost
> > commit. The issue cannot be reproduced yet on latest code available from
> > kdave/for-next branch.
> > 
> >
> 
> Maybe changes in upstream might have masked the issue in the recent
> kdave/for-next branch. I say that because 'git bisect' resulted in the
> following commit ...

I guess that the for-next branch didn't revert this patch[1] as upstream
did, so that generic/113 would complain, however, even w/o that patch,
this fix is still required since the deadlock could be reproduced by
running generic/113 with '-ofragment=data' and in fact Filipe has
proposed a almost same fix but not a real patch in this thread [2].

[1]: Btrfs: adjust len of writes if following a preallocated extent
https://patchwork.kernel.org/patch/9413129/
[2]: https://patchwork.kernel.org/patch/9445231/

Thanks,

-liubo

> 
> e3597e6090ddf40904dce6d0a5a404e2c490cac6
> Author:     Chris Mason <clm@fb.com>
> AuthorDate: Tue Nov 1 12:54:45 2016 -0700
> Commit:     Chris Mason <clm@fb.com>
> CommitDate: Tue Nov 1 12:54:45 2016 -0700
> 
> Parent:     570dd45 btrfs: fix races on root_log_ctx lists
> Parent:     9d1032c btrfs: fix WARNING in btrfs_select_ref_head()
> Merged:     btrfs-next-for-linus-4.8 kdave-master linus-v4.7-rc6 local-v4.7-rc4
> Containing: direct-io-fsync-deadlock kdave-for-next
> Follows:    v4.8-rc8 (57)
> Precedes:   next-20161219 (30006)
> 
> Merge branch 'for-4.9-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux into for-linus-4.9
> 
> 5 files changed, 29 insertions(+), 9 deletions(-)
> fs/btrfs/extent-tree.c |  3 +++
> fs/btrfs/extent_io.c   |  8 ++++----
> fs/btrfs/inode.c       | 13 +++++++++----
> fs/btrfs/ioctl.c       |  5 +++++
> fs/btrfs/relocation.c  |  9 ++++++++-
> 
> -- 
> chandan
> 
> --
> To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Chandan Rajendra Dec. 24, 2016, 3:27 a.m. UTC | #4
On Friday, December 23, 2016 04:18:00 PM Liu Bo wrote:
> On Fri, Dec 23, 2016 at 05:27:55PM +0530, Chandan Rajendra wrote:
> > On Friday, December 23, 2016 03:57:40 PM Chandan Rajendra wrote:
> > > On Friday, December 23, 2016 03:00:18 PM Chandan Rajendra wrote:
> > > > The following deadlock is seen when executing generic/113 test,
> > > > 
> > > >  ---------------------------------------------------------+----------------------------------------------------
> > > >   Direct I/O task                                           Fast fsync task
> > > >  ---------------------------------------------------------+----------------------------------------------------
> > > >   btrfs_direct_IO
> > > >     __blockdev_direct_IO
> > > >      do_blockdev_direct_IO
> > > >       do_direct_IO
> > > >        btrfs_get_blocks_direct
> > > >         while (blocks needs to written)
> > > >          get_more_blocks (first iteration)
> > > >           btrfs_get_blocks_direct
> > > >            btrfs_create_dio_extent
> > > >              down_read(&BTRFS_I(inode) >dio_sem)
> > > >              Create and add extent map and ordered extent
> > > >              up_read(&BTRFS_I(inode) >dio_sem)
> > > >                                                             btrfs_sync_file
> > > >                                                               btrfs_log_dentry_safe
> > > >                                                                btrfs_log_inode_parent
> > > >                                                                 btrfs_log_inode
> > > >                                                                  btrfs_log_changed_extents
> > > >                                                                   down_write(&BTRFS_I(inode) >dio_sem)
> > > >                                                                    Collect new extent maps and ordered extents
> > > >                                                                     wait for ordered extent completion
> > > >          get_more_blocks (second iteration)
> > > >           btrfs_get_blocks_direct
> > > >            btrfs_create_dio_extent
> > > >              down_read(&BTRFS_I(inode) >dio_sem)
> > > >  --------------------------------------------------------------------------------------------------------------
> > > > 
> > > > In the above description, Btrfs direct I/O code path has not yet started
> > > > submitting bios for file range covered by the initial ordered
> > > > extent. Meanwhile, The fast fsync task obtains the write semaphore and
> > > > waits for I/O on the ordered extent to get completed. However, the
> > > > Direct I/O task is now blocked on obtaining the read semaphore.
> > > > 
> > > > To resolve the deadlock, this commit modifies the Direct I/O code path
> > > > to obtain the read semaphore before invoking
> > > > __blockdev_direct_IO(). The semaphore is then given up after
> > > > __blockdev_direct_IO() returns. This allows the Direct I/O code to
> > > > complete I/O on all the ordered extents it creates.
> > > >
> > > 
> > > Btw, I was able to reproduce the issue on kdave/for-next branch with "Merge
> > > branch 'for-next-next-4.9-20161125' into for-next-20161125" as the topmost
> > > commit. The issue cannot be reproduced yet on latest code available from
> > > kdave/for-next branch.
> > > 
> > >
> > 
> > Maybe changes in upstream might have masked the issue in the recent
> > kdave/for-next branch. I say that because 'git bisect' resulted in the
> > following commit ...
> 
> I guess that the for-next branch didn't revert this patch[1] as upstream
> did, so that generic/113 would complain, however, even w/o that patch,
> this fix is still required since the deadlock could be reproduced by
> running generic/113 with '-ofragment=data' and in fact Filipe has
> proposed a almost same fix but not a real patch in this thread [2].
> 
> [1]: Btrfs: adjust len of writes if following a preallocated extent
> https://patchwork.kernel.org/patch/9413129/
> [2]: https://patchwork.kernel.org/patch/9445231/
>

Ah ok. Thanks for pointing it out.
David Sterba Jan. 2, 2017, 2:29 p.m. UTC | #5
On Fri, Dec 23, 2016 at 04:18:00PM -0800, Liu Bo wrote:
> > > Btw, I was able to reproduce the issue on kdave/for-next branch with "Merge
> > > branch 'for-next-next-4.9-20161125' into for-next-20161125" as the topmost
> > > commit. The issue cannot be reproduced yet on latest code available from
> > > kdave/for-next branch.
> > 
> > Maybe changes in upstream might have masked the issue in the recent
> > kdave/for-next branch. I say that because 'git bisect' resulted in the
> > following commit ...
> 
> I guess that the for-next branch didn't revert this patch[1] as upstream
> did, [...]

> [1]: Btrfs: adjust len of writes if following a preallocated extent
> https://patchwork.kernel.org/patch/9413129/

The revert was present in the for-next-20161219 branch, the for-next
branches sit on top of recent upstream rc tag and are merged from
various dev branches so the patch difference should be minimal in the
end.
--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Filipe Manana Jan. 6, 2017, 11:45 a.m. UTC | #6
On Fri, Dec 23, 2016 at 9:30 AM, Chandan Rajendra
<chandan@linux.vnet.ibm.com> wrote:
> The following deadlock is seen when executing generic/113 test,
>
>  ---------------------------------------------------------+----------------------------------------------------
>   Direct I/O task                                           Fast fsync task
>  ---------------------------------------------------------+----------------------------------------------------
>   btrfs_direct_IO
>     __blockdev_direct_IO
>      do_blockdev_direct_IO
>       do_direct_IO
>        btrfs_get_blocks_direct
>         while (blocks needs to written)
>          get_more_blocks (first iteration)
>           btrfs_get_blocks_direct
>            btrfs_create_dio_extent
>              down_read(&BTRFS_I(inode) >dio_sem)
>              Create and add extent map and ordered extent
>              up_read(&BTRFS_I(inode) >dio_sem)
>                                                             btrfs_sync_file
>                                                               btrfs_log_dentry_safe
>                                                                btrfs_log_inode_parent
>                                                                 btrfs_log_inode
>                                                                  btrfs_log_changed_extents
>                                                                   down_write(&BTRFS_I(inode) >dio_sem)
>                                                                    Collect new extent maps and ordered extents
>                                                                     wait for ordered extent completion
>          get_more_blocks (second iteration)
>           btrfs_get_blocks_direct
>            btrfs_create_dio_extent
>              down_read(&BTRFS_I(inode) >dio_sem)
>  --------------------------------------------------------------------------------------------------------------
>
> In the above description, Btrfs direct I/O code path has not yet started
> submitting bios for file range covered by the initial ordered
> extent. Meanwhile, The fast fsync task obtains the write semaphore and
> waits for I/O on the ordered extent to get completed. However, the
> Direct I/O task is now blocked on obtaining the read semaphore.
>
> To resolve the deadlock, this commit modifies the Direct I/O code path
> to obtain the read semaphore before invoking
> __blockdev_direct_IO(). The semaphore is then given up after
> __blockdev_direct_IO() returns. This allows the Direct I/O code to
> complete I/O on all the ordered extents it creates.
>
> Signed-off-by: Chandan Rajendra <chandan@linux.vnet.ibm.com>

As pointed out by Liu Bo, this was known, discussed and with a
proposed fix in another older thread:
https://www.marc.info/?l=linux-btrfs&m=147998603528213&w=2

I was on vacations and waiting for Josef's feedback as well (cc'ed but
never replied) before making a change log and formalizing the proposed
patch in that thread.

From my point of view, it's ok and you can have:

Reviewed-by: Filipe Manana <fdmanana@suse.com>

thanks

> ---
>  fs/btrfs/inode.c | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c
> index 5ca88f0..f796037 100644
> --- a/fs/btrfs/inode.c
> +++ b/fs/btrfs/inode.c
> @@ -7325,7 +7325,6 @@ static struct extent_map *btrfs_create_dio_extent(struct inode *inode,
>         struct extent_map *em = NULL;
>         int ret;
>
> -       down_read(&BTRFS_I(inode)->dio_sem);
>         if (type != BTRFS_ORDERED_NOCOW) {
>                 em = create_pinned_em(inode, start, len, orig_start,
>                                       block_start, block_len, orig_block_len,
> @@ -7344,7 +7343,6 @@ static struct extent_map *btrfs_create_dio_extent(struct inode *inode,
>                 em = ERR_PTR(ret);
>         }
>   out:
> -       up_read(&BTRFS_I(inode)->dio_sem);
>
>         return em;
>  }
> @@ -8800,6 +8798,7 @@ static ssize_t btrfs_direct_IO(struct kiocb *iocb, struct iov_iter *iter)
>                 dio_data.unsubmitted_oe_range_start = (u64)offset;
>                 dio_data.unsubmitted_oe_range_end = (u64)offset;
>                 current->journal_info = &dio_data;
> +               down_read(&BTRFS_I(inode)->dio_sem);
>         } else if (test_bit(BTRFS_INODE_READDIO_NEED_LOCK,
>                                      &BTRFS_I(inode)->runtime_flags)) {
>                 inode_dio_end(inode);
> @@ -8812,6 +8811,7 @@ static ssize_t btrfs_direct_IO(struct kiocb *iocb, struct iov_iter *iter)
>                                    iter, btrfs_get_blocks_direct, NULL,
>                                    btrfs_submit_direct, flags);
>         if (iov_iter_rw(iter) == WRITE) {
> +               up_read(&BTRFS_I(inode)->dio_sem);
>                 current->journal_info = NULL;
>                 if (ret < 0 && ret != -EIOCBQUEUED) {
>                         if (dio_data.reserve)
> --
> 2.5.5
>
> --
> To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
David Sterba Jan. 6, 2017, 12:18 p.m. UTC | #7
On Fri, Jan 06, 2017 at 11:45:46AM +0000, Filipe Manana wrote:
> >From my point of view, it's ok and you can have:
> 
> Reviewed-by: Filipe Manana <fdmanana@suse.com>

Added to next and queued for 4.10, thanks.
--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
diff mbox

Patch

diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c
index 5ca88f0..f796037 100644
--- a/fs/btrfs/inode.c
+++ b/fs/btrfs/inode.c
@@ -7325,7 +7325,6 @@  static struct extent_map *btrfs_create_dio_extent(struct inode *inode,
 	struct extent_map *em = NULL;
 	int ret;
 
-	down_read(&BTRFS_I(inode)->dio_sem);
 	if (type != BTRFS_ORDERED_NOCOW) {
 		em = create_pinned_em(inode, start, len, orig_start,
 				      block_start, block_len, orig_block_len,
@@ -7344,7 +7343,6 @@  static struct extent_map *btrfs_create_dio_extent(struct inode *inode,
 		em = ERR_PTR(ret);
 	}
  out:
-	up_read(&BTRFS_I(inode)->dio_sem);
 
 	return em;
 }
@@ -8800,6 +8798,7 @@  static ssize_t btrfs_direct_IO(struct kiocb *iocb, struct iov_iter *iter)
 		dio_data.unsubmitted_oe_range_start = (u64)offset;
 		dio_data.unsubmitted_oe_range_end = (u64)offset;
 		current->journal_info = &dio_data;
+		down_read(&BTRFS_I(inode)->dio_sem);
 	} else if (test_bit(BTRFS_INODE_READDIO_NEED_LOCK,
 				     &BTRFS_I(inode)->runtime_flags)) {
 		inode_dio_end(inode);
@@ -8812,6 +8811,7 @@  static ssize_t btrfs_direct_IO(struct kiocb *iocb, struct iov_iter *iter)
 				   iter, btrfs_get_blocks_direct, NULL,
 				   btrfs_submit_direct, flags);
 	if (iov_iter_rw(iter) == WRITE) {
+		up_read(&BTRFS_I(inode)->dio_sem);
 		current->journal_info = NULL;
 		if (ret < 0 && ret != -EIOCBQUEUED) {
 			if (dio_data.reserve)