diff mbox series

[v2] btrfs: fix data race when accessing the inode's disk_i_size at btrfs_drop_extents()

Message ID 20241202133515.92286-1-zhenghaoran154@gmail.com (mailing list archive)
State New
Headers show
Series [v2] btrfs: fix data race when accessing the inode's disk_i_size at btrfs_drop_extents() | expand

Commit Message

Hao-ran Zheng Dec. 2, 2024, 1:35 p.m. UTC
A data race occurs when the function `insert_ordered_extent_file_extent()`
and the function `btrfs_inode_safe_disk_i_size_write()` are executed
concurrently. The function `insert_ordered_extent_file_extent()` is not
locked when reading inode->disk_i_size, causing
`btrfs_inode_safe_disk_i_size_write()` to cause data competition when
writing inode->disk_i_size, thus affecting the value of `modify_tree`.

Since the impact of data race is limited, it is recommended to use the
`data_race` mark judgment.

The specific call stack that appears during testing is as follows:
============DATA_RACE============
 btrfs_drop_extents+0x89a/0xa060 [btrfs]
 insert_reserved_file_extent+0xb54/0x2960 [btrfs]
 insert_ordered_extent_file_extent+0xff5/0x1760 [btrfs]
 btrfs_finish_one_ordered+0x1b85/0x36a0 [btrfs]
 btrfs_finish_ordered_io+0x37/0x60 [btrfs]
 finish_ordered_fn+0x3e/0x50 [btrfs]
 btrfs_work_helper+0x9c9/0x27a0 [btrfs]
 process_scheduled_works+0x716/0xf10
 worker_thread+0xb6a/0x1190
 kthread+0x292/0x330
 ret_from_fork+0x4d/0x80
 ret_from_fork_asm+0x1a/0x30
============OTHER_INFO============
 btrfs_inode_safe_disk_i_size_write+0x4ec/0x600 [btrfs]
 btrfs_finish_one_ordered+0x24c7/0x36a0 [btrfs]
 btrfs_finish_ordered_io+0x37/0x60 [btrfs]
 finish_ordered_fn+0x3e/0x50 [btrfs]
 btrfs_work_helper+0x9c9/0x27a0 [btrfs]
 process_scheduled_works+0x716/0xf10
 worker_thread+0xb6a/0x1190
 kthread+0x292/0x330
 ret_from_fork+0x4d/0x80
 ret_from_fork_asm+0x1a/0x30
=================================

Signed-off-by: Hao-ran Zheng <zhenghaoran154@gmail.com>
---
V1->V2: Modify patch description and format
---
 fs/btrfs/file.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

Comments

Filipe Manana Dec. 2, 2024, 2:40 p.m. UTC | #1
On Mon, Dec 2, 2024 at 1:35 PM Hao-ran Zheng <zhenghaoran154@gmail.com> wrote:
>
> A data race occurs when the function `insert_ordered_extent_file_extent()`
> and the function `btrfs_inode_safe_disk_i_size_write()` are executed
> concurrently. The function `insert_ordered_extent_file_extent()` is not
> locked when reading inode->disk_i_size, causing
> `btrfs_inode_safe_disk_i_size_write()` to cause data competition when
> writing inode->disk_i_size, thus affecting the value of `modify_tree`.
>
> Since the impact of data race is limited, it is recommended to use the
> `data_race` mark judgment.

This should explain why it's a harmless race.
Also it's better to explicitly say harmless race rather than say it
has limited impact, because the latter gives the idea of rare problems
when we don't have any.

>
> The specific call stack that appears during testing is as follows:
> ============DATA_RACE============
>  btrfs_drop_extents+0x89a/0xa060 [btrfs]
>  insert_reserved_file_extent+0xb54/0x2960 [btrfs]
>  insert_ordered_extent_file_extent+0xff5/0x1760 [btrfs]
>  btrfs_finish_one_ordered+0x1b85/0x36a0 [btrfs]
>  btrfs_finish_ordered_io+0x37/0x60 [btrfs]
>  finish_ordered_fn+0x3e/0x50 [btrfs]
>  btrfs_work_helper+0x9c9/0x27a0 [btrfs]
>  process_scheduled_works+0x716/0xf10
>  worker_thread+0xb6a/0x1190
>  kthread+0x292/0x330
>  ret_from_fork+0x4d/0x80
>  ret_from_fork_asm+0x1a/0x30
> ============OTHER_INFO============
>  btrfs_inode_safe_disk_i_size_write+0x4ec/0x600 [btrfs]
>  btrfs_finish_one_ordered+0x24c7/0x36a0 [btrfs]
>  btrfs_finish_ordered_io+0x37/0x60 [btrfs]
>  finish_ordered_fn+0x3e/0x50 [btrfs]
>  btrfs_work_helper+0x9c9/0x27a0 [btrfs]
>  process_scheduled_works+0x716/0xf10
>  worker_thread+0xb6a/0x1190
>  kthread+0x292/0x330
>  ret_from_fork+0x4d/0x80
>  ret_from_fork_asm+0x1a/0x30
> =================================
>
> Signed-off-by: Hao-ran Zheng <zhenghaoran154@gmail.com>
> ---
> V1->V2: Modify patch description and format
> ---
>  fs/btrfs/file.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/fs/btrfs/file.c b/fs/btrfs/file.c
> index 4fb521d91b06..f9631713f67d 100644
> --- a/fs/btrfs/file.c
> +++ b/fs/btrfs/file.c
> @@ -242,7 +242,7 @@ int btrfs_drop_extents(struct btrfs_trans_handle *trans,
>         if (args->drop_cache)
>                 btrfs_drop_extent_map_range(inode, args->start, args->end - 1, false);
>
> -       if (args->start >= inode->disk_i_size && !args->replace_extent)
> +       if (data_race(args->start >= inode->disk_i_size && !args->replace_extent))

Don't surround the whole condition with data_race().
Just put it around the disk_i_size check:

if (data_race(args->start >= inode->disk_i_size) && !args->replace_extent))

This makes it more clear it's about disk_i_size and nothing else.

Thanks.

>                 modify_tree = 0;
>
>         update_refs = (btrfs_root_id(root) != BTRFS_TREE_LOG_OBJECTID);
> --
> 2.34.1
>
>
Hao-ran Zheng Dec. 3, 2024, 7:08 a.m. UTC | #2
Thanks for the advice, I will explain in detail why it's a harmless race.
And submit patch v3.

On Mon, Dec 2, 2024 at 10:40 PM Filipe Manana <fdmanana@kernel.org> wrote:
>
> On Mon, Dec 2, 2024 at 1:35 PM Hao-ran Zheng <zhenghaoran154@gmail.com> wrote:
> >
> > A data race occurs when the function `insert_ordered_extent_file_extent()`
> > and the function `btrfs_inode_safe_disk_i_size_write()` are executed
> > concurrently. The function `insert_ordered_extent_file_extent()` is not
> > locked when reading inode->disk_i_size, causing
> > `btrfs_inode_safe_disk_i_size_write()` to cause data competition when
> > writing inode->disk_i_size, thus affecting the value of `modify_tree`.
> >
> > Since the impact of data race is limited, it is recommended to use the
> > `data_race` mark judgment.
>
> This should explain why it's a harmless race.
> Also it's better to explicitly say harmless race rather than say it
> has limited impact, because the latter gives the idea of rare problems
> when we don't have any.
>
> >
> > The specific call stack that appears during testing is as follows:
> > ============DATA_RACE============
> >  btrfs_drop_extents+0x89a/0xa060 [btrfs]
> >  insert_reserved_file_extent+0xb54/0x2960 [btrfs]
> >  insert_ordered_extent_file_extent+0xff5/0x1760 [btrfs]
> >  btrfs_finish_one_ordered+0x1b85/0x36a0 [btrfs]
> >  btrfs_finish_ordered_io+0x37/0x60 [btrfs]
> >  finish_ordered_fn+0x3e/0x50 [btrfs]
> >  btrfs_work_helper+0x9c9/0x27a0 [btrfs]
> >  process_scheduled_works+0x716/0xf10
> >  worker_thread+0xb6a/0x1190
> >  kthread+0x292/0x330
> >  ret_from_fork+0x4d/0x80
> >  ret_from_fork_asm+0x1a/0x30
> > ============OTHER_INFO============
> >  btrfs_inode_safe_disk_i_size_write+0x4ec/0x600 [btrfs]
> >  btrfs_finish_one_ordered+0x24c7/0x36a0 [btrfs]
> >  btrfs_finish_ordered_io+0x37/0x60 [btrfs]
> >  finish_ordered_fn+0x3e/0x50 [btrfs]
> >  btrfs_work_helper+0x9c9/0x27a0 [btrfs]
> >  process_scheduled_works+0x716/0xf10
> >  worker_thread+0xb6a/0x1190
> >  kthread+0x292/0x330
> >  ret_from_fork+0x4d/0x80
> >  ret_from_fork_asm+0x1a/0x30
> > =================================
> >
> > Signed-off-by: Hao-ran Zheng <zhenghaoran154@gmail.com>
> > ---
> > V1->V2: Modify patch description and format
> > ---
> >  fs/btrfs/file.c | 2 +-
> >  1 file changed, 1 insertion(+), 1 deletion(-)
> >
> > diff --git a/fs/btrfs/file.c b/fs/btrfs/file.c
> > index 4fb521d91b06..f9631713f67d 100644
> > --- a/fs/btrfs/file.c
> > +++ b/fs/btrfs/file.c
> > @@ -242,7 +242,7 @@ int btrfs_drop_extents(struct btrfs_trans_handle *trans,
> >         if (args->drop_cache)
> >                 btrfs_drop_extent_map_range(inode, args->start, args->end - 1, false);
> >
> > -       if (args->start >= inode->disk_i_size && !args->replace_extent)
> > +       if (data_race(args->start >= inode->disk_i_size && !args->replace_extent))
>
> Don't surround the whole condition with data_race().
> Just put it around the disk_i_size check:
>
> if (data_race(args->start >= inode->disk_i_size) && !args->replace_extent))
>
> This makes it more clear it's about disk_i_size and nothing else.
>
> Thanks.
>
> >                 modify_tree = 0;
> >
> >         update_refs = (btrfs_root_id(root) != BTRFS_TREE_LOG_OBJECTID);
> > --
> > 2.34.1
> >
> >
diff mbox series

Patch

diff --git a/fs/btrfs/file.c b/fs/btrfs/file.c
index 4fb521d91b06..f9631713f67d 100644
--- a/fs/btrfs/file.c
+++ b/fs/btrfs/file.c
@@ -242,7 +242,7 @@  int btrfs_drop_extents(struct btrfs_trans_handle *trans,
 	if (args->drop_cache)
 		btrfs_drop_extent_map_range(inode, args->start, args->end - 1, false);
 
-	if (args->start >= inode->disk_i_size && !args->replace_extent)
+	if (data_race(args->start >= inode->disk_i_size && !args->replace_extent))
 		modify_tree = 0;
 
 	update_refs = (btrfs_root_id(root) != BTRFS_TREE_LOG_OBJECTID);