diff mbox series

[v2] btrfs: properly wait for writeback before buffered write

Message ID 3fa83d6d472d78beb5fd519d0290b73d02d53d15.1733176045.git.wqu@suse.com (mailing list archive)
State New
Headers show
Series [v2] btrfs: properly wait for writeback before buffered write | expand

Commit Message

Qu Wenruo Dec. 2, 2024, 9:47 p.m. UTC
[BUG]
Before commit e820dbeb6ad1 ("btrfs: convert btrfs_buffered_write() to
use folios"), function prepare_one_folio() will always wait for folio
writeback to finish before returning the folio.

However commit e820dbeb6ad1 ("btrfs: convert btrfs_buffered_write() to
use folios") changed to use FGP_STABLE to do the writeback wait, but
FGP_STABLE is calling folio_wait_stable(), which only calls
folio_wait_writeback() if the address space has AS_STABLE_WRITES, which
is not set for btrfs inodes.

This means we will not wait for folio writeback at all.

[CAUSE]
The cause is FGP_STABLE is not wait for writeback unconditionally, but
only for address spaces with AS_STABLE_WRITES, normally such flag is set
when the super block has SB_I_STABLE_WRITES flag.

Such super block flag is set when the block device has hardware digest
support or has internal checksum requirement.

I'd argue btrfs should set such super block due to its default data
checksum behavior, but it is not set yet, so this means FGP_STABLE flag
will have no effect at all.

(For NODATACSUM inodes, we can skip the wait in theory but that should
be an optimization in the future)

This can lead to data checksum mismatch, as we can modify the folio
meanwhile it's still under writeback, this will make the contents
differ from the contents at submission and checksum calculation.

[FIX]
Instead of fully rely on FGP_STABLE, manually do the folio writeback
wait, until we set the address space or super flag.

Fixes: e820dbeb6ad1 ("btrfs: convert btrfs_buffered_write() to use folios")
Signed-off-by: Qu Wenruo <wqu@suse.com>
---
v2:
- Update the commit message and fixes by tag
  I was too focused on the syzbot report, not noticing it's a different
  commit causing the regression.
  Now removed the syzbot report part.
---
 fs/btrfs/file.c | 1 +
 1 file changed, 1 insertion(+)

Comments

Filipe Manana Dec. 3, 2024, 10:56 a.m. UTC | #1
On Mon, Dec 2, 2024 at 9:48 PM Qu Wenruo <wqu@suse.com> wrote:
>
> [BUG]
> Before commit e820dbeb6ad1 ("btrfs: convert btrfs_buffered_write() to
> use folios"), function prepare_one_folio() will always wait for folio
> writeback to finish before returning the folio.
>
> However commit e820dbeb6ad1 ("btrfs: convert btrfs_buffered_write() to
> use folios") changed to use FGP_STABLE to do the writeback wait, but
> FGP_STABLE is calling folio_wait_stable(), which only calls
> folio_wait_writeback() if the address space has AS_STABLE_WRITES, which
> is not set for btrfs inodes.
>
> This means we will not wait for folio writeback at all.
>
> [CAUSE]
> The cause is FGP_STABLE is not wait for writeback unconditionally, but
> only for address spaces with AS_STABLE_WRITES, normally such flag is set
> when the super block has SB_I_STABLE_WRITES flag.
>
> Such super block flag is set when the block device has hardware digest
> support or has internal checksum requirement.
>
> I'd argue btrfs should set such super block due to its default data
> checksum behavior, but it is not set yet, so this means FGP_STABLE flag
> will have no effect at all.
>
> (For NODATACSUM inodes, we can skip the wait in theory but that should
> be an optimization in the future)
>
> This can lead to data checksum mismatch, as we can modify the folio
> meanwhile it's still under writeback, this will make the contents
> differ from the contents at submission and checksum calculation.
>
> [FIX]
> Instead of fully rely on FGP_STABLE, manually do the folio writeback
> wait, until we set the address space or super flag.
>
> Fixes: e820dbeb6ad1 ("btrfs: convert btrfs_buffered_write() to use folios")
> Signed-off-by: Qu Wenruo <wqu@suse.com>

Reviewed-by: Filipe Manana <fdmanana@suse.com>

Looks good, thanks for the update of the changelog.

> ---
> v2:
> - Update the commit message and fixes by tag
>   I was too focused on the syzbot report, not noticing it's a different
>   commit causing the regression.
>   Now removed the syzbot report part.
> ---
>  fs/btrfs/file.c | 1 +
>  1 file changed, 1 insertion(+)
>
> diff --git a/fs/btrfs/file.c b/fs/btrfs/file.c
> index fbb753300071..8734f5fc681f 100644
> --- a/fs/btrfs/file.c
> +++ b/fs/btrfs/file.c
> @@ -911,6 +911,7 @@ static noinline int prepare_one_folio(struct inode *inode, struct folio **folio_
>                         ret = PTR_ERR(folio);
>                 return ret;
>         }
> +       folio_wait_writeback(folio);
>         /* Only support page sized folio yet. */
>         ASSERT(folio_order(folio) == 0);
>         ret = set_folio_extent_mapped(folio);
> --
> 2.47.1
>
>
diff mbox series

Patch

diff --git a/fs/btrfs/file.c b/fs/btrfs/file.c
index fbb753300071..8734f5fc681f 100644
--- a/fs/btrfs/file.c
+++ b/fs/btrfs/file.c
@@ -911,6 +911,7 @@  static noinline int prepare_one_folio(struct inode *inode, struct folio **folio_
 			ret = PTR_ERR(folio);
 		return ret;
 	}
+	folio_wait_writeback(folio);
 	/* Only support page sized folio yet. */
 	ASSERT(folio_order(folio) == 0);
 	ret = set_folio_extent_mapped(folio);