diff mbox

[1/3] btrfs: Continue write in case of can_not_nocow

Message ID 9b89157957abcc87a6ea0530989a4984794484cd.1452078012.git.zhaolei@cn.fujitsu.com (mailing list archive)
State Accepted
Headers show

Commit Message

Zhaolei Jan. 6, 2016, 11 a.m. UTC
btrfs failed in xfstests btrfs/080 with -o nodatacow.

Can be reproduced by following script:
  DEV=/dev/vdg
  MNT=/mnt/tmp

  umount $DEV &>/dev/null
  mkfs.btrfs -f $DEV
  mount -o nodatacow $DEV $MNT

  dd if=/dev/zero of=$MNT/test bs=1 count=2048 &
  btrfs subvolume snapshot -r $MNT $MNT/test_snap &
  wait
  --
  We can see dd failed on NO_SPACE.

Reason:
  __btrfs_buffered_write should run cow write when no_cow impossible,
  and current code is designed with above logic.
  But check_can_nocow() have 2 type of return value(0 and <0) on
  can_not_no_cow, and current code only continue write on first case,
  the second case happened in doing subvolume.

Fix:
  Continue write when check_can_nocow() return 0 and <0.

Signed-off-by: Zhao Lei <zhaolei@cn.fujitsu.com>
---
 fs/btrfs/file.c | 37 +++++++++++++++++--------------------
 1 file changed, 17 insertions(+), 20 deletions(-)

Comments

Filipe Manana Jan. 20, 2016, 8:48 p.m. UTC | #1
On Wed, Jan 6, 2016 at 11:00 AM, Zhao Lei <zhaolei@cn.fujitsu.com> wrote:
> btrfs failed in xfstests btrfs/080 with -o nodatacow.
>
> Can be reproduced by following script:
>   DEV=/dev/vdg
>   MNT=/mnt/tmp
>
>   umount $DEV &>/dev/null
>   mkfs.btrfs -f $DEV
>   mount -o nodatacow $DEV $MNT
>
>   dd if=/dev/zero of=$MNT/test bs=1 count=2048 &
>   btrfs subvolume snapshot -r $MNT $MNT/test_snap &
>   wait
>   --
>   We can see dd failed on NO_SPACE.
>
> Reason:
>   __btrfs_buffered_write should run cow write when no_cow impossible,
>   and current code is designed with above logic.
>   But check_can_nocow() have 2 type of return value(0 and <0) on
>   can_not_no_cow, and current code only continue write on first case,
>   the second case happened in doing subvolume.
>
> Fix:
>   Continue write when check_can_nocow() return 0 and <0.
>
> Signed-off-by: Zhao Lei <zhaolei@cn.fujitsu.com>
Reviewed-by: Filipe Manana <fdmanana@suse.com>

Could you please submit a test case for xfstests as well?
Thanks.

> ---
>  fs/btrfs/file.c | 37 +++++++++++++++++--------------------
>  1 file changed, 17 insertions(+), 20 deletions(-)
>
> diff --git a/fs/btrfs/file.c b/fs/btrfs/file.c
> index 977e715..11fd981 100644
> --- a/fs/btrfs/file.c
> +++ b/fs/btrfs/file.c
> @@ -1516,27 +1516,24 @@ static noinline ssize_t __btrfs_buffered_write(struct file *file,
>
>                 reserve_bytes = num_pages << PAGE_CACHE_SHIFT;
>
> -               if (BTRFS_I(inode)->flags & (BTRFS_INODE_NODATACOW |
> -                                            BTRFS_INODE_PREALLOC)) {
> -                       ret = check_can_nocow(inode, pos, &write_bytes);
> -                       if (ret < 0)
> -                               break;
> -                       if (ret > 0) {
> -                               /*
> -                                * For nodata cow case, no need to reserve
> -                                * data space.
> -                                */
> -                               only_release_metadata = true;
> -                               /*
> -                                * our prealloc extent may be smaller than
> -                                * write_bytes, so scale down.
> -                                */
> -                               num_pages = DIV_ROUND_UP(write_bytes + offset,
> -                                                        PAGE_CACHE_SIZE);
> -                               reserve_bytes = num_pages << PAGE_CACHE_SHIFT;
> -                               goto reserve_metadata;
> -                       }
> +               if ((BTRFS_I(inode)->flags & (BTRFS_INODE_NODATACOW |
> +                                             BTRFS_INODE_PREALLOC)) &&
> +                   check_can_nocow(inode, pos, &write_bytes) > 0) {
> +                       /*
> +                        * For nodata cow case, no need to reserve
> +                        * data space.
> +                        */
> +                       only_release_metadata = true;
> +                       /*
> +                        * our prealloc extent may be smaller than
> +                        * write_bytes, so scale down.
> +                        */
> +                       num_pages = DIV_ROUND_UP(write_bytes + offset,
> +                                                PAGE_CACHE_SIZE);
> +                       reserve_bytes = num_pages << PAGE_CACHE_SHIFT;
> +                       goto reserve_metadata;
>                 }
> +
>                 ret = btrfs_check_data_free_space(inode, pos, write_bytes);
>                 if (ret < 0)
>                         break;
> --
> 1.8.5.1
>
>
>
> --
> To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
Zhaolei Jan. 21, 2016, 3:39 a.m. UTC | #2
Hi, Filipe Manana

> -----Original Message-----
> From: Filipe Manana [mailto:fdmanana@gmail.com]
> Sent: Thursday, January 21, 2016 4:49 AM
> To: Zhao Lei <zhaolei@cn.fujitsu.com>
> Cc: linux-btrfs@vger.kernel.org
> Subject: Re: [PATCH 1/3] btrfs: Continue write in case of can_not_nocow
> 
> On Wed, Jan 6, 2016 at 11:00 AM, Zhao Lei <zhaolei@cn.fujitsu.com> wrote:
> > btrfs failed in xfstests btrfs/080 with -o nodatacow.
> >
> > Can be reproduced by following script:
> >   DEV=/dev/vdg
> >   MNT=/mnt/tmp
> >
> >   umount $DEV &>/dev/null
> >   mkfs.btrfs -f $DEV
> >   mount -o nodatacow $DEV $MNT
> >
> >   dd if=/dev/zero of=$MNT/test bs=1 count=2048 &
> >   btrfs subvolume snapshot -r $MNT $MNT/test_snap &
> >   wait
> >   --
> >   We can see dd failed on NO_SPACE.
> >
> > Reason:
> >   __btrfs_buffered_write should run cow write when no_cow impossible,
> >   and current code is designed with above logic.
> >   But check_can_nocow() have 2 type of return value(0 and <0) on
> >   can_not_no_cow, and current code only continue write on first case,
> >   the second case happened in doing subvolume.
> >
> > Fix:
> >   Continue write when check_can_nocow() return 0 and <0.
> >
> > Signed-off-by: Zhao Lei <zhaolei@cn.fujitsu.com>
> Reviewed-by: Filipe Manana <fdmanana@suse.com>
> 
Thanks for review.

> Could you please submit a test case for xfstests as well?

btrfs/080 with -o nodatacow already can trigger this bug.

Thanks
Zhaolei

> Thanks.
> 

> > ---
> >  fs/btrfs/file.c | 37 +++++++++++++++++--------------------
> >  1 file changed, 17 insertions(+), 20 deletions(-)
> >
> > diff --git a/fs/btrfs/file.c b/fs/btrfs/file.c index 977e715..11fd981
> > 100644
> > --- a/fs/btrfs/file.c
> > +++ b/fs/btrfs/file.c
> > @@ -1516,27 +1516,24 @@ static noinline ssize_t
> > __btrfs_buffered_write(struct file *file,
> >
> >                 reserve_bytes = num_pages << PAGE_CACHE_SHIFT;
> >
> > -               if (BTRFS_I(inode)->flags & (BTRFS_INODE_NODATACOW |
> > -
> BTRFS_INODE_PREALLOC)) {
> > -                       ret = check_can_nocow(inode, pos,
> &write_bytes);
> > -                       if (ret < 0)
> > -                               break;
> > -                       if (ret > 0) {
> > -                               /*
> > -                                * For nodata cow case, no need to
> reserve
> > -                                * data space.
> > -                                */
> > -                               only_release_metadata = true;
> > -                               /*
> > -                                * our prealloc extent may be smaller
> than
> > -                                * write_bytes, so scale down.
> > -                                */
> > -                               num_pages =
> DIV_ROUND_UP(write_bytes + offset,
> > -
> PAGE_CACHE_SIZE);
> > -                               reserve_bytes = num_pages <<
> PAGE_CACHE_SHIFT;
> > -                               goto reserve_metadata;
> > -                       }
> > +               if ((BTRFS_I(inode)->flags & (BTRFS_INODE_NODATACOW
> |
> > +
> BTRFS_INODE_PREALLOC)) &&
> > +                   check_can_nocow(inode, pos, &write_bytes) > 0) {
> > +                       /*
> > +                        * For nodata cow case, no need to reserve
> > +                        * data space.
> > +                        */
> > +                       only_release_metadata = true;
> > +                       /*
> > +                        * our prealloc extent may be smaller than
> > +                        * write_bytes, so scale down.
> > +                        */
> > +                       num_pages = DIV_ROUND_UP(write_bytes +
> offset,
> > +
> PAGE_CACHE_SIZE);
> > +                       reserve_bytes = num_pages <<
> PAGE_CACHE_SHIFT;
> > +                       goto reserve_metadata;
> >                 }
> > +
> >                 ret = btrfs_check_data_free_space(inode, pos,
> write_bytes);
> >                 if (ret < 0)
> >                         break;
> > --
> > 1.8.5.1
> >
> >
> >
> > --
> > To unsubscribe from this list: send the line "unsubscribe linux-btrfs"
> > in the body of a message to majordomo@vger.kernel.org More majordomo
> > info at  http://vger.kernel.org/majordomo-info.html
> 
> 
> 
> --
> Filipe David Manana,
> 
> "Reasonable men adapt themselves to the world.
>  Unreasonable men adapt the world to themselves.
>  That's why all progress depends on unreasonable men."




--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Roman Mamedov Feb. 26, 2016, 5:41 a.m. UTC | #3
On Wed, 6 Jan 2016 19:00:17 +0800
Zhao Lei <zhaolei@cn.fujitsu.com> wrote:

> btrfs failed in xfstests btrfs/080 with -o nodatacow.
> 
> Can be reproduced by following script:
>   DEV=/dev/vdg
>   MNT=/mnt/tmp
> 
>   umount $DEV &>/dev/null
>   mkfs.btrfs -f $DEV
>   mount -o nodatacow $DEV $MNT
> 
>   dd if=/dev/zero of=$MNT/test bs=1 count=2048 &
>   btrfs subvolume snapshot -r $MNT $MNT/test_snap &
>   wait
>   --
>   We can see dd failed on NO_SPACE.
> 
> Reason:
>   __btrfs_buffered_write should run cow write when no_cow impossible,
>   and current code is designed with above logic.
>   But check_can_nocow() have 2 type of return value(0 and <0) on
>   can_not_no_cow, and current code only continue write on first case,
>   the second case happened in doing subvolume.
> 
> Fix:
>   Continue write when check_can_nocow() return 0 and <0.
> 
> Signed-off-by: Zhao Lei <zhaolei@cn.fujitsu.com>

Guys please don't forget about this patch. It solves real problem for people,
http://www.spinics.net/lists/linux-btrfs/msg51276.html
http://www.spinics.net/lists/linux-btrfs/msg51819.html
but it's not in 4.4.1, not in 4.4.2... and now not in 4.4.3

> ---
>  fs/btrfs/file.c | 37 +++++++++++++++++--------------------
>  1 file changed, 17 insertions(+), 20 deletions(-)
> 
> diff --git a/fs/btrfs/file.c b/fs/btrfs/file.c
> index 977e715..11fd981 100644
> --- a/fs/btrfs/file.c
> +++ b/fs/btrfs/file.c
> @@ -1516,27 +1516,24 @@ static noinline ssize_t __btrfs_buffered_write(struct file *file,
>  
>  		reserve_bytes = num_pages << PAGE_CACHE_SHIFT;
>  
> -		if (BTRFS_I(inode)->flags & (BTRFS_INODE_NODATACOW |
> -					     BTRFS_INODE_PREALLOC)) {
> -			ret = check_can_nocow(inode, pos, &write_bytes);
> -			if (ret < 0)
> -				break;
> -			if (ret > 0) {
> -				/*
> -				 * For nodata cow case, no need to reserve
> -				 * data space.
> -				 */
> -				only_release_metadata = true;
> -				/*
> -				 * our prealloc extent may be smaller than
> -				 * write_bytes, so scale down.
> -				 */
> -				num_pages = DIV_ROUND_UP(write_bytes + offset,
> -							 PAGE_CACHE_SIZE);
> -				reserve_bytes = num_pages << PAGE_CACHE_SHIFT;
> -				goto reserve_metadata;
> -			}
> +		if ((BTRFS_I(inode)->flags & (BTRFS_INODE_NODATACOW |
> +					      BTRFS_INODE_PREALLOC)) &&
> +		    check_can_nocow(inode, pos, &write_bytes) > 0) {
> +			/*
> +			 * For nodata cow case, no need to reserve
> +			 * data space.
> +			 */
> +			only_release_metadata = true;
> +			/*
> +			 * our prealloc extent may be smaller than
> +			 * write_bytes, so scale down.
> +			 */
> +			num_pages = DIV_ROUND_UP(write_bytes + offset,
> +						 PAGE_CACHE_SIZE);
> +			reserve_bytes = num_pages << PAGE_CACHE_SHIFT;
> +			goto reserve_metadata;
>  		}
> +
>  		ret = btrfs_check_data_free_space(inode, pos, write_bytes);
>  		if (ret < 0)
>  			break;
Chris Mason Feb. 26, 2016, 2:57 p.m. UTC | #4
On Fri, Feb 26, 2016 at 10:41:31AM +0500, Roman Mamedov wrote:
> On Wed, 6 Jan 2016 19:00:17 +0800
> Zhao Lei <zhaolei@cn.fujitsu.com> wrote:
> 
> > btrfs failed in xfstests btrfs/080 with -o nodatacow.
> > 
> > Can be reproduced by following script:
> >   DEV=/dev/vdg
> >   MNT=/mnt/tmp
> > 
> >   umount $DEV &>/dev/null
> >   mkfs.btrfs -f $DEV
> >   mount -o nodatacow $DEV $MNT
> > 
> >   dd if=/dev/zero of=$MNT/test bs=1 count=2048 &
> >   btrfs subvolume snapshot -r $MNT $MNT/test_snap &
> >   wait
> >   --
> >   We can see dd failed on NO_SPACE.
> > 
> > Reason:
> >   __btrfs_buffered_write should run cow write when no_cow impossible,
> >   and current code is designed with above logic.
> >   But check_can_nocow() have 2 type of return value(0 and <0) on
> >   can_not_no_cow, and current code only continue write on first case,
> >   the second case happened in doing subvolume.
> > 
> > Fix:
> >   Continue write when check_can_nocow() return 0 and <0.
> > 
> > Signed-off-by: Zhao Lei <zhaolei@cn.fujitsu.com>
> 
> Guys please don't forget about this patch. It solves real problem for people,
> http://www.spinics.net/lists/linux-btrfs/msg51276.html
> http://www.spinics.net/lists/linux-btrfs/msg51819.html
> but it's not in 4.4.1, not in 4.4.2... and now not in 4.4.3

Dave already has it queued for the next merge window.  Thanks!

-chris
--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
diff mbox

Patch

diff --git a/fs/btrfs/file.c b/fs/btrfs/file.c
index 977e715..11fd981 100644
--- a/fs/btrfs/file.c
+++ b/fs/btrfs/file.c
@@ -1516,27 +1516,24 @@  static noinline ssize_t __btrfs_buffered_write(struct file *file,
 
 		reserve_bytes = num_pages << PAGE_CACHE_SHIFT;
 
-		if (BTRFS_I(inode)->flags & (BTRFS_INODE_NODATACOW |
-					     BTRFS_INODE_PREALLOC)) {
-			ret = check_can_nocow(inode, pos, &write_bytes);
-			if (ret < 0)
-				break;
-			if (ret > 0) {
-				/*
-				 * For nodata cow case, no need to reserve
-				 * data space.
-				 */
-				only_release_metadata = true;
-				/*
-				 * our prealloc extent may be smaller than
-				 * write_bytes, so scale down.
-				 */
-				num_pages = DIV_ROUND_UP(write_bytes + offset,
-							 PAGE_CACHE_SIZE);
-				reserve_bytes = num_pages << PAGE_CACHE_SHIFT;
-				goto reserve_metadata;
-			}
+		if ((BTRFS_I(inode)->flags & (BTRFS_INODE_NODATACOW |
+					      BTRFS_INODE_PREALLOC)) &&
+		    check_can_nocow(inode, pos, &write_bytes) > 0) {
+			/*
+			 * For nodata cow case, no need to reserve
+			 * data space.
+			 */
+			only_release_metadata = true;
+			/*
+			 * our prealloc extent may be smaller than
+			 * write_bytes, so scale down.
+			 */
+			num_pages = DIV_ROUND_UP(write_bytes + offset,
+						 PAGE_CACHE_SIZE);
+			reserve_bytes = num_pages << PAGE_CACHE_SHIFT;
+			goto reserve_metadata;
 		}
+
 		ret = btrfs_check_data_free_space(inode, pos, write_bytes);
 		if (ret < 0)
 			break;