Btrfs: adjust outstanding_extents counter properly when dio write is split

Message ID 1482455634-9185-1-git-send-email-bo.li.liu@oracle.com (mailing list archive)
State Accepted

Commit Message

Liu Bo Dec. 23, 2016, 1:13 a.m. UTC
Currently, btrfs does not handle split dio writes well. If a dio
write is split into several segments due to the lack of contiguous
space, a large dio write like 'dd bs=1G count=1' can end up with an
incorrect outstanding_extents counter, and endio would complain
loudly with an assertion.

This fixes the problem by compensating the inode's
outstanding_extents counter if a large dio write gets split.

Reported-by: Anand Jain <anand.jain@oracle.com>
Signed-off-by: Liu Bo <bo.li.liu@oracle.com>
---
 fs/btrfs/inode.c | 11 +++++++++--
 1 file changed, 9 insertions(+), 2 deletions(-)

Comments

Anand Jain Dec. 23, 2016, 3:18 a.m. UTC | #1
On 12/23/16 09:13, Liu Bo wrote:
> Currently how btrfs dio deals with split dio write is not good
> enough if dio write is split into several segments due to the
> lack of contiguous space, a large dio write like 'dd bs=1G count=1'
> can end up with incorrect outstanding_extents counter and endio
> would complain loudly with an assertion.
>
> This fixes the problem by compensating the outstanding_extents
> counter in inode if a large dio write gets split.

  Fix works. Thanks, Liu Bo, for working on this.

  Tested-by: Anand Jain <anand.jain@oracle.com>

> Reported-by: Anand Jain <anand.jain@oracle.com>
> Signed-off-by: Liu Bo <bo.li.liu@oracle.com>
> ---
>  fs/btrfs/inode.c | 11 +++++++++--
>  1 file changed, 9 insertions(+), 2 deletions(-)
>
> diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c
> index a4c8796..4175987 100644
> --- a/fs/btrfs/inode.c
> +++ b/fs/btrfs/inode.c
> @@ -7641,11 +7641,18 @@ static void adjust_dio_outstanding_extents(struct inode *inode,
>  	 * within our reservation, otherwise we need to adjust our inode
>  	 * counter appropriately.
>  	 */
> -	if (dio_data->outstanding_extents) {
> +	if (dio_data->outstanding_extents >= num_extents) {
>  		dio_data->outstanding_extents -= num_extents;
>  	} else {
> +		/*
> +		 * If dio write length has been split due to no large enough
> +		 * contiguous space, we need to compensate our inode counter
> +		 * appropriately.
> +		 */
> +		u64 num_needed = num_extents - dio_data->outstanding_extents;
> +
>  		spin_lock(&BTRFS_I(inode)->lock);
> -		BTRFS_I(inode)->outstanding_extents += num_extents;
> +		BTRFS_I(inode)->outstanding_extents += num_needed;
>  		spin_unlock(&BTRFS_I(inode)->lock);
>  	}
>  }
>
--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Filipe Manana Jan. 6, 2017, 11:28 a.m. UTC | #2
On Fri, Dec 23, 2016 at 1:13 AM, Liu Bo <bo.li.liu@oracle.com> wrote:
> Currently how btrfs dio deals with split dio write is not good
> enough if dio write is split into several segments due to the
> lack of contiguous space, a large dio write like 'dd bs=1G count=1'
> can end up with incorrect outstanding_extents counter and endio
> would complain loudly with an assertion.
>
> This fixes the problem by compensating the outstanding_extents
> counter in inode if a large dio write gets split.
>
> Reported-by: Anand Jain <anand.jain@oracle.com>
> Signed-off-by: Liu Bo <bo.li.liu@oracle.com>

Bo, can you please create a test case for fstests?

Thanks

> ---
>  fs/btrfs/inode.c | 11 +++++++++--
>  1 file changed, 9 insertions(+), 2 deletions(-)
>
> diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c
> index a4c8796..4175987 100644
> --- a/fs/btrfs/inode.c
> +++ b/fs/btrfs/inode.c
> @@ -7641,11 +7641,18 @@ static void adjust_dio_outstanding_extents(struct inode *inode,
>          * within our reservation, otherwise we need to adjust our inode
>          * counter appropriately.
>          */
> -       if (dio_data->outstanding_extents) {
> +       if (dio_data->outstanding_extents >= num_extents) {
>                 dio_data->outstanding_extents -= num_extents;
>         } else {
> +               /*
> +                * If dio write length has been split due to no large enough
> +                * contiguous space, we need to compensate our inode counter
> +                * appropriately.
> +                */
> +               u64 num_needed = num_extents - dio_data->outstanding_extents;
> +
>                 spin_lock(&BTRFS_I(inode)->lock);
> -               BTRFS_I(inode)->outstanding_extents += num_extents;
> +               BTRFS_I(inode)->outstanding_extents += num_needed;
>                 spin_unlock(&BTRFS_I(inode)->lock);
>         }
>  }
> --
> 2.5.5
>
Liu Bo Jan. 12, 2017, 10:24 p.m. UTC | #3
On Fri, Jan 06, 2017 at 11:28:06AM +0000, Filipe Manana wrote:
> On Fri, Dec 23, 2016 at 1:13 AM, Liu Bo <bo.li.liu@oracle.com> wrote:
> > Currently how btrfs dio deals with split dio write is not good
> > enough if dio write is split into several segments due to the
> > lack of contiguous space, a large dio write like 'dd bs=1G count=1'
> > can end up with incorrect outstanding_extents counter and endio
> > would complain loudly with an assertion.
> >
> > This fixes the problem by compensating the outstanding_extents
> > counter in inode if a large dio write gets split.
> >
> > Reported-by: Anand Jain <anand.jain@oracle.com>
> > Signed-off-by: Liu Bo <bo.li.liu@oracle.com>
> 
> Bo, can you please create a test case for fstests?

It took me some time to recall all the details, but I've sent out an
fstests case for it [1].

[1]: https://patchwork.kernel.org/patch/9514277/

Thanks,

-liubo

> 
> Thanks
> 
> > ---
> >  fs/btrfs/inode.c | 11 +++++++++--
> >  1 file changed, 9 insertions(+), 2 deletions(-)
> >
> > diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c
> > index a4c8796..4175987 100644
> > --- a/fs/btrfs/inode.c
> > +++ b/fs/btrfs/inode.c
> > @@ -7641,11 +7641,18 @@ static void adjust_dio_outstanding_extents(struct inode *inode,
> >          * within our reservation, otherwise we need to adjust our inode
> >          * counter appropriately.
> >          */
> > -       if (dio_data->outstanding_extents) {
> > +       if (dio_data->outstanding_extents >= num_extents) {
> >                 dio_data->outstanding_extents -= num_extents;
> >         } else {
> > +               /*
> > +                * If dio write length has been split due to no large enough
> > +                * contiguous space, we need to compensate our inode counter
> > +                * appropriately.
> > +                */
> > +               u64 num_needed = num_extents - dio_data->outstanding_extents;
> > +
> >                 spin_lock(&BTRFS_I(inode)->lock);
> > -               BTRFS_I(inode)->outstanding_extents += num_extents;
> > +               BTRFS_I(inode)->outstanding_extents += num_needed;
> >                 spin_unlock(&BTRFS_I(inode)->lock);
> >         }
> >  }
> > --
> > 2.5.5
> >
> 
> 
> 
> -- 
> Filipe David Manana,
> 
> "People will forget what you said,
>  people will forget what you did,
>  but people will never forget how you made them feel."

Patch

diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c
index a4c8796..4175987 100644
--- a/fs/btrfs/inode.c
+++ b/fs/btrfs/inode.c
@@ -7641,11 +7641,18 @@  static void adjust_dio_outstanding_extents(struct inode *inode,
 	 * within our reservation, otherwise we need to adjust our inode
 	 * counter appropriately.
 	 */
-	if (dio_data->outstanding_extents) {
+	if (dio_data->outstanding_extents >= num_extents) {
 		dio_data->outstanding_extents -= num_extents;
 	} else {
+		/*
+		 * If dio write length has been split due to no large enough
+		 * contiguous space, we need to compensate our inode counter
+		 * appropriately.
+		 */
+		u64 num_needed = num_extents - dio_data->outstanding_extents;
+
 		spin_lock(&BTRFS_I(inode)->lock);
-		BTRFS_I(inode)->outstanding_extents += num_extents;
+		BTRFS_I(inode)->outstanding_extents += num_needed;
 		spin_unlock(&BTRFS_I(inode)->lock);
 	}
 }
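
To make the effect of the changed condition easier to follow, here is a minimal userspace sketch of the two branches, before and after the patch. It is not the kernel code: struct model_inode, struct model_dio_data, adjust_old() and adjust_new() are hypothetical stand-ins, both counters are assumed to be unsigned 64-bit values, the spinlock is omitted, and num_extents is taken as a given input rather than derived from the write length.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical userspace stand-ins for the kernel structures. Only the
 * two counters touched by the patch are modelled; locking is omitted.
 */
struct model_dio_data {
	uint64_t outstanding_extents;	/* extents still covered by the dio reservation */
};

struct model_inode {
	uint64_t outstanding_extents;	/* per-inode counter */
};

/* Pre-patch logic: any non-zero reservation is treated as sufficient. */
void adjust_old(struct model_inode *inode, struct model_dio_data *dio,
		uint64_t num_extents)
{
	if (dio->outstanding_extents) {
		/* Wraps around if num_extents exceeds what is left. */
		dio->outstanding_extents -= num_extents;
	} else {
		inode->outstanding_extents += num_extents;
	}
}

/* Post-patch logic: only the shortfall is added to the inode counter. */
void adjust_new(struct model_inode *inode, struct model_dio_data *dio,
		uint64_t num_extents)
{
	if (dio->outstanding_extents >= num_extents) {
		dio->outstanding_extents -= num_extents;
	} else {
		uint64_t num_needed = num_extents - dio->outstanding_extents;

		inode->outstanding_extents += num_needed;
	}
}
```

With hand-picked illustrative values, say one extent left in the reservation and a segment that needs three, adjust_old() leaves the inode counter untouched and wraps the dio counter around, while adjust_new() adds the shortfall of two to the inode counter and leaves the dio counter at one. When the reservation is large enough, both versions behave the same.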