diff mbox series

[07/13] xfs: allow merging ioends over append boundaries

Message ID 20190627104836.25446-8-hch@lst.de (mailing list archive)
State Accepted
Headers show
Series [01/13] list.h: add list_pop and list_pop_entry helpers | expand

Commit Message

Christoph Hellwig June 27, 2019, 10:48 a.m. UTC
There is no real problem merging ioends that go beyond i_size into an
ioend that doesn't.  We just need to move the append transaction to the
base ioend.  Also use the opportunity to use a real error code instead
of the magic 1 to cancel the transactions, and write a comment
explaining the scheme.

Signed-off-by: Christoph Hellwig <hch@lst.de>
---
 fs/xfs/xfs_aops.c | 28 +++++++++++++++++++++-------
 1 file changed, 21 insertions(+), 7 deletions(-)

Comments

Darrick J. Wong June 27, 2019, 6:23 p.m. UTC | #1
On Thu, Jun 27, 2019 at 12:48:30PM +0200, Christoph Hellwig wrote:
> There is no real problem merging ioends that go beyond i_size into an
> ioend that doesn't.  We just need to move the append transaction to the
> base ioend.  Also use the opportunity to use a real error code instead
> of the magic 1 to cancel the transactions, and write a comment
> explaining the scheme.
> 
> Signed-off-by: Christoph Hellwig <hch@lst.de>

Reading through this patch, I have a feeling it fixes the crash that
Zorro has been seeing occasionally with generic/475...

Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>

--D

> ---
>  fs/xfs/xfs_aops.c | 28 +++++++++++++++++++++-------
>  1 file changed, 21 insertions(+), 7 deletions(-)
> 
> diff --git a/fs/xfs/xfs_aops.c b/fs/xfs/xfs_aops.c
> index 8b3070a40245..4ef8343c3759 100644
> --- a/fs/xfs/xfs_aops.c
> +++ b/fs/xfs/xfs_aops.c
> @@ -314,11 +314,28 @@ xfs_ioend_can_merge(
>  		return false;
>  	if (ioend->io_offset + ioend->io_size != next->io_offset)
>  		return false;
> -	if (xfs_ioend_is_append(ioend) != xfs_ioend_is_append(next))
> -		return false;
>  	return true;
>  }
>  
> +/*
> + * If the to be merged ioend has a preallocated transaction for file
> + * size updates we need to ensure the ioend it is merged into also
> + * has one.  If it already has one we can simply cancel the transaction
> + * as it is guaranteed to be clean.
> + */
> +static void
> +xfs_ioend_merge_append_transactions(
> +	struct xfs_ioend	*ioend,
> +	struct xfs_ioend	*next)
> +{
> +	if (!ioend->io_append_trans) {
> +		ioend->io_append_trans = next->io_append_trans;
> +		next->io_append_trans = NULL;
> +	} else {
> +		xfs_setfilesize_ioend(next, -ECANCELED);
> +	}
> +}
> +
>  /* Try to merge adjacent completions. */
>  STATIC void
>  xfs_ioend_try_merge(
> @@ -327,7 +344,6 @@ xfs_ioend_try_merge(
>  {
>  	struct xfs_ioend	*next_ioend;
>  	int			ioend_error;
> -	int			error;
>  
>  	if (list_empty(more_ioends))
>  		return;
> @@ -341,10 +357,8 @@ xfs_ioend_try_merge(
>  			break;
>  		list_move_tail(&next_ioend->io_list, &ioend->io_list);
>  		ioend->io_size += next_ioend->io_size;
> -		if (ioend->io_append_trans) {
> -			error = xfs_setfilesize_ioend(next_ioend, 1);
> -			ASSERT(error == 1);
> -		}
> +		if (next_ioend->io_append_trans)
> +			xfs_ioend_merge_append_transactions(ioend, next_ioend);
>  	}
>  }
>  
> -- 
> 2.20.1
>
Luis Chamberlain June 27, 2019, 9:43 p.m. UTC | #2
On Thu, Jun 27, 2019 at 11:23:09AM -0700, Darrick J. Wong wrote:
> On Thu, Jun 27, 2019 at 12:48:30PM +0200, Christoph Hellwig wrote:
> > There is no real problem merging ioends that go beyond i_size into an
> > ioend that doesn't.  We just need to move the append transaction to the
> > base ioend.  Also use the opportunity to use a real error code instead
> > of the magic 1 to cancel the transactions, and write a comment
> > explaining the scheme.
> > 
> > Signed-off-by: Christoph Hellwig <hch@lst.de>
> 
> Reading through this patch, I have a feeling it fixes the crash that
> Zorro has been seeing occasionally with generic/475...
> 
> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>

Zorro, can you confirm? If so it would be great to also refer to
the respective bugzilla entry #203947 [0].

[0] https://bugzilla.kernel.org/show_bug.cgi?id=203947

  Luis
Zorro Lang June 28, 2019, 2:52 a.m. UTC | #3
On Thu, Jun 27, 2019 at 09:43:04PM +0000, Luis Chamberlain wrote:
> On Thu, Jun 27, 2019 at 11:23:09AM -0700, Darrick J. Wong wrote:
> > On Thu, Jun 27, 2019 at 12:48:30PM +0200, Christoph Hellwig wrote:
> > > There is no real problem merging ioends that go beyond i_size into an
> > > ioend that doesn't.  We just need to move the append transaction to the
> > > base ioend.  Also use the opportunity to use a real error code instead
> > > of the magic 1 to cancel the transactions, and write a comment
> > > explaining the scheme.
> > > 
> > > Signed-off-by: Christoph Hellwig <hch@lst.de>
> > 
> > Reading through this patch, I have a feeling it fixes the crash that
> > Zorro has been seeing occasionally with generic/475...
> > 
> > Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
> 
> Zorro, can you confirm? If so it would be great to also refer to
> the respective bugzilla entry #203947 [0].

Sure, I'll give it a test. But it's so hard to reproduce, I need long enough
time to prove "the panic's gone".

BTW, should I only merge this single patch to test, or merge your whole patchset
with 13 patches?

Thanks,
Zorro

> 
> [0] https://bugzilla.kernel.org/show_bug.cgi?id=203947
> 
>   Luis
Darrick J. Wong June 28, 2019, 3:33 a.m. UTC | #4
On Fri, Jun 28, 2019 at 10:52:04AM +0800, Zorro Lang wrote:
> On Thu, Jun 27, 2019 at 09:43:04PM +0000, Luis Chamberlain wrote:
> > On Thu, Jun 27, 2019 at 11:23:09AM -0700, Darrick J. Wong wrote:
> > > On Thu, Jun 27, 2019 at 12:48:30PM +0200, Christoph Hellwig wrote:
> > > > There is no real problem merging ioends that go beyond i_size into an
> > > > ioend that doesn't.  We just need to move the append transaction to the
> > > > base ioend.  Also use the opportunity to use a real error code instead
> > > > of the magic 1 to cancel the transactions, and write a comment
> > > > explaining the scheme.
> > > > 
> > > > Signed-off-by: Christoph Hellwig <hch@lst.de>
> > > 
> > > Reading through this patch, I have a feeling it fixes the crash that
> > > Zorro has been seeing occasionally with generic/475...
> > > 
> > > Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
> > 
> > Zorro, can you confirm? If so it would be great to also refer to
> > the respective bugzilla entry #203947 [0].
> 
> Sure, I'll give it a test. But it's so hard to reproduce, I need long enough
> time to prove "the panic's gone".
> 
> BTW, should I only merge this single patch to test, or merge your whole patchset
> with 13 patches?

Just this one patch.

--D

> Thanks,
> Zorro
> 
> > 
> > [0] https://bugzilla.kernel.org/show_bug.cgi?id=203947
> > 
> >   Luis
Christoph Hellwig June 28, 2019, 5:51 a.m. UTC | #5
On Thu, Jun 27, 2019 at 11:23:09AM -0700, Darrick J. Wong wrote:
> On Thu, Jun 27, 2019 at 12:48:30PM +0200, Christoph Hellwig wrote:
> > There is no real problem merging ioends that go beyond i_size into an
> > ioend that doesn't.  We just need to move the append transaction to the
> > base ioend.  Also use the opportunity to use a real error code instead
> > of the magic 1 to cancel the transactions, and write a comment
> > explaining the scheme.
> > 
> > Signed-off-by: Christoph Hellwig <hch@lst.de>
> 
> Reading through this patch, I have a feeling it fixes the crash that
> Zorro has been seeing occasionally with generic/475...

So you think for some reason the disk i_size changes underneath and thus
the xfs_ioend_is_append misfired vs the actual transaction allocations?
I didn't even think of that, but using the different checks sure sounds
dangerous.  So yes, we'd either need to backport my patch, or at least
replace the checks in xfs_ioend_can_merge with direct checks of
io_append_trans.
Darrick J. Wong June 28, 2019, 5:05 p.m. UTC | #6
On Fri, Jun 28, 2019 at 07:51:43AM +0200, Christoph Hellwig wrote:
> On Thu, Jun 27, 2019 at 11:23:09AM -0700, Darrick J. Wong wrote:
> > On Thu, Jun 27, 2019 at 12:48:30PM +0200, Christoph Hellwig wrote:
> > > There is no real problem merging ioends that go beyond i_size into an
> > > ioend that doesn't.  We just need to move the append transaction to the
> > > base ioend.  Also use the opportunity to use a real error code instead
> > > of the magic 1 to cancel the transactions, and write a comment
> > > explaining the scheme.
> > > 
> > > Signed-off-by: Christoph Hellwig <hch@lst.de>
> > 
> > Reading through this patch, I have a feeling it fixes the crash that
> > Zorro has been seeing occasionally with generic/475...
> 
> So you think for some reason the disk i_size changes underneath and thus
> the xfs_ioend_is_append misfired vs the actual transaction allocations?
> I didn't even think of that, but using the different checks sure sounds
> dangerous.  So yes, we'd either need to backport my patch, or at least
> replace the checks in xfs_ioend_can_merge with direct checks of
> io_append_trans.

That's my working theory, yes.

1. Dirty pages 0 and 2 of an empty file.

2. Writeback gets scheduled for pages 0 and 2, creating ioends A and C.
Both ioends describe writes past the on-disk isize so we allocate
transactions.

3. ioend C completes immediately, sets the ondisk isize to (3 * PAGESIZE).

4. Dirty page 1 of the file and immediately schedule writeback for it,
creating ioend B.  ioend B describes a write within the on-disk isize so
we do not allocate setfilesize transaction.

5. ioend A and B complete and are sorted into the per-inode ioend
completion list.  xfs_ioend_try_merge looks at ioend A, sees that ioend
can be merged with ioend B (same type, same cow status, same current
setfilesize status (which does not reflect the setfilesize status when A
was created)) and therefore decides to merge them.

6. A has a setfilesize transaction so _try_merge calls
xfs_setfilesize_ioend(ioend B, -1) to cancel ioend B's transaction, but
as we saw in (4), ioend B has no transaction and crashes.

--D
diff mbox series

Patch

diff --git a/fs/xfs/xfs_aops.c b/fs/xfs/xfs_aops.c
index 8b3070a40245..4ef8343c3759 100644
--- a/fs/xfs/xfs_aops.c
+++ b/fs/xfs/xfs_aops.c
@@ -314,11 +314,28 @@  xfs_ioend_can_merge(
 		return false;
 	if (ioend->io_offset + ioend->io_size != next->io_offset)
 		return false;
-	if (xfs_ioend_is_append(ioend) != xfs_ioend_is_append(next))
-		return false;
 	return true;
 }
 
+/*
+ * If the to be merged ioend has a preallocated transaction for file
+ * size updates we need to ensure the ioend it is merged into also
+ * has one.  If it already has one we can simply cancel the transaction
+ * as it is guaranteed to be clean.
+ */
+static void
+xfs_ioend_merge_append_transactions(
+	struct xfs_ioend	*ioend,
+	struct xfs_ioend	*next)
+{
+	if (!ioend->io_append_trans) {
+		ioend->io_append_trans = next->io_append_trans;
+		next->io_append_trans = NULL;
+	} else {
+		xfs_setfilesize_ioend(next, -ECANCELED);
+	}
+}
+
 /* Try to merge adjacent completions. */
 STATIC void
 xfs_ioend_try_merge(
@@ -327,7 +344,6 @@  xfs_ioend_try_merge(
 {
 	struct xfs_ioend	*next_ioend;
 	int			ioend_error;
-	int			error;
 
 	if (list_empty(more_ioends))
 		return;
@@ -341,10 +357,8 @@  xfs_ioend_try_merge(
 			break;
 		list_move_tail(&next_ioend->io_list, &ioend->io_list);
 		ioend->io_size += next_ioend->io_size;
-		if (ioend->io_append_trans) {
-			error = xfs_setfilesize_ioend(next_ioend, 1);
-			ASSERT(error == 1);
-		}
+		if (next_ioend->io_append_trans)
+			xfs_ioend_merge_append_transactions(ioend, next_ioend);
 	}
 }