NFS: Always return the error that truncates a flushing page

Message ID	1cbf359e173a9f93e7c858faa43e623dda3858df.1548602308.git.bcodding@redhat.com (mailing list archive)
State	New, archived
Headers	show Return-Path: <linux-nfs-owner@kernel.org> From: Benjamin Coddington <bcodding@redhat.com> To: Trond Myklebust <trondmy@hammerspace.com>, Anna Schumaker <anna.schumaker@netapp.com> Cc: linux-nfs@vger.kernel.org Subject: [PATCH] NFS: Always return the error that truncates a flushing page Date: Sun, 27 Jan 2019 10:19:47 -0500 Message-Id: <1cbf359e173a9f93e7c858faa43e623dda3858df.1548602308.git.bcodding@redhat.com> Sender: linux-nfs-owner@vger.kernel.org Precedence: bulk
Series	NFS: Always return the error that truncates a flushing page \| expand NFS: Always return the error that truncates a flushing page

Benjamin Coddington Jan. 27, 2019, 3:19 p.m. UTC

We can't have nfs_wb_page() truncate the page from the mapping if there's
an error on the context without returning that error, because we may be in
nfs_updatepage() holding the page and trying to update the request.  Not
having any error returned means we'll proceed to create a new request and
dereference the truncated page->mapping.

If we're going to remove the page, always return the error that signaled us
to do so in nfs_page_async_flush().

Fixes: c373fff7bd25 ("NFSv4: Don't special case "launder"")
Cc: stable@vger.kernel.org # v4.11
Signed-off-by: Benjamin Coddington <bcodding@redhat.com>
---
 fs/nfs/write.c | 6 ++++--
 1 file changed, 4 insertions(+), 2 deletions(-)

Trond Myklebust Jan. 28, 2019, 5:59 p.m. UTC | #1

On Sun, 2019-01-27 at 10:19 -0500, Benjamin Coddington wrote:
> We can't have nfs_wb_page() truncate the page from the mapping if
> there's
> an error on the context without returning that error, because we may
> be in
> nfs_updatepage() holding the page and trying to update the
> request.  Not
> having any error returned means we'll proceed to create a new request
> and
> dereference the truncated page->mapping.
> 
> If we're going to remove the page, always return the error that
> signaled us
> to do so in nfs_page_async_flush().
> 
> Fixes: c373fff7bd25 ("NFSv4: Don't special case "launder"")
> Cc: stable@vger.kernel.org # v4.11
> Signed-off-by: Benjamin Coddington <bcodding@redhat.com>
> ---
>  fs/nfs/write.c | 6 ++++--
>  1 file changed, 4 insertions(+), 2 deletions(-)
> 
> diff --git a/fs/nfs/write.c b/fs/nfs/write.c
> index 5a0bbf917a32..c274339176cc 100644
> --- a/fs/nfs/write.c
> +++ b/fs/nfs/write.c
> @@ -622,9 +622,11 @@ static int nfs_page_async_flush(struct
> nfs_pageio_descriptor *pgio,
>  	WARN_ON_ONCE(test_bit(PG_CLEAN, &req->wb_flags));
>  
>  	ret = 0;
> -	/* If there is a fatal error that covers this write, just exit
> */
> -	if (nfs_error_is_fatal_on_server(req->wb_context->error))
> +	/* If there is a fatal on server error on this context, just
> exit */
> +	if (nfs_error_is_fatal_on_server(req->wb_context->error)) {
> +		ret = req->wb_context->error;
>  		goto out_launder;
> +	}
>  
>  	if (!nfs_pageio_add_request(pgio, req)) {
>  		ret = pgio->pg_error;

Hi Ben

We were apparently both looking at the same code last week ☺. I have a
similar patch, but that also fixes a similar clobbering issue with the
nfs_error_is_fatal() error just a few lines further down.

Without the extra hunk, we could end up converting an interrupted call
into a 'successful' write.

Cheers
  Trond

---
From 676d3340abc26ce9ff2b7609c293454dec6d4897 Mon Sep 17 00:00:00 2001
From: Trond Myklebust <trond.myklebust@hammerspace.com>
Date: Tue, 22 Jan 2019 07:34:45 -0500
Subject: [PATCH] NFS: Fix up return value on fatal errors in
 nfs_page_async_flush()

Ensure that we return the fatal error value that caused us to exit
nfs_page_async_flush().

Fixes: a6598813a4c5 ("NFS: Don't write back further requests...")
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
Cc: stable@vger.kernel.org # v4.12+
---
 fs/nfs/write.c | 9 +++++----
 1 file changed, 5 insertions(+), 4 deletions(-)

diff --git a/fs/nfs/write.c b/fs/nfs/write.c
index 5a0bbf917a32..f12cb31a41e5 100644
--- a/fs/nfs/write.c
+++ b/fs/nfs/write.c
@@ -621,11 +621,12 @@ static int nfs_page_async_flush(struct nfs_pageio_descriptor *pgio,
 	nfs_set_page_writeback(page);
 	WARN_ON_ONCE(test_bit(PG_CLEAN, &req->wb_flags));
 
-	ret = 0;
+	ret = req->wb_context->error;
 	/* If there is a fatal error that covers this write, just exit */
-	if (nfs_error_is_fatal_on_server(req->wb_context->error))
+	if (nfs_error_is_fatal_on_server(ret))
 		goto out_launder;
 
+	ret = 0;
 	if (!nfs_pageio_add_request(pgio, req)) {
 		ret = pgio->pg_error;
 		/*
@@ -635,9 +636,9 @@ static int nfs_page_async_flush(struct nfs_pageio_descriptor *pgio,
 			nfs_context_set_write_error(req->wb_context, ret);
 			if (nfs_error_is_fatal_on_server(ret))
 				goto out_launder;
-		}
+		} else
+			ret = -EAGAIN;
 		nfs_redirty_request(req);
-		ret = -EAGAIN;
 	} else
 		nfs_add_stats(page_file_mapping(page)->host,
 				NFSIOS_WRITEPAGES, 1);

Benjamin Coddington Jan. 28, 2019, 6:55 p.m. UTC | #2

On 28 Jan 2019, at 12:59, Trond Myklebust wrote:
> Hi Ben
>
> We were apparently both looking at the same code last week ☺. I have a
> similar patch, but that also fixes a similar clobbering issue with the
> nfs_error_is_fatal() error just a few lines further down.
>
> Without the extra hunk, we could end up converting an interrupted call
> into a 'successful' write.
>
> Cheers
>   Trond

Looks right to me.  I hope you'll take the Fixes and stable version from
my patch, which will cover the problem I'm seeing.

Reviewed-by: Benjamin Coddington <bcodding@redhat.com>

Ben

Trond Myklebust Jan. 29, 2019, 8:49 p.m. UTC | #3

On Mon, 2019-01-28 at 13:55 -0500, Benjamin Coddington wrote:
> On 28 Jan 2019, at 12:59, Trond Myklebust wrote:
> > Hi Ben
> > 
> > We were apparently both looking at the same code last week ☺. I
> > have a
> > similar patch, but that also fixes a similar clobbering issue with
> > the
> > nfs_error_is_fatal() error just a few lines further down.
> > 
> > Without the extra hunk, we could end up converting an interrupted
> > call
> > into a 'successful' write.
> > 
> > Cheers
> >   Trond
> 
> Looks right to me.  I hope you'll take the Fixes and stable version
> from
> my patch, which will cover the problem I'm seeing.
> 
> Reviewed-by: Benjamin Coddington <bcodding@redhat.com>
> 

At first, I thought I'd have to split this patch in 2 so that it could
apply to 4.11 and 4.12, but it looks to me as if the commit from the
Fixes line you were showing is actually only present in 4.12 and later.
Am I wrong?

Benjamin Coddington Jan. 29, 2019, 10:34 p.m. UTC | #4

On 29 Jan 2019, at 20:49, Trond Myklebust wrote:

> On Mon, 2019-01-28 at 13:55 -0500, Benjamin Coddington wrote:
>> On 28 Jan 2019, at 12:59, Trond Myklebust wrote:
>>> Hi Ben
>>>
>>> We were apparently both looking at the same code last week ☺. I
>>> have a
>>> similar patch, but that also fixes a similar clobbering issue with
>>> the
>>> nfs_error_is_fatal() error just a few lines further down.
>>>
>>> Without the extra hunk, we could end up converting an interrupted
>>> call
>>> into a 'successful' write.
>>>
>>> Cheers
>>>   Trond
>>
>> Looks right to me.  I hope you'll take the Fixes and stable version
>> from
>> my patch, which will cover the problem I'm seeing.
>>
>> Reviewed-by: Benjamin Coddington <bcodding@redhat.com>
>>
>
> At first, I thought I'd have to split this patch in 2 so that it could
> apply to 4.11 and 4.12, but it looks to me as if the commit from the
> Fixes line you were showing is actually only present in 4.12 and 
> later.
> Am I wrong?

You're right.  I did

# git describe c373fff7bd25
v4.11-rc7-70-gc373fff7bd25

.. and assumed it ended up in v4.11.  I wonder what actually happened to 
it, I'll see if I can find out.

Ben

Benjamin Coddington Jan. 30, 2019, 9:03 a.m. UTC | #5

On 29 Jan 2019, at 22:34, Benjamin Coddington wrote:
> You're right.  I did
>
> # git describe c373fff7bd25
> v4.11-rc7-70-gc373fff7bd25
>
> .. and assumed it ended up in v4.11.  I wonder what actually happened to
> it, I'll see if I can find out.

git describe is giving me the closest ancestor of the commit, not (what I
expected) the closest ancestor after the merge.  I should use git describe
--contains instead.

Ben

NFS: Always return the error that truncates a flushing page

Commit Message

Comments

Patch