diff mbox series

NFS: Don't use page_file_mapping after removing the page

Message ID b85247642d77ea12153d042a4dce85b94d7624ca.1549303962.git.bcodding@redhat.com (mailing list archive)
State New, archived
Headers show
Series NFS: Don't use page_file_mapping after removing the page | expand

Commit Message

Benjamin Coddington Feb. 4, 2019, 6:13 p.m. UTC
If nfs_page_async_flush() removes the page from the mapping, then we can't
use page_file_mapping() on it as nfs_updatepate() is wont to do when
receiving an error.  Instead, simplify nfs_zap_mapping() to take the inode.

Fixes: 8fc75bed96bb ("NFS: Fix up return value on fatal errors in nfs_page_async_flush()")
Signed-off-by: Benjamin Coddington <bcodding@redhat.com>
---
 fs/nfs/dir.c           |  2 +-
 fs/nfs/direct.c        |  2 +-
 fs/nfs/inode.c         |  4 ++--
 fs/nfs/write.c         | 10 ++--------
 include/linux/nfs_fs.h |  2 +-
 5 files changed, 7 insertions(+), 13 deletions(-)

Comments

Trond Myklebust Feb. 4, 2019, 6:32 p.m. UTC | #1
On Mon, 2019-02-04 at 13:13 -0500, Benjamin Coddington wrote:
> If nfs_page_async_flush() removes the page from the mapping, then we
> can't
> use page_file_mapping() on it as nfs_updatepate() is wont to do when
> receiving an error.  Instead, simplify nfs_zap_mapping() to take the
> inode.
> 

Won't this break NFS swap?
Benjamin Coddington Feb. 4, 2019, 7:03 p.m. UTC | #2
On 4 Feb 2019, at 13:32, Trond Myklebust wrote:

> On Mon, 2019-02-04 at 13:13 -0500, Benjamin Coddington wrote:
>> If nfs_page_async_flush() removes the page from the mapping, then we
>> can't
>> use page_file_mapping() on it as nfs_updatepate() is wont to do when
>> receiving an error.  Instead, simplify nfs_zap_mapping() to take the
>> inode.
>>
>
> Won't this break NFS swap?

I guess it may, but then I wonder: is that a thing that anyone does?  It
sounds like a terrible idea..

I'm curious enough though, and doing some research on it.

Ben
Benjamin Coddington Feb. 4, 2019, 7:07 p.m. UTC | #3
On 4 Feb 2019, at 14:03, Benjamin Coddington wrote:

> On 4 Feb 2019, at 13:32, Trond Myklebust wrote:
>
>> On Mon, 2019-02-04 at 13:13 -0500, Benjamin Coddington wrote:
>>> If nfs_page_async_flush() removes the page from the mapping, then we
>>> can't
>>> use page_file_mapping() on it as nfs_updatepate() is wont to do when
>>> receiving an error.  Instead, simplify nfs_zap_mapping() to take the
>>> inode.
>>>
>>
>> Won't this break NFS swap?
>
> I guess it may, but then I wonder: is that a thing that anyone does?  It
> sounds like a terrible idea..
>
> I'm curious enough though, and doing some research on it.

There's enough to commits in the tree fixing problem or supporting swap on
NFS, so I suppose I've just missed all that.

.. back to the code to find another way.

Ben
Benjamin Coddington Feb. 4, 2019, 7:20 p.m. UTC | #4
On 4 Feb 2019, at 14:07, Benjamin Coddington wrote:

> On 4 Feb 2019, at 14:03, Benjamin Coddington wrote:
>
>> On 4 Feb 2019, at 13:32, Trond Myklebust wrote:
>>
>>> On Mon, 2019-02-04 at 13:13 -0500, Benjamin Coddington wrote:
>>>> If nfs_page_async_flush() removes the page from the mapping, then 
>>>> we
>>>> can't
>>>> use page_file_mapping() on it as nfs_updatepate() is wont to do 
>>>> when
>>>> receiving an error.  Instead, simplify nfs_zap_mapping() to take 
>>>> the
>>>> inode.
>>>>
>>>
>>> Won't this break NFS swap?
>>
>> I guess it may, but then I wonder: is that a thing that anyone does?  
>> It
>> sounds like a terrible idea..
>>
>> I'm curious enough though, and doing some research on it.
>
> There's enough to commits in the tree fixing problem or supporting 
> swap on
> NFS, so I suppose I've just missed all that.

So for diskless systems, ah.  It makes sense.

But does it make any sense to set NFS_INO_INVALID_DATA on a swap file?  
Does anyone share them?

Dazed and confused, but trying to continue..

Ben
Benjamin Coddington Feb. 4, 2019, 8:31 p.m. UTC | #5
On 4 Feb 2019, at 14:20, Benjamin Coddington wrote:

> On 4 Feb 2019, at 14:07, Benjamin Coddington wrote:
>
>> On 4 Feb 2019, at 14:03, Benjamin Coddington wrote:
>>
>>> On 4 Feb 2019, at 13:32, Trond Myklebust wrote:
>>>
>>>> On Mon, 2019-02-04 at 13:13 -0500, Benjamin Coddington wrote:
>>>>> If nfs_page_async_flush() removes the page from the mapping, then we
>>>>> can't use page_file_mapping() on it as nfs_updatepate() is wont to do
>>>>> when receiving an error.  Instead, simplify nfs_zap_mapping() to take
>>>>> the inode.
>>>>>
>>>>
>>>> Won't this break NFS swap?
>>>
>>> I guess it may, but then I wonder: is that a thing that anyone does?  It
>>> sounds like a terrible idea..
>>>
>>> I'm curious enough though, and doing some research on it.
>>
>> There's enough to commits in the tree fixing problem or supporting swap
>> on NFS, so I suppose I've just missed all that.
>
> So for diskless systems, ah.  It makes sense.
>
> But does it make any sense to set NFS_INO_INVALID_DATA on a swap file?
> Does anyone share them?
>
> Dazed and confused, but trying to continue..

So if the page is backing a swap file, nfs_zap_mapping() will fail to set
NFS_INO_INVALID_DATA because i_mapping.nrpages will always be 0.

But it looks like NFS has already decided to not care if the mapping needs
revalidation for swap, since we assume /shared/ swap files are insane.

So without testing swap on NFS at all, it seems safe.

This is all new to me, though, so I need experienced revalidation of this
reasoning myself.

Perhaps it needs a comment to explain why it is safe to do..  any thoughts?

Ben
Trond Myklebust Feb. 4, 2019, 9:40 p.m. UTC | #6
On Mon, 2019-02-04 at 15:31 -0500, Benjamin Coddington wrote:
> On 4 Feb 2019, at 14:20, Benjamin Coddington wrote:
> 
> > On 4 Feb 2019, at 14:07, Benjamin Coddington wrote:
> > 
> > > On 4 Feb 2019, at 14:03, Benjamin Coddington wrote:
> > > 
> > > > On 4 Feb 2019, at 13:32, Trond Myklebust wrote:
> > > > 
> > > > > On Mon, 2019-02-04 at 13:13 -0500, Benjamin Coddington wrote:
> > > > > > If nfs_page_async_flush() removes the page from the
> > > > > > mapping, then we
> > > > > > can't use page_file_mapping() on it as nfs_updatepate() is
> > > > > > wont to do
> > > > > > when receiving an error.  Instead, simplify
> > > > > > nfs_zap_mapping() to take
> > > > > > the inode.
> > > > > > 
> > > > > 
> > > > > Won't this break NFS swap?
> > > > 
> > > > I guess it may, but then I wonder: is that a thing that anyone
> > > > does?  It
> > > > sounds like a terrible idea..
> > > > 
> > > > I'm curious enough though, and doing some research on it.
> > > 
> > > There's enough to commits in the tree fixing problem or
> > > supporting swap
> > > on NFS, so I suppose I've just missed all that.
> > 
> > So for diskless systems, ah.  It makes sense.
> > 
> > But does it make any sense to set NFS_INO_INVALID_DATA on a swap
> > file?
> > Does anyone share them?
> > 
> > Dazed and confused, but trying to continue..
> 
> So if the page is backing a swap file, nfs_zap_mapping() will fail to
> set
> NFS_INO_INVALID_DATA because i_mapping.nrpages will always be 0.
> 
> But it looks like NFS has already decided to not care if the mapping
> needs
> revalidation for swap, since we assume /shared/ swap files are
> insane.
> 
> So without testing swap on NFS at all, it seems safe.
> 

What about the case when the page is in the swap cache? Is that also
safe?
Benjamin Coddington Feb. 5, 2019, 10:23 a.m. UTC | #7
On 4 Feb 2019, at 16:40, Trond Myklebust wrote:
> What about the case when the page is in the swap cache? Is that also
> safe?

I think the case you'd like me to consider is when a file on NFS has its
mapping backed by pages that are in swap.  In that case, it makes no sense
to check the count of pages in the swap file's mapping to determine whether
to set NFS_INO_INVALID_DATA.  We actually want to optimize away the
revalidation if the NFS file lacks mapped pages.

Ben
Benjamin Coddington Feb. 6, 2019, 11:12 a.m. UTC | #8
On 5 Feb 2019, at 5:23, Benjamin Coddington wrote:

> On 4 Feb 2019, at 16:40, Trond Myklebust wrote:
>> What about the case when the page is in the swap cache? Is that also
>> safe?
>
> I think the case you'd like me to consider is when a file on NFS has its
> mapping backed by pages that are in swap.  In that case, it makes no sense
> to check the count of pages in the swap file's mapping to determine whether
> to set NFS_INO_INVALID_DATA.  We actually want to optimize away the
> revalidation if the NFS file lacks mapped pages.

All this makes me nervous about breaking swap, and is probably not
appropriate for a fix here so I am going to drop this approach.

Ben
diff mbox series

Patch

diff --git a/fs/nfs/dir.c b/fs/nfs/dir.c
index 6bf4471850c8..eb8adceb9962 100644
--- a/fs/nfs/dir.c
+++ b/fs/nfs/dir.c
@@ -671,7 +671,7 @@  int nfs_readdir_filler(nfs_readdir_descriptor_t *desc, struct page* page)
 
 	if (invalidate_inode_pages2_range(inode->i_mapping, page->index + 1, -1) < 0) {
 		/* Should never happen */
-		nfs_zap_mapping(inode, inode->i_mapping);
+		nfs_zap_mapping(inode);
 	}
 	unlock_page(page);
 	return 0;
diff --git a/fs/nfs/direct.c b/fs/nfs/direct.c
index 33824a0a57bf..9ef96c7a5826 100644
--- a/fs/nfs/direct.c
+++ b/fs/nfs/direct.c
@@ -758,7 +758,7 @@  static void nfs_direct_write_schedule_work(struct work_struct *work)
 			nfs_direct_write_reschedule(dreq);
 			break;
 		default:
-			nfs_zap_mapping(dreq->inode, dreq->inode->i_mapping);
+			nfs_zap_mapping(dreq->inode);
 			nfs_direct_complete(dreq);
 	}
 }
diff --git a/fs/nfs/inode.c b/fs/nfs/inode.c
index 094775ea0781..e56e05984a01 100644
--- a/fs/nfs/inode.c
+++ b/fs/nfs/inode.c
@@ -247,9 +247,9 @@  void nfs_zap_caches(struct inode *inode)
 	spin_unlock(&inode->i_lock);
 }
 
-void nfs_zap_mapping(struct inode *inode, struct address_space *mapping)
+void nfs_zap_mapping(struct inode *inode)
 {
-	if (mapping->nrpages != 0) {
+	if (inode->i_mapping->nrpages != 0) {
 		spin_lock(&inode->i_lock);
 		nfs_set_cache_invalid(inode, NFS_INO_INVALID_DATA);
 		spin_unlock(&inode->i_lock);
diff --git a/fs/nfs/write.c b/fs/nfs/write.c
index f12cb31a41e5..609bea7e2434 100644
--- a/fs/nfs/write.c
+++ b/fs/nfs/write.c
@@ -237,12 +237,6 @@  static void nfs_grow_file(struct page *page, unsigned int offset, unsigned int c
 	spin_unlock(&inode->i_lock);
 }
 
-/* A writeback failed: mark the page as bad, and invalidate the page cache */
-static void nfs_set_pageerror(struct page *page)
-{
-	nfs_zap_mapping(page_file_mapping(page)->host, page_file_mapping(page));
-}
-
 /*
  * nfs_page_group_search_locked
  * @head - head request of page group
@@ -994,7 +988,7 @@  static void nfs_write_completion(struct nfs_pgio_header *hdr)
 		nfs_list_remove_request(req);
 		if (test_bit(NFS_IOHDR_ERROR, &hdr->flags) &&
 		    (hdr->good_bytes < bytes)) {
-			nfs_set_pageerror(req->wb_page);
+			nfs_zap_mapping(hdr->inode);
 			nfs_context_set_write_error(req->wb_context, hdr->error);
 			goto remove_req;
 		}
@@ -1366,7 +1360,7 @@  int nfs_updatepage(struct file *file, struct page *page,
 
 	status = nfs_writepage_setup(ctx, page, offset, count);
 	if (status < 0)
-		nfs_set_pageerror(page);
+		nfs_zap_mapping(file_inode(file));
 	else
 		__set_page_dirty_nobuffers(page);
 out:
diff --git a/include/linux/nfs_fs.h b/include/linux/nfs_fs.h
index 40e30376130b..6afd64d140f2 100644
--- a/include/linux/nfs_fs.h
+++ b/include/linux/nfs_fs.h
@@ -364,7 +364,7 @@  static inline int nfs_verify_change_attribute(struct inode *dir, unsigned long c
  * linux/fs/nfs/inode.c
  */
 extern int nfs_sync_mapping(struct address_space *mapping);
-extern void nfs_zap_mapping(struct inode *inode, struct address_space *mapping);
+extern void nfs_zap_mapping(struct inode *inode);
 extern void nfs_zap_caches(struct inode *);
 extern void nfs_invalidate_atime(struct inode *);
 extern struct inode *nfs_fhget(struct super_block *, struct nfs_fh *,