[stable,6.6,and,6.7] NFS: Fix data corruption caused by congestion.

Message ID 170907621128.24797.4390391329078744015@noble.neil.brown.name (mailing list archive)
State New

Commit Message

NeilBrown Feb. 27, 2024, 11:23 p.m. UTC
When AOP_WRITEPAGE_ACTIVATE is returned (as NFS does when it detects
congestion) it is important that the folio is redirtied.
nfs_writepage_locked() doesn't do this, so files can become corrupted as
writes can be lost.

Note that this is not needed in v6.8 as AOP_WRITEPAGE_ACTIVATE cannot be
returned.  It is needed for kernels v5.18..v6.7.  Prior to 6.3 the patch
is different as it needs to mention "page", not "folio".

Reported-and-tested-by: Jacek Tomaka <Jacek.Tomaka@poczta.fm>
Fixes: 6df25e58532b ("nfs: remove reliance on bdi congestion")
Signed-off-by: NeilBrown <neilb@suse.de>
---
 fs/nfs/write.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)
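
Editorial illustration (not part of the patch): the failure mode described
in the commit message can be sketched with a small user-space model. It
assumes, as the kernel's writeback callers do, that the dirty flag is
cleared before ->writepage is invoked. Every toy_* name below is
hypothetical, the constant is an arbitrary stand-in for
AOP_WRITEPAGE_ACTIVATE, and none of this is kernel code.

```c
/* User-space toy model only; all toy_* names are hypothetical. */
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define TOY_WRITEPAGE_ACTIVATE 1 /* arbitrary stand-in for AOP_WRITEPAGE_ACTIVATE */

enum toy_sync_mode { TOY_SYNC_NONE, TOY_SYNC_ALL };

struct toy_wbc { enum toy_sync_mode sync_mode; };

struct toy_folio {
	bool dirty;     /* contents still need to be written back */
	bool on_server; /* contents have reached the server */
};

typedef int (*toy_writepage_fn)(struct toy_folio *f,
				const struct toy_wbc *wbc, bool congested);

/* Pre-patch behaviour: bail out on congestion without redirtying. */
static int toy_writepage_buggy(struct toy_folio *f,
			       const struct toy_wbc *wbc, bool congested)
{
	assert(f != NULL && wbc != NULL);
	if (wbc->sync_mode == TOY_SYNC_NONE && congested)
		return TOY_WRITEPAGE_ACTIVATE;
	f->on_server = true;
	return 0;
}

/* Patched behaviour: mark the folio dirty again before bailing out. */
static int toy_writepage_fixed(struct toy_folio *f,
			       const struct toy_wbc *wbc, bool congested)
{
	assert(f != NULL && wbc != NULL);
	if (wbc->sync_mode == TOY_SYNC_NONE && congested) {
		f->dirty = true; /* models folio_redirty_for_writepage() */
		return TOY_WRITEPAGE_ACTIVATE;
	}
	f->on_server = true;
	return 0;
}

/* Models the caller: clear the dirty flag, then call ->writepage. */
static int toy_writeback_one(struct toy_folio *f, const struct toy_wbc *wbc,
			     bool congested, toy_writepage_fn writepage)
{
	if (!f->dirty)
		return 0;
	f->dirty = false;
	return writepage(f, wbc, congested);
}

/* A folio that is neither dirty nor on the server will never be written. */
static bool toy_data_lost(const struct toy_folio *f)
{
	return !f->dirty && !f->on_server;
}
```

In this model, a dirty folio passed through toy_writeback_one() with
TOY_SYNC_NONE under congestion ends up clean and unwritten with the buggy
variant, while the fixed variant leaves it dirty so that a later writeback
pass can retry it.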

Comments

Jeffrey Layton March 6, 2024, 1:42 p.m. UTC | #1
On Wed, 2024-02-28 at 10:23 +1100, NeilBrown wrote:
> when AOP_WRITEPAGE_ACTIVATE is returned (as NFS does when it detects
> congestion) it is important that the folio is redirtied.
> nfs_writepage_locked() doesn't do this, so files can become corrupted as
> writes can be lost.
> 
> Note that this is not needed in v6.8 as AOP_WRITEPAGE_ACTIVATE cannot be
> returned.  It is needed for kernels v5.18..v6.7.  Prior to 6.3 the patch
> is different as it needs to mention "page", not "folio".
> 

Neil, I have a question about the above statement. In Linus's tree as of
this morning (v6.8-rc7-ish), it does this in nfs_writepage_locked:

        if (wbc->sync_mode == WB_SYNC_NONE &&
            NFS_SERVER(inode)->write_congested)           
                return AOP_WRITEPAGE_ACTIVATE;

The only caller of nfs_writepages_locked, and I don't see where it
redirties the page. Why don't we need this in v6.8?


Jeffrey Layton March 6, 2024, 5:12 p.m. UTC | #2
On Wed, 2024-03-06 at 08:42 -0500, Jeff Layton wrote:
> On Wed, 2024-02-28 at 10:23 +1100, NeilBrown wrote:
> > when AOP_WRITEPAGE_ACTIVATE is returned (as NFS does when it detects
> > congestion) it is important that the folio is redirtied.
> > nfs_writepage_locked() doesn't do this, so files can become corrupted as
> > writes can be lost.
> > 
> > Note that this is not needed in v6.8 as AOP_WRITEPAGE_ACTIVATE cannot be
> > returned.  It is needed for kernels v5.18..v6.7.  Prior to 6.3 the patch
> > is different as it needs to mention "page", not "folio".
> > 
> 
> Neil, I have a question about the above statement. In Linus's tree as of
> this morning (v6.8-rc7-ish), it does this in nfs_writepages_locked:
> 
>         if (wbc->sync_mode == WB_SYNC_NONE &&
>             NFS_SERVER(inode)->write_congested)           
>                 return AOP_WRITEPAGE_ACTIVATE;
> 

Sorry, I meant to say:

The only caller of nfs_writepage_locked is nfs_wb_folio, and I don't
see where it redirties the folio. Why don't we need this in v6.8?


NeilBrown March 7, 2024, 11:41 a.m. UTC | #3
On Thu, 07 Mar 2024, Jeff Layton wrote:
> On Wed, 2024-02-28 at 10:23 +1100, NeilBrown wrote:
> > when AOP_WRITEPAGE_ACTIVATE is returned (as NFS does when it detects
> > congestion) it is important that the folio is redirtied.
> > nfs_writepage_locked() doesn't do this, so files can become corrupted as
> > writes can be lost.
> > 
> > Note that this is not needed in v6.8 as AOP_WRITEPAGE_ACTIVATE cannot be
> > returned.  It is needed for kernels v5.18..v6.7.  Prior to 6.3 the patch
> > is different as it needs to mention "page", not "folio".
> > 
> 
> Neil, I have a question about the above statement. In Linus's tree as of
> this morning (v6.8-rc7-ish), it does this in nfs_writepages_locked:
> 
>         if (wbc->sync_mode == WB_SYNC_NONE &&
>             NFS_SERVER(inode)->write_congested)           
>                 return AOP_WRITEPAGE_ACTIVATE;
> 
> The only caller of nfs_writepages_locked, and I don't see where it
> redirties the page. Why don't we need this in v6.8?

You are right - it doesn't redirty anything.  But there is no bug
here....
I didn't see it at first either, but the only caller of
nfs_writepage_locked() is nfs_wb_folio() (as you say) and that always
passes a wbc with .sync_mode = WB_SYNC_ALL.  So sync_mode is never
WB_SYNC_NONE and the code snippet you included above is dead code.  I've
already posted a patch to Trond and Anna to remove that code.

Thanks for the review!

NeilBrown
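
Editorial illustration (not part of the thread): the dead-code argument
above can be sketched in user space. The model assumes only what the reply
states, namely that the sole caller always passes a wbc with sync_mode set
to WB_SYNC_ALL. All toy_* names are hypothetical and the constant is an
arbitrary stand-in for AOP_WRITEPAGE_ACTIVATE.

```c
/* User-space toy model only; all toy_* names are hypothetical. */
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define TOY_WRITEPAGE_ACTIVATE 1 /* arbitrary stand-in for AOP_WRITEPAGE_ACTIVATE */

enum toy_sync_mode { TOY_SYNC_NONE, TOY_SYNC_ALL };

struct toy_wbc { enum toy_sync_mode sync_mode; };

/* Models the congestion check quoted in the thread. */
static int toy_writepage_locked(const struct toy_wbc *wbc, bool congested)
{
	assert(wbc != NULL);
	if (wbc->sync_mode == TOY_SYNC_NONE && congested)
		return TOY_WRITEPAGE_ACTIVATE; /* unreachable via toy_wb_folio() */
	return 0; /* the write proceeds */
}

/* Models the only caller: sync_mode is hard-wired to the SYNC_ALL mode. */
static int toy_wb_folio(bool congested)
{
	struct toy_wbc wbc = { .sync_mode = TOY_SYNC_ALL };

	return toy_writepage_locked(&wbc, congested);
}
```

Whatever the congestion state, toy_wb_folio() returns 0 in this model, so
the early-return branch is never taken from this call path.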

Jeffrey Layton March 7, 2024, 12:30 p.m. UTC | #4
On Thu, 2024-03-07 at 22:41 +1100, NeilBrown wrote:
> On Thu, 07 Mar 2024, Jeff Layton wrote:
> > On Wed, 2024-02-28 at 10:23 +1100, NeilBrown wrote:
> > > when AOP_WRITEPAGE_ACTIVATE is returned (as NFS does when it detects
> > > congestion) it is important that the folio is redirtied.
> > > nfs_writepage_locked() doesn't do this, so files can become corrupted as
> > > writes can be lost.
> > > 
> > > Note that this is not needed in v6.8 as AOP_WRITEPAGE_ACTIVATE cannot be
> > > returned.  It is needed for kernels v5.18..v6.7.  Prior to 6.3 the patch
> > > is different as it needs to mention "page", not "folio".
> > > 
> > 
> > Neil, I have a question about the above statement. In Linus's tree as of
> > this morning (v6.8-rc7-ish), it does this in nfs_writepages_locked:
> > 
> >         if (wbc->sync_mode == WB_SYNC_NONE &&
> >             NFS_SERVER(inode)->write_congested)           
> >                 return AOP_WRITEPAGE_ACTIVATE;
> > 
> > The only caller of nfs_writepages_locked, and I don't see where it
> > redirties the page. Why don't we need this in v6.8?
> 
> You are right - it doesn't redirty anything.  But there is no bug
> here....
> I didn't see it at first either, but the only caller of
> nfs_writepage_locked() is nfs_wb_folio() (as you say) and that always
> passes a wbc with .sync_mode = WB_SYNC_ALL.  So sync_mode is never
> WB_SYNC_NONE and the code snippet you included above is dead code.  I've
> already posted a patch to Trond and Anna to remove that code.
> 
> Thanks for the review!
> 

Thanks Neil,

I missed that bit about the sync_mode. I sent a R-b for your other patch
too.

Cheers,
Jeff


Patch

diff --git a/fs/nfs/write.c b/fs/nfs/write.c
index b664caea8b4e..9e345d3c305a 100644
--- a/fs/nfs/write.c
+++ b/fs/nfs/write.c
@@ -668,8 +668,10 @@  static int nfs_writepage_locked(struct folio *folio,
 	int err;
 
 	if (wbc->sync_mode == WB_SYNC_NONE &&
-	    NFS_SERVER(inode)->write_congested)
+	    NFS_SERVER(inode)->write_congested) {
+		folio_redirty_for_writepage(wbc, folio);
 		return AOP_WRITEPAGE_ACTIVATE;
+	}
 
 	nfs_inc_stats(inode, NFSIOS_VFSWRITEPAGE);
 	nfs_pageio_init_write(&pgio, inode, 0, false,