[v2,3/9] mm: clear any AS_* errors when returning error on any fsync or close

Message ID 20170308162934.21989-4-jlayton@redhat.com (mailing list archive)
State New, archived

Commit Message

Jeff Layton March 8, 2017, 4:29 p.m. UTC
Currently we don't clear the address space error when there is a -EIO
error on fsynci, due to writeback initiation failure. If writes fail
with -EIO and the mapping is flagged with an AS_EIO or AS_ENOSPC error,
then we can end up returning errors on two fsync calls, even when a
write between them succeeded (or there was no write).

Ensure that we also clear out any mapping errors when initiating
writeback fails with -EIO in filemap_write_and_wait and
filemap_write_and_wait_range.

Suggested-by: Jan Kara <jack@suse.cz>
Signed-off-by: Jeff Layton <jlayton@redhat.com>
---
 mm/filemap.c | 20 ++++++++++++++++++--
 1 file changed, 18 insertions(+), 2 deletions(-)
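
The double-report scenario in the commit message can be modeled in user space. This is only a sketch under stated assumptions: `struct model_mapping`, `model_check_errors()` and `model_write_and_wait()` are hypothetical stand-ins rather than kernel code, the wait step is reduced to its test-and-clear of the mapping flags, and the `clear_on_eio` parameter exists only to toggle between the pre-patch and proposed flows.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical stand-in for the AS_EIO / AS_ENOSPC bits in mapping->flags. */
struct model_mapping {
	bool as_eio;
	bool as_enospc;
	bool initiation_fails;	/* next writeback initiation returns -EIO */
};

/* Test-and-clear in the manner of filemap_check_errors(): -EIO wins. */
static int model_check_errors(struct model_mapping *m)
{
	int ret = 0;

	if (m->as_enospc) {
		m->as_enospc = false;
		ret = -ENOSPC;
	}
	if (m->as_eio) {
		m->as_eio = false;
		ret = -EIO;
	}
	return ret;
}

/*
 * Model of filemap_write_and_wait(). With clear_on_eio == false this follows
 * the pre-patch flow; with true, the flow the patch proposes.
 */
static int model_write_and_wait(struct model_mapping *m, bool clear_on_eio)
{
	int err = m->initiation_fails ? -EIO : 0;

	m->initiation_fails = false;	/* a one-off failure in this model */
	if (err != -EIO) {
		/* The wait step reports and clears any flagged error. */
		int err2 = model_check_errors(m);

		if (!err)
			err = err2;
	} else if (clear_on_eio) {
		/* Discard the result: -EIO is already being returned. */
		model_check_errors(m);
	}
	return err;
}
```

With `clear_on_eio` false, a mapping that has `as_eio` set and one failed initiation returns -EIO from two consecutive calls, the second one coming from the stale flag. With it true, the second call returns 0.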

Comments

NeilBrown March 8, 2017, 9:23 p.m. UTC | #1
On Thu, Mar 09 2017, Jeff Layton wrote:

> Currently we don't clear the address space error when there is a -EIO
> error on fsynci, due to writeback initiation failure. If writes fail
> with -EIO and the mapping is flagged with an AS_EIO or AS_ENOSPC error,
> then we can end up returning errors on two fsync calls, even when a
> write between them succeeded (or there was no write).
>
> Ensure that we also clear out any mapping errors when initiating
> writeback fails with -EIO in filemap_write_and_wait and
> filemap_write_and_wait_range.

This change appears to assume that filemap_write_and_wait* is only
called from fsync() (or similar) and the return status is always
checked.

A __must_check annotation might be helpful.

It would catch v9fs_file_lock(), afs_setattr() and others.
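
As a rough illustration of what such an annotation buys, here is a user-space sketch. The macro `DEMO_MUST_CHECK` and the functions `demo_write_and_wait()` and `demo_setattr_checked()` are hypothetical names; the only real feature relied on is the `warn_unused_result` attribute supported by GCC and Clang, which is what the kernel's `__must_check` is built on.

```c
#include <errno.h>

/* User-space analogue of __must_check, using the compiler attribute directly. */
#define DEMO_MUST_CHECK __attribute__((warn_unused_result))

/* Hypothetical stand-in for filemap_write_and_wait(); not the kernel function. */
DEMO_MUST_CHECK
static int demo_write_and_wait(int pending_error)
{
	return pending_error;
}

/* A caller that propagates the status compiles without a warning. */
static int demo_setattr_checked(int pending_error)
{
	int err = demo_write_and_wait(pending_error);

	if (err)
		return err;
	/* ... the attribute change itself would go here ... */
	return 0;
}

/*
 * By contrast, a bare statement such as
 *
 *	demo_write_and_wait(-EIO);
 *
 * makes the compiler warn that the return value is being ignored, which is
 * how unchecked callers would be flagged at build time.
 */
```

The annotation only finds callers that drop the value entirely. A caller that stores the result and then never acts on it would still need review by hand.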

While I think your change is probably heading in the right direction,
there seem to be some loose ends still.

Thanks,
NeilBrown


Jeff Layton March 9, 2017, 12:10 a.m. UTC | #2
On Thu, 2017-03-09 at 08:23 +1100, NeilBrown wrote:
> On Thu, Mar 09 2017, Jeff Layton wrote:
> 
> > Currently we don't clear the address space error when there is a -EIO
> > error on fsynci, due to writeback initiation failure. If writes fail
> > with -EIO and the mapping is flagged with an AS_EIO or AS_ENOSPC error,
> > then we can end up returning errors on two fsync calls, even when a
> > write between them succeeded (or there was no write).
> > 
> > Ensure that we also clear out any mapping errors when initiating
> > writeback fails with -EIO in filemap_write_and_wait and
> > filemap_write_and_wait_range.
> 
> This change appears to assume that filemap_write_and_wait* is only
> called from fsync() (or similar) and the return status is always
> checked.
> 
> A __must_check annotation might be helpful.
> 

Yes, good idea.

> It would catch v9fs_file_lock(), afs_setattr() and others.
> 

Ouch -- good catch.

Actually, those look like bugs in the code as it exists today. If some
background page writeback fails, but no write initiation fails on that
call, then those callers are discarding errors that should have been
reported at fsync.

> While I think your change is probably heading in the right direction,
> there seem to be some loose ends still.
> 

Yes...I probably should be prefacing all of these patches with [RFC] at
this point.

I think I'm starting to grasp the problem (and its scope), but we might
have to think about how to approach this more strategically. Given that
we have this wrong in so many places, I think that probably means that
the interfaces we have make it easy to do so. I need to consider how to
correct that.

Ross Zwisler March 10, 2017, 12:09 a.m. UTC | #3
On Wed, Mar 08, 2017 at 11:29:28AM -0500, Jeff Layton wrote:
> Currently we don't clear the address space error when there is a -EIO
> error on fsynci, due to writeback initiation failure. If writes fail
	   fsync

> with -EIO and the mapping is flagged with an AS_EIO or AS_ENOSPC error,
> then we can end up returning errors on two fsync calls, even when a
> write between them succeeded (or there was no write).
> 
> Ensure that we also clear out any mapping errors when initiating
> writeback fails with -EIO in filemap_write_and_wait and
> filemap_write_and_wait_range.
> 
> Suggested-by: Jan Kara <jack@suse.cz>
> Signed-off-by: Jeff Layton <jlayton@redhat.com>
> ---
>  mm/filemap.c | 20 ++++++++++++++++++--
>  1 file changed, 18 insertions(+), 2 deletions(-)
> 
> diff --git a/mm/filemap.c b/mm/filemap.c
> index 1694623a6289..fc123b9833e1 100644
> --- a/mm/filemap.c
> +++ b/mm/filemap.c
> @@ -488,7 +488,7 @@ EXPORT_SYMBOL(filemap_fdatawait);
>  
>  int filemap_write_and_wait(struct address_space *mapping)
>  {
> -	int err = 0;
> +	int err;
>  
>  	if ((!dax_mapping(mapping) && mapping->nrpages) ||
>  	    (dax_mapping(mapping) && mapping->nrexceptional)) {
> @@ -499,10 +499,18 @@ int filemap_write_and_wait(struct address_space *mapping)
>  		 * But the -EIO is special case, it may indicate the worst
>  		 * thing (e.g. bug) happened, so we avoid waiting for it.
>  		 */
> -		if (err != -EIO) {
> +		if (likely(err != -EIO)) {

The above two cleanup changes were made only to filemap_write_and_wait(), but
should also probably be done to filemap_write_and_wait_range() to keep them as
consistent as possible?
Jeff Layton March 10, 2017, 3:08 a.m. UTC | #4
On Thu, 2017-03-09 at 17:09 -0700, Ross Zwisler wrote:
> On Wed, Mar 08, 2017 at 11:29:28AM -0500, Jeff Layton wrote:
> > Currently we don't clear the address space error when there is a -EIO
> > error on fsynci, due to writeback initiation failure. If writes fail
> 
> 	   fsync
> 
> > with -EIO and the mapping is flagged with an AS_EIO or AS_ENOSPC error,
> > then we can end up returning errors on two fsync calls, even when a
> > write between them succeeded (or there was no write).
> > 
> > Ensure that we also clear out any mapping errors when initiating
> > writeback fails with -EIO in filemap_write_and_wait and
> > filemap_write_and_wait_range.
> > 
> > Suggested-by: Jan Kara <jack@suse.cz>
> > Signed-off-by: Jeff Layton <jlayton@redhat.com>
> > ---
> >  mm/filemap.c | 20 ++++++++++++++++++--
> >  1 file changed, 18 insertions(+), 2 deletions(-)
> > 
> > diff --git a/mm/filemap.c b/mm/filemap.c
> > index 1694623a6289..fc123b9833e1 100644
> > --- a/mm/filemap.c
> > +++ b/mm/filemap.c
> > @@ -488,7 +488,7 @@ EXPORT_SYMBOL(filemap_fdatawait);
> >  
> >  int filemap_write_and_wait(struct address_space *mapping)
> >  {
> > -	int err = 0;
> > +	int err;
> >  
> >  	if ((!dax_mapping(mapping) && mapping->nrpages) ||
> >  	    (dax_mapping(mapping) && mapping->nrexceptional)) {
> > @@ -499,10 +499,18 @@ int filemap_write_and_wait(struct address_space *mapping)
> >  		 * But the -EIO is special case, it may indicate the worst
> >  		 * thing (e.g. bug) happened, so we avoid waiting for it.
> >  		 */
> > -		if (err != -EIO) {
> > +		if (likely(err != -EIO)) {
> 
> The above two cleanup changes were made only to filemap_write_and_wait(), but
> should also probably be done to filemap_write_and_wait_range() to keep them as
> consistent as possible?

Thanks, I fixed that in the patch in my tree. Unfortunately, as Neil
pointed out, there is a bigger problem here...

There are a lot of callers of the filemap_write_and_wait* functions
that never check the return code at all, and some others that call this
from codepaths where we can't report errors properly. Yet, the
mapping error gets cleared out anyway, which means that fsync will
probably never see it.

So while I doubt this patch will make anything worse, I think we have
to look at fixing those problems first. We need to ensure that whenever
filemap_check_errors is called, we're in a codepath where we can
actually report the error to something that can interpret it properly.
Basically, only in write, fsync, msync or close codepaths. For the
others, we need to use something like filemap_fdatawait_keep_errors so
that we don't end up dropping writeback errors onto the floor.
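
The distinction between the two kinds of codepath can be sketched in user space. Everything here is a hypothetical model (`struct wb_model` and the `wb_*` functions are made-up names, and only a single AS_EIO-like flag is tracked); it assumes nothing about the kernel beyond the behavior described above, namely that the normal wait path reports and clears the flagged error while the keep_errors variant leaves it in place.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical stand-in for the AS_EIO bit in mapping->flags. */
struct wb_model {
	bool as_eio;
};

/* Reports and clears, like a wait that ends in filemap_check_errors(). */
static int wb_wait(struct wb_model *m)
{
	int ret = m->as_eio ? -EIO : 0;

	m->as_eio = false;
	return ret;
}

/* Waits but leaves the flag set, in the spirit of filemap_fdatawait_keep_errors(). */
static void wb_wait_keep_errors(struct wb_model *m)
{
	(void)m;	/* nothing is cleared: the error stays for a later reporter */
}

/* A path that cannot report errors, written the lossy way. */
static void wb_internal_flush_lossy(struct wb_model *m)
{
	(void)wb_wait(m);	/* error consumed here and then dropped */
}

/* The same path, written so that the error survives. */
static void wb_internal_flush_safe(struct wb_model *m)
{
	wb_wait_keep_errors(m);
}

/* fsync is a reporting path, so it is allowed to consume the error. */
static int wb_fsync(struct wb_model *m)
{
	return wb_wait(m);
}
```

In this model, an fsync that follows `wb_internal_flush_lossy()` returns 0 even though writeback failed, whereas one that follows `wb_internal_flush_safe()` returns -EIO once and 0 afterwards.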

I'm going to look at fixing that up first (maybe as a preliminary
series to this one). There are a lot of callers though, and I don't see
a way around having to go and review all of these callsites
individually. Maybe it'd be best to just lift the filemap_check_errors
calls higher in the call stack to ensure that? Not sure...

Anyway...I'm first trying to collect a list of what I think needs
fixing here, and figure out how to break all of this up into manageable
pieces and order it sanely.

Patch

diff --git a/mm/filemap.c b/mm/filemap.c
index 1694623a6289..fc123b9833e1 100644
--- a/mm/filemap.c
+++ b/mm/filemap.c
@@ -488,7 +488,7 @@  EXPORT_SYMBOL(filemap_fdatawait);
 
 int filemap_write_and_wait(struct address_space *mapping)
 {
-	int err = 0;
+	int err;
 
 	if ((!dax_mapping(mapping) && mapping->nrpages) ||
 	    (dax_mapping(mapping) && mapping->nrexceptional)) {
@@ -499,10 +499,18 @@  int filemap_write_and_wait(struct address_space *mapping)
 		 * But the -EIO is special case, it may indicate the worst
 		 * thing (e.g. bug) happened, so we avoid waiting for it.
 		 */
-		if (err != -EIO) {
+		if (likely(err != -EIO)) {
 			int err2 = filemap_fdatawait(mapping);
 			if (!err)
 				err = err2;
+		} else {
+			/*
+			 * Clear the error in the address space since we're
+			 * returning an error here. -EIO takes precedence over
+			 * everything else though, so we can just discard
+			 * the return here.
+			 */
+			filemap_check_errors(mapping);
 		}
 	} else {
 		err = filemap_check_errors(mapping);
@@ -537,6 +545,14 @@  int filemap_write_and_wait_range(struct address_space *mapping,
 						lstart, lend);
 			if (!err)
 				err = err2;
+		} else {
+			/*
+			 * Clear the error in the address space since we're
+			 * returning an error here. -EIO takes precedence over
+			 * everything else though, so we can just discard
+			 * the return here.
+			 */
+			filemap_check_errors(mapping);
 		}
 	} else {
 		err = filemap_check_errors(mapping);