diff mbox series

[08/11] cifs: Remove call to filemap_check_wb_err()

Message ID 20230109051823.480289-9-willy@infradead.org (mailing list archive)
State New, archived
Headers show
Series Remove AS_EIO and AS_ENOSPC | expand

Commit Message

Matthew Wilcox Jan. 9, 2023, 5:18 a.m. UTC
filemap_write_and_wait() now calls filemap_check_wb_err(), so we cannot
glean any additional information by calling it ourselves.  It may also
be misleading as it will pick up on any errors since the beginning of
time which may well be since before this program opened the file.

Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
---
 fs/cifs/file.c | 8 +++-----
 1 file changed, 3 insertions(+), 5 deletions(-)

Comments

Jeff Layton Jan. 9, 2023, 2:42 p.m. UTC | #1
On Mon, 2023-01-09 at 05:18 +0000, Matthew Wilcox (Oracle) wrote:
> filemap_write_and_wait() now calls filemap_check_wb_err(), so we cannot
> glean any additional information by calling it ourselves.  It may also
> be misleading as it will pick up on any errors since the beginning of
> time which may well be since before this program opened the file.
> 
> Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
> ---
>  fs/cifs/file.c | 8 +++-----
>  1 file changed, 3 insertions(+), 5 deletions(-)
> 
> diff --git a/fs/cifs/file.c b/fs/cifs/file.c
> index 22dfc1f8b4f1..7e7ee26cf77d 100644
> --- a/fs/cifs/file.c
> +++ b/fs/cifs/file.c
> @@ -3042,14 +3042,12 @@ int cifs_flush(struct file *file, fl_owner_t id)
>  	int rc = 0;
>  
>  	if (file->f_mode & FMODE_WRITE)
> -		rc = filemap_write_and_wait(inode->i_mapping);
> +		rc = filemap_write_and_wait(file->f_mapping);

If we're calling ->flush, then the file is being closed. Should this
just be?
		rc = file_write_and_wait(file);

It's not like we need to worry about corrupting ->f_wb_err at that
point.

>  
>  	cifs_dbg(FYI, "Flush inode %p file %p rc %d\n", inode, file rc);
> -	if (rc) {
> -		/* get more nuanced writeback errors */
> -		rc = filemap_check_wb_err(file->f_mapping, 0);
> +	if (rc)
>  		trace_cifs_flush_err(inode->i_ino, rc);
> -	}
> +
>  	return rc;
>  }
>
Matthew Wilcox Jan. 9, 2023, 3:07 p.m. UTC | #2
On Mon, Jan 09, 2023 at 09:42:36AM -0500, Jeff Layton wrote:
> On Mon, 2023-01-09 at 05:18 +0000, Matthew Wilcox (Oracle) wrote:
> > filemap_write_and_wait() now calls filemap_check_wb_err(), so we cannot
> > glean any additional information by calling it ourselves.  It may also
> > be misleading as it will pick up on any errors since the beginning of
> > time which may well be since before this program opened the file.
> > 
> > Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
> > ---
> >  fs/cifs/file.c | 8 +++-----
> >  1 file changed, 3 insertions(+), 5 deletions(-)
> > 
> > diff --git a/fs/cifs/file.c b/fs/cifs/file.c
> > index 22dfc1f8b4f1..7e7ee26cf77d 100644
> > --- a/fs/cifs/file.c
> > +++ b/fs/cifs/file.c
> > @@ -3042,14 +3042,12 @@ int cifs_flush(struct file *file, fl_owner_t id)
> >  	int rc = 0;
> >  
> >  	if (file->f_mode & FMODE_WRITE)
> > -		rc = filemap_write_and_wait(inode->i_mapping);
> > +		rc = filemap_write_and_wait(file->f_mapping);
> 
> If we're calling ->flush, then the file is being closed. Should this
> just be?
> 		rc = file_write_and_wait(file);
> 
> It's not like we need to worry about corrupting ->f_wb_err at that
> point.

Yes, I think you're right, and then this is a standalone patch that can
go in this cycle, perhaps.
Jeff Layton Jan. 9, 2023, 3:14 p.m. UTC | #3
On Mon, 2023-01-09 at 09:42 -0500, Jeff Layton wrote:
> On Mon, 2023-01-09 at 05:18 +0000, Matthew Wilcox (Oracle) wrote:
> > filemap_write_and_wait() now calls filemap_check_wb_err(), so we cannot
> > glean any additional information by calling it ourselves.  It may also
> > be misleading as it will pick up on any errors since the beginning of
> > time which may well be since before this program opened the file.
> > 
> > Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
> > ---
> >  fs/cifs/file.c | 8 +++-----
> >  1 file changed, 3 insertions(+), 5 deletions(-)
> > 
> > diff --git a/fs/cifs/file.c b/fs/cifs/file.c
> > index 22dfc1f8b4f1..7e7ee26cf77d 100644
> > --- a/fs/cifs/file.c
> > +++ b/fs/cifs/file.c
> > @@ -3042,14 +3042,12 @@ int cifs_flush(struct file *file, fl_owner_t id)
> >  	int rc = 0;
> >  
> >  	if (file->f_mode & FMODE_WRITE)
> > -		rc = filemap_write_and_wait(inode->i_mapping);
> > +		rc = filemap_write_and_wait(file->f_mapping);
> 
> If we're calling ->flush, then the file is being closed. Should this
> just be?
> 		rc = file_write_and_wait(file);
> 
> It's not like we need to worry about corrupting ->f_wb_err at that
> point.
> 

OTOH, I suppose it is possible for there to be racing fsync syscall with
a filp_close, and in that case advancing the f_wb_err might be a bad
idea, particularly since a lot of places ignore the return from
filp_close. It's probably best to _not_ advance the f_wb_err on ->flush
calls.

That said...wonder if we ought to consider making filp_close and ->flush
void return functions. There's no POSIX requirement to flush all of the
data on close(), so an application really shouldn't rely on seeing
writeback errors returned there since it's not reliable.

If you care about writeback errors, you have to call fsync -- full stop.

> >  
> >  	cifs_dbg(FYI, "Flush inode %p file %p rc %d\n", inode, file rc);
> > -	if (rc) {
> > -		/* get more nuanced writeback errors */
> > -		rc = filemap_check_wb_err(file->f_mapping, 0);
> > +	if (rc)
> >  		trace_cifs_flush_err(inode->i_ino, rc);
> > -	}
> > +
> >  	return rc;
> >  }
> >  
>
Matthew Wilcox Jan. 9, 2023, 3:30 p.m. UTC | #4
On Mon, Jan 09, 2023 at 10:14:12AM -0500, Jeff Layton wrote:
> On Mon, 2023-01-09 at 09:42 -0500, Jeff Layton wrote:
> > On Mon, 2023-01-09 at 05:18 +0000, Matthew Wilcox (Oracle) wrote:
> > > filemap_write_and_wait() now calls filemap_check_wb_err(), so we cannot
> > > glean any additional information by calling it ourselves.  It may also
> > > be misleading as it will pick up on any errors since the beginning of
> > > time which may well be since before this program opened the file.
> > > 
> > > Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
> > > ---
> > >  fs/cifs/file.c | 8 +++-----
> > >  1 file changed, 3 insertions(+), 5 deletions(-)
> > > 
> > > diff --git a/fs/cifs/file.c b/fs/cifs/file.c
> > > index 22dfc1f8b4f1..7e7ee26cf77d 100644
> > > --- a/fs/cifs/file.c
> > > +++ b/fs/cifs/file.c
> > > @@ -3042,14 +3042,12 @@ int cifs_flush(struct file *file, fl_owner_t id)
> > >  	int rc = 0;
> > >  
> > >  	if (file->f_mode & FMODE_WRITE)
> > > -		rc = filemap_write_and_wait(inode->i_mapping);
> > > +		rc = filemap_write_and_wait(file->f_mapping);
> > 
> > If we're calling ->flush, then the file is being closed. Should this
> > just be?
> > 		rc = file_write_and_wait(file);
> > 
> > It's not like we need to worry about corrupting ->f_wb_err at that
> > point.
> > 
> 
> OTOH, I suppose it is possible for there to be racing fsync syscall with
> a filp_close, and in that case advancing the f_wb_err might be a bad
> idea, particularly since a lot of places ignore the return from
> filp_close. It's probably best to _not_ advance the f_wb_err on ->flush
> calls.

There's only so much we can do to protect an application from itself.
If it's racing an fsync() against close(), it might get an EBADF from
fsync(), or end up fsyncing entirely the wrong file due to a close();
open(); associating the fd with a different file.

> That said...wonder if we ought to consider making filp_close and ->flush
> void return functions. There's no POSIX requirement to flush all of the
> data on close(), so an application really shouldn't rely on seeing
> writeback errors returned there since it's not reliable.
> 
> If you care about writeback errors, you have to call fsync -- full stop.

Yes, most filesystems do not writeback dirty data on close().
Applications can't depend on that behaviour.  Interestingly, if you read
https://pubs.opengroup.org/onlinepubs/9699919799/functions/close.html
really carefully, it says:

   If an I/O error occurred while reading from or writing to the file
   system during close(), it may return -1 with errno set to [EIO];
   if this error is returned, the state of fildes is unspecified.

So if we return an error, userspace doesn't know if this fd is still
open or not!  This feels like poor underspecification on POSIX's part
(and probably stems from a historical unwillingness to declare any
vendor's implementation as "not Unix").
Jeff Layton Jan. 9, 2023, 3:43 p.m. UTC | #5
On Mon, 2023-01-09 at 15:30 +0000, Matthew Wilcox wrote:
> On Mon, Jan 09, 2023 at 10:14:12AM -0500, Jeff Layton wrote:
> > On Mon, 2023-01-09 at 09:42 -0500, Jeff Layton wrote:
> > > On Mon, 2023-01-09 at 05:18 +0000, Matthew Wilcox (Oracle) wrote:
> > > > filemap_write_and_wait() now calls filemap_check_wb_err(), so we cannot
> > > > glean any additional information by calling it ourselves.  It may also
> > > > be misleading as it will pick up on any errors since the beginning of
> > > > time which may well be since before this program opened the file.
> > > > 
> > > > Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
> > > > ---
> > > >  fs/cifs/file.c | 8 +++-----
> > > >  1 file changed, 3 insertions(+), 5 deletions(-)
> > > > 
> > > > diff --git a/fs/cifs/file.c b/fs/cifs/file.c
> > > > index 22dfc1f8b4f1..7e7ee26cf77d 100644
> > > > --- a/fs/cifs/file.c
> > > > +++ b/fs/cifs/file.c
> > > > @@ -3042,14 +3042,12 @@ int cifs_flush(struct file *file, fl_owner_t id)
> > > >  	int rc = 0;
> > > >  
> > > >  	if (file->f_mode & FMODE_WRITE)
> > > > -		rc = filemap_write_and_wait(inode->i_mapping);
> > > > +		rc = filemap_write_and_wait(file->f_mapping);
> > > 
> > > If we're calling ->flush, then the file is being closed. Should this
> > > just be?
> > > 		rc = file_write_and_wait(file);
> > > 
> > > It's not like we need to worry about corrupting ->f_wb_err at that
> > > point.
> > > 
> > 
> > OTOH, I suppose it is possible for there to be racing fsync syscall with
> > a filp_close, and in that case advancing the f_wb_err might be a bad
> > idea, particularly since a lot of places ignore the return from
> > filp_close. It's probably best to _not_ advance the f_wb_err on ->flush
> > calls.
> 
> There's only so much we can do to protect an application from itself.
> If it's racing an fsync() against close(), it might get an EBADF from
> fsync(), or end up fsyncing entirely the wrong file due to a close();
> open(); associating the fd with a different file.
> 

close() is not the worry, as it does return the error from ->flush. It's
the other callers of filp_close that concern me. A lot of those are
dealing with special "internal" struct files, but not all of them.

There's no benefit to advancing f_wb_err in ->flush, so I think we ought
to just avoid it. I think we don't even want to mark it SEEN in that
case either, so filemap_check_wb_err ought to be fine.

> > That said...wonder if we ought to consider making filp_close and ->flush
> > void return functions. There's no POSIX requirement to flush all of the
> > data on close(), so an application really shouldn't rely on seeing
> > writeback errors returned there since it's not reliable.
> > 
> > If you care about writeback errors, you have to call fsync -- full stop.
> 
> Yes, most filesystems do not writeback dirty data on close().
> Applications can't depend on that behaviour.  Interestingly, if you read
> https://pubs.opengroup.org/onlinepubs/9699919799/functions/close.html
> really carefully, it says:
> 
>    If an I/O error occurred while reading from or writing to the file
>    system during close(), it may return -1 with errno set to [EIO];
>    if this error is returned, the state of fildes is unspecified.
> 
> So if we return an error, userspace doesn't know if this fd is still
> open or not!  This feels like poor underspecification on POSIX's part
> (and probably stems from a historical unwillingness to declare any
> vendor's implementation as "not Unix").
> 

It's a mess. On Linux we even tear down the fd on EINTR and EAGAIN, so
retrying a close() on failure will always get you a EBADF. The only sane
thing for userland to do is to just ignore errors on close (aside from
maybe EBADF).
diff mbox series

Patch

diff --git a/fs/cifs/file.c b/fs/cifs/file.c
index 22dfc1f8b4f1..7e7ee26cf77d 100644
--- a/fs/cifs/file.c
+++ b/fs/cifs/file.c
@@ -3042,14 +3042,12 @@  int cifs_flush(struct file *file, fl_owner_t id)
 	int rc = 0;
 
 	if (file->f_mode & FMODE_WRITE)
-		rc = filemap_write_and_wait(inode->i_mapping);
+		rc = filemap_write_and_wait(file->f_mapping);
 
 	cifs_dbg(FYI, "Flush inode %p file %p rc %d\n", inode, file, rc);
-	if (rc) {
-		/* get more nuanced writeback errors */
-		rc = filemap_check_wb_err(file->f_mapping, 0);
+	if (rc)
 		trace_cifs_flush_err(inode->i_ino, rc);
-	}
+
 	return rc;
 }