Message ID | 20230109051823.480289-9-willy@infradead.org (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | Remove AS_EIO and AS_ENOSPC | expand |
On Mon, 2023-01-09 at 05:18 +0000, Matthew Wilcox (Oracle) wrote: > filemap_write_and_wait() now calls filemap_check_wb_err(), so we cannot > glean any additional information by calling it ourselves. It may also > be misleading as it will pick up on any errors since the beginning of > time which may well be since before this program opened the file. > > Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org> > --- > fs/cifs/file.c | 8 +++----- > 1 file changed, 3 insertions(+), 5 deletions(-) > > diff --git a/fs/cifs/file.c b/fs/cifs/file.c > index 22dfc1f8b4f1..7e7ee26cf77d 100644 > --- a/fs/cifs/file.c > +++ b/fs/cifs/file.c > @@ -3042,14 +3042,12 @@ int cifs_flush(struct file *file, fl_owner_t id) > int rc = 0; > > if (file->f_mode & FMODE_WRITE) > - rc = filemap_write_and_wait(inode->i_mapping); > + rc = filemap_write_and_wait(file->f_mapping); If we're calling ->flush, then the file is being closed. Should this just be? rc = file_write_and_wait(file); It's not like we need to worry about corrupting ->f_wb_err at that point. > > cifs_dbg(FYI, "Flush inode %p file %p rc %d\n", inode, file rc); > - if (rc) { > - /* get more nuanced writeback errors */ > - rc = filemap_check_wb_err(file->f_mapping, 0); > + if (rc) > trace_cifs_flush_err(inode->i_ino, rc); > - } > + > return rc; > } >
On Mon, Jan 09, 2023 at 09:42:36AM -0500, Jeff Layton wrote: > On Mon, 2023-01-09 at 05:18 +0000, Matthew Wilcox (Oracle) wrote: > > filemap_write_and_wait() now calls filemap_check_wb_err(), so we cannot > > glean any additional information by calling it ourselves. It may also > > be misleading as it will pick up on any errors since the beginning of > > time which may well be since before this program opened the file. > > > > Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org> > > --- > > fs/cifs/file.c | 8 +++----- > > 1 file changed, 3 insertions(+), 5 deletions(-) > > > > diff --git a/fs/cifs/file.c b/fs/cifs/file.c > > index 22dfc1f8b4f1..7e7ee26cf77d 100644 > > --- a/fs/cifs/file.c > > +++ b/fs/cifs/file.c > > @@ -3042,14 +3042,12 @@ int cifs_flush(struct file *file, fl_owner_t id) > > int rc = 0; > > > > if (file->f_mode & FMODE_WRITE) > > - rc = filemap_write_and_wait(inode->i_mapping); > > + rc = filemap_write_and_wait(file->f_mapping); > > If we're calling ->flush, then the file is being closed. Should this > just be? > rc = file_write_and_wait(file); > > It's not like we need to worry about corrupting ->f_wb_err at that > point. Yes, I think you're right, and then this is a standalone patch that can go in this cycle, perhaps.
On Mon, 2023-01-09 at 09:42 -0500, Jeff Layton wrote: > On Mon, 2023-01-09 at 05:18 +0000, Matthew Wilcox (Oracle) wrote: > > filemap_write_and_wait() now calls filemap_check_wb_err(), so we cannot > > glean any additional information by calling it ourselves. It may also > > be misleading as it will pick up on any errors since the beginning of > > time which may well be since before this program opened the file. > > > > Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org> > > --- > > fs/cifs/file.c | 8 +++----- > > 1 file changed, 3 insertions(+), 5 deletions(-) > > > > diff --git a/fs/cifs/file.c b/fs/cifs/file.c > > index 22dfc1f8b4f1..7e7ee26cf77d 100644 > > --- a/fs/cifs/file.c > > +++ b/fs/cifs/file.c > > @@ -3042,14 +3042,12 @@ int cifs_flush(struct file *file, fl_owner_t id) > > int rc = 0; > > > > if (file->f_mode & FMODE_WRITE) > > - rc = filemap_write_and_wait(inode->i_mapping); > > + rc = filemap_write_and_wait(file->f_mapping); > > If we're calling ->flush, then the file is being closed. Should this > just be? > rc = file_write_and_wait(file); > > It's not like we need to worry about corrupting ->f_wb_err at that > point. > OTOH, I suppose it is possible for there to be racing fsync syscall with a filp_close, and in that case advancing the f_wb_err might be a bad idea, particularly since a lot of places ignore the return from filp_close. It's probably best to _not_ advance the f_wb_err on ->flush calls. That said...wonder if we ought to consider making filp_close and ->flush void return functions. There's no POSIX requirement to flush all of the data on close(), so an application really shouldn't rely on seeing writeback errors returned there since it's not reliable. If you care about writeback errors, you have to call fsync -- full stop. > > > > cifs_dbg(FYI, "Flush inode %p file %p rc %d\n", inode, file rc); > > - if (rc) { > > - /* get more nuanced writeback errors */ > > - rc = filemap_check_wb_err(file->f_mapping, 0); > > + if (rc) > > trace_cifs_flush_err(inode->i_ino, rc); > > - } > > + > > return rc; > > } > > >
On Mon, Jan 09, 2023 at 10:14:12AM -0500, Jeff Layton wrote: > On Mon, 2023-01-09 at 09:42 -0500, Jeff Layton wrote: > > On Mon, 2023-01-09 at 05:18 +0000, Matthew Wilcox (Oracle) wrote: > > > filemap_write_and_wait() now calls filemap_check_wb_err(), so we cannot > > > glean any additional information by calling it ourselves. It may also > > > be misleading as it will pick up on any errors since the beginning of > > > time which may well be since before this program opened the file. > > > > > > Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org> > > > --- > > > fs/cifs/file.c | 8 +++----- > > > 1 file changed, 3 insertions(+), 5 deletions(-) > > > > > > diff --git a/fs/cifs/file.c b/fs/cifs/file.c > > > index 22dfc1f8b4f1..7e7ee26cf77d 100644 > > > --- a/fs/cifs/file.c > > > +++ b/fs/cifs/file.c > > > @@ -3042,14 +3042,12 @@ int cifs_flush(struct file *file, fl_owner_t id) > > > int rc = 0; > > > > > > if (file->f_mode & FMODE_WRITE) > > > - rc = filemap_write_and_wait(inode->i_mapping); > > > + rc = filemap_write_and_wait(file->f_mapping); > > > > If we're calling ->flush, then the file is being closed. Should this > > just be? > > rc = file_write_and_wait(file); > > > > It's not like we need to worry about corrupting ->f_wb_err at that > > point. > > > > OTOH, I suppose it is possible for there to be racing fsync syscall with > a filp_close, and in that case advancing the f_wb_err might be a bad > idea, particularly since a lot of places ignore the return from > filp_close. It's probably best to _not_ advance the f_wb_err on ->flush > calls. There's only so much we can do to protect an application from itself. If it's racing an fsync() against close(), it might get an EBADF from fsync(), or end up fsyncing entirely the wrong file due to a close(); open(); associating the fd with a different file. > That said...wonder if we ought to consider making filp_close and ->flush > void return functions. There's no POSIX requirement to flush all of the > data on close(), so an application really shouldn't rely on seeing > writeback errors returned there since it's not reliable. > > If you care about writeback errors, you have to call fsync -- full stop. Yes, most filesystems do not writeback dirty data on close(). Applications can't depend on that behaviour. Interestingly, if you read https://pubs.opengroup.org/onlinepubs/9699919799/functions/close.html really carefully, it says: If an I/O error occurred while reading from or writing to the file system during close(), it may return -1 with errno set to [EIO]; if this error is returned, the state of fildes is unspecified. So if we return an error, userspace doesn't know if this fd is still open or not! This feels like poor underspecification on POSIX's part (and probably stems from a historical unwillingness to declare any vendor's implementation as "not Unix").
On Mon, 2023-01-09 at 15:30 +0000, Matthew Wilcox wrote: > On Mon, Jan 09, 2023 at 10:14:12AM -0500, Jeff Layton wrote: > > On Mon, 2023-01-09 at 09:42 -0500, Jeff Layton wrote: > > > On Mon, 2023-01-09 at 05:18 +0000, Matthew Wilcox (Oracle) wrote: > > > > filemap_write_and_wait() now calls filemap_check_wb_err(), so we cannot > > > > glean any additional information by calling it ourselves. It may also > > > > be misleading as it will pick up on any errors since the beginning of > > > > time which may well be since before this program opened the file. > > > > > > > > Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org> > > > > --- > > > > fs/cifs/file.c | 8 +++----- > > > > 1 file changed, 3 insertions(+), 5 deletions(-) > > > > > > > > diff --git a/fs/cifs/file.c b/fs/cifs/file.c > > > > index 22dfc1f8b4f1..7e7ee26cf77d 100644 > > > > --- a/fs/cifs/file.c > > > > +++ b/fs/cifs/file.c > > > > @@ -3042,14 +3042,12 @@ int cifs_flush(struct file *file, fl_owner_t id) > > > > int rc = 0; > > > > > > > > if (file->f_mode & FMODE_WRITE) > > > > - rc = filemap_write_and_wait(inode->i_mapping); > > > > + rc = filemap_write_and_wait(file->f_mapping); > > > > > > If we're calling ->flush, then the file is being closed. Should this > > > just be? > > > rc = file_write_and_wait(file); > > > > > > It's not like we need to worry about corrupting ->f_wb_err at that > > > point. > > > > > > > OTOH, I suppose it is possible for there to be racing fsync syscall with > > a filp_close, and in that case advancing the f_wb_err might be a bad > > idea, particularly since a lot of places ignore the return from > > filp_close. It's probably best to _not_ advance the f_wb_err on ->flush > > calls. > > There's only so much we can do to protect an application from itself. > If it's racing an fsync() against close(), it might get an EBADF from > fsync(), or end up fsyncing entirely the wrong file due to a close(); > open(); associating the fd with a different file. > close() is not the worry, as it does return the error from ->flush. It's the other callers of filp_close that concern me. A lot of those are dealing with special "internal" struct files, but not all of them. There's no benefit to advancing f_wb_err in ->flush, so I think we ought to just avoid it. I think we don't even want to mark it SEEN in that case either, so filemap_check_wb_err ought to be fine. > > That said...wonder if we ought to consider making filp_close and ->flush > > void return functions. There's no POSIX requirement to flush all of the > > data on close(), so an application really shouldn't rely on seeing > > writeback errors returned there since it's not reliable. > > > > If you care about writeback errors, you have to call fsync -- full stop. > > Yes, most filesystems do not writeback dirty data on close(). > Applications can't depend on that behaviour. Interestingly, if you read > https://pubs.opengroup.org/onlinepubs/9699919799/functions/close.html > really carefully, it says: > > If an I/O error occurred while reading from or writing to the file > system during close(), it may return -1 with errno set to [EIO]; > if this error is returned, the state of fildes is unspecified. > > So if we return an error, userspace doesn't know if this fd is still > open or not! This feels like poor underspecification on POSIX's part > (and probably stems from a historical unwillingness to declare any > vendor's implementation as "not Unix"). > It's a mess. On Linux we even tear down the fd on EINTR and EAGAIN, so retrying a close() on failure will always get you a EBADF. The only sane thing for userland to do is to just ignore errors on close (aside from maybe EBADF).
diff --git a/fs/cifs/file.c b/fs/cifs/file.c index 22dfc1f8b4f1..7e7ee26cf77d 100644 --- a/fs/cifs/file.c +++ b/fs/cifs/file.c @@ -3042,14 +3042,12 @@ int cifs_flush(struct file *file, fl_owner_t id) int rc = 0; if (file->f_mode & FMODE_WRITE) - rc = filemap_write_and_wait(inode->i_mapping); + rc = filemap_write_and_wait(file->f_mapping); cifs_dbg(FYI, "Flush inode %p file %p rc %d\n", inode, file, rc); - if (rc) { - /* get more nuanced writeback errors */ - rc = filemap_check_wb_err(file->f_mapping, 0); + if (rc) trace_cifs_flush_err(inode->i_ino, rc); - } + return rc; }
filemap_write_and_wait() now calls filemap_check_wb_err(), so we cannot glean any additional information by calling it ourselves. It may also be misleading as it will pick up on any errors since the beginning of time which may well be since before this program opened the file. Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org> --- fs/cifs/file.c | 8 +++----- 1 file changed, 3 insertions(+), 5 deletions(-)