[v2,07/17] fs: new infrastructure for writeback error handling and reporting

My apologies, this patch in particular should have gotten an updated
changelog. Here's a revised patch. The only real difference in this is
the updated changelog.

----------------------------8<--------------------------------

[PATCH] fs: new infrastructure for writeback error handling and reporting

Most filesystems currently use mapping_set_error and
filemap_check_errors for setting and reporting/clearing writeback errors
at the mapping level. filemap_check_errors is indirectly called from
most of the filemap_fdatawait_* functions and from
filemap_write_and_wait*. These functions are called from all sorts of
contexts to wait on writeback to finish: mostly in fsync, but also in
truncate calls, getattr, etc.

It's those non-fsync callers that are problematic. We should be
reporting writeback errors during fsync, but many places in the code
clear out errors before they can be properly reported, or report errors
at nonsensical times. If I get -EIO on a stat() call, there is no reason
for me to assume that it is because some previous writeback failed. The
fact that it also clears out the error such that a subsequent fsync
returns 0 is a bug, IMO, and a nasty one since that's potentially silent
data corruption.

This patch adds a small bit of new infrastructure for setting and
reporting errors during address_space writeback. While the above was my
original impetus for adding this, I think it's also the case that
current fsync semantics are just problematic for userland. Most
applications that call fsync do so to ensure that the data they wrote
has hit the backing store.

In the case where there are multiple writers to the file at the same
time, this is really hard to determine. The first one to call fsync will
see any stored error, and the rest get back 0. The processes with open
fds may not be associated with one another in any way. They could even
be in different containers, so ensuring coordination between all fsync
callers is not really an option.
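
To illustrate the problem, here is a simplified userspace model of the
current test-and-clear behavior. It is a sketch, not kernel code: the
struct and the old_* function names are hypothetical stand-ins for the
per-mapping error flags and for mapping_set_error/filemap_check_errors.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical stand-in for the error flags kept on a mapping. */
struct old_mapping {
	bool eio;
	bool enospc;
};

/* Model of recording a writeback error (err is a negative errno). */
static void old_set_error(struct old_mapping *m, int err)
{
	assert(err < 0);
	if (err == -ENOSPC)
		m->enospc = true;
	else
		m->eio = true;
}

/* Model of checking for errors: reporting an error also clears it. */
static int old_check_errors(struct old_mapping *m)
{
	int ret = 0;

	if (m->enospc) {
		m->enospc = false;
		ret = -ENOSPC;
	}
	if (m->eio) {
		m->eio = false;
		ret = -EIO;
	}
	return ret;
}
```

In this model, whichever caller checks first consumes the error, and
every later caller gets 0 regardless of which fd it holds.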

One way to remedy this would be to track what file descriptor was used
to dirty the file, but that's rather cumbersome and would likely be
slow. However, there is a simpler way to improve the semantics here
without incurring too much overhead.

This set defines a new 32-bit value (wb_err_t) that encompasses an
error code (up to MAX_ERRNO), a sequence counter and a "seen" flag.

One of these is added to struct address_space, and a corresponding one
is added to struct file. When an error is reported during writeback, we
set the error field. If the seen flag is set at that point, we also
clear it and increment the sequence counter.

On fsync we can check the file's value against what's in the mapping and
quickly return 0 if it hasn't changed. If it has changed, we'll set the
seen flag if it's not already set, update the value in the struct file
to the latest and return an error.
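
The set and check steps described above can be modeled in userspace
roughly as follows. This is only a sketch under stated assumptions: the
bit layout (12 bits of error code, one seen bit, counter above that) and
all sketch_* names are made up for illustration, and the real code must
update these values atomically, which is omitted here.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed layout, for illustration only. */
#define SKETCH_MAX_ERRNO 4095u		/* low 12 bits: error code */
#define SKETCH_ERR_MASK  SKETCH_MAX_ERRNO
#define SKETCH_SEEN      (1u << 12)	/* "seen" flag */
#define SKETCH_CTR_INC   (1u << 13)	/* counter lives above the flag */

typedef uint32_t sketch_wb_err_t;

/* Writeback side: record an error (err is a positive errno here). */
static void sketch_set_error(sketch_wb_err_t *mapping, unsigned int err)
{
	sketch_wb_err_t old, next;

	assert(err > 0 && err <= SKETCH_MAX_ERRNO);
	old = *mapping;
	/* Replace the error code and clear the seen flag. */
	next = (old & ~(SKETCH_ERR_MASK | SKETCH_SEEN)) | err;
	/* Bump the counter only if someone had seen the old value. */
	if (old & SKETCH_SEEN)
		next += SKETCH_CTR_INC;
	*mapping = next;
}

/* fsync side: returns 0 if nothing changed, else a negative errno. */
static int sketch_report_error(sketch_wb_err_t *mapping,
			       sketch_wb_err_t *file)
{
	if (*file == *mapping)
		return 0;
	*mapping |= SKETCH_SEEN;	/* mark the current value as seen */
	*file = *mapping;		/* catch this file up to the latest */
	return -(int)(*mapping & SKETCH_ERR_MASK);
}
```

With two files that both start out with a copy of the mapping's value, a
single writeback error is reported once to each of them, and a repeat of
the same error after it has been seen is reported again because the
counter has moved.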

This changes the semantics of fsync such that applications can now use
it to determine whether there were any writeback errors since fsync(fd)
was last called (or since the file was opened in the case of fsync
having never been called).

Note that those writeback errors may have occurred when writing data
that was dirtied via an entirely different fd, but that's the case now
with the current mapping_set_error/filemap_check_errors infrastructure.
This will at least prevent you from getting a false report of success.

The basic idea here is for filesystems to use filemap_set_wb_error to
set the error in the mapping when there are writeback errors, and then
have the fsync and flush operations call filemap_report_wb_error just
before returning to ensure that those errors get reported properly.

Eventually, it may make sense to move the reporting into the generic
vfs_fsync_range helper, but doing it this way for now makes it simpler
to convert filesystems to the new API individually.

Signed-off-by: Jeff Layton <jlayton@redhat.com>
---
 Documentation/filesystems/vfs.txt |  10 +-
 fs/open.c                         |   3 +
 include/linux/fs.h                |  23 +++++
 mm/filemap.c                      | 209 ++++++++++++++++++++++++++++++++++++++
 4 files changed, 244 insertions(+), 1 deletion(-)
