diff mbox series

[v4,1/3] fs: hoist EFSCORRUPTED definition into uapi header

Message ID 20190118161440.220134-1-jannh@google.com (mailing list archive)
State New, archived
Headers show
Series [v4,1/3] fs: hoist EFSCORRUPTED definition into uapi header | expand

Commit Message

Jann Horn Jan. 18, 2019, 4:14 p.m. UTC
Multiple filesystems can already return EFSCORRUPTED errors to userspace;
however, so far, definitions of EFSCORRUPTED were in filesystem-private
headers.

I wanted to use EUCLEAN to indicate data corruption in the VFS layer;
Dave Chinner says that I should instead hoist the definitions of
EFSCORRUPTED into the UAPI header and then use EFSCORRUPTED.

This patch is marked for stable backport because it is a prerequisite for
the following patch.

Cc: stable@vger.kernel.org
Suggested-by: Dave Chinner <david@fromorbit.com>
Signed-off-by: Jann Horn <jannh@google.com>
---
 fs/ext2/ext2.h                   | 1 -
 fs/ext4/ext4.h                   | 1 -
 fs/xfs/xfs_linux.h               | 1 -
 include/linux/jbd2.h             | 1 -
 include/uapi/asm-generic/errno.h | 1 +
 5 files changed, 1 insertion(+), 4 deletions(-)

Comments

Arnd Bergmann Jan. 18, 2019, 4:23 p.m. UTC | #1
On Fri, Jan 18, 2019 at 5:15 PM Jann Horn <jannh@google.com> wrote:
>
> Multiple filesystems can already return EFSCORRUPTED errors to userspace;
> however, so far, definitions of EFSCORRUPTED were in filesystem-private
> headers.
>
> I wanted to use EUCLEAN to indicate data corruption in the VFS layer;
> Dave Chinner says that I should instead hoist the definitions of
> EFSCORRUPTED into the UAPI header and then use EFSCORRUPTED.
>
> This patch is marked for stable backport because it is a prerequisite for
> the following patch.
>
> Cc: stable@vger.kernel.org
> Suggested-by: Dave Chinner <david@fromorbit.com>
> Signed-off-by: Jann Horn <jannh@google.com>
> ---
>  fs/ext2/ext2.h                   | 1 -
>  fs/ext4/ext4.h                   | 1 -
>  fs/xfs/xfs_linux.h               | 1 -
>  include/linux/jbd2.h             | 1 -
>  include/uapi/asm-generic/errno.h | 1 +
>  5 files changed, 1 insertion(+), 4 deletions(-)


For asm-generic:

Acked-by: Arnd Bergmann <arnd@arndb.de>
Dave Chinner Jan. 20, 2019, 10:13 p.m. UTC | #2
On Fri, Jan 18, 2019 at 05:14:38PM +0100, Jann Horn wrote:
> Multiple filesystems can already return EFSCORRUPTED errors to userspace;
> however, so far, definitions of EFSCORRUPTED were in filesystem-private
> headers.
> 
> I wanted to use EUCLEAN to indicate data corruption in the VFS layer;
> Dave Chinner says that I should instead hoist the definitions of
> EFSCORRUPTED into the UAPI header and then use EFSCORRUPTED.
> 
> This patch is marked for stable backport because it is a prerequisite for
> the following patch.
> 
> Cc: stable@vger.kernel.org
> Suggested-by: Dave Chinner <david@fromorbit.com>
> Signed-off-by: Jann Horn <jannh@google.com>

Looks fine to me.

Acked-by: Dave Chinner <dchinner@redhat.com>
Theodore Ts'o Jan. 21, 2019, 9:54 p.m. UTC | #3
On Fri, Jan 18, 2019 at 05:14:38PM +0100, Jann Horn wrote:
> Multiple filesystems can already return EFSCORRUPTED errors to userspace;
> however, so far, definitions of EFSCORRUPTED were in filesystem-private
> headers.
> 
> I wanted to use EUCLEAN to indicate data corruption in the VFS layer;
> Dave Chinner says that I should instead hoist the definitions of
> EFSCORRUPTED into the UAPI header and then use EFSCORRUPTED.
> 
> This patch is marked for stable backport because it is a prerequisite for
> the following patch.
> 
> Cc: stable@vger.kernel.org
> Suggested-by: Dave Chinner <david@fromorbit.com>
> Signed-off-by: Jann Horn <jannh@google.com>

Before we enshrine the overloading of EUCLEAN and EFSCORRUPTED, I
wonder if we should at least consider the option of assigning a new
error code number for EFSCORRUPTED.  The downside of doing this is
that for a while, older versions glibc won't have strerror/perror
translation for the new error code.  On the other hand, I'm not sure
it will be that much more confusing to the average user than
"Structure needs cleaning".  :-)

The upside of assigning a new error code is that in a year or two,
we'll actually have an intelligible error message showing up in log
files and in user's terminals.

					- Ted
Dave Chinner Jan. 21, 2019, 10:13 p.m. UTC | #4
On Mon, Jan 21, 2019 at 04:54:54PM -0500, Theodore Y. Ts'o wrote:
> On Fri, Jan 18, 2019 at 05:14:38PM +0100, Jann Horn wrote:
> > Multiple filesystems can already return EFSCORRUPTED errors to userspace;
> > however, so far, definitions of EFSCORRUPTED were in filesystem-private
> > headers.
> > 
> > I wanted to use EUCLEAN to indicate data corruption in the VFS layer;
> > Dave Chinner says that I should instead hoist the definitions of
> > EFSCORRUPTED into the UAPI header and then use EFSCORRUPTED.
> > 
> > This patch is marked for stable backport because it is a prerequisite for
> > the following patch.
> > 
> > Cc: stable@vger.kernel.org
> > Suggested-by: Dave Chinner <david@fromorbit.com>
> > Signed-off-by: Jann Horn <jannh@google.com>
> 
> Before we enshrine the overloading of EUCLEAN and EFSCORRUPTED, I
> wonder if we should at least consider the option of assigning a new
> error code number for EFSCORRUPTED. 

No.

We've exposed filesystem corruption errors to userspace as errno 117
for many years now, so people are already familiar with this error
as indicating a filesystem problem.

> The downside of doing this is
> that for a while, older versions glibc won't have strerror/perror
> translation for the new error code.

And everyone will end up asking "WTF is this undefined error the
application just received?" until glibc, man pages and enough
occurrences of the question have been asked that the search engines
develop enough of a history record that they return useful results.
Not only won't users have a clue, but the app developers the users
first ask "what's this error mean" won't have a clue, either.

> On the other hand, I'm not sure
> it will be that much more confusing to the average user than
> "Structure needs cleaning".  :-)

Go search for "XFS structure needs cleaning" on your preferred
search engine and will you get lots and lots of hits indicating what
you should do when you get that error. It's taken years to build up
that history such that it's extremely useful to the average user....

> The upside of assigning a new error code is that in a year or two,
> we'll actually have an intelligible error message showing up in log
> files and in user's terminals.

The downside is that it will take several years before people will
become familiar with the new error, and we'll have to deal with the
fallout repeatedly from it. Hence, IMO, there's no upside to
changing the errno of EFSCORRUPTED now that it is largely ubiquitous
in userspace.

Cheers,

Dave.
David Sterba Jan. 21, 2019, 10:14 p.m. UTC | #5
On Mon, Jan 21, 2019 at 04:54:54PM -0500, Theodore Y. Ts'o wrote:
> On Fri, Jan 18, 2019 at 05:14:38PM +0100, Jann Horn wrote:
> > Multiple filesystems can already return EFSCORRUPTED errors to userspace;
> > however, so far, definitions of EFSCORRUPTED were in filesystem-private
> > headers.
> > 
> > I wanted to use EUCLEAN to indicate data corruption in the VFS layer;
> > Dave Chinner says that I should instead hoist the definitions of
> > EFSCORRUPTED into the UAPI header and then use EFSCORRUPTED.
> > 
> > This patch is marked for stable backport because it is a prerequisite for
> > the following patch.
> > 
> > Cc: stable@vger.kernel.org
> > Suggested-by: Dave Chinner <david@fromorbit.com>
> > Signed-off-by: Jann Horn <jannh@google.com>
> 
> Before we enshrine the overloading of EUCLEAN and EFSCORRUPTED, I
> wonder if we should at least consider the option of assigning a new
> error code number for EFSCORRUPTED.  The downside of doing this is
> that for a while, older versions glibc won't have strerror/perror
> translation for the new error code.  On the other hand, I'm not sure
> it will be that much more confusing to the average user than
> "Structure needs cleaning".  :-)
>
> The upside of assigning a new error code is that in a year or two,
> we'll actually have an intelligible error message showing up in log
> files and in user's terminals.

I vote for a new code with a better message than EUCLEAN provides.
Darrick J. Wong Jan. 21, 2019, 11:51 p.m. UTC | #6
On Mon, Jan 21, 2019 at 04:54:54PM -0500, Theodore Y. Ts'o wrote:
> On Fri, Jan 18, 2019 at 05:14:38PM +0100, Jann Horn wrote:
> > Multiple filesystems can already return EFSCORRUPTED errors to userspace;
> > however, so far, definitions of EFSCORRUPTED were in filesystem-private
> > headers.
> > 
> > I wanted to use EUCLEAN to indicate data corruption in the VFS layer;
> > Dave Chinner says that I should instead hoist the definitions of
> > EFSCORRUPTED into the UAPI header and then use EFSCORRUPTED.
> > 
> > This patch is marked for stable backport because it is a prerequisite for
> > the following patch.
> > 
> > Cc: stable@vger.kernel.org
> > Suggested-by: Dave Chinner <david@fromorbit.com>
> > Signed-off-by: Jann Horn <jannh@google.com>
> 
> Before we enshrine the overloading of EUCLEAN and EFSCORRUPTED, I
> wonder if we should at least consider the option of assigning a new
> error code number for EFSCORRUPTED.  The downside of doing this is
> that for a while, older versions glibc won't have strerror/perror
> translation for the new error code.  On the other hand, I'm not sure
> it will be that much more confusing to the average user than
> "Structure needs cleaning".  :-)
> 
> The upside of assigning a new error code is that in a year or two,
> we'll actually have an intelligible error message showing up in log
> files and in user's terminals.

Uh... Ted?  Back in ~2009 we had a discussion on the ext4 conference
call about what error codes to return for "metadata is garbage" and
"metadata crc doesn't validate".  Back then you said that it would have
been great if someone had thought of defining error codes for that so
that by the time we got around to merging metadata checksums for ext4,
we'd have some error codes ready to go.

I pointed out that "the XFS people" already returned EUCLEAN /
EFSCORRUPTED and EILSEQ / EBADFSCRC for those cases and that upstream
had been doing that for a couple of years already, so we decided that
we'd just make ext4 behave like XFS because they'd already started
training everyone and their pet search engines that these oddly phrased
error messages in the context of a filesystem means they need to run
some sort of fsck/repair tool.  We also decided that while the messaging
was weird, it would be less work for both of us than to try to push
disruptive changes through Linux uapi, glibc, man pages, strerror
localization catalogs, etc.

Now it's a *decade* later, and ext4 / XFS have both converged on more or
less the same behavioral patterns w.r.t. when they return EFSCORRUPTED
and EFSBADCRC.  btrfs, hpfs, jffs2, and ubifs seem to use EUCLEAN to
signal bad metadata just like XFS and ext4 do.

I disagree with upending 13 years of established precedent for user
visible behavior.  We possibly could've pulled this off ten years ago,
but it's waaaay too late now.  Too much work, too little gain.

--D

> 					- Ted
Theodore Ts'o Jan. 22, 2019, 12:38 a.m. UTC | #7
On Mon, Jan 21, 2019 at 03:51:58PM -0800, Darrick J. Wong wrote:
> 
> I disagree with upending 13 years of established precedent for user
> visible behavior.  We possibly could've pulled this off ten years ago,
> but it's waaaay too late now.  Too much work, too little gain.

I remember the discussion; but now that we're adding it to uapi header
files, it's really going to be impossible.  And I have had some
regrets about that decision ten years ago.  I agree it would cause
confusion if we do it now, but it's basically the our last
opportunity.
 
How about this then?  We could ask glibc to change the string returned
by strerror for EUCLEAN/EFSCORRUPTED to be something like "File system
or block device corrupted".  This is how the errno is used in the
kernel; and if we don't want to change the error code, changing the
string returned by glibc should be less problematic.

					- Ted
diff mbox series

Patch

diff --git a/fs/ext2/ext2.h b/fs/ext2/ext2.h
index e770cd100a6a..7fafc19e5aa0 100644
--- a/fs/ext2/ext2.h
+++ b/fs/ext2/ext2.h
@@ -369,7 +369,6 @@  struct ext2_inode {
  */
 #define	EXT2_VALID_FS			0x0001	/* Unmounted cleanly */
 #define	EXT2_ERROR_FS			0x0002	/* Errors detected */
-#define	EFSCORRUPTED			EUCLEAN	/* Filesystem is corrupted */
 
 /*
  * Mount flags
diff --git a/fs/ext4/ext4.h b/fs/ext4/ext4.h
index 185a05d3257e..9397e97fc15b 100644
--- a/fs/ext4/ext4.h
+++ b/fs/ext4/ext4.h
@@ -3247,6 +3247,5 @@  extern const struct iomap_ops ext4_iomap_ops;
 #endif	/* __KERNEL__ */
 
 #define EFSBADCRC	EBADMSG		/* Bad CRC detected */
-#define EFSCORRUPTED	EUCLEAN		/* Filesystem is corrupted */
 
 #endif	/* _EXT4_H */
diff --git a/fs/xfs/xfs_linux.h b/fs/xfs/xfs_linux.h
index edbd5a210df2..36e5c6549f15 100644
--- a/fs/xfs/xfs_linux.h
+++ b/fs/xfs/xfs_linux.h
@@ -125,7 +125,6 @@  typedef __u32			xfs_nlink_t;
 
 #define ENOATTR		ENODATA		/* Attribute not found */
 #define EWRONGFS	EINVAL		/* Mount with wrong filesystem type */
-#define EFSCORRUPTED	EUCLEAN		/* Filesystem is corrupted */
 #define EFSBADCRC	EBADMSG		/* Bad CRC detected */
 
 #define SYNCHRONIZE()	barrier()
diff --git a/include/linux/jbd2.h b/include/linux/jbd2.h
index 0f919d5fe84f..1d0da9c78216 100644
--- a/include/linux/jbd2.h
+++ b/include/linux/jbd2.h
@@ -1640,6 +1640,5 @@  static inline tid_t  jbd2_get_latest_transaction(journal_t *journal)
 #endif	/* __KERNEL__ */
 
 #define EFSBADCRC	EBADMSG		/* Bad CRC detected */
-#define EFSCORRUPTED	EUCLEAN		/* Filesystem is corrupted */
 
 #endif	/* _LINUX_JBD2_H */
diff --git a/include/uapi/asm-generic/errno.h b/include/uapi/asm-generic/errno.h
index cf9c51ac49f9..5ddebc1bf951 100644
--- a/include/uapi/asm-generic/errno.h
+++ b/include/uapi/asm-generic/errno.h
@@ -98,6 +98,7 @@ 
 #define	EINPROGRESS	115	/* Operation now in progress */
 #define	ESTALE		116	/* Stale file handle */
 #define	EUCLEAN		117	/* Structure needs cleaning */
+#define	EFSCORRUPTED	EUCLEAN	/* Filesystem is corrupted */
 #define	ENOTNAM		118	/* Not a XENIX named type file */
 #define	ENAVAIL		119	/* No XENIX semaphores available */
 #define	EISNAM		120	/* Is a named type file */