xfs_repair: clear extra file attributes on symlinks
diff mbox

Message ID 20171031215156.12544-1-mcgrof@kernel.org
State Superseded
Headers show

Commit Message

Luis Chamberlain Oct. 31, 2017, 9:51 p.m. UTC
Linux filesystems cannot set extra file attributes (stx_attributes as per
statx(2)) on a symbolic link as ioctl(2) with FS_IOC_SETFLAGS is used for
this purpose, and *all* ioctl(2) calls on a symbolic link yield EBADF.
This is because ioctl(2) tries to obtain struct fd from the symbolic link
file descriptor passed using fdget(), fdget() in turn always returns no
file set when a file descriptor is open with O_PATH. As per symlink(2)
O_PATH and O_NOFOLLOW must *always* be used when you want to get the file
descriptor of a symbolic link, and this holds true for Linux, as such extra
file attributes cannot possibly be set on symbolic links on Linux.

Given this Linux filesystems should treat extra file attributes set on
symbolic links as corruption and fix them.

   The TL;DR:

How I discovered this was finding a corrupted filesystem with symbolic
links with the extra file attribute append (STATX_ATTR_APPEND) set. Symbolic
links with the attribute append set cannot be removed as they are treated as
if a file was set with the immutable flag set. Standard tools however cannot
remove *any* attribute flag:

	# chattr -a symlink
	chattr: Operation not supported while reading flags on b

If you end up with these symbolic links userspace cannot remove them.

Likewise one cannot use the same tool to *set* this extra file attribute on
a symbolic link using chattr:
	# rm -f y z
	# ln -s y z
	# chattr +a z
	chattr: Operation not supported while reading flags on z

What makes this puzzling was one cannot even list attributes on symlinks
using lsattr either:

	# rm -f a b
	# ln -s a b
	# lsattr b
	lsattr: Operation not supported While reading flags on b

The above was due to commit 023d111e92195 ("chattr.1.in: Document the
compression attribute flags E, X, and ...") merged on e2fsprogs v1.28 since
August 2002. That commit just refers to the fact that attributes were only
allowed after that commit for directories and regular files due to Debian
bug 152029 [0]. This bug was filed since lsattr lsattr would segfault when
used against a special file of an old DRM buggy driver, the ioctl seem to
have worked but crashed lsattr with its information. The bug report doesn't
list any specific reasoning for not allowing attributes on symlinks though.

Crafting your own tool to query the extra file attributes with the new
shiny statx(2) works, and if a symbolic link has the extra attribute
flag set statx(2) would inform userspace of this. statx(2) is only used
for getting file information, not for setting them.

This all meant that if you end up with the append extra file attribute
set on a symlink you need special tools to try to remove it and currently
that's only possible on XFS with xfs_db [1] [2].

Fix XFS filesystems which have these extra file attributes set as the only
way they could have been set was through corruption.

[0] https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=152029
[1] xfs_db -x -c 'inode inode-number' -c 'write core.append 1' /dev/device
[2] xfs_db -x -c 'inode inode-number' -c 'write core.append 0' /dev/device

Cc: Tso Ted <tytso@mit.edu>
Cc: Nikolay Borisov <nborisov@suse.com>
Cc: Flex Liu <fliu@suse.com>
Cc: Jake Norris <jake.norris@suse.com>
Cc: Jan Kara <jack@suse.cz>
Signed-off-by: Luis R. Rodriguez <mcgrof@kernel.org>
---

On this v2 I've provided a much better explanation as to why these
extra file attributes don't make sense on Linux, and trimmed the flags
we venture out to clear to *only* match what statx defines. It may be
possible to clear more dino->di_flags and maybe even dino->di_flags2
for symbolic links however that those be determined separately as the
other flags' semantics are clarified for setting.

 repair/dinode.c | 21 +++++++++++++++++++++
 1 file changed, 21 insertions(+)

Comments

Darrick J. Wong Oct. 31, 2017, 10:12 p.m. UTC | #1
On Tue, Oct 31, 2017 at 02:51:56PM -0700, Luis R. Rodriguez wrote:
> Linux filesystems cannot set extra file attributes (stx_attributes as per
> statx(2)) on a symbolic link as ioctl(2) with FS_IOC_SETFLAGS is used for
> this purpose, and *all* ioctl(2) calls on a symbolic link yield EBADF.
> This is because ioctl(2) tries to obtain struct fd from the symbolic link
> file descriptor passed using fdget(), fdget() in turn always returns no
> file set when a file descriptor is open with O_PATH. As per symlink(2)
> O_PATH and O_NOFOLLOW must *always* be used when you want to get the file
> descriptor of a symbolic link, and this holds true for Linux, as such extra
> file attributes cannot possibly be set on symbolic links on Linux.
> 
> Given this Linux filesystems should treat extra file attributes set on
> symbolic links as corruption and fix them.
> 
>    The TL;DR:
> 
> How I discovered this was finding a corrupted filesystem with symbolic
> links with the extra file attribute append (STATX_ATTR_APPEND) set. Symbolic
> links with the attribute append set cannot be removed as they are treated as
> if a file was set with the immutable flag set. Standard tools however cannot
> remove *any* attribute flag:
> 
> 	# chattr -a symlink
> 	chattr: Operation not supported while reading flags on b
> 
> If you end up with these symbolic links userspace cannot remove them.
> 
> Likewise one cannot use the same tool to *set* this extra file attribute on
> a symbolic link using chattr:
> 	# rm -f y z
> 	# ln -s y z
> 	# chattr +a z
> 	chattr: Operation not supported while reading flags on z
> 
> What makes this puzzling was one cannot even list attributes on symlinks
> using lsattr either:
> 
> 	# rm -f a b
> 	# ln -s a b
> 	# lsattr b
> 	lsattr: Operation not supported While reading flags on b
> 
> The above was due to commit 023d111e92195 ("chattr.1.in: Document the
> compression attribute flags E, X, and ...") merged on e2fsprogs v1.28 since
> August 2002. That commit just refers to the fact that attributes were only
> allowed after that commit for directories and regular files due to Debian
> bug 152029 [0]. This bug was filed since lsattr lsattr would segfault when
> used against a special file of an old DRM buggy driver, the ioctl seem to
> have worked but crashed lsattr with its information. The bug report doesn't
> list any specific reasoning for not allowing attributes on symlinks though.
> 
> Crafting your own tool to query the extra file attributes with the new
> shiny statx(2) works, and if a symbolic link has the extra attribute
> flag set statx(2) would inform userspace of this. statx(2) is only used
> for getting file information, not for setting them.
> 
> This all meant that if you end up with the append extra file attribute
> set on a symlink you need special tools to try to remove it and currently
> that's only possible on XFS with xfs_db [1] [2].
> 
> Fix XFS filesystems which have these extra file attributes set as the only
> way they could have been set was through corruption.
> 
> [0] https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=152029
> [1] xfs_db -x -c 'inode inode-number' -c 'write core.append 1' /dev/device
> [2] xfs_db -x -c 'inode inode-number' -c 'write core.append 0' /dev/device
> 
> Cc: Tso Ted <tytso@mit.edu>
> Cc: Nikolay Borisov <nborisov@suse.com>
> Cc: Flex Liu <fliu@suse.com>
> Cc: Jake Norris <jake.norris@suse.com>
> Cc: Jan Kara <jack@suse.cz>
> Signed-off-by: Luis R. Rodriguez <mcgrof@kernel.org>
> ---
> 
> On this v2 I've provided a much better explanation as to why these
> extra file attributes don't make sense on Linux, and trimmed the flags
> we venture out to clear to *only* match what statx defines. It may be
> possible to clear more dino->di_flags and maybe even dino->di_flags2
> for symbolic links however that those be determined separately as the
> other flags' semantics are clarified for setting.
> 
>  repair/dinode.c | 21 +++++++++++++++++++++
>  1 file changed, 21 insertions(+)
> 
> diff --git a/repair/dinode.c b/repair/dinode.c
> index 15ba8cc22b39..6288e42de15e 100644
> --- a/repair/dinode.c
> +++ b/repair/dinode.c
> @@ -2482,6 +2482,27 @@ _("bad (negative) size %" PRId64 " on inode %" PRIu64 "\n"),
>  						FS_XFLAG_EXTSIZE);
>  			}
>  		}
> +		if (flags & (XFS_DIFLAG_IMMUTABLE | XFS_DIFLAG_APPEND |
> +			     XFS_DIFLAG_NODUMP)) {
> +			/*
> +			 * ioctl(fd, *) and so ioctl(fd, FS_IOC_SETFLAGS)
> +			 * yields EBADF on symlinks as they have O_PATH set.
> +			 * "Extra file attributes", stx_attributes, as per
> +			 * statx(2) cannot be set on symlinks on Linux.
> +			 */
> +			if (di_mode && S_ISLNK(di_mode) &&
> +			    !S_ISREG(di_mode) && !S_ISDIR(di_mode)) {

I don't think we can be a link and a file at the same time, right?

Does this DIFLAG clearing applies to bdev/cdev/fifo/socket files too?

--D

> +				if (!uncertain) {
> +					do_warn(
> +	_("file or directory attributes set on symlink inode %" PRIu64 "\n"),
> +						lino);
> +				}
> +				flags &= ~(XFS_DIFLAG_IMMUTABLE |
> +					   XFS_DIFLAG_APPEND |
> +					   XFS_DIFLAG_NODUMP);
> +			}
> +		}
> +
>  		if (!verify_mode && flags != be16_to_cpu(dino->di_flags)) {
>  			if (!no_modify) {
>  				do_warn(_("fixing bad flags.\n"));
> -- 
> 2.14.2
> 
> --
> To unsubscribe from this list: send the line "unsubscribe linux-xfs" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
--
To unsubscribe from this list: send the line "unsubscribe linux-xfs" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Luis Chamberlain Oct. 31, 2017, 10:19 p.m. UTC | #2
On Tue, Oct 31, 2017 at 3:12 PM, Darrick J. Wong
<darrick.wong@oracle.com> wrote:
> On Tue, Oct 31, 2017 at 02:51:56PM -0700, Luis R. Rodriguez wrote:
>> Linux filesystems cannot set extra file attributes (stx_attributes as per
>> statx(2)) on a symbolic link as ioctl(2) with FS_IOC_SETFLAGS is used for
>> this purpose, and *all* ioctl(2) calls on a symbolic link yield EBADF.
>> This is because ioctl(2) tries to obtain struct fd from the symbolic link
>> file descriptor passed using fdget(), fdget() in turn always returns no
>> file set when a file descriptor is open with O_PATH. As per symlink(2)
>> O_PATH and O_NOFOLLOW must *always* be used when you want to get the file
>> descriptor of a symbolic link, and this holds true for Linux, as such extra
>> file attributes cannot possibly be set on symbolic links on Linux.
>>
>> Given this Linux filesystems should treat extra file attributes set on
>> symbolic links as corruption and fix them.
>>
>>    The TL;DR:
>>
>> How I discovered this was finding a corrupted filesystem with symbolic
>> links with the extra file attribute append (STATX_ATTR_APPEND) set. Symbolic
>> links with the attribute append set cannot be removed as they are treated as
>> if a file was set with the immutable flag set. Standard tools however cannot
>> remove *any* attribute flag:
>>
>>       # chattr -a symlink
>>       chattr: Operation not supported while reading flags on b
>>
>> If you end up with these symbolic links userspace cannot remove them.
>>
>> Likewise one cannot use the same tool to *set* this extra file attribute on
>> a symbolic link using chattr:
>>       # rm -f y z
>>       # ln -s y z
>>       # chattr +a z
>>       chattr: Operation not supported while reading flags on z
>>
>> What makes this puzzling was one cannot even list attributes on symlinks
>> using lsattr either:
>>
>>       # rm -f a b
>>       # ln -s a b
>>       # lsattr b
>>       lsattr: Operation not supported While reading flags on b
>>
>> The above was due to commit 023d111e92195 ("chattr.1.in: Document the
>> compression attribute flags E, X, and ...") merged on e2fsprogs v1.28 since
>> August 2002. That commit just refers to the fact that attributes were only
>> allowed after that commit for directories and regular files due to Debian
>> bug 152029 [0]. This bug was filed since lsattr lsattr would segfault when
>> used against a special file of an old DRM buggy driver, the ioctl seem to
>> have worked but crashed lsattr with its information. The bug report doesn't
>> list any specific reasoning for not allowing attributes on symlinks though.
>>
>> Crafting your own tool to query the extra file attributes with the new
>> shiny statx(2) works, and if a symbolic link has the extra attribute
>> flag set statx(2) would inform userspace of this. statx(2) is only used
>> for getting file information, not for setting them.
>>
>> This all meant that if you end up with the append extra file attribute
>> set on a symlink you need special tools to try to remove it and currently
>> that's only possible on XFS with xfs_db [1] [2].
>>
>> Fix XFS filesystems which have these extra file attributes set as the only
>> way they could have been set was through corruption.
>>
>> [0] https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=152029
>> [1] xfs_db -x -c 'inode inode-number' -c 'write core.append 1' /dev/device
>> [2] xfs_db -x -c 'inode inode-number' -c 'write core.append 0' /dev/device
>>
>> Cc: Tso Ted <tytso@mit.edu>
>> Cc: Nikolay Borisov <nborisov@suse.com>
>> Cc: Flex Liu <fliu@suse.com>
>> Cc: Jake Norris <jake.norris@suse.com>
>> Cc: Jan Kara <jack@suse.cz>
>> Signed-off-by: Luis R. Rodriguez <mcgrof@kernel.org>
>> ---
>>
>> On this v2 I've provided a much better explanation as to why these
>> extra file attributes don't make sense on Linux, and trimmed the flags
>> we venture out to clear to *only* match what statx defines. It may be
>> possible to clear more dino->di_flags and maybe even dino->di_flags2
>> for symbolic links however that those be determined separately as the
>> other flags' semantics are clarified for setting.
>>
>>  repair/dinode.c | 21 +++++++++++++++++++++
>>  1 file changed, 21 insertions(+)
>>
>> diff --git a/repair/dinode.c b/repair/dinode.c
>> index 15ba8cc22b39..6288e42de15e 100644
>> --- a/repair/dinode.c
>> +++ b/repair/dinode.c
>> @@ -2482,6 +2482,27 @@ _("bad (negative) size %" PRId64 " on inode %" PRIu64 "\n"),
>>                                               FS_XFLAG_EXTSIZE);
>>                       }
>>               }
>> +             if (flags & (XFS_DIFLAG_IMMUTABLE | XFS_DIFLAG_APPEND |
>> +                          XFS_DIFLAG_NODUMP)) {
>> +                     /*
>> +                      * ioctl(fd, *) and so ioctl(fd, FS_IOC_SETFLAGS)
>> +                      * yields EBADF on symlinks as they have O_PATH set.
>> +                      * "Extra file attributes", stx_attributes, as per
>> +                      * statx(2) cannot be set on symlinks on Linux.
>> +                      */
>> +                     if (di_mode && S_ISLNK(di_mode) &&
>> +                         !S_ISREG(di_mode) && !S_ISDIR(di_mode)) {
>
> I don't think we can be a link and a file at the same time, right?

True, the check should be much simpler with just S_ISLNK().

> Does this DIFLAG clearing applies to bdev/cdev/fifo/socket files too?

Not at the moment given the semantics I hunted down and tested for
were for O_PATH only.  The validation I hunted down applies to any
file descriptors which we open via O_PATH only.

 Recall that the Debian bug that Ted had fixed in userspace was a work
around for userspace querying the file extra attributes via ioctl onto
a buggy DRM driver, so a special file. The *setting* via ioctl() can
certainly be ruled out for O_PATH files, but for other types of files
other types of evaluations would be needed.

 Luis
--
To unsubscribe from this list: send the line "unsubscribe linux-xfs" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Darrick J. Wong Oct. 31, 2017, 10:43 p.m. UTC | #3
On Tue, Oct 31, 2017 at 03:19:00PM -0700, Luis R. Rodriguez wrote:
> On Tue, Oct 31, 2017 at 3:12 PM, Darrick J. Wong
> <darrick.wong@oracle.com> wrote:
> > On Tue, Oct 31, 2017 at 02:51:56PM -0700, Luis R. Rodriguez wrote:
> >> Linux filesystems cannot set extra file attributes (stx_attributes as per
> >> statx(2)) on a symbolic link as ioctl(2) with FS_IOC_SETFLAGS is used for
> >> this purpose, and *all* ioctl(2) calls on a symbolic link yield EBADF.
> >> This is because ioctl(2) tries to obtain struct fd from the symbolic link
> >> file descriptor passed using fdget(), fdget() in turn always returns no
> >> file set when a file descriptor is open with O_PATH. As per symlink(2)
> >> O_PATH and O_NOFOLLOW must *always* be used when you want to get the file
> >> descriptor of a symbolic link, and this holds true for Linux, as such extra
> >> file attributes cannot possibly be set on symbolic links on Linux.
> >>
> >> Given this Linux filesystems should treat extra file attributes set on
> >> symbolic links as corruption and fix them.
> >>
> >>    The TL;DR:
> >>
> >> How I discovered this was finding a corrupted filesystem with symbolic
> >> links with the extra file attribute append (STATX_ATTR_APPEND) set. Symbolic
> >> links with the attribute append set cannot be removed as they are treated as
> >> if a file was set with the immutable flag set. Standard tools however cannot
> >> remove *any* attribute flag:
> >>
> >>       # chattr -a symlink
> >>       chattr: Operation not supported while reading flags on b
> >>
> >> If you end up with these symbolic links userspace cannot remove them.
> >>
> >> Likewise one cannot use the same tool to *set* this extra file attribute on
> >> a symbolic link using chattr:
> >>       # rm -f y z
> >>       # ln -s y z
> >>       # chattr +a z
> >>       chattr: Operation not supported while reading flags on z
> >>
> >> What makes this puzzling was one cannot even list attributes on symlinks
> >> using lsattr either:
> >>
> >>       # rm -f a b
> >>       # ln -s a b
> >>       # lsattr b
> >>       lsattr: Operation not supported While reading flags on b
> >>
> >> The above was due to commit 023d111e92195 ("chattr.1.in: Document the
> >> compression attribute flags E, X, and ...") merged on e2fsprogs v1.28 since
> >> August 2002. That commit just refers to the fact that attributes were only
> >> allowed after that commit for directories and regular files due to Debian
> >> bug 152029 [0]. This bug was filed since lsattr lsattr would segfault when
> >> used against a special file of an old DRM buggy driver, the ioctl seem to
> >> have worked but crashed lsattr with its information. The bug report doesn't
> >> list any specific reasoning for not allowing attributes on symlinks though.
> >>
> >> Crafting your own tool to query the extra file attributes with the new
> >> shiny statx(2) works, and if a symbolic link has the extra attribute
> >> flag set statx(2) would inform userspace of this. statx(2) is only used
> >> for getting file information, not for setting them.
> >>
> >> This all meant that if you end up with the append extra file attribute
> >> set on a symlink you need special tools to try to remove it and currently
> >> that's only possible on XFS with xfs_db [1] [2].
> >>
> >> Fix XFS filesystems which have these extra file attributes set as the only
> >> way they could have been set was through corruption.
> >>
> >> [0] https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=152029
> >> [1] xfs_db -x -c 'inode inode-number' -c 'write core.append 1' /dev/device
> >> [2] xfs_db -x -c 'inode inode-number' -c 'write core.append 0' /dev/device
> >>
> >> Cc: Tso Ted <tytso@mit.edu>
> >> Cc: Nikolay Borisov <nborisov@suse.com>
> >> Cc: Flex Liu <fliu@suse.com>
> >> Cc: Jake Norris <jake.norris@suse.com>
> >> Cc: Jan Kara <jack@suse.cz>
> >> Signed-off-by: Luis R. Rodriguez <mcgrof@kernel.org>
> >> ---
> >>
> >> On this v2 I've provided a much better explanation as to why these
> >> extra file attributes don't make sense on Linux, and trimmed the flags
> >> we venture out to clear to *only* match what statx defines. It may be
> >> possible to clear more dino->di_flags and maybe even dino->di_flags2
> >> for symbolic links however that those be determined separately as the
> >> other flags' semantics are clarified for setting.
> >>
> >>  repair/dinode.c | 21 +++++++++++++++++++++
> >>  1 file changed, 21 insertions(+)
> >>
> >> diff --git a/repair/dinode.c b/repair/dinode.c
> >> index 15ba8cc22b39..6288e42de15e 100644
> >> --- a/repair/dinode.c
> >> +++ b/repair/dinode.c
> >> @@ -2482,6 +2482,27 @@ _("bad (negative) size %" PRId64 " on inode %" PRIu64 "\n"),
> >>                                               FS_XFLAG_EXTSIZE);
> >>                       }
> >>               }
> >> +             if (flags & (XFS_DIFLAG_IMMUTABLE | XFS_DIFLAG_APPEND |
> >> +                          XFS_DIFLAG_NODUMP)) {
> >> +                     /*
> >> +                      * ioctl(fd, *) and so ioctl(fd, FS_IOC_SETFLAGS)
> >> +                      * yields EBADF on symlinks as they have O_PATH set.
> >> +                      * "Extra file attributes", stx_attributes, as per
> >> +                      * statx(2) cannot be set on symlinks on Linux.
> >> +                      */
> >> +                     if (di_mode && S_ISLNK(di_mode) &&
> >> +                         !S_ISREG(di_mode) && !S_ISDIR(di_mode)) {
> >
> > I don't think we can be a link and a file at the same time, right?
> 
> True, the check should be much simpler with just S_ISLNK().
> 
> > Does this DIFLAG clearing applies to bdev/cdev/fifo/socket files too?
> 
> Not at the moment given the semantics I hunted down and tested for
> were for O_PATH only.  The validation I hunted down applies to any
> file descriptors which we open via O_PATH only.

iirc when you open one of those special files you end up with a fd that
points to an inode on a special bdevfs/pipefs/etc., not an inode linked
to the underlying filesystem containing the special file.  Therefore,
you shouldn't be able to set any DIFLAG/DIFLAG2 flags on special files.

# mknod block b 8 0 ; mknod char c 1 3 ; mknod fifo p
# lsattr block char fifo
lsattr: Operation not supported While reading flags on block
lsattr: Operation not supported While reading flags on char
lsattr: Operation not supported While reading flags on fifo

--D

>  Recall that the Debian bug that Ted had fixed in userspace was a work
> around for userspace querying the file extra attributes via ioctl onto
> a buggy DRM driver, so a special file. The *setting* via ioctl() can
> certainly be ruled out for O_PATH files, but for other types of files
> other types of evaluations would be needed.
> 
>  Luis
> --
> To unsubscribe from this list: send the line "unsubscribe linux-xfs" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
--
To unsubscribe from this list: send the line "unsubscribe linux-xfs" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Luis Chamberlain Oct. 31, 2017, 11:10 p.m. UTC | #4
On Tue, Oct 31, 2017 at 03:43:20PM -0700, Darrick J. Wong wrote:
> On Tue, Oct 31, 2017 at 03:19:00PM -0700, Luis R. Rodriguez wrote:
> > On Tue, Oct 31, 2017 at 3:12 PM, Darrick J. Wong
> > <darrick.wong@oracle.com> wrote:
> > > On Tue, Oct 31, 2017 at 02:51:56PM -0700, Luis R. Rodriguez wrote:
> > >> diff --git a/repair/dinode.c b/repair/dinode.c
> > >> index 15ba8cc22b39..6288e42de15e 100644
> > >> --- a/repair/dinode.c
> > >> +++ b/repair/dinode.c
> > >> @@ -2482,6 +2482,27 @@ _("bad (negative) size %" PRId64 " on inode %" PRIu64 "\n"),
> > >>                                               FS_XFLAG_EXTSIZE);
> > >>                       }
> > >>               }
> > >> +             if (flags & (XFS_DIFLAG_IMMUTABLE | XFS_DIFLAG_APPEND |
> > >> +                          XFS_DIFLAG_NODUMP)) {
> > >> +                     /*
> > >> +                      * ioctl(fd, *) and so ioctl(fd, FS_IOC_SETFLAGS)
> > >> +                      * yields EBADF on symlinks as they have O_PATH set.
> > >> +                      * "Extra file attributes", stx_attributes, as per
> > >> +                      * statx(2) cannot be set on symlinks on Linux.
> > >> +                      */
> > >> +                     if (di_mode && S_ISLNK(di_mode) &&
> > >> +                         !S_ISREG(di_mode) && !S_ISDIR(di_mode)) {
> > >
> > > Does this DIFLAG clearing applies to bdev/cdev/fifo/socket files too?
> > 
> > Not at the moment given the semantics I hunted down and tested for
> > were for O_PATH only.  The validation I hunted down applies to any
> > file descriptors which we open via O_PATH only.
> 
> iirc when you open one of those special files you end up with a fd that
> points to an inode on a special bdevfs/pipefs/etc., not an inode linked
> to the underlying filesystem containing the special file.

That seems to fit the O_PATH intent, however its unclear if O_PATH was needed,
as per my testing on /dev/loop0 I don't need O_PATH set for it.

> Therefore you shouldn't be able to set any DIFLAG/DIFLAG2 flags on special files.

That would be great if we can verify.

> # mknod block b 8 0 ; mknod char c 1 3 ; mknod fifo p
> # lsattr block char fifo
> lsattr: Operation not supported While reading flags on block
> lsattr: Operation not supported While reading flags on char
> lsattr: Operation not supported While reading flags on fifo

I'm afraid e2fsprogs has a special check for these, ie, userspace is barred
from actually toying with special files purposely because of the Debian bug I
named.

strace should reveal the respective ioctl() was not actually issued.

I just tested a stupid program against /dev/loop0 and it fails, but not because
of O_PATH and EBADF being returned, somewhere in the path EINVAL is returned,
question is where and why.

openat(AT_FDCWD, "/dev/loop0", O_RDONLY) = 3
ioctl(3, FS_IOC_FSGETXATTR, 0x7ffde0d82d40) = -1 EINVAL (Invalid argument)

I'm happy to fold another patch in for these but it seems to me the logic is a
bit different and the special checks should be confirmed.

  Luis
--
To unsubscribe from this list: send the line "unsubscribe linux-xfs" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Eric Sandeen Oct. 31, 2017, 11:12 p.m. UTC | #5
On 10/31/17 6:10 PM, Luis R. Rodriguez wrote:
> On Tue, Oct 31, 2017 at 03:43:20PM -0700, Darrick J. Wong wrote:
>> On Tue, Oct 31, 2017 at 03:19:00PM -0700, Luis R. Rodriguez wrote:
>>> On Tue, Oct 31, 2017 at 3:12 PM, Darrick J. Wong
>>> <darrick.wong@oracle.com> wrote:
>>>> On Tue, Oct 31, 2017 at 02:51:56PM -0700, Luis R. Rodriguez wrote:
>>>>> diff --git a/repair/dinode.c b/repair/dinode.c
>>>>> index 15ba8cc22b39..6288e42de15e 100644
>>>>> --- a/repair/dinode.c
>>>>> +++ b/repair/dinode.c
>>>>> @@ -2482,6 +2482,27 @@ _("bad (negative) size %" PRId64 " on inode %" PRIu64 "\n"),
>>>>>                                               FS_XFLAG_EXTSIZE);
>>>>>                       }
>>>>>               }
>>>>> +             if (flags & (XFS_DIFLAG_IMMUTABLE | XFS_DIFLAG_APPEND |
>>>>> +                          XFS_DIFLAG_NODUMP)) {
>>>>> +                     /*
>>>>> +                      * ioctl(fd, *) and so ioctl(fd, FS_IOC_SETFLAGS)
>>>>> +                      * yields EBADF on symlinks as they have O_PATH set.
>>>>> +                      * "Extra file attributes", stx_attributes, as per
>>>>> +                      * statx(2) cannot be set on symlinks on Linux.
>>>>> +                      */
>>>>> +                     if (di_mode && S_ISLNK(di_mode) &&
>>>>> +                         !S_ISREG(di_mode) && !S_ISDIR(di_mode)) {
>>>>
>>>> Does this DIFLAG clearing applies to bdev/cdev/fifo/socket files too?
>>>
>>> Not at the moment given the semantics I hunted down and tested for
>>> were for O_PATH only.  The validation I hunted down applies to any
>>> file descriptors which we open via O_PATH only.
>>
>> iirc when you open one of those special files you end up with a fd that
>> points to an inode on a special bdevfs/pipefs/etc., not an inode linked
>> to the underlying filesystem containing the special file.
> 
> That seems to fit the O_PATH intent, however its unclear if O_PATH was needed,
> as per my testing on /dev/loop0 I don't need O_PATH set for it.
> 
>> Therefore you shouldn't be able to set any DIFLAG/DIFLAG2 flags on special files.
> 
> That would be great if we can verify.
> 
>> # mknod block b 8 0 ; mknod char c 1 3 ; mknod fifo p
>> # lsattr block char fifo
>> lsattr: Operation not supported While reading flags on block
>> lsattr: Operation not supported While reading flags on char
>> lsattr: Operation not supported While reading flags on fifo
> 
> I'm afraid e2fsprogs has a special check for these, ie, userspace is barred
> from actually toying with special files purposely because of the Debian bug I
> named.
> 
> strace should reveal the respective ioctl() was not actually issued.

That's very easy to short-circuit in e2fsprogs if you'd like to test what
the kernel does today.

-Eric
--
To unsubscribe from this list: send the line "unsubscribe linux-xfs" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Luis Chamberlain Nov. 1, 2017, 11:50 p.m. UTC | #6
On Tue, Oct 31, 2017 at 06:12:30PM -0500, Eric Sandeen wrote:
> 
> 
> On 10/31/17 6:10 PM, Luis R. Rodriguez wrote:
> > On Tue, Oct 31, 2017 at 03:43:20PM -0700, Darrick J. Wong wrote:
> >> On Tue, Oct 31, 2017 at 03:19:00PM -0700, Luis R. Rodriguez wrote:
> >>> On Tue, Oct 31, 2017 at 3:12 PM, Darrick J. Wong
> >>> <darrick.wong@oracle.com> wrote:
> >>>> On Tue, Oct 31, 2017 at 02:51:56PM -0700, Luis R. Rodriguez wrote:
> >>>>> diff --git a/repair/dinode.c b/repair/dinode.c
> >>>>> index 15ba8cc22b39..6288e42de15e 100644
> >>>>> --- a/repair/dinode.c
> >>>>> +++ b/repair/dinode.c
> >>>>> @@ -2482,6 +2482,27 @@ _("bad (negative) size %" PRId64 " on inode %" PRIu64 "\n"),
> >>>>>                                               FS_XFLAG_EXTSIZE);
> >>>>>                       }
> >>>>>               }
> >>>>> +             if (flags & (XFS_DIFLAG_IMMUTABLE | XFS_DIFLAG_APPEND |
> >>>>> +                          XFS_DIFLAG_NODUMP)) {
> >>>>> +                     /*
> >>>>> +                      * ioctl(fd, *) and so ioctl(fd, FS_IOC_SETFLAGS)
> >>>>> +                      * yields EBADF on symlinks as they have O_PATH set.
> >>>>> +                      * "Extra file attributes", stx_attributes, as per
> >>>>> +                      * statx(2) cannot be set on symlinks on Linux.
> >>>>> +                      */
> >>>>> +                     if (di_mode && S_ISLNK(di_mode) &&
> >>>>> +                         !S_ISREG(di_mode) && !S_ISDIR(di_mode)) {
> >>>>
> >>>> Does this DIFLAG clearing applies to bdev/cdev/fifo/socket files too?
> >>>
> >>> Not at the moment given the semantics I hunted down and tested for
> >>> were for O_PATH only.  The validation I hunted down applies to any
> >>> file descriptors which we open via O_PATH only.
> >>
> >> iirc when you open one of those special files you end up with a fd that
> >> points to an inode on a special bdevfs/pipefs/etc., not an inode linked
> >> to the underlying filesystem containing the special file.
> > 
> > That seems to fit the O_PATH intent, however its unclear if O_PATH was needed,
> > as per my testing on /dev/loop0 I don't need O_PATH set for it.
> > 
> >> Therefore you shouldn't be able to set any DIFLAG/DIFLAG2 flags on special files.
> > 
> > That would be great if we can verify.
> > 
> >> # mknod block b 8 0 ; mknod char c 1 3 ; mknod fifo p
> >> # lsattr block char fifo
> >> lsattr: Operation not supported While reading flags on block
> >> lsattr: Operation not supported While reading flags on char
> >> lsattr: Operation not supported While reading flags on fifo
> > 
> > I'm afraid e2fsprogs has a special check for these, ie, userspace is barred
> > from actually toying with special files purposely because of the Debian bug I
> > named.
> > 
> > strace should reveal the respective ioctl() was not actually issued.
> 
> That's very easy to short-circuit in e2fsprogs if you'd like to test what
> the kernel does today.

I meant that *I know* that the ioctl() is not issued, as it is a guard implemented
on e2fsprogs to avoid potentially interfacing with buggy kernel drivers.

One however can write a program which does the raw ioctl() call to really test.

For all the above: block char fifo, ioctl() with FS_IOC_FSGETXATTR and
FS_IOC_FSSETXATTR fails with: -1 an errno is set to ENOTTY (Inappropriate ioctl
for device). Reason is that vfs_ioctl() is used, and on XFS we'll hit the goto
out here as no f_op callbacks are set for these special  devices:

long vfs_ioctl(struct file *filp, unsigned int cmd, unsigned long arg)
{
        int error = -ENOTTY;                                                    
                                                                                
        if (!filp->f_op->unlocked_ioctl)                                        
                goto out;                                                       
                                                                                
        error = filp->f_op->unlocked_ioctl(filp, cmd, arg);                     
        if (error == -ENOIOCTLCMD)                                              
                error = -ENOTTY;                                                
 out:                                                                           
        return error;                                                           
}

My point also was that the above logic is different than for symlink. I'd  much
prefer to address these as a secondary patch given the logic is different.

PF_LOCAL named sockets are also represented on the filesystem, and same situation
with them, so I can bunch up a check for them as well. Are you OK with these
going in as separate patches or do you want me to mesh this all together along
with the new explanation for these?

  Luis
--
To unsubscribe from this list: send the line "unsubscribe linux-xfs" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Luis Chamberlain Nov. 2, 2017, 12:39 a.m. UTC | #7
On Tue, Oct 31, 2017 at 03:43:20PM -0700, Darrick J. Wong wrote:
> you shouldn't be able to set any DIFLAG/DIFLAG2 flags on special files.

Do we know for certain *all* DIFLAG/DIFLAG2 flags can *only* be set via
an ioctl()?

  Luis
--
To unsubscribe from this list: send the line "unsubscribe linux-xfs" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Darrick J. Wong Nov. 2, 2017, 5:23 a.m. UTC | #8
On Thu, Nov 02, 2017 at 01:39:48AM +0100, Luis R. Rodriguez wrote:
> On Tue, Oct 31, 2017 at 03:43:20PM -0700, Darrick J. Wong wrote:
> > you shouldn't be able to set any DIFLAG/DIFLAG2 flags on special files.
> 
> Do we know for certain *all* DIFLAG/DIFLAG2 flags can *only* be set via
> an ioctl()?

I'm fairly sure that inode flags get set via ioctl, via inheritance at
creation time with only two exceptions*, but could you please take a look
at all the places we set inode di_flags/di_flags2 and report back?

--D

* The DAX flag can get set (I think?) via the mount option; and the
  REFLINK flag can be set on files via copy_file_range().

>   Luis
> --
> To unsubscribe from this list: send the line "unsubscribe linux-xfs" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
--
To unsubscribe from this list: send the line "unsubscribe linux-xfs" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Dave Chinner Nov. 2, 2017, 9:22 p.m. UTC | #9
On Wed, Nov 01, 2017 at 10:23:06PM -0700, Darrick J. Wong wrote:
> On Thu, Nov 02, 2017 at 01:39:48AM +0100, Luis R. Rodriguez wrote:
> > On Tue, Oct 31, 2017 at 03:43:20PM -0700, Darrick J. Wong wrote:
> > > you shouldn't be able to set any DIFLAG/DIFLAG2 flags on special files.
> > 
> > Do we know for certain *all* DIFLAG/DIFLAG2 flags can *only* be set via
> > an ioctl()?
> 
> I'm fairly sure that inode flags get set via ioctl, via inheritance at
> creation time with only two exceptions*, but could you please take a look
> at all the places we set inode di_flags/di_flags2 and report back?

*users* can only set/clear inode flags through ioctls. The kernel
can set/clear inode flags at any time, and they may be flags that
users cannot change (e.g. the reflink flag, or the prealloc flag
which is set during fallocate() calls to tell the kernel not to
remove preallocations beyond EOF).

Write up a set of masks that state what flags are valid on what
type of inode. We can review that for correctness, then we can
work out how to handle bad flags in repair...

Cheers,

Dave.
Luis Chamberlain April 26, 2018, 11:50 p.m. UTC | #10
On Thu, Nov 02, 2017 at 12:50:07AM +0100, Luis R. Rodriguez wrote:
> On Tue, Oct 31, 2017 at 06:12:30PM -0500, Eric Sandeen wrote:
> > 
> > 
> > On 10/31/17 6:10 PM, Luis R. Rodriguez wrote:
> > > On Tue, Oct 31, 2017 at 03:43:20PM -0700, Darrick J. Wong wrote:
> > >> On Tue, Oct 31, 2017 at 03:19:00PM -0700, Luis R. Rodriguez wrote:
> > >>> On Tue, Oct 31, 2017 at 3:12 PM, Darrick J. Wong
> > >>> <darrick.wong@oracle.com> wrote:
> > >>>> On Tue, Oct 31, 2017 at 02:51:56PM -0700, Luis R. Rodriguez wrote:
> > >>>>> diff --git a/repair/dinode.c b/repair/dinode.c
> > >>>>> index 15ba8cc22b39..6288e42de15e 100644
> > >>>>> --- a/repair/dinode.c
> > >>>>> +++ b/repair/dinode.c
> > >>>>> @@ -2482,6 +2482,27 @@ _("bad (negative) size %" PRId64 " on inode %" PRIu64 "\n"),
> > >>>>>                                               FS_XFLAG_EXTSIZE);
> > >>>>>                       }
> > >>>>>               }
> > >>>>> +             if (flags & (XFS_DIFLAG_IMMUTABLE | XFS_DIFLAG_APPEND |
> > >>>>> +                          XFS_DIFLAG_NODUMP)) {
> > >>>>> +                     /*
> > >>>>> +                      * ioctl(fd, *) and so ioctl(fd, FS_IOC_SETFLAGS)
> > >>>>> +                      * yields EBADF on symlinks as they have O_PATH set.
> > >>>>> +                      * "Extra file attributes", stx_attributes, as per
> > >>>>> +                      * statx(2) cannot be set on symlinks on Linux.
> > >>>>> +                      */
> > >>>>> +                     if (di_mode && S_ISLNK(di_mode) &&
> > >>>>> +                         !S_ISREG(di_mode) && !S_ISDIR(di_mode)) {
> > >>>>
> > >>>> Does this DIFLAG clearing applies to bdev/cdev/fifo/socket files too?
> > >>>
> > >>> Not at the moment given the semantics I hunted down and tested for
> > >>> were for O_PATH only.  The validation I hunted down applies to any
> > >>> file descriptors which we open via O_PATH only.
> > >>
> > >> iirc when you open one of those special files you end up with a fd that
> > >> points to an inode on a special bdevfs/pipefs/etc., not an inode linked
> > >> to the underlying filesystem containing the special file.
> > > 
> > > That seems to fit the O_PATH intent, however its unclear if O_PATH was needed,
> > > as per my testing on /dev/loop0 I don't need O_PATH set for it.
> > > 
> > >> Therefore you shouldn't be able to set any DIFLAG/DIFLAG2 flags on special files.
> > > 
> > > That would be great if we can verify.
> > > 
> > >> # mknod block b 8 0 ; mknod char c 1 3 ; mknod fifo p
> > >> # lsattr block char fifo
> > >> lsattr: Operation not supported While reading flags on block
> > >> lsattr: Operation not supported While reading flags on char
> > >> lsattr: Operation not supported While reading flags on fifo
> > > 
> > > I'm afraid e2fsprogs has a special check for these, ie, userspace is barred
> > > from actually toying with special files purposely because of the Debian bug I
> > > named.
> > > 
> > > strace should reveal the respective ioctl() was not actually issued.
> > 
> > That's very easy to short-circuit in e2fsprogs if you'd like to test what
> > the kernel does today.
> 
> I meant that *I know* that the ioctl() is not issued, as it is a guard implemented
> on e2fsprogs to avoid potentially interfacing with buggy kernel drivers.
> 
> One however can write a program which does the raw ioctl() call to really test.
> 
> For all the above: block char fifo, ioctl() with FS_IOC_FSGETXATTR and
> FS_IOC_FSSETXATTR fails with: -1 an errno is set to ENOTTY (Inappropriate ioctl
> for device). Reason is that vfs_ioctl() is used, and on XFS we'll hit the goto
> out here as no f_op callbacks are set for these special  devices:
> 
> long vfs_ioctl(struct file *filp, unsigned int cmd, unsigned long arg)
> {
>         int error = -ENOTTY;                                                    
>                                                                                 
>         if (!filp->f_op->unlocked_ioctl)                                        
>                 goto out;                                                       
>                                                                                 
>         error = filp->f_op->unlocked_ioctl(filp, cmd, arg);                     
>         if (error == -ENOIOCTLCMD)                                              
>                 error = -ENOTTY;                                                
>  out:                                                                           
>         return error;                                                           
> }
> 
> My point also was that the above logic is different than for symlink. I'd  much
> prefer to address these as a secondary patch given the logic is different.
> 
> PF_LOCAL named sockets are also represented on the filesystem, and same situation
> with them, so I can bunch up a check for them as well. Are you OK with these
> going in as separate patches or do you want me to mesh this all together along
> with the new explanation for these?

*re-poke*

I hope the VFS RFC patch I just posted clarifies the situation a bit more.

  Luis
--
To unsubscribe from this list: send the line "unsubscribe linux-xfs" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Patch
diff mbox

diff --git a/repair/dinode.c b/repair/dinode.c
index 15ba8cc22b39..6288e42de15e 100644
--- a/repair/dinode.c
+++ b/repair/dinode.c
@@ -2482,6 +2482,27 @@  _("bad (negative) size %" PRId64 " on inode %" PRIu64 "\n"),
 						FS_XFLAG_EXTSIZE);
 			}
 		}
+		if (flags & (XFS_DIFLAG_IMMUTABLE | XFS_DIFLAG_APPEND |
+			     XFS_DIFLAG_NODUMP)) {
+			/*
+			 * ioctl(fd, *) and so ioctl(fd, FS_IOC_SETFLAGS)
+			 * yields EBADF on symlinks as they have O_PATH set.
+			 * "Extra file attributes", stx_attributes, as per
+			 * statx(2) cannot be set on symlinks on Linux.
+			 */
+			if (di_mode && S_ISLNK(di_mode) &&
+			    !S_ISREG(di_mode) && !S_ISDIR(di_mode)) {
+				if (!uncertain) {
+					do_warn(
+	_("file or directory attributes set on symlink inode %" PRIu64 "\n"),
+						lino);
+				}
+				flags &= ~(XFS_DIFLAG_IMMUTABLE |
+					   XFS_DIFLAG_APPEND |
+					   XFS_DIFLAG_NODUMP);
+			}
+		}
+
 		if (!verify_mode && flags != be16_to_cpu(dino->di_flags)) {
 			if (!no_modify) {
 				do_warn(_("fixing bad flags.\n"));