diff mbox

[RFC] vfs: skip extra attributes check on removal for symlinks

Message ID 20180426234639.12480-1-mcgrof@kernel.org (mailing list archive)
State New, archived
Headers show

Commit Message

Luis Chamberlain April 26, 2018, 11:46 p.m. UTC
Linux filesystems cannot set extra file attributes (stx_attributes as per
statx(2)) on a symbolic link. To set extra file attributes you issue
ioctl(2) with FS_IOC_SETFLAGS, *all* ioctl(2) calls on a symbolic link
yield EBADF.

This is because ioctl(2) tries to obtain struct fd from the symbolic link
file descriptor passed using fdget(), fdget() in turn always returns no
file set when a file descriptor is open with O_PATH. As per symlink(2)
O_PATH and O_NOFOLLOW must *always* be used when you want to get the file
descriptor of a symbolic link, and this holds true for Linux, as such extra
file attributes cannot possibly be set on symbolic links on Linux.

Filesystems repair utilities should be updated to detect this as
corruption and correct this, however, the VFS *does* respect these
extra attributes on symlinks for removal.

Since we cannot set these attributes we should special-case the
immutable/append on delete for symlinks, this would be consistent with
what we *do* allow on Linux for all filesystems.

The userspace utility chattr(1) cannot set these attributes on symlinks
*and* other special files as well:

	# chattr -a symlink
	chattr: Operation not supported while reading flags on b

The reason for this is different though. Refer to commit 023d111e92195
("chattr.1.in: Document the compression attribute flags E, X, and ...")
merged on e2fsprogs v1.28 since August 2002. This commit prevented
issuing the ioctl() for symlink *and* special files in consideration for
a buggy DRM driver where issuing lsattr on their special files crashed
the system. For details refer to Debian bug 152029 [0].

You can craft your own tool to query the extra file attributes with
the new shiny statx(2) tool, statx(2) will list the attributes if
they were set for instance on a corrupt filesystem. However statx(2)
is only used for *querying* -- not for setting the attributes.

If you implement issuing your own ioctl() for FS_IOC_FSGETXATTR or
FS_IOC_FSSETXATTR on special files (block, char, fifo) it will fail
returning -1 and errno is set to ENOTTY (Inappropriate ioctl for
device). The reason for this is different than for symlinks.
For special files this fails on vfs_ioctl() when the filesystem
f_op callbacks are not set for these special files:

long vfs_ioctl(struct file *filp, unsigned int cmd, unsigned long arg)
{
	int error = -ENOTTY;

        if (!filp->f_op->unlocked_ioctl)
		goto out;

	error = filp->f_op->unlocked_ioctl(filp, cmd, arg);
	if (error == -ENOIOCTLCMD)
		error = -ENOTTY;
out:
	return error;
}

The same applies to PF_LOCAL named sockets. Since this varies by
filesystem for special files, only make a special rule to respect
the immutable and append attribute on symlinks.

[0] https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=152029

Signed-off-by: Luis R. Rodriguez <mcgrof@kernel.org>
---

As discussed at LSF/MM -- I'd follow up on this low hanging fruit as
the discussion had stalled on linux-xfs on review of the respective
xfs_repair changes. This addresses the general API question, and
as such I think could help establish order in how we split up patches
for those changes.

This requires some other eyeballs, and it also requires a putting it through
xfstests which I can do in the next few days, hence the RFC. But better put it
out for review already. I'd also like feedback from the linux-api folks to
see if this matches their own known / documented expectations.

 fs/namei.c | 12 ++++++++++--
 1 file changed, 10 insertions(+), 2 deletions(-)

Comments

Darrick J. Wong May 1, 2018, 5:23 p.m. UTC | #1
On Thu, Apr 26, 2018 at 04:46:39PM -0700, Luis R. Rodriguez wrote:
> Linux filesystems cannot set extra file attributes (stx_attributes as per
> statx(2)) on a symbolic link. To set extra file attributes you issue
> ioctl(2) with FS_IOC_SETFLAGS, *all* ioctl(2) calls on a symbolic link
> yield EBADF.
> 
> This is because ioctl(2) tries to obtain struct fd from the symbolic link
> file descriptor passed using fdget(), fdget() in turn always returns no
> file set when a file descriptor is open with O_PATH. As per symlink(2)
> O_PATH and O_NOFOLLOW must *always* be used when you want to get the file
> descriptor of a symbolic link, and this holds true for Linux, as such extra
> file attributes cannot possibly be set on symbolic links on Linux.
> 
> Filesystems repair utilities should be updated to detect this as
> corruption and correct this, however, the VFS *does* respect these
> extra attributes on symlinks for removal.
> 
> Since we cannot set these attributes we should special-case the
> immutable/append on delete for symlinks, this would be consistent with
> what we *do* allow on Linux for all filesystems.

Ah, ok, so the problem here is that you can't rm an "immutable" symlink
nor can you clear the immutable flag on such a beast, so therefore
ignore the immutable (and append) flags if we're trying to delete a
symlink?

I think we ought to teach the xfs inode verifier to check for
immutable/append symlinks and return error so that we don't end up with
such things in core in the first place, and fix xfs_repair to zap such
things.

That said, for the filesystems that aren't going to check their inodes,
I guess this is a (hackish) way to avoid presenting undeletable gunk in
the fs to the user...

(Were it up to me I'd make a common vfs_check_inode() to reject
struct inode containing garbage that the vfs won't deal with, and teach
the filesystems to use it; but I was shot down when I tried to do that
for negative isize.)

--D

> The userspace utility chattr(1) cannot set these attributes on symlinks
> *and* other special files as well:
> 
> 	# chattr -a symlink
> 	chattr: Operation not supported while reading flags on b
> 
> The reason for this is different though. Refer to commit 023d111e92195
> ("chattr.1.in: Document the compression attribute flags E, X, and ...")
> merged on e2fsprogs v1.28 since August 2002. This commit prevented
> issuing the ioctl() for symlink *and* special files in consideration for
> a buggy DRM driver where issuing lsattr on their special files crashed
> the system. For details refer to Debian bug 152029 [0].
> 
> You can craft your own tool to query the extra file attributes with
> the new shiny statx(2) tool, statx(2) will list the attributes if
> they were set for instance on a corrupt filesystem. However statx(2)
> is only used for *querying* -- not for setting the attributes.
> 
> If you implement issuing your own ioctl() for FS_IOC_FSGETXATTR or
> FS_IOC_FSSETXATTR on special files (block, char, fifo) it will fail
> returning -1 and errno is set to ENOTTY (Inappropriate ioctl for
> device). The reason for this is different than for symlinks.
> For special files this fails on vfs_ioctl() when the filesystem
> f_op callbacks are not set for these special files:
> 
> long vfs_ioctl(struct file *filp, unsigned int cmd, unsigned long arg)
> {
> 	int error = -ENOTTY;
> 
>         if (!filp->f_op->unlocked_ioctl)
> 		goto out;
> 
> 	error = filp->f_op->unlocked_ioctl(filp, cmd, arg);
> 	if (error == -ENOIOCTLCMD)
> 		error = -ENOTTY;
> out:
> 	return error;
> }
> 
> The same applies to PF_LOCAL named sockets. Since this varies by
> filesystem for special files, only make a special rule to respect
> the immutable and append attribute on symlinks.
> 
> [0] https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=152029
> 
> Signed-off-by: Luis R. Rodriguez <mcgrof@kernel.org>
> ---
> 
> As discussed at LSF/MM -- I'd follow up on this low hanging fruit as
> the discussion had stalled on linux-xfs on review of the respective
> xfs_repair changes. This addresses the general API question, and
> as such I think could help establish order in how we split up patches
> for those changes.
> 
> This requires some other eyeballs, and it also requires a putting it through
> xfstests which I can do in the next few days, hence the RFC. But better put it
> out for review already. I'd also like feedback from the linux-api folks to
> see if this matches their own known / documented expectations.
> 
>  fs/namei.c | 12 ++++++++++--
>  1 file changed, 10 insertions(+), 2 deletions(-)
> 
> diff --git a/fs/namei.c b/fs/namei.c
> index 186bd2464fd5..0f9069468cfb 100644
> --- a/fs/namei.c
> +++ b/fs/namei.c
> @@ -2719,6 +2719,14 @@ int __check_sticky(struct inode *dir, struct inode *inode)
>  }
>  EXPORT_SYMBOL(__check_sticky);
>  
> +/* Process extra file attributes only when they make sense */
> +static bool may_delete_stx_attributes(struct inode *inode)
> +{
> +	if (!S_ISLNK(inode->i_mode) && (IS_APPEND(inode) || IS_IMMUTABLE(inode)))
> +		return false;
> +	return  true;
> +}
> +
>  /*
>   *	Check whether we can remove a link victim from directory dir, check
>   *  whether the type of victim is right.
> @@ -2757,8 +2765,8 @@ static int may_delete(struct inode *dir, struct dentry *victim, bool isdir)
>  	if (IS_APPEND(dir))
>  		return -EPERM;
>  
> -	if (check_sticky(dir, inode) || IS_APPEND(inode) ||
> -	    IS_IMMUTABLE(inode) || IS_SWAPFILE(inode) || HAS_UNMAPPED_ID(inode))
> +	if (check_sticky(dir, inode) || !may_delete_stx_attributes(inode) ||
> +	    IS_SWAPFILE(inode) || HAS_UNMAPPED_ID(inode))
>  		return -EPERM;
>  	if (isdir) {
>  		if (!d_is_dir(victim))
> -- 
> 2.17.0
> 
> --
> To unsubscribe from this list: send the line "unsubscribe linux-xfs" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
Luis Chamberlain May 1, 2018, 5:45 p.m. UTC | #2
On Tue, May 01, 2018 at 10:23:19AM -0700, Darrick J. Wong wrote:
> On Thu, Apr 26, 2018 at 04:46:39PM -0700, Luis R. Rodriguez wrote:
> > Linux filesystems cannot set extra file attributes (stx_attributes as per
> > statx(2)) on a symbolic link. To set extra file attributes you issue
> > ioctl(2) with FS_IOC_SETFLAGS, *all* ioctl(2) calls on a symbolic link
> > yield EBADF.
> > 
> > This is because ioctl(2) tries to obtain struct fd from the symbolic link
> > file descriptor passed using fdget(), fdget() in turn always returns no
> > file set when a file descriptor is open with O_PATH. As per symlink(2)
> > O_PATH and O_NOFOLLOW must *always* be used when you want to get the file
> > descriptor of a symbolic link, and this holds true for Linux, as such extra
> > file attributes cannot possibly be set on symbolic links on Linux.
> > 
> > Filesystems repair utilities should be updated to detect this as
> > corruption and correct this, however, the VFS *does* respect these
> > extra attributes on symlinks for removal.
> > 
> > Since we cannot set these attributes we should special-case the
> > immutable/append on delete for symlinks, this would be consistent with
> > what we *do* allow on Linux for all filesystems.
> 
> Ah, ok, so the problem here is that you can't rm an "immutable" symlink
> nor can you clear the immutable flag on such a beast, so therefore
> ignore the immutable (and append) flags if we're trying to delete a
> symlink?

Yup.

> I think we ought to teach the xfs inode verifier to check for
> immutable/append symlinks and return error so that we don't end up with
> such things in core in the first place,

Agreed. But note that one way to end up with these things is through
corruption. Once a user finds this the first signs they'll run into
(unless they have the awesome new online scrubber) is they cannot delete
some odd file and not know why. And then they'll see they cannot change
or remove either the immutable or append flag. The immutable attribute
is known to not let you delete the file, but its less known that the
append attribute implies the same. Folks would scratch their heads.

Since one cannot *set* these attributes on symlinks on Linux, other than
ignoring such attributes perhaps we should warn about it? As the only
way you could end up with that is if your filesystem got corrupted.

But without filesystems having a fix for that merged users can't do anything.

> and fix xfs_repair to zap such things.

This is what my patch for xfs_repair does, which is pending review.

But note that special files are handled differently, I explain the logic on the
RFC commit log, which is why I suggested splitting up adding a fix for special
files as a separate patch.

> That said, for the filesystems that aren't going to check their inodes,
> I guess this is a (hackish) way to avoid presenting undeletable gunk in
> the fs to the user...

That's why its RFC. What is the right thing to do here?

My own logic here was since we cannot possibly allow extended attributes on
symlinks is to not use them then as well, in this case for delete.

> (Were it up to me I'd make a common vfs_check_inode() to reject
> struct inode containing garbage that the vfs won't deal with,

There certainly are cases that we could come up with for the VFS where
if such things are found, regardless of the filesystem, we're very sure
its a corruption. This is just one example, as you note.

So it is correct that there are two things here:

1) Do we respect the attribute for symlink on delete
2) Do we warn to users of the fact that the inode is very likely corrupted
   regardless of the filesystem?

This patch only addresses 1) and makes it match the VFS expectation, and
I think this is safe if we're not doing 2). This is why I'm also looking
for feedback from linux-api folks.

This is one of those odd corner cases we run into, and very likely not
well documented anywhere.

> and teach the filesystems to use it;

Or the VFS uses it.

  Luis
Darrick J. Wong May 8, 2018, 12:30 a.m. UTC | #3
On Tue, May 01, 2018 at 05:45:12PM +0000, Luis R. Rodriguez wrote:
> On Tue, May 01, 2018 at 10:23:19AM -0700, Darrick J. Wong wrote:
> > On Thu, Apr 26, 2018 at 04:46:39PM -0700, Luis R. Rodriguez wrote:
> > > Linux filesystems cannot set extra file attributes (stx_attributes as per
> > > statx(2)) on a symbolic link. To set extra file attributes you issue
> > > ioctl(2) with FS_IOC_SETFLAGS, *all* ioctl(2) calls on a symbolic link
> > > yield EBADF.
> > > 
> > > This is because ioctl(2) tries to obtain struct fd from the symbolic link
> > > file descriptor passed using fdget(), fdget() in turn always returns no
> > > file set when a file descriptor is open with O_PATH. As per symlink(2)
> > > O_PATH and O_NOFOLLOW must *always* be used when you want to get the file
> > > descriptor of a symbolic link, and this holds true for Linux, as such extra
> > > file attributes cannot possibly be set on symbolic links on Linux.
> > > 
> > > Filesystems repair utilities should be updated to detect this as
> > > corruption and correct this, however, the VFS *does* respect these
> > > extra attributes on symlinks for removal.
> > > 
> > > Since we cannot set these attributes we should special-case the
> > > immutable/append on delete for symlinks, this would be consistent with
> > > what we *do* allow on Linux for all filesystems.
> > 
> > Ah, ok, so the problem here is that you can't rm an "immutable" symlink
> > nor can you clear the immutable flag on such a beast, so therefore
> > ignore the immutable (and append) flags if we're trying to delete a
> > symlink?
> 
> Yup.
> 
> > I think we ought to teach the xfs inode verifier to check for
> > immutable/append symlinks and return error so that we don't end up with
> > such things in core in the first place,
> 
> Agreed. But note that one way to end up with these things is through
> corruption. Once a user finds this the first signs they'll run into
> (unless they have the awesome new online scrubber) is they cannot delete
> some odd file and not know why. And then they'll see they cannot change
> or remove either the immutable or append flag. The immutable attribute
> is known to not let you delete the file, but its less known that the
> append attribute implies the same. Folks would scratch their heads.
> 
> Since one cannot *set* these attributes on symlinks on Linux, other than
> ignoring such attributes perhaps we should warn about it? As the only
> way you could end up with that is if your filesystem got corrupted.
> 
> But without filesystems having a fix for that merged users can't do anything.
> 
> > and fix xfs_repair to zap such things.
> 
> This is what my patch for xfs_repair does, which is pending review.
> 
> But note that special files are handled differently, I explain the logic on the
> RFC commit log, which is why I suggested splitting up adding a fix for special
> files as a separate patch.
> 
> > That said, for the filesystems that aren't going to check their inodes,
> > I guess this is a (hackish) way to avoid presenting undeletable gunk in
> > the fs to the user...
> 
> That's why its RFC. What is the right thing to do here?

Ideally, none of the filesystems should ever feed garbage to the vfs,
but it's not all that obvious exactly what things the vfs will trip over.
And knowing that a lot of the filesystems do minimal checking if any, I
guess I'll just say that...

...XFS (and probably ext4) should catch immutable symlinks in the inode
verifiers so that xfs_iget returns -EFSCORRUPTED.  For the rest of the
filesystems, it's probably fine to let them delete the symlink since the
user wants to kill it anyway.

> My own logic here was since we cannot possibly allow extended attributes on
> symlinks is to not use them then as well, in this case for delete.
> 
> > (Were it up to me I'd make a common vfs_check_inode() to reject
> > struct inode containing garbage that the vfs won't deal with,
> 
> There certainly are cases that we could come up with for the VFS where
> if such things are found, regardless of the filesystem, we're very sure
> its a corruption. This is just one example, as you note.
> 
> So it is correct that there are two things here:
> 
> 1) Do we respect the attribute for symlink on delete
> 2) Do we warn to users of the fact that the inode is very likely corrupted
>    regardless of the filesystem?

But I suppose we could WARN_ON_ONCE to state that we're allowing
deletion of an immutable symlink that we couldn't possibly have set.
Anyway, couple this patch with a second one to fix the xfs verifier and
I'll be happy.

--D

> This patch only addresses 1) and makes it match the VFS expectation, and
> I think this is safe if we're not doing 2). This is why I'm also looking
> for feedback from linux-api folks.
> 
> This is one of those odd corner cases we run into, and very likely not
> well documented anywhere.
> 
> > and teach the filesystems to use it;
> 
> Or the VFS uses it.
> 
>   Luis
> --
> To unsubscribe from this list: send the line "unsubscribe linux-xfs" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
Luis Chamberlain May 9, 2018, 12:03 a.m. UTC | #4
On Mon, May 07, 2018 at 05:30:55PM -0700, Darrick J. Wong wrote:
> On Tue, May 01, 2018 at 05:45:12PM +0000, Luis R. Rodriguez wrote:
> > On Tue, May 01, 2018 at 10:23:19AM -0700, Darrick J. Wong wrote:
> > > On Thu, Apr 26, 2018 at 04:46:39PM -0700, Luis R. Rodriguez wrote:
> > > > Linux filesystems cannot set extra file attributes (stx_attributes as per
> > > > statx(2)) on a symbolic link. To set extra file attributes you issue
> > > > ioctl(2) with FS_IOC_SETFLAGS, *all* ioctl(2) calls on a symbolic link
> > > > yield EBADF.
> > > > 
> > > > This is because ioctl(2) tries to obtain struct fd from the symbolic link
> > > > file descriptor passed using fdget(), fdget() in turn always returns no
> > > > file set when a file descriptor is open with O_PATH. As per symlink(2)
> > > > O_PATH and O_NOFOLLOW must *always* be used when you want to get the file
> > > > descriptor of a symbolic link, and this holds true for Linux, as such extra
> > > > file attributes cannot possibly be set on symbolic links on Linux.
> > > > 
> > > > Filesystems repair utilities should be updated to detect this as
> > > > corruption and correct this, however, the VFS *does* respect these
> > > > extra attributes on symlinks for removal.
> > > > 
> > > > Since we cannot set these attributes we should special-case the
> > > > immutable/append on delete for symlinks, this would be consistent with
> > > > what we *do* allow on Linux for all filesystems.
> > > 
> > > Ah, ok, so the problem here is that you can't rm an "immutable" symlink
> > > nor can you clear the immutable flag on such a beast, so therefore
> > > ignore the immutable (and append) flags if we're trying to delete a
> > > symlink?
> > 
> > Yup.
> > 
> > > I think we ought to teach the xfs inode verifier to check for
> > > immutable/append symlinks and return error so that we don't end up with
> > > such things in core in the first place,
> > 
> > Agreed. But note that one way to end up with these things is through
> > corruption. Once a user finds this the first signs they'll run into
> > (unless they have the awesome new online scrubber) is they cannot delete
> > some odd file and not know why. And then they'll see they cannot change
> > or remove either the immutable or append flag. The immutable attribute
> > is known to not let you delete the file, but its less known that the
> > append attribute implies the same. Folks would scratch their heads.
> > 
> > Since one cannot *set* these attributes on symlinks on Linux, other than
> > ignoring such attributes perhaps we should warn about it? As the only
> > way you could end up with that is if your filesystem got corrupted.
> > 
> > But without filesystems having a fix for that merged users can't do anything.
> > 
> > > and fix xfs_repair to zap such things.
> > 
> > This is what my patch for xfs_repair does, which is pending review.
> > 
> > But note that special files are handled differently, I explain the logic on the
> > RFC commit log, which is why I suggested splitting up adding a fix for special
> > files as a separate patch.
> > 
> > > That said, for the filesystems that aren't going to check their inodes,
> > > I guess this is a (hackish) way to avoid presenting undeletable gunk in
> > > the fs to the user...
> > 
> > That's why its RFC. What is the right thing to do here?
> 
> Ideally, none of the filesystems should ever feed garbage to the vfs,
> but it's not all that obvious exactly what things the vfs will trip over.
> And knowing that a lot of the filesystems do minimal checking if any, I
> guess I'll just say that...
> 
> ...XFS (and probably ext4) should catch immutable symlinks in the inode
> verifiers so that xfs_iget returns -EFSCORRUPTED.  For the rest of the
> filesystems, it's probably fine to let them delete the symlink since the
> user wants to kill it anyway.

Alrighty, I have this implemented now.

> > My own logic here was since we cannot possibly allow extended attributes on
> > symlinks is to not use them then as well, in this case for delete.
> > 
> > > (Were it up to me I'd make a common vfs_check_inode() to reject
> > > struct inode containing garbage that the vfs won't deal with,
> > 
> > There certainly are cases that we could come up with for the VFS where
> > if such things are found, regardless of the filesystem, we're very sure
> > its a corruption. This is just one example, as you note.
> > 
> > So it is correct that there are two things here:
> > 
> > 1) Do we respect the attribute for symlink on delete
> > 2) Do we warn to users of the fact that the inode is very likely corrupted
> >    regardless of the filesystem?
> 
> But I suppose we could WARN_ON_ONCE to state that we're allowing
> deletion of an immutable symlink that we couldn't possibly have set.

Yes, I think that's better than to allow filesystems to do things even
though the VFS in this case *knows* better.

> Anyway, couple this patch with a second one to fix the xfs verifier and
> I'll be happy.

Groovy, thanks, let's not forget the xfs_repair respective fix :) let me know
if you have any feedback on that.

  Luis
Darrick J. Wong May 9, 2018, 12:17 a.m. UTC | #5
On Wed, May 09, 2018 at 12:03:28AM +0000, Luis R. Rodriguez wrote:
> On Mon, May 07, 2018 at 05:30:55PM -0700, Darrick J. Wong wrote:
> > On Tue, May 01, 2018 at 05:45:12PM +0000, Luis R. Rodriguez wrote:
> > > On Tue, May 01, 2018 at 10:23:19AM -0700, Darrick J. Wong wrote:
> > > > On Thu, Apr 26, 2018 at 04:46:39PM -0700, Luis R. Rodriguez wrote:
> > > > > Linux filesystems cannot set extra file attributes (stx_attributes as per
> > > > > statx(2)) on a symbolic link. To set extra file attributes you issue
> > > > > ioctl(2) with FS_IOC_SETFLAGS, *all* ioctl(2) calls on a symbolic link
> > > > > yield EBADF.
> > > > > 
> > > > > This is because ioctl(2) tries to obtain struct fd from the symbolic link
> > > > > file descriptor passed using fdget(), fdget() in turn always returns no
> > > > > file set when a file descriptor is open with O_PATH. As per symlink(2)
> > > > > O_PATH and O_NOFOLLOW must *always* be used when you want to get the file
> > > > > descriptor of a symbolic link, and this holds true for Linux, as such extra
> > > > > file attributes cannot possibly be set on symbolic links on Linux.
> > > > > 
> > > > > Filesystems repair utilities should be updated to detect this as
> > > > > corruption and correct this, however, the VFS *does* respect these
> > > > > extra attributes on symlinks for removal.
> > > > > 
> > > > > Since we cannot set these attributes we should special-case the
> > > > > immutable/append on delete for symlinks, this would be consistent with
> > > > > what we *do* allow on Linux for all filesystems.
> > > > 
> > > > Ah, ok, so the problem here is that you can't rm an "immutable" symlink
> > > > nor can you clear the immutable flag on such a beast, so therefore
> > > > ignore the immutable (and append) flags if we're trying to delete a
> > > > symlink?
> > > 
> > > Yup.
> > > 
> > > > I think we ought to teach the xfs inode verifier to check for
> > > > immutable/append symlinks and return error so that we don't end up with
> > > > such things in core in the first place,
> > > 
> > > Agreed. But note that one way to end up with these things is through
> > > corruption. Once a user finds this the first signs they'll run into
> > > (unless they have the awesome new online scrubber) is they cannot delete
> > > some odd file and not know why. And then they'll see they cannot change
> > > or remove either the immutable or append flag. The immutable attribute
> > > is known to not let you delete the file, but its less known that the
> > > append attribute implies the same. Folks would scratch their heads.
> > > 
> > > Since one cannot *set* these attributes on symlinks on Linux, other than
> > > ignoring such attributes perhaps we should warn about it? As the only
> > > way you could end up with that is if your filesystem got corrupted.
> > > 
> > > But without filesystems having a fix for that merged users can't do anything.
> > > 
> > > > and fix xfs_repair to zap such things.
> > > 
> > > This is what my patch for xfs_repair does, which is pending review.
> > > 
> > > But note that special files are handled differently, I explain the logic on the
> > > RFC commit log, which is why I suggested splitting up adding a fix for special
> > > files as a separate patch.
> > > 
> > > > That said, for the filesystems that aren't going to check their inodes,
> > > > I guess this is a (hackish) way to avoid presenting undeletable gunk in
> > > > the fs to the user...
> > > 
> > > That's why its RFC. What is the right thing to do here?
> > 
> > Ideally, none of the filesystems should ever feed garbage to the vfs,
> > but it's not all that obvious exactly what things the vfs will trip over.
> > And knowing that a lot of the filesystems do minimal checking if any, I
> > guess I'll just say that...
> > 
> > ...XFS (and probably ext4) should catch immutable symlinks in the inode
> > verifiers so that xfs_iget returns -EFSCORRUPTED.  For the rest of the
> > filesystems, it's probably fine to let them delete the symlink since the
> > user wants to kill it anyway.
> 
> Alrighty, I have this implemented now.
> 
> > > My own logic here was since we cannot possibly allow extended attributes on
> > > symlinks is to not use them then as well, in this case for delete.
> > > 
> > > > (Were it up to me I'd make a common vfs_check_inode() to reject
> > > > struct inode containing garbage that the vfs won't deal with,
> > > 
> > > There certainly are cases that we could come up with for the VFS where
> > > if such things are found, regardless of the filesystem, we're very sure
> > > its a corruption. This is just one example, as you note.
> > > 
> > > So it is correct that there are two things here:
> > > 
> > > 1) Do we respect the attribute for symlink on delete
> > > 2) Do we warn to users of the fact that the inode is very likely corrupted
> > >    regardless of the filesystem?
> > 
> > But I suppose we could WARN_ON_ONCE to state that we're allowing
> > deletion of an immutable symlink that we couldn't possibly have set.
> 
> Yes, I think that's better than to allow filesystems to do things even
> though the VFS in this case *knows* better.
> 
> > Anyway, couple this patch with a second one to fix the xfs verifier and
> > I'll be happy.
> 
> Groovy, thanks, let's not forget the xfs_repair respective fix :) let me know
> if you have any feedback on that.

TBH I've lost any proposed xfs_repair patches to the mists of time
because some patch volcano keeps erupting on the lists. :P

Uh... I think it's fine for xfs_{repair,scrub} to clear the immutable
and append flags on any special inodes it finds, particularly since
neither flag has any real meaning for block/char/fifo/socket/symlinks
anyway.

(Well ok I could imagine immutable pipes being meaningful for
ignoring^Wdealing with the bureaucracy but I don't see a non-joke
meaning. :P)

--D

>   Luis
> --
> To unsubscribe from this list: send the line "unsubscribe linux-xfs" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
Luis Chamberlain May 9, 2018, 12:38 a.m. UTC | #6
On Tue, May 08, 2018 at 05:17:41PM -0700, Darrick J. Wong wrote:
> On Wed, May 09, 2018 at 12:03:28AM +0000, Luis R. Rodriguez wrote:
> > Groovy, thanks, let's not forget the xfs_repair respective fix :) let me know
> > if you have any feedback on that.
> 
> TBH I've lost any proposed xfs_repair patches to the mists of time
> because some patch volcano keeps erupting on the lists. :P
> 
> Uh... I think it's fine for xfs_{repair,scrub} to clear the immutable
> and append flags on any special inodes it finds, particularly since
> neither flag has any real meaning for block/char/fifo/socket/symlinks
> anyway.

Sure, my point during review of that series for xfs_repair in particular
though was that for symlinks the justification is different (half of the
commit log in this new patch), as such I'd prefer to deal with them in a
separate follow up patch.

  Luis
Darrick J. Wong May 9, 2018, 1:39 a.m. UTC | #7
On Wed, May 09, 2018 at 12:38:55AM +0000, Luis R. Rodriguez wrote:
> On Tue, May 08, 2018 at 05:17:41PM -0700, Darrick J. Wong wrote:
> > On Wed, May 09, 2018 at 12:03:28AM +0000, Luis R. Rodriguez wrote:
> > > Groovy, thanks, let's not forget the xfs_repair respective fix :) let me know
> > > if you have any feedback on that.
> > 
> > TBH I've lost any proposed xfs_repair patches to the mists of time
> > because some patch volcano keeps erupting on the lists. :P
> > 
> > Uh... I think it's fine for xfs_{repair,scrub} to clear the immutable
> > and append flags on any special inodes it finds, particularly since
> > neither flag has any real meaning for block/char/fifo/socket/symlinks
> > anyway.
> 
> Sure, my point during review of that series for xfs_repair in particular
> though was that for symlinks the justification is different (half of the
> commit log in this new patch), as such I'd prefer to deal with them in a
> separate follow up patch.

Ah, found it again finally[1].  ISTR the discussion petered out after
Dave asked if you had a map of inode mode
(file/dir/bdev/cdev/fifo/socket/symlink) to allowable flags.  That would
be a good general way to detect and clear stray flags.

--D

[1] https://marc.info/?l=linux-xfs&m=150948985429374&w=4

>   Luis
> --
> To unsubscribe from this list: send the line "unsubscribe linux-xfs" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
Al Viro May 10, 2018, 8:48 p.m. UTC | #8
On Thu, Apr 26, 2018 at 04:46:39PM -0700, Luis R. Rodriguez wrote:

> Since we cannot set these attributes we should special-case the
> immutable/append on delete for symlinks, this would be consistent with
> what we *do* allow on Linux for all filesystems.

Er...  So why not simply sanity-check it in places that set it on
inodes?  If anything, I would suggest
	* converting all places that set those in ->i_flags to
inode_set_flags()
	* making inode_set_flags() check and return an error on
that...
Luis Chamberlain May 10, 2018, 11:05 p.m. UTC | #9
On Thu, May 10, 2018 at 09:48:07PM +0100, Al Viro wrote:
> On Thu, Apr 26, 2018 at 04:46:39PM -0700, Luis R. Rodriguez wrote:
> 
> > Since we cannot set these attributes we should special-case the
> > immutable/append on delete for symlinks, this would be consistent with
> > what we *do* allow on Linux for all filesystems.
> 
> Er...  So why not simply sanity-check it in places that set it on
> inodes? 

The patch is not about sanity-checks on setters though as *that* is in place
already. Its about the case where the filesystem gets corrupted and the VFS
*still* does process these attributes for symlinks and still prevents
deletion because of these attributes.

So we already do not allow for settings these attributes.

> If anything, I would suggest
> 	* converting all places that set those in ->i_flags to
> inode_set_flags()
> 	* making inode_set_flags() check and return an error on
> that...

But if I misunderstood your suggestion please let me know. I'll send out
a v2 RFC next which illustrates what filesystems can do for now.

  Luis
Al Viro May 11, 2018, 2:42 a.m. UTC | #10
On Thu, May 10, 2018 at 11:05:51PM +0000, Luis R. Rodriguez wrote:
> On Thu, May 10, 2018 at 09:48:07PM +0100, Al Viro wrote:
> > On Thu, Apr 26, 2018 at 04:46:39PM -0700, Luis R. Rodriguez wrote:
> > 
> > > Since we cannot set these attributes we should special-case the
> > > immutable/append on delete for symlinks, this would be consistent with
> > > what we *do* allow on Linux for all filesystems.
> > 
> > Er...  So why not simply sanity-check it in places that set it on
> > inodes? 
> 
> The patch is not about sanity-checks on setters though as *that* is in place
> already. Its about the case where the filesystem gets corrupted and the VFS
> *still* does process these attributes for symlinks and still prevents
> deletion because of these attributes.

... and this corrupted fs ends up setting those flags on in-core inodes.
Which is where we ought to block that.  Seriously, let's make sure that
->i_flags manipulations are done by inode_set_flags() (e.g. btrfs open-codes
that, apparently) and let's make _that_ check and reject those.
diff mbox

Patch

diff --git a/fs/namei.c b/fs/namei.c
index 186bd2464fd5..0f9069468cfb 100644
--- a/fs/namei.c
+++ b/fs/namei.c
@@ -2719,6 +2719,14 @@  int __check_sticky(struct inode *dir, struct inode *inode)
 }
 EXPORT_SYMBOL(__check_sticky);
 
+/* Process extra file attributes only when they make sense */
+static bool may_delete_stx_attributes(struct inode *inode)
+{
+	if (!S_ISLNK(inode->i_mode) && (IS_APPEND(inode) || IS_IMMUTABLE(inode)))
+		return false;
+	return  true;
+}
+
 /*
  *	Check whether we can remove a link victim from directory dir, check
  *  whether the type of victim is right.
@@ -2757,8 +2765,8 @@  static int may_delete(struct inode *dir, struct dentry *victim, bool isdir)
 	if (IS_APPEND(dir))
 		return -EPERM;
 
-	if (check_sticky(dir, inode) || IS_APPEND(inode) ||
-	    IS_IMMUTABLE(inode) || IS_SWAPFILE(inode) || HAS_UNMAPPED_ID(inode))
+	if (check_sticky(dir, inode) || !may_delete_stx_attributes(inode) ||
+	    IS_SWAPFILE(inode) || HAS_UNMAPPED_ID(inode))
 		return -EPERM;
 	if (isdir) {
 		if (!d_is_dir(victim))