[v7] Add a "nosymfollow" mount option.

Message ID 20200811222803.3224434-1-zwisler@google.com (mailing list archive)
State New, archived
Series [v7] Add a "nosymfollow" mount option.

Commit Message

Ross Zwisler Aug. 11, 2020, 10:28 p.m. UTC
From: Mattias Nissler <mnissler@chromium.org>

For mounts that have the new "nosymfollow" option, don't follow symlinks
when resolving paths. The new option is similar in spirit to the
existing "nodev", "noexec", and "nosuid" options, as well as to the
LOOKUP_NO_SYMLINKS resolve flag in the openat2(2) syscall. Various BSD
variants have been supporting the "nosymfollow" mount option for a long
time with equivalent implementations.

Note that symlinks may still be created on file systems mounted with
the "nosymfollow" option present. readlink() remains functional, so
user space code that is aware of symlinks can still choose to follow
them explicitly.

Setting the "nosymfollow" mount option helps prevent privileged
writers from modifying files unintentionally in case there is an
unexpected link along the accessed path. The "nosymfollow" option is
thus useful as a defensive measure for systems that need to deal with
untrusted file systems in privileged contexts.

More information on the history and motivation for this patch can be
found here:

https://sites.google.com/a/chromium.org/dev/chromium-os/chromiumos-design-docs/hardening-against-malicious-stateful-data#TOC-Restricting-symlink-traversal

Signed-off-by: Mattias Nissler <mnissler@chromium.org>
Signed-off-by: Ross Zwisler <zwisler@google.com>
---
Changes since v6 [1]:
 * Rebased onto v5.8.
 * Another round of testing including readlink(1), readlink(2),
   realpath(1), realpath(3), statfs(2) and mount(2) to make sure
   everything still works.

After this lands I will upstream changes to util-linux[2] and man-pages
[3].

[1]: https://lkml.org/lkml/2020/3/4/770
[2]: https://github.com/rzwisler/util-linux/commit/7f8771acd85edb70d97921c026c55e1e724d4e15
[3]: https://github.com/rzwisler/man-pages/commit/b8fe8079f64b5068940c0144586e580399a71668
---
 fs/namei.c                 | 3 ++-
 fs/namespace.c             | 2 ++
 fs/proc_namespace.c        | 1 +
 fs/statfs.c                | 2 ++
 include/linux/mount.h      | 3 ++-
 include/linux/statfs.h     | 1 +
 include/uapi/linux/mount.h | 1 +
 7 files changed, 11 insertions(+), 2 deletions(-)
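
For readers who want to observe the new flag from user space, here is a
minimal sketch. It assumes statfs(2) reports the bit in f_flags with the
ST_NOSYMFOLLOW value (0x2000) from the include/linux/statfs.h hunk. The
SKETCH_ST_NOSYMFOLLOW macro and both helper names are hypothetical; the
constant is defined locally because installed headers may not export it.

```c
#include <assert.h>
#include <sys/vfs.h>	/* statfs(2), struct statfs */

/* Assumption: local copy of the value added to include/linux/statfs.h. */
#define SKETCH_ST_NOSYMFOLLOW 0x2000UL

static_assert(SKETCH_ST_NOSYMFOLLOW == 0x2000UL, "value from the patch");

/* Pure helper: does an f_flags word carry the nosymfollow bit? */
static int flags_have_nosymfollow(unsigned long f_flags)
{
	return (f_flags & SKETCH_ST_NOSYMFOLLOW) != 0;
}

/*
 * Returns 1 if the mount containing `path` reports nosymfollow,
 * 0 if it does not, and -1 if statfs(2) itself fails.
 */
static int mount_has_nosymfollow(const char *path)
{
	struct statfs sfs;

	if (statfs(path, &sfs) != 0)
		return -1;
	return flags_have_nosymfollow((unsigned long)sfs.f_flags);
}
```

On a kernel without this patch the bit is simply never set, so the
helper returns 0 rather than failing.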

Comments

Aleksa Sarai Aug. 12, 2020, 1:43 a.m. UTC | #1
On 2020-08-11, Ross Zwisler <zwisler@chromium.org> wrote:
> From: Mattias Nissler <mnissler@chromium.org>
> 
> For mounts that have the new "nosymfollow" option, don't follow symlinks
> when resolving paths. The new option is similar in spirit to the
> existing "nodev", "noexec", and "nosuid" options, as well as to the
> LOOKUP_NO_SYMLINKS resolve flag in the openat2(2) syscall. Various BSD
> variants have been supporting the "nosymfollow" mount option for a long
> time with equivalent implementations.
> 
> Note that symlinks may still be created on file systems mounted with
> the "nosymfollow" option present. readlink() remains functional, so
> user space code that is aware of symlinks can still choose to follow
> them explicitly.
> 
> Setting the "nosymfollow" mount option helps prevent privileged
> writers from modifying files unintentionally in case there is an
> unexpected link along the accessed path. The "nosymfollow" option is
> thus useful as a defensive measure for systems that need to deal with
> untrusted file systems in privileged contexts.
> 
> More information on the history and motivation for this patch can be
> found here:
> 
> https://sites.google.com/a/chromium.org/dev/chromium-os/chromiumos-design-docs/hardening-against-malicious-stateful-data#TOC-Restricting-symlink-traversal

Looks good. Did you plan to add an in-tree test for this (you could
shove it in tools/testing/selftests/mount)?

Reviewed-by: Aleksa Sarai <cyphar@cyphar.com>

> Signed-off-by: Mattias Nissler <mnissler@chromium.org>
> Signed-off-by: Ross Zwisler <zwisler@google.com>
> ---
> Changes since v6 [1]:
>  * Rebased onto v5.8.
>  * Another round of testing including readlink(1), readlink(2),
>    realpath(1), realpath(3), statfs(2) and mount(2) to make sure
>    everything still works.
> 
> After this lands I will upstream changes to util-linux[2] and man-pages
> [3].
> 
> [1]: https://lkml.org/lkml/2020/3/4/770
> [2]: https://github.com/rzwisler/util-linux/commit/7f8771acd85edb70d97921c026c55e1e724d4e15
> [3]: https://github.com/rzwisler/man-pages/commit/b8fe8079f64b5068940c0144586e580399a71668
> ---
>  fs/namei.c                 | 3 ++-
>  fs/namespace.c             | 2 ++
>  fs/proc_namespace.c        | 1 +
>  fs/statfs.c                | 2 ++
>  include/linux/mount.h      | 3 ++-
>  include/linux/statfs.h     | 1 +
>  include/uapi/linux/mount.h | 1 +
>  7 files changed, 11 insertions(+), 2 deletions(-)
> 
> diff --git a/fs/namei.c b/fs/namei.c
> index 72d4219c93acb..ed68478fb1fb6 100644
> --- a/fs/namei.c
> +++ b/fs/namei.c
> @@ -1626,7 +1626,8 @@ static const char *pick_link(struct nameidata *nd, struct path *link,
>  			return ERR_PTR(error);
>  	}
>  
> -	if (unlikely(nd->flags & LOOKUP_NO_SYMLINKS))
> +	if (unlikely(nd->flags & LOOKUP_NO_SYMLINKS) ||
> +			unlikely(nd->path.mnt->mnt_flags & MNT_NOSYMFOLLOW))
>  		return ERR_PTR(-ELOOP);
>  
>  	if (!(nd->flags & LOOKUP_RCU)) {
> diff --git a/fs/namespace.c b/fs/namespace.c
> index 4a0f600a33285..1cbbf5a9b954f 100644
> --- a/fs/namespace.c
> +++ b/fs/namespace.c
> @@ -3167,6 +3167,8 @@ long do_mount(const char *dev_name, const char __user *dir_name,
>  		mnt_flags &= ~(MNT_RELATIME | MNT_NOATIME);
>  	if (flags & MS_RDONLY)
>  		mnt_flags |= MNT_READONLY;
> +	if (flags & MS_NOSYMFOLLOW)
> +		mnt_flags |= MNT_NOSYMFOLLOW;
>  
>  	/* The default atime for remount is preservation */
>  	if ((flags & MS_REMOUNT) &&
> diff --git a/fs/proc_namespace.c b/fs/proc_namespace.c
> index 3059a9394c2d6..e59d4bb3a89e4 100644
> --- a/fs/proc_namespace.c
> +++ b/fs/proc_namespace.c
> @@ -70,6 +70,7 @@ static void show_mnt_opts(struct seq_file *m, struct vfsmount *mnt)
>  		{ MNT_NOATIME, ",noatime" },
>  		{ MNT_NODIRATIME, ",nodiratime" },
>  		{ MNT_RELATIME, ",relatime" },
> +		{ MNT_NOSYMFOLLOW, ",nosymfollow" },
>  		{ 0, NULL }
>  	};
>  	const struct proc_fs_opts *fs_infop;
> diff --git a/fs/statfs.c b/fs/statfs.c
> index 2616424012ea7..59f33752c1311 100644
> --- a/fs/statfs.c
> +++ b/fs/statfs.c
> @@ -29,6 +29,8 @@ static int flags_by_mnt(int mnt_flags)
>  		flags |= ST_NODIRATIME;
>  	if (mnt_flags & MNT_RELATIME)
>  		flags |= ST_RELATIME;
> +	if (mnt_flags & MNT_NOSYMFOLLOW)
> +		flags |= ST_NOSYMFOLLOW;
>  	return flags;
>  }
>  
> diff --git a/include/linux/mount.h b/include/linux/mount.h
> index de657bd211fa6..aaf343b38671c 100644
> --- a/include/linux/mount.h
> +++ b/include/linux/mount.h
> @@ -30,6 +30,7 @@ struct fs_context;
>  #define MNT_NODIRATIME	0x10
>  #define MNT_RELATIME	0x20
>  #define MNT_READONLY	0x40	/* does the user want this to be r/o? */
> +#define MNT_NOSYMFOLLOW	0x80
>  
>  #define MNT_SHRINKABLE	0x100
>  #define MNT_WRITE_HOLD	0x200
> @@ -46,7 +47,7 @@ struct fs_context;
>  #define MNT_SHARED_MASK	(MNT_UNBINDABLE)
>  #define MNT_USER_SETTABLE_MASK  (MNT_NOSUID | MNT_NODEV | MNT_NOEXEC \
>  				 | MNT_NOATIME | MNT_NODIRATIME | MNT_RELATIME \
> -				 | MNT_READONLY)
> +				 | MNT_READONLY | MNT_NOSYMFOLLOW)
>  #define MNT_ATIME_MASK (MNT_NOATIME | MNT_NODIRATIME | MNT_RELATIME )
>  
>  #define MNT_INTERNAL_FLAGS (MNT_SHARED | MNT_WRITE_HOLD | MNT_INTERNAL | \
> diff --git a/include/linux/statfs.h b/include/linux/statfs.h
> index 9bc69edb8f188..fac4356ea1bfc 100644
> --- a/include/linux/statfs.h
> +++ b/include/linux/statfs.h
> @@ -40,6 +40,7 @@ struct kstatfs {
>  #define ST_NOATIME	0x0400	/* do not update access times */
>  #define ST_NODIRATIME	0x0800	/* do not update directory access times */
>  #define ST_RELATIME	0x1000	/* update atime relative to mtime/ctime */
> +#define ST_NOSYMFOLLOW	0x2000	/* do not follow symlinks */
>  
>  struct dentry;
>  extern int vfs_get_fsid(struct dentry *dentry, __kernel_fsid_t *fsid);
> diff --git a/include/uapi/linux/mount.h b/include/uapi/linux/mount.h
> index 96a0240f23fed..dd8306ea336c1 100644
> --- a/include/uapi/linux/mount.h
> +++ b/include/uapi/linux/mount.h
> @@ -16,6 +16,7 @@
>  #define MS_REMOUNT	32	/* Alter flags of a mounted FS */
>  #define MS_MANDLOCK	64	/* Allow mandatory locks on an FS */
>  #define MS_DIRSYNC	128	/* Directory modifications are synchronous */
> +#define MS_NOSYMFOLLOW	256	/* Do not follow symlinks */
>  #define MS_NOATIME	1024	/* Do not update access times. */
>  #define MS_NODIRATIME	2048	/* Do not update directory access times */
>  #define MS_BIND		4096
> -- 
> 2.28.0.236.gb10cc79966-goog
>

Ross Zwisler Aug. 12, 2020, 5:59 p.m. UTC | #2
On Tue, Aug 11, 2020 at 7:43 PM Aleksa Sarai <cyphar@cyphar.com> wrote:
> On 2020-08-11, Ross Zwisler <zwisler@chromium.org> wrote:
> > From: Mattias Nissler <mnissler@chromium.org>
> >
> > For mounts that have the new "nosymfollow" option, don't follow symlinks
> > when resolving paths. The new option is similar in spirit to the
> > existing "nodev", "noexec", and "nosuid" options, as well as to the
> > LOOKUP_NO_SYMLINKS resolve flag in the openat2(2) syscall. Various BSD
> > variants have been supporting the "nosymfollow" mount option for a long
> > time with equivalent implementations.
> >
> > Note that symlinks may still be created on file systems mounted with
> > the "nosymfollow" option present. readlink() remains functional, so
> > user space code that is aware of symlinks can still choose to follow
> > them explicitly.
> >
> > Setting the "nosymfollow" mount option helps prevent privileged
> > writers from modifying files unintentionally in case there is an
> > unexpected link along the accessed path. The "nosymfollow" option is
> > thus useful as a defensive measure for systems that need to deal with
> > untrusted file systems in privileged contexts.
> >
> > More information on the history and motivation for this patch can be
> > found here:
> >
> > https://sites.google.com/a/chromium.org/dev/chromium-os/chromiumos-design-docs/hardening-against-malicious-stateful-data#TOC-Restricting-symlink-traversal
>
> Looks good. Did you plan to add an in-tree test for this (you could
> shove it in tools/testing/selftests/mount)?

Sure, that sounds like a good idea.  I'll take a look.

> Reviewed-by: Aleksa Sarai <cyphar@cyphar.com>

Thank you for the review.

Matthew Wilcox Aug. 12, 2020, 6:35 p.m. UTC | #3
On Tue, Aug 11, 2020 at 04:28:03PM -0600, Ross Zwisler wrote:
> diff --git a/include/uapi/linux/mount.h b/include/uapi/linux/mount.h
> index 96a0240f23fed..dd8306ea336c1 100644
> --- a/include/uapi/linux/mount.h
> +++ b/include/uapi/linux/mount.h
> @@ -16,6 +16,7 @@
>  #define MS_REMOUNT	32	/* Alter flags of a mounted FS */
>  #define MS_MANDLOCK	64	/* Allow mandatory locks on an FS */
>  #define MS_DIRSYNC	128	/* Directory modifications are synchronous */
> +#define MS_NOSYMFOLLOW	256	/* Do not follow symlinks */
>  #define MS_NOATIME	1024	/* Do not update access times. */
>  #define MS_NODIRATIME	2048	/* Do not update directory access times */
>  #define MS_BIND		4096

Does this need to be added to MS_RMT_MASK too?

Ross Zwisler Aug. 12, 2020, 7:53 p.m. UTC | #4
On Wed, Aug 12, 2020 at 12:36 PM Matthew Wilcox <willy@infradead.org> wrote:
> On Tue, Aug 11, 2020 at 04:28:03PM -0600, Ross Zwisler wrote:
> > diff --git a/include/uapi/linux/mount.h b/include/uapi/linux/mount.h
> > index 96a0240f23fed..dd8306ea336c1 100644
> > --- a/include/uapi/linux/mount.h
> > +++ b/include/uapi/linux/mount.h
> > @@ -16,6 +16,7 @@
> >  #define MS_REMOUNT   32      /* Alter flags of a mounted FS */
> >  #define MS_MANDLOCK  64      /* Allow mandatory locks on an FS */
> >  #define MS_DIRSYNC   128     /* Directory modifications are synchronous */
> > +#define MS_NOSYMFOLLOW       256     /* Do not follow symlinks */
> >  #define MS_NOATIME   1024    /* Do not update access times. */
> >  #define MS_NODIRATIME        2048    /* Do not update directory access times */
> >  #define MS_BIND              4096
>
> Does this need to be added to MS_RMT_MASK too?

I don't think so, because IIUC that is only for "superblock flags",
i.e. flags which are part of the sb_flags definition in
do_mount()/path_mount()?

https://github.com/torvalds/linux/blob/master/fs/namespace.c#L3172

With the current code I'm able to remount and flip nosymfollow in both
directions without issue.  Similarly, MS_NOEXEC can be turned on and
off at will, and it's part of neither MS_RMT_MASK nor sb_flags.

Interestingly, though, if you change MS_RMT_MASK to be 0, I would
expect us to be unable to twiddle any of the flags which are currently
part of it, but that isn't the case.  I was still able to change
between RO/RW, and turn on lazytime.

So, I think this flag is working as expected, but that there is
probably a bug in there somewhere WRT the handling of MS_RMT_MASK.
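
As an illustration of flipping the flag on an existing mount, here is a
rough sketch using mount(2). It assumes a bind remount
(MS_REMOUNT | MS_BIND), which changes only per-mount flags; the message
above does not say which remount form was used for testing. The names
SKETCH_MS_NOSYMFOLLOW, remount_flags() and set_nosymfollow() are
hypothetical, and the constant is a local copy of the value from the
include/uapi/linux/mount.h hunk. The call needs CAP_SYS_ADMIN, and the
caller has to pass along any other per-mount flags (nosuid, nodev, ...)
that should stay set, since a remount replaces them.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/mount.h>	/* mount(2), MS_REMOUNT, MS_BIND */

/* Assumption: local copy of the value added by this patch. */
#define SKETCH_MS_NOSYMFOLLOW 256UL

/*
 * Build the mountflags word for a bind remount that keeps the other
 * per-mount flags in `other_flags` and sets or clears nosymfollow.
 */
static unsigned long remount_flags(unsigned long other_flags, int enable)
{
	unsigned long flags = MS_REMOUNT | MS_BIND;

	flags |= other_flags & ~SKETCH_MS_NOSYMFOLLOW;
	if (enable)
		flags |= SKETCH_MS_NOSYMFOLLOW;
	return flags;
}

/* Returns 0 on success, -1 with errno set on failure (see mount(2)). */
static int set_nosymfollow(const char *mountpoint,
			   unsigned long other_flags, int enable)
{
	return mount(NULL, mountpoint, NULL,
		     remount_flags(other_flags, enable), NULL);
}
```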
Patch

diff --git a/fs/namei.c b/fs/namei.c
index 72d4219c93acb..ed68478fb1fb6 100644
--- a/fs/namei.c
+++ b/fs/namei.c
@@ -1626,7 +1626,8 @@  static const char *pick_link(struct nameidata *nd, struct path *link,
 			return ERR_PTR(error);
 	}
 
-	if (unlikely(nd->flags & LOOKUP_NO_SYMLINKS))
+	if (unlikely(nd->flags & LOOKUP_NO_SYMLINKS) ||
+			unlikely(nd->path.mnt->mnt_flags & MNT_NOSYMFOLLOW))
 		return ERR_PTR(-ELOOP);
 
 	if (!(nd->flags & LOOKUP_RCU)) {
diff --git a/fs/namespace.c b/fs/namespace.c
index 4a0f600a33285..1cbbf5a9b954f 100644
--- a/fs/namespace.c
+++ b/fs/namespace.c
@@ -3167,6 +3167,8 @@  long do_mount(const char *dev_name, const char __user *dir_name,
 		mnt_flags &= ~(MNT_RELATIME | MNT_NOATIME);
 	if (flags & MS_RDONLY)
 		mnt_flags |= MNT_READONLY;
+	if (flags & MS_NOSYMFOLLOW)
+		mnt_flags |= MNT_NOSYMFOLLOW;
 
 	/* The default atime for remount is preservation */
 	if ((flags & MS_REMOUNT) &&
diff --git a/fs/proc_namespace.c b/fs/proc_namespace.c
index 3059a9394c2d6..e59d4bb3a89e4 100644
--- a/fs/proc_namespace.c
+++ b/fs/proc_namespace.c
@@ -70,6 +70,7 @@  static void show_mnt_opts(struct seq_file *m, struct vfsmount *mnt)
 		{ MNT_NOATIME, ",noatime" },
 		{ MNT_NODIRATIME, ",nodiratime" },
 		{ MNT_RELATIME, ",relatime" },
+		{ MNT_NOSYMFOLLOW, ",nosymfollow" },
 		{ 0, NULL }
 	};
 	const struct proc_fs_opts *fs_infop;
diff --git a/fs/statfs.c b/fs/statfs.c
index 2616424012ea7..59f33752c1311 100644
--- a/fs/statfs.c
+++ b/fs/statfs.c
@@ -29,6 +29,8 @@  static int flags_by_mnt(int mnt_flags)
 		flags |= ST_NODIRATIME;
 	if (mnt_flags & MNT_RELATIME)
 		flags |= ST_RELATIME;
+	if (mnt_flags & MNT_NOSYMFOLLOW)
+		flags |= ST_NOSYMFOLLOW;
 	return flags;
 }
 
diff --git a/include/linux/mount.h b/include/linux/mount.h
index de657bd211fa6..aaf343b38671c 100644
--- a/include/linux/mount.h
+++ b/include/linux/mount.h
@@ -30,6 +30,7 @@  struct fs_context;
 #define MNT_NODIRATIME	0x10
 #define MNT_RELATIME	0x20
 #define MNT_READONLY	0x40	/* does the user want this to be r/o? */
+#define MNT_NOSYMFOLLOW	0x80
 
 #define MNT_SHRINKABLE	0x100
 #define MNT_WRITE_HOLD	0x200
@@ -46,7 +47,7 @@  struct fs_context;
 #define MNT_SHARED_MASK	(MNT_UNBINDABLE)
 #define MNT_USER_SETTABLE_MASK  (MNT_NOSUID | MNT_NODEV | MNT_NOEXEC \
 				 | MNT_NOATIME | MNT_NODIRATIME | MNT_RELATIME \
-				 | MNT_READONLY)
+				 | MNT_READONLY | MNT_NOSYMFOLLOW)
 #define MNT_ATIME_MASK (MNT_NOATIME | MNT_NODIRATIME | MNT_RELATIME )
 
 #define MNT_INTERNAL_FLAGS (MNT_SHARED | MNT_WRITE_HOLD | MNT_INTERNAL | \
diff --git a/include/linux/statfs.h b/include/linux/statfs.h
index 9bc69edb8f188..fac4356ea1bfc 100644
--- a/include/linux/statfs.h
+++ b/include/linux/statfs.h
@@ -40,6 +40,7 @@  struct kstatfs {
 #define ST_NOATIME	0x0400	/* do not update access times */
 #define ST_NODIRATIME	0x0800	/* do not update directory access times */
 #define ST_RELATIME	0x1000	/* update atime relative to mtime/ctime */
+#define ST_NOSYMFOLLOW	0x2000	/* do not follow symlinks */
 
 struct dentry;
 extern int vfs_get_fsid(struct dentry *dentry, __kernel_fsid_t *fsid);
diff --git a/include/uapi/linux/mount.h b/include/uapi/linux/mount.h
index 96a0240f23fed..dd8306ea336c1 100644
--- a/include/uapi/linux/mount.h
+++ b/include/uapi/linux/mount.h
@@ -16,6 +16,7 @@ 
 #define MS_REMOUNT	32	/* Alter flags of a mounted FS */
 #define MS_MANDLOCK	64	/* Allow mandatory locks on an FS */
 #define MS_DIRSYNC	128	/* Directory modifications are synchronous */
+#define MS_NOSYMFOLLOW	256	/* Do not follow symlinks */
 #define MS_NOATIME	1024	/* Do not update access times. */
 #define MS_NODIRATIME	2048	/* Do not update directory access times */
 #define MS_BIND		4096
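
The patch introduces the same option in three flag namespaces: MS_* for
mount(2), MNT_* for the in-kernel vfsmount, and ST_* for statfs(2). The
following user-space sketch mirrors only the two translation steps
added in fs/namespace.c and fs/statfs.c, for this one flag. The
SKETCH_-prefixed macros are local copies of the values in the hunks
above, and the function names are hypothetical, not kernel symbols.

```c
#include <assert.h>

/* Local copies of the three values added by the patch. */
#define SKETCH_MS_NOSYMFOLLOW	256	/* include/uapi/linux/mount.h */
#define SKETCH_MNT_NOSYMFOLLOW	0x80	/* include/linux/mount.h */
#define SKETCH_ST_NOSYMFOLLOW	0x2000	/* include/linux/statfs.h */

/* The bit positions differ, so each layer needs an explicit translation. */
static_assert(SKETCH_MS_NOSYMFOLLOW != SKETCH_MNT_NOSYMFOLLOW,
	      "MS_ and MNT_ values are distinct");
static_assert(SKETCH_MNT_NOSYMFOLLOW != SKETCH_ST_NOSYMFOLLOW,
	      "MNT_ and ST_ values are distinct");

/* Mirrors the two lines added to do_mount(): MS_* to MNT_*. */
static int ms_to_mnt(unsigned long ms_flags)
{
	int mnt_flags = 0;

	if (ms_flags & SKETCH_MS_NOSYMFOLLOW)
		mnt_flags |= SKETCH_MNT_NOSYMFOLLOW;
	return mnt_flags;
}

/* Mirrors the two lines added to flags_by_mnt(): MNT_* to ST_*. */
static int mnt_to_st(int mnt_flags)
{
	int flags = 0;

	if (mnt_flags & SKETCH_MNT_NOSYMFOLLOW)
		flags |= SKETCH_ST_NOSYMFOLLOW;
	return flags;
}
```

All other flags are deliberately ignored here; the real functions
handle the full set.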