fanotify: fix permission model of unprivileged group

Message ID 20210522091916.196741-1-amir73il@gmail.com (mailing list archive)
State New, archived

Commit Message

Amir Goldstein May 22, 2021, 9:19 a.m. UTC
Reporting event->pid should depend on the privileges of the user that
initialized the group, not the privileges of the user reading the
events.

Use an internal group flag FANOTIFY_UNPRIV to record the fact that the
group was initialized by an unprivileged user.

To be on the safe side, the permissions to set up filesystem and mount
marks now require that both the user that initialized the group and
the user setting up the mark have CAP_SYS_ADMIN.

Fixes: 7cea2a3c505e ("fanotify: support limited functionality for unprivileged users")
Signed-off-by: Amir Goldstein <amir73il@gmail.com>
---

Jan,

The original RFC [1] used the internal flag to check permissions for:
1. Reporting event->pid
2. Reporting event->fd
3. Setting up sb/mount marks

Although we discussed only adding the check for #1, I left all those
checks.

The check for #2 is redundant, but it feels safer to be
defensive to protect against leaked fds.

The check for #3 was added in addition to the existing permission checks
because it feels right. Let me know if you disagree.

I've adjusted Matthew's LTP test [2] to check case #1.

Thanks,
Amir.


[1] https://lore.kernel.org/linux-fsdevel/20210124184204.899729-3-amir73il@gmail.com/
[2] https://github.com/amir73il/ltp/commits/fanotify_unpriv

 fs/notify/fanotify/fanotify_user.c | 30 ++++++++++++++++++++++++------
 fs/notify/fdinfo.c                 |  2 +-
 include/linux/fanotify.h           |  4 ++++
 3 files changed, 29 insertions(+), 7 deletions(-)
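
For readers following the commit message, the rule for reporting
event->pid can be modelled in userspace roughly as follows. This is only
a sketch under stated assumptions, not the kernel code: the struct and
the sketch_* names are hypothetical, and the boolean creator_is_admin
stands in for the capable(CAP_SYS_ADMIN) check made in fanotify_init().

```c
#include <assert.h>
#include <stdbool.h>

/* Mirrors the FANOTIFY_UNPRIV value from the patch; the name is hypothetical. */
#define SKETCH_UNPRIV 0x80000000u

/* Hypothetical stand-in for the flags kept in group->fanotify_data. */
struct sketch_group {
	unsigned int flags;
};

/*
 * Record at init time whether the creator lacked privilege.
 * Callers must not pass the internal bit themselves.
 */
static struct sketch_group sketch_group_init(unsigned int user_flags,
					     bool creator_is_admin)
{
	struct sketch_group g;

	assert(!(user_flags & SKETCH_UNPRIV));
	g.flags = user_flags;
	if (!creator_is_admin)
		g.flags |= SKETCH_UNPRIV;
	return g;
}

/*
 * The pid reported to the reader depends on how the group was created,
 * not on who happens to be reading the event.
 */
static int sketch_reported_pid(const struct sketch_group *g,
			       int reader_tgid, int event_pid)
{
	if ((g->flags & SKETCH_UNPRIV) && reader_tgid != event_pid)
		return 0;
	return event_pid;
}
```

In this model, a group created without privilege hides the pids of other
processes even if its file descriptor is later read by a privileged task,
which is the behaviour the commit message asks for.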

Comments

Matthew Bobrowski May 24, 2021, 8:41 a.m. UTC | #1
On Sat, May 22, 2021 at 12:19:16PM +0300, Amir Goldstein wrote:
> Reporting event->pid should depend on the privileges of the user that
> initialized the group, not the privileges of the user reading the
> events.
> 
> Use an internal group flag FANOTIFY_UNPRIV to record the fact that the
> group was initialized by an unprivileged user.
> 
> To be on the safe side, the permissions to set up filesystem and mount
> marks now require that both the user that initialized the group and
> the user setting up the mark have CAP_SYS_ADMIN.
> 
> Fixes: 7cea2a3c505e ("fanotify: support limited functionality for unprivileged users")
> Signed-off-by: Amir Goldstein <amir73il@gmail.com>

Thanks for sending through this patch Amir!

In general, the patch looks good to me, however there's just a few
nits below.

> diff --git a/fs/notify/fanotify/fanotify_user.c b/fs/notify/fanotify/fanotify_user.c
> index 71fefb30e015..7df6cba4a06d 100644
> --- a/fs/notify/fanotify/fanotify_user.c
> +++ b/fs/notify/fanotify/fanotify_user.c
> @@ -424,11 +424,18 @@ static ssize_t copy_event_to_user(struct fsnotify_group *group,
>  	 * events generated by the listener process itself, without disclosing
>  	 * the pids of other processes.
>  	 */
> -	if (!capable(CAP_SYS_ADMIN) &&
> +	if (FAN_GROUP_FLAG(group, FANOTIFY_UNPRIV) &&
>  	    task_tgid(current) != event->pid)
>  		metadata.pid = 0;
>  
> -	if (path && path->mnt && path->dentry) {
> +	/*
> +	 * For now, we require fid mode for unprivileged listener, which does
> +	 * record path events, but keep this check for safety in case we want
> +	 * to allow unprivileged listener to get events with no fd and no fid
> +	 * in the future.
> +	 */

I think it's best if we keep clear of using first person in our
comments throughout our code base. Maybe we could change this to:

* For now, fid mode is required for an unprivileged listener, which
  does record path events. However, this check must be kept...

> +	if (!FAN_GROUP_FLAG(group, FANOTIFY_UNPRIV) &&
> +	    path && path->mnt && path->dentry) {
>  		fd = create_fd(group, path, &f);
>  		if (fd < 0)
>  			return fd;
> @@ -1040,6 +1047,7 @@ SYSCALL_DEFINE2(fanotify_init, unsigned int, flags, unsigned int, event_f_flags)
>  	int f_flags, fd;
>  	unsigned int fid_mode = flags & FANOTIFY_FID_BITS;
>  	unsigned int class = flags & FANOTIFY_CLASS_BITS;
> +	unsigned int internal_flags = 0;
>  
>  	pr_debug("%s: flags=%x event_f_flags=%x\n",
>  		 __func__, flags, event_f_flags);
> @@ -1053,6 +1061,13 @@ SYSCALL_DEFINE2(fanotify_init, unsigned int, flags, unsigned int, event_f_flags)
>  		 */
>  		if ((flags & FANOTIFY_ADMIN_INIT_FLAGS) || !fid_mode)
>  			return -EPERM;
> +
> +		/*
> +		 * We set the internal flag FANOTIFY_UNPRIV on the group, so we
> +		 * know that we need to limit setting mount/filesystem marks on
> +		 * this group and avoid providing pid and open fd in the event.
> +		 */

Same comment as above applies here. This could be changed to:

* Set the internal FANOTIFY_UNPRIV flag for this notification group so
  that certain restrictions can be enforced upon it. This includes
  things like not permitting an unprivileged user from setting up
  mount/filesystem scoped marks and not returning an open file
  descriptor or pid meta-information within an event.

You can make it shorter if you like, but you get the drift.

> +		internal_flags |= FANOTIFY_UNPRIV;
>  	}
>  
>  #ifdef CONFIG_AUDITSYSCALL
> @@ -1105,7 +1120,7 @@ SYSCALL_DEFINE2(fanotify_init, unsigned int, flags, unsigned int, event_f_flags)
>  		goto out_destroy_group;
>  	}
>  
> -	group->fanotify_data.flags = flags;
> +	group->fanotify_data.flags = flags | internal_flags;
>  	group->memcg = get_mem_cgroup_from_mm(current->mm);
>  
>  	group->fanotify_data.merge_hash = fanotify_alloc_merge_hash();
> @@ -1305,11 +1320,13 @@ static int do_fanotify_mark(int fanotify_fd, unsigned int flags, __u64 mask,
>  	group = f.file->private_data;
>  
>  	/*
> -	 * An unprivileged user is not allowed to watch a mount point nor
> -	 * a filesystem.
> +	 * An unprivileged user is not allowed to setup mount point nor
  	      		   	       	       	  	      	   ^
								   s
> +	 * filesystem marks. It is not allowed to setup those marks for
> +	 * a group that was initialized by an unprivileged user.

I think the second sentence would better read as:

       * This also includes setting up such marks by a group that was
         initialized by an unprivileged user.

>  	ret = -EPERM;
> -	if (!capable(CAP_SYS_ADMIN) &&
> +	if ((!capable(CAP_SYS_ADMIN) ||
> +	     FAN_GROUP_FLAG(group, FANOTIFY_UNPRIV)) &&

...

> diff --git a/fs/notify/fdinfo.c b/fs/notify/fdinfo.c
> index a712b2aaa9ac..57f0d5d9f934 100644
> --- a/fs/notify/fdinfo.c
> +++ b/fs/notify/fdinfo.c
> @@ -144,7 +144,7 @@ void fanotify_show_fdinfo(struct seq_file *m, struct file *f)
>  	struct fsnotify_group *group = f->private_data;
>  
>  	seq_printf(m, "fanotify flags:%x event-flags:%x\n",
> -		   group->fanotify_data.flags,
> +		   group->fanotify_data.flags & FANOTIFY_INIT_FLAGS,
>  		   group->fanotify_data.f_flags);

I feel like the internal initialization flags have been dropped off
here as FANOTIFY_INIT_FLAGS technically wouldn't cover all flags
present in group->fanotify_data.flags with FANOTIFY_UNPRIV, right?

>  	show_fdinfo(m, f, fanotify_fdinfo);
> diff --git a/include/linux/fanotify.h b/include/linux/fanotify.h
> index bad41bcb25df..f277d1c4e6b8 100644
> --- a/include/linux/fanotify.h
> +++ b/include/linux/fanotify.h
> @@ -51,6 +51,10 @@ extern struct ctl_table fanotify_table[]; /* for sysctl */
>  #define FANOTIFY_INIT_FLAGS	(FANOTIFY_ADMIN_INIT_FLAGS | \
>  				 FANOTIFY_USER_INIT_FLAGS)
>  
> +/* Internal flags */
> +#define FANOTIFY_UNPRIV		0x80000000
> +#define FANOTIFY_INTERNAL_FLAGS	(FANOTIFY_UNPRIV)

Should we be more distinct here i.e. FANOTIFY_INTERNAL_INIT_FLAGS?
Just thinking about a possible case where there's some other internal
fanotify flags that are used for something else?

>  #define FANOTIFY_MARK_TYPE_BITS	(FAN_MARK_INODE | FAN_MARK_MOUNT | \
>  				 FAN_MARK_FILESYSTEM)

/M
Jan Kara May 24, 2021, 10 a.m. UTC | #2
On Mon 24-05-21 18:41:36, Matthew Bobrowski wrote:
> On Sat, May 22, 2021 at 12:19:16PM +0300, Amir Goldstein wrote:
> > Reporting event->pid should depend on the privileges of the user that
> > initialized the group, not the privileges of the user reading the
> > events.
> > 
> > Use an internal group flag FANOTIFY_UNPRIV to record the fact that the
> > group was initialized by an unprivileged user.
> > 
> > To be on the safe side, the permissions to set up filesystem and mount
> > marks now require that both the user that initialized the group and
> > the user setting up the mark have CAP_SYS_ADMIN.
> > 
> > Fixes: 7cea2a3c505e ("fanotify: support limited functionality for unprivileged users")
> > Signed-off-by: Amir Goldstein <amir73il@gmail.com>
> 
> Thanks for sending through this patch Amir!
> 
> In general, the patch looks good to me, however there's just a few
> nits below.
> 
> > diff --git a/fs/notify/fanotify/fanotify_user.c b/fs/notify/fanotify/fanotify_user.c
> > index 71fefb30e015..7df6cba4a06d 100644
> > --- a/fs/notify/fanotify/fanotify_user.c
> > +++ b/fs/notify/fanotify/fanotify_user.c
> > @@ -424,11 +424,18 @@ static ssize_t copy_event_to_user(struct fsnotify_group *group,
> >  	 * events generated by the listener process itself, without disclosing
> >  	 * the pids of other processes.
> >  	 */
> > -	if (!capable(CAP_SYS_ADMIN) &&
> > +	if (FAN_GROUP_FLAG(group, FANOTIFY_UNPRIV) &&
> >  	    task_tgid(current) != event->pid)
> >  		metadata.pid = 0;
> >  
> > -	if (path && path->mnt && path->dentry) {
> > +	/*
> > +	 * For now, we require fid mode for unprivileged listener, which does
> > +	 * record path events, but keep this check for safety in case we want
> > +	 * to allow unprivileged listener to get events with no fd and no fid
> > +	 * in the future.
> > +	 */
> 
> I think it's best if we keep clear of using first person in our
> comments throughout our code base. Maybe we could change this to:
> 
> * For now, fid mode is required for an unprivileged listener, which
>   does record path events. However, this check must be kept...

Actually, I have no problem with the first person in comments. It is a
standard "anonymous" language and IMO easy to understand as well. Also
frequently used in the kernel AFAICT. What problem do you see with the
first person? I'm well aware that unlike us you are a native speaker ;)

> > @@ -1305,11 +1320,13 @@ static int do_fanotify_mark(int fanotify_fd, unsigned int flags, __u64 mask,
> >  	group = f.file->private_data;
> >  
> >  	/*
> > -	 * An unprivileged user is not allowed to watch a mount point nor
> > -	 * a filesystem.
> > +	 * An unprivileged user is not allowed to setup mount point nor
>   	      		   	       	       	  	      	   ^
> 								   s
> > +	 * filesystem marks. It is not allowed to setup those marks for
> > +	 * a group that was initialized by an unprivileged user.
> 
> I think the second sentence would better read as:
> 
>        * This also includes setting up such marks by a group that was
>          initialized by an unprivileged user.

Yes, this is probably better.

> >  	show_fdinfo(m, f, fanotify_fdinfo);
> > diff --git a/include/linux/fanotify.h b/include/linux/fanotify.h
> > index bad41bcb25df..f277d1c4e6b8 100644
> > --- a/include/linux/fanotify.h
> > +++ b/include/linux/fanotify.h
> > @@ -51,6 +51,10 @@ extern struct ctl_table fanotify_table[]; /* for sysctl */
> >  #define FANOTIFY_INIT_FLAGS	(FANOTIFY_ADMIN_INIT_FLAGS | \
> >  				 FANOTIFY_USER_INIT_FLAGS)
> >  
> > +/* Internal flags */
> > +#define FANOTIFY_UNPRIV		0x80000000
> > +#define FANOTIFY_INTERNAL_FLAGS	(FANOTIFY_UNPRIV)
> 
> Should we be more distinct here i.e. FANOTIFY_INTERNAL_INIT_FLAGS?
> Just thinking about a possible case where there's some other internal
> fanotify flags that are used for something else?

Well, do we need to distinguish different uses of internal flags? I don't
think so...

								Honza
Jan Kara May 24, 2021, 10:02 a.m. UTC | #3
On Sat 22-05-21 12:19:16, Amir Goldstein wrote:
> Reporting event->pid should depend on the privileges of the user that
> initialized the group, not the privileges of the user reading the
> events.
> 
> Use an internal group flag FANOTIFY_UNPRIV to record the fact that the
> group was initialized by an unprivileged user.
> 
> To be on the safe side, the permissions to set up filesystem and mount
> marks now require that both the user that initialized the group and
> the user setting up the mark have CAP_SYS_ADMIN.
> 
> Fixes: 7cea2a3c505e ("fanotify: support limited functionality for unprivileged users")
> Signed-off-by: Amir Goldstein <amir73il@gmail.com>
> ---
> 
> Jan,
> 
> The original RFC [1] used the internal flag to check permissions for:
> 1. Reporting event->pid
> 2. Reporting event->fd
> 3. Setting up sb/mount marks
> 
> Although we discussed only adding the check for #1, I left all those
> checks.
> 
> The check for #2 is redundant, but it feels safer to be
> defensive to protect against leaked fds.
> 
> The check for #3 was added in addition to the existing permission checks
> because it feels right. Let me know if you disagree.
> 
> I've adjusted Matthew's LTP test [2] to check case #1.

Thanks! Modulo those language nits from Matthew the patch looks good to me.

								Honza

> [1] https://lore.kernel.org/linux-fsdevel/20210124184204.899729-3-amir73il@gmail.com/
> [2] https://github.com/amir73il/ltp/commits/fanotify_unpriv
> 
>  fs/notify/fanotify/fanotify_user.c | 30 ++++++++++++++++++++++++------
>  fs/notify/fdinfo.c                 |  2 +-
>  include/linux/fanotify.h           |  4 ++++
>  3 files changed, 29 insertions(+), 7 deletions(-)
> 
> diff --git a/fs/notify/fanotify/fanotify_user.c b/fs/notify/fanotify/fanotify_user.c
> index 71fefb30e015..7df6cba4a06d 100644
> --- a/fs/notify/fanotify/fanotify_user.c
> +++ b/fs/notify/fanotify/fanotify_user.c
> @@ -424,11 +424,18 @@ static ssize_t copy_event_to_user(struct fsnotify_group *group,
>  	 * events generated by the listener process itself, without disclosing
>  	 * the pids of other processes.
>  	 */
> -	if (!capable(CAP_SYS_ADMIN) &&
> +	if (FAN_GROUP_FLAG(group, FANOTIFY_UNPRIV) &&
>  	    task_tgid(current) != event->pid)
>  		metadata.pid = 0;
>  
> -	if (path && path->mnt && path->dentry) {
> +	/*
> +	 * For now, we require fid mode for unprivileged listener, which does
> +	 * record path events, but keep this check for safety in case we want
> +	 * to allow unprivileged listener to get events with no fd and no fid
> +	 * in the future.
> +	 */
> +	if (!FAN_GROUP_FLAG(group, FANOTIFY_UNPRIV) &&
> +	    path && path->mnt && path->dentry) {
>  		fd = create_fd(group, path, &f);
>  		if (fd < 0)
>  			return fd;
> @@ -1040,6 +1047,7 @@ SYSCALL_DEFINE2(fanotify_init, unsigned int, flags, unsigned int, event_f_flags)
>  	int f_flags, fd;
>  	unsigned int fid_mode = flags & FANOTIFY_FID_BITS;
>  	unsigned int class = flags & FANOTIFY_CLASS_BITS;
> +	unsigned int internal_flags = 0;
>  
>  	pr_debug("%s: flags=%x event_f_flags=%x\n",
>  		 __func__, flags, event_f_flags);
> @@ -1053,6 +1061,13 @@ SYSCALL_DEFINE2(fanotify_init, unsigned int, flags, unsigned int, event_f_flags)
>  		 */
>  		if ((flags & FANOTIFY_ADMIN_INIT_FLAGS) || !fid_mode)
>  			return -EPERM;
> +
> +		/*
> +		 * We set the internal flag FANOTIFY_UNPRIV on the group, so we
> +		 * know that we need to limit setting mount/filesystem marks on
> +		 * this group and avoid providing pid and open fd in the event.
> +		 */
> +		internal_flags |= FANOTIFY_UNPRIV;
>  	}
>  
>  #ifdef CONFIG_AUDITSYSCALL
> @@ -1105,7 +1120,7 @@ SYSCALL_DEFINE2(fanotify_init, unsigned int, flags, unsigned int, event_f_flags)
>  		goto out_destroy_group;
>  	}
>  
> -	group->fanotify_data.flags = flags;
> +	group->fanotify_data.flags = flags | internal_flags;
>  	group->memcg = get_mem_cgroup_from_mm(current->mm);
>  
>  	group->fanotify_data.merge_hash = fanotify_alloc_merge_hash();
> @@ -1305,11 +1320,13 @@ static int do_fanotify_mark(int fanotify_fd, unsigned int flags, __u64 mask,
>  	group = f.file->private_data;
>  
>  	/*
> -	 * An unprivileged user is not allowed to watch a mount point nor
> -	 * a filesystem.
> +	 * An unprivileged user is not allowed to setup mount point nor
> +	 * filesystem marks. It is not allowed to setup those marks for
> +	 * a group that was initialized by an unprivileged user.
>  	 */
>  	ret = -EPERM;
> -	if (!capable(CAP_SYS_ADMIN) &&
> +	if ((!capable(CAP_SYS_ADMIN) ||
> +	     FAN_GROUP_FLAG(group, FANOTIFY_UNPRIV)) &&
>  	    mark_type != FAN_MARK_INODE)
>  		goto fput_and_out;
>  
> @@ -1460,6 +1477,7 @@ static int __init fanotify_user_setup(void)
>  	max_marks = clamp(max_marks, FANOTIFY_OLD_DEFAULT_MAX_MARKS,
>  				     FANOTIFY_DEFAULT_MAX_USER_MARKS);
>  
> +	BUILD_BUG_ON(FANOTIFY_INIT_FLAGS & FANOTIFY_INTERNAL_FLAGS);
>  	BUILD_BUG_ON(HWEIGHT32(FANOTIFY_INIT_FLAGS) != 10);
>  	BUILD_BUG_ON(HWEIGHT32(FANOTIFY_MARK_FLAGS) != 9);
>  
> diff --git a/fs/notify/fdinfo.c b/fs/notify/fdinfo.c
> index a712b2aaa9ac..57f0d5d9f934 100644
> --- a/fs/notify/fdinfo.c
> +++ b/fs/notify/fdinfo.c
> @@ -144,7 +144,7 @@ void fanotify_show_fdinfo(struct seq_file *m, struct file *f)
>  	struct fsnotify_group *group = f->private_data;
>  
>  	seq_printf(m, "fanotify flags:%x event-flags:%x\n",
> -		   group->fanotify_data.flags,
> +		   group->fanotify_data.flags & FANOTIFY_INIT_FLAGS,
>  		   group->fanotify_data.f_flags);
>  
>  	show_fdinfo(m, f, fanotify_fdinfo);
> diff --git a/include/linux/fanotify.h b/include/linux/fanotify.h
> index bad41bcb25df..f277d1c4e6b8 100644
> --- a/include/linux/fanotify.h
> +++ b/include/linux/fanotify.h
> @@ -51,6 +51,10 @@ extern struct ctl_table fanotify_table[]; /* for sysctl */
>  #define FANOTIFY_INIT_FLAGS	(FANOTIFY_ADMIN_INIT_FLAGS | \
>  				 FANOTIFY_USER_INIT_FLAGS)
>  
> +/* Internal flags */
> +#define FANOTIFY_UNPRIV		0x80000000
> +#define FANOTIFY_INTERNAL_FLAGS	(FANOTIFY_UNPRIV)
> +
>  #define FANOTIFY_MARK_TYPE_BITS	(FAN_MARK_INODE | FAN_MARK_MOUNT | \
>  				 FAN_MARK_FILESYSTEM)
>  
> -- 
> 2.31.1
>
Amir Goldstein May 24, 2021, 10:40 a.m. UTC | #4
On Mon, May 24, 2021 at 11:41 AM Matthew Bobrowski <repnop@google.com> wrote:
>
> On Sat, May 22, 2021 at 12:19:16PM +0300, Amir Goldstein wrote:
> > Reporting event->pid should depend on the privileges of the user that
> > initialized the group, not the privileges of the user reading the
> > events.
> >
> > Use an internal group flag FANOTIFY_UNPRIV to record the fact that the
> > group was initialized by an unprivileged user.
> >
> > To be on the safe side, the permissions to set up filesystem and mount
> > marks now require that both the user that initialized the group and
> > the user setting up the mark have CAP_SYS_ADMIN.
> >
> > Fixes: 7cea2a3c505e ("fanotify: support limited functionality for unprivileged users")
> > Signed-off-by: Amir Goldstein <amir73il@gmail.com>
>
> Thanks for sending through this patch Amir!
>
> In general, the patch looks good to me, however there's just a few
> nits below.
>
> > diff --git a/fs/notify/fanotify/fanotify_user.c b/fs/notify/fanotify/fanotify_user.c
> > index 71fefb30e015..7df6cba4a06d 100644
> > --- a/fs/notify/fanotify/fanotify_user.c
> > +++ b/fs/notify/fanotify/fanotify_user.c
> > @@ -424,11 +424,18 @@ static ssize_t copy_event_to_user(struct fsnotify_group *group,
> >        * events generated by the listener process itself, without disclosing
> >        * the pids of other processes.
> >        */
> > -     if (!capable(CAP_SYS_ADMIN) &&
> > +     if (FAN_GROUP_FLAG(group, FANOTIFY_UNPRIV) &&
> >           task_tgid(current) != event->pid)
> >               metadata.pid = 0;
> >
> > -     if (path && path->mnt && path->dentry) {
> > +     /*
> > +      * For now, we require fid mode for unprivileged listener, which does
> > +      * record path events, but keep this check for safety in case we want
> > +      * to allow unprivileged listener to get events with no fd and no fid
> > +      * in the future.
> > +      */
>
> I think it's best if we keep clear of using first person in our
> comments throughout our code base. Maybe we could change this to:
>
> * For now, fid mode is required for an unprivileged listener, which
>   does record path events. However, this check must be kept...
>
> > +     if (!FAN_GROUP_FLAG(group, FANOTIFY_UNPRIV) &&
> > +         path && path->mnt && path->dentry) {
> >               fd = create_fd(group, path, &f);
> >               if (fd < 0)
> >                       return fd;
> > @@ -1040,6 +1047,7 @@ SYSCALL_DEFINE2(fanotify_init, unsigned int, flags, unsigned int, event_f_flags)
> >       int f_flags, fd;
> >       unsigned int fid_mode = flags & FANOTIFY_FID_BITS;
> >       unsigned int class = flags & FANOTIFY_CLASS_BITS;
> > +     unsigned int internal_flags = 0;
> >
> >       pr_debug("%s: flags=%x event_f_flags=%x\n",
> >                __func__, flags, event_f_flags);
> > @@ -1053,6 +1061,13 @@ SYSCALL_DEFINE2(fanotify_init, unsigned int, flags, unsigned int, event_f_flags)
> >                */
> >               if ((flags & FANOTIFY_ADMIN_INIT_FLAGS) || !fid_mode)
> >                       return -EPERM;
> > +
> > +             /*
> > +              * We set the internal flag FANOTIFY_UNPRIV on the group, so we
> > +              * know that we need to limit setting mount/filesystem marks on
> > +              * this group and avoid providing pid and open fd in the event.
> > +              */
>
> Same comment as above applies here. This could be changed to:
>
> * Set the internal FANOTIFY_UNPRIV flag for this notification group so
>   that certain restrictions can be enforced upon it. This includes
>   things like not permitting an unprivileged user from setting up
>   mount/filesystem scoped marks and not returning an open file
>   descriptor or pid meta-information within an event.
>
> You can make it shorter if you like, but you get the drift.
>
> > +             internal_flags |= FANOTIFY_UNPRIV;
> >       }
> >
> >  #ifdef CONFIG_AUDITSYSCALL
> > @@ -1105,7 +1120,7 @@ SYSCALL_DEFINE2(fanotify_init, unsigned int, flags, unsigned int, event_f_flags)
> >               goto out_destroy_group;
> >       }
> >
> > -     group->fanotify_data.flags = flags;
> > +     group->fanotify_data.flags = flags | internal_flags;
> >       group->memcg = get_mem_cgroup_from_mm(current->mm);
> >
> >       group->fanotify_data.merge_hash = fanotify_alloc_merge_hash();
> > @@ -1305,11 +1320,13 @@ static int do_fanotify_mark(int fanotify_fd, unsigned int flags, __u64 mask,
> >       group = f.file->private_data;
> >
> >       /*
> > -      * An unprivileged user is not allowed to watch a mount point nor
> > -      * a filesystem.
> > +      * An unprivileged user is not allowed to setup mount point nor
>                                                                    ^
>                                                                    s
> > +      * filesystem marks. It is not allowed to setup those marks for
> > +      * a group that was initialized by an unprivileged user.
>
> I think the second sentence would better read as:
>
>        * This also includes setting up such marks by a group that was
>          initialized by an unprivileged user.
>
> >       ret = -EPERM;
> > -     if (!capable(CAP_SYS_ADMIN) &&
> > +     if ((!capable(CAP_SYS_ADMIN) ||
> > +          FAN_GROUP_FLAG(group, FANOTIFY_UNPRIV)) &&
>
> ...
>
> > diff --git a/fs/notify/fdinfo.c b/fs/notify/fdinfo.c
> > index a712b2aaa9ac..57f0d5d9f934 100644
> > --- a/fs/notify/fdinfo.c
> > +++ b/fs/notify/fdinfo.c
> > @@ -144,7 +144,7 @@ void fanotify_show_fdinfo(struct seq_file *m, struct file *f)
> >       struct fsnotify_group *group = f->private_data;
> >
> >       seq_printf(m, "fanotify flags:%x event-flags:%x\n",
> > -                group->fanotify_data.flags,
> > +                group->fanotify_data.flags & FANOTIFY_INIT_FLAGS,
> >                  group->fanotify_data.f_flags);
>
> I feel like the internal initialization flags have been dropped off
> here as FANOTIFY_INIT_FLAGS technically wouldn't cover all flags
> present in group->fanotify_data.flags with FANOTIFY_UNPRIV, right?
>

Right. CRIU reads those values and tries to restore the same
fanotify group on a new running instance, so we must not export flags
not allowed by fanotify_init().


> >       show_fdinfo(m, f, fanotify_fdinfo);
> > diff --git a/include/linux/fanotify.h b/include/linux/fanotify.h
> > index bad41bcb25df..f277d1c4e6b8 100644
> > --- a/include/linux/fanotify.h
> > +++ b/include/linux/fanotify.h
> > @@ -51,6 +51,10 @@ extern struct ctl_table fanotify_table[]; /* for sysctl */
> >  #define FANOTIFY_INIT_FLAGS  (FANOTIFY_ADMIN_INIT_FLAGS | \
> >                                FANOTIFY_USER_INIT_FLAGS)
> >
> > +/* Internal flags */
> > +#define FANOTIFY_UNPRIV              0x80000000
> > +#define FANOTIFY_INTERNAL_FLAGS      (FANOTIFY_UNPRIV)
>
> Should we be more distinct here i.e. FANOTIFY_INTERNAL_INIT_FLAGS?

If anything, it would be FANOTIFY_INTERNAL_GROUP_FLAGS.
FANOTIFY_INIT_FLAGS can only be set by fanotify_init(), but internal
flags could potentially be set at any time.

> Just thinking about a possible case where there's some other internal
> fanotify flags that are used for something else?
>

I prefer the brevity. It's an internal name so we can always change it later
should it become ambiguous.

Thanks for the review.
I'll send v2 shortly.

Thanks,
Amir.
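
The fdinfo point in the reply above (internal flags must not be exported,
because a checkpoint/restore tool may feed the value back into
fanotify_init()) can be sketched in userspace as below. This is an
illustration under stated assumptions: SKETCH_INIT_FLAGS is a placeholder
mask with ten bits set and is not the real value of FANOTIFY_INIT_FLAGS,
and the sketch_* names are hypothetical.

```c
#include <assert.h>

/* Placeholder for the user-visible init flags; NOT the real kernel mask. */
#define SKETCH_INIT_FLAGS     0x000003ffu
/* Mirrors the FANOTIFY_UNPRIV value from the patch. */
#define SKETCH_UNPRIV         0x80000000u
#define SKETCH_INTERNAL_FLAGS (SKETCH_UNPRIV)

/* Compile-time analogue of the BUILD_BUG_ON() added by the patch. */
static_assert((SKETCH_INIT_FLAGS & SKETCH_INTERNAL_FLAGS) == 0,
	      "internal flags must not overlap user-visible init flags");

/* Value that would be printed in fdinfo: internal bits are masked out. */
static unsigned int sketch_fdinfo_flags(unsigned int group_flags)
{
	return group_flags & SKETCH_INIT_FLAGS;
}
```

Because the two masks are disjoint, masking with the init flags drops
only the internal bit and leaves every user-supplied flag intact.
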
Matthew Bobrowski May 24, 2021, 11:42 p.m. UTC | #5
On Mon, May 24, 2021 at 12:00:04PM +0200, Jan Kara wrote:
> On Mon 24-05-21 18:41:36, Matthew Bobrowski wrote:
> > On Sat, May 22, 2021 at 12:19:16PM +0300, Amir Goldstein wrote:
> > > Reporting event->pid should depend on the privileges of the user that
> > > initialized the group, not the privileges of the user reading the
> > > events.
> > > 
> > > Use an internal group flag FANOTIFY_UNPRIV to record the fact that the
> > > group was initialized by an unprivileged user.
> > > 
> > > To be on the safe side, the permissions to set up filesystem and mount
> > > marks now require that both the user that initialized the group and
> > > the user setting up the mark have CAP_SYS_ADMIN.
> > > 
> > > Fixes: 7cea2a3c505e ("fanotify: support limited functionality for unprivileged users")
> > > Signed-off-by: Amir Goldstein <amir73il@gmail.com>
> > 
> > Thanks for sending through this patch Amir!
> > 
> > In general, the patch looks good to me, however there's just a few
> > nits below.
> > 
> > > diff --git a/fs/notify/fanotify/fanotify_user.c b/fs/notify/fanotify/fanotify_user.c
> > > index 71fefb30e015..7df6cba4a06d 100644
> > > --- a/fs/notify/fanotify/fanotify_user.c
> > > +++ b/fs/notify/fanotify/fanotify_user.c
> > > @@ -424,11 +424,18 @@ static ssize_t copy_event_to_user(struct fsnotify_group *group,
> > >  	 * events generated by the listener process itself, without disclosing
> > >  	 * the pids of other processes.
> > >  	 */
> > > -	if (!capable(CAP_SYS_ADMIN) &&
> > > +	if (FAN_GROUP_FLAG(group, FANOTIFY_UNPRIV) &&
> > >  	    task_tgid(current) != event->pid)
> > >  		metadata.pid = 0;
> > >  
> > > -	if (path && path->mnt && path->dentry) {
> > > +	/*
> > > +	 * For now, we require fid mode for unprivileged listener, which does
> > > +	 * record path events, but keep this check for safety in case we want
> > > +	 * to allow unprivileged listener to get events with no fd and no fid
> > > +	 * in the future.
> > > +	 */
> > 
> > I think it's best if we keep clear of using first person in our
> > comments throughout our code base. Maybe we could change this to:
> > 
> > * For now, fid mode is required for an unprivileged listener, which
> >   does record path events. However, this check must be kept...
> 
> Actually, I have no problem with the first person in comments. It is a
> standard "anonymous" language and IMO easy to understand as well. Also
> frequently used in the kernel AFAICT. What problem do you see with the
> first person? I'm well aware that unlike us you are a native speaker ;)

That's fair, perhaps it's just personal preference more than
anything. I do believe that avoiding the first person leads to more
succinct comments that don't leave one wondering who "we" is
intended to be in this particular context.

> > > +/* Internal flags */
> > > +#define FANOTIFY_UNPRIV		0x80000000
> > > +#define FANOTIFY_INTERNAL_FLAGS	(FANOTIFY_UNPRIV)
> > 
> > Should we be more distinct here i.e. FANOTIFY_INTERNAL_INIT_FLAGS?
> > Just thinking about a possible case where there's some other internal
> > fanotify flags that are used for something else?
> 
> Well, do we need to distinguish different uses of internal flags? I don't
> think so...

Maybe not. I was just thinking about the possibility of other internal
flags being possibly introduced further on down the line that wouldn't
properly align with the use of FANOTIFY_INTERNAL_FLAGS, therefore me
providing the suggestion for renaming it. Anyway, if such a situation
ever arises then there's absolutely no reason why we can't shuffle
things around later.

/M
Patch

diff --git a/fs/notify/fanotify/fanotify_user.c b/fs/notify/fanotify/fanotify_user.c
index 71fefb30e015..7df6cba4a06d 100644
--- a/fs/notify/fanotify/fanotify_user.c
+++ b/fs/notify/fanotify/fanotify_user.c
@@ -424,11 +424,18 @@  static ssize_t copy_event_to_user(struct fsnotify_group *group,
 	 * events generated by the listener process itself, without disclosing
 	 * the pids of other processes.
 	 */
-	if (!capable(CAP_SYS_ADMIN) &&
+	if (FAN_GROUP_FLAG(group, FANOTIFY_UNPRIV) &&
 	    task_tgid(current) != event->pid)
 		metadata.pid = 0;
 
-	if (path && path->mnt && path->dentry) {
+	/*
+	 * For now, we require fid mode for unprivileged listener, which does
+	 * record path events, but keep this check for safety in case we want
+	 * to allow unprivileged listener to get events with no fd and no fid
+	 * in the future.
+	 */
+	if (!FAN_GROUP_FLAG(group, FANOTIFY_UNPRIV) &&
+	    path && path->mnt && path->dentry) {
 		fd = create_fd(group, path, &f);
 		if (fd < 0)
 			return fd;
@@ -1040,6 +1047,7 @@  SYSCALL_DEFINE2(fanotify_init, unsigned int, flags, unsigned int, event_f_flags)
 	int f_flags, fd;
 	unsigned int fid_mode = flags & FANOTIFY_FID_BITS;
 	unsigned int class = flags & FANOTIFY_CLASS_BITS;
+	unsigned int internal_flags = 0;
 
 	pr_debug("%s: flags=%x event_f_flags=%x\n",
 		 __func__, flags, event_f_flags);
@@ -1053,6 +1061,13 @@  SYSCALL_DEFINE2(fanotify_init, unsigned int, flags, unsigned int, event_f_flags)
 		 */
 		if ((flags & FANOTIFY_ADMIN_INIT_FLAGS) || !fid_mode)
 			return -EPERM;
+
+		/*
+		 * We set the internal flag FANOTIFY_UNPRIV on the group, so we
+		 * know that we need to limit setting mount/filesystem marks on
+		 * this group and avoid providing pid and open fd in the event.
+		 */
+		internal_flags |= FANOTIFY_UNPRIV;
 	}
 
 #ifdef CONFIG_AUDITSYSCALL
@@ -1105,7 +1120,7 @@  SYSCALL_DEFINE2(fanotify_init, unsigned int, flags, unsigned int, event_f_flags)
 		goto out_destroy_group;
 	}
 
-	group->fanotify_data.flags = flags;
+	group->fanotify_data.flags = flags | internal_flags;
 	group->memcg = get_mem_cgroup_from_mm(current->mm);
 
 	group->fanotify_data.merge_hash = fanotify_alloc_merge_hash();
@@ -1305,11 +1320,13 @@  static int do_fanotify_mark(int fanotify_fd, unsigned int flags, __u64 mask,
 	group = f.file->private_data;
 
 	/*
-	 * An unprivileged user is not allowed to watch a mount point nor
-	 * a filesystem.
+	 * An unprivileged user is not allowed to setup mount point nor
+	 * filesystem marks. It is not allowed to setup those marks for
+	 * a group that was initialized by an unprivileged user.
 	 */
 	ret = -EPERM;
-	if (!capable(CAP_SYS_ADMIN) &&
+	if ((!capable(CAP_SYS_ADMIN) ||
+	     FAN_GROUP_FLAG(group, FANOTIFY_UNPRIV)) &&
 	    mark_type != FAN_MARK_INODE)
 		goto fput_and_out;
 
@@ -1460,6 +1477,7 @@  static int __init fanotify_user_setup(void)
 	max_marks = clamp(max_marks, FANOTIFY_OLD_DEFAULT_MAX_MARKS,
 				     FANOTIFY_DEFAULT_MAX_USER_MARKS);
 
+	BUILD_BUG_ON(FANOTIFY_INIT_FLAGS & FANOTIFY_INTERNAL_FLAGS);
 	BUILD_BUG_ON(HWEIGHT32(FANOTIFY_INIT_FLAGS) != 10);
 	BUILD_BUG_ON(HWEIGHT32(FANOTIFY_MARK_FLAGS) != 9);
 
diff --git a/fs/notify/fdinfo.c b/fs/notify/fdinfo.c
index a712b2aaa9ac..57f0d5d9f934 100644
--- a/fs/notify/fdinfo.c
+++ b/fs/notify/fdinfo.c
@@ -144,7 +144,7 @@  void fanotify_show_fdinfo(struct seq_file *m, struct file *f)
 	struct fsnotify_group *group = f->private_data;
 
 	seq_printf(m, "fanotify flags:%x event-flags:%x\n",
-		   group->fanotify_data.flags,
+		   group->fanotify_data.flags & FANOTIFY_INIT_FLAGS,
 		   group->fanotify_data.f_flags);
 
 	show_fdinfo(m, f, fanotify_fdinfo);
diff --git a/include/linux/fanotify.h b/include/linux/fanotify.h
index bad41bcb25df..f277d1c4e6b8 100644
--- a/include/linux/fanotify.h
+++ b/include/linux/fanotify.h
@@ -51,6 +51,10 @@  extern struct ctl_table fanotify_table[]; /* for sysctl */
 #define FANOTIFY_INIT_FLAGS	(FANOTIFY_ADMIN_INIT_FLAGS | \
 				 FANOTIFY_USER_INIT_FLAGS)
 
+/* Internal flags */
+#define FANOTIFY_UNPRIV		0x80000000
+#define FANOTIFY_INTERNAL_FLAGS	(FANOTIFY_UNPRIV)
+
 #define FANOTIFY_MARK_TYPE_BITS	(FAN_MARK_INODE | FAN_MARK_MOUNT | \
 				 FAN_MARK_FILESYSTEM)
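
The do_fanotify_mark() hunk in the patch above combines two conditions:
the caller must be privileged and the group must not have been created by
an unprivileged user. A minimal userspace sketch of that decision
follows. The enum and function names are hypothetical, and the two
booleans stand in for capable(CAP_SYS_ADMIN) and
FAN_GROUP_FLAG(group, FANOTIFY_UNPRIV) respectively.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for FAN_MARK_INODE/MOUNT/FILESYSTEM. */
enum sketch_mark_type {
	SKETCH_MARK_INODE,
	SKETCH_MARK_MOUNT,
	SKETCH_MARK_FILESYSTEM,
};

/*
 * Returns true when the mark may be set up. Inode marks are always
 * allowed in this model; mount and filesystem marks need both a
 * privileged caller and a group created by a privileged user.
 */
static bool sketch_mark_allowed(bool caller_is_admin, bool group_unpriv,
				enum sketch_mark_type type)
{
	if ((!caller_is_admin || group_unpriv) && type != SKETCH_MARK_INODE)
		return false;
	return true;
}

/* Walks the interesting rows of the truth table. */
static void sketch_self_check(void)
{
	assert(sketch_mark_allowed(false, true, SKETCH_MARK_INODE));
	assert(!sketch_mark_allowed(false, true, SKETCH_MARK_MOUNT));
	/* Privileged caller, but the group was created unprivileged. */
	assert(!sketch_mark_allowed(true, true, SKETCH_MARK_FILESYSTEM));
	assert(sketch_mark_allowed(true, false, SKETCH_MARK_FILESYSTEM));
}
```

The third row is the case the patch adds: before the change, only the
caller's capability was consulted.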