Message ID | 20210124184204.899729-3-amir73il@gmail.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | unprivileged fanotify listener | expand |
On Sun 24-01-21 20:42:04, Amir Goldstein wrote: > Add limited support for unprivileged fanotify event listener. > An unprivileged event listener does not get an open file descriptor in > the event nor the process pid of another process. An unprivileged event > listener cannot request permission events, cannot set mount/filesystem > marks and cannot request unlimited queue/marks. > > This enables the limited functionality similar to inotify when watching a > set of files and directories for OPEN/ACCESS/MODIFY/CLOSE events, without > requiring SYS_CAP_ADMIN privileges. > > The FAN_REPORT_DFID_NAME init flag, provide a method for an unprivileged > event listener watching a set of directories (with FAN_EVENT_ON_CHILD) > to monitor all changes inside those directories. > > This typically requires that the listener keeps a map of watched directory > fid to dirfd (O_PATH), where fid is obtained with name_to_handle_at() > before starting to watch for changes. > > When getting an event, the reported fid of the parent should be resolved > to dirfd and fstatsat(2) with dirfd and name should be used to query the > state of the filesystem entry. > > Note that even though events do not report the event creator pid, > fanotify does not merge similar events on the same object that were > generated by different processes. This is aligned with exiting behavior > when generating processes are outside of the listener pidns (which > results in reporting 0 pid to listener). > > Signed-off-by: Amir Goldstein <amir73il@gmail.com> The patch looks mostly good to me. Just two questions: a) Remind me please, why did we decide pid isn't safe to report to unpriviledged listeners? b) Why did we decide returning open file descriptors isn't safe for unpriviledged listeners? Is it about FMODE_NONOTIFY? I'm not opposed to either but I'm wondering. Also with b) old style fanotify events are not very useful so maybe we could just disallow all notification groups without FID/DFID reporting? In the future if we ever decide returning open fds is safe or how to do it, we can enable that group type for unpriviledged users. However just starting to return open fds later won't fly because listener has to close these fds when receiving events. Honza > --- > fs/notify/fanotify/fanotify_user.c | 49 +++++++++++++++++++++++++++--- > fs/notify/fdinfo.c | 3 +- > include/linux/fanotify.h | 16 ++++++++++ > 3 files changed, 62 insertions(+), 6 deletions(-) > > diff --git a/fs/notify/fanotify/fanotify_user.c b/fs/notify/fanotify/fanotify_user.c > index 4ade3f9df337..b70de273eedb 100644 > --- a/fs/notify/fanotify/fanotify_user.c > +++ b/fs/notify/fanotify/fanotify_user.c > @@ -397,9 +397,21 @@ static ssize_t copy_event_to_user(struct fsnotify_group *group, > metadata.vers = FANOTIFY_METADATA_VERSION; > metadata.reserved = 0; > metadata.mask = event->mask & FANOTIFY_OUTGOING_EVENTS; > - metadata.pid = pid_vnr(event->pid); > + /* > + * An unprivileged event listener does not get an open file descriptor > + * in the event nor another generating process pid. If the event was > + * generated by the unprivileged process itself, self pid is reported. > + * We may relax this in the future by checking calling process access > + * permissions to the object. > + */ > + if (!FAN_GROUP_FLAG(group, FANOTIFY_UNPRIV) || > + task_tgid(current) == event->pid) > + metadata.pid = pid_vnr(event->pid); > + else > + metadata.pid = 0; > > - if (path && path->mnt && path->dentry) { > + if (!FAN_GROUP_FLAG(group, FANOTIFY_UNPRIV) && > + path && path->mnt && path->dentry) { > fd = create_fd(group, path, &f); > if (fd < 0) > return fd; > @@ -995,12 +1007,29 @@ SYSCALL_DEFINE2(fanotify_init, unsigned int, flags, unsigned int, event_f_flags) > int f_flags, fd; > unsigned int fid_mode = flags & FANOTIFY_FID_BITS; > unsigned int class = flags & FANOTIFY_CLASS_BITS; > + unsigned int internal_flags = 0; > > pr_debug("%s: flags=%x event_f_flags=%x\n", > __func__, flags, event_f_flags); > > - if (!capable(CAP_SYS_ADMIN)) > - return -EPERM; > + if (!capable(CAP_SYS_ADMIN)) { > + /* > + * An unprivileged user can setup an unprivileged listener with > + * limited functionality - an unprivileged event listener cannot > + * request permission events, cannot set mount/filesystem marks > + * and cannot request unlimited queue/marks. > + */ > + if ((flags & ~FANOTIFY_UNPRIV_INIT_FLAGS) || > + class != FAN_CLASS_NOTIF) > + return -EPERM; > + > + /* > + * We set the internal flag FANOTIFY_UNPRIV on the group, so we > + * know that we need to limit setting mount/filesystem marks on > + * this group and avoid providing pid and open fd in the event. > + */ > + internal_flags |= FANOTIFY_UNPRIV; > + } > > #ifdef CONFIG_AUDITSYSCALL > if (flags & ~(FANOTIFY_INIT_FLAGS | FAN_ENABLE_AUDIT)) > @@ -1051,7 +1080,7 @@ SYSCALL_DEFINE2(fanotify_init, unsigned int, flags, unsigned int, event_f_flags) > goto out_destroy_group; > } > > - group->fanotify_data.flags = flags; > + group->fanotify_data.flags = flags | internal_flags; > group->memcg = get_mem_cgroup_from_mm(current->mm); > > group->overflow_event = fanotify_alloc_overflow_event(); > @@ -1247,6 +1276,15 @@ static int do_fanotify_mark(int fanotify_fd, unsigned int flags, __u64 mask, > goto fput_and_out; > group = f.file->private_data; > > + /* > + * An unprivileged event listener is not allowed to watch a mount > + * point nor a filesystem. > + */ > + ret = -EPERM; > + if (FAN_GROUP_FLAG(group, FANOTIFY_UNPRIV) && > + mark_type != FAN_MARK_INODE) > + goto fput_and_out; > + > /* > * group->priority == FS_PRIO_0 == FAN_CLASS_NOTIF. These are not > * allowed to set permissions events. > @@ -1379,6 +1417,7 @@ SYSCALL32_DEFINE6(fanotify_mark, > */ > static int __init fanotify_user_setup(void) > { > + BUILD_BUG_ON(FANOTIFY_INIT_FLAGS & FANOTIFY_INTERNAL_FLAGS); > BUILD_BUG_ON(HWEIGHT32(FANOTIFY_INIT_FLAGS) != 10); > BUILD_BUG_ON(HWEIGHT32(FANOTIFY_MARK_FLAGS) != 9); > > diff --git a/fs/notify/fdinfo.c b/fs/notify/fdinfo.c > index f0d6b54be412..57f0d5d9f934 100644 > --- a/fs/notify/fdinfo.c > +++ b/fs/notify/fdinfo.c > @@ -144,7 +144,8 @@ void fanotify_show_fdinfo(struct seq_file *m, struct file *f) > struct fsnotify_group *group = f->private_data; > > seq_printf(m, "fanotify flags:%x event-flags:%x\n", > - group->fanotify_data.flags, group->fanotify_data.f_flags); > + group->fanotify_data.flags & FANOTIFY_INIT_FLAGS, > + group->fanotify_data.f_flags); > > show_fdinfo(m, f, fanotify_fdinfo); > } > diff --git a/include/linux/fanotify.h b/include/linux/fanotify.h > index 031a97d8369a..a573c1028c14 100644 > --- a/include/linux/fanotify.h > +++ b/include/linux/fanotify.h > @@ -28,6 +28,22 @@ extern struct ctl_table fanotify_table[]; /* for sysctl */ > FAN_CLOEXEC | FAN_NONBLOCK | \ > FAN_UNLIMITED_QUEUE | FAN_UNLIMITED_MARKS) > > +/* Internal flags */ > +#define FANOTIFY_UNPRIV 0x80000000 > +#define FANOTIFY_INTERNAL_FLAGS (FANOTIFY_UNPRIV) > + > +/* > + * fanotify_init() flags allowed for unprivileged listener. > + * FAN_CLASS_NOTIF in this mask is purely semantic because it is zero, > + * but it is the only class we allow for unprivileged listener. > + * Since unprivileged listener does not provide file descriptors in events, > + * reporting file handles makes sense, but it is not a must. > + * FAN_REPORT_TID does not make sense for unprivileged listener, which uses > + * event->pid only to filter out events generated by listener process itself. > + */ > +#define FANOTIFY_UNPRIV_INIT_FLAGS (FAN_CLOEXEC | FAN_NONBLOCK | \ > + FAN_CLASS_NOTIF | FANOTIFY_FID_BITS) > + > #define FANOTIFY_MARK_TYPE_BITS (FAN_MARK_INODE | FAN_MARK_MOUNT | \ > FAN_MARK_FILESYSTEM) > > -- > 2.25.1 >
On Tue, Feb 16, 2021 at 7:01 PM Jan Kara <jack@suse.cz> wrote: > > On Sun 24-01-21 20:42:04, Amir Goldstein wrote: > > Add limited support for unprivileged fanotify event listener. > > An unprivileged event listener does not get an open file descriptor in > > the event nor the process pid of another process. An unprivileged event > > listener cannot request permission events, cannot set mount/filesystem > > marks and cannot request unlimited queue/marks. > > > > This enables the limited functionality similar to inotify when watching a > > set of files and directories for OPEN/ACCESS/MODIFY/CLOSE events, without > > requiring SYS_CAP_ADMIN privileges. > > > > The FAN_REPORT_DFID_NAME init flag, provide a method for an unprivileged > > event listener watching a set of directories (with FAN_EVENT_ON_CHILD) > > to monitor all changes inside those directories. > > > > This typically requires that the listener keeps a map of watched directory > > fid to dirfd (O_PATH), where fid is obtained with name_to_handle_at() > > before starting to watch for changes. > > > > When getting an event, the reported fid of the parent should be resolved > > to dirfd and fstatsat(2) with dirfd and name should be used to query the > > state of the filesystem entry. > > > > Note that even though events do not report the event creator pid, > > fanotify does not merge similar events on the same object that were > > generated by different processes. This is aligned with exiting behavior > > when generating processes are outside of the listener pidns (which > > results in reporting 0 pid to listener). > > > > Signed-off-by: Amir Goldstein <amir73il@gmail.com> > > The patch looks mostly good to me. Just two questions: > > a) Remind me please, why did we decide pid isn't safe to report to > unpriviledged listeners? Just because the information that process X modified file Y is not an information that user can generally obtain without extra capabilities(?) I can add a flag FAN_REPORT_OWN_PID to make that behavior explicit and then we can relax reporting pids later. > > b) Why did we decide returning open file descriptors isn't safe for > unpriviledged listeners? Is it about FMODE_NONOTIFY? > Don't remember something in particular. I feels risky. > I'm not opposed to either but I'm wondering. Also with b) old style > fanotify events are not very useful so maybe we could just disallow all > notification groups without FID/DFID reporting? In the future if we ever > decide returning open fds is safe or how to do it, we can enable that group > type for unpriviledged users. However just starting to return open fds > later won't fly because listener has to close these fds when receiving > events. > I like this option better. Thanks, Amir.
On Tue, Feb 16, 2021 at 8:12 PM Amir Goldstein <amir73il@gmail.com> wrote: > > On Tue, Feb 16, 2021 at 7:01 PM Jan Kara <jack@suse.cz> wrote: > > > > On Sun 24-01-21 20:42:04, Amir Goldstein wrote: > > > Add limited support for unprivileged fanotify event listener. > > > An unprivileged event listener does not get an open file descriptor in > > > the event nor the process pid of another process. An unprivileged event > > > listener cannot request permission events, cannot set mount/filesystem > > > marks and cannot request unlimited queue/marks. > > > > > > This enables the limited functionality similar to inotify when watching a > > > set of files and directories for OPEN/ACCESS/MODIFY/CLOSE events, without > > > requiring SYS_CAP_ADMIN privileges. > > > > > > The FAN_REPORT_DFID_NAME init flag, provide a method for an unprivileged > > > event listener watching a set of directories (with FAN_EVENT_ON_CHILD) > > > to monitor all changes inside those directories. > > > > > > This typically requires that the listener keeps a map of watched directory > > > fid to dirfd (O_PATH), where fid is obtained with name_to_handle_at() > > > before starting to watch for changes. > > > > > > When getting an event, the reported fid of the parent should be resolved > > > to dirfd and fstatsat(2) with dirfd and name should be used to query the > > > state of the filesystem entry. > > > > > > Note that even though events do not report the event creator pid, > > > fanotify does not merge similar events on the same object that were > > > generated by different processes. This is aligned with exiting behavior > > > when generating processes are outside of the listener pidns (which > > > results in reporting 0 pid to listener). > > > > > > Signed-off-by: Amir Goldstein <amir73il@gmail.com> > > > > The patch looks mostly good to me. Just two questions: > > > > a) Remind me please, why did we decide pid isn't safe to report to > > unpriviledged listeners? > > Just because the information that process X modified file Y is not an > information that user can generally obtain without extra capabilities(?) > I can add a flag FAN_REPORT_OWN_PID to make that behavior > explicit and then we can relax reporting pids later. > FYI a patch for flag FAN_REPORT_SELF_PID is pushed to branch fanotify_unpriv. The UAPI feels a bit awkward with this flag, but that is the easiest way to start without worrying about disclosing pids. I guess we can require that unprivileged listener has pid 1 in its own pid ns. The outcome is similar to FAN_REPORT_SELF_PID, except it can also get pids of its children which is probably fine. I am not sure if this is a reasonable option from users POV. > > > > b) Why did we decide returning open file descriptors isn't safe for > > unpriviledged listeners? Is it about FMODE_NONOTIFY? > > > > Don't remember something in particular. I feels risky. > > > I'm not opposed to either but I'm wondering. Also with b) old style > > fanotify events are not very useful so maybe we could just disallow all > > notification groups without FID/DFID reporting? In the future if we ever > > decide returning open fds is safe or how to do it, we can enable that group > > type for unpriviledged users. However just starting to return open fds > > later won't fly because listener has to close these fds when receiving > > events. > > > > I like this option better. > This is also pushed to branch fanotify_unpriv. With all the behavior specified explicitly in fanotify_init() and fanotify_mark() flags, there is no need for the internal FANOTIFY_UNPRIV group flag, which looks better IMO. Thanks, Amir.
On Fri, Feb 19, 2021 at 6:16 PM Amir Goldstein <amir73il@gmail.com> wrote: > > On Tue, Feb 16, 2021 at 8:12 PM Amir Goldstein <amir73il@gmail.com> wrote: > > > > On Tue, Feb 16, 2021 at 7:01 PM Jan Kara <jack@suse.cz> wrote: > > > > > > On Sun 24-01-21 20:42:04, Amir Goldstein wrote: > > > > Add limited support for unprivileged fanotify event listener. > > > > An unprivileged event listener does not get an open file descriptor in > > > > the event nor the process pid of another process. An unprivileged event > > > > listener cannot request permission events, cannot set mount/filesystem > > > > marks and cannot request unlimited queue/marks. > > > > > > > > This enables the limited functionality similar to inotify when watching a > > > > set of files and directories for OPEN/ACCESS/MODIFY/CLOSE events, without > > > > requiring SYS_CAP_ADMIN privileges. > > > > > > > > The FAN_REPORT_DFID_NAME init flag, provide a method for an unprivileged > > > > event listener watching a set of directories (with FAN_EVENT_ON_CHILD) > > > > to monitor all changes inside those directories. > > > > > > > > This typically requires that the listener keeps a map of watched directory > > > > fid to dirfd (O_PATH), where fid is obtained with name_to_handle_at() > > > > before starting to watch for changes. > > > > > > > > When getting an event, the reported fid of the parent should be resolved > > > > to dirfd and fstatsat(2) with dirfd and name should be used to query the > > > > state of the filesystem entry. > > > > > > > > Note that even though events do not report the event creator pid, > > > > fanotify does not merge similar events on the same object that were > > > > generated by different processes. This is aligned with exiting behavior > > > > when generating processes are outside of the listener pidns (which > > > > results in reporting 0 pid to listener). > > > > > > > > Signed-off-by: Amir Goldstein <amir73il@gmail.com> > > > > > > The patch looks mostly good to me. Just two questions: > > > > > > a) Remind me please, why did we decide pid isn't safe to report to > > > unpriviledged listeners? > > > > Just because the information that process X modified file Y is not an > > information that user can generally obtain without extra capabilities(?) > > I can add a flag FAN_REPORT_OWN_PID to make that behavior > > explicit and then we can relax reporting pids later. > > > > FYI a patch for flag FAN_REPORT_SELF_PID is pushed to branch > fanotify_unpriv. > > The UAPI feels a bit awkward with this flag, but that is the easiest way > to start without worrying about disclosing pids. > > I guess we can require that unprivileged listener has pid 1 in its own > pid ns. The outcome is similar to FAN_REPORT_SELF_PID, except > it can also get pids of its children which is probably fine. > Jan, WRT your comment in github: "So maybe we can just require that this flag is already set by userspace instead of silently setting it? Like: if (!(flags & FAN_REPORT_SELF_PID)) return -EPERM; I'd say that variant is more futureproof and the difference for user is minimal." I started with this approach and then I wrote the tests and imagined the man page requiring this flag would be a bit awkward, so I changed it to auto-enable. I am not strongly against the more implicit flag requirement, but in favor of the auto-enable approach I would like to argue that with current fanotify you CAN get zero pid in event, so think about it this way: If a listener is started in (or moved into) its own pid ns, it will get zero pid in all events (other than those generated by itself and its own children). With the proposed change, the same applies also if the listener is started without CAP_SYS_ADMIN. As a matter of fact, we do not need the flag at all, we can determine whether or not to report pid according to capabilities of the event reader at event read time. And we can check for one of: - CAP_SYS_ADMIN - CAP_SYS_PACCT - CAP_SYS_PTRACE Do you prefer this flag-less approach? Thanks, Amir.
On Tue 23-02-21 19:16:40, Amir Goldstein wrote: > On Fri, Feb 19, 2021 at 6:16 PM Amir Goldstein <amir73il@gmail.com> wrote: > > > > On Tue, Feb 16, 2021 at 8:12 PM Amir Goldstein <amir73il@gmail.com> wrote: > > > > > > On Tue, Feb 16, 2021 at 7:01 PM Jan Kara <jack@suse.cz> wrote: > > > > > > > > On Sun 24-01-21 20:42:04, Amir Goldstein wrote: > > > > > Add limited support for unprivileged fanotify event listener. > > > > > An unprivileged event listener does not get an open file descriptor in > > > > > the event nor the process pid of another process. An unprivileged event > > > > > listener cannot request permission events, cannot set mount/filesystem > > > > > marks and cannot request unlimited queue/marks. > > > > > > > > > > This enables the limited functionality similar to inotify when watching a > > > > > set of files and directories for OPEN/ACCESS/MODIFY/CLOSE events, without > > > > > requiring SYS_CAP_ADMIN privileges. > > > > > > > > > > The FAN_REPORT_DFID_NAME init flag, provide a method for an unprivileged > > > > > event listener watching a set of directories (with FAN_EVENT_ON_CHILD) > > > > > to monitor all changes inside those directories. > > > > > > > > > > This typically requires that the listener keeps a map of watched directory > > > > > fid to dirfd (O_PATH), where fid is obtained with name_to_handle_at() > > > > > before starting to watch for changes. > > > > > > > > > > When getting an event, the reported fid of the parent should be resolved > > > > > to dirfd and fstatsat(2) with dirfd and name should be used to query the > > > > > state of the filesystem entry. > > > > > > > > > > Note that even though events do not report the event creator pid, > > > > > fanotify does not merge similar events on the same object that were > > > > > generated by different processes. This is aligned with exiting behavior > > > > > when generating processes are outside of the listener pidns (which > > > > > results in reporting 0 pid to listener). > > > > > > > > > > Signed-off-by: Amir Goldstein <amir73il@gmail.com> > > > > > > > > The patch looks mostly good to me. Just two questions: > > > > > > > > a) Remind me please, why did we decide pid isn't safe to report to > > > > unpriviledged listeners? > > > > > > Just because the information that process X modified file Y is not an > > > information that user can generally obtain without extra capabilities(?) > > > I can add a flag FAN_REPORT_OWN_PID to make that behavior > > > explicit and then we can relax reporting pids later. > > > > > > > FYI a patch for flag FAN_REPORT_SELF_PID is pushed to branch > > fanotify_unpriv. > > > > The UAPI feels a bit awkward with this flag, but that is the easiest way > > to start without worrying about disclosing pids. > > > > I guess we can require that unprivileged listener has pid 1 in its own > > pid ns. The outcome is similar to FAN_REPORT_SELF_PID, except > > it can also get pids of its children which is probably fine. > > > > Jan, > > WRT your comment in github: > "So maybe we can just require that this flag is already set by userspace > instead of silently setting it? Like: > > if (!(flags & FAN_REPORT_SELF_PID)) return -EPERM; > > I'd say that variant is more futureproof and the difference for user > is minimal." > > I started with this approach and then I wrote the tests and imagined > the man page > requiring this flag would be a bit awkward, so I changed it to auto-enable. > > I am not strongly against the more implicit flag requirement, but in > favor of the > auto-enable approach I would like to argue that with current fanotify you CAN > get zero pid in event, so think about it this way: > If a listener is started in (or moved into) its own pid ns, it will > get zero pid in all > events (other than those generated by itself and its own children). > > With the proposed change, the same applies also if the listener is started > without CAP_SYS_ADMIN. > > As a matter of fact, we do not need the flag at all, we can determine whether > or not to report pid according to capabilities of the event reader at > event read time. > And we can check for one of: > - CAP_SYS_ADMIN > - CAP_SYS_PACCT > - CAP_SYS_PTRACE > > Do you prefer this flag-less approach? Well, I don't have strong opinion what we should do internally either. The flag seems OK to me. The biggest question is whether we should expose the FAN_REPORT_SELF_PID flag to userspace or not. If we would not require explicit flag for unpriv users, I see little reason to expose that flag at all. Honza
On Wed, Feb 24, 2021 at 12:52 PM Jan Kara <jack@suse.cz> wrote: > > On Tue 23-02-21 19:16:40, Amir Goldstein wrote: > > On Fri, Feb 19, 2021 at 6:16 PM Amir Goldstein <amir73il@gmail.com> wrote: > > > > > > On Tue, Feb 16, 2021 at 8:12 PM Amir Goldstein <amir73il@gmail.com> wrote: > > > > > > > > On Tue, Feb 16, 2021 at 7:01 PM Jan Kara <jack@suse.cz> wrote: > > > > > > > > > > On Sun 24-01-21 20:42:04, Amir Goldstein wrote: > > > > > > Add limited support for unprivileged fanotify event listener. > > > > > > An unprivileged event listener does not get an open file descriptor in > > > > > > the event nor the process pid of another process. An unprivileged event > > > > > > listener cannot request permission events, cannot set mount/filesystem > > > > > > marks and cannot request unlimited queue/marks. > > > > > > > > > > > > This enables the limited functionality similar to inotify when watching a > > > > > > set of files and directories for OPEN/ACCESS/MODIFY/CLOSE events, without > > > > > > requiring SYS_CAP_ADMIN privileges. > > > > > > > > > > > > The FAN_REPORT_DFID_NAME init flag, provide a method for an unprivileged > > > > > > event listener watching a set of directories (with FAN_EVENT_ON_CHILD) > > > > > > to monitor all changes inside those directories. > > > > > > > > > > > > This typically requires that the listener keeps a map of watched directory > > > > > > fid to dirfd (O_PATH), where fid is obtained with name_to_handle_at() > > > > > > before starting to watch for changes. > > > > > > > > > > > > When getting an event, the reported fid of the parent should be resolved > > > > > > to dirfd and fstatsat(2) with dirfd and name should be used to query the > > > > > > state of the filesystem entry. > > > > > > > > > > > > Note that even though events do not report the event creator pid, > > > > > > fanotify does not merge similar events on the same object that were > > > > > > generated by different processes. This is aligned with exiting behavior > > > > > > when generating processes are outside of the listener pidns (which > > > > > > results in reporting 0 pid to listener). > > > > > > > > > > > > Signed-off-by: Amir Goldstein <amir73il@gmail.com> > > > > > > > > > > The patch looks mostly good to me. Just two questions: > > > > > > > > > > a) Remind me please, why did we decide pid isn't safe to report to > > > > > unpriviledged listeners? > > > > > > > > Just because the information that process X modified file Y is not an > > > > information that user can generally obtain without extra capabilities(?) > > > > I can add a flag FAN_REPORT_OWN_PID to make that behavior > > > > explicit and then we can relax reporting pids later. > > > > > > > > > > FYI a patch for flag FAN_REPORT_SELF_PID is pushed to branch > > > fanotify_unpriv. > > > > > > The UAPI feels a bit awkward with this flag, but that is the easiest way > > > to start without worrying about disclosing pids. > > > > > > I guess we can require that unprivileged listener has pid 1 in its own > > > pid ns. The outcome is similar to FAN_REPORT_SELF_PID, except > > > it can also get pids of its children which is probably fine. > > > > > > > Jan, > > > > WRT your comment in github: > > "So maybe we can just require that this flag is already set by userspace > > instead of silently setting it? Like: > > > > if (!(flags & FAN_REPORT_SELF_PID)) return -EPERM; > > > > I'd say that variant is more futureproof and the difference for user > > is minimal." > > > > I started with this approach and then I wrote the tests and imagined > > the man page > > requiring this flag would be a bit awkward, so I changed it to auto-enable. > > > > I am not strongly against the more implicit flag requirement, but in > > favor of the > > auto-enable approach I would like to argue that with current fanotify you CAN > > get zero pid in event, so think about it this way: > > If a listener is started in (or moved into) its own pid ns, it will > > get zero pid in all > > events (other than those generated by itself and its own children). > > > > With the proposed change, the same applies also if the listener is started > > without CAP_SYS_ADMIN. > > > > As a matter of fact, we do not need the flag at all, we can determine whether > > or not to report pid according to capabilities of the event reader at > > event read time. > > And we can check for one of: > > - CAP_SYS_ADMIN > > - CAP_SYS_PACCT > > - CAP_SYS_PTRACE > > > > Do you prefer this flag-less approach? > > Well, I don't have strong opinion what we should do internally either. The > flag seems OK to me. The biggest question is whether we should expose the > FAN_REPORT_SELF_PID flag to userspace or not. If we would not require > explicit flag for unpriv users, I see little reason to expose that flag at > all. > IMO the only listeners that actually care about event->pid are permission event listeners. I think that FAN_CLASS_NOTIF listeners do not care about it. The only thing that *I* ever used event->pid for is to ignore events from self pid. My question is, do you mind if we start with this code: @@ -419,6 +419,14 @@ static ssize_t copy_event_to_user(struct fsnotify_group *group, metadata.reserved = 0; metadata.mask = event->mask & FANOTIFY_OUTGOING_EVENTS; metadata.pid = pid_vnr(event->pid); + + /* + * For an unprivileged listener, event->pid can be used to identify the + * events generated by the listener process itself, without disclosing + * the pids of other processes. + */ + if (!capable(CAP_SYS_ADMIN) && + task_tgid(current) != event->pid) + metadata.pid = 0; No need for any visible or invisible user flags. If users ask for event->pid of other processes later (I don't think they will) and we decide that it is safe to disclose them, we will require another flag and then the test will become: + if (!capable(CAP_SYS_ADMIN) || + FAN_GROUP_FLAG(group, FAN_REPORT_PID)) and *if* that ever happens, we will document the FAN_REPORT_PID flag and say that it is enabled by default for CAP_SYS_ADMIN instead of requiring and documenting FAN_REPORT_SELF_PID now. The way I see it, the only disadvantage with this negated approach is that CAP_SYS_ADMIN listeners cannot turn off event->pid reporting, but why would anybody need to do that? Thanks, Amir.
On Wed, Feb 24, 2021 at 2:58 PM Amir Goldstein <amir73il@gmail.com> wrote: > > On Wed, Feb 24, 2021 at 12:52 PM Jan Kara <jack@suse.cz> wrote: > > > > On Tue 23-02-21 19:16:40, Amir Goldstein wrote: > > > On Fri, Feb 19, 2021 at 6:16 PM Amir Goldstein <amir73il@gmail.com> wrote: > > > > > > > > On Tue, Feb 16, 2021 at 8:12 PM Amir Goldstein <amir73il@gmail.com> wrote: > > > > > > > > > > On Tue, Feb 16, 2021 at 7:01 PM Jan Kara <jack@suse.cz> wrote: > > > > > > > > > > > > On Sun 24-01-21 20:42:04, Amir Goldstein wrote: > > > > > > > Add limited support for unprivileged fanotify event listener. > > > > > > > An unprivileged event listener does not get an open file descriptor in > > > > > > > the event nor the process pid of another process. An unprivileged event > > > > > > > listener cannot request permission events, cannot set mount/filesystem > > > > > > > marks and cannot request unlimited queue/marks. > > > > > > > > > > > > > > This enables the limited functionality similar to inotify when watching a > > > > > > > set of files and directories for OPEN/ACCESS/MODIFY/CLOSE events, without > > > > > > > requiring SYS_CAP_ADMIN privileges. > > > > > > > > > > > > > > The FAN_REPORT_DFID_NAME init flag, provide a method for an unprivileged > > > > > > > event listener watching a set of directories (with FAN_EVENT_ON_CHILD) > > > > > > > to monitor all changes inside those directories. > > > > > > > > > > > > > > This typically requires that the listener keeps a map of watched directory > > > > > > > fid to dirfd (O_PATH), where fid is obtained with name_to_handle_at() > > > > > > > before starting to watch for changes. > > > > > > > > > > > > > > When getting an event, the reported fid of the parent should be resolved > > > > > > > to dirfd and fstatsat(2) with dirfd and name should be used to query the > > > > > > > state of the filesystem entry. > > > > > > > > > > > > > > Note that even though events do not report the event creator pid, > > > > > > > fanotify does not merge similar events on the same object that were > > > > > > > generated by different processes. This is aligned with exiting behavior > > > > > > > when generating processes are outside of the listener pidns (which > > > > > > > results in reporting 0 pid to listener). > > > > > > > > > > > > > > Signed-off-by: Amir Goldstein <amir73il@gmail.com> > > > > > > > > > > > > The patch looks mostly good to me. Just two questions: > > > > > > > > > > > > a) Remind me please, why did we decide pid isn't safe to report to > > > > > > unpriviledged listeners? > > > > > > > > > > Just because the information that process X modified file Y is not an > > > > > information that user can generally obtain without extra capabilities(?) > > > > > I can add a flag FAN_REPORT_OWN_PID to make that behavior > > > > > explicit and then we can relax reporting pids later. > > > > > > > > > > > > > FYI a patch for flag FAN_REPORT_SELF_PID is pushed to branch > > > > fanotify_unpriv. > > > > > > > > The UAPI feels a bit awkward with this flag, but that is the easiest way > > > > to start without worrying about disclosing pids. > > > > > > > > I guess we can require that unprivileged listener has pid 1 in its own > > > > pid ns. The outcome is similar to FAN_REPORT_SELF_PID, except > > > > it can also get pids of its children which is probably fine. > > > > > > > > > > Jan, > > > > > > WRT your comment in github: > > > "So maybe we can just require that this flag is already set by userspace > > > instead of silently setting it? Like: > > > > > > if (!(flags & FAN_REPORT_SELF_PID)) return -EPERM; > > > > > > I'd say that variant is more futureproof and the difference for user > > > is minimal." > > > > > > I started with this approach and then I wrote the tests and imagined > > > the man page > > > requiring this flag would be a bit awkward, so I changed it to auto-enable. > > > > > > I am not strongly against the more implicit flag requirement, but in > > > favor of the > > > auto-enable approach I would like to argue that with current fanotify you CAN > > > get zero pid in event, so think about it this way: > > > If a listener is started in (or moved into) its own pid ns, it will > > > get zero pid in all > > > events (other than those generated by itself and its own children). > > > > > > With the proposed change, the same applies also if the listener is started > > > without CAP_SYS_ADMIN. > > > > > > As a matter of fact, we do not need the flag at all, we can determine whether > > > or not to report pid according to capabilities of the event reader at > > > event read time. > > > And we can check for one of: > > > - CAP_SYS_ADMIN > > > - CAP_SYS_PACCT > > > - CAP_SYS_PTRACE > > > > > > Do you prefer this flag-less approach? > > > > Well, I don't have strong opinion what we should do internally either. The > > flag seems OK to me. The biggest question is whether we should expose the > > FAN_REPORT_SELF_PID flag to userspace or not. If we would not require > > explicit flag for unpriv users, I see little reason to expose that flag at > > all. > > > > IMO the only listeners that actually care about event->pid are permission > event listeners. I think that FAN_CLASS_NOTIF listeners do not care > about it. The only thing that *I* ever used event->pid for is to ignore events > from self pid. > > My question is, do you mind if we start with this code: > > @@ -419,6 +419,14 @@ static ssize_t copy_event_to_user(struct > fsnotify_group *group, > metadata.reserved = 0; > metadata.mask = event->mask & FANOTIFY_OUTGOING_EVENTS; > metadata.pid = pid_vnr(event->pid); > + > + /* > + * For an unprivileged listener, event->pid can be used to identify the > + * events generated by the listener process itself, without disclosing > + * the pids of other processes. > + */ > + if (!capable(CAP_SYS_ADMIN) && > + task_tgid(current) != event->pid) > + metadata.pid = 0; > > No need for any visible or invisible user flags. > If users ask for event->pid of other processes later (I don't think they will) > and we decide that it is safe to disclose them, we will require another flag > and then the test will become: > > + if (!capable(CAP_SYS_ADMIN) || > + FAN_GROUP_FLAG(group, FAN_REPORT_PID)) This condition came out wrong... > > and *if* that ever happens, we will document the FAN_REPORT_PID > flag and say that it is enabled by default for CAP_SYS_ADMIN instead of > requiring and documenting FAN_REPORT_SELF_PID now. > > The way I see it, the only disadvantage with this negated approach is > that CAP_SYS_ADMIN listeners cannot turn off event->pid reporting, > but why would anybody need to do that? > Anyway, I pushed an example with FAN_REPORT_PID to branch fanotify_unpriv. It just defines a flag that indicates the existing behavior, auto enables for CAP_SYS_ADMIN and denied for !CAP_SYS_ADMIN. Because it is a purely semantic flag at this point, the patch that defines the flag can be dropped now and maybe added later. Option for future use - the reaper process (pid 1) of pid ns will be allowed to request FAN_REPORT_PID without any other capabilities. Thanks, Amir. P.S. fixes to your review comments on fanotify_merge also pushed.
On Wed 24-02-21 14:58:31, Amir Goldstein wrote: > On Wed, Feb 24, 2021 at 12:52 PM Jan Kara <jack@suse.cz> wrote: > > > > On Tue 23-02-21 19:16:40, Amir Goldstein wrote: > > > On Fri, Feb 19, 2021 at 6:16 PM Amir Goldstein <amir73il@gmail.com> wrote: > > > > > > > > On Tue, Feb 16, 2021 at 8:12 PM Amir Goldstein <amir73il@gmail.com> wrote: > > > > > > > > > > On Tue, Feb 16, 2021 at 7:01 PM Jan Kara <jack@suse.cz> wrote: > > > > > > > > > > > > On Sun 24-01-21 20:42:04, Amir Goldstein wrote: > > > > > > > Add limited support for unprivileged fanotify event listener. > > > > > > > An unprivileged event listener does not get an open file descriptor in > > > > > > > the event nor the process pid of another process. An unprivileged event > > > > > > > listener cannot request permission events, cannot set mount/filesystem > > > > > > > marks and cannot request unlimited queue/marks. > > > > > > > > > > > > > > This enables the limited functionality similar to inotify when watching a > > > > > > > set of files and directories for OPEN/ACCESS/MODIFY/CLOSE events, without > > > > > > > requiring SYS_CAP_ADMIN privileges. > > > > > > > > > > > > > > The FAN_REPORT_DFID_NAME init flag, provide a method for an unprivileged > > > > > > > event listener watching a set of directories (with FAN_EVENT_ON_CHILD) > > > > > > > to monitor all changes inside those directories. > > > > > > > > > > > > > > This typically requires that the listener keeps a map of watched directory > > > > > > > fid to dirfd (O_PATH), where fid is obtained with name_to_handle_at() > > > > > > > before starting to watch for changes. > > > > > > > > > > > > > > When getting an event, the reported fid of the parent should be resolved > > > > > > > to dirfd and fstatsat(2) with dirfd and name should be used to query the > > > > > > > state of the filesystem entry. > > > > > > > > > > > > > > Note that even though events do not report the event creator pid, > > > > > > > fanotify does not merge similar events on the same object that were > > > > > > > generated by different processes. This is aligned with exiting behavior > > > > > > > when generating processes are outside of the listener pidns (which > > > > > > > results in reporting 0 pid to listener). > > > > > > > > > > > > > > Signed-off-by: Amir Goldstein <amir73il@gmail.com> > > > > > > > > > > > > The patch looks mostly good to me. Just two questions: > > > > > > > > > > > > a) Remind me please, why did we decide pid isn't safe to report to > > > > > > unpriviledged listeners? > > > > > > > > > > Just because the information that process X modified file Y is not an > > > > > information that user can generally obtain without extra capabilities(?) > > > > > I can add a flag FAN_REPORT_OWN_PID to make that behavior > > > > > explicit and then we can relax reporting pids later. > > > > > > > > > > > > > FYI a patch for flag FAN_REPORT_SELF_PID is pushed to branch > > > > fanotify_unpriv. > > > > > > > > The UAPI feels a bit awkward with this flag, but that is the easiest way > > > > to start without worrying about disclosing pids. > > > > > > > > I guess we can require that unprivileged listener has pid 1 in its own > > > > pid ns. The outcome is similar to FAN_REPORT_SELF_PID, except > > > > it can also get pids of its children which is probably fine. > > > > > > > > > > Jan, > > > > > > WRT your comment in github: > > > "So maybe we can just require that this flag is already set by userspace > > > instead of silently setting it? Like: > > > > > > if (!(flags & FAN_REPORT_SELF_PID)) return -EPERM; > > > > > > I'd say that variant is more futureproof and the difference for user > > > is minimal." > > > > > > I started with this approach and then I wrote the tests and imagined > > > the man page > > > requiring this flag would be a bit awkward, so I changed it to auto-enable. > > > > > > I am not strongly against the more implicit flag requirement, but in > > > favor of the > > > auto-enable approach I would like to argue that with current fanotify you CAN > > > get zero pid in event, so think about it this way: > > > If a listener is started in (or moved into) its own pid ns, it will > > > get zero pid in all > > > events (other than those generated by itself and its own children). > > > > > > With the proposed change, the same applies also if the listener is started > > > without CAP_SYS_ADMIN. > > > > > > As a matter of fact, we do not need the flag at all, we can determine whether > > > or not to report pid according to capabilities of the event reader at > > > event read time. > > > And we can check for one of: > > > - CAP_SYS_ADMIN > > > - CAP_SYS_PACCT > > > - CAP_SYS_PTRACE > > > > > > Do you prefer this flag-less approach? > > > > Well, I don't have strong opinion what we should do internally either. The > > flag seems OK to me. The biggest question is whether we should expose the > > FAN_REPORT_SELF_PID flag to userspace or not. If we would not require > > explicit flag for unpriv users, I see little reason to expose that flag at > > all. > > > > IMO the only listeners that actually care about event->pid are permission > event listeners. I think that FAN_CLASS_NOTIF listeners do not care > about it. The only thing that *I* ever used event->pid for is to ignore events > from self pid. > > My question is, do you mind if we start with this code: > > @@ -419,6 +419,14 @@ static ssize_t copy_event_to_user(struct > fsnotify_group *group, > metadata.reserved = 0; > metadata.mask = event->mask & FANOTIFY_OUTGOING_EVENTS; > metadata.pid = pid_vnr(event->pid); > + > + /* > + * For an unprivileged listener, event->pid can be used to identify the > + * events generated by the listener process itself, without disclosing > + * the pids of other processes. > + */ > + if (!capable(CAP_SYS_ADMIN) && > + task_tgid(current) != event->pid) > + metadata.pid = 0; > > No need for any visible or invisible user flags. OK, I think this is fine. > If users ask for event->pid of other processes later (I don't think they will) > and we decide that it is safe to disclose them, we will require another flag > and then the test will become: > > + if (!capable(CAP_SYS_ADMIN) || > + FAN_GROUP_FLAG(group, FAN_REPORT_PID)) > > and *if* that ever happens, we will document the FAN_REPORT_PID > flag and say that it is enabled by default for CAP_SYS_ADMIN instead of > requiring and documenting FAN_REPORT_SELF_PID now. > > The way I see it, the only disadvantage with this negated approach is > that CAP_SYS_ADMIN listeners cannot turn off event->pid reporting, > but why would anybody need to do that? Yeah, let's leave complications for later when someone asks for it :) Honza
diff --git a/fs/notify/fanotify/fanotify_user.c b/fs/notify/fanotify/fanotify_user.c index 4ade3f9df337..b70de273eedb 100644 --- a/fs/notify/fanotify/fanotify_user.c +++ b/fs/notify/fanotify/fanotify_user.c @@ -397,9 +397,21 @@ static ssize_t copy_event_to_user(struct fsnotify_group *group, metadata.vers = FANOTIFY_METADATA_VERSION; metadata.reserved = 0; metadata.mask = event->mask & FANOTIFY_OUTGOING_EVENTS; - metadata.pid = pid_vnr(event->pid); + /* + * An unprivileged event listener does not get an open file descriptor + * in the event nor another generating process pid. If the event was + * generated by the unprivileged process itself, self pid is reported. + * We may relax this in the future by checking calling process access + * permissions to the object. + */ + if (!FAN_GROUP_FLAG(group, FANOTIFY_UNPRIV) || + task_tgid(current) == event->pid) + metadata.pid = pid_vnr(event->pid); + else + metadata.pid = 0; - if (path && path->mnt && path->dentry) { + if (!FAN_GROUP_FLAG(group, FANOTIFY_UNPRIV) && + path && path->mnt && path->dentry) { fd = create_fd(group, path, &f); if (fd < 0) return fd; @@ -995,12 +1007,29 @@ SYSCALL_DEFINE2(fanotify_init, unsigned int, flags, unsigned int, event_f_flags) int f_flags, fd; unsigned int fid_mode = flags & FANOTIFY_FID_BITS; unsigned int class = flags & FANOTIFY_CLASS_BITS; + unsigned int internal_flags = 0; pr_debug("%s: flags=%x event_f_flags=%x\n", __func__, flags, event_f_flags); - if (!capable(CAP_SYS_ADMIN)) - return -EPERM; + if (!capable(CAP_SYS_ADMIN)) { + /* + * An unprivileged user can setup an unprivileged listener with + * limited functionality - an unprivileged event listener cannot + * request permission events, cannot set mount/filesystem marks + * and cannot request unlimited queue/marks. + */ + if ((flags & ~FANOTIFY_UNPRIV_INIT_FLAGS) || + class != FAN_CLASS_NOTIF) + return -EPERM; + + /* + * We set the internal flag FANOTIFY_UNPRIV on the group, so we + * know that we need to limit setting mount/filesystem marks on + * this group and avoid providing pid and open fd in the event. + */ + internal_flags |= FANOTIFY_UNPRIV; + } #ifdef CONFIG_AUDITSYSCALL if (flags & ~(FANOTIFY_INIT_FLAGS | FAN_ENABLE_AUDIT)) @@ -1051,7 +1080,7 @@ SYSCALL_DEFINE2(fanotify_init, unsigned int, flags, unsigned int, event_f_flags) goto out_destroy_group; } - group->fanotify_data.flags = flags; + group->fanotify_data.flags = flags | internal_flags; group->memcg = get_mem_cgroup_from_mm(current->mm); group->overflow_event = fanotify_alloc_overflow_event(); @@ -1247,6 +1276,15 @@ static int do_fanotify_mark(int fanotify_fd, unsigned int flags, __u64 mask, goto fput_and_out; group = f.file->private_data; + /* + * An unprivileged event listener is not allowed to watch a mount + * point nor a filesystem. + */ + ret = -EPERM; + if (FAN_GROUP_FLAG(group, FANOTIFY_UNPRIV) && + mark_type != FAN_MARK_INODE) + goto fput_and_out; + /* * group->priority == FS_PRIO_0 == FAN_CLASS_NOTIF. These are not * allowed to set permissions events. @@ -1379,6 +1417,7 @@ SYSCALL32_DEFINE6(fanotify_mark, */ static int __init fanotify_user_setup(void) { + BUILD_BUG_ON(FANOTIFY_INIT_FLAGS & FANOTIFY_INTERNAL_FLAGS); BUILD_BUG_ON(HWEIGHT32(FANOTIFY_INIT_FLAGS) != 10); BUILD_BUG_ON(HWEIGHT32(FANOTIFY_MARK_FLAGS) != 9); diff --git a/fs/notify/fdinfo.c b/fs/notify/fdinfo.c index f0d6b54be412..57f0d5d9f934 100644 --- a/fs/notify/fdinfo.c +++ b/fs/notify/fdinfo.c @@ -144,7 +144,8 @@ void fanotify_show_fdinfo(struct seq_file *m, struct file *f) struct fsnotify_group *group = f->private_data; seq_printf(m, "fanotify flags:%x event-flags:%x\n", - group->fanotify_data.flags, group->fanotify_data.f_flags); + group->fanotify_data.flags & FANOTIFY_INIT_FLAGS, + group->fanotify_data.f_flags); show_fdinfo(m, f, fanotify_fdinfo); } diff --git a/include/linux/fanotify.h b/include/linux/fanotify.h index 031a97d8369a..a573c1028c14 100644 --- a/include/linux/fanotify.h +++ b/include/linux/fanotify.h @@ -28,6 +28,22 @@ extern struct ctl_table fanotify_table[]; /* for sysctl */ FAN_CLOEXEC | FAN_NONBLOCK | \ FAN_UNLIMITED_QUEUE | FAN_UNLIMITED_MARKS) +/* Internal flags */ +#define FANOTIFY_UNPRIV 0x80000000 +#define FANOTIFY_INTERNAL_FLAGS (FANOTIFY_UNPRIV) + +/* + * fanotify_init() flags allowed for unprivileged listener. + * FAN_CLASS_NOTIF in this mask is purely semantic because it is zero, + * but it is the only class we allow for unprivileged listener. + * Since unprivileged listener does not provide file descriptors in events, + * reporting file handles makes sense, but it is not a must. + * FAN_REPORT_TID does not make sense for unprivileged listener, which uses + * event->pid only to filter out events generated by listener process itself. + */ +#define FANOTIFY_UNPRIV_INIT_FLAGS (FAN_CLOEXEC | FAN_NONBLOCK | \ + FAN_CLASS_NOTIF | FANOTIFY_FID_BITS) + #define FANOTIFY_MARK_TYPE_BITS (FAN_MARK_INODE | FAN_MARK_MOUNT | \ FAN_MARK_FILESYSTEM)
Add limited support for unprivileged fanotify event listener. An unprivileged event listener does not get an open file descriptor in the event nor the process pid of another process. An unprivileged event listener cannot request permission events, cannot set mount/filesystem marks and cannot request unlimited queue/marks. This enables the limited functionality similar to inotify when watching a set of files and directories for OPEN/ACCESS/MODIFY/CLOSE events, without requiring SYS_CAP_ADMIN privileges. The FAN_REPORT_DFID_NAME init flag, provide a method for an unprivileged event listener watching a set of directories (with FAN_EVENT_ON_CHILD) to monitor all changes inside those directories. This typically requires that the listener keeps a map of watched directory fid to dirfd (O_PATH), where fid is obtained with name_to_handle_at() before starting to watch for changes. When getting an event, the reported fid of the parent should be resolved to dirfd and fstatsat(2) with dirfd and name should be used to query the state of the filesystem entry. Note that even though events do not report the event creator pid, fanotify does not merge similar events on the same object that were generated by different processes. This is aligned with exiting behavior when generating processes are outside of the listener pidns (which results in reporting 0 pid to listener). Signed-off-by: Amir Goldstein <amir73il@gmail.com> --- fs/notify/fanotify/fanotify_user.c | 49 +++++++++++++++++++++++++++--- fs/notify/fdinfo.c | 3 +- include/linux/fanotify.h | 16 ++++++++++ 3 files changed, 62 insertions(+), 6 deletions(-)