diff mbox series

fanotify: support limited functionality for unprivileged users

Message ID 20181105132816.12241-1-amir73il@gmail.com (mailing list archive)
State New, archived
Headers show
Series fanotify: support limited functionality for unprivileged users | expand

Commit Message

Amir Goldstein Nov. 5, 2018, 1:28 p.m. UTC
Add support for new fanotify_init() flag FAN_UNPRIVILEGED.
User may request an unprivileged event listener using this flag even if
user is privileged.

An unprivileged event listener does not get an open file descriptor in
the event nor the process pid of another process. An unprivileged event
listener and cannot request permission events, cannot set mount/filesystem
marks and cannot request unlimited queue/marks.

This enables the limited functionality similar to inotify when watching a
single inode for OPEN/ACCESS/MODIFY/CLOSE events, but at least it does
not require CAP_ADMIN privileges.

Going forward, this limited functionality will be enriched with more
event types and the ability to watch multiple objects by indetifying
the object in the event with its nfs file handle.

Note that even though events do not report the event creator pid,
fanotify does not merge similar events on the same object that were
generated by different processes. This in aligned with exiting behavior
when generating processes are outside of the listener pidns (which
results in reporting 0 pid to listener).

Cc: <linux-api@vger.kernel.org>
Signed-off-by: Amir Goldstein <amir73il@gmail.com>
---

Jan,

I realize you are still catching up with current dev cycle fixes,
but I have mentioned FAN_UNPRIVILEGED several times on public
API discussions, so making the patch public as well.

The FAN_UNPRIVILEGED functionality was tested manually and with some
WIP LTP tests. Matthew Bobrowski is working on some more LTP tests.

FAN_UNPRIVILEGED has merit on its own and depends on no other patches,
but it is more of a prelude for more upcoming API changes.

FWIW, this patch shouldn't have any conflicts neither with my
FAN_EVENT_ON_CHILD fix patch nor with Matthew's FAN_OPEN_EXEC patches.

Thanks,
Amir.


 fs/notify/fanotify/fanotify_user.c | 45 ++++++++++++++++++++++++++----
 include/linux/fanotify.h           | 11 +++++++-
 include/uapi/linux/fanotify.h      |  1 +
 3 files changed, 50 insertions(+), 7 deletions(-)

Comments

Jan Kara Nov. 14, 2018, 1:57 p.m. UTC | #1
Hi,

On Mon 05-11-18 15:28:16, Amir Goldstein wrote:
> Add support for new fanotify_init() flag FAN_UNPRIVILEGED.
> User may request an unprivileged event listener using this flag even if
> user is privileged.
> 
> An unprivileged event listener does not get an open file descriptor in
> the event nor the process pid of another process. An unprivileged event
> listener and cannot request permission events, cannot set mount/filesystem
> marks and cannot request unlimited queue/marks.
> 
> This enables the limited functionality similar to inotify when watching a
> single inode for OPEN/ACCESS/MODIFY/CLOSE events, but at least it does
> not require CAP_ADMIN privileges.
> 
> Going forward, this limited functionality will be enriched with more
> event types and the ability to watch multiple objects by indetifying
> the object in the event with its nfs file handle.
> 
> Note that even though events do not report the event creator pid,
> fanotify does not merge similar events on the same object that were
> generated by different processes. This in aligned with exiting behavior
> when generating processes are outside of the listener pidns (which
> results in reporting 0 pid to listener).
> 
> Cc: <linux-api@vger.kernel.org>
> Signed-off-by: Amir Goldstein <amir73il@gmail.com>

OK, this patch is next on my fanotify todo list :) Thanks for the patch!

> FAN_UNPRIVILEGED has merit on its own and depends on no other patches,
> but it is more of a prelude for more upcoming API changes.

So I find it hard to see how this is going to be useful. If you don't
return open fd, listener has no way of determining on which file something
has happened. Inotify has returned watch descriptor which could be used for
this purpose but fanotify has no such thing. I guess you might want to use
your FID extensions to identify files (provided that does not leak any
sensitive information) but without that this patch is pretty much useless
AFAICT.

Also another thing I'd like to get clarified: What is FAN_UNPRIVILEGED
going to be useful for? So far it will have only disadvantages compared to
inotify and no advantages. Trying to get to feature parity with inotify is
a nice thing but unless we see how this is ever going to more useful than
inotify, I don't see much point because people will just not have any
incentive to use it.

And finally, inotify has these tunable namespaced limits on number of
events, marks, and groups (to limit memory usage and inode pinning). I
think you need to implement something similar for fanotify before you
expose it to arbitrary users.
 
> FWIW, this patch shouldn't have any conflicts neither with my
> FAN_EVENT_ON_CHILD fix patch nor with Matthew's FAN_OPEN_EXEC patches.

I think there's a conflict with FAN_REPORT_TID patch but that should be
easy to resolve.

								Honza
Amir Goldstein Nov. 14, 2018, 3:41 p.m. UTC | #2
On Wed, Nov 14, 2018 at 3:57 PM Jan Kara <jack@suse.cz> wrote:
>
> Hi,
>
> On Mon 05-11-18 15:28:16, Amir Goldstein wrote:
> > Add support for new fanotify_init() flag FAN_UNPRIVILEGED.
> > User may request an unprivileged event listener using this flag even if
> > user is privileged.
> >
> > An unprivileged event listener does not get an open file descriptor in
> > the event nor the process pid of another process. An unprivileged event
> > listener and cannot request permission events, cannot set mount/filesystem
> > marks and cannot request unlimited queue/marks.
> >
> > This enables the limited functionality similar to inotify when watching a
> > single inode for OPEN/ACCESS/MODIFY/CLOSE events, but at least it does
> > not require CAP_ADMIN privileges.
> >
> > Going forward, this limited functionality will be enriched with more
> > event types and the ability to watch multiple objects by indetifying
> > the object in the event with its nfs file handle.
> >
> > Note that even though events do not report the event creator pid,
> > fanotify does not merge similar events on the same object that were
> > generated by different processes. This in aligned with exiting behavior
> > when generating processes are outside of the listener pidns (which
> > results in reporting 0 pid to listener).
> >
> > Cc: <linux-api@vger.kernel.org>
> > Signed-off-by: Amir Goldstein <amir73il@gmail.com>
>
> OK, this patch is next on my fanotify todo list :) Thanks for the patch!
>
> > FAN_UNPRIVILEGED has merit on its own and depends on no other patches,
> > but it is more of a prelude for more upcoming API changes.
>
> So I find it hard to see how this is going to be useful. If you don't
> return open fd, listener has no way of determining on which file something
> has happened. Inotify has returned watch descriptor which could be used for
> this purpose but fanotify has no such thing. I guess you might want to use
> your FID extensions to identify files (provided that does not leak any
> sensitive information) but without that this patch is pretty much useless
> AFAICT.
>

I see your point. FAN_UNPRIVILEGED is indeed most useful along with
FAN_REPORT_FID, which I am going to claim, is safe for use with
unprivileged listeners.
In that case, I should probably post FAN_REPORT_FID patches,
then post FAN_UNPRIVILEGED patch and enforce that FAN_UNPRIVILEGED
requires FAN_REPORT_FID.

> Also another thing I'd like to get clarified: What is FAN_UNPRIVILEGED
> going to be useful for? So far it will have only disadvantages compared to
> inotify and no advantages. Trying to get to feature parity with inotify is
> a nice thing but unless we see how this is ever going to more useful than
> inotify, I don't see much point because people will just not have any
> incentive to use it.

I see your point again.
At this point in time, the only FAN_UNPRIVILEGED features I can think of
are event->pid, namely was the event generated by self, and the new
FAN_OPEN_EXEC. Maybe not enough to migrate away from inotify.
But if we lay the foundations of FAN_UNPRIVILEGED, people may come
up with other FAN_UNPRIVILEGED features, because one thing we can
say for certain is that inotify is not going to gain any new features - right?
Anyway, there is no rush for me.
If the benefit of FAN_UNPRIVILEGED is questionable, we can leave it to
another time. The patch is quite small and will be event smaller after
FAN_REPORT_FID (which already implies FAN_NOFD).

>
> And finally, inotify has these tunable namespaced limits on number of
> events, marks, and groups (to limit memory usage and inode pinning). I
> think you need to implement something similar for fanotify before you
> expose it to arbitrary users.
>

Good point. More of a reason to leave FAN_UNPRIVILEGED for another
time. If you change your mind or find better reasons to expedite
FAN_UNPRIVILEGED, please let me know.

> > FWIW, this patch shouldn't have any conflicts neither with my
> > FAN_EVENT_ON_CHILD fix patch nor with Matthew's FAN_OPEN_EXEC patches.
>
> I think there's a conflict with FAN_REPORT_TID patch but that should be
> easy to resolve.
>

That is strange. All my work is based off of v4.20-rc1.

Thanks,
Amir.
diff mbox series

Patch

diff --git a/fs/notify/fanotify/fanotify_user.c b/fs/notify/fanotify/fanotify_user.c
index e03be5071362..3e64326085ba 100644
--- a/fs/notify/fanotify/fanotify_user.c
+++ b/fs/notify/fanotify/fanotify_user.c
@@ -132,10 +132,21 @@  static int fill_event_metadata(struct fsnotify_group *group,
 	metadata->vers = FANOTIFY_METADATA_VERSION;
 	metadata->reserved = 0;
 	metadata->mask = fsn_event->mask & FANOTIFY_OUTGOING_EVENTS;
-	metadata->pid = pid_vnr(event->pid);
-	if (unlikely(fsn_event->mask & FAN_Q_OVERFLOW))
+	/*
+	 * An unprivileged event listener does not get an open file descriptor
+	 * in the event nor another generating process pid. If the event was
+	 * generated by the unprivileged process itself, self pid is reported.
+	 */
+	if (!FAN_GROUP_FLAG(group, FAN_UNPRIVILEGED) ||
+	    task_tgid(current) == event->pid)
+		metadata->pid = pid_vnr(event->pid);
+	else
+		metadata->pid = 0;
+
+	if (FAN_GROUP_FLAG(group, FAN_UNPRIVILEGED) ||
+	    unlikely(fsn_event->mask & FAN_Q_OVERFLOW)) {
 		metadata->fd = FAN_NOFD;
-	else {
+	} else {
 		metadata->fd = create_fd(group, event, file);
 		if (metadata->fd < 0)
 			ret = metadata->fd;
@@ -683,12 +694,26 @@  SYSCALL_DEFINE2(fanotify_init, unsigned int, flags, unsigned int, event_f_flags)
 	int f_flags, fd;
 	struct user_struct *user;
 	struct fanotify_event_info *oevent;
+	unsigned int class = flags & FANOTIFY_CLASS_BITS;
 
 	pr_debug("%s: flags=%x event_f_flags=%x\n",
 		 __func__, flags, event_f_flags);
 
-	if (!capable(CAP_SYS_ADMIN))
+	if (flags & FAN_UNPRIVILEGED) {
+		/*
+		 * User can request an unprivileged event listener even if
+		 * user is privileged. An unprivileged event listener does not
+		 * get an open file descriptor in the event nor the proccess id
+		 * of another process. An unprivileged event listener and cannot
+		 * request permission events, cannot set mount/filesystem marks
+		 * and cannot request unlimited queue/marks.
+		 */
+		if ((flags & ~FANOTIFY_UNPRIV_INIT_FLAGS) ||
+		    class != FAN_CLASS_NOTIF)
+			return -EINVAL;
+	} else if (!capable(CAP_SYS_ADMIN)) {
 		return -EPERM;
+	}
 
 #ifdef CONFIG_AUDITSYSCALL
 	if (flags & ~(FANOTIFY_INIT_FLAGS | FAN_ENABLE_AUDIT))
@@ -745,7 +770,7 @@  SYSCALL_DEFINE2(fanotify_init, unsigned int, flags, unsigned int, event_f_flags)
 	group->fanotify_data.f_flags = event_f_flags;
 	init_waitqueue_head(&group->fanotify_data.access_waitq);
 	INIT_LIST_HEAD(&group->fanotify_data.access_list);
-	switch (flags & FANOTIFY_CLASS_BITS) {
+	switch (class) {
 	case FAN_CLASS_NOTIF:
 		group->priority = FS_PRIO_0;
 		break;
@@ -865,6 +890,14 @@  static int do_fanotify_mark(int fanotify_fd, unsigned int flags, __u64 mask,
 	    group->priority == FS_PRIO_0)
 		goto fput_and_out;
 
+	/*
+	 * An unprivileged event listener is not allowed to watch a mount
+	 * point nor a filesystem.
+	 */
+	if (FAN_GROUP_FLAG(group, FAN_UNPRIVILEGED) &&
+	    mark_type != FAN_MARK_INODE)
+		goto fput_and_out;
+
 	if (flags & FAN_MARK_FLUSH) {
 		ret = 0;
 		if (mark_type == FAN_MARK_MOUNT)
@@ -944,7 +977,7 @@  COMPAT_SYSCALL_DEFINE6(fanotify_mark,
  */
 static int __init fanotify_user_setup(void)
 {
-	BUILD_BUG_ON(HWEIGHT32(FANOTIFY_INIT_FLAGS) != 7);
+	BUILD_BUG_ON(HWEIGHT32(FANOTIFY_INIT_FLAGS) != 8);
 	BUILD_BUG_ON(HWEIGHT32(FANOTIFY_MARK_FLAGS) != 9);
 
 	fanotify_mark_cache = KMEM_CACHE(fsnotify_mark,
diff --git a/include/linux/fanotify.h b/include/linux/fanotify.h
index a5a60691e48b..4d31be37ba2a 100644
--- a/include/linux/fanotify.h
+++ b/include/linux/fanotify.h
@@ -21,7 +21,16 @@ 
 #define FANOTIFY_INIT_FLAGS	(FANOTIFY_CLASS_BITS | \
 				 FAN_REPORT_TID | \
 				 FAN_CLOEXEC | FAN_NONBLOCK | \
-				 FAN_UNLIMITED_QUEUE | FAN_UNLIMITED_MARKS)
+				 FAN_UNLIMITED_QUEUE | FAN_UNLIMITED_MARKS | \
+				 FAN_UNPRIVILEGED)
+
+/*
+ * fanotify_init() flags allowed for unprivileged users.
+ * It's only ok to add FAN_CLASS_NOTIF to this mask because it is zero and
+ * because it is the only class we allow for unprivileged users.
+ */
+#define FANOTIFY_UNPRIV_INIT_FLAGS	(FAN_CLOEXEC | FAN_NONBLOCK | \
+					 FAN_CLASS_NOTIF | FAN_UNPRIVILEGED)
 
 #define FANOTIFY_MARK_TYPE_BITS	(FAN_MARK_INODE | FAN_MARK_MOUNT | \
 				 FAN_MARK_FILESYSTEM)
diff --git a/include/uapi/linux/fanotify.h b/include/uapi/linux/fanotify.h
index b86740d1c50a..6d7dbab3fb14 100644
--- a/include/uapi/linux/fanotify.h
+++ b/include/uapi/linux/fanotify.h
@@ -39,6 +39,7 @@ 
 #define FAN_UNLIMITED_QUEUE	0x00000010
 #define FAN_UNLIMITED_MARKS	0x00000020
 #define FAN_ENABLE_AUDIT	0x00000040
+#define FAN_UNPRIVILEGED	0x00000080
 
 /* Flags to determine fanotify event format */
 #define FAN_REPORT_TID		0x00000100	/* event->pid is thread id */