[v5,15/22] fsnotify: send event with parent/name info to sb/mount/non-dir marks
diff mbox series

Message ID 20200716084230.30611-16-amir73il@gmail.com
State New
Headers show
Series
  • fanotify events with name info
Related show

Commit Message

Amir Goldstein July 16, 2020, 8:42 a.m. UTC
Similar to events "on child" to watching directory, send event "on child"
with parent/name info if sb/mount/non-dir marks are interested in
parent/name info.

The FS_EVENT_ON_CHILD flag can be set on sb/mount/non-dir marks to specify
interest in parent/name info for events on non-directory inodes.

Events on "orphan" children (disconnected dentries) are sent without
parent/name info.

Events on direcories are send with parent/name info only if the parent
directory is watching.

Signed-off-by: Amir Goldstein <amir73il@gmail.com>
---
 fs/notify/fsnotify.c             | 54 ++++++++++++++++++++++++++++----
 include/linux/fsnotify.h         | 10 ++++--
 include/linux/fsnotify_backend.h | 32 ++++++++++++++++---
 3 files changed, 84 insertions(+), 12 deletions(-)

Comments

Jan Kara July 16, 2020, 5:01 p.m. UTC | #1
On Thu 16-07-20 11:42:23, Amir Goldstein wrote:
> Similar to events "on child" to watching directory, send event "on child"
> with parent/name info if sb/mount/non-dir marks are interested in
> parent/name info.
> 
> The FS_EVENT_ON_CHILD flag can be set on sb/mount/non-dir marks to specify
> interest in parent/name info for events on non-directory inodes.
> 
> Events on "orphan" children (disconnected dentries) are sent without
> parent/name info.
> 
> Events on direcories are send with parent/name info only if the parent
> directory is watching.
> 
> Signed-off-by: Amir Goldstein <amir73il@gmail.com>

Hum, doesn't this break ignore mask handling in
fanotify_group_event_mask()? Because parent's ignore mask will be included
even though parent is added into the iter only to carry the parent info...

Also I'm constantly getting confused about mark->mask handling in that
function WRT __fsnotify_parent() sending FS_EVENT_ON_CHILD event. But in
the end I've convinced myself it is correct ;)

								Honza
Amir Goldstein July 16, 2020, 5:20 p.m. UTC | #2
On Thu, Jul 16, 2020 at 8:01 PM Jan Kara <jack@suse.cz> wrote:
>
> On Thu 16-07-20 11:42:23, Amir Goldstein wrote:
> > Similar to events "on child" to watching directory, send event "on child"
> > with parent/name info if sb/mount/non-dir marks are interested in
> > parent/name info.
> >
> > The FS_EVENT_ON_CHILD flag can be set on sb/mount/non-dir marks to specify
> > interest in parent/name info for events on non-directory inodes.
> >
> > Events on "orphan" children (disconnected dentries) are sent without
> > parent/name info.
> >
> > Events on direcories are send with parent/name info only if the parent
> > directory is watching.
> >
> > Signed-off-by: Amir Goldstein <amir73il@gmail.com>
>
> Hum, doesn't this break ignore mask handling in
> fanotify_group_event_mask()? Because parent's ignore mask will be included
> even though parent is added into the iter only to carry the parent info...
>

Hmm, break ignore mask handling? or fix it?

Man page said:
"Having these two types of masks permits a mount point or directory to be
 marked for receiving events, while at the  same time ignoring events for
 specific objects under that mount point or directory."

The author did not say what to expect from marking a mount and ignoring
a directory.

As it turns out, I need this exact functionality for my snapshot.
- sb is watching all (pre) modify events
- after dir has been marked with a change in snapshot an exclude
  mark is set on that dir inode
- further modification events on files inside that dir are ignored
  without calling event handler

I am sure you are aware that we have been fixing a lot of problems
in handling combinations of mark masks.
I see the unified event as another step in the direction to fix those
issues and to get consistent and expected behavior.

Thanks,
Amir.
Jan Kara July 16, 2020, 5:57 p.m. UTC | #3
On Thu 16-07-20 20:20:04, Amir Goldstein wrote:
> On Thu, Jul 16, 2020 at 8:01 PM Jan Kara <jack@suse.cz> wrote:
> >
> > On Thu 16-07-20 11:42:23, Amir Goldstein wrote:
> > > Similar to events "on child" to watching directory, send event "on child"
> > > with parent/name info if sb/mount/non-dir marks are interested in
> > > parent/name info.
> > >
> > > The FS_EVENT_ON_CHILD flag can be set on sb/mount/non-dir marks to specify
> > > interest in parent/name info for events on non-directory inodes.
> > >
> > > Events on "orphan" children (disconnected dentries) are sent without
> > > parent/name info.
> > >
> > > Events on direcories are send with parent/name info only if the parent
> > > directory is watching.
> > >
> > > Signed-off-by: Amir Goldstein <amir73il@gmail.com>
> >
> > Hum, doesn't this break ignore mask handling in
> > fanotify_group_event_mask()? Because parent's ignore mask will be included
> > even though parent is added into the iter only to carry the parent info...
> >
> 
> Hmm, break ignore mask handling? or fix it?
> 
> Man page said:
> "Having these two types of masks permits a mount point or directory to be
>  marked for receiving events, while at the  same time ignoring events for
>  specific objects under that mount point or directory."

Right, but presumably that speaks of the case of a mark where the parent
has FS_EVENT_ON_CHILD set. For case of parent watching events of a child, I
agree it makes sense to apply ignore masks of both the parent and the child.

> The author did not say what to expect from marking a mount and ignoring
> a directory.

Yes and I'd expect to apply ignore mask on events for that directory but
not for events on files in that directory... Even more so because this will
be currently inconsistent wrt whether the child is dir (parent's ignore mask
does not apply) or file (parent's ignore mask does apply).

> As it turns out, I need this exact functionality for my snapshot.
> - sb is watching all (pre) modify events
> - after dir has been marked with a change in snapshot an exclude
>   mark is set on that dir inode
> - further modification events on files inside that dir are ignored
>   without calling event handler

Yes, I can see how the functionality is useful. Maybe with
FS_EVENT_ON_CHILD in the ignore mask, applying the mask to child's events
would make sense... But that's IMO a dispute for a different patch.

> I am sure you are aware that we have been fixing a lot of problems
> in handling combinations of mark masks.

Very well aware :)

> I see the unified event as another step in the direction to fix those
> issues and to get consistent and expected behavior.

								Honza
Amir Goldstein July 16, 2020, 6:42 p.m. UTC | #4
On Thu, Jul 16, 2020 at 8:57 PM Jan Kara <jack@suse.cz> wrote:
>
> On Thu 16-07-20 20:20:04, Amir Goldstein wrote:
> > On Thu, Jul 16, 2020 at 8:01 PM Jan Kara <jack@suse.cz> wrote:
> > >
> > > On Thu 16-07-20 11:42:23, Amir Goldstein wrote:
> > > > Similar to events "on child" to watching directory, send event "on child"
> > > > with parent/name info if sb/mount/non-dir marks are interested in
> > > > parent/name info.
> > > >
> > > > The FS_EVENT_ON_CHILD flag can be set on sb/mount/non-dir marks to specify
> > > > interest in parent/name info for events on non-directory inodes.
> > > >
> > > > Events on "orphan" children (disconnected dentries) are sent without
> > > > parent/name info.
> > > >
> > > > Events on direcories are send with parent/name info only if the parent
> > > > directory is watching.
> > > >
> > > > Signed-off-by: Amir Goldstein <amir73il@gmail.com>
> > >
> > > Hum, doesn't this break ignore mask handling in
> > > fanotify_group_event_mask()? Because parent's ignore mask will be included
> > > even though parent is added into the iter only to carry the parent info...
> > >
> >
> > Hmm, break ignore mask handling? or fix it?
> >
> > Man page said:
> > "Having these two types of masks permits a mount point or directory to be
> >  marked for receiving events, while at the  same time ignoring events for
> >  specific objects under that mount point or directory."
>
> Right, but presumably that speaks of the case of a mark where the parent
> has FS_EVENT_ON_CHILD set. For case of parent watching events of a child, I
> agree it makes sense to apply ignore masks of both the parent and the child.
>
> > The author did not say what to expect from marking a mount and ignoring
> > a directory.
>
> Yes and I'd expect to apply ignore mask on events for that directory but
> not for events on files in that directory... Even more so because this will
> be currently inconsistent wrt whether the child is dir (parent's ignore mask
> does not apply) or file (parent's ignore mask does apply).
>

Indeed. For that I used this trick in my POC:

        /* Set the mark mask, so fsnotify_parent() will find this mark */
        ovm->fsn_mark.mask = mask | FS_EVENT_ON_CHILD;
        ovm->fsn_mark.ignored_mask = mask;

It's not how users are expected to configure an ignored mask on children
but we can work the ignored mask information into the object mask, like
I already did w.r.t FS_MODIFY and get the same result without the hack.

Thanks,
Amir.

P.S. for whoever is interested, my POC is on ovl-fsnotify branch.
It seems to be working well. I am just trying to get those "ephemeral
exclude marks" to not pin the dir inodes to cache, so that those inodes
could be evicted.
Jan Kara July 16, 2020, 10:34 p.m. UTC | #5
On Thu 16-07-20 21:42:20, Amir Goldstein wrote:
> On Thu, Jul 16, 2020 at 8:57 PM Jan Kara <jack@suse.cz> wrote:
> >
> > On Thu 16-07-20 20:20:04, Amir Goldstein wrote:
> > > On Thu, Jul 16, 2020 at 8:01 PM Jan Kara <jack@suse.cz> wrote:
> > > >
> > > > On Thu 16-07-20 11:42:23, Amir Goldstein wrote:
> > > > > Similar to events "on child" to watching directory, send event "on child"
> > > > > with parent/name info if sb/mount/non-dir marks are interested in
> > > > > parent/name info.
> > > > >
> > > > > The FS_EVENT_ON_CHILD flag can be set on sb/mount/non-dir marks to specify
> > > > > interest in parent/name info for events on non-directory inodes.
> > > > >
> > > > > Events on "orphan" children (disconnected dentries) are sent without
> > > > > parent/name info.
> > > > >
> > > > > Events on direcories are send with parent/name info only if the parent
> > > > > directory is watching.
> > > > >
> > > > > Signed-off-by: Amir Goldstein <amir73il@gmail.com>
> > > >
> > > > Hum, doesn't this break ignore mask handling in
> > > > fanotify_group_event_mask()? Because parent's ignore mask will be included
> > > > even though parent is added into the iter only to carry the parent info...
> > > >
> > >
> > > Hmm, break ignore mask handling? or fix it?
> > >
> > > Man page said:
> > > "Having these two types of masks permits a mount point or directory to be
> > >  marked for receiving events, while at the  same time ignoring events for
> > >  specific objects under that mount point or directory."
> >
> > Right, but presumably that speaks of the case of a mark where the parent
> > has FS_EVENT_ON_CHILD set. For case of parent watching events of a child, I
> > agree it makes sense to apply ignore masks of both the parent and the child.
> >
> > > The author did not say what to expect from marking a mount and ignoring
> > > a directory.
> >
> > Yes and I'd expect to apply ignore mask on events for that directory but
> > not for events on files in that directory... Even more so because this will
> > be currently inconsistent wrt whether the child is dir (parent's ignore mask
> > does not apply) or file (parent's ignore mask does apply).
> >
> 
> Indeed. For that I used this trick in my POC:
> 
>         /* Set the mark mask, so fsnotify_parent() will find this mark */
>         ovm->fsn_mark.mask = mask | FS_EVENT_ON_CHILD;
>         ovm->fsn_mark.ignored_mask = mask;
> 
> It's not how users are expected to configure an ignored mask on children
> but we can work the ignored mask information into the object mask, like
> I already did w.r.t FS_MODIFY and get the same result without the hack.

OK, nice trick but for this series, I'd like to keep the original ignore
mask behavior (bug to bug compatibility) or possibly let parent's ignore
mask be applied only for events being sent to the parent due to its
FS_EVENT_ON_CHILD. Can you please fix that up? I won't get to it before I
leave for vacation but once I return, I'd like to just pick the fixed up
commit and push everything to linux-next... Thanks!

								Honza
Amir Goldstein July 17, 2020, 3:49 a.m. UTC | #6
> OK, nice trick but for this series, I'd like to keep the original ignore
> mask behavior (bug to bug compatibility) or possibly let parent's ignore
> mask be applied only for events being sent to the parent due to its
> FS_EVENT_ON_CHILD.

That should be easy if we set the FS_EVENT_ON_CHILD flag only for
the case of a watching parent.
And I believe that would make the flag meaningful again and not
redundant as you now see it.
Please note that the series is already not "bug compatible", because
the patch " send event to parent and child with single callback"
already fixes the combination parent watch + child ignore, which
did not work before and this is declared in commit message.

> Can you please fix that up?

Will do.
I will also take care of LTP test coverage for the ignore mask cases
that have been fixed and for dnotify/inotify parent+child watching case.

> I won't get to it before I
> leave for vacation but once I return, I'd like to just pick the fixed up
> commit and push everything to linux-next... Thanks!
>

Thanks a lot for everything!

Amir.

Patch
diff mbox series

diff --git a/fs/notify/fsnotify.c b/fs/notify/fsnotify.c
index 7120c675e9a6..efa5c1c4908a 100644
--- a/fs/notify/fsnotify.c
+++ b/fs/notify/fsnotify.c
@@ -142,31 +142,73 @@  void __fsnotify_update_child_dentry_flags(struct inode *inode)
 	spin_unlock(&inode->i_lock);
 }
 
+/* Are inode/sb/mount interested in parent and name info with this event? */
+static bool fsnotify_event_needs_parent(struct inode *inode, struct mount *mnt,
+					__u32 mask)
+{
+	__u32 marks_mask = 0;
+
+	/* We only send parent/name to inode/sb/mount for events on non-dir */
+	if (mask & FS_ISDIR)
+		return false;
+
+	/* Did either inode/sb/mount subscribe for events with parent/name? */
+	marks_mask |= fsnotify_parent_needed_mask(inode->i_fsnotify_mask);
+	marks_mask |= fsnotify_parent_needed_mask(inode->i_sb->s_fsnotify_mask);
+	if (mnt)
+		marks_mask |= fsnotify_parent_needed_mask(mnt->mnt_fsnotify_mask);
+
+	/* Did they subscribe for this event with parent/name info? */
+	return mask & marks_mask;
+}
+
 /*
  * Notify this dentry's parent about a child's events with child name info
- * if parent is watching.
- * Notify only the child without name info if parent is not watching.
+ * if parent is watching or if inode/sb/mount are interested in events with
+ * parent and name info.
+ *
+ * Notify only the child without name info if parent is not watching and
+ * inode/sb/mount are not interested in events with parent and name info.
  */
 int __fsnotify_parent(struct dentry *dentry, __u32 mask, const void *data,
 		      int data_type)
 {
+	const struct path *path = fsnotify_data_path(data, data_type);
+	struct mount *mnt = path ? real_mount(path->mnt) : NULL;
 	struct inode *inode = d_inode(dentry);
 	struct dentry *parent;
+	bool parent_watched = dentry->d_flags & DCACHE_FSNOTIFY_PARENT_WATCHED;
+	__u32 p_mask;
 	struct inode *p_inode;
 	struct name_snapshot name;
 	struct qstr *file_name = NULL;
 	int ret = 0;
 
+	/*
+	 * Do inode/sb/mount care about parent and name info on non-dir?
+	 * Do they care about any event at all?
+	 */
+	if (!inode->i_fsnotify_marks && !inode->i_sb->s_fsnotify_marks &&
+	    (!mnt || !mnt->mnt_fsnotify_marks) && !parent_watched)
+		return 0;
+
 	parent = NULL;
-	if (!(dentry->d_flags & DCACHE_FSNOTIFY_PARENT_WATCHED))
+	if (!parent_watched && !fsnotify_event_needs_parent(inode, mnt, mask))
 		goto notify;
 
+	/* Does parent inode care about events on children? */
 	parent = dget_parent(dentry);
 	p_inode = parent->d_inode;
-
-	if (unlikely(!fsnotify_inode_watches_children(p_inode))) {
+	p_mask = fsnotify_inode_watches_children(p_inode);
+	if (unlikely(parent_watched && !p_mask))
 		__fsnotify_update_child_dentry_flags(p_inode);
-	} else if (p_inode->i_fsnotify_mask & mask & ALL_FSNOTIFY_EVENTS) {
+
+	/*
+	 * Include parent/name in notification either if some notification
+	 * groups require parent info (!parent_watched case) or the parent is
+	 * interested in this event.
+	 */
+	if (!parent_watched || (mask & p_mask & ALL_FSNOTIFY_EVENTS)) {
 		/* When notifying parent, child should be passed as data */
 		WARN_ON_ONCE(inode != fsnotify_data_inode(data, data_type));
 
diff --git a/include/linux/fsnotify.h b/include/linux/fsnotify.h
index 9b2566d273a9..044cae3a0628 100644
--- a/include/linux/fsnotify.h
+++ b/include/linux/fsnotify.h
@@ -44,10 +44,16 @@  static inline int fsnotify_parent(struct dentry *dentry, __u32 mask,
 {
 	struct inode *inode = d_inode(dentry);
 
-	if (S_ISDIR(inode->i_mode))
+	if (S_ISDIR(inode->i_mode)) {
 		mask |= FS_ISDIR;
 
-	if (!(dentry->d_flags & DCACHE_FSNOTIFY_PARENT_WATCHED))
+		/* sb/mount marks are not interested in name of directory */
+		if (!(dentry->d_flags & DCACHE_FSNOTIFY_PARENT_WATCHED))
+			goto notify_child;
+	}
+
+	/* disconnected dentry cannot notify parent */
+	if (IS_ROOT(dentry))
 		goto notify_child;
 
 	return __fsnotify_parent(dentry, mask, data, data_type);
diff --git a/include/linux/fsnotify_backend.h b/include/linux/fsnotify_backend.h
index 2c62628566c5..6f0df110e9f8 100644
--- a/include/linux/fsnotify_backend.h
+++ b/include/linux/fsnotify_backend.h
@@ -49,8 +49,11 @@ 
 #define FS_OPEN_EXEC_PERM	0x00040000	/* open/exec event in a permission hook */
 
 #define FS_EXCL_UNLINK		0x04000000	/* do not send events if object is unlinked */
-/* This inode cares about things that happen to its children.  Always set for
- * dnotify and inotify. */
+/*
+ * Set on inode mark that cares about things that happen to its children.
+ * Always set for dnotify and inotify.
+ * Set on inode/sb/mount marks that care about parenet/name info.
+ */
 #define FS_EVENT_ON_CHILD	0x08000000
 
 #define FS_DN_RENAME		0x10000000	/* file renamed */
@@ -72,14 +75,22 @@ 
 				  FS_OPEN_EXEC_PERM)
 
 /*
- * This is a list of all events that may get sent to a parent based on fs event
- * happening to inodes inside that directory.
+ * This is a list of all events that may get sent to a parent that is watching
+ * with flag FS_EVENT_ON_CHILD based on fs event on a child of that directory.
  */
 #define FS_EVENTS_POSS_ON_CHILD   (ALL_FSNOTIFY_PERM_EVENTS | \
 				   FS_ACCESS | FS_MODIFY | FS_ATTRIB | \
 				   FS_CLOSE_WRITE | FS_CLOSE_NOWRITE | \
 				   FS_OPEN | FS_OPEN_EXEC)
 
+/*
+ * This is a list of all events that may get sent with the parent inode as the
+ * @to_tell argument of fsnotify().
+ * It may include events that can be sent to an inode/sb/mount mark, but cannot
+ * be sent to a parent watching children.
+ */
+#define FS_EVENTS_POSS_TO_PARENT (FS_EVENTS_POSS_ON_CHILD)
+
 /* Events that can be reported to backends */
 #define ALL_FSNOTIFY_EVENTS (ALL_FSNOTIFY_DIRENT_EVENTS | \
 			     FS_EVENTS_POSS_ON_CHILD | \
@@ -398,6 +409,19 @@  extern void __fsnotify_vfsmount_delete(struct vfsmount *mnt);
 extern void fsnotify_sb_delete(struct super_block *sb);
 extern u32 fsnotify_get_cookie(void);
 
+static inline __u32 fsnotify_parent_needed_mask(__u32 mask)
+{
+	/* FS_EVENT_ON_CHILD is set on marks that want parent/name info */
+	if (!(mask & FS_EVENT_ON_CHILD))
+		return 0;
+	/*
+	 * This object might be watched by a mark that cares about parent/name
+	 * info, does it care about the specific set of events that can be
+	 * reported with parent/name info?
+	 */
+	return mask & FS_EVENTS_POSS_TO_PARENT;
+}
+
 static inline int fsnotify_inode_watches_children(struct inode *inode)
 {
 	/* FS_EVENT_ON_CHILD is set if the inode may care */