Message ID | 20190424100951.16678-1-amir73il@gmail.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | fsnotify: do not generate duplicate events for "fake" path | expand |
Hi Amir! On Wed 24-04-19 13:09:51, Amir Goldstein wrote: > Overlayfs "fake" path is used for stacked file operations on > underlying files. Operations on files with "fake" path must not > generate events on mount marks and on parent watches, because > those events have already been generated at overlayfs layer. > > The reported event->fd for inode/sb marks will have the wrong path > (overlayfs path), but we have no choice but to report them anyway. > > Link: https://lore.kernel.org/linux-fsdevel/20190423065024.12695-1-jencce.kernel@gmail.com/ > Reported-by: Murphy Zhou <jencce.kernel@gmail.com> > Fixes: d1d04ef8572b ("ovl: stack file ops") > Signed-off-by: Amir Goldstein <amir73il@gmail.com> So honestly I don't quite like that fsnotify core has to care about peculiarities of overlayfs. And I'm not sure I fully understand what the problem actually is so let me try to summarize here: With write to overlayfs, we generate event both for inode on underlying filesystem (upper inode; this gets generated by vfs_write()->ovl_write_iter()->do_iter_write()) and also for "virtual" overlayfs inode (generated directly by vfs_write()). Now for overlayfs inode the (mnt, dentry) pair is a valid one - the one used for opening the file. For upper inode you say we don't use proper path (mnt, dentry) pair but some made-up one - looks like the original overlayfs path if I read the code right. So fsnotify will get two fsnotify_modify() calls, both with the same file->f_path but with different file->f_inode. So no surprise mntpoint / path based stuff is getting confused by this. I guess the first question that comes to my mind is: Is fsnotify on 'upper' inodes actually useful? On overlayfs as a whole it makes sense but on individual filesystems I'm not so sure. And here we can see that mountpoint watches are going to have hard times, some events (e.g. open / close) do not seem to be generated at all, not sure if directory events are generated... Honza > --- > > Jan, > > This is a slightly simplified and cleaner version of the patch I posted > yesterday on the linked bug report thread. > > I have posted an additional overlayfs test case to fanotify06 on LTP > list. > > Thanks, > Amir. > > fs/notify/fsnotify.c | 3 ++- > include/linux/fsnotify.h | 23 ++++++++++++++++++++--- > 2 files changed, 22 insertions(+), 4 deletions(-) > > diff --git a/fs/notify/fsnotify.c b/fs/notify/fsnotify.c > index df06f3da166c..6f752b13e3fd 100644 > --- a/fs/notify/fsnotify.c > +++ b/fs/notify/fsnotify.c > @@ -334,7 +334,8 @@ int fsnotify(struct inode *to_tell, __u32 mask, const void *data, int data_is, > int ret = 0; > __u32 test_mask = (mask & ALL_FSNOTIFY_EVENTS); > > - if (data_is == FSNOTIFY_EVENT_PATH) { > + if (data_is == FSNOTIFY_EVENT_PATH && > + !fsnotify_is_fake_path(to_tell, data)) { > mnt = real_mount(((const struct path *)data)->mnt); > mnt_or_sb_mask |= mnt->mnt_fsnotify_mask; > } > diff --git a/include/linux/fsnotify.h b/include/linux/fsnotify.h > index 09587e2860b5..c51f57d3a025 100644 > --- a/include/linux/fsnotify.h > +++ b/include/linux/fsnotify.h > @@ -40,6 +40,20 @@ static inline int fsnotify_parent(const struct path *path, > return __fsnotify_parent(path, dentry, mask); > } > > +/* > + * Overlayfs "fake" path is used for stacked file operations on underlying > + * files. file->f_path is from overlayfs layer and file->f_inode is from > + * underlying layer. We must not generate events on mount and on parent > + * based on fake path, because those events have already been generated at > + * overlayfs layer. The reported event->fd for inode/sb marks will have the > + * wrong path (overlayfs path), but we have no choice but to report them as is. > + */ > +static inline bool fsnotify_is_fake_path(struct inode *inode, > + const struct path *path) > +{ > + return unlikely(inode->i_sb != path->dentry->d_sb); > +} > + > /* > * Simple wrapper to consolidate calls fsnotify_parent()/fsnotify() when > * an event is on a path. > @@ -47,10 +61,13 @@ static inline int fsnotify_parent(const struct path *path, > static inline int fsnotify_path(struct inode *inode, const struct path *path, > __u32 mask) > { > - int ret = fsnotify_parent(path, NULL, mask); > + if (!fsnotify_is_fake_path(inode, path)) { > + int ret = fsnotify_parent(path, NULL, mask); > + > + if (ret) > + return ret; > + } > > - if (ret) > - return ret; > return fsnotify(inode, mask, path, FSNOTIFY_EVENT_PATH, NULL, 0); > } > > -- > 2.17.1 >
On Wed, Apr 24, 2019 at 6:57 PM Jan Kara <jack@suse.cz> wrote: > > Hi Amir! > > On Wed 24-04-19 13:09:51, Amir Goldstein wrote: > > Overlayfs "fake" path is used for stacked file operations on > > underlying files. Operations on files with "fake" path must not > > generate events on mount marks and on parent watches, because > > those events have already been generated at overlayfs layer. > > > > The reported event->fd for inode/sb marks will have the wrong path > > (overlayfs path), but we have no choice but to report them anyway. > > > > Link: https://lore.kernel.org/linux-fsdevel/20190423065024.12695-1-jencce.kernel@gmail.com/ > > Reported-by: Murphy Zhou <jencce.kernel@gmail.com> > > Fixes: d1d04ef8572b ("ovl: stack file ops") > > Signed-off-by: Amir Goldstein <amir73il@gmail.com> > > So honestly I don't quite like that fsnotify core has to care about > peculiarities of overlayfs. And I'm not sure I fully understand what the > problem actually is so let me try to summarize here: > > With write to overlayfs, we generate event both for inode on underlying > filesystem (upper inode; this gets generated by > vfs_write()->ovl_write_iter()->do_iter_write()) and also for "virtual" > overlayfs inode (generated directly by vfs_write()). Now for overlayfs > inode the (mnt, dentry) pair is a valid one - the one used for opening the > file. For upper inode you say we don't use proper path (mnt, dentry) pair > but some made-up one - looks like the original overlayfs path if I read the > code right. So fsnotify will get two fsnotify_modify() calls, both with the > same file->f_path but with different file->f_inode. So no surprise mntpoint > / path based stuff is getting confused by this. > That is correct. > I guess the first question that comes to my mind is: Is fsnotify on 'upper' > inodes actually useful? On overlayfs as a whole it makes sense but on > individual filesystems I'm not so sure. Not sure. I think the use case is FAN_MARK_FILESYSTEM watch from host by Anti-Virus software to protect host from containers. Yes, host can watch all container overlayfs mounts, but that is less scalable. > And here we can see that mountpoint > watches are going to have hard times, some events (e.g. open / close) do > not seem to be generated at all, not sure if directory events are > generated... > So it seems that before 573e1784817c Revert "fsnotify: support overlayfs" no fanotify events were generated on underlying regular inodes. Overlayfs uses private cloned mounts to access real inodes, so marking underlying mounts was never going to produce any events. Overlayfs does open underlying directory files, so events on underlying directories would be generated (I think) even before v4.19. inotify and new fanotify directory modification events would also be generated from overlayfs, but we do not have a "fake" path issue with those. The bottom line is that it is probably best not to report any events originating from overlayfs file with "fake" path and the simplest way to achieve that would be to open the overlayfs realfile with FMODE_NONOTIFY. At first I though doing that may cause regressions from pre v4.19, but I was wrong. It would be the same behavior as pre v4.19. I'll go write the patch... Thanks, Amir.
diff --git a/fs/notify/fsnotify.c b/fs/notify/fsnotify.c index df06f3da166c..6f752b13e3fd 100644 --- a/fs/notify/fsnotify.c +++ b/fs/notify/fsnotify.c @@ -334,7 +334,8 @@ int fsnotify(struct inode *to_tell, __u32 mask, const void *data, int data_is, int ret = 0; __u32 test_mask = (mask & ALL_FSNOTIFY_EVENTS); - if (data_is == FSNOTIFY_EVENT_PATH) { + if (data_is == FSNOTIFY_EVENT_PATH && + !fsnotify_is_fake_path(to_tell, data)) { mnt = real_mount(((const struct path *)data)->mnt); mnt_or_sb_mask |= mnt->mnt_fsnotify_mask; } diff --git a/include/linux/fsnotify.h b/include/linux/fsnotify.h index 09587e2860b5..c51f57d3a025 100644 --- a/include/linux/fsnotify.h +++ b/include/linux/fsnotify.h @@ -40,6 +40,20 @@ static inline int fsnotify_parent(const struct path *path, return __fsnotify_parent(path, dentry, mask); } +/* + * Overlayfs "fake" path is used for stacked file operations on underlying + * files. file->f_path is from overlayfs layer and file->f_inode is from + * underlying layer. We must not generate events on mount and on parent + * based on fake path, because those events have already been generated at + * overlayfs layer. The reported event->fd for inode/sb marks will have the + * wrong path (overlayfs path), but we have no choice but to report them as is. + */ +static inline bool fsnotify_is_fake_path(struct inode *inode, + const struct path *path) +{ + return unlikely(inode->i_sb != path->dentry->d_sb); +} + /* * Simple wrapper to consolidate calls fsnotify_parent()/fsnotify() when * an event is on a path. @@ -47,10 +61,13 @@ static inline int fsnotify_parent(const struct path *path, static inline int fsnotify_path(struct inode *inode, const struct path *path, __u32 mask) { - int ret = fsnotify_parent(path, NULL, mask); + if (!fsnotify_is_fake_path(inode, path)) { + int ret = fsnotify_parent(path, NULL, mask); + + if (ret) + return ret; + } - if (ret) - return ret; return fsnotify(inode, mask, path, FSNOTIFY_EVENT_PATH, NULL, 0); }
Overlayfs "fake" path is used for stacked file operations on underlying files. Operations on files with "fake" path must not generate events on mount marks and on parent watches, because those events have already been generated at overlayfs layer. The reported event->fd for inode/sb marks will have the wrong path (overlayfs path), but we have no choice but to report them anyway. Link: https://lore.kernel.org/linux-fsdevel/20190423065024.12695-1-jencce.kernel@gmail.com/ Reported-by: Murphy Zhou <jencce.kernel@gmail.com> Fixes: d1d04ef8572b ("ovl: stack file ops") Signed-off-by: Amir Goldstein <amir73il@gmail.com> --- Jan, This is a slightly simplified and cleaner version of the patch I posted yesterday on the linked bug report thread. I have posted an additional overlayfs test case to fanotify06 on LTP list. Thanks, Amir. fs/notify/fsnotify.c | 3 ++- include/linux/fsnotify.h | 23 ++++++++++++++++++++--- 2 files changed, 22 insertions(+), 4 deletions(-)