diff mbox series

[v4,15/16] gfs2: add pre-content fsnotify hook to fault

Message ID 2bd333be8352f31163eac7528fdcb8b47a1f97b4.1723670362.git.josef@toxicpanda.com (mailing list archive)
State New
Headers show
Series fanotify: add pre-content hooks | expand

Commit Message

Josef Bacik Aug. 14, 2024, 9:25 p.m. UTC
gfs2 takes the glock before calling into filemap fault, so add the
fsnotify hook for ->fault before we take the glock in order to avoid any
possible deadlock with the HSM.

Signed-off-by: Josef Bacik <josef@toxicpanda.com>
---
 fs/gfs2/file.c | 4 ++++
 1 file changed, 4 insertions(+)

Comments

Jan Kara Aug. 29, 2024, 11:15 a.m. UTC | #1
On Wed 14-08-24 17:25:33, Josef Bacik wrote:
> gfs2 takes the glock before calling into filemap fault, so add the
> fsnotify hook for ->fault before we take the glock in order to avoid any
> possible deadlock with the HSM.
> 
> Signed-off-by: Josef Bacik <josef@toxicpanda.com>

The idea of interactions between GFS2 cluster locking and HSM gives me
creeps. But yes, this patch looks good to me. Would be nice to get ack from
GFS2 guys. Andreas?

								Honza

> ---
>  fs/gfs2/file.c | 4 ++++
>  1 file changed, 4 insertions(+)
> 
> diff --git a/fs/gfs2/file.c b/fs/gfs2/file.c
> index 08982937b5df..d4af70d765e0 100644
> --- a/fs/gfs2/file.c
> +++ b/fs/gfs2/file.c
> @@ -556,6 +556,10 @@ static vm_fault_t gfs2_fault(struct vm_fault *vmf)
>  	vm_fault_t ret;
>  	int err;
>  
> +	ret = filemap_maybe_emit_fsnotify_event(vmf);
> +	if (unlikely(ret))
> +		return ret;
> +
>  	gfs2_holder_init(ip->i_gl, LM_ST_SHARED, 0, &gh);
>  	err = gfs2_glock_nq(&gh);
>  	if (err) {
> -- 
> 2.43.0
>
Amir Goldstein Aug. 29, 2024, 11:26 a.m. UTC | #2
On Thu, Aug 29, 2024 at 1:15 PM Jan Kara <jack@suse.cz> wrote:
>
> On Wed 14-08-24 17:25:33, Josef Bacik wrote:
> > gfs2 takes the glock before calling into filemap fault, so add the
> > fsnotify hook for ->fault before we take the glock in order to avoid any
> > possible deadlock with the HSM.
> >
> > Signed-off-by: Josef Bacik <josef@toxicpanda.com>
>
> The idea of interactions between GFS2 cluster locking and HSM gives me
> creeps. But yes, this patch looks good to me. Would be nice to get ack from
> GFS2 guys. Andreas?

If we are being honest, I think that the fact that HSM events require careful
handling in ->fault() and not to mention no documentation of this fact,
perhaps we should let HSM events be an opt-in file_system_type feature?

Additionally, we had to introduce FS_DISALLOW_NOTIFY_PERM
and restrict sb marks on SB_NOUSER, all because these fanotify
features did not require fs opt-in to begin with.

I think we would be repeating this mistake if we do not add
FS_ALLOW_HSM from the start.

After all, I cannot imagine HSM being used on anything but
the major disk filesystems.

Hmm?

Amir.
Jan Kara Aug. 29, 2024, 11:43 a.m. UTC | #3
On Thu 29-08-24 13:26:17, Amir Goldstein wrote:
> On Thu, Aug 29, 2024 at 1:15 PM Jan Kara <jack@suse.cz> wrote:
> >
> > On Wed 14-08-24 17:25:33, Josef Bacik wrote:
> > > gfs2 takes the glock before calling into filemap fault, so add the
> > > fsnotify hook for ->fault before we take the glock in order to avoid any
> > > possible deadlock with the HSM.
> > >
> > > Signed-off-by: Josef Bacik <josef@toxicpanda.com>
> >
> > The idea of interactions between GFS2 cluster locking and HSM gives me
> > creeps. But yes, this patch looks good to me. Would be nice to get ack from
> > GFS2 guys. Andreas?
> 
> If we are being honest, I think that the fact that HSM events require careful
> handling in ->fault() and not to mention no documentation of this fact,
> perhaps we should let HSM events be an opt-in file_system_type feature?
> 
> Additionally, we had to introduce FS_DISALLOW_NOTIFY_PERM
> and restrict sb marks on SB_NOUSER, all because these fanotify
> features did not require fs opt-in to begin with.
> 
> I think we would be repeating this mistake if we do not add
> FS_ALLOW_HSM from the start.
> 
> After all, I cannot imagine HSM being used on anything but
> the major disk filesystems.
> 
> Hmm?

Yeah, I was considering this already when thinking about btrfs quirks with
readahead and various special filesystem ioctls and I agree that a need to be
careful with page faults is another good reason to make this a
per-filesystem opt in. Will you send a patch?

								Honza
Josef Bacik Aug. 29, 2024, 12:42 p.m. UTC | #4
On Thu, Aug 29, 2024 at 01:15:10PM +0200, Jan Kara wrote:
> On Wed 14-08-24 17:25:33, Josef Bacik wrote:
> > gfs2 takes the glock before calling into filemap fault, so add the
> > fsnotify hook for ->fault before we take the glock in order to avoid any
> > possible deadlock with the HSM.
> > 
> > Signed-off-by: Josef Bacik <josef@toxicpanda.com>
> 
> The idea of interactions between GFS2 cluster locking and HSM gives me
> creeps. But yes, this patch looks good to me. Would be nice to get ack from
> GFS2 guys. Andreas?

I did a lot of gfs2 work originally so I'm familiar with how it works, otherwise
I definitely would have just left it off.

That being said I'd also be fine with just gating it at an FS level.  Thanks,

Josef
Amir Goldstein Aug. 29, 2024, 12:49 p.m. UTC | #5
On Thu, Aug 29, 2024 at 1:43 PM Jan Kara <jack@suse.cz> wrote:
>
> On Thu 29-08-24 13:26:17, Amir Goldstein wrote:
> > On Thu, Aug 29, 2024 at 1:15 PM Jan Kara <jack@suse.cz> wrote:
> > >
> > > On Wed 14-08-24 17:25:33, Josef Bacik wrote:
> > > > gfs2 takes the glock before calling into filemap fault, so add the
> > > > fsnotify hook for ->fault before we take the glock in order to avoid any
> > > > possible deadlock with the HSM.
> > > >
> > > > Signed-off-by: Josef Bacik <josef@toxicpanda.com>
> > >
> > > The idea of interactions between GFS2 cluster locking and HSM gives me
> > > creeps. But yes, this patch looks good to me. Would be nice to get ack from
> > > GFS2 guys. Andreas?
> >
> > If we are being honest, I think that the fact that HSM events require careful
> > handling in ->fault() and not to mention no documentation of this fact,
> > perhaps we should let HSM events be an opt-in file_system_type feature?
> >
> > Additionally, we had to introduce FS_DISALLOW_NOTIFY_PERM
> > and restrict sb marks on SB_NOUSER, all because these fanotify
> > features did not require fs opt-in to begin with.
> >
> > I think we would be repeating this mistake if we do not add
> > FS_ALLOW_HSM from the start.
> >
> > After all, I cannot imagine HSM being used on anything but
> > the major disk filesystems.
> >
> > Hmm?
>
> Yeah, I was considering this already when thinking about btrfs quirks with
> readahead and various special filesystem ioctls and I agree that a need to be
> careful with page faults is another good reason to make this a
> per-filesystem opt in. Will you send a patch?

Huh! I am still struggling to keep my head above the water
coming back from a long vacation, so I think it is better if Josef takes
that as well.

But here is a trivial, untested, probably broken, space damaged patch,
that should be broken and squashed into other patches in the series
if this is any help at all...

Thanks,
Amir.

index c3c8b2ea80b6..07c3cf038221 100644
--- a/fs/notify/fanotify/fanotify_user.c
+++ b/fs/notify/fanotify/fanotify_user.c
@@ -1672,6 +1672,11 @@ static int fanotify_events_supported(struct
fsnotify_group *group,
                                 (mask & FAN_RENAME) ||
                                 (flags & FAN_MARK_IGNORE);

+       /* Filesystems need to opt-into pre-content evnets (a.k.a HSM) */
+       if (mask & FANOTIFY_PRE_CONTENT_EVENTS &&
+           path->mnt->mnt_sb->s_type->fs_flags & FS_ALLOW_HSM)
+               return -EINVAL;
+
        /*
         * Some filesystems such as 'proc' acquire unusual locks when opening
         * files. For them fanotify permission events have high chances of
diff --git a/fs/notify/fsnotify.c b/fs/notify/fsnotify.c
index cbfaa000f815..32690e5c4815 100644
--- a/fs/notify/fsnotify.c
+++ b/fs/notify/fsnotify.c
@@ -207,7 +207,8 @@ bool fsnotify_file_has_pre_content_watches(struct
file *file)
        struct inode *inode = file_inode(file);
        __u32 mnt_mask = real_mount(file->f_path.mnt)->mnt_fsnotify_mask;

-       return fsnotify_object_watched(inode, mnt_mask,
+       return inode->i_sb->s_type->fs_flags & FS_ALLOW_HSM &&
+              fsnotify_object_watched(inode, mnt_mask,
                                       FSNOTIFY_PRE_CONTENT_EVENTS);
 }
 #endif
diff --git a/include/linux/fs.h b/include/linux/fs.h
index fb0426f349fc..14468736d15b 100644
--- a/include/linux/fs.h
+++ b/include/linux/fs.h
@@ -2499,6 +2499,7 @@ struct file_system_type {
 #define FS_USERNS_MOUNT                8       /* Can be mounted by
userns root */
 #define FS_DISALLOW_NOTIFY_PERM        16      /* Disable fanotify
permission events */
 #define FS_ALLOW_IDMAP         32      /* FS has been updated to
handle vfs idmappings. */
+#define FS_ALLOW_HSM           64      /* FS can handle fanotify
pre-content events. */
 #define FS_RENAME_DOES_D_MOVE  32768   /* FS will handle d_move()
during rename() internally. */
        int (*init_fs_context)(struct fs_context *);
        const struct fs_parameter_spec *parameters;
diff --git a/include/linux/fsnotify.h b/include/linux/fsnotify.h
index ec1e88342442..b6845ab477d6 100644
--- a/include/linux/fsnotify.h
+++ b/include/linux/fsnotify.h
@@ -178,6 +178,7 @@ static inline int fsnotify_file_area_perm(struct
file *file, int perm_mask,
         * if there are any pre-content event watchers on this sb.
         */
        if ((!S_ISDIR(inode->i_mode) && !S_ISREG(inode->i_mode)) ||
+           !(inode->i_sb->s_type->fs_flags & FS_ALLOW_HSM) ||
            !fsnotify_sb_has_priority_watchers(inode->i_sb,
                                               FSNOTIFY_PRIO_PRE_CONTENT))
                return 0;

---
diff mbox series

Patch

diff --git a/fs/gfs2/file.c b/fs/gfs2/file.c
index 08982937b5df..d4af70d765e0 100644
--- a/fs/gfs2/file.c
+++ b/fs/gfs2/file.c
@@ -556,6 +556,10 @@  static vm_fault_t gfs2_fault(struct vm_fault *vmf)
 	vm_fault_t ret;
 	int err;
 
+	ret = filemap_maybe_emit_fsnotify_event(vmf);
+	if (unlikely(ret))
+		return ret;
+
 	gfs2_holder_init(ip->i_gl, LM_ST_SHARED, 0, &gh);
 	err = gfs2_glock_nq(&gh);
 	if (err) {