diff mbox series

[2/2] fsnotify: avoid pre-content events when faulting in user pages

Message ID 20250311114153.1763176-3-amir73il@gmail.com (mailing list archive)
State New
Headers show
Series Fix for potential deadlock in pre-content event | expand

Commit Message

Amir Goldstein March 11, 2025, 11:41 a.m. UTC
In the use case of buffered write whose input buffer is mmapped file on a
filesystem with a pre-content mark, the prefaulting of the buffer can
happen under the filesystem freeze protection (obtained in vfs_write())
which breaks assumptions of pre-content hook and introduces potential
deadlock of HSM handler in userspace with filesystem freezing.

Now that we have pre-content hooks at file mmap() time, disable the
pre-content event hooks on page fault to avoid the potential deadlock.

Leave the code of pre-content hooks in page fault because we may want
to re-enable them on executables or user mapped files under certain
conditions after resolving the potential deadlocks.

Reported-by: syzbot+7229071b47908b19d5b7@syzkaller.appspotmail.com
Closes: https://lore.kernel.org/linux-fsdevel/7ehxrhbvehlrjwvrduoxsao5k3x4aw275patsb3krkwuq573yv@o2hskrfawbnc/
Fixes: 8392bc2ff8c8b ("fsnotify: generate pre-content permission event on page fault")
Suggested-by: Josef Bacik <josef@toxicpanda.com>
Tested-by: syzbot+7229071b47908b19d5b7@syzkaller.appspotmail.com
Signed-off-by: Amir Goldstein <amir73il@gmail.com>
---
 include/linux/fsnotify.h | 11 ++++++++---
 1 file changed, 8 insertions(+), 3 deletions(-)

Comments

Lorenzo Stoakes March 11, 2025, 12:37 p.m. UTC | #1
On Tue, Mar 11, 2025 at 12:41:53PM +0100, Amir Goldstein wrote:
> In the use case of buffered write whose input buffer is mmapped file on a
> filesystem with a pre-content mark, the prefaulting of the buffer can
> happen under the filesystem freeze protection (obtained in vfs_write())
> which breaks assumptions of pre-content hook and introduces potential
> deadlock of HSM handler in userspace with filesystem freezing.
>
> Now that we have pre-content hooks at file mmap() time, disable the
> pre-content event hooks on page fault to avoid the potential deadlock.
>
> Leave the code of pre-content hooks in page fault because we may want
> to re-enable them on executables or user mapped files under certain
> conditions after resolving the potential deadlocks.
>

Will leave the fs bits to fs people but not hugely comfortable with the
concept of 'leaving code in place just in case'.

Often things end up not being the case :)

> Reported-by: syzbot+7229071b47908b19d5b7@syzkaller.appspotmail.com
> Closes: https://lore.kernel.org/linux-fsdevel/7ehxrhbvehlrjwvrduoxsao5k3x4aw275patsb3krkwuq573yv@o2hskrfawbnc/
> Fixes: 8392bc2ff8c8b ("fsnotify: generate pre-content permission event on page fault")
> Suggested-by: Josef Bacik <josef@toxicpanda.com>
> Tested-by: syzbot+7229071b47908b19d5b7@syzkaller.appspotmail.com
> Signed-off-by: Amir Goldstein <amir73il@gmail.com>
> ---
>  include/linux/fsnotify.h | 11 ++++++++---
>  1 file changed, 8 insertions(+), 3 deletions(-)
>
> diff --git a/include/linux/fsnotify.h b/include/linux/fsnotify.h
> index 6a33288bd6a1f..796dacceec488 100644
> --- a/include/linux/fsnotify.h
> +++ b/include/linux/fsnotify.h
> @@ -137,6 +137,14 @@ void file_set_fsnotify_mode_from_watchers(struct file *file);
>  static inline int fsnotify_file_area_perm(struct file *file, int perm_mask,
>  					  const loff_t *ppos, size_t count)
>  {
> +	/*
> +	 * Temporarily disable pre-content hooks from page faults (MAY_ACCESS).
> +	 * We may bring them back later either only to executables or to user
> +	 * mapped files under some conditions.
> +	 */
> +	if (!(perm_mask & (MAY_READ | MAY_WRITE)))
> +		return 0;
> +
>  	/*
>  	 * filesystem may be modified in the context of permission events
>  	 * (e.g. by HSM filling a file on access), so sb freeze protection
> @@ -144,9 +152,6 @@ static inline int fsnotify_file_area_perm(struct file *file, int perm_mask,
>  	 */
>  	lockdep_assert_once(file_write_not_started(file));
>
> -	if (!(perm_mask & (MAY_READ | MAY_WRITE | MAY_ACCESS)))
> -		return 0;
> -
>  	if (likely(!FMODE_FSNOTIFY_PERM(file->f_mode)))
>  		return 0;
>
> --
> 2.34.1
>
Amir Goldstein March 11, 2025, 2:10 p.m. UTC | #2
On Tue, Mar 11, 2025 at 1:37 PM Lorenzo Stoakes
<lorenzo.stoakes@oracle.com> wrote:
>
> On Tue, Mar 11, 2025 at 12:41:53PM +0100, Amir Goldstein wrote:
> > In the use case of buffered write whose input buffer is mmapped file on a
> > filesystem with a pre-content mark, the prefaulting of the buffer can
> > happen under the filesystem freeze protection (obtained in vfs_write())
> > which breaks assumptions of pre-content hook and introduces potential
> > deadlock of HSM handler in userspace with filesystem freezing.
> >
> > Now that we have pre-content hooks at file mmap() time, disable the
> > pre-content event hooks on page fault to avoid the potential deadlock.
> >
> > Leave the code of pre-content hooks in page fault because we may want
> > to re-enable them on executables or user mapped files under certain
> > conditions after resolving the potential deadlocks.
> >
>
> Will leave the fs bits to fs people but not hugely comfortable with the
> concept of 'leaving code in place just in case'.
>
> Often things end up not being the case :)

Fair point.

It's not exactly a "just in case" situation - the existing user of
this code (Meta)
do use the page fault hooks, so I expect they will work on re-enabling
the hooks upstream after this release, but I am fine with reverting the page
fault hooks instead of the temporary disable if that is what everyone wants.

Thanks,
Amir.
diff mbox series

Patch

diff --git a/include/linux/fsnotify.h b/include/linux/fsnotify.h
index 6a33288bd6a1f..796dacceec488 100644
--- a/include/linux/fsnotify.h
+++ b/include/linux/fsnotify.h
@@ -137,6 +137,14 @@  void file_set_fsnotify_mode_from_watchers(struct file *file);
 static inline int fsnotify_file_area_perm(struct file *file, int perm_mask,
 					  const loff_t *ppos, size_t count)
 {
+	/*
+	 * Temporarily disable pre-content hooks from page faults (MAY_ACCESS).
+	 * We may bring them back later either only to executables or to user
+	 * mapped files under some conditions.
+	 */
+	if (!(perm_mask & (MAY_READ | MAY_WRITE)))
+		return 0;
+
 	/*
 	 * filesystem may be modified in the context of permission events
 	 * (e.g. by HSM filling a file on access), so sb freeze protection
@@ -144,9 +152,6 @@  static inline int fsnotify_file_area_perm(struct file *file, int perm_mask,
 	 */
 	lockdep_assert_once(file_write_not_started(file));
 
-	if (!(perm_mask & (MAY_READ | MAY_WRITE | MAY_ACCESS)))
-		return 0;
-
 	if (likely(!FMODE_FSNOTIFY_PERM(file->f_mode)))
 		return 0;