
fsnotify: Fix oops in fsnotify_clear_marks_by_group_flags()

Message ID: 1437659699-29812-1-git-send-email-jack@suse.com (mailing list archive)
State: New, archived

Commit Message

Jan Kara July 23, 2015, 1:54 p.m. UTC
fsnotify_clear_marks_by_group_flags() can race with
fsnotify_destroy_marks(): when fsnotify_destroy_mark_locked() drops
mark_mutex, a mark on the list iterated by
fsnotify_clear_marks_by_group_flags() can be freed, so the next-entry
pointer we have cached may become stale and we end up dereferencing
freed memory.

Fix the problem by first moving the marks to be freed onto a special
private list and then always freeing the first entry on that list. This
is safe even though entries can disappear from the list whenever we drop
the lock.

CC: stable@vger.kernel.org
Reported-by: Ashish Sangwan <a.sangwan@samsung.com>
Signed-off-by: Jan Kara <jack@suse.com>
---
 fs/notify/mark.c | 30 +++++++++++++++++++++++++-----
 1 file changed, 25 insertions(+), 5 deletions(-)

Andrew, this is the new version of the fsnotify oops fix. It has survived
LTP tests and also a reproducer I wrote for triggering the oops. I'll work
on integrating the reproducer in LTP inotify tests.

Comments

Ashish Sangwan July 24, 2015, 5:52 a.m. UTC | #1
On Thu, Jul 23, 2015 at 7:24 PM, Jan Kara <jack@suse.com> wrote:
> [...]
> +
> +       while (1) {
> +               mutex_lock_nested(&group->mark_mutex, SINGLE_DEPTH_NESTING);
Just a nitpick: instead of locking/unlocking the mutex multiple times
in the while loop, can't we just keep the entire while loop inside the
mutex_lock?
Overall, the patch seems OK to me.
Reviewed-by: Ashish Sangwan <a.sangwan@samsung.com>

--
To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Jan Kara July 27, 2015, 9:43 a.m. UTC | #2
On Fri 24-07-15 11:22:49, Ashish Sangwan wrote:
> [...]
> Just a nitpick: instead of locking/unlocking the mutex multiple times
> in the while loop, can't we just keep the entire while loop inside the
> mutex_lock?
> Overall, the patch seems OK to me.
> Reviewed-by: Ashish Sangwan <a.sangwan@samsung.com>

Thanks for the review! We cannot, because fsnotify_destroy_mark_locked()
drops the mutex anyway. I have some cleanup patches prepared which split
fsnotify_destroy_mark_locked() into two functions: one which needs to be
called under mark_mutex and one which has to be called outside of it.
With those patches, the current code is easier to convert...

									Honza

Patch

diff --git a/fs/notify/mark.c b/fs/notify/mark.c
index 92e48c70f0f0..39ddcaf0918f 100644
--- a/fs/notify/mark.c
+++ b/fs/notify/mark.c
@@ -412,16 +412,36 @@  void fsnotify_clear_marks_by_group_flags(struct fsnotify_group *group,
 					 unsigned int flags)
 {
 	struct fsnotify_mark *lmark, *mark;
+	LIST_HEAD(to_free);
 
+	/*
+	 * We have to be really careful here. Anytime we drop mark_mutex, e.g.
+	 * fsnotify_clear_marks_by_inode() can come and free marks. Even in our
+	 * to_free list so we have to use mark_mutex even when accessing that
+	 * list. And freeing mark requires us to drop mark_mutex. So we can
+	 * reliably free only the first mark in the list. That's why we first
+	 * move marks to free to to_free list in one go and then free marks in
+	 * to_free list one by one.
+	 */
 	mutex_lock_nested(&group->mark_mutex, SINGLE_DEPTH_NESTING);
 	list_for_each_entry_safe(mark, lmark, &group->marks_list, g_list) {
-		if (mark->flags & flags) {
-			fsnotify_get_mark(mark);
-			fsnotify_destroy_mark_locked(mark, group);
-			fsnotify_put_mark(mark);
-		}
+		if (mark->flags & flags)
+			list_move(&mark->g_list, &to_free);
 	}
 	mutex_unlock(&group->mark_mutex);
+
+	while (1) {
+		mutex_lock_nested(&group->mark_mutex, SINGLE_DEPTH_NESTING);
+		if (list_empty(&to_free)) {
+			mutex_unlock(&group->mark_mutex);
+			break;
+		}
+		mark = list_first_entry(&to_free, struct fsnotify_mark, g_list);
+		fsnotify_get_mark(mark);
+		fsnotify_destroy_mark_locked(mark, group);
+		mutex_unlock(&group->mark_mutex);
+		fsnotify_put_mark(mark);
+	}
 }
 
 /*