fsnotify: fix a crash due to invalid virtual address

Message ID 1434970396-19644-1-git-send-email-a.sangwan@samsung.com
State New

Commit Message

Ashish Sangwan June 22, 2015, 10:53 a.m. UTC
There are two paths in the kernel for deleting the fsnotify_mark associated
with an inode. When the inotify fd is closed, all the marks belonging to a
group are removed one by one in fsnotify_clear_marks_by_group_flags. The other
path is taken when the inode is removed from user space by unlink:
fsnotify_destroy_mark is called to delete a single mark.
There is a race between these two paths, caused by the temporary release of
mark_mutex inside fsnotify_destroy_mark_locked.
The race happens when the inotify app monitoring the file(s) exits, triggering
fsnotify_clear_marks_by_group_flags to delete the marks.
This function uses the lmark pointer to move to the next node after a safe
removal of the current node. If, in parallel, an rm call for a file removes
the mark that lmark is pointing to, lmark ends up pointing to freed memory.
When we then try to move to the next node using lmark, it triggers an invalid
virtual address crash.
Although fsnotify_clear_marks_by_group_flags and fsnotify_destroy_mark are
synchronized by mark_mutex, both of these functions call
fsnotify_destroy_mark_locked, which releases mark_mutex and acquires it again,
creating a subtle race window. There seems to be no reason for releasing
mark_mutex, so this patch removes the mutex_unlock call and the matching
mutex_lock_nested call.

Signed-off-by: Ashish Sangwan <a.sangwan@samsung.com>
Reviewed-by: Amit Sahrawat <a.sahrawat@samsung.com>
---
 fs/notify/mark.c |    4 ----
 1 files changed, 0 insertions(+), 4 deletions(-)
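
The race described in the commit message can be modelled in user space. The
following is only a sketch under stated assumptions: struct mark,
destroy_one() and clear_marks_with_window() are hypothetical names, not kernel
code, and a "freed" flag is set instead of actually freeing memory so that the
stale step can be counted rather than crashing. The second destroy_one() call
stands in for the unlink path running while mark_mutex is dropped.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for struct fsnotify_mark; not the kernel type. */
struct mark {
	struct mark *next;
	bool freed;	/* set instead of calling free(), so stale use is observable */
};

/* Unlink and "free" one mark if it is still on the list. */
static void destroy_one(struct mark **head, struct mark *victim)
{
	struct mark **pp;

	for (pp = head; *pp; pp = &(*pp)->next) {
		if (*pp == victim) {
			*pp = victim->next;
			victim->freed = true;
			return;
		}
	}
}

/*
 * Walk the list the way a "safe" iterator does: cache the next pointer
 * (lmark), then destroy the current node. If racer_hits_next is true, a
 * second actor destroys the cached next node once, modelling the window
 * in which the lock is dropped.
 * Returns how many times the walker stepped onto an already-freed node.
 */
int clear_marks_with_window(struct mark *marks, size_t n, bool racer_hits_next)
{
	struct mark *head = NULL;
	int stale_steps = 0;
	bool raced = false;
	size_t i;

	assert(marks != NULL || n == 0);

	/* Build the list marks[0] -> marks[1] -> ... -> marks[n-1]. */
	for (i = n; i-- > 0; ) {
		marks[i].next = head;
		marks[i].freed = false;
		head = &marks[i];
	}

	struct mark *cur = head, *lmark;

	while (cur) {
		lmark = cur->next;		/* cached before the lock is dropped */
		destroy_one(&head, cur);
		if (racer_hits_next && !raced && lmark) {
			destroy_one(&head, lmark);	/* concurrent unlink path */
			raced = true;
		}
		if (lmark && lmark->freed)
			stale_steps++;		/* in the kernel: use-after-free */
		cur = lmark;
	}
	return stale_steps;
}
```

With three marks and no racer, the walk never touches a freed node. With the
racer enabled, the walker steps onto one freed node, which is the situation
that produces the invalid virtual address in the report.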

Comments

Namjae Jeon June 23, 2015, 7:33 a.m. UTC | #1
+ Lino Sanfilippo

> 
> For deleting  the fsnotify_mark related with an inode, there are 2 paths in the
> kernel. When the inotify fd is closed, all the marks belonging to a group are
> removed one by one in fsnotify_clear_marks_by_group_flags. Other path is when
> the inode is removed from user space by unlink, fsnotify_destroy_mark is
> called to delete a single mark.
> There is a race between these 2 paths which is caused due to the temporary
> release of the mark_mutex inside fsnotify_destroy_mark_locked.
> The race happen when the inotify app monitoring the file(s) exits, triggering
> fsnotify_clear_marks_by_group_flags to delete the marks.
> This function use lmark pointer to move to the next node after a safe removal
> of the node. In parallel, if there is rm call for a file and such that the
> lmark is pointing to the mark which is removed by this rm call, lmark ends up
> pointing to a freed memory. Now, when we try to move to the next node using
> lmark, it triggers an invalid virtual address crash.
> Although fsnotify_clear_marks_by_group_flags and fsnotify_destroy_mark are
> synchronized by mark_mutex, but both of these functions call
> fsnotify_destroy_mark_locked which release the mark_mutex and acquire it again
> creating a subtle race window. There seems to be no reason for releasing
> mark_mutex, so this patch remove the mutex_unlock call.
> 
> Signed-off-by: Ashish Sangwan <a.sangwan@samsung.com>
> Reviewed-by: Amit Sahrawat <a.sahrawat@samsung.com>
> ---
>  fs/notify/mark.c |    4 ----
>  1 files changed, 0 insertions(+), 4 deletions(-)
> 
> diff --git a/fs/notify/mark.c b/fs/notify/mark.c
> index 92e48c7..4ee419f 100755
> --- a/fs/notify/mark.c
> +++ b/fs/notify/mark.c
> @@ -157,8 +157,6 @@ void fsnotify_destroy_mark_locked(struct fsnotify_mark *mark,
> 
>  	if (inode && (mark->flags & FSNOTIFY_MARK_FLAG_OBJECT_PINNED))
>  		iput(inode);
> -	/* release lock temporarily */
> -	mutex_unlock(&group->mark_mutex);
> 
>  	spin_lock(&destroy_lock);
>  	list_add(&mark->g_list, &destroy_list);
> @@ -191,8 +189,6 @@ void fsnotify_destroy_mark_locked(struct fsnotify_mark *mark,
>  	 */
> 
>  	atomic_dec(&group->num_marks);
> -
> -	mutex_lock_nested(&group->mark_mutex, SINGLE_DEPTH_NESTING);
>  }
> 
>  void fsnotify_destroy_mark(struct fsnotify_mark *mark,
> --
> 1.7.7

Jan Kara June 23, 2015, 10:25 a.m. UTC | #2
On Mon 22-06-15 16:23:16, Ashish Sangwan wrote:
> For deleting  the fsnotify_mark related with an inode, there are 2 paths in the
> kernel. When the inotify fd is closed, all the marks belonging to a group are
> removed one by one in fsnotify_clear_marks_by_group_flags. Other path is when
> the inode is removed from user space by unlink, fsnotify_destroy_mark is
> called to delete a single mark.
> There is a race between these 2 paths which is caused due to the temporary
> release of the mark_mutex inside fsnotify_destroy_mark_locked.
> The race happen when the inotify app monitoring the file(s) exits, triggering 
> fsnotify_clear_marks_by_group_flags to delete the marks.
> This function use lmark pointer to move to the next node after a safe removal
> of the node. In parallel, if there is rm call for a file and such that the
> lmark is pointing to the mark which is removed by this rm call, lmark ends up
> pointing to a freed memory. Now, when we try to move to the next node using
> lmark, it triggers an invalid virtual address crash.
> Although fsnotify_clear_marks_by_group_flags and fsnotify_destroy_mark are
> synchronized by mark_mutex, but both of these functions call
> fsnotify_destroy_mark_locked which release the mark_mutex and acquire it again
> creating a subtle race window. There seems to be no reason for releasing
> mark_mutex, so this patch remove the mutex_unlock call.

Thanks for the report and the analysis. I agree with your problem analysis.
Indeed the loop in fsnotify_clear_marks_by_group_flags() isn't safe against
us dropping the mark_mutex inside fsnotify_destroy_mark_locked(). However
mark_mutex is dropped in fsnotify_destroy_mark_locked() for a purpose. We
call the ->freeing_mark() callback from there and that should be called
without mark_mutex. In particular inotify uses this callback to send the
IN_IGNORED event and that code certainly isn't prepared to be called under
mark_mutex, so you would likely introduce interesting deadlock possibilities
there.

Looking into this in more detail, it might be worthwhile to revisit how
mark_mutex is used since at least fanotify and dnotify use it for more than
just a protection of list of group marks and untangling this would simplify
things. But that's a longer term goal.

A relatively simple fix for your issue is to split the group list of marks
into a list of inode marks and a list of mount marks. Then destroying becomes
much simpler because we always discard the whole list (or both of them) and
we can easily avoid problems with list corruption when dropping the
mark_mutex. I can write the patch later or you can do that if you are
interested.

								Honza
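
One possible reading of the "always discard the whole list" idea above, as a
user-space sketch only. All names here (struct group, group_add_mark(),
clear_all_marks(), the mutex_held flag standing in for mark_mutex) are
assumptions made for illustration and are not taken from the kernel or from
any posted patch. The point it shows: if the walker always re-reads the list
head under the lock, it holds no cached neighbour pointer that can go stale
while the lock is dropped for the callback.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins; not the kernel structures. */
struct mark {
	struct mark *next;
	bool destroyed;
};

struct group {
	struct mark *marks;	/* a list that is always discarded as a whole */
	bool mutex_held;	/* models mark_mutex with a flag, not a real lock */
	int callbacks_run;
};

static void group_lock(struct group *g)
{
	assert(!g->mutex_held);
	g->mutex_held = true;
}

static void group_unlock(struct group *g)
{
	assert(g->mutex_held);
	g->mutex_held = false;
}

/* Stand-in for the ->freeing_mark() callback, which must run unlocked. */
static void freeing_mark(struct group *g, struct mark *m)
{
	assert(!g->mutex_held);
	m->destroyed = true;
	g->callbacks_run++;
}

void group_add_mark(struct group *g, struct mark *m)
{
	group_lock(g);
	m->destroyed = false;
	m->next = g->marks;
	g->marks = m;
	group_unlock(g);
}

/* Destroy every mark on the list; returns the number destroyed. */
int clear_all_marks(struct group *g)
{
	int n = 0;

	for (;;) {
		group_lock(g);
		struct mark *m = g->marks;	/* re-read the head under the lock */
		if (!m) {
			group_unlock(g);
			break;
		}
		g->marks = m->next;		/* unlink while still locked */
		m->next = NULL;
		group_unlock(g);

		freeing_mark(g, m);		/* lock dropped; nothing cached to go stale */
		n++;
	}
	return n;
}
```

If another path removes a mark while the lock is dropped, the next iteration
simply sees a shorter list. The cost is one lock round trip per mark.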

> 
> Signed-off-by: Ashish Sangwan <a.sangwan@samsung.com>
> Reviewed-by: Amit Sahrawat <a.sahrawat@samsung.com>
> ---
>  fs/notify/mark.c |    4 ----
>  1 files changed, 0 insertions(+), 4 deletions(-)
> 
> diff --git a/fs/notify/mark.c b/fs/notify/mark.c
> index 92e48c7..4ee419f 100755
> --- a/fs/notify/mark.c
> +++ b/fs/notify/mark.c
> @@ -157,8 +157,6 @@ void fsnotify_destroy_mark_locked(struct fsnotify_mark *mark,
>  
>  	if (inode && (mark->flags & FSNOTIFY_MARK_FLAG_OBJECT_PINNED))
>  		iput(inode);
> -	/* release lock temporarily */
> -	mutex_unlock(&group->mark_mutex);
>  
>  	spin_lock(&destroy_lock);
>  	list_add(&mark->g_list, &destroy_list);
> @@ -191,8 +189,6 @@ void fsnotify_destroy_mark_locked(struct fsnotify_mark *mark,
>  	 */
>  
>  	atomic_dec(&group->num_marks);
> -
> -	mutex_lock_nested(&group->mark_mutex, SINGLE_DEPTH_NESTING);
>  }
>  
>  void fsnotify_destroy_mark(struct fsnotify_mark *mark,
> -- 
> 1.7.7
>
Lino Sanfilippo June 23, 2015, 10:30 p.m. UTC | #3
Hi,

On 23.06.2015 12:25, Jan Kara wrote:
> On Mon 22-06-15 16:23:16, Ashish Sangwan wrote:
>> For deleting  the fsnotify_mark related with an inode, there are 2 paths in the
>> kernel. When the inotify fd is closed, all the marks belonging to a group are
>> removed one by one in fsnotify_clear_marks_by_group_flags. Other path is when
>> the inode is removed from user space by unlink, fsnotify_destroy_mark is
>> called to delete a single mark.
>> There is a race between these 2 paths which is caused due to the temporary
>> release of the mark_mutex inside fsnotify_destroy_mark_locked.
>> The race happen when the inotify app monitoring the file(s) exits, triggering 
>> fsnotify_clear_marks_by_group_flags to delete the marks.
>> This function use lmark pointer to move to the next node after a safe removal
>> of the node. In parallel, if there is rm call for a file and such that the
>> lmark is pointing to the mark which is removed by this rm call, lmark ends up
>> pointing to a freed memory. Now, when we try to move to the next node using
>> lmark, it triggers an invalid virtual address crash.
>> Although fsnotify_clear_marks_by_group_flags and fsnotify_destroy_mark are
>> synchronized by mark_mutex, but both of these functions call
>> fsnotify_destroy_mark_locked which release the mark_mutex and acquire it again
>> creating a subtle race window. There seems to be no reason for releasing
>> mark_mutex, so this patch remove the mutex_unlock call.
> 
> Thanks for report and the analysis. I agree with your problem analysis.
> Indeed the loop in fsnotify_clear_marks_by_group_flags() isn't safe against
> us dropping the mark_mutex inside fsnotify_destroy_mark_locked(). However
> mark_mutex is dropped in fsnotify_destroy_mark_locked() for a purpose. We
> call ->freeing_mark() callback from there and that should be called without
> mark_mutex. In particular inotify uses this callback to send the IN_IGNORE
> event and that code certainly isn't prepared to be called under mark_mutex
> and you likely introduce interesting deadlock possibilities there.
> 


Why don't we call freeing_mark() from the "fsnotify_mark" thread instead
of from fsnotify_destroy_mark_locked()? Then there would no longer be a
reason for this temporary unlock and we could close that race as Ashish
suggested.

Lino

--
To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Jan Kara June 24, 2015, 8:42 a.m. UTC | #4
Hi,

On Wed 24-06-15 00:30:16, Lino Sanfilippo wrote:
> On 23.06.2015 12:25, Jan Kara wrote:
> > On Mon 22-06-15 16:23:16, Ashish Sangwan wrote:
> >> For deleting  the fsnotify_mark related with an inode, there are 2 paths in the
> >> kernel. When the inotify fd is closed, all the marks belonging to a group are
> >> removed one by one in fsnotify_clear_marks_by_group_flags. Other path is when
> >> the inode is removed from user space by unlink, fsnotify_destroy_mark is
> >> called to delete a single mark.
> >> There is a race between these 2 paths which is caused due to the temporary
> >> release of the mark_mutex inside fsnotify_destroy_mark_locked.
> >> The race happen when the inotify app monitoring the file(s) exits, triggering 
> >> fsnotify_clear_marks_by_group_flags to delete the marks.
> >> This function use lmark pointer to move to the next node after a safe removal
> >> of the node. In parallel, if there is rm call for a file and such that the
> >> lmark is pointing to the mark which is removed by this rm call, lmark ends up
> >> pointing to a freed memory. Now, when we try to move to the next node using
> >> lmark, it triggers an invalid virtual address crash.
> >> Although fsnotify_clear_marks_by_group_flags and fsnotify_destroy_mark are
> >> synchronized by mark_mutex, but both of these functions call
> >> fsnotify_destroy_mark_locked which release the mark_mutex and acquire it again
> >> creating a subtle race window. There seems to be no reason for releasing
> >> mark_mutex, so this patch remove the mutex_unlock call.
> > 
> > Thanks for report and the analysis. I agree with your problem analysis.
> > Indeed the loop in fsnotify_clear_marks_by_group_flags() isn't safe against
> > us dropping the mark_mutex inside fsnotify_destroy_mark_locked(). However
> > mark_mutex is dropped in fsnotify_destroy_mark_locked() for a purpose. We
> > call ->freeing_mark() callback from there and that should be called without
> > mark_mutex. In particular inotify uses this callback to send the IN_IGNORE
> > event and that code certainly isn't prepared to be called under mark_mutex
> > and you likely introduce interesting deadlock possibilities there.
> 
> Why dont we call freeing_mark() from the "fsnotify_mark"-thread instead
> of fsnotify_destroy_mark_locked()? So there would not be a reason for
> this temporary unlock any longer and we could close that race as Ashish
> suggested.

We could do that as well. But I'd prefer to keep sending the IN_IGNORED
event from the context of the process destroying the mark (not that I am
aware of any strong reason why that must happen, but it just seems more
natural). Also the event from the destruction thread will be sent with a
delay caused by synchronize_srcu(). Finally one long critical section for
destruction of all marks belonging to a group doesn't seem ideal either.

Anyway, I'll have this possibility in mind when implementing some solution.
Maybe it will be the most elegant way...

								Honza
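
For comparison, the deferred-callback alternative discussed in this exchange
could look roughly like the sketch below. It is a user-space model under
assumptions: clear_marks_deferred() and destroy_thread_pass() are hypothetical
names, the mutex_held flag only models mark_mutex, and the second function
merely stands in for the "fsnotify_mark" thread. It shows both trade-offs
mentioned above: the lock is held across the whole walk, and the callback
runs later, in a different context.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins; not the kernel structures. */
struct mark {
	struct mark *g_next;		/* link on the group's list */
	struct mark *destroy_next;	/* link on the deferred-destroy list */
	bool callback_done;
};

struct group {
	struct mark *marks;
	bool mutex_held;	/* models mark_mutex with a flag, not a real lock */
};

static struct mark *destroy_list;	/* models a global list of marks to destroy */

/* Move every mark to the destroy list without ever dropping the lock. */
int clear_marks_deferred(struct group *g)
{
	int queued = 0;

	assert(!g->mutex_held);
	g->mutex_held = true;		/* one long critical section */
	while (g->marks) {
		struct mark *m = g->marks;

		g->marks = m->g_next;
		m->g_next = NULL;
		m->destroy_next = destroy_list;
		destroy_list = m;
		queued++;
	}
	g->mutex_held = false;
	return queued;
}

/* Stand-in for the destroy thread: runs the callbacks later, unlocked. */
int destroy_thread_pass(struct group *g)
{
	int done = 0;

	while (destroy_list) {
		struct mark *m = destroy_list;

		destroy_list = m->destroy_next;
		m->destroy_next = NULL;
		assert(!g->mutex_held);	/* callback context holds no group lock */
		m->callback_done = true;	/* where the IN_IGNORED event would be sent */
		done++;
	}
	return done;
}
```

Between the two calls the marks are already off the group list but their
callbacks have not yet run, which corresponds to the delay noted in the reply.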

Patch

diff --git a/fs/notify/mark.c b/fs/notify/mark.c
index 92e48c7..4ee419f 100755
--- a/fs/notify/mark.c
+++ b/fs/notify/mark.c
@@ -157,8 +157,6 @@  void fsnotify_destroy_mark_locked(struct fsnotify_mark *mark,
 
 	if (inode && (mark->flags & FSNOTIFY_MARK_FLAG_OBJECT_PINNED))
 		iput(inode);
-	/* release lock temporarily */
-	mutex_unlock(&group->mark_mutex);
 
 	spin_lock(&destroy_lock);
 	list_add(&mark->g_list, &destroy_list);
@@ -191,8 +189,6 @@  void fsnotify_destroy_mark_locked(struct fsnotify_mark *mark,
 	 */
 
 	atomic_dec(&group->num_marks);
-
-	mutex_lock_nested(&group->mark_mutex, SINGLE_DEPTH_NESTING);
 }
 
 void fsnotify_destroy_mark(struct fsnotify_mark *mark,