diff mbox

fsnotify: Fix fsnotify_mark_connector race

Message ID 20180419000525.21673-1-rkolchmeyer@google.com (mailing list archive)
State New, archived
Headers show

Commit Message

Robert Kolchmeyer April 19, 2018, 12:05 a.m. UTC
fsnotify() acquires a reference to a fsnotify_mark_connector through
the SRCU-protected pointer to_tell->i_fsnotify_marks. However, it
appears that no precautions are taken in fsnotify_put_mark() to
ensure that fsnotify() drops its reference to this
fsnotify_mark_connector before assigning a value to its 'destroy_next'
field. This can result in fsnotify_put_mark() assigning a value
to a connector's 'destroy_next' field right before fsnotify() tries to
traverse the linked list referenced by the connector's 'list' field.
Since these two fields are members of the same union, this behavior
results in a kernel panic.

This issue is resolved by calling synchronize_srcu() before updating
the connector's 'destroy_next' field in fsnotify_put_mark(). Since the
relevant section of fsnotify() occurs in a SRCU read-side critical
section, this will force fsnotify_put_mark() to wait for fsnotify()
to finish operating on the connector before updating its 'destroy_next'
field. Since fsnotify_put_mark() removes references to the connector
before assigning its 'destroy_next' field, it shouldn't be possible for
another thread to acquire a reference to the connector while
fsnotify_put_mark() waits for fsnotify() to finish its work.

The offending behavior here is extremely unlikely; since
fsnotify_put_mark() removes references to a connector (specifically,
it ensures that the connector is unreachable from the inode it was
formerly attached to) before updating its 'destroy_next' field, a
sizeable chunk of code in fsnotify_put_mark() has to execute in the
short window between when fsnotify() acquires the connector reference
and saves the value of its 'list' field. On the HEAD kernel, I've only
been able to reproduce this by inserting a udelay(1) in fsnotify().
However, I've been able to reproduce this issue without inserting a
udelay(1) anywhere on older unmodified release kernels, so I believe
it's worth fixing at HEAD.

Fixes bug 199437: https://bugzilla.kernel.org/show_bug.cgi?id=199437

Fixes: 08991e83b7286635167bab40927665a90fb00d81
Signed-off-by: Robert Kolchmeyer <rkolchmeyer@google.com>
---
 fs/notify/mark.c | 1 +
 1 file changed, 1 insertion(+)

Comments

Jan Kara April 19, 2018, 9:23 a.m. UTC | #1
On Wed 18-04-18 17:05:25, Robert Kolchmeyer wrote:
> fsnotify() acquires a reference to a fsnotify_mark_connector through
> the SRCU-protected pointer to_tell->i_fsnotify_marks. However, it
> appears that no precautions are taken in fsnotify_put_mark() to
> ensure that fsnotify() drops its reference to this
> fsnotify_mark_connector before assigning a value to its 'destroy_next'
> field. This can result in fsnotify_put_mark() assigning a value
> to a connector's 'destroy_next' field right before fsnotify() tries to
> traverse the linked list referenced by the connector's 'list' field.
> Since these two fields are members of the same union, this behavior
> results in a kernel panic.

Ahh, I was reading this code just two days ago trying to find some issues
(due to reports of softlockups in fsnotify()) but this didn't occur to me.
Great spotting!

> This issue is resolved by calling synchronize_srcu() before updating
> the connector's 'destroy_next' field in fsnotify_put_mark(). Since the
> relevant section of fsnotify() occurs in a SRCU read-side critical
> section, this will force fsnotify_put_mark() to wait for fsnotify()
> to finish operating on the connector before updating its 'destroy_next'
> field. Since fsnotify_put_mark() removes references to the connector
> before assigning its 'destroy_next' field, it shouldn't be possible for
> another thread to acquire a reference to the connector while
> fsnotify_put_mark() waits for fsnotify() to finish its work.
> 
> The offending behavior here is extremely unlikely; since
> fsnotify_put_mark() removes references to a connector (specifically,
> it ensures that the connector is unreachable from the inode it was
> formerly attached to) before updating its 'destroy_next' field, a
> sizeable chunk of code in fsnotify_put_mark() has to execute in the
> short window between when fsnotify() acquires the connector reference
> and saves the value of its 'list' field. On the HEAD kernel, I've only
> been able to reproduce this by inserting a udelay(1) in fsnotify().
> However, I've been able to reproduce this issue without inserting a
> udelay(1) anywhere on older unmodified release kernels, so I believe
> it's worth fixing at HEAD.

Oh, the bug definitely needs fixing. I somewhat dislike synchronize_srcu()
when destroying the connector though. The whole point of
connector_destroy_list is to batch synchronize_srcu() calls and move them
out of the relative fast path. So what I'd do instead is to move
'destroy_next' from a union with 'list' to a union with 'inode' and 'mnt'.
Those are not safe to access under srcu anyway - you have to hold
conn->lock to get them and use conn->flags to determine the type under the
same lock. So in that union we are actually safe to use the space for
anything once conn->flags are cleared by
fsnotify_detach_connector_from_object(). Will you send a patch for that so
that you get a proper credit for all the debugging you did? :) Thanks!

								Honza

> Fixes bug 199437: https://bugzilla.kernel.org/show_bug.cgi?id=199437
> 
> Fixes: 08991e83b7286635167bab40927665a90fb00d81
> Signed-off-by: Robert Kolchmeyer <rkolchmeyer@google.com>
> ---
>  fs/notify/mark.c | 1 +
>  1 file changed, 1 insertion(+)
> 
> diff --git a/fs/notify/mark.c b/fs/notify/mark.c
> index e9191b416434..358fc7de1e86 100644
> --- a/fs/notify/mark.c
> +++ b/fs/notify/mark.c
> @@ -227,6 +227,7 @@ void fsnotify_put_mark(struct fsnotify_mark *mark)
>  	iput(inode);
>  
>  	if (free_conn) {
> +		synchronize_srcu(&fsnotify_mark_srcu);
>  		spin_lock(&destroy_lock);
>  		conn->destroy_next = connector_destroy_list;
>  		connector_destroy_list = conn;
> -- 
> 2.17.0.484.g0c8726318c-goog
>
Robert Kolchmeyer April 19, 2018, 4:17 p.m. UTC | #2
Sounds great. I'll send out that patch shortly. Thanks!

On Thu, Apr 19, 2018 at 2:23 AM, Jan Kara <jack@suse.cz> wrote:
> On Wed 18-04-18 17:05:25, Robert Kolchmeyer wrote:
>> fsnotify() acquires a reference to a fsnotify_mark_connector through
>> the SRCU-protected pointer to_tell->i_fsnotify_marks. However, it
>> appears that no precautions are taken in fsnotify_put_mark() to
>> ensure that fsnotify() drops its reference to this
>> fsnotify_mark_connector before assigning a value to its 'destroy_next'
>> field. This can result in fsnotify_put_mark() assigning a value
>> to a connector's 'destroy_next' field right before fsnotify() tries to
>> traverse the linked list referenced by the connector's 'list' field.
>> Since these two fields are members of the same union, this behavior
>> results in a kernel panic.
>
> Ahh, I was reading this code just two days ago trying to find some issues
> (due to reports of softlockups in fsnotify()) but this didn't occur to me.
> Great spotting!
>
>> This issue is resolved by calling synchronize_srcu() before updating
>> the connector's 'destroy_next' field in fsnotify_put_mark(). Since the
>> relevant section of fsnotify() occurs in a SRCU read-side critical
>> section, this will force fsnotify_put_mark() to wait for fsnotify()
>> to finish operating on the connector before updating its 'destroy_next'
>> field. Since fsnotify_put_mark() removes references to the connector
>> before assigning its 'destroy_next' field, it shouldn't be possible for
>> another thread to acquire a reference to the connector while
>> fsnotify_put_mark() waits for fsnotify() to finish its work.
>>
>> The offending behavior here is extremely unlikely; since
>> fsnotify_put_mark() removes references to a connector (specifically,
>> it ensures that the connector is unreachable from the inode it was
>> formerly attached to) before updating its 'destroy_next' field, a
>> sizeable chunk of code in fsnotify_put_mark() has to execute in the
>> short window between when fsnotify() acquires the connector reference
>> and saves the value of its 'list' field. On the HEAD kernel, I've only
>> been able to reproduce this by inserting a udelay(1) in fsnotify().
>> However, I've been able to reproduce this issue without inserting a
>> udelay(1) anywhere on older unmodified release kernels, so I believe
>> it's worth fixing at HEAD.
>
> Oh, the bug definitely needs fixing. I somewhat dislike synchronize_srcu()
> when destroying the connector though. The whole point of
> connector_destroy_list is to batch synchronize_srcu() calls and move them
> out of the relative fast path. So what I'd do instead is to move
> 'destroy_next' from a union with 'list' to a union with 'inode' and 'mnt'.
> Those are not safe to access under srcu anyway - you have to hold
> conn->lock to get them and use conn->flags to determine the type under the
> same lock. So in that union we are actually safe to use the space for
> anything once conn->flags are cleared by
> fsnotify_detach_connector_from_object(). Will you send a patch for that so
> that you get a proper credit for all the debugging you did? :) Thanks!
>
>                                                                 Honza
>
>> Fixes bug 199437: https://bugzilla.kernel.org/show_bug.cgi?id=199437
>>
>> Fixes: 08991e83b7286635167bab40927665a90fb00d81
>> Signed-off-by: Robert Kolchmeyer <rkolchmeyer@google.com>
>> ---
>>  fs/notify/mark.c | 1 +
>>  1 file changed, 1 insertion(+)
>>
>> diff --git a/fs/notify/mark.c b/fs/notify/mark.c
>> index e9191b416434..358fc7de1e86 100644
>> --- a/fs/notify/mark.c
>> +++ b/fs/notify/mark.c
>> @@ -227,6 +227,7 @@ void fsnotify_put_mark(struct fsnotify_mark *mark)
>>       iput(inode);
>>
>>       if (free_conn) {
>> +             synchronize_srcu(&fsnotify_mark_srcu);
>>               spin_lock(&destroy_lock);
>>               conn->destroy_next = connector_destroy_list;
>>               connector_destroy_list = conn;
>> --
>> 2.17.0.484.g0c8726318c-goog
>>
> --
> Jan Kara <jack@suse.com>
> SUSE Labs, CR
diff mbox

Patch

diff --git a/fs/notify/mark.c b/fs/notify/mark.c
index e9191b416434..358fc7de1e86 100644
--- a/fs/notify/mark.c
+++ b/fs/notify/mark.c
@@ -227,6 +227,7 @@  void fsnotify_put_mark(struct fsnotify_mark *mark)
 	iput(inode);
 
 	if (free_conn) {
+		synchronize_srcu(&fsnotify_mark_srcu);
 		spin_lock(&destroy_lock);
 		conn->destroy_next = connector_destroy_list;
 		connector_destroy_list = conn;