Message ID | 20180419000525.21673-1-rkolchmeyer@google.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
On Wed 18-04-18 17:05:25, Robert Kolchmeyer wrote: > fsnotify() acquires a reference to a fsnotify_mark_connector through > the SRCU-protected pointer to_tell->i_fsnotify_marks. However, it > appears that no precautions are taken in fsnotify_put_mark() to > ensure that fsnotify() drops its reference to this > fsnotify_mark_connector before assigning a value to its 'destroy_next' > field. This can result in fsnotify_put_mark() assigning a value > to a connector's 'destroy_next' field right before fsnotify() tries to > traverse the linked list referenced by the connector's 'list' field. > Since these two fields are members of the same union, this behavior > results in a kernel panic. Ahh, I was reading this code just two days ago trying to find some issues (due to reports of softlockups in fsnotify()) but this didn't occur to me. Great spotting! > This issue is resolved by calling synchronize_srcu() before updating > the connector's 'destroy_next' field in fsnotify_put_mark(). Since the > relevant section of fsnotify() occurs in a SRCU read-side critical > section, this will force fsnotify_put_mark() to wait for fsnotify() > to finish operating on the connector before updating its 'destroy_next' > field. Since fsnotify_put_mark() removes references to the connector > before assigning its 'destroy_next' field, it shouldn't be possible for > another thread to acquire a reference to the connector while > fsnotify_put_mark() waits for fsnotify() to finish its work. > > The offending behavior here is extremely unlikely; since > fsnotify_put_mark() removes references to a connector (specifically, > it ensures that the connector is unreachable from the inode it was > formerly attached to) before updating its 'destroy_next' field, a > sizeable chunk of code in fsnotify_put_mark() has to execute in the > short window between when fsnotify() acquires the connector reference > and saves the value of its 'list' field. On the HEAD kernel, I've only > been able to reproduce this by inserting a udelay(1) in fsnotify(). > However, I've been able to reproduce this issue without inserting a > udelay(1) anywhere on older unmodified release kernels, so I believe > it's worth fixing at HEAD. Oh, the bug definitely needs fixing. I somewhat dislike synchronize_srcu() when destroying the connector though. The whole point of connector_destroy_list is to batch synchronize_srcu() calls and move them out of the relative fast path. So what I'd do instead is to move 'destroy_next' from a union with 'list' to a union with 'inode' and 'mnt'. Those are not safe to access under srcu anyway - you have to hold conn->lock to get them and use conn->flags to determine the type under the same lock. So in that union we are actually safe to use the space for anything once conn->flags are cleared by fsnotify_detach_connector_from_object(). Will you send a patch for that so that you get a proper credit for all the debugging you did? :) Thanks! Honza > Fixes bug 199437: https://bugzilla.kernel.org/show_bug.cgi?id=199437 > > Fixes: 08991e83b7286635167bab40927665a90fb00d81 > Signed-off-by: Robert Kolchmeyer <rkolchmeyer@google.com> > --- > fs/notify/mark.c | 1 + > 1 file changed, 1 insertion(+) > > diff --git a/fs/notify/mark.c b/fs/notify/mark.c > index e9191b416434..358fc7de1e86 100644 > --- a/fs/notify/mark.c > +++ b/fs/notify/mark.c > @@ -227,6 +227,7 @@ void fsnotify_put_mark(struct fsnotify_mark *mark) > iput(inode); > > if (free_conn) { > + synchronize_srcu(&fsnotify_mark_srcu); > spin_lock(&destroy_lock); > conn->destroy_next = connector_destroy_list; > connector_destroy_list = conn; > -- > 2.17.0.484.g0c8726318c-goog >
Sounds great. I'll send out that patch shortly. Thanks! On Thu, Apr 19, 2018 at 2:23 AM, Jan Kara <jack@suse.cz> wrote: > On Wed 18-04-18 17:05:25, Robert Kolchmeyer wrote: >> fsnotify() acquires a reference to a fsnotify_mark_connector through >> the SRCU-protected pointer to_tell->i_fsnotify_marks. However, it >> appears that no precautions are taken in fsnotify_put_mark() to >> ensure that fsnotify() drops its reference to this >> fsnotify_mark_connector before assigning a value to its 'destroy_next' >> field. This can result in fsnotify_put_mark() assigning a value >> to a connector's 'destroy_next' field right before fsnotify() tries to >> traverse the linked list referenced by the connector's 'list' field. >> Since these two fields are members of the same union, this behavior >> results in a kernel panic. > > Ahh, I was reading this code just two days ago trying to find some issues > (due to reports of softlockups in fsnotify()) but this didn't occur to me. > Great spotting! > >> This issue is resolved by calling synchronize_srcu() before updating >> the connector's 'destroy_next' field in fsnotify_put_mark(). Since the >> relevant section of fsnotify() occurs in a SRCU read-side critical >> section, this will force fsnotify_put_mark() to wait for fsnotify() >> to finish operating on the connector before updating its 'destroy_next' >> field. Since fsnotify_put_mark() removes references to the connector >> before assigning its 'destroy_next' field, it shouldn't be possible for >> another thread to acquire a reference to the connector while >> fsnotify_put_mark() waits for fsnotify() to finish its work. >> >> The offending behavior here is extremely unlikely; since >> fsnotify_put_mark() removes references to a connector (specifically, >> it ensures that the connector is unreachable from the inode it was >> formerly attached to) before updating its 'destroy_next' field, a >> sizeable chunk of code in fsnotify_put_mark() has to execute in the >> short window between when fsnotify() acquires the connector reference >> and saves the value of its 'list' field. On the HEAD kernel, I've only >> been able to reproduce this by inserting a udelay(1) in fsnotify(). >> However, I've been able to reproduce this issue without inserting a >> udelay(1) anywhere on older unmodified release kernels, so I believe >> it's worth fixing at HEAD. > > Oh, the bug definitely needs fixing. I somewhat dislike synchronize_srcu() > when destroying the connector though. The whole point of > connector_destroy_list is to batch synchronize_srcu() calls and move them > out of the relative fast path. So what I'd do instead is to move > 'destroy_next' from a union with 'list' to a union with 'inode' and 'mnt'. > Those are not safe to access under srcu anyway - you have to hold > conn->lock to get them and use conn->flags to determine the type under the > same lock. So in that union we are actually safe to use the space for > anything once conn->flags are cleared by > fsnotify_detach_connector_from_object(). Will you send a patch for that so > that you get a proper credit for all the debugging you did? :) Thanks! > > Honza > >> Fixes bug 199437: https://bugzilla.kernel.org/show_bug.cgi?id=199437 >> >> Fixes: 08991e83b7286635167bab40927665a90fb00d81 >> Signed-off-by: Robert Kolchmeyer <rkolchmeyer@google.com> >> --- >> fs/notify/mark.c | 1 + >> 1 file changed, 1 insertion(+) >> >> diff --git a/fs/notify/mark.c b/fs/notify/mark.c >> index e9191b416434..358fc7de1e86 100644 >> --- a/fs/notify/mark.c >> +++ b/fs/notify/mark.c >> @@ -227,6 +227,7 @@ void fsnotify_put_mark(struct fsnotify_mark *mark) >> iput(inode); >> >> if (free_conn) { >> + synchronize_srcu(&fsnotify_mark_srcu); >> spin_lock(&destroy_lock); >> conn->destroy_next = connector_destroy_list; >> connector_destroy_list = conn; >> -- >> 2.17.0.484.g0c8726318c-goog >> > -- > Jan Kara <jack@suse.com> > SUSE Labs, CR
diff --git a/fs/notify/mark.c b/fs/notify/mark.c index e9191b416434..358fc7de1e86 100644 --- a/fs/notify/mark.c +++ b/fs/notify/mark.c @@ -227,6 +227,7 @@ void fsnotify_put_mark(struct fsnotify_mark *mark) iput(inode); if (free_conn) { + synchronize_srcu(&fsnotify_mark_srcu); spin_lock(&destroy_lock); conn->destroy_next = connector_destroy_list; connector_destroy_list = conn;
fsnotify() acquires a reference to a fsnotify_mark_connector through the SRCU-protected pointer to_tell->i_fsnotify_marks. However, it appears that no precautions are taken in fsnotify_put_mark() to ensure that fsnotify() drops its reference to this fsnotify_mark_connector before assigning a value to its 'destroy_next' field. This can result in fsnotify_put_mark() assigning a value to a connector's 'destroy_next' field right before fsnotify() tries to traverse the linked list referenced by the connector's 'list' field. Since these two fields are members of the same union, this behavior results in a kernel panic. This issue is resolved by calling synchronize_srcu() before updating the connector's 'destroy_next' field in fsnotify_put_mark(). Since the relevant section of fsnotify() occurs in a SRCU read-side critical section, this will force fsnotify_put_mark() to wait for fsnotify() to finish operating on the connector before updating its 'destroy_next' field. Since fsnotify_put_mark() removes references to the connector before assigning its 'destroy_next' field, it shouldn't be possible for another thread to acquire a reference to the connector while fsnotify_put_mark() waits for fsnotify() to finish its work. The offending behavior here is extremely unlikely; since fsnotify_put_mark() removes references to a connector (specifically, it ensures that the connector is unreachable from the inode it was formerly attached to) before updating its 'destroy_next' field, a sizeable chunk of code in fsnotify_put_mark() has to execute in the short window between when fsnotify() acquires the connector reference and saves the value of its 'list' field. On the HEAD kernel, I've only been able to reproduce this by inserting a udelay(1) in fsnotify(). However, I've been able to reproduce this issue without inserting a udelay(1) anywhere on older unmodified release kernels, so I believe it's worth fixing at HEAD. Fixes bug 199437: https://bugzilla.kernel.org/show_bug.cgi?id=199437 Fixes: 08991e83b7286635167bab40927665a90fb00d81 Signed-off-by: Robert Kolchmeyer <rkolchmeyer@google.com> --- fs/notify/mark.c | 1 + 1 file changed, 1 insertion(+)