CIFS: Fix kernel crash on simultaneous mount/umount calls

Message ID	1307622570-7141-1-git-send-email-piastryyy@gmail.com (mailing list archive)
State	New, archived
Headers	show Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by demeter2.kernel.org (8.14.4/8.14.4) with ESMTP id p59CUc3q013597 for <patchwork-linux-cifs@patchwork.kernel.org>; Thu, 9 Jun 2011 12:30:38 GMT DomainKey-Signature: a=rsa-sha1; c=nofws; d=gmail.com; s=gamma; h=from:to:subject:date:message-id:x-mailer; b=ch4S4MDlAS6Dq4q/AP9SsPuIqm0AXwLDRALyDfhp7XSz6FzoMZ+onbXgKUHLcu7ubJ mFqNbyiwnOzmRADK1zq/5AEbrcRpjNFE06b+5aAy2Re3zqVpus91s18KKN0xlNyRPivY +OGFlFxoBSGw/tdwiQgCscBRTQOfMj0vofKdM= From: Pavel Shilovsky <piastryyy@gmail.com> To: linux-cifs@vger.kernel.org Subject: [PATCH] CIFS: Fix kernel crash on simultaneous mount/umount calls Date: Thu, 9 Jun 2011 16:29:30 +0400 Message-Id: <1307622570-7141-1-git-send-email-piastryyy@gmail.com> Sender: linux-cifs-owner@vger.kernel.org Precedence: bulk

Pavel Shilovsky June 9, 2011, 12:29 p.m. UTC

Recently merged shared-sb capability causes kernel to crash when
we try to do mounts and umounts simultaneously on the same machine.
The patch fixes umount codepath by doing kill_anon_super at first
(it calls generic_shutdown_super that tries to take sb_lock and
removes a superblock from fs_supers list) and then processing
cifs_umount (that frees cifs related fields).

In this case when we call sget and it calls cifs_match_super under
sb_lock for every entry from fs_supers list, we can be sure that
all cifs related fields are not freed by simultaneous umount.

Signed-off-by: Pavel Shilovsky <piastryyy@gmail.com>
---
 fs/cifs/cifsfs.c |   19 ++++++++++++++++---
 1 files changed, 16 insertions(+), 3 deletions(-)

Christoph Hellwig June 9, 2011, 12:40 p.m. UTC | #1

>  static void
>  cifs_put_super(struct super_block *sb)
>  {
> +	struct cifs_sb_info *cifs_sb = CIFS_SB(sb);
> +	if (cifs_sb == NULL) {
> +		cFYI(1, "Empty cifs superblock info passed to put_super");
> +		return;
> +	}
> +
> +	bdi_destroy(&cifs_sb->bdi);

This means you have a problem with the lifetime rules in cifs_do_mount.

The VFS only calls ->put_super if sb->s_root is set.  So the rules for
the mount handler are to only set s_root once the superblock is fully
set up.

Also you should never call cifs_umount from the error handling in
cifs_do_mount.  Until s_root is set, please unwind manually, after
that leave it to ->put_super.


> +
> +static void
> +cifs_kill_super(struct super_block *sb)
> +{

This also seems quite broken.  If you fix up the mount path like
I suggested it won't be nessecary.

>  	int rc = 0;
>  	struct cifs_sb_info *cifs_sb;
>  
>  	cFYI(1, "In cifs_put_super");
>  	cifs_sb = CIFS_SB(sb);
>  	if (cifs_sb == NULL) {

And this check should also be removed.

--
To unsubscribe from this list: send the line "unsubscribe linux-cifs" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Pavel Shilovsky June 9, 2011, 4:30 p.m. UTC | #2

2011/6/9 Christoph Hellwig <hch@infradead.org>:
>>  static void
>>  cifs_put_super(struct super_block *sb)
>>  {
>> +     struct cifs_sb_info *cifs_sb = CIFS_SB(sb);
>> +     if (cifs_sb == NULL) {
>> +             cFYI(1, "Empty cifs superblock info passed to put_super");
>> +             return;
>> +     }
>> +
>> +     bdi_destroy(&cifs_sb->bdi);
>
> This means you have a problem with the lifetime rules in cifs_do_mount.
>
> The VFS only calls ->put_super if sb->s_root is set.  So the rules for
> the mount handler are to only set s_root once the superblock is fully
> set up.
>
> Also you should never call cifs_umount from the error handling in
> cifs_do_mount.  Until s_root is set, please unwind manually, after
> that leave it to ->put_super.
>
>
>> +
>> +static void
>> +cifs_kill_super(struct super_block *sb)
>> +{
>
> This also seems quite broken.  If you fix up the mount path like
> I suggested it won't be nessecary.
>
>>       int rc = 0;
>>       struct cifs_sb_info *cifs_sb;
>>
>>       cFYI(1, "In cifs_put_super");
>>       cifs_sb = CIFS_SB(sb);
>>       if (cifs_sb == NULL) {
>
> And this check should also be removed.
>
>

It seems to me that I didn't understand your points clearly.

That's how I reproduce the problem:
1) one process does mount/umount calls in a loop;
2) another process does mount and umount calls.

As the result, kernel crashes sometimes on mount of the second
process, sometimes - on umount of the second process. The problem is
that:

one of these two processes process umount call. So, it comes to
cifs_put_super, calls cifs_umount and then calls kfree(cifs_sb).  In
the same time, another process comes into cifs_do_mount and calls
sget(). Then it appers into cifs_match_super that thinks that all
structures like server, ses, tcon and cifs_sb are valid, but it is not
true - the first process has already freed them but didn't remove
superblock from fs_supers list.

So, I simply reodered calls: now umount codepath marks a superblock as
non-active and calls kill_sb(new cifs_kill_super), it calls
kill_anon_super (that calls cifs_put_super and removes superblock from
fs_supers list) and after that kill_sb frees all cifs structures
(server, ses, tcon, cifs_sb).

As the result, when we get into cifs_match_super (in mount from
another process) we are sure that all cifs structures are valid.

If I am not right, correct me, please!

Pavel Shilovsky June 10, 2011, 6 a.m. UTC | #3

2011/6/9 Pavel Shilovsky <piastryyy@gmail.com>:
> 2011/6/9 Christoph Hellwig <hch@infradead.org>:
>>>  static void
>>>  cifs_put_super(struct super_block *sb)
>>>  {
>>> +     struct cifs_sb_info *cifs_sb = CIFS_SB(sb);
>>> +     if (cifs_sb == NULL) {
>>> +             cFYI(1, "Empty cifs superblock info passed to put_super");
>>> +             return;
>>> +     }
>>> +
>>> +     bdi_destroy(&cifs_sb->bdi);
>>
>> This means you have a problem with the lifetime rules in cifs_do_mount.
>>
>> The VFS only calls ->put_super if sb->s_root is set.  So the rules for
>> the mount handler are to only set s_root once the superblock is fully
>> set up.
>>
>> Also you should never call cifs_umount from the error handling in
>> cifs_do_mount.  Until s_root is set, please unwind manually, after
>> that leave it to ->put_super.
>>
>>
>>> +
>>> +static void
>>> +cifs_kill_super(struct super_block *sb)
>>> +{
>>
>> This also seems quite broken.  If you fix up the mount path like
>> I suggested it won't be nessecary.
>>
>>>       int rc = 0;
>>>       struct cifs_sb_info *cifs_sb;
>>>
>>>       cFYI(1, "In cifs_put_super");
>>>       cifs_sb = CIFS_SB(sb);
>>>       if (cifs_sb == NULL) {
>>
>> And this check should also be removed.
>>
>>
>
> It seems to me that I didn't understand your points clearly.
>
> That's how I reproduce the problem:
> 1) one process does mount/umount calls in a loop;
> 2) another process does mount and umount calls.
>
> As the result, kernel crashes sometimes on mount of the second
> process, sometimes - on umount of the second process. The problem is
> that:
>
> one of these two processes process umount call. So, it comes to
> cifs_put_super, calls cifs_umount and then calls kfree(cifs_sb).  In
> the same time, another process comes into cifs_do_mount and calls
> sget(). Then it appers into cifs_match_super that thinks that all
> structures like server, ses, tcon and cifs_sb are valid, but it is not
> true - the first process has already freed them but didn't remove
> superblock from fs_supers list.
>
> So, I simply reodered calls: now umount codepath marks a superblock as
> non-active and calls kill_sb(new cifs_kill_super), it calls
> kill_anon_super (that calls cifs_put_super and removes superblock from
> fs_supers list) and after that kill_sb frees all cifs structures
> (server, ses, tcon, cifs_sb).
>
> As the result, when we get into cifs_match_super (in mount from
> another process) we are sure that all cifs structures are valid.
>
> If I am not right, correct me, please!
>
> --
> Best regards,
> Pavel Shilovsky.
>

I also want to mention that NFS mount code uses the same order:
1) in moun at first it sets up server structure and then it calls sget.
2) it has put_super that only calls bdi_unregister and kill_super that
at first call kill_anon_super and then frees server structure.

Christoph Hellwig June 11, 2011, 9:43 a.m. UTC | #4

> one of these two processes process umount call. So, it comes to
> cifs_put_super, calls cifs_umount and then calls kfree(cifs_sb).  In
> the same time, another process comes into cifs_do_mount and calls
> sget(). Then it appers into cifs_match_super that thinks that all
> structures like server, ses, tcon and cifs_sb are valid, but it is not
> true - the first process has already freed them but didn't remove
> superblock from fs_supers list.
> 
> So, I simply reodered calls: now umount codepath marks a superblock as
> non-active and calls kill_sb(new cifs_kill_super), it calls
> kill_anon_super (that calls cifs_put_super and removes superblock from
> fs_supers list) and after that kill_sb frees all cifs structures
> (server, ses, tcon, cifs_sb).
> 
> As the result, when we get into cifs_match_super (in mount from
> another process) we are sure that all cifs structures are valid.
> 
> If I am not right, correct me, please!

Ok, if we look at sb-private data in the sget callsbacks it seems like
the clreanup for those does indeed need to be done in ->kill_sb.  I have
to say I really hate it, and the culprit is that we call the sget test
callback is called before we call grab_super in sget, that is we don't
protect against filesystems going away.  I suspect that is the real
problem here.

The workaround you've copied from NFS also works, but it's pretty
ugly, given that ->kill_sb also is called for partially set up
superblocks.

--
To unsubscribe from this list: send the line "unsubscribe linux-cifs" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Al Viro June 11, 2011, 11:12 a.m. UTC | #5

On Sat, Jun 11, 2011 at 05:43:14AM -0400, Christoph Hellwig wrote:

> Ok, if we look at sb-private data in the sget callsbacks it seems like
> the clreanup for those does indeed need to be done in ->kill_sb.  I have
> to say I really hate it, and the culprit is that we call the sget test
> callback is called before we call grab_super in sget, that is we don't
> protect against filesystems going away.  I suspect that is the real
> problem here.

grab_super() is *heavy* and having to undo it even once means that we need to
rescan.  Sorry, test() has to be callable without that.
--
To unsubscribe from this list: send the line "unsubscribe linux-cifs" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Pavel Shilovsky June 11, 2011, 2 p.m. UTC | #6

2011/6/11 Al Viro <viro@zeniv.linux.org.uk>:
> On Sat, Jun 11, 2011 at 05:43:14AM -0400, Christoph Hellwig wrote:
>
>> Ok, if we look at sb-private data in the sget callsbacks it seems like
>> the clreanup for those does indeed need to be done in ->kill_sb.  I have
>> to say I really hate it, and the culprit is that we call the sget test
>> callback is called before we call grab_super in sget, that is we don't
>> protect against filesystems going away.  I suspect that is the real
>> problem here.
>
> grab_super() is *heavy* and having to undo it even once means that we need to
> rescan.  Sorry, test() has to be callable without that.
>

In this case, I think we should follow NFS strategy and apply my patch
as a workaround. Thoughts?

Al Viro June 17, 2011, 3:54 a.m. UTC | #7

On Sat, Jun 11, 2011 at 06:00:25PM +0400, Pavel Shilovsky wrote:
> 2011/6/11 Al Viro <viro@zeniv.linux.org.uk>:
> > On Sat, Jun 11, 2011 at 05:43:14AM -0400, Christoph Hellwig wrote:
> >
> >> Ok, if we look at sb-private data in the sget callsbacks it seems like
> >> the clreanup for those does indeed need to be done in ->kill_sb. ?I have
> >> to say I really hate it, and the culprit is that we call the sget test
> >> callback is called before we call grab_super in sget, that is we don't
> >> protect against filesystems going away. ?I suspect that is the real
> >> problem here.
> >
> > grab_super() is *heavy* and having to undo it even once means that we need to
> > rescan. ?Sorry, test() has to be callable without that.
> >
> 
> In this case, I think we should follow NFS strategy and apply my patch
> as a workaround. Thoughts?

See #untested in vfs-2.6.git; I *hope* I've got all the leaks in current
mainline plugged and all sget() races ought to be taken care of.  And
yes, the branch name matches the reality - this sucker is completely
untested.  Have fun...  Commit message is atrocious, of course - I'm too
tired to even try writing a coherent patch description at the moment and
I'll probably need to split it into several patches anyway.  Tomorrow...
--
To unsubscribe from this list: send the line "unsubscribe linux-cifs" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Al Viro June 17, 2011, 2:08 p.m. UTC | #8

On Fri, Jun 17, 2011 at 04:54:21AM +0100, Al Viro wrote:

> See #untested in vfs-2.6.git; I *hope* I've got all the leaks in current
> mainline plugged and all sget() races ought to be taken care of.  And
> yes, the branch name matches the reality - this sucker is completely
> untested.  Have fun...  Commit message is atrocious, of course - I'm too
> tired to even try writing a coherent patch description at the moment and
> I'll probably need to split it into several patches anyway.  Tomorrow...

Updated, split up, pushed into #untested.  Enjoy...
--
To unsubscribe from this list: send the line "unsubscribe linux-cifs" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Pavel Shilovsky June 20, 2011, 12:54 p.m. UTC | #9

2011/6/17 Al Viro <viro@zeniv.linux.org.uk>:
> On Fri, Jun 17, 2011 at 04:54:21AM +0100, Al Viro wrote:
>
>> See #untested in vfs-2.6.git; I *hope* I've got all the leaks in current
>> mainline plugged and all sget() races ought to be taken care of.  And
>> yes, the branch name matches the reality - this sucker is completely
>> untested.  Have fun...  Commit message is atrocious, of course - I'm too
>> tired to even try writing a coherent patch description at the moment and
>> I'll probably need to split it into several patches anyway.  Tomorrow...
>
> Updated, split up, pushed into #untested.  Enjoy...
>

Today I tried this branch and it was ok.

Steve, Jeff, what do you think?

Jeff Layton June 20, 2011, 12:59 p.m. UTC | #10

On Mon, 20 Jun 2011 16:54:48 +0400
Pavel Shilovsky <piastryyy@gmail.com> wrote:

> 2011/6/17 Al Viro <viro@zeniv.linux.org.uk>:
> > On Fri, Jun 17, 2011 at 04:54:21AM +0100, Al Viro wrote:
> >
> >> See #untested in vfs-2.6.git; I *hope* I've got all the leaks in current
> >> mainline plugged and all sget() races ought to be taken care of.  And
> >> yes, the branch name matches the reality - this sucker is completely
> >> untested.  Have fun...  Commit message is atrocious, of course - I'm too
> >> tired to even try writing a coherent patch description at the moment and
> >> I'll probably need to split it into several patches anyway.  Tomorrow...
> >
> > Updated, split up, pushed into #untested.  Enjoy...
> >
> 
> Today I tried this branch and it was ok.
> 
> Steve, Jeff, what do you think?
> 

I did some light testing on it over the weekend and it looks good to me
as well. Nice cleanup, Al.

Steve French June 20, 2011, 4:13 p.m. UTC | #11

Would the plan be for Al to merge these out of his tree?

On Mon, Jun 20, 2011 at 7:59 AM, Jeff Layton <jlayton@redhat.com> wrote:
> On Mon, 20 Jun 2011 16:54:48 +0400
> Pavel Shilovsky <piastryyy@gmail.com> wrote:
>
>> 2011/6/17 Al Viro <viro@zeniv.linux.org.uk>:
>> > On Fri, Jun 17, 2011 at 04:54:21AM +0100, Al Viro wrote:
>> >
>> >> See #untested in vfs-2.6.git; I *hope* I've got all the leaks in current
>> >> mainline plugged and all sget() races ought to be taken care of.  And
>> >> yes, the branch name matches the reality - this sucker is completely
>> >> untested.  Have fun...  Commit message is atrocious, of course - I'm too
>> >> tired to even try writing a coherent patch description at the moment and
>> >> I'll probably need to split it into several patches anyway.  Tomorrow...
>> >
>> > Updated, split up, pushed into #untested.  Enjoy...
>> >
>>
>> Today I tried this branch and it was ok.
>>
>> Steve, Jeff, what do you think?
>>
>
> I did some light testing on it over the weekend and it looks good to me
> as well. Nice cleanup, Al.
>
> --
> Jeff Layton <jlayton@redhat.com>
>

Al Viro June 20, 2011, 4:15 p.m. UTC | #12

On Mon, Jun 20, 2011 at 11:13:22AM -0500, Steve French wrote:
> Would the plan be for Al to merge these out of his tree?

Can do, if cifs folks are OK with it...  Note that it's a regression since
the last merge window, so it probably ought to go to Linus before 3.0-final.
--
To unsubscribe from this list: send the line "unsubscribe linux-cifs" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Pavel Shilovsky June 24, 2011, 8:47 a.m. UTC | #13

2011/6/20 Al Viro <viro@zeniv.linux.org.uk>:
> On Mon, Jun 20, 2011 at 11:13:22AM -0500, Steve French wrote:
>> Would the plan be for Al to merge these out of his tree?
>
> Can do, if cifs folks are OK with it...  Note that it's a regression since
> the last merge window, so it probably ought to go to Linus before 3.0-final.
>

Al, what is the status of your patchset? Note, that you can add my
"Acked-by: Pavel Shilovsky <piastryyy@gmail.com>" tag as well if you
need it.

Jeff Layton June 24, 2011, 10:48 a.m. UTC | #14

On Fri, 24 Jun 2011 12:47:42 +0400
Pavel Shilovsky <piastryyy@gmail.com> wrote:

> 2011/6/20 Al Viro <viro@zeniv.linux.org.uk>:
> > On Mon, Jun 20, 2011 at 11:13:22AM -0500, Steve French wrote:
> >> Would the plan be for Al to merge these out of his tree?
> >
> > Can do, if cifs folks are OK with it...  Note that it's a regression since
> > the last merge window, so it probably ought to go to Linus before 3.0-final.
> >
> 
> Al, what is the status of your patchset? Note, that you can add my
> "Acked-by: Pavel Shilovsky <piastryyy@gmail.com>" tag as well if you
> need it.
> 

Yes, I've looked over the set and it looks good to me. Nice cleanup,
Al. You can add my:

Reviewed-by: Jeff Layton <jlayton@redhat.com>
--
To unsubscribe from this list: send the line "unsubscribe linux-cifs" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Al Viro June 24, 2011, 10:54 p.m. UTC | #15

> > Al, what is the status of your patchset? Note, that you can add my
> > "Acked-by: Pavel Shilovsky <piastryyy@gmail.com>" tag as well if you
> > need it.
> > 
> 
> Yes, I've looked over the set and it looks good to me. Nice cleanup,
> Al. You can add my:
> 
> Reviewed-by: Jeff Layton <jlayton@redhat.com>

OK...  The patchset fixes breakage that got into cifs ->mount() since
cifs had started to play with shared superblocks - sget() races, leaks,
etc.  Commit dates are recent due to added Acked-by and Reviewed-by;
other than that, it's an exact copy of the stuff that sat in for-next.
What is the proper way to deal with such situations, BTW?  I know that
you seriously dislike being asked to pull just-created commits, but
the normal reasons do not apply in this case...  Please, pull from
the usual place -
git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs-2.6.git/ for-linus

Shortlog:
Al Viro (15):
      take bdi setup/destruction into cifs_mount/cifs_umount
      cifs: double free on mount failure
      cifs: don't leak nls on mount failure
      cifs: don't pass superblock to cifs_mount()
      cifs: leak on mount if we share superblock
      cifs: allocate mountdata earlier
      cifs: initialize ->tlink_tree in cifs_setup_cifs_sb()
      sanitize cifs_umount() prototype
      cifs: pull cifs_mount() call up
      cifs: move cifs_umount() call into ->kill_sb()
      cifs: pull freeing mountdata/dropping nls/freeing cifs_sb into cifs_umount()
      cifs: close sget() races
      cifs: more breakage on mount failures
      cifs: tidy cifs_do_mount() up a bit
      cifs: propagate errors from cifs_get_root() to mount(2)

Diffstat:
 fs/cifs/cifs_fs_sb.h |    1 +
 fs/cifs/cifsfs.c     |  156 +++++++++++++++++++++-----------------------------
 fs/cifs/cifsproto.h  |    8 +-
 fs/cifs/connect.c    |   49 +++++++++-------
 4 files changed, 98 insertions(+), 116 deletions(-)
--
To unsubscribe from this list: send the line "unsubscribe linux-cifs" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Steve French June 24, 2011, 11:03 p.m. UTC | #16

I have reviewed most of it and am fine with it going out of your tree.

On Fri, Jun 24, 2011 at 5:54 PM, Al Viro <viro@zeniv.linux.org.uk> wrote:
>> > Al, what is the status of your patchset? Note, that you can add my
>> > "Acked-by: Pavel Shilovsky <piastryyy@gmail.com>" tag as well if you
>> > need it.
>> >
>>
>> Yes, I've looked over the set and it looks good to me. Nice cleanup,
>> Al. You can add my:
>>
>> Reviewed-by: Jeff Layton <jlayton@redhat.com>
>
> OK...  The patchset fixes breakage that got into cifs ->mount() since
> cifs had started to play with shared superblocks - sget() races, leaks,
> etc.  Commit dates are recent due to added Acked-by and Reviewed-by;
> other than that, it's an exact copy of the stuff that sat in for-next.
> What is the proper way to deal with such situations, BTW?  I know that
> you seriously dislike being asked to pull just-created commits, but
> the normal reasons do not apply in this case...  Please, pull from
> the usual place -
> git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs-2.6.git/ for-linus
>
> Shortlog:
> Al Viro (15):
>      take bdi setup/destruction into cifs_mount/cifs_umount
>      cifs: double free on mount failure
>      cifs: don't leak nls on mount failure
>      cifs: don't pass superblock to cifs_mount()
>      cifs: leak on mount if we share superblock
>      cifs: allocate mountdata earlier
>      cifs: initialize ->tlink_tree in cifs_setup_cifs_sb()
>      sanitize cifs_umount() prototype
>      cifs: pull cifs_mount() call up
>      cifs: move cifs_umount() call into ->kill_sb()
>      cifs: pull freeing mountdata/dropping nls/freeing cifs_sb into cifs_umount()
>      cifs: close sget() races
>      cifs: more breakage on mount failures
>      cifs: tidy cifs_do_mount() up a bit
>      cifs: propagate errors from cifs_get_root() to mount(2)
>
> Diffstat:
>  fs/cifs/cifs_fs_sb.h |    1 +
>  fs/cifs/cifsfs.c     |  156 +++++++++++++++++++++-----------------------------
>  fs/cifs/cifsproto.h  |    8 +-
>  fs/cifs/connect.c    |   49 +++++++++-------
>  4 files changed, 98 insertions(+), 116 deletions(-)
>

CIFS: Fix kernel crash on simultaneous mount/umount calls

Commit Message

Comments

Patch