Message ID | 20210908030026.2dLZCmkE4%akpm@linux-foundation.org (mailing list archive) |
---|---|
State | New |
Headers | show |
Series | [001/147] mm, slub: don't call flush_all() from slab_debug_trace_open() | expand |
Hi! > From: Zhen Lei <thunder.leizhen@huawei.com> > Subject: nilfs2: use refcount_dec_and_lock() to fix potential UAF > > When the refcount is decreased to 0, the resource reclamation branch is > entered. Before CPU0 reaches the race point (1), CPU1 may obtain the > spinlock and traverse the rbtree to find 'root', see nilfs_lookup_root(). > Although CPU1 will call refcount_inc() to increase the refcount, it is > obviously too late. CPU0 will release 'root' directly, CPU1 then accesses > 'root' and triggers UAF. > > Use refcount_dec_and_lock() to ensure that both the operations of decrease > refcount to 0 and link deletion are lock protected eliminates this risk. > > CPU0 CPU1 > nilfs_put_root(): > <-------- (1) > spin_lock(&nilfs->ns_cptree_lock); > rb_erase(&root->rb_node, &nilfs->ns_cptree); > spin_unlock(&nilfs->ns_cptree_lock); > > kfree(root); > <-------- use-after-free > There is no reproduction program, and the above is only theoretical > analysis. Ok, so we have a theoretical bug, and fix already on its way to stable. But ... is it correct? > +++ a/fs/nilfs2/the_nilfs.c > @@ -792,14 +792,13 @@ nilfs_find_or_create_root(struct the_nil > > void nilfs_put_root(struct nilfs_root *root) > { > - if (refcount_dec_and_test(&root->count)) { > - struct the_nilfs *nilfs = root->nilfs; > + struct the_nilfs *nilfs = root->nilfs; > > - nilfs_sysfs_delete_snapshot_group(root); > - > - spin_lock(&nilfs->ns_cptree_lock); > + if (refcount_dec_and_lock(&root->count, &nilfs->ns_cptree_lock)) { > rb_erase(&root->rb_node, &nilfs->ns_cptree); > spin_unlock(&nilfs->ns_cptree_lock); > + > + nilfs_sysfs_delete_snapshot_group(root); > iput(root->ifile); > > kfree(root); spin_lock() is deleted, but spin_unlock() is not affected. This means unbalanced locking, right? Best regards, Pavel -- DENX Software Engineering GmbH, Managing Director: Wolfgang Denk HRB 165235 Munich, Office: Kirchenstr.5, D-82194 Groebenzell, Germany
Hi, On Fri, Sep 24, 2021 at 7:35 PM Pavel Machek <pavel@denx.de> wrote: > > Hi! > > > From: Zhen Lei <thunder.leizhen@huawei.com> > > Subject: nilfs2: use refcount_dec_and_lock() to fix potential UAF > > > > When the refcount is decreased to 0, the resource reclamation branch is > > entered. Before CPU0 reaches the race point (1), CPU1 may obtain the > > spinlock and traverse the rbtree to find 'root', see nilfs_lookup_root(). > > Although CPU1 will call refcount_inc() to increase the refcount, it is > > obviously too late. CPU0 will release 'root' directly, CPU1 then accesses > > 'root' and triggers UAF. > > > > Use refcount_dec_and_lock() to ensure that both the operations of decrease > > refcount to 0 and link deletion are lock protected eliminates this risk. > > > > CPU0 CPU1 > > nilfs_put_root(): > > <-------- (1) > > spin_lock(&nilfs->ns_cptree_lock); > > rb_erase(&root->rb_node, &nilfs->ns_cptree); > > spin_unlock(&nilfs->ns_cptree_lock); > > > > kfree(root); > > <-------- use-after-free > > > There is no reproduction program, and the above is only theoretical > > analysis. > > Ok, so we have a theoretical bug, and fix already on its way to > stable. But ... is it correct? > > > +++ a/fs/nilfs2/the_nilfs.c > > @@ -792,14 +792,13 @@ nilfs_find_or_create_root(struct the_nil > > > > void nilfs_put_root(struct nilfs_root *root) > > { > > - if (refcount_dec_and_test(&root->count)) { > > - struct the_nilfs *nilfs = root->nilfs; > > + struct the_nilfs *nilfs = root->nilfs; > > > > - nilfs_sysfs_delete_snapshot_group(root); > > - > > - spin_lock(&nilfs->ns_cptree_lock); > > + if (refcount_dec_and_lock(&root->count, &nilfs->ns_cptree_lock)) { > > rb_erase(&root->rb_node, &nilfs->ns_cptree); > > spin_unlock(&nilfs->ns_cptree_lock); > > + > > + nilfs_sysfs_delete_snapshot_group(root); > > iput(root->ifile); > > > > kfree(root); > > spin_lock() is deleted, but spin_unlock() is not affected. This means > unbalanced locking, right? It's okay. spin_lock() is integrated into refcount_dec_and_lock(), which was originally refcount_dec_and_test(). Thanks, Ryusuke Konishi > > Best regards, > Pavel > -- > DENX Software Engineering GmbH, Managing Director: Wolfgang Denk > HRB 165235 Munich, Office: Kirchenstr.5, D-82194 Groebenzell, Germany >
On Tue, Sep 07, 2021 at 08:00:26PM -0700, Andrew Morton wrote: > From: Zhen Lei <thunder.leizhen@huawei.com> > Subject: nilfs2: use refcount_dec_and_lock() to fix potential UAF > > When the refcount is decreased to 0, the resource reclamation branch is > entered. Before CPU0 reaches the race point (1), CPU1 may obtain the > spinlock and traverse the rbtree to find 'root', see nilfs_lookup_root(). > Although CPU1 will call refcount_inc() to increase the refcount, it is > obviously too late. CPU0 will release 'root' directly, CPU1 then accesses > 'root' and triggers UAF. > > Use refcount_dec_and_lock() to ensure that both the operations of decrease > refcount to 0 and link deletion are lock protected eliminates this risk. > > CPU0 CPU1 > nilfs_put_root(): > <-------- (1) > spin_lock(&nilfs->ns_cptree_lock); > rb_erase(&root->rb_node, &nilfs->ns_cptree); > spin_unlock(&nilfs->ns_cptree_lock); > > kfree(root); > <-------- use-after-free I don't know where this happened, but the leading whitespace has been eaten at some point, making this description of the race completely unreadable as everything appears to be done by CPU 0.
Hi, On Fri, Sep 24, 2021 at 9:13 PM Matthew Wilcox <willy@infradead.org> wrote: > > On Tue, Sep 07, 2021 at 08:00:26PM -0700, Andrew Morton wrote: > > From: Zhen Lei <thunder.leizhen@huawei.com> > > Subject: nilfs2: use refcount_dec_and_lock() to fix potential UAF > > > > When the refcount is decreased to 0, the resource reclamation branch is > > entered. Before CPU0 reaches the race point (1), CPU1 may obtain the > > spinlock and traverse the rbtree to find 'root', see nilfs_lookup_root(). > > Although CPU1 will call refcount_inc() to increase the refcount, it is > > obviously too late. CPU0 will release 'root' directly, CPU1 then accesses > > 'root' and triggers UAF. > > > > Use refcount_dec_and_lock() to ensure that both the operations of decrease > > refcount to 0 and link deletion are lock protected eliminates this risk. > > > > CPU0 CPU1 > > nilfs_put_root(): > > <-------- (1) > > spin_lock(&nilfs->ns_cptree_lock); > > rb_erase(&root->rb_node, &nilfs->ns_cptree); > > spin_unlock(&nilfs->ns_cptree_lock); > > > > kfree(root); > > <-------- use-after-free > > I don't know where this happened, but the leading whitespace has been > eaten at some point, making this description of the race completely > unreadable as everything appears to be done by CPU 0. The diagram is the same as that in the original patch by author, and I approved it without any discomfort because these five operations (nilfs_put_root() ~ spin_lock(); rb_erase(); spin_unlock() ~ kfree() ) are all done by CPU0. But, yeah, an example of a function call might have been written on the CPU1 side as well, for instance, a nilfs_lookup_root() call etc, to clarify the race issue the message explains.. Regards, Ryusuke Konishi
--- a/fs/nilfs2/the_nilfs.c~nilfs2-use-refcount_dec_and_lock-to-fix-potential-uaf +++ a/fs/nilfs2/the_nilfs.c @@ -792,14 +792,13 @@ nilfs_find_or_create_root(struct the_nil void nilfs_put_root(struct nilfs_root *root) { - if (refcount_dec_and_test(&root->count)) { - struct the_nilfs *nilfs = root->nilfs; + struct the_nilfs *nilfs = root->nilfs; - nilfs_sysfs_delete_snapshot_group(root); - - spin_lock(&nilfs->ns_cptree_lock); + if (refcount_dec_and_lock(&root->count, &nilfs->ns_cptree_lock)) { rb_erase(&root->rb_node, &nilfs->ns_cptree); spin_unlock(&nilfs->ns_cptree_lock); + + nilfs_sysfs_delete_snapshot_group(root); iput(root->ifile); kfree(root);