Btrfs: fix a tree mod bug while inserting a new root

Message ID 1350914572-4205-1-git-send-email-bo.li.liu@oracle.com (mailing list archive)
State New, archived

Commit Message

Liu Bo Oct. 22, 2012, 2:02 p.m. UTC
According to the btree balance algorithm, when we split a root into two parts,
we insert a new root to be their parent:

                                                 new root
            node A                            /              \
      | x1 x2 x3 x4 x5 x6 |   =>          node A             node A'
                                    | x1 x2 x3 - - - |  | x4 x5 x6 - - - |
                             split

The original root is not freed, because it becomes a child of the new root;
some of its keys are then moved to keep the tree balanced.

So we must not add REMOVE_WHILE_FREEING keys for the old root. Otherwise we
first add REMOVE_WHILE_FREEING keys and then add REMOVE keys for the same
node, which is an invalid sequence and leads to a use-after-free.

Signed-off-by: Liu Bo <bo.li.liu@oracle.com>
---
 fs/btrfs/ctree.c |   16 +++++++++-------
 1 files changed, 9 insertions(+), 7 deletions(-)
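
To make the split in the diagram concrete, here is a minimal user-space sketch.
It is not the btrfs code: struct toy_node, TOY_MAX_KEYS and toy_split_root()
are hypothetical names invented for illustration. It only shows the point the
commit message relies on, namely that the old root is reused as the left child
of the new root and is never freed.

```c
#include <assert.h>
#include <stdlib.h>

#define TOY_MAX_KEYS 6

/* Hypothetical toy node, NOT the kernel's struct extent_buffer. */
struct toy_node {
	int nkeys;
	int keys[TOY_MAX_KEYS];
	struct toy_node *child[2];	/* only used by the new root */
};

/*
 * Split a root: the upper half of its keys moves to a new sibling and a
 * new root is allocated above both. The old root is reused as the left
 * child, so it stays live. Returns the new root, or NULL if allocation
 * fails (in which case the old root is left untouched).
 */
static struct toy_node *toy_split_root(struct toy_node *old_root)
{
	struct toy_node *right = calloc(1, sizeof(*right));
	struct toy_node *new_root = calloc(1, sizeof(*new_root));
	int mid = old_root->nkeys / 2;
	int i;

	if (!right || !new_root) {
		free(right);
		free(new_root);
		return NULL;
	}

	/* Move keys [mid, nkeys) out of the old root into the sibling. */
	for (i = mid; i < old_root->nkeys; i++)
		right->keys[right->nkeys++] = old_root->keys[i];
	old_root->nkeys = mid;

	/* The new root records the first key of each child. */
	new_root->keys[0] = old_root->keys[0];
	new_root->keys[1] = right->keys[0];
	new_root->nkeys = 2;
	new_root->child[0] = old_root;	/* old root survives as "node A" */
	new_root->child[1] = right;	/* "node A'" */
	return new_root;
}
```

Splitting a node that holds x1..x6 this way leaves x1..x3 in the old node and
puts x4..x6 in the sibling, matching the diagram.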

Comments

Jan Schmidt Oct. 22, 2012, 5:05 p.m. UTC | #1
Hi liubo,

On Mon, October 22, 2012 at 16:02 (+0200), Liu Bo wrote:
> According to btree's balance algorithm, when we split a root into two parts,
> we insert a new one to be their parent:
> 
>                                                  new root
>             node A                            /              \
>       | x1 x2 x3 x4 x5 x6 |   =>          node A             node A'
>                                     | x1 x2 x3 - - - |  | x4 x5 x6 - - - |
>                              split
> 
> The original root won't be freed because it becomes a child of the new root,
> and a move to keep balance is needed then.
> 
> So we should not add REMOVE_WHILE_FREEING keys for the old root, otherwise,
> we will hit use-after-free since we first add REMOVE_WHILE_FREEING keys and
> then add REMOVE keys, which is invalid.

I don't like adding another parameter there, the function is already confusing
without it. I've got a different fix for that problem here as well. I haven't
been sending it since Friday because there's at least one additional problem in
the tree mod log, and I'd like to see all of the issues fixed.

There's also a fix for double frees from push_node_left here. That one may be
fixing the other issue you're seeing (which I still cannot reproduce). I'm still
not convinced it's a good idea to change the semantics in del_ptr as done in
your previous patch set.

Probably we can try working together on irc in a more interactive fashion? Or
tell me if you want my patches anywhere before I send them out.

-Jan
--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Liu Bo Oct. 23, 2012, 12:39 a.m. UTC | #2
On 10/23/2012 01:05 AM, Jan Schmidt wrote:
> Hi liubo,
> 
> On Mon, October 22, 2012 at 16:02 (+0200), Liu Bo wrote:
>> According to btree's balance algorithm, when we split a root into two parts,
>> we insert a new one to be their parent:
>>
>>                                                  new root
>>             node A                            /              \
>>       | x1 x2 x3 x4 x5 x6 |   =>          node A             node A'
>>                                     | x1 x2 x3 - - - |  | x4 x5 x6 - - - |
>>                              split
>>
>> The original root won't be freed because it becomes a child of the new root,
>> and a move to keep balance is needed then.
>>
>> So we should not add REMOVE_WHILE_FREEING keys for the old root, otherwise,
>> we will hit use-after-free since we first add REMOVE_WHILE_FREEING keys and
>> then add REMOVE keys, which is invalid.
> 
> I don't like adding another parameter there, the function is already confusing
> without it. I've got a different fix for that problem here as well. I haven't
> been sending it since Friday because there's at least one additional problem in
> the tree mod log, and I'd like to see all of the issues fixed.
> 
> There's also a fix for double frees from push_node_left here. That one may be
> fixing the other issue you're seeing (which I still cannot reproduce). I'm still
> not convinced it's a good idea to change the semantics in del_ptr as done in
> your previous patch set.
> 

If you have better fixes, that'd be good.

> Probably we can try working together on irc in a more interactive fashion? Or
> tell me if you want my patches anywhere before I send them out.
> 

OK, I'm on IRC now, let's rock it ;)

thanks,
liubo


Patch

diff --git a/fs/btrfs/ctree.c b/fs/btrfs/ctree.c
index b334362..26987ef 100644
--- a/fs/btrfs/ctree.c
+++ b/fs/btrfs/ctree.c
@@ -639,7 +639,8 @@  __tree_mod_log_free_eb(struct btrfs_fs_info *fs_info, struct extent_buffer *eb)
 static noinline int
 tree_mod_log_insert_root(struct btrfs_fs_info *fs_info,
 			 struct extent_buffer *old_root,
-			 struct extent_buffer *new_root, gfp_t flags)
+			 struct extent_buffer *new_root,
+			 gfp_t flags, int free_old)
 {
 	struct tree_mod_elem *tm;
 	int ret;
@@ -647,7 +648,8 @@  tree_mod_log_insert_root(struct btrfs_fs_info *fs_info,
 	if (tree_mod_dont_log(fs_info, NULL))
 		return 0;
 
-	__tree_mod_log_free_eb(fs_info, old_root);
+	if (free_old)
+		__tree_mod_log_free_eb(fs_info, old_root);
 
 	ret = tree_mod_alloc(fs_info, flags, &tm);
 	if (ret < 0)
@@ -797,11 +799,11 @@  tree_mod_log_free_eb(struct btrfs_fs_info *fs_info, struct extent_buffer *eb)
 
 static noinline void
 tree_mod_log_set_root_pointer(struct btrfs_root *root,
-			      struct extent_buffer *new_root_node)
+			      struct extent_buffer *new_root_node, int free_old)
 {
 	int ret;
 	ret = tree_mod_log_insert_root(root->fs_info, root->node,
-				       new_root_node, GFP_NOFS);
+				       new_root_node, GFP_NOFS, free_old);
 	BUG_ON(ret < 0);
 }
 
@@ -1029,7 +1031,7 @@  static noinline int __btrfs_cow_block(struct btrfs_trans_handle *trans,
 			parent_start = 0;
 
 		extent_buffer_get(cow);
-		tree_mod_log_set_root_pointer(root, cow);
+		tree_mod_log_set_root_pointer(root, cow, 1);
 		rcu_assign_pointer(root->node, cow);
 
 		btrfs_free_tree_block(trans, root, buf, parent_start,
@@ -1725,7 +1727,7 @@  static noinline int balance_level(struct btrfs_trans_handle *trans,
 			goto enospc;
 		}
 
-		tree_mod_log_set_root_pointer(root, child);
+		tree_mod_log_set_root_pointer(root, child, 1);
 		rcu_assign_pointer(root->node, child);
 
 		add_root_to_dirty_list(root);
@@ -3107,7 +3109,7 @@  static noinline int insert_new_root(struct btrfs_trans_handle *trans,
 	btrfs_mark_buffer_dirty(c);
 
 	old = root->node;
-	tree_mod_log_set_root_pointer(root, c);
+	tree_mod_log_set_root_pointer(root, c, 0);
 	rcu_assign_pointer(root->node, c);
 
 	/* the super has an extra ref to root->node */
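
The effect of the new free_old argument can be modelled outside the kernel.
The sketch below is an assumption-laden toy, not the tree mod log
implementation: the toy_* names, the fixed-size log and the single-node slot
numbering are all invented for illustration. It records the same kinds of
entries the commit message talks about and checks for the sequence the patch
is meant to avoid, a REMOVE logged for a slot that was already logged as
REMOVE_WHILE_FREEING.

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-ins for the log operations named in the commit message. */
enum toy_op { TOY_REMOVE_WHILE_FREEING, TOY_REMOVE, TOY_ROOT_REPLACE };

struct toy_entry {
	enum toy_op op;
	int slot;		/* slot in the old root, -1 if not applicable */
};

#define TOY_LOG_CAP 64

struct toy_log {
	size_t n;
	struct toy_entry e[TOY_LOG_CAP];
};

static void toy_log_add(struct toy_log *log, enum toy_op op, int slot)
{
	assert(log->n < TOY_LOG_CAP);
	log->e[log->n].op = op;
	log->e[log->n].slot = slot;
	log->n++;
}

/*
 * Model of logging a root replacement. With free_old set, every slot of
 * the old root is first logged as removed-while-freeing; with free_old
 * clear, only the root replacement itself is logged.
 */
static void toy_log_insert_root(struct toy_log *log, int old_nritems,
				int free_old)
{
	int i;

	if (free_old)
		for (i = old_nritems - 1; i >= 0; i--)
			toy_log_add(log, TOY_REMOVE_WHILE_FREEING, i);
	toy_log_add(log, TOY_ROOT_REPLACE, -1);
}

/* Return 1 if a slot is logged as REMOVE after it was logged as freed. */
static int toy_log_has_remove_after_free(const struct toy_log *log)
{
	size_t i, j;

	for (i = 0; i < log->n; i++) {
		if (log->e[i].op != TOY_REMOVE_WHILE_FREEING)
			continue;
		for (j = i + 1; j < log->n; j++)
			if (log->e[j].op == TOY_REMOVE &&
			    log->e[j].slot == log->e[i].slot)
				return 1;
	}
	return 0;
}
```

In this model, calling toy_log_insert_root() with free_old = 1 and then
logging a TOY_REMOVE for one of the old root's slots trips the checker, while
the same sequence with free_old = 0 (the insert_new_root case in the patch)
does not.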