diff mbox

[V2] Btrfs: keep dropped roots in cache until transaction commit

Message ID 1442326024-2536-1-git-send-email-jbacik@fb.com (mailing list archive)
State Superseded
Headers show

Commit Message

Josef Bacik Sept. 15, 2015, 2:07 p.m. UTC
When dropping a snapshot we need to account for the qgroup changes.  If we drop
the snapshot in all one go then the backref code will fail to find blocks from
the snapshot we dropped since it won't be able to find the root in the fs root
cache.  This can lead to us failing to find refs from other roots that pointed
at blocks in the now deleted root.  To handle this we need to not remove the fs
roots from the cache until after we process the qgroup operations.  Do this by
adding dropped roots to a list on the transaction, and letting the transaction
remove the roots at the same time it drops the commit roots.  This will keep all
of the backref searching code in sync properly, and fixes a problem Mark was
seeing with snapshot delete and qgroups.  Thanks,

Signed-off-by: Josef Bacik <jbacik@fb.com>
---
V1->V2:
-clear the trans tag when we are finished dropping the subvol so we don't try to
  update the root during commit.

 fs/btrfs/extent-tree.c |  2 +-
 fs/btrfs/transaction.c | 30 ++++++++++++++++++++++++++++++
 fs/btrfs/transaction.h |  5 ++++-
 3 files changed, 35 insertions(+), 2 deletions(-)

Comments

Holger Hoffstätte Sept. 15, 2015, 3:50 p.m. UTC | #1
On 09/15/15 16:07, Josef Bacik wrote:
> When dropping a snapshot we need to account for the qgroup changes.  If we drop
> the snapshot in all one go then the backref code will fail to find blocks from
> the snapshot we dropped since it won't be able to find the root in the fs root
> cache.  This can lead to us failing to find refs from other roots that pointed
> at blocks in the now deleted root.  To handle this we need to not remove the fs
> roots from the cache until after we process the qgroup operations.  Do this by
> adding dropped roots to a list on the transaction, and letting the transaction
> remove the roots at the same time it drops the commit roots.  This will keep all
> of the backref searching code in sync properly, and fixes a problem Mark was
> seeing with snapshot delete and qgroups.  Thanks,
> 
> Signed-off-by: Josef Bacik <jbacik@fb.com>
> ---
> V1->V2:
> -clear the trans tag when we are finished dropping the subvol so we don't try to
>   update the root during commit.

This V2 does indeed seem to fix the issues I reported with snapshot deletion &
concurrent sync. I've now created/filled/deleted countless snapshots while issuing
sync(s) in parallel, and the problem that I saw fairly frequently with V1 no longer
seems to occur here. Therefore:

Tested-by: Holger Hoffstätte <holger.hoffstaette@googlemail.com>

Thanks for the quick fix, Josef!

cheers
Holger

--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Josef Bacik Sept. 15, 2015, 3:56 p.m. UTC | #2
On 09/15/2015 11:50 AM, Holger Hoffstätte wrote:
> On 09/15/15 16:07, Josef Bacik wrote:
>> When dropping a snapshot we need to account for the qgroup changes.  If we drop
>> the snapshot in all one go then the backref code will fail to find blocks from
>> the snapshot we dropped since it won't be able to find the root in the fs root
>> cache.  This can lead to us failing to find refs from other roots that pointed
>> at blocks in the now deleted root.  To handle this we need to not remove the fs
>> roots from the cache until after we process the qgroup operations.  Do this by
>> adding dropped roots to a list on the transaction, and letting the transaction
>> remove the roots at the same time it drops the commit roots.  This will keep all
>> of the backref searching code in sync properly, and fixes a problem Mark was
>> seeing with snapshot delete and qgroups.  Thanks,
>>
>> Signed-off-by: Josef Bacik <jbacik@fb.com>
>> ---
>> V1->V2:
>> -clear the trans tag when we are finished dropping the subvol so we don't try to
>>    update the root during commit.
>
> This V2 does indeed seem to fix the issues I reported with snapshot deletion &
> concurrent sync. I've now created/filled/deleted countless snapshots while issuing
> sync(s) in parallel, and the problem that I saw fairly frequently with V1 no longer
> seems to occur here. Therefore:
>
> Tested-by: Holger Hoffstätte <holger.hoffstaette@googlemail.com>
>
> Thanks for the quick fix, Josef!
>

Great thanks,

Josef

--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Holger Hoffstätte Sept. 15, 2015, 7:08 p.m. UTC | #3
On 09/15/15 17:50, Holger Hoffstätte wrote:
> This V2 does indeed seem to fix the issues I reported with snapshot
> deletion & concurrent sync. I've now created/filled/deleted countless
> snapshots while issuing sync(s) in parallel, and the problem that I
> saw fairly frequently with V1 no longer seems to occur here.

Well..I may have spoken too soon:

[20502.470227] ------------[ cut here ]------------
[20502.470247] WARNING: CPU: 0 PID: 27322 at fs/btrfs/transaction.c:66 btrfs_put_transaction+0xb7/0x100 [btrfs]()
[20502.470248] Modules linked in: btrfs nfsd auth_rpcgss oid_registry lockd grace sunrpc sch_fq_codel xor snd_hda_codec_hdmi snd_hda_codec_realtek snd_hda_codec_generic x86_pkg_temp_thermal coretemp raid6_pq crc32_pclmul i915 crc32c_intel aesni_intel snd_hda_intel aes_x86_64 intel_gtt glue_helper lrw snd_hda_controller i2c_algo_bit gf128mul ablk_helper drm_kms_helper snd_hda_codec cryptd usbhid snd_hda_core drm snd_pcm i2c_i801 snd_timer snd r8169 i2c_core soundcore mii video [last unloaded: btrfs]
[20502.470279] CPU: 0 PID: 27322 Comm: btrfs-transacti Not tainted 4.1.7 #1
[20502.470280] Hardware name: System manufacturer System Product Name/P8Z68-V LX, BIOS 4105 07/01/2013
[20502.470281]  ffffffffa0843635 ffff8800c096fd48 ffffffff8156eae1 0000000000000007
[20502.470282]  0000000000000000 ffff8800c096fd88 ffffffff8105098a 0000000000000296
[20502.470284]  ffff8802865a1d40 ffff8802865a1d40 ffff88014b5b5678 0000000000000000
[20502.470285] Call Trace:
[20502.470289]  [<ffffffff8156eae1>] dump_stack+0x45/0x57
[20502.470292]  [<ffffffff8105098a>] warn_slowpath_common+0x8a/0xc0
[20502.470293]  [<ffffffff81050a7a>] warn_slowpath_null+0x1a/0x20
[20502.470298]  [<ffffffffa07c1027>] btrfs_put_transaction+0xb7/0x100 [btrfs]
[20502.470303]  [<ffffffffa07c280d>] btrfs_commit_transaction+0xb3d/0xc10 [btrfs]
[20502.470308]  [<ffffffffa07bd305>] transaction_kthread+0x245/0x260 [btrfs]
[20502.470313]  [<ffffffffa07bd0c0>] ? btrfs_cleanup_transaction+0x540/0x540 [btrfs]
[20502.470315]  [<ffffffff8106ddc9>] kthread+0xc9/0xe0
[20502.470317]  [<ffffffff8106dd00>] ? kthread_worker_fn+0x170/0x170
[20502.470319]  [<ffffffff81573f52>] ret_from_fork+0x42/0x70
[20502.470320]  [<ffffffff8106dd00>] ? kthread_worker_fn+0x170/0x170
[20502.470321] ---[ end trace 0556ec192b60907f ]---
[20532.457806] ------------[ cut here ]------------
[20532.457828] WARNING: CPU: 0 PID: 27322 at fs/btrfs/transaction.c:66 btrfs_put_transaction+0xb7/0x100 [btrfs]()
[20532.457829] Modules linked in: btrfs nfsd auth_rpcgss oid_registry lockd grace sunrpc sch_fq_codel xor snd_hda_codec_hdmi snd_hda_codec_realtek snd_hda_codec_generic x86_pkg_temp_thermal coretemp raid6_pq crc32_pclmul i915 crc32c_intel aesni_intel snd_hda_intel aes_x86_64 intel_gtt glue_helper lrw snd_hda_controller i2c_algo_bit gf128mul ablk_helper drm_kms_helper snd_hda_codec cryptd usbhid snd_hda_core drm snd_pcm i2c_i801 snd_timer snd r8169 i2c_core soundcore mii video [last unloaded: btrfs]
[20532.457872] CPU: 0 PID: 27322 Comm: btrfs-transacti Tainted: G        W       4.1.7 #1
[20532.457873] Hardware name: System manufacturer System Product Name/P8Z68-V LX, BIOS 4105 07/01/2013
[20532.457873]  ffffffffa0843635 ffff8800c096fd48 ffffffff8156eae1 0000000000000007
[20532.457875]  0000000000000000 ffff8800c096fd88 ffffffff8105098a 0000000000000296
[20532.457876]  ffff8802865a1d40 ffff8802865a1d40 ffff8801123dc678 0000000000000000
[20532.457878] Call Trace:
[20532.457882]  [<ffffffff8156eae1>] dump_stack+0x45/0x57
[20532.457884]  [<ffffffff8105098a>] warn_slowpath_common+0x8a/0xc0
[20532.457885]  [<ffffffff81050a7a>] warn_slowpath_null+0x1a/0x20
[20532.457890]  [<ffffffffa07c1027>] btrfs_put_transaction+0xb7/0x100 [btrfs]
[20532.457895]  [<ffffffffa07c280d>] btrfs_commit_transaction+0xb3d/0xc10 [btrfs]
[20532.457901]  [<ffffffffa07bd305>] transaction_kthread+0x245/0x260 [btrfs]
[20532.457905]  [<ffffffffa07bd0c0>] ? btrfs_cleanup_transaction+0x540/0x540 [btrfs]
[20532.457907]  [<ffffffff8106ddc9>] kthread+0xc9/0xe0
[20532.457909]  [<ffffffff8106dd00>] ? kthread_worker_fn+0x170/0x170
[20532.457911]  [<ffffffff81573f52>] ret_from_fork+0x42/0x70
[20532.457912]  [<ffffffff8106dd00>] ? kthread_worker_fn+0x170/0x170
[20532.457913] ---[ end trace 0556ec192b609080 ]---

As expected pid 27322 is the [btrfs-transacti] thread. It now seems to happen
(just did twice in a row) when I delete a snapshot but *don't* sync, but instead
just let it smurf along all by itself.

This warning is from:

..
void btrfs_put_transaction(struct btrfs_transaction *transaction)
{
	WARN_ON(atomic_read(&transaction->use_count) == 0);
	if (atomic_dec_and_test(&transaction->use_count)) {
		BUG_ON(!list_empty(&transaction->list));
		WARN_ON(!RB_EMPTY_ROOT(&transaction->delayed_refs.href_root));
		if (transaction->delayed_refs.pending_csums)
..

Once this happens unmounting results in:

..blabla..
[21618.982176] BTRFS error (device sdc1): unable to find ref byte nr 628535836672 parent 0 root 1 owner 0 offset 0
[21618.982176] ------------[ cut here ]------------
[21618.982181] WARNING: CPU: 1 PID: 387 at fs/btrfs/extent-tree.c:6363 __btrfs_free_extent.isra.33+0x960/0xdf0 [btrfs]()
[21618.982181] BTRFS: Transaction aborted (error -2)
[21618.982182] Modules linked in: btrfs nfsd auth_rpcgss oid_registry lockd grace sunrpc sch_fq_codel xor snd_hda_codec_hdmi snd_hda_codec_realtek snd_hda_codec_generic x86_pkg_temp_thermal coretemp raid6_pq crc32_pclmul i915 crc32c_intel aesni_intel snd_hda_intel aes_x86_64 intel_gtt glue_helper lrw snd_hda_controller i2c_algo_bit gf128mul ablk_helper drm_kms_helper snd_hda_codec cryptd usbhid snd_hda_core drm snd_pcm i2c_i801 snd_timer snd r8169 i2c_core soundcore mii video [last unloaded: btrfs]
[21618.982196] CPU: 1 PID: 387 Comm: umount Tainted: G        W       4.1.7 #1
[21618.982196] Hardware name: System manufacturer System Product Name/P8Z68-V LX, BIOS 4105 07/01/2013
[21618.982197]  ffffffffa0842df0 ffff88014336b948 ffffffff8156eae1 0000000000000007
[21618.982198]  ffff88014336b998 ffff88014336b988 ffffffff8105098a ffffffffa0842d5a
[21618.982199]  00000000fffffffe 0000009257a7c000 0000000000000000 ffff8803c9350000
[21618.982201] Call Trace:
[21618.982202]  [<ffffffff8156eae1>] dump_stack+0x45/0x57
[21618.982203]  [<ffffffff8105098a>] warn_slowpath_common+0x8a/0xc0
[21618.982204]  [<ffffffff81050a06>] warn_slowpath_fmt+0x46/0x50
[21618.982208]  [<ffffffffa07a5330>] __btrfs_free_extent.isra.33+0x960/0xdf0 [btrfs]
[21618.982214]  [<ffffffffa080ce1d>] ? btrfs_delayed_ref_lock+0x3d/0x270 [btrfs]
[21618.982219]  [<ffffffffa07a9802>] __btrfs_run_delayed_refs+0x9c2/0x1110 [btrfs]
[21618.982224]  [<ffffffffa07acdb3>] btrfs_run_delayed_refs.part.38+0x73/0x280 [btrfs]
[21618.982228]  [<ffffffffa07b0dbe>] btrfs_start_dirty_block_groups+0x35e/0x410 [btrfs]
[21618.982234]  [<ffffffffa07c1e95>] btrfs_commit_transaction+0x1c5/0xc10 [btrfs]
[21618.982239]  [<ffffffffa07c2976>] ? start_transaction+0x96/0x510 [btrfs]
[21618.982244]  [<ffffffffa07bba23>] btrfs_commit_super+0x93/0xa0 [btrfs]
[21618.982248]  [<ffffffffa07bd5b4>] close_ctree+0x294/0x350 [btrfs]
[21618.982252]  [<ffffffffa0790dd9>] btrfs_put_super+0x19/0x20 [btrfs]
[21618.982253]  [<ffffffff8117535a>] generic_shutdown_super+0x6a/0xf0
[21618.982254]  [<ffffffff81175646>] kill_anon_super+0x16/0x30
[21618.982258]  [<ffffffffa07906e8>] btrfs_kill_super+0x18/0x110 [btrfs]
[21618.982259]  [<ffffffff811759f9>] deactivate_locked_super+0x49/0x80
[21618.982260]  [<ffffffff81175e6c>] deactivate_super+0x6c/0x80
[21618.982261]  [<ffffffff81192a53>] cleanup_mnt+0x43/0xa0
[21618.982263]  [<ffffffff81192b02>] __cleanup_mnt+0x12/0x20
[21618.982264]  [<ffffffff8106c387>] task_work_run+0xa7/0xe0
[21618.982265]  [<ffffffff81002b9d>] do_notify_resume+0x6d/0x70
[21618.982266]  [<ffffffff81573d42>] int_signal+0x12/0x17
[21618.982267] ---[ end trace 0556ec192b609086 ]---
[21618.982269] BTRFS: error (device sdc1) in __btrfs_free_extent:6363: errno=-2 No such entry
[21618.982800] BTRFS info (device sdc1): forced readonly
[21618.982802] BTRFS: error (device sdc1) in btrfs_run_delayed_refs:2863: errno=-2 No such entry
[21618.983332] BTRFS error (device sdc1): commit super ret -2
[21618.983407] BTRFS error (device sdc1): cleaner transaction attach returned -30

Sorry..

-h
--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Josef Bacik Sept. 15, 2015, 7:15 p.m. UTC | #4
On 09/15/2015 03:08 PM, Holger Hoffstätte wrote:
> On 09/15/15 17:50, Holger Hoffstätte wrote:
>> This V2 does indeed seem to fix the issues I reported with snapshot
>> deletion & concurrent sync. I've now created/filled/deleted countless
>> snapshots while issuing sync(s) in parallel, and the problem that I
>> saw fairly frequently with V1 no longer seems to occur here.
>
> Well..I may have spoken too soon:

Huh this doesn't seem related to my stuff.  Can I have your script you 
are running to reproduce this?  Thanks,

Josef

--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Holger Hoffstätte Sept. 15, 2015, 7:39 p.m. UTC | #5
On 09/15/15 21:15, Josef Bacik wrote:
> On 09/15/2015 03:08 PM, Holger Hoffstätte wrote:
>> On 09/15/15 17:50, Holger Hoffstätte wrote:
>>> This V2 does indeed seem to fix the issues I reported with snapshot
>>> deletion & concurrent sync. I've now created/filled/deleted countless
>>> snapshots while issuing sync(s) in parallel, and the problem that I
>>> saw fairly frequently with V1 no longer seems to occur here.
>>
>> Well..I may have spoken too soon:
> 
> Huh this doesn't seem related to my stuff. Can I have your script you
> are running to reproduce this? Thanks,

This was manual - deleted two large files, deleted two snapshots (one
from the modified subvolume and one sibling), let it sit in the background
while doing something else for a few seconds and found the messages.
Don't know how to reproduce this as it didn't show up previously.

-h

--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Holger Hoffstätte Sept. 16, 2015, 8:58 a.m. UTC | #6
On 09/15/15 21:15, Josef Bacik wrote:
> On 09/15/2015 03:08 PM, Holger Hoffstätte wrote:
>> On 09/15/15 17:50, Holger Hoffstätte wrote:
>>> This V2 does indeed seem to fix the issues I reported with snapshot
>>> deletion & concurrent sync. I've now created/filled/deleted countless
>>> snapshots while issuing sync(s) in parallel, and the problem that I
>>> saw fairly frequently with V1 no longer seems to occur here.
>>
>> Well..I may have spoken too soon:
> 
> Huh this doesn't seem related to my stuff.  Can I have your script
> you are running to reproduce this?  Thanks,
> 
> Josef

It seemed to be related to Omar's free-space-tree - I saw the same bug
a bit later on a different device without snapshot deletion, just
deleting two large files. Sorry for the false positive.

-h

--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Josef Bacik Sept. 16, 2015, 1:50 p.m. UTC | #7
On 09/16/2015 04:58 AM, Holger Hoffstätte wrote:
> On 09/15/15 21:15, Josef Bacik wrote:
>> On 09/15/2015 03:08 PM, Holger Hoffstätte wrote:
>>> On 09/15/15 17:50, Holger Hoffstätte wrote:
>>>> This V2 does indeed seem to fix the issues I reported with snapshot
>>>> deletion & concurrent sync. I've now created/filled/deleted countless
>>>> snapshots while issuing sync(s) in parallel, and the problem that I
>>>> saw fairly frequently with V1 no longer seems to occur here.
>>>
>>> Well..I may have spoken too soon:
>>
>> Huh this doesn't seem related to my stuff.  Can I have your script
>> you are running to reproduce this?  Thanks,
>>
>> Josef
>
> It seemed to be related to Omar's free-space-tree - I saw the same bug
> a bit later on a different device without snapshot deletion, just
> deleting two large files. Sorry for the false positive.
>

Omar, do you have time to look into this?  Were they highly fragmented 
files or just giant contiguous files?  Thanks,

Josef

--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Holger Hoffstätte Sept. 16, 2015, 2 p.m. UTC | #8
On 09/16/15 15:50, Josef Bacik wrote:
> On 09/16/2015 04:58 AM, Holger Hoffstätte wrote:
>> On 09/15/15 21:15, Josef Bacik wrote:
>>> On 09/15/2015 03:08 PM, Holger Hoffstätte wrote:
>>>> On 09/15/15 17:50, Holger Hoffstätte wrote:
>>>>> This V2 does indeed seem to fix the issues I reported with snapshot
>>>>> deletion & concurrent sync. I've now created/filled/deleted countless
>>>>> snapshots while issuing sync(s) in parallel, and the problem that I
>>>>> saw fairly frequently with V1 no longer seems to occur here.
>>>>
>>>> Well..I may have spoken too soon:
>>>
>>> Huh this doesn't seem related to my stuff.  Can I have your script
>>> you are running to reproduce this?  Thanks,
>>>
>>> Josef
>>
>> It seemed to be related to Omar's free-space-tree - I saw the same bug
>> a bit later on a different device without snapshot deletion, just
>> deleting two large files. Sorry for the false positive.
>>
> 
> Omar, do you have time to look into this? Were they highly fragmented
> files or just giant contiguous files? Thanks,

These were very contiguous (2* ~1.3 GB, 1-3 extents each). I already sent
Omar a long klog dump with more background info. The biggest issue is that
after this bug the volume is hosed since the fst sets an incompat flag,
and after mounting rw just stumbles again trying to replay.
I'll keep trying to isolate exactly how and when this happens.

-h

--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Omar Sandoval Sept. 18, 2015, 6:28 p.m. UTC | #9
On Wed, Sep 16, 2015 at 09:50:17AM -0400, Josef Bacik wrote:
> On 09/16/2015 04:58 AM, Holger Hoffstätte wrote:
> >On 09/15/15 21:15, Josef Bacik wrote:
> >>On 09/15/2015 03:08 PM, Holger Hoffstätte wrote:
> >>>On 09/15/15 17:50, Holger Hoffstätte wrote:
> >>>>This V2 does indeed seem to fix the issues I reported with snapshot
> >>>>deletion & concurrent sync. I've now created/filled/deleted countless
> >>>>snapshots while issuing sync(s) in parallel, and the problem that I
> >>>>saw fairly frequently with V1 no longer seems to occur here.
> >>>
> >>>Well..I may have spoken too soon:
> >>
> >>Huh this doesn't seem related to my stuff.  Can I have your script
> >>you are running to reproduce this?  Thanks,
> >>
> >>Josef
> >
> >It seemed to be related to Omar's free-space-tree - I saw the same bug
> >a bit later on a different device without snapshot deletion, just
> >deleting two large files. Sorry for the false positive.
> >
> 
> Omar, do you have time to look into this?  Were they highly fragmented files
> or just giant contiguous files?  Thanks,
> 
> Josef
> 

Yeah, I'll take a look over the weekend or next week at the latest.
diff mbox

Patch

diff --git a/fs/btrfs/extent-tree.c b/fs/btrfs/extent-tree.c
index ab81135..2327d4c 100644
--- a/fs/btrfs/extent-tree.c
+++ b/fs/btrfs/extent-tree.c
@@ -8596,7 +8596,7 @@  int btrfs_drop_snapshot(struct btrfs_root *root,
 	}
 
 	if (test_bit(BTRFS_ROOT_IN_RADIX, &root->state)) {
-		btrfs_drop_and_free_fs_root(tree_root->fs_info, root);
+		btrfs_add_dropped_root(trans, root);
 	} else {
 		free_extent_buffer(root->node);
 		free_extent_buffer(root->commit_root);
diff --git a/fs/btrfs/transaction.c b/fs/btrfs/transaction.c
index f5021fc..cd15f39 100644
--- a/fs/btrfs/transaction.c
+++ b/fs/btrfs/transaction.c
@@ -117,6 +117,16 @@  static noinline void switch_commit_roots(struct btrfs_transaction *trans,
 			btrfs_unpin_free_ino(root);
 		clear_btree_io_tree(&root->dirty_log_pages);
 	}
+
+	/* We can free old roots now. */
+	spin_lock(&trans->dropped_roots_lock);
+	while (!list_empty(&trans->dropped_roots)) {
+		root = list_first_entry(&trans->dropped_roots,
+					struct btrfs_root, root_list);
+		list_del_init(&root->root_list);
+		btrfs_drop_and_free_fs_root(fs_info, root);
+	}
+	spin_unlock(&trans->dropped_roots_lock);
 	up_write(&fs_info->commit_root_sem);
 }
 
@@ -255,9 +265,11 @@  loop:
 	INIT_LIST_HEAD(&cur_trans->pending_ordered);
 	INIT_LIST_HEAD(&cur_trans->dirty_bgs);
 	INIT_LIST_HEAD(&cur_trans->io_bgs);
+	INIT_LIST_HEAD(&cur_trans->dropped_roots);
 	mutex_init(&cur_trans->cache_write_mutex);
 	cur_trans->num_dirty_bgs = 0;
 	spin_lock_init(&cur_trans->dirty_bgs_lock);
+	spin_lock_init(&cur_trans->dropped_roots_lock);
 	list_add_tail(&cur_trans->list, &fs_info->trans_list);
 	extent_io_tree_init(&cur_trans->dirty_pages,
 			     fs_info->btree_inode->i_mapping);
@@ -334,6 +346,24 @@  static int record_root_in_trans(struct btrfs_trans_handle *trans,
 }
 
 
+void btrfs_add_dropped_root(struct btrfs_trans_handle *trans,
+			    struct btrfs_root *root)
+{
+	struct btrfs_transaction *cur_trans = trans->transaction;
+
+	/* Add ourselves to the transaction dropped list */
+	spin_lock(&cur_trans->dropped_roots_lock);
+	list_add_tail(&root->root_list, &cur_trans->dropped_roots);
+	spin_unlock(&cur_trans->dropped_roots_lock);
+
+	/* Make sure we don't try to update the root at commit time */
+	spin_lock(&root->fs_info->fs_roots_radix_lock);
+	radix_tree_tag_clear(&root->fs_info->fs_roots_radix,
+			     (unsigned long)root->root_key.objectid,
+			     BTRFS_ROOT_TRANS_TAG);
+	spin_unlock(&root->fs_info->fs_roots_radix_lock);
+}
+
 int btrfs_record_root_in_trans(struct btrfs_trans_handle *trans,
 			       struct btrfs_root *root)
 {
diff --git a/fs/btrfs/transaction.h b/fs/btrfs/transaction.h
index eb09c20..ef4ff38 100644
--- a/fs/btrfs/transaction.h
+++ b/fs/btrfs/transaction.h
@@ -65,6 +65,7 @@  struct btrfs_transaction {
 	struct list_head switch_commits;
 	struct list_head dirty_bgs;
 	struct list_head io_bgs;
+	struct list_head dropped_roots;
 	u64 num_dirty_bgs;
 
 	/*
@@ -74,6 +75,7 @@  struct btrfs_transaction {
 	 */
 	struct mutex cache_write_mutex;
 	spinlock_t dirty_bgs_lock;
+	spinlock_t dropped_roots_lock;
 	struct btrfs_delayed_ref_root delayed_refs;
 	int aborted;
 	int dirty_bg_run;
@@ -214,5 +216,6 @@  int btrfs_transaction_blocked(struct btrfs_fs_info *info);
 int btrfs_transaction_in_commit(struct btrfs_fs_info *info);
 void btrfs_put_transaction(struct btrfs_transaction *transaction);
 void btrfs_apply_pending_changes(struct btrfs_fs_info *fs_info);
-
+void btrfs_add_dropped_root(struct btrfs_trans_handle *trans,
+			    struct btrfs_root *root);
 #endif