From patchwork Thu Aug 9 18:04:05 2012 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Chris Mason X-Patchwork-Id: 1302601 Return-Path: X-Original-To: patchwork-linux-btrfs@patchwork.kernel.org Delivered-To: patchwork-process-083081@patchwork2.kernel.org Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by patchwork2.kernel.org (Postfix) with ESMTP id 80489DFF7B for ; Thu, 9 Aug 2012 18:04:18 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1758564Ab2HISEK (ORCPT ); Thu, 9 Aug 2012 14:04:10 -0400 Received: from mx2.fusionio.com ([66.114.96.31]:50314 "EHLO mx2.fusionio.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1758359Ab2HISEI (ORCPT ); Thu, 9 Aug 2012 14:04:08 -0400 X-ASG-Debug-ID: 1344535447-0421b53abf71800001-6jHSXT Received: from mail1.int.fusionio.com (mail1.int.fusionio.com [10.101.1.21]) by mx2.fusionio.com with ESMTP id 2Z5S6cD1KqNZqIsv (version=TLSv1 cipher=AES128-SHA bits=128 verify=NO); Thu, 09 Aug 2012 12:04:07 -0600 (MDT) X-Barracuda-Envelope-From: clmason@fusionio.com Received: from localhost (67.247.67.114) by mail.fusionio.com (10.101.1.19) with Microsoft SMTP Server (TLS) id 8.3.83.0; Thu, 9 Aug 2012 12:04:06 -0600 Date: Thu, 9 Aug 2012 14:04:05 -0400 From: Chris Mason To: Miao Xie CC: Linux Btrfs , David Sterba Subject: Re: [RFC PATCH] Btrfs: fix full backref problem when inserting shared block reference Message-ID: <20120809180405.GC32103@shiny> X-ASG-Orig-Subj: Re: [RFC PATCH] Btrfs: fix full backref problem when inserting shared block reference Mail-Followup-To: Chris Mason , Miao Xie , Linux Btrfs , David Sterba References: <50232A19.2010704@cn.fujitsu.com> MIME-Version: 1.0 Content-Disposition: inline In-Reply-To: <50232A19.2010704@cn.fujitsu.com> User-Agent: Mutt/1.5.21 (2011-07-01) X-Barracuda-Connect: mail1.int.fusionio.com[10.101.1.21] X-Barracuda-Start-Time: 1344535447 X-Barracuda-Encrypted: AES128-SHA X-Barracuda-URL: http://10.101.1.181:8000/cgi-mod/mark.cgi X-Virus-Scanned: by bsmtpd at fusionio.com X-Barracuda-Spam-Score: 0.00 X-Barracuda-Spam-Status: No, SCORE=0.00 using global scores of TAG_LEVEL=1000.0 QUARANTINE_LEVEL=1000.0 KILL_LEVEL=9.0 tests= X-Barracuda-Spam-Report: Code version 3.2, rules version 3.2.2.105160 Rule breakdown below pts rule name description ---- ---------------------- -------------------------------------------------- Sender: linux-btrfs-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-btrfs@vger.kernel.org On Wed, Aug 08, 2012 at 09:10:17PM -0600, Miao Xie wrote: > If we create several snapshots at the same time, the following BUG_ON() will be > triggered. > > kernel BUG at fs/btrfs/extent-tree.c:6047! > > Steps to reproduce: > # mkfs.btrfs > # mount > # cd > # for ((i=0;i<2400;i++)); do touch long_name_to_make_tree_more_deep$i; done > # for ((i=0; i<4; i++)) > > do > > mkdir $i > > for ((j=0; j<200; j++)) > > do > > btrfs sub snap . $i/$j > > done & > > done snapshot creation has a critical section. Once we copy a given root to its snapshot, we're not allowed to change it until the transaction is fully committed. This means that if you're taking a snapshot of root A and storing the directory item of the snapshot in root A, you can only do it once per transaction with getting into trouble. Looks like the code doesn't enforce this though. Dave, could you please give this a try: --- To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html diff --git a/fs/btrfs/ctree.c b/fs/btrfs/ctree.c index fcc8c21..9e7c621 100644 --- a/fs/btrfs/ctree.c +++ b/fs/btrfs/ctree.c @@ -1324,6 +1324,8 @@ noinline int btrfs_cow_block(struct btrfs_trans_handle *trans, u64 search_start; int ret; + WARN_ON(root->danger_transid == trans->transid); + if (trans->transaction != root->fs_info->running_transaction) { printk(KERN_CRIT "trans %llu running %llu\n", (unsigned long long)trans->transid, diff --git a/fs/btrfs/ctree.h b/fs/btrfs/ctree.h index 0d195b5..35b5603 100644 --- a/fs/btrfs/ctree.h +++ b/fs/btrfs/ctree.h @@ -1490,6 +1490,12 @@ struct btrfs_root { u64 objectid; u64 last_trans; + /* + * the last transaction that took a snapshot of this + * root. We're only allowed one snapshot per root per transaction + */ + u64 snapshot_trans; + /* data allocations are done in sectorsize units */ u32 sectorsize; @@ -1550,6 +1556,13 @@ struct btrfs_root { int force_cow; + /* + * this marks the critical section of snapshot creation. If we + * make any changes to a root after this critical section starts, + * we corrupt the FS. It is checked by btrfs_cow_block + */ + u64 danger_transid; + spinlock_t root_times_lock; }; diff --git a/fs/btrfs/ioctl.c b/fs/btrfs/ioctl.c index 9df50fa..b20d835 100644 --- a/fs/btrfs/ioctl.c +++ b/fs/btrfs/ioctl.c @@ -524,13 +524,28 @@ static int create_snapshot(struct btrfs_root *root, struct dentry *dentry, pending_snapshot->inherit = *inherit; *inherit = NULL; /* take responsibility to free it */ } - +again: trans = btrfs_start_transaction(root->fs_info->extent_root, 5); if (IS_ERR(trans)) { ret = PTR_ERR(trans); goto fail; } + /* we're only allowed to snapshot a given root once per transaction */ + spin_lock(&root->fs_info->trans_lock); + if (root->snapshot_trans == trans->transid) { + spin_unlock(&root->fs_info->trans_lock); + ret = btrfs_commit_transaction(trans, root->fs_info->extent_root); + if (ret) + goto fail; + goto again; + } + + root->snapshot_trans = trans->transid; + + spin_unlock(&root->fs_info->trans_lock); + + ret = btrfs_snap_reserve_metadata(trans, pending_snapshot); BUG_ON(ret); diff --git a/fs/btrfs/transaction.c b/fs/btrfs/transaction.c index 27c2600..4507421 100644 --- a/fs/btrfs/transaction.c +++ b/fs/btrfs/transaction.c @@ -1093,6 +1093,7 @@ static noinline int create_pending_snapshot(struct btrfs_trans_handle *trans, /* see comments in should_cow_block() */ root->force_cow = 1; + root->danger_transid = trans->transid; smp_wmb(); btrfs_set_root_node(new_root_item, tmp);