[v2,9/9] Btrfs: add free space tree mount option

Message ID de086134d128aad13d16b2aabc72918d7ec7637e.1441309178.git.osandov@fb.com (mailing list archive)
State Superseded

Commit Message

Omar Sandoval Sept. 3, 2015, 7:44 p.m. UTC
Now we can finally hook up everything so we can actually use the free space
tree. On the first mount with the free_space_tree mount option, the free
space tree will be created and the FREE_SPACE_TREE read-only compat bit
will be set. Any time the filesystem is mounted from then on, we will
use the free space tree.

Having both the free space cache and free space trees enabled is
nonsense, so we don't allow that to happen. Since mkfs sets the
superblock cache generation to -1, this means that the filesystem will
have to be mounted with nospace_cache,free_space_tree to create the free
space trees on first mount. Once the FREE_SPACE_TREE bit is set, the
cache generation is ignored when mounting. This is all a little more
complicated than would be ideal, but at some point we can presumably
make the free space tree the default and stop setting the cache
generation in mkfs.

Reviewed-by: Josef Bacik <jbacik@fb.com>
Signed-off-by: Omar Sandoval <osandov@fb.com>
---
 fs/btrfs/ctree.h   |  7 ++++++-
 fs/btrfs/disk-io.c | 26 ++++++++++++++++++++++++++
 fs/btrfs/super.c   | 21 +++++++++++++++++++--
 3 files changed, 51 insertions(+), 3 deletions(-)
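
The mount-time rules described in the commit message can be illustrated with a small standalone model. This is only a sketch: `struct mount_inputs`, `enum free_space_mode` and `resolve_free_space_mode()` are hypothetical names that do not appear in the patch, and the real code processes options in the order given rather than all at once.

```c
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, simplified model of the state consulted at mount time. */
struct mount_inputs {
	bool fst_compat_ro_bit;    /* FREE_SPACE_TREE read-only compat bit set on disk */
	uint64_t cache_generation; /* superblock cache generation (mkfs writes -1) */
	bool opt_nospace_cache;    /* "nospace_cache" was passed */
	bool opt_free_space_tree;  /* "free_space_tree" was passed */
};

enum free_space_mode {
	FREE_SPACE_NONE,
	FREE_SPACE_CACHE,
	FREE_SPACE_TREE,
};

/*
 * Returns 0 and stores the resulting mode, or -EINVAL when both the
 * free space cache and the free space tree would end up enabled.
 */
int resolve_free_space_mode(const struct mount_inputs *in,
			    enum free_space_mode *mode)
{
	bool cache = false;
	bool tree = false;

	/* Defaults from on-disk state: the compat bit wins over the generation. */
	if (in->fst_compat_ro_bit)
		tree = true;
	else if (in->cache_generation)
		cache = true;

	/* Explicit mount options (ordering between options is ignored here). */
	if (in->opt_nospace_cache)
		cache = false;
	if (in->opt_free_space_tree)
		tree = true;

	if (cache && tree)
		return -EINVAL;

	*mode = tree ? FREE_SPACE_TREE :
		(cache ? FREE_SPACE_CACHE : FREE_SPACE_NONE);
	return 0;
}
```

In this model a freshly made filesystem (cache generation -1) mounted with only free_space_tree is rejected, while nospace_cache,free_space_tree is accepted, which matches the first-mount requirement stated above.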

Comments

David Sterba Sept. 9, 2015, noon UTC | #1
On Thu, Sep 03, 2015 at 12:44:27PM -0700, Omar Sandoval wrote:
> Now we can finally hook up everything so we can actually use the free space
> tree. On the first mount with the free_space_tree mount option, the free
> space tree will be created and the FREE_SPACE_TREE read-only compat bit
> will be set. Any time the filesystem is mounted from then on, we will
> use the free space tree.
> 
> Having both the free space cache and free space trees enabled is
> nonsense, so we don't allow that to happen. Since mkfs sets the
> superblock cache generation to -1, this means that the filesystem will
> have to be mounted with nospace_cache,free_space_tree to create the free
> space trees on first mount. Once the FREE_SPACE_TREE bit is set, the
> cache generation is ignored when mounting. This is all a little more
> complicated than would be ideal, but at some point we can presumably
> make the free space tree the default and stop setting the cache
> generation in mkfs.

I have objections against introducing another option to do something
with space cache. As you write, it does not make sense to have
'space_cache' and 'free_space_tree' enabled, and I agree. The b-tree
approach is an "implementation detail", an improved version of space
caching.

Because of that I propose to do the following:

* use the space_cache mount option, and add a value denoting the used
  implementation, e.g. space_cache=btree or space_cache=v2 etc.

* keep space_cache for backward compatibility for the current
  implementation

* clear_cache should reset state for both

* nospace_cache prevents using any of the two versions of space cache
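
As a rough illustration of the option naming proposed in the list above, the mapping from option string to cache implementation could look like the sketch below. The enum and `parse_space_cache_option()` are hypothetical names for this example only; it does not use the kernel's match_table_t machinery, and it picks the v2 spelling over btree.

```c
#include <string.h>

/* Hypothetical result of parsing one space-cache-related mount token. */
enum space_cache_version {
	SPACE_CACHE_DISABLED, /* nospace_cache: neither implementation */
	SPACE_CACHE_V1,       /* plain space_cache: current implementation */
	SPACE_CACHE_V2,       /* space_cache=v2: free space tree */
};

/*
 * Returns 0 and stores the selected version, or -1 if the token is not
 * one of the space cache options modeled in this sketch.
 */
int parse_space_cache_option(const char *token,
			     enum space_cache_version *version)
{
	if (strcmp(token, "nospace_cache") == 0) {
		*version = SPACE_CACHE_DISABLED;
		return 0;
	}
	if (strcmp(token, "space_cache") == 0) {
		*version = SPACE_CACHE_V1;
		return 0;
	}
	if (strcmp(token, "space_cache=v2") == 0) {
		*version = SPACE_CACHE_V2;
		return 0;
	}
	return -1;
}
```

Keeping the bare space_cache token mapped to the existing implementation is what preserves backward compatibility for current users.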

On the mkfs side, we can add a new incompat feature to the -O option that
will set the incompat bit in the superblock. Mounting such a filesystem
would use the v2 cache automatically.

I'd like to see the b-tree space cache default in the future, until then
it'll be a mkfs-time option or mount-time option.

For backward compatibility, mounting a free space v2 filesystem on an older
kernel can be done with support of userspace tools: reset the cache
generation (as if clear_cache was used), drop all the free-space-tree
structures and unset the incompat bit. I think this kind of fallback is
desirable.


Other than that, I like the series and the improvements it's supposed to
bring.
Omar Sandoval Sept. 11, 2015, 12:52 a.m. UTC | #2
On Wed, Sep 09, 2015 at 02:00:23PM +0200, David Sterba wrote:
> On Thu, Sep 03, 2015 at 12:44:27PM -0700, Omar Sandoval wrote:
> > Now we can finally hook up everything so we can actually use the free space
> > tree. On the first mount with the free_space_tree mount option, the free
> > space tree will be created and the FREE_SPACE_TREE read-only compat bit
> > will be set. Any time the filesystem is mounted from then on, we will
> > use the free space tree.
> > 
> > Having both the free space cache and free space trees enabled is
> > nonsense, so we don't allow that to happen. Since mkfs sets the
> > superblock cache generation to -1, this means that the filesystem will
> > have to be mounted with nospace_cache,free_space_tree to create the free
> > space trees on first mount. Once the FREE_SPACE_TREE bit is set, the
> > cache generation is ignored when mounting. This is all a little more
> > complicated than would be ideal, but at some point we can presumably
> > make the free space tree the default and stop setting the cache
> > generation in mkfs.
> 
> I have objections against introducing another option to do something
> with space cache. As you write, it does not make sense to have
> 'space_cache' and 'free_space_tree' enabled, and I agree. The b-tree
> approach is an "implementation detail", an improved version of space
> caching.
> 
> Because of that I propose to do the following:
> 
> * use the space_cache mount option, and add a value denoting the used
>   implementation, e.g. space_cache=btree or space_cache=v2 etc.
> 
> * keep space_cache for backward compatibility for the current
>   implementation
> 
> * clear_cache should reset state for both
> 
> * nospace_cache prevents using any of the two versions of space cache

Okay, I like the idea of calling this space_cache=v2 and allowing
clear_cache to clear the free space tree just in case. However, the free
space tree doesn't use a cache generation like the old free space cache,
so once it's created, we can't ever ignore it. For nospace_cache, the
best we could do would be to fail the mount (unless clear_cache is also
set). The other option would be to add something like the cache
generation for the free space tree, but I'd rather not do that since
fixing an out-of-date free space tree is a little more involved than
with the old cache (at that point, we might as well clear the tree and
redo it all over again). What do you think of that? Is failing on
nospace_cache okay with you?

Thanks.
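
The nospace_cache behavior suggested above can be written down as a single check. This is a minimal sketch with a hypothetical helper name and plain booleans standing in for the real mount state, not code from the series.

```c
#include <errno.h>
#include <stdbool.h>

/*
 * Hypothetical check: once a free space tree exists it cannot simply be
 * ignored, so nospace_cache is rejected unless clear_cache is also given
 * (in which case the tree would be cleared instead).
 * Returns 0 if the mount may proceed, -EINVAL otherwise.
 */
int check_nospace_cache(bool free_space_tree_exists, bool nospace_cache,
			bool clear_cache)
{
	if (free_space_tree_exists && nospace_cache && !clear_cache)
		return -EINVAL;
	return 0;
}
```

The alternative mentioned above, a generation number for the free space tree, would let the tree be skipped and later detected as stale, at the cost of the extra repair logic.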

> On the mkfs side, we can add a new incompat feature to the -O option that
> will set the incompat bit in the superblock. Mounting such a filesystem
> would use the v2 cache automatically.
> 
> I'd like to see the b-tree space cache default in the future, until then
> it'll be a mkfs-time option or mount-time option.
> 
> For backward compatibility, mounting a free space v2 filesystem on an older
> kernel can be done with support of userspace tools: reset the cache
> generation (as if clear_cache was used), drop all the free-space-tree
> structures and unset the incompat bit. I think this kind of fallback is
> desirable.
> 
> 
> Other than that, I like the series and the improvements it's supposed to
> bring.

Patch

diff --git a/fs/btrfs/ctree.h b/fs/btrfs/ctree.h
index 05420991e101..3524fe065b72 100644
--- a/fs/btrfs/ctree.h
+++ b/fs/btrfs/ctree.h
@@ -531,7 +531,10 @@  struct btrfs_super_block {
 #define BTRFS_FEATURE_COMPAT_SUPP		0ULL
 #define BTRFS_FEATURE_COMPAT_SAFE_SET		0ULL
 #define BTRFS_FEATURE_COMPAT_SAFE_CLEAR		0ULL
-#define BTRFS_FEATURE_COMPAT_RO_SUPP		0ULL
+
+#define BTRFS_FEATURE_COMPAT_RO_SUPP			\
+	(BTRFS_FEATURE_COMPAT_RO_FREE_SPACE_TREE)
+
 #define BTRFS_FEATURE_COMPAT_RO_SAFE_SET	0ULL
 #define BTRFS_FEATURE_COMPAT_RO_SAFE_CLEAR	0ULL
 
@@ -2203,6 +2206,7 @@  struct btrfs_ioctl_defrag_range_args {
 #define BTRFS_MOUNT_CHECK_INTEGRITY_INCLUDING_EXTENT_DATA (1 << 21)
 #define BTRFS_MOUNT_PANIC_ON_FATAL_ERROR	(1 << 22)
 #define BTRFS_MOUNT_RESCAN_UUID_TREE	(1 << 23)
+#define BTRFS_MOUNT_FREE_SPACE_TREE	(1 << 24)
 
 #define BTRFS_DEFAULT_COMMIT_INTERVAL	(30)
 #define BTRFS_DEFAULT_MAX_INLINE	(8192)
@@ -3746,6 +3750,7 @@  static inline void free_fs_info(struct btrfs_fs_info *fs_info)
 	kfree(fs_info->csum_root);
 	kfree(fs_info->quota_root);
 	kfree(fs_info->uuid_root);
+	kfree(fs_info->free_space_root);
 	kfree(fs_info->super_copy);
 	kfree(fs_info->super_for_commit);
 	security_free_mnt_opts(&fs_info->security_opts);
diff --git a/fs/btrfs/disk-io.c b/fs/btrfs/disk-io.c
index f556c3732c2c..e88674c594da 100644
--- a/fs/btrfs/disk-io.c
+++ b/fs/btrfs/disk-io.c
@@ -42,6 +42,7 @@ 
 #include "locking.h"
 #include "tree-log.h"
 #include "free-space-cache.h"
+#include "free-space-tree.h"
 #include "inode-map.h"
 #include "check-integrity.h"
 #include "rcu-string.h"
@@ -1641,6 +1642,9 @@  struct btrfs_root *btrfs_get_fs_root(struct btrfs_fs_info *fs_info,
 	if (location->objectid == BTRFS_UUID_TREE_OBJECTID)
 		return fs_info->uuid_root ? fs_info->uuid_root :
 					    ERR_PTR(-ENOENT);
+	if (location->objectid == BTRFS_FREE_SPACE_TREE_OBJECTID)
+		return fs_info->free_space_root ? fs_info->free_space_root :
+						  ERR_PTR(-ENOENT);
 again:
 	root = btrfs_lookup_fs_root(fs_info, location->objectid);
 	if (root) {
@@ -2138,6 +2142,7 @@  static void free_root_pointers(struct btrfs_fs_info *info, int chunk_root)
 	free_root_extent_buffers(info->uuid_root);
 	if (chunk_root)
 		free_root_extent_buffers(info->chunk_root);
+	free_root_extent_buffers(info->free_space_root);
 }
 
 void btrfs_free_fs_roots(struct btrfs_fs_info *fs_info)
@@ -2439,6 +2444,15 @@  static int btrfs_read_roots(struct btrfs_fs_info *fs_info,
 		fs_info->uuid_root = root;
 	}
 
+	if (btrfs_fs_compat_ro(fs_info, FREE_SPACE_TREE)) {
+		location.objectid = BTRFS_FREE_SPACE_TREE_OBJECTID;
+		root = btrfs_read_tree_root(tree_root, &location);
+		if (IS_ERR(root))
+			return PTR_ERR(root);
+		set_bit(BTRFS_ROOT_TRACK_DIRTY, &root->state);
+		fs_info->free_space_root = root;
+	}
+
 	return 0;
 }
 
@@ -3063,6 +3077,18 @@  retry_root_backup:
 
 	btrfs_qgroup_rescan_resume(fs_info);
 
+	if (btrfs_test_opt(tree_root, FREE_SPACE_TREE) &&
+	    !btrfs_fs_compat_ro(fs_info, FREE_SPACE_TREE)) {
+		pr_info("BTRFS: creating free space tree\n");
+		ret = btrfs_create_free_space_tree(fs_info);
+		if (ret) {
+			pr_warn("BTRFS: failed to create free space tree %d\n",
+				ret);
+			close_ctree(tree_root);
+			return ret;
+		}
+	}
+
 	if (!fs_info->uuid_root) {
 		pr_info("BTRFS: creating UUID tree\n");
 		ret = btrfs_create_uuid_tree(fs_info);
diff --git a/fs/btrfs/super.c b/fs/btrfs/super.c
index b93f127c4bc8..d7705e4ed119 100644
--- a/fs/btrfs/super.c
+++ b/fs/btrfs/super.c
@@ -319,7 +319,7 @@  enum {
 	Opt_check_integrity_print_mask, Opt_fatal_errors, Opt_rescan_uuid_tree,
 	Opt_commit_interval, Opt_barrier, Opt_nodefrag, Opt_nodiscard,
 	Opt_noenospc_debug, Opt_noflushoncommit, Opt_acl, Opt_datacow,
-	Opt_datasum, Opt_treelog, Opt_noinode_cache,
+	Opt_datasum, Opt_treelog, Opt_noinode_cache, Opt_free_space_tree,
 	Opt_err,
 };
 
@@ -372,6 +372,7 @@  static match_table_t tokens = {
 	{Opt_rescan_uuid_tree, "rescan_uuid_tree"},
 	{Opt_fatal_errors, "fatal_errors=%s"},
 	{Opt_commit_interval, "commit=%d"},
+	{Opt_free_space_tree, "free_space_tree"},
 	{Opt_err, NULL},
 };
 
@@ -392,7 +393,9 @@  int btrfs_parse_options(struct btrfs_root *root, char *options)
 	bool compress_force = false;
 
 	cache_gen = btrfs_super_cache_generation(root->fs_info->super_copy);
-	if (cache_gen)
+	if (btrfs_fs_compat_ro(root->fs_info, FREE_SPACE_TREE))
+		btrfs_set_opt(info->mount_opt, FREE_SPACE_TREE);
+	else if (cache_gen)
 		btrfs_set_opt(info->mount_opt, SPACE_CACHE);
 
 	if (!options)
@@ -738,6 +741,10 @@  int btrfs_parse_options(struct btrfs_root *root, char *options)
 				info->commit_interval = BTRFS_DEFAULT_COMMIT_INTERVAL;
 			}
 			break;
+		case Opt_free_space_tree:
+			btrfs_set_and_info(root, FREE_SPACE_TREE,
+					   "enabling free space tree");
+			break;
 		case Opt_err:
 			btrfs_info(root->fs_info, "unrecognized mount option '%s'", p);
 			ret = -EINVAL;
@@ -747,8 +754,16 @@  int btrfs_parse_options(struct btrfs_root *root, char *options)
 		}
 	}
 out:
+	if (btrfs_test_opt(root, SPACE_CACHE) &&
+	    btrfs_test_opt(root, FREE_SPACE_TREE)) {
+		btrfs_err(root->fs_info,
+			  "cannot use both free space cache and free space tree");
+		ret = -EINVAL;
+	}
 	if (!ret && btrfs_test_opt(root, SPACE_CACHE))
 		btrfs_info(root->fs_info, "disk space caching is enabled");
+	if (!ret && btrfs_test_opt(root, FREE_SPACE_TREE))
+		btrfs_info(root->fs_info, "using free space tree");
 	kfree(orig);
 	return ret;
 }
@@ -1152,6 +1167,8 @@  static int btrfs_show_options(struct seq_file *seq, struct dentry *dentry)
 		seq_puts(seq, ",discard");
 	if (!(root->fs_info->sb->s_flags & MS_POSIXACL))
 		seq_puts(seq, ",noacl");
+	if (btrfs_test_opt(root, FREE_SPACE_TREE))
+		seq_puts(seq, ",free_space_tree");
 	if (btrfs_test_opt(root, SPACE_CACHE))
 		seq_puts(seq, ",space_cache");
 	else