[RFC] btrfs: Slightly speedup btrfs_read_block_groups
diff mbox

Message ID 1462434673-8395-1-git-send-email-quwenruo@cn.fujitsu.com
State New
Headers show

Commit Message

Qu Wenruo May 5, 2016, 7:51 a.m. UTC
Btrfs_read_block_groups() function is the most time consuming function
if the whole fs is filled with small extents.

For a btrfs filled with all 16K sized files, and when 2T space is used,
mount the fs needs 10 to 12 seconds.

While ftrace shows that, btrfs_read_block_groups() takes about 9
seconds, while btrfs_read_chunk_tree() only takes 14ms.
In theory, btrfs_read_chunk_tree() and btrfs_read_block_groups() should
take the same time, as chunk and block groups are 1:1 mapped.

However, considering block group items are spread across the large
extent tree, it takes a lot of time to search btree.

And furthermore, find_first_block_group() function used by
btrfs_read_block_groups() is using a very bad method to locate block
group item, by searching and then checking slot by slot.

In kernel space, checking slot by slot is a little time consuming, as
for next_leaf() case, kernel need to do extra locking.

This patch will fix the slot by slot checking, as when we call
btrfs_read_block_groups(), we have already read out all chunks and save
them into map_tree.

So we use map_tree to get exact block group start and length, then do
exact btrfs_search_slot(), without slot by slot check, to speedup the
mount.

With this patch, time spent on btrfs_read_block_groups() is reduced to
7.56s, compared to old 8.94s.

Reported-by: Tsutomu Itoh <t-itoh@jp.fujitsu.com>
Signed-off-by: Qu Wenruo <quwenruo@cn.fujitsu.com>

---
The further fix would change the mount process from reading out all
block groups to reading out block group on demand.

But according to the btrfs_read_chunk_tree() calling time, the real
problem is the on-disk format and btree locking.

If block group items are arranged like chunks, in a dedicated tree,
btrfs_read_block_groups() should take the same time as
btrfs_read_chunk_tree().

And further more, if we can split current huge extent tree into
something like per-chunk extent tree, a lot of current code like
delayed_refs can be removed, as extent tree operation will be much
faster.
---
 fs/btrfs/extent-tree.c | 61 ++++++++++++++++++++------------------------------
 fs/btrfs/extent_map.c  |  1 +
 fs/btrfs/extent_map.h  | 22 ++++++++++++++++++
 3 files changed, 47 insertions(+), 37 deletions(-)

Comments

Qu Wenruo May 23, 2016, 8:02 a.m. UTC | #1
Any comment on this patch?

BTW, for anyone who is interested in the speedup, and the trace result,
I've updated it to google driver:

https://drive.google.com/open?id=0BxpkL3ehzX3pbFEybXd3X3MzRGM
https://drive.google.com/open?id=0BxpkL3ehzX3pd1ByOFhhbml3Ujg

Thanks,
Qu

Qu Wenruo wrote on 2016/05/05 15:51 +0800:
> Btrfs_read_block_groups() function is the most time consuming function
> if the whole fs is filled with small extents.
>
> For a btrfs filled with all 16K sized files, and when 2T space is used,
> mount the fs needs 10 to 12 seconds.
>
> While ftrace shows that, btrfs_read_block_groups() takes about 9
> seconds, while btrfs_read_chunk_tree() only takes 14ms.
> In theory, btrfs_read_chunk_tree() and btrfs_read_block_groups() should
> take the same time, as chunk and block groups are 1:1 mapped.
>
> However, considering block group items are spread across the large
> extent tree, it takes a lot of time to search btree.
>
> And furthermore, find_first_block_group() function used by
> btrfs_read_block_groups() is using a very bad method to locate block
> group item, by searching and then checking slot by slot.
>
> In kernel space, checking slot by slot is a little time consuming, as
> for next_leaf() case, kernel need to do extra locking.
>
> This patch will fix the slot by slot checking, as when we call
> btrfs_read_block_groups(), we have already read out all chunks and save
> them into map_tree.
>
> So we use map_tree to get exact block group start and length, then do
> exact btrfs_search_slot(), without slot by slot check, to speedup the
> mount.
>
> With this patch, time spent on btrfs_read_block_groups() is reduced to
> 7.56s, compared to old 8.94s.
>
> Reported-by: Tsutomu Itoh <t-itoh@jp.fujitsu.com>
> Signed-off-by: Qu Wenruo <quwenruo@cn.fujitsu.com>
>
> ---
> The further fix would change the mount process from reading out all
> block groups to reading out block group on demand.
>
> But according to the btrfs_read_chunk_tree() calling time, the real
> problem is the on-disk format and btree locking.
>
> If block group items are arranged like chunks, in a dedicated tree,
> btrfs_read_block_groups() should take the same time as
> btrfs_read_chunk_tree().
>
> And further more, if we can split current huge extent tree into
> something like per-chunk extent tree, a lot of current code like
> delayed_refs can be removed, as extent tree operation will be much
> faster.
> ---
>  fs/btrfs/extent-tree.c | 61 ++++++++++++++++++++------------------------------
>  fs/btrfs/extent_map.c  |  1 +
>  fs/btrfs/extent_map.h  | 22 ++++++++++++++++++
>  3 files changed, 47 insertions(+), 37 deletions(-)
>
> diff --git a/fs/btrfs/extent-tree.c b/fs/btrfs/extent-tree.c
> index 8507484..9fa7728 100644
> --- a/fs/btrfs/extent-tree.c
> +++ b/fs/btrfs/extent-tree.c
> @@ -9520,39 +9520,20 @@ out:
>  	return ret;
>  }
>
> -static int find_first_block_group(struct btrfs_root *root,
> -		struct btrfs_path *path, struct btrfs_key *key)
> +int find_block_group(struct btrfs_root *root,
> +				   struct btrfs_path *path,
> +				   struct extent_map *chunk_em)
>  {
>  	int ret = 0;
> -	struct btrfs_key found_key;
> -	struct extent_buffer *leaf;
> -	int slot;
> -
> -	ret = btrfs_search_slot(NULL, root, key, path, 0, 0);
> -	if (ret < 0)
> -		goto out;
> +	struct btrfs_key key;
>
> -	while (1) {
> -		slot = path->slots[0];
> -		leaf = path->nodes[0];
> -		if (slot >= btrfs_header_nritems(leaf)) {
> -			ret = btrfs_next_leaf(root, path);
> -			if (ret == 0)
> -				continue;
> -			if (ret < 0)
> -				goto out;
> -			break;
> -		}
> -		btrfs_item_key_to_cpu(leaf, &found_key, slot);
> +	key.objectid = chunk_em->start;
> +	key.offset = chunk_em->len;
> +	key.type = BTRFS_BLOCK_GROUP_ITEM_KEY;
>
> -		if (found_key.objectid >= key->objectid &&
> -		    found_key.type == BTRFS_BLOCK_GROUP_ITEM_KEY) {
> -			ret = 0;
> -			goto out;
> -		}
> -		path->slots[0]++;
> -	}
> -out:
> +	ret = btrfs_search_slot(NULL, root, &key, path, 0, 0);
> +	if (ret > 0)
> +		ret = -ENOENT;
>  	return ret;
>  }
>
> @@ -9771,16 +9752,14 @@ int btrfs_read_block_groups(struct btrfs_root *root)
>  	struct btrfs_block_group_cache *cache;
>  	struct btrfs_fs_info *info = root->fs_info;
>  	struct btrfs_space_info *space_info;
> -	struct btrfs_key key;
> +	struct btrfs_mapping_tree *map_tree = &root->fs_info->mapping_tree;
> +	struct extent_map *chunk_em;
>  	struct btrfs_key found_key;
>  	struct extent_buffer *leaf;
>  	int need_clear = 0;
>  	u64 cache_gen;
>
>  	root = info->extent_root;
> -	key.objectid = 0;
> -	key.offset = 0;
> -	key.type = BTRFS_BLOCK_GROUP_ITEM_KEY;
>  	path = btrfs_alloc_path();
>  	if (!path)
>  		return -ENOMEM;
> @@ -9793,10 +9772,16 @@ int btrfs_read_block_groups(struct btrfs_root *root)
>  	if (btrfs_test_opt(root, CLEAR_CACHE))
>  		need_clear = 1;
>
> +	/* Here we don't lock the map tree, as we are the only reader */
> +	chunk_em = first_extent_mapping(&map_tree->map_tree);
> +	/* Not really possible */
> +	if (!chunk_em) {
> +		ret = -ENOENT;
> +		goto error;
> +	}
> +
>  	while (1) {
> -		ret = find_first_block_group(root, path, &key);
> -		if (ret > 0)
> -			break;
> +		ret = find_block_group(root, path, chunk_em);
>  		if (ret != 0)
>  			goto error;
>
> @@ -9830,7 +9815,6 @@ int btrfs_read_block_groups(struct btrfs_root *root)
>  				   sizeof(cache->item));
>  		cache->flags = btrfs_block_group_flags(&cache->item);
>
> -		key.objectid = found_key.objectid + found_key.offset;
>  		btrfs_release_path(path);
>
>  		/*
> @@ -9911,6 +9895,9 @@ int btrfs_read_block_groups(struct btrfs_root *root)
>  			}
>  			spin_unlock(&info->unused_bgs_lock);
>  		}
> +		chunk_em = next_extent_mapping(chunk_em);
> +		if (!chunk_em)
> +			break;
>  	}
>
>  	list_for_each_entry_rcu(space_info, &root->fs_info->space_info, list) {
> diff --git a/fs/btrfs/extent_map.c b/fs/btrfs/extent_map.c
> index 318b048..7e781e7 100644
> --- a/fs/btrfs/extent_map.c
> +++ b/fs/btrfs/extent_map.c
> @@ -453,3 +453,4 @@ void replace_extent_mapping(struct extent_map_tree *tree,
>
>  	setup_extent_mapping(tree, new, modified);
>  }
> +
> diff --git a/fs/btrfs/extent_map.h b/fs/btrfs/extent_map.h
> index eb8b8fa..9358e3e 100644
> --- a/fs/btrfs/extent_map.h
> +++ b/fs/btrfs/extent_map.h
> @@ -90,4 +90,26 @@ int unpin_extent_cache(struct extent_map_tree *tree, u64 start, u64 len, u64 gen
>  void clear_em_logging(struct extent_map_tree *tree, struct extent_map *em);
>  struct extent_map *search_extent_mapping(struct extent_map_tree *tree,
>  					 u64 start, u64 len);
> +
> +static inline struct extent_map *
> +first_extent_mapping(struct extent_map_tree *tree)
> +{
> +	struct rb_node *node;
> +
> +	node = rb_first(&tree->map);
> +	if (!node)
> +		return NULL;
> +	return rb_entry(node, struct extent_map, rb_node);
> +}
> +
> +static inline struct extent_map *
> +next_extent_mapping(struct extent_map *map)
> +{
> +	struct rb_node *node;
> +
> +	node = rb_next(&map->rb_node);
> +	if (!node)
> +		return NULL;
> +	return rb_entry(node, struct extent_map, rb_node);
> +}
>  #endif
>


--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Qu Wenruo June 8, 2016, 1:25 a.m. UTC | #2
Ping?

Thanks,
Qu

At 05/23/2016 04:02 PM, Qu Wenruo wrote:
> Any comment on this patch?
>
> BTW, for anyone who is interested in the speedup, and the trace result,
> I've updated it to google driver:
>
> https://drive.google.com/open?id=0BxpkL3ehzX3pbFEybXd3X3MzRGM
> https://drive.google.com/open?id=0BxpkL3ehzX3pd1ByOFhhbml3Ujg
>
> Thanks,
> Qu
>
> Qu Wenruo wrote on 2016/05/05 15:51 +0800:
>> Btrfs_read_block_groups() function is the most time consuming function
>> if the whole fs is filled with small extents.
>>
>> For a btrfs filled with all 16K sized files, and when 2T space is used,
>> mount the fs needs 10 to 12 seconds.
>>
>> While ftrace shows that, btrfs_read_block_groups() takes about 9
>> seconds, while btrfs_read_chunk_tree() only takes 14ms.
>> In theory, btrfs_read_chunk_tree() and btrfs_read_block_groups() should
>> take the same time, as chunk and block groups are 1:1 mapped.
>>
>> However, considering block group items are spread across the large
>> extent tree, it takes a lot of time to search btree.
>>
>> And furthermore, find_first_block_group() function used by
>> btrfs_read_block_groups() is using a very bad method to locate block
>> group item, by searching and then checking slot by slot.
>>
>> In kernel space, checking slot by slot is a little time consuming, as
>> for next_leaf() case, kernel need to do extra locking.
>>
>> This patch will fix the slot by slot checking, as when we call
>> btrfs_read_block_groups(), we have already read out all chunks and save
>> them into map_tree.
>>
>> So we use map_tree to get exact block group start and length, then do
>> exact btrfs_search_slot(), without slot by slot check, to speedup the
>> mount.
>>
>> With this patch, time spent on btrfs_read_block_groups() is reduced to
>> 7.56s, compared to old 8.94s.
>>
>> Reported-by: Tsutomu Itoh <t-itoh@jp.fujitsu.com>
>> Signed-off-by: Qu Wenruo <quwenruo@cn.fujitsu.com>
>>
>> ---
>> The further fix would change the mount process from reading out all
>> block groups to reading out block group on demand.
>>
>> But according to the btrfs_read_chunk_tree() calling time, the real
>> problem is the on-disk format and btree locking.
>>
>> If block group items are arranged like chunks, in a dedicated tree,
>> btrfs_read_block_groups() should take the same time as
>> btrfs_read_chunk_tree().
>>
>> And further more, if we can split current huge extent tree into
>> something like per-chunk extent tree, a lot of current code like
>> delayed_refs can be removed, as extent tree operation will be much
>> faster.
>> ---
>>  fs/btrfs/extent-tree.c | 61
>> ++++++++++++++++++++------------------------------
>>  fs/btrfs/extent_map.c  |  1 +
>>  fs/btrfs/extent_map.h  | 22 ++++++++++++++++++
>>  3 files changed, 47 insertions(+), 37 deletions(-)
>>
>> diff --git a/fs/btrfs/extent-tree.c b/fs/btrfs/extent-tree.c
>> index 8507484..9fa7728 100644
>> --- a/fs/btrfs/extent-tree.c
>> +++ b/fs/btrfs/extent-tree.c
>> @@ -9520,39 +9520,20 @@ out:
>>      return ret;
>>  }
>>
>> -static int find_first_block_group(struct btrfs_root *root,
>> -        struct btrfs_path *path, struct btrfs_key *key)
>> +int find_block_group(struct btrfs_root *root,
>> +                   struct btrfs_path *path,
>> +                   struct extent_map *chunk_em)
>>  {
>>      int ret = 0;
>> -    struct btrfs_key found_key;
>> -    struct extent_buffer *leaf;
>> -    int slot;
>> -
>> -    ret = btrfs_search_slot(NULL, root, key, path, 0, 0);
>> -    if (ret < 0)
>> -        goto out;
>> +    struct btrfs_key key;
>>
>> -    while (1) {
>> -        slot = path->slots[0];
>> -        leaf = path->nodes[0];
>> -        if (slot >= btrfs_header_nritems(leaf)) {
>> -            ret = btrfs_next_leaf(root, path);
>> -            if (ret == 0)
>> -                continue;
>> -            if (ret < 0)
>> -                goto out;
>> -            break;
>> -        }
>> -        btrfs_item_key_to_cpu(leaf, &found_key, slot);
>> +    key.objectid = chunk_em->start;
>> +    key.offset = chunk_em->len;
>> +    key.type = BTRFS_BLOCK_GROUP_ITEM_KEY;
>>
>> -        if (found_key.objectid >= key->objectid &&
>> -            found_key.type == BTRFS_BLOCK_GROUP_ITEM_KEY) {
>> -            ret = 0;
>> -            goto out;
>> -        }
>> -        path->slots[0]++;
>> -    }
>> -out:
>> +    ret = btrfs_search_slot(NULL, root, &key, path, 0, 0);
>> +    if (ret > 0)
>> +        ret = -ENOENT;
>>      return ret;
>>  }
>>
>> @@ -9771,16 +9752,14 @@ int btrfs_read_block_groups(struct btrfs_root
>> *root)
>>      struct btrfs_block_group_cache *cache;
>>      struct btrfs_fs_info *info = root->fs_info;
>>      struct btrfs_space_info *space_info;
>> -    struct btrfs_key key;
>> +    struct btrfs_mapping_tree *map_tree = &root->fs_info->mapping_tree;
>> +    struct extent_map *chunk_em;
>>      struct btrfs_key found_key;
>>      struct extent_buffer *leaf;
>>      int need_clear = 0;
>>      u64 cache_gen;
>>
>>      root = info->extent_root;
>> -    key.objectid = 0;
>> -    key.offset = 0;
>> -    key.type = BTRFS_BLOCK_GROUP_ITEM_KEY;
>>      path = btrfs_alloc_path();
>>      if (!path)
>>          return -ENOMEM;
>> @@ -9793,10 +9772,16 @@ int btrfs_read_block_groups(struct btrfs_root
>> *root)
>>      if (btrfs_test_opt(root, CLEAR_CACHE))
>>          need_clear = 1;
>>
>> +    /* Here we don't lock the map tree, as we are the only reader */
>> +    chunk_em = first_extent_mapping(&map_tree->map_tree);
>> +    /* Not really possible */
>> +    if (!chunk_em) {
>> +        ret = -ENOENT;
>> +        goto error;
>> +    }
>> +
>>      while (1) {
>> -        ret = find_first_block_group(root, path, &key);
>> -        if (ret > 0)
>> -            break;
>> +        ret = find_block_group(root, path, chunk_em);
>>          if (ret != 0)
>>              goto error;
>>
>> @@ -9830,7 +9815,6 @@ int btrfs_read_block_groups(struct btrfs_root
>> *root)
>>                     sizeof(cache->item));
>>          cache->flags = btrfs_block_group_flags(&cache->item);
>>
>> -        key.objectid = found_key.objectid + found_key.offset;
>>          btrfs_release_path(path);
>>
>>          /*
>> @@ -9911,6 +9895,9 @@ int btrfs_read_block_groups(struct btrfs_root
>> *root)
>>              }
>>              spin_unlock(&info->unused_bgs_lock);
>>          }
>> +        chunk_em = next_extent_mapping(chunk_em);
>> +        if (!chunk_em)
>> +            break;
>>      }
>>
>>      list_for_each_entry_rcu(space_info, &root->fs_info->space_info,
>> list) {
>> diff --git a/fs/btrfs/extent_map.c b/fs/btrfs/extent_map.c
>> index 318b048..7e781e7 100644
>> --- a/fs/btrfs/extent_map.c
>> +++ b/fs/btrfs/extent_map.c
>> @@ -453,3 +453,4 @@ void replace_extent_mapping(struct extent_map_tree
>> *tree,
>>
>>      setup_extent_mapping(tree, new, modified);
>>  }
>> +
>> diff --git a/fs/btrfs/extent_map.h b/fs/btrfs/extent_map.h
>> index eb8b8fa..9358e3e 100644
>> --- a/fs/btrfs/extent_map.h
>> +++ b/fs/btrfs/extent_map.h
>> @@ -90,4 +90,26 @@ int unpin_extent_cache(struct extent_map_tree
>> *tree, u64 start, u64 len, u64 gen
>>  void clear_em_logging(struct extent_map_tree *tree, struct extent_map
>> *em);
>>  struct extent_map *search_extent_mapping(struct extent_map_tree *tree,
>>                       u64 start, u64 len);
>> +
>> +static inline struct extent_map *
>> +first_extent_mapping(struct extent_map_tree *tree)
>> +{
>> +    struct rb_node *node;
>> +
>> +    node = rb_first(&tree->map);
>> +    if (!node)
>> +        return NULL;
>> +    return rb_entry(node, struct extent_map, rb_node);
>> +}
>> +
>> +static inline struct extent_map *
>> +next_extent_mapping(struct extent_map *map)
>> +{
>> +    struct rb_node *node;
>> +
>> +    node = rb_next(&map->rb_node);
>> +    if (!node)
>> +        return NULL;
>> +    return rb_entry(node, struct extent_map, rb_node);
>> +}
>>  #endif
>>
>
>
> --
> To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html


--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Patch
diff mbox

diff --git a/fs/btrfs/extent-tree.c b/fs/btrfs/extent-tree.c
index 8507484..9fa7728 100644
--- a/fs/btrfs/extent-tree.c
+++ b/fs/btrfs/extent-tree.c
@@ -9520,39 +9520,20 @@  out:
 	return ret;
 }
 
-static int find_first_block_group(struct btrfs_root *root,
-		struct btrfs_path *path, struct btrfs_key *key)
+int find_block_group(struct btrfs_root *root,
+				   struct btrfs_path *path,
+				   struct extent_map *chunk_em)
 {
 	int ret = 0;
-	struct btrfs_key found_key;
-	struct extent_buffer *leaf;
-	int slot;
-
-	ret = btrfs_search_slot(NULL, root, key, path, 0, 0);
-	if (ret < 0)
-		goto out;
+	struct btrfs_key key;
 
-	while (1) {
-		slot = path->slots[0];
-		leaf = path->nodes[0];
-		if (slot >= btrfs_header_nritems(leaf)) {
-			ret = btrfs_next_leaf(root, path);
-			if (ret == 0)
-				continue;
-			if (ret < 0)
-				goto out;
-			break;
-		}
-		btrfs_item_key_to_cpu(leaf, &found_key, slot);
+	key.objectid = chunk_em->start;
+	key.offset = chunk_em->len;
+	key.type = BTRFS_BLOCK_GROUP_ITEM_KEY;
 
-		if (found_key.objectid >= key->objectid &&
-		    found_key.type == BTRFS_BLOCK_GROUP_ITEM_KEY) {
-			ret = 0;
-			goto out;
-		}
-		path->slots[0]++;
-	}
-out:
+	ret = btrfs_search_slot(NULL, root, &key, path, 0, 0);
+	if (ret > 0)
+		ret = -ENOENT;
 	return ret;
 }
 
@@ -9771,16 +9752,14 @@  int btrfs_read_block_groups(struct btrfs_root *root)
 	struct btrfs_block_group_cache *cache;
 	struct btrfs_fs_info *info = root->fs_info;
 	struct btrfs_space_info *space_info;
-	struct btrfs_key key;
+	struct btrfs_mapping_tree *map_tree = &root->fs_info->mapping_tree;
+	struct extent_map *chunk_em;
 	struct btrfs_key found_key;
 	struct extent_buffer *leaf;
 	int need_clear = 0;
 	u64 cache_gen;
 
 	root = info->extent_root;
-	key.objectid = 0;
-	key.offset = 0;
-	key.type = BTRFS_BLOCK_GROUP_ITEM_KEY;
 	path = btrfs_alloc_path();
 	if (!path)
 		return -ENOMEM;
@@ -9793,10 +9772,16 @@  int btrfs_read_block_groups(struct btrfs_root *root)
 	if (btrfs_test_opt(root, CLEAR_CACHE))
 		need_clear = 1;
 
+	/* Here we don't lock the map tree, as we are the only reader */
+	chunk_em = first_extent_mapping(&map_tree->map_tree);
+	/* Not really possible */
+	if (!chunk_em) {
+		ret = -ENOENT;
+		goto error;
+	}
+
 	while (1) {
-		ret = find_first_block_group(root, path, &key);
-		if (ret > 0)
-			break;
+		ret = find_block_group(root, path, chunk_em);
 		if (ret != 0)
 			goto error;
 
@@ -9830,7 +9815,6 @@  int btrfs_read_block_groups(struct btrfs_root *root)
 				   sizeof(cache->item));
 		cache->flags = btrfs_block_group_flags(&cache->item);
 
-		key.objectid = found_key.objectid + found_key.offset;
 		btrfs_release_path(path);
 
 		/*
@@ -9911,6 +9895,9 @@  int btrfs_read_block_groups(struct btrfs_root *root)
 			}
 			spin_unlock(&info->unused_bgs_lock);
 		}
+		chunk_em = next_extent_mapping(chunk_em);
+		if (!chunk_em)
+			break;
 	}
 
 	list_for_each_entry_rcu(space_info, &root->fs_info->space_info, list) {
diff --git a/fs/btrfs/extent_map.c b/fs/btrfs/extent_map.c
index 318b048..7e781e7 100644
--- a/fs/btrfs/extent_map.c
+++ b/fs/btrfs/extent_map.c
@@ -453,3 +453,4 @@  void replace_extent_mapping(struct extent_map_tree *tree,
 
 	setup_extent_mapping(tree, new, modified);
 }
+
diff --git a/fs/btrfs/extent_map.h b/fs/btrfs/extent_map.h
index eb8b8fa..9358e3e 100644
--- a/fs/btrfs/extent_map.h
+++ b/fs/btrfs/extent_map.h
@@ -90,4 +90,26 @@  int unpin_extent_cache(struct extent_map_tree *tree, u64 start, u64 len, u64 gen
 void clear_em_logging(struct extent_map_tree *tree, struct extent_map *em);
 struct extent_map *search_extent_mapping(struct extent_map_tree *tree,
 					 u64 start, u64 len);
+
+static inline struct extent_map *
+first_extent_mapping(struct extent_map_tree *tree)
+{
+	struct rb_node *node;
+
+	node = rb_first(&tree->map);
+	if (!node)
+		return NULL;
+	return rb_entry(node, struct extent_map, rb_node);
+}
+
+static inline struct extent_map *
+next_extent_mapping(struct extent_map *map)
+{
+	struct rb_node *node;
+
+	node = rb_next(&map->rb_node);
+	if (!node)
+		return NULL;
+	return rb_entry(node, struct extent_map, rb_node);
+}
 #endif