[1/2] btrfs: add extra comments on extent_map members

Message ID 50b37f24a3ac5a2c68529a2373ad98d9c45e6f33.1712038308.git.wqu@suse.com (mailing list archive)
State New, archived
Series btrfs: more explaination on extent_map members

Commit Message

Qu Wenruo April 2, 2024, 6:23 a.m. UTC
The extent_map structure is very critical to btrfs, as it is involved
for both read and write paths.

Unfortunately the structure is not properly explained, making it pretty
hard to understand nor to do further improvement.

This patch would add extra comments explaining the major numbers base on
my code reading.
Hopefully we can find more members to cleanup in the future.

Signed-off-by: Qu Wenruo <wqu@suse.com>
---
 fs/btrfs/extent_map.h | 62 ++++++++++++++++++++++++++++++++++++++++++-
 1 file changed, 61 insertions(+), 1 deletion(-)

Comments

Andrea Gelmini April 2, 2024, 7:33 a.m. UTC | #1
On Tue, Apr 2, 2024 at 08:24, Qu Wenruo <wqu@suse.com> wrote:
>
> +        * This is an in-memory-only member, matching

Just a stupid fix about "mathcing".

Thanks a lot Qu,
Gelma
Qu Wenruo April 2, 2024, 8:25 a.m. UTC | #2
On 2024/4/2 18:03, Andrea Gelmini wrote:
> On Tue, Apr 2, 2024 at 08:24, Qu Wenruo <wqu@suse.com> wrote:
>>
>> +        * This is an in-memory-only member, matching
> 
> Just a stupid fix about "mathcing".

Thanks for pointing it out, I guess David has already given up on my grammar...

Thanks,
Qu
> 
> Thanks a lot Qu,
> Gelma
Filipe Manana April 2, 2024, 3:45 p.m. UTC | #3
On Tue, Apr 2, 2024 at 7:24 AM Qu Wenruo <wqu@suse.com> wrote:
>
> The extent_map structure is very critical to btrfs, as it is involved
> for both read and write paths.
>
> Unfortunately the structure is not properly explained, making it pretty
> hard to understand nor to do further improvement.
>
> This patch would add extra comments explaining the major numbers base on

would add -> adds

And by "numbers" I think you wanted to say "members"?

base -> based

> my code reading.
> Hopefully we can find more members to cleanup in the future.
>
> Signed-off-by: Qu Wenruo <wqu@suse.com>
> ---
>  fs/btrfs/extent_map.h | 62 ++++++++++++++++++++++++++++++++++++++++++-
>  1 file changed, 61 insertions(+), 1 deletion(-)
>
> diff --git a/fs/btrfs/extent_map.h b/fs/btrfs/extent_map.h
> index c5a098c99cc6..30322defcd03 100644
> --- a/fs/btrfs/extent_map.h
> +++ b/fs/btrfs/extent_map.h
> @@ -37,21 +37,81 @@ enum {
>  };
>
>  /*
> + * This extent_map structure is an in-memory representation of file extents,
> + * it would represent all file extents (including holes, no matter if we have
> + * hole file extents).

This can simply be:

"This structure represents file extents and holes."

Which also fixes some grammar (it would represent -> it represents)
and the stuff between parentheses is confusing.

> + *
>   * Keep this structure as compact as possible, as we can have really large
>   * amounts of allocated extent maps at any time.
>   */
>  struct extent_map {
>         struct rb_node rb_node;
>
> -       /* all of these are in bytes */
> +       /* All of these are in bytes */

Please add punctuation as well.

> +
> +       /*
> +        * File offset of the file extent. matching key.offset of
> +        * (INO EXTENT_DATA FILEPOS) key.
> +        */

"File offset matching the offset of a BTRFS_EXTENT_DATA_KEY key."

That's a lot more clear than using the shortened format from tree-dump.
Also note that "matching" should start with a capital letter
(beginning of sentence).

>         u64 start;
> +
> +       /*
> +        * Length of the file extent.
> +        * For non-inlined file extents it's btrfs_file_extent_item::num_bytes.
> +        * For inlined file extents it's sectorsize. (as there is no reliable
> +        * btrfs_file_extent::num_bytes).

Don't put whole sentences in parentheses after a punctuation mark.
This can also be rephrased in a more clear way:

"For inline extents it's sectorsize and
btrfs_file_extent_item::num_bytes has data and not a valid length,
because inline data starts at offsetof(struct btrfs_file_extent_item,
disk_bytenr)."
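
To illustrate the point about inline data, here is a minimal user-space sketch. The struct name file_extent_item_layout and the helper inline_data_start() are hypothetical; the field order and widths are an assumption that the mirror follows the packed on-disk layout of struct btrfs_file_extent_item (the kernel uses __le64/__le16/__u8 types). It only shows why num_bytes cannot hold a valid length for inline extents: the inline payload begins at disk_bytenr and overlays every field after it.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical user-space mirror of the on-disk file extent item.
 * Assumption: same field order and widths as struct btrfs_file_extent_item.
 */
struct file_extent_item_layout {
	uint64_t generation;
	uint64_t ram_bytes;
	uint8_t compression;
	uint8_t encryption;
	uint16_t other_encoding;
	uint8_t type;
	/* For inline extents, file data begins here and overlays the rest. */
	uint64_t disk_bytenr;
	uint64_t disk_num_bytes;
	uint64_t offset;
	uint64_t num_bytes;
} __attribute__((packed));

/* The mirror is only meaningful if the compiler inserts no padding. */
static_assert(sizeof(struct file_extent_item_layout) == 53,
	      "layout must be packed");

/* Byte offset inside the item at which inline file data would start. */
size_t inline_data_start(void)
{
	return offsetof(struct file_extent_item_layout, disk_bytenr);
}
```

With this layout, inline_data_start() returns 21, so for an inline extent the bytes where num_bytes would normally live are file data rather than a length.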

> +        */
>         u64 len;
> +
> +       /*
> +        * The modified range start/length, these are in-memory-only
> +        * members for fsync/logtree optimization.
> +        */

These were initially used to avoid logging the same csum ranges
multiple times when extent maps get merged.
But from a quick look and experiment we don't actually need them in
order to avoid that (we don't merge new modified extents).
I'll send a patch to remove them and update the fsync logic.

>         u64 mod_start;
>         u64 mod_len;
> +
> +       /*
> +        * The file offset of the original file extent before splitting.
> +        *
> +        * This is an in-memory-only member, mathcing

in-memory-only -> in-memory only
mathcing -> matching

> +        * em::start - btrfs_file_extent_item::offset for regular/preallocated

Instead of em, which is only a typical variable name we use for extent
maps, use 'extent_map' so that it's precise and leaves no room for
confusion.

> +        * extents. EXTENT_MAP_HOLE otherwise.
> +        */
>         u64 orig_start;
> +
> +       /*
> +        * The full on-disk extent length, matching
> +        * btrfs_file_extent_item::disk_num_bytes.
> +        */
>         u64 orig_block_len;
> +
> +       /*
> +        * The decompressed size of the whole on-disk extent, matching
> +        * btrfs_file_extent_item::ram_bytes.
> +        *
> +        * For non-compressed extents, this matches orig_block_len.
> +        */
>         u64 ram_bytes;
> +
> +       /*
> +        * The on-disk logical bytenr for the file extent.
> +        *
> +        * For compressed extents it matches btrfs_file_extent_item::disk_bytenr.
> +        * For uncompressed extents it matches
> +        * btrfs_file_extent_item::disk_bytenr + btrfs_file_extent_item::offset
> +        *
> +        * For hole extents it is EXTENT_MAP_HOLE and for inline extents it is
> +        * EXTENT_MAP_INLINE.
> +        */
>         u64 block_start;
> +
> +       /*
> +        * The on-disk length for the file extent.
> +        *
> +        * For compressed extents it matches btrfs_file_extent_item::disk_num_bytes.
> +        * For uncompressed extents it matches em::len.

Same as before, use 'extent_map::len' instead of 'em::len'.

> +        * Otherwise -1 (aka doesn't make much sense).

That is cryptic...
Should be something like:  "If the extent map represents a hole, then
it's -1 and shouldn't be used."

Thanks.



> +        */
>         u64 block_len;
>
>         /*
> --
> 2.44.0
>
>
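
As a reading aid, the relationships described in the proposed comments can be sketched in user-space C. This is not the kernel's code: struct fe_item, struct em_sketch and em_from_regular() are hypothetical names, and the sketch covers only regular (non-hole, non-inline) extents exactly as the comments under review describe them.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the relevant btrfs_file_extent_item fields. */
struct fe_item {
	uint64_t disk_bytenr;
	uint64_t disk_num_bytes;
	uint64_t offset;
	uint64_t num_bytes;
	uint64_t ram_bytes;
	int compressed;		/* non-zero if the extent is compressed */
};

/* Hypothetical stand-in for the extent_map members discussed above. */
struct em_sketch {
	uint64_t start;
	uint64_t len;
	uint64_t orig_start;
	uint64_t orig_block_len;
	uint64_t ram_bytes;
	uint64_t block_start;
	uint64_t block_len;
};

/*
 * Fill the sketch for a regular extent found at file offset file_pos,
 * following the member descriptions in the patch comments.
 */
struct em_sketch em_from_regular(uint64_t file_pos, const struct fe_item *fi)
{
	struct em_sketch em;

	assert(fi != NULL);

	em.start = file_pos;			/* key offset of the item */
	em.len = fi->num_bytes;
	em.orig_start = file_pos - fi->offset;	/* u64 arithmetic, may wrap */
	em.orig_block_len = fi->disk_num_bytes;
	em.ram_bytes = fi->ram_bytes;

	if (fi->compressed) {
		/* The whole on-disk extent has to be read and decompressed. */
		em.block_start = fi->disk_bytenr;
		em.block_len = fi->disk_num_bytes;
	} else {
		/* Only the referenced part of the extent is addressed. */
		em.block_start = fi->disk_bytenr + fi->offset;
		em.block_len = em.len;
	}
	return em;
}
```

For example, an uncompressed item at file offset 12288 with offset 4096 and num_bytes 4096 gives orig_start 8192, block_start equal to disk_bytenr + 4096 and block_len 4096. Holes and inline extents are left out on purpose, since they use the EXTENT_MAP_HOLE and EXTENT_MAP_INLINE markers instead of real addresses.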

Patch

diff --git a/fs/btrfs/extent_map.h b/fs/btrfs/extent_map.h
index c5a098c99cc6..30322defcd03 100644
--- a/fs/btrfs/extent_map.h
+++ b/fs/btrfs/extent_map.h
@@ -37,21 +37,81 @@  enum {
 };
 
 /*
+ * This extent_map structure is an in-memory representation of file extents,
+ * it would represent all file extents (including holes, no matter if we have
+ * hole file extents).
+ *
  * Keep this structure as compact as possible, as we can have really large
  * amounts of allocated extent maps at any time.
  */
 struct extent_map {
 	struct rb_node rb_node;
 
-	/* all of these are in bytes */
+	/* All of these are in bytes */
+
+	/*
+	 * File offset of the file extent. matching key.offset of
+	 * (INO EXTENT_DATA FILEPOS) key.
+	 */
 	u64 start;
+
+	/*
+	 * Length of the file extent.
+	 * For non-inlined file extents it's btrfs_file_extent_item::num_bytes.
+	 * For inlined file extents it's sectorsize. (as there is no reliable
+	 * btrfs_file_extent::num_bytes).
+	 */
 	u64 len;
+
+	/*
+	 * The modified range start/length, these are in-memory-only
+	 * members for fsync/logtree optimization.
+	 */
 	u64 mod_start;
 	u64 mod_len;
+
+	/*
+	 * The file offset of the original file extent before splitting.
+	 *
+	 * This is an in-memory-only member, mathcing
+	 * em::start - btrfs_file_extent_item::offset for regular/preallocated
+	 * extents. EXTENT_MAP_HOLE otherwise.
+	 */
 	u64 orig_start;
+
+	/*
+	 * The full on-disk extent length, matching
+	 * btrfs_file_extent_item::disk_num_bytes.
+	 */
 	u64 orig_block_len;
+
+	/*
+	 * The decompressed size of the whole on-disk extent, matching
+	 * btrfs_file_extent_item::ram_bytes.
+	 *
+	 * For non-compressed extents, this matches orig_block_len.
+	 */
 	u64 ram_bytes;
+
+	/*
+	 * The on-disk logical bytenr for the file extent.
+	 *
+	 * For compressed extents it matches btrfs_file_extent_item::disk_bytenr.
+	 * For uncompressed extents it matches
+	 * btrfs_file_extent_item::disk_bytenr + btrfs_file_extent_item::offset
+	 *
+	 * For hole extents it is EXTENT_MAP_HOLE and for inline extents it is
+	 * EXTENT_MAP_INLINE.
+	 */
 	u64 block_start;
+
+	/*
+	 * The on-disk length for the file extent.
+	 *
+	 * For compressed extents it matches btrfs_file_extent_item::disk_num_bytes.
+	 * For uncompressed extents it matches em::len.
+	 * Otherwise -1 (aka doesn't make much sense).
+	 */
 	u64 block_len;
 
 	/*