
btrfs: add a debug accounting for eb pages contiguousness

Message ID 20230707084027.91022-1-wqu@suse.com (mailing list archive)
State New, archived
Series btrfs: add a debug accounting for eb pages contiguousness

Commit Message

Qu Wenruo July 7, 2023, 8:40 a.m. UTC
!!! DO NOT MERGE !!!

Although the current folio interface does not yet provide a way to
allocate a range of contiguous pages, I have always been curious
whether we can get rid of the cross-page handling.

So this patch is an attempt to gather some benchmark numbers on
extent buffer page contiguousness.

The patch handles this by:

- Every allocated extent buffer (except dummy ones) increases
  fs_info->eb_allocated

- If the pages of the extent buffer are contiguous, increase
  fs_info->eb_pages_contig

- On close_ctree(), output both numbers
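
The per-buffer check boils down to walking the backing pages and
comparing neighbours. As a hypothetical userspace analogue (the
function name and the page-frame-number representation are
illustrative assumptions, not kernel code), the same logic looks
like:

```c
#include <stdbool.h>
#include <stddef.h>

/*
 * Mirror of the "eb->pages[i - 1] + 1 != p" check in
 * alloc_extent_buffer(): the buffer counts as contiguous only if
 * every backing page directly follows the previous one.
 */
static bool eb_pages_contig(const unsigned long *pfns, size_t nr)
{
	for (size_t i = 1; i < nr; i++)
		if (pfns[i - 1] + 1 != pfns[i])
			return false;
	return true;
}
```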

The VM has the following setup for benchmark:

- Mem: 4G
- CPU: 8 vCores
- Disk: 500G SATA SSD passed through directly to the VM
- Fs: 2x10G LV, 16K nodesize, 4K sectorsize

I tested two situations:

- Metadata heavy workload
  The workload is "fsstress -p 20 -n 10000 -w -d $mnt".

  The result is 566140 / 854009 = 66.3%, which is better than I
  expected.

- Data heavy workload
  The workload is fio with a 12G file (3x system memory).

  The result is 1355 / 2358 = 57.5%

Considering both ratios are above 50%, I believe it is worth
considering reducing the cross-page boundary overhead.
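
For reference, the two percentages above can be reproduced with a
trivial helper (hypothetical, just arithmetic on the numbers from the
two runs):

```c
/* Percentage of extent buffers whose backing pages were contiguous. */
static double contig_pct(long long contig, long long total)
{
	return 100.0 * (double)contig / (double)total;
}
```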

Signed-off-by: Qu Wenruo <wqu@suse.com>
---
 fs/btrfs/disk-io.c   | 5 +++++
 fs/btrfs/extent_io.c | 8 +++++++-
 fs/btrfs/extent_io.h | 3 +++
 fs/btrfs/fs.h        | 2 ++
 4 files changed, 17 insertions(+), 1 deletion(-)

Comments

Christoph Hellwig July 7, 2023, 11:39 a.m. UTC | #1
On Fri, Jul 07, 2023 at 04:40:27PM +0800, Qu Wenruo wrote:
> !!! DO NOT MERGE !!!
> 
> Although the current folio interface is not yet providing a way to
> allocate a range of contiguous pages,

What do you mean?  Both folios and compound pages will allocate larger
than PAGE_SIZE objects if you ask for it.
Qu Wenruo July 7, 2023, 11:45 a.m. UTC | #2
On 2023/7/7 19:39, Christoph Hellwig wrote:
> On Fri, Jul 07, 2023 at 04:40:27PM +0800, Qu Wenruo wrote:
>> !!! DO NOT MERGE !!!
>>
>> Although the current folio interface is not yet providing a way to
>> allocate a range of contiguous pages,
>
> What do you mean?  Both folios and compound pages will allocate larger
> than PAGE_SIZE objects if you ask for it.
>

I mean a multi-page/folios version of find_or_create_page().

We have filemap_get_folios(), but we cannot use it to create new pages,
while find_or_create_page()/pagecache_get_page() can only get one page.

What I want here is something like find_or_create_folios() or
pagecache_get_pages().

Thanks,
Qu
Christoph Hellwig July 7, 2023, 11:51 a.m. UTC | #3
On Fri, Jul 07, 2023 at 07:45:41PM +0800, Qu Wenruo wrote:
> I mean a multi-page/folios version of find_or_create_page().

__filemap_get_folio() with FGP_CREAT.
Qu Wenruo July 9, 2023, 4:44 a.m. UTC | #4
On 2023/7/7 19:51, Christoph Hellwig wrote:
> On Fri, Jul 07, 2023 at 07:45:41PM +0800, Qu Wenruo wrote:
>> I mean a multi-page/folios version of find_or_create_page().
>
> __filemap_get_folio() with FGP_CREAT.
>
Unfortunately __filemap_get_folio() still only creates folios with
order 0, under the no_page label:

	folio = filemap_alloc_folio(gfp, 0);

So in that case, it's really no different from find_or_create_page()
for new pages/folios.

Thanks,
Qu
Christoph Hellwig July 10, 2023, 5:28 a.m. UTC | #5
On Sun, Jul 09, 2023 at 12:44:33PM +0800, Qu Wenruo wrote:
> Unfortunately __filemap_get_folio() still only creates folios with
> order 0, under the no_page label:
> 
> 	folio = filemap_alloc_folio(gfp, 0);
> 
> So in that case, it's really no different from find_or_create_page()
> for new pages/folios.

You're right, willy's patches to pass the order haven't actually
made it to mainline yet; I assumed they had.  So you'll have
to wait another merge window.
Qu Wenruo July 10, 2023, 5:52 a.m. UTC | #6
On 2023/7/10 13:28, Christoph Hellwig wrote:
> On Sun, Jul 09, 2023 at 12:44:33PM +0800, Qu Wenruo wrote:
>> Unfortunately __filemap_get_folio() still only creates folios with
>> order 0, under the no_page label:
>>
>> 	folio = filemap_alloc_folio(gfp, 0);
>>
>> So in that case, it's really no different from find_or_create_page()
>> for new pages/folios.
> 
> You're right, willy's patches to pass the order haven't actually
> made it to mainline yet; I assumed they had.  So you'll have
> to wait another merge window.

I'm considering another solution like this:

	folio = filemap_alloc_folio();
	do_something_with_folio();
	ret = filemap_add_folio();

This is not that complex, and the pattern is used pretty much
everywhere in mm code.

Although I still need the old per-page allocation as a fallback, which 
may make it more complex than expected.
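
To illustrate the try-contiguous-then-fall-back idea: the sketch below
is a userspace analogue only (all names, the fixed page count, and the
page size are assumptions for illustration, not the kernel API):

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

#define SKETCH_PAGE_SIZE 4096

struct eb_sketch {
	void *pages[16];
	size_t nr;
	bool contig;	/* true when backed by one contiguous chunk */
};

/*
 * Try one large allocation first (standing in for a higher-order
 * folio); fall back to per-page allocations when it fails.
 */
static bool eb_alloc_pages(struct eb_sketch *eb, size_t nr)
{
	char *chunk = malloc(nr * SKETCH_PAGE_SIZE);

	eb->nr = nr;
	if (chunk) {
		for (size_t i = 0; i < nr; i++)
			eb->pages[i] = chunk + i * SKETCH_PAGE_SIZE;
		eb->contig = true;
		return true;
	}
	/* Fallback: individual allocations, no contiguity guarantee. */
	for (size_t i = 0; i < nr; i++) {
		eb->pages[i] = malloc(SKETCH_PAGE_SIZE);
		if (!eb->pages[i])
			return false;
	}
	eb->contig = false;
	return true;
}
```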

Thanks,
Qu

Patch

diff --git a/fs/btrfs/disk-io.c b/fs/btrfs/disk-io.c
index 7513388b0567..4ba51e1235c4 100644
--- a/fs/btrfs/disk-io.c
+++ b/fs/btrfs/disk-io.c
@@ -2698,6 +2698,8 @@  void btrfs_init_fs_info(struct btrfs_fs_info *fs_info)
 	INIT_LIST_HEAD(&fs_info->allocated_roots);
 	INIT_LIST_HEAD(&fs_info->allocated_ebs);
 	spin_lock_init(&fs_info->eb_leak_lock);
+	atomic64_set(&fs_info->eb_allocated, 0);
+	atomic64_set(&fs_info->eb_pages_contig, 0);
 #endif
 	extent_map_tree_init(&fs_info->mapping_tree);
 	btrfs_init_block_rsv(&fs_info->global_block_rsv,
@@ -4321,6 +4323,9 @@  void __cold close_ctree(struct btrfs_fs_info *fs_info)
 	/* Cancel or finish ongoing discard work */
 	btrfs_discard_cleanup(fs_info);
 
+	btrfs_info(fs_info, "eb contig=%llu allocated=%llu",
+			atomic64_read(&fs_info->eb_pages_contig),
+			atomic64_read(&fs_info->eb_allocated));
 	if (!sb_rdonly(fs_info->sb)) {
 		/*
 		 * The cleaner kthread is stopped, so do one final pass over
diff --git a/fs/btrfs/extent_io.c b/fs/btrfs/extent_io.c
index a91d5ad27984..995205441937 100644
--- a/fs/btrfs/extent_io.c
+++ b/fs/btrfs/extent_io.c
@@ -3530,6 +3530,7 @@  struct extent_buffer *alloc_extent_buffer(struct btrfs_fs_info *fs_info,
 	struct page *p;
 	struct address_space *mapping = fs_info->btree_inode->i_mapping;
 	u64 lockdep_owner = owner_root;
+	bool pages_contig = true;
 	int uptodate = 1;
 	int ret;
 
@@ -3624,6 +3625,8 @@  struct extent_buffer *alloc_extent_buffer(struct btrfs_fs_info *fs_info,
 		eb->pages[i] = p;
 		if (!btrfs_page_test_uptodate(fs_info, p, eb->start, eb->len))
 			uptodate = 0;
+		if (i && eb->pages[i - 1] + 1 != p)
+			pages_contig = false;
 
 		/*
 		 * We can't unlock the pages just yet since the extent buffer
@@ -3657,7 +3660,10 @@  struct extent_buffer *alloc_extent_buffer(struct btrfs_fs_info *fs_info,
 	/* add one reference for the tree */
 	check_buffer_tree_ref(eb);
 	set_bit(EXTENT_BUFFER_IN_TREE, &eb->bflags);
-
+	eb->pages_contig = pages_contig;
+	atomic64_inc(&fs_info->eb_allocated);
+	if (pages_contig)
+		atomic64_inc(&fs_info->eb_pages_contig);
 	/*
 	 * Now it's safe to unlock the pages because any calls to
 	 * btree_release_folio will correctly detect that a page belongs to a
diff --git a/fs/btrfs/extent_io.h b/fs/btrfs/extent_io.h
index c5fae3a7d911..98b596bbac2e 100644
--- a/fs/btrfs/extent_io.h
+++ b/fs/btrfs/extent_io.h
@@ -86,6 +86,9 @@  struct extent_buffer {
 	/* >= 0 if eb belongs to a log tree, -1 otherwise */
 	s8 log_index;
 
+	/* Set if the pages are contiguous. */
+	bool pages_contig;
+
 	struct rw_semaphore lock;
 
 	struct page *pages[INLINE_EXTENT_BUFFER_PAGES];
diff --git a/fs/btrfs/fs.h b/fs/btrfs/fs.h
index 203d2a267828..8cdd2af98bd6 100644
--- a/fs/btrfs/fs.h
+++ b/fs/btrfs/fs.h
@@ -802,6 +802,8 @@  struct btrfs_fs_info {
 
 	spinlock_t eb_leak_lock;
 	struct list_head allocated_ebs;
+	atomic64_t eb_allocated;
+	atomic64_t eb_pages_contig;
 #endif
 };