diff mbox series

btrfs: Refactor main loop in extent_readpages

Message ID 20181129164131.4716-1-nborisov@suse.com (mailing list archive)
State New, archived
Headers show
Series btrfs: Refactor main loop in extent_readpages | expand

Commit Message

Nikolay Borisov Nov. 29, 2018, 4:41 p.m. UTC
extent_readpages processes all pages in the readlist in batches of 16,
this is implemented by a single for loop but thanks to an if condition
the loop does 2 things based on whether we've filled the batch or not.
Additionally due to the structure of the code there is an additional
check which deals with partial batches.

Streamline all of this by explicitly using two loops. The outter one is
used to process all pages while the inner one just fills in the batch
of 16 (currently). Due to this new structure the code guarantees that
all pages are processed in the loop hence the code to deal with any
leftovers is eliminated.

This also enable the compiler to inline __extent_readpages:

	./scripts/bloat-o-meter fs/btrfs/extent_io.o extent_io.for

	add/remove: 0/1 grow/shrink: 1/0 up/down: 660/-820 (-160)
	Function                                     old     new   delta
	extent_readpages                             476    1136    +660
	__extent_readpages                           820       -    -820
	Total: Before=44315, After=44155, chg -0.36%

Signed-off-by: Nikolay Borisov <nborisov@suse.com>
---
 fs/btrfs/extent_io.c | 37 ++++++++++++++++---------------------
 1 file changed, 16 insertions(+), 21 deletions(-)

Comments

David Sterba Dec. 12, 2018, 2:40 p.m. UTC | #1
On Thu, Nov 29, 2018 at 06:41:31PM +0200, Nikolay Borisov wrote:
> extent_readpages processes all pages in the readlist in batches of 16,
> this is implemented by a single for loop but thanks to an if condition
> the loop does 2 things based on whether we've filled the batch or not.
> Additionally due to the structure of the code there is an additional
> check which deals with partial batches.
> 
> Streamline all of this by explicitly using two loops. The outter one is
> used to process all pages while the inner one just fills in the batch
> of 16 (currently). Due to this new structure the code guarantees that
> all pages are processed in the loop hence the code to deal with any
> leftovers is eliminated.
> 
> This also enable the compiler to inline __extent_readpages:
> 
> 	./scripts/bloat-o-meter fs/btrfs/extent_io.o extent_io.for
> 
> 	add/remove: 0/1 grow/shrink: 1/0 up/down: 660/-820 (-160)
> 	Function                                     old     new   delta
> 	extent_readpages                             476    1136    +660
> 	__extent_readpages                           820       -    -820
> 	Total: Before=44315, After=44155, chg -0.36%
> 
> Signed-off-by: Nikolay Borisov <nborisov@suse.com>
> ---
>  fs/btrfs/extent_io.c | 37 ++++++++++++++++---------------------
>  1 file changed, 16 insertions(+), 21 deletions(-)
> 
> diff --git a/fs/btrfs/extent_io.c b/fs/btrfs/extent_io.c
> index 8332c5f4b1c3..233f835dd6f8 100644
> --- a/fs/btrfs/extent_io.c
> +++ b/fs/btrfs/extent_io.c
> @@ -4093,43 +4093,38 @@ int extent_writepages(struct address_space *mapping,
>  int extent_readpages(struct address_space *mapping, struct list_head *pages,
>  		     unsigned nr_pages)

nr_pages becomes unused now. I've checked the MM code that calls here
and it's guaranteed that the number of list 'pages' members is same as
nr_pages.

>  {
> +#define BTRFS_PAGES_BATCH 16
> +
>  	struct bio *bio = NULL;
> -	unsigned page_idx;
>  	unsigned long bio_flags = 0;
> -	struct page *pagepool[16];
> -	struct page *page;
> +	struct page *pagepool[BTRFS_PAGES_BATCH];

I don't think we need the extra define, if

	struct page *pagepool[16];

	...

	for (nr = 0; nr < ARRAY_SIZE(pagepool) ...

Otherwise ok, nice cleanup.

Reviewed-by: David Sterba <dsterba@suse.com>
Nikolay Borisov Dec. 12, 2018, 3:16 p.m. UTC | #2
On 12.12.18 г. 16:40 ч., David Sterba wrote:
> On Thu, Nov 29, 2018 at 06:41:31PM +0200, Nikolay Borisov wrote:
>> extent_readpages processes all pages in the readlist in batches of 16,
>> this is implemented by a single for loop but thanks to an if condition
>> the loop does 2 things based on whether we've filled the batch or not.
>> Additionally due to the structure of the code there is an additional
>> check which deals with partial batches.
>>
>> Streamline all of this by explicitly using two loops. The outter one is
>> used to process all pages while the inner one just fills in the batch
>> of 16 (currently). Due to this new structure the code guarantees that
>> all pages are processed in the loop hence the code to deal with any
>> leftovers is eliminated.
>>
>> This also enable the compiler to inline __extent_readpages:
>>
>> 	./scripts/bloat-o-meter fs/btrfs/extent_io.o extent_io.for
>>
>> 	add/remove: 0/1 grow/shrink: 1/0 up/down: 660/-820 (-160)
>> 	Function                                     old     new   delta
>> 	extent_readpages                             476    1136    +660
>> 	__extent_readpages                           820       -    -820
>> 	Total: Before=44315, After=44155, chg -0.36%
>>
>> Signed-off-by: Nikolay Borisov <nborisov@suse.com>
>> ---
>>  fs/btrfs/extent_io.c | 37 ++++++++++++++++---------------------
>>  1 file changed, 16 insertions(+), 21 deletions(-)
>>
>> diff --git a/fs/btrfs/extent_io.c b/fs/btrfs/extent_io.c
>> index 8332c5f4b1c3..233f835dd6f8 100644
>> --- a/fs/btrfs/extent_io.c
>> +++ b/fs/btrfs/extent_io.c
>> @@ -4093,43 +4093,38 @@ int extent_writepages(struct address_space *mapping,
>>  int extent_readpages(struct address_space *mapping, struct list_head *pages,
>>  		     unsigned nr_pages)
> 
> nr_pages becomes unused now. I've checked the MM code that calls here
> and it's guaranteed that the number of list 'pages' members is same as
> nr_pages.

Indeed, I missed that.

> 
>>  {
>> +#define BTRFS_PAGES_BATCH 16
>> +
>>  	struct bio *bio = NULL;
>> -	unsigned page_idx;
>>  	unsigned long bio_flags = 0;
>> -	struct page *pagepool[16];
>> -	struct page *page;
>> +	struct page *pagepool[BTRFS_PAGES_BATCH];
> 
> I don't think we need the extra define, if
> 
> 	struct page *pagepool[16];
> 
> 	...
> 
> 	for (nr = 0; nr < ARRAY_SIZE(pagepool) ...
> 
> Otherwise ok, nice cleanup.

Fair enough, I'm ok with that as well.

> 
> Reviewed-by: David Sterba <dsterba@suse.com>
>
diff mbox series

Patch

diff --git a/fs/btrfs/extent_io.c b/fs/btrfs/extent_io.c
index 8332c5f4b1c3..233f835dd6f8 100644
--- a/fs/btrfs/extent_io.c
+++ b/fs/btrfs/extent_io.c
@@ -4093,43 +4093,38 @@  int extent_writepages(struct address_space *mapping,
 int extent_readpages(struct address_space *mapping, struct list_head *pages,
 		     unsigned nr_pages)
 {
+#define BTRFS_PAGES_BATCH 16
+
 	struct bio *bio = NULL;
-	unsigned page_idx;
 	unsigned long bio_flags = 0;
-	struct page *pagepool[16];
-	struct page *page;
+	struct page *pagepool[BTRFS_PAGES_BATCH];
 	struct extent_map *em_cached = NULL;
 	struct extent_io_tree *tree = &BTRFS_I(mapping->host)->io_tree;
 	int nr = 0;
 	u64 prev_em_start = (u64)-1;
 
-	for (page_idx = 0; page_idx < nr_pages; page_idx++) {
-		page = lru_to_page(pages);
+	while (!list_empty(pages)) {
+		for (nr = 0; nr < BTRFS_PAGES_BATCH && !list_empty(pages);) {
+			struct page *page = lru_to_page(pages);
 
-		prefetchw(&page->flags);
-		list_del(&page->lru);
-		if (add_to_page_cache_lru(page, mapping,
-					page->index,
-					readahead_gfp_mask(mapping))) {
-			put_page(page);
-			continue;
+			prefetchw(&page->flags);
+			list_del(&page->lru);
+			if (add_to_page_cache_lru(page, mapping, page->index,
+						readahead_gfp_mask(mapping))) {
+				put_page(page);
+				continue;
+			}
+
+			pagepool[nr++] = page;
 		}
 
-		pagepool[nr++] = page;
-		if (nr < ARRAY_SIZE(pagepool))
-			continue;
 		__extent_readpages(tree, pagepool, nr, &em_cached, &bio,
-				&bio_flags, &prev_em_start);
-		nr = 0;
+				   &bio_flags, &prev_em_start);
 	}
-	if (nr)
-		__extent_readpages(tree, pagepool, nr, &em_cached, &bio,
-				&bio_flags, &prev_em_start);
 
 	if (em_cached)
 		free_extent_map(em_cached);
 
-	BUG_ON(!list_empty(pages));
 	if (bio)
 		return submit_one_bio(bio, 0, bio_flags);
 	return 0;