
[02/32] btrfs: extent_io: integrate page status update into endio_readpage_release_extent()

Message ID 20201103133108.148112-3-wqu@suse.com (mailing list archive)
State New, archived
Series [01/32] btrfs: extent_io: remove the extent_start/extent_len for end_bio_extent_readpage()

Commit Message

Qu Wenruo Nov. 3, 2020, 1:30 p.m. UTC
In end_bio_extent_readpage(), we set the page uptodate or error according to
the bio status.  However, that assumes all submitted reads are page-sized.

To support cases like subpage read, we should only set the whole page
uptodate once all data in the page has been read from disk.

This patch will integrate the page status update into
endio_readpage_release_extent() for end_bio_extent_readpage().

Now in endio_readpage_release_extent() we will set the page uptodate if:

- the start/end range covers the full page
  This is already the existing behavior.

- the whole page range is already uptodate
  This adds support for subpage read.

And on the error path, we always clear the page uptodate flag and set the
page error flag.

Signed-off-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
---
 fs/btrfs/extent_io.c | 38 ++++++++++++++++++++++++++++----------
 1 file changed, 28 insertions(+), 10 deletions(-)

Comments

Nikolay Borisov Nov. 5, 2020, 10:26 a.m. UTC | #1
On 3.11.20 at 15:30, Qu Wenruo wrote:
> In end_bio_extent_readpage(), we set page uptodate or error according to
> the bio status.  However that assumes all submitted reads are in page
> size.
> 
> To support case like subpage read, we should only set the whole page
> uptodate if all data in the page have been read from disk.
> 
> This patch will integrate the page status update into
> endio_readpage_release_extent() for end_bio_extent_readpage().
> 
> Now in endio_readpage_release_extent() we will set the page uptodate if:
> 
> - start/end range covers the full page
>   This is the existing behavior already.
> 
> - the whole page range is already uptodate
>   This adds the support for subpage read.
> 
> And for the error path, we always clear the page uptodate and set the
> page error.
> 
> Signed-off-by: Qu Wenruo <wqu@suse.com>
> Signed-off-by: David Sterba <dsterba@suse.com>
> ---
>  fs/btrfs/extent_io.c | 38 ++++++++++++++++++++++++++++----------
>  1 file changed, 28 insertions(+), 10 deletions(-)
> 
> diff --git a/fs/btrfs/extent_io.c b/fs/btrfs/extent_io.c
> index 58dc55e1429d..228bf0c5f7a0 100644
> --- a/fs/btrfs/extent_io.c
> +++ b/fs/btrfs/extent_io.c
> @@ -2779,13 +2779,35 @@ static void end_bio_extent_writepage(struct bio *bio)
>  	bio_put(bio);
>  }
>  
> -static void endio_readpage_release_extent(struct extent_io_tree *tree, u64 start,
> -					  u64 end, int uptodate)
> +static void endio_readpage_release_extent(struct extent_io_tree *tree,
> +		struct page *page, u64 start, u64 end, int uptodate)
>  {
>  	struct extent_state *cached = NULL;
>  
> -	if (uptodate && tree->track_uptodate)
> -		set_extent_uptodate(tree, start, end, &cached, GFP_ATOMIC);
> +	if (uptodate) {
> +		u64 page_start = page_offset(page);
> +		u64 page_end = page_offset(page) + PAGE_SIZE - 1;
> +
> +		if (tree->track_uptodate) {
> +			/*
> +			 * The tree has EXTENT_UPTODATE bit tracking, update
> +			 * extent io tree, and use it to update the page if
> +			 * needed.
> +			 */
> +			set_extent_uptodate(tree, start, end, &cached, GFP_NOFS);
> +			check_page_uptodate(tree, page);
> +		} else if (start <= page_start && end >= page_end) {

'start' is 'page_offset(page) + bvec->bv_offset' from
end_bio_extent_readpage(), which means it is either equal to 'page_start' in
endio_readpage_release_extent() or greater; you are effectively checking
whether bvec->bv_offset is non-zero. As such, the '<' condition can never
trigger, so simplify this check to 'start == page_start'.

For 'end' it makes sense to check 'end >=' because of multi-page bvecs, I
guess.
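
For reference, a condensed sketch of that relationship, assuming 'start' is
computed as page_offset(page) + bvec->bv_offset in end_bio_extent_readpage()
(the names follow the quoted hunk and are shown only for illustration):

	u64 page_start = page_offset(page);
	u64 start = page_start + bvec->bv_offset;	/* bv_offset >= 0 */

	/*
	 * start can never be smaller than page_start, so the check
	 * "start <= page_start" only fires when bv_offset == 0, i.e.
	 * it is equivalent to "start == page_start".
	 */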

Also, the only relevant portion of this function is really
check_page_uptodate(). I don't see a reason why you put the page uptodate
handling into this function rather than simply adjusting the endio handler.



> +			/* We have covered the full page, set it uptodate */
> +			SetPageUptodate(page);
> +		}
> +	} else if (!uptodate){
> +		if (tree->track_uptodate)
> +			clear_extent_uptodate(tree, start, end, &cached);
> +
> +		/* Any error in the page range would invalid the uptodate bit */

nit: invalidate the whole page

<snip>
Nikolay Borisov Nov. 5, 2020, 10:35 a.m. UTC | #2
On 3.11.20 at 15:30, Qu Wenruo wrote:
> In end_bio_extent_readpage(), we set page uptodate or error according to
> the bio status.  However that assumes all submitted reads are in page
> size.
> 
> To support case like subpage read, we should only set the whole page
> uptodate if all data in the page have been read from disk.
> 
> This patch will integrate the page status update into
> endio_readpage_release_extent() for end_bio_extent_readpage().
> 
> Now in endio_readpage_release_extent() we will set the page uptodate if:
> 
> - start/end range covers the full page
>   This is the existing behavior already.
> 
> - the whole page range is already uptodate
>   This adds the support for subpage read.
> 
> And for the error path, we always clear the page uptodate and set the
> page error.
> 
> Signed-off-by: Qu Wenruo <wqu@suse.com>
> Signed-off-by: David Sterba <dsterba@suse.com>
> ---
>  fs/btrfs/extent_io.c | 38 ++++++++++++++++++++++++++++----------
>  1 file changed, 28 insertions(+), 10 deletions(-)
> 
> diff --git a/fs/btrfs/extent_io.c b/fs/btrfs/extent_io.c
> index 58dc55e1429d..228bf0c5f7a0 100644
> --- a/fs/btrfs/extent_io.c
> +++ b/fs/btrfs/extent_io.c
> @@ -2779,13 +2779,35 @@ static void end_bio_extent_writepage(struct bio *bio)
>  	bio_put(bio);
>  }
>  
> -static void endio_readpage_release_extent(struct extent_io_tree *tree, u64 start,
> -					  u64 end, int uptodate)
> +static void endio_readpage_release_extent(struct extent_io_tree *tree,
> +		struct page *page, u64 start, u64 end, int uptodate)
>  {
>  	struct extent_state *cached = NULL;
>  
> -	if (uptodate && tree->track_uptodate)
> -		set_extent_uptodate(tree, start, end, &cached, GFP_ATOMIC);
> +	if (uptodate) {
> +		u64 page_start = page_offset(page);
> +		u64 page_end = page_offset(page) + PAGE_SIZE - 1;
> +
> +		if (tree->track_uptodate) {
> +			/*
> +			 * The tree has EXTENT_UPTODATE bit tracking, update
> +			 * extent io tree, and use it to update the page if
> +			 * needed.
> +			 */
> +			set_extent_uptodate(tree, start, end, &cached, GFP_NOFS);
> +			check_page_uptodate(tree, page);
> +		} else if (start <= page_start && end >= page_end) {
> +			/* We have covered the full page, set it uptodate */
> +			SetPageUptodate(page);
> +		}
> +	} else if (!uptodate){
> +		if (tree->track_uptodate)
> +			clear_extent_uptodate(tree, start, end, &cached);

Hm, that call to clear_extent_uptodate() was absent before, so either:

a) the old code is buggy since it misses it, or
b) this will be a no-op, because we have just read the extent and haven't
really set it uptodate, so there won't be anything to clear?

Which is it?

<snip>
Qu Wenruo Nov. 5, 2020, 11:15 a.m. UTC | #3
On 2020/11/5 6:26 PM, Nikolay Borisov wrote:
>
>
> On 3.11.20 at 15:30, Qu Wenruo wrote:
>> In end_bio_extent_readpage(), we set page uptodate or error according to
>> the bio status.  However that assumes all submitted reads are in page
>> size.
>>
>> To support case like subpage read, we should only set the whole page
>> uptodate if all data in the page have been read from disk.
>>
>> This patch will integrate the page status update into
>> endio_readpage_release_extent() for end_bio_extent_readpage().
>>
>> Now in endio_readpage_release_extent() we will set the page uptodate if:
>>
>> - start/end range covers the full page
>>   This is the existing behavior already.
>>
>> - the whole page range is already uptodate
>>   This adds the support for subpage read.
>>
>> And for the error path, we always clear the page uptodate and set the
>> page error.
>>
>> Signed-off-by: Qu Wenruo <wqu@suse.com>
>> Signed-off-by: David Sterba <dsterba@suse.com>
>> ---
>>  fs/btrfs/extent_io.c | 38 ++++++++++++++++++++++++++++----------
>>  1 file changed, 28 insertions(+), 10 deletions(-)
>>
>> diff --git a/fs/btrfs/extent_io.c b/fs/btrfs/extent_io.c
>> index 58dc55e1429d..228bf0c5f7a0 100644
>> --- a/fs/btrfs/extent_io.c
>> +++ b/fs/btrfs/extent_io.c
>> @@ -2779,13 +2779,35 @@ static void end_bio_extent_writepage(struct bio *bio)
>>  	bio_put(bio);
>>  }
>>
>> -static void endio_readpage_release_extent(struct extent_io_tree *tree, u64 start,
>> -					  u64 end, int uptodate)
>> +static void endio_readpage_release_extent(struct extent_io_tree *tree,
>> +		struct page *page, u64 start, u64 end, int uptodate)
>>  {
>>  	struct extent_state *cached = NULL;
>>
>> -	if (uptodate && tree->track_uptodate)
>> -		set_extent_uptodate(tree, start, end, &cached, GFP_ATOMIC);
>> +	if (uptodate) {
>> +		u64 page_start = page_offset(page);
>> +		u64 page_end = page_offset(page) + PAGE_SIZE - 1;
>> +
>> +		if (tree->track_uptodate) {
>> +			/*
>> +			 * The tree has EXTENT_UPTODATE bit tracking, update
>> +			 * extent io tree, and use it to update the page if
>> +			 * needed.
>> +			 */
>> +			set_extent_uptodate(tree, start, end, &cached, GFP_NOFS);
>> +			check_page_uptodate(tree, page);
>> +		} else if (start <= page_start && end >= page_end) {
>
> 'start' is 'page_offset(page) + bvec->bv_offset' from
> end_bio_extent_readpage, this means it's either equal to 'page_start' in
> endio_readpage_release_extent or different, you are effectively checking
> if bvec->bv_offset is non zero. As such the '<' condition can never
> trigger. So simplify this check to start == page_start

Nope. Don't forget the objective of the whole patchset: to support subpage.

That means we could have cases like bvec->bv_offset == 4K and bvec->bv_len
== 16K, if we are reading one 16K extent on a 64K page size system.

In that case, we could have start == 4K and end == (20K - 1).
If the tree needs track_uptodate, then we take the set_extent_uptodate() call
and re-check whether we need to mark the full page uptodate.

But if we don't need track_uptodate (mostly for the btree inode), then we take
the else branch, and if that check fails, we do nothing (expected).

This if branch is a little tricky, as it in fact checks two things and thus
has four combinations (sketched after this list):
1) Need track_uptodate, and covers the full page
   Set the extent range uptodate, and set the page Uptodate.

2) Need track_uptodate, and doesn't cover the full page
   Set the extent range uptodate, and check if the full page range has
   the UPTODATE bit set. If it has, set the page uptodate.

3) Don't need track_uptodate, and covers the full page
   Just set the page Uptodate.

4) Don't need track_uptodate, and doesn't cover the full page
   Do nothing.
   Although this is an invalid case:
   for subpage we set tree->track_uptodate in later patches, and
   for the regular case we always have a range covering a full page.
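
A condensed, annotated sketch of the hunk quoted above, mapping the code to
the four combinations (not compilable on its own, and only meant to line the
cases up with the list):

	if (uptodate) {
		if (tree->track_uptodate) {
			/*
			 * Cases 1) and 2): record the range in the extent io
			 * tree, then let check_page_uptodate() decide whether
			 * the full page range now has EXTENT_UPTODATE set.
			 */
			set_extent_uptodate(tree, start, end, &cached, GFP_NOFS);
			check_page_uptodate(tree, page);
		} else if (start <= page_start && end >= page_end) {
			/* Case 3): the bio range covers the full page. */
			SetPageUptodate(page);
		}
		/* Case 4): no tracking and partial coverage -- do nothing. */
	}

Taking the 64K example above with the page at file offset 0: page_start == 0,
page_end == 64K - 1, start == 4K and end == 20K - 1, so the else-if never
triggers and only the track_uptodate path can end up marking the page uptodate.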

>
> For 'end' it makes sense to check for end >= becuase of multipage bvec I
> guess.
>
> Also the only relevant portion in this function is really
> check_page_uptodate  I don't see a reason why you actually put the page
> uptodate into this function and not simply adjust the endio handler?

Because we are handling the page status update here (with the above four
combinations), it seems completely sane to me to do it in this function.

Or did I miss something?

Thanks,
Qu

>
>
>
>> +			/* We have covered the full page, set it uptodate */
>> +			SetPageUptodate(page);
>> +		}
>> +	} else if (!uptodate){
>> +		if (tree->track_uptodate)
>> +			clear_extent_uptodate(tree, start, end, &cached);
>> +
>> +		/* Any error in the page range would invalid the uptodate bit */
>
> nit: invalidate the whole page
>
> <snip>
>
Qu Wenruo Nov. 5, 2020, 11:25 a.m. UTC | #4
On 2020/11/5 6:35 PM, Nikolay Borisov wrote:
>
>
> On 3.11.20 at 15:30, Qu Wenruo wrote:
>> In end_bio_extent_readpage(), we set page uptodate or error according to
>> the bio status.  However that assumes all submitted reads are in page
>> size.
>>
>> To support case like subpage read, we should only set the whole page
>> uptodate if all data in the page have been read from disk.
>>
>> This patch will integrate the page status update into
>> endio_readpage_release_extent() for end_bio_extent_readpage().
>>
>> Now in endio_readpage_release_extent() we will set the page uptodate if:
>>
>> - start/end range covers the full page
>>   This is the existing behavior already.
>>
>> - the whole page range is already uptodate
>>   This adds the support for subpage read.
>>
>> And for the error path, we always clear the page uptodate and set the
>> page error.
>>
>> Signed-off-by: Qu Wenruo <wqu@suse.com>
>> Signed-off-by: David Sterba <dsterba@suse.com>
>> ---
>>  fs/btrfs/extent_io.c | 38 ++++++++++++++++++++++++++++----------
>>  1 file changed, 28 insertions(+), 10 deletions(-)
>>
>> diff --git a/fs/btrfs/extent_io.c b/fs/btrfs/extent_io.c
>> index 58dc55e1429d..228bf0c5f7a0 100644
>> --- a/fs/btrfs/extent_io.c
>> +++ b/fs/btrfs/extent_io.c
>> @@ -2779,13 +2779,35 @@ static void end_bio_extent_writepage(struct bio *bio)
>>  	bio_put(bio);
>>  }
>>
>> -static void endio_readpage_release_extent(struct extent_io_tree *tree, u64 start,
>> -					  u64 end, int uptodate)
>> +static void endio_readpage_release_extent(struct extent_io_tree *tree,
>> +		struct page *page, u64 start, u64 end, int uptodate)
>>  {
>>  	struct extent_state *cached = NULL;
>>
>> -	if (uptodate && tree->track_uptodate)
>> -		set_extent_uptodate(tree, start, end, &cached, GFP_ATOMIC);
>> +	if (uptodate) {
>> +		u64 page_start = page_offset(page);
>> +		u64 page_end = page_offset(page) + PAGE_SIZE - 1;
>> +
>> +		if (tree->track_uptodate) {
>> +			/*
>> +			 * The tree has EXTENT_UPTODATE bit tracking, update
>> +			 * extent io tree, and use it to update the page if
>> +			 * needed.
>> +			 */
>> +			set_extent_uptodate(tree, start, end, &cached, GFP_NOFS);
>> +			check_page_uptodate(tree, page);
>> +		} else if (start <= page_start && end >= page_end) {
>> +			/* We have covered the full page, set it uptodate */
>> +			SetPageUptodate(page);
>> +		}
>> +	} else if (!uptodate){
>> +		if (tree->track_uptodate)
>> +			clear_extent_uptodate(tree, start, end, &cached);
>
> Hm, that call to clear_extent_uptodate was absent before, so either:
>
> a) The old code is buggy since it misses it
> b) this will be a nullop because we have just read the extent and we
> haven't really set it to uptodate so there won't be anything to clear?

You're right, we don't need this: since the page hasn't been read from disk
yet, the range should not have the EXTENT_UPTODATE bit at all, nor should the
page have the PageUptodate flag.

Thanks
Qu

>
> Which is it?
>
> <snip>
>
Josef Bacik Nov. 5, 2020, 7:34 p.m. UTC | #5
On 11/3/20 8:30 AM, Qu Wenruo wrote:
> In end_bio_extent_readpage(), we set page uptodate or error according to
> the bio status.  However that assumes all submitted reads are in page
> size.
> 
> To support case like subpage read, we should only set the whole page
> uptodate if all data in the page have been read from disk.
> 
> This patch will integrate the page status update into
> endio_readpage_release_extent() for end_bio_extent_readpage().
> 
> Now in endio_readpage_release_extent() we will set the page uptodate if:
> 
> - start/end range covers the full page
>    This is the existing behavior already.
> 
> - the whole page range is already uptodate
>    This adds the support for subpage read.
> 
> And for the error path, we always clear the page uptodate and set the
> page error.
> 
> Signed-off-by: Qu Wenruo <wqu@suse.com>
> Signed-off-by: David Sterba <dsterba@suse.com>
> ---
>   fs/btrfs/extent_io.c | 38 ++++++++++++++++++++++++++++----------
>   1 file changed, 28 insertions(+), 10 deletions(-)
> 
> diff --git a/fs/btrfs/extent_io.c b/fs/btrfs/extent_io.c
> index 58dc55e1429d..228bf0c5f7a0 100644
> --- a/fs/btrfs/extent_io.c
> +++ b/fs/btrfs/extent_io.c
> @@ -2779,13 +2779,35 @@ static void end_bio_extent_writepage(struct bio *bio)
>   	bio_put(bio);
>   }
>   
> -static void endio_readpage_release_extent(struct extent_io_tree *tree, u64 start,
> -					  u64 end, int uptodate)
> +static void endio_readpage_release_extent(struct extent_io_tree *tree,
> +		struct page *page, u64 start, u64 end, int uptodate)
>   {
>   	struct extent_state *cached = NULL;
>   
> -	if (uptodate && tree->track_uptodate)
> -		set_extent_uptodate(tree, start, end, &cached, GFP_ATOMIC);
> +	if (uptodate) {
> +		u64 page_start = page_offset(page);
> +		u64 page_end = page_offset(page) + PAGE_SIZE - 1;
> +
> +		if (tree->track_uptodate) {
> +			/*
> +			 * The tree has EXTENT_UPTODATE bit tracking, update
> +			 * extent io tree, and use it to update the page if
> +			 * needed.
> +			 */
> +			set_extent_uptodate(tree, start, end, &cached, GFP_NOFS);

Why is the switch from GFP_ATOMIC to GFP_NOFS safe here?  If it is, it should
be in its own patch with its own explanation.

> +			check_page_uptodate(tree, page);
> +		} else if (start <= page_start && end >= page_end) {
> +			/* We have covered the full page, set it uptodate */
> +			SetPageUptodate(page);
> +		}
> +	} else if (!uptodate){

} else if (!uptodate) {

> +		if (tree->track_uptodate)
> +			clear_extent_uptodate(tree, start, end, &cached);
> +

And this is new.  Please keep logical changes separate.  In this patch you are:

1) Changing the GFP pretty majorly.
2) Cleaning up error handling to handle ranges properly.
3) Changing the behavior of EXTENT_UPTODATE for ->track_uptodate trees.

These each require their own explanation and commit so I can understand why 
they're safe to do.  Thanks,

Josef

Patch

diff --git a/fs/btrfs/extent_io.c b/fs/btrfs/extent_io.c
index 58dc55e1429d..228bf0c5f7a0 100644
--- a/fs/btrfs/extent_io.c
+++ b/fs/btrfs/extent_io.c
@@ -2779,13 +2779,35 @@  static void end_bio_extent_writepage(struct bio *bio)
 	bio_put(bio);
 }
 
-static void endio_readpage_release_extent(struct extent_io_tree *tree, u64 start,
-					  u64 end, int uptodate)
+static void endio_readpage_release_extent(struct extent_io_tree *tree,
+		struct page *page, u64 start, u64 end, int uptodate)
 {
 	struct extent_state *cached = NULL;
 
-	if (uptodate && tree->track_uptodate)
-		set_extent_uptodate(tree, start, end, &cached, GFP_ATOMIC);
+	if (uptodate) {
+		u64 page_start = page_offset(page);
+		u64 page_end = page_offset(page) + PAGE_SIZE - 1;
+
+		if (tree->track_uptodate) {
+			/*
+			 * The tree has EXTENT_UPTODATE bit tracking, update
+			 * extent io tree, and use it to update the page if
+			 * needed.
+			 */
+			set_extent_uptodate(tree, start, end, &cached, GFP_NOFS);
+			check_page_uptodate(tree, page);
+		} else if (start <= page_start && end >= page_end) {
+			/* We have covered the full page, set it uptodate */
+			SetPageUptodate(page);
+		}
+	} else if (!uptodate){
+		if (tree->track_uptodate)
+			clear_extent_uptodate(tree, start, end, &cached);
+
+		/* Any error in the page range would invalid the uptodate bit */
+		ClearPageUptodate(page);
+		SetPageError(page);
+	}
 	unlock_extent_cached_atomic(tree, start, end, &cached);
 }
 
@@ -2910,15 +2932,11 @@  static void end_bio_extent_readpage(struct bio *bio)
 			off = offset_in_page(i_size);
 			if (page->index == end_index && off)
 				zero_user_segment(page, off, PAGE_SIZE);
-			SetPageUptodate(page);
-		} else {
-			ClearPageUptodate(page);
-			SetPageError(page);
 		}
-		unlock_page(page);
 		offset += len;
 
-		endio_readpage_release_extent(tree, start, end, uptodate);
+		endio_readpage_release_extent(tree, page, start, end, uptodate);
+		unlock_page(page);
 	}
 
 	btrfs_io_bio_free_csum(io_bio);