diff mbox series

btrfs: fix a bug that btrfs_invalidapge() can double account ordered extent for subpage

Message ID 20210127063848.72528-1-wqu@suse.com (mailing list archive)
State New, archived
Headers show
Series btrfs: fix a bug that btrfs_invalidapge() can double account ordered extent for subpage | expand

Commit Message

Qu Wenruo Jan. 27, 2021, 6:38 a.m. UTC
Commit dbfdb6d1b369 ("Btrfs: Search for all ordered extents that could
span across a page") make btrfs_invalidapage() to search all ordered
extents.

The offending code looks like this:

again:
	start = page_start;
	ordered = btrfs_lookup_ordered_range(inode, start, page_end - start + 1);
	if (ordred) {
		end = min(page_end,
			  ordered->file_offset + ordered->num_bytes - 1);

		/* Do the cleanup */

		start = end + 1;
		if (start < page_end)
			goto again;
	}

The behavior is indeed necessary for the incoming subpage support, but
when it iterate through all the ordered extents, it also resets the
search range @start.

This means, for the following cases, we can double account the ordered
extents, causing its bytes_left underflow:

	Page offset
	0		16K		32K
	|<--- OE 1  --->|<--- OE 2 ---->|

As the first iteration will find OE 1, which doesn't cover the full
page, thus after cleanup code, we need to retry again.
But again label will reset start to page_start, and we got OE 1 again,
which causes double account on OE1, and cause OE1's byte_left to
underflow.

The only good news is, this problem can only happen for subpage case, as
for regular sectorsize == PAGE_SIZE case, we will always find a OE ends
at or after page end, thus no way to trigger the problem.

This patch will move the again label after start = page_start, to fix
the bug.
This is just a quick fix, which is easy to backport.

There will be more comprehensive rework to convert the open coded loop to
a proper while loop.

Fixes: dbfdb6d1b369 ("Btrfs: Search for all ordered extents that could span across a page")
Signed-off-by: Qu Wenruo <wqu@suse.com>
---
 fs/btrfs/inode.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

Comments

Filipe Manana Jan. 27, 2021, 10:44 a.m. UTC | #1
On Wed, Jan 27, 2021 at 8:29 AM Qu Wenruo <wqu@suse.com> wrote:
>
> Commit dbfdb6d1b369 ("Btrfs: Search for all ordered extents that could
> span across a page") make btrfs_invalidapage() to search all ordered
> extents.
>
> The offending code looks like this:
>
> again:
>         start = page_start;
>         ordered = btrfs_lookup_ordered_range(inode, start, page_end - start + 1);
>         if (ordred) {
>                 end = min(page_end,
>                           ordered->file_offset + ordered->num_bytes - 1);
>
>                 /* Do the cleanup */
>
>                 start = end + 1;
>                 if (start < page_end)
>                         goto again;
>         }
>
> The behavior is indeed necessary for the incoming subpage support, but
> when it iterate through all the ordered extents, it also resets the
> search range @start.
>
> This means, for the following cases, we can double account the ordered
> extents, causing its bytes_left underflow:
>
>         Page offset
>         0               16K             32K
>         |<--- OE 1  --->|<--- OE 2 ---->|
>
> As the first iteration will find OE 1, which doesn't cover the full
> page, thus after cleanup code, we need to retry again.
> But again label will reset start to page_start, and we got OE 1 again,
> which causes double account on OE1, and cause OE1's byte_left to
> underflow.
>
> The only good news is, this problem can only happen for subpage case, as
> for regular sectorsize == PAGE_SIZE case, we will always find a OE ends
> at or after page end, thus no way to trigger the problem.
>
> This patch will move the again label after start = page_start, to fix
> the bug.
> This is just a quick fix, which is easy to backport.

Hum? Why does it need to be backported?
There are no kernel releases that support subpage sector size yet and
this does not affect the case where sector size is the same as the
page size.

>
> There will be more comprehensive rework to convert the open coded loop to
> a proper while loop.
>
> Fixes: dbfdb6d1b369 ("Btrfs: Search for all ordered extents that could span across a page")
> Signed-off-by: Qu Wenruo <wqu@suse.com>

Anyway, it looks good to me, thanks.

Reviewed-by: Filipe Manana <fdmanana@suse.com>

> ---
>  fs/btrfs/inode.c | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
>
> diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c
> index ef6cb7b620d0..2eea7d22405a 100644
> --- a/fs/btrfs/inode.c
> +++ b/fs/btrfs/inode.c
> @@ -8184,8 +8184,9 @@ static void btrfs_invalidatepage(struct page *page, unsigned int offset,
>
>         if (!inode_evicting)
>                 lock_extent_bits(tree, page_start, page_end, &cached_state);
> -again:
> +
>         start = page_start;
> +again:
>         ordered = btrfs_lookup_ordered_range(inode, start, page_end - start + 1);
>         if (ordered) {
>                 found_ordered = true;
> --
> 2.30.0
>
Qu Wenruo Jan. 27, 2021, 10:52 a.m. UTC | #2
On 2021/1/27 下午6:44, Filipe Manana wrote:
> On Wed, Jan 27, 2021 at 8:29 AM Qu Wenruo <wqu@suse.com> wrote:
>>
>> Commit dbfdb6d1b369 ("Btrfs: Search for all ordered extents that could
>> span across a page") make btrfs_invalidapage() to search all ordered
>> extents.
>>
>> The offending code looks like this:
>>
>> again:
>>          start = page_start;
>>          ordered = btrfs_lookup_ordered_range(inode, start, page_end - start + 1);
>>          if (ordred) {
>>                  end = min(page_end,
>>                            ordered->file_offset + ordered->num_bytes - 1);
>>
>>                  /* Do the cleanup */
>>
>>                  start = end + 1;
>>                  if (start < page_end)
>>                          goto again;
>>          }
>>
>> The behavior is indeed necessary for the incoming subpage support, but
>> when it iterate through all the ordered extents, it also resets the
>> search range @start.
>>
>> This means, for the following cases, we can double account the ordered
>> extents, causing its bytes_left underflow:
>>
>>          Page offset
>>          0               16K             32K
>>          |<--- OE 1  --->|<--- OE 2 ---->|
>>
>> As the first iteration will find OE 1, which doesn't cover the full
>> page, thus after cleanup code, we need to retry again.
>> But again label will reset start to page_start, and we got OE 1 again,
>> which causes double account on OE1, and cause OE1's byte_left to
>> underflow.
>>
>> The only good news is, this problem can only happen for subpage case, as
>> for regular sectorsize == PAGE_SIZE case, we will always find a OE ends
>> at or after page end, thus no way to trigger the problem.
>>
>> This patch will move the again label after start = page_start, to fix
>> the bug.
>> This is just a quick fix, which is easy to backport.
>
> Hum? Why does it need to be backported?
> There are no kernel releases that support subpage sector size yet and
> this does not affect the case where sector size is the same as the
> page size.


Hmmm, right, there is no need to backport.

Hopes David can remove that line when merging.

Thanks,
Qu

>
>>
>> There will be more comprehensive rework to convert the open coded loop to
>> a proper while loop.
>>
>> Fixes: dbfdb6d1b369 ("Btrfs: Search for all ordered extents that could span across a page")
>> Signed-off-by: Qu Wenruo <wqu@suse.com>
>
> Anyway, it looks good to me, thanks.
>
> Reviewed-by: Filipe Manana <fdmanana@suse.com>
>
>> ---
>>   fs/btrfs/inode.c | 3 ++-
>>   1 file changed, 2 insertions(+), 1 deletion(-)
>>
>> diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c
>> index ef6cb7b620d0..2eea7d22405a 100644
>> --- a/fs/btrfs/inode.c
>> +++ b/fs/btrfs/inode.c
>> @@ -8184,8 +8184,9 @@ static void btrfs_invalidatepage(struct page *page, unsigned int offset,
>>
>>          if (!inode_evicting)
>>                  lock_extent_bits(tree, page_start, page_end, &cached_state);
>> -again:
>> +
>>          start = page_start;
>> +again:
>>          ordered = btrfs_lookup_ordered_range(inode, start, page_end - start + 1);
>>          if (ordered) {
>>                  found_ordered = true;
>> --
>> 2.30.0
>>
>
>
David Sterba Jan. 28, 2021, 10:50 a.m. UTC | #3
On Wed, Jan 27, 2021 at 06:52:04PM +0800, Qu Wenruo wrote:
> On 2021/1/27 下午6:44, Filipe Manana wrote:
> > On Wed, Jan 27, 2021 at 8:29 AM Qu Wenruo <wqu@suse.com> wrote:
> >> As the first iteration will find OE 1, which doesn't cover the full
> >> page, thus after cleanup code, we need to retry again.
> >> But again label will reset start to page_start, and we got OE 1 again,
> >> which causes double account on OE1, and cause OE1's byte_left to
> >> underflow.
> >>
> >> The only good news is, this problem can only happen for subpage case, as
> >> for regular sectorsize == PAGE_SIZE case, we will always find a OE ends
> >> at or after page end, thus no way to trigger the problem.
> >>
> >> This patch will move the again label after start = page_start, to fix
> >> the bug.
> >> This is just a quick fix, which is easy to backport.
> >
> > Hum? Why does it need to be backported?
> > There are no kernel releases that support subpage sector size yet and
> > this does not affect the case where sector size is the same as the
> > page size.
> 
> 
> Hmmm, right, there is no need to backport.
> 
> Hopes David can remove that line when merging.

Yes, I've dropped the 'backport' line and slightly changed the wording
so it sounds like it's preparatory patch.
diff mbox series

Patch

diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c
index ef6cb7b620d0..2eea7d22405a 100644
--- a/fs/btrfs/inode.c
+++ b/fs/btrfs/inode.c
@@ -8184,8 +8184,9 @@  static void btrfs_invalidatepage(struct page *page, unsigned int offset,
 
 	if (!inode_evicting)
 		lock_extent_bits(tree, page_start, page_end, &cached_state);
-again:
+
 	start = page_start;
+again:
 	ordered = btrfs_lookup_ordered_range(inode, start, page_end - start + 1);
 	if (ordered) {
 		found_ordered = true;