Message ID | 20210127063848.72528-1-wqu@suse.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | btrfs: fix a bug that btrfs_invalidapge() can double account ordered extent for subpage | expand |
On Wed, Jan 27, 2021 at 8:29 AM Qu Wenruo <wqu@suse.com> wrote: > > Commit dbfdb6d1b369 ("Btrfs: Search for all ordered extents that could > span across a page") make btrfs_invalidapage() to search all ordered > extents. > > The offending code looks like this: > > again: > start = page_start; > ordered = btrfs_lookup_ordered_range(inode, start, page_end - start + 1); > if (ordred) { > end = min(page_end, > ordered->file_offset + ordered->num_bytes - 1); > > /* Do the cleanup */ > > start = end + 1; > if (start < page_end) > goto again; > } > > The behavior is indeed necessary for the incoming subpage support, but > when it iterate through all the ordered extents, it also resets the > search range @start. > > This means, for the following cases, we can double account the ordered > extents, causing its bytes_left underflow: > > Page offset > 0 16K 32K > |<--- OE 1 --->|<--- OE 2 ---->| > > As the first iteration will find OE 1, which doesn't cover the full > page, thus after cleanup code, we need to retry again. > But again label will reset start to page_start, and we got OE 1 again, > which causes double account on OE1, and cause OE1's byte_left to > underflow. > > The only good news is, this problem can only happen for subpage case, as > for regular sectorsize == PAGE_SIZE case, we will always find a OE ends > at or after page end, thus no way to trigger the problem. > > This patch will move the again label after start = page_start, to fix > the bug. > This is just a quick fix, which is easy to backport. Hum? Why does it need to be backported? There are no kernel releases that support subpage sector size yet and this does not affect the case where sector size is the same as the page size. > > There will be more comprehensive rework to convert the open coded loop to > a proper while loop. > > Fixes: dbfdb6d1b369 ("Btrfs: Search for all ordered extents that could span across a page") > Signed-off-by: Qu Wenruo <wqu@suse.com> Anyway, it looks good to me, thanks. Reviewed-by: Filipe Manana <fdmanana@suse.com> > --- > fs/btrfs/inode.c | 3 ++- > 1 file changed, 2 insertions(+), 1 deletion(-) > > diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c > index ef6cb7b620d0..2eea7d22405a 100644 > --- a/fs/btrfs/inode.c > +++ b/fs/btrfs/inode.c > @@ -8184,8 +8184,9 @@ static void btrfs_invalidatepage(struct page *page, unsigned int offset, > > if (!inode_evicting) > lock_extent_bits(tree, page_start, page_end, &cached_state); > -again: > + > start = page_start; > +again: > ordered = btrfs_lookup_ordered_range(inode, start, page_end - start + 1); > if (ordered) { > found_ordered = true; > -- > 2.30.0 >
On 2021/1/27 下午6:44, Filipe Manana wrote: > On Wed, Jan 27, 2021 at 8:29 AM Qu Wenruo <wqu@suse.com> wrote: >> >> Commit dbfdb6d1b369 ("Btrfs: Search for all ordered extents that could >> span across a page") make btrfs_invalidapage() to search all ordered >> extents. >> >> The offending code looks like this: >> >> again: >> start = page_start; >> ordered = btrfs_lookup_ordered_range(inode, start, page_end - start + 1); >> if (ordred) { >> end = min(page_end, >> ordered->file_offset + ordered->num_bytes - 1); >> >> /* Do the cleanup */ >> >> start = end + 1; >> if (start < page_end) >> goto again; >> } >> >> The behavior is indeed necessary for the incoming subpage support, but >> when it iterate through all the ordered extents, it also resets the >> search range @start. >> >> This means, for the following cases, we can double account the ordered >> extents, causing its bytes_left underflow: >> >> Page offset >> 0 16K 32K >> |<--- OE 1 --->|<--- OE 2 ---->| >> >> As the first iteration will find OE 1, which doesn't cover the full >> page, thus after cleanup code, we need to retry again. >> But again label will reset start to page_start, and we got OE 1 again, >> which causes double account on OE1, and cause OE1's byte_left to >> underflow. >> >> The only good news is, this problem can only happen for subpage case, as >> for regular sectorsize == PAGE_SIZE case, we will always find a OE ends >> at or after page end, thus no way to trigger the problem. >> >> This patch will move the again label after start = page_start, to fix >> the bug. >> This is just a quick fix, which is easy to backport. > > Hum? Why does it need to be backported? > There are no kernel releases that support subpage sector size yet and > this does not affect the case where sector size is the same as the > page size. Hmmm, right, there is no need to backport. Hopes David can remove that line when merging. Thanks, Qu > >> >> There will be more comprehensive rework to convert the open coded loop to >> a proper while loop. >> >> Fixes: dbfdb6d1b369 ("Btrfs: Search for all ordered extents that could span across a page") >> Signed-off-by: Qu Wenruo <wqu@suse.com> > > Anyway, it looks good to me, thanks. > > Reviewed-by: Filipe Manana <fdmanana@suse.com> > >> --- >> fs/btrfs/inode.c | 3 ++- >> 1 file changed, 2 insertions(+), 1 deletion(-) >> >> diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c >> index ef6cb7b620d0..2eea7d22405a 100644 >> --- a/fs/btrfs/inode.c >> +++ b/fs/btrfs/inode.c >> @@ -8184,8 +8184,9 @@ static void btrfs_invalidatepage(struct page *page, unsigned int offset, >> >> if (!inode_evicting) >> lock_extent_bits(tree, page_start, page_end, &cached_state); >> -again: >> + >> start = page_start; >> +again: >> ordered = btrfs_lookup_ordered_range(inode, start, page_end - start + 1); >> if (ordered) { >> found_ordered = true; >> -- >> 2.30.0 >> > >
On Wed, Jan 27, 2021 at 06:52:04PM +0800, Qu Wenruo wrote: > On 2021/1/27 下午6:44, Filipe Manana wrote: > > On Wed, Jan 27, 2021 at 8:29 AM Qu Wenruo <wqu@suse.com> wrote: > >> As the first iteration will find OE 1, which doesn't cover the full > >> page, thus after cleanup code, we need to retry again. > >> But again label will reset start to page_start, and we got OE 1 again, > >> which causes double account on OE1, and cause OE1's byte_left to > >> underflow. > >> > >> The only good news is, this problem can only happen for subpage case, as > >> for regular sectorsize == PAGE_SIZE case, we will always find a OE ends > >> at or after page end, thus no way to trigger the problem. > >> > >> This patch will move the again label after start = page_start, to fix > >> the bug. > >> This is just a quick fix, which is easy to backport. > > > > Hum? Why does it need to be backported? > > There are no kernel releases that support subpage sector size yet and > > this does not affect the case where sector size is the same as the > > page size. > > > Hmmm, right, there is no need to backport. > > Hopes David can remove that line when merging. Yes, I've dropped the 'backport' line and slightly changed the wording so it sounds like it's preparatory patch.
diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c index ef6cb7b620d0..2eea7d22405a 100644 --- a/fs/btrfs/inode.c +++ b/fs/btrfs/inode.c @@ -8184,8 +8184,9 @@ static void btrfs_invalidatepage(struct page *page, unsigned int offset, if (!inode_evicting) lock_extent_bits(tree, page_start, page_end, &cached_state); -again: + start = page_start; +again: ordered = btrfs_lookup_ordered_range(inode, start, page_end - start + 1); if (ordered) { found_ordered = true;
Commit dbfdb6d1b369 ("Btrfs: Search for all ordered extents that could span across a page") make btrfs_invalidapage() to search all ordered extents. The offending code looks like this: again: start = page_start; ordered = btrfs_lookup_ordered_range(inode, start, page_end - start + 1); if (ordred) { end = min(page_end, ordered->file_offset + ordered->num_bytes - 1); /* Do the cleanup */ start = end + 1; if (start < page_end) goto again; } The behavior is indeed necessary for the incoming subpage support, but when it iterate through all the ordered extents, it also resets the search range @start. This means, for the following cases, we can double account the ordered extents, causing its bytes_left underflow: Page offset 0 16K 32K |<--- OE 1 --->|<--- OE 2 ---->| As the first iteration will find OE 1, which doesn't cover the full page, thus after cleanup code, we need to retry again. But again label will reset start to page_start, and we got OE 1 again, which causes double account on OE1, and cause OE1's byte_left to underflow. The only good news is, this problem can only happen for subpage case, as for regular sectorsize == PAGE_SIZE case, we will always find a OE ends at or after page end, thus no way to trigger the problem. This patch will move the again label after start = page_start, to fix the bug. This is just a quick fix, which is easy to backport. There will be more comprehensive rework to convert the open coded loop to a proper while loop. Fixes: dbfdb6d1b369 ("Btrfs: Search for all ordered extents that could span across a page") Signed-off-by: Qu Wenruo <wqu@suse.com> --- fs/btrfs/inode.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-)