Message ID | 64d8a34bed1360c4771ead6a66e3c6df0ab86a7f.1743113694.git.wqu@suse.com (mailing list archive) |
---|---|
State | New |
Headers | show |
Series | btrfs: add the missing preparations exposed by initial large data folio support | expand |
On Thu, Mar 27, 2025 at 10:33 PM Qu Wenruo <wqu@suse.com> wrote: > > The function btrfs_punch_hole_lock_range() needs to make sure there is > no other folio in the range, thus it goes with filemap_range_has_page(), > which works pretty fine. > > But if we have large folios, under the following case > filemap_range_has_page() will always return true, forcing > btrfs_punch_hole_lock_range() to do a very time consuming busy loop: > > start end > | | > |//|//|//|//| | | | | | | | |//|//| > \ / \ / > Folio A Folio B > > In above case, folio A and B contains our start/end index, and there is contains -> contain index -> indexes there is -> there are > no other folios in the range. > Thus there is no other folios and we do not need to retry inside is -> are. "Thus there is no other folios" is repeated from the previous sentence, so it can be just: Thus we do not need to retry inside... > btrfs_punch_hole_lock_range(). > > To prepare for large data folios, introduce a helper, > check_range_has_page(), which will: > > - Grab all the folios inside the range > > - Skip any large folios that covers the start and end index covers -> cover index -> indexes > > - If any other folios is found return true is found -> are found > > - Otherwise return false > > This new helper is going to handle both large folios and regular ones. > > Signed-off-by: Qu Wenruo <wqu@suse.com> > --- > fs/btrfs/file.c | 51 +++++++++++++++++++++++++++++++++++++++++++++++-- > 1 file changed, 49 insertions(+), 2 deletions(-) > > diff --git a/fs/btrfs/file.c b/fs/btrfs/file.c > index 5d10ae321687..417c90ffc6fa 100644 > --- a/fs/btrfs/file.c > +++ b/fs/btrfs/file.c > @@ -2157,6 +2157,54 @@ static int find_first_non_hole(struct btrfs_inode *inode, u64 *start, u64 *len) > return ret; > } > > +/* > + * The helper to check if there is no folio in the range. > + * > + * We can not utilized filemap_range_has_page() in a filemap with large folios > + * as we can hit the following false postive: postive -> positive > + * > + * start end > + * | | > + * |//|//|//|//| | | | | | | | |//|//| > + * \ / \ / > + * Folio A Folio B > + * > + * That large folio A and B covers the start and end index. covers -> cover index -> indexes Anyway, those are minor things, so: Reviewed-by: Filipe Manana <fdmanana@suse.com> Thanks. > + * In that case filemap_range_has_page() will always return true, but the above > + * case is fine for btrfs_punch_hole_lock_range() usage. > + * > + * So here we only ensure that no other folio is in the range, excluding the > + * head/tail large folio. > + */ > +static bool check_range_has_page(struct inode *inode, u64 start, u64 end) > +{ > + struct folio_batch fbatch; > + bool ret = false; > + const pgoff_t start_index = start >> PAGE_SHIFT; > + const pgoff_t end_index = end >> PAGE_SHIFT; > + pgoff_t tmp = start_index; > + int found_folios; > + > + folio_batch_init(&fbatch); > + found_folios = filemap_get_folios(inode->i_mapping, &tmp, end_index, > + &fbatch); > + for (int i = 0; i < found_folios; i++) { > + struct folio *folio = fbatch.folios[i]; > + > + /* A large folio begins before the start. Not a target. */ > + if (folio->index < start_index) > + continue; > + /* A large folio extends beyond the end. Not a target. */ > + if (folio->index + folio_nr_pages(folio) > end_index) > + continue; > + /* A folio doesn't cover the head/tail index. Found a target. */ > + ret = true; > + break; > + } > + folio_batch_release(&fbatch); > + return ret; > +} > + > static void btrfs_punch_hole_lock_range(struct inode *inode, > const u64 lockstart, > const u64 lockend, > @@ -2188,8 +2236,7 @@ static void btrfs_punch_hole_lock_range(struct inode *inode, > * locking the range check if we have pages in the range, and if > * we do, unlock the range and retry. > */ > - if (!filemap_range_has_page(inode->i_mapping, page_lockstart, > - page_lockend)) > + if (!check_range_has_page(inode, page_lockstart, page_lockend)) > break; > > unlock_extent(&BTRFS_I(inode)->io_tree, lockstart, lockend, > -- > 2.49.0 > >
在 2025/3/29 04:27, Filipe Manana 写道: > On Thu, Mar 27, 2025 at 10:33 PM Qu Wenruo <wqu@suse.com> wrote: >> >> The function btrfs_punch_hole_lock_range() needs to make sure there is >> no other folio in the range, thus it goes with filemap_range_has_page(), >> which works pretty fine. >> >> But if we have large folios, under the following case >> filemap_range_has_page() will always return true, forcing >> btrfs_punch_hole_lock_range() to do a very time consuming busy loop: >> >> start end >> | | >> |//|//|//|//| | | | | | | | |//|//| >> \ / \ / >> Folio A Folio B >> >> In above case, folio A and B contains our start/end index, and there is > > contains -> contain > index -> indexes > there is -> there are > >> no other folios in the range. >> Thus there is no other folios and we do not need to retry inside > > is -> are. > > "Thus there is no other folios" is repeated from the previous > sentence, so it can be just: > > Thus we do not need to retry inside... > >> btrfs_punch_hole_lock_range(). >> >> To prepare for large data folios, introduce a helper, >> check_range_has_page(), which will: >> >> - Grab all the folios inside the range >> >> - Skip any large folios that covers the start and end index > > covers -> cover > index -> indexes > >> >> - If any other folios is found return true > > is found -> are found > >> >> - Otherwise return false >> >> This new helper is going to handle both large folios and regular ones. >> >> Signed-off-by: Qu Wenruo <wqu@suse.com> >> --- >> fs/btrfs/file.c | 51 +++++++++++++++++++++++++++++++++++++++++++++++-- >> 1 file changed, 49 insertions(+), 2 deletions(-) >> >> diff --git a/fs/btrfs/file.c b/fs/btrfs/file.c >> index 5d10ae321687..417c90ffc6fa 100644 >> --- a/fs/btrfs/file.c >> +++ b/fs/btrfs/file.c >> @@ -2157,6 +2157,54 @@ static int find_first_non_hole(struct btrfs_inode *inode, u64 *start, u64 *len) >> return ret; >> } >> >> +/* >> + * The helper to check if there is no folio in the range. >> + * >> + * We can not utilized filemap_range_has_page() in a filemap with large folios >> + * as we can hit the following false postive: > > postive -> positive > >> + * >> + * start end >> + * | | >> + * |//|//|//|//| | | | | | | | |//|//| >> + * \ / \ / >> + * Folio A Folio B >> + * >> + * That large folio A and B covers the start and end index. > > covers -> cover > index -> indexes > > Anyway, those are minor things, so: > > Reviewed-by: Filipe Manana <fdmanana@suse.com> Thanks a lot for the review, although there is some more fixes needed in this patch. Thus I'll resent with all your comments addressed, but not with RoB tag in case you find something else wrong in the fix. > > Thanks. > > > >> + * In that case filemap_range_has_page() will always return true, but the above >> + * case is fine for btrfs_punch_hole_lock_range() usage. >> + * >> + * So here we only ensure that no other folio is in the range, excluding the >> + * head/tail large folio. >> + */ >> +static bool check_range_has_page(struct inode *inode, u64 start, u64 end) >> +{ Fsstress exposed rare cases where btrfs_punch_hole_lock_range() can be called with a range inside the same folio: btrfs_punch_hole_lock_range: r/i=5/2745 start=4096(65536) end=20479(18446744073709551615) enter And since both @lockstart and @lockend are inside the same folio 0, the page_lockend calculation will be -1, causing us to check to the end of the inode. But then we can hit a cases some one else is still holding the folio at 64K: check_range_has_page: r/i=5/2745 start=65536(1) end=18446744073709551615(281474976710655) enter check_range_has_page: found folio=65536(1) size=65536(1) Then we will hit a deadloop waiting for folio 64K meanwhile our zero range doesn't even need that folio. The original filemap_range_has_page() has a basic check to skip such case completely, which is missing in the new helper. Will get this fixed and resend. Thanks, Qu>> + struct folio_batch fbatch; >> + bool ret = false; >> + const pgoff_t start_index = start >> PAGE_SHIFT; >> + const pgoff_t end_index = end >> PAGE_SHIFT; >> + pgoff_t tmp = start_index; >> + int found_folios; >> + >> + folio_batch_init(&fbatch); >> + found_folios = filemap_get_folios(inode->i_mapping, &tmp, end_index, >> + &fbatch); >> + for (int i = 0; i < found_folios; i++) { >> + struct folio *folio = fbatch.folios[i]; >> + >> + /* A large folio begins before the start. Not a target. */ >> + if (folio->index < start_index) >> + continue; >> + /* A large folio extends beyond the end. Not a target. */ >> + if (folio->index + folio_nr_pages(folio) > end_index) >> + continue; >> + /* A folio doesn't cover the head/tail index. Found a target. */ >> + ret = true; >> + break; >> + } >> + folio_batch_release(&fbatch); >> + return ret; >> +} >> + >> static void btrfs_punch_hole_lock_range(struct inode *inode, >> const u64 lockstart, >> const u64 lockend, >> @@ -2188,8 +2236,7 @@ static void btrfs_punch_hole_lock_range(struct inode *inode, >> * locking the range check if we have pages in the range, and if >> * we do, unlock the range and retry. >> */ >> - if (!filemap_range_has_page(inode->i_mapping, page_lockstart, >> - page_lockend)) >> + if (!check_range_has_page(inode, page_lockstart, page_lockend)) >> break; >> >> unlock_extent(&BTRFS_I(inode)->io_tree, lockstart, lockend, >> -- >> 2.49.0 >> >> >
diff --git a/fs/btrfs/file.c b/fs/btrfs/file.c index 5d10ae321687..417c90ffc6fa 100644 --- a/fs/btrfs/file.c +++ b/fs/btrfs/file.c @@ -2157,6 +2157,54 @@ static int find_first_non_hole(struct btrfs_inode *inode, u64 *start, u64 *len) return ret; } +/* + * The helper to check if there is no folio in the range. + * + * We can not utilized filemap_range_has_page() in a filemap with large folios + * as we can hit the following false postive: + * + * start end + * | | + * |//|//|//|//| | | | | | | | |//|//| + * \ / \ / + * Folio A Folio B + * + * That large folio A and B covers the start and end index. + * In that case filemap_range_has_page() will always return true, but the above + * case is fine for btrfs_punch_hole_lock_range() usage. + * + * So here we only ensure that no other folio is in the range, excluding the + * head/tail large folio. + */ +static bool check_range_has_page(struct inode *inode, u64 start, u64 end) +{ + struct folio_batch fbatch; + bool ret = false; + const pgoff_t start_index = start >> PAGE_SHIFT; + const pgoff_t end_index = end >> PAGE_SHIFT; + pgoff_t tmp = start_index; + int found_folios; + + folio_batch_init(&fbatch); + found_folios = filemap_get_folios(inode->i_mapping, &tmp, end_index, + &fbatch); + for (int i = 0; i < found_folios; i++) { + struct folio *folio = fbatch.folios[i]; + + /* A large folio begins before the start. Not a target. */ + if (folio->index < start_index) + continue; + /* A large folio extends beyond the end. Not a target. */ + if (folio->index + folio_nr_pages(folio) > end_index) + continue; + /* A folio doesn't cover the head/tail index. Found a target. */ + ret = true; + break; + } + folio_batch_release(&fbatch); + return ret; +} + static void btrfs_punch_hole_lock_range(struct inode *inode, const u64 lockstart, const u64 lockend, @@ -2188,8 +2236,7 @@ static void btrfs_punch_hole_lock_range(struct inode *inode, * locking the range check if we have pages in the range, and if * we do, unlock the range and retry. */ - if (!filemap_range_has_page(inode->i_mapping, page_lockstart, - page_lockend)) + if (!check_range_has_page(inode, page_lockstart, page_lockend)) break; unlock_extent(&BTRFS_I(inode)->io_tree, lockstart, lockend,
The function btrfs_punch_hole_lock_range() needs to make sure there is no other folio in the range, thus it goes with filemap_range_has_page(), which works pretty fine. But if we have large folios, under the following case filemap_range_has_page() will always return true, forcing btrfs_punch_hole_lock_range() to do a very time consuming busy loop: start end | | |//|//|//|//| | | | | | | | |//|//| \ / \ / Folio A Folio B In above case, folio A and B contains our start/end index, and there is no other folios in the range. Thus there is no other folios and we do not need to retry inside btrfs_punch_hole_lock_range(). To prepare for large data folios, introduce a helper, check_range_has_page(), which will: - Grab all the folios inside the range - Skip any large folios that covers the start and end index - If any other folios is found return true - Otherwise return false This new helper is going to handle both large folios and regular ones. Signed-off-by: Qu Wenruo <wqu@suse.com> --- fs/btrfs/file.c | 51 +++++++++++++++++++++++++++++++++++++++++++++++-- 1 file changed, 49 insertions(+), 2 deletions(-)