diff mbox series

[4/4] btrfs: prepare btrfs_punch_hole_lock_range() for large data folios

Message ID 64d8a34bed1360c4771ead6a66e3c6df0ab86a7f.1743113694.git.wqu@suse.com (mailing list archive)
State New
Headers show
Series btrfs: add the missing preparations exposed by initial large data folio support | expand

Commit Message

Qu Wenruo March 27, 2025, 10:31 p.m. UTC
The function btrfs_punch_hole_lock_range() needs to make sure there is
no other folio in the range, thus it goes with filemap_range_has_page(),
which works pretty fine.

But if we have large folios, under the following case
filemap_range_has_page() will always return true, forcing
btrfs_punch_hole_lock_range() to do a very time consuming busy loop:

        start                            end
        |                                |
  |//|//|//|//|  |  |  |  |  |  |  |  |//|//|
   \         /                         \   /
    Folio A                            Folio B

In above case, folio A and B contains our start/end index, and there is
no other folios in the range.
Thus there is no other folios and we do not need to retry inside
btrfs_punch_hole_lock_range().

To prepare for large data folios, introduce a helper,
check_range_has_page(), which will:

- Grab all the folios inside the range

- Skip any large folios that covers the start and end index

- If any other folios is found return true

- Otherwise return false

This new helper is going to handle both large folios and regular ones.

Signed-off-by: Qu Wenruo <wqu@suse.com>
---
 fs/btrfs/file.c | 51 +++++++++++++++++++++++++++++++++++++++++++++++--
 1 file changed, 49 insertions(+), 2 deletions(-)

Comments

Filipe Manana March 28, 2025, 5:57 p.m. UTC | #1
On Thu, Mar 27, 2025 at 10:33 PM Qu Wenruo <wqu@suse.com> wrote:
>
> The function btrfs_punch_hole_lock_range() needs to make sure there is
> no other folio in the range, thus it goes with filemap_range_has_page(),
> which works pretty fine.
>
> But if we have large folios, under the following case
> filemap_range_has_page() will always return true, forcing
> btrfs_punch_hole_lock_range() to do a very time consuming busy loop:
>
>         start                            end
>         |                                |
>   |//|//|//|//|  |  |  |  |  |  |  |  |//|//|
>    \         /                         \   /
>     Folio A                            Folio B
>
> In above case, folio A and B contains our start/end index, and there is

contains -> contain
index -> indexes
there is -> there are

> no other folios in the range.
> Thus there is no other folios and we do not need to retry inside

is -> are.

"Thus there is no other folios" is repeated from the previous
sentence, so it can be just:

Thus we do not need to retry inside...

> btrfs_punch_hole_lock_range().
>
> To prepare for large data folios, introduce a helper,
> check_range_has_page(), which will:
>
> - Grab all the folios inside the range
>
> - Skip any large folios that covers the start and end index

covers -> cover
index -> indexes

>
> - If any other folios is found return true

is found -> are found

>
> - Otherwise return false
>
> This new helper is going to handle both large folios and regular ones.
>
> Signed-off-by: Qu Wenruo <wqu@suse.com>
> ---
>  fs/btrfs/file.c | 51 +++++++++++++++++++++++++++++++++++++++++++++++--
>  1 file changed, 49 insertions(+), 2 deletions(-)
>
> diff --git a/fs/btrfs/file.c b/fs/btrfs/file.c
> index 5d10ae321687..417c90ffc6fa 100644
> --- a/fs/btrfs/file.c
> +++ b/fs/btrfs/file.c
> @@ -2157,6 +2157,54 @@ static int find_first_non_hole(struct btrfs_inode *inode, u64 *start, u64 *len)
>         return ret;
>  }
>
> +/*
> + * The helper to check if there is no folio in the range.
> + *
> + * We can not utilized filemap_range_has_page() in a filemap with large folios
> + * as we can hit the following false postive:

 postive -> positive

> + *
> + *        start                            end
> + *        |                                |
> + *  |//|//|//|//|  |  |  |  |  |  |  |  |//|//|
> + *   \         /                         \   /
> + *    Folio A                            Folio B
> + *
> + * That large folio A and B covers the start and end index.

covers -> cover
index -> indexes

Anyway, those are minor things, so:

Reviewed-by: Filipe Manana <fdmanana@suse.com>

Thanks.



> + * In that case filemap_range_has_page() will always return true, but the above
> + * case is fine for btrfs_punch_hole_lock_range() usage.
> + *
> + * So here we only ensure that no other folio is in the range, excluding the
> + * head/tail large folio.
> + */
> +static bool check_range_has_page(struct inode *inode, u64 start, u64 end)
> +{
> +       struct folio_batch fbatch;
> +       bool ret = false;
> +       const pgoff_t start_index = start >> PAGE_SHIFT;
> +       const pgoff_t end_index = end >> PAGE_SHIFT;
> +       pgoff_t tmp = start_index;
> +       int found_folios;
> +
> +       folio_batch_init(&fbatch);
> +       found_folios = filemap_get_folios(inode->i_mapping, &tmp, end_index,
> +                                         &fbatch);
> +       for (int i = 0; i < found_folios; i++) {
> +               struct folio *folio = fbatch.folios[i];
> +
> +               /* A large folio begins before the start. Not a target. */
> +               if (folio->index < start_index)
> +                       continue;
> +               /* A large folio extends beyond the end. Not a target. */
> +               if (folio->index + folio_nr_pages(folio) > end_index)
> +                       continue;
> +               /* A folio doesn't cover the head/tail index. Found a target. */
> +               ret = true;
> +               break;
> +       }
> +       folio_batch_release(&fbatch);
> +       return ret;
> +}
> +
>  static void btrfs_punch_hole_lock_range(struct inode *inode,
>                                         const u64 lockstart,
>                                         const u64 lockend,
> @@ -2188,8 +2236,7 @@ static void btrfs_punch_hole_lock_range(struct inode *inode,
>                  * locking the range check if we have pages in the range, and if
>                  * we do, unlock the range and retry.
>                  */
> -               if (!filemap_range_has_page(inode->i_mapping, page_lockstart,
> -                                           page_lockend))
> +               if (!check_range_has_page(inode, page_lockstart, page_lockend))
>                         break;
>
>                 unlock_extent(&BTRFS_I(inode)->io_tree, lockstart, lockend,
> --
> 2.49.0
>
>
Qu Wenruo March 29, 2025, 3:18 a.m. UTC | #2
在 2025/3/29 04:27, Filipe Manana 写道:
> On Thu, Mar 27, 2025 at 10:33 PM Qu Wenruo <wqu@suse.com> wrote:
>>
>> The function btrfs_punch_hole_lock_range() needs to make sure there is
>> no other folio in the range, thus it goes with filemap_range_has_page(),
>> which works pretty fine.
>>
>> But if we have large folios, under the following case
>> filemap_range_has_page() will always return true, forcing
>> btrfs_punch_hole_lock_range() to do a very time consuming busy loop:
>>
>>          start                            end
>>          |                                |
>>    |//|//|//|//|  |  |  |  |  |  |  |  |//|//|
>>     \         /                         \   /
>>      Folio A                            Folio B
>>
>> In above case, folio A and B contains our start/end index, and there is
>
> contains -> contain
> index -> indexes
> there is -> there are
>
>> no other folios in the range.
>> Thus there is no other folios and we do not need to retry inside
>
> is -> are.
>
> "Thus there is no other folios" is repeated from the previous
> sentence, so it can be just:
>
> Thus we do not need to retry inside...
>
>> btrfs_punch_hole_lock_range().
>>
>> To prepare for large data folios, introduce a helper,
>> check_range_has_page(), which will:
>>
>> - Grab all the folios inside the range
>>
>> - Skip any large folios that covers the start and end index
>
> covers -> cover
> index -> indexes
>
>>
>> - If any other folios is found return true
>
> is found -> are found
>
>>
>> - Otherwise return false
>>
>> This new helper is going to handle both large folios and regular ones.
>>
>> Signed-off-by: Qu Wenruo <wqu@suse.com>
>> ---
>>   fs/btrfs/file.c | 51 +++++++++++++++++++++++++++++++++++++++++++++++--
>>   1 file changed, 49 insertions(+), 2 deletions(-)
>>
>> diff --git a/fs/btrfs/file.c b/fs/btrfs/file.c
>> index 5d10ae321687..417c90ffc6fa 100644
>> --- a/fs/btrfs/file.c
>> +++ b/fs/btrfs/file.c
>> @@ -2157,6 +2157,54 @@ static int find_first_non_hole(struct btrfs_inode *inode, u64 *start, u64 *len)
>>          return ret;
>>   }
>>
>> +/*
>> + * The helper to check if there is no folio in the range.
>> + *
>> + * We can not utilized filemap_range_has_page() in a filemap with large folios
>> + * as we can hit the following false postive:
>
>   postive -> positive
>
>> + *
>> + *        start                            end
>> + *        |                                |
>> + *  |//|//|//|//|  |  |  |  |  |  |  |  |//|//|
>> + *   \         /                         \   /
>> + *    Folio A                            Folio B
>> + *
>> + * That large folio A and B covers the start and end index.
>
> covers -> cover
> index -> indexes
>
> Anyway, those are minor things, so:
>
> Reviewed-by: Filipe Manana <fdmanana@suse.com>


Thanks a lot for the review, although there is some more fixes needed in
this patch.

Thus I'll resent with all your comments addressed, but not with RoB tag
in case you find something else wrong in the fix.

>
> Thanks.
>
>
>
>> + * In that case filemap_range_has_page() will always return true, but the above
>> + * case is fine for btrfs_punch_hole_lock_range() usage.
>> + *
>> + * So here we only ensure that no other folio is in the range, excluding the
>> + * head/tail large folio.
>> + */
>> +static bool check_range_has_page(struct inode *inode, u64 start, u64 end)
>> +{

Fsstress exposed rare cases where btrfs_punch_hole_lock_range() can be
called with a range inside the same folio:

   btrfs_punch_hole_lock_range: r/i=5/2745 start=4096(65536)
end=20479(18446744073709551615) enter

And since both @lockstart and @lockend are inside the same folio 0, the
page_lockend calculation will be -1, causing us to check to the end of
the inode.

But then we can hit a cases some one else is still holding the folio at 64K:

   check_range_has_page: r/i=5/2745 start=65536(1)
end=18446744073709551615(281474976710655) enter
   check_range_has_page: found folio=65536(1) size=65536(1)

Then we will hit a deadloop waiting for folio 64K meanwhile our zero
range doesn't even need that folio.

The original filemap_range_has_page() has a basic check to skip such
case completely, which is missing in the new helper.

Will get this fixed and resend.

Thanks,
Qu>> +       struct folio_batch fbatch;
>> +       bool ret = false;
>> +       const pgoff_t start_index = start >> PAGE_SHIFT;
>> +       const pgoff_t end_index = end >> PAGE_SHIFT;
>> +       pgoff_t tmp = start_index;
>> +       int found_folios;
>> +
>> +       folio_batch_init(&fbatch);
>> +       found_folios = filemap_get_folios(inode->i_mapping, &tmp, end_index,
>> +                                         &fbatch);
>> +       for (int i = 0; i < found_folios; i++) {
>> +               struct folio *folio = fbatch.folios[i];
>> +
>> +               /* A large folio begins before the start. Not a target. */
>> +               if (folio->index < start_index)
>> +                       continue;
>> +               /* A large folio extends beyond the end. Not a target. */
>> +               if (folio->index + folio_nr_pages(folio) > end_index)
>> +                       continue;
>> +               /* A folio doesn't cover the head/tail index. Found a target. */
>> +               ret = true;
>> +               break;
>> +       }
>> +       folio_batch_release(&fbatch);
>> +       return ret;
>> +}
>> +
>>   static void btrfs_punch_hole_lock_range(struct inode *inode,
>>                                          const u64 lockstart,
>>                                          const u64 lockend,
>> @@ -2188,8 +2236,7 @@ static void btrfs_punch_hole_lock_range(struct inode *inode,
>>                   * locking the range check if we have pages in the range, and if
>>                   * we do, unlock the range and retry.
>>                   */
>> -               if (!filemap_range_has_page(inode->i_mapping, page_lockstart,
>> -                                           page_lockend))
>> +               if (!check_range_has_page(inode, page_lockstart, page_lockend))
>>                          break;
>>
>>                  unlock_extent(&BTRFS_I(inode)->io_tree, lockstart, lockend,
>> --
>> 2.49.0
>>
>>
>
diff mbox series

Patch

diff --git a/fs/btrfs/file.c b/fs/btrfs/file.c
index 5d10ae321687..417c90ffc6fa 100644
--- a/fs/btrfs/file.c
+++ b/fs/btrfs/file.c
@@ -2157,6 +2157,54 @@  static int find_first_non_hole(struct btrfs_inode *inode, u64 *start, u64 *len)
 	return ret;
 }
 
+/*
+ * The helper to check if there is no folio in the range.
+ *
+ * We can not utilized filemap_range_has_page() in a filemap with large folios
+ * as we can hit the following false postive:
+ *
+ *        start                            end
+ *        |                                |
+ *  |//|//|//|//|  |  |  |  |  |  |  |  |//|//|
+ *   \         /                         \   /
+ *    Folio A                            Folio B
+ *
+ * That large folio A and B covers the start and end index.
+ * In that case filemap_range_has_page() will always return true, but the above
+ * case is fine for btrfs_punch_hole_lock_range() usage.
+ *
+ * So here we only ensure that no other folio is in the range, excluding the
+ * head/tail large folio.
+ */
+static bool check_range_has_page(struct inode *inode, u64 start, u64 end)
+{
+	struct folio_batch fbatch;
+	bool ret = false;
+	const pgoff_t start_index = start >> PAGE_SHIFT;
+	const pgoff_t end_index = end >> PAGE_SHIFT;
+	pgoff_t tmp = start_index;
+	int found_folios;
+
+	folio_batch_init(&fbatch);
+	found_folios = filemap_get_folios(inode->i_mapping, &tmp, end_index,
+					  &fbatch);
+	for (int i = 0; i < found_folios; i++) {
+		struct folio *folio = fbatch.folios[i];
+
+		/* A large folio begins before the start. Not a target. */
+		if (folio->index < start_index)
+			continue;
+		/* A large folio extends beyond the end. Not a target. */
+		if (folio->index + folio_nr_pages(folio) > end_index)
+			continue;
+		/* A folio doesn't cover the head/tail index. Found a target. */
+		ret = true;
+		break;
+	}
+	folio_batch_release(&fbatch);
+	return ret;
+}
+
 static void btrfs_punch_hole_lock_range(struct inode *inode,
 					const u64 lockstart,
 					const u64 lockend,
@@ -2188,8 +2236,7 @@  static void btrfs_punch_hole_lock_range(struct inode *inode,
 		 * locking the range check if we have pages in the range, and if
 		 * we do, unlock the range and retry.
 		 */
-		if (!filemap_range_has_page(inode->i_mapping, page_lockstart,
-					    page_lockend))
+		if (!check_range_has_page(inode, page_lockstart, page_lockend))
 			break;
 
 		unlock_extent(&BTRFS_I(inode)->io_tree, lockstart, lockend,