diff mbox series

[v3,03/26] ext4: correct the hole length returned by ext4_map_blocks()

Message ID 20240127015825.1608160-4-yi.zhang@huaweicloud.com (mailing list archive)
State New, archived
Headers show
Series ext4: use iomap for regular file's buffered IO path and enable large foilo | expand

Commit Message

Zhang Yi Jan. 27, 2024, 1:58 a.m. UTC
From: Zhang Yi <yi.zhang@huawei.com>

In ext4_map_blocks(), if we can't find a range of mapping in the
extents cache, we are calling ext4_ext_map_blocks() to search the real
path and ext4_ext_determine_hole() to determine the hole range. But if
the querying range was partially or completely overlaped by a delalloc
extent, we can't find it in the real extent path, so the returned hole
length could be incorrect.

Fortunately, ext4_ext_put_gap_in_cache() have already handle delalloc
extent, but it searches start from the expanded hole_start, doesn't
start from the querying range, so the delalloc extent found could not be
the one that overlaped the querying range, plus, it also didn't adjust
the hole length. Let's just remove ext4_ext_put_gap_in_cache(), handle
delalloc and insert adjusted hole extent in ext4_ext_determine_hole().

Signed-off-by: Zhang Yi <yi.zhang@huawei.com>
Suggested-by: Jan Kara <jack@suse.cz>
Reviewed-by: Jan Kara <jack@suse.cz>
---
 fs/ext4/extents.c | 111 +++++++++++++++++++++++++++++-----------------
 1 file changed, 70 insertions(+), 41 deletions(-)

Comments

Theodore Ts'o Feb. 3, 2024, 5:56 p.m. UTC | #1
On Sat, Jan 27, 2024 at 09:58:02AM +0800, Zhang Yi wrote:
> From: Zhang Yi <yi.zhang@huawei.com>
> 
> In ext4_map_blocks(), if we can't find a range of mapping in the
> extents cache, we are calling ext4_ext_map_blocks() to search the real
> path and ext4_ext_determine_hole() to determine the hole range. But if
> the querying range was partially or completely overlaped by a delalloc
> extent, we can't find it in the real extent path, so the returned hole
> length could be incorrect.
> 
> Fortunately, ext4_ext_put_gap_in_cache() have already handle delalloc
> extent, but it searches start from the expanded hole_start, doesn't
> start from the querying range, so the delalloc extent found could not be
> the one that overlaped the querying range, plus, it also didn't adjust
> the hole length. Let's just remove ext4_ext_put_gap_in_cache(), handle
> delalloc and insert adjusted hole extent in ext4_ext_determine_hole().
> 
> Signed-off-by: Zhang Yi <yi.zhang@huawei.com>
> Suggested-by: Jan Kara <jack@suse.cz>
> Reviewed-by: Jan Kara <jack@suse.cz>

Thanks, applied.

					- Ted
Luis Henriques May 9, 2024, 3:16 p.m. UTC | #2
On Sat 27 Jan 2024 09:58:02 AM +08, Zhang Yi wrote;
<...>
> +static ext4_lblk_t ext4_ext_determine_insert_hole(struct inode *inode,
> +						  struct ext4_ext_path *path,
> +						  ext4_lblk_t lblk)
> +{
> +	ext4_lblk_t hole_start, len;
> +	struct extent_status es;
> +
> +	hole_start = lblk;
> +	len = ext4_ext_find_hole(inode, path, &hole_start);
> +again:
> +	ext4_es_find_extent_range(inode, &ext4_es_is_delayed, hole_start,
> +				  hole_start + len - 1, &es);
> +	if (!es.es_len)
> +		goto insert_hole;
> +
> +	/*
> +	 * There's a delalloc extent in the hole, handle it if the delalloc
> +	 * extent is in front of, behind and straddle the queried range.
> +	 */
> +	if (lblk >= es.es_lblk + es.es_len) {
> +		/*
> +		 * The delalloc extent is in front of the queried range,
> +		 * find again from the queried start block.
> +		 */
> +		len -= lblk - hole_start;
> +		hole_start = lblk;
> +		goto again;

It's looks like it's easy to trigger an infinite loop here using fstest
generic/039.  If I understand it correctly (which doesn't happen as often
as I'd like), this is due to an integer overflow in the 'if' condition,
and should be fixed with the patch below.

From 3117af2f8dacad37a2722850421f31075ae9e88d Mon Sep 17 00:00:00 2001
From: "Luis Henriques (SUSE)" <luis.henriques@linux.dev>
Date: Thu, 9 May 2024 15:53:01 +0100
Subject: [PATCH] ext4: fix infinite loop caused by integer overflow

An integer overflow will happen if the extent_status len is set to
EXT_MAX_BLOCKS (0xffffffff).  This may cause an infinite loop in function
ext4_ext_determine_insert_hole(), easily reproducible using fstest
generic/039.

Fixes: 6430dea07e85 ("ext4: correct the hole length returned by ext4_map_blocks()")
Signed-off-by: Luis Henriques (SUSE) <luis.henriques@linux.dev>
---
 fs/ext4/extents.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/fs/ext4/extents.c b/fs/ext4/extents.c
index e57054bdc5fd..193121b394f9 100644
--- a/fs/ext4/extents.c
+++ b/fs/ext4/extents.c
@@ -4064,7 +4064,7 @@ static ext4_lblk_t ext4_ext_determine_insert_hole(struct inode *inode,
 	 * There's a delalloc extent in the hole, handle it if the delalloc
 	 * extent is in front of, behind and straddle the queried range.
 	 */
-	if (lblk >= es.es_lblk + es.es_len) {
+	if (lblk >= ((__u64) es.es_lblk) + es.es_len) {
 		/*
 		 * The delalloc extent is in front of the queried range,
 		 * find again from the queried start block.
Theodore Ts'o May 9, 2024, 4:39 p.m. UTC | #3
On Thu, May 09, 2024 at 04:16:34PM +0100, Luis Henriques wrote:
> 
> It's looks like it's easy to trigger an infinite loop here using fstest
> generic/039.  If I understand it correctly (which doesn't happen as often
> as I'd like), this is due to an integer overflow in the 'if' condition,
> and should be fixed with the patch below.

Thanks for the report.  However, I can't reproduce the failure, and
looking at generic/039, I don't see how it could be relevant to the
code path in question.  Generic/039 creates a test symlink with two
hard links in the same directory, syncs the file system, and then
removes one of the hard links, and then drops access to the block
device using dmflakey.  So I don't see how the extent code would be
involved at all.  Are you sure that you have the correct test listed?

Looking at the code in question in fs/ext4/extents.c:

again:
	ext4_es_find_extent_range(inode, &ext4_es_is_delayed, hole_start,
				  hole_start + len - 1, &es);
	if (!es.es_len)
		goto insert_hole;

  	 * There's a delalloc extent in the hole, handle it if the delalloc
  	 * extent is in front of, behind and straddle the queried range.
  	 */
 -	if (lblk >= es.es_lblk + es.es_len) {
 +	if (lblk >= ((__u64) es.es_lblk) + es.es_len) {
  		/*
  		 * The delalloc extent is in front of the queried range,
  		 * find again from the queried start block.
		len -= lblk - hole_start;
		hole_start = lblk;
		goto again;

lblk and es.es_lblk are both __u32.  So the infinite loop is
presumably because es.es_lblk + es.es_len has overflowed.  This should
never happen(tm), and in fact we have a test for this case which
*should* have gotten tripped when ext4_es_find_extent_range() calls
__es_tree_search() in fs/ext4/extents_status.c:

static inline ext4_lblk_t ext4_es_end(struct extent_status *es)
{
	BUG_ON(es->es_lblk + es->es_len < es->es_lblk);
	return es->es_lblk + es->es_len - 1;
}

So the patch is harmless, and I can see how it might fix what you were
seeing --- but I'm a bit nervous that I can't reproduce it and the
commit description claims that it reproduces easily; and we should
have never allowed the entry to have gotten introduced into the
extents status tree in the first place, and if it had been introduced,
it should have been caught before it was returned by
ext4_es_find_extent_range().

Can you give more details about the reproducer; can you double check
the test id, and how easily you can trigger the failure, and what is
the hardware you used to run the test?

Many thanks,

					- Ted
Luis Henriques May 9, 2024, 5:23 p.m. UTC | #4
On Thu 09 May 2024 12:39:53 PM -04, Theodore Ts'o wrote;

> On Thu, May 09, 2024 at 04:16:34PM +0100, Luis Henriques wrote:
>> 
>> It's looks like it's easy to trigger an infinite loop here using fstest
>> generic/039.  If I understand it correctly (which doesn't happen as often
>> as I'd like), this is due to an integer overflow in the 'if' condition,
>> and should be fixed with the patch below.
>
> Thanks for the report.  However, I can't reproduce the failure, and
> looking at generic/039, I don't see how it could be relevant to the
> code path in question.  Generic/039 creates a test symlink with two
> hard links in the same directory, syncs the file system, and then
> removes one of the hard links, and then drops access to the block
> device using dmflakey.  So I don't see how the extent code would be
> involved at all.  Are you sure that you have the correct test listed?

Yep, I just retested and it's definitely generic/039.  I'm using a simple
test environment, with virtme-ng.

> Looking at the code in question in fs/ext4/extents.c:
>
> again:
> 	ext4_es_find_extent_range(inode, &ext4_es_is_delayed, hole_start,
> 				  hole_start + len - 1, &es);
> 	if (!es.es_len)
> 		goto insert_hole;
>
>   	 * There's a delalloc extent in the hole, handle it if the delalloc
>   	 * extent is in front of, behind and straddle the queried range.
>   	 */
>  -	if (lblk >= es.es_lblk + es.es_len) {
>  +	if (lblk >= ((__u64) es.es_lblk) + es.es_len) {
>   		/*
>   		 * The delalloc extent is in front of the queried range,
>   		 * find again from the queried start block.
> 		len -= lblk - hole_start;
> 		hole_start = lblk;
> 		goto again;
>
> lblk and es.es_lblk are both __u32.  So the infinite loop is
> presumably because es.es_lblk + es.es_len has overflowed.  This should
> never happen(tm), and in fact we have a test for this case which

If I instrument the code, I can see that es.es_len is definitely set to
EXT_MAX_BLOCKS, which will overflow.

> *should* have gotten tripped when ext4_es_find_extent_range() calls
> __es_tree_search() in fs/ext4/extents_status.c:
>
> static inline ext4_lblk_t ext4_es_end(struct extent_status *es)
> {
> 	BUG_ON(es->es_lblk + es->es_len < es->es_lblk);
> 	return es->es_lblk + es->es_len - 1;
> }
>
> So the patch is harmless, and I can see how it might fix what you were
> seeing --- but I'm a bit nervous that I can't reproduce it and the
> commit description claims that it reproduces easily; and we should
> have never allowed the entry to have gotten introduced into the
> extents status tree in the first place, and if it had been introduced,
> it should have been caught before it was returned by
> ext4_es_find_extent_range().
>
> Can you give more details about the reproducer; can you double check
> the test id, and how easily you can trigger the failure, and what is
> the hardware you used to run the test?

So, here's few more details that may clarify, and that I should have added
to the commit description:

When the test hangs, the test is blocked mounting the flakey device:

   mount -t ext4 -o acl,user_xattr /dev/mapper/flakey-test /mnt/scratch

which will eventually call into ext4_ext_map_blocks(), triggering the bug.

Also, some more code instrumentation shows that after the call to
ext4_ext_find_hole(), the 'hole_start' will be set to '1' and 'len' to
'0xfffffffe'.  This '0xfffffffe' value is a bit odd, but it comes from the
fact that, in ext4_ext_find_hole(), the call to
ext4_ext_next_allocated_block() will return EXT_MAX_BLOCKS and 'len' will
thus be set to 'EXT_MAX_BLOCKS - 1'.

Does this make sense?

Cheers,
Zhang Yi May 10, 2024, 3:39 a.m. UTC | #5
On 2024/5/10 1:23, Luis Henriques wrote:
> On Thu 09 May 2024 12:39:53 PM -04, Theodore Ts'o wrote;
> 
>> On Thu, May 09, 2024 at 04:16:34PM +0100, Luis Henriques wrote:
>>>
>>> It's looks like it's easy to trigger an infinite loop here using fstest
>>> generic/039.  If I understand it correctly (which doesn't happen as often
>>> as I'd like), this is due to an integer overflow in the 'if' condition,
>>> and should be fixed with the patch below.
>>
>> Thanks for the report.  However, I can't reproduce the failure, and
>> looking at generic/039, I don't see how it could be relevant to the
>> code path in question.  Generic/039 creates a test symlink with two
>> hard links in the same directory, syncs the file system, and then
>> removes one of the hard links, and then drops access to the block
>> device using dmflakey.  So I don't see how the extent code would be
>> involved at all.  Are you sure that you have the correct test listed?
> 
> Yep, I just retested and it's definitely generic/039.  I'm using a simple
> test environment, with virtme-ng.
> 
>> Looking at the code in question in fs/ext4/extents.c:
>>
>> again:
>> 	ext4_es_find_extent_range(inode, &ext4_es_is_delayed, hole_start,
>> 				  hole_start + len - 1, &es);
>> 	if (!es.es_len)
>> 		goto insert_hole;
>>
>>   	 * There's a delalloc extent in the hole, handle it if the delalloc
>>   	 * extent is in front of, behind and straddle the queried range.
>>   	 */
>>  -	if (lblk >= es.es_lblk + es.es_len) {
>>  +	if (lblk >= ((__u64) es.es_lblk) + es.es_len) {
>>   		/*
>>   		 * The delalloc extent is in front of the queried range,
>>   		 * find again from the queried start block.
>> 		len -= lblk - hole_start;
>> 		hole_start = lblk;
>> 		goto again;
>>
>> lblk and es.es_lblk are both __u32.  So the infinite loop is
>> presumably because es.es_lblk + es.es_len has overflowed.  This should
>> never happen(tm), and in fact we have a test for this case which
> 
> If I instrument the code, I can see that es.es_len is definitely set to
> EXT_MAX_BLOCKS, which will overflow.
> 

Thanks for the report. After looking at the code, I think the root
cause of this issue is the variable es was not initialized on replaying
fast commit. ext4_es_find_extent_range() will return directly when
EXT4_FC_REPLAY flag is set, and then the es.len becomes stall.

I can always reproduce this issue on generic/039 with
MKFS_OPTIONS="-O fast_commit".

This uninitialization problem originally existed in the old
ext4_ext_put_gap_in_cache(), but it didn't trigger any real problem
since we never check and use extent cache when replaying fast commit.
So I suppose the correct fix would be to unconditionally initialize
the es variable.

Thanks,
Yi.

>> *should* have gotten tripped when ext4_es_find_extent_range() calls
>> __es_tree_search() in fs/ext4/extents_status.c:
>>
>> static inline ext4_lblk_t ext4_es_end(struct extent_status *es)
>> {
>> 	BUG_ON(es->es_lblk + es->es_len < es->es_lblk);
>> 	return es->es_lblk + es->es_len - 1;
>> }
>>
>> So the patch is harmless, and I can see how it might fix what you were
>> seeing --- but I'm a bit nervous that I can't reproduce it and the
>> commit description claims that it reproduces easily; and we should
>> have never allowed the entry to have gotten introduced into the
>> extents status tree in the first place, and if it had been introduced,
>> it should have been caught before it was returned by
>> ext4_es_find_extent_range().
>>
>> Can you give more details about the reproducer; can you double check
>> the test id, and how easily you can trigger the failure, and what is
>> the hardware you used to run the test?
> 
> So, here's few more details that may clarify, and that I should have added
> to the commit description:
> 
> When the test hangs, the test is blocked mounting the flakey device:
> 
>    mount -t ext4 -o acl,user_xattr /dev/mapper/flakey-test /mnt/scratch
> 
> which will eventually call into ext4_ext_map_blocks(), triggering the bug.
> 
> Also, some more code instrumentation shows that after the call to
> ext4_ext_find_hole(), the 'hole_start' will be set to '1' and 'len' to
> '0xfffffffe'.  This '0xfffffffe' value is a bit odd, but it comes from the
> fact that, in ext4_ext_find_hole(), the call to
> ext4_ext_next_allocated_block() will return EXT_MAX_BLOCKS and 'len' will
> thus be set to 'EXT_MAX_BLOCKS - 1'.
> 
> Does this make sense?
> 
> Cheers,
>
Luis Henriques May 10, 2024, 9:41 a.m. UTC | #6
On Fri 10 May 2024 11:39:48 AM +08, Zhang Yi wrote;

> On 2024/5/10 1:23, Luis Henriques wrote:
>> On Thu 09 May 2024 12:39:53 PM -04, Theodore Ts'o wrote;
>> 
>>> On Thu, May 09, 2024 at 04:16:34PM +0100, Luis Henriques wrote:
>>>>
>>>> It's looks like it's easy to trigger an infinite loop here using fstest
>>>> generic/039.  If I understand it correctly (which doesn't happen as often
>>>> as I'd like), this is due to an integer overflow in the 'if' condition,
>>>> and should be fixed with the patch below.
>>>
>>> Thanks for the report.  However, I can't reproduce the failure, and
>>> looking at generic/039, I don't see how it could be relevant to the
>>> code path in question.  Generic/039 creates a test symlink with two
>>> hard links in the same directory, syncs the file system, and then
>>> removes one of the hard links, and then drops access to the block
>>> device using dmflakey.  So I don't see how the extent code would be
>>> involved at all.  Are you sure that you have the correct test listed?
>> 
>> Yep, I just retested and it's definitely generic/039.  I'm using a simple
>> test environment, with virtme-ng.
>> 
>>> Looking at the code in question in fs/ext4/extents.c:
>>>
>>> again:
>>> 	ext4_es_find_extent_range(inode, &ext4_es_is_delayed, hole_start,
>>> 				  hole_start + len - 1, &es);
>>> 	if (!es.es_len)
>>> 		goto insert_hole;
>>>
>>>   	 * There's a delalloc extent in the hole, handle it if the delalloc
>>>   	 * extent is in front of, behind and straddle the queried range.
>>>   	 */
>>>  -	if (lblk >= es.es_lblk + es.es_len) {
>>>  +	if (lblk >= ((__u64) es.es_lblk) + es.es_len) {
>>>   		/*
>>>   		 * The delalloc extent is in front of the queried range,
>>>   		 * find again from the queried start block.
>>> 		len -= lblk - hole_start;
>>> 		hole_start = lblk;
>>> 		goto again;
>>>
>>> lblk and es.es_lblk are both __u32.  So the infinite loop is
>>> presumably because es.es_lblk + es.es_len has overflowed.  This should
>>> never happen(tm), and in fact we have a test for this case which
>> 
>> If I instrument the code, I can see that es.es_len is definitely set to
>> EXT_MAX_BLOCKS, which will overflow.
>> 
>
> Thanks for the report. After looking at the code, I think the root
> cause of this issue is the variable es was not initialized on replaying
> fast commit. ext4_es_find_extent_range() will return directly when
> EXT4_FC_REPLAY flag is set, and then the es.len becomes stall.
>
> I can always reproduce this issue on generic/039 with
> MKFS_OPTIONS="-O fast_commit".
>
> This uninitialization problem originally existed in the old
> ext4_ext_put_gap_in_cache(), but it didn't trigger any real problem
> since we never check and use extent cache when replaying fast commit.
> So I suppose the correct fix would be to unconditionally initialize
> the es variable.

Oh, you're absolutely right -- the extent_status 'es' struct isn't being
initialized in that case.  I totally failed to see that.  And yes, I also
failed to mention I had 'fast_commit' feature enabled, sorry!

Thanks a lot for figuring this out, Yi.  I'm looking at this code and
trying to understand if it would be safe to call __es_find_extent_range()
when EXT4_FC_REPLAY is in progress.  Probably not, and probably better to
simply do:

	es->es_lblk = es->es_len = es->es_pblk = 0;

in that case.  I'll send out a patch later today.

Cheers,
Zhang Yi May 10, 2024, 11:40 a.m. UTC | #7
On 2024/5/10 17:41, Luis Henriques wrote:
> On Fri 10 May 2024 11:39:48 AM +08, Zhang Yi wrote;
> 
>> On 2024/5/10 1:23, Luis Henriques wrote:
>>> On Thu 09 May 2024 12:39:53 PM -04, Theodore Ts'o wrote;
>>>
>>>> On Thu, May 09, 2024 at 04:16:34PM +0100, Luis Henriques wrote:
>>>>>
>>>>> It's looks like it's easy to trigger an infinite loop here using fstest
>>>>> generic/039.  If I understand it correctly (which doesn't happen as often
>>>>> as I'd like), this is due to an integer overflow in the 'if' condition,
>>>>> and should be fixed with the patch below.
>>>>
>>>> Thanks for the report.  However, I can't reproduce the failure, and
>>>> looking at generic/039, I don't see how it could be relevant to the
>>>> code path in question.  Generic/039 creates a test symlink with two
>>>> hard links in the same directory, syncs the file system, and then
>>>> removes one of the hard links, and then drops access to the block
>>>> device using dmflakey.  So I don't see how the extent code would be
>>>> involved at all.  Are you sure that you have the correct test listed?
>>>
>>> Yep, I just retested and it's definitely generic/039.  I'm using a simple
>>> test environment, with virtme-ng.
>>>
>>>> Looking at the code in question in fs/ext4/extents.c:
>>>>
>>>> again:
>>>> 	ext4_es_find_extent_range(inode, &ext4_es_is_delayed, hole_start,
>>>> 				  hole_start + len - 1, &es);
>>>> 	if (!es.es_len)
>>>> 		goto insert_hole;
>>>>
>>>>   	 * There's a delalloc extent in the hole, handle it if the delalloc
>>>>   	 * extent is in front of, behind and straddle the queried range.
>>>>   	 */
>>>>  -	if (lblk >= es.es_lblk + es.es_len) {
>>>>  +	if (lblk >= ((__u64) es.es_lblk) + es.es_len) {
>>>>   		/*
>>>>   		 * The delalloc extent is in front of the queried range,
>>>>   		 * find again from the queried start block.
>>>> 		len -= lblk - hole_start;
>>>> 		hole_start = lblk;
>>>> 		goto again;
>>>>
>>>> lblk and es.es_lblk are both __u32.  So the infinite loop is
>>>> presumably because es.es_lblk + es.es_len has overflowed.  This should
>>>> never happen(tm), and in fact we have a test for this case which
>>>
>>> If I instrument the code, I can see that es.es_len is definitely set to
>>> EXT_MAX_BLOCKS, which will overflow.
>>>
>>
>> Thanks for the report. After looking at the code, I think the root
>> cause of this issue is the variable es was not initialized on replaying
>> fast commit. ext4_es_find_extent_range() will return directly when
>> EXT4_FC_REPLAY flag is set, and then the es.len becomes stall.
>>
>> I can always reproduce this issue on generic/039 with
>> MKFS_OPTIONS="-O fast_commit".
>>
>> This uninitialization problem originally existed in the old
>> ext4_ext_put_gap_in_cache(), but it didn't trigger any real problem
>> since we never check and use extent cache when replaying fast commit.
>> So I suppose the correct fix would be to unconditionally initialize
>> the es variable.
> 
> Oh, you're absolutely right -- the extent_status 'es' struct isn't being
> initialized in that case.  I totally failed to see that.  And yes, I also
> failed to mention I had 'fast_commit' feature enabled, sorry!
> 
> Thanks a lot for figuring this out, Yi.  I'm looking at this code and
> trying to understand if it would be safe to call __es_find_extent_range()
> when EXT4_FC_REPLAY is in progress.  Probably not, and probably better to
> simply do:
> 
> 	es->es_lblk = es->es_len = es->es_pblk = 0;
> 
> in that case.  I'll send out a patch later today.
> 

Yeah, I'm glad it could help.

Thanks,
Yi.
diff mbox series

Patch

diff --git a/fs/ext4/extents.c b/fs/ext4/extents.c
index d5efe076d3d3..e0b7e48c4c67 100644
--- a/fs/ext4/extents.c
+++ b/fs/ext4/extents.c
@@ -2229,7 +2229,7 @@  static int ext4_fill_es_cache_info(struct inode *inode,
 
 
 /*
- * ext4_ext_determine_hole - determine hole around given block
+ * ext4_ext_find_hole - find hole around given block according to the given path
  * @inode:	inode we lookup in
  * @path:	path in extent tree to @lblk
  * @lblk:	pointer to logical block around which we want to determine hole
@@ -2241,9 +2241,9 @@  static int ext4_fill_es_cache_info(struct inode *inode,
  * The function returns the length of a hole starting at @lblk. We update @lblk
  * to the beginning of the hole if we managed to find it.
  */
-static ext4_lblk_t ext4_ext_determine_hole(struct inode *inode,
-					   struct ext4_ext_path *path,
-					   ext4_lblk_t *lblk)
+static ext4_lblk_t ext4_ext_find_hole(struct inode *inode,
+				      struct ext4_ext_path *path,
+				      ext4_lblk_t *lblk)
 {
 	int depth = ext_depth(inode);
 	struct ext4_extent *ex;
@@ -2270,30 +2270,6 @@  static ext4_lblk_t ext4_ext_determine_hole(struct inode *inode,
 	return len;
 }
 
-/*
- * ext4_ext_put_gap_in_cache:
- * calculate boundaries of the gap that the requested block fits into
- * and cache this gap
- */
-static void
-ext4_ext_put_gap_in_cache(struct inode *inode, ext4_lblk_t hole_start,
-			  ext4_lblk_t hole_len)
-{
-	struct extent_status es;
-
-	ext4_es_find_extent_range(inode, &ext4_es_is_delayed, hole_start,
-				  hole_start + hole_len - 1, &es);
-	if (es.es_len) {
-		/* There's delayed extent containing lblock? */
-		if (es.es_lblk <= hole_start)
-			return;
-		hole_len = min(es.es_lblk - hole_start, hole_len);
-	}
-	ext_debug(inode, " -> %u:%u\n", hole_start, hole_len);
-	ext4_es_insert_extent(inode, hole_start, hole_len, ~0,
-			      EXTENT_STATUS_HOLE);
-}
-
 /*
  * ext4_ext_rm_idx:
  * removes index from the index block.
@@ -4062,6 +4038,69 @@  static int get_implied_cluster_alloc(struct super_block *sb,
 	return 0;
 }
 
+/*
+ * Determine hole length around the given logical block, first try to
+ * locate and expand the hole from the given @path, and then adjust it
+ * if it's partially or completely converted to delayed extents, insert
+ * it into the extent cache tree if it's indeed a hole, finally return
+ * the length of the determined extent.
+ */
+static ext4_lblk_t ext4_ext_determine_insert_hole(struct inode *inode,
+						  struct ext4_ext_path *path,
+						  ext4_lblk_t lblk)
+{
+	ext4_lblk_t hole_start, len;
+	struct extent_status es;
+
+	hole_start = lblk;
+	len = ext4_ext_find_hole(inode, path, &hole_start);
+again:
+	ext4_es_find_extent_range(inode, &ext4_es_is_delayed, hole_start,
+				  hole_start + len - 1, &es);
+	if (!es.es_len)
+		goto insert_hole;
+
+	/*
+	 * There's a delalloc extent in the hole, handle it if the delalloc
+	 * extent is in front of, behind and straddle the queried range.
+	 */
+	if (lblk >= es.es_lblk + es.es_len) {
+		/*
+		 * The delalloc extent is in front of the queried range,
+		 * find again from the queried start block.
+		 */
+		len -= lblk - hole_start;
+		hole_start = lblk;
+		goto again;
+	} else if (in_range(lblk, es.es_lblk, es.es_len)) {
+		/*
+		 * The delalloc extent containing lblk, it must have been
+		 * added after ext4_map_blocks() checked the extent status
+		 * tree, adjust the length to the delalloc extent's after
+		 * lblk.
+		 */
+		len = es.es_lblk + es.es_len - lblk;
+		return len;
+	} else {
+		/*
+		 * The delalloc extent is partially or completely behind
+		 * the queried range, update hole length until the
+		 * beginning of the delalloc extent.
+		 */
+		len = min(es.es_lblk - hole_start, len);
+	}
+
+insert_hole:
+	/* Put just found gap into cache to speed up subsequent requests */
+	ext_debug(inode, " -> %u:%u\n", hole_start, len);
+	ext4_es_insert_extent(inode, hole_start, len, ~0, EXTENT_STATUS_HOLE);
+
+	/* Update hole_len to reflect hole size after lblk */
+	if (hole_start != lblk)
+		len -= lblk - hole_start;
+
+	return len;
+}
 
 /*
  * Block allocation/map/preallocation routine for extents based files
@@ -4179,22 +4218,12 @@  int ext4_ext_map_blocks(handle_t *handle, struct inode *inode,
 	 * we couldn't try to create block if create flag is zero
 	 */
 	if ((flags & EXT4_GET_BLOCKS_CREATE) == 0) {
-		ext4_lblk_t hole_start, hole_len;
+		ext4_lblk_t len;
 
-		hole_start = map->m_lblk;
-		hole_len = ext4_ext_determine_hole(inode, path, &hole_start);
-		/*
-		 * put just found gap into cache to speed up
-		 * subsequent requests
-		 */
-		ext4_ext_put_gap_in_cache(inode, hole_start, hole_len);
+		len = ext4_ext_determine_insert_hole(inode, path, map->m_lblk);
 
-		/* Update hole_len to reflect hole size after map->m_lblk */
-		if (hole_start != map->m_lblk)
-			hole_len -= map->m_lblk - hole_start;
 		map->m_pblk = 0;
-		map->m_len = min_t(unsigned int, map->m_len, hole_len);
-
+		map->m_len = min_t(unsigned int, map->m_len, len);
 		goto out;
 	}