diff mbox series

[02/10] iomap: remove iomap_file_buffered_write_punch_delalloc

Message ID 20240923152904.1747117-3-hch@lst.de (mailing list archive)
State New
Headers show
Series [01/10] iomap: factor out a iomap_last_written_block helper | expand

Commit Message

Christoph Hellwig Sept. 23, 2024, 3:28 p.m. UTC
Currently iomap_file_buffered_write_punch_delalloc can be called from
XFS either with the invalidate lock held or not.  To fix this while
keeping the locking in the file system and not the iomap library
code we'll need to life the locking up into the file system.

To prepare for that, open code iomap_file_buffered_write_punch_delalloc
in the only caller, and instead export iomap_write_delalloc_release.

Signed-off-by: Christoph Hellwig <hch@lst.de>
---
 .../filesystems/iomap/operations.rst          |  2 +-
 fs/iomap/buffered-io.c                        | 85 ++++++-------------
 fs/xfs/xfs_iomap.c                            | 16 +++-
 include/linux/iomap.h                         |  6 +-
 4 files changed, 46 insertions(+), 63 deletions(-)

Comments

Darrick J. Wong Sept. 23, 2024, 4:18 p.m. UTC | #1
On Mon, Sep 23, 2024 at 05:28:16PM +0200, Christoph Hellwig wrote:
> Currently iomap_file_buffered_write_punch_delalloc can be called from
> XFS either with the invalidate lock held or not.  To fix this while
> keeping the locking in the file system and not the iomap library
> code we'll need to life the locking up into the file system.
> 
> To prepare for that, open code iomap_file_buffered_write_punch_delalloc
> in the only caller, and instead export iomap_write_delalloc_release.
> 
> Signed-off-by: Christoph Hellwig <hch@lst.de>
> ---
>  .../filesystems/iomap/operations.rst          |  2 +-
>  fs/iomap/buffered-io.c                        | 85 ++++++-------------
>  fs/xfs/xfs_iomap.c                            | 16 +++-
>  include/linux/iomap.h                         |  6 +-
>  4 files changed, 46 insertions(+), 63 deletions(-)
> 
> diff --git a/Documentation/filesystems/iomap/operations.rst b/Documentation/filesystems/iomap/operations.rst
> index 8e6c721d233010..b93115ab8748ae 100644
> --- a/Documentation/filesystems/iomap/operations.rst
> +++ b/Documentation/filesystems/iomap/operations.rst
> @@ -208,7 +208,7 @@ The filesystem must arrange to `cancel
>  such `reservations
>  <https://lore.kernel.org/linux-xfs/20220817093627.GZ3600936@dread.disaster.area/>`_
>  because writeback will not consume the reservation.
> -The ``iomap_file_buffered_write_punch_delalloc`` can be called from a
> +The ``iomap_write_delalloc_release`` can be called from a
>  ``->iomap_end`` function to find all the clean areas of the folios
>  caching a fresh (``IOMAP_F_NEW``) delalloc mapping.
>  It takes the ``invalidate_lock``.
> diff --git a/fs/iomap/buffered-io.c b/fs/iomap/buffered-io.c
> index 884891ac7a226c..237aeb883166df 100644
> --- a/fs/iomap/buffered-io.c
> +++ b/fs/iomap/buffered-io.c
> @@ -1149,6 +1149,32 @@ static void iomap_write_delalloc_scan(struct inode *inode,
>   * have dirty data still pending in the page cache - those are going to be
>   * written and so must still retain the delalloc backing for writeback.
>   *
> + * When a short write occurs, the filesystem may need to remove reserved space
> + * that was allocated in ->iomap_begin from it's ->iomap_end method. For

"When a short write occurs, the filesystem may need to remove space
reservations created in ->iomap_begin.

> + * filesystems that use delayed allocation, we need to punch out delalloc
> + * extents from the range that are not dirty in the page cache. As the write can
> + * race with page faults, there can be dirty pages over the delalloc extent
> + * outside the range of a short write but still within the delalloc extent
> + * allocated for this iomap.
> + *
> + * The punch() callback *must* only punch delalloc extents in the range passed
> + * to it. It must skip over all other types of extents in the range and leave
> + * them completely unchanged. It must do this punch atomically with respect to
> + * other extent modifications.

Can a failing buffered write race with a write fault to the same file
range?

write() thread:			page_mkwrite thread:
---------------			--------------------
take i_rwsem
->iomap_begin
create da reservation
lock folio
fail to write
unlock folio
				take invalidation lock
				lock folio
				->iomap_begin
				sees da reservation
				mark folio dirty
				unlock folio
				drop invalidation lock
->iomap_end
take invalidation lock
iomap_write_delalloc_release
drop invalidation lock

Can we end up in this situation, where the write fault thinks it has a
dirty page backed by a delalloc reservation, yet the delalloc
reservation gets removed by the delalloc punch logic?  I think the
answer to my question is that this sequence is impossible because the
write fault dirties the folio so the iomap_write_delalloc_release does
nothing, correct?

Unrelated question about iomap_write_begin: Can we get rid of the
!mapping_large_folio_support if-body just prior to __iomap_get_folio?
filemap_get_folio won't return large folios if
!mapping_large_folio_support, so I think the separate check in iomap
isn't needed anymore?

This push-down looks fine though, so
Reviewed-by: Darrick J. Wong <djwong@kernel.org>

--D

> + *
> + * The punch() callback may be called with a folio locked to prevent writeback
> + * extent allocation racing at the edge of the range we are currently punching.
> + * The locked folio may or may not cover the range being punched, so it is not
> + * safe for the punch() callback to lock folios itself.
> + *
> + * Lock order is:
> + *
> + * inode->i_rwsem (shared or exclusive)
> + *   inode->i_mapping->invalidate_lock (exclusive)
> + *     folio_lock()
> + *       ->punch
> + *         internal filesystem allocation lock
> + *
>   * As we are scanning the page cache for data, we don't need to reimplement the
>   * wheel - mapping_seek_hole_data() does exactly what we need to identify the
>   * start and end of data ranges correctly even for sub-folio block sizes. This
> @@ -1177,7 +1203,7 @@ static void iomap_write_delalloc_scan(struct inode *inode,
>   * require sprinkling this code with magic "+ 1" and "- 1" arithmetic and expose
>   * the code to subtle off-by-one bugs....
>   */
> -static void iomap_write_delalloc_release(struct inode *inode, loff_t start_byte,
> +void iomap_write_delalloc_release(struct inode *inode, loff_t start_byte,
>  		loff_t end_byte, unsigned flags, struct iomap *iomap,
>  		iomap_punch_t punch)
>  {
> @@ -1243,62 +1269,7 @@ static void iomap_write_delalloc_release(struct inode *inode, loff_t start_byte,
>  out_unlock:
>  	filemap_invalidate_unlock(inode->i_mapping);
>  }
> -
> -/*
> - * When a short write occurs, the filesystem may need to remove reserved space
> - * that was allocated in ->iomap_begin from it's ->iomap_end method. For
> - * filesystems that use delayed allocation, we need to punch out delalloc
> - * extents from the range that are not dirty in the page cache. As the write can
> - * race with page faults, there can be dirty pages over the delalloc extent
> - * outside the range of a short write but still within the delalloc extent
> - * allocated for this iomap.
> - *
> - * This function uses [start_byte, end_byte) intervals (i.e. open ended) to
> - * simplify range iterations.
> - *
> - * The punch() callback *must* only punch delalloc extents in the range passed
> - * to it. It must skip over all other types of extents in the range and leave
> - * them completely unchanged. It must do this punch atomically with respect to
> - * other extent modifications.
> - *
> - * The punch() callback may be called with a folio locked to prevent writeback
> - * extent allocation racing at the edge of the range we are currently punching.
> - * The locked folio may or may not cover the range being punched, so it is not
> - * safe for the punch() callback to lock folios itself.
> - *
> - * Lock order is:
> - *
> - * inode->i_rwsem (shared or exclusive)
> - *   inode->i_mapping->invalidate_lock (exclusive)
> - *     folio_lock()
> - *       ->punch
> - *         internal filesystem allocation lock
> - */
> -void iomap_file_buffered_write_punch_delalloc(struct inode *inode,
> -		loff_t pos, loff_t length, ssize_t written, unsigned flags,
> -		struct iomap *iomap, iomap_punch_t punch)
> -{
> -	loff_t			start_byte;
> -	loff_t			end_byte;
> -
> -	if (iomap->type != IOMAP_DELALLOC)
> -		return;
> -
> -	/* If we didn't reserve the blocks, we're not allowed to punch them. */
> -	if (!(iomap->flags & IOMAP_F_NEW))
> -		return;
> -
> -	start_byte = iomap_last_written_block(inode, pos, written);
> -	end_byte = round_up(pos + length, i_blocksize(inode));
> -
> -	/* Nothing to do if we've written the entire delalloc extent */
> -	if (start_byte >= end_byte)
> -		return;
> -
> -	iomap_write_delalloc_release(inode, start_byte, end_byte, flags, iomap,
> -			punch);
> -}
> -EXPORT_SYMBOL_GPL(iomap_file_buffered_write_punch_delalloc);
> +EXPORT_SYMBOL_GPL(iomap_write_delalloc_release);
>  
>  static loff_t iomap_unshare_iter(struct iomap_iter *iter)
>  {
> diff --git a/fs/xfs/xfs_iomap.c b/fs/xfs/xfs_iomap.c
> index 1e11f48814c0d0..30f2530b6d5461 100644
> --- a/fs/xfs/xfs_iomap.c
> +++ b/fs/xfs/xfs_iomap.c
> @@ -1227,8 +1227,20 @@ xfs_buffered_write_iomap_end(
>  	unsigned		flags,
>  	struct iomap		*iomap)
>  {
> -	iomap_file_buffered_write_punch_delalloc(inode, offset, length, written,
> -			flags, iomap, &xfs_buffered_write_delalloc_punch);
> +	loff_t			start_byte, end_byte;
> +
> +	/* If we didn't reserve the blocks, we're not allowed to punch them. */
> +	if (iomap->type != IOMAP_DELALLOC || !(iomap->flags & IOMAP_F_NEW))
> +		return 0;
> +
> +	/* Nothing to do if we've written the entire delalloc extent */
> +	start_byte = iomap_last_written_block(inode, offset, written);
> +	end_byte = round_up(offset + length, i_blocksize(inode));
> +	if (start_byte >= end_byte)
> +		return 0;
> +
> +	iomap_write_delalloc_release(inode, start_byte, end_byte, flags, iomap,
> +			xfs_buffered_write_delalloc_punch);
>  	return 0;
>  }
>  
> diff --git a/include/linux/iomap.h b/include/linux/iomap.h
> index e62df5d93f04de..137e0783faa224 100644
> --- a/include/linux/iomap.h
> +++ b/include/linux/iomap.h
> @@ -290,9 +290,9 @@ vm_fault_t iomap_page_mkwrite(struct vm_fault *vmf,
>  
>  typedef void (*iomap_punch_t)(struct inode *inode, loff_t offset, loff_t length,
>  		struct iomap *iomap);
> -void iomap_file_buffered_write_punch_delalloc(struct inode *inode, loff_t pos,
> -		loff_t length, ssize_t written, unsigned flag,
> -		struct iomap *iomap, iomap_punch_t punch);
> +void iomap_write_delalloc_release(struct inode *inode, loff_t start_byte,
> +		loff_t end_byte, unsigned flags, struct iomap *iomap,
> +		iomap_punch_t punch);
>  
>  int iomap_fiemap(struct inode *inode, struct fiemap_extent_info *fieinfo,
>  		u64 start, u64 len, const struct iomap_ops *ops);
> -- 
> 2.45.2
>
Dave Chinner Sept. 23, 2024, 10:43 p.m. UTC | #2
On Mon, Sep 23, 2024 at 09:18:25AM -0700, Darrick J. Wong wrote:
> On Mon, Sep 23, 2024 at 05:28:16PM +0200, Christoph Hellwig wrote:
> > Currently iomap_file_buffered_write_punch_delalloc can be called from
> > XFS either with the invalidate lock held or not.  To fix this while
> > keeping the locking in the file system and not the iomap library
> > code we'll need to life the locking up into the file system.
> > 
> > To prepare for that, open code iomap_file_buffered_write_punch_delalloc
> > in the only caller, and instead export iomap_write_delalloc_release.
> > 
> > Signed-off-by: Christoph Hellwig <hch@lst.de>
> > ---
> >  .../filesystems/iomap/operations.rst          |  2 +-
> >  fs/iomap/buffered-io.c                        | 85 ++++++-------------
> >  fs/xfs/xfs_iomap.c                            | 16 +++-
> >  include/linux/iomap.h                         |  6 +-
> >  4 files changed, 46 insertions(+), 63 deletions(-)
> > 
> > diff --git a/Documentation/filesystems/iomap/operations.rst b/Documentation/filesystems/iomap/operations.rst
> > index 8e6c721d233010..b93115ab8748ae 100644
> > --- a/Documentation/filesystems/iomap/operations.rst
> > +++ b/Documentation/filesystems/iomap/operations.rst
> > @@ -208,7 +208,7 @@ The filesystem must arrange to `cancel
> >  such `reservations
> >  <https://lore.kernel.org/linux-xfs/20220817093627.GZ3600936@dread.disaster.area/>`_
> >  because writeback will not consume the reservation.
> > -The ``iomap_file_buffered_write_punch_delalloc`` can be called from a
> > +The ``iomap_write_delalloc_release`` can be called from a
> >  ``->iomap_end`` function to find all the clean areas of the folios
> >  caching a fresh (``IOMAP_F_NEW``) delalloc mapping.
> >  It takes the ``invalidate_lock``.
> > diff --git a/fs/iomap/buffered-io.c b/fs/iomap/buffered-io.c
> > index 884891ac7a226c..237aeb883166df 100644
> > --- a/fs/iomap/buffered-io.c
> > +++ b/fs/iomap/buffered-io.c
> > @@ -1149,6 +1149,32 @@ static void iomap_write_delalloc_scan(struct inode *inode,
> >   * have dirty data still pending in the page cache - those are going to be
> >   * written and so must still retain the delalloc backing for writeback.
> >   *
> > + * When a short write occurs, the filesystem may need to remove reserved space
> > + * that was allocated in ->iomap_begin from it's ->iomap_end method. For
> 
> "When a short write occurs, the filesystem may need to remove space
> reservations created in ->iomap_begin.
> 
> > + * filesystems that use delayed allocation, we need to punch out delalloc
> > + * extents from the range that are not dirty in the page cache. As the write can
> > + * race with page faults, there can be dirty pages over the delalloc extent
> > + * outside the range of a short write but still within the delalloc extent
> > + * allocated for this iomap.
> > + *
> > + * The punch() callback *must* only punch delalloc extents in the range passed
> > + * to it. It must skip over all other types of extents in the range and leave
> > + * them completely unchanged. It must do this punch atomically with respect to
> > + * other extent modifications.
> 
> Can a failing buffered write race with a write fault to the same file
> range?

Yes.

> write() thread:			page_mkwrite thread:
> ---------------			--------------------
> take i_rwsem
> ->iomap_begin
> create da reservation
> lock folio
> fail to write
> unlock folio
> 				take invalidation lock
> 				lock folio
> 				->iomap_begin
> 				sees da reservation
> 				mark folio dirty
> 				unlock folio
> 				drop invalidation lock
> ->iomap_end
> take invalidation lock
> iomap_write_delalloc_release
> drop invalidation lock
> 
> Can we end up in this situation, where the write fault thinks it has a
> dirty page backed by a delalloc reservation, yet the delalloc
> reservation gets removed by the delalloc punch logic? 

No.

> I think the
> answer to my question is that this sequence is impossible because the
> write fault dirties the folio so the iomap_write_delalloc_release does
> nothing, correct?

Yes.

The above situation is the race condition that the delalloc punching
code is taking into account when it checks for dirty data over the
range being punched. As the comment above
iomap_write_delalloc_release() says:

/*
 * Punch out all the delalloc blocks in the range given except for those that
 * have dirty data still pending in the page cache - those are going to be
 * written and so must still retain the delalloc backing for writeback.
 *
....

-Dave.
Christoph Hellwig Sept. 24, 2024, 5:55 a.m. UTC | #3
On Mon, Sep 23, 2024 at 09:18:25AM -0700, Darrick J. Wong wrote:
> > + * When a short write occurs, the filesystem may need to remove reserved space
> > + * that was allocated in ->iomap_begin from it's ->iomap_end method. For
> 
> "When a short write occurs, the filesystem may need to remove space
> reservations created in ->iomap_begin.

This just moved the text from the existing comment.  I agree that your
wording is better, but I'd keep the "from it's ->iomap_end".

> Unrelated question about iomap_write_begin: Can we get rid of the
> !mapping_large_folio_support if-body just prior to __iomap_get_folio?
> filemap_get_folio won't return large folios if
> !mapping_large_folio_support, so I think the separate check in iomap
> isn't needed anymore?

From the iomap POV it seems like we could (after checking no one
is doing something weird with len in ->get_folio).
Darrick J. Wong Sept. 24, 2024, 6:05 a.m. UTC | #4
On Tue, Sep 24, 2024 at 07:55:33AM +0200, Christoph Hellwig wrote:
> On Mon, Sep 23, 2024 at 09:18:25AM -0700, Darrick J. Wong wrote:
> > > + * When a short write occurs, the filesystem may need to remove reserved space
> > > + * that was allocated in ->iomap_begin from it's ->iomap_end method. For
> > 
> > "When a short write occurs, the filesystem may need to remove space
> > reservations created in ->iomap_begin.
> 
> This just moved the text from the existing comment.  I agree that your
> wording is better, but I'd keep the "from it's ->iomap_end".

Yeah, please do.  What do you think of:

"When a short write occurs, the filesystem might need to use ->iomap_end
to remove space reservations created in ->iomap_begin." ?

> > Unrelated question about iomap_write_begin: Can we get rid of the
> > !mapping_large_folio_support if-body just prior to __iomap_get_folio?
> > filemap_get_folio won't return large folios if
> > !mapping_large_folio_support, so I think the separate check in iomap
> > isn't needed anymore?
> 
> From the iomap POV it seems like we could (after checking no one
> is doing something weird with len in ->get_folio).

The only user I know of is gfs2, which allocates a transaction and then
calls iomap_get_folio with pos/len unchanged.

--D
Christoph Hellwig Sept. 24, 2024, 6:10 a.m. UTC | #5
On Mon, Sep 23, 2024 at 11:05:16PM -0700, Darrick J. Wong wrote:
> Yeah, please do.  What do you think of:
> 
> "When a short write occurs, the filesystem might need to use ->iomap_end
> to remove space reservations created in ->iomap_begin." ?

Sounds good, I'll update it again.

> > > Unrelated question about iomap_write_begin: Can we get rid of the
> > > !mapping_large_folio_support if-body just prior to __iomap_get_folio?
> > > filemap_get_folio won't return large folios if
> > > !mapping_large_folio_support, so I think the separate check in iomap
> > > isn't needed anymore?
> > 
> > From the iomap POV it seems like we could (after checking no one
> > is doing something weird with len in ->get_folio).
> 
> The only user I know of is gfs2, which allocates a transaction and then
> calls iomap_get_folio with pos/len unchanged.

Yeah, so it _should_ be fine.  Not really feeling like changing it now
with all the other stuff in flight, though.  And eventually I really
want to sort out a few things in the area, like confusing
__iomap_get_folio naming for the wrapper with iomap_get_folio as
the default instance, and the fact that the get_folio and invalidation
are indirect calls right next to each other.

> 
> --D
---end quoted text---
diff mbox series

Patch

diff --git a/Documentation/filesystems/iomap/operations.rst b/Documentation/filesystems/iomap/operations.rst
index 8e6c721d233010..b93115ab8748ae 100644
--- a/Documentation/filesystems/iomap/operations.rst
+++ b/Documentation/filesystems/iomap/operations.rst
@@ -208,7 +208,7 @@  The filesystem must arrange to `cancel
 such `reservations
 <https://lore.kernel.org/linux-xfs/20220817093627.GZ3600936@dread.disaster.area/>`_
 because writeback will not consume the reservation.
-The ``iomap_file_buffered_write_punch_delalloc`` can be called from a
+The ``iomap_write_delalloc_release`` can be called from a
 ``->iomap_end`` function to find all the clean areas of the folios
 caching a fresh (``IOMAP_F_NEW``) delalloc mapping.
 It takes the ``invalidate_lock``.
diff --git a/fs/iomap/buffered-io.c b/fs/iomap/buffered-io.c
index 884891ac7a226c..237aeb883166df 100644
--- a/fs/iomap/buffered-io.c
+++ b/fs/iomap/buffered-io.c
@@ -1149,6 +1149,32 @@  static void iomap_write_delalloc_scan(struct inode *inode,
  * have dirty data still pending in the page cache - those are going to be
  * written and so must still retain the delalloc backing for writeback.
  *
+ * When a short write occurs, the filesystem may need to remove reserved space
+ * that was allocated in ->iomap_begin from it's ->iomap_end method. For
+ * filesystems that use delayed allocation, we need to punch out delalloc
+ * extents from the range that are not dirty in the page cache. As the write can
+ * race with page faults, there can be dirty pages over the delalloc extent
+ * outside the range of a short write but still within the delalloc extent
+ * allocated for this iomap.
+ *
+ * The punch() callback *must* only punch delalloc extents in the range passed
+ * to it. It must skip over all other types of extents in the range and leave
+ * them completely unchanged. It must do this punch atomically with respect to
+ * other extent modifications.
+ *
+ * The punch() callback may be called with a folio locked to prevent writeback
+ * extent allocation racing at the edge of the range we are currently punching.
+ * The locked folio may or may not cover the range being punched, so it is not
+ * safe for the punch() callback to lock folios itself.
+ *
+ * Lock order is:
+ *
+ * inode->i_rwsem (shared or exclusive)
+ *   inode->i_mapping->invalidate_lock (exclusive)
+ *     folio_lock()
+ *       ->punch
+ *         internal filesystem allocation lock
+ *
  * As we are scanning the page cache for data, we don't need to reimplement the
  * wheel - mapping_seek_hole_data() does exactly what we need to identify the
  * start and end of data ranges correctly even for sub-folio block sizes. This
@@ -1177,7 +1203,7 @@  static void iomap_write_delalloc_scan(struct inode *inode,
  * require sprinkling this code with magic "+ 1" and "- 1" arithmetic and expose
  * the code to subtle off-by-one bugs....
  */
-static void iomap_write_delalloc_release(struct inode *inode, loff_t start_byte,
+void iomap_write_delalloc_release(struct inode *inode, loff_t start_byte,
 		loff_t end_byte, unsigned flags, struct iomap *iomap,
 		iomap_punch_t punch)
 {
@@ -1243,62 +1269,7 @@  static void iomap_write_delalloc_release(struct inode *inode, loff_t start_byte,
 out_unlock:
 	filemap_invalidate_unlock(inode->i_mapping);
 }
-
-/*
- * When a short write occurs, the filesystem may need to remove reserved space
- * that was allocated in ->iomap_begin from it's ->iomap_end method. For
- * filesystems that use delayed allocation, we need to punch out delalloc
- * extents from the range that are not dirty in the page cache. As the write can
- * race with page faults, there can be dirty pages over the delalloc extent
- * outside the range of a short write but still within the delalloc extent
- * allocated for this iomap.
- *
- * This function uses [start_byte, end_byte) intervals (i.e. open ended) to
- * simplify range iterations.
- *
- * The punch() callback *must* only punch delalloc extents in the range passed
- * to it. It must skip over all other types of extents in the range and leave
- * them completely unchanged. It must do this punch atomically with respect to
- * other extent modifications.
- *
- * The punch() callback may be called with a folio locked to prevent writeback
- * extent allocation racing at the edge of the range we are currently punching.
- * The locked folio may or may not cover the range being punched, so it is not
- * safe for the punch() callback to lock folios itself.
- *
- * Lock order is:
- *
- * inode->i_rwsem (shared or exclusive)
- *   inode->i_mapping->invalidate_lock (exclusive)
- *     folio_lock()
- *       ->punch
- *         internal filesystem allocation lock
- */
-void iomap_file_buffered_write_punch_delalloc(struct inode *inode,
-		loff_t pos, loff_t length, ssize_t written, unsigned flags,
-		struct iomap *iomap, iomap_punch_t punch)
-{
-	loff_t			start_byte;
-	loff_t			end_byte;
-
-	if (iomap->type != IOMAP_DELALLOC)
-		return;
-
-	/* If we didn't reserve the blocks, we're not allowed to punch them. */
-	if (!(iomap->flags & IOMAP_F_NEW))
-		return;
-
-	start_byte = iomap_last_written_block(inode, pos, written);
-	end_byte = round_up(pos + length, i_blocksize(inode));
-
-	/* Nothing to do if we've written the entire delalloc extent */
-	if (start_byte >= end_byte)
-		return;
-
-	iomap_write_delalloc_release(inode, start_byte, end_byte, flags, iomap,
-			punch);
-}
-EXPORT_SYMBOL_GPL(iomap_file_buffered_write_punch_delalloc);
+EXPORT_SYMBOL_GPL(iomap_write_delalloc_release);
 
 static loff_t iomap_unshare_iter(struct iomap_iter *iter)
 {
diff --git a/fs/xfs/xfs_iomap.c b/fs/xfs/xfs_iomap.c
index 1e11f48814c0d0..30f2530b6d5461 100644
--- a/fs/xfs/xfs_iomap.c
+++ b/fs/xfs/xfs_iomap.c
@@ -1227,8 +1227,20 @@  xfs_buffered_write_iomap_end(
 	unsigned		flags,
 	struct iomap		*iomap)
 {
-	iomap_file_buffered_write_punch_delalloc(inode, offset, length, written,
-			flags, iomap, &xfs_buffered_write_delalloc_punch);
+	loff_t			start_byte, end_byte;
+
+	/* If we didn't reserve the blocks, we're not allowed to punch them. */
+	if (iomap->type != IOMAP_DELALLOC || !(iomap->flags & IOMAP_F_NEW))
+		return 0;
+
+	/* Nothing to do if we've written the entire delalloc extent */
+	start_byte = iomap_last_written_block(inode, offset, written);
+	end_byte = round_up(offset + length, i_blocksize(inode));
+	if (start_byte >= end_byte)
+		return 0;
+
+	iomap_write_delalloc_release(inode, start_byte, end_byte, flags, iomap,
+			xfs_buffered_write_delalloc_punch);
 	return 0;
 }
 
diff --git a/include/linux/iomap.h b/include/linux/iomap.h
index e62df5d93f04de..137e0783faa224 100644
--- a/include/linux/iomap.h
+++ b/include/linux/iomap.h
@@ -290,9 +290,9 @@  vm_fault_t iomap_page_mkwrite(struct vm_fault *vmf,
 
 typedef void (*iomap_punch_t)(struct inode *inode, loff_t offset, loff_t length,
 		struct iomap *iomap);
-void iomap_file_buffered_write_punch_delalloc(struct inode *inode, loff_t pos,
-		loff_t length, ssize_t written, unsigned flag,
-		struct iomap *iomap, iomap_punch_t punch);
+void iomap_write_delalloc_release(struct inode *inode, loff_t start_byte,
+		loff_t end_byte, unsigned flags, struct iomap *iomap,
+		iomap_punch_t punch);
 
 int iomap_fiemap(struct inode *inode, struct fiemap_extent_info *fieinfo,
 		u64 start, u64 len, const struct iomap_ops *ops);