diff mbox series

[RFC,1/7] iomap: Don't fall back to buffered write if the write is atomic

Message ID 09ec4c88b565c85dee91eccf6e894a0c047d9e69.1701339358.git.ojaswin@linux.ibm.com (mailing list archive)
State New, archived
Headers show
Series ext4: Allocator changes for atomic write support with DIO | expand

Commit Message

Ojaswin Mujoo Nov. 30, 2023, 1:53 p.m. UTC
Currently, iomap only supports atomic writes for direct IOs and there is
no guarantees that a buffered IO will be atomic. Hence, if the user has
explicitly requested the direct write to be atomic and there's a
failure, return -EIO instead of falling back to buffered IO.

Signed-off-by: Ojaswin Mujoo <ojaswin@linux.ibm.com>
---
 fs/iomap/direct-io.c | 8 +++++++-
 1 file changed, 7 insertions(+), 1 deletion(-)

Comments

Dave Chinner Nov. 30, 2023, 9:10 p.m. UTC | #1
On Thu, Nov 30, 2023 at 07:23:09PM +0530, Ojaswin Mujoo wrote:
> Currently, iomap only supports atomic writes for direct IOs and there is
> no guarantees that a buffered IO will be atomic. Hence, if the user has
> explicitly requested the direct write to be atomic and there's a
> failure, return -EIO instead of falling back to buffered IO.
> 
> Signed-off-by: Ojaswin Mujoo <ojaswin@linux.ibm.com>
> ---
>  fs/iomap/direct-io.c | 8 +++++++-
>  1 file changed, 7 insertions(+), 1 deletion(-)
> 
> diff --git a/fs/iomap/direct-io.c b/fs/iomap/direct-io.c
> index 6ef25e26f1a1..3e7cd9bc8f4d 100644
> --- a/fs/iomap/direct-io.c
> +++ b/fs/iomap/direct-io.c
> @@ -662,7 +662,13 @@ __iomap_dio_rw(struct kiocb *iocb, struct iov_iter *iter,
>  			if (ret != -EAGAIN) {
>  				trace_iomap_dio_invalidate_fail(inode, iomi.pos,
>  								iomi.len);
> -				ret = -ENOTBLK;
> +				/*
> +				 * if this write was supposed to be atomic,
> +				 * return the err rather than trying to fall
> +				 * back to buffered IO.
> +				 */
> +				if (!atomic_write)
> +					ret = -ENOTBLK;

This belongs in the caller when it receives an -ENOTBLK from
iomap_dio_rw(). The iomap code is saying "this IO cannot be done
with direct IO" by returning this value, and then the caller can
make the determination of whether to run a buffered IO or not.

For example, a filesystem might still be able to perform an atomic
IO via a COW-based buffered IO slow path. Sure, ext4 can't do this,
but the above patch would prevent filesystems that could from being
able to implement such a fallback....

-Dave.
John Garry Dec. 1, 2023, 10:42 a.m. UTC | #2
On 30/11/2023 21:10, Dave Chinner wrote:
> On Thu, Nov 30, 2023 at 07:23:09PM +0530, Ojaswin Mujoo wrote:
>> Currently, iomap only supports atomic writes for direct IOs and there is
>> no guarantees that a buffered IO will be atomic. Hence, if the user has
>> explicitly requested the direct write to be atomic and there's a
>> failure, return -EIO instead of falling back to buffered IO.
>>
>> Signed-off-by: Ojaswin Mujoo<ojaswin@linux.ibm.com>
>> ---
>>   fs/iomap/direct-io.c | 8 +++++++-
>>   1 file changed, 7 insertions(+), 1 deletion(-)
>>
>> diff --git a/fs/iomap/direct-io.c b/fs/iomap/direct-io.c
>> index 6ef25e26f1a1..3e7cd9bc8f4d 100644
>> --- a/fs/iomap/direct-io.c
>> +++ b/fs/iomap/direct-io.c
>> @@ -662,7 +662,13 @@ __iomap_dio_rw(struct kiocb *iocb, struct iov_iter *iter,
>>   			if (ret != -EAGAIN) {
>>   				trace_iomap_dio_invalidate_fail(inode, iomi.pos,
>>   								iomi.len);
>> -				ret = -ENOTBLK;
>> +				/*
>> +				 * if this write was supposed to be atomic,
>> +				 * return the err rather than trying to fall
>> +				 * back to buffered IO.
>> +				 */
>> +				if (!atomic_write)
>> +					ret = -ENOTBLK;
> This belongs in the caller when it receives an -ENOTBLK from
> iomap_dio_rw(). The iomap code is saying "this IO cannot be done
> with direct IO" by returning this value, and then the caller can
> make the determination of whether to run a buffered IO or not.
> 
> For example, a filesystem might still be able to perform an atomic
> IO via a COW-based buffered IO slow path. Sure, ext4 can't do this,
> but the above patch would prevent filesystems that could from being
> able to implement such a fallback....

Sure, and I think that we need a better story for supporting buffered IO 
for atomic writes.

Currently we have:
- man pages tell us RWF_ATOMIC is only supported for direct IO
- statx gives atomic write unit min/max, not explicitly telling us it's 
for direct IO
- RWF_ATOMIC is ignored for !O_DIRECT

So I am thinking of expanding statx support to enable querying of atomic 
write capabilities for buffered IO and direct IO separately.

Thanks,
John
Matthew Wilcox Dec. 1, 2023, 1:27 p.m. UTC | #3
On Fri, Dec 01, 2023 at 10:42:57AM +0000, John Garry wrote:
> Sure, and I think that we need a better story for supporting buffered IO for
> atomic writes.
> 
> Currently we have:
> - man pages tell us RWF_ATOMIC is only supported for direct IO
> - statx gives atomic write unit min/max, not explicitly telling us it's for
> direct IO
> - RWF_ATOMIC is ignored for !O_DIRECT
> 
> So I am thinking of expanding statx support to enable querying of atomic
> write capabilities for buffered IO and direct IO separately.

Or ... we could support RWF_ATOMIC in the page cache?

I haven't particularly been following the atomic writes patchset, but
for filesystems which support large folios, we now create large folios
in the write path.  I see four problems to solve:

1. We might already have a smaller folio in the page cache from an
   earlier access,  We'd have to kick it out before creating a new folio
   that is the appropriate size.

2. We currently believe it's always OK to fall back to allocating smaller
   folios if memory allocation fails.  We'd need to change that policy
   (which we need to modify anyway for the bs>PS support).

3. We need to somewhere keep the information that writeback of this
   folio has to use the atomic commands.  Maybe it becomes a per-inode
   flag so that all writeback from this inode now uses the atomic
   commands?

4. If somebody does a weird thing like truncate/holepunch into the
   middle of the folio, we need to define what we do.  It's conceptually
   a bizarre thing to do, so I can't see any user actually wanting to
   do that ... but we need to define the semantics.

Maybe there are things I haven't thought of.  And of course, some
filesystems don't support large folios yet.
John Garry Dec. 1, 2023, 7:06 p.m. UTC | #4
On 01/12/2023 13:27, Matthew Wilcox wrote:
>> Sure, and I think that we need a better story for supporting buffered IO for
>> atomic writes.
>>
>> Currently we have:
>> - man pages tell us RWF_ATOMIC is only supported for direct IO
>> - statx gives atomic write unit min/max, not explicitly telling us it's for
>> direct IO
>> - RWF_ATOMIC is ignored for !O_DIRECT
>>
>> So I am thinking of expanding statx support to enable querying of atomic
>> write capabilities for buffered IO and direct IO separately.
> Or ... we could support RWF_ATOMIC in the page cache?
> 
> I haven't particularly been following the atomic writes patchset,

Some background is that we are focused on direct IO as the database 
applications we're interested in use direct IO, but there are other DBs 
which do not support direct IO (and want atomic write support).

> but
> for filesystems which support large folios, we now create large folios
> in the write path.  I see four problems to solve:
> 
> 1. We might already have a smaller folio in the page cache from an
>     earlier access,  We'd have to kick it out before creating a new folio
>     that is the appropriate size.

Understood. Even though we give scope to do atomic writes of variable 
size, we do expect applications to use a fixed size mostly. In addition, 
typically we would expect only atomic or non-atomic writes. But what you 
say would be possible.

> 
> 2. We currently believe it's always OK to fall back to allocating smaller
>     folios if memory allocation fails.  We'd need to change that policy
>     (which we need to modify anyway for the bs>PS support).

ok

> 
> 3. We need to somewhere keep the information that writeback of this
>     folio has to use the atomic commands.  Maybe it becomes a per-inode
>     flag so that all writeback from this inode now uses the atomic
>     commands?

I'm not sure. Currently atomic writes are simply flagged per IO, and 
per-inode atomic flags are something which we have avoided so far.

> 
> 4. If somebody does a weird thing like truncate/holepunch into the
>     middle of the folio, we need to define what we do.  It's conceptually
>     a bizarre thing to do, so I can't see any user actually wanting to
>     do that ... but we need to define the semantics.

ok

> 
> Maybe there are things I haven't thought of.  And of course, some
> filesystems don't support large folios yet.

I may consider a PoC...

Thanks,
John
Dave Chinner Dec. 1, 2023, 10:07 p.m. UTC | #5
On Fri, Dec 01, 2023 at 10:42:57AM +0000, John Garry wrote:
> On 30/11/2023 21:10, Dave Chinner wrote:
> > On Thu, Nov 30, 2023 at 07:23:09PM +0530, Ojaswin Mujoo wrote:
> > > Currently, iomap only supports atomic writes for direct IOs and there is
> > > no guarantees that a buffered IO will be atomic. Hence, if the user has
> > > explicitly requested the direct write to be atomic and there's a
> > > failure, return -EIO instead of falling back to buffered IO.
> > > 
> > > Signed-off-by: Ojaswin Mujoo<ojaswin@linux.ibm.com>
> > > ---
> > >   fs/iomap/direct-io.c | 8 +++++++-
> > >   1 file changed, 7 insertions(+), 1 deletion(-)
> > > 
> > > diff --git a/fs/iomap/direct-io.c b/fs/iomap/direct-io.c
> > > index 6ef25e26f1a1..3e7cd9bc8f4d 100644
> > > --- a/fs/iomap/direct-io.c
> > > +++ b/fs/iomap/direct-io.c
> > > @@ -662,7 +662,13 @@ __iomap_dio_rw(struct kiocb *iocb, struct iov_iter *iter,
> > >   			if (ret != -EAGAIN) {
> > >   				trace_iomap_dio_invalidate_fail(inode, iomi.pos,
> > >   								iomi.len);
> > > -				ret = -ENOTBLK;
> > > +				/*
> > > +				 * if this write was supposed to be atomic,
> > > +				 * return the err rather than trying to fall
> > > +				 * back to buffered IO.
> > > +				 */
> > > +				if (!atomic_write)
> > > +					ret = -ENOTBLK;
> > This belongs in the caller when it receives an -ENOTBLK from
> > iomap_dio_rw(). The iomap code is saying "this IO cannot be done
> > with direct IO" by returning this value, and then the caller can
> > make the determination of whether to run a buffered IO or not.
> > 
> > For example, a filesystem might still be able to perform an atomic
> > IO via a COW-based buffered IO slow path. Sure, ext4 can't do this,
> > but the above patch would prevent filesystems that could from being
> > able to implement such a fallback....
> 
> Sure, and I think that we need a better story for supporting buffered IO for
> atomic writes.
> 
> Currently we have:
> - man pages tell us RWF_ATOMIC is only supported for direct IO
> - statx gives atomic write unit min/max, not explicitly telling us it's for
> direct IO
> - RWF_ATOMIC is ignored for !O_DIRECT
> 
> So I am thinking of expanding statx support to enable querying of atomic
> write capabilities for buffered IO and direct IO separately.

You're over complicating this way too much by trying to restrict the
functionality down to just what you want to implement right now.

RWF_ATOMIC is no different to RWF_NOWAIT. The API doesn't decide
what can be supported - the filesystems themselves decide what part
of the API they can support and implement those pieces.

TO go back to RWF_NOWAIT, for a long time we (XFS) only supported
RWF_NOWAIT on DIO, and buffered reads and writes were given
-EOPNOTSUPP by the filesystem. Then other filesystems started
supporting DIO with RWF_NOWAIT. Then buffered read support was added
to the page cache and XFS, and as other filesystems were converted
they removed the RWF_NOWAIT exclusion check from their read IO
paths.

We are now in the same place with buffered write support for
RWF_NOWAIT. XFS, the page cache and iomap allow buffered writes w/
RWF_NOWAIT, but ext4, btrfs and f2fs still all return -EOPNOTSUPP
because they don't support non-blocking buffered writes yet.

This is the same model we should be applying with RWF_ATOMIC - we
know that over time we'll be able to expand support for atomic
writes across both direct and buffered IO, so we should not be
restricting the API or infrastructure to only allow RWF_ATOMIC w/
DIO. Just have the filesystems reject RWF_ATOMIC w/ -EOPNOTSUPP if
they don't support it, and for those that do it is conditional on
whther the filesystem supports it for the given type of IO being
done.

Seriously - an application can easily probe for RWF_ATOMIC support
without needing information to be directly exposed in statx() - just
open a O_TMPFILE, issue the type of RWF_ATOMIC IO you require to be
supported, and if it returns -EOPNOTSUPP then it you can't use
RWF_ATOMIC optimisations in the application....

-Dave.
John Garry Dec. 4, 2023, 9:02 a.m. UTC | #6
On 01/12/2023 22:07, Dave Chinner wrote:
>> Sure, and I think that we need a better story for supporting buffered IO for
>> atomic writes.
>>
>> Currently we have:
>> - man pages tell us RWF_ATOMIC is only supported for direct IO
>> - statx gives atomic write unit min/max, not explicitly telling us it's for
>> direct IO
>> - RWF_ATOMIC is ignored for !O_DIRECT
>>
>> So I am thinking of expanding statx support to enable querying of atomic
>> write capabilities for buffered IO and direct IO separately.
> You're over complicating this way too much by trying to restrict the
> functionality down to just what you want to implement right now.
> 
> RWF_ATOMIC is no different to RWF_NOWAIT. The API doesn't decide
> what can be supported - the filesystems themselves decide what part
> of the API they can support and implement those pieces.

Sure, but for RWF_ATOMIC we still have the associated statx call to tell 
us whether atomic writes are supported for a file and the specific range 
capability.

> 
> TO go back to RWF_NOWAIT, for a long time we (XFS) only supported
> RWF_NOWAIT on DIO, and buffered reads and writes were given
> -EOPNOTSUPP by the filesystem. Then other filesystems started
> supporting DIO with RWF_NOWAIT. Then buffered read support was added
> to the page cache and XFS, and as other filesystems were converted
> they removed the RWF_NOWAIT exclusion check from their read IO
> paths.
> 
> We are now in the same place with buffered write support for
> RWF_NOWAIT. XFS, the page cache and iomap allow buffered writes w/
> RWF_NOWAIT, but ext4, btrfs and f2fs still all return -EOPNOTSUPP
> because they don't support non-blocking buffered writes yet.
> 
> This is the same model we should be applying with RWF_ATOMIC - we
> know that over time we'll be able to expand support for atomic
> writes across both direct and buffered IO, so we should not be
> restricting the API or infrastructure to only allow RWF_ATOMIC w/
> DIO.

Agreed.

> Just have the filesystems reject RWF_ATOMIC w/ -EOPNOTSUPP if
> they don't support it,

Yes, I was going to add this regardless.

> and for those that do it is conditional on
> whther the filesystem supports it for the given type of IO being
> done.
> 
> Seriously - an application can easily probe for RWF_ATOMIC support
> without needing information to be directly exposed in statx() - just
> open a O_TMPFILE, issue the type of RWF_ATOMIC IO you require to be
> supported, and if it returns -EOPNOTSUPP then it you can't use
> RWF_ATOMIC optimisations in the application....

ok, if that is the done thing.

So I can't imagine that atomic write unit range will be different for 
direct IO and buffered IO (ignoring for a moment Christoph's idea for 
CoW always for no HW offload) when supported. But it seems that we may 
have a scenario where statx tells is that atomic writes are supported 
for a file, and a DIO write succeeds and a buffered IO write may return 
-EOPNOTSUPP. If that's acceptable then I'll work towards that.

If we could just run statx on a file descriptor here then that would be 
simpler...

Thanks,
John
Darrick J. Wong Dec. 4, 2023, 6:17 p.m. UTC | #7
On Mon, Dec 04, 2023 at 09:02:56AM +0000, John Garry wrote:
> On 01/12/2023 22:07, Dave Chinner wrote:
> > > Sure, and I think that we need a better story for supporting buffered IO for
> > > atomic writes.
> > > 
> > > Currently we have:
> > > - man pages tell us RWF_ATOMIC is only supported for direct IO
> > > - statx gives atomic write unit min/max, not explicitly telling us it's for
> > > direct IO
> > > - RWF_ATOMIC is ignored for !O_DIRECT
> > > 
> > > So I am thinking of expanding statx support to enable querying of atomic
> > > write capabilities for buffered IO and direct IO separately.
> > You're over complicating this way too much by trying to restrict the
> > functionality down to just what you want to implement right now.
> > 
> > RWF_ATOMIC is no different to RWF_NOWAIT. The API doesn't decide
> > what can be supported - the filesystems themselves decide what part
> > of the API they can support and implement those pieces.
> 
> Sure, but for RWF_ATOMIC we still have the associated statx call to tell us
> whether atomic writes are supported for a file and the specific range
> capability.
> 
> > 
> > TO go back to RWF_NOWAIT, for a long time we (XFS) only supported
> > RWF_NOWAIT on DIO, and buffered reads and writes were given
> > -EOPNOTSUPP by the filesystem. Then other filesystems started
> > supporting DIO with RWF_NOWAIT. Then buffered read support was added
> > to the page cache and XFS, and as other filesystems were converted
> > they removed the RWF_NOWAIT exclusion check from their read IO
> > paths.
> > 
> > We are now in the same place with buffered write support for
> > RWF_NOWAIT. XFS, the page cache and iomap allow buffered writes w/
> > RWF_NOWAIT, but ext4, btrfs and f2fs still all return -EOPNOTSUPP
> > because they don't support non-blocking buffered writes yet.
> > 
> > This is the same model we should be applying with RWF_ATOMIC - we
> > know that over time we'll be able to expand support for atomic
> > writes across both direct and buffered IO, so we should not be
> > restricting the API or infrastructure to only allow RWF_ATOMIC w/
> > DIO.
> 
> Agreed.
> 
> > Just have the filesystems reject RWF_ATOMIC w/ -EOPNOTSUPP if
> > they don't support it,
> 
> Yes, I was going to add this regardless.
> 
> > and for those that do it is conditional on
> > whther the filesystem supports it for the given type of IO being
> > done.
> > 
> > Seriously - an application can easily probe for RWF_ATOMIC support
> > without needing information to be directly exposed in statx() - just
> > open a O_TMPFILE, issue the type of RWF_ATOMIC IO you require to be
> > supported, and if it returns -EOPNOTSUPP then it you can't use
> > RWF_ATOMIC optimisations in the application....
> 
> ok, if that is the done thing.
> 
> So I can't imagine that atomic write unit range will be different for direct
> IO and buffered IO (ignoring for a moment Christoph's idea for CoW always
> for no HW offload) when supported. But it seems that we may have a scenario
> where statx tells is that atomic writes are supported for a file, and a DIO
> write succeeds and a buffered IO write may return -EOPNOTSUPP. If that's
> acceptable then I'll work towards that.
> 
> If we could just run statx on a file descriptor here then that would be
> simpler...

statx(fd, "", AT_EMPTY_PATH, ...); ?

--D

> Thanks,
> John
> 
> 
>
John Garry Dec. 4, 2023, 6:34 p.m. UTC | #8
On 04/12/2023 18:17, Darrick J. Wong wrote:
>> If we could just run statx on a file descriptor here then that would be
>> simpler...
> statx(fd, "", AT_EMPTY_PATH, ...); ?

I really meant that if we could only run statx on a fd then we could 
know if we want DIO or buffered IO (as that is how it was opened).

Thanks,
John
John Garry Dec. 7, 2023, 12:43 p.m. UTC | #9
On 01/12/2023 22:07, Dave Chinner wrote:
> RWF_ATOMIC is no different to RWF_NOWAIT. The API doesn't decide
> what can be supported - the filesystems themselves decide what part
> of the API they can support and implement those pieces.
> 
> TO go back to RWF_NOWAIT, for a long time we (XFS) only supported
> RWF_NOWAIT on DIO, and buffered reads and writes were given
> -EOPNOTSUPP by the filesystem. Then other filesystems started
> supporting DIO with RWF_NOWAIT. Then buffered read support was added
> to the page cache and XFS, and as other filesystems were converted
> they removed the RWF_NOWAIT exclusion check from their read IO
> paths.
> 
> We are now in the same place with buffered write support for
> RWF_NOWAIT. XFS, the page cache and iomap allow buffered writes w/
> RWF_NOWAIT, but ext4, btrfs and f2fs still all return -EOPNOTSUPP
> because they don't support non-blocking buffered writes yet.
> 
> This is the same model we should be applying with RWF_ATOMIC - we
> know that over time we'll be able to expand support for atomic
> writes across both direct and buffered IO, so we should not be
> restricting the API or infrastructure to only allow RWF_ATOMIC w/
> DIO. Just have the filesystems reject RWF_ATOMIC w/ -EOPNOTSUPP if
> they don't support it, and for those that do it is conditional on
> whther the filesystem supports it for the given type of IO being
> done.
> 
> Seriously - an application can easily probe for RWF_ATOMIC support
> without needing information to be directly exposed in statx() - just
> open a O_TMPFILE, issue the type of RWF_ATOMIC IO you require to be
> supported, and if it returns -EOPNOTSUPP then it you can't use
> RWF_ATOMIC optimisations in the application....

Hi Dave,

For rejecting RWF_ATOMIC when not supported for a file, how about 
something like this:

--->8----

diff --git a/block/fops.c b/block/fops.c
index 273bd8f5a370..d9563ef29dde 100644
--- a/block/fops.c
+++ b/block/fops.c
@@ -639,6 +637,9 @@ static int blkdev_open(struct inode *inode, struct 
file *filp)
  	if (IS_ERR(handle))
  		return PTR_ERR(handle);

+	if (queue_atomic_write_unit_max_bytes(bdev_get_queue(handle->bdev)))
+		filp->f_mode |= FMODE_CAN_ATOMIC_WRITE;
+
  	if (bdev_nowait(handle->bdev))
  		filp->f_mode |= FMODE_NOWAIT;

diff --git a/include/linux/fs.h b/include/linux/fs.h
index 4256ec184461..d725c194243c 100644
--- a/include/linux/fs.h
+++ b/include/linux/fs.h
@@ -185,6 +185,9 @@ typedef int (dio_iodone_t)(struct kiocb *iocb, 
loff_t offset,
  /* File supports async nowait buffered writes */
  #define FMODE_BUF_WASYNC	((__force fmode_t)0x80000000)

+/* File supports atomic writes */
+#define FMODE_CAN_ATOMIC_WRITE	((__force fmode_t)0x100000000)
+
  /*
   * Attribute flags.  These should be or-ed together to figure out what
   * has been changed!
@@ -3266,6 +3269,10 @@ static inline int kiocb_set_rw_flags(struct kiocb 
*ki, rwf_t flags)
  			return -EOPNOTSUPP;
  		kiocb_flags |= IOCB_NOIO;
  	}
+	if (flags & RWF_ATOMIC) {
+		if (!(ki->ki_filp->f_mode & FMODE_CAN_ATOMIC_WRITE))
+			return -EOPNOTSUPP;
+	}
  	kiocb_flags |= (__force int) (flags & RWF_SUPPORTED);
  	if (flags & RWF_SYNC)
  		kiocb_flags |= IOCB_DSYNC;
diff --git a/include/linux/types.h b/include/linux/types.h
index 253168bb3fe1..49c754fde1d6 100644
--- a/include/linux/types.h
+++ b/include/linux/types.h
@@ -153,7 +153,7 @@ typedef u32 dma_addr_t;

  typedef unsigned int __bitwise gfp_t;
  typedef unsigned int __bitwise slab_flags_t;
-typedef unsigned int __bitwise fmode_t;
+typedef unsigned long __bitwise fmode_t;

  #ifdef CONFIG_PHYS_ADDR_T_64BIT
  typedef u64 phys_addr_t;


----8<------

My concern is that we need to increase fmode_t in size as all available 
32 bits are used up.

Thanks,
John
diff mbox series

Patch

diff --git a/fs/iomap/direct-io.c b/fs/iomap/direct-io.c
index 6ef25e26f1a1..3e7cd9bc8f4d 100644
--- a/fs/iomap/direct-io.c
+++ b/fs/iomap/direct-io.c
@@ -662,7 +662,13 @@  __iomap_dio_rw(struct kiocb *iocb, struct iov_iter *iter,
 			if (ret != -EAGAIN) {
 				trace_iomap_dio_invalidate_fail(inode, iomi.pos,
 								iomi.len);
-				ret = -ENOTBLK;
+				/*
+				 * if this write was supposed to be atomic,
+				 * return the err rather than trying to fall
+				 * back to buffered IO.
+				 */
+				if (!atomic_write)
+					ret = -ENOTBLK;
 			}
 			goto out_free_dio;
 		}