
[09/10] iomap: add a IOMAP_DIO_NOALLOC flag

Message ID 20210112162616.2003366-10-hch@lst.de (mailing list archive)
State New, archived
Series [01/10] xfs: factor out a xfs_ilock_iocb helper

Commit Message

Christoph Hellwig Jan. 12, 2021, 4:26 p.m. UTC
Add an IOMAP_DIO_NOALLOC flag to request that the iomap instances do not
allocate blocks; iomap_dio_rw() translates it into a new IOMAP_NOALLOC
flag for the filesystem's mapping callback.

Signed-off-by: Christoph Hellwig <hch@lst.de>
---
 fs/iomap/direct-io.c  | 2 ++
 include/linux/iomap.h | 2 ++
 2 files changed, 4 insertions(+)

Comments

Dave Chinner Jan. 12, 2021, 11:29 p.m. UTC | #1
On Tue, Jan 12, 2021 at 05:26:15PM +0100, Christoph Hellwig wrote:
> Add a flag to request that the iomap instances do not allocate blocks
> by translating it to another new IOMAP_NOALLOC flag.

Except "no allocation" that is not what XFS needs for concurrent
sub-block DIO.

We are trying to avoid external sub-block IO outside the range of
the user data IO (COW, sub-block zeroing, etc) so that we don't
trash adjacent sub-block IO in flight. This means we can't do
sub-block zeroing, which in turn means we can't map unwritten extents
or allocate new extents for the sub-block IO.  It also means the IO
range cannot span EOF because that triggers unconditional sub-block
zeroing in iomap_dio_rw_actor().
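
For readers following the thread, a minimal sketch of the sub-block
alignment test being discussed; the function name and types here are
illustrative only and are not taken from this patch set or from XFS:

#include <stdbool.h>
#include <stdint.h>

/*
 * Illustrative only: classify a direct IO as sub-block unaligned,
 * roughly along the lines of the alignment test XFS performs before
 * choosing between shared and exclusive locking for direct writes.
 */
static bool dio_is_sub_block_unaligned(uint64_t pos, uint64_t count,
				       uint32_t blocksize)
{
	/*
	 * Unaligned if either the start offset or the length is not a
	 * multiple of the filesystem block size; such an IO may need
	 * zeroing of the surrounding sub-block ranges.
	 */
	return (pos | count) & (uint64_t)(blocksize - 1);
}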

And because we may have to map multiple extents to fully span an IO
range, we have to guarantee that subsequent extents for the IO are
also written, otherwise we have a partial write abort case. Hence we
have single extent limitations as well.

So "no allocation" really doesn't describe what we want this flag to
at all.

If we're going to use a flag for this specific functionality, let's
call it what it is: IOMAP_DIO_UNALIGNED/IOMAP_UNALIGNED and do two
things with it.

	1. Make unaligned IO a formal part of the iomap_dio_rw()
	behaviour so it can do the common checks for things that
	need exclusive serialisation for unaligned IO (i.e. avoid IO
	spanning EOF, abort if there are cached pages over the
	range, etc).

	2. Require the filesystem mapping callback to only allow
	unaligned IO into ranges that are contiguous and don't
	require mapping state changes or sub-block zeroing to be
	performed during the sub-block IO.


Cheers,

Dave.
Brian Foster Jan. 13, 2021, 3:32 p.m. UTC | #2
On Wed, Jan 13, 2021 at 10:29:23AM +1100, Dave Chinner wrote:
> On Tue, Jan 12, 2021 at 05:26:15PM +0100, Christoph Hellwig wrote:
> > Add a flag to request that the iomap instances do not allocate blocks
> > by translating it to another new IOMAP_NOALLOC flag.
> 
> Except "no allocation" that is not what XFS needs for concurrent
> sub-block DIO.
> 
> We are trying to avoid external sub-block IO outside the range of
> the user data IO (COW, sub-block zeroing, etc) so that we don't
> trash adjacent sub-block IO in flight. This means we can't do
> sub-block zeroing and that then means we can't map unwritten extents
> or allocate new extents for the sub-block IO.  It also means the IO
> range cannot span EOF because that triggers unconditional sub-block
> zeroing in iomap_dio_rw_actor().
> 
> And because we may have to map multiple extents to fully span an IO
> range, we have to guarantee that subsequent extents for the IO are
> also written otherwise we have a partial write abort case. Hence we
> have single extent limitations as well.
> 
> So "no allocation" really doesn't describe what we want this flag to
> at all.
> 
> If we're going to use a flag for this specific functionality, let's
> call it what it is: IOMAP_DIO_UNALIGNED/IOMAP_UNALIGNED and do two
> things with it.
> 
> 	1. Make unaligned IO a formal part of the iomap_dio_rw()
> 	behaviour so it can do the common checks to for things that
> 	need exclusive serialisation for unaligned IO (i.e. avoid IO
> 	spanning EOF, abort if there are cached pages over the
> 	range, etc).
> 
> 	2. require the filesystem mapping callback do only allow
> 	unaligned IO into ranges that are contiguous and don't
> 	require mapping state changes or sub-block zeroing to be
> 	performed during the sub-block IO.
> 
> 

Something I hadn't thought about before is whether applications might
depend on current unaligned dio serialization for coherency and thus
break if the kernel suddenly allows concurrent unaligned dio to pass
through. Should this be something that is explicitly requested by
userspace?

That aside, I agree that the DIO_UNALIGNED approach seems a bit more
clear than NOALLOC, but TBH the more I look at this the more Christoph's
first approach seems cleanest to me. It is a bit unfortunate to
duplicate the mapping lookups and have the extra ILOCK cycle, but the
lock is shared and only taken when I/O is unaligned. I don't really see
why that is a show stopper yet it's acceptable to fall back to exclusive
dio if the target range happens to be discontiguous (but otherwise
mapped/written).

So I dunno... to me, I would start with that approach and then as the
implementation soaks, perhaps see if we can find a way to optimize away
the extra cycle and lookup. In the meantime, performance should still be
improved significantly and the behavior fairly predictable. Anyways, I
suspect Dave disagrees so that's just my .02. ;) I'll let you guys find
some common ground and make a pass at whatever falls out...

Brian

> Cheers,
> 
> Dave.
> -- 
> Dave Chinner
> david@fromorbit.com
>
Dave Chinner Jan. 13, 2021, 10:49 p.m. UTC | #3
On Wed, Jan 13, 2021 at 10:32:15AM -0500, Brian Foster wrote:
> On Wed, Jan 13, 2021 at 10:29:23AM +1100, Dave Chinner wrote:
> > On Tue, Jan 12, 2021 at 05:26:15PM +0100, Christoph Hellwig wrote:
> > > Add a flag to request that the iomap instances do not allocate blocks
> > > by translating it to another new IOMAP_NOALLOC flag.
> > 
> > Except "no allocation" that is not what XFS needs for concurrent
> > sub-block DIO.
> > 
> > We are trying to avoid external sub-block IO outside the range of
> > the user data IO (COW, sub-block zeroing, etc) so that we don't
> > trash adjacent sub-block IO in flight. This means we can't do
> > sub-block zeroing and that then means we can't map unwritten extents
> > or allocate new extents for the sub-block IO.  It also means the IO
> > range cannot span EOF because that triggers unconditional sub-block
> > zeroing in iomap_dio_rw_actor().
> > 
> > And because we may have to map multiple extents to fully span an IO
> > range, we have to guarantee that subsequent extents for the IO are
> > also written otherwise we have a partial write abort case. Hence we
> > have single extent limitations as well.
> > 
> > So "no allocation" really doesn't describe what we want this flag to
> > at all.
> > 
> > If we're going to use a flag for this specific functionality, let's
> > call it what it is: IOMAP_DIO_UNALIGNED/IOMAP_UNALIGNED and do two
> > things with it.
> > 
> > 	1. Make unaligned IO a formal part of the iomap_dio_rw()
> > 	behaviour so it can do the common checks to for things that
> > 	need exclusive serialisation for unaligned IO (i.e. avoid IO
> > 	spanning EOF, abort if there are cached pages over the
> > 	range, etc).
> > 
> > 	2. require the filesystem mapping callback do only allow
> > 	unaligned IO into ranges that are contiguous and don't
> > 	require mapping state changes or sub-block zeroing to be
> > 	performed during the sub-block IO.
> > 
> > 
> 
> Something I hadn't thought about before is whether applications might
> depend on current unaligned dio serialization for coherency and thus
> break if the kernel suddenly allows concurrent unaligned dio to pass
> through. Should this be something that is explicitly requested by
> userspace?

If applications are relying on an undocumented, implementation
specific behaviour of a filesystem that only occurs for IOs of a
certain size for implicit data coherency between independent,
non-overlapping DIOs and/or page cache IO, then they are already
broken and need fixing because that behaviour is not guaranteed to
occur; e.g. a 512 byte block size filesystem does not provide such
serialisation, so if the app depends on 512 byte DIOs being
serialised completely by the filesystem then it already fails on 512
byte block size filesystems.

So, no, we simply don't care about breaking broken applications that
are already broken.

> That aside, I agree that the DIO_UNALIGNED approach seems a bit more
> clear than NOALLOC, but TBH the more I look at this the more Christoph's
> first approach seems cleanest to me. It is a bit unfortunate to
> duplicate the mapping lookups and have the extra ILOCK cycle, but the
> lock is shared and only taken when I/O is unaligned. I don't really see
> why that is a show stopper yet it's acceptable to fall back to exclusive
> dio if the target range happens to be discontiguous (but otherwise
> mapped/written).

Unnecessary lock cycles in the fast path are always bad. The whole
reason this change is being done is for performance to bring it up
to par with block aligned IO. Adding an extra lock cycle to the
ILOCK on every IO will halve the performance on high IOPs hardware
because the ILOCK will be directly exposed to userspace IO
submission and hence become the contention point instead of the
IOLOCK.

IOWs, the fact that we take the ILOCK 2x per IO instead of once
means that the ILOCK becomes the performance limiting lock (because
even shared locking causes cacheline contention) and changes the
entire lock profile for the IO path when unaligned IO is being done.

This is also ignoring the fact that the ILOCK is held in exclusive
mode during IO completion while doing file size and extent
manipulation transactions. IOWs we can block waiting on IO
completion before we even decide if we can do the IO with shared
locking. Hence there are new IO submission serialisation points in
the fast path that will also slow down the cases where we have to do
exclusive locking....

So, yeah, I can only see bad things occurring by lifting the ILOCK
up into the high level IO path. And, of course, once it's taken
there, people will find new reasons to expand its scope and the
problems will only get worse...

> So I dunno... to me, I would start with that approach and then as the
> implementation soaks, perhaps see if we can find a way to optimize away
> the extra cycle and lookup.

I don't see how we can determine if we can do the unaligned IO
while holding a shared lock without doing an extent lookup. It's the
underlying extent state that makes shared locking possible, and I
can't think of any other state we can look at to make this decision.

Hence I think this path is simply a dead end with no possibility of
further optimisation. Of course, if you can solve the problem
without needing an extent lookup, then we can talk about how to
avoid racing with actual extent mapping changes done under the
ILOCK... :)

Cheers,

Dave.
Brian Foster Jan. 14, 2021, 10:23 a.m. UTC | #4
On Thu, Jan 14, 2021 at 09:49:35AM +1100, Dave Chinner wrote:
> On Wed, Jan 13, 2021 at 10:32:15AM -0500, Brian Foster wrote:
> > On Wed, Jan 13, 2021 at 10:29:23AM +1100, Dave Chinner wrote:
> > > On Tue, Jan 12, 2021 at 05:26:15PM +0100, Christoph Hellwig wrote:
> > > > Add a flag to request that the iomap instances do not allocate blocks
> > > > by translating it to another new IOMAP_NOALLOC flag.
> > > 
> > > Except "no allocation" that is not what XFS needs for concurrent
> > > sub-block DIO.
> > > 
> > > We are trying to avoid external sub-block IO outside the range of
> > > the user data IO (COW, sub-block zeroing, etc) so that we don't
> > > trash adjacent sub-block IO in flight. This means we can't do
> > > sub-block zeroing and that then means we can't map unwritten extents
> > > or allocate new extents for the sub-block IO.  It also means the IO
> > > range cannot span EOF because that triggers unconditional sub-block
> > > zeroing in iomap_dio_rw_actor().
> > > 
> > > And because we may have to map multiple extents to fully span an IO
> > > range, we have to guarantee that subsequent extents for the IO are
> > > also written otherwise we have a partial write abort case. Hence we
> > > have single extent limitations as well.
> > > 
> > > So "no allocation" really doesn't describe what we want this flag to
> > > at all.
> > > 
> > > If we're going to use a flag for this specific functionality, let's
> > > call it what it is: IOMAP_DIO_UNALIGNED/IOMAP_UNALIGNED and do two
> > > things with it.
> > > 
> > > 	1. Make unaligned IO a formal part of the iomap_dio_rw()
> > > 	behaviour so it can do the common checks to for things that
> > > 	need exclusive serialisation for unaligned IO (i.e. avoid IO
> > > 	spanning EOF, abort if there are cached pages over the
> > > 	range, etc).
> > > 
> > > 	2. require the filesystem mapping callback do only allow
> > > 	unaligned IO into ranges that are contiguous and don't
> > > 	require mapping state changes or sub-block zeroing to be
> > > 	performed during the sub-block IO.
> > > 
> > > 
> > 
> > Something I hadn't thought about before is whether applications might
> > depend on current unaligned dio serialization for coherency and thus
> > break if the kernel suddenly allows concurrent unaligned dio to pass
> > through. Should this be something that is explicitly requested by
> > userspace?
> 
> If applications are relying on an undocumented, implementation
> specific behaviour of a filesystem that only occurs for IOs of a
> certain size for implicit data coherency between independent,
> non-overlapping DIOs and/or page cache IO, then they are already
> broken and need fixing because that behaviour is not guaranteed to
> occur. e.g. 512 byte block size filesystem does not provide such
> serialisation, so if the app depends on 512 byte DIOs being
> serialised completely by the filesytem then it already fails on 512
> byte block size filesystems.
> 

I'm not sure how the block size relates beyond just changing the
alignment requirements?

> So, no, we simply don't care about breaking broken applications that
> are already broken.
> 

I agree in general, but I'm not sure that helps us on the "don't break
userspace" front. We can call userspace broken all we want, but if some
application has such a workload that historically functions correctly
due to this serialization and all of a sudden starts to cause data
corruption because we decide to remove it, I fear we'd end up taking the
blame regardless. :/

I wonder if other filesystems provide similar concurrent unaligned dio
support? A quick look at ext4 shows it has similar logic to XFS; btrfs
looks like it falls back to buffered I/O...

> > That aside, I agree that the DIO_UNALIGNED approach seems a bit more
> > clear than NOALLOC, but TBH the more I look at this the more Christoph's
> > first approach seems cleanest to me. It is a bit unfortunate to
> > duplicate the mapping lookups and have the extra ILOCK cycle, but the
> > lock is shared and only taken when I/O is unaligned. I don't really see
> > why that is a show stopper yet it's acceptable to fall back to exclusive
> > dio if the target range happens to be discontiguous (but otherwise
> > mapped/written).
> 
> Unnecessary lock cycles in the fast path are always bad. The whole
> reason this change is being done is for performance to bring it up
> to par with block aligned IO. Adding an extra lock cycle to the
> ILOCK on every IO will halve the performance on high IOPs hardware
> because the ILOCK will be directly exposed to userspace IO
> submission and hence become the contention point instead of the
> IOLOCK.
> 
> IOWs, the fact taht we take the ILOCK 2x per IO instead of once
> means that the ILOCK becomes the performance limiting lock (because
> even shared locking causes cacheline contention) and changes the
> entire lock profile for the IO path when unaligned IO is being done.
> 
> This is also ignoring the fact that the ILOCK is held in exclusive
> mode during IO completion while doing file size and extent
> manipulation transactions. IOWs we can block waiting on IO
> completion before we even decide if we can do the IO with shared
> locking. Hence there are new IO submission serialisation points in
> the fast path that will also slow down the cases where we have to do
> exclusive locking....
> 
> So, yeah, I can only see bad things occurring by lifting the ILOCK
> up into the high level IO path. And, of course, once it's taken
> there, people will find new reasons to expand it's scope and the
> problems will only get worse...
> 

I'm not saying the extra ilock cycle is free or even ideal. I'm
questioning whether the proposed alternatives provide complete
functionality when things like unaligned dio that span mappings are
always going to fall back to exclusive I/O. There's an obvious tradeoff
there between performance and predictability that IMO isn't as cut and
dried as you describe. I certainly don't consider that to be bringing
unaligned dio performance up to par with block aligned dio.

> > So I dunno... to me, I would start with that approach and then as the
> > implementation soaks, perhaps see if we can find a way to optimize away
> > the extra cycle and lookup.
> 
> I don't see how we can determine if we can do the unlaigned IO
> holding a shared lock without doing an extent lookup. It's the
> underlying extent state that makes shared locking possible, and I
> can't think of any other state we can look at to make this decision.
> 

Not sure how you got "without doing an extent lookup" from my comment.
:P I was referring to the extra/early lookup and lock cycle that we're
discussing above wrt to the original series. These subsequent series
already do without it, but to me they sacrifice functionality. I'm
basically saying that it might be worth to try and make it work first,
make it fast(er) second (or otherwise find a way to address the issue in
the latest series).

Of course, based on the behavior of other filesystems, I'm not totally
convinced this is a great idea in the first place, at least not without
some kind of opt-in from userspace or perhaps broader community consensus...

Brian

> Hence I think this path is simply a dead end with no possibility of
> further optimisation. Of course, if you can solve the problem
> without needing an extent lookup, then we can talk about how to
> avoid racing with actual extent mapping changes done under the
> ILOCK... :)
> 
> Cheers,
> 
> Dave.
> 
> -- 
> Dave Chinner
> david@fromorbit.com
>
Avi Kivity Jan. 14, 2021, 10:43 a.m. UTC | #5
On 1/14/21 12:23 PM, Brian Foster wrote:
> On Thu, Jan 14, 2021 at 09:49:35AM +1100, Dave Chinner wrote:
>> On Wed, Jan 13, 2021 at 10:32:15AM -0500, Brian Foster wrote:
>>> On Wed, Jan 13, 2021 at 10:29:23AM +1100, Dave Chinner wrote:
>>>> On Tue, Jan 12, 2021 at 05:26:15PM +0100, Christoph Hellwig wrote:
>>>>> Add a flag to request that the iomap instances do not allocate blocks
>>>>> by translating it to another new IOMAP_NOALLOC flag.
>>>> Except "no allocation" that is not what XFS needs for concurrent
>>>> sub-block DIO.
>>>>
>>>> We are trying to avoid external sub-block IO outside the range of
>>>> the user data IO (COW, sub-block zeroing, etc) so that we don't
>>>> trash adjacent sub-block IO in flight. This means we can't do
>>>> sub-block zeroing and that then means we can't map unwritten extents
>>>> or allocate new extents for the sub-block IO.  It also means the IO
>>>> range cannot span EOF because that triggers unconditional sub-block
>>>> zeroing in iomap_dio_rw_actor().
>>>>
>>>> And because we may have to map multiple extents to fully span an IO
>>>> range, we have to guarantee that subsequent extents for the IO are
>>>> also written otherwise we have a partial write abort case. Hence we
>>>> have single extent limitations as well.
>>>>
>>>> So "no allocation" really doesn't describe what we want this flag to
>>>> at all.
>>>>
>>>> If we're going to use a flag for this specific functionality, let's
>>>> call it what it is: IOMAP_DIO_UNALIGNED/IOMAP_UNALIGNED and do two
>>>> things with it.
>>>>
>>>> 	1. Make unaligned IO a formal part of the iomap_dio_rw()
>>>> 	behaviour so it can do the common checks to for things that
>>>> 	need exclusive serialisation for unaligned IO (i.e. avoid IO
>>>> 	spanning EOF, abort if there are cached pages over the
>>>> 	range, etc).
>>>>
>>>> 	2. require the filesystem mapping callback do only allow
>>>> 	unaligned IO into ranges that are contiguous and don't
>>>> 	require mapping state changes or sub-block zeroing to be
>>>> 	performed during the sub-block IO.
>>>>
>>>>
>>> Something I hadn't thought about before is whether applications might
>>> depend on current unaligned dio serialization for coherency and thus
>>> break if the kernel suddenly allows concurrent unaligned dio to pass
>>> through. Should this be something that is explicitly requested by
>>> userspace?
>> If applications are relying on an undocumented, implementation
>> specific behaviour of a filesystem that only occurs for IOs of a
>> certain size for implicit data coherency between independent,
>> non-overlapping DIOs and/or page cache IO, then they are already
>> broken and need fixing because that behaviour is not guaranteed to
>> occur. e.g. 512 byte block size filesystem does not provide such
>> serialisation, so if the app depends on 512 byte DIOs being
>> serialised completely by the filesytem then it already fails on 512
>> byte block size filesystems.
>>
> I'm not sure how the block size relates beyond just changing the
> alignment requirements..?
>
>> So, no, we simply don't care about breaking broken applications that
>> are already broken.
>>
> I agree in general, but I'm not sure that helps us on the "don't break
> userspace" front. We can call userspace broken all we want, but if some
> application has such a workload that historically functions correctly
> due to this serialization and all of a sudden starts to cause data
> corruption because we decide to remove it, I fear we'd end up taking the
> blame regardless. :/


I think it's unlikely. Application writers rarely know about such 
issues, so they can't knowingly depend on them. The sub-sub-genre of 
application writers who rely on dio/aio will be a lot more careful and 
wary of the filesystem.


In this particular case, triggering serialization also triggers blocking
in io_submit, which is the aio/dio user's worst nightmare, worse by several
orders of magnitude than the runner-up. I have code to detect these
cases and try to prevent serialization, or, when serialization is 
inevitable, do the serialization in userspace so my io_submits don't get 
blocked.
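
A hypothetical sketch of the kind of userspace-side bookkeeping described
above (this is not Avi's actual code; every name and structure here is
made up for illustration):

#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical illustration: before submitting an unaligned DIO write,
 * check whether another in-flight write touches the same filesystem
 * block and, if so, defer it in the application instead of letting the
 * kernel serialise it (which could block io_submit()).
 */
struct inflight_write {
	uint64_t first_block;	/* first fs block touched by the write */
	uint64_t last_block;	/* last fs block touched by the write */
};

static bool blocks_overlap(const struct inflight_write *w,
			   uint64_t first, uint64_t last)
{
	return first <= w->last_block && last >= w->first_block;
}

/* Returns true if the new write should wait for an in-flight write. */
static bool must_defer_write(const struct inflight_write *inflight,
			     size_t nr_inflight, uint64_t offset,
			     uint64_t len, uint64_t block_size)
{
	uint64_t first = offset / block_size;
	uint64_t last = (offset + len - 1) / block_size;
	size_t i;

	/* Block-aligned writes never share a block with a neighbour. */
	if (((offset | len) & (block_size - 1)) == 0)
		return false;

	for (i = 0; i < nr_inflight; i++)
		if (blocks_overlap(&inflight[i], first, last))
			return true;
	return false;
}
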
Christoph Hellwig Jan. 14, 2021, 5:26 p.m. UTC | #6
On Wed, Jan 13, 2021 at 10:29:23AM +1100, Dave Chinner wrote:
> On Tue, Jan 12, 2021 at 05:26:15PM +0100, Christoph Hellwig wrote:
> > Add a flag to request that the iomap instances do not allocate blocks
> > by translating it to another new IOMAP_NOALLOC flag.
> 
> Except "no allocation" that is not what XFS needs for concurrent
> sub-block DIO.

Well, this is just a quick draft.  I could not come up with a better
name, so I picked one that explains most but not all of what is going
on.

> If we're going to use a flag for this specific functionality, let's
> call it what it is: IOMAP_DIO_UNALIGNED/IOMAP_UNALIGNED and do two
> things with it.

Sounds fine with me.

> 
> 	1. Make unaligned IO a formal part of the iomap_dio_rw()
> 	behaviour so it can do the common checks to for things that
> 	need exclusive serialisation for unaligned IO (i.e. avoid IO
> 	spanning EOF, abort if there are cached pages over the
> 	range, etc).

Note that all these writes already fall back to buffered I/O if
invalidate_inode_pages2_range fails, so there must never be cached
pages for direct I/O these days.  This is different from NOWAIT
I/O where we simply give up if there are any cached pages and don't
even try to invalidate them.
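
For context, a rough paraphrase (not the exact upstream code) of the
write-side page cache handling in fs/iomap/direct-io.c being referred to:

#include <linux/fs.h>
#include <linux/pagemap.h>

/*
 * Rough paraphrase, not the exact upstream code: if the page cache over
 * the write range cannot be invalidated, the direct write bails out with
 * -ENOTBLK and the filesystem caller falls back to buffered IO, so a
 * direct write never proceeds with cached pages still covering the range.
 */
static int dio_write_invalidate_or_fallback(struct address_space *mapping,
					    loff_t pos, loff_t end)
{
	if (invalidate_inode_pages2_range(mapping, pos >> PAGE_SHIFT,
					  end >> PAGE_SHIFT))
		return -ENOTBLK;	/* caller should use buffered IO */
	return 0;
}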

> 	2. require the filesystem mapping callback do only allow
> 	unaligned IO into ranges that are contiguous and don't
> 	require mapping state changes or sub-block zeroing to be
> 	performed during the sub-block IO.

Yeah.
Christoph Hellwig Jan. 14, 2021, 5:29 p.m. UTC | #7
On Wed, Jan 13, 2021 at 10:32:15AM -0500, Brian Foster wrote:
> Something I hadn't thought about before is whether applications might
> depend on current unaligned dio serialization for coherency and thus
> break if the kernel suddenly allows concurrent unaligned dio to pass
> through. Should this be something that is explicitly requested by
> userspace?

direct I/O has always been documented as not being synchronized.  Also
for block devices you won't get any synchronization at all, down to
the sector level.

> 
> That aside, I agree that the DIO_UNALIGNED approach seems a bit more
> clear than NOALLOC, but TBH the more I look at this the more Christoph's
> first approach seems cleanest to me. It is a bit unfortunate to
> duplicate the mapping lookups and have the extra ILOCK cycle, but the
> lock is shared and only taken when I/O is unaligned. I don't really see
> why that is a show stopper yet it's acceptable to fall back to exclusive
> dio if the target range happens to be discontiguous (but otherwise
> mapped/written).

I think both approaches have pros and cons.  My original one (which really
is yours as you suggested it) has the advantage of having a much simpler
structure, and not limiting the non-exclusive I/O to a single extent.
The refined version of Dave's approach avoids the extra one or two extent
lookups, and any knowledge of extent state above the iomap layer.

Patch

diff --git a/fs/iomap/direct-io.c b/fs/iomap/direct-io.c
index 96dc72debc4b79..a01f1d182685b4 100644
--- a/fs/iomap/direct-io.c
+++ b/fs/iomap/direct-io.c
@@ -436,6 +436,8 @@  __iomap_dio_rw(struct kiocb *iocb, struct iov_iter *iter,
 
 	if (is_sync_kiocb(iocb))
 		dio_flags |= IOMAP_DIO_FORCE_WAIT;
+	if (dio_flags & IOMAP_DIO_NOALLOC)
+		flags |= IOMAP_NOALLOC;
 
 	dio = kmalloc(sizeof(*dio), GFP_KERNEL);
 	if (!dio)
diff --git a/include/linux/iomap.h b/include/linux/iomap.h
index e7865654dd0dca..a92890d5dc6799 100644
--- a/include/linux/iomap.h
+++ b/include/linux/iomap.h
@@ -122,6 +122,7 @@  struct iomap_page_ops {
 #define IOMAP_FAULT		(1 << 3) /* mapping for page fault */
 #define IOMAP_DIRECT		(1 << 4) /* direct I/O */
 #define IOMAP_NOWAIT		(1 << 5) /* do not block */
+#define IOMAP_NOALLOC		(1 << 6) /* do not allocate blocks */
 
 struct iomap_ops {
 	/*
@@ -257,6 +258,7 @@  struct iomap_dio_ops {
 };
 
 #define IOMAP_DIO_FORCE_WAIT	(1 << 0)	/* force waiting for I/O */
+#define IOMAP_DIO_NOALLOC	(1 << 1)	/* do not allocate blocks */
 
 ssize_t iomap_dio_rw(struct kiocb *iocb, struct iov_iter *iter,
 		const struct iomap_ops *ops, const struct iomap_dio_ops *dops,
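
Not part of the patch above, but for illustration: one way a filesystem's
->iomap_begin implementation might be expected to honour the new
IOMAP_NOALLOC flag. The helper and its parameters are made up for this
sketch; a real implementation would consult its own extent state.

#include <linux/errno.h>
#include <linux/iomap.h>
#include <linux/types.h>

/*
 * Illustrative sketch only: with IOMAP_NOALLOC set the caller typically
 * holds only a shared lock, so the mapping callback would reject any
 * range that needs block allocation, unwritten extent conversion or
 * sub-block zeroing, and the caller would then retry the IO under
 * exclusive locking.
 */
static int example_check_noalloc(unsigned int flags, bool range_is_written,
				 bool range_is_contiguous)
{
	if (!(flags & IOMAP_NOALLOC))
		return 0;
	if (!range_is_written || !range_is_contiguous)
		return -EAGAIN;	/* let the caller fall back */
	return 0;
}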