[RFC,V2,01/12] fs/stat: Define DAX statx attribute

Message ID 20200110192942.25021-2-ira.weiny@intel.com
State Superseded
Series Enable per-file/directory DAX operations V2

Commit Message

Ira Weiny Jan. 10, 2020, 7:29 p.m. UTC
From: Ira Weiny <ira.weiny@intel.com>

In order for users to determine if a file is currently operating in DAX
mode (effective DAX), define a statx attribute value and set that
attribute if the effective DAX flag is set.

To go along with this we propose the following addition to the statx man
page:

STATX_ATTR_DAX

	DAX (cpu direct access) is a file mode that attempts to minimize
	software cache effects for both I/O and memory mappings of this
	file.  It requires a capable device, a compatible filesystem
	block size, and filesystem opt-in. It generally assumes all
	accesses are via cpu load / store instructions which can
	minimize overhead for small accesses, but adversely affect cpu
	utilization for large transfers. File I/O is done directly
	to/from user-space buffers. While the DAX property tends to
	result in data being transferred synchronously it does not give
	the guarantees of synchronous I/O that data and necessary
	metadata are transferred. Memory mapped I/O may be performed
	with direct mappings that bypass system memory buffering. Again
	while memory-mapped I/O tends to result in data being
	transferred synchronously it does not guarantee synchronous
	metadata updates. A dax file may optionally support being mapped
	with the MAP_SYNC flag which does allow cpu store operations to
	be considered synchronous modulo cpu cache effects.

Signed-off-by: Ira Weiny <ira.weiny@intel.com>
---
 fs/stat.c                 | 3 +++
 include/uapi/linux/stat.h | 1 +
 2 files changed, 4 insertions(+)
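
As an illustration of how userspace might consume the new attribute, here
is a minimal sketch, assuming a glibc that provides the statx() wrapper and
headers that define STATX_ATTR_DAX.  The helper names statx_attr_set() and
file_is_dax() are hypothetical and not part of this patch.  The sketch
relies on whatever value the installed headers give STATX_ATTR_DAX rather
than hard-coding the one proposed in this RFC.

```c
#define _GNU_SOURCE           /* exposes statx() and struct statx in glibc */
#include <assert.h>
#include <fcntl.h>            /* AT_FDCWD */
#include <stdint.h>
#include <sys/stat.h>         /* statx(), struct statx, STATX_* */

/* Return 1 if every bit in attr is set in stx->stx_attributes, else 0. */
static int statx_attr_set(const struct statx *stx, uint64_t attr)
{
	return (stx->stx_attributes & attr) == attr;
}

/*
 * Hypothetical helper: 1 if path is reported as DAX, 0 if it is not,
 * -1 if statx() failed or the headers do not define STATX_ATTR_DAX.
 */
static int file_is_dax(const char *path)
{
#ifdef STATX_ATTR_DAX
	struct statx stx;

	if (statx(AT_FDCWD, path, 0, STATX_BASIC_STATS, &stx) != 0)
		return -1;
	return statx_attr_set(&stx, STATX_ATTR_DAX);
#else
	(void)path;
	return -1;
#endif
}
```

A caller would treat -1 as "unknown" rather than "not DAX", since an older
kernel or header set cannot report the attribute at all.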

Comments

Jan Kara Jan. 15, 2020, 11:37 a.m. UTC | #1
On Fri 10-01-20 11:29:31, ira.weiny@intel.com wrote:
> From: Ira Weiny <ira.weiny@intel.com>
> 
> In order for users to determine if a file is currently operating in DAX
> mode (effective DAX).  Define a statx attribute value and set that
> attribute if the effective DAX flag is set.
> 
> To go along with this we propose the following addition to the statx man
> page:
> 
> STATX_ATTR_DAX
> 
> 	DAX (cpu direct access) is a file mode that attempts to minimize
> 	software cache effects for both I/O and memory mappings of this
> 	file.  It requires a capable device, a compatible filesystem
> 	block size, and filesystem opt-in. It generally assumes all
> 	accesses are via cpu load / store instructions which can
> 	minimize overhead for small accesses, but adversely affect cpu
> 	utilization for large transfers. File I/O is done directly
> 	to/from user-space buffers. While the DAX property tends to
> 	result in data being transferred synchronously it does not give
> 	the guarantees of synchronous I/O that data and necessary
> 	metadata are transferred. Memory mapped I/O may be performed
> 	with direct mappings that bypass system memory buffering. Again
> 	while memory-mapped I/O tends to result in data being
> 	transferred synchronously it does not guarantee synchronous
> 	metadata updates. A dax file may optionally support being mapped
> 	with the MAP_SYNC flag which does allow cpu store operations to
> 	be considered synchronous modulo cpu cache effects.
> 
> Signed-off-by: Ira Weiny <ira.weiny@intel.com>

This looks good to me. You can add:

Reviewed-by: Jan Kara <jack@suse.cz>

								Honza

> ---
>  fs/stat.c                 | 3 +++
>  include/uapi/linux/stat.h | 1 +
>  2 files changed, 4 insertions(+)
> 
> diff --git a/fs/stat.c b/fs/stat.c
> index 030008796479..894699c74dde 100644
> --- a/fs/stat.c
> +++ b/fs/stat.c
> @@ -79,6 +79,9 @@ int vfs_getattr_nosec(const struct path *path, struct kstat *stat,
>  	if (IS_AUTOMOUNT(inode))
>  		stat->attributes |= STATX_ATTR_AUTOMOUNT;
>  
> +	if (IS_DAX(inode))
> +		stat->attributes |= STATX_ATTR_DAX;
> +
>  	if (inode->i_op->getattr)
>  		return inode->i_op->getattr(path, stat, request_mask,
>  					    query_flags);
> diff --git a/include/uapi/linux/stat.h b/include/uapi/linux/stat.h
> index ad80a5c885d5..e5f9d5517f6b 100644
> --- a/include/uapi/linux/stat.h
> +++ b/include/uapi/linux/stat.h
> @@ -169,6 +169,7 @@ struct statx {
>  #define STATX_ATTR_ENCRYPTED		0x00000800 /* [I] File requires key to decrypt in fs */
>  #define STATX_ATTR_AUTOMOUNT		0x00001000 /* Dir: Automount trigger */
>  #define STATX_ATTR_VERITY		0x00100000 /* [I] Verity protected file */
> +#define STATX_ATTR_DAX			0x00002000 /* [I] File is DAX */
>  
>  
>  #endif /* _UAPI_LINUX_STAT_H */
> -- 
> 2.21.0
>
Darrick J. Wong Jan. 15, 2020, 5:38 p.m. UTC | #2
On Wed, Jan 15, 2020 at 12:37:15PM +0100, Jan Kara wrote:
> On Fri 10-01-20 11:29:31, ira.weiny@intel.com wrote:
> > From: Ira Weiny <ira.weiny@intel.com>
> > 
> > In order for users to determine if a file is currently operating in DAX
> > mode (effective DAX).  Define a statx attribute value and set that
> > attribute if the effective DAX flag is set.
> > 
> > To go along with this we propose the following addition to the statx man
> > page:
> > 
> > STATX_ATTR_DAX
> > 
> > 	DAX (cpu direct access) is a file mode that attempts to minimize

"..is a file I/O mode"?

> > 	software cache effects for both I/O and memory mappings of this
> > 	file.  It requires a capable device, a compatible filesystem
> > 	block size, and filesystem opt-in.

"...a capable storage device..."

What does "compatible fs block size" mean?  How does the user figure out
if their fs blocksize is compatible?  Do we tell users to refer to their
filesystem's documentation here?

> > It generally assumes all
> > 	accesses are via cpu load / store instructions which can
> > 	minimize overhead for small accesses, but adversely affect cpu
> > 	utilization for large transfers.

Will this always be true for persistent memory?

I wasn't even aware that large transfers adversely affected CPU
utilization. ;)

> >  File I/O is done directly
> > 	to/from user-space buffers. While the DAX property tends to
> > 	result in data being transferred synchronously it does not give

"...transferred synchronously, it does not..."

> > 	the guarantees of synchronous I/O that data and necessary

"...it does not guarantee that I/O or file metadata have been flushed to
the storage device."

> > 	metadata are transferred. Memory mapped I/O may be performed
> > 	with direct mappings that bypass system memory buffering.

"...with direct memory mappings that bypass kernel page cache."

> > Again
> > 	while memory-mapped I/O tends to result in data being

I would move the sentence about "Memory mapped I/O..." to directly after
the sentence about file I/O being done directly to and from userspace so
that you don't need to repeat this statement.

> > 	transferred synchronously it does not guarantee synchronous
> > 	metadata updates. A dax file may optionally support being mapped
> > 	with the MAP_SYNC flag which does allow cpu store operations to
> > 	be considered synchronous modulo cpu cache effects.

How does one detect or work around or deal with "cpu cache effects"?  I
assume some sort of CPU cache flush instruction is what is meant here,
but I think we could mention the basics of what has to be done here:

"A DAX file may support being mapped with the MAP_SYNC flag, which
enables a program to use CPU cache flush operations to persist CPU store
operations without an explicit fsync(2).  See mmap(2) for more
information."?

Oof, a paragraph break would be nice. :)

--D

Ira Weiny Jan. 15, 2020, 7:45 p.m. UTC | #3
On Wed, Jan 15, 2020 at 09:38:34AM -0800, Darrick J. Wong wrote:
> On Wed, Jan 15, 2020 at 12:37:15PM +0100, Jan Kara wrote:
> > On Fri 10-01-20 11:29:31, ira.weiny@intel.com wrote:
> > > From: Ira Weiny <ira.weiny@intel.com>
> > > 
> > > In order for users to determine if a file is currently operating in DAX
> > > mode (effective DAX).  Define a statx attribute value and set that
> > > attribute if the effective DAX flag is set.
> > > 
> > > To go along with this we propose the following addition to the statx man
> > > page:
> > > 
> > > STATX_ATTR_DAX
> > > 
> > > 	DAX (cpu direct access) is a file mode that attempts to minimize
> 
> "..is a file I/O mode"?

or  "... is a file state ..."?
 
> > > 	software cache effects for both I/O and memory mappings of this
> > > 	file.  It requires a capable device, a compatible filesystem
> > > 	block size, and filesystem opt-in.
> 
> "...a capable storage device..."

Done

> 
> What does "compatible fs block size" mean?  How does the user figure out
> if their fs blocksize is compatible?  Do we tell users to refer to their
> filesystem's documentation here?

Perhaps it is wrong for this to be in the man page at all?  Would it be better
to assume the file system and block device are already configured properly by
the admin?

The blocksize restrictions for that are already well documented, e.g.:

https://www.kernel.org/doc/Documentation/filesystems/dax.txt

?

How about changing the text to:

	It requires a block device and file system which have been configured
	to support DAX.

?

> 
> > > It generally assumes all
> > > 	accesses are via cpu load / store instructions which can
> > > 	minimize overhead for small accesses, but adversely affect cpu
> > > 	utilization for large transfers.
> 
> Will this always be true for persistent memory?

I'm not clear.  Did you mean "this" == adverse utilization for large transfers?

> 
> I wasn't even aware that large transfers adversely affected CPU
> utilization. ;)

Sure vs using a DMA engine for example.

> 
> > >  File I/O is done directly
> > > 	to/from user-space buffers. While the DAX property tends to
> > > 	result in data being transferred synchronously it does not give
> 
> "...transferred synchronously, it does not..."

done.

> 
> > > 	the guarantees of synchronous I/O that data and necessary
> 
> "...it does not guarantee that I/O or file metadata have been flushed to
> the storage device."

The lack of guarantee here is mainly regarding metadata.

How about:

        While the DAX property tends to result in data being transferred
        synchronously, it does not give the same guarantees of
        synchronous I/O where data and the necessary metadata are
        transferred together.

> 
> > > 	metadata are transferred. Memory mapped I/O may be performed
> > > 	with direct mappings that bypass system memory buffering.
> 
> "...with direct memory mappings that bypass kernel page cache."

Done.

> 
> > > Again
> > > 	while memory-mapped I/O tends to result in data being
> 
> I would move the sentence about "Memory mapped I/O..." to directly after
> the sentence about file I/O being done directly to and from userspace so
> that you don't need to repeat this statement.

Done.

> 
> > > 	transferred synchronously it does not guarantee synchronous
> > > 	metadata updates. A dax file may optionally support being mapped
> > > 	with the MAP_SYNC flag which does allow cpu store operations to
> > > 	be considered synchronous modulo cpu cache effects.
> 
> How does one detect or work around or deal with "cpu cache effects"?  I
> assume some sort of CPU cache flush instruction is what is meant here,
> but I think we could mention the basics of what has to be done here:
> 
> "A DAX file may support being mapped with the MAP_SYNC flag, which
> enables a program to use CPU cache flush operations to persist CPU store
> operations without an explicit fsync(2).  See mmap(2) for more
> information."?

That sounds better.  I like the reference to mmap as well.

Ok I changed a couple of things as well.  How does this sound?


STATX_ATTR_DAX 

        DAX (cpu direct access) is a file mode that attempts to minimize
        software cache effects for both I/O and memory mappings of this
        file.  It requires a block device and file system which have
        been configured to support DAX.

        DAX generally assumes all accesses are via cpu load / store
        instructions which can minimize overhead for small accesses, but
        may adversely affect cpu utilization for large transfers.

        File I/O is done directly to/from user-space buffers and memory
        mapped I/O may be performed with direct memory mappings that
        bypass kernel page cache.

        While the DAX property tends to result in data being transferred
        synchronously, it does not give the same guarantees of
        synchronous I/O where data and the necessary metadata are
        transferred together.

        A DAX file may support being mapped with the MAP_SYNC flag,
        which enables a program to use CPU cache flush operations to
        persist CPU store operations without an explicit fsync(2).  See
        mmap(2) for more information.


Ira

Dan Williams Jan. 15, 2020, 8:10 p.m. UTC | #4
On Wed, Jan 15, 2020 at 11:45 AM Ira Weiny <ira.weiny@intel.com> wrote:
>
> On Wed, Jan 15, 2020 at 09:38:34AM -0800, Darrick J. Wong wrote:
> > On Wed, Jan 15, 2020 at 12:37:15PM +0100, Jan Kara wrote:
> > > On Fri 10-01-20 11:29:31, ira.weiny@intel.com wrote:
> > > > From: Ira Weiny <ira.weiny@intel.com>
> > > >
> > > > In order for users to determine if a file is currently operating in DAX
> > > > mode (effective DAX).  Define a statx attribute value and set that
> > > > attribute if the effective DAX flag is set.
> > > >
> > > > To go along with this we propose the following addition to the statx man
> > > > page:
> > > >
> > > > STATX_ATTR_DAX
> > > >
> > > >   DAX (cpu direct access) is a file mode that attempts to minimize
> >
> > "..is a file I/O mode"?
>
> or  "... is a file state ..."?
>
> > > >   software cache effects for both I/O and memory mappings of this
> > > >   file.  It requires a capable device, a compatible filesystem
> > > >   block size, and filesystem opt-in.
> >
> > "...a capable storage device..."
>
> Done
>
> >
> > What does "compatible fs block size" mean?  How does the user figure out
> > if their fs blocksize is compatible?  Do we tell users to refer to their
> > filesystem's documentation here?
>
> Perhaps it is wrong for this to be in the man page at all?  Would it be better
> to assume the file system and block device are already configured properly by
> the admin?
>
> For which the blocksize restrictions are already well documented.  ie:
>
> https://www.kernel.org/doc/Documentation/filesystems/dax.txt
>
> ?
>
> How about changing the text to:
>
>         It requires a block device and file system which have been configured
>         to support DAX.
>
> ?

The goal was to document the gauntlet of checks that
__generic_fsdax_supported() performs so someone could debug "why am I
not able to get dax operation?"

>
> >
> > > > It generally assumes all
> > > >   accesses are via cpu load / store instructions which can
> > > >   minimize overhead for small accesses, but adversely affect cpu
> > > >   utilization for large transfers.
> >
> > Will this always be true for persistent memory?

For direct-mapped pmem there is no opportunity to do dma offload, so it
will always be true that application dax access consumes cpu to do I/O
where something like NVMe does not. There have been experiments,
unfruitful to date, with the driver using an offload engine for
kernel-internal I/O, but if your use case is kernel-internal I/O bound
then you don't need dax.

>
> I'm not clear.  Did you mean "this" == adverse utilization for large transfers?
>
> >
> > I wasn't even aware that large transfers adversely affected CPU
> > utilization. ;)
>
> Sure vs using a DMA engine for example.

Right, this is purely a statement about cpu memcpy vs device-dma.

>
> >
> > > >  File I/O is done directly
> > > >   to/from user-space buffers. While the DAX property tends to
> > > >   result in data being transferred synchronously it does not give
> >
> > "...transferred synchronously, it does not..."
>
> done.
>
> >
> > > >   the guarantees of synchronous I/O that data and necessary
> >
> > "...it does not guarantee that I/O or file metadata have been flushed to
> > the storage device."
>
> The lack of guarantee here is mainly regarding metadata.
>
> How about:
>
>         While the DAX property tends to result in data being transferred
>         synchronously, it does not give the same guarantees of
>         synchronous I/O where data and the necessary metadata are
>         transferred together.
>
> >
> > > >   metadata are transferred. Memory mapped I/O may be performed
> > > >   with direct mappings that bypass system memory buffering.
> >
> > "...with direct memory mappings that bypass kernel page cache."
>
> Done.
>
> >
> > > > Again
> > > >   while memory-mapped I/O tends to result in data being
> >
> > I would move the sentence about "Memory mapped I/O..." to directly after
> > the sentence about file I/O being done directly to and from userspace so
> > that you don't need to repeat this statement.
>
> Done.
>
> >
> > > >   transferred synchronously it does not guarantee synchronous
> > > >   metadata updates. A dax file may optionally support being mapped
> > > >   with the MAP_SYNC flag which does allow cpu store operations to
> > > >   be considered synchronous modulo cpu cache effects.
> >
> > How does one detect or work around or deal with "cpu cache effects"?  I
> > assume some sort of CPU cache flush instruction is what is meant here,
> > but I think we could mention the basics of what has to be done here:
> >
> > "A DAX file may support being mapped with the MAP_SYNC flag, which
> > enables a program to use CPU cache flush operations to persist CPU store
> > operations without an explicit fsync(2).  See mmap(2) for more
> > information."?
>
> That sounds better.  I like the reference to mmap as well.
>
> Ok I changed a couple of things as well.  How does this sound?
>
>
> STATX_ATTR_DAX
>
>         DAX (cpu direct access) is a file mode that attempts to minimize

s/mode/state/?

>         software cache effects for both I/O and memory mappings of this
>         file.  It requires a block device and file system which have
>         been configured to support DAX.

It may not require a block device in the future.

>
>         DAX generally assumes all accesses are via cpu load / store
>         instructions which can minimize overhead for small accesses, but
>         may adversely affect cpu utilization for large transfers.
>
>         File I/O is done directly to/from user-space buffers and memory
>         mapped I/O may be performed with direct memory mappings that
>         bypass kernel page cache.
>
>         While the DAX property tends to result in data being transferred
>         synchronously, it does not give the same guarantees of
>         synchronous I/O where data and the necessary metadata are

Maybe use "O_SYNC I/O" explicitly to further differentiate the 2
meanings of "synchronous" in this sentence?

>         transferred together.
>
>         A DAX file may support being mapped with the MAP_SYNC flag,
>         which enables a program to use CPU cache flush operations to

s/operations/instructions/

>         persist CPU store operations without an explicit fsync(2).  See
>         mmap(2) for more information.

I think this also wants a reference to the Linux interpretation of
platform "persistence domains", which we were discussing here [1], but
maybe it should be part of a "pmem" manpage that can be referenced
from this man page.

[1]: http://lore.kernel.org/r/20200108064905.170394-1-aneesh.kumar@linux.ibm.com
Ira Weiny Jan. 15, 2020, 10:38 p.m. UTC | #5
On Wed, Jan 15, 2020 at 12:10:50PM -0800, Dan Williams wrote:
> On Wed, Jan 15, 2020 at 11:45 AM Ira Weiny <ira.weiny@intel.com> wrote:
> >
> > On Wed, Jan 15, 2020 at 09:38:34AM -0800, Darrick J. Wong wrote:
> > > On Wed, Jan 15, 2020 at 12:37:15PM +0100, Jan Kara wrote:
> > > > On Fri 10-01-20 11:29:31, ira.weiny@intel.com wrote:
> > > > > From: Ira Weiny <ira.weiny@intel.com>
> > > > >

[snip]

> > Ok I changed a couple of things as well.  How does this sound?
> >
> >
> > STATX_ATTR_DAX
> >
> >         DAX (cpu direct access) is a file mode that attempts to minimize
> 
> s/mode/state/?

DOH!  yes state...  ;-)

> 
> >         software cache effects for both I/O and memory mappings of this
> >         file.  It requires a block device and file system which have
> >         been configured to support DAX.
> 
> It may not require a block device in the future.

Ok:

"It requires a file system which has been configured to support DAX." ?

I'm trying to separate the user of the individual STATX DAX flag from the Admin
details of configuring the file system and/or devices which support it.

Also, I just realized that we should follow the format of the other STATX_*
attributes.  They all read something like "the file is..."

So I'm adding that text as well.

> 
> >
> >         DAX generally assumes all accesses are via cpu load / store
> >         instructions which can minimize overhead for small accesses, but
> >         may adversely affect cpu utilization for large transfers.
> >
> >         File I/O is done directly to/from user-space buffers and memory
> >         mapped I/O may be performed with direct memory mappings that
> >         bypass kernel page cache.
> >
> >         While the DAX property tends to result in data being transferred
> >         synchronously, it does not give the same guarantees of
> >         synchronous I/O where data and the necessary metadata are
> 
> Maybe use "O_SYNC I/O" explicitly to further differentiate the 2
> meanings of "synchronous" in this sentence?

Done.

> 
> >         transferred together.
> >
> >         A DAX file may support being mapped with the MAP_SYNC flag,
> >         which enables a program to use CPU cache flush operations to
> 
> s/operations/instructions/

Done.

> 
> >         persist CPU store operations without an explicit fsync(2).  See
> >         mmap(2) for more information.
> 
> I think this also wants a reference to the Linux interpretation of
> platform "persistence domains", which we were discussing here [1], but
> maybe it should be part of a "pmem" manpage that can be referenced
> from this man page.

Sure, but for now I think referencing mmap for details on MAP_SYNC works.

I suspect that we may have some wordsmithing to do once I get this series in and we
submit a change to the statx man page itself.  Can I move forward with the
following for this patch?

<quote>
STATX_ATTR_DAX

        The file is in the DAX (cpu direct access) state.  DAX state
        attempts to minimize software cache effects for both I/O and
        memory mappings of this file.  It requires a file system which
        has been configured to support DAX.

        DAX generally assumes all accesses are via cpu load / store
        instructions which can minimize overhead for small accesses, but
        may adversely affect cpu utilization for large transfers.

        File I/O is done directly to/from user-space buffers and memory
        mapped I/O may be performed with direct memory mappings that
        bypass kernel page cache.

        While the DAX property tends to result in data being transferred
        synchronously, it does not give the same guarantees of
        synchronous I/O where data and the necessary metadata are
        transferred together.

        A DAX file may support being mapped with the MAP_SYNC flag,
        which enables a program to use CPU cache flush instructions to
        persist CPU store operations without an explicit fsync(2).  See
        mmap(2) for more information.
</quote>
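
For readers unfamiliar with MAP_SYNC, the last paragraph of the proposed
text can be illustrated with a minimal sketch, assuming headers that define
MAP_SYNC and MAP_SHARED_VALIDATE.  The helper name map_dax_sync() is
hypothetical.  Per mmap(2), MAP_SYNC is only honoured together with
MAP_SHARED_VALIDATE, and the call fails for files that do not support it.

```c
#define _GNU_SOURCE           /* exposes MAP_SYNC / MAP_SHARED_VALIDATE in glibc */
#include <assert.h>
#include <stddef.h>
#include <sys/mman.h>

/*
 * Hypothetical helper: map len bytes of fd with MAP_SYNC.
 * Returns the mapping, or NULL if the mapping failed or the headers
 * lack the flags.  On success, stores to the mapping can be persisted
 * with CPU cache flush instructions instead of fsync(2); which
 * instructions to use is architecture specific and not shown here.
 */
static void *map_dax_sync(int fd, size_t len)
{
#if defined(MAP_SYNC) && defined(MAP_SHARED_VALIDATE)
	void *p = mmap(NULL, len, PROT_READ | PROT_WRITE,
		       MAP_SHARED_VALIDATE | MAP_SYNC, fd, 0);

	return p == MAP_FAILED ? NULL : p;
#else
	(void)fd;
	(void)len;
	return NULL;
#endif
}
```

A NULL return is the cue to fall back to an ordinary MAP_SHARED mapping
plus msync(2) or fsync(2).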

Ira

> 
> [1]: http://lore.kernel.org/r/20200108064905.170394-1-aneesh.kumar@linux.ibm.com
Darrick J. Wong Jan. 16, 2020, 5:39 a.m. UTC | #6
On Wed, Jan 15, 2020 at 02:38:21PM -0800, Ira Weiny wrote:
> On Wed, Jan 15, 2020 at 12:10:50PM -0800, Dan Williams wrote:
> > On Wed, Jan 15, 2020 at 11:45 AM Ira Weiny <ira.weiny@intel.com> wrote:
> > >
> > > On Wed, Jan 15, 2020 at 09:38:34AM -0800, Darrick J. Wong wrote:
> > > > On Wed, Jan 15, 2020 at 12:37:15PM +0100, Jan Kara wrote:
> > > > > On Fri 10-01-20 11:29:31, ira.weiny@intel.com wrote:
> > > > > > From: Ira Weiny <ira.weiny@intel.com>
> > > > > >
> 
> [snip]
> 
> > > Ok I changed a couple of things as well.  How does this sound?
> > >
> > >
> > > STATX_ATTR_DAX
> > >
> > >         DAX (cpu direct access) is a file mode that attempts to minimize
> > 
> > s/mode/state/?
> 
> DOH!  yes state...  ;-)
> 
> > 
> > >         software cache effects for both I/O and memory mappings of this
> > >         file.  It requires a block device and file system which have
> > >         been configured to support DAX.
> > 
> > It may not require a block device in the future.
> 
> Ok:
> 
> "It requires a file system which has been configured to support DAX." ?
> 
> I'm trying to separate the user of the individual STATX DAX flag from the Admin
> details of configuring the file system and/or devices which support it.
> 
> Also, I just realized that we should follow the format of the other STATX_*
> attributes.  They all read something like "the file is..."
> 
> So I'm adding that text as well.
> 
> > 
> > >
> > >         DAX generally assumes all accesses are via cpu load / store
> > >         instructions which can minimize overhead for small accesses, but
> > >         may adversely affect cpu utilization for large transfers.
> > >
> > >         File I/O is done directly to/from user-space buffers and memory
> > >         mapped I/O may be performed with direct memory mappings that
> > >         bypass kernel page cache.
> > >
> > >         While the DAX property tends to result in data being transferred
> > >         synchronously, it does not give the same guarantees of
> > >         synchronous I/O where data and the necessary metadata are
> > 
> > Maybe use "O_SYNC I/O" explicitly to further differentiate the 2
> > meanings of "synchronous" in this sentence?
> 
> Done.
> 
> > 
> > >         transferred together.
> > >
> > >         A DAX file may support being mapped with the MAP_SYNC flag,
> > >         which enables a program to use CPU cache flush operations to
> > 
> > s/operations/instructions/
> 
> Done.
> 
> > 
> > >         persist CPU store operations without an explicit fsync(2).  See
> > >         mmap(2) for more information.
> > 
> > I think this also wants a reference to the Linux interpretation of
> > platform "persistence domains", which we were discussing here [1], but
> > maybe it should be part of a "pmem" manpage that can be referenced
> > from this man page.
> 
> Sure, but for now I think referencing mmap for details on MAP_SYNC works.
> 
> I suspect that we may have some word smithing once I get this series in and we
> submit a change to the statx man page itself.  Can I move forward with the
> following for this patch?
> 
> <quote>
> STATX_ATTR_DAX
> 
>         The file is in the DAX (cpu direct access) state.  DAX state

Hmm, now that I see it written out, I <cough> kind of like "DAX mode"
better now. :/

"The file is in DAX (CPU direct access) mode.  DAX mode attempts..."

>         attempts to minimize software cache effects for both I/O and
>         memory mappings of this file.  It requires a file system which
>         has been configured to support DAX.
> 
>         DAX generally assumes all accesses are via cpu load / store
>         instructions which can minimize overhead for small accesses, but
>         may adversely affect cpu utilization for large transfers.
> 
>         File I/O is done directly to/from user-space buffers and memory
>         mapped I/O may be performed with direct memory mappings that
>         bypass kernel page cache.
> 
>         While the DAX property tends to result in data being transferred
>         synchronously, it does not give the same guarantees of
>         synchronous I/O where data and the necessary metadata are
>         transferred together.

(I'm frankly not sure that synchronous I/O actually guarantees that the
metadata has hit stable storage...)

--D

>         A DAX file may support being mapped with the MAP_SYNC flag,
>         which enables a program to use CPU cache flush instructions to
>         persist CPU store operations without an explicit fsync(2).  See
>         mmap(2) for more information.
> </quote>
> 
> Ira
> 
> > 
> > [1]: http://lore.kernel.org/r/20200108064905.170394-1-aneesh.kumar@linux.ibm.com
Dan Williams Jan. 16, 2020, 6:05 a.m. UTC | #7
On Wed, Jan 15, 2020 at 9:39 PM Darrick J. Wong <darrick.wong@oracle.com> wrote:
[..]
> >         attempts to minimize software cache effects for both I/O and
> >         memory mappings of this file.  It requires a file system which
> >         has been configured to support DAX.
> >
> >         DAX generally assumes all accesses are via cpu load / store
> >         instructions which can minimize overhead for small accesses, but
> >         may adversely affect cpu utilization for large transfers.
> >
> >         File I/O is done directly to/from user-space buffers and memory
> >         mapped I/O may be performed with direct memory mappings that
> >         bypass kernel page cache.
> >
> >         While the DAX property tends to result in data being transferred
> >         synchronously, it does not give the same guarantees of
> >         synchronous I/O where data and the necessary metadata are
> >         transferred together.
>
> (I'm frankly not sure that synchronous I/O actually guarantees that the
> metadata has hit stable storage...)

Oh? That text was motivated by the open(2) man page description of O_SYNC.
Darrick J. Wong Jan. 16, 2020, 6:18 a.m. UTC | #8
On Wed, Jan 15, 2020 at 10:05:00PM -0800, Dan Williams wrote:
> On Wed, Jan 15, 2020 at 9:39 PM Darrick J. Wong <darrick.wong@oracle.com> wrote:
> [..]
> > >         attempts to minimize software cache effects for both I/O and
> > >         memory mappings of this file.  It requires a file system which
> > >         has been configured to support DAX.
> > >
> > >         DAX generally assumes all accesses are via cpu load / store
> > >         instructions which can minimize overhead for small accesses, but
> > >         may adversely affect cpu utilization for large transfers.
> > >
> > >         File I/O is done directly to/from user-space buffers and memory
> > >         mapped I/O may be performed with direct memory mappings that
> > >         bypass kernel page cache.
> > >
> > >         While the DAX property tends to result in data being transferred
> > >         synchronously, it does not give the same guarantees of
> > >         synchronous I/O where data and the necessary metadata are
> > >         transferred together.
> >
> > (I'm frankly not sure that synchronous I/O actually guarantees that the
> > metadata has hit stable storage...)
> 
> Oh? That text was motivated by the open(2) man page description of O_SYNC.

Eh, that's just me being cynical about software.  Yes, the O_SYNC docs
say that data+metadata are supposed to happen; that's good enough for
another section in the man pages. :)

--D
Dan Williams Jan. 16, 2020, 6:25 a.m. UTC | #9
On Wed, Jan 15, 2020 at 10:18 PM Darrick J. Wong
<darrick.wong@oracle.com> wrote:
>
> On Wed, Jan 15, 2020 at 10:05:00PM -0800, Dan Williams wrote:
> > On Wed, Jan 15, 2020 at 9:39 PM Darrick J. Wong <darrick.wong@oracle.com> wrote:
> > [..]
> > > >         attempts to minimize software cache effects for both I/O and
> > > >         memory mappings of this file.  It requires a file system which
> > > >         has been configured to support DAX.
> > > >
> > > >         DAX generally assumes all accesses are via cpu load / store
> > > >         instructions which can minimize overhead for small accesses, but
> > > >         may adversely affect cpu utilization for large transfers.
> > > >
> > > >         File I/O is done directly to/from user-space buffers and memory
> > > >         mapped I/O may be performed with direct memory mappings that
> > > >         bypass kernel page cache.
> > > >
> > > >         While the DAX property tends to result in data being transferred
> > > >         synchronously, it does not give the same guarantees of
> > > >         synchronous I/O where data and the necessary metadata are
> > > >         transferred together.
> > >
> > > (I'm frankly not sure that synchronous I/O actually guarantees that the
> > > metadata has hit stable storage...)
> >
> > Oh? That text was motivated by the open(2) man page description of O_SYNC.
>
> Eh, that's just me being cynical about software.  Yes, the O_SYNC docs
> say that data+metadata are supposed to happen; that's good enough for
> another section in the man pages. :)
>

Ah ok, yes, "all storage is a lie".
Ira Weiny Jan. 16, 2020, 5:55 p.m. UTC | #10
On Wed, Jan 15, 2020 at 09:39:35PM -0800, Darrick J. Wong wrote:
> On Wed, Jan 15, 2020 at 02:38:21PM -0800, Ira Weiny wrote:
> > On Wed, Jan 15, 2020 at 12:10:50PM -0800, Dan Williams wrote:
> > > On Wed, Jan 15, 2020 at 11:45 AM Ira Weiny <ira.weiny@intel.com> wrote:
> > > >
> > > > On Wed, Jan 15, 2020 at 09:38:34AM -0800, Darrick J. Wong wrote:
> > > > > On Wed, Jan 15, 2020 at 12:37:15PM +0100, Jan Kara wrote:
> > > > > > On Fri 10-01-20 11:29:31, ira.weiny@intel.com wrote:
> > > > > > > From: Ira Weiny <ira.weiny@intel.com>
> > > > > > >
> > 

[snip]

> > 
> > Sure, but for now I think referencing mmap for details on MAP_SYNC works.
> > 
> > I suspect that we may have some word smithing once I get this series in and we
> > submit a change to the statx man page itself.  Can I move forward with the
> > following for this patch?
> > 
> > <quote>
> > STATX_ATTR_DAX
> > 
> >         The file is in the DAX (cpu direct access) state.  DAX state
> 
> Hmm, now that I see it written out, I <cough> kind of like "DAX mode"
> better now. :/
> 
> "The file is in DAX (CPU direct access) mode.  DAX mode attempts..."

Sure...  now you tell me...  ;-)

Seriously, we could use mode here in the man page as this is less confusing to
say "DAX mode".

But I think the code should still use 'state' because mode is just too
overloaded.  You were not the only one who was thrown by my use of mode and I
don't want that confusion when we look at this code 2 weeks from now...

https://www.reddit.com/r/ProgrammerHumor/comments/852og2/only_god_knows/

;-)

> 
> >         attempts to minimize software cache effects for both I/O and
> >         memory mappings of this file.  It requires a file system which
> >         has been configured to support DAX.
> > 
> >         DAX generally assumes all accesses are via cpu load / store
> >         instructions which can minimize overhead for small accesses, but
> >         may adversely affect cpu utilization for large transfers.
> > 
> >         File I/O is done directly to/from user-space buffers and memory
> >         mapped I/O may be performed with direct memory mappings that
> >         bypass kernel page cache.
> > 
> >         While the DAX property tends to result in data being transferred
> >         synchronously, it does not give the same guarantees of
> >         synchronous I/O where data and the necessary metadata are
> >         transferred together.
> 
> (I'm frankly not sure that synchronous I/O actually guarantees that the
> metadata has hit stable storage...)

I'll let you and Dan work this one out...  ;-)

Ira
Darrick J. Wong Jan. 16, 2020, 6:04 p.m. UTC | #11
On Thu, Jan 16, 2020 at 09:55:02AM -0800, Ira Weiny wrote:
> On Wed, Jan 15, 2020 at 09:39:35PM -0800, Darrick J. Wong wrote:
> > On Wed, Jan 15, 2020 at 02:38:21PM -0800, Ira Weiny wrote:
> > > On Wed, Jan 15, 2020 at 12:10:50PM -0800, Dan Williams wrote:
> > > > On Wed, Jan 15, 2020 at 11:45 AM Ira Weiny <ira.weiny@intel.com> wrote:
> > > > >
> > > > > On Wed, Jan 15, 2020 at 09:38:34AM -0800, Darrick J. Wong wrote:
> > > > > > On Wed, Jan 15, 2020 at 12:37:15PM +0100, Jan Kara wrote:
> > > > > > > On Fri 10-01-20 11:29:31, ira.weiny@intel.com wrote:
> > > > > > > > From: Ira Weiny <ira.weiny@intel.com>
> > > > > > > >
> > > 
> 
> [snip]
> 
> > > 
> > > Sure, but for now I think referencing mmap for details on MAP_SYNC works.
> > > 
> > > I suspect that we may have some word smithing once I get this series in and we
> > > submit a change to the statx man page itself.  Can I move forward with the
> > > following for this patch?
> > > 
> > > <quote>
> > > STATX_ATTR_DAX
> > > 
> > >         The file is in the DAX (cpu direct access) state.  DAX state
> > 
> > Hmm, now that I see it written out, I <cough> kind of like "DAX mode"
> > better now. :/
> > 
> > "The file is in DAX (CPU direct access) mode.  DAX mode attempts..."
> 
> Sure...  now you tell me...  ;-)
> 
> Seriously, we could use mode here in the man page as this is less confusing to
> say "DAX mode".
> 
> But I think the code should still use 'state' because mode is just too
> overloaded.  You were not the only one who was thrown by my use of mode and I
> don't want that confusion when we look at this code 2 weeks from now...
> 
> https://www.reddit.com/r/ProgrammerHumor/comments/852og2/only_god_knows/
> 
> ;-)

Ok, let's leave it alone for now then.

I'm not even sure what 'DAX' stands for.  Direct Access to ...
Professor Xavier? 8-)

> > 
> > >         attempts to minimize software cache effects for both I/O and
> > >         memory mappings of this file.  It requires a file system which
> > >         has been configured to support DAX.
> > > 
> > >         DAX generally assumes all accesses are via cpu load / store
> > >         instructions which can minimize overhead for small accesses, but
> > >         may adversely affect cpu utilization for large transfers.
> > > 
> > >         File I/O is done directly to/from user-space buffers and memory
> > >         mapped I/O may be performed with direct memory mappings that
> > >         bypass kernel page cache.
> > > 
> > >         While the DAX property tends to result in data being transferred
> > >         synchronously, it does not give the same guarantees of
> > >         synchronous I/O where data and the necessary metadata are
> > >         transferred together.
> > 
> > (I'm frankly not sure that synchronous I/O actually guarantees that the
> > metadata has hit stable storage...)
> 
> I'll let you and Dan work this one out...  ;-)

Hehe.  I think the wording here is fine.

--D

> Ira
>
Ira Weiny Jan. 16, 2020, 6:52 p.m. UTC | #12
On Thu, Jan 16, 2020 at 10:04:21AM -0800, Darrick J. Wong wrote:
> On Thu, Jan 16, 2020 at 09:55:02AM -0800, Ira Weiny wrote:
> > On Wed, Jan 15, 2020 at 09:39:35PM -0800, Darrick J. Wong wrote:
> > > On Wed, Jan 15, 2020 at 02:38:21PM -0800, Ira Weiny wrote:
> > > > On Wed, Jan 15, 2020 at 12:10:50PM -0800, Dan Williams wrote:
> > > > > On Wed, Jan 15, 2020 at 11:45 AM Ira Weiny <ira.weiny@intel.com> wrote:
> > > > > >
> > > > > > On Wed, Jan 15, 2020 at 09:38:34AM -0800, Darrick J. Wong wrote:
> > > > > > > On Wed, Jan 15, 2020 at 12:37:15PM +0100, Jan Kara wrote:
> > > > > > > > On Fri 10-01-20 11:29:31, ira.weiny@intel.com wrote:
> > > > > > > > > From: Ira Weiny <ira.weiny@intel.com>
> > > > > > > > >
> > > > 
> > 
> > [snip]
> > 
> > > > 
> > > > Sure, but for now I think referencing mmap for details on MAP_SYNC works.
> > > > 
> > > > I suspect that we may have some word smithing once I get this series in and we
> > > > submit a change to the statx man page itself.  Can I move forward with the
> > > > following for this patch?
> > > > 
> > > > <quote>
> > > > STATX_ATTR_DAX
> > > > 
> > > >         The file is in the DAX (cpu direct access) state.  DAX state
> > > 
> > > Hmm, now that I see it written out, I <cough> kind of like "DAX mode"
> > > better now. :/
> > > 
> > > "The file is in DAX (CPU direct access) mode.  DAX mode attempts..."
> > 
> > Sure...  now you tell me...  ;-)
> > 
> > Seriously, we could use mode here in the man page as this is less confusing to
> > say "DAX mode".
> > 
> > But I think the code should still use 'state' because mode is just too
> > overloaded.  You were not the only one who was thrown by my use of mode and I
> > don't want that confusion when we look at this code 2 weeks from now...
> > 
> > https://www.reddit.com/r/ProgrammerHumor/comments/852og2/only_god_knows/
> > 
> > ;-)
> 
> Ok, let's leave it alone for now then.

Cool, could I get a Reviewed-by?

And Jan, is this reword of the man page/commit OK to keep your Reviewed-by?

> 
> I'm not even sure what 'DAX' stands for.  Direct Access to ...
> Professor Xavier? 8-)

That is pronounced 'Direct A'Xes'  you know, for chopping wood!

Thanks everyone,
Ira

> 
> > > 
> > > >         attempts to minimize software cache effects for both I/O and
> > > >         memory mappings of this file.  It requires a file system which
> > > >         has been configured to support DAX.
> > > > 
> > > >         DAX generally assumes all accesses are via cpu load / store
> > > >         instructions which can minimize overhead for small accesses, but
> > > >         may adversely affect cpu utilization for large transfers.
> > > > 
> > > >         File I/O is done directly to/from user-space buffers and memory
> > > >         mapped I/O may be performed with direct memory mappings that
> > > >         bypass kernel page cache.
> > > > 
> > > >         While the DAX property tends to result in data being transferred
> > > >         synchronously, it does not give the same guarantees of
> > > >         synchronous I/O where data and the necessary metadata are
> > > >         transferred together.
> > > 
> > > (I'm frankly not sure that synchronous I/O actually guarantees that the
> > > metadata has hit stable storage...)
> > 
> > I'll let you and Dan work this one out...  ;-)
> 
> Hehe.  I think the wording here is fine.
> 
> --D
> 
> > Ira
> >
Darrick J. Wong Jan. 16, 2020, 10:19 p.m. UTC | #13
On Thu, Jan 16, 2020 at 10:52:36AM -0800, Ira Weiny wrote:
> On Thu, Jan 16, 2020 at 10:04:21AM -0800, Darrick J. Wong wrote:
> > On Thu, Jan 16, 2020 at 09:55:02AM -0800, Ira Weiny wrote:
> > > On Wed, Jan 15, 2020 at 09:39:35PM -0800, Darrick J. Wong wrote:
> > > > On Wed, Jan 15, 2020 at 02:38:21PM -0800, Ira Weiny wrote:
> > > > > On Wed, Jan 15, 2020 at 12:10:50PM -0800, Dan Williams wrote:
> > > > > > On Wed, Jan 15, 2020 at 11:45 AM Ira Weiny <ira.weiny@intel.com> wrote:
> > > > > > >
> > > > > > > On Wed, Jan 15, 2020 at 09:38:34AM -0800, Darrick J. Wong wrote:
> > > > > > > > On Wed, Jan 15, 2020 at 12:37:15PM +0100, Jan Kara wrote:
> > > > > > > > > On Fri 10-01-20 11:29:31, ira.weiny@intel.com wrote:
> > > > > > > > > > From: Ira Weiny <ira.weiny@intel.com>
> > > > > > > > > >
> > > > > 
> > > 
> > > [snip]
> > > 
> > > > > 
> > > > > Sure, but for now I think referencing mmap for details on MAP_SYNC works.
> > > > > 
> > > > > I suspect that we may have some word smithing once I get this series in and we
> > > > > submit a change to the statx man page itself.  Can I move forward with the
> > > > > following for this patch?
> > > > > 
> > > > > <quote>
> > > > > STATX_ATTR_DAX
> > > > > 
> > > > >         The file is in the DAX (cpu direct access) state.  DAX state
> > > > 
> > > > Hmm, now that I see it written out, I <cough> kind of like "DAX mode"
> > > > better now. :/
> > > > 
> > > > "The file is in DAX (CPU direct access) mode.  DAX mode attempts..."
> > > 
> > > Sure...  now you tell me...  ;-)
> > > 
> > > Seriously, we could use mode here in the man page as this is less confusing to
> > > say "DAX mode".
> > > 
> > > But I think the code should still use 'state' because mode is just too
> > > overloaded.  You were not the only one who was thrown by my use of mode and I
> > > don't want that confusion when we look at this code 2 weeks from now...
> > > 
> > > https://www.reddit.com/r/ProgrammerHumor/comments/852og2/only_god_knows/
> > > 
> > > ;-)
> > 
> > Ok, let's leave it alone for now then.
> 
> Cool could I get a reviewed by?

My bike shed is painted green with purple polka dots,

Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>

--D

> And Jan is this reword of the man page/commit ok to keep your reviewed by?
> 
> > 
> > I'm not even sure what 'DAX' stands for.  Direct Access to ...
> > Professor Xavier? 8-)
> 
> That is pronounced 'Direct A'Xes'  you know, for chopping wood!
> 
> Thanks everyone,
> Ira
> 
> > 
> > > > 
> > > > >         attempts to minimize software cache effects for both I/O and
> > > > >         memory mappings of this file.  It requires a file system which
> > > > >         has been configured to support DAX.
> > > > > 
> > > > >         DAX generally assumes all accesses are via cpu load / store
> > > > >         instructions which can minimize overhead for small accesses, but
> > > > >         may adversely affect cpu utilization for large transfers.
> > > > > 
> > > > >         File I/O is done directly to/from user-space buffers and memory
> > > > >         mapped I/O may be performed with direct memory mappings that
> > > > >         bypass kernel page cache.
> > > > > 
> > > > >         While the DAX property tends to result in data being transferred
> > > > >         synchronously, it does not give the same guarantees of
> > > > >         synchronous I/O where data and the necessary metadata are
> > > > >         transferred together.
> > > > 
> > > > (I'm frankly not sure that synchronous I/O actually guarantees that the
> > > > metadata has hit stable storage...)
> > > 
> > > I'll let you and Dan work this one out...  ;-)
> > 
> > Hehe.  I think the wording here is fine.
> > 
> > --D
> > 
> > > Ira
> > >
Jan Kara Jan. 17, 2020, 11:58 a.m. UTC | #14
On Thu 16-01-20 10:52:36, Ira Weiny wrote:
> And Jan is this reword of the man page/commit ok to keep your reviewed by?

Yes.

								Honza
Dave Chinner Jan. 18, 2020, 9:11 a.m. UTC | #15
On Wed, Jan 15, 2020 at 10:05:00PM -0800, Dan Williams wrote:
> On Wed, Jan 15, 2020 at 9:39 PM Darrick J. Wong <darrick.wong@oracle.com> wrote:
> [..]
> > >         attempts to minimize software cache effects for both I/O and
> > >         memory mappings of this file.  It requires a file system which
> > >         has been configured to support DAX.
> > >
> > >         DAX generally assumes all accesses are via cpu load / store
> > >         instructions which can minimize overhead for small accesses, but
> > >         may adversely affect cpu utilization for large transfers.
> > >
> > >         File I/O is done directly to/from user-space buffers and memory
> > >         mapped I/O may be performed with direct memory mappings that
> > >         bypass kernel page cache.
> > >
> > >         While the DAX property tends to result in data being transferred
> > >         synchronously, it does not give the same guarantees of
> > >         synchronous I/O where data and the necessary metadata are
> > >         transferred together.
> >
> > (I'm frankly not sure that synchronous I/O actually guarantees that the
> > metadata has hit stable storage...)
> 
> Oh? That text was motivated by the open(2) man page description of O_SYNC.

Ugh. "synchronous I/O" means two different things, depending on
context. In the AIO context, it means "process context waits directly
for operation completion", but in the O_SYNC context, it means "we
guarantee data integrity for each I/O submitted".

Indeed, O_SYNC AIO is a thing. i.e. we can do an "async sync
write" to guarantee data integrity without directly waiting for
it. Now try describing that only using the words "synchronous
write" to describe both behaviours. :)

IOWs, if you are talking about data integrity, you need to
explicitly say "O_SYNC semantics", not "synchronous write", because
"synchronous write" is totally ambiguous without the O_SYNC context
of the open(2) man page...

Cheers,

Dave.
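
A minimal userspace sketch of the distinction drawn above, not code from
this series. Both calls below are "synchronous" in the sense that the
calling process waits for write(2) to return; only the one opened with
O_SYNC asks for the data-integrity semantics described in open(2). The
helper name write_once() and its parameters are made up for
illustration.

```c
#define _GNU_SOURCE
#include <fcntl.h>
#include <string.h>
#include <unistd.h>

/*
 * Hypothetical helper: write len bytes of buf to path.
 *
 * integrity == 0: plain blocking write; the caller waits for write(2)
 *                 to return, but nothing is promised about stable storage.
 * integrity != 0: same blocking call, but the file is opened with O_SYNC,
 *                 requesting O_SYNC semantics for each write.
 *
 * Returns the number of bytes written, or -1 on error.
 */
static ssize_t write_once(const char *path, const char *buf, size_t len,
			  int integrity)
{
	int flags = O_WRONLY | O_CREAT | O_TRUNC;
	ssize_t n;
	int fd;

	if (integrity)
		flags |= O_SYNC;

	fd = open(path, flags, 0600);
	if (fd < 0)
		return -1;

	n = write(fd, buf, len);

	/* Report a close() failure as an error as well. */
	if (close(fd) != 0 && n >= 0)
		n = -1;

	return n;
}
```

Calling write_once(path, msg, strlen(msg), 0) and then
write_once(path, msg, strlen(msg), 1) blocks the caller in both cases;
the difference is only in what is requested of the storage stack, which
is why "O_SYNC semantics" is the less ambiguous phrase.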

Patch

diff --git a/fs/stat.c b/fs/stat.c
index 030008796479..894699c74dde 100644
--- a/fs/stat.c
+++ b/fs/stat.c
@@ -79,6 +79,9 @@  int vfs_getattr_nosec(const struct path *path, struct kstat *stat,
 	if (IS_AUTOMOUNT(inode))
 		stat->attributes |= STATX_ATTR_AUTOMOUNT;
 
+	if (IS_DAX(inode))
+		stat->attributes |= STATX_ATTR_DAX;
+
 	if (inode->i_op->getattr)
 		return inode->i_op->getattr(path, stat, request_mask,
 					    query_flags);
diff --git a/include/uapi/linux/stat.h b/include/uapi/linux/stat.h
index ad80a5c885d5..e5f9d5517f6b 100644
--- a/include/uapi/linux/stat.h
+++ b/include/uapi/linux/stat.h
@@ -169,6 +169,7 @@  struct statx {
 #define STATX_ATTR_ENCRYPTED		0x00000800 /* [I] File requires key to decrypt in fs */
 #define STATX_ATTR_AUTOMOUNT		0x00001000 /* Dir: Automount trigger */
 #define STATX_ATTR_VERITY		0x00100000 /* [I] Verity protected file */
+#define STATX_ATTR_DAX			0x00002000 /* [I] File is DAX */
 
 
 #endif /* _UAPI_LINUX_STAT_H */
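
For illustration only, a sketch of how userspace might test the new
attribute through statx(2). It assumes a glibc that provides the statx()
wrapper (2.28 or later) and headers that define STATX_ATTR_DAX. The
numeric value in an RFC can change before merge, so the sketch does not
hard-code it and reports "unknown" when the headers lack the definition.
The helper name file_is_dax() is made up for this example.

```c
#define _GNU_SOURCE
#include <fcntl.h>	/* AT_FDCWD */
#include <string.h>
#include <sys/stat.h>	/* statx(), struct statx */

/*
 * Hypothetical helper.
 * Returns 1 if the kernel reports STATX_ATTR_DAX for path,
 *         0 if it does not,
 *        -1 if statx() fails or the headers do not know the attribute.
 */
static int file_is_dax(const char *path)
{
#ifdef STATX_ATTR_DAX
	struct statx stx;

	memset(&stx, 0, sizeof(stx));
	if (statx(AT_FDCWD, path, 0, STATX_BASIC_STATS, &stx) != 0)
		return -1;

	/* The attribute is a single bit in stx_attributes. */
	return (stx.stx_attributes & STATX_ATTR_DAX) ? 1 : 0;
#else
	(void)path;
	return -1;	/* attribute not defined by the installed headers */
#endif
}
```

A caller could use the result to decide, for example, whether attempting
a MAP_SYNC mapping is worthwhile; a clear bit only means the file is not
currently in the effective DAX state.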