diff mbox series

[2/2] statx: move STATX_ATTR_DAX attribute handling to filesystems

Message ID 05a0f4fd-7f62-8fbc-378d-886ccd5b3f11@redhat.com (mailing list archive)
State New, archived
Headers show
Series statx: Fix DAX attribute collision and handling | expand

Commit Message

Eric Sandeen Dec. 1, 2020, 4:59 p.m. UTC
It's a bit odd to set STATX_ATTR_DAX into the statx attributes in the VFS;
while the VFS can detect the current DAX state, it is the filesystem which
actually sets S_DAX on the inode, and the filesystem is the place that
knows whether DAX is something that the "filesystem actually supports" [1]
so that the statx attributes_mask can be properly set.

So, move STATX_ATTR_DAX attribute setting to the individual dax-capable
filesystems, and update the attributes_mask there as well.

[1] 3209f68b3ca4 statx: Include a mask for stx_attributes in struct statx

Signed-off-by: Eric Sandeen <sandeen@redhat.com>
---
 fs/ext2/inode.c   | 6 +++++-
 fs/ext4/inode.c   | 5 ++++-
 fs/stat.c         | 3 ---
 fs/xfs/xfs_iops.c | 5 ++++-
 4 files changed, 13 insertions(+), 6 deletions(-)

Comments

David Howells Dec. 1, 2020, 5:20 p.m. UTC | #1
Eric Sandeen <sandeen@redhat.com> wrote:

> -	if (IS_DAX(inode))
> -		stat->attributes |= STATX_ATTR_DAX;
> -

This could probably be left in and not distributed amongst the filesytems
provided that any filesystem that might turn it on sets the bit in the
attributes_mask.

I'm presuming that the core doesn't turn it on without the filesystem buying
in.

David
David Howells Dec. 1, 2020, 5:28 p.m. UTC | #2
David Howells <dhowells@redhat.com> wrote:

> > -	if (IS_DAX(inode))
> > -		stat->attributes |= STATX_ATTR_DAX;
> > -
> 
> This could probably be left in and not distributed amongst the filesytems
> provided that any filesystem that might turn it on sets the bit in the
> attributes_mask.

On further consideration, it's probably better to split it as you've done it.

Reviewed-by: David Howells <dhowells@redhat.com>

You do need one or more Fixes: lines, though.

David
Darrick J. Wong Dec. 1, 2020, 5:39 p.m. UTC | #3
On Tue, Dec 01, 2020 at 10:59:36AM -0600, Eric Sandeen wrote:
> It's a bit odd to set STATX_ATTR_DAX into the statx attributes in the VFS;
> while the VFS can detect the current DAX state, it is the filesystem which
> actually sets S_DAX on the inode, and the filesystem is the place that
> knows whether DAX is something that the "filesystem actually supports" [1]
> so that the statx attributes_mask can be properly set.
> 
> So, move STATX_ATTR_DAX attribute setting to the individual dax-capable
> filesystems, and update the attributes_mask there as well.
> 
> [1] 3209f68b3ca4 statx: Include a mask for stx_attributes in struct statx
> 
> Signed-off-by: Eric Sandeen <sandeen@redhat.com>
> ---
>  fs/ext2/inode.c   | 6 +++++-
>  fs/ext4/inode.c   | 5 ++++-
>  fs/stat.c         | 3 ---
>  fs/xfs/xfs_iops.c | 5 ++++-
>  4 files changed, 13 insertions(+), 6 deletions(-)
> 
> diff --git a/fs/ext2/inode.c b/fs/ext2/inode.c
> index 11c5c6fe75bb..3550783a6ea0 100644
> --- a/fs/ext2/inode.c
> +++ b/fs/ext2/inode.c
> @@ -1653,11 +1653,15 @@ int ext2_getattr(const struct path *path, struct kstat *stat,
>  		stat->attributes |= STATX_ATTR_IMMUTABLE;
>  	if (flags & EXT2_NODUMP_FL)
>  		stat->attributes |= STATX_ATTR_NODUMP;
> +	if (IS_DAX(inode))
> +		stat->attributes |= STATX_ATTR_DAX;
> +
>  	stat->attributes_mask |= (STATX_ATTR_APPEND |
>  			STATX_ATTR_COMPRESSED |
>  			STATX_ATTR_ENCRYPTED |
>  			STATX_ATTR_IMMUTABLE |
> -			STATX_ATTR_NODUMP);
> +			STATX_ATTR_NODUMP |
> +			STATX_ATTR_DAX);
>  
>  	generic_fillattr(inode, stat);
>  	return 0;
> diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c
> index 0d8385aea898..848a0f2b154e 100644
> --- a/fs/ext4/inode.c
> +++ b/fs/ext4/inode.c
> @@ -5550,13 +5550,16 @@ int ext4_getattr(const struct path *path, struct kstat *stat,
>  		stat->attributes |= STATX_ATTR_NODUMP;
>  	if (flags & EXT4_VERITY_FL)
>  		stat->attributes |= STATX_ATTR_VERITY;
> +	if (IS_DAX(inode))
> +		stat->attributes |= STATX_ATTR_DAX;
>  
>  	stat->attributes_mask |= (STATX_ATTR_APPEND |
>  				  STATX_ATTR_COMPRESSED |
>  				  STATX_ATTR_ENCRYPTED |
>  				  STATX_ATTR_IMMUTABLE |
>  				  STATX_ATTR_NODUMP |
> -				  STATX_ATTR_VERITY);
> +				  STATX_ATTR_VERITY |
> +				  STATX_ATTR_DAX);
>  
>  	generic_fillattr(inode, stat);
>  	return 0;
> diff --git a/fs/stat.c b/fs/stat.c
> index dacecdda2e79..5bd90949c69b 100644
> --- a/fs/stat.c
> +++ b/fs/stat.c
> @@ -80,9 +80,6 @@ int vfs_getattr_nosec(const struct path *path, struct kstat *stat,
>  	if (IS_AUTOMOUNT(inode))
>  		stat->attributes |= STATX_ATTR_AUTOMOUNT;
>  
> -	if (IS_DAX(inode))
> -		stat->attributes |= STATX_ATTR_DAX;
> -
>  	if (inode->i_op->getattr)
>  		return inode->i_op->getattr(path, stat, request_mask,
>  					    query_flags);
> diff --git a/fs/xfs/xfs_iops.c b/fs/xfs/xfs_iops.c
> index 1414ab79eacf..56deda7042fd 100644
> --- a/fs/xfs/xfs_iops.c
> +++ b/fs/xfs/xfs_iops.c
> @@ -575,10 +575,13 @@ xfs_vn_getattr(
>  		stat->attributes |= STATX_ATTR_APPEND;
>  	if (ip->i_d.di_flags & XFS_DIFLAG_NODUMP)
>  		stat->attributes |= STATX_ATTR_NODUMP;
> +	if (IS_DAX(inode))
> +		stat->attributes |= STATX_ATTR_DAX;
>  
>  	stat->attributes_mask |= (STATX_ATTR_IMMUTABLE |
>  				  STATX_ATTR_APPEND |
> -				  STATX_ATTR_NODUMP);
> +				  STATX_ATTR_NODUMP |
> +				  STATX_ATTR_DAX);

TBH I preferred your previous iteration on this, which only set the DAX
bit in the attributes_mask if the underlying storage was pmem and the
blocksize was correct, etc. since it made it easier to distinguish
between a filesystem where you /could/ have DAX (but it wasn't currently
enabled) and a filesystem where it just isn't possible.

That might be enough to satisfy any critics who want to be able to
detect DAX support from an installer program.

--D

>  
>  	switch (inode->i_mode & S_IFMT) {
>  	case S_IFBLK:
> -- 
> 2.17.0
> 
>
Eric Sandeen Dec. 1, 2020, 5:53 p.m. UTC | #4
On 12/1/20 11:39 AM, Darrick J. Wong wrote:
>> diff --git a/fs/xfs/xfs_iops.c b/fs/xfs/xfs_iops.c
>> index 1414ab79eacf..56deda7042fd 100644
>> --- a/fs/xfs/xfs_iops.c
>> +++ b/fs/xfs/xfs_iops.c
>> @@ -575,10 +575,13 @@ xfs_vn_getattr(
>>  		stat->attributes |= STATX_ATTR_APPEND;
>>  	if (ip->i_d.di_flags & XFS_DIFLAG_NODUMP)
>>  		stat->attributes |= STATX_ATTR_NODUMP;
>> +	if (IS_DAX(inode))
>> +		stat->attributes |= STATX_ATTR_DAX;
>>  
>>  	stat->attributes_mask |= (STATX_ATTR_IMMUTABLE |
>>  				  STATX_ATTR_APPEND |
>> -				  STATX_ATTR_NODUMP);
>> +				  STATX_ATTR_NODUMP |
>> +				  STATX_ATTR_DAX);
> TBH I preferred your previous iteration on this, which only set the DAX
> bit in the attributes_mask if the underlying storage was pmem and the
> blocksize was correct, etc. since it made it easier to distinguish
> between a filesystem where you /could/ have DAX (but it wasn't currently
> enabled) and a filesystem where it just isn't possible.
> 
> That might be enough to satisfy any critics who want to be able to
> detect DAX support from an installer program.

(nb: that previous iteration wasn't in public, just something I talked to
Darrick about)

I'm sympathetic to that argument, but the exact details of what the mask means
in general, and for dax in particular, seems to be subject to ongoing debate.

I'd like to just set it with the simplest definition "the fileystem supports
the feature" for now, so that we aren't ever setting a feature that's omitted
from the mask, and refine the mask-setting for the dax flag in another
iteration if/when we reach agreement.

-Eric

> --D
>
Linus Torvalds Dec. 1, 2020, 8:04 p.m. UTC | #5
On Tue, Dec 1, 2020 at 8:59 AM Eric Sandeen <sandeen@redhat.com> wrote:
>
> It's a bit odd to set STATX_ATTR_DAX into the statx attributes in the VFS;
> while the VFS can detect the current DAX state, it is the filesystem which
> actually sets S_DAX on the inode, and the filesystem is the place that
> knows whether DAX is something that the "filesystem actually supports" [1]
> so that the statx attributes_mask can be properly set.
>
> So, move STATX_ATTR_DAX attribute setting to the individual dax-capable
> filesystems, and update the attributes_mask there as well.

I'm not really understanding the logic behind this.

The whole IS_DAX(inode) thing exists in various places outside the
low-level filesystem, why shouldn't stat() do this?

If IS_DAX() is incorrect, then we have much bigger problems than some
stat results. We have core functions like generic_file_read_iter() etc
all making actual behavioral judgements on IS_DAX().

And if IS_DAX() is correct, then why shouldn't this just be done in
generic code? Why move it to every individual filesystem?

               Linus
Eric Sandeen Dec. 1, 2020, 8:50 p.m. UTC | #6
On 12/1/20 2:04 PM, Linus Torvalds wrote:
> On Tue, Dec 1, 2020 at 8:59 AM Eric Sandeen <sandeen@redhat.com> wrote:
>>
>> It's a bit odd to set STATX_ATTR_DAX into the statx attributes in the VFS;
>> while the VFS can detect the current DAX state, it is the filesystem which
>> actually sets S_DAX on the inode, and the filesystem is the place that
>> knows whether DAX is something that the "filesystem actually supports" [1]
>> so that the statx attributes_mask can be properly set.
>>
>> So, move STATX_ATTR_DAX attribute setting to the individual dax-capable
>> filesystems, and update the attributes_mask there as well.
> 
> I'm not really understanding the logic behind this.
> 
> The whole IS_DAX(inode) thing exists in various places outside the
> low-level filesystem, why shouldn't stat() do this?
> 
> If IS_DAX() is incorrect, then we have much bigger problems than some
> stat results. We have core functions like generic_file_read_iter() etc
> all making actual behavioral judgements on IS_DAX().

It's not incorrect, I didn't mean to imply that. Current code does accurately
set the DAX flag in the statx attributes.
 
> And if IS_DAX() is correct, then why shouldn't this just be done in
> generic code? Why move it to every individual filesystem?

At the end of the day, it's because only the individual filesystems can
manage the dax flag in the statx attributes_mask. (That's only place that
knows if dax "is available" in general, as opposed to being set on a specific
inode) So if they have to do that, they may as well set the actual attribute
as well, like they do for every other flag they manage...

I mean, we could leave the statx->attributes setting in the vfs, and add
the statx->attributes_mask setting to each dax-capable filesystem. That just
felt a bit asymmetric vs. the way every other filesystem-specific flag gets
handled.

-Eric
Dave Chinner Dec. 1, 2020, 8:52 p.m. UTC | #7
On Tue, Dec 01, 2020 at 09:39:05AM -0800, Darrick J. Wong wrote:
> On Tue, Dec 01, 2020 at 10:59:36AM -0600, Eric Sandeen wrote:
> > It's a bit odd to set STATX_ATTR_DAX into the statx attributes in the VFS;
> > while the VFS can detect the current DAX state, it is the filesystem which
> > actually sets S_DAX on the inode, and the filesystem is the place that
> > knows whether DAX is something that the "filesystem actually supports" [1]
> > so that the statx attributes_mask can be properly set.
> > 
> > So, move STATX_ATTR_DAX attribute setting to the individual dax-capable
> > filesystems, and update the attributes_mask there as well.
> > 
> > [1] 3209f68b3ca4 statx: Include a mask for stx_attributes in struct statx
> > 
> > Signed-off-by: Eric Sandeen <sandeen@redhat.com>
> > ---
> >  fs/ext2/inode.c   | 6 +++++-
> >  fs/ext4/inode.c   | 5 ++++-
> >  fs/stat.c         | 3 ---
> >  fs/xfs/xfs_iops.c | 5 ++++-
> >  4 files changed, 13 insertions(+), 6 deletions(-)
> > 
> > diff --git a/fs/ext2/inode.c b/fs/ext2/inode.c
> > index 11c5c6fe75bb..3550783a6ea0 100644
> > --- a/fs/ext2/inode.c
> > +++ b/fs/ext2/inode.c
> > @@ -1653,11 +1653,15 @@ int ext2_getattr(const struct path *path, struct kstat *stat,
> >  		stat->attributes |= STATX_ATTR_IMMUTABLE;
> >  	if (flags & EXT2_NODUMP_FL)
> >  		stat->attributes |= STATX_ATTR_NODUMP;
> > +	if (IS_DAX(inode))
> > +		stat->attributes |= STATX_ATTR_DAX;
> > +
> >  	stat->attributes_mask |= (STATX_ATTR_APPEND |
> >  			STATX_ATTR_COMPRESSED |
> >  			STATX_ATTR_ENCRYPTED |
> >  			STATX_ATTR_IMMUTABLE |
> > -			STATX_ATTR_NODUMP);
> > +			STATX_ATTR_NODUMP |
> > +			STATX_ATTR_DAX);
> >  
> >  	generic_fillattr(inode, stat);
> >  	return 0;
> > diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c
> > index 0d8385aea898..848a0f2b154e 100644
> > --- a/fs/ext4/inode.c
> > +++ b/fs/ext4/inode.c
> > @@ -5550,13 +5550,16 @@ int ext4_getattr(const struct path *path, struct kstat *stat,
> >  		stat->attributes |= STATX_ATTR_NODUMP;
> >  	if (flags & EXT4_VERITY_FL)
> >  		stat->attributes |= STATX_ATTR_VERITY;
> > +	if (IS_DAX(inode))
> > +		stat->attributes |= STATX_ATTR_DAX;
> >  
> >  	stat->attributes_mask |= (STATX_ATTR_APPEND |
> >  				  STATX_ATTR_COMPRESSED |
> >  				  STATX_ATTR_ENCRYPTED |
> >  				  STATX_ATTR_IMMUTABLE |
> >  				  STATX_ATTR_NODUMP |
> > -				  STATX_ATTR_VERITY);
> > +				  STATX_ATTR_VERITY |
> > +				  STATX_ATTR_DAX);
> >  
> >  	generic_fillattr(inode, stat);
> >  	return 0;
> > diff --git a/fs/stat.c b/fs/stat.c
> > index dacecdda2e79..5bd90949c69b 100644
> > --- a/fs/stat.c
> > +++ b/fs/stat.c
> > @@ -80,9 +80,6 @@ int vfs_getattr_nosec(const struct path *path, struct kstat *stat,
> >  	if (IS_AUTOMOUNT(inode))
> >  		stat->attributes |= STATX_ATTR_AUTOMOUNT;
> >  
> > -	if (IS_DAX(inode))
> > -		stat->attributes |= STATX_ATTR_DAX;
> > -
> >  	if (inode->i_op->getattr)
> >  		return inode->i_op->getattr(path, stat, request_mask,
> >  					    query_flags);
> > diff --git a/fs/xfs/xfs_iops.c b/fs/xfs/xfs_iops.c
> > index 1414ab79eacf..56deda7042fd 100644
> > --- a/fs/xfs/xfs_iops.c
> > +++ b/fs/xfs/xfs_iops.c
> > @@ -575,10 +575,13 @@ xfs_vn_getattr(
> >  		stat->attributes |= STATX_ATTR_APPEND;
> >  	if (ip->i_d.di_flags & XFS_DIFLAG_NODUMP)
> >  		stat->attributes |= STATX_ATTR_NODUMP;
> > +	if (IS_DAX(inode))
> > +		stat->attributes |= STATX_ATTR_DAX;
> >  
> >  	stat->attributes_mask |= (STATX_ATTR_IMMUTABLE |
> >  				  STATX_ATTR_APPEND |
> > -				  STATX_ATTR_NODUMP);
> > +				  STATX_ATTR_NODUMP |
> > +				  STATX_ATTR_DAX);
> 
> TBH I preferred your previous iteration on this, which only set the DAX
> bit in the attributes_mask if the underlying storage was pmem and the
> blocksize was correct, etc. since it made it easier to distinguish
> between a filesystem where you /could/ have DAX (but it wasn't currently
> enabled) and a filesystem where it just isn't possible.

I think that's the only thing that makes sense from a userspace
perspective. THe man page explicitly says that:

  stx_attributes_mask
	A mask indicating which bits in stx_attributes are supported
	by the VFS and the filesystem.

So if DAX can never be turned on for that filesystem instance then,
by definition, it does not support DAX and the bit should never be
set.

e.g. We don't talk about kernels that support reflink - what matters
to userspace is whether the filesystem instance supports reflink.
Think of the useless mess that xfs_info would be if it reported
kernel capabilities instead of filesystem instance capabilities.
i.e. we don't report that a filesystem supports reflink just because
the kernel supports it - it reports whether the filesystem instance
being queried supports reflink. And that also implies the kernel
supports it, because the kernel has to support it to mount the
filesystem...

So, yeah, I think it really does need to be conditional on the
filesystem instance being queried to be actually useful to users....

Cheers,

Dave.
David Howells Dec. 1, 2020, 9:04 p.m. UTC | #8
Linus Torvalds <torvalds@linux-foundation.org> wrote:

> And if IS_DAX() is correct, then why shouldn't this just be done in
> generic code? Why move it to every individual filesystem?

One way of looking at it is that the check is then done for every filesystem -
most of which don't support it.  Not sure whether that's a big enough problem
to worry about.  The same is true of the automount test too, I suppose...

David
Eric Sandeen Dec. 1, 2020, 10:03 p.m. UTC | #9
On 12/1/20 2:52 PM, Dave Chinner wrote:
> On Tue, Dec 01, 2020 at 09:39:05AM -0800, Darrick J. Wong wrote:
>> On Tue, Dec 01, 2020 at 10:59:36AM -0600, Eric Sandeen wrote:
>>> It's a bit odd to set STATX_ATTR_DAX into the statx attributes in the VFS;
>>> while the VFS can detect the current DAX state, it is the filesystem which
>>> actually sets S_DAX on the inode, and the filesystem is the place that
>>> knows whether DAX is something that the "filesystem actually supports" [1]
>>> so that the statx attributes_mask can be properly set.
>>>
>>> So, move STATX_ATTR_DAX attribute setting to the individual dax-capable
>>> filesystems, and update the attributes_mask there as well.
>>>
>>> [1] 3209f68b3ca4 statx: Include a mask for stx_attributes in struct statx
>>>
>>> Signed-off-by: Eric Sandeen <sandeen@redhat.com>
>>> ---
>>>  fs/ext2/inode.c   | 6 +++++-
>>>  fs/ext4/inode.c   | 5 ++++-
>>>  fs/stat.c         | 3 ---
>>>  fs/xfs/xfs_iops.c | 5 ++++-
>>>  4 files changed, 13 insertions(+), 6 deletions(-)
>>>
>>> diff --git a/fs/ext2/inode.c b/fs/ext2/inode.c
>>> index 11c5c6fe75bb..3550783a6ea0 100644
>>> --- a/fs/ext2/inode.c
>>> +++ b/fs/ext2/inode.c
>>> @@ -1653,11 +1653,15 @@ int ext2_getattr(const struct path *path, struct kstat *stat,
>>>  		stat->attributes |= STATX_ATTR_IMMUTABLE;
>>>  	if (flags & EXT2_NODUMP_FL)
>>>  		stat->attributes |= STATX_ATTR_NODUMP;
>>> +	if (IS_DAX(inode))
>>> +		stat->attributes |= STATX_ATTR_DAX;
>>> +
>>>  	stat->attributes_mask |= (STATX_ATTR_APPEND |
>>>  			STATX_ATTR_COMPRESSED |
>>>  			STATX_ATTR_ENCRYPTED |
>>>  			STATX_ATTR_IMMUTABLE |
>>> -			STATX_ATTR_NODUMP);
>>> +			STATX_ATTR_NODUMP |
>>> +			STATX_ATTR_DAX);
>>>  
>>>  	generic_fillattr(inode, stat);
>>>  	return 0;
>>> diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c
>>> index 0d8385aea898..848a0f2b154e 100644
>>> --- a/fs/ext4/inode.c
>>> +++ b/fs/ext4/inode.c
>>> @@ -5550,13 +5550,16 @@ int ext4_getattr(const struct path *path, struct kstat *stat,
>>>  		stat->attributes |= STATX_ATTR_NODUMP;
>>>  	if (flags & EXT4_VERITY_FL)
>>>  		stat->attributes |= STATX_ATTR_VERITY;
>>> +	if (IS_DAX(inode))
>>> +		stat->attributes |= STATX_ATTR_DAX;
>>>  
>>>  	stat->attributes_mask |= (STATX_ATTR_APPEND |
>>>  				  STATX_ATTR_COMPRESSED |
>>>  				  STATX_ATTR_ENCRYPTED |
>>>  				  STATX_ATTR_IMMUTABLE |
>>>  				  STATX_ATTR_NODUMP |
>>> -				  STATX_ATTR_VERITY);
>>> +				  STATX_ATTR_VERITY |
>>> +				  STATX_ATTR_DAX);
>>>  
>>>  	generic_fillattr(inode, stat);
>>>  	return 0;
>>> diff --git a/fs/stat.c b/fs/stat.c
>>> index dacecdda2e79..5bd90949c69b 100644
>>> --- a/fs/stat.c
>>> +++ b/fs/stat.c
>>> @@ -80,9 +80,6 @@ int vfs_getattr_nosec(const struct path *path, struct kstat *stat,
>>>  	if (IS_AUTOMOUNT(inode))
>>>  		stat->attributes |= STATX_ATTR_AUTOMOUNT;
>>>  
>>> -	if (IS_DAX(inode))
>>> -		stat->attributes |= STATX_ATTR_DAX;
>>> -
>>>  	if (inode->i_op->getattr)
>>>  		return inode->i_op->getattr(path, stat, request_mask,
>>>  					    query_flags);
>>> diff --git a/fs/xfs/xfs_iops.c b/fs/xfs/xfs_iops.c
>>> index 1414ab79eacf..56deda7042fd 100644
>>> --- a/fs/xfs/xfs_iops.c
>>> +++ b/fs/xfs/xfs_iops.c
>>> @@ -575,10 +575,13 @@ xfs_vn_getattr(
>>>  		stat->attributes |= STATX_ATTR_APPEND;
>>>  	if (ip->i_d.di_flags & XFS_DIFLAG_NODUMP)
>>>  		stat->attributes |= STATX_ATTR_NODUMP;
>>> +	if (IS_DAX(inode))
>>> +		stat->attributes |= STATX_ATTR_DAX;
>>>  
>>>  	stat->attributes_mask |= (STATX_ATTR_IMMUTABLE |
>>>  				  STATX_ATTR_APPEND |
>>> -				  STATX_ATTR_NODUMP);
>>> +				  STATX_ATTR_NODUMP |
>>> +				  STATX_ATTR_DAX);
>>
>> TBH I preferred your previous iteration on this, which only set the DAX
>> bit in the attributes_mask if the underlying storage was pmem and the
>> blocksize was correct, etc. since it made it easier to distinguish
>> between a filesystem where you /could/ have DAX (but it wasn't currently
>> enabled) and a filesystem where it just isn't possible.
> 
> I think that's the only thing that makes sense from a userspace
> perspective. THe man page explicitly says that:
> 
>   stx_attributes_mask
> 	A mask indicating which bits in stx_attributes are supported
> 	by the VFS and the filesystem.
> 
> So if DAX can never be turned on for that filesystem instance then,
> by definition, it does not support DAX and the bit should never be
> set.
> 
> e.g. We don't talk about kernels that support reflink - what matters
> to userspace is whether the filesystem instance supports reflink.
> Think of the useless mess that xfs_info would be if it reported
> kernel capabilities instead of filesystem instance capabilities.
> i.e. we don't report that a filesystem supports reflink just because
> the kernel supports it - it reports whether the filesystem instance
> being queried supports reflink. And that also implies the kernel
> supports it, because the kernel has to support it to mount the
> filesystem...
> 
> So, yeah, I think it really does need to be conditional on the
> filesystem instance being queried to be actually useful to users....

So now we're back to "attributes_mask, how does it work?"

The original implementation, as written by the statx interface author, added:

diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c
index 5d02b922afa3..b9ffa9f4191f 100644
--- a/fs/ext4/inode.c
+++ b/fs/ext4/inode.c
@@ -5413,6 +5413,12 @@ int ext4_getattr(const struct path *path, struct kstat *stat,
        if (flags & EXT4_NODUMP_FL)
                stat->attributes |= STATX_ATTR_NODUMP;
 
+       stat->attributes_mask |= (STATX_ATTR_APPEND |
+                                 STATX_ATTR_COMPRESSED |
+                                 STATX_ATTR_ENCRYPTED |
+                                 STATX_ATTR_IMMUTABLE |
+                                 STATX_ATTR_NODUMP);
+
        generic_fillattr(inode, stat);
        return 0;
 }

setting all those flags /unconditionally/ i.e. STATX_ATTR_ENCRYPTED is always
set in the mask, even if CONFIG_FS_ENCRYPTION=n

And as for compression, that's even better ...

so, um... 

That's why I was keen to just add DAX unconditionally at this point, and if we want
to invent/refine meanings for the mask, we can still try to do that?

-Eric
Linus Torvalds Dec. 1, 2020, 10:09 p.m. UTC | #10
On Tue, Dec 1, 2020 at 1:04 PM David Howells <dhowells@redhat.com> wrote:
>
> Linus Torvalds <torvalds@linux-foundation.org> wrote:
>
> > And if IS_DAX() is correct, then why shouldn't this just be done in
> > generic code? Why move it to every individual filesystem?
>
> One way of looking at it is that the check is then done for every filesystem -
> most of which don't support it.  Not sure whether that's a big enough problem
> to worry about.  The same is true of the automount test too, I suppose...

So I'd rather have it in one single place than spread out in the filesystems.

Especially when it turns out that the STATX_ATTR_DAX bitmask value was
wrong - now clearly it doesn't seem to currently *matter* to anything,
but imagine if we had to have some strange compat rule to fix things
up with stat() versioning or similar. That's exactly the kind of code
we would _not_ want in every filesystem.

So basically, the thing that argues against this patch is that it
seems to just duplicate things inside filesystems, when the VFS layter
already has the information.

Now, if the VFS information was possibly stale or wrong, that woudl be
one thing. But then we'd have other and bigger  problems elsewhere as
far as I can tell.

IOW - make generic what can be made generic, and try to avoid having
filesystems do their own thing.

[ Replace "filesystems" by "architectures" or whatever else, this is
obviously not a filesystem-specific rule in general. ]

And don't get me wrong - I don't _hate_ the patch, and I don't care
_that_ deeply, but it just doesn't seem to make any sense to me. My
initial query was really about "what am I missing - can you please
flesh out the commit message because I don't understand what's wrong".

                Linus
Linus Torvalds Dec. 1, 2020, 10:12 p.m. UTC | #11
On Tue, Dec 1, 2020 at 2:03 PM Eric Sandeen <sandeen@sandeen.net> wrote:
>
> That's why I was keen to just add DAX unconditionally at this point, and if we want
> to invent/refine meanings for the mask, we can still try to do that?

Oh Gods.  Let's *not* make this some "random filesystem choice" where
now semantics depends on what some filesystem decided to do, and
different filesystems have very subtly different semantics.

This all screams "please keep this in the VFS layer" so that we at
least have _one_ place where these kinds of decisions are made.

I suspect very very few people actually end up caring about any of the
STATX flags at all, of course. The fact that the DAX one was
apparently entirely the wrong bit argues that this isn't all that
important.

               Linus
Eric Sandeen Dec. 1, 2020, 10:26 p.m. UTC | #12
On 12/1/20 4:12 PM, Linus Torvalds wrote:
> On Tue, Dec 1, 2020 at 2:03 PM Eric Sandeen <sandeen@sandeen.net> wrote:
>>
>> That's why I was keen to just add DAX unconditionally at this point, and if we want
>> to invent/refine meanings for the mask, we can still try to do that?
> 
> Oh Gods.  Let's *not* make this some "random filesystem choice" where
> now semantics depends on what some filesystem decided to do, and
> different filesystems have very subtly different semantics.

Well, I had hoped that refinement might start with the interface
documentation, I'm certainly not suggesting every filesystem should go
its own way.
> This all screams "please keep this in the VFS layer" so that we at
> least have _one_ place where these kinds of decisions are made.

Making the "right decision" depends on what the mask actually means,
I guess.

Today we set a DAX attribute in statx which is not set in the mask.
That seems clearly broken.

We can either leave that as it is, set DAX in the mask for everyone in
the VFS, or delegate that mask-setting task to the individual filesystems
so that it reflects <something>, probably "can this inode ever be in dax
mode?"

I honestly don't care if we keep setting the attribute itself in the VFS;
if that's the right thing to do, that's fine.  (If so, it seems like
IS_IMMUTABLE -> STATX_ATTR_IMMUTABLE etc could be moved there, too.)

-Eric
Eric Sandeen Dec. 2, 2020, 12:11 a.m. UTC | #13
On 12/1/20 4:09 PM, Linus Torvalds wrote:
> So basically, the thing that argues against this patch is that it
> seems to just duplicate things inside filesystems, when the VFS layter
> already has the information.
> 
> Now, if the VFS information was possibly stale or wrong, that woudl be
> one thing. But then we'd have other and bigger  problems elsewhere as
> far as I can tell.
> 
> IOW - make generic what can be made generic, and try to avoid having
> filesystems do their own thing.
> 
> [ Replace "filesystems" by "architectures" or whatever else, this is
> obviously not a filesystem-specific rule in general. ]
> 
> And don't get me wrong - I don't _hate_ the patch, and I don't care
> _that_ deeply, but it just doesn't seem to make any sense to me. My
> initial query was really about "what am I missing - can you please
> flesh out the commit message because I don't understand what's wrong".

Backing way up, my motivation was: Only the filesystem can appropriately
set the statx->attributes_mask, so it has to be done there. Since that
has to be done in the filesystem, set the actual attribute flag adjacent
to it, as is done for ~every other flag.

*shrug*

In any case I resent the flag value clash fix on a separate thread as
V2, hopefully that one is straightforward enough to go in.

Thanks,
-Eric
Ira Weiny Dec. 2, 2020, 2:29 a.m. UTC | #14
On Tue, Dec 01, 2020 at 04:26:43PM -0600, Eric Sandeen wrote:
> On 12/1/20 4:12 PM, Linus Torvalds wrote:
> > On Tue, Dec 1, 2020 at 2:03 PM Eric Sandeen <sandeen@sandeen.net> wrote:
> >>
> >> That's why I was keen to just add DAX unconditionally at this point, and if we want
> >> to invent/refine meanings for the mask, we can still try to do that?
> > 
> > Oh Gods.  Let's *not* make this some "random filesystem choice" where
> > now semantics depends on what some filesystem decided to do, and
> > different filesystems have very subtly different semantics.
> 
> Well, I had hoped that refinement might start with the interface
> documentation, I'm certainly not suggesting every filesystem should go
> its own way.
> > This all screams "please keep this in the VFS layer" so that we at
> > least have _one_ place where these kinds of decisions are made.
> 
> Making the "right decision" depends on what the mask actually means,
> I guess.
> 
> Today we set a DAX attribute in statx which is not set in the mask.
> That seems clearly broken.

Yes...  and no.  You can't set the statx DAX flag directly.  It is only an
indicator of the current file state.  That state is affected by the
inode mode and the DAX mount option.

But I agree that having a bit set when the corresponding mask is 0 is odd.

> 
> We can either leave that as it is, set DAX in the mask for everyone in
> the VFS, or delegate that mask-setting task to the individual filesystems
> so that it reflects <something>, probably "can this inode ever be in dax
> mode?"
> 
> I honestly don't care if we keep setting the attribute itself in the VFS;
> if that's the right thing to do, that's fine.  (If so, it seems like
> IS_IMMUTABLE -> STATX_ATTR_IMMUTABLE etc could be moved there, too.)

The reason I put it in the VFS layer was that the statx was intended to be a
VFS indication of the state of the inode.  This differs from the FS_XFLAG_DAX
which is a state of the on-disk inode.  The VFS IS_DAX can be altered by the
mount option forcing DAX or no-DAX.

So there was a reason for having it at that level.

But it is true that only FS's which support DAX will ever set IS_DAX() and
having the attribute set near the mask probably makes the code a bit more
clear.

So I'm not opposed to the patch either.  But users can't ever set the flag
directly so I'm not sure if it being in the mask is going to confuse anyone.

Ira

> 
> -Eric
Christoph Hellwig Dec. 2, 2020, 8:03 a.m. UTC | #15
On Tue, Dec 01, 2020 at 02:12:19PM -0800, Linus Torvalds wrote:
> On Tue, Dec 1, 2020 at 2:03 PM Eric Sandeen <sandeen@sandeen.net> wrote:
> >
> > That's why I was keen to just add DAX unconditionally at this point, and if we want
> > to invent/refine meanings for the mask, we can still try to do that?
> 
> Oh Gods.  Let's *not* make this some "random filesystem choice" where
> now semantics depends on what some filesystem decided to do, and
> different filesystems have very subtly different semantics.
> 
> This all screams "please keep this in the VFS layer" so that we at
> least have _one_ place where these kinds of decisions are made.
> 
> I suspect very very few people actually end up caring about any of the
> STATX flags at all, of course. The fact that the DAX one was
> apparently entirely the wrong bit argues that this isn't all that
> important.

Agreed.  That whole support interface is just weird.  But the only
thing that remotely makes (a little bit of) sense is to just set all
bits known about by this particular kernel in the VFS.  Everything else
is going to be a complete mess.
diff mbox series

Patch

diff --git a/fs/ext2/inode.c b/fs/ext2/inode.c
index 11c5c6fe75bb..3550783a6ea0 100644
--- a/fs/ext2/inode.c
+++ b/fs/ext2/inode.c
@@ -1653,11 +1653,15 @@  int ext2_getattr(const struct path *path, struct kstat *stat,
 		stat->attributes |= STATX_ATTR_IMMUTABLE;
 	if (flags & EXT2_NODUMP_FL)
 		stat->attributes |= STATX_ATTR_NODUMP;
+	if (IS_DAX(inode))
+		stat->attributes |= STATX_ATTR_DAX;
+
 	stat->attributes_mask |= (STATX_ATTR_APPEND |
 			STATX_ATTR_COMPRESSED |
 			STATX_ATTR_ENCRYPTED |
 			STATX_ATTR_IMMUTABLE |
-			STATX_ATTR_NODUMP);
+			STATX_ATTR_NODUMP |
+			STATX_ATTR_DAX);
 
 	generic_fillattr(inode, stat);
 	return 0;
diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c
index 0d8385aea898..848a0f2b154e 100644
--- a/fs/ext4/inode.c
+++ b/fs/ext4/inode.c
@@ -5550,13 +5550,16 @@  int ext4_getattr(const struct path *path, struct kstat *stat,
 		stat->attributes |= STATX_ATTR_NODUMP;
 	if (flags & EXT4_VERITY_FL)
 		stat->attributes |= STATX_ATTR_VERITY;
+	if (IS_DAX(inode))
+		stat->attributes |= STATX_ATTR_DAX;
 
 	stat->attributes_mask |= (STATX_ATTR_APPEND |
 				  STATX_ATTR_COMPRESSED |
 				  STATX_ATTR_ENCRYPTED |
 				  STATX_ATTR_IMMUTABLE |
 				  STATX_ATTR_NODUMP |
-				  STATX_ATTR_VERITY);
+				  STATX_ATTR_VERITY |
+				  STATX_ATTR_DAX);
 
 	generic_fillattr(inode, stat);
 	return 0;
diff --git a/fs/stat.c b/fs/stat.c
index dacecdda2e79..5bd90949c69b 100644
--- a/fs/stat.c
+++ b/fs/stat.c
@@ -80,9 +80,6 @@  int vfs_getattr_nosec(const struct path *path, struct kstat *stat,
 	if (IS_AUTOMOUNT(inode))
 		stat->attributes |= STATX_ATTR_AUTOMOUNT;
 
-	if (IS_DAX(inode))
-		stat->attributes |= STATX_ATTR_DAX;
-
 	if (inode->i_op->getattr)
 		return inode->i_op->getattr(path, stat, request_mask,
 					    query_flags);
diff --git a/fs/xfs/xfs_iops.c b/fs/xfs/xfs_iops.c
index 1414ab79eacf..56deda7042fd 100644
--- a/fs/xfs/xfs_iops.c
+++ b/fs/xfs/xfs_iops.c
@@ -575,10 +575,13 @@  xfs_vn_getattr(
 		stat->attributes |= STATX_ATTR_APPEND;
 	if (ip->i_d.di_flags & XFS_DIFLAG_NODUMP)
 		stat->attributes |= STATX_ATTR_NODUMP;
+	if (IS_DAX(inode))
+		stat->attributes |= STATX_ATTR_DAX;
 
 	stat->attributes_mask |= (STATX_ATTR_IMMUTABLE |
 				  STATX_ATTR_APPEND |
-				  STATX_ATTR_NODUMP);
+				  STATX_ATTR_NODUMP |
+				  STATX_ATTR_DAX);
 
 	switch (inode->i_mode & S_IFMT) {
 	case S_IFBLK: