diff mbox

[v5,1/3] xfs: Show realtime device stats on statfs calls if inherit flag set

Message ID 20170925194418.720146-2-rwareing@fb.com (mailing list archive)
State Superseded, archived
Headers show

Commit Message

Richard Wareing Sept. 25, 2017, 7:44 p.m. UTC
- Reports realtime device free blocks in statfs calls if inheritance
  bit is set on the inode of directory.  This is a bit more intuitive,
  especially for use-cases which are using a much larger device for
  the realtime device.
- Add XFS_IS_REALTIME_MOUNT option to gate based on the existence of a
  realtime device on the mount, similar to the XFS_IS_REALTIME_INODE
  option.

Signed-off-by: Richard Wareing <rwareing@fb.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
---
Changes since v4:
* None

Changes since v3:
* Fixed accounting bug, we are not required to substract m_alloc_set_aside
  as this is a data device only requirement.
* Added XFS_IS_REALTIME_MOUNT macro based on learnings from CVE-2017-14340,
  now provides similar gating on the mount as XFS_IS_REALTIME_INODE does
  for the inode.

Changes since v2:
* Style updated per Christoph Hellwig's comment
* Fixed bug: statp->f_bavail = statp->f_bfree

 fs/xfs/xfs_linux.h | 2 ++
 fs/xfs/xfs_super.c | 8 ++++++++
 2 files changed, 10 insertions(+)

Comments

Eric Sandeen Sept. 25, 2017, 10:53 p.m. UTC | #1
On 9/25/17 2:44 PM, Richard Wareing wrote:
> - Reports realtime device free blocks in statfs calls if inheritance
>   bit is set on the inode of directory.  This is a bit more intuitive,
>   especially for use-cases which are using a much larger device for
>   the realtime device.
> - Add XFS_IS_REALTIME_MOUNT option to gate based on the existence of a
>   realtime device on the mount, similar to the XFS_IS_REALTIME_INODE
>   option.

Sorry for chiming in late.

So, this is a major change in behavior of statfs on xfs.  How will the
user know this, where is it documented?
 
Also, if I read this right, does that mean that you get the magical
new statfs results if you stat a /directory/ with XFS_DIFLAG_RTINHERIT
set, but not if you stat a /file/ with XFS_DIFLAG_REALTIME set?

That seems quite confusing, but maybe I'm missing something obvious.

As for which filesystem is getting reported, would it be totally 
bonkers to have an xfs-realtime f_type to make it 100% clear what's
being reported?

-Eric

> Signed-off-by: Richard Wareing <rwareing@fb.com>
> Reviewed-by: Christoph Hellwig <hch@lst.de>
> Reviewed-by: Dave Chinner <dchinner@redhat.com>
> ---
> Changes since v4:
> * None
> 
> Changes since v3:
> * Fixed accounting bug, we are not required to substract m_alloc_set_aside
>   as this is a data device only requirement.
> * Added XFS_IS_REALTIME_MOUNT macro based on learnings from CVE-2017-14340,
>   now provides similar gating on the mount as XFS_IS_REALTIME_INODE does
>   for the inode.
> 
> Changes since v2:
> * Style updated per Christoph Hellwig's comment
> * Fixed bug: statp->f_bavail = statp->f_bfree
> 
>  fs/xfs/xfs_linux.h | 2 ++
>  fs/xfs/xfs_super.c | 8 ++++++++
>  2 files changed, 10 insertions(+)
> 
> diff --git a/fs/xfs/xfs_linux.h b/fs/xfs/xfs_linux.h
> index 044fb0e..fe46e71 100644
> --- a/fs/xfs/xfs_linux.h
> +++ b/fs/xfs/xfs_linux.h
> @@ -280,8 +280,10 @@ static inline __uint64_t howmany_64(__uint64_t x, __uint32_t y)
>  
>  #ifdef CONFIG_XFS_RT
>  #define XFS_IS_REALTIME_INODE(ip) ((ip)->i_d.di_flags & XFS_DIFLAG_REALTIME)
> +#define XFS_IS_REALTIME_MOUNT(mp) ((mp)->m_rtdev_targp ? 1 : 0)
>  #else
>  #define XFS_IS_REALTIME_INODE(ip) (0)
> +#define XFS_IS_REALTIME_MOUNT(mp) (0)
>  #endif
>  
>  #endif /* __XFS_LINUX__ */
> diff --git a/fs/xfs/xfs_super.c b/fs/xfs/xfs_super.c
> index 455a575..6d33a5e 100644
> --- a/fs/xfs/xfs_super.c
> +++ b/fs/xfs/xfs_super.c
> @@ -1136,6 +1136,14 @@ xfs_fs_statfs(
>  	    ((mp->m_qflags & (XFS_PQUOTA_ACCT|XFS_PQUOTA_ENFD))) ==
>  			      (XFS_PQUOTA_ACCT|XFS_PQUOTA_ENFD))
>  		xfs_qm_statvfs(ip, statp);
> +
> +	if (XFS_IS_REALTIME_MOUNT(mp) &&
> +	    (ip->i_d.di_flags & XFS_DIFLAG_RTINHERIT)) {
> +		statp->f_blocks = sbp->sb_rblocks;
> +		statp->f_bavail = statp->f_bfree =
> +			sbp->sb_frextents * sbp->sb_rextsize;
> +	}
> +
>  	return 0;
>  }
>  
> 
--
To unsubscribe from this list: send the line "unsubscribe linux-xfs" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Darrick J. Wong Sept. 25, 2017, 10:55 p.m. UTC | #2
On Mon, Sep 25, 2017 at 12:44:16PM -0700, Richard Wareing wrote:
> - Reports realtime device free blocks in statfs calls if inheritance
>   bit is set on the inode of directory.  This is a bit more intuitive,
>   especially for use-cases which are using a much larger device for
>   the realtime device.
> - Add XFS_IS_REALTIME_MOUNT option to gate based on the existence of a
>   realtime device on the mount, similar to the XFS_IS_REALTIME_INODE
>   option.
> 
> Signed-off-by: Richard Wareing <rwareing@fb.com>
> Reviewed-by: Christoph Hellwig <hch@lst.de>
> Reviewed-by: Dave Chinner <dchinner@redhat.com>
> ---
> Changes since v4:
> * None
> 
> Changes since v3:
> * Fixed accounting bug, we are not required to substract m_alloc_set_aside
>   as this is a data device only requirement.
> * Added XFS_IS_REALTIME_MOUNT macro based on learnings from CVE-2017-14340,
>   now provides similar gating on the mount as XFS_IS_REALTIME_INODE does
>   for the inode.
> 
> Changes since v2:
> * Style updated per Christoph Hellwig's comment
> * Fixed bug: statp->f_bavail = statp->f_bfree
> 
>  fs/xfs/xfs_linux.h | 2 ++
>  fs/xfs/xfs_super.c | 8 ++++++++
>  2 files changed, 10 insertions(+)
> 
> diff --git a/fs/xfs/xfs_linux.h b/fs/xfs/xfs_linux.h
> index 044fb0e..fe46e71 100644
> --- a/fs/xfs/xfs_linux.h
> +++ b/fs/xfs/xfs_linux.h
> @@ -280,8 +280,10 @@ static inline __uint64_t howmany_64(__uint64_t x, __uint32_t y)
>  
>  #ifdef CONFIG_XFS_RT
>  #define XFS_IS_REALTIME_INODE(ip) ((ip)->i_d.di_flags & XFS_DIFLAG_REALTIME)
> +#define XFS_IS_REALTIME_MOUNT(mp) ((mp)->m_rtdev_targp ? 1 : 0)

(I kind of dislike these macros but I tried turning them into static inline
functions once and it turned into a nightmare.)

>  #else
>  #define XFS_IS_REALTIME_INODE(ip) (0)
> +#define XFS_IS_REALTIME_MOUNT(mp) (0)
>  #endif
>  
>  #endif /* __XFS_LINUX__ */
> diff --git a/fs/xfs/xfs_super.c b/fs/xfs/xfs_super.c
> index 455a575..6d33a5e 100644
> --- a/fs/xfs/xfs_super.c
> +++ b/fs/xfs/xfs_super.c
> @@ -1136,6 +1136,14 @@ xfs_fs_statfs(
>  	    ((mp->m_qflags & (XFS_PQUOTA_ACCT|XFS_PQUOTA_ENFD))) ==
>  			      (XFS_PQUOTA_ACCT|XFS_PQUOTA_ENFD))
>  		xfs_qm_statvfs(ip, statp);
> +
> +	if (XFS_IS_REALTIME_MOUNT(mp) &&
> +	    (ip->i_d.di_flags & XFS_DIFLAG_RTINHERIT)) {

Why don't we do this for any inode with REALTIME or RTINHERIT set?

--D

> +		statp->f_blocks = sbp->sb_rblocks;
> +		statp->f_bavail = statp->f_bfree =
> +			sbp->sb_frextents * sbp->sb_rextsize;
> +	}
> +
>  	return 0;
>  }
>  
> -- 
> 2.9.5
> 
> --
> To unsubscribe from this list: send the line "unsubscribe linux-xfs" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
--
To unsubscribe from this list: send the line "unsubscribe linux-xfs" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Richard Wareing Sept. 26, 2017, 3:32 a.m. UTC | #3
Eric Sandeen <sandeen@sandeen.net> wrote:

> On 9/25/17 2:44 PM, Richard Wareing wrote:
>> - Reports realtime device free blocks in statfs calls if inheritance
>>   bit is set on the inode of directory.  This is a bit more intuitive,
>>   especially for use-cases which are using a much larger device for
>>   the realtime device.
>> - Add XFS_IS_REALTIME_MOUNT option to gate based on the existence of a
>>   realtime device on the mount, similar to the XFS_IS_REALTIME_INODE
>>   option.
>
> Sorry for chiming in late.
>
> So, this is a major change in behavior of statfs on xfs.  How will the
> user know this, where is it documented?
>
> Also, if I read this right, does that mean that you get the magical
> new statfs results if you stat a /directory/ with XFS_DIFLAG_RTINHERIT
> set, but not if you stat a /file/ with XFS_DIFLAG_REALTIME set?
>
> That seems quite confusing, but maybe I'm missing something obvious.
>
> As for which filesystem is getting reported, would it be totally
> bonkers to have an xfs-realtime f_type to make it 100% clear what's
> being reported?
>
> -Eric
>

Wrt docs, I can write something up.  Any suggestions where the docs should
live?  Man page?

I'm also open to making this a sysfs option, it did cross my mind that
this change in behavior might be jarring (though in a good way ;) ).  For
those folks who wish to use realtime devices to place metadata on an SSD
this behavior is far more intuitive IMHO (as it more closely resembles the
presentation on a traditional FS setup).

You are also correct on the directory behavior.  For files you just
get their actual size on whatever device they happen to be stored on.
There is no change to that behavior.

Wrt new f_type, I'd be expecting a bit of push back on this as it may be
considered an abuse of this interface.

Richard


>> Signed-off-by: Richard Wareing <rwareing@fb.com>
>> Reviewed-by: Christoph Hellwig <hch@lst.de>
>> Reviewed-by: Dave Chinner <dchinner@redhat.com>
>> ---
>> Changes since v4:
>> * None
>>
>> Changes since v3:
>> * Fixed accounting bug, we are not required to substract m_alloc_set_aside
>>   as this is a data device only requirement.
>> * Added XFS_IS_REALTIME_MOUNT macro based on learnings from  
>> CVE-2017-14340,
>>   now provides similar gating on the mount as XFS_IS_REALTIME_INODE does
>>   for the inode.
>>
>> Changes since v2:
>> * Style updated per Christoph Hellwig's comment
>> * Fixed bug: statp->f_bavail = statp->f_bfree
>>
>>  fs/xfs/xfs_linux.h | 2 ++
>>  fs/xfs/xfs_super.c | 8 ++++++++
>>  2 files changed, 10 insertions(+)
>>
>> diff --git a/fs/xfs/xfs_linux.h b/fs/xfs/xfs_linux.h
>> index 044fb0e..fe46e71 100644
>> --- a/fs/xfs/xfs_linux.h
>> +++ b/fs/xfs/xfs_linux.h
>> @@ -280,8 +280,10 @@ static inline __uint64_t howmany_64(__uint64_t x,  
>> __uint32_t y)
>>
>>  #ifdef CONFIG_XFS_RT
>>  #define XFS_IS_REALTIME_INODE(ip) ((ip)->i_d.di_flags & XFS_DIFLAG_REALTIME)
>> +#define XFS_IS_REALTIME_MOUNT(mp) ((mp)->m_rtdev_targp ? 1 : 0)
>>  #else
>>  #define XFS_IS_REALTIME_INODE(ip) (0)
>> +#define XFS_IS_REALTIME_MOUNT(mp) (0)
>>  #endif
>>
>>  #endif /* __XFS_LINUX__ */
>> diff --git a/fs/xfs/xfs_super.c b/fs/xfs/xfs_super.c
>> index 455a575..6d33a5e 100644
>> --- a/fs/xfs/xfs_super.c
>> +++ b/fs/xfs/xfs_super.c
>> @@ -1136,6 +1136,14 @@ xfs_fs_statfs(
>>  	    ((mp->m_qflags & (XFS_PQUOTA_ACCT|XFS_PQUOTA_ENFD))) ==
>>  			      (XFS_PQUOTA_ACCT|XFS_PQUOTA_ENFD))
>>  		xfs_qm_statvfs(ip, statp);
>> +
>> +	if (XFS_IS_REALTIME_MOUNT(mp) &&
>> +	    (ip->i_d.di_flags & XFS_DIFLAG_RTINHERIT)) {
>> +		statp->f_blocks = sbp->sb_rblocks;
>> +		statp->f_bavail = statp->f_bfree =
>> +			sbp->sb_frextents * sbp->sb_rextsize;
>> +	}
>> +
>>  	return 0;
>>  }


--
To unsubscribe from this list: send the line "unsubscribe linux-xfs" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
diff mbox

Patch

diff --git a/fs/xfs/xfs_linux.h b/fs/xfs/xfs_linux.h
index 044fb0e..fe46e71 100644
--- a/fs/xfs/xfs_linux.h
+++ b/fs/xfs/xfs_linux.h
@@ -280,8 +280,10 @@  static inline __uint64_t howmany_64(__uint64_t x, __uint32_t y)
 
 #ifdef CONFIG_XFS_RT
 #define XFS_IS_REALTIME_INODE(ip) ((ip)->i_d.di_flags & XFS_DIFLAG_REALTIME)
+#define XFS_IS_REALTIME_MOUNT(mp) ((mp)->m_rtdev_targp ? 1 : 0)
 #else
 #define XFS_IS_REALTIME_INODE(ip) (0)
+#define XFS_IS_REALTIME_MOUNT(mp) (0)
 #endif
 
 #endif /* __XFS_LINUX__ */
diff --git a/fs/xfs/xfs_super.c b/fs/xfs/xfs_super.c
index 455a575..6d33a5e 100644
--- a/fs/xfs/xfs_super.c
+++ b/fs/xfs/xfs_super.c
@@ -1136,6 +1136,14 @@  xfs_fs_statfs(
 	    ((mp->m_qflags & (XFS_PQUOTA_ACCT|XFS_PQUOTA_ENFD))) ==
 			      (XFS_PQUOTA_ACCT|XFS_PQUOTA_ENFD))
 		xfs_qm_statvfs(ip, statp);
+
+	if (XFS_IS_REALTIME_MOUNT(mp) &&
+	    (ip->i_d.di_flags & XFS_DIFLAG_RTINHERIT)) {
+		statp->f_blocks = sbp->sb_rblocks;
+		statp->f_bavail = statp->f_bfree =
+			sbp->sb_frextents * sbp->sb_rextsize;
+	}
+
 	return 0;
 }