
[5/5] hugetlbfs: fix confusing hugetlbfs stat

Message ID 20220721131637.6306-6-linmiaohe@huawei.com (mailing list archive)
State New
Series: A few cleanup and fixup patches for hugetlbfs

Commit Message

Miaohe Lin July 21, 2022, 1:16 p.m. UTC
When size option is not specified, f_blocks, f_bavail and f_bfree will be
set to -1 instead of 0. Likewise, when nr_inodes is not specified, f_files
and f_ffree will be set to -1 too. Check max_hpages and max_inodes against
-1 first to make sure 0 is reported for max/free/used when no limit is set
as the comment states.

Signed-off-by: Miaohe Lin <linmiaohe@huawei.com>
---
 fs/hugetlbfs/inode.c | 20 ++++++++++++--------
 1 file changed, 12 insertions(+), 8 deletions(-)

Comments

Mike Kravetz July 22, 2022, 12:28 a.m. UTC | #1
On 07/21/22 21:16, Miaohe Lin wrote:
> When size option is not specified, f_blocks, f_bavail and f_bfree will be
> set to -1 instead of 0. Likewise, when nr_inodes is not specified, f_files
> and f_ffree will be set to -1 too. Check max_hpages and max_inodes against
> -1 first to make sure 0 is reported for max/free/used when no limit is set
> as the comment states.

Just curious, where are you seeing values reported as -1?  The check
for sbinfo->spool was supposed to handle these cases.  Seems like it
should handle the max_hpages == -1 case.  But, it doesn't look like it
considers the max_inodes == -1 case.

If I create/mount a hugetlb filesystem without specifying size or nr_inodes,
df seems to report zero instead of -1.

Just want to understand the reasoning behind the change.
Miaohe Lin July 22, 2022, 6:38 a.m. UTC | #2
On 2022/7/22 8:28, Mike Kravetz wrote:
> On 07/21/22 21:16, Miaohe Lin wrote:
>> When size option is not specified, f_blocks, f_bavail and f_bfree will be
>> set to -1 instead of 0. Likewise, when nr_inodes is not specified, f_files
>> and f_ffree will be set to -1 too. Check max_hpages and max_inodes against
>> -1 first to make sure 0 is reported for max/free/used when no limit is set
>> as the comment states.
> 
> Just curious, where are you seeing values reported as -1?  The check

From the standard statvfs() function.

> for sbinfo->spool was supposed to handle these cases.  Seems like it

sbinfo->spool could be created when ctx->max_hpages == -1 while
ctx->min_hpages != -1 in hugetlbfs_fill_super.

> should handle the max_hpages == -1 case.  But, it doesn't look like it
> considers the max_inodes == -1 case.
> 
> If I create/mount a hugetlb filesystem without specifying size or nr_inodes,
> df seems to report zero instead of -1.
> 
> Just want to understand the reasoning behind the change.

I wrote a test program:

#include <sys/statvfs.h>
#include <stdio.h>

int main(void)
{
	struct statvfs buf;

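	if (statvfs("/root/huge/", &buf) == -1) {
		perror("statvfs");
		return 1;
	}
	/* The statvfs counters are unsigned; cast so an unset limit prints as -1. */
	printf("f_blocks %lld, f_bavail %lld, f_bfree %lld, f_files %lld, f_ffree %lld\n",
		(long long)buf.f_blocks, (long long)buf.f_bavail, (long long)buf.f_bfree,
		(long long)buf.f_files, (long long)buf.f_ffree);
	return 0;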
}

And test it in my env:
[root@localhost ~]# mount -t hugetlbfs none /root/huge/
[root@localhost ~]# ./stat
f_blocks 0, f_bavail 0, f_bfree 0, f_files 0, f_ffree 0
[root@localhost ~]# umount /root/huge/
[root@localhost ~]# mount -t hugetlbfs -o min_size=32M none /root/huge/
[root@localhost ~]# ./stat
f_blocks -1, f_bavail -1, f_bfree -1, f_files -1, f_ffree -1
[root@localhost ~]# umount /root/huge/
[root@localhost ~]# mount -t hugetlbfs -o min_size=32M,size=64M none /root/huge/
[root@localhost ~]# ./stat
f_blocks 32, f_bavail 32, f_bfree 32, f_files -1, f_ffree -1
[root@localhost ~]# umount /root/huge/
[root@localhost ~]# mount -t hugetlbfs -o min_size=32M,size=64M,nr_inodes=1024 none /root/huge/
[root@localhost ~]# ./stat
f_blocks 32, f_bavail 32, f_bfree 32, f_files 1024, f_ffree 1023
[root@localhost ~]# umount /root/huge/

Or am I missing something?

> 

Thanks.
Mike Kravetz July 22, 2022, 10:55 p.m. UTC | #3
On 07/22/22 14:38, Miaohe Lin wrote:
> On 2022/7/22 8:28, Mike Kravetz wrote:
> > On 07/21/22 21:16, Miaohe Lin wrote:
> >> When size option is not specified, f_blocks, f_bavail and f_bfree will be
> >> set to -1 instead of 0. Likewise, when nr_inodes is not specified, f_files
> >> and f_ffree will be set to -1 too. Check max_hpages and max_inodes against
> >> -1 first to make sure 0 is reported for max/free/used when no limit is set
> >> as the comment states.
> > 
> > Just curious, where are you seeing values reported as -1?  The check
> 
> From the standard statvfs() function.
> 
> > for sbinfo->spool was supposed to handle these cases.  Seems like it
> 
> sbinfo->spool could be created when ctx->max_hpages == -1 while
> ctx->min_hpages != -1 in hugetlbfs_fill_super.
> 
> > should handle the max_hpages == -1 case.  But, it doesn't look like it
> > considers the max_inodes == -1 case.
> > 
> > If I create/mount a hugetlb filesystem without specifying size or nr_inodes,
> > df seems to report zero instead of -1.
> > 
> > Just want to understand the reasoning behind the change.

Thanks for the additional information (and test program)!

From the hugetlbfs documentation:
"If the ``size``, ``min_size`` or ``nr_inodes`` option is not provided on
 command line then no limits are set."

So, having those values set to -1 indicates there is no limit set.

With this change, 0 is reported for the case where there is no limit set as
well as the case where the max value is 0.

There may be some value in reporting -1 as is done today.

To be honest, I am not sure what is the correct behavior here.  Unless
there is a user visible issue/problem, I am hesitant to change.  Other
opinions are welcome.
Miaohe Lin July 23, 2022, 2:56 a.m. UTC | #4
On 2022/7/23 6:55, Mike Kravetz wrote:
> On 07/22/22 14:38, Miaohe Lin wrote:
>> On 2022/7/22 8:28, Mike Kravetz wrote:
>>> On 07/21/22 21:16, Miaohe Lin wrote:
>>>> When size option is not specified, f_blocks, f_bavail and f_bfree will be
>>>> set to -1 instead of 0. Likewise, when nr_inodes is not specified, f_files
>>>> and f_ffree will be set to -1 too. Check max_hpages and max_inodes against
>>>> -1 first to make sure 0 is reported for max/free/used when no limit is set
>>>> as the comment states.
>>>
>>> Just curious, where are you seeing values reported as -1?  The check
>>
>> From the standard statvfs() function.
>>
>>> for sbinfo->spool was supposed to handle these cases.  Seems like it
>>
>> sbinfo->spool could be created when ctx->max_hpages == -1 while
>> ctx->min_hpages != -1 in hugetlbfs_fill_super.
>>
>>> should handle the max_hpages == -1 case.  But, it doesn't look like it
>>> considers the max_inodes == -1 case.
>>>
>>> If I create/mount a hugetlb filesystem without specifying size or nr_inodes,
>>> df seems to report zero instead of -1.
>>>
>>> Just want to understand the reasoning behind the change.
> 
> Thanks for the additional information (and test program)!
> 
> From the hugetlbfs documentation:
> "If the ``size``, ``min_size`` or ``nr_inodes`` option is not provided on
>  command line then no limits are set."
> 
> So, having those values set to -1 indicates there is no limit set.
> 
> With this change, 0 is reported for the case where there is no limit set as
> well as the case where the max value is 0.

IMHO, 0 should not be a valid max value; otherwise there would be no hugetlb
pages to use. It should mean there's no limit. But maybe I'm wrong.

> 
> There may be some value in reporting -1 as is done today.

There is still an inconsistency:

If neither ``size`` nor ``min_size`` is specified, then the reported max value is 0.
But if ``min_size`` is specified while ``size`` isn't specified, the reported
max value is -1.

> 
> To be honest, I am not sure what is the correct behavior here.  Unless
> there is a user visible issue/problem, I am hesitant to change.  Other
> opinions are welcome.

Yes, it might be better to keep it as is. Maybe we could change the comment to
reflect the current behavior, like below?

diff --git a/fs/hugetlbfs/inode.c b/fs/hugetlbfs/inode.c
index 44da9828e171..f03b1a019cc0 100644
--- a/fs/hugetlbfs/inode.c
+++ b/fs/hugetlbfs/inode.c
@@ -1080,7 +1080,7 @@ static int hugetlbfs_statfs(struct dentry *dentry, struct kstatfs *buf)
        buf->f_bsize = huge_page_size(h);
        if (sbinfo) {
                spin_lock(&sbinfo->stat_lock);
-               /* If no limits set, just report 0 for max/free/used
+               /* If no limits set, just report 0 or -1 for max/free/used
                 * blocks, like simple_statfs() */
                if (sbinfo->spool) {
                        spin_lock_irq(&sbinfo->spool->lock);

> 

I have no strong opinion on whether to keep this patch or make the comment
change above. Many thanks for your comment and reply. :)
Mike Kravetz July 25, 2022, 11:40 p.m. UTC | #5
On 07/23/22 10:56, Miaohe Lin wrote:
> On 2022/7/23 6:55, Mike Kravetz wrote:
> > On 07/22/22 14:38, Miaohe Lin wrote:
> >> On 2022/7/22 8:28, Mike Kravetz wrote:
> >>> On 07/21/22 21:16, Miaohe Lin wrote:
> >>>> When size option is not specified, f_blocks, f_bavail and f_bfree will be
> >>>> set to -1 instead of 0. Likewise, when nr_inodes is not specified, f_files
> >>>> and f_ffree will be set to -1 too. Check max_hpages and max_inodes against
> >>>> -1 first to make sure 0 is reported for max/free/used when no limit is set
> >>>> as the comment states.
> >>>
> >>> Just curious, where are you seeing values reported as -1?  The check
> >>
> >> From the standard statvfs() function.
> >>
> >>> for sbinfo->spool was supposed to handle these cases.  Seems like it
> >>
> >> sbinfo->spool could be created when ctx->max_hpages == -1 while
> >> ctx->min_hpages != -1 in hugetlbfs_fill_super.
> >>
> >>> should handle the max_hpages == -1 case.  But, it doesn't look like it
> >>> considers the max_inodes == -1 case.
> >>>
> >>> If I create/mount a hugetlb filesystem without specifying size or nr_inodes,
> >>> df seems to report zero instead of -1.
> >>>
> >>> Just want to understand the reasoning behind the change.
> > 
> > Thanks for the additional information (and test program)!
> > 
> > From the hugetlbfs documentation:
> > "If the ``size``, ``min_size`` or ``nr_inodes`` option is not provided on
> >  command line then no limits are set."
> > 
> > So, having those values set to -1 indicates there is no limit set.
> > 
> > With this change, 0 is reported for the case where there is no limit set as
> > well as the case where the max value is 0.
> 
> IMHO, 0 should not be a valid max value otherwise there will be no hugetlb pages
> to use. It should mean there's no limit. But maybe I'm wrong.

I agree that 0 as a max value makes little sense.  However, it is allowed
today and from what I can tell it is file system specific.  So, there is no
defined behavior.

> 
> > 
> > There may be some value in reporting -1 as is done today.
> 
> There is still an inconsistency:
> 
> If neither ``size`` nor ``min_size`` is specified, then the reported max value is 0.
> But if ``min_size`` is specified while ``size`` isn't specified, the reported
> max value is -1.
> 

Agree that this is inconsistent and confusing.

In the case where min_size is specified and size is not, -1 for size still may
make sense.  min_size specifies how many pages are reserved for use by the
filesystem.  The only required relation between min_size and size is that if
size is specified, then min_size must be smaller.  Otherwise, it makes no
sense to reserve pages (min_size) that can not be used.

> > To be honest, I am not sure what is the correct behavior here.  Unless
> > there is a user visible issue/problem, I am hesitant to change.  Other
> > opinions are welcome.
> 
> Yes, it might be better to keep it as is. Maybe we could change the comment to
> reflect what the current behavior is like below?
> 
> diff --git a/fs/hugetlbfs/inode.c b/fs/hugetlbfs/inode.c
> index 44da9828e171..f03b1a019cc0 100644
> --- a/fs/hugetlbfs/inode.c
> +++ b/fs/hugetlbfs/inode.c
> @@ -1080,7 +1080,7 @@ static int hugetlbfs_statfs(struct dentry *dentry, struct kstatfs *buf)
>         buf->f_bsize = huge_page_size(h);
>         if (sbinfo) {
>                 spin_lock(&sbinfo->stat_lock);
> -               /* If no limits set, just report 0 for max/free/used
> +               /* If no limits set, just report 0 or -1 for max/free/used
>                  * blocks, like simple_statfs() */
>                 if (sbinfo->spool) {
>                         spin_lock_irq(&sbinfo->spool->lock);
> 
> > 
> 
> No strong opinion to keep this patch or above change. Many thanks for your comment and reply. :)
> 

I am fine with the comment change.  Thanks for reading through the code and
trying to make sense of it!
Miaohe Lin July 26, 2022, 2:01 a.m. UTC | #6
On 2022/7/26 7:40, Mike Kravetz wrote:
> On 07/23/22 10:56, Miaohe Lin wrote:
>> On 2022/7/23 6:55, Mike Kravetz wrote:
>>> On 07/22/22 14:38, Miaohe Lin wrote:
>>>> On 2022/7/22 8:28, Mike Kravetz wrote:
>>>>> On 07/21/22 21:16, Miaohe Lin wrote:
>>>>>> When size option is not specified, f_blocks, f_bavail and f_bfree will be
>>>>>> set to -1 instead of 0. Likewise, when nr_inodes is not specified, f_files
>>>>>> and f_ffree will be set to -1 too. Check max_hpages and max_inodes against
>>>>>> -1 first to make sure 0 is reported for max/free/used when no limit is set
>>>>>> as the comment states.
>>>>>
>>>>> Just curious, where are you seeing values reported as -1?  The check
>>>>
>>>> From the standard statvfs() function.
>>>>
>>>>> for sbinfo->spool was supposed to handle these cases.  Seems like it
>>>>
>>>> sbinfo->spool could be created when ctx->max_hpages == -1 while
>>>> ctx->min_hpages != -1 in hugetlbfs_fill_super.
>>>>
>>>>> should handle the max_hpages == -1 case.  But, it doesn't look like it
>>>>> considers the max_inodes == -1 case.
>>>>>
>>>>> If I create/mount a hugetlb filesystem without specifying size or nr_inodes,
>>>>> df seems to report zero instead of -1.
>>>>>
>>>>> Just want to understand the reasoning behind the change.
>>>
>>> Thanks for the additional information (and test program)!
>>>
>>> From the hugetlbfs documentation:
>>> "If the ``size``, ``min_size`` or ``nr_inodes`` option is not provided on
>>>  command line then no limits are set."
>>>
>>> So, having those values set to -1 indicates there is no limit set.
>>>
>>> With this change, 0 is reported for the case where there is no limit set as
>>> well as the case where the max value is 0.
>>
>> IMHO, 0 should not be a valid max value otherwise there will be no hugetlb pages
>> to use. It should mean there's no limit. But maybe I'm wrong.
> 
> I agree that 0 as a max value makes little sense.  However, it is allowed
> today and from what I can tell it is file system specific.  So, there is no
> defined behavior.

So it might be better to keep the code as is.

> 
>>
>>>
>>> There may be some value in reporting -1 as is done today.
>>
>> There is still an inconsistency:
>>
>> If neither ``size`` nor ``min_size`` is specified, then the reported max value is 0.
>> But if ``min_size`` is specified while ``size`` isn't specified, the reported
>> max value is -1.
>>
> 
> Agree that this is inconsistent and confusing.
> 
> In the case where min_size and size is not specified, -1 for size still may
> make sense.  min_size specifies how many pages are reserved for use by the
> filesystem.  The only required relation between min_size and size is that if
> size is specified, then min_size must be smaller.  Otherwise, it makes no
> sense to reserve pages (min_size) that can not be used.
> 
>>> To be honest, I am not sure what is the correct behavior here.  Unless
>>> there is a user visible issue/problem, I am hesitant to change.  Other
>>> opinions are welcome.
>>
>> Yes, it might be better to keep it as is. Maybe we could change the comment to
>> reflect what the current behavior is like below?
>>
>> diff --git a/fs/hugetlbfs/inode.c b/fs/hugetlbfs/inode.c
>> index 44da9828e171..f03b1a019cc0 100644
>> --- a/fs/hugetlbfs/inode.c
>> +++ b/fs/hugetlbfs/inode.c
>> @@ -1080,7 +1080,7 @@ static int hugetlbfs_statfs(struct dentry *dentry, struct kstatfs *buf)
>>         buf->f_bsize = huge_page_size(h);
>>         if (sbinfo) {
>>                 spin_lock(&sbinfo->stat_lock);
>> -               /* If no limits set, just report 0 for max/free/used
>> +               /* If no limits set, just report 0 or -1 for max/free/used
>>                  * blocks, like simple_statfs() */
>>                 if (sbinfo->spool) {
>>                         spin_lock_irq(&sbinfo->spool->lock);
>>
>>>
>>
>> No strong opinion to keep this patch or above change. Many thanks for your comment and reply. :)
>>
> 
> I am fine with the comment change.  Thanks for reading through the code and
> trying to make sense of it!

I will do it in the next version. Many thanks for your time.

>

Patch

diff --git a/fs/hugetlbfs/inode.c b/fs/hugetlbfs/inode.c
index 19fc62a9c2fe..44da9828e171 100644
--- a/fs/hugetlbfs/inode.c
+++ b/fs/hugetlbfs/inode.c
@@ -1083,16 +1083,20 @@  static int hugetlbfs_statfs(struct dentry *dentry, struct kstatfs *buf)
 		/* If no limits set, just report 0 for max/free/used
 		 * blocks, like simple_statfs() */
 		if (sbinfo->spool) {
-			long free_pages;
-
 			spin_lock_irq(&sbinfo->spool->lock);
-			buf->f_blocks = sbinfo->spool->max_hpages;
-			free_pages = sbinfo->spool->max_hpages
-				- sbinfo->spool->used_hpages;
-			buf->f_bavail = buf->f_bfree = free_pages;
+			if (sbinfo->spool->max_hpages != -1) {
+				long free_pages;
+
+				buf->f_blocks = sbinfo->spool->max_hpages;
+				free_pages = sbinfo->spool->max_hpages
+					     - sbinfo->spool->used_hpages;
+				buf->f_bavail = buf->f_bfree = free_pages;
+			}
 			spin_unlock_irq(&sbinfo->spool->lock);
-			buf->f_files = sbinfo->max_inodes;
-			buf->f_ffree = sbinfo->free_inodes;
+			if (sbinfo->max_inodes != -1) {
+				buf->f_files = sbinfo->max_inodes;
+				buf->f_ffree = sbinfo->free_inodes;
+			}
 		}
 		spin_unlock(&sbinfo->stat_lock);
 	}