diff mbox series

[V2] nfs(5): Update rsize/wsize options

Message ID 20241119165942.213409-1-steved@redhat.com (mailing list archive)
State New
Headers show
Series [V2] nfs(5): Update rsize/wsize options | expand

Commit Message

Steve Dickson Nov. 19, 2024, 4:59 p.m. UTC
From: Seiichi Ikarashi <s.ikarashi@fujitsu.com>

The rsize/wsize values are not multiples of 1024 but multiples of the
system's page size or powers of 2 if < system's page size as defined
in fs/nfs/internal.h:nfs_io_size().

Signed-off-by: Steve Dickson <steved@redhat.com>
---
 utils/mount/nfs.man | 24 +++++++++++++++---------
 1 file changed, 15 insertions(+), 9 deletions(-)

V2: Replaced PAGE_SIZE with "system's page size"

Comments

Cedric Blancher Nov. 20, 2024, 7:57 a.m. UTC | #1
On Tue, 19 Nov 2024 at 18:07, Steve Dickson <steved@redhat.com> wrote:
>
> From: Seiichi Ikarashi <s.ikarashi@fujitsu.com>
>
> The rsize/wsize values are not multiples of 1024 but multiples of the
> system's page size or powers of 2 if < system's page size as defined
> in fs/nfs/internal.h:nfs_io_size().
>
> Signed-off-by: Steve Dickson <steved@redhat.com>

REJECT. As discussed, this is the WRONG approach. The pagesize is not
not easily determinable (/bin/pagesize not even being part of the
default install), and the page size is flexible on many architectures.
rsize/wsize depending on the page size makes this option non portable
across platforms, or even same platforms with different default
pagesize settings.
In real life, this can prevent puppet from working for NFS root, if
NFS root needs rsize/wsize, and someone switches the default page
size.

I thought the correct fix would be to fix the NFS client to count in
kbytes as documented, and round up/down to the pagesize.

Ced

> ---
>  utils/mount/nfs.man | 24 +++++++++++++++---------
>  1 file changed, 15 insertions(+), 9 deletions(-)
>
> V2: Replaced PAGE_SIZE with "system's page size"
>
> diff --git a/utils/mount/nfs.man b/utils/mount/nfs.man
> index 233a7177..eab4692a 100644
> --- a/utils/mount/nfs.man
> +++ b/utils/mount/nfs.man
> @@ -215,15 +215,18 @@ or smaller than the
>  setting. The largest read payload supported by the Linux NFS client
>  is 1,048,576 bytes (one megabyte).
>  .IP
> -The
> +The allowed
>  .B rsize
> -value is a positive integral multiple of 1024.
> +value is a positive integral multiple of
> +system's page size
> +or a power of two if it is less than
> +system's page size.
>  Specified
>  .B rsize
>  values lower than 1024 are replaced with 4096; values larger than
>  1048576 are replaced with 1048576. If a specified value is within the supported
> -range but not a multiple of 1024, it is rounded down to the nearest
> -multiple of 1024.
> +range but not such an allowed value, it is rounded down to the nearest
> +allowed value.
>  .IP
>  If an
>  .B rsize
> @@ -257,16 +260,19 @@ setting. The largest write payload supported by the Linux NFS client
>  is 1,048,576 bytes (one megabyte).
>  .IP
>  Similar to
> -.B rsize
> -, the
> +.BR rsize ,
> +the allowed
>  .B wsize
> -value is a positive integral multiple of 1024.
> +value is a positive integral multiple of
> +system's page size
> +or a power of two if it is less than
> +system's page size.
>  Specified
>  .B wsize
>  values lower than 1024 are replaced with 4096; values larger than
>  1048576 are replaced with 1048576. If a specified value is within the supported
> -range but not a multiple of 1024, it is rounded down to the nearest
> -multiple of 1024.
> +range but not such an allowed value, it is rounded down to the nearest
> +allowed value.
>  .IP
>  If a
>  .B wsize
> --
> 2.47.0
>
>
Steve Dickson Nov. 20, 2024, 9:18 a.m. UTC | #2
On 11/20/24 2:57 AM, Cedric Blancher wrote:
> On Tue, 19 Nov 2024 at 18:07, Steve Dickson <steved@redhat.com> wrote:
>>
>> From: Seiichi Ikarashi <s.ikarashi@fujitsu.com>
>>
>> The rsize/wsize values are not multiples of 1024 but multiples of the
>> system's page size or powers of 2 if < system's page size as defined
>> in fs/nfs/internal.h:nfs_io_size().
>>
>> Signed-off-by: Steve Dickson <steved@redhat.com>
> 
> REJECT. 
You mean NACK! :-)

> As discussed, this is the WRONG approach. The pagesize is not
> not easily determinable (/bin/pagesize not even being part of the
> default install), and the page size is flexible on many architectures.
Yes... /bin/pagesize is not needed only getconf PAGESIZE is which
is in all default installs.

> rsize/wsize depending on the page size makes this option non portable
> across platforms, or even same platforms with different default
> pagesize settings.
"rsize/wsize depending on the page size" is a kernel thing...
Nothing nfs-utils can do about it but accurately document
what is happening.

> In real life, this can prevent puppet from working for NFS root, if
> NFS root needs rsize/wsize, and someone switches the default page
> size.
Why is puppet evening messing with rsize/wsize? Just let the
kernel do the right thing...

> 
> I thought the correct fix would be to fix the NFS client to count in
> kbytes as documented, and round up/down to the pagesize.
Again this is a kernel thing... Patches are welcomed!

steved.

> 
> Ced
> 
>> ---
>>   utils/mount/nfs.man | 24 +++++++++++++++---------
>>   1 file changed, 15 insertions(+), 9 deletions(-)
>>
>> V2: Replaced PAGE_SIZE with "system's page size"
>>
>> diff --git a/utils/mount/nfs.man b/utils/mount/nfs.man
>> index 233a7177..eab4692a 100644
>> --- a/utils/mount/nfs.man
>> +++ b/utils/mount/nfs.man
>> @@ -215,15 +215,18 @@ or smaller than the
>>   setting. The largest read payload supported by the Linux NFS client
>>   is 1,048,576 bytes (one megabyte).
>>   .IP
>> -The
>> +The allowed
>>   .B rsize
>> -value is a positive integral multiple of 1024.
>> +value is a positive integral multiple of
>> +system's page size
>> +or a power of two if it is less than
>> +system's page size.
>>   Specified
>>   .B rsize
>>   values lower than 1024 are replaced with 4096; values larger than
>>   1048576 are replaced with 1048576. If a specified value is within the supported
>> -range but not a multiple of 1024, it is rounded down to the nearest
>> -multiple of 1024.
>> +range but not such an allowed value, it is rounded down to the nearest
>> +allowed value.
>>   .IP
>>   If an
>>   .B rsize
>> @@ -257,16 +260,19 @@ setting. The largest write payload supported by the Linux NFS client
>>   is 1,048,576 bytes (one megabyte).
>>   .IP
>>   Similar to
>> -.B rsize
>> -, the
>> +.BR rsize ,
>> +the allowed
>>   .B wsize
>> -value is a positive integral multiple of 1024.
>> +value is a positive integral multiple of
>> +system's page size
>> +or a power of two if it is less than
>> +system's page size.
>>   Specified
>>   .B wsize
>>   values lower than 1024 are replaced with 4096; values larger than
>>   1048576 are replaced with 1048576. If a specified value is within the supported
>> -range but not a multiple of 1024, it is rounded down to the nearest
>> -multiple of 1024.
>> +range but not such an allowed value, it is rounded down to the nearest
>> +allowed value.
>>   .IP
>>   If a
>>   .B wsize
>> --
>> 2.47.0
>>
>>
> 
>
Benjamin Coddington Nov. 20, 2024, 2:48 p.m. UTC | #3
On 20 Nov 2024, at 2:57, Cedric Blancher wrote:

> On Tue, 19 Nov 2024 at 18:07, Steve Dickson <steved@redhat.com> wrote:
>>
>> From: Seiichi Ikarashi <s.ikarashi@fujitsu.com>
>>
>> The rsize/wsize values are not multiples of 1024 but multiples of the
>> system's page size or powers of 2 if < system's page size as defined
>> in fs/nfs/internal.h:nfs_io_size().
>>
>> Signed-off-by: Steve Dickson <steved@redhat.com>
>
> REJECT. As discussed, this is the WRONG approach.

I am astounded that this response keeps appearing, we must have some
fundamental inability to understand each other.

> The pagesize is not not easily determinable (/bin/pagesize not even being
> part of the default install), and the page size is flexible on many
> architectures.  rsize/wsize depending on the page size makes this option
> non portable across platforms, or even same platforms with different
> default pagesize settings.

Why do you need to determine the page size to set this option?  You are
incorrect when stating that rsize/wsize depends on the page size.

As I explained before, one does not need to know the page size to set this
mount option.  The kernel takes the desired size and rounds down to the
current system's page size.

> In real life, this can prevent puppet from working for NFS root, if
> NFS root needs rsize/wsize, and someone switches the default page
> size.

What does "prevent puppet from working" mean?
Please show a real example of this problem that requires a kernel fix.

> I thought the correct fix would be to fix the NFS client to count in
> kbytes as documented, and round up/down to the pagesize.

What problem are we fixing?  Currently, the only non-hypothetical issue is
that the man page is incorrect.

Ben
Steve Dickson Nov. 20, 2024, 8:56 p.m. UTC | #4
On 11/19/24 11:59 AM, Steve Dickson wrote:
> From: Seiichi Ikarashi <s.ikarashi@fujitsu.com>
> 
> The rsize/wsize values are not multiples of 1024 but multiples of the
> system's page size or powers of 2 if < system's page size as defined
> in fs/nfs/internal.h:nfs_io_size().
> 
> Signed-off-by: Steve Dickson <steved@redhat.com>
Committed... (tag: nfs-utils-2-8-2-rc2)

I know we are still discussing this but I think
this version is better than what we have.

So update patches are welcome!

steved.
> ---
>   utils/mount/nfs.man | 24 +++++++++++++++---------
>   1 file changed, 15 insertions(+), 9 deletions(-)
> 
> V2: Replaced PAGE_SIZE with "system's page size"
> 
> diff --git a/utils/mount/nfs.man b/utils/mount/nfs.man
> index 233a7177..eab4692a 100644
> --- a/utils/mount/nfs.man
> +++ b/utils/mount/nfs.man
> @@ -215,15 +215,18 @@ or smaller than the
>   setting. The largest read payload supported by the Linux NFS client
>   is 1,048,576 bytes (one megabyte).
>   .IP
> -The
> +The allowed
>   .B rsize
> -value is a positive integral multiple of 1024.
> +value is a positive integral multiple of
> +system's page size
> +or a power of two if it is less than
> +system's page size.
>   Specified
>   .B rsize
>   values lower than 1024 are replaced with 4096; values larger than
>   1048576 are replaced with 1048576. If a specified value is within the supported
> -range but not a multiple of 1024, it is rounded down to the nearest
> -multiple of 1024.
> +range but not such an allowed value, it is rounded down to the nearest
> +allowed value.
>   .IP
>   If an
>   .B rsize
> @@ -257,16 +260,19 @@ setting. The largest write payload supported by the Linux NFS client
>   is 1,048,576 bytes (one megabyte).
>   .IP
>   Similar to
> -.B rsize
> -, the
> +.BR rsize ,
> +the allowed
>   .B wsize
> -value is a positive integral multiple of 1024.
> +value is a positive integral multiple of
> +system's page size
> +or a power of two if it is less than
> +system's page size.
>   Specified
>   .B wsize
>   values lower than 1024 are replaced with 4096; values larger than
>   1048576 are replaced with 1048576. If a specified value is within the supported
> -range but not a multiple of 1024, it is rounded down to the nearest
> -multiple of 1024.
> +range but not such an allowed value, it is rounded down to the nearest
> +allowed value.
>   .IP
>   If a
>   .B wsize
Dan Shelton Nov. 20, 2024, 11:04 p.m. UTC | #5
On Wed, 20 Nov 2024 at 21:56, Steve Dickson <steved@redhat.com> wrote:
>
>
>
> On 11/19/24 11:59 AM, Steve Dickson wrote:
> > From: Seiichi Ikarashi <s.ikarashi@fujitsu.com>
> >
> > The rsize/wsize values are not multiples of 1024 but multiples of the
> > system's page size or powers of 2 if < system's page size as defined
> > in fs/nfs/internal.h:nfs_io_size().
> >
> > Signed-off-by: Steve Dickson <steved@redhat.com>
> Committed... (tag: nfs-utils-2-8-2-rc2)
>
> I know we are still discussing this but I think
> this version is better than what we have.

Nope. The code is IMO wrong, and the docs are buggy too.

>
> So update patches are welcome!

Solaris, HPUX, FreeBSD and Windows NFSv3/v4 implementations all count in bytes.

Why does Linux again have to be the oddball of the family and count in
pages? Not-invented-here-syndrome,
need-reason-why-companies-have-to-pay-for-Linux-admin-training?
My recommendation would be to fix the code to count in bytes, rounded
to the page size, and being a minimum of one page size. Will bite of
course if someone chooses 2M pages as default on x86-64.

Dan
Cedric Blancher Nov. 21, 2024, 1:51 p.m. UTC | #6
On Thu, 21 Nov 2024 at 00:17, Dan Shelton <dan.f.shelton@gmail.com> wrote:
>
> On Wed, 20 Nov 2024 at 21:56, Steve Dickson <steved@redhat.com> wrote:
> >
> >
> >
> > On 11/19/24 11:59 AM, Steve Dickson wrote:
> > > From: Seiichi Ikarashi <s.ikarashi@fujitsu.com>
> > >
> > > The rsize/wsize values are not multiples of 1024 but multiples of the
> > > system's page size or powers of 2 if < system's page size as defined
> > > in fs/nfs/internal.h:nfs_io_size().
> > >
> > > Signed-off-by: Steve Dickson <steved@redhat.com>
> > Committed... (tag: nfs-utils-2-8-2-rc2)
> >
> > I know we are still discussing this but I think
> > this version is better than what we have.
>
> Nope. The code is IMO wrong, and the docs are buggy too.
>
> >
> > So update patches are welcome!
>
> Solaris, HPUX, FreeBSD and Windows NFSv3/v4 implementations all count in bytes.

I hereby concur, this is better to be consistent across operating
systems. In any case this is better than some machine- or
hardware-specific config option called "default page size", which no
one knows at boot time.

Ced
Chuck Lever III Nov. 21, 2024, 2:36 p.m. UTC | #7
> On Nov 21, 2024, at 8:51 AM, Cedric Blancher <cedric.blancher@gmail.com> wrote:
> 
> On Thu, 21 Nov 2024 at 00:17, Dan Shelton <dan.f.shelton@gmail.com> wrote:
>> 
>> On Wed, 20 Nov 2024 at 21:56, Steve Dickson <steved@redhat.com> wrote:
>>> 
>>> 
>>> 
>>> On 11/19/24 11:59 AM, Steve Dickson wrote:
>>>> From: Seiichi Ikarashi <s.ikarashi@fujitsu.com>
>>>> 
>>>> The rsize/wsize values are not multiples of 1024 but multiples of the
>>>> system's page size or powers of 2 if < system's page size as defined
>>>> in fs/nfs/internal.h:nfs_io_size().
>>>> 
>>>> Signed-off-by: Steve Dickson <steved@redhat.com>
>>> Committed... (tag: nfs-utils-2-8-2-rc2)
>>> 
>>> I know we are still discussing this but I think
>>> this version is better than what we have.
>> 
>> Nope. The code is IMO wrong, and the docs are buggy too.

Let's be absolutely clear.

A code bug (or "the code is wrong/broken") occurs when code
does not behave as it was designed to behave. That is not
what is happening here: the Linux NFS client is working as
intended.

What you and Dan are asking for is a change in design or
policy. The software industry uses the term "Request For
Enhancement" for that.

The patch Steve committed makes the man page match the
current kernel code. That is not a bad thing to do, and it
is certainly not meant to be a permanent and final answer
to this issue.

Now let's put aside the accusations and hostility, and please
let's stick with technical concerns.


>>> So update patches are welcome!
>> 
>> Solaris, HPUX, FreeBSD and Windows NFSv3/v4 implementations all count in bytes.

I don't understand what this statement implies.

Linux's nfs(5) states these values are in units of bytes.
AFAICT the proposed man page patch does not change that.

On Linux we say "rsize=4096" and that means the maximum size
of a READ on the wire is 4096 bytes. Our rsize and wsize
options are specified just like on other operating systems.


> I hereby concur, this is better to be consistent across operating
> systems. In any case this is better than some machine- or
> hardware-specific config option called "default page size", which no
> one knows at boot time.

The only difference with the Linux implementation is that
the kernel might round off the specified values a little
differently than other implementations.

The description of the kernel's rounding behavior is what
is being modified.


--
Chuck Lever
diff mbox series

Patch

diff --git a/utils/mount/nfs.man b/utils/mount/nfs.man
index 233a7177..eab4692a 100644
--- a/utils/mount/nfs.man
+++ b/utils/mount/nfs.man
@@ -215,15 +215,18 @@  or smaller than the
 setting. The largest read payload supported by the Linux NFS client
 is 1,048,576 bytes (one megabyte).
 .IP
-The
+The allowed
 .B rsize
-value is a positive integral multiple of 1024.
+value is a positive integral multiple of
+system's page size
+or a power of two if it is less than
+system's page size.
 Specified
 .B rsize
 values lower than 1024 are replaced with 4096; values larger than
 1048576 are replaced with 1048576. If a specified value is within the supported
-range but not a multiple of 1024, it is rounded down to the nearest
-multiple of 1024.
+range but not such an allowed value, it is rounded down to the nearest
+allowed value.
 .IP
 If an
 .B rsize
@@ -257,16 +260,19 @@  setting. The largest write payload supported by the Linux NFS client
 is 1,048,576 bytes (one megabyte).
 .IP
 Similar to
-.B rsize
-, the
+.BR rsize ,
+the allowed
 .B wsize
-value is a positive integral multiple of 1024.
+value is a positive integral multiple of
+system's page size
+or a power of two if it is less than
+system's page size.
 Specified
 .B wsize
 values lower than 1024 are replaced with 4096; values larger than
 1048576 are replaced with 1048576. If a specified value is within the supported
-range but not a multiple of 1024, it is rounded down to the nearest
-multiple of 1024.
+range but not such an allowed value, it is rounded down to the nearest
+allowed value.
 .IP
 If a
 .B wsize