mbox series

[v3,00/11] Add RDMA support to the pNFS file+flexfiles data channels

Message ID 20201110231906.863446-1-trondmy@kernel.org (mailing list archive)
Headers show
Series Add RDMA support to the pNFS file+flexfiles data channels | expand

Message

Trond Myklebust Nov. 10, 2020, 11:18 p.m. UTC
From: Trond Myklebust <trond.myklebust@hammerspace.com>

Add support for connecting to the pNFS files/flexfiles data servers
through RDMA, assuming that the GETDEVICEINFO call advertises that
support.

v2: Fix layoutstats encoding for pNFS/flexfiles.
v3: Move most of the netid handling into the SUNRPC and RDMA modules.
    Fix up the mount code to benefit more from automated loading of
    SUNRPC transport modules.

Trond Myklebust (11):
  SUNRPC: xprt_load_transport() needs to support the netid "rdma6"
  SUNRPC: Close a race with transport setup and module put
  SUNRPC: Add a helper to return the transport identifier given a netid
  NFS: Switch mount code to use xprt_find_transport_ident()
  SUNRPC: Remove unused function xprt_load_transport()
  NFSv4/pNFS: Use connections to a DS that are all of the same protocol
    family
  pNFS: Add helpers for allocation/free of struct nfs4_pnfs_ds_addr
  NFSv4/pNFS: Store the transport type in struct nfs4_pnfs_ds_addr
  pNFS/flexfiles: Fix up layoutstats reporting for non-TCP transports
  SUNRPC: Fix up open coded kmemdup_nul()
  pNFS: Clean up open coded xdr string decoding

 fs/nfs/flexfilelayout/flexfilelayout.c |   9 +-
 fs/nfs/fs_context.c                    |  21 +++--
 fs/nfs/pnfs.h                          |   2 +
 fs/nfs/pnfs_nfs.c                      | 103 ++++++++++------------
 include/linux/sunrpc/xprt.h            |   3 +-
 net/sunrpc/xdr.c                       |   4 +-
 net/sunrpc/xprt.c                      | 117 ++++++++++++++++++-------
 net/sunrpc/xprtrdma/module.c           |   1 +
 net/sunrpc/xprtrdma/transport.c        |   1 +
 net/sunrpc/xprtsock.c                  |   4 +
 10 files changed, 159 insertions(+), 106 deletions(-)

Comments

Trond Myklebust Nov. 10, 2020, 11:42 p.m. UTC | #1
On Tue, 2020-11-10 at 18:18 -0500, trondmy@kernel.org wrote:
> From: Trond Myklebust <trond.myklebust@hammerspace.com>
> 
> Add support for connecting to the pNFS files/flexfiles data servers
> through RDMA, assuming that the GETDEVICEINFO call advertises that
> support.
> 
> v2: Fix layoutstats encoding for pNFS/flexfiles.
> v3: Move most of the netid handling into the SUNRPC and RDMA modules.
>     Fix up the mount code to benefit more from automated loading of
>     SUNRPC transport modules.
> 

Note that one cleanup that I did not perform, but which really could be
useful should we want to add more transport mechanisms, is to move the
code to parse stringified addresses (in particular IETF style
"universal addresses") into the transport modules so that the actual
parsing of mount and pNFS transport info can be automatically extended
when new transport modules are added.
Mkrtchyan, Tigran Nov. 13, 2020, 12:48 p.m. UTC | #2
Hi Trond,

whit this changes (3ee69a14f92d74ced2647140b3799511ba4f3fa5) I see an infinite loop
of LAYOUTGET->GETDEVICEINFO->LAYOUTRETURN without any attempt to connect to a DS.

This is how the response to LAYOUTGET looks like.

Network File System, Ops(2): SEQUENCE GETDEVINFO
    [Program Version: 4]
    [V4 Procedure: COMPOUND (1)]
    Status: NFS4_OK (0)
    Tag: <EMPTY>
        length: 0
        contents: <EMPTY>
    Operations (count: 2)
        Opcode: SEQUENCE (53)
        Opcode: GETDEVINFO (47)
            Status: NFS4_OK (0)
            layout type: LAYOUT4_FLEX_FILES (4)
            r_netid: tcp
                length: 3
                contents: tcp
                fill bytes: opaque data
            r_addr: 131.169.191.143.125.49
                length: 22
                contents: 131.169.191.143.125.49
                fill bytes: opaque data
            version: 4
            minorversion: 1
            max_rsize: 1048576
            max_wsize: 1048576
            tightly coupled: Yes
            notify_mask: 0x00000006 (Change, Delete)
                notify_type: Change (1)
                notify_type: Delete (2)
    [Main Opcode: GETDEVINFO (47)]



The MDS is mounted with IPv4. I can provide the full packet trace, if needed.


Regards,
   Tigran.


----- Original Message -----
> From: "trondmy" <trondmy@hammerspace.com>
> To: "linux-nfs" <linux-nfs@vger.kernel.org>
> Sent: Wednesday, 11 November, 2020 00:42:31
> Subject: Re: [PATCH v3 00/11] Add RDMA support to the pNFS file+flexfiles data channels

> On Tue, 2020-11-10 at 18:18 -0500, trondmy@kernel.org wrote:
>> From: Trond Myklebust <trond.myklebust@hammerspace.com>
>> 
>> Add support for connecting to the pNFS files/flexfiles data servers
>> through RDMA, assuming that the GETDEVICEINFO call advertises that
>> support.
>> 
>> v2: Fix layoutstats encoding for pNFS/flexfiles.
>> v3: Move most of the netid handling into the SUNRPC and RDMA modules.
>>     Fix up the mount code to benefit more from automated loading of
>>     SUNRPC transport modules.
>> 
> 
> Note that one cleanup that I did not perform, but which really could be
> useful should we want to add more transport mechanisms, is to move the
> code to parse stringified addresses (in particular IETF style
> "universal addresses") into the transport modules so that the actual
> parsing of mount and pNFS transport info can be automatically extended
> when new transport modules are added.
> 
> --
> Trond Myklebust
> Linux NFS client maintainer, Hammerspace
> trond.myklebust@hammerspace.com
Mkrtchyan, Tigran Nov. 13, 2020, 9:30 p.m. UTC | #3
After more testing, it looks like that client doesn't like
notification bitmap:


[31576.789492] --> _nfs4_proc_getdeviceinfo
[31576.789503] --> nfs41_call_sync_prepare data->seq_server 000000001d17c43e
[31576.789507] --> nfs4_alloc_slot used_slots=0000 highest_used=4294967295 max_slots=16
[31576.789510] <-- nfs4_alloc_slot used_slots=0001 highest_used=0 slotid=0
[31576.789527] encode_sequence: sessionid=2910695007:150995712:0:16777216 seqid=92462 slotid=0 max_slotid=0 cache_this=0
[31576.789991] decode_getdeviceinfo: unsupported notification
[31576.790003] --> nfs4_alloc_slot used_slots=0001 highest_used=0 max_slots=16
[31576.790007] <-- nfs4_alloc_slot used_slots=0003 highest_used=1 slotid=1
[31576.790010] nfs4_free_slot: slotid 1 highest_used_slotid 0
[31576.790013] nfs41_sequence_process: Error 0 free the slot
[31576.790017] nfs4_free_slot: slotid 0 highest_used_slotid 4294967295
[31576.790030] <-- _nfs4_proc_getdeviceinfo status=-5
[31576.790034] nfs4_get_device_info getdevice info returns -5
[31576.790084] <-- nfs4_get_device_info d 0000000000000000


Tigran.


----- Original Message -----
> From: "Tigran Mkrtchyan" <tigran.mkrtchyan@desy.de>
> To: "trondmy" <trondmy@hammerspace.com>
> Cc: "linux-nfs" <linux-nfs@vger.kernel.org>
> Sent: Friday, 13 November, 2020 13:48:07
> Subject: Re: [PATCH v3 00/11] Add RDMA support to the pNFS file+flexfiles data channels

> Hi Trond,
> 
> whit this changes (3ee69a14f92d74ced2647140b3799511ba4f3fa5) I see an infinite
> loop
> of LAYOUTGET->GETDEVICEINFO->LAYOUTRETURN without any attempt to connect to a
> DS.
> 
> This is how the response to LAYOUTGET looks like.
> 
> Network File System, Ops(2): SEQUENCE GETDEVINFO
>    [Program Version: 4]
>    [V4 Procedure: COMPOUND (1)]
>    Status: NFS4_OK (0)
>    Tag: <EMPTY>
>        length: 0
>        contents: <EMPTY>
>    Operations (count: 2)
>        Opcode: SEQUENCE (53)
>        Opcode: GETDEVINFO (47)
>            Status: NFS4_OK (0)
>            layout type: LAYOUT4_FLEX_FILES (4)
>            r_netid: tcp
>                length: 3
>                contents: tcp
>                fill bytes: opaque data
>            r_addr: 131.169.191.143.125.49
>                length: 22
>                contents: 131.169.191.143.125.49
>                fill bytes: opaque data
>            version: 4
>            minorversion: 1
>            max_rsize: 1048576
>            max_wsize: 1048576
>            tightly coupled: Yes
>            notify_mask: 0x00000006 (Change, Delete)
>                notify_type: Change (1)
>                notify_type: Delete (2)
>    [Main Opcode: GETDEVINFO (47)]
> 
> 
> 
> The MDS is mounted with IPv4. I can provide the full packet trace, if needed.
> 
> 
> Regards,
>   Tigran.
> 
> 
> ----- Original Message -----
>> From: "trondmy" <trondmy@hammerspace.com>
>> To: "linux-nfs" <linux-nfs@vger.kernel.org>
>> Sent: Wednesday, 11 November, 2020 00:42:31
>> Subject: Re: [PATCH v3 00/11] Add RDMA support to the pNFS file+flexfiles data
>> channels
> 
>> On Tue, 2020-11-10 at 18:18 -0500, trondmy@kernel.org wrote:
>>> From: Trond Myklebust <trond.myklebust@hammerspace.com>
>>> 
>>> Add support for connecting to the pNFS files/flexfiles data servers
>>> through RDMA, assuming that the GETDEVICEINFO call advertises that
>>> support.
>>> 
>>> v2: Fix layoutstats encoding for pNFS/flexfiles.
>>> v3: Move most of the netid handling into the SUNRPC and RDMA modules.
>>>     Fix up the mount code to benefit more from automated loading of
>>>     SUNRPC transport modules.
>>> 
>> 
>> Note that one cleanup that I did not perform, but which really could be
>> useful should we want to add more transport mechanisms, is to move the
>> code to parse stringified addresses (in particular IETF style
>> "universal addresses") into the transport modules so that the actual
>> parsing of mount and pNFS transport info can be automatically extended
>> when new transport modules are added.
>> 
>> --
>> Trond Myklebust
>> Linux NFS client maintainer, Hammerspace
> > trond.myklebust@hammerspace.com
Trond Myklebust Nov. 13, 2020, 10:45 p.m. UTC | #4
On Fri, 2020-11-13 at 22:30 +0100, Mkrtchyan, Tigran wrote:
> 
> After more testing, it looks like that client doesn't like
> notification bitmap:
> 
> 
> [31576.789492] --> _nfs4_proc_getdeviceinfo
> [31576.789503] --> nfs41_call_sync_prepare data->seq_server
> 000000001d17c43e
> [31576.789507] --> nfs4_alloc_slot used_slots=0000
> highest_used=4294967295 max_slots=16
> [31576.789510] <-- nfs4_alloc_slot used_slots=0001 highest_used=0
> slotid=0
> [31576.789527] encode_sequence:
> sessionid=2910695007:150995712:0:16777216 seqid=92462 slotid=0
> max_slotid=0 cache_this=0
> [31576.789991] decode_getdeviceinfo: unsupported notification

According to this, you appear to be returning a deviceinfo bitmap with
at least one non-zero entry that is not in the first 32-bit word. We
only ask for notifications for NOTIFY_DEVICEID4_CHANGE and
NOTIFY_DEVICEID4_DELETE, so we only expect bitmap[0] to have non-zero
entries.

> [31576.790003] --> nfs4_alloc_slot used_slots=0001 highest_used=0
> max_slots=16
> [31576.790007] <-- nfs4_alloc_slot used_slots=0003 highest_used=1
> slotid=1
> [31576.790010] nfs4_free_slot: slotid 1 highest_used_slotid 0
> [31576.790013] nfs41_sequence_process: Error 0 free the slot
> [31576.790017] nfs4_free_slot: slotid 0 highest_used_slotid
> 4294967295
> [31576.790030] <-- _nfs4_proc_getdeviceinfo status=-5
> [31576.790034] nfs4_get_device_info getdevice info returns -5
> [31576.790084] <-- nfs4_get_device_info d 0000000000000000
> 
> 
> Tigran.
> 
> 
> ----- Original Message -----
> > From: "Tigran Mkrtchyan" <tigran.mkrtchyan@desy.de>
> > To: "trondmy" <trondmy@hammerspace.com>
> > Cc: "linux-nfs" <linux-nfs@vger.kernel.org>
> > Sent: Friday, 13 November, 2020 13:48:07
> > Subject: Re: [PATCH v3 00/11] Add RDMA support to the pNFS
> > file+flexfiles data channels
> 
> > Hi Trond,
> > 
> > whit this changes (3ee69a14f92d74ced2647140b3799511ba4f3fa5) I see
> > an infinite
> > loop
> > of LAYOUTGET->GETDEVICEINFO->LAYOUTRETURN without any attempt to
> > connect to a
> > DS.
> > 
> > This is how the response to LAYOUTGET looks like.
> > 
> > Network File System, Ops(2): SEQUENCE GETDEVINFO
> >    [Program Version: 4]
> >    [V4 Procedure: COMPOUND (1)]
> >    Status: NFS4_OK (0)
> >    Tag: <EMPTY>
> >        length: 0
> >        contents: <EMPTY>
> >    Operations (count: 2)
> >        Opcode: SEQUENCE (53)
> >        Opcode: GETDEVINFO (47)
> >            Status: NFS4_OK (0)
> >            layout type: LAYOUT4_FLEX_FILES (4)
> >            r_netid: tcp
> >                length: 3
> >                contents: tcp
> >                fill bytes: opaque data
> >            r_addr: 131.169.191.143.125.49
> >                length: 22
> >                contents: 131.169.191.143.125.49
> >                fill bytes: opaque data
> >            version: 4
> >            minorversion: 1
> >            max_rsize: 1048576
> >            max_wsize: 1048576
> >            tightly coupled: Yes
> >            notify_mask: 0x00000006 (Change, Delete)
> >                notify_type: Change (1)
> >                notify_type: Delete (2)
> >    [Main Opcode: GETDEVINFO (47)]
> > 
> > 
> > 
> > The MDS is mounted with IPv4. I can provide the full packet trace,
> > if needed.
> > 
> > 
> > Regards,
> >   Tigran.
> > 
> > 
> > ----- Original Message -----
> > > From: "trondmy" <trondmy@hammerspace.com>
> > > To: "linux-nfs" <linux-nfs@vger.kernel.org>
> > > Sent: Wednesday, 11 November, 2020 00:42:31
> > > Subject: Re: [PATCH v3 00/11] Add RDMA support to the pNFS
> > > file+flexfiles data
> > > channels
> > 
> > > On Tue, 2020-11-10 at 18:18 -0500, trondmy@kernel.org wrote:
> > > > From: Trond Myklebust <trond.myklebust@hammerspace.com>
> > > > 
> > > > Add support for connecting to the pNFS files/flexfiles data
> > > > servers
> > > > through RDMA, assuming that the GETDEVICEINFO call advertises
> > > > that
> > > > support.
> > > > 
> > > > v2: Fix layoutstats encoding for pNFS/flexfiles.
> > > > v3: Move most of the netid handling into the SUNRPC and RDMA
> > > > modules.
> > > >     Fix up the mount code to benefit more from automated
> > > > loading of
> > > >     SUNRPC transport modules.
> > > > 
> > > 
> > > Note that one cleanup that I did not perform, but which really
> > > could be
> > > useful should we want to add more transport mechanisms, is to
> > > move the
> > > code to parse stringified addresses (in particular IETF style
> > > "universal addresses") into the transport modules so that the
> > > actual
> > > parsing of mount and pNFS transport info can be automatically
> > > extended
> > > when new transport modules are added.
> > > 
> > > --
> > > Trond Myklebust
> > > Linux NFS client maintainer, Hammerspace
> > > trond.myklebust@hammerspace.com
Mkrtchyan, Tigran Nov. 13, 2020, 11:46 p.m. UTC | #5
----- Original Message -----
> From: "trondmy" <trondmy@hammerspace.com>
> To: "Tigran Mkrtchyan" <tigran.mkrtchyan@desy.de>
> Cc: "linux-nfs" <linux-nfs@vger.kernel.org>
> Sent: Friday, 13 November, 2020 23:45:00
> Subject: Re: [PATCH v3 00/11] Add RDMA support to the pNFS file+flexfiles data channels

> On Fri, 2020-11-13 at 22:30 +0100, Mkrtchyan, Tigran wrote:
>> 
>> After more testing, it looks like that client doesn't like
>> notification bitmap:
>> 
>> 
>> [31576.789492] --> _nfs4_proc_getdeviceinfo
>> [31576.789503] --> nfs41_call_sync_prepare data->seq_server
>> 000000001d17c43e
>> [31576.789507] --> nfs4_alloc_slot used_slots=0000
>> highest_used=4294967295 max_slots=16
>> [31576.789510] <-- nfs4_alloc_slot used_slots=0001 highest_used=0
>> slotid=0
>> [31576.789527] encode_sequence:
>> sessionid=2910695007:150995712:0:16777216 seqid=92462 slotid=0
>> max_slotid=0 cache_this=0
>> [31576.789991] decode_getdeviceinfo: unsupported notification
> 
> According to this, you appear to be returning a deviceinfo bitmap with
> at least one non-zero entry that is not in the first 32-bit word. We
> only ask for notifications for NOTIFY_DEVICEID4_CHANGE and
> NOTIFY_DEVICEID4_DELETE, so we only expect bitmap[0] to have non-zero
> entries.


according to packet capture only bitmap[0] has non zero bits set.
This is the reply of compound starting from nfs staus code, tag
length and so on.


0000   00 00 00 00 00 00 00 00 00 00 00 02 00 00 00 35
0010   00 00 00 00 5f ae 7d ad 00 03 00 09 00 00 00 00
0020   00 00 00 01 00 00 00 4c 00 00 00 00 00 00 00 0f
0030   00 00 00 0f 00 00 00 00 00 00 00 2f 00 00 00 00
0040   00 00 00 04 00 00 00 40 00 00 00 01 00 00 00 03
0050   74 63 70 00 00 00 00 16 31 33 31 2e 31 36 39 2e
0060   31 39 31 2e 31 34 33 2e 31 32 35 2e 34 39 00 00
0070   00 00 00 01 00 00 00 04 00 00 00 01 00 10 00 00
0080   00 10 00 00 00 00 00 01 00 00 00 02 00 00 00 06
0090   00 00 00 00


the last 12 bytes : bitmap size, bitmap[0], bitmap[1]


This part of code in the didn't change since 2010, and I
have no issues to use 5.8 kernel. I am pretty sure, that
tests with 5.9 did pass as expected. I will try to bisec it.

Tigran.

> 
>> [31576.790003] --> nfs4_alloc_slot used_slots=0001 highest_used=0
>> max_slots=16
>> [31576.790007] <-- nfs4_alloc_slot used_slots=0003 highest_used=1
>> slotid=1
>> [31576.790010] nfs4_free_slot: slotid 1 highest_used_slotid 0
>> [31576.790013] nfs41_sequence_process: Error 0 free the slot
>> [31576.790017] nfs4_free_slot: slotid 0 highest_used_slotid
>> 4294967295
>> [31576.790030] <-- _nfs4_proc_getdeviceinfo status=-5
>> [31576.790034] nfs4_get_device_info getdevice info returns -5
>> [31576.790084] <-- nfs4_get_device_info d 0000000000000000
>> 
>> 
>> Tigran.
>> 
>> 
>> ----- Original Message -----
>> > From: "Tigran Mkrtchyan" <tigran.mkrtchyan@desy.de>
>> > To: "trondmy" <trondmy@hammerspace.com>
>> > Cc: "linux-nfs" <linux-nfs@vger.kernel.org>
>> > Sent: Friday, 13 November, 2020 13:48:07
>> > Subject: Re: [PATCH v3 00/11] Add RDMA support to the pNFS
>> > file+flexfiles data channels
>> 
>> > Hi Trond,
>> > 
>> > whit this changes (3ee69a14f92d74ced2647140b3799511ba4f3fa5) I see
>> > an infinite
>> > loop
>> > of LAYOUTGET->GETDEVICEINFO->LAYOUTRETURN without any attempt to
>> > connect to a
>> > DS.
>> > 
>> > This is how the response to LAYOUTGET looks like.
>> > 
>> > Network File System, Ops(2): SEQUENCE GETDEVINFO
>> >    [Program Version: 4]
>> >    [V4 Procedure: COMPOUND (1)]
>> >    Status: NFS4_OK (0)
>> >    Tag: <EMPTY>
>> >        length: 0
>> >        contents: <EMPTY>
>> >    Operations (count: 2)
>> >        Opcode: SEQUENCE (53)
>> >        Opcode: GETDEVINFO (47)
>> >            Status: NFS4_OK (0)
>> >            layout type: LAYOUT4_FLEX_FILES (4)
>> >            r_netid: tcp
>> >                length: 3
>> >                contents: tcp
>> >                fill bytes: opaque data
>> >            r_addr: 131.169.191.143.125.49
>> >                length: 22
>> >                contents: 131.169.191.143.125.49
>> >                fill bytes: opaque data
>> >            version: 4
>> >            minorversion: 1
>> >            max_rsize: 1048576
>> >            max_wsize: 1048576
>> >            tightly coupled: Yes
>> >            notify_mask: 0x00000006 (Change, Delete)
>> >                notify_type: Change (1)
>> >                notify_type: Delete (2)
>> >    [Main Opcode: GETDEVINFO (47)]
>> > 
>> > 
>> > 
>> > The MDS is mounted with IPv4. I can provide the full packet trace,
>> > if needed.
>> > 
>> > 
>> > Regards,
>> >   Tigran.
>> > 
>> > 
>> > ----- Original Message -----
>> > > From: "trondmy" <trondmy@hammerspace.com>
>> > > To: "linux-nfs" <linux-nfs@vger.kernel.org>
>> > > Sent: Wednesday, 11 November, 2020 00:42:31
>> > > Subject: Re: [PATCH v3 00/11] Add RDMA support to the pNFS
>> > > file+flexfiles data
>> > > channels
>> > 
>> > > On Tue, 2020-11-10 at 18:18 -0500, trondmy@kernel.org wrote:
>> > > > From: Trond Myklebust <trond.myklebust@hammerspace.com>
>> > > > 
>> > > > Add support for connecting to the pNFS files/flexfiles data
>> > > > servers
>> > > > through RDMA, assuming that the GETDEVICEINFO call advertises
>> > > > that
>> > > > support.
>> > > > 
>> > > > v2: Fix layoutstats encoding for pNFS/flexfiles.
>> > > > v3: Move most of the netid handling into the SUNRPC and RDMA
>> > > > modules.
>> > > >     Fix up the mount code to benefit more from automated
>> > > > loading of
>> > > >     SUNRPC transport modules.
>> > > > 
>> > > 
>> > > Note that one cleanup that I did not perform, but which really
>> > > could be
>> > > useful should we want to add more transport mechanisms, is to
>> > > move the
>> > > code to parse stringified addresses (in particular IETF style
>> > > "universal addresses") into the transport modules so that the
>> > > actual
>> > > parsing of mount and pNFS transport info can be automatically
>> > > extended
>> > > when new transport modules are added.
>> > > 
>> > > --
>> > > Trond Myklebust
>> > > Linux NFS client maintainer, Hammerspace
>> > > trond.myklebust@hammerspace.com
> 
> --
> Trond Myklebust
> CTO, Hammerspace Inc
> 4984 El Camino Real, Suite 208
> Los Altos, CA 94022
> 
> www.hammer.space
Trond Myklebust Nov. 14, 2020, 2:29 p.m. UTC | #6
On Sat, 2020-11-14 at 00:46 +0100, Mkrtchyan, Tigran wrote:
> 
> 
> ----- Original Message -----
> > From: "trondmy" <trondmy@hammerspace.com>
> > To: "Tigran Mkrtchyan" <tigran.mkrtchyan@desy.de>
> > Cc: "linux-nfs" <linux-nfs@vger.kernel.org>
> > Sent: Friday, 13 November, 2020 23:45:00
> > Subject: Re: [PATCH v3 00/11] Add RDMA support to the pNFS
> > file+flexfiles data channels
> 
> > On Fri, 2020-11-13 at 22:30 +0100, Mkrtchyan, Tigran wrote:
> > > 
> > > After more testing, it looks like that client doesn't like
> > > notification bitmap:
> > > 
> > > 
> > > [31576.789492] --> _nfs4_proc_getdeviceinfo
> > > [31576.789503] --> nfs41_call_sync_prepare data->seq_server
> > > 000000001d17c43e
> > > [31576.789507] --> nfs4_alloc_slot used_slots=0000
> > > highest_used=4294967295 max_slots=16
> > > [31576.789510] <-- nfs4_alloc_slot used_slots=0001 highest_used=0
> > > slotid=0
> > > [31576.789527] encode_sequence:
> > > sessionid=2910695007:150995712:0:16777216 seqid=92462 slotid=0
> > > max_slotid=0 cache_this=0
> > > [31576.789991] decode_getdeviceinfo: unsupported notification
> > 
> > According to this, you appear to be returning a deviceinfo bitmap
> > with
> > at least one non-zero entry that is not in the first 32-bit word.
> > We
> > only ask for notifications for NOTIFY_DEVICEID4_CHANGE and
> > NOTIFY_DEVICEID4_DELETE, so we only expect bitmap[0] to have non-
> > zero
> > entries.
> 
> 
> according to packet capture only bitmap[0] has non zero bits set.
> This is the reply of compound starting from nfs staus code, tag
> length and so on.
> 
> 
> 0000   00 00 00 00 00 00 00 00 00 00 00 02 00 00 00 35
> 0010   00 00 00 00 5f ae 7d ad 00 03 00 09 00 00 00 00
> 0020   00 00 00 01 00 00 00 4c 00 00 00 00 00 00 00 0f
> 0030   00 00 00 0f 00 00 00 00 00 00 00 2f 00 00 00 00
> 0040   00 00 00 04 00 00 00 40 00 00 00 01 00 00 00 03
> 0050   74 63 70 00 00 00 00 16 31 33 31 2e 31 36 39 2e
> 0060   31 39 31 2e 31 34 33 2e 31 32 35 2e 34 39 00 00
> 0070   00 00 00 01 00 00 00 04 00 00 00 01 00 10 00 00
> 0080   00 10 00 00 00 00 00 01 00 00 00 02 00 00 00 06
> 0090   00 00 00 00
> 
> 
> the last 12 bytes : bitmap size, bitmap[0], bitmap[1]
> 
> 
> This part of code in the didn't change since 2010, and I
> have no issues to use 5.8 kernel. I am pretty sure, that
> tests with 5.9 did pass as expected. I will try to bisec it.

I don't think I've introduced this bug. I did not touch anything in the
getdeviceinfo proc or XDR code.
Does the following patch help?

8<-------------------------------------------------------
From e92b2d4e39e91d379ec1147115820ab5dfe4c89a Mon Sep 17 00:00:00 2001
From: Trond Myklebust <trond.myklebust@hammerspace.com>
Date: Fri, 13 Nov 2020 21:42:16 -0500
Subject: [PATCH] NFSv4: Fix the alignment of page data in the getdeviceinfo
 reply

We can fit the device_addr4 opaque data padding in the pages.

Fixes: cf500bac8fd4 ("SUNRPC: Introduce rpc_prepare_reply_pages()")
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
---
 fs/nfs/nfs4xdr.c | 14 ++++++++++----
 1 file changed, 10 insertions(+), 4 deletions(-)

diff --git a/fs/nfs/nfs4xdr.c b/fs/nfs/nfs4xdr.c
index c6dbfcae7517..c8714381d511 100644
--- a/fs/nfs/nfs4xdr.c
+++ b/fs/nfs/nfs4xdr.c
@@ -3009,15 +3009,19 @@ static void nfs4_xdr_enc_getdeviceinfo(struct rpc_rqst *req,
 	struct compound_hdr hdr = {
 		.minorversion = nfs4_xdr_minorversion(&args->seq_args),
 	};
+	uint32_t replen;
 
 	encode_compound_hdr(xdr, req, &hdr);
 	encode_sequence(xdr, &args->seq_args, &hdr);
+
+	replen = hdr.replen + op_decode_hdr_maxsz;
+
 	encode_getdeviceinfo(xdr, args, &hdr);
 
-	/* set up reply kvec. Subtract notification bitmap max size (2)
-	 * so that notification bitmap is put in xdr_buf tail */
+	/* set up reply kvec. device_addr4 opaque data is read into the
+	 * pages */
 	rpc_prepare_reply_pages(req, args->pdev->pages, args->pdev->pgbase,
-				args->pdev->pglen, hdr.replen - 2);
+				args->pdev->pglen, replen + 2);
 	encode_nops(&hdr);
 }
 
@@ -5848,7 +5852,9 @@ static int decode_getdeviceinfo(struct xdr_stream *xdr,
 	 * and places the remaining xdr data in xdr_buf->tail
 	 */
 	pdev->mincount = be32_to_cpup(p);
-	if (xdr_read_pages(xdr, pdev->mincount) != pdev->mincount)
+	/* Calculate padding */
+	len = xdr_align_size(pdev->mincount);
+	if (xdr_read_pages(xdr, len) != len)
 		return -EIO;
 
 	/* Parse notification bitmap, verifying that it is zero. */
Mkrtchyan, Tigran Nov. 16, 2020, 8:55 p.m. UTC | #7
Hi Trond,

I am afraid, that the fix didn't work. I bisecting it.... 


Tigran.


----- Original Message -----
> From: "trondmy" <trondmy@hammerspace.com>
> To: "Tigran Mkrtchyan" <tigran.mkrtchyan@desy.de>
> Cc: "linux-nfs" <linux-nfs@vger.kernel.org>
> Sent: Saturday, 14 November, 2020 15:29:01
> Subject: Re: [PATCH v3 00/11] Add RDMA support to the pNFS file+flexfiles data channels

> On Sat, 2020-11-14 at 00:46 +0100, Mkrtchyan, Tigran wrote:
>> 
>> 
>> ----- Original Message -----
>> > From: "trondmy" <trondmy@hammerspace.com>
>> > To: "Tigran Mkrtchyan" <tigran.mkrtchyan@desy.de>
>> > Cc: "linux-nfs" <linux-nfs@vger.kernel.org>
>> > Sent: Friday, 13 November, 2020 23:45:00
>> > Subject: Re: [PATCH v3 00/11] Add RDMA support to the pNFS
>> > file+flexfiles data channels
>> 
>> > On Fri, 2020-11-13 at 22:30 +0100, Mkrtchyan, Tigran wrote:
>> > > 
>> > > After more testing, it looks like that client doesn't like
>> > > notification bitmap:
>> > > 
>> > > 
>> > > [31576.789492] --> _nfs4_proc_getdeviceinfo
>> > > [31576.789503] --> nfs41_call_sync_prepare data->seq_server
>> > > 000000001d17c43e
>> > > [31576.789507] --> nfs4_alloc_slot used_slots=0000
>> > > highest_used=4294967295 max_slots=16
>> > > [31576.789510] <-- nfs4_alloc_slot used_slots=0001 highest_used=0
>> > > slotid=0
>> > > [31576.789527] encode_sequence:
>> > > sessionid=2910695007:150995712:0:16777216 seqid=92462 slotid=0
>> > > max_slotid=0 cache_this=0
>> > > [31576.789991] decode_getdeviceinfo: unsupported notification
>> > 
>> > According to this, you appear to be returning a deviceinfo bitmap
>> > with
>> > at least one non-zero entry that is not in the first 32-bit word.
>> > We
>> > only ask for notifications for NOTIFY_DEVICEID4_CHANGE and
>> > NOTIFY_DEVICEID4_DELETE, so we only expect bitmap[0] to have non-
>> > zero
>> > entries.
>> 
>> 
>> according to packet capture only bitmap[0] has non zero bits set.
>> This is the reply of compound starting from nfs staus code, tag
>> length and so on.
>> 
>> 
>> 0000   00 00 00 00 00 00 00 00 00 00 00 02 00 00 00 35
>> 0010   00 00 00 00 5f ae 7d ad 00 03 00 09 00 00 00 00
>> 0020   00 00 00 01 00 00 00 4c 00 00 00 00 00 00 00 0f
>> 0030   00 00 00 0f 00 00 00 00 00 00 00 2f 00 00 00 00
>> 0040   00 00 00 04 00 00 00 40 00 00 00 01 00 00 00 03
>> 0050   74 63 70 00 00 00 00 16 31 33 31 2e 31 36 39 2e
>> 0060   31 39 31 2e 31 34 33 2e 31 32 35 2e 34 39 00 00
>> 0070   00 00 00 01 00 00 00 04 00 00 00 01 00 10 00 00
>> 0080   00 10 00 00 00 00 00 01 00 00 00 02 00 00 00 06
>> 0090   00 00 00 00
>> 
>> 
>> the last 12 bytes : bitmap size, bitmap[0], bitmap[1]
>> 
>> 
>> This part of code in the didn't change since 2010, and I
>> have no issues to use 5.8 kernel. I am pretty sure, that
>> tests with 5.9 did pass as expected. I will try to bisec it.
> 
> I don't think I've introduced this bug. I did not touch anything in the
> getdeviceinfo proc or XDR code.
> Does the following patch help?
> 
> 8<-------------------------------------------------------
> From e92b2d4e39e91d379ec1147115820ab5dfe4c89a Mon Sep 17 00:00:00 2001
> From: Trond Myklebust <trond.myklebust@hammerspace.com>
> Date: Fri, 13 Nov 2020 21:42:16 -0500
> Subject: [PATCH] NFSv4: Fix the alignment of page data in the getdeviceinfo
> reply
> 
> We can fit the device_addr4 opaque data padding in the pages.
> 
> Fixes: cf500bac8fd4 ("SUNRPC: Introduce rpc_prepare_reply_pages()")
> Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
> ---
> fs/nfs/nfs4xdr.c | 14 ++++++++++----
> 1 file changed, 10 insertions(+), 4 deletions(-)
> 
> diff --git a/fs/nfs/nfs4xdr.c b/fs/nfs/nfs4xdr.c
> index c6dbfcae7517..c8714381d511 100644
> --- a/fs/nfs/nfs4xdr.c
> +++ b/fs/nfs/nfs4xdr.c
> @@ -3009,15 +3009,19 @@ static void nfs4_xdr_enc_getdeviceinfo(struct rpc_rqst
> *req,
> 	struct compound_hdr hdr = {
> 		.minorversion = nfs4_xdr_minorversion(&args->seq_args),
> 	};
> +	uint32_t replen;
> 
> 	encode_compound_hdr(xdr, req, &hdr);
> 	encode_sequence(xdr, &args->seq_args, &hdr);
> +
> +	replen = hdr.replen + op_decode_hdr_maxsz;
> +
> 	encode_getdeviceinfo(xdr, args, &hdr);
> 
> -	/* set up reply kvec. Subtract notification bitmap max size (2)
> -	 * so that notification bitmap is put in xdr_buf tail */
> +	/* set up reply kvec. device_addr4 opaque data is read into the
> +	 * pages */
> 	rpc_prepare_reply_pages(req, args->pdev->pages, args->pdev->pgbase,
> -				args->pdev->pglen, hdr.replen - 2);
> +				args->pdev->pglen, replen + 2);
> 	encode_nops(&hdr);
> }
> 
> @@ -5848,7 +5852,9 @@ static int decode_getdeviceinfo(struct xdr_stream *xdr,
> 	 * and places the remaining xdr data in xdr_buf->tail
> 	 */
> 	pdev->mincount = be32_to_cpup(p);
> -	if (xdr_read_pages(xdr, pdev->mincount) != pdev->mincount)
> +	/* Calculate padding */
> +	len = xdr_align_size(pdev->mincount);
> +	if (xdr_read_pages(xdr, len) != len)
> 		return -EIO;
> 
> 	/* Parse notification bitmap, verifying that it is zero. */
> --
> 2.28.0
> 
> 
> 
> --
> Trond Myklebust
> Linux NFS client maintainer, Hammerspace
> trond.myklebust@hammerspace.com
Mkrtchyan, Tigran Nov. 17, 2020, 2:50 p.m. UTC | #8
Here is the result:


$ git bisect bad 
c567552612ece787b178e3b147b5854ad422a836 is the first bad commit
commit c567552612ece787b178e3b147b5854ad422a836
Author: Anna Schumaker <Anna.Schumaker@Netapp.com>
Date:   Wed May 28 13:41:22 2014 -0400

    NFS: Add READ_PLUS data segment support
    
    This patch adds client support for decoding a single NFS4_CONTENT_DATA
    segment returned by the server. This is the simplest implementation
    possible, since it does not account for any hole segments in the reply.
    
    Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>

 fs/nfs/nfs42xdr.c         | 141 ++++++++++++++++++++++++++++++++++++++++++++++
 fs/nfs/nfs4client.c       |   2 +
 fs/nfs/nfs4proc.c         |  43 +++++++++++++-
 fs/nfs/nfs4xdr.c          |   1 +
 include/linux/nfs4.h      |   2 +-
 include/linux/nfs_fs_sb.h |   1 +
 include/linux/nfs_xdr.h   |   2 +-
 7 files changed, 187 insertions(+), 5 deletions(-)


Regards,
   Tigran.



----- Original Message -----
> From: "Tigran Mkrtchyan" <tigran.mkrtchyan@desy.de>
> To: "trondmy" <trondmy@hammerspace.com>
> Cc: "linux-nfs" <linux-nfs@vger.kernel.org>
> Sent: Monday, 16 November, 2020 21:55:50
> Subject: Re: [PATCH v3 00/11] Add RDMA support to the pNFS file+flexfiles data channels

> Hi Trond,
> 
> I am afraid, that the fix didn't work. I bisecting it....
> 
> 
> Tigran.
> 
> 
> ----- Original Message -----
>> From: "trondmy" <trondmy@hammerspace.com>
>> To: "Tigran Mkrtchyan" <tigran.mkrtchyan@desy.de>
>> Cc: "linux-nfs" <linux-nfs@vger.kernel.org>
>> Sent: Saturday, 14 November, 2020 15:29:01
>> Subject: Re: [PATCH v3 00/11] Add RDMA support to the pNFS file+flexfiles data
>> channels
> 
>> On Sat, 2020-11-14 at 00:46 +0100, Mkrtchyan, Tigran wrote:
>>> 
>>> 
>>> ----- Original Message -----
>>> > From: "trondmy" <trondmy@hammerspace.com>
>>> > To: "Tigran Mkrtchyan" <tigran.mkrtchyan@desy.de>
>>> > Cc: "linux-nfs" <linux-nfs@vger.kernel.org>
>>> > Sent: Friday, 13 November, 2020 23:45:00
>>> > Subject: Re: [PATCH v3 00/11] Add RDMA support to the pNFS
>>> > file+flexfiles data channels
>>> 
>>> > On Fri, 2020-11-13 at 22:30 +0100, Mkrtchyan, Tigran wrote:
>>> > > 
>>> > > After more testing, it looks like that client doesn't like
>>> > > notification bitmap:
>>> > > 
>>> > > 
>>> > > [31576.789492] --> _nfs4_proc_getdeviceinfo
>>> > > [31576.789503] --> nfs41_call_sync_prepare data->seq_server
>>> > > 000000001d17c43e
>>> > > [31576.789507] --> nfs4_alloc_slot used_slots=0000
>>> > > highest_used=4294967295 max_slots=16
>>> > > [31576.789510] <-- nfs4_alloc_slot used_slots=0001 highest_used=0
>>> > > slotid=0
>>> > > [31576.789527] encode_sequence:
>>> > > sessionid=2910695007:150995712:0:16777216 seqid=92462 slotid=0
>>> > > max_slotid=0 cache_this=0
>>> > > [31576.789991] decode_getdeviceinfo: unsupported notification
>>> > 
>>> > According to this, you appear to be returning a deviceinfo bitmap
>>> > with
>>> > at least one non-zero entry that is not in the first 32-bit word.
>>> > We
>>> > only ask for notifications for NOTIFY_DEVICEID4_CHANGE and
>>> > NOTIFY_DEVICEID4_DELETE, so we only expect bitmap[0] to have non-
>>> > zero
>>> > entries.
>>> 
>>> 
>>> according to packet capture only bitmap[0] has non zero bits set.
>>> This is the reply of compound starting from nfs staus code, tag
>>> length and so on.
>>> 
>>> 
>>> 0000   00 00 00 00 00 00 00 00 00 00 00 02 00 00 00 35
>>> 0010   00 00 00 00 5f ae 7d ad 00 03 00 09 00 00 00 00
>>> 0020   00 00 00 01 00 00 00 4c 00 00 00 00 00 00 00 0f
>>> 0030   00 00 00 0f 00 00 00 00 00 00 00 2f 00 00 00 00
>>> 0040   00 00 00 04 00 00 00 40 00 00 00 01 00 00 00 03
>>> 0050   74 63 70 00 00 00 00 16 31 33 31 2e 31 36 39 2e
>>> 0060   31 39 31 2e 31 34 33 2e 31 32 35 2e 34 39 00 00
>>> 0070   00 00 00 01 00 00 00 04 00 00 00 01 00 10 00 00
>>> 0080   00 10 00 00 00 00 00 01 00 00 00 02 00 00 00 06
>>> 0090   00 00 00 00
>>> 
>>> 
>>> the last 12 bytes : bitmap size, bitmap[0], bitmap[1]
>>> 
>>> 
>>> This part of code in the didn't change since 2010, and I
>>> have no issues to use 5.8 kernel. I am pretty sure, that
>>> tests with 5.9 did pass as expected. I will try to bisec it.
>> 
>> I don't think I've introduced this bug. I did not touch anything in the
>> getdeviceinfo proc or XDR code.
>> Does the following patch help?
>> 
>> 8<-------------------------------------------------------
>> From e92b2d4e39e91d379ec1147115820ab5dfe4c89a Mon Sep 17 00:00:00 2001
>> From: Trond Myklebust <trond.myklebust@hammerspace.com>
>> Date: Fri, 13 Nov 2020 21:42:16 -0500
>> Subject: [PATCH] NFSv4: Fix the alignment of page data in the getdeviceinfo
>> reply
>> 
>> We can fit the device_addr4 opaque data padding in the pages.
>> 
>> Fixes: cf500bac8fd4 ("SUNRPC: Introduce rpc_prepare_reply_pages()")
>> Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
>> ---
>> fs/nfs/nfs4xdr.c | 14 ++++++++++----
>> 1 file changed, 10 insertions(+), 4 deletions(-)
>> 
>> diff --git a/fs/nfs/nfs4xdr.c b/fs/nfs/nfs4xdr.c
>> index c6dbfcae7517..c8714381d511 100644
>> --- a/fs/nfs/nfs4xdr.c
>> +++ b/fs/nfs/nfs4xdr.c
>> @@ -3009,15 +3009,19 @@ static void nfs4_xdr_enc_getdeviceinfo(struct rpc_rqst
>> *req,
>> 	struct compound_hdr hdr = {
>> 		.minorversion = nfs4_xdr_minorversion(&args->seq_args),
>> 	};
>> +	uint32_t replen;
>> 
>> 	encode_compound_hdr(xdr, req, &hdr);
>> 	encode_sequence(xdr, &args->seq_args, &hdr);
>> +
>> +	replen = hdr.replen + op_decode_hdr_maxsz;
>> +
>> 	encode_getdeviceinfo(xdr, args, &hdr);
>> 
>> -	/* set up reply kvec. Subtract notification bitmap max size (2)
>> -	 * so that notification bitmap is put in xdr_buf tail */
>> +	/* set up reply kvec. device_addr4 opaque data is read into the
>> +	 * pages */
>> 	rpc_prepare_reply_pages(req, args->pdev->pages, args->pdev->pgbase,
>> -				args->pdev->pglen, hdr.replen - 2);
>> +				args->pdev->pglen, replen + 2);
>> 	encode_nops(&hdr);
>> }
>> 
>> @@ -5848,7 +5852,9 @@ static int decode_getdeviceinfo(struct xdr_stream *xdr,
>> 	 * and places the remaining xdr data in xdr_buf->tail
>> 	 */
>> 	pdev->mincount = be32_to_cpup(p);
>> -	if (xdr_read_pages(xdr, pdev->mincount) != pdev->mincount)
>> +	/* Calculate padding */
>> +	len = xdr_align_size(pdev->mincount);
>> +	if (xdr_read_pages(xdr, len) != len)
>> 		return -EIO;
>> 
>> 	/* Parse notification bitmap, verifying that it is zero. */
>> --
>> 2.28.0
>> 
>> 
>> 
>> --
>> Trond Myklebust
>> Linux NFS client maintainer, Hammerspace
> > trond.myklebust@hammerspace.com
Mkrtchyan, Tigran Nov. 26, 2020, 5:17 p.m. UTC | #9
I have added some debug info Ind got this:

decode_getdeviceinfo: layout type 4
decode_getdeviceinfo: layout size 64
decode_getdeviceinfo: layout notification bitmap size 1094994757

So it looks like that xdr_read_pages set xdr pointer to a wrong position
and next read, which is bitmap size

5857         pdev->mincount = be32_to_cpup(p);
5858         dprintk("%s: layout size %u\n", __func__, pdev->mincount);
5859         if (xdr_read_pages(xdr, pdev->mincount) != pdev->mincount)
5860                 return -EIO;
5861 
5862         /* Parse notification bitmap, verifying that it is zero. */
5863         p = xdr_inline_decode(xdr, 4);
5864         if (unlikely(!p))
5865                 return -EIO;
5866         len = be32_to_cpup(p);
5867         dprintk("%s: layout notification bitmap size %u\n", __func__, len);
5868         if (len) {
5869                 uint32_t i;


Tigran.


----- Original Message -----
> From: "Tigran Mkrtchyan" <tigran.mkrtchyan@desy.de>
> To: "trondmy" <trondmy@hammerspace.com>
> Cc: "linux-nfs" <linux-nfs@vger.kernel.org>, "Anna Schumaker" <anna.schumaker@netapp.com>
> Sent: Tuesday, 17 November, 2020 15:50:57
> Subject: Re: [PATCH v3 00/11] Add RDMA support to the pNFS file+flexfiles data channels

> Here is the result:
> 
> 
> $ git bisect bad
> c567552612ece787b178e3b147b5854ad422a836 is the first bad commit
> commit c567552612ece787b178e3b147b5854ad422a836
> Author: Anna Schumaker <Anna.Schumaker@Netapp.com>
> Date:   Wed May 28 13:41:22 2014 -0400
> 
>    NFS: Add READ_PLUS data segment support
>    
>    This patch adds client support for decoding a single NFS4_CONTENT_DATA
>    segment returned by the server. This is the simplest implementation
>    possible, since it does not account for any hole segments in the reply.
>    
>    Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
> 
> fs/nfs/nfs42xdr.c         | 141 ++++++++++++++++++++++++++++++++++++++++++++++
> fs/nfs/nfs4client.c       |   2 +
> fs/nfs/nfs4proc.c         |  43 +++++++++++++-
> fs/nfs/nfs4xdr.c          |   1 +
> include/linux/nfs4.h      |   2 +-
> include/linux/nfs_fs_sb.h |   1 +
> include/linux/nfs_xdr.h   |   2 +-
> 7 files changed, 187 insertions(+), 5 deletions(-)
> 
> 
> Regards,
>   Tigran.
> 
> 
> 
> ----- Original Message -----
>> From: "Tigran Mkrtchyan" <tigran.mkrtchyan@desy.de>
>> To: "trondmy" <trondmy@hammerspace.com>
>> Cc: "linux-nfs" <linux-nfs@vger.kernel.org>
>> Sent: Monday, 16 November, 2020 21:55:50
>> Subject: Re: [PATCH v3 00/11] Add RDMA support to the pNFS file+flexfiles data
>> channels
> 
>> Hi Trond,
>> 
>> I am afraid, that the fix didn't work. I bisecting it....
>> 
>> 
>> Tigran.
>> 
>> 
>> ----- Original Message -----
>>> From: "trondmy" <trondmy@hammerspace.com>
>>> To: "Tigran Mkrtchyan" <tigran.mkrtchyan@desy.de>
>>> Cc: "linux-nfs" <linux-nfs@vger.kernel.org>
>>> Sent: Saturday, 14 November, 2020 15:29:01
>>> Subject: Re: [PATCH v3 00/11] Add RDMA support to the pNFS file+flexfiles data
>>> channels
>> 
>>> On Sat, 2020-11-14 at 00:46 +0100, Mkrtchyan, Tigran wrote:
>>>> 
>>>> 
>>>> ----- Original Message -----
>>>> > From: "trondmy" <trondmy@hammerspace.com>
>>>> > To: "Tigran Mkrtchyan" <tigran.mkrtchyan@desy.de>
>>>> > Cc: "linux-nfs" <linux-nfs@vger.kernel.org>
>>>> > Sent: Friday, 13 November, 2020 23:45:00
>>>> > Subject: Re: [PATCH v3 00/11] Add RDMA support to the pNFS
>>>> > file+flexfiles data channels
>>>> 
>>>> > On Fri, 2020-11-13 at 22:30 +0100, Mkrtchyan, Tigran wrote:
>>>> > > 
>>>> > > After more testing, it looks like that client doesn't like
>>>> > > notification bitmap:
>>>> > > 
>>>> > > 
>>>> > > [31576.789492] --> _nfs4_proc_getdeviceinfo
>>>> > > [31576.789503] --> nfs41_call_sync_prepare data->seq_server
>>>> > > 000000001d17c43e
>>>> > > [31576.789507] --> nfs4_alloc_slot used_slots=0000
>>>> > > highest_used=4294967295 max_slots=16
>>>> > > [31576.789510] <-- nfs4_alloc_slot used_slots=0001 highest_used=0
>>>> > > slotid=0
>>>> > > [31576.789527] encode_sequence:
>>>> > > sessionid=2910695007:150995712:0:16777216 seqid=92462 slotid=0
>>>> > > max_slotid=0 cache_this=0
>>>> > > [31576.789991] decode_getdeviceinfo: unsupported notification
>>>> > 
>>>> > According to this, you appear to be returning a deviceinfo bitmap
>>>> > with
>>>> > at least one non-zero entry that is not in the first 32-bit word.
>>>> > We
>>>> > only ask for notifications for NOTIFY_DEVICEID4_CHANGE and
>>>> > NOTIFY_DEVICEID4_DELETE, so we only expect bitmap[0] to have non-
>>>> > zero
>>>> > entries.
>>>> 
>>>> 
>>>> according to packet capture only bitmap[0] has non zero bits set.
>>>> This is the reply of compound starting from nfs staus code, tag
>>>> length and so on.
>>>> 
>>>> 
>>>> 0000   00 00 00 00 00 00 00 00 00 00 00 02 00 00 00 35
>>>> 0010   00 00 00 00 5f ae 7d ad 00 03 00 09 00 00 00 00
>>>> 0020   00 00 00 01 00 00 00 4c 00 00 00 00 00 00 00 0f
>>>> 0030   00 00 00 0f 00 00 00 00 00 00 00 2f 00 00 00 00
>>>> 0040   00 00 00 04 00 00 00 40 00 00 00 01 00 00 00 03
>>>> 0050   74 63 70 00 00 00 00 16 31 33 31 2e 31 36 39 2e
>>>> 0060   31 39 31 2e 31 34 33 2e 31 32 35 2e 34 39 00 00
>>>> 0070   00 00 00 01 00 00 00 04 00 00 00 01 00 10 00 00
>>>> 0080   00 10 00 00 00 00 00 01 00 00 00 02 00 00 00 06
>>>> 0090   00 00 00 00
>>>> 
>>>> 
>>>> the last 12 bytes : bitmap size, bitmap[0], bitmap[1]
>>>> 
>>>> 
>>>> This part of code in the didn't change since 2010, and I
>>>> have no issues to use 5.8 kernel. I am pretty sure, that
>>>> tests with 5.9 did pass as expected. I will try to bisec it.
>>> 
>>> I don't think I've introduced this bug. I did not touch anything in the
>>> getdeviceinfo proc or XDR code.
>>> Does the following patch help?
>>> 
>>> 8<-------------------------------------------------------
>>> From e92b2d4e39e91d379ec1147115820ab5dfe4c89a Mon Sep 17 00:00:00 2001
>>> From: Trond Myklebust <trond.myklebust@hammerspace.com>
>>> Date: Fri, 13 Nov 2020 21:42:16 -0500
>>> Subject: [PATCH] NFSv4: Fix the alignment of page data in the getdeviceinfo
>>> reply
>>> 
>>> We can fit the device_addr4 opaque data padding in the pages.
>>> 
>>> Fixes: cf500bac8fd4 ("SUNRPC: Introduce rpc_prepare_reply_pages()")
>>> Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
>>> ---
>>> fs/nfs/nfs4xdr.c | 14 ++++++++++----
>>> 1 file changed, 10 insertions(+), 4 deletions(-)
>>> 
>>> diff --git a/fs/nfs/nfs4xdr.c b/fs/nfs/nfs4xdr.c
>>> index c6dbfcae7517..c8714381d511 100644
>>> --- a/fs/nfs/nfs4xdr.c
>>> +++ b/fs/nfs/nfs4xdr.c
>>> @@ -3009,15 +3009,19 @@ static void nfs4_xdr_enc_getdeviceinfo(struct rpc_rqst
>>> *req,
>>> 	struct compound_hdr hdr = {
>>> 		.minorversion = nfs4_xdr_minorversion(&args->seq_args),
>>> 	};
>>> +	uint32_t replen;
>>> 
>>> 	encode_compound_hdr(xdr, req, &hdr);
>>> 	encode_sequence(xdr, &args->seq_args, &hdr);
>>> +
>>> +	replen = hdr.replen + op_decode_hdr_maxsz;
>>> +
>>> 	encode_getdeviceinfo(xdr, args, &hdr);
>>> 
>>> -	/* set up reply kvec. Subtract notification bitmap max size (2)
>>> -	 * so that notification bitmap is put in xdr_buf tail */
>>> +	/* set up reply kvec. device_addr4 opaque data is read into the
>>> +	 * pages */
>>> 	rpc_prepare_reply_pages(req, args->pdev->pages, args->pdev->pgbase,
>>> -				args->pdev->pglen, hdr.replen - 2);
>>> +				args->pdev->pglen, replen + 2);
>>> 	encode_nops(&hdr);
>>> }
>>> 
>>> @@ -5848,7 +5852,9 @@ static int decode_getdeviceinfo(struct xdr_stream *xdr,
>>> 	 * and places the remaining xdr data in xdr_buf->tail
>>> 	 */
>>> 	pdev->mincount = be32_to_cpup(p);
>>> -	if (xdr_read_pages(xdr, pdev->mincount) != pdev->mincount)
>>> +	/* Calculate padding */
>>> +	len = xdr_align_size(pdev->mincount);
>>> +	if (xdr_read_pages(xdr, len) != len)
>>> 		return -EIO;
>>> 
>>> 	/* Parse notification bitmap, verifying that it is zero. */
>>> --
>>> 2.28.0
>>> 
>>> 
>>> 
>>> --
>>> Trond Myklebust
>>> Linux NFS client maintainer, Hammerspace
> > > trond.myklebust@hammerspace.com
Mkrtchyan, Tigran Dec. 1, 2020, 10:59 a.m. UTC | #10
More investigation...

After xdr_read_pages call the xdr stream points to the
beginning of RPC package and return the xid value on
the next read (which should be the size of the notification bitmap).


Regards,
   Tigran.

----- Original Message -----
> From: "Tigran Mkrtchyan" <tigran.mkrtchyan@desy.de>
> To: "trondmy" <trondmy@hammerspace.com>
> Cc: "linux-nfs" <linux-nfs@vger.kernel.org>, "Anna Schumaker" <anna.schumaker@netapp.com>
> Sent: Thursday, 26 November, 2020 18:17:41
> Subject: Re: [PATCH v3 00/11] Add RDMA support to the pNFS file+flexfiles data channels

> I have added some debug info Ind got this:
> 
> decode_getdeviceinfo: layout type 4
> decode_getdeviceinfo: layout size 64
> decode_getdeviceinfo: layout notification bitmap size 1094994757
> 
> So it looks like that xdr_read_pages set xdr pointer to a wrong position
> and next read, which is bitmap size
> 
> 5857         pdev->mincount = be32_to_cpup(p);
> 5858         dprintk("%s: layout size %u\n", __func__, pdev->mincount);
> 5859         if (xdr_read_pages(xdr, pdev->mincount) != pdev->mincount)
> 5860                 return -EIO;
> 5861
> 5862         /* Parse notification bitmap, verifying that it is zero. */
> 5863         p = xdr_inline_decode(xdr, 4);
> 5864         if (unlikely(!p))
> 5865                 return -EIO;
> 5866         len = be32_to_cpup(p);
> 5867         dprintk("%s: layout notification bitmap size %u\n", __func__, len);
> 5868         if (len) {
> 5869                 uint32_t i;
> 
> 
> Tigran.
> 
> 
> ----- Original Message -----
>> From: "Tigran Mkrtchyan" <tigran.mkrtchyan@desy.de>
>> To: "trondmy" <trondmy@hammerspace.com>
>> Cc: "linux-nfs" <linux-nfs@vger.kernel.org>, "Anna Schumaker"
>> <anna.schumaker@netapp.com>
>> Sent: Tuesday, 17 November, 2020 15:50:57
>> Subject: Re: [PATCH v3 00/11] Add RDMA support to the pNFS file+flexfiles data
>> channels
> 
>> Here is the result:
>> 
>> 
>> $ git bisect bad
>> c567552612ece787b178e3b147b5854ad422a836 is the first bad commit
>> commit c567552612ece787b178e3b147b5854ad422a836
>> Author: Anna Schumaker <Anna.Schumaker@Netapp.com>
>> Date:   Wed May 28 13:41:22 2014 -0400
>> 
>>    NFS: Add READ_PLUS data segment support
>>    
>>    This patch adds client support for decoding a single NFS4_CONTENT_DATA
>>    segment returned by the server. This is the simplest implementation
>>    possible, since it does not account for any hole segments in the reply.
>>    
>>    Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
>> 
>> fs/nfs/nfs42xdr.c         | 141 ++++++++++++++++++++++++++++++++++++++++++++++
>> fs/nfs/nfs4client.c       |   2 +
>> fs/nfs/nfs4proc.c         |  43 +++++++++++++-
>> fs/nfs/nfs4xdr.c          |   1 +
>> include/linux/nfs4.h      |   2 +-
>> include/linux/nfs_fs_sb.h |   1 +
>> include/linux/nfs_xdr.h   |   2 +-
>> 7 files changed, 187 insertions(+), 5 deletions(-)
>> 
>> 
>> Regards,
>>   Tigran.
>> 
>> 
>> 
>> ----- Original Message -----
>>> From: "Tigran Mkrtchyan" <tigran.mkrtchyan@desy.de>
>>> To: "trondmy" <trondmy@hammerspace.com>
>>> Cc: "linux-nfs" <linux-nfs@vger.kernel.org>
>>> Sent: Monday, 16 November, 2020 21:55:50
>>> Subject: Re: [PATCH v3 00/11] Add RDMA support to the pNFS file+flexfiles data
>>> channels
>> 
>>> Hi Trond,
>>> 
>>> I am afraid, that the fix didn't work. I bisecting it....
>>> 
>>> 
>>> Tigran.
>>> 
>>> 
>>> ----- Original Message -----
>>>> From: "trondmy" <trondmy@hammerspace.com>
>>>> To: "Tigran Mkrtchyan" <tigran.mkrtchyan@desy.de>
>>>> Cc: "linux-nfs" <linux-nfs@vger.kernel.org>
>>>> Sent: Saturday, 14 November, 2020 15:29:01
>>>> Subject: Re: [PATCH v3 00/11] Add RDMA support to the pNFS file+flexfiles data
>>>> channels
>>> 
>>>> On Sat, 2020-11-14 at 00:46 +0100, Mkrtchyan, Tigran wrote:
>>>>> 
>>>>> 
>>>>> ----- Original Message -----
>>>>> > From: "trondmy" <trondmy@hammerspace.com>
>>>>> > To: "Tigran Mkrtchyan" <tigran.mkrtchyan@desy.de>
>>>>> > Cc: "linux-nfs" <linux-nfs@vger.kernel.org>
>>>>> > Sent: Friday, 13 November, 2020 23:45:00
>>>>> > Subject: Re: [PATCH v3 00/11] Add RDMA support to the pNFS
>>>>> > file+flexfiles data channels
>>>>> 
>>>>> > On Fri, 2020-11-13 at 22:30 +0100, Mkrtchyan, Tigran wrote:
>>>>> > > 
>>>>> > > After more testing, it looks like that client doesn't like
>>>>> > > notification bitmap:
>>>>> > > 
>>>>> > > 
>>>>> > > [31576.789492] --> _nfs4_proc_getdeviceinfo
>>>>> > > [31576.789503] --> nfs41_call_sync_prepare data->seq_server
>>>>> > > 000000001d17c43e
>>>>> > > [31576.789507] --> nfs4_alloc_slot used_slots=0000
>>>>> > > highest_used=4294967295 max_slots=16
>>>>> > > [31576.789510] <-- nfs4_alloc_slot used_slots=0001 highest_used=0
>>>>> > > slotid=0
>>>>> > > [31576.789527] encode_sequence:
>>>>> > > sessionid=2910695007:150995712:0:16777216 seqid=92462 slotid=0
>>>>> > > max_slotid=0 cache_this=0
>>>>> > > [31576.789991] decode_getdeviceinfo: unsupported notification
>>>>> > 
>>>>> > According to this, you appear to be returning a deviceinfo bitmap
>>>>> > with
>>>>> > at least one non-zero entry that is not in the first 32-bit word.
>>>>> > We
>>>>> > only ask for notifications for NOTIFY_DEVICEID4_CHANGE and
>>>>> > NOTIFY_DEVICEID4_DELETE, so we only expect bitmap[0] to have non-
>>>>> > zero
>>>>> > entries.
>>>>> 
>>>>> 
>>>>> according to packet capture only bitmap[0] has non zero bits set.
>>>>> This is the reply of compound starting from nfs staus code, tag
>>>>> length and so on.
>>>>> 
>>>>> 
>>>>> 0000   00 00 00 00 00 00 00 00 00 00 00 02 00 00 00 35
>>>>> 0010   00 00 00 00 5f ae 7d ad 00 03 00 09 00 00 00 00
>>>>> 0020   00 00 00 01 00 00 00 4c 00 00 00 00 00 00 00 0f
>>>>> 0030   00 00 00 0f 00 00 00 00 00 00 00 2f 00 00 00 00
>>>>> 0040   00 00 00 04 00 00 00 40 00 00 00 01 00 00 00 03
>>>>> 0050   74 63 70 00 00 00 00 16 31 33 31 2e 31 36 39 2e
>>>>> 0060   31 39 31 2e 31 34 33 2e 31 32 35 2e 34 39 00 00
>>>>> 0070   00 00 00 01 00 00 00 04 00 00 00 01 00 10 00 00
>>>>> 0080   00 10 00 00 00 00 00 01 00 00 00 02 00 00 00 06
>>>>> 0090   00 00 00 00
>>>>> 
>>>>> 
>>>>> the last 12 bytes : bitmap size, bitmap[0], bitmap[1]
>>>>> 
>>>>> 
>>>>> This part of code in the didn't change since 2010, and I
>>>>> have no issues to use 5.8 kernel. I am pretty sure, that
>>>>> tests with 5.9 did pass as expected. I will try to bisec it.
>>>> 
>>>> I don't think I've introduced this bug. I did not touch anything in the
>>>> getdeviceinfo proc or XDR code.
>>>> Does the following patch help?
>>>> 
>>>> 8<-------------------------------------------------------
>>>> From e92b2d4e39e91d379ec1147115820ab5dfe4c89a Mon Sep 17 00:00:00 2001
>>>> From: Trond Myklebust <trond.myklebust@hammerspace.com>
>>>> Date: Fri, 13 Nov 2020 21:42:16 -0500
>>>> Subject: [PATCH] NFSv4: Fix the alignment of page data in the getdeviceinfo
>>>> reply
>>>> 
>>>> We can fit the device_addr4 opaque data padding in the pages.
>>>> 
>>>> Fixes: cf500bac8fd4 ("SUNRPC: Introduce rpc_prepare_reply_pages()")
>>>> Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
>>>> ---
>>>> fs/nfs/nfs4xdr.c | 14 ++++++++++----
>>>> 1 file changed, 10 insertions(+), 4 deletions(-)
>>>> 
>>>> diff --git a/fs/nfs/nfs4xdr.c b/fs/nfs/nfs4xdr.c
>>>> index c6dbfcae7517..c8714381d511 100644
>>>> --- a/fs/nfs/nfs4xdr.c
>>>> +++ b/fs/nfs/nfs4xdr.c
>>>> @@ -3009,15 +3009,19 @@ static void nfs4_xdr_enc_getdeviceinfo(struct rpc_rqst
>>>> *req,
>>>> 	struct compound_hdr hdr = {
>>>> 		.minorversion = nfs4_xdr_minorversion(&args->seq_args),
>>>> 	};
>>>> +	uint32_t replen;
>>>> 
>>>> 	encode_compound_hdr(xdr, req, &hdr);
>>>> 	encode_sequence(xdr, &args->seq_args, &hdr);
>>>> +
>>>> +	replen = hdr.replen + op_decode_hdr_maxsz;
>>>> +
>>>> 	encode_getdeviceinfo(xdr, args, &hdr);
>>>> 
>>>> -	/* set up reply kvec. Subtract notification bitmap max size (2)
>>>> -	 * so that notification bitmap is put in xdr_buf tail */
>>>> +	/* set up reply kvec. device_addr4 opaque data is read into the
>>>> +	 * pages */
>>>> 	rpc_prepare_reply_pages(req, args->pdev->pages, args->pdev->pgbase,
>>>> -				args->pdev->pglen, hdr.replen - 2);
>>>> +				args->pdev->pglen, replen + 2);
>>>> 	encode_nops(&hdr);
>>>> }
>>>> 
>>>> @@ -5848,7 +5852,9 @@ static int decode_getdeviceinfo(struct xdr_stream *xdr,
>>>> 	 * and places the remaining xdr data in xdr_buf->tail
>>>> 	 */
>>>> 	pdev->mincount = be32_to_cpup(p);
>>>> -	if (xdr_read_pages(xdr, pdev->mincount) != pdev->mincount)
>>>> +	/* Calculate padding */
>>>> +	len = xdr_align_size(pdev->mincount);
>>>> +	if (xdr_read_pages(xdr, len) != len)
>>>> 		return -EIO;
>>>> 
>>>> 	/* Parse notification bitmap, verifying that it is zero. */
>>>> --
>>>> 2.28.0
>>>> 
>>>> 
>>>> 
>>>> --
>>>> Trond Myklebust
>>>> Linux NFS client maintainer, Hammerspace
> > > > trond.myklebust@hammerspace.com
Mkrtchyan, Tigran Dec. 1, 2020, 2:44 p.m. UTC | #11
Actually, this was a co-incidence, which means that buf points to
more or less a random place.

Tigran.

----- Original Message -----
> From: "Tigran Mkrtchyan" <tigran.mkrtchyan@desy.de>
> To: "trondmy" <trondmy@hammerspace.com>
> Cc: "linux-nfs" <linux-nfs@vger.kernel.org>, "Anna Schumaker" <anna.schumaker@netapp.com>
> Sent: Tuesday, 1 December, 2020 11:59:22
> Subject: Re: [PATCH v3 00/11] Add RDMA support to the pNFS file+flexfiles data channels

> More investigation...
> 
> After xdr_read_pages call the xdr stream points to the
> beginning of RPC package and return the xid value on
> the next read (which should be the size of the notification bitmap).
> 
> 
> Regards,
>   Tigran.
> 
> ----- Original Message -----
>> From: "Tigran Mkrtchyan" <tigran.mkrtchyan@desy.de>
>> To: "trondmy" <trondmy@hammerspace.com>
>> Cc: "linux-nfs" <linux-nfs@vger.kernel.org>, "Anna Schumaker"
>> <anna.schumaker@netapp.com>
>> Sent: Thursday, 26 November, 2020 18:17:41
>> Subject: Re: [PATCH v3 00/11] Add RDMA support to the pNFS file+flexfiles data
>> channels
> 
>> I have added some debug info Ind got this:
>> 
>> decode_getdeviceinfo: layout type 4
>> decode_getdeviceinfo: layout size 64
>> decode_getdeviceinfo: layout notification bitmap size 1094994757
>> 
>> So it looks like that xdr_read_pages set xdr pointer to a wrong position
>> and next read, which is bitmap size
>> 
>> 5857         pdev->mincount = be32_to_cpup(p);
>> 5858         dprintk("%s: layout size %u\n", __func__, pdev->mincount);
>> 5859         if (xdr_read_pages(xdr, pdev->mincount) != pdev->mincount)
>> 5860                 return -EIO;
>> 5861
>> 5862         /* Parse notification bitmap, verifying that it is zero. */
>> 5863         p = xdr_inline_decode(xdr, 4);
>> 5864         if (unlikely(!p))
>> 5865                 return -EIO;
>> 5866         len = be32_to_cpup(p);
>> 5867         dprintk("%s: layout notification bitmap size %u\n", __func__, len);
>> 5868         if (len) {
>> 5869                 uint32_t i;
>> 
>> 
>> Tigran.
>> 
>> 
>> ----- Original Message -----
>>> From: "Tigran Mkrtchyan" <tigran.mkrtchyan@desy.de>
>>> To: "trondmy" <trondmy@hammerspace.com>
>>> Cc: "linux-nfs" <linux-nfs@vger.kernel.org>, "Anna Schumaker"
>>> <anna.schumaker@netapp.com>
>>> Sent: Tuesday, 17 November, 2020 15:50:57
>>> Subject: Re: [PATCH v3 00/11] Add RDMA support to the pNFS file+flexfiles data
>>> channels
>> 
>>> Here is the result:
>>> 
>>> 
>>> $ git bisect bad
>>> c567552612ece787b178e3b147b5854ad422a836 is the first bad commit
>>> commit c567552612ece787b178e3b147b5854ad422a836
>>> Author: Anna Schumaker <Anna.Schumaker@Netapp.com>
>>> Date:   Wed May 28 13:41:22 2014 -0400
>>> 
>>>    NFS: Add READ_PLUS data segment support
>>>    
>>>    This patch adds client support for decoding a single NFS4_CONTENT_DATA
>>>    segment returned by the server. This is the simplest implementation
>>>    possible, since it does not account for any hole segments in the reply.
>>>    
>>>    Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
>>> 
>>> fs/nfs/nfs42xdr.c         | 141 ++++++++++++++++++++++++++++++++++++++++++++++
>>> fs/nfs/nfs4client.c       |   2 +
>>> fs/nfs/nfs4proc.c         |  43 +++++++++++++-
>>> fs/nfs/nfs4xdr.c          |   1 +
>>> include/linux/nfs4.h      |   2 +-
>>> include/linux/nfs_fs_sb.h |   1 +
>>> include/linux/nfs_xdr.h   |   2 +-
>>> 7 files changed, 187 insertions(+), 5 deletions(-)
>>> 
>>> 
>>> Regards,
>>>   Tigran.
>>> 
>>> 
>>> 
>>> ----- Original Message -----
>>>> From: "Tigran Mkrtchyan" <tigran.mkrtchyan@desy.de>
>>>> To: "trondmy" <trondmy@hammerspace.com>
>>>> Cc: "linux-nfs" <linux-nfs@vger.kernel.org>
>>>> Sent: Monday, 16 November, 2020 21:55:50
>>>> Subject: Re: [PATCH v3 00/11] Add RDMA support to the pNFS file+flexfiles data
>>>> channels
>>> 
>>>> Hi Trond,
>>>> 
>>>> I am afraid, that the fix didn't work. I bisecting it....
>>>> 
>>>> 
>>>> Tigran.
>>>> 
>>>> 
>>>> ----- Original Message -----
>>>>> From: "trondmy" <trondmy@hammerspace.com>
>>>>> To: "Tigran Mkrtchyan" <tigran.mkrtchyan@desy.de>
>>>>> Cc: "linux-nfs" <linux-nfs@vger.kernel.org>
>>>>> Sent: Saturday, 14 November, 2020 15:29:01
>>>>> Subject: Re: [PATCH v3 00/11] Add RDMA support to the pNFS file+flexfiles data
>>>>> channels
>>>> 
>>>>> On Sat, 2020-11-14 at 00:46 +0100, Mkrtchyan, Tigran wrote:
>>>>>> 
>>>>>> 
>>>>>> ----- Original Message -----
>>>>>> > From: "trondmy" <trondmy@hammerspace.com>
>>>>>> > To: "Tigran Mkrtchyan" <tigran.mkrtchyan@desy.de>
>>>>>> > Cc: "linux-nfs" <linux-nfs@vger.kernel.org>
>>>>>> > Sent: Friday, 13 November, 2020 23:45:00
>>>>>> > Subject: Re: [PATCH v3 00/11] Add RDMA support to the pNFS
>>>>>> > file+flexfiles data channels
>>>>>> 
>>>>>> > On Fri, 2020-11-13 at 22:30 +0100, Mkrtchyan, Tigran wrote:
>>>>>> > > 
>>>>>> > > After more testing, it looks like that client doesn't like
>>>>>> > > notification bitmap:
>>>>>> > > 
>>>>>> > > 
>>>>>> > > [31576.789492] --> _nfs4_proc_getdeviceinfo
>>>>>> > > [31576.789503] --> nfs41_call_sync_prepare data->seq_server
>>>>>> > > 000000001d17c43e
>>>>>> > > [31576.789507] --> nfs4_alloc_slot used_slots=0000
>>>>>> > > highest_used=4294967295 max_slots=16
>>>>>> > > [31576.789510] <-- nfs4_alloc_slot used_slots=0001 highest_used=0
>>>>>> > > slotid=0
>>>>>> > > [31576.789527] encode_sequence:
>>>>>> > > sessionid=2910695007:150995712:0:16777216 seqid=92462 slotid=0
>>>>>> > > max_slotid=0 cache_this=0
>>>>>> > > [31576.789991] decode_getdeviceinfo: unsupported notification
>>>>>> > 
>>>>>> > According to this, you appear to be returning a deviceinfo bitmap
>>>>>> > with
>>>>>> > at least one non-zero entry that is not in the first 32-bit word.
>>>>>> > We
>>>>>> > only ask for notifications for NOTIFY_DEVICEID4_CHANGE and
>>>>>> > NOTIFY_DEVICEID4_DELETE, so we only expect bitmap[0] to have non-
>>>>>> > zero
>>>>>> > entries.
>>>>>> 
>>>>>> 
>>>>>> according to packet capture only bitmap[0] has non zero bits set.
>>>>>> This is the reply of compound starting from nfs staus code, tag
>>>>>> length and so on.
>>>>>> 
>>>>>> 
>>>>>> 0000   00 00 00 00 00 00 00 00 00 00 00 02 00 00 00 35
>>>>>> 0010   00 00 00 00 5f ae 7d ad 00 03 00 09 00 00 00 00
>>>>>> 0020   00 00 00 01 00 00 00 4c 00 00 00 00 00 00 00 0f
>>>>>> 0030   00 00 00 0f 00 00 00 00 00 00 00 2f 00 00 00 00
>>>>>> 0040   00 00 00 04 00 00 00 40 00 00 00 01 00 00 00 03
>>>>>> 0050   74 63 70 00 00 00 00 16 31 33 31 2e 31 36 39 2e
>>>>>> 0060   31 39 31 2e 31 34 33 2e 31 32 35 2e 34 39 00 00
>>>>>> 0070   00 00 00 01 00 00 00 04 00 00 00 01 00 10 00 00
>>>>>> 0080   00 10 00 00 00 00 00 01 00 00 00 02 00 00 00 06
>>>>>> 0090   00 00 00 00
>>>>>> 
>>>>>> 
>>>>>> the last 12 bytes : bitmap size, bitmap[0], bitmap[1]
>>>>>> 
>>>>>> 
>>>>>> This part of code in the didn't change since 2010, and I
>>>>>> have no issues to use 5.8 kernel. I am pretty sure, that
>>>>>> tests with 5.9 did pass as expected. I will try to bisec it.
>>>>> 
>>>>> I don't think I've introduced this bug. I did not touch anything in the
>>>>> getdeviceinfo proc or XDR code.
>>>>> Does the following patch help?
>>>>> 
>>>>> 8<-------------------------------------------------------
>>>>> From e92b2d4e39e91d379ec1147115820ab5dfe4c89a Mon Sep 17 00:00:00 2001
>>>>> From: Trond Myklebust <trond.myklebust@hammerspace.com>
>>>>> Date: Fri, 13 Nov 2020 21:42:16 -0500
>>>>> Subject: [PATCH] NFSv4: Fix the alignment of page data in the getdeviceinfo
>>>>> reply
>>>>> 
>>>>> We can fit the device_addr4 opaque data padding in the pages.
>>>>> 
>>>>> Fixes: cf500bac8fd4 ("SUNRPC: Introduce rpc_prepare_reply_pages()")
>>>>> Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
>>>>> ---
>>>>> fs/nfs/nfs4xdr.c | 14 ++++++++++----
>>>>> 1 file changed, 10 insertions(+), 4 deletions(-)
>>>>> 
>>>>> diff --git a/fs/nfs/nfs4xdr.c b/fs/nfs/nfs4xdr.c
>>>>> index c6dbfcae7517..c8714381d511 100644
>>>>> --- a/fs/nfs/nfs4xdr.c
>>>>> +++ b/fs/nfs/nfs4xdr.c
>>>>> @@ -3009,15 +3009,19 @@ static void nfs4_xdr_enc_getdeviceinfo(struct rpc_rqst
>>>>> *req,
>>>>> 	struct compound_hdr hdr = {
>>>>> 		.minorversion = nfs4_xdr_minorversion(&args->seq_args),
>>>>> 	};
>>>>> +	uint32_t replen;
>>>>> 
>>>>> 	encode_compound_hdr(xdr, req, &hdr);
>>>>> 	encode_sequence(xdr, &args->seq_args, &hdr);
>>>>> +
>>>>> +	replen = hdr.replen + op_decode_hdr_maxsz;
>>>>> +
>>>>> 	encode_getdeviceinfo(xdr, args, &hdr);
>>>>> 
>>>>> -	/* set up reply kvec. Subtract notification bitmap max size (2)
>>>>> -	 * so that notification bitmap is put in xdr_buf tail */
>>>>> +	/* set up reply kvec. device_addr4 opaque data is read into the
>>>>> +	 * pages */
>>>>> 	rpc_prepare_reply_pages(req, args->pdev->pages, args->pdev->pgbase,
>>>>> -				args->pdev->pglen, hdr.replen - 2);
>>>>> +				args->pdev->pglen, replen + 2);
>>>>> 	encode_nops(&hdr);
>>>>> }
>>>>> 
>>>>> @@ -5848,7 +5852,9 @@ static int decode_getdeviceinfo(struct xdr_stream *xdr,
>>>>> 	 * and places the remaining xdr data in xdr_buf->tail
>>>>> 	 */
>>>>> 	pdev->mincount = be32_to_cpup(p);
>>>>> -	if (xdr_read_pages(xdr, pdev->mincount) != pdev->mincount)
>>>>> +	/* Calculate padding */
>>>>> +	len = xdr_align_size(pdev->mincount);
>>>>> +	if (xdr_read_pages(xdr, len) != len)
>>>>> 		return -EIO;
>>>>> 
>>>>> 	/* Parse notification bitmap, verifying that it is zero. */
>>>>> --
>>>>> 2.28.0
>>>>> 
>>>>> 
>>>>> 
>>>>> --
>>>>> Trond Myklebust
>>>>> Linux NFS client maintainer, Hammerspace
> > > > > trond.myklebust@hammerspace.com