[v3,00/17] Readdir enhancements

Message ID 20201104161638.300324-1-trond.myklebust@hammerspace.com (mailing list archive)

Trond Myklebust Nov. 4, 2020, 4:16 p.m. UTC
From: Trond Myklebust <trond.myklebust@hammerspace.com>

The following patch series performs a number of cleanups on the readdir
code.
It also adds support for 1MB readdir RPC calls on-the-wire, and modifies
the caching code to ensure that we cache the entire contents of that
1MB call (instead of discarding the data that doesn't fit into a single
page).
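
As a rough illustration of why caching the whole reply matters, here is a toy model; it is not the kernel code. The 4096-byte page, the 40-byte average entry size, and the 1,000,000-entry directory are assumptions chosen only for illustration, and `rpc_calls` is a hypothetical helper.

```python
import math


def rpc_calls(num_entries, entry_bytes, reply_bytes, cached_bytes_per_reply):
    """Count READDIR calls needed to list a directory in this toy model.

    Each reply carries up to reply_bytes of entries, but the client keeps
    only cached_bytes_per_reply of them. Anything that is discarded has to
    be fetched again by a later call.
    """
    kept_entries = min(reply_bytes, cached_bytes_per_reply) // entry_bytes
    return math.ceil(num_entries / kept_entries)


PAGE = 4096            # assumed page size in bytes
ENTRY = 40             # assumed average bytes per cached entry
N = 1_000_000          # hypothetical directory size
REPLY = 1 << 20        # 1MB readdir reply

# Keep only one page of each reply, discarding the rest.
discarding = rpc_calls(N, ENTRY, REPLY, PAGE)
# Keep the entire reply.
caching_all = rpc_calls(N, ENTRY, REPLY, REPLY)

print(discarding, caching_all)
```

In this model the call count is driven by how much of each reply is kept, not by how much is requested, so a larger on-the-wire size only pays off once the cache stops discarding data.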

v2: Fix the handling of the NFSv3/v4 directory verifier
v3: Optimise searching when the readdir cookies are seen to be ordered

Trond Myklebust (17):
  NFS: Ensure contents of struct nfs_open_dir_context are consistent
  NFS: Clean up readdir struct nfs_cache_array
  NFS: Clean up nfs_readdir_page_filler()
  NFS: Clean up directory array handling
  NFS: Don't discard readdir results
  NFS: Remove unnecessary kmap in nfs_readdir_xdr_to_array()
  NFS: Replace kmap() with kmap_atomic() in nfs_readdir_search_array()
  NFS: Simplify struct nfs_cache_array_entry
  NFS: Support larger readdir buffers
  NFS: More readdir cleanups
  NFS: nfs_do_filldir() does not return a value
  NFS: Reduce readdir stack usage
  NFS: Cleanup to remove nfs_readdir_descriptor_t typedef
  NFS: Allow the NFS generic code to pass in a verifier to readdir
  NFS: Handle NFS4ERR_NOT_SAME and NFSERR_BADCOOKIE from readdir calls
  NFS: Improve handling of directory verifiers
  NFS: Optimisations for monotonically increasing readdir cookies

 fs/nfs/client.c         |   4 +-
 fs/nfs/dir.c            | 629 +++++++++++++++++++++++++---------------
 fs/nfs/inode.c          |   7 -
 fs/nfs/internal.h       |   6 -
 fs/nfs/nfs3proc.c       |  35 ++-
 fs/nfs/nfs4proc.c       |  40 +--
 fs/nfs/proc.c           |  18 +-
 include/linux/nfs_fs.h  |   9 +-
 include/linux/nfs_xdr.h |  17 +-
 9 files changed, 459 insertions(+), 306 deletions(-)

Benjamin Coddington Nov. 7, 2020, 12:49 p.m. UTC | #1
On 4 Nov 2020, at 11:16, trondmy@gmail.com wrote:

> From: Trond Myklebust <trond.myklebust@hammerspace.com>
>
> The following patch series performs a number of cleanups on the readdir
> code.
> It also adds support for 1MB readdir RPC calls on-the-wire, and modifies
> the caching code to ensure that we cache the entire contents of that
> 1MB call (instead of discarding the data that doesn't fit into a single
> page).
>
> v2: Fix the handling of the NFSv3/v4 directory verifier
> v3: Optimise searching when the readdir cookies are seen to be ordered

Hi Trond, thanks for these.

I did a bit of testing with these on 4-core/4G client listing 1.5M files
with READDIR.  I compared v5.10-rc2 without/with this set.

+------+      v5.10-rc2      +--+ this v3 patch set  +
| run  |  time   | rpc calls |  |  time  | rpc calls |

nfsv3 with dtsize 262144:
+------+---------+-----------+--+--------+-----------+
| 1    | 81.583  | 14710     |  | 53.568 | 215       |
| 2    | 81.147  | 14710     |  | 50.781 | 215       |
| 3    | 81.61   | 14710     |  | 50.514 | 215       |
| 4    | 82.405  | 14710     |  | 50.746 | 215       |
| 5    | 82.066  | 14710     |  | 50.397 | 215       |
| 6    | 82.395  | 14710     |  | 50.892 | 215       |
| 7    | 81.657  | 14710     |  | 50.882 | 215       |
| 8    | 81.555  | 14710     |  | 50.981 | 215       |
| 9    | 81.421  | 14710     |  | 50.558 | 215       |
| 10   | 81.472  | 14710     |  | 50.588 | 215       |

nfsv3 with dtsize 1048576:
+------+---------+-----------+--+--------+-----------+
| 1    | 81.563  | 14710     |  | 52.692 | 61        |
| 2    | 82.123  | 14710     |  | 49.934 | 61        |
| 3    | 81.714  | 14710     |  | 50.158 | 61        |
| 4    | 81.707  | 14710     |  | 50.083 | 61        |
| 5    | 81.44   | 14710     |  | 50.045 | 61        |
| 6    | 81.685  | 14710     |  | 50.021 | 61        |
| 7    | 81.17   | 14710     |  | 50.131 | 61        |
| 8    | 81.366  | 14710     |  | 49.928 | 61        |
| 9    | 81.067  | 14710     |  | 50.081 | 61        |
| 10   | 81.524  | 14710     |  | 50.442 | 61        |

nfsv4 with dtsize 32768:
+------+---------+-----------+--+--------+-----------+
| 1    | 99.534  | 14712     |  | 79.461 | 331       |
| 2    | 98.998  | 14712     |  | 79.338 | 331       |
| 3    | 99.462  | 14712     |  | 81.101 | 331       |
| 4    | 99.891  | 14712     |  | 78.888 | 331       |
| 5    | 99.516  | 14712     |  | 81.147 | 331       |
| 6    | 98.649  | 14712     |  | 83.084 | 331       |
| 7    | 101.159 | 14712     |  | 80.461 | 331       |
| 8    | 100.402 | 14712     |  | 79.003 | 331       |
| 9    | 98.548  | 14712     |  | 80.619 | 331       |
| 10   | 97.456  | 14712     |  | 81.317 | 331       |

nfsv4 with dtsize 1048576:
+------+---------+-----------+--+--------+-----------+
| 1    | 100.357 | 14712     |  | 78.976 | 91        |
| 2    | 99.61   | 14712     |  | 79.328 | 91        |
| 3    | 101.095 | 14712     |  | 80.649 | 91        |
| 4    | 107.904 | 14712     |  | 78.285 | 91        |
| 5    | 103.665 | 14712     |  | 79.258 | 91        |
| 6    | 98.877  | 14712     |  | 78.817 | 91        |
| 7    | 99.567  | 14712     |  | 81.11  | 91        |
| 8    | 99.096  | 14712     |  | 80.296 | 91        |
| 9    | 100.124 | 14712     |  | 78.865 | 91        |
| 10   | 100.603 | 14712     |  | 79.143 | 91        |
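
For a quick summary of the tables above, the sketch below averages the ten runs per configuration and reports the relative drop in time. The numbers are copied from the tables; the `summarise` helper and the dictionary layout are just one way to arrange them.

```python
from statistics import mean

# (times on v5.10-rc2, times with the v3 patch set), copied from the tables.
runs = {
    "nfsv3 dtsize 262144": (
        [81.583, 81.147, 81.61, 82.405, 82.066,
         82.395, 81.657, 81.555, 81.421, 81.472],
        [53.568, 50.781, 50.514, 50.746, 50.397,
         50.892, 50.882, 50.981, 50.558, 50.588],
    ),
    "nfsv3 dtsize 1048576": (
        [81.563, 82.123, 81.714, 81.707, 81.44,
         81.685, 81.17, 81.366, 81.067, 81.524],
        [52.692, 49.934, 50.158, 50.083, 50.045,
         50.021, 50.131, 49.928, 50.081, 50.442],
    ),
    "nfsv4 dtsize 32768": (
        [99.534, 98.998, 99.462, 99.891, 99.516,
         98.649, 101.159, 100.402, 98.548, 97.456],
        [79.461, 79.338, 81.101, 78.888, 81.147,
         83.084, 80.461, 79.003, 80.619, 81.317],
    ),
    "nfsv4 dtsize 1048576": (
        [100.357, 99.61, 101.095, 107.904, 103.665,
         98.877, 99.567, 99.096, 100.124, 100.603],
        [78.976, 79.328, 80.649, 78.285, 79.258,
         78.817, 81.11, 80.296, 78.865, 79.143],
    ),
}


def summarise(before, after):
    """Return (mean before, mean after, fractional reduction in time)."""
    b, a = mean(before), mean(after)
    return b, a, 1 - a / b


summary = {name: summarise(*pair) for name, pair in runs.items()}
for name, (b, a, cut) in summary.items():
    print(f"{name}: {b:.2f} -> {a:.2f} ({cut:.0%} less time)")
```
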

These look great.  Feel free to add either/both of my:
Reviewed-by: Benjamin Coddington <bcodding@redhat.com>
Tested-by: Benjamin Coddington <bcodding@redhat.com>

Ben

Trond Myklebust Nov. 7, 2020, 2:23 p.m. UTC | #2
On Sat, 2020-11-07 at 07:49 -0500, Benjamin Coddington wrote:
> These look great.  Feel free to add either/both of my:
> Reviewed-by: Benjamin Coddington <bcodding@redhat.com>
> Tested-by: Benjamin Coddington <bcodding@redhat.com>

Thanks again for testing! I missed this email before sending out v4,
but since that only adds 2 new patches to the series to deal with
Dave's very large changing directory case, I assume I can apply the
above tags to the rest anyway, as they have not changed?

Benjamin Coddington Nov. 8, 2020, 11:05 a.m. UTC | #3
On 7 Nov 2020, at 9:23, Trond Myklebust wrote:

> Thanks again for testing! I missed this email before sending out v4,
> but since that only adds 2 new patches to the series to deal with
> Dave's very large changing directory case, I assume I can apply the above
> tags to the rest anyway as they have not changed?

Yes, I'll check those out too.

Ben

Mkrtchyan, Tigran Nov. 8, 2020, 6:15 p.m. UTC | #4
----- Original Message -----
> From: "Benjamin Coddington" <bcodding@redhat.com>
> To: "Trond Myklebust" <trondmy@gmail.com>
> Cc: "linux-nfs" <linux-nfs@vger.kernel.org>
> Sent: Saturday, 7 November, 2020 13:49:31
> Subject: Re: [PATCH v3 00/17] Readdir enhancements

Hi Ben, hi Trond,

Although the number of RPC calls with dtsize 1048576 is about 3x
lower than with dtsize 32768, the elapsed time is almost the same.
According to your results, beyond some point (<= 32K) a bigger dtsize
makes no difference. As the original dtsize is 32K
(#define NFS_MAX_READDIR_PAGES 8), it looks like the
performance improvement comes mostly from a change
not related to the buffer size.

On the other hand, the number of RPC calls with the v3 patch set drops
by a factor of 40 or more. Whatever Trond has changed there has a big impact!
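
The arithmetic behind these two observations can be checked with a few lines. The RPC counts are taken from Ben's tables; the 4096-byte page size is an assumption (it is what makes 8 pages equal 32K), and the variable names are only illustrative.

```python
PAGE_SIZE = 4096             # assumption: 4 KiB pages
NFS_MAX_READDIR_PAGES = 8    # value from the #define mentioned above
old_dtsize = NFS_MAX_READDIR_PAGES * PAGE_SIZE

# (protocol, dtsize) -> (rpc calls on v5.10-rc2, rpc calls with v3 patch set)
calls = {
    ("nfsv3", 262144): (14710, 215),
    ("nfsv3", 1048576): (14710, 61),
    ("nfsv4", 32768): (14712, 331),
    ("nfsv4", 1048576): (14712, 91),
}

# Factor by which the patch set cuts the RPC count in each configuration.
reduction = {key: before / after for key, (before, after) in calls.items()}

# With the patch set applied: how many times fewer calls the 1MB dtsize
# needs compared with the 32K dtsize on NFSv4.
v4_small_vs_large = calls[("nfsv4", 32768)][1] / calls[("nfsv4", 1048576)][1]

print(old_dtsize)
for key, factor in reduction.items():
    print(key, round(factor, 1))
print(round(v4_small_vs_large, 1))
```
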

Thanks a lot for your efforts,
   Tigran.
