mbox series

[RFC,00/30] Overhaul NFSD filecache

Message ID 165590626293.75778.9843437418112335153.stgit@manet.1015granger.net (mailing list archive)
Headers show
Series Overhaul NFSD filecache | expand

Message

Chuck Lever June 22, 2022, 2:12 p.m. UTC
This series overhauls the NFSD filecache, a cache of server-side
"struct file" objects recently used by NFS clients. The purposes of
this overhaul are an immediate improvement in cache scalability in
the number of open files, and preparation for further improvements.

There are three categories of patches in this series:

1. Add observability of cache operation so we can see what we're
doing as changes are made to the code.

2. Improve the scalability of filecache garbage collection,
addressing several bugs along the way.

3. Improve the scalability of the filecache hash table by converting
it to use rhashtable.

The series as it stands survives typical test workloads. Running
stress-tests like generic/531 is the next step.

These patches are also available in the linux-nfs-bugzilla-386
branch of

  https://git.kernel.org/pub/scm/linux/kernel/git/cel/linux.git 

---

Chuck Lever (30):
      NFSD: Report filecache LRU size
      NFSD: Report count of calls to nfsd_file_acquire()
      NFSD: Report count of freed filecache items
      NFSD: Report average age of filecache items
      NFSD: Add nfsd_file_lru_dispose_list() helper
      NFSD: Refactor nfsd_file_gc()
      NFSD: Refactor nfsd_file_lru_scan()
      NFSD: Report the number of items evicted by the LRU walk
      NFSD: Record number of flush calls
      NFSD: Report filecache item construction failures
      NFSD: Zero counters when the filecache is re-initialized
      NFSD: Hook up the filecache stat file
      NFSD: WARN when freeing an item still linked via nf_lru
      NFSD: Trace filecache LRU activity
      NFSD: Leave open files out of the filecache LRU
      NFSD: Fix the filecache LRU shrinker
      NFSD: Never call nfsd_file_gc() in foreground paths
      NFSD: No longer record nf_hashval in the trace log
      NFSD: Remove lockdep assertion from unhash_and_release_locked()
      NFSD: nfsd_file_unhash can compute hashval from nf->nf_inode
      NFSD: Refactor __nfsd_file_close_inode()
      NFSD: nfsd_file_hash_remove can compute hashval
      NFSD: Remove nfsd_file::nf_hashval
      NFSD: Remove stale comment from nfsd_file_acquire()
      NFSD: Clean up "open file" case in nfsd_file_acquire()
      NFSD: Document nfsd_file_cache_purge() API contract
      NFSD: Replace the "init once" mechanism
      NFSD: Set up an rhashtable for the filecache
      NFSD: Convert the filecache to use rhashtable
      NFSD: Clean up unusued code after rhashtable conversion


 fs/nfsd/filecache.c | 677 +++++++++++++++++++++++++++-----------------
 fs/nfsd/filecache.h |   6 +-
 fs/nfsd/nfsctl.c    |  10 +
 fs/nfsd/trace.h     | 117 ++++++--
 4 files changed, 522 insertions(+), 288 deletions(-)

--
Chuck Lever

Comments

Wang Yugui June 22, 2022, 6:36 p.m. UTC | #1
Hi,

fstests generic/531 triggered a panic on kernel 5.19.0-rc3 with this
patchset.

[  405.478056] BUG: kernel NULL pointer dereference, address: 0000000000000049

[  405.608016] Call Trace:
[  405.608016]  <TASK>
[  405.613020]  nfs4_get_vfs_file+0x325/0x410 [nfsd]
[  405.618018]  nfsd4_process_open2+0x4ba/0x16d0 [nfsd]
[  405.623016]  ? inode_get_bytes+0x38/0x40
[  405.623016]  ? nfsd_permission+0x97/0xf0 [nfsd]
[  405.628022]  ? fh_verify+0x1cc/0x6f0 [nfsd]
[  405.633025]  nfsd4_open+0x640/0xb30 [nfsd]
[  405.638025]  nfsd4_proc_compound+0x3bd/0x710 [nfsd]
[  405.643017]  nfsd_dispatch+0x143/0x270 [nfsd]
[  405.648019]  svc_process_common+0x3bf/0x5b0 [sunrpc]

more detail in attachment file(531.dmesg)

local.config of fstests:
	export NFS_MOUNT_OPTIONS="-o rw,relatime,vers=4.2,nconnect=8"
changes of generic/531
	max_allowable_files=$(( 1 * 1024 * 1024 / $nr_cpus / 2 ))

Best Regards
Wang Yugui (wangyugui@e16-tech.com)
2022/06/23

> This series overhauls the NFSD filecache, a cache of server-side
> "struct file" objects recently used by NFS clients. The purposes of
> this overhaul are an immediate improvement in cache scalability in
> the number of open files, and preparation for further improvements.
> 
> There are three categories of patches in this series:
> 
> 1. Add observability of cache operation so we can see what we're
> doing as changes are made to the code.
> 
> 2. Improve the scalability of filecache garbage collection,
> addressing several bugs along the way.
> 
> 3. Improve the scalability of the filecache hash table by converting
> it to use rhashtable.
> 
> The series as it stands survives typical test workloads. Running
> stress-tests like generic/531 is the next step.
> 
> These patches are also available in the linux-nfs-bugzilla-386
> branch of
> 
>   https://git.kernel.org/pub/scm/linux/kernel/git/cel/linux.git 
> 
> ---
> 
> Chuck Lever (30):
>       NFSD: Report filecache LRU size
>       NFSD: Report count of calls to nfsd_file_acquire()
>       NFSD: Report count of freed filecache items
>       NFSD: Report average age of filecache items
>       NFSD: Add nfsd_file_lru_dispose_list() helper
>       NFSD: Refactor nfsd_file_gc()
>       NFSD: Refactor nfsd_file_lru_scan()
>       NFSD: Report the number of items evicted by the LRU walk
>       NFSD: Record number of flush calls
>       NFSD: Report filecache item construction failures
>       NFSD: Zero counters when the filecache is re-initialized
>       NFSD: Hook up the filecache stat file
>       NFSD: WARN when freeing an item still linked via nf_lru
>       NFSD: Trace filecache LRU activity
>       NFSD: Leave open files out of the filecache LRU
>       NFSD: Fix the filecache LRU shrinker
>       NFSD: Never call nfsd_file_gc() in foreground paths
>       NFSD: No longer record nf_hashval in the trace log
>       NFSD: Remove lockdep assertion from unhash_and_release_locked()
>       NFSD: nfsd_file_unhash can compute hashval from nf->nf_inode
>       NFSD: Refactor __nfsd_file_close_inode()
>       NFSD: nfsd_file_hash_remove can compute hashval
>       NFSD: Remove nfsd_file::nf_hashval
>       NFSD: Remove stale comment from nfsd_file_acquire()
>       NFSD: Clean up "open file" case in nfsd_file_acquire()
>       NFSD: Document nfsd_file_cache_purge() API contract
>       NFSD: Replace the "init once" mechanism
>       NFSD: Set up an rhashtable for the filecache
>       NFSD: Convert the filecache to use rhashtable
>       NFSD: Clean up unusued code after rhashtable conversion
> 
> 
>  fs/nfsd/filecache.c | 677 +++++++++++++++++++++++++++-----------------
>  fs/nfsd/filecache.h |   6 +-
>  fs/nfsd/nfsctl.c    |  10 +
>  fs/nfsd/trace.h     | 117 ++++++--
>  4 files changed, 522 insertions(+), 288 deletions(-)
> 
> --
> Chuck Lever
Chuck Lever June 22, 2022, 7:04 p.m. UTC | #2
> On Jun 22, 2022, at 2:36 PM, Wang Yugui <wangyugui@e16-tech.com> wrote:
> 
> Hi,
> 
> fstests generic/531 triggered a panic on kernel 5.19.0-rc3 with this
> patchset.

As I mention in the cover letter, I haven't tried running generic/531
yet -- no claim at all that this is finished work and that #386 has
been fixed at this point. I'm merely interested in comments on the
general approach.


> [  405.478056] BUG: kernel NULL pointer dereference, address: 0000000000000049

The "RIP: " tells the location of the crash. Notice that the call
trace here does not include that information. From your attachment:

[  405.518022] RIP: 0010:nfsd_do_file_acquire+0x4e1/0xb80 [nfsd]

To match that to a line of source code:

[cel@manet ~]$ cd src/linux/linux/
[cel@manet linux]$ scripts/faddr2line ../obj/manet/fs/nfsd/filecache.o nfsd_do_file_acquire+0x4e1
nfsd_do_file_acquire+0x4e1/0xfc0:
rht_bucket_insert at /home/cel/src/linux/linux/include/linux/rhashtable.h:303
(inlined by) __rhashtable_insert_fast at /home/cel/src/linux/linux/include/linux/rhashtable.h:718
(inlined by) rhashtable_lookup_get_insert_key at /home/cel/src/linux/linux/include/linux/rhashtable.h:982
(inlined by) nfsd_file_insert at /home/cel/src/linux/linux/fs/nfsd/filecache.c:1031
(inlined by) nfsd_do_file_acquire at /home/cel/src/linux/linux/fs/nfsd/filecache.c:1089
[cel@manet linux]$

This is an example, I'm sure my compiled objects don't match yours.

And, now that I've added observability, you should be able to do:

  # watch cat /proc/fs/nfsd/filecache

to see how many items are in the hash and LRU list while the test
is running.


> [  405.608016] Call Trace:
> [  405.608016]  <TASK>
> [  405.613020]  nfs4_get_vfs_file+0x325/0x410 [nfsd]
> [  405.618018]  nfsd4_process_open2+0x4ba/0x16d0 [nfsd]
> [  405.623016]  ? inode_get_bytes+0x38/0x40
> [  405.623016]  ? nfsd_permission+0x97/0xf0 [nfsd]
> [  405.628022]  ? fh_verify+0x1cc/0x6f0 [nfsd]
> [  405.633025]  nfsd4_open+0x640/0xb30 [nfsd]
> [  405.638025]  nfsd4_proc_compound+0x3bd/0x710 [nfsd]
> [  405.643017]  nfsd_dispatch+0x143/0x270 [nfsd]
> [  405.648019]  svc_process_common+0x3bf/0x5b0 [sunrpc]
> 
> more detail in attachment file(531.dmesg)
> 
> local.config of fstests:
> 	export NFS_MOUNT_OPTIONS="-o rw,relatime,vers=4.2,nconnect=8"
> changes of generic/531
> 	max_allowable_files=$(( 1 * 1024 * 1024 / $nr_cpus / 2 ))

Changed from:

	max_allowable_files=$(( $(cat /proc/sys/fs/file-max) / $nr_cpus / 2 ))

For my own information, what's $nr_cpus in your test?

Aside from the max_allowable_files setting, can you tell how the
test determines when it should stop creating files? Is it looking
for a particular error code from open(2), for instance?

On my client:

[cel@morisot generic]$ cat /proc/sys/fs/file-max
9223372036854775807
[cel@morisot generic]$

I wonder if it's realistic to expect an NFSv4 server to support
that many open files. Is 9 quintillion files really something
I'm going to have to engineer for, or is this just a crazy
test?


> Best Regards
> Wang Yugui (wangyugui@e16-tech.com)
> 2022/06/23
> 
>> This series overhauls the NFSD filecache, a cache of server-side
>> "struct file" objects recently used by NFS clients. The purposes of
>> this overhaul are an immediate improvement in cache scalability in
>> the number of open files, and preparation for further improvements.
>> 
>> There are three categories of patches in this series:
>> 
>> 1. Add observability of cache operation so we can see what we're
>> doing as changes are made to the code.
>> 
>> 2. Improve the scalability of filecache garbage collection,
>> addressing several bugs along the way.
>> 
>> 3. Improve the scalability of the filecache hash table by converting
>> it to use rhashtable.
>> 
>> The series as it stands survives typical test workloads. Running
>> stress-tests like generic/531 is the next step.
>> 
>> These patches are also available in the linux-nfs-bugzilla-386
>> branch of
>> 
>>  https://git.kernel.org/pub/scm/linux/kernel/git/cel/linux.git 
>> 
>> ---
>> 
>> Chuck Lever (30):
>>      NFSD: Report filecache LRU size
>>      NFSD: Report count of calls to nfsd_file_acquire()
>>      NFSD: Report count of freed filecache items
>>      NFSD: Report average age of filecache items
>>      NFSD: Add nfsd_file_lru_dispose_list() helper
>>      NFSD: Refactor nfsd_file_gc()
>>      NFSD: Refactor nfsd_file_lru_scan()
>>      NFSD: Report the number of items evicted by the LRU walk
>>      NFSD: Record number of flush calls
>>      NFSD: Report filecache item construction failures
>>      NFSD: Zero counters when the filecache is re-initialized
>>      NFSD: Hook up the filecache stat file
>>      NFSD: WARN when freeing an item still linked via nf_lru
>>      NFSD: Trace filecache LRU activity
>>      NFSD: Leave open files out of the filecache LRU
>>      NFSD: Fix the filecache LRU shrinker
>>      NFSD: Never call nfsd_file_gc() in foreground paths
>>      NFSD: No longer record nf_hashval in the trace log
>>      NFSD: Remove lockdep assertion from unhash_and_release_locked()
>>      NFSD: nfsd_file_unhash can compute hashval from nf->nf_inode
>>      NFSD: Refactor __nfsd_file_close_inode()
>>      NFSD: nfsd_file_hash_remove can compute hashval
>>      NFSD: Remove nfsd_file::nf_hashval
>>      NFSD: Remove stale comment from nfsd_file_acquire()
>>      NFSD: Clean up "open file" case in nfsd_file_acquire()
>>      NFSD: Document nfsd_file_cache_purge() API contract
>>      NFSD: Replace the "init once" mechanism
>>      NFSD: Set up an rhashtable for the filecache
>>      NFSD: Convert the filecache to use rhashtable
>>      NFSD: Clean up unusued code after rhashtable conversion
>> 
>> 
>> fs/nfsd/filecache.c | 677 +++++++++++++++++++++++++++-----------------
>> fs/nfsd/filecache.h |   6 +-
>> fs/nfsd/nfsctl.c    |  10 +
>> fs/nfsd/trace.h     | 117 ++++++--
>> 4 files changed, 522 insertions(+), 288 deletions(-)
>> 
>> --
>> Chuck Lever
> 
> <531.dmesg>

--
Chuck Lever
Chuck Lever June 22, 2022, 7:59 p.m. UTC | #3
> On Jun 22, 2022, at 3:04 PM, Chuck Lever III <chuck.lever@oracle.com> wrote:
> 
> 
> 
>> On Jun 22, 2022, at 2:36 PM, Wang Yugui <wangyugui@e16-tech.com> wrote:
>> 
>> Hi,
>> 
>> fstests generic/531 triggered a panic on kernel 5.19.0-rc3 with this
>> patchset.
> 
> As I mention in the cover letter, I haven't tried running generic/531
> yet -- no claim at all that this is finished work and that #386 has
> been fixed at this point. I'm merely interested in comments on the
> general approach.
> 
> 
>> [ 405.478056] BUG: kernel NULL pointer dereference, address: 0000000000000049
> 
> The "RIP: " tells the location of the crash. Notice that the call
> trace here does not include that information. From your attachment:
> 
> [ 405.518022] RIP: 0010:nfsd_do_file_acquire+0x4e1/0xb80 [nfsd]
> 
> To match that to a line of source code:
> 
> [cel@manet ~]$ cd src/linux/linux/
> [cel@manet linux]$ scripts/faddr2line ../obj/manet/fs/nfsd/filecache.o nfsd_do_file_acquire+0x4e1
> nfsd_do_file_acquire+0x4e1/0xfc0:
> rht_bucket_insert at /home/cel/src/linux/linux/include/linux/rhashtable.h:303
> (inlined by) __rhashtable_insert_fast at /home/cel/src/linux/linux/include/linux/rhashtable.h:718
> (inlined by) rhashtable_lookup_get_insert_key at /home/cel/src/linux/linux/include/linux/rhashtable.h:982
> (inlined by) nfsd_file_insert at /home/cel/src/linux/linux/fs/nfsd/filecache.c:1031
> (inlined by) nfsd_do_file_acquire at /home/cel/src/linux/linux/fs/nfsd/filecache.c:1089
> [cel@manet linux]$
> 
> This is an example, I'm sure my compiled objects don't match yours.
> 
> And, now that I've added observability, you should be able to do:
> 
> # watch cat /proc/fs/nfsd/filecache
> 
> to see how many items are in the hash and LRU list while the test
> is running.
> 
> 
>> [ 405.608016] Call Trace:
>> [ 405.608016] <TASK>
>> [ 405.613020] nfs4_get_vfs_file+0x325/0x410 [nfsd]
>> [ 405.618018] nfsd4_process_open2+0x4ba/0x16d0 [nfsd]
>> [ 405.623016] ? inode_get_bytes+0x38/0x40
>> [ 405.623016] ? nfsd_permission+0x97/0xf0 [nfsd]
>> [ 405.628022] ? fh_verify+0x1cc/0x6f0 [nfsd]
>> [ 405.633025] nfsd4_open+0x640/0xb30 [nfsd]
>> [ 405.638025] nfsd4_proc_compound+0x3bd/0x710 [nfsd]
>> [ 405.643017] nfsd_dispatch+0x143/0x270 [nfsd]
>> [ 405.648019] svc_process_common+0x3bf/0x5b0 [sunrpc]

I was able to trigger something that looks very much like this crash.
If you remove this line from fs/nfsd/filecache.c:

	.max_size		= 131072, /* buckets */

things get a lot more stable for generic/531.

I'm looking into the issue now.


>> more detail in attachment file(531.dmesg)
>> 
>> local.config of fstests:
>> 	export NFS_MOUNT_OPTIONS="-o rw,relatime,vers=4.2,nconnect=8"
>> changes of generic/531
>> 	max_allowable_files=$(( 1 * 1024 * 1024 / $nr_cpus / 2 ))
> 
> Changed from:
> 
> 	max_allowable_files=$(( $(cat /proc/sys/fs/file-max) / $nr_cpus / 2 ))
> 
> For my own information, what's $nr_cpus in your test?
> 
> Aside from the max_allowable_files setting, can you tell how the
> test determines when it should stop creating files? Is it looking
> for a particular error code from open(2), for instance?
> 
> On my client:
> 
> [cel@morisot generic]$ cat /proc/sys/fs/file-max
> 9223372036854775807
> [cel@morisot generic]$
> 
> I wonder if it's realistic to expect an NFSv4 server to support
> that many open files. Is 9 quintillion files really something
> I'm going to have to engineer for, or is this just a crazy
> test?
> 
> 
>> Best Regards
>> Wang Yugui (wangyugui@e16-tech.com)
>> 2022/06/23
>> 
>>> This series overhauls the NFSD filecache, a cache of server-side
>>> "struct file" objects recently used by NFS clients. The purposes of
>>> this overhaul are an immediate improvement in cache scalability in
>>> the number of open files, and preparation for further improvements.
>>> 
>>> There are three categories of patches in this series:
>>> 
>>> 1. Add observability of cache operation so we can see what we're
>>> doing as changes are made to the code.
>>> 
>>> 2. Improve the scalability of filecache garbage collection,
>>> addressing several bugs along the way.
>>> 
>>> 3. Improve the scalability of the filecache hash table by converting
>>> it to use rhashtable.
>>> 
>>> The series as it stands survives typical test workloads. Running
>>> stress-tests like generic/531 is the next step.
>>> 
>>> These patches are also available in the linux-nfs-bugzilla-386
>>> branch of
>>> 
>>> https://git.kernel.org/pub/scm/linux/kernel/git/cel/linux.git 
>>> 
>>> ---
>>> 
>>> Chuck Lever (30):
>>> NFSD: Report filecache LRU size
>>> NFSD: Report count of calls to nfsd_file_acquire()
>>> NFSD: Report count of freed filecache items
>>> NFSD: Report average age of filecache items
>>> NFSD: Add nfsd_file_lru_dispose_list() helper
>>> NFSD: Refactor nfsd_file_gc()
>>> NFSD: Refactor nfsd_file_lru_scan()
>>> NFSD: Report the number of items evicted by the LRU walk
>>> NFSD: Record number of flush calls
>>> NFSD: Report filecache item construction failures
>>> NFSD: Zero counters when the filecache is re-initialized
>>> NFSD: Hook up the filecache stat file
>>> NFSD: WARN when freeing an item still linked via nf_lru
>>> NFSD: Trace filecache LRU activity
>>> NFSD: Leave open files out of the filecache LRU
>>> NFSD: Fix the filecache LRU shrinker
>>> NFSD: Never call nfsd_file_gc() in foreground paths
>>> NFSD: No longer record nf_hashval in the trace log
>>> NFSD: Remove lockdep assertion from unhash_and_release_locked()
>>> NFSD: nfsd_file_unhash can compute hashval from nf->nf_inode
>>> NFSD: Refactor __nfsd_file_close_inode()
>>> NFSD: nfsd_file_hash_remove can compute hashval
>>> NFSD: Remove nfsd_file::nf_hashval
>>> NFSD: Remove stale comment from nfsd_file_acquire()
>>> NFSD: Clean up "open file" case in nfsd_file_acquire()
>>> NFSD: Document nfsd_file_cache_purge() API contract
>>> NFSD: Replace the "init once" mechanism
>>> NFSD: Set up an rhashtable for the filecache
>>> NFSD: Convert the filecache to use rhashtable
>>> NFSD: Clean up unusued code after rhashtable conversion
>>> 
>>> 
>>> fs/nfsd/filecache.c | 677 +++++++++++++++++++++++++++-----------------
>>> fs/nfsd/filecache.h | 6 +-
>>> fs/nfsd/nfsctl.c | 10 +
>>> fs/nfsd/trace.h | 117 ++++++--
>>> 4 files changed, 522 insertions(+), 288 deletions(-)
>>> 
>>> --
>>> Chuck Lever
>> 
>> <531.dmesg>
> 
> --
> Chuck Lever

--
Chuck Lever
Dave Chinner June 23, 2022, 12:21 a.m. UTC | #4
On Wed, Jun 22, 2022 at 07:04:39PM +0000, Chuck Lever III wrote:
> > more detail in attachment file(531.dmesg)
> > 
> > local.config of fstests:
> > 	export NFS_MOUNT_OPTIONS="-o rw,relatime,vers=4.2,nconnect=8"
> > changes of generic/531
> > 	max_allowable_files=$(( 1 * 1024 * 1024 / $nr_cpus / 2 ))
> 
> Changed from:
> 
> 	max_allowable_files=$(( $(cat /proc/sys/fs/file-max) / $nr_cpus / 2 ))
> 
> For my own information, what's $nr_cpus in your test?
> 
> Aside from the max_allowable_files setting, can you tell how the
> test determines when it should stop creating files? Is it looking
> for a particular error code from open(2), for instance?
> 
> On my client:
> 
> [cel@morisot generic]$ cat /proc/sys/fs/file-max
> 9223372036854775807
> [cel@morisot generic]$

$ echo $((2**63 - 1))
9223372036854775807

i.e. LLONG_MAX, or "no limit is set".

> I wonder if it's realistic to expect an NFSv4 server to support
> that many open files. Is 9 quintillion files really something
> I'm going to have to engineer for, or is this just a crazy
> test?

The test does not use the value directly - it's a max value for
clamping:

max_files=$((50000 * LOAD_FACTOR))
max_allowable_files=$(( $(cat /proc/sys/fs/file-max) / $nr_cpus / 2 ))
test $max_allowable_files -gt 0 && test $max_files -gt $max_allowable_files && \
        max_files=$max_allowable_files
ulimit -n $max_files

i.e. the result should be

max_files = max(50000, max_allowable_files)

So the test should only be allowing 50,000 open unlinked files to be
created before unmounting. Which means there's lots of silly
renaming going on at the client and so the server is probably seeing
100,000 unique file handles across the test....

Cheers,

Dave.
Chuck Lever June 23, 2022, 1:01 a.m. UTC | #5
> On Jun 22, 2022, at 8:21 PM, Dave Chinner <david@fromorbit.com> wrote:
> 
> On Wed, Jun 22, 2022 at 07:04:39PM +0000, Chuck Lever III wrote:
>>> more detail in attachment file(531.dmesg)
>>> 
>>> local.config of fstests:
>>> 	export NFS_MOUNT_OPTIONS="-o rw,relatime,vers=4.2,nconnect=8"
>>> changes of generic/531
>>> 	max_allowable_files=$(( 1 * 1024 * 1024 / $nr_cpus / 2 ))
>> 
>> Changed from:
>> 
>> 	max_allowable_files=$(( $(cat /proc/sys/fs/file-max) / $nr_cpus / 2 ))
>> 
>> For my own information, what's $nr_cpus in your test?
>> 
>> Aside from the max_allowable_files setting, can you tell how the
>> test determines when it should stop creating files? Is it looking
>> for a particular error code from open(2), for instance?
>> 
>> On my client:
>> 
>> [cel@morisot generic]$ cat /proc/sys/fs/file-max
>> 9223372036854775807
>> [cel@morisot generic]$
> 
> $ echo $((2**63 - 1))
> 9223372036854775807
> 
> i.e. LLONG_MAX, or "no limit is set".
> 
>> I wonder if it's realistic to expect an NFSv4 server to support
>> that many open files. Is 9 quintillion files really something
>> I'm going to have to engineer for, or is this just a crazy
>> test?
> 
> The test does not use the value directly - it's a max value for
> clamping:
> 
> max_files=$((50000 * LOAD_FACTOR))
> max_allowable_files=$(( $(cat /proc/sys/fs/file-max) / $nr_cpus / 2 ))
> test $max_allowable_files -gt 0 && test $max_files -gt $max_allowable_files && \
>        max_files=$max_allowable_files
> ulimit -n $max_files
> 
> i.e. the result should be
> 
> max_files = max(50000, max_allowable_files)
> 
> So the test should only be allowing 50,000 open unlinked files to be
> created before unmounting.

Looking at my testing, it's ~50,000 per worker thread, and there are
2 workers per physical core on the client. But thankfully, this is
much smaller than 9 quintillion.


> Which means there's lots of silly
> renaming going on at the client and so the server is probably seeing
> 100,000 unique file handles across the test....
> 
> Cheers,
> 
> Dave.
> -- 
> Dave Chinner
> david@fromorbit.com

--
Chuck Lever
Wang Yugui June 23, 2022, 9:02 a.m. UTC | #6
Hi,

> > On Jun 22, 2022, at 3:04 PM, Chuck Lever III <chuck.lever@oracle.com> wrote:
> >> On Jun 22, 2022, at 2:36 PM, Wang Yugui <wangyugui@e16-tech.com> wrote:
> >> 
> >> Hi,
> >> 
> >> fstests generic/531 triggered a panic on kernel 5.19.0-rc3 with this
> >> patchset.
> > 
> > As I mention in the cover letter, I haven't tried running generic/531
> > yet -- no claim at all that this is finished work and that #386 has
> > been fixed at this point. I'm merely interested in comments on the
> > general approach.
> > 
> > 
> >> [ 405.478056] BUG: kernel NULL pointer dereference, address: 0000000000000049
> > 
> > The "RIP: " tells the location of the crash. Notice that the call
> > trace here does not include that information. From your attachment:
> > 
> > [ 405.518022] RIP: 0010:nfsd_do_file_acquire+0x4e1/0xb80 [nfsd]
> > 
> > To match that to a line of source code:
> > 
> > [cel@manet ~]$ cd src/linux/linux/
> > [cel@manet linux]$ scripts/faddr2line ../obj/manet/fs/nfsd/filecache.o nfsd_do_file_acquire+0x4e1
> > nfsd_do_file_acquire+0x4e1/0xfc0:
> > rht_bucket_insert at /home/cel/src/linux/linux/include/linux/rhashtable.h:303
> > (inlined by) __rhashtable_insert_fast at /home/cel/src/linux/linux/include/linux/rhashtable.h:718
> > (inlined by) rhashtable_lookup_get_insert_key at /home/cel/src/linux/linux/include/linux/rhashtable.h:982
> > (inlined by) nfsd_file_insert at /home/cel/src/linux/linux/fs/nfsd/filecache.c:1031
> > (inlined by) nfsd_do_file_acquire at /home/cel/src/linux/linux/fs/nfsd/filecache.c:1089
> > [cel@manet linux]$
> > 
> > This is an example, I'm sure my compiled objects don't match yours.
> > 
> > And, now that I've added observability, you should be able to do:
> > 
> > # watch cat /proc/fs/nfsd/filecache
> > 
> > to see how many items are in the hash and LRU list while the test
> > is running.
> > 
> > 
> >> [ 405.608016] Call Trace:
> >> [ 405.608016] <TASK>
> >> [ 405.613020] nfs4_get_vfs_file+0x325/0x410 [nfsd]
> >> [ 405.618018] nfsd4_process_open2+0x4ba/0x16d0 [nfsd]
> >> [ 405.623016] ? inode_get_bytes+0x38/0x40
> >> [ 405.623016] ? nfsd_permission+0x97/0xf0 [nfsd]
> >> [ 405.628022] ? fh_verify+0x1cc/0x6f0 [nfsd]
> >> [ 405.633025] nfsd4_open+0x640/0xb30 [nfsd]
> >> [ 405.638025] nfsd4_proc_compound+0x3bd/0x710 [nfsd]
> >> [ 405.643017] nfsd_dispatch+0x143/0x270 [nfsd]
> >> [ 405.648019] svc_process_common+0x3bf/0x5b0 [sunrpc]
> 
> I was able to trigger something that looks very much like this crash.
> If you remove this line from fs/nfsd/filecache.c:
> 
> 	.max_size		= 131072, /* buckets */
> 
> things get a lot more stable for generic/531.
> 
> I'm looking into the issue now.

Yes.  When '.max_size  = 131072' is removed,  fstests generic/531 passed.

Best Regards
Wang Yugui (wangyugui@e16-tech.com)
2022/06/23
Chuck Lever June 23, 2022, 4:44 p.m. UTC | #7
> On Jun 23, 2022, at 5:02 AM, Wang Yugui <wangyugui@e16-tech.com> wrote:
> 
> Hi,
> 
>>> On Jun 22, 2022, at 3:04 PM, Chuck Lever III <chuck.lever@oracle.com> wrote:
>>>> On Jun 22, 2022, at 2:36 PM, Wang Yugui <wangyugui@e16-tech.com> wrote:
>>>> 
>>>> Hi,
>>>> 
>>>> fstests generic/531 triggered a panic on kernel 5.19.0-rc3 with this
>>>> patchset.
>>> 
>>> As I mention in the cover letter, I haven't tried running generic/531
>>> yet -- no claim at all that this is finished work and that #386 has
>>> been fixed at this point. I'm merely interested in comments on the
>>> general approach.
>>> 
>>> 
>>>> [ 405.478056] BUG: kernel NULL pointer dereference, address: 0000000000000049
>>> 
>>> The "RIP: " tells the location of the crash. Notice that the call
>>> trace here does not include that information. From your attachment:
>>> 
>>> [ 405.518022] RIP: 0010:nfsd_do_file_acquire+0x4e1/0xb80 [nfsd]
>>> 
>>> To match that to a line of source code:
>>> 
>>> [cel@manet ~]$ cd src/linux/linux/
>>> [cel@manet linux]$ scripts/faddr2line ../obj/manet/fs/nfsd/filecache.o nfsd_do_file_acquire+0x4e1
>>> nfsd_do_file_acquire+0x4e1/0xfc0:
>>> rht_bucket_insert at /home/cel/src/linux/linux/include/linux/rhashtable.h:303
>>> (inlined by) __rhashtable_insert_fast at /home/cel/src/linux/linux/include/linux/rhashtable.h:718
>>> (inlined by) rhashtable_lookup_get_insert_key at /home/cel/src/linux/linux/include/linux/rhashtable.h:982
>>> (inlined by) nfsd_file_insert at /home/cel/src/linux/linux/fs/nfsd/filecache.c:1031
>>> (inlined by) nfsd_do_file_acquire at /home/cel/src/linux/linux/fs/nfsd/filecache.c:1089
>>> [cel@manet linux]$
>>> 
>>> This is an example, I'm sure my compiled objects don't match yours.
>>> 
>>> And, now that I've added observability, you should be able to do:
>>> 
>>> # watch cat /proc/fs/nfsd/filecache
>>> 
>>> to see how many items are in the hash and LRU list while the test
>>> is running.
>>> 
>>> 
>>>> [ 405.608016] Call Trace:
>>>> [ 405.608016] <TASK>
>>>> [ 405.613020] nfs4_get_vfs_file+0x325/0x410 [nfsd]
>>>> [ 405.618018] nfsd4_process_open2+0x4ba/0x16d0 [nfsd]
>>>> [ 405.623016] ? inode_get_bytes+0x38/0x40
>>>> [ 405.623016] ? nfsd_permission+0x97/0xf0 [nfsd]
>>>> [ 405.628022] ? fh_verify+0x1cc/0x6f0 [nfsd]
>>>> [ 405.633025] nfsd4_open+0x640/0xb30 [nfsd]
>>>> [ 405.638025] nfsd4_proc_compound+0x3bd/0x710 [nfsd]
>>>> [ 405.643017] nfsd_dispatch+0x143/0x270 [nfsd]
>>>> [ 405.648019] svc_process_common+0x3bf/0x5b0 [sunrpc]
>> 
>> I was able to trigger something that looks very much like this crash.
>> If you remove this line from fs/nfsd/filecache.c:
>> 
>> 	.max_size		= 131072, /* buckets */
>> 
>> things get a lot more stable for generic/531.
>> 
>> I'm looking into the issue now.
> 
> Yes. When '.max_size = 131072' is removed, fstests generic/531 passed.

Great! Are you comfortable with this general approach for bug #386?


--
Chuck Lever
Wang Yugui June 23, 2022, 5:51 p.m. UTC | #8
Hi,

> > On Jun 23, 2022, at 5:02 AM, Wang Yugui <wangyugui@e16-tech.com> wrote:
> > 
> > Hi,
> > 
> >>> On Jun 22, 2022, at 3:04 PM, Chuck Lever III <chuck.lever@oracle.com> wrote:
> >>>> On Jun 22, 2022, at 2:36 PM, Wang Yugui <wangyugui@e16-tech.com> wrote:
> >>>> 
> >>>> Hi,
> >>>> 
> >>>> fstests generic/531 triggered a panic on kernel 5.19.0-rc3 with this
> >>>> patchset.
> >>> 
> >>> As I mention in the cover letter, I haven't tried running generic/531
> >>> yet -- no claim at all that this is finished work and that #386 has
> >>> been fixed at this point. I'm merely interested in comments on the
> >>> general approach.
> >>> 
> >>> 
> >>>> [ 405.478056] BUG: kernel NULL pointer dereference, address: 0000000000000049
> >>> 
> >>> The "RIP: " tells the location of the crash. Notice that the call
> >>> trace here does not include that information. From your attachment:
> >>> 
> >>> [ 405.518022] RIP: 0010:nfsd_do_file_acquire+0x4e1/0xb80 [nfsd]
> >>> 
> >>> To match that to a line of source code:
> >>> 
> >>> [cel@manet ~]$ cd src/linux/linux/
> >>> [cel@manet linux]$ scripts/faddr2line ../obj/manet/fs/nfsd/filecache.o nfsd_do_file_acquire+0x4e1
> >>> nfsd_do_file_acquire+0x4e1/0xfc0:
> >>> rht_bucket_insert at /home/cel/src/linux/linux/include/linux/rhashtable.h:303
> >>> (inlined by) __rhashtable_insert_fast at /home/cel/src/linux/linux/include/linux/rhashtable.h:718
> >>> (inlined by) rhashtable_lookup_get_insert_key at /home/cel/src/linux/linux/include/linux/rhashtable.h:982
> >>> (inlined by) nfsd_file_insert at /home/cel/src/linux/linux/fs/nfsd/filecache.c:1031
> >>> (inlined by) nfsd_do_file_acquire at /home/cel/src/linux/linux/fs/nfsd/filecache.c:1089
> >>> [cel@manet linux]$
> >>> 
> >>> This is an example, I'm sure my compiled objects don't match yours.
> >>> 
> >>> And, now that I've added observability, you should be able to do:
> >>> 
> >>> # watch cat /proc/fs/nfsd/filecache
> >>> 
> >>> to see how many items are in the hash and LRU list while the test
> >>> is running.
> >>> 
> >>> 
> >>>> [ 405.608016] Call Trace:
> >>>> [ 405.608016] <TASK>
> >>>> [ 405.613020] nfs4_get_vfs_file+0x325/0x410 [nfsd]
> >>>> [ 405.618018] nfsd4_process_open2+0x4ba/0x16d0 [nfsd]
> >>>> [ 405.623016] ? inode_get_bytes+0x38/0x40
> >>>> [ 405.623016] ? nfsd_permission+0x97/0xf0 [nfsd]
> >>>> [ 405.628022] ? fh_verify+0x1cc/0x6f0 [nfsd]
> >>>> [ 405.633025] nfsd4_open+0x640/0xb30 [nfsd]
> >>>> [ 405.638025] nfsd4_proc_compound+0x3bd/0x710 [nfsd]
> >>>> [ 405.643017] nfsd_dispatch+0x143/0x270 [nfsd]
> >>>> [ 405.648019] svc_process_common+0x3bf/0x5b0 [sunrpc]
> >> 
> >> I was able to trigger something that looks very much like this crash.
> >> If you remove this line from fs/nfsd/filecache.c:
> >> 
> >> 	.max_size		= 131072, /* buckets */
> >> 
> >> things get a lot more stable for generic/531.
> >> 
> >> I'm looking into the issue now.
> > 
> > Yes. When '.max_size = 131072' is removed, fstests generic/531 passed.
> 
> Great! Are you comfortable with this general approach for bug #386?

It seems a good result for #386.

fstests generic/531(file-max: 1M) performance result:
base(5.19.0-rc3, 12 bits hash, serialized nfsd_file_gc): 222s
this patchset(.min_size=4096): 59s
so, a good improvement for #386.

It seems a good(acceptable) result for #387 too.
the period of 'text busy(exec directly from the back-end of nfs-server)'
is about 4s.

Best Regards
Wang Yugui (wangyugui@e16-tech.com)
2022/06/24
Frank van der Linden June 23, 2022, 8:27 p.m. UTC | #9
On Wed, Jun 22, 2022 at 7:12 AM Chuck Lever <chuck.lever@oracle.com> wrote:
>
> This series overhauls the NFSD filecache, a cache of server-side
> "struct file" objects recently used by NFS clients. The purposes of
> this overhaul are an immediate improvement in cache scalability in
> the number of open files, and preparation for further improvements.
>
> There are three categories of patches in this series:
>
> 1. Add observability of cache operation so we can see what we're
> doing as changes are made to the code.
>
> 2. Improve the scalability of filecache garbage collection,
> addressing several bugs along the way.
>
> 3. Improve the scalability of the filecache hash table by converting
> it to use rhashtable.
>
> The series as it stands survives typical test workloads. Running
> stress-tests like generic/531 is the next step.
>
> These patches are also available in the linux-nfs-bugzilla-386
> branch of
>
>   https://git.kernel.org/pub/scm/linux/kernel/git/cel/linux.git
>
> ---
>
> Chuck Lever (30):
>       NFSD: Report filecache LRU size
>       NFSD: Report count of calls to nfsd_file_acquire()
>       NFSD: Report count of freed filecache items
>       NFSD: Report average age of filecache items
>       NFSD: Add nfsd_file_lru_dispose_list() helper
>       NFSD: Refactor nfsd_file_gc()
>       NFSD: Refactor nfsd_file_lru_scan()
>       NFSD: Report the number of items evicted by the LRU walk
>       NFSD: Record number of flush calls
>       NFSD: Report filecache item construction failures
>       NFSD: Zero counters when the filecache is re-initialized
>       NFSD: Hook up the filecache stat file
>       NFSD: WARN when freeing an item still linked via nf_lru
>       NFSD: Trace filecache LRU activity
>       NFSD: Leave open files out of the filecache LRU
>       NFSD: Fix the filecache LRU shrinker
>       NFSD: Never call nfsd_file_gc() in foreground paths
>       NFSD: No longer record nf_hashval in the trace log
>       NFSD: Remove lockdep assertion from unhash_and_release_locked()
>       NFSD: nfsd_file_unhash can compute hashval from nf->nf_inode
>       NFSD: Refactor __nfsd_file_close_inode()
>       NFSD: nfsd_file_hash_remove can compute hashval
>       NFSD: Remove nfsd_file::nf_hashval
>       NFSD: Remove stale comment from nfsd_file_acquire()
>       NFSD: Clean up "open file" case in nfsd_file_acquire()
>       NFSD: Document nfsd_file_cache_purge() API contract
>       NFSD: Replace the "init once" mechanism
>       NFSD: Set up an rhashtable for the filecache
>       NFSD: Convert the filecache to use rhashtable
>       NFSD: Clean up unusued code after rhashtable conversion
>
>
>  fs/nfsd/filecache.c | 677 +++++++++++++++++++++++++++-----------------
>  fs/nfsd/filecache.h |   6 +-
>  fs/nfsd/nfsctl.c    |  10 +
>  fs/nfsd/trace.h     | 117 ++++++--
>  4 files changed, 522 insertions(+), 288 deletions(-)
>
> --
> Chuck Lever
>

Yep, looks good so far, thanks for doing this. Somewhat similar to my (buggy)
attempt at fixing it that I sent at the time (don't put open files on
the LRU, and
use rhashtable), but cleaner and, presumably, less buggy :)

Can't test it right now, but it seems like Wang already confirmed that it works.

- Frank
Chuck Lever June 24, 2022, 3:30 p.m. UTC | #10
> On Jun 23, 2022, at 1:51 PM, Wang Yugui <wangyugui@e16-tech.com> wrote:
> 
> Hi,
> 
>>> On Jun 23, 2022, at 5:02 AM, Wang Yugui <wangyugui@e16-tech.com> wrote:
>>> 
>>> Hi,
>>> 
>>>>> On Jun 22, 2022, at 3:04 PM, Chuck Lever III <chuck.lever@oracle.com> wrote:
>>>>>> On Jun 22, 2022, at 2:36 PM, Wang Yugui <wangyugui@e16-tech.com> wrote:
>>>>>> 
>>>>>> Hi,
>>>>>> 
>>>>>> fstests generic/531 triggered a panic on kernel 5.19.0-rc3 with this
>>>>>> patchset.
>>>>> 
>>>>> As I mention in the cover letter, I haven't tried running generic/531
>>>>> yet -- no claim at all that this is finished work and that #386 has
>>>>> been fixed at this point. I'm merely interested in comments on the
>>>>> general approach.
>>>>> 
>>>>> 
>>>>>> [ 405.478056] BUG: kernel NULL pointer dereference, address: 0000000000000049
>>>>> 
>>>>> The "RIP: " tells the location of the crash. Notice that the call
>>>>> trace here does not include that information. From your attachment:
>>>>> 
>>>>> [ 405.518022] RIP: 0010:nfsd_do_file_acquire+0x4e1/0xb80 [nfsd]
>>>>> 
>>>>> To match that to a line of source code:
>>>>> 
>>>>> [cel@manet ~]$ cd src/linux/linux/
>>>>> [cel@manet linux]$ scripts/faddr2line ../obj/manet/fs/nfsd/filecache.o nfsd_do_file_acquire+0x4e1
>>>>> nfsd_do_file_acquire+0x4e1/0xfc0:
>>>>> rht_bucket_insert at /home/cel/src/linux/linux/include/linux/rhashtable.h:303
>>>>> (inlined by) __rhashtable_insert_fast at /home/cel/src/linux/linux/include/linux/rhashtable.h:718
>>>>> (inlined by) rhashtable_lookup_get_insert_key at /home/cel/src/linux/linux/include/linux/rhashtable.h:982
>>>>> (inlined by) nfsd_file_insert at /home/cel/src/linux/linux/fs/nfsd/filecache.c:1031
>>>>> (inlined by) nfsd_do_file_acquire at /home/cel/src/linux/linux/fs/nfsd/filecache.c:1089
>>>>> [cel@manet linux]$
>>>>> 
>>>>> This is an example, I'm sure my compiled objects don't match yours.
>>>>> 
>>>>> And, now that I've added observability, you should be able to do:
>>>>> 
>>>>> # watch cat /proc/fs/nfsd/filecache
>>>>> 
>>>>> to see how many items are in the hash and LRU list while the test
>>>>> is running.
>>>>> 
>>>>> 
>>>>>> [ 405.608016] Call Trace:
>>>>>> [ 405.608016] <TASK>
>>>>>> [ 405.613020] nfs4_get_vfs_file+0x325/0x410 [nfsd]
>>>>>> [ 405.618018] nfsd4_process_open2+0x4ba/0x16d0 [nfsd]
>>>>>> [ 405.623016] ? inode_get_bytes+0x38/0x40
>>>>>> [ 405.623016] ? nfsd_permission+0x97/0xf0 [nfsd]
>>>>>> [ 405.628022] ? fh_verify+0x1cc/0x6f0 [nfsd]
>>>>>> [ 405.633025] nfsd4_open+0x640/0xb30 [nfsd]
>>>>>> [ 405.638025] nfsd4_proc_compound+0x3bd/0x710 [nfsd]
>>>>>> [ 405.643017] nfsd_dispatch+0x143/0x270 [nfsd]
>>>>>> [ 405.648019] svc_process_common+0x3bf/0x5b0 [sunrpc]
>>>> 
>>>> I was able to trigger something that looks very much like this crash.
>>>> If you remove this line from fs/nfsd/filecache.c:
>>>> 
>>>> 	.max_size		= 131072, /* buckets */
>>>> 
>>>> things get a lot more stable for generic/531.
>>>> 
>>>> I'm looking into the issue now.
>>> 
>>> Yes. When '.max_size = 131072' is removed, fstests generic/531 passed.
>> 
>> Great! Are you comfortable with this general approach for bug #386?
> 
> It seems a good result for #386.
> 
> fstests generic/531(file-max: 1M) performance result:
> base(5.19.0-rc3, 12 bits hash, serialized nfsd_file_gc): 222s
> this patchset(.min_size=4096): 59s
> so, a good improvement for #386.
> 
> It seems a good(acceptable) result for #387 too.
> the period of 'text busy(exec directly from the back-end of nfs-server)'
> is about 4s.

I was surprised to learn that NFSv4 doesn't close those files outright,
that they remain in the filecache for a bit. I think we can do better
there, but I haven't looked closely at that yet.

I see that files are left open on the server by crashed NFSv4 clients too.
That will result in the server becoming unusable after significant
uptime, which I've seen on occasion (because I crash my clients a lot)
but never looked into until now. 


--
Chuck Lever
Chuck Lever June 28, 2022, 5:57 p.m. UTC | #11
> On Jun 23, 2022, at 4:27 PM, Frank van der Linden <fvdl@google.com> wrote:
> 
> On Wed, Jun 22, 2022 at 7:12 AM Chuck Lever <chuck.lever@oracle.com> wrote:
>> 
>> This series overhauls the NFSD filecache, a cache of server-side
>> "struct file" objects recently used by NFS clients. The purposes of
>> this overhaul are an immediate improvement in cache scalability in
>> the number of open files, and preparation for further improvements.
>> 
>> There are three categories of patches in this series:
>> 
>> 1. Add observability of cache operation so we can see what we're
>> doing as changes are made to the code.
>> 
>> 2. Improve the scalability of filecache garbage collection,
>> addressing several bugs along the way.
>> 
>> 3. Improve the scalability of the filecache hash table by converting
>> it to use rhashtable.
>> 
>> The series as it stands survives typical test workloads. Running
>> stress-tests like generic/531 is the next step.
>> 
>> These patches are also available in the linux-nfs-bugzilla-386
>> branch of
>> 
>>  https://git.kernel.org/pub/scm/linux/kernel/git/cel/linux.git
>> 
>> ---
>> 
>> Chuck Lever (30):
>>      NFSD: Report filecache LRU size
>>      NFSD: Report count of calls to nfsd_file_acquire()
>>      NFSD: Report count of freed filecache items
>>      NFSD: Report average age of filecache items
>>      NFSD: Add nfsd_file_lru_dispose_list() helper
>>      NFSD: Refactor nfsd_file_gc()
>>      NFSD: Refactor nfsd_file_lru_scan()
>>      NFSD: Report the number of items evicted by the LRU walk
>>      NFSD: Record number of flush calls
>>      NFSD: Report filecache item construction failures
>>      NFSD: Zero counters when the filecache is re-initialized
>>      NFSD: Hook up the filecache stat file
>>      NFSD: WARN when freeing an item still linked via nf_lru
>>      NFSD: Trace filecache LRU activity
>>      NFSD: Leave open files out of the filecache LRU
>>      NFSD: Fix the filecache LRU shrinker
>>      NFSD: Never call nfsd_file_gc() in foreground paths
>>      NFSD: No longer record nf_hashval in the trace log
>>      NFSD: Remove lockdep assertion from unhash_and_release_locked()
>>      NFSD: nfsd_file_unhash can compute hashval from nf->nf_inode
>>      NFSD: Refactor __nfsd_file_close_inode()
>>      NFSD: nfsd_file_hash_remove can compute hashval
>>      NFSD: Remove nfsd_file::nf_hashval
>>      NFSD: Remove stale comment from nfsd_file_acquire()
>>      NFSD: Clean up "open file" case in nfsd_file_acquire()
>>      NFSD: Document nfsd_file_cache_purge() API contract
>>      NFSD: Replace the "init once" mechanism
>>      NFSD: Set up an rhashtable for the filecache
>>      NFSD: Convert the filecache to use rhashtable
>>      NFSD: Clean up unusued code after rhashtable conversion
>> 
>> 
>> fs/nfsd/filecache.c | 677 +++++++++++++++++++++++++++-----------------
>> fs/nfsd/filecache.h |   6 +-
>> fs/nfsd/nfsctl.c    |  10 +
>> fs/nfsd/trace.h     | 117 ++++++--
>> 4 files changed, 522 insertions(+), 288 deletions(-)
>> 
>> --
>> Chuck Lever
>> 
> 
> Yep, looks good so far, thanks for doing this. Somewhat similar to my (buggy)
> attempt at fixing it that I sent at the time (don't put open files on
> the LRU, and
> use rhashtable), but cleaner and, presumably, less buggy :)
> 
> Can't test it right now, but it seems like Wang already confirmed that it works.

Frank, thanks to you and Wang for reporting this issue, and sorry for
taking so long to address it.


--
Chuck Lever