Message ID | 20220224161705.1041788-1-amir73il@gmail.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | [v2] nfsd: more robust allocation failure handling in nfsd_file_cache_init | expand |
On Thu, 2022-02-24 at 18:17 +0200, Amir Goldstein wrote: > The nfsd file cache table can be pretty large and its allocation > may require as many as 80 contigious pages. > > Employ the same fix that was employed for similar issue that was > reported for the reply cache hash table allocation several years ago > by commit 8f97514b423a ("nfsd: more robust allocation failure handling > in nfsd_reply_cache_init"). > > Fixes: 65294c1f2c5e ("nfsd: add a new struct file caching facility to nfsd") > Link: https://lore.kernel.org/linux-nfs/e3cdaeec85a6cfec980e87fc294327c0381c1778.camel@kernel.org/ > Suggested-by: Jeff Layton <jlayton@kernel.org> > Signed-off-by: Amir Goldstein <amir73il@gmail.com> > --- > > Since v1: > - Use kvcalloc() > - Use kvfree() > > fs/nfsd/filecache.c | 6 +++--- > 1 file changed, 3 insertions(+), 3 deletions(-) > > diff --git a/fs/nfsd/filecache.c b/fs/nfsd/filecache.c > index 8bc807c5fea4..cc2831cec669 100644 > --- a/fs/nfsd/filecache.c > +++ b/fs/nfsd/filecache.c > @@ -632,7 +632,7 @@ nfsd_file_cache_init(void) > if (!nfsd_filecache_wq) > goto out; > > - nfsd_file_hashtbl = kcalloc(NFSD_FILE_HASH_SIZE, > + nfsd_file_hashtbl = kvcalloc(NFSD_FILE_HASH_SIZE, > sizeof(*nfsd_file_hashtbl), GFP_KERNEL); > if (!nfsd_file_hashtbl) { > pr_err("nfsd: unable to allocate nfsd_file_hashtbl\n"); > @@ -700,7 +700,7 @@ nfsd_file_cache_init(void) > nfsd_file_slab = NULL; > kmem_cache_destroy(nfsd_file_mark_slab); > nfsd_file_mark_slab = NULL; > - kfree(nfsd_file_hashtbl); > + kvfree(nfsd_file_hashtbl); > nfsd_file_hashtbl = NULL; > destroy_workqueue(nfsd_filecache_wq); > nfsd_filecache_wq = NULL; > @@ -811,7 +811,7 @@ nfsd_file_cache_shutdown(void) > fsnotify_wait_marks_destroyed(); > kmem_cache_destroy(nfsd_file_mark_slab); > nfsd_file_mark_slab = NULL; > - kfree(nfsd_file_hashtbl); > + kvfree(nfsd_file_hashtbl); > nfsd_file_hashtbl = NULL; > destroy_workqueue(nfsd_filecache_wq); > nfsd_filecache_wq = NULL; Reviewed-by: Jeff Layton <jlayton@kernel.org>
Hi Amir- > On Feb 24, 2022, at 11:17 AM, Amir Goldstein <amir73il@gmail.com> wrote: > > The nfsd file cache table can be pretty large and its allocation > may require as many as 80 contigious pages. > > Employ the same fix that was employed for similar issue that was > reported for the reply cache hash table allocation several years ago > by commit 8f97514b423a ("nfsd: more robust allocation failure handling > in nfsd_reply_cache_init"). > > Fixes: 65294c1f2c5e ("nfsd: add a new struct file caching facility to nfsd") > Link: https://lore.kernel.org/linux-nfs/e3cdaeec85a6cfec980e87fc294327c0381c1778.camel@kernel.org/ > Suggested-by: Jeff Layton <jlayton@kernel.org> > Signed-off-by: Amir Goldstein <amir73il@gmail.com> > --- > > Since v1: > - Use kvcalloc() > - Use kvfree() > > fs/nfsd/filecache.c | 6 +++--- > 1 file changed, 3 insertions(+), 3 deletions(-) v2 passes some simple testing, so I've applied it to NFSD for-next. It should get 0-day and merge testing and is available for others to try out. I don't have anything that exercises low memory scenarios, though. Do you have anything like this to try? > diff --git a/fs/nfsd/filecache.c b/fs/nfsd/filecache.c > index 8bc807c5fea4..cc2831cec669 100644 > --- a/fs/nfsd/filecache.c > +++ b/fs/nfsd/filecache.c > @@ -632,7 +632,7 @@ nfsd_file_cache_init(void) > if (!nfsd_filecache_wq) > goto out; > > - nfsd_file_hashtbl = kcalloc(NFSD_FILE_HASH_SIZE, > + nfsd_file_hashtbl = kvcalloc(NFSD_FILE_HASH_SIZE, > sizeof(*nfsd_file_hashtbl), GFP_KERNEL); > if (!nfsd_file_hashtbl) { > pr_err("nfsd: unable to allocate nfsd_file_hashtbl\n"); > @@ -700,7 +700,7 @@ nfsd_file_cache_init(void) > nfsd_file_slab = NULL; > kmem_cache_destroy(nfsd_file_mark_slab); > nfsd_file_mark_slab = NULL; > - kfree(nfsd_file_hashtbl); > + kvfree(nfsd_file_hashtbl); > nfsd_file_hashtbl = NULL; > destroy_workqueue(nfsd_filecache_wq); > nfsd_filecache_wq = NULL; > @@ -811,7 +811,7 @@ nfsd_file_cache_shutdown(void) > fsnotify_wait_marks_destroyed(); > kmem_cache_destroy(nfsd_file_mark_slab); > nfsd_file_mark_slab = NULL; > - kfree(nfsd_file_hashtbl); > + kvfree(nfsd_file_hashtbl); > nfsd_file_hashtbl = NULL; > destroy_workqueue(nfsd_filecache_wq); > nfsd_filecache_wq = NULL; > -- > 2.25.1 > -- Chuck Lever
On Thu, Feb 24, 2022 at 10:41 PM Chuck Lever III <chuck.lever@oracle.com> wrote: > > Hi Amir- > > > On Feb 24, 2022, at 11:17 AM, Amir Goldstein <amir73il@gmail.com> wrote: > > > > The nfsd file cache table can be pretty large and its allocation > > may require as many as 80 contigious pages. > > > > Employ the same fix that was employed for similar issue that was > > reported for the reply cache hash table allocation several years ago > > by commit 8f97514b423a ("nfsd: more robust allocation failure handling > > in nfsd_reply_cache_init"). > > > > Fixes: 65294c1f2c5e ("nfsd: add a new struct file caching facility to nfsd") > > Link: https://lore.kernel.org/linux-nfs/e3cdaeec85a6cfec980e87fc294327c0381c1778.camel@kernel.org/ > > Suggested-by: Jeff Layton <jlayton@kernel.org> > > Signed-off-by: Amir Goldstein <amir73il@gmail.com> > > --- > > > > Since v1: > > - Use kvcalloc() > > - Use kvfree() > > > > fs/nfsd/filecache.c | 6 +++--- > > 1 file changed, 3 insertions(+), 3 deletions(-) > > v2 passes some simple testing, so I've applied it to NFSD for-next. > It should get 0-day and merge testing and is available for others > to try out. > > I don't have anything that exercises low memory scenarios, though. > Do you have anything like this to try? Well, it is not low memory really it's fragmented memory. I would try setting: CONFIG_FAIL_PAGE_ALLOC=y echo 5 > /sys/kernel/debug/fail_page_alloc/min-order echo 100 > /sys/kernel/debug/fail_page_alloc/probability and starting (or restarting) nfsd. hoping that other large page allocations won't get in the way. I gave it a shot, but couldn't figure out why nfsd4_files slab is still there after stopping nfs-server service, meaning that nfsd_file_cache_shutdown() was not called - I must be missing something. I may play with this some more tomorrow. Thanks, Amir.
On Fri, 25 Feb 2022, Amir Goldstein wrote: > The nfsd file cache table can be pretty large and its allocation > may require as many as 80 contigious pages. > > Employ the same fix that was employed for similar issue that was > reported for the reply cache hash table allocation several years ago > by commit 8f97514b423a ("nfsd: more robust allocation failure handling > in nfsd_reply_cache_init"). > > Fixes: 65294c1f2c5e ("nfsd: add a new struct file caching facility to nfsd") > Link: https://lore.kernel.org/linux-nfs/e3cdaeec85a6cfec980e87fc294327c0381c1778.camel@kernel.org/ > Suggested-by: Jeff Layton <jlayton@kernel.org> > Signed-off-by: Amir Goldstein <amir73il@gmail.com> > --- > > Since v1: > - Use kvcalloc() > - Use kvfree() I think this is a good improvement, but it would be really nice to replace this bespoke hash table with an rhashtable. They we wouldn't need to worry about these trivial details. NeilBrown
> On Feb 24, 2022, at 4:49 PM, NeilBrown <neilb@suse.de> wrote: > > On Fri, 25 Feb 2022, Amir Goldstein wrote: >> The nfsd file cache table can be pretty large and its allocation >> may require as many as 80 contigious pages. >> >> Employ the same fix that was employed for similar issue that was >> reported for the reply cache hash table allocation several years ago >> by commit 8f97514b423a ("nfsd: more robust allocation failure handling >> in nfsd_reply_cache_init"). >> >> Fixes: 65294c1f2c5e ("nfsd: add a new struct file caching facility to nfsd") >> Link: https://lore.kernel.org/linux-nfs/e3cdaeec85a6cfec980e87fc294327c0381c1778.camel@kernel.org/ >> Suggested-by: Jeff Layton <jlayton@kernel.org> >> Signed-off-by: Amir Goldstein <amir73il@gmail.com> >> --- >> >> Since v1: >> - Use kvcalloc() >> - Use kvfree() > > I think this is a good improvement, but it would be really nice to > replace this bespoke hash table with an rhashtable. They we wouldn't > need to worry about these trivial details. I agree -- I didn't want to saddle Jeff or Amir with pulling on that chain. But I'm willing to review patches that attempt that kind of replacement (same for the DRC hash table). -- Chuck Lever
On Thu, Feb 24, 2022 at 11:39 PM Amir Goldstein <amir73il@gmail.com> wrote: > > On Thu, Feb 24, 2022 at 10:41 PM Chuck Lever III <chuck.lever@oracle.com> wrote: > > > > Hi Amir- > > > > > On Feb 24, 2022, at 11:17 AM, Amir Goldstein <amir73il@gmail.com> wrote: > > > > > > The nfsd file cache table can be pretty large and its allocation > > > may require as many as 80 contigious pages. > > > > > > Employ the same fix that was employed for similar issue that was > > > reported for the reply cache hash table allocation several years ago > > > by commit 8f97514b423a ("nfsd: more robust allocation failure handling > > > in nfsd_reply_cache_init"). > > > > > > Fixes: 65294c1f2c5e ("nfsd: add a new struct file caching facility to nfsd") > > > Link: https://lore.kernel.org/linux-nfs/e3cdaeec85a6cfec980e87fc294327c0381c1778.camel@kernel.org/ > > > Suggested-by: Jeff Layton <jlayton@kernel.org> > > > Signed-off-by: Amir Goldstein <amir73il@gmail.com> > > > --- > > > > > > Since v1: > > > - Use kvcalloc() > > > - Use kvfree() > > > > > > fs/nfsd/filecache.c | 6 +++--- > > > 1 file changed, 3 insertions(+), 3 deletions(-) > > > > v2 passes some simple testing, so I've applied it to NFSD for-next. > > It should get 0-day and merge testing and is available for others > > to try out. > > > > I don't have anything that exercises low memory scenarios, though. > > Do you have anything like this to try? > > Well, it is not low memory really it's fragmented memory. > I would try setting: > > CONFIG_FAIL_PAGE_ALLOC=y > > echo 5 > /sys/kernel/debug/fail_page_alloc/min-order > echo 100 > /sys/kernel/debug/fail_page_alloc/probability > > and starting (or restarting) nfsd. > hoping that other large page allocations won't get in the way. > > I gave it a shot, but couldn't figure out why nfsd4_files slab > is still there after stopping nfs-server service, meaning that > nfsd_file_cache_shutdown() was not called - I must be missing > something. I may play with this some more tomorrow. > Ok, I was missing some parameters. This configuration reproduces and failure and verified that the kvcalloc() fix solves the issue: $ systemctl stop nfs-server $ echo 5 > /sys/kernel/debug/fail_page_alloc/min-order $ echo 100 > /sys/kernel/debug/fail_page_alloc/probability $ echo 1 > /sys/kernel/debug/fail_page_alloc/times $ echo N > /sys/kernel/debug/fail_page_alloc/ignore-gfp-wait $ systemctl start nfs-server [ 24.410560] FAULT_INJECTION: forcing a failure. [ 24.410560] name fail_page_alloc, interval 1, probability 100, space 0, times 1 [ 24.413887] CPU: 1 PID: 1218 Comm: rpc.nfsd Not tainted 5.17.0-rc2-xfstests #5927 [ 24.415625] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.13.0-1ubuntu1.1 04/01/2014 [ 24.417098] Call Trace: [ 24.417098] <TASK> [ 24.417098] dump_stack_lvl+0x45/0x59 [ 24.418999] should_fail+0x11a/0x13d [ 24.418999] prepare_alloc_pages.isra.0+0x97/0xc5 [ 24.418999] __alloc_pages+0x76/0x1c7 [ 24.418999] kmalloc_order+0x35/0xa7 [ 24.418999] kmalloc_order_trace+0x1b/0xf3 [ 24.418999] nfsd_file_cache_init+0x5b/0x2d8 [ 24.418999] nfsd_svc+0xcd/0x2b2 [ 24.427086] write_threads+0x6d/0xb5 [ 24.427086] ? get_int+0x70/0x70 [ 24.429020] nfsctl_transaction_write+0x4f/0x67 [ 24.429020] vfs_write+0xe3/0x14b [ 24.429020] ksys_write+0x7f/0xcb [ 24.429020] do_syscall_64+0x6d/0x80 [ 24.429020] entry_SYSCALL_64_after_hwframe+0x44/0xae [ 24.429020] RIP: 0033:0x7f29d80d6504 [ 24.429020] Code: 00 f7 d8 64 89 02 48 c7 c0 ff ff ff ff eb b3 0f 1f 80 00 00 00 00 48 8d 05 f9 61 0d 00 8b 00 85 c0 75 13 b8 01 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 54 c3 0f 1f 00 41 54 49 89 d4 55 48 89 f5 53 [ 24.439028] RSP: 002b:00007ffe867a47f8 EFLAGS: 00000246 ORIG_RAX: 0000000000000001 [ 24.439028] RAX: ffffffffffffffda RBX: 00007f29d8219560 RCX: 00007f29d80d6504 [ 24.442325] RDX: 0000000000000002 RSI: 00007f29d8219560 RDI: 0000000000000003 [ 24.442325] RBP: 0000000000000003 R08: 0000000000000000 R09: 00007ffe867a4557 [ 24.445644] R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000000 [ 24.445644] R13: 0000000000000008 R14: 0000000000000000 R15: 00007f29d83572a0 [ 24.449026] </TASK> [ 24.450496] nfsd: unable to allocate nfsd_file_hashtbl Job for nfs-server.service canceled. With the fix patch, nfsd starts despite the injected failure. Thanks, Amir.
> On Feb 26, 2022, at 1:37 PM, Amir Goldstein <amir73il@gmail.com> wrote: > > On Thu, Feb 24, 2022 at 11:39 PM Amir Goldstein <amir73il@gmail.com> wrote: >> >> On Thu, Feb 24, 2022 at 10:41 PM Chuck Lever III <chuck.lever@oracle.com> wrote: >>> >>> Hi Amir- >>> >>>> On Feb 24, 2022, at 11:17 AM, Amir Goldstein <amir73il@gmail.com> wrote: >>>> >>>> The nfsd file cache table can be pretty large and its allocation >>>> may require as many as 80 contigious pages. >>>> >>>> Employ the same fix that was employed for similar issue that was >>>> reported for the reply cache hash table allocation several years ago >>>> by commit 8f97514b423a ("nfsd: more robust allocation failure handling >>>> in nfsd_reply_cache_init"). >>>> >>>> Fixes: 65294c1f2c5e ("nfsd: add a new struct file caching facility to nfsd") >>>> Link: https://lore.kernel.org/linux-nfs/e3cdaeec85a6cfec980e87fc294327c0381c1778.camel@kernel.org/ >>>> Suggested-by: Jeff Layton <jlayton@kernel.org> >>>> Signed-off-by: Amir Goldstein <amir73il@gmail.com> >>>> --- >>>> >>>> Since v1: >>>> - Use kvcalloc() >>>> - Use kvfree() >>>> >>>> fs/nfsd/filecache.c | 6 +++--- >>>> 1 file changed, 3 insertions(+), 3 deletions(-) >>> >>> v2 passes some simple testing, so I've applied it to NFSD for-next. >>> It should get 0-day and merge testing and is available for others >>> to try out. >>> >>> I don't have anything that exercises low memory scenarios, though. >>> Do you have anything like this to try? >> >> Well, it is not low memory really it's fragmented memory. >> I would try setting: >> >> CONFIG_FAIL_PAGE_ALLOC=y >> >> echo 5 > /sys/kernel/debug/fail_page_alloc/min-order >> echo 100 > /sys/kernel/debug/fail_page_alloc/probability >> >> and starting (or restarting) nfsd. >> hoping that other large page allocations won't get in the way. >> >> I gave it a shot, but couldn't figure out why nfsd4_files slab >> is still there after stopping nfs-server service, meaning that >> nfsd_file_cache_shutdown() was not called - I must be missing >> something. I may play with this some more tomorrow. >> > > Ok, I was missing some parameters. > This configuration reproduces and failure and verified that the > kvcalloc() fix solves the issue: > > $ systemctl stop nfs-server > $ echo 5 > /sys/kernel/debug/fail_page_alloc/min-order > $ echo 100 > /sys/kernel/debug/fail_page_alloc/probability > $ echo 1 > /sys/kernel/debug/fail_page_alloc/times > $ echo N > /sys/kernel/debug/fail_page_alloc/ignore-gfp-wait > $ systemctl start nfs-server > > [ 24.410560] FAULT_INJECTION: forcing a failure. > [ 24.410560] name fail_page_alloc, interval 1, probability 100, > space 0, times 1 > [ 24.413887] CPU: 1 PID: 1218 Comm: rpc.nfsd Not tainted > 5.17.0-rc2-xfstests #5927 > [ 24.415625] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), > BIOS 1.13.0-1ubuntu1.1 04/01/2014 > [ 24.417098] Call Trace: > [ 24.417098] <TASK> > [ 24.417098] dump_stack_lvl+0x45/0x59 > [ 24.418999] should_fail+0x11a/0x13d > [ 24.418999] prepare_alloc_pages.isra.0+0x97/0xc5 > [ 24.418999] __alloc_pages+0x76/0x1c7 > [ 24.418999] kmalloc_order+0x35/0xa7 > [ 24.418999] kmalloc_order_trace+0x1b/0xf3 > [ 24.418999] nfsd_file_cache_init+0x5b/0x2d8 > [ 24.418999] nfsd_svc+0xcd/0x2b2 > [ 24.427086] write_threads+0x6d/0xb5 > [ 24.427086] ? get_int+0x70/0x70 > [ 24.429020] nfsctl_transaction_write+0x4f/0x67 > [ 24.429020] vfs_write+0xe3/0x14b > [ 24.429020] ksys_write+0x7f/0xcb > [ 24.429020] do_syscall_64+0x6d/0x80 > [ 24.429020] entry_SYSCALL_64_after_hwframe+0x44/0xae > [ 24.429020] RIP: 0033:0x7f29d80d6504 > [ 24.429020] Code: 00 f7 d8 64 89 02 48 c7 c0 ff ff ff ff eb b3 0f > 1f 80 00 00 00 00 48 8d 05 f9 61 0d 00 8b 00 85 c0 75 13 b8 01 00 00 > 00 0f 05 <48> 3d 00 f0 ff ff 77 54 c3 0f 1f 00 41 54 49 89 d4 55 48 89 > f5 53 > [ 24.439028] RSP: 002b:00007ffe867a47f8 EFLAGS: 00000246 ORIG_RAX: > 0000000000000001 > [ 24.439028] RAX: ffffffffffffffda RBX: 00007f29d8219560 RCX: 00007f29d80d6504 > [ 24.442325] RDX: 0000000000000002 RSI: 00007f29d8219560 RDI: 0000000000000003 > [ 24.442325] RBP: 0000000000000003 R08: 0000000000000000 R09: 00007ffe867a4557 > [ 24.445644] R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000000 > [ 24.445644] R13: 0000000000000008 R14: 0000000000000000 R15: 00007f29d83572a0 > [ 24.449026] </TASK> > [ 24.450496] nfsd: unable to allocate nfsd_file_hashtbl > Job for nfs-server.service canceled. > > With the fix patch, nfsd starts despite the injected failure. Thanks. I will add your Tested-by: . -- Chuck Lever
diff --git a/fs/nfsd/filecache.c b/fs/nfsd/filecache.c index 8bc807c5fea4..cc2831cec669 100644 --- a/fs/nfsd/filecache.c +++ b/fs/nfsd/filecache.c @@ -632,7 +632,7 @@ nfsd_file_cache_init(void) if (!nfsd_filecache_wq) goto out; - nfsd_file_hashtbl = kcalloc(NFSD_FILE_HASH_SIZE, + nfsd_file_hashtbl = kvcalloc(NFSD_FILE_HASH_SIZE, sizeof(*nfsd_file_hashtbl), GFP_KERNEL); if (!nfsd_file_hashtbl) { pr_err("nfsd: unable to allocate nfsd_file_hashtbl\n"); @@ -700,7 +700,7 @@ nfsd_file_cache_init(void) nfsd_file_slab = NULL; kmem_cache_destroy(nfsd_file_mark_slab); nfsd_file_mark_slab = NULL; - kfree(nfsd_file_hashtbl); + kvfree(nfsd_file_hashtbl); nfsd_file_hashtbl = NULL; destroy_workqueue(nfsd_filecache_wq); nfsd_filecache_wq = NULL; @@ -811,7 +811,7 @@ nfsd_file_cache_shutdown(void) fsnotify_wait_marks_destroyed(); kmem_cache_destroy(nfsd_file_mark_slab); nfsd_file_mark_slab = NULL; - kfree(nfsd_file_hashtbl); + kvfree(nfsd_file_hashtbl); nfsd_file_hashtbl = NULL; destroy_workqueue(nfsd_filecache_wq); nfsd_filecache_wq = NULL;
The nfsd file cache table can be pretty large and its allocation may require as many as 80 contigious pages. Employ the same fix that was employed for similar issue that was reported for the reply cache hash table allocation several years ago by commit 8f97514b423a ("nfsd: more robust allocation failure handling in nfsd_reply_cache_init"). Fixes: 65294c1f2c5e ("nfsd: add a new struct file caching facility to nfsd") Link: https://lore.kernel.org/linux-nfs/e3cdaeec85a6cfec980e87fc294327c0381c1778.camel@kernel.org/ Suggested-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Amir Goldstein <amir73il@gmail.com> --- Since v1: - Use kvcalloc() - Use kvfree() fs/nfsd/filecache.c | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-)