btrfs: use kvcalloc in btrfs_get_dev_zone_info

Message ID 20221120124303.17918-1-hch@lst.de (mailing list archive)
State New, archived
Series btrfs: use kvcalloc in btrfs_get_dev_zone_info

Commit Message

Christoph Hellwig Nov. 20, 2022, 12:43 p.m. UTC
Otherwise the kernel memory allocator seems to be unhappy about failing
order 6 allocations for the zones array, that cause 100% reproducible
mount failures in my qemu setup:

[   26.078981] mount: page allocation failure: order:6, mode:0x40dc0(GFP_KERNEL|__GFP_COMP|__GFP_ZERO), nodemask=(null)
[   26.079741] CPU: 0 PID: 2965 Comm: mount Not tainted 6.1.0-rc5+ #185
[   26.080181] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.14.0-0-g155821a1990b-prebuilt.qemu.org 04/01/2014
[   26.080950] Call Trace:
[   26.081132]  <TASK>
[   26.081291]  dump_stack_lvl+0x56/0x6f
[   26.081554]  warn_alloc+0x117/0x140
[   26.081808]  ? __alloc_pages_direct_compact+0x1b5/0x300
[   26.082174]  __alloc_pages_slowpath.constprop.0+0xd0e/0xde0
[   26.082569]  __alloc_pages+0x32a/0x340
[   26.082836]  __kmalloc_large_node+0x4d/0xa0
[   26.083133]  ? trace_kmalloc+0x29/0xd0
[   26.083399]  kmalloc_large+0x14/0x60
[   26.083654]  btrfs_get_dev_zone_info+0x1b9/0xc00
[   26.083980]  ? _raw_spin_unlock_irqrestore+0x28/0x50
[   26.084328]  btrfs_get_dev_zone_info_all_devices+0x54/0x80
[   26.084708]  open_ctree+0xed4/0x1654
[   26.084974]  btrfs_mount_root.cold+0x12/0xde
[   26.085288]  ? lock_is_held_type+0xe2/0x140
[   26.085603]  legacy_get_tree+0x28/0x50
[   26.085876]  vfs_get_tree+0x1d/0xb0
[   26.086139]  vfs_kern_mount.part.0+0x6c/0xb0
[   26.086456]  btrfs_mount+0x118/0x3a0
[   26.086728]  ? lock_is_held_type+0xe2/0x140
[   26.087043]  legacy_get_tree+0x28/0x50
[   26.087323]  vfs_get_tree+0x1d/0xb0
[   26.087587]  path_mount+0x2ba/0xbe0
[   26.087850]  ? _raw_spin_unlock_irqrestore+0x38/0x50
[   26.088217]  __x64_sys_mount+0xfe/0x140
[   26.088506]  do_syscall_64+0x35/0x80
[   26.088776]  entry_SYSCALL_64_after_hwframe+0x63/0xcd

Signed-off-by: Christoph Hellwig <hch@lst.de>
---
 fs/btrfs/zoned.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

Comments

Damien Le Moal Nov. 21, 2022, 2:31 a.m. UTC | #1
On 11/20/22 21:43, Christoph Hellwig wrote:
> Otherwise the kernel memory allocator seems to be unhappy about failing
> order 6 allocations for the zones array, that cause 100% reproducible
> mount failures in my qemu setup:
> 
> [   26.078981] mount: page allocation failure: order:6, mode:0x40dc0(GFP_KERNEL|__GFP_COMP|__GFP_ZERO), nodemask=(null)
> [   26.079741] CPU: 0 PID: 2965 Comm: mount Not tainted 6.1.0-rc5+ #185
> [   26.080181] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.14.0-0-g155821a1990b-prebuilt.qemu.org 04/01/2014
> [   26.080950] Call Trace:
> [   26.081132]  <TASK>
> [   26.081291]  dump_stack_lvl+0x56/0x6f
> [   26.081554]  warn_alloc+0x117/0x140
> [   26.081808]  ? __alloc_pages_direct_compact+0x1b5/0x300
> [   26.082174]  __alloc_pages_slowpath.constprop.0+0xd0e/0xde0
> [   26.082569]  __alloc_pages+0x32a/0x340
> [   26.082836]  __kmalloc_large_node+0x4d/0xa0
> [   26.083133]  ? trace_kmalloc+0x29/0xd0
> [   26.083399]  kmalloc_large+0x14/0x60
> [   26.083654]  btrfs_get_dev_zone_info+0x1b9/0xc00
> [   26.083980]  ? _raw_spin_unlock_irqrestore+0x28/0x50
> [   26.084328]  btrfs_get_dev_zone_info_all_devices+0x54/0x80
> [   26.084708]  open_ctree+0xed4/0x1654
> [   26.084974]  btrfs_mount_root.cold+0x12/0xde
> [   26.085288]  ? lock_is_held_type+0xe2/0x140
> [   26.085603]  legacy_get_tree+0x28/0x50
> [   26.085876]  vfs_get_tree+0x1d/0xb0
> [   26.086139]  vfs_kern_mount.part.0+0x6c/0xb0
> [   26.086456]  btrfs_mount+0x118/0x3a0
> [   26.086728]  ? lock_is_held_type+0xe2/0x140
> [   26.087043]  legacy_get_tree+0x28/0x50
> [   26.087323]  vfs_get_tree+0x1d/0xb0
> [   26.087587]  path_mount+0x2ba/0xbe0
> [   26.087850]  ? _raw_spin_unlock_irqrestore+0x38/0x50
> [   26.088217]  __x64_sys_mount+0xfe/0x140
> [   26.088506]  do_syscall_64+0x35/0x80
> [   26.088776]  entry_SYSCALL_64_after_hwframe+0x63/0xcd
> 
> Signed-off-by: Christoph Hellwig <hch@lst.de>

Looks good.
This likely needs a fixes tag though.

Reviewed-by: Damien Le Moal <damien.lemoal@opensource.wdc.com>

> ---
>  fs/btrfs/zoned.c | 6 +++---
>  1 file changed, 3 insertions(+), 3 deletions(-)
> 
> diff --git a/fs/btrfs/zoned.c b/fs/btrfs/zoned.c
> index 2218b33dac568..a759668477bb2 100644
> --- a/fs/btrfs/zoned.c
> +++ b/fs/btrfs/zoned.c
> @@ -468,7 +468,7 @@ int btrfs_get_dev_zone_info(struct btrfs_device *device, bool populate_cache)
>  		goto out;
>  	}
>  
> -	zones = kcalloc(BTRFS_REPORT_NR_ZONES, sizeof(struct blk_zone), GFP_KERNEL);
> +	zones = kvcalloc(BTRFS_REPORT_NR_ZONES, sizeof(struct blk_zone), GFP_KERNEL);
>  	if (!zones) {
>  		ret = -ENOMEM;
>  		goto out;
> @@ -587,7 +587,7 @@ int btrfs_get_dev_zone_info(struct btrfs_device *device, bool populate_cache)
>  	}
>  
>  
> -	kfree(zones);
> +	kvfree(zones);
>  
>  	switch (bdev_zoned_model(bdev)) {
>  	case BLK_ZONED_HM:
> @@ -619,7 +619,7 @@ int btrfs_get_dev_zone_info(struct btrfs_device *device, bool populate_cache)
>  	return 0;
>  
>  out:
> -	kfree(zones);
> +	kvfree(zones);
>  out_free_zone_info:
>  	btrfs_destroy_dev_zone_info(device);
>

Christoph Hellwig Nov. 21, 2022, 7:05 a.m. UTC | #2
On Mon, Nov 21, 2022 at 11:31:01AM +0900, Damien Le Moal wrote:
> Looks good.
> This likely needs a fixes tag though.

git-blame seems to suggest the code has been like this since the
addition of the zone code.  Which is a bit odd as I've not seen
the issue before last week.

Johannes Thumshirn Nov. 21, 2022, 7:45 a.m. UTC | #3
Looks good,
Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>

Damien Le Moal Nov. 21, 2022, 7:46 a.m. UTC | #4
On 11/21/22 16:05, Christoph Hellwig wrote:
> On Mon, Nov 21, 2022 at 11:31:01AM +0900, Damien Le Moal wrote:
>> Looks good.
>> This likely needs a fixes tag though.
> 
> git-blame seems to suggest the code has been like this since the
> addition of the zone code.  Which is a bit odd as I've not seen
> the issue before last week.

BTRFS_REPORT_NR_ZONES is 4096, so the allocation is for 256KiB (4096 x
64B). Not that small, but the mm code can likely handle that most of the
time. I guess we were all lucky until now :)

So maybe simply add:

Fixes: 5b316468983d ("btrfs: get zone information of zoned block devices")

No ?

David Sterba Nov. 21, 2022, 5:42 p.m. UTC | #5
On Sun, Nov 20, 2022 at 01:43:03PM +0100, Christoph Hellwig wrote:
> Otherwise the kernel memory allocator seems to be unhappy about failing
> order 6 allocations for the zones array, that cause 100% reproducible
> mount failures in my qemu setup:
> 
> [   26.078981] mount: page allocation failure: order:6, mode:0x40dc0(GFP_KERNEL|__GFP_COMP|__GFP_ZERO), nodemask=(null)
> [   26.079741] CPU: 0 PID: 2965 Comm: mount Not tainted 6.1.0-rc5+ #185
> [   26.080181] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.14.0-0-g155821a1990b-prebuilt.qemu.org 04/01/2014
> [   26.080950] Call Trace:
> [   26.081132]  <TASK>
> [   26.081291]  dump_stack_lvl+0x56/0x6f
> [   26.081554]  warn_alloc+0x117/0x140
> [   26.081808]  ? __alloc_pages_direct_compact+0x1b5/0x300
> [   26.082174]  __alloc_pages_slowpath.constprop.0+0xd0e/0xde0
> [   26.082569]  __alloc_pages+0x32a/0x340
> [   26.082836]  __kmalloc_large_node+0x4d/0xa0
> [   26.083133]  ? trace_kmalloc+0x29/0xd0
> [   26.083399]  kmalloc_large+0x14/0x60
> [   26.083654]  btrfs_get_dev_zone_info+0x1b9/0xc00
> [   26.083980]  ? _raw_spin_unlock_irqrestore+0x28/0x50
> [   26.084328]  btrfs_get_dev_zone_info_all_devices+0x54/0x80
> [   26.084708]  open_ctree+0xed4/0x1654
> [   26.084974]  btrfs_mount_root.cold+0x12/0xde
> [   26.085288]  ? lock_is_held_type+0xe2/0x140
> [   26.085603]  legacy_get_tree+0x28/0x50
> [   26.085876]  vfs_get_tree+0x1d/0xb0
> [   26.086139]  vfs_kern_mount.part.0+0x6c/0xb0
> [   26.086456]  btrfs_mount+0x118/0x3a0
> [   26.086728]  ? lock_is_held_type+0xe2/0x140
> [   26.087043]  legacy_get_tree+0x28/0x50
> [   26.087323]  vfs_get_tree+0x1d/0xb0
> [   26.087587]  path_mount+0x2ba/0xbe0
> [   26.087850]  ? _raw_spin_unlock_irqrestore+0x38/0x50
> [   26.088217]  __x64_sys_mount+0xfe/0x140
> [   26.088506]  do_syscall_64+0x35/0x80
> [   26.088776]  entry_SYSCALL_64_after_hwframe+0x63/0xcd
> 
> Signed-off-by: Christoph Hellwig <hch@lst.de>

Added to misc-next, thanks.

Patch

diff --git a/fs/btrfs/zoned.c b/fs/btrfs/zoned.c
index 2218b33dac568..a759668477bb2 100644
--- a/fs/btrfs/zoned.c
+++ b/fs/btrfs/zoned.c
@@ -468,7 +468,7 @@  int btrfs_get_dev_zone_info(struct btrfs_device *device, bool populate_cache)
 		goto out;
 	}
 
-	zones = kcalloc(BTRFS_REPORT_NR_ZONES, sizeof(struct blk_zone), GFP_KERNEL);
+	zones = kvcalloc(BTRFS_REPORT_NR_ZONES, sizeof(struct blk_zone), GFP_KERNEL);
 	if (!zones) {
 		ret = -ENOMEM;
 		goto out;
@@ -587,7 +587,7 @@  int btrfs_get_dev_zone_info(struct btrfs_device *device, bool populate_cache)
 	}
 
 
-	kfree(zones);
+	kvfree(zones);
 
 	switch (bdev_zoned_model(bdev)) {
 	case BLK_ZONED_HM:
@@ -619,7 +619,7 @@  int btrfs_get_dev_zone_info(struct btrfs_device *device, bool populate_cache)
 	return 0;
 
 out:
-	kfree(zones);
+	kvfree(zones);
 out_free_zone_info:
 	btrfs_destroy_dev_zone_info(device);