
btrfs: use kvzalloc to allocate btrfs_fs_info

Message ID 20180216035947.31590-1-jeffm@suse.com (mailing list archive)
State New, archived

Commit Message

Jeff Mahoney Feb. 16, 2018, 3:59 a.m. UTC
From: Jeff Mahoney <jeffm@suse.com>

The srcu_struct in btrfs_fs_info scales in size with NR_CPUS.  On
kernels built with NR_CPUS=8192, this can result in kmalloc failures
that prevent mounting.

There is work in progress to try to resolve this for every user of
srcu_struct but using kvzalloc will work around the failures until
that is complete.

Signed-off-by: Jeff Mahoney <jeffm@suse.com>
---
 fs/btrfs/ctree.h | 2 +-
 fs/btrfs/super.c | 2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

Comments

David Sterba Feb. 19, 2018, 3:33 p.m. UTC | #1
On Thu, Feb 15, 2018 at 10:59:47PM -0500, jeffm@suse.com wrote:
> From: Jeff Mahoney <jeffm@suse.com>
> 
> The srcu_struct in btrfs_fs_info scales in size with NR_CPUS.  On
> kernels built with NR_CPUS=8192, this can result in kmalloc failures
> that prevent mounting.
> 
> There is work in progress to try to resolve this for every user of
> srcu_struct but using kvzalloc will work around the failures until
> that is complete.

Interesting, the subvol_srcu is the worst contributor to the fs_info
size, on a config with NR_CPUS=512:

        struct srcu_struct         subvol_srcu;          /*  1064  3480 */
	...
	/* size: 6496, cachelines: 102, members: 181 */

Using kvzalloc makes sense and is a minimal fix. In the long term I'd
rather allocate subvol_srcu dynamically so the whole fs_info is not
vmalloced (in the rare case when kmalloc would not work). As this is
unpredictable and almost invisible, I'm worried about some random
effects (performance, virtual mappings), so it would be better to avoid
them if possible.

Reviewed-by: David Sterba <dsterba@suse.com>
--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Nikolay Borisov Feb. 19, 2018, 3:43 p.m. UTC | #2
On 19.02.2018 17:33, David Sterba wrote:
> On Thu, Feb 15, 2018 at 10:59:47PM -0500, jeffm@suse.com wrote:
>> From: Jeff Mahoney <jeffm@suse.com>
>>
>> The srcu_struct in btrfs_fs_info scales in size with NR_CPUS.  On
>> kernels built with NR_CPUS=8192, this can result in kmalloc failures
>> that prevent mounting.
>>
>> There is work in progress to try to resolve this for every user of
>> srcu_struct but using kvzalloc will work around the failures until
>> that is complete.
> 
> Interesting, the subvol_srcu is the worst contributor to the fs_info
> size, on a config with NR_CPUS=512:
> 
>         struct srcu_struct         subvol_srcu;          /*  1064  3480 */
> 	...
> 	/* size: 6496, cachelines: 102, members: 181 */
> 
> Using kvzalloc makes sense and is a minimal fix. In the long term I'd
> rather allocate subvol_srcu dynamically so the whole fs_info is not
> vmalloced (in the rare case when kmalloc would not work). As this is
> unpredictable and almost invisible, I'm worried about some random
> effects (performance, virtual mappings), so it would be better to avoid
> them if possible.

The interesting bit is the "there is WIP trying to address this". Are
there any patches that have been sent to lkml?


> 
> Reviewed-by: David Sterba <dsterba@suse.com>
> 

Patch

diff --git a/fs/btrfs/ctree.h b/fs/btrfs/ctree.h
index 1a462ab85c49..0f521ba5f2f9 100644
--- a/fs/btrfs/ctree.h
+++ b/fs/btrfs/ctree.h
@@ -2974,7 +2974,7 @@  static inline void free_fs_info(struct btrfs_fs_info *fs_info)
 	kfree(fs_info->super_copy);
 	kfree(fs_info->super_for_commit);
 	security_free_mnt_opts(&fs_info->security_opts);
-	kfree(fs_info);
+	kvfree(fs_info);
 }
 
 /* tree mod log functions from ctree.c */
diff --git a/fs/btrfs/super.c b/fs/btrfs/super.c
index 6e71a2a78363..4b817947e00f 100644
--- a/fs/btrfs/super.c
+++ b/fs/btrfs/super.c
@@ -1545,7 +1545,7 @@  static struct dentry *btrfs_mount_root(struct file_system_type *fs_type,
 	 * it for searching for existing supers, so this lets us do that and
 	 * then open_ctree will properly initialize everything later.
 	 */
-	fs_info = kzalloc(sizeof(struct btrfs_fs_info), GFP_KERNEL);
+	fs_info = kvzalloc(sizeof(struct btrfs_fs_info), GFP_KERNEL);
 	if (!fs_info) {
 		error = -ENOMEM;
 		goto error_sec_opts;
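
Illustration (not part of the patch)

The two hunks above have to change together: once fs_info may come from
the vmalloc fallback, freeing it with plain kfree() is no longer valid.
The userspace sketch below models that pairing. Everything in it is a
stand-in: the model_* names, the MODEL_CONTIG_LIMIT cutoff and the
header used to remember the allocation path are hypothetical, and the
kernel's kvfree() instead checks the address with is_vmalloc_addr().

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/*
 * Hypothetical cutoff standing in for "a physically contiguous
 * allocation of this size fails". Not a real kernel threshold.
 */
#define MODEL_CONTIG_LIMIT 4096

/*
 * Header placed in front of every allocation so the model can tell
 * which path served it. The union keeps the payload suitably aligned.
 */
union model_hdr {
	bool from_fallback;
	max_align_t align;
};

/* Models kzalloc(): zeroed memory, but refuses large requests. */
void *model_kzalloc(size_t size)
{
	union model_hdr *h;

	if (size > MODEL_CONTIG_LIMIT)
		return NULL;
	h = calloc(1, sizeof(*h) + size);
	if (!h)
		return NULL;
	h->from_fallback = false;
	return h + 1;
}

/* Models vzalloc(): zeroed memory with no contiguity limit. */
void *model_vzalloc(size_t size)
{
	union model_hdr *h = calloc(1, sizeof(*h) + size);

	if (!h)
		return NULL;
	h->from_fallback = true;
	return h + 1;
}

/* Models kvzalloc(): try the contiguous path first, then fall back. */
void *model_kvzalloc(size_t size)
{
	void *p = model_kzalloc(size);

	return p ? p : model_vzalloc(size);
}

/* Reports which path served an allocation made by the functions above. */
bool model_is_fallback(const void *p)
{
	assert(p != NULL);
	return ((const union model_hdr *)p - 1)->from_fallback;
}

/*
 * Models kfree(): only valid for the contiguous path. Returns false
 * (and frees nothing) when handed a fallback allocation.
 */
bool model_kfree(void *p)
{
	if (!p)
		return true;
	if (model_is_fallback(p))
		return false;
	free((union model_hdr *)p - 1);
	return true;
}

/* Models kvfree(): accepts pointers from either path. */
void model_kvfree(void *p)
{
	if (p)
		free((union model_hdr *)p - 1);
}
```

In this model, a 512-byte request is served by the contiguous path, while
a 6496-byte request (the fs_info size quoted above for NR_CPUS=512) goes
to the fallback only because the cutoff was chosen arbitrarily small.
model_kfree() rejects that pointer and model_kvfree() accepts it, which
is why free_fs_info() switches to kvfree() in the same patch.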