From patchwork Tue Sep 24 19:29:45 2024
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Martin Raiber
X-Patchwork-Id: 13811137
Message-ID: <010201922582d9b7-d7ef099b-176f-4799-a54c-ff43cda585aa-000000@eu-west-1.amazonses.com>
Date: Tue, 24 Sep 2024 19:29:45 +0000
X-Mailing-List: linux-btrfs@vger.kernel.org
To: "linux-btrfs@vger.kernel.org"
From: Martin Raiber
Subject: About one million subvols limit

Hi,

one btrfs user ran into a problem where creating a snapshot fails with
"Too many open files" (EMFILE). I did some digging in this mailing list
and found a case where someone else had the same issue. It was diagnosed
as a limit on the number of anon bdevs, of which at most 2^20 (about one
million) can be allocated. It was fixed insofar as the limit was
increased (3x?) and the file system no longer remounts read-only when
this occurs. Thanks for this!

The user had about one million subvols in total (across different file
systems), so it is probably the same issue. It is problematic that this
limitation exists.

I did some further digging and found
https://lore.kernel.org/linux-bcachefs/20240222154802.GA1219527@perftesting/ .
Perhaps we can come up with an accelerated plan to increase the possible
number of subvols? E.g. the behaviour could be switched over via a mount
flag or feature bit?

I have also attached a possible patch which would increase the max number
of anon bdevs to 2^31, significantly improving the situation, but I'm
insufficiently involved to tell if this might cause obvious problems.

I've also noticed that each subvol uses 2K of kernel memory, so 2^31
subvols would use 4TiB of RAM -- so that would be the limitation for now
(it would be great if that could be improved as well, but that would be
another topic).

Regards,
Martin Raiber

From 72afde28a2bf6656d921a3897555568b8e92eb13 Mon Sep 17 00:00:00 2001
From: Martin Raiber
Date: Tue, 24 Sep 2024 21:19:00 +0200
Subject: [PATCH 1/1] Increase possible number of anon bdevs

Currently only max 2^20 anon bdevs can be allocated. Increase this by
also using the upper portion of the major device number for anon bdevs
(most upper bit set). Since currently major devices seem to be numbered
< 512 this shouldn't cause issues.
---
 fs/super.c | 16 +++++++++++++---
 1 file changed, 13 insertions(+), 3 deletions(-)

diff --git a/fs/super.c b/fs/super.c
index 2d762ce67f6e..0c030bfc04da 100644
--- a/fs/super.c
+++ b/fs/super.c
@@ -1257,26 +1257,36 @@ static DEFINE_IDA(unnamed_dev_ida);
 int get_anon_bdev(dev_t *p)
 {
 	int dev;
+	unsigned int dev_maj;
 
 	/*
 	 * Many userspace utilities consider an FSID of 0 invalid.
 	 * Always return at least 1 from get_anon_bdev.
 	 */
-	dev = ida_alloc_range(&unnamed_dev_ida, 1, (1 << MINORBITS) - 1,
+	dev = ida_alloc_range(&unnamed_dev_ida, 1, (1 << (MINORBITS + 11) ) - 1,
 			GFP_ATOMIC);
 	if (dev == -ENOSPC)
 		dev = -EMFILE;
 	if (dev < 0)
 		return dev;
 
-	*p = MKDEV(0, dev);
+	dev_maj = MAJOR(dev);
+	if (dev_maj==0)
+		*p = MKDEV(0, MINOR(dev));
+	else // Also use highest bit in MAJOR for anon devices
+		*p = MKDEV( 1U<<31 | dev_maj, MINOR(dev));
 	return 0;
 }
 EXPORT_SYMBOL(get_anon_bdev);
 
 void free_anon_bdev(dev_t dev)
 {
-	ida_free(&unnamed_dev_ida, MINOR(dev));
+	if (dev & (1U<<31))
+		dev &= ~(1U<<31);
+	else
+		dev = MINOR(dev);
+
+	ida_free(&unnamed_dev_ida, dev);
 }
 EXPORT_SYMBOL(free_anon_bdev);
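
The id-to-dev_t mapping described in the commit message can be checked in
userspace. What follows is only a sketch, assuming a 32-bit dev_t with
MINORBITS = 20 as in include/linux/kdev_t.h. The macros are simplified
stand-ins for the kernel ones, and the names ANON_MAJOR_FLAG,
anon_id_to_dev, anon_dev_to_id, roundtrip_ok and subvol_mem_bytes are
hypothetical. One thing it makes visible: for the flag to end up in bit 31
of the dev_t, which is the bit free_anon_bdev() tests, it has to be bit 11
of the 12-bit major. A value of 1U<<31 passed to MKDEV() as part of the
major appears to be shifted out of a 32-bit result.

```c
#include <assert.h>
#include <stdint.h>

/* Userspace stand-ins for the kernel's kdev_t.h macros (32-bit dev_t). */
#define MINORBITS	20
#define MINORMASK	((1U << MINORBITS) - 1)
#define MAJOR(dev)	((uint32_t)(dev) >> MINORBITS)
#define MINOR(dev)	((uint32_t)(dev) & MINORMASK)
#define MKDEV(ma, mi)	(((uint32_t)(ma) << MINORBITS) | (mi))

/* Hypothetical name: top bit of the 12-bit major, i.e. bit 31 of dev_t. */
#define ANON_MAJOR_FLAG	(1U << 11)

/* Map an IDA id in [1, 2^31 - 1] to a device number. */
static uint32_t anon_id_to_dev(uint32_t id)
{
	uint32_t maj = MAJOR(id);

	if (maj == 0)
		return MKDEV(0, MINOR(id));
	/* Ids above 2^20 - 1 get the flag bit set in the major. */
	return MKDEV(ANON_MAJOR_FLAG | maj, MINOR(id));
}

/* Inverse mapping, as needed before handing the id back to the IDA. */
static uint32_t anon_dev_to_id(uint32_t dev)
{
	if (dev & (1U << 31))
		return dev & ~(1U << 31);
	return MINOR(dev);
}

/* Returns 1 if an id survives the trip through the device number. */
static int roundtrip_ok(uint32_t id)
{
	return anon_dev_to_id(anon_id_to_dev(id)) == id;
}

/* Memory estimate from the mail: 2 KiB of kernel memory per subvol. */
static uint64_t subvol_mem_bytes(uint64_t nr_subvols)
{
	return nr_subvols * 2048;
}
```

With these helpers, roundtrip_ok() holds at the boundaries of the id range
(1, 2^20 - 1, 2^20 and 2^31 - 1), and subvol_mem_bytes(1ULL << 31) comes
out at 2^42 bytes, matching the 4TiB figure above.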