diff mbox series

initrd: support erofs as initrd

Message ID 20250320-initrd-erofs-v1-1-35bbb293468a@cyberus-technology.de (mailing list archive)
State New

Commit Message

Julian Stecklina March 20, 2025, 7:28 p.m. UTC
From: Julian Stecklina <julian.stecklina@cyberus-technology.de>

Add erofs detection to the initrd mount code. This allows systems to
boot from an erofs-based initrd in the same way as they can boot from
a squashfs initrd.

Just as squashfs initrds, erofs images as initrds are a good option
for systems that are memory-constrained.

Signed-off-by: Julian Stecklina <julian.stecklina@cyberus-technology.de>
---
 init/do_mounts_rd.c | 19 +++++++++++++++++++
 1 file changed, 19 insertions(+)


---
base-commit: 5fc31936081919a8572a3d644f3fbb258038f337
change-id: 20250320-initrd-erofs-76e925fdf68c

Best regards,

Comments

Al Viro March 21, 2025, 2:08 a.m. UTC | #1
On Thu, Mar 20, 2025 at 08:28:23PM +0100, Julian Stecklina via B4 Relay wrote:
> From: Julian Stecklina <julian.stecklina@cyberus-technology.de>
> 
> Add erofs detection to the initrd mount code. This allows systems to
> boot from an erofs-based initrd in the same way as they can boot from
> a squashfs initrd.
> 
> Just as squashfs initrds, erofs images as initrds are a good option
> for systems that are memory-constrained.
> 
> Signed-off-by: Julian Stecklina <julian.stecklina@cyberus-technology.de>

>  #include "do_mounts.h"
>  #include "../fs/squashfs/squashfs_fs.h"
> +#include "../fs/erofs/erofs_fs.h"

This is getting really unpleasant...

Folks, could we do something similar to initcalls - add a section
(.init.text.rd_detect?) with array of pointers to __init functions
that would be called by that thing in turn?  With filesystems that
want to add that kind of stuff being able to do something like

static int __init detect_minix(struct file *file, void *buf, loff_t *pos, int start_block)
{
	struct minix_super_block *minixsb = buf;
	initrd_fill_buffer(file, buf, pos, (start_block + 1) * BLOCK_SIZE);
	if (minixsb->s_magic == MINIX_SUPER_MAGIC ||
	    minixsb->s_magic == MINIX_SUPER_MAGIC2) {
		printk(KERN_NOTICE
			"RAMDISK: Minix filesystem found at block %d\n",
			start_block);
		return minixsb->s_nzones << minixsb->s_log_zone_size;
	}
	return -1;
}

initrd_detect(detect_minix);

with the latter emitting a pointer to detect_minix into that new
section?

initrd_fill_buffer() would be something along the lines of

	if (*pos != wanted) {
		*pos = wanted;
		kernel_read(file, buf, 512, pos);
	}

I mean, we can keep adding those pieces there, but...
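
As a rough userspace illustration of the registration scheme described above, the sketch below collects detector pointers in a dedicated ELF section and calls them in turn. It assumes GCC or Clang with a GNU-compatible linker on an ELF target, which provides the `__start_`/`__stop_` symbols. The section name `rd_detect`, the simplified detector signature and `identify_image()` are hypothetical and do not come from the thread, and the romfs and squashfs field offsets are my reading of those on-disk formats, so they should be checked against the kernel headers.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define BLOCK_SIZE 1024u

/* Hypothetical signature: returns the image size in 1 KiB blocks, or -1. */
typedef int (*rd_detect_fn)(const unsigned char *buf, size_t len);

/* Emit a pointer to fn into the "rd_detect" section (name is an assumption). */
#define initrd_detect(fn) \
	static const rd_detect_fn rd_detect_entry_##fn \
	__attribute__((used, section("rd_detect"))) = fn

/* The linker defines these bounds for sections named like C identifiers. */
extern const rd_detect_fn __start_rd_detect[];
extern const rd_detect_fn __stop_rd_detect[];

static int detect_romfs(const unsigned char *buf, size_t len)
{
	uint64_t size;

	if (len < 12 || memcmp(buf, "-rom1fs-", 8) != 0)
		return -1;
	/* Assumed layout: big-endian 32-bit size right after the magic. */
	size = (uint64_t)buf[8] << 24 | (uint64_t)buf[9] << 16 |
	       (uint64_t)buf[10] << 8 | buf[11];
	return (int)((size + BLOCK_SIZE - 1) / BLOCK_SIZE);
}
initrd_detect(detect_romfs);

static int detect_squashfs(const unsigned char *buf, size_t len)
{
	uint64_t used = 0;

	if (len < 48 || memcmp(buf, "hsqs", 4) != 0)
		return -1;
	/* Assumed layout: little-endian 64-bit bytes_used at offset 40. */
	for (int i = 7; i >= 0; i--)
		used = used << 8 | buf[40 + i];
	return (int)((used + BLOCK_SIZE - 1) / BLOCK_SIZE);
}
initrd_detect(detect_squashfs);

/* Walk every registered detector until one recognises the buffer. */
int identify_image(const unsigned char *buf, size_t len)
{
	const rd_detect_fn *p;

	for (p = __start_rd_detect; p < __stop_rd_detect; p++) {
		int nblocks = (*p)(buf, len);

		if (nblocks >= 0)
			return nblocks;
	}
	return -1;
}
```

With this shape, adding a filesystem means adding one detector function and one `initrd_detect()` line next to it; the walker and its include list stay untouched.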
Christoph Hellwig March 21, 2025, 5:01 a.m. UTC | #2
We've been trying to kill off initrd in favor of initramfs for about
two decades.  I don't think adding new file system support to it is
helpful.
Gao Xiang March 21, 2025, 5:27 a.m. UTC | #3
Hi Christoph,

On 2025/3/21 13:01, Christoph Hellwig wrote:
> We've been trying to kill off initrd in favor of initramfs for about
> two decades.  I don't think adding new file system support to it is
> helpful.
> 

Disclaimer: I don't know the background of this effort so
more background might be helpful.

Two years ago, I considered using EROFS + FSDAX to use the initrd
image from the bootloader directly, avoiding both the original initrd
double caching issue (which is what initramfs was proposed to
resolve) and the unnecessary tmpfs unpack overhead of initramfs:
https://lore.kernel.org/r/ZXgNQ85PdUKrQU1j@infradead.org

Also, EROFS supports xattrs (which the cpio format doesn't support),
so the following potential work is no longer needed (although I
don't have any interest in following it either):
https://lore.kernel.org/r/20190523121803.21638-1-roberto.sassu@huawei.com

Anyway, personally I have no time for, or even input on, those.

Thanks,
Gao Xiang
Christian Brauner March 21, 2025, 8:46 a.m. UTC | #4
On Fri, Mar 21, 2025 at 02:08:26AM +0000, Al Viro wrote:
> On Thu, Mar 20, 2025 at 08:28:23PM +0100, Julian Stecklina via B4 Relay wrote:
> > From: Julian Stecklina <julian.stecklina@cyberus-technology.de>
> > 
> > Add erofs detection to the initrd mount code. This allows systems to
> > boot from an erofs-based initrd in the same way as they can boot from
> > a squashfs initrd.
> > 
> > Just as squashfs initrds, erofs images as initrds are a good option
> > for systems that are memory-constrained.
> > 
> > Signed-off-by: Julian Stecklina <julian.stecklina@cyberus-technology.de>
> 
> >  #include "do_mounts.h"
> >  #include "../fs/squashfs/squashfs_fs.h"
> > +#include "../fs/erofs/erofs_fs.h"
> 
> This is getting really unpleasant...
> 
> Folks, could we do something similar to initcalls - add a section
> (.init.text.rd_detect?) with array of pointers to __init functions
> that would be called by that thing in turn?  With filesystems that
> want to add that kind of stuff being able to do something like
> 
> static int __init detect_minix(struct file *file, void *buf, loff_t *pos, int start_block)
> {
> 	struct minix_super_block *minixsb = buf;
> 	initrd_fill_buffer(file, buf, pos, (start_block + 1) * BLOCK_SIZE);
> 	if (minixsb->s_magic == MINIX_SUPER_MAGIC ||
> 	    minixsb->s_magic == MINIX_SUPER_MAGIC2) {
> 		printk(KERN_NOTICE
> 			"RAMDISK: Minix filesystem found at block %d\n",
> 			start_block);
> 		return minixsb->s_nzones << minixsb->s_log_zone_size;
> 	}
> 	return -1;
> }
> 
> initrd_detect(detect_minix);
> 
> with the latter emitting a pointer to detect_minix into that new
> section?
> 
> initrd_fill_buffer() would be something along the lines of
> 
> 	if (*pos != wanted) {
> 		*pos = wanted;
> 		kernel_read(file, buf, 512, pos);
> 	}
> 
> I mean, we can keep adding those pieces there, but...

Very much agreed.
Christian Brauner March 21, 2025, 8:48 a.m. UTC | #5
On Thu, Mar 20, 2025 at 08:28:23PM +0100, Julian Stecklina via B4 Relay wrote:
> From: Julian Stecklina <julian.stecklina@cyberus-technology.de>
> 
> Add erofs detection to the initrd mount code. This allows systems to
> boot from an erofs-based initrd in the same way as they can boot from
> a squashfs initrd.
> 
> Just as squashfs initrds, erofs images as initrds are a good option
> for systems that are memory-constrained.

I think this can be valuable and I know we've had discussion about this
before but it should please come with a detailed rationale for using
erofs and what project(s) this would be used by. Then I don't think
there's a reason per se not to do it.
Thomas Weißschuh March 21, 2025, 9:16 a.m. UTC | #6
On Thu, Mar 20, 2025 at 08:28:23PM +0100, Julian Stecklina via B4 Relay wrote:
> From: Julian Stecklina <julian.stecklina@cyberus-technology.de>
> 
> Add erofs detection to the initrd mount code. This allows systems to
> boot from an erofs-based initrd in the same way as they can boot from
> a squashfs initrd.
> 
> Just as squashfs initrds, erofs images as initrds are a good option
> for systems that are memory-constrained.
> 
> Signed-off-by: Julian Stecklina <julian.stecklina@cyberus-technology.de>
> ---
>  init/do_mounts_rd.c | 19 +++++++++++++++++++
>  1 file changed, 19 insertions(+)
> 
> diff --git a/init/do_mounts_rd.c b/init/do_mounts_rd.c
> index ac021ae6e6fa78c7b7828a78ab2fa3af3611bef3..7c3f8b45b5ed2eea3c534d7f2e65608542009df5 100644
> --- a/init/do_mounts_rd.c
> +++ b/init/do_mounts_rd.c
> @@ -11,6 +11,7 @@
>  
>  #include "do_mounts.h"
>  #include "../fs/squashfs/squashfs_fs.h"
> +#include "../fs/erofs/erofs_fs.h"
>  
>  #include <linux/decompress/generic.h>
>  
> @@ -47,6 +48,7 @@ static int __init crd_load(decompress_fn deco);
>   *	romfs
>   *	cramfs
>   *	squashfs
> + *	erofs
>   *	gzip
>   *	bzip2
>   *	lzma
> @@ -63,6 +65,7 @@ identify_ramdisk_image(struct file *file, loff_t pos,
>  	struct romfs_super_block *romfsb;
>  	struct cramfs_super *cramfsb;
>  	struct squashfs_super_block *squashfsb;
> +	struct erofs_super_block *erofsb;
>  	int nblocks = -1;
>  	unsigned char *buf;
>  	const char *compress_name;
> @@ -77,6 +80,7 @@ identify_ramdisk_image(struct file *file, loff_t pos,
>  	romfsb = (struct romfs_super_block *) buf;
>  	cramfsb = (struct cramfs_super *) buf;
>  	squashfsb = (struct squashfs_super_block *) buf;
> +	erofsb = (struct erofs_super_block *) buf;
>  	memset(buf, 0xe5, size);
>  
>  	/*
> @@ -165,6 +169,21 @@ identify_ramdisk_image(struct file *file, loff_t pos,
>  		goto done;
>  	}
>  
> +	/* Try erofs */
> +	pos = (start_block * BLOCK_SIZE) + EROFS_SUPER_OFFSET;
> +	kernel_read(file, buf, size, &pos);
> +
> +	if (erofsb->magic == EROFS_SUPER_MAGIC_V1) {

le32_to_cpu(erofsb->magic)

> +		printk(KERN_NOTICE
> +		       "RAMDISK: erofs filesystem found at block %d\n",
> +		       start_block);
> +
> +		nblocks = ((erofsb->blocks << erofsb->blkszbits) + BLOCK_SIZE - 1)
> +			>> BLOCK_SIZE_BITS;

le32_to_cpu(erofsb->blocks)

> +
> +		goto done;
> +	}
> +
>  	printk(KERN_NOTICE
>  	       "RAMDISK: Couldn't find valid RAM disk image starting at %d.\n",
>  	       start_block);
> 

This seems to be broken for cramfs and minix, too.


Thomas
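
The review comments above come down to the superblock fields being little-endian on disk, so a direct comparison only works on little-endian hosts. A minimal userspace sketch of a byte-order-independent check follows. The helper names `load_le32()` and `erofs_nblocks()` are hypothetical, the byte offsets (magic at 0, blkszbits at 12, blocks at 36) are my reading of the EROFS superblock layout and should be verified against erofs_fs.h, and widening the shift to 64 bits is my own addition rather than something requested in the review.

```c
#include <stddef.h>
#include <stdint.h>

#define BLOCK_SIZE_BITS      10
#define BLOCK_SIZE           (1u << BLOCK_SIZE_BITS)
#define EROFS_SUPER_MAGIC_V1 0xE0F5E1E2u

/* Decode a little-endian 32-bit field regardless of host byte order,
 * which is the effect le32_to_cpu() has in the kernel. */
uint32_t load_le32(const unsigned char *p)
{
	return (uint32_t)p[0] | (uint32_t)p[1] << 8 |
	       (uint32_t)p[2] << 16 | (uint32_t)p[3] << 24;
}

/* sb points at the superblock (already offset by EROFS_SUPER_OFFSET).
 * Returns the image size in 1 KiB blocks, or -1 if it is not erofs. */
int64_t erofs_nblocks(const unsigned char *sb, size_t len)
{
	uint8_t blkszbits;
	uint64_t bytes;

	if (len < 40 || load_le32(sb) != EROFS_SUPER_MAGIC_V1)
		return -1;

	blkszbits = sb[12];		/* assumed offset of blkszbits */
	if (blkszbits >= 32)		/* reject shifts that make no sense */
		return -1;

	/* Widen before shifting so large images do not overflow 32 bits. */
	bytes = (uint64_t)load_le32(sb + 36) << blkszbits;
	return (int64_t)((bytes + BLOCK_SIZE - 1) >> BLOCK_SIZE_BITS);
}
```

For example, an image with 3 blocks of 4 KiB each (blkszbits of 12) occupies 12288 bytes and is reported as 12 ramdisk blocks.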
Julian Stecklina March 21, 2025, 12:49 p.m. UTC | #7
Hi Al!

On Fri, 2025-03-21 at 02:08 +0000, Al Viro wrote:
> On Thu, Mar 20, 2025 at 08:28:23PM +0100, Julian Stecklina via B4 Relay wrote:
> > From: Julian Stecklina <julian.stecklina@cyberus-technology.de>
> > 
> > Add erofs detection to the initrd mount code. This allows systems to
> > boot from an erofs-based initrd in the same way as they can boot from
> > a squashfs initrd.
> > 
> > Just as squashfs initrds, erofs images as initrds are a good option
> > for systems that are memory-constrained.
> > 
> > Signed-off-by: Julian Stecklina <julian.stecklina@cyberus-technology.de>
> 
> >  #include "do_mounts.h"
> >  #include "../fs/squashfs/squashfs_fs.h"
> > +#include "../fs/erofs/erofs_fs.h"
> 
> This is getting really unpleasant...
> 
> Folks, could we do something similar to initcalls - add a section
> (.init.text.rd_detect?) with array of pointers to __init functions
> that would be called by that thing in turn?  With filesystems that
> want to add that kind of stuff being able to do something like

That's a great suggestion! I wonder whether it's possible to restructure the
code further, so it does something along these lines:

1. Try to detect whether it's a (compressed) initramfs -> if so, extract and be
done.
2. Copy the initrd onto /dev/ram and just try to mount it (without the manual fs
detection)

As far as I understand it, the kernel should be able to figure out by itself what
filesystem it is? If so, this removes the manual fs detection and just allows
all filesystem images to be used as initrd.

I'm a noob in this part of the code base, so feel free to tell me that/why this
is not possible. :)

Julian
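
The two-step idea above could start from a classifier like the following userspace sketch, which only looks at the first bytes of the image. The "070701" and "070702" strings are the cpio newc magics; the compression table is deliberately incomplete and only illustrative. The enum and `classify_image()` are hypothetical names, and a compressed image would still need to be decompressed and classified again before deciding between unpacking it and copying it to /dev/ram.

```c
#include <stddef.h>
#include <string.h>

enum image_kind {
	IMAGE_CPIO,		/* unpack as initramfs */
	IMAGE_COMPRESSED,	/* decompress first, then classify again */
	IMAGE_FILESYSTEM,	/* copy to /dev/ram and let mount probe it */
};

enum image_kind classify_image(const unsigned char *buf, size_t len)
{
	/* Two-byte compression magics; not an exhaustive list. */
	static const unsigned char compressed_magic[][2] = {
		{ 0x1f, 0x8b },	/* gzip */
		{ 0x42, 0x5a },	/* bzip2 */
		{ 0xfd, 0x37 },	/* xz */
		{ 0x28, 0xb5 },	/* zstd */
	};
	size_t i;

	if (len >= 6 && (memcmp(buf, "070701", 6) == 0 ||
			 memcmp(buf, "070702", 6) == 0))
		return IMAGE_CPIO;

	if (len >= 2) {
		for (i = 0; i < sizeof(compressed_magic) /
				sizeof(compressed_magic[0]); i++) {
			if (memcmp(buf, compressed_magic[i], 2) == 0)
				return IMAGE_COMPRESSED;
		}
	}

	/* Anything else is treated as an opaque filesystem image. */
	return IMAGE_FILESYSTEM;
}
```

The point of the design is that the last branch needs no per-filesystem knowledge: whatever is not cpio or a known compression format is handed to the normal mount path.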
Julian Stecklina March 21, 2025, 1:17 p.m. UTC | #8
On Fri, 2025-03-21 at 13:27 +0800, Gao Xiang wrote:
> Hi Christoph,
> 
> On 2025/3/21 13:01, Christoph Hellwig wrote:
> > We've been trying to kill off initrd in favor of initramfs for about
> > two decades.  I don't think adding new file system support to it is
> > helpful.
> > 
> 
> Disclaimer: I don't know the background of this effort so
> more background might be helpful.

So erofs came up in an effort to improve the experience for users of NixOS on
smaller systems. We use erofs a lot and some people in the community just
consider it a "better" cpio at this point. A great property is that the contents
stay compressed in memory and there is no need to unpack anything at boot.
Others like that the rootfs is read-only by default. In short: erofs is a great
fit.

Of course there are some solutions to using erofs images at boot now:
https://github.com/containers/initoverlayfs

But this adds yet another step in the already complex boot process and feels
like a hack. It would be nice to just use erofs images as initrd. The other
building block to this is automatically sizing /dev/ram0:

https://lkml.org/lkml/2025/3/20/1296

I didn't pack both patches into one series, because I thought enabling erofs
itself would be less controversial and is already useful on its own. The
autosizing of /dev/ram is probably more involved than my RFC patch. I'm hoping
for some input on how to do it right. :)

> 
> Two years ago, I considered using EROFS + FSDAX to use the initrd
> image from the bootloader directly, avoiding both the original initrd
> double caching issue (which is what initramfs was proposed to
> resolve) and the unnecessary tmpfs unpack overhead of initramfs:
> https://lore.kernel.org/r/ZXgNQ85PdUKrQU1j@infradead.org
> 
> Also, EROFS supports xattrs (which the cpio format doesn't support),
> so the following potential work is no longer needed (although I
> don't have any interest in following it either):
> https://lore.kernel.org/r/20190523121803.21638-1-roberto.sassu@huawei.com

Thanks for the pointers!

Julian
Julian Stecklina March 21, 2025, 1:26 p.m. UTC | #9
On Fri, 2025-03-21 at 10:16 +0100, Thomas Weißschuh wrote:
> On Thu, Mar 20, 2025 at 08:28:23PM +0100, Julian Stecklina via B4 Relay wrote:
> > From: Julian Stecklina <julian.stecklina@cyberus-technology.de>
> > 
> > Add erofs detection to the initrd mount code. This allows systems to
> > boot from an erofs-based initrd in the same way as they can boot from
> > a squashfs initrd.
> > 
> > Just as squashfs initrds, erofs images as initrds are a good option
> > for systems that are memory-constrained.
> > 
> > Signed-off-by: Julian Stecklina <julian.stecklina@cyberus-technology.de>
> > ---
> >  init/do_mounts_rd.c | 19 +++++++++++++++++++
> >  1 file changed, 19 insertions(+)
> > 
> > diff --git a/init/do_mounts_rd.c b/init/do_mounts_rd.c
> > index ac021ae6e6fa78c7b7828a78ab2fa3af3611bef3..7c3f8b45b5ed2eea3c534d7f2e65608542009df5 100644
> > --- a/init/do_mounts_rd.c
> > +++ b/init/do_mounts_rd.c
> > @@ -11,6 +11,7 @@
> >  
> >  #include "do_mounts.h"
> >  #include "../fs/squashfs/squashfs_fs.h"
> > +#include "../fs/erofs/erofs_fs.h"
> >  
> >  #include <linux/decompress/generic.h>
> >  
> > @@ -47,6 +48,7 @@ static int __init crd_load(decompress_fn deco);
> >   * romfs
> >   * cramfs
> >   * squashfs
> > + * erofs
> >   * gzip
> >   * bzip2
> >   * lzma
> > @@ -63,6 +65,7 @@ identify_ramdisk_image(struct file *file, loff_t pos,
> >   struct romfs_super_block *romfsb;
> >   struct cramfs_super *cramfsb;
> >   struct squashfs_super_block *squashfsb;
> > + struct erofs_super_block *erofsb;
> >   int nblocks = -1;
> >   unsigned char *buf;
> >   const char *compress_name;
> > @@ -77,6 +80,7 @@ identify_ramdisk_image(struct file *file, loff_t pos,
> >   romfsb = (struct romfs_super_block *) buf;
> >   cramfsb = (struct cramfs_super *) buf;
> >   squashfsb = (struct squashfs_super_block *) buf;
> > + erofsb = (struct erofs_super_block *) buf;
> >   memset(buf, 0xe5, size);
> >  
> >   /*
> > @@ -165,6 +169,21 @@ identify_ramdisk_image(struct file *file, loff_t pos,
> >   goto done;
> >   }
> >  
> > + /* Try erofs */
> > + pos = (start_block * BLOCK_SIZE) + EROFS_SUPER_OFFSET;
> > + kernel_read(file, buf, size, &pos);
> > +
> > + if (erofsb->magic == EROFS_SUPER_MAGIC_V1) {
> 
> le32_to_cpu(erofsb->magic)
> 
> > + printk(KERN_NOTICE
> > +        "RAMDISK: erofs filesystem found at block %d\n",
> > +        start_block);
> > +
> > + nblocks = ((erofsb->blocks << erofsb->blkszbits) + BLOCK_SIZE - 1)
> > + >> BLOCK_SIZE_BITS;
> 
> le32_to_cpu(erofsb->blocks)
> 
> > +
> > + goto done;
> > + }
> > +
> >   printk(KERN_NOTICE
> >          "RAMDISK: Couldn't find valid RAM disk image starting at %d.\n",
> >          start_block);
> > 
> 
> This seems to be broken for cramfs and minix, too.

Great observation. Will fix! 

Julian
Gao Xiang March 21, 2025, 1:57 p.m. UTC | #10
Hi Julian,

On 2025/3/21 21:17, Julian Stecklina wrote:
> On Fri, 2025-03-21 at 13:27 +0800, Gao Xiang wrote:
>> Hi Christoph,
>>
>> On 2025/3/21 13:01, Christoph Hellwig wrote:
>>> We've been trying to kill off initrd in favor of initramfs for about
>>> two decades.  I don't think adding new file system support to it is
>>> helpful.
>>>
>>
>> Disclaimer: I don't know the background of this effort so
>> more background might be helpful.
> 
> So erofs came up in an effort to improve the experience for users of NixOS on
> smaller systems. We use erofs a lot and some people in the community just
> consider it a "better" cpio at this point. A great property is that the contents
> stay compressed in memory and there is no need to unpack anything at boot.
> Others like that the rootfs is read-only by default. In short: erofs is a great
> fit.
> 
> Of course there are some solutions to using erofs images at boot now:
> https://github.com/containers/initoverlayfs
> 
> But this adds yet another step in the already complex boot process and feels
> like a hack. It would be nice to just use erofs images as initrd. The other
> building block to this is automatically sizing /dev/ram0:
> 
> https://lkml.org/lkml/2025/3/20/1296
> 
> I didn't pack both patches into one series, because I thought enabling erofs
> itself would be less controversial and is already useful on its own. The
> autosizing of /dev/ram is probably more involved than my RFC patch. I'm hoping
> for some input on how to do it right. :)

Ok, my own thought is that the cpio format is somewhat inflexible.  It
seems that the main original reason for introducing initramfs and
cpio was to avoid double caching, but that can be resolved with FSDAX
now, and initdax totally avoids unnecessary cpio parsing and unpacking.
The cpio format is much like tar: it lacks basic features like
random access and xattrs, which are useful for some use cases as
I mentioned before.

The initrd image can even be compressed as a whole and decompressed in
the current initramfs way.  If you really need on-demand
decompression, you could leave some files compressed since EROFS
supports per-inode compression, but those files are still double
cached since data in FSDAX mode must be uncompressed to support mmap.
You could leave rarely used files compressed.

Thanks,
Gao Xiang

Patch

diff --git a/init/do_mounts_rd.c b/init/do_mounts_rd.c
index ac021ae6e6fa78c7b7828a78ab2fa3af3611bef3..7c3f8b45b5ed2eea3c534d7f2e65608542009df5 100644
--- a/init/do_mounts_rd.c
+++ b/init/do_mounts_rd.c
@@ -11,6 +11,7 @@ 
 
 #include "do_mounts.h"
 #include "../fs/squashfs/squashfs_fs.h"
+#include "../fs/erofs/erofs_fs.h"
 
 #include <linux/decompress/generic.h>
 
@@ -47,6 +48,7 @@  static int __init crd_load(decompress_fn deco);
  *	romfs
  *	cramfs
  *	squashfs
+ *	erofs
  *	gzip
  *	bzip2
  *	lzma
@@ -63,6 +65,7 @@  identify_ramdisk_image(struct file *file, loff_t pos,
 	struct romfs_super_block *romfsb;
 	struct cramfs_super *cramfsb;
 	struct squashfs_super_block *squashfsb;
+	struct erofs_super_block *erofsb;
 	int nblocks = -1;
 	unsigned char *buf;
 	const char *compress_name;
@@ -77,6 +80,7 @@  identify_ramdisk_image(struct file *file, loff_t pos,
 	romfsb = (struct romfs_super_block *) buf;
 	cramfsb = (struct cramfs_super *) buf;
 	squashfsb = (struct squashfs_super_block *) buf;
+	erofsb = (struct erofs_super_block *) buf;
 	memset(buf, 0xe5, size);
 
 	/*
@@ -165,6 +169,21 @@  identify_ramdisk_image(struct file *file, loff_t pos,
 		goto done;
 	}
 
+	/* Try erofs */
+	pos = (start_block * BLOCK_SIZE) + EROFS_SUPER_OFFSET;
+	kernel_read(file, buf, size, &pos);
+
+	if (erofsb->magic == EROFS_SUPER_MAGIC_V1) {
+		printk(KERN_NOTICE
+		       "RAMDISK: erofs filesystem found at block %d\n",
+		       start_block);
+
+		nblocks = ((erofsb->blocks << erofsb->blkszbits) + BLOCK_SIZE - 1)
+			>> BLOCK_SIZE_BITS;
+
+		goto done;
+	}
+
 	printk(KERN_NOTICE
 	       "RAMDISK: Couldn't find valid RAM disk image starting at %d.\n",
 	       start_block);