diff mbox series

[v1,07/23] erofs: add nodev mode

Message ID 20211227125444.21187-8-jefflexu@linux.alibaba.com (mailing list archive)
State New, archived
Headers show
Series fscache,erofs: fscache-based demand-read semantics | expand

Commit Message

Jingbo Xu Dec. 27, 2021, 12:54 p.m. UTC
Until then erofs is exactly blockdev based filesystem. In other using
scenarios (e.g. container image), erofs needs to run upon files.

This patch introduces a new nodev mode, in which erofs could be mounted
from a bootstrap blob file containing the complete erofs image.

The following patch will introduce a new mount option "uuid", by which
users could specify the bootstrap blob file.

Signed-off-by: Jeffle Xu <jefflexu@linux.alibaba.com>
---
 fs/erofs/data.c     | 13 ++++++++---
 fs/erofs/internal.h |  1 +
 fs/erofs/super.c    | 56 +++++++++++++++++++++++++++++++++------------
 3 files changed, 53 insertions(+), 17 deletions(-)

Comments

Gao Xiang Jan. 4, 2022, 2:33 p.m. UTC | #1
On Mon, Dec 27, 2021 at 08:54:28PM +0800, Jeffle Xu wrote:
> Until then erofs is exactly blockdev based filesystem. In other using
> scenarios (e.g. container image), erofs needs to run upon files.
> 
> This patch introduces a new nodev mode, in which erofs could be mounted
> from a bootstrap blob file containing the complete erofs image.
> 
> The following patch will introduce a new mount option "uuid", by which
> users could specify the bootstrap blob file.
> 
> Signed-off-by: Jeffle Xu <jefflexu@linux.alibaba.com>

I think the order of some patches in this patchset can be improved.

Take this patch as an example. This patch introduces a new mount
option called "uuid", so the kernel will just accept it (which
generates a user-visible impact) after this patch but it doesn't
actually work.

Therefore, we actually have three different behaviors here:
 - kernel doesn't support "uuid" mount option completely;
 - kernel support "uuid" but it doesn't work;
 - kernel support "uuid" correctly (maybe after some random patch);

Actually that is bad for bisecting since there are some commits
having temporary behaviors. And we don't know which commit
actually fully implements this "uuid" mount option.

So personally I think the proper order is just like the bottom-up
approach, and make sure each patch can be tested / bisected
independently.

> ---
>  fs/erofs/data.c     | 13 ++++++++---
>  fs/erofs/internal.h |  1 +
>  fs/erofs/super.c    | 56 +++++++++++++++++++++++++++++++++------------
>  3 files changed, 53 insertions(+), 17 deletions(-)
> 
> diff --git a/fs/erofs/data.c b/fs/erofs/data.c
> index 477aaff0c832..61fa431d0713 100644
> --- a/fs/erofs/data.c
> +++ b/fs/erofs/data.c
> @@ -11,11 +11,18 @@
>  
>  struct page *erofs_get_meta_page(struct super_block *sb, erofs_blk_t blkaddr)
>  {
> -	struct address_space *const mapping = sb->s_bdev->bd_inode->i_mapping;
> +	struct address_space *mapping;
>  	struct page *page;
>  
> -	page = read_cache_page_gfp(mapping, blkaddr,
> -				   mapping_gfp_constraint(mapping, ~__GFP_FS));

Apart from the recommendation above, if my understanding is
correct, I think after we implement fscache_aops, 
read_cache_page_gfp() can work with proper fscache mapping.

So no need to implement something like erofs_readpage_from_fscache()
later (at least for the case here.)

Thanks,
Gao Xiang

> +	if (sb->s_bdev) {
> +		mapping = sb->s_bdev->bd_inode->i_mapping;
> +		page = read_cache_page_gfp(mapping, blkaddr,
> +				mapping_gfp_constraint(mapping, ~__GFP_FS));
> +	} else {
> +		/* TODO: data path in nodev mode */
> +		page = ERR_PTR(-EINVAL);
> +	}
> +
Gao Xiang Jan. 4, 2022, 2:58 p.m. UTC | #2
On Tue, Jan 04, 2022 at 10:33:26PM +0800, Gao Xiang wrote:
> On Mon, Dec 27, 2021 at 08:54:28PM +0800, Jeffle Xu wrote:
> > Until then erofs is exactly blockdev based filesystem. In other using
> > scenarios (e.g. container image), erofs needs to run upon files.
> > 
> > This patch introduces a new nodev mode, in which erofs could be mounted
> > from a bootstrap blob file containing the complete erofs image.
> > 
> > The following patch will introduce a new mount option "uuid", by which
> > users could specify the bootstrap blob file.
> > 
> > Signed-off-by: Jeffle Xu <jefflexu@linux.alibaba.com>
> 
> I think the order of some patches in this patchset can be improved.
> 
> Take this patch as an example. This patch introduces a new mount
> option called "uuid", so the kernel will just accept it (which
> generates a user-visible impact) after this patch but it doesn't
> actually work.
> 
> Therefore, we actually have three different behaviors here:
>  - kernel doesn't support "uuid" mount option completely;
>  - kernel support "uuid" but it doesn't work;
>  - kernel support "uuid" correctly (maybe after some random patch);
> 
> Actually that is bad for bisecting since there are some commits
> having temporary behaviors. And we don't know which commit
> actually fully implements this "uuid" mount option.
> 
> So personally I think the proper order is just like the bottom-up
> approach, and make sure each patch can be tested / bisected
> independently.

Oh, I may misread this patch, but I still think we'd better to
avoid dead paths "TODO" like this as much as possible.

Just do in the bottom-up way.

Thanks,
Gao Xiang
Jingbo Xu Jan. 5, 2022, 9:04 a.m. UTC | #3
On 1/4/22 10:58 PM, Gao Xiang wrote:
> On Tue, Jan 04, 2022 at 10:33:26PM +0800, Gao Xiang wrote:
>> On Mon, Dec 27, 2021 at 08:54:28PM +0800, Jeffle Xu wrote:
>>> Until then erofs is exactly blockdev based filesystem. In other using
>>> scenarios (e.g. container image), erofs needs to run upon files.
>>>
>>> This patch introduces a new nodev mode, in which erofs could be mounted
>>> from a bootstrap blob file containing the complete erofs image.
>>>
>>> The following patch will introduce a new mount option "uuid", by which
>>> users could specify the bootstrap blob file.
>>>
>>> Signed-off-by: Jeffle Xu <jefflexu@linux.alibaba.com>
>>
>> I think the order of some patches in this patchset can be improved.
>>
>> Take this patch as an example. This patch introduces a new mount
>> option called "uuid", so the kernel will just accept it (which
>> generates a user-visible impact) after this patch but it doesn't
>> actually work.
>>
>> Therefore, we actually have three different behaviors here:
>>  - kernel doesn't support "uuid" mount option completely;
>>  - kernel support "uuid" but it doesn't work;
>>  - kernel support "uuid" correctly (maybe after some random patch);
>>
>> Actually that is bad for bisecting since there are some commits
>> having temporary behaviors. And we don't know which commit
>> actually fully implements this "uuid" mount option.
>>
>> So personally I think the proper order is just like the bottom-up
>> approach, and make sure each patch can be tested / bisected
>> independently.
> 
> Oh, I may misread this patch, but I still think we'd better to
> avoid dead paths "TODO" like this as much as possible.
> 
> Just do in the bottom-up way.
> 

OK, it is better to be put at the latter part of the whole patch set.
Would be fixed in the next version. Thanks.
diff mbox series

Patch

diff --git a/fs/erofs/data.c b/fs/erofs/data.c
index 477aaff0c832..61fa431d0713 100644
--- a/fs/erofs/data.c
+++ b/fs/erofs/data.c
@@ -11,11 +11,18 @@ 
 
 struct page *erofs_get_meta_page(struct super_block *sb, erofs_blk_t blkaddr)
 {
-	struct address_space *const mapping = sb->s_bdev->bd_inode->i_mapping;
+	struct address_space *mapping;
 	struct page *page;
 
-	page = read_cache_page_gfp(mapping, blkaddr,
-				   mapping_gfp_constraint(mapping, ~__GFP_FS));
+	if (sb->s_bdev) {
+		mapping = sb->s_bdev->bd_inode->i_mapping;
+		page = read_cache_page_gfp(mapping, blkaddr,
+				mapping_gfp_constraint(mapping, ~__GFP_FS));
+	} else {
+		/* TODO: data path in nodev mode */
+		page = ERR_PTR(-EINVAL);
+	}
+
 	/* should already be PageUptodate */
 	if (!IS_ERR(page))
 		lock_page(page);
diff --git a/fs/erofs/internal.h b/fs/erofs/internal.h
index 45fb6f5d11b5..c9ee8c247202 100644
--- a/fs/erofs/internal.h
+++ b/fs/erofs/internal.h
@@ -67,6 +67,7 @@  struct erofs_mount_opts {
 	unsigned int max_sync_decompress_pages;
 #endif
 	unsigned int mount_opt;
+	char *uuid;
 };
 
 struct erofs_dev_context {
diff --git a/fs/erofs/super.c b/fs/erofs/super.c
index 6a969b1e0ee6..80c00c34eafc 100644
--- a/fs/erofs/super.c
+++ b/fs/erofs/super.c
@@ -304,15 +304,19 @@  static int erofs_init_devices(struct super_block *sb,
 		}
 		dis = ptr + erofs_blkoff(pos);
 
-		bdev = blkdev_get_by_path(dif->path,
-					  FMODE_READ | FMODE_EXCL,
-					  sb->s_type);
-		if (IS_ERR(bdev)) {
-			err = PTR_ERR(bdev);
-			goto err_out;
+		if (sb->s_bdev) {
+			bdev = blkdev_get_by_path(dif->path,
+					FMODE_READ | FMODE_EXCL,
+					sb->s_type);
+			if (IS_ERR(bdev)) {
+				err = PTR_ERR(bdev);
+				goto err_out;
+			}
+			dif->bdev = bdev;
+			dif->dax_dev = fs_dax_get_by_bdev(bdev);
+		} else {
+			/* TODO: multi devs in nodev mode */
 		}
-		dif->bdev = bdev;
-		dif->dax_dev = fs_dax_get_by_bdev(bdev);
 		dif->blocks = le32_to_cpu(dis->blocks);
 		dif->mapped_blkaddr = le32_to_cpu(dis->mapped_blkaddr);
 		sbi->total_blocks += dif->blocks;
@@ -337,7 +341,11 @@  static int erofs_read_superblock(struct super_block *sb)
 	void *data;
 	int ret;
 
-	page = read_mapping_page(sb->s_bdev->bd_inode->i_mapping, 0, NULL);
+	/* TODO: metadata path in nodev mode */
+	if (sb->s_bdev)
+		page = read_mapping_page(sb->s_bdev->bd_inode->i_mapping, 0, NULL);
+	else
+		page = ERR_PTR(-EOPNOTSUPP);
 	if (IS_ERR(page)) {
 		erofs_err(sb, "cannot read erofs superblock");
 		return PTR_ERR(page);
@@ -633,9 +641,12 @@  static int erofs_fc_fill_super(struct super_block *sb, struct fs_context *fc)
 
 	sb->s_magic = EROFS_SUPER_MAGIC;
 
-	if (!sb_set_blocksize(sb, EROFS_BLKSIZ)) {
+	if (sb->s_bdev && !sb_set_blocksize(sb, EROFS_BLKSIZ)) {
 		erofs_err(sb, "failed to set erofs blksize");
 		return -EINVAL;
+	} else {
+		sb->s_blocksize = EROFS_BLKSIZ;
+		sb->s_blocksize_bits = LOG_BLOCK_SIZE;
 	}
 
 	sbi = kzalloc(sizeof(*sbi), GFP_KERNEL);
@@ -644,16 +655,22 @@  static int erofs_fc_fill_super(struct super_block *sb, struct fs_context *fc)
 
 	sb->s_fs_info = sbi;
 	sbi->opt = ctx->opt;
-	sbi->dax_dev = fs_dax_get_by_bdev(sb->s_bdev);
 	sbi->devs = ctx->devs;
 	ctx->devs = NULL;
 
+	if (sb->s_bdev)
+		sbi->dax_dev = fs_dax_get_by_bdev(sb->s_bdev);
+	else
+		sbi->dax_dev = NULL;
+
 	err = erofs_read_superblock(sb);
 	if (err)
 		return err;
 
 	if (test_opt(&sbi->opt, DAX_ALWAYS) &&
-	    !dax_supported(sbi->dax_dev, sb->s_bdev, EROFS_BLKSIZ, 0, bdev_nr_sectors(sb->s_bdev))) {
+	    (!sbi->dax_dev ||
+	     !dax_supported(sbi->dax_dev, sb->s_bdev, EROFS_BLKSIZ, 0,
+			    bdev_nr_sectors(sb->s_bdev)))) {
 		errorfc(fc, "DAX unsupported by block device. Turning off DAX.");
 		clear_opt(&sbi->opt, DAX_ALWAYS);
 	}
@@ -701,6 +718,10 @@  static int erofs_fc_fill_super(struct super_block *sb, struct fs_context *fc)
 
 static int erofs_fc_get_tree(struct fs_context *fc)
 {
+	struct erofs_fs_context *ctx = fc->fs_private;
+
+	if (ctx->opt.uuid)
+		return get_tree_nodev(fc, erofs_fc_fill_super);
 	return get_tree_bdev(fc, erofs_fc_fill_super);
 }
 
@@ -789,7 +810,10 @@  static void erofs_kill_sb(struct super_block *sb)
 
 	WARN_ON(sb->s_magic != EROFS_SUPER_MAGIC);
 
-	kill_block_super(sb);
+	if (sb->s_bdev)
+		kill_block_super(sb);
+	else
+		generic_shutdown_super(sb);
 
 	sbi = EROFS_SB(sb);
 	if (!sbi)
@@ -889,7 +913,11 @@  static int erofs_statfs(struct dentry *dentry, struct kstatfs *buf)
 {
 	struct super_block *sb = dentry->d_sb;
 	struct erofs_sb_info *sbi = EROFS_SB(sb);
-	u64 id = huge_encode_dev(sb->s_bdev->bd_dev);
+	u64 id = 0;
+
+	/* TODO: fsid in nodev mode */
+	if (sb->s_bdev)
+		id = huge_encode_dev(sb->s_bdev->bd_dev);
 
 	buf->f_type = sb->s_magic;
 	buf->f_bsize = EROFS_BLKSIZ;