From patchwork Tue Jan 18 13:11:57 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Jingbo Xu X-Patchwork-Id: 12716331 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id C70B2C433FE for ; Tue, 18 Jan 2022 13:13:29 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S242046AbiARNM1 (ORCPT ); Tue, 18 Jan 2022 08:12:27 -0500 Received: from out30-132.freemail.mail.aliyun.com ([115.124.30.132]:60715 "EHLO out30-132.freemail.mail.aliyun.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S242036AbiARNMY (ORCPT ); Tue, 18 Jan 2022 08:12:24 -0500 X-Alimail-AntiSpam: AC=PASS;BC=-1|-1;BR=01201311R481e4;CH=green;DM=||false|;DS=||;FP=0|-1|-1|-1|0|-1|-1|-1;HT=e01e04395;MF=jefflexu@linux.alibaba.com;NM=1;PH=DS;RN=12;SR=0;TI=SMTPD_---0V2C2avm_1642511537; Received: from localhost(mailfrom:jefflexu@linux.alibaba.com fp:SMTPD_---0V2C2avm_1642511537) by smtp.aliyun-inc.com(127.0.0.1); Tue, 18 Jan 2022 21:12:18 +0800 From: Jeffle Xu To: dhowells@redhat.com, linux-cachefs@redhat.com, xiang@kernel.org, chao@kernel.org, linux-erofs@lists.ozlabs.org Cc: linux-fsdevel@vger.kernel.org, joseph.qi@linux.alibaba.com, bo.liu@linux.alibaba.com, tao.peng@linux.alibaba.com, gerry@linux.alibaba.com, eguan@linux.alibaba.com, linux-kernel@vger.kernel.org Subject: [PATCH v2 01/20] netfs: make @file optional in netfs_alloc_read_request() Date: Tue, 18 Jan 2022 21:11:57 +0800 Message-Id: <20220118131216.85338-2-jefflexu@linux.alibaba.com> X-Mailer: git-send-email 2.27.0 In-Reply-To: <20220118131216.85338-1-jefflexu@linux.alibaba.com> References: <20220118131216.85338-1-jefflexu@linux.alibaba.com> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-fsdevel@vger.kernel.org Make the @file parameter optional, and derive inode from the @folio parameter instead in order to support file system internal requests. @file parameter can't be removed completely, since it also works as the private data of ops->init_rreq(). 
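To make the new convention concrete, here is a small self-contained model (toy struct definitions, not the kernel's): the inode is now an explicit argument, so a filesystem-internal caller that has no struct file can pass NULL and take the inode from the folio's mapping, much as netfs_readpage() below does via folio_file_mapping(folio)->host. This is only an illustrative sketch of the calling convention, not netfs code.

/* Toy model (not kernel code) of the revised helper: the inode is an
 * explicit argument, so @file may be NULL for fs-internal requests. */
#include <stdio.h>
#include <stddef.h>

struct inode { long long i_size; };
struct address_space { struct inode *host; };
struct folio { struct address_space *mapping; };
struct file { struct inode *f_inode; };

struct read_request {
	struct inode *inode;
	long long i_size;
};

static void alloc_read_request(struct read_request *rreq,
			       struct inode *inode, struct file *file)
{
	(void)file;		/* only forwarded to the netfs; may be NULL */
	rreq->inode = inode;	/* was: file_inode(file) */
	rreq->i_size = inode->i_size;
}

int main(void)
{
	struct inode ino = { .i_size = 4096 };
	struct address_space mapping = { .host = &ino };
	struct folio folio = { .mapping = &mapping };
	struct read_request rreq;

	/* fs-internal read: no struct file, the inode comes via the folio */
	alloc_read_request(&rreq, folio.mapping->host, NULL);
	printf("i_size=%lld\n", rreq.i_size);
	return 0;
}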
Signed-off-by: Jeffle Xu --- fs/netfs/read_helper.c | 12 +++++++----- 1 file changed, 7 insertions(+), 5 deletions(-) diff --git a/fs/netfs/read_helper.c b/fs/netfs/read_helper.c index 8c58cff420ba..ca84918b6b5d 100644 --- a/fs/netfs/read_helper.c +++ b/fs/netfs/read_helper.c @@ -39,7 +39,7 @@ static void netfs_put_subrequest(struct netfs_read_subrequest *subreq, static struct netfs_read_request *netfs_alloc_read_request( const struct netfs_read_request_ops *ops, void *netfs_priv, - struct file *file) + struct inode *inode, struct file *file) { static atomic_t debug_ids; struct netfs_read_request *rreq; @@ -48,7 +48,7 @@ static struct netfs_read_request *netfs_alloc_read_request( if (rreq) { rreq->netfs_ops = ops; rreq->netfs_priv = netfs_priv; - rreq->inode = file_inode(file); + rreq->inode = inode; rreq->i_size = i_size_read(rreq->inode); rreq->debug_id = atomic_inc_return(&debug_ids); INIT_LIST_HEAD(&rreq->subrequests); @@ -870,6 +870,7 @@ void netfs_readahead(struct readahead_control *ractl, void *netfs_priv) { struct netfs_read_request *rreq; + struct inode *inode = file_inode(ractl->file); unsigned int debug_index = 0; int ret; @@ -878,7 +879,7 @@ void netfs_readahead(struct readahead_control *ractl, if (readahead_count(ractl) == 0) goto cleanup; - rreq = netfs_alloc_read_request(ops, netfs_priv, ractl->file); + rreq = netfs_alloc_read_request(ops, netfs_priv, inode, ractl->file); if (!rreq) goto cleanup; rreq->mapping = ractl->mapping; @@ -948,12 +949,13 @@ int netfs_readpage(struct file *file, void *netfs_priv) { struct netfs_read_request *rreq; + struct inode *inode = folio_file_mapping(folio)->host; unsigned int debug_index = 0; int ret; _enter("%lx", folio_index(folio)); - rreq = netfs_alloc_read_request(ops, netfs_priv, file); + rreq = netfs_alloc_read_request(ops, netfs_priv, inode, file); if (!rreq) { if (netfs_priv) ops->cleanup(folio_file_mapping(folio), netfs_priv); @@ -1122,7 +1124,7 @@ int netfs_write_begin(struct file *file, struct address_space *mapping, } ret = -ENOMEM; - rreq = netfs_alloc_read_request(ops, netfs_priv, file); + rreq = netfs_alloc_read_request(ops, netfs_priv, inode, file); if (!rreq) goto error; rreq->mapping = folio_file_mapping(folio); From patchwork Tue Jan 18 13:11:58 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Jingbo Xu X-Patchwork-Id: 12716314 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id B5BF4C43217 for ; Tue, 18 Jan 2022 13:12:28 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S241948AbiARNM0 (ORCPT ); Tue, 18 Jan 2022 08:12:26 -0500 Received: from out30-57.freemail.mail.aliyun.com ([115.124.30.57]:42379 "EHLO out30-57.freemail.mail.aliyun.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S241959AbiARNMW (ORCPT ); Tue, 18 Jan 2022 08:12:22 -0500 X-Alimail-AntiSpam: AC=PASS;BC=-1|-1;BR=01201311R421e4;CH=green;DM=||false|;DS=||;FP=0|-1|-1|-1|0|-1|-1|-1;HT=e01e04394;MF=jefflexu@linux.alibaba.com;NM=1;PH=DS;RN=12;SR=0;TI=SMTPD_---0V2C1QwT_1642511538; Received: from localhost(mailfrom:jefflexu@linux.alibaba.com fp:SMTPD_---0V2C1QwT_1642511538) by smtp.aliyun-inc.com(127.0.0.1); Tue, 18 Jan 2022 21:12:19 +0800 From: Jeffle Xu To: dhowells@redhat.com, linux-cachefs@redhat.com, xiang@kernel.org, chao@kernel.org, linux-erofs@lists.ozlabs.org Cc: 
linux-fsdevel@vger.kernel.org, joseph.qi@linux.alibaba.com, bo.liu@linux.alibaba.com, tao.peng@linux.alibaba.com, gerry@linux.alibaba.com, eguan@linux.alibaba.com, linux-kernel@vger.kernel.org Subject: [PATCH v2 02/20] netfs,cachefiles: manage logical/physical offset separately Date: Tue, 18 Jan 2022 21:11:58 +0800 Message-Id: <20220118131216.85338-3-jefflexu@linux.alibaba.com> X-Mailer: git-send-email 2.27.0 In-Reply-To: <20220118131216.85338-1-jefflexu@linux.alibaba.com> References: <20220118131216.85338-1-jefflexu@linux.alibaba.com> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-fsdevel@vger.kernel.org Currently fscache is used in a style that every file in upper fs has a corresponding backing file in fscache, and the file offset in the upper file (logical) is always equal to that in the backing file (physical). While upper fs may implement different backing strategy, the above assumption can no longer be valid, e.g. multiple upper files can be packed into one single backing file. Thus this patch abstracts these two different offsets and manage them separately, so that upper fs can implement different backing strategy. For the original users where these two offsets are always equal, no change is needed. While for the scenario where these two offsets can be different, upper fs can set a separate logical/physical offset in ops->begin_cache_operation() if it's needed. Signed-off-by: Jeffle Xu --- fs/cachefiles/io.c | 14 +++++++------- fs/netfs/read_helper.c | 16 ++++++++++++---- include/linux/netfs.h | 2 ++ 3 files changed, 21 insertions(+), 11 deletions(-) diff --git a/fs/cachefiles/io.c b/fs/cachefiles/io.c index 60b1eac2ce78..5da0bfd78188 100644 --- a/fs/cachefiles/io.c +++ b/fs/cachefiles/io.c @@ -370,7 +370,7 @@ static enum netfs_read_source cachefiles_prepare_read(struct netfs_read_subreque off = cachefiles_inject_read_error(); if (off == 0) - off = vfs_llseek(file, subreq->start, SEEK_DATA); + off = vfs_llseek(file, subreq->p_start, SEEK_DATA); if (off < 0 && off >= (loff_t)-MAX_ERRNO) { if (off == (loff_t)-ENXIO) { why = cachefiles_trace_read_seek_nxio; @@ -382,21 +382,21 @@ static enum netfs_read_source cachefiles_prepare_read(struct netfs_read_subreque goto out; } - if (off >= subreq->start + subreq->len) { + if (off >= subreq->p_start + subreq->len) { why = cachefiles_trace_read_found_hole; goto download_and_store; } - if (off > subreq->start) { + if (off > subreq->p_start) { off = round_up(off, cache->bsize); - subreq->len = off - subreq->start; + subreq->len = off - subreq->p_start; why = cachefiles_trace_read_found_part; goto download_and_store; } to = cachefiles_inject_read_error(); if (to == 0) - to = vfs_llseek(file, subreq->start, SEEK_HOLE); + to = vfs_llseek(file, subreq->p_start, SEEK_HOLE); if (to < 0 && to >= (loff_t)-MAX_ERRNO) { trace_cachefiles_io_error(object, file_inode(file), to, cachefiles_trace_seek_error); @@ -404,12 +404,12 @@ static enum netfs_read_source cachefiles_prepare_read(struct netfs_read_subreque goto out; } - if (to < subreq->start + subreq->len) { + if (to < subreq->p_start + subreq->len) { if (subreq->start + subreq->len >= i_size) to = round_up(to, cache->bsize); else to = round_down(to, cache->bsize); - subreq->len = to - subreq->start; + subreq->len = to - subreq->p_start; } why = cachefiles_trace_read_have_data; diff --git a/fs/netfs/read_helper.c b/fs/netfs/read_helper.c index ca84918b6b5d..077c0ca96612 100644 --- a/fs/netfs/read_helper.c +++ b/fs/netfs/read_helper.c @@ -181,7 +181,7 @@ static void 
netfs_read_from_cache(struct netfs_read_request *rreq, subreq->start + subreq->transferred, subreq->len - subreq->transferred); - cres->ops->read(cres, subreq->start, &iter, read_hole, + cres->ops->read(cres, subreq->p_start, &iter, read_hole, netfs_cache_read_terminated, subreq); } @@ -323,7 +323,7 @@ static void netfs_rreq_do_write_to_cache(struct netfs_read_request *rreq) netfs_put_subrequest(next, false); } - ret = cres->ops->prepare_write(cres, &subreq->start, &subreq->len, + ret = cres->ops->prepare_write(cres, &subreq->p_start, &subreq->len, rreq->i_size, true); if (ret < 0) { trace_netfs_failure(rreq, subreq, ret, netfs_fail_prepare_write); @@ -338,7 +338,7 @@ static void netfs_rreq_do_write_to_cache(struct netfs_read_request *rreq) netfs_stat(&netfs_n_rh_write); netfs_get_read_subrequest(subreq); trace_netfs_sreq(subreq, netfs_sreq_trace_write); - cres->ops->write(cres, subreq->start, &iter, + cres->ops->write(cres, subreq->p_start, &iter, netfs_rreq_copy_terminated, subreq); } @@ -760,6 +760,7 @@ static bool netfs_rreq_submit_slice(struct netfs_read_request *rreq, subreq->debug_index = (*_debug_index)++; subreq->start = rreq->start + rreq->submitted; + subreq->p_start = rreq->p_start + rreq->submitted; subreq->len = rreq->len - rreq->submitted; _debug("slice %llx,%zx,%zx", subreq->start, subreq->len, rreq->submitted); @@ -818,8 +819,12 @@ static void netfs_rreq_expand(struct netfs_read_request *rreq, { /* Give the cache a chance to change the request parameters. The * resultant request must contain the original region. + * Skip expanding if there may be multi-to-multi mapping between + * backing file and backed file. */ - netfs_cache_expand_readahead(rreq, &rreq->start, &rreq->len, rreq->i_size); + if (rreq->start == rreq->p_start) + netfs_cache_expand_readahead(rreq, &rreq->start, &rreq->len, + rreq->i_size); /* Give the netfs a chance to change the request parameters. The * resultant request must contain the original region. 
@@ -884,6 +889,7 @@ void netfs_readahead(struct readahead_control *ractl, goto cleanup; rreq->mapping = ractl->mapping; rreq->start = readahead_pos(ractl); + rreq->p_start = rreq->start; rreq->len = readahead_length(ractl); if (ops->begin_cache_operation) { @@ -964,6 +970,7 @@ int netfs_readpage(struct file *file, } rreq->mapping = folio_file_mapping(folio); rreq->start = folio_file_pos(folio); + rreq->p_start = rreq->start; rreq->len = folio_size(folio); if (ops->begin_cache_operation) { @@ -1129,6 +1136,7 @@ int netfs_write_begin(struct file *file, struct address_space *mapping, goto error; rreq->mapping = folio_file_mapping(folio); rreq->start = folio_file_pos(folio); + rreq->p_start = rreq->start; rreq->len = folio_size(folio); rreq->no_unlock_folio = folio_index(folio); __set_bit(NETFS_RREQ_NO_UNLOCK_FOLIO, &rreq->flags); diff --git a/include/linux/netfs.h b/include/linux/netfs.h index b46c39d98bbd..a17740b3b9d6 100644 --- a/include/linux/netfs.h +++ b/include/linux/netfs.h @@ -134,6 +134,7 @@ struct netfs_read_subrequest { struct netfs_read_request *rreq; /* Supervising read request */ struct list_head rreq_link; /* Link in rreq->subrequests */ loff_t start; /* Where to start the I/O */ + loff_t p_start; /* Start position of backing file */ size_t len; /* Size of the I/O */ size_t transferred; /* Amount of data transferred */ refcount_t usage; @@ -167,6 +168,7 @@ struct netfs_read_request { short error; /* 0 or error that occurred */ loff_t i_size; /* Size of the file */ loff_t start; /* Start position */ + loff_t p_start; /* Start position of backing file */ pgoff_t no_unlock_folio; /* Don't unlock this folio after read */ refcount_t usage; unsigned long flags; From patchwork Tue Jan 18 13:11:59 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Jingbo Xu X-Patchwork-Id: 12716313 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 25DC6C433F5 for ; Tue, 18 Jan 2022 13:12:27 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S242007AbiARNMZ (ORCPT ); Tue, 18 Jan 2022 08:12:25 -0500 Received: from out30-45.freemail.mail.aliyun.com ([115.124.30.45]:49075 "EHLO out30-45.freemail.mail.aliyun.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S242031AbiARNMY (ORCPT ); Tue, 18 Jan 2022 08:12:24 -0500 X-Alimail-AntiSpam: AC=PASS;BC=-1|-1;BR=01201311R201e4;CH=green;DM=||false|;DS=||;FP=0|-1|-1|-1|0|-1|-1|-1;HT=e01e04407;MF=jefflexu@linux.alibaba.com;NM=1;PH=DS;RN=12;SR=0;TI=SMTPD_---0V2C5CnK_1642511540; Received: from localhost(mailfrom:jefflexu@linux.alibaba.com fp:SMTPD_---0V2C5CnK_1642511540) by smtp.aliyun-inc.com(127.0.0.1); Tue, 18 Jan 2022 21:12:20 +0800 From: Jeffle Xu To: dhowells@redhat.com, linux-cachefs@redhat.com, xiang@kernel.org, chao@kernel.org, linux-erofs@lists.ozlabs.org Cc: linux-fsdevel@vger.kernel.org, joseph.qi@linux.alibaba.com, bo.liu@linux.alibaba.com, tao.peng@linux.alibaba.com, gerry@linux.alibaba.com, eguan@linux.alibaba.com, linux-kernel@vger.kernel.org Subject: [PATCH v2 03/20] netfs,fscache: support on-demand reading Date: Tue, 18 Jan 2022 21:11:59 +0800 Message-Id: <20220118131216.85338-4-jefflexu@linux.alibaba.com> X-Mailer: git-send-email 2.27.0 In-Reply-To: <20220118131216.85338-1-jefflexu@linux.alibaba.com> References: 
<20220118131216.85338-1-jefflexu@linux.alibaba.com> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-fsdevel@vger.kernel.org Add ondemand_read() callback to netfs_cache_ops to implement on-demand reading. The precondition for implementing on-demand reading semantic is that, all blob files have been placed under corresponding directory with correct file size (sparse files) on the first beginning. When upper fs starts to access the blob file, it will "cache miss" (hit the hole) and then .issue_op() callback will be called to prepare the data. The following working flow is described as below. The .issue_op() callback could be implemented by netfs_ondemand_read() helper, which will in turn call .ondemand_read() callback of corresponding fscache backend to prepare the data. The implementation of .ondemand_read() callback can be backend specific. The following patch will introduce an implementation of .ondemand_read() callback for cachefiles, which will notify user daemon the requested file range to read. The .ondemand_read() callback will get blocked until the user daemon has prepared the corresponding data. Then once .ondemand_read() callback returns with 0, it is guaranteed that the requested data has been ready. In this case, transform this IO request to NETFS_READ_FROM_CACHE state, initiate an incomplete completion and then retry to read from backing file. Signed-off-by: Jeffle Xu --- fs/fscache/Kconfig | 8 ++++++++ fs/netfs/Kconfig | 8 ++++++++ fs/netfs/read_helper.c | 37 +++++++++++++++++++++++++++++++++++++ include/linux/netfs.h | 8 ++++++++ 4 files changed, 61 insertions(+) diff --git a/fs/fscache/Kconfig b/fs/fscache/Kconfig index 76316c4a3fb7..f6b5396759ee 100644 --- a/fs/fscache/Kconfig +++ b/fs/fscache/Kconfig @@ -41,3 +41,11 @@ config FSCACHE_DEBUG config FSCACHE_OLD_API bool + +config FSCACHE_ONDEMAND + bool "Support for on-demand reading" + depends on FSCACHE + select NETFS_ONDEMAND + help + This permits on-demand reading with fscache. + If unsure, say N. diff --git a/fs/netfs/Kconfig b/fs/netfs/Kconfig index b4db21022cb4..c4bdd0b032dd 100644 --- a/fs/netfs/Kconfig +++ b/fs/netfs/Kconfig @@ -21,3 +21,11 @@ config NETFS_STATS multi-CPU system these may be on cachelines that keep bouncing between CPUs. On the other hand, the stats are very useful for debugging purposes. Saying 'Y' here is recommended. + +config NETFS_ONDEMAND + bool "Support for on-demand reading" + depends on NETFS_SUPPORT + default n + help + This enables on-demand reading with netfs API. + If unsure, say N. diff --git a/fs/netfs/read_helper.c b/fs/netfs/read_helper.c index 077c0ca96612..b84c184c365d 100644 --- a/fs/netfs/read_helper.c +++ b/fs/netfs/read_helper.c @@ -1013,6 +1013,43 @@ int netfs_readpage(struct file *file, } EXPORT_SYMBOL(netfs_readpage); +#ifdef CONFIG_NETFS_ONDEMAND +void netfs_ondemand_read(struct netfs_read_subrequest *subreq) +{ + struct netfs_read_request *rreq = subreq->rreq; + struct netfs_cache_resources *cres = &rreq->cache_resources; + loff_t start_pos; + size_t len; + int ret = -ENOBUFS; + + /* The cache backend may not be accessible at this moment. 
*/ + if (!cres->ops) + goto out; + + if (!cres->ops->ondemand_read) { + ret = -EOPNOTSUPP; + goto out; + } + + start_pos = subreq->p_start + subreq->transferred; + len = subreq->len - subreq->transferred; + + /* + * In success case (ret == 0), user daemon has prepared data for + * us, thus transform to NETFS_READ_FROM_CACHE state and + * advertise that 0 byte readed, so that the request will enter + * into INCOMPLETE state and retry to read from backing file. + */ + ret = cres->ops->ondemand_read(cres, start_pos, len); + if (!ret) { + subreq->source = NETFS_READ_FROM_CACHE; + __clear_bit(NETFS_SREQ_WRITE_TO_CACHE, &subreq->flags); + } +out: + netfs_subreq_terminated(subreq, ret, false); +} +#endif + /* * Prepare a folio for writing without reading first * @folio: The folio being prepared diff --git a/include/linux/netfs.h b/include/linux/netfs.h index a17740b3b9d6..d6e041293dcc 100644 --- a/include/linux/netfs.h +++ b/include/linux/netfs.h @@ -246,6 +246,11 @@ struct netfs_cache_ops { int (*prepare_write)(struct netfs_cache_resources *cres, loff_t *_start, size_t *_len, loff_t i_size, bool no_space_allocated_yet); + +#ifdef CONFIG_NETFS_ONDEMAND + int (*ondemand_read)(struct netfs_cache_resources *cres, + loff_t start_pos, size_t len); +#endif }; struct readahead_control; @@ -261,6 +266,9 @@ extern int netfs_write_begin(struct file *, struct address_space *, void **, const struct netfs_read_request_ops *, void *); +#ifdef CONFIG_NETFS_ONDEMAND +extern void netfs_ondemand_read(struct netfs_read_subrequest *); +#endif extern void netfs_subreq_terminated(struct netfs_read_subrequest *, ssize_t, bool); extern void netfs_stats_show(struct seq_file *); From patchwork Tue Jan 18 13:12:00 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Jingbo Xu X-Patchwork-Id: 12716329 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id E0E67C433F5 for ; Tue, 18 Jan 2022 13:13:28 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S242146AbiARNMa (ORCPT ); Tue, 18 Jan 2022 08:12:30 -0500 Received: from out30-44.freemail.mail.aliyun.com ([115.124.30.44]:54042 "EHLO out30-44.freemail.mail.aliyun.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S241972AbiARNMZ (ORCPT ); Tue, 18 Jan 2022 08:12:25 -0500 X-Alimail-AntiSpam: AC=PASS;BC=-1|-1;BR=01201311R541e4;CH=green;DM=||false|;DS=||;FP=0|-1|-1|-1|0|-1|-1|-1;HT=e01e04400;MF=jefflexu@linux.alibaba.com;NM=1;PH=DS;RN=12;SR=0;TI=SMTPD_---0V2CR7xO_1642511541; Received: from localhost(mailfrom:jefflexu@linux.alibaba.com fp:SMTPD_---0V2CR7xO_1642511541) by smtp.aliyun-inc.com(127.0.0.1); Tue, 18 Jan 2022 21:12:22 +0800 From: Jeffle Xu To: dhowells@redhat.com, linux-cachefs@redhat.com, xiang@kernel.org, chao@kernel.org, linux-erofs@lists.ozlabs.org Cc: linux-fsdevel@vger.kernel.org, joseph.qi@linux.alibaba.com, bo.liu@linux.alibaba.com, tao.peng@linux.alibaba.com, gerry@linux.alibaba.com, eguan@linux.alibaba.com, linux-kernel@vger.kernel.org Subject: [PATCH v2 04/20] cachefiles: extract generic daemon write function Date: Tue, 18 Jan 2022 21:12:00 +0800 Message-Id: <20220118131216.85338-5-jefflexu@linux.alibaba.com> X-Mailer: git-send-email 2.27.0 In-Reply-To: <20220118131216.85338-1-jefflexu@linux.alibaba.com> References: <20220118131216.85338-1-jefflexu@linux.alibaba.com> 
MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-fsdevel@vger.kernel.org ... so that the following new devnode can reuse most of the code when implementing its .write() callback. Signed-off-by: Jeffle Xu --- fs/cachefiles/daemon.c | 20 +++++++++++++++----- 1 file changed, 15 insertions(+), 5 deletions(-) diff --git a/fs/cachefiles/daemon.c b/fs/cachefiles/daemon.c index 7ac04ee2c0a0..aa2e5e354afb 100644 --- a/fs/cachefiles/daemon.c +++ b/fs/cachefiles/daemon.c @@ -209,10 +209,11 @@ static ssize_t cachefiles_daemon_read(struct file *file, char __user *_buffer, /* * Take a command from cachefilesd, parse it and act on it. */ -static ssize_t cachefiles_daemon_write(struct file *file, - const char __user *_data, - size_t datalen, - loff_t *pos) +static ssize_t cachefiles_daemon_do_write(struct file *file, + const char __user *_data, + size_t datalen, + loff_t *pos, + const struct cachefiles_daemon_cmd *cmds) { const struct cachefiles_daemon_cmd *cmd; struct cachefiles_cache *cache = file->private_data; @@ -261,7 +262,7 @@ static ssize_t cachefiles_daemon_write(struct file *file, } /* run the appropriate command handler */ - for (cmd = cachefiles_daemon_cmds; cmd->name[0]; cmd++) + for (cmd = cmds; cmd->name[0]; cmd++) if (strcmp(cmd->name, data) == 0) goto found_command; @@ -284,6 +285,15 @@ static ssize_t cachefiles_daemon_write(struct file *file, goto error; } +static ssize_t cachefiles_daemon_write(struct file *file, + const char __user *_data, + size_t datalen, + loff_t *pos) +{ + return cachefiles_daemon_do_write(file, _data, datalen, pos, + cachefiles_daemon_cmds); +} + /* * Poll for culling state * - use EPOLLOUT to indicate culling state From patchwork Tue Jan 18 13:12:01 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Jingbo Xu X-Patchwork-Id: 12716315 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 5572DC433EF for ; Tue, 18 Jan 2022 13:12:31 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S242097AbiARNM3 (ORCPT ); Tue, 18 Jan 2022 08:12:29 -0500 Received: from out30-130.freemail.mail.aliyun.com ([115.124.30.130]:60510 "EHLO out30-130.freemail.mail.aliyun.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S241978AbiARNMZ (ORCPT ); Tue, 18 Jan 2022 08:12:25 -0500 X-Alimail-AntiSpam: AC=PASS;BC=-1|-1;BR=01201311R491e4;CH=green;DM=||false|;DS=||;FP=0|-1|-1|-1|0|-1|-1|-1;HT=e01e04426;MF=jefflexu@linux.alibaba.com;NM=1;PH=DS;RN=12;SR=0;TI=SMTPD_---0V2C8e3U_1642511542; Received: from localhost(mailfrom:jefflexu@linux.alibaba.com fp:SMTPD_---0V2C8e3U_1642511542) by smtp.aliyun-inc.com(127.0.0.1); Tue, 18 Jan 2022 21:12:23 +0800 From: Jeffle Xu To: dhowells@redhat.com, linux-cachefs@redhat.com, xiang@kernel.org, chao@kernel.org, linux-erofs@lists.ozlabs.org Cc: linux-fsdevel@vger.kernel.org, joseph.qi@linux.alibaba.com, bo.liu@linux.alibaba.com, tao.peng@linux.alibaba.com, gerry@linux.alibaba.com, eguan@linux.alibaba.com, linux-kernel@vger.kernel.org Subject: [PATCH v2 05/20] cachefiles: detect backing file size in on-demand read mode Date: Tue, 18 Jan 2022 21:12:01 +0800 Message-Id: <20220118131216.85338-6-jefflexu@linux.alibaba.com> X-Mailer: git-send-email 2.27.0 In-Reply-To: <20220118131216.85338-1-jefflexu@linux.alibaba.com> References: 
<20220118131216.85338-1-jefflexu@linux.alibaba.com> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-fsdevel@vger.kernel.org fscache/cachefiles used to serve as a local cache for remote fs. The following patches will introduce a new use case, in which a local read-only fs could implement on-demand reading with fscache. In this case, the upper read-only fs may have no idea of the size of the backed file. Besides, it is worth noting that in this scenario the user daemon is responsible for preparing all backing files with the correct file size (the backing files are all sparse files in this case). And since it's read-only, we can trust the backing file size as the backed file size. With this precondition, cachefiles can detect the actual size of the backing file and set it as the size of the backed file. This patch also adds one flag bit to distinguish the newly introduced on-demand read mode from the original mode. A following patch will make it configurable by users. Signed-off-by: Jeffle Xu --- fs/cachefiles/Kconfig | 8 ++++++ fs/cachefiles/internal.h | 1 + fs/cachefiles/namei.c | 60 +++++++++++++++++++++++++++++++++++++++- 3 files changed, 68 insertions(+), 1 deletion(-) diff --git a/fs/cachefiles/Kconfig b/fs/cachefiles/Kconfig index 719faeeda168..0aaef4dd3866 100644 --- a/fs/cachefiles/Kconfig +++ b/fs/cachefiles/Kconfig @@ -26,3 +26,11 @@ config CACHEFILES_ERROR_INJECTION help This permits error injection to be enabled in cachefiles whilst a cache is in service. + +config CACHEFILES_ONDEMAND + bool "Support for on-demand reading" + depends on CACHEFILES && FSCACHE_ONDEMAND + default n + help + This permits on-demand read mode of cachefiles. + If unsure, say N. diff --git a/fs/cachefiles/internal.h b/fs/cachefiles/internal.h index 421423819d63..2bb441197106 100644 --- a/fs/cachefiles/internal.h +++ b/fs/cachefiles/internal.h @@ -98,6 +98,7 @@ struct cachefiles_cache { #define CACHEFILES_DEAD 1 /* T if cache dead */ #define CACHEFILES_CULLING 2 /* T if cull engaged */ #define CACHEFILES_STATE_CHANGED 3 /* T if state changed (poll trigger) */ +#define CACHEFILES_ONDEMAND_MODE 4 /* T if in on-demand read mode */ char *rootdirname; /* name of cache root directory */ char *secctx; /* LSM security context */ char *tag; /* cache binding tag */ diff --git a/fs/cachefiles/namei.c b/fs/cachefiles/namei.c index 9399153e1c99..1469f94cb229 100644 --- a/fs/cachefiles/namei.c +++ b/fs/cachefiles/namei.c @@ -506,15 +506,69 @@ struct file *cachefiles_create_tmpfile(struct cachefiles_object *object) return file; } +#ifdef CONFIG_CACHEFILES_ONDEMAND +static inline bool cachefiles_can_create_file(struct cachefiles_cache *cache) +{ + /* + * On-demand read mode requires that backing files have been prepared + * with correct file size under corresponding directory. We can get here + * when the backing file doesn't exist under corresponding directory, or + * the file size is unexpected 0. + */ + return !test_bit(CACHEFILES_ONDEMAND_MODE, &cache->flags); + +} + +/* + * Fs using fscache for on-demand reading may have no idea of the file size of + * backing files. Thus the on-demand read mode requires that backing files have + * been prepared with correct file size under corresponding directory. Then + * fscache backend is responsible for taking the file size of the backing file + * as the object size. 
+ */ +static int cachefiles_recheck_size(struct cachefiles_object *object, + struct file *file) +{ + loff_t size; + struct cachefiles_cache *cache = object->volume->cache; + + if (!test_bit(CACHEFILES_ONDEMAND_MODE, &cache->flags)) + return 0; + + size = i_size_read(file_inode(file)); + if (!size) + return -EINVAL; + + object->cookie->object_size = size; + return 0; +} +#else +static inline bool cachefiles_can_create_file(struct cachefiles_cache *cache) +{ + return true; +} + +static int cachefiles_recheck_size(struct cachefiles_object *object, + struct file *file) +{ + return 0; +} +#endif + + /* * Create a new file. */ static bool cachefiles_create_file(struct cachefiles_object *object) { + struct cachefiles_cache *cache = object->volume->cache; struct file *file; int ret; - ret = cachefiles_has_space(object->volume->cache, 1, 0, + if (!cachefiles_can_create_file(cache)) + return false; + + ret = cachefiles_has_space(cache, 1, 0, cachefiles_has_space_for_create); if (ret < 0) return false; @@ -569,6 +623,10 @@ static bool cachefiles_open_file(struct cachefiles_object *object, } _debug("file -> %pd positive", dentry); + ret = cachefiles_recheck_size(object, file); + if (ret < 0) + goto check_failed; + ret = cachefiles_check_auxdata(object, file); if (ret < 0) goto check_failed; From patchwork Tue Jan 18 13:12:02 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Jingbo Xu X-Patchwork-Id: 12716316 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id EF146C433EF for ; Tue, 18 Jan 2022 13:12:33 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S242205AbiARNMc (ORCPT ); Tue, 18 Jan 2022 08:12:32 -0500 Received: from out30-132.freemail.mail.aliyun.com ([115.124.30.132]:41040 "EHLO out30-132.freemail.mail.aliyun.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S242006AbiARNM2 (ORCPT ); Tue, 18 Jan 2022 08:12:28 -0500 X-Alimail-AntiSpam: AC=PASS;BC=-1|-1;BR=01201311R521e4;CH=green;DM=||false|;DS=||;FP=0|-1|-1|-1|0|-1|-1|-1;HT=e01e04407;MF=jefflexu@linux.alibaba.com;NM=1;PH=DS;RN=12;SR=0;TI=SMTPD_---0V2CR7xi_1642511543; Received: from localhost(mailfrom:jefflexu@linux.alibaba.com fp:SMTPD_---0V2CR7xi_1642511543) by smtp.aliyun-inc.com(127.0.0.1); Tue, 18 Jan 2022 21:12:24 +0800 From: Jeffle Xu To: dhowells@redhat.com, linux-cachefs@redhat.com, xiang@kernel.org, chao@kernel.org, linux-erofs@lists.ozlabs.org Cc: linux-fsdevel@vger.kernel.org, joseph.qi@linux.alibaba.com, bo.liu@linux.alibaba.com, tao.peng@linux.alibaba.com, gerry@linux.alibaba.com, eguan@linux.alibaba.com, linux-kernel@vger.kernel.org Subject: [PATCH v2 06/20] cachefiles: introduce new devnode for on-demand read mode Date: Tue, 18 Jan 2022 21:12:02 +0800 Message-Id: <20220118131216.85338-7-jefflexu@linux.alibaba.com> X-Mailer: git-send-email 2.27.0 In-Reply-To: <20220118131216.85338-1-jefflexu@linux.alibaba.com> References: <20220118131216.85338-1-jefflexu@linux.alibaba.com> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-fsdevel@vger.kernel.org This patch introduces a new devnode 'cachefiles_ondemand' to support the newly introduced on-demand read mode. 
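For concreteness, here is a minimal sketch of the user-daemon side of the request/completion protocol described in the steps below. It is illustrative only: the request layout mirrors struct cachefiles_req_in introduced by this patch, the cachefilesd-style setup commands ("dir", "tag", "bind") and the lookup of req.path under the cache directory are stubbed out, and error handling is omitted.

/* Hypothetical daemon loop for /dev/cachefiles_ondemand (sketch only). */
#include <fcntl.h>
#include <limits.h>
#include <poll.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

struct ondemand_req {		/* mirrors struct cachefiles_req_in */
	uint64_t id;
	uint64_t off;
	uint64_t len;
	char	 path[NAME_MAX];
};

int main(void)
{
	int dev = open("/dev/cachefiles_ondemand", O_RDWR);

	if (dev < 0) {
		perror("open");
		return 1;
	}
	/* the usual "dir", "tag", "bind" setup commands would be written here */

	for (;;) {
		struct pollfd pfd = { .fd = dev, .events = POLLIN };
		struct ondemand_req req;
		char done[32];

		if (poll(&pfd, 1, -1) <= 0)	/* step 2: wait for a request */
			continue;
		/* step 3: fetch one pending read request */
		if (read(dev, &req, sizeof(req)) != (ssize_t)sizeof(req))
			continue;

		/*
		 * Step 4 (stub): fetch req.len bytes at req.off, e.g. over the
		 * network, and write them into the backing file req.path,
		 * which the daemon is assumed to locate under its cache dir.
		 */
		fprintf(stderr, "fill %s @%llu +%llu\n", req.path,
			(unsigned long long)req.off,
			(unsigned long long)req.len);

		/* step 5: report completion; wakes the blocked reader */
		snprintf(done, sizeof(done), "done %llu",
			 (unsigned long long)req.id);
		write(dev, done, strlen(done));
	}
}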
The precondition for on-demand reading semantic is that, all blob files have been placed under corresponding directory with correct file size (sparse files) on the first beginning. When upper fs starts to access the blob file, it will "cache miss" (hit the hole) and then turn to user daemon for preparing the data. The interaction between kernel and user daemon is described as below. 1. Once cache miss, .ondemand_read() callback of corresponding fscache backend is called to prepare the data. As for cachefiles, it just packages related metadata (file range to read, etc.) into a pending read request, and then the process triggering cache miss will fall asleep until the corresponding data gets fetched later. 2. User daemon needs to poll on the devnode ('cachefiles_ondemand'), waiting for pending read request. 3. Once there's pending read request, user daemon will be notified and shall read the devnode ('cachefiles_ondemand') to fetch one pending read request to process. 4. For the fetched read request, user daemon need to somehow prepare the data (e.g. download from remote through network) and then write the fetched data into the backing file to fill the hole. 5. After that, user daemon need to notify cachefiles backend by writing a 'done' command to devnode ('cachefiles_ondemand'). It will also awake the previous asleep process triggering cache miss. 6. By the time the process gets awaken, the data has been ready in the backing file. Then fscache will re-initiate a read request from the backing file. Signed-off-by: Jeffle Xu --- fs/cachefiles/daemon.c | 127 +++++++++++++++++++++++++++++++++++++++ fs/cachefiles/internal.h | 22 +++++++ fs/cachefiles/io.c | 68 +++++++++++++++++++++ fs/cachefiles/main.c | 27 +++++++++ 4 files changed, 244 insertions(+) diff --git a/fs/cachefiles/daemon.c b/fs/cachefiles/daemon.c index aa2e5e354afb..7af3e17e04c8 100644 --- a/fs/cachefiles/daemon.c +++ b/fs/cachefiles/daemon.c @@ -108,6 +108,10 @@ static int cachefiles_daemon_open(struct inode *inode, struct file *file) INIT_LIST_HEAD(&cache->volumes); INIT_LIST_HEAD(&cache->object_list); spin_lock_init(&cache->object_list_lock); +#ifdef CONFIG_CACHEFILES_ONDEMAND + idr_init(&cache->reqs); + set_bit(CACHEFILES_ONDEMAND_MODE, &cache->flags); +#endif /* set default caching limits * - limit at 1% free space and/or free files @@ -142,6 +146,9 @@ static int cachefiles_daemon_release(struct inode *inode, struct file *file) cachefiles_daemon_unbind(cache); /* clean up the control file interface */ +#ifdef CONFIG_CACHEFILES_ONDEMAND + idr_destroy(&cache->reqs); +#endif cache->cachefilesd = NULL; file->private_data = NULL; cachefiles_open = 0; @@ -747,3 +754,123 @@ static void cachefiles_daemon_unbind(struct cachefiles_cache *cache) _leave(""); } + +#ifdef CONFIG_CACHEFILES_ONDEMAND +static ssize_t cachefiles_ondemand_write(struct file *, const char __user *, + size_t, loff_t *); +static ssize_t cachefiles_ondemand_read(struct file *, char __user *, size_t, + loff_t *); +static __poll_t cachefiles_ondemand_poll(struct file *, + struct poll_table_struct *); +static int cachefiles_daemon_done(struct cachefiles_cache *, char *); + +const struct file_operations cachefiles_ondemand_fops = { + .owner = THIS_MODULE, + .open = cachefiles_daemon_open, + .release = cachefiles_daemon_release, + .read = cachefiles_ondemand_read, + .write = cachefiles_ondemand_write, + .poll = cachefiles_ondemand_poll, + .llseek = noop_llseek, +}; + +static const struct cachefiles_daemon_cmd cachefiles_ondemand_cmds[] = { + { "bind", 
cachefiles_daemon_bind }, + { "brun", cachefiles_daemon_brun }, + { "bcull", cachefiles_daemon_bcull }, + { "bstop", cachefiles_daemon_bstop }, + { "cull", cachefiles_daemon_cull }, + { "debug", cachefiles_daemon_debug }, + { "dir", cachefiles_daemon_dir }, + { "frun", cachefiles_daemon_frun }, + { "fcull", cachefiles_daemon_fcull }, + { "fstop", cachefiles_daemon_fstop }, + { "inuse", cachefiles_daemon_inuse }, + { "secctx", cachefiles_daemon_secctx }, + { "tag", cachefiles_daemon_tag }, + { "done", cachefiles_daemon_done }, + { "", NULL } +}; + +static ssize_t cachefiles_ondemand_write(struct file *file, + const char __user *_data, + size_t datalen, + loff_t *pos) +{ + return cachefiles_daemon_do_write(file, _data, datalen, pos, + cachefiles_ondemand_cmds); +} + +static ssize_t cachefiles_ondemand_read(struct file *file, char __user *_buffer, + size_t buflen, loff_t *pos) +{ + struct cachefiles_cache *cache = file->private_data; + struct cachefiles_req *req; + int n, id = 0; + + if (!test_bit(CACHEFILES_READY, &cache->flags)) + return 0; + + idr_lock(&cache->reqs); + req = idr_get_next(&cache->reqs, &id); + idr_unlock(&cache->reqs); + if (!req) + return 0; + + n = sizeof(req->req_in); + if (n > buflen) + return -EMSGSIZE; + + if (copy_to_user(_buffer, &req->req_in, n) != 0) + return -EFAULT; + + return n; +} + +static __poll_t cachefiles_ondemand_poll(struct file *file, + struct poll_table_struct *poll) +{ + struct cachefiles_cache *cache = file->private_data; + __poll_t mask; + + poll_wait(file, &cache->daemon_pollwq, poll); + mask = 0; + + if (!idr_is_empty(&cache->reqs)) + mask |= EPOLLIN; + + return mask; +} + +/* + * Request completion + * - command: "done " + */ +static int cachefiles_daemon_done(struct cachefiles_cache *cache, char *args) +{ + unsigned long id; + int ret; + struct cachefiles_req *req; + + _enter(",%s", args); + + if (!*args) { + pr_err("Empty id specified\n"); + return -EINVAL; + } + + ret = kstrtoul(args, 0, &id); + if (ret) + return ret; + + idr_lock(&cache->reqs); + req = idr_remove(&cache->reqs, id); + idr_unlock(&cache->reqs); + if (!req) + return -EINVAL; + + complete(&req->done); + + return 0; +} +#endif diff --git a/fs/cachefiles/internal.h b/fs/cachefiles/internal.h index 2bb441197106..aa622b966802 100644 --- a/fs/cachefiles/internal.h +++ b/fs/cachefiles/internal.h @@ -15,6 +15,7 @@ #include #include #include +#include #define CACHEFILES_DIO_BLOCK_SIZE 4096 @@ -60,6 +61,20 @@ struct cachefiles_object { #define CACHEFILES_OBJECT_USING_TMPFILE 0 /* Have an unlinked tmpfile */ }; +#ifdef CONFIG_CACHEFILES_ONDEMAND +struct cachefiles_req_in { + uint64_t id; + uint64_t off; + uint64_t len; + char path[NAME_MAX]; +}; + +struct cachefiles_req { + struct completion done; + struct cachefiles_req_in req_in; +}; +#endif + /* * Cache files cache definition */ @@ -102,6 +117,10 @@ struct cachefiles_cache { char *rootdirname; /* name of cache root directory */ char *secctx; /* LSM security context */ char *tag; /* cache binding tag */ + +#ifdef CONFIG_CACHEFILES_ONDEMAND + struct idr reqs; +#endif }; #include @@ -146,6 +165,9 @@ extern int cachefiles_has_space(struct cachefiles_cache *cache, * daemon.c */ extern const struct file_operations cachefiles_daemon_fops; +#ifdef CONFIG_CACHEFILES_ONDEMAND +extern const struct file_operations cachefiles_ondemand_fops; +#endif /* * error_inject.c diff --git a/fs/cachefiles/io.c b/fs/cachefiles/io.c index 5da0bfd78188..f7418d02fde1 100644 --- a/fs/cachefiles/io.c +++ b/fs/cachefiles/io.c @@ -539,12 +539,80 @@ static void 
cachefiles_end_operation(struct netfs_cache_resources *cres) fscache_end_cookie_access(fscache_cres_cookie(cres), fscache_access_io_end); } +#ifdef CONFIG_CACHEFILES_ONDEMAND +static struct cachefiles_req *cachefiles_alloc_req(struct cachefiles_object *object, + loff_t start_pos, + size_t len) +{ + struct cachefiles_req *req; + struct cachefiles_req_in *req_in; + + req = kzalloc(sizeof(*req), GFP_KERNEL); + if (!req) + return NULL; + + req_in = &req->req_in; + + req_in->off = start_pos; + req_in->len = len; + strncpy(req_in->path, object->d_name, sizeof(req_in->path) - 1); + + init_completion(&req->done); + + return req; +} + +int cachefiles_ondemand_read(struct netfs_cache_resources *cres, + loff_t start_pos, size_t len) +{ + struct cachefiles_object *object; + struct cachefiles_cache *cache; + struct cachefiles_req *req; + int ret; + + object = cachefiles_cres_object(cres); + cache = object->volume->cache; + + if (!test_bit(CACHEFILES_ONDEMAND_MODE, &cache->flags)) + return -EOPNOTSUPP; + + req = cachefiles_alloc_req(object, start_pos, len); + if (!req) + return -ENOMEM; + + idr_preload(GFP_KERNEL); + idr_lock(&cache->reqs); + + ret = idr_alloc(&cache->reqs, req, 0, 0, GFP_ATOMIC); + if (ret >= 0) + req->req_in.id = ret; + + idr_unlock(&cache->reqs); + idr_preload_end(); + + if (ret < 0) { + kfree(req); + return -ENOMEM; + } + + wake_up_all(&cache->daemon_pollwq); + + wait_for_completion(&req->done); + kfree(req); + + return 0; +} +#endif + static const struct netfs_cache_ops cachefiles_netfs_cache_ops = { .end_operation = cachefiles_end_operation, .read = cachefiles_read, .write = cachefiles_write, .prepare_read = cachefiles_prepare_read, .prepare_write = cachefiles_prepare_write, +#ifdef CONFIG_CACHEFILES_ONDEMAND + .ondemand_read = cachefiles_ondemand_read, +#endif }; /* diff --git a/fs/cachefiles/main.c b/fs/cachefiles/main.c index 3f369c6f816d..eab17c3140d9 100644 --- a/fs/cachefiles/main.c +++ b/fs/cachefiles/main.c @@ -39,6 +39,27 @@ static struct miscdevice cachefiles_dev = { .fops = &cachefiles_daemon_fops, }; +#ifdef CONFIG_CACHEFILES_ONDEMAND +static struct miscdevice cachefiles_ondemand_dev = { + .minor = MISC_DYNAMIC_MINOR, + .name = "cachefiles_ondemand", + .fops = &cachefiles_ondemand_fops, +}; + +static inline int cachefiles_init_ondemand(void) +{ + return misc_register(&cachefiles_ondemand_dev); +} + +static inline void cachefiles_exit_ondemand(void) +{ + misc_deregister(&cachefiles_ondemand_dev); +} +#else +static inline int cachefiles_init_ondemand(void) { return 0; } +static inline void cachefiles_exit_ondemand(void) {} +#endif + /* * initialise the fs caching module */ @@ -52,6 +73,9 @@ static int __init cachefiles_init(void) ret = misc_register(&cachefiles_dev); if (ret < 0) goto error_dev; + ret = cachefiles_init_ondemand(); + if (ret < 0) + goto error_ondemand_dev; /* create an object jar */ ret = -ENOMEM; @@ -68,6 +92,8 @@ static int __init cachefiles_init(void) return 0; error_object_jar: + cachefiles_exit_ondemand(); +error_ondemand_dev: misc_deregister(&cachefiles_dev); error_dev: cachefiles_unregister_error_injection(); @@ -86,6 +112,7 @@ static void __exit cachefiles_exit(void) pr_info("Unloading\n"); kmem_cache_destroy(cachefiles_object_jar); + cachefiles_exit_ondemand(); misc_deregister(&cachefiles_dev); cachefiles_unregister_error_injection(); } From patchwork Tue Jan 18 13:12:03 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Jingbo Xu X-Patchwork-Id: 12716332 Return-Path: 
X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 5BEE1C43217 for ; Tue, 18 Jan 2022 13:13:31 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S242134AbiARNN3 (ORCPT ); Tue, 18 Jan 2022 08:13:29 -0500 Received: from out30-56.freemail.mail.aliyun.com ([115.124.30.56]:40432 "EHLO out30-56.freemail.mail.aliyun.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S242102AbiARNM3 (ORCPT ); Tue, 18 Jan 2022 08:12:29 -0500 X-Alimail-AntiSpam: AC=PASS;BC=-1|-1;BR=01201311R991e4;CH=green;DM=||false|;DS=||;FP=0|-1|-1|-1|0|-1|-1|-1;HT=e01e04426;MF=jefflexu@linux.alibaba.com;NM=1;PH=DS;RN=12;SR=0;TI=SMTPD_---0V2CR7xp_1642511544; Received: from localhost(mailfrom:jefflexu@linux.alibaba.com fp:SMTPD_---0V2CR7xp_1642511544) by smtp.aliyun-inc.com(127.0.0.1); Tue, 18 Jan 2022 21:12:25 +0800 From: Jeffle Xu To: dhowells@redhat.com, linux-cachefs@redhat.com, xiang@kernel.org, chao@kernel.org, linux-erofs@lists.ozlabs.org Cc: linux-fsdevel@vger.kernel.org, joseph.qi@linux.alibaba.com, bo.liu@linux.alibaba.com, tao.peng@linux.alibaba.com, gerry@linux.alibaba.com, eguan@linux.alibaba.com, linux-kernel@vger.kernel.org Subject: [PATCH v2 07/20] erofs: use meta buffers for erofs_read_superblock() Date: Tue, 18 Jan 2022 21:12:03 +0800 Message-Id: <20220118131216.85338-8-jefflexu@linux.alibaba.com> X-Mailer: git-send-email 2.27.0 In-Reply-To: <20220118131216.85338-1-jefflexu@linux.alibaba.com> References: <20220118131216.85338-1-jefflexu@linux.alibaba.com> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-fsdevel@vger.kernel.org The only change is that, meta buffers read cache page without __GFP_FS flag, which shall not matter. Signed-off-by: Jeffle Xu --- fs/erofs/super.c | 13 +++++-------- 1 file changed, 5 insertions(+), 8 deletions(-) diff --git a/fs/erofs/super.c b/fs/erofs/super.c index 915eefe0d7e2..12755217631f 100644 --- a/fs/erofs/super.c +++ b/fs/erofs/super.c @@ -281,21 +281,19 @@ static int erofs_init_devices(struct super_block *sb, static int erofs_read_superblock(struct super_block *sb) { struct erofs_sb_info *sbi; - struct page *page; + struct erofs_buf buf = __EROFS_BUF_INITIALIZER; struct erofs_super_block *dsb; unsigned int blkszbits; void *data; int ret; - page = read_mapping_page(sb->s_bdev->bd_inode->i_mapping, 0, NULL); - if (IS_ERR(page)) { + data = erofs_read_metabuf(&buf, sb, 0, EROFS_KMAP); + if (IS_ERR(data)) { erofs_err(sb, "cannot read erofs superblock"); - return PTR_ERR(page); + return PTR_ERR(data); } sbi = EROFS_SB(sb); - - data = kmap(page); dsb = (struct erofs_super_block *)(data + EROFS_SUPER_OFFSET); ret = -EINVAL; @@ -365,8 +363,7 @@ static int erofs_read_superblock(struct super_block *sb) if (erofs_sb_has_ztailpacking(sbi)) erofs_info(sb, "EXPERIMENTAL compressed inline data feature in use. 
Use at your own risk!"); out: - kunmap(page); - put_page(page); + erofs_put_metabuf(&buf); return ret; } From patchwork Tue Jan 18 13:12:04 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Jingbo Xu X-Patchwork-Id: 12716333 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 6BD52C433F5 for ; Tue, 18 Jan 2022 13:13:33 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S241938AbiARNNb (ORCPT ); Tue, 18 Jan 2022 08:13:31 -0500 Received: from out30-44.freemail.mail.aliyun.com ([115.124.30.44]:35728 "EHLO out30-44.freemail.mail.aliyun.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S242084AbiARNM3 (ORCPT ); Tue, 18 Jan 2022 08:12:29 -0500 X-Alimail-AntiSpam: AC=PASS;BC=-1|-1;BR=01201311R131e4;CH=green;DM=||false|;DS=||;FP=0|-1|-1|-1|0|-1|-1|-1;HT=e01e01424;MF=jefflexu@linux.alibaba.com;NM=1;PH=DS;RN=12;SR=0;TI=SMTPD_---0V2C1oxt_1642511546; Received: from localhost(mailfrom:jefflexu@linux.alibaba.com fp:SMTPD_---0V2C1oxt_1642511546) by smtp.aliyun-inc.com(127.0.0.1); Tue, 18 Jan 2022 21:12:26 +0800 From: Jeffle Xu To: dhowells@redhat.com, linux-cachefs@redhat.com, xiang@kernel.org, chao@kernel.org, linux-erofs@lists.ozlabs.org Cc: linux-fsdevel@vger.kernel.org, joseph.qi@linux.alibaba.com, bo.liu@linux.alibaba.com, tao.peng@linux.alibaba.com, gerry@linux.alibaba.com, eguan@linux.alibaba.com, linux-kernel@vger.kernel.org Subject: [PATCH v2 08/20] erofs: export erofs_map_blocks() Date: Tue, 18 Jan 2022 21:12:04 +0800 Message-Id: <20220118131216.85338-9-jefflexu@linux.alibaba.com> X-Mailer: git-send-email 2.27.0 In-Reply-To: <20220118131216.85338-1-jefflexu@linux.alibaba.com> References: <20220118131216.85338-1-jefflexu@linux.alibaba.com> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-fsdevel@vger.kernel.org ... so that it can be used in the following introduced fs/erofs/fscache.c. 
Signed-off-by: Jeffle Xu --- fs/erofs/data.c | 4 ++-- fs/erofs/internal.h | 2 ++ 2 files changed, 4 insertions(+), 2 deletions(-) diff --git a/fs/erofs/data.c b/fs/erofs/data.c index fa7ddb7ad980..f3aa133866e5 100644 --- a/fs/erofs/data.c +++ b/fs/erofs/data.c @@ -104,8 +104,8 @@ static int erofs_map_blocks_flatmode(struct inode *inode, return 0; } -static int erofs_map_blocks(struct inode *inode, - struct erofs_map_blocks *map, int flags) +int erofs_map_blocks(struct inode *inode, + struct erofs_map_blocks *map, int flags) { struct super_block *sb = inode->i_sb; struct erofs_inode *vi = EROFS_I(inode); diff --git a/fs/erofs/internal.h b/fs/erofs/internal.h index b8272fb95fd6..f9f94d63d40f 100644 --- a/fs/erofs/internal.h +++ b/fs/erofs/internal.h @@ -484,6 +484,8 @@ void *erofs_read_metabuf(struct erofs_buf *buf, struct super_block *sb, int erofs_map_dev(struct super_block *sb, struct erofs_map_dev *dev); int erofs_fiemap(struct inode *inode, struct fiemap_extent_info *fieinfo, u64 start, u64 len); +int erofs_map_blocks(struct inode *inode, + struct erofs_map_blocks *map, int flags); /* inode.c */ static inline unsigned long erofs_inode_hash(erofs_nid_t nid) From patchwork Tue Jan 18 13:12:05 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Jingbo Xu X-Patchwork-Id: 12716330 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 54E92C4332F for ; Tue, 18 Jan 2022 13:13:30 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S242956AbiARNN2 (ORCPT ); Tue, 18 Jan 2022 08:13:28 -0500 Received: from out30-44.freemail.mail.aliyun.com ([115.124.30.44]:34029 "EHLO out30-44.freemail.mail.aliyun.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S242134AbiARNMa (ORCPT ); Tue, 18 Jan 2022 08:12:30 -0500 X-Alimail-AntiSpam: AC=PASS;BC=-1|-1;BR=01201311R321e4;CH=green;DM=||false|;DS=||;FP=0|-1|-1|-1|0|-1|-1|-1;HT=e01e04423;MF=jefflexu@linux.alibaba.com;NM=1;PH=DS;RN=12;SR=0;TI=SMTPD_---0V2C5Co8_1642511547; Received: from localhost(mailfrom:jefflexu@linux.alibaba.com fp:SMTPD_---0V2C5Co8_1642511547) by smtp.aliyun-inc.com(127.0.0.1); Tue, 18 Jan 2022 21:12:28 +0800 From: Jeffle Xu To: dhowells@redhat.com, linux-cachefs@redhat.com, xiang@kernel.org, chao@kernel.org, linux-erofs@lists.ozlabs.org Cc: linux-fsdevel@vger.kernel.org, joseph.qi@linux.alibaba.com, bo.liu@linux.alibaba.com, tao.peng@linux.alibaba.com, gerry@linux.alibaba.com, eguan@linux.alibaba.com, linux-kernel@vger.kernel.org Subject: [PATCH v2 09/20] erofs: add mode checking helper Date: Tue, 18 Jan 2022 21:12:05 +0800 Message-Id: <20220118131216.85338-10-jefflexu@linux.alibaba.com> X-Mailer: git-send-email 2.27.0 In-Reply-To: <20220118131216.85338-1-jefflexu@linux.alibaba.com> References: <20220118131216.85338-1-jefflexu@linux.alibaba.com> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-fsdevel@vger.kernel.org So far erofs has been strictly a blockdev-based filesystem. In other usage scenarios (e.g. container images), erofs needs to run upon files. This patch set introduces a new nodev mode, in which erofs can be mounted from a bootstrap blob file containing a complete erofs image. Add a helper that checks which mode erofs works in. 
Signed-off-by: Jeffle Xu --- fs/erofs/internal.h | 5 +++++ 1 file changed, 5 insertions(+) diff --git a/fs/erofs/internal.h b/fs/erofs/internal.h index f9f94d63d40f..2b9337d385ce 100644 --- a/fs/erofs/internal.h +++ b/fs/erofs/internal.h @@ -161,6 +161,11 @@ struct erofs_sb_info { #define set_opt(opt, option) ((opt)->mount_opt |= EROFS_MOUNT_##option) #define test_opt(opt, option) ((opt)->mount_opt & EROFS_MOUNT_##option) +static inline bool erofs_bdev_mode(struct super_block *sb) +{ + return sb->s_bdev; +} + enum { EROFS_ZIP_CACHE_DISABLED, EROFS_ZIP_CACHE_READAHEAD, From patchwork Tue Jan 18 13:12:06 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Jingbo Xu X-Patchwork-Id: 12716317 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 75ED1C433EF for ; Tue, 18 Jan 2022 13:12:36 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S242280AbiARNMf (ORCPT ); Tue, 18 Jan 2022 08:12:35 -0500 Received: from out30-132.freemail.mail.aliyun.com ([115.124.30.132]:58583 "EHLO out30-132.freemail.mail.aliyun.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S242171AbiARNMc (ORCPT ); Tue, 18 Jan 2022 08:12:32 -0500 X-Alimail-AntiSpam: AC=PASS;BC=-1|-1;BR=01201311R411e4;CH=green;DM=||false|;DS=||;FP=0|-1|-1|-1|0|-1|-1|-1;HT=e01e04394;MF=jefflexu@linux.alibaba.com;NM=1;PH=DS;RN=12;SR=0;TI=SMTPD_---0V2C1Qxj_1642511548; Received: from localhost(mailfrom:jefflexu@linux.alibaba.com fp:SMTPD_---0V2C1Qxj_1642511548) by smtp.aliyun-inc.com(127.0.0.1); Tue, 18 Jan 2022 21:12:29 +0800 From: Jeffle Xu To: dhowells@redhat.com, linux-cachefs@redhat.com, xiang@kernel.org, chao@kernel.org, linux-erofs@lists.ozlabs.org Cc: linux-fsdevel@vger.kernel.org, joseph.qi@linux.alibaba.com, bo.liu@linux.alibaba.com, tao.peng@linux.alibaba.com, gerry@linux.alibaba.com, eguan@linux.alibaba.com, linux-kernel@vger.kernel.org Subject: [PATCH v2 10/20] erofs: register global fscache volume Date: Tue, 18 Jan 2022 21:12:06 +0800 Message-Id: <20220118131216.85338-11-jefflexu@linux.alibaba.com> X-Mailer: git-send-email 2.27.0 In-Reply-To: <20220118131216.85338-1-jefflexu@linux.alibaba.com> References: <20220118131216.85338-1-jefflexu@linux.alibaba.com> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-fsdevel@vger.kernel.org All erofs instances will share one global fscache volume. In this using scenario, one erofs instance could be mounted from one (or multiple) blob files instead of blkdev. The number of blob files that each erofs instance could correspond to is limited, since these blob files are quite large in size. For example, when used for container image distribution, one erofs instance used for container image for node.js will correspond to ~20 blob files in total. Thus in densely employed environment, there could be as many as hundreds of containers and thus thousands of fscache cookies under one fscache volume. Then as for cachefiles backend, the hash table managing all cookies under one volume contains 32K slots. Thus the hashing functionality shall scale well in this case. Besides, cachefiles backend will scatter backing files under 256 fan sub-directoris, and thus the scalability of looking up backing files shall also not be an issue. 
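As a rough worked example of the scalability argument above (the container count is illustrative, not taken from the series): 500 erofs instances with ~20 blob files each give about 500 * 20 = 10,000 cookies, i.e. a load factor of roughly 10,000 / 32,768 = 0.3 on the per-volume cookie hash table, and on average about 10,000 / 256 = 39 backing files per fan sub-directory on the cachefiles side.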
Signed-off-by: Jeffle Xu --- fs/erofs/Makefile | 3 ++- fs/erofs/fscache.c | 21 +++++++++++++++++++++ fs/erofs/internal.h | 5 +++++ fs/erofs/super.c | 7 +++++++ 4 files changed, 35 insertions(+), 1 deletion(-) create mode 100644 fs/erofs/fscache.c diff --git a/fs/erofs/Makefile b/fs/erofs/Makefile index 8a3317e38e5a..21999e8a4728 100644 --- a/fs/erofs/Makefile +++ b/fs/erofs/Makefile @@ -1,7 +1,8 @@ # SPDX-License-Identifier: GPL-2.0-only obj-$(CONFIG_EROFS_FS) += erofs.o -erofs-objs := super.o inode.o data.o namei.o dir.o utils.o pcpubuf.o sysfs.o +erofs-objs := super.o inode.o data.o namei.o dir.o utils.o pcpubuf.o sysfs.o \ + fscache.o erofs-$(CONFIG_EROFS_FS_XATTR) += xattr.o erofs-$(CONFIG_EROFS_FS_ZIP) += decompressor.o zmap.o zdata.o erofs-$(CONFIG_EROFS_FS_ZIP_LZMA) += decompressor_lzma.o diff --git a/fs/erofs/fscache.c b/fs/erofs/fscache.c new file mode 100644 index 000000000000..9c32f42e1056 --- /dev/null +++ b/fs/erofs/fscache.c @@ -0,0 +1,21 @@ +// SPDX-License-Identifier: GPL-2.0-or-later +/* + * Copyright (C) 2021, Alibaba Cloud + */ +#include "internal.h" + +static struct fscache_volume *volume; + +int __init erofs_init_fscache(void) +{ + volume = fscache_acquire_volume("erofs", NULL, NULL, 0); + if (!volume) + return -EINVAL; + + return 0; +} + +void erofs_exit_fscache(void) +{ + fscache_relinquish_volume(volume, NULL, false); +} diff --git a/fs/erofs/internal.h b/fs/erofs/internal.h index 2b9337d385ce..c2608a469107 100644 --- a/fs/erofs/internal.h +++ b/fs/erofs/internal.h @@ -17,6 +17,7 @@ #include #include #include +#include #include "erofs_fs.h" /* redefine pr_fmt "erofs: " */ @@ -616,6 +617,10 @@ static inline int z_erofs_load_lzma_config(struct super_block *sb, } #endif /* !CONFIG_EROFS_FS_ZIP */ +/* fscache.c */ +int erofs_init_fscache(void); +void erofs_exit_fscache(void); + #define EFSCORRUPTED EUCLEAN /* Filesystem is corrupted */ #endif /* __EROFS_INTERNAL_H */ diff --git a/fs/erofs/super.c b/fs/erofs/super.c index 12755217631f..798f0c379e35 100644 --- a/fs/erofs/super.c +++ b/fs/erofs/super.c @@ -814,6 +814,10 @@ static int __init erofs_module_init(void) if (err) goto sysfs_err; + err = erofs_init_fscache(); + if (err) + goto fscache_err; + err = register_filesystem(&erofs_fs_type); if (err) goto fs_err; @@ -821,6 +825,8 @@ static int __init erofs_module_init(void) return 0; fs_err: + erofs_exit_fscache(); +fscache_err: erofs_exit_sysfs(); sysfs_err: z_erofs_exit_zip_subsystem(); @@ -841,6 +847,7 @@ static void __exit erofs_module_exit(void) /* Ensure all RCU free inodes / pclusters are safe to be destroyed. 
*/ rcu_barrier(); + erofs_exit_fscache(); erofs_exit_sysfs(); z_erofs_exit_zip_subsystem(); z_erofs_lzma_exit(); From patchwork Tue Jan 18 13:12:07 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Jingbo Xu X-Patchwork-Id: 12716318 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 7DBA2C433FE for ; Tue, 18 Jan 2022 13:12:37 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S242306AbiARNMf (ORCPT ); Tue, 18 Jan 2022 08:12:35 -0500 Received: from out30-133.freemail.mail.aliyun.com ([115.124.30.133]:44644 "EHLO out30-133.freemail.mail.aliyun.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S242187AbiARNMc (ORCPT ); Tue, 18 Jan 2022 08:12:32 -0500 X-Alimail-AntiSpam: AC=PASS;BC=-1|-1;BR=01201311R101e4;CH=green;DM=||false|;DS=||;FP=0|-1|-1|-1|0|-1|-1|-1;HT=e01e04423;MF=jefflexu@linux.alibaba.com;NM=1;PH=DS;RN=12;SR=0;TI=SMTPD_---0V2C2ax2_1642511549; Received: from localhost(mailfrom:jefflexu@linux.alibaba.com fp:SMTPD_---0V2C2ax2_1642511549) by smtp.aliyun-inc.com(127.0.0.1); Tue, 18 Jan 2022 21:12:30 +0800 From: Jeffle Xu To: dhowells@redhat.com, linux-cachefs@redhat.com, xiang@kernel.org, chao@kernel.org, linux-erofs@lists.ozlabs.org Cc: linux-fsdevel@vger.kernel.org, joseph.qi@linux.alibaba.com, bo.liu@linux.alibaba.com, tao.peng@linux.alibaba.com, gerry@linux.alibaba.com, eguan@linux.alibaba.com, linux-kernel@vger.kernel.org Subject: [PATCH v2 11/20] erofs: add cookie context helper functions Date: Tue, 18 Jan 2022 21:12:07 +0800 Message-Id: <20220118131216.85338-12-jefflexu@linux.alibaba.com> X-Mailer: git-send-email 2.27.0 In-Reply-To: <20220118131216.85338-1-jefflexu@linux.alibaba.com> References: <20220118131216.85338-1-jefflexu@linux.alibaba.com> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-fsdevel@vger.kernel.org Introduce 'struct erofs_cookie_ctx' for managing cookie for backing file, and the following introduced API for reading from backing file. Besides, introduce two helper functions for initializing and cleaning up erofs_cookie_ctx. struct erofs_cookie_ctx * erofs_fscache_get_ctx(struct super_block *sb, char *path); void erofs_fscache_put_ctx(struct erofs_cookie_ctx *ctx); Signed-off-by: Jeffle Xu --- fs/erofs/fscache.c | 78 +++++++++++++++++++++++++++++++++++++++++++++ fs/erofs/internal.h | 8 +++++ 2 files changed, 86 insertions(+) diff --git a/fs/erofs/fscache.c b/fs/erofs/fscache.c index 9c32f42e1056..10c3f5ea9e24 100644 --- a/fs/erofs/fscache.c +++ b/fs/erofs/fscache.c @@ -6,6 +6,84 @@ static struct fscache_volume *volume; +static int erofs_fscache_init_cookie(struct erofs_fscache_context *ctx, + char *path) +{ + struct fscache_cookie *cookie; + + /* + * @object_size shall be non-zero to avoid + * FSCACHE_COOKIE_NO_DATA_TO_READ. 
+ */ + cookie = fscache_acquire_cookie(volume, 0, + path, strlen(path), + NULL, 0, -1); + if (!cookie) + return -EINVAL; + + fscache_use_cookie(cookie, false); + ctx->cookie = cookie; + return 0; +} + +static inline +void erofs_fscache_cleanup_cookie(struct erofs_fscache_context *ctx) +{ + struct fscache_cookie *cookie = ctx->cookie; + + fscache_unuse_cookie(cookie, NULL, NULL); + fscache_relinquish_cookie(cookie, false); + ctx->cookie = NULL; +} + +static int erofs_fscahce_init_ctx(struct erofs_fscache_context *ctx, + struct super_block *sb, char *path) +{ + int ret; + + ret = erofs_fscache_init_cookie(ctx, path); + if (ret) { + erofs_err(sb, "failed to init cookie"); + return ret; + } + + return 0; +} + +static inline +void erofs_fscache_cleanup_ctx(struct erofs_fscache_context *ctx) +{ + erofs_fscache_cleanup_cookie(ctx); +} + +struct erofs_fscache_context *erofs_fscache_get_ctx(struct super_block *sb, + char *path) +{ + struct erofs_fscache_context *ctx; + int ret; + + ctx = kzalloc(sizeof(*ctx), GFP_KERNEL); + if (!ctx) + return ERR_PTR(-ENOMEM); + + ret = erofs_fscahce_init_ctx(ctx, sb, path); + if (ret) { + kfree(ctx); + return ERR_PTR(ret); + } + + return ctx; +} + +void erofs_fscache_put_ctx(struct erofs_fscache_context *ctx) +{ + if (!ctx) + return; + + erofs_fscache_cleanup_ctx(ctx); + kfree(ctx); +} + int __init erofs_init_fscache(void) { volume = fscache_acquire_volume("erofs", NULL, NULL, 0); diff --git a/fs/erofs/internal.h b/fs/erofs/internal.h index c2608a469107..1f5bc69e8e9f 100644 --- a/fs/erofs/internal.h +++ b/fs/erofs/internal.h @@ -97,6 +97,10 @@ struct erofs_sb_lz4_info { u16 max_pclusterblks; }; +struct erofs_fscache_context { + struct fscache_cookie *cookie; +}; + struct erofs_sb_info { struct erofs_mount_opts opt; /* options */ #ifdef CONFIG_EROFS_FS_ZIP @@ -621,6 +625,10 @@ static inline int z_erofs_load_lzma_config(struct super_block *sb, int erofs_init_fscache(void); void erofs_exit_fscache(void); +struct erofs_fscache_context *erofs_fscache_get_ctx(struct super_block *sb, + char *path); +void erofs_fscache_put_ctx(struct erofs_fscache_context *ctx); + #define EFSCORRUPTED EUCLEAN /* Filesystem is corrupted */ #endif /* __EROFS_INTERNAL_H */ From patchwork Tue Jan 18 13:12:08 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Jingbo Xu X-Patchwork-Id: 12716326 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 95F46C4332F for ; Tue, 18 Jan 2022 13:13:12 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S243282AbiARNNL (ORCPT ); Tue, 18 Jan 2022 08:13:11 -0500 Received: from out30-57.freemail.mail.aliyun.com ([115.124.30.57]:59123 "EHLO out30-57.freemail.mail.aliyun.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S242231AbiARNMd (ORCPT ); Tue, 18 Jan 2022 08:12:33 -0500 X-Alimail-AntiSpam: AC=PASS;BC=-1|-1;BR=01201311R121e4;CH=green;DM=||false|;DS=||;FP=0|-1|-1|-1|0|-1|-1|-1;HT=e01e04400;MF=jefflexu@linux.alibaba.com;NM=1;PH=DS;RN=12;SR=0;TI=SMTPD_---0V2C5CoQ_1642511550; Received: from localhost(mailfrom:jefflexu@linux.alibaba.com fp:SMTPD_---0V2C5CoQ_1642511550) by smtp.aliyun-inc.com(127.0.0.1); Tue, 18 Jan 2022 21:12:31 +0800 From: Jeffle Xu To: dhowells@redhat.com, linux-cachefs@redhat.com, xiang@kernel.org, chao@kernel.org, linux-erofs@lists.ozlabs.org Cc: 
linux-fsdevel@vger.kernel.org, joseph.qi@linux.alibaba.com, bo.liu@linux.alibaba.com, tao.peng@linux.alibaba.com, gerry@linux.alibaba.com, eguan@linux.alibaba.com, linux-kernel@vger.kernel.org Subject: [PATCH v2 12/20] erofs: add anonymous inode managing page cache of blob file Date: Tue, 18 Jan 2022 21:12:08 +0800 Message-Id: <20220118131216.85338-13-jefflexu@linux.alibaba.com> X-Mailer: git-send-email 2.27.0 In-Reply-To: <20220118131216.85338-1-jefflexu@linux.alibaba.com> References: <20220118131216.85338-1-jefflexu@linux.alibaba.com> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-fsdevel@vger.kernel.org Introduce one anonymous inode for managing page cache of corresponding blob file. Then erofs could read directly from the address space of the anonymous inode when cache hit. Signed-off-by: Jeffle Xu --- fs/erofs/fscache.c | 45 ++++++++++++++++++++++++++++++++++++++++++--- fs/erofs/internal.h | 3 ++- 2 files changed, 44 insertions(+), 4 deletions(-) diff --git a/fs/erofs/fscache.c b/fs/erofs/fscache.c index 10c3f5ea9e24..74683df6144d 100644 --- a/fs/erofs/fscache.c +++ b/fs/erofs/fscache.c @@ -6,6 +6,9 @@ static struct fscache_volume *volume; +static const struct address_space_operations erofs_fscache_blob_aops = { +}; + static int erofs_fscache_init_cookie(struct erofs_fscache_context *ctx, char *path) { @@ -36,8 +39,34 @@ void erofs_fscache_cleanup_cookie(struct erofs_fscache_context *ctx) ctx->cookie = NULL; } +static int erofs_fscache_get_inode(struct erofs_fscache_context *ctx, + struct super_block *sb) +{ + struct inode *const inode = new_inode(sb); + + if (!inode) + return -ENOMEM; + + set_nlink(inode, 1); + inode->i_size = OFFSET_MAX; + + inode->i_mapping->a_ops = &erofs_fscache_blob_aops; + mapping_set_gfp_mask(inode->i_mapping, + GFP_NOFS | __GFP_HIGHMEM | __GFP_MOVABLE); + ctx->inode = inode; + return 0; +} + +static inline +void erofs_fscache_put_inode(struct erofs_fscache_context *ctx) +{ + iput(ctx->inode); + ctx->inode = NULL; +} + static int erofs_fscahce_init_ctx(struct erofs_fscache_context *ctx, - struct super_block *sb, char *path) + struct super_block *sb, char *path, + bool need_inode) { int ret; @@ -47,6 +76,15 @@ static int erofs_fscahce_init_ctx(struct erofs_fscache_context *ctx, return ret; } + if (need_inode) { + ret = erofs_fscache_get_inode(ctx, sb); + if (ret) { + erofs_err(sb, "failed to get anonymous inode"); + erofs_fscache_cleanup_cookie(ctx); + return ret; + } + } + return 0; } @@ -54,10 +92,11 @@ static inline void erofs_fscache_cleanup_ctx(struct erofs_fscache_context *ctx) { erofs_fscache_cleanup_cookie(ctx); + erofs_fscache_put_inode(ctx); } struct erofs_fscache_context *erofs_fscache_get_ctx(struct super_block *sb, - char *path) + char *path, bool need_inode) { struct erofs_fscache_context *ctx; int ret; @@ -66,7 +105,7 @@ struct erofs_fscache_context *erofs_fscache_get_ctx(struct super_block *sb, if (!ctx) return ERR_PTR(-ENOMEM); - ret = erofs_fscahce_init_ctx(ctx, sb, path); + ret = erofs_fscahce_init_ctx(ctx, sb, path, need_inode); if (ret) { kfree(ctx); return ERR_PTR(ret); diff --git a/fs/erofs/internal.h b/fs/erofs/internal.h index 1f5bc69e8e9f..bb5e992fe0df 100644 --- a/fs/erofs/internal.h +++ b/fs/erofs/internal.h @@ -99,6 +99,7 @@ struct erofs_sb_lz4_info { struct erofs_fscache_context { struct fscache_cookie *cookie; + struct inode *inode; }; struct erofs_sb_info { @@ -626,7 +627,7 @@ int erofs_init_fscache(void); void erofs_exit_fscache(void); struct erofs_fscache_context *erofs_fscache_get_ctx(struct super_block *sb, 
- char *path); + char *path, bool need_inode); void erofs_fscache_put_ctx(struct erofs_fscache_context *ctx); #define EFSCORRUPTED EUCLEAN /* Filesystem is corrupted */
From patchwork Tue Jan 18 13:12:09 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Jingbo Xu X-Patchwork-Id: 12716323 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 76BC4C433FE for ; Tue, 18 Jan 2022 13:13:08 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S243154AbiARNNC (ORCPT ); Tue, 18 Jan 2022 08:13:02 -0500 Received: from out30-42.freemail.mail.aliyun.com ([115.124.30.42]:40489 "EHLO out30-42.freemail.mail.aliyun.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S242285AbiARNMf (ORCPT ); Tue, 18 Jan 2022 08:12:35 -0500 X-Alimail-AntiSpam: AC=PASS;BC=-1|-1;BR=01201311R861e4;CH=green;DM=||false|;DS=||;FP=0|-1|-1|-1|0|-1|-1|-1;HT=e01e04426;MF=jefflexu@linux.alibaba.com;NM=1;PH=DS;RN=12;SR=0;TI=SMTPD_---0V2C1oyQ_1642511552; Received: from localhost(mailfrom:jefflexu@linux.alibaba.com fp:SMTPD_---0V2C1oyQ_1642511552) by smtp.aliyun-inc.com(127.0.0.1); Tue, 18 Jan 2022 21:12:32 +0800 From: Jeffle Xu To: dhowells@redhat.com, linux-cachefs@redhat.com, xiang@kernel.org, chao@kernel.org, linux-erofs@lists.ozlabs.org Cc: linux-fsdevel@vger.kernel.org, joseph.qi@linux.alibaba.com, bo.liu@linux.alibaba.com, tao.peng@linux.alibaba.com, gerry@linux.alibaba.com, eguan@linux.alibaba.com, linux-kernel@vger.kernel.org Subject: [PATCH v2 13/20] erofs: register cookie context for bootstrap blob Date: Tue, 18 Jan 2022 21:12:09 +0800 Message-Id: <20220118131216.85338-14-jefflexu@linux.alibaba.com> X-Mailer: git-send-email 2.27.0 In-Reply-To: <20220118131216.85338-1-jefflexu@linux.alibaba.com> References: <20220118131216.85338-1-jefflexu@linux.alibaba.com> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-fsdevel@vger.kernel.org Register a cookie context for the bootstrap blob file. The bootstrap blob file can be specified by a new mount option, which will be introduced in a following patch. Two points about the cleanup routine are worth mentioning: 1. The init routine runs before the root inode gets initialized, and thus the corresponding cleanup routine shall be placed in the .kill_sb() callback. 2. The init routine will instantiate anonymous inodes under the super_block, and thus the .put_super() callback shall also contain the cleanup routine. Otherwise we'll get a "VFS: Busy inodes after unmount." warning.
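A condensed sketch of the lifecycle described in points 1 and 2 above, paraphrasing this patch's super.c hunks (error handling trimmed; the erofs_nodev_put_super()/erofs_nodev_kill_sb() names are illustrative, the real patch extends erofs_put_super() and erofs_kill_sb()):

static void erofs_nodev_put_super(struct super_block *sb)
{
	struct erofs_sb_info *sbi = EROFS_SB(sb);

	/*
	 * Drop the context here so the anonymous inode is iput() before
	 * generic_shutdown_super() performs its busy-inode check, avoiding
	 * the "VFS: Busy inodes after unmount." warning.
	 */
	erofs_fscache_put_ctx(sbi->bootstrap);
	sbi->bootstrap = NULL;
}

static void erofs_nodev_kill_sb(struct super_block *sb)
{
	struct erofs_sb_info *sbi = EROFS_SB(sb);

	generic_shutdown_super(sb);
	/*
	 * Also drop the context here: if fill_super failed early, there was
	 * no root dentry and ->put_super() never ran.  After a normal
	 * unmount sbi->bootstrap is already NULL and
	 * erofs_fscache_put_ctx(NULL) is a no-op.
	 */
	if (sbi)
		erofs_fscache_put_ctx(sbi->bootstrap);
}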
Signed-off-by: Jeffle Xu --- fs/erofs/internal.h | 3 +++ fs/erofs/super.c | 13 +++++++++++++ 2 files changed, 16 insertions(+) diff --git a/fs/erofs/internal.h b/fs/erofs/internal.h index bb5e992fe0df..277dcd5888ea 100644 --- a/fs/erofs/internal.h +++ b/fs/erofs/internal.h @@ -75,6 +75,7 @@ struct erofs_mount_opts { unsigned int max_sync_decompress_pages; #endif unsigned int mount_opt; + char *uuid; }; struct erofs_dev_context { @@ -152,6 +153,8 @@ struct erofs_sb_info { /* sysfs support */ struct kobject s_kobj; /* /sys/fs/erofs/ */ struct completion s_kobj_unregister; + + struct erofs_fscache_context *bootstrap; }; #define EROFS_SB(sb) ((struct erofs_sb_info *)(sb)->s_fs_info) diff --git a/fs/erofs/super.c b/fs/erofs/super.c index 798f0c379e35..8c5783c6f71f 100644 --- a/fs/erofs/super.c +++ b/fs/erofs/super.c @@ -598,6 +598,16 @@ static int erofs_fc_fill_super(struct super_block *sb, struct fs_context *fc) sbi->devs = ctx->devs; ctx->devs = NULL; + if (!erofs_bdev_mode(sb)) { + struct erofs_fscache_context *bootstrap; + + bootstrap = erofs_fscache_get_ctx(sb, ctx->opt.uuid, true); + if (IS_ERR(bootstrap)) + return PTR_ERR(bootstrap); + + sbi->bootstrap = bootstrap; + } + err = erofs_read_superblock(sb); if (err) return err; @@ -753,6 +763,7 @@ static void erofs_kill_sb(struct super_block *sb) return; erofs_free_dev_context(sbi->devs); + erofs_fscache_put_ctx(sbi->bootstrap); fs_put_dax(sbi->dax_dev); kfree(sbi); sb->s_fs_info = NULL; @@ -771,6 +782,8 @@ static void erofs_put_super(struct super_block *sb) iput(sbi->managed_cache); sbi->managed_cache = NULL; #endif + erofs_fscache_put_ctx(sbi->bootstrap); + sbi->bootstrap = NULL; } static struct file_system_type erofs_fs_type = { From patchwork Tue Jan 18 13:12:10 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Jingbo Xu X-Patchwork-Id: 12716324 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 52636C4332F for ; Tue, 18 Jan 2022 13:13:09 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S243208AbiARNNG (ORCPT ); Tue, 18 Jan 2022 08:13:06 -0500 Received: from out30-133.freemail.mail.aliyun.com ([115.124.30.133]:33659 "EHLO out30-133.freemail.mail.aliyun.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S242296AbiARNMg (ORCPT ); Tue, 18 Jan 2022 08:12:36 -0500 X-Alimail-AntiSpam: AC=PASS;BC=-1|-1;BR=01201311R131e4;CH=green;DM=||false|;DS=||;FP=0|-1|-1|-1|0|-1|-1|-1;HT=e01e04423;MF=jefflexu@linux.alibaba.com;NM=1;PH=DS;RN=12;SR=0;TI=SMTPD_---0V2C1oyZ_1642511553; Received: from localhost(mailfrom:jefflexu@linux.alibaba.com fp:SMTPD_---0V2C1oyZ_1642511553) by smtp.aliyun-inc.com(127.0.0.1); Tue, 18 Jan 2022 21:12:33 +0800 From: Jeffle Xu To: dhowells@redhat.com, linux-cachefs@redhat.com, xiang@kernel.org, chao@kernel.org, linux-erofs@lists.ozlabs.org Cc: linux-fsdevel@vger.kernel.org, joseph.qi@linux.alibaba.com, bo.liu@linux.alibaba.com, tao.peng@linux.alibaba.com, gerry@linux.alibaba.com, eguan@linux.alibaba.com, linux-kernel@vger.kernel.org Subject: [PATCH v2 14/20] erofs: implement fscache-based metadata read Date: Tue, 18 Jan 2022 21:12:10 +0800 Message-Id: <20220118131216.85338-15-jefflexu@linux.alibaba.com> X-Mailer: git-send-email 2.27.0 In-Reply-To: <20220118131216.85338-1-jefflexu@linux.alibaba.com> References: 
<20220118131216.85338-1-jefflexu@linux.alibaba.com> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-fsdevel@vger.kernel.org This patch implements the data plane of reading metadata from bootstrap blob file over fscache. Be noted that currently it only supports the scenario where the backing file has no hole. Once it hits a hole of the backing file, erofs will fail the IO with -EOPNOTSUPP for now. The following patch will fix this issue, i.e. implementing the demand reading mode. Signed-off-by: Jeffle Xu --- fs/erofs/data.c | 11 +++++++++-- fs/erofs/fscache.c | 33 +++++++++++++++++++++++++++++++++ fs/erofs/internal.h | 3 +++ 3 files changed, 45 insertions(+), 2 deletions(-) diff --git a/fs/erofs/data.c b/fs/erofs/data.c index f3aa133866e5..51ccbc02dd73 100644 --- a/fs/erofs/data.c +++ b/fs/erofs/data.c @@ -31,15 +31,22 @@ void erofs_put_metabuf(struct erofs_buf *buf) void *erofs_read_metabuf(struct erofs_buf *buf, struct super_block *sb, erofs_blk_t blkaddr, enum erofs_kmap_type type) { - struct address_space *const mapping = sb->s_bdev->bd_inode->i_mapping; + struct address_space *mapping; + struct erofs_sb_info *sbi = EROFS_SB(sb); erofs_off_t offset = blknr_to_addr(blkaddr); pgoff_t index = offset >> PAGE_SHIFT; struct page *page = buf->page; if (!page || page->index != index) { erofs_put_metabuf(buf); - page = read_cache_page_gfp(mapping, index, + if (erofs_bdev_mode(sb)) { + mapping = sb->s_bdev->bd_inode->i_mapping; + page = read_cache_page_gfp(mapping, index, mapping_gfp_constraint(mapping, ~__GFP_FS)); + } else { + page = erofs_fscache_read_cache_page(sbi->bootstrap, + index); + } if (IS_ERR(page)) return page; /* should already be PageUptodate, no need to lock page */ diff --git a/fs/erofs/fscache.c b/fs/erofs/fscache.c index 74683df6144d..5a25ae523e5e 100644 --- a/fs/erofs/fscache.c +++ b/fs/erofs/fscache.c @@ -6,9 +6,42 @@ static struct fscache_volume *volume; +static int erofs_blob_begin_cache_operation(struct netfs_read_request *rreq) +{ + return fscache_begin_read_operation(&rreq->cache_resources, + rreq->netfs_priv); +} + +/* .cleanup() is needed if rreq->netfs_priv is non-NULL */ +static void erofs_noop_cleanup(struct address_space *mapping, void *netfs_priv) +{ +} + +static const struct netfs_read_request_ops erofs_blob_req_ops = { + .begin_cache_operation = erofs_blob_begin_cache_operation, + .cleanup = erofs_noop_cleanup, +}; + +static int erofs_fscache_blob_readpage(struct file *data, struct page *page) +{ + struct folio *folio = page_folio(page); + struct erofs_fscache_context *ctx = + (struct erofs_fscache_context *)data; + + return netfs_readpage(NULL, folio, &erofs_blob_req_ops, ctx->cookie); +} + static const struct address_space_operations erofs_fscache_blob_aops = { + .readpage = erofs_fscache_blob_readpage, }; +struct page *erofs_fscache_read_cache_page(struct erofs_fscache_context *ctx, + pgoff_t index) +{ + DBG_BUGON(!ctx->inode); + return read_mapping_page(ctx->inode->i_mapping, index, ctx); +} + static int erofs_fscache_init_cookie(struct erofs_fscache_context *ctx, char *path) { diff --git a/fs/erofs/internal.h b/fs/erofs/internal.h index 277dcd5888ea..fca706cfaf72 100644 --- a/fs/erofs/internal.h +++ b/fs/erofs/internal.h @@ -633,6 +633,9 @@ struct erofs_fscache_context *erofs_fscache_get_ctx(struct super_block *sb, char *path, bool need_inode); void erofs_fscache_put_ctx(struct erofs_fscache_context *ctx); +struct page *erofs_fscache_read_cache_page(struct erofs_fscache_context *ctx, + pgoff_t index); + #define EFSCORRUPTED EUCLEAN /* 
Filesystem is corrupted */ #endif /* __EROFS_INTERNAL_H */ From patchwork Tue Jan 18 13:12:11 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Jingbo Xu X-Patchwork-Id: 12716322 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 18E19C433F5 for ; Tue, 18 Jan 2022 13:13:02 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S243085AbiARNNA (ORCPT ); Tue, 18 Jan 2022 08:13:00 -0500 Received: from out30-44.freemail.mail.aliyun.com ([115.124.30.44]:35029 "EHLO out30-44.freemail.mail.aliyun.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S242333AbiARNMh (ORCPT ); Tue, 18 Jan 2022 08:12:37 -0500 X-Alimail-AntiSpam: AC=PASS;BC=-1|-1;BR=01201311R191e4;CH=green;DM=||false|;DS=||;FP=0|-1|-1|-1|0|-1|-1|-1;HT=e01e04394;MF=jefflexu@linux.alibaba.com;NM=1;PH=DS;RN=12;SR=0;TI=SMTPD_---0V2C1oyi_1642511554; Received: from localhost(mailfrom:jefflexu@linux.alibaba.com fp:SMTPD_---0V2C1oyi_1642511554) by smtp.aliyun-inc.com(127.0.0.1); Tue, 18 Jan 2022 21:12:35 +0800 From: Jeffle Xu To: dhowells@redhat.com, linux-cachefs@redhat.com, xiang@kernel.org, chao@kernel.org, linux-erofs@lists.ozlabs.org Cc: linux-fsdevel@vger.kernel.org, joseph.qi@linux.alibaba.com, bo.liu@linux.alibaba.com, tao.peng@linux.alibaba.com, gerry@linux.alibaba.com, eguan@linux.alibaba.com, linux-kernel@vger.kernel.org Subject: [PATCH v2 15/20] erofs: implement fscache-based data read for non-inline layout Date: Tue, 18 Jan 2022 21:12:11 +0800 Message-Id: <20220118131216.85338-16-jefflexu@linux.alibaba.com> X-Mailer: git-send-email 2.27.0 In-Reply-To: <20220118131216.85338-1-jefflexu@linux.alibaba.com> References: <20220118131216.85338-1-jefflexu@linux.alibaba.com> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-fsdevel@vger.kernel.org This patch implements the data plane of reading data from bootstrap blob file over fscache for non-inline layout. Be noted that compressed layout is not supported yet. Signed-off-by: Jeffle Xu --- fs/erofs/fscache.c | 111 ++++++++++++++++++++++++++++++++++++++++++++ fs/erofs/inode.c | 6 ++- fs/erofs/internal.h | 1 + 3 files changed, 117 insertions(+), 1 deletion(-) diff --git a/fs/erofs/fscache.c b/fs/erofs/fscache.c index 5a25ae523e5e..588c33ab6a90 100644 --- a/fs/erofs/fscache.c +++ b/fs/erofs/fscache.c @@ -4,6 +4,17 @@ */ #include "internal.h" +struct erofs_fscache_map { + struct erofs_fscache_context *m_ctx; + erofs_off_t m_pa, m_la, o_la; + u64 m_llen; +}; + +struct erofs_fscache_priv { + struct fscache_cookie *cookie; + loff_t offset; +}; + static struct fscache_volume *volume; static int erofs_blob_begin_cache_operation(struct netfs_read_request *rreq) @@ -22,6 +33,33 @@ static const struct netfs_read_request_ops erofs_blob_req_ops = { .cleanup = erofs_noop_cleanup, }; +static int erofs_begin_cache_operation(struct netfs_read_request *rreq) +{ + struct erofs_fscache_priv *priv = rreq->netfs_priv; + + rreq->p_start = priv->offset; + return fscache_begin_read_operation(&rreq->cache_resources, + priv->cookie); +} + +static bool erofs_clamp_length(struct netfs_read_subrequest *subreq) +{ + /* + * For non-inline layout, rreq->i_size is actually the size of upper + * file in erofs rather than that of blob file. 
Thus when cache miss, + * subreq->len can be restricted to the upper file size, while we hope + * blob file can be filled in a EROFS_BLKSIZ granule. + */ + subreq->len = round_up(subreq->len, EROFS_BLKSIZ); + return true; +} + +static const struct netfs_read_request_ops erofs_req_ops = { + .begin_cache_operation = erofs_begin_cache_operation, + .cleanup = erofs_noop_cleanup, + .clamp_length = erofs_clamp_length, +}; + static int erofs_fscache_blob_readpage(struct file *data, struct page *page) { struct folio *folio = page_folio(page); @@ -42,6 +80,79 @@ struct page *erofs_fscache_read_cache_page(struct erofs_fscache_context *ctx, return read_mapping_page(ctx->inode->i_mapping, index, ctx); } +static int erofs_fscache_readpage_noinline(struct page *page, + struct erofs_fscache_map *fsmap) +{ + struct folio *folio = page_folio(page); + struct erofs_fscache_priv priv; + + /* + * 1) For FLAT_PLAIN layout, the output map.m_la shall be equal to o_la, + * and the output map.m_pa is exactly the physical address of o_la. + * 2) For CHUNK_BASED layout, the output map.m_la is rounded down to the + * nearest chunk boundary, and the output map.m_pa is actually the + * physical address of this chunk boundary. So we need to recalculate + * the actual physical address of o_la. + */ + priv.offset = fsmap->m_pa + fsmap->o_la - fsmap->m_la; + priv.cookie = fsmap->m_ctx->cookie; + + return netfs_readpage(NULL, folio, &erofs_req_ops, &priv); +} + +static int erofs_fscache_readpage(struct file *file, struct page *page) +{ + struct inode *inode = page->mapping->host; + struct erofs_inode *vi = EROFS_I(inode); + struct super_block *sb = inode->i_sb; + struct erofs_sb_info *sbi = EROFS_SB(sb); + struct erofs_map_blocks map; + struct erofs_fscache_map fsmap; + int ret; + + if (erofs_inode_is_data_compressed(vi->datalayout)) { + erofs_info(sb, "compressed layout not supported yet"); + ret = -EOPNOTSUPP; + goto err_out; + } + + map.m_la = fsmap.o_la = page_offset(page); + + ret = erofs_map_blocks(inode, &map, EROFS_GET_BLOCKS_RAW); + if (ret) + goto err_out; + + if (!(map.m_flags & EROFS_MAP_MAPPED)) { + zero_user(page, 0, PAGE_SIZE); + SetPageUptodate(page); + unlock_page(page); + return 0; + } + + fsmap.m_ctx = sbi->bootstrap; + fsmap.m_la = map.m_la; + fsmap.m_pa = map.m_pa; + fsmap.m_llen = map.m_llen; + + switch (vi->datalayout) { + case EROFS_INODE_FLAT_PLAIN: + case EROFS_INODE_CHUNK_BASED: + return erofs_fscache_readpage_noinline(page, &fsmap); + default: + DBG_BUGON(1); + ret = -EOPNOTSUPP; + } + +err_out: + SetPageError(page); + unlock_page(page); + return ret; +} + +const struct address_space_operations erofs_fscache_access_aops = { + .readpage = erofs_fscache_readpage, +}; + static int erofs_fscache_init_cookie(struct erofs_fscache_context *ctx, char *path) { diff --git a/fs/erofs/inode.c b/fs/erofs/inode.c index ff62f84f47d3..2f450cb3a7b9 100644 --- a/fs/erofs/inode.c +++ b/fs/erofs/inode.c @@ -296,7 +296,11 @@ static int erofs_fill_inode(struct inode *inode, int isdir) err = z_erofs_fill_inode(inode); goto out_unlock; } - inode->i_mapping->a_ops = &erofs_raw_access_aops; + + if (erofs_bdev_mode(inode->i_sb)) + inode->i_mapping->a_ops = &erofs_raw_access_aops; + else + inode->i_mapping->a_ops = &erofs_fscache_access_aops; out_unlock: erofs_put_metabuf(&buf); diff --git a/fs/erofs/internal.h b/fs/erofs/internal.h index fca706cfaf72..548f928b0ded 100644 --- a/fs/erofs/internal.h +++ b/fs/erofs/internal.h @@ -393,6 +393,7 @@ struct page *erofs_grab_cache_page_nowait(struct address_space *mapping, extern 
const struct super_operations erofs_sops; extern const struct address_space_operations erofs_raw_access_aops; +extern const struct address_space_operations erofs_fscache_access_aops; extern const struct address_space_operations z_erofs_aops; /* From patchwork Tue Jan 18 13:12:12 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Jingbo Xu X-Patchwork-Id: 12716319 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id A2E92C433F5 for ; Tue, 18 Jan 2022 13:12:51 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S242407AbiARNMt (ORCPT ); Tue, 18 Jan 2022 08:12:49 -0500 Received: from out30-54.freemail.mail.aliyun.com ([115.124.30.54]:52131 "EHLO out30-54.freemail.mail.aliyun.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S242369AbiARNMi (ORCPT ); Tue, 18 Jan 2022 08:12:38 -0500 X-Alimail-AntiSpam: AC=PASS;BC=-1|-1;BR=01201311R141e4;CH=green;DM=||false|;DS=||;FP=0|-1|-1|-1|0|-1|-1|-1;HT=e01e04407;MF=jefflexu@linux.alibaba.com;NM=1;PH=DS;RN=12;SR=0;TI=SMTPD_---0V2C5Cov_1642511555; Received: from localhost(mailfrom:jefflexu@linux.alibaba.com fp:SMTPD_---0V2C5Cov_1642511555) by smtp.aliyun-inc.com(127.0.0.1); Tue, 18 Jan 2022 21:12:36 +0800 From: Jeffle Xu To: dhowells@redhat.com, linux-cachefs@redhat.com, xiang@kernel.org, chao@kernel.org, linux-erofs@lists.ozlabs.org Cc: linux-fsdevel@vger.kernel.org, joseph.qi@linux.alibaba.com, bo.liu@linux.alibaba.com, tao.peng@linux.alibaba.com, gerry@linux.alibaba.com, eguan@linux.alibaba.com, linux-kernel@vger.kernel.org Subject: [PATCH v2 16/20] erofs: implement fscache-based data read for inline layout Date: Tue, 18 Jan 2022 21:12:12 +0800 Message-Id: <20220118131216.85338-17-jefflexu@linux.alibaba.com> X-Mailer: git-send-email 2.27.0 In-Reply-To: <20220118131216.85338-1-jefflexu@linux.alibaba.com> References: <20220118131216.85338-1-jefflexu@linux.alibaba.com> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-fsdevel@vger.kernel.org This patch implements the data plane of reading data from bootstrap blob file over fscache for inline layout. Signed-off-by: Jeffle Xu --- fs/erofs/fscache.c | 41 +++++++++++++++++++++++++++++++++++++++++ 1 file changed, 41 insertions(+) diff --git a/fs/erofs/fscache.c b/fs/erofs/fscache.c index 588c33ab6a90..8c56bd54b2af 100644 --- a/fs/erofs/fscache.c +++ b/fs/erofs/fscache.c @@ -100,6 +100,45 @@ static int erofs_fscache_readpage_noinline(struct page *page, return netfs_readpage(NULL, folio, &erofs_req_ops, &priv); } +static int erofs_fscache_readpage_inline(struct page *page, + struct erofs_fscache_map *fsmap) +{ + struct inode *inode = page->mapping->host; + struct super_block *sb = inode->i_sb; + struct erofs_buf buf = __EROFS_BUF_INITIALIZER; + erofs_blk_t blknr; + size_t offset, len; + void *src, *dst; + + /* + * For inline (tail packing) layout, the offset may be non-zero, while + * the offset can be calculated from corresponding physical address + * directly. + * Currently only flat layout supports inline (FLAT_INLINE), and the + * output map.m_pa is exactly the physical address of o_la in this case. 
+ */ + offset = erofs_blkoff(fsmap->m_pa); + blknr = erofs_blknr(fsmap->m_pa); + len = fsmap->m_llen; + + src = erofs_read_metabuf(&buf, sb, blknr, EROFS_KMAP); + if (IS_ERR(src)) { + SetPageError(page); + unlock_page(page); + return PTR_ERR(src); + } + + dst = kmap(page); + memcpy(dst, src + offset, len); + kunmap(page); + + erofs_put_metabuf(&buf); + + SetPageUptodate(page); + unlock_page(page); + return 0; +} + static int erofs_fscache_readpage(struct file *file, struct page *page) { struct inode *inode = page->mapping->host; @@ -138,6 +177,8 @@ static int erofs_fscache_readpage(struct file *file, struct page *page) case EROFS_INODE_FLAT_PLAIN: case EROFS_INODE_CHUNK_BASED: return erofs_fscache_readpage_noinline(page, &fsmap); + case EROFS_INODE_FLAT_INLINE: + return erofs_fscache_readpage_inline(page, &fsmap); default: DBG_BUGON(1); ret = -EOPNOTSUPP; From patchwork Tue Jan 18 13:12:13 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Jingbo Xu X-Patchwork-Id: 12716327 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id BB5CBC433FE for ; Tue, 18 Jan 2022 13:13:16 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S242919AbiARNNP (ORCPT ); Tue, 18 Jan 2022 08:13:15 -0500 Received: from out4436.biz.mail.alibaba.com ([47.88.44.36]:62358 "EHLO out4436.biz.mail.alibaba.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S242879AbiARNMw (ORCPT ); Tue, 18 Jan 2022 08:12:52 -0500 X-Alimail-AntiSpam: AC=PASS;BC=-1|-1;BR=01201311R211e4;CH=green;DM=||false|;DS=||;FP=0|-1|-1|-1|0|-1|-1|-1;HT=e01e04395;MF=jefflexu@linux.alibaba.com;NM=1;PH=DS;RN=12;SR=0;TI=SMTPD_---0V2C2axr_1642511556; Received: from localhost(mailfrom:jefflexu@linux.alibaba.com fp:SMTPD_---0V2C2axr_1642511556) by smtp.aliyun-inc.com(127.0.0.1); Tue, 18 Jan 2022 21:12:37 +0800 From: Jeffle Xu To: dhowells@redhat.com, linux-cachefs@redhat.com, xiang@kernel.org, chao@kernel.org, linux-erofs@lists.ozlabs.org Cc: linux-fsdevel@vger.kernel.org, joseph.qi@linux.alibaba.com, bo.liu@linux.alibaba.com, tao.peng@linux.alibaba.com, gerry@linux.alibaba.com, eguan@linux.alibaba.com, linux-kernel@vger.kernel.org Subject: [PATCH v2 17/20] erofs: register cookie context for data blobs Date: Tue, 18 Jan 2022 21:12:13 +0800 Message-Id: <20220118131216.85338-18-jefflexu@linux.alibaba.com> X-Mailer: git-send-email 2.27.0 In-Reply-To: <20220118131216.85338-1-jefflexu@linux.alibaba.com> References: <20220118131216.85338-1-jefflexu@linux.alibaba.com> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-fsdevel@vger.kernel.org Similar to the multi device mode, erofs could be mounted from multiple blob files (one bootstrap blob file and optional multiple data blob files). In this case, each device slot contains the path of corresponding data blob file. This patch registers corresponding cookie context for each data blob file. 
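The per-device registration this patch adds boils down to the following shape, condensed from the erofs_init_devices() hunk below (error handling trimmed; the erofs_register_data_blob() wrapper name is illustrative). Note that need_inode is false here: only the bootstrap context needs the anonymous page-cache inode from patch 12 for metadata reads, while data blobs are read into the regular file inode's own page cache.

static int erofs_register_data_blob(struct super_block *sb,
				    struct erofs_device_info *dif)
{
	struct erofs_fscache_context *ctx;

	/* The device slot's path names the data blob and keys its cookie. */
	ctx = erofs_fscache_get_ctx(sb, dif->path, false);
	if (IS_ERR(ctx))
		return PTR_ERR(ctx);

	dif->ctx = ctx;
	return 0;
}

On the release side, erofs_release_device_info() drops the context together with the rest of the slot, mirroring the blkdev_put() path used in bdev mode.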
Signed-off-by: Jeffle Xu --- fs/erofs/internal.h | 1 + fs/erofs/super.c | 27 +++++++++++++++++++-------- 2 files changed, 20 insertions(+), 8 deletions(-) diff --git a/fs/erofs/internal.h b/fs/erofs/internal.h index 548f928b0ded..5d514c7b73cc 100644 --- a/fs/erofs/internal.h +++ b/fs/erofs/internal.h @@ -53,6 +53,7 @@ struct erofs_device_info { struct block_device *bdev; struct dax_device *dax_dev; u64 dax_part_off; + struct erofs_fscache_context *ctx; u32 blocks; u32 mapped_blkaddr; diff --git a/fs/erofs/super.c b/fs/erofs/super.c index 8c5783c6f71f..f058a04a00c7 100644 --- a/fs/erofs/super.c +++ b/fs/erofs/super.c @@ -250,6 +250,7 @@ static int erofs_init_devices(struct super_block *sb, down_read(&sbi->devs->rwsem); idr_for_each_entry(&sbi->devs->tree, dif, id) { struct block_device *bdev; + struct erofs_fscache_context *ctx; ptr = erofs_read_metabuf(&buf, sb, erofs_blknr(pos), EROFS_KMAP); @@ -259,15 +260,24 @@ static int erofs_init_devices(struct super_block *sb, } dis = ptr + erofs_blkoff(pos); - bdev = blkdev_get_by_path(dif->path, - FMODE_READ | FMODE_EXCL, - sb->s_type); - if (IS_ERR(bdev)) { - err = PTR_ERR(bdev); - break; + if (erofs_bdev_mode(sb)) { + bdev = blkdev_get_by_path(dif->path, + FMODE_READ | FMODE_EXCL, + sb->s_type); + if (IS_ERR(bdev)) { + err = PTR_ERR(bdev); + break; + } + dif->bdev = bdev; + dif->dax_dev = fs_dax_get_by_bdev(bdev, &dif->dax_part_off); + } else { + ctx = erofs_fscache_get_ctx(sb, dif->path, false); + if (IS_ERR(ctx)) { + err = PTR_ERR(ctx); + break; + } + dif->ctx = ctx; } - dif->bdev = bdev; - dif->dax_dev = fs_dax_get_by_bdev(bdev, &dif->dax_part_off); dif->blocks = le32_to_cpu(dis->blocks); dif->mapped_blkaddr = le32_to_cpu(dis->mapped_blkaddr); sbi->total_blocks += dif->blocks; @@ -694,6 +704,7 @@ static int erofs_release_device_info(int id, void *ptr, void *data) { struct erofs_device_info *dif = ptr; + erofs_fscache_put_ctx(dif->ctx); fs_put_dax(dif->dax_dev); if (dif->bdev) blkdev_put(dif->bdev, FMODE_READ | FMODE_EXCL); From patchwork Tue Jan 18 13:12:14 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Jingbo Xu X-Patchwork-Id: 12716321 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 40AA3C43217 for ; Tue, 18 Jan 2022 13:12:57 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S242499AbiARNMy (ORCPT ); Tue, 18 Jan 2022 08:12:54 -0500 Received: from out30-130.freemail.mail.aliyun.com ([115.124.30.130]:38229 "EHLO out30-130.freemail.mail.aliyun.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S242521AbiARNMm (ORCPT ); Tue, 18 Jan 2022 08:12:42 -0500 X-Alimail-AntiSpam: AC=PASS;BC=-1|-1;BR=01201311R581e4;CH=green;DM=||false|;DS=||;FP=0|-1|-1|-1|0|-1|-1|-1;HT=e01e04400;MF=jefflexu@linux.alibaba.com;NM=1;PH=DS;RN=12;SR=0;TI=SMTPD_---0V2C2ay._1642511558; Received: from localhost(mailfrom:jefflexu@linux.alibaba.com fp:SMTPD_---0V2C2ay._1642511558) by smtp.aliyun-inc.com(127.0.0.1); Tue, 18 Jan 2022 21:12:38 +0800 From: Jeffle Xu To: dhowells@redhat.com, linux-cachefs@redhat.com, xiang@kernel.org, chao@kernel.org, linux-erofs@lists.ozlabs.org Cc: linux-fsdevel@vger.kernel.org, joseph.qi@linux.alibaba.com, bo.liu@linux.alibaba.com, tao.peng@linux.alibaba.com, gerry@linux.alibaba.com, eguan@linux.alibaba.com, linux-kernel@vger.kernel.org Subject: [PATCH 
v2 18/20] erofs: implement fscache-based data read for data blobs Date: Tue, 18 Jan 2022 21:12:14 +0800 Message-Id: <20220118131216.85338-19-jefflexu@linux.alibaba.com> X-Mailer: git-send-email 2.27.0 In-Reply-To: <20220118131216.85338-1-jefflexu@linux.alibaba.com> References: <20220118131216.85338-1-jefflexu@linux.alibaba.com> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-fsdevel@vger.kernel.org This patch implements the data plane of reading data from data blob file over fscache. Signed-off-by: Jeffle Xu --- fs/erofs/data.c | 3 +++ fs/erofs/fscache.c | 15 ++++++++++++--- fs/erofs/internal.h | 1 + 3 files changed, 16 insertions(+), 3 deletions(-) diff --git a/fs/erofs/data.c b/fs/erofs/data.c index 51ccbc02dd73..56db391a3411 100644 --- a/fs/erofs/data.c +++ b/fs/erofs/data.c @@ -200,6 +200,7 @@ int erofs_map_dev(struct super_block *sb, struct erofs_map_dev *map) map->m_bdev = sb->s_bdev; map->m_daxdev = EROFS_SB(sb)->dax_dev; map->m_dax_part_off = EROFS_SB(sb)->dax_part_off; + map->m_ctx = EROFS_SB(sb)->bootstrap; if (map->m_deviceid) { down_read(&devs->rwsem); @@ -211,6 +212,7 @@ int erofs_map_dev(struct super_block *sb, struct erofs_map_dev *map) map->m_bdev = dif->bdev; map->m_daxdev = dif->dax_dev; map->m_dax_part_off = dif->dax_part_off; + map->m_ctx = dif->ctx; up_read(&devs->rwsem); } else if (devs->extra_devices) { down_read(&devs->rwsem); @@ -228,6 +230,7 @@ int erofs_map_dev(struct super_block *sb, struct erofs_map_dev *map) map->m_bdev = dif->bdev; map->m_daxdev = dif->dax_dev; map->m_dax_part_off = dif->dax_part_off; + map->m_ctx = dif->ctx; break; } } diff --git a/fs/erofs/fscache.c b/fs/erofs/fscache.c index 8c56bd54b2af..e8df35ee4ba8 100644 --- a/fs/erofs/fscache.c +++ b/fs/erofs/fscache.c @@ -144,8 +144,8 @@ static int erofs_fscache_readpage(struct file *file, struct page *page) struct inode *inode = page->mapping->host; struct erofs_inode *vi = EROFS_I(inode); struct super_block *sb = inode->i_sb; - struct erofs_sb_info *sbi = EROFS_SB(sb); struct erofs_map_blocks map; + struct erofs_map_dev mdev; struct erofs_fscache_map fsmap; int ret; @@ -168,9 +168,18 @@ static int erofs_fscache_readpage(struct file *file, struct page *page) return 0; } - fsmap.m_ctx = sbi->bootstrap; + mdev = (struct erofs_map_dev) { + .m_deviceid = map.m_deviceid, + .m_pa = map.m_pa, + }; + + ret = erofs_map_dev(sb, &mdev); + if (ret) + return ret; + + fsmap.m_ctx = mdev.m_ctx; fsmap.m_la = map.m_la; - fsmap.m_pa = map.m_pa; + fsmap.m_pa = mdev.m_pa; fsmap.m_llen = map.m_llen; switch (vi->datalayout) { diff --git a/fs/erofs/internal.h b/fs/erofs/internal.h index 5d514c7b73cc..6ccf14952b2d 100644 --- a/fs/erofs/internal.h +++ b/fs/erofs/internal.h @@ -486,6 +486,7 @@ struct erofs_map_dev { struct block_device *m_bdev; struct dax_device *m_daxdev; u64 m_dax_part_off; + struct erofs_fscache_context *m_ctx; erofs_off_t m_pa; unsigned int m_deviceid; From patchwork Tue Jan 18 13:12:15 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Jingbo Xu X-Patchwork-Id: 12716320 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 00A7EC433FE for ; Tue, 18 Jan 2022 13:12:54 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S242552AbiARNMx (ORCPT ); Tue, 18 Jan 2022 08:12:53 -0500 Received: from 
out30-130.freemail.mail.aliyun.com ([115.124.30.130]:48942 "EHLO out30-130.freemail.mail.aliyun.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S242548AbiARNMm (ORCPT ); Tue, 18 Jan 2022 08:12:42 -0500 X-Alimail-AntiSpam: AC=PASS;BC=-1|-1;BR=01201311R621e4;CH=green;DM=||false|;DS=||;FP=0|-1|-1|-1|0|-1|-1|-1;HT=e01e04423;MF=jefflexu@linux.alibaba.com;NM=1;PH=DS;RN=12;SR=0;TI=SMTPD_---0V2C5CpP_1642511559; Received: from localhost(mailfrom:jefflexu@linux.alibaba.com fp:SMTPD_---0V2C5CpP_1642511559) by smtp.aliyun-inc.com(127.0.0.1); Tue, 18 Jan 2022 21:12:40 +0800 From: Jeffle Xu To: dhowells@redhat.com, linux-cachefs@redhat.com, xiang@kernel.org, chao@kernel.org, linux-erofs@lists.ozlabs.org Cc: linux-fsdevel@vger.kernel.org, joseph.qi@linux.alibaba.com, bo.liu@linux.alibaba.com, tao.peng@linux.alibaba.com, gerry@linux.alibaba.com, eguan@linux.alibaba.com, linux-kernel@vger.kernel.org Subject: [PATCH v2 19/20] erofs: add 'uuid' mount option Date: Tue, 18 Jan 2022 21:12:15 +0800 Message-Id: <20220118131216.85338-20-jefflexu@linux.alibaba.com> X-Mailer: git-send-email 2.27.0 In-Reply-To: <20220118131216.85338-1-jefflexu@linux.alibaba.com> References: <20220118131216.85338-1-jefflexu@linux.alibaba.com> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-fsdevel@vger.kernel.org Introduce 'uuid' mount option to enable the nodev mode, in which erofs could be mounted from blob files instead of blkdev. By then users could specify the path of bootstrap blob file containing the complete erofs image. Signed-off-by: Jeffle Xu --- fs/erofs/Kconfig | 2 +- fs/erofs/super.c | 43 ++++++++++++++++++++++++++++++++++++------- 2 files changed, 37 insertions(+), 8 deletions(-) diff --git a/fs/erofs/Kconfig b/fs/erofs/Kconfig index f57255ab88ed..37a2cc82ecc2 100644 --- a/fs/erofs/Kconfig +++ b/fs/erofs/Kconfig @@ -2,7 +2,7 @@ config EROFS_FS tristate "EROFS filesystem support" - depends on BLOCK + depends on BLOCK && FSCACHE_ONDEMAND select FS_IOMAP select LIBCRC32C help diff --git a/fs/erofs/super.c b/fs/erofs/super.c index f058a04a00c7..3f8557bac786 100644 --- a/fs/erofs/super.c +++ b/fs/erofs/super.c @@ -400,6 +400,7 @@ enum { Opt_dax, Opt_dax_enum, Opt_device, + Opt_uuid, Opt_err }; @@ -424,6 +425,7 @@ static const struct fs_parameter_spec erofs_fs_parameters[] = { fsparam_flag("dax", Opt_dax), fsparam_enum("dax", Opt_dax_enum, erofs_dax_param_enums), fsparam_string("device", Opt_device), + fsparam_string("uuid", Opt_uuid), {} }; @@ -519,6 +521,12 @@ static int erofs_fc_parse_param(struct fs_context *fc, } ++ctx->devs->extra_devices; break; + case Opt_uuid: + kfree(ctx->opt.uuid); + ctx->opt.uuid = kstrdup(param->string, GFP_KERNEL); + if (!ctx->opt.uuid) + return -ENOMEM; + break; default: return -ENOPARAM; } @@ -593,9 +601,14 @@ static int erofs_fc_fill_super(struct super_block *sb, struct fs_context *fc) sb->s_magic = EROFS_SUPER_MAGIC; - if (!sb_set_blocksize(sb, EROFS_BLKSIZ)) { - erofs_err(sb, "failed to set erofs blksize"); - return -EINVAL; + if (erofs_bdev_mode(sb)) { + if (!sb_set_blocksize(sb, EROFS_BLKSIZ)) { + erofs_err(sb, "failed to set erofs blksize"); + return -EINVAL; + } + } else { + sb->s_blocksize = EROFS_BLKSIZ; + sb->s_blocksize_bits = LOG_BLOCK_SIZE; } sbi = kzalloc(sizeof(*sbi), GFP_KERNEL); @@ -604,11 +617,12 @@ static int erofs_fc_fill_super(struct super_block *sb, struct fs_context *fc) sb->s_fs_info = sbi; sbi->opt = ctx->opt; - sbi->dax_dev = fs_dax_get_by_bdev(sb->s_bdev, &sbi->dax_part_off); sbi->devs = ctx->devs; ctx->devs = NULL; - if 
(!erofs_bdev_mode(sb)) { + if (erofs_bdev_mode(sb)) { + sbi->dax_dev = fs_dax_get_by_bdev(sb->s_bdev, &sbi->dax_part_off); + } else { struct erofs_fscache_context *bootstrap; bootstrap = erofs_fscache_get_ctx(sb, ctx->opt.uuid, true); @@ -616,6 +630,7 @@ static int erofs_fc_fill_super(struct super_block *sb, struct fs_context *fc) return PTR_ERR(bootstrap); sbi->bootstrap = bootstrap; + sbi->dax_dev = NULL; } err = erofs_read_superblock(sb); @@ -678,6 +693,11 @@ static int erofs_fc_fill_super(struct super_block *sb, struct fs_context *fc) static int erofs_fc_get_tree(struct fs_context *fc) { + struct erofs_fs_context *ctx = fc->fs_private; + + if (ctx->opt.uuid) + return get_tree_nodev(fc, erofs_fc_fill_super); + return get_tree_bdev(fc, erofs_fc_fill_super); } @@ -727,6 +747,7 @@ static void erofs_fc_free(struct fs_context *fc) struct erofs_fs_context *ctx = fc->fs_private; erofs_free_dev_context(ctx->devs); + kfree(ctx->opt.uuid); kfree(ctx); } @@ -767,7 +788,10 @@ static void erofs_kill_sb(struct super_block *sb) WARN_ON(sb->s_magic != EROFS_SUPER_MAGIC); - kill_block_super(sb); + if (erofs_bdev_mode(sb)) + kill_block_super(sb); + else + generic_shutdown_super(sb); sbi = EROFS_SB(sb); if (!sbi) @@ -885,7 +909,12 @@ static int erofs_statfs(struct dentry *dentry, struct kstatfs *buf) { struct super_block *sb = dentry->d_sb; struct erofs_sb_info *sbi = EROFS_SB(sb); - u64 id = huge_encode_dev(sb->s_bdev->bd_dev); + u64 id; + + if (erofs_bdev_mode(sb)) + id = huge_encode_dev(sb->s_bdev->bd_dev); + else + id = 0; /* TODO */ buf->f_type = sb->s_magic; buf->f_bsize = EROFS_BLKSIZ; From patchwork Tue Jan 18 13:12:16 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Jingbo Xu X-Patchwork-Id: 12716325 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 3D415C433FE for ; Tue, 18 Jan 2022 13:13:11 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S242233AbiARNNJ (ORCPT ); Tue, 18 Jan 2022 08:13:09 -0500 Received: from out30-54.freemail.mail.aliyun.com ([115.124.30.54]:56252 "EHLO out30-54.freemail.mail.aliyun.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S242680AbiARNMp (ORCPT ); Tue, 18 Jan 2022 08:12:45 -0500 X-Alimail-AntiSpam: AC=PASS;BC=-1|-1;BR=01201311R191e4;CH=green;DM=||false|;DS=||;FP=0|-1|-1|-1|0|-1|-1|-1;HT=e01e04407;MF=jefflexu@linux.alibaba.com;NM=1;PH=DS;RN=12;SR=0;TI=SMTPD_---0V2C2ayE_1642511560; Received: from localhost(mailfrom:jefflexu@linux.alibaba.com fp:SMTPD_---0V2C2ayE_1642511560) by smtp.aliyun-inc.com(127.0.0.1); Tue, 18 Jan 2022 21:12:41 +0800 From: Jeffle Xu To: dhowells@redhat.com, linux-cachefs@redhat.com, xiang@kernel.org, chao@kernel.org, linux-erofs@lists.ozlabs.org Cc: linux-fsdevel@vger.kernel.org, joseph.qi@linux.alibaba.com, bo.liu@linux.alibaba.com, tao.peng@linux.alibaba.com, gerry@linux.alibaba.com, eguan@linux.alibaba.com, linux-kernel@vger.kernel.org Subject: [PATCH v2 20/20] erofs: support on-demand reading Date: Tue, 18 Jan 2022 21:12:16 +0800 Message-Id: <20220118131216.85338-21-jefflexu@linux.alibaba.com> X-Mailer: git-send-email 2.27.0 In-Reply-To: <20220118131216.85338-1-jefflexu@linux.alibaba.com> References: <20220118131216.85338-1-jefflexu@linux.alibaba.com> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-fsdevel@vger.kernel.org 
Implement the .issue_op() callback; all the actual work is delegated to netfs_ondemand_read(). Signed-off-by: Jeffle Xu --- fs/erofs/fscache.c | 7 +++++++ 1 file changed, 7 insertions(+) diff --git a/fs/erofs/fscache.c b/fs/erofs/fscache.c index e8df35ee4ba8..9ba668c42098 100644 --- a/fs/erofs/fscache.c +++ b/fs/erofs/fscache.c @@ -28,9 +28,15 @@ static void erofs_noop_cleanup(struct address_space *mapping, void *netfs_priv) { } +static void erofs_issue_op(struct netfs_read_subrequest *subreq) +{ + netfs_ondemand_read(subreq); +} + static const struct netfs_read_request_ops erofs_blob_req_ops = { .begin_cache_operation = erofs_blob_begin_cache_operation, .cleanup = erofs_noop_cleanup, + .issue_op = erofs_issue_op, }; static int erofs_begin_cache_operation(struct netfs_read_request *rreq) @@ -58,6 +64,7 @@ static const struct netfs_read_request_ops erofs_req_ops = { .begin_cache_operation = erofs_begin_cache_operation, .cleanup = erofs_noop_cleanup, .clamp_length = erofs_clamp_length, + .issue_op = erofs_issue_op, }; static int erofs_fscache_blob_readpage(struct file *data, struct page *page)
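Tying the series together, the cache-miss path for a page of a regular file in nodev mode reduces to roughly the sketch below. The erofs_nodev_read_one() wrapper is illustrative; erofs_req_ops and struct erofs_fscache_priv are the static definitions from patches 15 and 18 above, and netfs_ondemand_read() is the helper added by the netfs patches earlier in this series, not an upstream API.

static int erofs_nodev_read_one(struct folio *folio,
				struct erofs_fscache_context *ctx,
				loff_t pstart)
{
	struct erofs_fscache_priv priv = {
		.cookie = ctx->cookie,
		.offset = pstart,	/* physical start inside the blob */
	};

	/*
	 * netfs first tries the cache via ->begin_cache_operation(); on a
	 * miss it calls ->issue_op(), which punts the subrequest to the
	 * user daemon through netfs_ondemand_read(), with the subrequest
	 * length rounded up to EROFS_BLKSIZ by ->clamp_length().
	 */
	return netfs_readpage(NULL, folio, &erofs_req_ops, &priv);
}

The design point is that erofs has no server to fetch from by itself, so every subrequest that misses the cache is handed to user space on demand instead of being issued to a network filesystem.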