From patchwork Mon Oct 11 03:00:46 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Jingbo Xu X-Patchwork-Id: 12548949 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 54D1BC433EF for ; Mon, 11 Oct 2021 03:01:03 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 2CD2360F4C for ; Mon, 11 Oct 2021 03:01:03 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S233186AbhJKDC4 (ORCPT ); Sun, 10 Oct 2021 23:02:56 -0400 Received: from out30-56.freemail.mail.aliyun.com ([115.124.30.56]:39610 "EHLO out30-56.freemail.mail.aliyun.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S233414AbhJKDCy (ORCPT ); Sun, 10 Oct 2021 23:02:54 -0400 X-Alimail-AntiSpam: AC=PASS;BC=-1|-1;BR=01201311R131e4;CH=green;DM=||false|;DS=||;FP=0|-1|-1|-1|0|-1|-1|-1;HT=e01e04395;MF=jefflexu@linux.alibaba.com;NM=1;PH=DS;RN=7;SR=0;TI=SMTPD_---0UrJTlBr_1633921253; Received: from localhost(mailfrom:jefflexu@linux.alibaba.com fp:SMTPD_---0UrJTlBr_1633921253) by smtp.aliyun-inc.com(127.0.0.1); Mon, 11 Oct 2021 11:00:53 +0800 From: Jeffle Xu To: vgoyal@redhat.com, stefanha@redhat.com, miklos@szeredi.hub Cc: linux-fsdevel@vger.kernel.org, virtio-fs@redhat.com, bo.liu@linux.alibaba.com, joseph.qi@linux.alibaba.com Subject: [PATCH v6 1/7] fuse: add fuse_should_enable_dax() helper Date: Mon, 11 Oct 2021 11:00:46 +0800 Message-Id: <20211011030052.98923-2-jefflexu@linux.alibaba.com> X-Mailer: git-send-email 2.27.0 In-Reply-To: <20211011030052.98923-1-jefflexu@linux.alibaba.com> References: <20211011030052.98923-1-jefflexu@linux.alibaba.com> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-fsdevel@vger.kernel.org This is in prep for following per-file DAX checking. Signed-off-by: Jeffle Xu --- fs/fuse/dax.c | 12 ++++++++++-- 1 file changed, 10 insertions(+), 2 deletions(-) diff --git a/fs/fuse/dax.c b/fs/fuse/dax.c index 79df61ed7481..1eb6538bf1b2 100644 --- a/fs/fuse/dax.c +++ b/fs/fuse/dax.c @@ -1332,11 +1332,19 @@ static const struct address_space_operations fuse_dax_file_aops = { .invalidatepage = noop_invalidatepage, }; -void fuse_dax_inode_init(struct inode *inode) +static bool fuse_should_enable_dax(struct inode *inode) { struct fuse_conn *fc = get_fuse_conn(inode); - if (!fc->dax) + if (fc->dax) + return true; + + return false; +} + +void fuse_dax_inode_init(struct inode *inode) +{ + if (!fuse_should_enable_dax(inode)) return; inode->i_flags |= S_DAX; From patchwork Mon Oct 11 03:00:47 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Jingbo Xu X-Patchwork-Id: 12548955 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 5DA75C4332F for ; Mon, 11 Oct 2021 03:01:03 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 363DA60F38 for ; Mon, 11 Oct 2021 03:01:03 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S233700AbhJKDC5 (ORCPT ); Sun, 10 Oct 2021 23:02:57 -0400 Received: from out30-56.freemail.mail.aliyun.com ([115.124.30.56]:49221 "EHLO out30-56.freemail.mail.aliyun.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S233538AbhJKDCz (ORCPT ); Sun, 10 Oct 2021 23:02:55 -0400 X-Alimail-AntiSpam: AC=PASS;BC=-1|-1;BR=01201311R201e4;CH=green;DM=||false|;DS=||;FP=0|-1|-1|-1|0|-1|-1|-1;HT=e01e01424;MF=jefflexu@linux.alibaba.com;NM=1;PH=DS;RN=7;SR=0;TI=SMTPD_---0UrJL8Ml_1633921253; Received: from localhost(mailfrom:jefflexu@linux.alibaba.com fp:SMTPD_---0UrJL8Ml_1633921253) by smtp.aliyun-inc.com(127.0.0.1); Mon, 11 Oct 2021 11:00:54 +0800 From: Jeffle Xu To: vgoyal@redhat.com, stefanha@redhat.com, miklos@szeredi.hub Cc: linux-fsdevel@vger.kernel.org, virtio-fs@redhat.com, bo.liu@linux.alibaba.com, joseph.qi@linux.alibaba.com Subject: [PATCH v6 2/7] fuse: make DAX mount option a tri-state Date: Mon, 11 Oct 2021 11:00:47 +0800 Message-Id: <20211011030052.98923-3-jefflexu@linux.alibaba.com> X-Mailer: git-send-email 2.27.0 In-Reply-To: <20211011030052.98923-1-jefflexu@linux.alibaba.com> References: <20211011030052.98923-1-jefflexu@linux.alibaba.com> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-fsdevel@vger.kernel.org We add 'always', 'never', and 'inode' (default). '-o dax' continues to operate the same which is equivalent to 'always'. To be consistemt with ext4/xfs's tri-state mount option, when neither '-o dax' nor '-o dax=' option is specified, the default behaviour is equal to 'inode'. By the time this patch is applied, 'inode' mode is actually equal to 'always' mode, before the per-file DAX flag is introduced in the following patch. Signed-off-by: Jeffle Xu --- fs/fuse/dax.c | 19 ++++++++++++++++--- fs/fuse/fuse_i.h | 14 ++++++++++++-- fs/fuse/inode.c | 10 +++++++--- fs/fuse/virtio_fs.c | 16 ++++++++++++++-- 4 files changed, 49 insertions(+), 10 deletions(-) diff --git a/fs/fuse/dax.c b/fs/fuse/dax.c index 1eb6538bf1b2..4c6c64efc950 100644 --- a/fs/fuse/dax.c +++ b/fs/fuse/dax.c @@ -1284,11 +1284,14 @@ static int fuse_dax_mem_range_init(struct fuse_conn_dax *fcd) return ret; } -int fuse_dax_conn_alloc(struct fuse_conn *fc, struct dax_device *dax_dev) +int fuse_dax_conn_alloc(struct fuse_conn *fc, enum fuse_dax_mode dax_mode, + struct dax_device *dax_dev) { struct fuse_conn_dax *fcd; int err; + fc->dax_mode = dax_mode; + if (!dax_dev) return 0; @@ -1335,11 +1338,21 @@ static const struct address_space_operations fuse_dax_file_aops = { static bool fuse_should_enable_dax(struct inode *inode) { struct fuse_conn *fc = get_fuse_conn(inode); + unsigned int dax_mode = fc->dax_mode; + + if (dax_mode == FUSE_DAX_NEVER) + return false; - if (fc->dax) + /* + * If 'dax=always/inode', fc->dax couldn't be NULL even when fuse + * daemon doesn't support DAX, since the mount routine will fail + * early in this case. + */ + if (dax_mode == FUSE_DAX_ALWAYS) return true; - return false; + /* dax_mode == FUSE_DAX_INODE */ + return true; } void fuse_dax_inode_init(struct inode *inode) diff --git a/fs/fuse/fuse_i.h b/fs/fuse/fuse_i.h index 319596df5dc6..5abf9749923f 100644 --- a/fs/fuse/fuse_i.h +++ b/fs/fuse/fuse_i.h @@ -480,6 +480,12 @@ struct fuse_dev { struct list_head entry; }; +enum fuse_dax_mode { + FUSE_DAX_INODE, + FUSE_DAX_ALWAYS, + FUSE_DAX_NEVER, +}; + struct fuse_fs_context { int fd; struct file *file; @@ -497,7 +503,7 @@ struct fuse_fs_context { bool no_control:1; bool no_force_umount:1; bool legacy_opts_show:1; - bool dax:1; + enum fuse_dax_mode dax_mode; unsigned int max_read; unsigned int blksize; const char *subtype; @@ -802,6 +808,9 @@ struct fuse_conn { struct list_head devices; #ifdef CONFIG_FUSE_DAX + /* dax mode: FUSE_DAX_* (always, never or per-file) */ + enum fuse_dax_mode dax_mode; + /* Dax specific conn data, non-NULL if DAX is enabled */ struct fuse_conn_dax *dax; #endif @@ -1255,7 +1264,8 @@ ssize_t fuse_dax_read_iter(struct kiocb *iocb, struct iov_iter *to); ssize_t fuse_dax_write_iter(struct kiocb *iocb, struct iov_iter *from); int fuse_dax_mmap(struct file *file, struct vm_area_struct *vma); int fuse_dax_break_layouts(struct inode *inode, u64 dmap_start, u64 dmap_end); -int fuse_dax_conn_alloc(struct fuse_conn *fc, struct dax_device *dax_dev); +int fuse_dax_conn_alloc(struct fuse_conn *fc, enum fuse_dax_mode mode, + struct dax_device *dax_dev); void fuse_dax_conn_free(struct fuse_conn *fc); bool fuse_dax_inode_alloc(struct super_block *sb, struct fuse_inode *fi); void fuse_dax_inode_init(struct inode *inode); diff --git a/fs/fuse/inode.c b/fs/fuse/inode.c index 36cd03114b6d..b4b41683e97e 100644 --- a/fs/fuse/inode.c +++ b/fs/fuse/inode.c @@ -742,8 +742,12 @@ static int fuse_show_options(struct seq_file *m, struct dentry *root) seq_printf(m, ",blksize=%lu", sb->s_blocksize); } #ifdef CONFIG_FUSE_DAX - if (fc->dax) - seq_puts(m, ",dax"); + if (fc->dax_mode == FUSE_DAX_ALWAYS) + seq_puts(m, ",dax=always"); + else if (fc->dax_mode == FUSE_DAX_NEVER) + seq_puts(m, ",dax=never"); + else if (fc->dax_mode == FUSE_DAX_INODE) + seq_puts(m, ",dax=inode"); #endif return 0; @@ -1493,7 +1497,7 @@ int fuse_fill_super_common(struct super_block *sb, struct fuse_fs_context *ctx) sb->s_subtype = ctx->subtype; ctx->subtype = NULL; if (IS_ENABLED(CONFIG_FUSE_DAX)) { - err = fuse_dax_conn_alloc(fc, ctx->dax_dev); + err = fuse_dax_conn_alloc(fc, ctx->dax_mode, ctx->dax_dev); if (err) goto err; } diff --git a/fs/fuse/virtio_fs.c b/fs/fuse/virtio_fs.c index 0ad89c6629d7..58cfbaeb4a7d 100644 --- a/fs/fuse/virtio_fs.c +++ b/fs/fuse/virtio_fs.c @@ -88,12 +88,21 @@ struct virtio_fs_req_work { static int virtio_fs_enqueue_req(struct virtio_fs_vq *fsvq, struct fuse_req *req, bool in_flight); +static const struct constant_table dax_param_enums[] = { + {"inode", FUSE_DAX_INODE }, + {"always", FUSE_DAX_ALWAYS }, + {"never", FUSE_DAX_NEVER }, + {} +}; + enum { OPT_DAX, + OPT_DAX_ENUM, }; static const struct fs_parameter_spec virtio_fs_parameters[] = { fsparam_flag("dax", OPT_DAX), + fsparam_enum("dax", OPT_DAX_ENUM, dax_param_enums), {} }; @@ -110,7 +119,10 @@ static int virtio_fs_parse_param(struct fs_context *fsc, switch (opt) { case OPT_DAX: - ctx->dax = 1; + ctx->dax_mode = FUSE_DAX_ALWAYS; + break; + case OPT_DAX_ENUM: + ctx->dax_mode = result.uint_32; break; default: return -EINVAL; @@ -1326,7 +1338,7 @@ static int virtio_fs_fill_super(struct super_block *sb, struct fs_context *fsc) /* virtiofs allocates and installs its own fuse devices */ ctx->fudptr = NULL; - if (ctx->dax) { + if (ctx->dax_mode != FUSE_DAX_NEVER) { if (!fs->dax_dev) { err = -EINVAL; pr_err("virtio-fs: dax can't be enabled as filesystem" From patchwork Mon Oct 11 03:00:48 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Jingbo Xu X-Patchwork-Id: 12548951 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 6B708C43219 for ; Mon, 11 Oct 2021 03:01:03 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 42F8360F58 for ; Mon, 11 Oct 2021 03:01:03 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S233727AbhJKDC7 (ORCPT ); Sun, 10 Oct 2021 23:02:59 -0400 Received: from out30-54.freemail.mail.aliyun.com ([115.124.30.54]:43088 "EHLO out30-54.freemail.mail.aliyun.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S233612AbhJKDCz (ORCPT ); Sun, 10 Oct 2021 23:02:55 -0400 X-Alimail-AntiSpam: AC=PASS;BC=-1|-1;BR=01201311R111e4;CH=green;DM=||false|;DS=||;FP=0|-1|-1|-1|0|-1|-1|-1;HT=e01e04426;MF=jefflexu@linux.alibaba.com;NM=1;PH=DS;RN=7;SR=0;TI=SMTPD_---0UrJTOY-_1633921254; Received: from localhost(mailfrom:jefflexu@linux.alibaba.com fp:SMTPD_---0UrJTOY-_1633921254) by smtp.aliyun-inc.com(127.0.0.1); Mon, 11 Oct 2021 11:00:54 +0800 From: Jeffle Xu To: vgoyal@redhat.com, stefanha@redhat.com, miklos@szeredi.hub Cc: linux-fsdevel@vger.kernel.org, virtio-fs@redhat.com, bo.liu@linux.alibaba.com, joseph.qi@linux.alibaba.com Subject: [PATCH v6 3/7] fuse: support per-file DAX in fuse protocol Date: Mon, 11 Oct 2021 11:00:48 +0800 Message-Id: <20211011030052.98923-4-jefflexu@linux.alibaba.com> X-Mailer: git-send-email 2.27.0 In-Reply-To: <20211011030052.98923-1-jefflexu@linux.alibaba.com> References: <20211011030052.98923-1-jefflexu@linux.alibaba.com> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-fsdevel@vger.kernel.org Expand the fuse protocol to support per-file DAX. FUSE_PERFILE_DAX flag is added indicating if fuse server/client supporting per-file DAX. It can be conveyed in both FUSE_INIT request and reply. FUSE_ATTR_DAX flag is added indicating if DAX shall be enabled for corresponding file. It is conveyed in FUSE_LOOKUP reply. Signed-off-by: Jeffle Xu --- include/uapi/linux/fuse.h | 9 ++++++++- 1 file changed, 8 insertions(+), 1 deletion(-) diff --git a/include/uapi/linux/fuse.h b/include/uapi/linux/fuse.h index 36ed092227fa..15a1f5fc0797 100644 --- a/include/uapi/linux/fuse.h +++ b/include/uapi/linux/fuse.h @@ -184,6 +184,9 @@ * * 7.34 * - add FUSE_SYNCFS + * + * 7.35 + * - add FUSE_PERFILE_DAX, FUSE_ATTR_DAX */ #ifndef _LINUX_FUSE_H @@ -219,7 +222,7 @@ #define FUSE_KERNEL_VERSION 7 /** Minor version number of this interface */ -#define FUSE_KERNEL_MINOR_VERSION 34 +#define FUSE_KERNEL_MINOR_VERSION 35 /** The node ID of the root inode */ #define FUSE_ROOT_ID 1 @@ -336,6 +339,7 @@ struct fuse_file_lock { * write/truncate sgid is killed only if file has group * execute permission. (Same as Linux VFS behavior). * FUSE_SETXATTR_EXT: Server supports extended struct fuse_setxattr_in + * FUSE_PERFILE_DAX: kernel supports per-file DAX */ #define FUSE_ASYNC_READ (1 << 0) #define FUSE_POSIX_LOCKS (1 << 1) @@ -367,6 +371,7 @@ struct fuse_file_lock { #define FUSE_SUBMOUNTS (1 << 27) #define FUSE_HANDLE_KILLPRIV_V2 (1 << 28) #define FUSE_SETXATTR_EXT (1 << 29) +#define FUSE_PERFILE_DAX (1 << 30) /** * CUSE INIT request/reply flags @@ -449,8 +454,10 @@ struct fuse_file_lock { * fuse_attr flags * * FUSE_ATTR_SUBMOUNT: Object is a submount root + * FUSE_ATTR_DAX: Enable DAX for this file in per-file DAX mode */ #define FUSE_ATTR_SUBMOUNT (1 << 0) +#define FUSE_ATTR_DAX (1 << 1) /** * Open flags From patchwork Mon Oct 11 03:00:49 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Jingbo Xu X-Patchwork-Id: 12548953 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 70291C4321E for ; Mon, 11 Oct 2021 03:01:03 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 509BB60F55 for ; Mon, 11 Oct 2021 03:01:03 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S233612AbhJKDDA (ORCPT ); Sun, 10 Oct 2021 23:03:00 -0400 Received: from out30-130.freemail.mail.aliyun.com ([115.124.30.130]:53785 "EHLO out30-130.freemail.mail.aliyun.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S233650AbhJKDC4 (ORCPT ); Sun, 10 Oct 2021 23:02:56 -0400 X-Alimail-AntiSpam: AC=PASS;BC=-1|-1;BR=01201311R131e4;CH=green;DM=||false|;DS=||;FP=0|-1|-1|-1|0|-1|-1|-1;HT=e01e01424;MF=jefflexu@linux.alibaba.com;NM=1;PH=DS;RN=7;SR=0;TI=SMTPD_---0UrJf2Uv_1633921255; Received: from localhost(mailfrom:jefflexu@linux.alibaba.com fp:SMTPD_---0UrJf2Uv_1633921255) by smtp.aliyun-inc.com(127.0.0.1); Mon, 11 Oct 2021 11:00:55 +0800 From: Jeffle Xu To: vgoyal@redhat.com, stefanha@redhat.com, miklos@szeredi.hub Cc: linux-fsdevel@vger.kernel.org, virtio-fs@redhat.com, bo.liu@linux.alibaba.com, joseph.qi@linux.alibaba.com Subject: [PATCH v6 4/7] fuse: negotiate per-file DAX in FUSE_INIT Date: Mon, 11 Oct 2021 11:00:49 +0800 Message-Id: <20211011030052.98923-5-jefflexu@linux.alibaba.com> X-Mailer: git-send-email 2.27.0 In-Reply-To: <20211011030052.98923-1-jefflexu@linux.alibaba.com> References: <20211011030052.98923-1-jefflexu@linux.alibaba.com> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-fsdevel@vger.kernel.org Among the FUSE_INIT phase, client shall advertise per-file DAX if it's mounted with "-o dax=inode". Then server is aware that client is in per-file DAX mode, and will construct per-inode DAX attribute accordingly. Signed-off-by: Jeffle Xu --- fs/fuse/inode.c | 2 ++ 1 file changed, 2 insertions(+) diff --git a/fs/fuse/inode.c b/fs/fuse/inode.c index b4b41683e97e..f4ad99e2415b 100644 --- a/fs/fuse/inode.c +++ b/fs/fuse/inode.c @@ -1203,6 +1203,8 @@ void fuse_send_init(struct fuse_mount *fm) #ifdef CONFIG_FUSE_DAX if (fm->fc->dax) ia->in.flags |= FUSE_MAP_ALIGNMENT; + if (fm->fc->dax_mode == FUSE_DAX_INODE) + ia->in.flags |= FUSE_PERFILE_DAX; #endif if (fm->fc->auto_submounts) ia->in.flags |= FUSE_SUBMOUNTS; From patchwork Mon Oct 11 03:00:50 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Jingbo Xu X-Patchwork-Id: 12548957 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 72ADFC4167B for ; Mon, 11 Oct 2021 03:01:03 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 588F66054F for ; Mon, 11 Oct 2021 03:01:03 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S233739AbhJKDDA (ORCPT ); Sun, 10 Oct 2021 23:03:00 -0400 Received: from out30-42.freemail.mail.aliyun.com ([115.124.30.42]:57732 "EHLO out30-42.freemail.mail.aliyun.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S233654AbhJKDC4 (ORCPT ); Sun, 10 Oct 2021 23:02:56 -0400 X-Alimail-AntiSpam: AC=PASS;BC=-1|-1;BR=01201311R121e4;CH=green;DM=||false|;DS=||;FP=0|-1|-1|-1|0|-1|-1|-1;HT=e01e04395;MF=jefflexu@linux.alibaba.com;NM=1;PH=DS;RN=7;SR=0;TI=SMTPD_---0UrJTlCR_1633921255; Received: from localhost(mailfrom:jefflexu@linux.alibaba.com fp:SMTPD_---0UrJTlCR_1633921255) by smtp.aliyun-inc.com(127.0.0.1); Mon, 11 Oct 2021 11:00:56 +0800 From: Jeffle Xu To: vgoyal@redhat.com, stefanha@redhat.com, miklos@szeredi.hub Cc: linux-fsdevel@vger.kernel.org, virtio-fs@redhat.com, bo.liu@linux.alibaba.com, joseph.qi@linux.alibaba.com Subject: [PATCH v6 5/7] fuse: enable per-file DAX Date: Mon, 11 Oct 2021 11:00:50 +0800 Message-Id: <20211011030052.98923-6-jefflexu@linux.alibaba.com> X-Mailer: git-send-email 2.27.0 In-Reply-To: <20211011030052.98923-1-jefflexu@linux.alibaba.com> References: <20211011030052.98923-1-jefflexu@linux.alibaba.com> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-fsdevel@vger.kernel.org DAX may be limited in some specific situation. When the number of usable DAX windows is under watermark, the recalim routine will be triggered to reclaim some DAX windows. It may have a negative impact on the performance, since some processes may need to wait for DAX windows to be recalimed and reused then. To mitigate the performance degradation, the overall DAX window need to be expanded larger. However, simply expanding the DAX window may not be a good deal in some scenario. To maintain one DAX window chunk (i.e., 2MB in size), 32KB (512 * 64 bytes) memory footprint will be consumed for page descriptors inside guest, which is greater than the memory footprint if it uses guest page cache when DAX disabled. Thus it'd better disable DAX for those files smaller than 32KB, to reduce the demand for DAX window and thus avoid the unworthy memory overhead. Per-file DAX feature is introduced to address this issue, by offering a finer grained control for dax to users, trying to achieve a balance between performance and memory overhead. The FUSE_ATTR_DAX flag in FUSE_LOOKUP reply is used to indicate whether DAX should be enabled or not for corresponding file. Currently the state whether DAX is enabled or not for the file is initialized only when inode is instantiated. Signed-off-by: Jeffle Xu --- fs/fuse/dax.c | 8 ++++---- fs/fuse/file.c | 4 ++-- fs/fuse/fuse_i.h | 4 ++-- fs/fuse/inode.c | 2 +- 4 files changed, 9 insertions(+), 9 deletions(-) diff --git a/fs/fuse/dax.c b/fs/fuse/dax.c index 4c6c64efc950..15bde36829b8 100644 --- a/fs/fuse/dax.c +++ b/fs/fuse/dax.c @@ -1335,7 +1335,7 @@ static const struct address_space_operations fuse_dax_file_aops = { .invalidatepage = noop_invalidatepage, }; -static bool fuse_should_enable_dax(struct inode *inode) +static bool fuse_should_enable_dax(struct inode *inode, unsigned int flags) { struct fuse_conn *fc = get_fuse_conn(inode); unsigned int dax_mode = fc->dax_mode; @@ -1352,12 +1352,12 @@ static bool fuse_should_enable_dax(struct inode *inode) return true; /* dax_mode == FUSE_DAX_INODE */ - return true; + return flags & FUSE_ATTR_DAX; } -void fuse_dax_inode_init(struct inode *inode) +void fuse_dax_inode_init(struct inode *inode, unsigned int flags) { - if (!fuse_should_enable_dax(inode)) + if (!fuse_should_enable_dax(inode, flags)) return; inode->i_flags |= S_DAX; diff --git a/fs/fuse/file.c b/fs/fuse/file.c index 11404f8c21c7..40c667a48cf6 100644 --- a/fs/fuse/file.c +++ b/fs/fuse/file.c @@ -3163,7 +3163,7 @@ static const struct address_space_operations fuse_file_aops = { .write_end = fuse_write_end, }; -void fuse_init_file_inode(struct inode *inode) +void fuse_init_file_inode(struct inode *inode, unsigned int flags) { struct fuse_inode *fi = get_fuse_inode(inode); @@ -3177,5 +3177,5 @@ void fuse_init_file_inode(struct inode *inode) fi->writepages = RB_ROOT; if (IS_ENABLED(CONFIG_FUSE_DAX)) - fuse_dax_inode_init(inode); + fuse_dax_inode_init(inode, flags); } diff --git a/fs/fuse/fuse_i.h b/fs/fuse/fuse_i.h index 5abf9749923f..0270a41c31d7 100644 --- a/fs/fuse/fuse_i.h +++ b/fs/fuse/fuse_i.h @@ -1016,7 +1016,7 @@ int fuse_notify_poll_wakeup(struct fuse_conn *fc, /** * Initialize file operations on a regular file */ -void fuse_init_file_inode(struct inode *inode); +void fuse_init_file_inode(struct inode *inode, unsigned int flags); /** * Initialize inode operations on regular files and special files @@ -1268,7 +1268,7 @@ int fuse_dax_conn_alloc(struct fuse_conn *fc, enum fuse_dax_mode mode, struct dax_device *dax_dev); void fuse_dax_conn_free(struct fuse_conn *fc); bool fuse_dax_inode_alloc(struct super_block *sb, struct fuse_inode *fi); -void fuse_dax_inode_init(struct inode *inode); +void fuse_dax_inode_init(struct inode *inode, unsigned int flags); void fuse_dax_inode_cleanup(struct inode *inode); bool fuse_dax_check_alignment(struct fuse_conn *fc, unsigned int map_alignment); void fuse_dax_cancel_work(struct fuse_conn *fc); diff --git a/fs/fuse/inode.c b/fs/fuse/inode.c index f4ad99e2415b..73f19cd6e702 100644 --- a/fs/fuse/inode.c +++ b/fs/fuse/inode.c @@ -280,7 +280,7 @@ static void fuse_init_inode(struct inode *inode, struct fuse_attr *attr) inode->i_ctime.tv_nsec = attr->ctimensec; if (S_ISREG(inode->i_mode)) { fuse_init_common(inode); - fuse_init_file_inode(inode); + fuse_init_file_inode(inode, attr->flags); } else if (S_ISDIR(inode->i_mode)) fuse_init_dir(inode); else if (S_ISLNK(inode->i_mode)) From patchwork Mon Oct 11 03:00:51 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Jingbo Xu X-Patchwork-Id: 12548959 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id B4D07C43217 for ; Mon, 11 Oct 2021 03:01:03 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 9912960F38 for ; Mon, 11 Oct 2021 03:01:03 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S233755AbhJKDDB (ORCPT ); Sun, 10 Oct 2021 23:03:01 -0400 Received: from out30-56.freemail.mail.aliyun.com ([115.124.30.56]:34586 "EHLO out30-56.freemail.mail.aliyun.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S233677AbhJKDC5 (ORCPT ); Sun, 10 Oct 2021 23:02:57 -0400 X-Alimail-AntiSpam: AC=PASS;BC=-1|-1;BR=01201311R151e4;CH=green;DM=||false|;DS=||;FP=0|-1|-1|-1|0|-1|-1|-1;HT=e01e04426;MF=jefflexu@linux.alibaba.com;NM=1;PH=DS;RN=7;SR=0;TI=SMTPD_---0UrJTOYS_1633921256; Received: from localhost(mailfrom:jefflexu@linux.alibaba.com fp:SMTPD_---0UrJTOYS_1633921256) by smtp.aliyun-inc.com(127.0.0.1); Mon, 11 Oct 2021 11:00:56 +0800 From: Jeffle Xu To: vgoyal@redhat.com, stefanha@redhat.com, miklos@szeredi.hub Cc: linux-fsdevel@vger.kernel.org, virtio-fs@redhat.com, bo.liu@linux.alibaba.com, joseph.qi@linux.alibaba.com Subject: [PATCH v6 6/7] fuse: mark inode DONT_CACHE when per-file DAX hint changes Date: Mon, 11 Oct 2021 11:00:51 +0800 Message-Id: <20211011030052.98923-7-jefflexu@linux.alibaba.com> X-Mailer: git-send-email 2.27.0 In-Reply-To: <20211011030052.98923-1-jefflexu@linux.alibaba.com> References: <20211011030052.98923-1-jefflexu@linux.alibaba.com> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-fsdevel@vger.kernel.org When the per-file DAX hint changes while the file is still *opened*, it is quite complicated and maybe fragile to dynamically change the DAX state. Hence mark the inode and corresponding dentries as DONE_CACHE once the per-file DAX hint changes, so that the inode instance will be evicted and freed as soon as possible once the file is closed and the last reference to the inode is put. And then when the file gets reopened next time, the new instantiated inode will reflect the new DAX state. In summary, when the per-file DAX hint changes for an *opened* file, the DAX state of the file won't be updated until this file is closed and reopened later. Signed-off-by: Jeffle Xu --- fs/fuse/dax.c | 9 +++++++++ fs/fuse/fuse_i.h | 1 + fs/fuse/inode.c | 3 +++ 3 files changed, 13 insertions(+) diff --git a/fs/fuse/dax.c b/fs/fuse/dax.c index 15bde36829b8..ca083c13f5e8 100644 --- a/fs/fuse/dax.c +++ b/fs/fuse/dax.c @@ -1364,6 +1364,15 @@ void fuse_dax_inode_init(struct inode *inode, unsigned int flags) inode->i_data.a_ops = &fuse_dax_file_aops; } +void fuse_dax_dontcache(struct inode *inode, unsigned int flags) +{ + struct fuse_conn *fc = get_fuse_conn(inode); + + if (fc->dax_mode == FUSE_DAX_INODE && + (!!IS_DAX(inode) != !!(flags & FUSE_ATTR_DAX))) + d_mark_dontcache(inode); +} + bool fuse_dax_check_alignment(struct fuse_conn *fc, unsigned int map_alignment) { if (fc->dax && (map_alignment > FUSE_DAX_SHIFT)) { diff --git a/fs/fuse/fuse_i.h b/fs/fuse/fuse_i.h index 0270a41c31d7..bb2c11e0311a 100644 --- a/fs/fuse/fuse_i.h +++ b/fs/fuse/fuse_i.h @@ -1270,6 +1270,7 @@ void fuse_dax_conn_free(struct fuse_conn *fc); bool fuse_dax_inode_alloc(struct super_block *sb, struct fuse_inode *fi); void fuse_dax_inode_init(struct inode *inode, unsigned int flags); void fuse_dax_inode_cleanup(struct inode *inode); +void fuse_dax_dontcache(struct inode *inode, unsigned int flags); bool fuse_dax_check_alignment(struct fuse_conn *fc, unsigned int map_alignment); void fuse_dax_cancel_work(struct fuse_conn *fc); diff --git a/fs/fuse/inode.c b/fs/fuse/inode.c index 73f19cd6e702..cf934c2ba761 100644 --- a/fs/fuse/inode.c +++ b/fs/fuse/inode.c @@ -268,6 +268,9 @@ void fuse_change_attributes(struct inode *inode, struct fuse_attr *attr, if (inval) invalidate_inode_pages2(inode->i_mapping); } + + if (IS_ENABLED(CONFIG_FUSE_DAX)) + fuse_dax_dontcache(inode, attr->flags); } static void fuse_init_inode(struct inode *inode, struct fuse_attr *attr) From patchwork Mon Oct 11 03:00:52 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Jingbo Xu X-Patchwork-Id: 12548961 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id A592CC433F5 for ; Mon, 11 Oct 2021 03:01:30 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 8188C60F38 for ; Mon, 11 Oct 2021 03:01:30 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S233726AbhJKDDC (ORCPT ); Sun, 10 Oct 2021 23:03:02 -0400 Received: from out30-131.freemail.mail.aliyun.com ([115.124.30.131]:51721 "EHLO out30-131.freemail.mail.aliyun.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S233705AbhJKDC5 (ORCPT ); Sun, 10 Oct 2021 23:02:57 -0400 X-Alimail-AntiSpam: AC=PASS;BC=-1|-1;BR=01201311R181e4;CH=green;DM=||false|;DS=||;FP=0|-1|-1|-1|0|-1|-1|-1;HT=e01e01424;MF=jefflexu@linux.alibaba.com;NM=1;PH=DS;RN=7;SR=0;TI=SMTPD_---0UrJTOYb_1633921256; Received: from localhost(mailfrom:jefflexu@linux.alibaba.com fp:SMTPD_---0UrJTOYb_1633921256) by smtp.aliyun-inc.com(127.0.0.1); Mon, 11 Oct 2021 11:00:57 +0800 From: Jeffle Xu To: vgoyal@redhat.com, stefanha@redhat.com, miklos@szeredi.hub Cc: linux-fsdevel@vger.kernel.org, virtio-fs@redhat.com, bo.liu@linux.alibaba.com, joseph.qi@linux.alibaba.com Subject: [PATCH v6 7/7] Documentation/filesystem/dax: record DAX on virtiofs Date: Mon, 11 Oct 2021 11:00:52 +0800 Message-Id: <20211011030052.98923-8-jefflexu@linux.alibaba.com> X-Mailer: git-send-email 2.27.0 In-Reply-To: <20211011030052.98923-1-jefflexu@linux.alibaba.com> References: <20211011030052.98923-1-jefflexu@linux.alibaba.com> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-fsdevel@vger.kernel.org Record DAX on virtiofs and the semantic difference with that on ext4 and xfs. Signed-off-by: Jeffle Xu --- Documentation/filesystems/dax.rst | 20 ++++++++++++++++++-- 1 file changed, 18 insertions(+), 2 deletions(-) diff --git a/Documentation/filesystems/dax.rst b/Documentation/filesystems/dax.rst index 9a1b8fd9e82b..e3b30429d703 100644 --- a/Documentation/filesystems/dax.rst +++ b/Documentation/filesystems/dax.rst @@ -23,8 +23,8 @@ on it as usual. The `DAX` code currently only supports files with a block size equal to your kernel's `PAGE_SIZE`, so you may need to specify a block size when creating the filesystem. -Currently 3 filesystems support `DAX`: ext2, ext4 and xfs. Enabling `DAX` on them -is different. +Currently 4 filesystems support `DAX`: ext2, ext4, xfs and virtiofs. +Enabling `DAX` on them is different. Enabling DAX on ext2 -------------------- @@ -168,6 +168,22 @@ if the underlying media does not support dax and/or the filesystem is overridden with a mount option. +Enabling DAX on virtiofs +---------------------------- +The semantic of DAX on virtiofs is basically equal to that on ext4 and xfs, +except that when '-o dax=inode' is specified, virtiofs client derives the hint +whether DAX shall be enabled or not from virtiofs server through FUSE protocol, +rather than the persistent `FS_XFLAG_DAX` flag. That is, whether DAX shall be +enabled or not is completely determined by virtiofs server, while virtiofs +server itself may deploy various algorithm making this decision, e.g. depending +on the persistent `FS_XFLAG_DAX` flag on the host. + +It is still supported to set or clear persistent `FS_XFLAG_DAX` flag inside +guest, but it is not guaranteed that DAX will be enabled or disabled for +corresponding file then. Users inside guest still need to call statx(2) and +check the statx flag `STATX_ATTR_DAX` to see if DAX is enabled for this file. + + Implementation Tips for Block Driver Writers --------------------------------------------