From patchwork Sun Feb 7 17:09:18 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Ruan Shiyang X-Patchwork-Id: 12073163 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-15.3 required=3.0 tests=BAYES_05, HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 75420C433DB for ; Sun, 7 Feb 2021 17:09:37 +0000 (UTC) Received: from ml01.01.org (ml01.01.org [198.145.21.10]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id E88B564DB1 for ; Sun, 7 Feb 2021 17:09:36 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org E88B564DB1 Authentication-Results: mail.kernel.org; dmarc=none (p=none dis=none) header.from=cn.fujitsu.com Authentication-Results: mail.kernel.org; spf=none smtp.mailfrom=linux-nvdimm-bounces@lists.01.org Received: from ml01.vlan13.01.org (localhost [IPv6:::1]) by ml01.01.org (Postfix) with ESMTP id AF102100EBB76; Sun, 7 Feb 2021 09:09:36 -0800 (PST) Received-SPF: None (mailfrom) identity=mailfrom; client-ip=183.91.158.132; helo=heian.cn.fujitsu.com; envelope-from=ruansy.fnst@cn.fujitsu.com; receiver= Received: from heian.cn.fujitsu.com (mail.cn.fujitsu.com [183.91.158.132]) by ml01.01.org (Postfix) with ESMTP id 5F35C100EBB6F for ; Sun, 7 Feb 2021 09:09:34 -0800 (PST) X-IronPort-AV: E=Sophos;i="5.81,160,1610380800"; d="scan'208";a="104299365" Received: from unknown (HELO cn.fujitsu.com) ([10.167.33.5]) by heian.cn.fujitsu.com with ESMTP; 08 Feb 2021 01:09:32 +0800 Received: from G08CNEXMBPEKD04.g08.fujitsu.local (unknown [10.167.33.201]) by cn.fujitsu.com (Postfix) with ESMTP id C6D914CE6033; Mon, 8 Feb 2021 01:09:29 +0800 (CST) Received: from G08CNEXJMPEKD02.g08.fujitsu.local (10.167.33.202) by G08CNEXMBPEKD04.g08.fujitsu.local (10.167.33.201) with Microsoft SMTP Server (TLS) id 15.0.1497.2; Mon, 8 Feb 2021 01:09:32 +0800 Received: from G08CNEXCHPEKD04.g08.fujitsu.local (10.167.33.200) by G08CNEXJMPEKD02.g08.fujitsu.local (10.167.33.202) with Microsoft SMTP Server (TLS) id 15.0.1497.2; Mon, 8 Feb 2021 01:09:31 +0800 Received: from irides.mr.mr.mr (10.167.225.141) by G08CNEXCHPEKD04.g08.fujitsu.local (10.167.33.209) with Microsoft SMTP Server id 15.0.1497.2 via Frontend Transport; Mon, 8 Feb 2021 01:09:30 +0800 From: Shiyang Ruan To: , , , Subject: [PATCH 1/7] fsdax: Output address in dax_iomap_pfn() and rename it Date: Mon, 8 Feb 2021 01:09:18 +0800 Message-ID: <20210207170924.2933035-2-ruansy.fnst@cn.fujitsu.com> X-Mailer: git-send-email 2.30.0 In-Reply-To: <20210207170924.2933035-1-ruansy.fnst@cn.fujitsu.com> References: <20210207170924.2933035-1-ruansy.fnst@cn.fujitsu.com> MIME-Version: 1.0 X-yoursite-MailScanner-ID: C6D914CE6033.AD132 X-yoursite-MailScanner: Found to be clean X-yoursite-MailScanner-From: ruansy.fnst@cn.fujitsu.com Message-ID-Hash: 6KSRUO477KRNPU6WUNAYDHUY2ASPTDYA X-Message-ID-Hash: 6KSRUO477KRNPU6WUNAYDHUY2ASPTDYA X-MailFrom: ruansy.fnst@cn.fujitsu.com X-Mailman-Rule-Misses: dmarc-mitigation; no-senders; approved; emergency; loop; banned-address; member-moderation; nonmember-moderation; administrivia; implicit-dest; max-recipients; max-size; news-moderation; no-subject; suspicious-header CC: darrick.wong@oracle.com, willy@infradead.org, jack@suse.cz, viro@zeniv.linux.org.uk, linux-btrfs@vger.kernel.org, ocfs2-devel@oss.oracle.com, david@fromorbit.com, hch@lst.de, rgoldwyn@suse.de X-Mailman-Version: 3.1.1 Precedence: list List-Id: "Linux-nvdimm developer list." Archived-At: List-Archive: List-Help: List-Post: List-Subscribe: List-Unsubscribe: Add address output in dax_iomap_pfn() in order to perform a memcpy() in CoW case. Since this function both output address and pfn, rename it to dax_iomap_direct_access(). Signed-off-by: Shiyang Ruan Reviewed-by: Christoph Hellwig --- fs/dax.c | 20 +++++++++++++++----- 1 file changed, 15 insertions(+), 5 deletions(-) diff --git a/fs/dax.c b/fs/dax.c index 5b47834f2e1b..b012b2db7ba2 100644 --- a/fs/dax.c +++ b/fs/dax.c @@ -998,8 +998,8 @@ static sector_t dax_iomap_sector(struct iomap *iomap, loff_t pos) return (iomap->addr + (pos & PAGE_MASK) - iomap->offset) >> 9; } -static int dax_iomap_pfn(struct iomap *iomap, loff_t pos, size_t size, - pfn_t *pfnp) +static int dax_iomap_direct_access(struct iomap *iomap, loff_t pos, size_t size, + void **kaddr, pfn_t *pfnp) { const sector_t sector = dax_iomap_sector(iomap, pos); pgoff_t pgoff; @@ -1011,11 +1011,13 @@ static int dax_iomap_pfn(struct iomap *iomap, loff_t pos, size_t size, return rc; id = dax_read_lock(); length = dax_direct_access(iomap->dax_dev, pgoff, PHYS_PFN(size), - NULL, pfnp); + kaddr, pfnp); if (length < 0) { rc = length; goto out; } + if (!pfnp) + goto out_check_addr; rc = -EINVAL; if (PFN_PHYS(length) < size) goto out; @@ -1025,6 +1027,12 @@ static int dax_iomap_pfn(struct iomap *iomap, loff_t pos, size_t size, if (length > 1 && !pfn_t_devmap(*pfnp)) goto out; rc = 0; + +out_check_addr: + if (!kaddr) + goto out; + if (!*kaddr) + rc = -EFAULT; out: dax_read_unlock(id); return rc; @@ -1348,7 +1356,8 @@ static vm_fault_t dax_iomap_pte_fault(struct vm_fault *vmf, pfn_t *pfnp, count_memcg_event_mm(vma->vm_mm, PGMAJFAULT); major = VM_FAULT_MAJOR; } - error = dax_iomap_pfn(&iomap, pos, PAGE_SIZE, &pfn); + error = dax_iomap_direct_access(&iomap, pos, PAGE_SIZE, + NULL, &pfn); if (error < 0) goto error_finish_iomap; @@ -1566,7 +1575,8 @@ static vm_fault_t dax_iomap_pmd_fault(struct vm_fault *vmf, pfn_t *pfnp, switch (iomap.type) { case IOMAP_MAPPED: - error = dax_iomap_pfn(&iomap, pos, PMD_SIZE, &pfn); + error = dax_iomap_direct_access(&iomap, pos, PMD_SIZE, + NULL, &pfn); if (error < 0) goto finish_iomap; From patchwork Sun Feb 7 17:09:19 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Ruan Shiyang X-Patchwork-Id: 12073169 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-16.8 required=3.0 tests=BAYES_00, HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 3FDD8C433E9 for ; Sun, 7 Feb 2021 17:09:41 +0000 (UTC) Received: from ml01.01.org (ml01.01.org [198.145.21.10]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id DC1FC64DB1 for ; Sun, 7 Feb 2021 17:09:40 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org DC1FC64DB1 Authentication-Results: mail.kernel.org; dmarc=none (p=none dis=none) header.from=cn.fujitsu.com Authentication-Results: mail.kernel.org; spf=none smtp.mailfrom=linux-nvdimm-bounces@lists.01.org Received: from ml01.vlan13.01.org (localhost [IPv6:::1]) by ml01.01.org (Postfix) with ESMTP id 10DC8100EBB96; Sun, 7 Feb 2021 09:09:40 -0800 (PST) Received-SPF: None (mailfrom) identity=mailfrom; client-ip=183.91.158.132; helo=heian.cn.fujitsu.com; envelope-from=ruansy.fnst@cn.fujitsu.com; receiver= Received: from heian.cn.fujitsu.com (mail.cn.fujitsu.com [183.91.158.132]) by ml01.01.org (Postfix) with ESMTP id 0247E100EBB6F for ; Sun, 7 Feb 2021 09:09:35 -0800 (PST) X-IronPort-AV: E=Sophos;i="5.81,160,1610380800"; d="scan'208";a="104299368" Received: from unknown (HELO cn.fujitsu.com) ([10.167.33.5]) by heian.cn.fujitsu.com with ESMTP; 08 Feb 2021 01:09:32 +0800 Received: from G08CNEXMBPEKD05.g08.fujitsu.local (unknown [10.167.33.204]) by cn.fujitsu.com (Postfix) with ESMTP id 641D84CE6F77; Mon, 8 Feb 2021 01:09:30 +0800 (CST) Received: from G08CNEXCHPEKD04.g08.fujitsu.local (10.167.33.200) by G08CNEXMBPEKD05.g08.fujitsu.local (10.167.33.204) with Microsoft SMTP Server (TLS) id 15.0.1497.2; Mon, 8 Feb 2021 01:09:32 +0800 Received: from irides.mr.mr.mr (10.167.225.141) by G08CNEXCHPEKD04.g08.fujitsu.local (10.167.33.209) with Microsoft SMTP Server id 15.0.1497.2 via Frontend Transport; Mon, 8 Feb 2021 01:09:31 +0800 From: Shiyang Ruan To: , , , Subject: [PATCH 2/7] fsdax: Introduce dax_copy_edges() for CoW Date: Mon, 8 Feb 2021 01:09:19 +0800 Message-ID: <20210207170924.2933035-3-ruansy.fnst@cn.fujitsu.com> X-Mailer: git-send-email 2.30.0 In-Reply-To: <20210207170924.2933035-1-ruansy.fnst@cn.fujitsu.com> References: <20210207170924.2933035-1-ruansy.fnst@cn.fujitsu.com> MIME-Version: 1.0 X-yoursite-MailScanner-ID: 641D84CE6F77.ACBBD X-yoursite-MailScanner: Found to be clean X-yoursite-MailScanner-From: ruansy.fnst@cn.fujitsu.com Message-ID-Hash: PE7Z22CA723GBAAB3P7YVDL45W4PCHFJ X-Message-ID-Hash: PE7Z22CA723GBAAB3P7YVDL45W4PCHFJ X-MailFrom: ruansy.fnst@cn.fujitsu.com X-Mailman-Rule-Misses: dmarc-mitigation; no-senders; approved; emergency; loop; banned-address; member-moderation; nonmember-moderation; administrivia; implicit-dest; max-recipients; max-size; news-moderation; no-subject; suspicious-header CC: darrick.wong@oracle.com, willy@infradead.org, jack@suse.cz, viro@zeniv.linux.org.uk, linux-btrfs@vger.kernel.org, ocfs2-devel@oss.oracle.com, david@fromorbit.com, hch@lst.de, rgoldwyn@suse.de, Goldwyn Rodrigues X-Mailman-Version: 3.1.1 Precedence: list List-Id: "Linux-nvdimm developer list." Archived-At: List-Archive: List-Help: List-Post: List-Subscribe: List-Unsubscribe: dax_copy_edges() is a helper functions performs a copy from one part of the device to another for data not page aligned. Signed-off-by: Goldwyn Rodrigues Signed-off-by: Shiyang Ruan --- fs/dax.c | 41 +++++++++++++++++++++++++++++++++++++++++ 1 file changed, 41 insertions(+) diff --git a/fs/dax.c b/fs/dax.c index b012b2db7ba2..ea4e8a434900 100644 --- a/fs/dax.c +++ b/fs/dax.c @@ -1038,6 +1038,47 @@ static int dax_iomap_direct_access(struct iomap *iomap, loff_t pos, size_t size, return rc; } +/* + * Copies the part of the pages not included in the write, but required for CoW + * because offset/offset+length are not page aligned. + */ +static int dax_copy_edges(loff_t pos, loff_t length, struct iomap *srcmap, + void *daddr, bool pmd) +{ + size_t page_size = pmd ? PMD_SIZE : PAGE_SIZE; + loff_t offset = pos & (page_size - 1); + size_t size = ALIGN(offset + length, page_size); + loff_t end = pos + length; + loff_t pg_end = round_up(end, page_size); + void *saddr = 0; + int ret = 0; + + ret = dax_iomap_direct_access(srcmap, pos, size, &saddr, NULL); + if (ret) + return ret; + /* + * Copy the first part of the page + * Note: we pass offset as length + */ + if (offset) { + if (saddr) + ret = copy_mc_to_kernel(daddr, saddr, offset); + else + memset(daddr, 0, offset); + } + + /* Copy the last part of the range */ + if (end < pg_end) { + if (saddr) + ret = copy_mc_to_kernel(daddr + offset + length, + saddr + offset + length, pg_end - end); + else + memset(daddr + offset + length, 0, pg_end - end); + } + + return ret; +} + /* * The user has performed a load from a hole in the file. Allocating a new * page in the file would cause excessive storage usage for workloads with From patchwork Sun Feb 7 17:09:20 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Ruan Shiyang X-Patchwork-Id: 12073165 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-16.8 required=3.0 tests=BAYES_00, HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 51BA3C433E0 for ; Sun, 7 Feb 2021 17:09:39 +0000 (UTC) Received: from ml01.01.org (ml01.01.org [198.145.21.10]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id 031EE64DB2 for ; Sun, 7 Feb 2021 17:09:38 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org 031EE64DB2 Authentication-Results: mail.kernel.org; dmarc=none (p=none dis=none) header.from=cn.fujitsu.com Authentication-Results: mail.kernel.org; spf=none smtp.mailfrom=linux-nvdimm-bounces@lists.01.org Received: from ml01.vlan13.01.org (localhost [IPv6:::1]) by ml01.01.org (Postfix) with ESMTP id CDCF5100EBB81; Sun, 7 Feb 2021 09:09:38 -0800 (PST) Received-SPF: None (mailfrom) identity=mailfrom; client-ip=183.91.158.132; helo=heian.cn.fujitsu.com; envelope-from=ruansy.fnst@cn.fujitsu.com; receiver= Received: from heian.cn.fujitsu.com (mail.cn.fujitsu.com [183.91.158.132]) by ml01.01.org (Postfix) with ESMTP id 326CA100EBB6F for ; Sun, 7 Feb 2021 09:09:35 -0800 (PST) X-IronPort-AV: E=Sophos;i="5.81,160,1610380800"; d="scan'208";a="104299366" Received: from unknown (HELO cn.fujitsu.com) ([10.167.33.5]) by heian.cn.fujitsu.com with ESMTP; 08 Feb 2021 01:09:32 +0800 Received: from G08CNEXMBPEKD06.g08.fujitsu.local (unknown [10.167.33.206]) by cn.fujitsu.com (Postfix) with ESMTP id A5EAA4CE6F78; Mon, 8 Feb 2021 01:09:31 +0800 (CST) Received: from G08CNEXCHPEKD04.g08.fujitsu.local (10.167.33.200) by G08CNEXMBPEKD06.g08.fujitsu.local (10.167.33.206) with Microsoft SMTP Server (TLS) id 15.0.1497.2; Mon, 8 Feb 2021 01:09:32 +0800 Received: from irides.mr.mr.mr (10.167.225.141) by G08CNEXCHPEKD04.g08.fujitsu.local (10.167.33.209) with Microsoft SMTP Server id 15.0.1497.2 via Frontend Transport; Mon, 8 Feb 2021 01:09:33 +0800 From: Shiyang Ruan To: , , , Subject: [PATCH 3/7] fsdax: Copy data before write Date: Mon, 8 Feb 2021 01:09:20 +0800 Message-ID: <20210207170924.2933035-4-ruansy.fnst@cn.fujitsu.com> X-Mailer: git-send-email 2.30.0 In-Reply-To: <20210207170924.2933035-1-ruansy.fnst@cn.fujitsu.com> References: <20210207170924.2933035-1-ruansy.fnst@cn.fujitsu.com> MIME-Version: 1.0 X-yoursite-MailScanner-ID: A5EAA4CE6F78.AEB4A X-yoursite-MailScanner: Found to be clean X-yoursite-MailScanner-From: ruansy.fnst@cn.fujitsu.com Message-ID-Hash: VCBQMJB35E6CNG77CKGTZRQVWF3UWWOB X-Message-ID-Hash: VCBQMJB35E6CNG77CKGTZRQVWF3UWWOB X-MailFrom: ruansy.fnst@cn.fujitsu.com X-Mailman-Rule-Misses: dmarc-mitigation; no-senders; approved; emergency; loop; banned-address; member-moderation; nonmember-moderation; administrivia; implicit-dest; max-recipients; max-size; news-moderation; no-subject; suspicious-header CC: darrick.wong@oracle.com, willy@infradead.org, jack@suse.cz, viro@zeniv.linux.org.uk, linux-btrfs@vger.kernel.org, ocfs2-devel@oss.oracle.com, david@fromorbit.com, hch@lst.de, rgoldwyn@suse.de X-Mailman-Version: 3.1.1 Precedence: list List-Id: "Linux-nvdimm developer list." Archived-At: List-Archive: List-Help: List-Post: List-Subscribe: List-Unsubscribe: Add dax_copy_edges() into each dax actor functions to perform CoW. Signed-off-by: Shiyang Ruan --- fs/dax.c | 37 ++++++++++++++++++++++++++++++++++--- 1 file changed, 34 insertions(+), 3 deletions(-) diff --git a/fs/dax.c b/fs/dax.c index ea4e8a434900..b2195cbdf2dc 100644 --- a/fs/dax.c +++ b/fs/dax.c @@ -1161,7 +1161,8 @@ dax_iomap_actor(struct inode *inode, loff_t pos, loff_t length, void *data, return iov_iter_zero(min(length, end - pos), iter); } - if (WARN_ON_ONCE(iomap->type != IOMAP_MAPPED)) + if (WARN_ON_ONCE(iomap->type != IOMAP_MAPPED && + !(iomap->flags & IOMAP_F_SHARED))) return -EIO; /* @@ -1200,6 +1201,12 @@ dax_iomap_actor(struct inode *inode, loff_t pos, loff_t length, void *data, break; } + if (iomap != srcmap) { + ret = dax_copy_edges(pos, length, srcmap, kaddr, false); + if (ret) + break; + } + map_len = PFN_PHYS(map_len); kaddr += offset; map_len -= offset; @@ -1311,6 +1318,7 @@ static vm_fault_t dax_iomap_pte_fault(struct vm_fault *vmf, pfn_t *pfnp, vm_fault_t ret = 0; void *entry; pfn_t pfn; + void *kaddr; trace_dax_pte_fault(inode, vmf, ret); /* @@ -1392,19 +1400,27 @@ static vm_fault_t dax_iomap_pte_fault(struct vm_fault *vmf, pfn_t *pfnp, switch (iomap.type) { case IOMAP_MAPPED: +cow: if (iomap.flags & IOMAP_F_NEW) { count_vm_event(PGMAJFAULT); count_memcg_event_mm(vma->vm_mm, PGMAJFAULT); major = VM_FAULT_MAJOR; } error = dax_iomap_direct_access(&iomap, pos, PAGE_SIZE, - NULL, &pfn); + &kaddr, &pfn); if (error < 0) goto error_finish_iomap; entry = dax_insert_entry(&xas, mapping, vmf, entry, pfn, 0, write && !sync); + if (srcmap.type != IOMAP_HOLE) { + error = dax_copy_edges(pos, PAGE_SIZE, &srcmap, kaddr, + false); + if (error) + goto error_finish_iomap; + } + /* * If we are doing synchronous page fault and inode needs fsync, * we can insert PTE into page tables only after that happens. @@ -1428,6 +1444,9 @@ static vm_fault_t dax_iomap_pte_fault(struct vm_fault *vmf, pfn_t *pfnp, goto finish_iomap; case IOMAP_UNWRITTEN: + if (write && iomap.flags & IOMAP_F_SHARED) + goto cow; + fallthrough; case IOMAP_HOLE: if (!write) { ret = dax_load_hole(&xas, mapping, &entry, vmf); @@ -1535,6 +1554,7 @@ static vm_fault_t dax_iomap_pmd_fault(struct vm_fault *vmf, pfn_t *pfnp, loff_t pos; int error; pfn_t pfn; + void *kaddr; /* * Check whether offset isn't beyond end of file now. Caller is @@ -1616,14 +1636,22 @@ static vm_fault_t dax_iomap_pmd_fault(struct vm_fault *vmf, pfn_t *pfnp, switch (iomap.type) { case IOMAP_MAPPED: +cow: error = dax_iomap_direct_access(&iomap, pos, PMD_SIZE, - NULL, &pfn); + &kaddr, &pfn); if (error < 0) goto finish_iomap; entry = dax_insert_entry(&xas, mapping, vmf, entry, pfn, DAX_PMD, write && !sync); + if (srcmap.type != IOMAP_HOLE) { + error = dax_copy_edges(pos, PMD_SIZE, &srcmap, kaddr, + true); + if (error) + goto unlock_entry; + } + /* * If we are doing synchronous page fault and inode needs fsync, * we can insert PMD into page tables only after that happens. @@ -1642,6 +1670,9 @@ static vm_fault_t dax_iomap_pmd_fault(struct vm_fault *vmf, pfn_t *pfnp, result = vmf_insert_pfn_pmd(vmf, pfn, write); break; case IOMAP_UNWRITTEN: + if (write && iomap.flags & IOMAP_F_SHARED) + goto cow; + fallthrough; case IOMAP_HOLE: if (WARN_ON_ONCE(write)) break; From patchwork Sun Feb 7 17:09:21 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Ruan Shiyang X-Patchwork-Id: 12073171 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-16.8 required=3.0 tests=BAYES_00, HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 831D3C433DB for ; Sun, 7 Feb 2021 17:09:43 +0000 (UTC) Received: from ml01.01.org (ml01.01.org [198.145.21.10]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id 3C2C664DBD for ; Sun, 7 Feb 2021 17:09:43 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org 3C2C664DBD Authentication-Results: mail.kernel.org; dmarc=none (p=none dis=none) header.from=cn.fujitsu.com Authentication-Results: mail.kernel.org; spf=none smtp.mailfrom=linux-nvdimm-bounces@lists.01.org Received: from ml01.vlan13.01.org (localhost [IPv6:::1]) by ml01.01.org (Postfix) with ESMTP id 2EF45100EBB9D; Sun, 7 Feb 2021 09:09:43 -0800 (PST) Received-SPF: None (mailfrom) identity=mailfrom; client-ip=183.91.158.132; helo=heian.cn.fujitsu.com; envelope-from=ruansy.fnst@cn.fujitsu.com; receiver= Received: from heian.cn.fujitsu.com (mail.cn.fujitsu.com [183.91.158.132]) by ml01.01.org (Postfix) with ESMTP id 5D196100EBB82 for ; Sun, 7 Feb 2021 09:09:39 -0800 (PST) X-IronPort-AV: E=Sophos;i="5.81,160,1610380800"; d="scan'208";a="104299370" Received: from unknown (HELO cn.fujitsu.com) ([10.167.33.5]) by heian.cn.fujitsu.com with ESMTP; 08 Feb 2021 01:09:38 +0800 Received: from G08CNEXMBPEKD04.g08.fujitsu.local (unknown [10.167.33.201]) by cn.fujitsu.com (Postfix) with ESMTP id BB7304CE6F6E; Mon, 8 Feb 2021 01:09:32 +0800 (CST) Received: from G08CNEXCHPEKD04.g08.fujitsu.local (10.167.33.200) by G08CNEXMBPEKD04.g08.fujitsu.local (10.167.33.201) with Microsoft SMTP Server (TLS) id 15.0.1497.2; Mon, 8 Feb 2021 01:09:34 +0800 Received: from irides.mr.mr.mr (10.167.225.141) by G08CNEXCHPEKD04.g08.fujitsu.local (10.167.33.209) with Microsoft SMTP Server id 15.0.1497.2 via Frontend Transport; Mon, 8 Feb 2021 01:09:34 +0800 From: Shiyang Ruan To: , , , Subject: [PATCH 4/7] fsdax: Replace mmap entry in case of CoW Date: Mon, 8 Feb 2021 01:09:21 +0800 Message-ID: <20210207170924.2933035-5-ruansy.fnst@cn.fujitsu.com> X-Mailer: git-send-email 2.30.0 In-Reply-To: <20210207170924.2933035-1-ruansy.fnst@cn.fujitsu.com> References: <20210207170924.2933035-1-ruansy.fnst@cn.fujitsu.com> MIME-Version: 1.0 X-yoursite-MailScanner-ID: BB7304CE6F6E.AF719 X-yoursite-MailScanner: Found to be clean X-yoursite-MailScanner-From: ruansy.fnst@cn.fujitsu.com Message-ID-Hash: 3YO2PVUBUVBATOF3WV5EKBSD6TXWATGT X-Message-ID-Hash: 3YO2PVUBUVBATOF3WV5EKBSD6TXWATGT X-MailFrom: ruansy.fnst@cn.fujitsu.com X-Mailman-Rule-Misses: dmarc-mitigation; no-senders; approved; emergency; loop; banned-address; member-moderation; nonmember-moderation; administrivia; implicit-dest; max-recipients; max-size; news-moderation; no-subject; suspicious-header CC: darrick.wong@oracle.com, willy@infradead.org, jack@suse.cz, viro@zeniv.linux.org.uk, linux-btrfs@vger.kernel.org, ocfs2-devel@oss.oracle.com, david@fromorbit.com, hch@lst.de, rgoldwyn@suse.de, Goldwyn Rodrigues X-Mailman-Version: 3.1.1 Precedence: list List-Id: "Linux-nvdimm developer list." Archived-At: List-Archive: List-Help: List-Post: List-Subscribe: List-Unsubscribe: We replace the existing entry to the newly allocated one in case of CoW. Also, we mark the entry as PAGECACHE_TAG_TOWRITE so writeback marks this entry as writeprotected. This helps us snapshots so new write pagefaults after snapshots trigger a CoW. Signed-off-by: Goldwyn Rodrigues Signed-off-by: Shiyang Ruan --- fs/dax.c | 31 +++++++++++++++++++++++-------- 1 file changed, 23 insertions(+), 8 deletions(-) diff --git a/fs/dax.c b/fs/dax.c index b2195cbdf2dc..29698a3d2e37 100644 --- a/fs/dax.c +++ b/fs/dax.c @@ -722,6 +722,9 @@ static int copy_cow_page_dax(struct block_device *bdev, struct dax_device *dax_d return 0; } +#define DAX_IF_DIRTY (1ULL << 0) +#define DAX_IF_COW (1ULL << 1) + /* * By this point grab_mapping_entry() has ensured that we have a locked entry * of the appropriate size so we don't have to worry about downgrading PMDs to @@ -731,14 +734,16 @@ static int copy_cow_page_dax(struct block_device *bdev, struct dax_device *dax_d */ static void *dax_insert_entry(struct xa_state *xas, struct address_space *mapping, struct vm_fault *vmf, - void *entry, pfn_t pfn, unsigned long flags, bool dirty) + void *entry, pfn_t pfn, unsigned long flags, bool insert_flags) { void *new_entry = dax_make_entry(pfn, flags); + bool dirty = insert_flags & DAX_IF_DIRTY; + bool cow = insert_flags & DAX_IF_COW; if (dirty) __mark_inode_dirty(mapping->host, I_DIRTY_PAGES); - if (dax_is_zero_entry(entry) && !(flags & DAX_ZERO_PAGE)) { + if (cow || (dax_is_zero_entry(entry) && !(flags & DAX_ZERO_PAGE))) { unsigned long index = xas->xa_index; /* we are replacing a zero page with block mapping */ if (dax_is_pmd_entry(entry)) @@ -750,7 +755,7 @@ static void *dax_insert_entry(struct xa_state *xas, xas_reset(xas); xas_lock_irq(xas); - if (dax_is_zero_entry(entry) || dax_is_empty_entry(entry)) { + if (cow || dax_is_zero_entry(entry) || dax_is_empty_entry(entry)) { void *old; dax_disassociate_entry(entry, mapping, false); @@ -774,6 +779,9 @@ static void *dax_insert_entry(struct xa_state *xas, if (dirty) xas_set_mark(xas, PAGECACHE_TAG_DIRTY); + if (cow) + xas_set_mark(xas, PAGECACHE_TAG_TOWRITE); + xas_unlock_irq(xas); return entry; } @@ -1319,6 +1327,7 @@ static vm_fault_t dax_iomap_pte_fault(struct vm_fault *vmf, pfn_t *pfnp, void *entry; pfn_t pfn; void *kaddr; + unsigned long insert_flags = 0; trace_dax_pte_fault(inode, vmf, ret); /* @@ -1444,8 +1453,10 @@ static vm_fault_t dax_iomap_pte_fault(struct vm_fault *vmf, pfn_t *pfnp, goto finish_iomap; case IOMAP_UNWRITTEN: - if (write && iomap.flags & IOMAP_F_SHARED) + if (write && (iomap.flags & IOMAP_F_SHARED)) { + insert_flags |= DAX_IF_COW; goto cow; + } fallthrough; case IOMAP_HOLE: if (!write) { @@ -1555,6 +1566,7 @@ static vm_fault_t dax_iomap_pmd_fault(struct vm_fault *vmf, pfn_t *pfnp, int error; pfn_t pfn; void *kaddr; + unsigned long insert_flags = 0; /* * Check whether offset isn't beyond end of file now. Caller is @@ -1670,14 +1682,17 @@ static vm_fault_t dax_iomap_pmd_fault(struct vm_fault *vmf, pfn_t *pfnp, result = vmf_insert_pfn_pmd(vmf, pfn, write); break; case IOMAP_UNWRITTEN: - if (write && iomap.flags & IOMAP_F_SHARED) + if (write && (iomap.flags & IOMAP_F_SHARED)) { + insert_flags |= DAX_IF_COW; goto cow; + } fallthrough; case IOMAP_HOLE: - if (WARN_ON_ONCE(write)) + if (!write) { + result = dax_pmd_load_hole(&xas, vmf, &iomap, &entry); break; - result = dax_pmd_load_hole(&xas, vmf, &iomap, &entry); - break; + } + fallthrough; default: WARN_ON_ONCE(1); break; From patchwork Sun Feb 7 17:09:22 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Ruan Shiyang X-Patchwork-Id: 12073175 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-16.8 required=3.0 tests=BAYES_00, HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id BDF6EC433E6 for ; Sun, 7 Feb 2021 17:09:45 +0000 (UTC) Received: from ml01.01.org (ml01.01.org [198.145.21.10]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id 8324264DB2 for ; Sun, 7 Feb 2021 17:09:45 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org 8324264DB2 Authentication-Results: mail.kernel.org; dmarc=none (p=none dis=none) header.from=cn.fujitsu.com Authentication-Results: mail.kernel.org; spf=none smtp.mailfrom=linux-nvdimm-bounces@lists.01.org Received: from ml01.vlan13.01.org (localhost [IPv6:::1]) by ml01.01.org (Postfix) with ESMTP id 6B629100EB827; Sun, 7 Feb 2021 09:09:44 -0800 (PST) Received-SPF: None (mailfrom) identity=mailfrom; client-ip=183.91.158.132; helo=heian.cn.fujitsu.com; envelope-from=ruansy.fnst@cn.fujitsu.com; receiver= Received: from heian.cn.fujitsu.com (mail.cn.fujitsu.com [183.91.158.132]) by ml01.01.org (Postfix) with ESMTP id 35A33100EBB94 for ; Sun, 7 Feb 2021 09:09:40 -0800 (PST) X-IronPort-AV: E=Sophos;i="5.81,160,1610380800"; d="scan'208";a="104299372" Received: from unknown (HELO cn.fujitsu.com) ([10.167.33.5]) by heian.cn.fujitsu.com with ESMTP; 08 Feb 2021 01:09:38 +0800 Received: from G08CNEXMBPEKD04.g08.fujitsu.local (unknown [10.167.33.201]) by cn.fujitsu.com (Postfix) with ESMTP id EE5B64CE6033; Mon, 8 Feb 2021 01:09:33 +0800 (CST) Received: from G08CNEXCHPEKD04.g08.fujitsu.local (10.167.33.200) by G08CNEXMBPEKD04.g08.fujitsu.local (10.167.33.201) with Microsoft SMTP Server (TLS) id 15.0.1497.2; Mon, 8 Feb 2021 01:09:36 +0800 Received: from irides.mr.mr.mr (10.167.225.141) by G08CNEXCHPEKD04.g08.fujitsu.local (10.167.33.209) with Microsoft SMTP Server id 15.0.1497.2 via Frontend Transport; Mon, 8 Feb 2021 01:09:35 +0800 From: Shiyang Ruan To: , , , Subject: [PATCH 5/7] fsdax: Dedup file range to use a compare function Date: Mon, 8 Feb 2021 01:09:22 +0800 Message-ID: <20210207170924.2933035-6-ruansy.fnst@cn.fujitsu.com> X-Mailer: git-send-email 2.30.0 In-Reply-To: <20210207170924.2933035-1-ruansy.fnst@cn.fujitsu.com> References: <20210207170924.2933035-1-ruansy.fnst@cn.fujitsu.com> MIME-Version: 1.0 X-yoursite-MailScanner-ID: EE5B64CE6033.AF9D0 X-yoursite-MailScanner: Found to be clean X-yoursite-MailScanner-From: ruansy.fnst@cn.fujitsu.com Message-ID-Hash: QLS4QKYWA6CACQP6ZG37JLEAVDQI2ZXF X-Message-ID-Hash: QLS4QKYWA6CACQP6ZG37JLEAVDQI2ZXF X-MailFrom: ruansy.fnst@cn.fujitsu.com X-Mailman-Rule-Misses: dmarc-mitigation; no-senders; approved; emergency; loop; banned-address; member-moderation; nonmember-moderation; administrivia; implicit-dest; max-recipients; max-size; news-moderation; no-subject; suspicious-header CC: darrick.wong@oracle.com, willy@infradead.org, jack@suse.cz, viro@zeniv.linux.org.uk, linux-btrfs@vger.kernel.org, ocfs2-devel@oss.oracle.com, david@fromorbit.com, hch@lst.de, rgoldwyn@suse.de, Goldwyn Rodrigues X-Mailman-Version: 3.1.1 Precedence: list List-Id: "Linux-nvdimm developer list." Archived-At: List-Archive: List-Help: List-Post: List-Subscribe: List-Unsubscribe: With dax we cannot deal with readpage() etc. So, we create a funciton callback to perform the file data comparison and pass it to generic_remap_file_range_prep() so it can use iomap-based functions. Signed-off-by: Goldwyn Rodrigues Signed-off-by: Shiyang Ruan --- fs/btrfs/reflink.c | 3 +- fs/dax.c | 67 ++++++++++++++++++++++++++++++++++++++++++++ fs/ocfs2/file.c | 2 +- fs/remap_range.c | 14 +++++---- fs/xfs/xfs_reflink.c | 2 +- include/linux/dax.h | 5 ++++ include/linux/fs.h | 9 +++++- 7 files changed, 92 insertions(+), 10 deletions(-) diff --git a/fs/btrfs/reflink.c b/fs/btrfs/reflink.c index 99aa87c08912..efe094f7f748 100644 --- a/fs/btrfs/reflink.c +++ b/fs/btrfs/reflink.c @@ -783,7 +783,8 @@ static int btrfs_remap_file_range_prep(struct file *file_in, loff_t pos_in, return ret; return generic_remap_file_range_prep(file_in, pos_in, file_out, pos_out, - len, remap_flags); + len, remap_flags, + vfs_dedupe_file_range_compare); } loff_t btrfs_remap_file_range(struct file *src_file, loff_t off, diff --git a/fs/dax.c b/fs/dax.c index 29698a3d2e37..6e9391845057 100644 --- a/fs/dax.c +++ b/fs/dax.c @@ -30,6 +30,8 @@ #define CREATE_TRACE_POINTS #include +#define MIN(a, b) (((a) < (b)) ? (a) : (b)) + static inline unsigned int pe_order(enum page_entry_size pe_size) { if (pe_size == PE_SIZE_PTE) @@ -1827,3 +1829,68 @@ vm_fault_t dax_finish_sync_fault(struct vm_fault *vmf, return dax_insert_pfn_mkwrite(vmf, pfn, order); } EXPORT_SYMBOL_GPL(dax_finish_sync_fault); + +int dax_file_range_compare(struct inode *src, loff_t srcoff, struct inode *dest, + loff_t destoff, loff_t len, bool *is_same, + const struct iomap_ops *ops) +{ + void *saddr, *daddr; + struct iomap smap = { 0 }; + struct iomap dmap = { 0 }; + bool same = false; + loff_t cmp_len; + int id, ret = 0; + + id = dax_read_lock(); + while (len) { + ret = ops->iomap_begin(src, srcoff, len, 0, &smap, NULL); + if (ret < 0) + goto out_src; + cmp_len = MIN(len, smap.offset + smap.length - srcoff); + + ret = ops->iomap_begin(dest, destoff, cmp_len, 0, &dmap, NULL); + if (ret < 0) + goto out_dest; + cmp_len = MIN(cmp_len, dmap.offset + dmap.length - destoff); + + if (smap.type == IOMAP_HOLE && dmap.type == IOMAP_HOLE) + goto next; + + if (smap.type == IOMAP_HOLE || dmap.type == IOMAP_HOLE) { + same = false; + goto next; + } + + ret = dax_iomap_direct_access(&smap, srcoff, + ALIGN(srcoff + cmp_len, PAGE_SIZE), &saddr, NULL); + if (ret < 0) + goto out_dest; + + ret = dax_iomap_direct_access(&dmap, destoff, + ALIGN(destoff + cmp_len, PAGE_SIZE), &daddr, NULL); + if (ret < 0) + goto out_dest; + + same = !memcmp(saddr, daddr, cmp_len); + if (!same) + break; +next: + len -= cmp_len; + srcoff += cmp_len; + destoff += cmp_len; +out_dest: + if (ops->iomap_end) + ops->iomap_end(dest, destoff, cmp_len, 0, 0, &dmap); +out_src: + if (ops->iomap_end) + ops->iomap_end(src, srcoff, len, 0, 0, &smap); + + if (ret < 0) + goto out; + } + *is_same = same; +out: + dax_read_unlock(id); + return ret; +} +EXPORT_SYMBOL_GPL(dax_file_range_compare); diff --git a/fs/ocfs2/file.c b/fs/ocfs2/file.c index 85979e2214b3..9d101f129d16 100644 --- a/fs/ocfs2/file.c +++ b/fs/ocfs2/file.c @@ -2591,7 +2591,7 @@ static loff_t ocfs2_remap_file_range(struct file *file_in, loff_t pos_in, goto out_unlock; ret = generic_remap_file_range_prep(file_in, pos_in, file_out, pos_out, - &len, remap_flags); + &len, remap_flags, vfs_dedupe_file_range_compare); if (ret < 0 || len == 0) goto out_unlock; diff --git a/fs/remap_range.c b/fs/remap_range.c index e6099beefa97..fc64f7e2af5a 100644 --- a/fs/remap_range.c +++ b/fs/remap_range.c @@ -199,9 +199,9 @@ static void vfs_unlock_two_pages(struct page *page1, struct page *page2) * Compare extents of two files to see if they are the same. * Caller must have locked both inodes to prevent write races. */ -static int vfs_dedupe_file_range_compare(struct inode *src, loff_t srcoff, - struct inode *dest, loff_t destoff, - loff_t len, bool *is_same) +int vfs_dedupe_file_range_compare(struct inode *src, loff_t srcoff, + struct inode *dest, loff_t destoff, + loff_t len, bool *is_same) { loff_t src_poff; loff_t dest_poff; @@ -280,6 +280,7 @@ static int vfs_dedupe_file_range_compare(struct inode *src, loff_t srcoff, out_error: return error; } +EXPORT_SYMBOL(vfs_dedupe_file_range_compare); /* * Check that the two inodes are eligible for cloning, the ranges make @@ -291,7 +292,8 @@ static int vfs_dedupe_file_range_compare(struct inode *src, loff_t srcoff, */ int generic_remap_file_range_prep(struct file *file_in, loff_t pos_in, struct file *file_out, loff_t pos_out, - loff_t *len, unsigned int remap_flags) + loff_t *len, unsigned int remap_flags, + compare_range_t compare_range_fn) { struct inode *inode_in = file_inode(file_in); struct inode *inode_out = file_inode(file_out); @@ -351,8 +353,8 @@ int generic_remap_file_range_prep(struct file *file_in, loff_t pos_in, if (remap_flags & REMAP_FILE_DEDUP) { bool is_same = false; - ret = vfs_dedupe_file_range_compare(inode_in, pos_in, - inode_out, pos_out, *len, &is_same); + ret = compare_range_fn(inode_in, pos_in, inode_out, pos_out, + *len, &is_same); if (ret) return ret; if (!is_same) diff --git a/fs/xfs/xfs_reflink.c b/fs/xfs/xfs_reflink.c index 6fa05fb78189..26708c01fd78 100644 --- a/fs/xfs/xfs_reflink.c +++ b/fs/xfs/xfs_reflink.c @@ -1309,7 +1309,7 @@ xfs_reflink_remap_prep( goto out_unlock; ret = generic_remap_file_range_prep(file_in, pos_in, file_out, pos_out, - len, remap_flags); + len, remap_flags, vfs_dedupe_file_range_compare); if (ret || *len == 0) goto out_unlock; diff --git a/include/linux/dax.h b/include/linux/dax.h index b52f084aa643..183e084c2ac7 100644 --- a/include/linux/dax.h +++ b/include/linux/dax.h @@ -238,6 +238,11 @@ int dax_delete_mapping_entry(struct address_space *mapping, pgoff_t index); int dax_invalidate_mapping_entry_sync(struct address_space *mapping, pgoff_t index); s64 dax_iomap_zero(loff_t pos, u64 length, struct iomap *iomap); +int dax_file_range_compare(struct inode *src, loff_t srcoff, + struct inode *dest, loff_t destoff, + loff_t len, bool *is_same, + const struct iomap_ops *ops); + static inline bool dax_mapping(struct address_space *mapping) { return mapping->host && IS_DAX(mapping->host); diff --git a/include/linux/fs.h b/include/linux/fs.h index 8667d0cdc71e..159ec755e47b 100644 --- a/include/linux/fs.h +++ b/include/linux/fs.h @@ -1912,13 +1912,20 @@ extern ssize_t vfs_read(struct file *, char __user *, size_t, loff_t *); extern ssize_t vfs_write(struct file *, const char __user *, size_t, loff_t *); extern ssize_t vfs_copy_file_range(struct file *, loff_t , struct file *, loff_t, size_t, unsigned int); +extern int vfs_dedupe_file_range_compare(struct inode *src, loff_t srcoff, + struct inode *dest, loff_t destoff, + loff_t len, bool *is_same); +typedef int (*compare_range_t)(struct inode *src, loff_t srcpos, + struct inode *dest, loff_t destpos, + loff_t len, bool *is_same); extern ssize_t generic_copy_file_range(struct file *file_in, loff_t pos_in, struct file *file_out, loff_t pos_out, size_t len, unsigned int flags); extern int generic_remap_file_range_prep(struct file *file_in, loff_t pos_in, struct file *file_out, loff_t pos_out, loff_t *count, - unsigned int remap_flags); + unsigned int remap_flags, + compare_range_t compare_range_fn); extern loff_t do_clone_file_range(struct file *file_in, loff_t pos_in, struct file *file_out, loff_t pos_out, loff_t len, unsigned int remap_flags); From patchwork Sun Feb 7 17:09:23 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Ruan Shiyang X-Patchwork-Id: 12073173 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-16.8 required=3.0 tests=BAYES_00, HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 24D59C433E0 for ; Sun, 7 Feb 2021 17:09:44 +0000 (UTC) Received: from ml01.01.org (ml01.01.org [198.145.21.10]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id E9DD264DB1 for ; Sun, 7 Feb 2021 17:09:43 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org E9DD264DB1 Authentication-Results: mail.kernel.org; dmarc=none (p=none dis=none) header.from=cn.fujitsu.com Authentication-Results: mail.kernel.org; spf=none smtp.mailfrom=linux-nvdimm-bounces@lists.01.org Received: from ml01.vlan13.01.org (localhost [IPv6:::1]) by ml01.01.org (Postfix) with ESMTP id 5162F100EB825; Sun, 7 Feb 2021 09:09:43 -0800 (PST) Received-SPF: None (mailfrom) identity=mailfrom; client-ip=183.91.158.132; helo=heian.cn.fujitsu.com; envelope-from=ruansy.fnst@cn.fujitsu.com; receiver= Received: from heian.cn.fujitsu.com (mail.cn.fujitsu.com [183.91.158.132]) by ml01.01.org (Postfix) with ESMTP id 722AB100EBB92 for ; Sun, 7 Feb 2021 09:09:39 -0800 (PST) X-IronPort-AV: E=Sophos;i="5.81,160,1610380800"; d="scan'208";a="104299371" Received: from unknown (HELO cn.fujitsu.com) ([10.167.33.5]) by heian.cn.fujitsu.com with ESMTP; 08 Feb 2021 01:09:38 +0800 Received: from G08CNEXMBPEKD06.g08.fujitsu.local (unknown [10.167.33.206]) by cn.fujitsu.com (Postfix) with ESMTP id BF2554CE6F77; Mon, 8 Feb 2021 01:09:35 +0800 (CST) Received: from G08CNEXJMPEKD02.g08.fujitsu.local (10.167.33.202) by G08CNEXMBPEKD06.g08.fujitsu.local (10.167.33.206) with Microsoft SMTP Server (TLS) id 15.0.1497.2; Mon, 8 Feb 2021 01:09:36 +0800 Received: from G08CNEXCHPEKD04.g08.fujitsu.local (10.167.33.200) by G08CNEXJMPEKD02.g08.fujitsu.local (10.167.33.202) with Microsoft SMTP Server (TLS) id 15.0.1497.2; Mon, 8 Feb 2021 01:09:37 +0800 Received: from irides.mr.mr.mr (10.167.225.141) by G08CNEXCHPEKD04.g08.fujitsu.local (10.167.33.209) with Microsoft SMTP Server id 15.0.1497.2 via Frontend Transport; Mon, 8 Feb 2021 01:09:36 +0800 From: Shiyang Ruan To: , , , Subject: [PATCH 6/7] fs/xfs: Handle CoW for fsdax write() path Date: Mon, 8 Feb 2021 01:09:23 +0800 Message-ID: <20210207170924.2933035-7-ruansy.fnst@cn.fujitsu.com> X-Mailer: git-send-email 2.30.0 In-Reply-To: <20210207170924.2933035-1-ruansy.fnst@cn.fujitsu.com> References: <20210207170924.2933035-1-ruansy.fnst@cn.fujitsu.com> MIME-Version: 1.0 X-yoursite-MailScanner-ID: BF2554CE6F77.AE2A0 X-yoursite-MailScanner: Found to be clean X-yoursite-MailScanner-From: ruansy.fnst@cn.fujitsu.com Message-ID-Hash: 7TQN5HEB3YRSJDNRQV6KDOXTD26BJC37 X-Message-ID-Hash: 7TQN5HEB3YRSJDNRQV6KDOXTD26BJC37 X-MailFrom: ruansy.fnst@cn.fujitsu.com X-Mailman-Rule-Misses: dmarc-mitigation; no-senders; approved; emergency; loop; banned-address; member-moderation; nonmember-moderation; administrivia; implicit-dest; max-recipients; max-size; news-moderation; no-subject; suspicious-header CC: darrick.wong@oracle.com, willy@infradead.org, jack@suse.cz, viro@zeniv.linux.org.uk, linux-btrfs@vger.kernel.org, ocfs2-devel@oss.oracle.com, david@fromorbit.com, hch@lst.de, rgoldwyn@suse.de X-Mailman-Version: 3.1.1 Precedence: list List-Id: "Linux-nvdimm developer list." Archived-At: List-Archive: List-Help: List-Post: List-Subscribe: List-Unsubscribe: In fsdax mode, WRITE and ZERO on a shared extent need CoW mechanism performed. After CoW, new extents needs to be remapped to the file. Signed-off-by: Shiyang Ruan --- fs/xfs/xfs_bmap_util.c | 6 +++++- fs/xfs/xfs_file.c | 10 +++++++--- fs/xfs/xfs_iomap.c | 3 ++- fs/xfs/xfs_iops.c | 11 ++++++++--- fs/xfs/xfs_reflink.c | 2 ++ 5 files changed, 24 insertions(+), 8 deletions(-) diff --git a/fs/xfs/xfs_bmap_util.c b/fs/xfs/xfs_bmap_util.c index 7371a7f7c652..eb0c7ff103f8 100644 --- a/fs/xfs/xfs_bmap_util.c +++ b/fs/xfs/xfs_bmap_util.c @@ -977,10 +977,14 @@ xfs_free_file_space( if (offset + len > XFS_ISIZE(ip)) len = XFS_ISIZE(ip) - offset; error = iomap_zero_range(VFS_I(ip), offset, len, NULL, - &xfs_buffered_write_iomap_ops); + IS_DAX(VFS_I(ip)) ? + &xfs_direct_write_iomap_ops : &xfs_buffered_write_iomap_ops); if (error) return error; + if (xfs_is_reflink_inode(ip)) + xfs_reflink_end_cow(ip, offset, len); + /* * If we zeroed right up to EOF and EOF straddles a page boundary we * must make sure that the post-EOF area is also zeroed because the diff --git a/fs/xfs/xfs_file.c b/fs/xfs/xfs_file.c index 5b0f93f73837..ab738641a3f4 100644 --- a/fs/xfs/xfs_file.c +++ b/fs/xfs/xfs_file.c @@ -624,9 +624,13 @@ xfs_file_dax_write( trace_xfs_file_dax_write(ip, count, pos); ret = dax_iomap_rw(iocb, from, &xfs_direct_write_iomap_ops); - if (ret > 0 && iocb->ki_pos > i_size_read(inode)) { - i_size_write(inode, iocb->ki_pos); - error = xfs_setfilesize(ip, pos, ret); + if (ret > 0) { + if (iocb->ki_pos > i_size_read(inode)) { + i_size_write(inode, iocb->ki_pos); + error = xfs_setfilesize(ip, pos, ret); + } + if (xfs_is_cow_inode(ip)) + xfs_reflink_end_cow(ip, pos, ret); } out: xfs_iunlock(ip, iolock); diff --git a/fs/xfs/xfs_iomap.c b/fs/xfs/xfs_iomap.c index 7b9ff824e82d..d6d4cc0f084e 100644 --- a/fs/xfs/xfs_iomap.c +++ b/fs/xfs/xfs_iomap.c @@ -765,13 +765,14 @@ xfs_direct_write_iomap_begin( goto out_unlock; if (imap_needs_cow(ip, flags, &imap, nimaps)) { + bool need_convert = flags & IOMAP_DIRECT || IS_DAX(inode); error = -EAGAIN; if (flags & IOMAP_NOWAIT) goto out_unlock; /* may drop and re-acquire the ilock */ error = xfs_reflink_allocate_cow(ip, &imap, &cmap, &shared, - &lockmode, flags & IOMAP_DIRECT); + &lockmode, need_convert); if (error) goto out_unlock; if (shared) diff --git a/fs/xfs/xfs_iops.c b/fs/xfs/xfs_iops.c index 1414ab79eacf..480bda0f16d0 100644 --- a/fs/xfs/xfs_iops.c +++ b/fs/xfs/xfs_iops.c @@ -860,6 +860,7 @@ xfs_setattr_size( int error; uint lock_flags = 0; bool did_zeroing = false; + const struct iomap_ops *ops; ASSERT(xfs_isilocked(ip, XFS_IOLOCK_EXCL)); ASSERT(xfs_isilocked(ip, XFS_MMAPLOCK_EXCL)); @@ -906,10 +907,15 @@ xfs_setattr_size( * extension, or zeroing out the rest of the block on a downward * truncate. */ + if (IS_DAX(inode)) + ops = &xfs_direct_write_iomap_ops; + else + ops = &xfs_buffered_write_iomap_ops; + if (newsize > oldsize) { trace_xfs_zero_eof(ip, oldsize, newsize - oldsize); error = iomap_zero_range(inode, oldsize, newsize - oldsize, - &did_zeroing, &xfs_buffered_write_iomap_ops); + &did_zeroing, ops); } else { /* * iomap won't detect a dirty page over an unwritten block (or a @@ -921,8 +927,7 @@ xfs_setattr_size( newsize); if (error) return error; - error = iomap_truncate_page(inode, newsize, &did_zeroing, - &xfs_buffered_write_iomap_ops); + error = iomap_truncate_page(inode, newsize, &did_zeroing, ops); } if (error) diff --git a/fs/xfs/xfs_reflink.c b/fs/xfs/xfs_reflink.c index 26708c01fd78..38bde16cdb39 100644 --- a/fs/xfs/xfs_reflink.c +++ b/fs/xfs/xfs_reflink.c @@ -1246,6 +1246,8 @@ xfs_reflink_zero_posteof( trace_xfs_zero_eof(ip, isize, pos - isize); return iomap_zero_range(VFS_I(ip), isize, pos - isize, NULL, + IS_DAX(VFS_I(ip)) ? + &xfs_direct_write_iomap_ops : &xfs_buffered_write_iomap_ops); } From patchwork Sun Feb 7 17:09:24 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Ruan Shiyang X-Patchwork-Id: 12073177 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-14.0 required=3.0 tests=BAYES_00, HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,UNWANTED_LANGUAGE_BODY, USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 055A7C433DB for ; Sun, 7 Feb 2021 17:09:50 +0000 (UTC) Received: from ml01.01.org (ml01.01.org [198.145.21.10]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id A7AC964DB2 for ; Sun, 7 Feb 2021 17:09:49 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org A7AC964DB2 Authentication-Results: mail.kernel.org; dmarc=none (p=none dis=none) header.from=cn.fujitsu.com Authentication-Results: mail.kernel.org; spf=none smtp.mailfrom=linux-nvdimm-bounces@lists.01.org Received: from ml01.vlan13.01.org (localhost [IPv6:::1]) by ml01.01.org (Postfix) with ESMTP id 95D79100EB82A; Sun, 7 Feb 2021 09:09:49 -0800 (PST) Received-SPF: None (mailfrom) identity=mailfrom; client-ip=183.91.158.132; helo=heian.cn.fujitsu.com; envelope-from=ruansy.fnst@cn.fujitsu.com; receiver= Received: from heian.cn.fujitsu.com (mail.cn.fujitsu.com [183.91.158.132]) by ml01.01.org (Postfix) with ESMTP id B5E38100EB829 for ; Sun, 7 Feb 2021 09:09:46 -0800 (PST) X-IronPort-AV: E=Sophos;i="5.81,160,1610380800"; d="scan'208";a="104299378" Received: from unknown (HELO cn.fujitsu.com) ([10.167.33.5]) by heian.cn.fujitsu.com with ESMTP; 08 Feb 2021 01:09:44 +0800 Received: from G08CNEXMBPEKD05.g08.fujitsu.local (unknown [10.167.33.204]) by cn.fujitsu.com (Postfix) with ESMTP id A63F74CE6F74; Mon, 8 Feb 2021 01:09:38 +0800 (CST) Received: from G08CNEXCHPEKD04.g08.fujitsu.local (10.167.33.200) by G08CNEXMBPEKD05.g08.fujitsu.local (10.167.33.204) with Microsoft SMTP Server (TLS) id 15.0.1497.2; Mon, 8 Feb 2021 01:09:38 +0800 Received: from irides.mr.mr.mr (10.167.225.141) by G08CNEXCHPEKD04.g08.fujitsu.local (10.167.33.209) with Microsoft SMTP Server id 15.0.1497.2 via Frontend Transport; Mon, 8 Feb 2021 01:09:37 +0800 From: Shiyang Ruan To: , , , Subject: [PATCH 7/7] fs/xfs: Add dedupe support for fsdax Date: Mon, 8 Feb 2021 01:09:24 +0800 Message-ID: <20210207170924.2933035-8-ruansy.fnst@cn.fujitsu.com> X-Mailer: git-send-email 2.30.0 In-Reply-To: <20210207170924.2933035-1-ruansy.fnst@cn.fujitsu.com> References: <20210207170924.2933035-1-ruansy.fnst@cn.fujitsu.com> MIME-Version: 1.0 X-yoursite-MailScanner-ID: A63F74CE6F74.AE72D X-yoursite-MailScanner: Found to be clean X-yoursite-MailScanner-From: ruansy.fnst@cn.fujitsu.com Message-ID-Hash: IB3B5HACIFG6RORNHNOXQU4V666WSJXZ X-Message-ID-Hash: IB3B5HACIFG6RORNHNOXQU4V666WSJXZ X-MailFrom: ruansy.fnst@cn.fujitsu.com X-Mailman-Rule-Misses: dmarc-mitigation; no-senders; approved; emergency; loop; banned-address; member-moderation; nonmember-moderation; administrivia; implicit-dest; max-recipients; max-size; news-moderation; no-subject; suspicious-header CC: darrick.wong@oracle.com, willy@infradead.org, jack@suse.cz, viro@zeniv.linux.org.uk, linux-btrfs@vger.kernel.org, ocfs2-devel@oss.oracle.com, david@fromorbit.com, hch@lst.de, rgoldwyn@suse.de X-Mailman-Version: 3.1.1 Precedence: list List-Id: "Linux-nvdimm developer list." Archived-At: List-Archive: List-Help: List-Post: List-Subscribe: List-Unsubscribe: Add xfs_break_two_dax_layouts() to break layout for tow dax files. Then call compare range function only when files are both DAX or not. Signed-off-by: Shiyang Ruan --- fs/xfs/xfs_file.c | 20 ++++++++++++++++++++ fs/xfs/xfs_inode.c | 8 +++++++- fs/xfs/xfs_inode.h | 1 + fs/xfs/xfs_reflink.c | 21 ++++++++++++++++++--- 4 files changed, 46 insertions(+), 4 deletions(-) diff --git a/fs/xfs/xfs_file.c b/fs/xfs/xfs_file.c index ab738641a3f4..64ded96b43c8 100644 --- a/fs/xfs/xfs_file.c +++ b/fs/xfs/xfs_file.c @@ -791,6 +791,26 @@ xfs_break_dax_layouts( 0, 0, xfs_wait_dax_page(inode)); } +int +xfs_break_two_dax_layouts( + struct inode *src, + struct inode *dest) +{ + int error; + bool retry = false; + +retry: + error = xfs_break_dax_layouts(src, &retry); + if (error || retry) + goto retry; + + error = xfs_break_dax_layouts(dest, &retry); + if (error || retry) + goto retry; + + return error; +} + int xfs_break_layouts( struct inode *inode, diff --git a/fs/xfs/xfs_inode.c b/fs/xfs/xfs_inode.c index 2bfbcf28b1bd..6483dbaa4d57 100644 --- a/fs/xfs/xfs_inode.c +++ b/fs/xfs/xfs_inode.c @@ -3786,8 +3786,10 @@ xfs_ilock2_io_mmap( struct xfs_inode *ip2) { int ret; + struct inode *inode1 = VFS_I(ip1); + struct inode *inode2 = VFS_I(ip2); - ret = xfs_iolock_two_inodes_and_break_layout(VFS_I(ip1), VFS_I(ip2)); + ret = xfs_iolock_two_inodes_and_break_layout(inode1, inode2); if (ret) return ret; if (ip1 == ip2) @@ -3795,6 +3797,10 @@ xfs_ilock2_io_mmap( else xfs_lock_two_inodes(ip1, XFS_MMAPLOCK_EXCL, ip2, XFS_MMAPLOCK_EXCL); + + if (IS_DAX(inode1) && IS_DAX(inode2)) + ret = xfs_break_two_dax_layouts(inode1, inode2); + return 0; } diff --git a/fs/xfs/xfs_inode.h b/fs/xfs/xfs_inode.h index 751a3d1d7d84..462b61bea691 100644 --- a/fs/xfs/xfs_inode.h +++ b/fs/xfs/xfs_inode.h @@ -431,6 +431,7 @@ enum xfs_prealloc_flags { int xfs_update_prealloc_flags(struct xfs_inode *ip, enum xfs_prealloc_flags flags); +int xfs_break_two_dax_layouts(struct inode *inode1, struct inode *inode2); int xfs_break_layouts(struct inode *inode, uint *iolock, enum layout_break_reason reason); diff --git a/fs/xfs/xfs_reflink.c b/fs/xfs/xfs_reflink.c index 38bde16cdb39..bdb9a07e703f 100644 --- a/fs/xfs/xfs_reflink.c +++ b/fs/xfs/xfs_reflink.c @@ -29,6 +29,7 @@ #include "xfs_iomap.h" #include "xfs_sb.h" #include "xfs_ag_resv.h" +#include /* * Copy on Write of Shared Blocks @@ -1251,6 +1252,14 @@ xfs_reflink_zero_posteof( &xfs_buffered_write_iomap_ops); } +int xfs_reflink_dedupe_file_range_compare(struct inode *src, loff_t srcoff, + struct inode *dest, loff_t destoff, + loff_t len, bool *is_same) +{ + return dax_file_range_compare(src, srcoff, dest, destoff, len, is_same, + &xfs_read_iomap_ops); +} + /* * Prepare two files for range cloning. Upon a successful return both inodes * will have the iolock and mmaplock held, the page cache of the out file will @@ -1293,6 +1302,7 @@ xfs_reflink_remap_prep( struct xfs_inode *src = XFS_I(inode_in); struct inode *inode_out = file_inode(file_out); struct xfs_inode *dest = XFS_I(inode_out); + compare_range_t compare_range_fn; int ret; /* Lock both files against IO */ @@ -1306,12 +1316,17 @@ xfs_reflink_remap_prep( if (XFS_IS_REALTIME_INODE(src) || XFS_IS_REALTIME_INODE(dest)) goto out_unlock; - /* Don't share DAX file data for now. */ - if (IS_DAX(inode_in) || IS_DAX(inode_out)) + /* Don't share DAX file data with non-DAX file. */ + if (IS_DAX(inode_in) != IS_DAX(inode_out)) goto out_unlock; + if (IS_DAX(inode_in)) + compare_range_fn = xfs_reflink_dedupe_file_range_compare; + else + compare_range_fn = vfs_dedupe_file_range_compare; + ret = generic_remap_file_range_prep(file_in, pos_in, file_out, pos_out, - len, remap_flags, vfs_dedupe_file_range_compare); + len, remap_flags, compare_range_fn); if (ret || *len == 0) goto out_unlock;