From patchwork Mon Apr 29 17:26:32 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Goldwyn Rodrigues X-Patchwork-Id: 10922201 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 26E1E14B6 for ; Mon, 29 Apr 2019 17:27:15 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 0D2C42884A for ; Mon, 29 Apr 2019 17:27:15 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id 017A228877; Mon, 29 Apr 2019 17:27:14 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-7.9 required=2.0 tests=BAYES_00,MAILING_LIST_MULTI, RCVD_IN_DNSWL_HI autolearn=unavailable version=3.3.1 Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 62F97288C6 for ; Mon, 29 Apr 2019 17:27:12 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1728861AbfD2R1L (ORCPT ); Mon, 29 Apr 2019 13:27:11 -0400 Received: from mx2.suse.de ([195.135.220.15]:58004 "EHLO mx1.suse.de" rhost-flags-OK-OK-OK-FAIL) by vger.kernel.org with ESMTP id S1728798AbfD2R1K (ORCPT ); Mon, 29 Apr 2019 13:27:10 -0400 X-Virus-Scanned: by amavisd-new at test-mx.suse.de Received: from relay2.suse.de (unknown [195.135.220.254]) by mx1.suse.de (Postfix) with ESMTP id 7D9EEACFB; Mon, 29 Apr 2019 17:27:08 +0000 (UTC) From: Goldwyn Rodrigues To: linux-btrfs@vger.kernel.org Cc: kilobyte@angband.pl, linux-fsdevel@vger.kernel.org, jack@suse.cz, david@fromorbit.com, willy@infradead.org, hch@lst.de, darrick.wong@oracle.com, dsterba@suse.cz, nborisov@suse.com, linux-nvdimm@lists.01.org, Goldwyn Rodrigues Subject: [PATCH 01/18] btrfs: create a mount option for dax 
Date: Mon, 29 Apr 2019 12:26:32 -0500 Message-Id: <20190429172649.8288-2-rgoldwyn@suse.de> X-Mailer: git-send-email 2.16.4 In-Reply-To: <20190429172649.8288-1-rgoldwyn@suse.de> References: <20190429172649.8288-1-rgoldwyn@suse.de> Sender: linux-btrfs-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-btrfs@vger.kernel.org X-Virus-Scanned: ClamAV using ClamSMTP

From: Goldwyn Rodrigues

This sets S_DAX in inode->i_flags, which can be tested with IS_DAX().

The dax option is restricted to single-device mounts. DAX interacts with the device directly instead of going through bios, so none of the bio hooks we use for multi-device setups can run here. While regular reads and writes could be handled for RAID0/1, mmap() is still an issue.

Auto-set the free space tree, because dealing with the free space inode (specifically readpages) is a nightmare.

Auto-set nodatasum, because we don't get a callback for writing checksums after mmap()s.

Deny compression, because it does not work with direct I/O.

Store the dax_device in fs_info, which will be used in the iomap code.

I am aware of the push to directory-based flags for dax. Until that code is in the kernel, we will work with mount flags.
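For readers who want to see the option interactions in isolation, here is a minimal user-space sketch of the rules described above. It is not the kernel code: struct mount_opts, apply_dax_option() and the OPT_* flags are hypothetical stand-ins for btrfs_parse_options() and the BTRFS_MOUNT_* bits, and the multi-device check is reduced to a plain device count.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-ins for the BTRFS_MOUNT_* option bits. */
enum {
	OPT_DAX             = 1 << 0,
	OPT_NODATASUM       = 1 << 1,
	OPT_SPACE_CACHE     = 1 << 2,
	OPT_FREE_SPACE_TREE = 1 << 3,
	OPT_COMPRESS        = 1 << 4,
};

/* Hypothetical model of the mount state that the dax option touches. */
struct mount_opts {
	unsigned int flags;
	unsigned int num_devices;
};

/*
 * Model of the "dax" case: refuse multi-device filesystems, otherwise
 * set DAX and adjust the dependent options.
 * Returns 0 on success or a negative errno.
 */
static int apply_dax_option(struct mount_opts *opts)
{
	assert(opts != NULL);

	if (opts->num_devices > 1)
		return -EOPNOTSUPP;	/* bio hooks are unavailable with dax */

	opts->flags |= OPT_DAX;
	opts->flags |= OPT_NODATASUM;		/* no checksum callback after mmap() */
	opts->flags &= ~OPT_SPACE_CACHE;	/* avoid the free space inode */
	opts->flags |= OPT_FREE_SPACE_TREE;
	opts->flags &= ~OPT_COMPRESS;		/* compression is switched off */
	return 0;
}
```

The sketch leaves the option state untouched when it rejects a multi-device filesystem, which mirrors the early "goto out" in the patch.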
Signed-off-by: Goldwyn Rodrigues --- fs/btrfs/ctree.h | 2 ++ fs/btrfs/disk-io.c | 4 ++++ fs/btrfs/ioctl.c | 5 ++++- fs/btrfs/super.c | 30 ++++++++++++++++++++++++++++++ 4 files changed, 40 insertions(+), 1 deletion(-) diff --git a/fs/btrfs/ctree.h b/fs/btrfs/ctree.h index b3642367a595..8ca1c0d120f4 100644 --- a/fs/btrfs/ctree.h +++ b/fs/btrfs/ctree.h @@ -1067,6 +1067,7 @@ struct btrfs_fs_info { u32 metadata_ratio; void *bdev_holder; + struct dax_device *dax_dev; /* private scrub information */ struct mutex scrub_lock; @@ -1442,6 +1443,7 @@ static inline u32 BTRFS_MAX_XATTR_SIZE(const struct btrfs_fs_info *info) #define BTRFS_MOUNT_FREE_SPACE_TREE (1 << 26) #define BTRFS_MOUNT_NOLOGREPLAY (1 << 27) #define BTRFS_MOUNT_REF_VERIFY (1 << 28) +#define BTRFS_MOUNT_DAX (1 << 29) #define BTRFS_DEFAULT_COMMIT_INTERVAL (30) #define BTRFS_DEFAULT_MAX_INLINE (2048) diff --git a/fs/btrfs/disk-io.c b/fs/btrfs/disk-io.c index 6fe9197f6ee4..2bbb63b2fcff 100644 --- a/fs/btrfs/disk-io.c +++ b/fs/btrfs/disk-io.c @@ -16,6 +16,7 @@ #include #include #include +#include #include #include #include @@ -2805,6 +2806,8 @@ int open_ctree(struct super_block *sb, goto fail_alloc; } + fs_info->dax_dev = fs_dax_get_by_bdev(fs_devices->latest_bdev); + /* * We want to check superblock checksum, the type is stored inside. * Pass the whole disk block of size BTRFS_SUPER_INFO_SIZE (4k). 
@@ -4043,6 +4046,7 @@ void close_ctree(struct btrfs_fs_info *fs_info) #endif btrfs_close_devices(fs_info->fs_devices); + fs_put_dax(fs_info->dax_dev); btrfs_mapping_tree_free(&fs_info->mapping_tree); percpu_counter_destroy(&fs_info->dirty_metadata_bytes); diff --git a/fs/btrfs/ioctl.c b/fs/btrfs/ioctl.c index cd4e693406a0..0138119cd9a3 100644 --- a/fs/btrfs/ioctl.c +++ b/fs/btrfs/ioctl.c @@ -149,8 +149,11 @@ void btrfs_sync_inode_flags_to_i_flags(struct inode *inode) if (binode->flags & BTRFS_INODE_DIRSYNC) new_fl |= S_DIRSYNC; + if ((btrfs_test_opt(btrfs_sb(inode->i_sb), DAX)) && S_ISREG(inode->i_mode)) + new_fl |= S_DAX; + set_mask_bits(&inode->i_flags, - S_SYNC | S_APPEND | S_IMMUTABLE | S_NOATIME | S_DIRSYNC, + S_SYNC | S_APPEND | S_IMMUTABLE | S_NOATIME | S_DIRSYNC | S_DAX, new_fl); } diff --git a/fs/btrfs/super.c b/fs/btrfs/super.c index 120e4340792a..3b85e61e5182 100644 --- a/fs/btrfs/super.c +++ b/fs/btrfs/super.c @@ -326,6 +326,7 @@ enum { Opt_treelog, Opt_notreelog, Opt_usebackuproot, Opt_user_subvol_rm_allowed, + Opt_dax, /* Deprecated options */ Opt_alloc_start, @@ -393,6 +394,7 @@ static const match_table_t tokens = { {Opt_notreelog, "notreelog"}, {Opt_usebackuproot, "usebackuproot"}, {Opt_user_subvol_rm_allowed, "user_subvol_rm_allowed"}, + {Opt_dax, "dax"}, /* Deprecated options */ {Opt_alloc_start, "alloc_start=%s"}, @@ -745,6 +747,32 @@ int btrfs_parse_options(struct btrfs_fs_info *info, char *options, case Opt_user_subvol_rm_allowed: btrfs_set_opt(info->mount_opt, USER_SUBVOL_RM_ALLOWED); break; + case Opt_dax: +#ifdef CONFIG_FS_DAX + if (btrfs_super_num_devices(info->super_copy) > 1) { + btrfs_info(info, + "dax not supported for multi-device btrfs partition\n"); + ret = -EOPNOTSUPP; + goto out; + } + btrfs_set_opt(info->mount_opt, DAX); + btrfs_warn(info, "DAX enabled. 
Warning: EXPERIMENTAL, use at your own risk\n"); + btrfs_set_and_info(info, NODATASUM, + "auto-setting nodatasum (dax)"); + btrfs_clear_opt(info->mount_opt, SPACE_CACHE); + btrfs_set_and_info(info, FREE_SPACE_TREE, + "auto-setting free space tree (dax)"); + if (btrfs_test_opt(info, COMPRESS)) { + btrfs_info(info, "disabling compress (dax)"); + btrfs_clear_opt(info->mount_opt, COMPRESS); + } + break; +#else + btrfs_err(info, + "DAX option not supported\n"); + ret = -EINVAL; + goto out; +#endif case Opt_enospc_debug: btrfs_set_opt(info->mount_opt, ENOSPC_DEBUG); break; @@ -1335,6 +1363,8 @@ static int btrfs_show_options(struct seq_file *seq, struct dentry *dentry) seq_puts(seq, ",clear_cache"); if (btrfs_test_opt(info, USER_SUBVOL_RM_ALLOWED)) seq_puts(seq, ",user_subvol_rm_allowed"); + if (btrfs_test_opt(info, DAX)) + seq_puts(seq, ",dax"); if (btrfs_test_opt(info, ENOSPC_DEBUG)) seq_puts(seq, ",enospc_debug"); if (btrfs_test_opt(info, AUTO_DEFRAG)) From patchwork Mon Apr 29 17:26:33 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Goldwyn Rodrigues X-Patchwork-Id: 10922207 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id C90A81515 for ; Mon, 29 Apr 2019 17:27:16 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id AEE072883B for ; Mon, 29 Apr 2019 17:27:16 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id A2A3F2884A; Mon, 29 Apr 2019 17:27:16 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-7.9 required=2.0 tests=BAYES_00,MAILING_LIST_MULTI, RCVD_IN_DNSWL_HI autolearn=unavailable version=3.3.1 Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by 
mail.wl.linuxfoundation.org (Postfix) with ESMTP id 4E2BA2883B for ; Mon, 29 Apr 2019 17:27:16 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1728907AbfD2R1P (ORCPT ); Mon, 29 Apr 2019 13:27:15 -0400 Received: from mx2.suse.de ([195.135.220.15]:58038 "EHLO mx1.suse.de" rhost-flags-OK-OK-OK-FAIL) by vger.kernel.org with ESMTP id S1728748AbfD2R1L (ORCPT ); Mon, 29 Apr 2019 13:27:11 -0400 X-Virus-Scanned: by amavisd-new at test-mx.suse.de Received: from relay2.suse.de (unknown [195.135.220.254]) by mx1.suse.de (Postfix) with ESMTP id D9A26AD5D; Mon, 29 Apr 2019 17:27:09 +0000 (UTC) From: Goldwyn Rodrigues To: linux-btrfs@vger.kernel.org Cc: kilobyte@angband.pl, linux-fsdevel@vger.kernel.org, jack@suse.cz, david@fromorbit.com, willy@infradead.org, hch@lst.de, darrick.wong@oracle.com, dsterba@suse.cz, nborisov@suse.com, linux-nvdimm@lists.01.org, Goldwyn Rodrigues Subject: [PATCH 02/18] btrfs: Carve out btrfs_get_extent_map_write() out of btrfs_get_blocks_write() Date: Mon, 29 Apr 2019 12:26:33 -0500 Message-Id: <20190429172649.8288-3-rgoldwyn@suse.de> X-Mailer: git-send-email 2.16.4 In-Reply-To: <20190429172649.8288-1-rgoldwyn@suse.de> References: <20190429172649.8288-1-rgoldwyn@suse.de> Sender: linux-btrfs-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-btrfs@vger.kernel.org X-Virus-Scanned: ClamAV using ClamSMTP From: Goldwyn Rodrigues This makes btrfs_get_extent_map_write() independent of Direct I/O code. 
Signed-off-by: Goldwyn Rodrigues --- fs/btrfs/ctree.h | 2 ++ fs/btrfs/inode.c | 40 +++++++++++++++++++++++++++------------- 2 files changed, 29 insertions(+), 13 deletions(-) diff --git a/fs/btrfs/ctree.h b/fs/btrfs/ctree.h index 8ca1c0d120f4..9512f49262dd 100644 --- a/fs/btrfs/ctree.h +++ b/fs/btrfs/ctree.h @@ -3277,6 +3277,8 @@ struct inode *btrfs_iget_path(struct super_block *s, struct btrfs_key *location, struct btrfs_path *path); struct inode *btrfs_iget(struct super_block *s, struct btrfs_key *location, struct btrfs_root *root, int *was_new); +int btrfs_get_extent_map_write(struct extent_map **map, struct buffer_head *bh, + struct inode *inode, u64 start, u64 len); struct extent_map *btrfs_get_extent(struct btrfs_inode *inode, struct page *page, size_t pg_offset, u64 start, u64 end, int create); diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c index 82fdda8ff5ab..68b8a4935ba6 100644 --- a/fs/btrfs/inode.c +++ b/fs/btrfs/inode.c @@ -7496,11 +7496,10 @@ static int btrfs_get_blocks_direct_read(struct extent_map *em, return 0; } -static int btrfs_get_blocks_direct_write(struct extent_map **map, - struct buffer_head *bh_result, - struct inode *inode, - struct btrfs_dio_data *dio_data, - u64 start, u64 len) +int btrfs_get_extent_map_write(struct extent_map **map, + struct buffer_head *bh, + struct inode *inode, + u64 start, u64 len) { struct btrfs_fs_info *fs_info = btrfs_sb(inode->i_sb); struct extent_map *em = *map; @@ -7554,22 +7553,38 @@ static int btrfs_get_blocks_direct_write(struct extent_map **map, */ btrfs_free_reserved_data_space_noquota(inode, start, len); - goto skip_cow; + /* skip COW */ + goto out; } } /* this will cow the extent */ - len = bh_result->b_size; + if (bh) + len = bh->b_size; free_extent_map(em); *map = em = btrfs_new_extent_direct(inode, start, len); - if (IS_ERR(em)) { - ret = PTR_ERR(em); - goto out; - } + if (IS_ERR(em)) + return PTR_ERR(em); +out: + return ret; +} +static int btrfs_get_blocks_direct_write(struct extent_map **map, + 
struct buffer_head *bh_result, + struct inode *inode, + struct btrfs_dio_data *dio_data, + u64 start, u64 len) +{ + int ret; + struct extent_map *em; + + ret = btrfs_get_extent_map_write(map, bh_result, inode, + start, len); + if (ret < 0) + return ret; + em = *map; len = min(len, em->len - (start - em->start)); -skip_cow: bh_result->b_blocknr = (em->block_start + (start - em->start)) >> inode->i_blkbits; bh_result->b_size = len; @@ -7590,7 +7605,6 @@ static int btrfs_get_blocks_direct_write(struct extent_map **map, dio_data->reserve -= len; dio_data->unsubmitted_oe_range_end = start + len; current->journal_info = dio_data; -out: return ret; } From patchwork Mon Apr 29 17:26:34 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Goldwyn Rodrigues X-Patchwork-Id: 10922269 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id A154F14B6 for ; Mon, 29 Apr 2019 17:27:38 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 88265288C6 for ; Mon, 29 Apr 2019 17:27:38 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id 7C9DB288BC; Mon, 29 Apr 2019 17:27:38 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-7.9 required=2.0 tests=BAYES_00,MAILING_LIST_MULTI, RCVD_IN_DNSWL_HI autolearn=unavailable version=3.3.1 Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 1D4B3288C6 for ; Mon, 29 Apr 2019 17:27:38 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1728893AbfD2R1P (ORCPT ); Mon, 29 Apr 2019 13:27:15 -0400 Received: from mx2.suse.de ([195.135.220.15]:58054 "EHLO mx1.suse.de" 
rhost-flags-OK-OK-OK-FAIL) by vger.kernel.org with ESMTP id S1728798AbfD2R1O (ORCPT ); Mon, 29 Apr 2019 13:27:14 -0400 X-Virus-Scanned: by amavisd-new at test-mx.suse.de Received: from relay2.suse.de (unknown [195.135.220.254]) by mx1.suse.de (Postfix) with ESMTP id E5F2DAD78; Mon, 29 Apr 2019 17:27:11 +0000 (UTC) From: Goldwyn Rodrigues To: linux-btrfs@vger.kernel.org Cc: kilobyte@angband.pl, linux-fsdevel@vger.kernel.org, jack@suse.cz, david@fromorbit.com, willy@infradead.org, hch@lst.de, darrick.wong@oracle.com, dsterba@suse.cz, nborisov@suse.com, linux-nvdimm@lists.01.org, Goldwyn Rodrigues Subject: [PATCH 03/18] btrfs: basic dax read Date: Mon, 29 Apr 2019 12:26:34 -0500 Message-Id: <20190429172649.8288-4-rgoldwyn@suse.de> X-Mailer: git-send-email 2.16.4 In-Reply-To: <20190429172649.8288-1-rgoldwyn@suse.de> References: <20190429172649.8288-1-rgoldwyn@suse.de> Sender: linux-btrfs-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-btrfs@vger.kernel.org X-Virus-Scanned: ClamAV using ClamSMTP From: Goldwyn Rodrigues Perform a basic read using iomap support. The btrfs_iomap_begin() finds the extent at the position and fills the iomap data structure with the values. 
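The lookup-and-fill step can be pictured with a small user-space model. This is only a sketch under simplifying assumptions: struct extent, struct mapping, lookup_extent() and fill_mapping() are hypothetical names that loosely mirror struct extent_map, struct iomap and btrfs_iomap_begin(), and the extent tree is replaced by a linear array. Unlike the patch, the sketch also fills in offset and length for holes.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

#define EXTENT_HOLE UINT64_MAX	/* hypothetical stand-in for EXTENT_MAP_HOLE */

enum map_kind { MAP_HOLE, MAP_MAPPED };

struct extent {			/* loosely models struct extent_map */
	uint64_t start;		/* file offset where the extent begins */
	uint64_t len;		/* extent length in bytes */
	uint64_t block_start;	/* device byte address, or EXTENT_HOLE */
};

struct mapping {		/* loosely models struct iomap */
	enum map_kind type;
	uint64_t offset;
	uint64_t length;
	uint64_t addr;
};

/* Return the extent containing pos, or NULL if no extent covers it. */
static const struct extent *lookup_extent(const struct extent *map, size_t n,
					  uint64_t pos)
{
	for (size_t i = 0; i < n; i++) {
		if (pos >= map[i].start && pos - map[i].start < map[i].len)
			return &map[i];
	}
	return NULL;
}

/* Describe the extent at pos in *out. Returns 0 or a negative errno. */
static int fill_mapping(const struct extent *map, size_t n, uint64_t pos,
			struct mapping *out)
{
	const struct extent *em;

	assert(out != NULL);
	em = lookup_extent(map, n, pos);
	if (!em)
		return -ENOENT;

	out->offset = em->start;
	out->length = em->len;
	if (em->block_start == EXTENT_HOLE) {
		out->type = MAP_HOLE;
		out->addr = 0;
		return 0;
	}
	out->type = MAP_MAPPED;
	out->addr = em->block_start;
	return 0;
}
```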
Signed-off-by: Goldwyn Rodrigues --- fs/btrfs/Makefile | 1 + fs/btrfs/ctree.h | 5 +++++ fs/btrfs/dax.c | 49 +++++++++++++++++++++++++++++++++++++++++++++++++ fs/btrfs/file.c | 11 ++++++++++- 4 files changed, 65 insertions(+), 1 deletion(-) create mode 100644 fs/btrfs/dax.c diff --git a/fs/btrfs/Makefile b/fs/btrfs/Makefile index ca693dd554e9..1fa77b875ae9 100644 --- a/fs/btrfs/Makefile +++ b/fs/btrfs/Makefile @@ -12,6 +12,7 @@ btrfs-y += super.o ctree.o extent-tree.o print-tree.o root-tree.o dir-item.o \ reada.o backref.o ulist.o qgroup.o send.o dev-replace.o raid56.o \ uuid-tree.o props.o free-space-tree.o tree-checker.o +btrfs-$(CONFIG_FS_DAX) += dax.o btrfs-$(CONFIG_BTRFS_FS_POSIX_ACL) += acl.o btrfs-$(CONFIG_BTRFS_FS_CHECK_INTEGRITY) += check-integrity.o btrfs-$(CONFIG_BTRFS_FS_REF_VERIFY) += ref-verify.o diff --git a/fs/btrfs/ctree.h b/fs/btrfs/ctree.h index 9512f49262dd..b7bbe5130a3b 100644 --- a/fs/btrfs/ctree.h +++ b/fs/btrfs/ctree.h @@ -3795,6 +3795,11 @@ int btrfs_reada_wait(void *handle); void btrfs_reada_detach(void *handle); int btree_readahead_hook(struct extent_buffer *eb, int err); +#ifdef CONFIG_FS_DAX +/* dax.c */ +ssize_t btrfs_file_dax_read(struct kiocb *iocb, struct iov_iter *to); +#endif /* CONFIG_FS_DAX */ + static inline int is_fstree(u64 rootid) { if (rootid == BTRFS_FS_TREE_OBJECTID || diff --git a/fs/btrfs/dax.c b/fs/btrfs/dax.c new file mode 100644 index 000000000000..bf3d46b0acb6 --- /dev/null +++ b/fs/btrfs/dax.c @@ -0,0 +1,49 @@ +// SPDX-License-Identifier: GPL-2.0 +/* + * DAX support for BTRFS + * + * Copyright (c) 2019 SUSE Linux + * Author: Goldwyn Rodrigues + */ + +#ifdef CONFIG_FS_DAX +#include +#include +#include "ctree.h" +#include "btrfs_inode.h" + +static int btrfs_iomap_begin(struct inode *inode, loff_t pos, + loff_t length, unsigned flags, struct iomap *iomap) +{ + struct extent_map *em; + struct btrfs_fs_info *fs_info = btrfs_sb(inode->i_sb); + em = btrfs_get_extent(BTRFS_I(inode), NULL, 0, pos, length, 0); + if 
(em->block_start == EXTENT_MAP_HOLE) { + iomap->type = IOMAP_HOLE; + return 0; + } + iomap->type = IOMAP_MAPPED; + iomap->bdev = em->bdev; + iomap->dax_dev = fs_info->dax_dev; + iomap->offset = em->start; + iomap->length = em->len; + iomap->addr = em->block_start; + return 0; +} + +static const struct iomap_ops btrfs_iomap_ops = { + .iomap_begin = btrfs_iomap_begin, +}; + +ssize_t btrfs_file_dax_read(struct kiocb *iocb, struct iov_iter *to) +{ + ssize_t ret; + struct inode *inode = file_inode(iocb->ki_filp); + + inode_lock_shared(inode); + ret = dax_iomap_rw(iocb, to, &btrfs_iomap_ops); + inode_unlock_shared(inode); + + return ret; +} +#endif /* CONFIG_FS_DAX */ diff --git a/fs/btrfs/file.c b/fs/btrfs/file.c index 34fe8a58b0e9..9194591f9eea 100644 --- a/fs/btrfs/file.c +++ b/fs/btrfs/file.c @@ -3288,9 +3288,18 @@ static int btrfs_file_open(struct inode *inode, struct file *filp) return generic_file_open(inode, filp); } +static ssize_t btrfs_file_read_iter(struct kiocb *iocb, struct iov_iter *to) +{ +#ifdef CONFIG_FS_DAX + if (IS_DAX(file_inode(iocb->ki_filp))) + return btrfs_file_dax_read(iocb, to); +#endif + return generic_file_read_iter(iocb, to); +} + const struct file_operations btrfs_file_operations = { .llseek = btrfs_file_llseek, - .read_iter = generic_file_read_iter, + .read_iter = btrfs_file_read_iter, .splice_read = generic_file_splice_read, .write_iter = btrfs_file_write_iter, .mmap = btrfs_file_mmap, From patchwork Mon Apr 29 17:26:35 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Goldwyn Rodrigues X-Patchwork-Id: 10922213 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id E5CA114B6 for ; Mon, 29 Apr 2019 17:27:18 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id CC0352883B for ; 
Mon, 29 Apr 2019 17:27:18 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id C063428877; Mon, 29 Apr 2019 17:27:18 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-7.9 required=2.0 tests=BAYES_00,MAILING_LIST_MULTI, RCVD_IN_DNSWL_HI autolearn=unavailable version=3.3.1 Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 65E502883B for ; Mon, 29 Apr 2019 17:27:18 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1728914AbfD2R1R (ORCPT ); Mon, 29 Apr 2019 13:27:17 -0400 Received: from mx2.suse.de ([195.135.220.15]:58092 "EHLO mx1.suse.de" rhost-flags-OK-OK-OK-FAIL) by vger.kernel.org with ESMTP id S1728873AbfD2R1Q (ORCPT ); Mon, 29 Apr 2019 13:27:16 -0400 X-Virus-Scanned: by amavisd-new at test-mx.suse.de Received: from relay2.suse.de (unknown [195.135.220.254]) by mx1.suse.de (Postfix) with ESMTP id EA113ADAF; Mon, 29 Apr 2019 17:27:13 +0000 (UTC) From: Goldwyn Rodrigues To: linux-btrfs@vger.kernel.org Cc: kilobyte@angband.pl, linux-fsdevel@vger.kernel.org, jack@suse.cz, david@fromorbit.com, willy@infradead.org, hch@lst.de, darrick.wong@oracle.com, dsterba@suse.cz, nborisov@suse.com, linux-nvdimm@lists.01.org, Goldwyn Rodrigues Subject: [PATCH 04/18] dax: Introduce IOMAP_DAX_COW to CoW edges during writes Date: Mon, 29 Apr 2019 12:26:35 -0500 Message-Id: <20190429172649.8288-5-rgoldwyn@suse.de> X-Mailer: git-send-email 2.16.4 In-Reply-To: <20190429172649.8288-1-rgoldwyn@suse.de> References: <20190429172649.8288-1-rgoldwyn@suse.de> Sender: linux-btrfs-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-btrfs@vger.kernel.org X-Virus-Scanned: ClamAV using ClamSMTP

From: Goldwyn Rodrigues

IOMAP_DAX_COW is an iomap type which copies the edges of the data while performing a write if the start or end of the write is not page aligned. The source address is expected in iomap->inline_data.

dax_copy_edges() is a helper function that copies data from one part of the device to another for the parts of the range that are not page aligned. If iomap->inline_data is NULL, it zeroes the area with memset() instead.

Signed-off-by: Goldwyn Rodrigues
--- fs/dax.c | 46 +++++++++++++++++++++++++++++++++++++++++++++- include/linux/iomap.h | 1 + 2 files changed, 46 insertions(+), 1 deletion(-) diff --git a/fs/dax.c b/fs/dax.c index e5e54da1715f..610bfa861a28 100644 --- a/fs/dax.c +++ b/fs/dax.c @@ -1084,6 +1084,42 @@ int __dax_zero_page_range(struct block_device *bdev, } EXPORT_SYMBOL_GPL(__dax_zero_page_range); +/* + * dax_copy_edges - Copies the part of the pages not included in + * the write, but required for CoW because + * offset/offset+length are not page aligned. + */ +static int dax_copy_edges(struct inode *inode, loff_t pos, loff_t length, + struct iomap *iomap, void *daddr) +{ + unsigned offset = pos & (PAGE_SIZE - 1); + loff_t end = pos + length; + loff_t pg_end = round_up(end, PAGE_SIZE); + void *saddr = iomap->inline_data; + int ret = 0; + /* + * Copy the first part of the page + * Note: we pass offset as length + */ + if (offset) { + if (saddr) + ret = memcpy_mcsafe(daddr, saddr, offset); + else + memset(daddr, 0, offset); + } + + /* Copy the last part of the range */ + if (end < pg_end) { + if (saddr) + ret = memcpy_mcsafe(daddr + offset + length, + saddr + offset + length, pg_end - end); + else + memset(daddr + offset + length, 0, + pg_end - end); + } + return ret; +} + static loff_t dax_iomap_actor(struct inode *inode, loff_t pos, loff_t length, void *data, struct iomap *iomap) @@ -1105,9 +1141,11 @@ dax_iomap_actor(struct inode *inode, loff_t pos, loff_t length, void *data, return iov_iter_zero(min(length, end - pos), iter); } - if (WARN_ON_ONCE(iomap->type != IOMAP_MAPPED)) + if (WARN_ON_ONCE(iomap->type != IOMAP_MAPPED + && iomap->type != IOMAP_DAX_COW)) return -EIO; + /* * Write can allocate block for an area which has a hole page 
mapped * into page tables. We have to tear down these mappings so that data @@ -1144,6 +1182,12 @@ dax_iomap_actor(struct inode *inode, loff_t pos, loff_t length, void *data, break; } + if (iomap->type == IOMAP_DAX_COW) { + ret = dax_copy_edges(inode, pos, length, iomap, kaddr); + if (ret) + break; + } + map_len = PFN_PHYS(map_len); kaddr += offset; map_len -= offset; diff --git a/include/linux/iomap.h b/include/linux/iomap.h index 0fefb5455bda..6e885c5a38a3 100644 --- a/include/linux/iomap.h +++ b/include/linux/iomap.h @@ -25,6 +25,7 @@ struct vm_fault; #define IOMAP_MAPPED 0x03 /* blocks allocated at @addr */ #define IOMAP_UNWRITTEN 0x04 /* blocks allocated at @addr in unwritten state */ #define IOMAP_INLINE 0x05 /* data inline in the inode */ +#define IOMAP_DAX_COW 0x06 /* Copy data pointed by inline_data before write*/ /* * Flags for all iomap mappings: From patchwork Mon Apr 29 17:26:36 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Goldwyn Rodrigues X-Patchwork-Id: 10922221 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 9340B14B6 for ; Mon, 29 Apr 2019 17:27:20 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 7A5562883B for ; Mon, 29 Apr 2019 17:27:20 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id 6EF6C2884A; Mon, 29 Apr 2019 17:27:20 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-7.9 required=2.0 tests=BAYES_00,MAILING_LIST_MULTI, RCVD_IN_DNSWL_HI autolearn=unavailable version=3.3.1 Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 218F72883B for ; Mon, 29 Apr 2019 17:27:20 +0000 
(UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1728926AbfD2R1S (ORCPT ); Mon, 29 Apr 2019 13:27:18 -0400 Received: from mx2.suse.de ([195.135.220.15]:58122 "EHLO mx1.suse.de" rhost-flags-OK-OK-OK-FAIL) by vger.kernel.org with ESMTP id S1728748AbfD2R1R (ORCPT ); Mon, 29 Apr 2019 13:27:17 -0400 X-Virus-Scanned: by amavisd-new at test-mx.suse.de Received: from relay2.suse.de (unknown [195.135.220.254]) by mx1.suse.de (Postfix) with ESMTP id 0828EAD0A; Mon, 29 Apr 2019 17:27:16 +0000 (UTC) From: Goldwyn Rodrigues To: linux-btrfs@vger.kernel.org Cc: kilobyte@angband.pl, linux-fsdevel@vger.kernel.org, jack@suse.cz, david@fromorbit.com, willy@infradead.org, hch@lst.de, darrick.wong@oracle.com, dsterba@suse.cz, nborisov@suse.com, linux-nvdimm@lists.01.org, Goldwyn Rodrigues Subject: [PATCH 05/18] btrfs: return whether extent is nocow or not Date: Mon, 29 Apr 2019 12:26:36 -0500 Message-Id: <20190429172649.8288-6-rgoldwyn@suse.de> X-Mailer: git-send-email 2.16.4 In-Reply-To: <20190429172649.8288-1-rgoldwyn@suse.de> References: <20190429172649.8288-1-rgoldwyn@suse.de> Sender: linux-btrfs-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-btrfs@vger.kernel.org X-Virus-Scanned: ClamAV using ClamSMTP

From: Goldwyn Rodrigues

Later patches need to know whether the extent is nocow so that they can return type IOMAP_DAX_COW in the iomap structure when it is not.
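The optional out-parameter can be illustrated with a short user-space sketch. Everything here is hypothetical: get_extent_for_write() and its can_overwrite argument stand in for btrfs_get_extent_map_write() and its in-place-write checks, and the way pick_map_type() turns the result into a mapping type is an assumption about how later patches in the series might use the flag, not something this patch defines.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Values mirror IOMAP_MAPPED and IOMAP_DAX_COW from this series. */
enum map_type { MAP_MAPPED = 0x03, MAP_DAX_COW = 0x06 };

/*
 * Hypothetical model of the write-mapping helper. can_overwrite stands in
 * for the checks that decide whether the range can be written in place.
 * nocow may be NULL when the caller does not need the answer.
 */
static int get_extent_for_write(bool can_overwrite, bool *nocow)
{
	if (nocow)
		*nocow = false;

	if (can_overwrite) {
		if (nocow)
			*nocow = true;
		return 0;	/* skip CoW, reuse the existing extent */
	}

	/* A new extent would be allocated here (CoW). */
	return 0;
}

/* Assumed use in a later patch: derive the mapping type from nocow. */
static enum map_type pick_map_type(bool can_overwrite)
{
	bool nocow;
	int ret = get_extent_for_write(can_overwrite, &nocow);

	assert(ret == 0);
	return nocow ? MAP_MAPPED : MAP_DAX_COW;
}
```

Accepting NULL keeps existing callers unchanged, as the direct I/O path in the diff does by passing NULL.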
Signed-off-by: Goldwyn Rodrigues --- fs/btrfs/ctree.h | 2 +- fs/btrfs/inode.c | 9 +++++++-- 2 files changed, 8 insertions(+), 3 deletions(-) diff --git a/fs/btrfs/ctree.h b/fs/btrfs/ctree.h index b7bbe5130a3b..050f30165531 100644 --- a/fs/btrfs/ctree.h +++ b/fs/btrfs/ctree.h @@ -3278,7 +3278,7 @@ struct inode *btrfs_iget_path(struct super_block *s, struct btrfs_key *location, struct inode *btrfs_iget(struct super_block *s, struct btrfs_key *location, struct btrfs_root *root, int *was_new); int btrfs_get_extent_map_write(struct extent_map **map, struct buffer_head *bh, - struct inode *inode, u64 start, u64 len); + struct inode *inode, u64 start, u64 len, bool *nocow); struct extent_map *btrfs_get_extent(struct btrfs_inode *inode, struct page *page, size_t pg_offset, u64 start, u64 end, int create); diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c index 68b8a4935ba6..8e33c38511bb 100644 --- a/fs/btrfs/inode.c +++ b/fs/btrfs/inode.c @@ -7499,12 +7499,15 @@ static int btrfs_get_blocks_direct_read(struct extent_map *em, int btrfs_get_extent_map_write(struct extent_map **map, struct buffer_head *bh, struct inode *inode, - u64 start, u64 len) + u64 start, u64 len, bool *nocow) { struct btrfs_fs_info *fs_info = btrfs_sb(inode->i_sb); struct extent_map *em = *map; int ret = 0; + if (nocow) + *nocow = false; + /* * We don't allocate a new extent in the following cases * @@ -7553,6 +7556,8 @@ int btrfs_get_extent_map_write(struct extent_map **map, */ btrfs_free_reserved_data_space_noquota(inode, start, len); + if (nocow) + *nocow = true; /* skip COW */ goto out; } @@ -7579,7 +7584,7 @@ static int btrfs_get_blocks_direct_write(struct extent_map **map, struct extent_map *em; ret = btrfs_get_extent_map_write(map, bh_result, inode, - start, len); + start, len, NULL); if (ret < 0) return ret; em = *map; From patchwork Mon Apr 29 17:26:37 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Goldwyn Rodrigues 
X-Patchwork-Id: 10922229 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 5964B14B6 for ; Mon, 29 Apr 2019 17:27:24 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 407ED2883B for ; Mon, 29 Apr 2019 17:27:24 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id 34A022884A; Mon, 29 Apr 2019 17:27:24 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-7.9 required=2.0 tests=BAYES_00,MAILING_LIST_MULTI, RCVD_IN_DNSWL_HI autolearn=unavailable version=3.3.1 Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id CBAEF2883B for ; Mon, 29 Apr 2019 17:27:23 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1728942AbfD2R1V (ORCPT ); Mon, 29 Apr 2019 13:27:21 -0400 Received: from mx2.suse.de ([195.135.220.15]:58146 "EHLO mx1.suse.de" rhost-flags-OK-OK-OK-FAIL) by vger.kernel.org with ESMTP id S1728928AbfD2R1U (ORCPT ); Mon, 29 Apr 2019 13:27:20 -0400 X-Virus-Scanned: by amavisd-new at test-mx.suse.de Received: from relay2.suse.de (unknown [195.135.220.254]) by mx1.suse.de (Postfix) with ESMTP id 11C81ACFB; Mon, 29 Apr 2019 17:27:18 +0000 (UTC) From: Goldwyn Rodrigues To: linux-btrfs@vger.kernel.org Cc: kilobyte@angband.pl, linux-fsdevel@vger.kernel.org, jack@suse.cz, david@fromorbit.com, willy@infradead.org, hch@lst.de, darrick.wong@oracle.com, dsterba@suse.cz, nborisov@suse.com, linux-nvdimm@lists.01.org, Goldwyn Rodrigues Subject: [PATCH 06/18] btrfs: Rename __endio_write_update_ordered() to btrfs_update_ordered_extent() Date: Mon, 29 Apr 2019 12:26:37 -0500 Message-Id: <20190429172649.8288-7-rgoldwyn@suse.de> X-Mailer: git-send-email 2.16.4 
In-Reply-To: <20190429172649.8288-1-rgoldwyn@suse.de> References: <20190429172649.8288-1-rgoldwyn@suse.de> Sender: linux-btrfs-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-btrfs@vger.kernel.org X-Virus-Scanned: ClamAV using ClamSMTP

From: Goldwyn Rodrigues

Since we will be using this function in another part of the code, give it a more descriptive name and make it non-static.

Signed-off-by: Goldwyn Rodrigues
--- fs/btrfs/ctree.h | 7 +++++-- fs/btrfs/inode.c | 14 +++++--------- 2 files changed, 10 insertions(+), 11 deletions(-) diff --git a/fs/btrfs/ctree.h b/fs/btrfs/ctree.h index 050f30165531..1e3e758b83c2 100644 --- a/fs/btrfs/ctree.h +++ b/fs/btrfs/ctree.h @@ -3280,8 +3280,11 @@ struct inode *btrfs_iget(struct super_block *s, struct btrfs_key *location, int btrfs_get_extent_map_write(struct extent_map **map, struct buffer_head *bh, struct inode *inode, u64 start, u64 len, bool *nocow); struct extent_map *btrfs_get_extent(struct btrfs_inode *inode, - struct page *page, size_t pg_offset, - u64 start, u64 end, int create); + struct page *page, size_t pg_offset, + u64 start, u64 end, int create); +void btrfs_update_ordered_extent(struct inode *inode, + const u64 offset, const u64 bytes, + const bool uptodate); int btrfs_update_inode(struct btrfs_trans_handle *trans, struct btrfs_root *root, struct inode *inode); diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c index 8e33c38511bb..af4b56cba104 100644 --- a/fs/btrfs/inode.c +++ b/fs/btrfs/inode.c @@ -98,10 +98,6 @@ static struct extent_map *create_io_em(struct inode *inode, u64 start, u64 len, u64 ram_bytes, int compress_type, int type); -static void __endio_write_update_ordered(struct inode *inode, - const u64 offset, const u64 bytes, - const bool uptodate); - /* * Cleanup all submitted ordered extents in specified range to handle errors * from the btrfs_run_delalloc_range() callback. 
@@ -142,7 +138,7 @@ static inline void btrfs_cleanup_ordered_extents(struct inode *inode, bytes -= PAGE_SIZE; } - return __endio_write_update_ordered(inode, offset, bytes, false); + return btrfs_update_ordered_extent(inode, offset, bytes, false); } static int btrfs_dirty_inode(struct inode *inode); @@ -8085,7 +8081,7 @@ static void btrfs_endio_direct_read(struct bio *bio) bio_put(bio); } -static void __endio_write_update_ordered(struct inode *inode, +void btrfs_update_ordered_extent(struct inode *inode, const u64 offset, const u64 bytes, const bool uptodate) { @@ -8138,7 +8134,7 @@ static void btrfs_endio_direct_write(struct bio *bio) struct btrfs_dio_private *dip = bio->bi_private; struct bio *dio_bio = dip->dio_bio; - __endio_write_update_ordered(dip->inode, dip->logical_offset, + btrfs_update_ordered_extent(dip->inode, dip->logical_offset, dip->bytes, !bio->bi_status); kfree(dip); @@ -8457,7 +8453,7 @@ static void btrfs_submit_direct(struct bio *dio_bio, struct inode *inode, bio = NULL; } else { if (write) - __endio_write_update_ordered(inode, + btrfs_update_ordered_extent(inode, file_offset, dio_bio->bi_iter.bi_size, false); @@ -8597,7 +8593,7 @@ static ssize_t btrfs_direct_IO(struct kiocb *iocb, struct iov_iter *iter) */ if (dio_data.unsubmitted_oe_range_start < dio_data.unsubmitted_oe_range_end) - __endio_write_update_ordered(inode, + btrfs_update_ordered_extent(inode, dio_data.unsubmitted_oe_range_start, dio_data.unsubmitted_oe_range_end - dio_data.unsubmitted_oe_range_start, From patchwork Mon Apr 29 17:26:38 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Goldwyn Rodrigues X-Patchwork-Id: 10922235 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id D528214B6 for ; Mon, 29 Apr 2019 17:27:25 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) 
by mail.wl.linuxfoundation.org (Postfix) with ESMTP id B9C422883B for ; Mon, 29 Apr 2019 17:27:25 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id AE04328849; Mon, 29 Apr 2019 17:27:25 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-7.9 required=2.0 tests=BAYES_00,MAILING_LIST_MULTI, RCVD_IN_DNSWL_HI autolearn=unavailable version=3.3.1 Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 2514928877 for ; Mon, 29 Apr 2019 17:27:25 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1728953AbfD2R1X (ORCPT ); Mon, 29 Apr 2019 13:27:23 -0400 Received: from mx2.suse.de ([195.135.220.15]:58172 "EHLO mx1.suse.de" rhost-flags-OK-OK-OK-FAIL) by vger.kernel.org with ESMTP id S1728929AbfD2R1W (ORCPT ); Mon, 29 Apr 2019 13:27:22 -0400 X-Virus-Scanned: by amavisd-new at test-mx.suse.de Received: from relay2.suse.de (unknown [195.135.220.254]) by mx1.suse.de (Postfix) with ESMTP id 1AAACAD5D; Mon, 29 Apr 2019 17:27:20 +0000 (UTC) From: Goldwyn Rodrigues To: linux-btrfs@vger.kernel.org Cc: kilobyte@angband.pl, linux-fsdevel@vger.kernel.org, jack@suse.cz, david@fromorbit.com, willy@infradead.org, hch@lst.de, darrick.wong@oracle.com, dsterba@suse.cz, nborisov@suse.com, linux-nvdimm@lists.01.org, Goldwyn Rodrigues Subject: [PATCH 07/18] btrfs: add dax write support Date: Mon, 29 Apr 2019 12:26:38 -0500 Message-Id: <20190429172649.8288-8-rgoldwyn@suse.de> X-Mailer: git-send-email 2.16.4 In-Reply-To: <20190429172649.8288-1-rgoldwyn@suse.de> References: <20190429172649.8288-1-rgoldwyn@suse.de> Sender: linux-btrfs-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-btrfs@vger.kernel.org X-Virus-Scanned: ClamAV using ClamSMTP From: Goldwyn Rodrigues IOMAP_DAX_COW tells the dax code that it must first copy the parts of the range that are not 
page-aligned before performing the write. Checking whether the data edges are page-aligned is the responsibility of ->iomap_begin(), and the source address is stored in ->inline_data. A new struct btrfs_iomap is passed from iomap_begin() to iomap_end(); it contains all the accounting and locking information for CoW-based writes. For writing to a hole, iomap->inline_data is set to zero (NULL). Signed-off-by: Goldwyn Rodrigues --- fs/btrfs/ctree.h | 6 ++ fs/btrfs/dax.c | 182 +++++++++++++++++++++++++++++++++++++++++++++++++++++-- fs/btrfs/file.c | 4 +- 3 files changed, 185 insertions(+), 7 deletions(-) diff --git a/fs/btrfs/ctree.h b/fs/btrfs/ctree.h index 1e3e758b83c2..eec01eb92f33 100644 --- a/fs/btrfs/ctree.h +++ b/fs/btrfs/ctree.h @@ -3801,6 +3801,12 @@ int btree_readahead_hook(struct extent_buffer *eb, int err); #ifdef CONFIG_FS_DAX /* dax.c */ ssize_t btrfs_file_dax_read(struct kiocb *iocb, struct iov_iter *to); +ssize_t btrfs_file_dax_write(struct kiocb *iocb, struct iov_iter *from); +#else +static inline ssize_t btrfs_file_dax_write(struct kiocb *iocb, struct iov_iter *from) +{ + return 0; +} #endif /* CONFIG_FS_DAX */ static inline int is_fstree(u64 rootid) diff --git a/fs/btrfs/dax.c b/fs/btrfs/dax.c index bf3d46b0acb6..f5cc9bcdbf14 100644 --- a/fs/btrfs/dax.c +++ b/fs/btrfs/dax.c @@ -9,30 +9,184 @@ #ifdef CONFIG_FS_DAX #include #include +#include #include "ctree.h" #include "btrfs_inode.h" +struct btrfs_iomap { + u64 start; + u64 end; + bool nocow; + struct extent_changeset *data_reserved; + struct extent_state *cached_state; +}; + +static struct btrfs_iomap *btrfs_iomap_init(struct inode *inode, + struct extent_map **em, + loff_t pos, loff_t length) +{ + int ret = 0; + struct extent_map *map = *em; + struct btrfs_iomap *bi; + + bi = kzalloc(sizeof(struct btrfs_iomap), GFP_NOFS); + if (!bi) + return ERR_PTR(-ENOMEM); + + bi->start = round_down(pos, PAGE_SIZE); + bi->end = PAGE_ALIGN(pos + length); + + /* Wait for existing ordered extents in range to finish 
*/ + btrfs_wait_ordered_range(inode, bi->start, bi->end - bi->start); + + lock_extent_bits(&BTRFS_I(inode)->io_tree, bi->start, bi->end, &bi->cached_state); + + ret = btrfs_delalloc_reserve_space(inode, &bi->data_reserved, + bi->start, bi->end - bi->start); + if (ret) { + unlock_extent_cached(&BTRFS_I(inode)->io_tree, bi->start, bi->end, + &bi->cached_state); + kfree(bi); + return ERR_PTR(ret); + } + + refcount_inc(&map->refs); + ret = btrfs_get_extent_map_write(em, NULL, + inode, bi->start, bi->end - bi->start, &bi->nocow); + if (ret) { + unlock_extent_cached(&BTRFS_I(inode)->io_tree, bi->start, bi->end, + &bi->cached_state); + btrfs_delalloc_release_space(inode, + bi->data_reserved, bi->start, + bi->end - bi->start, true); + extent_changeset_free(bi->data_reserved); + kfree(bi); + return ERR_PTR(ret); + } + free_extent_map(map); + return bi; +} + +static void *dax_address(struct block_device *bdev, struct dax_device *dax_dev, + sector_t sector, loff_t pos, loff_t length) +{ + size_t size = ALIGN(pos + length, PAGE_SIZE); + int id, ret = 0; + void *kaddr = NULL; + pgoff_t pgoff; + long map_len; + + id = dax_read_lock(); + + ret = bdev_dax_pgoff(bdev, sector, size, &pgoff); + if (ret) + goto out; + + map_len = dax_direct_access(dax_dev, pgoff, PHYS_PFN(size), + &kaddr, NULL); + if (map_len < 0) + ret = map_len; + +out: + dax_read_unlock(id); + if (ret) + return ERR_PTR(ret); + return kaddr; +} + static int btrfs_iomap_begin(struct inode *inode, loff_t pos, loff_t length, unsigned flags, struct iomap *iomap) { struct extent_map *em; struct btrfs_fs_info *fs_info = btrfs_sb(inode->i_sb); + struct btrfs_iomap *bi = NULL; + unsigned offset = pos & (PAGE_SIZE - 1); + u64 srcblk = 0; + loff_t diff; + em = btrfs_get_extent(BTRFS_I(inode), NULL, 0, pos, length, 0); + + iomap->type = IOMAP_MAPPED; + + if (flags & IOMAP_WRITE) { + if (em->block_start != EXTENT_MAP_HOLE) + srcblk = em->block_start + pos - em->start - offset; + + bi = btrfs_iomap_init(inode, &em, pos, length); 
+ if (IS_ERR(bi)) + return PTR_ERR(bi); + + } + + /* + * Advance the difference between pos and start, to align well with + * inline_data in case of writes + */ + diff = round_down(pos - em->start, PAGE_SIZE); + iomap->offset = em->start + diff; + iomap->length = em->len - diff; + iomap->bdev = em->bdev; + iomap->dax_dev = fs_info->dax_dev; + + /* + * This will be true for reads only since we have already + * allocated em + */ if (em->block_start == EXTENT_MAP_HOLE) { iomap->type = IOMAP_HOLE; return 0; } - iomap->type = IOMAP_MAPPED; - iomap->bdev = em->bdev; - iomap->dax_dev = fs_info->dax_dev; - iomap->offset = em->start; - iomap->length = em->len; - iomap->addr = em->block_start; + + iomap->addr = em->block_start + diff; + /* Check if we really need to copy data from old extent */ + if (bi && !bi->nocow && (offset || pos + length < bi->end)) { + iomap->type = IOMAP_DAX_COW; + if (srcblk) { + sector_t sector = (srcblk + (pos & PAGE_MASK) - + iomap->offset) >> 9; + iomap->inline_data = dax_address(em->bdev, + fs_info->dax_dev, sector, pos, length); + if (IS_ERR(iomap->inline_data)) { + kfree(bi); + return PTR_ERR(iomap->inline_data); + } + } + } + + iomap->private = bi; + return 0; +} + +static int btrfs_iomap_end(struct inode *inode, loff_t pos, + loff_t length, ssize_t written, unsigned flags, + struct iomap *iomap) +{ + struct btrfs_iomap *bi = iomap->private; + u64 wend; + + if (!bi) + return 0; + + unlock_extent_cached(&BTRFS_I(inode)->io_tree, bi->start, bi->end, + &bi->cached_state); + + wend = PAGE_ALIGN(pos + written); + if (wend < bi->end) { + btrfs_delalloc_release_space(inode, + bi->data_reserved, wend, + bi->end - wend, true); + } + + btrfs_update_ordered_extent(inode, bi->start, wend - bi->start, true); + btrfs_delalloc_release_extents(BTRFS_I(inode), wend - bi->start, false); + extent_changeset_free(bi->data_reserved); + kfree(bi); return 0; } static const struct iomap_ops btrfs_iomap_ops = { .iomap_begin = btrfs_iomap_begin, + .iomap_end = 
btrfs_iomap_end, }; ssize_t btrfs_file_dax_read(struct kiocb *iocb, struct iov_iter *to) @@ -46,4 +200,20 @@ ssize_t btrfs_file_dax_read(struct kiocb *iocb, struct iov_iter *to) return ret; } + +ssize_t btrfs_file_dax_write(struct kiocb *iocb, struct iov_iter *iter) +{ + ssize_t ret = 0; + u64 pos = iocb->ki_pos; + struct inode *inode = file_inode(iocb->ki_filp); + ret = dax_iomap_rw(iocb, iter, &btrfs_iomap_ops); + + if (ret > 0) { + pos += ret; + if (pos > i_size_read(inode)) + i_size_write(inode, pos); + iocb->ki_pos = pos; + } + return ret; +} #endif /* CONFIG_FS_DAX */ diff --git a/fs/btrfs/file.c b/fs/btrfs/file.c index 9194591f9eea..a795023e26ca 100644 --- a/fs/btrfs/file.c +++ b/fs/btrfs/file.c @@ -1964,7 +1964,9 @@ static ssize_t btrfs_file_write_iter(struct kiocb *iocb, if (sync) atomic_inc(&BTRFS_I(inode)->sync_writers); - if (iocb->ki_flags & IOCB_DIRECT) { + if (IS_DAX(inode)) { + num_written = btrfs_file_dax_write(iocb, from); + } else if (iocb->ki_flags & IOCB_DIRECT) { num_written = __btrfs_direct_write(iocb, from); } else { num_written = btrfs_buffered_write(iocb, from); From patchwork Mon Apr 29 17:26:39 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Goldwyn Rodrigues X-Patchwork-Id: 10922259 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 07B3414B6 for ; Mon, 29 Apr 2019 17:27:35 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id E154E28849 for ; Mon, 29 Apr 2019 17:27:34 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id D5322288BC; Mon, 29 Apr 2019 17:27:34 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-7.9 required=2.0 
tests=BAYES_00,MAILING_LIST_MULTI, RCVD_IN_DNSWL_HI autolearn=unavailable version=3.3.1 Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 82B0028849 for ; Mon, 29 Apr 2019 17:27:34 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1728964AbfD2R11 (ORCPT ); Mon, 29 Apr 2019 13:27:27 -0400 Received: from mx2.suse.de ([195.135.220.15]:58202 "EHLO mx1.suse.de" rhost-flags-OK-OK-OK-FAIL) by vger.kernel.org with ESMTP id S1728928AbfD2R1X (ORCPT ); Mon, 29 Apr 2019 13:27:23 -0400 X-Virus-Scanned: by amavisd-new at test-mx.suse.de Received: from relay2.suse.de (unknown [195.135.220.254]) by mx1.suse.de (Postfix) with ESMTP id 2A3BAADD7; Mon, 29 Apr 2019 17:27:22 +0000 (UTC) From: Goldwyn Rodrigues To: linux-btrfs@vger.kernel.org Cc: kilobyte@angband.pl, linux-fsdevel@vger.kernel.org, jack@suse.cz, david@fromorbit.com, willy@infradead.org, hch@lst.de, darrick.wong@oracle.com, dsterba@suse.cz, nborisov@suse.com, linux-nvdimm@lists.01.org, Goldwyn Rodrigues Subject: [PATCH 08/18] dax: memcpy page in case of IOMAP_DAX_COW for mmap faults Date: Mon, 29 Apr 2019 12:26:39 -0500 Message-Id: <20190429172649.8288-9-rgoldwyn@suse.de> X-Mailer: git-send-email 2.16.4 In-Reply-To: <20190429172649.8288-1-rgoldwyn@suse.de> References: <20190429172649.8288-1-rgoldwyn@suse.de> Sender: linux-btrfs-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-btrfs@vger.kernel.org X-Virus-Scanned: ClamAV using ClamSMTP From: Goldwyn Rodrigues Change dax_iomap_pfn() to also return the address, so that it can be used to perform a memcpy when the iomap type is IOMAP_DAX_COW. We don't handle PMD because btrfs does not support hugepages. Question: The sequence of bdev_dax_pgoff() and dax_direct_access() is used multiple times to calculate addresses and pfns. Would it make sense to call it while calculating the address as well, to reduce code? 
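For illustration only, the IOMAP_DAX_COW handling described above can be modelled in userspace roughly as follows. This is a sketch, not kernel code: cow_prepare_page() is a hypothetical name, dst stands in for the address returned through dax_iomap_pfn(), and src stands in for iomap.inline_data. The patch itself uses memcpy_mcsafe(), which can fail; the plain memcpy() here cannot report errors.

```c
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical userspace model of the IOMAP_DAX_COW branch in the PTE
 * fault path: fill the newly allocated page from the old extent when a
 * source address exists, otherwise (write into a hole) zero it.
 */
static void cow_prepare_page(void *dst, const void *src, size_t page_size)
{
	if (src)
		memcpy(dst, src, page_size);	/* copy the old extent's data */
	else
		memset(dst, 0, page_size);	/* no source: start from zeroes */
}
```

The model covers a single PTE-sized page only, matching the note above that PMD faults are not handled.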
Signed-off-by: Goldwyn Rodrigues --- fs/dax.c | 19 +++++++++++++++---- 1 file changed, 15 insertions(+), 4 deletions(-) diff --git a/fs/dax.c b/fs/dax.c index 610bfa861a28..718b1632a39d 100644 --- a/fs/dax.c +++ b/fs/dax.c @@ -984,7 +984,7 @@ static sector_t dax_iomap_sector(struct iomap *iomap, loff_t pos) } static int dax_iomap_pfn(struct iomap *iomap, loff_t pos, size_t size, - pfn_t *pfnp) + pfn_t *pfnp, void **addr) { const sector_t sector = dax_iomap_sector(iomap, pos); pgoff_t pgoff; @@ -996,7 +996,7 @@ static int dax_iomap_pfn(struct iomap *iomap, loff_t pos, size_t size, return rc; id = dax_read_lock(); length = dax_direct_access(iomap->dax_dev, pgoff, PHYS_PFN(size), - NULL, pfnp); + addr, pfnp); if (length < 0) { rc = length; goto out; @@ -1286,6 +1286,7 @@ static vm_fault_t dax_iomap_pte_fault(struct vm_fault *vmf, pfn_t *pfnp, XA_STATE(xas, &mapping->i_pages, vmf->pgoff); struct inode *inode = mapping->host; unsigned long vaddr = vmf->address; + void *addr; loff_t pos = (loff_t)vmf->pgoff << PAGE_SHIFT; struct iomap iomap = { 0 }; unsigned flags = IOMAP_FAULT; @@ -1375,16 +1376,26 @@ static vm_fault_t dax_iomap_pte_fault(struct vm_fault *vmf, pfn_t *pfnp, sync = dax_fault_is_synchronous(flags, vma, &iomap); switch (iomap.type) { + case IOMAP_DAX_COW: case IOMAP_MAPPED: if (iomap.flags & IOMAP_F_NEW) { count_vm_event(PGMAJFAULT); count_memcg_event_mm(vma->vm_mm, PGMAJFAULT); major = VM_FAULT_MAJOR; } - error = dax_iomap_pfn(&iomap, pos, PAGE_SIZE, &pfn); + error = dax_iomap_pfn(&iomap, pos, PAGE_SIZE, &pfn, &addr); if (error < 0) goto error_finish_iomap; + if (iomap.type == IOMAP_DAX_COW) { + if (iomap.inline_data) { + error = memcpy_mcsafe(addr, iomap.inline_data, + PAGE_SIZE); + if (error < 0) + goto error_finish_iomap; + } else + memset(addr, 0, PAGE_SIZE); + } entry = dax_insert_entry(&xas, mapping, vmf, entry, pfn, 0, write && !sync); @@ -1597,7 +1608,7 @@ static vm_fault_t dax_iomap_pmd_fault(struct vm_fault *vmf, pfn_t *pfnp, switch (iomap.type) 
{ case IOMAP_MAPPED: - error = dax_iomap_pfn(&iomap, pos, PMD_SIZE, &pfn); + error = dax_iomap_pfn(&iomap, pos, PMD_SIZE, &pfn, NULL); if (error < 0) goto finish_iomap; From patchwork Mon Apr 29 17:26:40 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Goldwyn Rodrigues X-Patchwork-Id: 10922241 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 216F414D5 for ; Mon, 29 Apr 2019 17:27:29 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 06DC82883B for ; Mon, 29 Apr 2019 17:27:29 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id EF63A28877; Mon, 29 Apr 2019 17:27:28 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-7.9 required=2.0 tests=BAYES_00,MAILING_LIST_MULTI, RCVD_IN_DNSWL_HI autolearn=unavailable version=3.3.1 Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 847702883B for ; Mon, 29 Apr 2019 17:27:28 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1728970AbfD2R12 (ORCPT ); Mon, 29 Apr 2019 13:27:28 -0400 Received: from mx2.suse.de ([195.135.220.15]:58230 "EHLO mx1.suse.de" rhost-flags-OK-OK-OK-FAIL) by vger.kernel.org with ESMTP id S1728929AbfD2R10 (ORCPT ); Mon, 29 Apr 2019 13:27:26 -0400 X-Virus-Scanned: by amavisd-new at test-mx.suse.de Received: from relay2.suse.de (unknown [195.135.220.254]) by mx1.suse.de (Postfix) with ESMTP id 41383ADDB; Mon, 29 Apr 2019 17:27:24 +0000 (UTC) From: Goldwyn Rodrigues To: linux-btrfs@vger.kernel.org Cc: kilobyte@angband.pl, linux-fsdevel@vger.kernel.org, jack@suse.cz, david@fromorbit.com, willy@infradead.org, 
hch@lst.de, darrick.wong@oracle.com, dsterba@suse.cz, nborisov@suse.com, linux-nvdimm@lists.01.org, Goldwyn Rodrigues Subject: [PATCH 09/18] btrfs: Add dax specific address_space_operations Date: Mon, 29 Apr 2019 12:26:40 -0500 Message-Id: <20190429172649.8288-10-rgoldwyn@suse.de> X-Mailer: git-send-email 2.16.4 In-Reply-To: <20190429172649.8288-1-rgoldwyn@suse.de> References: <20190429172649.8288-1-rgoldwyn@suse.de> Sender: linux-btrfs-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-btrfs@vger.kernel.org X-Virus-Scanned: ClamAV using ClamSMTP From: Goldwyn Rodrigues Signed-off-by: Goldwyn Rodrigues --- fs/btrfs/inode.c | 34 +++++++++++++++++++++++++++++++--- 1 file changed, 31 insertions(+), 3 deletions(-) diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c index af4b56cba104..05714ffc4894 100644 --- a/fs/btrfs/inode.c +++ b/fs/btrfs/inode.c @@ -28,6 +28,7 @@ #include #include #include +#include #include #include "ctree.h" #include "disk-io.h" @@ -65,6 +66,7 @@ static const struct inode_operations btrfs_dir_ro_inode_operations; static const struct inode_operations btrfs_special_inode_operations; static const struct inode_operations btrfs_file_inode_operations; static const struct address_space_operations btrfs_aops; +static const struct address_space_operations btrfs_dax_aops; static const struct file_operations btrfs_dir_file_operations; static const struct extent_io_ops btrfs_extent_io_ops; @@ -3757,7 +3759,10 @@ static int btrfs_read_locked_inode(struct inode *inode, switch (inode->i_mode & S_IFMT) { case S_IFREG: - inode->i_mapping->a_ops = &btrfs_aops; + if (btrfs_test_opt(fs_info, DAX)) + inode->i_mapping->a_ops = &btrfs_dax_aops; + else + inode->i_mapping->a_ops = &btrfs_aops; BTRFS_I(inode)->io_tree.ops = &btrfs_extent_io_ops; inode->i_fop = &btrfs_file_operations; inode->i_op = &btrfs_file_inode_operations; @@ -3778,6 +3783,7 @@ static int btrfs_read_locked_inode(struct inode *inode, } btrfs_sync_inode_flags_to_i_flags(inode); + return 
0; } @@ -6538,7 +6544,10 @@ static int btrfs_create(struct inode *dir, struct dentry *dentry, */ inode->i_fop = &btrfs_file_operations; inode->i_op = &btrfs_file_inode_operations; - inode->i_mapping->a_ops = &btrfs_aops; + if (IS_DAX(inode) && S_ISREG(mode)) + inode->i_mapping->a_ops = &btrfs_dax_aops; + else + inode->i_mapping->a_ops = &btrfs_aops; err = btrfs_init_inode_security(trans, inode, dir, &dentry->d_name); if (err) @@ -8665,6 +8674,15 @@ static int btrfs_writepages(struct address_space *mapping, return extent_writepages(mapping, wbc); } +static int btrfs_dax_writepages(struct address_space *mapping, + struct writeback_control *wbc) +{ + struct inode *inode = mapping->host; + struct btrfs_fs_info *fs_info = btrfs_sb(inode->i_sb); + return dax_writeback_mapping_range(mapping, fs_info->fs_devices->latest_bdev, + wbc); +} + static int btrfs_readpages(struct file *file, struct address_space *mapping, struct list_head *pages, unsigned nr_pages) @@ -10436,7 +10454,10 @@ static int btrfs_tmpfile(struct inode *dir, struct dentry *dentry, umode_t mode) inode->i_fop = &btrfs_file_operations; inode->i_op = &btrfs_file_inode_operations; - inode->i_mapping->a_ops = &btrfs_aops; + if (IS_DAX(inode)) + inode->i_mapping->a_ops = &btrfs_dax_aops; + else + inode->i_mapping->a_ops = &btrfs_aops; BTRFS_I(inode)->io_tree.ops = &btrfs_extent_io_ops; ret = btrfs_init_inode_security(trans, inode, dir, NULL); @@ -10892,6 +10913,13 @@ static const struct address_space_operations btrfs_aops = { .swap_deactivate = btrfs_swap_deactivate, }; +static const struct address_space_operations btrfs_dax_aops = { + .writepages = btrfs_dax_writepages, + .direct_IO = noop_direct_IO, + .set_page_dirty = noop_set_page_dirty, + .invalidatepage = noop_invalidatepage, +}; + static const struct inode_operations btrfs_file_inode_operations = { .getattr = btrfs_getattr, .setattr = btrfs_setattr, From patchwork Mon Apr 29 17:26:41 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 
Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Goldwyn Rodrigues X-Patchwork-Id: 10922247 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 3CD0B1515 for ; Mon, 29 Apr 2019 17:27:31 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 21EB52883B for ; Mon, 29 Apr 2019 17:27:31 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id 164122884A; Mon, 29 Apr 2019 17:27:31 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-7.9 required=2.0 tests=BAYES_00,MAILING_LIST_MULTI, RCVD_IN_DNSWL_HI autolearn=unavailable version=3.3.1 Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 9225828849 for ; Mon, 29 Apr 2019 17:27:30 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1728975AbfD2R13 (ORCPT ); Mon, 29 Apr 2019 13:27:29 -0400 Received: from mx2.suse.de ([195.135.220.15]:58202 "EHLO mx1.suse.de" rhost-flags-OK-OK-OK-FAIL) by vger.kernel.org with ESMTP id S1728954AbfD2R12 (ORCPT ); Mon, 29 Apr 2019 13:27:28 -0400 X-Virus-Scanned: by amavisd-new at test-mx.suse.de Received: from relay2.suse.de (unknown [195.135.220.254]) by mx1.suse.de (Postfix) with ESMTP id 5988CAE17; Mon, 29 Apr 2019 17:27:26 +0000 (UTC) From: Goldwyn Rodrigues To: linux-btrfs@vger.kernel.org Cc: kilobyte@angband.pl, linux-fsdevel@vger.kernel.org, jack@suse.cz, david@fromorbit.com, willy@infradead.org, hch@lst.de, darrick.wong@oracle.com, dsterba@suse.cz, nborisov@suse.com, linux-nvdimm@lists.01.org, Goldwyn Rodrigues Subject: [PATCH 10/18] dax: replace mmap entry in case of CoW Date: Mon, 29 Apr 2019 12:26:41 -0500 Message-Id: <20190429172649.8288-11-rgoldwyn@suse.de> 
X-Mailer: git-send-email 2.16.4 In-Reply-To: <20190429172649.8288-1-rgoldwyn@suse.de> References: <20190429172649.8288-1-rgoldwyn@suse.de> Sender: linux-btrfs-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-btrfs@vger.kernel.org X-Virus-Scanned: ClamAV using ClamSMTP From: Goldwyn Rodrigues In case of CoW, replace the existing entry with the newly allocated one. Also mark the entry with PAGECACHE_TAG_TOWRITE so that writeback marks it as write-protected. This helps with snapshots: new write page faults after a snapshot then trigger a CoW. btrfs does not support hugepages, so we don't handle PMD. Signed-off-by: Goldwyn Rodrigues --- fs/dax.c | 36 ++++++++++++++++++++++++++++-------- 1 file changed, 28 insertions(+), 8 deletions(-) diff --git a/fs/dax.c b/fs/dax.c index 718b1632a39d..07e8ff20161d 100644 --- a/fs/dax.c +++ b/fs/dax.c @@ -700,6 +700,9 @@ static int copy_user_dax(struct block_device *bdev, struct dax_device *dax_dev, return 0; } +#define DAX_IF_DIRTY (1ULL << 0) +#define DAX_IF_COW (1ULL << 1) + /* * By this point grab_mapping_entry() has ensured that we have a locked entry * of the appropriate size so we don't have to worry about downgrading PMDs to @@ -709,14 +712,17 @@ static int copy_user_dax(struct block_device *bdev, struct dax_device *dax_dev, */ static void *dax_insert_entry(struct xa_state *xas, struct address_space *mapping, struct vm_fault *vmf, - void *entry, pfn_t pfn, unsigned long flags, bool dirty) + void *entry, pfn_t pfn, unsigned long flags, + unsigned long insert_flags) { void *new_entry = dax_make_entry(pfn, flags); + bool dirty = insert_flags & DAX_IF_DIRTY; + bool cow = insert_flags & DAX_IF_COW; if (dirty) __mark_inode_dirty(mapping->host, I_DIRTY_PAGES); - if (dax_is_zero_entry(entry) && !(flags & DAX_ZERO_PAGE)) { + if (cow || (dax_is_zero_entry(entry) && !(flags & DAX_ZERO_PAGE))) { unsigned long index = xas->xa_index; /* we are replacing a zero page with block mapping */ if (dax_is_pmd_entry(entry)) @@ 
-728,12 +734,12 @@ static void *dax_insert_entry(struct xa_state *xas, xas_reset(xas); xas_lock_irq(xas); - if (dax_entry_size(entry) != dax_entry_size(new_entry)) { + if (cow || (dax_entry_size(entry) != dax_entry_size(new_entry))) { dax_disassociate_entry(entry, mapping, false); dax_associate_entry(new_entry, mapping, vmf->vma, vmf->address); } - if (dax_is_zero_entry(entry) || dax_is_empty_entry(entry)) { + if (cow || dax_is_zero_entry(entry) || dax_is_empty_entry(entry)) { /* * Only swap our new entry into the page cache if the current * entry is a zero page or an empty entry. If a normal PTE or @@ -753,6 +759,9 @@ static void *dax_insert_entry(struct xa_state *xas, if (dirty) xas_set_mark(xas, PAGECACHE_TAG_DIRTY); + if (cow) + xas_set_mark(xas, PAGECACHE_TAG_TOWRITE); + xas_unlock_irq(xas); return entry; } @@ -1032,7 +1041,7 @@ static vm_fault_t dax_load_hole(struct xa_state *xas, vm_fault_t ret; *entry = dax_insert_entry(xas, mapping, vmf, *entry, pfn, - DAX_ZERO_PAGE, false); + DAX_ZERO_PAGE, 0); ret = vmf_insert_mixed(vmf->vma, vaddr, pfn); trace_dax_load_hole(inode, vmf, ret); @@ -1296,6 +1305,7 @@ static vm_fault_t dax_iomap_pte_fault(struct vm_fault *vmf, pfn_t *pfnp, vm_fault_t ret = 0; void *entry; pfn_t pfn; + unsigned long insert_flags = 0; trace_dax_pte_fault(inode, vmf, ret); /* @@ -1357,6 +1367,8 @@ static vm_fault_t dax_iomap_pte_fault(struct vm_fault *vmf, pfn_t *pfnp, error = copy_user_dax(iomap.bdev, iomap.dax_dev, sector, PAGE_SIZE, vmf->cow_page, vaddr); break; + case IOMAP_DAX_COW: + /* Should not be setting this - fallthrough */ default: WARN_ON_ONCE(1); error = -EIO; @@ -1377,6 +1389,8 @@ static vm_fault_t dax_iomap_pte_fault(struct vm_fault *vmf, pfn_t *pfnp, switch (iomap.type) { case IOMAP_DAX_COW: + insert_flags |= DAX_IF_COW; + /* fallthrough */ case IOMAP_MAPPED: if (iomap.flags & IOMAP_F_NEW) { count_vm_event(PGMAJFAULT); @@ -1396,8 +1410,10 @@ static vm_fault_t dax_iomap_pte_fault(struct vm_fault *vmf, pfn_t *pfnp, } else 
memset(addr, 0, PAGE_SIZE); } + if (write && !sync) + insert_flags |= DAX_IF_DIRTY; entry = dax_insert_entry(&xas, mapping, vmf, entry, pfn, - 0, write && !sync); + 0, insert_flags); /* * If we are doing synchronous page fault and inode needs fsync, @@ -1478,7 +1494,7 @@ static vm_fault_t dax_pmd_load_hole(struct xa_state *xas, struct vm_fault *vmf, pfn = page_to_pfn_t(zero_page); *entry = dax_insert_entry(xas, mapping, vmf, *entry, pfn, - DAX_PMD | DAX_ZERO_PAGE, false); + DAX_PMD | DAX_ZERO_PAGE, 0); if (arch_needs_pgtable_deposit()) { pgtable = pte_alloc_one(vma->vm_mm); @@ -1528,6 +1544,7 @@ static vm_fault_t dax_iomap_pmd_fault(struct vm_fault *vmf, pfn_t *pfnp, loff_t pos; int error; pfn_t pfn; + unsigned long insert_flags = 0; /* * Check whether offset isn't beyond end of file now. Caller is @@ -1612,8 +1629,11 @@ static vm_fault_t dax_iomap_pmd_fault(struct vm_fault *vmf, pfn_t *pfnp, if (error < 0) goto finish_iomap; + if (write && !sync) + insert_flags |= DAX_IF_DIRTY; + entry = dax_insert_entry(&xas, mapping, vmf, entry, pfn, - DAX_PMD, write && !sync); + DAX_PMD, insert_flags); /* * If we are doing synchronous page fault and inode needs fsync, From patchwork Mon Apr 29 17:26:42 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Goldwyn Rodrigues X-Patchwork-Id: 10922255 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 233B414B6 for ; Mon, 29 Apr 2019 17:27:34 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 09E4E28877 for ; Mon, 29 Apr 2019 17:27:34 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id EFB862896F; Mon, 29 Apr 2019 17:27:33 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org 
X-Spam-Level: X-Spam-Status: No, score=-7.9 required=2.0 tests=BAYES_00,MAILING_LIST_MULTI, RCVD_IN_DNSWL_HI autolearn=unavailable version=3.3.1 Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 9C075288CB for ; Mon, 29 Apr 2019 17:27:33 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1728982AbfD2R1c (ORCPT ); Mon, 29 Apr 2019 13:27:32 -0400 Received: from mx2.suse.de ([195.135.220.15]:58270 "EHLO mx1.suse.de" rhost-flags-OK-OK-OK-FAIL) by vger.kernel.org with ESMTP id S1728748AbfD2R13 (ORCPT ); Mon, 29 Apr 2019 13:27:29 -0400 X-Virus-Scanned: by amavisd-new at test-mx.suse.de Received: from relay2.suse.de (unknown [195.135.220.254]) by mx1.suse.de (Postfix) with ESMTP id 67F39ADE3; Mon, 29 Apr 2019 17:27:28 +0000 (UTC) From: Goldwyn Rodrigues To: linux-btrfs@vger.kernel.org Cc: kilobyte@angband.pl, linux-fsdevel@vger.kernel.org, jack@suse.cz, david@fromorbit.com, willy@infradead.org, hch@lst.de, darrick.wong@oracle.com, dsterba@suse.cz, nborisov@suse.com, linux-nvdimm@lists.01.org, Goldwyn Rodrigues Subject: [PATCH 11/18] btrfs: add dax mmap support Date: Mon, 29 Apr 2019 12:26:42 -0500 Message-Id: <20190429172649.8288-12-rgoldwyn@suse.de> X-Mailer: git-send-email 2.16.4 In-Reply-To: <20190429172649.8288-1-rgoldwyn@suse.de> References: <20190429172649.8288-1-rgoldwyn@suse.de> Sender: linux-btrfs-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-btrfs@vger.kernel.org X-Virus-Scanned: ClamAV using ClamSMTP From: Goldwyn Rodrigues Add a new vm_operations struct, btrfs_dax_vm_ops, specifically for dax files. Since we will be removing (nulling) readpages/writepages for dax, return -ENOEXEC only for non-dax files. 
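As a rough sketch of the mmap-time decision described above, the following userspace model reproduces only the branching. It is not the kernel code: pick_vm_ops() and enum vm_ops_kind are hypothetical names, is_dax stands in for IS_DAX(inode), and has_readpage stands in for the mapping->a_ops->readpage check made in btrfs_file_mmap().

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical stand-ins for btrfs_file_vm_ops and btrfs_dax_vm_ops. */
enum vm_ops_kind { VM_OPS_FILE, VM_OPS_DAX };

/*
 * Model of the selection: a non-dax file without a readpage operation
 * is rejected with -ENOEXEC; a dax file always gets the dax operations,
 * whether or not readpage is set.
 */
static int pick_vm_ops(bool is_dax, bool has_readpage, enum vm_ops_kind *out)
{
	if (!is_dax && !has_readpage)
		return -ENOEXEC;

	*out = is_dax ? VM_OPS_DAX : VM_OPS_FILE;
	return 0;
}
```

The point of the model is the first branch: dax files skip the readpage check entirely, which is why the address space operations for dax can leave those callbacks unset.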
Signed-off-by: Goldwyn Rodrigues --- fs/btrfs/ctree.h | 1 + fs/btrfs/dax.c | 13 ++++++++++++- fs/btrfs/file.c | 18 ++++++++++++++++-- 3 files changed, 29 insertions(+), 3 deletions(-) diff --git a/fs/btrfs/ctree.h b/fs/btrfs/ctree.h index eec01eb92f33..2b7bdabb44f8 100644 --- a/fs/btrfs/ctree.h +++ b/fs/btrfs/ctree.h @@ -3802,6 +3802,7 @@ int btree_readahead_hook(struct extent_buffer *eb, int err); /* dax.c */ ssize_t btrfs_file_dax_read(struct kiocb *iocb, struct iov_iter *to); ssize_t btrfs_file_dax_write(struct kiocb *iocb, struct iov_iter *from); +vm_fault_t btrfs_dax_fault(struct vm_fault *vmf); #else static inline ssize_t btrfs_file_dax_write(struct kiocb *iocb, struct iov_iter *from) { diff --git a/fs/btrfs/dax.c b/fs/btrfs/dax.c index f5cc9bcdbf14..de957d681e16 100644 --- a/fs/btrfs/dax.c +++ b/fs/btrfs/dax.c @@ -139,7 +139,7 @@ static int btrfs_iomap_begin(struct inode *inode, loff_t pos, iomap->addr = em->block_start + diff; /* Check if we really need to copy data from old extent */ - if (bi && !bi->nocow && (offset || pos + length < bi->end)) { + if (bi && !bi->nocow && (offset || pos + length < bi->end || flags & IOMAP_FAULT)) { iomap->type = IOMAP_DAX_COW; if (srcblk) { sector_t sector = (srcblk + (pos & PAGE_MASK) - @@ -216,4 +216,15 @@ ssize_t btrfs_file_dax_write(struct kiocb *iocb, struct iov_iter *iter) } return ret; } + +vm_fault_t btrfs_dax_fault(struct vm_fault *vmf) +{ + vm_fault_t ret; + pfn_t pfn; + ret = dax_iomap_fault(vmf, PE_SIZE_PTE, &pfn, NULL, &btrfs_iomap_ops); + if (ret & VM_FAULT_NEEDDSYNC) + ret = dax_finish_sync_fault(vmf, PE_SIZE_PTE, pfn); + + return ret; +} #endif /* CONFIG_FS_DAX */ diff --git a/fs/btrfs/file.c b/fs/btrfs/file.c index a795023e26ca..9d5a3c99a6b9 100644 --- a/fs/btrfs/file.c +++ b/fs/btrfs/file.c @@ -2214,15 +2214,29 @@ static const struct vm_operations_struct btrfs_file_vm_ops = { .page_mkwrite = btrfs_page_mkwrite, }; +#ifdef CONFIG_FS_DAX +static const struct vm_operations_struct btrfs_dax_vm_ops = { + 
.fault = btrfs_dax_fault, + .page_mkwrite = btrfs_dax_fault, + .pfn_mkwrite = btrfs_dax_fault, +}; +#else +#define btrfs_dax_vm_ops btrfs_file_vm_ops +#endif + static int btrfs_file_mmap(struct file *filp, struct vm_area_struct *vma) { struct address_space *mapping = filp->f_mapping; + struct inode *inode = file_inode(filp); - if (!mapping->a_ops->readpage) + if (!IS_DAX(inode) && !mapping->a_ops->readpage) return -ENOEXEC; file_accessed(filp); - vma->vm_ops = &btrfs_file_vm_ops; + if (IS_DAX(inode)) + vma->vm_ops = &btrfs_dax_vm_ops; + else + vma->vm_ops = &btrfs_file_vm_ops; return 0; } From patchwork Mon Apr 29 17:26:43 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Goldwyn Rodrigues X-Patchwork-Id: 10922261 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 10E8214B6 for ; Mon, 29 Apr 2019 17:27:36 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id EAFEA28849 for ; Mon, 29 Apr 2019 17:27:35 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id DFAD8288C6; Mon, 29 Apr 2019 17:27:35 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-7.9 required=2.0 tests=BAYES_00,MAILING_LIST_MULTI, RCVD_IN_DNSWL_HI autolearn=ham version=3.3.1 Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 9BB3128849 for ; Mon, 29 Apr 2019 17:27:35 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1728996AbfD2R1e (ORCPT ); Mon, 29 Apr 2019 13:27:34 -0400 Received: from mx2.suse.de ([195.135.220.15]:58288 "EHLO mx1.suse.de" rhost-flags-OK-OK-OK-FAIL) by vger.kernel.org with ESMTP id 
S1728954AbfD2R1b (ORCPT ); Mon, 29 Apr 2019 13:27:31 -0400 X-Virus-Scanned: by amavisd-new at test-mx.suse.de Received: from relay2.suse.de (unknown [195.135.220.254]) by mx1.suse.de (Postfix) with ESMTP id 6BA01AE29; Mon, 29 Apr 2019 17:27:30 +0000 (UTC) From: Goldwyn Rodrigues To: linux-btrfs@vger.kernel.org Cc: kilobyte@angband.pl, linux-fsdevel@vger.kernel.org, jack@suse.cz, david@fromorbit.com, willy@infradead.org, hch@lst.de, darrick.wong@oracle.com, dsterba@suse.cz, nborisov@suse.com, linux-nvdimm@lists.01.org, Goldwyn Rodrigues Subject: [PATCH 12/18] btrfs: allow MAP_SYNC mmap Date: Mon, 29 Apr 2019 12:26:43 -0500 Message-Id: <20190429172649.8288-13-rgoldwyn@suse.de> X-Mailer: git-send-email 2.16.4 In-Reply-To: <20190429172649.8288-1-rgoldwyn@suse.de> References: <20190429172649.8288-1-rgoldwyn@suse.de> Sender: linux-btrfs-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-btrfs@vger.kernel.org X-Virus-Scanned: ClamAV using ClamSMTP From: Adam Borowski Used by userspace to detect DAX. 
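For reference, the userspace side of this detection could look roughly like the sketch below. It is not part of the patch: probe_map_sync() is a hypothetical helper name, and the fallback flag values are the generic Linux UAPI ones (a few architectures may define them differently). It relies on the documented behaviour that MAP_SHARED_VALIDATE makes mmap() fail with EOPNOTSUPP when the file does not support a requested flag such as MAP_SYNC.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <fcntl.h>
#include <stddef.h>
#include <sys/mman.h>
#include <unistd.h>

/* Fallbacks for older libc headers; generic Linux UAPI values (assumption). */
#ifndef MAP_SHARED_VALIDATE
#define MAP_SHARED_VALIDATE 0x03
#endif
#ifndef MAP_SYNC
#define MAP_SYNC 0x80000
#endif

/*
 * Hypothetical helper, not from the patch.
 * Returns 1 if a MAP_SYNC mapping of `path` succeeds (file is likely DAX),
 * 0 if the kernel or filesystem rejects MAP_SYNC, and -1 on other errors.
 */
static int probe_map_sync(const char *path)
{
	long page = sysconf(_SC_PAGESIZE);
	void *addr;
	int saved_errno;
	int fd;

	fd = open(path, O_RDWR);
	if (fd < 0)
		return -1;

	/* MAP_SHARED_VALIDATE makes the kernel reject unsupported flags. */
	addr = mmap(NULL, (size_t)page, PROT_READ | PROT_WRITE,
		    MAP_SHARED_VALIDATE | MAP_SYNC, fd, 0);
	saved_errno = errno;
	close(fd);

	if (addr == MAP_FAILED) {
		/* EINVAL covers kernels that predate MAP_SHARED_VALIDATE. */
		if (saved_errno == EOPNOTSUPP || saved_errno == EINVAL)
			return 0;
		return -1;
	}

	munmap(addr, (size_t)page);
	return 1;
}
```

A caller would pass the path of a file on the filesystem under test and treat a return value of 1 as "synchronous DAX mappings are available here".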
[rgoldwyn@suse.com: Added CONFIG_FS_DAX around mmap_supported_flags] Signed-off-by: Adam Borowski Signed-off-by: Goldwyn Rodrigues --- fs/btrfs/file.c | 4 ++++ 1 file changed, 4 insertions(+) diff --git a/fs/btrfs/file.c b/fs/btrfs/file.c index 9d5a3c99a6b9..362a9cf9dcb2 100644 --- a/fs/btrfs/file.c +++ b/fs/btrfs/file.c @@ -16,6 +16,7 @@ #include #include #include +#include #include "ctree.h" #include "disk-io.h" #include "transaction.h" @@ -3319,6 +3320,9 @@ const struct file_operations btrfs_file_operations = { .splice_read = generic_file_splice_read, .write_iter = btrfs_file_write_iter, .mmap = btrfs_file_mmap, +#ifdef CONFIG_FS_DAX + .mmap_supported_flags = MAP_SYNC, +#endif .open = btrfs_file_open, .release = btrfs_release_file, .fsync = btrfs_sync_file, From patchwork Mon Apr 29 17:26:44 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Goldwyn Rodrigues X-Patchwork-Id: 10922281 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id DBA421515 for ; Mon, 29 Apr 2019 17:27:41 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id BFAAA2883B for ; Mon, 29 Apr 2019 17:27:41 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id B40C528877; Mon, 29 Apr 2019 17:27:41 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-7.9 required=2.0 tests=BAYES_00,MAILING_LIST_MULTI, RCVD_IN_DNSWL_HI autolearn=unavailable version=3.3.1 Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 09E8C28849 for ; Mon, 29 Apr 2019 17:27:41 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1729002AbfD2R1k 
(ORCPT ); Mon, 29 Apr 2019 13:27:40 -0400 Received: from mx2.suse.de ([195.135.220.15]:58314 "EHLO mx1.suse.de" rhost-flags-OK-OK-OK-FAIL) by vger.kernel.org with ESMTP id S1728748AbfD2R1e (ORCPT ); Mon, 29 Apr 2019 13:27:34 -0400 X-Virus-Scanned: by amavisd-new at test-mx.suse.de Received: from relay2.suse.de (unknown [195.135.220.254]) by mx1.suse.de (Postfix) with ESMTP id 7E8C7AE34; Mon, 29 Apr 2019 17:27:32 +0000 (UTC) From: Goldwyn Rodrigues To: linux-btrfs@vger.kernel.org Cc: kilobyte@angband.pl, linux-fsdevel@vger.kernel.org, jack@suse.cz, david@fromorbit.com, willy@infradead.org, hch@lst.de, darrick.wong@oracle.com, dsterba@suse.cz, nborisov@suse.com, linux-nvdimm@lists.01.org, Goldwyn Rodrigues Subject: [PATCH 13/18] fs: dedup file range to use a compare function Date: Mon, 29 Apr 2019 12:26:44 -0500 Message-Id: <20190429172649.8288-14-rgoldwyn@suse.de> X-Mailer: git-send-email 2.16.4 In-Reply-To: <20190429172649.8288-1-rgoldwyn@suse.de> References: <20190429172649.8288-1-rgoldwyn@suse.de> Sender: linux-btrfs-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-btrfs@vger.kernel.org X-Virus-Scanned: ClamAV using ClamSMTP From: Goldwyn Rodrigues With dax we cannot deal with readpage() etc. So, we create a function callback to perform the file data comparison and pass it to generic_remap_file_range_prep() so it can use iomap-based functions. This may not be the best way to solve this. Suggestions welcome.
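The callback approach described above can be illustrated with a small userspace sketch. Everything in it is a stand-in and an assumption for illustration only: struct fake_inode, whole_range_compare(), chunked_compare() and remap_prep_compare() are hypothetical names, and plain memory buffers replace the page cache and DAX mappings. Only the shape of the function pointer mirrors compare_range_t from the patch.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for an inode: a byte buffer plus a DAX flag. */
struct fake_inode {
	const unsigned char *data;
	size_t size;
	bool is_dax;
};

/* Same shape as compare_range_t in the patch, using userspace types. */
typedef int (*compare_range_fn)(const struct fake_inode *src, size_t srcoff,
				const struct fake_inode *dest, size_t destoff,
				size_t len, bool *is_same);

static bool range_ok(const struct fake_inode *inode, size_t off, size_t len)
{
	return off <= inode->size && len <= inode->size - off;
}

/* Stands in for the generic comparison: one pass over the range. */
static int whole_range_compare(const struct fake_inode *src, size_t srcoff,
			       const struct fake_inode *dest, size_t destoff,
			       size_t len, bool *is_same)
{
	if (!range_ok(src, srcoff, len) || !range_ok(dest, destoff, len))
		return -EINVAL;
	*is_same = memcmp(src->data + srcoff, dest->data + destoff, len) == 0;
	return 0;
}

/* Stands in for the DAX comparison: walks the range one chunk at a time. */
static int chunked_compare(const struct fake_inode *src, size_t srcoff,
			   const struct fake_inode *dest, size_t destoff,
			   size_t len, bool *is_same)
{
	const size_t chunk = 4096;

	if (!range_ok(src, srcoff, len) || !range_ok(dest, destoff, len))
		return -EINVAL;

	*is_same = true;
	while (len) {
		size_t n = len < chunk ? len : chunk;

		if (memcmp(src->data + srcoff, dest->data + destoff, n)) {
			*is_same = false;
			break;
		}
		srcoff += n;
		destoff += n;
		len -= n;
	}
	return 0;
}

/* The caller picks the comparison; the prep step only sees the pointer. */
static int remap_prep_compare(const struct fake_inode *src, size_t srcoff,
			      const struct fake_inode *dest, size_t destoff,
			      size_t len, bool *is_same)
{
	compare_range_fn cmp = (src->is_dax && dest->is_dax) ?
			       chunked_compare : whole_range_compare;

	return cmp(src, srcoff, dest, destoff, len, is_same);
}
```

The point of the design is that the generic preparation code stays unaware of how the bytes are reached; the trade-off is an extra parameter at every call site, as seen in the ocfs2 and xfs hunks.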
Signed-off-by: Goldwyn Rodrigues --- fs/btrfs/ctree.h | 9 ++++++++ fs/btrfs/dax.c | 8 +++++++ fs/btrfs/ioctl.c | 11 +++++++-- fs/dax.c | 65 ++++++++++++++++++++++++++++++++++++++++++++++++++++ fs/ocfs2/file.c | 2 +- fs/read_write.c | 11 +++++---- fs/xfs/xfs_reflink.c | 2 +- include/linux/dax.h | 4 ++++ include/linux/fs.h | 8 ++++++- 9 files changed, 110 insertions(+), 10 deletions(-) diff --git a/fs/btrfs/ctree.h b/fs/btrfs/ctree.h index 2b7bdabb44f8..d3d044125619 100644 --- a/fs/btrfs/ctree.h +++ b/fs/btrfs/ctree.h @@ -3803,11 +3803,20 @@ int btree_readahead_hook(struct extent_buffer *eb, int err); ssize_t btrfs_file_dax_read(struct kiocb *iocb, struct iov_iter *to); ssize_t btrfs_file_dax_write(struct kiocb *iocb, struct iov_iter *from); vm_fault_t btrfs_dax_fault(struct vm_fault *vmf); +int btrfs_dax_file_range_compare(struct inode *src, loff_t srcoff, + struct inode *dest, loff_t destoff, loff_t len, + bool *is_same); #else static inline ssize_t btrfs_file_dax_write(struct kiocb *iocb, struct iov_iter *from) { return 0; } +static inline int btrfs_dax_file_range_compare(struct inode *src, loff_t srcoff, + struct inode *dest, loff_t destoff, loff_t len, + bool *is_same) +{ + return 0; +} #endif /* CONFIG_FS_DAX */ static inline int is_fstree(u64 rootid) diff --git a/fs/btrfs/dax.c b/fs/btrfs/dax.c index de957d681e16..af64696a5337 100644 --- a/fs/btrfs/dax.c +++ b/fs/btrfs/dax.c @@ -227,4 +227,12 @@ vm_fault_t btrfs_dax_fault(struct vm_fault *vmf) return ret; } + +int btrfs_dax_file_range_compare(struct inode *src, loff_t srcoff, + struct inode *dest, loff_t destoff, loff_t len, + bool *is_same) +{ + return dax_file_range_compare(src, srcoff, dest, destoff, len, + is_same, &btrfs_iomap_ops); +} #endif /* CONFIG_FS_DAX */ diff --git a/fs/btrfs/ioctl.c b/fs/btrfs/ioctl.c index 0138119cd9a3..5ebb52848d5a 100644 --- a/fs/btrfs/ioctl.c +++ b/fs/btrfs/ioctl.c @@ -3939,6 +3939,7 @@ static int btrfs_remap_file_range_prep(struct file *file_in, loff_t pos_in, bool 
same_inode = inode_out == inode_in; u64 wb_len; int ret; + compare_range_t cmp; if (!(remap_flags & REMAP_FILE_DEDUP)) { struct btrfs_root *root_out = BTRFS_I(inode_out)->root; @@ -4000,8 +4001,14 @@ static int btrfs_remap_file_range_prep(struct file *file_in, loff_t pos_in, if (ret < 0) goto out_unlock; - ret = generic_remap_file_range_prep(file_in, pos_in, file_out, pos_out, - len, remap_flags); + if (IS_DAX(file_inode(file_in)) && IS_DAX(file_inode(file_out))) + cmp = btrfs_dax_file_range_compare; + else + cmp = vfs_dedupe_file_range_compare; + + ret = generic_remap_file_range_prep(file_in, pos_in, file_out, + pos_out, len, remap_flags, cmp); + if (ret < 0 || *len == 0) goto out_unlock; diff --git a/fs/dax.c b/fs/dax.c index 07e8ff20161d..fa9ccbad7c03 100644 --- a/fs/dax.c +++ b/fs/dax.c @@ -39,6 +39,8 @@ #define CREATE_TRACE_POINTS #include +#define MIN(a, b) (((a) < (b)) ? (a) : (b)) + static inline unsigned int pe_order(enum page_entry_size pe_size) { if (pe_size == PE_SIZE_PTE) @@ -1795,3 +1797,66 @@ vm_fault_t dax_finish_sync_fault(struct vm_fault *vmf, return dax_insert_pfn_mkwrite(vmf, pfn, order); } EXPORT_SYMBOL_GPL(dax_finish_sync_fault); + +static inline void *iomap_address(struct iomap *iomap, loff_t off, loff_t len) +{ + loff_t start; + void *addr; + start = (get_start_sect(iomap->bdev) << 9) + iomap->addr + + (off - iomap->offset); + dax_direct_access(iomap->dax_dev, PHYS_PFN(start), PHYS_PFN(len), + &addr, NULL); + return addr; +} + +int dax_file_range_compare(struct inode *src, loff_t srcoff, struct inode *dest, + loff_t destoff, loff_t len, bool *is_same, const struct iomap_ops *ops) +{ + void *saddr, *daddr; + struct iomap s_iomap = {0}; + struct iomap d_iomap = {0}; + bool same = true; + loff_t cmp_len; + int id, ret = 0; + + id = dax_read_lock(); + while (len) { + ret = ops->iomap_begin(src, srcoff, len, 0, &s_iomap); + if (ret < 0) { + if (ops->iomap_end) + ops->iomap_end(src, srcoff, len, ret, 0, &s_iomap); + return ret; + } + cmp_len = 
len; + cmp_len = MIN(len, s_iomap.offset + s_iomap.length - srcoff); + + ret = ops->iomap_begin(dest, destoff, cmp_len, 0, &d_iomap); + if (ret < 0) { + if (ops->iomap_end) { + ops->iomap_end(src, srcoff, len, ret, 0, &s_iomap); + ops->iomap_end(dest, destoff, len, ret, 0, &d_iomap); + } + return ret; + } + cmp_len = MIN(cmp_len, d_iomap.offset + d_iomap.length - destoff); + + saddr = iomap_address(&s_iomap, srcoff, cmp_len); + daddr = iomap_address(&d_iomap, destoff, cmp_len); + + same = !memcmp(saddr, daddr, cmp_len); + if (!same) + break; + len -= cmp_len; + srcoff += cmp_len; + destoff += cmp_len; + + if (ops->iomap_end) { + ret = ops->iomap_end(src, srcoff, len, 0, 0, &s_iomap); + ret = ops->iomap_end(dest, destoff, len, 0, 0, &d_iomap); + } + } + dax_read_unlock(id); + *is_same = same; + return ret; +} +EXPORT_SYMBOL_GPL(dax_file_range_compare); diff --git a/fs/ocfs2/file.c b/fs/ocfs2/file.c index d640c5f8a85d..9d470306cfc3 100644 --- a/fs/ocfs2/file.c +++ b/fs/ocfs2/file.c @@ -2558,7 +2558,7 @@ static loff_t ocfs2_remap_file_range(struct file *file_in, loff_t pos_in, goto out_unlock; ret = generic_remap_file_range_prep(file_in, pos_in, file_out, pos_out, - &len, remap_flags); + &len, remap_flags, vfs_dedupe_file_range_compare); if (ret < 0 || len == 0) goto out_unlock; diff --git a/fs/read_write.c b/fs/read_write.c index 61b43ad7608e..c6283802ef1c 100644 --- a/fs/read_write.c +++ b/fs/read_write.c @@ -1778,7 +1778,7 @@ static struct page *vfs_dedupe_get_page(struct inode *inode, loff_t offset) * Compare extents of two files to see if they are the same. * Caller must have locked both inodes to prevent write races. 
*/ -static int vfs_dedupe_file_range_compare(struct inode *src, loff_t srcoff, +int vfs_dedupe_file_range_compare(struct inode *src, loff_t srcoff, struct inode *dest, loff_t destoff, loff_t len, bool *is_same) { @@ -1845,6 +1845,7 @@ static int vfs_dedupe_file_range_compare(struct inode *src, loff_t srcoff, out_error: return error; } +EXPORT_SYMBOL_GPL(vfs_dedupe_file_range_compare); /* * Check that the two inodes are eligible for cloning, the ranges make @@ -1856,7 +1857,8 @@ static int vfs_dedupe_file_range_compare(struct inode *src, loff_t srcoff, */ int generic_remap_file_range_prep(struct file *file_in, loff_t pos_in, struct file *file_out, loff_t pos_out, - loff_t *len, unsigned int remap_flags) + loff_t *len, unsigned int remap_flags, + compare_range_t compare) { struct inode *inode_in = file_inode(file_in); struct inode *inode_out = file_inode(file_out); @@ -1915,9 +1917,8 @@ int generic_remap_file_range_prep(struct file *file_in, loff_t pos_in, */ if (remap_flags & REMAP_FILE_DEDUP) { bool is_same = false; - - ret = vfs_dedupe_file_range_compare(inode_in, pos_in, - inode_out, pos_out, *len, &is_same); + ret = (*compare)(inode_in, pos_in, + inode_out, pos_out, *len, &is_same); if (ret) return ret; if (!is_same) diff --git a/fs/xfs/xfs_reflink.c b/fs/xfs/xfs_reflink.c index 680ae7662a78..68e4257cebb0 100644 --- a/fs/xfs/xfs_reflink.c +++ b/fs/xfs/xfs_reflink.c @@ -1350,7 +1350,7 @@ xfs_reflink_remap_prep( goto out_unlock; ret = generic_remap_file_range_prep(file_in, pos_in, file_out, pos_out, - len, remap_flags); + len, remap_flags, vfs_dedupe_file_range_compare); if (ret < 0 || *len == 0) goto out_unlock; diff --git a/include/linux/dax.h b/include/linux/dax.h index 0dd316a74a29..1370d39c91b6 100644 --- a/include/linux/dax.h +++ b/include/linux/dax.h @@ -157,6 +157,10 @@ vm_fault_t dax_finish_sync_fault(struct vm_fault *vmf, int dax_delete_mapping_entry(struct address_space *mapping, pgoff_t index); int dax_invalidate_mapping_entry_sync(struct address_space 
*mapping, pgoff_t index); +int dax_file_range_compare(struct inode *src, loff_t srcoff, + struct inode *dest, loff_t destoff, + loff_t len, bool *is_same, + const struct iomap_ops *ops); #ifdef CONFIG_FS_DAX int __dax_zero_page_range(struct block_device *bdev, diff --git a/include/linux/fs.h b/include/linux/fs.h index dd28e7679089..0224503e42ce 100644 --- a/include/linux/fs.h +++ b/include/linux/fs.h @@ -1883,10 +1883,16 @@ extern ssize_t vfs_readv(struct file *, const struct iovec __user *, unsigned long, loff_t *, rwf_t); extern ssize_t vfs_copy_file_range(struct file *, loff_t , struct file *, loff_t, size_t, unsigned int); +extern int vfs_dedupe_file_range_compare(struct inode *src, loff_t srcoff, + struct inode *dest, loff_t destoff, + loff_t len, bool *is_same); +typedef int (*compare_range_t)(struct inode *src, loff_t srcpos, + struct inode *dest, loff_t destpos, + loff_t len, bool *is_same); extern int generic_remap_file_range_prep(struct file *file_in, loff_t pos_in, struct file *file_out, loff_t pos_out, loff_t *count, - unsigned int remap_flags); + unsigned int remap_flags, compare_range_t cmp); extern loff_t do_clone_file_range(struct file *file_in, loff_t pos_in, struct file *file_out, loff_t pos_out, loff_t len, unsigned int remap_flags); From patchwork Mon Apr 29 17:26:45 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Goldwyn Rodrigues X-Patchwork-Id: 10922305 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 35A6A14D5 for ; Mon, 29 Apr 2019 17:27:50 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 1A6B028849 for ; Mon, 29 Apr 2019 17:27:50 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id 0E99728877; Mon, 29 Apr 2019 17:27:50 +0000 (UTC) 
X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-7.9 required=2.0 tests=BAYES_00,MAILING_LIST_MULTI, RCVD_IN_DNSWL_HI autolearn=unavailable version=3.3.1 Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 9AEFA28849 for ; Mon, 29 Apr 2019 17:27:49 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1729009AbfD2R1m (ORCPT ); Mon, 29 Apr 2019 13:27:42 -0400 Received: from mx2.suse.de ([195.135.220.15]:58334 "EHLO mx1.suse.de" rhost-flags-OK-OK-OK-FAIL) by vger.kernel.org with ESMTP id S1728954AbfD2R1g (ORCPT ); Mon, 29 Apr 2019 13:27:36 -0400 X-Virus-Scanned: by amavisd-new at test-mx.suse.de Received: from relay2.suse.de (unknown [195.135.220.254]) by mx1.suse.de (Postfix) with ESMTP id 96DB1AE5A; Mon, 29 Apr 2019 17:27:34 +0000 (UTC) From: Goldwyn Rodrigues To: linux-btrfs@vger.kernel.org Cc: kilobyte@angband.pl, linux-fsdevel@vger.kernel.org, jack@suse.cz, david@fromorbit.com, willy@infradead.org, hch@lst.de, darrick.wong@oracle.com, dsterba@suse.cz, nborisov@suse.com, linux-nvdimm@lists.01.org, Goldwyn Rodrigues Subject: [PATCH 14/18] dax: memcpy before zeroing range Date: Mon, 29 Apr 2019 12:26:45 -0500 Message-Id: <20190429172649.8288-15-rgoldwyn@suse.de> X-Mailer: git-send-email 2.16.4 In-Reply-To: <20190429172649.8288-1-rgoldwyn@suse.de> References: <20190429172649.8288-1-rgoldwyn@suse.de> Sender: linux-btrfs-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-btrfs@vger.kernel.org X-Virus-Scanned: ClamAV using ClamSMTP From: Goldwyn Rodrigues However, this needed more iomap fields, so it was easier to pass iomap and compute inside the function rather than passing a lot of arguments. Note, there is a subtle difference between iomap_sector() and dax_iomap_sector(). Can we replace dax_iomap_sector() with iomap_sector()?
It would need pos & PAGE_MASK though or else bdev_dax_pgoff() return -EINVAL. Signed-off-by: Goldwyn Rodrigues --- fs/dax.c | 17 ++++++++++++----- fs/iomap.c | 9 +-------- include/linux/dax.h | 11 +++++------ include/linux/iomap.h | 6 ++++++ 4 files changed, 24 insertions(+), 19 deletions(-) diff --git a/fs/dax.c b/fs/dax.c index fa9ccbad7c03..82a08b0eec23 100644 --- a/fs/dax.c +++ b/fs/dax.c @@ -1063,11 +1063,16 @@ static bool dax_range_is_aligned(struct block_device *bdev, return true; } -int __dax_zero_page_range(struct block_device *bdev, - struct dax_device *dax_dev, sector_t sector, - unsigned int offset, unsigned int size) +int __dax_zero_page_range(struct iomap *iomap, loff_t pos, + unsigned int offset, unsigned int size) { - if (dax_range_is_aligned(bdev, offset, size)) { + sector_t sector = dax_iomap_sector(iomap, pos & PAGE_MASK); + struct block_device *bdev = iomap->bdev; + struct dax_device *dax_dev = iomap->dax_dev; + int ret = 0; + + if (!(iomap->type == IOMAP_DAX_COW) && + dax_range_is_aligned(bdev, offset, size)) { sector_t start_sector = sector + (offset >> 9); return blkdev_issue_zeroout(bdev, start_sector, @@ -1087,11 +1092,13 @@ int __dax_zero_page_range(struct block_device *bdev, dax_read_unlock(id); return rc; } + if (iomap->type == IOMAP_DAX_COW) + ret = memcpy_mcsafe(kaddr, iomap->inline_data, offset); memset(kaddr + offset, 0, size); dax_flush(dax_dev, kaddr + offset, size); dax_read_unlock(id); } - return 0; + return ret; } EXPORT_SYMBOL_GPL(__dax_zero_page_range); diff --git a/fs/iomap.c b/fs/iomap.c index abdd18e404f8..90698c854883 100644 --- a/fs/iomap.c +++ b/fs/iomap.c @@ -98,12 +98,6 @@ iomap_apply(struct inode *inode, loff_t pos, loff_t length, unsigned flags, return written ? 
written : ret; } -static sector_t -iomap_sector(struct iomap *iomap, loff_t pos) -{ - return (iomap->addr + pos - iomap->offset) >> SECTOR_SHIFT; -} - static struct iomap_page * iomap_page_create(struct inode *inode, struct page *page) { @@ -990,8 +984,7 @@ static int iomap_zero(struct inode *inode, loff_t pos, unsigned offset, static int iomap_dax_zero(loff_t pos, unsigned offset, unsigned bytes, struct iomap *iomap) { - return __dax_zero_page_range(iomap->bdev, iomap->dax_dev, - iomap_sector(iomap, pos & PAGE_MASK), offset, bytes); + return __dax_zero_page_range(iomap, pos, offset, bytes); } static loff_t diff --git a/include/linux/dax.h b/include/linux/dax.h index 1370d39c91b6..c469d9ff54b4 100644 --- a/include/linux/dax.h +++ b/include/linux/dax.h @@ -9,6 +9,7 @@ typedef unsigned long dax_entry_t; +struct iomap; struct iomap_ops; struct dax_device; struct dax_operations { @@ -163,13 +164,11 @@ int dax_file_range_compare(struct inode *src, loff_t srcoff, const struct iomap_ops *ops); #ifdef CONFIG_FS_DAX -int __dax_zero_page_range(struct block_device *bdev, - struct dax_device *dax_dev, sector_t sector, - unsigned int offset, unsigned int length); +int __dax_zero_page_range(struct iomap *iomap, loff_t pos, + unsigned int offset, unsigned int size); #else -static inline int __dax_zero_page_range(struct block_device *bdev, - struct dax_device *dax_dev, sector_t sector, - unsigned int offset, unsigned int length) +static inline int __dax_zero_page_range(struct iomap *iomap, loff_t pos, + unsigned int offset, unsigned int size) { return -ENXIO; } diff --git a/include/linux/iomap.h b/include/linux/iomap.h index 6e885c5a38a3..fcfce269db3e 100644 --- a/include/linux/iomap.h +++ b/include/linux/iomap.h @@ -7,6 +7,7 @@ #include #include #include +#include struct address_space; struct fiemap_extent_info; @@ -120,6 +121,11 @@ static inline struct iomap_page *to_iomap_page(struct page *page) return NULL; } +static inline sector_t iomap_sector(struct iomap *iomap, loff_t 
pos) +{ + return (iomap->addr + pos - iomap->offset) >> SECTOR_SHIFT; +} + ssize_t iomap_file_buffered_write(struct kiocb *iocb, struct iov_iter *from, const struct iomap_ops *ops); int iomap_readpage(struct page *page, const struct iomap_ops *ops); From patchwork Mon Apr 29 17:26:46 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Goldwyn Rodrigues X-Patchwork-Id: 10922289 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id E63651515 for ; Mon, 29 Apr 2019 17:27:43 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id CAE9B2883B for ; Mon, 29 Apr 2019 17:27:43 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id BF7462884A; Mon, 29 Apr 2019 17:27:43 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-7.9 required=2.0 tests=BAYES_00,MAILING_LIST_MULTI, RCVD_IN_DNSWL_HI autolearn=unavailable version=3.3.1 Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 6C23528849 for ; Mon, 29 Apr 2019 17:27:43 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1729017AbfD2R1m (ORCPT ); Mon, 29 Apr 2019 13:27:42 -0400 Received: from mx2.suse.de ([195.135.220.15]:58364 "EHLO mx1.suse.de" rhost-flags-OK-OK-OK-FAIL) by vger.kernel.org with ESMTP id S1728838AbfD2R1h (ORCPT ); Mon, 29 Apr 2019 13:27:37 -0400 X-Virus-Scanned: by amavisd-new at test-mx.suse.de Received: from relay2.suse.de (unknown [195.135.220.254]) by mx1.suse.de (Postfix) with ESMTP id A9A5BAE8A; Mon, 29 Apr 2019 17:27:36 +0000 (UTC) From: Goldwyn Rodrigues To: linux-btrfs@vger.kernel.org Cc: kilobyte@angband.pl, 
linux-fsdevel@vger.kernel.org, jack@suse.cz, david@fromorbit.com, willy@infradead.org, hch@lst.de, darrick.wong@oracle.com, dsterba@suse.cz, nborisov@suse.com, linux-nvdimm@lists.01.org, Goldwyn Rodrigues Subject: [PATCH 15/18] btrfs: handle dax page zeroing Date: Mon, 29 Apr 2019 12:26:46 -0500 Message-Id: <20190429172649.8288-16-rgoldwyn@suse.de> X-Mailer: git-send-email 2.16.4 In-Reply-To: <20190429172649.8288-1-rgoldwyn@suse.de> References: <20190429172649.8288-1-rgoldwyn@suse.de> Sender: linux-btrfs-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-btrfs@vger.kernel.org X-Virus-Scanned: ClamAV using ClamSMTP From: Goldwyn Rodrigues btrfs_dax_zero_block() zeros part of the page, either from the front or the regular rest of the block. Signed-off-by: Goldwyn Rodrigues --- fs/btrfs/ctree.h | 1 + fs/btrfs/dax.c | 27 ++++++++++++++++++++++++++- fs/btrfs/inode.c | 4 ++++ 3 files changed, 31 insertions(+), 1 deletion(-) diff --git a/fs/btrfs/ctree.h b/fs/btrfs/ctree.h index d3d044125619..ee1ed18f8b3c 100644 --- a/fs/btrfs/ctree.h +++ b/fs/btrfs/ctree.h @@ -3806,6 +3806,7 @@ vm_fault_t btrfs_dax_fault(struct vm_fault *vmf); int btrfs_dax_file_range_compare(struct inode *src, loff_t srcoff, struct inode *dest, loff_t destoff, loff_t len, bool *is_same); +int btrfs_dax_zero_block(struct inode *inode, loff_t from, loff_t len, bool front); #else static inline ssize_t btrfs_file_dax_write(struct kiocb *iocb, struct iov_iter *from) { diff --git a/fs/btrfs/dax.c b/fs/btrfs/dax.c index af64696a5337..bf2ddac5b5a1 100644 --- a/fs/btrfs/dax.c +++ b/fs/btrfs/dax.c @@ -132,7 +132,8 @@ static int btrfs_iomap_begin(struct inode *inode, loff_t pos, * This will be true for reads only since we have already * allocated em */ - if (em->block_start == EXTENT_MAP_HOLE) { + if (em->block_start == EXTENT_MAP_HOLE || + em->flags == EXTENT_FLAG_FILLING) { iomap->type = IOMAP_HOLE; return 0; } @@ -235,4 +236,28 @@ int btrfs_dax_file_range_compare(struct inode *src, loff_t 
srcoff, return dax_file_range_compare(src, srcoff, dest, destoff, len, is_same, &btrfs_iomap_ops); } + +/* + * zero a part of the page only. This should CoW (via iomap_begin) if required + */ +int btrfs_dax_zero_block(struct inode *inode, loff_t from, loff_t len, bool front) +{ + loff_t start = round_down(from, PAGE_SIZE); + loff_t end = round_up(from, PAGE_SIZE); + loff_t offset = from; + int ret = 0; + + if (front) { + len = from - start; + offset = start; + } else { + if (!len) + len = end - from; + } + + if (len) + ret = iomap_zero_range(inode, offset, len, NULL, &btrfs_iomap_ops); + + return (ret < 0) ? ret : 0; +} #endif /* CONFIG_FS_DAX */ diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c index 05714ffc4894..7e88280a2c3b 100644 --- a/fs/btrfs/inode.c +++ b/fs/btrfs/inode.c @@ -4833,6 +4833,10 @@ int btrfs_truncate_block(struct inode *inode, loff_t from, loff_t len, (!len || IS_ALIGNED(len, blocksize))) goto out; +#ifdef CONFIG_FS_DAX + if (IS_DAX(inode)) + return btrfs_dax_zero_block(inode, from, len, front); +#endif block_start = round_down(from, blocksize); block_end = block_start + blocksize - 1; From patchwork Mon Apr 29 17:26:47 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Goldwyn Rodrigues X-Patchwork-Id: 10922293 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 333F614B6 for ; Mon, 29 Apr 2019 17:27:45 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 1A66B2883B for ; Mon, 29 Apr 2019 17:27:45 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id 0F0D72884A; Mon, 29 Apr 2019 17:27:45 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-7.9 required=2.0 
tests=BAYES_00,MAILING_LIST_MULTI, RCVD_IN_DNSWL_HI autolearn=unavailable version=3.3.1 Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 9FFE42883B for ; Mon, 29 Apr 2019 17:27:44 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1729018AbfD2R1n (ORCPT ); Mon, 29 Apr 2019 13:27:43 -0400 Received: from mx2.suse.de ([195.135.220.15]:58398 "EHLO mx1.suse.de" rhost-flags-OK-OK-OK-FAIL) by vger.kernel.org with ESMTP id S1728997AbfD2R1k (ORCPT ); Mon, 29 Apr 2019 13:27:40 -0400 X-Virus-Scanned: by amavisd-new at test-mx.suse.de Received: from relay2.suse.de (unknown [195.135.220.254]) by mx1.suse.de (Postfix) with ESMTP id B6119AE8B; Mon, 29 Apr 2019 17:27:38 +0000 (UTC) From: Goldwyn Rodrigues To: linux-btrfs@vger.kernel.org Cc: kilobyte@angband.pl, linux-fsdevel@vger.kernel.org, jack@suse.cz, david@fromorbit.com, willy@infradead.org, hch@lst.de, darrick.wong@oracle.com, dsterba@suse.cz, nborisov@suse.com, linux-nvdimm@lists.01.org, Goldwyn Rodrigues Subject: [PATCH 16/18] btrfs: Writeprotect mmap pages on snapshot Date: Mon, 29 Apr 2019 12:26:47 -0500 Message-Id: <20190429172649.8288-17-rgoldwyn@suse.de> X-Mailer: git-send-email 2.16.4 In-Reply-To: <20190429172649.8288-1-rgoldwyn@suse.de> References: <20190429172649.8288-1-rgoldwyn@suse.de> Sender: linux-btrfs-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-btrfs@vger.kernel.org X-Virus-Scanned: ClamAV using ClamSMTP From: Goldwyn Rodrigues In order to make sure mmap'd files don't change after snapshot, writeprotect the mmap pages on snapshot. This is done by performing a data writeback on the pages (which simply marks the pages as write-protected). This way if the user process tries to access the memory we will get another fault and we can perform a CoW. In order to accomplish this, we tag all CoW pages as PAGECACHE_TAG_TOWRITE, and add the mmap'd inode to delalloc_inodes.
During snapshot, it starts writeback of all delalloc'd inodes and here we perform a data writeback. We don't want to keep the inodes in delalloc_inodes until it umount (WARN_ON), so we remove it during inode evictions. Signed-off-by: Goldwyn Rodrigues --- fs/btrfs/ctree.h | 3 ++- fs/btrfs/dax.c | 7 +++++++ fs/btrfs/inode.c | 13 ++++++++++++- 3 files changed, 21 insertions(+), 2 deletions(-) diff --git a/fs/btrfs/ctree.h b/fs/btrfs/ctree.h index ee1ed18f8b3c..d1b70f24adeb 100644 --- a/fs/btrfs/ctree.h +++ b/fs/btrfs/ctree.h @@ -3252,7 +3252,8 @@ int btrfs_create_subvol_root(struct btrfs_trans_handle *trans, struct btrfs_root *new_root, struct btrfs_root *parent_root, u64 new_dirid); - void btrfs_set_delalloc_extent(struct inode *inode, struct extent_state *state, +void btrfs_add_delalloc_inodes(struct btrfs_root *root, struct inode *inode); +void btrfs_set_delalloc_extent(struct inode *inode, struct extent_state *state, unsigned *bits); void btrfs_clear_delalloc_extent(struct inode *inode, struct extent_state *state, unsigned *bits); diff --git a/fs/btrfs/dax.c b/fs/btrfs/dax.c index bf2ddac5b5a1..20ec2ec49c68 100644 --- a/fs/btrfs/dax.c +++ b/fs/btrfs/dax.c @@ -222,10 +222,17 @@ vm_fault_t btrfs_dax_fault(struct vm_fault *vmf) { vm_fault_t ret; pfn_t pfn; + struct inode *inode = file_inode(vmf->vma->vm_file); + struct btrfs_inode *binode = BTRFS_I(inode); ret = dax_iomap_fault(vmf, PE_SIZE_PTE, &pfn, NULL, &btrfs_iomap_ops); if (ret & VM_FAULT_NEEDDSYNC) ret = dax_finish_sync_fault(vmf, PE_SIZE_PTE, pfn); + /* Insert into delalloc so we get writeback calls on snapshots */ + if (vmf->flags & FAULT_FLAG_WRITE && + !test_bit(BTRFS_INODE_IN_DELALLOC_LIST, &binode->runtime_flags)) + btrfs_add_delalloc_inodes(binode->root, inode); + return ret; } diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c index 7e88280a2c3b..e98fb512e1ca 100644 --- a/fs/btrfs/inode.c +++ b/fs/btrfs/inode.c @@ -1713,7 +1713,7 @@ void btrfs_merge_delalloc_extent(struct inode *inode, struct 
extent_state *new,
 	spin_unlock(&BTRFS_I(inode)->lock);
 }
 
-static void btrfs_add_delalloc_inodes(struct btrfs_root *root,
+void btrfs_add_delalloc_inodes(struct btrfs_root *root,
 				      struct inode *inode)
 {
 	struct btrfs_fs_info *fs_info = btrfs_sb(inode->i_sb);
@@ -5358,12 +5358,17 @@ void btrfs_evict_inode(struct inode *inode)
 {
 	struct btrfs_fs_info *fs_info = btrfs_sb(inode->i_sb);
 	struct btrfs_trans_handle *trans;
+	struct btrfs_inode *binode = BTRFS_I(inode);
 	struct btrfs_root *root = BTRFS_I(inode)->root;
 	struct btrfs_block_rsv *rsv;
 	int ret;
 
 	trace_btrfs_inode_evict(inode);
 
+	if (IS_DAX(inode)
+	    && test_bit(BTRFS_INODE_IN_DELALLOC_LIST, &binode->runtime_flags))
+		btrfs_del_delalloc_inode(root, binode);
+
 	if (!root) {
 		clear_inode(inode);
 		return;
@@ -8683,6 +8688,10 @@ static int btrfs_dax_writepages(struct address_space *mapping,
 {
 	struct inode *inode = mapping->host;
 	struct btrfs_fs_info *fs_info = btrfs_sb(inode->i_sb);
+	struct btrfs_inode *binode = BTRFS_I(inode);
+	if ((wbc->sync_mode == WB_SYNC_ALL) &&
+	    test_bit(BTRFS_INODE_IN_DELALLOC_LIST, &binode->runtime_flags))
+		btrfs_del_delalloc_inode(binode->root, binode);
 	return dax_writeback_mapping_range(mapping, fs_info->fs_devices->latest_bdev,
 			wbc);
 }
@@ -9981,6 +9990,8 @@ static void btrfs_run_delalloc_work(struct btrfs_work *work)
 	delalloc_work = container_of(work, struct btrfs_delalloc_work,
 				     work);
 	inode = delalloc_work->inode;
+	if (IS_DAX(inode))
+		filemap_fdatawrite(inode->i_mapping);
 	filemap_flush(inode->i_mapping);
 	if (test_bit(BTRFS_INODE_HAS_ASYNC_EXTENT,
 				&BTRFS_I(inode)->runtime_flags))

From patchwork Mon Apr 29 17:26:48 2019
X-Patchwork-Submitter: Goldwyn Rodrigues
X-Patchwork-Id: 10922301
From: Goldwyn Rodrigues
To: linux-btrfs@vger.kernel.org
Cc: kilobyte@angband.pl, linux-fsdevel@vger.kernel.org, jack@suse.cz,
 david@fromorbit.com, willy@infradead.org, hch@lst.de,
 darrick.wong@oracle.com, dsterba@suse.cz, nborisov@suse.com,
 linux-nvdimm@lists.01.org, Goldwyn Rodrigues
Subject: [PATCH 17/18] btrfs: Disable dax-based defrag and send
Date: Mon, 29 Apr 2019 12:26:48 -0500
Message-Id: <20190429172649.8288-18-rgoldwyn@suse.de>
X-Mailer: git-send-email 2.16.4
In-Reply-To: <20190429172649.8288-1-rgoldwyn@suse.de>
References: <20190429172649.8288-1-rgoldwyn@suse.de>
X-Virus-Scanned: ClamAV using ClamSMTP

From: Goldwyn Rodrigues

Defrag, balance and send are not supported with DAX yet, so reject them
with -EOPNOTSUPP. This is temporary, and a TODO.

Signed-off-by: Goldwyn Rodrigues
---
 fs/btrfs/ioctl.c | 13 +++++++++++++
 fs/btrfs/send.c  |  4 ++++
 2 files changed, 17 insertions(+)

diff --git a/fs/btrfs/ioctl.c b/fs/btrfs/ioctl.c
index 5ebb52848d5a..2470f90c9983 100644
--- a/fs/btrfs/ioctl.c
+++ b/fs/btrfs/ioctl.c
@@ -2990,6 +2990,12 @@ static int btrfs_ioctl_defrag(struct file *file, void __user *argp)
 			goto out;
 		}
 
+		if (IS_DAX(inode)) {
+			btrfs_warn(root->fs_info, "File defrag is not supported with DAX");
+			ret = -EOPNOTSUPP;
+			goto out;
+		}
+
 		if (argp) {
 			if (copy_from_user(range, argp,
 					   sizeof(*range))) {
@@ -4653,6 +4659,10 @@ static long btrfs_ioctl_balance(struct file *file, void __user *arg)
 	if (!capable(CAP_SYS_ADMIN))
 		return -EPERM;
 
+	/* balance is filesystem-wide, so check the mount option instead */
+	if (btrfs_test_opt(fs_info, DAX))
+		return -EOPNOTSUPP;
+
 	ret = mnt_want_write_file(file);
 	if (ret)
 		return ret;
@@ -5505,6 +5515,9 @@ static int _btrfs_ioctl_send(struct file *file, void __user *argp, bool compat)
 	struct btrfs_ioctl_send_args *arg;
 	int ret;
 
+	if (IS_DAX(file_inode(file)))
+		return -EOPNOTSUPP;
+
 	if (compat) {
 #if defined(CONFIG_64BIT) && defined(CONFIG_COMPAT)
 		struct btrfs_ioctl_send_args_32 args32;
diff --git a/fs/btrfs/send.c b/fs/btrfs/send.c
index 7ea2d6b1f170..9679fd54db86 100644
--- a/fs/btrfs/send.c
+++ b/fs/btrfs/send.c
@@ -6609,6 +6609,10 @@ long btrfs_ioctl_send(struct file *mnt_file, struct btrfs_ioctl_send_args *arg)
 	int sort_clone_roots = 0;
 	int index;
 
+	/* send can be on a directory, so check super block instead */
+	if (btrfs_test_opt(fs_info, DAX))
+		return -EOPNOTSUPP;
+
 	if (!capable(CAP_SYS_ADMIN))
 		return -EPERM;
 

From patchwork Mon Apr 29 17:26:49 2019
X-Patchwork-Submitter: Goldwyn Rodrigues
X-Patchwork-Id: 10922295
From: Goldwyn Rodrigues
To: linux-btrfs@vger.kernel.org
Cc: kilobyte@angband.pl, linux-fsdevel@vger.kernel.org, jack@suse.cz,
 david@fromorbit.com, willy@infradead.org, hch@lst.de,
 darrick.wong@oracle.com, dsterba@suse.cz, nborisov@suse.com,
 linux-nvdimm@lists.01.org, Goldwyn Rodrigues
Subject: [PATCH 18/18] btrfs: trace functions for btrfs_iomap_begin/end
Date: Mon, 29 Apr 2019 12:26:49 -0500
Message-Id: <20190429172649.8288-19-rgoldwyn@suse.de>
X-Mailer: git-send-email 2.16.4
In-Reply-To: <20190429172649.8288-1-rgoldwyn@suse.de>
References: <20190429172649.8288-1-rgoldwyn@suse.de>
X-Virus-Scanned: ClamAV using ClamSMTP

From: Goldwyn Rodrigues

This is for debug purposes only and can be skipped.

Signed-off-by: Goldwyn Rodrigues
---
 fs/btrfs/dax.c               |  3 +++
 include/trace/events/btrfs.h | 56 ++++++++++++++++++++++++++++++++++++++++++++
 2 files changed, 59 insertions(+)

diff --git a/fs/btrfs/dax.c b/fs/btrfs/dax.c
index 20ec2ec49c68..3fee28f5a199 100644
--- a/fs/btrfs/dax.c
+++ b/fs/btrfs/dax.c
@@ -104,6 +104,8 @@ static int btrfs_iomap_begin(struct inode *inode, loff_t pos,
 	u64 srcblk = 0;
 	loff_t diff;
 
+	trace_btrfs_iomap_begin(inode, pos, length, flags);
+
 	em = btrfs_get_extent(BTRFS_I(inode), NULL, 0, pos, length, 0);
 
 	iomap->type = IOMAP_MAPPED;
@@ -164,6 +166,7 @@ static int btrfs_iomap_end(struct inode *inode, loff_t pos,
 {
 	struct btrfs_iomap *bi = iomap->private;
 	u64 wend;
+	trace_btrfs_iomap_end(inode, pos, length, written, flags);
 
 	if (!bi)
 		return 0;
diff --git a/include/trace/events/btrfs.h b/include/trace/events/btrfs.h
index ab1cc33adbac..8779e5789a7c 100644
--- a/include/trace/events/btrfs.h
+++ b/include/trace/events/btrfs.h
@@ -1850,6 +1850,62 @@ DEFINE_EVENT(btrfs__block_group, btrfs_skip_unused_block_group,
 	TP_ARGS(bg_cache)
 );
 
+TRACE_EVENT(btrfs_iomap_begin,
+
+	TP_PROTO(const struct inode *inode, loff_t pos, loff_t length, int flags),
+
+	TP_ARGS(inode, pos, length, flags),
+
+	TP_STRUCT__entry_btrfs(
+		__field(	u64,	ino	)
+		__field(	u64,	pos	)
+		__field(	u64,	length	)
+		__field(	int,	flags	)
+	),
+
+	TP_fast_assign_btrfs(btrfs_sb(inode->i_sb),
+		__entry->ino = btrfs_ino(BTRFS_I(inode));
+		__entry->pos = pos;
+		__entry->length = length;
+		__entry->flags = flags;
+	),
+
+	TP_printk_btrfs("ino=%llu pos=%llu len=%llu flags=0x%x",
+		__entry->ino,
+		__entry->pos,
+		__entry->length,
+		__entry->flags)
+);
+
+TRACE_EVENT(btrfs_iomap_end,
+
+	TP_PROTO(const struct inode *inode, loff_t pos, loff_t length, loff_t written, int flags),
+
+	TP_ARGS(inode, pos, length, written, flags),
+
+	TP_STRUCT__entry_btrfs(
+		__field(	u64,	ino	)
+		__field(	u64,	pos	)
+		__field(	u64,	length	)
+		__field(	u64,	written	)
+		__field(	int,	flags	)
+	),
+
+	TP_fast_assign_btrfs(btrfs_sb(inode->i_sb),
+		__entry->ino = btrfs_ino(BTRFS_I(inode));
+		__entry->pos = pos;
+		__entry->length = length;
+		__entry->written = written;
+		__entry->flags = flags;
+	),
+
+	TP_printk_btrfs("ino=%llu pos=%llu len=%llu written=%llu flags=0x%x",
+		__entry->ino,
+		__entry->pos,
+		__entry->length,
+		__entry->written,
+		__entry->flags)
+);
 #endif /* _TRACE_BTRFS_H */
 
 /* This part must be outside protection */
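
For anyone reading the resulting trace output, the following userspace sketch
mirrors the event-specific part of the btrfs_iomap_end format string from the
patch above. It is only an illustration, not kernel code: the struct
iomap_end_record and format_iomap_end() names are hypothetical, and the
filesystem-identifying prefix that TP_printk_btrfs() is expected to add in
front of the message is left out.

```c
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/*
 * Hypothetical userspace mirror of the fields recorded by the
 * btrfs_iomap_end trace event (ino, pos, length, written, flags).
 */
struct iomap_end_record {
	uint64_t ino;
	uint64_t pos;
	uint64_t length;
	uint64_t written;
	int flags;
};

/*
 * Format a record with the same field layout as the event's format string:
 * "ino=%llu pos=%llu len=%llu written=%llu flags=0x%x".
 * Assumption: the prefix added by TP_printk_btrfs() is not modelled here.
 * Returns the snprintf() result (number of characters that were needed).
 */
static int format_iomap_end(char *buf, size_t size,
			    const struct iomap_end_record *rec)
{
	return snprintf(buf, size,
			"ino=%" PRIu64 " pos=%" PRIu64 " len=%" PRIu64
			" written=%" PRIu64 " flags=0x%x",
			rec->ino, rec->pos, rec->length, rec->written,
			(unsigned int)rec->flags);
}
```

With a record of { 257, 4096, 8192, 8192, 0x1 } the helper produces
"ino=257 pos=4096 len=8192 written=8192 flags=0x1". The values are made up for
the example and are not taken from a real trace.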