From patchwork Tue Mar 26 19:02:47 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Goldwyn Rodrigues X-Patchwork-Id: 10872047 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 581F717E6 for ; Tue, 26 Mar 2019 19:03:22 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 4B2A228C19 for ; Tue, 26 Mar 2019 19:03:22 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id 3F07C28D04; Tue, 26 Mar 2019 19:03:22 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-7.9 required=2.0 tests=BAYES_00,MAILING_LIST_MULTI, RCVD_IN_DNSWL_HI autolearn=unavailable version=3.3.1 Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id C52A628C58 for ; Tue, 26 Mar 2019 19:03:21 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1732179AbfCZTDU (ORCPT ); Tue, 26 Mar 2019 15:03:20 -0400 Received: from mx2.suse.de ([195.135.220.15]:36922 "EHLO mx1.suse.de" rhost-flags-OK-OK-OK-FAIL) by vger.kernel.org with ESMTP id S1726175AbfCZTDU (ORCPT ); Tue, 26 Mar 2019 15:03:20 -0400 X-Virus-Scanned: by amavisd-new at test-mx.suse.de Received: from relay2.suse.de (unknown [195.135.220.254]) by mx1.suse.de (Postfix) with ESMTP id 78780ADF2; Tue, 26 Mar 2019 19:03:18 +0000 (UTC) From: Goldwyn Rodrigues To: linux-btrfs@vger.kernel.org Cc: linux-fsdevel@vger.kernel.org, Goldwyn Rodrigues Subject: [PATCH 01/15] btrfs: create a mount option for dax Date: Tue, 26 Mar 2019 14:02:47 -0500 Message-Id: <20190326190301.32365-2-rgoldwyn@suse.de> X-Mailer: git-send-email 2.16.4 In-Reply-To: 
<20190326190301.32365-1-rgoldwyn@suse.de> References: <20190326190301.32365-1-rgoldwyn@suse.de> Sender: linux-fsdevel-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-fsdevel@vger.kernel.org X-Virus-Scanned: ClamAV using ClamSMTP

From: Goldwyn Rodrigues

This sets S_DAX in inode->i_flags, which can be tested with IS_DAX().

The dax option is restricted to single-device mounts. DAX interacts with the device directly instead of using bios, so none of the bio hooks we use for multi-device support can be performed here. While regular reads/writes could be handled for RAID0/1, mmap() is still an issue.

Auto-set the free space tree, because dealing with the free space inode (specifically readpages) is a nightmare.

Auto-set nodatasum, because we don't get a callback for writing checksums after mmap()s.

Store the dax_device in fs_info, where it will be used by the iomap code.

Question: Since we have only one dax device, I thought fs_info is the best place. However, should it be moved to btrfs_device?
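The option rules described above can be modelled in a few lines of userspace C. This is only a sketch under stated assumptions, not the kernel code: struct mount_state, the OPT_* flags and parse_dax_option() are hypothetical stand-ins for btrfs_fs_info, the BTRFS_MOUNT_* bits and the Opt_dax case in btrfs_parse_options().

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for the BTRFS_MOUNT_* option bits. */
#define OPT_NODATASUM       (1u << 0)
#define OPT_SPACE_CACHE     (1u << 1)
#define OPT_FREE_SPACE_TREE (1u << 2)
#define OPT_DAX             (1u << 3)

/* Hypothetical, minimal model of the state the option parser looks at. */
struct mount_state {
	uint64_t num_devices;	/* devices recorded in the superblock */
	uint32_t mount_opt;	/* bitmask of OPT_* flags */
};

/*
 * Apply the "dax" option: refuse multi-device filesystems, otherwise
 * enable DAX and force nodatasum plus the free space tree (dropping the
 * v1 space cache), as the commit message describes.
 * Returns 0 on success or -EOPNOTSUPP if more than one device is present.
 */
int parse_dax_option(struct mount_state *ms)
{
	assert(ms != NULL);

	if (ms->num_devices > 1)
		return -EOPNOTSUPP;

	ms->mount_opt |= OPT_DAX | OPT_NODATASUM | OPT_FREE_SPACE_TREE;
	ms->mount_opt &= ~OPT_SPACE_CACHE;
	return 0;
}
```

A caller would fill in num_devices from the superblock, call parse_dax_option() when it sees "dax", and fail the mount on a negative return.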
Signed-off-by: Goldwyn Rodrigues --- fs/btrfs/ctree.h | 2 ++ fs/btrfs/disk-io.c | 4 ++++ fs/btrfs/ioctl.c | 5 ++++- fs/btrfs/super.c | 26 ++++++++++++++++++++++++++ 4 files changed, 36 insertions(+), 1 deletion(-) diff --git a/fs/btrfs/ctree.h b/fs/btrfs/ctree.h index b3642367a595..8ca1c0d120f4 100644 --- a/fs/btrfs/ctree.h +++ b/fs/btrfs/ctree.h @@ -1067,6 +1067,7 @@ struct btrfs_fs_info { u32 metadata_ratio; void *bdev_holder; + struct dax_device *dax_dev; /* private scrub information */ struct mutex scrub_lock; @@ -1442,6 +1443,7 @@ static inline u32 BTRFS_MAX_XATTR_SIZE(const struct btrfs_fs_info *info) #define BTRFS_MOUNT_FREE_SPACE_TREE (1 << 26) #define BTRFS_MOUNT_NOLOGREPLAY (1 << 27) #define BTRFS_MOUNT_REF_VERIFY (1 << 28) +#define BTRFS_MOUNT_DAX (1 << 29) #define BTRFS_DEFAULT_COMMIT_INTERVAL (30) #define BTRFS_DEFAULT_MAX_INLINE (2048) diff --git a/fs/btrfs/disk-io.c b/fs/btrfs/disk-io.c index 6fe9197f6ee4..2bbb63b2fcff 100644 --- a/fs/btrfs/disk-io.c +++ b/fs/btrfs/disk-io.c @@ -16,6 +16,7 @@ #include #include #include +#include #include #include #include @@ -2805,6 +2806,8 @@ int open_ctree(struct super_block *sb, goto fail_alloc; } + fs_info->dax_dev = fs_dax_get_by_bdev(fs_devices->latest_bdev); + /* * We want to check superblock checksum, the type is stored inside. * Pass the whole disk block of size BTRFS_SUPER_INFO_SIZE (4k). 
@@ -4043,6 +4046,7 @@ void close_ctree(struct btrfs_fs_info *fs_info) #endif btrfs_close_devices(fs_info->fs_devices); + fs_put_dax(fs_info->dax_dev); btrfs_mapping_tree_free(&fs_info->mapping_tree); percpu_counter_destroy(&fs_info->dirty_metadata_bytes); diff --git a/fs/btrfs/ioctl.c b/fs/btrfs/ioctl.c index ec2d8919e7fb..e66426e7692d 100644 --- a/fs/btrfs/ioctl.c +++ b/fs/btrfs/ioctl.c @@ -149,8 +149,11 @@ void btrfs_sync_inode_flags_to_i_flags(struct inode *inode) if (binode->flags & BTRFS_INODE_DIRSYNC) new_fl |= S_DIRSYNC; + if ((btrfs_test_opt(btrfs_sb(inode->i_sb), DAX)) && S_ISREG(inode->i_mode)) + new_fl |= S_DAX; + set_mask_bits(&inode->i_flags, - S_SYNC | S_APPEND | S_IMMUTABLE | S_NOATIME | S_DIRSYNC, + S_SYNC | S_APPEND | S_IMMUTABLE | S_NOATIME | S_DIRSYNC | S_DAX, new_fl); } diff --git a/fs/btrfs/super.c b/fs/btrfs/super.c index 120e4340792a..2d448b9d6004 100644 --- a/fs/btrfs/super.c +++ b/fs/btrfs/super.c @@ -326,6 +326,7 @@ enum { Opt_treelog, Opt_notreelog, Opt_usebackuproot, Opt_user_subvol_rm_allowed, + Opt_dax, /* Deprecated options */ Opt_alloc_start, @@ -393,6 +394,7 @@ static const match_table_t tokens = { {Opt_notreelog, "notreelog"}, {Opt_usebackuproot, "usebackuproot"}, {Opt_user_subvol_rm_allowed, "user_subvol_rm_allowed"}, + {Opt_dax, "dax"}, /* Deprecated options */ {Opt_alloc_start, "alloc_start=%s"}, @@ -745,6 +747,28 @@ int btrfs_parse_options(struct btrfs_fs_info *info, char *options, case Opt_user_subvol_rm_allowed: btrfs_set_opt(info->mount_opt, USER_SUBVOL_RM_ALLOWED); break; + case Opt_dax: +#ifdef CONFIG_FS_DAX + if (btrfs_super_num_devices(info->super_copy) > 1) { + btrfs_info(info, + "dax not supported for multi-device btrfs partition\n"); + ret = -EOPNOTSUPP; + goto out; + } + btrfs_set_opt(info->mount_opt, DAX); + btrfs_warn(info, "DAX enabled. 
Warning: EXPERIMENTAL, use at your own risk\n"); + btrfs_set_and_info(info, NODATASUM, + "auto-setting nodatasum (dax)"); + btrfs_clear_opt(info->mount_opt, SPACE_CACHE); + btrfs_set_and_info(info, FREE_SPACE_TREE, + "auto-setting free space tree (dax)"); + break; +#else + btrfs_err(info, + "DAX option not supported\n"); + ret = -EINVAL; + goto out; +#endif case Opt_enospc_debug: btrfs_set_opt(info->mount_opt, ENOSPC_DEBUG); break; @@ -1335,6 +1359,8 @@ static int btrfs_show_options(struct seq_file *seq, struct dentry *dentry) seq_puts(seq, ",clear_cache"); if (btrfs_test_opt(info, USER_SUBVOL_RM_ALLOWED)) seq_puts(seq, ",user_subvol_rm_allowed"); + if (btrfs_test_opt(info, DAX)) + seq_puts(seq, ",dax"); if (btrfs_test_opt(info, ENOSPC_DEBUG)) seq_puts(seq, ",enospc_debug"); if (btrfs_test_opt(info, AUTO_DEFRAG)) From patchwork Tue Mar 26 19:02:48 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Goldwyn Rodrigues X-Patchwork-Id: 10872051 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 70E3617E6 for ; Tue, 26 Mar 2019 19:03:25 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 63DF328C19 for ; Tue, 26 Mar 2019 19:03:25 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id 588CC28C63; Tue, 26 Mar 2019 19:03:25 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-7.9 required=2.0 tests=BAYES_00,MAILING_LIST_MULTI, RCVD_IN_DNSWL_HI autolearn=unavailable version=3.3.1 Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 05BE128C19 for ; Tue, 26 Mar 2019 19:03:25 +0000 (UTC) Received: (majordomo@vger.kernel.org) by 
vger.kernel.org via listexpand id S1732272AbfCZTDX (ORCPT ); Tue, 26 Mar 2019 15:03:23 -0400 Received: from mx2.suse.de ([195.135.220.15]:36928 "EHLO mx1.suse.de" rhost-flags-OK-OK-OK-FAIL) by vger.kernel.org with ESMTP id S1726175AbfCZTDV (ORCPT ); Tue, 26 Mar 2019 15:03:21 -0400 X-Virus-Scanned: by amavisd-new at test-mx.suse.de Received: from relay2.suse.de (unknown [195.135.220.254]) by mx1.suse.de (Postfix) with ESMTP id 9384DAEE6; Tue, 26 Mar 2019 19:03:20 +0000 (UTC) From: Goldwyn Rodrigues To: linux-btrfs@vger.kernel.org Cc: linux-fsdevel@vger.kernel.org, Goldwyn Rodrigues Subject: [PATCH 02/15] btrfs: Carve btrfs_get_extent_map_write() out of btrfs_get_blocks_direct_write() Date: Tue, 26 Mar 2019 14:02:48 -0500 Message-Id: <20190326190301.32365-3-rgoldwyn@suse.de> X-Mailer: git-send-email 2.16.4 In-Reply-To: <20190326190301.32365-1-rgoldwyn@suse.de> References: <20190326190301.32365-1-rgoldwyn@suse.de> Sender: linux-fsdevel-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-fsdevel@vger.kernel.org X-Virus-Scanned: ClamAV using ClamSMTP

From: Goldwyn Rodrigues

Split the extent map lookup and allocation out of btrfs_get_blocks_direct_write() into btrfs_get_extent_map_write(). The new function does not depend on struct btrfs_dio_data and accepts a NULL buffer_head, which makes it independent of the Direct I/O code.
Signed-off-by: Goldwyn Rodrigues --- fs/btrfs/ctree.h | 2 ++ fs/btrfs/inode.c | 40 +++++++++++++++++++++++++++------------- 2 files changed, 29 insertions(+), 13 deletions(-) diff --git a/fs/btrfs/ctree.h b/fs/btrfs/ctree.h index 8ca1c0d120f4..9512f49262dd 100644 --- a/fs/btrfs/ctree.h +++ b/fs/btrfs/ctree.h @@ -3277,6 +3277,8 @@ struct inode *btrfs_iget_path(struct super_block *s, struct btrfs_key *location, struct btrfs_path *path); struct inode *btrfs_iget(struct super_block *s, struct btrfs_key *location, struct btrfs_root *root, int *was_new); +int btrfs_get_extent_map_write(struct extent_map **map, struct buffer_head *bh, + struct inode *inode, u64 start, u64 len); struct extent_map *btrfs_get_extent(struct btrfs_inode *inode, struct page *page, size_t pg_offset, u64 start, u64 end, int create); diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c index 82fdda8ff5ab..80184d0c3b52 100644 --- a/fs/btrfs/inode.c +++ b/fs/btrfs/inode.c @@ -7496,11 +7496,10 @@ static int btrfs_get_blocks_direct_read(struct extent_map *em, return 0; } -static int btrfs_get_blocks_direct_write(struct extent_map **map, - struct buffer_head *bh_result, - struct inode *inode, - struct btrfs_dio_data *dio_data, - u64 start, u64 len) +int btrfs_get_extent_map_write(struct extent_map **map, + struct buffer_head *bh, + struct inode *inode, + u64 start, u64 len) { struct btrfs_fs_info *fs_info = btrfs_sb(inode->i_sb); struct extent_map *em = *map; @@ -7554,22 +7553,38 @@ static int btrfs_get_blocks_direct_write(struct extent_map **map, */ btrfs_free_reserved_data_space_noquota(inode, start, len); - goto skip_cow; + /* skip COW */ + goto out; } } /* this will cow the extent */ - len = bh_result->b_size; + if (bh) + len = bh->b_size; free_extent_map(em); *map = em = btrfs_new_extent_direct(inode, start, len); - if (IS_ERR(em)) { - ret = PTR_ERR(em); - goto out; - } + if (IS_ERR(em)) + return PTR_ERR(em); +out: + return ret; +} +static int btrfs_get_blocks_direct_write(struct extent_map **map, + 
struct buffer_head *bh_result, + struct inode *inode, + struct btrfs_dio_data *dio_data, + u64 start, u64 len) +{ + int ret = 0; + struct extent_map *em; + + ret = btrfs_get_extent_map_write(map, bh_result, inode, + start, len); + if (ret < 0) + return ret; + em = *map; len = min(len, em->len - (start - em->start)); -skip_cow: bh_result->b_blocknr = (em->block_start + (start - em->start)) >> inode->i_blkbits; bh_result->b_size = len; @@ -7590,7 +7605,6 @@ static int btrfs_get_blocks_direct_write(struct extent_map **map, dio_data->reserve -= len; dio_data->unsubmitted_oe_range_end = start + len; current->journal_info = dio_data; -out: return ret; } From patchwork Tue Mar 26 19:02:49 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Goldwyn Rodrigues X-Patchwork-Id: 10872055 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 6F2F4139A for ; Tue, 26 Mar 2019 19:03:27 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 5F91028C19 for ; Tue, 26 Mar 2019 19:03:27 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id 50E3128C9D; Tue, 26 Mar 2019 19:03:27 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-7.9 required=2.0 tests=BAYES_00,MAILING_LIST_MULTI, RCVD_IN_DNSWL_HI autolearn=unavailable version=3.3.1 Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id D9B2A28C19 for ; Tue, 26 Mar 2019 19:03:26 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1732490AbfCZTDZ (ORCPT ); Tue, 26 Mar 2019 15:03:25 -0400 Received: from mx2.suse.de ([195.135.220.15]:36940 "EHLO mx1.suse.de" 
rhost-flags-OK-OK-OK-FAIL) by vger.kernel.org with ESMTP id S1732204AbfCZTDY (ORCPT ); Tue, 26 Mar 2019 15:03:24 -0400 X-Virus-Scanned: by amavisd-new at test-mx.suse.de Received: from relay2.suse.de (unknown [195.135.220.254]) by mx1.suse.de (Postfix) with ESMTP id A3231ADF2; Tue, 26 Mar 2019 19:03:22 +0000 (UTC) From: Goldwyn Rodrigues To: linux-btrfs@vger.kernel.org Cc: linux-fsdevel@vger.kernel.org, Goldwyn Rodrigues Subject: [PATCH 03/15] btrfs: basic dax read Date: Tue, 26 Mar 2019 14:02:49 -0500 Message-Id: <20190326190301.32365-4-rgoldwyn@suse.de> X-Mailer: git-send-email 2.16.4 In-Reply-To: <20190326190301.32365-1-rgoldwyn@suse.de> References: <20190326190301.32365-1-rgoldwyn@suse.de> Sender: linux-fsdevel-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-fsdevel@vger.kernel.org X-Virus-Scanned: ClamAV using ClamSMTP

From: Goldwyn Rodrigues

Perform a basic read using iomap support. btrfs_iomap_begin() finds the extent at the given position and fills in the iomap data structure from it.
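The extent-to-iomap translation described above can be illustrated with a small userspace sketch. All names here (struct sketch_extent, struct sketch_iomap, SKETCH_EXTENT_HOLE and both functions) are hypothetical and only mirror the fields the patch touches. Unlike the patch, the sketch also records offset and length for a hole; that is a choice made for the example, not a statement about what the kernel requires.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical sentinel marking an extent with no backing blocks. */
#define SKETCH_EXTENT_HOLE UINT64_MAX

enum sketch_iomap_type { SKETCH_IOMAP_HOLE, SKETCH_IOMAP_MAPPED };

/* Hypothetical subset of an extent map: file range and device address. */
struct sketch_extent {
	uint64_t start;		/* file offset where the extent begins */
	uint64_t len;		/* extent length in bytes */
	uint64_t block_start;	/* device byte address, or SKETCH_EXTENT_HOLE */
};

/* Hypothetical subset of struct iomap. */
struct sketch_iomap {
	enum sketch_iomap_type type;
	uint64_t offset;	/* file offset of the mapping */
	uint64_t length;	/* length of the mapping in bytes */
	uint64_t addr;		/* device byte address of the mapping */
};

/* Describe the extent as a mapping, or as a hole if it has no blocks. */
void sketch_fill_iomap(const struct sketch_extent *em,
		       struct sketch_iomap *iomap)
{
	assert(em != NULL && iomap != NULL);

	iomap->offset = em->start;
	iomap->length = em->len;
	if (em->block_start == SKETCH_EXTENT_HOLE) {
		iomap->type = SKETCH_IOMAP_HOLE;
		iomap->addr = 0;
		return;
	}
	iomap->type = SKETCH_IOMAP_MAPPED;
	iomap->addr = em->block_start;
}

/* Device address of file position pos, which must lie inside the mapping. */
uint64_t sketch_pos_to_addr(const struct sketch_iomap *iomap, uint64_t pos)
{
	assert(pos >= iomap->offset && pos - iomap->offset < iomap->length);
	return iomap->addr + (pos - iomap->offset);
}
```

Because the mapping starts at the extent start rather than at the requested position, a reader has to add the distance between the two before touching the device, which is what sketch_pos_to_addr() does.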
Signed-off-by: Goldwyn Rodrigues --- fs/btrfs/Makefile | 1 + fs/btrfs/ctree.h | 5 +++++ fs/btrfs/dax.c | 49 +++++++++++++++++++++++++++++++++++++++++++++++++ fs/btrfs/file.c | 12 +++++++++++- 4 files changed, 66 insertions(+), 1 deletion(-) create mode 100644 fs/btrfs/dax.c diff --git a/fs/btrfs/Makefile b/fs/btrfs/Makefile index ca693dd554e9..1fa77b875ae9 100644 --- a/fs/btrfs/Makefile +++ b/fs/btrfs/Makefile @@ -12,6 +12,7 @@ btrfs-y += super.o ctree.o extent-tree.o print-tree.o root-tree.o dir-item.o \ reada.o backref.o ulist.o qgroup.o send.o dev-replace.o raid56.o \ uuid-tree.o props.o free-space-tree.o tree-checker.o +btrfs-$(CONFIG_FS_DAX) += dax.o btrfs-$(CONFIG_BTRFS_FS_POSIX_ACL) += acl.o btrfs-$(CONFIG_BTRFS_FS_CHECK_INTEGRITY) += check-integrity.o btrfs-$(CONFIG_BTRFS_FS_REF_VERIFY) += ref-verify.o diff --git a/fs/btrfs/ctree.h b/fs/btrfs/ctree.h index 9512f49262dd..b7bbe5130a3b 100644 --- a/fs/btrfs/ctree.h +++ b/fs/btrfs/ctree.h @@ -3795,6 +3795,11 @@ int btrfs_reada_wait(void *handle); void btrfs_reada_detach(void *handle); int btree_readahead_hook(struct extent_buffer *eb, int err); +#ifdef CONFIG_FS_DAX +/* dax.c */ +ssize_t btrfs_file_dax_read(struct kiocb *iocb, struct iov_iter *to); +#endif /* CONFIG_FS_DAX */ + static inline int is_fstree(u64 rootid) { if (rootid == BTRFS_FS_TREE_OBJECTID || diff --git a/fs/btrfs/dax.c b/fs/btrfs/dax.c new file mode 100644 index 000000000000..bf3d46b0acb6 --- /dev/null +++ b/fs/btrfs/dax.c @@ -0,0 +1,49 @@ +// SPDX-License-Identifier: GPL-2.0 +/* + * DAX support for BTRFS + * + * Copyright (c) 2019 SUSE Linux + * Author: Goldwyn Rodrigues + */ + +#ifdef CONFIG_FS_DAX +#include +#include +#include "ctree.h" +#include "btrfs_inode.h" + +static int btrfs_iomap_begin(struct inode *inode, loff_t pos, + loff_t length, unsigned flags, struct iomap *iomap) +{ + struct extent_map *em; + struct btrfs_fs_info *fs_info = btrfs_sb(inode->i_sb); + em = btrfs_get_extent(BTRFS_I(inode), NULL, 0, pos, length, 0); + if 
(em->block_start == EXTENT_MAP_HOLE) { + iomap->type = IOMAP_HOLE; + return 0; + } + iomap->type = IOMAP_MAPPED; + iomap->bdev = em->bdev; + iomap->dax_dev = fs_info->dax_dev; + iomap->offset = em->start; + iomap->length = em->len; + iomap->addr = em->block_start; + return 0; +} + +static const struct iomap_ops btrfs_iomap_ops = { + .iomap_begin = btrfs_iomap_begin, +}; + +ssize_t btrfs_file_dax_read(struct kiocb *iocb, struct iov_iter *to) +{ + ssize_t ret; + struct inode *inode = file_inode(iocb->ki_filp); + + inode_lock_shared(inode); + ret = dax_iomap_rw(iocb, to, &btrfs_iomap_ops); + inode_unlock_shared(inode); + + return ret; +} +#endif /* CONFIG_FS_DAX */ diff --git a/fs/btrfs/file.c b/fs/btrfs/file.c index 34fe8a58b0e9..b620f4e718b2 100644 --- a/fs/btrfs/file.c +++ b/fs/btrfs/file.c @@ -3288,9 +3288,19 @@ static int btrfs_file_open(struct inode *inode, struct file *filp) return generic_file_open(inode, filp); } +static ssize_t btrfs_file_read_iter(struct kiocb *iocb, struct iov_iter *to) +{ +#ifdef CONFIG_FS_DAX + struct inode *inode = file_inode(iocb->ki_filp); + if (IS_DAX(file_inode(iocb->ki_filp))) + return btrfs_file_dax_read(iocb, to); +#endif + return generic_file_read_iter(iocb, to); +} + const struct file_operations btrfs_file_operations = { .llseek = btrfs_file_llseek, - .read_iter = generic_file_read_iter, + .read_iter = btrfs_file_read_iter, .splice_read = generic_file_splice_read, .write_iter = btrfs_file_write_iter, .mmap = btrfs_file_mmap, From patchwork Tue Mar 26 19:02:50 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Goldwyn Rodrigues X-Patchwork-Id: 10872059 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 4EFAA139A for ; Tue, 26 Mar 2019 19:03:28 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by 
mail.wl.linuxfoundation.org (Postfix) with ESMTP id 406A128C19 for ; Tue, 26 Mar 2019 19:03:28 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id 34B1728C63; Tue, 26 Mar 2019 19:03:28 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-7.9 required=2.0 tests=BAYES_00,MAILING_LIST_MULTI, RCVD_IN_DNSWL_HI autolearn=unavailable version=3.3.1 Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id D089628C19 for ; Tue, 26 Mar 2019 19:03:27 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1732560AbfCZTD0 (ORCPT ); Tue, 26 Mar 2019 15:03:26 -0400 Received: from mx2.suse.de ([195.135.220.15]:36958 "EHLO mx1.suse.de" rhost-flags-OK-OK-OK-FAIL) by vger.kernel.org with ESMTP id S1732504AbfCZTD0 (ORCPT ); Tue, 26 Mar 2019 15:03:26 -0400 X-Virus-Scanned: by amavisd-new at test-mx.suse.de Received: from relay2.suse.de (unknown [195.135.220.254]) by mx1.suse.de (Postfix) with ESMTP id CD4D2AEE6; Tue, 26 Mar 2019 19:03:24 +0000 (UTC) From: Goldwyn Rodrigues To: linux-btrfs@vger.kernel.org Cc: linux-fsdevel@vger.kernel.org, Goldwyn Rodrigues Subject: [PATCH 04/15] dax: Introduce IOMAP_F_COW for copy-on-write Date: Tue, 26 Mar 2019 14:02:50 -0500 Message-Id: <20190326190301.32365-5-rgoldwyn@suse.de> X-Mailer: git-send-email 2.16.4 In-Reply-To: <20190326190301.32365-1-rgoldwyn@suse.de> References: <20190326190301.32365-1-rgoldwyn@suse.de> Sender: linux-fsdevel-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-fsdevel@vger.kernel.org X-Virus-Scanned: ClamAV using ClamSMTP

From: Goldwyn Rodrigues

IOMAP_F_COW is a flag that tells dax to copy data from iomap->cow_addr to iomap->addr when the start or end of the I/O is not page aligned.
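As a worked example of the alignment rule above, the following userspace sketch computes which bytes around a write would have to be copied from the old location. The names and the fixed 4096-byte page size are assumptions made for illustration; this is not the dax_iomap_actor() code.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed page size for the example; the kernel uses PAGE_SIZE. */
#define SKETCH_PAGE_SIZE 4096ull

/* Hypothetical result type: the two partial-page ranges around a write. */
struct sketch_cow_ranges {
	uint64_t head_start;	/* start of the page containing pos */
	uint64_t head_len;	/* bytes between the page start and pos */
	uint64_t tail_start;	/* first byte after the write */
	uint64_t tail_len;	/* bytes from the write end to the page end */
};

/*
 * For a write of length bytes at file offset pos, work out the bytes that
 * share a page with the write but are not overwritten by it. Those are the
 * bytes a CoW write has to bring over from the old copy of the data.
 */
struct sketch_cow_ranges sketch_unaligned_ranges(uint64_t pos, uint64_t length)
{
	struct sketch_cow_ranges r;
	uint64_t end = pos + length;
	uint64_t page_end;

	assert(length > 0);

	/* Round pos down and end up to page boundaries. */
	r.head_start = pos & ~(SKETCH_PAGE_SIZE - 1);
	r.head_len = pos - r.head_start;

	page_end = (end + SKETCH_PAGE_SIZE - 1) & ~(SKETCH_PAGE_SIZE - 1);
	r.tail_start = end;
	r.tail_len = page_end - end;
	return r;
}
```

For a 100-byte write at offset 5000, the head covers bytes 4096 to 4999 and the tail covers bytes 5100 to 8191. A write that starts and ends on page boundaries yields two empty ranges, so nothing needs copying.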
This also introduces dax_to_dax_copy() which performs a copy from one part of the device to another, to a maximum of one page. Question: Using iomap.cow_addr == 0 means the CoW is to be copied (or memset) from a hole. Would this be better handled through a flag? Signed-off-by: Goldwyn Rodrigues --- fs/dax.c | 36 ++++++++++++++++++++++++++++++++++++ include/linux/iomap.h | 3 +++ 2 files changed, 39 insertions(+) diff --git a/fs/dax.c b/fs/dax.c index ca0671d55aa6..e254535dd830 100644 --- a/fs/dax.c +++ b/fs/dax.c @@ -1051,6 +1051,28 @@ static bool dax_range_is_aligned(struct block_device *bdev, return true; } +static void dax_to_dax_copy(struct iomap *iomap, loff_t pos, void *daddr, + size_t len) +{ + loff_t blk_start, blk_pg; + void *saddr; + ssize_t map_len; + + /* A zero address is a hole. */ + if (iomap->cow_addr == 0) { + memset(daddr, 0, len); + return; + } + + blk_start = iomap->cow_addr + pos - iomap->cow_pos; + blk_pg = round_down(blk_start, PAGE_SIZE); + + map_len = dax_direct_access(iomap->dax_dev, PHYS_PFN(blk_pg), PAGE_SIZE, + &saddr, NULL); + saddr += blk_start - blk_pg; + memcpy(daddr, saddr, len); +} + int __dax_zero_page_range(struct block_device *bdev, struct dax_device *dax_dev, sector_t sector, unsigned int offset, unsigned int size) @@ -1143,6 +1165,20 @@ dax_iomap_actor(struct inode *inode, loff_t pos, loff_t length, void *data, break; } + if (iomap->flags & IOMAP_F_COW) { + loff_t pg_end = round_up(end, PAGE_SIZE); + /* + * Copy the first part of the page + * Note: we pass offset as length + */ + if (offset) + dax_to_dax_copy(iomap, pos - offset, kaddr, offset); + + /* Copy the last part of the range */ + if (end < pg_end) + dax_to_dax_copy(iomap, end, kaddr + offset + length, pg_end - end); + } + map_len = PFN_PHYS(map_len); kaddr += offset; map_len -= offset; diff --git a/include/linux/iomap.h b/include/linux/iomap.h index 0fefb5455bda..391785de1428 100644 --- a/include/linux/iomap.h +++ b/include/linux/iomap.h @@ -35,6 +35,7 @@ struct 
vm_fault; #define IOMAP_F_NEW 0x01 /* blocks have been newly allocated */ #define IOMAP_F_DIRTY 0x02 /* uncommitted metadata */ #define IOMAP_F_BUFFER_HEAD 0x04 /* file system requires buffer heads */ +#define IOMAP_F_COW 0x08 /* cow before write */ /* * Flags that only need to be reported for IOMAP_REPORT requests: @@ -59,6 +60,8 @@ struct iomap { u64 length; /* length of mapping, bytes */ u16 type; /* type of mapping */ u16 flags; /* flags for mapping */ + u64 cow_addr; /* read address to perform CoW */ + loff_t cow_pos; /* file offset of cow_addr */ struct block_device *bdev; /* block device for I/O */ struct dax_device *dax_dev; /* dax_dev for dax operations */ void *inline_data; From patchwork Tue Mar 26 19:02:51 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Goldwyn Rodrigues X-Patchwork-Id: 10872063 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id B54C3925 for ; Tue, 26 Mar 2019 19:03:31 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id A703128C19 for ; Tue, 26 Mar 2019 19:03:31 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id 9B45228C63; Tue, 26 Mar 2019 19:03:31 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-7.9 required=2.0 tests=BAYES_00,MAILING_LIST_MULTI, RCVD_IN_DNSWL_HI autolearn=unavailable version=3.3.1 Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 48C9D28C19 for ; Tue, 26 Mar 2019 19:03:31 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1732595AbfCZTD3 (ORCPT ); Tue, 26 Mar 2019 15:03:29 -0400 Received: from mx2.suse.de 
([195.135.220.15]:36964 "EHLO mx1.suse.de" rhost-flags-OK-OK-OK-FAIL) by vger.kernel.org with ESMTP id S1732504AbfCZTD2 (ORCPT ); Tue, 26 Mar 2019 15:03:28 -0400 X-Virus-Scanned: by amavisd-new at test-mx.suse.de Received: from relay2.suse.de (unknown [195.135.220.254]) by mx1.suse.de (Postfix) with ESMTP id 09B7AADF2; Tue, 26 Mar 2019 19:03:27 +0000 (UTC) From: Goldwyn Rodrigues To: linux-btrfs@vger.kernel.org Cc: linux-fsdevel@vger.kernel.org, Goldwyn Rodrigues Subject: [PATCH 05/15] btrfs: return whether extent is nocow or not Date: Tue, 26 Mar 2019 14:02:51 -0500 Message-Id: <20190326190301.32365-6-rgoldwyn@suse.de> X-Mailer: git-send-email 2.16.4 In-Reply-To: <20190326190301.32365-1-rgoldwyn@suse.de> References: <20190326190301.32365-1-rgoldwyn@suse.de> Sender: linux-fsdevel-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-fsdevel@vger.kernel.org X-Virus-Scanned: ClamAV using ClamSMTP

From: Goldwyn Rodrigues

Add an optional nocow output argument to btrfs_get_extent_map_write() that reports whether the write can go to the existing extent without CoW. This is required to set the IOMAP_F_COW flag in the iomap structure in later patches.
Signed-off-by: Goldwyn Rodrigues --- fs/btrfs/ctree.h | 2 +- fs/btrfs/inode.c | 9 +++++++-- 2 files changed, 8 insertions(+), 3 deletions(-) diff --git a/fs/btrfs/ctree.h b/fs/btrfs/ctree.h index b7bbe5130a3b..2c49d3c46170 100644 --- a/fs/btrfs/ctree.h +++ b/fs/btrfs/ctree.h @@ -3278,7 +3278,7 @@ struct inode *btrfs_iget_path(struct super_block *s, struct btrfs_key *location, struct inode *btrfs_iget(struct super_block *s, struct btrfs_key *location, struct btrfs_root *root, int *was_new); int btrfs_get_extent_map_write(struct extent_map **map, struct buffer_head *bh, - struct inode *inode, u64 start, u64 len); + struct inode *inode, u64 start, u64 len, int *nocow); struct extent_map *btrfs_get_extent(struct btrfs_inode *inode, struct page *page, size_t pg_offset, u64 start, u64 end, int create); diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c index 80184d0c3b52..c8702e0b5e66 100644 --- a/fs/btrfs/inode.c +++ b/fs/btrfs/inode.c @@ -7499,12 +7499,15 @@ static int btrfs_get_blocks_direct_read(struct extent_map *em, int btrfs_get_extent_map_write(struct extent_map **map, struct buffer_head *bh, struct inode *inode, - u64 start, u64 len) + u64 start, u64 len, int *nocow) { struct btrfs_fs_info *fs_info = btrfs_sb(inode->i_sb); struct extent_map *em = *map; int ret = 0; + if (nocow) + *nocow = 0; + /* * We don't allocate a new extent in the following cases * @@ -7553,6 +7556,8 @@ int btrfs_get_extent_map_write(struct extent_map **map, */ btrfs_free_reserved_data_space_noquota(inode, start, len); + if (nocow) + *nocow = 1; /* skip COW */ goto out; } @@ -7579,7 +7584,7 @@ static int btrfs_get_blocks_direct_write(struct extent_map **map, struct extent_map *em; ret = btrfs_get_extent_map_write(map, bh_result, inode, - start, len); + start, len, NULL); if (ret < 0) return ret; em = *map; From patchwork Tue Mar 26 19:02:52 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Goldwyn Rodrigues X-Patchwork-Id: 
10872067 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 9712B17E6 for ; Tue, 26 Mar 2019 19:03:33 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 89B6E28C19 for ; Tue, 26 Mar 2019 19:03:33 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id 7DFEF28C63; Tue, 26 Mar 2019 19:03:33 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-7.9 required=2.0 tests=BAYES_00,MAILING_LIST_MULTI, RCVD_IN_DNSWL_HI autolearn=unavailable version=3.3.1 Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 25D9B28C58 for ; Tue, 26 Mar 2019 19:03:33 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1732619AbfCZTDb (ORCPT ); Tue, 26 Mar 2019 15:03:31 -0400 Received: from mx2.suse.de ([195.135.220.15]:36986 "EHLO mx1.suse.de" rhost-flags-OK-OK-OK-FAIL) by vger.kernel.org with ESMTP id S1732504AbfCZTDb (ORCPT ); Tue, 26 Mar 2019 15:03:31 -0400 X-Virus-Scanned: by amavisd-new at test-mx.suse.de Received: from relay2.suse.de (unknown [195.135.220.254]) by mx1.suse.de (Postfix) with ESMTP id 13F27ADF2; Tue, 26 Mar 2019 19:03:29 +0000 (UTC) From: Goldwyn Rodrigues To: linux-btrfs@vger.kernel.org Cc: linux-fsdevel@vger.kernel.org, Goldwyn Rodrigues Subject: [PATCH 06/15] btrfs: Rename __endio_write_update_ordered() to btrfs_update_ordered_extent() Date: Tue, 26 Mar 2019 14:02:52 -0500 Message-Id: <20190326190301.32365-7-rgoldwyn@suse.de> X-Mailer: git-send-email 2.16.4 In-Reply-To: <20190326190301.32365-1-rgoldwyn@suse.de> References: <20190326190301.32365-1-rgoldwyn@suse.de> Sender: linux-fsdevel-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: 
linux-fsdevel@vger.kernel.org X-Virus-Scanned: ClamAV using ClamSMTP From: Goldwyn Rodrigues Since we will be using it in another part of the code, use a better name to declare it non-static Signed-off-by: Goldwyn Rodrigues --- fs/btrfs/ctree.h | 7 +++++-- fs/btrfs/inode.c | 14 +++++--------- 2 files changed, 10 insertions(+), 11 deletions(-) diff --git a/fs/btrfs/ctree.h b/fs/btrfs/ctree.h index 2c49d3c46170..a3543a4a063d 100644 --- a/fs/btrfs/ctree.h +++ b/fs/btrfs/ctree.h @@ -3280,8 +3280,11 @@ struct inode *btrfs_iget(struct super_block *s, struct btrfs_key *location, int btrfs_get_extent_map_write(struct extent_map **map, struct buffer_head *bh, struct inode *inode, u64 start, u64 len, int *nocow); struct extent_map *btrfs_get_extent(struct btrfs_inode *inode, - struct page *page, size_t pg_offset, - u64 start, u64 end, int create); + struct page *page, size_t pg_offset, + u64 start, u64 end, int create); +void btrfs_update_ordered_extent(struct inode *inode, + const u64 offset, const u64 bytes, + const bool uptodate); int btrfs_update_inode(struct btrfs_trans_handle *trans, struct btrfs_root *root, struct inode *inode); diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c index c8702e0b5e66..f721fc1e3f7f 100644 --- a/fs/btrfs/inode.c +++ b/fs/btrfs/inode.c @@ -98,10 +98,6 @@ static struct extent_map *create_io_em(struct inode *inode, u64 start, u64 len, u64 ram_bytes, int compress_type, int type); -static void __endio_write_update_ordered(struct inode *inode, - const u64 offset, const u64 bytes, - const bool uptodate); - /* * Cleanup all submitted ordered extents in specified range to handle errors * from the btrfs_run_delalloc_range() callback. 
@@ -142,7 +138,7 @@ static inline void btrfs_cleanup_ordered_extents(struct inode *inode, bytes -= PAGE_SIZE; } - return __endio_write_update_ordered(inode, offset, bytes, false); + return btrfs_update_ordered_extent(inode, offset, bytes, false); } static int btrfs_dirty_inode(struct inode *inode); @@ -8085,7 +8081,7 @@ static void btrfs_endio_direct_read(struct bio *bio) bio_put(bio); } -static void __endio_write_update_ordered(struct inode *inode, +void btrfs_update_ordered_extent(struct inode *inode, const u64 offset, const u64 bytes, const bool uptodate) { @@ -8138,7 +8134,7 @@ static void btrfs_endio_direct_write(struct bio *bio) struct btrfs_dio_private *dip = bio->bi_private; struct bio *dio_bio = dip->dio_bio; - __endio_write_update_ordered(dip->inode, dip->logical_offset, + btrfs_update_ordered_extent(dip->inode, dip->logical_offset, dip->bytes, !bio->bi_status); kfree(dip); @@ -8457,7 +8453,7 @@ static void btrfs_submit_direct(struct bio *dio_bio, struct inode *inode, bio = NULL; } else { if (write) - __endio_write_update_ordered(inode, + btrfs_update_ordered_extent(inode, file_offset, dio_bio->bi_iter.bi_size, false); @@ -8597,7 +8593,7 @@ static ssize_t btrfs_direct_IO(struct kiocb *iocb, struct iov_iter *iter) */ if (dio_data.unsubmitted_oe_range_start < dio_data.unsubmitted_oe_range_end) - __endio_write_update_ordered(inode, + btrfs_update_ordered_extent(inode, dio_data.unsubmitted_oe_range_start, dio_data.unsubmitted_oe_range_end - dio_data.unsubmitted_oe_range_start, From patchwork Tue Mar 26 19:02:53 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Goldwyn Rodrigues X-Patchwork-Id: 10872071 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 2573F17E6 for ; Tue, 26 Mar 2019 19:03:38 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) 
by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 182DA28C19 for ; Tue, 26 Mar 2019 19:03:38 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id 0BC1F28CFF; Tue, 26 Mar 2019 19:03:38 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-7.9 required=2.0 tests=BAYES_00,MAILING_LIST_MULTI, RCVD_IN_DNSWL_HI autolearn=unavailable version=3.3.1 Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 8C25028C58 for ; Tue, 26 Mar 2019 19:03:37 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1732688AbfCZTDg (ORCPT ); Tue, 26 Mar 2019 15:03:36 -0400 Received: from mx2.suse.de ([195.135.220.15]:36998 "EHLO mx1.suse.de" rhost-flags-OK-OK-OK-FAIL) by vger.kernel.org with ESMTP id S1732636AbfCZTDd (ORCPT ); Tue, 26 Mar 2019 15:03:33 -0400 X-Virus-Scanned: by amavisd-new at test-mx.suse.de Received: from relay2.suse.de (unknown [195.135.220.254]) by mx1.suse.de (Postfix) with ESMTP id 255BDAEE6; Tue, 26 Mar 2019 19:03:31 +0000 (UTC) From: Goldwyn Rodrigues To: linux-btrfs@vger.kernel.org Cc: linux-fsdevel@vger.kernel.org, Goldwyn Rodrigues Subject: [PATCH 07/15] btrfs: add dax write support Date: Tue, 26 Mar 2019 14:02:53 -0500 Message-Id: <20190326190301.32365-8-rgoldwyn@suse.de> X-Mailer: git-send-email 2.16.4 In-Reply-To: <20190326190301.32365-1-rgoldwyn@suse.de> References: <20190326190301.32365-1-rgoldwyn@suse.de> Sender: linux-fsdevel-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-fsdevel@vger.kernel.org X-Virus-Scanned: ClamAV using ClamSMTP From: Goldwyn Rodrigues IOMAP_F_COW informs the dax code that, for writes which are not page-aligned, it must first copy the existing data before performing the write.
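As an illustration only, the copy-before-write idea for an unaligned write can be sketched in userspace C. This is not the kernel code from this series: sketch_cow_write() and SKETCH_PAGE_SIZE are hypothetical names, and plain memory buffers stand in for the old and the newly allocated page.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define SKETCH_PAGE_SIZE 4096u	/* assumed page size for this sketch */

/*
 * Hypothetical helper, not part of the patch: write len bytes from data
 * at byte offset off of a freshly allocated page new_page, preserving the
 * bytes of old_page that the write does not cover.
 */
static void sketch_cow_write(unsigned char *new_page,
			     const unsigned char *old_page,
			     size_t off, const void *data, size_t len)
{
	/* The write must fit inside a single page. */
	assert(off <= SKETCH_PAGE_SIZE && len <= SKETCH_PAGE_SIZE - off);

	/* Copy the head [0, off) that precedes the write. */
	memcpy(new_page, old_page, off);
	/* Copy the tail [off + len, page size) that follows the write. */
	memcpy(new_page + off + len, old_page + off + len,
	       SKETCH_PAGE_SIZE - (off + len));
	/* Only now place the new data; old_page is left untouched. */
	memcpy(new_page + off, data, len);
}
```

A page-aligned, page-sized write makes both edge copies empty, which is why only the unaligned case needs the extra copy.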
A new struct btrfs_iomap is passed from iomap_begin() to iomap_end(), which contains all the accounting and locking information for CoW based writes. For writing to a hole, iomap->cow_addr is set to zero. Would this be better handled by a flag or can a valid filesystem block be at offset zero of the device? Signed-off-by: Goldwyn Rodrigues --- fs/btrfs/ctree.h | 6 +++ fs/btrfs/dax.c | 119 +++++++++++++++++++++++++++++++++++++++++++++++++++++-- fs/btrfs/file.c | 4 +- 3 files changed, 124 insertions(+), 5 deletions(-) diff --git a/fs/btrfs/ctree.h b/fs/btrfs/ctree.h index a3543a4a063d..3bcd2a4959c1 100644 --- a/fs/btrfs/ctree.h +++ b/fs/btrfs/ctree.h @@ -3801,6 +3801,12 @@ int btree_readahead_hook(struct extent_buffer *eb, int err); #ifdef CONFIG_FS_DAX /* dax.c */ ssize_t btrfs_file_dax_read(struct kiocb *iocb, struct iov_iter *to); +ssize_t btrfs_file_dax_write(struct kiocb *iocb, struct iov_iter *from); +#else +static inline ssize_t btrfs_file_dax_write(struct kiocb *iocb, struct iov_iter *from) +{ + return 0; +} #endif /* CONFIG_FS_DAX */ static inline int is_fstree(u64 rootid) diff --git a/fs/btrfs/dax.c b/fs/btrfs/dax.c index bf3d46b0acb6..49619fe3f94f 100644 --- a/fs/btrfs/dax.c +++ b/fs/btrfs/dax.c @@ -9,30 +9,124 @@ #ifdef CONFIG_FS_DAX #include #include +#include #include "ctree.h" #include "btrfs_inode.h" +struct btrfs_iomap { + u64 start; + u64 end; + int nocow; + struct extent_changeset *data_reserved; + struct extent_state *cached_state; +}; + static int btrfs_iomap_begin(struct inode *inode, loff_t pos, loff_t length, unsigned flags, struct iomap *iomap) { struct extent_map *em; struct btrfs_fs_info *fs_info = btrfs_sb(inode->i_sb); + em = btrfs_get_extent(BTRFS_I(inode), NULL, 0, pos, length, 0); + + if (flags & IOMAP_WRITE) { + int ret = 0, nocow; + struct extent_map *map = em; + struct btrfs_iomap *bi; + + bi = kzalloc(sizeof(struct btrfs_iomap), GFP_NOFS); + if (!bi) + return -ENOMEM; + + bi->start = round_down(pos, PAGE_SIZE); + bi->end = 
round_up(pos + length, PAGE_SIZE); + + iomap->private = bi; + + /* Wait for existing ordered extents in range to finish */ + btrfs_wait_ordered_range(inode, bi->start, bi->end - bi->start); + + lock_extent_bits(&BTRFS_I(inode)->io_tree, bi->start, bi->end, &bi->cached_state); + + ret = btrfs_delalloc_reserve_space(inode, &bi->data_reserved, + bi->start, bi->end - bi->start); + if (ret) { + unlock_extent_cached(&BTRFS_I(inode)->io_tree, bi->start, bi->end, + &bi->cached_state); + kfree(bi); + return ret; + } + + refcount_inc(&map->refs); + ret = btrfs_get_extent_map_write(&em, NULL, + inode, bi->start, bi->end - bi->start, &nocow); + if (ret) { + unlock_extent_cached(&BTRFS_I(inode)->io_tree, bi->start, bi->end, + &bi->cached_state); + btrfs_delalloc_release_space(inode, + bi->data_reserved, bi->start, + bi->end - bi->start, true); + extent_changeset_free(bi->data_reserved); + kfree(bi); + return ret; + } + if (!nocow) { + iomap->flags |= IOMAP_F_COW; + if (map->block_start != EXTENT_MAP_HOLE) { + iomap->cow_addr = map->block_start; + iomap->cow_pos = map->start; + } + } else { + bi->nocow = 1; + } + free_extent_map(map); + } + + iomap->offset = em->start; + iomap->length = em->len; + iomap->bdev = em->bdev; + iomap->dax_dev = fs_info->dax_dev; + if (em->block_start == EXTENT_MAP_HOLE) { iomap->type = IOMAP_HOLE; return 0; } + iomap->type = IOMAP_MAPPED; - iomap->bdev = em->bdev; - iomap->dax_dev = fs_info->dax_dev; - iomap->offset = em->start; - iomap->length = em->len; iomap->addr = em->block_start; return 0; } +static int btrfs_iomap_end(struct inode *inode, loff_t pos, + loff_t length, ssize_t written, unsigned flags, + struct iomap *iomap) +{ + struct btrfs_iomap *bi = iomap->private; + u64 wend; + + if (!bi) + return 0; + + unlock_extent_cached(&BTRFS_I(inode)->io_tree, bi->start, bi->end, + &bi->cached_state); + + wend = round_up(pos + written, PAGE_SIZE); + if (wend < bi->end) { + btrfs_delalloc_release_space(inode, + bi->data_reserved, wend, + bi->end - 
wend, true); + } + + btrfs_update_ordered_extent(inode, bi->start, wend - bi->start, true); + btrfs_delalloc_release_extents(BTRFS_I(inode), wend - bi->start, false); + extent_changeset_free(bi->data_reserved); + kfree(bi); + return 0; +} + static const struct iomap_ops btrfs_iomap_ops = { .iomap_begin = btrfs_iomap_begin, + .iomap_end = btrfs_iomap_end, }; ssize_t btrfs_file_dax_read(struct kiocb *iocb, struct iov_iter *to) @@ -46,4 +140,21 @@ ssize_t btrfs_file_dax_read(struct kiocb *iocb, struct iov_iter *to) return ret; } + +ssize_t btrfs_file_dax_write(struct kiocb *iocb, struct iov_iter *iter) +{ + ssize_t ret = 0; + u64 pos = iocb->ki_pos; + struct inode *inode = file_inode(iocb->ki_filp); + + ret = dax_iomap_rw(iocb, iter, &btrfs_iomap_ops); + + if (ret > 0) { + pos += ret; + if (pos > i_size_read(inode)) + i_size_write(inode, pos); + iocb->ki_pos = pos; + } + return ret; +} #endif /* CONFIG_FS_DAX */ diff --git a/fs/btrfs/file.c b/fs/btrfs/file.c index b620f4e718b2..3b320d0ab495 100644 --- a/fs/btrfs/file.c +++ b/fs/btrfs/file.c @@ -1964,7 +1964,9 @@ static ssize_t btrfs_file_write_iter(struct kiocb *iocb, if (sync) atomic_inc(&BTRFS_I(inode)->sync_writers); - if (iocb->ki_flags & IOCB_DIRECT) { + if (IS_DAX(inode)) { + num_written = btrfs_file_dax_write(iocb, from); + } else if (iocb->ki_flags & IOCB_DIRECT) { num_written = __btrfs_direct_write(iocb, from); } else { num_written = btrfs_buffered_write(iocb, from); From patchwork Tue Mar 26 19:02:54 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Goldwyn Rodrigues X-Patchwork-Id: 10872079 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id C5437925 for ; Tue, 26 Mar 2019 19:03:41 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 
B7DB628C19 for ; Tue, 26 Mar 2019 19:03:41 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id ABF5128C71; Tue, 26 Mar 2019 19:03:41 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-7.9 required=2.0 tests=BAYES_00,MAILING_LIST_MULTI, RCVD_IN_DNSWL_HI autolearn=unavailable version=3.3.1 Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 6012B28C19 for ; Tue, 26 Mar 2019 19:03:41 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1732660AbfCZTDg (ORCPT ); Tue, 26 Mar 2019 15:03:36 -0400 Received: from mx2.suse.de ([195.135.220.15]:37008 "EHLO mx1.suse.de" rhost-flags-OK-OK-OK-FAIL) by vger.kernel.org with ESMTP id S1732504AbfCZTDe (ORCPT ); Tue, 26 Mar 2019 15:03:34 -0400 X-Virus-Scanned: by amavisd-new at test-mx.suse.de Received: from relay2.suse.de (unknown [195.135.220.254]) by mx1.suse.de (Postfix) with ESMTP id 4BBC0AF79; Tue, 26 Mar 2019 19:03:33 +0000 (UTC) From: Goldwyn Rodrigues To: linux-btrfs@vger.kernel.org Cc: linux-fsdevel@vger.kernel.org, Goldwyn Rodrigues Subject: [PATCH 08/15] dax: add dax_iomap_cow to copy a mmap page before writing Date: Tue, 26 Mar 2019 14:02:54 -0500 Message-Id: <20190326190301.32365-9-rgoldwyn@suse.de> X-Mailer: git-send-email 2.16.4 In-Reply-To: <20190326190301.32365-1-rgoldwyn@suse.de> References: <20190326190301.32365-1-rgoldwyn@suse.de> Sender: linux-fsdevel-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-fsdevel@vger.kernel.org X-Virus-Scanned: ClamAV using ClamSMTP From: Goldwyn Rodrigues dax_iomap_cow() copies a page before presenting it for mmap.
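A rough userspace sketch of such a whole-page copy follows. It is an illustration under stated assumptions, not a transcription of the kernel function: struct sketch_extent and sketch_cow_page() are hypothetical, a byte array stands in for the dax device, and the file-offset-to-device translation used here (addr + (page_pos - offset)) is this sketch's own assumption.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define SKETCH_PAGE_SIZE 4096u	/* assumed page size for this sketch */

/* Hypothetical stand-in for the few mapping fields this sketch needs. */
struct sketch_extent {
	uint64_t offset;	/* file offset where the extent starts */
	uint64_t addr;		/* byte address of the extent on the device */
	uint64_t length;	/* extent length in bytes */
};

/*
 * Copy the page of the source extent that contains file position pos to
 * dst_addr on the same (simulated) device. Returns the page-aligned file
 * position that was copied. Source and destination must not overlap.
 */
static uint64_t sketch_cow_page(unsigned char *dev,
				const struct sketch_extent *src,
				uint64_t pos, uint64_t dst_addr)
{
	/* Round the faulting position down to its page boundary. */
	uint64_t page_pos = pos & ~(uint64_t)(SKETCH_PAGE_SIZE - 1);
	uint64_t src_addr;

	/* The whole page has to lie inside the source extent. */
	assert(page_pos >= src->offset);
	assert(page_pos + SKETCH_PAGE_SIZE <= src->offset + src->length);

	/* Translate the file position into a device address. */
	src_addr = src->addr + (page_pos - src->offset);
	memcpy(dev + dst_addr, dev + src_addr, SKETCH_PAGE_SIZE);
	return page_pos;
}
```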
Signed-off-by: Goldwyn Rodrigues --- fs/dax.c | 33 ++++++++++++++++++++++++++++++++- 1 file changed, 32 insertions(+), 1 deletion(-) diff --git a/fs/dax.c b/fs/dax.c index e254535dd830..21ee3df6f02c 100644 --- a/fs/dax.c +++ b/fs/dax.c @@ -1269,6 +1269,33 @@ static bool dax_fault_is_synchronous(unsigned long flags, && (iomap->flags & IOMAP_F_DIRTY); } +static int dax_iomap_cow(struct iomap *iomap, loff_t pos, pfn_t *pfn) +{ + void *daddr; + pgoff_t pgoff; + long rc; + int id; + sector_t sector; + + pos = round_down(pos, PAGE_SIZE); + + sector = round_down(iomap->addr + iomap->offset - pos, PAGE_SIZE) >> 9; + rc = bdev_dax_pgoff(iomap->bdev, sector, PAGE_SIZE, &pgoff); + if (rc) + return rc; + + id = dax_read_lock(); + rc = dax_direct_access(iomap->dax_dev, pgoff, 1, &daddr, pfn); + if (rc < 0) + goto out; + + dax_to_dax_copy(iomap, pos, daddr, PAGE_SIZE); + +out: + dax_read_unlock(id); + return rc; +} + static vm_fault_t dax_iomap_pte_fault(struct vm_fault *vmf, pfn_t *pfnp, int *iomap_errp, const struct iomap_ops *ops) { @@ -1372,7 +1399,11 @@ static vm_fault_t dax_iomap_pte_fault(struct vm_fault *vmf, pfn_t *pfnp, count_memcg_event_mm(vma->vm_mm, PGMAJFAULT); major = VM_FAULT_MAJOR; } - error = dax_iomap_pfn(&iomap, pos, PAGE_SIZE, &pfn); + + if (iomap.flags & IOMAP_F_COW) + error = dax_iomap_cow(&iomap, pos, &pfn); + else + error = dax_iomap_pfn(&iomap, pos, PAGE_SIZE, &pfn); if (error < 0) goto error_finish_iomap; From patchwork Tue Mar 26 19:02:55 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Goldwyn Rodrigues X-Patchwork-Id: 10872077 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 3567917E6 for ; Tue, 26 Mar 2019 19:03:40 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 2804328C19 
for ; Tue, 26 Mar 2019 19:03:40 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id 1C38C28C58; Tue, 26 Mar 2019 19:03:40 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-7.9 required=2.0 tests=BAYES_00,MAILING_LIST_MULTI, RCVD_IN_DNSWL_HI autolearn=unavailable version=3.3.1 Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id A2EF128C71 for ; Tue, 26 Mar 2019 19:03:39 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1732697AbfCZTDi (ORCPT ); Tue, 26 Mar 2019 15:03:38 -0400 Received: from mx2.suse.de ([195.135.220.15]:37022 "EHLO mx1.suse.de" rhost-flags-OK-OK-OK-FAIL) by vger.kernel.org with ESMTP id S1732653AbfCZTDh (ORCPT ); Tue, 26 Mar 2019 15:03:37 -0400 X-Virus-Scanned: by amavisd-new at test-mx.suse.de Received: from relay2.suse.de (unknown [195.135.220.254]) by mx1.suse.de (Postfix) with ESMTP id A6441ADF2; Tue, 26 Mar 2019 19:03:35 +0000 (UTC) From: Goldwyn Rodrigues To: linux-btrfs@vger.kernel.org Cc: linux-fsdevel@vger.kernel.org, Goldwyn Rodrigues Subject: [PATCH 09/15] btrfs: add dax mmap support Date: Tue, 26 Mar 2019 14:02:55 -0500 Message-Id: <20190326190301.32365-10-rgoldwyn@suse.de> X-Mailer: git-send-email 2.16.4 In-Reply-To: <20190326190301.32365-1-rgoldwyn@suse.de> References: <20190326190301.32365-1-rgoldwyn@suse.de> Sender: linux-fsdevel-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-fsdevel@vger.kernel.org X-Virus-Scanned: ClamAV using ClamSMTP From: Goldwyn Rodrigues Add a new vm_operations struct, btrfs_dax_vm_ops, specifically for dax files. Since we will be removing (nulling) readpages/writepages for dax, return ENOEXEC only for non-dax files. dax_insert_entry() looks ugly. Do you think we should break it into dax_insert_cow_entry() and dax_insert_entry()?
Signed-off-by: Goldwyn Rodrigues --- fs/btrfs/ctree.h | 1 + fs/btrfs/dax.c | 11 +++++++++++ fs/btrfs/file.c | 18 ++++++++++++++++-- fs/dax.c | 17 ++++++++++------- 4 files changed, 38 insertions(+), 9 deletions(-) diff --git a/fs/btrfs/ctree.h b/fs/btrfs/ctree.h index 3bcd2a4959c1..0e5060933bde 100644 --- a/fs/btrfs/ctree.h +++ b/fs/btrfs/ctree.h @@ -3802,6 +3802,7 @@ int btree_readahead_hook(struct extent_buffer *eb, int err); /* dax.c */ ssize_t btrfs_file_dax_read(struct kiocb *iocb, struct iov_iter *to); ssize_t btrfs_file_dax_write(struct kiocb *iocb, struct iov_iter *from); +vm_fault_t btrfs_dax_fault(struct vm_fault *vmf); #else static inline ssize_t btrfs_file_dax_write(struct kiocb *iocb, struct iov_iter *from) { diff --git a/fs/btrfs/dax.c b/fs/btrfs/dax.c index 49619fe3f94f..927f962d1e88 100644 --- a/fs/btrfs/dax.c +++ b/fs/btrfs/dax.c @@ -157,4 +157,15 @@ ssize_t btrfs_file_dax_write(struct kiocb *iocb, struct iov_iter *iter) } return ret; } + +vm_fault_t btrfs_dax_fault(struct vm_fault *vmf) +{ + vm_fault_t ret; + pfn_t pfn; + ret = dax_iomap_fault(vmf, PE_SIZE_PTE, &pfn, NULL, &btrfs_iomap_ops); + if (ret & VM_FAULT_NEEDDSYNC) + ret = dax_finish_sync_fault(vmf, PE_SIZE_PTE, pfn); + + return ret; +} #endif /* CONFIG_FS_DAX */ diff --git a/fs/btrfs/file.c b/fs/btrfs/file.c index 3b320d0ab495..196c8f37ff9d 100644 --- a/fs/btrfs/file.c +++ b/fs/btrfs/file.c @@ -2214,15 +2214,29 @@ static const struct vm_operations_struct btrfs_file_vm_ops = { .page_mkwrite = btrfs_page_mkwrite, }; +#ifdef CONFIG_FS_DAX +static const struct vm_operations_struct btrfs_dax_vm_ops = { + .fault = btrfs_dax_fault, + .page_mkwrite = btrfs_dax_fault, + .pfn_mkwrite = btrfs_dax_fault, +}; +#else +#define btrfs_dax_vm_ops btrfs_file_vm_ops +#endif + static int btrfs_file_mmap(struct file *filp, struct vm_area_struct *vma) { struct address_space *mapping = filp->f_mapping; + struct inode *inode = file_inode(filp); - if (!mapping->a_ops->readpage) + if (!IS_DAX(inode) && 
!mapping->a_ops->readpage) return -ENOEXEC; file_accessed(filp); - vma->vm_ops = &btrfs_file_vm_ops; + if (IS_DAX(inode)) + vma->vm_ops = &btrfs_dax_vm_ops; + else + vma->vm_ops = &btrfs_file_vm_ops; return 0; } diff --git a/fs/dax.c b/fs/dax.c index 21ee3df6f02c..41061da42771 100644 --- a/fs/dax.c +++ b/fs/dax.c @@ -708,14 +708,15 @@ static int copy_user_dax(struct block_device *bdev, struct dax_device *dax_dev, */ static void *dax_insert_entry(struct xa_state *xas, struct address_space *mapping, struct vm_fault *vmf, - void *entry, pfn_t pfn, unsigned long flags, bool dirty) + void *entry, pfn_t pfn, unsigned long flags, bool dirty, + bool cow) { void *new_entry = dax_make_entry(pfn, flags); if (dirty) __mark_inode_dirty(mapping->host, I_DIRTY_PAGES); - if (dax_is_zero_entry(entry) && !(flags & DAX_ZERO_PAGE)) { + if (cow || (dax_is_zero_entry(entry) && !(flags & DAX_ZERO_PAGE))) { unsigned long index = xas->xa_index; /* we are replacing a zero page with block mapping */ if (dax_is_pmd_entry(entry)) @@ -732,7 +733,7 @@ static void *dax_insert_entry(struct xa_state *xas, dax_associate_entry(new_entry, mapping, vmf->vma, vmf->address); } - if (dax_is_zero_entry(entry) || dax_is_empty_entry(entry)) { + if (cow || dax_is_zero_entry(entry) || dax_is_empty_entry(entry)) { /* * Only swap our new entry into the page cache if the current * entry is a zero page or an empty entry. 
If a normal PTE or @@ -1031,7 +1032,7 @@ static vm_fault_t dax_load_hole(struct xa_state *xas, vm_fault_t ret; *entry = dax_insert_entry(xas, mapping, vmf, *entry, pfn, - DAX_ZERO_PAGE, false); + DAX_ZERO_PAGE, false, false); ret = vmf_insert_mixed(vmf->vma, vaddr, pfn); trace_dax_load_hole(inode, vmf, ret); @@ -1408,7 +1409,8 @@ static vm_fault_t dax_iomap_pte_fault(struct vm_fault *vmf, pfn_t *pfnp, goto error_finish_iomap; entry = dax_insert_entry(&xas, mapping, vmf, entry, pfn, - 0, write && !sync); + 0, write && !sync, + (iomap.flags & IOMAP_F_COW) != 0); /* * If we are doing synchronous page fault and inode needs fsync, @@ -1487,7 +1489,7 @@ static vm_fault_t dax_pmd_load_hole(struct xa_state *xas, struct vm_fault *vmf, pfn = page_to_pfn_t(zero_page); *entry = dax_insert_entry(xas, mapping, vmf, *entry, pfn, - DAX_PMD | DAX_ZERO_PAGE, false); + DAX_PMD | DAX_ZERO_PAGE, false, false); ptl = pmd_lock(vmf->vma->vm_mm, vmf->pmd); if (!pmd_none(*(vmf->pmd))) { @@ -1610,7 +1612,8 @@ static vm_fault_t dax_iomap_pmd_fault(struct vm_fault *vmf, pfn_t *pfnp, goto finish_iomap; entry = dax_insert_entry(&xas, mapping, vmf, entry, pfn, - DAX_PMD, write && !sync); + DAX_PMD, write && !sync, + false); /* * If we are doing synchronous page fault and inode needs fsync, From patchwork Tue Mar 26 19:02:56 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Goldwyn Rodrigues X-Patchwork-Id: 10872083 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 605A5925 for ; Tue, 26 Mar 2019 19:03:43 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 5211228C19 for ; Tue, 26 Mar 2019 19:03:43 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id 44BE528C63; Tue, 26 Mar 2019 19:03:43 +0000 
(UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-7.9 required=2.0 tests=BAYES_00,MAILING_LIST_MULTI, RCVD_IN_DNSWL_HI autolearn=unavailable version=3.3.1 Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 41C4328C19 for ; Tue, 26 Mar 2019 19:03:42 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1732711AbfCZTDk (ORCPT ); Tue, 26 Mar 2019 15:03:40 -0400 Received: from mx2.suse.de ([195.135.220.15]:37028 "EHLO mx1.suse.de" rhost-flags-OK-OK-OK-FAIL) by vger.kernel.org with ESMTP id S1732703AbfCZTDj (ORCPT ); Tue, 26 Mar 2019 15:03:39 -0400 X-Virus-Scanned: by amavisd-new at test-mx.suse.de Received: from relay2.suse.de (unknown [195.135.220.254]) by mx1.suse.de (Postfix) with ESMTP id BD853AEE6; Tue, 26 Mar 2019 19:03:37 +0000 (UTC) From: Goldwyn Rodrigues To: linux-btrfs@vger.kernel.org Cc: linux-fsdevel@vger.kernel.org, Goldwyn Rodrigues Subject: [PATCH 10/15] btrfs: Add dax specific address_space_operations Date: Tue, 26 Mar 2019 14:02:56 -0500 Message-Id: <20190326190301.32365-11-rgoldwyn@suse.de> X-Mailer: git-send-email 2.16.4 In-Reply-To: <20190326190301.32365-1-rgoldwyn@suse.de> References: <20190326190301.32365-1-rgoldwyn@suse.de> Sender: linux-fsdevel-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-fsdevel@vger.kernel.org X-Virus-Scanned: ClamAV using ClamSMTP From: Goldwyn Rodrigues Signed-off-by: Goldwyn Rodrigues --- fs/btrfs/inode.c | 34 +++++++++++++++++++++++++++++++--- 1 file changed, 31 insertions(+), 3 deletions(-) diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c index f721fc1e3f7f..21780ea14e5a 100644 --- a/fs/btrfs/inode.c +++ b/fs/btrfs/inode.c @@ -28,6 +28,7 @@ #include #include #include +#include #include #include "ctree.h" #include "disk-io.h" @@ -65,6 +66,7 @@ static const struct inode_operations btrfs_dir_ro_inode_operations; 
static const struct inode_operations btrfs_special_inode_operations; static const struct inode_operations btrfs_file_inode_operations; static const struct address_space_operations btrfs_aops; +static const struct address_space_operations btrfs_dax_aops; static const struct file_operations btrfs_dir_file_operations; static const struct extent_io_ops btrfs_extent_io_ops; @@ -3757,7 +3759,10 @@ static int btrfs_read_locked_inode(struct inode *inode, switch (inode->i_mode & S_IFMT) { case S_IFREG: - inode->i_mapping->a_ops = &btrfs_aops; + if (btrfs_test_opt(fs_info, DAX)) + inode->i_mapping->a_ops = &btrfs_dax_aops; + else + inode->i_mapping->a_ops = &btrfs_aops; BTRFS_I(inode)->io_tree.ops = &btrfs_extent_io_ops; inode->i_fop = &btrfs_file_operations; inode->i_op = &btrfs_file_inode_operations; @@ -3778,6 +3783,7 @@ static int btrfs_read_locked_inode(struct inode *inode, } btrfs_sync_inode_flags_to_i_flags(inode); + return 0; } @@ -6538,7 +6544,10 @@ static int btrfs_create(struct inode *dir, struct dentry *dentry, */ inode->i_fop = &btrfs_file_operations; inode->i_op = &btrfs_file_inode_operations; - inode->i_mapping->a_ops = &btrfs_aops; + if (IS_DAX(inode) && S_ISREG(mode)) + inode->i_mapping->a_ops = &btrfs_dax_aops; + else + inode->i_mapping->a_ops = &btrfs_aops; err = btrfs_init_inode_security(trans, inode, dir, &dentry->d_name); if (err) @@ -8665,6 +8674,15 @@ static int btrfs_writepages(struct address_space *mapping, return extent_writepages(mapping, wbc); } +static int btrfs_dax_writepages(struct address_space *mapping, + struct writeback_control *wbc) +{ + struct inode *inode = mapping->host; + struct btrfs_fs_info *fs_info = btrfs_sb(inode->i_sb); + return dax_writeback_mapping_range(mapping, fs_info->fs_devices->latest_bdev, + wbc); +} + static int btrfs_readpages(struct file *file, struct address_space *mapping, struct list_head *pages, unsigned nr_pages) @@ -10436,7 +10454,10 @@ static int btrfs_tmpfile(struct inode *dir, struct dentry *dentry, umode_t 
mode) inode->i_fop = &btrfs_file_operations; inode->i_op = &btrfs_file_inode_operations; - inode->i_mapping->a_ops = &btrfs_aops; + if (IS_DAX(inode)) + inode->i_mapping->a_ops = &btrfs_dax_aops; + else + inode->i_mapping->a_ops = &btrfs_aops; BTRFS_I(inode)->io_tree.ops = &btrfs_extent_io_ops; ret = btrfs_init_inode_security(trans, inode, dir, NULL); @@ -10892,6 +10913,13 @@ static const struct address_space_operations btrfs_aops = { .swap_deactivate = btrfs_swap_deactivate, }; +static const struct address_space_operations btrfs_dax_aops = { + .writepages = btrfs_dax_writepages, + .direct_IO = noop_direct_IO, + .set_page_dirty = noop_set_page_dirty, + .invalidatepage = noop_invalidatepage, +}; + static const struct inode_operations btrfs_file_inode_operations = { .getattr = btrfs_getattr, .setattr = btrfs_setattr, From patchwork Tue Mar 26 19:02:57 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Goldwyn Rodrigues X-Patchwork-Id: 10872085 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id BE4DD139A for ; Tue, 26 Mar 2019 19:03:46 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id AA38B289FC for ; Tue, 26 Mar 2019 19:03:46 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id 9EB2628C63; Tue, 26 Mar 2019 19:03:46 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-7.9 required=2.0 tests=BAYES_00,MAILING_LIST_MULTI, RCVD_IN_DNSWL_HI autolearn=ham version=3.3.1 Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id EAB9A289FC for ; Tue, 26 Mar 2019 19:03:45 +0000 (UTC) Received: (majordomo@vger.kernel.org) by 
vger.kernel.org via listexpand id S1732716AbfCZTDp (ORCPT ); Tue, 26 Mar 2019 15:03:45 -0400 Received: from mx2.suse.de ([195.135.220.15]:37066 "EHLO mx1.suse.de" rhost-flags-OK-OK-OK-FAIL) by vger.kernel.org with ESMTP id S1732707AbfCZTDl (ORCPT ); Tue, 26 Mar 2019 15:03:41 -0400 X-Virus-Scanned: by amavisd-new at test-mx.suse.de Received: from relay2.suse.de (unknown [195.135.220.254]) by mx1.suse.de (Postfix) with ESMTP id 17E1AADF2; Tue, 26 Mar 2019 19:03:40 +0000 (UTC) From: Goldwyn Rodrigues To: linux-btrfs@vger.kernel.org Cc: linux-fsdevel@vger.kernel.org, Goldwyn Rodrigues Subject: [PATCH 11/15] fs: dedup file range to use a compare function Date: Tue, 26 Mar 2019 14:02:57 -0500 Message-Id: <20190326190301.32365-12-rgoldwyn@suse.de> X-Mailer: git-send-email 2.16.4 In-Reply-To: <20190326190301.32365-1-rgoldwyn@suse.de> References: <20190326190301.32365-1-rgoldwyn@suse.de> Sender: linux-fsdevel-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-fsdevel@vger.kernel.org X-Virus-Scanned: ClamAV using ClamSMTP From: Goldwyn Rodrigues With dax we cannot deal with readpage() etc. So, we create a function callback to perform the file data comparison and pass it to generic_remap_file_range_prep() so it can use iomap-based functions. This may not be the best way to solve this. Suggestions welcome.
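The callback-with-fallback pattern described above can be sketched in userspace C as follows. This is only an analogue under stated assumptions: sketch_compare_t, sketch_default_compare(), sketch_dedupe_prep() and SKETCH_EMISMATCH are hypothetical names, byte buffers stand in for inodes, and memcmp() stands in for the page-cache based default comparison.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define SKETCH_EMISMATCH (-1)	/* hypothetical "ranges differ" result */

/* Hypothetical userspace analogue of a range-compare callback type. */
typedef int (*sketch_compare_t)(const unsigned char *src, size_t srcoff,
				const unsigned char *dest, size_t destoff,
				size_t len, bool *is_same);

/* Default comparison, used when the caller supplies no callback. */
static int sketch_default_compare(const unsigned char *src, size_t srcoff,
				  const unsigned char *dest, size_t destoff,
				  size_t len, bool *is_same)
{
	assert(src && dest && is_same);
	*is_same = memcmp(src + srcoff, dest + destoff, len) == 0;
	return 0;
}

/*
 * Dedupe preparation step: use the caller's comparison if one is given,
 * otherwise fall back to the default. Returns 0 if the ranges match,
 * SKETCH_EMISMATCH if they differ, or the callback's error code.
 */
static int sketch_dedupe_prep(const unsigned char *src, size_t srcoff,
			      const unsigned char *dest, size_t destoff,
			      size_t len, sketch_compare_t compare)
{
	bool is_same = false;
	int ret;

	if (!compare)
		compare = sketch_default_compare;

	ret = compare(src, srcoff, dest, destoff, len, &is_same);
	if (ret)
		return ret;
	return is_same ? 0 : SKETCH_EMISMATCH;
}
```

Passing NULL keeps existing callers on the default path, so only a filesystem that needs a different comparison has to supply a callback.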
Signed-off-by: Goldwyn Rodrigues --- fs/btrfs/ctree.h | 3 +++ fs/btrfs/dax.c | 7 +++++++ fs/btrfs/ioctl.c | 13 +++++++++++- fs/dax.c | 58 ++++++++++++++++++++++++++++++++++++++++++++++++++++ fs/ocfs2/file.c | 2 +- fs/read_write.c | 9 +++++--- fs/xfs/xfs_reflink.c | 2 +- include/linux/dax.h | 2 ++ include/linux/fs.h | 4 +++- 9 files changed, 93 insertions(+), 7 deletions(-) diff --git a/fs/btrfs/ctree.h b/fs/btrfs/ctree.h index 0e5060933bde..750f9c70fabe 100644 --- a/fs/btrfs/ctree.h +++ b/fs/btrfs/ctree.h @@ -3803,6 +3803,9 @@ int btree_readahead_hook(struct extent_buffer *eb, int err); ssize_t btrfs_file_dax_read(struct kiocb *iocb, struct iov_iter *to); ssize_t btrfs_file_dax_write(struct kiocb *iocb, struct iov_iter *from); vm_fault_t btrfs_dax_fault(struct vm_fault *vmf); +int btrfs_dax_file_range_compare(struct inode *src, loff_t srcoff, + struct inode *dest, loff_t destoff, loff_t len, + bool *is_same); #else static inline ssize_t btrfs_file_dax_write(struct kiocb *iocb, struct iov_iter *from) { diff --git a/fs/btrfs/dax.c b/fs/btrfs/dax.c index 927f962d1e88..9488cae0f8b4 100644 --- a/fs/btrfs/dax.c +++ b/fs/btrfs/dax.c @@ -168,4 +168,11 @@ vm_fault_t btrfs_dax_fault(struct vm_fault *vmf) return ret; } + +int btrfs_dax_file_range_compare(struct inode *src, loff_t srcoff, + struct inode *dest, loff_t destoff, loff_t len, + bool *is_same) +{ + return dax_file_range_compare(src, srcoff, dest, destoff, len, is_same, &btrfs_iomap_ops); +} #endif /* CONFIG_FS_DAX */ diff --git a/fs/btrfs/ioctl.c b/fs/btrfs/ioctl.c index e66426e7692d..2e5137b01561 100644 --- a/fs/btrfs/ioctl.c +++ b/fs/btrfs/ioctl.c @@ -3990,8 +3990,19 @@ static int btrfs_remap_file_range_prep(struct file *file_in, loff_t pos_in, if (ret < 0) goto out_unlock; +#ifdef CONFIG_FS_DAX + if (IS_DAX(file_inode(file_in)) && IS_DAX(file_inode(file_out))) + ret = generic_remap_file_range_prep(file_in, pos_in, file_out, + pos_out, len, remap_flags, + btrfs_dax_file_range_compare); + else + ret = 
generic_remap_file_range_prep(file_in, pos_in, file_out, + pos_out, len, remap_flags, NULL); +#else ret = generic_remap_file_range_prep(file_in, pos_in, file_out, pos_out, - len, remap_flags); + len, remap_flags, NULL); +#endif + if (ret < 0 || *len == 0) goto out_unlock; diff --git a/fs/dax.c b/fs/dax.c index 41061da42771..18998c5ee27a 100644 --- a/fs/dax.c +++ b/fs/dax.c @@ -1775,3 +1775,61 @@ vm_fault_t dax_finish_sync_fault(struct vm_fault *vmf, return dax_insert_pfn_mkwrite(vmf, pfn, order); } EXPORT_SYMBOL_GPL(dax_finish_sync_fault); + + +int dax_file_range_compare(struct inode *src, loff_t srcoff, struct inode *dest, + loff_t destoff, loff_t len, bool *is_same, const struct iomap_ops *ops) +{ + void *saddr, *daddr; + struct iomap s_iomap = {0}; + struct iomap d_iomap = {0}; + loff_t dstart, sstart; + bool same = true; + loff_t cmp_len, l; + int id, ret = 0; + + id = dax_read_lock(); + while (len) { + ret = ops->iomap_begin(src, srcoff, len, 0, &s_iomap); + if (ret < 0) { + if (ops->iomap_end) + ops->iomap_end(src, srcoff, len, ret, 0, &s_iomap); + return ret; + } + cmp_len = len; + if (cmp_len > s_iomap.offset + s_iomap.length - srcoff) + cmp_len = s_iomap.offset + s_iomap.length - srcoff; + + ret = ops->iomap_begin(dest, destoff, cmp_len, 0, &d_iomap); + if (ret < 0) { + if (ops->iomap_end) { + ops->iomap_end(src, srcoff, len, ret, 0, &s_iomap); + ops->iomap_end(dest, destoff, len, ret, 0, &d_iomap); + } + return ret; + } + if (cmp_len > d_iomap.offset + d_iomap.length - destoff) + cmp_len = d_iomap.offset + d_iomap.length - destoff; + + + sstart = (get_start_sect(s_iomap.bdev) << 9) + s_iomap.addr + (srcoff - s_iomap.offset); + l = dax_direct_access(s_iomap.dax_dev, PHYS_PFN(sstart), PHYS_PFN(cmp_len), &saddr, NULL); + dstart = (get_start_sect(d_iomap.bdev) << 9) + d_iomap.addr + (destoff - d_iomap.offset); + l = dax_direct_access(d_iomap.dax_dev, PHYS_PFN(dstart), PHYS_PFN(cmp_len), &daddr, NULL); + same = !memcmp(saddr, daddr, cmp_len); + if (!same) + 
break; + len -= cmp_len; + srcoff += cmp_len; + destoff += cmp_len; + + if (ops->iomap_end) { + ret = ops->iomap_end(src, srcoff, len, 0, 0, &s_iomap); + ret = ops->iomap_end(dest, destoff, len, 0, 0, &d_iomap); + } + } + dax_read_unlock(id); + *is_same = same; + return ret; +} +EXPORT_SYMBOL_GPL(dax_file_range_compare); diff --git a/fs/ocfs2/file.c b/fs/ocfs2/file.c index d640c5f8a85d..6bf3e8fbb016 100644 --- a/fs/ocfs2/file.c +++ b/fs/ocfs2/file.c @@ -2558,7 +2558,7 @@ static loff_t ocfs2_remap_file_range(struct file *file_in, loff_t pos_in, goto out_unlock; ret = generic_remap_file_range_prep(file_in, pos_in, file_out, pos_out, - &len, remap_flags); + &len, remap_flags, NULL); if (ret < 0 || len == 0) goto out_unlock; diff --git a/fs/read_write.c b/fs/read_write.c index 177ccc3d405a..da521a221213 100644 --- a/fs/read_write.c +++ b/fs/read_write.c @@ -1855,7 +1855,7 @@ static int vfs_dedupe_file_range_compare(struct inode *src, loff_t srcoff, */ int generic_remap_file_range_prep(struct file *file_in, loff_t pos_in, struct file *file_out, loff_t pos_out, - loff_t *len, unsigned int remap_flags) + loff_t *len, unsigned int remap_flags, compare_range_t compare) { struct inode *inode_in = file_inode(file_in); struct inode *inode_out = file_inode(file_out); @@ -1914,8 +1914,11 @@ int generic_remap_file_range_prep(struct file *file_in, loff_t pos_in, */ if (remap_flags & REMAP_FILE_DEDUP) { bool is_same = false; - - ret = vfs_dedupe_file_range_compare(inode_in, pos_in, + if (compare) + ret = compare(inode_in, pos_in, + inode_out, pos_out, *len, &is_same); + else + ret = vfs_dedupe_file_range_compare(inode_in, pos_in, inode_out, pos_out, *len, &is_same); if (ret) return ret; diff --git a/fs/xfs/xfs_reflink.c b/fs/xfs/xfs_reflink.c index 680ae7662a78..8907c7aa3f19 100644 --- a/fs/xfs/xfs_reflink.c +++ b/fs/xfs/xfs_reflink.c @@ -1350,7 +1350,7 @@ xfs_reflink_remap_prep( goto out_unlock; ret = generic_remap_file_range_prep(file_in, pos_in, file_out, pos_out, - len, 
remap_flags); + len, remap_flags, NULL); if (ret < 0 || *len == 0) goto out_unlock; diff --git a/include/linux/dax.h b/include/linux/dax.h index 0dd316a74a29..a11bc7b1f526 100644 --- a/include/linux/dax.h +++ b/include/linux/dax.h @@ -157,6 +157,8 @@ vm_fault_t dax_finish_sync_fault(struct vm_fault *vmf, int dax_delete_mapping_entry(struct address_space *mapping, pgoff_t index); int dax_invalidate_mapping_entry_sync(struct address_space *mapping, pgoff_t index); +int dax_file_range_compare(struct inode *src, loff_t srcoff, struct inode *dest, + loff_t destoff, loff_t len, bool *is_same, const struct iomap_ops *ops); #ifdef CONFIG_FS_DAX int __dax_zero_page_range(struct block_device *bdev, diff --git a/include/linux/fs.h b/include/linux/fs.h index 8b42df09b04c..22fe4324b22e 100644 --- a/include/linux/fs.h +++ b/include/linux/fs.h @@ -1880,10 +1880,12 @@ extern ssize_t vfs_readv(struct file *, const struct iovec __user *, unsigned long, loff_t *, rwf_t); extern ssize_t vfs_copy_file_range(struct file *, loff_t , struct file *, loff_t, size_t, unsigned int); +typedef int (compare_range_t)(struct inode *src, loff_t srcpos, struct inode *dest, + loff_t destpos, loff_t len, bool *is_same); extern int generic_remap_file_range_prep(struct file *file_in, loff_t pos_in, struct file *file_out, loff_t pos_out, loff_t *count, - unsigned int remap_flags); + unsigned int remap_flags, compare_range_t cmp); extern loff_t do_clone_file_range(struct file *file_in, loff_t pos_in, struct file *file_out, loff_t pos_out, loff_t len, unsigned int remap_flags); From patchwork Tue Mar 26 19:02:58 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Goldwyn Rodrigues X-Patchwork-Id: 10872091 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 39F4E925 for ; Tue, 26 Mar 2019 19:03:48 +0000 (UTC) Received: 
from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 2CB0C289FC for ; Tue, 26 Mar 2019 19:03:48 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id 2131728C58; Tue, 26 Mar 2019 19:03:48 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-7.9 required=2.0 tests=BAYES_00,MAILING_LIST_MULTI, RCVD_IN_DNSWL_HI autolearn=unavailable version=3.3.1 Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id C2130289FC for ; Tue, 26 Mar 2019 19:03:47 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1732320AbfCZTDp (ORCPT ); Tue, 26 Mar 2019 15:03:45 -0400 Received: from mx2.suse.de ([195.135.220.15]:37094 "EHLO mx1.suse.de" rhost-flags-OK-OK-OK-FAIL) by vger.kernel.org with ESMTP id S1732034AbfCZTDn (ORCPT ); Tue, 26 Mar 2019 15:03:43 -0400 X-Virus-Scanned: by amavisd-new at test-mx.suse.de Received: from relay2.suse.de (unknown [195.135.220.254]) by mx1.suse.de (Postfix) with ESMTP id 44305AEE6; Tue, 26 Mar 2019 19:03:42 +0000 (UTC) From: Goldwyn Rodrigues To: linux-btrfs@vger.kernel.org Cc: linux-fsdevel@vger.kernel.org, Goldwyn Rodrigues Subject: [PATCH 12/15] btrfs: trace functions for btrfs_iomap_begin/end Date: Tue, 26 Mar 2019 14:02:58 -0500 Message-Id: <20190326190301.32365-13-rgoldwyn@suse.de> X-Mailer: git-send-email 2.16.4 In-Reply-To: <20190326190301.32365-1-rgoldwyn@suse.de> References: <20190326190301.32365-1-rgoldwyn@suse.de> Sender: linux-fsdevel-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-fsdevel@vger.kernel.org X-Virus-Scanned: ClamAV using ClamSMTP From: Goldwyn Rodrigues This is for debug purposes only and can be skipped. 
Signed-off-by: Goldwyn Rodrigues --- fs/btrfs/dax.c | 3 +++ include/trace/events/btrfs.h | 56 ++++++++++++++++++++++++++++++++++++++++++++ 2 files changed, 59 insertions(+) diff --git a/fs/btrfs/dax.c b/fs/btrfs/dax.c index 9488cae0f8b4..7900b5773829 100644 --- a/fs/btrfs/dax.c +++ b/fs/btrfs/dax.c @@ -27,6 +27,8 @@ static int btrfs_iomap_begin(struct inode *inode, loff_t pos, struct extent_map *em; struct btrfs_fs_info *fs_info = btrfs_sb(inode->i_sb); + trace_btrfs_iomap_begin(inode, pos, length, flags); + em = btrfs_get_extent(BTRFS_I(inode), NULL, 0, pos, length, 0); if (flags & IOMAP_WRITE) { @@ -103,6 +105,7 @@ static int btrfs_iomap_end(struct inode *inode, loff_t pos, { struct btrfs_iomap *bi = iomap->private; u64 wend; + trace_btrfs_iomap_end(inode, pos, length, written, flags); if (!bi) return 0; diff --git a/include/trace/events/btrfs.h b/include/trace/events/btrfs.h index ab1cc33adbac..8779e5789a7c 100644 --- a/include/trace/events/btrfs.h +++ b/include/trace/events/btrfs.h @@ -1850,6 +1850,62 @@ DEFINE_EVENT(btrfs__block_group, btrfs_skip_unused_block_group, TP_ARGS(bg_cache) ); +TRACE_EVENT(btrfs_iomap_begin, + + TP_PROTO(const struct inode *inode, loff_t pos, loff_t length, int flags), + + TP_ARGS(inode, pos, length, flags), + + TP_STRUCT__entry_btrfs( + __field( u64, ino ) + __field( u64, pos ) + __field( u64, length ) + __field( int, flags ) + ), + + TP_fast_assign_btrfs(btrfs_sb(inode->i_sb), + __entry->ino = btrfs_ino(BTRFS_I(inode)); + __entry->pos = pos; + __entry->length = length; + __entry->flags = flags; + ), + + TP_printk_btrfs("ino=%llu pos=%llu len=%llu flags=0x%x", + __entry->ino, + __entry->pos, + __entry->length, + __entry->flags) +); + +TRACE_EVENT(btrfs_iomap_end, + + TP_PROTO(const struct inode *inode, loff_t pos, loff_t length, loff_t written, int flags), + + TP_ARGS(inode, pos, length, written, flags), + + TP_STRUCT__entry_btrfs( + __field( u64, ino ) + __field( u64, pos ) + __field( u64, length ) + __field( u64, written ) + 
__field( int, flags ) + ), + + TP_fast_assign_btrfs(btrfs_sb(inode->i_sb), + __entry->ino = btrfs_ino(BTRFS_I(inode)); + __entry->pos = pos; + __entry->length = length; + __entry->written = written; + __entry->flags = flags; + ), + + TP_printk_btrfs("ino=%llu pos=%llu len=%llu written=%llu flags=0x%x", + __entry->ino, + __entry->pos, + __entry->length, + __entry->written, + __entry->flags) +); #endif /* _TRACE_BTRFS_H */ /* This part must be outside protection */ From patchwork Tue Mar 26 19:02:59 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Goldwyn Rodrigues X-Patchwork-Id: 10872095 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id DECB517E6 for ; Tue, 26 Mar 2019 19:03:49 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id D13F528C19 for ; Tue, 26 Mar 2019 19:03:49 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id C5CF828CFF; Tue, 26 Mar 2019 19:03:49 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-7.9 required=2.0 tests=BAYES_00,MAILING_LIST_MULTI, RCVD_IN_DNSWL_HI autolearn=unavailable version=3.3.1 Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 41EA028C19 for ; Tue, 26 Mar 2019 19:03:49 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1732731AbfCZTDr (ORCPT ); Tue, 26 Mar 2019 15:03:47 -0400 Received: from mx2.suse.de ([195.135.220.15]:37106 "EHLO mx1.suse.de" rhost-flags-OK-OK-OK-FAIL) by vger.kernel.org with ESMTP id S1732718AbfCZTDq (ORCPT ); Tue, 26 Mar 2019 15:03:46 -0400 X-Virus-Scanned: by amavisd-new at test-mx.suse.de Received: from 
relay2.suse.de (unknown [195.135.220.254]) by mx1.suse.de (Postfix) with ESMTP id 9BE3BADF2; Tue, 26 Mar 2019 19:03:44 +0000 (UTC) From: Goldwyn Rodrigues To: linux-btrfs@vger.kernel.org Cc: linux-fsdevel@vger.kernel.org, Goldwyn Rodrigues Subject: [PATCH 13/15] btrfs: handle dax page zeroing Date: Tue, 26 Mar 2019 14:02:59 -0500 Message-Id: <20190326190301.32365-14-rgoldwyn@suse.de> X-Mailer: git-send-email 2.16.4 In-Reply-To: <20190326190301.32365-1-rgoldwyn@suse.de> References: <20190326190301.32365-1-rgoldwyn@suse.de> Sender: linux-fsdevel-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-fsdevel@vger.kernel.org X-Virus-Scanned: ClamAV using ClamSMTP From: Goldwyn Rodrigues btrfs_dax_zero_block() zeroes part of a page: either the front of the page (from the start of the page up to "from") or, in the regular case, the rest of the page starting at "from" (or "len" bytes, if len is given). Signed-off-by: Goldwyn Rodrigues --- fs/btrfs/ctree.h | 1 + fs/btrfs/dax.c | 29 +++++++++++++++++++++++++++-- fs/btrfs/inode.c | 4 ++++ fs/dax.c | 17 ++++++++++++----- fs/iomap.c | 9 +-------- include/linux/dax.h | 11 +++++------ include/linux/iomap.h | 6 ++++++ 7 files changed, 56 insertions(+), 21 deletions(-) diff --git a/fs/btrfs/ctree.h b/fs/btrfs/ctree.h index 750f9c70fabe..21068dc4a95a 100644 --- a/fs/btrfs/ctree.h +++ b/fs/btrfs/ctree.h @@ -3806,6 +3806,7 @@ vm_fault_t btrfs_dax_fault(struct vm_fault *vmf); int btrfs_dax_file_range_compare(struct inode *src, loff_t srcoff, struct inode *dest, loff_t destoff, loff_t len, bool *is_same); +int btrfs_dax_zero_block(struct inode *inode, loff_t from, loff_t len, bool front); #else static inline ssize_t btrfs_file_dax_write(struct kiocb *iocb, struct iov_iter *from) { diff --git a/fs/btrfs/dax.c b/fs/btrfs/dax.c index 7900b5773829..d73945d50b88 100644 --- a/fs/btrfs/dax.c +++ b/fs/btrfs/dax.c @@ -31,7 +31,7 @@ static int btrfs_iomap_begin(struct inode *inode, loff_t pos, em = btrfs_get_extent(BTRFS_I(inode), NULL, 0, pos, length, 0); - if (flags & IOMAP_WRITE) { + if (flags & (IOMAP_WRITE | IOMAP_ZERO)) { int ret = 0, 
nocow; struct extent_map *map = em; struct btrfs_iomap *bi; @@ -89,7 +89,8 @@ static int btrfs_iomap_begin(struct inode *inode, loff_t pos, iomap->bdev = em->bdev; iomap->dax_dev = fs_info->dax_dev; - if (em->block_start == EXTENT_MAP_HOLE) { + if (em->block_start == EXTENT_MAP_HOLE || + em->flags == EXTENT_FLAG_FILLING) { iomap->type = IOMAP_HOLE; return 0; } @@ -178,4 +179,28 @@ int btrfs_dax_file_range_compare(struct inode *src, loff_t srcoff, { return dax_file_range_compare(src, srcoff, dest, destoff, len, is_same, &btrfs_iomap_ops); } + +/* + * zero a part of the page only. This should CoW (via iomap_begin) if required + */ +int btrfs_dax_zero_block(struct inode *inode, loff_t from, loff_t len, bool front) +{ + loff_t start = round_down(from, PAGE_SIZE); + loff_t end = round_up(from, PAGE_SIZE); + loff_t offset = from; + int ret = 0; + + if (front) { + len = from - start; + offset = start; + } else { + if (!len) + len = end - from; + } + + if (len) + ret = iomap_zero_range(inode, offset, len, NULL, &btrfs_iomap_ops); + + return (ret < 0) ? 
ret : 0; +} #endif /* CONFIG_FS_DAX */ diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c index 21780ea14e5a..5350e5f23728 100644 --- a/fs/btrfs/inode.c +++ b/fs/btrfs/inode.c @@ -4833,6 +4833,10 @@ int btrfs_truncate_block(struct inode *inode, loff_t from, loff_t len, (!len || IS_ALIGNED(len, blocksize))) goto out; +#ifdef CONFIG_FS_DAX + if (IS_DAX(inode)) + return btrfs_dax_zero_block(inode, from, len, front); +#endif block_start = round_down(from, blocksize); block_end = block_start + blocksize - 1; diff --git a/fs/dax.c b/fs/dax.c index 18998c5ee27a..93146142bb00 100644 --- a/fs/dax.c +++ b/fs/dax.c @@ -1068,17 +1068,21 @@ static void dax_to_dax_copy(struct iomap *iomap, loff_t pos, void *daddr, blk_start = iomap->cow_addr + pos - iomap->cow_pos; blk_pg = round_down(blk_start, PAGE_SIZE); - map_len = dax_direct_access(iomap->dax_dev, PHYS_PFN(blk_pg), PAGE_SIZE, + map_len = dax_direct_access(iomap->dax_dev, PHYS_PFN(blk_pg), 1, &saddr, NULL); saddr += blk_start - blk_pg; memcpy(daddr, saddr, len); } -int __dax_zero_page_range(struct block_device *bdev, - struct dax_device *dax_dev, sector_t sector, - unsigned int offset, unsigned int size) +int __dax_zero_page_range(struct iomap *iomap, loff_t pos, + unsigned int offset, unsigned int size) { - if (dax_range_is_aligned(bdev, offset, size)) { + sector_t sector = iomap_sector(iomap, pos & PAGE_MASK); + struct block_device *bdev = iomap->bdev; + struct dax_device *dax_dev = iomap->dax_dev; + + if (!(iomap->flags & IOMAP_F_COW) && + dax_range_is_aligned(bdev, offset, size)) { sector_t start_sector = sector + (offset >> 9); return blkdev_issue_zeroout(bdev, start_sector, @@ -1098,6 +1102,9 @@ int __dax_zero_page_range(struct block_device *bdev, dax_read_unlock(id); return rc; } + if (iomap->flags & IOMAP_F_COW) + dax_to_dax_copy(iomap, pos & PAGE_MASK, + kaddr, PAGE_SIZE); memset(kaddr + offset, 0, size); dax_flush(dax_dev, kaddr + offset, size); dax_read_unlock(id); diff --git a/fs/iomap.c b/fs/iomap.c index 
abdd18e404f8..90698c854883 100644 --- a/fs/iomap.c +++ b/fs/iomap.c @@ -98,12 +98,6 @@ iomap_apply(struct inode *inode, loff_t pos, loff_t length, unsigned flags, return written ? written : ret; } -static sector_t -iomap_sector(struct iomap *iomap, loff_t pos) -{ - return (iomap->addr + pos - iomap->offset) >> SECTOR_SHIFT; -} - static struct iomap_page * iomap_page_create(struct inode *inode, struct page *page) { @@ -990,8 +984,7 @@ static int iomap_zero(struct inode *inode, loff_t pos, unsigned offset, static int iomap_dax_zero(loff_t pos, unsigned offset, unsigned bytes, struct iomap *iomap) { - return __dax_zero_page_range(iomap->bdev, iomap->dax_dev, - iomap_sector(iomap, pos & PAGE_MASK), offset, bytes); + return __dax_zero_page_range(iomap, pos, offset, bytes); } static loff_t diff --git a/include/linux/dax.h b/include/linux/dax.h index a11bc7b1f526..892c478d7073 100644 --- a/include/linux/dax.h +++ b/include/linux/dax.h @@ -9,6 +9,7 @@ typedef unsigned long dax_entry_t; +struct iomap; struct iomap_ops; struct dax_device; struct dax_operations { @@ -161,13 +162,11 @@ int dax_file_range_compare(struct inode *src, loff_t srcoff, struct inode *dest, loff_t destoff, loff_t len, bool *is_same, const struct iomap_ops *ops); #ifdef CONFIG_FS_DAX -int __dax_zero_page_range(struct block_device *bdev, - struct dax_device *dax_dev, sector_t sector, - unsigned int offset, unsigned int length); +int __dax_zero_page_range(struct iomap *iomap, loff_t pos, + unsigned int offset, unsigned int size); #else -static inline int __dax_zero_page_range(struct block_device *bdev, - struct dax_device *dax_dev, sector_t sector, - unsigned int offset, unsigned int length) +static inline int __dax_zero_page_range(struct iomap *iomap, loff_t pos, + unsigned int offset, unsigned int size) { return -ENXIO; } diff --git a/include/linux/iomap.h b/include/linux/iomap.h index 391785de1428..e5a1b2a1962d 100644 --- a/include/linux/iomap.h +++ b/include/linux/iomap.h @@ -7,6 +7,7 @@ #include 
#include #include +#include struct address_space; struct fiemap_extent_info; @@ -122,6 +123,11 @@ static inline struct iomap_page *to_iomap_page(struct page *page) return NULL; } +static inline sector_t iomap_sector(struct iomap *iomap, loff_t pos) +{ + return (iomap->addr + pos - iomap->offset) >> SECTOR_SHIFT; +} + ssize_t iomap_file_buffered_write(struct kiocb *iocb, struct iov_iter *from, const struct iomap_ops *ops); int iomap_readpage(struct page *page, const struct iomap_ops *ops); From patchwork Tue Mar 26 19:03:00 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Goldwyn Rodrigues X-Patchwork-Id: 10872097 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 3EE2F18A6 for ; Tue, 26 Mar 2019 19:03:50 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 2AC01289FC for ; Tue, 26 Mar 2019 19:03:50 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id 1F20F28C19; Tue, 26 Mar 2019 19:03:50 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-7.9 required=2.0 tests=BAYES_00,MAILING_LIST_MULTI, RCVD_IN_DNSWL_HI autolearn=ham version=3.3.1 Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id C251228C9D for ; Tue, 26 Mar 2019 19:03:49 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1732737AbfCZTDs (ORCPT ); Tue, 26 Mar 2019 15:03:48 -0400 Received: from mx2.suse.de ([195.135.220.15]:37174 "EHLO mx1.suse.de" rhost-flags-OK-OK-OK-FAIL) by vger.kernel.org with ESMTP id S1732730AbfCZTDr (ORCPT ); Tue, 26 Mar 2019 15:03:47 -0400 X-Virus-Scanned: by amavisd-new at test-mx.suse.de 
Received: from relay2.suse.de (unknown [195.135.220.254]) by mx1.suse.de (Postfix) with ESMTP id 829A0AEE6; Tue, 26 Mar 2019 19:03:46 +0000 (UTC) From: Goldwyn Rodrigues To: linux-btrfs@vger.kernel.org Cc: linux-fsdevel@vger.kernel.org, Goldwyn Rodrigues Subject: [PATCH 14/15] btrfs: Disable dax-based defrag and send Date: Tue, 26 Mar 2019 14:03:00 -0500 Message-Id: <20190326190301.32365-15-rgoldwyn@suse.de> X-Mailer: git-send-email 2.16.4 In-Reply-To: <20190326190301.32365-1-rgoldwyn@suse.de> References: <20190326190301.32365-1-rgoldwyn@suse.de> Sender: linux-fsdevel-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-fsdevel@vger.kernel.org X-Virus-Scanned: ClamAV using ClamSMTP From: Goldwyn Rodrigues This is temporary, and a TODO. Signed-off-by: Goldwyn Rodrigues --- fs/btrfs/ioctl.c | 13 +++++++++++++ fs/btrfs/send.c | 4 ++++ 2 files changed, 17 insertions(+) diff --git a/fs/btrfs/ioctl.c b/fs/btrfs/ioctl.c index 2e5137b01561..f532a8df2026 100644 --- a/fs/btrfs/ioctl.c +++ b/fs/btrfs/ioctl.c @@ -2980,6 +2980,12 @@ static int btrfs_ioctl_defrag(struct file *file, void __user *argp) goto out; } + if (IS_DAX(inode)) { + btrfs_warn(root->fs_info, "File defrag is not supported with DAX"); + ret = -EOPNOTSUPP; + goto out; + } + if (argp) { if (copy_from_user(range, argp, sizeof(*range))) { @@ -4647,6 +4653,10 @@ static long btrfs_ioctl_balance(struct file *file, void __user *arg) if (!capable(CAP_SYS_ADMIN)) return -EPERM; + /* send can be on a directory, so check super block instead */ + if (btrfs_test_opt(fs_info, DAX)) + return -EOPNOTSUPP; + ret = mnt_want_write_file(file); if (ret) return ret; @@ -5499,6 +5509,9 @@ static int _btrfs_ioctl_send(struct file *file, void __user *argp, bool compat) struct btrfs_ioctl_send_args *arg; int ret; + if (IS_DAX(file_inode(file))) + return -EOPNOTSUPP; + if (compat) { #if defined(CONFIG_64BIT) && defined(CONFIG_COMPAT) struct btrfs_ioctl_send_args_32 args32; diff --git a/fs/btrfs/send.c b/fs/btrfs/send.c 
index 7ea2d6b1f170..9679fd54db86 100644 --- a/fs/btrfs/send.c +++ b/fs/btrfs/send.c @@ -6609,6 +6609,10 @@ long btrfs_ioctl_send(struct file *mnt_file, struct btrfs_ioctl_send_args *arg) int sort_clone_roots = 0; int index; + /* send can be on a directory, so check super block instead */ + if (btrfs_test_opt(fs_info, DAX)) + return -EOPNOTSUPP; + if (!capable(CAP_SYS_ADMIN)) return -EPERM; From patchwork Tue Mar 26 19:03:01 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Goldwyn Rodrigues X-Patchwork-Id: 10872103 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 3F39F139A for ; Tue, 26 Mar 2019 19:03:53 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 31986289FC for ; Tue, 26 Mar 2019 19:03:53 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id 25D6928C58; Tue, 26 Mar 2019 19:03:53 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-7.9 required=2.0 tests=BAYES_00,MAILING_LIST_MULTI, RCVD_IN_DNSWL_HI autolearn=unavailable version=3.3.1 Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id B3BF6289FC for ; Tue, 26 Mar 2019 19:03:52 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1732742AbfCZTDv (ORCPT ); Tue, 26 Mar 2019 15:03:51 -0400 Received: from mx2.suse.de ([195.135.220.15]:37192 "EHLO mx1.suse.de" rhost-flags-OK-OK-OK-FAIL) by vger.kernel.org with ESMTP id S1732739AbfCZTDu (ORCPT ); Tue, 26 Mar 2019 15:03:50 -0400 X-Virus-Scanned: by amavisd-new at test-mx.suse.de Received: from relay2.suse.de (unknown [195.135.220.254]) by mx1.suse.de (Postfix) with ESMTP id 
AE5DCADF2; Tue, 26 Mar 2019 19:03:48 +0000 (UTC) From: Goldwyn Rodrigues To: linux-btrfs@vger.kernel.org Cc: linux-fsdevel@vger.kernel.org, Goldwyn Rodrigues Subject: [PATCH 15/15] btrfs: Writeprotect mmap pages on snapshot Date: Tue, 26 Mar 2019 14:03:01 -0500 Message-Id: <20190326190301.32365-16-rgoldwyn@suse.de> X-Mailer: git-send-email 2.16.4 In-Reply-To: <20190326190301.32365-1-rgoldwyn@suse.de> References: <20190326190301.32365-1-rgoldwyn@suse.de> Sender: linux-fsdevel-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-fsdevel@vger.kernel.org X-Virus-Scanned: ClamAV using ClamSMTP From: Goldwyn Rodrigues In order to make sure mmap'd files don't change after a snapshot, write-protect the mmap'd pages on snapshot. This is done by performing a data writeback on the pages (which simply marks the pages as write-protected). This way, if the user process tries to write to the memory, we will get another fault and we can perform a CoW. To accomplish this, we tag all CoW pages as PAGECACHE_TAG_TOWRITE, and add the mmap'd inode to delalloc_inodes. A snapshot starts writeback of all delalloc'd inodes, and that is where we perform the data writeback. We don't want to keep the inodes in delalloc_inodes until umount (WARN_ON), so we remove them during inode eviction. This looks hackish. Other alternatives could be to create another list for mmap'd files or rename delalloc_inodes to writeback_inodes. Suggestions? 
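For reviewers who want the intended sequence in one place, here is a rough user-space model of the flow described above. It is only a sketch: struct toy_page, struct toy_inode and all toy_* functions are hypothetical names invented for illustration, not kernel code, and it assumes every write fault on a write-protected page results in a CoW.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define TOY_NPAGES 4

/* Hypothetical stand-in for one mmap'd DAX page; not a kernel type. */
struct toy_page {
	bool writable;   /* user mapping currently allows stores */
	bool towrite;    /* models PAGECACHE_TAG_TOWRITE */
	int  cow_count;  /* number of CoWs performed on this page */
};

/* Hypothetical stand-in for the inode and its delalloc_inodes membership. */
struct toy_inode {
	struct toy_page pages[TOY_NPAGES];
	bool in_delalloc_list;  /* models BTRFS_INODE_IN_DELALLOC_LIST */
};

/* Write fault: CoW the page, tag it, and queue the inode for writeback. */
static void toy_write_fault(struct toy_inode *inode, size_t idx)
{
	struct toy_page *pg = &inode->pages[idx];

	pg->cow_count++;
	pg->towrite = true;
	pg->writable = true;
	inode->in_delalloc_list = true;
}

/* User store: returns true if it had to take a write fault first. */
static bool toy_store(struct toy_inode *inode, size_t idx)
{
	if (inode->pages[idx].writable)
		return false;
	toy_write_fault(inode, idx);
	return true;
}

/* Data writeback: write-protect tagged pages, then drop the inode from the list. */
static void toy_writeback(struct toy_inode *inode)
{
	for (size_t i = 0; i < TOY_NPAGES; i++) {
		struct toy_page *pg = &inode->pages[i];

		if (pg->towrite) {
			pg->writable = false;
			pg->towrite = false;
		}
	}
	inode->in_delalloc_list = false;
}

/* Snapshot: start writeback on the inode if it is queued. */
static void toy_snapshot(struct toy_inode *inode)
{
	if (inode->in_delalloc_list)
		toy_writeback(inode);
}

/* Walk through the sequence from the description above. */
static void toy_demo(void)
{
	struct toy_inode inode = {0};

	assert(toy_store(&inode, 1));           /* first store faults and CoWs */
	assert(!toy_store(&inode, 1));          /* mapping is writable now, no fault */
	toy_snapshot(&inode);                   /* writeback write-protects the page */
	assert(!inode.pages[1].writable);
	assert(toy_store(&inode, 1));           /* store after snapshot faults again */
	assert(inode.pages[1].cow_count == 2);  /* and gets a second CoW */
}
```

Calling toy_demo() from a small main() exercises the whole sequence. The point of the model is that the snapshot never touches user mappings directly; it relies only on writeback clearing the writable state of tagged pages.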
Signed-off-by: Goldwyn Rodrigues --- fs/btrfs/ctree.h | 3 ++- fs/btrfs/dax.c | 7 +++++++ fs/btrfs/inode.c | 13 ++++++++++++- fs/dax.c | 3 +++ 4 files changed, 24 insertions(+), 2 deletions(-) diff --git a/fs/btrfs/ctree.h b/fs/btrfs/ctree.h index 21068dc4a95a..68a63d93556a 100644 --- a/fs/btrfs/ctree.h +++ b/fs/btrfs/ctree.h @@ -3252,7 +3252,8 @@ int btrfs_create_subvol_root(struct btrfs_trans_handle *trans, struct btrfs_root *new_root, struct btrfs_root *parent_root, u64 new_dirid); - void btrfs_set_delalloc_extent(struct inode *inode, struct extent_state *state, +void btrfs_add_delalloc_inodes(struct btrfs_root *root, struct inode *inode); +void btrfs_set_delalloc_extent(struct inode *inode, struct extent_state *state, unsigned *bits); void btrfs_clear_delalloc_extent(struct inode *inode, struct extent_state *state, unsigned *bits); diff --git a/fs/btrfs/dax.c b/fs/btrfs/dax.c index d73945d50b88..bcb961242c74 100644 --- a/fs/btrfs/dax.c +++ b/fs/btrfs/dax.c @@ -166,10 +166,17 @@ vm_fault_t btrfs_dax_fault(struct vm_fault *vmf) { vm_fault_t ret; pfn_t pfn; + struct inode *inode = file_inode(vmf->vma->vm_file); + struct btrfs_inode *binode = BTRFS_I(inode); ret = dax_iomap_fault(vmf, PE_SIZE_PTE, &pfn, NULL, &btrfs_iomap_ops); if (ret & VM_FAULT_NEEDDSYNC) ret = dax_finish_sync_fault(vmf, PE_SIZE_PTE, pfn); + /* Insert into delalloc so we get writeback calls on snapshots */ + if (vmf->flags & FAULT_FLAG_WRITE && + !test_bit(BTRFS_INODE_IN_DELALLOC_LIST, &binode->runtime_flags)) + btrfs_add_delalloc_inodes(binode->root, inode); + return ret; } diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c index 5350e5f23728..3b72c1c96b34 100644 --- a/fs/btrfs/inode.c +++ b/fs/btrfs/inode.c @@ -1713,7 +1713,7 @@ void btrfs_merge_delalloc_extent(struct inode *inode, struct extent_state *new, spin_unlock(&BTRFS_I(inode)->lock); } -static void btrfs_add_delalloc_inodes(struct btrfs_root *root, +void btrfs_add_delalloc_inodes(struct btrfs_root *root, struct inode *inode) { struct 
btrfs_fs_info *fs_info = btrfs_sb(inode->i_sb); @@ -5358,12 +5358,17 @@ void btrfs_evict_inode(struct inode *inode) { struct btrfs_fs_info *fs_info = btrfs_sb(inode->i_sb); struct btrfs_trans_handle *trans; + struct btrfs_inode *binode = BTRFS_I(inode); struct btrfs_root *root = BTRFS_I(inode)->root; struct btrfs_block_rsv *rsv; int ret; trace_btrfs_inode_evict(inode); + if (IS_DAX(inode) + && test_bit(BTRFS_INODE_IN_DELALLOC_LIST, &binode->runtime_flags)) + btrfs_del_delalloc_inode(root, binode); + if (!root) { clear_inode(inode); return; @@ -8683,6 +8688,10 @@ static int btrfs_dax_writepages(struct address_space *mapping, { struct inode *inode = mapping->host; struct btrfs_fs_info *fs_info = btrfs_sb(inode->i_sb); + struct btrfs_inode *binode = BTRFS_I(inode); + if ((wbc->sync_mode == WB_SYNC_ALL) && + test_bit(BTRFS_INODE_IN_DELALLOC_LIST, &binode->runtime_flags)) + btrfs_del_delalloc_inode(binode->root, binode); return dax_writeback_mapping_range(mapping, fs_info->fs_devices->latest_bdev, wbc); } @@ -9981,6 +9990,8 @@ static void btrfs_run_delalloc_work(struct btrfs_work *work) delalloc_work = container_of(work, struct btrfs_delalloc_work, work); inode = delalloc_work->inode; + if (IS_DAX(inode)) + filemap_fdatawrite(inode->i_mapping); filemap_flush(inode->i_mapping); if (test_bit(BTRFS_INODE_HAS_ASYNC_EXTENT, &BTRFS_I(inode)->runtime_flags)) diff --git a/fs/dax.c b/fs/dax.c index 93146142bb00..c42e9cb486ef 100644 --- a/fs/dax.c +++ b/fs/dax.c @@ -753,6 +753,9 @@ static void *dax_insert_entry(struct xa_state *xas, if (dirty) xas_set_mark(xas, PAGECACHE_TAG_DIRTY); + if (cow) + xas_set_mark(xas, PAGECACHE_TAG_TOWRITE); + xas_unlock_irq(xas); return entry; }