From patchwork Fri Aug 2 22:00:36 2019
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Goldwyn Rodrigues
X-Patchwork-Id: 11074151
Return-Path:
Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 319ED13B1 for ; Fri, 2 Aug 2019 22:00:59 +0000 (UTC)
Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 199B2286E0 for ; Fri, 2 Aug 2019 22:00:59 +0000 (UTC)
Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id 0E6D62871E; Fri, 2 Aug 2019 22:00:59 +0000 (UTC)
X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org
X-Spam-Level:
X-Spam-Status: No, score=-7.9 required=2.0 tests=BAYES_00,MAILING_LIST_MULTI, RCVD_IN_DNSWL_HI autolearn=ham version=3.3.1
Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 1611F286E0 for ; Fri, 2 Aug 2019 22:00:58 +0000 (UTC)
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S2395229AbfHBWA5 (ORCPT ); Fri, 2 Aug 2019 18:00:57 -0400
Received: from mx2.suse.de ([195.135.220.15]:37936 "EHLO mx1.suse.de" rhost-flags-OK-OK-OK-FAIL) by vger.kernel.org with ESMTP id S1730782AbfHBWA4 (ORCPT ); Fri, 2 Aug 2019 18:00:56 -0400
X-Virus-Scanned: by amavisd-new at test-mx.suse.de
Received: from relay2.suse.de (unknown [195.135.220.254]) by mx1.suse.de (Postfix) with ESMTP id 3637EB0CC; Fri, 2 Aug 2019 22:00:55 +0000 (UTC)
From: Goldwyn Rodrigues
To: linux-fsdevel@vger.kernel.org
Cc: linux-btrfs@vger.kernel.org, hch@lst.de, darrick.wong@oracle.com, ruansy.fnst@cn.fujitsu.com, Goldwyn Rodrigues
Subject: [PATCH 01/13] iomap: Use a IOMAP_COW/srcmap for a read-modify-write I/O
Date: Fri, 2 Aug 2019 17:00:36 -0500
Message-Id: <20190802220048.16142-2-rgoldwyn@suse.de>
X-Mailer: git-send-email 2.16.4
In-Reply-To: <20190802220048.16142-1-rgoldwyn@suse.de>
References: <20190802220048.16142-1-rgoldwyn@suse.de>
Sender: linux-btrfs-owner@vger.kernel.org
Precedence: bulk
List-ID:
X-Mailing-List: linux-btrfs@vger.kernel.org
X-Virus-Scanned: ClamAV using ClamSMTP

From: Goldwyn Rodrigues

Introduce a new mapping type, IOMAP_COW, which means the data at the
offset must be read from a srcmap and copied before the write is
performed at that offset.

The srcmap identifies where the read is to be performed from. It is
passed to ->iomap_begin() of the respective filesystem, which is
expected to fill in the details needed for reading before the copy for
CoW is performed.

Signed-off-by: Goldwyn Rodrigues
---
 fs/dax.c               |  8 +++++---
 fs/ext2/inode.c        |  2 +-
 fs/ext4/inode.c        |  2 +-
 fs/gfs2/bmap.c         |  3 ++-
 fs/iomap/apply.c       |  5 +++--
 fs/iomap/buffered-io.c | 14 +++++++-------
 fs/iomap/direct-io.c   |  2 +-
 fs/iomap/fiemap.c      |  4 ++--
 fs/iomap/seek.c        |  4 ++--
 fs/iomap/swapfile.c    |  3 ++-
 fs/xfs/xfs_iomap.c     |  9 ++++++---
 include/linux/iomap.h  |  6 ++++--
 12 files changed, 36 insertions(+), 26 deletions(-)

diff --git a/fs/dax.c b/fs/dax.c
index a237141d8787..b21d9a9cde2b 100644
--- a/fs/dax.c
+++ b/fs/dax.c
@@ -1090,7 +1090,7 @@ EXPORT_SYMBOL_GPL(__dax_zero_page_range);
 static loff_t
 dax_iomap_actor(struct inode *inode, loff_t pos, loff_t length, void *data,
-		struct iomap *iomap)
+		struct iomap *iomap, struct iomap *srcmap)
 {
 	struct block_device *bdev = iomap->bdev;
 	struct dax_device *dax_dev = iomap->dax_dev;
@@ -1248,6 +1248,7 @@ static vm_fault_t dax_iomap_pte_fault(struct vm_fault *vmf, pfn_t *pfnp,
 	unsigned long vaddr = vmf->address;
 	loff_t pos = (loff_t)vmf->pgoff << PAGE_SHIFT;
 	struct iomap iomap = { 0 };
+	struct iomap srcmap = { 0 };
 	unsigned flags = IOMAP_FAULT;
 	int error, major = 0;
 	bool write = vmf->flags & FAULT_FLAG_WRITE;
@@ -1292,7 +1293,7 @@ static vm_fault_t dax_iomap_pte_fault(struct vm_fault *vmf, pfn_t *pfnp,
 	 * the file system block size to be equal the page size, which
means
 	 * that we never have to deal with more than a single extent here.
 	 */
-	error = ops->iomap_begin(inode, pos, PAGE_SIZE, flags, &iomap);
+	error = ops->iomap_begin(inode, pos, PAGE_SIZE, flags, &iomap, &srcmap);
 	if (iomap_errp)
 		*iomap_errp = error;
 	if (error) {
@@ -1472,6 +1473,7 @@ static vm_fault_t dax_iomap_pmd_fault(struct vm_fault *vmf, pfn_t *pfnp,
 	struct inode *inode = mapping->host;
 	vm_fault_t result = VM_FAULT_FALLBACK;
 	struct iomap iomap = { 0 };
+	struct iomap srcmap = { 0 };
 	pgoff_t max_pgoff;
 	void *entry;
 	loff_t pos;
@@ -1546,7 +1548,7 @@ static vm_fault_t dax_iomap_pmd_fault(struct vm_fault *vmf, pfn_t *pfnp,
 	 * to look up our filesystem block.
 	 */
 	pos = (loff_t)xas.xa_index << PAGE_SHIFT;
-	error = ops->iomap_begin(inode, pos, PMD_SIZE, iomap_flags, &iomap);
+	error = ops->iomap_begin(inode, pos, PMD_SIZE, iomap_flags, &iomap, &srcmap);
 	if (error)
 		goto unlock_entry;
diff --git a/fs/ext2/inode.c b/fs/ext2/inode.c
index 7004ce581a32..467c13ff6b40 100644
--- a/fs/ext2/inode.c
+++ b/fs/ext2/inode.c
@@ -801,7 +801,7 @@ int ext2_get_block(struct inode *inode, sector_t iblock,
 #ifdef CONFIG_FS_DAX
 static int ext2_iomap_begin(struct inode *inode, loff_t offset, loff_t length,
-		unsigned flags, struct iomap *iomap)
+		unsigned flags, struct iomap *iomap, struct iomap *srcmap)
 {
 	unsigned int blkbits = inode->i_blkbits;
 	unsigned long first_block = offset >> blkbits;
diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c
index 420fe3deed39..918f94eff799 100644
--- a/fs/ext4/inode.c
+++ b/fs/ext4/inode.c
@@ -3453,7 +3453,7 @@ static bool ext4_inode_datasync_dirty(struct inode *inode)
 }
 static int ext4_iomap_begin(struct inode *inode, loff_t offset, loff_t length,
-			    unsigned flags, struct iomap *iomap)
+			    unsigned flags, struct iomap *iomap, struct iomap *srcmap)
 {
 	struct ext4_sb_info *sbi = EXT4_SB(inode->i_sb);
 	unsigned int blkbits = inode->i_blkbits;
diff --git a/fs/gfs2/bmap.c b/fs/gfs2/bmap.c
index 79581b9bdebb..0bf8e8fa82bd 100644
--- a/fs/gfs2/bmap.c
+++ b/fs/gfs2/bmap.c
@@ -1123,7 +1123,8 @@ static int gfs2_iomap_begin_write(struct inode *inode, loff_t pos,
 }
 static int gfs2_iomap_begin(struct inode *inode, loff_t pos, loff_t length,
-			    unsigned flags, struct iomap *iomap)
+			    unsigned flags, struct iomap *iomap,
+			    struct iomap *srcmap)
 {
 	struct gfs2_inode *ip = GFS2_I(inode);
 	struct metapath mp = { .mp_aheight = 1, };
diff --git a/fs/iomap/apply.c b/fs/iomap/apply.c
index 54c02aecf3cd..6cdb362fff36 100644
--- a/fs/iomap/apply.c
+++ b/fs/iomap/apply.c
@@ -24,6 +24,7 @@ iomap_apply(struct inode *inode, loff_t pos, loff_t length, unsigned flags,
 		const struct iomap_ops *ops, void *data, iomap_actor_t actor)
 {
 	struct iomap iomap = { 0 };
+	struct iomap srcmap = { 0 };
 	loff_t written = 0, ret;
 	/*
@@ -38,7 +39,7 @@ iomap_apply(struct inode *inode, loff_t pos, loff_t length, unsigned flags,
 	 * expose transient stale data. If the reserve fails, we can safely
 	 * back out at this point as there is nothing to undo.
 	 */
-	ret = ops->iomap_begin(inode, pos, length, flags, &iomap);
+	ret = ops->iomap_begin(inode, pos, length, flags, &iomap, &srcmap);
 	if (ret)
 		return ret;
 	if (WARN_ON(iomap.offset > pos))
@@ -58,7 +59,7 @@ iomap_apply(struct inode *inode, loff_t pos, loff_t length, unsigned flags,
 	 * we can do the copy-in page by page without having to worry about
 	 * failures exposing transient data.
 	 */
-	written = actor(inode, pos, length, data, &iomap);
+	written = actor(inode, pos, length, data, &iomap, &srcmap);
 	/*
 	 * Now the data has been copied, commit the range we've copied.
This
diff --git a/fs/iomap/buffered-io.c b/fs/iomap/buffered-io.c
index e25901ae3ff4..f27756c0b31c 100644
--- a/fs/iomap/buffered-io.c
+++ b/fs/iomap/buffered-io.c
@@ -205,7 +205,7 @@ iomap_read_inline_data(struct inode *inode, struct page *page,
 static loff_t
 iomap_readpage_actor(struct inode *inode, loff_t pos, loff_t length, void *data,
-		struct iomap *iomap)
+		struct iomap *iomap, struct iomap *srcmap)
 {
 	struct iomap_readpage_ctx *ctx = data;
 	struct page *page = ctx->cur_page;
@@ -351,7 +351,7 @@ iomap_next_page(struct inode *inode, struct list_head *pages, loff_t pos,
 static loff_t
 iomap_readpages_actor(struct inode *inode, loff_t pos, loff_t length,
-		void *data, struct iomap *iomap)
+		void *data, struct iomap *iomap, struct iomap *srcmap)
 {
 	struct iomap_readpage_ctx *ctx = data;
 	loff_t done, ret;
@@ -371,7 +371,7 @@ iomap_readpages_actor(struct inode *inode, loff_t pos, loff_t length,
 			ctx->cur_page_in_bio = false;
 		}
 		ret = iomap_readpage_actor(inode, pos + done, length - done,
-				ctx, iomap);
+				ctx, iomap, srcmap);
 	}
 	return done;
@@ -736,7 +736,7 @@ iomap_write_end(struct inode *inode, loff_t pos, unsigned len,
 static loff_t
 iomap_write_actor(struct inode *inode, loff_t pos, loff_t length, void *data,
-		struct iomap *iomap)
+		struct iomap *iomap, struct iomap *srcmap)
 {
 	struct iov_iter *i = data;
 	long status = 0;
@@ -853,7 +853,7 @@ __iomap_read_page(struct inode *inode, loff_t offset)
 static loff_t
 iomap_dirty_actor(struct inode *inode, loff_t pos, loff_t length, void *data,
-		struct iomap *iomap)
+		struct iomap *iomap, struct iomap *srcmap)
 {
 	long status = 0;
 	ssize_t written = 0;
@@ -942,7 +942,7 @@ static int iomap_dax_zero(loff_t pos, unsigned offset, unsigned bytes,
 static loff_t
 iomap_zero_range_actor(struct inode *inode, loff_t pos, loff_t count,
-		void *data, struct iomap *iomap)
+		void *data, struct iomap *iomap, struct iomap *srcmap)
 {
 	bool *did_zero = data;
 	loff_t written = 0;
@@ -1011,7 +1011,7 @@ EXPORT_SYMBOL_GPL(iomap_truncate_page);
 static loff_t
 iomap_page_mkwrite_actor(struct inode *inode, loff_t pos, loff_t length,
-		void *data, struct iomap *iomap)
+		void *data, struct iomap *iomap, struct iomap *srcmap)
 {
 	struct page *page = data;
 	int ret;
diff --git a/fs/iomap/direct-io.c b/fs/iomap/direct-io.c
index 10517cea9682..5279029c7a3c 100644
--- a/fs/iomap/direct-io.c
+++ b/fs/iomap/direct-io.c
@@ -362,7 +362,7 @@ iomap_dio_inline_actor(struct inode *inode, loff_t pos, loff_t length,
 static loff_t
 iomap_dio_actor(struct inode *inode, loff_t pos, loff_t length,
-		void *data, struct iomap *iomap)
+		void *data, struct iomap *iomap, struct iomap *srcmap)
 {
 	struct iomap_dio *dio = data;
diff --git a/fs/iomap/fiemap.c b/fs/iomap/fiemap.c
index f26fdd36e383..690ef2d7c6c8 100644
--- a/fs/iomap/fiemap.c
+++ b/fs/iomap/fiemap.c
@@ -44,7 +44,7 @@ static int iomap_to_fiemap(struct fiemap_extent_info *fi,
 static loff_t
 iomap_fiemap_actor(struct inode *inode, loff_t pos, loff_t length, void *data,
-		struct iomap *iomap)
+		struct iomap *iomap, struct iomap *srcmap)
 {
 	struct fiemap_ctx *ctx = data;
 	loff_t ret = length;
@@ -111,7 +111,7 @@ EXPORT_SYMBOL_GPL(iomap_fiemap);
 static loff_t
 iomap_bmap_actor(struct inode *inode, loff_t pos, loff_t length,
-		void *data, struct iomap *iomap)
+		void *data, struct iomap *iomap, struct iomap *srcmap)
 {
 	sector_t *bno = data, addr;
diff --git a/fs/iomap/seek.c b/fs/iomap/seek.c
index c04bad4b2b43..89f61d93c0bc 100644
--- a/fs/iomap/seek.c
+++ b/fs/iomap/seek.c
@@ -119,7 +119,7 @@ page_cache_seek_hole_data(struct inode *inode, loff_t offset, loff_t length,
 static loff_t
 iomap_seek_hole_actor(struct inode *inode, loff_t offset, loff_t length,
-		      void *data, struct iomap *iomap)
+		      void *data, struct iomap *iomap, struct iomap *srcmap)
 {
 	switch (iomap->type) {
 	case IOMAP_UNWRITTEN:
@@ -165,7 +165,7 @@ EXPORT_SYMBOL_GPL(iomap_seek_hole);
 static loff_t
 iomap_seek_data_actor(struct inode *inode, loff_t offset, loff_t length,
-		      void *data, struct iomap *iomap)
+		      void *data, struct iomap *iomap, struct iomap *srcmap)
 {
 	switch (iomap->type) {
 	case IOMAP_HOLE:
diff --git a/fs/iomap/swapfile.c b/fs/iomap/swapfile.c
index 152a230f668d..a648dbf6991e 100644
--- a/fs/iomap/swapfile.c
+++ b/fs/iomap/swapfile.c
@@ -76,7 +76,8 @@ static int iomap_swapfile_add_extent(struct iomap_swapfile_info *isi)
  * distinction between written and unwritten extents.
  */
 static loff_t iomap_swapfile_activate_actor(struct inode *inode, loff_t pos,
-		loff_t count, void *data, struct iomap *iomap)
+		loff_t count, void *data, struct iomap *iomap,
+		struct iomap *srcmap)
 {
 	struct iomap_swapfile_info *isi = data;
 	int error;
diff --git a/fs/xfs/xfs_iomap.c b/fs/xfs/xfs_iomap.c
index 3a4310d7cb59..8321733c16c3 100644
--- a/fs/xfs/xfs_iomap.c
+++ b/fs/xfs/xfs_iomap.c
@@ -922,7 +922,8 @@ xfs_file_iomap_begin(
 	loff_t			offset,
 	loff_t			length,
 	unsigned		flags,
-	struct iomap		*iomap)
+	struct iomap		*iomap,
+	struct iomap		*srcmap)
 {
 	struct xfs_inode	*ip = XFS_I(inode);
 	struct xfs_mount	*mp = ip->i_mount;
@@ -1145,7 +1146,8 @@ xfs_seek_iomap_begin(
 	loff_t			offset,
 	loff_t			length,
 	unsigned		flags,
-	struct iomap		*iomap)
+	struct iomap		*iomap,
+	struct iomap		*srcmap)
 {
 	struct xfs_inode	*ip = XFS_I(inode);
 	struct xfs_mount	*mp = ip->i_mount;
@@ -1231,7 +1233,8 @@ xfs_xattr_iomap_begin(
 	loff_t			offset,
 	loff_t			length,
 	unsigned		flags,
-	struct iomap		*iomap)
+	struct iomap		*iomap,
+	struct iomap		*srcmap)
 {
 	struct xfs_inode	*ip = XFS_I(inode);
 	struct xfs_mount	*mp = ip->i_mount;
diff --git a/include/linux/iomap.h b/include/linux/iomap.h
index bc499ceae392..5b2055e8ca8a 100644
--- a/include/linux/iomap.h
+++ b/include/linux/iomap.h
@@ -26,6 +26,7 @@ struct vm_fault;
 #define IOMAP_MAPPED	0x03	/* blocks allocated at @addr */
 #define IOMAP_UNWRITTEN	0x04	/* blocks allocated at @addr in unwritten state */
 #define IOMAP_INLINE	0x05	/* data inline in the inode */
+#define IOMAP_COW	0x06	/* copy data from srcmap before writing */
 /*
  * Flags for all iomap mappings:
@@ -110,7 +111,8 @@ struct iomap_ops {
 	 * The
actual length is returned in iomap->length.
 	 */
 	int (*iomap_begin)(struct inode *inode, loff_t pos, loff_t length,
-			unsigned flags, struct iomap *iomap);
+			unsigned flags, struct iomap *iomap,
+			struct iomap *srcmap);
 	/*
 	 * Commit and/or unreserve space previous allocated using iomap_begin.
@@ -126,7 +128,7 @@ struct iomap_ops {
  * Main iomap iterator function.
  */
 typedef loff_t (*iomap_actor_t)(struct inode *inode, loff_t pos, loff_t len,
-		void *data, struct iomap *iomap);
+		void *data, struct iomap *iomap, struct iomap *srcmap);
 loff_t iomap_apply(struct inode *inode, loff_t pos, loff_t length,
 		unsigned flags, const struct iomap_ops *ops, void *data,

From patchwork Fri Aug 2 22:00:37 2019
X-Patchwork-Id: 11074155
From: Goldwyn Rodrigues
To: linux-fsdevel@vger.kernel.org
Cc: linux-btrfs@vger.kernel.org, hch@lst.de, darrick.wong@oracle.com, ruansy.fnst@cn.fujitsu.com, Goldwyn Rodrigues
Subject: [PATCH 02/13] iomap: Read page from srcmap for IOMAP_COW
Date: Fri, 2 Aug 2019 17:00:37 -0500
Message-Id: <20190802220048.16142-3-rgoldwyn@suse.de>
In-Reply-To: <20190802220048.16142-1-rgoldwyn@suse.de>
References: <20190802220048.16142-1-rgoldwyn@suse.de>

From: Goldwyn Rodrigues

For an IOMAP_COW mapping, read the page from the srcmap before
performing a write on the page.

Signed-off-by: Goldwyn Rodrigues
Reviewed-by: Darrick J. Wong
---
 fs/iomap/buffered-io.c | 14 ++++++++------
 1 file changed, 8 insertions(+), 6 deletions(-)

diff --git a/fs/iomap/buffered-io.c b/fs/iomap/buffered-io.c
index f27756c0b31c..a96cc26eec92 100644
--- a/fs/iomap/buffered-io.c
+++ b/fs/iomap/buffered-io.c
@@ -581,7 +581,7 @@ __iomap_write_begin(struct inode *inode, loff_t pos, unsigned len,
 static int
 iomap_write_begin(struct inode *inode, loff_t pos, unsigned len, unsigned flags,
-		struct page **pagep, struct iomap *iomap)
+		struct page **pagep, struct iomap *iomap, struct iomap *srcmap)
 {
 	const struct iomap_page_ops *page_ops = iomap->page_ops;
 	pgoff_t index = pos >> PAGE_SHIFT;
@@ -607,6 +607,8 @@ iomap_write_begin(struct inode *inode, loff_t pos, unsigned len, unsigned flags,
 	if (iomap->type == IOMAP_INLINE)
 		iomap_read_inline_data(inode, page, iomap);
+	else if (iomap->type == IOMAP_COW)
+		status = __iomap_write_begin(inode, pos, len, page, srcmap);
 	else if (iomap->flags & IOMAP_F_BUFFER_HEAD)
 		status = __block_write_begin_int(page, pos, len, NULL, iomap);
 	else
@@ -772,7 +774,7 @@ iomap_write_actor(struct inode *inode, loff_t pos, loff_t length, void *data,
 		}
 		status = iomap_write_begin(inode, pos, bytes, flags, &page,
-				iomap);
+				iomap, srcmap);
 		if (unlikely(status))
 			break;
@@ -871,7 +873,7 @@ iomap_dirty_actor(struct inode *inode, loff_t pos, loff_t length, void *data,
 			return PTR_ERR(rpage);
 		status = iomap_write_begin(inode, pos, bytes,
-					   AOP_FLAG_NOFS, &page, iomap);
+					   AOP_FLAG_NOFS, &page, iomap, srcmap);
 		put_page(rpage);
 		if (unlikely(status))
 			return status;
@@ -917,13 +919,13 @@ iomap_file_dirty(struct inode *inode, loff_t pos, loff_t len,
 EXPORT_SYMBOL_GPL(iomap_file_dirty);
 static int iomap_zero(struct inode *inode, loff_t pos, unsigned offset,
-		unsigned bytes, struct iomap *iomap)
+		unsigned bytes, struct iomap *iomap, struct iomap *srcmap)
 {
 	struct page *page;
 	int status;
 	status = iomap_write_begin(inode, pos, bytes, AOP_FLAG_NOFS, &page,
-				   iomap);
+				   iomap, srcmap);
 	if (status)
 		return status;
@@ -961,7 +963,7 @@ iomap_zero_range_actor(struct inode *inode, loff_t pos, loff_t count,
 		if (IS_DAX(inode))
 			status = iomap_dax_zero(pos, offset, bytes, iomap);
 		else
-			status = iomap_zero(inode, pos, offset, bytes, iomap);
+			status = iomap_zero(inode, pos, offset, bytes, iomap, srcmap);
 		if (status < 0)
 			return status;

From patchwork Fri Aug 2 22:00:38 2019
X-Patchwork-Id: 11074159
From: Goldwyn Rodrigues
To: linux-fsdevel@vger.kernel.org
Cc: linux-btrfs@vger.kernel.org, hch@lst.de, darrick.wong@oracle.com, ruansy.fnst@cn.fujitsu.com, Goldwyn Rodrigues
Subject: [PATCH 03/13] btrfs: Eliminate PagePrivate for btrfs data pages
Date: Fri, 2 Aug 2019 17:00:38 -0500
Message-Id: <20190802220048.16142-4-rgoldwyn@suse.de>
In-Reply-To: <20190802220048.16142-1-rgoldwyn@suse.de>
References: <20190802220048.16142-1-rgoldwyn@suse.de>

From: Goldwyn Rodrigues

Most of the code keeps working after simply eliminating the page's
private field and the code related to it, but there is a problem when
cloning: the extent assumes the data is uptodate. Clear the
EXTENT_UPTODATE flag for the extent so that the next read of the file
is forced to go to disk rather than the page cache.

This patch is required to make sure we don't conflict with iomap's
usage of page->private.

Signed-off-by: Goldwyn Rodrigues
---
 fs/btrfs/compression.c      |  1 -
 fs/btrfs/extent_io.c        | 13 -------------
 fs/btrfs/extent_io.h        |  2 --
 fs/btrfs/file.c             |  1 -
 fs/btrfs/free-space-cache.c |  1 -
 fs/btrfs/inode.c            | 15 +--------------
 fs/btrfs/ioctl.c            |  4 ++--
 fs/btrfs/relocation.c       |  2 --
 8 files changed, 3 insertions(+), 36 deletions(-)

diff --git a/fs/btrfs/compression.c b/fs/btrfs/compression.c
index 60c47b417a4b..fe41fa3d2999 100644
--- a/fs/btrfs/compression.c
+++ b/fs/btrfs/compression.c
@@ -481,7 +481,6 @@ static noinline int add_ra_bio_pages(struct inode *inode,
 		 * for these bytes in the file. But, we have to make
 		 * sure they map to this compressed extent on disk.
 		 */
-		set_page_extent_mapped(page);
 		lock_extent(tree, last_offset, end);
 		read_lock(&em_tree->lock);
 		em = lookup_extent_mapping(em_tree, last_offset,
diff --git a/fs/btrfs/extent_io.c b/fs/btrfs/extent_io.c
index 1ff438fd5bc2..27233fb6660c 100644
--- a/fs/btrfs/extent_io.c
+++ b/fs/btrfs/extent_io.c
@@ -3005,15 +3005,6 @@ static void attach_extent_buffer_page(struct extent_buffer *eb,
 	}
 }
-void set_page_extent_mapped(struct page *page)
-{
-	if (!PagePrivate(page)) {
-		SetPagePrivate(page);
-		get_page(page);
-		set_page_private(page, EXTENT_PAGE_PRIVATE);
-	}
-}
-
 static struct extent_map *
 __get_extent_map(struct inode *inode, struct page *page, size_t pg_offset,
 		 u64 start, u64 len, get_extent_t *get_extent,
@@ -3074,8 +3065,6 @@ static int __do_readpage(struct extent_io_tree *tree,
 	size_t blocksize = inode->i_sb->s_blocksize;
 	unsigned long this_bio_flag = 0;
-	set_page_extent_mapped(page);
-
 	if (!PageUptodate(page)) {
 		if (cleancache_get_page(page) == 0) {
 			BUG_ON(blocksize != PAGE_SIZE);
@@ -3589,8 +3578,6 @@ static int __extent_writepage(struct page *page, struct writeback_control *wbc,
 	pg_offset = 0;
-	set_page_extent_mapped(page);
-
 	if (!epd->extent_locked) {
 		ret = writepage_delalloc(inode, page, wbc, start, &nr_written);
 		if (ret == 1)
diff --git a/fs/btrfs/extent_io.h b/fs/btrfs/extent_io.h
index 401423b16976..8082774371b5 100644
--- a/fs/btrfs/extent_io.h
+++ b/fs/btrfs/extent_io.h
@@ -416,8 +416,6 @@ int extent_readpages(struct address_space *mapping, struct list_head *pages,
 		     unsigned nr_pages);
 int extent_fiemap(struct inode *inode, struct fiemap_extent_info *fieinfo,
 		__u64 start, __u64 len);
-void set_page_extent_mapped(struct page *page);
-
 struct extent_buffer *alloc_extent_buffer(struct btrfs_fs_info *fs_info,
 					  u64 start);
 struct extent_buffer *__alloc_dummy_extent_buffer(struct btrfs_fs_info *fs_info,
diff --git a/fs/btrfs/file.c b/fs/btrfs/file.c
index 58a18ed11546..4466a09f2d98 100644
--- a/fs/btrfs/file.c
+++ b/fs/btrfs/file.c
@@ -1539,7 +1539,6 @@ lock_and_cleanup_extent_if_need(struct btrfs_inode *inode, struct page **pages,
 	 * delalloc bits and dirty the pages as required.
 	 */
 	for (i = 0; i < num_pages; i++) {
-		set_page_extent_mapped(pages[i]);
 		WARN_ON(!PageLocked(pages[i]));
 	}
diff --git a/fs/btrfs/free-space-cache.c b/fs/btrfs/free-space-cache.c
index 062be9dde4c6..9a0c519bd6d4 100644
--- a/fs/btrfs/free-space-cache.c
+++ b/fs/btrfs/free-space-cache.c
@@ -395,7 +395,6 @@ static int io_ctl_prepare_pages(struct btrfs_io_ctl *io_ctl, struct inode *inode
 	for (i = 0; i < io_ctl->num_pages; i++) {
 		clear_page_dirty_for_io(io_ctl->pages[i]);
-		set_page_extent_mapped(io_ctl->pages[i]);
 	}
 	return 0;
diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c
index ee582a36653d..258bacefdf5f 100644
--- a/fs/btrfs/inode.c
+++ b/fs/btrfs/inode.c
@@ -4932,7 +4932,6 @@ int btrfs_truncate_block(struct inode *inode, loff_t from, loff_t len,
 	wait_on_page_writeback(page);
 	lock_extent_bits(io_tree, block_start, block_end, &cached_state);
-	set_page_extent_mapped(page);
 	ordered = btrfs_lookup_ordered_extent(inode, block_start);
 	if (ordered) {
@@ -8754,13 +8753,7 @@ btrfs_readpages(struct file *file, struct address_space *mapping,
 static int __btrfs_releasepage(struct page *page, gfp_t gfp_flags)
 {
-	int ret = try_release_extent_mapping(page, gfp_flags);
-	if (ret == 1) {
-		ClearPagePrivate(page);
-		set_page_private(page, 0);
-		put_page(page);
-	}
-	return ret;
+	return try_release_extent_mapping(page, gfp_flags);
 }
 static int btrfs_releasepage(struct page *page, gfp_t gfp_flags)
@@ -8878,11 +8871,6 @@ static void btrfs_invalidatepage(struct page *page, unsigned int offset,
 	}
 	ClearPageChecked(page);
-	if (PagePrivate(page)) {
-		ClearPagePrivate(page);
-		set_page_private(page, 0);
-		put_page(page);
-	}
 }
 /*
@@ -8961,7 +8949,6 @@ vm_fault_t btrfs_page_mkwrite(struct vm_fault *vmf)
 	wait_on_page_writeback(page);
 	lock_extent_bits(io_tree, page_start, page_end, &cached_state);
-	set_page_extent_mapped(page);
 	/*
 	 * we can't set the delalloc bits
if there are pending ordered
diff --git a/fs/btrfs/ioctl.c b/fs/btrfs/ioctl.c
index 818f7ec8bb0e..861617e3d0c9 100644
--- a/fs/btrfs/ioctl.c
+++ b/fs/btrfs/ioctl.c
@@ -1355,7 +1355,6 @@ static int cluster_pages_for_defrag(struct inode *inode,
 	for (i = 0; i < i_done; i++) {
 		clear_page_dirty_for_io(pages[i]);
 		ClearPageChecked(pages[i]);
-		set_page_extent_mapped(pages[i]);
 		set_page_dirty(pages[i]);
 		unlock_page(pages[i]);
 		put_page(pages[i]);
@@ -3550,6 +3549,7 @@ static int btrfs_clone(struct inode *src, struct inode *inode,
 	int ret;
 	const u64 len = olen_aligned;
 	u64 last_dest_end = destoff;
+	struct extent_io_tree *tree = &BTRFS_I(inode)->io_tree;
 	ret = -ENOMEM;
 	buf = kvmalloc(fs_info->nodesize, GFP_KERNEL);
@@ -3864,6 +3864,7 @@ static int btrfs_clone(struct inode *src, struct inode *inode,
 				destoff, olen, no_time_update);
 	}
+	clear_extent_uptodate(tree, destoff, destoff+olen, NULL);
 out:
 	btrfs_free_path(path);
 	kvfree(buf);
@@ -3935,7 +3936,6 @@ static noinline int btrfs_clone_files(struct file *file, struct file *file_src,
 	truncate_inode_pages_range(&inode->i_data,
 				round_down(destoff, PAGE_SIZE),
 				round_up(destoff + len, PAGE_SIZE) - 1);
-
 	return ret;
 }
diff --git a/fs/btrfs/relocation.c b/fs/btrfs/relocation.c
index 7f219851fa23..612988b7eb27 100644
--- a/fs/btrfs/relocation.c
+++ b/fs/btrfs/relocation.c
@@ -3300,8 +3300,6 @@ static int relocate_file_extent_cluster(struct inode *inode,
 		lock_extent(&BTRFS_I(inode)->io_tree, page_start, page_end);
-		set_page_extent_mapped(page);
-
 		if (nr < cluster->nr &&
 		    page_start + offset == cluster->boundary[nr]) {
 			set_extent_bits(&BTRFS_I(inode)->io_tree,

From patchwork Fri Aug 2 22:00:39 2019
X-Patchwork-Id: 11074165
From: Goldwyn Rodrigues
To: linux-fsdevel@vger.kernel.org
Cc: linux-btrfs@vger.kernel.org, hch@lst.de, darrick.wong@oracle.com, ruansy.fnst@cn.fujitsu.com, Goldwyn Rodrigues
Subject: [PATCH 04/13] btrfs: Add a simple buffered iomap write
Date: Fri, 2 Aug 2019 17:00:39 -0500
Message-Id: <20190802220048.16142-5-rgoldwyn@suse.de>
In-Reply-To: <20190802220048.16142-1-rgoldwyn@suse.de>
References: <20190802220048.16142-1-rgoldwyn@suse.de>

From: Goldwyn Rodrigues

Introduce a new btrfs_iomap structure which contains information about the filesystem
between the iomap_begin() and iomap_end() calls. This contains
information about reservations and extent locking.

This one is a long patch. Most of the code is "inspired" by
fs/btrfs/file.c. To keep the size small, all removals are in the
following patches.

Signed-off-by: Goldwyn Rodrigues
---
 fs/btrfs/Makefile |   2 +-
 fs/btrfs/ctree.h  |   1 +
 fs/btrfs/file.c   |   4 +-
 fs/btrfs/iomap.c  | 381 ++++++++++++++++++++++++++++++++++++++++++++++++++++++
 4 files changed, 385 insertions(+), 3 deletions(-)
 create mode 100644 fs/btrfs/iomap.c

diff --git a/fs/btrfs/Makefile b/fs/btrfs/Makefile
index 76a843198bcb..f88e696b0698 100644
--- a/fs/btrfs/Makefile
+++ b/fs/btrfs/Makefile
@@ -11,7 +11,7 @@ btrfs-y += super.o ctree.o extent-tree.o print-tree.o root-tree.o dir-item.o \
 	   compression.o delayed-ref.o relocation.o delayed-inode.o scrub.o \
 	   reada.o backref.o ulist.o qgroup.o send.o dev-replace.o raid56.o \
 	   uuid-tree.o props.o free-space-tree.o tree-checker.o space-info.o \
-	   block-rsv.o delalloc-space.o
+	   block-rsv.o delalloc-space.o iomap.o
 btrfs-$(CONFIG_BTRFS_FS_POSIX_ACL) += acl.o
 btrfs-$(CONFIG_BTRFS_FS_CHECK_INTEGRITY) += check-integrity.o
diff --git a/fs/btrfs/ctree.h b/fs/btrfs/ctree.h
index 299e11e6c554..7a4ff524dc77 100644
--- a/fs/btrfs/ctree.h
+++ b/fs/btrfs/ctree.h
@@ -3247,6 +3247,7 @@ int btrfs_fdatawrite_range(struct inode *inode, loff_t start, loff_t end);
 loff_t btrfs_remap_file_range(struct file *file_in, loff_t pos_in,
 			      struct file *file_out, loff_t pos_out,
 			      loff_t len, unsigned int remap_flags);
+size_t btrfs_buffered_iomap_write(struct kiocb *iocb, struct iov_iter *from);
 /* tree-defrag.c */
 int btrfs_defrag_leaves(struct btrfs_trans_handle *trans,
diff --git a/fs/btrfs/file.c b/fs/btrfs/file.c
index 4466a09f2d98..0707db04d3cc 100644
--- a/fs/btrfs/file.c
+++ b/fs/btrfs/file.c
@@ -1829,7 +1829,7 @@ static ssize_t __btrfs_direct_write(struct kiocb *iocb, struct iov_iter *from)
 		return written;
 	pos = iocb->ki_pos;
-	written_buffered = btrfs_buffered_write(iocb, from);
+	written_buffered = btrfs_buffered_iomap_write(iocb, from);
 	if (written_buffered < 0) {
 		err = written_buffered;
 		goto out;
@@ -1966,7 +1966,7 @@ static ssize_t btrfs_file_write_iter(struct kiocb *iocb,
 	if (iocb->ki_flags & IOCB_DIRECT) {
 		num_written = __btrfs_direct_write(iocb, from);
 	} else {
-		num_written = btrfs_buffered_write(iocb, from);
+		num_written = btrfs_buffered_iomap_write(iocb, from);
 		if (num_written > 0)
 			iocb->ki_pos = pos + num_written;
 		if (clean_page)
diff --git a/fs/btrfs/iomap.c b/fs/btrfs/iomap.c
new file mode 100644
index 000000000000..9eb5e7b7603a
--- /dev/null
+++ b/fs/btrfs/iomap.c
@@ -0,0 +1,381 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * iomap support for BTRFS
+ *
+ * Copyright (c) 2019 SUSE Linux
+ * Author: Goldwyn Rodrigues
+ */
+
+#include
+#include "ctree.h"
+#include "btrfs_inode.h"
+#include "volumes.h"
+#include "disk-io.h"
+#include "delalloc-space.h"
+
+struct btrfs_iomap {
+	u64 start;
+	u64 end;
+	bool nocow;
+	int extents_locked;
+	ssize_t reserved_bytes;
+	struct extent_changeset *data_reserved;
+	struct extent_state *cached_state;
+};
+
+
+/*
+ * This function locks the extent and properly waits for data=ordered extents
+ * to finish before allowing the pages to be modified if need.
+ * + * The return value: + * 1 - the extent is locked + * 0 - the extent is not locked, and everything is OK + * -EAGAIN - need re-prepare the pages + * the other < 0 number - Something wrong happens + */ +static noinline int +lock_and_cleanup_extent(struct btrfs_inode *inode, loff_t pos, + size_t write_bytes, + u64 *lockstart, u64 *lockend, + struct extent_state **cached_state) +{ + struct btrfs_fs_info *fs_info = inode->root->fs_info; + u64 start_pos; + u64 last_pos; + int ret = 0; + + start_pos = round_down(pos, fs_info->sectorsize); + last_pos = start_pos + + round_up(pos + write_bytes - start_pos, + fs_info->sectorsize) - 1; + + if (start_pos < inode->vfs_inode.i_size) { + struct btrfs_ordered_extent *ordered; + + lock_extent_bits(&inode->io_tree, start_pos, last_pos, + cached_state); + ordered = btrfs_lookup_ordered_range(inode, start_pos, + last_pos - start_pos + 1); + if (ordered && + ordered->file_offset + ordered->len > start_pos && + ordered->file_offset <= last_pos) { + unlock_extent_cached(&inode->io_tree, start_pos, + last_pos, cached_state); + btrfs_start_ordered_extent(&inode->vfs_inode, + ordered, 1); + btrfs_put_ordered_extent(ordered); + return -EAGAIN; + } + if (ordered) + btrfs_put_ordered_extent(ordered); + + *lockstart = start_pos; + *lockend = last_pos; + ret = 1; + } + + return ret; +} + +static noinline int check_can_nocow(struct btrfs_inode *inode, loff_t pos, + size_t *write_bytes) +{ + struct btrfs_fs_info *fs_info = inode->root->fs_info; + struct btrfs_root *root = inode->root; + struct btrfs_ordered_extent *ordered; + u64 lockstart, lockend; + u64 num_bytes; + int ret; + + ret = btrfs_start_write_no_snapshotting(root); + if (!ret) + return -ENOSPC; + + lockstart = round_down(pos, fs_info->sectorsize); + lockend = round_up(pos + *write_bytes, + fs_info->sectorsize) - 1; + + while (1) { + lock_extent(&inode->io_tree, lockstart, lockend); + ordered = btrfs_lookup_ordered_range(inode, lockstart, + lockend - lockstart + 1); + if 
(!ordered) { + break; + } + unlock_extent(&inode->io_tree, lockstart, lockend); + btrfs_start_ordered_extent(&inode->vfs_inode, ordered, 1); + btrfs_put_ordered_extent(ordered); + } + + num_bytes = lockend - lockstart + 1; + ret = can_nocow_extent(&inode->vfs_inode, lockstart, &num_bytes, + NULL, NULL, NULL); + if (ret <= 0) { + ret = 0; + btrfs_end_write_no_snapshotting(root); + } else { + *write_bytes = min_t(size_t, *write_bytes , + num_bytes - pos + lockstart); + } + + unlock_extent(&inode->io_tree, lockstart, lockend); + + return ret; +} + +static int btrfs_find_new_delalloc_bytes(struct btrfs_inode *inode, + const u64 start, + const u64 len, + struct extent_state **cached_state) +{ + u64 search_start = start; + const u64 end = start + len - 1; + + while (search_start < end) { + const u64 search_len = end - search_start + 1; + struct extent_map *em; + u64 em_len; + int ret = 0; + + em = btrfs_get_extent(inode, NULL, 0, search_start, + search_len, 0); + if (IS_ERR(em)) + return PTR_ERR(em); + + if (em->block_start != EXTENT_MAP_HOLE) + goto next; + + em_len = em->len; + if (em->start < search_start) + em_len -= search_start - em->start; + if (em_len > search_len) + em_len = search_len; + + ret = set_extent_bit(&inode->io_tree, search_start, + search_start + em_len - 1, + EXTENT_DELALLOC_NEW, + NULL, cached_state, GFP_NOFS); +next: + search_start = extent_map_end(em); + free_extent_map(em); + if (ret) + return ret; + } + return 0; +} + +static void btrfs_buffered_page_done(struct inode *inode, loff_t pos, + unsigned copied, struct page *page, + struct iomap *iomap) +{ + SetPageUptodate(page); + ClearPageChecked(page); + set_page_dirty(page); + get_page(page); +} + + +static const struct iomap_page_ops btrfs_buffered_page_ops = { + .page_done = btrfs_buffered_page_done, +}; + + +static int btrfs_buffered_iomap_begin(struct inode *inode, loff_t pos, + loff_t length, unsigned flags, struct iomap *iomap, + struct iomap *srcmap) +{ + int ret; + size_t write_bytes = 
length; + struct btrfs_fs_info *fs_info = btrfs_sb(inode->i_sb); + size_t sector_offset = pos & (fs_info->sectorsize - 1); + struct btrfs_iomap *bi; + + bi = kzalloc(sizeof(struct btrfs_iomap), GFP_NOFS); + if (!bi) + return -ENOMEM; + + bi->reserved_bytes = round_up(write_bytes + sector_offset, + fs_info->sectorsize); + + /* Reserve data space */ + ret = btrfs_check_data_free_space(inode, &bi->data_reserved, pos, + write_bytes); + if (ret < 0) { + /* + * Space allocation failed. Let's check if we can + * continue I/O without allocations + */ + if ((BTRFS_I(inode)->flags & (BTRFS_INODE_NODATACOW | + BTRFS_INODE_PREALLOC)) && + check_can_nocow(BTRFS_I(inode), pos, + &write_bytes) > 0) { + bi->nocow = true; + /* + * our prealloc extent may be smaller than + * write_bytes, so scale down. + */ + bi->reserved_bytes = round_up(write_bytes + + sector_offset, + fs_info->sectorsize); + } else { + goto error; + } + } + + WARN_ON(bi->reserved_bytes == 0); + + /* We have the data space allocated, reserve the metadata now */ + ret = btrfs_delalloc_reserve_metadata(BTRFS_I(inode), + bi->reserved_bytes); + if (ret) { + struct btrfs_root *root = BTRFS_I(inode)->root; + if (!bi->nocow) + btrfs_free_reserved_data_space(inode, + bi->data_reserved, pos, + write_bytes); + else + btrfs_end_write_no_snapshotting(root); + goto error; + } + + do { + ret = lock_and_cleanup_extent( + BTRFS_I(inode), pos, write_bytes, &bi->start, + &bi->end, &bi->cached_state); + } while (ret == -EAGAIN); + + if (ret < 0) { + btrfs_delalloc_release_extents(BTRFS_I(inode), + bi->reserved_bytes, true); + goto release; + } else { + bi->extents_locked = ret; + } + iomap->private = bi; + iomap->length = round_up(write_bytes, fs_info->sectorsize); + iomap->offset = round_down(pos, fs_info->sectorsize); + iomap->addr = IOMAP_NULL_ADDR; + iomap->type = IOMAP_DELALLOC; + iomap->bdev = fs_info->fs_devices->latest_bdev; + iomap->page_ops = &btrfs_buffered_page_ops; + return 0; +release: + if (bi->extents_locked) + 
unlock_extent_cached(&BTRFS_I(inode)->io_tree, bi->start, + bi->end, &bi->cached_state); + if (bi->nocow) { + struct btrfs_root *root = BTRFS_I(inode)->root; + btrfs_end_write_no_snapshotting(root); + btrfs_delalloc_release_metadata(BTRFS_I(inode), + bi->reserved_bytes, true); + } else { + btrfs_delalloc_release_space(inode, bi->data_reserved, + round_down(pos, fs_info->sectorsize), + bi->reserved_bytes, true); + } + extent_changeset_free(bi->data_reserved); + +error: + kfree(bi); + return ret; +} + +static int btrfs_buffered_iomap_end(struct inode *inode, loff_t pos, + loff_t length, ssize_t written, unsigned flags, + struct iomap *iomap) +{ + struct btrfs_fs_info *fs_info = btrfs_sb(inode->i_sb); + struct btrfs_iomap *bi = iomap->private; + ssize_t release_bytes = round_down(bi->reserved_bytes - written, + 1 << fs_info->sb->s_blocksize_bits); + unsigned int extra_bits = 0; + u64 start_pos = pos & ~((u64) fs_info->sectorsize - 1); + u64 num_bytes = round_up(written + pos - start_pos, + fs_info->sectorsize); + u64 end_of_last_block = start_pos + num_bytes - 1; + int ret = 0; + + if (release_bytes > 0) { + if (bi->nocow) { + btrfs_delalloc_release_metadata(BTRFS_I(inode), + release_bytes, true); + } else { + u64 __pos = round_down(pos + written, fs_info->sectorsize); + btrfs_delalloc_release_space(inode, bi->data_reserved, + __pos, release_bytes, true); + } + } + + /* + * The pages may have already been dirty, clear out old accounting so + * we can set things up properly + */ + clear_extent_bit(&BTRFS_I(inode)->io_tree, start_pos, end_of_last_block, + EXTENT_DIRTY | EXTENT_DELALLOC | EXTENT_DO_ACCOUNTING | + EXTENT_DEFRAG, 0, 0, &bi->cached_state); + + if (!btrfs_is_free_space_inode(BTRFS_I(inode))) { + if (start_pos >= i_size_read(inode) && + !(BTRFS_I(inode)->flags & BTRFS_INODE_PREALLOC)) { + /* + * There can't be any extents following eof in this case + * so just set the delalloc new bit for the range + * directly. 
+ */ + extra_bits |= EXTENT_DELALLOC_NEW; + } else { + ret = btrfs_find_new_delalloc_bytes(BTRFS_I(inode), + start_pos, num_bytes, + &bi->cached_state); + if (ret) + goto unlock; + } + } + + ret = btrfs_set_extent_delalloc(inode, start_pos, end_of_last_block, + extra_bits, &bi->cached_state, 0); +unlock: + if (bi->extents_locked) + unlock_extent_cached(&BTRFS_I(inode)->io_tree, + bi->start, bi->end, &bi->cached_state); + + if (bi->nocow) { + struct btrfs_root *root = BTRFS_I(inode)->root; + btrfs_end_write_no_snapshotting(root); + if (written > 0) { + u64 start = round_down(pos, fs_info->sectorsize); + u64 end = round_up(pos + written, fs_info->sectorsize) - 1; + set_extent_bit(&BTRFS_I(inode)->io_tree, start, end, + EXTENT_NORESERVE, NULL, NULL, GFP_NOFS); + } + + } + btrfs_delalloc_release_extents(BTRFS_I(inode), bi->reserved_bytes, + true); + + if (written < fs_info->nodesize) + btrfs_btree_balance_dirty(fs_info); + + extent_changeset_free(bi->data_reserved); + kfree(bi); + return ret; +} + +static const struct iomap_ops btrfs_buffered_iomap_ops = { + .iomap_begin = btrfs_buffered_iomap_begin, + .iomap_end = btrfs_buffered_iomap_end, +}; + +size_t btrfs_buffered_iomap_write(struct kiocb *iocb, struct iov_iter *from) +{ + ssize_t written; + struct inode *inode = file_inode(iocb->ki_filp); + written = iomap_file_buffered_write(iocb, from, &btrfs_buffered_iomap_ops); + if (written > 0) + iocb->ki_pos += written; + if (iocb->ki_pos > i_size_read(inode)) + i_size_write(inode, iocb->ki_pos); + return written; +} + From patchwork Fri Aug 2 22:00:40 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Goldwyn Rodrigues X-Patchwork-Id: 11074169 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 6E230912 for ; Fri, 2 Aug 2019 22:01:09 +0000 (UTC) Received: from 
mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 60B84286DF for ; Fri, 2 Aug 2019 22:01:09 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id 54C20288DB; Fri, 2 Aug 2019 22:01:09 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-7.9 required=2.0 tests=BAYES_00,MAILING_LIST_MULTI, RCVD_IN_DNSWL_HI autolearn=unavailable version=3.3.1 Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id EEDFA288DA for ; Fri, 2 Aug 2019 22:01:08 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S2437123AbfHBWBH (ORCPT ); Fri, 2 Aug 2019 18:01:07 -0400 Received: from mx2.suse.de ([195.135.220.15]:37992 "EHLO mx1.suse.de" rhost-flags-OK-OK-OK-FAIL) by vger.kernel.org with ESMTP id S2437117AbfHBWBF (ORCPT ); Fri, 2 Aug 2019 18:01:05 -0400 X-Virus-Scanned: by amavisd-new at test-mx.suse.de Received: from relay2.suse.de (unknown [195.135.220.254]) by mx1.suse.de (Postfix) with ESMTP id 5248CB128; Fri, 2 Aug 2019 22:01:03 +0000 (UTC) From: Goldwyn Rodrigues To: linux-fsdevel@vger.kernel.org Cc: linux-btrfs@vger.kernel.org, hch@lst.de, darrick.wong@oracle.com, ruansy.fnst@cn.fujitsu.com, Goldwyn Rodrigues Subject: [PATCH 05/13] btrfs: Add CoW in iomap based writes Date: Fri, 2 Aug 2019 17:00:40 -0500 Message-Id: <20190802220048.16142-6-rgoldwyn@suse.de> X-Mailer: git-send-email 2.16.4 In-Reply-To: <20190802220048.16142-1-rgoldwyn@suse.de> References: <20190802220048.16142-1-rgoldwyn@suse.de> Sender: linux-btrfs-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-btrfs@vger.kernel.org X-Virus-Scanned: ClamAV using ClamSMTP From: Goldwyn Rodrigues Set iomap->type to IOMAP_COW and fill up the source map in case the I/O is not page aligned. 
Signed-off-by: Goldwyn Rodrigues --- fs/btrfs/iomap.c | 41 +++++++++++++++++++++++++++++++++++++++++ 1 file changed, 41 insertions(+) diff --git a/fs/btrfs/iomap.c b/fs/btrfs/iomap.c index 9eb5e7b7603a..879038e2f1a0 100644 --- a/fs/btrfs/iomap.c +++ b/fs/btrfs/iomap.c @@ -165,6 +165,35 @@ static int btrfs_find_new_delalloc_bytes(struct btrfs_inode *inode, return 0; } +/* + * get_iomap: Get the block map and fill the iomap structure + * @pos: file position + * @length: I/O length + * @iomap: The iomap structure to fill + */ + +static int get_iomap(struct inode *inode, loff_t pos, loff_t length, + struct iomap *iomap) +{ + struct extent_map *em; + iomap->addr = IOMAP_NULL_ADDR; + em = btrfs_get_extent(BTRFS_I(inode), NULL, 0, pos, length, 0); + if (IS_ERR(em)) + return PTR_ERR(em); + /* XXX Do we need to check for em->flags here? */ + if (em->block_start == EXTENT_MAP_HOLE) { + iomap->type = IOMAP_HOLE; + } else { + iomap->addr = em->block_start; + iomap->type = IOMAP_MAPPED; + } + iomap->offset = em->start; + iomap->bdev = em->bdev; + iomap->length = em->len; + free_extent_map(em); + return 0; +} + static void btrfs_buffered_page_done(struct inode *inode, loff_t pos, unsigned copied, struct page *page, struct iomap *iomap) @@ -188,6 +217,7 @@ static int btrfs_buffered_iomap_begin(struct inode *inode, loff_t pos, int ret; size_t write_bytes = length; struct btrfs_fs_info *fs_info = btrfs_sb(inode->i_sb); + size_t end; size_t sector_offset = pos & (fs_info->sectorsize - 1); struct btrfs_iomap *bi; @@ -255,6 +285,17 @@ static int btrfs_buffered_iomap_begin(struct inode *inode, loff_t pos, iomap->private = bi; iomap->length = round_up(write_bytes, fs_info->sectorsize); iomap->offset = round_down(pos, fs_info->sectorsize); + end = pos + write_bytes; + /* Set IOMAP_COW if start/end is not page aligned */ + if (((pos & (PAGE_SIZE - 1)) || (end & (PAGE_SIZE - 1)))) { + iomap->type = IOMAP_COW; + ret = get_iomap(inode, pos, length, srcmap); + if (ret < 0) + goto release; + } 
else { + iomap->type = IOMAP_DELALLOC; + } + iomap->addr = IOMAP_NULL_ADDR; iomap->type = IOMAP_DELALLOC; iomap->bdev = fs_info->fs_devices->latest_bdev; From patchwork Fri Aug 2 22:00:41 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Goldwyn Rodrigues X-Patchwork-Id: 11074171 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 84705912 for ; Fri, 2 Aug 2019 22:01:12 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 74383286DF for ; Fri, 2 Aug 2019 22:01:12 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id 68DB8288DB; Fri, 2 Aug 2019 22:01:12 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-7.9 required=2.0 tests=BAYES_00,MAILING_LIST_MULTI, RCVD_IN_DNSWL_HI autolearn=ham version=3.3.1 Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 78E30286DF for ; Fri, 2 Aug 2019 22:01:11 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S2437129AbfHBWBK (ORCPT ); Fri, 2 Aug 2019 18:01:10 -0400 Received: from mx2.suse.de ([195.135.220.15]:38034 "EHLO mx1.suse.de" rhost-flags-OK-OK-OK-FAIL) by vger.kernel.org with ESMTP id S1730782AbfHBWBG (ORCPT ); Fri, 2 Aug 2019 18:01:06 -0400 X-Virus-Scanned: by amavisd-new at test-mx.suse.de Received: from relay2.suse.de (unknown [195.135.220.254]) by mx1.suse.de (Postfix) with ESMTP id 5CE52B11C; Fri, 2 Aug 2019 22:01:05 +0000 (UTC) From: Goldwyn Rodrigues To: linux-fsdevel@vger.kernel.org Cc: linux-btrfs@vger.kernel.org, hch@lst.de, darrick.wong@oracle.com, ruansy.fnst@cn.fujitsu.com, Goldwyn Rodrigues Subject: [PATCH 06/13] btrfs: 
remove buffered write code made unnecessary Date: Fri, 2 Aug 2019 17:00:41 -0500 Message-Id: <20190802220048.16142-7-rgoldwyn@suse.de> X-Mailer: git-send-email 2.16.4 In-Reply-To: <20190802220048.16142-1-rgoldwyn@suse.de> References: <20190802220048.16142-1-rgoldwyn@suse.de> Sender: linux-btrfs-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-btrfs@vger.kernel.org X-Virus-Scanned: ClamAV using ClamSMTP From: Goldwyn Rodrigues Better done in a separate patch to keep the main patch short(er) Signed-off-by: Goldwyn Rodrigues --- fs/btrfs/file.c | 463 -------------------------------------------------------- 1 file changed, 463 deletions(-) diff --git a/fs/btrfs/file.c b/fs/btrfs/file.c index 0707db04d3cc..f7087e28ac08 100644 --- a/fs/btrfs/file.c +++ b/fs/btrfs/file.c @@ -390,79 +390,6 @@ int btrfs_run_defrag_inodes(struct btrfs_fs_info *fs_info) return 0; } -/* simple helper to fault in pages and copy. This should go away - * and be replaced with calls into generic code. - */ -static noinline int btrfs_copy_from_user(loff_t pos, size_t write_bytes, - struct page **prepared_pages, - struct iov_iter *i) -{ - size_t copied = 0; - size_t total_copied = 0; - int pg = 0; - int offset = offset_in_page(pos); - - while (write_bytes > 0) { - size_t count = min_t(size_t, - PAGE_SIZE - offset, write_bytes); - struct page *page = prepared_pages[pg]; - /* - * Copy data from userspace to the current page - */ - copied = iov_iter_copy_from_user_atomic(page, i, offset, count); - - /* Flush processor's dcache for this page */ - flush_dcache_page(page); - - /* - * if we get a partial write, we can end up with - * partially up to date pages. These add - * a lot of complexity, so make sure they don't - * happen by forcing this copy to be retried. - * - * The rest of the btrfs_file_write code will fall - * back to page at a time copies after we return 0. 
- */ - if (!PageUptodate(page) && copied < count) - copied = 0; - - iov_iter_advance(i, copied); - write_bytes -= copied; - total_copied += copied; - - /* Return to btrfs_file_write_iter to fault page */ - if (unlikely(copied == 0)) - break; - - if (copied < PAGE_SIZE - offset) { - offset += copied; - } else { - pg++; - offset = 0; - } - } - return total_copied; -} - -/* - * unlocks pages after btrfs_file_write is done with them - */ -static void btrfs_drop_pages(struct page **pages, size_t num_pages) -{ - size_t i; - for (i = 0; i < num_pages; i++) { - /* page checked is some magic around finding pages that - * have been modified without going through btrfs_set_page_dirty - * clear it here. There should be no need to mark the pages - * accessed as prepare_pages should have marked them accessed - * in prepare_pages via find_or_create_page() - */ - ClearPageChecked(pages[i]); - unlock_page(pages[i]); - put_page(pages[i]); - } -} - static int btrfs_find_new_delalloc_bytes(struct btrfs_inode *inode, const u64 start, const u64 len, @@ -1387,164 +1314,6 @@ int btrfs_mark_extent_written(struct btrfs_trans_handle *trans, return 0; } -/* - * on error we return an unlocked page and the error value - * on success we return a locked page and 0 - */ -static int prepare_uptodate_page(struct inode *inode, - struct page *page, u64 pos, - bool force_uptodate) -{ - int ret = 0; - - if (((pos & (PAGE_SIZE - 1)) || force_uptodate) && - !PageUptodate(page)) { - ret = btrfs_readpage(NULL, page); - if (ret) - return ret; - lock_page(page); - if (!PageUptodate(page)) { - unlock_page(page); - return -EIO; - } - if (page->mapping != inode->i_mapping) { - unlock_page(page); - return -EAGAIN; - } - } - return 0; -} - -/* - * this just gets pages into the page cache and locks them down. 
- */ -static noinline int prepare_pages(struct inode *inode, struct page **pages, - size_t num_pages, loff_t pos, - size_t write_bytes, bool force_uptodate) -{ - int i; - unsigned long index = pos >> PAGE_SHIFT; - gfp_t mask = btrfs_alloc_write_mask(inode->i_mapping); - int err = 0; - int faili; - - for (i = 0; i < num_pages; i++) { -again: - pages[i] = find_or_create_page(inode->i_mapping, index + i, - mask | __GFP_WRITE); - if (!pages[i]) { - faili = i - 1; - err = -ENOMEM; - goto fail; - } - - if (i == 0) - err = prepare_uptodate_page(inode, pages[i], pos, - force_uptodate); - if (!err && i == num_pages - 1) - err = prepare_uptodate_page(inode, pages[i], - pos + write_bytes, false); - if (err) { - put_page(pages[i]); - if (err == -EAGAIN) { - err = 0; - goto again; - } - faili = i - 1; - goto fail; - } - wait_on_page_writeback(pages[i]); - } - - return 0; -fail: - while (faili >= 0) { - unlock_page(pages[faili]); - put_page(pages[faili]); - faili--; - } - return err; - -} - -/* - * This function locks the extent and properly waits for data=ordered extents - * to finish before allowing the pages to be modified if need. 
- * - * The return value: - * 1 - the extent is locked - * 0 - the extent is not locked, and everything is OK - * -EAGAIN - need re-prepare the pages - * the other < 0 number - Something wrong happens - */ -static noinline int -lock_and_cleanup_extent_if_need(struct btrfs_inode *inode, struct page **pages, - size_t num_pages, loff_t pos, - size_t write_bytes, - u64 *lockstart, u64 *lockend, - struct extent_state **cached_state) -{ - struct btrfs_fs_info *fs_info = inode->root->fs_info; - u64 start_pos; - u64 last_pos; - int i; - int ret = 0; - - start_pos = round_down(pos, fs_info->sectorsize); - last_pos = start_pos - + round_up(pos + write_bytes - start_pos, - fs_info->sectorsize) - 1; - - if (start_pos < inode->vfs_inode.i_size) { - struct btrfs_ordered_extent *ordered; - - lock_extent_bits(&inode->io_tree, start_pos, last_pos, - cached_state); - ordered = btrfs_lookup_ordered_range(inode, start_pos, - last_pos - start_pos + 1); - if (ordered && - ordered->file_offset + ordered->len > start_pos && - ordered->file_offset <= last_pos) { - unlock_extent_cached(&inode->io_tree, start_pos, - last_pos, cached_state); - for (i = 0; i < num_pages; i++) { - unlock_page(pages[i]); - put_page(pages[i]); - } - btrfs_start_ordered_extent(&inode->vfs_inode, - ordered, 1); - btrfs_put_ordered_extent(ordered); - return -EAGAIN; - } - if (ordered) - btrfs_put_ordered_extent(ordered); - - *lockstart = start_pos; - *lockend = last_pos; - ret = 1; - } - - /* - * It's possible the pages are dirty right now, but we don't want - * to clean them yet because copy_from_user may catch a page fault - * and we might have to fall back to one page at a time. If that - * happens, we'll unlock these pages and we'd have a window where - * reclaim could sneak in and drop the once-dirty page on the floor - * without writing it. - * - * We have the pages locked and the extent range locked, so there's - * no way someone can start IO on any dirty pages in this range. 
- * - * We'll call btrfs_dirty_pages() later on, and that will flip around - * delalloc bits and dirty the pages as required. - */ - for (i = 0; i < num_pages; i++) { - WARN_ON(!PageLocked(pages[i])); - } - - return ret; -} - static noinline int check_can_nocow(struct btrfs_inode *inode, loff_t pos, size_t *write_bytes) { @@ -1581,238 +1350,6 @@ static noinline int check_can_nocow(struct btrfs_inode *inode, loff_t pos, return ret; } -static noinline ssize_t btrfs_buffered_write(struct kiocb *iocb, - struct iov_iter *i) -{ - struct file *file = iocb->ki_filp; - loff_t pos = iocb->ki_pos; - struct inode *inode = file_inode(file); - struct btrfs_fs_info *fs_info = btrfs_sb(inode->i_sb); - struct btrfs_root *root = BTRFS_I(inode)->root; - struct page **pages = NULL; - struct extent_state *cached_state = NULL; - struct extent_changeset *data_reserved = NULL; - u64 release_bytes = 0; - u64 lockstart; - u64 lockend; - size_t num_written = 0; - int nrptrs; - int ret = 0; - bool only_release_metadata = false; - bool force_page_uptodate = false; - - nrptrs = min(DIV_ROUND_UP(iov_iter_count(i), PAGE_SIZE), - PAGE_SIZE / (sizeof(struct page *))); - nrptrs = min(nrptrs, current->nr_dirtied_pause - current->nr_dirtied); - nrptrs = max(nrptrs, 8); - pages = kmalloc_array(nrptrs, sizeof(struct page *), GFP_KERNEL); - if (!pages) - return -ENOMEM; - - while (iov_iter_count(i) > 0) { - size_t offset = offset_in_page(pos); - size_t sector_offset; - size_t write_bytes = min(iov_iter_count(i), - nrptrs * (size_t)PAGE_SIZE - - offset); - size_t num_pages = DIV_ROUND_UP(write_bytes + offset, - PAGE_SIZE); - size_t reserve_bytes; - size_t dirty_pages; - size_t copied; - size_t dirty_sectors; - size_t num_sectors; - int extents_locked; - - WARN_ON(num_pages > nrptrs); - - /* - * Fault pages before locking them in prepare_pages - * to avoid recursive lock - */ - if (unlikely(iov_iter_fault_in_readable(i, write_bytes))) { - ret = -EFAULT; - break; - } - - sector_offset = pos & 
(fs_info->sectorsize - 1); - reserve_bytes = round_up(write_bytes + sector_offset, - fs_info->sectorsize); - - extent_changeset_release(data_reserved); - ret = btrfs_check_data_free_space(inode, &data_reserved, pos, - write_bytes); - if (ret < 0) { - if ((BTRFS_I(inode)->flags & (BTRFS_INODE_NODATACOW | - BTRFS_INODE_PREALLOC)) && - check_can_nocow(BTRFS_I(inode), pos, - &write_bytes) > 0) { - /* - * For nodata cow case, no need to reserve - * data space. - */ - only_release_metadata = true; - /* - * our prealloc extent may be smaller than - * write_bytes, so scale down. - */ - num_pages = DIV_ROUND_UP(write_bytes + offset, - PAGE_SIZE); - reserve_bytes = round_up(write_bytes + - sector_offset, - fs_info->sectorsize); - } else { - break; - } - } - - WARN_ON(reserve_bytes == 0); - ret = btrfs_delalloc_reserve_metadata(BTRFS_I(inode), - reserve_bytes); - if (ret) { - if (!only_release_metadata) - btrfs_free_reserved_data_space(inode, - data_reserved, pos, - write_bytes); - else - btrfs_end_write_no_snapshotting(root); - break; - } - - release_bytes = reserve_bytes; -again: - /* - * This is going to setup the pages array with the number of - * pages we want, so we don't really need to worry about the - * contents of pages from loop to loop - */ - ret = prepare_pages(inode, pages, num_pages, - pos, write_bytes, - force_page_uptodate); - if (ret) { - btrfs_delalloc_release_extents(BTRFS_I(inode), - reserve_bytes, true); - break; - } - - extents_locked = lock_and_cleanup_extent_if_need( - BTRFS_I(inode), pages, - num_pages, pos, write_bytes, &lockstart, - &lockend, &cached_state); - if (extents_locked < 0) { - if (extents_locked == -EAGAIN) - goto again; - btrfs_delalloc_release_extents(BTRFS_I(inode), - reserve_bytes, true); - ret = extents_locked; - break; - } - - copied = btrfs_copy_from_user(pos, write_bytes, pages, i); - - num_sectors = BTRFS_BYTES_TO_BLKS(fs_info, reserve_bytes); - dirty_sectors = round_up(copied + sector_offset, - fs_info->sectorsize); - 
dirty_sectors = BTRFS_BYTES_TO_BLKS(fs_info, dirty_sectors); - - /* - * if we have trouble faulting in the pages, fall - * back to one page at a time - */ - if (copied < write_bytes) - nrptrs = 1; - - if (copied == 0) { - force_page_uptodate = true; - dirty_sectors = 0; - dirty_pages = 0; - } else { - force_page_uptodate = false; - dirty_pages = DIV_ROUND_UP(copied + offset, - PAGE_SIZE); - } - - if (num_sectors > dirty_sectors) { - /* release everything except the sectors we dirtied */ - release_bytes -= dirty_sectors << - fs_info->sb->s_blocksize_bits; - if (only_release_metadata) { - btrfs_delalloc_release_metadata(BTRFS_I(inode), - release_bytes, true); - } else { - u64 __pos; - - __pos = round_down(pos, - fs_info->sectorsize) + - (dirty_pages << PAGE_SHIFT); - btrfs_delalloc_release_space(inode, - data_reserved, __pos, - release_bytes, true); - } - } - - release_bytes = round_up(copied + sector_offset, - fs_info->sectorsize); - - if (copied > 0) - ret = btrfs_dirty_pages(inode, pages, dirty_pages, - pos, copied, &cached_state); - if (extents_locked) - unlock_extent_cached(&BTRFS_I(inode)->io_tree, - lockstart, lockend, &cached_state); - btrfs_delalloc_release_extents(BTRFS_I(inode), reserve_bytes, - true); - if (ret) { - btrfs_drop_pages(pages, num_pages); - break; - } - - release_bytes = 0; - if (only_release_metadata) - btrfs_end_write_no_snapshotting(root); - - if (only_release_metadata && copied > 0) { - lockstart = round_down(pos, - fs_info->sectorsize); - lockend = round_up(pos + copied, - fs_info->sectorsize) - 1; - - set_extent_bit(&BTRFS_I(inode)->io_tree, lockstart, - lockend, EXTENT_NORESERVE, NULL, - NULL, GFP_NOFS); - only_release_metadata = false; - } - - btrfs_drop_pages(pages, num_pages); - - cond_resched(); - - balance_dirty_pages_ratelimited(inode->i_mapping); - if (dirty_pages < (fs_info->nodesize >> PAGE_SHIFT) + 1) - btrfs_btree_balance_dirty(fs_info); - - pos += copied; - num_written += copied; - } - - kfree(pages); - - if (release_bytes) 
{ - if (only_release_metadata) { - btrfs_end_write_no_snapshotting(root); - btrfs_delalloc_release_metadata(BTRFS_I(inode), - release_bytes, true); - } else { - btrfs_delalloc_release_space(inode, data_reserved, - round_down(pos, fs_info->sectorsize), - release_bytes, true); - } - } - - extent_changeset_free(data_reserved); - return num_written ? num_written : ret; -} - static ssize_t __btrfs_direct_write(struct kiocb *iocb, struct iov_iter *from) { struct file *file = iocb->ki_filp; From patchwork Fri Aug 2 22:00:42 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Goldwyn Rodrigues X-Patchwork-Id: 11074179 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 22B57912 for ; Fri, 2 Aug 2019 22:01:21 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 1505B286BE for ; Fri, 2 Aug 2019 22:01:21 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id 090BB288DA; Fri, 2 Aug 2019 22:01:21 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-7.9 required=2.0 tests=BAYES_00,MAILING_LIST_MULTI, RCVD_IN_DNSWL_HI autolearn=ham version=3.3.1 Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id A097D286BE for ; Fri, 2 Aug 2019 22:01:20 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S2437141AbfHBWBT (ORCPT ); Fri, 2 Aug 2019 18:01:19 -0400 Received: from mx2.suse.de ([195.135.220.15]:38062 "EHLO mx1.suse.de" rhost-flags-OK-OK-OK-FAIL) by vger.kernel.org with ESMTP id S2437124AbfHBWBI (ORCPT ); Fri, 2 Aug 2019 18:01:08 -0400 X-Virus-Scanned: by amavisd-new at test-mx.suse.de Received: from 
relay2.suse.de (unknown [195.135.220.254]) by mx1.suse.de (Postfix) with ESMTP id 62A5FB11B; Fri, 2 Aug 2019 22:01:07 +0000 (UTC) From: Goldwyn Rodrigues To: linux-fsdevel@vger.kernel.org Cc: linux-btrfs@vger.kernel.org, hch@lst.de, darrick.wong@oracle.com, ruansy.fnst@cn.fujitsu.com, Goldwyn Rodrigues Subject: [PATCH 07/13] btrfs: basic direct read operation Date: Fri, 2 Aug 2019 17:00:42 -0500 Message-Id: <20190802220048.16142-8-rgoldwyn@suse.de> X-Mailer: git-send-email 2.16.4 In-Reply-To: <20190802220048.16142-1-rgoldwyn@suse.de> References: <20190802220048.16142-1-rgoldwyn@suse.de> Sender: linux-btrfs-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-btrfs@vger.kernel.org X-Virus-Scanned: ClamAV using ClamSMTP From: Goldwyn Rodrigues Add btrfs_dio_iomap_ops for the iomap_begin() function. To accommodate DIO reads, add a new function btrfs_file_read_iter(), which calls btrfs_dio_iomap_read() for DIO reads and falls back to generic_file_read_iter() otherwise. Signed-off-by: Goldwyn Rodrigues --- fs/btrfs/ctree.h | 2 ++ fs/btrfs/file.c | 10 +++++++++- fs/btrfs/iomap.c | 20 ++++++++++++++++++++ 3 files changed, 31 insertions(+), 1 deletion(-) diff --git a/fs/btrfs/ctree.h b/fs/btrfs/ctree.h index 7a4ff524dc77..9eca2d576dd1 100644 --- a/fs/btrfs/ctree.h +++ b/fs/btrfs/ctree.h @@ -3247,7 +3247,9 @@ int btrfs_fdatawrite_range(struct inode *inode, loff_t start, loff_t end); loff_t btrfs_remap_file_range(struct file *file_in, loff_t pos_in, struct file *file_out, loff_t pos_out, loff_t len, unsigned int remap_flags); +/* iomap.c */ size_t btrfs_buffered_iomap_write(struct kiocb *iocb, struct iov_iter *from); +ssize_t btrfs_dio_iomap_read(struct kiocb *iocb, struct iov_iter *to); /* tree-defrag.c */ int btrfs_defrag_leaves(struct btrfs_trans_handle *trans, diff --git a/fs/btrfs/file.c b/fs/btrfs/file.c index f7087e28ac08..997eb152a35a 100644 --- a/fs/btrfs/file.c +++ b/fs/btrfs/file.c @@ -2839,9 +2839,17 @@ static int btrfs_file_open(struct inode
*inode, struct file *filp) return generic_file_open(inode, filp); } +static ssize_t btrfs_file_read_iter(struct kiocb *iocb, struct iov_iter *to) +{ + if (iocb->ki_flags & IOCB_DIRECT) + return btrfs_dio_iomap_read(iocb, to); + + return generic_file_read_iter(iocb, to); +} + const struct file_operations btrfs_file_operations = { .llseek = btrfs_file_llseek, - .read_iter = generic_file_read_iter, + .read_iter = btrfs_file_read_iter, .splice_read = generic_file_splice_read, .write_iter = btrfs_file_write_iter, .mmap = btrfs_file_mmap, diff --git a/fs/btrfs/iomap.c b/fs/btrfs/iomap.c index 879038e2f1a0..36df606fc028 100644 --- a/fs/btrfs/iomap.c +++ b/fs/btrfs/iomap.c @@ -420,3 +420,23 @@ size_t btrfs_buffered_iomap_write(struct kiocb *iocb, struct iov_iter *from) return written; } +static int btrfs_dio_iomap_begin(struct inode *inode, loff_t pos, + loff_t length, unsigned flags, struct iomap *iomap, + struct iomap *srcmap) +{ + return get_iomap(inode, pos, length, iomap); +} + +static const struct iomap_ops btrfs_dio_iomap_ops = { + .iomap_begin = btrfs_dio_iomap_begin, +}; + +ssize_t btrfs_dio_iomap_read(struct kiocb *iocb, struct iov_iter *to) +{ + struct inode *inode = file_inode(iocb->ki_filp); + ssize_t ret; + inode_lock_shared(inode); + ret = iomap_dio_rw(iocb, to, &btrfs_dio_iomap_ops, NULL); + inode_unlock_shared(inode); + return ret; +} From patchwork Fri Aug 2 22:00:43 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Goldwyn Rodrigues X-Patchwork-Id: 11074175 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 49397186E for ; Fri, 2 Aug 2019 22:01:13 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 3A24F286DF for ; Fri, 2 Aug 2019 22:01:13 +0000 (UTC) Received: by 
mail.wl.linuxfoundation.org (Postfix, from userid 486) id 2E967288DA; Fri, 2 Aug 2019 22:01:13 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-7.9 required=2.0 tests=BAYES_00,MAILING_LIST_MULTI, RCVD_IN_DNSWL_HI autolearn=unavailable version=3.3.1 Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id D087B286DF for ; Fri, 2 Aug 2019 22:01:12 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S2437133AbfHBWBL (ORCPT ); Fri, 2 Aug 2019 18:01:11 -0400 Received: from mx2.suse.de ([195.135.220.15]:38082 "EHLO mx1.suse.de" rhost-flags-OK-OK-OK-FAIL) by vger.kernel.org with ESMTP id S2437128AbfHBWBK (ORCPT ); Fri, 2 Aug 2019 18:01:10 -0400 X-Virus-Scanned: by amavisd-new at test-mx.suse.de Received: from relay2.suse.de (unknown [195.135.220.254]) by mx1.suse.de (Postfix) with ESMTP id 663C4B12E; Fri, 2 Aug 2019 22:01:09 +0000 (UTC) From: Goldwyn Rodrigues To: linux-fsdevel@vger.kernel.org Cc: linux-btrfs@vger.kernel.org, hch@lst.de, darrick.wong@oracle.com, ruansy.fnst@cn.fujitsu.com, Goldwyn Rodrigues Subject: [PATCH 08/13] btrfs: Carve out btrfs_get_extent_map_write() out of btrfs_get_blocks_write() Date: Fri, 2 Aug 2019 17:00:43 -0500 Message-Id: <20190802220048.16142-9-rgoldwyn@suse.de> X-Mailer: git-send-email 2.16.4 In-Reply-To: <20190802220048.16142-1-rgoldwyn@suse.de> References: <20190802220048.16142-1-rgoldwyn@suse.de> Sender: linux-btrfs-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-btrfs@vger.kernel.org X-Virus-Scanned: ClamAV using ClamSMTP From: Goldwyn Rodrigues This makes btrfs_get_extent_map_write() independent of Direct I/O code. 
Signed-off-by: Goldwyn Rodrigues --- fs/btrfs/ctree.h | 2 ++ fs/btrfs/inode.c | 40 +++++++++++++++++++++++++++------------- 2 files changed, 29 insertions(+), 13 deletions(-) diff --git a/fs/btrfs/ctree.h b/fs/btrfs/ctree.h index 9eca2d576dd1..66232cbc2414 100644 --- a/fs/btrfs/ctree.h +++ b/fs/btrfs/ctree.h @@ -3172,6 +3172,8 @@ struct inode *btrfs_iget_path(struct super_block *s, struct btrfs_key *location, struct btrfs_path *path); struct inode *btrfs_iget(struct super_block *s, struct btrfs_key *location, struct btrfs_root *root, int *was_new); +int btrfs_get_extent_map_write(struct extent_map **map, struct buffer_head *bh, + struct inode *inode, u64 start, u64 len); struct extent_map *btrfs_get_extent(struct btrfs_inode *inode, struct page *page, size_t pg_offset, u64 start, u64 end, int create); diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c index 258bacefdf5f..24895793fd91 100644 --- a/fs/btrfs/inode.c +++ b/fs/btrfs/inode.c @@ -7592,11 +7592,10 @@ static int btrfs_get_blocks_direct_read(struct extent_map *em, return 0; } -static int btrfs_get_blocks_direct_write(struct extent_map **map, - struct buffer_head *bh_result, - struct inode *inode, - struct btrfs_dio_data *dio_data, - u64 start, u64 len) +int btrfs_get_extent_map_write(struct extent_map **map, + struct buffer_head *bh, + struct inode *inode, + u64 start, u64 len) { struct btrfs_fs_info *fs_info = btrfs_sb(inode->i_sb); struct extent_map *em = *map; @@ -7650,22 +7649,38 @@ static int btrfs_get_blocks_direct_write(struct extent_map **map, */ btrfs_free_reserved_data_space_noquota(inode, start, len); - goto skip_cow; + /* skip COW */ + goto out; } } /* this will cow the extent */ - len = bh_result->b_size; + if (bh) + len = bh->b_size; free_extent_map(em); *map = em = btrfs_new_extent_direct(inode, start, len); - if (IS_ERR(em)) { - ret = PTR_ERR(em); - goto out; - } + if (IS_ERR(em)) + return PTR_ERR(em); +out: + return ret; +} +static int btrfs_get_blocks_direct_write(struct extent_map **map, + 
struct buffer_head *bh_result, + struct inode *inode, + struct btrfs_dio_data *dio_data, + u64 start, u64 len) +{ + int ret; + struct extent_map *em; + + ret = btrfs_get_extent_map_write(map, bh_result, inode, + start, len); + if (ret < 0) + return ret; + em = *map; len = min(len, em->len - (start - em->start)); -skip_cow: bh_result->b_blocknr = (em->block_start + (start - em->start)) >> inode->i_blkbits; bh_result->b_size = len; @@ -7686,7 +7701,6 @@ static int btrfs_get_blocks_direct_write(struct extent_map **map, dio_data->reserve -= len; dio_data->unsubmitted_oe_range_end = start + len; current->journal_info = dio_data; -out: return ret; } From patchwork Fri Aug 2 22:00:44 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Goldwyn Rodrigues X-Patchwork-Id: 11074199 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 1AFE2186E for ; Fri, 2 Aug 2019 22:01:31 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 0BF82286DF for ; Fri, 2 Aug 2019 22:01:31 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id 002F6288DA; Fri, 2 Aug 2019 22:01:30 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-7.9 required=2.0 tests=BAYES_00,MAILING_LIST_MULTI, RCVD_IN_DNSWL_HI autolearn=unavailable version=3.3.1 Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 9F61E286E0 for ; Fri, 2 Aug 2019 22:01:30 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S2437152AbfHBWBX (ORCPT ); Fri, 2 Aug 2019 18:01:23 -0400 Received: from mx2.suse.de ([195.135.220.15]:38088 "EHLO mx1.suse.de" 
rhost-flags-OK-OK-OK-FAIL) by vger.kernel.org with ESMTP id S2437134AbfHBWBN (ORCPT ); Fri, 2 Aug 2019 18:01:13 -0400 X-Virus-Scanned: by amavisd-new at test-mx.suse.de Received: from relay2.suse.de (unknown [195.135.220.254]) by mx1.suse.de (Postfix) with ESMTP id 6FDDFB60A; Fri, 2 Aug 2019 22:01:11 +0000 (UTC) From: Goldwyn Rodrigues To: linux-fsdevel@vger.kernel.org Cc: linux-btrfs@vger.kernel.org, hch@lst.de, darrick.wong@oracle.com, ruansy.fnst@cn.fujitsu.com, Goldwyn Rodrigues Subject: [PATCH 09/13] btrfs: Rename __endio_write_update_ordered() to btrfs_update_ordered_extent() Date: Fri, 2 Aug 2019 17:00:44 -0500 Message-Id: <20190802220048.16142-10-rgoldwyn@suse.de> X-Mailer: git-send-email 2.16.4 In-Reply-To: <20190802220048.16142-1-rgoldwyn@suse.de> References: <20190802220048.16142-1-rgoldwyn@suse.de> Sender: linux-btrfs-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-btrfs@vger.kernel.org X-Virus-Scanned: ClamAV using ClamSMTP From: Goldwyn Rodrigues Since the function will be called from another file, give it a better name and make it non-static. Signed-off-by: Goldwyn Rodrigues --- fs/btrfs/ctree.h | 7 +++++-- fs/btrfs/inode.c | 14 +++++--------- 2 files changed, 10 insertions(+), 11 deletions(-) diff --git a/fs/btrfs/ctree.h b/fs/btrfs/ctree.h index 66232cbc2414..b8b19647b43e 100644 --- a/fs/btrfs/ctree.h +++ b/fs/btrfs/ctree.h @@ -3175,8 +3175,11 @@ struct inode *btrfs_iget(struct super_block *s, struct btrfs_key *location, int btrfs_get_extent_map_write(struct extent_map **map, struct buffer_head *bh, struct inode *inode, u64 start, u64 len); struct extent_map *btrfs_get_extent(struct btrfs_inode *inode, - struct page *page, size_t pg_offset, - u64 start, u64 end, int create); + struct page *page, size_t pg_offset, + u64 start, u64 end, int create); +void btrfs_update_ordered_extent(struct inode *inode, + const u64 offset, const u64 bytes, + const bool uptodate); int btrfs_update_inode(struct btrfs_trans_handle *trans, struct btrfs_root
*root, struct inode *inode); diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c index 24895793fd91..d415534ce733 100644 --- a/fs/btrfs/inode.c +++ b/fs/btrfs/inode.c @@ -89,10 +89,6 @@ static struct extent_map *create_io_em(struct inode *inode, u64 start, u64 len, u64 ram_bytes, int compress_type, int type); -static void __endio_write_update_ordered(struct inode *inode, - const u64 offset, const u64 bytes, - const bool uptodate); - /* * Cleanup all submitted ordered extents in specified range to handle errors * from the btrfs_run_delalloc_range() callback. @@ -133,7 +129,7 @@ static inline void btrfs_cleanup_ordered_extents(struct inode *inode, bytes -= PAGE_SIZE; } - return __endio_write_update_ordered(inode, offset, bytes, false); + return btrfs_update_ordered_extent(inode, offset, bytes, false); } static int btrfs_dirty_inode(struct inode *inode); @@ -8176,7 +8172,7 @@ static void btrfs_endio_direct_read(struct bio *bio) bio_put(bio); } -static void __endio_write_update_ordered(struct inode *inode, +void btrfs_update_ordered_extent(struct inode *inode, const u64 offset, const u64 bytes, const bool uptodate) { @@ -8229,7 +8225,7 @@ static void btrfs_endio_direct_write(struct bio *bio) struct btrfs_dio_private *dip = bio->bi_private; struct bio *dio_bio = dip->dio_bio; - __endio_write_update_ordered(dip->inode, dip->logical_offset, + btrfs_update_ordered_extent(dip->inode, dip->logical_offset, dip->bytes, !bio->bi_status); kfree(dip); @@ -8546,7 +8542,7 @@ static void btrfs_submit_direct(struct bio *dio_bio, struct inode *inode, bio = NULL; } else { if (write) - __endio_write_update_ordered(inode, + btrfs_update_ordered_extent(inode, file_offset, dio_bio->bi_iter.bi_size, false); @@ -8686,7 +8682,7 @@ static ssize_t btrfs_direct_IO(struct kiocb *iocb, struct iov_iter *iter) */ if (dio_data.unsubmitted_oe_range_start < dio_data.unsubmitted_oe_range_end) - __endio_write_update_ordered(inode, + btrfs_update_ordered_extent(inode, dio_data.unsubmitted_oe_range_start, 
dio_data.unsubmitted_oe_range_end - dio_data.unsubmitted_oe_range_start, From patchwork Fri Aug 2 22:00:45 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Goldwyn Rodrigues X-Patchwork-Id: 11074187 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id DF605912 for ; Fri, 2 Aug 2019 22:01:25 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id D2F41286DF for ; Fri, 2 Aug 2019 22:01:25 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id C714C288DA; Fri, 2 Aug 2019 22:01:25 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-7.9 required=2.0 tests=BAYES_00,MAILING_LIST_MULTI, RCVD_IN_DNSWL_HI autolearn=unavailable version=3.3.1 Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 05E8D286DF for ; Fri, 2 Aug 2019 22:01:24 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S2437148AbfHBWBV (ORCPT ); Fri, 2 Aug 2019 18:01:21 -0400 Received: from mx2.suse.de ([195.135.220.15]:38098 "EHLO mx1.suse.de" rhost-flags-OK-OK-OK-FAIL) by vger.kernel.org with ESMTP id S2437128AbfHBWBO (ORCPT ); Fri, 2 Aug 2019 18:01:14 -0400 X-Virus-Scanned: by amavisd-new at test-mx.suse.de Received: from relay2.suse.de (unknown [195.135.220.254]) by mx1.suse.de (Postfix) with ESMTP id 780C3B611; Fri, 2 Aug 2019 22:01:13 +0000 (UTC) From: Goldwyn Rodrigues To: linux-fsdevel@vger.kernel.org Cc: linux-btrfs@vger.kernel.org, hch@lst.de, darrick.wong@oracle.com, ruansy.fnst@cn.fujitsu.com, Goldwyn Rodrigues Subject: [PATCH 10/13] iomap: use a function pointer for dio submits Date: Fri, 2 Aug 2019 17:00:45 -0500 
Message-Id: <20190802220048.16142-11-rgoldwyn@suse.de> X-Mailer: git-send-email 2.16.4 In-Reply-To: <20190802220048.16142-1-rgoldwyn@suse.de> References: <20190802220048.16142-1-rgoldwyn@suse.de> Sender: linux-btrfs-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-btrfs@vger.kernel.org X-Virus-Scanned: ClamAV using ClamSMTP From: Goldwyn Rodrigues This helps filesystems perform tasks on the bio while submitting it for I/O. Since btrfs requires the position we are working on, pass pos to iomap_dio_submit_bio(). The correct place for submit_io() is not page_ops. Would it be better to rename the structure to something like iomap_io_ops or put it directly under struct iomap? Signed-off-by: Goldwyn Rodrigues --- fs/iomap/direct-io.c | 16 +++++++++++----- include/linux/iomap.h | 1 + 2 files changed, 12 insertions(+), 5 deletions(-) diff --git a/fs/iomap/direct-io.c b/fs/iomap/direct-io.c index 5279029c7a3c..a802e66bf11f 100644 --- a/fs/iomap/direct-io.c +++ b/fs/iomap/direct-io.c @@ -59,7 +59,7 @@ int iomap_dio_iopoll(struct kiocb *kiocb, bool spin) EXPORT_SYMBOL_GPL(iomap_dio_iopoll); static void iomap_dio_submit_bio(struct iomap_dio *dio, struct iomap *iomap, - struct bio *bio) + struct bio *bio, loff_t pos) { atomic_inc(&dio->ref); @@ -67,7 +67,13 @@ static void iomap_dio_submit_bio(struct iomap_dio *dio, struct iomap *iomap, bio_set_polled(bio, dio->iocb); dio->submit.last_queue = bdev_get_queue(iomap->bdev); - dio->submit.cookie = submit_bio(bio); + if (iomap->page_ops && iomap->page_ops->submit_io) { + iomap->page_ops->submit_io(bio, file_inode(dio->iocb->ki_filp), + pos); + dio->submit.cookie = BLK_QC_T_NONE; + } else { + dio->submit.cookie = submit_bio(bio); + } } static ssize_t iomap_dio_complete(struct iomap_dio *dio) @@ -195,7 +201,7 @@ iomap_dio_zero(struct iomap_dio *dio, struct iomap *iomap, loff_t pos, get_page(page); __bio_add_page(bio, page, len, 0); bio_set_op_attrs(bio, REQ_OP_WRITE, flags); - iomap_dio_submit_bio(dio, iomap, bio); +
iomap_dio_submit_bio(dio, iomap, bio, pos); } static loff_t @@ -301,11 +307,11 @@ iomap_dio_bio_actor(struct inode *inode, loff_t pos, loff_t length, iov_iter_advance(dio->submit.iter, n); dio->size += n; - pos += n; copied += n; nr_pages = iov_iter_npages(&iter, BIO_MAX_PAGES); - iomap_dio_submit_bio(dio, iomap, bio); + iomap_dio_submit_bio(dio, iomap, bio, pos); + pos += n; } while (nr_pages); /* diff --git a/include/linux/iomap.h b/include/linux/iomap.h index 5b2055e8ca8a..6617e4b6fb6d 100644 --- a/include/linux/iomap.h +++ b/include/linux/iomap.h @@ -92,6 +92,7 @@ struct iomap_page_ops { struct iomap *iomap); void (*page_done)(struct inode *inode, loff_t pos, unsigned copied, struct page *page, struct iomap *iomap); + dio_submit_t *submit_io; }; /* From patchwork Fri Aug 2 22:00:46 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Goldwyn Rodrigues X-Patchwork-Id: 11074201 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 556BE1398 for ; Fri, 2 Aug 2019 22:01:31 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 4906E286BE for ; Fri, 2 Aug 2019 22:01:31 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id 3D978286E0; Fri, 2 Aug 2019 22:01:31 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-7.9 required=2.0 tests=BAYES_00,MAILING_LIST_MULTI, RCVD_IN_DNSWL_HI autolearn=unavailable version=3.3.1 Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 70CC8286BE for ; Fri, 2 Aug 2019 22:01:29 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S2437157AbfHBWB0 (ORCPT ); Fri, 
2 Aug 2019 18:01:26 -0400 Received: from mx2.suse.de ([195.135.220.15]:38112 "EHLO mx1.suse.de" rhost-flags-OK-OK-OK-FAIL) by vger.kernel.org with ESMTP id S2437135AbfHBWBR (ORCPT ); Fri, 2 Aug 2019 18:01:17 -0400 X-Virus-Scanned: by amavisd-new at test-mx.suse.de Received: from relay2.suse.de (unknown [195.135.220.254]) by mx1.suse.de (Postfix) with ESMTP id 7DBC7B613; Fri, 2 Aug 2019 22:01:15 +0000 (UTC) From: Goldwyn Rodrigues To: linux-fsdevel@vger.kernel.org Cc: linux-btrfs@vger.kernel.org, hch@lst.de, darrick.wong@oracle.com, ruansy.fnst@cn.fujitsu.com, Goldwyn Rodrigues Subject: [PATCH 11/13] btrfs: Use iomap_dio_rw for performing direct I/O writes Date: Fri, 2 Aug 2019 17:00:46 -0500 Message-Id: <20190802220048.16142-12-rgoldwyn@suse.de> X-Mailer: git-send-email 2.16.4 In-Reply-To: <20190802220048.16142-1-rgoldwyn@suse.de> References: <20190802220048.16142-1-rgoldwyn@suse.de> Sender: linux-btrfs-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-btrfs@vger.kernel.org X-Virus-Scanned: ClamAV using ClamSMTP From: Goldwyn Rodrigues Since btrfs Direct I/O needs to perform operations before submission, use the submit_io function which operates on the bio to perform checksum calculations etc. 
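For readers new to the hook, the dispatch this relies on can be modelled in userspace roughly as below. This is only a sketch under stated assumptions: every mock_* type and function is a hypothetical stand-in invented for illustration, not a kernel definition, and the "checksum" is a toy byte sum rather than what btrfs actually computes.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a bio: a payload plus bookkeeping. */
struct mock_bio {
	const unsigned char *data;
	size_t len;
	uint32_t csum;   /* filled in by a filesystem hook, if one runs */
	int submitted;   /* set once the bio reaches the mock block layer */
};

/* Hook signature (assumed): the bio and the file position it covers. */
typedef void (*mock_submit_io_t)(struct mock_bio *bio, long long pos);

/* Hypothetical ops table; submit_io is optional and may be NULL. */
struct mock_ops {
	mock_submit_io_t submit_io;
};

/* Last position seen by the hook, recorded so a caller can inspect it. */
static long long last_pos = -1;

/* Default path: stand-in for handing the bio to the block layer. */
static void mock_submit_bio(struct mock_bio *bio)
{
	bio->submitted = 1;
}

/* Example filesystem hook: do pre-submission work, then submit. */
static void mock_fs_submit(struct mock_bio *bio, long long pos)
{
	uint32_t sum = 0;

	for (size_t i = 0; i < bio->len; i++)
		sum += bio->data[i];   /* toy checksum, not crc32c */
	bio->csum = sum;
	last_pos = pos;
	mock_submit_bio(bio);
}

/* Dispatch: use the hook when one is installed, else the default path. */
static void mock_dio_submit(const struct mock_ops *ops,
			    struct mock_bio *bio, long long pos)
{
	assert(bio != NULL);
	if (ops && ops->submit_io)
		ops->submit_io(bio, pos);
	else
		mock_submit_bio(bio);
}
```

With a hook installed, mock_dio_submit() lets the filesystem touch the bio before it is queued; with a NULL ops table or a NULL submit_io, the bio goes straight to the default submit path, so callers that never set a hook see no change in behaviour.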
Signed-off-by: Goldwyn Rodrigues --- fs/btrfs/ctree.h | 3 ++ fs/btrfs/file.c | 2 +- fs/btrfs/inode.c | 14 +++-- fs/btrfs/iomap.c | 158 +++++++++++++++++++++++++++++++++++++++++++++++++++++-- 4 files changed, 165 insertions(+), 12 deletions(-) diff --git a/fs/btrfs/ctree.h b/fs/btrfs/ctree.h index b8b19647b43e..3b7a6ddceed6 100644 --- a/fs/btrfs/ctree.h +++ b/fs/btrfs/ctree.h @@ -3206,6 +3206,8 @@ int btrfs_writepage_cow_fixup(struct page *page, u64 start, u64 end); void btrfs_writepage_endio_finish_ordered(struct page *page, u64 start, u64 end, int uptodate); extern const struct dentry_operations btrfs_dentry_operations; +void btrfs_submit_direct(struct bio *dio_bio, struct inode *inode, + loff_t file_offset); /* ioctl.c */ long btrfs_ioctl(struct file *file, unsigned int cmd, unsigned long arg); @@ -3255,6 +3257,7 @@ loff_t btrfs_remap_file_range(struct file *file_in, loff_t pos_in, /* iomap.c */ size_t btrfs_buffered_iomap_write(struct kiocb *iocb, struct iov_iter *from); ssize_t btrfs_dio_iomap_read(struct kiocb *iocb, struct iov_iter *to); +ssize_t btrfs_dio_iomap_write(struct kiocb *iocb, struct iov_iter *from); /* tree-defrag.c */ int btrfs_defrag_leaves(struct btrfs_trans_handle *trans, diff --git a/fs/btrfs/file.c b/fs/btrfs/file.c index 997eb152a35a..faa5ad89469f 100644 --- a/fs/btrfs/file.c +++ b/fs/btrfs/file.c @@ -1501,7 +1501,7 @@ static ssize_t btrfs_file_write_iter(struct kiocb *iocb, atomic_inc(&BTRFS_I(inode)->sync_writers); if (iocb->ki_flags & IOCB_DIRECT) { - num_written = __btrfs_direct_write(iocb, from); + num_written = btrfs_dio_iomap_write(iocb, from); } else { num_written = btrfs_buffered_iomap_write(iocb, from); if (num_written > 0) diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c index d415534ce733..323d72858c9c 100644 --- a/fs/btrfs/inode.c +++ b/fs/btrfs/inode.c @@ -8167,9 +8167,8 @@ static void btrfs_endio_direct_read(struct bio *bio) kfree(dip); dio_bio->bi_status = err; - dio_end_io(dio_bio); + bio_endio(dio_bio); 
btrfs_io_bio_free_csum(io_bio); - bio_put(bio); } void btrfs_update_ordered_extent(struct inode *inode, @@ -8231,8 +8230,7 @@ static void btrfs_endio_direct_write(struct bio *bio) kfree(dip); dio_bio->bi_status = bio->bi_status; - dio_end_io(dio_bio); - bio_put(bio); + bio_endio(dio_bio); } static blk_status_t btrfs_submit_bio_start_direct_io(void *private_data, @@ -8464,8 +8462,8 @@ static int btrfs_submit_direct_hook(struct btrfs_dio_private *dip) return 0; } -static void btrfs_submit_direct(struct bio *dio_bio, struct inode *inode, - loff_t file_offset) +void btrfs_submit_direct(struct bio *dio_bio, struct inode *inode, + loff_t file_offset) { struct btrfs_dio_private *dip = NULL; struct bio *bio = NULL; @@ -8536,7 +8534,7 @@ static void btrfs_submit_direct(struct bio *dio_bio, struct inode *inode, /* * The end io callbacks free our dip, do the final put on bio * and all the cleanup and final put for dio_bio (through - * dio_end_io()). + * end_io()). */ dip = NULL; bio = NULL; @@ -8555,7 +8553,7 @@ static void btrfs_submit_direct(struct bio *dio_bio, struct inode *inode, * Releases and cleans up our dio_bio, no need to bio_put() * nor bio_endio()/bio_io_error() against dio_bio. 
*/ - dio_end_io(dio_bio); + bio_endio(dio_bio); } if (bio) bio_put(bio); diff --git a/fs/btrfs/iomap.c b/fs/btrfs/iomap.c index 36df606fc028..329954c8cb88 100644 --- a/fs/btrfs/iomap.c +++ b/fs/btrfs/iomap.c @@ -7,6 +7,7 @@ */ #include +#include #include "ctree.h" #include "btrfs_inode.h" #include "volumes.h" @@ -420,15 +421,127 @@ size_t btrfs_buffered_iomap_write(struct kiocb *iocb, struct iov_iter *from) return written; } +static const struct iomap_page_ops btrfs_dio_iomap_page_ops = { + .submit_io = btrfs_submit_direct, +}; + +static struct btrfs_iomap *btrfs_iomap_init(struct inode *inode, + struct extent_map **em, + loff_t pos, loff_t length) +{ + int ret = 0; + struct extent_map *map = *em; + struct btrfs_iomap *bi; + u64 num_bytes; + + bi = kzalloc(sizeof(struct btrfs_iomap), GFP_NOFS); + if (!bi) + return ERR_PTR(-ENOMEM); + + bi->start = round_down(pos, PAGE_SIZE); + bi->end = PAGE_ALIGN(pos + length) - 1; + num_bytes = bi->end - bi->start + 1; + + /* Wait for existing ordered extents in range to finish */ + btrfs_wait_ordered_range(inode, bi->start, num_bytes); + + lock_extent_bits(&BTRFS_I(inode)->io_tree, bi->start, bi->end, &bi->cached_state); + + ret = btrfs_delalloc_reserve_space(inode, &bi->data_reserved, + bi->start, num_bytes); + if (ret) { + unlock_extent_cached(&BTRFS_I(inode)->io_tree, bi->start, bi->end, + &bi->cached_state); + kfree(bi); + return ERR_PTR(ret); + } + + refcount_inc(&map->refs); + ret = btrfs_get_extent_map_write(em, NULL, + inode, bi->start, num_bytes); + if (ret) { + unlock_extent_cached(&BTRFS_I(inode)->io_tree, bi->start, bi->end, + &bi->cached_state); + btrfs_delalloc_release_space(inode, + bi->data_reserved, bi->start, + num_bytes, true); + extent_changeset_free(bi->data_reserved); + kfree(bi); + return ERR_PTR(ret); + } + free_extent_map(map); + return bi; +} + static int btrfs_dio_iomap_begin(struct inode *inode, loff_t pos, - loff_t length, unsigned flags, struct iomap *iomap, - struct iomap *srcmap) + loff_t length, 
unsigned flags, struct iomap *iomap, + struct iomap *srcmap) { - return get_iomap(inode, pos, length, iomap); + struct extent_map *em; + struct btrfs_iomap *bi = NULL; + + em = btrfs_get_extent(BTRFS_I(inode), NULL, 0, pos, length, 0); + + if (flags & IOMAP_WRITE) { + srcmap->offset = em->start; + srcmap->length = em->len; + srcmap->bdev = em->bdev; + if (em->block_start == EXTENT_MAP_HOLE) { + srcmap->type = IOMAP_HOLE; + } else { + srcmap->type = IOMAP_MAPPED; + srcmap->addr = em->block_start; + } + bi = btrfs_iomap_init(inode, &em, pos, length); + if (IS_ERR(bi)) + return PTR_ERR(bi); + } + + iomap->offset = em->start; + iomap->length = em->len; + iomap->bdev = em->bdev; + + if (em->block_start == EXTENT_MAP_HOLE) { + iomap->type = IOMAP_HOLE; + } else { + iomap->type = IOMAP_MAPPED; + iomap->addr = em->block_start; + } + iomap->private = bi; + iomap->page_ops = &btrfs_dio_iomap_page_ops; + return 0; +} + +static int btrfs_dio_iomap_end(struct inode *inode, loff_t pos, + loff_t length, ssize_t written, unsigned flags, + struct iomap *iomap) +{ + struct btrfs_iomap *bi = iomap->private; + u64 wend; + loff_t release_bytes; + + if (!bi) + return 0; + + unlock_extent_cached(&BTRFS_I(inode)->io_tree, bi->start, bi->end, + &bi->cached_state); + + wend = PAGE_ALIGN(pos + written); + release_bytes = wend - bi->end - 1; + if (release_bytes > 0) + btrfs_delalloc_release_space(inode, + bi->data_reserved, wend, + release_bytes, true); + + btrfs_delalloc_release_extents(BTRFS_I(inode), wend - bi->start, false); + extent_changeset_free(bi->data_reserved); + kfree(bi); + return 0; } static const struct iomap_ops btrfs_dio_iomap_ops = { .iomap_begin = btrfs_dio_iomap_begin, + .iomap_end = btrfs_dio_iomap_end, }; ssize_t btrfs_dio_iomap_read(struct kiocb *iocb, struct iov_iter *to) @@ -440,3 +553,42 @@ ssize_t btrfs_dio_iomap_read(struct kiocb *iocb, struct iov_iter *to) inode_unlock_shared(inode); return ret; } + +ssize_t btrfs_dio_iomap_write(struct kiocb *iocb, struct 
iov_iter *from) +{ + struct file *file = iocb->ki_filp; + struct inode *inode = file_inode(file); + ssize_t written, written_buffered; + loff_t pos, endbyte; + int err; + + written = iomap_dio_rw(iocb, from, &btrfs_dio_iomap_ops, NULL); + if (written < 0 || !iov_iter_count(from)) + return written; + + pos = iocb->ki_pos; + written_buffered = btrfs_buffered_iomap_write(iocb, from); + if (written_buffered < 0) { + err = written_buffered; + goto out; + } + /* + * Ensure all data is persisted. We want the next direct IO read to be + * able to read what was just written. + */ + endbyte = pos + written_buffered - 1; + err = btrfs_fdatawrite_range(inode, pos, endbyte); + if (err) + goto out; + err = filemap_fdatawait_range(inode->i_mapping, pos, endbyte); + if (err) + goto out; + written += written_buffered; + iocb->ki_pos = pos + written_buffered; + invalidate_mapping_pages(file->f_mapping, pos >> PAGE_SHIFT, + endbyte >> PAGE_SHIFT); +out: + if (written > 0 && iocb->ki_pos > i_size_read(inode)) + i_size_write(inode, iocb->ki_pos); + return written ? 
written : err; +} From patchwork Fri Aug 2 22:00:47 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Goldwyn Rodrigues X-Patchwork-Id: 11074193 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 8144C912 for ; Fri, 2 Aug 2019 22:01:28 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 71CD7286BE for ; Fri, 2 Aug 2019 22:01:28 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id 65D84286DF; Fri, 2 Aug 2019 22:01:28 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-7.9 required=2.0 tests=BAYES_00,MAILING_LIST_MULTI, RCVD_IN_DNSWL_HI autolearn=unavailable version=3.3.1 Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id C9792286E0 for ; Fri, 2 Aug 2019 22:01:27 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S2437160AbfHBWB1 (ORCPT ); Fri, 2 Aug 2019 18:01:27 -0400 Received: from mx2.suse.de ([195.135.220.15]:38134 "EHLO mx1.suse.de" rhost-flags-OK-OK-OK-FAIL) by vger.kernel.org with ESMTP id S2437140AbfHBWBT (ORCPT ); Fri, 2 Aug 2019 18:01:19 -0400 X-Virus-Scanned: by amavisd-new at test-mx.suse.de Received: from relay2.suse.de (unknown [195.135.220.254]) by mx1.suse.de (Postfix) with ESMTP id 87B6EB618; Fri, 2 Aug 2019 22:01:17 +0000 (UTC) From: Goldwyn Rodrigues To: linux-fsdevel@vger.kernel.org Cc: linux-btrfs@vger.kernel.org, hch@lst.de, darrick.wong@oracle.com, ruansy.fnst@cn.fujitsu.com, Goldwyn Rodrigues Subject: [PATCH 12/13] btrfs: Remove btrfs_dio_data and __btrfs_direct_write Date: Fri, 2 Aug 2019 17:00:47 -0500 Message-Id: 
<20190802220048.16142-13-rgoldwyn@suse.de> X-Mailer: git-send-email 2.16.4 In-Reply-To: <20190802220048.16142-1-rgoldwyn@suse.de> References: <20190802220048.16142-1-rgoldwyn@suse.de> Sender: linux-btrfs-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-btrfs@vger.kernel.org X-Virus-Scanned: ClamAV using ClamSMTP From: Goldwyn Rodrigues btrfs_dio_data is unnecessary since we are now storing all information in btrfs_iomap. Advantage: We don't abuse current->journal_info anymore :) Signed-off-by: Goldwyn Rodrigues --- fs/btrfs/file.c | 40 ---------------------------- fs/btrfs/inode.c | 81 ++------------------------------------------------------ 2 files changed, 2 insertions(+), 119 deletions(-) diff --git a/fs/btrfs/file.c b/fs/btrfs/file.c index faa5ad89469f..90a5fa387986 100644 --- a/fs/btrfs/file.c +++ b/fs/btrfs/file.c @@ -1350,46 +1350,6 @@ static noinline int check_can_nocow(struct btrfs_inode *inode, loff_t pos, return ret; } -static ssize_t __btrfs_direct_write(struct kiocb *iocb, struct iov_iter *from) -{ - struct file *file = iocb->ki_filp; - struct inode *inode = file_inode(file); - loff_t pos; - ssize_t written; - ssize_t written_buffered; - loff_t endbyte; - int err; - - written = generic_file_direct_write(iocb, from); - - if (written < 0 || !iov_iter_count(from)) - return written; - - pos = iocb->ki_pos; - written_buffered = btrfs_buffered_iomap_write(iocb, from); - if (written_buffered < 0) { - err = written_buffered; - goto out; - } - /* - * Ensure all data is persisted. We want the next direct IO read to be - * able to read what was just written.
- */ - endbyte = pos + written_buffered - 1; - err = btrfs_fdatawrite_range(inode, pos, endbyte); - if (err) - goto out; - err = filemap_fdatawait_range(inode->i_mapping, pos, endbyte); - if (err) - goto out; - written += written_buffered; - iocb->ki_pos = pos + written_buffered; - invalidate_mapping_pages(file->f_mapping, pos >> PAGE_SHIFT, - endbyte >> PAGE_SHIFT); -out: - return written ? written : err; -} - static void update_time_for_write(struct inode *inode) { struct timespec64 now; diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c index 323d72858c9c..87fbe73ca2e4 100644 --- a/fs/btrfs/inode.c +++ b/fs/btrfs/inode.c @@ -54,13 +54,6 @@ struct btrfs_iget_args { struct btrfs_root *root; }; -struct btrfs_dio_data { - u64 reserve; - u64 unsubmitted_oe_range_start; - u64 unsubmitted_oe_range_end; - int overwrite; -}; - static const struct inode_operations btrfs_dir_inode_operations; static const struct inode_operations btrfs_symlink_inode_operations; static const struct inode_operations btrfs_dir_ro_inode_operations; @@ -7664,7 +7657,6 @@ int btrfs_get_extent_map_write(struct extent_map **map, static int btrfs_get_blocks_direct_write(struct extent_map **map, struct buffer_head *bh_result, struct inode *inode, - struct btrfs_dio_data *dio_data, u64 start, u64 len) { int ret; @@ -7686,17 +7678,6 @@ static int btrfs_get_blocks_direct_write(struct extent_map **map, if (!test_bit(EXTENT_FLAG_PREALLOC, &em->flags)) set_buffer_new(bh_result); - /* - * Need to update the i_size under the extent lock so buffered - * readers will get the updated i_size when we unlock. 
- */ - if (!dio_data->overwrite && start + len > i_size_read(inode)) - i_size_write(inode, start + len); - - WARN_ON(dio_data->reserve < len); - dio_data->reserve -= len; - dio_data->unsubmitted_oe_range_end = start + len; - current->journal_info = dio_data; return ret; } @@ -7706,7 +7687,6 @@ static int btrfs_get_blocks_direct(struct inode *inode, sector_t iblock, struct btrfs_fs_info *fs_info = btrfs_sb(inode->i_sb); struct extent_map *em; struct extent_state *cached_state = NULL; - struct btrfs_dio_data *dio_data = NULL; u64 start = iblock << inode->i_blkbits; u64 lockstart, lockend; u64 len = bh_result->b_size; @@ -7721,16 +7701,6 @@ static int btrfs_get_blocks_direct(struct inode *inode, sector_t iblock, lockstart = start; lockend = start + len - 1; - if (current->journal_info) { - /* - * Need to pull our outstanding extents and set journal_info to NULL so - * that anything that needs to check if there's a transaction doesn't get - * confused. - */ - dio_data = current->journal_info; - current->journal_info = NULL; - } - /* * If this errors out it's because we couldn't invalidate pagecache for * this range and we need to fallback to buffered. 
@@ -7770,7 +7740,7 @@ static int btrfs_get_blocks_direct(struct inode *inode, sector_t iblock, if (create) { ret = btrfs_get_blocks_direct_write(&em, bh_result, inode, - dio_data, start, len); + start, len); if (ret < 0) goto unlock_err; @@ -7808,8 +7778,6 @@ static int btrfs_get_blocks_direct(struct inode *inode, sector_t iblock, clear_extent_bit(&BTRFS_I(inode)->io_tree, lockstart, lockend, unlock_bits, 1, 0, &cached_state); err: - if (dio_data) - current->journal_info = dio_data; return ret; } @@ -8498,21 +8466,6 @@ void btrfs_submit_direct(struct bio *dio_bio, struct inode *inode, dip->subio_endio = btrfs_subio_endio_read; } - /* - * Reset the range for unsubmitted ordered extents (to a 0 length range) - * even if we fail to submit a bio, because in such case we do the - * corresponding error handling below and it must not be done a second - * time by btrfs_direct_IO(). - */ - if (write) { - struct btrfs_dio_data *dio_data = current->journal_info; - - dio_data->unsubmitted_oe_range_end = dip->logical_offset + - dip->bytes; - dio_data->unsubmitted_oe_range_start = - dio_data->unsubmitted_oe_range_end; - } - ret = btrfs_submit_direct_hook(dip); if (!ret) return; @@ -8598,7 +8551,6 @@ static ssize_t btrfs_direct_IO(struct kiocb *iocb, struct iov_iter *iter) struct file *file = iocb->ki_filp; struct inode *inode = file->f_mapping->host; struct btrfs_fs_info *fs_info = btrfs_sb(inode->i_sb); - struct btrfs_dio_data dio_data = { 0 }; struct extent_changeset *data_reserved = NULL; loff_t offset = iocb->ki_pos; size_t count = 0; @@ -8631,7 +8583,6 @@ static ssize_t btrfs_direct_IO(struct kiocb *iocb, struct iov_iter *iter) * not unlock the i_mutex at this case. 
*/ if (offset + count <= inode->i_size) { - dio_data.overwrite = 1; inode_unlock(inode); relock = true; } else if (iocb->ki_flags & IOCB_NOWAIT) { @@ -8643,16 +8594,6 @@ static ssize_t btrfs_direct_IO(struct kiocb *iocb, struct iov_iter *iter) if (ret) goto out; - /* - * We need to know how many extents we reserved so that we can - * do the accounting properly if we go over the number we - * originally calculated. Abuse current->journal_info for this. - */ - dio_data.reserve = round_up(count, - fs_info->sectorsize); - dio_data.unsubmitted_oe_range_start = (u64)offset; - dio_data.unsubmitted_oe_range_end = (u64)offset; - current->journal_info = &dio_data; down_read(&BTRFS_I(inode)->dio_sem); } else if (test_bit(BTRFS_INODE_READDIO_NEED_LOCK, &BTRFS_I(inode)->runtime_flags)) { @@ -8667,25 +8608,7 @@ static ssize_t btrfs_direct_IO(struct kiocb *iocb, struct iov_iter *iter) btrfs_submit_direct, flags); if (iov_iter_rw(iter) == WRITE) { up_read(&BTRFS_I(inode)->dio_sem); - current->journal_info = NULL; - if (ret < 0 && ret != -EIOCBQUEUED) { - if (dio_data.reserve) - btrfs_delalloc_release_space(inode, data_reserved, - offset, dio_data.reserve, true); - /* - * On error we might have left some ordered extents - * without submitting corresponding bios for them, so - * cleanup them up to avoid other tasks getting them - * and waiting for them to complete forever. 
- */ - if (dio_data.unsubmitted_oe_range_start < - dio_data.unsubmitted_oe_range_end) - btrfs_update_ordered_extent(inode, - dio_data.unsubmitted_oe_range_start, - dio_data.unsubmitted_oe_range_end - - dio_data.unsubmitted_oe_range_start, - false); - } else if (ret >= 0 && (size_t)ret < count) + if (ret >= 0 && (size_t)ret < count) btrfs_delalloc_release_space(inode, data_reserved, offset, count - (size_t)ret, true); btrfs_delalloc_release_extents(BTRFS_I(inode), count, false); From patchwork Fri Aug 2 22:00:48 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Goldwyn Rodrigues X-Patchwork-Id: 11074195 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 744651398 for ; Fri, 2 Aug 2019 22:01:30 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 63D7D286E0 for ; Fri, 2 Aug 2019 22:01:30 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id 58658288DD; Fri, 2 Aug 2019 22:01:30 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-7.9 required=2.0 tests=BAYES_00,MAILING_LIST_MULTI, RCVD_IN_DNSWL_HI autolearn=ham version=3.3.1 Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 7AB15288DA for ; Fri, 2 Aug 2019 22:01:28 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S2437162AbfHBWB1 (ORCPT ); Fri, 2 Aug 2019 18:01:27 -0400 Received: from mx2.suse.de ([195.135.220.15]:38158 "EHLO mx1.suse.de" rhost-flags-OK-OK-OK-FAIL) by vger.kernel.org with ESMTP id S2437144AbfHBWBU (ORCPT ); Fri, 2 Aug 2019 18:01:20 -0400 X-Virus-Scanned: by amavisd-new at test-mx.suse.de Received: from 
relay2.suse.de (unknown [195.135.220.254]) by mx1.suse.de (Postfix) with ESMTP id 8EEE2B61A; Fri, 2 Aug 2019 22:01:19 +0000 (UTC) From: Goldwyn Rodrigues To: linux-fsdevel@vger.kernel.org Cc: linux-btrfs@vger.kernel.org, hch@lst.de, darrick.wong@oracle.com, ruansy.fnst@cn.fujitsu.com, Goldwyn Rodrigues Subject: [PATCH 13/13] btrfs: update inode size during bio completion Date: Fri, 2 Aug 2019 17:00:48 -0500 Message-Id: <20190802220048.16142-14-rgoldwyn@suse.de> X-Mailer: git-send-email 2.16.4 In-Reply-To: <20190802220048.16142-1-rgoldwyn@suse.de> References: <20190802220048.16142-1-rgoldwyn@suse.de> Sender: linux-btrfs-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-btrfs@vger.kernel.org X-Virus-Scanned: ClamAV using ClamSMTP From: Goldwyn Rodrigues Update the inode size for dio writes during bio completion. This ties the decision to increase the inode size to the success of the underlying block layer I/O, which matters especially in the aio case. Signed-off-by: Goldwyn Rodrigues --- fs/btrfs/inode.c | 6 +++++- 1 file changed, 5 insertions(+), 1 deletion(-) diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c index 87fbe73ca2e4..f87a9dd154a9 100644 --- a/fs/btrfs/inode.c +++ b/fs/btrfs/inode.c @@ -8191,9 +8191,13 @@ static void btrfs_endio_direct_write(struct bio *bio) { struct btrfs_dio_private *dip = bio->bi_private; struct bio *dio_bio = dip->dio_bio; + struct inode *inode = dip->inode; - btrfs_update_ordered_extent(dip->inode, dip->logical_offset, + btrfs_update_ordered_extent(inode, dip->logical_offset, dip->bytes, !bio->bi_status); + if (!bio->bi_status && + i_size_read(inode) < dip->logical_offset + dip->bytes) + i_size_write(inode, dip->logical_offset + dip->bytes); kfree(dip);