From patchwork Tue Nov 26 03:14:52 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Goldwyn Rodrigues X-Patchwork-Id: 11261345 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 172F3112B for ; Tue, 26 Nov 2019 03:15:34 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.kernel.org (Postfix) with ESMTP id 009BA2071A for ; Tue, 26 Nov 2019 03:15:33 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1727228AbfKZDP2 (ORCPT ); Mon, 25 Nov 2019 22:15:28 -0500 Received: from mx2.suse.de ([195.135.220.15]:34120 "EHLO mx1.suse.de" rhost-flags-OK-OK-OK-FAIL) by vger.kernel.org with ESMTP id S1726016AbfKZDP2 (ORCPT ); Mon, 25 Nov 2019 22:15:28 -0500 X-Virus-Scanned: by amavisd-new at test-mx.suse.de Received: from relay2.suse.de (unknown [195.135.220.254]) by mx1.suse.de (Postfix) with ESMTP id A0062AE00; Tue, 26 Nov 2019 03:15:26 +0000 (UTC) From: Goldwyn Rodrigues To: linux-btrfs@vger.kernel.org Cc: linux-fsdevel@vger.kernel.org, hch@infradead.org, darrick.wong@oracle.com, fdmanana@kernel.org, Goldwyn Rodrigues Subject: [PATCH 1/5] fs: Export generic_file_buffered_read() Date: Mon, 25 Nov 2019 21:14:52 -0600 Message-Id: <20191126031456.12150-2-rgoldwyn@suse.de> X-Mailer: git-send-email 2.16.4 In-Reply-To: <20191126031456.12150-1-rgoldwyn@suse.de> References: <20191126031456.12150-1-rgoldwyn@suse.de> Sender: linux-btrfs-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-btrfs@vger.kernel.org From: Goldwyn Rodrigues Export generic_file_buffered_read() to be used to supplement incomplete direct reads. While we are at it, correct the comments and variable names. Signed-off-by: Goldwyn Rodrigues Reviewed-by: Johannes Thumshirn --- include/linux/fs.h | 2 ++ mm/filemap.c | 13 +++++++------ 2 files changed, 9 insertions(+), 6 deletions(-) diff --git a/include/linux/fs.h b/include/linux/fs.h index e0d909d35763..5bc75cff3536 100644 --- a/include/linux/fs.h +++ b/include/linux/fs.h @@ -3095,6 +3095,8 @@ extern int generic_file_rw_checks(struct file *file_in, struct file *file_out); extern int generic_copy_file_checks(struct file *file_in, loff_t pos_in, struct file *file_out, loff_t pos_out, size_t *count, unsigned int flags); +extern ssize_t generic_file_buffered_read(struct kiocb *iocb, + struct iov_iter *to, ssize_t already_read); extern ssize_t generic_file_read_iter(struct kiocb *, struct iov_iter *); extern ssize_t __generic_file_write_iter(struct kiocb *, struct iov_iter *); extern ssize_t generic_file_write_iter(struct kiocb *, struct iov_iter *); diff --git a/mm/filemap.c b/mm/filemap.c index 85b7d087eb45..6a1be2fcdfe7 100644 --- a/mm/filemap.c +++ b/mm/filemap.c @@ -1994,7 +1994,7 @@ static void shrink_readahead_size_eio(struct file *filp, * generic_file_buffered_read - generic file read routine * @iocb: the iocb to read * @iter: data destination - * @written: already copied + * @copied: already copied * * This is a generic file read routine, and uses the * mapping->a_ops->readpage() function for the actual low-level stuff. @@ -2003,11 +2003,11 @@ static void shrink_readahead_size_eio(struct file *filp, * of the logic when it comes to error handling etc. * * Return: - * * total number of bytes copied, including those the were already @written + * * total number of bytes copied, including those the were @copied * * negative error code if nothing was copied */ -static ssize_t generic_file_buffered_read(struct kiocb *iocb, - struct iov_iter *iter, ssize_t written) +ssize_t generic_file_buffered_read(struct kiocb *iocb, + struct iov_iter *iter, ssize_t copied) { struct file *filp = iocb->ki_filp; struct address_space *mapping = filp->f_mapping; @@ -2148,7 +2148,7 @@ static ssize_t generic_file_buffered_read(struct kiocb *iocb, prev_offset = offset; put_page(page); - written += ret; + copied += ret; if (!iov_iter_count(iter)) goto out; if (ret < nr) { @@ -2256,8 +2256,9 @@ static ssize_t generic_file_buffered_read(struct kiocb *iocb, *ppos = ((loff_t)index << PAGE_SHIFT) + offset; file_accessed(filp); - return written ? written : error; + return copied ? copied : error; } +EXPORT_SYMBOL_GPL(generic_file_buffered_read); /** * generic_file_read_iter - generic filesystem read routine From patchwork Tue Nov 26 03:14:53 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Goldwyn Rodrigues X-Patchwork-Id: 11261349 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 2A7B5109A for ; Tue, 26 Nov 2019 03:15:38 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.kernel.org (Postfix) with ESMTP id 15B962071A for ; Tue, 26 Nov 2019 03:15:38 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1727240AbfKZDPg (ORCPT ); Mon, 25 Nov 2019 22:15:36 -0500 Received: from mx2.suse.de ([195.135.220.15]:34172 "EHLO mx1.suse.de" rhost-flags-OK-OK-OK-FAIL) by vger.kernel.org with ESMTP id S1726016AbfKZDPg (ORCPT ); Mon, 25 Nov 2019 22:15:36 -0500 X-Virus-Scanned: by amavisd-new at test-mx.suse.de Received: from relay2.suse.de (unknown [195.135.220.254]) by mx1.suse.de (Postfix) with ESMTP id 8BF60AE00; Tue, 26 Nov 2019 03:15:34 +0000 (UTC) From: Goldwyn Rodrigues To: linux-btrfs@vger.kernel.org Cc: linux-fsdevel@vger.kernel.org, hch@infradead.org, darrick.wong@oracle.com, fdmanana@kernel.org, Goldwyn Rodrigues Subject: [PATCH 2/5] iomap: add a filesystem hook for direct I/O bio submission Date: Mon, 25 Nov 2019 21:14:53 -0600 Message-Id: <20191126031456.12150-3-rgoldwyn@suse.de> X-Mailer: git-send-email 2.16.4 In-Reply-To: <20191126031456.12150-1-rgoldwyn@suse.de> References: <20191126031456.12150-1-rgoldwyn@suse.de> Sender: linux-btrfs-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-btrfs@vger.kernel.org From: Goldwyn Rodrigues This helps filesystems to perform tasks on the bio while submitting for I/O. This could be post-write operations such as data CRC or data replication for fs-handled RAID. Signed-off-by: Goldwyn Rodrigues Reviewed-by: Nikolay Borisov Reviewed-by: Johannes Thumshirn --- fs/iomap/direct-io.c | 14 +++++++++----- include/linux/iomap.h | 2 ++ 2 files changed, 11 insertions(+), 5 deletions(-) diff --git a/fs/iomap/direct-io.c b/fs/iomap/direct-io.c index b77ec2acb066..2dac380f88a0 100644 --- a/fs/iomap/direct-io.c +++ b/fs/iomap/direct-io.c @@ -59,7 +59,7 @@ int iomap_dio_iopoll(struct kiocb *kiocb, bool spin) EXPORT_SYMBOL_GPL(iomap_dio_iopoll); static void iomap_dio_submit_bio(struct iomap_dio *dio, struct iomap *iomap, - struct bio *bio) + struct bio *bio, loff_t pos) { atomic_inc(&dio->ref); @@ -67,7 +67,11 @@ static void iomap_dio_submit_bio(struct iomap_dio *dio, struct iomap *iomap, bio_set_polled(bio, dio->iocb); dio->submit.last_queue = bdev_get_queue(iomap->bdev); - dio->submit.cookie = submit_bio(bio); + if (dio->dops && dio->dops->submit_io) + dio->submit.cookie = dio->dops->submit_io(bio, + dio->iocb->ki_filp, pos); + else + dio->submit.cookie = submit_bio(bio); } static ssize_t iomap_dio_complete(struct iomap_dio *dio) @@ -191,7 +195,7 @@ iomap_dio_zero(struct iomap_dio *dio, struct iomap *iomap, loff_t pos, get_page(page); __bio_add_page(bio, page, len, 0); bio_set_op_attrs(bio, REQ_OP_WRITE, flags); - iomap_dio_submit_bio(dio, iomap, bio); + iomap_dio_submit_bio(dio, iomap, bio, pos); } static loff_t @@ -297,11 +301,11 @@ iomap_dio_bio_actor(struct inode *inode, loff_t pos, loff_t length, iov_iter_advance(dio->submit.iter, n); dio->size += n; - pos += n; copied += n; nr_pages = iov_iter_npages(&iter, BIO_MAX_PAGES); - iomap_dio_submit_bio(dio, iomap, bio); + iomap_dio_submit_bio(dio, iomap, bio, pos); + pos += n; } while (nr_pages); /* diff --git a/include/linux/iomap.h b/include/linux/iomap.h index 8b09463dae0d..2b093a23ef1c 100644 --- a/include/linux/iomap.h +++ b/include/linux/iomap.h @@ -252,6 +252,8 @@ int iomap_writepages(struct address_space *mapping, struct iomap_dio_ops { int (*end_io)(struct kiocb *iocb, ssize_t size, int error, unsigned flags); + blk_qc_t (*submit_io)(struct bio *bio, struct file *file, + loff_t file_offset); }; ssize_t iomap_dio_rw(struct kiocb *iocb, struct iov_iter *iter, From patchwork Tue Nov 26 03:14:54 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Goldwyn Rodrigues X-Patchwork-Id: 11261353 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 1F5E8112B for ; Tue, 26 Nov 2019 03:15:50 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.kernel.org (Postfix) with ESMTP id ECE382071E for ; Tue, 26 Nov 2019 03:15:49 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1727266AbfKZDPt (ORCPT ); Mon, 25 Nov 2019 22:15:49 -0500 Received: from mx2.suse.de ([195.135.220.15]:34242 "EHLO mx1.suse.de" rhost-flags-OK-OK-OK-FAIL) by vger.kernel.org with ESMTP id S1726016AbfKZDPs (ORCPT ); Mon, 25 Nov 2019 22:15:48 -0500 X-Virus-Scanned: by amavisd-new at test-mx.suse.de Received: from relay2.suse.de (unknown [195.135.220.254]) by mx1.suse.de (Postfix) with ESMTP id 23F3BAE00; Tue, 26 Nov 2019 03:15:45 +0000 (UTC) From: Goldwyn Rodrigues To: linux-btrfs@vger.kernel.org Cc: linux-fsdevel@vger.kernel.org, hch@infradead.org, darrick.wong@oracle.com, fdmanana@kernel.org, Goldwyn Rodrigues Subject: [PATCH 3/5] btrfs: Switch to iomap_dio_rw() for dio Date: Mon, 25 Nov 2019 21:14:54 -0600 Message-Id: <20191126031456.12150-4-rgoldwyn@suse.de> X-Mailer: git-send-email 2.16.4 In-Reply-To: <20191126031456.12150-1-rgoldwyn@suse.de> References: <20191126031456.12150-1-rgoldwyn@suse.de> Sender: linux-btrfs-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-btrfs@vger.kernel.org From: Goldwyn Rodrigues Switch from __blockdev_direct_IO() to iomap_dio_rw(). Rename btrfs_get_blocks_direct() to btrfs_dio_iomap_begin() and use it as iomap_begin() for iomap direct I/O functions. This function allocates and locks all the blocks required for the I/O. btrfs_submit_direct() is used as the submit_io() hook for direct I/O ops. Since we need direct I/O reads to go through iomap_dio_rw(), we change file_operations.read_iter() to a btrfs_file_read_iter() which calls btrfs_direct_IO() for direct reads and falls back to generic_file_buffered_read() for incomplete reads and buffered reads. We don't need address_space.direct_IO() anymore so set it to noop. Similarly, we don't need flags used in __blockdev_direct_IO(). iomap is capable of direct I/O reads from a hole, so we don't need to return -ENOENT. Signed-off-by: Goldwyn Rodrigues --- fs/btrfs/ctree.h | 2 + fs/btrfs/file.c | 15 ++++- fs/btrfs/inode.c | 171 ++++++++++++++++++++++++++----------------------------- 3 files changed, 97 insertions(+), 91 deletions(-) diff --git a/fs/btrfs/ctree.h b/fs/btrfs/ctree.h index fe2b8765d9e6..6f038ba34512 100644 --- a/fs/btrfs/ctree.h +++ b/fs/btrfs/ctree.h @@ -2903,6 +2903,8 @@ int btrfs_writepage_cow_fixup(struct page *page, u64 start, u64 end); void btrfs_writepage_endio_finish_ordered(struct page *page, u64 start, u64 end, int uptodate); extern const struct dentry_operations btrfs_dentry_operations; +ssize_t btrfs_dio_read(struct kiocb *iocb, struct iov_iter *iter); +ssize_t btrfs_dio_write(struct kiocb *iocb, struct iov_iter *iter); /* ioctl.c */ long btrfs_ioctl(struct file *file, unsigned int cmd, unsigned long arg); diff --git a/fs/btrfs/file.c b/fs/btrfs/file.c index 435a502a3226..960a0767f532 100644 --- a/fs/btrfs/file.c +++ b/fs/btrfs/file.c @@ -1834,7 +1834,7 @@ static ssize_t __btrfs_direct_write(struct kiocb *iocb, struct iov_iter *from) loff_t endbyte; int err; - written = generic_file_direct_write(iocb, from); + written = btrfs_dio_write(iocb, from); if (written < 0 || !iov_iter_count(from)) return written; @@ -3452,9 +3452,20 @@ static int btrfs_file_open(struct inode *inode, struct file *filp) return generic_file_open(inode, filp); } +static ssize_t btrfs_file_read_iter(struct kiocb *iocb, struct iov_iter *to) +{ + ssize_t ret = 0; + if (iocb->ki_flags & IOCB_DIRECT) + ret = btrfs_dio_read(iocb, to); + if (ret < 0) + return ret; + + return generic_file_buffered_read(iocb, to, ret); +} + const struct file_operations btrfs_file_operations = { .llseek = btrfs_file_llseek, - .read_iter = generic_file_read_iter, + .read_iter = btrfs_file_read_iter, .splice_read = generic_file_splice_read, .write_iter = btrfs_file_write_iter, .mmap = btrfs_file_mmap, diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c index 015910079e73..9a39a53b9068 100644 --- a/fs/btrfs/inode.c +++ b/fs/btrfs/inode.c @@ -29,6 +29,7 @@ #include #include #include +#include #include #include "misc.h" #include "ctree.h" @@ -7619,28 +7620,7 @@ static struct extent_map *create_io_em(struct inode *inode, u64 start, u64 len, } -static int btrfs_get_blocks_direct_read(struct extent_map *em, - struct buffer_head *bh_result, - struct inode *inode, - u64 start, u64 len) -{ - if (em->block_start == EXTENT_MAP_HOLE || - test_bit(EXTENT_FLAG_PREALLOC, &em->flags)) - return -ENOENT; - - len = min(len, em->len - (start - em->start)); - - bh_result->b_blocknr = (em->block_start + (start - em->start)) >> - inode->i_blkbits; - bh_result->b_size = len; - bh_result->b_bdev = em->bdev; - set_buffer_mapped(bh_result); - - return 0; -} - static int btrfs_get_blocks_direct_write(struct extent_map **map, - struct buffer_head *bh_result, struct inode *inode, struct btrfs_dio_data *dio_data, u64 start, u64 len) @@ -7702,7 +7682,6 @@ static int btrfs_get_blocks_direct_write(struct extent_map **map, } /* this will cow the extent */ - len = bh_result->b_size; free_extent_map(em); *map = em = btrfs_new_extent_direct(inode, start, len); if (IS_ERR(em)) { @@ -7713,15 +7692,6 @@ static int btrfs_get_blocks_direct_write(struct extent_map **map, len = min(len, em->len - (start - em->start)); skip_cow: - bh_result->b_blocknr = (em->block_start + (start - em->start)) >> - inode->i_blkbits; - bh_result->b_size = len; - bh_result->b_bdev = em->bdev; - set_buffer_mapped(bh_result); - - if (!test_bit(EXTENT_FLAG_PREALLOC, &em->flags)) - set_buffer_new(bh_result); - /* * Need to update the i_size under the extent lock so buffered * readers will get the updated i_size when we unlock. @@ -7737,17 +7707,19 @@ static int btrfs_get_blocks_direct_write(struct extent_map **map, return ret; } -static int btrfs_get_blocks_direct(struct inode *inode, sector_t iblock, - struct buffer_head *bh_result, int create) +static int btrfs_dio_iomap_begin(struct inode *inode, loff_t start, + loff_t length, unsigned flags, struct iomap *iomap, + struct iomap *srcmap) { struct btrfs_fs_info *fs_info = btrfs_sb(inode->i_sb); struct extent_map *em; struct extent_state *cached_state = NULL; struct btrfs_dio_data *dio_data = NULL; - u64 start = iblock << inode->i_blkbits; u64 lockstart, lockend; - u64 len = bh_result->b_size; + int create = flags & IOMAP_WRITE; int ret = 0; + u64 len = length; + bool unlock_extents = false; if (!create) len = min_t(u64, len, fs_info->sectorsize); @@ -7755,6 +7727,17 @@ static int btrfs_get_blocks_direct(struct inode *inode, sector_t iblock, lockstart = start; lockend = start + len - 1; + /* + * The generic stuff only does filemap_write_and_wait_range, which + * isn't enough if we've written compressed pages to this area, so + * we need to flush the dirty pages again to make absolutely sure + * that any outstanding dirty pages are on disk. + */ + if (test_bit(BTRFS_INODE_HAS_ASYNC_EXTENT, + &BTRFS_I(inode)->runtime_flags)) + ret = filemap_fdatawrite_range(inode->i_mapping, start, + start + length - 1); + if (current->journal_info) { /* * Need to pull our outstanding extents and set journal_info to NULL so @@ -7803,35 +7786,45 @@ static int btrfs_get_blocks_direct(struct inode *inode, sector_t iblock, } if (create) { - ret = btrfs_get_blocks_direct_write(&em, bh_result, inode, + ret = btrfs_get_blocks_direct_write(&em, inode, dio_data, start, len); if (ret < 0) goto unlock_err; - - unlock_extent_cached(&BTRFS_I(inode)->io_tree, lockstart, - lockend, &cached_state); + unlock_extents = true; + } else if (em->block_start == EXTENT_MAP_HOLE || + test_bit(EXTENT_FLAG_PREALLOC, &em->flags)) { + /* Unlock in case of direct reading from a hole */ + unlock_extents = true; } else { - ret = btrfs_get_blocks_direct_read(em, bh_result, inode, - start, len); - /* Can be negative only if we read from a hole */ - if (ret < 0) { - ret = 0; - free_extent_map(em); - goto unlock_err; - } + + len = min(len, em->len - (start - em->start)); /* * We need to unlock only the end area that we aren't using. * The rest is going to be unlocked by the endio routine. */ - lockstart = start + bh_result->b_size; - if (lockstart < lockend) { - unlock_extent_cached(&BTRFS_I(inode)->io_tree, - lockstart, lockend, &cached_state); - } else { + lockstart = start + len; + if (lockstart < lockend) + unlock_extents = true; + else free_extent_state(cached_state); - } } + if (unlock_extents) + unlock_extent_cached(&BTRFS_I(inode)->io_tree, + lockstart, lockend, &cached_state); + + if ((em->block_start == EXTENT_MAP_HOLE) || + (test_bit(EXTENT_FLAG_PREALLOC, &em->flags) && !create)) { + iomap->addr = IOMAP_NULL_ADDR; + iomap->type = IOMAP_HOLE; + } else { + iomap->addr = em->block_start; + iomap->type = IOMAP_MAPPED; + } + iomap->offset = em->start; + iomap->bdev = em->bdev; + iomap->length = em->len; + free_extent_map(em); return 0; @@ -8199,9 +8192,8 @@ static void btrfs_endio_direct_read(struct bio *bio) kfree(dip); dio_bio->bi_status = err; - dio_end_io(dio_bio); + bio_endio(dio_bio); btrfs_io_bio_free_csum(io_bio); - bio_put(bio); } static void __endio_write_update_ordered(struct inode *inode, @@ -8263,8 +8255,7 @@ static void btrfs_endio_direct_write(struct bio *bio) kfree(dip); dio_bio->bi_status = bio->bi_status; - dio_end_io(dio_bio); - bio_put(bio); + bio_endio(dio_bio); } static blk_status_t btrfs_submit_bio_start_direct_io(void *private_data, @@ -8496,9 +8487,10 @@ static int btrfs_submit_direct_hook(struct btrfs_dio_private *dip) return 0; } -static void btrfs_submit_direct(struct bio *dio_bio, struct inode *inode, +static blk_qc_t btrfs_submit_direct(struct bio *dio_bio, struct file *file, loff_t file_offset) { + struct inode *inode = file_inode(file); struct btrfs_dio_private *dip = NULL; struct bio *bio = NULL; struct btrfs_io_bio *io_bio; @@ -8549,7 +8541,7 @@ static void btrfs_submit_direct(struct bio *dio_bio, struct inode *inode, ret = btrfs_submit_direct_hook(dip); if (!ret) - return; + return BLK_QC_T_NONE; btrfs_io_bio_free_csum(io_bio); @@ -8568,7 +8560,7 @@ static void btrfs_submit_direct(struct bio *dio_bio, struct inode *inode, /* * The end io callbacks free our dip, do the final put on bio * and all the cleanup and final put for dio_bio (through - * dio_end_io()). + * end_io()). */ dip = NULL; bio = NULL; @@ -8587,11 +8579,12 @@ static void btrfs_submit_direct(struct bio *dio_bio, struct inode *inode, * Releases and cleans up our dio_bio, no need to bio_put() * nor bio_endio()/bio_io_error() against dio_bio. */ - dio_end_io(dio_bio); + bio_endio(dio_bio); } if (bio) bio_put(bio); kfree(dip); + return BLK_QC_T_NONE; } static ssize_t check_direct_IO(struct btrfs_fs_info *fs_info, @@ -8627,6 +8620,14 @@ static ssize_t check_direct_IO(struct btrfs_fs_info *fs_info, return retval; } +static const struct iomap_ops btrfs_dio_iomap_ops = { + .iomap_begin = btrfs_dio_iomap_begin, +}; + +static const struct iomap_dio_ops btrfs_dops = { + .submit_io = btrfs_submit_direct, +}; + static ssize_t btrfs_direct_IO(struct kiocb *iocb, struct iov_iter *iter) { struct file *file = iocb->ki_filp; @@ -8636,28 +8637,13 @@ static ssize_t btrfs_direct_IO(struct kiocb *iocb, struct iov_iter *iter) struct extent_changeset *data_reserved = NULL; loff_t offset = iocb->ki_pos; size_t count = 0; - int flags = 0; - bool wakeup = true; bool relock = false; ssize_t ret; if (check_direct_IO(fs_info, iter, offset)) return 0; - inode_dio_begin(inode); - - /* - * The generic stuff only does filemap_write_and_wait_range, which - * isn't enough if we've written compressed pages to this area, so - * we need to flush the dirty pages again to make absolutely sure - * that any outstanding dirty pages are on disk. - */ count = iov_iter_count(iter); - if (test_bit(BTRFS_INODE_HAS_ASYNC_EXTENT, - &BTRFS_I(inode)->runtime_flags)) - filemap_fdatawrite_range(inode->i_mapping, offset, - offset + count - 1); - if (iov_iter_rw(iter) == WRITE) { /* * If the write DIO is beyond the EOF, we need update @@ -8688,17 +8674,11 @@ static ssize_t btrfs_direct_IO(struct kiocb *iocb, struct iov_iter *iter) dio_data.unsubmitted_oe_range_end = (u64)offset; current->journal_info = &dio_data; down_read(&BTRFS_I(inode)->dio_sem); - } else if (test_bit(BTRFS_INODE_READDIO_NEED_LOCK, - &BTRFS_I(inode)->runtime_flags)) { - inode_dio_end(inode); - flags = DIO_LOCKING | DIO_SKIP_HOLES; - wakeup = false; } - ret = __blockdev_direct_IO(iocb, inode, - fs_info->fs_devices->latest_bdev, - iter, btrfs_get_blocks_direct, NULL, - btrfs_submit_direct, flags); + ret = iomap_dio_rw(iocb, iter, &btrfs_dio_iomap_ops, &btrfs_dops, + is_sync_kiocb(iocb)); + if (iov_iter_rw(iter) == WRITE) { up_read(&BTRFS_I(inode)->dio_sem); current->journal_info = NULL; @@ -8725,15 +8705,28 @@ static ssize_t btrfs_direct_IO(struct kiocb *iocb, struct iov_iter *iter) btrfs_delalloc_release_extents(BTRFS_I(inode), count); } out: - if (wakeup) - inode_dio_end(inode); if (relock) inode_lock(inode); - extent_changeset_free(data_reserved); return ret; } +ssize_t btrfs_dio_read(struct kiocb *iocb, struct iov_iter *to) +{ + struct inode *inode = file_inode(iocb->ki_filp); + ssize_t ret; + + inode_lock_shared(inode); + ret = btrfs_direct_IO(iocb, to); + inode_unlock_shared(inode); + return ret; +} + +ssize_t btrfs_dio_write(struct kiocb *iocb, struct iov_iter *iter) +{ + return btrfs_direct_IO(iocb, iter); +} + #define BTRFS_FIEMAP_FLAGS (FIEMAP_FLAG_SYNC) static int btrfs_fiemap(struct inode *inode, struct fiemap_extent_info *fieinfo, @@ -11019,7 +11012,7 @@ static const struct address_space_operations btrfs_aops = { .writepage = btrfs_writepage, .writepages = btrfs_writepages, .readpages = btrfs_readpages, - .direct_IO = btrfs_direct_IO, + .direct_IO = noop_direct_IO, .invalidatepage = btrfs_invalidatepage, .releasepage = btrfs_releasepage, .set_page_dirty = btrfs_set_page_dirty, From patchwork Tue Nov 26 03:14:55 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Goldwyn Rodrigues X-Patchwork-Id: 11261357 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 9EA39109A for ; Tue, 26 Nov 2019 03:15:54 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.kernel.org (Postfix) with ESMTP id 887572071A for ; Tue, 26 Nov 2019 03:15:54 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1727289AbfKZDPx (ORCPT ); Mon, 25 Nov 2019 22:15:53 -0500 Received: from mx2.suse.de ([195.135.220.15]:34268 "EHLO mx1.suse.de" rhost-flags-OK-OK-OK-FAIL) by vger.kernel.org with ESMTP id S1726016AbfKZDPx (ORCPT ); Mon, 25 Nov 2019 22:15:53 -0500 X-Virus-Scanned: by amavisd-new at test-mx.suse.de Received: from relay2.suse.de (unknown [195.135.220.254]) by mx1.suse.de (Postfix) with ESMTP id F3945B390; Tue, 26 Nov 2019 03:15:51 +0000 (UTC) From: Goldwyn Rodrigues To: linux-btrfs@vger.kernel.org Cc: linux-fsdevel@vger.kernel.org, hch@infradead.org, darrick.wong@oracle.com, fdmanana@kernel.org, Goldwyn Rodrigues Subject: [PATCH 4/5] btrfs: Wait for extent bits to release page Date: Mon, 25 Nov 2019 21:14:55 -0600 Message-Id: <20191126031456.12150-5-rgoldwyn@suse.de> X-Mailer: git-send-email 2.16.4 In-Reply-To: <20191126031456.12150-1-rgoldwyn@suse.de> References: <20191126031456.12150-1-rgoldwyn@suse.de> Sender: linux-btrfs-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-btrfs@vger.kernel.org From: Goldwyn Rodrigues While trying to release a page, the extent containing the page may be locked which would stop the page from being released. Wait for the extent lock to be cleared, if blocking is allowed and then clear the bits. While we are at it, clean the code of try_release_extent_state() to make it simpler. Signed-off-by: Goldwyn Rodrigues Reviewed-by: Nikolay Borisov Reviewed-by: Johannes Thumshirn --- fs/btrfs/extent_io.c | 33 ++++++++++++++------------------- 1 file changed, 14 insertions(+), 19 deletions(-) diff --git a/fs/btrfs/extent_io.c b/fs/btrfs/extent_io.c index cceaf05aada2..a7c32276702d 100644 --- a/fs/btrfs/extent_io.c +++ b/fs/btrfs/extent_io.c @@ -4367,28 +4367,23 @@ static int try_release_extent_state(struct extent_io_tree *tree, { u64 start = page_offset(page); u64 end = start + PAGE_SIZE - 1; - int ret = 1; if (test_range_bit(tree, start, end, EXTENT_LOCKED, 0, NULL)) { - ret = 0; - } else { - /* - * at this point we can safely clear everything except the - * locked bit and the nodatasum bit - */ - ret = __clear_extent_bit(tree, start, end, - ~(EXTENT_LOCKED | EXTENT_NODATASUM), - 0, 0, NULL, mask, NULL); - - /* if clear_extent_bit failed for enomem reasons, - * we can't allow the release to continue. - */ - if (ret < 0) - ret = 0; - else - ret = 1; + if (!gfpflags_allow_blocking(mask)) + return 0; + wait_extent_bit(tree, start, end, EXTENT_LOCKED); } - return ret; + /* + * At this point we can safely clear everything except the locked and + * nodatasum bits. If clear_extent_bit failed due to -ENOMEM, + * don't allow release. + */ + if (__clear_extent_bit(tree, start, end, + ~(EXTENT_LOCKED | EXTENT_NODATASUM), 0, 0, + NULL, mask, NULL) < 0) + return 0; + + return 1; } /* From patchwork Tue Nov 26 03:14:56 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Goldwyn Rodrigues X-Patchwork-Id: 11261361 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 253E7109A for ; Tue, 26 Nov 2019 03:16:02 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.kernel.org (Postfix) with ESMTP id 0D8722084D for ; Tue, 26 Nov 2019 03:16:02 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1727305AbfKZDQB (ORCPT ); Mon, 25 Nov 2019 22:16:01 -0500 Received: from mx2.suse.de ([195.135.220.15]:34322 "EHLO mx1.suse.de" rhost-flags-OK-OK-OK-FAIL) by vger.kernel.org with ESMTP id S1726016AbfKZDQB (ORCPT ); Mon, 25 Nov 2019 22:16:01 -0500 X-Virus-Scanned: by amavisd-new at test-mx.suse.de Received: from relay2.suse.de (unknown [195.135.220.254]) by mx1.suse.de (Postfix) with ESMTP id 798F9B0BA; Tue, 26 Nov 2019 03:15:59 +0000 (UTC) From: Goldwyn Rodrigues To: linux-btrfs@vger.kernel.org Cc: linux-fsdevel@vger.kernel.org, hch@infradead.org, darrick.wong@oracle.com, fdmanana@kernel.org, Goldwyn Rodrigues Subject: [PATCH 5/5] fs: Remove dio_end_io() Date: Mon, 25 Nov 2019 21:14:56 -0600 Message-Id: <20191126031456.12150-6-rgoldwyn@suse.de> X-Mailer: git-send-email 2.16.4 In-Reply-To: <20191126031456.12150-1-rgoldwyn@suse.de> References: <20191126031456.12150-1-rgoldwyn@suse.de> Sender: linux-btrfs-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-btrfs@vger.kernel.org From: Goldwyn Rodrigues Since we removed the last user of dio_end_io(), remove the helper function dio_end_io(). Signed-off-by: Goldwyn Rodrigues Reviewed-by: Nikolay Borisov Reviewed-by: Johannes Thumshirn --- fs/direct-io.c | 19 ------------------- include/linux/fs.h | 1 - 2 files changed, 20 deletions(-) diff --git a/fs/direct-io.c b/fs/direct-io.c index 9329ced91f1d..9d6fc6467d9c 100644 --- a/fs/direct-io.c +++ b/fs/direct-io.c @@ -405,25 +405,6 @@ static void dio_bio_end_io(struct bio *bio) spin_unlock_irqrestore(&dio->bio_lock, flags); } -/** - * dio_end_io - handle the end io action for the given bio - * @bio: The direct io bio thats being completed - * - * This is meant to be called by any filesystem that uses their own dio_submit_t - * so that the DIO specific endio actions are dealt with after the filesystem - * has done it's completion work. - */ -void dio_end_io(struct bio *bio) -{ - struct dio *dio = bio->bi_private; - - if (dio->is_async) - dio_bio_end_aio(bio); - else - dio_bio_end_io(bio); -} -EXPORT_SYMBOL_GPL(dio_end_io); - static inline void dio_bio_alloc(struct dio *dio, struct dio_submit *sdio, struct block_device *bdev, diff --git a/include/linux/fs.h b/include/linux/fs.h index 5bc75cff3536..eab2c26fb32a 100644 --- a/include/linux/fs.h +++ b/include/linux/fs.h @@ -3154,7 +3154,6 @@ enum { DIO_SKIP_HOLES = 0x02, }; -void dio_end_io(struct bio *bio); void dio_warn_stale_pagecache(struct file *filp); ssize_t __blockdev_direct_IO(struct kiocb *iocb, struct inode *inode,