From patchwork Mon Jun 22 16:20:10 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Goldwyn Rodrigues X-Patchwork-Id: 11618349 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 5437690 for ; Mon, 22 Jun 2020 16:20:28 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 3C7C020656 for ; Mon, 22 Jun 2020 16:20:28 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1729519AbgFVQU1 (ORCPT ); Mon, 22 Jun 2020 12:20:27 -0400 Received: from mx2.suse.de ([195.135.220.15]:46044 "EHLO mx2.suse.de" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1729486AbgFVQU0 (ORCPT ); Mon, 22 Jun 2020 12:20:26 -0400 X-Virus-Scanned: by amavisd-new at test-mx.suse.de Received: from relay2.suse.de (unknown [195.135.221.27]) by mx2.suse.de (Postfix) with ESMTP id 4865EC241; Mon, 22 Jun 2020 16:20:25 +0000 (UTC) From: Goldwyn Rodrigues To: linux-btrfs@vger.kernel.org Cc: Goldwyn Rodrigues Subject: [PATCH 1/8] btrfs: Move pos increment and pagecache extension to btrfs_buffered_write() Date: Mon, 22 Jun 2020 11:20:10 -0500 Message-Id: <20200622162017.21773-2-rgoldwyn@suse.de> X-Mailer: git-send-email 2.25.0 In-Reply-To: <20200622162017.21773-1-rgoldwyn@suse.de> References: <20200622162017.21773-1-rgoldwyn@suse.de> MIME-Version: 1.0 Sender: linux-btrfs-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-btrfs@vger.kernel.org From: Goldwyn Rodrigues While we do this, correct the call to pagecache_isize_extended(): - pagecache_isisze_extended needs to be called to the starting of the write as opposed to i_size - We don't need to check range befor the call, this is done in the function Signed-off-by: Goldwyn Rodrigues --- fs/btrfs/file.c | 13 +++++-------- 1 file changed, 5 insertions(+), 8 deletions(-) diff --git a/fs/btrfs/file.c b/fs/btrfs/file.c index 7f75adf70bcd..01377a7d1d13 100644 --- a/fs/btrfs/file.c +++ b/fs/btrfs/file.c @@ -1585,6 +1585,7 @@ static noinline ssize_t btrfs_buffered_write(struct kiocb *iocb, int ret = 0; bool only_release_metadata = false; bool force_page_uptodate = false; + loff_t old_isize = i_size_read(inode); nrptrs = min(DIV_ROUND_UP(iov_iter_count(i), PAGE_SIZE), PAGE_SIZE / (sizeof(struct page *))); @@ -1806,6 +1807,10 @@ static noinline ssize_t btrfs_buffered_write(struct kiocb *iocb, } extent_changeset_free(data_reserved); + if (num_written > 0) { + pagecache_isize_extended(inode, old_isize, iocb->ki_pos); + iocb->ki_pos += num_written; + } return num_written ? num_written : ret; } @@ -1926,7 +1931,6 @@ static ssize_t btrfs_file_write_iter(struct kiocb *iocb, loff_t pos; size_t count; loff_t oldsize; - int clean_page = 0; if (!(iocb->ki_flags & IOCB_DIRECT) && (iocb->ki_flags & IOCB_NOWAIT)) @@ -1998,8 +2002,6 @@ static ssize_t btrfs_file_write_iter(struct kiocb *iocb, inode_unlock(inode); goto out; } - if (start_pos > round_up(oldsize, fs_info->sectorsize)) - clean_page = 1; } if (sync) @@ -2009,11 +2011,6 @@ static ssize_t btrfs_file_write_iter(struct kiocb *iocb, num_written = btrfs_direct_write(iocb, from); } else { num_written = btrfs_buffered_write(iocb, from); - if (num_written > 0) - iocb->ki_pos = pos + num_written; - if (clean_page) - pagecache_isize_extended(inode, oldsize, - i_size_read(inode)); } inode_unlock(inode); From patchwork Mon Jun 22 16:20:11 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Goldwyn Rodrigues X-Patchwork-Id: 11618351 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 4126690 for ; Mon, 22 Jun 2020 16:20:31 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 2A2B82073E for ; Mon, 22 Jun 2020 16:20:31 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1729525AbgFVQUa (ORCPT ); Mon, 22 Jun 2020 12:20:30 -0400 Received: from mx2.suse.de ([195.135.220.15]:46084 "EHLO mx2.suse.de" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1729521AbgFVQU3 (ORCPT ); Mon, 22 Jun 2020 12:20:29 -0400 X-Virus-Scanned: by amavisd-new at test-mx.suse.de Received: from relay2.suse.de (unknown [195.135.221.27]) by mx2.suse.de (Postfix) with ESMTP id 581A4C241; Mon, 22 Jun 2020 16:20:28 +0000 (UTC) From: Goldwyn Rodrigues To: linux-btrfs@vger.kernel.org Cc: Goldwyn Rodrigues Subject: [PATCH 2/8] btrfs: Move FS error state bit early during write Date: Mon, 22 Jun 2020 11:20:11 -0500 Message-Id: <20200622162017.21773-3-rgoldwyn@suse.de> X-Mailer: git-send-email 2.25.0 In-Reply-To: <20200622162017.21773-1-rgoldwyn@suse.de> References: <20200622162017.21773-1-rgoldwyn@suse.de> MIME-Version: 1.0 Sender: linux-btrfs-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-btrfs@vger.kernel.org From: Goldwyn Rodrigues We don't need the inode locked to check for the error bit. Move the check early. Signed-off-by: Goldwyn Rodrigues --- fs/btrfs/file.c | 21 +++++++++------------ 1 file changed, 9 insertions(+), 12 deletions(-) diff --git a/fs/btrfs/file.c b/fs/btrfs/file.c index 01377a7d1d13..9ec30ccbd67a 100644 --- a/fs/btrfs/file.c +++ b/fs/btrfs/file.c @@ -1932,6 +1932,15 @@ static ssize_t btrfs_file_write_iter(struct kiocb *iocb, size_t count; loff_t oldsize; + /* + * If BTRFS flips readonly due to some impossible error + * (fs_info->fs_state now has BTRFS_SUPER_FLAG_ERROR), + * although we have opened a file as writable, we have + * to stop this write operation to ensure FS consistency. + */ + if (test_bit(BTRFS_FS_STATE_ERROR, &fs_info->fs_state)) + return -EROFS; + if (!(iocb->ki_flags & IOCB_DIRECT) && (iocb->ki_flags & IOCB_NOWAIT)) return -EOPNOTSUPP; @@ -1971,18 +1980,6 @@ static ssize_t btrfs_file_write_iter(struct kiocb *iocb, goto out; } - /* - * If BTRFS flips readonly due to some impossible error - * (fs_info->fs_state now has BTRFS_SUPER_FLAG_ERROR), - * although we have opened a file as writable, we have - * to stop this write operation to ensure FS consistency. - */ - if (test_bit(BTRFS_FS_STATE_ERROR, &fs_info->fs_state)) { - inode_unlock(inode); - err = -EROFS; - goto out; - } - /* * We reserve space for updating the inode when we reserve space for the * extent we are going to write, so we will enospc out there. We don't From patchwork Mon Jun 22 16:20:12 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Goldwyn Rodrigues X-Patchwork-Id: 11618353 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id F20B390 for ; Mon, 22 Jun 2020 16:20:34 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id E32EE206C1 for ; Mon, 22 Jun 2020 16:20:34 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1729551AbgFVQUe (ORCPT ); Mon, 22 Jun 2020 12:20:34 -0400 Received: from mx2.suse.de ([195.135.220.15]:46118 "EHLO mx2.suse.de" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1729521AbgFVQUd (ORCPT ); Mon, 22 Jun 2020 12:20:33 -0400 X-Virus-Scanned: by amavisd-new at test-mx.suse.de Received: from relay2.suse.de (unknown [195.135.221.27]) by mx2.suse.de (Postfix) with ESMTP id 6B3CBC220; Mon, 22 Jun 2020 16:20:31 +0000 (UTC) From: Goldwyn Rodrigues To: linux-btrfs@vger.kernel.org Cc: Goldwyn Rodrigues Subject: [PATCH 3/8] btrfs: Introduce btrfs_write_check() Date: Mon, 22 Jun 2020 11:20:12 -0500 Message-Id: <20200622162017.21773-4-rgoldwyn@suse.de> X-Mailer: git-send-email 2.25.0 In-Reply-To: <20200622162017.21773-1-rgoldwyn@suse.de> References: <20200622162017.21773-1-rgoldwyn@suse.de> MIME-Version: 1.0 Sender: linux-btrfs-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-btrfs@vger.kernel.org From: Goldwyn Rodrigues btrfs_write_check() checks for all parameters before the write. This simplifies with error situations with respect to inode unlocking. Signed-off-by: Goldwyn Rodrigues --- fs/btrfs/file.c | 138 +++++++++++++++++++++++++----------------------- 1 file changed, 71 insertions(+), 67 deletions(-) diff --git a/fs/btrfs/file.c b/fs/btrfs/file.c index 9ec30ccbd67a..ba7c3b2cf1c5 100644 --- a/fs/btrfs/file.c +++ b/fs/btrfs/file.c @@ -1567,6 +1567,76 @@ static noinline int check_can_nocow(struct btrfs_inode *inode, loff_t pos, return ret; } +static void update_time_for_write(struct inode *inode) +{ + struct timespec64 now; + + if (IS_NOCMTIME(inode)) + return; + + now = current_time(inode); + if (!timespec64_equal(&inode->i_mtime, &now)) + inode->i_mtime = now; + + if (!timespec64_equal(&inode->i_ctime, &now)) + inode->i_ctime = now; + + if (IS_I_VERSION(inode)) + inode_inc_iversion(inode); +} + +static size_t btrfs_write_check(struct kiocb *iocb, struct iov_iter *from) +{ + int err; + loff_t oldsize; + loff_t start_pos; + struct file *file = iocb->ki_filp; + struct inode *inode = file_inode(file); + struct btrfs_fs_info *fs_info = btrfs_sb(inode->i_sb); + loff_t pos = iocb->ki_pos; + size_t count = iov_iter_count(from); + + err = generic_write_checks(iocb, from); + if (err <= 0) + return err; + + if (iocb->ki_flags & IOCB_NOWAIT) { + /* + * We will allocate space in case nodatacow is not set, + * so bail + */ + if (!(BTRFS_I(inode)->flags & (BTRFS_INODE_NODATACOW | + BTRFS_INODE_PREALLOC)) || + check_can_nocow(BTRFS_I(inode), pos, &count) <= 0) + return -EAGAIN; + } + + current->backing_dev_info = inode_to_bdi(inode); + err = file_remove_privs(file); + if (err) + return err; + + /* + * We reserve space for updating the inode when we reserve space for the + * extent we are going to write, so we will enospc out there. We don't + * need to start yet another transaction to update the inode as we will + * update the inode when we finish writing whatever data we write. + */ + update_time_for_write(inode); + + start_pos = round_down(pos, fs_info->sectorsize); + oldsize = i_size_read(inode); + if (start_pos > oldsize) { + /* Expand hole size to cover write data, preventing empty gap */ + u64 end_pos = round_up(pos + count, + fs_info->sectorsize); + err = btrfs_cont_expand(inode, oldsize, end_pos); + if (err) + return err; + } + return count; +} + static noinline ssize_t btrfs_buffered_write(struct kiocb *iocb, struct iov_iter *i) { @@ -1898,24 +1968,6 @@ static ssize_t btrfs_direct_write(struct kiocb *iocb, struct iov_iter *from) return written ? written : err; } -static void update_time_for_write(struct inode *inode) -{ - struct timespec64 now; - - if (IS_NOCMTIME(inode)) - return; - - now = current_time(inode); - if (!timespec64_equal(&inode->i_mtime, &now)) - inode->i_mtime = now; - - if (!timespec64_equal(&inode->i_ctime, &now)) - inode->i_ctime = now; - - if (IS_I_VERSION(inode)) - inode_inc_iversion(inode); -} - static ssize_t btrfs_file_write_iter(struct kiocb *iocb, struct iov_iter *from) { @@ -1923,14 +1975,9 @@ static ssize_t btrfs_file_write_iter(struct kiocb *iocb, struct inode *inode = file_inode(file); struct btrfs_fs_info *fs_info = btrfs_sb(inode->i_sb); struct btrfs_root *root = BTRFS_I(inode)->root; - u64 start_pos; - u64 end_pos; ssize_t num_written = 0; const bool sync = iocb->ki_flags & IOCB_DSYNC; ssize_t err; - loff_t pos; - size_t count; - loff_t oldsize; /* * If BTRFS flips readonly due to some impossible error @@ -1952,55 +1999,12 @@ static ssize_t btrfs_file_write_iter(struct kiocb *iocb, inode_lock(inode); } - err = generic_write_checks(iocb, from); + err = btrfs_write_check(iocb, from); if (err <= 0) { - inode_unlock(inode); - return err; - } - - pos = iocb->ki_pos; - count = iov_iter_count(from); - if (iocb->ki_flags & IOCB_NOWAIT) { - /* - * We will allocate space in case nodatacow is not set, - * so bail - */ - if (!(BTRFS_I(inode)->flags & (BTRFS_INODE_NODATACOW | - BTRFS_INODE_PREALLOC)) || - check_can_nocow(BTRFS_I(inode), pos, &count) <= 0) { - inode_unlock(inode); - return -EAGAIN; - } - } - - current->backing_dev_info = inode_to_bdi(inode); - err = file_remove_privs(file); - if (err) { inode_unlock(inode); goto out; } - /* - * We reserve space for updating the inode when we reserve space for the - * extent we are going to write, so we will enospc out there. We don't - * need to start yet another transaction to update the inode as we will - * update the inode when we finish writing whatever data we write. - */ - update_time_for_write(inode); - - start_pos = round_down(pos, fs_info->sectorsize); - oldsize = i_size_read(inode); - if (start_pos > oldsize) { - /* Expand hole size to cover write data, preventing empty gap */ - end_pos = round_up(pos + count, - fs_info->sectorsize); - err = btrfs_cont_expand(inode, oldsize, end_pos); - if (err) { - inode_unlock(inode); - goto out; - } - } - if (sync) atomic_inc(&BTRFS_I(inode)->sync_writers); From patchwork Mon Jun 22 16:20:13 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Goldwyn Rodrigues X-Patchwork-Id: 11618355 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 4427460D for ; Mon, 22 Jun 2020 16:20:37 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 35A7220656 for ; Mon, 22 Jun 2020 16:20:37 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1729565AbgFVQUg (ORCPT ); Mon, 22 Jun 2020 12:20:36 -0400 Received: from mx2.suse.de ([195.135.220.15]:46174 "EHLO mx2.suse.de" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1729521AbgFVQUg (ORCPT ); Mon, 22 Jun 2020 12:20:36 -0400 X-Virus-Scanned: by amavisd-new at test-mx.suse.de Received: from relay2.suse.de (unknown [195.135.221.27]) by mx2.suse.de (Postfix) with ESMTP id 6AB98C241; Mon, 22 Jun 2020 16:20:34 +0000 (UTC) From: Goldwyn Rodrigues To: linux-btrfs@vger.kernel.org Cc: Goldwyn Rodrigues Subject: [PATCH 4/8] btrfs: Introduce btrfs_inode_lock()/unlock() Date: Mon, 22 Jun 2020 11:20:13 -0500 Message-Id: <20200622162017.21773-5-rgoldwyn@suse.de> X-Mailer: git-send-email 2.25.0 In-Reply-To: <20200622162017.21773-1-rgoldwyn@suse.de> References: <20200622162017.21773-1-rgoldwyn@suse.de> MIME-Version: 1.0 Sender: linux-btrfs-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-btrfs@vger.kernel.org From: Goldwyn Rodrigues btrfs_inode_lock/unlock() acquires the inode->i_rwsem depending on the flags passed. ilock_flags determines the type of lock to be taken: BTRFS_ILOCK_SHARED - for shared locks, for possible parallel DIO BTRFS_ILOCK_TRY - for the RWF_NOWAIT sequence Signed-off-by: Goldwyn Rodrigues --- fs/btrfs/ctree.h | 8 ++++++++ fs/btrfs/file.c | 51 +++++++++++++++++++++++++++++++++++++++--------- 2 files changed, 50 insertions(+), 9 deletions(-) diff --git a/fs/btrfs/ctree.h b/fs/btrfs/ctree.h index 161533040978..346fea668ca0 100644 --- a/fs/btrfs/ctree.h +++ b/fs/btrfs/ctree.h @@ -2953,6 +2953,14 @@ void btrfs_update_ioctl_balance_args(struct btrfs_fs_info *fs_info, struct btrfs_ioctl_balance_args *bargs); /* file.c */ + +/* ilock flags definition */ +#define BTRFS_ILOCK_SHARED 0x1 +#define BTRFS_ILOCK_TRY 0x2 + +int btrfs_inode_lock(struct inode *inode, int ilock_flags); +void btrfs_inode_unlock(struct inode *inode, int ilock_flags); + int __init btrfs_auto_defrag_init(void); void __cold btrfs_auto_defrag_exit(void); int btrfs_add_inode_defrag(struct btrfs_trans_handle *trans, diff --git a/fs/btrfs/file.c b/fs/btrfs/file.c index ba7c3b2cf1c5..1a9a0a9e4b3d 100644 --- a/fs/btrfs/file.c +++ b/fs/btrfs/file.c @@ -70,6 +70,37 @@ static int __compare_inode_defrag(struct inode_defrag *defrag1, return 0; } +int btrfs_inode_lock(struct inode *inode, int ilock_flags) +{ + if (ilock_flags & BTRFS_ILOCK_SHARED) { + if (ilock_flags & BTRFS_ILOCK_TRY) { + if (!inode_trylock_shared(inode)) + return -EAGAIN; + else + return 0; + } + inode_lock_shared(inode); + } else { + if (ilock_flags & BTRFS_ILOCK_TRY) { + if (!inode_trylock(inode)) + return -EAGAIN; + else + return 0; + } + inode_lock(inode); + } + return 0; +} + +void btrfs_inode_unlock(struct inode *inode, int ilock_flags) +{ + if (ilock_flags & BTRFS_ILOCK_SHARED) + inode_unlock_shared(inode); + else + inode_unlock(inode); + +} + /* pop a record for an inode into the defrag tree. The lock * must be held already * @@ -1978,6 +2009,7 @@ static ssize_t btrfs_file_write_iter(struct kiocb *iocb, ssize_t num_written = 0; const bool sync = iocb->ki_flags & IOCB_DSYNC; ssize_t err; + int ilock_flags = 0; /* * If BTRFS flips readonly due to some impossible error @@ -1992,12 +2024,12 @@ static ssize_t btrfs_file_write_iter(struct kiocb *iocb, (iocb->ki_flags & IOCB_NOWAIT)) return -EOPNOTSUPP; - if (iocb->ki_flags & IOCB_NOWAIT) { - if (!inode_trylock(inode)) - return -EAGAIN; - } else { - inode_lock(inode); - } + if (iocb->ki_flags & IOCB_NOWAIT) + ilock_flags |= BTRFS_ILOCK_TRY; + + err = btrfs_inode_lock(inode, ilock_flags); + if (err < 0) + goto out; err = btrfs_write_check(iocb, from); if (err <= 0) { @@ -2014,7 +2046,7 @@ static ssize_t btrfs_file_write_iter(struct kiocb *iocb, num_written = btrfs_buffered_write(iocb, from); } - inode_unlock(inode); + btrfs_inode_unlock(inode, ilock_flags); /* * We also have to set last_sub_trans to the current log transid, @@ -3547,9 +3579,10 @@ static ssize_t btrfs_direct_read(struct kiocb *iocb, struct iov_iter *to) if (is_sync_kiocb(iocb)) flags |= IOMAP_DIOF_WAIT_FOR_COMPLETION; - inode_lock_shared(inode); + btrfs_inode_lock(inode, BTRFS_ILOCK_SHARED); ret = iomap_dio_rw(iocb, to, &btrfs_dio_iomap_ops, &btrfs_dops, flags); - inode_unlock_shared(inode); + btrfs_inode_unlock(inode, BTRFS_ILOCK_SHARED); + return ret; } From patchwork Mon Jun 22 16:20:14 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Goldwyn Rodrigues X-Patchwork-Id: 11618357 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 1FD5890 for ; Mon, 22 Jun 2020 16:20:41 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 13CC4206C1 for ; Mon, 22 Jun 2020 16:20:41 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1729568AbgFVQUk (ORCPT ); Mon, 22 Jun 2020 12:20:40 -0400 Received: from mx2.suse.de ([195.135.220.15]:46274 "EHLO mx2.suse.de" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1729521AbgFVQUj (ORCPT ); Mon, 22 Jun 2020 12:20:39 -0400 X-Virus-Scanned: by amavisd-new at test-mx.suse.de Received: from relay2.suse.de (unknown [195.135.221.27]) by mx2.suse.de (Postfix) with ESMTP id 6E433C247; Mon, 22 Jun 2020 16:20:37 +0000 (UTC) From: Goldwyn Rodrigues To: linux-btrfs@vger.kernel.org Cc: Goldwyn Rodrigues Subject: [PATCH 5/8] btrfs: Push inode locking and unlocking in buffered/direct write Date: Mon, 22 Jun 2020 11:20:14 -0500 Message-Id: <20200622162017.21773-6-rgoldwyn@suse.de> X-Mailer: git-send-email 2.25.0 In-Reply-To: <20200622162017.21773-1-rgoldwyn@suse.de> References: <20200622162017.21773-1-rgoldwyn@suse.de> MIME-Version: 1.0 Sender: linux-btrfs-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-btrfs@vger.kernel.org From: Goldwyn Rodrigues Push inode locking and unlocking closer to where we perform the I/O. For this we need to move the write checks inside the respective functions as well. Signed-off-by: Goldwyn Rodrigues --- fs/btrfs/file.c | 70 ++++++++++++++++++++++++++++++++----------------- 1 file changed, 46 insertions(+), 24 deletions(-) diff --git a/fs/btrfs/file.c b/fs/btrfs/file.c index 1a9a0a9e4b3d..aa6be931620b 100644 --- a/fs/btrfs/file.c +++ b/fs/btrfs/file.c @@ -1672,7 +1672,7 @@ static noinline ssize_t btrfs_buffered_write(struct kiocb *iocb, struct iov_iter *i) { struct file *file = iocb->ki_filp; - loff_t pos = iocb->ki_pos; + loff_t pos; struct inode *inode = file_inode(file); struct btrfs_fs_info *fs_info = btrfs_sb(inode->i_sb); struct btrfs_root *root = BTRFS_I(inode)->root; @@ -1687,14 +1687,29 @@ static noinline ssize_t btrfs_buffered_write(struct kiocb *iocb, bool only_release_metadata = false; bool force_page_uptodate = false; loff_t old_isize = i_size_read(inode); + int ilock_flags = 0; + + if (iocb->ki_flags & IOCB_NOWAIT) + ilock_flags |= BTRFS_ILOCK_TRY; + + ret = btrfs_inode_lock(inode, ilock_flags); + if (ret < 0) + return ret; + + ret = btrfs_write_check(iocb, i); + if (ret <= 0) + goto out; + pos = iocb->ki_pos; nrptrs = min(DIV_ROUND_UP(iov_iter_count(i), PAGE_SIZE), PAGE_SIZE / (sizeof(struct page *))); nrptrs = min(nrptrs, current->nr_dirtied_pause - current->nr_dirtied); nrptrs = max(nrptrs, 8); pages = kmalloc_array(nrptrs, sizeof(struct page *), GFP_KERNEL); - if (!pages) - return -ENOMEM; + if (!pages) { + ret = -ENOMEM; + goto out; + } while (iov_iter_count(i) > 0) { struct extent_state *cached_state = NULL; @@ -1912,6 +1927,8 @@ static noinline ssize_t btrfs_buffered_write(struct kiocb *iocb, pagecache_isize_extended(inode, old_isize, iocb->ki_pos); iocb->ki_pos += num_written; } +out: + btrfs_inode_unlock(inode, ilock_flags); return num_written ? num_written : ret; } @@ -1942,6 +1959,18 @@ static ssize_t btrfs_direct_write(struct kiocb *iocb, struct iov_iter *from) size_t count = 0; bool relock = false; int flags = IOMAP_DIOF_PGINVALID_FAIL; + int ilock_flags = 0; + + if (iocb->ki_flags & IOCB_NOWAIT) + ilock_flags |= BTRFS_ILOCK_TRY; + + err = btrfs_inode_lock(inode, ilock_flags); + if (err < 0) + goto error; + + err = btrfs_write_check(iocb, from); + if (err <= 0) + goto error; if (check_direct_IO(fs_info, from, pos)) goto buffered; @@ -1956,7 +1985,8 @@ static ssize_t btrfs_direct_write(struct kiocb *iocb, struct iov_iter *from) inode_unlock(inode); relock = true; } else if (iocb->ki_flags & IOCB_NOWAIT) { - return -EAGAIN; + err = -EAGAIN; + goto out; } if (is_sync_kiocb(iocb)) @@ -1970,10 +2000,13 @@ static ssize_t btrfs_direct_write(struct kiocb *iocb, struct iov_iter *from) if (relock) inode_lock(inode); - if (written < 0 || !iov_iter_count(from)) - return written; + if (written < 0 || !iov_iter_count(from)) { + err = written; + goto error; + } buffered: + btrfs_inode_unlock(inode, ilock_flags); pos = iocb->ki_pos; written_buffered = btrfs_buffered_write(iocb, from); if (written_buffered < 0) { @@ -1995,8 +2028,13 @@ static ssize_t btrfs_direct_write(struct kiocb *iocb, struct iov_iter *from) iocb->ki_pos = pos + written_buffered; invalidate_mapping_pages(file->f_mapping, pos >> PAGE_SHIFT, endbyte >> PAGE_SHIFT); + out: return written ? written : err; + +error: + btrfs_inode_unlock(inode, ilock_flags); + return err; } static ssize_t btrfs_file_write_iter(struct kiocb *iocb, @@ -2008,8 +2046,7 @@ static ssize_t btrfs_file_write_iter(struct kiocb *iocb, struct btrfs_root *root = BTRFS_I(inode)->root; ssize_t num_written = 0; const bool sync = iocb->ki_flags & IOCB_DSYNC; - ssize_t err; - int ilock_flags = 0; + ssize_t err = 0; /* * If BTRFS flips readonly due to some impossible error @@ -2024,19 +2061,6 @@ static ssize_t btrfs_file_write_iter(struct kiocb *iocb, (iocb->ki_flags & IOCB_NOWAIT)) return -EOPNOTSUPP; - if (iocb->ki_flags & IOCB_NOWAIT) - ilock_flags |= BTRFS_ILOCK_TRY; - - err = btrfs_inode_lock(inode, ilock_flags); - if (err < 0) - goto out; - - err = btrfs_write_check(iocb, from); - if (err <= 0) { - inode_unlock(inode); - goto out; - } - if (sync) atomic_inc(&BTRFS_I(inode)->sync_writers); @@ -2046,8 +2070,6 @@ static ssize_t btrfs_file_write_iter(struct kiocb *iocb, num_written = btrfs_buffered_write(iocb, from); } - btrfs_inode_unlock(inode, ilock_flags); - /* * We also have to set last_sub_trans to the current log transid, * otherwise subsequent syncs to a file that's been synced in this @@ -2061,7 +2083,7 @@ static ssize_t btrfs_file_write_iter(struct kiocb *iocb, if (sync) atomic_dec(&BTRFS_I(inode)->sync_writers); -out: + current->backing_dev_info = NULL; return num_written ? num_written : err; } From patchwork Mon Jun 22 16:20:15 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Goldwyn Rodrigues X-Patchwork-Id: 11618359 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 0677090 for ; Mon, 22 Jun 2020 16:20:43 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id E3395206C1 for ; Mon, 22 Jun 2020 16:20:42 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1729576AbgFVQUm (ORCPT ); Mon, 22 Jun 2020 12:20:42 -0400 Received: from mx2.suse.de ([195.135.220.15]:46346 "EHLO mx2.suse.de" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1729521AbgFVQUl (ORCPT ); Mon, 22 Jun 2020 12:20:41 -0400 X-Virus-Scanned: by amavisd-new at test-mx.suse.de Received: from relay2.suse.de (unknown [195.135.221.27]) by mx2.suse.de (Postfix) with ESMTP id 5F23CC220; Mon, 22 Jun 2020 16:20:40 +0000 (UTC) From: Goldwyn Rodrigues To: linux-btrfs@vger.kernel.org Cc: Goldwyn Rodrigues Subject: [PATCH 6/8] btrfs: Use shared inode lock for direct writes within EOF Date: Mon, 22 Jun 2020 11:20:15 -0500 Message-Id: <20200622162017.21773-7-rgoldwyn@suse.de> X-Mailer: git-send-email 2.25.0 In-Reply-To: <20200622162017.21773-1-rgoldwyn@suse.de> References: <20200622162017.21773-1-rgoldwyn@suse.de> MIME-Version: 1.0 Sender: linux-btrfs-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-btrfs@vger.kernel.org From: Goldwyn Rodrigues This is to parallelize direct writes within EOF or with direct I/O reads. This covers the race with truncate() accidentally increasing the filesize. Signed-off-by: Goldwyn Rodrigues --- fs/btrfs/file.c | 25 +++++++------------------ 1 file changed, 7 insertions(+), 18 deletions(-) diff --git a/fs/btrfs/file.c b/fs/btrfs/file.c index aa6be931620b..c446a4aeb867 100644 --- a/fs/btrfs/file.c +++ b/fs/btrfs/file.c @@ -1957,12 +1957,18 @@ static ssize_t btrfs_direct_write(struct kiocb *iocb, struct iov_iter *from) loff_t endbyte; int err; size_t count = 0; - bool relock = false; int flags = IOMAP_DIOF_PGINVALID_FAIL; int ilock_flags = 0; if (iocb->ki_flags & IOCB_NOWAIT) ilock_flags |= BTRFS_ILOCK_TRY; + /* + * If the write DIO within EOF, use a shared lock + */ + if (pos + count <= i_size_read(inode)) + ilock_flags |= BTRFS_ILOCK_SHARED; + else if (iocb->ki_flags & IOCB_NOWAIT) + return -EAGAIN; err = btrfs_inode_lock(inode, ilock_flags); if (err < 0) @@ -1975,20 +1981,6 @@ static ssize_t btrfs_direct_write(struct kiocb *iocb, struct iov_iter *from) if (check_direct_IO(fs_info, from, pos)) goto buffered; - count = iov_iter_count(from); - /* - * If the write DIO is beyond the EOF, we need update the isize, but it - * is protected by i_mutex. So we can not unlock the i_mutex at this - * case. - */ - if (pos + count <= inode->i_size) { - inode_unlock(inode); - relock = true; - } else if (iocb->ki_flags & IOCB_NOWAIT) { - err = -EAGAIN; - goto out; - } - if (is_sync_kiocb(iocb)) flags |= IOMAP_DIOF_WAIT_FOR_COMPLETION; @@ -1997,9 +1989,6 @@ static ssize_t btrfs_direct_write(struct kiocb *iocb, struct iov_iter *from) flags); up_read(&BTRFS_I(inode)->dio_sem); - if (relock) - inode_lock(inode); - if (written < 0 || !iov_iter_count(from)) { err = written; goto error; From patchwork Mon Jun 22 16:20:16 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Goldwyn Rodrigues X-Patchwork-Id: 11618361 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id D27EA90 for ; Mon, 22 Jun 2020 16:20:46 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id C06BE20767 for ; Mon, 22 Jun 2020 16:20:46 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1729532AbgFVQUq (ORCPT ); Mon, 22 Jun 2020 12:20:46 -0400 Received: from mx2.suse.de ([195.135.220.15]:46370 "EHLO mx2.suse.de" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1729484AbgFVQUp (ORCPT ); Mon, 22 Jun 2020 12:20:45 -0400 X-Virus-Scanned: by amavisd-new at test-mx.suse.de Received: from relay2.suse.de (unknown [195.135.221.27]) by mx2.suse.de (Postfix) with ESMTP id 4CB97C220; Mon, 22 Jun 2020 16:20:43 +0000 (UTC) From: Goldwyn Rodrigues To: linux-btrfs@vger.kernel.org Cc: Goldwyn Rodrigues Subject: [PATCH 7/8] btrfs: Remove dio_sem Date: Mon, 22 Jun 2020 11:20:16 -0500 Message-Id: <20200622162017.21773-8-rgoldwyn@suse.de> X-Mailer: git-send-email 2.25.0 In-Reply-To: <20200622162017.21773-1-rgoldwyn@suse.de> References: <20200622162017.21773-1-rgoldwyn@suse.de> MIME-Version: 1.0 Sender: linux-btrfs-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-btrfs@vger.kernel.org From: Goldwyn Rodrigues Since we handle all synchronization using inode_lock()/unlock(), we don't need dio_sem to handle the race with btrfs_sync_file() Signed-off-by: Goldwyn Rodrigues --- fs/btrfs/btrfs_inode.h | 10 ---------- fs/btrfs/file.c | 14 -------------- fs/btrfs/inode.c | 1 - 3 files changed, 25 deletions(-) diff --git a/fs/btrfs/btrfs_inode.h b/fs/btrfs/btrfs_inode.h index aeff56a0e105..86c5482e0fdd 100644 --- a/fs/btrfs/btrfs_inode.h +++ b/fs/btrfs/btrfs_inode.h @@ -187,16 +187,6 @@ struct btrfs_inode { /* Hook into fs_info->delayed_iputs */ struct list_head delayed_iput; - /* - * To avoid races between lockless (i_mutex not held) direct IO writes - * and concurrent fsync requests. Direct IO writes must acquire read - * access on this semaphore for creating an extent map and its - * corresponding ordered extent. The fast fsync path must acquire write - * access on this semaphore before it collects ordered extents and - * extent maps. - */ - struct rw_semaphore dio_sem; - struct inode vfs_inode; }; diff --git a/fs/btrfs/file.c b/fs/btrfs/file.c index c446a4aeb867..c8127406fcc6 100644 --- a/fs/btrfs/file.c +++ b/fs/btrfs/file.c @@ -1984,10 +1984,8 @@ static ssize_t btrfs_direct_write(struct kiocb *iocb, struct iov_iter *from) if (is_sync_kiocb(iocb)) flags |= IOMAP_DIOF_WAIT_FOR_COMPLETION; - down_read(&BTRFS_I(inode)->dio_sem); written = iomap_dio_rw(iocb, from, &btrfs_dio_iomap_ops, &btrfs_dops, flags); - up_read(&BTRFS_I(inode)->dio_sem); if (written < 0 || !iov_iter_count(from)) { err = written; @@ -2165,13 +2163,6 @@ int btrfs_sync_file(struct file *file, loff_t start, loff_t end, int datasync) inode_lock(inode); - /* - * We take the dio_sem here because the tree log stuff can race with - * lockless dio writes and get an extent map logged for an extent we - * never waited on. We need it this high up for lockdep reasons. - */ - down_write(&BTRFS_I(inode)->dio_sem); - atomic_inc(&root->log_batch); /* @@ -2209,7 +2200,6 @@ int btrfs_sync_file(struct file *file, loff_t start, loff_t end, int datasync) */ ret = start_ordered_ops(inode, start, end); if (ret) { - up_write(&BTRFS_I(inode)->dio_sem); inode_unlock(inode); goto out; } @@ -2223,7 +2213,6 @@ int btrfs_sync_file(struct file *file, loff_t start, loff_t end, int datasync) */ ret = btrfs_wait_ordered_range(inode, start, (u64)end - (u64)start + 1); if (ret) { - up_write(&BTRFS_I(inode)->dio_sem); inode_unlock(inode); goto out; } @@ -2247,7 +2236,6 @@ int btrfs_sync_file(struct file *file, loff_t start, loff_t end, int datasync) * checked called fsync. */ ret = filemap_check_wb_err(inode->i_mapping, file->f_wb_err); - up_write(&BTRFS_I(inode)->dio_sem); inode_unlock(inode); goto out; } @@ -2266,7 +2254,6 @@ int btrfs_sync_file(struct file *file, loff_t start, loff_t end, int datasync) trans = btrfs_start_transaction(root, 0); if (IS_ERR(trans)) { ret = PTR_ERR(trans); - up_write(&BTRFS_I(inode)->dio_sem); inode_unlock(inode); goto out; } @@ -2287,7 +2274,6 @@ int btrfs_sync_file(struct file *file, loff_t start, loff_t end, int datasync) * file again, but that will end up using the synchronization * inside btrfs_sync_log to keep things safe. */ - up_write(&BTRFS_I(inode)->dio_sem); inode_unlock(inode); if (ret != BTRFS_NO_LOG_SYNC) { diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c index 31ac8c682f19..7e71f42e7ec0 100644 --- a/fs/btrfs/inode.c +++ b/fs/btrfs/inode.c @@ -8433,7 +8433,6 @@ struct inode *btrfs_alloc_inode(struct super_block *sb) INIT_LIST_HEAD(&ei->delalloc_inodes); INIT_LIST_HEAD(&ei->delayed_iput); RB_CLEAR_NODE(&ei->rb_node); - init_rwsem(&ei->dio_sem); return inode; } From patchwork Mon Jun 22 16:20:17 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Goldwyn Rodrigues X-Patchwork-Id: 11618363 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id C04D060D for ; Mon, 22 Jun 2020 16:20:49 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id AFCC12073E for ; Mon, 22 Jun 2020 16:20:49 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1729589AbgFVQUs (ORCPT ); Mon, 22 Jun 2020 12:20:48 -0400 Received: from mx2.suse.de ([195.135.220.15]:46426 "EHLO mx2.suse.de" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1729484AbgFVQUs (ORCPT ); Mon, 22 Jun 2020 12:20:48 -0400 X-Virus-Scanned: by amavisd-new at test-mx.suse.de Received: from relay2.suse.de (unknown [195.135.221.27]) by mx2.suse.de (Postfix) with ESMTP id 46DDEC220; Mon, 22 Jun 2020 16:20:46 +0000 (UTC) From: Goldwyn Rodrigues To: linux-btrfs@vger.kernel.org Cc: Goldwyn Rodrigues Subject: [PATCH 8/8] btrfs: Move generic_write_sync() to btrfs_buffered_write() Date: Mon, 22 Jun 2020 11:20:17 -0500 Message-Id: <20200622162017.21773-9-rgoldwyn@suse.de> X-Mailer: git-send-email 2.25.0 In-Reply-To: <20200622162017.21773-1-rgoldwyn@suse.de> References: <20200622162017.21773-1-rgoldwyn@suse.de> MIME-Version: 1.0 Sender: linux-btrfs-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-btrfs@vger.kernel.org From: Goldwyn Rodrigues For direct I/O writes, generic_write_sync() gets called twice, once with iomap_dio_complete() and the next time in btrfs_file_write_iter(). Move it within btrfs_buffered_write(). In the process, we have to move btrfs_inode->last_sub_trans as well to btrfs_buffered_write() and btrfs_dio_iomap_end() respectively. Signed-off-by: Goldwyn Rodrigues --- fs/btrfs/file.c | 23 +++++++++++------------ fs/btrfs/inode.c | 12 ++++++++---- 2 files changed, 19 insertions(+), 16 deletions(-) diff --git a/fs/btrfs/file.c b/fs/btrfs/file.c index c8127406fcc6..8bbde5474297 100644 --- a/fs/btrfs/file.c +++ b/fs/btrfs/file.c @@ -1929,6 +1929,17 @@ static noinline ssize_t btrfs_buffered_write(struct kiocb *iocb, } out: btrfs_inode_unlock(inode, ilock_flags); + /* + * We also have to set last_sub_trans to the current log transid, + * otherwise subsequent syncs to a file that's been synced in this + * transaction will appear to have already occurred. + */ + spin_lock(&BTRFS_I(inode)->lock); + BTRFS_I(inode)->last_sub_trans = root->log_transid; + spin_unlock(&BTRFS_I(inode)->lock); + if (num_written > 0) + num_written = generic_write_sync(iocb, num_written); + return num_written ? num_written : ret; } @@ -2030,7 +2041,6 @@ static ssize_t btrfs_file_write_iter(struct kiocb *iocb, struct file *file = iocb->ki_filp; struct inode *inode = file_inode(file); struct btrfs_fs_info *fs_info = btrfs_sb(inode->i_sb); - struct btrfs_root *root = BTRFS_I(inode)->root; ssize_t num_written = 0; const bool sync = iocb->ki_flags & IOCB_DSYNC; ssize_t err = 0; @@ -2057,17 +2067,6 @@ static ssize_t btrfs_file_write_iter(struct kiocb *iocb, num_written = btrfs_buffered_write(iocb, from); } - /* - * We also have to set last_sub_trans to the current log transid, - * otherwise subsequent syncs to a file that's been synced in this - * transaction will appear to have already occurred. - */ - spin_lock(&BTRFS_I(inode)->lock); - BTRFS_I(inode)->last_sub_trans = root->log_transid; - spin_unlock(&BTRFS_I(inode)->lock); - if (num_written > 0) - num_written = generic_write_sync(iocb, num_written); - if (sync) atomic_dec(&BTRFS_I(inode)->sync_writers); diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c index 7e71f42e7ec0..e0cfb1a5b952 100644 --- a/fs/btrfs/inode.c +++ b/fs/btrfs/inode.c @@ -7416,10 +7416,11 @@ static int btrfs_dio_iomap_end(struct inode *inode, loff_t pos, loff_t length, struct btrfs_dio_data *dio_data = iomap->private; size_t submitted = dio_data->submitted; const bool write = !!(flags & IOMAP_WRITE); + struct btrfs_inode *bi = BTRFS_I(inode); if (!write && (iomap->type == IOMAP_HOLE)) { /* If reading from a hole, unlock and return */ - unlock_extent(&BTRFS_I(inode)->io_tree, pos, pos + length - 1); + unlock_extent(&bi->io_tree, pos, pos + length - 1); goto out; } @@ -7429,8 +7430,7 @@ static int btrfs_dio_iomap_end(struct inode *inode, loff_t pos, loff_t length, if (write) __endio_write_update_ordered(inode, pos, length, false); else - unlock_extent(&BTRFS_I(inode)->io_tree, pos, - pos + length - 1); + unlock_extent(&bi->io_tree, pos, pos + length - 1); ret = -ENOTBLK; } @@ -7439,8 +7439,12 @@ static int btrfs_dio_iomap_end(struct inode *inode, loff_t pos, loff_t length, btrfs_delalloc_release_space(inode, dio_data->data_reserved, pos, dio_data->reserve, true); - btrfs_delalloc_release_extents(BTRFS_I(inode), dio_data->length); + btrfs_delalloc_release_extents(bi, dio_data->length); extent_changeset_free(dio_data->data_reserved); + spin_lock(&bi->lock); + bi->last_sub_trans = bi->root->log_transid; + spin_unlock(&bi->lock); + } out: kfree(dio_data);