From patchwork Tue Jan 14 16:12:14 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Christoph Hellwig X-Patchwork-Id: 11332653 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 7B3191398 for ; Tue, 14 Jan 2020 16:13:32 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.kernel.org (Postfix) with ESMTP id 4B1142467A for ; Tue, 14 Jan 2020 16:13:32 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=fail reason="signature verification failed" (2048-bit key) header.d=infradead.org header.i=@infradead.org header.b="hZ2uCty8" Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1728757AbgANQMb (ORCPT ); Tue, 14 Jan 2020 11:12:31 -0500 Received: from bombadil.infradead.org ([198.137.202.133]:43352 "EHLO bombadil.infradead.org" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1726053AbgANQMb (ORCPT ); Tue, 14 Jan 2020 11:12:31 -0500 DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=infradead.org; s=bombadil.20170209; h=Content-Transfer-Encoding: MIME-Version:References:In-Reply-To:Message-Id:Date:Subject:Cc:To:From:Sender :Reply-To:Content-Type:Content-ID:Content-Description:Resent-Date:Resent-From :Resent-Sender:Resent-To:Resent-Cc:Resent-Message-ID:List-Id:List-Help: List-Unsubscribe:List-Subscribe:List-Post:List-Owner:List-Archive; bh=qvys4y5BuENtlHq/fQJ5HpYZJz7aH8HDgIfuuwBY89Q=; b=hZ2uCty81+cLZZ+AsRSBc8RzcG kC15Kg6m7XjWGXt8bZwI3hFaj/SqYbm/8p3j7EuKhk/rJCCyXz7IoovhUGraWUO/P7TvsGWnYCwm9 D8mmmN/0rqoay7ZV2O/4AxayCcPF41tBPGAXRySkvtmkorPeP9n+3KUZwDnMogNsudXoZFlEUTdF/ a5QKf9ajqV5tgHOeo+IVo0cqkhwkqF+QX8Ay9IhuNEGb6SxJe0O9IFxQYXB0xmJVKK8EJSgFINc4f 5aAkHc3tOjML58DMTOlXPrFqR2DhdRp6/Vf4K6W1SJZMZvNid2zZazualtOvqFH0qifjtLd7pI/mb i98/ibWg==; Received: from [2001:4bb8:18c:4f54:fcbb:a92b:61e1:719] (helo=localhost) 
by bombadil.infradead.org with esmtpsa (Exim 4.92.3 #3 (Red Hat Linux)) id 1irOnu-00007I-4k; Tue, 14 Jan 2020 16:12:30 +0000 From: Christoph Hellwig To: linux-xfs@vger.kernel.org, linux-fsdevel@vger.kernel.org, Waiman Long , Peter Zijlstra , Thomas Gleixner , Ingo Molnar , Will Deacon , Andrew Morton , linux-ext4@vger.kernel.org, cluster-devel@redhat.com Cc: linux-kernel@vger.kernel.org, linux-mm@kvack.org Subject: [PATCH 01/12] mm: fix a comment in sys_swapon Date: Tue, 14 Jan 2020 17:12:14 +0100 Message-Id: <20200114161225.309792-2-hch@lst.de> X-Mailer: git-send-email 2.24.1 In-Reply-To: <20200114161225.309792-1-hch@lst.de> References: <20200114161225.309792-1-hch@lst.de> MIME-Version: 1.0 X-SRS-Rewrite: SMTP reverse-path rewritten from by bombadil.infradead.org. See http://www.infradead.org/rpr.html Sender: linux-xfs-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-xfs@vger.kernel.org claim_swapfile now always takes i_rwsem. Signed-off-by: Christoph Hellwig --- mm/swapfile.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/mm/swapfile.c b/mm/swapfile.c index bb3261d45b6a..fe6e4c1add0b 100644 --- a/mm/swapfile.c +++ b/mm/swapfile.c @@ -3157,7 +3157,7 @@ SYSCALL_DEFINE2(swapon, const char __user *, specialfile, int, swap_flags) mapping = swap_file->f_mapping; inode = mapping->host; - /* If S_ISREG(inode->i_mode) will do inode_lock(inode); */ + /* will take i_rwsem; */ error = claim_swapfile(p, inode); if (unlikely(error)) goto bad_swap; From patchwork Tue Jan 14 16:12:15 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Christoph Hellwig X-Patchwork-Id: 11332651 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 0B0A56C1 for ; Tue, 14 Jan 2020 16:13:31 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by 
mail.kernel.org (Postfix) with ESMTP id D318B24658 for ; Tue, 14 Jan 2020 16:13:30 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=fail reason="signature verification failed" (2048-bit key) header.d=infradead.org header.i=@infradead.org header.b="LRd1rAVd" Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1728904AbgANQMe (ORCPT ); Tue, 14 Jan 2020 11:12:34 -0500 Received: from bombadil.infradead.org ([198.137.202.133]:43372 "EHLO bombadil.infradead.org" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1728850AbgANQMd (ORCPT ); Tue, 14 Jan 2020 11:12:33 -0500 DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=infradead.org; s=bombadil.20170209; h=Content-Transfer-Encoding: MIME-Version:References:In-Reply-To:Message-Id:Date:Subject:Cc:To:From:Sender :Reply-To:Content-Type:Content-ID:Content-Description:Resent-Date:Resent-From :Resent-Sender:Resent-To:Resent-Cc:Resent-Message-ID:List-Id:List-Help: List-Unsubscribe:List-Subscribe:List-Post:List-Owner:List-Archive; bh=kUvZdbrxgXYM+XtmN49YBQ9SrUjZBQUmtmFMa7zViEA=; b=LRd1rAVdu7xTOpYjtX3JQ4QZqG OqE7DLd8J2fmEAv0BXF5eYeuB+Q4dCGgSpqnyXC/I+SAPH2yKKm2W8AGVklxNeHrCc1aMXKfFPqh3 FftbCFmQR9IbEawSShE+O3RO0vdHjoG5zBSXk6SDvo6YgfFtI01W1ebtS0qaocS8cQyK9O7GSYqZx 5UM81HbcAvgOK6roMBpJ7uqBXhC5VacnY/zdsuGhGItpqfMqGz3rcy0umTAJQXsjxlnxGk4qs8+UO mklssD83lByYvKT7G3flWLK7WiKJVLPSMW6Cg4vdkf0oaB2HzWfW+ruX2oiypP9+wwGm7uH9HiCzy 9qiyXnxw==; Received: from [2001:4bb8:18c:4f54:fcbb:a92b:61e1:719] (helo=localhost) by bombadil.infradead.org with esmtpsa (Exim 4.92.3 #3 (Red Hat Linux)) id 1irOnw-00007u-QN; Tue, 14 Jan 2020 16:12:33 +0000 From: Christoph Hellwig To: linux-xfs@vger.kernel.org, linux-fsdevel@vger.kernel.org, Waiman Long , Peter Zijlstra , Thomas Gleixner , Ingo Molnar , Will Deacon , Andrew Morton , linux-ext4@vger.kernel.org, cluster-devel@redhat.com Cc: linux-kernel@vger.kernel.org, linux-mm@kvack.org Subject: [PATCH 02/12] locking/rwsem: Exit early when held by an anonymous 
owner Date: Tue, 14 Jan 2020 17:12:15 +0100 Message-Id: <20200114161225.309792-3-hch@lst.de> X-Mailer: git-send-email 2.24.1 In-Reply-To: <20200114161225.309792-1-hch@lst.de> References: <20200114161225.309792-1-hch@lst.de> MIME-Version: 1.0 X-SRS-Rewrite: SMTP reverse-path rewritten from by bombadil.infradead.org. See http://www.infradead.org/rpr.html Sender: linux-xfs-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-xfs@vger.kernel.org The rwsem code overloads the owner field with either a task struct or negative magic numbers. Add a quick hack to catch these negative values early on. Without this, when spinning on a writer that replaced the owner with RWSEM_OWNER_UNKNOWN, rwsem_spin_on_owner can crash while dereferencing the task_struct ->on_cpu field of a -8 value. XXX: This might be a bit of a hack as the code otherwise doesn't use the ERR_PTR family macros; better suggestions welcome. Signed-off-by: Christoph Hellwig --- kernel/locking/rwsem.c | 2 ++ 1 file changed, 2 insertions(+) diff --git a/kernel/locking/rwsem.c b/kernel/locking/rwsem.c index 44e68761f432..6adc719a30a1 100644 --- a/kernel/locking/rwsem.c +++ b/kernel/locking/rwsem.c @@ -725,6 +725,8 @@ rwsem_spin_on_owner(struct rw_semaphore *sem, unsigned long nonspinnable) state = rwsem_owner_state(owner, flags, nonspinnable); if (state != OWNER_WRITER) return state; + if (IS_ERR(owner)) + return state; rcu_read_lock(); for (;;) { From patchwork Tue Jan 14 16:12:16 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Christoph Hellwig X-Patchwork-Id: 11332649 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id C29A56C1 for ; Tue, 14 Jan 2020 16:13:28 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.kernel.org (Postfix) with ESMTP id 97B7024686 for ; Tue, 14 Jan 2020 16:13:28 
+0000 (UTC) Authentication-Results: mail.kernel.org; dkim=fail reason="signature verification failed" (2048-bit key) header.d=infradead.org header.i=@infradead.org header.b="kWOqEnHo" Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1729076AbgANQMg (ORCPT ); Tue, 14 Jan 2020 11:12:36 -0500 Received: from bombadil.infradead.org ([198.137.202.133]:43402 "EHLO bombadil.infradead.org" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1728850AbgANQMg (ORCPT ); Tue, 14 Jan 2020 11:12:36 -0500 DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=infradead.org; s=bombadil.20170209; h=Content-Transfer-Encoding: MIME-Version:References:In-Reply-To:Message-Id:Date:Subject:Cc:To:From:Sender :Reply-To:Content-Type:Content-ID:Content-Description:Resent-Date:Resent-From :Resent-Sender:Resent-To:Resent-Cc:Resent-Message-ID:List-Id:List-Help: List-Unsubscribe:List-Subscribe:List-Post:List-Owner:List-Archive; bh=ieFO0VW89Plk/SxbxvjUMyxGKaaWQ3fu6VZZX3+eXfU=; b=kWOqEnHoRjSX1561j4MXD/6Osb /OSAUcMlRV+0OWE0r2Z9tu9pFwGdI9kU0cZLSrXGU5IgPFiU6NFRlChhIO7uHCiaP4wNdJTnqu6T9 Ps6ig31zrnEbY0fXUCQw3qTjXUe1NlpLy17XH2HPoPNpiJUSG0x6d++F+lFe7LfPM6r+8Xv04Gyag AbtmbVoAxfu/Ep68NNEmqpcqONUpR8jSGGMIFYT1Py+iCoBpYi14sVJ+dAbBfnXr2oVLhpUAkIL9K Aypu60LR6r1u0zQW/bnDAMKp4wozCSqUhg9jx/OHWyPSJHeWsIXS11ws8MzTl5qh3PhQvKaIew/NW EU6QFkZg==; Received: from [2001:4bb8:18c:4f54:fcbb:a92b:61e1:719] (helo=localhost) by bombadil.infradead.org with esmtpsa (Exim 4.92.3 #3 (Red Hat Linux)) id 1irOnz-00008Y-Fj; Tue, 14 Jan 2020 16:12:36 +0000 From: Christoph Hellwig To: linux-xfs@vger.kernel.org, linux-fsdevel@vger.kernel.org, Waiman Long , Peter Zijlstra , Thomas Gleixner , Ingo Molnar , Will Deacon , Andrew Morton , linux-ext4@vger.kernel.org, cluster-devel@redhat.com Cc: linux-kernel@vger.kernel.org, linux-mm@kvack.org Subject: [PATCH 03/12] xfs: fix IOCB_NOWAIT handling in xfs_file_dio_aio_read Date: Tue, 14 Jan 2020 17:12:16 +0100 Message-Id: 
<20200114161225.309792-4-hch@lst.de> X-Mailer: git-send-email 2.24.1 In-Reply-To: <20200114161225.309792-1-hch@lst.de> References: <20200114161225.309792-1-hch@lst.de> MIME-Version: 1.0 X-SRS-Rewrite: SMTP reverse-path rewritten from by bombadil.infradead.org. See http://www.infradead.org/rpr.html Sender: linux-xfs-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-xfs@vger.kernel.org Direct I/O reads can also be used with RWF_NOWAIT & co. Fix the inode locking in xfs_file_dio_aio_read to take IOCB_NOWAIT into account. Signed-off-by: Christoph Hellwig --- fs/xfs/xfs_file.c | 7 ++++++- 1 file changed, 6 insertions(+), 1 deletion(-) diff --git a/fs/xfs/xfs_file.c b/fs/xfs/xfs_file.c index c93250108952..b8a4a3f29b36 100644 --- a/fs/xfs/xfs_file.c +++ b/fs/xfs/xfs_file.c @@ -187,7 +187,12 @@ xfs_file_dio_aio_read( file_accessed(iocb->ki_filp); - xfs_ilock(ip, XFS_IOLOCK_SHARED); + if (iocb->ki_flags & IOCB_NOWAIT) { + if (!xfs_ilock_nowait(ip, XFS_IOLOCK_SHARED)) + return -EAGAIN; + } else { + xfs_ilock(ip, XFS_IOLOCK_SHARED); + } ret = iomap_dio_rw(iocb, to, &xfs_read_iomap_ops, NULL, is_sync_kiocb(iocb)); xfs_iunlock(ip, XFS_IOLOCK_SHARED); From patchwork Tue Jan 14 16:12:17 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Christoph Hellwig X-Patchwork-Id: 11332647 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id CF8101398 for ; Tue, 14 Jan 2020 16:13:27 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.kernel.org (Postfix) with ESMTP id ABEE124658 for ; Tue, 14 Jan 2020 16:13:27 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=fail reason="signature verification failed" (2048-bit key) header.d=infradead.org header.i=@infradead.org header.b="oplgdNMp" Received: (majordomo@vger.kernel.org) by vger.kernel.org via 
listexpand id S1729174AbgANQMk (ORCPT ); Tue, 14 Jan 2020 11:12:40 -0500 Received: from bombadil.infradead.org ([198.137.202.133]:43448 "EHLO bombadil.infradead.org" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1728800AbgANQMj (ORCPT ); Tue, 14 Jan 2020 11:12:39 -0500 DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=infradead.org; s=bombadil.20170209; h=Content-Transfer-Encoding: MIME-Version:References:In-Reply-To:Message-Id:Date:Subject:Cc:To:From:Sender :Reply-To:Content-Type:Content-ID:Content-Description:Resent-Date:Resent-From :Resent-Sender:Resent-To:Resent-Cc:Resent-Message-ID:List-Id:List-Help: List-Unsubscribe:List-Subscribe:List-Post:List-Owner:List-Archive; bh=79c/IqiMprBaz8H+ADFYKQUEqd4rZzvM1P9wDVwg94s=; b=oplgdNMprg/V9W1Vc9mSW6qBSh ChGK0XTPuUo5pfag32IxC8sSxaBrkFyl0myFkYt5pYEyGDrjCOovd86lG6sDj5l1Dco9y642z5ofb N/SL8Ih0SiW0jUAC/rpbTkGHn1T03SY2d3AGw+aVATKAXqEDbHN/Dygi0lo5LHvV8raebRWT9GfJi DKIua/yP7vmAD/gBNERutRlhO3Wd2YpngZcwxHUuVlIJRVz7EivgPFTLE7wzC+rnD9dkkn1uiTVrE TSVhdz5wCPHNKFv3VsImok06/b6DKyqx29YHWXMa+eS2n363pcieQPrgHV03WpEDEzEpMmSeup9El slJllqmg==; Received: from [2001:4bb8:18c:4f54:fcbb:a92b:61e1:719] (helo=localhost) by bombadil.infradead.org with esmtpsa (Exim 4.92.3 #3 (Red Hat Linux)) id 1irOo2-00009S-As; Tue, 14 Jan 2020 16:12:38 +0000 From: Christoph Hellwig To: linux-xfs@vger.kernel.org, linux-fsdevel@vger.kernel.org, Waiman Long , Peter Zijlstra , Thomas Gleixner , Ingo Molnar , Will Deacon , Andrew Morton , linux-ext4@vger.kernel.org, cluster-devel@redhat.com Cc: linux-kernel@vger.kernel.org, linux-mm@kvack.org Subject: [PATCH 04/12] gfs2: move setting current->backing_dev_info Date: Tue, 14 Jan 2020 17:12:17 +0100 Message-Id: <20200114161225.309792-5-hch@lst.de> X-Mailer: git-send-email 2.24.1 In-Reply-To: <20200114161225.309792-1-hch@lst.de> References: <20200114161225.309792-1-hch@lst.de> MIME-Version: 1.0 X-SRS-Rewrite: SMTP reverse-path rewritten from by bombadil.infradead.org. 
See http://www.infradead.org/rpr.html Sender: linux-xfs-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-xfs@vger.kernel.org Only set current->backing_dev_info just around the buffered write calls to prepare for the next fix. Signed-off-by: Christoph Hellwig --- fs/gfs2/file.c | 21 ++++++++++----------- 1 file changed, 10 insertions(+), 11 deletions(-) diff --git a/fs/gfs2/file.c b/fs/gfs2/file.c index 9d58295ccf7a..21d032c4b077 100644 --- a/fs/gfs2/file.c +++ b/fs/gfs2/file.c @@ -867,18 +867,15 @@ static ssize_t gfs2_file_write_iter(struct kiocb *iocb, struct iov_iter *from) inode_lock(inode); ret = generic_write_checks(iocb, from); if (ret <= 0) - goto out; - - /* We can write back this queue in page reclaim */ - current->backing_dev_info = inode_to_bdi(inode); + goto out_unlock; ret = file_remove_privs(file); if (ret) - goto out2; + goto out_unlock; ret = file_update_time(file); if (ret) - goto out2; + goto out_unlock; if (iocb->ki_flags & IOCB_DIRECT) { struct address_space *mapping = file->f_mapping; @@ -887,11 +884,13 @@ static ssize_t gfs2_file_write_iter(struct kiocb *iocb, struct iov_iter *from) written = gfs2_file_direct_write(iocb, from); if (written < 0 || !iov_iter_count(from)) - goto out2; + goto out_unlock; + current->backing_dev_info = inode_to_bdi(inode); ret = iomap_file_buffered_write(iocb, from, &gfs2_iomap_ops); + current->backing_dev_info = NULL; if (unlikely(ret < 0)) - goto out2; + goto out_unlock; buffered = ret; /* @@ -915,14 +914,14 @@ static ssize_t gfs2_file_write_iter(struct kiocb *iocb, struct iov_iter *from) */ } } else { + current->backing_dev_info = inode_to_bdi(inode); ret = iomap_file_buffered_write(iocb, from, &gfs2_iomap_ops); + current->backing_dev_info = NULL; if (likely(ret > 0)) iocb->ki_pos += ret; } -out2: - current->backing_dev_info = NULL; -out: +out_unlock: inode_unlock(inode); if (likely(ret > 0)) { /* Handle various SYNC-type writes */ From patchwork Tue Jan 14 16:12:18 2020 Content-Type: 
text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Christoph Hellwig X-Patchwork-Id: 11332645 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 862641398 for ; Tue, 14 Jan 2020 16:13:26 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.kernel.org (Postfix) with ESMTP id 64C9F24655 for ; Tue, 14 Jan 2020 16:13:26 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=fail reason="signature verification failed" (2048-bit key) header.d=infradead.org header.i=@infradead.org header.b="OkrfZS8N" Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1729240AbgANQMn (ORCPT ); Tue, 14 Jan 2020 11:12:43 -0500 Received: from bombadil.infradead.org ([198.137.202.133]:43478 "EHLO bombadil.infradead.org" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1728986AbgANQMl (ORCPT ); Tue, 14 Jan 2020 11:12:41 -0500 DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=infradead.org; s=bombadil.20170209; h=Content-Transfer-Encoding: MIME-Version:References:In-Reply-To:Message-Id:Date:Subject:Cc:To:From:Sender :Reply-To:Content-Type:Content-ID:Content-Description:Resent-Date:Resent-From :Resent-Sender:Resent-To:Resent-Cc:Resent-Message-ID:List-Id:List-Help: List-Unsubscribe:List-Subscribe:List-Post:List-Owner:List-Archive; bh=gE1ExdwCpfYksx1sroYSdgpkjqJWhJqvmDoS1LitSEk=; b=OkrfZS8NRZAUrCzKeBJ9qWvRAB 2p1eoX14Slhqy2MdQbUQPF0HUA7/g5EN85LDdIwsd3BB9W7/vFoDRbIUV1u3p8uRTNQdFA89yhgL0 9jVm1PJqp35970u4WIuAU8p6kNZs2ZRtGJh3MVPfizmEgVpisfkPS5RqCDD065W+lOkbgH/N4b1Bz 7rnYRufVoQXMjKefjP7lJXcCL1F7uOU+daZA/kor/TTYgisnaIvGQhJJN6++IY5+4HY49K2j6A2Mh gq2BDx44Hr656RZ6OjOd0CKj3XN1ffkbrMPs0HMihU+PDNmbAhTVWrX7qp1DCYhsdIKw9HUpFhaEO Ej399QhQ==; Received: from [2001:4bb8:18c:4f54:fcbb:a92b:61e1:719] (helo=localhost) by bombadil.infradead.org with esmtpsa (Exim 4.92.3 #3 
(Red Hat Linux)) id 1irOo4-0000AR-W5; Tue, 14 Jan 2020 16:12:41 +0000 From: Christoph Hellwig To: linux-xfs@vger.kernel.org, linux-fsdevel@vger.kernel.org, Waiman Long , Peter Zijlstra , Thomas Gleixner , Ingo Molnar , Will Deacon , Andrew Morton , linux-ext4@vger.kernel.org, cluster-devel@redhat.com Cc: linux-kernel@vger.kernel.org, linux-mm@kvack.org Subject: [PATCH 05/12] gfs2: fix O_SYNC write handling Date: Tue, 14 Jan 2020 17:12:18 +0100 Message-Id: <20200114161225.309792-6-hch@lst.de> X-Mailer: git-send-email 2.24.1 In-Reply-To: <20200114161225.309792-1-hch@lst.de> References: <20200114161225.309792-1-hch@lst.de> MIME-Version: 1.0 X-SRS-Rewrite: SMTP reverse-path rewritten from by bombadil.infradead.org. See http://www.infradead.org/rpr.html Sender: linux-xfs-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-xfs@vger.kernel.org Don't ignore the return value from generic_write_sync for the direct-to-buffered I/O fallback case when written is non-zero. Also don't bother to call generic_write_sync for the pure direct I/O case, as iomap_dio_rw already takes care of that. 
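The first point above can be illustrated with a simplified userspace model of the function's return path. This is only a sketch, not the gfs2 code: model_write_sync(), old_tail() and new_tail() are hypothetical names, and sync_err stands in for whatever error generic_write_sync might report.

```c
#include <assert.h>
#include <errno.h>

/*
 * Hypothetical stand-in for generic_write_sync(): return the byte count
 * on success, or a negative errno if the sync failed.
 */
static long model_write_sync(long count, int sync_err)
{
	assert(sync_err <= 0);	/* errors are negative errno values */
	return sync_err ? sync_err : count;
}

/*
 * Shape of the old tail: a non-zero 'written' wins over 'ret', so an
 * error stored in 'ret' is never reported to the caller.
 */
static long old_tail(long written, long ret)
{
	return written ? written : ret;
}

/* Shape of the new tail: the sync result is returned directly. */
static long new_tail(long count, int sync_err)
{
	return model_write_sync(count, sync_err);
}
```

Under these assumptions, old_tail(4096, -EIO) reports 4096 bytes written even though the error is present, while new_tail(4096, -EIO) reports -EIO.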
Signed-off-by: Christoph Hellwig --- fs/gfs2/file.c | 51 +++++++++++++++++++++++++------------------------- 1 file changed, 25 insertions(+), 26 deletions(-) diff --git a/fs/gfs2/file.c b/fs/gfs2/file.c index 21d032c4b077..86c0e61407b6 100644 --- a/fs/gfs2/file.c +++ b/fs/gfs2/file.c @@ -847,7 +847,7 @@ static ssize_t gfs2_file_write_iter(struct kiocb *iocb, struct iov_iter *from) struct file *file = iocb->ki_filp; struct inode *inode = file_inode(file); struct gfs2_inode *ip = GFS2_I(inode); - ssize_t written = 0, ret; + ssize_t ret = 0; ret = gfs2_rsqa_alloc(ip); if (ret) @@ -882,52 +882,51 @@ static ssize_t gfs2_file_write_iter(struct kiocb *iocb, struct iov_iter *from) loff_t pos, endbyte; ssize_t buffered; - written = gfs2_file_direct_write(iocb, from); - if (written < 0 || !iov_iter_count(from)) + ret = gfs2_file_direct_write(iocb, from); + if (ret < 0 || !iov_iter_count(from)) goto out_unlock; current->backing_dev_info = inode_to_bdi(inode); - ret = iomap_file_buffered_write(iocb, from, &gfs2_iomap_ops); + buffered = iomap_file_buffered_write(iocb, from, + &gfs2_iomap_ops); current->backing_dev_info = NULL; - if (unlikely(ret < 0)) + if (unlikely(buffered <= 0)) { + if (buffered < 0) + ret = buffered; goto out_unlock; - buffered = ret; + } /* * We need to ensure that the page cache pages are written to * disk and invalidated to preserve the expected O_DIRECT - * semantics. + * semantics. If the writeback or invalidate fails only report + * the direct I/O range as we don't know if the buffered pages + * made it to disk. 
*/ pos = iocb->ki_pos; endbyte = pos + buffered - 1; ret = filemap_write_and_wait_range(mapping, pos, endbyte); - if (!ret) { - iocb->ki_pos += buffered; - written += buffered; - invalidate_mapping_pages(mapping, - pos >> PAGE_SHIFT, - endbyte >> PAGE_SHIFT); - } else { - /* - * We don't know how much we wrote, so just return - * the number of bytes which were direct-written - */ - } + if (ret) + goto out_unlock; + + invalidate_mapping_pages(mapping, pos >> PAGE_SHIFT, + endbyte >> PAGE_SHIFT); + ret += buffered; } else { current->backing_dev_info = inode_to_bdi(inode); ret = iomap_file_buffered_write(iocb, from, &gfs2_iomap_ops); current->backing_dev_info = NULL; - if (likely(ret > 0)) - iocb->ki_pos += ret; + if (unlikely(ret <= 0)) + goto out_unlock; } + iocb->ki_pos += ret; + inode_unlock(inode); + return generic_write_sync(iocb, ret); + out_unlock: inode_unlock(inode); - if (likely(ret > 0)) { - /* Handle various SYNC-type writes */ - ret = generic_write_sync(iocb, ret); - } - return written ? 
written : ret; + return ret; } static int fallocate_chunk(struct inode *inode, loff_t offset, loff_t len, From patchwork Tue Jan 14 16:12:19 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Christoph Hellwig X-Patchwork-Id: 11332635 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 4D1611398 for ; Tue, 14 Jan 2020 16:13:20 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.kernel.org (Postfix) with ESMTP id 233532465A for ; Tue, 14 Jan 2020 16:13:20 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=fail reason="signature verification failed" (2048-bit key) header.d=infradead.org header.i=@infradead.org header.b="iGOc6nDb" Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1728986AbgANQMq (ORCPT ); Tue, 14 Jan 2020 11:12:46 -0500 Received: from bombadil.infradead.org ([198.137.202.133]:43516 "EHLO bombadil.infradead.org" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1729263AbgANQMo (ORCPT ); Tue, 14 Jan 2020 11:12:44 -0500 DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=infradead.org; s=bombadil.20170209; h=Content-Transfer-Encoding: MIME-Version:References:In-Reply-To:Message-Id:Date:Subject:Cc:To:From:Sender :Reply-To:Content-Type:Content-ID:Content-Description:Resent-Date:Resent-From :Resent-Sender:Resent-To:Resent-Cc:Resent-Message-ID:List-Id:List-Help: List-Unsubscribe:List-Subscribe:List-Post:List-Owner:List-Archive; bh=keR0X+WfeQcwktT8eyN6Jj1q2oCggZn3aD+r2MvaupY=; b=iGOc6nDbzK9T+olY+Y8Yr6GQC8 ldbeOze0xj3z7wnqZYqyikHW/b1WntEa1URKv7wRibkxDWk3f+5Y4aErWvbEQbccNPSXMk/P0rvdb hyOkQHJquup/q3zDF9vRIENczJRJhy9fjQNnyQc2K4fXuZjMdarik9T3iwuoxVRXAoTKERUFQF1HU U51jzCHo6Z/Y9tgkwjKnACfGGcWLUqk4L6dNksek+3toV/4GS7jG3T2GDVLlICozak3sBQdk8d8jK 
MdCHCRCpochiSs/D4KQ/kuFOt1NqfkHimNFhXXl0kElJi7wvvJabgZVYlydkY15YfiXgOAuWdTiyV 9U35iYRQ==; Received: from [2001:4bb8:18c:4f54:fcbb:a92b:61e1:719] (helo=localhost) by bombadil.infradead.org with esmtpsa (Exim 4.92.3 #3 (Red Hat Linux)) id 1irOo7-0000BN-MA; Tue, 14 Jan 2020 16:12:44 +0000 From: Christoph Hellwig To: linux-xfs@vger.kernel.org, linux-fsdevel@vger.kernel.org, Waiman Long , Peter Zijlstra , Thomas Gleixner , Ingo Molnar , Will Deacon , Andrew Morton , linux-ext4@vger.kernel.org, cluster-devel@redhat.com Cc: linux-kernel@vger.kernel.org, linux-mm@kvack.org Subject: [PATCH 06/12] iomap: pass a flags value to iomap_dio_rw Date: Tue, 14 Jan 2020 17:12:19 +0100 Message-Id: <20200114161225.309792-7-hch@lst.de> X-Mailer: git-send-email 2.24.1 In-Reply-To: <20200114161225.309792-1-hch@lst.de> References: <20200114161225.309792-1-hch@lst.de> MIME-Version: 1.0 X-SRS-Rewrite: SMTP reverse-path rewritten from by bombadil.infradead.org. See http://www.infradead.org/rpr.html Sender: linux-xfs-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-xfs@vger.kernel.org Replace the wait_for_completion flag in struct iomap_dio with a new IOMAP_DIO_SYNCHRONOUS flag for dio->flags, and allow passing the initial flags to iomap_dio_rw. Also take the check for synchronous iocbs into iomap_dio_rw instead of duplicating it in all the callers. 
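As a rough illustration of the calling convention described above, here is a small userspace model (not kernel code). IOMAP_DIO_SYNCHRONOUS and its value come from this patch; struct model_kiocb, its is_sync field and dio_should_wait() are hypothetical names standing in for struct kiocb, is_sync_kiocb() and the check that moves into iomap_dio_rw.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Flag value as defined by this patch in include/linux/iomap.h. */
#define IOMAP_DIO_SYNCHRONOUS	(1 << 2)	/* no async completion */

/* Hypothetical stand-in for struct kiocb; only models is_sync_kiocb(). */
struct model_kiocb {
	bool is_sync;
};

/*
 * Hypothetical helper mirroring the check that the patch moves into
 * iomap_dio_rw(): wait for completion if the iocb is synchronous or
 * the caller explicitly asked for synchronous execution.
 */
static bool dio_should_wait(const struct model_kiocb *iocb,
			    unsigned int dio_flags)
{
	assert(iocb != NULL);
	return iocb->is_sync || (dio_flags & IOMAP_DIO_SYNCHRONOUS);
}
```

With this shape, a caller that used to pass is_sync_kiocb(iocb) || unaligned_io only needs to set IOMAP_DIO_SYNCHRONOUS for the unaligned case and can pass 0 otherwise.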
Signed-off-by: Christoph Hellwig --- fs/ext4/file.c | 8 +++++--- fs/gfs2/file.c | 6 ++---- fs/iomap/direct-io.c | 7 ++++--- fs/xfs/xfs_file.c | 21 +++++++++------------ include/linux/iomap.h | 5 +++-- 5 files changed, 23 insertions(+), 24 deletions(-) diff --git a/fs/ext4/file.c b/fs/ext4/file.c index 6a7293a5cda2..08b603d0c638 100644 --- a/fs/ext4/file.c +++ b/fs/ext4/file.c @@ -74,8 +74,7 @@ static ssize_t ext4_dio_read_iter(struct kiocb *iocb, struct iov_iter *to) return generic_file_read_iter(iocb, to); } - ret = iomap_dio_rw(iocb, to, &ext4_iomap_ops, NULL, - is_sync_kiocb(iocb)); + ret = iomap_dio_rw(iocb, to, &ext4_iomap_ops, NULL, 0); inode_unlock_shared(inode); file_accessed(iocb->ki_filp); @@ -371,6 +370,7 @@ static ssize_t ext4_dio_write_iter(struct kiocb *iocb, struct iov_iter *from) handle_t *handle; struct inode *inode = file_inode(iocb->ki_filp); bool extend = false, overwrite = false, unaligned_aio = false; + unsigned int dio_flags = 0; if (iocb->ki_flags & IOCB_NOWAIT) { if (!inode_trylock(inode)) @@ -404,6 +404,7 @@ static ssize_t ext4_dio_write_iter(struct kiocb *iocb, struct iov_iter *from) if (ext4_test_inode_flag(inode, EXT4_INODE_EXTENTS) && !is_sync_kiocb(iocb) && ext4_unaligned_aio(inode, from, offset)) { unaligned_aio = true; + dio_flags |= IOMAP_DIO_SYNCHRONOUS; inode_dio_wait(inode); } @@ -432,11 +433,12 @@ static ssize_t ext4_dio_write_iter(struct kiocb *iocb, struct iov_iter *from) } extend = true; + dio_flags |= IOMAP_DIO_SYNCHRONOUS; ext4_journal_stop(handle); } ret = iomap_dio_rw(iocb, from, &ext4_iomap_ops, &ext4_dio_write_ops, - is_sync_kiocb(iocb) || unaligned_aio || extend); + dio_flags); if (extend) ret = ext4_handle_inode_extension(inode, offset, ret, count); diff --git a/fs/gfs2/file.c b/fs/gfs2/file.c index 86c0e61407b6..2260cb5d31af 100644 --- a/fs/gfs2/file.c +++ b/fs/gfs2/file.c @@ -771,8 +771,7 @@ static ssize_t gfs2_file_direct_read(struct kiocb *iocb, struct iov_iter *to) if (ret) goto out_uninit; - ret = 
iomap_dio_rw(iocb, to, &gfs2_iomap_ops, NULL, - is_sync_kiocb(iocb)); + ret = iomap_dio_rw(iocb, to, &gfs2_iomap_ops, NULL, 0); gfs2_glock_dq(&gh); out_uninit: @@ -807,8 +806,7 @@ static ssize_t gfs2_file_direct_write(struct kiocb *iocb, struct iov_iter *from) if (offset + len > i_size_read(&ip->i_inode)) goto out; - ret = iomap_dio_rw(iocb, from, &gfs2_iomap_ops, NULL, - is_sync_kiocb(iocb)); + ret = iomap_dio_rw(iocb, from, &gfs2_iomap_ops, NULL, 0); out: gfs2_glock_dq(&gh); diff --git a/fs/iomap/direct-io.c b/fs/iomap/direct-io.c index 23837926c0c5..e706329d71a0 100644 --- a/fs/iomap/direct-io.c +++ b/fs/iomap/direct-io.c @@ -400,7 +400,7 @@ iomap_dio_actor(struct inode *inode, loff_t pos, loff_t length, ssize_t iomap_dio_rw(struct kiocb *iocb, struct iov_iter *iter, const struct iomap_ops *ops, const struct iomap_dio_ops *dops, - bool wait_for_completion) + unsigned int dio_flags) { struct address_space *mapping = iocb->ki_filp->f_mapping; struct inode *inode = file_inode(iocb->ki_filp); @@ -410,14 +410,15 @@ iomap_dio_rw(struct kiocb *iocb, struct iov_iter *iter, unsigned int flags = IOMAP_DIRECT; struct blk_plug plug; struct iomap_dio *dio; + bool wait_for_completion = false; lockdep_assert_held(&inode->i_rwsem); if (!count) return 0; - if (WARN_ON(is_sync_kiocb(iocb) && !wait_for_completion)) - return -EIO; + if (is_sync_kiocb(iocb) || (dio_flags & IOMAP_DIO_SYNCHRONOUS)) + wait_for_completion = true; dio = kmalloc(sizeof(*dio), GFP_KERNEL); if (!dio) diff --git a/fs/xfs/xfs_file.c b/fs/xfs/xfs_file.c index b8a4a3f29b36..0cc843a4a163 100644 --- a/fs/xfs/xfs_file.c +++ b/fs/xfs/xfs_file.c @@ -193,8 +193,7 @@ xfs_file_dio_aio_read( } else { xfs_ilock(ip, XFS_IOLOCK_SHARED); } - ret = iomap_dio_rw(iocb, to, &xfs_read_iomap_ops, NULL, - is_sync_kiocb(iocb)); + ret = iomap_dio_rw(iocb, to, &xfs_read_iomap_ops, NULL, 0); xfs_iunlock(ip, XFS_IOLOCK_SHARED); return ret; @@ -493,6 +492,7 @@ xfs_file_dio_aio_write( int iolock; size_t count = iov_iter_count(from); 
struct xfs_buftarg *target = xfs_inode_buftarg(ip); + unsigned int dio_flags = 0; /* DIO must be aligned to device logical sector size */ if ((iocb->ki_pos | count) & target->bt_logical_sectormask) @@ -538,27 +538,24 @@ xfs_file_dio_aio_write( count = iov_iter_count(from); /* - * If we are doing unaligned IO, we can't allow any other overlapping IO - * in-flight at the same time or we risk data corruption. Wait for all - * other IO to drain before we submit. If the IO is aligned, demote the - * iolock if we had to take the exclusive lock in + * If we are doing unaligned I/O, we can't allow any other overlapping + * I/O in-flight at the same time or we risk data corruption. Wait for + * all other I/O to drain before we submit and execute the I/O + * synchronously to prevent subsequent overlapping I/O. If the I/O is + * aligned, demote the iolock if we had to take the exclusive lock in * xfs_file_aio_write_checks() for other reasons. */ if (unaligned_io) { inode_dio_wait(inode); + dio_flags = IOMAP_DIO_SYNCHRONOUS; } else if (iolock == XFS_IOLOCK_EXCL) { xfs_ilock_demote(ip, XFS_IOLOCK_EXCL); iolock = XFS_IOLOCK_SHARED; } trace_xfs_file_direct_write(ip, count, iocb->ki_pos); - /* - * If unaligned, this is the only IO in-flight. Wait on it before we - * release the iolock to prevent subsequent overlapping IO. 
- */ ret = iomap_dio_rw(iocb, from, &xfs_direct_write_iomap_ops, - &xfs_dio_write_ops, - is_sync_kiocb(iocb) || unaligned_io); + &xfs_dio_write_ops, dio_flags); out: xfs_iunlock(ip, iolock); diff --git a/include/linux/iomap.h b/include/linux/iomap.h index 8b09463dae0d..3faeb8fd0961 100644 --- a/include/linux/iomap.h +++ b/include/linux/iomap.h @@ -244,10 +244,11 @@ int iomap_writepages(struct address_space *mapping, const struct iomap_writeback_ops *ops); /* - * Flags for direct I/O ->end_io: + * Flags for iomap_dio_complete and ->end_io: */ #define IOMAP_DIO_UNWRITTEN (1 << 0) /* covers unwritten extent(s) */ #define IOMAP_DIO_COW (1 << 1) /* covers COW extent(s) */ +#define IOMAP_DIO_SYNCHRONOUS (1 << 2) /* no async completion */ struct iomap_dio_ops { int (*end_io)(struct kiocb *iocb, ssize_t size, int error, @@ -256,7 +257,7 @@ struct iomap_dio_ops { ssize_t iomap_dio_rw(struct kiocb *iocb, struct iov_iter *iter, const struct iomap_ops *ops, const struct iomap_dio_ops *dops, - bool wait_for_completion); + unsigned int dio_flags); int iomap_dio_iopoll(struct kiocb *kiocb, bool spin); #ifdef CONFIG_SWAP From patchwork Tue Jan 14 16:12:20 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Christoph Hellwig X-Patchwork-Id: 11332641 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 5E50A184C for ; Tue, 14 Jan 2020 16:13:24 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.kernel.org (Postfix) with ESMTP id 3D34F24676 for ; Tue, 14 Jan 2020 16:13:24 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=fail reason="signature verification failed" (2048-bit key) header.d=infradead.org header.i=@infradead.org header.b="sCB1kcD9" Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1729410AbgANQNT (ORCPT ); Tue, 14 Jan 2020 
11:13:19 -0500 Received: from bombadil.infradead.org ([198.137.202.133]:43562 "EHLO bombadil.infradead.org" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1729296AbgANQMr (ORCPT ); Tue, 14 Jan 2020 11:12:47 -0500 DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=infradead.org; s=bombadil.20170209; h=Content-Transfer-Encoding: MIME-Version:References:In-Reply-To:Message-Id:Date:Subject:Cc:To:From:Sender :Reply-To:Content-Type:Content-ID:Content-Description:Resent-Date:Resent-From :Resent-Sender:Resent-To:Resent-Cc:Resent-Message-ID:List-Id:List-Help: List-Unsubscribe:List-Subscribe:List-Post:List-Owner:List-Archive; bh=kuBKhMklw+YxwJKcgxxO2S9Flu8Wk6bhNkWsODoSOE0=; b=sCB1kcD90uIUWLGF48wprnsXiX 3TB+jsOI4Y2LkRdrKtxH7i+aXulnTirR1Cm/o8Nz6e4INd8u4uc0mJ3rPB5SzAwsKj5VfFRQAJBVr 3fUFDOuZm+DBrQQWkhuoNqedHFyFeIuI8r927b3PwWgcpgLD/Am4ErzsYUPEBFxz7sZuTCsVinp4t b/PWVC2rKPI1UPEI1qNqzum+uHVMcQA2gZRteVU9VeEFHkzOfP20VFKZOc1tfVxyDUniazlYLshgf FZcWjN50VlbXvGJKA5TxSQ7uDk2TOGwTCR8IhGlGSp9FaxaAyGjBKQ3+KxCIelMg+ZEfos510Jftn QT84XA8Q==; Received: from [2001:4bb8:18c:4f54:fcbb:a92b:61e1:719] (helo=localhost) by bombadil.infradead.org with esmtpsa (Exim 4.92.3 #3 (Red Hat Linux)) id 1irOoA-0000CC-CW; Tue, 14 Jan 2020 16:12:46 +0000 From: Christoph Hellwig To: linux-xfs@vger.kernel.org, linux-fsdevel@vger.kernel.org, Waiman Long , Peter Zijlstra , Thomas Gleixner , Ingo Molnar , Will Deacon , Andrew Morton , linux-ext4@vger.kernel.org, cluster-devel@redhat.com Cc: linux-kernel@vger.kernel.org, linux-mm@kvack.org Subject: [PATCH 07/12] iomap: allow holding i_rwsem until aio completion Date: Tue, 14 Jan 2020 17:12:20 +0100 Message-Id: <20200114161225.309792-8-hch@lst.de> X-Mailer: git-send-email 2.24.1 In-Reply-To: <20200114161225.309792-1-hch@lst.de> References: <20200114161225.309792-1-hch@lst.de> MIME-Version: 1.0 X-SRS-Rewrite: SMTP reverse-path rewritten from by bombadil.infradead.org. 
See http://www.infradead.org/rpr.html Sender: linux-xfs-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-xfs@vger.kernel.org The direct I/O code currently uses a hand-crafted i_dio_count that needs to be incremented under i_rwsem and is then decremented when I/O completes. That scheme means file system code needs to be very careful to wait for i_dio_count to reach zero under i_rwsem in various places that are very cumbersome to get rid of. It also means we can't get the effect of an exclusive i_rwsem for actually asynchronous I/O, forcing pointless synchronous execution of sub-blocksize writes. Replace the i_dio_count scheme with holding i_rwsem over the duration of the whole I/O. While this introduces a non-owner unlock that isn't nice to RT workloads, the open-coded locking primitive using i_dio_count isn't any better. Signed-off-by: Christoph Hellwig --- fs/iomap/direct-io.c | 44 +++++++++++++++++++++++++++++++++++++------ include/linux/iomap.h | 2 ++ 2 files changed, 40 insertions(+), 6 deletions(-) diff --git a/fs/iomap/direct-io.c b/fs/iomap/direct-io.c index e706329d71a0..0113ac33b0a0 100644 --- a/fs/iomap/direct-io.c +++ b/fs/iomap/direct-io.c @@ -70,7 +70,7 @@ static void iomap_dio_submit_bio(struct iomap_dio *dio, struct iomap *iomap, dio->submit.cookie = submit_bio(bio); } -static ssize_t iomap_dio_complete(struct iomap_dio *dio) +static ssize_t iomap_dio_complete(struct iomap_dio *dio, bool unlock) { const struct iomap_dio_ops *dops = dio->dops; struct kiocb *iocb = dio->iocb; @@ -112,6 +112,13 @@ static ssize_t iomap_dio_complete(struct iomap_dio *dio) dio_warn_stale_pagecache(iocb->ki_filp); } + if (unlock) { + if (dio->flags & IOMAP_DIO_RWSEM_EXCL) + up_write(&inode->i_rwsem); + else if (dio->flags & IOMAP_DIO_RWSEM_SHARED) + up_read(&inode->i_rwsem); + } + /* * If this is a DSYNC write, make sure we push it to stable storage now * that we've written data.
@@ -129,8 +136,22 @@ static void iomap_dio_complete_work(struct work_struct *work) { struct iomap_dio *dio = container_of(work, struct iomap_dio, aio.work); struct kiocb *iocb = dio->iocb; + struct inode *inode = file_inode(iocb->ki_filp); - iocb->ki_complete(iocb, iomap_dio_complete(dio), 0); + /* + * XXX: For reads this code is directly called from bio ->end_io, which + * often is hard or softirq context. In that case lockdep records the + * below as lock acquisitions from irq context and causes warnings. + */ + if (dio->flags & IOMAP_DIO_RWSEM_EXCL) { + rwsem_acquire(&inode->i_rwsem.dep_map, 0, 0, _THIS_IP_); + if (IS_ENABLED(CONFIG_RWSEM_SPIN_ON_OWNER)) + atomic_long_set(&inode->i_rwsem.owner, (long)current); + } else if (dio->flags & IOMAP_DIO_RWSEM_SHARED) { + rwsem_acquire_read(&inode->i_rwsem.dep_map, 0, 0, _THIS_IP_); + } + + iocb->ki_complete(iocb, iomap_dio_complete(dio, true), 0); } /* @@ -430,7 +451,7 @@ iomap_dio_rw(struct kiocb *iocb, struct iov_iter *iter, dio->i_size = i_size_read(inode); dio->dops = dops; dio->error = 0; - dio->flags = 0; + dio->flags = dio_flags; dio->submit.iter = iter; dio->submit.waiter = current; @@ -551,8 +572,7 @@ iomap_dio_rw(struct kiocb *iocb, struct iov_iter *iter, dio->wait_for_completion = wait_for_completion; if (!atomic_dec_and_test(&dio->ref)) { if (!wait_for_completion) - return -EIOCBQUEUED; - + goto async_completion; for (;;) { set_current_state(TASK_UNINTERRUPTIBLE); if (!READ_ONCE(dio->submit.waiter)) @@ -567,10 +587,22 @@ iomap_dio_rw(struct kiocb *iocb, struct iov_iter *iter, __set_current_state(TASK_RUNNING); } - return iomap_dio_complete(dio); + return iomap_dio_complete(dio, false); out_free_dio: kfree(dio); return ret; + +async_completion: + /* + * We are returning to userspace now, but i_rwsem is still held until + * the I/O completion comes back. 
+ */ + if (dio_flags & (IOMAP_DIO_RWSEM_EXCL | IOMAP_DIO_RWSEM_SHARED)) + rwsem_release(&inode->i_rwsem.dep_map, _THIS_IP_); + if ((dio_flags & IOMAP_DIO_RWSEM_EXCL) && + IS_ENABLED(CONFIG_RWSEM_SPIN_ON_OWNER)) + atomic_long_set(&inode->i_rwsem.owner, RWSEM_OWNER_UNKNOWN); + return -EIOCBQUEUED; } EXPORT_SYMBOL_GPL(iomap_dio_rw); diff --git a/include/linux/iomap.h b/include/linux/iomap.h index 3faeb8fd0961..f259bb979d7f 100644 --- a/include/linux/iomap.h +++ b/include/linux/iomap.h @@ -249,6 +249,8 @@ int iomap_writepages(struct address_space *mapping, #define IOMAP_DIO_UNWRITTEN (1 << 0) /* covers unwritten extent(s) */ #define IOMAP_DIO_COW (1 << 1) /* covers COW extent(s) */ #define IOMAP_DIO_SYNCHRONOUS (1 << 2) /* no async completion */ +#define IOMAP_DIO_RWSEM_EXCL (1 << 3) /* holds exclusive i_rwsem */ +#define IOMAP_DIO_RWSEM_SHARED (1 << 4) /* holds shared i_rwsem */ struct iomap_dio_ops { int (*end_io)(struct kiocb *iocb, ssize_t size, int error, From patchwork Tue Jan 14 16:12:21 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Christoph Hellwig X-Patchwork-Id: 11332631 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id E298813BD for ; Tue, 14 Jan 2020 16:13:18 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.kernel.org (Postfix) with ESMTP id B7AF82465A for ; Tue, 14 Jan 2020 16:13:18 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=fail reason="signature verification failed" (2048-bit key) header.d=infradead.org header.i=@infradead.org header.b="UEzcY+P6" Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1728940AbgANQNN (ORCPT ); Tue, 14 Jan 2020 11:13:13 -0500 Received: from bombadil.infradead.org ([198.137.202.133]:43598 "EHLO bombadil.infradead.org" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with
ESMTP id S1729334AbgANQMu (ORCPT ); Tue, 14 Jan 2020 11:12:50 -0500 DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=infradead.org; s=bombadil.20170209; h=Content-Transfer-Encoding: MIME-Version:References:In-Reply-To:Message-Id:Date:Subject:Cc:To:From:Sender :Reply-To:Content-Type:Content-ID:Content-Description:Resent-Date:Resent-From :Resent-Sender:Resent-To:Resent-Cc:Resent-Message-ID:List-Id:List-Help: List-Unsubscribe:List-Subscribe:List-Post:List-Owner:List-Archive; bh=KsSHHUOODJAHu0IH1q2ibItg0o+FnF2PjXUOf0SG9sk=; b=UEzcY+P6ZvJxzr8ibbfl7rLl4Z 173d2Re54ZTDUtZuxdT3xB1jvIvX/ELdv1aF02PAqHtlKcZHFtbKxF/AoiK2YmisrsIC7XEHHgzRB tF9ZwhfAEH0phGwBRkfQjaxddO0iPZZyh55XP9r+dVluaUZGpM2OOoJN68oVnheCZxD9/R2O+iSJq JULueo3prR4THi9bWDNyJYcLPoDCbiI21velupVtAIonY/MelN+LDgMpyONrFeUd/7weRrLxsSYo8 0Np5PDa2YtOY71dex2+4X7fGg4vkORBUCVG4MAF6EhUV3BQ+FwVbaYsBWGEEtLexQG0Lh+Qzpq31f batE4Azg==; Received: from [2001:4bb8:18c:4f54:fcbb:a92b:61e1:719] (helo=localhost) by bombadil.infradead.org with esmtpsa (Exim 4.92.3 #3 (Red Hat Linux)) id 1irOoD-0000DG-38; Tue, 14 Jan 2020 16:12:49 +0000 From: Christoph Hellwig To: linux-xfs@vger.kernel.org, linux-fsdevel@vger.kernel.org, Waiman Long , Peter Zijlstra , Thomas Gleixner , Ingo Molnar , Will Deacon , Andrew Morton , linux-ext4@vger.kernel.org, cluster-devel@redhat.com Cc: linux-kernel@vger.kernel.org, linux-mm@kvack.org Subject: [PATCH 08/12] ext4: hold i_rwsem until AIO completes Date: Tue, 14 Jan 2020 17:12:21 +0100 Message-Id: <20200114161225.309792-9-hch@lst.de> X-Mailer: git-send-email 2.24.1 In-Reply-To: <20200114161225.309792-1-hch@lst.de> References: <20200114161225.309792-1-hch@lst.de> MIME-Version: 1.0 X-SRS-Rewrite: SMTP reverse-path rewritten from by bombadil.infradead.org. 
See http://www.infradead.org/rpr.html Sender: linux-xfs-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-xfs@vger.kernel.org Switch ext4 from the magic i_dio_count scheme to just hold i_rwsem until the actual I/O has completed to reduce the locking complexity and avoid nasty bugs due to missing inode_dio_wait calls. Signed-off-by: Christoph Hellwig --- fs/ext4/extents.c | 12 ------------ fs/ext4/file.c | 21 +++++++++++++-------- fs/ext4/inode.c | 11 ----------- fs/ext4/ioctl.c | 5 ----- fs/ext4/move_extent.c | 4 ---- 5 files changed, 13 insertions(+), 40 deletions(-) diff --git a/fs/ext4/extents.c b/fs/ext4/extents.c index 0e8708b77da6..b6aa2d249b30 100644 --- a/fs/ext4/extents.c +++ b/fs/ext4/extents.c @@ -4777,9 +4777,6 @@ static long ext4_zero_range(struct file *file, loff_t offset, if (mode & FALLOC_FL_KEEP_SIZE) flags |= EXT4_GET_BLOCKS_KEEP_SIZE; - /* Wait all existing dio workers, newcomers will block on i_mutex */ - inode_dio_wait(inode); - /* Preallocate the range including the unaligned edges */ if (partial_begin || partial_end) { ret = ext4_alloc_file_blocks(file, @@ -4949,9 +4946,6 @@ long ext4_fallocate(struct file *file, int mode, loff_t offset, loff_t len) goto out; } - /* Wait all existing dio workers, newcomers will block on i_mutex */ - inode_dio_wait(inode); - ret = ext4_alloc_file_blocks(file, lblk, max_blocks, new_size, flags); if (ret) goto out; @@ -5525,9 +5519,6 @@ int ext4_collapse_range(struct inode *inode, loff_t offset, loff_t len) goto out_mutex; } - /* Wait for existing dio to complete */ - inode_dio_wait(inode); - /* * Prevent page faults from reinstantiating pages we have released from * page cache. @@ -5678,9 +5669,6 @@ int ext4_insert_range(struct inode *inode, loff_t offset, loff_t len) goto out_mutex; } - /* Wait for existing dio to complete */ - inode_dio_wait(inode); - /* * Prevent page faults from reinstantiating pages we have released from * page cache. 
diff --git a/fs/ext4/file.c b/fs/ext4/file.c index 08b603d0c638..b3410a3ede27 100644 --- a/fs/ext4/file.c +++ b/fs/ext4/file.c @@ -74,9 +74,10 @@ static ssize_t ext4_dio_read_iter(struct kiocb *iocb, struct iov_iter *to) return generic_file_read_iter(iocb, to); } - ret = iomap_dio_rw(iocb, to, &ext4_iomap_ops, NULL, 0); - inode_unlock_shared(inode); - + ret = iomap_dio_rw(iocb, to, &ext4_iomap_ops, NULL, + IOMAP_DIO_RWSEM_SHARED); + if (ret != -EIOCBQUEUED) + inode_unlock_shared(inode); file_accessed(iocb->ki_filp); return ret; } @@ -405,7 +406,6 @@ static ssize_t ext4_dio_write_iter(struct kiocb *iocb, struct iov_iter *from) !is_sync_kiocb(iocb) && ext4_unaligned_aio(inode, from, offset)) { unaligned_aio = true; dio_flags |= IOMAP_DIO_SYNCHRONOUS; - inode_dio_wait(inode); } /* @@ -416,7 +416,10 @@ static ssize_t ext4_dio_write_iter(struct kiocb *iocb, struct iov_iter *from) if (!unaligned_aio && ext4_overwrite_io(inode, offset, count) && ext4_should_dioread_nolock(inode)) { overwrite = true; + dio_flags |= IOMAP_DIO_RWSEM_SHARED; downgrade_write(&inode->i_rwsem); + } else { + dio_flags |= IOMAP_DIO_RWSEM_EXCL; } if (offset + count > EXT4_I(inode)->i_disksize) { @@ -444,10 +447,12 @@ static ssize_t ext4_dio_write_iter(struct kiocb *iocb, struct iov_iter *from) ret = ext4_handle_inode_extension(inode, offset, ret, count); out: - if (overwrite) - inode_unlock_shared(inode); - else - inode_unlock(inode); + if (ret != -EIOCBQUEUED) { + if (overwrite) + inode_unlock_shared(inode); + else + inode_unlock(inode); + } if (ret >= 0 && iov_iter_count(from)) { ssize_t err; diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c index 629a25d999f0..e2dac0727ab0 100644 --- a/fs/ext4/inode.c +++ b/fs/ext4/inode.c @@ -3965,9 +3965,6 @@ int ext4_punch_hole(struct inode *inode, loff_t offset, loff_t length) } - /* Wait all existing dio workers, newcomers will block on i_mutex */ - inode_dio_wait(inode); - /* * Prevent page faults from reinstantiating pages we have released from * page cache. 
@@ -5263,11 +5260,6 @@ int ext4_setattr(struct dentry *dentry, struct iattr *attr) if (error) goto err_out; } - /* - * Blocks are going to be removed from the inode. Wait - * for dio in flight. - */ - inode_dio_wait(inode); } down_write(&EXT4_I(inode)->i_mmap_sem); @@ -5798,9 +5790,6 @@ int ext4_change_inode_journal_flag(struct inode *inode, int val) if (is_journal_aborted(journal)) return -EROFS; - /* Wait for all existing dio workers */ - inode_dio_wait(inode); - /* * Before flushing the journal and switching inode's aops, we have * to flush all dirty data the inode has. There can be outstanding diff --git a/fs/ext4/ioctl.c b/fs/ext4/ioctl.c index e8870fff8224..99d21d81074f 100644 --- a/fs/ext4/ioctl.c +++ b/fs/ext4/ioctl.c @@ -153,10 +153,6 @@ static long swap_inode_boot_loader(struct super_block *sb, if (err) goto err_out; - /* Wait for all existing dio workers */ - inode_dio_wait(inode); - inode_dio_wait(inode_bl); - truncate_inode_pages(&inode->i_data, 0); truncate_inode_pages(&inode_bl->i_data, 0); @@ -364,7 +360,6 @@ static int ext4_ioctl_setflags(struct inode *inode, */ if (S_ISREG(inode->i_mode) && !IS_IMMUTABLE(inode) && (flags & EXT4_IMMUTABLE_FL)) { - inode_dio_wait(inode); err = filemap_write_and_wait(inode->i_mapping); if (err) goto flags_out; diff --git a/fs/ext4/move_extent.c b/fs/ext4/move_extent.c index 30ce3dc69378..20240808569f 100644 --- a/fs/ext4/move_extent.c +++ b/fs/ext4/move_extent.c @@ -602,10 +602,6 @@ ext4_move_extents(struct file *o_filp, struct file *d_filp, __u64 orig_blk, /* Protect orig and donor inodes against a truncate */ lock_two_nondirectories(orig_inode, donor_inode); - /* Wait for all existing dio workers */ - inode_dio_wait(orig_inode); - inode_dio_wait(donor_inode); - /* Protect extent tree against block allocations via delalloc */ ext4_double_down_write_data_sem(orig_inode, donor_inode); /* Check the filesystem environment whether move_extent can be done */ From patchwork Tue Jan 14 16:12:22 2020 Content-Type: 
text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Christoph Hellwig X-Patchwork-Id: 11332623 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id E1EC71398 for ; Tue, 14 Jan 2020 16:13:11 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.kernel.org (Postfix) with ESMTP id C099024676 for ; Tue, 14 Jan 2020 16:13:11 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=fail reason="signature verification failed" (2048-bit key) header.d=infradead.org header.i=@infradead.org header.b="iWnmoMlR" Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1729009AbgANQMx (ORCPT ); Tue, 14 Jan 2020 11:12:53 -0500 Received: from bombadil.infradead.org ([198.137.202.133]:43636 "EHLO bombadil.infradead.org" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1728688AbgANQMw (ORCPT ); Tue, 14 Jan 2020 11:12:52 -0500 DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=infradead.org; s=bombadil.20170209; h=Content-Transfer-Encoding: MIME-Version:References:In-Reply-To:Message-Id:Date:Subject:Cc:To:From:Sender :Reply-To:Content-Type:Content-ID:Content-Description:Resent-Date:Resent-From :Resent-Sender:Resent-To:Resent-Cc:Resent-Message-ID:List-Id:List-Help: List-Unsubscribe:List-Subscribe:List-Post:List-Owner:List-Archive; bh=x+ttUEN/NS6BrlO/9nKVYw2uuqodMabBguY4qU1kFWc=; b=iWnmoMlRQkItEjXzvB/HH5yDWZ PS0zTIEcudypIh4nwuowXENkKhWW01nf5gAcY00+vGC0HIrBHoSkdhNhqbmTlbaubBeUoQ/4FuLW+ MvXfiI4rJZxhKmKyciLADxGonrBgfq62VBcA4ZdoFDvCbjDtN0WTJTOtSKfUb3+ki9+rNXkKRlu7b s0kyyX8KYDNz94WWMCMSYaFE+gzN1dwLgd/CtX/HmpCd151jSZJF0RrWdIoemOyd/wSHAfBETH/bk JEKJ0p7/d68aL/JzgCs3ClBs1coxtIf0tk2hlPQC1Z515sGlE5pyMYwgo1OIXxLkBamqVUnkqiKie gLwFZ78w==; Received: from [2001:4bb8:18c:4f54:fcbb:a92b:61e1:719] (helo=localhost) by bombadil.infradead.org with esmtpsa (Exim 4.92.3 #3 
(Red Hat Linux)) id 1irOoF-0000E6-P5; Tue, 14 Jan 2020 16:12:52 +0000 From: Christoph Hellwig To: linux-xfs@vger.kernel.org, linux-fsdevel@vger.kernel.org, Waiman Long , Peter Zijlstra , Thomas Gleixner , Ingo Molnar , Will Deacon , Andrew Morton , linux-ext4@vger.kernel.org, cluster-devel@redhat.com Cc: linux-kernel@vger.kernel.org, linux-mm@kvack.org Subject: [PATCH 09/12] gfs2: hold i_rwsem until AIO completes Date: Tue, 14 Jan 2020 17:12:22 +0100 Message-Id: <20200114161225.309792-10-hch@lst.de> X-Mailer: git-send-email 2.24.1 In-Reply-To: <20200114161225.309792-1-hch@lst.de> References: <20200114161225.309792-1-hch@lst.de> MIME-Version: 1.0 X-SRS-Rewrite: SMTP reverse-path rewritten from by bombadil.infradead.org. See http://www.infradead.org/rpr.html Sender: linux-xfs-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-xfs@vger.kernel.org Switch gfs2 from the magic i_dio_count scheme to just hold i_rwsem until the actual I/O has completed to reduce the locking complexity and avoid nasty bugs due to missing inode_dio_wait calls. Note that gfs2 only uses i_rwsem for direct I/O writes, not for reads, so the read behavior does not change. It might also make sense to use the same scheme for the gfs2 internal cluster lock.
Signed-off-by: Christoph Hellwig --- fs/gfs2/bmap.c | 2 -- fs/gfs2/file.c | 6 ++++-- fs/gfs2/glops.c | 10 ++-------- 3 files changed, 6 insertions(+), 12 deletions(-) diff --git a/fs/gfs2/bmap.c b/fs/gfs2/bmap.c index 08f6fbb3655e..226f4eb680c7 100644 --- a/fs/gfs2/bmap.c +++ b/fs/gfs2/bmap.c @@ -2181,8 +2181,6 @@ int gfs2_setattr_size(struct inode *inode, u64 newsize) if (ret) return ret; - inode_dio_wait(inode); - ret = gfs2_rsqa_alloc(ip); if (ret) goto out; diff --git a/fs/gfs2/file.c b/fs/gfs2/file.c index 2260cb5d31af..82a2f313a3e6 100644 --- a/fs/gfs2/file.c +++ b/fs/gfs2/file.c @@ -806,7 +806,8 @@ static ssize_t gfs2_file_direct_write(struct kiocb *iocb, struct iov_iter *from) if (offset + len > i_size_read(&ip->i_inode)) goto out; - ret = iomap_dio_rw(iocb, from, &gfs2_iomap_ops, NULL, 0); + ret = iomap_dio_rw(iocb, from, &gfs2_iomap_ops, NULL, + IOMAP_DIO_RWSEM_EXCL); out: gfs2_glock_dq(&gh); @@ -923,7 +924,8 @@ static ssize_t gfs2_file_write_iter(struct kiocb *iocb, struct iov_iter *from) return generic_write_sync(iocb, ret); out_unlock: - inode_unlock(inode); + if (ret != -EIOCBQUEUED) + inode_unlock(inode); return ret; } diff --git a/fs/gfs2/glops.c b/fs/gfs2/glops.c index 4ede1f18de85..a705eeb75117 100644 --- a/fs/gfs2/glops.c +++ b/fs/gfs2/glops.c @@ -243,11 +243,8 @@ static void inode_go_sync(struct gfs2_glock *gl) struct address_space *metamapping = gfs2_glock2aspace(gl); int error; - if (isreg) { - if (test_and_clear_bit(GIF_SW_PAGED, &ip->i_flags)) - unmap_shared_mapping_range(ip->i_inode.i_mapping, 0, 0); - inode_dio_wait(&ip->i_inode); - } + if (isreg && test_and_clear_bit(GIF_SW_PAGED, &ip->i_flags)) + unmap_shared_mapping_range(ip->i_inode.i_mapping, 0, 0); if (!test_and_clear_bit(GLF_DIRTY, &gl->gl_flags)) goto out; @@ -440,9 +437,6 @@ static int inode_go_lock(struct gfs2_holder *gh) return error; } - if (gh->gh_state != LM_ST_DEFERRED) - inode_dio_wait(&ip->i_inode); - if ((ip->i_diskflags & GFS2_DIF_TRUNC_IN_PROG) && (gl->gl_state == 
LM_ST_EXCLUSIVE) && (gh->gh_state == LM_ST_EXCLUSIVE)) { From patchwork Tue Jan 14 16:12:23 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Christoph Hellwig X-Patchwork-Id: 11332617 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 7A01D6C1 for ; Tue, 14 Jan 2020 16:13:10 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.kernel.org (Postfix) with ESMTP id 4E53E24658 for ; Tue, 14 Jan 2020 16:13:10 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=fail reason="signature verification failed" (2048-bit key) header.d=infradead.org header.i=@infradead.org header.b="ZKbbV85l" Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1729402AbgANQMz (ORCPT ); Tue, 14 Jan 2020 11:12:55 -0500 Received: from bombadil.infradead.org ([198.137.202.133]:43670 "EHLO bombadil.infradead.org" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1728688AbgANQMz (ORCPT ); Tue, 14 Jan 2020 11:12:55 -0500 DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=infradead.org; s=bombadil.20170209; h=Content-Transfer-Encoding: MIME-Version:References:In-Reply-To:Message-Id:Date:Subject:Cc:To:From:Sender :Reply-To:Content-Type:Content-ID:Content-Description:Resent-Date:Resent-From :Resent-Sender:Resent-To:Resent-Cc:Resent-Message-ID:List-Id:List-Help: List-Unsubscribe:List-Subscribe:List-Post:List-Owner:List-Archive; bh=4Hexr2/FIsZz1f1S0iN0yG1Q7M/61ZediDTapXBtYP8=; b=ZKbbV85l+TR0/8EQoI6oK7HMR+ L/ai9aEQ+wkkVnz+mMSjdF1R+fg/w7+4CTKvQpQgDarOYB1MK63tY0vW+MlRFVAuoLWnQ8aQ7jWJD 4JesyRb3zYBTonvFcxVoMXA8CzIQa62hP9zFTQBaKJQt3gwCJYM6vjXKK2bD77kIspkFYRowTsJ0m pjT/twezcuXLpVRtlDpq5s7aph1jeYdoQwPNFbOHup/a/DOyV/htDWG5QbTxrNeSDYedphQ+97Lo/ nPBO8jOkNXxdTSzw+yW7al2k2hrlZliDhwcFfAqp11Z97xlb2mWLOGw7cqIwMP7oxuVX86Xe8+4R7 uAsY6AnQ==; Received: from 
[2001:4bb8:18c:4f54:fcbb:a92b:61e1:719] (helo=localhost) by bombadil.infradead.org with esmtpsa (Exim 4.92.3 #3 (Red Hat Linux)) id 1irOoI-0000Er-Ej; Tue, 14 Jan 2020 16:12:55 +0000 From: Christoph Hellwig To: linux-xfs@vger.kernel.org, linux-fsdevel@vger.kernel.org, Waiman Long , Peter Zijlstra , Thomas Gleixner , Ingo Molnar , Will Deacon , Andrew Morton , linux-ext4@vger.kernel.org, cluster-devel@redhat.com Cc: linux-kernel@vger.kernel.org, linux-mm@kvack.org Subject: [PATCH 10/12] xfs: hold i_rwsem until AIO completes Date: Tue, 14 Jan 2020 17:12:23 +0100 Message-Id: <20200114161225.309792-11-hch@lst.de> X-Mailer: git-send-email 2.24.1 In-Reply-To: <20200114161225.309792-1-hch@lst.de> References: <20200114161225.309792-1-hch@lst.de> MIME-Version: 1.0 X-SRS-Rewrite: SMTP reverse-path rewritten from by bombadil.infradead.org. See http://www.infradead.org/rpr.html Sender: linux-xfs-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-xfs@vger.kernel.org Switch xfs from the magic i_dio_count scheme to just hold i_rwsem until the actual I/O has completed to reduce the locking complexity and avoid nasty bugs due to missing inode_dio_wait calls.
Signed-off-by: Christoph Hellwig --- fs/xfs/scrub/bmap.c | 1 - fs/xfs/xfs_bmap_util.c | 3 --- fs/xfs/xfs_file.c | 47 +++++++++++++----------------------------- fs/xfs/xfs_icache.c | 3 +-- fs/xfs/xfs_ioctl.c | 1 - fs/xfs/xfs_iops.c | 5 ----- fs/xfs/xfs_reflink.c | 2 -- 7 files changed, 15 insertions(+), 47 deletions(-) diff --git a/fs/xfs/scrub/bmap.c b/fs/xfs/scrub/bmap.c index fa6ea6407992..d3e4068d3189 100644 --- a/fs/xfs/scrub/bmap.c +++ b/fs/xfs/scrub/bmap.c @@ -45,7 +45,6 @@ xchk_setup_inode_bmap( */ if (S_ISREG(VFS_I(sc->ip)->i_mode) && sc->sm->sm_type == XFS_SCRUB_TYPE_BMBTD) { - inode_dio_wait(VFS_I(sc->ip)); error = filemap_write_and_wait(VFS_I(sc->ip)->i_mapping); if (error) goto out; diff --git a/fs/xfs/xfs_bmap_util.c b/fs/xfs/xfs_bmap_util.c index e62fb5216341..a454f481107e 100644 --- a/fs/xfs/xfs_bmap_util.c +++ b/fs/xfs/xfs_bmap_util.c @@ -674,9 +674,6 @@ xfs_free_eofblocks( if (error) return error; - /* wait on dio to ensure i_size has settled */ - inode_dio_wait(VFS_I(ip)); - error = xfs_trans_alloc(mp, &M_RES(mp)->tr_itruncate, 0, 0, 0, &tp); if (error) { diff --git a/fs/xfs/xfs_file.c b/fs/xfs/xfs_file.c index 0cc843a4a163..d0ee7d2932e4 100644 --- a/fs/xfs/xfs_file.c +++ b/fs/xfs/xfs_file.c @@ -193,9 +193,11 @@ xfs_file_dio_aio_read( } else { xfs_ilock(ip, XFS_IOLOCK_SHARED); } - ret = iomap_dio_rw(iocb, to, &xfs_read_iomap_ops, NULL, 0); - xfs_iunlock(ip, XFS_IOLOCK_SHARED); + ret = iomap_dio_rw(iocb, to, &xfs_read_iomap_ops, NULL, + IOMAP_DIO_RWSEM_SHARED); + if (ret != -EIOCBQUEUED) + xfs_iunlock(ip, XFS_IOLOCK_SHARED); return ret; } @@ -341,15 +343,6 @@ xfs_file_aio_write_checks( xfs_ilock(ip, *iolock); iov_iter_reexpand(from, count); } - /* - * We now have an IO submission barrier in place, but - * AIO can do EOF updates during IO completion and hence - * we now need to wait for all of them to drain. Non-AIO - * DIO will have drained before we are given the - * XFS_IOLOCK_EXCL, and so for most cases this wait is a - * no-op. 
- */ - inode_dio_wait(inode); drained_dio = true; goto restart; } @@ -469,13 +462,7 @@ static const struct iomap_dio_ops xfs_dio_write_ops = { * needs to do sub-block zeroing and that requires serialisation against other * direct IOs to the same block. In this case we need to serialise the * submission of the unaligned IOs so that we don't get racing block zeroing in - * the dio layer. To avoid the problem with aio, we also need to wait for - * outstanding IOs to complete so that unwritten extent conversion is completed - * before we try to map the overlapping block. This is currently implemented by - * hitting it with a big hammer (i.e. inode_dio_wait()). - * - * Returns with locks held indicated by @iolock and errors indicated by - * negative return values. + * the dio layer. */ STATIC ssize_t xfs_file_dio_aio_write( @@ -546,18 +533,21 @@ xfs_file_dio_aio_write( * xfs_file_aio_write_checks() for other reasons. */ if (unaligned_io) { - inode_dio_wait(inode); - dio_flags = IOMAP_DIO_SYNCHRONOUS; - } else if (iolock == XFS_IOLOCK_EXCL) { - xfs_ilock_demote(ip, XFS_IOLOCK_EXCL); - iolock = XFS_IOLOCK_SHARED; + dio_flags = IOMAP_DIO_RWSEM_EXCL | IOMAP_DIO_SYNCHRONOUS; + } else { + if (iolock == XFS_IOLOCK_EXCL) { + xfs_ilock_demote(ip, XFS_IOLOCK_EXCL); + iolock = XFS_IOLOCK_SHARED; + } + dio_flags = IOMAP_DIO_RWSEM_SHARED; } trace_xfs_file_direct_write(ip, count, iocb->ki_pos); ret = iomap_dio_rw(iocb, from, &xfs_direct_write_iomap_ops, &xfs_dio_write_ops, dio_flags); out: - xfs_iunlock(ip, iolock); + if (ret != -EIOCBQUEUED) + xfs_iunlock(ip, iolock); /* * No fallback to buffered IO on errors for XFS, direct IO will either @@ -819,15 +809,6 @@ xfs_file_fallocate( if (error) goto out_unlock; - /* - * Must wait for all AIO to complete before we continue as AIO can - * change the file size on completion without holding any locks we - * currently hold. 
We must do this first because AIO can update both - * the on disk and in memory inode sizes, and the operations that follow - * require the in-memory size to be fully up-to-date. - */ - inode_dio_wait(inode); - /* * Now AIO and DIO has drained we flush and (if necessary) invalidate * the cached range over the first operation we are about to run. diff --git a/fs/xfs/xfs_icache.c b/fs/xfs/xfs_icache.c index 8dc2e5414276..9e6f32fd32f5 100644 --- a/fs/xfs/xfs_icache.c +++ b/fs/xfs/xfs_icache.c @@ -1720,8 +1720,7 @@ xfs_prep_free_cowblocks( */ if ((VFS_I(ip)->i_state & I_DIRTY_PAGES) || mapping_tagged(VFS_I(ip)->i_mapping, PAGECACHE_TAG_DIRTY) || - mapping_tagged(VFS_I(ip)->i_mapping, PAGECACHE_TAG_WRITEBACK) || - atomic_read(&VFS_I(ip)->i_dio_count)) + mapping_tagged(VFS_I(ip)->i_mapping, PAGECACHE_TAG_WRITEBACK)) return false; return true; diff --git a/fs/xfs/xfs_ioctl.c b/fs/xfs/xfs_ioctl.c index 7b35d62ede9f..331453f2c4be 100644 --- a/fs/xfs/xfs_ioctl.c +++ b/fs/xfs/xfs_ioctl.c @@ -548,7 +548,6 @@ xfs_ioc_space( error = xfs_break_layouts(inode, &iolock, BREAK_UNMAP); if (error) goto out_unlock; - inode_dio_wait(inode); switch (bf->l_whence) { case 0: /*SEEK_SET*/ diff --git a/fs/xfs/xfs_iops.c b/fs/xfs/xfs_iops.c index 8afe69ca188b..700edeccc6bf 100644 --- a/fs/xfs/xfs_iops.c +++ b/fs/xfs/xfs_iops.c @@ -893,11 +893,6 @@ xfs_setattr_size( if (error) return error; - /* - * Wait for all direct I/O to complete. - */ - inode_dio_wait(inode); - /* * File data changes must be complete before we start the transaction to * modify the inode. 
This needs to be done before joining the inode to diff --git a/fs/xfs/xfs_reflink.c b/fs/xfs/xfs_reflink.c index de451235c4ee..f775e60ca6f7 100644 --- a/fs/xfs/xfs_reflink.c +++ b/fs/xfs/xfs_reflink.c @@ -1525,8 +1525,6 @@ xfs_reflink_unshare( trace_xfs_reflink_unshare(ip, offset, len); - inode_dio_wait(inode); - error = iomap_file_unshare(inode, offset, len, &xfs_buffered_write_iomap_ops); if (error) From patchwork Tue Jan 14 16:12:24 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Christoph Hellwig X-Patchwork-Id: 11332613 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id EA08113BD for ; Tue, 14 Jan 2020 16:13:07 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.kernel.org (Postfix) with ESMTP id C855724655 for ; Tue, 14 Jan 2020 16:13:07 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=fail reason="signature verification failed" (2048-bit key) header.d=infradead.org header.i=@infradead.org header.b="NRBTU/Aa" Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1729463AbgANQM7 (ORCPT ); Tue, 14 Jan 2020 11:12:59 -0500 Received: from bombadil.infradead.org ([198.137.202.133]:43710 "EHLO bombadil.infradead.org" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1728688AbgANQM6 (ORCPT ); Tue, 14 Jan 2020 11:12:58 -0500 DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=infradead.org; s=bombadil.20170209; h=Content-Transfer-Encoding: MIME-Version:References:In-Reply-To:Message-Id:Date:Subject:Cc:To:From:Sender :Reply-To:Content-Type:Content-ID:Content-Description:Resent-Date:Resent-From :Resent-Sender:Resent-To:Resent-Cc:Resent-Message-ID:List-Id:List-Help: List-Unsubscribe:List-Subscribe:List-Post:List-Owner:List-Archive; bh=my8Ph2mEg6LXHc/rF64e/o08bhUX729CzcwPRovaKLM=; 
b=NRBTU/AaZu+sQFQZ4IGjdaLih2 EC4rECXtbG5PEOMagJiggEc3WbBOSbipx3iig1QOJAwDPAlfVB+txKFHjxxRC0lYrmI2UJ19gBls/ l1oDnXE081oxYp5WpcpzNqes3e7E8reV2lB3w/gXv+Lj+ru4uAknlB/WmjXbDBuSHfI7YU2C3372X mrEwnRdyiNFu6tVXXVjl93yPQQ1HnkKc3OwSENOSPt8u1307+0QFUBgKt/cGvmnt30Cdo3cBf/vHG NqR2eUe8B/zgTTP8n+hDOS5sfgXiHDhJAAlP9qRmGwVopD4rxV97jLr3c/ez20Em0fScdTyhSxAb6 IF26LMGQ==; Received: from [2001:4bb8:18c:4f54:fcbb:a92b:61e1:719] (helo=localhost) by bombadil.infradead.org with esmtpsa (Exim 4.92.3 #3 (Red Hat Linux)) id 1irOoL-0000FW-71; Tue, 14 Jan 2020 16:12:57 +0000 From: Christoph Hellwig To: linux-xfs@vger.kernel.org, linux-fsdevel@vger.kernel.org, Waiman Long , Peter Zijlstra , Thomas Gleixner , Ingo Molnar , Will Deacon , Andrew Morton , linux-ext4@vger.kernel.org, cluster-devel@redhat.com Cc: linux-kernel@vger.kernel.org, linux-mm@kvack.org Subject: [PATCH 11/12] xfs: don't set IOMAP_DIO_SYNCHRONOUS for unaligned I/O Date: Tue, 14 Jan 2020 17:12:24 +0100 Message-Id: <20200114161225.309792-12-hch@lst.de> X-Mailer: git-send-email 2.24.1 In-Reply-To: <20200114161225.309792-1-hch@lst.de> References: <20200114161225.309792-1-hch@lst.de> MIME-Version: 1.0 X-SRS-Rewrite: SMTP reverse-path rewritten from by bombadil.infradead.org. See http://www.infradead.org/rpr.html Sender: linux-xfs-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-xfs@vger.kernel.org Now that i_rwsem is held until asynchronous writes complete, there is no need to force them to execute synchronously, as the i_rwsem protection is exactly the same as for synchronous writes. 
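For illustration only (this is not part of the patch), the lock hand-off that makes the forced synchronous execution unnecessary can be modelled in plain userspace C. The sketch below is a single-threaded toy under stated assumptions: toy_rwsem, toy_dio, toy_dio_rw, toy_unaligned_write and TOY_EIOCBQUEUED are hypothetical names invented here, and only the two IOMAP_DIO_RWSEM_* flag values are taken from this series. It shows that an asynchronous submission returns with the exclusive lock still held, and that the completion side drops it.

```c
#include <assert.h>
#include <stdbool.h>

/* Flag values as added to include/linux/iomap.h by this series. */
#define IOMAP_DIO_RWSEM_EXCL	(1 << 3)	/* caller holds i_rwsem exclusive */
#define IOMAP_DIO_RWSEM_SHARED	(1 << 4)	/* caller holds i_rwsem shared */

/* Hypothetical stand-in for the kernel's -EIOCBQUEUED return value. */
#define TOY_EIOCBQUEUED		529

/* Toy, single-threaded rwsem: it only records who currently holds it. */
struct toy_rwsem {
	int readers;
	bool writer;
};

static void toy_down_write(struct toy_rwsem *sem)
{
	assert(!sem->writer && sem->readers == 0);
	sem->writer = true;
}

static void toy_up_write(struct toy_rwsem *sem)
{
	assert(sem->writer);
	sem->writer = false;
}

static void toy_down_read(struct toy_rwsem *sem)
{
	assert(!sem->writer);
	sem->readers++;
}

static void toy_up_read(struct toy_rwsem *sem)
{
	assert(sem->readers > 0);
	sem->readers--;
}

/* Toy request: remembers which lock it runs under and in which mode. */
struct toy_dio {
	struct toy_rwsem *sem;
	unsigned int flags;
};

/* Models iomap_dio_complete(dio, unlock): drop the lock only if asked to. */
static void toy_dio_complete(struct toy_dio *dio, bool unlock)
{
	if (!unlock)
		return;
	if (dio->flags & IOMAP_DIO_RWSEM_EXCL)
		toy_up_write(dio->sem);
	else if (dio->flags & IOMAP_DIO_RWSEM_SHARED)
		toy_up_read(dio->sem);
}

/* Models iomap_dio_rw(): an async request is queued with the lock held. */
static long toy_dio_rw(struct toy_dio *dio, bool async)
{
	if (async)
		return -TOY_EIOCBQUEUED;
	toy_dio_complete(dio, false);	/* sync path: the caller unlocks */
	return 0;
}

/* Models a filesystem write path such as the unaligned case in xfs. */
static long toy_unaligned_write(struct toy_rwsem *sem, struct toy_dio *dio,
				bool async)
{
	long ret;

	toy_down_write(sem);
	dio->sem = sem;
	dio->flags = IOMAP_DIO_RWSEM_EXCL;
	ret = toy_dio_rw(dio, async);
	if (ret != -TOY_EIOCBQUEUED)
		toy_up_write(sem);	/* completed inline, unlock here */
	return ret;
}
```

In this model a call to toy_unaligned_write(&sem, &dio, true) returns -TOY_EIOCBQUEUED with sem.writer still set, and the lock is released only when toy_dio_complete(&dio, true) runs. A second unaligned write therefore cannot overlap the first, whether or not the first was submitted asynchronously.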

Signed-off-by: Christoph Hellwig
---
 fs/xfs/xfs_file.c | 12 +++---------
 1 file changed, 3 insertions(+), 9 deletions(-)

diff --git a/fs/xfs/xfs_file.c b/fs/xfs/xfs_file.c
index d0ee7d2932e4..3a734ad4bb10 100644
--- a/fs/xfs/xfs_file.c
+++ b/fs/xfs/xfs_file.c
@@ -510,9 +510,6 @@ xfs_file_dio_aio_write(
 	}
 
 	if (iocb->ki_flags & IOCB_NOWAIT) {
-		/* unaligned dio always waits, bail */
-		if (unaligned_io)
-			return -EAGAIN;
 		if (!xfs_ilock_nowait(ip, iolock))
 			return -EAGAIN;
 	} else {
@@ -526,14 +523,11 @@ xfs_file_dio_aio_write(
 
 	/*
 	 * If we are doing unaligned I/O, we can't allow any other overlapping
-	 * I/O in-flight at the same time or we risk data corruption. Wait for
-	 * all other I/O to drain before we submit and execute the I/O
-	 * synchronously to prevent subsequent overlapping I/O. If the I/O is
-	 * aligned, demote the iolock if we had to take the exclusive lock in
-	 * xfs_file_aio_write_checks() for other reasons.
+	 * If the I/O is aligned, demote the iolock if we had to take the
+	 * exclusive lock in xfs_file_aio_write_checks() for other reasons.
 	 */
 	if (unaligned_io) {
-		dio_flags = IOMAP_DIO_RWSEM_EXCL | IOMAP_DIO_SYNCHRONOUS;
+		dio_flags = IOMAP_DIO_RWSEM_EXCL;
 	} else {
 		if (iolock == XFS_IOLOCK_EXCL) {
 			xfs_ilock_demote(ip, XFS_IOLOCK_EXCL);

From patchwork Tue Jan 14 16:12:25 2020
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Christoph Hellwig
X-Patchwork-Id: 11332605
Return-Path:
Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123])
	by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id D4D156C1
	for ; Tue, 14 Jan 2020 16:13:02 +0000 (UTC)
Received: from vger.kernel.org (vger.kernel.org [209.132.180.67])
	by mail.kernel.org (Postfix) with ESMTP id B30E524658
	for ; Tue, 14 Jan 2020 16:13:02 +0000 (UTC)
Authentication-Results: mail.kernel.org;
	dkim=fail reason="signature verification failed" (2048-bit key)
	header.d=infradead.org header.i=@infradead.org header.b="oe4mwhFZ"
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
	id S1729486AbgANQNB (ORCPT );
	Tue, 14 Jan 2020 11:13:01 -0500
Received: from bombadil.infradead.org ([198.137.202.133]:43754 "EHLO
	bombadil.infradead.org" rhost-flags-OK-OK-OK-OK) by vger.kernel.org
	with ESMTP id S1728688AbgANQNB (ORCPT );
	Tue, 14 Jan 2020 11:13:01 -0500
DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed;
	d=infradead.org; s=bombadil.20170209; h=Content-Transfer-Encoding:
	MIME-Version:References:In-Reply-To:Message-Id:Date:Subject:Cc:To:From:Sender
	:Reply-To:Content-Type:Content-ID:Content-Description:Resent-Date:Resent-From
	:Resent-Sender:Resent-To:Resent-Cc:Resent-Message-ID:List-Id:List-Help:
	List-Unsubscribe:List-Subscribe:List-Post:List-Owner:List-Archive;
	bh=m2saCpf7ggz2fcYQT7OoTpnuyef/oFSaHismlZbaRwg=;
	b=oe4mwhFZgfpe9eTAKq5ucy4vTN
	NaN8iUTyOjY/EZWJ71LcA5QaZWt4OSsutSc6fLAIIoAare7j/+jUTGIn/UtPsBt4EWMGxOBKEcbDS
	EZYHcVXyQqFS/k3PtNf5ElfH03EYQdj4YLWnBMY4t26wLDS91nEdgDBck5YCUSwo5euviW7OmwMik
	ZIlKWgxrRLkyWEH9i2gbgtzUo3YqrNrJAlVWITaDEbflh5zmoN/E4VbqXU9dCYXe285Fjafs/dpm0
	MSi9wED6clPOjNi1XE6QmGW9rw5fxSkj7bZrJvQnHGiIWqmpamopH8FdxveTkOrCheBdsT6fUAvzu
	UftNEe5A==;
Received: from [2001:4bb8:18c:4f54:fcbb:a92b:61e1:719] (helo=localhost)
	by bombadil.infradead.org with esmtpsa (Exim 4.92.3 #3 (Red Hat Linux))
	id 1irOoO-0000GP-6P; Tue, 14 Jan 2020 16:13:00 +0000
From: Christoph Hellwig
To: linux-xfs@vger.kernel.org, linux-fsdevel@vger.kernel.org,
	Waiman Long , Peter Zijlstra , Thomas Gleixner , Ingo Molnar ,
	Will Deacon , Andrew Morton , linux-ext4@vger.kernel.org,
	cluster-devel@redhat.com
Cc: linux-kernel@vger.kernel.org, linux-mm@kvack.org
Subject: [PATCH 12/12] iomap: remove the inode_dio_begin/end calls
Date: Tue, 14 Jan 2020 17:12:25 +0100
Message-Id: <20200114161225.309792-13-hch@lst.de>
X-Mailer: git-send-email 2.24.1
In-Reply-To: <20200114161225.309792-1-hch@lst.de>
References: <20200114161225.309792-1-hch@lst.de>
MIME-Version: 1.0
X-SRS-Rewrite: SMTP reverse-path rewritten from by bombadil.infradead.org.
	See http://www.infradead.org/rpr.html
Sender: linux-xfs-owner@vger.kernel.org
Precedence: bulk
List-ID:
X-Mailing-List: linux-xfs@vger.kernel.org

Now that all iomap users hold i_rwsem over asynchronous I/O operations
these calls can be removed.

Signed-off-by: Christoph Hellwig
---
 fs/iomap/direct-io.c | 3 ---
 1 file changed, 3 deletions(-)

diff --git a/fs/iomap/direct-io.c b/fs/iomap/direct-io.c
index 0113ac33b0a0..c90ec82e8e08 100644
--- a/fs/iomap/direct-io.c
+++ b/fs/iomap/direct-io.c
@@ -126,7 +126,6 @@ static ssize_t iomap_dio_complete(struct iomap_dio *dio, bool unlock)
 	if (ret > 0 && (dio->flags & IOMAP_DIO_NEED_SYNC))
 		ret = generic_write_sync(iocb, ret);
 
-	inode_dio_end(file_inode(iocb->ki_filp));
 	kfree(dio);
 
 	return ret;
@@ -513,8 +512,6 @@ iomap_dio_rw(struct kiocb *iocb, struct iov_iter *iter,
 			goto out_free_dio;
 	}
 
-	inode_dio_begin(inode);
-
 	blk_start_plug(&plug);
 	do {
 		ret = iomap_apply(inode, pos, count, flags, ops, dio,