From patchwork Mon Apr 29 22:09:30 2019
X-Patchwork-Submitter: Andreas Gruenbacher
X-Patchwork-Id: 10922565
From: Andreas Gruenbacher
To: cluster-devel@redhat.com, "Darrick J . Wong"
Cc: Christoph Hellwig, Bob Peterson, Jan Kara, Dave Chinner,
 Ross Lagerwall, Mark Syms, Edwin Török,
 linux-fsdevel@vger.kernel.org, linux-mm@kvack.org, Andreas Gruenbacher
Subject: [PATCH v7 1/5] iomap: Clean up __generic_write_end calling
Date: Tue, 30 Apr 2019 00:09:30 +0200
Message-Id: <20190429220934.10415-2-agruenba@redhat.com>
In-Reply-To: <20190429220934.10415-1-agruenba@redhat.com>
References: <20190429220934.10415-1-agruenba@redhat.com>

From: Christoph Hellwig

Move the call to __generic_write_end into iomap_write_end instead of
duplicating it in each of the three branches. This requires open coding
the generic_write_end for the buffer_head case.

Signed-off-by: Christoph Hellwig
Signed-off-by: Andreas Gruenbacher
Reviewed-by: Jan Kara
Reviewed-by: Darrick J. Wong
---
 fs/iomap.c | 18 ++++++++----------
 1 file changed, 8 insertions(+), 10 deletions(-)

diff --git a/fs/iomap.c b/fs/iomap.c
index 97cb9d486a7d..2344c662e6fc 100644
--- a/fs/iomap.c
+++ b/fs/iomap.c
@@ -738,13 +738,11 @@ __iomap_write_end(struct inode *inode, loff_t pos, unsigned len,
 	 * uptodate page as a zero-length write, and force the caller to redo
 	 * the whole thing.
 	 */
-	if (unlikely(copied < len && !PageUptodate(page))) {
-		copied = 0;
-	} else {
-		iomap_set_range_uptodate(page, offset_in_page(pos), len);
-		iomap_set_page_dirty(page);
-	}
-	return __generic_write_end(inode, pos, copied, page);
+	if (unlikely(copied < len && !PageUptodate(page)))
+		return 0;
+	iomap_set_range_uptodate(page, offset_in_page(pos), len);
+	iomap_set_page_dirty(page);
+	return copied;
 }

 static int
@@ -761,7 +759,6 @@ iomap_write_end_inline(struct inode *inode, struct page *page,
 	kunmap_atomic(addr);

 	mark_inode_dirty(inode);
-	__generic_write_end(inode, pos, copied, page);
 	return copied;
 }

@@ -774,12 +771,13 @@ iomap_write_end(struct inode *inode, loff_t pos, unsigned len,
 	if (iomap->type == IOMAP_INLINE) {
 		ret = iomap_write_end_inline(inode, page, iomap, pos, copied);
 	} else if (iomap->flags & IOMAP_F_BUFFER_HEAD) {
-		ret = generic_write_end(NULL, inode->i_mapping, pos, len,
-				copied, page, NULL);
+		ret = block_write_end(NULL, inode->i_mapping, pos, len, copied,
+				page, NULL);
 	} else {
 		ret = __iomap_write_end(inode, pos, len, copied, page, iomap);
 	}

+	ret = __generic_write_end(inode, pos, ret, page);
 	if (iomap->page_done)
 		iomap->page_done(inode, pos, copied, page, iomap);


From patchwork Mon Apr 29 22:09:31 2019
X-Patchwork-Submitter: Andreas Gruenbacher
X-Patchwork-Id: 10922569
From: Andreas Gruenbacher
To: cluster-devel@redhat.com, "Darrick J . Wong"
Cc: Christoph Hellwig, Bob Peterson, Jan Kara, Dave Chinner,
 Ross Lagerwall, Mark Syms, Edwin Török,
 linux-fsdevel@vger.kernel.org, linux-mm@kvack.org, Andreas Gruenbacher
Subject: [PATCH v7 2/5] fs: Turn __generic_write_end into a void function
Date: Tue, 30 Apr 2019 00:09:31 +0200
Message-Id: <20190429220934.10415-3-agruenba@redhat.com>
In-Reply-To: <20190429220934.10415-1-agruenba@redhat.com>
References: <20190429220934.10415-1-agruenba@redhat.com>

The VFS-internal __generic_write_end helper always returns the value of
its @copied argument. This can be confusing, and it isn't very useful
anyway, so turn __generic_write_end into a function returning void
instead.

Signed-off-by: Andreas Gruenbacher
Reviewed-by: Christoph Hellwig
Reviewed-by: Darrick J. Wong
---
 fs/buffer.c   | 6 +++---
 fs/internal.h | 2 +-
 fs/iomap.c    | 2 +-
 3 files changed, 5 insertions(+), 5 deletions(-)

diff --git a/fs/buffer.c b/fs/buffer.c
index ce357602f471..e0d4c6a5e2d2 100644
--- a/fs/buffer.c
+++ b/fs/buffer.c
@@ -2085,7 +2085,7 @@ int block_write_begin(struct address_space *mapping, loff_t pos, unsigned len,
 }
 EXPORT_SYMBOL(block_write_begin);

-int __generic_write_end(struct inode *inode, loff_t pos, unsigned copied,
+void __generic_write_end(struct inode *inode, loff_t pos, unsigned copied,
 		struct page *page)
 {
 	loff_t old_size = inode->i_size;
@@ -2116,7 +2116,6 @@ int __generic_write_end(struct inode *inode, loff_t pos, unsigned copied,
 	 */
 	if (i_size_changed)
 		mark_inode_dirty(inode);
-	return copied;
 }

 int block_write_end(struct file *file, struct address_space *mapping,
@@ -2160,7 +2159,8 @@ int generic_write_end(struct file *file, struct address_space *mapping,
 			struct page *page, void *fsdata)
 {
 	copied = block_write_end(file, mapping, pos, len, copied, page, fsdata);
-	return __generic_write_end(mapping->host, pos, copied, page);
+	__generic_write_end(mapping->host, pos, copied, page);
+	return copied;
 }
 EXPORT_SYMBOL(generic_write_end);

diff --git a/fs/internal.h b/fs/internal.h
index 6a8b71643af4..530587fdf5d8 100644
--- a/fs/internal.h
+++ b/fs/internal.h
@@ -44,7 +44,7 @@ static inline int __sync_blockdev(struct block_device *bdev, int wait)
 extern void guard_bio_eod(int rw, struct bio *bio);
 extern int __block_write_begin_int(struct page *page, loff_t pos, unsigned len,
 		get_block_t *get_block, struct iomap *iomap);
-int __generic_write_end(struct inode *inode, loff_t pos, unsigned copied,
+void __generic_write_end(struct inode *inode, loff_t pos, unsigned copied,
 		struct page *page);

 /*
diff --git a/fs/iomap.c b/fs/iomap.c
index 2344c662e6fc..f8c9722d1a97 100644
--- a/fs/iomap.c
+++ b/fs/iomap.c
@@ -777,7 +777,7 @@ iomap_write_end(struct inode *inode, loff_t pos, unsigned len,
 		ret = __iomap_write_end(inode, pos, len, copied, page, iomap);
 	}

-	ret = __generic_write_end(inode, pos, ret, page);
+	__generic_write_end(inode, pos, ret, page);
 	if (iomap->page_done)
 		iomap->page_done(inode, pos, copied, page, iomap);


From patchwork Mon Apr 29 22:09:32 2019
X-Patchwork-Submitter: Andreas Gruenbacher
X-Patchwork-Id: 10922573
From: Andreas Gruenbacher
To: cluster-devel@redhat.com, "Darrick J . Wong"
Cc: Christoph Hellwig, Bob Peterson, Jan Kara, Dave Chinner,
 Ross Lagerwall, Mark Syms, Edwin Török,
 linux-fsdevel@vger.kernel.org, linux-mm@kvack.org, Andreas Gruenbacher
Subject: [PATCH v7 3/5] iomap: Fix use-after-free error in page_done callback
Date: Tue, 30 Apr 2019 00:09:32 +0200
Message-Id: <20190429220934.10415-4-agruenba@redhat.com>
In-Reply-To: <20190429220934.10415-1-agruenba@redhat.com>
References: <20190429220934.10415-1-agruenba@redhat.com>

In iomap_write_end, we're not holding a page reference anymore when
calling the page_done callback, but the callback needs that reference to
access the page. To fix that, move the put_page call in
__generic_write_end into the callers of __generic_write_end. Then, in
iomap_write_end, put the page after calling the page_done callback.

Reported-by: Jan Kara
Fixes: 63899c6f8851 ("iomap: add a page_done callback")
Signed-off-by: Andreas Gruenbacher
Reviewed-by: Jan Kara
Reviewed-by: Christoph Hellwig
Reviewed-by: Darrick J. Wong
---
 fs/buffer.c | 2 +-
 fs/iomap.c  | 1 +
 2 files changed, 2 insertions(+), 1 deletion(-)

diff --git a/fs/buffer.c b/fs/buffer.c
index e0d4c6a5e2d2..0faa41fb4c88 100644
--- a/fs/buffer.c
+++ b/fs/buffer.c
@@ -2104,7 +2104,6 @@ void __generic_write_end(struct inode *inode, loff_t pos, unsigned copied,
 	}

 	unlock_page(page);
-	put_page(page);

 	if (old_size < pos)
 		pagecache_isize_extended(inode, old_size, pos);
@@ -2160,6 +2159,7 @@ int generic_write_end(struct file *file, struct address_space *mapping,
 {
 	copied = block_write_end(file, mapping, pos, len, copied, page, fsdata);
 	__generic_write_end(mapping->host, pos, copied, page);
+	put_page(page);
 	return copied;
 }
 EXPORT_SYMBOL(generic_write_end);
diff --git a/fs/iomap.c b/fs/iomap.c
index f8c9722d1a97..62e3461704ce 100644
--- a/fs/iomap.c
+++ b/fs/iomap.c
@@ -780,6 +780,7 @@ iomap_write_end(struct inode *inode, loff_t pos, unsigned len,
 	__generic_write_end(inode, pos, ret, page);
 	if (iomap->page_done)
 		iomap->page_done(inode, pos, copied, page, iomap);
+	put_page(page);

 	if (ret < len)
 		iomap_write_failed(inode, pos, len);

From patchwork Mon Apr 29 22:09:33 2019
X-Patchwork-Submitter: Andreas Gruenbacher
X-Patchwork-Id: 10922577
From: Andreas Gruenbacher
To: cluster-devel@redhat.com, "Darrick J . Wong"
Cc: Christoph Hellwig, Bob Peterson, Jan Kara, Dave Chinner,
 Ross Lagerwall, Mark Syms, Edwin Török,
 linux-fsdevel@vger.kernel.org, linux-mm@kvack.org, Andreas Gruenbacher
Subject: [PATCH v7 4/5] iomap: Add a page_prepare callback
Date: Tue, 30 Apr 2019 00:09:33 +0200
Message-Id: <20190429220934.10415-5-agruenba@redhat.com>
In-Reply-To: <20190429220934.10415-1-agruenba@redhat.com>
References: <20190429220934.10415-1-agruenba@redhat.com>

Move the page_done callback into a separate iomap_page_ops structure and
add a page_prepare callback to be called before the next page is written
to. In gfs2, we'll want to start a transaction in page_prepare and end
it in page_done. Other filesystems that implement data journaling will
require the same kind of mechanism.

Signed-off-by: Andreas Gruenbacher
Reviewed-by: Christoph Hellwig
Reviewed-by: Jan Kara
Reviewed-by: Darrick J. Wong
---
 fs/gfs2/bmap.c        | 15 ++++++++++-----
 fs/iomap.c            | 36 ++++++++++++++++++++++++++----------
 include/linux/iomap.h | 22 +++++++++++++++++-----
 3 files changed, 53 insertions(+), 20 deletions(-)

diff --git a/fs/gfs2/bmap.c b/fs/gfs2/bmap.c
index 5da4ca9041c0..aa014725f84a 100644
--- a/fs/gfs2/bmap.c
+++ b/fs/gfs2/bmap.c
@@ -991,15 +991,20 @@ static void gfs2_write_unlock(struct inode *inode)
 	gfs2_glock_dq_uninit(&ip->i_gh);
 }

-static void gfs2_iomap_journaled_page_done(struct inode *inode, loff_t pos,
-					   unsigned copied, struct page *page,
-					   struct iomap *iomap)
+static void gfs2_iomap_page_done(struct inode *inode, loff_t pos,
+				 unsigned copied, struct page *page,
+				 struct iomap *iomap)
 {
 	struct gfs2_inode *ip = GFS2_I(inode);

-	gfs2_page_add_databufs(ip, page, offset_in_page(pos), copied);
+	if (page)
+		gfs2_page_add_databufs(ip, page, offset_in_page(pos), copied);
 }

+static const struct iomap_page_ops gfs2_iomap_page_ops = {
+	.page_done = gfs2_iomap_page_done,
+};
+
 static int gfs2_iomap_begin_write(struct inode *inode, loff_t pos,
 				  loff_t length, unsigned flags,
 				  struct iomap *iomap,
@@ -1077,7 +1082,7 @@ static int gfs2_iomap_begin_write(struct inode *inode, loff_t pos,
 		}
 	}
 	if (!gfs2_is_stuffed(ip) && gfs2_is_jdata(ip))
-		iomap->page_done = gfs2_iomap_journaled_page_done;
+		iomap->page_ops = &gfs2_iomap_page_ops;
 	return 0;

 out_trans_end:
diff --git a/fs/iomap.c b/fs/iomap.c
index 62e3461704ce..a3ffc83134ee 100644
--- a/fs/iomap.c
+++ b/fs/iomap.c
@@ -665,6 +665,7 @@ static int
 iomap_write_begin(struct inode *inode, loff_t pos, unsigned len, unsigned flags,
 		struct page **pagep, struct iomap *iomap)
 {
+	const struct iomap_page_ops *page_ops = iomap->page_ops;
 	pgoff_t index = pos >> PAGE_SHIFT;
 	struct page *page;
 	int status = 0;
@@ -674,9 +675,17 @@ iomap_write_begin(struct inode *inode, loff_t pos, unsigned len, unsigned flags,
 	if (fatal_signal_pending(current))
 		return -EINTR;

+	if (page_ops && page_ops->page_prepare) {
+		status = page_ops->page_prepare(inode, pos, len, iomap);
+		if (status)
+			return status;
+	}
+
 	page = grab_cache_page_write_begin(inode->i_mapping, index, flags);
-	if (!page)
-		return -ENOMEM;
+	if (!page) {
+		status = -ENOMEM;
+		goto out_no_page;
+	}

 	if (iomap->type == IOMAP_INLINE)
 		iomap_read_inline_data(inode, page, iomap);
@@ -684,15 +693,21 @@ iomap_write_begin(struct inode *inode, loff_t pos, unsigned len, unsigned flags,
 		status = __block_write_begin_int(page, pos, len, NULL, iomap);
 	else
 		status = __iomap_write_begin(inode, pos, len, page, iomap);
-	if (unlikely(status)) {
-		unlock_page(page);
-		put_page(page);
-		page = NULL;

-		iomap_write_failed(inode, pos, len);
-	}
+	if (unlikely(status))
+		goto out_unlock;

 	*pagep = page;
+	return 0;
+
+out_unlock:
+	unlock_page(page);
+	put_page(page);
+	iomap_write_failed(inode, pos, len);
+
+out_no_page:
+	if (page_ops && page_ops->page_done)
+		page_ops->page_done(inode, pos, 0, NULL, iomap);
 	return status;
 }

@@ -766,6 +781,7 @@ static int
 iomap_write_end(struct inode *inode, loff_t pos, unsigned len,
 		unsigned copied, struct page *page, struct iomap *iomap)
 {
+	const struct iomap_page_ops *page_ops = iomap->page_ops;
 	int ret;

 	if (iomap->type == IOMAP_INLINE) {
@@ -778,8 +794,8 @@ iomap_write_end(struct inode *inode, loff_t pos, unsigned len,
 	}

 	__generic_write_end(inode, pos, ret, page);
-	if (iomap->page_done)
-		iomap->page_done(inode, pos, copied, page, iomap);
+	if (page_ops && page_ops->page_done)
+		page_ops->page_done(inode, pos, copied, page, iomap);
 	put_page(page);

 	if (ret < len)
diff --git a/include/linux/iomap.h b/include/linux/iomap.h
index 0fefb5455bda..2103b94cb1bf 100644
--- a/include/linux/iomap.h
+++ b/include/linux/iomap.h
@@ -53,6 +53,8 @@ struct vm_fault;
  */
 #define IOMAP_NULL_ADDR -1ULL	/* addr is not valid */

+struct iomap_page_ops;
+
 struct iomap {
 	u64			addr; /* disk offset of mapping, bytes */
 	loff_t			offset;	/* file offset of mapping, bytes */
@@ -63,12 +65,22 @@ struct iomap {
 	struct dax_device	*dax_dev; /* dax_dev for dax operations */
 	void			*inline_data;
 	void			*private; /* filesystem private */
+	const struct iomap_page_ops *page_ops;
+};

-	/*
-	 * Called when finished processing a page in the mapping returned in
-	 * this iomap.  At least for now this is only supported in the buffered
-	 * write path.
-	 */
+/*
+ * When a filesystem sets page_ops in an iomap mapping it returns, page_prepare
+ * and page_done will be called for each page written to.  This only applies to
+ * buffered writes as unbuffered writes will not typically have pages
+ * associated with them.
+ *
+ * When page_prepare succeeds, page_done will always be called to do any
+ * cleanup work necessary.  In that page_done call, @page will be NULL if the
+ * associated page could not be obtained.
+ */
+struct iomap_page_ops {
+	int (*page_prepare)(struct inode *inode, loff_t pos, unsigned len,
+			struct iomap *iomap);
 	void (*page_done)(struct inode *inode, loff_t pos, unsigned copied,
 			struct page *page, struct iomap *iomap);
 };

From patchwork Mon Apr 29 22:09:34 2019
X-Patchwork-Submitter: Andreas Gruenbacher
X-Patchwork-Id: 10922581
From: Andreas Gruenbacher
To: cluster-devel@redhat.com, "Darrick J . Wong"
Cc: Christoph Hellwig, Bob Peterson, Jan Kara, Dave Chinner,
 Ross Lagerwall, Mark Syms, Edwin Török,
 linux-fsdevel@vger.kernel.org, linux-mm@kvack.org, Andreas Gruenbacher
Subject: [PATCH v7 5/5] gfs2: Fix iomap write page reclaim deadlock
Date: Tue, 30 Apr 2019 00:09:34 +0200
Message-Id: <20190429220934.10415-6-agruenba@redhat.com>
In-Reply-To: <20190429220934.10415-1-agruenba@redhat.com>
References: <20190429220934.10415-1-agruenba@redhat.com>

Since commit 64bc06bb32ee ("gfs2: iomap buffered write support"), gfs2 is
doing buffered writes by starting a transaction in iomap_begin, writing a
range of pages, and ending that transaction in iomap_end.
This approach suffers from two problems:

  (1) Any allocations necessary for the write are done in iomap_begin, so
      when the data aren't journaled, there is no need for keeping the
      transaction open until iomap_end.

  (2) Transactions keep the gfs2 log flush lock held. When
      iomap_file_buffered_write calls balance_dirty_pages, this can end up
      calling gfs2_write_inode, which will try to flush the log. This
      requires taking the log flush lock which is already held, resulting
      in a deadlock.

Fix both of these issues by not keeping transactions open from iomap_begin
to iomap_end. Instead, start a small transaction in page_prepare and end
it in page_done when necessary.

Reported-by: Edwin Török
Fixes: 64bc06bb32ee ("gfs2: iomap buffered write support")
Signed-off-by: Andreas Gruenbacher
Signed-off-by: Bob Peterson
---
 fs/gfs2/aops.c | 14 +++++---
 fs/gfs2/bmap.c | 88 +++++++++++++++++++++++++++-----------------------
 2 files changed, 58 insertions(+), 44 deletions(-)

diff --git a/fs/gfs2/aops.c b/fs/gfs2/aops.c
index 05dd78f4b2b3..6210d4429d84 100644
--- a/fs/gfs2/aops.c
+++ b/fs/gfs2/aops.c
@@ -649,7 +649,7 @@ static int gfs2_readpages(struct file *file, struct address_space *mapping,
  */
 void adjust_fs_space(struct inode *inode)
 {
-	struct gfs2_sbd *sdp = inode->i_sb->s_fs_info;
+	struct gfs2_sbd *sdp = GFS2_SB(inode);
 	struct gfs2_inode *m_ip = GFS2_I(sdp->sd_statfs_inode);
 	struct gfs2_inode *l_ip = GFS2_I(sdp->sd_sc_inode);
 	struct gfs2_statfs_change_host *m_sc = &sdp->sd_statfs_master;
@@ -657,10 +657,13 @@ void adjust_fs_space(struct inode *inode)
 	struct buffer_head *m_bh, *l_bh;
 	u64 fs_total, new_free;

+	if (gfs2_trans_begin(sdp, 2 * RES_STATFS, 0) != 0)
+		return;
+
 	/* Total up the file system space, according to the latest rindex. */
 	fs_total = gfs2_ri_total(sdp);
 	if (gfs2_meta_inode_buffer(m_ip, &m_bh) != 0)
-		return;
+		goto out;

 	spin_lock(&sdp->sd_statfs_spin);
 	gfs2_statfs_change_in(m_sc, m_bh->b_data +
@@ -675,11 +678,14 @@ void adjust_fs_space(struct inode *inode)
 	gfs2_statfs_change(sdp, new_free, new_free, 0);

 	if (gfs2_meta_inode_buffer(l_ip, &l_bh) != 0)
-		goto out;
+		goto out2;
 	update_statfs(sdp, m_bh, l_bh);
 	brelse(l_bh);
-out:
+out2:
 	brelse(m_bh);
+out:
+	sdp->sd_rindex_uptodate = 0;
+	gfs2_trans_end(sdp);
 }

 /**
diff --git a/fs/gfs2/bmap.c b/fs/gfs2/bmap.c
index aa014725f84a..27c82f4aaf32 100644
--- a/fs/gfs2/bmap.c
+++ b/fs/gfs2/bmap.c
@@ -991,17 +991,28 @@ static void gfs2_write_unlock(struct inode *inode)
 	gfs2_glock_dq_uninit(&ip->i_gh);
 }

+static int gfs2_iomap_page_prepare(struct inode *inode, loff_t pos,
+				   unsigned len, struct iomap *iomap)
+{
+	struct gfs2_sbd *sdp = GFS2_SB(inode);
+
+	return gfs2_trans_begin(sdp, RES_DINODE + (len >> inode->i_blkbits), 0);
+}
+
 static void gfs2_iomap_page_done(struct inode *inode, loff_t pos,
 				 unsigned copied, struct page *page,
 				 struct iomap *iomap)
 {
 	struct gfs2_inode *ip = GFS2_I(inode);
+	struct gfs2_sbd *sdp = GFS2_SB(inode);

-	if (page)
+	if (page && !gfs2_is_stuffed(ip))
 		gfs2_page_add_databufs(ip, page, offset_in_page(pos), copied);
+	gfs2_trans_end(sdp);
 }

 static const struct iomap_page_ops gfs2_iomap_page_ops = {
+	.page_prepare = gfs2_iomap_page_prepare,
 	.page_done = gfs2_iomap_page_done,
 };

@@ -1057,31 +1068,45 @@ static int gfs2_iomap_begin_write(struct inode *inode, loff_t pos,
 	if (alloc_required)
 		rblocks += gfs2_rg_blocks(ip, data_blocks + ind_blocks);

-	ret = gfs2_trans_begin(sdp, rblocks, iomap->length >> inode->i_blkbits);
-	if (ret)
-		goto out_trans_fail;
+	if (unstuff || iomap->type == IOMAP_HOLE) {
+		struct gfs2_trans *tr;

-	if (unstuff) {
-		ret = gfs2_unstuff_dinode(ip, NULL);
+		ret = gfs2_trans_begin(sdp, rblocks,
+				       iomap->length >> inode->i_blkbits);
 		if (ret)
-			goto out_trans_end;
-		release_metapath(mp);
-		ret = gfs2_iomap_get(inode, iomap->offset, iomap->length,
-				     flags, iomap, mp);
-		if (ret)
-			goto out_trans_end;
-	}
+			goto out_trans_fail;

-	if (iomap->type == IOMAP_HOLE) {
-		ret = gfs2_iomap_alloc(inode, iomap, flags, mp);
-		if (ret) {
-			gfs2_trans_end(sdp);
-			gfs2_inplace_release(ip);
-			punch_hole(ip, iomap->offset, iomap->length);
-			goto out_qunlock;
+		if (unstuff) {
+			ret = gfs2_unstuff_dinode(ip, NULL);
+			if (ret)
+				goto out_trans_end;
+			release_metapath(mp);
+			ret = gfs2_iomap_get(inode, iomap->offset,
+					     iomap->length, flags, iomap, mp);
+			if (ret)
+				goto out_trans_end;
+		}
+
+		if (iomap->type == IOMAP_HOLE) {
+			ret = gfs2_iomap_alloc(inode, iomap, flags, mp);
+			if (ret) {
+				gfs2_trans_end(sdp);
+				gfs2_inplace_release(ip);
+				punch_hole(ip, iomap->offset, iomap->length);
+				goto out_qunlock;
+			}
 		}
+
+		tr = current->journal_info;
+		if (tr->tr_num_buf_new)
+			__mark_inode_dirty(inode, I_DIRTY_DATASYNC);
+		else
+			gfs2_trans_add_meta(ip->i_gl, mp->mp_bh[0]);
+
+		gfs2_trans_end(sdp);
 	}
-	if (!gfs2_is_stuffed(ip) && gfs2_is_jdata(ip))
+
+	if (gfs2_is_stuffed(ip) || gfs2_is_jdata(ip))
 		iomap->page_ops = &gfs2_iomap_page_ops;
 	return 0;

@@ -1121,10 +1146,6 @@ static int gfs2_iomap_begin(struct inode *inode, loff_t pos, loff_t length,
 		    iomap->type != IOMAP_MAPPED)
 			ret = -ENOTBLK;
 	}
-	if (!ret) {
-		get_bh(mp.mp_bh[0]);
-		iomap->private = mp.mp_bh[0];
-	}
 	release_metapath(&mp);
 	trace_gfs2_iomap_end(ip, iomap, ret);
 	return ret;
@@ -1135,27 +1156,16 @@ static int gfs2_iomap_end(struct inode *inode, loff_t pos, loff_t length,
 {
 	struct gfs2_inode *ip = GFS2_I(inode);
 	struct gfs2_sbd *sdp = GFS2_SB(inode);
-	struct gfs2_trans *tr = current->journal_info;
-	struct buffer_head *dibh = iomap->private;

 	if ((flags & (IOMAP_WRITE | IOMAP_DIRECT)) != IOMAP_WRITE)
 		goto out;

-	if (iomap->type != IOMAP_INLINE) {
+	if (!gfs2_is_stuffed(ip))
 		gfs2_ordered_add_inode(ip);

-		if (tr->tr_num_buf_new)
-			__mark_inode_dirty(inode, I_DIRTY_DATASYNC);
-		else
-			gfs2_trans_add_meta(ip->i_gl, dibh);
-	}
-
-	if (inode == sdp->sd_rindex) {
+	if (inode == sdp->sd_rindex)
 		adjust_fs_space(inode);
-		sdp->sd_rindex_uptodate = 0;
-	}

-	gfs2_trans_end(sdp);
 	gfs2_inplace_release(ip);

 	if (length != written && (iomap->flags & IOMAP_F_NEW)) {
@@ -1175,8 +1185,6 @@ static int gfs2_iomap_end(struct inode *inode, loff_t pos, loff_t length,
 	gfs2_write_unlock(inode);

 out:
-	if (dibh)
-		brelse(dibh);
 	return 0;
 }

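
Illustration only, not part of any patch in this series: a minimal
userspace sketch of the page_prepare/page_done pairing that patches 4/5
and 5/5 rely on. Every name prefixed model_ is hypothetical and exists
neither in iomap nor in gfs2; the open_trans counter merely stands in for
a transaction begun in page_prepare and ended in page_done. It assumes
only the contract stated in the iomap_page_ops comment: once page_prepare
succeeds, page_done is always called, with a NULL page if the page could
not be obtained.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical userspace model; none of these names exist in the kernel. */
struct model_page {
	int id;
};

struct model_ctx {
	int open_trans;		/* "transactions" begun but not yet ended */
	int done_with_page;	/* page_done calls that received a page */
	int done_without_page;	/* page_done calls that received NULL */
	int fail_prepare;	/* nonzero: make page_prepare fail */
};

struct model_page_ops {
	int (*page_prepare)(struct model_ctx *ctx);
	void (*page_done)(struct model_ctx *ctx, struct model_page *page);
};

/* Stands in for starting a small per-page transaction. */
static int model_page_prepare(struct model_ctx *ctx)
{
	if (ctx->fail_prepare)
		return -EIO;
	ctx->open_trans++;
	return 0;
}

/* Stands in for ending that transaction; page may be NULL. */
static void model_page_done(struct model_ctx *ctx, struct model_page *page)
{
	assert(ctx->open_trans > 0);	/* must pair with a successful prepare */
	if (page)
		ctx->done_with_page++;
	else
		ctx->done_without_page++;
	ctx->open_trans--;
}

static const struct model_page_ops model_ops = {
	.page_prepare = model_page_prepare,
	.page_done = model_page_done,
};

/*
 * Models the control flow for one page: prepare, obtain the page, done.
 * Passing page == NULL models a failed page lookup.
 */
static int model_write_one_page(const struct model_page_ops *ops,
				struct model_ctx *ctx, struct model_page *page)
{
	int status;

	if (ops && ops->page_prepare) {
		status = ops->page_prepare(ctx);
		if (status)
			return status;	/* prepare failed: no page_done */
	}

	status = page ? 0 : -ENOMEM;

	/* After a successful prepare, page_done always runs. */
	if (ops && ops->page_done)
		ops->page_done(ctx, page);
	return status;
}
```

A caller would invoke model_write_one_page(&model_ops, &ctx, &page) once
per page. The property of interest is that ctx.open_trans is back at zero
after every call, whether the page was obtained or not, so nothing that
runs between pages sees a transaction still open.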