From patchwork Tue Jul 26 00:35:26 2016
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: "Kirill A . Shutemov"
X-Patchwork-Id: 9247385
Return-Path:
Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125])
	by pdx-korg-patchwork.web.codeaurora.org (Postfix) with ESMTP id B2F4B60B19
	for ; Tue, 26 Jul 2016 00:42:36 +0000 (UTC)
Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1])
	by mail.wl.linuxfoundation.org (Postfix) with ESMTP id A2B2E27813
	for ; Tue, 26 Jul 2016 00:42:36 +0000 (UTC)
Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486)
	id 973E31FF15; Tue, 26 Jul 2016 00:42:36 +0000 (UTC)
X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org
X-Spam-Level:
X-Spam-Status: No, score=-6.9 required=2.0 tests=BAYES_00,RCVD_IN_DNSWL_HI
	autolearn=unavailable version=3.3.1
Received: from vger.kernel.org (vger.kernel.org [209.132.180.67])
	by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 222A91FF15
	for ; Tue, 26 Jul 2016 00:42:35 +0000 (UTC)
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
	id S932280AbcGZAjB (ORCPT ); Mon, 25 Jul 2016 20:39:01 -0400
Received: from mga02.intel.com ([134.134.136.20]:1122 "EHLO mga02.intel.com"
	rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP
	id S1755419AbcGZAgH (ORCPT ); Mon, 25 Jul 2016 20:36:07 -0400
Received: from fmsmga004.fm.intel.com ([10.253.24.48])
	by orsmga101.jf.intel.com with ESMTP; 25 Jul 2016 17:36:03 -0700
X-ExtLoop1: 1
X-IronPort-AV: E=Sophos;i="5.28,421,1464678000"; d="scan'208";a="145359365"
Received: from black.fi.intel.com ([10.237.72.93])
	by fmsmga004.fm.intel.com with ESMTP; 25 Jul 2016 17:35:57 -0700
Received: by black.fi.intel.com (Postfix, from userid 1000)
	id D0560AE6; Tue, 26 Jul 2016 03:35:47 +0300 (EEST)
From: "Kirill A. Shutemov"
To: "Theodore Ts'o", Andreas Dilger, Jan Kara
Cc: Alexander Viro, Hugh Dickins, Andrea Arcangeli, Andrew Morton,
	Dave Hansen, Vlastimil Babka, Matthew Wilcox, Ross Zwisler,
	linux-ext4@vger.kernel.org, linux-fsdevel@vger.kernel.org,
	linux-kernel@vger.kernel.org, linux-mm@kvack.org,
	linux-block@vger.kernel.org, "Kirill A. Shutemov"
Subject: [PATCHv1, RFC 24/33] truncate: make truncate_inode_pages_range() aware about huge pages
Date: Tue, 26 Jul 2016 03:35:26 +0300
Message-Id: <1469493335-3622-25-git-send-email-kirill.shutemov@linux.intel.com>
X-Mailer: git-send-email 2.8.1
In-Reply-To: <1469493335-3622-1-git-send-email-kirill.shutemov@linux.intel.com>
References: <1469493335-3622-1-git-send-email-kirill.shutemov@linux.intel.com>
Sender: linux-block-owner@vger.kernel.org
Precedence: bulk
List-ID:
X-Mailing-List: linux-block@vger.kernel.org
X-Virus-Scanned: ClamAV using ClamSMTP

As with shmem_undo_range(), truncate_inode_pages_range() removes a huge
page if it lies fully within the range.

A partial truncate of a huge page zeroes out the affected part of the THP.

Unlike with shmem, this doesn't prevent us from having holes in the middle
of a huge page: we can still skip writeback of untouched buffers.

With memory-mapped IO we would lose holes in some cases when we have THP
in the page cache, since we cannot track access at 4k granularity in this
case.

Signed-off-by: Kirill A.
Shutemov
---
 fs/buffer.c   |  2 +-
 mm/truncate.c | 95 ++++++++++++++++++++++++++++++++++++++++++++++++++++++-----
 2 files changed, 88 insertions(+), 9 deletions(-)

diff --git a/fs/buffer.c b/fs/buffer.c
index 2e25d0e7a233..e636dac53215 100644
--- a/fs/buffer.c
+++ b/fs/buffer.c
@@ -1534,7 +1534,7 @@ void block_invalidatepage(struct page *page, unsigned int offset,
 	/*
 	 * Check for overflow
 	 */
-	BUG_ON(stop > PAGE_SIZE || stop < length);
+	BUG_ON(stop > hpage_size(page) || stop < length);
 
 	head = page_buffers(page);
 	bh = head;
diff --git a/mm/truncate.c b/mm/truncate.c
index ce904e4b1708..9c339e6255f2 100644
--- a/mm/truncate.c
+++ b/mm/truncate.c
@@ -90,7 +90,7 @@ void do_invalidatepage(struct page *page, unsigned int offset,
 {
 	void (*invalidatepage)(struct page *, unsigned int, unsigned int);
 
-	invalidatepage = page->mapping->a_ops->invalidatepage;
+	invalidatepage = page_mapping(page)->a_ops->invalidatepage;
 #ifdef CONFIG_BLOCK
 	if (!invalidatepage)
 		invalidatepage = block_invalidatepage;
@@ -116,7 +116,7 @@ truncate_complete_page(struct address_space *mapping, struct page *page)
 		return -EIO;
 
 	if (page_has_private(page))
-		do_invalidatepage(page, 0, PAGE_SIZE);
+		do_invalidatepage(page, 0, hpage_size(page));
 
 	/*
 	 * Some filesystems seem to re-dirty the page even after
@@ -288,6 +288,36 @@ void truncate_inode_pages_range(struct address_space *mapping,
 				unlock_page(page);
 				continue;
 			}
+
+			if (PageTransTail(page)) {
+				/* Middle of THP: zero out the page */
+				clear_highpage(page);
+				if (page_has_private(page)) {
+					int off = page - compound_head(page);
+					do_invalidatepage(compound_head(page),
+							off * PAGE_SIZE,
+							PAGE_SIZE);
+				}
+				unlock_page(page);
+				continue;
+			} else if (PageTransHuge(page)) {
+				if (index == round_down(end, HPAGE_PMD_NR)) {
+					/*
+					 * Range ends in the middle of THP:
+					 * zero out the page
+					 */
+					clear_highpage(page);
+					if (page_has_private(page)) {
+						do_invalidatepage(page, 0,
+								PAGE_SIZE);
+					}
+					unlock_page(page);
+					continue;
+				}
+				index += HPAGE_PMD_NR - 1;
+				i += HPAGE_PMD_NR - 1;
+			}
+
 			truncate_inode_page(mapping, page);
 			unlock_page(page);
 		}
@@ -309,9 +339,12 @@ void truncate_inode_pages_range(struct address_space *mapping,
 			wait_on_page_writeback(page);
 			zero_user_segment(page, partial_start, top);
 			cleancache_invalidate_page(mapping, page);
-			if (page_has_private(page))
-				do_invalidatepage(page, partial_start,
-						  top - partial_start);
+			if (page_has_private(page)) {
+				int off = page - compound_head(page);
+				do_invalidatepage(compound_head(page),
+						off * PAGE_SIZE + partial_start,
+						top - partial_start);
+			}
 			unlock_page(page);
 			put_page(page);
 		}
@@ -322,9 +355,12 @@ void truncate_inode_pages_range(struct address_space *mapping,
 			wait_on_page_writeback(page);
 			zero_user_segment(page, 0, partial_end);
 			cleancache_invalidate_page(mapping, page);
-			if (page_has_private(page))
-				do_invalidatepage(page, 0,
-						  partial_end);
+			if (page_has_private(page)) {
+				int off = page - compound_head(page);
+				do_invalidatepage(compound_head(page),
+						off * PAGE_SIZE,
+						partial_end);
+			}
 			unlock_page(page);
 			put_page(page);
 		}
@@ -373,6 +409,49 @@ void truncate_inode_pages_range(struct address_space *mapping,
 			lock_page(page);
 			WARN_ON(page_to_pgoff(page) != index);
 			wait_on_page_writeback(page);
+
+			if (PageTransTail(page)) {
+				/* Middle of THP: zero out the page */
+				clear_highpage(page);
+				if (page_has_private(page)) {
+					int off = page - compound_head(page);
+					do_invalidatepage(compound_head(page),
+							off * PAGE_SIZE,
+							PAGE_SIZE);
+				}
+				unlock_page(page);
+				/*
+				 * Partial THP truncate due to 'start' in the
+				 * middle of THP: no need to look at these
+				 * pages again on the !pvec.nr restart.
+				 */
+				if (index != round_down(end, HPAGE_PMD_NR))
+					start++;
+				continue;
+			} else if (PageTransHuge(page)) {
+				if (index == round_down(end, HPAGE_PMD_NR)) {
+					/*
+					 * Range ends in the middle of THP:
+					 * zero out the page
+					 */
+					clear_highpage(page);
+					if (page_has_private(page)) {
+						do_invalidatepage(page, 0,
+								PAGE_SIZE);
+					}
+					unlock_page(page);
+					/*
+					 * Partial THP truncate due to 'end' in
+					 * the middle of THP: no need to look at
+					 * these pages again on restart.
+					 */
+					start++;
+					continue;
+				}
+				index += HPAGE_PMD_NR - 1;
+				i += HPAGE_PMD_NR - 1;
+			}
+
 			truncate_inode_page(mapping, page);
 			unlock_page(page);
 		}
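
For readers following the per-page decision in the two loops above, here is a small user-space sketch of it. It is not kernel code and not part of the patch: the names classify_subpage(), invalidate_offset(), round_down_nr() and the SKETCH_* constants are hypothetical, and the 4 KiB / 512-subpage values are an assumption matching x86-64. It also assumes 'end' is an exclusive page index and that a non-tail index passed in is the (aligned) head of a huge page.

```c
#include <assert.h>
#include <stdbool.h>

/* Assumed for illustration: 4 KiB base pages, 512 subpages per huge page. */
#define SKETCH_PAGE_SIZE    4096UL
#define SKETCH_HPAGE_PMD_NR 512UL

enum thp_action {
	THP_ZERO_SUBPAGE,   /* clear one subpage, invalidate one PAGE_SIZE slice */
	THP_TRUNCATE_WHOLE, /* drop the whole huge page, skip its tail pages */
};

/* Hypothetical stand-in for the kernel's round_down() on page indexes. */
static inline unsigned long round_down_nr(unsigned long x, unsigned long nr)
{
	return x - (x % nr);
}

/*
 * Model of the branch added to both loops:
 *  - a tail page means the range started in the middle of a huge page,
 *    so only that subpage is zeroed;
 *  - a head page whose huge page contains 'end' is zeroed the same way;
 *  - any other head page is fully inside the range and is removed.
 */
static inline enum thp_action classify_subpage(unsigned long index,
					       unsigned long end,
					       bool is_tail)
{
	if (is_tail)
		return THP_ZERO_SUBPAGE;
	if (index == round_down_nr(end, SKETCH_HPAGE_PMD_NR))
		return THP_ZERO_SUBPAGE;
	return THP_TRUNCATE_WHOLE;
}

/*
 * Byte offset handed to the invalidate hook for a subpage, relative to
 * the head page: mirrors 'off * PAGE_SIZE' in the patch.
 */
static inline unsigned long invalidate_offset(unsigned long index,
					      unsigned long head_index)
{
	assert(index >= head_index);
	assert(index - head_index < SKETCH_HPAGE_PMD_NR);
	return (index - head_index) * SKETCH_PAGE_SIZE;
}
```

With these assumptions, a head page at index 0 with end = 1024 is removed whole, while a head page at index 512 with end = 700 is only zeroed, because the range ends inside it.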