From patchwork Wed Jul 20 16:14:50 2016 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Brian Foster X-Patchwork-Id: 9239919 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork.web.codeaurora.org (Postfix) with ESMTP id D8EB160574 for ; Wed, 20 Jul 2016 16:14:58 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id C97B027C4B for ; Wed, 20 Jul 2016 16:14:58 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id BE39627CCB; Wed, 20 Jul 2016 16:14:58 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-4.2 required=2.0 tests=BAYES_00, RCVD_IN_DNSWL_MED autolearn=ham version=3.3.1 Received: from oss.sgi.com (oss.sgi.com [192.48.182.195]) (using TLSv1 with cipher DHE-RSA-CAMELLIA256-SHA (256/256 bits)) (No client certificate requested) by mail.wl.linuxfoundation.org (Postfix) with ESMTPS id 8F00227C4B for ; Wed, 20 Jul 2016 16:14:57 +0000 (UTC) Received: from oss.sgi.com (localhost [IPv6:::1]) by oss.sgi.com (Postfix) with ESMTP id 6276D7CA2; Wed, 20 Jul 2016 11:14:55 -0500 (CDT) X-Original-To: xfs@oss.sgi.com Delivered-To: xfs@oss.sgi.com Received: from relay.sgi.com (relay2.corp.sgi.com [137.38.102.29]) by oss.sgi.com (Postfix) with ESMTP id 157FA7CA1 for ; Wed, 20 Jul 2016 11:14:54 -0500 (CDT) Received: from cuda.sgi.com (cuda2.sgi.com [192.48.176.25]) by relay2.corp.sgi.com (Postfix) with ESMTP id CEB8E304048 for ; Wed, 20 Jul 2016 09:14:53 -0700 (PDT) X-ASG-Debug-ID: 1469031291-04cbb00355f2470001-NocioJ Received: from mx1.redhat.com (mx1.redhat.com [209.132.183.28]) by cuda.sgi.com with ESMTP id BgtRLO109vmWHLGn (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=NO) for ; Wed, 20 Jul 2016 09:14:52 -0700 (PDT) X-Barracuda-Envelope-From: bfoster@redhat.com X-Barracuda-Effective-Source-IP: mx1.redhat.com[209.132.183.28] X-Barracuda-Apparent-Source-IP: 209.132.183.28 X-ASG-Whitelist: Client Received: from int-mx09.intmail.prod.int.phx2.redhat.com (int-mx09.intmail.prod.int.phx2.redhat.com [10.5.11.22]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mx1.redhat.com (Postfix) with ESMTPS id B3B5DC05AA43 for ; Wed, 20 Jul 2016 16:14:51 +0000 (UTC) Received: from bfoster.bfoster (dhcp-41-104.bos.redhat.com [10.18.41.104]) by int-mx09.intmail.prod.int.phx2.redhat.com (8.14.4/8.14.4) with ESMTP id u6KGEpZS020191 for ; Wed, 20 Jul 2016 12:14:51 -0400 Received: by bfoster.bfoster (Postfix, from userid 1000) id 3276112026D; Wed, 20 Jul 2016 12:14:50 -0400 (EDT) From: Brian Foster To: xfs@oss.sgi.com Subject: [PATCH] xfs: skip dirty pages in ->releasepage() Date: Wed, 20 Jul 2016 12:14:50 -0400 X-ASG-Orig-Subj: [PATCH] xfs: skip dirty pages in ->releasepage() Message-Id: <1469031290-18087-1-git-send-email-bfoster@redhat.com> X-Scanned-By: MIMEDefang 2.68 on 10.5.11.22 X-Greylist: Sender IP whitelisted, not delayed by milter-greylist-4.5.16 (mx1.redhat.com [10.5.110.32]); Wed, 20 Jul 2016 16:14:51 +0000 (UTC) X-Barracuda-Connect: mx1.redhat.com[209.132.183.28] X-Barracuda-Start-Time: 1469031292 X-Barracuda-Encrypted: ECDHE-RSA-AES256-GCM-SHA384 X-Barracuda-URL: https://192.48.176.25:443/cgi-mod/mark.cgi X-Barracuda-Scan-Msg-Size: 4111 X-Virus-Scanned: by bsmtpd at sgi.com X-Barracuda-BRTS-Status: 1 X-BeenThere: xfs@oss.sgi.com X-Mailman-Version: 2.1.14 Precedence: list List-Id: XFS Filesystem from SGI List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , MIME-Version: 1.0 Errors-To: xfs-bounces@oss.sgi.com Sender: xfs-bounces@oss.sgi.com X-Virus-Scanned: ClamAV using ClamSMTP XFS has had scattered reports of delalloc blocks present at ->releasepage() time. This results in a warning with a stack trace similar to the following: ... Call Trace: [] dump_stack+0x63/0x84 [] warn_slowpath_common+0x97/0xe0 [] warn_slowpath_null+0x1a/0x20 [] xfs_vm_releasepage+0x10f/0x140 [] ? page_mkclean_one+0xd0/0xd0 [] ? anon_vma_prepare+0x150/0x150 [] try_to_release_page+0x32/0x50 [] shrink_active_list+0x3ce/0x3e0 [] shrink_lruvec+0x687/0x7d0 [] shrink_zone+0xdc/0x2c0 [] kswapd+0x4f9/0x970 [] ? mem_cgroup_shrink_node_zone+0x1a0/0x1a0 [] kthread+0xc9/0xe0 [] ? kthread_stop+0x100/0x100 [] ret_from_fork+0x3f/0x70 [] ? kthread_stop+0x100/0x100 This occurs because it is possible for shrink_active_list() to send pages marked dirty to ->releasepage() when certain buffer_head threshold conditions are met. shrink_active_list() doesn't check the page dirty state apparently to handle an old ext3 corner case where in some cases clean pages would not have the dirty bit cleared, thus it is up to the filesystem to determine how to handle the page. XFS currently handles the delalloc case properly, but this behavior makes the warning spurious. Update the XFS ->releasepage() handler to explicitly skip dirty pages. Retain the existing delalloc/unwritten checks so we continue to warn if such buffers exist on clean pages when they shouldn't. Diagnosed-by: Dave Chinner Signed-off-by: Brian Foster --- This is in response to the discussion here[1] that seemingly resulted in no resolution in the mm. As a result, it's left to the fs to deal with this particular situation. After discussing this with Dave on IRC, the initial plan was to include a WARN_ON_ONCE(PageDirty()) check. I've dropped the warning because as it turns out, block_invalidatepage() sends invalidated-but-dirty pages through ->releasepage(). In other words, a simple 'xfs_io -fc "pwrite 0 4k" $file; unlink $file' invokes the warning. In fact, it turned into a consistent boot time warning on both systems I ran it against so far. So this patch changes behavior in that particular case where a page is dirty but invalidated. If we want to try and preserve existing behavior, I do have another (so far untested) variant that does something like this: if (PageDirty(page) && (delalloc || unwritten)) return 0; else if (WARN_ON_ONCE(delalloc)) return 0; else if (WARN_ON_ONCE(unwritten)) return 0; ... ... to avoid the spurious warnings yet continue to try to free buffers on dirty pages. I lean more towards this patch as it is less logic and more consistent with other filesystems. Thoughts? Brian [1] http://oss.sgi.com/pipermail/xfs/2016-May/049136.html fs/xfs/xfs_aops.c | 14 ++++++++++++++ 1 file changed, 14 insertions(+) diff --git a/fs/xfs/xfs_aops.c b/fs/xfs/xfs_aops.c index b368277..332b672 100644 --- a/fs/xfs/xfs_aops.c +++ b/fs/xfs/xfs_aops.c @@ -1040,6 +1040,20 @@ xfs_vm_releasepage( trace_xfs_releasepage(page->mapping->host, page, 0, 0); + /* + * mm accommodates an old ext3 case where clean pages might not have had + * the dirty bit cleared. Thus, it can send actual dirty pages to + * ->releasepage() via shrink_active_list(). Conversely, + * block_invalidatepage() can send pages that are still marked dirty + * but otherwise have invalidated buffers. + * + * We've historically freed buffers on the latter. Instead, quietly + * filter out all dirty pages to avoid spurious buffer state warnings. + * This can likely be removed once shrink_active_list() is fixed. + */ + if (PageDirty(page)) + return 0; + xfs_count_page_state(page, &delalloc, &unwritten); if (WARN_ON_ONCE(delalloc))