
[Bug,202441] Possibly vfs cache related replicable xfs regression since 4.19.0 on sata hdd:s

Message ID bug-202441-201763-UUDONeIBmx@https.bugzilla.kernel.org/ (mailing list archive)
State New, archived

Commit Message

bugzilla-daemon@bugzilla.kernel.org Jan. 29, 2019, 9:53 p.m. UTC
https://bugzilla.kernel.org/show_bug.cgi?id=202441

--- Comment #16 from Dave Chinner (david@fromorbit.com) ---
On Tue, Jan 29, 2019 at 09:41:21PM +0000, bugzilla-daemon@bugzilla.kernel.org
wrote:
> https://bugzilla.kernel.org/show_bug.cgi?id=202441
> --- Comment #14 from Dave Chinner (david@fromorbit.com) ---
> > --- Comment #12 from Roger (rogan6710@gmail.com) ---
> > Beginning from rc5 (it might have been earlier, too) the cache gets
> > released, sometimes almost all of it, and then begins to fill up slowly
> > again,
> 
> Which I'd consider bad behaviour - trashing the entire working set
> because memory pressure is occurring is pathological behaviour.
> 
> Can you confirm which -rcX that behaviour starts in? e.g. between
> -rc4 and -rc5 there is this commit:
> 
> 172b06c32b94 mm: slowly shrink slabs with a relatively small number of
> objects
> 
> Which does change the way that the inode caches are reclaimed by
> forcibly triggering reclaim for caches that would have previously
> been ignored. That's one of the "red flag" commits I noticed when
> first looking at the history between 4.18 and 4.19....

And now, added in 4.19.3:

 $ gl -n 1 5ebac3b957a9 -p
commit 5ebac3b957a91c921d2f1a7953caafca18aa6260
Author: Roman Gushchin <guro@fb.com>
Date:   Fri Nov 16 15:08:18 2018 -0800

    mm: don't reclaim inodes with many attached pages

    commit a76cf1a474d7dbcd9336b5f5afb0162baa142cf0 upstream.

    Spock reported that commit 172b06c32b94 ("mm: slowly shrink slabs with a
    relatively small number of objects") leads to a regression on his setup:
    periodically the majority of the pagecache is evicted without an obvious
    reason, while before the change the amount of free memory was balancing
    around the watermark.

    The reason behind this is that the above-mentioned change created some
    minimal background pressure on the inode cache.  The problem is that
    when an inode is selected for reclaim, all of its attached pagecache
    pages are stripped, no matter how many of them there are.  So, if a
    huge multi-gigabyte file is cached in memory, and the goal is to
    reclaim only a few slab objects (unused inodes), we can still end up
    evicting all gigabytes of the pagecache at once.

    The workload described by Spock has a few large non-mapped files in the
    pagecache, so it's especially noticeable.

    To solve the problem, let's postpone the reclaim of inodes which have
    more than one attached page.  Let's wait until the pagecache pages are
    evicted naturally by scanning the corresponding LRU lists, and only then
    reclaim the inode structure.

    Link: http://lkml.kernel.org/r/20181023164302.20436-1-guro@fb.com
    Signed-off-by: Roman Gushchin <guro@fb.com>
    Reported-by: Spock <dairinin@gmail.com>
    Tested-by: Spock <dairinin@gmail.com>
    Reviewed-by: Andrew Morton <akpm@linux-foundation.org>
    Cc: Michal Hocko <mhocko@kernel.org>
    Cc: Rik van Riel <riel@surriel.com>
    Cc: Randy Dunlap <rdunlap@infradead.org>
    Cc: <stable@vger.kernel.org>    [4.19.x]
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
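
As a rough illustration of the decision that commit changes, here is a
minimal, userspace-only C sketch; the struct, flag value and page counts
are made up for illustration, and only the rotate condition mirrors the
hunk in the patch at the end of this mail.

/*
 * Minimal userspace model of the isolate decision added by commit
 * a76cf1a474d7. The struct, flag value and page counts are hypothetical;
 * only the rotate condition mirrors the hunk at the end of this mail.
 */
#include <stdbool.h>
#include <stdio.h>

#define I_REFERENCED 0x1        /* stand-in for the kernel flag of the same name */

struct fake_inode {
        unsigned int i_state;   /* inode state flags */
        unsigned long nrpages;  /* pages attached to the inode's mapping */
};

/* Would the shrinker rotate (skip) this inode on the current pass? */
static bool lru_would_rotate(const struct fake_inode *inode)
{
        /* Before 4.19.3: only recently referenced inodes were skipped.    */
        /* After the patch: inodes with more than one cached page are too. */
        return (inode->i_state & I_REFERENCED) || inode->nrpages > 1;
}

int main(void)
{
        struct fake_inode small = { .i_state = 0, .nrpages = 1 };
        struct fake_inode huge  = { .i_state = 0, .nrpages = 1UL << 20 };  /* ~4GiB of 4KiB pages */

        printf("small file inode:       %s\n",
               lru_would_rotate(&small) ? "rotated (kept)" : "reclaimed");
        printf("huge cached file inode: %s\n",
               lru_would_rotate(&huge) ? "rotated (kept)" : "reclaimed");
        return 0;
}

With the patch, the second inode is rotated on every pass, so the shrinker
never frees it until its page cache has first been reclaimed page by page.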

So, basically, I was right - the slab shrinking change in 4.19-rc5
caused the page cache to sawtooth like you reported, and there is a
"fix" for it in 4.19.3.

What does that "fix" do? It stops the inode shrinker from reclaiming
inodes that have cached pages attached - such inodes now take the
"one more pass" path in inode_lru_isolate() (see the patch at the end
of this mail) and simply:

                return LRU_ROTATE;

Basically, what happened before this patch was that when an inode
was aged out of the cache by the shrinker cycling over it, its
page cache was reclaimed and then the inode was reclaimed.

Now, the inode does not get reclaimed and its page cache is not
reclaimed either. When your workload has lots of large files, that
means the inode cache turning over no longer reclaims those
inodes, and so an inode can only be reclaimed after memory reclaim
has reclaimed its entire page cache.

That's a /massive/ change in behaviour, and it means that clean
inodes with cached pages attached can no longer be reclaimed by the
inode cache shrinker. Which will drive the inode cache shrinker into
trying to reclaim dirty inodes.....
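
As an illustrative aid (not something from this thread), a small
standalone C program like the one below can be left running while you
test with and without the patch; it samples the "Cached:" line of
/proc/meminfo and the counters in /proc/sys/fs/inode-nr once a second,
so the sawtooth shows up directly in the numbers.

/*
 * cache_watch.c - sample the page cache size and inode counts once a second.
 * Illustrative observer only; not part of the kernel or of this patch.
 * Build: cc -O2 -o cache_watch cache_watch.c
 */
#include <stdio.h>
#include <unistd.h>

/* Value of the "Cached:" line from /proc/meminfo, in kB (-1 on error). */
static long cached_kb(void)
{
        char line[256];
        long kb = -1;
        FILE *f = fopen("/proc/meminfo", "r");

        if (!f)
                return -1;
        while (fgets(line, sizeof(line), f))
                if (sscanf(line, "Cached: %ld kB", &kb) == 1)
                        break;
        fclose(f);
        return kb;
}

/* Total and free inode counts from /proc/sys/fs/inode-nr; 0 on success. */
static int inode_counts(long *nr, long *nr_free)
{
        int ok;
        FILE *f = fopen("/proc/sys/fs/inode-nr", "r");

        if (!f)
                return -1;
        ok = (fscanf(f, "%ld %ld", nr, nr_free) == 2);
        fclose(f);
        return ok ? 0 : -1;
}

int main(void)
{
        for (;;) {
                long nr = 0, nr_free = 0;
                long kb = cached_kb();

                if (inode_counts(&nr, &nr_free) == 0)
                        printf("cached=%ld kB  nr_inodes=%ld  nr_free_inodes=%ld\n",
                               kb, nr, nr_free);
                fflush(stdout);
                sleep(1);
        }
}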

Can you revert the above patch and see if the problem goes away?

Cheers,

Dave.

Patch

diff --git a/fs/inode.c b/fs/inode.c
index 42f6d25f32a5..65ae154df760 100644
--- a/fs/inode.c
+++ b/fs/inode.c
@@ -730,8 +730,11 @@ static enum lru_status inode_lru_isolate(struct list_head *item,
                return LRU_REMOVED;
        }

-       /* recently referenced inodes get one more pass */
-       if (inode->i_state & I_REFERENCED) {
+       /*
+        * Recently referenced inodes and inodes with many attached pages
+        * get one more pass.
+        */
+       if (inode->i_state & I_REFERENCED || inode->i_data.nrpages > 1) {
                inode->i_state &= ~I_REFERENCED;
                spin_unlock(&inode->i_lock);
                return LRU_ROTATE;