From patchwork Wed Aug 18 15:22:39 2021
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Vlastimil Babka
X-Patchwork-Id: 12444757
From: Vlastimil Babka
To: linux-mm@kvack.org
Cc: Andrew Morton, Muchun Song, Chris Down, Michal Hocko, Matthew Wilcox,
 Vlastimil Babka, Chunxin Zang
Subject: [PATCH] mm, vmscan: guarantee drop_slab_node() termination
Date: Wed, 18 Aug 2021 17:22:39 +0200
Message-Id: <20210818152239.25502-1-vbabka@suse.cz>
X-Mailer: git-send-email 2.32.0

drop_slab_node() is called as part of the
echo 2 > /proc/sys/vm/drop_caches operation. It iterates over all memcgs
and calls shrink_slab(), which in turn iterates over all slab shrinkers.
Freed objects are counted, and as long as the total number of freed
objects from all memcgs and shrinkers is higher than 10, drop_slab_node()
loops for another full memcgs*shrinkers iteration.

This arbitrary constant threshold of 10 can result in effectively an
infinite loop on a system with a large number of memcgs and/or parallel
activity that allocates new objects. This has been reported previously by
Chunxin Zang [1] and recently by our customer.

The previous report [1] resulted in commit 069c411de40a ("mm/vmscan: fix
infinite loop in drop_slab_node"), which added a check for signals,
allowing the user to terminate the command writing to drop_caches. At the
time it was also considered to make the threshold grow with each iteration
to guarantee termination, but such a patch hasn't been formally proposed
yet.

This patch implements the dynamically growing threshold.
In the first iteration it's enough to free one object to continue, and
this threshold effectively doubles with each iteration. Our customer's
feedback was positive.

There is always a risk that on some systems this change will cause a
previously terminating drop_caches operation to terminate sooner and free
fewer objects. Ideally the semantics would guarantee freeing all freeable
objects that existed at the moment of starting the operation, while not
looping forever for newly allocated objects, but that's not feasible to
track. In the less ideal solution based on thresholds, arguably the
termination guarantee is more important than the exhaustiveness guarantee.
If there are reports of a large regression in exhaustiveness, we can tune
how fast the threshold grows.

[1] https://lore.kernel.org/lkml/20200909152047.27905-1-zangchunxin@bytedance.com/T/#u

Reported-by: Chunxin Zang
Signed-off-by: Vlastimil Babka
Reported-by: Matthew Wilcox
---
 mm/vmscan.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/mm/vmscan.c b/mm/vmscan.c
index 403a175a720f..ef3554314b47 100644
--- a/mm/vmscan.c
+++ b/mm/vmscan.c
@@ -936,6 +936,7 @@ static unsigned long shrink_slab(gfp_t gfp_mask, int nid,
 void drop_slab_node(int nid)
 {
 	unsigned long freed;
+	int shift = 0;
 
 	do {
 		struct mem_cgroup *memcg = NULL;
@@ -948,7 +949,7 @@ void drop_slab_node(int nid)
 		do {
 			freed += shrink_slab(GFP_KERNEL, nid, memcg, 0);
 		} while ((memcg = mem_cgroup_iter(NULL, memcg, NULL)) != NULL);
-	} while (freed > 10);
+	} while ((freed >> shift++) > 0);
 }
 
 void drop_slab(void)
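
For illustration only, here is a minimal userspace sketch of the two
termination conditions. It is not kernel code: the helper names
passes_old() and passes_new(), the max_passes cap, and the assumption
that every full memcgs*shrinkers pass frees the same number of objects
are all hypothetical, chosen to model parallel activity that keeps
refilling the caches.

```c
#include <stdio.h>

/*
 * Hypothetical model: each full pass frees exactly freed_per_pass objects.
 * Old condition: repeat while more than 10 objects were freed. With a
 * steady refill this never stops, so the model caps it at max_passes.
 */
static int passes_old(unsigned long freed_per_pass, int max_passes)
{
	unsigned long freed;
	int passes = 0;

	do {
		freed = freed_per_pass;
		passes++;
	} while (freed > 10 && passes < max_passes);

	return passes;
}

/*
 * New condition: pass k (counting from 1) is followed by another pass
 * only if at least 2^(k-1) objects were freed, so the bar doubles each
 * time. Assumes freed_per_pass < 2^63 so that the shift count stays
 * below the width of unsigned long.
 */
static int passes_new(unsigned long freed_per_pass)
{
	unsigned long freed;
	int passes = 0;
	int shift = 0;

	do {
		freed = freed_per_pass;
		passes++;
	} while ((freed >> shift++) > 0);

	return passes;
}
```

Under these assumptions, a steady 1000 objects freed per pass keeps the
old condition looping until the artificial cap, while the new condition
stops after 11 passes, because 1000 >> 10 is 0. A pass that frees
nothing ends the loop immediately in both variants.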