From patchwork Thu Oct 28 11:56:49 2021
X-Patchwork-Submitter: Ning Zhang
X-Patchwork-Id: 12589941
From: Ning Zhang <ningzhang@linux.alibaba.com>
To: linux-mm@kvack.org
Cc: Andrew Morton, Johannes Weiner, Michal Hocko, Vladimir Davydov, Yu Zhao
Subject: [RFC 0/6] Reclaim zero subpages of thp to avoid memory bloat
Date: Thu, 28 Oct 2021 19:56:49 +0800
Message-Id: <1635422215-99394-1-git-send-email-ningzhang@linux.alibaba.com>

As we know, thp may lead to memory bloat, which may cause OOM. Through testing with some apps, we found that the cause of the bloat is that a huge page may contain some zero subpages (which may or may not have been accessed). We also found that most zero subpages are concentrated in a few huge pages.
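To make the bloat concrete, here is a minimal userspace sketch (for illustration only, not part of this patchset) showing how a single write to a 2MB-aligned, MADV_HUGEPAGE-advised region can instantiate a whole huge page while 511 of its 512 subpages stay zero-filled; whether the THP is actually allocated depends on the system's transparent_hugepage settings:

#include <stdint.h>
#include <stdio.h>
#include <sys/mman.h>

#define HUGE_SZ (2UL * 1024 * 1024)

int main(void)
{
	/* Over-allocate so a 2MB-aligned region fits inside the mapping. */
	char *raw = mmap(NULL, 2 * HUGE_SZ, PROT_READ | PROT_WRITE,
			 MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
	char *p;

	if (raw == MAP_FAILED)
		return 1;
	/* Round up to the next 2MB boundary so a PMD-sized THP can back it. */
	p = (char *)(((uintptr_t)raw + HUGE_SZ - 1) & ~(HUGE_SZ - 1));

	madvise(p, HUGE_SZ, MADV_HUGEPAGE);	/* ask for a THP here */
	p[0] = 1;	/* one write can fault in a full 2MB huge page */

	/*
	 * AnonHugePages in /proc/self/smaps may now show 2048 kB for
	 * this VMA, although 511 of its 512 subpages are still zero.
	 */
	getchar();	/* pause to inspect smaps */
	return 0;
}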
The following data is from a text_classification_rnn case for TensorFlow:

  zero_subpages   huge_pages    waste
  [   0,   1)        186        0.00%
  [   1,   2)         23        0.01%
  [   2,   4)         36        0.02%
  [   4,   8)         67        0.08%
  [   8,  16)         80        0.23%
  [  16,  32)        109        0.61%
  [  32,  64)         44        0.49%
  [  64, 128)         12        0.30%
  [ 128, 256)         28        1.54%
  [ 256, 513)        159       18.03%

In this case, there are 187 huge pages (25% of the total huge pages) which contain 128 or more zero subpages, and the zero subpages in them waste 19.57% of the total RSS. That means we could reclaim 19.57% of the memory by splitting those 187 huge pages and reclaiming their zero subpages.

This patchset introduces a new mechanism to split huge pages which contain zero subpages and to reclaim those zero subpages. We add each anonymous huge page to a list to reduce the cost of finding such pages later. When memory reclaim is triggered, the list is walked, and huge pages which contain enough zero subpages may be split and reclaimed; the zero subpages are replaced by ZERO_PAGE(0). (A rough sketch of such a scan, for illustration, follows the diffstat at the end of this letter.)

Yu Zhao has done similar work to accelerate the path where a huge page is swapped out or migrated [1]. We do this in the normal memory shrink path instead, so that it also covers the case where swap is off, to avoid OOM.

In the future, we will reclaim "cold" huge pages proactively, to keep the performance of thp as far as possible. In addition, some users want the memory usage with thp to be equal to the usage with 4K pages.

[1] https://lore.kernel.org/linux-mm/20210731063938.1391602-1-yuzhao@google.com/

Ning Zhang (6):
  mm, thp: introduce thp zero subpages reclaim
  mm, thp: add a global interface for zero subpages reclaim
  mm, thp: introduce zero subpages reclaim threshold
  mm, thp: introduce a controller to trigger zero subpages reclaim
  mm, thp: add some statistics for zero subpages reclaim
  mm, thp: add document for zero subpages reclaim

 Documentation/admin-guide/mm/transhuge.rst |  75 ++++++
 include/linux/huge_mm.h                    |  13 +
 include/linux/memcontrol.h                 |  26 ++
 include/linux/mm.h                         |   1 +
 include/linux/mm_types.h                   |   6 +
 include/linux/mmzone.h                     |   9 +
 mm/huge_memory.c                           | 374 ++++++++++++++++++++++++++++-
 mm/memcontrol.c                            | 243 +++++++++++++++++++
 mm/vmscan.c                                |  61 ++++-
 9 files changed, 805 insertions(+), 3 deletions(-)
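For illustration, a minimal sketch of the kind of zero-subpage scan described above is shown below. This is not code from the patchset: the helper name thp_count_zero_subpages() and its placement are hypothetical; only kernel interfaces that exist upstream (kmap_local_page(), memchr_inv(), HPAGE_PMD_NR) are used.

#include <linux/highmem.h>	/* kmap_local_page(), kunmap_local() */
#include <linux/huge_mm.h>	/* HPAGE_PMD_NR */
#include <linux/mm.h>		/* struct page, PAGE_SIZE */
#include <linux/string.h>	/* memchr_inv() */

/*
 * Hypothetical helper, for illustration only: count the zero-filled
 * 4K subpages of a THP.  The caller is assumed to hold a reference
 * on the head page and to tolerate subpages changing under it; the
 * real patchset would recheck under appropriate locks before
 * replacing a subpage with ZERO_PAGE(0).
 */
static int thp_count_zero_subpages(struct page *head)
{
	int i, nr_zero = 0;

	for (i = 0; i < HPAGE_PMD_NR; i++) {
		void *addr = kmap_local_page(head + i);

		/* memchr_inv() returns NULL if the whole range is zero. */
		if (!memchr_inv(addr, 0, PAGE_SIZE))
			nr_zero++;
		kunmap_local(addr);
	}

	return nr_zero;
}

A shrinker walking the huge page list could then split pages whose count exceeds the configured threshold with split_huge_page(), and map the freed zero subpages read-only to ZERO_PAGE(0) so that a later write triggers copy-on-write.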