From patchwork Thu Aug 29 10:25:43 2024
X-Patchwork-Submitter: Jingxiang Zeng
X-Patchwork-Id: 13783002
From: Jingxiang Zeng
To: linux-mm@kvack.org
Cc: Andrew Morton, Yu Zhao, Wei Xu, "T . J . Mercier", Kairui Song,
 linux-kernel@vger.kernel.org, Zeng Jingxiang
Subject: [PATCH] mm/vmscan: wake up flushers conditionally to avoid cgroup OOM
Date: Thu, 29 Aug 2024 18:25:43 +0800
Message-ID: <20240829102543.189453-1-jingxiangzeng.cas@gmail.com>
X-Mailer: git-send-email 2.43.5
Reply-To: Jingxiang Zeng

From: Zeng Jingxiang

Commit 14aa8b2d5c2e ("mm/mglru: don't sync disk for each aging cycle")
removed the opportunity to wake up flushers during MGLRU page
reclamation. This increases the likelihood of triggering OOM when
reclaim encounters many dirty pages, and leads to premature OOM if
there are too many dirty pages in a cgroup:

Killed

dd invoked oom-killer: gfp_mask=0x101cca(GFP_HIGHUSER_MOVABLE|__GFP_WRITE),
order=0, oom_score_adj=0

Call Trace:
  dump_stack_lvl+0x5f/0x80
  dump_stack+0x14/0x20
  dump_header+0x46/0x1b0
  oom_kill_process+0x104/0x220
  out_of_memory+0x112/0x5a0
  mem_cgroup_out_of_memory+0x13b/0x150
  try_charge_memcg+0x44f/0x5c0
  charge_memcg+0x34/0x50
  __mem_cgroup_charge+0x31/0x90
  filemap_add_folio+0x4b/0xf0
  __filemap_get_folio+0x1a4/0x5b0
  ? srso_return_thunk+0x5/0x5f
  ? __block_commit_write+0x82/0xb0
  ext4_da_write_begin+0xe5/0x270
  generic_perform_write+0x134/0x2b0
  ext4_buffered_write_iter+0x57/0xd0
  ext4_file_write_iter+0x76/0x7d0
  ? selinux_file_permission+0x119/0x150
  ? srso_return_thunk+0x5/0x5f
  ? srso_return_thunk+0x5/0x5f
  vfs_write+0x30c/0x440
  ksys_write+0x65/0xe0
  __x64_sys_write+0x1e/0x30
  x64_sys_call+0x11c2/0x1d50
  do_syscall_64+0x47/0x110
  entry_SYSCALL_64_after_hwframe+0x76/0x7e

 memory: usage 308224kB, limit 308224kB, failcnt 2589
 swap: usage 0kB, limit 9007199254740988kB, failcnt 0
  ...
  file_dirty 303247360
  file_writeback 0
  ...

oom-kill:constraint=CONSTRAINT_MEMCG,nodemask=(null),cpuset=test,
mems_allowed=0,oom_memcg=/test,task_memcg=/test,task=dd,pid=4404,uid=0
Memory cgroup out of memory: Killed process 4404 (dd) total-vm:10512kB,
anon-rss:1152kB, file-rss:1824kB, shmem-rss:0kB, UID:0 pgtables:76kB
oom_score_adj:0

The flusher wakeup was removed to reduce SSD wear, but if all the
folios seen at the tail of an LRU are dirty, not waking up the flusher
can easily lead to thrashing. So wake it up when a memory cgroup is
about to OOM due to dirty caches.

MGLRU still suffers from an OOM issue on the latest mm tree, so the
test was done with another fix merged [1].

Link: https://lore.kernel.org/linux-mm/CAOUHufYi9h0kz5uW3LHHS3ZrVwEq-kKp8S6N-MZUmErNAXoXmw@mail.gmail.com/ [1]
Fixes: 14aa8b2d5c2e ("mm/mglru: don't sync disk for each aging cycle")
Signed-off-by: Zeng Jingxiang
Signed-off-by: Kairui Song
---
 mm/vmscan.c | 9 +++++++++
 1 file changed, 9 insertions(+)

diff --git a/mm/vmscan.c b/mm/vmscan.c
index f27792e77a0f..9cd8c42f67cb 100644
--- a/mm/vmscan.c
+++ b/mm/vmscan.c
@@ -4447,6 +4447,7 @@ static int scan_folios(struct lruvec *lruvec, struct scan_control *sc,
 				scanned, skipped, isolated,
 				type ? LRU_INACTIVE_FILE : LRU_INACTIVE_ANON);
 
+	sc->nr.taken += isolated;
 	/*
 	 * There might not be eligible folios due to reclaim_idx. Check the
 	 * remaining to prevent livelock if it's not making progress.
@@ -4919,6 +4920,14 @@ static void lru_gen_shrink_lruvec(struct lruvec *lruvec, struct scan_control *sc
 	if (try_to_shrink_lruvec(lruvec, sc))
 		lru_gen_rotate_memcg(lruvec, MEMCG_LRU_YOUNG);
 
+	/*
+	 * If too many pages failed to evict because they were dirty,
+	 * memory pressure has pushed dirty pages to the oldest gen;
+	 * wake up the flusher.
+	 */
+	if (sc->nr.unqueued_dirty >= sc->nr.taken)
+		wakeup_flusher_threads(WB_REASON_VMSCAN);
+
 	clear_mm_walk();
 
 	blk_finish_plug(&plug);