From patchwork Tue Mar 30 10:15:17 2021
X-Patchwork-Submitter: Muchun Song
X-Patchwork-Id: 12172195
From: Muchun Song
To: guro@fb.com, hannes@cmpxchg.org, mhocko@kernel.org, akpm@linux-foundation.org, shakeelb@google.com, vdavydov.dev@gmail.com
Cc: linux-kernel@vger.kernel.org, linux-mm@kvack.org, duanxiongchun@bytedance.com, Muchun Song
Subject: [RFC PATCH 01/15] mm: memcontrol: fix page charging in page replacement
Date: Tue, 30 Mar 2021 18:15:17 +0800
Message-Id: <20210330101531.82752-2-songmuchun@bytedance.com>
X-Mailer: git-send-email 2.21.0 (Apple Git-122)
In-Reply-To: <20210330101531.82752-1-songmuchun@bytedance.com>
References: <20210330101531.82752-1-songmuchun@bytedance.com>

Pages are not accounted at the root level, so do not charge the page to the root memcg in page replacement. Although the value is not displayed (mem_cgroup_usage), so there should not be any actual problem, page_counter_cancel() contains a WARN_ON_ONCE() that might trigger from the resulting imbalance. It is better to fix it.

Signed-off-by: Muchun Song
Acked-by: Johannes Weiner
---
 mm/memcontrol.c | 8 +++++---
 1 file changed, 5 insertions(+), 3 deletions(-)

diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index 249bf6b4d94c..d0c4f6e91e17 100644
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -6936,9 +6936,11 @@ void mem_cgroup_migrate(struct page *oldpage, struct page *newpage)
 	/* Force-charge the new page.
The old one will be freed soon */ nr_pages = thp_nr_pages(newpage); - page_counter_charge(&memcg->memory, nr_pages); - if (do_memsw_account()) - page_counter_charge(&memcg->memsw, nr_pages); + if (!mem_cgroup_is_root(memcg)) { + page_counter_charge(&memcg->memory, nr_pages); + if (do_memsw_account()) + page_counter_charge(&memcg->memsw, nr_pages); + } css_get(&memcg->css); commit_charge(newpage, memcg); From patchwork Tue Mar 30 10:15:18 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Muchun Song X-Patchwork-Id: 12172197 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-16.6 required=3.0 tests=BAYES_00,DKIM_INVALID, DKIM_SIGNED,HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,URIBL_BLOCKED,USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 9BC62C433DB for ; Tue, 30 Mar 2021 10:20:57 +0000 (UTC) Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by mail.kernel.org (Postfix) with ESMTP id 0F82C619AE for ; Tue, 30 Mar 2021 10:20:56 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org 0F82C619AE Authentication-Results: mail.kernel.org; dmarc=fail (p=none dis=none) header.from=bytedance.com Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=owner-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix) id 930ED6B0083; Tue, 30 Mar 2021 06:20:56 -0400 (EDT) Received: by kanga.kvack.org (Postfix, from userid 40) id 8BA926B0085; Tue, 30 Mar 2021 06:20:56 -0400 (EDT) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 7346B6B0087; Tue, 30 Mar 2021 06:20:56 -0400 (EDT) X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com (smtprelay0083.hostedemail.com [216.40.44.83]) by kanga.kvack.org (Postfix) with ESMTP id 53CF86B0083 for ; Tue, 30 Mar 2021 06:20:56 -0400 (EDT) Received: from smtpin26.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay04.hostedemail.com (Postfix) with ESMTP id 1CEDD441E for ; Tue, 30 Mar 2021 10:20:56 +0000 (UTC) X-FDA: 77976147312.26.F92078F Received: from mail-pl1-f181.google.com (mail-pl1-f181.google.com [209.85.214.181]) by imf16.hostedemail.com (Postfix) with ESMTP id 5A5E180192D4 for ; Tue, 30 Mar 2021 10:20:54 +0000 (UTC) Received: by mail-pl1-f181.google.com with SMTP id d8so5920733plh.11 for ; Tue, 30 Mar 2021 03:20:55 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=bytedance-com.20150623.gappssmtp.com; s=20150623; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; bh=4vv0fgzpNtXFXrGhCyirhwBALxnuEbM4AMJFXSO3pFg=; b=2HE2ueFDVsGNMUAshabKdYu6Q3JHI1+rwf/EjY6KNcooThpqp06667KERB5xeYj/Lv GKc9jxRp30VUTisgvgoI67Z44ogXJZ/Mv5yCBV0rVx4MNf2y4hJ4biP0NmLf/EtfsNxE dLGkGb6tKysZ4cCYyuqqpMRMLJvR7U3mBCmsiSSedZfBZAl3U6DWs4kgRRDsqyNOkR6z jtrTjwkggHL5mjhKf/TsrMseR3AAeR9F/fDhk8zttbLlsW9tSS73wK9brfDm0W/VVw0X 200tz++gF9wQiPmBk7xQGV9c/lh/nbTi8UXEcpOmVZVkyCNRpJDrSV2TgGjHKkH3CzSA a9Og== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=4vv0fgzpNtXFXrGhCyirhwBALxnuEbM4AMJFXSO3pFg=; 
b=J7abSnyzRv4lintLhpw1DHug2F7ioJWXIwN4N+JNMTpaK4muEDdU+udIWlce4DLvQu DRgzwvkJn6NKjujD34J7VAnjPpla2NUxtyEGAzrScigk4MeNQUBwZlV7FDxlqWYReBcw z5LRmL3Yakh85QOT2cGEgI+nMO6JOeRe9pad/qNfzn0320I4A2/sraRwuUTdxX7pPEbQ SeBoCTQNJAhTKfNMFfae/iUfaJm1A3v9AJwCJCPdabHoZomDBhoxe71qtG5XkLinKUNe SXOufmMqnWB1oXWBHlzVbgj4EnAGS+VWxBJqxHmnLY2SRD5M6mEtN73vSCUBcEhbnfpK /I4A== X-Gm-Message-State: AOAM533gvl8ldCaaET8D93zDw29FbOBPLw/5sgUNfKuoLpQ0EdNuBtQT qLtHNVsfMn8dny1O4NDNRZKtUQ== X-Google-Smtp-Source: ABdhPJyi8G/6wIWqYCU0GDx98hfZpDTpTad1jbbdH67t8tYd2VN4fJf09V0yhNqEvksyMvsfulZZhA== X-Received: by 2002:a17:90a:3b0e:: with SMTP id d14mr3659046pjc.198.1617099654608; Tue, 30 Mar 2021 03:20:54 -0700 (PDT) Received: from localhost.localdomain ([2408:8445:ad30:68d8:c87f:ca1b:dc00:4730]) by smtp.gmail.com with ESMTPSA id k10sm202259pfk.205.2021.03.30.03.20.44 (version=TLS1_2 cipher=ECDHE-ECDSA-AES128-GCM-SHA256 bits=128/128); Tue, 30 Mar 2021 03:20:54 -0700 (PDT) From: Muchun Song To: guro@fb.com, hannes@cmpxchg.org, mhocko@kernel.org, akpm@linux-foundation.org, shakeelb@google.com, vdavydov.dev@gmail.com Cc: linux-kernel@vger.kernel.org, linux-mm@kvack.org, duanxiongchun@bytedance.com, Muchun Song Subject: [RFC PATCH 02/15] mm: memcontrol: bail out early when !mm in get_mem_cgroup_from_mm Date: Tue, 30 Mar 2021 18:15:18 +0800 Message-Id: <20210330101531.82752-3-songmuchun@bytedance.com> X-Mailer: git-send-email 2.21.0 (Apple Git-122) In-Reply-To: <20210330101531.82752-1-songmuchun@bytedance.com> References: <20210330101531.82752-1-songmuchun@bytedance.com> MIME-Version: 1.0 X-Rspamd-Queue-Id: 5A5E180192D4 X-Stat-Signature: uk4ttd3nq1t6swok4rwifz1uahzaqykx X-Rspamd-Server: rspam02 Received-SPF: none (bytedance.com>: No applicable sender policy available) receiver=imf16; identity=mailfrom; envelope-from=""; helo=mail-pl1-f181.google.com; client-ip=209.85.214.181 X-HE-DKIM-Result: pass/pass X-HE-Tag: 1617099654-560377 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: When mm is NULL, we do not need to hold rcu lock and call css_tryget for the root memcg. And we also do not need to check !mm in every loop of while. So bail out early when !mm. Signed-off-by: Muchun Song Acked-by: Johannes Weiner --- mm/memcontrol.c | 21 ++++++++++----------- 1 file changed, 10 insertions(+), 11 deletions(-) diff --git a/mm/memcontrol.c b/mm/memcontrol.c index d0c4f6e91e17..48e4c20bf115 100644 --- a/mm/memcontrol.c +++ b/mm/memcontrol.c @@ -1029,20 +1029,19 @@ struct mem_cgroup *get_mem_cgroup_from_mm(struct mm_struct *mm) if (mem_cgroup_disabled()) return NULL; + /* + * Page cache insertions can happen withou an + * actual mm context, e.g. during disk probing + * on boot, loopback IO, acct() writes etc. + */ + if (unlikely(!mm)) + return root_mem_cgroup; + rcu_read_lock(); do { - /* - * Page cache insertions can happen withou an - * actual mm context, e.g. during disk probing - * on boot, loopback IO, acct() writes etc. 
- */ - if (unlikely(!mm)) + memcg = mem_cgroup_from_task(rcu_dereference(mm->owner)); + if (unlikely(!memcg)) memcg = root_mem_cgroup; - else { - memcg = mem_cgroup_from_task(rcu_dereference(mm->owner)); - if (unlikely(!memcg)) - memcg = root_mem_cgroup; - } } while (!css_tryget(&memcg->css)); rcu_read_unlock(); return memcg; From patchwork Tue Mar 30 10:15:19 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Muchun Song X-Patchwork-Id: 12172199 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-16.6 required=3.0 tests=BAYES_00,DKIM_INVALID, DKIM_SIGNED,HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,URIBL_BLOCKED,USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 9B08EC433C1 for ; Tue, 30 Mar 2021 10:21:06 +0000 (UTC) Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by mail.kernel.org (Postfix) with ESMTP id 1CDE7619AD for ; Tue, 30 Mar 2021 10:21:06 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org 1CDE7619AD Authentication-Results: mail.kernel.org; dmarc=fail (p=none dis=none) header.from=bytedance.com Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=owner-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix) id A43536B0087; Tue, 30 Mar 2021 06:21:05 -0400 (EDT) Received: by kanga.kvack.org (Postfix, from userid 40) id 9F2556B0088; Tue, 30 Mar 2021 06:21:05 -0400 (EDT) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 845856B0089; Tue, 30 Mar 2021 06:21:05 -0400 (EDT) X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com (smtprelay0207.hostedemail.com [216.40.44.207]) by kanga.kvack.org (Postfix) with ESMTP id 6483B6B0087 for ; Tue, 30 Mar 2021 06:21:05 -0400 (EDT) Received: from smtpin24.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay02.hostedemail.com (Postfix) with ESMTP id 194AC34A3 for ; Tue, 30 Mar 2021 10:21:05 +0000 (UTC) X-FDA: 77976147690.24.FDC404E Received: from mail-pf1-f181.google.com (mail-pf1-f181.google.com [209.85.210.181]) by imf30.hostedemail.com (Postfix) with ESMTP id C0ACBE0001AF for ; Tue, 30 Mar 2021 10:20:58 +0000 (UTC) Received: by mail-pf1-f181.google.com with SMTP id m11so11811651pfc.11 for ; Tue, 30 Mar 2021 03:21:04 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=bytedance-com.20150623.gappssmtp.com; s=20150623; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; bh=yzxvMRbUYa1GwcLB1wkQ6og1cTIvOHumZQ5rSMOWVXI=; b=f7HdTGTmy0fm4hATTZMFmFD5ccXmrfkBsfyEPWwsNzDwvP9dOHC7wIt0VOuiGmkyL7 ckh8rAjqdUQB3qRSLvRJ1heoVAZYQlQgODYXFQNolXRSv1E9rOL4Q++JpTfdg+hS1W7d q7wODDo/0n8AbE931c6XZGxJUtymxBoNEbcKtJcWhcxu6roEexQWF3+VLWT+9ZZ5b2/P 3GJV4+K47FjTLiNBp+WAeQc2B4b9M7yRpyASFVbmRlQH74bhcuU1utwnQtj+cPnOnZj9 IKcrc1489ws/5Fi429Q1oV4627gS0Kih1k+PblrGObqf+JExcItvOVseUVMlm55rLR6g IcwQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=yzxvMRbUYa1GwcLB1wkQ6og1cTIvOHumZQ5rSMOWVXI=; b=Q/DFtby64C8HmAYrXU1nTLS/1sG6AxhuERCCKwFNxvcaXftNJCNN/m9rC7Vqi6FrVt 
PFNLRzyhUe495YW9hJh66yLLFonlbnh5ronnGct48+Eo7ugrfETUaLT9PqaY0fKLDWZJ +34qrej6xnj+t3QeQsueXkL7H3Gju3u9fPrU7pNLYvgGe6atTFAPcMorzx4xml6HqEiw bUN1Qn1H9dAiPK6/hHQ7YdT80JOKcdh3NiKLspqEgkRvNu84QL64atUelR1D8LGA58Ue IhvKF17oqd6Cm5qmShvRHV9vvZb8eiWUE9tIUXmBVKgR7yIjnIqQj5ii9Gy5+JfoSi2T 6C6Q== X-Gm-Message-State: AOAM532npHtAFoQUWTduUfy206l8bQp6OLncxZfiHmFDBChWepQprfWt 3A3ZZ+Smv3UVYVyIPcvkJKwatZrRHr30bOSeqms= X-Google-Smtp-Source: ABdhPJzxrphcnc3CBE3GFoJWSIJlIgixHfhNQtJGsS8OadTn0HoshWWR4jjjF0odWFA7PnllC5nneQ== X-Received: by 2002:a62:5e05:0:b029:20b:241e:4e18 with SMTP id s5-20020a625e050000b029020b241e4e18mr29529352pfb.1.1617099663738; Tue, 30 Mar 2021 03:21:03 -0700 (PDT) Received: from localhost.localdomain ([2408:8445:ad30:68d8:c87f:ca1b:dc00:4730]) by smtp.gmail.com with ESMTPSA id k10sm202259pfk.205.2021.03.30.03.20.55 (version=TLS1_2 cipher=ECDHE-ECDSA-AES128-GCM-SHA256 bits=128/128); Tue, 30 Mar 2021 03:21:03 -0700 (PDT) From: Muchun Song To: guro@fb.com, hannes@cmpxchg.org, mhocko@kernel.org, akpm@linux-foundation.org, shakeelb@google.com, vdavydov.dev@gmail.com Cc: linux-kernel@vger.kernel.org, linux-mm@kvack.org, duanxiongchun@bytedance.com, Muchun Song Subject: [RFC PATCH 03/15] mm: memcontrol: remove the pgdata parameter of mem_cgroup_page_lruvec Date: Tue, 30 Mar 2021 18:15:19 +0800 Message-Id: <20210330101531.82752-4-songmuchun@bytedance.com> X-Mailer: git-send-email 2.21.0 (Apple Git-122) In-Reply-To: <20210330101531.82752-1-songmuchun@bytedance.com> References: <20210330101531.82752-1-songmuchun@bytedance.com> MIME-Version: 1.0 X-Stat-Signature: 1hq3okj1uoojjd97t5edgt7jpzbbtrg5 X-Rspamd-Server: rspam04 X-Rspamd-Queue-Id: C0ACBE0001AF Received-SPF: none (bytedance.com>: No applicable sender policy available) receiver=imf30; identity=mailfrom; envelope-from=""; helo=mail-pf1-f181.google.com; client-ip=209.85.210.181 X-HE-DKIM-Result: pass/pass X-HE-Tag: 1617099658-475575 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: All the callers of mem_cgroup_page_lruvec() just pass page_pgdat(page) as the 2nd parameter to it (except isolate_migratepages_block()). But for isolate_migratepages_block(), the page_pgdat(page) is also equal to the local variable of @pgdat. So mem_cgroup_page_lruvec() do not need the pgdat parameter. Just remove it to simplify the code. Signed-off-by: Muchun Song Acked-by: Johannes Weiner --- include/linux/memcontrol.h | 10 +++++----- mm/compaction.c | 2 +- mm/memcontrol.c | 9 +++------ mm/page-writeback.c | 2 +- mm/swap.c | 2 +- mm/workingset.c | 2 +- 6 files changed, 12 insertions(+), 15 deletions(-) diff --git a/include/linux/memcontrol.h b/include/linux/memcontrol.h index 7fdc92e1983e..a35a22994cf7 100644 --- a/include/linux/memcontrol.h +++ b/include/linux/memcontrol.h @@ -732,13 +732,12 @@ static inline struct lruvec *mem_cgroup_lruvec(struct mem_cgroup *memcg, /** * mem_cgroup_page_lruvec - return lruvec for isolating/putting an LRU page * @page: the page - * @pgdat: pgdat of the page * * This function relies on page->mem_cgroup being stable. 
*/ -static inline struct lruvec *mem_cgroup_page_lruvec(struct page *page, - struct pglist_data *pgdat) +static inline struct lruvec *mem_cgroup_page_lruvec(struct page *page) { + pg_data_t *pgdat = page_pgdat(page); struct mem_cgroup *memcg = page_memcg(page); VM_WARN_ON_ONCE_PAGE(!memcg && !mem_cgroup_disabled(), page); @@ -1232,9 +1231,10 @@ static inline struct lruvec *mem_cgroup_lruvec(struct mem_cgroup *memcg, return &pgdat->__lruvec; } -static inline struct lruvec *mem_cgroup_page_lruvec(struct page *page, - struct pglist_data *pgdat) +static inline struct lruvec *mem_cgroup_page_lruvec(struct page *page) { + pg_data_t *pgdat = page_pgdat(page); + return &pgdat->__lruvec; } diff --git a/mm/compaction.c b/mm/compaction.c index e04f4476e68e..8b8fc279766e 100644 --- a/mm/compaction.c +++ b/mm/compaction.c @@ -994,7 +994,7 @@ isolate_migratepages_block(struct compact_control *cc, unsigned long low_pfn, if (!TestClearPageLRU(page)) goto isolate_fail_put; - lruvec = mem_cgroup_page_lruvec(page, pgdat); + lruvec = mem_cgroup_page_lruvec(page); /* If we already hold the lock, we can skip some rechecking */ if (lruvec != locked) { diff --git a/mm/memcontrol.c b/mm/memcontrol.c index 48e4c20bf115..405c9642aac0 100644 --- a/mm/memcontrol.c +++ b/mm/memcontrol.c @@ -1305,9 +1305,8 @@ void lruvec_memcg_debug(struct lruvec *lruvec, struct page *page) struct lruvec *lock_page_lruvec(struct page *page) { struct lruvec *lruvec; - struct pglist_data *pgdat = page_pgdat(page); - lruvec = mem_cgroup_page_lruvec(page, pgdat); + lruvec = mem_cgroup_page_lruvec(page); spin_lock(&lruvec->lru_lock); lruvec_memcg_debug(lruvec, page); @@ -1318,9 +1317,8 @@ struct lruvec *lock_page_lruvec(struct page *page) struct lruvec *lock_page_lruvec_irq(struct page *page) { struct lruvec *lruvec; - struct pglist_data *pgdat = page_pgdat(page); - lruvec = mem_cgroup_page_lruvec(page, pgdat); + lruvec = mem_cgroup_page_lruvec(page); spin_lock_irq(&lruvec->lru_lock); lruvec_memcg_debug(lruvec, page); @@ -1331,9 +1329,8 @@ struct lruvec *lock_page_lruvec_irq(struct page *page) struct lruvec *lock_page_lruvec_irqsave(struct page *page, unsigned long *flags) { struct lruvec *lruvec; - struct pglist_data *pgdat = page_pgdat(page); - lruvec = mem_cgroup_page_lruvec(page, pgdat); + lruvec = mem_cgroup_page_lruvec(page); spin_lock_irqsave(&lruvec->lru_lock, *flags); lruvec_memcg_debug(lruvec, page); diff --git a/mm/page-writeback.c b/mm/page-writeback.c index eb34d204d4ee..f517e0669924 100644 --- a/mm/page-writeback.c +++ b/mm/page-writeback.c @@ -2727,7 +2727,7 @@ int test_clear_page_writeback(struct page *page) int ret; memcg = lock_page_memcg(page); - lruvec = mem_cgroup_page_lruvec(page, page_pgdat(page)); + lruvec = mem_cgroup_page_lruvec(page); if (mapping && mapping_use_writeback_tags(mapping)) { struct inode *inode = mapping->host; struct backing_dev_info *bdi = inode_to_bdi(inode); diff --git a/mm/swap.c b/mm/swap.c index 31b844d4ed94..af695acb7413 100644 --- a/mm/swap.c +++ b/mm/swap.c @@ -300,7 +300,7 @@ void lru_note_cost(struct lruvec *lruvec, bool file, unsigned int nr_pages) void lru_note_cost_page(struct page *page) { - lru_note_cost(mem_cgroup_page_lruvec(page, page_pgdat(page)), + lru_note_cost(mem_cgroup_page_lruvec(page), page_is_file_lru(page), thp_nr_pages(page)); } diff --git a/mm/workingset.c b/mm/workingset.c index cd39902c1062..1ab5784c9e25 100644 --- a/mm/workingset.c +++ b/mm/workingset.c @@ -408,7 +408,7 @@ void workingset_activation(struct page *page) memcg = page_memcg_rcu(page); if 
(!mem_cgroup_disabled() && !memcg) goto out; - lruvec = mem_cgroup_page_lruvec(page, page_pgdat(page)); + lruvec = mem_cgroup_page_lruvec(page); workingset_age_nonresident(lruvec, thp_nr_pages(page)); out: rcu_read_unlock(); From patchwork Tue Mar 30 10:15:20 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Muchun Song X-Patchwork-Id: 12172201 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-16.6 required=3.0 tests=BAYES_00,DKIM_INVALID, DKIM_SIGNED,HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,URIBL_BLOCKED,USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 68B67C433DB for ; Tue, 30 Mar 2021 10:21:20 +0000 (UTC) Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by mail.kernel.org (Postfix) with ESMTP id EC60361955 for ; Tue, 30 Mar 2021 10:21:19 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org EC60361955 Authentication-Results: mail.kernel.org; dmarc=fail (p=none dis=none) header.from=bytedance.com Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=owner-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix) id 70E476B0080; Tue, 30 Mar 2021 06:21:19 -0400 (EDT) Received: by kanga.kvack.org (Postfix, from userid 40) id 6BDEC6B0089; Tue, 30 Mar 2021 06:21:19 -0400 (EDT) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 537006B008A; Tue, 30 Mar 2021 06:21:19 -0400 (EDT) X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com (smtprelay0069.hostedemail.com [216.40.44.69]) by kanga.kvack.org (Postfix) with ESMTP id 33F046B0080 for ; Tue, 30 Mar 2021 06:21:19 -0400 (EDT) Received: from smtpin39.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay05.hostedemail.com (Postfix) with ESMTP id E6E57181AF5C7 for ; Tue, 30 Mar 2021 10:21:18 +0000 (UTC) X-FDA: 77976148236.39.4B8A7A7 Received: from mail-pg1-f180.google.com (mail-pg1-f180.google.com [209.85.215.180]) by imf20.hostedemail.com (Postfix) with ESMTP id 9C76AF0 for ; Tue, 30 Mar 2021 10:21:13 +0000 (UTC) Received: by mail-pg1-f180.google.com with SMTP id v186so11383642pgv.7 for ; Tue, 30 Mar 2021 03:21:16 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=bytedance-com.20150623.gappssmtp.com; s=20150623; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; bh=R5veAUhSS6Fz26qMprH/wcKOq7j9SHcNjy/OxtDgU18=; b=zmgA6N0P6xL+V29cMSaEkKDyFRe4H8XPXlM0+f6dCOZP38p5f58whv0lg+2tRIU9n4 c7E0IsyjDNrfjBv7A2czlzFODQekAnNcftnNDdPW6e/iO1sELVQOI+fvR4OmbdsMdlGo EpZQGi3QbqjgKvnRW/cPYp/u/RcPZJFYSwm9w3jZd8BEQwDv0XPB8spHV94sM0kVpkNo kLXnRkq5NWdv7IrJugk0nfOtvEmSvWLglNZiHd9gj9sUg2/VkVFQPSSAxdWsv6xzqBUT IoMJVPGAVMtWEj/TwGv9tX+5ZzPz/mU5edg6TFzXtxGDV+EDqe7fjL8CQotZiyXl6TrA lGoA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=R5veAUhSS6Fz26qMprH/wcKOq7j9SHcNjy/OxtDgU18=; b=dx0QKeP+UMNsTETeb2WF9m+8PlXQWbU7hMHI0z2XBgXVzlMo0wd6Plf1C/TXq34vSi H+7Fw+xWvBctM19f2Kiv47ll2w0K+IYUdX+Q7NSNnKkVgyhmTkqMoC3vnekEoIuxaGNx 
fOXpXZgwJm8Ap9ubg3wYYn+waUrtLMAbQhVtTfVI3TeRl5OELRdE1Z9Z7aPm5jgt/Cj/ /9BewyqObyW05JZ46VI5q89yjcd+c3t2MPvO8ik5IxuWUPVaBLMJshugd6dEGPGwIPAz CPJGv8GkpbjnNufjQx3oGRJiC/9Odi4lik2HEI6lhaRaspGgrUNb8cpAj9E/AyZ7e+hu qnRQ== X-Gm-Message-State: AOAM533kwSaba/9+3n4UfDkF2U1htoUy7/ENksxZpKLe7WnaouXUfwXG SC+RDxoT/SLk5sAxOQJLBB76OQ== X-Google-Smtp-Source: ABdhPJz286lp8BRJWYf0CYMLqNmJOod7UzXekLsRaYeQsUYB9g/aG3ON62ct8biWttlwrCYtmaA77g== X-Received: by 2002:a62:fc10:0:b029:1ef:141f:609 with SMTP id e16-20020a62fc100000b02901ef141f0609mr28684092pfh.78.1617099675662; Tue, 30 Mar 2021 03:21:15 -0700 (PDT) Received: from localhost.localdomain ([2408:8445:ad30:68d8:c87f:ca1b:dc00:4730]) by smtp.gmail.com with ESMTPSA id k10sm202259pfk.205.2021.03.30.03.21.04 (version=TLS1_2 cipher=ECDHE-ECDSA-AES128-GCM-SHA256 bits=128/128); Tue, 30 Mar 2021 03:21:15 -0700 (PDT) From: Muchun Song To: guro@fb.com, hannes@cmpxchg.org, mhocko@kernel.org, akpm@linux-foundation.org, shakeelb@google.com, vdavydov.dev@gmail.com Cc: linux-kernel@vger.kernel.org, linux-mm@kvack.org, duanxiongchun@bytedance.com, Muchun Song Subject: [RFC PATCH 04/15] mm: memcontrol: use lruvec_memcg in lruvec_holds_page_lru_lock Date: Tue, 30 Mar 2021 18:15:20 +0800 Message-Id: <20210330101531.82752-5-songmuchun@bytedance.com> X-Mailer: git-send-email 2.21.0 (Apple Git-122) In-Reply-To: <20210330101531.82752-1-songmuchun@bytedance.com> References: <20210330101531.82752-1-songmuchun@bytedance.com> MIME-Version: 1.0 X-Rspamd-Server: rspam01 X-Rspamd-Queue-Id: 9C76AF0 X-Stat-Signature: 6pcd6xcyyd5k6wdza9he91mrbrd7xsc7 Received-SPF: none (bytedance.com>: No applicable sender policy available) receiver=imf20; identity=mailfrom; envelope-from=""; helo=mail-pg1-f180.google.com; client-ip=209.85.215.180 X-HE-DKIM-Result: pass/pass X-HE-Tag: 1617099673-479822 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: We already have a helper lruvec_memcg() to get the memcg from lruvec, we do not need to do it ourselves in the lruvec_holds_page_lru_lock(). So use lruvec_memcg() instead. Signed-off-by: Muchun Song --- include/linux/memcontrol.h | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/include/linux/memcontrol.h b/include/linux/memcontrol.h index a35a22994cf7..6e3283828391 100644 --- a/include/linux/memcontrol.h +++ b/include/linux/memcontrol.h @@ -744,20 +744,20 @@ static inline struct lruvec *mem_cgroup_page_lruvec(struct page *page) return mem_cgroup_lruvec(memcg, pgdat); } +static inline struct mem_cgroup *lruvec_memcg(struct lruvec *lruvec); + static inline bool lruvec_holds_page_lru_lock(struct page *page, struct lruvec *lruvec) { pg_data_t *pgdat = page_pgdat(page); const struct mem_cgroup *memcg; - struct mem_cgroup_per_node *mz; if (mem_cgroup_disabled()) return lruvec == &pgdat->__lruvec; - mz = container_of(lruvec, struct mem_cgroup_per_node, lruvec); memcg = page_memcg(page) ? 
: root_mem_cgroup; - return lruvec->pgdat == pgdat && mz->memcg == memcg; + return lruvec->pgdat == pgdat && lruvec_memcg(lruvec) == memcg; } struct mem_cgroup *mem_cgroup_from_task(struct task_struct *p); From patchwork Tue Mar 30 10:15:21 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Muchun Song X-Patchwork-Id: 12172203 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-16.6 required=3.0 tests=BAYES_00,DKIM_INVALID, DKIM_SIGNED,HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,URIBL_BLOCKED,USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 704D2C433DB for ; Tue, 30 Mar 2021 10:21:32 +0000 (UTC) Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by mail.kernel.org (Postfix) with ESMTP id CFC1361955 for ; Tue, 30 Mar 2021 10:21:31 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org CFC1361955 Authentication-Results: mail.kernel.org; dmarc=fail (p=none dis=none) header.from=bytedance.com Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=owner-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix) id 588126B008A; Tue, 30 Mar 2021 06:21:31 -0400 (EDT) Received: by kanga.kvack.org (Postfix, from userid 40) id 544546B008C; Tue, 30 Mar 2021 06:21:31 -0400 (EDT) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 3B0C16B0092; Tue, 30 Mar 2021 06:21:31 -0400 (EDT) X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com (smtprelay0159.hostedemail.com [216.40.44.159]) by kanga.kvack.org (Postfix) with ESMTP id 1A5176B008A for ; Tue, 30 Mar 2021 06:21:31 -0400 (EDT) Received: from smtpin32.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay02.hostedemail.com (Postfix) with ESMTP id DB5554DDC for ; Tue, 30 Mar 2021 10:21:30 +0000 (UTC) X-FDA: 77976148740.32.86B4F4A Received: from mail-pg1-f175.google.com (mail-pg1-f175.google.com [209.85.215.175]) by imf17.hostedemail.com (Postfix) with ESMTP id A06B740002DE for ; Tue, 30 Mar 2021 10:21:27 +0000 (UTC) Received: by mail-pg1-f175.google.com with SMTP id i6so4220897pgs.1 for ; Tue, 30 Mar 2021 03:21:29 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=bytedance-com.20150623.gappssmtp.com; s=20150623; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; bh=/XlrhlWJR5eNlgc0Go1Prosg77ePOmL9XygDOYDXK/A=; b=pvDGPRY/fpgSyM+8p66+cRR8V+TEw7yXpz6hylJ2TOHDCX/h1BX+s70mvfl1sceihz 6RybxuB3YVqzfFlaLjiRueifOx11usygHge7HKzTGG5ErzbPF5GcDt8ZlSYjOUALa1ya ElMNWMeE9F53ruq0CC4Ln9+e6lwyh+53D5JG6d/IoYzVZIO5iRG9pso6hY/ZqEner/ns Ilrhg2IwnGmhWJwgJDglkFzaAdNPpvh5KG9tvYt8tibfsNzjmD5zkmTVQd8GbxmG/SjO gtV6tu0RZfuawBgGSu832rWw0EaOj/jj/pHnR3bRi1XRIED5esFL/JCvvGVz1doZUKhd cu9Q== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=/XlrhlWJR5eNlgc0Go1Prosg77ePOmL9XygDOYDXK/A=; b=gZPSFPMtevyNf4c8LQy8Sb+cVTbWp8Bq6qNBNz2nIpL34XZIMqtyuY+u2xklbZJmiJ k6up/OlR/9GyxTqK/dZa4hbqJ8rwESSgkJyIb7lDppS+6vXro5Xp5OU+H/8kGK79uX25 mjILxokq+vTEg9SZVKkTyMwsUFznYzBCMVx+wCvKghMnNGeeJj5h2c4QUM6kQLefljdJ 
From: Muchun Song
To: guro@fb.com, hannes@cmpxchg.org, mhocko@kernel.org, akpm@linux-foundation.org, shakeelb@google.com, vdavydov.dev@gmail.com
Cc: linux-kernel@vger.kernel.org, linux-mm@kvack.org, duanxiongchun@bytedance.com, Muchun Song
Subject: [RFC PATCH 05/15] mm: memcontrol: simplify the logic of objcg pinning memcg
Date: Tue, 30 Mar 2021 18:15:21 +0800
Message-Id: <20210330101531.82752-6-songmuchun@bytedance.com>
In-Reply-To: <20210330101531.82752-1-songmuchun@bytedance.com>
References: <20210330101531.82752-1-songmuchun@bytedance.com>

obj_cgroup_release() and memcg_reparent_objcgs() are serialized by css_set_lock, so obj_cgroup_release() does not need to worry about objcg->memcg being released while it runs. There is therefore no need to pin the memcg before releasing the objcg; remove that pinning logic to simplify the code.

Only two places modify objcg->memcg: the initialization in memcg_online_kmem() and the reparenting in memcg_reparent_objcgs(), and the two cannot run in parallel. So xchg() is unnecessary and WRITE_ONCE() is enough.
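As an aside (not part of the patch), the two writers of objcg->memcg described above can be summarized roughly as follows; this is a condensed sketch based on the existing memcontrol.c code paths, not a verbatim excerpt:

    /* Writer 1: memcg_online_kmem(), before the objcg is published. */
    objcg->memcg = memcg;
    rcu_assign_pointer(memcg->objcg, objcg);

    /* Writer 2: memcg_reparent_objcgs(), serialized by css_set_lock. */
    spin_lock_irq(&css_set_lock);
    list_for_each_entry(iter, &memcg->objcg_list, list)
            WRITE_ONCE(iter->memcg, parent);
    list_splice(&memcg->objcg_list, &parent->objcg_list);
    spin_unlock_irq(&css_set_lock);

Because one writer runs before the objcg is visible and the other runs under css_set_lock, obj_cgroup_release(), which also takes css_set_lock, cannot observe objcg->memcg changing underneath it.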
Signed-off-by: Muchun Song --- mm/memcontrol.c | 20 +++++++------------- 1 file changed, 7 insertions(+), 13 deletions(-) diff --git a/mm/memcontrol.c b/mm/memcontrol.c index 405c9642aac0..fdabe12e9df0 100644 --- a/mm/memcontrol.c +++ b/mm/memcontrol.c @@ -261,7 +261,6 @@ static void obj_cgroup_uncharge_pages(struct obj_cgroup *objcg, static void obj_cgroup_release(struct percpu_ref *ref) { struct obj_cgroup *objcg = container_of(ref, struct obj_cgroup, refcnt); - struct mem_cgroup *memcg; unsigned int nr_bytes; unsigned int nr_pages; unsigned long flags; @@ -291,11 +290,9 @@ static void obj_cgroup_release(struct percpu_ref *ref) nr_pages = nr_bytes >> PAGE_SHIFT; spin_lock_irqsave(&css_set_lock, flags); - memcg = obj_cgroup_memcg(objcg); if (nr_pages) obj_cgroup_uncharge_pages(objcg, nr_pages); list_del(&objcg->list); - mem_cgroup_put(memcg); spin_unlock_irqrestore(&css_set_lock, flags); percpu_ref_exit(ref); @@ -330,17 +327,14 @@ static void memcg_reparent_objcgs(struct mem_cgroup *memcg, spin_lock_irq(&css_set_lock); - /* Move active objcg to the parent's list */ - xchg(&objcg->memcg, parent); - css_get(&parent->css); - list_add(&objcg->list, &parent->objcg_list); + /* 1) Ready to reparent active objcg. */ + list_add(&objcg->list, &memcg->objcg_list); - /* Move already reparented objcgs to the parent's list */ - list_for_each_entry(iter, &memcg->objcg_list, list) { - css_get(&parent->css); - xchg(&iter->memcg, parent); - css_put(&memcg->css); - } + /* 2) Reparent active objcg and already reparented objcgs to parent. */ + list_for_each_entry(iter, &memcg->objcg_list, list) + WRITE_ONCE(iter->memcg, parent); + + /* 3) Move already reparented objcgs to the parent's list */ list_splice(&memcg->objcg_list, &parent->objcg_list); spin_unlock_irq(&css_set_lock); From patchwork Tue Mar 30 10:15:22 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Muchun Song X-Patchwork-Id: 12172205 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-16.6 required=3.0 tests=BAYES_00,DKIM_INVALID, DKIM_SIGNED,HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,URIBL_BLOCKED,USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id C2097C433DB for ; Tue, 30 Mar 2021 10:21:44 +0000 (UTC) Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by mail.kernel.org (Postfix) with ESMTP id 3A2C4619AA for ; Tue, 30 Mar 2021 10:21:44 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org 3A2C4619AA Authentication-Results: mail.kernel.org; dmarc=fail (p=none dis=none) header.from=bytedance.com Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=owner-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix) id CB3A36B0092; Tue, 30 Mar 2021 06:21:43 -0400 (EDT) Received: by kanga.kvack.org (Postfix, from userid 40) id C62E86B0093; Tue, 30 Mar 2021 06:21:43 -0400 (EDT) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id AB5D16B0095; Tue, 30 Mar 2021 06:21:43 -0400 (EDT) X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com (smtprelay0211.hostedemail.com [216.40.44.211]) by kanga.kvack.org (Postfix) with ESMTP id 8CCB46B0092 for ; Tue, 30 Mar 2021 06:21:43 -0400 (EDT) Received: 
from smtpin33.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay04.hostedemail.com (Postfix) with ESMTP id 534D65C0 for ; Tue, 30 Mar 2021 10:21:43 +0000 (UTC) X-FDA: 77976149286.33.2FA09F7 Received: from mail-pf1-f175.google.com (mail-pf1-f175.google.com [209.85.210.175]) by imf13.hostedemail.com (Postfix) with ESMTP id 51D26E0011DD for ; Tue, 30 Mar 2021 10:21:41 +0000 (UTC) Received: by mail-pf1-f175.google.com with SMTP id g15so11857261pfq.3 for ; Tue, 30 Mar 2021 03:21:42 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=bytedance-com.20150623.gappssmtp.com; s=20150623; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; bh=dfWwAKuY+YWb39Txh7PaDKZ2adAgm9Qy+u/WjVIFUvM=; b=vBJ/T9bHr7jxeciveI0Wx03b+++8Eg2FiToVQoXaxTLTm3ghXZDZG6QiGdoMqziAWY Cg2VoGbSmPhizOHcUnltli2yxFctyPO2CHmdURsxRdFRKIIvhfkMW/iockIOrU5/Qz23 nvcG+EwgtiACTww18oKFhVQJrZhDYXdJieKIEF4wiIhDF1ut9FTQct4NCX10BZHmkrDD VLMS9sgYEmmRD+9nVewGSTFcpn5MxQvlensWKOJSTu+qqUTfuCfgPvQyJEjapI29pVcP tbiJBXSQXV4wX8OwkCyqRPdsR45NHxKShoNCafRz3895S+YWI01TcT/kxc7Tm0Hb+204 vl4Q== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=dfWwAKuY+YWb39Txh7PaDKZ2adAgm9Qy+u/WjVIFUvM=; b=eFTSOwu0ygjbVIdWAHgHEyeFeb6IEUNnvzFntmbaf+8c6mZyJz296noBuRLJy8Lm2y 0A8LAsqhJjWs+y1hV0eFJ8yhezWoNcfP94qorwXOU1oP5S0WTzkvLNyWWGNuPE6MszI2 wokObfs35XG80n2IhrcPmSyDyojCrp5FPaW4e3K05uEKdixmmbpDJS+QtghTM0o26pZa dowoFGIJWe+pC9pZ8y9Gw917rQ5n+u7iyscjDdvaVbDco+3LY4gq3hfIqd/Wtm/YrLAM 8pL+Shrv/6rcN9doDSDA7ZTKgkCSgBMmonmmAaA0b457Mn1Kh0MP83gugse3sk9+ZsVH Dn3A== X-Gm-Message-State: AOAM531LRDywD+ZYExZU6JhzTtDnHyC2MR0OsCTt0k4FYnJX3gFVxUM5 wYFutWSUeK8VLDuAY19dT7Y8yw== X-Google-Smtp-Source: ABdhPJxbOW2erq0nW/pOofv8s7ZSV2CLVgdPB/0PJ7/xLpa64488sjYyuXlCaG6bn7p8yhSswbLHgQ== X-Received: by 2002:a63:cc05:: with SMTP id x5mr27248025pgf.254.1617099701738; Tue, 30 Mar 2021 03:21:41 -0700 (PDT) Received: from localhost.localdomain ([2408:8445:ad30:68d8:c87f:ca1b:dc00:4730]) by smtp.gmail.com with ESMTPSA id k10sm202259pfk.205.2021.03.30.03.21.30 (version=TLS1_2 cipher=ECDHE-ECDSA-AES128-GCM-SHA256 bits=128/128); Tue, 30 Mar 2021 03:21:41 -0700 (PDT) From: Muchun Song To: guro@fb.com, hannes@cmpxchg.org, mhocko@kernel.org, akpm@linux-foundation.org, shakeelb@google.com, vdavydov.dev@gmail.com Cc: linux-kernel@vger.kernel.org, linux-mm@kvack.org, duanxiongchun@bytedance.com, Muchun Song Subject: [RFC PATCH 06/15] mm: memcontrol: move the objcg infrastructure out of CONFIG_MEMCG_KMEM Date: Tue, 30 Mar 2021 18:15:22 +0800 Message-Id: <20210330101531.82752-7-songmuchun@bytedance.com> X-Mailer: git-send-email 2.21.0 (Apple Git-122) In-Reply-To: <20210330101531.82752-1-songmuchun@bytedance.com> References: <20210330101531.82752-1-songmuchun@bytedance.com> MIME-Version: 1.0 X-Rspamd-Queue-Id: 51D26E0011DD X-Stat-Signature: yihaim1txsqzwy3wicxx9cqg74nr5iu5 X-Rspamd-Server: rspam02 Received-SPF: none (bytedance.com>: No applicable sender policy available) receiver=imf13; identity=mailfrom; envelope-from=""; helo=mail-pf1-f175.google.com; client-ip=209.85.210.175 X-HE-DKIM-Result: pass/pass X-HE-Tag: 1617099701-349264 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: Because memory allocations pinning memcgs for a long time - it exists at a larger scale and is causing 
recurring problems in the real world: page cache doesn't get reclaimed for a long time, or is used by the second, third, fourth, ... instance of the same job that was restarted into a new cgroup every time. Unreclaimable dying cgroups pile up, waste memory, and make page reclaim very inefficient. We can convert LRU pages and most other raw memcg pins to the objcg direction to fix this problem, and then the page->memcg will always point to an object cgroup pointer. Therefore, the infrastructure of objcg no longer only serves CONFIG_MEMCG_KMEM. In this patch, we move the infrastructure of the objcg out of the scope of the CONFIG_MEMCG_KMEM so that the LRU pages can reuse it to charge pages. We know that the LRU pages are not accounted at the root level. But the page->memcg_data points to the root_mem_cgroup. So the page->memcg_data of the LRU pages always points to a valid pointer. But the root_mem_cgroup dose not have an object cgroup. If we use obj_cgroup APIs to charge the LRU pages, we should set the page->memcg_data to a root object cgroup. So we also allocate an object cgroup for the root_mem_cgroup and introduce root_obj_cgroup to cache its value just like root_mem_cgroup. Signed-off-by: Muchun Song --- include/linux/memcontrol.h | 4 ++- mm/memcontrol.c | 71 +++++++++++++++++++++++++++++++++++++--------- 2 files changed, 60 insertions(+), 15 deletions(-) diff --git a/include/linux/memcontrol.h b/include/linux/memcontrol.h index 6e3283828391..463fc7b78396 100644 --- a/include/linux/memcontrol.h +++ b/include/linux/memcontrol.h @@ -205,7 +205,9 @@ struct memcg_cgwb_frn { struct obj_cgroup { struct percpu_ref refcnt; struct mem_cgroup *memcg; +#ifdef CONFIG_MEMCG_KMEM atomic_t nr_charged_bytes; +#endif union { struct list_head list; struct rcu_head rcu; @@ -303,9 +305,9 @@ struct mem_cgroup { #ifdef CONFIG_MEMCG_KMEM int kmemcg_id; enum memcg_kmem_state kmem_state; +#endif struct obj_cgroup __rcu *objcg; struct list_head objcg_list; /* list of inherited objcgs */ -#endif MEMCG_PADDING(_pad2_); diff --git a/mm/memcontrol.c b/mm/memcontrol.c index fdabe12e9df0..0107f23e7035 100644 --- a/mm/memcontrol.c +++ b/mm/memcontrol.c @@ -75,6 +75,7 @@ struct cgroup_subsys memory_cgrp_subsys __read_mostly; EXPORT_SYMBOL(memory_cgrp_subsys); struct mem_cgroup *root_mem_cgroup __read_mostly; +static struct obj_cgroup *root_obj_cgroup __read_mostly; /* Active memory cgroup to use from an interrupt context */ DEFINE_PER_CPU(struct mem_cgroup *, int_active_memcg); @@ -252,9 +253,14 @@ struct cgroup_subsys_state *vmpressure_to_css(struct vmpressure *vmpr) return &container_of(vmpr, struct mem_cgroup, vmpressure)->css; } -#ifdef CONFIG_MEMCG_KMEM extern spinlock_t css_set_lock; +static inline bool obj_cgroup_is_root(struct obj_cgroup *objcg) +{ + return objcg == root_obj_cgroup; +} + +#ifdef CONFIG_MEMCG_KMEM static void obj_cgroup_uncharge_pages(struct obj_cgroup *objcg, unsigned int nr_pages); @@ -298,6 +304,20 @@ static void obj_cgroup_release(struct percpu_ref *ref) percpu_ref_exit(ref); kfree_rcu(objcg, rcu); } +#else +static void obj_cgroup_release(struct percpu_ref *ref) +{ + struct obj_cgroup *objcg = container_of(ref, struct obj_cgroup, refcnt); + unsigned long flags; + + spin_lock_irqsave(&css_set_lock, flags); + list_del(&objcg->list); + spin_unlock_irqrestore(&css_set_lock, flags); + + percpu_ref_exit(ref); + kfree_rcu(objcg, rcu); +} +#endif static struct obj_cgroup *obj_cgroup_alloc(void) { @@ -318,10 +338,14 @@ static struct obj_cgroup *obj_cgroup_alloc(void) return objcg; } -static void 
memcg_reparent_objcgs(struct mem_cgroup *memcg, - struct mem_cgroup *parent) +static void memcg_reparent_objcgs(struct mem_cgroup *memcg) { struct obj_cgroup *objcg, *iter; + struct mem_cgroup *parent; + + parent = parent_mem_cgroup(memcg); + if (!parent) + parent = root_mem_cgroup; objcg = rcu_replace_pointer(memcg->objcg, NULL, true); @@ -342,6 +366,27 @@ static void memcg_reparent_objcgs(struct mem_cgroup *memcg, percpu_ref_kill(&objcg->refcnt); } +static int memcg_obj_cgroup_alloc(struct mem_cgroup *memcg) +{ + struct obj_cgroup *objcg; + + objcg = obj_cgroup_alloc(); + if (!objcg) + return -ENOMEM; + + objcg->memcg = memcg; + rcu_assign_pointer(memcg->objcg, objcg); + + return 0; +} + +static void memcg_obj_cgroup_free(struct mem_cgroup *memcg) +{ + if (unlikely(memcg->objcg)) + memcg_reparent_objcgs(memcg); +} + +#ifdef CONFIG_MEMCG_KMEM /* * This will be used as a shrinker list's index. * The main reason for not using cgroup id for this: @@ -3648,7 +3693,6 @@ static void memcg_flush_percpu_vmevents(struct mem_cgroup *memcg) #ifdef CONFIG_MEMCG_KMEM static int memcg_online_kmem(struct mem_cgroup *memcg) { - struct obj_cgroup *objcg; int memcg_id; if (cgroup_memory_nokmem) @@ -3661,14 +3705,6 @@ static int memcg_online_kmem(struct mem_cgroup *memcg) if (memcg_id < 0) return memcg_id; - objcg = obj_cgroup_alloc(); - if (!objcg) { - memcg_free_cache_id(memcg_id); - return -ENOMEM; - } - objcg->memcg = memcg; - rcu_assign_pointer(memcg->objcg, objcg); - static_branch_enable(&memcg_kmem_enabled_key); memcg->kmemcg_id = memcg_id; @@ -3692,7 +3728,7 @@ static void memcg_offline_kmem(struct mem_cgroup *memcg) if (!parent) parent = root_mem_cgroup; - memcg_reparent_objcgs(memcg, parent); + memcg_reparent_objcgs(memcg); kmemcg_id = memcg->kmemcg_id; BUG_ON(kmemcg_id < 0); @@ -5192,6 +5228,7 @@ static void __mem_cgroup_free(struct mem_cgroup *memcg) static void mem_cgroup_free(struct mem_cgroup *memcg) { + memcg_obj_cgroup_free(memcg); memcg_wb_domain_exit(memcg); /* * Flush percpu vmstats and vmevents to guarantee the value correctness @@ -5242,6 +5279,9 @@ static struct mem_cgroup *mem_cgroup_alloc(void) if (memcg_wb_domain_init(memcg, GFP_KERNEL)) goto fail; + if (memcg_obj_cgroup_alloc(memcg)) + goto free_wb; + INIT_WORK(&memcg->high_work, high_work_func); INIT_LIST_HEAD(&memcg->oom_notify); mutex_init(&memcg->thresholds_lock); @@ -5252,8 +5292,8 @@ static struct mem_cgroup *mem_cgroup_alloc(void) memcg->socket_pressure = jiffies; #ifdef CONFIG_MEMCG_KMEM memcg->kmemcg_id = -1; - INIT_LIST_HEAD(&memcg->objcg_list); #endif + INIT_LIST_HEAD(&memcg->objcg_list); #ifdef CONFIG_CGROUP_WRITEBACK INIT_LIST_HEAD(&memcg->cgwb_list); for (i = 0; i < MEMCG_CGWB_FRN_CNT; i++) @@ -5267,6 +5307,8 @@ static struct mem_cgroup *mem_cgroup_alloc(void) #endif idr_replace(&mem_cgroup_idr, memcg, memcg->id.id); return memcg; +free_wb: + memcg_wb_domain_exit(memcg); fail: mem_cgroup_id_remove(memcg); __mem_cgroup_free(memcg); @@ -5304,6 +5346,7 @@ mem_cgroup_css_alloc(struct cgroup_subsys_state *parent_css) page_counter_init(&memcg->tcpmem, NULL); root_mem_cgroup = memcg; + root_obj_cgroup = memcg->objcg; return &memcg->css; } From patchwork Tue Mar 30 10:15:23 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Muchun Song X-Patchwork-Id: 12172207 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-16.6 required=3.0 
([2408:8445:ad30:68d8:c87f:ca1b:dc00:4730]) by smtp.gmail.com with ESMTPSA id k10sm202259pfk.205.2021.03.30.03.21.42 (version=TLS1_2 cipher=ECDHE-ECDSA-AES128-GCM-SHA256 bits=128/128); Tue, 30 Mar 2021 03:21:54 -0700 (PDT) From: Muchun Song To: guro@fb.com, hannes@cmpxchg.org, mhocko@kernel.org, akpm@linux-foundation.org, shakeelb@google.com, vdavydov.dev@gmail.com Cc: linux-kernel@vger.kernel.org, linux-mm@kvack.org, duanxiongchun@bytedance.com, Muchun Song Subject: [RFC PATCH 07/15] mm: memcontrol: introduce compact_lock_page_lruvec_irqsave Date: Tue, 30 Mar 2021 18:15:23 +0800 Message-Id: <20210330101531.82752-8-songmuchun@bytedance.com> X-Mailer: git-send-email 2.21.0 (Apple Git-122) In-Reply-To: <20210330101531.82752-1-songmuchun@bytedance.com> References: <20210330101531.82752-1-songmuchun@bytedance.com> MIME-Version: 1.0 X-Stat-Signature: qyn3mnootq9zs3xd41u9m9kfcej3nwna X-Rspamd-Server: rspam04 X-Rspamd-Queue-Id: 1C576E0011C5 Received-SPF: none (bytedance.com>: No applicable sender policy available) receiver=imf21; identity=mailfrom; envelope-from=""; helo=mail-pl1-f182.google.com; client-ip=209.85.214.182 X-HE-DKIM-Result: pass/pass X-HE-Tag: 1617099713-558630 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: If we reuse the objcg APIs to charge LRU pages, the page_memcg() can be changed when the LRU pages reparented. In this case, we need to acquire the new lruvec lock. lruvec = mem_cgroup_page_lruvec(page); // The page is reparented. compact_lock_irqsave(&lruvec->lru_lock, &flags, cc); // Acquired the wrong lruvec lock and need to retry. But compact_lock_irqsave() only take lruvec lock as the parameter, we cannot aware this change. If it can take the page as parameter to acquire the lruvec lock. When the page memcg is changed, we can use the page_memcg() detect whether we need to reacquire the new lruvec lock. So compact_lock_irqsave() is not suitable for us. Similar to lock_page_lruvec_irqsave(), introduce compact_lock_page_lruvec_irqsave() to acquire the lruvec lock in the compaction routine. Signed-off-by: Muchun Song --- mm/compaction.c | 29 +++++++++++++++++++++++++---- 1 file changed, 25 insertions(+), 4 deletions(-) diff --git a/mm/compaction.c b/mm/compaction.c index 8b8fc279766e..d6b7d5f90fce 100644 --- a/mm/compaction.c +++ b/mm/compaction.c @@ -511,6 +511,29 @@ static bool compact_lock_irqsave(spinlock_t *lock, unsigned long *flags, return true; } +static struct lruvec * +compact_lock_page_lruvec_irqsave(struct page *page, unsigned long *flags, + struct compact_control *cc) +{ + struct lruvec *lruvec; + + lruvec = mem_cgroup_page_lruvec(page); + + /* Track if the lock is contended in async mode */ + if (cc->mode == MIGRATE_ASYNC && !cc->contended) { + if (spin_trylock_irqsave(&lruvec->lru_lock, *flags)) + goto out; + + cc->contended = true; + } + + spin_lock_irqsave(&lruvec->lru_lock, *flags); +out: + lruvec_memcg_debug(lruvec, page); + + return lruvec; +} + /* * Compaction requires the taking of some coarse locks that are potentially * very heavily contended. 
The lock should be periodically unlocked to avoid @@ -1001,10 +1024,8 @@ isolate_migratepages_block(struct compact_control *cc, unsigned long low_pfn, if (locked) unlock_page_lruvec_irqrestore(locked, flags); - compact_lock_irqsave(&lruvec->lru_lock, &flags, cc); - locked = lruvec; - - lruvec_memcg_debug(lruvec, page); + locked = compact_lock_page_lruvec_irqsave(page, &flags, cc); + lruvec = locked; /* Try get exclusive access under lock */ if (!skip_updated) { From patchwork Tue Mar 30 10:15:24 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Muchun Song X-Patchwork-Id: 12172209 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-16.6 required=3.0 tests=BAYES_00,DKIM_INVALID, DKIM_SIGNED,HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,URIBL_BLOCKED,USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 14AD0C433DB for ; Tue, 30 Mar 2021 10:22:12 +0000 (UTC) Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by mail.kernel.org (Postfix) with ESMTP id 65521619AE for ; Tue, 30 Mar 2021 10:22:11 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org 65521619AE Authentication-Results: mail.kernel.org; dmarc=fail (p=none dis=none) header.from=bytedance.com Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=owner-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix) id E38866B0098; Tue, 30 Mar 2021 06:22:10 -0400 (EDT) Received: by kanga.kvack.org (Postfix, from userid 40) id DC1696B0099; Tue, 30 Mar 2021 06:22:10 -0400 (EDT) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id C14DE6B009A; Tue, 30 Mar 2021 06:22:10 -0400 (EDT) X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com (smtprelay0002.hostedemail.com [216.40.44.2]) by kanga.kvack.org (Postfix) with ESMTP id A23DA6B0098 for ; Tue, 30 Mar 2021 06:22:10 -0400 (EDT) Received: from smtpin30.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay04.hostedemail.com (Postfix) with ESMTP id 6483F3644 for ; Tue, 30 Mar 2021 10:22:10 +0000 (UTC) X-FDA: 77976150420.30.05943F7 Received: from mail-pf1-f174.google.com (mail-pf1-f174.google.com [209.85.210.174]) by imf25.hostedemail.com (Postfix) with ESMTP id 2882E6000106 for ; Tue, 30 Mar 2021 10:22:08 +0000 (UTC) Received: by mail-pf1-f174.google.com with SMTP id j25so11853560pfe.2 for ; Tue, 30 Mar 2021 03:22:09 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=bytedance-com.20150623.gappssmtp.com; s=20150623; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; bh=0xlRDpcZM1nqKykm7nw5VxdMkKGKLsb46vHmKtuxpiM=; b=jEPdFratXAzCkzUOa84wp5/kJojfK6fZ0lW0cWzAguwkNwiO6PFV16bfcRUYxKnk5C r30JBBFLuqpQMgjBw7GzF+S5nVZ6vciZXxHlNfsXt5GeMQFPgMCKL+FQOsGpteBHcXIo EKYUQWVwq6qIjV/4J6XfHy/iRbQlEcYLekAM4g+PzU3TVjc3Owr78cYkMEos50wkr6id B1H66N4G+6VgwxJhy4EJt+i2+F0+iP//ZhNy/X//Bhx1dm8GgM9mxJBDxuIblDyfbH/J xGVPasfw2stMr9Yt52zkbk18wDq8JY59dNkC5mI5IaWo/bxegqRNxqIJSTOerMKHowww IRAA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; 
bh=0xlRDpcZM1nqKykm7nw5VxdMkKGKLsb46vHmKtuxpiM=; b=iVuTxCoHVB7ToOasFHangK+Q99efmBAnJQw2kp5fhOJd6j3OWbFuXKWvOwOwhScG8S tVXn2cONb6f/bcHED92iPTOB0PWCQyYPPylHJkaCRVwfJ2KAtUagHSQyLUfM4ulaKwfS B+hu6FPVhCCxG/Fevlu5pflKrNJd+NNoN/Yf6R857RCurm9PBvrJvjI8QOgl1QuM7jPh ZhfWhGL5Pqs/Uc843Ss1puMqbIG5blugHsnwOZ1lqMSu8/zKbaoAByMCUaRL/+1tODh7 1pESRGlrdm+Y5hRxU08LKrdzhkpx9l1CJJAwsK/myvAjDSp+FseQmda5kNKEcSv9l550 zotg== X-Gm-Message-State: AOAM533w6XSu7DD/5+VETzgsw4Sab5v5cnHt/9pfECJIvmgBoIGvKTl/ fUUHvHqWuAURnEk3djcPXyW71A== X-Google-Smtp-Source: ABdhPJyFtmrXkqseL07KUSdajLU6owbhX5LiS593/reozQof0dYU9pkStvryvC9yzP80dqK/s7/D4A== X-Received: by 2002:a62:ee0c:0:b029:214:7a61:6523 with SMTP id e12-20020a62ee0c0000b02902147a616523mr28827776pfi.59.1617099728977; Tue, 30 Mar 2021 03:22:08 -0700 (PDT) Received: from localhost.localdomain ([2408:8445:ad30:68d8:c87f:ca1b:dc00:4730]) by smtp.gmail.com with ESMTPSA id k10sm202259pfk.205.2021.03.30.03.21.56 (version=TLS1_2 cipher=ECDHE-ECDSA-AES128-GCM-SHA256 bits=128/128); Tue, 30 Mar 2021 03:22:08 -0700 (PDT) From: Muchun Song To: guro@fb.com, hannes@cmpxchg.org, mhocko@kernel.org, akpm@linux-foundation.org, shakeelb@google.com, vdavydov.dev@gmail.com Cc: linux-kernel@vger.kernel.org, linux-mm@kvack.org, duanxiongchun@bytedance.com, Muchun Song Subject: [RFC PATCH 08/15] mm: memcontrol: make lruvec lock safe when the LRU pages reparented Date: Tue, 30 Mar 2021 18:15:24 +0800 Message-Id: <20210330101531.82752-9-songmuchun@bytedance.com> X-Mailer: git-send-email 2.21.0 (Apple Git-122) In-Reply-To: <20210330101531.82752-1-songmuchun@bytedance.com> References: <20210330101531.82752-1-songmuchun@bytedance.com> MIME-Version: 1.0 X-Rspamd-Server: rspam05 X-Rspamd-Queue-Id: 2882E6000106 X-Stat-Signature: cuo51i6bwtpa3kmzqo8jpmqr3zjhqs5c Received-SPF: none (bytedance.com>: No applicable sender policy available) receiver=imf25; identity=mailfrom; envelope-from=""; helo=mail-pf1-f174.google.com; client-ip=209.85.210.174 X-HE-DKIM-Result: pass/pass X-HE-Tag: 1617099728-2016 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: The diagram below shows how to make the page lruvec lock safe when the LRU pages reparented. lock_page_lruvec(page) retry: lruvec = mem_cgroup_page_lruvec(page); // The page is reparented at this time. spin_lock(&lruvec->lru_lock); if (unlikely(lruvec_memcg(lruvec) != page_memcg(page))) // Acquired the wrong lruvec lock and need to retry. // Because this page is on the parent memcg lruvec list. goto retry; // If we reach here, it means that page_memcg(page) is stable. memcg_reparent_objcgs(memcg) // lruvec belongs to memcg and lruvec_parent belongs to parent memcg. spin_lock(&lruvec->lru_lock); spin_lock(&lruvec_parent->lru_lock); // Move all the pages from the lruvec list to the parent lruvec list. spin_unlock(&lruvec_parent->lru_lock); spin_unlock(&lruvec->lru_lock); After we acquire the lruvec lock, we need to check whether the page is reparented. If so, we need to reacquire the new lruvec lock. On the routine of the LRU pages reparenting, we will also acquire the lruvec lock (Will be implemented in the later patch). So page_memcg() cannot be changed when we hold the lruvec lock. Since lruvec_memcg(lruvec) is always equal to page_memcg(page) after we hold the lruvec lock, lruvec_memcg_debug() check is pointless. So remove it. This is a preparation for reparenting the LRU pages. 
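For reference, the retry pattern shared by lock_page_lruvec(), lock_page_lruvec_irq()
and lock_page_lruvec_irqsave() boils down to the following simplified sketch (irq
handling omitted; the real helpers in the diff below only differ in the spin_lock
flavour they use):

    struct lruvec *lock_page_lruvec(struct page *page)
    {
            struct lruvec *lruvec;

            rcu_read_lock();
    retry:
            lruvec = mem_cgroup_page_lruvec(page);
            spin_lock(&lruvec->lru_lock);
            if (unlikely(lruvec_memcg(lruvec) != page_memcg(page))) {
                    /* The page was reparented under us; we locked the old lruvec. */
                    spin_unlock(&lruvec->lru_lock);
                    goto retry;
            }
            /*
             * Holding lru_lock disables preemption, which keeps page_memcg()
             * stable, so the explicit RCU read-side section can end here.
             */
            rcu_read_unlock();
            return lruvec;
    }
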
Signed-off-by: Muchun Song --- include/linux/memcontrol.h | 16 +++---------- mm/compaction.c | 13 ++++++++++- mm/memcontrol.c | 56 +++++++++++++++++++++++++++++----------------- mm/swap.c | 5 +++++ 4 files changed, 56 insertions(+), 34 deletions(-) diff --git a/include/linux/memcontrol.h b/include/linux/memcontrol.h index 463fc7b78396..5d7c8a060843 100644 --- a/include/linux/memcontrol.h +++ b/include/linux/memcontrol.h @@ -735,7 +735,9 @@ static inline struct lruvec *mem_cgroup_lruvec(struct mem_cgroup *memcg, * mem_cgroup_page_lruvec - return lruvec for isolating/putting an LRU page * @page: the page * - * This function relies on page->mem_cgroup being stable. + * The lruvec can be changed to its parent lruvec when the page reparented. + * The caller need to recheck if it cares about this change (just like + * lock_page_lruvec() does). */ static inline struct lruvec *mem_cgroup_page_lruvec(struct page *page) { @@ -771,14 +773,6 @@ struct lruvec *lock_page_lruvec_irq(struct page *page); struct lruvec *lock_page_lruvec_irqsave(struct page *page, unsigned long *flags); -#ifdef CONFIG_DEBUG_VM -void lruvec_memcg_debug(struct lruvec *lruvec, struct page *page); -#else -static inline void lruvec_memcg_debug(struct lruvec *lruvec, struct page *page) -{ -} -#endif - static inline struct mem_cgroup *mem_cgroup_from_css(struct cgroup_subsys_state *css){ return css ? container_of(css, struct mem_cgroup, css) : NULL; @@ -1500,10 +1494,6 @@ static inline void count_memcg_event_mm(struct mm_struct *mm, enum vm_event_item idx) { } - -static inline void lruvec_memcg_debug(struct lruvec *lruvec, struct page *page) -{ -} #endif /* CONFIG_MEMCG */ static inline void __inc_lruvec_kmem_state(void *p, enum node_stat_item idx) diff --git a/mm/compaction.c b/mm/compaction.c index d6b7d5f90fce..b0ad635dd576 100644 --- a/mm/compaction.c +++ b/mm/compaction.c @@ -517,6 +517,8 @@ compact_lock_page_lruvec_irqsave(struct page *page, unsigned long *flags, { struct lruvec *lruvec; + rcu_read_lock(); +retry: lruvec = mem_cgroup_page_lruvec(page); /* Track if the lock is contended in async mode */ @@ -529,7 +531,16 @@ compact_lock_page_lruvec_irqsave(struct page *page, unsigned long *flags, spin_lock_irqsave(&lruvec->lru_lock, *flags); out: - lruvec_memcg_debug(lruvec, page); + if (unlikely(lruvec_memcg(lruvec) != page_memcg(page))) { + spin_unlock_irqrestore(&lruvec->lru_lock, *flags); + goto retry; + } + + /* + * Preemption is disabled in the internal of spin_lock, which can serve + * as RCU read-side critical sections. + */ + rcu_read_unlock(); return lruvec; } diff --git a/mm/memcontrol.c b/mm/memcontrol.c index 0107f23e7035..2592e2b072ef 100644 --- a/mm/memcontrol.c +++ b/mm/memcontrol.c @@ -1314,23 +1314,6 @@ int mem_cgroup_scan_tasks(struct mem_cgroup *memcg, return ret; } -#ifdef CONFIG_DEBUG_VM -void lruvec_memcg_debug(struct lruvec *lruvec, struct page *page) -{ - struct mem_cgroup *memcg; - - if (mem_cgroup_disabled()) - return; - - memcg = page_memcg(page); - - if (!memcg) - VM_BUG_ON_PAGE(lruvec_memcg(lruvec) != root_mem_cgroup, page); - else - VM_BUG_ON_PAGE(lruvec_memcg(lruvec) != memcg, page); -} -#endif - /** * lock_page_lruvec - lock and return lruvec for a given page. 
* @page: the page @@ -1345,10 +1328,21 @@ struct lruvec *lock_page_lruvec(struct page *page) { struct lruvec *lruvec; + rcu_read_lock(); +retry: lruvec = mem_cgroup_page_lruvec(page); spin_lock(&lruvec->lru_lock); - lruvec_memcg_debug(lruvec, page); + if (unlikely(lruvec_memcg(lruvec) != page_memcg(page))) { + spin_unlock(&lruvec->lru_lock); + goto retry; + } + + /* + * Preemption is disabled in the internal of spin_lock, which can serve + * as RCU read-side critical sections. + */ + rcu_read_unlock(); return lruvec; } @@ -1357,10 +1351,21 @@ struct lruvec *lock_page_lruvec_irq(struct page *page) { struct lruvec *lruvec; + rcu_read_lock(); +retry: lruvec = mem_cgroup_page_lruvec(page); spin_lock_irq(&lruvec->lru_lock); - lruvec_memcg_debug(lruvec, page); + if (unlikely(lruvec_memcg(lruvec) != page_memcg(page))) { + spin_unlock_irq(&lruvec->lru_lock); + goto retry; + } + + /* + * Preemption is disabled in the internal of spin_lock, which can serve + * as RCU read-side critical sections. + */ + rcu_read_unlock(); return lruvec; } @@ -1369,10 +1374,21 @@ struct lruvec *lock_page_lruvec_irqsave(struct page *page, unsigned long *flags) { struct lruvec *lruvec; + rcu_read_lock(); +retry: lruvec = mem_cgroup_page_lruvec(page); spin_lock_irqsave(&lruvec->lru_lock, *flags); - lruvec_memcg_debug(lruvec, page); + if (unlikely(lruvec_memcg(lruvec) != page_memcg(page))) { + spin_unlock_irqrestore(&lruvec->lru_lock, *flags); + goto retry; + } + + /* + * Preemption is disabled in the internal of spin_lock, which can serve + * as RCU read-side critical sections. + */ + rcu_read_unlock(); return lruvec; } diff --git a/mm/swap.c b/mm/swap.c index af695acb7413..044c240d8873 100644 --- a/mm/swap.c +++ b/mm/swap.c @@ -300,6 +300,11 @@ void lru_note_cost(struct lruvec *lruvec, bool file, unsigned int nr_pages) void lru_note_cost_page(struct page *page) { + /* + * The rcu read lock is held by the caller, so we do not need to + * care about the lruvec returned by mem_cgroup_page_lruvec() being + * released. 
+ */ lru_note_cost(mem_cgroup_page_lruvec(page), page_is_file_lru(page), thp_nr_pages(page)); } From patchwork Tue Mar 30 10:15:25 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Muchun Song X-Patchwork-Id: 12172215 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-16.6 required=3.0 tests=BAYES_00,DKIM_INVALID, DKIM_SIGNED,HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,URIBL_BLOCKED,USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id AB979C433DB for ; Tue, 30 Mar 2021 10:22:24 +0000 (UTC) Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by mail.kernel.org (Postfix) with ESMTP id 3943B619AE for ; Tue, 30 Mar 2021 10:22:24 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org 3943B619AE Authentication-Results: mail.kernel.org; dmarc=fail (p=none dis=none) header.from=bytedance.com Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=owner-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix) id B52D46B009A; Tue, 30 Mar 2021 06:22:23 -0400 (EDT) Received: by kanga.kvack.org (Postfix, from userid 40) id B02466B009B; Tue, 30 Mar 2021 06:22:23 -0400 (EDT) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 9CC8F6B009C; Tue, 30 Mar 2021 06:22:23 -0400 (EDT) X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com (smtprelay0023.hostedemail.com [216.40.44.23]) by kanga.kvack.org (Postfix) with ESMTP id 7C86B6B009A for ; Tue, 30 Mar 2021 06:22:23 -0400 (EDT) Received: from smtpin29.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay01.hostedemail.com (Postfix) with ESMTP id 2DEA3180AD81A for ; Tue, 30 Mar 2021 10:22:23 +0000 (UTC) X-FDA: 77976150966.29.08CE514 Received: from mail-pl1-f169.google.com (mail-pl1-f169.google.com [209.85.214.169]) by imf18.hostedemail.com (Postfix) with ESMTP id 2AF8A2000248 for ; Tue, 30 Mar 2021 10:22:23 +0000 (UTC) Received: by mail-pl1-f169.google.com with SMTP id w11so5926945ply.6 for ; Tue, 30 Mar 2021 03:22:22 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=bytedance-com.20150623.gappssmtp.com; s=20150623; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; bh=trRXhcI2buKhbiPF1Cmv0EYAvkXYBR9MP7Cwm4odxNo=; b=rMUVCwGAerlhCaFyhFnoT84igkNv89tF5s/Xe0nynB9QECidYiZf2ShkllrUupF606 Mrmyso9d8z9AXiRcUPM/d6NRAMGOrjsZmhs8xnN9oOgoPwfDFT0kJy+DnN8sg7sJrRcY bCaiZR/6tgjJE548y00tTiC/3BT0f6Imlvt5jLtOtwtnX+J7Dfr8HN66C4G3kif+Q6PN xF3/5DioZzFrjqJGHNWfqGsUt4VXCNsbltDB+YuXP3ekSblX7knFfsP5IZ3MPeFPOHR3 vH+qTkhvVhP7azNxJCADYZrIG+KDoS0FJz4MVHojlMdMEYvb4USZDWnxzFBalh/fj7B1 4wKQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=trRXhcI2buKhbiPF1Cmv0EYAvkXYBR9MP7Cwm4odxNo=; b=fqaPN4XWb+iUoIo6dVng2MHcouGWM/yQIhpVXG/wnmbCR2UgQMFNQQrhVl/FeYyBKa Uze+HQmfE58h1NoUv+qfRpe5wjz87UOxM2WVGSEodbKJStdrJCg6XXkZN+Obi/yHOxnV YSKj9JQR3nAOqdu43It/YXObKmpFPwEJF5ID5BVv/YChEpr/K3wvU5yx2HeOhC7G5RtC nlGJ7+AYkoYh/jnT2ZIlm7DdSdLVmhCpeOHXHY12ZBRWuDgBVjyLQJJLCOKRnmhNmFz5 
0kaU/8DTCM5OAyvpAbSQm7SRtEXsPRbjgSkmcS05xhoOgPfQK9MiN07Crfc1fxVX3QEZ xh3g== X-Gm-Message-State: AOAM53032+Nvz1DBTd6F8YMA5L47GD98t1m6Kt77IfYkH7ruFy8N+r9D QTANX/kJ8w4H3xt7OZLwSUsdyg== X-Google-Smtp-Source: ABdhPJwYg4/GbAWtPjhDVOW8ONgYXJY5VA50X+1eJCxD6Fvmf/6UjCqsPJMH3OVPtERtfB8bOO/MCA== X-Received: by 2002:a17:902:e889:b029:e7:1490:9da5 with SMTP id w9-20020a170902e889b02900e714909da5mr26905267plg.20.1617099741663; Tue, 30 Mar 2021 03:22:21 -0700 (PDT) Received: from localhost.localdomain ([2408:8445:ad30:68d8:c87f:ca1b:dc00:4730]) by smtp.gmail.com with ESMTPSA id k10sm202259pfk.205.2021.03.30.03.22.09 (version=TLS1_2 cipher=ECDHE-ECDSA-AES128-GCM-SHA256 bits=128/128); Tue, 30 Mar 2021 03:22:21 -0700 (PDT) From: Muchun Song To: guro@fb.com, hannes@cmpxchg.org, mhocko@kernel.org, akpm@linux-foundation.org, shakeelb@google.com, vdavydov.dev@gmail.com Cc: linux-kernel@vger.kernel.org, linux-mm@kvack.org, duanxiongchun@bytedance.com, Muchun Song Subject: [RFC PATCH 09/15] mm: thp: introduce lock/unlock_split_queue{_irqsave}() Date: Tue, 30 Mar 2021 18:15:25 +0800 Message-Id: <20210330101531.82752-10-songmuchun@bytedance.com> X-Mailer: git-send-email 2.21.0 (Apple Git-122) In-Reply-To: <20210330101531.82752-1-songmuchun@bytedance.com> References: <20210330101531.82752-1-songmuchun@bytedance.com> MIME-Version: 1.0 X-Rspamd-Server: rspam03 X-Rspamd-Queue-Id: 2AF8A2000248 X-Stat-Signature: mi4qdrxeqbbnn6egw5oph45jz5kwj9yt Received-SPF: none (bytedance.com>: No applicable sender policy available) receiver=imf18; identity=mailfrom; envelope-from=""; helo=mail-pl1-f169.google.com; client-ip=209.85.214.169 X-HE-DKIM-Result: pass/pass X-HE-Tag: 1617099743-919567 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: We should make thp deferred split queue lock safe when LRU pages reparented. Similar to lock_page_lruvec{_irqsave, _irq}(), we introduce lock/unlock_split_queue{_irqsave}() to make the deferred split queue lock easier to be reparented. And in the next patch, we can use a similar approach (just like lruvec lock did) to make thp deferred split queue lock safe when the LRU pages reparented. 
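As a rough before/after sketch of the caller-side change this enables (taken from
the conversions in the diff below):

    /* Before: the queue is looked up and locked in two separate steps. */
    struct deferred_split *ds_queue = get_deferred_split_queue(page);

    spin_lock_irqsave(&ds_queue->split_queue_lock, flags);
    /* ... manipulate the deferred split queue ... */
    spin_unlock_irqrestore(&ds_queue->split_queue_lock, flags);

    /* After: lookup and locking are combined in one helper. */
    struct deferred_split *ds_queue;

    ds_queue = lock_split_queue_irqsave(page, &flags);
    /* ... manipulate the deferred split queue ... */
    unlock_split_queue_irqrestore(ds_queue, flags);
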
Signed-off-by: Muchun Song --- mm/huge_memory.c | 97 +++++++++++++++++++++++++++++++++++++++++++------------- 1 file changed, 75 insertions(+), 22 deletions(-) diff --git a/mm/huge_memory.c b/mm/huge_memory.c index 395c75111d33..186dc11e8992 100644 --- a/mm/huge_memory.c +++ b/mm/huge_memory.c @@ -486,25 +486,76 @@ pmd_t maybe_pmd_mkwrite(pmd_t pmd, struct vm_area_struct *vma) } #ifdef CONFIG_MEMCG -static inline struct deferred_split *get_deferred_split_queue(struct page *page) +static inline struct mem_cgroup *split_queue_to_memcg(struct deferred_split *queue) { - struct mem_cgroup *memcg = page_memcg(compound_head(page)); - struct pglist_data *pgdat = NODE_DATA(page_to_nid(page)); + return container_of(queue, struct mem_cgroup, deferred_split_queue); +} + +static struct deferred_split *lock_split_queue(struct page *page) +{ + struct deferred_split *queue; + struct mem_cgroup *memcg; + memcg = page_memcg(compound_head(page)); if (memcg) - return &memcg->deferred_split_queue; + queue = &memcg->deferred_split_queue; else - return &pgdat->deferred_split_queue; + queue = &NODE_DATA(page_to_nid(page))->deferred_split_queue; + spin_lock(&queue->split_queue_lock); + + return queue; +} + +static struct deferred_split *lock_split_queue_irqsave(struct page *page, + unsigned long *flags) +{ + struct deferred_split *queue; + struct mem_cgroup *memcg; + + memcg = page_memcg(compound_head(page)); + if (memcg) + queue = &memcg->deferred_split_queue; + else + queue = &NODE_DATA(page_to_nid(page))->deferred_split_queue; + spin_lock_irqsave(&queue->split_queue_lock, *flags); + + return queue; } #else -static inline struct deferred_split *get_deferred_split_queue(struct page *page) +static struct deferred_split *lock_split_queue(struct page *page) { - struct pglist_data *pgdat = NODE_DATA(page_to_nid(page)); + struct deferred_split *queue; - return &pgdat->deferred_split_queue; + queue = &NODE_DATA(page_to_nid(page))->deferred_split_queue; + spin_lock(&queue->split_queue_lock); + + return queue; +} + +static struct deferred_split *lock_split_queue_irqsave(struct page *page, + unsigned long *flags) + +{ + struct deferred_split *queue; + + queue = &NODE_DATA(page_to_nid(page))->deferred_split_queue; + spin_lock_irqsave(&queue->split_queue_lock, *flags); + + return queue; } #endif +static inline void unlock_split_queue(struct deferred_split *queue) +{ + spin_unlock(&queue->split_queue_lock); +} + +static inline void unlock_split_queue_irqrestore(struct deferred_split *queue, + unsigned long flags) +{ + spin_unlock_irqrestore(&queue->split_queue_lock, flags); +} + void prep_transhuge_page(struct page *page) { /* @@ -2668,7 +2719,7 @@ bool can_split_huge_page(struct page *page, int *pextra_pins) int split_huge_page_to_list(struct page *page, struct list_head *list) { struct page *head = compound_head(page); - struct deferred_split *ds_queue = get_deferred_split_queue(head); + struct deferred_split *ds_queue; struct anon_vma *anon_vma = NULL; struct address_space *mapping = NULL; int count, mapcount, extra_pins, ret; @@ -2747,7 +2798,7 @@ int split_huge_page_to_list(struct page *page, struct list_head *list) } /* Prevent deferred_split_scan() touching ->_refcount */ - spin_lock(&ds_queue->split_queue_lock); + ds_queue = lock_split_queue(head); count = page_count(head); mapcount = total_mapcount(head); if (!mapcount && page_ref_freeze(head, 1 + extra_pins)) { @@ -2755,7 +2806,7 @@ int split_huge_page_to_list(struct page *page, struct list_head *list) ds_queue->split_queue_len--; 
list_del(page_deferred_list(head)); } - spin_unlock(&ds_queue->split_queue_lock); + unlock_split_queue(ds_queue); if (mapping) { int nr = thp_nr_pages(head); @@ -2778,7 +2829,7 @@ int split_huge_page_to_list(struct page *page, struct list_head *list) dump_page(page, "total_mapcount(head) > 0"); BUG(); } - spin_unlock(&ds_queue->split_queue_lock); + unlock_split_queue(ds_queue); fail: if (mapping) xa_unlock(&mapping->i_pages); local_irq_enable(); @@ -2800,24 +2851,21 @@ fail: if (mapping) void free_transhuge_page(struct page *page) { - struct deferred_split *ds_queue = get_deferred_split_queue(page); + struct deferred_split *ds_queue; unsigned long flags; - spin_lock_irqsave(&ds_queue->split_queue_lock, flags); + ds_queue = lock_split_queue_irqsave(page, &flags); if (!list_empty(page_deferred_list(page))) { ds_queue->split_queue_len--; list_del(page_deferred_list(page)); } - spin_unlock_irqrestore(&ds_queue->split_queue_lock, flags); + unlock_split_queue_irqrestore(ds_queue, flags); free_compound_page(page); } void deferred_split_huge_page(struct page *page) { - struct deferred_split *ds_queue = get_deferred_split_queue(page); -#ifdef CONFIG_MEMCG - struct mem_cgroup *memcg = page_memcg(compound_head(page)); -#endif + struct deferred_split *ds_queue; unsigned long flags; VM_BUG_ON_PAGE(!PageTransHuge(page), page); @@ -2835,18 +2883,23 @@ void deferred_split_huge_page(struct page *page) if (PageSwapCache(page)) return; - spin_lock_irqsave(&ds_queue->split_queue_lock, flags); + ds_queue = lock_split_queue_irqsave(page, &flags); if (list_empty(page_deferred_list(page))) { count_vm_event(THP_DEFERRED_SPLIT_PAGE); list_add_tail(page_deferred_list(page), &ds_queue->split_queue); ds_queue->split_queue_len++; + #ifdef CONFIG_MEMCG - if (memcg) + if (page_memcg(page)) { + struct mem_cgroup *memcg; + + memcg = split_queue_to_memcg(ds_queue); memcg_set_shrinker_bit(memcg, page_to_nid(page), deferred_split_shrinker.id); + } #endif } - spin_unlock_irqrestore(&ds_queue->split_queue_lock, flags); + unlock_split_queue_irqrestore(ds_queue, flags); } static unsigned long deferred_split_count(struct shrinker *shrink, From patchwork Tue Mar 30 10:15:26 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Muchun Song X-Patchwork-Id: 12172217 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-16.6 required=3.0 tests=BAYES_00,DKIM_INVALID, DKIM_SIGNED,HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,URIBL_BLOCKED,USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 0C193C433C1 for ; Tue, 30 Mar 2021 10:22:36 +0000 (UTC) Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by mail.kernel.org (Postfix) with ESMTP id 7E0AD619AD for ; Tue, 30 Mar 2021 10:22:35 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org 7E0AD619AD Authentication-Results: mail.kernel.org; dmarc=fail (p=none dis=none) header.from=bytedance.com Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=owner-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix) id 140766B009C; Tue, 30 Mar 2021 06:22:35 -0400 (EDT) Received: by kanga.kvack.org (Postfix, from userid 40) id 117EB6B009D; Tue, 30 Mar 2021 06:22:35 -0400 (EDT) X-Delivered-To: int-list-linux-mm@kvack.org 
Received: by kanga.kvack.org (Postfix, from userid 63042) id EAC196B009E; Tue, 30 Mar 2021 06:22:34 -0400 (EDT) X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com (smtprelay0096.hostedemail.com [216.40.44.96]) by kanga.kvack.org (Postfix) with ESMTP id CC8636B009C for ; Tue, 30 Mar 2021 06:22:34 -0400 (EDT) Received: from smtpin10.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay05.hostedemail.com (Postfix) with ESMTP id 83F6C181AF5C7 for ; Tue, 30 Mar 2021 10:22:34 +0000 (UTC) X-FDA: 77976151428.10.C6277E8 Received: from mail-pl1-f169.google.com (mail-pl1-f169.google.com [209.85.214.169]) by imf13.hostedemail.com (Postfix) with ESMTP id D2B68E0011CE for ; Tue, 30 Mar 2021 10:22:32 +0000 (UTC) Received: by mail-pl1-f169.google.com with SMTP id h8so5919015plt.7 for ; Tue, 30 Mar 2021 03:22:33 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=bytedance-com.20150623.gappssmtp.com; s=20150623; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; bh=35PJPimaboX0f0GU6NT304tFzSCBj6gk52k3CNbuEN0=; b=k0pCcjq+CZH6CkeB7OZseO3R0aoXsvjx6Oxds9wSe0FcNtTLNcBYwf/eApXmFxW3DE mq4oUwM8ddJlOXfU2bjf+jON7bXFz0ciEk/omi+H0Ge+W1gyu8H7gtDPNeWxqVk4kkax 8O9fmWu+ics5JjV26PspctzHvzQZvS67iRBvMSGw7sglb7RpAryMK5ZKC2OS148Ex75J HjHhEjEKzNLCT48TIWPJBwVXsktvFsRXXJ1l0T9dia74Xt0Whr06FVLmTNvC1tmZd3y/ LyowX64eomXd4tilg5Y+Lcv7QV/EJuKgOc6XLz3G7x6byAy+ULDiudYEZYWfJrNfGetW m1VQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=35PJPimaboX0f0GU6NT304tFzSCBj6gk52k3CNbuEN0=; b=g6tyBsXuLu+CM3VwKtiGv45WBTVjKfAKP2HmbgyPU1u+H4u8Yus96+U0hD9ebOuTM7 zK6Ey7tpUGf7V989Li0Tf8DPFkBdvNzzObCIPtQERYitlubXOR2QCUkfuItdQTWy1sk4 4F9eQ7aEPw1wQQyIRlR8ZZ+ObEJauSPJNMrL3ZhRMnLSPK5Tm1QhXEEFM+GQ7rCWFsBX W8oU1KFkOUN3xn4IK+XZzWE8sAipETVWSb6jO5+WY2X3wdW0+H7kjGOTB+u3bH2zbw9u oL8okvUkb5kZ+kMi7OsX7ZdMBPvVCWHmKtTOUWRjmZNrcvtPWJYG/Sjdv/CEztsbnIPW yx8w== X-Gm-Message-State: AOAM531G9CcWTDkvBcEcHN/OsyJOZi6f/55hUMpwPXN3gu/uqJduPIbW A6+IgkjVXfXZyS4ZWC/aYDxs5w== X-Google-Smtp-Source: ABdhPJyrZ53j9C5rLFjTDQpQ/1eARR63PEdeQ3l5NphxlX8LYCZtDeAcNu6lfko9ahvLOeA5KbhfSA== X-Received: by 2002:a17:90a:f286:: with SMTP id fs6mr3663311pjb.183.1617099753361; Tue, 30 Mar 2021 03:22:33 -0700 (PDT) Received: from localhost.localdomain ([2408:8445:ad30:68d8:c87f:ca1b:dc00:4730]) by smtp.gmail.com with ESMTPSA id k10sm202259pfk.205.2021.03.30.03.22.22 (version=TLS1_2 cipher=ECDHE-ECDSA-AES128-GCM-SHA256 bits=128/128); Tue, 30 Mar 2021 03:22:33 -0700 (PDT) From: Muchun Song To: guro@fb.com, hannes@cmpxchg.org, mhocko@kernel.org, akpm@linux-foundation.org, shakeelb@google.com, vdavydov.dev@gmail.com Cc: linux-kernel@vger.kernel.org, linux-mm@kvack.org, duanxiongchun@bytedance.com, Muchun Song Subject: [RFC PATCH 10/15] mm: thp: make deferred split queue lock safe when the LRU pages reparented Date: Tue, 30 Mar 2021 18:15:26 +0800 Message-Id: <20210330101531.82752-11-songmuchun@bytedance.com> X-Mailer: git-send-email 2.21.0 (Apple Git-122) In-Reply-To: <20210330101531.82752-1-songmuchun@bytedance.com> References: <20210330101531.82752-1-songmuchun@bytedance.com> MIME-Version: 1.0 X-Rspamd-Server: rspam05 X-Rspamd-Queue-Id: D2B68E0011CE X-Stat-Signature: nbyp5juzsnfec61ftneitp1rh3ugyq4a Received-SPF: none (bytedance.com>: No applicable sender policy available) receiver=imf13; identity=mailfrom; envelope-from=""; 
helo=mail-pl1-f169.google.com; client-ip=209.85.214.169 X-HE-DKIM-Result: pass/pass X-HE-Tag: 1617099752-880003 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: Similar to lruvec lock, we use the same approach to make the lock safe when the LRU pages reparented. Signed-off-by: Muchun Song --- mm/huge_memory.c | 26 ++++++++++++++++++++++++++ 1 file changed, 26 insertions(+) diff --git a/mm/huge_memory.c b/mm/huge_memory.c index 186dc11e8992..434cc7283a64 100644 --- a/mm/huge_memory.c +++ b/mm/huge_memory.c @@ -496,6 +496,8 @@ static struct deferred_split *lock_split_queue(struct page *page) struct deferred_split *queue; struct mem_cgroup *memcg; + rcu_read_lock(); +retry: memcg = page_memcg(compound_head(page)); if (memcg) queue = &memcg->deferred_split_queue; @@ -503,6 +505,17 @@ static struct deferred_split *lock_split_queue(struct page *page) queue = &NODE_DATA(page_to_nid(page))->deferred_split_queue; spin_lock(&queue->split_queue_lock); + if (unlikely(memcg != page_memcg(page))) { + spin_unlock(&queue->split_queue_lock); + goto retry; + } + + /* + * Preemption is disabled in the internal of spin_lock, which can serve + * as RCU read-side critical sections. + */ + rcu_read_unlock(); + return queue; } @@ -512,6 +525,8 @@ static struct deferred_split *lock_split_queue_irqsave(struct page *page, struct deferred_split *queue; struct mem_cgroup *memcg; + rcu_read_lock(); +retry: memcg = page_memcg(compound_head(page)); if (memcg) queue = &memcg->deferred_split_queue; @@ -519,6 +534,17 @@ static struct deferred_split *lock_split_queue_irqsave(struct page *page, queue = &NODE_DATA(page_to_nid(page))->deferred_split_queue; spin_lock_irqsave(&queue->split_queue_lock, *flags); + if (unlikely(memcg != page_memcg(page))) { + spin_unlock_irqrestore(&queue->split_queue_lock, *flags); + goto retry; + } + + /* + * Preemption is disabled in the internal of spin_lock, which can serve + * as RCU read-side critical sections. 
+ */ + rcu_read_unlock(); + return queue; } #else From patchwork Tue Mar 30 10:15:27 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Muchun Song X-Patchwork-Id: 12172219 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-16.6 required=3.0 tests=BAYES_00,DKIM_INVALID, DKIM_SIGNED,HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,URIBL_BLOCKED,USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 47BF3C433E0 for ; Tue, 30 Mar 2021 10:22:51 +0000 (UTC) Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by mail.kernel.org (Postfix) with ESMTP id 91C53619B1 for ; Tue, 30 Mar 2021 10:22:50 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org 91C53619B1 Authentication-Results: mail.kernel.org; dmarc=fail (p=none dis=none) header.from=bytedance.com Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=owner-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix) id 142BA6B009E; Tue, 30 Mar 2021 06:22:50 -0400 (EDT) Received: by kanga.kvack.org (Postfix, from userid 40) id 117FE6B009F; Tue, 30 Mar 2021 06:22:50 -0400 (EDT) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id E85396B00A0; Tue, 30 Mar 2021 06:22:49 -0400 (EDT) X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com (smtprelay0015.hostedemail.com [216.40.44.15]) by kanga.kvack.org (Postfix) with ESMTP id C74396B009E for ; Tue, 30 Mar 2021 06:22:49 -0400 (EDT) Received: from smtpin07.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay05.hostedemail.com (Postfix) with ESMTP id 8DEDB181AF5C7 for ; Tue, 30 Mar 2021 10:22:49 +0000 (UTC) X-FDA: 77976152058.07.9621E55 Received: from mail-pj1-f47.google.com (mail-pj1-f47.google.com [209.85.216.47]) by imf15.hostedemail.com (Postfix) with ESMTP id 1E886A0009E5 for ; Tue, 30 Mar 2021 10:22:46 +0000 (UTC) Received: by mail-pj1-f47.google.com with SMTP id kk2-20020a17090b4a02b02900c777aa746fso7418045pjb.3 for ; Tue, 30 Mar 2021 03:22:48 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=bytedance-com.20150623.gappssmtp.com; s=20150623; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; bh=+RRui44YOPJhiRmspahr5RWVyh8BgMk07C75sioBtqk=; b=vzfrXfXZ+/XxPREmIiv9MdyNd9I4OeHsfzej7mDnOj15gIzFXMeSsbYqNYXPZ5hwEv kY8b7dvMmyOMdi28jQNy/QiXOjRJrpPjf86+xot6OoIu8Xb0bThMVc3Io5nOwz2ax6M6 IU4cqz0KOTadDamr4O9IaqqGIu1qCssJTV2vzHMq8oMOhBGmljp62iTDYwMuleyqkm+T Nzw/RIzqqe73pU2r1dqy8JbQHruhWbiZYMGPLuBoU42A01qTboY8FkNSksaqcl0gDDkO 99ihm0d1zFVZh68BP5hCgb3AzWNBONbgrFc1bjmhnDtbU52Tn4j5FWQWivB74aEUn8hy M1vA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=+RRui44YOPJhiRmspahr5RWVyh8BgMk07C75sioBtqk=; b=Y0N4p3IiBBsHtOe5L80WXtMJ8ON4uF7qRmVwPReUawvgfHBi7NsY9f3U7lfPC9cAUG Gutrm30Y55BUX4Gqm/wtpYMk0CCSCA0Uo6oGg3koJNcfiSsVi4Mb+cwx8lAP59QtQR8B rUu7UsAmNSjgeLNq9iwbyDAZFR5sRA0+Nfpr/4eEABTqa4fZZuX8plz4fjEayQ7nZ1nS ikX45rAcapiFZtDM32ahQo+dLRxB+rx69pyZw/UEBlmKHhkvfOh5veGrI9jHBn1KLa6j 
NyBdDaWWDcqm2T0WAOHwL7AyG5Rxc0chzV9LF/ShX1XD04+D7Px+y5qWYJWt+bSacLxM CgTw== X-Gm-Message-State: AOAM531yamYTCCyo3Wp0KN67EqbGotmnbKvamWRTjIFYIk2PjK8Pxu6s NF0v8U+iKQzkmLTsQNHns2Oz9g== X-Google-Smtp-Source: ABdhPJxPvWJm3OgX91dsEqA4LIOqvgaCwymQOQtQ+qqV31O23dn8D11n5PGy2XIMz4lH61iziy8LQA== X-Received: by 2002:a17:902:b705:b029:e6:f027:adf8 with SMTP id d5-20020a170902b705b02900e6f027adf8mr33326834pls.72.1617099767458; Tue, 30 Mar 2021 03:22:47 -0700 (PDT) Received: from localhost.localdomain ([2408:8445:ad30:68d8:c87f:ca1b:dc00:4730]) by smtp.gmail.com with ESMTPSA id k10sm202259pfk.205.2021.03.30.03.22.35 (version=TLS1_2 cipher=ECDHE-ECDSA-AES128-GCM-SHA256 bits=128/128); Tue, 30 Mar 2021 03:22:47 -0700 (PDT) From: Muchun Song To: guro@fb.com, hannes@cmpxchg.org, mhocko@kernel.org, akpm@linux-foundation.org, shakeelb@google.com, vdavydov.dev@gmail.com Cc: linux-kernel@vger.kernel.org, linux-mm@kvack.org, duanxiongchun@bytedance.com, Muchun Song Subject: [RFC PATCH 11/15] mm: memcontrol: make all the callers of page_memcg() safe Date: Tue, 30 Mar 2021 18:15:27 +0800 Message-Id: <20210330101531.82752-12-songmuchun@bytedance.com> X-Mailer: git-send-email 2.21.0 (Apple Git-122) In-Reply-To: <20210330101531.82752-1-songmuchun@bytedance.com> References: <20210330101531.82752-1-songmuchun@bytedance.com> MIME-Version: 1.0 X-Rspamd-Server: rspam03 X-Rspamd-Queue-Id: 1E886A0009E5 X-Stat-Signature: tiq1xsxg8ggdwsttnz756hbmp84gd776 Received-SPF: none (bytedance.com>: No applicable sender policy available) receiver=imf15; identity=mailfrom; envelope-from=""; helo=mail-pj1-f47.google.com; client-ip=209.85.216.47 X-HE-DKIM-Result: pass/pass X-HE-Tag: 1617099766-458202 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: When we use objcg APIs to charge the LRU pages, the page will not hold a reference to the memcg associated with the page. So the caller of the page_memcg() should hold an rcu read lock or obtain a reference to the memcg associated with the page to protect memcg from being released. So introduce get_mem_cgroup_from_page() to obtain a reference to the memory cgroup associated with the page. In this patch, make all the callers hold an rcu read lock or obtain a reference to the memcg to protect memcg from being released when the LRU pages reparented. We do not need to adjust the callers of page_memcg() during the whole process of mem_cgroup_move_task(). Because the cgroup migration and memory cgroup offlining are serialized by @cgroup_mutex. In this routine, the LRU pages cannot be reparented to its parent memory cgroup. So page_memcg(page) is stable and cannot be released. This is a preparation for reparenting the LRU pages. 
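The two safe usage patterns the callers are converted to can be sketched as follows
(simplified; error handling omitted):

    /* 1) Short, non-sleeping access: an RCU read-side section suffices. */
    rcu_read_lock();
    memcg = page_memcg(page);
    if (memcg)
            count_memcg_events(memcg, idx, 1);
    rcu_read_unlock();

    /* 2) The memcg is used outside an RCU section: pin it with a reference. */
    memcg = get_mem_cgroup_from_page(page);         /* may return NULL */
    if (memcg) {
            /* ... use memcg, possibly sleeping ... */
            css_put(&memcg->css);
    }
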
Signed-off-by: Muchun Song --- fs/buffer.c | 3 ++- fs/fs-writeback.c | 23 +++++++++++---------- include/linux/memcontrol.h | 34 ++++++++++++++++++++++++++++--- mm/memcontrol.c | 50 ++++++++++++++++++++++++++++++++++++---------- mm/migrate.c | 4 ++++ mm/page_io.c | 5 +++-- 6 files changed, 91 insertions(+), 28 deletions(-) diff --git a/fs/buffer.c b/fs/buffer.c index 591547779dbd..790ba6660d10 100644 --- a/fs/buffer.c +++ b/fs/buffer.c @@ -848,7 +848,7 @@ struct buffer_head *alloc_page_buffers(struct page *page, unsigned long size, gfp |= __GFP_NOFAIL; /* The page lock pins the memcg */ - memcg = page_memcg(page); + memcg = get_mem_cgroup_from_page(page); old_memcg = set_active_memcg(memcg); head = NULL; @@ -868,6 +868,7 @@ struct buffer_head *alloc_page_buffers(struct page *page, unsigned long size, set_bh_page(bh, page, offset); } out: + mem_cgroup_put(memcg); set_active_memcg(old_memcg); return head; /* diff --git a/fs/fs-writeback.c b/fs/fs-writeback.c index e91980f49388..3ac002561327 100644 --- a/fs/fs-writeback.c +++ b/fs/fs-writeback.c @@ -255,15 +255,13 @@ void __inode_attach_wb(struct inode *inode, struct page *page) if (inode_cgwb_enabled(inode)) { struct cgroup_subsys_state *memcg_css; - if (page) { - memcg_css = mem_cgroup_css_from_page(page); - wb = wb_get_create(bdi, memcg_css, GFP_ATOMIC); - } else { - /* must pin memcg_css, see wb_get_create() */ + /* must pin memcg_css, see wb_get_create() */ + if (page) + memcg_css = get_mem_cgroup_css_from_page(page); + else memcg_css = task_get_css(current, memory_cgrp_id); - wb = wb_get_create(bdi, memcg_css, GFP_ATOMIC); - css_put(memcg_css); - } + wb = wb_get_create(bdi, memcg_css, GFP_ATOMIC); + css_put(memcg_css); } if (!wb) @@ -736,16 +734,16 @@ void wbc_account_cgroup_owner(struct writeback_control *wbc, struct page *page, if (!wbc->wb || wbc->no_cgroup_owner) return; - css = mem_cgroup_css_from_page(page); + css = get_mem_cgroup_css_from_page(page); /* dead cgroups shouldn't contribute to inode ownership arbitration */ if (!(css->flags & CSS_ONLINE)) - return; + goto out; id = css->id; if (id == wbc->wb_id) { wbc->wb_bytes += bytes; - return; + goto out; } if (id == wbc->wb_lcand_id) @@ -758,6 +756,9 @@ void wbc_account_cgroup_owner(struct writeback_control *wbc, struct page *page, wbc->wb_tcand_bytes += bytes; else wbc->wb_tcand_bytes -= min(bytes, wbc->wb_tcand_bytes); + +out: + css_put(css); } EXPORT_SYMBOL_GPL(wbc_account_cgroup_owner); diff --git a/include/linux/memcontrol.h b/include/linux/memcontrol.h index 5d7c8a060843..8944115ebf8e 100644 --- a/include/linux/memcontrol.h +++ b/include/linux/memcontrol.h @@ -367,7 +367,7 @@ static inline bool PageMemcgKmem(struct page *page); * a valid memcg, but can be atomically swapped to the parent memcg. * * The caller must ensure that the returned memcg won't be released: - * e.g. acquire the rcu_read_lock or css_set_lock. + * e.g. acquire the rcu_read_lock or css_set_lock or cgroup_mutex. */ static inline struct mem_cgroup *obj_cgroup_memcg(struct obj_cgroup *objcg) { @@ -445,6 +445,31 @@ static inline struct mem_cgroup *page_memcg(struct page *page) } /* + * get_mem_cgroup_from_page - Obtain a reference on the memory cgroup associated + * with a page + * @page: a pointer to the page struct + * + * Returns a pointer to the memory cgroup (and obtain a reference on it) + * associated with the page, or NULL. This function assumes that the page + * is known to have a proper memory cgroup pointer. It's not safe to call + * this function against some type of pages, e.g. 
slab pages or ex-slab + * pages. + */ +static inline struct mem_cgroup *get_mem_cgroup_from_page(struct page *page) +{ + struct mem_cgroup *memcg; + + rcu_read_lock(); +retry: + memcg = page_memcg(page); + if (unlikely(memcg && !css_tryget(&memcg->css))) + goto retry; + rcu_read_unlock(); + + return memcg; +} + +/* * page_memcg_rcu - locklessly get the memory cgroup associated with a page * @page: a pointer to the page struct * @@ -870,7 +895,7 @@ static inline bool mm_match_cgroup(struct mm_struct *mm, return match; } -struct cgroup_subsys_state *mem_cgroup_css_from_page(struct page *page); +struct cgroup_subsys_state *get_mem_cgroup_css_from_page(struct page *page); ino_t page_cgroup_ino(struct page *page); static inline bool mem_cgroup_online(struct mem_cgroup *memcg) @@ -1068,10 +1093,13 @@ static inline void count_memcg_events(struct mem_cgroup *memcg, static inline void count_memcg_page_event(struct page *page, enum vm_event_item idx) { - struct mem_cgroup *memcg = page_memcg(page); + struct mem_cgroup *memcg; + rcu_read_lock(); + memcg = page_memcg(page); if (memcg) count_memcg_events(memcg, idx, 1); + rcu_read_unlock(); } static inline void count_memcg_event_mm(struct mm_struct *mm, diff --git a/mm/memcontrol.c b/mm/memcontrol.c index 2592e2b072ef..cb650d089d9f 100644 --- a/mm/memcontrol.c +++ b/mm/memcontrol.c @@ -563,7 +563,7 @@ void memcg_set_shrinker_bit(struct mem_cgroup *memcg, int nid, int shrinker_id) } /** - * mem_cgroup_css_from_page - css of the memcg associated with a page + * get_mem_cgroup_css_from_page - get css of the memcg associated with a page * @page: page of interest * * If memcg is bound to the default hierarchy, css of the memcg associated @@ -573,13 +573,15 @@ void memcg_set_shrinker_bit(struct mem_cgroup *memcg, int nid, int shrinker_id) * If memcg is bound to a traditional hierarchy, the css of root_mem_cgroup * is returned. */ -struct cgroup_subsys_state *mem_cgroup_css_from_page(struct page *page) +struct cgroup_subsys_state *get_mem_cgroup_css_from_page(struct page *page) { struct mem_cgroup *memcg; - memcg = page_memcg(page); + if (!cgroup_subsys_on_dfl(memory_cgrp_subsys)) + return &root_mem_cgroup->css; - if (!memcg || !cgroup_subsys_on_dfl(memory_cgrp_subsys)) + memcg = get_mem_cgroup_from_page(page); + if (!memcg) memcg = root_mem_cgroup; return &memcg->css; @@ -3332,7 +3334,7 @@ void obj_cgroup_uncharge(struct obj_cgroup *objcg, size_t size) */ void mem_cgroup_split_huge_fixup(struct page *head) { - struct mem_cgroup *memcg = page_memcg(head); + struct mem_cgroup *memcg = get_mem_cgroup_from_page(head); int i; if (mem_cgroup_disabled()) @@ -3342,6 +3344,7 @@ void mem_cgroup_split_huge_fixup(struct page *head) css_get(&memcg->css); head[i].memcg_data = (unsigned long)memcg; } + css_put(&memcg->css); } #endif /* CONFIG_TRANSPARENT_HUGEPAGE */ @@ -4664,7 +4667,7 @@ void mem_cgroup_wb_stats(struct bdi_writeback *wb, unsigned long *pfilepages, void mem_cgroup_track_foreign_dirty_slowpath(struct page *page, struct bdi_writeback *wb) { - struct mem_cgroup *memcg = page_memcg(page); + struct mem_cgroup *memcg; struct memcg_cgwb_frn *frn; u64 now = get_jiffies_64(); u64 oldest_at = now; @@ -4673,6 +4676,7 @@ void mem_cgroup_track_foreign_dirty_slowpath(struct page *page, trace_track_foreign_dirty(page, wb); + memcg = get_mem_cgroup_from_page(page); /* * Pick the slot to use. If there is already a slot for @wb, keep * using it. 
If not replace the oldest one which isn't being @@ -4711,6 +4715,7 @@ void mem_cgroup_track_foreign_dirty_slowpath(struct page *page, frn->memcg_id = wb->memcg_css->id; frn->at = now; } + css_put(&memcg->css); } /* issue foreign writeback flushes for recorded foreign dirtying events */ @@ -6182,6 +6187,14 @@ static void mem_cgroup_move_charge(void) atomic_dec(&mc.from->moving_account); } +/* + * The cgroup migration and memory cgroup offlining are serialized by + * @cgroup_mutex. If we reach here, it means that the LRU pages cannot + * be reparented to its parent memory cgroup. So during the whole process + * of mem_cgroup_move_task(), page_memcg(page) is stable. So we do not + * need to worry about the memcg (returned from page_memcg()) being + * released even if we do not hold an rcu read lock. + */ static void mem_cgroup_move_task(void) { if (mc.to) { @@ -6977,7 +6990,7 @@ void mem_cgroup_migrate(struct page *oldpage, struct page *newpage) if (page_memcg(newpage)) return; - memcg = page_memcg(oldpage); + memcg = get_mem_cgroup_from_page(oldpage); VM_WARN_ON_ONCE_PAGE(!memcg, oldpage); if (!memcg) return; @@ -6998,6 +7011,8 @@ void mem_cgroup_migrate(struct page *oldpage, struct page *newpage) mem_cgroup_charge_statistics(memcg, newpage, nr_pages); memcg_check_events(memcg, newpage); local_irq_restore(flags); + + css_put(&memcg->css); } DEFINE_STATIC_KEY_FALSE(memcg_sockets_enabled_key); @@ -7186,6 +7201,10 @@ void mem_cgroup_swapout(struct page *page, swp_entry_t entry) if (cgroup_subsys_on_dfl(memory_cgrp_subsys)) return; + /* + * Interrupts should be disabled by the caller (see the comments below), + * which can serve as RCU read-side critical sections. + */ memcg = page_memcg(page); VM_WARN_ON_ONCE_PAGE(!memcg, page); @@ -7253,15 +7272,16 @@ int mem_cgroup_try_charge_swap(struct page *page, swp_entry_t entry) if (!cgroup_subsys_on_dfl(memory_cgrp_subsys)) return 0; + rcu_read_lock(); memcg = page_memcg(page); VM_WARN_ON_ONCE_PAGE(!memcg, page); if (!memcg) - return 0; + goto out; if (!entry.val) { memcg_memory_event(memcg, MEMCG_SWAP_FAIL); - return 0; + goto out; } memcg = mem_cgroup_id_get_online(memcg); @@ -7271,6 +7291,7 @@ int mem_cgroup_try_charge_swap(struct page *page, swp_entry_t entry) memcg_memory_event(memcg, MEMCG_SWAP_MAX); memcg_memory_event(memcg, MEMCG_SWAP_FAIL); mem_cgroup_id_put(memcg); + rcu_read_unlock(); return -ENOMEM; } @@ -7280,6 +7301,8 @@ int mem_cgroup_try_charge_swap(struct page *page, swp_entry_t entry) oldid = swap_cgroup_record(entry, mem_cgroup_id(memcg), nr_pages); VM_BUG_ON_PAGE(oldid, page); mod_memcg_state(memcg, MEMCG_SWAP, nr_pages); +out: + rcu_read_unlock(); return 0; } @@ -7334,17 +7357,22 @@ bool mem_cgroup_swap_full(struct page *page) if (cgroup_memory_noswap || !cgroup_subsys_on_dfl(memory_cgrp_subsys)) return false; + rcu_read_lock(); memcg = page_memcg(page); if (!memcg) - return false; + goto out; for (; memcg != root_mem_cgroup; memcg = parent_mem_cgroup(memcg)) { unsigned long usage = page_counter_read(&memcg->swap); if (usage * 2 >= READ_ONCE(memcg->swap.high) || - usage * 2 >= READ_ONCE(memcg->swap.max)) + usage * 2 >= READ_ONCE(memcg->swap.max)) { + rcu_read_unlock(); return true; + } } +out: + rcu_read_unlock(); return false; } diff --git a/mm/migrate.c b/mm/migrate.c index 62b81d5257aa..6f5d445949c2 100644 --- a/mm/migrate.c +++ b/mm/migrate.c @@ -490,6 +490,10 @@ int migrate_page_move_mapping(struct address_space *mapping, struct lruvec *old_lruvec, *new_lruvec; struct mem_cgroup *memcg; + /* + * Irq is disabled, which can 
serve as RCU read-side critical + * sections. + */ memcg = page_memcg(page); old_lruvec = mem_cgroup_lruvec(memcg, oldzone->zone_pgdat); new_lruvec = mem_cgroup_lruvec(memcg, newzone->zone_pgdat); diff --git a/mm/page_io.c b/mm/page_io.c index c493ce9ebcf5..81744777ab76 100644 --- a/mm/page_io.c +++ b/mm/page_io.c @@ -269,13 +269,14 @@ static void bio_associate_blkg_from_page(struct bio *bio, struct page *page) struct cgroup_subsys_state *css; struct mem_cgroup *memcg; + rcu_read_lock(); memcg = page_memcg(page); if (!memcg) - return; + goto out; - rcu_read_lock(); css = cgroup_e_css(memcg->css.cgroup, &io_cgrp_subsys); bio_associate_blkg_from_css(bio, css); +out: rcu_read_unlock(); } #else From patchwork Tue Mar 30 10:15:28 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Muchun Song X-Patchwork-Id: 12172221 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-16.6 required=3.0 tests=BAYES_00,DKIM_INVALID, DKIM_SIGNED,HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,URIBL_BLOCKED,USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 8F79EC433C1 for ; Tue, 30 Mar 2021 10:23:00 +0000 (UTC) Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by mail.kernel.org (Postfix) with ESMTP id 129A0619AD for ; Tue, 30 Mar 2021 10:23:00 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org 129A0619AD Authentication-Results: mail.kernel.org; dmarc=fail (p=none dis=none) header.from=bytedance.com Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=owner-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix) id 9086F6B00A0; Tue, 30 Mar 2021 06:22:59 -0400 (EDT) Received: by kanga.kvack.org (Postfix, from userid 40) id 8DED26B00A1; Tue, 30 Mar 2021 06:22:59 -0400 (EDT) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 758B06B00A2; Tue, 30 Mar 2021 06:22:59 -0400 (EDT) X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com (smtprelay0014.hostedemail.com [216.40.44.14]) by kanga.kvack.org (Postfix) with ESMTP id 59A7A6B00A0 for ; Tue, 30 Mar 2021 06:22:59 -0400 (EDT) Received: from smtpin22.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay02.hostedemail.com (Postfix) with ESMTP id 1E5CC52CC for ; Tue, 30 Mar 2021 10:22:59 +0000 (UTC) X-FDA: 77976152478.22.292581F Received: from mail-pg1-f177.google.com (mail-pg1-f177.google.com [209.85.215.177]) by imf30.hostedemail.com (Postfix) with ESMTP id D151AE0011C9 for ; Tue, 30 Mar 2021 10:22:52 +0000 (UTC) Received: by mail-pg1-f177.google.com with SMTP id f3so2182064pgv.0 for ; Tue, 30 Mar 2021 03:22:58 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=bytedance-com.20150623.gappssmtp.com; s=20150623; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; bh=LlVfK3tBgrU0BmUX2bWJJi+yNP+gPFz4okGwO5zSo8A=; b=npOuj/IXL32HPg+K23Sf+62ebOlixpsNhaNB6Q3mb/mp0VlX2286qJxPe43mPC206Q sKe0ASbkn8j9Y6Bu1nY89TM5lv4O7xe6RmLNinsrYRcvQoQkiDcrCvob1ltIglysLe1/ FJkYD9zLZVp7JaXlnxo/GoQDqb5523ZL6hvS6i5ihuVzIZ/dgZwqd7Vs253seeHV6wf6 SufLYUloFclKQVRLkUFN+8qZHerr5cE4Fi263QbIuTBVqNstFg6PYRMaZzV4gfsMOCjH 
MdPzLEykQ+MzbaHjAogbvIr1kHzzzthVUPvsf7zPTAY2Zxf2P0SjkwGjATdxL0Hp49kX 3glw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=LlVfK3tBgrU0BmUX2bWJJi+yNP+gPFz4okGwO5zSo8A=; b=MITGOQ9TX8pkgKm7u5OJQdkjFShp3QknYbc4TYQM5Vaeq+5hrmNceuCekDdss2bXYx xAcz0CLQjK6fYnUjLdoo4eIO9kHbGQqfKR4+8mEZD4CVlOyRC+Jys8Gv2//5o8VhR7XB fKGSgx4tYkBnqd+KBktDaFtbNzVY5rkjueiOgIPt0djs4igg+JAcekJVZDXEicdoGFJ1 n3+5UpLeb2Xz1nJFZuGksLLvteYtng0htnIXiv6sJZtDlvp8Xl8WS+qN8hP1zTj+G1SE Nv77mWXJqWp9hKSqyL5Fl1VriChph7HWOsBQVHg/Ln0LO/UwIVxvgyvOIsdA7jaJrcj8 Ed7g== X-Gm-Message-State: AOAM530ab27caxoK5nw51UWEJ/VTv0IsbtXZZj28JKYsSnu772GutTRI DQwmaF8alnNOit86PqAQVp2xSw== X-Google-Smtp-Source: ABdhPJyOUnZ7tulMq0ZkReZZufRvoJjBeThdaJ2q71EIRLjtbmbmrs8OtM6B/E1tX6dkgSDD809a5w== X-Received: by 2002:a65:5308:: with SMTP id m8mr26351120pgq.266.1617099777573; Tue, 30 Mar 2021 03:22:57 -0700 (PDT) Received: from localhost.localdomain ([2408:8445:ad30:68d8:c87f:ca1b:dc00:4730]) by smtp.gmail.com with ESMTPSA id k10sm202259pfk.205.2021.03.30.03.22.48 (version=TLS1_2 cipher=ECDHE-ECDSA-AES128-GCM-SHA256 bits=128/128); Tue, 30 Mar 2021 03:22:57 -0700 (PDT) From: Muchun Song To: guro@fb.com, hannes@cmpxchg.org, mhocko@kernel.org, akpm@linux-foundation.org, shakeelb@google.com, vdavydov.dev@gmail.com Cc: linux-kernel@vger.kernel.org, linux-mm@kvack.org, duanxiongchun@bytedance.com, Muchun Song Subject: [RFC PATCH 12/15] mm: memcontrol: introduce memcg_reparent_ops Date: Tue, 30 Mar 2021 18:15:28 +0800 Message-Id: <20210330101531.82752-13-songmuchun@bytedance.com> X-Mailer: git-send-email 2.21.0 (Apple Git-122) In-Reply-To: <20210330101531.82752-1-songmuchun@bytedance.com> References: <20210330101531.82752-1-songmuchun@bytedance.com> MIME-Version: 1.0 X-Rspamd-Server: rspam03 X-Rspamd-Queue-Id: D151AE0011C9 X-Stat-Signature: rexnxe3qcwkiut9mctto6a4yn9pr6ip1 Received-SPF: none (bytedance.com>: No applicable sender policy available) receiver=imf30; identity=mailfrom; envelope-from=""; helo=mail-pg1-f177.google.com; client-ip=209.85.215.177 X-HE-DKIM-Result: pass/pass X-HE-Tag: 1617099772-855463 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: In the previous patch, we know how to make the lruvec lock safe when the LRU pages reparented. We should do something like following. memcg_reparent_objcgs(memcg) 1) lock // lruvec belongs to memcg and lruvec_parent belongs to parent memcg. spin_lock(&lruvec->lru_lock); spin_lock(&lruvec_parent->lru_lock); 2) do reparent // Move all the pages from the lruvec list to the parent lruvec list. 3) unlock spin_unlock(&lruvec_parent->lru_lock); spin_unlock(&lruvec->lru_lock); Apart from the page lruvec lock, the deferred split queue lock (THP only) also needs to do something similar. So we extracted the necessary 3 steps in the memcg_reparent_objcgs(). memcg_reparent_objcgs(memcg) 1) lock memcg_reparent_ops->lock(memcg, parent); 2) reparent memcg_reparent_ops->reparent(memcg, reparent); 3) unlock memcg_reparent_ops->unlock(memcg, reparent); Now there are two different locks (e.g. lruvec lock and deferred split queue lock) need to use this infrastructure. In the next patch, we will use those APIs to make those locks safe when the LRU pages reparented. 
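As an illustration of the intended use (a hypothetical sketch only; the real lruvec
and split queue users are wired up in the next patches, and the helper names below
are invented for this example), a lock provider registers its three callbacks once
during memcg init:

    static void lruvec_reparent_lock(struct mem_cgroup *memcg,
                                     struct mem_cgroup *parent)
    {
            /* take the child's and the parent's lru_lock (irqs already disabled) */
    }

    static void lruvec_reparent_relocate(struct mem_cgroup *memcg,
                                         struct mem_cgroup *parent)
    {
            /* splice the child's LRU lists onto the parent's lists */
    }

    static void lruvec_reparent_unlock(struct mem_cgroup *memcg,
                                       struct mem_cgroup *parent)
    {
            /* release both lru_locks */
    }

    static struct memcg_reparent_ops lruvec_reparent_ops = {
            .lock     = lruvec_reparent_lock,
            .reparent = lruvec_reparent_relocate,
            .unlock   = lruvec_reparent_unlock,
    };

    /* somewhere in the memcg init path */
    register_memcg_repatent(&lruvec_reparent_ops);
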
Signed-off-by: Muchun Song --- include/linux/memcontrol.h | 11 +++++++++++ mm/memcontrol.c | 49 ++++++++++++++++++++++++++++++++++++++++++++-- 2 files changed, 58 insertions(+), 2 deletions(-) diff --git a/include/linux/memcontrol.h b/include/linux/memcontrol.h index 8944115ebf8e..c79770ce3c81 100644 --- a/include/linux/memcontrol.h +++ b/include/linux/memcontrol.h @@ -341,6 +341,17 @@ struct mem_cgroup { /* WARNING: nodeinfo must be the last member here */ }; +struct memcg_reparent_ops { + struct list_head list; + + /* Irq is disabled before calling those functions. */ + void (*lock)(struct mem_cgroup *memcg, struct mem_cgroup *parent); + void (*unlock)(struct mem_cgroup *memcg, struct mem_cgroup *parent); + void (*reparent)(struct mem_cgroup *memcg, struct mem_cgroup *parent); +}; + +void __init register_memcg_repatent(struct memcg_reparent_ops *ops); + /* * size of first charge trial. "32" comes from vmscan.c's magic value. * TODO: maybe necessary to use big numbers in big irons. diff --git a/mm/memcontrol.c b/mm/memcontrol.c index cb650d089d9f..d5701117794a 100644 --- a/mm/memcontrol.c +++ b/mm/memcontrol.c @@ -338,6 +338,41 @@ static struct obj_cgroup *obj_cgroup_alloc(void) return objcg; } +static LIST_HEAD(reparent_ops_head); + +static void memcg_reparent_lock(struct mem_cgroup *memcg, + struct mem_cgroup *parent) +{ + struct memcg_reparent_ops *ops; + + list_for_each_entry(ops, &reparent_ops_head, list) + ops->lock(memcg, parent); +} + +static void memcg_reparent_unlock(struct mem_cgroup *memcg, + struct mem_cgroup *parent) +{ + struct memcg_reparent_ops *ops; + + list_for_each_entry(ops, &reparent_ops_head, list) + ops->unlock(memcg, parent); +} + +static void memcg_do_reparent(struct mem_cgroup *memcg, + struct mem_cgroup *parent) +{ + struct memcg_reparent_ops *ops; + + list_for_each_entry(ops, &reparent_ops_head, list) + ops->reparent(memcg, parent); +} + +void __init register_memcg_repatent(struct memcg_reparent_ops *ops) +{ + BUG_ON(!ops->lock || !ops->unlock || !ops->reparent); + list_add(&ops->list, &reparent_ops_head); +} + static void memcg_reparent_objcgs(struct mem_cgroup *memcg) { struct obj_cgroup *objcg, *iter; @@ -347,9 +382,13 @@ static void memcg_reparent_objcgs(struct mem_cgroup *memcg) if (!parent) parent = root_mem_cgroup; + local_irq_disable(); + + memcg_reparent_lock(memcg, parent); + objcg = rcu_replace_pointer(memcg->objcg, NULL, true); - spin_lock_irq(&css_set_lock); + spin_lock(&css_set_lock); /* 1) Ready to reparent active objcg. 
*/ list_add(&objcg->list, &memcg->objcg_list); @@ -361,7 +400,13 @@ static void memcg_reparent_objcgs(struct mem_cgroup *memcg) /* 3) Move already reparented objcgs to the parent's list */ list_splice(&memcg->objcg_list, &parent->objcg_list); - spin_unlock_irq(&css_set_lock); + spin_unlock(&css_set_lock); + + memcg_do_reparent(memcg, parent); + + memcg_reparent_unlock(memcg, parent); + + local_irq_enable(); percpu_ref_kill(&objcg->refcnt); } From patchwork Tue Mar 30 10:15:29 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Muchun Song X-Patchwork-Id: 12172223 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-16.6 required=3.0 tests=BAYES_00,DKIM_INVALID, DKIM_SIGNED,HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,URIBL_BLOCKED,USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id EB2FCC433C1 for ; Tue, 30 Mar 2021 10:23:12 +0000 (UTC) Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by mail.kernel.org (Postfix) with ESMTP id 41C3C619AD for ; Tue, 30 Mar 2021 10:23:12 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org 41C3C619AD Authentication-Results: mail.kernel.org; dmarc=fail (p=none dis=none) header.from=bytedance.com Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=owner-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix) id C8C986B00A2; Tue, 30 Mar 2021 06:23:11 -0400 (EDT) Received: by kanga.kvack.org (Postfix, from userid 40) id C61386B00A3; Tue, 30 Mar 2021 06:23:11 -0400 (EDT) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id ADB136B00A4; Tue, 30 Mar 2021 06:23:11 -0400 (EDT) X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com (smtprelay0175.hostedemail.com [216.40.44.175]) by kanga.kvack.org (Postfix) with ESMTP id 87A506B00A2 for ; Tue, 30 Mar 2021 06:23:11 -0400 (EDT) Received: from smtpin36.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay05.hostedemail.com (Postfix) with ESMTP id 4C0EC181AF5C7 for ; Tue, 30 Mar 2021 10:23:11 +0000 (UTC) X-FDA: 77976152982.36.06FBEBE Received: from mail-pg1-f170.google.com (mail-pg1-f170.google.com [209.85.215.170]) by imf25.hostedemail.com (Postfix) with ESMTP id E979F6000103 for ; Tue, 30 Mar 2021 10:23:08 +0000 (UTC) Received: by mail-pg1-f170.google.com with SMTP id k8so2232849pgf.4 for ; Tue, 30 Mar 2021 03:23:10 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=bytedance-com.20150623.gappssmtp.com; s=20150623; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; bh=tNFf2QyGKwe25+FGy3DBLMJaAN3lwOmYG4LAT2WfriA=; b=v9R+PTgEluSZ/DsJpdYxF9JyKIHeTDdOM8MWp2/uG7yTpz3BRm9uU1MJ9yUw6fiWd6 hVQJ9hLjczuMIsE/7vk9Clw+T4gn77cPLv5GoqZ0Wgx8jcYf+HMbFlVd+GQzXD3FUIc1 xv/Z1tV1/b7bAMFxYJkWZAb5QVIjysrg8x4lpXpEs60ec9BYYc0H6z4P3EQA9rcWUdml /fnBucSwbtq6X4EMDIldu9e+2MSSwXX+zqPEt2Tjf1TBJ/XaZjWBQxfpuum8E5HgdWHG 3BuvG9Ityc4v9WOlz/rIAL3BB1ub2AJAE6zdPPA5W+EeOooglhy2UJvhW+1sy1FK2PdF ipmA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; 
bh=tNFf2QyGKwe25+FGy3DBLMJaAN3lwOmYG4LAT2WfriA=; b=qOr6De/k5Z5OWpy4aUW9LsNnfI5jOU8kJeXUx+8yo1sMI+VumX+RLeli0VbW8y+sSe 5UANsRIcvWWoej4T1LmMQQ5LWqB+wFr5dDfqivKuIggFlLSF7pE/cPTLiJFg6Q0lCfr+ nmlnqbciR9pcM+AePIlrKIt6YGMjUsS7IkexGyXPN0dGSq3yByyIkcTuoxdYvMnyKnNq jO+YSZZBdO2yo137c64m0qsty/SQYjwUNqscGFQncKGf/2Y0rGvPa0Zy8jG8C8opcLBd J44xLRYEqX20QnfMsw1f100jK/vy3RzKfQDHi+B7tvp3lUqktiE7gcE7bGSfaAHkvlvl MYig== X-Gm-Message-State: AOAM532XAC+NF2PqR3d/sH6t/hX8uoPBWPpzP+xH9d/Ze9f6rYnNTJui WRghn9elJ4FEc+SWL9ieasp1EA== X-Google-Smtp-Source: ABdhPJwFS66RV8ZU/LS8ozBiVfHeI/wszavd+OCc+FUmLB6z8/xd6gAFMAecyA8ItNy6zIAhKqbEOQ== X-Received: by 2002:a63:5416:: with SMTP id i22mr28618798pgb.43.1617099789601; Tue, 30 Mar 2021 03:23:09 -0700 (PDT) Received: from localhost.localdomain ([2408:8445:ad30:68d8:c87f:ca1b:dc00:4730]) by smtp.gmail.com with ESMTPSA id k10sm202259pfk.205.2021.03.30.03.22.59 (version=TLS1_2 cipher=ECDHE-ECDSA-AES128-GCM-SHA256 bits=128/128); Tue, 30 Mar 2021 03:23:09 -0700 (PDT) From: Muchun Song To: guro@fb.com, hannes@cmpxchg.org, mhocko@kernel.org, akpm@linux-foundation.org, shakeelb@google.com, vdavydov.dev@gmail.com Cc: linux-kernel@vger.kernel.org, linux-mm@kvack.org, duanxiongchun@bytedance.com, Muchun Song Subject: [RFC PATCH 13/15] mm: memcontrol: use obj_cgroup APIs to charge the LRU pages Date: Tue, 30 Mar 2021 18:15:29 +0800 Message-Id: <20210330101531.82752-14-songmuchun@bytedance.com> X-Mailer: git-send-email 2.21.0 (Apple Git-122) In-Reply-To: <20210330101531.82752-1-songmuchun@bytedance.com> References: <20210330101531.82752-1-songmuchun@bytedance.com> MIME-Version: 1.0 X-Stat-Signature: q9eiqh4iu7an7sezpb4xzyq9qqwjxp7x X-Rspamd-Server: rspam04 X-Rspamd-Queue-Id: E979F6000103 Received-SPF: none (bytedance.com>: No applicable sender policy available) receiver=imf25; identity=mailfrom; envelope-from=""; helo=mail-pg1-f170.google.com; client-ip=209.85.215.170 X-HE-DKIM-Result: pass/pass X-HE-Tag: 1617099788-536806 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: We will reuse the obj_cgroup APIs to charge the LRU pages. After this change, page->memcg_data has two different meanings: - For slab pages, page->memcg_data points to an object cgroup vector. - For kmem pages (excluding slab pages) and LRU pages, page->memcg_data points to an object cgroup. In this patch, we reuse the obj_cgroup APIs to charge LRU pages, so long-living page cache pages can no longer pin the original memory cgroup in memory. At the same time, this changes the rules for the stability of the page and objcg or memcg binding. The new rules are as follows. For a page, any of the following ensures the stability of the page and objcg binding: - the page lock - LRU isolation - lock_page_memcg() - exclusive reference Based on the stable binding of page and objcg, any of the following ensures the stability of the page and memcg binding: - css_set_lock - cgroup_mutex - the lruvec lock - the split queue lock (THP pages only) If a caller only needs the page counters of the memcg to be updated correctly, the stability of the page and objcg binding is sufficient. 
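As an illustration only (not part of the diff below; the wrapper function name is made up for this example), a stat-update path under the new scheme looks up the memcg through the objcg indirection roughly like this, relying on RCU to keep the returned memcg alive across reparenting:

	/*
	 * Sketch only: assumes the caller already holds one of the
	 * page-and-objcg stability conditions listed above (page lock,
	 * LRU isolation, lock_page_memcg(), or an exclusive reference).
	 */
	static void example_mod_memcg_page_state(struct page *page, int idx, int val)
	{
		struct obj_cgroup *objcg = page_objcg(page);
		struct mem_cgroup *memcg;

		if (!objcg)	/* the page is not charged */
			return;

		/*
		 * objcg->memcg can be switched to the parent memcg by
		 * reparenting, so pin the lookup with an RCU read-side
		 * critical section while the counter is updated.
		 */
		rcu_read_lock();
		memcg = obj_cgroup_memcg(objcg);
		mod_memcg_state(memcg, idx, val);
		rcu_read_unlock();
	}
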
Signed-off-by: Muchun Song --- include/linux/memcontrol.h | 96 +++++++----------- mm/huge_memory.c | 48 +++++++++ mm/memcontrol.c | 245 ++++++++++++++++++++++++++++++--------------- 3 files changed, 251 insertions(+), 138 deletions(-) diff --git a/include/linux/memcontrol.h b/include/linux/memcontrol.h index c79770ce3c81..cd9e9ff6c2bf 100644 --- a/include/linux/memcontrol.h +++ b/include/linux/memcontrol.h @@ -371,8 +371,6 @@ enum page_memcg_data_flags { #define MEMCG_DATA_FLAGS_MASK (__NR_MEMCG_DATA_FLAGS - 1) -static inline bool PageMemcgKmem(struct page *page); - /* * After the initialization objcg->memcg is always pointing at * a valid memcg, but can be atomically swapped to the parent memcg. @@ -386,43 +384,19 @@ static inline struct mem_cgroup *obj_cgroup_memcg(struct obj_cgroup *objcg) } /* - * __page_memcg - get the memory cgroup associated with a non-kmem page - * @page: a pointer to the page struct - * - * Returns a pointer to the memory cgroup associated with the page, - * or NULL. This function assumes that the page is known to have a - * proper memory cgroup pointer. It's not safe to call this function - * against some type of pages, e.g. slab pages or ex-slab pages or - * kmem pages. - */ -static inline struct mem_cgroup *__page_memcg(struct page *page) -{ - unsigned long memcg_data = page->memcg_data; - - VM_BUG_ON_PAGE(PageSlab(page), page); - VM_BUG_ON_PAGE(memcg_data & MEMCG_DATA_OBJCGS, page); - VM_BUG_ON_PAGE(memcg_data & MEMCG_DATA_KMEM, page); - - return (struct mem_cgroup *)(memcg_data & ~MEMCG_DATA_FLAGS_MASK); -} - -/* - * __page_objcg - get the object cgroup associated with a kmem page + * page_objcg - get the object cgroup associated with page * @page: a pointer to the page struct * * Returns a pointer to the object cgroup associated with the page, * or NULL. This function assumes that the page is known to have a - * proper object cgroup pointer. It's not safe to call this function - * against some type of pages, e.g. slab pages or ex-slab pages or - * LRU pages. + * proper object cgroup pointer. */ -static inline struct obj_cgroup *__page_objcg(struct page *page) +static inline struct obj_cgroup *page_objcg(struct page *page) { unsigned long memcg_data = page->memcg_data; VM_BUG_ON_PAGE(PageSlab(page), page); VM_BUG_ON_PAGE(memcg_data & MEMCG_DATA_OBJCGS, page); - VM_BUG_ON_PAGE(!(memcg_data & MEMCG_DATA_KMEM), page); return (struct obj_cgroup *)(memcg_data & ~MEMCG_DATA_FLAGS_MASK); } @@ -436,23 +410,35 @@ static inline struct obj_cgroup *__page_objcg(struct page *page) * proper memory cgroup pointer. It's not safe to call this function * against some type of pages, e.g. slab pages or ex-slab pages. * - * For a non-kmem page any of the following ensures page and memcg binding - * stability: + * For a page any of the following ensures page and objcg binding stability: * * - the page lock * - LRU isolation * - lock_page_memcg() * - exclusive reference * - * For a kmem page a caller should hold an rcu read lock to protect memcg - * associated with a kmem page from being released. + * Based on the stable binding of page and objcg, for a page any of the + * following ensures page and memcg binding stability: + * + * - css_set_lock + * - cgroup_mutex + * - the lruvec lock + * - the split queue lock (only THP page) + * + * If the caller only want to ensure that the page counters of memcg are + * updated correctly, ensure that the binding stability of page and objcg + * is sufficient. 
+ * + * A caller should hold an rcu read lock (In addition, regions of code across + * which interrupts, preemption, or softirqs have been disabled also serve as + * RCU read-side critical sections) to protect memcg associated with a page + * from being released. */ static inline struct mem_cgroup *page_memcg(struct page *page) { - if (PageMemcgKmem(page)) - return obj_cgroup_memcg(__page_objcg(page)); - else - return __page_memcg(page); + struct obj_cgroup *objcg = page_objcg(page); + + return objcg ? obj_cgroup_memcg(objcg) : NULL; } /* @@ -465,6 +451,8 @@ static inline struct mem_cgroup *page_memcg(struct page *page) * is known to have a proper memory cgroup pointer. It's not safe to call * this function against some type of pages, e.g. slab pages or ex-slab * pages. + * + * The page and objcg or memcg binding rules can refer to page_memcg(). */ static inline struct mem_cgroup *get_mem_cgroup_from_page(struct page *page) { @@ -488,22 +476,20 @@ static inline struct mem_cgroup *get_mem_cgroup_from_page(struct page *page) * or NULL. This function assumes that the page is known to have a * proper memory cgroup pointer. It's not safe to call this function * against some type of pages, e.g. slab pages or ex-slab pages. + * + * The page and objcg or memcg binding rules can refer to page_memcg(). */ static inline struct mem_cgroup *page_memcg_rcu(struct page *page) { unsigned long memcg_data = READ_ONCE(page->memcg_data); + struct obj_cgroup *objcg; VM_BUG_ON_PAGE(PageSlab(page), page); WARN_ON_ONCE(!rcu_read_lock_held()); - if (memcg_data & MEMCG_DATA_KMEM) { - struct obj_cgroup *objcg; - - objcg = (void *)(memcg_data & ~MEMCG_DATA_FLAGS_MASK); - return obj_cgroup_memcg(objcg); - } + objcg = (void *)(memcg_data & ~MEMCG_DATA_FLAGS_MASK); - return (struct mem_cgroup *)(memcg_data & ~MEMCG_DATA_FLAGS_MASK); + return objcg ? obj_cgroup_memcg(objcg) : NULL; } /* @@ -516,16 +502,10 @@ static inline struct mem_cgroup *page_memcg_rcu(struct page *page) * has an associated memory cgroup pointer or an object cgroups vector or * an object cgroup. * - * For a non-kmem page any of the following ensures page and memcg binding - * stability: - * - * - the page lock - * - LRU isolation - * - lock_page_memcg() - * - exclusive reference + * The page and objcg or memcg binding rules can refer to page_memcg(). * - * For a kmem page a caller should hold an rcu read lock to protect memcg - * associated with a kmem page from being released. + * A caller should hold an rcu read lock to protect memcg associated with a + * page from being released. */ static inline struct mem_cgroup *page_memcg_check(struct page *page) { @@ -534,18 +514,14 @@ static inline struct mem_cgroup *page_memcg_check(struct page *page) * for slab pages, READ_ONCE() should be used here. */ unsigned long memcg_data = READ_ONCE(page->memcg_data); + struct obj_cgroup *objcg; if (memcg_data & MEMCG_DATA_OBJCGS) return NULL; - if (memcg_data & MEMCG_DATA_KMEM) { - struct obj_cgroup *objcg; - - objcg = (void *)(memcg_data & ~MEMCG_DATA_FLAGS_MASK); - return obj_cgroup_memcg(objcg); - } + objcg = (void *)(memcg_data & ~MEMCG_DATA_FLAGS_MASK); - return (struct mem_cgroup *)(memcg_data & ~MEMCG_DATA_FLAGS_MASK); + return objcg ? 
obj_cgroup_memcg(objcg) : NULL; } #ifdef CONFIG_MEMCG_KMEM diff --git a/mm/huge_memory.c b/mm/huge_memory.c index 434cc7283a64..a47c97a48951 100644 --- a/mm/huge_memory.c +++ b/mm/huge_memory.c @@ -486,6 +486,8 @@ pmd_t maybe_pmd_mkwrite(pmd_t pmd, struct vm_area_struct *vma) } #ifdef CONFIG_MEMCG +static struct shrinker deferred_split_shrinker; + static inline struct mem_cgroup *split_queue_to_memcg(struct deferred_split *queue) { return container_of(queue, struct mem_cgroup, deferred_split_queue); @@ -547,6 +549,52 @@ static struct deferred_split *lock_split_queue_irqsave(struct page *page, return queue; } + +static void memcg_reparent_split_queue_lock(struct mem_cgroup *memcg, + struct mem_cgroup *parent) +{ + spin_lock(&memcg->deferred_split_queue.split_queue_lock); + spin_lock(&parent->deferred_split_queue.split_queue_lock); +} + +static void memcg_reparent_split_queue_unlock(struct mem_cgroup *memcg, + struct mem_cgroup *parent) +{ + spin_unlock(&parent->deferred_split_queue.split_queue_lock); + spin_unlock(&memcg->deferred_split_queue.split_queue_lock); +} + +static void memcg_reparent_split_queue(struct mem_cgroup *memcg, + struct mem_cgroup *parent) +{ + int nid; + struct deferred_split *src, *dst; + + src = &memcg->deferred_split_queue; + dst = &parent->deferred_split_queue; + + if (!src->split_queue_len) + return; + + list_splice_tail_init(&src->split_queue, &dst->split_queue); + dst->split_queue_len += src->split_queue_len; + src->split_queue_len = 0; + + for_each_node(nid) + memcg_set_shrinker_bit(parent, nid, deferred_split_shrinker.id); +} + +static struct memcg_reparent_ops split_queue_reparent_ops = { + .lock = memcg_reparent_split_queue_lock, + .unlock = memcg_reparent_split_queue_unlock, + .reparent = memcg_reparent_split_queue, +}; + +static void __init split_queue_reparent_init(void) +{ + register_memcg_repatent(&split_queue_reparent_ops); +} +core_initcall(split_queue_reparent_init); #else static struct deferred_split *lock_split_queue(struct page *page) { diff --git a/mm/memcontrol.c b/mm/memcontrol.c index d5701117794a..71689243242f 100644 --- a/mm/memcontrol.c +++ b/mm/memcontrol.c @@ -338,6 +338,77 @@ static struct obj_cgroup *obj_cgroup_alloc(void) return objcg; } +static void memcg_reparent_lruvec_lock(struct mem_cgroup *memcg, + struct mem_cgroup *parent) +{ + int nid; + + for_each_node(nid) { + spin_lock(&mem_cgroup_lruvec(memcg, NODE_DATA(nid))->lru_lock); + spin_lock(&mem_cgroup_lruvec(parent, NODE_DATA(nid))->lru_lock); + } +} + +static void memcg_reparent_lruvec_unlock(struct mem_cgroup *memcg, + struct mem_cgroup *parent) +{ + int nid; + + for_each_node(nid) { + spin_unlock(&mem_cgroup_lruvec(parent, NODE_DATA(nid))->lru_lock); + spin_unlock(&mem_cgroup_lruvec(memcg, NODE_DATA(nid))->lru_lock); + } +} + +static void lruvec_reparent_lru(struct lruvec *src, struct lruvec *dst, + enum lru_list lru) +{ + int zid; + struct mem_cgroup_per_node *mz_src, *mz_dst; + + mz_src = container_of(src, struct mem_cgroup_per_node, lruvec); + mz_dst = container_of(dst, struct mem_cgroup_per_node, lruvec); + + list_splice_tail_init(&src->lists[lru], &dst->lists[lru]); + + for (zid = 0; zid < MAX_NR_ZONES; zid++) { + mz_dst->lru_zone_size[zid][lru] += mz_src->lru_zone_size[zid][lru]; + mz_src->lru_zone_size[zid][lru] = 0; + } +} + +static void memcg_reparent_lruvec(struct mem_cgroup *memcg, + struct mem_cgroup *parent) +{ + int nid; + + for_each_node(nid) { + enum lru_list lru; + struct lruvec *src, *dst; + + src = mem_cgroup_lruvec(memcg, NODE_DATA(nid)); + dst = 
mem_cgroup_lruvec(parent, NODE_DATA(nid)); + + dst->anon_cost += src->anon_cost; + dst->file_cost += src->file_cost; + + for_each_lru(lru) + lruvec_reparent_lru(src, dst, lru); + } +} + +static struct memcg_reparent_ops lruvec_reparent_ops = { + .lock = memcg_reparent_lruvec_lock, + .unlock = memcg_reparent_lruvec_unlock, + .reparent = memcg_reparent_lruvec, +}; + +static void __init lruvec_reparent_init(void) +{ + register_memcg_repatent(&lruvec_reparent_ops); +} +core_initcall(lruvec_reparent_init); + static LIST_HEAD(reparent_ops_head); static void memcg_reparent_lock(struct mem_cgroup *memcg, @@ -2963,18 +3034,18 @@ static void cancel_charge(struct mem_cgroup *memcg, unsigned int nr_pages) } #endif -static void commit_charge(struct page *page, struct mem_cgroup *memcg) +static void commit_charge(struct page *page, struct obj_cgroup *objcg) { - VM_BUG_ON_PAGE(page_memcg(page), page); + VM_BUG_ON_PAGE(page_objcg(page), page); /* - * Any of the following ensures page's memcg stability: + * Any of the following ensures page's objcg stability: * * - the page lock * - LRU isolation * - lock_page_memcg() * - exclusive reference */ - page->memcg_data = (unsigned long)memcg; + page->memcg_data = (unsigned long)objcg; } static struct mem_cgroup *get_mem_cgroup_from_objcg(struct obj_cgroup *objcg) @@ -2991,6 +3062,21 @@ static struct mem_cgroup *get_mem_cgroup_from_objcg(struct obj_cgroup *objcg) return memcg; } +static struct obj_cgroup *get_obj_cgroup_from_memcg(struct mem_cgroup *memcg) +{ + struct obj_cgroup *objcg = NULL; + + rcu_read_lock(); + for (; memcg; memcg = parent_mem_cgroup(memcg)) { + objcg = rcu_dereference(memcg->objcg); + if (objcg && obj_cgroup_tryget(objcg)) + break; + } + rcu_read_unlock(); + + return objcg; +} + #ifdef CONFIG_MEMCG_KMEM int memcg_alloc_page_obj_cgroups(struct page *page, struct kmem_cache *s, gfp_t gfp, bool new_page) @@ -3088,12 +3174,15 @@ __always_inline struct obj_cgroup *get_obj_cgroup_from_current(void) else memcg = mem_cgroup_from_task(current); - for (; memcg != root_mem_cgroup; memcg = parent_mem_cgroup(memcg)) { - objcg = rcu_dereference(memcg->objcg); - if (objcg && obj_cgroup_tryget(objcg)) - break; + if (mem_cgroup_is_root(memcg)) + goto out; + + objcg = get_obj_cgroup_from_memcg(memcg); + if (obj_cgroup_is_root(objcg)) { + obj_cgroup_put(objcg); objcg = NULL; } +out: rcu_read_unlock(); return objcg; @@ -3236,13 +3325,14 @@ int __memcg_kmem_charge_page(struct page *page, gfp_t gfp, int order) */ void __memcg_kmem_uncharge_page(struct page *page, int order) { - struct obj_cgroup *objcg; + struct obj_cgroup *objcg = page_objcg(page); unsigned int nr_pages = 1 << order; - if (!PageMemcgKmem(page)) + if (!objcg) return; - objcg = __page_objcg(page); + VM_BUG_ON_PAGE(!PageMemcgKmem(page), page); + objcg = page_objcg(page); obj_cgroup_uncharge_pages(objcg, nr_pages); page->memcg_data = 0; obj_cgroup_put(objcg); @@ -3379,17 +3469,16 @@ void obj_cgroup_uncharge(struct obj_cgroup *objcg, size_t size) */ void mem_cgroup_split_huge_fixup(struct page *head) { - struct mem_cgroup *memcg = get_mem_cgroup_from_page(head); + struct obj_cgroup *objcg = page_objcg(head); int i; if (mem_cgroup_disabled()) return; for (i = 1; i < HPAGE_PMD_NR; i++) { - css_get(&memcg->css); - head[i].memcg_data = (unsigned long)memcg; + obj_cgroup_get(objcg); + commit_charge(&head[i], objcg); } - css_put(&memcg->css); } #endif /* CONFIG_TRANSPARENT_HUGEPAGE */ @@ -5755,10 +5844,10 @@ static int mem_cgroup_move_account(struct page *page, */ smp_mb(); - css_get(&to->css); - 
css_put(&from->css); + obj_cgroup_get(to->objcg); + obj_cgroup_put(from->objcg); - page->memcg_data = (unsigned long)to; + page->memcg_data = (unsigned long)to->objcg; __unlock_page_memcg(from); @@ -6796,6 +6885,7 @@ int mem_cgroup_charge(struct page *page, struct mm_struct *mm, gfp_t gfp_mask) { unsigned int nr_pages = thp_nr_pages(page); struct mem_cgroup *memcg = NULL; + struct obj_cgroup *objcg; int ret = 0; if (mem_cgroup_disabled()) @@ -6813,7 +6903,7 @@ int mem_cgroup_charge(struct page *page, struct mm_struct *mm, gfp_t gfp_mask) * removal, which in turn serializes uncharging. */ VM_BUG_ON_PAGE(!PageLocked(page), page); - if (page_memcg(compound_head(page))) + if (page_objcg(compound_head(page))) goto out; id = lookup_swap_cgroup_id(ent); @@ -6827,12 +6917,16 @@ int mem_cgroup_charge(struct page *page, struct mm_struct *mm, gfp_t gfp_mask) if (!memcg) memcg = get_mem_cgroup_from_mm(mm); - ret = try_charge(memcg, gfp_mask, nr_pages); - if (ret) - goto out_put; + objcg = get_obj_cgroup_from_memcg(memcg); + /* Do not account at the root objcg level. */ + if (!obj_cgroup_is_root(objcg)) { + ret = try_charge(memcg, gfp_mask, nr_pages); + if (ret) + goto out_put; + } - css_get(&memcg->css); - commit_charge(page, memcg); + obj_cgroup_get(objcg); + commit_charge(page, objcg); local_irq_disable(); mem_cgroup_charge_statistics(memcg, page, nr_pages); @@ -6862,13 +6956,14 @@ int mem_cgroup_charge(struct page *page, struct mm_struct *mm, gfp_t gfp_mask) } out_put: + obj_cgroup_put(objcg); css_put(&memcg->css); out: return ret; } struct uncharge_gather { - struct mem_cgroup *memcg; + struct obj_cgroup *objcg; unsigned long nr_memory; unsigned long pgpgout; unsigned long nr_kmem; @@ -6883,63 +6978,56 @@ static inline void uncharge_gather_clear(struct uncharge_gather *ug) static void uncharge_batch(const struct uncharge_gather *ug) { unsigned long flags; + struct mem_cgroup *memcg; + rcu_read_lock(); + memcg = obj_cgroup_memcg(ug->objcg); if (ug->nr_memory) { - page_counter_uncharge(&ug->memcg->memory, ug->nr_memory); + page_counter_uncharge(&memcg->memory, ug->nr_memory); if (do_memsw_account()) - page_counter_uncharge(&ug->memcg->memsw, ug->nr_memory); + page_counter_uncharge(&memcg->memsw, ug->nr_memory); if (!cgroup_subsys_on_dfl(memory_cgrp_subsys) && ug->nr_kmem) - page_counter_uncharge(&ug->memcg->kmem, ug->nr_kmem); - memcg_oom_recover(ug->memcg); + page_counter_uncharge(&memcg->kmem, ug->nr_kmem); + memcg_oom_recover(memcg); } local_irq_save(flags); - __count_memcg_events(ug->memcg, PGPGOUT, ug->pgpgout); - __this_cpu_add(ug->memcg->vmstats_percpu->nr_page_events, ug->nr_memory); - memcg_check_events(ug->memcg, ug->dummy_page); + __count_memcg_events(memcg, PGPGOUT, ug->pgpgout); + __this_cpu_add(memcg->vmstats_percpu->nr_page_events, ug->nr_memory); + memcg_check_events(memcg, ug->dummy_page); local_irq_restore(flags); + rcu_read_unlock(); /* drop reference from uncharge_page */ - css_put(&ug->memcg->css); + obj_cgroup_put(ug->objcg); } static void uncharge_page(struct page *page, struct uncharge_gather *ug) { unsigned long nr_pages; - struct mem_cgroup *memcg; struct obj_cgroup *objcg; VM_BUG_ON_PAGE(PageLRU(page), page); /* * Nobody should be changing or seriously looking at - * page memcg or objcg at this point, we have fully - * exclusive access to the page. + * page objcg at this point, we have fully exclusive + * access to the page. 
*/ - if (PageMemcgKmem(page)) { - objcg = __page_objcg(page); - /* - * This get matches the put at the end of the function and - * kmem pages do not hold memcg references anymore. - */ - memcg = get_mem_cgroup_from_objcg(objcg); - } else { - memcg = __page_memcg(page); - } - - if (!memcg) + objcg = page_objcg(page); + if (!objcg) return; - if (ug->memcg != memcg) { - if (ug->memcg) { + if (ug->objcg != objcg) { + if (ug->objcg) { uncharge_batch(ug); uncharge_gather_clear(ug); } - ug->memcg = memcg; + ug->objcg = objcg; ug->dummy_page = page; - /* pairs with css_put in uncharge_batch */ - css_get(&memcg->css); + /* pairs with obj_cgroup_put in uncharge_batch */ + obj_cgroup_get(objcg); } nr_pages = compound_nr(page); @@ -6947,19 +7035,15 @@ static void uncharge_page(struct page *page, struct uncharge_gather *ug) if (PageMemcgKmem(page)) { ug->nr_memory += nr_pages; ug->nr_kmem += nr_pages; - - page->memcg_data = 0; - obj_cgroup_put(objcg); } else { /* LRU pages aren't accounted at the root level */ - if (!mem_cgroup_is_root(memcg)) + if (!obj_cgroup_is_root(objcg)) ug->nr_memory += nr_pages; ug->pgpgout++; - - page->memcg_data = 0; } - css_put(&memcg->css); + page->memcg_data = 0; + obj_cgroup_put(objcg); } /** @@ -6976,7 +7060,7 @@ void mem_cgroup_uncharge(struct page *page) return; /* Don't touch page->lru of any random page, pre-check: */ - if (!page_memcg(page)) + if (!page_objcg(page)) return; uncharge_gather_clear(&ug); @@ -7002,7 +7086,7 @@ void mem_cgroup_uncharge_list(struct list_head *page_list) uncharge_gather_clear(&ug); list_for_each_entry(page, page_list, lru) uncharge_page(page, &ug); - if (ug.memcg) + if (ug.objcg) uncharge_batch(&ug); } @@ -7019,6 +7103,7 @@ void mem_cgroup_uncharge_list(struct list_head *page_list) void mem_cgroup_migrate(struct page *oldpage, struct page *newpage) { struct mem_cgroup *memcg; + struct obj_cgroup *objcg; unsigned int nr_pages; unsigned long flags; @@ -7032,32 +7117,34 @@ void mem_cgroup_migrate(struct page *oldpage, struct page *newpage) return; /* Page cache replacement: new page already charged? */ - if (page_memcg(newpage)) + if (page_objcg(newpage)) return; - memcg = get_mem_cgroup_from_page(oldpage); - VM_WARN_ON_ONCE_PAGE(!memcg, oldpage); - if (!memcg) + objcg = page_objcg(oldpage); + VM_WARN_ON_ONCE_PAGE(!objcg, oldpage); + if (!objcg) return; /* Force-charge the new page. 
The old one will be freed soon */ nr_pages = thp_nr_pages(newpage); - if (!mem_cgroup_is_root(memcg)) { + rcu_read_lock(); + memcg = obj_cgroup_memcg(objcg); + + if (!obj_cgroup_is_root(objcg)) { page_counter_charge(&memcg->memory, nr_pages); if (do_memsw_account()) page_counter_charge(&memcg->memsw, nr_pages); } - css_get(&memcg->css); - commit_charge(newpage, memcg); + obj_cgroup_get(objcg); + commit_charge(newpage, objcg); local_irq_save(flags); mem_cgroup_charge_statistics(memcg, newpage, nr_pages); memcg_check_events(memcg, newpage); local_irq_restore(flags); - - css_put(&memcg->css); + rcu_read_unlock(); } DEFINE_STATIC_KEY_FALSE(memcg_sockets_enabled_key); @@ -7234,6 +7321,7 @@ static struct mem_cgroup *mem_cgroup_id_get_online(struct mem_cgroup *memcg) void mem_cgroup_swapout(struct page *page, swp_entry_t entry) { struct mem_cgroup *memcg, *swap_memcg; + struct obj_cgroup *objcg; unsigned int nr_entries; unsigned short oldid; @@ -7246,15 +7334,16 @@ void mem_cgroup_swapout(struct page *page, swp_entry_t entry) if (cgroup_subsys_on_dfl(memory_cgrp_subsys)) return; + objcg = page_objcg(page); + VM_WARN_ON_ONCE_PAGE(!objcg, page); + if (!objcg) + return; + /* * Interrupts should be disabled by the caller (see the comments below), * which can serve as RCU read-side critical sections. */ - memcg = page_memcg(page); - - VM_WARN_ON_ONCE_PAGE(!memcg, page); - if (!memcg) - return; + memcg = obj_cgroup_memcg(objcg); /* * In case the memcg owning these pages has been offlined and doesn't @@ -7273,7 +7362,7 @@ void mem_cgroup_swapout(struct page *page, swp_entry_t entry) page->memcg_data = 0; - if (!mem_cgroup_is_root(memcg)) + if (!obj_cgroup_is_root(objcg)) page_counter_uncharge(&memcg->memory, nr_entries); if (!cgroup_memory_noswap && memcg != swap_memcg) { @@ -7292,7 +7381,7 @@ void mem_cgroup_swapout(struct page *page, swp_entry_t entry) mem_cgroup_charge_statistics(memcg, page, -nr_entries); memcg_check_events(memcg, page); - css_put(&memcg->css); + obj_cgroup_put(objcg); } /** From patchwork Tue Mar 30 10:15:30 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Muchun Song X-Patchwork-Id: 12172225 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-16.6 required=3.0 tests=BAYES_00,DKIM_INVALID, DKIM_SIGNED,HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,URIBL_BLOCKED,USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 34C79C433DB for ; Tue, 30 Mar 2021 10:23:36 +0000 (UTC) Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by mail.kernel.org (Postfix) with ESMTP id 8BE2661955 for ; Tue, 30 Mar 2021 10:23:35 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org 8BE2661955 Authentication-Results: mail.kernel.org; dmarc=fail (p=none dis=none) header.from=bytedance.com Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=owner-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix) id 140B76B00A4; Tue, 30 Mar 2021 06:23:35 -0400 (EDT) Received: by kanga.kvack.org (Postfix, from userid 40) id 116606B00A5; Tue, 30 Mar 2021 06:23:35 -0400 (EDT) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id EABA76B00A6; Tue, 30 Mar 2021 06:23:34 -0400 (EDT) 
X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com (smtprelay0180.hostedemail.com [216.40.44.180]) by kanga.kvack.org (Postfix) with ESMTP id C7E906B00A4 for ; Tue, 30 Mar 2021 06:23:34 -0400 (EDT) Received: from smtpin04.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay04.hostedemail.com (Postfix) with ESMTP id 7DD3B442B for ; Tue, 30 Mar 2021 10:23:34 +0000 (UTC) X-FDA: 77976153948.04.B6D8CB4 Received: from mail-pj1-f52.google.com (mail-pj1-f52.google.com [209.85.216.52]) by imf15.hostedemail.com (Postfix) with ESMTP id 2389AA0009F5 for ; Tue, 30 Mar 2021 10:23:32 +0000 (UTC) Received: by mail-pj1-f52.google.com with SMTP id bt4so7498014pjb.5 for ; Tue, 30 Mar 2021 03:23:33 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=bytedance-com.20150623.gappssmtp.com; s=20150623; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; bh=fT0weKC4DEAC+5wWcmHzQMivFqo3ZXEo4WZWR/hlXUA=; b=u/1orgfVtogLUV0OALnuu2NoPCBwa8seczxkr+RtaBGXBOZtJ1jgP3GKQMJTAW8mC6 CONRxw1VpiZ4cME0tZqapqteuKMppgWKpAftOMHbKHNpfGLH1SZWT8+hHU3E3POnUPLY 0ldx5LKfAzaEhFHbYLEWzxIMMTdN6mNunSx++cnE2xprtZKVZulaNF8Prcgr2lYs3B2v WLjywPAPLvmhtnMxWpBXbQEV3EEmS8yr+AZppBRi7i/DRnSiWss8ECtz2SpSiiFLbL1t /Yi92Ba+sXnxIXOzxQZD77hwr1yws6beAetsgCln9tL22W2+n8QwbFuOH6Qtz880dVd2 7O5w== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=fT0weKC4DEAC+5wWcmHzQMivFqo3ZXEo4WZWR/hlXUA=; b=Po8I/yJtL0mXXin56+zD/vs7NZBRmZfuSG7XSQmn2X9YaSOpDFKQkn5H2YDUfYEzos GBtimM8PYtGGIYe+EGJqzCC3lpew/km4Us/S4AsRcQcMtskK6V4nIT/ZTShl81g0LXIo OcdqSDFXbiJ+x8+OTdWtjg5ksX57WrhMP28DMWWhCX9vohOZZbRqI7krHc4kQz/3NuFG jA0lrHKMDBItxZA98UoRPe/70MD/GuaqwTnBFR831QM0bNTOW1R89nDGotADvhMMUdtI vCr6k6QXu53S0L5BhR6XYNNKK1VneHWjuMoMSI9BU/t1yTihXbHDhn/JNZvdN4hWzVfr lEEQ== X-Gm-Message-State: AOAM530ar03wLxUyYCPkLrGE+YYQCvRpWv+fsUi78l7K5rQdNGu14b0S u0I+UoYOBGqmYiQOXVUkOuU0fw== X-Google-Smtp-Source: ABdhPJy9F8UoZuP7LT1OlYDjJNsmaCuqqa+etWBFX0//1uxYfgub3NBcMG1hlbTr2kKZKz+qdPA8qg== X-Received: by 2002:a17:90a:c207:: with SMTP id e7mr3536823pjt.188.1617099812967; Tue, 30 Mar 2021 03:23:32 -0700 (PDT) Received: from localhost.localdomain ([2408:8445:ad30:68d8:c87f:ca1b:dc00:4730]) by smtp.gmail.com with ESMTPSA id k10sm202259pfk.205.2021.03.30.03.23.10 (version=TLS1_2 cipher=ECDHE-ECDSA-AES128-GCM-SHA256 bits=128/128); Tue, 30 Mar 2021 03:23:32 -0700 (PDT) From: Muchun Song To: guro@fb.com, hannes@cmpxchg.org, mhocko@kernel.org, akpm@linux-foundation.org, shakeelb@google.com, vdavydov.dev@gmail.com Cc: linux-kernel@vger.kernel.org, linux-mm@kvack.org, duanxiongchun@bytedance.com, Muchun Song Subject: [RFC PATCH 14/15] mm: memcontrol: rename {un}lock_page_memcg() to {un}lock_page_objcg() Date: Tue, 30 Mar 2021 18:15:30 +0800 Message-Id: <20210330101531.82752-15-songmuchun@bytedance.com> X-Mailer: git-send-email 2.21.0 (Apple Git-122) In-Reply-To: <20210330101531.82752-1-songmuchun@bytedance.com> References: <20210330101531.82752-1-songmuchun@bytedance.com> MIME-Version: 1.0 X-Rspamd-Server: rspam01 X-Rspamd-Queue-Id: 2389AA0009F5 X-Stat-Signature: f57smzsggkopwjzohtfr4cnjcq3zgzxi Received-SPF: none (bytedance.com>: No applicable sender policy available) receiver=imf15; identity=mailfrom; envelope-from=""; helo=mail-pj1-f52.google.com; client-ip=209.85.216.52 X-HE-DKIM-Result: pass/pass X-HE-Tag: 1617099812-562291 X-Bogosity: Ham, tests=bogofilter, 
spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: Now the lock_page_memcg() does not lock a page and memcg binding, it actually lock a page and objcg binding. So rename lock_page_memcg() to lock_page_objcg(). This is just code cleanup without any functionality changes. Signed-off-by: Muchun Song --- Documentation/admin-guide/cgroup-v1/memory.rst | 2 +- fs/buffer.c | 10 ++-- fs/iomap/buffered-io.c | 4 +- include/linux/memcontrol.h | 22 +++++---- mm/filemap.c | 2 +- mm/huge_memory.c | 4 +- mm/memcontrol.c | 65 ++++++++++++++++---------- mm/page-writeback.c | 26 +++++------ mm/rmap.c | 14 +++--- 9 files changed, 85 insertions(+), 64 deletions(-) diff --git a/Documentation/admin-guide/cgroup-v1/memory.rst b/Documentation/admin-guide/cgroup-v1/memory.rst index 0936412e044e..578823f2c764 100644 --- a/Documentation/admin-guide/cgroup-v1/memory.rst +++ b/Documentation/admin-guide/cgroup-v1/memory.rst @@ -291,7 +291,7 @@ Lock order is as follows: Page lock (PG_locked bit of page->flags) mm->page_table_lock or split pte_lock - lock_page_memcg (memcg->move_lock) + lock_page_objcg (memcg->move_lock) mapping->i_pages lock lruvec->lru_lock. diff --git a/fs/buffer.c b/fs/buffer.c index 790ba6660d10..8b6d66511690 100644 --- a/fs/buffer.c +++ b/fs/buffer.c @@ -595,7 +595,7 @@ EXPORT_SYMBOL(mark_buffer_dirty_inode); * If warn is true, then emit a warning if the page is not uptodate and has * not been truncated. * - * The caller must hold lock_page_memcg(). + * The caller must hold lock_page_objcg(). */ void __set_page_dirty(struct page *page, struct address_space *mapping, int warn) @@ -660,14 +660,14 @@ int __set_page_dirty_buffers(struct page *page) * Lock out page's memcg migration to keep PageDirty * synchronized with per-memcg dirty page counters. */ - lock_page_memcg(page); + lock_page_objcg(page); newly_dirty = !TestSetPageDirty(page); spin_unlock(&mapping->private_lock); if (newly_dirty) __set_page_dirty(page, mapping, 1); - unlock_page_memcg(page); + unlock_page_objcg(page); if (newly_dirty) __mark_inode_dirty(mapping->host, I_DIRTY_PAGES); @@ -1168,13 +1168,13 @@ void mark_buffer_dirty(struct buffer_head *bh) struct page *page = bh->b_page; struct address_space *mapping = NULL; - lock_page_memcg(page); + lock_page_objcg(page); if (!TestSetPageDirty(page)) { mapping = page_mapping(page); if (mapping) __set_page_dirty(page, mapping, 0); } - unlock_page_memcg(page); + unlock_page_objcg(page); if (mapping) __mark_inode_dirty(mapping->host, I_DIRTY_PAGES); } diff --git a/fs/iomap/buffered-io.c b/fs/iomap/buffered-io.c index 16a1e82e3aeb..8a3ffd38d9e0 100644 --- a/fs/iomap/buffered-io.c +++ b/fs/iomap/buffered-io.c @@ -653,11 +653,11 @@ iomap_set_page_dirty(struct page *page) * Lock out page's memcg migration to keep PageDirty * synchronized with per-memcg dirty page counters. */ - lock_page_memcg(page); + lock_page_objcg(page); newly_dirty = !TestSetPageDirty(page); if (newly_dirty) __set_page_dirty(page, mapping, 0); - unlock_page_memcg(page); + unlock_page_objcg(page); if (newly_dirty) __mark_inode_dirty(mapping->host, I_DIRTY_PAGES); diff --git a/include/linux/memcontrol.h b/include/linux/memcontrol.h index cd9e9ff6c2bf..688a8e1fa9b6 100644 --- a/include/linux/memcontrol.h +++ b/include/linux/memcontrol.h @@ -410,11 +410,12 @@ static inline struct obj_cgroup *page_objcg(struct page *page) * proper memory cgroup pointer. It's not safe to call this function * against some type of pages, e.g. slab pages or ex-slab pages. 
* - * For a page any of the following ensures page and objcg binding stability: + * For a page any of the following ensures page and objcg binding stability + * (But the page can be reparented to its parent memcg): * * - the page lock * - LRU isolation - * - lock_page_memcg() + * - lock_page_objcg() * - exclusive reference * * Based on the stable binding of page and objcg, for a page any of the @@ -947,9 +948,9 @@ void mem_cgroup_print_oom_group(struct mem_cgroup *memcg); extern bool cgroup_memory_noswap; #endif -struct mem_cgroup *lock_page_memcg(struct page *page); -void __unlock_page_memcg(struct mem_cgroup *memcg); -void unlock_page_memcg(struct page *page); +struct obj_cgroup *lock_page_objcg(struct page *page); +void __unlock_page_objcg(struct obj_cgroup *objcg); +void unlock_page_objcg(struct page *page); /* * idx can be of type enum memcg_stat_item or node_stat_item. @@ -1155,6 +1156,11 @@ void mem_cgroup_split_huge_fixup(struct page *head); struct mem_cgroup; +static inline struct obj_cgroup *page_objcg(struct page *page) +{ + return NULL; +} + static inline struct mem_cgroup *page_memcg(struct page *page) { return NULL; @@ -1375,16 +1381,16 @@ mem_cgroup_print_oom_meminfo(struct mem_cgroup *memcg) { } -static inline struct mem_cgroup *lock_page_memcg(struct page *page) +static inline struct obj_cgroup *lock_page_objcg(struct page *page) { return NULL; } -static inline void __unlock_page_memcg(struct mem_cgroup *memcg) +static inline void __unlock_page_objcg(struct obj_cgroup *objcg) { } -static inline void unlock_page_memcg(struct page *page) +static inline void unlock_page_objcg(struct page *page) { } diff --git a/mm/filemap.c b/mm/filemap.c index 925964b67583..c427de610860 100644 --- a/mm/filemap.c +++ b/mm/filemap.c @@ -110,7 +110,7 @@ * ->i_pages lock (page_remove_rmap->set_page_dirty) * bdi.wb->list_lock (page_remove_rmap->set_page_dirty) * ->inode->i_lock (page_remove_rmap->set_page_dirty) - * ->memcg->move_lock (page_remove_rmap->lock_page_memcg) + * ->memcg->move_lock (page_remove_rmap->lock_page_objcg) * bdi.wb->list_lock (zap_pte_range->set_page_dirty) * ->inode->i_lock (zap_pte_range->set_page_dirty) * ->private_lock (zap_pte_range->__set_page_dirty_buffers) diff --git a/mm/huge_memory.c b/mm/huge_memory.c index a47c97a48951..088511eaa326 100644 --- a/mm/huge_memory.c +++ b/mm/huge_memory.c @@ -2303,7 +2303,7 @@ static void __split_huge_pmd_locked(struct vm_area_struct *vma, pmd_t *pmd, atomic_inc(&page[i]._mapcount); } - lock_page_memcg(page); + lock_page_objcg(page); if (atomic_add_negative(-1, compound_mapcount_ptr(page))) { /* Last compound_mapcount is gone. 
*/ __mod_lruvec_page_state(page, NR_ANON_THPS, @@ -2314,7 +2314,7 @@ static void __split_huge_pmd_locked(struct vm_area_struct *vma, pmd_t *pmd, atomic_dec(&page[i]._mapcount); } } - unlock_page_memcg(page); + unlock_page_objcg(page); } smp_wmb(); /* make pte visible before pmd */ diff --git a/mm/memcontrol.c b/mm/memcontrol.c index 71689243242f..442b846dc7bc 100644 --- a/mm/memcontrol.c +++ b/mm/memcontrol.c @@ -1439,7 +1439,7 @@ int mem_cgroup_scan_tasks(struct mem_cgroup *memcg, * These functions are safe to use under any of the following conditions: * - page locked * - PageLRU cleared - * - lock_page_memcg() + * - lock_page_objcg() * - page->_refcount is zero */ struct lruvec *lock_page_lruvec(struct page *page) @@ -2255,20 +2255,22 @@ void mem_cgroup_print_oom_group(struct mem_cgroup *memcg) } /** - * lock_page_memcg - lock a page and memcg binding + * lock_page_objcg - lock a page and objcg binding * @page: the page * * This function protects unlocked LRU pages from being moved to - * another cgroup. + * another object cgroup. But the page can be reparented to its + * parent memcg. * - * It ensures lifetime of the returned memcg. Caller is responsible - * for the lifetime of the page; __unlock_page_memcg() is available + * It ensures lifetime of the returned objcg. Caller is responsible + * for the lifetime of the page; __unlock_page_objcg() is available * when @page might get freed inside the locked section. */ -struct mem_cgroup *lock_page_memcg(struct page *page) +struct obj_cgroup *lock_page_objcg(struct page *page) { struct page *head = compound_head(page); /* rmap on tail pages */ struct mem_cgroup *memcg; + struct obj_cgroup *objcg; unsigned long flags; /* @@ -2287,10 +2289,12 @@ struct mem_cgroup *lock_page_memcg(struct page *page) if (mem_cgroup_disabled()) return NULL; again: - memcg = page_memcg(head); - if (unlikely(!memcg)) + objcg = page_objcg(head); + if (unlikely(!objcg)) return NULL; + memcg = obj_cgroup_memcg(objcg); + #ifdef CONFIG_PROVE_LOCKING local_irq_save(flags); might_lock(&memcg->move_lock); @@ -2298,7 +2302,7 @@ struct mem_cgroup *lock_page_memcg(struct page *page) #endif if (atomic_read(&memcg->moving_account) <= 0) - return memcg; + return objcg; spin_lock_irqsave(&memcg->move_lock, flags); if (memcg != page_memcg(head)) { @@ -2309,23 +2313,34 @@ struct mem_cgroup *lock_page_memcg(struct page *page) /* * When charge migration first begins, we can have locked and * unlocked page stat updates happening concurrently. Track - * the task who has the lock for unlock_page_memcg(). + * the task who has the lock for unlock_page_objcg(). */ memcg->move_lock_task = current; memcg->move_lock_flags = flags; - return memcg; + /* + * The cgroup migration and memory cgroup offlining are serialized by + * cgroup_mutex. If we reach here, it means that we are race with cgroup + * migration (or we are cgroup migration) and the @page cannot be + * reparented to its parent memory cgroup. So during the whole process + * from lock_page_objcg(page) to unlock_page_objcg(page), page_memcg(page) + * and obj_cgroup_memcg(objcg) are stable. + */ + + return objcg; } -EXPORT_SYMBOL(lock_page_memcg); +EXPORT_SYMBOL(lock_page_objcg); /** - * __unlock_page_memcg - unlock and unpin a memcg - * @memcg: the memcg + * __unlock_page_objcg - unlock and unpin a objcg + * @objcg: the objcg * - * Unlock and unpin a memcg returned by lock_page_memcg(). + * Unlock and unpin a objcg returned by lock_page_objcg(). 
*/ -void __unlock_page_memcg(struct mem_cgroup *memcg) +void __unlock_page_objcg(struct obj_cgroup *objcg) { + struct mem_cgroup *memcg = objcg ? obj_cgroup_memcg(objcg) : NULL; + if (memcg && memcg->move_lock_task == current) { unsigned long flags = memcg->move_lock_flags; @@ -2339,16 +2354,16 @@ void __unlock_page_memcg(struct mem_cgroup *memcg) } /** - * unlock_page_memcg - unlock a page and memcg binding + * unlock_page_objcg - unlock a page and objcg binding * @page: the page */ -void unlock_page_memcg(struct page *page) +void unlock_page_objcg(struct page *page) { struct page *head = compound_head(page); - __unlock_page_memcg(page_memcg(head)); + __unlock_page_objcg(page_objcg(head)); } -EXPORT_SYMBOL(unlock_page_memcg); +EXPORT_SYMBOL(unlock_page_objcg); struct memcg_stock_pcp { struct mem_cgroup *cached; /* this never be root cgroup */ @@ -3042,7 +3057,7 @@ static void commit_charge(struct page *page, struct obj_cgroup *objcg) * * - the page lock * - LRU isolation - * - lock_page_memcg() + * - lock_page_objcg() * - exclusive reference */ page->memcg_data = (unsigned long)objcg; @@ -5785,7 +5800,7 @@ static int mem_cgroup_move_account(struct page *page, from_vec = mem_cgroup_lruvec(from, pgdat); to_vec = mem_cgroup_lruvec(to, pgdat); - lock_page_memcg(page); + lock_page_objcg(page); if (PageAnon(page)) { if (page_mapped(page)) { @@ -5837,7 +5852,7 @@ static int mem_cgroup_move_account(struct page *page, * with (un)charging, migration, LRU putback, or anything else * that would rely on a stable page's memory cgroup. * - * Note that lock_page_memcg is a memcg lock, not a page lock, + * Note that lock_page_objcg is a memcg lock, not a page lock, * to save space. As soon as we switch page's memory cgroup to a * new memcg that isn't locked, the above state can change * concurrently again. Make sure we're truly done with it. @@ -5849,7 +5864,7 @@ static int mem_cgroup_move_account(struct page *page, page->memcg_data = (unsigned long)to->objcg; - __unlock_page_memcg(from); + __unlock_page_objcg(from->objcg); ret = 0; @@ -6291,7 +6306,7 @@ static void mem_cgroup_move_charge(void) { lru_add_drain_all(); /* - * Signal lock_page_memcg() to take the memcg's move_lock + * Signal lock_page_objcg() to take the memcg's move_lock * while we're moving its pages to another memcg. Then wait * for already started RCU-only updates to finish. */ diff --git a/mm/page-writeback.c b/mm/page-writeback.c index f517e0669924..2a119afbf7fa 100644 --- a/mm/page-writeback.c +++ b/mm/page-writeback.c @@ -2413,7 +2413,7 @@ int __set_page_dirty_no_writeback(struct page *page) /* * Helper function for set_page_dirty family. * - * Caller must hold lock_page_memcg(). + * Caller must hold lock_page_objcg(). * * NOTE: This relies on being atomic wrt interrupts. */ @@ -2445,7 +2445,7 @@ void account_page_dirtied(struct page *page, struct address_space *mapping) /* * Helper function for deaccounting dirty page without writeback. * - * Caller must hold lock_page_memcg(). + * Caller must hold lock_page_objcg(). 
*/ void account_page_cleaned(struct page *page, struct address_space *mapping, struct bdi_writeback *wb) @@ -2472,13 +2472,13 @@ void account_page_cleaned(struct page *page, struct address_space *mapping, */ int __set_page_dirty_nobuffers(struct page *page) { - lock_page_memcg(page); + lock_page_objcg(page); if (!TestSetPageDirty(page)) { struct address_space *mapping = page_mapping(page); unsigned long flags; if (!mapping) { - unlock_page_memcg(page); + unlock_page_objcg(page); return 1; } @@ -2489,7 +2489,7 @@ int __set_page_dirty_nobuffers(struct page *page) __xa_set_mark(&mapping->i_pages, page_index(page), PAGECACHE_TAG_DIRTY); xa_unlock_irqrestore(&mapping->i_pages, flags); - unlock_page_memcg(page); + unlock_page_objcg(page); if (mapping->host) { /* !PageAnon && !swapper_space */ @@ -2497,7 +2497,7 @@ int __set_page_dirty_nobuffers(struct page *page) } return 1; } - unlock_page_memcg(page); + unlock_page_objcg(page); return 0; } EXPORT_SYMBOL(__set_page_dirty_nobuffers); @@ -2630,14 +2630,14 @@ void __cancel_dirty_page(struct page *page) struct bdi_writeback *wb; struct wb_lock_cookie cookie = {}; - lock_page_memcg(page); + lock_page_objcg(page); wb = unlocked_inode_to_wb_begin(inode, &cookie); if (TestClearPageDirty(page)) account_page_cleaned(page, mapping, wb); unlocked_inode_to_wb_end(inode, &cookie); - unlock_page_memcg(page); + unlock_page_objcg(page); } else { ClearPageDirty(page); } @@ -2722,11 +2722,11 @@ EXPORT_SYMBOL(clear_page_dirty_for_io); int test_clear_page_writeback(struct page *page) { struct address_space *mapping = page_mapping(page); - struct mem_cgroup *memcg; + struct obj_cgroup *objcg; struct lruvec *lruvec; int ret; - memcg = lock_page_memcg(page); + objcg = lock_page_objcg(page); lruvec = mem_cgroup_page_lruvec(page); if (mapping && mapping_use_writeback_tags(mapping)) { struct inode *inode = mapping->host; @@ -2759,7 +2759,7 @@ int test_clear_page_writeback(struct page *page) dec_zone_page_state(page, NR_ZONE_WRITE_PENDING); inc_node_page_state(page, NR_WRITTEN); } - __unlock_page_memcg(memcg); + __unlock_page_objcg(objcg); return ret; } @@ -2768,7 +2768,7 @@ int __test_set_page_writeback(struct page *page, bool keep_write) struct address_space *mapping = page_mapping(page); int ret, access_ret; - lock_page_memcg(page); + lock_page_objcg(page); if (mapping && mapping_use_writeback_tags(mapping)) { XA_STATE(xas, &mapping->i_pages, page_index(page)); struct inode *inode = mapping->host; @@ -2808,7 +2808,7 @@ int __test_set_page_writeback(struct page *page, bool keep_write) inc_lruvec_page_state(page, NR_WRITEBACK); inc_zone_page_state(page, NR_ZONE_WRITE_PENDING); } - unlock_page_memcg(page); + unlock_page_objcg(page); access_ret = arch_make_page_accessible(page); /* * If writeback has been triggered on a page that cannot be made diff --git a/mm/rmap.c b/mm/rmap.c index b0fc27e77d6d..3c2488e1081c 100644 --- a/mm/rmap.c +++ b/mm/rmap.c @@ -31,7 +31,7 @@ * swap_lock (in swap_duplicate, swap_info_get) * mmlist_lock (in mmput, drain_mmlist and others) * mapping->private_lock (in __set_page_dirty_buffers) - * lock_page_memcg move_lock (in __set_page_dirty_buffers) + * lock_page_objcg move_lock (in __set_page_dirty_buffers) * i_pages lock (widely used) * lruvec->lru_lock (in lock_page_lruvec_irq) * inode->i_lock (in set_page_dirty's __mark_inode_dirty) @@ -1127,7 +1127,7 @@ void do_page_add_anon_rmap(struct page *page, bool first; if (unlikely(PageKsm(page))) - lock_page_memcg(page); + lock_page_objcg(page); else VM_BUG_ON_PAGE(!PageLocked(page), page); @@ -1155,7 
+1155,7 @@ void do_page_add_anon_rmap(struct page *page, } if (unlikely(PageKsm(page))) { - unlock_page_memcg(page); + unlock_page_objcg(page); return; } @@ -1215,7 +1215,7 @@ void page_add_file_rmap(struct page *page, bool compound) int i, nr = 1; VM_BUG_ON_PAGE(compound && !PageTransHuge(page), page); - lock_page_memcg(page); + lock_page_objcg(page); if (compound && PageTransHuge(page)) { int nr_pages = thp_nr_pages(page); @@ -1244,7 +1244,7 @@ void page_add_file_rmap(struct page *page, bool compound) } __mod_lruvec_page_state(page, NR_FILE_MAPPED, nr); out: - unlock_page_memcg(page); + unlock_page_objcg(page); } static void page_remove_file_rmap(struct page *page, bool compound) @@ -1345,7 +1345,7 @@ static void page_remove_anon_compound_rmap(struct page *page) */ void page_remove_rmap(struct page *page, bool compound) { - lock_page_memcg(page); + lock_page_objcg(page); if (!PageAnon(page)) { page_remove_file_rmap(page, compound); @@ -1384,7 +1384,7 @@ void page_remove_rmap(struct page *page, bool compound) * faster for those pages still in swapcache. */ out: - unlock_page_memcg(page); + unlock_page_objcg(page); } /* From patchwork Tue Mar 30 10:15:31 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Muchun Song X-Patchwork-Id: 12172227 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-13.8 required=3.0 tests=BAYES_00,DKIM_INVALID, DKIM_SIGNED,HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,UNWANTED_LANGUAGE_BODY, URIBL_BLOCKED,USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 46AA3C433C1 for ; Tue, 30 Mar 2021 10:23:50 +0000 (UTC) Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by mail.kernel.org (Postfix) with ESMTP id D363461955 for ; Tue, 30 Mar 2021 10:23:49 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org D363461955 Authentication-Results: mail.kernel.org; dmarc=fail (p=none dis=none) header.from=bytedance.com Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=owner-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix) id 620B56B00A6; Tue, 30 Mar 2021 06:23:49 -0400 (EDT) Received: by kanga.kvack.org (Postfix, from userid 40) id 5F8206B00A7; Tue, 30 Mar 2021 06:23:49 -0400 (EDT) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 498686B00A8; Tue, 30 Mar 2021 06:23:49 -0400 (EDT) X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com (smtprelay0028.hostedemail.com [216.40.44.28]) by kanga.kvack.org (Postfix) with ESMTP id 2D55F6B00A6 for ; Tue, 30 Mar 2021 06:23:49 -0400 (EDT) Received: from smtpin26.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay02.hostedemail.com (Postfix) with ESMTP id E313152B0 for ; Tue, 30 Mar 2021 10:23:48 +0000 (UTC) X-FDA: 77976154536.26.50FD08B Received: from mail-pg1-f177.google.com (mail-pg1-f177.google.com [209.85.215.177]) by imf09.hostedemail.com (Postfix) with ESMTP id A08886000104 for ; Tue, 30 Mar 2021 10:23:46 +0000 (UTC) Received: by mail-pg1-f177.google.com with SMTP id h25so11408568pgm.3 for ; Tue, 30 Mar 2021 03:23:48 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=bytedance-com.20150623.gappssmtp.com; s=20150623; 
h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; bh=zl+lZA6IhpCUdnBZY8wrlIzpChNsz+rycB3YEkTf9d8=; b=jFmF1YweAfzCVW5eqeWZ6Wo5H2khjQVMjiHyanNyg6cfLPNR69RzaFvzTv5BCuFeeY kn4hrQEOWLZyoALmJ/9YjSod4oMunh8bu9E1BHxTdkklBsUMbCUHK3qmC0FpAmeYKmXW 3QUJOTX1TY2viaAiH6YZ35hoI3TzS5dTvChl4oM0uHXFybdk0T+8Z1ezdg9HRstUInBF G+V6B9oQ86nel9+i/N94+u4H27ha13gSyfVSElV4s1a9UjwxrM7I3ckM+Px2Tq30y79e QBRt4L0utaWecE4FCLY1xbwc1c+eiKB2kzTEQx6uc8BikMu809d+RPCs9fnGrS8ms/t6 usGA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=zl+lZA6IhpCUdnBZY8wrlIzpChNsz+rycB3YEkTf9d8=; b=GJaH/EVVHaD8fTl603hQ7s4yyVqoR1eONJj/Ct9cBaDBt0fTovbgeiMgS6cJe0OUtD MVC16VtdOgle/XJYpRhapoZhSw8rcmENfwDzgBEN5s/q+VA80PaPkqa4AJX4tbEQifLr Bs0V98nLavLMSy3JuB06FmwLGKme90AvP0gl+9Qau+ZGtw2DfJt1Jq2vc8uhGn1QWXsS im/sUUGMVMG+QNUAB5VQZczQwv/9KDd83+OtJ1/RvgED+KJ64WfdvFy3q7pGr4N/QuhM y9L2mY57v6+UMpgeArpa2nYwyyUHKU6mGep7PHzkTn5uSOVI3mMNqEu9wi887NvmArJB VmDQ== X-Gm-Message-State: AOAM533cpw47XW8tXOYixp50bz+yeZS5WcfyLLh6pXv4TbYIYx0I085T DFOjAC36kjn5XPeYzjzzf5iEEUpb4TysUAskMmU= X-Google-Smtp-Source: ABdhPJzQlV5q8vaRIDxPUz0GLjpfRCT4vVajWn0k9HKajxPtZV/x9+IEGJqZR3HRCpycu/Y5CXYmNQ== X-Received: by 2002:aa7:93af:0:b029:1ef:1bb9:b1a1 with SMTP id x15-20020aa793af0000b02901ef1bb9b1a1mr29893022pff.49.1617099827671; Tue, 30 Mar 2021 03:23:47 -0700 (PDT) Received: from localhost.localdomain ([2408:8445:ad30:68d8:c87f:ca1b:dc00:4730]) by smtp.gmail.com with ESMTPSA id k10sm202259pfk.205.2021.03.30.03.23.33 (version=TLS1_2 cipher=ECDHE-ECDSA-AES128-GCM-SHA256 bits=128/128); Tue, 30 Mar 2021 03:23:47 -0700 (PDT) From: Muchun Song To: guro@fb.com, hannes@cmpxchg.org, mhocko@kernel.org, akpm@linux-foundation.org, shakeelb@google.com, vdavydov.dev@gmail.com Cc: linux-kernel@vger.kernel.org, linux-mm@kvack.org, duanxiongchun@bytedance.com, Muchun Song Subject: [RFC PATCH 15/15] mm: lru: add VM_BUG_ON_PAGE to lru maintenance function Date: Tue, 30 Mar 2021 18:15:31 +0800 Message-Id: <20210330101531.82752-16-songmuchun@bytedance.com> X-Mailer: git-send-email 2.21.0 (Apple Git-122) In-Reply-To: <20210330101531.82752-1-songmuchun@bytedance.com> References: <20210330101531.82752-1-songmuchun@bytedance.com> MIME-Version: 1.0 X-Rspamd-Server: rspam03 X-Rspamd-Queue-Id: A08886000104 X-Stat-Signature: jhhoq3iq5mp6yfszx4wie4z5pd4tjotm Received-SPF: none (bytedance.com>: No applicable sender policy available) receiver=imf09; identity=mailfrom; envelope-from=""; helo=mail-pg1-f177.google.com; client-ip=209.85.215.177 X-HE-DKIM-Result: pass/pass X-HE-Tag: 1617099826-1475 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: We need to make sure that the page is deleted from or added to the correct lruvec list. So add a VM_BUG_ON_PAGE() to catch invalid users. 
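For illustration only (the function below is hypothetical and not part of this patch), the locking discipline that the new assertion enforces looks like the following: the lruvec passed to the LRU list helpers must be the one whose lru_lock is currently held for that page.

	/*
	 * Sketch only: moves a page to the tail of its current LRU list.
	 * Assumes PageLRU(page) is set and stable across this section.
	 */
	static void example_move_page_to_lru_tail(struct page *page)
	{
		struct lruvec *lruvec;

		/* Takes the lru_lock of the lruvec the page is bound to. */
		lruvec = lock_page_lruvec_irq(page);

		/*
		 * Passing a stale lruvec here (e.g. one looked up before the
		 * page could be reparented) now triggers the VM_BUG_ON_PAGE()
		 * added by this patch.
		 */
		del_page_from_lru_list(page, lruvec);
		add_page_to_lru_list_tail(page, lruvec);

		unlock_page_lruvec_irq(lruvec);
	}
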
Signed-off-by: Muchun Song --- include/linux/mm_inline.h | 6 ++++++ mm/vmscan.c | 3 ++- 2 files changed, 8 insertions(+), 1 deletion(-) diff --git a/include/linux/mm_inline.h b/include/linux/mm_inline.h index 355ea1ee32bd..d19870448287 100644 --- a/include/linux/mm_inline.h +++ b/include/linux/mm_inline.h @@ -84,6 +84,8 @@ static __always_inline void add_page_to_lru_list(struct page *page, { enum lru_list lru = page_lru(page); + VM_BUG_ON_PAGE(!lruvec_holds_page_lru_lock(page, lruvec), page); + update_lru_size(lruvec, lru, page_zonenum(page), thp_nr_pages(page)); list_add(&page->lru, &lruvec->lists[lru]); } @@ -93,6 +95,8 @@ static __always_inline void add_page_to_lru_list_tail(struct page *page, { enum lru_list lru = page_lru(page); + VM_BUG_ON_PAGE(!lruvec_holds_page_lru_lock(page, lruvec), page); + update_lru_size(lruvec, lru, page_zonenum(page), thp_nr_pages(page)); list_add_tail(&page->lru, &lruvec->lists[lru]); } @@ -100,6 +104,8 @@ static __always_inline void add_page_to_lru_list_tail(struct page *page, static __always_inline void del_page_from_lru_list(struct page *page, struct lruvec *lruvec) { + VM_BUG_ON_PAGE(!lruvec_holds_page_lru_lock(page, lruvec), page); + list_del(&page->lru); update_lru_size(lruvec, page_lru(page), page_zonenum(page), -thp_nr_pages(page)); diff --git a/mm/vmscan.c b/mm/vmscan.c index fea6b43bc1f9..0a4a3072d092 100644 --- a/mm/vmscan.c +++ b/mm/vmscan.c @@ -1656,6 +1656,8 @@ static unsigned long isolate_lru_pages(unsigned long nr_to_scan, page = lru_to_page(src); prefetchw_prev_lru_page(page, src, flags); + VM_BUG_ON_PAGE(!lruvec_holds_page_lru_lock(page, lruvec), page); + nr_pages = compound_nr(page); total_scan += nr_pages; @@ -1866,7 +1868,6 @@ static unsigned noinline_for_stack move_pages_to_lru(struct lruvec *lruvec, * All pages were isolated from the same lruvec (and isolation * inhibits memcg migration). */ - VM_BUG_ON_PAGE(!lruvec_holds_page_lru_lock(page, lruvec), page); add_page_to_lru_list(page, lruvec); nr_pages = thp_nr_pages(page); nr_moved += nr_pages;