From patchwork Mon Aug 13 06:58:10 2018
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Konstantin Khlebnikov
X-Patchwork-Id: 10563905
Subject: [PATCH RFC 2/3] proc/kpagecgroup: report also inode numbers of offline cgroups
From: Konstantin Khlebnikov
To: linux-mm@kvack.org, linux-kernel@vger.kernel.org, cgroups@vger.kernel.org
Cc: Tejun Heo, Michal Hocko, Vladimir Davydov, Roman Gushchin, Johannes Weiner
Date: Mon, 13 Aug 2018 09:58:10 +0300
Message-ID: <153414348994.737150.10057219558779418929.stgit@buzz>
In-Reply-To: <153414348591.737150.14229960913953276515.stgit@buzz>
References: <153414348591.737150.14229960913953276515.stgit@buzz>
User-Agent: StGit/0.17.1-dirty

By default this interface reports the inode number of the closest online
ancestor if the cgroup is
offline (removed). Information about the real owner is required for detecting
which pages keep a removed cgroup alive.

This patch adds a per-file mode which is changed by writing 64-bit flags into
the opened /proc/kpagecgroup. For now only the first bit is used.

Signed-off-by: Konstantin Khlebnikov
---
 Documentation/admin-guide/mm/pagemap.rst |    3 +++
 fs/proc/page.c                           |   24 ++++++++++++++++++++++--
 include/linux/memcontrol.h               |    2 +-
 mm/memcontrol.c                          |    5 +++--
 mm/memory-failure.c                      |    2 +-
 5 files changed, 30 insertions(+), 6 deletions(-)

diff --git a/Documentation/admin-guide/mm/pagemap.rst b/Documentation/admin-guide/mm/pagemap.rst
index 577af85beb41..b39d841ac560 100644
--- a/Documentation/admin-guide/mm/pagemap.rst
+++ b/Documentation/admin-guide/mm/pagemap.rst
@@ -80,6 +80,9 @@ There are four components to pagemap:
    memory cgroup each page is charged to, indexed by PFN. Only available when
    CONFIG_MEMCG is set.
 
+   For offline (removed) cgroup this returns inode number of closest online
+   ancestor. Write 64-bit flag 1 into opened file for getting real owners.
+
 Short descriptions to the page flags
 ====================================
 
diff --git a/fs/proc/page.c b/fs/proc/page.c
index 792c78a49174..337f526fcc27 100644
--- a/fs/proc/page.c
+++ b/fs/proc/page.c
@@ -248,6 +248,7 @@ static const struct file_operations proc_kpageflags_operations = {
 static ssize_t kpagecgroup_read(struct file *file, char __user *buf,
 				size_t count, loff_t *ppos)
 {
+	unsigned long flags = (unsigned long)file->private_data;
 	u64 __user *out = (u64 __user *)buf;
 	struct page *ppage;
 	unsigned long src = *ppos;
@@ -267,7 +268,7 @@ static ssize_t kpagecgroup_read(struct file *file, char __user *buf,
 			ppage = NULL;
 
 		if (ppage)
-			ino = page_cgroup_ino(ppage);
+			ino = page_cgroup_ino(ppage, !(flags & 1));
 		else
 			ino = 0;
 
@@ -289,9 +290,28 @@ static ssize_t kpagecgroup_read(struct file *file, char __user *buf,
 	return ret;
 }
 
+static ssize_t kpagecgroup_write(struct file *file, const char __user *buf,
+				 size_t count, loff_t *ppos)
+{
+	u64 flags;
+
+	if (count != 8)
+		return -EINVAL;
+
+	if (copy_from_user(&flags, buf, sizeof(flags)))
+		return -EFAULT;
+
+	if (flags > 1)
+		return -EINVAL;
+
+	file->private_data = (void *)(unsigned long)flags;
+	return count;
+}
+
 static const struct file_operations proc_kpagecgroup_operations = {
 	.llseek = mem_lseek,
 	.read = kpagecgroup_read,
+	.write = kpagecgroup_write,
 };
 
 #endif /* CONFIG_MEMCG */
@@ -300,7 +320,7 @@ static int __init proc_page_init(void)
 	proc_create("kpagecount", S_IRUSR, NULL, &proc_kpagecount_operations);
 	proc_create("kpageflags", S_IRUSR, NULL, &proc_kpageflags_operations);
 #ifdef CONFIG_MEMCG
-	proc_create("kpagecgroup", S_IRUSR, NULL, &proc_kpagecgroup_operations);
+	proc_create("kpagecgroup", 0600, NULL, &proc_kpagecgroup_operations);
 #endif
 	return 0;
 }
diff --git a/include/linux/memcontrol.h b/include/linux/memcontrol.h
index 6c6fb116e925..a7c40522bef0 100644
--- a/include/linux/memcontrol.h
+++ b/include/linux/memcontrol.h
@@ -444,7 +444,7 @@ static inline bool mm_match_cgroup(struct mm_struct *mm,
 }
 
 struct cgroup_subsys_state *mem_cgroup_css_from_page(struct page *page);
-ino_t page_cgroup_ino(struct page *page);
+ino_t page_cgroup_ino(struct page *page, bool online);
 
 static inline bool mem_cgroup_online(struct mem_cgroup *memcg)
 {
diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index 19a4348974a4..7ef6ea9d5e4a 100644
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -333,6 +333,7 @@ struct cgroup_subsys_state *mem_cgroup_css_from_page(struct page *page)
 /**
  * page_cgroup_ino - return inode number of the memcg a page is charged to
  * @page: the page
+ * @online: return closest online ancestor
  *
  * Look up the closest online ancestor of the memory cgroup @page is charged to
  * and return its inode number or 0 if @page is not charged to any cgroup. It
@@ -343,14 +344,14 @@ struct cgroup_subsys_state *mem_cgroup_css_from_page(struct page *page)
  * after page_cgroup_ino() returns, so it only should be used by callers that
  * do not care (such as procfs interfaces).
  */
-ino_t page_cgroup_ino(struct page *page)
+ino_t page_cgroup_ino(struct page *page, bool online)
 {
 	struct mem_cgroup *memcg;
 	unsigned long ino = 0;
 
 	rcu_read_lock();
 	memcg = READ_ONCE(page->mem_cgroup);
-	while (memcg && !(memcg->css.flags & CSS_ONLINE))
+	while (memcg && online && !(memcg->css.flags & CSS_ONLINE))
 		memcg = parent_mem_cgroup(memcg);
 	if (memcg)
 		ino = cgroup_ino(memcg->css.cgroup);
diff --git a/mm/memory-failure.c b/mm/memory-failure.c
index 9d142b9b86dc..bd09c447e0ec 100644
--- a/mm/memory-failure.c
+++ b/mm/memory-failure.c
@@ -139,7 +139,7 @@ static int hwpoison_filter_task(struct page *p)
 	if (!hwpoison_filter_memcg)
 		return 0;
 
-	if (page_cgroup_ino(p) != hwpoison_filter_memcg)
+	if (page_cgroup_ino(p, true) != hwpoison_filter_memcg)
 		return -EINVAL;
 
 	return 0;
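
For illustration only, not part of the patch: a minimal userspace sketch of how
the per-file mode above might be used. It assumes the interface behaves as this
RFC describes (one 64-bit entry per PFN, an 8-byte write of flag value 1 to
select real owners). The helper names and the KPAGECGROUP_REAL_OWNER macro are
hypothetical and do not exist in any kernel header.

```c
#define _FILE_OFFSET_BITS 64	/* PFN * 8 can exceed a 32-bit off_t */
#define _XOPEN_SOURCE 700	/* declare pread() under strict -std= modes */
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical name for bit 0 of the flags word described in the patch. */
#define KPAGECGROUP_REAL_OWNER 1ULL

/* Entries are 64-bit and indexed by PFN, so entry N starts at byte N * 8. */
static off_t kpagecgroup_offset(uint64_t pfn)
{
	return (off_t)(pfn * sizeof(uint64_t));
}

/*
 * Select the per-file mode. The patch accepts only writes of exactly
 * 8 bytes and rejects flag values above 1 with EINVAL.
 * Returns 0 on success or a negative errno value.
 */
static int kpagecgroup_set_flags(int fd, uint64_t flags)
{
	ssize_t n = write(fd, &flags, sizeof(flags));

	if (n < 0)
		return -errno;
	return n == (ssize_t)sizeof(flags) ? 0 : -EIO;
}

/*
 * Read the cgroup inode number recorded for one PFN.
 * Returns 0 on success or a negative errno value; a short read
 * (for example past the last PFN) is reported as -EIO.
 */
static int kpagecgroup_read_ino(int fd, uint64_t pfn, uint64_t *ino)
{
	ssize_t n;

	assert(ino != NULL);
	n = pread(fd, ino, sizeof(*ino), kpagecgroup_offset(pfn));
	if (n < 0)
		return -errno;
	return n == (ssize_t)sizeof(*ino) ? 0 : -EIO;
}
```

A caller would open /proc/kpagecgroup with O_RDWR (the patch changes the file
mode to 0600, so this needs root), call
kpagecgroup_set_flags(fd, KPAGECGROUP_REAL_OWNER) once, and then call
kpagecgroup_read_ino(fd, pfn, &ino) for each PFN of interest. Because the mode
is stored in the file's private data, it applies only to that open file
description and other readers keep the default behaviour.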