From patchwork Thu Sep 2 21:55:07 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Andrew Morton X-Patchwork-Id: 12473025 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-15.8 required=3.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER, INCLUDES_PATCH,MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 4F076C433F5 for ; Thu, 2 Sep 2021 21:55:10 +0000 (UTC) Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by mail.kernel.org (Postfix) with ESMTP id 0285160E8B for ; Thu, 2 Sep 2021 21:55:09 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.4.1 mail.kernel.org 0285160E8B Authentication-Results: mail.kernel.org; dmarc=none (p=none dis=none) header.from=linux-foundation.org Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=kvack.org Received: by kanga.kvack.org (Postfix) id 78F326B00E9; Thu, 2 Sep 2021 17:55:09 -0400 (EDT) Received: by kanga.kvack.org (Postfix, from userid 40) id 740736B00EA; Thu, 2 Sep 2021 17:55:09 -0400 (EDT) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 606218D0001; Thu, 2 Sep 2021 17:55:09 -0400 (EDT) X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com (smtprelay0142.hostedemail.com [216.40.44.142]) by kanga.kvack.org (Postfix) with ESMTP id 4E2CA6B00E9 for ; Thu, 2 Sep 2021 17:55:09 -0400 (EDT) Received: from smtpin16.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay03.hostedemail.com (Postfix) with ESMTP id 0C9A982F4BF5 for ; Thu, 2 Sep 2021 21:55:09 +0000 (UTC) X-FDA: 78543989538.16.F519C43 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by imf10.hostedemail.com (Postfix) with ESMTP id B980A6001986 for ; Thu, 2 Sep 2021 21:55:08 +0000 (UTC) Received: by mail.kernel.org (Postfix) with ESMTPSA id C58D8610A0; Thu, 2 Sep 2021 21:55:07 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=linux-foundation.org; s=korg; t=1630619708; bh=9Khd0fiL6Y0w4zEeD14hyDQqzQXIFxHV6a86me1iQ00=; h=Date:From:To:Subject:In-Reply-To:From; b=tOxB9lvSvXhJP6ZDhH+Nf3AP/dJOy4Ky9xWVXo7+J72mcjhyP5RO3lMqyG0+VGDo7 aluU3VPMoOY2IfMF9SP06g41fXbEvi1FsBP1HsfBJpF3UYD1tAvln6/npfwjncByW+ 5ekoBjVhsHWGLH72lHFiutLbZwXhhEPHXumAQ2Iw= Date: Thu, 02 Sep 2021 14:55:07 -0700 From: Andrew Morton To: akpm@linux-foundation.org, hannes@cmpxchg.org, linux-mm@kvack.org, mhocko@kernel.org, mm-commits@vger.kernel.org, nglaive@gmail.com, shakeelb@google.com, shenwenbo@zju.edu.cn, torvalds@linux-foundation.org, vdavydov.dev@gmail.com Subject: [patch 098/212] memcg: charge fs_context and legacy_fs_context Message-ID: <20210902215507.FVhZ-cq1V%akpm@linux-foundation.org> In-Reply-To: <20210902144820.78957dff93d7bea620d55a89@linux-foundation.org> User-Agent: s-nail v14.8.16 Authentication-Results: imf10.hostedemail.com; dkim=pass header.d=linux-foundation.org header.s=korg header.b=tOxB9lvS; dmarc=none; spf=pass (imf10.hostedemail.com: domain of akpm@linux-foundation.org designates 198.145.29.99 as permitted sender) smtp.mailfrom=akpm@linux-foundation.org X-Rspamd-Server: rspam03 X-Rspamd-Queue-Id: B980A6001986 X-Stat-Signature: rxih74u97es7ep4ws3f6krg7mgasoas4 X-HE-Tag: 1630619708-823976 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: From: Yutian Yang Subject: memcg: charge fs_context and legacy_fs_context This patch adds accounting flags to fs_context and legacy_fs_context allocation sites so that kernel could correctly charge these objects. We have written a PoC to demonstrate the effect of the missing-charging bugs. The PoC takes around 1,200MB unaccounted memory, while it is charged for only 362MB memory usage. We evaluate the PoC on QEMU x86_64 v5.2.90 + Linux kernel v5.10.19 + Debian buster. All the limitations including ulimits and sysctl variables are set as default. Specifically, the hard NOFILE limit and nr_open in sysctl are both 1,048,576. /*------------------------- POC code ----------------------------*/ #define _GNU_SOURCE #include #include #include #include #include #include #include #include #include #include #include #include #define errExit(msg) do { perror(msg); exit(EXIT_FAILURE); \ } while (0) #define STACK_SIZE (8 * 1024) #ifndef __NR_fsopen #define __NR_fsopen 430 #endif static inline int fsopen(const char *fs_name, unsigned int flags) { return syscall(__NR_fsopen, fs_name, flags); } static char thread_stack[512][STACK_SIZE]; int thread_fn(void* arg) { for (int i = 0; i< 800000; ++i) { int fsfd = fsopen("nfs", FSOPEN_CLOEXEC); if (fsfd == -1) { errExit("fsopen"); } } while(1); return 0; } int main(int argc, char *argv[]) { int thread_pid; for (int i = 0; i < 1; ++i) { thread_pid = clone(thread_fn, thread_stack[i] + STACK_SIZE, \ SIGCHLD, NULL); } while(1); return 0; } /*-------------------------- end --------------------------------*/ Link: https://lkml.kernel.org/r/1626517201-24086-1-git-send-email-nglaive@gmail.com Signed-off-by: Yutian Yang Reviewed-by: Shakeel Butt Cc: Michal Hocko Cc: Johannes Weiner Cc: Vladimir Davydov Cc: Signed-off-by: Andrew Morton --- fs/fs_context.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) --- a/fs/fs_context.c~memcg-charge-fs_context-and-legacy_fs_context +++ a/fs/fs_context.c @@ -254,7 +254,7 @@ static struct fs_context *alloc_fs_conte struct fs_context *fc; int ret = -ENOMEM; - fc = kzalloc(sizeof(struct fs_context), GFP_KERNEL); + fc = kzalloc(sizeof(struct fs_context), GFP_KERNEL_ACCOUNT); if (!fc) return ERR_PTR(-ENOMEM); @@ -649,7 +649,7 @@ const struct fs_context_operations legac */ static int legacy_init_fs_context(struct fs_context *fc) { - fc->fs_private = kzalloc(sizeof(struct legacy_fs_context), GFP_KERNEL); + fc->fs_private = kzalloc(sizeof(struct legacy_fs_context), GFP_KERNEL_ACCOUNT); if (!fc->fs_private) return -ENOMEM; fc->ops = &legacy_fs_context_ops;