From patchwork Wed Sep 21 16:59:53 2022
X-Patchwork-Submitter: Yafang Shao <laoar.shao@gmail.com>
X-Patchwork-Id: 12984012
X-Patchwork-State: RFC
From: Yafang Shao <laoar.shao@gmail.com>
To: ast@kernel.org, daniel@iogearbox.net, andrii@kernel.org, kafai@fb.com,
    songliubraving@fb.com, yhs@fb.com, john.fastabend@gmail.com,
    kpsingh@kernel.org, sdf@google.com, haoluo@google.com, jolsa@kernel.org,
    hannes@cmpxchg.org, mhocko@kernel.org, roman.gushchin@linux.dev,
    shakeelb@google.com, songmuchun@bytedance.com, akpm@linux-foundation.org,
    tj@kernel.org, lizefan.x@bytedance.com
Cc: cgroups@vger.kernel.org, netdev@vger.kernel.org, bpf@vger.kernel.org,
    linux-mm@kvack.org, Yafang Shao <laoar.shao@gmail.com>
Subject: [RFC PATCH bpf-next 01/10] bpf: Introduce new helper bpf_map_put_memcg()
Date: Wed, 21 Sep 2022 16:59:53 +0000
Message-Id: <20220921170002.29557-2-laoar.shao@gmail.com>
In-Reply-To: <20220921170002.29557-1-laoar.shao@gmail.com>
References: <20220921170002.29557-1-laoar.shao@gmail.com>

Replace the open-coded mem_cgroup_put() with a new helper,
bpf_map_put_memcg(), which pairs with bpf_map_get_memcg() and makes the
code clearer.

Signed-off-by: Yafang Shao <laoar.shao@gmail.com>
---
 kernel/bpf/syscall.c | 11 ++++++++---
 1 file changed, 8 insertions(+), 3 deletions(-)

diff --git a/kernel/bpf/syscall.c b/kernel/bpf/syscall.c
index dab156f..70d5f70 100644
--- a/kernel/bpf/syscall.c
+++ b/kernel/bpf/syscall.c
@@ -441,6 +441,11 @@ static struct mem_cgroup *bpf_map_get_memcg(const struct bpf_map *map)
 	return root_mem_cgroup;
 }
 
+static void bpf_map_put_memcg(struct mem_cgroup *memcg)
+{
+	mem_cgroup_put(memcg);
+}
+
 void *bpf_map_kmalloc_node(const struct bpf_map *map, size_t size, gfp_t flags,
 			   int node)
 {
@@ -451,7 +456,7 @@ void *bpf_map_kmalloc_node(const struct bpf_map *map, size_t size, gfp_t flags,
 	old_memcg = set_active_memcg(memcg);
 	ptr = kmalloc_node(size, flags | __GFP_ACCOUNT, node);
 	set_active_memcg(old_memcg);
-	mem_cgroup_put(memcg);
+	bpf_map_put_memcg(memcg);
 
 	return ptr;
 }
@@ -465,7 +470,7 @@ void *bpf_map_kzalloc(const struct bpf_map *map, size_t size, gfp_t flags)
 	old_memcg = set_active_memcg(memcg);
 	ptr = kzalloc(size, flags | __GFP_ACCOUNT);
 	set_active_memcg(old_memcg);
-	mem_cgroup_put(memcg);
+	bpf_map_put_memcg(memcg);
 
 	return ptr;
 }
@@ -480,7 +485,7 @@ void __percpu *bpf_map_alloc_percpu(const struct bpf_map *map, size_t size,
 	old_memcg = set_active_memcg(memcg);
 	ptr = __alloc_percpu_gfp(size, align, flags | __GFP_ACCOUNT);
 	set_active_memcg(old_memcg);
-	mem_cgroup_put(memcg);
+	bpf_map_put_memcg(memcg);
 
 	return ptr;
 }
From patchwork Wed Sep 21 16:59:54 2022
X-Patchwork-Submitter: Yafang Shao <laoar.shao@gmail.com>
X-Patchwork-Id: 12984013
X-Patchwork-State: RFC
From: Yafang Shao <laoar.shao@gmail.com>
To: ast@kernel.org, daniel@iogearbox.net, andrii@kernel.org, kafai@fb.com,
    songliubraving@fb.com, yhs@fb.com, john.fastabend@gmail.com,
    kpsingh@kernel.org, sdf@google.com, haoluo@google.com, jolsa@kernel.org,
    hannes@cmpxchg.org, mhocko@kernel.org, roman.gushchin@linux.dev,
    shakeelb@google.com, songmuchun@bytedance.com, akpm@linux-foundation.org,
    tj@kernel.org, lizefan.x@bytedance.com
Cc: cgroups@vger.kernel.org, netdev@vger.kernel.org, bpf@vger.kernel.org,
    linux-mm@kvack.org, Yafang Shao <laoar.shao@gmail.com>
Subject: [RFC PATCH bpf-next 02/10] bpf: Define bpf_map_{get,put}_memcg for !CONFIG_MEMCG_KMEM
Date: Wed, 21 Sep 2022 16:59:54 +0000
Message-Id: <20220921170002.29557-3-laoar.shao@gmail.com>
In-Reply-To: <20220921170002.29557-1-laoar.shao@gmail.com>
References: <20220921170002.29557-1-laoar.shao@gmail.com>

Define bpf_map_{get,put}_memcg() so that they can also be used when
CONFIG_MEMCG_KMEM or CONFIG_MEMCG is not set, and move the two helpers
into include/linux/bpf.h so that other source files can use them as
well.
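For illustration only (this sketch is not part of the patch;
example_alloc() is a hypothetical caller in some other file): with the
helpers in include/linux/bpf.h, an allocation can be charged to a map's
memcg without any #ifdef in the caller, because the !CONFIG_MEMCG_KMEM
stubs turn the get into root_memcg() and the put into a no-op:

  #include <linux/bpf.h>
  #include <linux/sched/mm.h>	/* set_active_memcg() */
  #include <linux/slab.h>

  /* Hypothetical caller, shown only to illustrate the intended usage. */
  static void *example_alloc(const struct bpf_map *map, size_t size)
  {
  	struct mem_cgroup *memcg, *old_memcg;
  	void *ptr;

  	memcg = bpf_map_get_memcg(map);
  	old_memcg = set_active_memcg(memcg);
  	ptr = kzalloc(size, GFP_KERNEL | __GFP_ACCOUNT);
  	set_active_memcg(old_memcg);
  	bpf_map_put_memcg(memcg);	/* pairs with bpf_map_get_memcg() */

  	return ptr;
  }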
Signed-off-by: Yafang Shao <laoar.shao@gmail.com>
---
 include/linux/bpf.h        | 26 ++++++++++++++++++++++++++
 include/linux/memcontrol.h | 10 ++++++++++
 kernel/bpf/syscall.c       | 13 -------------
 3 files changed, 36 insertions(+), 13 deletions(-)

diff --git a/include/linux/bpf.h b/include/linux/bpf.h
index e0dbe0c..9ae1504 100644
--- a/include/linux/bpf.h
+++ b/include/linux/bpf.h
@@ -27,6 +27,7 @@
 #include
 #include
 #include
+#include <linux/memcontrol.h>
 
 struct bpf_verifier_env;
 struct bpf_verifier_log;
@@ -2656,4 +2657,29 @@ static inline void bpf_cgroup_atype_get(u32 attach_btf_id, int cgroup_atype) {}
 static inline void bpf_cgroup_atype_put(int cgroup_atype) {}
 #endif /* CONFIG_BPF_LSM */
 
+#ifdef CONFIG_MEMCG_KMEM
+static inline struct mem_cgroup *bpf_map_get_memcg(const struct bpf_map *map)
+{
+	if (map->objcg)
+		return get_mem_cgroup_from_objcg(map->objcg);
+
+	return root_mem_cgroup;
+}
+
+static inline void bpf_map_put_memcg(struct mem_cgroup *memcg)
+{
+	mem_cgroup_put(memcg);
+}
+
+#else
+static inline struct mem_cgroup *bpf_map_get_memcg(const struct bpf_map *map)
+{
+	return root_memcg();
+}
+
+static inline void bpf_map_put_memcg(struct mem_cgroup *memcg)
+{
+}
+#endif
+
 #endif /* _LINUX_BPF_H */
diff --git a/include/linux/memcontrol.h b/include/linux/memcontrol.h
index 6257867..d4a0ad3 100644
--- a/include/linux/memcontrol.h
+++ b/include/linux/memcontrol.h
@@ -361,6 +361,11 @@ struct mem_cgroup {
 
 extern struct mem_cgroup *root_mem_cgroup;
 
+static inline struct mem_cgroup *root_memcg(void)
+{
+	return root_mem_cgroup;
+}
+
 enum page_memcg_data_flags {
 	/* page->memcg_data is a pointer to an objcgs vector */
 	MEMCG_DATA_OBJCGS = (1UL << 0),
@@ -1158,6 +1163,11 @@ unsigned long mem_cgroup_soft_limit_reclaim(pg_data_t *pgdat, int order,
 #define MEM_CGROUP_ID_SHIFT	0
 #define MEM_CGROUP_ID_MAX	0
 
+static inline struct mem_cgroup *root_memcg(void)
+{
+	return NULL;
+}
+
 static inline struct mem_cgroup *folio_memcg(struct folio *folio)
 {
 	return NULL;
diff --git a/kernel/bpf/syscall.c b/kernel/bpf/syscall.c
index 70d5f70..574ddc3 100644
--- a/kernel/bpf/syscall.c
+++ b/kernel/bpf/syscall.c
@@ -433,19 +433,6 @@ static void bpf_map_release_memcg(struct bpf_map *map)
 		obj_cgroup_put(map->objcg);
 }
 
-static struct mem_cgroup *bpf_map_get_memcg(const struct bpf_map *map)
-{
-	if (map->objcg)
-		return get_mem_cgroup_from_objcg(map->objcg);
-
-	return root_mem_cgroup;
-}
-
-static void bpf_map_put_memcg(struct mem_cgroup *memcg)
-{
-	mem_cgroup_put(memcg);
-}
-
 void *bpf_map_kmalloc_node(const struct bpf_map *map, size_t size, gfp_t flags,
 			   int node)
 {

From patchwork Wed Sep 21 16:59:55 2022
X-Patchwork-Submitter: Yafang Shao <laoar.shao@gmail.com>
X-Patchwork-Id: 12984014
X-Patchwork-State: RFC
From: Yafang Shao <laoar.shao@gmail.com>
To: ast@kernel.org, daniel@iogearbox.net, andrii@kernel.org, kafai@fb.com,
    songliubraving@fb.com, yhs@fb.com, john.fastabend@gmail.com,
    kpsingh@kernel.org, sdf@google.com, haoluo@google.com, jolsa@kernel.org,
    hannes@cmpxchg.org, mhocko@kernel.org, roman.gushchin@linux.dev,
    shakeelb@google.com, songmuchun@bytedance.com, akpm@linux-foundation.org,
    tj@kernel.org, lizefan.x@bytedance.com
Cc: cgroups@vger.kernel.org, netdev@vger.kernel.org, bpf@vger.kernel.org,
    linux-mm@kvack.org, Yafang Shao <laoar.shao@gmail.com>
Subject: [RFC PATCH bpf-next 03/10] bpf: Call bpf_map_init_from_attr() immediately after map creation
Date: Wed, 21 Sep 2022 16:59:55 +0000
Message-Id: <20220921170002.29557-4-laoar.shao@gmail.com>
In-Reply-To: <20220921170002.29557-1-laoar.shao@gmail.com>
References: <20220921170002.29557-1-laoar.shao@gmail.com>
To make all other map-related memory allocations happen after the
memcg is saved in the map, we should save the memcg immediately after
map creation. But the map itself is created in bpf_map_area_alloc(),
within which we can't get at the enclosing bpf_map (except with a
pointer cast, which may be error prone), so we do it instead in
bpf_map_init_from_attr(), which is used by all bpf maps.

bpf_map_init_from_attr() is already executed immediately after
bpf_map_area_alloc() for almost all bpf maps; the exceptions are
bpf_struct_ops, devmap and hashtab, so this patch changes these three
maps. In the future the return type of bpf_map_init_from_attr() will be
changed from void to int for error cases, so placing it immediately
after bpf_map_area_alloc() will make the error handling easy.
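For illustration only (a sketch with a hypothetical "foo" map type, not
taken from the tree), the ordering this patch establishes in a
map_alloc callback looks like this:

  #include <linux/bpf.h>

  /* Hypothetical map type, used only for illustration. */
  struct bpf_foo_map {
  	struct bpf_map map;
  	void *data;
  };

  static struct bpf_map *foo_map_alloc(union bpf_attr *attr)
  {
  	struct bpf_foo_map *fmap;

  	fmap = bpf_map_area_alloc(sizeof(*fmap), NUMA_NO_NODE);
  	if (!fmap)
  		return ERR_PTR(-ENOMEM);

  	/* Initialize (and, after a later patch, save the memcg) first,
  	 * so every subsequent allocation can be charged to the map.
  	 */
  	bpf_map_init_from_attr(&fmap->map, attr);

  	fmap->data = bpf_map_area_alloc((u64) attr->max_entries * attr->value_size,
  					bpf_map_attr_numa_node(attr));
  	if (!fmap->data) {
  		bpf_map_area_free(fmap);
  		return ERR_PTR(-ENOMEM);
  	}

  	return &fmap->map;
  }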
Signed-off-by: Yafang Shao <laoar.shao@gmail.com>
---
 kernel/bpf/bpf_struct_ops.c | 2 +-
 kernel/bpf/devmap.c         | 5 ++---
 kernel/bpf/hashtab.c        | 4 ++--
 3 files changed, 5 insertions(+), 6 deletions(-)

diff --git a/kernel/bpf/bpf_struct_ops.c b/kernel/bpf/bpf_struct_ops.c
index 84b2d9d..36f24f8 100644
--- a/kernel/bpf/bpf_struct_ops.c
+++ b/kernel/bpf/bpf_struct_ops.c
@@ -624,6 +624,7 @@ static struct bpf_map *bpf_struct_ops_map_alloc(union bpf_attr *attr)
 
 	st_map->st_ops = st_ops;
 	map = &st_map->map;
+	bpf_map_init_from_attr(map, attr);
 
 	st_map->uvalue = bpf_map_area_alloc(vt->size, NUMA_NO_NODE);
 	st_map->links =
@@ -637,7 +638,6 @@ static struct bpf_map *bpf_struct_ops_map_alloc(union bpf_attr *attr)
 
 	mutex_init(&st_map->lock);
 	set_vm_flush_reset_perms(st_map->image);
-	bpf_map_init_from_attr(map, attr);
 
 	return map;
 }
diff --git a/kernel/bpf/devmap.c b/kernel/bpf/devmap.c
index f9a87dc..20decc7 100644
--- a/kernel/bpf/devmap.c
+++ b/kernel/bpf/devmap.c
@@ -127,9 +127,6 @@ static int dev_map_init_map(struct bpf_dtab *dtab, union bpf_attr *attr)
 	 */
 	attr->map_flags |= BPF_F_RDONLY_PROG;
-
-	bpf_map_init_from_attr(&dtab->map, attr);
-
 	if (attr->map_type == BPF_MAP_TYPE_DEVMAP_HASH) {
 		dtab->n_buckets = roundup_pow_of_two(dtab->map.max_entries);
 
@@ -167,6 +164,8 @@ static struct bpf_map *dev_map_alloc(union bpf_attr *attr)
 	if (!dtab)
 		return ERR_PTR(-ENOMEM);
 
+	bpf_map_init_from_attr(&dtab->map, attr);
+
 	err = dev_map_init_map(dtab, attr);
 	if (err) {
 		bpf_map_area_free(dtab);
diff --git a/kernel/bpf/hashtab.c b/kernel/bpf/hashtab.c
index 86aec20..6c0e4eb 100644
--- a/kernel/bpf/hashtab.c
+++ b/kernel/bpf/hashtab.c
@@ -514,10 +514,10 @@ static struct bpf_map *htab_map_alloc(union bpf_attr *attr)
 	if (!htab)
 		return ERR_PTR(-ENOMEM);
 
-	lockdep_register_key(&htab->lockdep_key);
-
 	bpf_map_init_from_attr(&htab->map, attr);
 
+	lockdep_register_key(&htab->lockdep_key);
+
 	if (percpu_lru) {
 		/* ensure each CPU's lru list has >=1 elements.
 		 * since we are at it, make each lru list has the same

From patchwork Wed Sep 21 16:59:56 2022
X-Patchwork-Submitter: Yafang Shao <laoar.shao@gmail.com>
X-Patchwork-Id: 12984015
X-Patchwork-State: RFC
From: Yafang Shao <laoar.shao@gmail.com>
To: ast@kernel.org, daniel@iogearbox.net, andrii@kernel.org, kafai@fb.com,
    songliubraving@fb.com, yhs@fb.com, john.fastabend@gmail.com,
    kpsingh@kernel.org, sdf@google.com, haoluo@google.com, jolsa@kernel.org,
    hannes@cmpxchg.org, mhocko@kernel.org, roman.gushchin@linux.dev,
    shakeelb@google.com, songmuchun@bytedance.com, akpm@linux-foundation.org,
    tj@kernel.org, lizefan.x@bytedance.com
Cc: cgroups@vger.kernel.org, netdev@vger.kernel.org, bpf@vger.kernel.org,
    linux-mm@kvack.org, Yafang Shao <laoar.shao@gmail.com>
Subject: [RFC PATCH bpf-next 04/10] bpf: Save memcg in bpf_map_init_from_attr()
Date: Wed, 21 Sep 2022 16:59:56 +0000
Message-Id: <20220921170002.29557-5-laoar.shao@gmail.com>
In-Reply-To: <20220921170002.29557-1-laoar.shao@gmail.com>
References: <20220921170002.29557-1-laoar.shao@gmail.com>

Move bpf_map_save_memcg() into bpf_map_init_from_attr(), so that all
other map-related memory allocations happen after the memcg is saved in
the map; the follow-up allocations can then get the memcg from the map.
To pair with this change, bpf_map_release_memcg() is moved into
bpf_map_area_free(), and a new parameter, struct bpf_map, is introduced
into bpf_map_area_free() for this purpose.
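For illustration only, continuing the hypothetical "foo" map sketch
from patch 03: the resulting convention is to pass the map only when
freeing the container itself, so that bpf_map_release_memcg() drops the
objcg reference exactly once, while inner members pass NULL:

  static void foo_map_free(struct bpf_map *map)
  {
  	struct bpf_foo_map *fmap = container_of(map, struct bpf_foo_map, map);

  	bpf_map_area_free(fmap->data, NULL);	/* member: no memcg release */
  	bpf_map_area_free(fmap, map);		/* container: releases map->objcg */
  }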
Signed-off-by: Yafang Shao <laoar.shao@gmail.com>
---
 include/linux/bpf.h            |  2 +-
 kernel/bpf/arraymap.c          |  8 +++---
 kernel/bpf/bloom_filter.c      |  2 +-
 kernel/bpf/bpf_local_storage.c |  4 +--
 kernel/bpf/bpf_struct_ops.c    |  6 ++---
 kernel/bpf/cpumap.c            |  6 ++---
 kernel/bpf/devmap.c            |  8 +++---
 kernel/bpf/hashtab.c           | 10 +++----
 kernel/bpf/local_storage.c     |  2 +-
 kernel/bpf/lpm_trie.c          |  2 +-
 kernel/bpf/offload.c           |  4 +--
 kernel/bpf/queue_stack_maps.c  |  2 +-
 kernel/bpf/reuseport_array.c   |  2 +-
 kernel/bpf/ringbuf.c           |  8 +++---
 kernel/bpf/stackmap.c          |  8 +++---
 kernel/bpf/syscall.c           | 60 ++++++++++++++++++++++--------------
 net/core/sock_map.c            | 12 ++++-----
 net/xdp/xskmap.c               |  2 +-
 18 files changed, 76 insertions(+), 72 deletions(-)

diff --git a/include/linux/bpf.h b/include/linux/bpf.h
index 9ae1504..d64d7a2 100644
--- a/include/linux/bpf.h
+++ b/include/linux/bpf.h
@@ -1698,7 +1698,7 @@ struct bpf_prog *bpf_prog_get_type_dev(u32 ufd, enum bpf_prog_type type,
 void bpf_map_put(struct bpf_map *map);
 void *bpf_map_area_alloc(u64 size, int numa_node);
 void *bpf_map_area_mmapable_alloc(u64 size, int numa_node);
-void bpf_map_area_free(void *base);
+void bpf_map_area_free(void *base, struct bpf_map *map);
 bool bpf_map_write_active(const struct bpf_map *map);
 void bpf_map_init_from_attr(struct bpf_map *map, union bpf_attr *attr);
 int generic_map_lookup_batch(struct bpf_map *map,
diff --git a/kernel/bpf/arraymap.c b/kernel/bpf/arraymap.c
index 832b265..8cf021e 100644
--- a/kernel/bpf/arraymap.c
+++ b/kernel/bpf/arraymap.c
@@ -147,7 +147,7 @@ static struct bpf_map *array_map_alloc(union bpf_attr *attr)
 	array->elem_size = elem_size;
 
 	if (percpu && bpf_array_alloc_percpu(array)) {
-		bpf_map_area_free(array);
+		bpf_map_area_free(array, &array->map);
 		return ERR_PTR(-ENOMEM);
 	}
 
@@ -445,9 +445,9 @@ static void array_map_free(struct bpf_map *map)
 		bpf_array_free_percpu(array);
 
 	if (array->map.map_flags & BPF_F_MMAPABLE)
-		bpf_map_area_free(array_map_vmalloc_addr(array));
+		bpf_map_area_free(array_map_vmalloc_addr(array), map);
 	else
-		bpf_map_area_free(array);
+		bpf_map_area_free(array, map);
 }
 
 static void array_map_seq_show_elem(struct bpf_map *map, void *key,
@@ -795,7 +795,7 @@ static void fd_array_map_free(struct bpf_map *map)
 	for (i = 0; i < array->map.max_entries; i++)
 		BUG_ON(array->ptrs[i] != NULL);
 
-	bpf_map_area_free(array);
+	bpf_map_area_free(array, map);
 }
 
 static void *fd_array_map_lookup_elem(struct bpf_map *map, void *key)
diff --git a/kernel/bpf/bloom_filter.c b/kernel/bpf/bloom_filter.c
index b9ea539..e59064d 100644
--- a/kernel/bpf/bloom_filter.c
+++ b/kernel/bpf/bloom_filter.c
@@ -168,7 +168,7 @@ static void bloom_map_free(struct bpf_map *map)
 	struct bpf_bloom_filter *bloom =
 		container_of(map, struct bpf_bloom_filter, map);
 
-	bpf_map_area_free(bloom);
+	bpf_map_area_free(bloom, map);
 }
 
 static void *bloom_map_lookup_elem(struct bpf_map *map, void *key)
diff --git a/kernel/bpf/bpf_local_storage.c b/kernel/bpf/bpf_local_storage.c
index 802fc15..7b68d846 100644
--- a/kernel/bpf/bpf_local_storage.c
+++ b/kernel/bpf/bpf_local_storage.c
@@ -582,7 +582,7 @@ void bpf_local_storage_map_free(struct bpf_local_storage_map *smap,
 	synchronize_rcu();
 
 	kvfree(smap->buckets);
-	bpf_map_area_free(smap);
+	bpf_map_area_free(smap, &smap->map);
 }
 
 int bpf_local_storage_map_alloc_check(union bpf_attr *attr)
@@ -623,7 +623,7 @@ struct bpf_local_storage_map *bpf_local_storage_map_alloc(union bpf_attr *attr)
 	smap->buckets = kvcalloc(sizeof(*smap->buckets), nbuckets,
 				 GFP_USER | __GFP_NOWARN | __GFP_ACCOUNT);
 	if (!smap->buckets) {
-		bpf_map_area_free(smap);
+		bpf_map_area_free(smap, &smap->map);
 		return ERR_PTR(-ENOMEM);
 	}
 
diff --git a/kernel/bpf/bpf_struct_ops.c b/kernel/bpf/bpf_struct_ops.c
index 36f24f8..9fb8ad1 100644
--- a/kernel/bpf/bpf_struct_ops.c
+++ b/kernel/bpf/bpf_struct_ops.c
@@ -577,10 +577,10 @@ static void bpf_struct_ops_map_free(struct bpf_map *map)
 	if (st_map->links)
 		bpf_struct_ops_map_put_progs(st_map);
-	bpf_map_area_free(st_map->links);
+	bpf_map_area_free(st_map->links, NULL);
 	bpf_jit_free_exec(st_map->image);
-	bpf_map_area_free(st_map->uvalue);
-	bpf_map_area_free(st_map);
+	bpf_map_area_free(st_map->uvalue, NULL);
+	bpf_map_area_free(st_map, map);
 }
 
 static int bpf_struct_ops_map_alloc_check(union bpf_attr *attr)
diff --git a/kernel/bpf/cpumap.c b/kernel/bpf/cpumap.c
index b5ba34d..7de2ae6 100644
--- a/kernel/bpf/cpumap.c
+++ b/kernel/bpf/cpumap.c
@@ -118,7 +118,7 @@ static struct bpf_map *cpu_map_alloc(union bpf_attr *attr)
 	return &cmap->map;
 free_cmap:
-	bpf_map_area_free(cmap);
+	bpf_map_area_free(cmap, &cmap->map);
 	return ERR_PTR(err);
 }
 
@@ -622,8 +622,8 @@ static void cpu_map_free(struct bpf_map *map)
 		/* bq flush and cleanup happens after RCU grace-period */
 		__cpu_map_entry_replace(cmap, i, NULL); /* call_rcu */
 	}
-	bpf_map_area_free(cmap->cpu_map);
-	bpf_map_area_free(cmap);
+	bpf_map_area_free(cmap->cpu_map, NULL);
+	bpf_map_area_free(cmap, map);
 }
 
 /* Elements are kept alive by RCU; either by rcu_read_lock() (from syscall) or
diff --git a/kernel/bpf/devmap.c b/kernel/bpf/devmap.c
index 20decc7..3268ce7 100644
--- a/kernel/bpf/devmap.c
+++ b/kernel/bpf/devmap.c
@@ -168,7 +168,7 @@ static struct bpf_map *dev_map_alloc(union bpf_attr *attr)
 
 	err = dev_map_init_map(dtab, attr);
 	if (err) {
-		bpf_map_area_free(dtab);
+		bpf_map_area_free(dtab, &dtab->map);
 		return ERR_PTR(err);
 	}
 
@@ -221,7 +221,7 @@ static void dev_map_free(struct bpf_map *map)
 			}
 		}
 
-		bpf_map_area_free(dtab->dev_index_head);
+		bpf_map_area_free(dtab->dev_index_head, NULL);
 	} else {
 		for (i = 0; i < dtab->map.max_entries; i++) {
 			struct bpf_dtab_netdev *dev;
@@ -236,10 +236,10 @@ static void dev_map_free(struct bpf_map *map)
 			kfree(dev);
 		}
 
-		bpf_map_area_free(dtab->netdev_map);
+		bpf_map_area_free(dtab->netdev_map, NULL);
 	}
 
-	bpf_map_area_free(dtab);
+	bpf_map_area_free(dtab, &dtab->map);
 }
 
 static int dev_map_get_next_key(struct bpf_map *map, void *key, void *next_key)
diff --git a/kernel/bpf/hashtab.c b/kernel/bpf/hashtab.c
index 6c0e4eb..f542b51 100644
--- a/kernel/bpf/hashtab.c
+++ b/kernel/bpf/hashtab.c
@@ -311,7 +311,7 @@ static void htab_free_elems(struct bpf_htab *htab)
 		cond_resched();
 	}
 free_elems:
-	bpf_map_area_free(htab->elems);
+	bpf_map_area_free(htab->elems, NULL);
 }
 
 /* The LRU list has a lock (lru_lock). Each htab bucket has a lock
@@ -626,12 +626,12 @@ static struct bpf_map *htab_map_alloc(union bpf_attr *attr)
 	percpu_counter_destroy(&htab->pcount);
 	for (i = 0; i < HASHTAB_MAP_LOCK_COUNT; i++)
 		free_percpu(htab->map_locked[i]);
-	bpf_map_area_free(htab->buckets);
+	bpf_map_area_free(htab->buckets, NULL);
 	bpf_mem_alloc_destroy(&htab->pcpu_ma);
 	bpf_mem_alloc_destroy(&htab->ma);
 free_htab:
 	lockdep_unregister_key(&htab->lockdep_key);
-	bpf_map_area_free(htab);
+	bpf_map_area_free(htab, &htab->map);
 	return ERR_PTR(err);
 }
 
@@ -1561,7 +1561,7 @@ static void htab_map_free(struct bpf_map *map)
 	bpf_map_free_kptr_off_tab(map);
 	free_percpu(htab->extra_elems);
-	bpf_map_area_free(htab->buckets);
+	bpf_map_area_free(htab->buckets, NULL);
 	bpf_mem_alloc_destroy(&htab->pcpu_ma);
 	bpf_mem_alloc_destroy(&htab->ma);
 	if (htab->use_percpu_counter)
@@ -1569,7 +1569,7 @@ static void htab_map_free(struct bpf_map *map)
 	for (i = 0; i < HASHTAB_MAP_LOCK_COUNT; i++)
 		free_percpu(htab->map_locked[i]);
 	lockdep_unregister_key(&htab->lockdep_key);
-	bpf_map_area_free(htab);
+	bpf_map_area_free(htab, map);
 }
 
 static void htab_map_seq_show_elem(struct bpf_map *map, void *key,
diff --git a/kernel/bpf/local_storage.c b/kernel/bpf/local_storage.c
index 098cf33..c705d66 100644
--- a/kernel/bpf/local_storage.c
+++ b/kernel/bpf/local_storage.c
@@ -345,7 +345,7 @@ static void cgroup_storage_map_free(struct bpf_map *_map)
 	WARN_ON(!RB_EMPTY_ROOT(&map->root));
 	WARN_ON(!list_empty(&map->list));
 
-	bpf_map_area_free(map);
+	bpf_map_area_free(map, _map);
 }
 
 static int cgroup_storage_delete_elem(struct bpf_map *map, void *key)
diff --git a/kernel/bpf/lpm_trie.c b/kernel/bpf/lpm_trie.c
index d833496..fd99360 100644
--- a/kernel/bpf/lpm_trie.c
+++ b/kernel/bpf/lpm_trie.c
@@ -609,7 +609,7 @@ static void trie_free(struct bpf_map *map)
 	}
 
 out:
-	bpf_map_area_free(trie);
+	bpf_map_area_free(trie, map);
 }
 
 static int trie_get_next_key(struct bpf_map *map, void *_key, void *_next_key)
diff --git a/kernel/bpf/offload.c b/kernel/bpf/offload.c
index 13e4efc..c9941a9 100644
--- a/kernel/bpf/offload.c
+++ b/kernel/bpf/offload.c
@@ -404,7 +404,7 @@ struct bpf_map *bpf_map_offload_map_alloc(union bpf_attr *attr)
 err_unlock:
 	up_write(&bpf_devs_lock);
 	rtnl_unlock();
-	bpf_map_area_free(offmap);
+	bpf_map_area_free(offmap, &offmap->map);
 	return ERR_PTR(err);
 }
 
@@ -428,7 +428,7 @@ void bpf_map_offload_map_free(struct bpf_map *map)
 	up_write(&bpf_devs_lock);
 	rtnl_unlock();
 
-	bpf_map_area_free(offmap);
+	bpf_map_area_free(offmap, map);
 }
 
 int bpf_map_offload_lookup_elem(struct bpf_map *map, void *key, void *value)
diff --git a/kernel/bpf/queue_stack_maps.c b/kernel/bpf/queue_stack_maps.c
index 8a5e060..f2ec0c4 100644
--- a/kernel/bpf/queue_stack_maps.c
+++ b/kernel/bpf/queue_stack_maps.c
@@ -92,7 +92,7 @@ static void queue_stack_map_free(struct bpf_map *map)
 {
 	struct bpf_queue_stack *qs = bpf_queue_stack(map);
 
-	bpf_map_area_free(qs);
+	bpf_map_area_free(qs, map);
 }
 
 static int __queue_map_get(struct bpf_map *map, void *value, bool delete)
diff --git a/kernel/bpf/reuseport_array.c b/kernel/bpf/reuseport_array.c
index 82c6161..3b6d1c7 100644
--- a/kernel/bpf/reuseport_array.c
+++ b/kernel/bpf/reuseport_array.c
@@ -143,7 +143,7 @@ static void reuseport_array_free(struct bpf_map *map)
 	 * Once reaching here, all sk->sk_user_data is not
 	 * referencing this "array". "array" can be freed now.
 	 */
-	bpf_map_area_free(array);
+	bpf_map_area_free(array, map);
 }
 
 static struct bpf_map *reuseport_array_alloc(union bpf_attr *attr)
diff --git a/kernel/bpf/ringbuf.c b/kernel/bpf/ringbuf.c
index b483aea..74dd8dc 100644
--- a/kernel/bpf/ringbuf.c
+++ b/kernel/bpf/ringbuf.c
@@ -116,7 +116,7 @@ static struct bpf_ringbuf *bpf_ringbuf_area_alloc(size_t data_sz, int numa_node)
 err_free_pages:
 	for (i = 0; i < nr_pages; i++)
 		__free_page(pages[i]);
-	bpf_map_area_free(pages);
+	bpf_map_area_free(pages, NULL);
 	return NULL;
 }
 
@@ -172,7 +172,7 @@ static struct bpf_map *ringbuf_map_alloc(union bpf_attr *attr)
 
 	rb_map->rb = bpf_ringbuf_alloc(attr->max_entries, rb_map->map.numa_node);
 	if (!rb_map->rb) {
-		bpf_map_area_free(rb_map);
+		bpf_map_area_free(rb_map, &rb_map->map);
 		return ERR_PTR(-ENOMEM);
 	}
 
@@ -190,7 +190,7 @@ static void bpf_ringbuf_free(struct bpf_ringbuf *rb)
 	vunmap(rb);
 	for (i = 0; i < nr_pages; i++)
 		__free_page(pages[i]);
-	bpf_map_area_free(pages);
+	bpf_map_area_free(pages, NULL);
 }
 
 static void ringbuf_map_free(struct bpf_map *map)
@@ -199,7 +199,7 @@ static void ringbuf_map_free(struct bpf_map *map)
 	rb_map = container_of(map, struct bpf_ringbuf_map, map);
 
 	bpf_ringbuf_free(rb_map->rb);
-	bpf_map_area_free(rb_map);
+	bpf_map_area_free(rb_map, map);
 }
 
 static void *ringbuf_map_lookup_elem(struct bpf_map *map, void *key)
diff --git a/kernel/bpf/stackmap.c b/kernel/bpf/stackmap.c
index 1adbe67..042b7d2 100644
--- a/kernel/bpf/stackmap.c
+++ b/kernel/bpf/stackmap.c
@@ -62,7 +62,7 @@ static int prealloc_elems_and_freelist(struct bpf_stack_map *smap)
 	return 0;
 
 free_elems:
-	bpf_map_area_free(smap->elems);
+	bpf_map_area_free(smap->elems, NULL);
 	return err;
 }
 
@@ -120,7 +120,7 @@ static struct bpf_map *stack_map_alloc(union bpf_attr *attr)
 put_buffers:
 	put_callchain_buffers();
 free_smap:
-	bpf_map_area_free(smap);
+	bpf_map_area_free(smap, &smap->map);
 	return ERR_PTR(err);
 }
 
@@ -648,9 +648,9 @@ static void stack_map_free(struct bpf_map *map)
 {
 	struct bpf_stack_map *smap = container_of(map, struct bpf_stack_map, map);
 
-	bpf_map_area_free(smap->elems);
+	bpf_map_area_free(smap->elems, NULL);
 	pcpu_freelist_destroy(&smap->freelist);
-	bpf_map_area_free(smap);
+	bpf_map_area_free(smap, map);
 	put_callchain_buffers();
 }
 
diff --git a/kernel/bpf/syscall.c b/kernel/bpf/syscall.c
index 574ddc3..29ad913 100644
--- a/kernel/bpf/syscall.c
+++ b/kernel/bpf/syscall.c
@@ -293,6 +293,34 @@ static int bpf_map_copy_value(struct bpf_map *map, void *key, void *value,
 	return err;
 }
 
+#ifdef CONFIG_MEMCG_KMEM
+static void bpf_map_save_memcg(struct bpf_map *map)
+{
+	/* Currently if a map is created by a process belonging to the root
+	 * memory cgroup, get_obj_cgroup_from_current() will return NULL.
+	 * So we have to check map->objcg for being NULL each time it's
+	 * being used.
+	 */
+	map->objcg = get_obj_cgroup_from_current();
+}
+
+static void bpf_map_release_memcg(struct bpf_map *map)
+{
+	if (map->objcg)
+		obj_cgroup_put(map->objcg);
+}
+
+#else
+static void bpf_map_save_memcg(struct bpf_map *map)
+{
+}
+
+static void bpf_map_release_memcg(struct bpf_map *map)
+{
+}
+
+#endif
+
 /* Please, do not use this function outside from the map creation path
  * (e.g. in map update path) without taking care of setting the active
  * memory cgroup (see at bpf_map_kmalloc_node() for example).
@@ -344,8 +372,10 @@ void *bpf_map_area_mmapable_alloc(u64 size, int numa_node)
 	return __bpf_map_area_alloc(size, numa_node, true);
 }
 
-void bpf_map_area_free(void *area)
+void bpf_map_area_free(void *area, struct bpf_map *map)
 {
+	if (map)
+		bpf_map_release_memcg(map);
 	kvfree(area);
 }
 
@@ -363,6 +393,7 @@ static u32 bpf_map_flags_retain_permanent(u32 flags)
 
 void bpf_map_init_from_attr(struct bpf_map *map, union bpf_attr *attr)
 {
+	bpf_map_save_memcg(map);
 	map->map_type = attr->map_type;
 	map->key_size = attr->key_size;
 	map->value_size = attr->value_size;
@@ -417,22 +448,6 @@ void bpf_map_free_id(struct bpf_map *map, bool do_idr_lock)
 }
 
 #ifdef CONFIG_MEMCG_KMEM
-static void bpf_map_save_memcg(struct bpf_map *map)
-{
-	/* Currently if a map is created by a process belonging to the root
-	 * memory cgroup, get_obj_cgroup_from_current() will return NULL.
-	 * So we have to check map->objcg for being NULL each time it's
-	 * being used.
-	 */
-	map->objcg = get_obj_cgroup_from_current();
-}
-
-static void bpf_map_release_memcg(struct bpf_map *map)
-{
-	if (map->objcg)
-		obj_cgroup_put(map->objcg);
-}
-
 void *bpf_map_kmalloc_node(const struct bpf_map *map, size_t size, gfp_t flags,
 			   int node)
 {
@@ -477,14 +492,6 @@ void __percpu *bpf_map_alloc_percpu(const struct bpf_map *map, size_t size,
 	return ptr;
 }
 
-#else
-static void bpf_map_save_memcg(struct bpf_map *map)
-{
-}
-
-static void bpf_map_release_memcg(struct bpf_map *map)
-{
-}
 #endif
 
 static int bpf_map_kptr_off_cmp(const void *a, const void *b)
@@ -605,7 +612,6 @@ static void bpf_map_free_deferred(struct work_struct *work)
 
 	security_bpf_map_free(map);
 	kfree(map->off_arr);
-	bpf_map_release_memcg(map);
 	/* implementation dependent freeing, map_free callback also does
 	 * bpf_map_free_kptr_off_tab, if needed.
 	 */
@@ -1158,8 +1164,6 @@ static int map_create(union bpf_attr *attr)
 	if (err)
 		goto free_map_sec;
 
-	bpf_map_save_memcg(map);
-
 	err = bpf_map_new_fd(map, f_flags);
 	if (err < 0) {
 		/* failed to allocate fd.
diff --git a/net/core/sock_map.c b/net/core/sock_map.c
index a660bae..8da9fd4 100644
--- a/net/core/sock_map.c
+++ b/net/core/sock_map.c
@@ -52,7 +52,7 @@ static struct bpf_map *sock_map_alloc(union bpf_attr *attr)
 			      sizeof(struct sock *),
 			      stab->map.numa_node);
 	if (!stab->sks) {
-		bpf_map_area_free(stab);
+		bpf_map_area_free(stab, &stab->map);
 		return ERR_PTR(-ENOMEM);
 	}
 
@@ -360,8 +360,8 @@ static void sock_map_free(struct bpf_map *map)
 	/* wait for psock readers accessing its map link */
 	synchronize_rcu();
 
-	bpf_map_area_free(stab->sks);
-	bpf_map_area_free(stab);
+	bpf_map_area_free(stab->sks, NULL);
+	bpf_map_area_free(stab, map);
 }
 
 static void sock_map_release_progs(struct bpf_map *map)
@@ -1115,7 +1115,7 @@ static struct bpf_map *sock_hash_alloc(union bpf_attr *attr)
 	return &htab->map;
 free_htab:
-	bpf_map_area_free(htab);
+	bpf_map_area_free(htab, &htab->map);
 	return ERR_PTR(err);
 }
 
@@ -1167,8 +1167,8 @@ static void sock_hash_free(struct bpf_map *map)
 	/* wait for psock readers accessing its map link */
 	synchronize_rcu();
 
-	bpf_map_area_free(htab->buckets);
-	bpf_map_area_free(htab);
+	bpf_map_area_free(htab->buckets, NULL);
+	bpf_map_area_free(htab, map);
 }
 
 static void *sock_hash_lookup_sys(struct bpf_map *map, void *key)
diff --git a/net/xdp/xskmap.c b/net/xdp/xskmap.c
index acc8e52..5abb87e 100644
--- a/net/xdp/xskmap.c
+++ b/net/xdp/xskmap.c
@@ -90,7 +90,7 @@ static void xsk_map_free(struct bpf_map *map)
 	struct xsk_map *m = container_of(map, struct xsk_map, map);
 
 	synchronize_net();
-	bpf_map_area_free(m);
+	bpf_map_area_free(m, map);
 }
 
 static int xsk_map_get_next_key(struct bpf_map *map, void *key, void *next_key)

From patchwork Wed Sep 21 16:59:57 2022
X-Patchwork-Submitter: Yafang Shao <laoar.shao@gmail.com>
X-Patchwork-Id: 12984016
X-Patchwork-State: RFC
From: Yafang Shao <laoar.shao@gmail.com>
To: ast@kernel.org, daniel@iogearbox.net, andrii@kernel.org, kafai@fb.com,
    songliubraving@fb.com, yhs@fb.com, john.fastabend@gmail.com,
    kpsingh@kernel.org, sdf@google.com, haoluo@google.com, jolsa@kernel.org,
    hannes@cmpxchg.org, mhocko@kernel.org, roman.gushchin@linux.dev,
    shakeelb@google.com, songmuchun@bytedance.com, akpm@linux-foundation.org,
    tj@kernel.org, lizefan.x@bytedance.com
Cc: cgroups@vger.kernel.org, netdev@vger.kernel.org, bpf@vger.kernel.org,
    linux-mm@kvack.org, Yafang Shao <laoar.shao@gmail.com>
Subject: [RFC PATCH bpf-next 05/10] bpf: Use scope-based charge in bpf_map_area_alloc
Date: Wed, 21 Sep 2022 16:59:57 +0000
Message-Id: <20220921170002.29557-6-laoar.shao@gmail.com>
In-Reply-To: <20220921170002.29557-1-laoar.shao@gmail.com>
References: <20220921170002.29557-1-laoar.shao@gmail.com>
Currently bpf_map_area_alloc() is used both to allocate the container
of a struct bpf_map and to allocate members of that container. To
distinguish map creation from the other cases, a new parameter, struct
bpf_map, is added to bpf_map_area_alloc(). For the non-creation case we
can then get the memcg from the map instead of using the current task's
memcg.
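For illustration only (the hypothetical "foo" map again): the
caller-visible rule is to pass NULL while the bpf_map does not exist
yet, and to pass the map for every later member allocation, so the
member is charged to the map's saved memcg no matter which task happens
to be current:

  static struct bpf_map *foo_map_alloc(union bpf_attr *attr)
  {
  	struct bpf_foo_map *fmap;

  	/* No map exists yet: charged to the current task's memcg. */
  	fmap = bpf_map_area_alloc(sizeof(*fmap), NUMA_NO_NODE, NULL);
  	if (!fmap)
  		return ERR_PTR(-ENOMEM);

  	bpf_map_init_from_attr(&fmap->map, attr);	/* saves the memcg */

  	/* Member allocation: charged to the map's saved memcg. */
  	fmap->data = bpf_map_area_alloc((u64) attr->max_entries * attr->value_size,
  					bpf_map_attr_numa_node(attr),
  					&fmap->map);
  	if (!fmap->data) {
  		bpf_map_area_free(fmap, &fmap->map);
  		return ERR_PTR(-ENOMEM);
  	}

  	return &fmap->map;
  }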
Signed-off-by: Yafang Shao <laoar.shao@gmail.com>
---
 include/linux/bpf.h            |  2 +-
 kernel/bpf/arraymap.c          |  2 +-
 kernel/bpf/bloom_filter.c      |  2 +-
 kernel/bpf/bpf_local_storage.c |  2 +-
 kernel/bpf/bpf_struct_ops.c    |  6 +++---
 kernel/bpf/cpumap.c            |  5 +++--
 kernel/bpf/devmap.c            | 13 ++++++++-----
 kernel/bpf/hashtab.c           |  8 +++++---
 kernel/bpf/local_storage.c     |  2 +-
 kernel/bpf/lpm_trie.c          |  2 +-
 kernel/bpf/offload.c           |  2 +-
 kernel/bpf/queue_stack_maps.c  |  2 +-
 kernel/bpf/reuseport_array.c   |  2 +-
 kernel/bpf/ringbuf.c           | 15 +++++++++------
 kernel/bpf/stackmap.c          |  5 +++--
 kernel/bpf/syscall.c           | 16 ++++++++++++++--
 net/core/sock_map.c            | 10 ++++++----
 net/xdp/xskmap.c               |  2 +-
 18 files changed, 61 insertions(+), 37 deletions(-)

diff --git a/include/linux/bpf.h b/include/linux/bpf.h
index d64d7a2..eca1502 100644
--- a/include/linux/bpf.h
+++ b/include/linux/bpf.h
@@ -1696,7 +1696,7 @@ struct bpf_prog *bpf_prog_get_type_dev(u32 ufd, enum bpf_prog_type type,
 struct bpf_map * __must_check bpf_map_inc_not_zero(struct bpf_map *map);
 void bpf_map_put_with_uref(struct bpf_map *map);
 void bpf_map_put(struct bpf_map *map);
-void *bpf_map_area_alloc(u64 size, int numa_node);
+void *bpf_map_area_alloc(u64 size, int numa_node, struct bpf_map *map);
 void *bpf_map_area_mmapable_alloc(u64 size, int numa_node);
 void bpf_map_area_free(void *base, struct bpf_map *map);
 bool bpf_map_write_active(const struct bpf_map *map);
diff --git a/kernel/bpf/arraymap.c b/kernel/bpf/arraymap.c
index 8cf021e..dd79d0d 100644
--- a/kernel/bpf/arraymap.c
+++ b/kernel/bpf/arraymap.c
@@ -135,7 +135,7 @@ static struct bpf_map *array_map_alloc(union bpf_attr *attr)
 		array = data + PAGE_ALIGN(sizeof(struct bpf_array))
 			- offsetof(struct bpf_array, value);
 	} else {
-		array = bpf_map_area_alloc(array_size, numa_node);
+		array = bpf_map_area_alloc(array_size, numa_node, NULL);
 	}
 	if (!array)
 		return ERR_PTR(-ENOMEM);
diff --git a/kernel/bpf/bloom_filter.c b/kernel/bpf/bloom_filter.c
index e59064d..6691f79 100644
--- a/kernel/bpf/bloom_filter.c
+++ b/kernel/bpf/bloom_filter.c
@@ -142,7 +142,7 @@ static struct bpf_map *bloom_map_alloc(union bpf_attr *attr)
 	}
 
 	bitset_bytes = roundup(bitset_bytes, sizeof(unsigned long));
-	bloom = bpf_map_area_alloc(sizeof(*bloom) + bitset_bytes, numa_node);
+	bloom = bpf_map_area_alloc(sizeof(*bloom) + bitset_bytes, numa_node, NULL);
 	if (!bloom)
 		return ERR_PTR(-ENOMEM);
 
diff --git a/kernel/bpf/bpf_local_storage.c b/kernel/bpf/bpf_local_storage.c
index 7b68d846..44498d7d 100644
--- a/kernel/bpf/bpf_local_storage.c
+++ b/kernel/bpf/bpf_local_storage.c
@@ -610,7 +610,7 @@ struct bpf_local_storage_map *bpf_local_storage_map_alloc(union bpf_attr *attr)
 	unsigned int i;
 	u32 nbuckets;
 
-	smap = bpf_map_area_alloc(sizeof(*smap), NUMA_NO_NODE);
+	smap = bpf_map_area_alloc(sizeof(*smap), NUMA_NO_NODE, NULL);
 	if (!smap)
 		return ERR_PTR(-ENOMEM);
 	bpf_map_init_from_attr(&smap->map, attr);
diff --git a/kernel/bpf/bpf_struct_ops.c b/kernel/bpf/bpf_struct_ops.c
index 9fb8ad1..37ba5c0 100644
--- a/kernel/bpf/bpf_struct_ops.c
+++ b/kernel/bpf/bpf_struct_ops.c
@@ -618,7 +618,7 @@ static struct bpf_map *bpf_struct_ops_map_alloc(union bpf_attr *attr)
 		 */
 		(vt->size - sizeof(struct bpf_struct_ops_value));
 
-	st_map = bpf_map_area_alloc(st_map_size, NUMA_NO_NODE);
+	st_map = bpf_map_area_alloc(st_map_size, NUMA_NO_NODE, NULL);
 	if (!st_map)
 		return ERR_PTR(-ENOMEM);
 
@@ -626,10 +626,10 @@ static struct bpf_map *bpf_struct_ops_map_alloc(union bpf_attr *attr)
 	map = &st_map->map;
 	bpf_map_init_from_attr(map, attr);
 
-	st_map->uvalue = bpf_map_area_alloc(vt->size, NUMA_NO_NODE);
+	st_map->uvalue = bpf_map_area_alloc(vt->size, NUMA_NO_NODE, map);
 	st_map->links =
 		bpf_map_area_alloc(btf_type_vlen(t) * sizeof(struct bpf_links *),
-				   NUMA_NO_NODE);
+				   NUMA_NO_NODE, map);
 	st_map->image = bpf_jit_alloc_exec(PAGE_SIZE);
 	if (!st_map->uvalue || !st_map->links || !st_map->image) {
 		bpf_struct_ops_map_free(map);
diff --git a/kernel/bpf/cpumap.c b/kernel/bpf/cpumap.c
index 7de2ae6..b593157 100644
--- a/kernel/bpf/cpumap.c
+++ b/kernel/bpf/cpumap.c
@@ -97,7 +97,7 @@ static struct bpf_map *cpu_map_alloc(union bpf_attr *attr)
 	    attr->map_flags & ~BPF_F_NUMA_NODE)
 		return ERR_PTR(-EINVAL);
 
-	cmap = bpf_map_area_alloc(sizeof(*cmap), NUMA_NO_NODE);
+	cmap = bpf_map_area_alloc(sizeof(*cmap), NUMA_NO_NODE, NULL);
 	if (!cmap)
 		return ERR_PTR(-ENOMEM);
 
@@ -112,7 +112,8 @@ static struct bpf_map *cpu_map_alloc(union bpf_attr *attr)
 	/* Alloc array for possible remote "destination" CPUs */
 	cmap->cpu_map = bpf_map_area_alloc(cmap->map.max_entries *
 					   sizeof(struct bpf_cpu_map_entry *),
-					   cmap->map.numa_node);
+					   cmap->map.numa_node,
+					   &cmap->map);
 	if (!cmap->cpu_map)
 		goto free_cmap;
 
diff --git a/kernel/bpf/devmap.c b/kernel/bpf/devmap.c
index 3268ce7..807a4cd 100644
--- a/kernel/bpf/devmap.c
+++ b/kernel/bpf/devmap.c
@@ -89,12 +89,13 @@ struct bpf_dtab {
 static LIST_HEAD(dev_map_list);
 
 static struct hlist_head *dev_map_create_hash(unsigned int entries,
-					      int numa_node)
+					      int numa_node,
+					      struct bpf_map *map)
 {
 	int i;
 	struct hlist_head *hash;
 
-	hash = bpf_map_area_alloc((u64) entries * sizeof(*hash), numa_node);
+	hash = bpf_map_area_alloc((u64) entries * sizeof(*hash), numa_node, map);
 	if (hash != NULL)
 		for (i = 0; i < entries; i++)
 			INIT_HLIST_HEAD(&hash[i]);
@@ -136,7 +137,8 @@ static int dev_map_init_map(struct bpf_dtab *dtab, union bpf_attr *attr)
 
 	if (attr->map_type == BPF_MAP_TYPE_DEVMAP_HASH) {
 		dtab->dev_index_head = dev_map_create_hash(dtab->n_buckets,
-							   dtab->map.numa_node);
+							   dtab->map.numa_node,
+							   &dtab->map);
 		if (!dtab->dev_index_head)
 			return -ENOMEM;
 
@@ -144,7 +146,8 @@ static int dev_map_init_map(struct bpf_dtab *dtab, union bpf_attr *attr)
 	} else {
 		dtab->netdev_map = bpf_map_area_alloc((u64) dtab->map.max_entries *
 						      sizeof(struct bpf_dtab_netdev *),
-						      dtab->map.numa_node);
+						      dtab->map.numa_node,
+						      &dtab->map);
 		if (!dtab->netdev_map)
 			return -ENOMEM;
 	}
@@ -160,7 +163,7 @@ static struct bpf_map *dev_map_alloc(union bpf_attr *attr)
 	if (!capable(CAP_NET_ADMIN))
 		return ERR_PTR(-EPERM);
 
-	dtab = bpf_map_area_alloc(sizeof(*dtab), NUMA_NO_NODE);
+	dtab = bpf_map_area_alloc(sizeof(*dtab), NUMA_NO_NODE, NULL);
 	if (!dtab)
 		return ERR_PTR(-ENOMEM);
 
diff --git a/kernel/bpf/hashtab.c b/kernel/bpf/hashtab.c
index f542b51..89887df 100644
--- a/kernel/bpf/hashtab.c
+++ b/kernel/bpf/hashtab.c
@@ -349,7 +349,8 @@ static int prealloc_init(struct bpf_htab *htab)
 		num_entries += num_possible_cpus();
 
 	htab->elems = bpf_map_area_alloc((u64)htab->elem_size * num_entries,
-					 htab->map.numa_node);
+					 htab->map.numa_node,
+					 &htab->map);
 	if (!htab->elems)
 		return -ENOMEM;
 
@@ -510,7 +511,7 @@ static struct bpf_map *htab_map_alloc(union bpf_attr *attr)
 	struct bpf_htab *htab;
 	int err, i;
 
-	htab = bpf_map_area_alloc(sizeof(*htab), NUMA_NO_NODE);
+	htab = bpf_map_area_alloc(sizeof(*htab), NUMA_NO_NODE, NULL);
 	if (!htab)
 		return ERR_PTR(-ENOMEM);
 
@@ -549,7 +550,8 @@ static struct bpf_map *htab_map_alloc(union bpf_attr *attr)
 	err = -ENOMEM;
 	htab->buckets = bpf_map_area_alloc(htab->n_buckets *
 					   sizeof(struct bucket),
-					   htab->map.numa_node);
+					   htab->map.numa_node,
+					   &htab->map);
 	if (!htab->buckets)
 		goto free_htab;
 
diff --git a/kernel/bpf/local_storage.c b/kernel/bpf/local_storage.c
index c705d66..fcc7ece 100644
--- a/kernel/bpf/local_storage.c
+++ b/kernel/bpf/local_storage.c
@@ -313,7 +313,7 @@ static struct bpf_map *cgroup_storage_map_alloc(union bpf_attr *attr)
 		/* max_entries is not used and enforced to be 0 */
 		return ERR_PTR(-EINVAL);
 
-	map = bpf_map_area_alloc(sizeof(struct bpf_cgroup_storage_map), numa_node);
+	map = bpf_map_area_alloc(sizeof(struct bpf_cgroup_storage_map), numa_node, NULL);
 	if (!map)
 		return ERR_PTR(-ENOMEM);
 
diff --git a/kernel/bpf/lpm_trie.c b/kernel/bpf/lpm_trie.c
index fd99360..3d329ae 100644
--- a/kernel/bpf/lpm_trie.c
+++ b/kernel/bpf/lpm_trie.c
@@ -558,7 +558,7 @@ static struct bpf_map *trie_alloc(union bpf_attr *attr)
 	    attr->value_size > LPM_VAL_SIZE_MAX)
 		return ERR_PTR(-EINVAL);
 
-	trie = bpf_map_area_alloc(sizeof(*trie), NUMA_NO_NODE);
+	trie = bpf_map_area_alloc(sizeof(*trie), NUMA_NO_NODE, NULL);
 	if (!trie)
 		return ERR_PTR(-ENOMEM);
 
diff --git a/kernel/bpf/offload.c b/kernel/bpf/offload.c
index c9941a9..87c59da 100644
--- a/kernel/bpf/offload.c
+++ b/kernel/bpf/offload.c
@@ -372,7 +372,7 @@ struct bpf_map *bpf_map_offload_map_alloc(union bpf_attr *attr)
 	    attr->map_type != BPF_MAP_TYPE_HASH)
 		return ERR_PTR(-EINVAL);
 
-	offmap = bpf_map_area_alloc(sizeof(*offmap), NUMA_NO_NODE);
+	offmap = bpf_map_area_alloc(sizeof(*offmap), NUMA_NO_NODE, NULL);
 	if (!offmap)
 		return ERR_PTR(-ENOMEM);
 
diff --git a/kernel/bpf/queue_stack_maps.c b/kernel/bpf/queue_stack_maps.c
index f2ec0c4..bf57e45 100644
--- a/kernel/bpf/queue_stack_maps.c
+++ b/kernel/bpf/queue_stack_maps.c
@@ -74,7 +74,7 @@ static struct bpf_map *queue_stack_map_alloc(union bpf_attr *attr)
 	size = (u64) attr->max_entries + 1;
 	queue_size = sizeof(*qs) + size * attr->value_size;
 
-	qs = bpf_map_area_alloc(queue_size, numa_node);
+	qs = bpf_map_area_alloc(queue_size, numa_node, NULL);
 	if (!qs)
 		return ERR_PTR(-ENOMEM);
 
diff --git a/kernel/bpf/reuseport_array.c b/kernel/bpf/reuseport_array.c
index 3b6d1c7..fc6f6b6 100644
--- a/kernel/bpf/reuseport_array.c
+++ b/kernel/bpf/reuseport_array.c
@@ -155,7 +155,7 @@ static struct bpf_map *reuseport_array_alloc(union bpf_attr *attr)
 		return ERR_PTR(-EPERM);
 
 	/* allocate all map elements and zero-initialize them */
-	array = bpf_map_area_alloc(struct_size(array, ptrs, attr->max_entries), numa_node);
+	array = bpf_map_area_alloc(struct_size(array, ptrs, attr->max_entries), numa_node, NULL);
 	if (!array)
 		return ERR_PTR(-ENOMEM);
 
diff --git a/kernel/bpf/ringbuf.c b/kernel/bpf/ringbuf.c
index 74dd8dc..5eb7820 100644
--- a/kernel/bpf/ringbuf.c
+++ b/kernel/bpf/ringbuf.c
@@ -59,7 +59,8 @@ struct bpf_ringbuf_hdr {
 	u32 pg_off;
 };
 
-static struct bpf_ringbuf *bpf_ringbuf_area_alloc(size_t data_sz, int numa_node)
+static struct bpf_ringbuf *bpf_ringbuf_area_alloc(size_t data_sz, int numa_node,
+						  struct bpf_map *map)
 {
 	const gfp_t flags = GFP_KERNEL_ACCOUNT | __GFP_RETRY_MAYFAIL |
 			    __GFP_NOWARN | __GFP_ZERO;
@@ -89,7 +90,7 @@ static struct bpf_ringbuf *bpf_ringbuf_area_alloc(size_t data_sz, int numa_node)
 	 * user-space implementations significantly.
 	 */
 	array_size = (nr_meta_pages + 2 * nr_data_pages) * sizeof(*pages);
-	pages = bpf_map_area_alloc(array_size, numa_node);
+	pages = bpf_map_area_alloc(array_size, numa_node, map);
 	if (!pages)
 		return NULL;
 
@@ -127,11 +128,12 @@ static void bpf_ringbuf_notify(struct irq_work *work)
 	wake_up_all(&rb->waitq);
 }
 
-static struct bpf_ringbuf *bpf_ringbuf_alloc(size_t data_sz, int numa_node)
+static struct bpf_ringbuf *bpf_ringbuf_alloc(size_t data_sz, int numa_node,
+					     struct bpf_map *map)
 {
 	struct bpf_ringbuf *rb;
 
-	rb = bpf_ringbuf_area_alloc(data_sz, numa_node);
+	rb = bpf_ringbuf_area_alloc(data_sz, numa_node, map);
 	if (!rb)
 		return NULL;
 
@@ -164,13 +166,14 @@ static struct bpf_map *ringbuf_map_alloc(union bpf_attr *attr)
 		return ERR_PTR(-E2BIG);
 #endif
 
-	rb_map = bpf_map_area_alloc(sizeof(*rb_map), NUMA_NO_NODE);
+	rb_map = bpf_map_area_alloc(sizeof(*rb_map), NUMA_NO_NODE, NULL);
 	if (!rb_map)
 		return ERR_PTR(-ENOMEM);
 
 	bpf_map_init_from_attr(&rb_map->map, attr);
 
-	rb_map->rb = bpf_ringbuf_alloc(attr->max_entries, rb_map->map.numa_node);
+	rb_map->rb = bpf_ringbuf_alloc(attr->max_entries, rb_map->map.numa_node,
+				       &rb_map->map);
 	if (!rb_map->rb) {
 		bpf_map_area_free(rb_map, &rb_map->map);
 		return ERR_PTR(-ENOMEM);
diff --git a/kernel/bpf/stackmap.c b/kernel/bpf/stackmap.c
index 042b7d2..9440fab 100644
--- a/kernel/bpf/stackmap.c
+++ b/kernel/bpf/stackmap.c
@@ -49,7 +49,8 @@ static int prealloc_elems_and_freelist(struct bpf_stack_map *smap)
 	int err;
 
 	smap->elems = bpf_map_area_alloc(elem_size * smap->map.max_entries,
-					 smap->map.numa_node);
+					 smap->map.numa_node,
+					 &smap->map);
 	if (!smap->elems)
 		return -ENOMEM;
 
@@ -100,7 +101,7 @@ static struct bpf_map *stack_map_alloc(union bpf_attr *attr)
 		return ERR_PTR(-E2BIG);
 
 	cost = n_buckets * sizeof(struct stack_map_bucket *) + sizeof(*smap);
-	smap = bpf_map_area_alloc(cost, bpf_map_attr_numa_node(attr));
+	smap = bpf_map_area_alloc(cost, bpf_map_attr_numa_node(attr), NULL);
 	if (!smap)
 		return ERR_PTR(-ENOMEM);
 
diff --git a/kernel/bpf/syscall.c b/kernel/bpf/syscall.c
index 29ad913..727c04c 100644
--- a/kernel/bpf/syscall.c
+++ b/kernel/bpf/syscall.c
@@ -362,9 +362,21 @@ static void *__bpf_map_area_alloc(u64 size, int numa_node, bool mmapable)
 			flags, numa_node, __builtin_return_address(0));
 }
 
-void *bpf_map_area_alloc(u64 size, int numa_node)
+void *bpf_map_area_alloc(u64 size, int numa_node, struct bpf_map *map)
 {
-	return __bpf_map_area_alloc(size, numa_node, false);
+	struct mem_cgroup *memcg, *old_memcg;
+	void *ptr;
+
+	if (!map)
+		return __bpf_map_area_alloc(size, numa_node, false);
+
+	memcg = bpf_map_get_memcg(map);
+	old_memcg = set_active_memcg(memcg);
+	ptr = __bpf_map_area_alloc(size, numa_node, false);
+	set_active_memcg(old_memcg);
+	bpf_map_put_memcg(memcg);
+
+	return ptr;
 }
 
 void *bpf_map_area_mmapable_alloc(u64 size, int numa_node)
diff --git a/net/core/sock_map.c b/net/core/sock_map.c
index 8da9fd4..25a5ac4 100644
--- a/net/core/sock_map.c
+++ b/net/core/sock_map.c
@@ -41,7 +41,7 @@ static struct bpf_map *sock_map_alloc(union bpf_attr *attr)
 	    attr->map_flags & ~SOCK_CREATE_FLAG_MASK)
 		return ERR_PTR(-EINVAL);
 
-	stab = bpf_map_area_alloc(sizeof(*stab), NUMA_NO_NODE);
+	stab = bpf_map_area_alloc(sizeof(*stab), NUMA_NO_NODE, NULL);
 	if (!stab)
 		return ERR_PTR(-ENOMEM);
 
@@ -50,7 +50,8 @@ static struct bpf_map *sock_map_alloc(union bpf_attr *attr)
 
 	stab->sks = bpf_map_area_alloc((u64) stab->map.max_entries *
 				       sizeof(struct sock *),
-				       stab->map.numa_node);
+				       stab->map.numa_node,
+				       &stab->map);
 	if (!stab->sks) {
 		return ERR_PTR(-ENOMEM);
 
@@ -1085,7 +1086,7 @@ static struct bpf_map *sock_hash_alloc(union bpf_attr *attr)
 	if (attr->key_size > MAX_BPF_STACK)
 		return ERR_PTR(-E2BIG);
 
-	htab = bpf_map_area_alloc(sizeof(*htab), NUMA_NO_NODE);
+	htab = bpf_map_area_alloc(sizeof(*htab), NUMA_NO_NODE, NULL);
 	if (!htab)
 		return ERR_PTR(-ENOMEM);
 
@@ -1102,7 +1103,8 @@ static struct bpf_map *sock_hash_alloc(union bpf_attr *attr)
 
 	htab->buckets = bpf_map_area_alloc(htab->buckets_num *
 					   sizeof(struct bpf_shtab_bucket),
-					   htab->map.numa_node);
+					   htab->map.numa_node,
+					   &htab->map);
 	if (!htab->buckets) {
 		err = -ENOMEM;
 		goto free_htab;
 
diff --git a/net/xdp/xskmap.c b/net/xdp/xskmap.c
index 5abb87e..beb11fd 100644
--- a/net/xdp/xskmap.c
+++ b/net/xdp/xskmap.c
@@ -75,7 +75,7 @@ static struct bpf_map *xsk_map_alloc(union bpf_attr *attr)
 	numa_node = bpf_map_attr_numa_node(attr);
 	size = struct_size(m, xsk_map, attr->max_entries);
 
-	m = bpf_map_area_alloc(size, numa_node);
+	m = bpf_map_area_alloc(size, numa_node, NULL);
 	if (!m)
 		return ERR_PTR(-ENOMEM);
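A note on the new calling convention, since it is easy to get wrong: NULL is
passed while the map struct itself is being created (no map, hence no memcg,
exists yet), and the map is passed for every allocation made on its behalf
afterwards. A minimal sketch of the pattern the call sites above follow;
struct example_map and its fields are hypothetical:

struct example_map {
	struct bpf_map map;
	void **slots;		/* hypothetical per-map storage */
};

static struct bpf_map *example_map_alloc(union bpf_attr *attr)
{
	struct example_map *emap;

	/* The map does not exist yet, so there is no memcg to charge: NULL. */
	emap = bpf_map_area_alloc(sizeof(*emap), NUMA_NO_NODE, NULL);
	if (!emap)
		return ERR_PTR(-ENOMEM);

	bpf_map_init_from_attr(&emap->map, attr);

	/* Auxiliary memory is charged to the map's memcg: pass the map. */
	emap->slots = bpf_map_area_alloc((u64) attr->max_entries * sizeof(void *),
					 emap->map.numa_node, &emap->map);
	if (!emap->slots) {
		bpf_map_area_free(emap, &emap->map);
		return ERR_PTR(-ENOMEM);
	}

	return &emap->map;
}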
From patchwork Wed Sep 21 16:59:58 2022
From: Yafang Shao
Subject: [RFC PATCH bpf-next 06/10] bpf: Introduce new helpers bpf_ringbuf_pages_{alloc,free}
Date: Wed, 21 Sep 2022 16:59:58 +0000
Message-Id: <20220921170002.29557-7-laoar.shao@gmail.com>
In-Reply-To: <20220921170002.29557-1-laoar.shao@gmail.com>

Move the page-related allocations into a new helper,
bpf_ringbuf_pages_alloc(), so that they can be handled as a single unit.

Suggested-by: Andrii Nakryiko
Signed-off-by: Yafang Shao
Acked-by: Andrii Nakryiko
---
 kernel/bpf/ringbuf.c | 80 ++++++++++++++++++++++++++++++++++----------------
 1 file changed, 56 insertions(+), 24 deletions(-)

diff --git a/kernel/bpf/ringbuf.c b/kernel/bpf/ringbuf.c
index 5eb7820..1e7284c 100644
--- a/kernel/bpf/ringbuf.c
+++ b/kernel/bpf/ringbuf.c
@@ -59,6 +59,57 @@ struct bpf_ringbuf_hdr {
 	u32 pg_off;
 };
 
+static void bpf_ringbuf_pages_free(struct page **pages, int nr_pages)
+{
+	int i;
+
+	for (i = 0; i < nr_pages; i++)
+		__free_page(pages[i]);
+	bpf_map_area_free(pages, NULL);
+}
+
+static struct page **bpf_ringbuf_pages_alloc(struct bpf_map *map,
+					     int nr_meta_pages,
+					     int nr_data_pages,
+					     int numa_node,
+					     const gfp_t flags)
+{
+	int nr_pages = nr_meta_pages + nr_data_pages;
+	struct mem_cgroup *memcg, *old_memcg;
+	struct page **pages, *page;
+	int array_size;
+	int i;
+
+	memcg = bpf_map_get_memcg(map);
+	old_memcg = set_active_memcg(memcg);
+	array_size = (nr_meta_pages + 2 * nr_data_pages) * sizeof(*pages);
+	pages = bpf_map_area_alloc(array_size, numa_node, NULL);
+	if (!pages)
+		goto err;
+
+	for (i = 0; i < nr_pages; i++) {
+		page = alloc_pages_node(numa_node, flags, 0);
+		if (!page) {
+			nr_pages = i;
+			goto err_free_pages;
+		}
+		pages[i] = page;
+		if (i >= nr_meta_pages)
+			pages[nr_data_pages + i] = page;
+	}
+	set_active_memcg(old_memcg);
+	bpf_map_put_memcg(memcg);
+
+	return pages;
+
+err_free_pages:
+	bpf_ringbuf_pages_free(pages, nr_pages);
+err:
+	set_active_memcg(old_memcg);
+	bpf_map_put_memcg(memcg);
+	return NULL;
+}
+
 static struct bpf_ringbuf *bpf_ringbuf_area_alloc(size_t data_sz, int numa_node,
 						  struct bpf_map *map)
 {
@@ -67,10 +118,8 @@ static struct bpf_ringbuf *bpf_ringbuf_area_alloc(size_t data_sz, int numa_node,
 	int nr_meta_pages = RINGBUF_PGOFF + RINGBUF_POS_PAGES;
 	int nr_data_pages = data_sz >> PAGE_SHIFT;
 	int nr_pages = nr_meta_pages + nr_data_pages;
-	struct page **pages, *page;
 	struct bpf_ringbuf *rb;
-	size_t array_size;
-	int i;
+	struct page **pages;
 
 	/* Each data page is mapped twice to allow "virtual"
 	 * continuous read of samples wrapping around the end of ring
@@ -89,22 +138,11 @@ static struct bpf_ringbuf *bpf_ringbuf_area_alloc(size_t data_sz, int numa_node,
 	 * when mmap()'ed in user-space, simplifying both kernel and
 	 * user-space implementations significantly.
 	 */
-	array_size = (nr_meta_pages + 2 * nr_data_pages) * sizeof(*pages);
-	pages = bpf_map_area_alloc(array_size, numa_node, map);
+	pages = bpf_ringbuf_pages_alloc(map, nr_meta_pages, nr_data_pages,
+					numa_node, flags);
 	if (!pages)
 		return NULL;
 
-	for (i = 0; i < nr_pages; i++) {
-		page = alloc_pages_node(numa_node, flags, 0);
-		if (!page) {
-			nr_pages = i;
-			goto err_free_pages;
-		}
-		pages[i] = page;
-		if (i >= nr_meta_pages)
-			pages[nr_data_pages + i] = page;
-	}
-
 	rb = vmap(pages, nr_meta_pages + 2 * nr_data_pages,
 		  VM_MAP | VM_USERMAP, PAGE_KERNEL);
 	if (rb) {
@@ -114,10 +152,6 @@ static struct bpf_ringbuf *bpf_ringbuf_area_alloc(size_t data_sz, int numa_node,
 		return rb;
 	}
 
-err_free_pages:
-	for (i = 0; i < nr_pages; i++)
-		__free_page(pages[i]);
-	bpf_map_area_free(pages, NULL);
 	return NULL;
 }
 
@@ -188,12 +222,10 @@ static void bpf_ringbuf_free(struct bpf_ringbuf *rb)
 	 * to unmap rb itself with vunmap() below
 	 */
 	struct page **pages = rb->pages;
-	int i, nr_pages = rb->nr_pages;
+	int nr_pages = rb->nr_pages;
 
 	vunmap(rb);
-	for (i = 0; i < nr_pages; i++)
-		__free_page(pages[i]);
-	bpf_map_area_free(pages, NULL);
+	bpf_ringbuf_pages_free(pages, nr_pages);
 }
 
 static void ringbuf_map_free(struct bpf_map *map)
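The pages[] layout built by bpf_ringbuf_pages_alloc() stores every data page
at two indices, which is what lets vmap() map the data area twice back to
back. The stand-alone C toy below re-runs only the indexing logic of the
allocation loop above; it is illustrative, not kernel code:

#include <stdio.h>

int main(void)
{
	enum { NR_META = 2, NR_DATA = 4, NR_PAGES = NR_META + NR_DATA };
	int pages[NR_META + 2 * NR_DATA];	/* ints stand in for struct page * */
	int i;

	/* Same indexing as the kernel loop above. */
	for (i = 0; i < NR_PAGES; i++) {
		pages[i] = i;			/* "allocate" page i */
		if (i >= NR_META)
			pages[NR_DATA + i] = i;	/* duplicate data pages */
	}

	for (i = 0; i < NR_META + 2 * NR_DATA; i++)
		printf("slot %2d -> page %d\n", i, pages[i]);

	return 0;
}

Running it shows slots 6..9 repeating pages 2..5, i.e. the second copy of the
data area.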
From patchwork Wed Sep 21 16:59:59 2022
From: Yafang Shao
Subject: [RFC PATCH bpf-next 07/10] bpf: Use bpf_map_kzalloc in arraymap
Date: Wed, 21 Sep 2022 16:59:59 +0000
Message-Id: <20220921170002.29557-8-laoar.shao@gmail.com>
In-Reply-To: <20220921170002.29557-1-laoar.shao@gmail.com>

Allocate the aux struct after map creation, so that the generic helper
bpf_map_kzalloc() can be used instead of the open-coded kzalloc().
Signed-off-by: Yafang Shao
---
 kernel/bpf/arraymap.c | 18 +++++++++---------
 1 file changed, 9 insertions(+), 9 deletions(-)

diff --git a/kernel/bpf/arraymap.c b/kernel/bpf/arraymap.c
index dd79d0d..7f1766c 100644
--- a/kernel/bpf/arraymap.c
+++ b/kernel/bpf/arraymap.c
@@ -1111,20 +1111,20 @@ static struct bpf_map *prog_array_map_alloc(union bpf_attr *attr)
 	struct bpf_array_aux *aux;
 	struct bpf_map *map;
 
-	aux = kzalloc(sizeof(*aux), GFP_KERNEL_ACCOUNT);
-	if (!aux)
-		return ERR_PTR(-ENOMEM);
+	map = array_map_alloc(attr);
+	if (IS_ERR(map))
+		return map;
 
+	aux = bpf_map_kzalloc(map, sizeof(*aux), GFP_KERNEL);
+	if (!aux) {
+		array_map_free(map);
+		return ERR_PTR(-ENOMEM);
+	}
+
 	INIT_WORK(&aux->work, prog_array_map_clear_deferred);
 	INIT_LIST_HEAD(&aux->poke_progs);
 	mutex_init(&aux->poke_mutex);
 
-	map = array_map_alloc(attr);
-	if (IS_ERR(map)) {
-		kfree(aux);
-		return map;
-	}
-
 	container_of(map, struct bpf_array, map)->aux = aux;
 	aux->map = map;
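The reordering is the point of the patch: bpf_map_kzalloc() derives the memcg
to charge from its map argument, so the map must exist before aux can be
allocated. The same flow, restated in sketch form with editorial comments:

	map = array_map_alloc(attr);	/* the map (and its memcg) comes first */
	if (IS_ERR(map))
		return map;		/* propagate the real error code */

	/* aux is now charged to the map's memcg; __GFP_ACCOUNT is added
	 * inside the helper, so callers pass plain GFP_KERNEL. */
	aux = bpf_map_kzalloc(map, sizeof(*aux), GFP_KERNEL);
	if (!aux) {
		array_map_free(map);	/* unwind in reverse order */
		return ERR_PTR(-ENOMEM);
	}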
From patchwork Wed Sep 21 17:00:00 2022
From: Yafang Shao
Subject: [RFC PATCH bpf-next 08/10] bpf: Use bpf_map_kvcalloc in bpf_local_storage
Date: Wed, 21 Sep 2022 17:00:00 +0000
Message-Id: <20220921170002.29557-9-laoar.shao@gmail.com>
In-Reply-To: <20220921170002.29557-1-laoar.shao@gmail.com>

Introduce a new helper, bpf_map_kvcalloc(), for this memory allocation.

Signed-off-by: Yafang Shao
---
 include/linux/bpf.h            |  8 ++++++++
 kernel/bpf/bpf_local_storage.c |  4 ++--
 kernel/bpf/syscall.c           | 15 +++++++++++++++
 3 files changed, 25 insertions(+), 2 deletions(-)

diff --git a/include/linux/bpf.h b/include/linux/bpf.h
index eca1502..e1e5ada 100644
--- a/include/linux/bpf.h
+++ b/include/linux/bpf.h
@@ -1717,6 +1717,8 @@ int generic_map_delete_batch(struct bpf_map *map,
 void *bpf_map_kmalloc_node(const struct bpf_map *map, size_t size, gfp_t flags,
 			   int node);
 void *bpf_map_kzalloc(const struct bpf_map *map, size_t size, gfp_t flags);
+void *bpf_map_kvcalloc(struct bpf_map *map, size_t n, size_t size,
+		       gfp_t flags);
 void __percpu *bpf_map_alloc_percpu(const struct bpf_map *map, size_t size,
 				    size_t align, gfp_t flags);
 #else
@@ -1733,6 +1735,12 @@ void __percpu *bpf_map_alloc_percpu(const struct bpf_map *map, size_t size,
 	return kzalloc(size, flags);
 }
 
+static inline void *
+bpf_map_kvcalloc(struct bpf_map *map, size_t n, size_t size, gfp_t flags)
+{
+	return kvcalloc(n, size, flags);
+}
+
 static inline void __percpu *
 bpf_map_alloc_percpu(const struct bpf_map *map, size_t size, size_t align,
 		     gfp_t flags)
diff --git a/kernel/bpf/bpf_local_storage.c b/kernel/bpf/bpf_local_storage.c
index 44498d7d..8a24828 100644
--- a/kernel/bpf/bpf_local_storage.c
+++ b/kernel/bpf/bpf_local_storage.c
@@ -620,8 +620,8 @@ struct bpf_local_storage_map *bpf_local_storage_map_alloc(union bpf_attr *attr)
 	nbuckets = max_t(u32, 2, nbuckets);
 	smap->bucket_log = ilog2(nbuckets);
 
-	smap->buckets = kvcalloc(sizeof(*smap->buckets), nbuckets,
-				 GFP_USER | __GFP_NOWARN | __GFP_ACCOUNT);
+	smap->buckets = bpf_map_kvcalloc(&smap->map, sizeof(*smap->buckets),
+					 nbuckets, GFP_USER | __GFP_NOWARN);
 	if (!smap->buckets) {
 		bpf_map_area_free(smap, &smap->map);
 		return ERR_PTR(-ENOMEM);
diff --git a/kernel/bpf/syscall.c b/kernel/bpf/syscall.c
index 727c04c..6123c71 100644
--- a/kernel/bpf/syscall.c
+++ b/kernel/bpf/syscall.c
@@ -489,6 +489,21 @@ void *bpf_map_kzalloc(const struct bpf_map *map, size_t size, gfp_t flags)
 	return ptr;
 }
 
+void *bpf_map_kvcalloc(struct bpf_map *map, size_t n, size_t size,
+		       gfp_t flags)
+{
+	struct mem_cgroup *memcg, *old_memcg;
+	void *ptr;
+
+	memcg = bpf_map_get_memcg(map);
+	old_memcg = set_active_memcg(memcg);
+	ptr = kvcalloc(n, size, flags | __GFP_ACCOUNT);
+	set_active_memcg(old_memcg);
+	bpf_map_put_memcg(memcg);
+
+	return ptr;
+}
+
 void __percpu *bpf_map_alloc_percpu(const struct bpf_map *map, size_t size,
 				    size_t align, gfp_t flags)
 {
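For reference, a sketch of how a map implementation calls the new helper;
struct demo_map and demo_map_init_buckets() are invented for illustration,
and __GFP_ACCOUNT does not need to be passed because the helper adds it
itself:

struct demo_bucket;

struct demo_map {
	struct bpf_map map;
	struct demo_bucket *buckets;
};

static int demo_map_init_buckets(struct demo_map *dmap, u32 nbuckets)
{
	/* Charged to dmap->map's memcg, mirroring bpf_local_storage above. */
	dmap->buckets = bpf_map_kvcalloc(&dmap->map, sizeof(*dmap->buckets),
					 nbuckets, GFP_USER | __GFP_NOWARN);
	if (!dmap->buckets)
		return -ENOMEM;
	return 0;
}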
From patchwork Wed Sep 21 17:00:01 2022
From: Yafang Shao
Subject: [RFC PATCH bpf-next 09/10] bpf: Add bpf map free helpers
Date: Wed, 21 Sep 2022 17:00:01 +0000
Message-Id: <20220921170002.29557-10-laoar.shao@gmail.com>
In-Reply-To: <20220921170002.29557-1-laoar.shao@gmail.com>

Some new helpers are introduced to free bpf memory, to be used instead of
the generic free helpers. These new helpers give us a single place to hook
into, so that the freeing of bpf memory can be tracked in the future.

Signed-off-by: Yafang Shao
---
 include/linux/bpf.h            | 24 ++++++++++++++++++++++++
 kernel/bpf/arraymap.c          |  4 ++--
 kernel/bpf/bpf_local_storage.c | 10 +++++-----
 kernel/bpf/cpumap.c            | 13 ++++++-------
 kernel/bpf/devmap.c            |  8 ++++----
 kernel/bpf/hashtab.c           |  2 +-
 kernel/bpf/helpers.c           |  2 +-
 kernel/bpf/local_storage.c     | 10 +++++-----
 kernel/bpf/lpm_trie.c          |  2 +-
 kernel/bpf/ringbuf.c           |  7 ++++++-
 kernel/bpf/syscall.c           | 14 ++++++++++++++
 net/xdp/xskmap.c               |  2 +-
 12 files changed, 70 insertions(+), 28 deletions(-)

diff --git a/include/linux/bpf.h b/include/linux/bpf.h
index e1e5ada..f7a4cfc 100644
--- a/include/linux/bpf.h
+++ b/include/linux/bpf.h
@@ -1721,6 +1721,12 @@ void *bpf_map_kvcalloc(struct bpf_map *map, size_t n, size_t size,
 		       gfp_t flags);
 void __percpu *bpf_map_alloc_percpu(const struct bpf_map *map, size_t size,
 				    size_t align, gfp_t flags);
+void bpf_map_kfree(const void *ptr);
+void bpf_map_kvfree(const void *ptr);
+void bpf_map_free_percpu(void __percpu *ptr);
+
+#define bpf_map_kfree_rcu(ptr, rhf...) kvfree_rcu(ptr, ## rhf)
+
 #else
 static inline void *
 bpf_map_kmalloc_node(const struct bpf_map *map, size_t size, gfp_t flags,
@@ -1747,6 +1753,24 @@ void __percpu *bpf_map_alloc_percpu(const struct bpf_map *map, size_t size,
 {
 	return __alloc_percpu_gfp(size, align, flags);
 }
+
+static inline void bpf_map_kfree(const void *ptr)
+{
+	kfree(ptr);
+}
+
+static inline void bpf_map_kvfree(const void *ptr)
+{
+	kvfree(ptr);
+}
+
+static inline void bpf_map_free_percpu(void __percpu *ptr)
+{
+	free_percpu(ptr);
+}
+
+#define bpf_map_kfree_rcu(ptr, rhf...) kvfree_rcu(ptr, ## rhf)
+
 #endif
 
 extern int sysctl_unprivileged_bpf_disabled;
 
diff --git a/kernel/bpf/arraymap.c b/kernel/bpf/arraymap.c
index 7f1766c..9bdb99d 100644
--- a/kernel/bpf/arraymap.c
+++ b/kernel/bpf/arraymap.c
@@ -24,7 +24,7 @@ static void bpf_array_free_percpu(struct bpf_array *array)
 	int i;
 
 	for (i = 0; i < array->map.max_entries; i++) {
-		free_percpu(array->pptrs[i]);
+		bpf_map_free_percpu(array->pptrs[i]);
 		cond_resched();
 	}
 }
@@ -1141,7 +1141,7 @@ static void prog_array_map_free(struct bpf_map *map)
 		list_del_init(&elem->list);
 		kfree(elem);
 	}
-	kfree(aux);
+	bpf_map_kfree(aux);
 	fd_array_map_free(map);
 }
 
diff --git a/kernel/bpf/bpf_local_storage.c b/kernel/bpf/bpf_local_storage.c
index 8a24828..6ef49aa 100644
--- a/kernel/bpf/bpf_local_storage.c
+++ b/kernel/bpf/bpf_local_storage.c
@@ -89,7 +89,7 @@ void bpf_local_storage_free_rcu(struct rcu_head *rcu)
 	struct bpf_local_storage *local_storage;
 
 	local_storage = container_of(rcu, struct bpf_local_storage, rcu);
-	kfree_rcu(local_storage, rcu);
+	bpf_map_kfree_rcu(local_storage, rcu);
 }
 
 static void bpf_selem_free_rcu(struct rcu_head *rcu)
@@ -97,7 +97,7 @@ static void bpf_selem_free_rcu(struct rcu_head *rcu)
 	struct bpf_local_storage_elem *selem;
 
 	selem = container_of(rcu, struct bpf_local_storage_elem, rcu);
-	kfree_rcu(selem, rcu);
+	bpf_map_kfree_rcu(selem, rcu);
 }
 
 /* local_storage->lock must be held and selem->local_storage == local_storage.
@@ -153,7 +153,7 @@ bool bpf_selem_unlink_storage_nolock(struct bpf_local_storage *local_storage,
 	if (use_trace_rcu)
 		call_rcu_tasks_trace(&selem->rcu, bpf_selem_free_rcu);
 	else
-		kfree_rcu(selem, rcu);
+		bpf_map_kfree_rcu(selem, rcu);
 
 	return free_local_storage;
 }
@@ -348,7 +348,7 @@ int bpf_local_storage_alloc(void *owner,
 	return 0;
 
 uncharge:
-	kfree(storage);
+	bpf_map_kfree(storage);
 	mem_uncharge(smap, owner, sizeof(*storage));
 	return err;
 }
@@ -581,7 +581,7 @@ void bpf_local_storage_map_free(struct bpf_local_storage_map *smap,
 	 */
 	synchronize_rcu();
 
-	kvfree(smap->buckets);
+	bpf_map_kvfree(smap->buckets);
 	bpf_map_area_free(smap, &smap->map);
 }
 
diff --git a/kernel/bpf/cpumap.c b/kernel/bpf/cpumap.c
index b593157..5ee774e 100644
--- a/kernel/bpf/cpumap.c
+++ b/kernel/bpf/cpumap.c
@@ -166,8 +166,8 @@ static void put_cpu_map_entry(struct bpf_cpu_map_entry *rcpu)
 		/* The queue should be empty at this point */
 		__cpu_map_ring_cleanup(rcpu->queue);
 		ptr_ring_cleanup(rcpu->queue, NULL);
-		kfree(rcpu->queue);
-		kfree(rcpu);
+		bpf_map_kfree(rcpu->queue);
+		bpf_map_kfree(rcpu);
 	}
 }
 
@@ -486,11 +486,11 @@ static int __cpu_map_load_bpf_program(struct bpf_cpu_map_entry *rcpu,
 free_ptr_ring:
 	ptr_ring_cleanup(rcpu->queue, NULL);
 free_queue:
-	kfree(rcpu->queue);
+	bpf_map_kfree(rcpu->queue);
 free_bulkq:
-	free_percpu(rcpu->bulkq);
+	bpf_map_free_percpu(rcpu->bulkq);
 free_rcu:
-	kfree(rcpu);
+	bpf_map_kfree(rcpu);
 	return NULL;
 }
 
@@ -504,8 +504,7 @@ static void __cpu_map_entry_free(struct rcu_head *rcu)
 	 * find this entry.
 	 */
 	rcpu = container_of(rcu, struct bpf_cpu_map_entry, rcu);
-
-	free_percpu(rcpu->bulkq);
+	bpf_map_free_percpu(rcpu->bulkq);
 	/* Cannot kthread_stop() here, last put free rcpu resources */
 	put_cpu_map_entry(rcpu);
 }
 
diff --git a/kernel/bpf/devmap.c b/kernel/bpf/devmap.c
index 807a4cd..38bd7be 100644
--- a/kernel/bpf/devmap.c
+++ b/kernel/bpf/devmap.c
@@ -220,7 +220,7 @@ static void dev_map_free(struct bpf_map *map)
 			if (dev->xdp_prog)
 				bpf_prog_put(dev->xdp_prog);
 			dev_put(dev->dev);
-			kfree(dev);
+			bpf_map_kfree(dev);
 		}
 	}
 
@@ -236,7 +236,7 @@ static void dev_map_free(struct bpf_map *map)
 			if (dev->xdp_prog)
 				bpf_prog_put(dev->xdp_prog);
 			dev_put(dev->dev);
-			kfree(dev);
+			bpf_map_kfree(dev);
 		}
 
 		bpf_map_area_free(dtab->netdev_map, NULL);
@@ -793,12 +793,12 @@ static void *dev_map_hash_lookup_elem(struct bpf_map *map, void *key)
 static void __dev_map_entry_free(struct rcu_head *rcu)
 {
 	struct bpf_dtab_netdev *dev;
 
 	dev = container_of(rcu, struct bpf_dtab_netdev, rcu);
 	if (dev->xdp_prog)
 		bpf_prog_put(dev->xdp_prog);
 	dev_put(dev->dev);
-	kfree(dev);
+	bpf_map_kfree(dev);
 }
 
 static int dev_map_delete_elem(struct bpf_map *map, void *key)
@@ -883,7 +883,7 @@ static struct bpf_dtab_netdev *__dev_map_alloc_node(struct net *net,
 err_put_dev:
 	dev_put(dev->dev);
 err_out:
-	kfree(dev);
+	bpf_map_kfree(dev);
 	return ERR_PTR(-EINVAL);
 }
 
diff --git a/kernel/bpf/hashtab.c b/kernel/bpf/hashtab.c
index 89887df..7f43371 100644
--- a/kernel/bpf/hashtab.c
+++ b/kernel/bpf/hashtab.c
@@ -1562,7 +1562,7 @@ static void htab_map_free(struct bpf_map *map)
 	}
 
 	bpf_map_free_kptr_off_tab(map);
-	free_percpu(htab->extra_elems);
+	bpf_map_free_percpu(htab->extra_elems);
 	bpf_map_area_free(htab->buckets, NULL);
 	bpf_mem_alloc_destroy(&htab->pcpu_ma);
 	bpf_mem_alloc_destroy(&htab->ma);
diff --git a/kernel/bpf/helpers.c b/kernel/bpf/helpers.c
index 41aeaf3..fd0549b 100644
--- a/kernel/bpf/helpers.c
+++ b/kernel/bpf/helpers.c
@@ -1366,7 +1366,7 @@ void bpf_timer_cancel_and_free(void *val)
 	 */
 	if (this_cpu_read(hrtimer_running) != t)
 		hrtimer_cancel(&t->timer);
-	kfree(t);
+	bpf_map_kfree(t);
 }
 
 BPF_CALL_2(bpf_kptr_xchg, void *, map_value, void *, ptr)
diff --git a/kernel/bpf/local_storage.c b/kernel/bpf/local_storage.c
index fcc7ece..035ef9e 100644
--- a/kernel/bpf/local_storage.c
+++ b/kernel/bpf/local_storage.c
@@ -174,7 +174,7 @@ static int cgroup_storage_update_elem(struct bpf_map *map, void *key,
 	check_and_init_map_value(map, new->data);
 
 	new = xchg(&storage->buf, new);
-	kfree_rcu(new, rcu);
+	bpf_map_kfree_rcu(new, rcu);
 
 	return 0;
 }
@@ -526,7 +526,7 @@ struct bpf_cgroup_storage *bpf_cgroup_storage_alloc(struct bpf_prog *prog,
 	return storage;
 
 enomem:
-	kfree(storage);
+	bpf_map_kfree(storage);
 	return ERR_PTR(-ENOMEM);
 }
 
@@ -535,8 +535,8 @@ static void free_shared_cgroup_storage_rcu(struct rcu_head *rcu)
 	struct bpf_cgroup_storage *storage =
 		container_of(rcu, struct bpf_cgroup_storage, rcu);
 
-	kfree(storage->buf);
-	kfree(storage);
+	bpf_map_kfree(storage->buf);
+	bpf_map_kfree(storage);
 }
 
 static void free_percpu_cgroup_storage_rcu(struct rcu_head *rcu)
@@ -545,7 +545,7 @@ static void free_percpu_cgroup_storage_rcu(struct rcu_head *rcu)
 		container_of(rcu, struct bpf_cgroup_storage, rcu);
 
 	free_percpu(storage->percpu_buf);
-	kfree(storage);
+	bpf_map_kfree(storage);
 }
 
 void bpf_cgroup_storage_free(struct bpf_cgroup_storage *storage)
diff --git a/kernel/bpf/lpm_trie.c b/kernel/bpf/lpm_trie.c
index 3d329ae..815e5d4 100644
--- a/kernel/bpf/lpm_trie.c
+++ b/kernel/bpf/lpm_trie.c
@@ -602,7 +602,7 @@ static void trie_free(struct bpf_map *map)
 			continue;
 		}
 
-		kfree(node);
+		bpf_map_kfree(node);
 		RCU_INIT_POINTER(*slot, NULL);
 		break;
 	}
 
diff --git a/kernel/bpf/ringbuf.c b/kernel/bpf/ringbuf.c
index 1e7284c..535e440 100644
--- a/kernel/bpf/ringbuf.c
+++ b/kernel/bpf/ringbuf.c
@@ -59,12 +59,17 @@ struct bpf_ringbuf_hdr {
 	u32 pg_off;
 };
 
+static inline void bpf_map_free_page(struct page *page)
+{
+	__free_page(page);
+}
+
 static void bpf_ringbuf_pages_free(struct page **pages, int nr_pages)
 {
 	int i;
 
 	for (i = 0; i < nr_pages; i++)
-		__free_page(pages[i]);
+		bpf_map_free_page(pages[i]);
 	bpf_map_area_free(pages, NULL);
 }
 
diff --git a/kernel/bpf/syscall.c b/kernel/bpf/syscall.c
index 6123c71..b9250c8 100644
--- a/kernel/bpf/syscall.c
+++ b/kernel/bpf/syscall.c
@@ -519,6 +519,20 @@ void __percpu *bpf_map_alloc_percpu(const struct bpf_map *map, size_t size,
 	return ptr;
 }
 
+void bpf_map_kfree(const void *ptr)
+{
+	kfree(ptr);
+}
+
+void bpf_map_kvfree(const void *ptr)
+{
+	kvfree(ptr);
+}
+
+void bpf_map_free_percpu(void __percpu *ptr)
+{
+	free_percpu(ptr);
+}
 #endif
 
 static int bpf_map_kptr_off_cmp(const void *a, const void *b)
diff --git a/net/xdp/xskmap.c b/net/xdp/xskmap.c
index beb11fd..e9d93b8 100644
--- a/net/xdp/xskmap.c
+++ b/net/xdp/xskmap.c
@@ -33,7 +33,7 @@ static struct xsk_map_node *xsk_map_node_alloc(struct xsk_map *map,
 
 static void xsk_map_node_free(struct xsk_map_node *node)
 {
 	bpf_map_put(&node->map->map);
-	kfree(node);
+	bpf_map_kfree(node);
 }
 
 static void xsk_map_sock_add(struct xdp_sock *xs, struct xsk_map_node *node)
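The discipline this patch establishes is mechanical but easy to violate:
memory obtained through a bpf_map_* allocation helper must be released
through the matching bpf_map_* free helper, never through
kfree()/kvfree()/free_percpu() directly, so that a later patch can hook both
sides. A sketch with a hypothetical map and element type:

	struct demo_elem *e;

	e = bpf_map_kmalloc_node(&dmap->map, sizeof(*e),
				 GFP_KERNEL | __GFP_NOWARN,
				 dmap->map.numa_node);
	if (!e)
		return -ENOMEM;

	/* ... use e ... */

	bpf_map_kfree(e);	/* not kfree(): keeps alloc/free paired */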
From patchwork Wed Sep 21 17:00:02 2022
From: Yafang Shao
Subject: [RFC PATCH bpf-next 10/10] bpf, memcg: Add new item bpf into memory.stat
Date: Wed, 21 Sep 2022 17:00:02 +0000
Message-Id: <20220921170002.29557-11-laoar.shao@gmail.com>
In-Reply-To: <20220921170002.29557-1-laoar.shao@gmail.com>

A new item 'bpf' is introduced into memory.stat, so that we can get the
memory consumed by bpf. Currently only the memory of bpf maps is accounted.

The accounting of this new item is implemented with scope-based accounting,
similar to set_active_memcg(). Within such a scope, the memory allocated
(or freed) is accounted (or unaccounted) to a specific item, which is
specified by set_active_memcg_item().

The result in cgroup v1 is as follows:

  $ cat /sys/fs/cgroup/memory/foo/memory.stat | grep bpf
  bpf 109056000
  total_bpf 109056000

After the map is removed, the counter becomes zero again:

  $ cat /sys/fs/cgroup/memory/foo/memory.stat | grep bpf
  bpf 0
  total_bpf 0

'bpf' may not drop to 0 immediately after a bpf map is destroyed, because
there may still be cached objects.

Note that there's no kmemcg in the root memory cgroup, so the item 'bpf'
will always be 0 in the root memory cgroup. If a bpf map is charged into
the root memcg directly, its memory size will not be accounted, so
'total_bpf' can't be used to monitor system-wide bpf memory consumption yet.
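In sketch form, one such accounting scope follows the same save/restore
pattern as set_active_memcg(); the kmalloc() inside is illustrative, the
helpers are the ones added below:

	int old_item;

	old_item = set_active_memcg_item(MEMCG_BPF);	/* enter bpf scope */
	ptr = kmalloc(size, GFP_KERNEL | __GFP_ACCOUNT);/* counted as bpf */
	set_active_memcg_item(old_item);		/* restore outer scope */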
Signed-off-by: Yafang Shao
---
 include/linux/bpf.h        | 10 ++++++++--
 include/linux/memcontrol.h |  1 +
 include/linux/sched.h      |  1 +
 include/linux/sched/mm.h   | 24 ++++++++++++++++++++++++
 kernel/bpf/memalloc.c      | 10 ++++++++++
 kernel/bpf/ringbuf.c       |  4 ++++
 kernel/bpf/syscall.c       | 40 ++++++++++++++++++++++++++++++++++++++--
 kernel/fork.c              |  1 +
 mm/memcontrol.c            | 20 ++++++++++++++++++++
 9 files changed, 107 insertions(+), 4 deletions(-)

diff --git a/include/linux/bpf.h b/include/linux/bpf.h
index f7a4cfc..9eda143 100644
--- a/include/linux/bpf.h
+++ b/include/linux/bpf.h
@@ -1725,7 +1725,13 @@ void __percpu *bpf_map_alloc_percpu(const struct bpf_map *map, size_t size,
 void bpf_map_kvfree(const void *ptr);
 void bpf_map_free_percpu(void __percpu *ptr);
 
-#define bpf_map_kfree_rcu(ptr, rhf...) kvfree_rcu(ptr, ## rhf)
+#define bpf_map_kfree_rcu(ptr, rhf...) {			\
+	int old_item;						\
+								\
+	old_item = set_active_memcg_item(MEMCG_BPF);		\
+	kvfree_rcu(ptr, ## rhf);				\
+	set_active_memcg_item(old_item);			\
+}
 
 #else
@@ -1771,7 +1777,7 @@ static inline void bpf_map_free_percpu(void __percpu *ptr)
 
 #define bpf_map_kfree_rcu(ptr, rhf...) kvfree_rcu(ptr, ## rhf)
 
-#endif
+#endif /* CONFIG_MEMCG_KMEM */
 
 extern int sysctl_unprivileged_bpf_disabled;
 
diff --git a/include/linux/memcontrol.h b/include/linux/memcontrol.h
index d4a0ad3..f345467 100644
--- a/include/linux/memcontrol.h
+++ b/include/linux/memcontrol.h
@@ -37,6 +37,7 @@ enum memcg_stat_item {
 	MEMCG_KMEM,
 	MEMCG_ZSWAP_B,
 	MEMCG_ZSWAPPED,
+	MEMCG_BPF,
 	MEMCG_NR_STAT,
 };
 
diff --git a/include/linux/sched.h b/include/linux/sched.h
index e7b2f8a..79362da 100644
--- a/include/linux/sched.h
+++ b/include/linux/sched.h
@@ -1423,6 +1423,7 @@ struct task_struct {
 
 	/* Used by memcontrol for targeted memcg charge: */
 	struct mem_cgroup		*active_memcg;
+	int				active_item;
 #endif
 
 #ifdef CONFIG_BLK_CGROUP
diff --git a/include/linux/sched/mm.h b/include/linux/sched/mm.h
index 2a24361..3a334c7 100644
--- a/include/linux/sched/mm.h
+++ b/include/linux/sched/mm.h
@@ -363,6 +363,7 @@ static inline void memalloc_pin_restore(unsigned int flags)
 
 #ifdef CONFIG_MEMCG
 DECLARE_PER_CPU(struct mem_cgroup *, int_active_memcg);
+DECLARE_PER_CPU(int, int_active_item);
 /**
  * set_active_memcg - Starts the remote memcg charging scope.
  * @memcg: memcg to charge.
@@ -389,12 +390,35 @@ static inline void memalloc_pin_restore(unsigned int flags)
 	return old;
 }
 
+static inline int
+set_active_memcg_item(int item)
+{
+	int old_item;
+
+	if (!in_task()) {
+		old_item = this_cpu_read(int_active_item);
+		this_cpu_write(int_active_item, item);
+	} else {
+		old_item = current->active_item;
+		current->active_item = item;
+	}
+
+	return old_item;
+}
+
 #else
 static inline struct mem_cgroup *
 set_active_memcg(struct mem_cgroup *memcg)
 {
 	return NULL;
 }
+
+static inline int
+set_active_memcg_item(int item)
+{
+	return MEMCG_NR_STAT;
+}
 #endif
 
 #ifdef CONFIG_MEMBARRIER
diff --git a/kernel/bpf/memalloc.c b/kernel/bpf/memalloc.c
index 5f83be1..51d59d4 100644
--- a/kernel/bpf/memalloc.c
+++ b/kernel/bpf/memalloc.c
@@ -165,11 +165,14 @@ static void alloc_bulk(struct bpf_mem_cache *c, int cnt, int node)
 {
 	struct mem_cgroup *memcg = NULL, *old_memcg;
 	unsigned long flags;
+	int old_item;
 	void *obj;
 	int i;
 
 	memcg = get_memcg(c);
 	old_memcg = set_active_memcg(memcg);
+	old_item = set_active_memcg_item(MEMCG_BPF);
+
 	for (i = 0; i < cnt; i++) {
 		obj = __alloc(c, node);
 		if (!obj)
@@ -194,19 +197,26 @@ static void alloc_bulk(struct bpf_mem_cache *c, int cnt, int node)
 		if (IS_ENABLED(CONFIG_PREEMPT_RT))
 			local_irq_restore(flags);
 	}
+
+	set_active_memcg_item(old_item);
 	set_active_memcg(old_memcg);
 	mem_cgroup_put(memcg);
 }
 
 static void free_one(struct bpf_mem_cache *c, void *obj)
 {
+	int old_item;
+
+	old_item = set_active_memcg_item(MEMCG_BPF);
 	if (c->percpu_size) {
 		free_percpu(((void **)obj)[1]);
 		kfree(obj);
+		set_active_memcg_item(old_item);
 		return;
 	}
 
 	kfree(obj);
+	set_active_memcg_item(old_item);
 }
 
 static void __free_rcu(struct rcu_head *head)
diff --git a/kernel/bpf/ringbuf.c b/kernel/bpf/ringbuf.c
index 535e440..72435bd 100644
--- a/kernel/bpf/ringbuf.c
+++ b/kernel/bpf/ringbuf.c
@@ -61,7 +61,11 @@ struct bpf_ringbuf_hdr {
 
 static inline void bpf_map_free_page(struct page *page)
 {
+	int old_item;
+
+	old_item = set_active_memcg_item(MEMCG_BPF);
 	__free_page(page);
+	set_active_memcg_item(old_item);
 }
 
 static void bpf_ringbuf_pages_free(struct page **pages, int nr_pages)
diff --git a/kernel/bpf/syscall.c b/kernel/bpf/syscall.c
index b9250c8..703aa6a 100644
--- a/kernel/bpf/syscall.c
+++ b/kernel/bpf/syscall.c
@@ -340,11 +340,14 @@ static void *__bpf_map_area_alloc(u64 size, int numa_node, bool mmapable)
 	const gfp_t gfp = __GFP_NOWARN | __GFP_ZERO | __GFP_ACCOUNT;
 	unsigned int flags = 0;
 	unsigned long align = 1;
+	int old_item;
 	void *area;
+	void *ptr;
 
 	if (size >= SIZE_MAX)
 		return NULL;
 
+	old_item = set_active_memcg_item(MEMCG_BPF);
 	/* kmalloc()'ed memory can't be mmap()'ed */
 	if (mmapable) {
 		BUG_ON(!PAGE_ALIGNED(size));
@@ -353,13 +356,18 @@ static void *__bpf_map_area_alloc(u64 size, int numa_node, bool mmapable)
 	} else if (size <= (PAGE_SIZE << PAGE_ALLOC_COSTLY_ORDER)) {
 		area = kmalloc_node(size, gfp | GFP_USER | __GFP_NORETRY,
 				    numa_node);
-		if (area != NULL)
+		if (area != NULL) {
+			set_active_memcg_item(old_item);
 			return area;
+		}
 	}
 
-	return __vmalloc_node_range(size, align, VMALLOC_START, VMALLOC_END,
+	ptr = __vmalloc_node_range(size, align, VMALLOC_START, VMALLOC_END,
 			gfp | GFP_KERNEL | __GFP_RETRY_MAYFAIL, PAGE_KERNEL,
 			flags, numa_node, __builtin_return_address(0));
+
+	set_active_memcg_item(old_item);
+	return ptr;
 }
 
 void *bpf_map_area_alloc(u64 size, int numa_node, struct bpf_map *map)
@@ -386,9 +394,13 @@ void *bpf_map_area_mmapable_alloc(u64 size, int numa_node)
 
 void bpf_map_area_free(void *area, struct bpf_map *map)
 {
+	int old_item;
+
 	if (map)
 		bpf_map_release_memcg(map);
+
+	old_item = set_active_memcg_item(MEMCG_BPF);
 	kvfree(area);
+	set_active_memcg_item(old_item);
 }
 
 static u32 bpf_map_flags_retain_permanent(u32 flags)
@@ -464,11 +476,14 @@ void *bpf_map_kmalloc_node(const struct bpf_map *map, size_t size, gfp_t flags,
 			   int node)
 {
 	struct mem_cgroup *memcg, *old_memcg;
+	int old_item;
 	void *ptr;
 
 	memcg = bpf_map_get_memcg(map);
 	old_memcg = set_active_memcg(memcg);
+	old_item = set_active_memcg_item(MEMCG_BPF);
 	ptr = kmalloc_node(size, flags | __GFP_ACCOUNT, node);
+	set_active_memcg_item(old_item);
 	set_active_memcg(old_memcg);
 	bpf_map_put_memcg(memcg);
 
@@ -479,10 +494,13 @@ void *bpf_map_kzalloc(const struct bpf_map *map, size_t size, gfp_t flags)
 {
 	struct mem_cgroup *memcg, *old_memcg;
 	void *ptr;
+	int old_item;
 
 	memcg = bpf_map_get_memcg(map);
 	old_memcg = set_active_memcg(memcg);
+	old_item = set_active_memcg_item(MEMCG_BPF);
 	ptr = kzalloc(size, flags | __GFP_ACCOUNT);
+	set_active_memcg_item(old_item);
 	set_active_memcg(old_memcg);
 	bpf_map_put_memcg(memcg);
 
@@ -494,11 +512,14 @@ void *bpf_map_kvcalloc(struct bpf_map *map, size_t n, size_t size,
 {
 	struct mem_cgroup *memcg, *old_memcg;
 	void *ptr;
+	int old_item;
 
 	memcg = bpf_map_get_memcg(map);
 	old_memcg = set_active_memcg(memcg);
+	old_item = set_active_memcg_item(MEMCG_BPF);
 	ptr = kvcalloc(n, size, flags | __GFP_ACCOUNT);
 	set_active_memcg(old_memcg);
+	set_active_memcg_item(old_item);
 	bpf_map_put_memcg(memcg);
 
 	return ptr;
@@ -509,10 +530,13 @@ void __percpu *bpf_map_alloc_percpu(const struct bpf_map *map, size_t size,
 {
 	struct mem_cgroup *memcg, *old_memcg;
 	void __percpu *ptr;
+	int old_item;
 
 	memcg = bpf_map_get_memcg(map);
 	old_memcg = set_active_memcg(memcg);
+	old_item = set_active_memcg_item(MEMCG_BPF);
 	ptr = __alloc_percpu_gfp(size, align, flags | __GFP_ACCOUNT);
+	set_active_memcg_item(old_item);
 	set_active_memcg(old_memcg);
 	bpf_map_put_memcg(memcg);
 
@@ -521,17 +545,29 @@ void __percpu *bpf_map_alloc_percpu(const struct bpf_map *map, size_t size,
 
 void bpf_map_kfree(const void *ptr)
 {
+	int old_item;
+
+	old_item = set_active_memcg_item(MEMCG_BPF);
 	kfree(ptr);
+	set_active_memcg_item(old_item);
 }
 
 void bpf_map_kvfree(const void *ptr)
 {
+	int old_item;
+
+	old_item = set_active_memcg_item(MEMCG_BPF);
 	kvfree(ptr);
+	set_active_memcg_item(old_item);
 }
 
 void bpf_map_free_percpu(void __percpu *ptr)
 {
+	int old_item;
+
+	old_item = set_active_memcg_item(MEMCG_BPF);
 	free_percpu(ptr);
+	set_active_memcg_item(old_item);
 }
 #endif
 
diff --git a/kernel/fork.c b/kernel/fork.c
index 90c85b1..dac2429 100644
--- a/kernel/fork.c
+++ b/kernel/fork.c
@@ -1043,6 +1043,7 @@ static struct task_struct *dup_task_struct(struct task_struct *orig, int node)
 
 #ifdef CONFIG_MEMCG
 	tsk->active_memcg = NULL;
+	tsk->active_item = 0;
 #endif
 
 #ifdef CONFIG_CPU_SUP_INTEL
diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index b69979c..9008417 100644
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -82,6 +82,10 @@
 DEFINE_PER_CPU(struct mem_cgroup *, int_active_memcg);
 EXPORT_PER_CPU_SYMBOL_GPL(int_active_memcg);
 
+/* Active accounting item to use from an interrupt context */
+DEFINE_PER_CPU(int, int_active_item);
+EXPORT_PER_CPU_SYMBOL_GPL(int_active_item);
+
 /* Socket memory accounting disabled? */
 static bool cgroup_memory_nosocket __ro_after_init;
 
@@ -923,6 +927,14 @@ static __always_inline struct mem_cgroup *active_memcg(void)
 	return current->active_memcg;
 }
 
+static __always_inline int active_memcg_item(void)
+{
+	if (!in_task())
+		return this_cpu_read(int_active_item);
+
+	return current->active_item;
+}
+
 /**
  * get_mem_cgroup_from_mm: Obtain a reference on given mm_struct's memcg.
  * @mm: mm from which memcg should be extracted. It can be NULL.
@@ -1436,6 +1448,7 @@ struct memory_stat {
 	{ "workingset_restore_anon",	WORKINGSET_RESTORE_ANON },
 	{ "workingset_restore_file",	WORKINGSET_RESTORE_FILE },
 	{ "workingset_nodereclaim",	WORKINGSET_NODERECLAIM },
+	{ "bpf",			MEMCG_BPF },
 };
 
 /* Translate stat items to the correct unit for memory.stat output */
@@ -2993,6 +3006,11 @@ struct obj_cgroup *get_obj_cgroup_from_page(struct page *page)
 
 static void memcg_account_kmem(struct mem_cgroup *memcg, int nr_pages)
 {
+	int item = active_memcg_item();
+
+	WARN_ON_ONCE(item != 0 && (item < MEMCG_SWAP || item >= MEMCG_NR_STAT));
+	if (item)
+		mod_memcg_state(memcg, item, nr_pages);
 	mod_memcg_state(memcg, MEMCG_KMEM, nr_pages);
 	if (!cgroup_subsys_on_dfl(memory_cgrp_subsys)) {
 		if (nr_pages > 0)
@@ -3976,6 +3994,7 @@ static int memcg_numa_stat_show(struct seq_file *m, void *v)
 	NR_FILE_DIRTY,
 	NR_WRITEBACK,
 	MEMCG_SWAP,
+	MEMCG_BPF,
 };
 
 static const char *const memcg1_stat_names[] = {
@@ -3989,6 +4008,7 @@ static int memcg_numa_stat_show(struct seq_file *m, void *v)
 	"dirty",
 	"writeback",
 	"swap",
+	"bpf",
 };
 
 /* Universal VM events cgroup1 shows, original sort order */
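To close, a stand-alone toy model of what memcg_account_kmem() does after
this patch: inside a MEMCG_BPF scope every charge (and, via negative
nr_pages, every uncharge) updates both the scoped item and MEMCG_KMEM, so
'bpf' is always a subset of kernel memory. This is also why the free helpers
from the previous patch must re-enter the MEMCG_BPF scope. Names and values
here are invented for the demo; the real counters live in struct mem_cgroup:

#include <stdio.h>

enum { NO_ITEM, MEMCG_KMEM, MEMCG_BPF, MEMCG_NR_STAT };

static long stat[MEMCG_NR_STAT];
static int active_item = NO_ITEM;

/* Toy counterpart of memcg_account_kmem() above. */
static void account_kmem(long nr_pages)
{
	if (active_item)
		stat[active_item] += nr_pages;
	stat[MEMCG_KMEM] += nr_pages;
}

int main(void)
{
	int old_item = active_item;	/* set_active_memcg_item(MEMCG_BPF) */

	active_item = MEMCG_BPF;
	account_kmem(4);		/* charge 4 pages for a map */
	account_kmem(-4);		/* free them inside the same scope */
	active_item = old_item;		/* restore */

	printf("kmem=%ld bpf=%ld\n", stat[MEMCG_KMEM], stat[MEMCG_BPF]);
	return 0;			/* prints kmem=0 bpf=0 */
}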