From patchwork Sat Jun 24 03:13:21 2023
From: Alexei Starovoitov
Subject: [PATCH v2 bpf-next 01/13] bpf: Rename few bpf_mem_alloc fields.
Date: Fri, 23 Jun 2023 20:13:21 -0700
Message-Id: <20230624031333.96597-2-alexei.starovoitov@gmail.com>

Rename:
- struct rcu_head rcu;
- struct llist_head free_by_rcu;
- struct llist_head waiting_for_gp;
- atomic_t call_rcu_in_progress;
+ struct llist_head free_by_rcu_ttrace;
+ struct llist_head waiting_for_gp_ttrace;
+ struct rcu_head rcu_ttrace;
+ atomic_t call_rcu_ttrace_in_progress;
...
- static void do_call_rcu(struct bpf_mem_cache *c)
+ static void do_call_rcu_ttrace(struct bpf_mem_cache *c)

to better indicate the intended use. 'tasks trace' is shortened to
'ttrace' to reduce verbosity.

No functional changes.

Later patches will add free_by_rcu/waiting_for_gp fields to be used
with normal RCU.

Signed-off-by: Alexei Starovoitov
---
 kernel/bpf/memalloc.c | 57 ++++++++++++++++++++++---------------------
 1 file changed, 29 insertions(+), 28 deletions(-)

diff --git a/kernel/bpf/memalloc.c b/kernel/bpf/memalloc.c
index 0668bcd7c926..cc5b8adb4c83 100644
--- a/kernel/bpf/memalloc.c
+++ b/kernel/bpf/memalloc.c
@@ -99,10 +99,11 @@ struct bpf_mem_cache {
         int low_watermark, high_watermark, batch;
         int percpu_size;

-        struct rcu_head rcu;
-        struct llist_head free_by_rcu;
-        struct llist_head waiting_for_gp;
-        atomic_t call_rcu_in_progress;
+        /* list of objects to be freed after RCU tasks trace GP */
+        struct llist_head free_by_rcu_ttrace;
+        struct llist_head waiting_for_gp_ttrace;
+        struct rcu_head rcu_ttrace;
+        atomic_t call_rcu_ttrace_in_progress;
 };

 struct bpf_mem_caches {
@@ -165,18 +166,18 @@ static void alloc_bulk(struct bpf_mem_cache *c, int cnt, int node)
         old_memcg = set_active_memcg(memcg);
         for (i = 0; i < cnt; i++) {
                 /*
-                 * free_by_rcu is only manipulated by irq work refill_work().
+                 * free_by_rcu_ttrace is only manipulated by irq work refill_work().
                  * IRQ works on the same CPU are called sequentially, so it is
                  * safe to use __llist_del_first() here. If alloc_bulk() is
                  * invoked by the initial prefill, there will be no running
                  * refill_work(), so __llist_del_first() is fine as well.
                  *
-                 * In most cases, objects on free_by_rcu are from the same CPU.
+                 * In most cases, objects on free_by_rcu_ttrace are from the same CPU.
                  * If some objects come from other CPUs, it doesn't incur any
                  * harm because NUMA_NO_NODE means the preference for current
                  * numa node and it is not a guarantee.
                  */
-                obj = __llist_del_first(&c->free_by_rcu);
+                obj = __llist_del_first(&c->free_by_rcu_ttrace);
                 if (!obj) {
                         /* Allocate, but don't deplete atomic reserves that typical
                          * GFP_ATOMIC would do. irq_work runs on this cpu and kmalloc
@@ -232,10 +233,10 @@ static void free_all(struct llist_node *llnode, bool percpu)

 static void __free_rcu(struct rcu_head *head)
 {
-        struct bpf_mem_cache *c = container_of(head, struct bpf_mem_cache, rcu);
+        struct bpf_mem_cache *c = container_of(head, struct bpf_mem_cache, rcu_ttrace);

-        free_all(llist_del_all(&c->waiting_for_gp), !!c->percpu_size);
-        atomic_set(&c->call_rcu_in_progress, 0);
+        free_all(llist_del_all(&c->waiting_for_gp_ttrace), !!c->percpu_size);
+        atomic_set(&c->call_rcu_ttrace_in_progress, 0);
 }

 static void __free_rcu_tasks_trace(struct rcu_head *head)
@@ -254,32 +255,32 @@ static void enque_to_free(struct bpf_mem_cache *c, void *obj)
         struct llist_node *llnode = obj;

         /* bpf_mem_cache is a per-cpu object. Freeing happens in irq_work.
-         * Nothing races to add to free_by_rcu list.
+         * Nothing races to add to free_by_rcu_ttrace list.
          */
-        __llist_add(llnode, &c->free_by_rcu);
+        __llist_add(llnode, &c->free_by_rcu_ttrace);
 }

-static void do_call_rcu(struct bpf_mem_cache *c)
+static void do_call_rcu_ttrace(struct bpf_mem_cache *c)
 {
         struct llist_node *llnode, *t;

-        if (atomic_xchg(&c->call_rcu_in_progress, 1))
+        if (atomic_xchg(&c->call_rcu_ttrace_in_progress, 1))
                 return;

-        WARN_ON_ONCE(!llist_empty(&c->waiting_for_gp));
-        llist_for_each_safe(llnode, t, __llist_del_all(&c->free_by_rcu))
-                /* There is no concurrent __llist_add(waiting_for_gp) access.
+        WARN_ON_ONCE(!llist_empty(&c->waiting_for_gp_ttrace));
+        llist_for_each_safe(llnode, t, __llist_del_all(&c->free_by_rcu_ttrace))
+                /* There is no concurrent __llist_add(waiting_for_gp_ttrace) access.
                  * It doesn't race with llist_del_all either.
-                 * But there could be two concurrent llist_del_all(waiting_for_gp):
+                 * But there could be two concurrent llist_del_all(waiting_for_gp_ttrace):
                  * from __free_rcu() and from drain_mem_cache().
                  */
-                __llist_add(llnode, &c->waiting_for_gp);
+                __llist_add(llnode, &c->waiting_for_gp_ttrace);

         /* Use call_rcu_tasks_trace() to wait for sleepable progs to finish.
          * If RCU Tasks Trace grace period implies RCU grace period, free
          * these elements directly, else use call_rcu() to wait for normal
          * progs to finish and finally do free_one() on each element.
          */
-        call_rcu_tasks_trace(&c->rcu, __free_rcu_tasks_trace);
+        call_rcu_tasks_trace(&c->rcu_ttrace, __free_rcu_tasks_trace);
 }

 static void free_bulk(struct bpf_mem_cache *c)
@@ -307,7 +308,7 @@ static void free_bulk(struct bpf_mem_cache *c)
         /* and drain free_llist_extra */
         llist_for_each_safe(llnode, t, llist_del_all(&c->free_llist_extra))
                 enque_to_free(c, llnode);
-        do_call_rcu(c);
+        do_call_rcu_ttrace(c);
 }

 static void bpf_mem_refill(struct irq_work *work)
@@ -441,13 +442,13 @@ static void drain_mem_cache(struct bpf_mem_cache *c)

         /* No progs are using this bpf_mem_cache, but htab_map_free() called
          * bpf_mem_cache_free() for all remaining elements and they can be in
-         * free_by_rcu or in waiting_for_gp lists, so drain those lists now.
+         * free_by_rcu_ttrace or in waiting_for_gp_ttrace lists, so drain those lists now.
          *
-         * Except for waiting_for_gp list, there are no concurrent operations
+         * Except for waiting_for_gp_ttrace list, there are no concurrent operations
          * on these lists, so it is safe to use __llist_del_all().
          */
-        free_all(__llist_del_all(&c->free_by_rcu), percpu);
-        free_all(llist_del_all(&c->waiting_for_gp), percpu);
+        free_all(__llist_del_all(&c->free_by_rcu_ttrace), percpu);
+        free_all(llist_del_all(&c->waiting_for_gp_ttrace), percpu);
         free_all(__llist_del_all(&c->free_llist), percpu);
         free_all(__llist_del_all(&c->free_llist_extra), percpu);
 }
@@ -462,7 +463,7 @@ static void free_mem_alloc_no_barrier(struct bpf_mem_alloc *ma)

 static void free_mem_alloc(struct bpf_mem_alloc *ma)
 {
-        /* waiting_for_gp lists was drained, but __free_rcu might
+        /* waiting_for_gp_ttrace lists was drained, but __free_rcu might
          * still execute. Wait for it now before we freeing percpu caches.
          *
          * rcu_barrier_tasks_trace() doesn't imply synchronize_rcu_tasks_trace(),
@@ -535,7 +536,7 @@ void bpf_mem_alloc_destroy(struct bpf_mem_alloc *ma)
                  */
                 irq_work_sync(&c->refill_work);
                 drain_mem_cache(c);
-                rcu_in_progress += atomic_read(&c->call_rcu_in_progress);
+                rcu_in_progress += atomic_read(&c->call_rcu_ttrace_in_progress);
         }
         /* objcg is the same across cpus */
         if (c->objcg)
@@ -550,7 +551,7 @@ void bpf_mem_alloc_destroy(struct bpf_mem_alloc *ma)
                         c = &cc->cache[i];
                         irq_work_sync(&c->refill_work);
                         drain_mem_cache(c);
-                        rcu_in_progress += atomic_read(&c->call_rcu_in_progress);
+                        rcu_in_progress += atomic_read(&c->call_rcu_ttrace_in_progress);
                 }
         }
         if (c->objcg)
From patchwork Sat Jun 24 03:13:22 2023
From: Alexei Starovoitov
Subject: [PATCH v2 bpf-next 02/13] bpf: Simplify code of destroy_mem_alloc() with kmemdup().
Date: Fri, 23 Jun 2023 20:13:22 -0700
Message-Id: <20230624031333.96597-3-alexei.starovoitov@gmail.com>

Use kmemdup() to simplify the code.
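
For context, kmemdup() collapses the allocate-then-copy pair into a single
call. A simplified sketch of its behavior (the real implementation lives in
mm/util.c and uses kmalloc_track_caller()):

        static void *kmemdup_sketch(const void *src, size_t len, gfp_t gfp)
        {
                void *p = kmalloc(len, gfp);

                if (p)
                        memcpy(p, src, len);
                return p;
        }

With copy = kmemdup(ma, sizeof(*ma), GFP_KERNEL) every field of 'ma' is
duplicated in one step, so the per-field hand-over in the diff below can
become a single memset() of the source.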
Signed-off-by: Alexei Starovoitov
---
 kernel/bpf/memalloc.c | 7 ++-----
 1 file changed, 2 insertions(+), 5 deletions(-)

diff --git a/kernel/bpf/memalloc.c b/kernel/bpf/memalloc.c
index cc5b8adb4c83..b0011217be6c 100644
--- a/kernel/bpf/memalloc.c
+++ b/kernel/bpf/memalloc.c
@@ -499,7 +499,7 @@ static void destroy_mem_alloc(struct bpf_mem_alloc *ma, int rcu_in_progress)
                 return;
         }

-        copy = kmalloc(sizeof(*ma), GFP_KERNEL);
+        copy = kmemdup(ma, sizeof(*ma), GFP_KERNEL);
         if (!copy) {
                 /* Slow path with inline barrier-s */
                 free_mem_alloc(ma);
@@ -507,10 +507,7 @@ static void destroy_mem_alloc(struct bpf_mem_alloc *ma, int rcu_in_progress)
         }

         /* Defer barriers into worker to let the rest of map memory to be freed */
-        copy->cache = ma->cache;
-        ma->cache = NULL;
-        copy->caches = ma->caches;
-        ma->caches = NULL;
+        memset(ma, 0, sizeof(*ma));
         INIT_WORK(&copy->work, free_mem_alloc_deferred);
         queue_work(system_unbound_wq, &copy->work);
 }
From patchwork Sat Jun 24 03:13:23 2023
From: Alexei Starovoitov
Subject: [PATCH v2 bpf-next 03/13] bpf: Let free_all() return the number of freed elements.
Date: Fri, 23 Jun 2023 20:13:23 -0700
Message-Id: <20230624031333.96597-4-alexei.starovoitov@gmail.com>

Let the free_all() helper return the number of freed elements.
It's not used in this patch, but it helps in debug/development of
bpf_mem_alloc.

For example, this diff for __free_rcu():

-        free_all(llist_del_all(&c->waiting_for_gp_ttrace), !!c->percpu_size);
+        printk("cpu %d freed %d objs after tasks trace\n", raw_smp_processor_id(),
+               free_all(llist_del_all(&c->waiting_for_gp_ttrace), !!c->percpu_size));

would show how busy RCU tasks trace is.
In an artificial benchmark where one cpu is allocating and a different cpu
is freeing, RCU tasks trace cannot keep up and the list of objects keeps
growing from thousands to millions, eventually OOMing.
Signed-off-by: Alexei Starovoitov
---
 kernel/bpf/memalloc.c | 8 ++++++--
 1 file changed, 6 insertions(+), 2 deletions(-)

diff --git a/kernel/bpf/memalloc.c b/kernel/bpf/memalloc.c
index b0011217be6c..693651d2648b 100644
--- a/kernel/bpf/memalloc.c
+++ b/kernel/bpf/memalloc.c
@@ -223,12 +223,16 @@ static void free_one(void *obj, bool percpu)
         kfree(obj);
 }

-static void free_all(struct llist_node *llnode, bool percpu)
+static int free_all(struct llist_node *llnode, bool percpu)
 {
         struct llist_node *pos, *t;
+        int cnt = 0;

-        llist_for_each_safe(pos, t, llnode)
+        llist_for_each_safe(pos, t, llnode) {
                 free_one(pos, percpu);
+                cnt++;
+        }
+        return cnt;
 }

 static void __free_rcu(struct rcu_head *head)
From patchwork Sat Jun 24 03:13:24 2023
From: Alexei Starovoitov
Subject: [PATCH v2 bpf-next 04/13] bpf: Refactor alloc_bulk().
Date: Fri, 23 Jun 2023 20:13:24 -0700
Message-Id: <20230624031333.96597-5-alexei.starovoitov@gmail.com>

Factor out the inner body of alloc_bulk() into a separate helper.
No functional changes.

Signed-off-by: Alexei Starovoitov
---
 kernel/bpf/memalloc.c | 46 ++++++++++++++++++++++++------------------
 1 file changed, 26 insertions(+), 20 deletions(-)

diff --git a/kernel/bpf/memalloc.c b/kernel/bpf/memalloc.c
index 693651d2648b..9693b1f8cbda 100644
--- a/kernel/bpf/memalloc.c
+++ b/kernel/bpf/memalloc.c
@@ -154,11 +154,35 @@ static struct mem_cgroup *get_memcg(const struct bpf_mem_cache *c)
 #endif
 }

+static void add_obj_to_free_list(struct bpf_mem_cache *c, void *obj)
+{
+        unsigned long flags;
+
+        if (IS_ENABLED(CONFIG_PREEMPT_RT))
+                /* In RT irq_work runs in per-cpu kthread, so disable
+                 * interrupts to avoid preemption and interrupts and
+                 * reduce the chance of bpf prog executing on this cpu
+                 * when active counter is busy.
+                 */
+                local_irq_save(flags);
+        /* alloc_bulk runs from irq_work which will not preempt a bpf
+         * program that does unit_alloc/unit_free since IRQs are
+         * disabled there. There is no race to increment 'active'
+         * counter. It protects free_llist from corruption in case NMI
+         * bpf prog preempted this loop.
+         */
+        WARN_ON_ONCE(local_inc_return(&c->active) != 1);
+        __llist_add(obj, &c->free_llist);
+        c->free_cnt++;
+        local_dec(&c->active);
+        if (IS_ENABLED(CONFIG_PREEMPT_RT))
+                local_irq_restore(flags);
+}
+
 /* Mostly runs from irq_work except __init phase. */
 static void alloc_bulk(struct bpf_mem_cache *c, int cnt, int node)
 {
         struct mem_cgroup *memcg = NULL, *old_memcg;
-        unsigned long flags;
         void *obj;
         int i;

@@ -188,25 +212,7 @@ static void alloc_bulk(struct bpf_mem_cache *c, int cnt, int node)
                         if (!obj)
                                 break;
                 }
-                if (IS_ENABLED(CONFIG_PREEMPT_RT))
-                        /* In RT irq_work runs in per-cpu kthread, so disable
-                         * interrupts to avoid preemption and interrupts and
-                         * reduce the chance of bpf prog executing on this cpu
-                         * when active counter is busy.
-                         */
-                        local_irq_save(flags);
-                /* alloc_bulk runs from irq_work which will not preempt a bpf
-                 * program that does unit_alloc/unit_free since IRQs are
-                 * disabled there. There is no race to increment 'active'
-                 * counter. It protects free_llist from corruption in case NMI
-                 * bpf prog preempted this loop.
-                 */
-                WARN_ON_ONCE(local_inc_return(&c->active) != 1);
-                __llist_add(obj, &c->free_llist);
-                c->free_cnt++;
-                local_dec(&c->active);
-                if (IS_ENABLED(CONFIG_PREEMPT_RT))
-                        local_irq_restore(flags);
+                add_obj_to_free_list(c, obj);
         }
         set_active_memcg(old_memcg);
         mem_cgroup_put(memcg);
From patchwork Sat Jun 24 03:13:25 2023
From: Alexei Starovoitov
Subject: [PATCH v2 bpf-next 05/13] bpf: Further refactor alloc_bulk().
Date: Fri, 23 Jun 2023 20:13:25 -0700
Message-Id: <20230624031333.96597-6-alexei.starovoitov@gmail.com>

In certain scenarios alloc_bulk() might be taking free objects mainly from
the free_by_rcu_ttrace list. In that case get_memcg() and set_active_memcg()
are redundant, but they show up in the perf profile. Split the loop and only
set memcg when allocating from slab. No performance difference in this patch
alone, but it helps in combination with further patches.

Signed-off-by: Alexei Starovoitov
---
 kernel/bpf/memalloc.c | 30 ++++++++++++++++++------------
 1 file changed, 18 insertions(+), 12 deletions(-)

diff --git a/kernel/bpf/memalloc.c b/kernel/bpf/memalloc.c
index 9693b1f8cbda..b07368d77343 100644
--- a/kernel/bpf/memalloc.c
+++ b/kernel/bpf/memalloc.c
@@ -186,8 +186,6 @@ static void alloc_bulk(struct bpf_mem_cache *c, int cnt, int node)
         void *obj;
         int i;

-        memcg = get_memcg(c);
-        old_memcg = set_active_memcg(memcg);
         for (i = 0; i < cnt; i++) {
                 /*
                  * free_by_rcu_ttrace is only manipulated by irq work refill_work().
@@ -202,16 +200,24 @@ static void alloc_bulk(struct bpf_mem_cache *c, int cnt, int node)
                  * numa node and it is not a guarantee.
                  */
                 obj = __llist_del_first(&c->free_by_rcu_ttrace);
-                if (!obj) {
-                        /* Allocate, but don't deplete atomic reserves that typical
-                         * GFP_ATOMIC would do. irq_work runs on this cpu and kmalloc
-                         * will allocate from the current numa node which is what we
-                         * want here.
-                         */
-                        obj = __alloc(c, node, GFP_NOWAIT | __GFP_NOWARN | __GFP_ACCOUNT);
-                        if (!obj)
-                                break;
-                }
+                if (!obj)
+                        break;
+                add_obj_to_free_list(c, obj);
+        }
+        if (i >= cnt)
+                return;
+
+        memcg = get_memcg(c);
+        old_memcg = set_active_memcg(memcg);
+        for (; i < cnt; i++) {
+                /* Allocate, but don't deplete atomic reserves that typical
+                 * GFP_ATOMIC would do. irq_work runs on this cpu and kmalloc
+                 * will allocate from the current numa node which is what we
+                 * want here.
+                 */
+                obj = __alloc(c, node, GFP_NOWAIT | __GFP_NOWARN | __GFP_ACCOUNT);
+                if (!obj)
+                        break;
                 add_obj_to_free_list(c, obj);
         }
         set_active_memcg(old_memcg);
         mem_cgroup_put(memcg);
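
Taken together with the previous patch, alloc_bulk() now has a two-phase
shape. For readability, its final form reconstructed from the hunks above
(a sketch, not a further change):

        /* Phase 1: reuse objects that already went through an RCU tasks
         * trace GP; no memcg accounting is needed for them. */
        for (i = 0; i < cnt; i++) {
                obj = __llist_del_first(&c->free_by_rcu_ttrace);
                if (!obj)
                        break;
                add_obj_to_free_list(c, obj);
        }
        if (i >= cnt)
                return;

        /* Phase 2: the reuse list ran dry; fall back to the slab and only
         * now pay for get_memcg()/set_active_memcg(). */
        memcg = get_memcg(c);
        old_memcg = set_active_memcg(memcg);
        for (; i < cnt; i++) {
                obj = __alloc(c, node, GFP_NOWAIT | __GFP_NOWARN | __GFP_ACCOUNT);
                if (!obj)
                        break;
                add_obj_to_free_list(c, obj);
        }
        set_active_memcg(old_memcg);
        mem_cgroup_put(memcg);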
From patchwork Sat Jun 24 03:13:26 2023
From: Alexei Starovoitov
Subject: [PATCH v2 bpf-next 06/13] bpf: Optimize moving objects from free_by_rcu_ttrace to waiting_for_gp_ttrace.
Date: Fri, 23 Jun 2023 20:13:26 -0700
Message-Id: <20230624031333.96597-7-alexei.starovoitov@gmail.com>

Optimize moving objects from free_by_rcu_ttrace to waiting_for_gp_ttrace
by remembering the tail.

Signed-off-by: Alexei Starovoitov
---
 kernel/bpf/memalloc.c | 12 ++++++++----
 1 file changed, 8 insertions(+), 4 deletions(-)

diff --git a/kernel/bpf/memalloc.c b/kernel/bpf/memalloc.c
index b07368d77343..4fd79bd51f5a 100644
--- a/kernel/bpf/memalloc.c
+++ b/kernel/bpf/memalloc.c
@@ -101,6 +101,7 @@ struct bpf_mem_cache {

         /* list of objects to be freed after RCU tasks trace GP */
         struct llist_head free_by_rcu_ttrace;
+        struct llist_node *free_by_rcu_ttrace_tail;
         struct llist_head waiting_for_gp_ttrace;
         struct rcu_head rcu_ttrace;
         atomic_t call_rcu_ttrace_in_progress;
@@ -273,24 +274,27 @@ static void enque_to_free(struct bpf_mem_cache *c, void *obj)
         /* bpf_mem_cache is a per-cpu object. Freeing happens in irq_work.
          * Nothing races to add to free_by_rcu_ttrace list.
          */
-        __llist_add(llnode, &c->free_by_rcu_ttrace);
+        if (__llist_add(llnode, &c->free_by_rcu_ttrace))
+                c->free_by_rcu_ttrace_tail = llnode;
 }

 static void do_call_rcu_ttrace(struct bpf_mem_cache *c)
 {
-        struct llist_node *llnode, *t;
+        struct llist_node *llnode;

         if (atomic_xchg(&c->call_rcu_ttrace_in_progress, 1))
                 return;

         WARN_ON_ONCE(!llist_empty(&c->waiting_for_gp_ttrace));
-        llist_for_each_safe(llnode, t, __llist_del_all(&c->free_by_rcu_ttrace))
+        llnode = __llist_del_all(&c->free_by_rcu_ttrace);
+        if (llnode)
                 /* There is no concurrent __llist_add(waiting_for_gp_ttrace) access.
                  * It doesn't race with llist_del_all either.
                  * But there could be two concurrent llist_del_all(waiting_for_gp_ttrace):
                  * from __free_rcu() and from drain_mem_cache().
                  */
-                __llist_add(llnode, &c->waiting_for_gp_ttrace);
+                __llist_add_batch(llnode, c->free_by_rcu_ttrace_tail,
+                                  &c->waiting_for_gp_ttrace);

         /* Use call_rcu_tasks_trace() to wait for sleepable progs to finish.
          * If RCU Tasks Trace grace period implies RCU grace period, free
          * these elements directly, else use call_rcu() to wait for normal
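
The gain comes from llist's splice primitive: __llist_add_batch() links a
whole chain into the destination with a single pointer update, but it needs
both ends of the chain. Schematically, assuming the <linux/llist.h> API:

        /* before: walk the chain, one __llist_add() per node, O(N) */
        llist_for_each_safe(llnode, t, __llist_del_all(&src))
                __llist_add(llnode, &dst);

        /* after: one splice, O(1); the tail is known because
         * enque_to_free() records the node that was pushed onto an
         * empty list (it stays last, since later pushes prepend) */
        first = __llist_del_all(&src);
        if (first)
                __llist_add_batch(first, tail, &dst);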
From patchwork Sat Jun 24 03:13:27 2023
From: Alexei Starovoitov
Subject: [PATCH v2 bpf-next 07/13] bpf: Change bpf_mem_cache draining process.
Date: Fri, 23 Jun 2023 20:13:27 -0700
Message-Id: <20230624031333.96597-8-alexei.starovoitov@gmail.com>

The next patch will introduce cross-cpu llist access and the existing
irq_work_sync() + drain_mem_cache() + rcu_barrier_tasks_trace() mechanism
will not be enough, since irq_work_sync() + drain_mem_cache() on cpu A
won't guarantee that the llists on cpu A are empty. free_bulk() on cpu B
might add objects back to a llist of cpu A.

Add a 'bool draining' flag and set it on all cpus before proceeding with
irq_work_sync. The modified sequence looks like:

for_each_cpu:
        WRITE_ONCE(c->draining, true); // make RCU callback a nop
        irq_work_sync();               // wait for irq_work callback (free_bulk) to finish

for_each_cpu:
        drain_mem_cache();             // free all objects

rcu_barrier_tasks_trace();             // wait for RCU callbacks to execute as a nop

Signed-off-by: Alexei Starovoitov
---
 kernel/bpf/memalloc.c | 24 +++++++++++++++---------
 1 file changed, 15 insertions(+), 9 deletions(-)

diff --git a/kernel/bpf/memalloc.c b/kernel/bpf/memalloc.c
index 4fd79bd51f5a..d68a854f45ee 100644
--- a/kernel/bpf/memalloc.c
+++ b/kernel/bpf/memalloc.c
@@ -98,6 +98,7 @@ struct bpf_mem_cache {
         int free_cnt;
         int low_watermark, high_watermark, batch;
         int percpu_size;
+        bool draining;

         /* list of objects to be freed after RCU tasks trace GP */
         struct llist_head free_by_rcu_ttrace;
@@ -252,7 +253,10 @@ static void __free_rcu(struct rcu_head *head)
 {
         struct bpf_mem_cache *c = container_of(head, struct bpf_mem_cache, rcu_ttrace);

+        if (unlikely(READ_ONCE(c->draining)))
+                goto out;
         free_all(llist_del_all(&c->waiting_for_gp_ttrace), !!c->percpu_size);
+out:
         atomic_set(&c->call_rcu_ttrace_in_progress, 0);
 }

 static void __free_rcu_tasks_trace(struct rcu_head *head)
@@ -542,16 +546,11 @@ void bpf_mem_alloc_destroy(struct bpf_mem_alloc *ma)
         rcu_in_progress = 0;
         for_each_possible_cpu(cpu) {
                 c = per_cpu_ptr(ma->cache, cpu);
-                /*
-                 * refill_work may be unfinished for PREEMPT_RT kernel
-                 * in which irq work is invoked in a per-CPU RT thread.
-                 * It is also possible for kernel with
-                 * arch_irq_work_has_interrupt() being false and irq
-                 * work is invoked in timer interrupt. So waiting for
-                 * the completion of irq work to ease the handling of
-                 * concurrency.
-                 */
+                WRITE_ONCE(c->draining, true);
                 irq_work_sync(&c->refill_work);
+        }
+        for_each_possible_cpu(cpu) {
+                c = per_cpu_ptr(ma->cache, cpu);
                 drain_mem_cache(c);
                 rcu_in_progress += atomic_read(&c->call_rcu_ttrace_in_progress);
         }
@@ -566,7 +565,14 @@ void bpf_mem_alloc_destroy(struct bpf_mem_alloc *ma)
                 cc = per_cpu_ptr(ma->caches, cpu);
                 for (i = 0; i < NUM_CACHES; i++) {
                         c = &cc->cache[i];
+                        WRITE_ONCE(c->draining, true);
                         irq_work_sync(&c->refill_work);
+                }
+        }
+        for_each_possible_cpu(cpu) {
+                cc = per_cpu_ptr(ma->caches, cpu);
+                for (i = 0; i < NUM_CACHES; i++) {
+                        c = &cc->cache[i];
                         drain_mem_cache(c);
                         rcu_in_progress += atomic_read(&c->call_rcu_ttrace_in_progress);
                 }
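
The flag pairs a WRITE_ONCE() on the destroy path with a READ_ONCE() in the
RCU callback. Condensed from the hunks above (sketch only):

        /* bpf_mem_alloc_destroy(): pass 1 marks every cpu's cache, so a
         * late free_bulk() can no longer queue real work against any
         * cpu's llists; only then does pass 2 drain and free. */
        for_each_possible_cpu(cpu) {
                c = per_cpu_ptr(ma->cache, cpu);
                WRITE_ONCE(c->draining, true);
                irq_work_sync(&c->refill_work);
        }

        /* __free_rcu(): once draining is observed the callback is a nop;
         * drain_mem_cache() frees the objects instead. */
        if (unlikely(READ_ONCE(c->draining)))
                goto out;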
From patchwork Sat Jun 24 03:13:28 2023
From: Alexei Starovoitov
Subject: [PATCH v2 bpf-next 08/13] bpf: Add a hint to allocated objects.
Date: Fri, 23 Jun 2023 20:13:28 -0700
Message-Id: <20230624031333.96597-9-alexei.starovoitov@gmail.com>

To address the OOM issue when one cpu is allocating and another cpu is
freeing, add a target bpf_mem_cache hint to allocated objects and, when the
local cpu free_llist overflows, free to that bpf_mem_cache. The hint
addresses the OOM while maintaining the same performance for the common
case when alloc/free are done on the same cpu.

Signed-off-by: Alexei Starovoitov
---
 kernel/bpf/memalloc.c | 46 ++++++++++++++++++++++++++-----------------
 1 file changed, 28 insertions(+), 18 deletions(-)

diff --git a/kernel/bpf/memalloc.c b/kernel/bpf/memalloc.c
index d68a854f45ee..692a9a30c1dc 100644
--- a/kernel/bpf/memalloc.c
+++ b/kernel/bpf/memalloc.c
@@ -99,6 +99,7 @@ struct bpf_mem_cache {
         int low_watermark, high_watermark, batch;
         int percpu_size;
         bool draining;
+        struct bpf_mem_cache *tgt;

         /* list of objects to be freed after RCU tasks trace GP */
         struct llist_head free_by_rcu_ttrace;
@@ -190,18 +191,11 @@ static void alloc_bulk(struct bpf_mem_cache *c, int cnt, int node)

         for (i = 0; i < cnt; i++) {
                 /*
-                 * free_by_rcu_ttrace is only manipulated by irq work refill_work().
-                 * IRQ works on the same CPU are called sequentially, so it is
-                 * safe to use __llist_del_first() here. If alloc_bulk() is
-                 * invoked by the initial prefill, there will be no running
-                 * refill_work(), so __llist_del_first() is fine as well.
-                 *
-                 * In most cases, objects on free_by_rcu_ttrace are from the same CPU.
-                 * If some objects come from other CPUs, it doesn't incur any
-                 * harm because NUMA_NO_NODE means the preference for current
-                 * numa node and it is not a guarantee.
+                 * For every 'c' llist_del_first(&c->free_by_rcu_ttrace); is
+                 * done only by one CPU == current CPU. Other CPUs might
+                 * llist_add() and llist_del_all() in parallel.
                  */
-                obj = __llist_del_first(&c->free_by_rcu_ttrace);
+                obj = llist_del_first(&c->free_by_rcu_ttrace);
                 if (!obj)
                         break;
                 add_obj_to_free_list(c, obj);
@@ -278,7 +272,7 @@ static void enque_to_free(struct bpf_mem_cache *c, void *obj)
         /* bpf_mem_cache is a per-cpu object. Freeing happens in irq_work.
          * Nothing races to add to free_by_rcu_ttrace list.
          */
-        if (__llist_add(llnode, &c->free_by_rcu_ttrace))
+        if (llist_add(llnode, &c->free_by_rcu_ttrace))
                 c->free_by_rcu_ttrace_tail = llnode;
 }

@@ -290,7 +284,7 @@ static void do_call_rcu_ttrace(struct bpf_mem_cache *c)
                 return;

         WARN_ON_ONCE(!llist_empty(&c->waiting_for_gp_ttrace));
-        llnode = __llist_del_all(&c->free_by_rcu_ttrace);
+        llnode = llist_del_all(&c->free_by_rcu_ttrace);
         if (llnode)
                 /* There is no concurrent __llist_add(waiting_for_gp_ttrace) access.
                  * It doesn't race with llist_del_all either.
@@ -303,16 +297,22 @@ static void do_call_rcu_ttrace(struct bpf_mem_cache *c)
          * If RCU Tasks Trace grace period implies RCU grace period, free
          * these elements directly, else use call_rcu() to wait for normal
          * progs to finish and finally do free_one() on each element.
+         *
+         * call_rcu_tasks_trace() enqueues to a global queue, so it's ok
+         * that current cpu bpf_mem_cache != target bpf_mem_cache.
          */
         call_rcu_tasks_trace(&c->rcu_ttrace, __free_rcu_tasks_trace);
 }

 static void free_bulk(struct bpf_mem_cache *c)
 {
+        struct bpf_mem_cache *tgt = c->tgt;
         struct llist_node *llnode, *t;
         unsigned long flags;
         int cnt;

+        WARN_ON_ONCE(tgt->unit_size != c->unit_size);
+
         do {
                 if (IS_ENABLED(CONFIG_PREEMPT_RT))
                         local_irq_save(flags);
@@ -326,13 +326,13 @@ static void free_bulk(struct bpf_mem_cache *c)
                 if (IS_ENABLED(CONFIG_PREEMPT_RT))
                         local_irq_restore(flags);
                 if (llnode)
-                        enque_to_free(c, llnode);
+                        enque_to_free(tgt, llnode);
         } while (cnt > (c->high_watermark + c->low_watermark) / 2);

         /* and drain free_llist_extra */
         llist_for_each_safe(llnode, t, llist_del_all(&c->free_llist_extra))
-                enque_to_free(c, llnode);
-        do_call_rcu_ttrace(c);
+                enque_to_free(tgt, llnode);
+        do_call_rcu_ttrace(tgt);
 }

 static void bpf_mem_refill(struct irq_work *work)
@@ -431,6 +431,7 @@ int bpf_mem_alloc_init(struct bpf_mem_alloc *ma, int size, bool percpu)
                 c->unit_size = unit_size;
                 c->objcg = objcg;
                 c->percpu_size = percpu_size;
+                c->tgt = c;
                 prefill_mem_cache(c, cpu);
         }
         ma->cache = pc;
@@ -453,6 +454,7 @@ int bpf_mem_alloc_init(struct bpf_mem_alloc *ma, int size, bool percpu)
                         c = &cc->cache[i];
                         c->unit_size = sizes[i];
                         c->objcg = objcg;
+                        c->tgt = c;
                         prefill_mem_cache(c, cpu);
                 }
         }
@@ -471,7 +473,7 @@ static void drain_mem_cache(struct bpf_mem_cache *c)
          * Except for waiting_for_gp_ttrace list, there are no concurrent operations
          * on these lists, so it is safe to use __llist_del_all().
          */
-        free_all(__llist_del_all(&c->free_by_rcu_ttrace), percpu);
+        free_all(llist_del_all(&c->free_by_rcu_ttrace), percpu);
         free_all(llist_del_all(&c->waiting_for_gp_ttrace), percpu);
         free_all(__llist_del_all(&c->free_llist), percpu);
         free_all(__llist_del_all(&c->free_llist_extra), percpu);
@@ -605,8 +607,10 @@ static void notrace *unit_alloc(struct bpf_mem_cache *c)
         local_irq_save(flags);
         if (local_inc_return(&c->active) == 1) {
                 llnode = __llist_del_first(&c->free_llist);
-                if (llnode)
+                if (llnode) {
                         cnt = --c->free_cnt;
+                        *(struct bpf_mem_cache **)llnode = c;
+                }
         }
         local_dec(&c->active);
         local_irq_restore(flags);
@@ -630,6 +634,12 @@ static void notrace unit_free(struct bpf_mem_cache *c, void *ptr)

         BUILD_BUG_ON(LLIST_NODE_SZ > 8);

+        /*
+         * Remember bpf_mem_cache that allocated this object.
+         * The hint is not accurate.
+         */
+        c->tgt = *(struct bpf_mem_cache **)llnode;
+
         local_irq_save(flags);
         if (local_inc_return(&c->active) == 1) {
                 __llist_add(llnode, &c->free_llist);
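
The hint costs no extra space: a free object's first 8 bytes already hold
its llist_node (BUILD_BUG_ON(LLIST_NODE_SZ > 8) above), and an allocated
object does not need them, so the same bytes can carry the owning cache
pointer. The two halves of the round trip, pulled out of the hunks above
(sketch only):

        /* unit_alloc(): the object is leaving the cache, so the
         * llist_node area can be overwritten with the source cache. */
        *(struct bpf_mem_cache **)llnode = c;

        /* unit_free(): read the stamp back; if the object came from
         * another cpu's cache, free_bulk() will route the memory to that
         * cache via c->tgt instead of growing this cpu's lists forever. */
        c->tgt = *(struct bpf_mem_cache **)llnode;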
From patchwork Sat Jun 24 03:13:29 2023
X-Patchwork-Submitter: Alexei Starovoitov
X-Patchwork-Id: 13291574
X-Patchwork-Delegate: bpf@iogearbox.net
From: Alexei Starovoitov
To: daniel@iogearbox.net, andrii@kernel.org, void@manifault.com,
    houtao@huaweicloud.com, paulmck@kernel.org
Cc: tj@kernel.org, rcu@vger.kernel.org, netdev@vger.kernel.org,
    bpf@vger.kernel.org, kernel-team@fb.com
Subject: [PATCH v2 bpf-next 09/13] bpf: Allow reuse from waiting_for_gp_ttrace list.
Date: Fri, 23 Jun 2023 20:13:29 -0700
Message-Id: <20230624031333.96597-10-alexei.starovoitov@gmail.com>
In-Reply-To: <20230624031333.96597-1-alexei.starovoitov@gmail.com>
References: <20230624031333.96597-1-alexei.starovoitov@gmail.com>

From: Alexei Starovoitov

alloc_bulk() can reuse elements from free_by_rcu_ttrace.
Let it reuse from waiting_for_gp_ttrace as well to avoid unnecessary kmalloc().

Signed-off-by: Alexei Starovoitov
---
 kernel/bpf/memalloc.c | 9 +++++++++
 1 file changed, 9 insertions(+)

diff --git a/kernel/bpf/memalloc.c b/kernel/bpf/memalloc.c
index 692a9a30c1dc..666917c16e87 100644
--- a/kernel/bpf/memalloc.c
+++ b/kernel/bpf/memalloc.c
@@ -203,6 +203,15 @@ static void alloc_bulk(struct bpf_mem_cache *c, int cnt, int node)
 	if (i >= cnt)
 		return;

+	for (; i < cnt; i++) {
+		obj = llist_del_first(&c->waiting_for_gp_ttrace);
+		if (!obj)
+			break;
+		add_obj_to_free_list(c, obj);
+	}
+	if (i >= cnt)
+		return;
+
 	memcg = get_memcg(c);
 	old_memcg = set_active_memcg(memcg);
 	for (; i < cnt; i++) {
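With this change the refill order becomes: recycle from free_by_rcu_ttrace,
then from waiting_for_gp_ttrace, and only then fall back to kmalloc(). A
simplified, single-threaded userspace sketch of that preference order (toy_*
names are invented; the concurrency reasoning that makes the real code use
llist_del_first() is deliberately left out):

#include <stdlib.h>

struct toy_node { struct toy_node *next; };

static void *toy_pop(struct toy_node **head)
{
	struct toy_node *n = *head;

	if (n)
		*head = n->next;
	return n;
}

/* One iteration of the refill loop: cheapest source first. */
static void *toy_refill_one(struct toy_node **free_by_rcu_ttrace,
			    struct toy_node **waiting_for_gp_ttrace,
			    size_t size)
{
	void *obj;

	obj = toy_pop(free_by_rcu_ttrace);		/* already reusable */
	if (!obj)
		obj = toy_pop(waiting_for_gp_ttrace);	/* new with this patch */
	if (!obj)
		obj = malloc(size);			/* last resort: fresh memory */
	return obj;
}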
From patchwork Sat Jun 24 03:13:30 2023
X-Patchwork-Submitter: Alexei Starovoitov
X-Patchwork-Id: 13291575
X-Patchwork-Delegate: bpf@iogearbox.net
From: Alexei Starovoitov
To: daniel@iogearbox.net, andrii@kernel.org, void@manifault.com,
    houtao@huaweicloud.com, paulmck@kernel.org
Cc: tj@kernel.org, rcu@vger.kernel.org, netdev@vger.kernel.org,
    bpf@vger.kernel.org, kernel-team@fb.com
Subject: [PATCH v2 bpf-next 10/13] rcu: Export rcu_request_urgent_qs_task()
Date: Fri, 23 Jun 2023 20:13:30 -0700
Message-Id: <20230624031333.96597-11-alexei.starovoitov@gmail.com>
In-Reply-To: <20230624031333.96597-1-alexei.starovoitov@gmail.com>
References: <20230624031333.96597-1-alexei.starovoitov@gmail.com>

From: "Paul E. McKenney"

If a CPU is executing a long series of non-sleeping system calls, RCU
grace periods can be delayed for on the order of a couple hundred
milliseconds. This is normally not a problem, but if each system call
does a call_rcu(), those callbacks can stack up. RCU will eventually
notice this callback storm, but use of rcu_request_urgent_qs_task()
allows the code invoking call_rcu() to give RCU a heads up.

This function is not for general use, not yet, anyway.

Reported-by: Alexei Starovoitov
Signed-off-by: Paul E. McKenney
Signed-off-by: Alexei Starovoitov
---
 include/linux/rcutiny.h | 2 ++
 include/linux/rcutree.h | 1 +
 kernel/rcu/rcu.h        | 2 --
 3 files changed, 3 insertions(+), 2 deletions(-)

diff --git a/include/linux/rcutiny.h b/include/linux/rcutiny.h
index 7f17acf29dda..7b949292908a 100644
--- a/include/linux/rcutiny.h
+++ b/include/linux/rcutiny.h
@@ -138,6 +138,8 @@ static inline int rcu_needs_cpu(void)
 	return 0;
 }

+static inline void rcu_request_urgent_qs_task(struct task_struct *t) { }
+
 /*
  * Take advantage of the fact that there is only one CPU, which
  * allows us to ignore virtualization-based context switches.
diff --git a/include/linux/rcutree.h b/include/linux/rcutree.h
index 56bccb5a8fde..126f6b418f6a 100644
--- a/include/linux/rcutree.h
+++ b/include/linux/rcutree.h
@@ -21,6 +21,7 @@ void rcu_softirq_qs(void);
 void rcu_note_context_switch(bool preempt);
 int rcu_needs_cpu(void);
 void rcu_cpu_stall_reset(void);
+void rcu_request_urgent_qs_task(struct task_struct *t);

 /*
  * Note a virtualization-based context switch.  This is simply a
diff --git a/kernel/rcu/rcu.h b/kernel/rcu/rcu.h
index 4a1b9622598b..6f5fb3f7ebf3 100644
--- a/kernel/rcu/rcu.h
+++ b/kernel/rcu/rcu.h
@@ -493,7 +493,6 @@ static inline void rcu_expedite_gp(void) { }
 static inline void rcu_unexpedite_gp(void) { }
 static inline void rcu_async_hurry(void) { }
 static inline void rcu_async_relax(void) { }
-static inline void rcu_request_urgent_qs_task(struct task_struct *t) { }
 #else /* #ifdef CONFIG_TINY_RCU */
 bool rcu_gp_is_normal(void);     /* Internal RCU use. */
 bool rcu_gp_is_expedited(void);  /* Internal RCU use. */
@@ -508,7 +507,6 @@ void show_rcu_tasks_gp_kthreads(void);
 #else /* #ifdef CONFIG_TASKS_RCU_GENERIC */
 static inline void show_rcu_tasks_gp_kthreads(void) {}
 #endif /* #else #ifdef CONFIG_TASKS_RCU_GENERIC */
-void rcu_request_urgent_qs_task(struct task_struct *t);
 #endif /* #else #ifdef CONFIG_TINY_RCU */

 #define RCU_SCHEDULER_INACTIVE 0
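Patch 12 in this series uses the newly exported function exactly this way. An
illustrative kernel-style fragment of the intended pattern follows; my_cache,
busy, and drain_cb are invented names, and this is a sketch rather than a
standalone, compilable unit:

struct my_cache {
	struct rcu_head rcu;
	/* ... objects pending free ... */
};

static atomic_t busy;

static void drain_cb(struct rcu_head *head)
{
	/* free the accumulated batch, then allow the next one: */
	atomic_set(&busy, 0);
}

static void queue_for_free(struct my_cache *c)
{
	if (atomic_xchg(&busy, 1)) {
		/* A grace period is already pending. Rather than queueing
		 * thousands more rcu_heads and waiting for rcutree.qhimark
		 * to trip, tell RCU this task needs a quiescent state soon
		 * so the pending grace period completes quickly.
		 */
		rcu_request_urgent_qs_task(current);
		return;
	}
	call_rcu_hurry(&c->rcu, drain_cb);
}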
From patchwork Sat Jun 24 03:13:31 2023
X-Patchwork-Submitter: Alexei Starovoitov
X-Patchwork-Id: 13291576
X-Patchwork-Delegate: bpf@iogearbox.net
From: Alexei Starovoitov
To: daniel@iogearbox.net, andrii@kernel.org, void@manifault.com,
    houtao@huaweicloud.com, paulmck@kernel.org
Cc: tj@kernel.org, rcu@vger.kernel.org, netdev@vger.kernel.org,
    bpf@vger.kernel.org, kernel-team@fb.com
Subject: [PATCH v2 bpf-next 11/13] selftests/bpf: Improve test coverage of bpf_mem_alloc.
Date: Fri, 23 Jun 2023 20:13:31 -0700
Message-Id: <20230624031333.96597-12-alexei.starovoitov@gmail.com>
In-Reply-To: <20230624031333.96597-1-alexei.starovoitov@gmail.com>
References: <20230624031333.96597-1-alexei.starovoitov@gmail.com>

From: Alexei Starovoitov

bpf_obj_new() calls bpf_mem_alloc(), but doing alloc/free of 8 elements
does not trigger watermark conditions in bpf_mem_alloc. Increase to 200
elements to make sure alloc_bulk()/free_bulk() are exercised.

Signed-off-by: Alexei Starovoitov
---
 tools/testing/selftests/bpf/progs/linked_list.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/tools/testing/selftests/bpf/progs/linked_list.c b/tools/testing/selftests/bpf/progs/linked_list.c
index 57440a554304..84d1777a9e6c 100644
--- a/tools/testing/selftests/bpf/progs/linked_list.c
+++ b/tools/testing/selftests/bpf/progs/linked_list.c
@@ -96,7 +96,7 @@ static __always_inline
 int list_push_pop_multiple(struct bpf_spin_lock *lock, struct bpf_list_head *head, bool leave_in_map)
 {
 	struct bpf_list_node *n;
-	struct foo *f[8], *pf;
+	struct foo *f[200], *pf;
 	int i;

 	/* Loop following this check adds nodes 2-at-a-time in order to
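The figure of 200 matters because the per-cpu cache only raises its irq-work
when free_cnt leaves the watermark band. A toy model of that trigger follows;
the watermark values here are illustrative only, the real defaults live in
kernel/bpf/memalloc.c:

#include <stdbool.h>
#include <stdio.h>

/* Illustrative values, not the kernel's actual defaults. */
#define LOW_WATERMARK	32
#define HIGH_WATERMARK	96

/* Returns true when a refill/free irq-work would be raised. */
static bool crosses_watermark(int free_cnt)
{
	return free_cnt < LOW_WATERMARK || free_cnt > HIGH_WATERMARK;
}

int main(void)
{
	/* Suppose the cache sits at its low watermark after a refill:
	 * 8 extra frees stay inside the band, 200 blow past the top
	 * and force free_bulk() to run.
	 */
	printf("+8 objects freed:   %s\n",
	       crosses_watermark(LOW_WATERMARK + 8) ? "bulk work" : "idle");
	printf("+200 objects freed: %s\n",
	       crosses_watermark(LOW_WATERMARK + 200) ? "bulk work" : "idle");
	return 0;
}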
From patchwork Sat Jun 24 03:13:32 2023
X-Patchwork-Submitter: Alexei Starovoitov
X-Patchwork-Id: 13291577
X-Patchwork-Delegate: bpf@iogearbox.net
From: Alexei Starovoitov
To: daniel@iogearbox.net, andrii@kernel.org, void@manifault.com,
    houtao@huaweicloud.com, paulmck@kernel.org
Cc: tj@kernel.org, rcu@vger.kernel.org, netdev@vger.kernel.org,
    bpf@vger.kernel.org, kernel-team@fb.com
Subject: [PATCH v2 bpf-next 12/13] bpf: Introduce bpf_mem_free_rcu() similar to kfree_rcu().
Date: Fri, 23 Jun 2023 20:13:32 -0700
Message-Id: <20230624031333.96597-13-alexei.starovoitov@gmail.com>
In-Reply-To: <20230624031333.96597-1-alexei.starovoitov@gmail.com>
References: <20230624031333.96597-1-alexei.starovoitov@gmail.com>

From: Alexei Starovoitov

Introduce bpf_mem_[cache_]free_rcu(), similar to kfree_rcu(). Unlike
bpf_mem_[cache_]free(), which links objects into the per-cpu free list
for immediate reuse, the _rcu() flavor waits for an RCU grace period and
then moves objects onto the free_by_rcu_ttrace list, where they wait for
an RCU tasks trace grace period before being freed into slab.

The life cycle of objects:
alloc: dequeue free_llist
free: enqueue free_llist
free_rcu: enqueue free_by_rcu -> waiting_for_gp
free_llist above high watermark -> free_by_rcu_ttrace
after RCU GP waiting_for_gp -> free_by_rcu_ttrace
free_by_rcu_ttrace -> waiting_for_gp_ttrace -> slab

Signed-off-by: Alexei Starovoitov
---
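The lifecycle above, restated as a toy state machine: the states map onto the
llists in struct bpf_mem_cache, but the enum and transition function are
invented for illustration (the free_llist-above-high-watermark shortcut into
free_by_rcu_ttrace is folded into a comment for brevity):

enum obj_state {
	FREE_LLIST,		/* in per-cpu free_llist, ready for reuse */
	FREE_BY_RCU,		/* bpf_mem_free_rcu() was called */
	WAITING_FOR_GP,		/* regular RCU grace period in flight */
	FREE_BY_RCU_TTRACE,	/* regular GP done, batched for tasks trace */
	WAITING_FOR_GP_TTRACE,	/* RCU tasks trace grace period in flight */
	SLAB,			/* returned to kmalloc/slab */
};

/* Transitions driven by the irq-work and the two RCU callbacks. */
static enum obj_state next_state(enum obj_state s)
{
	switch (s) {
	case FREE_BY_RCU:		return WAITING_FOR_GP;
	case WAITING_FOR_GP:		return FREE_BY_RCU_TTRACE;
	case FREE_BY_RCU_TTRACE:	return WAITING_FOR_GP_TTRACE;
	case WAITING_FOR_GP_TTRACE:	return SLAB;
	default:
		/* FREE_LLIST normally recycles in place; above the high
		 * watermark free_bulk() sends it to FREE_BY_RCU_TTRACE.
		 */
		return s;
	}
}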
 include/linux/bpf_mem_alloc.h |   2 +
 kernel/bpf/memalloc.c         | 117 +++++++++++++++++++++++++++++++++-
 2 files changed, 117 insertions(+), 2 deletions(-)

diff --git a/include/linux/bpf_mem_alloc.h b/include/linux/bpf_mem_alloc.h
index 3929be5743f4..d644bbb298af 100644
--- a/include/linux/bpf_mem_alloc.h
+++ b/include/linux/bpf_mem_alloc.h
@@ -27,10 +27,12 @@ void bpf_mem_alloc_destroy(struct bpf_mem_alloc *ma);
 /* kmalloc/kfree equivalent: */
 void *bpf_mem_alloc(struct bpf_mem_alloc *ma, size_t size);
 void bpf_mem_free(struct bpf_mem_alloc *ma, void *ptr);
+void bpf_mem_free_rcu(struct bpf_mem_alloc *ma, void *ptr);

 /* kmem_cache_alloc/free equivalent: */
 void *bpf_mem_cache_alloc(struct bpf_mem_alloc *ma);
 void bpf_mem_cache_free(struct bpf_mem_alloc *ma, void *ptr);
+void bpf_mem_cache_free_rcu(struct bpf_mem_alloc *ma, void *ptr);
 void bpf_mem_cache_raw_free(void *ptr);
 void *bpf_mem_cache_alloc_flags(struct bpf_mem_alloc *ma, gfp_t flags);

diff --git a/kernel/bpf/memalloc.c b/kernel/bpf/memalloc.c
index 666917c16e87..dc144c54d502 100644
--- a/kernel/bpf/memalloc.c
+++ b/kernel/bpf/memalloc.c
@@ -101,6 +101,15 @@ struct bpf_mem_cache {
 	bool draining;
 	struct bpf_mem_cache *tgt;

+	/* list of objects to be freed after RCU GP */
+	struct llist_head free_by_rcu;
+	struct llist_node *free_by_rcu_tail;
+	struct llist_head waiting_for_gp;
+	struct llist_node *waiting_for_gp_tail;
+	struct rcu_head rcu;
+	atomic_t call_rcu_in_progress;
+	struct llist_head free_llist_extra_rcu;
+
 	/* list of objects to be freed after RCU tasks trace GP */
 	struct llist_head free_by_rcu_ttrace;
 	struct llist_node *free_by_rcu_ttrace_tail;
@@ -344,6 +353,60 @@ static void free_bulk(struct bpf_mem_cache *c)
 	do_call_rcu_ttrace(tgt);
 }

+static void __free_by_rcu(struct rcu_head *head)
+{
+	struct bpf_mem_cache *c = container_of(head, struct bpf_mem_cache, rcu);
+	struct bpf_mem_cache *tgt = c->tgt;
+	struct llist_node *llnode;
+
+	if (unlikely(READ_ONCE(c->draining)))
+		goto out;
+
+	llnode = llist_del_all(&c->waiting_for_gp);
+	if (!llnode)
+		goto out;
+
+	if (llist_add_batch(llnode, c->waiting_for_gp_tail, &tgt->free_by_rcu_ttrace))
+		tgt->free_by_rcu_ttrace_tail = c->waiting_for_gp_tail;
+
+	/* Objects went through regular RCU GP. Send them to RCU tasks trace */
+	do_call_rcu_ttrace(tgt);
+out:
+	atomic_set(&c->call_rcu_in_progress, 0);
+}
+
+static void check_free_by_rcu(struct bpf_mem_cache *c)
+{
+	struct llist_node *llnode, *t;
+
+	if (llist_empty(&c->free_by_rcu) && llist_empty(&c->free_llist_extra_rcu))
+		return;
+
+	/* drain free_llist_extra_rcu */
+	llist_for_each_safe(llnode, t, llist_del_all(&c->free_llist_extra_rcu))
+		if (__llist_add(llnode, &c->free_by_rcu))
+			c->free_by_rcu_tail = llnode;
+
+	if (atomic_xchg(&c->call_rcu_in_progress, 1)) {
+		/*
+		 * Instead of kmalloc-ing new rcu_head and triggering 10k
+		 * call_rcu() to hit rcutree.qhimark and force RCU to notice
+		 * the overload just ask RCU to hurry up. There could be many
+		 * objects in free_by_rcu list.
+		 * This hint reduces memory consumption for an artificial
+		 * benchmark from 2 Gbyte to 150 Mbyte.
+		 */
+		rcu_request_urgent_qs_task(current);
+		return;
+	}
+
+	WARN_ON_ONCE(!llist_empty(&c->waiting_for_gp));
+
+	WRITE_ONCE(c->waiting_for_gp.first, __llist_del_all(&c->free_by_rcu));
+	c->waiting_for_gp_tail = c->free_by_rcu_tail;
+	call_rcu_hurry(&c->rcu, __free_by_rcu);
+}
+
 static void bpf_mem_refill(struct irq_work *work)
 {
 	struct bpf_mem_cache *c = container_of(work, struct bpf_mem_cache, refill_work);
@@ -358,6 +421,8 @@ static void bpf_mem_refill(struct irq_work *work)
 		alloc_bulk(c, c->batch, NUMA_NO_NODE);
 	else if (cnt > c->high_watermark)
 		free_bulk(c);
+
+	check_free_by_rcu(c);
 }

 static void notrace irq_work_raise(struct bpf_mem_cache *c)
@@ -486,6 +551,9 @@ static void drain_mem_cache(struct bpf_mem_cache *c)
 	free_all(llist_del_all(&c->waiting_for_gp_ttrace), percpu);
 	free_all(__llist_del_all(&c->free_llist), percpu);
 	free_all(__llist_del_all(&c->free_llist_extra), percpu);
+	free_all(__llist_del_all(&c->free_by_rcu), percpu);
+	free_all(__llist_del_all(&c->free_llist_extra_rcu), percpu);
+	free_all(llist_del_all(&c->waiting_for_gp), percpu);
 }

 static void free_mem_alloc_no_barrier(struct bpf_mem_alloc *ma)
@@ -498,8 +566,8 @@ static void free_mem_alloc_no_barrier(struct bpf_mem_alloc *ma)

 static void free_mem_alloc(struct bpf_mem_alloc *ma)
 {
-	/* waiting_for_gp_ttrace lists was drained, but __free_rcu might
-	 * still execute. Wait for it now before we freeing percpu caches.
+	/* waiting_for_gp[_ttrace] lists were drained, but RCU callbacks
+	 * might still execute. Wait for them.
 	 *
 	 * rcu_barrier_tasks_trace() doesn't imply synchronize_rcu_tasks_trace(),
 	 * but rcu_barrier_tasks_trace() and rcu_barrier() below are only used
@@ -564,6 +632,7 @@ void bpf_mem_alloc_destroy(struct bpf_mem_alloc *ma)
 			c = per_cpu_ptr(ma->cache, cpu);
 			drain_mem_cache(c);
 			rcu_in_progress += atomic_read(&c->call_rcu_ttrace_in_progress);
+			rcu_in_progress += atomic_read(&c->call_rcu_in_progress);
 		}
 		/* objcg is the same across cpus */
 		if (c->objcg)
@@ -586,6 +655,7 @@ void bpf_mem_alloc_destroy(struct bpf_mem_alloc *ma)
 				c = &cc->cache[i];
 				drain_mem_cache(c);
 				rcu_in_progress += atomic_read(&c->call_rcu_ttrace_in_progress);
+				rcu_in_progress += atomic_read(&c->call_rcu_in_progress);
 			}
 		}
 		if (c->objcg)
@@ -670,6 +740,27 @@ static void notrace unit_free(struct bpf_mem_cache *c, void *ptr)
 	irq_work_raise(c);
 }

+static void notrace unit_free_rcu(struct bpf_mem_cache *c, void *ptr)
+{
+	struct llist_node *llnode = ptr - LLIST_NODE_SZ;
+	unsigned long flags;
+
+	c->tgt = *(struct bpf_mem_cache **)llnode;
+
+	local_irq_save(flags);
+	if (local_inc_return(&c->active) == 1) {
+		if (__llist_add(llnode, &c->free_by_rcu))
+			c->free_by_rcu_tail = llnode;
+	} else {
+		llist_add(llnode, &c->free_llist_extra_rcu);
+	}
+	local_dec(&c->active);
+	local_irq_restore(flags);
+
+	if (!atomic_read(&c->call_rcu_in_progress))
+		irq_work_raise(c);
+}
+
 /* Called from BPF program or from sys_bpf syscall.
  * In both cases migration is disabled.
  */
@@ -703,6 +794,20 @@ void notrace bpf_mem_free(struct bpf_mem_alloc *ma, void *ptr)
 	unit_free(this_cpu_ptr(ma->caches)->cache + idx, ptr);
 }

+void notrace bpf_mem_free_rcu(struct bpf_mem_alloc *ma, void *ptr)
+{
+	int idx;
+
+	if (!ptr)
+		return;
+
+	idx = bpf_mem_cache_idx(ksize(ptr - LLIST_NODE_SZ));
+	if (idx < 0)
+		return;
+
+	unit_free_rcu(this_cpu_ptr(ma->caches)->cache + idx, ptr);
+}
+
 void notrace *bpf_mem_cache_alloc(struct bpf_mem_alloc *ma)
 {
 	void *ret;
@@ -719,6 +824,14 @@ void notrace bpf_mem_cache_free(struct bpf_mem_alloc *ma, void *ptr)
 	unit_free(this_cpu_ptr(ma->cache), ptr);
 }

+void notrace bpf_mem_cache_free_rcu(struct bpf_mem_alloc *ma, void *ptr)
+{
+	if (!ptr)
+		return;
+
+	unit_free_rcu(this_cpu_ptr(ma->cache), ptr);
+}
+
 /* Directly does a kfree() without putting 'ptr' back to the free_llist
  * for reuse and without waiting for a rcu_tasks_trace gp.
  * The caller must first go through the rcu_tasks_trace gp for 'ptr'
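Worth noting: check_free_by_rcu() amortizes the single embedded rcu_head over
arbitrarily many objects, so at most one RCU callback per cache is in flight
and everything freed in the meantime simply rides the next grace period. The
same single-flight gate in portable C (stdatomic standing in for atomic_t;
the names are invented):

#include <stdatomic.h>
#include <stdbool.h>

static atomic_bool gp_in_flight;

/* Mirrors atomic_xchg(&c->call_rcu_in_progress, 1): only the winner
 * gets to arm the single, embedded rcu_head for this batch.
 */
static bool try_arm_callback(void)
{
	return !atomic_exchange(&gp_in_flight, true);
}

/* Mirrors atomic_set(&c->call_rcu_in_progress, 0) at the end of
 * __free_by_rcu(): the next batch may now be armed.
 */
static void callback_done(void)
{
	atomic_store(&gp_in_flight, false);
}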
From patchwork Sat Jun 24 03:13:33 2023
X-Patchwork-Submitter: Alexei Starovoitov
X-Patchwork-Id: 13291578
X-Patchwork-Delegate: bpf@iogearbox.net
From: Alexei Starovoitov
To: daniel@iogearbox.net, andrii@kernel.org, void@manifault.com,
    houtao@huaweicloud.com, paulmck@kernel.org
Cc: tj@kernel.org, rcu@vger.kernel.org, netdev@vger.kernel.org,
    bpf@vger.kernel.org, kernel-team@fb.com
Subject: [PATCH v2 bpf-next 13/13] bpf: Convert bpf_cpumask to bpf_mem_cache_free_rcu.
Date: Fri, 23 Jun 2023 20:13:33 -0700
Message-Id: <20230624031333.96597-14-alexei.starovoitov@gmail.com>
In-Reply-To: <20230624031333.96597-1-alexei.starovoitov@gmail.com>
References: <20230624031333.96597-1-alexei.starovoitov@gmail.com>

From: Alexei Starovoitov

Convert bpf_cpumask to bpf_mem_cache_free_rcu.

Signed-off-by: Alexei Starovoitov
Acked-by: David Vernet
---
 kernel/bpf/cpumask.c | 20 ++++++--------------
 1 file changed, 6 insertions(+), 14 deletions(-)

diff --git a/kernel/bpf/cpumask.c b/kernel/bpf/cpumask.c
index 938a60ff4295..6983af8e093c 100644
--- a/kernel/bpf/cpumask.c
+++ b/kernel/bpf/cpumask.c
@@ -9,7 +9,6 @@
 /**
  * struct bpf_cpumask - refcounted BPF cpumask wrapper structure
  * @cpumask: The actual cpumask embedded in the struct.
- * @rcu: The RCU head used to free the cpumask with RCU safety.
  * @usage: Object reference counter. When the refcount goes to 0, the
  *         memory is released back to the BPF allocator, which provides
  *         RCU safety.
@@ -25,7 +24,6 @@
  */
 struct bpf_cpumask {
 	cpumask_t cpumask;
-	struct rcu_head rcu;
 	refcount_t usage;
 };

@@ -82,16 +80,6 @@ __bpf_kfunc struct bpf_cpumask *bpf_cpumask_acquire(struct bpf_cpumask *cpumask)
 	return cpumask;
 }

-static void cpumask_free_cb(struct rcu_head *head)
-{
-	struct bpf_cpumask *cpumask;
-
-	cpumask = container_of(head, struct bpf_cpumask, rcu);
-	migrate_disable();
-	bpf_mem_cache_free(&bpf_cpumask_ma, cpumask);
-	migrate_enable();
-}
-
 /**
  * bpf_cpumask_release() - Release a previously acquired BPF cpumask.
  * @cpumask: The cpumask being released.
@@ -102,8 +90,12 @@ static void cpumask_free_cb(struct rcu_head *head)
  */
 __bpf_kfunc void bpf_cpumask_release(struct bpf_cpumask *cpumask)
 {
-	if (refcount_dec_and_test(&cpumask->usage))
-		call_rcu(&cpumask->rcu, cpumask_free_cb);
+	if (!refcount_dec_and_test(&cpumask->usage))
+		return;
+
+	migrate_disable();
+	bpf_mem_cache_free_rcu(&bpf_cpumask_ma, cpumask);
+	migrate_enable();
 }

 /**
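For reference, a sketch of the BPF-side usage this conversion protects,
modeled on the cpumask selftests rather than taken from this series (kfunc
set abbreviated): after this patch, the final release defers the free through
bpf_mem_cache_free_rcu(), so a concurrent RCU reader of the mask stays safe
without the private rcu_head.

#include "vmlinux.h"
#include <bpf/bpf_helpers.h>
#include <bpf/bpf_tracing.h>

struct bpf_cpumask *bpf_cpumask_create(void) __ksym;
void bpf_cpumask_release(struct bpf_cpumask *cpumask) __ksym;

char _license[] SEC("license") = "GPL";

SEC("tp_btf/task_newtask")
int BPF_PROG(use_cpumask, struct task_struct *task, u64 clone_flags)
{
	struct bpf_cpumask *mask;

	mask = bpf_cpumask_create();
	if (!mask)
		return 0;

	/* ... populate and test the mask ... */

	/* The last reference drop routes the object through an RCU grace
	 * period via bpf_mem_cache_free_rcu() before it can be reused.
	 */
	bpf_cpumask_release(mask);
	return 0;
}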