From patchwork Tue Jan 14 02:19:17 2025
X-Patchwork-Submitter: Alexei Starovoitov
X-Patchwork-Id: 13938312
X-Patchwork-Delegate: bpf@iogearbox.net
From: Alexei Starovoitov
To: bpf@vger.kernel.org
Cc: andrii@kernel.org, memxor@gmail.com, akpm@linux-foundation.org,
    peterz@infradead.org, vbabka@suse.cz, bigeasy@linutronix.de,
    rostedt@goodmis.org, houtao1@huawei.com, hannes@cmpxchg.org,
    shakeel.butt@linux.dev, mhocko@suse.com, willy@infradead.org,
    tglx@linutronix.de, jannh@google.com, tj@kernel.org,
    linux-mm@kvack.org, kernel-team@fb.com
Subject: [PATCH bpf-next v4 1/6] mm, bpf: Introduce try_alloc_pages() for opportunistic page allocation
Date: Mon, 13 Jan 2025 18:19:17 -0800
Message-Id: <20250114021922.92609-2-alexei.starovoitov@gmail.com>
In-Reply-To: <20250114021922.92609-1-alexei.starovoitov@gmail.com>
References: <20250114021922.92609-1-alexei.starovoitov@gmail.com>

From: Alexei Starovoitov

Tracing BPF programs execute from tracepoints and kprobes where the
running context is unknown, but they need to request additional memory.
The prior workarounds were to use pre-allocated memory and BPF-specific
freelists to satisfy such allocation requests. Instead, introduce a
gfpflags_allow_spinning() condition that signals to the allocator that
the running context is unknown. Then rely on the percpu free list of
pages to allocate a page; rmqueue_pcplist() should be able to pop a
page from it. If that fails (due to IRQ re-entrancy or the list being
empty), try_alloc_pages() attempts to spin_trylock zone->lock and
refill the percpu freelist as normal. A BPF program may execute with
IRQs disabled, and zone->lock is a sleeping lock in RT, so trylock is
the only option. In theory we could introduce a percpu reentrance
counter and increment it every time spin_lock_irqsave(&zone->lock,
flags) is used, but we cannot rely on it: even if this cpu is not in
the page_alloc path, spin_lock_irqsave() is not safe, since the BPF
prog might be called from a tracepoint where preemption is disabled.
So trylock only.

Note, free_pages() and memcg are not yet taught about the
gfpflags_allow_spinning() condition. That support comes in the next
patches.

This is a first step towards supporting BPF requirements in SLUB and
getting rid of bpf_mem_alloc. That goal was discussed at LSFMM:
https://lwn.net/Articles/974138/
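To make the intended calling convention concrete, here is a minimal
hypothetical caller (not part of the patch; the helper name is invented
for illustration):

	/*
	 * Minimal hypothetical caller (not part of the patch): grab one
	 * page from a context that may be NMI, hard IRQ, or a tracepoint
	 * with preemption disabled. try_alloc_pages() only uses trylocks,
	 * so it may fail; the caller must tolerate a NULL return.
	 */
	static void *grab_scratch_page(void)
	{
		struct page *page = try_alloc_pages(NUMA_NO_NODE, 0);

		return page ? page_address(page) : NULL;
	}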
Signed-off-by: Alexei Starovoitov
Acked-by: Michal Hocko
---
 include/linux/gfp.h | 22 ++++++++++++
 mm/internal.h       |  1 +
 mm/page_alloc.c     | 85 +++++++++++++++++++++++++++++++++++++++++++--
 3 files changed, 105 insertions(+), 3 deletions(-)

diff --git a/include/linux/gfp.h b/include/linux/gfp.h
index b0fe9f62d15b..b41bb6e01781 100644
--- a/include/linux/gfp.h
+++ b/include/linux/gfp.h
@@ -39,6 +39,25 @@ static inline bool gfpflags_allow_blocking(const gfp_t gfp_flags)
 	return !!(gfp_flags & __GFP_DIRECT_RECLAIM);
 }
 
+static inline bool gfpflags_allow_spinning(const gfp_t gfp_flags)
+{
+	/*
+	 * !__GFP_DIRECT_RECLAIM -> direct reclaim is not allowed.
+	 * !__GFP_KSWAPD_RECLAIM -> it's not safe to wake up kswapd.
+	 * All GFP_* flags including GFP_NOWAIT use one or both flags.
+	 * try_alloc_pages() is the only API that doesn't specify either flag.
+	 *
+	 * This is stronger than GFP_NOWAIT or GFP_ATOMIC because
+	 * those are guaranteed to never block on a sleeping lock.
+	 * Here we are enforcing that the allocation doesn't ever spin
+	 * on any locks (i.e. only trylocks). There is no high-level
+	 * GFP_$FOO flag for this use in try_alloc_pages() as the
+	 * regular page allocator doesn't fully support this
+	 * allocation mode.
+	 */
+	return !!(gfp_flags & __GFP_RECLAIM);
+}
+
 #ifdef CONFIG_HIGHMEM
 #define OPT_ZONE_HIGHMEM ZONE_HIGHMEM
 #else
@@ -347,6 +366,9 @@ static inline struct page *alloc_page_vma_noprof(gfp_t gfp,
 }
 #define alloc_page_vma(...)	alloc_hooks(alloc_page_vma_noprof(__VA_ARGS__))
 
+struct page *try_alloc_pages_noprof(int nid, unsigned int order);
+#define try_alloc_pages(...)	alloc_hooks(try_alloc_pages_noprof(__VA_ARGS__))
+
 extern unsigned long get_free_pages_noprof(gfp_t gfp_mask, unsigned int order);
 #define __get_free_pages(...)	alloc_hooks(get_free_pages_noprof(__VA_ARGS__))
 
diff --git a/mm/internal.h b/mm/internal.h
index cb8d8e8e3ffa..5454fa610aac 100644
--- a/mm/internal.h
+++ b/mm/internal.h
@@ -1174,6 +1174,7 @@ unsigned int reclaim_clean_pages_from_list(struct zone *zone,
 #define ALLOC_NOFRAGMENT	  0x0
 #endif
 #define ALLOC_HIGHATOMIC	0x200 /* Allows access to MIGRATE_HIGHATOMIC */
+#define ALLOC_TRYLOCK		0x400 /* Only use spin_trylock in allocation path */
 #define ALLOC_KSWAPD		0x800 /* allow waking of kswapd, __GFP_KSWAPD_RECLAIM set */
 
 /* Flags that allow allocations below the min watermark. */
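As a quick reading aid, a hypothetical compile-time sketch (not part of
the patch) of what the new predicate returns for common masks:

	/*
	 * Hypothetical sketch (not part of the patch): all stock GFP
	 * masks set at least one __GFP_RECLAIM bit, so spinning on locks
	 * stays allowed; only a mask with both reclaim bits clear (as
	 * try_alloc_pages() uses internally) selects trylock-only mode.
	 */
	static inline void gfpflags_allow_spinning_examples(void)
	{
		BUILD_BUG_ON(!gfpflags_allow_spinning(GFP_KERNEL)); /* may spin and sleep */
		BUILD_BUG_ON(!gfpflags_allow_spinning(GFP_ATOMIC)); /* may spin, no sleep */
		BUILD_BUG_ON(!gfpflags_allow_spinning(GFP_NOWAIT)); /* may spin, no sleep */
		BUILD_BUG_ON(gfpflags_allow_spinning(__GFP_NOWARN | __GFP_ZERO)); /* trylock only */
	}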
diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index 1cb4b8c8886d..0f4be88ff131 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -2304,7 +2304,11 @@ static int rmqueue_bulk(struct zone *zone, unsigned int order,
 	unsigned long flags;
 	int i;
 
-	spin_lock_irqsave(&zone->lock, flags);
+	if (!spin_trylock_irqsave(&zone->lock, flags)) {
+		if (unlikely(alloc_flags & ALLOC_TRYLOCK))
+			return 0;
+		spin_lock_irqsave(&zone->lock, flags);
+	}
 	for (i = 0; i < count; ++i) {
 		struct page *page = __rmqueue(zone, order, migratetype,
 								alloc_flags);
@@ -2904,7 +2908,11 @@ struct page *rmqueue_buddy(struct zone *preferred_zone, struct zone *zone,
 
 	do {
 		page = NULL;
-		spin_lock_irqsave(&zone->lock, flags);
+		if (!spin_trylock_irqsave(&zone->lock, flags)) {
+			if (unlikely(alloc_flags & ALLOC_TRYLOCK))
+				return NULL;
+			spin_lock_irqsave(&zone->lock, flags);
+		}
 		if (alloc_flags & ALLOC_HIGHATOMIC)
 			page = __rmqueue_smallest(zone, order, MIGRATE_HIGHATOMIC);
 		if (!page) {
@@ -4509,7 +4517,8 @@ static inline bool prepare_alloc_pages(gfp_t gfp_mask, unsigned int order,
 
 	might_alloc(gfp_mask);
 
-	if (should_fail_alloc_page(gfp_mask, order))
+	if (!(*alloc_flags & ALLOC_TRYLOCK) &&
+	    should_fail_alloc_page(gfp_mask, order))
 		return false;
 
 	*alloc_flags = gfp_to_alloc_flags_cma(gfp_mask, *alloc_flags);
@@ -7023,3 +7032,73 @@ static bool __free_unaccepted(struct page *page)
 }
 
 #endif /* CONFIG_UNACCEPTED_MEMORY */
+
+struct page *try_alloc_pages_noprof(int nid, unsigned int order)
+{
+	/*
+	 * Do not specify __GFP_DIRECT_RECLAIM, since direct reclaim is not
+	 * allowed. Do not specify __GFP_KSWAPD_RECLAIM either, since wake up
+	 * of kswapd is not safe in arbitrary context.
+	 *
+	 * These two are the conditions for gfpflags_allow_spinning() being true.
+	 *
+	 * Specify __GFP_NOWARN since failing try_alloc_pages() is not a reason
+	 * to warn. Also, a warning would trigger printk(), which is unsafe from
+	 * various contexts. We cannot use printk_deferred_enter() to mitigate,
+	 * since the running context is unknown.
+	 *
+	 * Specify __GFP_ZERO to make sure that the call to kmsan_alloc_page()
+	 * below is safe in any context. Also, zeroing the page is mandatory
+	 * for BPF use cases.
+	 *
+	 * Though __GFP_NOMEMALLOC is not checked in the code path below,
+	 * specify it here to highlight that try_alloc_pages()
+	 * doesn't want to deplete reserves.
+	 */
+	gfp_t alloc_gfp = __GFP_NOWARN | __GFP_ZERO | __GFP_NOMEMALLOC;
+	unsigned int alloc_flags = ALLOC_TRYLOCK;
+	struct alloc_context ac = { };
+	struct page *page;
+
+	/*
+	 * In RT spin_trylock() may call raw_spin_lock() which is unsafe in NMI.
+	 * If spin_trylock() is called from hard IRQ the current task may be
+	 * waiting for one rt_spin_lock, but rt_spin_trylock() will mark the
+	 * task as the owner of another rt_spin_lock which will confuse PI
+	 * logic, so return immediately if called from hard IRQ or NMI.
+	 *
+	 * Note, the irqs_disabled() case is ok. This function can be called
+	 * from a raw_spin_lock_irqsave region.
+	 */
+	if (IS_ENABLED(CONFIG_PREEMPT_RT) && (in_nmi() || in_hardirq()))
+		return NULL;
+	if (!pcp_allowed_order(order))
+		return NULL;
+
+#ifdef CONFIG_UNACCEPTED_MEMORY
+	/* Bailout, since try_to_accept_memory_one() needs to take a lock */
+	if (has_unaccepted_memory())
+		return NULL;
+#endif
+	/* Bailout, since _deferred_grow_zone() needs to take a lock */
+	if (deferred_pages_enabled())
+		return NULL;
+
+	if (nid == NUMA_NO_NODE)
+		nid = numa_node_id();
+
+	prepare_alloc_pages(alloc_gfp, order, nid, NULL, &ac,
+			    &alloc_gfp, &alloc_flags);
+
+	/*
+	 * Best effort allocation from percpu free list.
+	 * If it's empty attempt to spin_trylock zone->lock.
+	 */
+	page = get_page_from_freelist(alloc_gfp, order, alloc_flags, &ac);
+
+	/* Unlike regular alloc_pages() there is no __alloc_pages_slowpath(). */
+
+	trace_mm_page_alloc(page, order, alloc_gfp, ac.migratetype);
+	kmsan_alloc_page(page, order, alloc_gfp);
+	return page;
+}
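One practical consequence at this point in the series, sketched below
under the assumption that only the allocation side is trylock-aware so
far (hypothetical code, not part of the patch):

	/*
	 * Hypothetical sketch (not part of the patch): until the next
	 * patch adds free_pages_nolock(), a page from try_alloc_pages()
	 * must be freed from a context where taking zone->lock is safe,
	 * e.g. process context or a work item.
	 */
	static void demo_alloc_then_safe_free(void)
	{
		struct page *page = try_alloc_pages(NUMA_NO_NODE, 0);

		if (page)
			__free_pages(page, 0);	/* regular free, may take zone->lock */
	}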
From patchwork Tue Jan 14 02:19:18 2025
X-Patchwork-Submitter: Alexei Starovoitov
X-Patchwork-Id: 13938313
X-Patchwork-Delegate: bpf@iogearbox.net
From: Alexei Starovoitov
To: bpf@vger.kernel.org
Cc: andrii@kernel.org, memxor@gmail.com, akpm@linux-foundation.org,
    peterz@infradead.org, vbabka@suse.cz, bigeasy@linutronix.de,
    rostedt@goodmis.org, houtao1@huawei.com, hannes@cmpxchg.org,
    shakeel.butt@linux.dev, mhocko@suse.com, willy@infradead.org,
    tglx@linutronix.de, jannh@google.com, tj@kernel.org,
    linux-mm@kvack.org, kernel-team@fb.com
Subject: [PATCH bpf-next v4 2/6] mm, bpf: Introduce free_pages_nolock()
Date: Mon, 13 Jan 2025 18:19:18 -0800
Message-Id: <20250114021922.92609-3-alexei.starovoitov@gmail.com>
In-Reply-To: <20250114021922.92609-1-alexei.starovoitov@gmail.com>
References: <20250114021922.92609-1-alexei.starovoitov@gmail.com>

From: Alexei Starovoitov

Introduce free_pages_nolock() that can free pages without taking locks.
It relies on trylock only and can be called from any context. Since
spin_trylock() cannot be used in RT from hard IRQ or NMI, it uses a
lockless linked list to stash the pages, which will be freed by a
subsequent free_pages() from a good context.

Do not use the llist unconditionally: BPF maps continuously
allocate/free, so we cannot unconditionally delay the freeing to the
llist. When memory becomes free, make it available to the kernel and
BPF users right away if possible, and fall back to the llist as the
last resort.
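A hypothetical usage sketch (not part of the patch) of the new free
path:

	/*
	 * Hypothetical usage (not part of the patch): releasing a page
	 * from a context where neither zone->lock nor the pcp lock may be
	 * taken. The free either completes via a successful trylock, or
	 * the page is parked on zone->trylock_free_pages and reclaimed by
	 * a later locked free.
	 */
	static void drop_page_any_context(struct page *page, unsigned int order)
	{
		free_pages_nolock(page, order);	/* never spins, never fails */
	}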
Signed-off-by: Alexei Starovoitov
---
 include/linux/gfp.h      |  1 +
 include/linux/mm_types.h |  4 ++
 include/linux/mmzone.h   |  3 ++
 mm/page_alloc.c          | 79 ++++++++++++++++++++++++++++++++++++----
 4 files changed, 79 insertions(+), 8 deletions(-)

diff --git a/include/linux/gfp.h b/include/linux/gfp.h
index b41bb6e01781..6eba2d80feb8 100644
--- a/include/linux/gfp.h
+++ b/include/linux/gfp.h
@@ -391,6 +391,7 @@ __meminit void *alloc_pages_exact_nid_noprof(int nid, size_t size, gfp_t gfp_mask
 	__get_free_pages((gfp_mask) | GFP_DMA, (order))
 
 extern void __free_pages(struct page *page, unsigned int order);
+extern void free_pages_nolock(struct page *page, unsigned int order);
 extern void free_pages(unsigned long addr, unsigned int order);
 
 #define __free_page(page) __free_pages((page), 0)
diff --git a/include/linux/mm_types.h b/include/linux/mm_types.h
index 7361a8f3ab68..52547b3e5fd8 100644
--- a/include/linux/mm_types.h
+++ b/include/linux/mm_types.h
@@ -99,6 +99,10 @@ struct page {
 				/* Or, free page */
 				struct list_head buddy_list;
 				struct list_head pcp_list;
+				struct {
+					struct llist_node pcp_llist;
+					unsigned int order;
+				};
 			};
 			/* See page-flags.h for PAGE_MAPPING_FLAGS */
 			struct address_space *mapping;
diff --git a/include/linux/mmzone.h b/include/linux/mmzone.h
index b36124145a16..1a854e0a9e3b 100644
--- a/include/linux/mmzone.h
+++ b/include/linux/mmzone.h
@@ -953,6 +953,9 @@ struct zone {
 	/* Primarily protects free_area */
 	spinlock_t		lock;
 
+	/* Pages to be freed when next trylock succeeds */
+	struct llist_head	trylock_free_pages;
+
 	/* Write-intensive fields used by compaction and vmstats. */
 	CACHELINE_PADDING(_pad2_);
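Before the page_alloc.c changes that consume these fields, a
hypothetical read-only sketch (not part of the patch) of how they fit
together:

	/*
	 * Hypothetical sketch (not part of the patch): pages parked by
	 * trylock-only frees sit on zone->trylock_free_pages, each one
	 * remembering its order in page->order for the eventual drain.
	 */
	static unsigned long count_stashed_pages(struct zone *zone)
	{
		struct llist_node *n;
		unsigned long nr = 0;

		llist_for_each(n, READ_ONCE(zone->trylock_free_pages.first))
			nr += 1UL << llist_entry(n, struct page, pcp_llist)->order;
		return nr;	/* racy snapshot; for illustration only */
	}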
diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index 0f4be88ff131..f967725898be 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -88,6 +88,9 @@ typedef int __bitwise fpi_t;
  */
 #define FPI_TO_TAIL		((__force fpi_t)BIT(1))
 
+/* Free the page without taking locks. Rely on trylock only. */
+#define FPI_TRYLOCK		((__force fpi_t)BIT(2))
+
 /* prevent >1 _updater_ of zone percpu pageset ->high and ->batch fields */
 static DEFINE_MUTEX(pcp_batch_high_lock);
 #define MIN_PERCPU_PAGELIST_HIGH_FRACTION (8)
@@ -1247,13 +1250,44 @@ static void split_large_buddy(struct zone *zone, struct page *page,
 	}
 }
 
+static void add_page_to_zone_llist(struct zone *zone, struct page *page,
+				   unsigned int order)
+{
+	/* Remember the order */
+	page->order = order;
+	/* Add the page to the free list */
+	llist_add(&page->pcp_llist, &zone->trylock_free_pages);
+}
+
 static void free_one_page(struct zone *zone, struct page *page,
 			  unsigned long pfn, unsigned int order,
 			  fpi_t fpi_flags)
 {
+	struct llist_head *llhead;
 	unsigned long flags;
 
-	spin_lock_irqsave(&zone->lock, flags);
+	if (!spin_trylock_irqsave(&zone->lock, flags)) {
+		if (unlikely(fpi_flags & FPI_TRYLOCK)) {
+			add_page_to_zone_llist(zone, page, order);
+			return;
+		}
+		spin_lock_irqsave(&zone->lock, flags);
+	}
+
+	/* The lock succeeded. Process deferred pages. */
+	llhead = &zone->trylock_free_pages;
+	if (unlikely(!llist_empty(llhead) && !(fpi_flags & FPI_TRYLOCK))) {
+		struct llist_node *llnode;
+		struct page *p, *tmp;
+
+		llnode = llist_del_all(llhead);
+		llist_for_each_entry_safe(p, tmp, llnode, pcp_llist) {
+			unsigned int p_order = p->order;
+
+			split_large_buddy(zone, p, page_to_pfn(p), p_order, fpi_flags);
+			__count_vm_events(PGFREE, 1 << p_order);
+		}
+	}
 	split_large_buddy(zone, page, pfn, order, fpi_flags);
 	spin_unlock_irqrestore(&zone->lock, flags);
@@ -2596,7 +2630,7 @@ static int nr_pcp_high(struct per_cpu_pages *pcp, struct zone *zone,
 
 static void free_unref_page_commit(struct zone *zone, struct per_cpu_pages *pcp,
 				   struct page *page, int migratetype,
-				   unsigned int order)
+				   unsigned int order, fpi_t fpi_flags)
 {
 	int high, batch;
 	int pindex;
@@ -2631,6 +2665,14 @@ static void free_unref_page_commit(struct zone *zone, struct per_cpu_pages *pcp,
 	}
 	if (pcp->free_count < (batch << CONFIG_PCP_BATCH_SCALE_MAX))
 		pcp->free_count += (1 << order);
+
+	if (unlikely(fpi_flags & FPI_TRYLOCK)) {
+		/*
+		 * Do not attempt to take a zone lock. Let pcp->count get
+		 * over high mark temporarily.
+		 */
+		return;
+	}
 	high = nr_pcp_high(pcp, zone, batch, free_high);
 	if (pcp->count >= high) {
 		free_pcppages_bulk(zone, nr_pcp_free(pcp, batch, high, free_high),
@@ -2645,7 +2687,8 @@ static void free_unref_page_commit(struct zone *zone, struct per_cpu_pages *pcp,
 /*
  * Free a pcp page
  */
-void free_unref_page(struct page *page, unsigned int order)
+static void __free_unref_page(struct page *page, unsigned int order,
+			      fpi_t fpi_flags)
 {
 	unsigned long __maybe_unused UP_flags;
 	struct per_cpu_pages *pcp;
@@ -2654,7 +2697,7 @@ void free_unref_page(struct page *page, unsigned int order)
 	int migratetype;
 
 	if (!pcp_allowed_order(order)) {
-		__free_pages_ok(page, order, FPI_NONE);
+		__free_pages_ok(page, order, fpi_flags);
 		return;
 	}
@@ -2671,24 +2714,33 @@ void free_unref_page(struct page *page, unsigned int order)
 	migratetype = get_pfnblock_migratetype(page, pfn);
 	if (unlikely(migratetype >= MIGRATE_PCPTYPES)) {
 		if (unlikely(is_migrate_isolate(migratetype))) {
-			free_one_page(page_zone(page), page, pfn, order, FPI_NONE);
+			free_one_page(page_zone(page), page, pfn, order, fpi_flags);
 			return;
 		}
 		migratetype = MIGRATE_MOVABLE;
 	}
 
 	zone = page_zone(page);
+	if (IS_ENABLED(CONFIG_PREEMPT_RT) && (in_nmi() || in_hardirq())) {
+		add_page_to_zone_llist(zone, page, order);
+		return;
+	}
 	pcp_trylock_prepare(UP_flags);
 	pcp = pcp_spin_trylock(zone->per_cpu_pageset);
 	if (pcp) {
-		free_unref_page_commit(zone, pcp, page, migratetype, order);
+		free_unref_page_commit(zone, pcp, page, migratetype, order, fpi_flags);
 		pcp_spin_unlock(pcp);
 	} else {
-		free_one_page(zone, page, pfn, order, FPI_NONE);
+		free_one_page(zone, page, pfn, order, fpi_flags);
 	}
 	pcp_trylock_finish(UP_flags);
 }
 
+void free_unref_page(struct page *page, unsigned int order)
+{
+	__free_unref_page(page, order, FPI_NONE);
+}
+
 /*
  * Free a batch of folios
  */
@@ -2777,7 +2829,7 @@ void free_unref_folios(struct folio_batch *folios)
 
 		trace_mm_page_free_batched(&folio->page);
 		free_unref_page_commit(zone, pcp, &folio->page, migratetype,
-				       order);
+				       order, FPI_NONE);
 	}
 
 	if (pcp) {
@@ -4853,6 +4905,17 @@ void __free_pages(struct page *page, unsigned int order)
 }
 EXPORT_SYMBOL(__free_pages);
 
+/*
+ * Can be called while holding raw_spin_lock or from IRQ and NMI,
+ * but only for pages that came from try_alloc_pages():
+ * order <= 3, !folio, etc
+ */
+void free_pages_nolock(struct page *page, unsigned int order)
+{
+	if (put_page_testzero(page))
+		__free_unref_page(page, order, FPI_TRYLOCK);
+}
+
 void free_pages(unsigned long addr, unsigned int order)
 {
 	if (addr != 0) {
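Since free_pages_nolock() goes through put_page_testzero(), refcounted
sharing works the same way as with __free_pages(); a hypothetical
sketch (not part of the patch):

	/*
	 * Hypothetical sketch (not part of the patch): free_pages_nolock()
	 * drops a reference, so the page is only returned to the allocator
	 * when the last reference goes away.
	 */
	static void demo_shared_page_put(struct page *page)
	{
		get_page(page);			/* refcount: 1 -> 2 */
		free_pages_nolock(page, 0);	/* 2 -> 1, not freed yet */
		free_pages_nolock(page, 0);	/* 1 -> 0, freed via trylock path */
	}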
From patchwork Tue Jan 14 02:19:19 2025
X-Patchwork-Submitter: Alexei Starovoitov
X-Patchwork-Id: 13938314
X-Patchwork-Delegate: bpf@iogearbox.net
From: Alexei Starovoitov
To: bpf@vger.kernel.org
Cc: andrii@kernel.org, memxor@gmail.com, akpm@linux-foundation.org,
    peterz@infradead.org, vbabka@suse.cz, bigeasy@linutronix.de,
    rostedt@goodmis.org, houtao1@huawei.com, hannes@cmpxchg.org,
    shakeel.butt@linux.dev, mhocko@suse.com, willy@infradead.org,
    tglx@linutronix.de, jannh@google.com, tj@kernel.org,
    linux-mm@kvack.org, kernel-team@fb.com
Subject: [PATCH bpf-next v4 3/6] locking/local_lock: Introduce local_trylock_irqsave()
Date: Mon, 13 Jan 2025 18:19:19 -0800
Message-Id: <20250114021922.92609-4-alexei.starovoitov@gmail.com>
In-Reply-To: <20250114021922.92609-1-alexei.starovoitov@gmail.com>
References: <20250114021922.92609-1-alexei.starovoitov@gmail.com>

From: Alexei Starovoitov

Similar to local_lock_irqsave(), introduce local_trylock_irqsave().
It is inspired by 'struct local_tryirq_lock' in:
https://lore.kernel.org/all/20241112-slub-percpu-caches-v1-5-ddc0bdc27e05@suse.cz/

In PREEMPT_RT, use spin_trylock() when not in hard IRQ and not in NMI,
and fail instantly otherwise, since spin_trylock() is not safe from
those contexts due to PI issues.

In !PREEMPT_RT, use a simple 'active' flag to prevent IRQs or NMIs
from re-entering the locked region. Note there is no need to use
local_inc for the active flag: if an IRQ handler grabs the same
local_lock after READ_ONCE(lock->active) has already completed, it has
to unlock it before returning; likewise for an NMI handler. So there is
a strict nesting of scopes. It's a per-CPU lock; multiple CPUs do not
access it in parallel.
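A hypothetical usage sketch (not part of the patch), mirroring how the
memcg stock lock is converted later in the series; the lock name and
helper are invented:

	/*
	 * Hypothetical usage (not part of the patch): opportunistic
	 * callers bail out on contention/re-entrancy, regular callers
	 * spin as before.
	 */
	static DEFINE_PER_CPU(local_lock_t, demo_lock) = INIT_LOCAL_LOCK(demo_lock);

	static bool demo_update(bool allow_spinning)
	{
		unsigned long flags;

		if (!local_trylock_irqsave(&demo_lock, flags)) {
			if (!allow_spinning)
				return false;	/* unknown context: give up */
			local_lock_irqsave(&demo_lock, flags);
		}
		/* ... touch per-CPU state ... */
		local_unlock_irqrestore(&demo_lock, flags);
		return true;
	}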
Signed-off-by: Alexei Starovoitov
---
 include/linux/local_lock.h          |  9 ++++
 include/linux/local_lock_internal.h | 76 ++++++++++++++++++++++++++---
 2 files changed, 78 insertions(+), 7 deletions(-)

diff --git a/include/linux/local_lock.h b/include/linux/local_lock.h
index 091dc0b6bdfb..84ee560c4f51 100644
--- a/include/linux/local_lock.h
+++ b/include/linux/local_lock.h
@@ -30,6 +30,15 @@
 #define local_lock_irqsave(lock, flags)				\
 	__local_lock_irqsave(lock, flags)
 
+/**
+ * local_trylock_irqsave - Try to acquire a per CPU local lock, save and disable
+ *			   interrupts. Always fails in RT when in_hardirq or NMI.
+ * @lock:	The lock variable
+ * @flags:	Storage for interrupt flags
+ */
+#define local_trylock_irqsave(lock, flags)			\
+	__local_trylock_irqsave(lock, flags)
+
 /**
  * local_unlock - Release a per CPU local lock
  * @lock:	The lock variable
diff --git a/include/linux/local_lock_internal.h b/include/linux/local_lock_internal.h
index 8dd71fbbb6d2..93672127c73d 100644
--- a/include/linux/local_lock_internal.h
+++ b/include/linux/local_lock_internal.h
@@ -9,6 +9,7 @@
 #ifndef CONFIG_PREEMPT_RT
 
 typedef struct {
+	int active;
 #ifdef CONFIG_DEBUG_LOCK_ALLOC
 	struct lockdep_map	dep_map;
 	struct task_struct	*owner;
@@ -22,7 +23,7 @@ typedef struct {
 		.wait_type_inner = LD_WAIT_CONFIG,	\
 		.lock_type = LD_LOCK_PERCPU,		\
 	},						\
-	.owner = NULL,
+	.owner = NULL, .active = 0
 
 static inline void local_lock_acquire(local_lock_t *l)
 {
@@ -31,6 +32,13 @@ static inline void local_lock_acquire(local_lock_t *l)
 	l->owner = current;
 }
 
+static inline void local_trylock_acquire(local_lock_t *l)
+{
+	lock_map_acquire_try(&l->dep_map);
+	DEBUG_LOCKS_WARN_ON(l->owner);
+	l->owner = current;
+}
+
 static inline void local_lock_release(local_lock_t *l)
 {
 	DEBUG_LOCKS_WARN_ON(l->owner != current);
@@ -45,6 +53,7 @@ static inline void local_lock_debug_init(local_lock_t *l)
 #else /* CONFIG_DEBUG_LOCK_ALLOC */
 # define LOCAL_LOCK_DEBUG_INIT(lockname)
 static inline void local_lock_acquire(local_lock_t *l) { }
+static inline void local_trylock_acquire(local_lock_t *l) { }
 static inline void local_lock_release(local_lock_t *l) { }
 static inline void local_lock_debug_init(local_lock_t *l) { }
 #endif /* !CONFIG_DEBUG_LOCK_ALLOC */
@@ -60,6 +69,7 @@ do {								\
 			      0, LD_WAIT_CONFIG, LD_WAIT_INV,	\
 			      LD_LOCK_PERCPU);			\
 	local_lock_debug_init(lock);				\
+	(lock)->active = 0;					\
 } while (0)
 
 #define __spinlock_nested_bh_init(lock)				\
@@ -75,37 +85,73 @@ do {								\
 
 #define __local_lock(lock)					\
 	do {							\
+		local_lock_t *l;				\
 		preempt_disable();				\
-		local_lock_acquire(this_cpu_ptr(lock));		\
+		l = this_cpu_ptr(lock);				\
+		lockdep_assert(l->active == 0);			\
+		WRITE_ONCE(l->active, 1);			\
+		local_lock_acquire(l);				\
 	} while (0)
 
 #define __local_lock_irq(lock)					\
 	do {							\
+		local_lock_t *l;				\
 		local_irq_disable();				\
-		local_lock_acquire(this_cpu_ptr(lock));		\
+		l = this_cpu_ptr(lock);				\
+		lockdep_assert(l->active == 0);			\
+		WRITE_ONCE(l->active, 1);			\
+		local_lock_acquire(l);				\
 	} while (0)
 
 #define __local_lock_irqsave(lock, flags)			\
	do {							\
+		local_lock_t *l;				\
 		local_irq_save(flags);				\
-		local_lock_acquire(this_cpu_ptr(lock));		\
+		l = this_cpu_ptr(lock);				\
+		lockdep_assert(l->active == 0);			\
+		WRITE_ONCE(l->active, 1);			\
+		local_lock_acquire(l);				\
 	} while (0)
 
+#define __local_trylock_irqsave(lock, flags)			\
+	({							\
+		local_lock_t *l;				\
+		local_irq_save(flags);				\
+		l = this_cpu_ptr(lock);				\
+		if (READ_ONCE(l->active) == 1) {		\
+			local_irq_restore(flags);		\
+			l = NULL;				\
+		} else {					\
+			WRITE_ONCE(l->active, 1);		\
+			local_trylock_acquire(l);		\
+		}						\
+		!!l;						\
+	})
+
 #define __local_unlock(lock)					\
 	do {							\
-		local_lock_release(this_cpu_ptr(lock));		\
+		local_lock_t *l = this_cpu_ptr(lock);		\
+		lockdep_assert(l->active == 1);			\
+		WRITE_ONCE(l->active, 0);			\
+		local_lock_release(l);				\
 		preempt_enable();				\
 	} while (0)
 
 #define __local_unlock_irq(lock)				\
 	do {							\
-		local_lock_release(this_cpu_ptr(lock));		\
+		local_lock_t *l = this_cpu_ptr(lock);		\
+		lockdep_assert(l->active == 1);			\
+		WRITE_ONCE(l->active, 0);			\
+		local_lock_release(l);				\
 		local_irq_enable();				\
 	} while (0)
 
 #define __local_unlock_irqrestore(lock, flags)			\
 	do {							\
-		local_lock_release(this_cpu_ptr(lock));		\
+		local_lock_t *l = this_cpu_ptr(lock);		\
+		lockdep_assert(l->active == 1);			\
+		WRITE_ONCE(l->active, 0);			\
+		local_lock_release(l);				\
 		local_irq_restore(flags);			\
 	} while (0)
 
@@ -148,6 +194,22 @@ typedef spinlock_t local_lock_t;
 		__local_lock(lock);				\
 	} while (0)
 
+#define __local_trylock_irqsave(lock, flags)			\
+	({							\
+		__label__ out;					\
+		int ret = 0;					\
+		typecheck(unsigned long, flags);		\
+		flags = 0;					\
+		if (in_nmi() || in_hardirq())			\
+			goto out;				\
+		migrate_disable();				\
+		ret = spin_trylock(this_cpu_ptr((lock)));	\
+		if (!ret)					\
+			migrate_enable();			\
+	out:							\
+		ret;						\
+	})
+
 #define __local_unlock(__lock)					\
 	do {							\
 		spin_unlock(this_cpu_ptr((__lock)));		\
From patchwork Tue Jan 14 02:19:20 2025
X-Patchwork-Submitter: Alexei Starovoitov
X-Patchwork-Id: 13938315
X-Patchwork-Delegate: bpf@iogearbox.net
From: Alexei Starovoitov
To: bpf@vger.kernel.org
Cc: andrii@kernel.org, memxor@gmail.com, akpm@linux-foundation.org,
    peterz@infradead.org, vbabka@suse.cz, bigeasy@linutronix.de,
    rostedt@goodmis.org, houtao1@huawei.com, hannes@cmpxchg.org,
    shakeel.butt@linux.dev, mhocko@suse.com, willy@infradead.org,
    tglx@linutronix.de, jannh@google.com, tj@kernel.org,
    linux-mm@kvack.org, kernel-team@fb.com
Subject: [PATCH bpf-next v4 4/6] memcg: Use trylock to access memcg stock_lock.
Date: Mon, 13 Jan 2025 18:19:20 -0800
Message-Id: <20250114021922.92609-5-alexei.starovoitov@gmail.com>
In-Reply-To: <20250114021922.92609-1-alexei.starovoitov@gmail.com>
References: <20250114021922.92609-1-alexei.starovoitov@gmail.com>

From: Alexei Starovoitov

Teach memcg to operate under trylock conditions when spinning locks
cannot be used. The end result is that __memcg_kmem_charge_page() and
__memcg_kmem_uncharge_page() are safe to use from any context in both
RT and !RT. In !RT an NMI handler may fail to trylock stock_lock. In RT
hard IRQ and NMI handlers will not attempt the trylock at all.
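Schematically, the conversion applies the shape below to the stock
path; this is a hypothetical condensed restatement of the
consume_stock() change in the diff that follows, not additional code:

	/*
	 * Condensed restatement (hypothetical): trylock first; on failure,
	 * spin only if the gfp mask permits spinning, otherwise report
	 * that the per-CPU stock could not be consumed.
	 */
	static bool stock_lock_or_bail(gfp_t gfp_mask, unsigned long *flagsp)
	{
		unsigned long flags;

		if (!local_trylock_irqsave(&memcg_stock.stock_lock, flags)) {
			if (!gfpflags_allow_spinning(gfp_mask))
				return false;	/* opportunistic caller: bail out */
			local_lock_irqsave(&memcg_stock.stock_lock, flags);
		}
		*flagsp = flags;
		return true;
	}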
Signed-off-by: Alexei Starovoitov
Acked-by: Michal Hocko
---
 mm/memcontrol.c | 24 ++++++++++++++++++++----
 1 file changed, 20 insertions(+), 4 deletions(-)

diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index 7b3503d12aaf..e4c7049465e0 100644
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -1756,7 +1756,8 @@ static bool obj_stock_flush_required(struct memcg_stock_pcp *stock,
  *
  * returns true if successful, false otherwise.
  */
-static bool consume_stock(struct mem_cgroup *memcg, unsigned int nr_pages)
+static bool consume_stock(struct mem_cgroup *memcg, unsigned int nr_pages,
+			  gfp_t gfp_mask)
 {
 	struct memcg_stock_pcp *stock;
 	unsigned int stock_pages;
@@ -1766,7 +1767,11 @@ static bool consume_stock(struct mem_cgroup *memcg, unsigned int nr_pages)
 	if (nr_pages > MEMCG_CHARGE_BATCH)
 		return ret;
 
-	local_lock_irqsave(&memcg_stock.stock_lock, flags);
+	if (!local_trylock_irqsave(&memcg_stock.stock_lock, flags)) {
+		if (!gfpflags_allow_spinning(gfp_mask))
+			return ret;
+		local_lock_irqsave(&memcg_stock.stock_lock, flags);
+	}
 
 	stock = this_cpu_ptr(&memcg_stock);
 	stock_pages = READ_ONCE(stock->nr_pages);
@@ -1851,7 +1856,14 @@ static void refill_stock(struct mem_cgroup *memcg, unsigned int nr_pages)
 {
 	unsigned long flags;
 
-	local_lock_irqsave(&memcg_stock.stock_lock, flags);
+	if (!local_trylock_irqsave(&memcg_stock.stock_lock, flags)) {
+		/*
+		 * In case of unlikely failure to lock percpu stock_lock
+		 * uncharge memcg directly.
+		 */
+		mem_cgroup_cancel_charge(memcg, nr_pages);
+		return;
+	}
 	__refill_stock(memcg, nr_pages);
 	local_unlock_irqrestore(&memcg_stock.stock_lock, flags);
 }
@@ -2196,9 +2208,13 @@ int try_charge_memcg(struct mem_cgroup *memcg, gfp_t gfp_mask,
 	unsigned long pflags;
 
 retry:
-	if (consume_stock(memcg, nr_pages))
+	if (consume_stock(memcg, nr_pages, gfp_mask))
 		return 0;
 
+	if (!gfpflags_allow_spinning(gfp_mask))
+		/* Avoid the refill and flush of the older stock */
+		batch = nr_pages;
+
 	if (!do_memsw_account() ||
 	    page_counter_try_charge(&memcg->memsw, batch, &counter)) {
 		if (page_counter_try_charge(&memcg->memory, batch, &counter))

From patchwork Tue Jan 14 02:19:21 2025
X-Patchwork-Submitter: Alexei Starovoitov
X-Patchwork-Id: 13938316
X-Patchwork-Delegate: bpf@iogearbox.net
From: Alexei Starovoitov
To: bpf@vger.kernel.org
Cc: andrii@kernel.org, memxor@gmail.com, akpm@linux-foundation.org,
    peterz@infradead.org, vbabka@suse.cz, bigeasy@linutronix.de,
    rostedt@goodmis.org, houtao1@huawei.com, hannes@cmpxchg.org,
    shakeel.butt@linux.dev, mhocko@suse.com, willy@infradead.org,
    tglx@linutronix.de, jannh@google.com, tj@kernel.org,
    linux-mm@kvack.org, kernel-team@fb.com
Subject: [PATCH bpf-next v4 5/6] mm, bpf: Use memcg in try_alloc_pages().
Date: Mon, 13 Jan 2025 18:19:21 -0800
Message-Id: <20250114021922.92609-6-alexei.starovoitov@gmail.com>
In-Reply-To: <20250114021922.92609-1-alexei.starovoitov@gmail.com>
References: <20250114021922.92609-1-alexei.starovoitov@gmail.com>

From: Alexei Starovoitov

Unconditionally use __GFP_ACCOUNT in try_alloc_pages(). The caller is
responsible for setting up the memcg correctly: all BPF memory
accounting is memcg based.
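Since the charge is now unconditional, a caller selects the memcg to
charge beforehand; a hypothetical sketch (not part of the patch) using
the existing set_active_memcg() helper:

	/*
	 * Hypothetical sketch (not part of the patch): charge the page to
	 * a specific memcg around the opportunistic allocation.
	 */
	static struct page *alloc_charged_page(struct mem_cgroup *memcg)
	{
		struct mem_cgroup *old = set_active_memcg(memcg);
		struct page *page = try_alloc_pages(NUMA_NO_NODE, 0);

		set_active_memcg(old);
		return page;	/* NULL if allocation or charging failed */
	}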
Signed-off-by: Alexei Starovoitov
---
 mm/page_alloc.c | 8 +++++++-
 1 file changed, 7 insertions(+), 1 deletion(-)

diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index f967725898be..d80d4212c7c6 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -7118,7 +7118,8 @@ struct page *try_alloc_pages_noprof(int nid, unsigned int order)
 	 * specify it here to highlight that try_alloc_pages()
 	 * doesn't want to deplete reserves.
 	 */
-	gfp_t alloc_gfp = __GFP_NOWARN | __GFP_ZERO | __GFP_NOMEMALLOC;
+	gfp_t alloc_gfp = __GFP_NOWARN | __GFP_ZERO | __GFP_NOMEMALLOC
+			| __GFP_ACCOUNT;
 	unsigned int alloc_flags = ALLOC_TRYLOCK;
 	struct alloc_context ac = { };
 	struct page *page;
@@ -7161,6 +7162,11 @@ struct page *try_alloc_pages_noprof(int nid, unsigned int order)
 
 	/* Unlike regular alloc_pages() there is no __alloc_pages_slowpath(). */
 
+	if (memcg_kmem_online() && page &&
+	    unlikely(__memcg_kmem_charge_page(page, alloc_gfp, order) != 0)) {
+		free_pages_nolock(page, order);
+		page = NULL;
+	}
 	trace_mm_page_alloc(page, order, alloc_gfp, ac.migratetype);
 	kmsan_alloc_page(page, order, alloc_gfp);
 	return page;

From patchwork Tue Jan 14 02:19:22 2025
X-Patchwork-Submitter: Alexei Starovoitov
X-Patchwork-Id: 13938317
X-Patchwork-Delegate: bpf@iogearbox.net
From: Alexei Starovoitov
To: bpf@vger.kernel.org
Cc: andrii@kernel.org, memxor@gmail.com, akpm@linux-foundation.org,
    peterz@infradead.org, vbabka@suse.cz, bigeasy@linutronix.de,
    rostedt@goodmis.org, houtao1@huawei.com, hannes@cmpxchg.org,
    shakeel.butt@linux.dev, mhocko@suse.com, willy@infradead.org,
    tglx@linutronix.de, jannh@google.com, tj@kernel.org,
    linux-mm@kvack.org, kernel-team@fb.com
Subject: [PATCH bpf-next v4 6/6] bpf: Use try_alloc_pages() to allocate pages for bpf needs.
Date: Mon, 13 Jan 2025 18:19:22 -0800
Message-Id: <20250114021922.92609-7-alexei.starovoitov@gmail.com>
In-Reply-To: <20250114021922.92609-1-alexei.starovoitov@gmail.com>
References: <20250114021922.92609-1-alexei.starovoitov@gmail.com>

From: Alexei Starovoitov

Use try_alloc_pages() and free_pages_nolock() in bpf_map_alloc_pages().

Signed-off-by: Alexei Starovoitov
---
 kernel/bpf/syscall.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/kernel/bpf/syscall.c b/kernel/bpf/syscall.c
index 0daf098e3207..8bcf48e31a5a 100644
--- a/kernel/bpf/syscall.c
+++ b/kernel/bpf/syscall.c
@@ -582,14 +582,14 @@ int bpf_map_alloc_pages(const struct bpf_map *map, gfp_t gfp, int nid,
 		old_memcg = set_active_memcg(memcg);
 #endif
 	for (i = 0; i < nr_pages; i++) {
-		pg = alloc_pages_node(nid, gfp | __GFP_ACCOUNT, 0);
+		pg = try_alloc_pages(nid, 0);
 
 		if (pg) {
 			pages[i] = pg;
 			continue;
 		}
 		for (j = 0; j < i; j++)
-			__free_page(pages[j]);
+			free_pages_nolock(pages[j], 0);
 		ret = -ENOMEM;
 		break;
 	}
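Putting the series together, a hypothetical end-to-end round trip (not
part of the series) of the two primitives it provides to BPF-style
consumers:

	/*
	 * Hypothetical sketch (not part of the series): allocate and free
	 * a page from any context, including NMI and tracepoints.
	 */
	static int demo_round_trip(void)
	{
		struct page *pg = try_alloc_pages(NUMA_NO_NODE, 0);

		if (!pg)
			return -ENOMEM;		/* opportunistic: may fail */
		/* ... use page_address(pg), already zeroed ... */
		free_pages_nolock(pg, 0);	/* never spins */
		return 0;
	}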