From patchwork Sun Jan 26 07:02:05 2025
From: Suren Baghdasaryan <surenb@google.com>
Date: Sat, 25 Jan 2025 23:02:05 -0800
Subject: [PATCH 2/3] alloc_tag: uninline code gated by mem_alloc_profiling_key in slab allocator
In-Reply-To: <20250126070206.381302-1-surenb@google.com>
References: <20250126070206.381302-1-surenb@google.com>
Message-ID: <20250126070206.381302-2-surenb@google.com>
To: akpm@linux-foundation.org
Cc: kent.overstreet@linux.dev, vbabka@suse.cz, yuzhao@google.com,
 minchan@google.com, shakeel.butt@linux.dev, souravpanda@google.com,
 pasha.tatashin@soleen.com, 00107082@163.com, quic_zhenhuah@quicinc.com,
 surenb@google.com, linux-mm@kvack.org, linux-kernel@vger.kernel.org
When a sizable code section is protected by a disabled static key, that
code still gets pulled into the instruction cache even though it is
never executed, consuming cache space and increasing cache misses. This
can be remedied by moving such code into a separate uninlined function.
The improvement, however, comes at the expense of the configuration in
which this static key gets enabled, since there is now an additional
function call.

The default state of mem_alloc_profiling_key is controlled by
CONFIG_MEM_ALLOC_PROFILING_ENABLED_BY_DEFAULT. Apply this optimization
only if CONFIG_MEM_ALLOC_PROFILING_ENABLED_BY_DEFAULT=n, improving the
performance of the default configuration. When
CONFIG_MEM_ALLOC_PROFILING_ENABLED_BY_DEFAULT=y the functions are
inlined and performance does not change.

On a Pixel6 phone, slab allocation profiling overhead measured with
CONFIG_MEM_ALLOC_PROFILING_ENABLED_BY_DEFAULT=n and profiling disabled:

             baseline       modified
Big          3.31%          0.17%
Medium       3.79%          0.57%
Little       6.68%          1.28%

When CONFIG_MEM_ALLOC_PROFILING_ENABLED_BY_DEFAULT=n and memory
allocation profiling gets enabled, the difference in performance before
and after this change stays within noise levels.

On x86 this patch does not make a noticeable difference because the
overhead with mem_alloc_profiling_key disabled is much lower (under 1%)
to begin with, so any improvement is less visible and hard to
distinguish from noise.
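For illustration only (a minimal sketch of the general pattern, not
code from this patch; the key and function names below are
hypothetical):

	/* Hypothetical example of uninlining a static-key-gated slow path. */
	DEFINE_STATIC_KEY_FALSE(my_feature_key);

	/*
	 * noinline keeps the bulky slow path out of its callers, so it
	 * does not occupy instruction-cache space while the key is off.
	 */
	static noinline void my_feature_slow_path(void *obj)
	{
		/* bulky bookkeeping, executed only when the key is on */
	}

	static inline void my_feature_hook(void *obj)
	{
		/* compiles down to a patched NOP/jump while disabled */
		if (static_branch_unlikely(&my_feature_key))
			my_feature_slow_path(obj);
	}

The patch below applies the same split to the slab hooks, using
inline_if_mem_alloc_prof so the uninlining happens only when the key
defaults to off.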
Signed-off-by: Suren Baghdasaryan <surenb@google.com>
---
 include/linux/alloc_tag.h |  6 +++++
 mm/slub.c                 | 46 ++++++++++++++++++++++++---------------
 2 files changed, 34 insertions(+), 18 deletions(-)

diff --git a/include/linux/alloc_tag.h b/include/linux/alloc_tag.h
index a946e0203e6d..c5de2a0c1780 100644
--- a/include/linux/alloc_tag.h
+++ b/include/linux/alloc_tag.h
@@ -116,6 +116,12 @@ DECLARE_PER_CPU(struct alloc_tag_counters, _shared_alloc_tag);
 DECLARE_STATIC_KEY_MAYBE(CONFIG_MEM_ALLOC_PROFILING_ENABLED_BY_DEFAULT,
 			mem_alloc_profiling_key);
 
+#ifdef CONFIG_MEM_ALLOC_PROFILING_ENABLED_BY_DEFAULT
+#define inline_if_mem_alloc_prof	inline
+#else
+#define inline_if_mem_alloc_prof	noinline
+#endif
+
 static inline bool mem_alloc_profiling_enabled(void)
 {
 	return static_branch_maybe(CONFIG_MEM_ALLOC_PROFILING_ENABLED_BY_DEFAULT,
diff --git a/mm/slub.c b/mm/slub.c
index 996691c137eb..3107d43dfddc 100644
--- a/mm/slub.c
+++ b/mm/slub.c
@@ -2000,7 +2000,7 @@ int alloc_slab_obj_exts(struct slab *slab, struct kmem_cache *s,
 	return 0;
 }
 
-static inline void free_slab_obj_exts(struct slab *slab)
+static inline_if_mem_alloc_prof void free_slab_obj_exts(struct slab *slab)
 {
 	struct slabobj_ext *obj_exts;
 
@@ -2077,33 +2077,35 @@ prepare_slab_obj_exts_hook(struct kmem_cache *s, gfp_t flags, void *p)
 	return slab_obj_exts(slab) + obj_to_index(s, slab, p);
 }
 
-static inline void
-alloc_tagging_slab_alloc_hook(struct kmem_cache *s, void *object, gfp_t flags)
+static inline_if_mem_alloc_prof void
+__alloc_tagging_slab_alloc_hook(struct kmem_cache *s, void *object, gfp_t flags)
 {
-	if (need_slab_obj_ext()) {
-		struct slabobj_ext *obj_exts;
+	struct slabobj_ext *obj_exts;
 
-		obj_exts = prepare_slab_obj_exts_hook(s, flags, object);
-		/*
-		 * Currently obj_exts is used only for allocation profiling.
-		 * If other users appear then mem_alloc_profiling_enabled()
-		 * check should be added before alloc_tag_add().
-		 */
-		if (likely(obj_exts))
-			alloc_tag_add(&obj_exts->ref, current->alloc_tag, s->size);
-	}
+	obj_exts = prepare_slab_obj_exts_hook(s, flags, object);
+	/*
+	 * Currently obj_exts is used only for allocation profiling.
+	 * If other users appear then mem_alloc_profiling_enabled()
+	 * check should be added before alloc_tag_add().
+	 */
+	if (likely(obj_exts))
+		alloc_tag_add(&obj_exts->ref, current->alloc_tag, s->size);
 }
 
 static inline void
-alloc_tagging_slab_free_hook(struct kmem_cache *s, struct slab *slab, void **p,
+alloc_tagging_slab_alloc_hook(struct kmem_cache *s, void *object, gfp_t flags)
+{
+	if (need_slab_obj_ext())
+		__alloc_tagging_slab_alloc_hook(s, object, flags);
+}
+
+static inline_if_mem_alloc_prof void
+__alloc_tagging_slab_free_hook(struct kmem_cache *s, struct slab *slab, void **p,
 			       int objects)
 {
 	struct slabobj_ext *obj_exts;
 	int i;
 
-	if (!mem_alloc_profiling_enabled())
-		return;
-
 	/* slab->obj_exts might not be NULL if it was created for MEMCG accounting. */
 	if (s->flags & (SLAB_NO_OBJ_EXT | SLAB_NOLEAKTRACE))
 		return;
@@ -2119,6 +2121,14 @@ alloc_tagging_slab_free_hook(struct kmem_cache *s, struct slab *slab, void **p,
 	}
 }
 
+static inline void
+alloc_tagging_slab_free_hook(struct kmem_cache *s, struct slab *slab, void **p,
+			     int objects)
+{
+	if (mem_alloc_profiling_enabled())
+		__alloc_tagging_slab_free_hook(s, slab, p, objects);
+}
+
 #else /* CONFIG_MEM_ALLOC_PROFILING */
 
 static inline void