From patchwork Sun Jan 26 07:02:06 2025
X-Patchwork-Submitter: Suren Baghdasaryan <surenb@google.com>
X-Patchwork-Id: 13950576
Date: Sat, 25 Jan 2025 23:02:06 -0800
In-Reply-To: <20250126070206.381302-1-surenb@google.com>
References: <20250126070206.381302-1-surenb@google.com>
Message-ID: <20250126070206.381302-3-surenb@google.com>
Subject: [PATCH 3/3] alloc_tag: uninline code gated by mem_alloc_profiling_key
 in page allocator
From: Suren Baghdasaryan <surenb@google.com>
To: akpm@linux-foundation.org
Cc: kent.overstreet@linux.dev, vbabka@suse.cz, yuzhao@google.com,
 minchan@google.com, shakeel.butt@linux.dev, souravpanda@google.com,
 pasha.tatashin@soleen.com, 00107082@163.com, quic_zhenhuah@quicinc.com,
 surenb@google.com, linux-mm@kvack.org, linux-kernel@vger.kernel.org
When a sizable code section is protected by a disabled static key, that
code still occupies the instruction cache even though it is never
executed, increasing cache misses. This can be remedied by moving such
code into a separate uninlined function. The improvement, however, comes
at the expense of the configuration in which the static key is enabled,
since there is now an additional function call.

The default state of mem_alloc_profiling_key is controlled by
CONFIG_MEM_ALLOC_PROFILING_ENABLED_BY_DEFAULT. Apply this optimization
only if CONFIG_MEM_ALLOC_PROFILING_ENABLED_BY_DEFAULT=n, improving the
performance of the default configuration. When
CONFIG_MEM_ALLOC_PROFILING_ENABLED_BY_DEFAULT=y the functions are
inlined and performance does not change.

On a Pixel6 phone, page allocation profiling overhead measured with
CONFIG_MEM_ALLOC_PROFILING_ENABLED_BY_DEFAULT=n and profiling disabled:

        baseline  modified
Big     4.93%     1.53%
Medium  4.39%     1.41%
Little  1.02%     0.36%

When CONFIG_MEM_ALLOC_PROFILING_ENABLED_BY_DEFAULT=n and memory
allocation profiling gets enabled, the difference in performance before
and after this change stays within noise levels.

On x86 this patch does not make a noticeable difference because the
overhead with mem_alloc_profiling_key disabled is much lower (under 1%)
to begin with, so any improvement is less visible and hard to
distinguish from the noise.
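To illustrate the pattern (a minimal sketch, not code from this patch:
the example_* names are hypothetical; only the static key API from
<linux/jump_label.h> is real):

  #include <linux/jump_label.h>

  DEFINE_STATIC_KEY_FALSE(example_key);

  /* Cold body lives out of line, defined in a .c file, so it stays
   * out of the caller's instruction cache while the key is off. */
  void __example_slow_path(struct page *page);

  static inline void example_hook(struct page *page)
  {
  	/* With the key disabled this branch is patched to a NOP, and
  	 * only this short stub remains inline in the hot path. */
  	if (static_branch_unlikely(&example_key))
  		__example_slow_path(page); /* one extra call when enabled */
  }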
Signed-off-by: Suren Baghdasaryan <surenb@google.com>
---
 include/linux/pgalloc_tag.h | 60 +++-------------------------
 mm/page_alloc.c             | 78 +++++++++++++++++++++++++++++++++++++
 2 files changed, 83 insertions(+), 55 deletions(-)

diff --git a/include/linux/pgalloc_tag.h b/include/linux/pgalloc_tag.h
index 4a82b6b4820e..c74077977830 100644
--- a/include/linux/pgalloc_tag.h
+++ b/include/linux/pgalloc_tag.h
@@ -162,47 +162,13 @@ static inline void update_page_tag_ref(union pgtag_ref_handle handle, union code
 	}
 }
 
-static inline void clear_page_tag_ref(struct page *page)
-{
-	if (mem_alloc_profiling_enabled()) {
-		union pgtag_ref_handle handle;
-		union codetag_ref ref;
-
-		if (get_page_tag_ref(page, &ref, &handle)) {
-			set_codetag_empty(&ref);
-			update_page_tag_ref(handle, &ref);
-			put_page_tag_ref(handle);
-		}
-	}
-}
-
-static inline void pgalloc_tag_add(struct page *page, struct task_struct *task,
-				   unsigned int nr)
-{
-	if (mem_alloc_profiling_enabled()) {
-		union pgtag_ref_handle handle;
-		union codetag_ref ref;
-
-		if (get_page_tag_ref(page, &ref, &handle)) {
-			alloc_tag_add(&ref, task->alloc_tag, PAGE_SIZE * nr);
-			update_page_tag_ref(handle, &ref);
-			put_page_tag_ref(handle);
-		}
-	}
-}
+/* Should be called only if mem_alloc_profiling_enabled() */
+void __clear_page_tag_ref(struct page *page);
 
-static inline void pgalloc_tag_sub(struct page *page, unsigned int nr)
+static inline void clear_page_tag_ref(struct page *page)
 {
-	if (mem_alloc_profiling_enabled()) {
-		union pgtag_ref_handle handle;
-		union codetag_ref ref;
-
-		if (get_page_tag_ref(page, &ref, &handle)) {
-			alloc_tag_sub(&ref, PAGE_SIZE * nr);
-			update_page_tag_ref(handle, &ref);
-			put_page_tag_ref(handle);
-		}
-	}
+	if (mem_alloc_profiling_enabled())
+		__clear_page_tag_ref(page);
 }
 
 /* Should be called only if mem_alloc_profiling_enabled() */
@@ -222,18 +188,6 @@ static inline struct alloc_tag *__pgalloc_tag_get(struct page *page)
 	return tag;
 }
 
-static inline void pgalloc_tag_sub_pages(struct page *page, unsigned int nr)
-{
-	struct alloc_tag *tag;
-
-	if (!mem_alloc_profiling_enabled())
-		return;
-
-	tag = __pgalloc_tag_get(page);
-	if (tag)
-		this_cpu_sub(tag->counters->bytes, PAGE_SIZE * nr);
-}
-
 void pgalloc_tag_split(struct folio *folio, int old_order, int new_order);
 void pgalloc_tag_swap(struct folio *new, struct folio *old);
 
@@ -242,10 +196,6 @@ void __init alloc_tag_sec_init(void);
 #else /* CONFIG_MEM_ALLOC_PROFILING */
 
 static inline void clear_page_tag_ref(struct page *page) {}
-static inline void pgalloc_tag_add(struct page *page, struct task_struct *task,
-				   unsigned int nr) {}
-static inline void pgalloc_tag_sub(struct page *page, unsigned int nr) {}
-static inline void pgalloc_tag_sub_pages(struct page *page, unsigned int nr) {}
 static inline void alloc_tag_sec_init(void) {}
 static inline void pgalloc_tag_split(struct folio *folio, int old_order, int new_order) {}
 static inline void pgalloc_tag_swap(struct folio *new, struct folio *old) {}
diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index 55ed2f245f80..67e205286dbf 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -1041,6 +1041,84 @@ static void kernel_init_pages(struct page *page, int numpages)
 	kasan_enable_current();
 }
 
+#ifdef CONFIG_MEM_ALLOC_PROFILING
+
+/* Should be called only if mem_alloc_profiling_enabled() */
+void __clear_page_tag_ref(struct page *page)
+{
+	union pgtag_ref_handle handle;
+	union codetag_ref ref;
+
+	if (get_page_tag_ref(page, &ref, &handle)) {
+		set_codetag_empty(&ref);
+		update_page_tag_ref(handle, &ref);
+		put_page_tag_ref(handle);
+	}
+}
+
+/* Should be called only if mem_alloc_profiling_enabled() */
+static inline_if_mem_alloc_prof
+void __pgalloc_tag_add(struct page *page, struct task_struct *task,
+		       unsigned int nr)
+{
+	union pgtag_ref_handle handle;
+	union codetag_ref ref;
+
+	if (get_page_tag_ref(page, &ref, &handle)) {
+		alloc_tag_add(&ref, task->alloc_tag, PAGE_SIZE * nr);
+		update_page_tag_ref(handle, &ref);
+		put_page_tag_ref(handle);
+	}
+}
+
+static inline void pgalloc_tag_add(struct page *page, struct task_struct *task,
+				   unsigned int nr)
+{
+	if (mem_alloc_profiling_enabled())
+		__pgalloc_tag_add(page, task, nr);
+}
+
+/* Should be called only if mem_alloc_profiling_enabled() */
+static inline_if_mem_alloc_prof
+void __pgalloc_tag_sub(struct page *page, unsigned int nr)
+{
+	union pgtag_ref_handle handle;
+	union codetag_ref ref;
+
+	if (get_page_tag_ref(page, &ref, &handle)) {
+		alloc_tag_sub(&ref, PAGE_SIZE * nr);
+		update_page_tag_ref(handle, &ref);
+		put_page_tag_ref(handle);
+	}
+}
+
+static inline void pgalloc_tag_sub(struct page *page, unsigned int nr)
+{
+	if (mem_alloc_profiling_enabled())
+		__pgalloc_tag_sub(page, nr);
+}
+
+static inline void pgalloc_tag_sub_pages(struct page *page, unsigned int nr)
+{
+	struct alloc_tag *tag;
+
+	if (!mem_alloc_profiling_enabled())
+		return;
+
+	tag = __pgalloc_tag_get(page);
+	if (tag)
+		this_cpu_sub(tag->counters->bytes, PAGE_SIZE * nr);
+}
+
+#else /* CONFIG_MEM_ALLOC_PROFILING */
+
+static inline void pgalloc_tag_add(struct page *page, struct task_struct *task,
+				   unsigned int nr) {}
+static inline void pgalloc_tag_sub(struct page *page, unsigned int nr) {}
+static inline void pgalloc_tag_sub_pages(struct page *page, unsigned int nr) {}
+
+#endif /* CONFIG_MEM_ALLOC_PROFILING */
+
 __always_inline bool free_pages_prepare(struct page *page,
 			unsigned int order)
 {
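Note: the inline_if_mem_alloc_prof attribute used above comes from an
earlier patch in this series, which is not shown here. A plausible
reconstruction of its definition, inferred from the behavior the
changelog describes (treat this as an assumption, not the exact code):

  /*
   * Assumed definition: keep the __pgalloc_tag_* helpers inline when
   * the static key is on by default; uninline them when it is off by
   * default, so their cold bodies stay out of the hot icache.
   */
  #ifdef CONFIG_MEM_ALLOC_PROFILING_ENABLED_BY_DEFAULT
  #define inline_if_mem_alloc_prof	inline
  #else
  #define inline_if_mem_alloc_prof	noinline
  #endif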