From patchwork Tue May 21 23:31:00 2024
X-Patchwork-Submitter: Mateusz Guzik
X-Patchwork-Id: 13669811
From: Mateusz Guzik
To: dennis@kernel.org
Cc: tj@kernel.org, hughd@google.com, akpm@linux-foundation.org,
 vbabka@suse.cz, linux-kernel@vger.kernel.org, linux-mm@kvack.org,
 Mateusz Guzik
Subject: [PATCH v3] percpu_counter: add a cmpxchg-based _add_batch variant
Date: Wed, 22 May 2024 01:31:00 +0200
Message-ID: <20240521233100.358002-1-mjguzik@gmail.com>

Interrupt disable/enable trips are quite expensive on x86-64 compared to
a mere cmpxchg (note: no lock prefix!) and percpu counters are used
quite often.

With this change I get a bump of 1% ops/s for negative path lookups,
plugged into will-it-scale:

void testcase(unsigned long long *iterations, unsigned long nr)
{
	while (1) {
		int fd = open("/tmp/nonexistent", O_RDONLY);
		assert(fd == -1);

		(*iterations)++;
	}
}

The win would be higher if it was not for other slowdowns, but one has
to start somewhere.

Signed-off-by: Mateusz Guzik
Acked-by: Vlastimil Babka
---
v3:
- add a missing word to the new comment

v2:
- dodge preemption
- use this_cpu_try_cmpxchg
- keep the old variant depending on CONFIG_HAVE_CMPXCHG_LOCAL

 lib/percpu_counter.c | 44 +++++++++++++++++++++++++++++++++++++++-----
 1 file changed, 39 insertions(+), 5 deletions(-)

diff --git a/lib/percpu_counter.c b/lib/percpu_counter.c
index 44dd133594d4..c3140276bb36 100644
--- a/lib/percpu_counter.c
+++ b/lib/percpu_counter.c
@@ -73,17 +73,50 @@ void percpu_counter_set(struct percpu_counter *fbc, s64 amount)
 EXPORT_SYMBOL(percpu_counter_set);
 
 /*
- * local_irq_save() is needed to make the function irq safe:
- * - The slow path would be ok as protected by an irq-safe spinlock.
- * - this_cpu_add would be ok as it is irq-safe by definition.
- * But:
- * The decision slow path/fast path and the actual update must be atomic, too.
+ * Add to a counter while respecting batch size.
+ *
+ * There are 2 implementations, both dealing with the following problem:
+ *
+ * The decision slow path/fast path and the actual update must be atomic.
  * Otherwise a call in process context could check the current values and
  * decide that the fast path can be used. If now an interrupt occurs before
  * the this_cpu_add(), and the interrupt updates this_cpu(*fbc->counters),
  * then the this_cpu_add() that is executed after the interrupt has completed
  * can produce values larger than "batch" or even overflows.
  */
+#ifdef CONFIG_HAVE_CMPXCHG_LOCAL
+/*
+ * Safety against interrupts is achieved in 2 ways:
+ * 1. the fast path uses local cmpxchg (note: no lock prefix)
+ * 2. the slow path operates with interrupts disabled
+ */
+void percpu_counter_add_batch(struct percpu_counter *fbc, s64 amount, s32 batch)
+{
+	s64 count;
+	unsigned long flags;
+
+	count = this_cpu_read(*fbc->counters);
+	do {
+		if (unlikely(abs(count + amount) >= batch)) {
+			raw_spin_lock_irqsave(&fbc->lock, flags);
+			/*
+			 * Note: by now we might have migrated to another CPU
+			 * or the value might have changed.
+			 */
+			count = __this_cpu_read(*fbc->counters);
+			fbc->count += count + amount;
+			__this_cpu_sub(*fbc->counters, count);
+			raw_spin_unlock_irqrestore(&fbc->lock, flags);
+			return;
+		}
+	} while (!this_cpu_try_cmpxchg(*fbc->counters, &count, count + amount));
+}
+#else
+/*
+ * local_irq_save() is used to make the function irq safe:
+ * - The slow path would be ok as protected by an irq-safe spinlock.
+ * - this_cpu_add would be ok as it is irq-safe by definition.
+ */
 void percpu_counter_add_batch(struct percpu_counter *fbc, s64 amount, s32 batch)
 {
 	s64 count;
@@ -101,6 +134,7 @@ void percpu_counter_add_batch(struct percpu_counter *fbc, s64 amount, s32 batch)
 	}
 	local_irq_restore(flags);
 }
+#endif
 EXPORT_SYMBOL(percpu_counter_add_batch);
 
 /*
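
For readers who want to poke at the fast path/slow path split outside the
kernel, here is a rough userspace sketch of the same idea. It is not the
kernel code: struct sketch_counter and the sketch_* helpers are made-up
names, a single C11 atomic stands in for one CPU's slot, and an atomic_flag
spinlock stands in for fbc->lock. Only the control flow (try a
compare-exchange, fall back to the lock once the batch is reached) is meant
to carry over.

```c
#include <stdatomic.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical userspace stand-in for struct percpu_counter. */
struct sketch_counter {
	atomic_flag lock;	/* plays the role of fbc->lock */
	int64_t count;		/* global value, only touched under lock */
	_Atomic int64_t local;	/* plays the role of one per-CPU slot */
};

static void sketch_init(struct sketch_counter *c)
{
	atomic_flag_clear(&c->lock);
	c->count = 0;
	atomic_init(&c->local, 0);
}

static void sketch_lock(struct sketch_counter *c)
{
	while (atomic_flag_test_and_set_explicit(&c->lock, memory_order_acquire))
		;	/* spin */
}

static void sketch_unlock(struct sketch_counter *c)
{
	atomic_flag_clear_explicit(&c->lock, memory_order_release);
}

static void sketch_add_batch(struct sketch_counter *c, int64_t amount,
			     int32_t batch)
{
	int64_t count = atomic_load_explicit(&c->local, memory_order_relaxed);

	do {
		if (llabs((long long)(count + amount)) >= batch) {
			/* Slow path: fold the local slot into the global count. */
			sketch_lock(c);
			/*
			 * The slot may have changed since the first load, so
			 * take whatever is there now and zero it in one step.
			 */
			count = atomic_exchange_explicit(&c->local, 0,
							 memory_order_relaxed);
			c->count += count + amount;
			sketch_unlock(c);
			return;
		}
		/*
		 * Fast path: on failure, count is refreshed with the current
		 * slot value and the batch check above runs again.
		 */
	} while (!atomic_compare_exchange_weak_explicit(&c->local, &count,
							count + amount,
							memory_order_relaxed,
							memory_order_relaxed));
}
```

With batch = 32, adding 5 stays in the local slot, and a further 30 crosses
the threshold and moves all 35 into the global count. The
atomic_exchange() in the slow path is the userspace substitute for the
__this_cpu_read()/__this_cpu_sub() pair, which the kernel can afford only
because interrupts are disabled at that point.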