From patchwork Wed Mar 15 08:49:35 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Dave Chinner X-Patchwork-Id: 13175493 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 0527FC61DA4 for ; Wed, 15 Mar 2023 08:50:01 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S231770AbjCOIt6 (ORCPT ); Wed, 15 Mar 2023 04:49:58 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:55408 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S231576AbjCOIt4 (ORCPT ); Wed, 15 Mar 2023 04:49:56 -0400 Received: from mail-pg1-x532.google.com (mail-pg1-x532.google.com [IPv6:2607:f8b0:4864:20::532]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 0D0495D474 for ; Wed, 15 Mar 2023 01:49:52 -0700 (PDT) Received: by mail-pg1-x532.google.com with SMTP id s17so10342186pgv.4 for ; Wed, 15 Mar 2023 01:49:52 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=fromorbit-com.20210112.gappssmtp.com; s=20210112; t=1678870192; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:from:to:cc:subject:date :message-id:reply-to; bh=YkxChdLDa4UFijk1Jheg8QY/La3zPetPy6md4h64POE=; b=QjRMzeMPM7bEXlB6ydwS837/NFUyQ8FTlay7FyNeoNs0VDwU6tSXowOh5PEijcS3y8 sD6nxAC6h4K1cRCQD9qu0IFALoN8EolNk7Bid2nRlhUXSKUTeDvUpAvWMnwt+HPHUQx/ KyyxJcLdPcRikCzCFdYvFjT+Mqx43i8pHeTDS+TQTsjpdDOaSGXHR1WhZ00ymwBzz/6L oUBDFOeHVrVNwCUzNaMl/ntSSHv31y6BC2N0XzgLSBSKpat6+edLAyQmzHqhhsxZ7a1M lj1RUSkyW5Ia0Sp/kwGQHE1me5OM6WQIgiCDHeaI+wyfJ3J//4uGOBLUlhHOul1FoJei V5fQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; t=1678870192; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=YkxChdLDa4UFijk1Jheg8QY/La3zPetPy6md4h64POE=; b=3DYdEXpNt8hFGPd/0TWpkIrVD10Cbz19oBNOdSTmMCC72x7Gfp5WJeVffLV2v7J6U8 PX5Ea/U2vAO/aIw+jQ+CTm9Ps/nxVegO5szccoR/tIBHXpbZjBJRlOasm0m47xF1x9fa 4+Zui60+sc0cr5VTa6XCDSenysHhkT0UCn+fsjpesmA9Nyf/a+a2wFmXQH8GpmPJ9IzN oLUTXKlca5J/QDdXjXAskztT+NFjNbACF/m6hHGV/Me1sVqTTYBbqWsuOeD85hT35UoG gSx9O04rDso+ngUjBa9azSpHmwV2gcZ38TJCliVshbPMbYxgyV8t08K6EuCmaFKyu4C5 pdoQ== X-Gm-Message-State: AO0yUKUQABLRLB+Z2vPIkQcByKnmo+KOlu58Z+rCsLAcu1CYPHYZWbKD whIfa1J4b0FPAMwYKN9rpVfgPA== X-Google-Smtp-Source: AK7set+PAG3sniqSn+uEJTES+NuHK2ZCXd+V9ulDUeNZd08I0VS+GAnsdTPkVoBU/h/tKlRjkhWWSg== X-Received: by 2002:aa7:8ec1:0:b0:625:8217:15b9 with SMTP id b1-20020aa78ec1000000b00625821715b9mr3081112pfr.2.1678870191951; Wed, 15 Mar 2023 01:49:51 -0700 (PDT) Received: from dread.disaster.area (pa49-186-4-237.pa.vic.optusnet.com.au. [49.186.4.237]) by smtp.gmail.com with ESMTPSA id i17-20020aa787d1000000b0058d9058fe8asm2966024pfo.103.2023.03.15.01.49.50 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Wed, 15 Mar 2023 01:49:51 -0700 (PDT) Received: from [192.168.253.23] (helo=devoid.disaster.area) by dread.disaster.area with esmtp (Exim 4.92.3) (envelope-from ) id 1pcMpQ-008zeS-2N; Wed, 15 Mar 2023 19:49:48 +1100 Received: from dave by devoid.disaster.area with local (Exim 4.96) (envelope-from ) id 1pcMpQ-00Ag6L-0D; Wed, 15 Mar 2023 19:49:48 +1100 From: Dave Chinner To: linux-kernel@vger.kernel.org, linux-xfs@vger.kernel.org Cc: linux-mm@vger.kernel.org, linux-fsdevel@vger.kernel.org, yebin10@huawei.com Subject: [PATCH 1/4] cpumask: introduce for_each_cpu_or Date: Wed, 15 Mar 2023 19:49:35 +1100 Message-Id: <20230315084938.2544737-2-david@fromorbit.com> X-Mailer: git-send-email 2.39.2 In-Reply-To: <20230315084938.2544737-1-david@fromorbit.com> References: <20230315084938.2544737-1-david@fromorbit.com> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-fsdevel@vger.kernel.org From: Dave Chinner Equivalent of for_each_cpu_and, except it ORs the two masks together so it iterates all the CPUs present in either mask. Signed-off-by: Dave Chinner --- include/linux/cpumask.h | 17 +++++++++++++++++ include/linux/find.h | 37 +++++++++++++++++++++++++++++++++++++ lib/find_bit.c | 9 +++++++++ 3 files changed, 63 insertions(+) diff --git a/include/linux/cpumask.h b/include/linux/cpumask.h index d4901ca8883c..ca736b05ec7b 100644 --- a/include/linux/cpumask.h +++ b/include/linux/cpumask.h @@ -350,6 +350,23 @@ unsigned int __pure cpumask_next_wrap(int n, const struct cpumask *mask, int sta #define for_each_cpu_andnot(cpu, mask1, mask2) \ for_each_andnot_bit(cpu, cpumask_bits(mask1), cpumask_bits(mask2), small_cpumask_bits) +/** + * for_each_cpu_or - iterate over every cpu present in either mask + * @cpu: the (optionally unsigned) integer iterator + * @mask1: the first cpumask pointer + * @mask2: the second cpumask pointer + * + * This saves a temporary CPU mask in many places. It is equivalent to: + * struct cpumask tmp; + * cpumask_or(&tmp, &mask1, &mask2); + * for_each_cpu(cpu, &tmp) + * ... + * + * After the loop, cpu is >= nr_cpu_ids. + */ +#define for_each_cpu_or(cpu, mask1, mask2) \ + for_each_or_bit(cpu, cpumask_bits(mask1), cpumask_bits(mask2), small_cpumask_bits) + /** * cpumask_any_but - return a "random" in a cpumask, but not this one. * @mask: the cpumask to search diff --git a/include/linux/find.h b/include/linux/find.h index 4647864a5ffd..5e4f39ef2e72 100644 --- a/include/linux/find.h +++ b/include/linux/find.h @@ -14,6 +14,8 @@ unsigned long _find_next_and_bit(const unsigned long *addr1, const unsigned long unsigned long nbits, unsigned long start); unsigned long _find_next_andnot_bit(const unsigned long *addr1, const unsigned long *addr2, unsigned long nbits, unsigned long start); +unsigned long _find_next_or_bit(const unsigned long *addr1, const unsigned long *addr2, + unsigned long nbits, unsigned long start); unsigned long _find_next_zero_bit(const unsigned long *addr, unsigned long nbits, unsigned long start); extern unsigned long _find_first_bit(const unsigned long *addr, unsigned long size); @@ -127,6 +129,36 @@ unsigned long find_next_andnot_bit(const unsigned long *addr1, } #endif +#ifndef find_next_or_bit +/** + * find_next_or_bit - find the next set bit in either memory regions + * @addr1: The first address to base the search on + * @addr2: The second address to base the search on + * @size: The bitmap size in bits + * @offset: The bitnumber to start searching at + * + * Returns the bit number for the next set bit + * If no bits are set, returns @size. + */ +static inline +unsigned long find_next_or_bit(const unsigned long *addr1, + const unsigned long *addr2, unsigned long size, + unsigned long offset) +{ + if (small_const_nbits(size)) { + unsigned long val; + + if (unlikely(offset >= size)) + return size; + + val = (*addr1 | *addr2) & GENMASK(size - 1, offset); + return val ? __ffs(val) : size; + } + + return _find_next_or_bit(addr1, addr2, size, offset); +} +#endif + #ifndef find_next_zero_bit /** * find_next_zero_bit - find the next cleared bit in a memory region @@ -536,6 +568,11 @@ unsigned long find_next_bit_le(const void *addr, unsigned (bit) = find_next_andnot_bit((addr1), (addr2), (size), (bit)), (bit) < (size);\ (bit)++) +#define for_each_or_bit(bit, addr1, addr2, size) \ + for ((bit) = 0; \ + (bit) = find_next_or_bit((addr1), (addr2), (size), (bit)), (bit) < (size);\ + (bit)++) + /* same as for_each_set_bit() but use bit as value to start with */ #define for_each_set_bit_from(bit, addr, size) \ for (; (bit) = find_next_bit((addr), (size), (bit)), (bit) < (size); (bit)++) diff --git a/lib/find_bit.c b/lib/find_bit.c index c10920e66788..32f99e9a670e 100644 --- a/lib/find_bit.c +++ b/lib/find_bit.c @@ -182,6 +182,15 @@ unsigned long _find_next_andnot_bit(const unsigned long *addr1, const unsigned l EXPORT_SYMBOL(_find_next_andnot_bit); #endif +#ifndef find_next_or_bit +unsigned long _find_next_or_bit(const unsigned long *addr1, const unsigned long *addr2, + unsigned long nbits, unsigned long start) +{ + return FIND_NEXT_BIT(addr1[idx] | addr2[idx], /* nop */, nbits, start); +} +EXPORT_SYMBOL(_find_next_or_bit); +#endif + #ifndef find_next_zero_bit unsigned long _find_next_zero_bit(const unsigned long *addr, unsigned long nbits, unsigned long start) From patchwork Wed Mar 15 08:49:36 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Dave Chinner X-Patchwork-Id: 13175496 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id C1E95C7618E for ; Wed, 15 Mar 2023 08:50:05 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S231806AbjCOIuD (ORCPT ); Wed, 15 Mar 2023 04:50:03 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:55368 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S231470AbjCOIt4 (ORCPT ); Wed, 15 Mar 2023 04:49:56 -0400 Received: from mail-pg1-x532.google.com (mail-pg1-x532.google.com [IPv6:2607:f8b0:4864:20::532]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 1D6305D47A for ; Wed, 15 Mar 2023 01:49:53 -0700 (PDT) Received: by mail-pg1-x532.google.com with SMTP id d10so10321407pgt.12 for ; Wed, 15 Mar 2023 01:49:52 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=fromorbit-com.20210112.gappssmtp.com; s=20210112; t=1678870192; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:from:to:cc:subject:date :message-id:reply-to; bh=tkFj6yYLyA565/QhOvX6PgrnENIUFczCaPlTlQYQGRk=; b=P8lBfpG7IED6FPgo2ZB7apKwy3nUll9jUKZ8XzlrGpjDtYG8AbCcwo7zvIxIedMAi2 H6VZUAxroyRe59ECAAvynvkq8QSQkYykDwiNYazLkZA+lH3wCju9gPC+hT8a43f2BmoA j9RpQWC41i0mNupI+oXacSSvjKAGJ4+vgvsFXxD4G/haDpYnbt0PZ0RSkuEWQjljHrLG hGZcIKjas3JKkHHKlgv2t0oVs/hWgaL7pp9XDcpTbRs3hXgx0O3YC0Xiy7NF2n3Ej7tU RslGq9jpYn6NsD498EG4rurP0SM0c4G4pWSHhohSvWR1DWUZXDA7LXMdnh2+YmirVJdi 9A6Q== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; t=1678870192; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=tkFj6yYLyA565/QhOvX6PgrnENIUFczCaPlTlQYQGRk=; b=dRpC7LmQRjPtYoaFIBfpFSaiDUgddpP2vHHhlsyTTNTyYIfr+AylmpBBfjp3iM7qhS HBucFibRJThtSY5l7USDbKf/pv6Ua9aq51sZBd4pSoxNIx3TUJBvAy1rxjLeMGLPTVBx NVlmshgSLjbnNM+Jdoc4p5PePNIoQXaN++amS/v6rewHRvTo9zeGNEm558e6zKbfgHEl eJOzVjN5F38HNgADHQQCsA0YIekMPYRkWouUdrOcZCFilPEerg2kXXfYD2gC4NKJ03RW u/+vqOpd8Z8gqWD0ziQ8PEWRpImU5Wv6Lf19a6G94nQ04p1+5oNxXoCtZw0FB0n6r0TY aLvA== X-Gm-Message-State: AO0yUKWdVmQ34fBRO+HW1PkVTfpQj3zRYD7GGTTdgD1EjjZsnbY8ptjv KSwth0UZuOx2qhhRyXu2eJ0b8rhd6JE30QXAWn4= X-Google-Smtp-Source: AK7set9m56zOUn9/VDRZB2SidWtzW+u6ZCwpdH8KJIhu6v/dpLkJCqq28QRPyVmDBf4S/L86vLXNpQ== X-Received: by 2002:a05:6a00:41:b0:625:5560:696e with SMTP id i1-20020a056a00004100b006255560696emr5343794pfk.16.1678870192212; Wed, 15 Mar 2023 01:49:52 -0700 (PDT) Received: from dread.disaster.area (pa49-186-4-237.pa.vic.optusnet.com.au. [49.186.4.237]) by smtp.gmail.com with ESMTPSA id h11-20020a62b40b000000b0062505afff9fsm2971980pfn.126.2023.03.15.01.49.50 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Wed, 15 Mar 2023 01:49:51 -0700 (PDT) Received: from [192.168.253.23] (helo=devoid.disaster.area) by dread.disaster.area with esmtp (Exim 4.92.3) (envelope-from ) id 1pcMpQ-008zeT-3D; Wed, 15 Mar 2023 19:49:48 +1100 Received: from dave by devoid.disaster.area with local (Exim 4.96) (envelope-from ) id 1pcMpQ-00Ag6P-0I; Wed, 15 Mar 2023 19:49:48 +1100 From: Dave Chinner To: linux-kernel@vger.kernel.org, linux-xfs@vger.kernel.org Cc: linux-mm@vger.kernel.org, linux-fsdevel@vger.kernel.org, yebin10@huawei.com Subject: [PATCH 2/4] pcpcntrs: fix dying cpu summation race Date: Wed, 15 Mar 2023 19:49:36 +1100 Message-Id: <20230315084938.2544737-3-david@fromorbit.com> X-Mailer: git-send-email 2.39.2 In-Reply-To: <20230315084938.2544737-1-david@fromorbit.com> References: <20230315084938.2544737-1-david@fromorbit.com> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-fsdevel@vger.kernel.org From: Dave Chinner In commit f689054aace2 ("percpu_counter: add percpu_counter_sum_all interface") a race condition between a cpu dying and percpu_counter_sum() iterating online CPUs was identified. The solution was to iterate all possible CPUs for summation via percpu_counter_sum_all(). We recently had a percpu_counter_sum() call in XFS trip over this same race condition and it fired a debug assert because the filesystem was unmounting and the counter *should* be zero just before we destroy it. That was reported here: https://lore.kernel.org/linux-kernel/20230314090649.326642-1-yebin@huaweicloud.com/ likely as a result of running generic/648 which exercises filesystems in the presence of CPU online/offline events. The solution to use percpu_counter_sum_all() is an awful one. We use percpu counters and percpu_counter_sum() for accurate and reliable threshold detection for space management, so a summation race condition during these operations can result in overcommit of available space and that may result in filesystem shutdowns. As percpu_counter_sum_all() iterates all possible CPUs rather than just those online or even those present, the mask can include CPUs that aren't even installed in the machine, or in the case of machines that can hot-plug CPU capable nodes, even have physical sockets present in the machine. Fundamentally, this race condition is caused by the CPU being offlined being removed from the cpu_online_mask before the notifier that cleans up per-cpu state is run. Hence percpu_counter_sum() will not sum the count for a cpu currently being taken offline, regardless of whether the notifier has run or not. This is the root cause of the bug. The percpu counter notifier iterates all the registered counters, locks the counter and moves the percpu count to the global sum. This is serialised against other operations that move the percpu counter to the global sum as well as percpu_counter_sum() operations that sum the percpu counts while holding the counter lock. Hence the notifier is safe to run concurrently with sum operations, and the only thing we actually need to care about is that percpu_counter_sum() iterates dying CPUs. That's trivial to do, and when there are no CPUs dying, it has no addition overhead except for a cpumask_or() operation. This change makes percpu_counter_sum() always do the right thing in the presence of CPU hot unplug events and makes percpu_counter_sum_all() unnecessary. This, in turn, means that filesystems like XFS, ext4, and btrfs don't have to work out when they should use percpu_counter_sum() vs percpu_counter_sum_all() in their space accounting algorithms Signed-off-by: Dave Chinner --- lib/percpu_counter.c | 15 ++++++++++++--- 1 file changed, 12 insertions(+), 3 deletions(-) diff --git a/lib/percpu_counter.c b/lib/percpu_counter.c index dba56c5c1837..0e096311e0c0 100644 --- a/lib/percpu_counter.c +++ b/lib/percpu_counter.c @@ -131,7 +131,7 @@ static s64 __percpu_counter_sum_mask(struct percpu_counter *fbc, raw_spin_lock_irqsave(&fbc->lock, flags); ret = fbc->count; - for_each_cpu(cpu, cpu_mask) { + for_each_cpu_or(cpu, cpu_online_mask, cpu_mask) { s32 *pcount = per_cpu_ptr(fbc->counters, cpu); ret += *pcount; } @@ -141,11 +141,20 @@ static s64 __percpu_counter_sum_mask(struct percpu_counter *fbc, /* * Add up all the per-cpu counts, return the result. This is a more accurate - * but much slower version of percpu_counter_read_positive() + * but much slower version of percpu_counter_read_positive(). + * + * We use the cpu mask of (cpu_online_mask | cpu_dying_mask) to capture sums + * from CPUs that are in the process of being taken offline. Dying cpus have + * been removed from the online mask, but may not have had the hotplug dead + * notifier called to fold the percpu count back into the global counter sum. + * By including dying CPUs in the iteration mask, we avoid this race condition + * so __percpu_counter_sum() just does the right thing when CPUs are being taken + * offline. */ s64 __percpu_counter_sum(struct percpu_counter *fbc) { - return __percpu_counter_sum_mask(fbc, cpu_online_mask); + + return __percpu_counter_sum_mask(fbc, cpu_dying_mask); } EXPORT_SYMBOL(__percpu_counter_sum); From patchwork Wed Mar 15 08:49:37 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Dave Chinner X-Patchwork-Id: 13175492 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 4953BC7618B for ; Wed, 15 Mar 2023 08:49:59 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S231663AbjCOIt5 (ORCPT ); Wed, 15 Mar 2023 04:49:57 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:55378 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S231794AbjCOIty (ORCPT ); Wed, 15 Mar 2023 04:49:54 -0400 Received: from mail-pl1-x62c.google.com (mail-pl1-x62c.google.com [IPv6:2607:f8b0:4864:20::62c]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 8129C5C11F for ; Wed, 15 Mar 2023 01:49:52 -0700 (PDT) Received: by mail-pl1-x62c.google.com with SMTP id a2so19279947plm.4 for ; Wed, 15 Mar 2023 01:49:52 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=fromorbit-com.20210112.gappssmtp.com; s=20210112; t=1678870192; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:from:to:cc:subject:date :message-id:reply-to; bh=3PiH3qSD87Gy2R0sEveTfXmyJns2RRHbdlIcrLyx7vY=; b=dGXEAZWGtw3hjf4evG9X6AHe1yTsnbPEjFBkPoP86wCPr8QN6TVnbPOULBMveWDVoT jiIftWXhqKYSzp0qmhThWBeyCKsD4Tm6Xp+7PMte7cJF7wgzVLSiYssGL15uSOr3lvfV rQd66fF+t9UQq3jhdoFYBw6GCsPIQuVOxn5UolXc0160hrbGsy9zlV4+B/qhV9yCDS7F H/CMxjUw1Z+tGw+1u0CeYT7yhOpK5oiWVK6GvR3bPJpgJzz75iWIvVxMm4iFmXRulZ4+ jvTJP5PpwGUCDBuh+DdFDKIgLX1jwAvPcxNGWCCVoZa7qwZqgDG/cKCvCt3mmTsN4n4r PO9Q== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; t=1678870192; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=3PiH3qSD87Gy2R0sEveTfXmyJns2RRHbdlIcrLyx7vY=; b=USIRaZzSBmyElw4k13eWycgXe/zTECKo6YXa7sFRAzLI9ECz6xb6VpexNJN6oA97MA jMT/D+hg5cpKXyrAwJaS/Pyc4nzDZGDpqhcH0uziW8FiUWlvQUvIqJbWTOLR0ayi9ioC v3CEBuJPZ6DkV+TjVurMtn+AnNbmBA9a/lPezuANgoG17efer/OJvNQCH7M7Ls3rWKMY 7rlzknZ+NBV5WttMiolbx+NwlkhN3lM04FnWupjwCSZwqAyEQJ9gaxzLQjjLuhlUxHCG ZmvMDbZGmUGA42SXS2CiNksC1LsOqDRKguftM8clGOD+OS9+mmJrE0R7jvp2x/Pcmhcw fuKg== X-Gm-Message-State: AO0yUKWdEvMqtGSQX9jGH11Zlw2guT+cTT5ZX6mCGF8OHI9jdp34Sj9T GuPHgBTMoFPf6TMFrs5XFW/agA== X-Google-Smtp-Source: AK7set/d5shADEggOumzehf9o4A4WvFVUPs23yylPGUIObnfMjjp0YKBaZtfOr+A8wcBYihNDIQpPQ== X-Received: by 2002:a17:902:e541:b0:19a:70f9:affb with SMTP id n1-20020a170902e54100b0019a70f9affbmr2478057plf.2.1678870191825; Wed, 15 Mar 2023 01:49:51 -0700 (PDT) Received: from dread.disaster.area (pa49-186-4-237.pa.vic.optusnet.com.au. [49.186.4.237]) by smtp.gmail.com with ESMTPSA id kk5-20020a170903070500b0019a928a8982sm3096849plb.118.2023.03.15.01.49.50 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Wed, 15 Mar 2023 01:49:51 -0700 (PDT) Received: from [192.168.253.23] (helo=devoid.disaster.area) by dread.disaster.area with esmtp (Exim 4.92.3) (envelope-from ) id 1pcMpQ-008zeU-45; Wed, 15 Mar 2023 19:49:48 +1100 Received: from dave by devoid.disaster.area with local (Exim 4.96) (envelope-from ) id 1pcMpQ-00Ag6T-0N; Wed, 15 Mar 2023 19:49:48 +1100 From: Dave Chinner To: linux-kernel@vger.kernel.org, linux-xfs@vger.kernel.org Cc: linux-mm@vger.kernel.org, linux-fsdevel@vger.kernel.org, yebin10@huawei.com Subject: [PATCH 3/4] fork: remove use of percpu_counter_sum_all Date: Wed, 15 Mar 2023 19:49:37 +1100 Message-Id: <20230315084938.2544737-4-david@fromorbit.com> X-Mailer: git-send-email 2.39.2 In-Reply-To: <20230315084938.2544737-1-david@fromorbit.com> References: <20230315084938.2544737-1-david@fromorbit.com> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-fsdevel@vger.kernel.org From: Dave Chinner This effectively reverts the change made in commit f689054aace2 ("percpu_counter: add percpu_counter_sum_all interface") as the race condition percpu_counter_sum_all() was invented to avoid is now handled directly in percpu_counter_sum() and nobody needs to care about summing racing with cpu unplug anymore. Signed-off-by: Dave Chinner --- kernel/fork.c | 5 ----- 1 file changed, 5 deletions(-) diff --git a/kernel/fork.c b/kernel/fork.c index d8cda4c6de6c..c0257cbee093 100644 --- a/kernel/fork.c +++ b/kernel/fork.c @@ -755,11 +755,6 @@ static void check_mm(struct mm_struct *mm) for (i = 0; i < NR_MM_COUNTERS; i++) { long x = percpu_counter_sum(&mm->rss_stat[i]); - if (likely(!x)) - continue; - - /* Making sure this is not due to race with CPU offlining. */ - x = percpu_counter_sum_all(&mm->rss_stat[i]); if (unlikely(x)) pr_alert("BUG: Bad rss-counter state mm:%p type:%s val:%ld\n", mm, resident_page_types[i], x); From patchwork Wed Mar 15 08:49:38 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Dave Chinner X-Patchwork-Id: 13175494 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 26479C76195 for ; Wed, 15 Mar 2023 08:50:03 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S231534AbjCOIt4 (ORCPT ); Wed, 15 Mar 2023 04:49:56 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:55346 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S231788AbjCOIty (ORCPT ); Wed, 15 Mar 2023 04:49:54 -0400 Received: from mail-pl1-x62c.google.com (mail-pl1-x62c.google.com [IPv6:2607:f8b0:4864:20::62c]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 2C25716AF6 for ; Wed, 15 Mar 2023 01:49:52 -0700 (PDT) Received: by mail-pl1-x62c.google.com with SMTP id c18so686855ple.11 for ; Wed, 15 Mar 2023 01:49:52 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=fromorbit-com.20210112.gappssmtp.com; s=20210112; t=1678870191; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:from:to:cc:subject:date :message-id:reply-to; bh=DnnMb4Et7WhpEe76RPSGsI6qWidW0Ymzw0QUU6G4yao=; b=MldnmFi+QRjEcfycD9KY0bZvSKUHkvSL3UtA0U/PWvEzo8AB2GYhsOMJ9Yna93phsr 3pmBXGrj79AYyoeIfCO7JsUKNVP8tmqlS8mY/FTf5EtAxhPN1xd1Fm9/pzGq/ZdxfhQz eU2tuyfglF9o3N9mm4hhAzPHlUfBjcvnSq9Nb2s84Jjh61ufX2fH3XB7ShgyRL4j80Ax QyS9p698XnwvyGRgs0e4BrNF4ilCCsUmxrRDQcMuw618JIheaRe7R9Qmg/xhef7MwYkb GmW1c4Xe7Xs5vN1n3/NKD+0FvXsYvic5EsCMGBEC3z5N+9iBqt2u48HdxYZQHcZIPME0 idug== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; t=1678870191; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=DnnMb4Et7WhpEe76RPSGsI6qWidW0Ymzw0QUU6G4yao=; b=204lpPDgR6h2n5cxVjXGClyVAjcDBW7jSoZnKC1I+5XSGgKce24Tv2pvzEVYz8lKA0 lxPvnxwZl0nQc7elB6r6c60gSBrnng3EY8YEabIYIQr4gSBAFhDB0mlztmGVzjspL1K+ +m/HdtDDo04A8D3VLTs+C1xTBBV7HOTuRVTwW228Lw8PLnKD5YLhhAwRSD+0zPNdDTxp lKHJNhkTjMCVsUqVGNe3l8Z3sgz5gY423XeT7e4FI9SSlmNr5sKzFWo+ZbTaTzUTYX8s WszO7GHPK+kwNz/Cuid6jP8mttS6/WLo2C/4ICWY84WEjVQndsC+RWxEWgjW7ydcI34t iQPA== X-Gm-Message-State: AO0yUKWJI4ekJdVYIn3ThJPK9ecGAaZ7UoYjaKx9YSxXHSE49AtSfCna sVByDdSim+Ip4kWk1LvCIcAoSyCf05il13d5+c8= X-Google-Smtp-Source: AK7set8ONEq6R2DicXybExxV1KAqb5LJDX736BB8nt9+Drt0g4Wcxf32rD3Zg0NHbWPaIIoqz/KYqA== X-Received: by 2002:a05:6a20:7d88:b0:cc:32a8:323d with SMTP id v8-20020a056a207d8800b000cc32a8323dmr15446982pzj.28.1678870191517; Wed, 15 Mar 2023 01:49:51 -0700 (PDT) Received: from dread.disaster.area (pa49-186-4-237.pa.vic.optusnet.com.au. [49.186.4.237]) by smtp.gmail.com with ESMTPSA id e29-20020a63501d000000b0050be183459bsm593527pgb.34.2023.03.15.01.49.50 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Wed, 15 Mar 2023 01:49:51 -0700 (PDT) Received: from [192.168.253.23] (helo=devoid.disaster.area) by dread.disaster.area with esmtp (Exim 4.92.3) (envelope-from ) id 1pcMpQ-008zeV-50; Wed, 15 Mar 2023 19:49:48 +1100 Received: from dave by devoid.disaster.area with local (Exim 4.96) (envelope-from ) id 1pcMpQ-00Ag6X-0S; Wed, 15 Mar 2023 19:49:48 +1100 From: Dave Chinner To: linux-kernel@vger.kernel.org, linux-xfs@vger.kernel.org Cc: linux-mm@vger.kernel.org, linux-fsdevel@vger.kernel.org, yebin10@huawei.com Subject: [PATCH 4/4] pcpcntr: remove percpu_counter_sum_all() Date: Wed, 15 Mar 2023 19:49:38 +1100 Message-Id: <20230315084938.2544737-5-david@fromorbit.com> X-Mailer: git-send-email 2.39.2 In-Reply-To: <20230315084938.2544737-1-david@fromorbit.com> References: <20230315084938.2544737-1-david@fromorbit.com> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-fsdevel@vger.kernel.org From: Dave Chinner percpu_counter_sum_all() is now redundant as the race condition it was invented to handle is now dealt with by percpu_counter_sum() directly and all users of percpu_counter_sum_all() have been removed. Remove it. This effectively reverts the changes made in f689054aace2 ("percpu_counter: add percpu_counter_sum_all interface") except for the cpumask iteration that fixes percpu_counter_sum() made earlier in this series. Signed-off-by: Dave Chinner --- include/linux/percpu_counter.h | 6 ----- lib/percpu_counter.c | 40 ++++++++++------------------------ 2 files changed, 11 insertions(+), 35 deletions(-) diff --git a/include/linux/percpu_counter.h b/include/linux/percpu_counter.h index 521a733e21a9..75b73c83bc9d 100644 --- a/include/linux/percpu_counter.h +++ b/include/linux/percpu_counter.h @@ -45,7 +45,6 @@ void percpu_counter_set(struct percpu_counter *fbc, s64 amount); void percpu_counter_add_batch(struct percpu_counter *fbc, s64 amount, s32 batch); s64 __percpu_counter_sum(struct percpu_counter *fbc); -s64 percpu_counter_sum_all(struct percpu_counter *fbc); int __percpu_counter_compare(struct percpu_counter *fbc, s64 rhs, s32 batch); void percpu_counter_sync(struct percpu_counter *fbc); @@ -196,11 +195,6 @@ static inline s64 percpu_counter_sum(struct percpu_counter *fbc) return percpu_counter_read(fbc); } -static inline s64 percpu_counter_sum_all(struct percpu_counter *fbc) -{ - return percpu_counter_read(fbc); -} - static inline bool percpu_counter_initialized(struct percpu_counter *fbc) { return true; diff --git a/lib/percpu_counter.c b/lib/percpu_counter.c index 0e096311e0c0..5004463c4f9f 100644 --- a/lib/percpu_counter.c +++ b/lib/percpu_counter.c @@ -122,23 +122,6 @@ void percpu_counter_sync(struct percpu_counter *fbc) } EXPORT_SYMBOL(percpu_counter_sync); -static s64 __percpu_counter_sum_mask(struct percpu_counter *fbc, - const struct cpumask *cpu_mask) -{ - s64 ret; - int cpu; - unsigned long flags; - - raw_spin_lock_irqsave(&fbc->lock, flags); - ret = fbc->count; - for_each_cpu_or(cpu, cpu_online_mask, cpu_mask) { - s32 *pcount = per_cpu_ptr(fbc->counters, cpu); - ret += *pcount; - } - raw_spin_unlock_irqrestore(&fbc->lock, flags); - return ret; -} - /* * Add up all the per-cpu counts, return the result. This is a more accurate * but much slower version of percpu_counter_read_positive(). @@ -153,22 +136,21 @@ static s64 __percpu_counter_sum_mask(struct percpu_counter *fbc, */ s64 __percpu_counter_sum(struct percpu_counter *fbc) { + s64 ret; + int cpu; + unsigned long flags; - return __percpu_counter_sum_mask(fbc, cpu_dying_mask); + raw_spin_lock_irqsave(&fbc->lock, flags); + ret = fbc->count; + for_each_cpu_or(cpu, cpu_online_mask, cpu_dying_mask) { + s32 *pcount = per_cpu_ptr(fbc->counters, cpu); + ret += *pcount; + } + raw_spin_unlock_irqrestore(&fbc->lock, flags); + return ret; } EXPORT_SYMBOL(__percpu_counter_sum); -/* - * This is slower version of percpu_counter_sum as it traverses all possible - * cpus. Use this only in the cases where accurate data is needed in the - * presense of CPUs getting offlined. - */ -s64 percpu_counter_sum_all(struct percpu_counter *fbc) -{ - return __percpu_counter_sum_mask(fbc, cpu_possible_mask); -} -EXPORT_SYMBOL(percpu_counter_sum_all); - int __percpu_counter_init(struct percpu_counter *fbc, s64 amount, gfp_t gfp, struct lock_class_key *key) {