From patchwork Thu Jan 5 00:09:41 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Paul E. McKenney" X-Patchwork-Id: 13089208 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 34825C53210 for ; Thu, 5 Jan 2023 00:10:00 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S234496AbjAEAJ7 (ORCPT ); Wed, 4 Jan 2023 19:09:59 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:52424 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S234393AbjAEAJ7 (ORCPT ); Wed, 4 Jan 2023 19:09:59 -0500 Received: from dfw.source.kernel.org (dfw.source.kernel.org [139.178.84.217]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 67CC343A1C; Wed, 4 Jan 2023 16:09:58 -0800 (PST) Received: from smtp.kernel.org (relay.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by dfw.source.kernel.org (Postfix) with ESMTPS id B0E396188D; Thu, 5 Jan 2023 00:09:57 +0000 (UTC) Received: by smtp.kernel.org (Postfix) with ESMTPSA id 1B7A4C433D2; Thu, 5 Jan 2023 00:09:57 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1672877397; bh=PS7RJ6ca0uDsIciwHIe/Mb/Re+MAuOv96kNuEUv2OMY=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=DoS72ule2hem61lv95M9JtiH09qkzU0Hj+5layKP/VVObizNp1Qa758AhEzAY+c2H 4ci22oFx051LnFLTWeEMlh62giV84SNZwwxe/9AQ48PJjgmLeL3A1udcDgovMaMQs3 jokz193nPTkox7VXTabjpZyxqgUCB+BK+uJTuXSBOZWW0wD/d/qAihab4tCepYrx12 cO52GjTe/8nByNjeAMAKT5mYFiRokf8sxZFWhlTmbiG1VBrSh+zkGZbyf4Mi87AzBZ zV/qn3Zz4cwczFI6Zc/LOWyKcMPqzk7sDoQqCPhYe/istlvax/utbJfbzkVrjYlHON l4RS4Q0QPW49g== Received: by paulmck-ThinkPad-P17-Gen-1.home (Postfix, from userid 
	1000) id C36105C05CA; Wed, 4 Jan 2023 16:09:56 -0800 (PST)
From: "Paul E. McKenney"
To: rcu@vger.kernel.org
Cc: linux-kernel@vger.kernel.org, kernel-team@meta.com, rostedt@goodmis.org, "Paul E. McKenney"
Subject: [PATCH rcu 01/15] doc: Further updates to RCU's lockdep.rst
Date: Wed, 4 Jan 2023 16:09:41 -0800
Message-Id: <20230105000955.1767218-1-paulmck@kernel.org>
X-Mailer: git-send-email 2.31.1.189.g2e36527f23
In-Reply-To: <20230105000945.GA1767128@paulmck-ThinkPad-P17-Gen-1>
References: <20230105000945.GA1767128@paulmck-ThinkPad-P17-Gen-1>
MIME-Version: 1.0
Precedence: bulk
List-ID:
X-Mailing-List: rcu@vger.kernel.org

This commit wordsmiths RCU's lockdep.rst.

Signed-off-by: Paul E. McKenney
---
 Documentation/RCU/lockdep.rst | 13 ++++++-------
 1 file changed, 6 insertions(+), 7 deletions(-)

diff --git a/Documentation/RCU/lockdep.rst b/Documentation/RCU/lockdep.rst
index 9308f1bdba05d..2749f43ec1b03 100644
--- a/Documentation/RCU/lockdep.rst
+++ b/Documentation/RCU/lockdep.rst
@@ -69,9 +69,8 @@ checking of rcu_dereference() primitives:
 	value of the pointer itself, for example, against NULL.

 The rcu_dereference_check() check expression can be any boolean
-expression, but would normally include a lockdep expression. However,
-any boolean expression can be used. For a moderately ornate example,
-consider the following::
+expression, but would normally include a lockdep expression. For a
+moderately ornate example, consider the following::

 	file = rcu_dereference_check(fdt->fd[fd],
 				     lockdep_is_held(&files->file_lock) ||
@@ -97,10 +96,10 @@ code, it could instead be written as follows::
 				     atomic_read(&files->count) == 1);

 This would verify cases #2 and #3 above, and furthermore lockdep would
-complain if this was used in an RCU read-side critical section unless one
-of these two cases held. Because rcu_dereference_protected() omits all
-barriers and compiler constraints, it generates better code than do the
-other flavors of rcu_dereference().
On the other hand, it is illegal +complain even if this was used in an RCU read-side critical section unless +one of these two cases held. Because rcu_dereference_protected() omits +all barriers and compiler constraints, it generates better code than do +the other flavors of rcu_dereference(). On the other hand, it is illegal to use rcu_dereference_protected() if either the RCU-protected pointer or the RCU-protected data that it points to can change concurrently. From patchwork Thu Jan 5 00:09:42 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Paul E. McKenney" X-Patchwork-Id: 13089212 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 0A924C54EBC for ; Thu, 5 Jan 2023 00:10:10 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S235063AbjAEAKF (ORCPT ); Wed, 4 Jan 2023 19:10:05 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:52508 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S235147AbjAEAKB (ORCPT ); Wed, 4 Jan 2023 19:10:01 -0500 Received: from ams.source.kernel.org (ams.source.kernel.org [145.40.68.75]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id BB4D143A20; Wed, 4 Jan 2023 16:09:59 -0800 (PST) Received: from smtp.kernel.org (relay.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ams.source.kernel.org (Postfix) with ESMTPS id 645CFB81983; Thu, 5 Jan 2023 00:09:58 +0000 (UTC) Received: by smtp.kernel.org (Postfix) with ESMTPSA id 20065C433F1; Thu, 5 Jan 2023 00:09:57 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1672877397; bh=FzkboUT+mHioIWdRgUQ1Pv6Fjcpxk2Vxw+5VTWcH2nM=; 
h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=ZhW5Rhk3h2P9grPu6H10yXEuYHKBByIe20q+iFWnIfNGQ1cwb16yxu7B2q7gjumgQ 0aN0xbb09qJD7v0Lh9Cro2zHAGWj8PFl5iyMunQo/OPYucQv/4YWDvcfkHpBynTtK+ C4ebexVsFiikvYgfcfmwABl5vPGZHqGah36/BSEbmiGmUEW4TaGpSz7jQndsZruSfF iGX+uO5t/QqGA3vcT1yibcrTCEFGLmUfFL9vxLCobdxRTxktt6YbycOY7cvP1+txGu P3u3VKsP2jiJdR8/qRGt/z4kUeOIq6xUF5m4Q7jDmoMmkIxM0jy4WZD3RPku8c57fQ c5NWon8K8MoPA== Received: by paulmck-ThinkPad-P17-Gen-1.home (Postfix, from userid 1000) id C5B835C086D; Wed, 4 Jan 2023 16:09:56 -0800 (PST) From: "Paul E. McKenney" To: rcu@vger.kernel.org Cc: linux-kernel@vger.kernel.org, kernel-team@meta.com, rostedt@goodmis.org, "Paul E. McKenney" Subject: [PATCH rcu 02/15] doc: Update NMI-RCU.rst Date: Wed, 4 Jan 2023 16:09:42 -0800 Message-Id: <20230105000955.1767218-2-paulmck@kernel.org> X-Mailer: git-send-email 2.31.1.189.g2e36527f23 In-Reply-To: <20230105000945.GA1767128@paulmck-ThinkPad-P17-Gen-1> References: <20230105000945.GA1767128@paulmck-ThinkPad-P17-Gen-1> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: rcu@vger.kernel.org This commit updates NMI-RCU.rst to highlight the ancient heritage of the example code and to discourage wanton compiler "optimizations". Signed-off-by: Paul E. McKenney --- Documentation/RCU/NMI-RCU.rst | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/Documentation/RCU/NMI-RCU.rst b/Documentation/RCU/NMI-RCU.rst index 2a92bc685ef1a..dff60a80b386e 100644 --- a/Documentation/RCU/NMI-RCU.rst +++ b/Documentation/RCU/NMI-RCU.rst @@ -8,7 +8,7 @@ Although RCU is usually used to protect read-mostly data structures, it is possible to use RCU to provide dynamic non-maskable interrupt handlers, as well as dynamic irq handlers. This document describes how to do this, drawing loosely from Zwane Mwaikambo's NMI-timer -work in "arch/x86/kernel/traps.c". +work in an old version of "arch/x86/kernel/traps.c". 
The relevant pieces of code are listed below, each followed by a brief explanation:: @@ -116,7 +116,7 @@ Answer to Quick Quiz: This same sad story can happen on other CPUs when using a compiler with aggressive pointer-value speculation - optimizations. + optimizations. (But please don't!) More important, the rcu_dereference_sched() makes it clear to someone reading the code that the pointer is From patchwork Thu Jan 5 00:09:43 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Paul E. McKenney" X-Patchwork-Id: 13089217 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 263F3C46467 for ; Thu, 5 Jan 2023 00:10:23 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S235362AbjAEAKV (ORCPT ); Wed, 4 Jan 2023 19:10:21 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:52530 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S235330AbjAEAKC (ORCPT ); Wed, 4 Jan 2023 19:10:02 -0500 Received: from ams.source.kernel.org (ams.source.kernel.org [145.40.68.75]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id DF32543A2E; Wed, 4 Jan 2023 16:09:59 -0800 (PST) Received: from smtp.kernel.org (relay.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ams.source.kernel.org (Postfix) with ESMTPS id 5EA08B81982; Thu, 5 Jan 2023 00:09:58 +0000 (UTC) Received: by smtp.kernel.org (Postfix) with ESMTPSA id 1E3D7C433F0; Thu, 5 Jan 2023 00:09:57 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1672877397; bh=V2Z8QYT5z//Jt9zS0OsjC84Q9vVqZBpFtFH0qHjU5P8=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; 
b=XEIEU+aT+PoiGBDGyH0Jr2XrApzwbVTa/rft5ZuyOBGPDvSsUC6JvbZsXxtoPRgRe hdgh4gFvPiiiG5WWV5hRs7feCMXw4KEawKe0sNkzTRunuGd0xyNDUddKMaoFcIfxFR L+lXK6MJaMVI2Rpc+HZ8EbENlZVzd//qdnTFFmIWRM+QFwH7m0veJk4Cw2iEwRw+5X 0e7QcAYg7CCjK1UpQAJW0zLsgg6XkZ+v4WISVPeDN6O+PAaEEFIpuCNHWnr4J4aO9S 2rtnz/IcPVOEGj8BdmrOnP2Q2BiBwKIvMGzhb51UKTgVqwVrXl0Uo1gF5B+YbIaHFh 0QvsCtISup8uQ== Received: by paulmck-ThinkPad-P17-Gen-1.home (Postfix, from userid 1000) id C7D0A5C08E5; Wed, 4 Jan 2023 16:09:56 -0800 (PST) From: "Paul E. McKenney" To: rcu@vger.kernel.org Cc: linux-kernel@vger.kernel.org, kernel-team@meta.com, rostedt@goodmis.org, "Paul E. McKenney" Subject: [PATCH rcu 03/15] doc: Update rcubarrier.rst Date: Wed, 4 Jan 2023 16:09:43 -0800 Message-Id: <20230105000955.1767218-3-paulmck@kernel.org> X-Mailer: git-send-email 2.31.1.189.g2e36527f23 In-Reply-To: <20230105000945.GA1767128@paulmck-ThinkPad-P17-Gen-1> References: <20230105000945.GA1767128@paulmck-ThinkPad-P17-Gen-1> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: rcu@vger.kernel.org This commit updates rcubarrier.txt to reflect RCU additions and changes over the past few years. [ paulmck: Apply Stephen Rothwell feedback. ] Signed-off-by: Paul E. McKenney --- Documentation/RCU/rcubarrier.rst | 196 +++++++++++++++++-------------- 1 file changed, 110 insertions(+), 86 deletions(-) diff --git a/Documentation/RCU/rcubarrier.rst b/Documentation/RCU/rcubarrier.rst index 3b4a248774961..5a643e5233d5f 100644 --- a/Documentation/RCU/rcubarrier.rst +++ b/Documentation/RCU/rcubarrier.rst @@ -5,37 +5,12 @@ RCU and Unloadable Modules [Originally published in LWN Jan. 14, 2007: http://lwn.net/Articles/217484/] -RCU (read-copy update) is a synchronization mechanism that can be thought -of as a replacement for read-writer locking (among other things), but with -very low-overhead readers that are immune to deadlock, priority inversion, -and unbounded latency. 
RCU read-side critical sections are delimited
-by rcu_read_lock() and rcu_read_unlock(), which, in non-CONFIG_PREEMPTION
-kernels, generate no code whatsoever.
-
-This means that RCU writers are unaware of the presence of concurrent
-readers, so that RCU updates to shared data must be undertaken quite
-carefully, leaving an old version of the data structure in place until all
-pre-existing readers have finished. These old versions are needed because
-such readers might hold a reference to them. RCU updates can therefore be
-rather expensive, and RCU is thus best suited for read-mostly situations.
-
-How can an RCU writer possibly determine when all readers are finished,
-given that readers might well leave absolutely no trace of their
-presence? There is a synchronize_rcu() primitive that blocks until all
-pre-existing readers have completed. An updater wishing to delete an
-element p from a linked list might do the following, while holding an
-appropriate lock, of course::
-
-	list_del_rcu(p);
-	synchronize_rcu();
-	kfree(p);
-
-But the above code cannot be used in IRQ context -- the call_rcu()
-primitive must be used instead. This primitive takes a pointer to an
-rcu_head struct placed within the RCU-protected data structure and
-another pointer to a function that may be invoked later to free that
-structure. Code to delete an element p from the linked list from IRQ
-context might then be as follows::
+RCU updaters sometimes use call_rcu() to initiate an asynchronous wait for
+a grace period to elapse. This primitive takes a pointer to an rcu_head
+struct placed within the RCU-protected data structure and another pointer
+to a function that may be invoked later to free that structure. Code to
+delete an element p from the linked list from IRQ context might then be
+as follows::

 	list_del_rcu(p);
 	call_rcu(&p->rcu, p_callback);
@@ -54,7 +29,7 @@ IRQ context.
The function p_callback() might be defined as follows::
 Unloading Modules That Use call_rcu()
 -------------------------------------

-But what if p_callback is defined in an unloadable module?
+But what if the p_callback() function is defined in an unloadable module?

 If we unload the module while some RCU callbacks are pending,
 the CPUs executing these callbacks are going to be severely
@@ -67,20 +42,21 @@ grace period to elapse, it does not wait for the callbacks to complete.

 One might be tempted to try several back-to-back synchronize_rcu()
 calls, but this is still not guaranteed to work. If there is a very
-heavy RCU-callback load, then some of the callbacks might be deferred
-in order to allow other processing to proceed. Such deferral is required
-in realtime kernels in order to avoid excessive scheduling latencies.
+heavy RCU-callback load, then some of the callbacks might be deferred in
+order to allow other processing to proceed. For but one example, such
+deferral is required in realtime kernels in order to avoid excessive
+scheduling latencies.


 rcu_barrier()
 -------------

-We instead need the rcu_barrier() primitive. Rather than waiting for
-a grace period to elapse, rcu_barrier() waits for all outstanding RCU
-callbacks to complete. Please note that rcu_barrier() does **not** imply
-synchronize_rcu(), in particular, if there are no RCU callbacks queued
-anywhere, rcu_barrier() is within its rights to return immediately,
-without waiting for a grace period to elapse.
+This situation can be handled by the rcu_barrier() primitive. Rather
+than waiting for a grace period to elapse, rcu_barrier() waits for all
+outstanding RCU callbacks to complete. Please note that rcu_barrier()
+does **not** imply synchronize_rcu(), in particular, if there are no RCU
+callbacks queued anywhere, rcu_barrier() is within its rights to return
+immediately, without waiting for anything, let alone a grace period.
 Pseudo-code using rcu_barrier() is as follows:

@@ -89,19 +65,22 @@ Pseudo-code using rcu_barrier() is as follows:
    3. Allow the module to be unloaded.

 There is also an srcu_barrier() function for SRCU, and you of course
-must match the flavor of rcu_barrier() with that of call_rcu(). If your
-module uses multiple flavors of call_rcu(), then it must also use multiple
-flavors of rcu_barrier() when unloading that module. For example, if
-it uses call_rcu(), call_srcu() on srcu_struct_1, and call_srcu() on
-srcu_struct_2, then the following three lines of code will be required
-when unloading::
+must match the flavor of srcu_barrier() with that of call_srcu().
+If your module uses multiple srcu_struct structures, then it must also
+use multiple invocations of srcu_barrier() when unloading that module.
+For example, if it uses call_rcu(), call_srcu() on srcu_struct_1, and
+call_srcu() on srcu_struct_2, then the following three lines of code
+will be required when unloading::

   1 rcu_barrier();
   2 srcu_barrier(&srcu_struct_1);
   3 srcu_barrier(&srcu_struct_2);

-The rcutorture module makes use of rcu_barrier() in its exit function
-as follows::
+If latency is of the essence, workqueues could be used to run these
+three functions concurrently.
+
+An ancient version of the rcutorture module makes use of rcu_barrier()
+in its exit function as follows::

   1 static void
   2 rcu_torture_cleanup(void)
@@ -190,16 +169,17 @@ Quick Quiz #1:
 :ref:`Answer to Quick Quiz #1 `

 Your module might have additional complications. For example, if your
-module invokes call_rcu() from timers, you will need to first cancel all
-the timers, and only then invoke rcu_barrier() to wait for any remaining
+module invokes call_rcu() from timers, you will need to first refrain
+from posting new timers, cancel (or wait for) all the already-posted
+timers, and only then invoke rcu_barrier() to wait for any remaining
 RCU callbacks to complete.

-Of course, if you module uses call_rcu(), you will need to invoke
+Of course, if your module uses call_rcu(), you will need to invoke
 rcu_barrier() before unloading. Similarly, if your module uses
 call_srcu(), you will need to invoke srcu_barrier() before unloading,
 and on the same srcu_struct structure. If your module uses call_rcu()
-**and** call_srcu(), then you will need to invoke rcu_barrier() **and**
-srcu_barrier().
+**and** call_srcu(), then (as noted above) you will need to invoke
+rcu_barrier() **and** srcu_barrier().


 Implementing rcu_barrier()
@@ -211,27 +191,40 @@ queues. His implementation queues an RCU callback on each of the per-CPU
 callback queues, and then waits until they have all started executing, at
 which point, all earlier RCU callbacks are guaranteed to have completed.

-The original code for rcu_barrier() was as follows::
-
-   1 void rcu_barrier(void)
-   2 {
-   3   BUG_ON(in_interrupt());
-   4   /* Take cpucontrol mutex to protect against CPU hotplug */
-   5   mutex_lock(&rcu_barrier_mutex);
-   6   init_completion(&rcu_barrier_completion);
-   7   atomic_set(&rcu_barrier_cpu_count, 0);
-   8   on_each_cpu(rcu_barrier_func, NULL, 0, 1);
-   9   wait_for_completion(&rcu_barrier_completion);
-  10   mutex_unlock(&rcu_barrier_mutex);
-  11 }
-
-Line 3 verifies that the caller is in process context, and lines 5 and 10
+The original code for rcu_barrier() was roughly as follows::
+
+   1 void rcu_barrier(void)
+   2 {
+   3   BUG_ON(in_interrupt());
+   4   /* Take cpucontrol mutex to protect against CPU hotplug */
+   5   mutex_lock(&rcu_barrier_mutex);
+   6   init_completion(&rcu_barrier_completion);
+   7   atomic_set(&rcu_barrier_cpu_count, 1);
+   8   on_each_cpu(rcu_barrier_func, NULL, 0, 1);
+   9   if (atomic_dec_and_test(&rcu_barrier_cpu_count))
+  10     complete(&rcu_barrier_completion);
+  11   wait_for_completion(&rcu_barrier_completion);
+  12   mutex_unlock(&rcu_barrier_mutex);
+  13 }
+
+Line 3 verifies that the caller is in process context, and lines 5 and 12
 use rcu_barrier_mutex to ensure that only
one rcu_barrier() is using the
 global completion and counters at a time, which are initialized on lines
 6 and 7. Line 8 causes each CPU to invoke rcu_barrier_func(), which is
 shown below. Note that the final "1" in on_each_cpu()'s argument list
 ensures that all the calls to rcu_barrier_func() will have completed
-before on_each_cpu() returns. Line 9 then waits for the completion.
+before on_each_cpu() returns. Line 9 removes the initial count from
+rcu_barrier_cpu_count, and if this count is now zero, line 10 finalizes
+the completion, which prevents line 11 from blocking. Either way,
+line 11 then waits (if needed) for the completion.
+
+.. _rcubarrier_quiz_2:
+
+Quick Quiz #2:
+	Why doesn't line 7 initialize rcu_barrier_cpu_count to zero,
+	thereby avoiding the need for lines 9 and 10?
+
+:ref:`Answer to Quick Quiz #2 `

 This code was rewritten in 2008 and several times thereafter, but this
 still gives the general idea.
@@ -253,7 +246,7 @@ to post an RCU callback, as follows::
 Lines 3 and 4 locate RCU's internal per-CPU rcu_data structure,
 which contains the struct rcu_head that needed for the later call to
 call_rcu(). Line 7 picks up a pointer to this struct rcu_head, and line
-8 increments a global counter. This counter will later be decremented
+8 increments the global counter. This counter will later be decremented
 by the callback. Line 9 then registers the rcu_barrier_callback() on
 the current CPU's queue.

@@ -267,27 +260,28 @@ reaches zero, as follows::
   4     complete(&rcu_barrier_completion);
   5 }

-.. _rcubarrier_quiz_2:
+.. _rcubarrier_quiz_3:

-Quick Quiz #2:
+Quick Quiz #3:
 	What happens if CPU 0's rcu_barrier_func() executes
 	immediately (thus incrementing rcu_barrier_cpu_count to the
 	value one), but the other CPU's rcu_barrier_func() invocations
 	are delayed for a full grace period? Couldn't this result in
 	rcu_barrier() returning prematurely?

-:ref:`Answer to Quick Quiz #2 `
+:ref:`Answer to Quick Quiz #3 `

 The current rcu_barrier() implementation is more complex, due to the need
 to avoid disturbing idle CPUs (especially on battery-powered systems)
 and the need to minimally disturb non-idle CPUs in real-time systems.
-However, the code above illustrates the concepts.
+In addition, a great many optimizations have been applied. However,
+the code above illustrates the concepts.


 rcu_barrier() Summary
 ---------------------

-The rcu_barrier() primitive has seen relatively little use, since most
+The rcu_barrier() primitive is used relatively infrequently, since most
 code using RCU is in the core kernel rather than in modules. However, if
 you are using RCU from an unloadable module, you need to use rcu_barrier()
 so that your module may be safely unloaded.
@@ -318,6 +312,39 @@ Answer: Interestingly enough, rcu_barrier() was not originally
 .. _answer_rcubarrier_quiz_2:

 Quick Quiz #2:
+	Why doesn't line 7 initialize rcu_barrier_cpu_count to zero,
+	thereby avoiding the need for lines 9 and 10?
+
+Answer:	Suppose that the on_each_cpu() function shown on line 8 was
+	delayed, so that CPU 0's rcu_barrier_func() executed and
+	the corresponding grace period elapsed, all before CPU 1's
+	rcu_barrier_func() started executing. This would result in
+	rcu_barrier_cpu_count being decremented to zero, so that line
+	11's wait_for_completion() would return immediately, failing to
+	wait for CPU 1's callbacks to be invoked.
+
+	Note that this was not a problem when the rcu_barrier() code
+	was first added back in 2005. This is because on_each_cpu()
+	disables preemption, which acted as an RCU read-side critical
+	section, thus preventing CPU 0's grace period from completing
+	until on_each_cpu() had dealt with all of the CPUs. However,
+	with the advent of preemptible RCU, rcu_barrier() no longer
+	waited on nonpreemptible regions of code in preemptible kernels,
+	that being the job of the new rcu_barrier_sched() function.
+
+	However, with the RCU flavor consolidation around v4.20, this
+	possibility was once again ruled out, because the consolidated
+	RCU once again waits on nonpreemptible regions of code.
+
+	Nevertheless, that extra count might still be a good idea.
+	Relying on this sort of accident of implementation can result
+	in later surprise bugs when the implementation changes.
+
+:ref:`Back to Quick Quiz #2 `
+
+.. _answer_rcubarrier_quiz_3:
+
+Quick Quiz #3:
 	What happens if CPU 0's rcu_barrier_func() executes
 	immediately (thus incrementing rcu_barrier_cpu_count to the
 	value one), but the other CPU's rcu_barrier_func() invocations
@@ -336,18 +363,15 @@ Answer: This cannot happen. The reason is that on_each_cpu() has its last

 	Therefore, on_each_cpu() disables preemption across its call
 	to smp_call_function() and also across the local call to
-	rcu_barrier_func(). This prevents the local CPU from context
-	switching, again preventing grace periods from completing. This
+	rcu_barrier_func(). Because recent RCU implementations treat
+	preemption-disabled regions of code as RCU read-side critical
+	sections, this prevents grace periods from completing. This
 	means that all CPUs have executed rcu_barrier_func() before
 	the first rcu_barrier_callback() can possibly execute, in turn
 	preventing rcu_barrier_cpu_count from prematurely reaching zero.

-	Currently, -rt implementations of RCU keep but a single global
-	queue for RCU callbacks, and thus do not suffer from this
-	problem. However, when the -rt RCU eventually does have per-CPU
-	callback queues, things will have to change. One simple change
-	is to add an rcu_read_lock() before line 8 of rcu_barrier()
-	and an rcu_read_unlock() after line 8 of this same function. If
-	you can think of a better change, please let me know!
+	But if on_each_cpu() ever decides to forgo disabling preemption,
+	as might well happen due to real-time latency considerations,
+	initializing rcu_barrier_cpu_count to one will save the day.

-:ref:`Back to Quick Quiz #2 ` +:ref:`Back to Quick Quiz #3 ` From patchwork Thu Jan 5 00:09:44 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Paul E. McKenney" X-Patchwork-Id: 13089209 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 7538AC54E76 for ; Thu, 5 Jan 2023 00:10:00 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S234806AbjAEAKA (ORCPT ); Wed, 4 Jan 2023 19:10:00 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:52426 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S231480AbjAEAJ7 (ORCPT ); Wed, 4 Jan 2023 19:09:59 -0500 Received: from dfw.source.kernel.org (dfw.source.kernel.org [139.178.84.217]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 8360543A1D; Wed, 4 Jan 2023 16:09:58 -0800 (PST) Received: from smtp.kernel.org (relay.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by dfw.source.kernel.org (Postfix) with ESMTPS id C4A686185D; Thu, 5 Jan 2023 00:09:57 +0000 (UTC) Received: by smtp.kernel.org (Postfix) with ESMTPSA id 28564C433F2; Thu, 5 Jan 2023 00:09:57 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1672877397; bh=+AiKtda+llbq2fzplOX6DNXjkIDDWzakm6PNDaSoAkU=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=ceQcRjLOz/Rn8fggvDh6YDzDjjtD9/K2vNLtGKS2jDGfc5860/nLKqyseHDt1s8kK NH9Q/DncskK9gkJ3SPYPPdVFYM1ow0HSu/3hMJgs+HlxYXAUV0oMFCWqGPmLCFLrTk a1RNVUel2N2fdr7zWWVllDK7WKz9vw+RVNhBg402LouHogSDy4xzLLtH+jfiewsoNy VdrkOzbj0eZy4kA0Mn/t1rpUWe/4A2/Bvq/MEbLAV7FmffL9KBGU1MVx0AWSdZgQMg vnLogOaJuwGCc6iavccKwB76BCOMuqIGgShsxYyAD0V0aVP4zZd0vpcb+0re7fIEPJ 6b2mJMEclNp2w== 
Received: by paulmck-ThinkPad-P17-Gen-1.home (Postfix, from userid 1000) id C9BF55C1456; Wed, 4 Jan 2023 16:09:56 -0800 (PST)
From: "Paul E. McKenney"
To: rcu@vger.kernel.org
Cc: linux-kernel@vger.kernel.org, kernel-team@meta.com, rostedt@goodmis.org, "Paul E. McKenney"
Subject: [PATCH rcu 04/15] doc: Update rcu_dereference.rst
Date: Wed, 4 Jan 2023 16:09:44 -0800
Message-Id: <20230105000955.1767218-4-paulmck@kernel.org>
X-Mailer: git-send-email 2.31.1.189.g2e36527f23
In-Reply-To: <20230105000945.GA1767128@paulmck-ThinkPad-P17-Gen-1>
References: <20230105000945.GA1767128@paulmck-ThinkPad-P17-Gen-1>
MIME-Version: 1.0
Precedence: bulk
List-ID:
X-Mailing-List: rcu@vger.kernel.org

This commit updates rcu_dereference.rst to reflect RCU additions and
changes over the past few years.

Signed-off-by: Paul E. McKenney
---
 Documentation/RCU/rcu_dereference.rst | 21 ++++++++++++++++-----
 1 file changed, 16 insertions(+), 5 deletions(-)

diff --git a/Documentation/RCU/rcu_dereference.rst b/Documentation/RCU/rcu_dereference.rst
index 81e828c8313b8..3b739f6243c85 100644
--- a/Documentation/RCU/rcu_dereference.rst
+++ b/Documentation/RCU/rcu_dereference.rst
@@ -19,8 +19,9 @@ Follow these rules to keep your RCU code working properly:
 	can reload the value, and won't your code have fun with two
 	different values for a single pointer! Without rcu_dereference(),
 	DEC Alpha can load a pointer, dereference that pointer, and
-	return data preceding initialization that preceded the store of
-	the pointer.
+	return data preceding initialization that preceded the store
+	of the pointer. (As noted later, in recent kernels READ_ONCE()
+	also prevents DEC Alpha from playing these tricks.)

 	In addition, the volatile cast in rcu_dereference() prevents the
 	compiler from deducing the resulting pointer value. Please see
@@ -34,7 +35,7 @@ Follow these rules to keep your RCU code working properly:
 	takes on the role of the lockless_dereference() primitive that
 	was removed in v4.15.

-- You are only permitted to use rcu_dereference on pointer values.
+- You are only permitted to use rcu_dereference() on pointer values.
 	The compiler simply knows too much about integral values to
 	trust it to carry dependencies through integer operations.
 	There are a very few exceptions, namely that you can temporarily
@@ -240,6 +241,7 @@ precautions. To see this, consider the following code fragment::
 		struct foo *q;
 		int r1, r2;

+		rcu_read_lock();
 		p = rcu_dereference(gp2);
 		if (p == NULL)
 			return;
@@ -248,7 +250,10 @@ precautions. To see this, consider the following code fragment::
 		if (p == q) {
 			/* The compiler decides that q->c is same as p->c. */
 			r2 = p->c; /* Could get 44 on weakly order system. */
+		} else {
+			r2 = p->c - r1; /* Unconditional access to p->c. */
 		}
+		rcu_read_unlock();
 		do_something_with(r1, r2);
 	}

@@ -297,6 +302,7 @@ Then one approach is to use locking, for example, as follows::
 		struct foo *q;
 		int r1, r2;

+		rcu_read_lock();
 		p = rcu_dereference(gp2);
 		if (p == NULL)
 			return;
@@ -306,7 +312,12 @@ Then one approach is to use locking, for example, as follows::
 		if (p == q) {
 			/* The compiler decides that q->c is same as p->c. */
 			r2 = p->c; /* Locking guarantees r2 == 144. */
+		} else {
+			spin_lock(&q->lock);
+			r2 = q->c - r1;
+			spin_unlock(&q->lock);
 		}
+		rcu_read_unlock();
 		spin_unlock(&p->lock);
 		do_something_with(r1, r2);
 	}
@@ -364,7 +375,7 @@ the exact value of "p" even in the not-equals case. This allows the
 compiler to make the return values independent of the load from "gp",
 in turn destroying the ordering between this load and the loads of the
 return values. This can result in "p->b" returning pre-initialization
-garbage values.
+garbage values on weakly ordered systems.

 In short, rcu_dereference() is *not* optional when you are going to
 dereference the resulting pointer.
@@ -430,7 +441,7 @@ member of the rcu_dereference() to use in various situations: SPARSE CHECKING OF RCU-PROTECTED POINTERS ----------------------------------------- -The sparse static-analysis tool checks for direct access to RCU-protected +The sparse static-analysis tool checks for non-RCU access to RCU-protected pointers, which can result in "interesting" bugs due to compiler optimizations involving invented loads and perhaps also load tearing. For example, suppose someone mistakenly does something like this:: From patchwork Thu Jan 5 00:09:45 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Paul E. McKenney" X-Patchwork-Id: 13089215
From: "Paul E. McKenney" To: rcu@vger.kernel.org Cc: linux-kernel@vger.kernel.org, kernel-team@meta.com, rostedt@goodmis.org, "Paul E. McKenney" Subject: [PATCH rcu 05/15] doc: Update and wordsmith rculist_nulls.rst Date: Wed, 4 Jan 2023 16:09:45 -0800 Message-Id: <20230105000955.1767218-5-paulmck@kernel.org> X-Mailer: git-send-email 2.31.1.189.g2e36527f23 In-Reply-To: <20230105000945.GA1767128@paulmck-ThinkPad-P17-Gen-1> References: <20230105000945.GA1767128@paulmck-ThinkPad-P17-Gen-1> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: rcu@vger.kernel.org Do some wordsmithing and breaking up of RCU readers. Signed-off-by: Paul E. McKenney --- Documentation/RCU/rculist_nulls.rst | 109 ++++++++++++++-------------- 1 file changed, 54 insertions(+), 55 deletions(-) diff --git a/Documentation/RCU/rculist_nulls.rst b/Documentation/RCU/rculist_nulls.rst index ca4692775ad41..f84d6970758bc 100644 --- a/Documentation/RCU/rculist_nulls.rst +++ b/Documentation/RCU/rculist_nulls.rst @@ -14,19 +14,19 @@ Using 'nulls' ============= Using special makers (called 'nulls') is a convenient way -to solve following problem : +to solve the following problem. 
-A typical RCU linked list managing objects which are -allocated with SLAB_TYPESAFE_BY_RCU kmem_cache can -use following algos : +Without 'nulls', a typical RCU linked list managing objects which are +allocated with SLAB_TYPESAFE_BY_RCU kmem_cache can use the following +algorithms: -1) Lookup algo --------------- +1) Lookup algorithm +------------------- :: - rcu_read_lock() begin: + rcu_read_lock() obj = lockless_lookup(key); if (obj) { if (!try_get_ref(obj)) // might fail for free objects @@ -38,6 +38,7 @@ use following algos : */ if (obj->key != key) { // not the object we expected put_ref(obj); + rcu_read_unlock(); goto begin; } } @@ -52,9 +53,9 @@ but a version with an additional memory barrier (smp_rmb()) { struct hlist_node *node, *next; for (pos = rcu_dereference((head)->first); - pos && ({ next = pos->next; smp_rmb(); prefetch(next); 1; }) && - ({ tpos = hlist_entry(pos, typeof(*tpos), member); 1; }); - pos = rcu_dereference(next)) + pos && ({ next = pos->next; smp_rmb(); prefetch(next); 1; }) && + ({ tpos = hlist_entry(pos, typeof(*tpos), member); 1; }); + pos = rcu_dereference(next)) if (obj->key == key) return obj; return NULL; @@ -64,9 +65,9 @@ And note the traditional hlist_for_each_entry_rcu() misses this smp_rmb():: struct hlist_node *node; for (pos = rcu_dereference((head)->first); - pos && ({ prefetch(pos->next); 1; }) && - ({ tpos = hlist_entry(pos, typeof(*tpos), member); 1; }); - pos = rcu_dereference(pos->next)) + pos && ({ prefetch(pos->next); 1; }) && + ({ tpos = hlist_entry(pos, typeof(*tpos), member); 1; }); + pos = rcu_dereference(pos->next)) if (obj->key == key) return obj; return NULL; @@ -82,36 +83,32 @@ Quoting Corey Minyard:: solved by pre-fetching the "next" field (with proper barriers) before checking the key." -2) Insert algo --------------- +2) Insertion algorithm +---------------------- We need to make sure a reader cannot read the new 'obj->obj_next' value -and previous value of 'obj->key'. 
Or else, an item could be deleted +and previous value of 'obj->key'. Otherwise, an item could be deleted from a chain, and inserted into another chain. If new chain was empty -before the move, 'next' pointer is NULL, and lockless reader can -not detect it missed following items in original chain. +before the move, 'next' pointer is NULL, and lockless reader can not +detect the fact that it missed following items in original chain. :: /* - * Please note that new inserts are done at the head of list, - * not in the middle or end. - */ + * Please note that new inserts are done at the head of list, + * not in the middle or end. + */ obj = kmem_cache_alloc(...); lock_chain(); // typically a spin_lock() obj->key = key; - /* - * we need to make sure obj->key is updated before obj->next - * or obj->refcnt - */ - smp_wmb(); - atomic_set(&obj->refcnt, 1); + atomic_set_release(&obj->refcnt, 1); // key before refcnt hlist_add_head_rcu(&obj->obj_node, list); unlock_chain(); // typically a spin_unlock() -3) Remove algo --------------- +3) Removal algorithm +-------------------- + Nothing special here, we can use a standard RCU hlist deletion. But thanks to SLAB_TYPESAFE_BY_RCU, beware a deleted object can be reused very very fast (before the end of RCU grace period) @@ -133,7 +130,7 @@ Avoiding extra smp_rmb() ======================== With hlist_nulls we can avoid extra smp_rmb() in lockless_lookup() -and extra smp_wmb() in insert function. +and extra _release() in insert function. For example, if we choose to store the slot number as the 'nulls' end-of-list marker for each slot of the hash table, we can detect @@ -142,59 +139,61 @@ to another chain) checking the final 'nulls' value if the lookup met the end of chain. If final 'nulls' value is not the slot number, then we must restart the lookup at the beginning. 
If the object was moved to the same chain, -then the reader doesn't care : It might eventually +then the reader doesn't care: It might occasionally scan the list again without harm. -1) lookup algo --------------- +1) lookup algorithm +------------------- :: head = &table[slot]; - rcu_read_lock(); begin: + rcu_read_lock(); hlist_nulls_for_each_entry_rcu(obj, node, head, member) { if (obj->key == key) { - if (!try_get_ref(obj)) // might fail for free objects + if (!try_get_ref(obj)) { // might fail for free objects + rcu_read_unlock(); goto begin; + } if (obj->key != key) { // not the object we expected put_ref(obj); + rcu_read_unlock(); goto begin; } - goto out; + goto out; + } + } + + // If the nulls value we got at the end of this lookup is + // not the expected one, we must restart lookup. + // We probably met an item that was moved to another chain. + if (get_nulls_value(node) != slot) { + put_ref(obj); + rcu_read_unlock(); + goto begin; } - /* - * if the nulls value we got at the end of this lookup is - * not the expected one, we must restart lookup. - * We probably met an item that was moved to another chain. - */ - if (get_nulls_value(node) != slot) - goto begin; obj = NULL; out: rcu_read_unlock(); -2) Insert function ------------------- +2) Insert algorithm +------------------- :: /* - * Please note that new inserts are done at the head of list, - * not in the middle or end. - */ + * Please note that new inserts are done at the head of list, + * not in the middle or end. 
+ */ obj = kmem_cache_alloc(cachep); lock_chain(); // typically a spin_lock() obj->key = key; + atomic_set_release(&obj->refcnt, 1); // key before refcnt /* - * changes to obj->key must be visible before refcnt one - */ - smp_wmb(); - atomic_set(&obj->refcnt, 1); - /* - * insert obj in RCU way (readers might be traversing chain) - */ + * insert obj in RCU way (readers might be traversing chain) + */ hlist_nulls_add_head_rcu(&obj->obj_node, list); unlock_chain(); // typically a spin_unlock() From patchwork Thu Jan 5 00:09:46 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Paul E. McKenney" X-Patchwork-Id: 13089214
From: "Paul E. McKenney" To: rcu@vger.kernel.org Cc: linux-kernel@vger.kernel.org, kernel-team@meta.com, rostedt@goodmis.org, "Paul E. McKenney" Subject: [PATCH rcu 06/15] doc: Update rcu.rst Date: Wed, 4 Jan 2023 16:09:46 -0800 Message-Id: <20230105000955.1767218-6-paulmck@kernel.org> X-Mailer: git-send-email 2.31.1.189.g2e36527f23 In-Reply-To: <20230105000945.GA1767128@paulmck-ThinkPad-P17-Gen-1> References: <20230105000945.GA1767128@paulmck-ThinkPad-P17-Gen-1> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: rcu@vger.kernel.org This commit provides a couple of updates based on the inexorable passage of time. Signed-off-by: Paul E. McKenney --- Documentation/RCU/rcu.rst | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/Documentation/RCU/rcu.rst b/Documentation/RCU/rcu.rst index 3cfe01ba9a494..381cb86f657d8 100644 --- a/Documentation/RCU/rcu.rst +++ b/Documentation/RCU/rcu.rst @@ -77,12 +77,13 @@ Frequently Asked Questions search for the string "Patent" in Documentation/RCU/RTFP.txt to find them. Of these, one was allowed to lapse by the assignee, and the others have been contributed to the Linux kernel under GPL. + Many (but not all) have long since expired. There are now also LGPL implementations of user-level RCU available (https://liburcu.org/). - I hear that RCU needs work in order to support realtime kernels? 
- Realtime-friendly RCU can be enabled via the CONFIG_PREEMPT_RCU + Realtime-friendly RCU is enabled via the CONFIG_PREEMPTION kernel configuration parameter. - Where can I find more information on RCU? From patchwork Thu Jan 5 00:09:47 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Paul E. McKenney" X-Patchwork-Id: 13089222
From: "Paul E. McKenney" To: rcu@vger.kernel.org Cc: linux-kernel@vger.kernel.org, kernel-team@meta.com, rostedt@goodmis.org, "Paul E. McKenney" Subject: [PATCH rcu 07/15] doc: Update stallwarn.rst Date: Wed, 4 Jan 2023 16:09:47 -0800 Message-Id: <20230105000955.1767218-7-paulmck@kernel.org> X-Mailer: git-send-email 2.31.1.189.g2e36527f23 In-Reply-To: <20230105000945.GA1767128@paulmck-ThinkPad-P17-Gen-1> References: <20230105000945.GA1767128@paulmck-ThinkPad-P17-Gen-1> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: rcu@vger.kernel.org This commit updates stallwarn.rst to reflect RCU additions and changes over the past few years. Signed-off-by: Paul E. McKenney --- Documentation/RCU/stallwarn.rst | 43 +++++++++++++++++++-------------- 1 file changed, 25 insertions(+), 18 deletions(-) diff --git a/Documentation/RCU/stallwarn.rst b/Documentation/RCU/stallwarn.rst index e38c587067fc8..dfa4db8c0931e 100644 --- a/Documentation/RCU/stallwarn.rst +++ b/Documentation/RCU/stallwarn.rst @@ -25,10 +25,10 @@ warnings: - A CPU looping with bottom halves disabled. -- For !CONFIG_PREEMPTION kernels, a CPU looping anywhere in the kernel - without invoking schedule(). If the looping in the kernel is - really expected and desirable behavior, you might need to add - some calls to cond_resched(). +- For !CONFIG_PREEMPTION kernels, a CPU looping anywhere in the + kernel without potentially invoking schedule(). If the looping + in the kernel is really expected and desirable behavior, you + might need to add some calls to cond_resched(). - Booting Linux using a console connection that is too slow to keep up with the boot-time console-message rate. 
For example, @@ -108,16 +108,17 @@ warnings: - A bug in the RCU implementation. -- A hardware failure. This is quite unlikely, but has occurred - at least once in real life. A CPU failed in a running system, - becoming unresponsive, but not causing an immediate crash. - This resulted in a series of RCU CPU stall warnings, eventually - leading the realization that the CPU had failed. +- A hardware failure. This is quite unlikely, but is not at all + uncommon in large datacenters. In one memorable case some decades + back, a CPU failed in a running system, becoming unresponsive, + but not causing an immediate crash. This resulted in a series + of RCU CPU stall warnings, eventually leading to the realization + that the CPU had failed. -The RCU, RCU-sched, and RCU-tasks implementations have CPU stall warning. -Note that SRCU does *not* have CPU stall warnings. Please note that -RCU only detects CPU stalls when there is a grace period in progress. -No grace period, no CPU stall warnings. +The RCU, RCU-sched, RCU-tasks, and RCU-tasks-trace implementations have +CPU stall warnings. Note that SRCU does *not* have CPU stall warnings. +Please note that RCU only detects CPU stalls when there is a grace period +in progress. No grace period, no CPU stall warnings. To diagnose the cause of the stall, inspect the stack traces. The offending function will usually be near the top of the stack. @@ -205,16 +206,21 @@ RCU_STALL_RAT_DELAY rcupdate.rcu_task_stall_timeout ------------------------------- - This boot/sysfs parameter controls the RCU-tasks stall warning - interval. A value of zero or less suppresses RCU-tasks stall - warnings. A positive value sets the stall-warning interval - in seconds. An RCU-tasks stall warning starts with the line: + This boot/sysfs parameter controls the RCU-tasks and + RCU-tasks-trace stall warning intervals. A value of zero or less + suppresses RCU-tasks stall warnings. A positive value sets the + stall-warning interval in seconds. 
An RCU-tasks stall warning + starts with the line: INFO: rcu_tasks detected stalls on tasks: And continues with the output of sched_show_task() for each task stalling the current RCU-tasks grace period. + An RCU-tasks-trace stall warning starts (and continues) similarly: + + INFO: rcu_tasks_trace detected stalls on tasks + Interpreting RCU's CPU Stall-Detector "Splats" ============================================== @@ -248,7 +254,8 @@ dynticks counter, which will have an even-numbered value if the CPU is in dyntick-idle mode and an odd-numbered value otherwise. The hex number between the two "/"s is the value of the nesting, which will be a small non-negative number if in the idle loop (as shown above) and a -very large positive number otherwise. +very large positive number otherwise. The number following the final +"/" is the NMI nesting, which will be a small non-negative number. The "softirq=" portion of the message tracks the number of RCU softirq handlers that the stalled CPU has executed. The number before the "/" From patchwork Thu Jan 5 00:09:48 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Paul E. 
McKenney" X-Patchwork-Id: 13089220 From: "Paul E. 
McKenney" To: rcu@vger.kernel.org Cc: linux-kernel@vger.kernel.org, kernel-team@meta.com, rostedt@goodmis.org, "Paul E. McKenney" Subject: [PATCH rcu 08/15] doc: Update torture.rst Date: Wed, 4 Jan 2023 16:09:48 -0800 Message-Id: <20230105000955.1767218-8-paulmck@kernel.org> X-Mailer: git-send-email 2.31.1.189.g2e36527f23 In-Reply-To: <20230105000945.GA1767128@paulmck-ThinkPad-P17-Gen-1> References: <20230105000945.GA1767128@paulmck-ThinkPad-P17-Gen-1> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: rcu@vger.kernel.org This commit updates torture.rst with wordsmithing and the addition of a few more scripts. Signed-off-by: Paul E. McKenney --- Documentation/RCU/torture.rst | 89 +++++++++++++++++++++++++++++++++-- 1 file changed, 85 insertions(+), 4 deletions(-) diff --git a/Documentation/RCU/torture.rst b/Documentation/RCU/torture.rst index a901477130629..0316ba0c69225 100644 --- a/Documentation/RCU/torture.rst +++ b/Documentation/RCU/torture.rst @@ -206,7 +206,11 @@ values for memory may require disabling the callback-flooding tests using the --bootargs parameter discussed below. Sometimes additional debugging is useful, and in such cases the --kconfig -parameter to kvm.sh may be used, for example, ``--kconfig 'CONFIG_KASAN=y'``. +parameter to kvm.sh may be used, for example, ``--kconfig 'CONFIG_RCU_EQS_DEBUG=y'``. +In addition, there are the --gdb, --kasan, and --kcsan parameters. +Note that --gdb limits you to one scenario per kvm.sh run and requires +that you have another window open from which to run ``gdb`` as instructed +by the script. Kernel boot arguments can also be supplied, for example, to control rcutorture's module parameters. For example, to test a change to RCU's @@ -219,10 +223,17 @@ require disabling rcutorture's callback-flooding tests:: --bootargs 'rcutorture.fwd_progress=0' Sometimes all that is needed is a full set of kernel builds. This is -what the --buildonly argument does. +what the --buildonly parameter does. 
-Finally, the --trust-make argument allows each kernel build to reuse what -it can from the previous kernel build. +The --duration parameter can override the default run time of 30 minutes. +For example, ``--duration 2d`` would run for two days, ``--duration 3h`` +would run for three hours, ``--duration 5m`` would run for five minutes, +and ``--duration 45s`` would run for 45 seconds. This last can be useful +for tracking down rare boot-time failures. + +Finally, the --trust-make parameter allows each kernel build to reuse what +it can from the previous kernel build. Please note that without the +--trust-make parameter, your tags files may be demolished. There are additional more arcane arguments that are documented in the source code of the kvm.sh script. @@ -291,3 +302,73 @@ the following summary at the end of the run on a 12-CPU system:: TREE07 ------- 167347 GPs (30.9902/s) [rcu: g1079021 f0x0 ] n_max_cbs: 478732 CPU count limited from 16 to 12 TREE09 ------- 752238 GPs (139.303/s) [rcu: g13075057 f0x0 ] n_max_cbs: 99011 + + +Repeated Runs +============= + +Suppose that you are chasing down a rare boot-time failure. Although you +could use kvm.sh, doing so will rebuild the kernel on each run. If you +need (say) 1,000 runs to have confidence that you have fixed the bug, +these pointless rebuilds can become extremely annoying. + +This is why kvm-again.sh exists. + +Suppose that a previous kvm.sh run left its output in this directory:: + + tools/testing/selftests/rcutorture/res/2022.11.03-11.26.28 + +Then this run can be re-run without rebuilding as follows:: + + kvm-again.sh tools/testing/selftests/rcutorture/res/2022.11.03-11.26.28 + +A few of the original run's kvm.sh parameters may be overridden, perhaps +most notably --duration and --bootargs. 
For example:: + + kvm-again.sh tools/testing/selftests/rcutorture/res/2022.11.03-11.26.28 \ + --duration 45s + +would re-run the previous test, but for only 45 seconds, thus facilitating +tracking down the aforementioned rare boot-time failure. + + +Distributed Runs +================ + +Although kvm.sh is quite useful, its testing is confined to a single +system. It is not all that hard to use your favorite framework to cause +(say) 5 instances of kvm.sh to run on your 5 systems, but this will very +likely unnecessarily rebuild kernels. In addition, manually distributing +the desired rcutorture scenarios across the available systems can be +painstaking and error-prone. + +And this is why the kvm-remote.sh script exists. + +If the following command works:: + + ssh system0 date + +and if it also works for system1, system2, system3, system4, and system5, +and all of these systems have 64 CPUs, you can type:: + + kvm-remote.sh "system0 system1 system2 system3 system4 system5" \ + --cpus 64 --duration 8h --configs "5*CFLIST" + +This will build each default scenario's kernel on the local system, then +spread each of five instances of each scenario over the systems listed, +running each scenario for eight hours. At the end of the runs, the +results will be gathered, recorded, and printed. Most of the parameters +that kvm.sh will accept can be passed to kvm-remote.sh, but the list of +systems must come first. + +The kvm.sh ``--dryrun scenarios`` argument is useful for working out +how many scenarios may be run in one batch across a group of systems. + +You can also re-run a previous remote run in a manner similar to kvm.sh:: + + kvm-remote.sh "system0 system1 system2 system3 system4 system5" \ + tools/testing/selftests/rcutorture/res/2022.11.03-11.26.28-remote \ + --duration 24h + +In this case, most of the kvm-again.sh parameters may be supplied following +the pathname of the old run-results directory. 
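The ``--duration`` suffixes described in the torture.rst hunks above reduce to simple arithmetic, which can be sketched in shell as follows. This is a minimal illustration, not kvm.sh's actual parsing code: the helper name duration_to_seconds is hypothetical, and it accepts only the four suffixes (d, h, m, s) that the text mentions, rejecting anything else.

```shell
# Hypothetical helper (not part of kvm.sh): convert a --duration value
# such as 2d, 3h, 5m, or 45s into a number of seconds.
duration_to_seconds() {
	dur=$1
	num=${dur%[dhms]}	# strip one trailing unit letter, if present
	unit=${dur#"$num"}	# the letter that was stripped (empty if none)

	# The numeric part must be a non-empty string of decimal digits.
	case "$num" in
	''|*[!0-9]*) echo "bad duration: $dur" >&2; return 1 ;;
	esac

	# Drop leading zeros so shell arithmetic does not read the value as octal.
	while [ "${#num}" -gt 1 ] && [ "${num#0}" != "$num" ]; do
		num=${num#0}
	done

	case "$unit" in
	d) echo $((num * 86400)) ;;	# days
	h) echo $((num * 3600)) ;;	# hours
	m) echo $((num * 60)) ;;	# minutes
	s) echo "$num" ;;		# seconds
	*) echo "bad duration: $dur" >&2; return 1 ;;
	esac
}

duration_to_seconds 2d		# prints 172800
duration_to_seconds 45s		# prints 45
```

Something like this can help when estimating the cost of a batch: 1,000 kvm-again.sh runs at ``--duration 45s`` amount to 45,000 seconds of test time, not counting boot and shutdown overhead.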
From patchwork Thu Jan 5 00:09:49 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Paul E. McKenney" X-Patchwork-Id: 13089210
From: "Paul E. McKenney" To: rcu@vger.kernel.org Cc: linux-kernel@vger.kernel.org, kernel-team@meta.com, rostedt@goodmis.org, "Paul E. McKenney" Subject: [PATCH rcu 09/15] doc: Update UP.rst Date: Wed, 4 Jan 2023 16:09:49 -0800 Message-Id: <20230105000955.1767218-9-paulmck@kernel.org> X-Mailer: git-send-email 2.31.1.189.g2e36527f23 In-Reply-To: <20230105000945.GA1767128@paulmck-ThinkPad-P17-Gen-1> References: <20230105000945.GA1767128@paulmck-ThinkPad-P17-Gen-1> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: rcu@vger.kernel.org This commit updates UP.rst to reflect changes over the past few years, including the advent of userspace RCU libraries for constrained systems. Signed-off-by: Paul E. McKenney --- Documentation/RCU/UP.rst | 13 +++++++++++-- 1 file changed, 11 insertions(+), 2 deletions(-) diff --git a/Documentation/RCU/UP.rst b/Documentation/RCU/UP.rst index e26dda27430c8..8b20fd45f2558 100644 --- a/Documentation/RCU/UP.rst +++ b/Documentation/RCU/UP.rst @@ -38,7 +38,7 @@ by having call_rcu() directly invoke its arguments only if it was called from process context. However, this can fail in a similar manner. Suppose that an RCU-based algorithm again scans a linked list containing -elements A, B, and C in process contexts, but that it invokes a function +elements A, B, and C in process context, but that it invokes a function on each element as it is scanned. Suppose further that this function deletes element B from the list, then passes it to call_rcu() for deferred freeing. This may be a bit unconventional, but it is perfectly legal @@ -59,7 +59,8 @@ Example 3: Death by Deadlock Suppose that call_rcu() is invoked while holding a lock, and that the callback function must acquire this same lock. In this case, if call_rcu() were to directly invoke the callback, the result would -be self-deadlock. 
+be self-deadlock *even if* this invocation occurred from a later +call_rcu() invocation a full grace period later. In some cases, it would possible to restructure to code so that the call_rcu() is delayed until after the lock is released. However, @@ -85,6 +86,14 @@ Quick Quiz #2: :ref:`Answers to Quick Quiz ` +It is important to note that userspace RCU implementations *do* +permit call_rcu() to directly invoke callbacks, but only if a full +grace period has elapsed since those callbacks were queued. This is +the case because some userspace environments are extremely constrained. +Nevertheless, people writing userspace RCU implementations are strongly +encouraged to avoid invoking callbacks from call_rcu(), thus obtaining +the deadlock-avoidance benefits called out above. + Summary ------- From patchwork Thu Jan 5 00:09:50 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Paul E. McKenney" X-Patchwork-Id: 13089218
From: "Paul E. McKenney" To: rcu@vger.kernel.org Cc: linux-kernel@vger.kernel.org, kernel-team@meta.com, rostedt@goodmis.org, "Paul E. McKenney" Subject: [PATCH rcu 10/15] doc: Update rcu.rst URL to RCU publications Date: Wed, 4 Jan 2023 16:09:50 -0800 Message-Id: <20230105000955.1767218-10-paulmck@kernel.org> X-Mailer: git-send-email 2.31.1.189.g2e36527f23 In-Reply-To: <20230105000945.GA1767128@paulmck-ThinkPad-P17-Gen-1> References: <20230105000945.GA1767128@paulmck-ThinkPad-P17-Gen-1> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: rcu@vger.kernel.org Also add the more recent thicket of Google Documents. Signed-off-by: Paul E. McKenney --- Documentation/RCU/rcu.rst | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/Documentation/RCU/rcu.rst b/Documentation/RCU/rcu.rst index 381cb86f657d8..bf6617b330a74 100644 --- a/Documentation/RCU/rcu.rst +++ b/Documentation/RCU/rcu.rst @@ -89,4 +89,5 @@ Frequently Asked Questions - Where can I find more information on RCU? See the Documentation/RCU/RTFP.txt file. - Or point your browser at (http://www.rdrop.com/users/paulmck/RCU/). 
+ Or point your browser at (https://docs.google.com/document/d/1X0lThx8OK0ZgLMqVoXiR4ZrGURHrXK6NyLRbeXe3Xac/edit) + or (https://docs.google.com/document/d/1GCdQC8SDbb54W1shjEXqGZ0Rq8a6kIeYutdSIajfpLA/edit?usp=sharing). From patchwork Thu Jan 5 00:09:51 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Paul E. McKenney" X-Patchwork-Id: 13089216 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 2AD5BC54EBC for ; Thu, 5 Jan 2023 00:10:22 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S235485AbjAEAKT (ORCPT ); Wed, 4 Jan 2023 19:10:19 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:52524 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S235230AbjAEAKC (ORCPT ); Wed, 4 Jan 2023 19:10:02 -0500 Received: from dfw.source.kernel.org (dfw.source.kernel.org [IPv6:2604:1380:4641:c500::1]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 8B41F43A1C; Wed, 4 Jan 2023 16:09:59 -0800 (PST) Received: from smtp.kernel.org (relay.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by dfw.source.kernel.org (Postfix) with ESMTPS id 2131B618B3; Thu, 5 Jan 2023 00:09:59 +0000 (UTC) Received: by smtp.kernel.org (Postfix) with ESMTPSA id 7C9EDC433A8; Thu, 5 Jan 2023 00:09:57 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1672877397; bh=zIuR6b+M23FgJhA/VuiF85BlWyIsIRWHp9NsUOE91v0=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=elPQLO5e8H5+eFqhwK9LEoENtRiade+LxinnMZqzUoMWV0qLqr8+S++1iGnGp1pdb M9vJ8WBN0kT/t7ZOcqfbYJwKg9cBZPtp2X1u0dm3KBMr9DufCJ0GGMIhbVMseqC5RI 
EoiRmB0i1EaxecSzROwjLOg4iDyxVZ4Qw/23XEJ2VkRWLMEaSMD49D8JT65Ypw06+k 9dEDPQ1BTkIUQhYZUFu4fiKzlragpb166oJ6WLn4d+pUtBxyPTPMD+ZxqvipjrxwyB sw/Vunp88O3qA8wudFQRkiBwCShx1uvOs+c4oyFEg3Pl/H78asTvL3HdgHXCY+oVtr SDsueunfADNew== Received: by paulmck-ThinkPad-P17-Gen-1.home (Postfix, from userid 1000) id D69A45C1C77; Wed, 4 Jan 2023 16:09:56 -0800 (PST) From: "Paul E. McKenney" To: rcu@vger.kernel.org Cc: linux-kernel@vger.kernel.org, kernel-team@meta.com, rostedt@goodmis.org, "Paul E. McKenney" Subject: [PATCH rcu 11/15] doc: Update whatisRCU.rst Date: Wed, 4 Jan 2023 16:09:51 -0800 Message-Id: <20230105000955.1767218-11-paulmck@kernel.org> X-Mailer: git-send-email 2.31.1.189.g2e36527f23 In-Reply-To: <20230105000945.GA1767128@paulmck-ThinkPad-P17-Gen-1> References: <20230105000945.GA1767128@paulmck-ThinkPad-P17-Gen-1> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: rcu@vger.kernel.org This commit updates whatisRCU.rst with wordsmithing and updates provoked by the passage of time. Signed-off-by: Paul E. McKenney --- Documentation/RCU/whatisRCU.rst | 193 +++++++++++++++++++++----------- 1 file changed, 125 insertions(+), 68 deletions(-) diff --git a/Documentation/RCU/whatisRCU.rst b/Documentation/RCU/whatisRCU.rst index 1c747ac3f2c8e..2c5563a91998f 100644 --- a/Documentation/RCU/whatisRCU.rst +++ b/Documentation/RCU/whatisRCU.rst @@ -16,18 +16,23 @@ to start learning about RCU: | 6. The RCU API, 2019 Edition https://lwn.net/Articles/777036/ | 2019 Big API Table https://lwn.net/Articles/777165/ +For those preferring video: + +| 1. Unraveling RCU Mysteries: Fundamentals https://www.linuxfoundation.org/webinars/unraveling-rcu-usage-mysteries +| 2. Unraveling RCU Mysteries: Additional Use Cases https://www.linuxfoundation.org/webinars/unraveling-rcu-usage-mysteries-additional-use-cases + What is RCU? RCU is a synchronization mechanism that was added to the Linux kernel during the 2.5 development effort that is optimized for read-mostly -situations.
Although RCU is actually quite simple once you understand it, -getting there can sometimes be a challenge. Part of the problem is that -most of the past descriptions of RCU have been written with the mistaken -assumption that there is "one true way" to describe RCU. Instead, -the experience has been that different people must take different paths -to arrive at an understanding of RCU. This document provides several -different paths, as follows: +situations. Although RCU is actually quite simple, making effective use +of it requires you to think differently about your code. Another part +of the problem is the mistaken assumption that there is "one true way" to +describe and to use RCU. Instead, the experience has been that different +people must take different paths to arrive at an understanding of RCU, +depending on their experiences and use cases. This document provides +several different paths, as follows: :ref:`1. RCU OVERVIEW <1_whatisRCU>` @@ -157,34 +162,36 @@ rcu_read_lock() ^^^^^^^^^^^^^^^ void rcu_read_lock(void); - Used by a reader to inform the reclaimer that the reader is - entering an RCU read-side critical section. It is illegal - to block while in an RCU read-side critical section, though - kernels built with CONFIG_PREEMPT_RCU can preempt RCU - read-side critical sections. Any RCU-protected data structure - accessed during an RCU read-side critical section is guaranteed to - remain unreclaimed for the full duration of that critical section. - Reference counts may be used in conjunction with RCU to maintain - longer-term references to data structures. + This temporal primitive is used by a reader to inform the + reclaimer that the reader is entering an RCU read-side critical + section. It is illegal to block while in an RCU read-side + critical section, though kernels built with CONFIG_PREEMPT_RCU + can preempt RCU read-side critical sections. 
Any RCU-protected + data structure accessed during an RCU read-side critical section + is guaranteed to remain unreclaimed for the full duration of that + critical section. Reference counts may be used in conjunction + with RCU to maintain longer-term references to data structures. rcu_read_unlock() ^^^^^^^^^^^^^^^^^ void rcu_read_unlock(void); - Used by a reader to inform the reclaimer that the reader is - exiting an RCU read-side critical section. Note that RCU - read-side critical sections may be nested and/or overlapping. + This temporal primitive is used by a reader to inform the + reclaimer that the reader is exiting an RCU read-side critical + section. Note that RCU read-side critical sections may be nested + and/or overlapping. synchronize_rcu() ^^^^^^^^^^^^^^^^^ void synchronize_rcu(void); - Marks the end of updater code and the beginning of reclaimer - code. It does this by blocking until all pre-existing RCU - read-side critical sections on all CPUs have completed. - Note that synchronize_rcu() will **not** necessarily wait for - any subsequent RCU read-side critical sections to complete. - For example, consider the following sequence of events:: + This temporal primitive marks the end of updater code and the + beginning of reclaimer code. It does this by blocking until + all pre-existing RCU read-side critical sections on all CPUs + have completed. Note that synchronize_rcu() will **not** + necessarily wait for any subsequent RCU read-side critical + sections to complete. For example, consider the following + sequence of events:: CPU 0 CPU 1 CPU 2 ----------------- ------------------------- --------------- @@ -211,13 +218,13 @@ synchronize_rcu() to be useful in all but the most read-intensive situations, synchronize_rcu()'s overhead must also be quite small. - The call_rcu() API is a callback form of synchronize_rcu(), - and is described in more detail in a later section.
Instead of - blocking, it registers a function and argument which are invoked - after all ongoing RCU read-side critical sections have completed. - This callback variant is particularly useful in situations where - it is illegal to block or where update-side performance is - critically important. + The call_rcu() API is an asynchronous callback form of + synchronize_rcu(), and is described in more detail in a later + section. Instead of blocking, it registers a function and + argument which are invoked after all ongoing RCU read-side + critical sections have completed. This callback variant is + particularly useful in situations where it is illegal to block + or where update-side performance is critically important. However, the call_rcu() API should not be used lightly, as use of the synchronize_rcu() API generally results in simpler code. @@ -236,11 +243,13 @@ rcu_assign_pointer() would be cool to be able to declare a function in this manner. (Compiler experts will no doubt disagree.) - The updater uses this function to assign a new value to an + The updater uses this spatial macro to assign a new value to an RCU-protected pointer, in order to safely communicate the change - in value from the updater to the reader. This macro does not - evaluate to an rvalue, but it does execute any memory-barrier - instructions required for a given CPU architecture. + in value from the updater to the reader. This is a spatial (as + opposed to temporal) macro. It does not evaluate to an rvalue, + but it does execute any memory-barrier instructions required + for a given CPU architecture. Its ordering properties are those + of a store-release operation. Perhaps just as important, it serves to document (1) which pointers are protected by RCU and (2) the point at which a @@ -255,14 +264,15 @@ rcu_dereference() Like rcu_assign_pointer(), rcu_dereference() must be implemented as a macro.
- The reader uses rcu_dereference() to fetch an RCU-protected - pointer, which returns a value that may then be safely - dereferenced. Note that rcu_dereference() does not actually - dereference the pointer, instead, it protects the pointer for - later dereferencing. It also executes any needed memory-barrier - instructions for a given CPU architecture. Currently, only Alpha - needs memory barriers within rcu_dereference() -- on other CPUs, - it compiles to nothing, not even a compiler directive. + The reader uses the spatial rcu_dereference() macro to fetch + an RCU-protected pointer, which returns a value that may + then be safely dereferenced. Note that rcu_dereference() + does not actually dereference the pointer, instead, it + protects the pointer for later dereferencing. It also + executes any needed memory-barrier instructions for a given + CPU architecture. Currently, only Alpha needs memory barriers + within rcu_dereference() -- on other CPUs, it compiles to a + volatile load. Common coding practice uses rcu_dereference() to copy an RCU-protected pointer to a local variable, then dereferences @@ -355,12 +365,15 @@ reader, updater, and reclaimer. synchronize_rcu() & call_rcu() -The RCU infrastructure observes the time sequence of rcu_read_lock(), +The RCU infrastructure observes the temporal sequence of rcu_read_lock(), rcu_read_unlock(), synchronize_rcu(), and call_rcu() invocations in order to determine when (1) synchronize_rcu() invocations may return to their callers and (2) call_rcu() callbacks may be invoked. Efficient implementations of the RCU infrastructure make heavy use of batching in order to amortize their overhead over many uses of the corresponding APIs. +The rcu_assign_pointer() and rcu_dereference() invocations communicate +spatial changes via stores to and loads from the RCU-protected pointer in +question. There are at least three flavors of RCU usage in the Linux kernel. The diagram above shows the most common one. 
On the updater side, the rcu_assign_pointer(), @@ -392,7 +405,9 @@ b. RCU applied to networking data structures that may be subjected c. RCU applied to scheduler and interrupt/NMI-handler tasks. Again, most uses will be of (a). The (b) and (c) cases are important -for specialized uses, but are relatively uncommon. +for specialized uses, but are relatively uncommon. The SRCU, RCU-Tasks, +RCU-Tasks-Rude, and RCU-Tasks-Trace have similar relationships among +their assorted primitives. .. _3_whatisRCU: @@ -468,7 +483,7 @@ So, to sum up: - Within an RCU read-side critical section, use rcu_dereference() to dereference RCU-protected pointers. -- Use some solid scheme (such as locks or semaphores) to +- Use some solid design (such as locks or semaphores) to keep concurrent updates from interfering with each other. - Use rcu_assign_pointer() to update an RCU-protected pointer. @@ -579,6 +594,14 @@ to avoid having to write your own callback:: kfree_rcu(old_fp, rcu); +If the occasional sleep is permitted, the single-argument form may +be used, omitting the rcu_head structure from struct foo. + + kfree_rcu(old_fp); + +This variant of kfree_rcu() almost never blocks, but might do so by +invoking synchronize_rcu() in response to memory-allocation failure. + Again, see checklist.rst for additional rules governing the use of RCU. .. _5_whatisRCU: @@ -596,7 +619,7 @@ lacking both functionality and performance. However, they are useful in getting a feel for how RCU works. See kernel/rcu/update.c for a production-quality implementation, and see: - http://www.rdrop.com/users/paulmck/RCU + https://docs.google.com/document/d/1X0lThx8OK0ZgLMqVoXiR4ZrGURHrXK6NyLRbeXe3Xac/edit for papers describing the Linux kernel RCU implementation. 
The OLS'01 and OLS'02 papers are a good introduction, and the dissertation provides @@ -929,6 +952,8 @@ unfortunately any spinlock in a ``SLAB_TYPESAFE_BY_RCU`` object must be initialized after each and every call to kmem_cache_alloc(), which renders reference-free spinlock acquisition completely unsafe. Therefore, when using ``SLAB_TYPESAFE_BY_RCU``, make proper use of a reference counter. +(Those willing to use a kmem_cache constructor may also use locking, +including cache-friendly sequence locking.) With traditional reference counting -- such as that implemented by the kref library in Linux -- there is typically code that runs when the last @@ -1047,6 +1072,30 @@ sched:: rcu_read_lock_sched_held +RCU-Tasks:: + + Critical sections Grace period Barrier + + N/A call_rcu_tasks rcu_barrier_tasks + synchronize_rcu_tasks + + +RCU-Tasks-Rude:: + + Critical sections Grace period Barrier + + N/A call_rcu_tasks_rude rcu_barrier_tasks_rude + synchronize_rcu_tasks_rude + + +RCU-Tasks-Trace:: + + Critical sections Grace period Barrier + + rcu_read_lock_trace call_rcu_tasks_trace rcu_barrier_tasks_trace + rcu_read_unlock_trace synchronize_rcu_tasks_trace + + SRCU:: Critical sections Grace period Barrier @@ -1087,35 +1136,43 @@ list can be helpful: a. Will readers need to block? If so, you need SRCU. -b. What about the -rt patchset? If readers would need to block - in an non-rt kernel, you need SRCU. If readers would block - in a -rt kernel, but not in a non-rt kernel, SRCU is not - necessary. (The -rt patchset turns spinlocks into sleeplocks, - hence this distinction.) +b. Will readers need to block and are you doing tracing, for + example, ftrace or BPF? If so, you need RCU-tasks, + RCU-tasks-rude, and/or RCU-tasks-trace. + +c. What about the -rt patchset? If readers would need to block in + a non-rt kernel, you need SRCU. If readers would block when + acquiring spinlocks in a -rt kernel, but not in a non-rt kernel, + SRCU is not necessary.
(The -rt patchset turns spinlocks into + sleeplocks, hence this distinction.) -c. Do you need to treat NMI handlers, hardirq handlers, +d. Do you need to treat NMI handlers, hardirq handlers, and code segments with preemption disabled (whether via preempt_disable(), local_irq_save(), local_bh_disable(), or some other mechanism) as if they were explicit RCU readers? - If so, RCU-sched is the only choice that will work for you. - -d. Do you need RCU grace periods to complete even in the face - of softirq monopolization of one or more of the CPUs? For - example, is your code subject to network-based denial-of-service - attacks? If so, you should disable softirq across your readers, - for example, by using rcu_read_lock_bh(). - -e. Is your workload too update-intensive for normal use of + If so, RCU-sched readers are the only choice that will work + for you, but since about v4.20 you can use the vanilla RCU + update primitives. + +e. Do you need RCU grace periods to complete even in the face of + softirq monopolization of one or more of the CPUs? For example, + is your code subject to network-based denial-of-service attacks? + If so, you should disable softirq across your readers, for + example, by using rcu_read_lock_bh(). Since about v4.20 you + can use the vanilla RCU update primitives. + +f. Is your workload too update-intensive for normal use of RCU, but inappropriate for other synchronization mechanisms? If so, consider SLAB_TYPESAFE_BY_RCU (which was originally named SLAB_DESTROY_BY_RCU). But please be careful! -f. Do you need read-side critical sections that are respected - even though they are in the middle of the idle loop, during - user-mode execution, or on an offlined CPU? If so, SRCU is the - only choice that will work for you. +g. Do you need read-side critical sections that are respected even + on CPUs that are deep in the idle loop, during entry to or exit + from user-mode execution, or on an offlined CPU?
If so, SRCU + and RCU Tasks Trace are the only choices that will work for you, + with SRCU being strongly preferred in almost all cases. -g. Otherwise, use RCU. +h. Otherwise, use RCU. Of course, this all assumes that you have determined that RCU is in fact the right tool for your job. From patchwork Thu Jan 5 00:09:52 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Paul E. McKenney" X-Patchwork-Id: 13089221 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 29F82C53210 for ; Thu, 5 Jan 2023 00:10:28 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S235155AbjAEAKZ (ORCPT ); Wed, 4 Jan 2023 19:10:25 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:52772 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S235426AbjAEAKQ (ORCPT ); Wed, 4 Jan 2023 19:10:16 -0500 Received: from ams.source.kernel.org (ams.source.kernel.org [145.40.68.75]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 667A744347; Wed, 4 Jan 2023 16:10:03 -0800 (PST) Received: from smtp.kernel.org (relay.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ams.source.kernel.org (Postfix) with ESMTPS id D8D83B81984; Thu, 5 Jan 2023 00:09:59 +0000 (UTC) Received: by smtp.kernel.org (Postfix) with ESMTPSA id 89BC8C433AC; Thu, 5 Jan 2023 00:09:57 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1672877397; bh=C6HzkjFAA72iQ6ev+Vk91gXKITUtU7V20ZYXLoH/gA8=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=uIiqFL1tgkDtve0tSWcO9gciiwj029HJ1kZkVPw0gOpLA2IlP/PHZvBrND3zXB18l P/adILooJ5HQY8ScS9AD8Y2M95QfEj7CuCJRkJgnhf3Gr8ajx5FP46jSzCoyAcce9Y 
Vo699GpNpct5W6G15CRLwmctBsXMSq46A5zTGu8O8Uh1ZfCPb0VaH9Iwi31ZJPp+C7 X/1ylbNo9a0WIPXAQ1N5Qvex8j+pxxgBusof6vWO9ccycjs5FUdOsaa8+IOUCS3pJg MpCAxDUy/ELJMB2NW3hoiozlWgAzcdlb6m2nA+pXyAdxLoU9tUl5OJWTXKH3B7RwLB u1sUXNyTpZ0fw== Received: by paulmck-ThinkPad-P17-Gen-1.home (Postfix, from userid 1000) id D85835C1C78; Wed, 4 Jan 2023 16:09:56 -0800 (PST) From: "Paul E. McKenney" To: rcu@vger.kernel.org Cc: linux-kernel@vger.kernel.org, kernel-team@meta.com, rostedt@goodmis.org, Zhen Lei , Frederic Weisbecker , "Paul E . McKenney" Subject: [PATCH rcu 12/15] doc: Document CONFIG_RCU_CPU_STALL_CPUTIME=y stall information Date: Wed, 4 Jan 2023 16:09:52 -0800 Message-Id: <20230105000955.1767218-12-paulmck@kernel.org> X-Mailer: git-send-email 2.31.1.189.g2e36527f23 In-Reply-To: <20230105000945.GA1767128@paulmck-ThinkPad-P17-Gen-1> References: <20230105000945.GA1767128@paulmck-ThinkPad-P17-Gen-1> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: rcu@vger.kernel.org From: Zhen Lei This commit documents the additional RCU CPU stall warning output produced by kernels built with CONFIG_RCU_CPU_STALL_CPUTIME=y or booted with rcupdate.rcu_cpu_stall_cputime=1. [ paulmck: Apply wordsmithing. ] Signed-off-by: Zhen Lei Reviewed-by: Frederic Weisbecker Signed-off-by: Paul E. McKenney --- Documentation/RCU/stallwarn.rst | 88 +++++++++++++++++++++++++++++++++ 1 file changed, 88 insertions(+) diff --git a/Documentation/RCU/stallwarn.rst b/Documentation/RCU/stallwarn.rst index dfa4db8c0931e..c1e92dfef40d5 100644 --- a/Documentation/RCU/stallwarn.rst +++ b/Documentation/RCU/stallwarn.rst @@ -390,3 +390,91 @@ for example, "P3421". It is entirely possible to see stall warnings from normal and from expedited grace periods at about the same time during the same run. 
+ +RCU_CPU_STALL_CPUTIME +===================== + +In kernels built with CONFIG_RCU_CPU_STALL_CPUTIME=y or booted with +rcupdate.rcu_cpu_stall_cputime=1, the following additional information +is supplied with each RCU CPU stall warning:: + +rcu: hardirqs softirqs csw/system +rcu: number: 624 45 0 +rcu: cputime: 69 1 2425 ==> 2500(ms) + +These statistics are collected during the sampling period. The values +in row "number:" are the number of hard interrupts, number of soft +interrupts, and number of context switches on the stalled CPU. The +first three values in row "cputime:" indicate the CPU time in +milliseconds consumed by hard interrupts, soft interrupts, and tasks +on the stalled CPU. The last number is the measurement interval, again +in milliseconds. Because user-mode tasks normally do not cause RCU CPU +stalls, these tasks are typically kernel tasks, which is why only the +system CPU time is considered. + +The sampling period is shown as follows: +:<------------first timeout---------->:<-----second timeout----->: +:<--half timeout-->:<--half timeout-->: : +: :<--first period-->: : +: :<-----------second sampling period---------->: +: : : : +: snapshot time point 1st-stall 2nd-stall + + +The following describes four typical scenarios: + +1. A CPU looping with interrupts disabled.:: + + rcu: hardirqs softirqs csw/system + rcu: number: 0 0 0 + rcu: cputime: 0 0 0 ==> 2500(ms) + + Because interrupts have been disabled throughout the measurement + interval, there are no interrupts and no context switches. + Furthermore, because CPU time consumption was measured using interrupt + handlers, the system CPU consumption is misleadingly measured as zero. + This scenario will normally also have "(0 ticks this GP)" printed on + this CPU's summary line. + +2. A CPU looping with bottom halves disabled.
+ + This is similar to the previous example, but with non-zero number of + and CPU time consumed by hard interrupts, along with non-zero CPU + time consumed by in-kernel execution.:: + + rcu: hardirqs softirqs csw/system + rcu: number: 624 0 0 + rcu: cputime: 49 0 2446 ==> 2500(ms) + + The fact that there are zero softirqs gives a hint that these were + disabled, perhaps via local_bh_disable(). It is of course possible + that there were no softirqs, perhaps because all events that would + result in softirq execution are confined to other CPUs. In this case, + the diagnosis should continue as shown in the next example. + +3. A CPU looping with preemption disabled. + + Here, only the number of context switches is zero.:: + + rcu: hardirqs softirqs csw/system + rcu: number: 624 45 0 + rcu: cputime: 69 1 2425 ==> 2500(ms) + + This situation hints that the stalled CPU was looping with preemption + disabled. + +4. No looping, but massive hard and soft interrupts.:: + + rcu: hardirqs softirqs csw/system + rcu: number: xx xx 0 + rcu: cputime: xx xx 0 ==> 2500(ms) + + Here, the number and CPU time of hard interrupts are all non-zero, + but the number of context switches and the in-kernel CPU time consumed + are zero. The number and cputime of soft interrupts will usually be + non-zero, but could be zero, for example, if the CPU was spinning + within a single hard interrupt handler. + + If this type of RCU CPU stall warning can be reproduced, you can + narrow it down by looking at /proc/interrupts or by writing code to + trace each interrupt, for example, by referring to show_interrupts(). From patchwork Thu Jan 5 00:09:53 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Paul E. 
McKenney" X-Patchwork-Id: 13089211 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 6411FC53210 for ; Thu, 5 Jan 2023 00:10:08 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S235368AbjAEAKF (ORCPT ); Wed, 4 Jan 2023 19:10:05 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:52500 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S235064AbjAEAKB (ORCPT ); Wed, 4 Jan 2023 19:10:01 -0500 Received: from dfw.source.kernel.org (dfw.source.kernel.org [139.178.84.217]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id B2C9D43A1D; Wed, 4 Jan 2023 16:09:59 -0800 (PST) Received: from smtp.kernel.org (relay.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by dfw.source.kernel.org (Postfix) with ESMTPS id 3F4D8618A8; Thu, 5 Jan 2023 00:09:59 +0000 (UTC) Received: by smtp.kernel.org (Postfix) with ESMTPSA id 90BC9C433B0; Thu, 5 Jan 2023 00:09:57 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1672877397; bh=fcSC4bGknuyVNiSTBAA/pbbBIR+9W/cdpVsAscN6+Mc=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=PWhA9VAJbsgRQBpx1Anbzy+QLG23qhY/U+SxRPbJwOLe8kAQQrqoOtX5cyJjEjAN+ LfDYA+wZofCcoAg8JQDyKZypwMrlRvMI6WbEbmtlMipAInx9XLKzaUpgm7fxXzUYBq CfyuuFf4Nw98PbEDDIZ5epb401mhiFz4+v5/aDTsPuQ0ibrQSueRhfbDSh1cAhVoEq P2P7z5FFYySpcXA1Vo16kc+JWvUQr5bhV2P3DU9+mw47Bkv5AV2q5uMtO3sb8m0Yhh QJWc+RH0ybefuANKoQICZ/w2usva+7Z7jbg0TxKeEf1zhusWb068q8QbRAiiXPzE8y 5OofSaoKy6+ow== Received: by paulmck-ThinkPad-P17-Gen-1.home (Postfix, from userid 1000) id DA0075C1C89; Wed, 4 Jan 2023 16:09:56 -0800 (PST) From: "Paul E. 
McKenney" To: rcu@vger.kernel.org Cc: linux-kernel@vger.kernel.org, kernel-team@meta.com, rostedt@goodmis.org, Akira Yokosawa , "Paul E . McKenney" Subject: [PATCH rcu 13/15] docs/RCU/rcubarrier: Adjust 'Answer' parts of QQs as definition-lists Date: Wed, 4 Jan 2023 16:09:53 -0800 Message-Id: <20230105000955.1767218-13-paulmck@kernel.org> X-Mailer: git-send-email 2.31.1.189.g2e36527f23 In-Reply-To: <20230105000945.GA1767128@paulmck-ThinkPad-P17-Gen-1> References: <20230105000945.GA1767128@paulmck-ThinkPad-P17-Gen-1> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: rcu@vger.kernel.org From: Akira Yokosawa The "Answer" parts of QQs divert from proper format of definition-lists as described at [1] and are not rendered as such. Adjust them. Link: [1] https://docutils.sourceforge.io/docs/ref/rst/restructuredtext.html#definition-lists Signed-off-by: Akira Yokosawa Signed-off-by: Paul E. McKenney --- Documentation/RCU/rcubarrier.rst | 9 ++++++--- 1 file changed, 6 insertions(+), 3 deletions(-) diff --git a/Documentation/RCU/rcubarrier.rst b/Documentation/RCU/rcubarrier.rst index 5a643e5233d5f..9fb9ed7773552 100644 --- a/Documentation/RCU/rcubarrier.rst +++ b/Documentation/RCU/rcubarrier.rst @@ -296,7 +296,8 @@ Quick Quiz #1: Is there any other situation where rcu_barrier() might be required? -Answer: Interestingly enough, rcu_barrier() was not originally +Answer: + Interestingly enough, rcu_barrier() was not originally implemented for module unloading. Nikita Danilov was using RCU in a filesystem, which resulted in a similar situation at filesystem-unmount time. Dipankar Sarma coded up rcu_barrier() @@ -315,7 +316,8 @@ Quick Quiz #2: Why doesn't line 8 initialize rcu_barrier_cpu_count to zero, thereby avoiding the need for lines 9 and 10? 
-Answer: Suppose that the on_each_cpu() function shown on line 8 was +Answer: + Suppose that the on_each_cpu() function shown on line 8 was delayed, so that CPU 0's rcu_barrier_func() executed and the corresponding grace period elapsed, all before CPU 1's rcu_barrier_func() started executing. This would result in @@ -351,7 +353,8 @@ Quick Quiz #3: are delayed for a full grace period? Couldn't this result in rcu_barrier() returning prematurely? -Answer: This cannot happen. The reason is that on_each_cpu() has its last +Answer: + This cannot happen. The reason is that on_each_cpu() has its last argument, the wait flag, set to "1". This flag is passed through to smp_call_function() and further to smp_call_function_on_cpu(), causing this latter to spin until the cross-CPU invocation of From patchwork Thu Jan 5 00:09:54 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Paul E. McKenney" X-Patchwork-Id: 13089213 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 97EF4C53210 for ; Thu, 5 Jan 2023 00:10:10 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S235389AbjAEAKI (ORCPT ); Wed, 4 Jan 2023 19:10:08 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:52510 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S235155AbjAEAKB (ORCPT ); Wed, 4 Jan 2023 19:10:01 -0500 Received: from dfw.source.kernel.org (dfw.source.kernel.org [139.178.84.217]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id D8BE843A2D; Wed, 4 Jan 2023 16:09:59 -0800 (PST) Received: from smtp.kernel.org (relay.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by dfw.source.kernel.org 
(Postfix) with ESMTPS id 5BE396188F; Thu, 5 Jan 2023 00:09:59 +0000 (UTC) Received: by smtp.kernel.org (Postfix) with ESMTPSA id 92851C433AA; Thu, 5 Jan 2023 00:09:57 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1672877397; bh=ho1bxeu1IhGy97GpWjdnxhE2TC9LF5q8tcAiwvvPzeQ=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=NJ3zgjmmeVYRUZWKz3Hc3b4JA/1xhMBBNSCPzttQ53K1KG0VdevBIr+rUJdITVKnZ dHtVC2CfB3z+rzT4NEKQO++KJFVpV7lhTcRIPd7Xtw44wTFG2I8J6LGbeKK4lXASsP x/L9YC7/SOGNohGoUdj2eY7/DTOfW3Lj5ZG/QpoUlA2ifquhafpk9t8TwyIdxAvcou gtb/BQmMP5ov+Rvz7wW1MuFOAuUNgk5eSA684ynLxPYjMJQll70kEM57xJ8OY5iuD0 YxpXeTw/5Tf2nj3ulELuWxc5ylVvv4i+7WqQEP3CaXA3hJfL7U95mEBUlx+ccg2r/n iLKQm+rkenU+Q== Received: by paulmck-ThinkPad-P17-Gen-1.home (Postfix, from userid 1000) id DBD785C1C98; Wed, 4 Jan 2023 16:09:56 -0800 (PST) From: "Paul E. McKenney" To: rcu@vger.kernel.org Cc: linux-kernel@vger.kernel.org, kernel-team@meta.com, rostedt@goodmis.org, Akira Yokosawa , "Paul E . McKenney" Subject: [PATCH rcu 14/15] docs/RCU/rcubarrier: Right-adjust line numbers in code snippets Date: Wed, 4 Jan 2023 16:09:54 -0800 Message-Id: <20230105000955.1767218-14-paulmck@kernel.org> X-Mailer: git-send-email 2.31.1.189.g2e36527f23 In-Reply-To: <20230105000945.GA1767128@paulmck-ThinkPad-P17-Gen-1> References: <20230105000945.GA1767128@paulmck-ThinkPad-P17-Gen-1> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: rcu@vger.kernel.org From: Akira Yokosawa Line numbers in code snippets in rcubarrier.rst have been left adjusted since commit 4af498306ffd ("doc: Convert to rcubarrier.txt to ReST"). This might have been because right adjusting them had confused Sphinx. The rules around a literal block in reST are: - Need a blank line above it. - A line with the same indent level as the line above it is regarded as the end of it. Those line numbers can be right adjusted by keeping indents at two-digit numbers.
While at it, add some spaces between the column of line numbers and the code area for better readability. Signed-off-by: Akira Yokosawa Signed-off-by: Paul E. McKenney --- Documentation/RCU/rcubarrier.rst | 168 +++++++++++++++---------------- 1 file changed, 84 insertions(+), 84 deletions(-) diff --git a/Documentation/RCU/rcubarrier.rst b/Documentation/RCU/rcubarrier.rst index 9fb9ed7773552..6da7f66da2a80 100644 --- a/Documentation/RCU/rcubarrier.rst +++ b/Documentation/RCU/rcubarrier.rst @@ -72,9 +72,9 @@ For example, if it uses call_rcu(), call_srcu() on srcu_struct_1, and call_srcu() on srcu_struct_2, then the following three lines of code will be required when unloading:: - 1 rcu_barrier(); - 2 srcu_barrier(&srcu_struct_1); - 3 srcu_barrier(&srcu_struct_2); + 1 rcu_barrier(); + 2 srcu_barrier(&srcu_struct_1); + 3 srcu_barrier(&srcu_struct_2); If latency is of the essence, workqueues could be used to run these three functions concurrently. @@ -82,69 +82,69 @@ three functions concurrently. 
An ancient version of the rcutorture module makes use of rcu_barrier() in its exit function as follows:: - 1 static void - 2 rcu_torture_cleanup(void) - 3 { - 4 int i; - 5 - 6 fullstop = 1; - 7 if (shuffler_task != NULL) { - 8 VERBOSE_PRINTK_STRING("Stopping rcu_torture_shuffle task"); - 9 kthread_stop(shuffler_task); - 10 } - 11 shuffler_task = NULL; + 1 static void + 2 rcu_torture_cleanup(void) + 3 { + 4 int i; + 5 + 6 fullstop = 1; + 7 if (shuffler_task != NULL) { + 8 VERBOSE_PRINTK_STRING("Stopping rcu_torture_shuffle task"); + 9 kthread_stop(shuffler_task); + 10 } + 11 shuffler_task = NULL; 12 - 13 if (writer_task != NULL) { - 14 VERBOSE_PRINTK_STRING("Stopping rcu_torture_writer task"); - 15 kthread_stop(writer_task); - 16 } - 17 writer_task = NULL; + 13 if (writer_task != NULL) { + 14 VERBOSE_PRINTK_STRING("Stopping rcu_torture_writer task"); + 15 kthread_stop(writer_task); + 16 } + 17 writer_task = NULL; 18 - 19 if (reader_tasks != NULL) { - 20 for (i = 0; i < nrealreaders; i++) { - 21 if (reader_tasks[i] != NULL) { - 22 VERBOSE_PRINTK_STRING( - 23 "Stopping rcu_torture_reader task"); - 24 kthread_stop(reader_tasks[i]); - 25 } - 26 reader_tasks[i] = NULL; - 27 } - 28 kfree(reader_tasks); - 29 reader_tasks = NULL; - 30 } - 31 rcu_torture_current = NULL; + 19 if (reader_tasks != NULL) { + 20 for (i = 0; i < nrealreaders; i++) { + 21 if (reader_tasks[i] != NULL) { + 22 VERBOSE_PRINTK_STRING( + 23 "Stopping rcu_torture_reader task"); + 24 kthread_stop(reader_tasks[i]); + 25 } + 26 reader_tasks[i] = NULL; + 27 } + 28 kfree(reader_tasks); + 29 reader_tasks = NULL; + 30 } + 31 rcu_torture_current = NULL; 32 - 33 if (fakewriter_tasks != NULL) { - 34 for (i = 0; i < nfakewriters; i++) { - 35 if (fakewriter_tasks[i] != NULL) { - 36 VERBOSE_PRINTK_STRING( - 37 "Stopping rcu_torture_fakewriter task"); - 38 kthread_stop(fakewriter_tasks[i]); - 39 } - 40 fakewriter_tasks[i] = NULL; - 41 } - 42 kfree(fakewriter_tasks); - 43 fakewriter_tasks = NULL; - 44 } + 33 if 
(fakewriter_tasks != NULL) { + 34 for (i = 0; i < nfakewriters; i++) { + 35 if (fakewriter_tasks[i] != NULL) { + 36 VERBOSE_PRINTK_STRING( + 37 "Stopping rcu_torture_fakewriter task"); + 38 kthread_stop(fakewriter_tasks[i]); + 39 } + 40 fakewriter_tasks[i] = NULL; + 41 } + 42 kfree(fakewriter_tasks); + 43 fakewriter_tasks = NULL; + 44 } 45 - 46 if (stats_task != NULL) { - 47 VERBOSE_PRINTK_STRING("Stopping rcu_torture_stats task"); - 48 kthread_stop(stats_task); - 49 } - 50 stats_task = NULL; + 46 if (stats_task != NULL) { + 47 VERBOSE_PRINTK_STRING("Stopping rcu_torture_stats task"); + 48 kthread_stop(stats_task); + 49 } + 50 stats_task = NULL; 51 - 52 /* Wait for all RCU callbacks to fire. */ - 53 rcu_barrier(); + 52 /* Wait for all RCU callbacks to fire. */ + 53 rcu_barrier(); 54 - 55 rcu_torture_stats_print(); /* -After- the stats thread is stopped! */ + 55 rcu_torture_stats_print(); /* -After- the stats thread is stopped! */ 56 - 57 if (cur_ops->cleanup != NULL) - 58 cur_ops->cleanup(); - 59 if (atomic_read(&n_rcu_torture_error)) - 60 rcu_torture_print_module_parms("End of test: FAILURE"); - 61 else - 62 rcu_torture_print_module_parms("End of test: SUCCESS"); - 63 } + 57 if (cur_ops->cleanup != NULL) + 58 cur_ops->cleanup(); + 59 if (atomic_read(&n_rcu_torture_error)) + 60 rcu_torture_print_module_parms("End of test: FAILURE"); + 61 else + 62 rcu_torture_print_module_parms("End of test: SUCCESS"); + 63 } Line 6 sets a global variable that prevents any RCU callbacks from re-posting themselves. This will not be necessary in most cases, since @@ -193,16 +193,16 @@ which point, all earlier RCU callbacks are guaranteed to have completed. 
The original code for rcu_barrier() was roughly as follows:: - 1 void rcu_barrier(void) - 2 { - 3 BUG_ON(in_interrupt()); - 4 /* Take cpucontrol mutex to protect against CPU hotplug */ - 5 mutex_lock(&rcu_barrier_mutex); - 6 init_completion(&rcu_barrier_completion); - 7 atomic_set(&rcu_barrier_cpu_count, 1); - 8 on_each_cpu(rcu_barrier_func, NULL, 0, 1); - 9 if (atomic_dec_and_test(&rcu_barrier_cpu_count)) - 10 complete(&rcu_barrier_completion); + 1 void rcu_barrier(void) + 2 { + 3 BUG_ON(in_interrupt()); + 4 /* Take cpucontrol mutex to protect against CPU hotplug */ + 5 mutex_lock(&rcu_barrier_mutex); + 6 init_completion(&rcu_barrier_completion); + 7 atomic_set(&rcu_barrier_cpu_count, 1); + 8 on_each_cpu(rcu_barrier_func, NULL, 0, 1); + 9 if (atomic_dec_and_test(&rcu_barrier_cpu_count)) + 10 complete(&rcu_barrier_completion); 11 wait_for_completion(&rcu_barrier_completion); 12 mutex_unlock(&rcu_barrier_mutex); 13 } @@ -232,16 +232,16 @@ still gives the general idea. The rcu_barrier_func() runs on each CPU, where it invokes call_rcu() to post an RCU callback, as follows:: - 1 static void rcu_barrier_func(void *notused) - 2 { - 3 int cpu = smp_processor_id(); - 4 struct rcu_data *rdp = &per_cpu(rcu_data, cpu); - 5 struct rcu_head *head; - 6 - 7 head = &rdp->barrier; - 8 atomic_inc(&rcu_barrier_cpu_count); - 9 call_rcu(head, rcu_barrier_callback); - 10 } + 1 static void rcu_barrier_func(void *notused) + 2 { + 3 int cpu = smp_processor_id(); + 4 struct rcu_data *rdp = &per_cpu(rcu_data, cpu); + 5 struct rcu_head *head; + 6 + 7 head = &rdp->barrier; + 8 atomic_inc(&rcu_barrier_cpu_count); + 9 call_rcu(head, rcu_barrier_callback); + 10 } Lines 3 and 4 locate RCU's internal per-CPU rcu_data structure, which contains the struct rcu_head that needed for the later call to @@ -254,11 +254,11 @@ The rcu_barrier_callback() function simply atomically decrements the rcu_barrier_cpu_count variable and finalizes the completion when it reaches zero, as follows:: - 1 static void 
rcu_barrier_callback(struct rcu_head *notused) - 2 { - 3 if (atomic_dec_and_test(&rcu_barrier_cpu_count)) - 4 complete(&rcu_barrier_completion); - 5 } + 1 static void rcu_barrier_callback(struct rcu_head *notused) + 2 { + 3 if (atomic_dec_and_test(&rcu_barrier_cpu_count)) + 4 complete(&rcu_barrier_completion); + 5 } .. _rcubarrier_quiz_3:

From patchwork Thu Jan 5 00:09:55 2023
X-Patchwork-Submitter: "Paul E. McKenney"
X-Patchwork-Id: 13089219
From: "Paul E. McKenney"
To: rcu@vger.kernel.org
Cc: linux-kernel@vger.kernel.org, kernel-team@meta.com, rostedt@goodmis.org, Zhen Lei, Stephen Rothwell, Akira Yokosawa, "Paul E . McKenney"
Subject: [PATCH rcu 15/15] doc: Fix htmldocs build warnings of stallwarn.rst
Date: Wed, 4 Jan 2023 16:09:55 -0800
Message-Id: <20230105000955.1767218-15-paulmck@kernel.org>
In-Reply-To: <20230105000945.GA1767128@paulmck-ThinkPad-P17-Gen-1>
References: <20230105000945.GA1767128@paulmck-ThinkPad-P17-Gen-1>

From: Zhen Lei

Documentation/RCU/stallwarn.rst:
401: WARNING: Literal block expected; none found.
428: WARNING: Literal block expected; none found.
445: WARNING: Literal block expected; none found.
459: WARNING: Literal block expected; none found.
468: WARNING: Literal block expected; none found.

The literal block needs to be indented, so this commit adds two spaces to each line. In addition, ':', which is used as a boundary in the literal block, is replaced by '|'.

Link: https://lore.kernel.org/linux-next/20221123163255.48653674@canb.auug.org.au/
Fixes: 3d2788ba4573 ("doc: Document CONFIG_RCU_CPU_STALL_CPUTIME=y stall information")
Reported-by: Stephen Rothwell
Signed-off-by: Zhen Lei
Tested-by: Akira Yokosawa
Signed-off-by: Paul E.
McKenney --- Documentation/RCU/stallwarn.rst | 56 ++++++++++++++++++--------------- 1 file changed, 30 insertions(+), 26 deletions(-) diff --git a/Documentation/RCU/stallwarn.rst b/Documentation/RCU/stallwarn.rst index c1e92dfef40d5..ca7b7cd806a16 100644 --- a/Documentation/RCU/stallwarn.rst +++ b/Documentation/RCU/stallwarn.rst @@ -398,9 +398,9 @@ In kernels built with CONFIG_RCU_CPU_STALL_CPUTIME=y or booted with rcupdate.rcu_cpu_stall_cputime=1, the following additional information is supplied with each RCU CPU stall warning:: -rcu: hardirqs softirqs csw/system -rcu: number: 624 45 0 -rcu: cputime: 69 1 2425 ==> 2500(ms) + rcu: hardirqs softirqs csw/system + rcu: number: 624 45 0 + rcu: cputime: 69 1 2425 ==> 2500(ms) These statistics are collected during the sampling period. The values in row "number:" are the number of hard interrupts, number of soft @@ -412,22 +412,24 @@ in milliseconds. Because user-mode tasks normally do not cause RCU CPU stalls, these tasks are typically kernel tasks, which is why only the system CPU time are considered. -The sampling period is shown as follows: -:<------------first timeout---------->:<-----second timeout----->: -:<--half timeout-->:<--half timeout-->: : -: :<--first period-->: : -: :<-----------second sampling period---------->: -: : : : -: snapshot time point 1st-stall 2nd-stall +The sampling period is shown as follows:: + |<------------first timeout---------->|<-----second timeout----->| + |<--half timeout-->|<--half timeout-->| | + | |<--first period-->| | + | |<-----------second sampling period---------->| + | | | | + snapshot time point 1st-stall 2nd-stall The following describes four typical scenarios: -1. A CPU looping with interrupts disabled.:: +1. A CPU looping with interrupts disabled. 
- rcu: hardirqs softirqs csw/system - rcu: number: 0 0 0 - rcu: cputime: 0 0 0 ==> 2500(ms) + :: + + rcu: hardirqs softirqs csw/system + rcu: number: 0 0 0 + rcu: cputime: 0 0 0 ==> 2500(ms) Because interrupts have been disabled throughout the measurement interval, there are no interrupts and no context switches. @@ -440,11 +442,11 @@ The following describes four typical scenarios: This is similar to the previous example, but with non-zero number of and CPU time consumed by hard interrupts, along with non-zero CPU - time consumed by in-kernel execution.:: + time consumed by in-kernel execution:: - rcu: hardirqs softirqs csw/system - rcu: number: 624 0 0 - rcu: cputime: 49 0 2446 ==> 2500(ms) + rcu: hardirqs softirqs csw/system + rcu: number: 624 0 0 + rcu: cputime: 49 0 2446 ==> 2500(ms) The fact that there are zero softirqs gives a hint that these were disabled, perhaps via local_bh_disable(). It is of course possible @@ -454,20 +456,22 @@ The following describes four typical scenarios: 3. A CPU looping with preemption disabled. - Here, only the number of context switches is zero.:: + Here, only the number of context switches is zero:: - rcu: hardirqs softirqs csw/system - rcu: number: 624 45 0 - rcu: cputime: 69 1 2425 ==> 2500(ms) + rcu: hardirqs softirqs csw/system + rcu: number: 624 45 0 + rcu: cputime: 69 1 2425 ==> 2500(ms) This situation hints that the stalled CPU was looping with preemption disabled. -4. No looping, but massive hard and soft interrupts.:: +4. No looping, but massive hard and soft interrupts. + + :: - rcu: hardirqs softirqs csw/system - rcu: number: xx xx 0 - rcu: cputime: xx xx 0 ==> 2500(ms) + rcu: hardirqs softirqs csw/system + rcu: number: xx xx 0 + rcu: cputime: xx xx 0 ==> 2500(ms) Here, the number and CPU time of hard interrupts are all non-zero, but the number of context switches and the in-kernel CPU time consumed