From patchwork Thu Oct 19 23:35:40 2023
X-Patchwork-Submitter: Frederic Weisbecker
X-Patchwork-Id: 13429939
From: Frederic Weisbecker
To: LKML
Cc: Frederic Weisbecker, Boqun Feng, Joel Fernandes, Josh Triplett,
    Mathieu Desnoyers, Neeraj Upadhyay, Paul E. McKenney, Steven Rostedt,
    Uladzislau Rezki, rcu, Zqiang, Lai Jiangshan, Liam R. Howlett,
    Peter Zijlstra, Sebastian Siewior, Thomas Gleixner
Subject: [PATCH 1/4] softirq: Rename __raise_softirq_irqoff() to raise_softirq_no_wake()
Date: Fri, 20 Oct 2023 01:35:40 +0200
Message-Id: <20231019233543.1243121-2-frederic@kernel.org>
In-Reply-To: <20231019233543.1243121-1-frederic@kernel.org>
References: <20231019233543.1243121-1-frederic@kernel.org>
X-Mailing-List: rcu@vger.kernel.org

This makes the purpose of this function clearer.

Fixes: cff9b2332ab7 ("kernel/sched: Modify initial boot task idle setup")
Cc: Liam R. Howlett
Cc: Peter Zijlstra (Intel)
Cc: Sebastian Siewior
Cc: Thomas Gleixner
Signed-off-by: Frederic Weisbecker
---
 block/blk-mq.c            | 2 +-
 include/linux/interrupt.h | 2 +-
 kernel/softirq.c          | 6 +++---
 lib/irq_poll.c            | 4 ++--
 net/core/dev.c            | 8 ++++----
 5 files changed, 11 insertions(+), 11 deletions(-)

diff --git a/block/blk-mq.c b/block/blk-mq.c
index 1fafd54dce3c..1bda40a2aa29 100644
--- a/block/blk-mq.c
+++ b/block/blk-mq.c
@@ -1144,7 +1144,7 @@ static int blk_softirq_cpu_dead(unsigned int cpu)
 
 static void __blk_mq_complete_request_remote(void *data)
 {
-	__raise_softirq_irqoff(BLOCK_SOFTIRQ);
+	raise_softirq_no_wake(BLOCK_SOFTIRQ);
 }
 
 static inline bool blk_mq_complete_need_ipi(struct request *rq)
diff --git a/include/linux/interrupt.h b/include/linux/interrupt.h
index 76121c2bb4f8..558a1a329da9 100644
--- a/include/linux/interrupt.h
+++ b/include/linux/interrupt.h
@@ -604,7 +604,7 @@ static inline void do_softirq_post_smp_call_flush(unsigned int unused)
 
 extern void open_softirq(int nr, void (*action)(struct softirq_action *));
 extern void softirq_init(void);
-extern void __raise_softirq_irqoff(unsigned int nr);
+extern void raise_softirq_no_wake(unsigned int nr);
 
 extern void raise_softirq_irqoff(unsigned int nr);
 extern void raise_softirq(unsigned int nr);
diff --git a/kernel/softirq.c b/kernel/softirq.c
index 210cf5f8d92c..acfed6f3701d 100644
--- a/kernel/softirq.c
+++ b/kernel/softirq.c
@@ -664,7 +664,7 @@ void irq_exit(void)
  */
 inline void raise_softirq_irqoff(unsigned int nr)
 {
-	__raise_softirq_irqoff(nr);
+	raise_softirq_no_wake(nr);
 
 	/*
 	 * If we're in an interrupt or softirq, we're done
@@ -688,7 +688,7 @@ void raise_softirq(unsigned int nr)
 	local_irq_restore(flags);
 }
 
-void __raise_softirq_irqoff(unsigned int nr)
+void raise_softirq_no_wake(unsigned int nr)
 {
 	lockdep_assert_irqs_disabled();
 	trace_softirq_raise(nr);
@@ -795,7 +795,7 @@ static void tasklet_action_common(struct softirq_action *a,
 			t->next = NULL;
 			*tl_head->tail = t;
 			tl_head->tail = &t->next;
-			__raise_softirq_irqoff(softirq_nr);
+			raise_softirq_no_wake(softirq_nr);
 			local_irq_enable();
 		}
 	}
diff --git a/lib/irq_poll.c b/lib/irq_poll.c
index 2d5329a42105..193cd847fd8f 100644
--- a/lib/irq_poll.c
+++ b/lib/irq_poll.c
@@ -130,7 +130,7 @@ static void __latent_entropy irq_poll_softirq(struct softirq_action *h)
 	}
 
 	if (rearm)
-		__raise_softirq_irqoff(IRQ_POLL_SOFTIRQ);
+		raise_softirq_no_wake(IRQ_POLL_SOFTIRQ);
 
 	local_irq_enable();
 }
@@ -197,7 +197,7 @@ static int irq_poll_cpu_dead(unsigned int cpu)
 	local_irq_disable();
 	list_splice_init(&per_cpu(blk_cpu_iopoll, cpu),
 			 this_cpu_ptr(&blk_cpu_iopoll));
-	__raise_softirq_irqoff(IRQ_POLL_SOFTIRQ);
+	raise_softirq_no_wake(IRQ_POLL_SOFTIRQ);
 	local_irq_enable();
 	local_bh_enable();
 
diff --git a/net/core/dev.c b/net/core/dev.c
index 85df22f05c38..6f4622cc8939 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -4459,7 +4459,7 @@ static inline void ____napi_schedule(struct softnet_data *sd,
 	 * we have to raise NET_RX_SOFTIRQ.
 	 */
 	if (!sd->in_net_rx_action)
-		__raise_softirq_irqoff(NET_RX_SOFTIRQ);
+		raise_softirq_no_wake(NET_RX_SOFTIRQ);
 }
 
 #ifdef CONFIG_RPS
@@ -4678,7 +4678,7 @@ static void trigger_rx_softirq(void *data)
 {
 	struct softnet_data *sd = data;
 
-	__raise_softirq_irqoff(NET_RX_SOFTIRQ);
+	raise_softirq_no_wake(NET_RX_SOFTIRQ);
 	smp_store_release(&sd->defer_ipi_scheduled, 0);
 }
 
@@ -4705,7 +4705,7 @@ static void napi_schedule_rps(struct softnet_data *sd)
 		 * we have to raise NET_RX_SOFTIRQ.
 		 */
		if (!mysd->in_net_rx_action && !mysd->in_napi_threaded_poll)
-			__raise_softirq_irqoff(NET_RX_SOFTIRQ);
+			raise_softirq_no_wake(NET_RX_SOFTIRQ);
 		return;
 	}
 #endif /* CONFIG_RPS */
@@ -6743,7 +6743,7 @@ static __latent_entropy void net_rx_action(struct softirq_action *h)
 		list_splice_tail(&repoll, &list);
 		list_splice(&list, &sd->poll_list);
 		if (!list_empty(&sd->poll_list))
-			__raise_softirq_irqoff(NET_RX_SOFTIRQ);
+			raise_softirq_no_wake(NET_RX_SOFTIRQ);
 		else
 			sd->in_net_rx_action = false;
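A note on the renamed helper, for readers of the series: raise_softirq_no_wake()
keeps the exact semantics of __raise_softirq_irqoff(). It requires interrupts to
be disabled and only marks the vector pending, so the caller must be in a context
where the pending bit is guaranteed to be serviced later, typically on hardirq
exit or at the end of a softirq-disabled section. A minimal sketch of such a
caller, modeled on the __blk_mq_complete_request_remote() hunk above; the function
name here is made up for illustration:

	#include <linux/interrupt.h>

	/*
	 * Illustration only (requires this series): an IPI callback queued
	 * with smp_call_function_single_async() runs in hardirq context with
	 * IRQs disabled, so it can simply mark the vector pending. The
	 * softirq is then serviced on hardirq exit, no ksoftirqd wakeup
	 * needed.
	 */
	static void example_remote_complete(void *data)
	{
		raise_softirq_no_wake(BLOCK_SOFTIRQ);
	}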
From patchwork Thu Oct 19 23:35:41 2023
X-Patchwork-Submitter: Frederic Weisbecker
X-Patchwork-Id: 13429940
From: Frederic Weisbecker
To: LKML
Cc: Frederic Weisbecker, Boqun Feng, Joel Fernandes, Josh Triplett,
    Mathieu Desnoyers, Neeraj Upadhyay, Paul E. McKenney, Steven Rostedt,
    Uladzislau Rezki, rcu, Zqiang, Lai Jiangshan, Liam R. Howlett,
    Peter Zijlstra, Sebastian Siewior, Thomas Gleixner
Subject: [PATCH 2/4] softirq: Introduce raise_ksoftirqd_irqoff()
Date: Fri, 20 Oct 2023 01:35:41 +0200
Message-Id: <20231019233543.1243121-3-frederic@kernel.org>
In-Reply-To: <20231019233543.1243121-1-frederic@kernel.org>
References: <20231019233543.1243121-1-frederic@kernel.org>
X-Mailing-List: rcu@vger.kernel.org

Provide a function to raise a softirq vector and force the wakeup of
ksoftirqd along the way, irrespective of the current interrupt context.
This is going to be used by rcutiny to fix and optimize the triggering
of quiescent states from idle.

Fixes: cff9b2332ab7 ("kernel/sched: Modify initial boot task idle setup")
Cc: Liam R. Howlett
Cc: Peter Zijlstra (Intel)
Cc: Sebastian Siewior
Cc: Thomas Gleixner
Signed-off-by: Frederic Weisbecker
---
 include/linux/interrupt.h |  1 +
 kernel/softirq.c          | 71 +++++++++++++++++++++++----------------
 2 files changed, 43 insertions(+), 29 deletions(-)

diff --git a/include/linux/interrupt.h b/include/linux/interrupt.h
index 558a1a329da9..301d2956e746 100644
--- a/include/linux/interrupt.h
+++ b/include/linux/interrupt.h
@@ -608,6 +608,7 @@
 extern void raise_softirq_no_wake(unsigned int nr);
 
 extern void raise_softirq_irqoff(unsigned int nr);
 extern void raise_softirq(unsigned int nr);
+extern void raise_ksoftirqd_irqoff(unsigned int nr);
 
 DECLARE_PER_CPU(struct task_struct *, ksoftirqd);
diff --git a/kernel/softirq.c b/kernel/softirq.c
index acfed6f3701d..9c29a8ced1c3 100644
--- a/kernel/softirq.c
+++ b/kernel/softirq.c
@@ -659,35 +659,6 @@ void irq_exit(void)
 	lockdep_hardirq_exit();
 }
 
-/*
- * This function must run with irqs disabled!
- */
-inline void raise_softirq_irqoff(unsigned int nr)
-{
-	raise_softirq_no_wake(nr);
-
-	/*
-	 * If we're in an interrupt or softirq, we're done
-	 * (this also catches softirq-disabled code). We will
-	 * actually run the softirq once we return from
-	 * the irq or softirq.
-	 *
-	 * Otherwise we wake up ksoftirqd to make sure we
-	 * schedule the softirq soon.
-	 */
-	if (!in_interrupt() && should_wake_ksoftirqd())
-		wakeup_softirqd();
-}
-
-void raise_softirq(unsigned int nr)
-{
-	unsigned long flags;
-
-	local_irq_save(flags);
-	raise_softirq_irqoff(nr);
-	local_irq_restore(flags);
-}
-
 void raise_softirq_no_wake(unsigned int nr)
 {
 	lockdep_assert_irqs_disabled();
@@ -695,6 +666,48 @@ void raise_softirq_no_wake(unsigned int nr)
 	or_softirq_pending(1UL << nr);
 }
 
+/*
+ * This function must run with irqs disabled!
+ */
+static inline void __raise_softirq_irqoff(unsigned int nr, bool threaded)
+{
+	raise_softirq_no_wake(nr);
+
+	if (threaded && should_wake_ksoftirqd())
+		wakeup_softirqd();
+}
+
+/*
+ * This function must run with irqs disabled!
+ */
+inline void raise_softirq_irqoff(unsigned int nr)
+{
+	bool threaded;
+	/*
+	 * If in an interrupt or softirq (servicing or disabled
+	 * section), the vector will be handled at the end of
+	 * the interrupt or softirq servicing/disabled section.
+	 * Otherwise the vector must rely on ksoftirqd.
+	 */
+	threaded = !in_interrupt();
+
+	__raise_softirq_irqoff(nr, threaded);
+}
+
+void raise_softirq(unsigned int nr)
+{
+	unsigned long flags;
+
+	local_irq_save(flags);
+	raise_softirq_irqoff(nr);
+	local_irq_restore(flags);
+}
+
+void raise_ksoftirqd_irqoff(unsigned int nr)
+{
+	__raise_softirq_irqoff(nr, true);
+}
+
 void open_softirq(int nr, void (*action)(struct softirq_action *))
 {
 	softirq_vec[nr].action = action;
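For contrast between the two wakeup policies above: raise_softirq_irqoff()
skips the ksoftirqd wakeup whenever in_interrupt() is true, while the new
raise_ksoftirqd_irqoff() requests the wakeup regardless of the interrupt
context (both remain subject to should_wake_ksoftirqd()). A minimal sketch of
a task-context caller, mirroring what the next patch does for RCU_SOFTIRQ; the
wrapper name is hypothetical:

	#include <linux/interrupt.h>
	#include <linux/irqflags.h>

	/*
	 * Illustration only (requires this series): pend a vector from task
	 * context and make sure ksoftirqd will pick it up.
	 */
	static void example_kick_vector(unsigned int nr)
	{
		unsigned long flags;

		local_irq_save(flags);		/* raise_ksoftirqd_irqoff() wants IRQs off */
		raise_ksoftirqd_irqoff(nr);	/* pending bit + ksoftirqd wakeup */
		local_irq_restore(flags);
	}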
From patchwork Thu Oct 19 23:35:42 2023
X-Patchwork-Submitter: Frederic Weisbecker
X-Patchwork-Id: 13429941
From: Frederic Weisbecker
To: LKML
Cc: Frederic Weisbecker, Boqun Feng, Joel Fernandes, Josh Triplett,
    Mathieu Desnoyers, Neeraj Upadhyay, Paul E. McKenney, Steven Rostedt,
    Uladzislau Rezki, rcu, Zqiang, Lai Jiangshan, Liam R. Howlett,
    Peter Zijlstra, Sebastian Siewior, Thomas Gleixner
Subject: [PATCH 3/4] rcu: Make tiny RCU use ksoftirqd to trigger a QS from idle
Date: Fri, 20 Oct 2023 01:35:42 +0200
Message-Id: <20231019233543.1243121-4-frederic@kernel.org>
In-Reply-To: <20231019233543.1243121-1-frederic@kernel.org>
References: <20231019233543.1243121-1-frederic@kernel.org>
X-Mailing-List: rcu@vger.kernel.org

The commit:

	cff9b2332ab7 ("kernel/sched: Modify initial boot task idle setup")

fixed an issue where rcutiny would request a quiescent state by setting
TIF_NEED_RESCHED in early boot, when init/0 already has the PF_IDLE flag
set but interrupts aren't enabled yet. A subsequent call to
cond_resched() would then enable IRQs too early.

When callbacks are enqueued from idle, RCU currently performs the
following:

1) Call resched_cpu() to trigger an exit from idle and go through the
   scheduler, which calls rcu_note_context_switch() -> rcu_qs().

2) rcu_qs() notes the quiescent state and raises RCU_SOFTIRQ if a
   callback is queued, waking up ksoftirqd since it isn't called from
   an interrupt.

However the call to resched_cpu() can opportunistically be replaced, and
optimized, by raising RCU_SOFTIRQ and forcing a ksoftirqd wakeup instead.

It's worth noting that RCU grace-period polling from idle then becomes
suboptimal, but such a use case can be considered very rare or even
non-existent.

The advantage of this optimization is that it also works if PF_IDLE is
set early, because ksoftirqd is created well after IRQs are enabled on
boot and can't be woken up before its creation. If
raise_ksoftirqd_irqoff() is called after the first scheduling point but
before ksoftirqd is created, nearby voluntary schedule() calls are
expected to provide the desired quiescent state, and in the worst case
the first run of ksoftirqd happens soon enough, during the first
initcalls.

Fixes: cff9b2332ab7 ("kernel/sched: Modify initial boot task idle setup")
Cc: Liam R. Howlett
Cc: Peter Zijlstra (Intel)
Cc: Sebastian Siewior
Cc: Thomas Gleixner
Signed-off-by: Frederic Weisbecker
Reviewed-by: Paul E. McKenney
---
 kernel/rcu/tiny.c | 21 ++++++++++++++++-----
 1 file changed, 16 insertions(+), 5 deletions(-)

diff --git a/kernel/rcu/tiny.c b/kernel/rcu/tiny.c
index fec804b79080..9460e4e9d84c 100644
--- a/kernel/rcu/tiny.c
+++ b/kernel/rcu/tiny.c
@@ -190,12 +190,15 @@ void call_rcu(struct rcu_head *head, rcu_callback_t func)
 
 	local_irq_save(flags);
 	*rcu_ctrlblk.curtail = head;
 	rcu_ctrlblk.curtail = &head->next;
-	local_irq_restore(flags);
 
 	if (unlikely(is_idle_task(current))) {
-		/* force scheduling for rcu_qs() */
-		resched_cpu(0);
+		/*
+		 * Force resched to trigger a QS and handle callbacks right after.
+		 * This also takes care of avoiding too early rescheduling on boot.
+		 */
+		raise_ksoftirqd_irqoff(RCU_SOFTIRQ);
 	}
+	local_irq_restore(flags);
 }
 EXPORT_SYMBOL_GPL(call_rcu);
@@ -228,8 +231,16 @@ unsigned long start_poll_synchronize_rcu(void)
 	unsigned long gp_seq = get_state_synchronize_rcu();
 
 	if (unlikely(is_idle_task(current))) {
-		/* force scheduling for rcu_qs() */
-		resched_cpu(0);
+		unsigned long flags;
+
+		/*
+		 * Force resched to trigger a QS. This also takes care of avoiding
+		 * too early rescheduling on boot. It's suboptimal but GP
+		 * polling on idle isn't much expected as a usecase.
+		 */
+		local_irq_save(flags);
+		raise_ksoftirqd_irqoff(RCU_SOFTIRQ);
+		local_irq_restore(flags);
 	}
 	return gp_seq;
 }
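The grace-period polling that the above message calls suboptimal refers to the
start_poll_synchronize_rcu()/poll_state_synchronize_rcu() pair. A minimal
sketch of that pattern; the caller and cookie variable are hypothetical:

	#include <linux/rcupdate.h>
	#include <linux/types.h>

	static unsigned long example_gp_cookie;

	/* Start (or piggyback on) a grace period and remember its cookie. */
	static void example_begin_wait(void)
	{
		example_gp_cookie = start_poll_synchronize_rcu();
	}

	/* Poll later: true once a full grace period has elapsed since then. */
	static bool example_wait_done(void)
	{
		return poll_state_synchronize_rcu(example_gp_cookie);
	}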
From patchwork Thu Oct 19 23:35:43 2023
X-Patchwork-Submitter: Frederic Weisbecker
X-Patchwork-Id: 13429942
From: Frederic Weisbecker
To: LKML
Cc: Frederic Weisbecker, Boqun Feng, Joel Fernandes, Josh Triplett,
    Mathieu Desnoyers, Neeraj Upadhyay, Paul E. McKenney, Steven Rostedt,
    Uladzislau Rezki, rcu, Zqiang, Lai Jiangshan, Liam R. Howlett,
    Peter Zijlstra, Sebastian Siewior, Thomas Gleixner
Subject: [PATCH 4/4] Revert "kernel/sched: Modify initial boot task idle setup"
Date: Fri, 20 Oct 2023 01:35:43 +0200
Message-Id: <20231019233543.1243121-5-frederic@kernel.org>
In-Reply-To: <20231019233543.1243121-1-frederic@kernel.org>
References: <20231019233543.1243121-1-frederic@kernel.org>
X-Mailing-List: rcu@vger.kernel.org

Now that rcutiny can deal with early boot PF_IDLE setting, revert commit
cff9b2332ab762b7e0586c793c431a8f2ea4db04. This fixes several subtle
issues that commit introduced in RCU-tasks(-trace):

1) RCU-tasks stalls when:

   1.1 A grace period is started before init/0 has had a chance to set
       PF_IDLE, keeping it stuck in the holdout list until idle ever
       schedules.

   1.2 A grace period is started while some possible CPUs have never
       been online, keeping their idle tasks stuck in the holdout list
       until those CPUs ever boot up.

   1.3 Similar to 1.1 but with secondary CPUs: a grace period is started
       concurrently with a secondary CPU booting, putting its idle task
       in the holdout list because PF_IDLE isn't yet observed on it. It
       then stays stuck in the holdout list until that CPU ever
       schedules. The effect is mitigated here by the smpboot kthreads
       and the hotplug AP thread that must run to bring the CPU up.

2) A spurious warning in RCU-tasks-trace, which assumes that an offline
   CPU's idle task always has PF_IDLE set.

More issues related to PF_IDLE have been found in RCU-tasks, but they
should be fixed by later changes, as they are not regressions:

3) The RCU-tasks semantics consider the idle loop as a quiescent state,
   however:

   3.1 The boot code preceding the idle entry is included in this
       quiescent state, especially after the completion of kthreadd_done,
       after which init/1 can launch userspace concurrently. The window
       before PF_IDLE is set is tiny, but it exists.

   3.2 Similarly, the boot code preceding the idle entry on secondary
       CPUs is wrongly accounted as an RCU-tasks quiescent state.

Fixes: cff9b2332ab7 ("kernel/sched: Modify initial boot task idle setup")
Cc: Liam R. Howlett
Cc: Peter Zijlstra (Intel)
Cc: Sebastian Siewior
Cc: Thomas Gleixner
Signed-off-by: Frederic Weisbecker
---
 kernel/sched/core.c | 2 +-
 kernel/sched/idle.c | 1 -
 2 files changed, 1 insertion(+), 2 deletions(-)

diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index ad960f97e4e1..b02dcbe98024 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -9269,7 +9269,7 @@ void __init init_idle(struct task_struct *idle, int cpu)
 	 * PF_KTHREAD should already be set at this point; regardless, make it
 	 * look like a proper per-CPU kthread.
 	 */
-	idle->flags |= PF_KTHREAD | PF_NO_SETAFFINITY;
+	idle->flags |= PF_IDLE | PF_KTHREAD | PF_NO_SETAFFINITY;
 	kthread_set_per_cpu(idle, cpu);
 
 #ifdef CONFIG_SMP
diff --git a/kernel/sched/idle.c b/kernel/sched/idle.c
index 5007b25c5bc6..342f58a329f5 100644
--- a/kernel/sched/idle.c
+++ b/kernel/sched/idle.c
@@ -373,7 +373,6 @@ EXPORT_SYMBOL_GPL(play_idle_precise);
 
 void cpu_startup_entry(enum cpuhp_state state)
 {
-	current->flags |= PF_IDLE;
 	arch_cpu_idle_prepare();
 	cpuhp_online_idle(state);
 	while (1)