From patchwork Sat Oct 22 12:45:22 2022
X-Patchwork-Submitter: "Leizhen (ThunderTown)"
X-Patchwork-Id: 13015957
From: Zhen Lei
To: "Paul E. McKenney", Frederic Weisbecker, Neeraj Upadhyay,
    Josh Triplett, Steven Rostedt, Mathieu Desnoyers, Lai Jiangshan,
    Joel Fernandes
CC: Zhen Lei
Subject: [PATCH v2 0/3] rcu: Add RCU stall diagnosis information
Date: Sat, 22 Oct 2022 20:45:22 +0800
Message-ID: <20221022124525.2080-1-thunder.leizhen@huawei.com>
X-Mailing-List: rcu@vger.kernel.org

v1 --> v2:
1. Fixed a bug: if the RCU stall is detected by another CPU,
   kcpustat_this_cpu cannot be used, because it refers to the detecting
   CPU rather than the stalled one.
@@ -451,7 +451,7 @@ static void print_cpu_stat_info(int cpu)
 	if (r->gp_seq != rdp->gp_seq)
 		return;
-	cpustat = kcpustat_this_cpu->cpustat;
+	cpustat = kcpustat_cpu(cpu).cpustat;
2. Moved the start point of the statistics from rcu_stall_kick_kthreads()
   to rcu_implicit_dynticks_qs(), removing the dependency on irq_work.

v1:
In some extreme cases, such as an I/O pressure test, CPU usage may reach
100%, causing an RCU stall. In this case, the printed information about
the current task is not useful. Displaying the number and CPU usage of
hard interrupts, soft interrupts, and context switches generated within
half of the RCU CPU stall timeout can help us make a general judgment.
In other cases, we can preliminarily determine whether an infinite loop
occurs while local_irq, local_bh or preempt is disabled.

Zhen Lei (3):
  sched: Add helper kstat_cpu_softirqs_sum()
  sched: Add helper nr_context_switches_cpu()
  rcu: Add RCU stall diagnosis information

 include/linux/kernel_stat.h | 12 ++++++++++++
 kernel/rcu/tree.c           | 16 ++++++++++++++++
 kernel/rcu/tree.h           | 11 +++++++++++
 kernel/rcu/tree_stall.h     | 28 ++++++++++++++++++++++++++++
 kernel/sched/core.c         |  5 +++++
 5 files changed, 72 insertions(+)
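
For illustration only, the snapshot-and-delta idea described above can be
sketched in userspace C. This is not the code from the series: struct
cpu_counters, record_activity(), snapshot_cpu(), stall_delta() and
print_stall_info() are hypothetical names, and plain arrays stand in for
the kernel's per-CPU data. The sketch takes a snapshot of one CPU's
counters partway through the stall timeout, then reports the difference
when the stall is detected. The counters are always indexed by the
stalled CPU's number, which is the point of the v2 fix, since the
detecting CPU may be a different one.

```c
#include <assert.h>
#include <stdio.h>

#define NR_CPUS 4

/* Hypothetical stand-in for the per-CPU statistics the series reads. */
struct cpu_counters {
	unsigned long hardirqs;     /* hard interrupts handled */
	unsigned long softirqs;     /* soft interrupts handled */
	unsigned long ctx_switches; /* context switches performed */
};

static struct cpu_counters live[NR_CPUS]; /* running totals per CPU */
static struct cpu_counters snap[NR_CPUS]; /* snapshot taken mid-timeout */

/* Simulate activity on @cpu by bumping its running totals. */
static void record_activity(int cpu, unsigned long hardirqs,
			    unsigned long softirqs, unsigned long switches)
{
	assert(cpu >= 0 && cpu < NR_CPUS);
	live[cpu].hardirqs += hardirqs;
	live[cpu].softirqs += softirqs;
	live[cpu].ctx_switches += switches;
}

/* Save the totals of @cpu; meant to run partway through the timeout. */
static void snapshot_cpu(int cpu)
{
	assert(cpu >= 0 && cpu < NR_CPUS);
	snap[cpu] = live[cpu];
}

/*
 * Activity on @cpu since its snapshot. @cpu is the stalled CPU, which
 * need not be the CPU running this function, so the lookup is by index
 * and never by "whichever CPU I am on".
 */
static struct cpu_counters stall_delta(int cpu)
{
	struct cpu_counters d;

	assert(cpu >= 0 && cpu < NR_CPUS);
	d.hardirqs = live[cpu].hardirqs - snap[cpu].hardirqs;
	d.softirqs = live[cpu].softirqs - snap[cpu].softirqs;
	d.ctx_switches = live[cpu].ctx_switches - snap[cpu].ctx_switches;
	return d;
}

/* Print the deltas for the stalled @cpu in a single line. */
static void print_stall_info(int cpu)
{
	struct cpu_counters d = stall_delta(cpu);

	printf("cpu %d: hardirqs %lu softirqs %lu csw %lu\n",
	       cpu, d.hardirqs, d.softirqs, d.ctx_switches);
}
```

In this sketch, a large hardirq or softirq delta combined with zero
context switches would hint that the CPU is busy in interrupt handling
or looping with preemption disabled, while all three deltas at zero
would hint at a loop with interrupts disabled.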