From patchwork Mon Jan 7 15:12:57 2019
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Waiman Long
X-Patchwork-Id: 10750719
From: Waiman Long
To: Andrew Morton, Alexey Dobriyan, Luis Chamberlain, Kees Cook, Jonathan Corbet
Cc:
 linux-kernel@vger.kernel.org, linux-doc@vger.kernel.org,
 linux-fsdevel@vger.kernel.org, Davidlohr Bueso, Miklos Szeredi,
 Daniel Colascione, Dave Chinner, Randy Dunlap, Waiman Long
Subject: [PATCH 1/2] /proc/stat: Extract irqs counting code into show_stat_irqs()
Date: Mon, 7 Jan 2019 10:12:57 -0500
Message-Id: <1546873978-27797-2-git-send-email-longman@redhat.com>
In-Reply-To: <1546873978-27797-1-git-send-email-longman@redhat.com>
References: <1546873978-27797-1-git-send-email-longman@redhat.com>
Sender: linux-fsdevel-owner@vger.kernel.org
Precedence: bulk
X-Mailing-List: linux-fsdevel@vger.kernel.org

The code that generates the "intr" line of /proc/stat is now moved
from show_stat() into a new function - show_stat_irqs(). There is no
functional change.

Signed-off-by: Waiman Long
Reviewed-by: Kees Cook
---
 fs/proc/stat.c | 39 +++++++++++++++++++++++++++++----------
 1 file changed, 29 insertions(+), 10 deletions(-)

diff --git a/fs/proc/stat.c b/fs/proc/stat.c
index 535eda7..4b06f1b 100644
--- a/fs/proc/stat.c
+++ b/fs/proc/stat.c
@@ -79,12 +79,38 @@ static u64 get_iowait_time(int cpu)
 
 #endif
 
+static u64 compute_stat_irqs_sum(void)
+{
+	int i;
+	u64 sum = 0;
+
+	for_each_possible_cpu(i) {
+		sum += kstat_cpu_irqs_sum(i);
+		sum += arch_irq_stat_cpu(i);
+	}
+	sum += arch_irq_stat();
+	return sum;
+}
+
+/*
+ * Print out the "intr" line of /proc/stat.
+ */
+static void show_stat_irqs(struct seq_file *p)
+{
+	int i;
+
+	seq_put_decimal_ull(p, "intr ", compute_stat_irqs_sum());
+	for_each_irq_nr(i)
+		seq_put_decimal_ull(p, " ", kstat_irqs_usr(i));
+
+	seq_putc(p, '\n');
+}
+
 static int show_stat(struct seq_file *p, void *v)
 {
 	int i, j;
 	u64 user, nice, system, idle, iowait, irq, softirq, steal;
 	u64 guest, guest_nice;
-	u64 sum = 0;
 	u64 sum_softirq = 0;
 	unsigned int per_softirq_sums[NR_SOFTIRQS] = {0};
 	struct timespec64 boottime;
@@ -105,8 +131,6 @@ static int show_stat(struct seq_file *p, void *v)
 		steal += kcpustat_cpu(i).cpustat[CPUTIME_STEAL];
 		guest += kcpustat_cpu(i).cpustat[CPUTIME_GUEST];
 		guest_nice += kcpustat_cpu(i).cpustat[CPUTIME_GUEST_NICE];
-		sum += kstat_cpu_irqs_sum(i);
-		sum += arch_irq_stat_cpu(i);
 
 		for (j = 0; j < NR_SOFTIRQS; j++) {
 			unsigned int softirq_stat = kstat_softirqs_cpu(j, i);
@@ -115,7 +139,6 @@ static int show_stat(struct seq_file *p, void *v)
 			sum_softirq += softirq_stat;
 		}
 	}
-	sum += arch_irq_stat();
 
 	seq_put_decimal_ull(p, "cpu ", nsec_to_clock_t(user));
 	seq_put_decimal_ull(p, " ", nsec_to_clock_t(nice));
@@ -154,14 +177,10 @@ static int show_stat(struct seq_file *p, void *v)
 		seq_put_decimal_ull(p, " ", nsec_to_clock_t(guest_nice));
 		seq_putc(p, '\n');
 	}
-	seq_put_decimal_ull(p, "intr ", (unsigned long long)sum);
-
-	/* sum again ? it could be updated? */
-	for_each_irq_nr(j)
-		seq_put_decimal_ull(p, " ", kstat_irqs_usr(j));
+	show_stat_irqs(p);
 
 	seq_printf(p,
-		"\nctxt %llu\n"
+		"ctxt %llu\n"
 		"btime %llu\n"
 		"processes %lu\n"
 		"procs_running %lu\n"

From patchwork Mon Jan 7 15:12:58 2019
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Waiman Long
X-Patchwork-Id: 10750717
From: Waiman Long
To: Andrew Morton, Alexey Dobriyan, Luis Chamberlain, Kees Cook, Jonathan Corbet
Cc: linux-kernel@vger.kernel.org, linux-doc@vger.kernel.org,
 linux-fsdevel@vger.kernel.org, Davidlohr Bueso, Miklos Szeredi,
 Daniel Colascione, Dave Chinner, Randy Dunlap, Waiman Long
Subject: [PATCH 2/2] /proc/stat: Add sysctl parameter to control irq counts latency
Date: Mon, 7 Jan 2019 10:12:58 -0500
Message-Id: <1546873978-27797-3-git-send-email-longman@redhat.com>
In-Reply-To: <1546873978-27797-1-git-send-email-longman@redhat.com>
References: <1546873978-27797-1-git-send-email-longman@redhat.com>
Sender: linux-fsdevel-owner@vger.kernel.org
Precedence: bulk
X-Mailing-List: linux-fsdevel@vger.kernel.org

Reading /proc/stat can be slow, especially on systems with many irqs
and many CPUs, because the percpu counts of every irq have to be summed.
On some newer systems, there can be more than 1000 irqs per socket.
Applications that need to read /proc/stat many times per second will
easily hit a bottleneck.

In reality, the irq counts are seldom looked at. Even those applications
that read them don't really need up-to-date information. One way to
reduce the performance impact of irq counts computation is to do it
less frequently.

A new "fs/proc-stat-irqs-latency-ms" sysctl parameter is now added to
control the maximum latency in milliseconds allowed between the time
when the computation was done and when the values are reported. Setting
this parameter to an appropriate value will allow us to reduce the
performance impact of reading /proc/stat repetitively.

If /proc/stat is read once in a while, the irq counts will be accurate.
Reading /proc/stat repetitively, however, may make the counts somewhat
stale.

On a 4-socket 96-core Broadwell system (HT off) with 2824 irqs, the times
for reading /proc/stat 10,000 times with various values of
proc-stat-irqs-latency-ms were:

  proc-stat-irqs-latency-ms   elapsed time   sys time
  -------------------------   ------------   --------
              0                 11.041s       9.452s
              1                 12.983s      10.314s
             10                  8.452s       5.466s
            100                  8.003s       4.882s
           1000                  8.000s       4.740s

Signed-off-by: Waiman Long
---
 Documentation/sysctl/fs.txt | 16 +++++++++++++++
 fs/proc/stat.c              | 48 +++++++++++++++++++++++++++++++++++++++++++++
 kernel/sysctl.c             | 12 ++++++++++++
 3 files changed, 76 insertions(+)

diff --git a/Documentation/sysctl/fs.txt b/Documentation/sysctl/fs.txt
index 819caf8..603d1b5 100644
--- a/Documentation/sysctl/fs.txt
+++ b/Documentation/sysctl/fs.txt
@@ -34,6 +34,7 @@ Currently, these files are in /proc/sys/fs:
 - overflowgid
 - pipe-user-pages-hard
 - pipe-user-pages-soft
+- proc-stat-irqs-latency-ms
 - protected_fifos
 - protected_hardlinks
 - protected_regular
@@ -184,6 +185,21 @@ applied.
 
 ==============================================================
 
+proc-stat-irqs-latency-ms:
+
+The maximum latency (in milliseconds) between the time when the IRQ
+counts in the "intr" line of /proc/stat were computed and the time when
+they are reported.
+
+The default is 0, which means the counts are computed every time
+/proc/stat is read. As computing the IRQ counts can be the most time
+consuming part of accessing /proc/stat, setting a high enough value
+will shorten the time to read it in most cases.
+
+The actual maximum latency is rounded up to a whole number of jiffies.
+
+==============================================================
+
 protected_fifos:
 
 The intent of this protection is to avoid unintentional writes to
diff --git a/fs/proc/stat.c b/fs/proc/stat.c
index 4b06f1b..52f5845 100644
--- a/fs/proc/stat.c
+++ b/fs/proc/stat.c
@@ -13,6 +13,7 @@
 #include
 #include
 #include
+#include
 
 #ifndef arch_irq_stat_cpu
 #define arch_irq_stat_cpu(cpu) 0
@@ -21,6 +22,12 @@
 #define arch_irq_stat() 0
 #endif
 
+/*
+ * Maximum latency (in ms) of the irq values reported in the "intr" line.
+ * This is converted internally to a number of jiffies.
+ */
+unsigned int proc_stat_irqs_latency_ms;
+
 #ifdef arch_idle_time
 
 static u64 get_idle_time(int cpu)
@@ -98,7 +105,48 @@ static u64 compute_stat_irqs_sum(void)
 static void show_stat_irqs(struct seq_file *p)
 {
 	int i;
+#ifdef CONFIG_PROC_SYSCTL
+	static char *irqs_buf;		/* Buffer for irqs values */
+	static int buflen;
+	static unsigned long last_jiffies; /* Last buffer update jiffies */
+	static DEFINE_MUTEX(irqs_mutex);
+	unsigned int latency = proc_stat_irqs_latency_ms;
+
+	if (latency) {
+		char *ptr;
+
+		latency = _msecs_to_jiffies(latency);
+
+		mutex_lock(&irqs_mutex);
+		if (irqs_buf && time_before(jiffies, last_jiffies + latency))
+			goto print_out;
+
+		/*
+		 * Each irq value may require up to 11 bytes.
+		 */
+		if (!irqs_buf) {
+			irqs_buf = kmalloc(nr_irqs * 11 + 32,
+					   GFP_KERNEL | __GFP_ZERO);
+			if (!irqs_buf) {
+				mutex_unlock(&irqs_mutex);
+				goto fallback;
+			}
+		}
+		ptr = irqs_buf;
+		ptr += sprintf(ptr, "intr %llu", compute_stat_irqs_sum());
+		for_each_irq_nr(i)
+			ptr += sprintf(ptr, " %u", kstat_irqs_usr(i));
+		*ptr++ = '\n';
+		buflen = ptr - irqs_buf;
+		last_jiffies = jiffies;
+print_out:
+		seq_write(p, irqs_buf, buflen);
+		mutex_unlock(&irqs_mutex);
+		return;
+	}
+fallback:
+#endif
 
 	seq_put_decimal_ull(p, "intr ", compute_stat_irqs_sum());
 	for_each_irq_nr(i)
 		seq_put_decimal_ull(p, " ", kstat_irqs_usr(i));
diff --git a/kernel/sysctl.c b/kernel/sysctl.c
index 1825f71..07010c9 100644
--- a/kernel/sysctl.c
+++ b/kernel/sysctl.c
@@ -114,6 +114,9 @@
 #ifndef CONFIG_MMU
 extern int sysctl_nr_trim_pages;
 #endif
+#ifdef CONFIG_PROC_FS
+extern unsigned int proc_stat_irqs_latency_ms;
+#endif
 
 /* Constants used for minimum and maximum */
 #ifdef CONFIG_LOCKUP_DETECTOR
@@ -1890,6 +1893,15 @@ static int sysrq_sysctl_handler(struct ctl_table *table, int write,
 		.proc_handler	= proc_dointvec_minmax,
 		.extra1		= &one,
 	},
+#ifdef CONFIG_PROC_FS
+	{
+		.procname	= "proc-stat-irqs-latency-ms",
+		.data		= &proc_stat_irqs_latency_ms,
+		.maxlen		= sizeof(proc_stat_irqs_latency_ms),
+		.mode		= 0644,
+		.proc_handler	= proc_douintvec,
+	},
+#endif
 	{ }
 };