From patchwork Fri Dec 2 17:16:20 2022
X-Patchwork-Submitter: Brian Foster
X-Patchwork-Id: 13063001
From: Brian Foster
To: linux-mm@kvack.org, linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org
Cc: ikent@redhat.com, onestero@redhat.com, willy@infradead.org, ebiederm@redhat.com
Subject: [PATCH v3 5/5] procfs: use efficient tgid pid search on root readdir
Date: Fri, 2 Dec 2022 12:16:20 -0500
Message-Id: <20221202171620.509140-6-bfoster@redhat.com>
In-Reply-To: <20221202171620.509140-1-bfoster@redhat.com>
References: <20221202171620.509140-1-bfoster@redhat.com>

find_ge_pid() walks every allocated id and checks every associated pid
in the namespace for a link to a PIDTYPE_TGID task.
If the pid namespace contains processes with large numbers of threads,
this search doesn't scale and can notably increase getdents() syscall
latency.

For example, on a mostly idle 2.4GHz Intel Xeon running Fedora on
5.19.0-rc2, 'strace -T xfs_io -c readdir /proc' shows the following:

  getdents64(... /* 814 entries */, 32768) = 20624 <0.000568>

With the addition of a running dummy (i.e. idle) process that creates
an additional 100k threads, that latency increases to:

  getdents64(... /* 815 entries */, 32768) = 20656 <0.011315>

While this may not be noticeable to users in one-off /proc scans or
simple usage of ps or top, we have users that report problems caused by
this latency increase in these sorts of scaled environments with custom
tooling that makes heavier use of task monitoring.

Optimize the tgid task scanning in proc_pid_readdir() by using the more
efficient find_get_tgid_task() helper. This significantly improves
readdir() latency when the pid namespace is populated with processes
with very large thread counts. For example, the above 100k idle task
test against a patched kernel now results in the following:

  Idle:
  getdents64(... /* 861 entries */, 32768) = 21048 <0.000670>

  "" + 100k threads:
  getdents64(... /* 862 entries */, 32768) = 21096 <0.000959>

... which is a much smaller latency hit after the high thread count task
is started.

Signed-off-by: Brian Foster
Reviewed-by: Ian Kent
---
 fs/proc/base.c | 17 +----------------
 1 file changed, 1 insertion(+), 16 deletions(-)

diff --git a/fs/proc/base.c b/fs/proc/base.c
index 9e479d7d202b..ac34b6bb7249 100644
--- a/fs/proc/base.c
+++ b/fs/proc/base.c
@@ -3475,24 +3475,9 @@ struct tgid_iter {
 };
 static struct tgid_iter next_tgid(struct pid_namespace *ns, struct tgid_iter iter)
 {
-	struct pid *pid;
-
 	if (iter.task)
 		put_task_struct(iter.task);
-	rcu_read_lock();
-retry:
-	iter.task = NULL;
-	pid = find_ge_pid(iter.tgid, ns);
-	if (pid) {
-		iter.tgid = pid_nr_ns(pid, ns);
-		iter.task = pid_task(pid, PIDTYPE_TGID);
-		if (!iter.task) {
-			iter.tgid += 1;
-			goto retry;
-		}
-		get_task_struct(iter.task);
-	}
-	rcu_read_unlock();
+	iter.task = find_get_tgid_task(&iter.tgid, ns);
 	return iter;
 }
 