From patchwork Tue Oct 10 08:31:40 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Mel Gorman
X-Patchwork-Id: 13415005
From: Mel Gorman
To: Peter Zijlstra
Cc: Raghavendra K T, K Prateek Nayak, Bharata B Rao, Ingo Molnar, LKML,
 Linux-MM, Mel Gorman
Subject: [PATCH 3/6] sched/numa: Trace decisions related to skipping VMAs
Date: Tue, 10 Oct 2023 09:31:40 +0100
Message-Id: <20231010083143.19593-4-mgorman@techsingularity.net>
X-Mailer: git-send-email 2.35.3
In-Reply-To: <20231010083143.19593-1-mgorman@techsingularity.net>
References: <20231010083143.19593-1-mgorman@techsingularity.net>
Sender: owner-linux-mm@kvack.org
Precedence: bulk

NUMA Balancing skips or scans VMAs for a variety of reasons. In preparation
for completing scans of VMAs regardless of PID access, trace the reasons
why a VMA was skipped. In a later patch, the tracing will be used to track
if a VMA was forcibly scanned.

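As a rough userspace sketch of the idea, the following models the order in which the skip reasons are checked and how each reason maps to the string the tracepoint prints. Only the enumerator names and strings come from the patch below. `struct mock_vma`, `classify_vma()`, `skip_reason_name()` and `NUMAB_SCAN` are hypothetical names made up for illustration, and the sketch uses a parameterised X-macro rather than the EM()/EMe() redefinition that the patch uses.

```c
#include <stdbool.h>
#include <stddef.h>

/* Same enumerators as the patch; everything else here is illustrative. */
enum numa_vmaskip_reason {
	NUMAB_SKIP_UNSUITABLE,
	NUMAB_SKIP_SHARED_RO,
	NUMAB_SKIP_INACCESSIBLE,
	NUMAB_SKIP_SCAN_DELAY,
	NUMAB_SKIP_PID_INACTIVE,
};

/* Hypothetical sentinel: the VMA was not skipped. */
#define NUMAB_SCAN (-1)

/* X-macro list pairing each reason with its printed name. */
#define NUMAB_SKIP_REASON(X)				\
	X(NUMAB_SKIP_UNSUITABLE,   "unsuitable")	\
	X(NUMAB_SKIP_SHARED_RO,    "shared_ro")		\
	X(NUMAB_SKIP_INACCESSIBLE, "inaccessible")	\
	X(NUMAB_SKIP_SCAN_DELAY,   "scan_delay")	\
	X(NUMAB_SKIP_PID_INACTIVE, "pid_inactive")

struct skip_name {
	int value;
	const char *name;
};

/* Expand the list into a { value, name } lookup table. */
static const struct skip_name skip_names[] = {
#define AS_ENTRY(a, b) { a, b },
	NUMAB_SKIP_REASON(AS_ENTRY)
#undef AS_ENTRY
};

/*
 * Userspace stand-in for the symbolic lookup done at print time.
 * This sketch returns "unknown" for values that are not in the table.
 */
static const char *skip_reason_name(int reason)
{
	size_t i;

	for (i = 0; i < sizeof(skip_names) / sizeof(skip_names[0]); i++)
		if (skip_names[i].value == reason)
			return skip_names[i].name;
	return "unknown";
}

/* Hypothetical, simplified view of the state the real checks inspect. */
struct mock_vma {
	bool suitable;		/* migratable, policy allows it, not hugetlb/mixedmap */
	bool shared_ro;		/* read-only file-backed mapping */
	bool accessible;	/* not PROT_NONE */
	bool scan_delayed;	/* next_scan is still in the future */
	bool pid_active;	/* current task has accessed the VMA */
};

/* Return the first matching skip reason, in the same order as the patch. */
static int classify_vma(const struct mock_vma *vma)
{
	if (!vma->suitable)
		return NUMAB_SKIP_UNSUITABLE;
	if (vma->shared_ro)
		return NUMAB_SKIP_SHARED_RO;
	if (!vma->accessible)
		return NUMAB_SKIP_INACCESSIBLE;
	if (vma->scan_delayed)
		return NUMAB_SKIP_SCAN_DELAY;
	if (!vma->pid_active)
		return NUMAB_SKIP_PID_INACTIVE;
	return NUMAB_SCAN;
}
```

Passing the result of `classify_vma()` to `skip_reason_name()` gives the same string that would appear in the `reason=` field of the trace line for that case. Because the checks return on the first match, a VMA that fails several tests is reported only under the earliest one.
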
Signed-off-by: Mel Gorman
---
 include/linux/sched/numa_balancing.h |  8 +++++
 include/trace/events/sched.h         | 50 ++++++++++++++++++++++++++++
 kernel/sched/fair.c                  | 17 +++++++---
 3 files changed, 71 insertions(+), 4 deletions(-)

diff --git a/include/linux/sched/numa_balancing.h b/include/linux/sched/numa_balancing.h
index 3988762efe15..c127a1509e2f 100644
--- a/include/linux/sched/numa_balancing.h
+++ b/include/linux/sched/numa_balancing.h
@@ -15,6 +15,14 @@
 #define TNF_FAULT_LOCAL	0x08
 #define TNF_MIGRATE_FAIL 0x10
 
+enum numa_vmaskip_reason {
+	NUMAB_SKIP_UNSUITABLE,
+	NUMAB_SKIP_SHARED_RO,
+	NUMAB_SKIP_INACCESSIBLE,
+	NUMAB_SKIP_SCAN_DELAY,
+	NUMAB_SKIP_PID_INACTIVE,
+};
+
 #ifdef CONFIG_NUMA_BALANCING
 extern void task_numa_fault(int last_node, int node, int pages, int flags);
 extern pid_t task_numa_group_id(struct task_struct *p);
diff --git a/include/trace/events/sched.h b/include/trace/events/sched.h
index fbb99a61f714..b0d0dbf491ea 100644
--- a/include/trace/events/sched.h
+++ b/include/trace/events/sched.h
@@ -664,6 +664,56 @@ DEFINE_EVENT(sched_numa_pair_template, sched_swap_numa,
 	TP_ARGS(src_tsk, src_cpu, dst_tsk, dst_cpu)
 );
 
+#ifdef CONFIG_NUMA_BALANCING
+#define NUMAB_SKIP_REASON					\
+	EM( NUMAB_SKIP_UNSUITABLE,		"unsuitable" )	\
+	EM( NUMAB_SKIP_SHARED_RO,		"shared_ro" )	\
+	EM( NUMAB_SKIP_INACCESSIBLE,		"inaccessible" )	\
+	EM( NUMAB_SKIP_SCAN_DELAY,		"scan_delay" )	\
+	EMe(NUMAB_SKIP_PID_INACTIVE,		"pid_inactive" )
+
+/* Redefine for export. */
+#undef EM
+#undef EMe
+#define EM(a, b)	TRACE_DEFINE_ENUM(a);
+#define EMe(a, b)	TRACE_DEFINE_ENUM(a);
+
+NUMAB_SKIP_REASON
+
+/* Redefine for symbolic printing. */
+#undef EM
+#undef EMe
+#define EM(a, b)	{ a, b },
+#define EMe(a, b)	{ a, b }
+
+TRACE_EVENT(sched_skip_vma_numa,
+
+	TP_PROTO(struct mm_struct *mm, struct vm_area_struct *vma,
+		 enum numa_vmaskip_reason reason),
+
+	TP_ARGS(mm, vma, reason),
+
+	TP_STRUCT__entry(
+		__field(unsigned long, numa_scan_offset)
+		__field(unsigned long, vm_start)
+		__field(unsigned long, vm_end)
+		__field(enum numa_vmaskip_reason, reason)
+	),
+
+	TP_fast_assign(
+		__entry->numa_scan_offset	= mm->numa_scan_offset;
+		__entry->vm_start		= vma->vm_start;
+		__entry->vm_end			= vma->vm_end;
+		__entry->reason			= reason;
+	),
+
+	TP_printk("numa_scan_offset=%lX vm_start=%lX vm_end=%lX reason=%s",
+		  __entry->numa_scan_offset,
+		  __entry->vm_start,
+		  __entry->vm_end,
+		  __print_symbolic(__entry->reason, NUMAB_SKIP_REASON))
+);
+#endif /* CONFIG_NUMA_BALANCING */
 
 /*
  * Tracepoint for waking a polling cpu without an IPI.
diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index 81405627b9ed..0535c57f6a77 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -3227,6 +3227,7 @@ static void task_numa_work(struct callback_head *work)
 	do {
 		if (!vma_migratable(vma) || !vma_policy_mof(vma) ||
 			is_vm_hugetlb_page(vma) || (vma->vm_flags & VM_MIXEDMAP)) {
+			trace_sched_skip_vma_numa(mm, vma, NUMAB_SKIP_UNSUITABLE);
 			continue;
 		}
 
@@ -3237,15 +3238,19 @@ static void task_numa_work(struct callback_head *work)
 		 * as migrating the pages will be of marginal benefit.
 		 */
 		if (!vma->vm_mm ||
-		    (vma->vm_file && (vma->vm_flags & (VM_READ|VM_WRITE)) == (VM_READ)))
+		    (vma->vm_file && (vma->vm_flags & (VM_READ|VM_WRITE)) == (VM_READ))) {
+			trace_sched_skip_vma_numa(mm, vma, NUMAB_SKIP_SHARED_RO);
 			continue;
+		}
 
 		/*
 		 * Skip inaccessible VMAs to avoid any confusion between
 		 * PROT_NONE and NUMA hinting ptes
 		 */
-		if (!vma_is_accessible(vma))
+		if (!vma_is_accessible(vma)) {
+			trace_sched_skip_vma_numa(mm, vma, NUMAB_SKIP_INACCESSIBLE);
 			continue;
+		}
 
 		/* Initialise new per-VMA NUMAB state. */
 		if (!vma->numab_state) {
@@ -3267,12 +3272,16 @@ static void task_numa_work(struct callback_head *work)
 		 * delay the scan for new VMAs.
 		 */
 		if (mm->numa_scan_seq && time_before(jiffies,
-						vma->numab_state->next_scan))
+						vma->numab_state->next_scan)) {
+			trace_sched_skip_vma_numa(mm, vma, NUMAB_SKIP_SCAN_DELAY);
 			continue;
+		}
 
 		/* Do not scan the VMA if task has not accessed */
-		if (!vma_is_accessed(vma))
+		if (!vma_is_accessed(vma)) {
+			trace_sched_skip_vma_numa(mm, vma, NUMAB_SKIP_PID_INACTIVE);
 			continue;
+		}
 
 		/*
 		 * RESET access PIDs regularly for old VMAs. Resetting after checking