From patchwork Mon Oct 1 10:05:24 2018
X-Patchwork-Submitter: Mel Gorman
X-Patchwork-Id: 10621767
From: Mel Gorman
To: Peter Zijlstra
Cc: Ingo Molnar, Srikar Dronamraju, Jirka Hladky, Rik van Riel, LKML,
 Linux-MM, Mel Gorman
Subject: [PATCH 1/2] mm, numa: Remove rate-limiting of automatic numa
 balancing migration
Date: Mon, 1 Oct 2018 11:05:24 +0100
Message-Id: <20181001100525.29789-2-mgorman@techsingularity.net>
In-Reply-To: <20181001100525.29789-1-mgorman@techsingularity.net>
References: <20181001100525.29789-1-mgorman@techsingularity.net>

Rate limiting of page migrations due to automatic NUMA balancing was
introduced to mitigate the worst-case scenario of migrating at high
frequency due to false sharing or slowly ping-ponging between nodes. Since
then, a lot of effort was spent on correctly identifying these pages and
avoiding unnecessary migrations and the safety net may no longer be
required.

Jirka Hladky reported a regression in 4.17 due to a scheduler patch that
avoids spreading STREAM tasks wide prematurely. However, once the task was
properly placed, it delayed migrating the memory due to rate limiting.
Increasing the limit fixed the problem for him.

Currently, the limit is hard-coded and does not account for the real
capabilities of the hardware. Even if an estimate were attempted, it would
not properly account for the number of memory controllers and it could not
account for the amount of bandwidth used for normal accesses. Rather than
fudging, this patch simply eliminates the rate limiting.

However, Jirka reports that a STREAM configuration using multiple
processes achieved similar performance to 4.16. In local tests, this patch
improved performance of STREAM relative to the baseline but it is somewhat
machine-dependent. Most workloads show little or no performance
difference, implying that there is no heavy reliance on the throttling
mechanism and that it is safe to remove.
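
For reference, the hard-coded limit being removed can be modelled in
userspace roughly as follows. This is only a sketch under stated
assumptions: 4K pages (SKETCH_PAGE_SHIFT is a made-up name), a plain
millisecond clock in place of jiffies with no wraparound handling, and no
locking. struct node_ratelimit, sketch_update_ratelimit() and
ratelimit_mib_per_sec() are illustrative names, not kernel symbols. It
shows how the two constants work out to the 1280M per second quoted in the
removed comment, and that the ceiling is the same whatever the hardware
can sustain.

```c
#include <stdbool.h>

/* Assumption for illustration: 4K pages, i.e. PAGE_SHIFT == 12. */
#define SKETCH_PAGE_SHIFT 12

/* The two constants from the code this patch removes. */
static const unsigned long migrate_interval_millisecs = 100;
static const unsigned long ratelimit_pages = 128UL << (20 - SKETCH_PAGE_SHIFT);

/* Hypothetical helper: ceiling implied by the constants, in MiB per second. */
static unsigned long ratelimit_mib_per_sec(void)
{
	unsigned long mib_per_window = (ratelimit_pages << SKETCH_PAGE_SHIFT) >> 20;
	unsigned long windows_per_sec = 1000 / migrate_interval_millisecs;

	return mib_per_window * windows_per_sec;	/* 128 * 10 = 1280 */
}

/* Simplified stand-in for the two per-node fields removed from pglist_data. */
struct node_ratelimit {
	unsigned long next_window_ms;	/* end of the current window */
	unsigned long nr_pages;		/* pages accounted in this window */
};

/*
 * Returns true if the node is rate-limited. Follows the shape of the
 * removed numamigrate_update_ratelimit(): open a new window once the old
 * one has expired, refuse if the window is already over budget, otherwise
 * account the pages and allow the migration.
 */
static bool sketch_update_ratelimit(struct node_ratelimit *rl,
				    unsigned long now_ms,
				    unsigned long nr_pages)
{
	if (now_ms > rl->next_window_ms) {
		rl->nr_pages = 0;
		do {
			rl->next_window_ms += migrate_interval_millisecs;
		} while (now_ms > rl->next_window_ms);
	}

	if (rl->nr_pages > ratelimit_pages)
		return true;

	rl->nr_pages += nr_pages;
	return false;
}
```

In this model a task that has just been placed on the right node and
faults in its memory in huge-page units (512 base pages each with 4K
pages) exhausts the window after a few dozen migrations and is then
refused until the next 100ms window opens, which is the delay described
above.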

STREAM on 2-socket machine
                         4.19.0-rc5             4.19.0-rc5
                         numab-v1r1       noratelimit-v1r1
MB/sec copy     43298.52 (   0.00%)    44673.38 (   3.18%)
MB/sec scale    30115.06 (   0.00%)    31293.06 (   3.91%)
MB/sec add      32825.12 (   0.00%)    34883.62 (   6.27%)
MB/sec triad    32549.52 (   0.00%)    34906.60 (   7.24%)

Signed-off-by: Mel Gorman
Reviewed-by: Rik van Riel
Reviewed-by: Srikar Dronamraju
---
 include/linux/mmzone.h         |  6 ----
 include/trace/events/migrate.h | 27 ------------------
 mm/migrate.c                   | 65 ------------------------------------------
 mm/page_alloc.c                |  2 --
 4 files changed, 100 deletions(-)

diff --git a/include/linux/mmzone.h b/include/linux/mmzone.h
index 1e22d96734e0..3f4c0b167333 100644
--- a/include/linux/mmzone.h
+++ b/include/linux/mmzone.h
@@ -671,12 +671,6 @@ typedef struct pglist_data {
 #ifdef CONFIG_NUMA_BALANCING
 	/* Lock serializing the migrate rate limiting window */
 	spinlock_t numabalancing_migrate_lock;
-
-	/* Rate limiting time interval */
-	unsigned long numabalancing_migrate_next_window;
-
-	/* Number of pages migrated during the rate limiting time interval */
-	unsigned long numabalancing_migrate_nr_pages;
 #endif
 	/*
 	 * This is a per-node reserve of pages that are not available
diff --git a/include/trace/events/migrate.h b/include/trace/events/migrate.h
index 711372845945..705b33d1e395 100644
--- a/include/trace/events/migrate.h
+++ b/include/trace/events/migrate.h
@@ -70,33 +70,6 @@ TRACE_EVENT(mm_migrate_pages,
 		__print_symbolic(__entry->mode, MIGRATE_MODE),
 		__print_symbolic(__entry->reason, MIGRATE_REASON))
 );
-
-TRACE_EVENT(mm_numa_migrate_ratelimit,
-
-	TP_PROTO(struct task_struct *p, int dst_nid, unsigned long nr_pages),
-
-	TP_ARGS(p, dst_nid, nr_pages),
-
-	TP_STRUCT__entry(
-		__array(	char,		comm,	TASK_COMM_LEN)
-		__field(	pid_t,		pid)
-		__field(	int,		dst_nid)
-		__field(	unsigned long,	nr_pages)
-	),
-
-	TP_fast_assign(
-		memcpy(__entry->comm, p->comm, TASK_COMM_LEN);
-		__entry->pid		= p->pid;
-		__entry->dst_nid	= dst_nid;
-		__entry->nr_pages	= nr_pages;
-	),
-
-	TP_printk("comm=%s pid=%d dst_nid=%d nr_pages=%lu",
-		__entry->comm,
-		__entry->pid,
-		__entry->dst_nid,
-		__entry->nr_pages)
-);
 #endif /* _TRACE_MIGRATE_H */
 
 /* This part must be outside protection */
diff --git a/mm/migrate.c b/mm/migrate.c
index 4f1d894835b5..5e285c1249a0 100644
--- a/mm/migrate.c
+++ b/mm/migrate.c
@@ -1855,54 +1855,6 @@ static struct page *alloc_misplaced_dst_page(struct page *page,
 	return newpage;
 }
 
-/*
- * page migration rate limiting control.
- * Do not migrate more than @pages_to_migrate in a @migrate_interval_millisecs
- * window of time. Default here says do not migrate more than 1280M per second.
- */
-static unsigned int migrate_interval_millisecs __read_mostly = 100;
-static unsigned int ratelimit_pages __read_mostly = 128 << (20 - PAGE_SHIFT);
-
-/* Returns true if the node is migrate rate-limited after the update */
-static bool numamigrate_update_ratelimit(pg_data_t *pgdat,
-					unsigned long nr_pages)
-{
-	unsigned long next_window, interval;
-
-	next_window = READ_ONCE(pgdat->numabalancing_migrate_next_window);
-	interval = msecs_to_jiffies(migrate_interval_millisecs);
-
-	/*
-	 * Rate-limit the amount of data that is being migrated to a node.
-	 * Optimal placement is no good if the memory bus is saturated and
-	 * all the time is being spent migrating!
-	 */
-	if (time_after(jiffies, next_window) &&
-			spin_trylock(&pgdat->numabalancing_migrate_lock)) {
-		pgdat->numabalancing_migrate_nr_pages = 0;
-		do {
-			next_window += interval;
-		} while (unlikely(time_after(jiffies, next_window)));
-
-		WRITE_ONCE(pgdat->numabalancing_migrate_next_window, next_window);
-		spin_unlock(&pgdat->numabalancing_migrate_lock);
-	}
-	if (pgdat->numabalancing_migrate_nr_pages > ratelimit_pages) {
-		trace_mm_numa_migrate_ratelimit(current, pgdat->node_id,
-								nr_pages);
-		return true;
-	}
-
-	/*
-	 * This is an unlocked non-atomic update so errors are possible.
-	 * The consequences are failing to migrate when we potentiall should
-	 * have which is not severe enough to warrant locking. If it is ever
-	 * a problem, it can be converted to a per-cpu counter.
-	 */
-	pgdat->numabalancing_migrate_nr_pages += nr_pages;
-	return false;
-}
-
 static int numamigrate_isolate_page(pg_data_t *pgdat, struct page *page)
 {
 	int page_lru;
@@ -1975,14 +1927,6 @@ int migrate_misplaced_page(struct page *page, struct vm_area_struct *vma,
 	if (page_is_file_cache(page) && PageDirty(page))
 		goto out;
 
-	/*
-	 * Rate-limit the amount of data that is being migrated to a node.
-	 * Optimal placement is no good if the memory bus is saturated and
-	 * all the time is being spent migrating!
-	 */
-	if (numamigrate_update_ratelimit(pgdat, 1))
-		goto out;
-
 	isolated = numamigrate_isolate_page(pgdat, page);
 	if (!isolated)
 		goto out;
@@ -2029,14 +1973,6 @@ int migrate_misplaced_transhuge_page(struct mm_struct *mm,
 	unsigned long mmun_start = address & HPAGE_PMD_MASK;
 	unsigned long mmun_end = mmun_start + HPAGE_PMD_SIZE;
 
-	/*
-	 * Rate-limit the amount of data that is being migrated to a node.
-	 * Optimal placement is no good if the memory bus is saturated and
-	 * all the time is being spent migrating!
-	 */
-	if (numamigrate_update_ratelimit(pgdat, HPAGE_PMD_NR))
-		goto out_dropref;
-
 	new_page = alloc_pages_node(node,
 		(GFP_TRANSHUGE_LIGHT | __GFP_THISNODE),
 		HPAGE_PMD_ORDER);
@@ -2133,7 +2069,6 @@ int migrate_misplaced_transhuge_page(struct mm_struct *mm,
 
 out_fail:
 	count_vm_events(PGMIGRATE_FAIL, HPAGE_PMD_NR);
-out_dropref:
 	ptl = pmd_lock(mm, pmd);
 	if (pmd_same(*pmd, entry)) {
 		entry = pmd_modify(entry, vma->vm_page_prot);
diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index 89d2a2ab3fe6..706a738c0aee 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -6197,8 +6197,6 @@ static unsigned long __init calc_memmap_size(unsigned long spanned_pages,
 static void pgdat_init_numabalancing(struct pglist_data *pgdat)
 {
 	spin_lock_init(&pgdat->numabalancing_migrate_lock);
-	pgdat->numabalancing_migrate_nr_pages = 0;
-	pgdat->numabalancing_migrate_next_window = jiffies;
 }
 #else
 static void pgdat_init_numabalancing(struct pglist_data *pgdat) {}

From patchwork Mon Oct 1 10:05:25 2018
X-Patchwork-Submitter: Mel Gorman
X-Patchwork-Id: 10621771
From: Mel Gorman
To: Peter Zijlstra
Cc: Ingo Molnar, Srikar Dronamraju, Jirka Hladky, Rik van Riel, LKML,
 Linux-MM, Mel Gorman
Subject: [PATCH 2/2] mm, numa: Migrate pages to local nodes quicker early
 in the lifetime of a task
Date: Mon, 1 Oct 2018 11:05:25 +0100
Message-Id: <20181001100525.29789-3-mgorman@techsingularity.net>
In-Reply-To: <20181001100525.29789-1-mgorman@techsingularity.net>
References: <20181001100525.29789-1-mgorman@techsingularity.net>

Automatic NUMA Balancing uses a multi-stage pass to decide whether a page
should migrate to a local node. This filter avoids excessive ping-ponging
if a page is shared or used by threads that migrate cross-node frequently.

Threads inherit both page tables and the preferred node ID from the parent.
This means that threads can trigger hinting faults earlier than a new
task, which delays scanning for a number of seconds. As a thread can be
load balanced very early in its lifetime, there can be an unnecessary
delay before it starts migrating thread-local data. This patch migrates
private pages faster early in the lifetime of a thread using the sequence
counter as an identifier of new tasks.

With this patch applied, STREAM performance is the same as 4.17 even
though processes are not spread cross-node prematurely. Other workloads
showed a mix of minor gains and losses. This is somewhat expected as most
workloads are not very sensitive to the starting conditions of a process.

                         4.19.0-rc5             4.19.0-rc5                 4.17.0
                         numab-v1r1       fastmigrate-v1r1                vanilla
MB/sec copy     43298.52 (   0.00%)    47335.46 (   9.32%)    47219.24 (   9.06%)
MB/sec scale    30115.06 (   0.00%)    32568.12 (   8.15%)    32527.56 (   8.01%)
MB/sec add      32825.12 (   0.00%)    36078.94 (   9.91%)    35928.02 (   9.45%)
MB/sec triad    32549.52 (   0.00%)    35935.94 (  10.40%)    35969.88 (  10.51%)

Signed-off-by: Mel Gorman
Reviewed-by: Rik van Riel
---
 kernel/sched/fair.c | 12 +++++++++++-
 1 file changed, 11 insertions(+), 1 deletion(-)

diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index 25c7c7e09cbd..7fc4a371bdd2 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -1392,6 +1392,17 @@ bool should_numa_migrate_memory(struct task_struct *p, struct page * page,
 	int last_cpupid, this_cpupid;
 
 	this_cpupid = cpu_pid_to_cpupid(dst_cpu, current->pid);
+	last_cpupid = page_cpupid_xchg_last(page, this_cpupid);
+
+	/*
+	 * Allow first faults or private faults to migrate immediately early in
+	 * the lifetime of a task. The magic number 4 is based on waiting for
+	 * two full passes of the "multi-stage node selection" test that is
+	 * executed below.
+	 */
+	if ((p->numa_preferred_nid == -1 || p->numa_scan_seq <= 4) &&
+	    (cpupid_pid_unset(last_cpupid) || cpupid_match_pid(p, last_cpupid)))
+		return true;
 
 	/*
 	 * Multi-stage node selection is used in conjunction with a periodic
@@ -1410,7 +1421,6 @@ bool should_numa_migrate_memory(struct task_struct *p, struct page * page,
 	 * This quadric squishes small probabilities, making it less likely we
 	 * act on an unlikely task<->page relation.
 	 */
-	last_cpupid = page_cpupid_xchg_last(page, this_cpupid);
 	if (!cpupid_pid_unset(last_cpupid) &&
 				cpupid_to_nid(last_cpupid) != dst_nid)
 		return false;
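
The early-migration test added to should_numa_migrate_memory() above can
be exercised in isolation with a small userspace model. This is only a
sketch and not kernel code: struct sketch_task, struct sketch_last_access
and sketch_migrate_early() are hypothetical names, and the last accessor
is stored as a plain pid plus an "unset" flag, whereas the kernel packs
CPU and pid bits into a single cpupid value. Only the constant 4 and the
shape of the condition are taken from the patch.

```c
#include <stdbool.h>

#define SKETCH_NO_NODE		(-1)	/* no preferred node chosen yet */
#define SKETCH_EARLY_SCAN_SEQ	4	/* the "magic number 4" from the patch */

/* Minimal stand-in for the task_struct fields the new test reads. */
struct sketch_task {
	int pid;
	int numa_preferred_nid;
	int numa_scan_seq;
};

/* Simplified stand-in for the last_cpupid value recorded on a page. */
struct sketch_last_access {
	bool pid_unset;	/* page has not taken a hinting fault before */
	int pid;	/* pid of the previous faulting task, if set */
};

/*
 * Returns true when the page may migrate immediately: the task is still
 * early in its lifetime (no preferred node yet, or at most 4 scan passes)
 * and the fault is either the first on this page or a private one, that
 * is, the previous fault came from the same task.
 */
static bool sketch_migrate_early(const struct sketch_task *p,
				 const struct sketch_last_access *last)
{
	bool early = p->numa_preferred_nid == SKETCH_NO_NODE ||
		     p->numa_scan_seq <= SKETCH_EARLY_SCAN_SEQ;
	bool first_or_private = last->pid_unset || last->pid == p->pid;

	return early && first_or_private;
}
```

When the function returns false, the caller would fall through to the
existing multi-stage node selection, so pages last touched by another
task and tasks that have already completed several scan passes keep the
previous behaviour.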