From patchwork Sun Dec 12 11:31:56 2021
X-Patchwork-Submitter: Baolin Wang
X-Patchwork-Id: 12672217
From: Baolin Wang
To: akpm@linux-foundation.org, ying.huang@intel.com, dave.hansen@linux.intel.com
Cc: ziy@nvidia.com, shy828301@gmail.com,
 baolin.wang@linux.alibaba.com, zhongjiang-ali@linux.alibaba.com, xlpang@linux.alibaba.com, linux-mm@kvack.org, linux-kernel@vger.kernel.org
Subject: [RFC PATCH 0/4] Add speculative numa fault support
Date: Sun, 12 Dec 2021 19:31:56 +0800

Hi,

This RFC patch set adds speculative NUMA fault support for some scenarios, such as tiered memory systems. A tiered memory system relies on NUMA balancing to promote hot pages from slow memory to fast memory to improve performance. For workloads with good data locality, we can therefore promote several sequential pages from slow memory in advance to improve performance further.

So how many pages should be promoted to fast memory at a time? For now, this RFC patch set implements only a basic and simple mechanism to speculate the NUMA fault window for each VMA. It introduces a new atomic member in each VMA to record the NUMA fault window information, which is used to determine whether the access pattern is a sequential stream and to expand or shrink the NUMA fault window accordingly.

I can see about a 6% improvement when testing mysql on a tiered memory system; more data can be found in patch 1.

Looking forward to comments and suggestions to make the algorithm more robust and suitable for more scenarios. Thanks in advance.

Note: this patch set is based on the patch set that implements tiered memory promotion [1].

[1] https://lore.kernel.org/lkml/87bl2gsnrd.fsf@yhuang6-desk2.ccr.corp.intel.com/T/

Baolin Wang (4):
  mm: Add speculative numa fault support
  mm: Add a debug interface to control the range of speculative numa fault
  mm: Add speculative numa fault stats
  mm: Update the speculative pages' accessing time

 include/linux/mm_types.h      |   3 +
 include/linux/vm_event_item.h |   1 +
 mm/memory.c                   | 222 ++++++++++++++++++++++++++++++++--
 mm/vmstat.c                   |   1 +
 4 files changed, 216 insertions(+), 11 deletions(-)
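
For readers who want a concrete picture of the per-VMA window idea described in the cover letter, here is a minimal user-space sketch in C11. It is not the code from patch 1. The struct vma_sketch type, the numa_fault_window() helper, the packing of "next expected page" and "window size" into one 64-bit atomic, the doubling/halving policy, and the WIN_MIN_PAGES/WIN_MAX_PAGES limits are all assumptions made for illustration only.

```c
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical limits on the speculative window, in pages. */
#define WIN_MIN_PAGES 1u
#define WIN_MAX_PAGES 16u

/*
 * Hypothetical stand-in for the new atomic member in the VMA.
 * Assumed layout: upper 32 bits hold the page offset where the next
 * sequential fault is expected, lower 32 bits hold the window size.
 */
struct vma_sketch {
	_Atomic uint64_t numa_win;
};

static uint64_t win_pack(uint32_t next, uint32_t size)
{
	return ((uint64_t)next << 32) | size;
}

/*
 * Called on a NUMA hinting fault at page offset pgoff. Returns how many
 * pages, starting at pgoff, this sketch would try to promote.
 */
static uint32_t numa_fault_window(struct vma_sketch *vma, uint32_t pgoff)
{
	uint64_t old = atomic_load(&vma->numa_win);
	uint64_t new;
	uint32_t size;

	do {
		uint32_t expected = (uint32_t)(old >> 32);

		size = (uint32_t)old;
		if (size == 0) {
			/* First fault seen on this VMA: start small. */
			size = WIN_MIN_PAGES;
		} else if (pgoff == expected) {
			/* Fault continues the stream: expand the window. */
			size = size * 2 > WIN_MAX_PAGES ? WIN_MAX_PAGES : size * 2;
		} else {
			/* Stream was broken: shrink the window. */
			size = size / 2 < WIN_MIN_PAGES ? WIN_MIN_PAGES : size / 2;
		}
		new = win_pack(pgoff + size, size);
		/* On failure, old is reloaded and the decision is redone. */
	} while (!atomic_compare_exchange_weak(&vma->numa_win, &old, new));

	return size;
}
```

With a zero-initialized struct vma_sketch, faults at page offsets 100, 101, 103 and 107 would grow the window to 1, 2, 4 and 8 pages in this sketch, and a later fault at an unrelated offset would halve it. Keeping both fields in a single atomic word lets concurrent faults on the same VMA update the window without taking a lock, which is one plausible reason to use an atomic member here.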