X-Patchwork-Submitter: Chengming Zhou
X-Patchwork-Id: 13706964
From: Chengming Zhou
Subject: [PATCH v2 0/3] mm/ksm: cmp_and_merge_page() optimizations and cleanup
Date: Fri, 21 Jun 2024 15:54:28 +0800
Message-Id: <20240621-b4-ksm-scan-optimize-v2-0-1c328aa9e30b@linux.dev>
To: Andrew Morton, david@redhat.com, aarcange@redhat.com, hughd@google.com, shr@devkernel.io
Cc: linux-mm@kvack.org, linux-kernel@vger.kernel.org, zhouchengming@bytedance.com, Chengming Zhou

Changes in v2:
- Fix the comments of try_to_merge_with_zero_page(), per David.
- Drop the last patch in this version, since it addresses a very rare
  case that is hard to test and verify.
- Rebase on the latest mm/mm-stable branch.
- Link to v1: https://lore.kernel.org/r/20240524-b4-ksm-scan-optimize-v1-0-053b31bd7ab4@linux.dev

Hello,

This series mainly optimizes cmp_and_merge_page() to use separate, more
efficient code flows for ksm pages and non-ksm anon pages.

- ksm page: no need to calculate the checksum.
- anon page: no need to search the stable tree if the page is changing
  fast, and try to merge with the zero page before searching for a ksm
  page on the stable tree.

Please see patch 2 for details.

Patch 3 is a cleanup, and also a small optimization, of the
chain()/chain_prune() interfaces, which made
stable_tree_search()/stable_tree_insert() overly complex.

I have done simple testing using "hackbench -g 1 -l 300000" (maybe I need
to use a better workload) on my machine, and have seen a small decrease
in ksmd CPU usage and some improvement in cmp_and_merge_page() latency.

We can see that the latency of cmp_and_merge_page() when handling non-ksm
anon pages has improved.

Thanks for review and comments!

Before:

- ksm page
[128, 256)       21 |                                                    |
[256, 512)    12509 |@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@|
[512, 1K)       769 |@@@                                                 |
[1K, 2K)         99 |                                                    |
[2K, 4K)          4 |                                                    |
[4K, 8K)          2 |                                                    |
[8K, 16K)         8 |                                                    |

- anon page
[512, 1K)        19 |                                                    |
[1K, 2K)       7160 |@@@@@@@@@@@                                         |
[2K, 4K)      33516 |@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@|
[4K, 8K)      33172 |@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@ |
[8K, 16K)     11305 |@@@@@@@@@@@@@@@@@                                   |
[16K, 32K)     1303 |@@                                                  |
[32K, 64K)       16 |                                                    |
[64K, 128K)       6 |                                                    |
[128K, 256K)      6 |                                                    |
[256K, 512K)      9 |                                                    |
[512K, 1M)        3 |                                                    |
[1M, 2M)          2 |                                                    |
[2M, 4M)          1 |                                                    |

After:

- ksm page
[128, 256)        9 |                                                    |
[256, 512)      915 |@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@|
[512, 1K)        41 |@@                                                  |
[1K, 2K)          1 |                                                    |
[2K, 4K)          1 |                                                    |

- anon page
[512, 1K)       374 |                                                    |
[1K, 2K)       5367 |@@@@                                                |
[2K, 4K)      64362 |@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@|
[4K, 8K)      27721 |@@@@@@@@@@@@@@@@@@@@@@                              |
[8K, 16K)      1047 |                                                    |
[16K, 32K)       63 |                                                    |
[32K, 64K)        7 |                                                    |
[64K, 128K)       6 |                                                    |
[128K, 256K)      5 |                                                    |
[256K, 512K)      3 |                                                    |
[512K, 1M)        1 |                                                    |

Signed-off-by: Chengming Zhou
---
Chengming Zhou (3):
      mm/ksm: refactor out try_to_merge_with_zero_page()
      mm/ksm: don't waste time searching stable tree for fast changing page
      mm/ksm: optimize the chain()/chain_prune() interfaces

 mm/ksm.c | 250 +++++++++++++++++++++------------------------------------------
 1 file changed, 82 insertions(+), 168 deletions(-)
---
base-commit: 6ba59ff4227927d3a8530fc2973b80e94b54d58f
change-id: 20240621-b4-ksm-scan-optimize-e614a3a52217

Best regards,