From patchwork Fri Mar 21 17:37:24 2025
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Sourav Panda
X-Patchwork-Id: 14025834
Date: Fri, 21 Mar 2025 17:37:24 +0000
In-Reply-To: <20250321173729.3175898-1-souravpanda@google.com>
References: <20250321173729.3175898-1-souravpanda@google.com>
X-Mailer: git-send-email 2.49.0.395.g12beb8f557-goog
Message-ID: <20250321173729.3175898-2-souravpanda@google.com>
Subject: [RFC PATCH 1/6] mm: introduce SELECTIVE_KSM KConfig
From: Sourav Panda
To: mathieu.desnoyers@efficios.com, willy@infradead.org, david@redhat.com,
 pasha.tatashin@soleen.com, rientjes@google.com, akpm@linux-foundation.org,
 linux-mm@kvack.org, linux-kernel@vger.kernel.org, weixugc@google.com,
 gthelen@google.com, souravpanda@google.com, surenb@google.com

Gate the partitioned
and synchronous features of SELECTIVE_KSM behind a Kconfig option. This
prevents vanilla KSM's background thread from interfering with
SELECTIVE_KSM.

Signed-off-by: Sourav Panda
---
 mm/Kconfig | 11 +++++++++++
 1 file changed, 11 insertions(+)

diff --git a/mm/Kconfig b/mm/Kconfig
index 1b501db06417..f9873002414c 100644
--- a/mm/Kconfig
+++ b/mm/Kconfig
@@ -783,6 +783,17 @@ config KSM
 	  until a program has madvised that an area is MADV_MERGEABLE, and
 	  root has set /sys/kernel/mm/ksm/run to 1 (if CONFIG_SYSFS is set).
 
+config SELECTIVE_KSM
+	bool "Enable Selective KSM for page merging"
+	depends on KSM
+	help
+	  Enable Synchronous and Partitioned KSM for page merging. There is
+	  no background scanning. Instead, userspace specifies the pid
+	  and address range to have merged. The partitioning aspect divides
+	  the merge space into security domains. Merging of pages only takes
+	  place within a partition, improving security. Furthermore, trees
+	  in each partition become smaller, improving CPU efficiency.
+
 config DEFAULT_MMAP_MIN_ADDR
 	int "Low address space to protect from user allocation"
 	depends on MMU

From patchwork Fri Mar 21 17:37:25 2025
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Sourav Panda
X-Patchwork-Id: 14025835
Date: Fri, 21 Mar 2025 17:37:25 +0000
In-Reply-To: <20250321173729.3175898-1-souravpanda@google.com>
References: <20250321173729.3175898-1-souravpanda@google.com>
X-Mailer: git-send-email 2.49.0.395.g12beb8f557-goog
Message-ID: <20250321173729.3175898-3-souravpanda@google.com>
Subject: [RFC PATCH 2/6] mm: make Selective KSM synchronous
From: Sourav Panda
To: mathieu.desnoyers@efficios.com, willy@infradead.org, david@redhat.com,
 pasha.tatashin@soleen.com, rientjes@google.com, akpm@linux-foundation.org,
 linux-mm@kvack.org, linux-kernel@vger.kernel.org, weixugc@google.com,
 gthelen@google.com, souravpanda@google.com, surenb@google.com

Make KSM synchronous by introducing the following sysfs file, which
carries out merging on the specified memory region synchronously and
eliminates the need for ksmd to run in the background.

echo "pid start_addr end_addr" > /sys/kernel/mm/ksm/trigger_merge

Signed-off-by: Sourav Panda
---
 mm/ksm.c | 317 +++++++++++++++++++++++++++++++++++++++++++++++--------
 1 file changed, 271 insertions(+), 46 deletions(-)

diff --git a/mm/ksm.c b/mm/ksm.c
index 8be2b144fefd..b2f184557ed9 100644
--- a/mm/ksm.c
+++ b/mm/ksm.c
@@ -290,16 +290,18 @@ static unsigned int zero_checksum __read_mostly;
 /* Whether to merge empty (zeroed) pages with actual zero pages */
 static bool ksm_use_zero_pages __read_mostly;
 
-/* Skip pages that couldn't be de-duplicated previously */
-/* Default to true at least temporarily, for testing */
-static bool ksm_smart_scan = true;
-
 /* The number of zero pages which is placed by KSM */
 atomic_long_t ksm_zero_pages = ATOMIC_LONG_INIT(0);
 
 /* The number of pages that have been skipped due to "smart scanning" */
 static unsigned long ksm_pages_skipped;
 
+#ifndef CONFIG_SELECTIVE_KSM /* advisor immaterial if there is no scanning */
+
+/* Skip pages that couldn't be de-duplicated previously */
+/* Default to true at least temporarily, for testing */
+static bool ksm_smart_scan = true;
+
 /* Don't scan more than max pages per batch. */
 static unsigned long ksm_advisor_max_pages_to_scan = 30000;
 
@@ -465,6 +467,7 @@ static void advisor_stop_scan(void)
 	if (ksm_advisor == KSM_ADVISOR_SCAN_TIME)
 		scan_time_advisor();
 }
+#endif /* CONFIG_SELECTIVE_KSM */
 
 #ifdef CONFIG_NUMA
 /* Zeroed when merging across nodes is not allowed */
@@ -957,6 +960,25 @@ static struct folio *ksm_get_folio(struct ksm_stable_node *stable_node,
 	return NULL;
 }
 
+static unsigned char get_rmap_item_age(struct ksm_rmap_item *rmap_item)
+{
+#ifdef CONFIG_SELECTIVE_KSM /* age is immaterial in selective ksm */
+	return 0;
+#else
+	unsigned char age;
+	/*
+	 * Usually ksmd can and must skip the rb_erase, because
+	 * root_unstable_tree was already reset to RB_ROOT.
+	 * But be careful when an mm is exiting: do the rb_erase
+	 * if this rmap_item was inserted by this scan, rather
+	 * than left over from before.
+	 */
+	age = (unsigned char)(ksm_scan.seqnr - rmap_item->address);
+	WARN_ON_ONCE(age > 1);
+	return age;
+#endif /* CONFIG_SELECTIVE_KSM */
+}
+
 /*
  * Removing rmap_item from stable or unstable tree.
  * This function will clean the information from the stable/unstable tree.
@@ -991,16 +1013,7 @@ static void remove_rmap_item_from_tree(struct ksm_rmap_item *rmap_item)
 		rmap_item->address &= PAGE_MASK;
 
 	} else if (rmap_item->address & UNSTABLE_FLAG) {
-		unsigned char age;
-		/*
-		 * Usually ksmd can and must skip the rb_erase, because
-		 * root_unstable_tree was already reset to RB_ROOT.
-		 * But be careful when an mm is exiting: do the rb_erase
-		 * if this rmap_item was inserted by this scan, rather
-		 * than left over from before.
-		 */
-		age = (unsigned char)(ksm_scan.seqnr - rmap_item->address);
-		BUG_ON(age > 1);
+		unsigned char age = get_rmap_item_age(rmap_item);
 		if (!age)
 			rb_erase(&rmap_item->node,
 				 root_unstable_tree + NUMA(rmap_item->nid));
@@ -2203,6 +2216,37 @@ static void stable_tree_append(struct ksm_rmap_item *rmap_item,
 	rmap_item->mm->ksm_merging_pages++;
 }
 
+#ifdef CONFIG_SELECTIVE_KSM
+static int update_checksum(struct page *page, struct ksm_rmap_item *rmap_item)
+{
+	/*
+	 * Typically KSM would wait for a second round to even consider
+	 * the page for unstable tree insertion to ascertain its stability.
+	 * Avoid this when using selective ksm.
+	 */
+	rmap_item->oldchecksum = calc_checksum(page);
+	return 0;
+}
+#else
+static int update_checksum(struct page *page, struct ksm_rmap_item *rmap_item)
+{
+	remove_rmap_item_from_tree(rmap_item);
+
+	/*
+	 * If the hash value of the page has changed from the last time
+	 * we calculated it, this page is changing frequently: therefore we
+	 * don't want to insert it in the unstable tree, and we don't want
+	 * to waste our time searching for something identical to it there.
+	 */
+	unsigned int checksum = calc_checksum(page);
+	if (rmap_item->oldchecksum != checksum) {
+		rmap_item->oldchecksum = checksum;
+		return -EINVAL;
+	}
+	return 0;
+}
+#endif
+
 /*
  * cmp_and_merge_page - first see if page can be merged into the stable tree;
  * if not, compare checksum to previous and if it's the same, see if page can
@@ -2218,7 +2262,6 @@ static void cmp_and_merge_page(struct page *page, struct ksm_rmap_item *rmap_ite
 	struct page *tree_page = NULL;
 	struct ksm_stable_node *stable_node;
 	struct folio *kfolio;
-	unsigned int checksum;
 	int err;
 	bool max_page_sharing_bypass = false;
 
@@ -2241,20 +2284,8 @@ static void cmp_and_merge_page(struct page *page, struct ksm_rmap_item *rmap_ite
 		if (!is_page_sharing_candidate(stable_node))
 			max_page_sharing_bypass = true;
 	} else {
-		remove_rmap_item_from_tree(rmap_item);
-
-		/*
-		 * If the hash value of the page has changed from the last time
-		 * we calculated it, this page is changing frequently: therefore we
-		 * don't want to insert it in the unstable tree, and we don't want
-		 * to waste our time searching for something identical to it there.
-		 */
-		checksum = calc_checksum(page);
-		if (rmap_item->oldchecksum != checksum) {
-			rmap_item->oldchecksum = checksum;
+		if (update_checksum(page, rmap_item))
 			return;
-		}
-
 		if (!try_to_merge_with_zero_page(rmap_item, page))
 			return;
 	}
@@ -2379,6 +2410,111 @@ static struct ksm_rmap_item *get_next_rmap_item(struct ksm_mm_slot *mm_slot,
 	return rmap_item;
 }
 
+#ifdef CONFIG_SELECTIVE_KSM
+static struct ksm_rmap_item *retrieve_rmap_item(struct page **page,
+						struct mm_struct *mm,
+						unsigned long start,
+						unsigned long end)
+{
+	struct ksm_mm_slot *mm_slot;
+	struct mm_slot *slot;
+	struct vm_area_struct *vma;
+	struct ksm_rmap_item *rmap_item;
+	struct vma_iterator vmi;
+
+	lru_add_drain_all();
+
+	if (!ksm_merge_across_nodes) {
+		struct ksm_stable_node *stable_node, *next;
+		struct folio *folio;
+
+		list_for_each_entry_safe(stable_node, next,
+					 &migrate_nodes, list) {
+			folio = ksm_get_folio(stable_node, KSM_GET_FOLIO_NOLOCK);
+			if (folio)
+				folio_put(folio);
+		}
+	}
+
+	spin_lock(&ksm_mmlist_lock);
+	slot = mm_slot_lookup(mm_slots_hash, mm);
+	spin_unlock(&ksm_mmlist_lock);
+
+	if (!slot)
+		return NULL;
+	mm_slot = mm_slot_entry(slot, struct ksm_mm_slot, slot);
+
+	ksm_scan.address = 0;
+	ksm_scan.mm_slot = mm_slot;
+	ksm_scan.rmap_list = &mm_slot->rmap_list;
+
+	vma_iter_init(&vmi, mm, ksm_scan.address);
+
+	mmap_read_lock(mm);
+	for_each_vma(vmi, vma) {
+		if (!(vma->vm_flags & VM_MERGEABLE))
+			continue;
+		if (ksm_scan.address < vma->vm_start)
+			ksm_scan.address = vma->vm_start;
+		if (!vma->anon_vma)
+			ksm_scan.address = vma->vm_end;
+
+		while (ksm_scan.address < vma->vm_end) {
+			struct page *tmp_page = NULL;
+			struct folio_walk fw;
+			struct folio *folio;
+
+			if (ksm_scan.address < start || ksm_scan.address > end)
+				break;
+
+			folio = folio_walk_start(&fw, vma, ksm_scan.address, 0);
+			if (folio) {
+				if (!folio_is_zone_device(folio) &&
+				    folio_test_anon(folio)) {
+					folio_get(folio);
+					tmp_page = fw.page;
+				}
+				folio_walk_end(&fw, vma);
+			}
+
+			if (tmp_page) {
+				flush_anon_page(vma, tmp_page, ksm_scan.address);
+				flush_dcache_page(tmp_page);
+				rmap_item = get_next_rmap_item(mm_slot,
+							       ksm_scan.rmap_list,
+							       ksm_scan.address);
+				if (rmap_item) {
+					ksm_scan.rmap_list =
+						&rmap_item->rmap_list;
+					ksm_scan.address += PAGE_SIZE;
+					*page = tmp_page;
+				} else {
+					folio_put(folio);
+				}
+				mmap_read_unlock(mm);
+				return rmap_item;
+			}
+			ksm_scan.address += PAGE_SIZE;
+		}
+	}
+	mmap_read_unlock(mm);
+	return NULL;
+}
+
+static void ksm_sync_merge(struct mm_struct *mm,
+			   unsigned long start, unsigned long end)
+{
+	struct ksm_rmap_item *rmap_item;
+	struct page *page;
+
+	rmap_item = retrieve_rmap_item(&page, mm, start, end);
+	if (!rmap_item)
+		return;
+	cmp_and_merge_page(page, rmap_item);
+	put_page(page);
+}
+
+#else /* CONFIG_SELECTIVE_KSM */
 /*
  * Calculate skip age for the ksm page age. The age determines how often
  * de-duplicating has already been tried unsuccessfully. If the age is
@@ -2688,6 +2824,7 @@ static int ksm_scan_thread(void *nothing)
 	}
 	return 0;
 }
+#endif /* CONFIG_SELECTIVE_KSM */
 
 static void __ksm_add_vma(struct vm_area_struct *vma)
 {
@@ -3335,9 +3472,10 @@ static ssize_t pages_to_scan_store(struct kobject *kobj,
 	unsigned int nr_pages;
 	int err;
 
+#ifndef CONFIG_SELECTIVE_KSM
 	if (ksm_advisor != KSM_ADVISOR_NONE)
 		return -EINVAL;
-
+#endif
 	err = kstrtouint(buf, 10, &nr_pages);
 	if (err)
 		return -EINVAL;
@@ -3396,6 +3534,65 @@ static ssize_t run_store(struct kobject *kobj, struct kobj_attribute *attr,
 }
 KSM_ATTR(run);
 
+static ssize_t trigger_merge_show(struct kobject *kobj,
+				  struct kobj_attribute *attr,
+				  char *buf)
+{
+	return -EINVAL;	/* Not yet implemented */
+}
+
+static ssize_t trigger_merge_store(struct kobject *kobj,
+				   struct kobj_attribute *attr,
+				   const char *buf, size_t count)
+{
+	unsigned long start, end;
+	pid_t pid;
+	char *input, *ptr;
+	int ret;
+	struct task_struct *task;
+	struct mm_struct *mm;
+
+	input = kstrdup(buf, GFP_KERNEL);
+	if (!input)
+		return -ENOMEM;
+
+	ptr = strim(input);
+	ret = sscanf(ptr, "%d %lx %lx", &pid, &start, &end);
+	kfree(input);
+
+	if (ret != 3)
+		return -EINVAL;
+
+	if (start >= end)
+		return -EINVAL;
+
+	/* Find the mm_struct */
+	rcu_read_lock();
+	task = find_task_by_vpid(pid);
+	if (!task) {
+		rcu_read_unlock();
+		return -ESRCH;
+	}
+
+	get_task_struct(task);
+
+	rcu_read_unlock();
+	mm = get_task_mm(task);
+	put_task_struct(task);
+
+	if (!mm)
+		return -EINVAL;
+
+	mutex_lock(&ksm_thread_mutex);
+	wait_while_offlining();
+	ksm_sync_merge(mm, start, end);
+	mutex_unlock(&ksm_thread_mutex);
+
+	mmput(mm);
+	return count;
+}
+KSM_ATTR(trigger_merge);
+
 #ifdef CONFIG_NUMA
 static ssize_t merge_across_nodes_show(struct kobject *kobj,
 				       struct kobj_attribute *attr, char *buf)
@@ -3635,6 +3832,7 @@ static ssize_t full_scans_show(struct kobject *kobj,
 }
 KSM_ATTR_RO(full_scans);
 
+#ifndef CONFIG_SELECTIVE_KSM
 static ssize_t smart_scan_show(struct kobject *kobj,
 			       struct kobj_attribute *attr, char *buf)
 {
@@ -3780,11 +3978,13 @@ static ssize_t advisor_target_scan_time_store(struct kobject *kobj,
 	return count;
 }
 KSM_ATTR(advisor_target_scan_time);
+#endif /* CONFIG_SELECTIVE_KSM */
 
 static struct attribute *ksm_attrs[] = {
 	&sleep_millisecs_attr.attr,
 	&pages_to_scan_attr.attr,
 	&run_attr.attr,
+	&trigger_merge_attr.attr,
 	&pages_scanned_attr.attr,
 	&pages_shared_attr.attr,
 	&pages_sharing_attr.attr,
@@ -3802,12 +4002,14 @@ static struct attribute *ksm_attrs[] = {
 	&stable_node_chains_prune_millisecs_attr.attr,
 	&use_zero_pages_attr.attr,
 	&general_profit_attr.attr,
+#ifndef CONFIG_SELECTIVE_KSM
 	&smart_scan_attr.attr,
 	&advisor_mode_attr.attr,
 	&advisor_max_cpu_attr.attr,
 	&advisor_min_pages_to_scan_attr.attr,
 	&advisor_max_pages_to_scan_attr.attr,
 	&advisor_target_scan_time_attr.attr,
+#endif
 	NULL,
 };
 
@@ -3815,40 +4017,63 @@ static const struct attribute_group ksm_attr_group = {
 	.attrs = ksm_attrs,
 	.name = "ksm",
 };
+
+static int __init ksm_sysfs_init(void)
+{
+	return sysfs_create_group(mm_kobj, &ksm_attr_group);
+}
+#else /* CONFIG_SYSFS */
+static int __init ksm_sysfs_init(void)
+{
+	ksm_run = KSM_RUN_MERGE;	/* no way for user to start it */
+	return 0;
+}
 #endif /* CONFIG_SYSFS */
 
-static int __init ksm_init(void)
+#ifdef CONFIG_SELECTIVE_KSM
+static int __init ksm_thread_sysfs_init(void)
+{
+	return ksm_sysfs_init();
+}
+#else /* CONFIG_SELECTIVE_KSM */
+static int __init ksm_thread_sysfs_init(void)
 {
 	struct task_struct *ksm_thread;
 	int err;
 
-	/* The correct value depends on page size and endianness */
-	zero_checksum = calc_checksum(ZERO_PAGE(0));
-	/* Default to false for backwards compatibility */
-	ksm_use_zero_pages = false;
-
-	err = ksm_slab_init();
-	if (err)
-		goto out;
-
 	ksm_thread = kthread_run(ksm_scan_thread, NULL, "ksmd");
 	if (IS_ERR(ksm_thread)) {
 		pr_err("ksm: creating kthread failed\n");
 		err = PTR_ERR(ksm_thread);
-		goto out_free;
+		return err;
 	}
 
-#ifdef CONFIG_SYSFS
-	err = sysfs_create_group(mm_kobj, &ksm_attr_group);
+	err = ksm_sysfs_init();
 	if (err) {
 		pr_err("ksm: register sysfs failed\n");
 		kthread_stop(ksm_thread);
-		goto out_free;
 	}
-#else
-	ksm_run = KSM_RUN_MERGE;	/* no way for user to start it */
 
-#endif /* CONFIG_SYSFS */
+	return err;
+}
+#endif /* CONFIG_SELECTIVE_KSM */
+
+static int __init ksm_init(void)
+{
+	int err;
+
+	/* The correct value depends on page size and endianness */
+	zero_checksum = calc_checksum(ZERO_PAGE(0));
+	/* Default to false for backwards compatibility */
+	ksm_use_zero_pages = false;
+
+	err = ksm_slab_init();
+	if (err)
+		goto out;
+
+	err = ksm_thread_sysfs_init();
+	if (err)
+		goto out_free;
 
 #ifdef CONFIG_MEMORY_HOTREMOVE
 	/* There is no significance to this priority 100 */

From patchwork Fri Mar 21 17:37:26 2025
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Sourav Panda
X-Patchwork-Id: 14025836
Date: Fri, 21 Mar 2025 17:37:26 +0000
In-Reply-To: <20250321173729.3175898-1-souravpanda@google.com>
References: <20250321173729.3175898-1-souravpanda@google.com>
X-Mailer: git-send-email 2.49.0.395.g12beb8f557-goog
Message-ID: <20250321173729.3175898-4-souravpanda@google.com>
Subject: [RFC PATCH 3/6] mm: make Selective KSM partitioned
From: Sourav Panda
To: mathieu.desnoyers@efficios.com, willy@infradead.org, david@redhat.com,
 pasha.tatashin@soleen.com, rientjes@google.com, akpm@linux-foundation.org,
 linux-mm@kvack.org, linux-kernel@vger.kernel.org, weixugc@google.com,
 gthelen@google.com, souravpanda@google.com, surenb@google.com
X-HE-Meta:
U2FsdGVkX19XHDJciqCb+JtjybuxrPxus+v+vA4VmjMQAq1Cs3yEzKXXjtLgUaEqJOnTNWh4GiUregzPpaj9NGnOnQQyzV7mVkSzKpncxPVMM6XO+SVoSGXIGV8xqsrsZbMe9VYo6CpECaopFFKxNBUmfzY6K9PG3S5t2VdIL/l+Ds47bSD8mhSfx/xBWSddGMqKurNLfRx2b/B1hUFBPkQe5vClv74ICr6HJ46XdcN/bV0Kg0pbZqeeFPXRTwXfxby3kj6uV1b2Fq9wAveJJVxZq9h+xFJyZA+NiiwSMeooWUtN02lUTp4B/+9Fcogg70pbj1NBUO4zqJR5Wga0974J+wITkgfc8g+rzFlegSer09wRY5N9GRplw3eGK2W5hXChDtDOWp6Plb8FFciwDndvJBWhsAnunsPmlCVeaj8Rkq8rZWK6/1Tre85ZVTMGseVhmKrMm4mBiFw+bA+b9N84bugXDaxUTDOAINlwZJj22wtXpdaX6va4S0mj+UAuTx+ZXKhAM75RJ7iBBRlRR2f5salKxTdU7GQlQtqELZe9S0jXoZkyPZqRJGQyWmznX54A9ixY/spBSBWpp+7oHa2LtftdAoepTofsAnZgbCuSiQZaXP0fFVxm24ImC7m0jkb4bZ6cwg6cPO25cyvFs+mUcvaaAD+xNVLh8WvlynTO+i+oy2iQ5ADbbWA6vELZx1ZyATKwYE90Algswyl3k5IPbiLUAylrV4Bl2hpsaDhOBfS7XL0J0af+PoikYlxkK2sLzqUoAswMyb4iZzplcm45QCtwSWRHHSYU0wuDxQMrseB9BSqfEGOnos9hGOKUWMv7Xb3rjIBQziHmZaAc9CKEAU0otq/JxGVNQqt6nnJT/6xrr8PCrSXvFo6WU/V8wyFJBaQnKz1Jmnl0pb896tg0BRVdJIlnKweBJMdm73FPEn5heIlOf5rV7oTaF1w72tJEoSCbkpm2TZTShJA xTF/Hj4v 7KWDy3rPv32je9/VOLb8Sn0896Z69JMDPd95adkXni/TNLNyzGNpGoyWS/8mG0z7UqwJ3cSWBICfSsLiZ2eHZtZTVGupCnKOxsrw3K7pYsNiOy+GcFBC1Yf5k4aqAQwk76ItJsi+YGGp2RiZnGjEy+tmSMU8+hmI2L71QkRFQ0TTU0TZw7fFgKLqCc4l03FF1dfWdjxiOBGrIuTvKk9grZoXjaSKWObxDQTG6Zxqr8wV23kiOFs9c0uMcb94YWdHhh/MrCHQNwJZHGccDi0ynQEbbQXCPF/9XgqJyeTPOc6FqkcFGw5Ua2yasNnz5E5fF+xBOCtYTx5nK9jlrUwCDBrpSC0qcc9hfe1SpF9flnhL/3Sc/0LLh/XE4i/F7FWjX7eQJPuR2CGQnUOoGlvvu5kp1Zpaj7HRIZ0m96IoCLz9NvQrBVmRnL2bqyp1D+I8LGh4Nmy4E2Uf3Re/raHERNjQjhQYiPaKuP/j6j5s6Jc+IJ8gV3zR3NVysnILzpC7uP5OWXBohwAmgixPoSs+PXC0LTBPt+/X7pXAZXPYcjbBRWO0Es52ncFj/Kx9LodNUXH4nFy3rMSWcJvA= X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: List-Subscribe: List-Unsubscribe: Create a sysfs interface to partition the KSM merge space. We add a new sysfs file, namely add_partition. Which is used to specify the name of the new partition. 
Once a partition is created, we get the traditional files typically
available in KSM under each partition. These sysfs interface changes are
in preparation for the following patch, which actually partitions the
merge space (e.g., prevents page comparison and merging across
partitions).

KSM_SYSFS=/sys/kernel/mm/ksm
echo "part_1" > ${KSM_SYSFS}/control/add_partition
ls ${KSM_SYSFS}/part_1/
	pages_scanned pages_to_scan sleep_millisecs ...
echo "pid start_addr end_addr" > ${KSM_SYSFS}/part_1/trigger_merge

Signed-off-by: Sourav Panda
---
 mm/ksm.c | 101 +++++++++++++++++++++++++++++++++++++++++++++++++++----
 1 file changed, 95 insertions(+), 6 deletions(-)

diff --git a/mm/ksm.c b/mm/ksm.c
index b2f184557ed9..927e257c48b5 100644
--- a/mm/ksm.c
+++ b/mm/ksm.c
@@ -3832,7 +3832,17 @@ static ssize_t full_scans_show(struct kobject *kobj,
 }
 KSM_ATTR_RO(full_scans);
 
-#ifndef CONFIG_SELECTIVE_KSM
+#ifdef CONFIG_SELECTIVE_KSM
+static struct kobject *ksm_base_kobj;
+
+struct partition_kobj {
+	struct kobject *kobj;
+	struct list_head list;
+};
+
+static LIST_HEAD(partition_list);
+
+#else /* CONFIG_SELECTIVE_KSM */
 static ssize_t smart_scan_show(struct kobject *kobj,
 			       struct kobj_attribute *attr, char *buf)
 {
@@ -4015,15 +4025,22 @@ static struct attribute *ksm_attrs[] = {
 
 static const struct attribute_group ksm_attr_group = {
 	.attrs = ksm_attrs,
+#ifndef CONFIG_SELECTIVE_KSM
 	.name = "ksm",
+#endif
 };
 
-static int __init ksm_sysfs_init(void)
+static int __init ksm_sysfs_init(struct kobject *kobj,
+				 const struct attribute_group *grp)
 {
-	return sysfs_create_group(mm_kobj, &ksm_attr_group);
+	int err;
+
+	err = sysfs_create_group(kobj, grp);
+	return err;
 }
 #else /* CONFIG_SYSFS */
-static int __init ksm_sysfs_init(void)
+static int __init ksm_sysfs_init(struct kobject *kobj,
+				 const struct attribute_group *grp)
 {
 	ksm_run = KSM_RUN_MERGE;	/* no way for user to start it */
 	return 0;
@@ -4031,9 +4048,81 @@ static int __init ksm_sysfs_init(void)
 #endif /* CONFIG_SYSFS */
 
 #ifdef CONFIG_SELECTIVE_KSM
+static ssize_t add_partition_store(struct kobject *kobj,
+				   struct kobj_attribute *attr,
+				   const char *buf, size_t count)
+{
+	struct partition_kobj *new_partition_kobj;
+	char partition_name[50];
+	int err;
+
+	mutex_lock(&ksm_thread_mutex);
+
+	if (count >= sizeof(partition_name)) {
+		err = -EINVAL; /* Prevent buffer overflow */
+		goto unlock;
+	}
+
+	snprintf(partition_name, sizeof(partition_name),
+		 "%.*s", (int)(count - 1), buf); /* Remove newline */
+
+	/* Allocate memory for new dynamic kobject entry */
+	new_partition_kobj = kmalloc(sizeof(*new_partition_kobj), GFP_KERNEL);
+	if (!new_partition_kobj) {
+		err = -ENOMEM;
+		goto unlock;
+	}
+
+	new_partition_kobj->kobj = kobject_create_and_add(partition_name,
+							  ksm_base_kobj);
+	if (!new_partition_kobj->kobj) {
+		kfree(new_partition_kobj);
+		err = -ENOMEM;
+		goto unlock;
+	}
+
+	err = sysfs_create_group(new_partition_kobj->kobj, &ksm_attr_group);
+	if (err) {
+		pr_err("ksm: register sysfs failed\n");
+		kfree(new_partition_kobj);
+		err = -ENOMEM;
+		goto unlock;
+	}
+
+	list_add(&new_partition_kobj->list, &partition_list);
+
+unlock:
+	mutex_unlock(&ksm_thread_mutex);
+	return err ? err : count;
+}
+
+static struct kobj_attribute add_kobj_attr = __ATTR(add_partition, 0220, NULL,
+						     add_partition_store);
+
+/* Array of attributes for base kobject */
+static struct attribute *ksm_base_attrs[] = {
+	&add_kobj_attr.attr,
+	NULL, /* NULL-terminated */
+};
+
+/* Attribute group for base kobject */
+static struct attribute_group ksm_base_attr_group = {
+	.name = "control",
+	.attrs = ksm_base_attrs,
+};
+
 static int __init ksm_thread_sysfs_init(void)
 {
-	return ksm_sysfs_init();
+	int err;
+
+	ksm_base_kobj = kobject_create_and_add("ksm", mm_kobj);
+	if (!ksm_base_kobj) {
+		err = -ENOMEM;
+		return err;
+	}
+
+	err = ksm_sysfs_init(ksm_base_kobj, &ksm_base_attr_group);
+	return err;
 }
 #else /* CONFIG_SELECTIVE_KSM */
 static int __init ksm_thread_sysfs_init(void)
@@ -4048,7 +4137,7 @@ static int __init ksm_thread_sysfs_init(void)
 		return err;
 	}
 
-	err = ksm_sysfs_init();
+	err = ksm_sysfs_init(mm_kobj, &ksm_attr_group);
 	if (err) {
 		pr_err("ksm: register sysfs failed\n");
 		kthread_stop(ksm_thread);

From patchwork Fri Mar 21 17:37:27 2025
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Sourav Panda
X-Patchwork-Id: 14025837
Date: Fri, 21 Mar 2025 17:37:27 +0000
In-Reply-To: <20250321173729.3175898-1-souravpanda@google.com>
Mime-Version: 1.0
References: <20250321173729.3175898-1-souravpanda@google.com>
X-Mailer: git-send-email 2.49.0.395.g12beb8f557-goog
Message-ID: <20250321173729.3175898-5-souravpanda@google.com>
Subject: [RFC PATCH 4/6] mm: create dedicated trees for SELECTIVE KSM partitions
From: Sourav Panda
To: mathieu.desnoyers@efficios.com, willy@infradead.org, david@redhat.com,
 pasha.tatashin@soleen.com, rientjes@google.com, akpm@linux-foundation.org,
 linux-mm@kvack.org, linux-kernel@vger.kernel.org, weixugc@google.com,
 gthelen@google.com, souravpanda@google.com, surenb@google.com

Extend KSM to create dedicated stable and unstable trees for each
partition.

Signed-off-by: Sourav Panda
---
 mm/ksm.c | 165 +++++++++++++++++++++++++++++++++++++------------------
 1 file changed, 111 insertions(+), 54 deletions(-)

diff --git a/mm/ksm.c b/mm/ksm.c
index 927e257c48b5..b575250aaf45 100644
--- a/mm/ksm.c
+++ b/mm/ksm.c
@@ -144,6 +144,28 @@ struct ksm_scan {
 	unsigned long seqnr;
 };
 
+static struct kobject *ksm_base_kobj;
+
+struct partition_kobj {
+	struct kobject *kobj;
+	struct list_head list;
+	struct rb_root *root_stable_tree;
+	struct rb_root *root_unstable_tree;
+};
+
+static LIST_HEAD(partition_list);
+
+static struct partition_kobj *find_partition_by_kobj(struct kobject *kobj)
+{
+	struct partition_kobj *partition;
+
+	list_for_each_entry(partition, &partition_list, list) {
+		if (partition->kobj == kobj)
+			return partition;
+	}
+	return NULL;
+}
+
 /**
  * struct ksm_stable_node - node of the stable rbtree
  * @node: rb node of this ksm page in the stable tree
@@ -182,6 +204,7 @@ struct ksm_stable_node {
 #ifdef CONFIG_NUMA
 	int nid;
 #endif
+	struct partition_kobj *partition;
 };
 
 /**
@@ -218,6 +241,7 @@ struct ksm_rmap_item {
 			struct hlist_node hlist;
 		};
 	};
+	struct partition_kobj *partition;
 };
 
 #define SEQNR_MASK	0x0ff	/* low bits of unstable tree seqnr */
@@ -227,8 +251,6 @@ struct ksm_rmap_item {
 /* The stable and unstable tree heads */
 static struct rb_root one_stable_tree[1] = { RB_ROOT };
 static struct rb_root one_unstable_tree[1] = { RB_ROOT };
-static struct rb_root *root_stable_tree = one_stable_tree;
-static struct rb_root *root_unstable_tree = one_unstable_tree;
 
 /* Recently migrated nodes of stable tree, pending proper placement */
 static LIST_HEAD(migrate_nodes);
@@ -555,7 +577,7 @@ static inline void stable_node_dup_del(struct ksm_stable_node *dup)
 	if (is_stable_node_dup(dup))
 		__stable_node_dup_del(dup);
 	else
-		rb_erase(&dup->node, root_stable_tree + NUMA(dup->nid));
+		rb_erase(&dup->node, dup->partition->root_stable_tree + NUMA(dup->nid));
 #ifdef CONFIG_DEBUG_VM
 	dup->head = NULL;
 #endif
@@ -580,14 +602,20 @@ static inline void free_rmap_item(struct ksm_rmap_item *rmap_item)
 	kmem_cache_free(rmap_item_cache, rmap_item);
 }
 
-static inline struct ksm_stable_node *alloc_stable_node(void)
+static inline struct ksm_stable_node *alloc_stable_node(struct partition_kobj *partition)
 {
 	/*
 	 * The allocation can take too long with GFP_KERNEL when memory is under
 	 * pressure, which may lead to hung task warnings. Adding __GFP_HIGH
 	 * grants access to memory reserves, helping to avoid this problem.
 	 */
-	return kmem_cache_alloc(stable_node_cache, GFP_KERNEL | __GFP_HIGH);
+	struct ksm_stable_node *node = kmem_cache_alloc(stable_node_cache,
+							GFP_KERNEL | __GFP_HIGH);
+
+	if (node)
+		node->partition = partition;
+
+	return node;
 }
 
 static inline void free_stable_node(struct ksm_stable_node *stable_node)
@@ -777,9 +805,10 @@ static inline int get_kpfn_nid(unsigned long kpfn)
 }
 
 static struct ksm_stable_node *alloc_stable_node_chain(struct ksm_stable_node *dup,
-						   struct rb_root *root)
+						   struct rb_root *root,
+						   struct partition_kobj *partition)
 {
-	struct ksm_stable_node *chain = alloc_stable_node();
+	struct ksm_stable_node *chain = alloc_stable_node(partition);
 	VM_BUG_ON(is_stable_node_chain(dup));
 	if (likely(chain)) {
 		INIT_HLIST_HEAD(&chain->hlist);
@@ -1016,7 +1045,8 @@ static void remove_rmap_item_from_tree(struct ksm_rmap_item *rmap_item)
 		unsigned char age = get_rmap_item_age(rmap_item);
 		if (!age)
 			rb_erase(&rmap_item->node,
-				 root_unstable_tree + NUMA(rmap_item->nid));
+				 rmap_item->partition->root_unstable_tree +
+				 NUMA(rmap_item->nid));
 		ksm_pages_unshared--;
 		rmap_item->address &= PAGE_MASK;
 	}
@@ -1154,17 +1184,23 @@ static int remove_all_stable_nodes(void)
 	struct ksm_stable_node *stable_node, *next;
 	int nid;
 	int err = 0;
-
-	for (nid = 0; nid < ksm_nr_node_ids; nid++) {
-		while (root_stable_tree[nid].rb_node) {
-			stable_node = rb_entry(root_stable_tree[nid].rb_node,
-						struct ksm_stable_node, node);
-			if (remove_stable_node_chain(stable_node,
-						     root_stable_tree + nid)) {
-				err = -EBUSY;
-				break;	/* proceed to next nid */
+	struct partition_kobj *partition;
+	struct rb_root *root_stable_tree;
+
+	list_for_each_entry(partition, &partition_list, list) {
+		root_stable_tree = partition->root_stable_tree;
+
+		for (nid = 0; nid < ksm_nr_node_ids; nid++) {
+			while (root_stable_tree[nid].rb_node) {
+				stable_node = rb_entry(root_stable_tree[nid].rb_node,
+						       struct ksm_stable_node, node);
+				if (remove_stable_node_chain(stable_node,
+							     root_stable_tree + nid)) {
+					err = -EBUSY;
+					break;	/* proceed to next nid */
+				}
+				cond_resched();
 			}
-			cond_resched();
 		}
 	}
 	list_for_each_entry_safe(stable_node, next, &migrate_nodes, list) {
@@ -1802,7 +1838,8 @@ static __always_inline struct folio *chain(struct ksm_stable_node **s_n_d,
  * This function returns the stable tree node of identical content if found,
  * -EBUSY if the stable node's page is being migrated, NULL otherwise.
  */
-static struct folio *stable_tree_search(struct page *page)
+static struct folio *stable_tree_search(struct page *page,
+					struct partition_kobj *partition)
 {
 	int nid;
 	struct rb_root *root;
@@ -1821,7 +1858,7 @@ static struct folio *stable_tree_search(struct page *page)
 	}
 
 	nid = get_kpfn_nid(folio_pfn(folio));
-	root = root_stable_tree + nid;
+	root = partition->root_stable_tree + nid;
 again:
 	new = &root->rb_node;
 	parent = NULL;
@@ -1991,7 +2028,7 @@ static struct folio *stable_tree_search(struct page *page)
 		VM_BUG_ON(is_stable_node_dup(stable_node_dup));
 		/* chain is missing so create it */
 		stable_node = alloc_stable_node_chain(stable_node_dup,
-						      root);
+						      root, partition);
 		if (!stable_node)
 			return NULL;
 	}
@@ -2016,7 +2053,8 @@ static struct folio *stable_tree_search(struct page *page)
  * This function returns the stable tree node just allocated on success,
  * NULL otherwise.
  */
-static struct ksm_stable_node *stable_tree_insert(struct folio *kfolio)
+static struct ksm_stable_node *stable_tree_insert(struct folio *kfolio,
+						  struct partition_kobj *partition)
 {
 	int nid;
 	unsigned long kpfn;
@@ -2028,7 +2066,7 @@ static struct ksm_stable_node *stable_tree_insert(struct folio *kfolio)
 
 	kpfn = folio_pfn(kfolio);
 	nid = get_kpfn_nid(kpfn);
-	root = root_stable_tree + nid;
+	root = partition->root_stable_tree + nid;
 again:
 	parent = NULL;
 	new = &root->rb_node;
@@ -2067,7 +2105,7 @@ static struct ksm_stable_node *stable_tree_insert(struct folio *kfolio)
 		}
 	}
 
-	stable_node_dup = alloc_stable_node();
+	stable_node_dup = alloc_stable_node(partition);
 	if (!stable_node_dup)
 		return NULL;
 
@@ -2082,7 +2120,8 @@ static struct ksm_stable_node *stable_tree_insert(struct folio *kfolio)
 		if (!is_stable_node_chain(stable_node)) {
 			struct ksm_stable_node *orig = stable_node;
 			/* chain is missing so create it */
-			stable_node = alloc_stable_node_chain(orig, root);
+			stable_node = alloc_stable_node_chain(orig, root,
+							      partition);
 			if (!stable_node) {
 				free_stable_node(stable_node_dup);
 				return NULL;
@@ -2121,7 +2160,7 @@ struct ksm_rmap_item *unstable_tree_search_insert(struct ksm_rmap_item *rmap_ite
 	int nid;
 
 	nid = get_kpfn_nid(page_to_pfn(page));
-	root = root_unstable_tree + nid;
+	root = rmap_item->partition->root_unstable_tree + nid;
 	new = &root->rb_node;
 
 	while (*new) {
@@ -2291,7 +2330,7 @@ static void cmp_and_merge_page(struct page *page, struct ksm_rmap_item *rmap_ite
 	}
 
 	/* Start by searching for the folio in the stable tree */
-	kfolio = stable_tree_search(page);
+	kfolio = stable_tree_search(page, rmap_item->partition);
 	if (&kfolio->page == page && rmap_item->head == stable_node) {
 		folio_put(kfolio);
 		return;
@@ -2344,7 +2383,8 @@ static void cmp_and_merge_page(struct page *page, struct ksm_rmap_item *rmap_ite
 			 * node in the stable tree and add both rmap_items.
 			 */
 			folio_lock(kfolio);
-			stable_node = stable_tree_insert(kfolio);
+			stable_node = stable_tree_insert(kfolio,
+							 rmap_item->partition);
 			if (stable_node) {
 				stable_tree_append(tree_rmap_item, stable_node,
 						   false);
@@ -2502,7 +2542,8 @@ static struct ksm_rmap_item *retrieve_rmap_item(struct page **page,
 }
 
 static void ksm_sync_merge(struct mm_struct *mm,
-			   unsigned long start, unsigned long end)
+			   unsigned long start, unsigned long end,
+			   struct partition_kobj *partition)
 {
 	struct ksm_rmap_item *rmap_item;
 	struct page *page;
@@ -2510,6 +2551,7 @@ static void ksm_sync_merge(struct mm_struct *mm,
 	rmap_item = retrieve_rmap_item(&page, mm, start, end);
 	if (!rmap_item)
 		return;
+	rmap_item->partition = partition;
 	cmp_and_merge_page(page, rmap_item);
 	put_page(page);
 }
@@ -3328,19 +3370,23 @@ static void ksm_check_stable_tree(unsigned long start_pfn,
 	struct ksm_stable_node *stable_node, *next;
 	struct rb_node *node;
 	int nid;
-
-	for (nid = 0; nid < ksm_nr_node_ids; nid++) {
-		node = rb_first(root_stable_tree + nid);
-		while (node) {
-			stable_node = rb_entry(node, struct ksm_stable_node, node);
-			if (stable_node_chain_remove_range(stable_node,
-							   start_pfn, end_pfn,
-							   root_stable_tree +
-							   nid))
-				node = rb_first(root_stable_tree + nid);
-			else
-				node = rb_next(node);
-			cond_resched();
+	struct rb_root *root_stable_tree;
+
+	list_for_each_entry(partition, &partition_list, list) {
+		root_stable_tree = partition->root_stable_tree;
+
+		for (nid = 0; nid < ksm_nr_node_ids; nid++) {
+			node = rb_first(root_stable_tree + nid);
+			while (node) {
+				stable_node = rb_entry(node, struct ksm_stable_node, node);
+				if (stable_node_chain_remove_range(stable_node,
+								   start_pfn, end_pfn,
+								   root_stable_tree + nid))
+					node = rb_first(root_stable_tree + nid);
+				else
+					node = rb_next(node);
+				cond_resched();
+			}
 		}
 	}
 	list_for_each_entry_safe(stable_node, next, &migrate_nodes, list) {
@@ -3551,6 +3597,7 @@ static ssize_t trigger_merge_store(struct kobject *kobj,
 	int ret;
 	struct task_struct *task;
 	struct mm_struct *mm;
+	struct partition_kobj *partition;
 
 	input = kstrdup(buf, GFP_KERNEL);
 	if (!input)
@@ -3583,9 +3630,13 @@ static ssize_t trigger_merge_store(struct kobject *kobj,
 	if (!mm)
 		return -EINVAL;
 
+	partition = find_partition_by_kobj(kobj);
+	if (!partition)
+		return -EINVAL;
+
 	mutex_lock(&ksm_thread_mutex);
 	wait_while_offlining();
-	ksm_sync_merge(mm, start, end);
+	ksm_sync_merge(mm, start, end, partition);
 	mutex_unlock(&ksm_thread_mutex);
 
 	mmput(mm);
@@ -3606,6 +3657,8 @@ static ssize_t merge_across_nodes_store(struct kobject *kobj,
 {
 	int err;
 	unsigned long knob;
+	struct rb_root *root_stable_tree;
+	struct partition_kobj *partition;
 
 	err = kstrtoul(buf, 10, &knob);
 	if (err)
@@ -3615,6 +3668,10 @@ static ssize_t merge_across_nodes_store(struct kobject *kobj,
 
 	mutex_lock(&ksm_thread_mutex);
 	wait_while_offlining();
+
+	partition = find_partition_by_kobj(kobj);
+	root_stable_tree = partition->root_stable_tree;
+
 	if (ksm_merge_across_nodes != knob) {
 		if (ksm_pages_shared || remove_all_stable_nodes())
 			err = -EBUSY;
@@ -3633,10 +3690,10 @@ static ssize_t merge_across_nodes_store(struct kobject *kobj,
 			if (!buf)
 				err = -ENOMEM;
 			else {
-				root_stable_tree = buf;
-				root_unstable_tree = buf + nr_node_ids;
+				partition->root_stable_tree = buf;
+				partition->root_unstable_tree = buf + nr_node_ids;
 				/* Stable tree is empty but not the unstable */
-				root_unstable_tree[0] = one_unstable_tree[0];
+				partition->root_unstable_tree[0] = one_unstable_tree[0];
 			}
 		}
 		if (!err) {
@@ -3834,14 +3891,6 @@ KSM_ATTR_RO(full_scans);
 
 #ifdef CONFIG_SELECTIVE_KSM
 static struct kobject *ksm_base_kobj;
-
-struct partition_kobj {
-	struct kobject *kobj;
-	struct list_head list;
-};
-
-static LIST_HEAD(partition_list);
-
 #else /* CONFIG_SELECTIVE_KSM */
 static ssize_t smart_scan_show(struct kobject *kobj,
 			       struct kobj_attribute *attr, char *buf)
@@ -4055,6 +4104,7 @@ static ssize_t add_partition_store(struct kobject *kobj,
 	struct partition_kobj *new_partition_kobj;
 	char partition_name[50];
 	int err;
+	struct rb_root *tree_root;
 
 	mutex_lock(&ksm_thread_mutex);
 
@@ -4081,6 +4131,13 @@ static ssize_t add_partition_store(struct kobject *kobj,
 		goto unlock;
 	}
 
+	tree_root = kcalloc(nr_node_ids + nr_node_ids, sizeof(*tree_root), GFP_KERNEL);
+	if (!tree_root) {
+		err = -ENOMEM;
+		goto unlock;
+	}
+	new_partition_kobj->root_stable_tree = tree_root;
+	new_partition_kobj->root_unstable_tree = tree_root + nr_node_ids;
 	err = sysfs_create_group(new_partition_kobj->kobj, &ksm_attr_group);
 	if (err) {
 		pr_err("ksm: register sysfs failed\n");

From patchwork Fri Mar 21 17:37:28 2025
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Sourav Panda
X-Patchwork-Id: 14025838
Date: Fri, 21 Mar 2025 17:37:28 +0000
In-Reply-To: <20250321173729.3175898-1-souravpanda@google.com>
Mime-Version: 1.0
References: <20250321173729.3175898-1-souravpanda@google.com>
X-Mailer: git-send-email 2.49.0.395.g12beb8f557-goog
Message-ID: <20250321173729.3175898-6-souravpanda@google.com>
Subject: [RFC PATCH 5/6] mm: trigger unmerge and remove SELECTIVE KSM partition
From: Sourav Panda
To: mathieu.desnoyers@efficios.com, willy@infradead.org, david@redhat.com,
 pasha.tatashin@soleen.com, rientjes@google.com, akpm@linux-foundation.org,
 linux-mm@kvack.org, linux-kernel@vger.kernel.org, weixugc@google.com,
 gthelen@google.com, souravpanda@google.com, surenb@google.com

Trigger an unmerge or remove a partition using the following sysfs
interface:

Triggering an unmerge for a specific partition:

	echo "pid" > /sys/kernel/mm/ksm/partition_name/trigger_unmerge

Removing a partition:

	echo "partition_to_remove" > /sys/kernel/mm/ksm/control/remove_partition

Limitations of the current implementation:

On trigger_unmerge, we unmerge all rmap items, which is wrong. We should
only unmerge the rmap items that belong to the partition on which the
unmerge was triggered.

Another limitation is that no address range can be specified when
writing to trigger_unmerge. This is intentionally left out until we
determine whether it is feasible to implement.
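For scripted use, the two sysfs writes described above could be wrapped
from C roughly as follows. This is a minimal userspace sketch and not
part of the patch: the helper names (write_string, ksm_unmerge_path,
ksm_trigger_unmerge, ksm_remove_partition) and the root parameter are
hypothetical, and only the file paths are taken from the description
above. On a kernel carrying this series, root would presumably be
"/sys/kernel/mm/ksm".

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

/* Write a NUL-terminated string to an existing file. Returns 0 or -errno. */
int write_string(const char *path, const char *value)
{
	size_t len;
	ssize_t n;
	int fd, err;

	assert(path && value);
	len = strlen(value);

	fd = open(path, O_WRONLY);
	if (fd < 0)
		return -errno;

	n = write(fd, value, len);
	/* Capture the result before close() can overwrite errno. */
	err = (n < 0) ? -errno : ((size_t)n == len ? 0 : -EIO);
	close(fd);
	return err;
}

/* Build "<root>/<partition>/trigger_unmerge". Returns 0 or -ENAMETOOLONG. */
int ksm_unmerge_path(char *out, size_t size, const char *root,
		     const char *partition)
{
	int n = snprintf(out, size, "%s/%s/trigger_unmerge", root, partition);

	return (n < 0 || (size_t)n >= size) ? -ENAMETOOLONG : 0;
}

/* Equivalent of: echo "pid" > <root>/<partition>/trigger_unmerge */
int ksm_trigger_unmerge(const char *root, const char *partition, pid_t pid)
{
	char path[256], value[32];
	int err = ksm_unmerge_path(path, sizeof(path), root, partition);

	if (err)
		return err;
	snprintf(value, sizeof(value), "%d", (int)pid);
	return write_string(path, value);
}

/* Equivalent of: echo "name" > <root>/control/remove_partition */
int ksm_remove_partition(const char *root, const char *partition)
{
	char path[256];
	int n = snprintf(path, sizeof(path), "%s/control/remove_partition", root);

	if (n < 0 || (size_t)n >= sizeof(path))
		return -ENAMETOOLONG;
	return write_string(path, partition);
}
```

Passing the root directory in, rather than hard-coding it, keeps the
helpers testable on machines that do not run a kernel with this series.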
Signed-off-by: Sourav Panda
---
 mm/ksm.c | 120 +++++++++++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 120 insertions(+)

diff --git a/mm/ksm.c b/mm/ksm.c
index b575250aaf45..fd7626d5d8c9 100644
--- a/mm/ksm.c
+++ b/mm/ksm.c
@@ -2556,6 +2556,31 @@ static void ksm_sync_merge(struct mm_struct *mm,
 	put_page(page);
 }
 
+static void ksm_sync_unmerge(struct mm_struct *mm)
+{
+	struct mm_slot *slot;
+	struct ksm_mm_slot *mm_slot;
+
+	struct vm_area_struct *vma;
+	struct vma_iterator vmi;
+
+	slot = mm_slot_lookup(mm_slots_hash, mm);
+	mm_slot = container_of(slot, struct ksm_mm_slot, slot);
+
+	ksm_scan.address = 0;
+	vma_iter_init(&vmi, mm, ksm_scan.address);
+
+	mmap_read_lock(mm);
+	for_each_vma(vmi, vma) {
+		if (!(vma->vm_flags & VM_MERGEABLE) || !vma->anon_vma)
+			continue;
+		unmerge_ksm_pages(vma, vma->vm_start, vma->vm_end, false);
+	}
+	remove_trailing_rmap_items(&mm_slot->rmap_list);
+
+	mmap_read_unlock(mm);
+}
+
 #else /* CONFIG_SELECTIVE_KSM */
 /*
  * Calculate skip age for the ksm page age. The age determines how often
@@ -3644,6 +3669,58 @@ static ssize_t trigger_merge_store(struct kobject *kobj,
 }
 KSM_ATTR(trigger_merge);
 
+static ssize_t trigger_unmerge_show(struct kobject *kobj,
+				    struct kobj_attribute *attr,
+				    char *buf)
+{
+	return -EINVAL;	/* Not yet implemented */
+}
+
+static ssize_t trigger_unmerge_store(struct kobject *kobj,
+				     struct kobj_attribute *attr,
+				     const char *buf, size_t count)
+{
+	pid_t pid;
+	char *input, *ptr;
+	int ret;
+	struct task_struct *task;
+	struct mm_struct *mm;
+
+	input = kstrdup(buf, GFP_KERNEL);
+	if (!input)
+		return -ENOMEM;
+
+	ptr = strim(input);
+	ret = kstrtoint(ptr, 10, &pid);
+	kfree(input);
+
+	/* Find the mm_struct */
+	rcu_read_lock();
+	task = find_task_by_vpid(pid);
+	if (!task) {
+		rcu_read_unlock();
+		return -ESRCH;
+	}
+
+	get_task_struct(task);
+
+	rcu_read_unlock();
+	mm = get_task_mm(task);
+	put_task_struct(task);
+
+	if (!mm)
+		return -EINVAL;
+
+	mutex_lock(&ksm_thread_mutex);
+	wait_while_offlining();
+	ksm_sync_unmerge(mm);
+	mutex_unlock(&ksm_thread_mutex);
+
+	mmput(mm);
+	return count;
+}
+KSM_ATTR(trigger_unmerge);
+
 #ifdef CONFIG_NUMA
 static ssize_t merge_across_nodes_show(struct kobject *kobj,
 				       struct kobj_attribute *attr, char *buf)
@@ -4044,6 +4121,7 @@ static struct attribute *ksm_attrs[] = {
 	&pages_to_scan_attr.attr,
 	&run_attr.attr,
 	&trigger_merge_attr.attr,
+	&trigger_unmerge_attr.attr,
 	&pages_scanned_attr.attr,
 	&pages_shared_attr.attr,
 	&pages_sharing_attr.attr,
@@ -4156,9 +4234,51 @@ static ssize_t add_partition_store(struct kobject *kobj,
 static struct kobj_attribute add_kobj_attr = __ATTR(add_partition, 0220, NULL,
 						    add_partition_store);
 
+static ssize_t remove_partition_store(struct kobject *kobj,
+				      struct kobj_attribute *attr,
+				      const char *buf, size_t count)
+{
+	struct partition_kobj *partition;
+	struct partition_kobj *partition_found = NULL;
+	char partition_name[50];
+	int err = 0;
+
+	if (sscanf(buf, "%31s", partition_name) != 1)
+		return -EINVAL;
+
+	mutex_lock(&ksm_thread_mutex);
+
+	list_for_each_entry(partition, &partition_list, list) {
+		if (strcmp(kobject_name(partition->kobj), partition_name) == 0) {
+			partition_found = partition;
+			break;
+		}
+	}
+
+	if (!partition_found) {
+		err = -ENOENT;
+		goto unlock;
+	}
+
+	unmerge_and_remove_all_rmap_items();
+
+	kobject_put(partition_found->kobj);
+	list_del(&partition_found->list);
+	kfree(partition_found->root_stable_tree);
+	kfree(partition_found);
+
+unlock:
+	mutex_unlock(&ksm_thread_mutex);
+	return err ? err : count;
+}
+
+static struct kobj_attribute rm_kobj_attr = __ATTR(remove_partition, 0220, NULL,
+						   remove_partition_store);
+
 /* Array of attributes for base kobject */
 static struct attribute *ksm_base_attrs[] = {
 	&add_kobj_attr.attr,
+	&rm_kobj_attr.attr,
 	NULL,	/* NULL-terminated */
 };

From patchwork Fri Mar 21 17:37:29 2025
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Sourav Panda
X-Patchwork-Id: 14025839
Received: from smtpin25.hostedemail.com (a10.router.float.18 [10.200.18.1]) by unirelay01.hostedemail.com (Postfix) with ESMTP id
BC4E31C8699 for ; Fri, 21 Mar 2025 17:37:45 +0000 (UTC)
Date: Fri, 21 Mar 2025 17:37:29 +0000
In-Reply-To: <20250321173729.3175898-1-souravpanda@google.com>
References: <20250321173729.3175898-1-souravpanda@google.com>
X-Mailer: git-send-email 2.49.0.395.g12beb8f557-goog
Message-ID: <20250321173729.3175898-7-souravpanda@google.com>
Subject: [RFC PATCH 6/6] mm: syscall alternative for SELECTIVE_KSM
From: Sourav Panda
To: mathieu.desnoyers@efficios.com, willy@infradead.org, david@redhat.com,
 pasha.tatashin@soleen.com, rientjes@google.com, akpm@linux-foundation.org,
 linux-mm@kvack.org, linux-kernel@vger.kernel.org, weixugc@google.com,
 gthelen@google.com, souravpanda@google.com, surenb@google.com

A partition can be created or opened using:

	int ksm_fd = ksm_open(ksm_name, flags);

ksm_name specifies the ksm partition to be created or opened.

flags:

	O_CREAT
		Create the ksm partition object if it does not exist.
	O_EXCL
		If O_CREAT was also specified, and a ksm partition object
		with the given name already exists, return an error.

Trigger the merge using:

	ksm_merge(ksm_fd, pid, start_addr, size);

Limitation: only the x86 syscall_64 table is wired up.
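Since no libc wrappers exist for these calls, a caller would have to go
through syscall(2). The sketch below shows one way this might look. It
is illustrative only: the RFC_NR_* macros and the helper names
ksm_range_valid and ksm_merge_into are hypothetical, and the numbers 467
and 468 are simply copied from the syscall_64.tbl change in this RFC. On
a kernel without this series those numbers may be unassigned or belong
to unrelated syscalls, so the wrappers should only be invoked on a
kernel built with the patch.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <limits.h>
#include <stddef.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <unistd.h>

/* ASSUMPTION: numbers from this RFC's syscall_64.tbl; not a stable ABI. */
#define RFC_NR_ksm_open  467
#define RFC_NR_ksm_merge 468

/* Thin wrappers; each returns -1 with errno set on failure. */
long ksm_open(const char *name, int flags)
{
	return syscall(RFC_NR_ksm_open, name, flags);
}

long ksm_merge(int ksm_fd, pid_t pid, unsigned long start, size_t size)
{
	return syscall(RFC_NR_ksm_merge, ksm_fd, pid, start, size);
}

/*
 * Mirror of the kernel-side "start >= start + size" check: empty ranges
 * and ranges that wrap around the address space are rejected.
 */
int ksm_range_valid(unsigned long start, size_t size)
{
	unsigned long end = start + size;

	return start < end;
}

/*
 * Open (creating if needed) the named partition and merge one range of
 * pid into it. Returns the partition fd, or -1 with errno set.
 *
 * The fd is handed back instead of being closed: in the posted patch,
 * releasing the file removes the partition from the list and frees it.
 */
int ksm_merge_into(const char *name, pid_t pid, unsigned long start,
		   size_t size)
{
	long fd;

	if (!ksm_range_valid(start, size)) {
		errno = EINVAL;
		return -1;
	}

	fd = ksm_open(name, O_CREAT);
	if (fd < 0)
		return -1;

	if (ksm_merge((int)fd, pid, start, size) < 0) {
		int saved = errno;

		close((int)fd);
		errno = saved;
		return -1;
	}
	return (int)fd;
}
```

Checking the range in userspace first avoids a round trip for requests
the kernel would reject with -EINVAL anyway.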
Signed-off-by: Sourav Panda
---
 arch/x86/entry/syscalls/syscall_64.tbl |   3 +-
 include/linux/ksm.h                    |   4 +
 mm/ksm.c                               | 156 ++++++++++++++++++++++++-
 3 files changed, 161 insertions(+), 2 deletions(-)

diff --git a/arch/x86/entry/syscalls/syscall_64.tbl b/arch/x86/entry/syscalls/syscall_64.tbl
index 5eb708bff1c7..352d747dbe33 100644
--- a/arch/x86/entry/syscalls/syscall_64.tbl
+++ b/arch/x86/entry/syscalls/syscall_64.tbl
@@ -390,7 +390,8 @@
 464	common	getxattrat		sys_getxattrat
 465	common	listxattrat		sys_listxattrat
 466	common	removexattrat		sys_removexattrat
-
+467	common	ksm_open		sys_ksm_open
+468	common	ksm_merge		sys_ksm_merge
 #
 # Due to a historical design error, certain syscalls are numbered differently
 # in x32 as compared to native x86_64.  These syscalls have numbers 512-547.
diff --git a/include/linux/ksm.h b/include/linux/ksm.h
index d73095b5cd96..a94c89403c29 100644
--- a/include/linux/ksm.h
+++ b/include/linux/ksm.h
@@ -14,6 +14,10 @@
 #include
 #include
 
+#include
+#include
+#define MAX_KSM_NAME_LEN 128
+
 #ifdef CONFIG_KSM
 int ksm_madvise(struct vm_area_struct *vma, unsigned long start,
 		unsigned long end, int advice, unsigned long *vm_flags);
diff --git a/mm/ksm.c b/mm/ksm.c
index fd7626d5d8c9..71558120b034 100644
--- a/mm/ksm.c
+++ b/mm/ksm.c
@@ -147,7 +147,8 @@ struct ksm_scan {
 static struct kobject *ksm_base_kobj;
 
 struct partition_kobj {
-	struct kobject *kobj;
+	struct kobject *kobj;	/* Not required for the syscall interface */
+	char name[MAX_KSM_NAME_LEN];
 	struct list_head list;
 	struct rb_root *root_stable_tree;
 	struct rb_root *root_unstable_tree;
@@ -166,6 +167,106 @@ static struct partition_kobj *find_partition_by_kobj(struct kobject *kobj)
 	return NULL;
 }
 
+static struct partition_kobj *find_ksm_partition(char *partition_name)
+{
+	struct partition_kobj *partition;
+
+	list_for_each_entry(partition, &partition_list, list) {
+		if (strcmp(partition->name, partition_name) == 0)
+			return partition;
+	}
+	return NULL;
+}
+
+static DEFINE_MUTEX(ksm_partition_lock);
+
+static int ksm_release(struct inode *inode, struct file *file)
+{
+	struct partition_kobj *ksm = file->private_data;
+
+	mutex_lock(&ksm_partition_lock);
+	list_del(&ksm->list);
+	mutex_unlock(&ksm_partition_lock);
+
+	kfree(ksm);
+	return 0;
+}
+
+static const struct file_operations ksm_fops = {
+	.release = ksm_release,
+};
+
+static struct partition_kobj *ksm_create_partition(char *ksm_name)
+{
+	struct partition_kobj *partition;
+	struct rb_root *tree_root;
+
+	partition = kzalloc(sizeof(*partition), GFP_KERNEL);
+	if (!partition)
+		return NULL;
+
+	tree_root = kcalloc(nr_node_ids + nr_node_ids, sizeof(*tree_root),
+			    GFP_KERNEL);
+	if (!tree_root)
+		return NULL;
+
+	partition->root_stable_tree = tree_root;
+	partition->root_unstable_tree = tree_root + nr_node_ids;
+	strncpy(partition->name, ksm_name, sizeof(partition->name));
+
+	list_add(&partition->list, &partition_list);
+
+	return partition;
+}
+
+static int ksm_partition_fd(struct partition_kobj *partition)
+{
+	int fd;
+	struct file *file;
+	int ret;
+
+	file = anon_inode_getfile("ksm_partition", &ksm_fops, partition, O_RDWR);
+	if (IS_ERR(file)) {
+		ret = PTR_ERR(file);
+		return ret;
+	}
+
+	fd = get_unused_fd_flags(O_RDWR);
+	if (fd < 0) {
+		fput(file);
+		return fd;
+	}
+	fd_install(fd, file);
+	return fd;
+}
+
+SYSCALL_DEFINE2(ksm_open, const char __user *, ksm_name, int, flags) {
+	char name[MAX_KSM_NAME_LEN];
+	struct partition_kobj *partition;
+	int ret;
+
+	ret = strncpy_from_user(name, ksm_name, sizeof(name));
+	if (ret < 0)
+		return -EFAULT;
+
+	partition = find_ksm_partition(name);
+
+	if (flags & O_EXCL && partition) /* Partition already exists, return error */
+		return -EEXIST;
+
+	if (flags & O_CREAT && !partition) {
+		/* Partition does not exist, but we are allowed to create one */
+		mutex_lock(&ksm_partition_lock);
+		partition = ksm_create_partition(name);
+		mutex_unlock(&ksm_partition_lock);
+	}
+
+	if (!partition)
+		return flags & O_CREAT ? -ENOMEM : -ENOENT;
+
+	return ksm_partition_fd(partition);
+}
+
 /**
  * struct ksm_stable_node - node of the stable rbtree
  * @node: rb node of this ksm page in the stable tree
@@ -4324,6 +4425,59 @@ static int __init ksm_thread_sysfs_init(void)
 }
 #endif /* CONFIG_SELECTIVE_KSM */
 
+SYSCALL_DEFINE4(ksm_merge, int, ksm_fd, pid_t, pid, unsigned long, start, size_t, size) {
+	unsigned long end = start + size;
+	struct task_struct *task;
+	struct mm_struct *mm;
+	struct partition_kobj *partition;
+	struct file *file;
+
+	file = fget(ksm_fd);
+	if (!file)
+		return -EBADF;
+
+	partition = file->private_data;
+	if (!partition) {
+		fput(file);
+		return -EINVAL;
+	}
+
+	if (start >= end) {
+		fput(file);
+		return -EINVAL;
+	}
+
+	/* Find the mm_struct */
+	rcu_read_lock();
+	task = find_task_by_vpid(pid);
+	if (!task) {
+		fput(file);
+		rcu_read_unlock();
+		return -ESRCH;
+	}
+
+	get_task_struct(task);
+
+	rcu_read_unlock();
+	mm = get_task_mm(task);
+	put_task_struct(task);
+
+	if (!mm) {
+		fput(file);
+		return -EINVAL;
+	}
+
+	mutex_lock(&ksm_thread_mutex);
+	wait_while_offlining();
+	ksm_sync_merge(mm, start, end, partition);
+	mutex_unlock(&ksm_thread_mutex);
+
+	mmput(mm);
+
+	fput(file);
+	return 0;
+}
+
 static int __init ksm_init(void)
 {
 	int err;
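
The flag handling in ksm_open() above can be summarised as a small
decision function. The following is a userspace model of the checks in
the order they are posted, not kernel code: the enum and the function
name ksm_open_outcome are hypothetical, and the allocation-failure path
(-ENOMEM) is left out. One thing the model makes visible is that, as
written, O_EXCL on an existing partition yields -EEXIST even without
O_CREAT, which is slightly broader than the commit message describes.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stdbool.h>

enum ksm_open_result {
	KSM_OPENED  = 0,	/* existing partition is reused */
	KSM_CREATED = 1,	/* a new partition would be allocated */
};

/*
 * Returns KSM_OPENED, KSM_CREATED, or a negative errno, following the
 * order of the checks in the posted ksm_open().
 */
int ksm_open_outcome(int flags, bool exists)
{
	if ((flags & O_EXCL) && exists)
		return -EEXIST;
	if ((flags & O_CREAT) && !exists)
		return KSM_CREATED;
	if (!exists)
		return -ENOENT;
	return KSM_OPENED;
}
```

A table-style model like this may be a convenient basis for a selftest
once the flag semantics are settled.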