From patchwork Fri Aug 30 10:03:35 2024
X-Patchwork-Submitter: Usama Arif <usamaarif642@gmail.com>
X-Patchwork-Id: 13784867
From: Usama Arif <usamaarif642@gmail.com>
To: akpm@linux-foundation.org, linux-mm@kvack.org
Cc: hannes@cmpxchg.org, riel@surriel.com, shakeel.butt@linux.dev,
    roman.gushchin@linux.dev, yuzhao@google.com, david@redhat.com,
    npache@redhat.com, baohua@kernel.org, ryan.roberts@arm.com,
    rppt@kernel.org, willy@infradead.org, cerasuolodomenico@gmail.com,
    ryncsn@gmail.com, corbet@lwn.net, linux-kernel@vger.kernel.org,
    linux-doc@vger.kernel.org, kernel-team@meta.com,
    Shuang Zhai, Usama Arif
Subject: [PATCH v5 1/6] mm: free zapped tail pages when splitting isolated thp
Date: Fri, 30 Aug 2024 11:03:35 +0100
Message-ID: <20240830100438.3623486-2-usamaarif642@gmail.com>
In-Reply-To: <20240830100438.3623486-1-usamaarif642@gmail.com>
References: <20240830100438.3623486-1-usamaarif642@gmail.com>
From: Yu Zhao <yuzhao@google.com>

If a tail page has only two references left, one inherited from the
isolation of its head and the other from lru_add_page_tail() which we
are about to drop, it means this tail page was concurrently zapped.
Then we can safely free it and save page reclaim or migration the
trouble of trying it.

Signed-off-by: Yu Zhao
Tested-by: Shuang Zhai
Acked-by: Johannes Weiner
Signed-off-by: Usama Arif
---
 mm/huge_memory.c | 28 ++++++++++++++++++++++++++++
 1 file changed, 28 insertions(+)

diff --git a/mm/huge_memory.c b/mm/huge_memory.c
index 15418ffdd377..0c48806ccb9a 100644
--- a/mm/huge_memory.c
+++ b/mm/huge_memory.c
@@ -3170,7 +3170,9 @@ static void __split_huge_page(struct page *page, struct list_head *list,
     unsigned int new_nr = 1 << new_order;
     int order = folio_order(folio);
     unsigned int nr = 1 << order;
+    struct folio_batch free_folios;

+    folio_batch_init(&free_folios);
     /* complete memcg works before add pages to LRU */
     split_page_memcg(head, order, new_order);

@@ -3254,6 +3256,27 @@ static void __split_huge_page(struct page *page, struct list_head *list,
         if (subpage == page)
             continue;
         folio_unlock(new_folio);
+        /*
+         * If a folio has only two references left, one inherited
+         * from the isolation of its head and the other from
+         * lru_add_page_tail() which we are about to drop, it means this
+         * folio was concurrently zapped. Then we can safely free it
+         * and save page reclaim or migration the trouble of trying it.
+         */
+        if (list && folio_ref_freeze(new_folio, 2)) {
+            VM_WARN_ON_ONCE_FOLIO(folio_test_lru(new_folio), new_folio);
+            VM_WARN_ON_ONCE_FOLIO(folio_test_large(new_folio), new_folio);
+            VM_WARN_ON_ONCE_FOLIO(folio_mapped(new_folio), new_folio);
+
+            folio_clear_active(new_folio);
+            folio_clear_unevictable(new_folio);
+            list_del(&new_folio->lru);
+            if (!folio_batch_add(&free_folios, new_folio)) {
+                mem_cgroup_uncharge_folios(&free_folios);
+                free_unref_folios(&free_folios);
+            }
+            continue;
+        }

         /*
          * Subpages may be freed if there wasn't any mapping
@@ -3264,6 +3287,11 @@ static void __split_huge_page(struct page *page, struct list_head *list,
          */
         free_page_and_swap_cache(subpage);
     }
+
+    if (free_folios.nr) {
+        mem_cgroup_uncharge_folios(&free_folios);
+        free_unref_folios(&free_folios);
+    }
 }

 /* Racy check whether the huge page can be split */

From patchwork Fri Aug 30 10:03:36 2024
X-Patchwork-Submitter: Usama Arif <usamaarif642@gmail.com>
X-Patchwork-Id: 13784868
From: Usama Arif <usamaarif642@gmail.com>
Subject: [PATCH v5 2/6] mm: remap unused subpages to shared zeropage when splitting isolated thp
Date: Fri, 30 Aug 2024 11:03:36 +0100
Message-ID: <20240830100438.3623486-3-usamaarif642@gmail.com>

From: Yu Zhao <yuzhao@google.com>

Here, "unused" means containing only zeros and being inaccessible to
userspace. When splitting an isolated thp under reclaim or migration,
the unused subpages can be mapped to the shared zeropage, hence saving
memory. This is particularly helpful when the internal fragmentation
of a thp is high, i.e. it has many untouched subpages.

This is also a prerequisite for the THP low utilization shrinker which
will be introduced in later patches, where underutilized THPs are
split, and the zero-filled pages are freed, saving memory.
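For intuition, the zero-detection at the heart of this change is a
plain byte scan: the kernel side uses memchr_inv(addr, 0, PAGE_SIZE)
on each kmapped subpage, as the diff below shows. A minimal userspace
C sketch of the same "contains only zeros" test (illustrative only,
not part of the patch; the function name is made up):

#include <stddef.h>

/* Return 1 if any byte of the page is nonzero (keep the subpage),
 * 0 if it is all zeros (candidate for the shared zeropage). This
 * mirrors what memchr_inv(addr, 0, PAGE_SIZE) reports in the kernel. */
static int page_contains_data(const unsigned char *page, size_t page_size)
{
    size_t i;

    for (i = 0; i < page_size; i++)
        if (page[i])
            return 1;
    return 0;
}

The kernel-side equivalent is visible in try_to_map_unused_to_zeropage()
below.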
Signed-off-by: Yu Zhao
Tested-by: Shuang Zhai
Signed-off-by: Usama Arif
---
 include/linux/rmap.h |  7 ++++-
 mm/huge_memory.c     |  8 ++---
 mm/migrate.c         | 72 ++++++++++++++++++++++++++++++++++++++------
 mm/migrate_device.c  |  4 +--
 4 files changed, 75 insertions(+), 16 deletions(-)

diff --git a/include/linux/rmap.h b/include/linux/rmap.h
index 91b5935e8485..d5e93e44322e 100644
--- a/include/linux/rmap.h
+++ b/include/linux/rmap.h
@@ -745,7 +745,12 @@ int folio_mkclean(struct folio *);
 int pfn_mkclean_range(unsigned long pfn, unsigned long nr_pages, pgoff_t pgoff,
               struct vm_area_struct *vma);

-void remove_migration_ptes(struct folio *src, struct folio *dst, bool locked);
+enum rmp_flags {
+    RMP_LOCKED              = 1 << 0,
+    RMP_USE_SHARED_ZEROPAGE = 1 << 1,
+};
+
+void remove_migration_ptes(struct folio *src, struct folio *dst, int flags);

 /*
  * rmap_walk_control: To control rmap traversing for specific needs

diff --git a/mm/huge_memory.c b/mm/huge_memory.c
index 0c48806ccb9a..af60684e7c70 100644
--- a/mm/huge_memory.c
+++ b/mm/huge_memory.c
@@ -3020,7 +3020,7 @@ bool unmap_huge_pmd_locked(struct vm_area_struct *vma, unsigned long addr,
     return false;
 }

-static void remap_page(struct folio *folio, unsigned long nr)
+static void remap_page(struct folio *folio, unsigned long nr, int flags)
 {
     int i = 0;

@@ -3028,7 +3028,7 @@ static void remap_page(struct folio *folio, unsigned long nr)
     if (!folio_test_anon(folio))
         return;
     for (;;) {
-        remove_migration_ptes(folio, folio, true);
+        remove_migration_ptes(folio, folio, RMP_LOCKED | flags);
         i += folio_nr_pages(folio);
         if (i >= nr)
             break;
@@ -3240,7 +3240,7 @@ static void __split_huge_page(struct page *page, struct list_head *list,

     if (nr_dropped)
         shmem_uncharge(folio->mapping->host, nr_dropped);
-    remap_page(folio, nr);
+    remap_page(folio, nr, PageAnon(head) ? RMP_USE_SHARED_ZEROPAGE : 0);

     /*
      * set page to its compound_head when split to non order-0 pages, so
@@ -3542,7 +3542,7 @@ int split_huge_page_to_list_to_order(struct page *page, struct list_head *list,
         if (mapping)
             xas_unlock(&xas);
         local_irq_enable();
-        remap_page(folio, folio_nr_pages(folio));
+        remap_page(folio, folio_nr_pages(folio), 0);
         ret = -EAGAIN;
     }

diff --git a/mm/migrate.c b/mm/migrate.c
index 6f9c62c746be..d039863e014b 100644
--- a/mm/migrate.c
+++ b/mm/migrate.c
@@ -204,13 +204,57 @@ bool isolate_folio_to_list(struct folio *folio, struct list_head *list)
     return true;
 }

+static bool try_to_map_unused_to_zeropage(struct page_vma_mapped_walk *pvmw,
+                                          struct folio *folio,
+                                          unsigned long idx)
+{
+    struct page *page = folio_page(folio, idx);
+    bool contains_data;
+    pte_t newpte;
+    void *addr;
+
+    VM_BUG_ON_PAGE(PageCompound(page), page);
+    VM_BUG_ON_PAGE(!PageAnon(page), page);
+    VM_BUG_ON_PAGE(!PageLocked(page), page);
+    VM_BUG_ON_PAGE(pte_present(*pvmw->pte), page);
+
+    if (folio_test_mlocked(folio) || (pvmw->vma->vm_flags & VM_LOCKED) ||
+        mm_forbids_zeropage(pvmw->vma->vm_mm))
+        return false;
+
+    /*
+     * The pmd entry mapping the old thp was flushed and the pte mapping
+     * this subpage has been non present. If the subpage is only zero-filled
+     * then map it to the shared zeropage.
+     */
+    addr = kmap_local_page(page);
+    contains_data = memchr_inv(addr, 0, PAGE_SIZE);
+    kunmap_local(addr);
+
+    if (contains_data)
+        return false;
+
+    newpte = pte_mkspecial(pfn_pte(my_zero_pfn(pvmw->address),
+                                   pvmw->vma->vm_page_prot));
+    set_pte_at(pvmw->vma->vm_mm, pvmw->address, pvmw->pte, newpte);
+
+    dec_mm_counter(pvmw->vma->vm_mm, mm_counter(folio));
+    return true;
+}
+
+struct rmap_walk_arg {
+    struct folio *folio;
+    bool map_unused_to_zeropage;
+};
+
 /*
  * Restore a potential migration pte to a working pte entry
  */
 static bool remove_migration_pte(struct folio *folio,
-        struct vm_area_struct *vma, unsigned long addr, void *old)
+        struct vm_area_struct *vma, unsigned long addr, void *arg)
 {
-    DEFINE_FOLIO_VMA_WALK(pvmw, old, vma, addr, PVMW_SYNC | PVMW_MIGRATION);
+    struct rmap_walk_arg *rmap_walk_arg = arg;
+    DEFINE_FOLIO_VMA_WALK(pvmw, rmap_walk_arg->folio, vma, addr, PVMW_SYNC | PVMW_MIGRATION);

     while (page_vma_mapped_walk(&pvmw)) {
         rmap_t rmap_flags = RMAP_NONE;
@@ -234,6 +278,9 @@ static bool remove_migration_pte(struct folio *folio,
             continue;
         }
 #endif
+        if (rmap_walk_arg->map_unused_to_zeropage &&
+            try_to_map_unused_to_zeropage(&pvmw, folio, idx))
+            continue;

         folio_get(folio);
         pte = mk_pte(new, READ_ONCE(vma->vm_page_prot));
@@ -312,14 +359,21 @@ static bool remove_migration_pte(struct folio *folio,
  * Get rid of all migration entries and replace them by
  * references to the indicated page.
  */
-void remove_migration_ptes(struct folio *src, struct folio *dst, bool locked)
+void remove_migration_ptes(struct folio *src, struct folio *dst, int flags)
 {
+    struct rmap_walk_arg rmap_walk_arg = {
+        .folio = src,
+        .map_unused_to_zeropage = flags & RMP_USE_SHARED_ZEROPAGE,
+    };
+
     struct rmap_walk_control rwc = {
         .rmap_one = remove_migration_pte,
-        .arg = src,
+        .arg = &rmap_walk_arg,
     };

-    if (locked)
+    VM_BUG_ON_FOLIO((flags & RMP_USE_SHARED_ZEROPAGE) && (src != dst), src);
+
+    if (flags & RMP_LOCKED)
         rmap_walk_locked(dst, &rwc);
     else
         rmap_walk(dst, &rwc);
@@ -934,7 +988,7 @@ static int writeout(struct address_space *mapping, struct folio *folio)
      * At this point we know that the migration attempt cannot
      * be successful.
      */
-    remove_migration_ptes(folio, folio, false);
+    remove_migration_ptes(folio, folio, 0);

     rc = mapping->a_ops->writepage(&folio->page, &wbc);

@@ -1098,7 +1152,7 @@ static void migrate_folio_undo_src(struct folio *src,
                                    struct list_head *ret)
 {
     if (page_was_mapped)
-        remove_migration_ptes(src, src, false);
+        remove_migration_ptes(src, src, 0);
     /* Drop an anon_vma reference if we took one */
     if (anon_vma)
         put_anon_vma(anon_vma);
@@ -1336,7 +1390,7 @@ static int migrate_folio_move(free_folio_t put_new_folio, unsigned long private,
         lru_add_drain();

     if (old_page_state & PAGE_WAS_MAPPED)
-        remove_migration_ptes(src, dst, false);
+        remove_migration_ptes(src, dst, 0);

 out_unlock_both:
     folio_unlock(dst);
@@ -1474,7 +1528,7 @@ static int unmap_and_move_huge_page(new_folio_t get_new_folio,

     if (page_was_mapped)
         remove_migration_ptes(src,
-            rc == MIGRATEPAGE_SUCCESS ? dst : src, false);
+            rc == MIGRATEPAGE_SUCCESS ? dst : src, 0);

 unlock_put_anon:
     folio_unlock(dst);

diff --git a/mm/migrate_device.c b/mm/migrate_device.c
index 8d687de88a03..9cf26592ac93 100644
--- a/mm/migrate_device.c
+++ b/mm/migrate_device.c
@@ -424,7 +424,7 @@ static unsigned long migrate_device_unmap(unsigned long *src_pfns,
             continue;

         folio = page_folio(page);
-        remove_migration_ptes(folio, folio, false);
+        remove_migration_ptes(folio, folio, 0);

         src_pfns[i] = 0;
         folio_unlock(folio);
@@ -840,7 +840,7 @@ void migrate_device_finalize(unsigned long *src_pfns,
             dst = src;
         }

-        remove_migration_ptes(src, dst, false);
+        remove_migration_ptes(src, dst, 0);
         folio_unlock(src);

         if (folio_is_zone_device(src))
From patchwork Fri Aug 30 10:03:37 2024
X-Patchwork-Submitter: Usama Arif <usamaarif642@gmail.com>
X-Patchwork-Id: 13784869
From: Usama Arif <usamaarif642@gmail.com>
Subject: [PATCH v5 3/6] mm: selftest to verify zero-filled pages are mapped to zeropage
Date: Fri, 30 Aug 2024 11:03:37 +0100
Message-ID: <20240830100438.3623486-4-usamaarif642@gmail.com>

From: Alexander Zhu

When a THP is split, any subpage that is zero-filled will be mapped to
the shared zeropage, hence saving memory. Add a selftest to verify this
by allocating a zero-filled THP and comparing RssAnon before and after
the split.
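For illustration (not part of the patch): RssAnon comes from
/proc/self/status, and the read the test relies on can be sketched in
a few lines of userspace C, assuming the usual "RssAnon: <value> kB"
line format:

#include <stdio.h>

/* Return this process's anonymous RSS in kB as reported by
 * /proc/self/status, or 0 if the field cannot be read. */
static unsigned long read_rss_anon(void)
{
    char line[256];
    unsigned long kb = 0;
    FILE *fp = fopen("/proc/self/status", "r");

    if (!fp)
        return 0;
    while (fgets(line, sizeof(line), fp))
        if (sscanf(line, "RssAnon: %lu kB", &kb) == 1)
            break;
    fclose(fp);
    return kb;
}

The selftest below does essentially this (as rss_anon()), then checks
that the value drops after the THPs are split.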
Signed-off-by: Alexander Zhu
Acked-by: Rik van Riel
Signed-off-by: Usama Arif
---
 .../selftests/mm/split_huge_page_test.c | 71 +++++++++++++++++++
 tools/testing/selftests/mm/vm_util.c    | 22 ++++++
 tools/testing/selftests/mm/vm_util.h    |  1 +
 3 files changed, 94 insertions(+)

diff --git a/tools/testing/selftests/mm/split_huge_page_test.c b/tools/testing/selftests/mm/split_huge_page_test.c
index e5e8dafc9d94..eb6d1b9fc362 100644
--- a/tools/testing/selftests/mm/split_huge_page_test.c
+++ b/tools/testing/selftests/mm/split_huge_page_test.c
@@ -84,6 +84,76 @@ static void write_debugfs(const char *fmt, ...)
     write_file(SPLIT_DEBUGFS, input, ret + 1);
 }

+static char *allocate_zero_filled_hugepage(size_t len)
+{
+    char *result;
+    size_t i;
+
+    result = memalign(pmd_pagesize, len);
+    if (!result) {
+        printf("Fail to allocate memory\n");
+        exit(EXIT_FAILURE);
+    }
+
+    madvise(result, len, MADV_HUGEPAGE);
+
+    for (i = 0; i < len; i++)
+        result[i] = (char)0;
+
+    return result;
+}
+
+static void verify_rss_anon_split_huge_page_all_zeroes(char *one_page, int nr_hpages, size_t len)
+{
+    unsigned long rss_anon_before, rss_anon_after;
+    size_t i;
+
+    if (!check_huge_anon(one_page, 4, pmd_pagesize)) {
+        printf("No THP is allocated\n");
+        exit(EXIT_FAILURE);
+    }
+
+    rss_anon_before = rss_anon();
+    if (!rss_anon_before) {
+        printf("No RssAnon is allocated before split\n");
+        exit(EXIT_FAILURE);
+    }
+
+    /* split all THPs */
+    write_debugfs(PID_FMT, getpid(), (uint64_t)one_page,
+                  (uint64_t)one_page + len, 0);
+
+    for (i = 0; i < len; i++)
+        if (one_page[i] != (char)0) {
+            printf("%ld byte corrupted\n", i);
+            exit(EXIT_FAILURE);
+        }
+
+    if (!check_huge_anon(one_page, 0, pmd_pagesize)) {
+        printf("Still AnonHugePages not split\n");
+        exit(EXIT_FAILURE);
+    }
+
+    rss_anon_after = rss_anon();
+    if (rss_anon_after >= rss_anon_before) {
+        printf("Incorrect RssAnon value. Before: %ld After: %ld\n",
+               rss_anon_before, rss_anon_after);
+        exit(EXIT_FAILURE);
+    }
+}
+
+void split_pmd_zero_pages(void)
+{
+    char *one_page;
+    int nr_hpages = 4;
+    size_t len = nr_hpages * pmd_pagesize;
+
+    one_page = allocate_zero_filled_hugepage(len);
+    verify_rss_anon_split_huge_page_all_zeroes(one_page, nr_hpages, len);
+    printf("Split zero filled huge pages successful\n");
+    free(one_page);
+}
+
 void split_pmd_thp(void)
 {
     char *one_page;
@@ -431,6 +501,7 @@ int main(int argc, char **argv)

     fd_size = 2 * pmd_pagesize;

+    split_pmd_zero_pages();
     split_pmd_thp();
     split_pte_mapped_thp();
     split_file_backed_thp();

diff --git a/tools/testing/selftests/mm/vm_util.c b/tools/testing/selftests/mm/vm_util.c
index 5a62530da3b5..d8d0cf04bb57 100644
--- a/tools/testing/selftests/mm/vm_util.c
+++ b/tools/testing/selftests/mm/vm_util.c
@@ -12,6 +12,7 @@

 #define PMD_SIZE_FILE_PATH "/sys/kernel/mm/transparent_hugepage/hpage_pmd_size"
 #define SMAP_FILE_PATH "/proc/self/smaps"
+#define STATUS_FILE_PATH "/proc/self/status"
 #define MAX_LINE_LENGTH 500

 unsigned int __page_size;
@@ -171,6 +172,27 @@ uint64_t read_pmd_pagesize(void)
     return strtoul(buf, NULL, 10);
 }

+unsigned long rss_anon(void)
+{
+    unsigned long rss_anon = 0;
+    FILE *fp;
+    char buffer[MAX_LINE_LENGTH];
+
+    fp = fopen(STATUS_FILE_PATH, "r");
+    if (!fp)
+        ksft_exit_fail_msg("%s: Failed to open file %s\n",
+                           __func__, STATUS_FILE_PATH);
+
+    if (!check_for_pattern(fp, "RssAnon:", buffer, sizeof(buffer)))
+        goto err_out;
+
+    if (sscanf(buffer, "RssAnon:%10lu kB", &rss_anon) != 1)
+        ksft_exit_fail_msg("Reading status error\n");
+
+err_out:
+    fclose(fp);
+    return rss_anon;
+}
+
 bool __check_huge(void *addr, char *pattern, int nr_hpages,
                   uint64_t hpage_size)
 {

diff --git a/tools/testing/selftests/mm/vm_util.h b/tools/testing/selftests/mm/vm_util.h
index 9007c420d52c..2eaed8209925 100644
--- a/tools/testing/selftests/mm/vm_util.h
+++ b/tools/testing/selftests/mm/vm_util.h
@@ -39,6 +39,7 @@ unsigned long pagemap_get_pfn(int fd, char *start);
 void clear_softdirty(void);
 bool check_for_pattern(FILE *fp, const char *pattern, char *buf, size_t len);
 uint64_t read_pmd_pagesize(void);
+unsigned long rss_anon(void);
 bool check_huge_anon(void *addr, int nr_hpages, uint64_t hpage_size);
 bool check_huge_file(void *addr, int nr_hpages, uint64_t hpage_size);
 bool check_huge_shmem(void *addr, int nr_hpages, uint64_t hpage_size);
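A note on running it (an assumption about the usual kselftest flow,
not something this patch changes): split_huge_page_test is part of the
mm kselftests, so after building a kernel with
CONFIG_TRANSPARENT_HUGEPAGE and mounting debugfs, an invocation along
the lines of "make -C tools/testing/selftests TARGETS=mm run_tests"
should exercise the new split_pmd_zero_pages case along with the
existing ones.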
From patchwork Fri Aug 30 10:03:38 2024
X-Patchwork-Submitter: Usama Arif <usamaarif642@gmail.com>
X-Patchwork-Id: 13784870
From: Usama Arif <usamaarif642@gmail.com>
Subject: [PATCH v5 4/6] mm: Introduce a pageflag for partially mapped folios
Date: Fri, 30 Aug 2024 11:03:38 +0100
Message-ID: <20240830100438.3623486-5-usamaarif642@gmail.com>

Currently folio->_deferred_list is used to keep track of
partially_mapped folios that are going to be split under memory
pressure. In the next patch, all THPs that are faulted in and
collapsed by khugepaged are also going to be tracked using
_deferred_list.

This patch introduces a pageflag to be able to distinguish between
partially mapped folios and others in the deferred_list at split time
in deferred_split_scan.
It's needed as __folio_remove_rmap decrements _mapcount,
_large_mapcount and _entire_mapcount, hence it won't be possible to
distinguish between partially mapped folios and others in
deferred_split_scan.

Even though it introduces an extra flag to track if the folio is
partially mapped, there is no functional change intended with this
patch and the flag is not useful in this patch itself; it will become
useful in the next patch when _deferred_list has non-partially-mapped
folios.

Signed-off-by: Usama Arif
---
 include/linux/huge_mm.h    |  4 ++--
 include/linux/page-flags.h | 13 +++++++++++-
 mm/huge_memory.c           | 41 ++++++++++++++++++++++++++++----------
 mm/memcontrol.c            |  3 ++-
 mm/migrate.c               |  3 ++-
 mm/page_alloc.c            |  5 +++--
 mm/rmap.c                  |  5 +++--
 mm/vmscan.c                |  3 ++-
 8 files changed, 56 insertions(+), 21 deletions(-)

diff --git a/include/linux/huge_mm.h b/include/linux/huge_mm.h
index 4da102b74a8c..0b0539f4ee1a 100644
--- a/include/linux/huge_mm.h
+++ b/include/linux/huge_mm.h
@@ -333,7 +333,7 @@ static inline int split_huge_page(struct page *page)
 {
     return split_huge_page_to_list_to_order(page, NULL, 0);
 }
-void deferred_split_folio(struct folio *folio);
+void deferred_split_folio(struct folio *folio, bool partially_mapped);

 void __split_huge_pmd(struct vm_area_struct *vma, pmd_t *pmd,
         unsigned long address, bool freeze, struct folio *folio);
@@ -502,7 +502,7 @@ static inline int split_huge_page(struct page *page)
 {
     return 0;
 }
-static inline void deferred_split_folio(struct folio *folio) {}
+static inline void deferred_split_folio(struct folio *folio, bool partially_mapped) {}
 #define split_huge_pmd(__vma, __pmd, __address) \
     do { } while (0)

diff --git a/include/linux/page-flags.h b/include/linux/page-flags.h
index 2175ebceb41c..1b3a76710487 100644
--- a/include/linux/page-flags.h
+++ b/include/linux/page-flags.h
@@ -186,6 +186,7 @@ enum pageflags {
     /* At least one page in this folio has the hwpoison flag set */
     PG_has_hwpoisoned = PG_active,
     PG_large_rmappable = PG_workingset, /* anon or file-backed */
+    PG_partially_mapped = PG_reclaim, /* was identified to be partially mapped */
 };

 #define PAGEFLAGS_MASK ((1UL << NR_PAGEFLAGS) - 1)
@@ -859,8 +860,18 @@ static inline void ClearPageCompound(struct page *page)
     ClearPageHead(page);
 }
 FOLIO_FLAG(large_rmappable, FOLIO_SECOND_PAGE)
+FOLIO_TEST_FLAG(partially_mapped, FOLIO_SECOND_PAGE)
+/*
+ * PG_partially_mapped is protected by deferred_split split_queue_lock,
+ * so its safe to use non-atomic set/clear.
+ */
+__FOLIO_SET_FLAG(partially_mapped, FOLIO_SECOND_PAGE)
+__FOLIO_CLEAR_FLAG(partially_mapped, FOLIO_SECOND_PAGE)
 #else
 FOLIO_FLAG_FALSE(large_rmappable)
+FOLIO_TEST_FLAG_FALSE(partially_mapped)
+__FOLIO_SET_FLAG_NOOP(partially_mapped)
+__FOLIO_CLEAR_FLAG_NOOP(partially_mapped)
 #endif

 #define PG_head_mask ((1UL << PG_head))
@@ -1171,7 +1182,7 @@ static __always_inline void __ClearPageAnonExclusive(struct page *page)
  */
 #define PAGE_FLAGS_SECOND \
     (0xffUL /* order */ | 1UL << PG_has_hwpoisoned | \
-     1UL << PG_large_rmappable)
+     1UL << PG_large_rmappable | 1UL << PG_partially_mapped)

 #define PAGE_FLAGS_PRIVATE \
     (1UL << PG_private | 1UL << PG_private_2)

diff --git a/mm/huge_memory.c b/mm/huge_memory.c
index af60684e7c70..166f8810f3c6 100644
--- a/mm/huge_memory.c
+++ b/mm/huge_memory.c
@@ -3503,7 +3503,11 @@ int split_huge_page_to_list_to_order(struct page *page, struct list_head *list,
     if (folio_order(folio) > 1 &&
         !list_empty(&folio->_deferred_list)) {
         ds_queue->split_queue_len--;
-        mod_mthp_stat(folio_order(folio), MTHP_STAT_NR_ANON_PARTIALLY_MAPPED, -1);
+        if (folio_test_partially_mapped(folio)) {
+            __folio_clear_partially_mapped(folio);
+            mod_mthp_stat(folio_order(folio),
+                          MTHP_STAT_NR_ANON_PARTIALLY_MAPPED, -1);
+        }
         /*
          * Reinitialize page_deferred_list after removing the
          * page from the split_queue, otherwise a subsequent
@@ -3570,13 +3574,18 @@ void __folio_undo_large_rmappable(struct folio *folio)
     spin_lock_irqsave(&ds_queue->split_queue_lock, flags);
     if (!list_empty(&folio->_deferred_list)) {
         ds_queue->split_queue_len--;
-        mod_mthp_stat(folio_order(folio), MTHP_STAT_NR_ANON_PARTIALLY_MAPPED, -1);
+        if (folio_test_partially_mapped(folio)) {
+            __folio_clear_partially_mapped(folio);
+            mod_mthp_stat(folio_order(folio),
+                          MTHP_STAT_NR_ANON_PARTIALLY_MAPPED, -1);
+        }
         list_del_init(&folio->_deferred_list);
     }
     spin_unlock_irqrestore(&ds_queue->split_queue_lock, flags);
 }

-void deferred_split_folio(struct folio *folio)
+/* partially_mapped=false won't clear PG_partially_mapped folio flag */
+void deferred_split_folio(struct folio *folio, bool partially_mapped)
 {
     struct deferred_split *ds_queue = get_deferred_split_queue(folio);
 #ifdef CONFIG_MEMCG
@@ -3604,15 +3613,21 @@ void deferred_split_folio(struct folio *folio)
     if (folio_test_swapcache(folio))
         return;

-    if (!list_empty(&folio->_deferred_list))
-        return;
-
     spin_lock_irqsave(&ds_queue->split_queue_lock, flags);
+    if (partially_mapped) {
+        if (!folio_test_partially_mapped(folio)) {
+            __folio_set_partially_mapped(folio);
+            if (folio_test_pmd_mappable(folio))
+                count_vm_event(THP_DEFERRED_SPLIT_PAGE);
+            count_mthp_stat(folio_order(folio), MTHP_STAT_SPLIT_DEFERRED);
+            mod_mthp_stat(folio_order(folio), MTHP_STAT_NR_ANON_PARTIALLY_MAPPED, 1);
+
+        }
+    } else {
+        /* partially mapped folios cannot become non-partially mapped */
+        VM_WARN_ON_FOLIO(folio_test_partially_mapped(folio), folio);
+    }
     if (list_empty(&folio->_deferred_list)) {
-        if (folio_test_pmd_mappable(folio))
-            count_vm_event(THP_DEFERRED_SPLIT_PAGE);
-        count_mthp_stat(folio_order(folio), MTHP_STAT_SPLIT_DEFERRED);
-        mod_mthp_stat(folio_order(folio), MTHP_STAT_NR_ANON_PARTIALLY_MAPPED, 1);
         list_add_tail(&folio->_deferred_list, &ds_queue->split_queue);
         ds_queue->split_queue_len++;
 #ifdef CONFIG_MEMCG
@@ -3660,7 +3675,11 @@ static unsigned long deferred_split_scan(struct shrinker *shrink,
             list_move(&folio->_deferred_list, &list);
         } else {
             /* We lost race with folio_put() */
+            if (folio_test_partially_mapped(folio)) {
+                __folio_clear_partially_mapped(folio);
+                mod_mthp_stat(folio_order(folio),
+                              MTHP_STAT_NR_ANON_PARTIALLY_MAPPED, -1);
+            }
             list_del_init(&folio->_deferred_list);
             ds_queue->split_queue_len--;
         }

diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index 087a8cb1a6d8..e66da58a365d 100644
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -4629,7 +4629,8 @@ static void uncharge_folio(struct folio *folio, struct uncharge_gather *ug)
     VM_BUG_ON_FOLIO(folio_test_lru(folio), folio);
     VM_BUG_ON_FOLIO(folio_order(folio) > 1 &&
                     !folio_test_hugetlb(folio) &&
-                    !list_empty(&folio->_deferred_list), folio);
+                    !list_empty(&folio->_deferred_list) &&
+                    folio_test_partially_mapped(folio), folio);

     /*
      * Nobody should be changing or seriously looking at

diff --git a/mm/migrate.c b/mm/migrate.c
index d039863e014b..35cc9d35064b 100644
--- a/mm/migrate.c
+++ b/mm/migrate.c
@@ -1766,7 +1766,8 @@ static int migrate_pages_batch(struct list_head *from,
          * use _deferred_list.
          */
         if (nr_pages > 2 &&
-            !list_empty(&folio->_deferred_list)) {
+            !list_empty(&folio->_deferred_list) &&
+            folio_test_partially_mapped(folio)) {
             if (!try_split_folio(folio, split_folios, mode)) {
                 nr_failed++;
                 stats->nr_thp_failed += is_thp;

diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index c2ffccf9d213..a82c221b7c2e 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -962,8 +962,9 @@ static int free_tail_page_prepare(struct page *head_page, struct page *page)
         break;
     case 2:
         /* the second tail page: deferred_list overlaps ->mapping */
-        if (unlikely(!list_empty(&folio->_deferred_list))) {
-            bad_page(page, "on deferred list");
+        if (unlikely(!list_empty(&folio->_deferred_list) &&
+            folio_test_partially_mapped(folio))) {
+            bad_page(page, "partially mapped folio on deferred list");
             goto out;
         }
         break;

diff --git a/mm/rmap.c b/mm/rmap.c
index 78529cf0fd66..a8797d1b3d49 100644
--- a/mm/rmap.c
+++ b/mm/rmap.c
@@ -1579,8 +1579,9 @@ static __always_inline void __folio_remove_rmap(struct folio *folio,
      * Check partially_mapped first to ensure it is a large folio.
      */
     if (partially_mapped && folio_test_anon(folio) &&
-        list_empty(&folio->_deferred_list))
-        deferred_split_folio(folio);
+        !folio_test_partially_mapped(folio))
+        deferred_split_folio(folio, true);
+
     __folio_mod_stat(folio, -nr, -nr_pmdmapped);

     /*

diff --git a/mm/vmscan.c b/mm/vmscan.c
index f27792e77a0f..4ca612f7e473 100644
--- a/mm/vmscan.c
+++ b/mm/vmscan.c
@@ -1238,7 +1238,8 @@ static unsigned int shrink_folio_list(struct list_head *folio_list,
              * Split partially mapped folios right away.
              * We can free the unmapped pages without IO.
*/ - if (data_race(!list_empty(&folio->_deferred_list)) && + if (data_race(!list_empty(&folio->_deferred_list) && + folio_test_partially_mapped(folio)) && split_folio_to_list(folio, folio_list)) goto activate_locked; } From patchwork Fri Aug 30 10:03:39 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Usama Arif X-Patchwork-Id: 13784871 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by smtp.lore.kernel.org (Postfix) with ESMTP id 2AFC7CA0EFC for ; Fri, 30 Aug 2024 10:04:57 +0000 (UTC) Received: by kanga.kvack.org (Postfix) id BE20F6B00FA; Fri, 30 Aug 2024 06:04:52 -0400 (EDT) Received: by kanga.kvack.org (Postfix, from userid 40) id B1DFE6B00FC; Fri, 30 Aug 2024 06:04:52 -0400 (EDT) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 8F78B6B00FB; Fri, 30 Aug 2024 06:04:52 -0400 (EDT) X-Delivered-To: linux-mm@kvack.org Received: from relay.hostedemail.com (smtprelay0013.hostedemail.com [216.40.44.13]) by kanga.kvack.org (Postfix) with ESMTP id 72C846B009B for ; Fri, 30 Aug 2024 06:04:52 -0400 (EDT) Received: from smtpin08.hostedemail.com (a10.router.float.18 [10.200.18.1]) by unirelay10.hostedemail.com (Postfix) with ESMTP id 3263CC03DF for ; Fri, 30 Aug 2024 10:04:52 +0000 (UTC) X-FDA: 82508478024.08.6804FCA Received: from mail-qk1-f175.google.com (mail-qk1-f175.google.com [209.85.222.175]) by imf21.hostedemail.com (Postfix) with ESMTP id 5B71F1C000C for ; Fri, 30 Aug 2024 10:04:50 +0000 (UTC) Authentication-Results: imf21.hostedemail.com; dkim=pass header.d=gmail.com header.s=20230601 header.b=bI2uJscy; spf=pass (imf21.hostedemail.com: domain of usamaarif642@gmail.com designates 209.85.222.175 as permitted sender) smtp.mailfrom=usamaarif642@gmail.com; dmarc=pass (policy=none) header.from=gmail.com ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=hostedemail.com; s=arc-20220608; t=1725012199; h=from:from:sender:reply-to:subject:subject:date:date: message-id:message-id:to:to:cc:cc:mime-version:mime-version: content-type:content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references:dkim-signature; bh=TVfBrp+oNzgiShEzXHur6LLPz7aGRa6cdgG73sD1ODc=; b=c00PdFHqBdju+s5qZRV8HhQhnGasDghWgzxdczKKxgQPPlnN8pRLqLL+SBf7jsO1qjgXOR OOUpJKOUA8Y1Sa7sYdA4cdOoys4vn9037Z23wTB68wA7CkwXCjYbkBF7mxoySzaXFOsqVY n28BnZYdV3yGURQ+ow6WGFuEuW9VcmQ= ARC-Seal: i=1; s=arc-20220608; d=hostedemail.com; t=1725012200; a=rsa-sha256; cv=none; b=auqGQqVxpZm8wJAr/eAO1arvRVnkIKEJOnQEkHmP08tnI+xRwvGA9eu02iU2OzVx4ru5Mx mqpK1vtbq43Os70C15QXuetwXT5Dq44fHx4gLn4ReTDdv+tWxP3sLlRqIIvUafqsMJ64HZ ZLUmH44XW1pIMtqU8Aw7qE18haD77t4= ARC-Authentication-Results: i=1; imf21.hostedemail.com; dkim=pass header.d=gmail.com header.s=20230601 header.b=bI2uJscy; spf=pass (imf21.hostedemail.com: domain of usamaarif642@gmail.com designates 209.85.222.175 as permitted sender) smtp.mailfrom=usamaarif642@gmail.com; dmarc=pass (policy=none) header.from=gmail.com Received: by mail-qk1-f175.google.com with SMTP id af79cd13be357-7a8134aefe8so26184585a.2 for ; Fri, 30 Aug 2024 03:04:50 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20230601; t=1725012289; x=1725617089; darn=kvack.org; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:from:to:cc:subject:date :message-id:reply-to; 
From: Usama Arif
To: akpm@linux-foundation.org, linux-mm@kvack.org
Cc: hannes@cmpxchg.org, riel@surriel.com, shakeel.butt@linux.dev,
    roman.gushchin@linux.dev, yuzhao@google.com, david@redhat.com,
    npache@redhat.com, baohua@kernel.org, ryan.roberts@arm.com,
    rppt@kernel.org, willy@infradead.org, cerasuolodomenico@gmail.com,
    ryncsn@gmail.com, corbet@lwn.net, linux-kernel@vger.kernel.org,
    linux-doc@vger.kernel.org, kernel-team@meta.com, Usama Arif
Subject: [PATCH v5 5/6] mm: split underused THPs
Date: Fri, 30 Aug 2024 11:03:39 +0100
Message-ID: <20240830100438.3623486-6-usamaarif642@gmail.com>
In-Reply-To: <20240830100438.3623486-1-usamaarif642@gmail.com>
References: <20240830100438.3623486-1-usamaarif642@gmail.com>
MIME-Version: 1.0
This is an attempt to mitigate the issue of running out of memory when
THP is always enabled. At runtime, whenever a THP is faulted in
(__do_huge_pmd_anonymous_page) or collapsed by khugepaged
(collapse_huge_page), the THP is added to _deferred_list. Whenever
memory reclaim happens in Linux, the kernel runs the deferred_split
shrinker, which walks the _deferred_list.

If a folio is partially mapped, the shrinker attempts to split it. If
the folio is not partially mapped, the shrinker checks whether the THP
is underused, i.e. whether the number of zero-filled base 4K pages in
the THP exceeds a threshold (set via
/sys/kernel/mm/transparent_hugepage/khugepaged/max_ptes_none). If it
does, the shrinker attempts to split that THP. At remap time, the pages
that were zero-filled are mapped to the shared zeropage, saving memory.

Suggested-by: Rik van Riel
Co-authored-by: Johannes Weiner
Signed-off-by: Usama Arif
---
 Documentation/admin-guide/mm/transhuge.rst |  6 +++
 include/linux/khugepaged.h                 |  1 +
 include/linux/vm_event_item.h              |  1 +
 mm/huge_memory.c                           | 60 +++++++++++++++++++++-
 mm/khugepaged.c                            |  3 +-
 mm/vmstat.c                                |  1 +
 6 files changed, 69 insertions(+), 3 deletions(-)
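[Editorial note before the diff itself: a minimal userspace model of the
underused heuristic described above, assuming x86-64 with 4K pages
(HPAGE_PMD_NR = 512); model_thp_underused() and its buffer argument are
illustrative stand-ins, not the kernel code that follows.]

    #include <stdbool.h>
    #include <stddef.h>
    #include <string.h>

    #define MODEL_PAGE_SIZE 4096
    #define MODEL_HPAGE_NR  512  /* base pages per 2M THP: x86-64, 4K pages */

    /*
     * Userspace model of the underused check: a THP is "underused" when
     * more than max_ptes_none of its 512 subpages are entirely zero.
     * Mirrors the early exits of the kernel's thp_underused() below.
     */
    static bool model_thp_underused(const unsigned char *thp, int max_ptes_none)
    {
    	static const unsigned char zero[MODEL_PAGE_SIZE];
    	int num_zero = 0, num_filled = 0;

    	/* a threshold of HPAGE_PMD_NR - 1 disables the heuristic */
    	if (max_ptes_none == MODEL_HPAGE_NR - 1)
    		return false;

    	for (int i = 0; i < MODEL_HPAGE_NR; i++) {
    		const unsigned char *page = thp + (size_t)i * MODEL_PAGE_SIZE;

    		if (!memcmp(page, zero, MODEL_PAGE_SIZE)) {
    			if (++num_zero > max_ptes_none)
    				return true;   /* underused: worth splitting */
    		} else if (++num_filled >= MODEL_HPAGE_NR - max_ptes_none) {
    			return false;          /* enough live pages: keep it */
    		}
    	}
    	return false;
    }

Note the first early exit: when max_ptes_none is HPAGE_PMD_NR - 1 (its
default value) the check is skipped entirely, so the shrinker only
starts splitting underused THPs once the threshold is lowered.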
diff --git a/Documentation/admin-guide/mm/transhuge.rst b/Documentation/admin-guide/mm/transhuge.rst
index 56a086900651..aca0cff852b8 100644
--- a/Documentation/admin-guide/mm/transhuge.rst
+++ b/Documentation/admin-guide/mm/transhuge.rst
@@ -471,6 +471,12 @@ thp_deferred_split_page
 	splitting it would free up some memory. Pages on split queue are
 	going to be split under memory pressure.
 
+thp_underused_split_page
+	is incremented when a huge page on the split queue was split
+	because it was underused. A THP is underused if the number of
+	zero pages in the THP is above a certain threshold
+	(/sys/kernel/mm/transparent_hugepage/khugepaged/max_ptes_none).
+
 thp_split_pmd
 	is incremented every time a PMD split into table of PTEs.
 	This can happen, for instance, when application calls mprotect() or
diff --git a/include/linux/khugepaged.h b/include/linux/khugepaged.h
index f68865e19b0b..30baae91b225 100644
--- a/include/linux/khugepaged.h
+++ b/include/linux/khugepaged.h
@@ -4,6 +4,7 @@
 
 #include <linux/sched/coredump.h> /* MMF_VM_HUGEPAGE */
 
+extern unsigned int khugepaged_max_ptes_none __read_mostly;
 #ifdef CONFIG_TRANSPARENT_HUGEPAGE
 extern struct attribute_group khugepaged_attr_group;
diff --git a/include/linux/vm_event_item.h b/include/linux/vm_event_item.h
index aae5c7c5cfb4..aed952d04132 100644
--- a/include/linux/vm_event_item.h
+++ b/include/linux/vm_event_item.h
@@ -105,6 +105,7 @@ enum vm_event_item { PGPGIN, PGPGOUT, PSWPIN, PSWPOUT,
 		THP_SPLIT_PAGE,
 		THP_SPLIT_PAGE_FAILED,
 		THP_DEFERRED_SPLIT_PAGE,
+		THP_UNDERUSED_SPLIT_PAGE,
 		THP_SPLIT_PMD,
 		THP_SCAN_EXCEED_NONE_PTE,
 		THP_SCAN_EXCEED_SWAP_PTE,
diff --git a/mm/huge_memory.c b/mm/huge_memory.c
index 166f8810f3c6..a97aeffc55d6 100644
--- a/mm/huge_memory.c
+++ b/mm/huge_memory.c
@@ -1187,6 +1187,7 @@ static vm_fault_t __do_huge_pmd_anonymous_page(struct vm_fault *vmf,
 		update_mmu_cache_pmd(vma, vmf->address, vmf->pmd);
 		add_mm_counter(vma->vm_mm, MM_ANONPAGES, HPAGE_PMD_NR);
 		mm_inc_nr_ptes(vma->vm_mm);
+		deferred_split_folio(folio, false);
 		spin_unlock(vmf->ptl);
 		count_vm_event(THP_FAULT_ALLOC);
 		count_mthp_stat(HPAGE_PMD_ORDER, MTHP_STAT_ANON_FAULT_ALLOC);
@@ -3652,6 +3653,39 @@ static unsigned long deferred_split_count(struct shrinker *shrink,
 	return READ_ONCE(ds_queue->split_queue_len);
 }
 
+static bool thp_underused(struct folio *folio)
+{
+	int num_zero_pages = 0, num_filled_pages = 0;
+	void *kaddr;
+	int i;
+
+	if (khugepaged_max_ptes_none == HPAGE_PMD_NR - 1)
+		return false;
+
+	for (i = 0; i < folio_nr_pages(folio); i++) {
+		kaddr = kmap_local_folio(folio, i * PAGE_SIZE);
+		if (!memchr_inv(kaddr, 0, PAGE_SIZE)) {
+			num_zero_pages++;
+			if (num_zero_pages > khugepaged_max_ptes_none) {
+				kunmap_local(kaddr);
+				return true;
+			}
+		} else {
+			/*
+			 * Another path for early exit once the number
+			 * of non-zero filled pages exceeds threshold.
+			 */
+			num_filled_pages++;
+			if (num_filled_pages >= HPAGE_PMD_NR - khugepaged_max_ptes_none) {
+				kunmap_local(kaddr);
+				return false;
+			}
+		}
+		kunmap_local(kaddr);
+	}
+	return false;
+}
+
 static unsigned long deferred_split_scan(struct shrinker *shrink,
 					 struct shrink_control *sc)
 {
@@ -3689,13 +3723,35 @@ static unsigned long deferred_split_scan(struct shrinker *shrink,
 	spin_unlock_irqrestore(&ds_queue->split_queue_lock, flags);
 
 	list_for_each_entry_safe(folio, next, &list, _deferred_list) {
+		bool did_split = false;
+		bool underused = false;
+
+		if (!folio_test_partially_mapped(folio)) {
+			underused = thp_underused(folio);
+			if (!underused)
+				goto next;
+		}
 		if (!folio_trylock(folio))
 			goto next;
-		/* split_huge_page() removes page from list on success */
-		if (!split_folio(folio))
+		if (!split_folio(folio)) {
+			did_split = true;
+			if (underused)
+				count_vm_event(THP_UNDERUSED_SPLIT_PAGE);
 			split++;
+		}
 		folio_unlock(folio);
 next:
+		/*
+		 * split_folio() removes folio from list on success.
+		 * Only add back to the queue if folio is partially mapped.
+		 * If thp_underused returns false, or if split_folio fails
+		 * in the case it was underused, then consider it used and
+		 * don't add it back to split_queue.
+		 */
+		if (!did_split && !folio_test_partially_mapped(folio)) {
+			list_del_init(&folio->_deferred_list);
+			ds_queue->split_queue_len--;
+		}
 		folio_put(folio);
 	}
 
diff --git a/mm/khugepaged.c b/mm/khugepaged.c
index 5bfb5594c604..bf1734e8e665 100644
--- a/mm/khugepaged.c
+++ b/mm/khugepaged.c
@@ -85,7 +85,7 @@ static DECLARE_WAIT_QUEUE_HEAD(khugepaged_wait);
  *
  * Note that these are only respected if collapse was initiated by khugepaged.
  */
-static unsigned int khugepaged_max_ptes_none __read_mostly;
+unsigned int khugepaged_max_ptes_none __read_mostly;
 static unsigned int khugepaged_max_ptes_swap __read_mostly;
 static unsigned int khugepaged_max_ptes_shared __read_mostly;
 
@@ -1237,6 +1237,7 @@ static int collapse_huge_page(struct mm_struct *mm, unsigned long address,
 	pgtable_trans_huge_deposit(mm, pmd, pgtable);
 	set_pmd_at(mm, address, pmd, _pmd);
 	update_mmu_cache_pmd(vma, address, pmd);
+	deferred_split_folio(folio, false);
 	spin_unlock(pmd_ptl);
 
 	folio = NULL;
diff --git a/mm/vmstat.c b/mm/vmstat.c
index f41984dc856f..bb081ae4d0ae 100644
--- a/mm/vmstat.c
+++ b/mm/vmstat.c
@@ -1385,6 +1385,7 @@ const char * const vmstat_text[] = {
 	"thp_split_page",
 	"thp_split_page_failed",
 	"thp_deferred_split_page",
+	"thp_underused_split_page",
 	"thp_split_pmd",
 	"thp_scan_exceed_none_pte",
 	"thp_scan_exceed_swap_pte",
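[Editorial note: with the vmstat plumbing above in place, the new
counter appears in /proc/vmstat alongside thp_deferred_split_page. A
sketch of how a monitoring tool might poll it; read_vmstat() is a
hypothetical helper, not part of the patch.]

    #include <stdio.h>
    #include <string.h>

    /* Read a named counter from /proc/vmstat; returns -1 if absent. */
    static long long read_vmstat(const char *name)
    {
    	char key[64];
    	long long val;
    	FILE *f = fopen("/proc/vmstat", "r");

    	if (!f)
    		return -1;
    	while (fscanf(f, "%63s %lld", key, &val) == 2) {
    		if (!strcmp(key, name)) {
    			fclose(f);
    			return val;
    		}
    	}
    	fclose(f);
    	return -1;
    }

    int main(void)
    {
    	printf("thp_underused_split_page = %lld\n",
    	       read_vmstat("thp_underused_split_page"));
    	return 0;
    }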
From patchwork Fri Aug 30 10:03:40 2024
From: Usama Arif
To: akpm@linux-foundation.org, linux-mm@kvack.org
Cc: hannes@cmpxchg.org, riel@surriel.com, shakeel.butt@linux.dev,
    roman.gushchin@linux.dev, yuzhao@google.com, david@redhat.com,
    npache@redhat.com, baohua@kernel.org, ryan.roberts@arm.com,
    rppt@kernel.org, willy@infradead.org, cerasuolodomenico@gmail.com,
    ryncsn@gmail.com, corbet@lwn.net, linux-kernel@vger.kernel.org,
    linux-doc@vger.kernel.org, kernel-team@meta.com, Usama Arif
Subject: [PATCH v5 6/6] mm: add sysfs entry to disable splitting underused THPs
Date: Fri, 30 Aug 2024 11:03:40 +0100
Message-ID: <20240830100438.3623486-7-usamaarif642@gmail.com>
In-Reply-To: <20240830100438.3623486-1-usamaarif642@gmail.com>
References: <20240830100438.3623486-1-usamaarif642@gmail.com>
MIME-Version: 1.0

If disabled, THPs faulted in or collapsed will not be added to
_deferred_list, and therefore won't be considered for splitting under
memory pressure if underused.

Signed-off-by: Usama Arif
---
 Documentation/admin-guide/mm/transhuge.rst | 10 +++++++++
 mm/huge_memory.c                           | 26 ++++++++++++++++++++++
 2 files changed, 36 insertions(+)
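[Editorial note before the diff: a usage sketch of the knob this patch
adds, for a service that prefers to keep its THPs intact. The sysfs path
comes from the patch; set_shrink_underused() is a hypothetical wrapper.
Since the store handler uses kstrtobool(), "0"/"1" as well as y/n and
on/off style strings are accepted.]

    #include <fcntl.h>
    #include <stdio.h>
    #include <unistd.h>

    /* Enable (1) or disable (0) the underused-THP shrinker via sysfs. */
    static int set_shrink_underused(int enable)
    {
    	const char *path =
    		"/sys/kernel/mm/transparent_hugepage/shrink_underused";
    	int fd = open(path, O_WRONLY);

    	if (fd < 0) {
    		perror("open");
    		return -1;
    	}
    	if (write(fd, enable ? "1" : "0", 1) != 1) {
    		perror("write");
    		close(fd);
    		return -1;
    	}
    	return close(fd);
    }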
diff --git a/Documentation/admin-guide/mm/transhuge.rst b/Documentation/admin-guide/mm/transhuge.rst
index aca0cff852b8..cfdd16a52e39 100644
--- a/Documentation/admin-guide/mm/transhuge.rst
+++ b/Documentation/admin-guide/mm/transhuge.rst
@@ -202,6 +202,16 @@ PMD-mappable transparent hugepage::
 
 	cat /sys/kernel/mm/transparent_hugepage/hpage_pmd_size
 
+All THPs at fault and collapse time will be added to _deferred_list,
+and will therefore be split under memory pressure if they are considered
+"underused". A THP is underused if the number of zero-filled pages in
+the THP is above max_ptes_none (see below). It is possible to disable
+this behaviour by writing 0 to shrink_underused, and enable it by writing
+1 to it::
+
+	echo 0 > /sys/kernel/mm/transparent_hugepage/shrink_underused
+	echo 1 > /sys/kernel/mm/transparent_hugepage/shrink_underused
+
 khugepaged will be automatically started when PMD-sized THP is enabled
 (either of the per-size anon control or the top-level control are set to
 "always" or "madvise"), and it'll be automatically shutdown when
diff --git a/mm/huge_memory.c b/mm/huge_memory.c
index a97aeffc55d6..0993dfe9ae94 100644
--- a/mm/huge_memory.c
+++ b/mm/huge_memory.c
@@ -74,6 +74,7 @@ static unsigned long deferred_split_count(struct shrinker *shrink,
 					  struct shrink_control *sc);
 static unsigned long deferred_split_scan(struct shrinker *shrink,
 					 struct shrink_control *sc);
+static bool split_underused_thp = true;
 
 static atomic_t huge_zero_refcount;
 struct folio *huge_zero_folio __read_mostly;
@@ -440,6 +441,27 @@ static ssize_t hpage_pmd_size_show(struct kobject *kobj,
 static struct kobj_attribute hpage_pmd_size_attr =
 	__ATTR_RO(hpage_pmd_size);
 
+static ssize_t split_underused_thp_show(struct kobject *kobj,
+			    struct kobj_attribute *attr, char *buf)
+{
+	return sysfs_emit(buf, "%d\n", split_underused_thp);
+}
+
+static ssize_t split_underused_thp_store(struct kobject *kobj,
+				  struct kobj_attribute *attr,
+				  const char *buf, size_t count)
+{
+	int err = kstrtobool(buf, &split_underused_thp);
+
+	if (err < 0)
+		return err;
+
+	return count;
+}
+
+static struct kobj_attribute split_underused_thp_attr = __ATTR(
+	shrink_underused, 0644, split_underused_thp_show, split_underused_thp_store);
+
 static struct attribute *hugepage_attr[] = {
 	&enabled_attr.attr,
 	&defrag_attr.attr,
@@ -448,6 +470,7 @@ static struct attribute *hugepage_attr[] = {
 #ifdef CONFIG_SHMEM
 	&shmem_enabled_attr.attr,
 #endif
+	&split_underused_thp_attr.attr,
 	NULL,
 };
 
@@ -3601,6 +3624,9 @@ void deferred_split_folio(struct folio *folio, bool partially_mapped)
 	if (folio_order(folio) <= 1)
 		return;
 
+	if (!partially_mapped && !split_underused_thp)
+		return;
+
 	/*
 	 * The try_to_unmap() in page reclaim path might reach here too,
 	 * this may cause a race condition to corrupt deferred split queue.