From patchwork Tue Jan 11 13:16:52 2022
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Muchun Song
X-Patchwork-Id: 12709867
From: Muchun Song
To: will@kernel.org, akpm@linux-foundation.org, david@redhat.com,
 bodeddub@amazon.com, osalvador@suse.de, mike.kravetz@oracle.com,
 rientjes@google.com, mark.rutland@arm.com, catalin.marinas@arm.com,
 james.morse@arm.com
Cc: linux-arm-kernel@lists.infradead.org,
 linux-kernel@vger.kernel.org, linux-mm@kvack.org,
 duanxiongchun@bytedance.com, fam.zheng@bytedance.com, Muchun Song
Subject: [PATCH] arm64: mm: hugetlb: add support for free vmemmap pages of HugeTLB
Date: Tue, 11 Jan 2022 21:16:52 +0800
Message-Id: <20220111131652.61947-1-songmuchun@bytedance.com>
X-Mailer: git-send-email 2.32.0 (Apple Git-132)

The preparatory work for freeing the vmemmap pages associated with each
HugeTLB page is in place, so we can now enable this feature on arm64.

Signed-off-by: Muchun Song
---
There has already been some discussion about this in [1], but it ended
without a conclusion. I have copied the concerns raised by Anshuman here.

1st concern:
"
But what happens when a hot remove section's vmemmap area (which is being
teared down) is nearby another vmemmap area which is either created or
being destroyed for HugeTLB alloc/free purpose. As you mentioned HugeTLB
pages inside the hot remove section might be safe. But what about other
HugeTLB areas whose vmemmap area shares page table entries with vmemmap
entries for a section being hot removed ? Massive HugeTLB alloc/use/free
test cycle using memory just adjacent to a memory hotplug area, which is
always added and removed periodically, should be able to expose this problem.
"

My Answer: As you already know, HugeTLB pages inside the hot remove section
are safe.
Now to your question, "what about other HugeTLB areas whose vmemmap area
shares page table entries with vmemmap entries for a section being hot
removed ?": its premise does not hold. Why? The minimum granularity of
hotplug memory is 128MB (on arm64 with 4k base pages), so any HugeTLB page
smaller than 128MB lies within a single section. A HugeTLB page cannot cross
two sections, and the HugeTLB pages in one section share no (PTE) page
tables with those in other sections. For any HugeTLB page bigger than 128MB
(e.g. 1GB), its size is an integer multiple of the section size and its
vmemmap area is an integer multiple of 2MB. By the time memory is removed,
all huge pages have either been migrated away or dissolved, so the vmemmap
is stable. There is no problem in this case either.

2nd concern:
"
differently, not sure if ptdump would require any synchronization.

Dumping an wrong value is probably okay but crashing because a page table
entry is being freed after ptdump acquired the pointer is bad. On arm64,
ptdump() is protected against hotremove via [get|put]_online_mems().
"

My Answer: ptdump should be fine, since vmemmap_remap_free() only exchanges
PTEs or splits the PMD entry (which means allocating a PTE page table).
Neither operation frees any page tables, so ptdump cannot run into a UAF on
any page table. The worst case is just dumping a wrong value.

[1] https://lore.kernel.org/linux-mm/b8cdc9c8-853c-8392-a2fa-4f1a8f02057a@arm.com/T/

 fs/Kconfig | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/fs/Kconfig b/fs/Kconfig
index 7a2b11c0b803..04cfd5bf5ec9 100644
--- a/fs/Kconfig
+++ b/fs/Kconfig
@@ -247,7 +247,7 @@ config HUGETLB_PAGE
 
 config HUGETLB_PAGE_FREE_VMEMMAP
 	def_bool HUGETLB_PAGE
-	depends on X86_64
+	depends on X86_64 || ARM64
 	depends on SPARSEMEM_VMEMMAP
 
 config HUGETLB_PAGE_FREE_VMEMMAP_DEFAULT_ON
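
The section-size argument in the answer to the first concern can be checked with a little arithmetic. The sketch below is only an illustration, not kernel code. It takes the 128MB section size, 4k base pages and 2MB granularity from the discussion, and it assumes that sizeof(struct page) is 64 bytes, which the patch does not state. The helper names vmemmap_bytes() and crosses_section() are hypothetical, and the 2MB huge page size is used purely as an example of a HugeTLB page smaller than a section.

```python
# Sketch only. STRUCT_PAGE_SIZE is an assumption (a common value of
# sizeof(struct page) on 64-bit kernels); the other constants come from the
# discussion: arm64 with 4k base pages, 128MB sections, 2MB granularity.
BASE_PAGE_SIZE = 4 << 10
STRUCT_PAGE_SIZE = 64          # assumption, not stated in the patch
SECTION_SIZE = 128 << 20
PMD_SIZE = 2 << 20


def vmemmap_bytes(region_size):
    """Bytes of struct page array needed to describe region_size of memory."""
    return region_size // BASE_PAGE_SIZE * STRUCT_PAGE_SIZE


def crosses_section(start, size):
    """True if [start, start + size) spans more than one memory section."""
    return start // SECTION_SIZE != (start + size - 1) // SECTION_SIZE


if __name__ == "__main__":
    for name, size in (("2MB huge page", 2 << 20),
                       ("128MB section", SECTION_SIZE),
                       ("1GB huge page", 1 << 30)):
        vm = vmemmap_bytes(size)
        print(f"{name}: vmemmap = {vm >> 10} KB, "
              f"multiple of 2MB: {vm % PMD_SIZE == 0}")

    # Naturally aligned 2MB huge pages never straddle a section boundary.
    hp = 2 << 20
    print(any(crosses_section(i * hp, hp) for i in range(1024)))
```

Under these assumptions one section's vmemmap is exactly 2MB, which is the range covered by a single PTE page table with 4k pages, so neighbouring sections would not share a PTE table provided the vmemmap base itself is 2MB-aligned.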