Date: Sun, 20 Oct 2024 22:22:12 -0600
Message-ID: <20241021042218.746659-1-yuzhao@google.com>
Subject: [PATCH v1 0/6] mm/arm64: re-enable HVO
From: Yu Zhao
To: Andrew Morton, Catalin Marinas, Marc Zyngier, Muchun Song, Thomas Gleixner, Will Deacon
Cc: Douglas Anderson, Mark Rutland, Nanyong Sun, linux-arm-kernel@lists.infradead.org, linux-kernel@vger.kernel.org, linux-mm@kvack.org, Yu Zhao

This series presents one of the previously discussed approaches to
re-enable HugeTLB Vmemmap Optimization (HVO) on arm64.

HVO was disabled by commit 060a2c92d1b6 ("arm64: mm: hugetlb: Disable
HUGETLB_PAGE_OPTIMIZE_VMEMMAP") for the following reason:

  This is deemed UNPREDICTABLE by the Arm architecture without a
  break-before-make sequence (make the PTE invalid, TLBI, write the
  new valid PTE). However, such sequence is not possible since the
  vmemmap may be concurrently accessed by the kernel.

Other approaches that have been discussed include:
  A. Handle kernel PF while doing BBM [1],
  B. Use stop_machine() while doing BBM [2], and,
  C. Enable FEAT_BBM level 2 and keep the memory contents at the old
     and new output addresses unchanged to avoid BBM (D8.16.1-2) [3].

A quick comparison between this approach (D) and the above approaches:
  --+------------------------------+-----------------------------+
    |             Pro              |             Con             |
  --+------------------------------+-----------------------------+
  A | Low latency, h/w independent | Predictability concerns [4] |
  B | Predictable, h/w independent | High latency                |
  C | Predictable, low latency     | H/w dependent, complex      |
  D | Predictable, h/w independent | Medium latency              |
  --+------------------------------+-----------------------------+

This approach is being tested for Google's production systems, which
generally find the "con" above acceptable, making it the preferred
tradeoff for our use cases:
  +------------------------------+------------+----------+--------+
  | HugeTLB operations           | Before [0] | After    | Change |
  +------------------------------+------------+----------+--------+
  | Alloc 600 1GB                | 0m3.526s   | 0m3.779s |    +7% |
  | Free 600 1GB                 | 0m0.880s   | 0m0.940s |    +7% |
  | Demote 600 1GB to 307200 2MB | 0m1.575s   | 0m5.132s |  +226% |
  | Free 307200 2MB              | 0m0.946s   | 0m4.456s |  +371% |
  +------------------------------+------------+----------+--------+

[0] For comparison purposes, this only includes the last patch in the
    series, i.e., CONFIG_ARCH_WANT_OPTIMIZE_HUGETLB_VMEMMAP=y.
[1] https://lore.kernel.org/20240113094436.2506396-1-sunnanyong@huawei.com/
[2] https://lore.kernel.org/ZbKjHHeEdFYY1xR5@arm.com/
[3] https://lore.kernel.org/Zo68DP6siXfb6ZBR@arm.com/
[4] https://lore.kernel.org/20240326125409.GA9552@willie-the-truck/

Yu Zhao (6):
  mm/hugetlb_vmemmap: batch update PTEs
  mm/hugetlb_vmemmap: add arch-independent helpers
  irqchip/gic-v3: support SGI broadcast
  arm64: broadcast IPIs to pause remote CPUs
  arm64: pause remote CPUs to update vmemmap
  arm64: select ARCH_WANT_OPTIMIZE_HUGETLB_VMEMMAP

 arch/arm64/Kconfig               |   1 +
 arch/arm64/include/asm/pgalloc.h |  69 ++++++++
 arch/arm64/include/asm/smp.h     |   3 +
 arch/arm64/kernel/smp.c          |  92 ++++++++++-
 drivers/irqchip/irq-gic-v3.c     |  20 ++-
 include/linux/mm_types.h         |   7 +
 mm/hugetlb_vmemmap.c             | 262 +++++++++++++++++++++----------
 7 files changed, 360 insertions(+), 94 deletions(-)

base-commit: 42f7652d3eb527d03665b09edac47f85fb600924
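
For readers who want to re-derive the Change column of the HugeTLB
timing table in the cover letter, here is a minimal sketch of the
arithmetic. It assumes that Change means (After - Before) / Before and
that the durations use the "XmY.YYYs" format printed by the shell
`time` builtin; neither is stated in the cover letter. The helper names
parse_time() and relative_change() are hypothetical and not part of
the series.

```python
import re


def parse_time(text: str) -> float:
    """Convert a duration such as '0m3.526s' into seconds."""
    match = re.fullmatch(r"(\d+)m(\d+(?:\.\d+)?)s", text.strip())
    if match is None:
        raise ValueError(f"unrecognized duration: {text!r}")
    minutes, seconds = match.groups()
    return int(minutes) * 60 + float(seconds)


def relative_change(before: str, after: str) -> float:
    """Return (after - before) / before as a percentage (assumed definition)."""
    b, a = parse_time(before), parse_time(after)
    return (a - b) / b * 100.0


# Timings copied from the Before and After columns of the table.
rows = [
    ("Alloc 600 1GB", "0m3.526s", "0m3.779s"),
    ("Free 600 1GB", "0m0.880s", "0m0.940s"),
    ("Demote 600 1GB to 307200 2MB", "0m1.575s", "0m5.132s"),
    ("Free 307200 2MB", "0m0.946s", "0m4.456s"),
]

for name, before, after in rows:
    print(f"{name:30s} {relative_change(before, after):+.0f}%")
```

Under that definition, both 1GB rows come out at roughly +7%.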