From patchwork Wed Aug 14 03:54:48 2024
X-Patchwork-Submitter: Yu Zhao
X-Patchwork-Id: 13762816
Date: Tue, 13 Aug 2024 21:54:48 -0600
Message-ID: <20240814035451.773331-1-yuzhao@google.com>
Subject: [PATCH mm-unstable v2 0/3] mm/hugetlb: alloc/free gigantic folios
From: Yu Zhao
To: Andrew Morton, Muchun Song
Cc: "Matthew Wilcox (Oracle)", Zi Yan, linux-mm@kvack.org,
    linux-kernel@vger.kernel.org, Yu Zhao

Using __GFP_COMP for gigantic folios can greatly reduce not only the
amount of code but also the allocation and free time.

Approximate LOC change to mm/hugetlb.c: +60, -240

Time to allocate and free 500 1GB hugeTLB pages without HVO, measured
with:
  time echo 500 >/sys/kernel/mm/hugepages/hugepages-1048576kB/nr_hugepages
  time echo 0 >/sys/kernel/mm/hugepages/hugepages-1048576kB/nr_hugepages

         Before  After
  Alloc  ~13s    ~10s
  Free   ~15s    <1s

Results of this magnitude generally hold across multiple x86 and arm64
CPU models.

Perf profile before:
  Alloc
    - 99.99% alloc_pool_huge_folio
       - __alloc_fresh_hugetlb_folio
          - 83.23% alloc_contig_pages_noprof
             - 47.46% alloc_contig_range_noprof
                - 20.96% isolate_freepages_range
                     16.10% split_page
                - 14.10% start_isolate_page_range
                - 12.02% undo_isolate_page_range

  Free
    - update_and_free_pages_bulk
       - 87.71% free_contig_range
          - 76.02% free_unref_page
             - 41.30% free_unref_page_commit
                - 32.58% free_pcppages_bulk
                   - 24.75% __free_one_page
               13.96% _raw_spin_trylock
         12.27% __update_and_free_hugetlb_folio

Perf profile after:
  Alloc
    - 99.99% alloc_pool_huge_folio
         alloc_gigantic_folio
       - alloc_contig_pages_noprof
          - 59.15% alloc_contig_range_noprof
             - 20.72% start_isolate_page_range
               20.64% prep_new_page
             - 17.13% undo_isolate_page_range

  Free
    - update_and_free_pages_bulk
       - __folio_put
          - __free_pages_ok
               7.46% free_tail_page_prepare
             - 1.97% free_one_page
                  1.86% __free_one_page

Yu Zhao (3):
  mm/contig_alloc: support __GFP_COMP
  mm/cma: add cma_{alloc,free}_folio()
  mm/hugetlb: use __GFP_COMP for gigantic folios

 include/linux/cma.h     |  16 +++
 include/linux/gfp.h     |  23 ++++
 include/linux/hugetlb.h |   9 +-
 mm/cma.c                |  55 ++++++--
 mm/compaction.c         |  41 +-----
 mm/hugetlb.c            | 293 ++++++++--------------------------------
 mm/page_alloc.c         | 111 ++++++++++-----
 7 files changed, 226 insertions(+), 322 deletions(-)
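
As a rough illustration of why the Free path changes so much between the
two profiles, the sketch below counts how many base pages back the
benchmark's 500 gigantic folios. It is back-of-envelope arithmetic only,
not part of the series. The 4 KiB base page size is an assumption (arm64
kernels may be built with 16 KiB or 64 KiB pages), and the variable names
are made up for this example. The "before" profile spends its time under
free_contig_range and free_unref_page, which is consistent with work that
scales with the base page count; the "after" profile goes through
__folio_put and __free_pages_ok instead. How the page allocator handles
the range internally is not modeled here.

```shell
#!/bin/sh
# Back-of-envelope count of base pages behind the benchmark's folios.
# ASSUMPTION: 4 KiB base pages; adjust base_page_kb for other configs.

base_page_kb=4       # assumed base page size in KiB (not from the series)
folio_kb=1048576     # gigantic folio size, from hugepages-1048576kB
nr_folios=500        # from "echo 500 > .../nr_hugepages"

# Base pages that make up one 1GB folio.
pages_per_folio=$((folio_kb / base_page_kb))

# Base pages across the whole pool, i.e. the amount of work if every
# base page has to be handed back individually.
total_base_pages=$((nr_folios * pages_per_folio))

echo "base pages per 1GB folio: $pages_per_folio"
echo "base pages in the pool:   $total_base_pages"
echo "folios in the pool:       $nr_folios"
```

With the assumed 4 KiB page size, the ratio between the last two numbers
is the per-folio page count, which gives a sense of how much per-page
bookkeeping a compound free can avoid.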