From patchwork Tue Nov 1 15:03:23 2022
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Zach O'Keefe
X-Patchwork-Id: 13027070
Date: Tue, 1 Nov 2022 08:03:23 -0700
X-Mailer: git-send-email 2.38.1.273.g43a17bfeac-goog
Message-ID: <20221101150323.89743-1-zokeefe@google.com>
Subject: [PATCH man-pages v5] madvise.2: add documentation for MADV_COLLAPSE
From: Zach OKeefe
To: Alejandro Colomar, Michael Kerrisk
Cc: Yang Shi, linux-mm@kvack.org, linux-man@vger.kernel.org, "Zach O'Keefe"

From: Zach O'Keefe

Linux 6.1 introduced MADV_COLLAPSE in upstream commit 7d8faaf15545
("mm/madvise: introduce MADV_COLLAPSE sync hugepage collapse") and
upstream commit 34488399fa08 ("mm/madvise: add file and shmem support
to MADV_COLLAPSE"). Update the man-pages for madvise(2) and
process_madvise(2).

Link: https://lore.kernel.org/linux-mm/20220922224046.1143204-1-zokeefe@google.com/
Link: https://lore.kernel.org/linux-mm/20220706235936.2197195-1-zokeefe@google.com/
Signed-off-by: Zach O'Keefe
---
v4[1] -> v5
- Rebased to latest master
- (Alejandro Colomar) Applied diff to remove spurious file and fix
  semantic newlines.
- (Alejandro Colomar) Reworded documentation describing behavior of
  setting errno when multiple hugepage-aligned/sized regions fail to
  collapse.

v3[2] -> v4
- Rebased to latest master
- (Alejandro Colomar) Fixed weird, non-ascii chars: e2 80 99 -> "'"
- (Alejandro Colomar) Replaced .BR with .B directive when the entire
  line was bold (no non-bold part)

[1] https://lore.kernel.org/linux-man/20221031225500.3994542-1-zokeefe@google.com/
[2] https://lore.kernel.org/linux-man/bb3b5c3c-3966-ea1a-6d84-4f7f3afa37ca@gmail.com/T/#u

 man2/madvise.2         | 91 +++++++++++++++++++++++++++++++++++++++++-
 man2/process_madvise.2 | 10 +++++
 2 files changed, 99 insertions(+), 2 deletions(-)

diff --git a/man2/madvise.2 b/man2/madvise.2
index edf805740..038e6023d 100644
--- a/man2/madvise.2
+++ b/man2/madvise.2
@@ -386,9 +386,10 @@ set (see
 .BR prctl (2)).
 .IP
 The
-.B MADV_HUGEPAGE
+.BR MADV_HUGEPAGE ,
+.BR MADV_NOHUGEPAGE ,
 and
-.B MADV_NOHUGEPAGE
+.B MADV_COLLAPSE
 operations are available only if the kernel was configured with
 .B CONFIG_TRANSPARENT_HUGEPAGE
 and file/shmem memory is only supported if the kernel was configured with
@@ -401,6 +402,82 @@ and
 .I length
 will not be backed by transparent hugepages.
 .TP
+.BR MADV_COLLAPSE " (since Linux 6.1)"
+.\" commit 7d8faaf155454f8798ec56404faca29a82689c77
+.\" commit 34488399fa08faaf664743fa54b271eb6f9e1321
+Perform a best-effort synchronous collapse of
+the native pages mapped by the memory range
+into Transparent Huge Pages (THPs).
+.B MADV_COLLAPSE
+operates on the current state of memory of the calling process and
+makes no persistent changes or guarantees on how pages will be mapped,
+constructed,
+or faulted in the future.
+.IP
+.B MADV_COLLAPSE
+supports private anonymous pages (see
+.BR mmap (2)),
+shmem pages,
+and file-backed pages.
+See
+.B MADV_HUGEPAGE
+for general information on memory requirements for THP.
+If the range provided spans multiple VMAs,
+the semantics of the collapse over each VMA is independent from the others.
+If collapse of a given huge page-aligned/sized region fails,
+the operation may continue to attempt collapsing
+the remainder of the specified memory.
+.B MADV_COLLAPSE
+will automatically clamp the provided range to be hugepage-aligned.
+.IP
+All non-resident pages covered by the range
+will first be swapped/faulted-in,
+before being copied onto a freshly allocated hugepage.
+If the native pages compose the same PTE-mapped hugepage,
+and are suitably aligned,
+allocation of a new hugepage may be elided and
+collapse may happen in-place.
+Unmapped pages will have their data directly initialized to 0
+in the new hugepage.
+However,
+for every eligible hugepage-aligned/sized region to be collapsed,
+at least one page must currently be backed by physical memory.
+.IP
+.B MADV_COLLAPSE
+is independent of any sysfs
+(see
+.BR sysfs (5))
+setting under
+.IR /sys/kernel/mm/transparent_hugepage ,
+both in terms of determining THP eligibility,
+and allocation semantics.
+See Linux kernel source file
+.I Documentation/admin\-guide/mm/transhuge.rst
+for more information.
+.B MADV_COLLAPSE
+also ignores
+.B huge=
+tmpfs mount when operating on tmpfs files.
+Allocation for the new hugepage may enter direct reclaim and/or compaction,
+regardless of VMA flags
+(though
+.B VM_NOHUGEPAGE
+is still respected).
+.IP
+When the system has multiple NUMA nodes,
+the hugepage will be allocated from
+the node providing the most native pages.
+.IP
+If all hugepage-sized/aligned regions covered by the provided range were
+either successfully collapsed,
+or were already PMD-mapped THPs,
+this operation will be deemed successful.
+Note that this doesn't guarantee anything about
+other possible mappings of the memory.
+In the event multiple hugepage-aligned/sized areas fail to collapse,
+only the most recently-failed code will be set in
+.IR errno .
+.TP
 .BR MADV_DONTDUMP " (since Linux 3.4)"
 .\" commit 909af768e88867016f427264ae39d27a57b6a8ed
 .\" commit accb61fe7bb0f5c2a4102239e4981650f9048519
@@ -620,6 +697,11 @@ A kernel resource was temporarily unavailable.
 .B EBADF
 The map exists, but the area maps something that isn't a file.
 .TP
+.B EBUSY
+(for
+.BR MADV_COLLAPSE )
+Could not charge hugepage to cgroup: cgroup limit exceeded.
+.TP
 .B EFAULT
 .I advice
 is
@@ -717,6 +799,11 @@ maximum resident set size.
 Not enough memory: paging in failed.
 .TP
 .B ENOMEM
+(for
+.BR MADV_COLLAPSE )
+Not enough memory: could not allocate hugepage.
+.TP
+.B ENOMEM
 Addresses in the specified range are not currently mapped,
 or are outside the address space of the process.
 .TP
diff --git a/man2/process_madvise.2 b/man2/process_madvise.2
index ac98850a9..92878286b 100644
--- a/man2/process_madvise.2
+++ b/man2/process_madvise.2
@@ -73,6 +73,10 @@ argument is one of the following values:
 See
 .BR madvise (2).
 .TP
+.B MADV_COLLAPSE
+See
+.BR madvise (2).
+.TP
 .B MADV_PAGEOUT
 See
 .BR madvise (2).
@@ -173,6 +177,12 @@ The caller does not have permission to access the address space of the process
 .TP
 .B ESRCH
 The target process does not exist (i.e., it has terminated and been waited on).
+.PP
+See
+.BR madvise (2)
+for
+.IR advice -specific
+errors.
 .SH VERSIONS
 This system call first appeared in Linux 5.10.
 .\" commit ecb8ac8b1f146915aa6b96449b66dd48984caacc
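
For reviewers who want to exercise the documented behavior, here is a
minimal C sketch of how a caller might use MADV_COLLAPSE. It is not part
of the patch and rests on several assumptions: the fallback value 25 for
MADV_COLLAPSE is the one in the generic UAPI header and is only needed
when the libc headers predate Linux 6.1; the hugepage size is passed in
by the caller (for example 2 MiB on x86-64) rather than queried; and the
"clamp" described above is assumed to round the start up and the end
down. The helper names hpage_align_up(), hpage_align_down(),
hpage_regions() and try_collapse() are hypothetical.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/mman.h>

/* Assumption: older libc headers may lack MADV_COLLAPSE; 25 is the
 * value used by the generic UAPI header (asm-generic/mman-common.h). */
#ifndef MADV_COLLAPSE
#define MADV_COLLAPSE 25
#endif

/* Round addr up to a multiple of hpage (hpage must be a power of two). */
uintptr_t hpage_align_up(uintptr_t addr, size_t hpage)
{
    return (addr + hpage - 1) & ~((uintptr_t)hpage - 1);
}

/* Round addr down to a multiple of hpage (hpage must be a power of two). */
uintptr_t hpage_align_down(uintptr_t addr, size_t hpage)
{
    return addr & ~((uintptr_t)hpage - 1);
}

/* Count the whole hugepage-aligned/sized regions inside [addr, addr+len),
 * assuming the range is clamped inward (start up, end down). */
size_t hpage_regions(uintptr_t addr, size_t len, size_t hpage)
{
    uintptr_t start = hpage_align_up(addr, hpage);
    uintptr_t end = hpage_align_down(addr + len, hpage);

    return end > start ? (end - start) / hpage : 0;
}

/* Request a best-effort synchronous collapse of [addr, addr+len).
 * Returns 0 on success, otherwise the errno value set by madvise(2).
 * Per the text above, when several regions fail only the most recent
 * failure is reported, so the result cannot identify which region failed. */
int try_collapse(void *addr, size_t len)
{
    if (madvise(addr, len, MADV_COLLAPSE) == 0)
        return 0;
    return errno;
}
```

A caller would typically map a region, touch at least one page in each
hugepage-sized region (the text above requires one page to be backed by
physical memory), and then call try_collapse(). A return of EBUSY or
ENOMEM corresponds to the errors documented in this patch; a kernel that
does not know the advice value is expected to return EINVAL.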