Date: Fri, 16 Jul 2021 19:59:51 -0700
Message-Id: <20210717025951.3946505-1-pcc@google.com>
Subject: [PATCH] refpage_create.2: Document refpage_create(2)
From: Peter Collingbourne
To: John Hubbard, Matthew Wilcox, "Kirill A . Shutemov", Andrew Morton,
 Catalin Marinas, Evgenii Stepanov, Michael Kerrisk, Alejandro Colomar
Cc: Peter Collingbourne, Jann Horn, Linux ARM, linux-mm@kvack.org,
 kernel test robot, Linux API, linux-doc@vger.kernel.org,
 linux-man@vger.kernel.org

---
The syscall has not landed in the kernel yet.
Therefore, as usual, the patch should not be taken yet
and I've used 5.x as the introducing kernel version for now.

 man2/refpage_create.2 | 167 ++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 167 insertions(+)
 create mode 100644 man2/refpage_create.2

diff --git a/man2/refpage_create.2 b/man2/refpage_create.2
new file mode 100644
index 000000000..c0b928b92
--- /dev/null
+++ b/man2/refpage_create.2
@@ -0,0 +1,167 @@
+.\" Copyright (C) 2021 Google LLC
+.\" Author: Peter Collingbourne
+.\"
+.\" %%%LICENSE_START(VERBATIM)
+.\" Permission is granted to make and distribute verbatim copies of this
+.\" manual provided the copyright notice and this permission notice are
+.\" preserved on all copies.
+.\"
+.\" Permission is granted to copy and distribute modified versions of this
+.\" manual under the conditions for verbatim copying, provided that the
+.\" entire resulting derived work is distributed under the terms of a
+.\" permission notice identical to this one.
+.\"
+.\" Since the Linux kernel and libraries are constantly changing, this
+.\" manual page may be incorrect or out-of-date. The author(s) assume no
+.\" responsibility for errors or omissions, or for damages resulting from
+.\" the use of the information contained herein. The author(s) may not
+.\" have taken the same level of care in the production of this manual,
+.\" which is licensed free of charge, as they might when working
+.\" professionally.
+.\"
+.\" Formatted or processed versions of this manual, if unaccompanied by
+.\" the source, must acknowledge the copyright and authors of this work.
+.\" %%%LICENSE_END
+.\"
+.TH REFPAGE_CREATE 2 2021-07-16 "Linux" "Linux Programmer's Manual"
+.SH NAME
+refpage_create \- create a reference page file descriptor
+.SH SYNOPSIS
+.nf
+.BR "#include "
+.PP
+.BI "int syscall(SYS_refpage_create, void *" content ", unsigned int " size ,
+.BI "            unsigned long " flags ");"
+.fi
+.PP
+.IR Note :
+glibc provides no wrapper for
+.BR refpage_create (),
+necessitating the use of
+.BR syscall (2).
+.SH DESCRIPTION
+The
+.BR refpage_create ()
+system call is used to create a file descriptor
+that conceptually refers to a read-only file
+whose contents are an infinite repetition of
+.I size
+bytes of data read from the
+.I content
+argument to the system call,
+and which may be mapped into memory with
+.BR mmap (2).
+The file descriptor is created as if by passing
+.BR O_RDONLY | O_CLOEXEC
+to
+.BR open (2).
+.PP
+In reality, any read-only pages in the mapping are backed
+by a so-called reference page,
+whose contents are specified using the arguments to
+.BR refpage_create ().
+.PP
+The reference page will consist of repetitions of
+.I size
+bytes read
+from
+.IR content ,
+as many as are required to fill the page. The
+.I size
+argument must be a power of two less than or equal to the page size, and the
+.I content
+argument must have at least
+.I size
+alignment. The behavior is as if a copy of this data
+is made while servicing the system call;
+any updates to the data after the system call has returned
+will not be reflected in the reference page.
+.PP
+If the architecture specifies that metadata may be associated
+with memory addresses, that metadata, if present, is copied
+into the reference page along with the data itself,
+but only if the size argument is at least as large
+as the granularity of the metadata.
+For example, with the ARMv8.5 Memory Tagging Extension,
+the memory tags are copied, but only if the size is greater than
+or equal to the architecturally specified tag granule size of 16 bytes.
+.PP
+Writable private mappings trigger specific copy-on-write behavior
+when a page in the mapping is written to.
+The behavior is as if the reference page is copied,
+but the kernel may use a more efficient technique such as
+.BR memset (3)
+to produce the copy if the
+.I size
+argument originally used to create the reference page file descriptor
+is sufficiently small.
+For this reason it is recommended to specify as small a
+.I size
+argument as possible
+in order to activate any such optimizations implemented in the kernel.
+.PP
+The advantage of using this system call
+over creating normal anonymous mappings
+and manually initializing the pages from userspace
+is that it is more efficient.
+If it is not known that all of the pages in the mapping
+will be faulted (for example, if the system call is used
+by a general-purpose memory allocator
+where the behavior of the client program is unknown),
+letting the pages be prepared on fault only if needed
+is more efficient from both a performance
+and memory consumption perspective.
+Even if all of the pages would end up being faulted,
+it would still be more efficient
+to have the kernel initialize the pages with the required contents once
+than to have the kernel zero-initialize them on fault
+and then have userspace initialize them again with different contents.
+.SH EXAMPLES
+The following program creates a 128 KiB memory mapping
+preinitialized with the pattern byte 0xAA
+and verifies that the contents of the mapping are correct.
+.PP
+.EX
+#include <stdio.h>
+#include <sys/mman.h>
+#include <sys/syscall.h>
+#include <unistd.h>
+
+int main(void) {
+    unsigned char pattern = 0xaa;
+    unsigned long mmap_size = 131072;
+
+    int fd = syscall(SYS_refpage_create, &pattern, 1, 0);
+    if (fd < 0) {
+        perror("refpage_create");
+        return 1;
+    }
+    unsigned char *p = mmap(0, mmap_size, PROT_READ | PROT_WRITE,
+                            MAP_PRIVATE, fd, 0);
+    if (p == MAP_FAILED) {
+        perror("mmap");
+        return 1;
+    }
+    for (unsigned i = 0; i != mmap_size; ++i) {
+        if (p[i] != pattern) {
+            fprintf(stderr, "refpage failed contents check @ %u: "
+                            "0x%x != 0x%x\en",
+                    i, p[i], pattern);
+            return 1;
+        }
+    }
+}
+.EE
+.SH NOTES
+Reading from a reference page file descriptor, e.g. with
+.BR read (2),
+is not supported, nor would this be particularly useful.
+.SH VERSIONS
+This system call first appeared in Linux 5.x.
+.SH CONFORMING TO
+The
+.BR refpage_create ()
+system call is Linux-specific.
+.SH SEE ALSO
+.BR mmap (2),
+.BR open (2).