From patchwork Mon May 14 08:13:39 2018
X-Patchwork-Submitter: Mike Rapoport
X-Patchwork-Id: 10397473
From: Mike Rapoport
To: Jonathan Corbet
Cc: linux-doc, linux-mm, lkml, Mike Rapoport
Subject: [PATCH 2/3] docs/vm: transhuge: minor updates
Date: Mon, 14 May 2018 11:13:39 +0300
X-Mailer: git-send-email 2.7.4
In-Reply-To: <1526285620-453-1-git-send-email-rppt@linux.vnet.ibm.com>
References: <1526285620-453-1-git-send-email-rppt@linux.vnet.ibm.com>
Message-Id: <1526285620-453-3-git-send-email-rppt@linux.vnet.ibm.com>

Some formatting changes and addition of a sentence introducing khugepaged

Signed-off-by: Mike Rapoport
---
 Documentation/vm/transhuge.rst | 47 ++++++++++++++++++++++++++++++++----------
 1 file changed, 36 insertions(+), 11 deletions(-)

diff --git a/Documentation/vm/transhuge.rst b/Documentation/vm/transhuge.rst
index 56d04cbb..47c7e47 100644
--- a/Documentation/vm/transhuge.rst
+++ b/Documentation/vm/transhuge.rst
@@ -9,14 +9,19 @@ Objective
 
 Performance critical computing applications dealing with large memory
 working sets are already running on top of libhugetlbfs and in turn
-hugetlbfs. Transparent Hugepage Support is an alternative means of
+hugetlbfs. Transparent HugePage Support (THP) is an alternative mean of
 using huge pages for the backing of virtual memory with huge pages
 that supports the automatic promotion and demotion of page sizes and
 without the shortcomings of hugetlbfs.
 
-Currently it only works for anonymous memory mappings and tmpfs/shmem.
+Currently THP only works for anonymous memory mappings and tmpfs/shmem.
 But in the future it can expand to other filesystems.
 
+.. note::
+   in the examples below we presume that the basic page size is 4K and
+   the huge page size is 2M, although the actual numbers may vary
+   depending on the CPU architecture.
+
 The reason applications are running faster is because of two
 factors. The first factor is almost completely irrelevant and it's not
 of significant interest because it'll also have the downside of
@@ -28,15 +33,27 @@ only matters the first time the memory is accessed for the lifetime of
 a memory mapping. The second long lasting and much more important
 factor will affect all subsequent accesses to the memory for the whole
 runtime of the application. The second factor consist of two
-components: 1) the TLB miss will run faster (especially with
-virtualization using nested pagetables but almost always also on bare
-metal without virtualization) and 2) a single TLB entry will be
-mapping a much larger amount of virtual memory in turn reducing the
-number of TLB misses. With virtualization and nested pagetables the
-TLB can be mapped of larger size only if both KVM and the Linux guest
-are using hugepages but a significant speedup already happens if only
-one of the two is using hugepages just because of the fact the TLB
-miss is going to run faster.
+components:
+
+1) the TLB miss will run faster (especially with virtualization using
+   nested pagetables but almost always also on bare metal without
+   virtualization)
+
+2) a single TLB entry will be mapping a much larger amount of virtual
+   memory in turn reducing the number of TLB misses. With
+   virtualization and nested pagetables the TLB can be mapped of
+   larger size only if both KVM and the Linux guest are using
+   hugepages but a significant speedup already happens if only one of
+   the two is using hugepages just because of the fact the TLB miss is
+   going to run faster.
+
+THP can be enabled system wide or restricted to certain tasks or even
+memory ranges inside task's address space. Unless THP is completely
+disabled, there is ``khugepaged`` daemon that scans memory and
+collapses sequences of basic pages into huge pages.
+
+The THP behaviour is controlled via :ref:`sysfs <thp_sysfs>`
+interface and using madvise(2) and prctl(2) system calls.
 
 Transparent Hugepage Support maximizes the usefulness of free memory
 if compared to the reservation approach of hugetlbfs by allowing all
@@ -69,9 +86,14 @@ Applications that gets a lot of benefit from hugepages and that don't
 risk to lose memory by using hugepages, should use
 madvise(MADV_HUGEPAGE) on their critical mmapped regions.
 
+.. _thp_sysfs:
+
 sysfs
 =====
 
+Global THP controls
+-------------------
+
 Transparent Hugepage Support for anonymous memory can be entirely disabled
 (mostly for debugging purposes) or only enabled inside MADV_HUGEPAGE
 regions (to avoid the risk of consuming more memory resources) or enabled
@@ -142,6 +164,9 @@ khugepaged will be automatically started when
 transparent_hugepage/enabled is set to "always" or "madvise, and it'll
 be automatically shutdown if it's set to "never".
 
+Khugepaged controls
+-------------------
+
 khugepaged runs usually at low frequency so while one may not want to
 invoke defrag algorithms synchronously during the page faults, it
 should be worth invoking defrag at least in khugepaged. However it's