From patchwork Mon Jun 19 23:10:36 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Peter Xu
X-Patchwork-Id: 13285007
From: Peter Xu
To: linux-mm@kvack.org, linux-kernel@vger.kernel.org
Cc: Andrea Arcangeli, Mike Rapoport, David Hildenbrand, Matthew Wilcox,
    Vlastimil Babka, John Hubbard, "Kirill A . Shutemov", James Houghton,
    Andrew Morton, Lorenzo Stoakes, Hugh Dickins, Mike Kravetz,
    peterx@redhat.com, Jason Gunthorpe
Subject: [PATCH v2 0/8] mm/gup: Unify hugetlb, speed up thp
Date: Mon, 19 Jun 2023 19:10:36 -0400
Message-Id: <20230619231044.112894-1-peterx@redhat.com>
X-Mailer: git-send-email 2.40.1

v2:
- Added R-bs
- Merged patch 2 & 4 into a single patch [David, Mike]
- Remove redundant compound_head() [Matthew]
- Still handle failure cases of try_get_folio/page() [Lorenzo]
- Modified more comments in the last removal patch [Lorenzo]
- Fixed hugetlb npages>1 longterm pin issue
- Added two testcase patches at the end

Hugetlb has a special path for slow gup in which follow_page_mask() is
skipped completely, along with faultin_page().  That is not only
confusing, but also duplicates a lot of logic that generic gup already
has, making hugetlb slightly special.

This patchset tries to dedup the logic.  It first touches up the slow gup
code so that it can handle hugetlb pages correctly with the current
follow page and faultin routines (we're mostly there already: about 10
years ago we tried to optimize thp, but only got half way; more below).
The last patch then drops the special path, so hugetlb gup always goes
through the generic routine too, via faultin_page().

Note that hugetlb is still special for gup, mostly due to the pgtable
walking (hugetlb_walk()) that we rely on, which is currently per-arch.
But this is still one small step forward, and the diffstat might be proof
that it is worthwhile.

Then for the "speed up thp" side: as a side effect, while looking at this
chunk of code, I found that thp support is only partially done.  That
doesn't mean thp won't work for gup, but whenever a **pages pointer is
passed in, the optimization is skipped.  Patch 5 should address that, so
for thp we now get full speed gup.

For a quick number, "chrt -f 1 ./gup_test -m 512 -t -L -n 1024 -r 10"
gives me 13992.50us -> 378.50us (roughly 37x faster).  gup_test is an
extreme case, but it shows how much this affects thp gups.

Patch 1-5:   prepares for the switch
Patch 6:     switchover to the new code and remove the old
Patch 7-8:   added some gup test matrix into run_vmtests.sh

Please review, thanks.
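
To illustrate the "speed up thp" point above: the gain comes from
letting one lookup cover every remaining base page of a huge page,
instead of doing one lookup per base page.  Below is a minimal userspace
sketch of that stepping arithmetic only.  It is not the code from this
series; count_lookups() is a made-up helper, and the 4KB base page and
512-pages-per-huge-page constants are assumptions matching a typical
x86-64 configuration.

```c
#include <stdio.h>

/* Assumed sizes: 4KB base pages, 2MB huge pages (512 base pages each). */
#define PAGE_SHIFT   12UL
#define PAGE_SIZE    (1UL << PAGE_SHIFT)
#define HPAGE_PMD_NR 512UL

/*
 * Hypothetical helper: count how many lookups a gup-style loop needs to
 * cover nr_pages base pages starting at 'start'.
 *
 * page_mask is 0 when each lookup resolves a single base page, or
 * (HPAGE_PMD_NR - 1) when one lookup resolves a whole huge page.
 */
static unsigned long count_lookups(unsigned long start,
                                   unsigned long nr_pages,
                                   unsigned long page_mask)
{
    unsigned long lookups = 0;

    while (nr_pages) {
        /* Base pages left in the current (huge) page, this one included. */
        unsigned long page_increm =
            1 + (~(start >> PAGE_SHIFT) & page_mask);

        /* Never step past the end of the requested range. */
        if (page_increm > nr_pages)
            page_increm = nr_pages;

        lookups++;
        start += page_increm * PAGE_SIZE;
        nr_pages -= page_increm;
    }

    return lookups;
}
```

With a huge-page-aligned start and 4 * HPAGE_PMD_NR (2048) base pages,
count_lookups() returns 2048 for page_mask == 0 and 4 for
page_mask == HPAGE_PMD_NR - 1.  An unaligned start only shortens the
first step: the loop takes whatever is left of the first huge page and
then continues in full huge-page steps.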

Peter Xu (8):
  mm/hugetlb: Handle FOLL_DUMP well in follow_page_mask()
  mm/hugetlb: Prepare hugetlb_follow_page_mask() for FOLL_PIN
  mm/hugetlb: Add page_mask for hugetlb_follow_page_mask()
  mm/gup: Cleanup next_page handling
  mm/gup: Accelerate thp gup even for "pages != NULL"
  mm/gup: Retire follow_hugetlb_page()
  selftests/mm: Add -a to run_vmtests.sh
  selftests/mm: Add gup test matrix in run_vmtests.sh

 fs/userfaultfd.c                          |   2 +-
 include/linux/hugetlb.h                   |  20 +-
 mm/gup.c                                  |  83 ++++---
 mm/hugetlb.c                              | 256 +++-------------------
 tools/testing/selftests/mm/run_vmtests.sh |  48 +++-
 5 files changed, 119 insertions(+), 290 deletions(-)