From patchwork Wed Jun 28 21:53:02 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Peter Xu
X-Patchwork-Id: 13296388
From: Peter Xu
To: linux-mm@kvack.org, linux-kernel@vger.kernel.org
Cc: David Hildenbrand, "Kirill A . Shutemov", Andrew Morton,
 Andrea Arcangeli, Mike Rapoport, John Hubbard, Matthew Wilcox,
 Mike Kravetz, Vlastimil Babka, Yang Shi, James Houghton,
 Jason Gunthorpe, Lorenzo Stoakes, Hugh Dickins, peterx@redhat.com
Subject: [PATCH v4 0/8] mm/gup: Unify hugetlb, speed up thp
Date: Wed, 28 Jun 2023 17:53:02 -0400
Message-ID: <20230628215310.73782-1-peterx@redhat.com>
X-Mailer: git-send-email 2.41.0

v1: https://lore.kernel.org/r/20230613215346.1022773-1-peterx@redhat.com
v2: https://lore.kernel.org/r/20230619231044.112894-1-peterx@redhat.com
v3: https://lore.kernel.org/r/20230623142936.268456-1-peterx@redhat.com

v4:
- Patch 2: check pte write for unsharing [David]
- Added more tags, rebased to akpm/mm-unstable

Hugetlb has a special path for slow gup in which follow_page_mask() is
skipped completely, along with faultin_page().  That is not only
confusing, but also duplicates a lot of logic that generic gup already
has, making hugetlb slightly special.

This patchset tries to dedup the logic.  It first touches up the slow gup
code so that it handles hugetlb pages correctly with the current follow
page and faultin routines (we're mostly there already, because about 10
years ago we tried to optimize thp but only got halfway; more below).
Then patch 6 drops the special path, so that hugetlb gup always goes
through the generic routine too, via faultin_page().

Note that hugetlb is still special for gup, mostly because of the pgtable
walking (hugetlb_walk()) that we rely on, which is currently per-arch.
But this is still one small step forward, and the diffstat may also be
evidence that it is worthwhile.

Then for the "speed up thp" side: as a side effect, while looking at this
chunk of code, I found that thp support is actually only partially done.
It doesn't mean that thp won't work for gup, but whenever a **pages
pointer is passed in, the optimization is skipped.  Patch 5 should
address that, so for thp we now get full speed gup.

For a quick number, "chrt -f 1 ./gup_test -m 512 -t -L -n 1024 -r 10"
gives me 13992.50us -> 378.50us.  Gup_test is an extreme case, but it
shows how this affects thp gups.

Patch 1-5:   prepare for the switch
Patch 6:     switch over to the new code and remove the old
Patch 7-8:   add a gup test matrix to run_vmtests.sh

Please review, thanks.

Peter Xu (8):
  mm/hugetlb: Handle FOLL_DUMP well in follow_page_mask()
  mm/hugetlb: Prepare hugetlb_follow_page_mask() for FOLL_PIN
  mm/hugetlb: Add page_mask for hugetlb_follow_page_mask()
  mm/gup: Cleanup next_page handling
  mm/gup: Accelerate thp gup even for "pages != NULL"
  mm/gup: Retire follow_hugetlb_page()
  selftests/mm: Add -a to run_vmtests.sh
  selftests/mm: Add gup test matrix in run_vmtests.sh

 fs/userfaultfd.c                          |   2 +-
 include/linux/hugetlb.h                   |  20 +-
 mm/gup.c                                  |  83 ++++---
 mm/hugetlb.c                              | 265 +++-------------------
 tools/testing/selftests/mm/run_vmtests.sh |  48 +++-
 5 files changed, 126 insertions(+), 292 deletions(-)