From patchwork Fri Jun 21 14:24:57 2024
X-Patchwork-Submitter: Peter Xu
X-Patchwork-Id: 13707652
From: Peter Xu
To: linux-kernel@vger.kernel.org, linux-mm@kvack.org
Cc: x86@kernel.org, Borislav Petkov, Dave Jiang, "Kirill A . Shutemov",
 Ingo Molnar, Oscar Salvador, peterx@redhat.com, Matthew Wilcox,
 Vlastimil Babka, Dan Williams, Andrew Morton, Hugh Dickins,
 Michael Ellerman, Dave Hansen, Thomas Gleixner,
 linuxppc-dev@lists.ozlabs.org, Christophe Leroy, Rik van Riel,
 Mel Gorman, "Aneesh Kumar K . V", Nicholas Piggin, Huang Ying
Subject: [PATCH 0/7] mm/mprotect: Fix dax puds
Date: Fri, 21 Jun 2024 10:24:57 -0400
Message-ID: <20240621142504.1940209-1-peterx@redhat.com>

[Based on mm-unstable, commit a53138cdbe3e]

Dax has supported pud pages for a while, but mprotect on puds has been
missing since the start.  This series tries to fix that by providing pud
handling in mprotect().  My real goal is adding more types of pud
mappings like hugetlb or pfnmaps; it's just that we probably want pud to
work already and build the rest on top.

Considering nobody reported this until I looked at those other types of
pud mappings, I am thinking maybe it doesn't need to be a fix for stable,
and this may not need to be backported.  I would guess whoever cares
about mprotect() won't care about 1G dax puds yet, and vice versa.  I
hope fixing that in new kernels would be fine, but I'm open to
suggestions.

There are quite a few small things changed here and there to teach
mprotect to work on PUDs.  E.g. it will start with dropping
NUMA_HUGE_PTE_UPDATES, which may stop making much sense when there can be
more than one type of huge pte (meanwhile it doesn't sound right at all
to account non-numa operations too; more in the commit message of the
relevant patch).  OTOH, we'll also need to push the mmu notifiers from
pmd to pud layers, which might need some attention but so far I think
it's safe.  For these small details, please refer to each patch's commit
message.

The mprotect() pud process is hopefully straightforward enough, as I kept
it as simple as possible.  There's no NUMA handling, as dax simply doesn't
support that.  There's also no userfault involvement, as file memory
(even if working with userfault-wp async mode) will need to split a pud,
so a pud entry doesn't yet need to know about userfault's existence (but
hugetlb entries will; that's also for later).

Tests
=====

What I did test:

- cross-build tests that I normally cover [1], except a known issue
  elsewhere on hugetlb [2]

- smoke tested on x86_64 with the simplest program [3] on dev_dax 1G PUD
  mprotect(), using QEMU's nvdimm emulation [4] and ndctl to create
  namespaces with proper alignments.  This used to throw "bad pud", but
  now it runs through fine.  I also checked that sigbus happens on
  illegal access to protected puds.

What I didn't test:

- fsdax: I wanted to also give it a shot, but only then did I notice it
  doesn't seem to be supported (according to dax_iomap_fault(), which
  will always fall back on PUD_ORDER).  I do remember it was supported
  before, so I could be missing something important there.. please shoot
  if so.

- userfault wp-async: I also wanted to test that userfault-wp async is
  able to split huge puds (here it's simply a clear_pud.. though), but it
  won't work for devdax anyway, as faults smaller than 1G are not allowed
  in this case.  So skipped too.

- Power, as no hardware on hand.

Thanks,

[1] https://gitlab.com/peterx/lkb-harness/-/blob/main/config.json
[2] https://lore.kernel.org/all/202406190956.9j1UCIe5-lkp@intel.com
[3] https://github.com/xzpeter/clibs/blob/master/misc/dax.c
[4] https://github.com/qemu/qemu/blob/master/docs/nvdimm.txt

Peter Xu (7):
  mm/dax: Dump start address in fault handler
  mm/mprotect: Remove NUMA_HUGE_PTE_UPDATES
  mm/mprotect: Push mmu notifier to PUDs
  mm/powerpc: Add missing pud helpers
  mm/x86: Make pud_leaf() only cares about PSE bit
  mm/x86: Add missing pud helpers
  mm/mprotect: fix dax pud handlings

 arch/powerpc/include/asm/book3s/64/pgtable.h |  3 +
 arch/powerpc/mm/book3s64/pgtable.c           | 20 ++++++
 arch/x86/include/asm/pgtable.h               | 39 ++++++++++-
 arch/x86/mm/pgtable.c                        | 11 +++
 drivers/dax/device.c                         |  6 +-
 include/linux/huge_mm.h                      | 24 +++++++
 include/linux/vm_event_item.h                |  1 -
 mm/huge_memory.c                             | 52 ++++++++++++++
 mm/mprotect.c                                | 74 ++++++++++++--------
 mm/vmstat.c                                  |  1 -
 10 files changed, 195 insertions(+), 36 deletions(-)