From: Peter Xu
To: linux-kernel@vger.kernel.org, linux-mm@kvack.org
Cc: "Kirill A . Shutemov", Nicholas Piggin, David Hildenbrand,
    Matthew Wilcox, Andrew Morton, James Houghton, Huang Ying,
    "Aneesh Kumar K . V", peterx@redhat.com, Vlastimil Babka,
    Rick P Edgecombe, Hugh Dickins, Borislav Petkov, Christophe Leroy,
    Michael Ellerman, Rik van Riel, Dan Williams, Mel Gorman,
    x86@kernel.org, Ingo Molnar, linuxppc-dev@lists.ozlabs.org,
    Dave Hansen, Dave Jiang, Oscar Salvador, Thomas Gleixner
Subject: [PATCH v5 0/7] mm/mprotect: Fix dax puds
Date: Mon, 12 Aug 2024 14:12:18 -0400
Message-ID: <20240812181225.1360970-1-peterx@redhat.com>
X-Patchwork-Submitter: Peter Xu
X-Patchwork-Id: 13760907

[Based on mm-unstable, commit 98808d08fc0f, Aug 7th.  NOTE: it is
 intentional to not have rebased to latest mm-unstable, as this is to
 replace the queued v4]

v5 Changelog:
- Rename patch subject "mm/x86: arch_check_zapped_pud()", add "Implement" [tglx]
- Mostly rewrote commit messages for the x86 patches, follow -tip rules [tglx]
- Line wrap fixes (to mostly avoid newlines when unnecessary) [tglx]
- English fixes [tglx]
- Fix a build issue that only happens with i386 pae + clang
  https://lore.kernel.org/r/202408111850.Y7rbVXOo-lkp@intel.com

v1: https://lore.kernel.org/r/20240621142504.1940209-1-peterx@redhat.com
v2: https://lore.kernel.org/r/20240703212918.2417843-1-peterx@redhat.com
v3: https://lore.kernel.org/r/20240715192142.3241557-1-peterx@redhat.com
v4: https://lore.kernel.org/r/20240807194812.819412-1-peterx@redhat.com

Dax has supported pud pages for a while, but mprotect on puds has been
missing since the start.  This series tries to fix that by providing pud
handling in mprotect().  The longer-term goal is to add more types of pud
mappings like hugetlb or pfnmaps; this series paves the way for that by
fixing the known pud entries.

Considering nobody reported this until I looked at those other types of
pud mappings, I am thinking it doesn't need to be a fix for stable, so it
may not need to be backported.  I would guess whoever cares about
mprotect() doesn't care about 1G dax puds yet, and vice versa.  I hope
fixing this in new kernels only is fine, but I'm open to suggestions.

A few small things are changed to teach mprotect to work on PUDs.  E.g. it
needs to start by dropping NUMA_HUGE_PTE_UPDATES, which may stop making
sense when there can be more than one type of huge pte.  OTOH, we'll also
need to push the mmu notifiers from the pmd to the pud layer, which might
need some attention, but so far I think it's safe.  For such details,
please refer to each patch's commit message.

The mprotect() pud handling should be straightforward, as I kept it as
simple as possible.  There's no NUMA handling, as dax simply doesn't
support that.  There's also no userfault involvement, as file memory (even
when it works with userfault-wp async mode) will need to split a pud, so
the pud entry doesn't yet need to know about userfault's existence (but
hugetlb entries will; that's also for later).

Tests
=====

What I did test:

- cross-build tests that I normally cover [1]

- smoke tested on x86_64 with the simplest program [2] doing mprotect() on
  a dev_dax 1G PUD, using QEMU's nvdimm emulation [3] and ndctl to create
  namespaces with proper alignment.  This used to throw "bad pud", but now
  it runs through fine.  I checked that SIGBUS happens on illegal access
  to protected puds.

- vmtests.

What I didn't test:

- fsdax: I wanted to give it a shot too, but only then did I notice it
  doesn't seem to be supported (according to dax_iomap_fault(), which
  always falls back on PUD_ORDER).  I remember it was supported before, so
  I could be missing something important there.. please shoot if so.

- userfault wp-async: I also wanted to test that userfault-wp async is
  able to split huge puds (here it's simply a clear_pud.. though), but it
  won't work for devdax anyway, since faults smaller than 1G are not
  allowed in this case.  So skip too.

- Power, as no hardware on hand.

Thanks,

[1] https://gitlab.com/peterx/lkb-harness/-/blob/main/config.json
[2] https://github.com/xzpeter/clibs/blob/master/misc/dax.c
[3] https://github.com/qemu/qemu/blob/master/docs/nvdimm.txt

Peter Xu (7):
  mm/dax: Dump start address in fault handler
  mm/mprotect: Push mmu notifier to PUDs
  mm/powerpc: Add missing pud helpers
  mm/x86: Make pud_leaf() only care about PSE bit
  mm/x86: Implement arch_check_zapped_pud()
  mm/x86: Add missing pud helpers
  mm/mprotect: fix dax pud handlings

 arch/powerpc/include/asm/book3s/64/pgtable.h |  3 +
 arch/powerpc/mm/book3s64/pgtable.c           | 20 ++++++
 arch/x86/include/asm/pgtable.h               | 70 ++++++++++++++++---
 arch/x86/mm/pgtable.c                        | 18 +++++
 drivers/dax/device.c                         |  6 +-
 include/linux/huge_mm.h                      | 24 +++++++
 include/linux/pgtable.h                      |  6 ++
 mm/huge_memory.c                             | 56 ++++++++++++++-
 mm/mprotect.c                                | 71 +++++++++++++-------
 9 files changed, 236 insertions(+), 38 deletions(-)
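
For anyone who wants to reproduce the smoke test described under Tests, here
is a rough sketch of the userspace side.  It is not the dax.c program from
[2]; probe_protected_write() and write_faults() are names made up for this
illustration.  The helper maps a range, faults it in, makes it read-only with
mprotect(), then writes to it and reports which signal arrived.  Passing a
devdax node (for example /dev/dax0.0, a hypothetical path) with a 1G length
and a 1G-aligned namespace should exercise the pud path; passing NULL falls
back to anonymous memory, which only checks the harness itself.

```c
#define _GNU_SOURCE
#include <fcntl.h>
#include <setjmp.h>
#include <signal.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

/* Jump target and signal number used to recover from the expected fault. */
static sigjmp_buf fault_env;
static volatile sig_atomic_t caught_sig;

static void on_fault(int sig)
{
	caught_sig = sig;
	siglongjmp(fault_env, 1);
}

/* Write one byte; return 0 if it succeeded, else the signal that hit. */
static int write_faults(volatile char *p)
{
	caught_sig = 0;
	if (sigsetjmp(fault_env, 1) == 0) {
		p[0] = 2;	/* expected to fault on a read-only range */
		return 0;
	}
	return caught_sig;
}

/*
 * Hypothetical helper: map `len` bytes of `path` (or anonymous memory
 * when path is NULL), mprotect() the range to PROT_READ, then attempt
 * a write.  Returns the caught signal number, 0 if the write went
 * through, or -1 on setup failure.
 */
static int probe_protected_write(const char *path, size_t len)
{
	struct sigaction sa, old_segv, old_bus;
	int flags = MAP_PRIVATE | MAP_ANONYMOUS;
	int fd = -1, ret = -1;
	volatile char *p;
	void *map;

	if (path) {
		fd = open(path, O_RDWR);
		if (fd < 0)
			return -1;
		flags = MAP_SHARED;	/* devdax mappings must be shared */
	}
	map = mmap(NULL, len, PROT_READ | PROT_WRITE, flags, fd, 0);
	if (fd >= 0)
		close(fd);
	if (map == MAP_FAILED)
		return -1;

	p = map;
	p[0] = 1;	/* fault the mapping in before changing protection */

	memset(&sa, 0, sizeof(sa));
	sa.sa_handler = on_fault;
	sigemptyset(&sa.sa_mask);
	sigaction(SIGSEGV, &sa, &old_segv);
	sigaction(SIGBUS, &sa, &old_bus);

	if (mprotect(map, len, PROT_READ) == 0)
		ret = write_faults(p);

	sigaction(SIGSEGV, &old_segv, NULL);
	sigaction(SIGBUS, &old_bus, NULL);
	munmap(map, len);
	return ret;
}
```

On anonymous memory the helper is expected to return SIGSEGV.  On a devdax
1G mapping the cover letter reports SIGBUS for illegal accesses, so a nonzero
return with no "bad pud" message in dmesg would match what is described above;
a return of 0 would mean the protection change did not take effect.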