From: Peter Xu
To: linux-mm@kvack.org, linux-kernel@vger.kernel.org
Cc: Dave Jiang, Rik van Riel, Dave Hansen, Michael Ellerman,
    linuxppc-dev@lists.ozlabs.org, Matthew Wilcox, Rick P Edgecombe,
    peterx@redhat.com, Oscar Salvador, Mel Gorman, Andrew Morton,
    Borislav Petkov, Christophe Leroy, Huang Ying, "Kirill A . Shutemov",
    "Aneesh Kumar K . V", Dan Williams, Thomas Gleixner, Hugh Dickins,
    x86@kernel.org, Nicholas Piggin, Vlastimil Babka, Ingo Molnar
Subject: [PATCH v3 0/8] mm/mprotect: Fix dax puds
Date: Mon, 15 Jul 2024 15:21:34 -0400
Message-ID: <20240715192142.3241557-1-peterx@redhat.com>
X-Patchwork-Submitter: Peter Xu
X-Patchwork-Id: 13733832

[Based on mm-unstable, commit 31334cf98dbd, July 2nd]

v3:
- Fix a build issue on i386 PAE config
- Moved one line from patch 8 to patch 3

v1: https://lore.kernel.org/r/20240621142504.1940209-1-peterx@redhat.com
v2: https://lore.kernel.org/r/20240703212918.2417843-1-peterx@redhat.com

Dax has supported pud pages for a while, but mprotect on puds has been
missing since the start.  This series tries to fix that by providing pud
handling in mprotect().  The goal is to add more types of pud mappings
like hugetlb or pfnmaps.  This series paves the way for that by fixing
known pud entries.

Considering nobody reported this until I looked at those other types of
pud mappings, I am thinking maybe it doesn't need to be a fix for stable
and this may not need to be backported.  I would guess whoever cares about
mprotect() won't care about 1G dax puds yet, and vice versa.  I hope
fixing that in new kernels would be fine, but I'm open to suggestions.

A few small things are changed to teach mprotect() to work on PUDs.  E.g.
it will need to start with dropping NUMA_HUGE_PTE_UPDATES, which may stop
making sense when there can be more than one type of huge pte.
OTOH, we'll also need to push the mmu notifiers from pmd to pud layers,
which might need some attention but so far I think it's safe.  For such
details, please refer to each patch's commit message.

The mprotect() pud process should be straightforward, as I kept it as
simple as possible.  There's no NUMA handling, as dax simply doesn't
support that.  There's also no userfault involvement, as file memory (even
when working with userfault-wp async mode) will need to split a pud, so a
pud entry doesn't yet need to know about userfault's existence (but
hugetlb entries will; that's also for later).

Tests
=====

What I did test:

- cross-build tests that I normally cover [1]

- smoke tested on x86_64 with the simplest program [2] doing mprotect() on
  a dev_dax 1G PUD, using QEMU's nvdimm emulation [3] and ndctl to create
  namespaces with proper alignments.  This used to throw "bad pud", but
  now it runs through fine.  I checked that SIGBUS happens on illegal
  access to protected puds.

- vmtests.

What I didn't test:

- fsdax: I wanted to also give it a shot, but only then did I notice that
  it doesn't seem to be supported (according to dax_iomap_fault(), which
  will always fall back on PUD_ORDER).  I do remember it was supported
  before, so I could be missing something important there.. please shoot
  if so.

- userfault wp-async: I also wanted to test that userfault-wp async is
  able to split huge puds (here it's simply a clear_pud.. though), but it
  won't work for devdax anyway, because faults smaller than 1G are not
  allowed in this case.  So skipped too.

- Power, as no hardware on hand.
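
For readers who want to see the shape of the "illegal access on a
protected mapping" check, a minimal sketch follows.  It is not the dax.c
program referenced as [2].  It assumes anonymous memory as a stand-in for
a devdax mapping (so no 1G PUD is involved), and the helper name
probe_write_after_mprotect() is made up for illustration.  The handler
accepts either SIGSEGV or SIGBUS and reports whichever one arrives.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <setjmp.h>
#include <signal.h>
#include <stddef.h>
#include <string.h>
#include <sys/mman.h>

/* Jump target used to recover from the expected fault. */
static sigjmp_buf fault_env;
/* Signal number recorded by the handler. */
static volatile sig_atomic_t fault_sig;

static void fault_handler(int sig)
{
	fault_sig = sig;
	siglongjmp(fault_env, 1);
}

/*
 * Hypothetical helper: map memory, write-protect it with mprotect(),
 * then attempt a write.  Returns the signal raised by the write, 0 if
 * the write unexpectedly succeeded, or -1 on setup failure.
 */
int probe_write_after_mprotect(size_t len)
{
	struct sigaction sa, old_segv, old_bus;
	volatile char *p;
	int ret;

	assert(len > 0);

	/* Assumption: anonymous memory stands in for a devdax mapping. */
	p = mmap(NULL, len, PROT_READ | PROT_WRITE,
		 MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
	if (p == MAP_FAILED)
		return -1;

	memset(&sa, 0, sizeof(sa));
	sa.sa_handler = fault_handler;
	sigemptyset(&sa.sa_mask);
	sigaction(SIGSEGV, &sa, &old_segv);
	sigaction(SIGBUS, &sa, &old_bus);

	p[0] = 1;	/* populate the mapping while it is still writable */

	if (mprotect((void *)p, len, PROT_READ) != 0) {
		ret = -1;
	} else if (sigsetjmp(fault_env, 1) == 0) {
		p[0] = 2;	/* expected to fault */
		ret = 0;	/* reached only if the write went through */
	} else {
		ret = fault_sig;	/* came back via siglongjmp() */
	}

	/* Restore the previous handlers and drop the mapping. */
	sigaction(SIGSEGV, &old_segv, NULL);
	sigaction(SIGBUS, &old_bus, NULL);
	munmap((void *)p, len);
	return ret;
}
```

Calling probe_write_after_mprotect(4096) from a small main() and
printing the result is enough to try it out.  A test against real devdax
would instead mmap() the device node with a 1G-aligned length, which this
sketch deliberately leaves out.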

Thanks,

[1] https://gitlab.com/peterx/lkb-harness/-/blob/main/config.json
[2] https://github.com/xzpeter/clibs/blob/master/misc/dax.c
[3] https://github.com/qemu/qemu/blob/master/docs/nvdimm.txt

Peter Xu (8):
  mm/dax: Dump start address in fault handler
  mm/mprotect: Remove NUMA_HUGE_PTE_UPDATES
  mm/mprotect: Push mmu notifier to PUDs
  mm/powerpc: Add missing pud helpers
  mm/x86: Make pud_leaf() only cares about PSE bit
  mm/x86: arch_check_zapped_pud()
  mm/x86: Add missing pud helpers
  mm/mprotect: fix dax pud handlings

 arch/powerpc/include/asm/book3s/64/pgtable.h |  3 +
 arch/powerpc/mm/book3s64/pgtable.c           | 20 ++++++
 arch/x86/include/asm/pgtable.h               | 68 +++++++++++++++---
 arch/x86/mm/pgtable.c                        | 19 +++++
 drivers/dax/device.c                         |  6 +-
 include/linux/huge_mm.h                      | 24 +++++++
 include/linux/pgtable.h                      |  7 ++
 include/linux/vm_event_item.h                |  1 -
 mm/huge_memory.c                             | 56 ++++++++++++++-
 mm/mprotect.c                                | 74 ++++++++++++--------
 mm/vmstat.c                                  |  1 -
 11 files changed, 234 insertions(+), 45 deletions(-)