From patchwork Wed Jan 29 11:54:06 2025
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: David Hildenbrand
X-Patchwork-Id: 13953676
From: David Hildenbrand
To: linux-kernel@vger.kernel.org
Cc: linux-doc@vger.kernel.org, dri-devel@lists.freedesktop.org,
 linux-mm@kvack.org, nouveau@lists.freedesktop.org, David Hildenbrand,
 Andrew Morton, Jérôme Glisse, Jonathan Corbet, Alex Shi, Yanteng Si,
 Karol Herbst, Lyude Paul, Danilo Krummrich, David Airlie, Simona Vetter,
 "Liam R. Howlett", Lorenzo Stoakes, Vlastimil Babka, Jann Horn,
 Pasha Tatashin, Peter Xu, Alistair Popple, Jason Gunthorpe
Subject: [PATCH v1 08/12] mm/rmap: handle device-exclusive entries correctly in try_to_unmap_one()
Date: Wed, 29 Jan 2025 12:54:06 +0100
Message-ID: <20250129115411.2077152-9-david@redhat.com>
X-Mailer: git-send-email 2.48.1
In-Reply-To: <20250129115411.2077152-1-david@redhat.com>
References: <20250129115411.2077152-1-david@redhat.com>

Ever since commit b756a3b5e7ea ("mm: device exclusive memory access") we
can return with a device-exclusive entry from page_vma_mapped_walk().
try_to_unmap_one() is not prepared for that, so teach it about these
non-present nonswap PTEs.
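
As a rough userspace illustration of why the walk result needs two decode
paths (this is not kernel code; every toy_* name and bit layout below is
made up for the example and does not match any architecture), a present
PTE and a non-present, PFN-carrying entry can store the page frame number
in different bit positions, so the PFN has to be extracted according to
the kind of entry that was found:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Toy model only: hypothetical names and layouts, not the kernel's. */
typedef struct { uint64_t val; } toy_pte_t;

#define TOY_PTE_PRESENT          (1ULL << 0)
#define TOY_PTE_PFN_SHIFT        12   /* present PTE: PFN in bits 12+ */
#define TOY_SWP_TYPE_SHIFT       1    /* non-present: type in bits 1..5 */
#define TOY_SWP_TYPE_MASK        0x1fULL
#define TOY_SWP_OFFSET_SHIFT     6    /* non-present: offset (PFN) in bits 6+ */
#define TOY_SWP_DEVICE_EXCLUSIVE 3ULL /* hypothetical entry type id */

/* Build a present PTE that maps the given PFN. */
static toy_pte_t toy_mk_present(uint64_t pfn)
{
	return (toy_pte_t){ (pfn << TOY_PTE_PFN_SHIFT) | TOY_PTE_PRESENT };
}

/* Build a non-present entry that still records which PFN it refers to. */
static toy_pte_t toy_mk_device_exclusive(uint64_t pfn)
{
	return (toy_pte_t){ (pfn << TOY_SWP_OFFSET_SHIFT) |
			    (TOY_SWP_DEVICE_EXCLUSIVE << TOY_SWP_TYPE_SHIFT) };
}

static bool toy_pte_present(toy_pte_t pte)
{
	return (pte.val & TOY_PTE_PRESENT) != 0;
}

static uint64_t toy_swp_type(toy_pte_t pte)
{
	return (pte.val >> TOY_SWP_TYPE_SHIFT) & TOY_SWP_TYPE_MASK;
}

/*
 * Mirror the shape of the branch in the patch: the common case is a
 * present PTE; otherwise decode the PFN from the non-present entry.
 */
static uint64_t toy_pte_to_pfn(toy_pte_t pte)
{
	if (toy_pte_present(pte))
		return pte.val >> TOY_PTE_PFN_SHIFT;

	/* Only PFN-carrying non-present entries are expected here. */
	assert(toy_swp_type(pte) == TOY_SWP_DEVICE_EXCLUSIVE);
	return pte.val >> TOY_SWP_OFFSET_SHIFT;
}
```

In this model, toy_pte_to_pfn() returns the same PFN for
toy_mk_present(pfn) and toy_mk_device_exclusive(pfn) even though the raw
values differ, which is the property the rmap walk relies on when it
computes the subpage from the PFN.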

Before that commit, could we also have triggered this case with
device-private entries? Unlikely.

Note that we could currently only run into this case with
device-exclusive entries on THPs. For order-0 folios, we still adjust the
mapcount on conversion to device-exclusive, making the rmap walk abort
early (folio_mapcount() == 0) and thereby breaking swapout. We'll fix
that next, now that try_to_unmap_one() can handle it.

Further note that try_to_unmap() calls MMU notifiers and holds the folio
lock, so any device-exclusive users should be properly prepared for this
device-exclusive PTE to "vanish".

Fixes: b756a3b5e7ea ("mm: device exclusive memory access")
Signed-off-by: David Hildenbrand
---
 mm/rmap.c | 53 ++++++++++++++++++++++++++++++++++++++++-------------
 1 file changed, 40 insertions(+), 13 deletions(-)

diff --git a/mm/rmap.c b/mm/rmap.c
index 65d9bbea16d0..12900f367a2a 100644
--- a/mm/rmap.c
+++ b/mm/rmap.c
@@ -1648,9 +1648,9 @@ static bool try_to_unmap_one(struct folio *folio, struct vm_area_struct *vma,
 {
 	struct mm_struct *mm = vma->vm_mm;
 	DEFINE_FOLIO_VMA_WALK(pvmw, folio, vma, address, 0);
+	bool anon_exclusive, ret = true;
 	pte_t pteval;
 	struct page *subpage;
-	bool anon_exclusive, ret = true;
 	struct mmu_notifier_range range;
 	enum ttu_flags flags = (enum ttu_flags)(long)arg;
 	unsigned long pfn;
@@ -1722,7 +1722,19 @@ static bool try_to_unmap_one(struct folio *folio, struct vm_area_struct *vma,
 		/* Unexpected PMD-mapped THP? */
 		VM_BUG_ON_FOLIO(!pvmw.pte, folio);
 
-		pfn = pte_pfn(ptep_get(pvmw.pte));
+		/*
+		 * We can end up here with selected non-swap entries that
+		 * actually map pages similar to PROT_NONE; see
+		 * page_vma_mapped_walk()->check_pte().
+		 */
+		pteval = ptep_get(pvmw.pte);
+		if (likely(pte_present(pteval))) {
+			pfn = pte_pfn(pteval);
+		} else {
+			pfn = swp_offset_pfn(pte_to_swp_entry(pteval));
+			VM_WARN_ON_FOLIO(folio_test_hugetlb(folio), folio);
+		}
+
 		subpage = folio_page(folio, pfn - folio_pfn(folio));
 		address = pvmw.address;
 		anon_exclusive = folio_test_anon(folio) &&
@@ -1778,7 +1790,9 @@ static bool try_to_unmap_one(struct folio *folio, struct vm_area_struct *vma,
 				hugetlb_vma_unlock_write(vma);
 			}
 			pteval = huge_ptep_clear_flush(vma, address, pvmw.pte);
-		} else {
+			if (pte_dirty(pteval))
+				folio_mark_dirty(folio);
+		} else if (likely(pte_present(pteval))) {
 			flush_cache_page(vma, address, pfn);
 			/* Nuke the page table entry. */
 			if (should_defer_flush(mm, flags)) {
@@ -1796,6 +1810,10 @@ static bool try_to_unmap_one(struct folio *folio, struct vm_area_struct *vma,
 			} else {
 				pteval = ptep_clear_flush(vma, address, pvmw.pte);
 			}
+			if (pte_dirty(pteval))
+				folio_mark_dirty(folio);
+		} else {
+			pte_clear(mm, address, pvmw.pte);
 		}
 
 		/*
@@ -1805,10 +1823,6 @@ static bool try_to_unmap_one(struct folio *folio, struct vm_area_struct *vma,
 		 */
 		pte_install_uffd_wp_if_needed(vma, address, pvmw.pte, pteval);
 
-		/* Set the dirty flag on the folio now the pte is gone. */
-		if (pte_dirty(pteval))
-			folio_mark_dirty(folio);
-
 		/* Update high watermark before we lower rss */
 		update_hiwater_rss(mm);
 
@@ -1822,8 +1836,8 @@ static bool try_to_unmap_one(struct folio *folio, struct vm_area_struct *vma,
 				dec_mm_counter(mm, mm_counter(folio));
 				set_pte_at(mm, address, pvmw.pte, pteval);
 			}
-
-		} else if (pte_unused(pteval) && !userfaultfd_armed(vma)) {
+		} else if (likely(pte_present(pteval)) && pte_unused(pteval) &&
+			   !userfaultfd_armed(vma)) {
 			/*
 			 * The guest indicated that the page content is of no
 			 * interest anymore. Simply discard the pte, vmscan
@@ -1902,6 +1916,12 @@ static bool try_to_unmap_one(struct folio *folio, struct vm_area_struct *vma,
 				set_pte_at(mm, address, pvmw.pte, pteval);
 				goto walk_abort;
 			}
+
+			/*
+			 * arch_unmap_one() is expected to be a NOP on
+			 * architectures where we could have non-swp entries
+			 * here, so we'll not check/care.
+			 */
 			if (arch_unmap_one(mm, vma, address, pteval) < 0) {
 				swap_free(entry);
 				set_pte_at(mm, address, pvmw.pte, pteval);
@@ -1926,10 +1946,17 @@ static bool try_to_unmap_one(struct folio *folio, struct vm_area_struct *vma,
 			swp_pte = swp_entry_to_pte(entry);
 			if (anon_exclusive)
 				swp_pte = pte_swp_mkexclusive(swp_pte);
-			if (pte_soft_dirty(pteval))
-				swp_pte = pte_swp_mksoft_dirty(swp_pte);
-			if (pte_uffd_wp(pteval))
-				swp_pte = pte_swp_mkuffd_wp(swp_pte);
+			if (likely(pte_present(pteval))) {
+				if (pte_soft_dirty(pteval))
+					swp_pte = pte_swp_mksoft_dirty(swp_pte);
+				if (pte_uffd_wp(pteval))
+					swp_pte = pte_swp_mkuffd_wp(swp_pte);
+			} else {
+				if (pte_swp_soft_dirty(pteval))
+					swp_pte = pte_swp_mksoft_dirty(swp_pte);
+				if (pte_swp_uffd_wp(pteval))
+					swp_pte = pte_swp_mkuffd_wp(swp_pte);
+			}
 			set_pte_at(mm, address, pvmw.pte, swp_pte);
 		} else {
 			/*