From patchwork Mon Nov 14 00:04:46 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Peter Xu X-Patchwork-Id: 13041782 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by smtp.lore.kernel.org (Postfix) with ESMTP id 1FBBCC4332F for ; Mon, 14 Nov 2022 00:04:56 +0000 (UTC) Received: by kanga.kvack.org (Postfix) id 617396B0073; Sun, 13 Nov 2022 19:04:55 -0500 (EST) Received: by kanga.kvack.org (Postfix, from userid 40) id 5A0096B0074; Sun, 13 Nov 2022 19:04:55 -0500 (EST) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 3F30C8E0001; Sun, 13 Nov 2022 19:04:55 -0500 (EST) X-Delivered-To: linux-mm@kvack.org Received: from relay.hostedemail.com (smtprelay0014.hostedemail.com [216.40.44.14]) by kanga.kvack.org (Postfix) with ESMTP id 2D5A26B0073 for ; Sun, 13 Nov 2022 19:04:55 -0500 (EST) Received: from smtpin06.hostedemail.com (a10.router.float.18 [10.200.18.1]) by unirelay07.hostedemail.com (Postfix) with ESMTP id DB7F316041B for ; Mon, 14 Nov 2022 00:04:54 +0000 (UTC) X-FDA: 80130102108.06.F0E5780 Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [170.10.133.124]) by imf08.hostedemail.com (Postfix) with ESMTP id 76564160008 for ; Mon, 14 Nov 2022 00:04:53 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1668384292; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=E/uqXcWNCa1p19zapYQ5e7dpPK5NM6V2qbPw5ZdKILA=; b=L9K4Vv2aP3Z1AC7W02k9GWIFEPGfOe/8gTTVMQNRpMkcSe2x3q5sbS+hCpaQOgT7DV6o/p 
7isCpgEmxcZ1amPX0nr+v8nV/3UJy87EdfUp6qhKUztnDvxKUKuX196mefICuVoQ9KFgE6 TRK2W9zzSFAg3yT4/w3vlrbknTPlRoA= Received: from mail-qk1-f197.google.com (mail-qk1-f197.google.com [209.85.222.197]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.3, cipher=TLS_AES_128_GCM_SHA256) id us-mta-398-bpSH6CuGOD29ztj4xOwEBA-1; Sun, 13 Nov 2022 19:04:51 -0500 X-MC-Unique: bpSH6CuGOD29ztj4xOwEBA-1 Received: by mail-qk1-f197.google.com with SMTP id bl21-20020a05620a1a9500b006fa35db066aso9830195qkb.19 for ; Sun, 13 Nov 2022 16:04:51 -0800 (PST) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=E/uqXcWNCa1p19zapYQ5e7dpPK5NM6V2qbPw5ZdKILA=; b=xvL0bp4it8CLuD8nGmdas0kkXy3purKtzQRuE7nIrU2oCXA5NZGnrU7KtOiSFcLE0z FyxZlfWJS0a/6+a1xL6pGczSBS3LHNBZXHa33M/4px0/J+4v2HATOOwgxBg2MWj0oOj9 BwJ1YJBx7LZgQsLMQrndc3s1ECv1QASRw1ylBiTHQg6Zv1k/VMfzAZzTkKhnqF2W+e45 wWO7L7tto1cooOLWPGK1es8dqXSdsWYp3ds5J0YJkunYODZPw8eUxiKyEtTBF1HDYm7p qxH4ZII0I938SKnNEGj2L+PEBMLzd7dsrN56nKjgFPpCHhgc9jIN4Fy78l1Jv/GGDiTD umTg== X-Gm-Message-State: ANoB5pmJFo+lH1YLkEE2TBheb4bsmTVftLqkS/9SJG5rh+BNggoT9uux T8UdQ+ohHhYAUQUGuSdFPtrZZ0clMSY2bX2WXQEqhvlc9ucB4RNxxDV6aNP8kVAcZBkytVlPW5G a2Xqge/iv2NM= X-Received: by 2002:a05:622a:1c0f:b0:3a5:47c8:3889 with SMTP id bq15-20020a05622a1c0f00b003a547c83889mr10331812qtb.66.1668384291032; Sun, 13 Nov 2022 16:04:51 -0800 (PST) X-Google-Smtp-Source: AA0mqf5ZtobJluPZe3x25WvxetiEo/+Q4VROUb9doo/lcY8V74Jq3aR59GmxR6sHKMymg4q34UwP4A== X-Received: by 2002:a05:622a:1c0f:b0:3a5:47c8:3889 with SMTP id bq15-20020a05622a1c0f00b003a547c83889mr10331790qtb.66.1668384290766; Sun, 13 Nov 2022 16:04:50 -0800 (PST) Received: from x1n.redhat.com (bras-base-aurron9127w-grc-46-70-31-27-79.dsl.bell.ca. 
[70.31.27.79]) by smtp.gmail.com with ESMTPSA id cb5-20020a05622a1f8500b0039cc0fbdb61sm4870380qtb.53.2022.11.13.16.04.49 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Sun, 13 Nov 2022 16:04:50 -0800 (PST) From: Peter Xu To: linux-kernel@vger.kernel.org, linux-mm@kvack.org Cc: Mike Rapoport , Nadav Amit , Andrew Morton , peterx@redhat.com, Andrea Arcangeli , Ives van Hoorne , Axel Rasmussen , Alistair Popple , stable@vger.kernel.org Subject: [PATCH v3 1/2] mm/migrate: Fix read-only page got writable when recover pte Date: Sun, 13 Nov 2022 19:04:46 -0500 Message-Id: <20221114000447.1681003-2-peterx@redhat.com> X-Mailer: git-send-email 2.37.3 In-Reply-To: <20221114000447.1681003-1-peterx@redhat.com> References: <20221114000447.1681003-1-peterx@redhat.com> MIME-Version: 1.0 X-Mimecast-Spam-Score: 0 X-Mimecast-Originator: redhat.com Content-type: text/plain ARC-Seal: i=1; s=arc-20220608; d=hostedemail.com; t=1668384293; a=rsa-sha256; cv=none; b=e6xv2t9Z6vJgS6C4huQ70vdBJe2QZsgVG+YlY95XZj0IyymIuw5YtDXRHEWmTyQfWw74/3 1kZWPjukOTcHEmG7RozWQZzXrfjfPECnWkVQBzNO3meg7gIoffeEWoLNQPOaUob4i5Rz5r 6TaQCfGVambqpBmtpqtpN/RsxSUfp90= ARC-Authentication-Results: i=1; imf08.hostedemail.com; dkim=pass header.d=redhat.com header.s=mimecast20190719 header.b=L9K4Vv2a; spf=pass (imf08.hostedemail.com: domain of peterx@redhat.com designates 170.10.133.124 as permitted sender) smtp.mailfrom=peterx@redhat.com; dmarc=pass (policy=none) header.from=redhat.com ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=hostedemail.com; s=arc-20220608; t=1668384293; h=from:from:sender:reply-to:subject:subject:date:date: message-id:message-id:to:to:cc:cc:mime-version:mime-version: content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references:dkim-signature; bh=E/uqXcWNCa1p19zapYQ5e7dpPK5NM6V2qbPw5ZdKILA=; b=ty3+yX6x4V7gyV+u8Td8FTg8cg+YKBaCklLE2drbnWrYUwh2nM/Q9ZcSZZL0euICNC+WUC 
3OBtymzYZjRtzKy8COEayGEbHAGBWGF6lDuuJP+IwNSnMCy2L0rlvRlKQfEgq0KHQdCrz4 K+JMhmzY0k91W/qJhsw4uJEDAJt/gv8=
X-Rspam-User:
X-Stat-Signature: sqe7xrw7yiiwm4uog513tphe9by5pr3n
X-Rspamd-Server: rspam03
X-Rspamd-Queue-Id: 76564160008
Authentication-Results: imf08.hostedemail.com; dkim=pass header.d=redhat.com header.s=mimecast20190719 header.b=L9K4Vv2a; spf=pass (imf08.hostedemail.com: domain of peterx@redhat.com designates 170.10.133.124 as permitted sender) smtp.mailfrom=peterx@redhat.com; dmarc=pass (policy=none) header.from=redhat.com
X-HE-Tag: 1668384293-45915
X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4
Sender: owner-linux-mm@kvack.org
Precedence: bulk
X-Loop: owner-majordomo@kvack.org
List-ID:

Ives van Hoorne from codesandbox.io reported an issue regarding possible
data loss of uffd-wp when applied to memfds on heavily loaded systems.
The symptom is that some pages read from the snapshot child VMs contain
mismatched data.

I can also reproduce this with a Rust reproducer provided by Ives that
keeps taking snapshots of a 256MB VM: on a 32G system, when I start 80
instances, I can trigger the issue within ten minutes.

It turns out that some pages got written through even though uffd-wp was
applied to the pte.

The problem is that, when removing migration entries, we didn't really
worry about the write bit as long as we knew it was not a write
migration entry. That may not be true: for some memory types (e.g.
writable shmem), mk_pte can return a pte with the write bit set. To
recover the migration entry to its original state we therefore need to
explicitly wr-protect the pte, or it will have the write bit set even
though it is a read migration entry.

For uffd it can cause write-through.

The relevant code on uffd was introduced in the anon support, which is
commit f45ec5ff16a7 ("userfaultfd: wp: support swap and page migration",
2020-04-07).
However, anon shouldn't suffer from this problem because anon should
always have the write bit cleared already, so that commit may not be a
proper Fixes target. Instead, I'm pointing the Fixes tag at the uffd
shmem support.

Cc: Andrea Arcangeli
Cc: stable@vger.kernel.org
Fixes: b1f9e876862d ("mm/uffd: enable write protection for shmem & hugetlbfs")
Reported-by: Ives van Hoorne
Reviewed-by: Alistair Popple
Tested-by: Ives van Hoorne
Signed-off-by: Peter Xu
---
 mm/migrate.c | 8 +++++++-
 1 file changed, 7 insertions(+), 1 deletion(-)

diff --git a/mm/migrate.c b/mm/migrate.c
index dff333593a8a..8b6351c08c78 100644
--- a/mm/migrate.c
+++ b/mm/migrate.c
@@ -213,8 +213,14 @@ static bool remove_migration_pte(struct folio *folio,
 			pte = pte_mkdirty(pte);
 		if (is_writable_migration_entry(entry))
 			pte = maybe_mkwrite(pte, vma);
-		else if (pte_swp_uffd_wp(*pvmw.pte))
+		else
+			/* NOTE: mk_pte can have write bit set */
+			pte = pte_wrprotect(pte);
+
+		if (pte_swp_uffd_wp(*pvmw.pte)) {
+			WARN_ON_ONCE(pte_write(pte));
 			pte = pte_mkuffd_wp(pte);
+		}
 
 		if (folio_test_anon(folio) && !is_readable_migration_entry(entry))
 			rmap_flags |= RMAP_EXCLUSIVE;
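
For readers who want to see the before/after logic in isolation, here is a
minimal userspace sketch of the pte recovery step. It is only a model, not
kernel code: the bit values, `model_pte_t`, `model_mk_pte()`,
`restore_old()` and `restore_new()` are hypothetical names made up for
illustration, and the sketch assumes the VMA is writable so that
maybe_mkwrite() always sets the write bit.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical bit layout for this model; NOT the kernel's pte format. */
#define PTE_WRITE   (1u << 0)
#define PTE_UFFD_WP (1u << 1)

typedef unsigned int model_pte_t;

enum model_entry { ENTRY_READ, ENTRY_WRITE };

/*
 * Models mk_pte(): for a writable shared mapping (e.g. writable shmem)
 * the page protection may already carry the write bit.
 */
static model_pte_t model_mk_pte(bool shared_writable)
{
	return shared_writable ? PTE_WRITE : 0;
}

/* Logic before the patch: a read entry never clears the write bit. */
static model_pte_t restore_old(model_pte_t pte, enum model_entry entry,
			       bool swp_uffd_wp)
{
	if (entry == ENTRY_WRITE)
		pte |= PTE_WRITE;	/* assumes a writable VMA */
	else if (swp_uffd_wp)
		pte |= PTE_UFFD_WP;
	return pte;
}

/* Logic after the patch: a read entry is explicitly write-protected. */
static model_pte_t restore_new(model_pte_t pte, enum model_entry entry,
			       bool swp_uffd_wp)
{
	if (entry == ENTRY_WRITE)
		pte |= PTE_WRITE;	/* assumes a writable VMA */
	else
		pte &= ~PTE_WRITE;

	if (swp_uffd_wp) {
		/* Stands in for WARN_ON_ONCE(pte_write(pte)). */
		assert(!(pte & PTE_WRITE));
		pte |= PTE_UFFD_WP;
	}
	return pte;
}
```

With `model_mk_pte(true)` and a read entry that carries the uffd-wp marker,
`restore_old()` returns a pte that has both the write bit and the uffd-wp
bit set, which is the write-through case described above, while
`restore_new()` returns a pte with only the uffd-wp bit set.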