From patchwork Mon Nov 14 00:04:46 2022
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Peter Xu
X-Patchwork-Id: 13041782
From: Peter Xu
To: linux-kernel@vger.kernel.org, linux-mm@kvack.org
Cc: Mike Rapoport, Nadav Amit, Andrew Morton, peterx@redhat.com,
 Andrea Arcangeli, Ives van Hoorne, Axel Rasmussen, Alistair Popple,
 stable@vger.kernel.org
Subject: [PATCH v3 1/2] mm/migrate: Fix read-only page got writable when
 recover pte
Date: Sun, 13 Nov 2022 19:04:46 -0500
Message-Id: <20221114000447.1681003-2-peterx@redhat.com>
X-Mailer: git-send-email 2.37.3
In-Reply-To: <20221114000447.1681003-1-peterx@redhat.com>
References: <20221114000447.1681003-1-peterx@redhat.com>

Ives van Hoorne from codesandbox.io reported possible data loss with
uffd-wp when it is applied to memfds on heavily loaded systems.  The
symptom is that some pages read back with mismatched data in the
snapshot child VMs.

I can also reproduce it with a Rust reproducer provided by Ives that
keeps taking snapshots of a 256MB VM: on a 32G system, starting 80
instances triggers the issue within ten minutes.

It turns out that some writes went through to pages even though uffd-wp
was applied to the pte.

The problem is that, when removing migration entries, we did not care
about the write bit as long as we knew the entry was not a writable
migration entry.  That assumption does not always hold: for some memory
types (e.g. writable shmem) mk_pte can return a pte with the write bit
set.  To recover the migration entry to its original state we therefore
need to explicitly wr-protect the pte; otherwise the pte will have the
write bit set even though it came from a read migration entry.  For uffd
this can cause write-through.

The relevant code on uffd was introduced in the anon support, which is
commit f45ec5ff16a7 ("userfaultfd: wp: support swap and page migration",
2020-04-07).
However, anon should not suffer from this problem because anon ptes
should always have the write bit cleared already, so that commit may not
be the proper Fixes target.  I'm pointing the Fixes tag at the uffd
shmem support instead.

Cc: Andrea Arcangeli
Cc: stable@vger.kernel.org
Fixes: b1f9e876862d ("mm/uffd: enable write protection for shmem & hugetlbfs")
Reported-by: Ives van Hoorne
Reviewed-by: Alistair Popple
Tested-by: Ives van Hoorne
Signed-off-by: Peter Xu
---
 mm/migrate.c | 8 +++++++-
 1 file changed, 7 insertions(+), 1 deletion(-)

diff --git a/mm/migrate.c b/mm/migrate.c
index dff333593a8a..8b6351c08c78 100644
--- a/mm/migrate.c
+++ b/mm/migrate.c
@@ -213,8 +213,14 @@ static bool remove_migration_pte(struct folio *folio,
 			pte = pte_mkdirty(pte);
 		if (is_writable_migration_entry(entry))
 			pte = maybe_mkwrite(pte, vma);
-		else if (pte_swp_uffd_wp(*pvmw.pte))
+		else
+			/* NOTE: mk_pte can have write bit set */
+			pte = pte_wrprotect(pte);
+
+		if (pte_swp_uffd_wp(*pvmw.pte)) {
+			WARN_ON_ONCE(pte_write(pte));
 			pte = pte_mkuffd_wp(pte);
+		}
 
 		if (folio_test_anon(folio) && !is_readable_migration_entry(entry))
 			rmap_flags |= RMAP_EXCLUSIVE;

From patchwork Mon Nov 14 00:04:47 2022
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Peter Xu
X-Patchwork-Id: 13041783
From: Peter Xu
To: linux-kernel@vger.kernel.org, linux-mm@kvack.org
Cc: Mike Rapoport, Nadav Amit, Andrew Morton, peterx@redhat.com,
 Andrea Arcangeli, Ives van Hoorne, Axel Rasmussen, Alistair Popple
Subject: [PATCH v3 2/2] mm/uffd: Sanity check write bit for uffd-wp
 protected ptes
Date: Sun, 13 Nov 2022 19:04:47 -0500
Message-Id: <20221114000447.1681003-3-peterx@redhat.com>
X-Mailer: git-send-email 2.37.3
In-Reply-To: <20221114000447.1681003-1-peterx@redhat.com>
References: <20221114000447.1681003-1-peterx@redhat.com>

Add a sanity check, under CONFIG_DEBUG_VM, on the write bit whenever we
get a chance while walking the pgtables.  It can surface the error
early, even before the app notices that the data in the snapshot is
corrupted.  It also identifies the problem as a wrong pgtable setup,
which is hopefully useful information for debugging too.

Cc: Andrea Arcangeli
Signed-off-by: Peter Xu
---
 arch/x86/include/asm/pgtable.h | 18 +++++++++++++++++-
 1 file changed, 17 insertions(+), 1 deletion(-)

diff --git a/arch/x86/include/asm/pgtable.h b/arch/x86/include/asm/pgtable.h
index 5059799bebe3..63bdbb0f989e 100644
--- a/arch/x86/include/asm/pgtable.h
+++ b/arch/x86/include/asm/pgtable.h
@@ -291,7 +291,23 @@ static inline pte_t pte_clear_flags(pte_t pte, pteval_t clear)
 #ifdef CONFIG_HAVE_ARCH_USERFAULTFD_WP
 static inline int pte_uffd_wp(pte_t pte)
 {
-	return pte_flags(pte) & _PAGE_UFFD_WP;
+	bool wp = pte_flags(pte) & _PAGE_UFFD_WP;
+
+#ifdef CONFIG_DEBUG_VM
+	/*
+	 * Having write bit for wr-protect-marked present ptes is fatal,
+	 * because it means the uffd-wp bit will be ignored and write will
+	 * just go through.
+	 *
+	 * Use any chance of pgtable walking to verify this (e.g., when
+	 * page swapped out or being migrated for all purposes). It means
+	 * something is already wrong. Tell the admin even before the
+	 * process crashes. We also nail it with wrong pgtable setup.
+	 */
+	WARN_ON_ONCE(wp && pte_write(pte));
+#endif
+
+	return wp;
 }
 
 static inline pte_t pte_mkuffd_wp(pte_t pte)