From patchwork Tue Feb 18 15:41:51 2020
X-Patchwork-Submitter: "Kirill A . Shutemov"
X-Patchwork-Id: 11388827
From: "Kirill A. Shutemov"
To: Andrew Morton
Cc: Dan Williams, Justin He, linux-mm@kvack.org, linux-kernel@vger.kernel.org, "Kirill A. Shutemov", Jeff Moyer
Subject: [PATCH] mm: Avoid data corruption on CoW fault into PFN-mapped VMA
Date: Tue, 18 Feb 2020 18:41:51 +0300
Message-Id: <20200218154151.13349-1-kirill.shutemov@linux.intel.com>

Jeff Moyer has reported that one of the xfstests triggers a warning when
run on a DAX-enabled filesystem:

	WARNING: CPU: 76 PID: 51024 at mm/memory.c:2317 wp_page_copy+0xc40/0xd50
	...
	wp_page_copy+0x98c/0xd50 (unreliable)
	do_wp_page+0xd8/0xad0
	__handle_mm_fault+0x748/0x1b90
	handle_mm_fault+0x120/0x1f0
	__do_page_fault+0x240/0xd70
	do_page_fault+0x38/0xd0
	handle_page_fault+0x10/0x30

The warning happens on a failed __copy_from_user_inatomic(), which tries
to copy data into a CoW page.

This happens because of a race between MADV_DONTNEED and a CoW page fault:

        CPU0                                    CPU1
 handle_mm_fault()
   do_wp_page()
     wp_page_copy()
       cow_user_page()
                                        madvise(MADV_DONTNEED)
                                          zap_page_range()
                                            zap_pte_range()
                                              ptep_get_and_clear_full()
         __copy_from_user_inatomic()
         sees empty PTE and fails
         WARN_ON_ONCE(1)
         clear_page()

The solution is to retry __copy_from_user_inatomic() under the PTL after
checking that the PTE still matches orig_pte.

The second copy attempt can still fail, for example due to a non-readable
PTE, but there is nothing reasonable we can do about it except clear the
CoW page.

Signed-off-by: Kirill A. Shutemov
Reported-and-tested-by: Jeff Moyer
---
 mm/memory.c | 35 +++++++++++++++++++++++++++--------
 1 file changed, 27 insertions(+), 8 deletions(-)

diff --git a/mm/memory.c b/mm/memory.c
index 0bccc622e482..e8bfdf0d9d1d 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -2257,7 +2257,7 @@ static inline bool cow_user_page(struct page *dst, struct page *src,
 	bool ret;
 	void *kaddr;
 	void __user *uaddr;
-	bool force_mkyoung;
+	bool locked = false;
 	struct vm_area_struct *vma = vmf->vma;
 	struct mm_struct *mm = vma->vm_mm;
 	unsigned long addr = vmf->address;
@@ -2282,11 +2282,11 @@ static inline bool cow_user_page(struct page *dst, struct page *src,
 	 * On architectures with software "accessed" bits, we would
 	 * take a double page fault, so mark it accessed here.
 	 */
-	force_mkyoung = arch_faults_on_old_pte() && !pte_young(vmf->orig_pte);
-	if (force_mkyoung) {
+	if (arch_faults_on_old_pte() && !pte_young(vmf->orig_pte)) {
 		pte_t entry;
 
 		vmf->pte = pte_offset_map_lock(mm, vmf->pmd, addr, &vmf->ptl);
+		locked = true;
 		if (!likely(pte_same(*vmf->pte, vmf->orig_pte))) {
 			/*
 			 * Other thread has already handled the fault
@@ -2310,18 +2310,37 @@ static inline bool cow_user_page(struct page *dst, struct page *src,
 	 * zeroes.
 	 */
 	if (__copy_from_user_inatomic(kaddr, uaddr, PAGE_SIZE)) {
+		if (locked)
+			goto warn;
+
+		/* Re-validate under PTL if the page is still mapped */
+		vmf->pte = pte_offset_map_lock(mm, vmf->pmd, addr, &vmf->ptl);
+		locked = true;
+		if (!likely(pte_same(*vmf->pte, vmf->orig_pte))) {
+			/* The PTE changed under us. Retry page fault. */
+			ret = false;
+			goto pte_unlock;
+		}
+
 		/*
-		 * Give a warn in case there can be some obscure
-		 * use-case
+		 * The same page can be mapped back since last copy attempt.
+		 * Try to copy again under PTL.
 		 */
-		WARN_ON_ONCE(1);
-		clear_page(kaddr);
+		if (__copy_from_user_inatomic(kaddr, uaddr, PAGE_SIZE)) {
+			/*
+			 * Give a warn in case there can be some obscure
+			 * use-case
+			 */
+warn:
+			WARN_ON_ONCE(1);
+			clear_page(kaddr);
+		}
 	}
 
 	ret = true;
 
 pte_unlock:
-	if (force_mkyoung)
+	if (locked)
 		pte_unmap_unlock(vmf->pte, vmf->ptl);
 	kunmap_atomic(kaddr);
 	flush_dcache_page(dst);
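
Note for readers: the retry logic in the patch can be illustrated outside
the kernel. The following is only a minimal user-space sketch of the same
pattern (optimistic copy, re-validate under a lock on failure, retry, then
fall back to zeroes). It is not kernel code: struct slot, slot_init(),
slot_set(), try_copy() and copy_with_revalidate() are hypothetical names,
a pthread mutex stands in for the PTL, a NULL pointer stands in for a
cleared or unreadable PTE, and buffer lifetime is ignored.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define BUF_SIZE 64

/* Hypothetical stand-in for a PTE: the buffer currently mapped, if any. */
struct slot {
	pthread_mutex_t lock;		/* plays the role of the PTL */
	_Atomic(const char *) src;	/* NULL models a cleared entry */
};

static void slot_init(struct slot *s, const char *src)
{
	pthread_mutex_init(&s->lock, NULL);
	atomic_init(&s->src, src);
}

/* Writers (the madvise() side in the patch) change the entry under the lock. */
static void slot_set(struct slot *s, const char *src)
{
	pthread_mutex_lock(&s->lock);
	atomic_store(&s->src, src);
	pthread_mutex_unlock(&s->lock);
}

/* Models __copy_from_user_inatomic(): nonzero means the copy failed. */
static int try_copy(char *dst, const char *src)
{
	if (!src)
		return 1;
	memcpy(dst, src, BUF_SIZE);
	return 0;
}

/*
 * Returns true if dst was filled (with data, or with zeroes as a last
 * resort), false if the entry no longer matches the snapshot "orig" and
 * the caller should retry the whole operation.
 */
static bool copy_with_revalidate(struct slot *s, const char *orig, char *dst)
{
	bool ret = true;

	/* Optimistic attempt without the lock. */
	if (!try_copy(dst, atomic_load(&s->src)))
		return true;

	/* The copy failed: re-validate the entry under the lock. */
	pthread_mutex_lock(&s->lock);
	if (atomic_load(&s->src) != orig) {
		/* The entry changed under us: ask the caller to retry. */
		ret = false;
		goto unlock;
	}

	/* The entry matches the snapshot: retry while it cannot change. */
	if (try_copy(dst, orig))
		memset(dst, 0, BUF_SIZE);	/* still failing: hand back zeroes */
unlock:
	pthread_mutex_unlock(&s->lock);
	return ret;
}
```

A caller would snapshot the entry, call copy_with_revalidate(), and restart
the whole operation when it returns false, much as the fault is retried
when cow_user_page() returns false.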