From patchwork Fri Apr 10 07:32:09 2020
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Hillf Danton
X-Patchwork-Id: 11482819
From: Hillf Danton
To: kernel test robot
Cc: Peter Xu, Andrew Morton, Linux Memory Management List, linux-kernel@vger.kernel.org
Subject: f45ec5ff16 ("userfaultfd: wp: support swap and page migration"): [ 140.777858] BUG: Bad rss-counter state mm:b278fc66 type:MM_ANONPAGES val:1
Date: Fri, 10 Apr 2020 15:32:09 +0800
Message-Id: <20200410073209.11164-1-hdanton@sina.com>
In-Reply-To: <20200410002518.GG8179@shao2-debian>
Sender: owner-linux-mm@kvack.org
Precedence: bulk

On Fri, 10 Apr 2020 08:25:18 +0800
> Greetings,
>
> 0day kernel testing robot got the below dmesg and the first bad commit is
>
> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git master
>
> commit f45ec5ff16a75f96dac8c89862d75f1d8739efd4
> Author: Peter Xu
> AuthorDate: Mon Apr 6 20:06:01 2020 -0700
> Commit: Linus Torvalds
> CommitDate: Tue Apr 7 10:43:39 2020 -0700
>
> userfaultfd: wp: support swap and page migration
>
> For either swap and page migration, we all use the bit 2 of the entry to
> identify whether this entry is uffd write-protected. It plays a similar
> role as the existing soft dirty bit in swap entries but only for keeping
> the uffd-wp tracking for a specific PTE/PMD.
>
> Something special here is that when we want to recover the uffd-wp bit
> from a swap/migration entry to the PTE bit we'll also need to take care of
> the _PAGE_RW bit and make sure it's cleared, otherwise even with the
> _PAGE_UFFD_WP bit we can't trap it at all.
>
> In change_pte_range() we do nothing for uffd if the PTE is a swap entry.
> That can lead to data mismatch if the page that we are going to write
> protect is swapped out when sending the UFFDIO_WRITEPROTECT. This patch
> also applies/removes the uffd-wp bit even for the swap entries.
>

I have trouble understanding the last sentence of the paragraph above,
and particularly how it links to the first one.

> If you fix the issue, kindly add following tag
> Reported-by: kernel test robot
>
> [child3:925] eventfd (323) returned ENOSYS, marking as inactive.
> [ 132.014801] can: request_module (can-proto-2) failed.
> [ 132.063717] can: request_module (can-proto-2) failed.
> [ 137.186037] trinity-c2 (943) used greatest stack depth: 5804 bytes left
> [ 140.771486] MCE: Killing trinity-c2:956 due to hardware memory corruption fault at 8bd2060
> [ 140.777858] BUG: Bad rss-counter state mm:b278fc66 type:MM_ANONPAGES val:1
> [ 140.778736] BUG: Bad rss-counter state mm:b278fc66 type:MM_SHMEMPAGES val:2
> [ 141.589424] MCE: Killing trinity-c3:940 due to hardware memory corruption fault at 8a8c860
> [ 141.590730] swap_info_get: Bad swap file entry 700b8216
> [ 141.591400] BUG: Bad page map in process trinity-c3 pte:17042c3c pmd:b1809067
> [ 141.592304] addr:08bcf000 vm_flags:00100073 anon_vma:f1f29528 mapping:00000000 index:8bcf
> [ 141.593399] file:(null) fault:0x0 mmap:0x0 readpage:0x0
> [ 141.594065] CPU: 0 PID: 940 Comm: trinity-c3 Not tainted 5.6.0-11490-gf45ec5ff16a75 #1
> [ 141.595055] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.12.0-1 04/01/2014
> [ 141.596093] Call Trace:
> [ 141.596443] dump_stack+0x16/0x18
> [ 141.596868] print_bad_pte+0x13f/0x159
> [ 141.597367] unmap_page_range+0x2a7/0x3e7
> [ 141.597893] unmap_single_vma+0x53/0x5d
> [ 141.598383] unmap_vmas+0x2c/0x3b
> [ 141.598811] exit_mmap+0x81/0xfc
> [ 141.599238] __mmput+0x25/0x8d
> [ 141.599633] mmput+0x28/0x2b
> [ 141.600007] do_exit+0x2f0/0x84a
> [ 141.600449] ? ___might_sleep+0x3f/0x11f
> [ 141.600949] do_group_exit+0x86/0x86
> [ 141.601421] __ia32_sys_exit_group+0x15/0x15
> [ 141.601965] do_fast_syscall_32+0x86/0xbf
> [ 141.602481] entry_SYSENTER_32+0xaf/0x101

Because is_swap_pte(oldpte) != IS_ENABLED(CONFIG_MIGRATION), restore the
old behavior by modifying uffd_wp only for a pte that is __not__ a swap
entry, as the commit log says.

--- b/mm/mprotect.c
+++ c/mm/mprotect.c
@@ -139,7 +139,7 @@ static unsigned long change_pte_range(st
 			}
 			ptep_modify_prot_commit(vma, addr, pte, oldpte, ptent);
 			pages++;
-		} else if (is_swap_pte(oldpte)) {
+		} else if (IS_ENABLED(CONFIG_MIGRATION)) {
 			swp_entry_t entry = pte_to_swp_entry(oldpte);
 			pte_t newpte;
 
@@ -154,7 +154,9 @@ static unsigned long change_pte_range(st
 					newpte = pte_swp_mksoft_dirty(newpte);
 				if (pte_swp_uffd_wp(oldpte))
 					newpte = pte_swp_mkuffd_wp(newpte);
-			} else if (is_write_device_private_entry(entry)) {
+			}
+
+			if (is_write_device_private_entry(entry)) {
 				/*
 				 * We do not preserve soft-dirtiness. See
 				 * copy_one_pte() for explanation.
@@ -163,11 +165,18 @@ static unsigned long change_pte_range(st
 				newpte = swp_entry_to_pte(entry);
 				if (pte_swp_uffd_wp(oldpte))
 					newpte = pte_swp_mkuffd_wp(newpte);
-			} else {
-				newpte = oldpte;
 			}
 
-			if (uffd_wp)
+			/*
+			 * do nothing for changing uffd_wp if oldpte is a
+			 * swap entry.
+			 * That can lead to data mismatch if the page we
+			 * are going to write protect is swapped out when
+			 * sending the UFFDIO_WRITEPROTECT.
+			 */
+			if (is_swap_pte(oldpte))
+				newpte = oldpte;
+			else if (uffd_wp)
 				newpte = pte_swp_mkuffd_wp(newpte);
 			else if (uffd_wp_resolve)
 				newpte = pte_swp_clear_uffd_wp(newpte);
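
For readers who want to poke at the ordering of the last hunk outside the
kernel, here is a rough userspace sketch of only the final if/else-if chain.
It is not kernel code: struct model_pte, model_update_uffd_wp() and
model_selfcheck() are hypothetical names made up for illustration, and the
two booleans merely stand in for is_swap_pte(oldpte) and the uffd-wp bit of
the entry. Nothing else in change_pte_range() is modeled.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for a non-present PTE; NOT the kernel's pte_t. */
struct model_pte {
	bool is_swap;	/* models is_swap_pte(oldpte) */
	bool uffd_wp;	/* models the uffd-wp bit carried by the entry */
};

/*
 * Mirrors only the order of the checks in the last hunk:
 * a swap entry is returned unchanged, otherwise the uffd-wp
 * bit is set or cleared depending on the request.
 */
struct model_pte model_update_uffd_wp(struct model_pte oldpte,
				      struct model_pte newpte,
				      bool uffd_wp, bool uffd_wp_resolve)
{
	if (oldpte.is_swap)
		return oldpte;		/* newpte = oldpte */
	if (uffd_wp)
		newpte.uffd_wp = true;	/* pte_swp_mkuffd_wp() */
	else if (uffd_wp_resolve)
		newpte.uffd_wp = false;	/* pte_swp_clear_uffd_wp() */
	return newpte;
}

/* Small self-check exercising both arms of the chain. */
void model_selfcheck(void)
{
	struct model_pte swap = { .is_swap = true, .uffd_wp = false };
	struct model_pte other = { .is_swap = false, .uffd_wp = false };

	/* A swap entry keeps its old state even when write-protect is requested. */
	assert(!model_update_uffd_wp(swap, swap, true, false).uffd_wp);
	/* A non-swap entry picks up the uffd-wp bit. */
	assert(model_update_uffd_wp(other, other, true, false).uffd_wp);
}
```

In this model a write-protect request leaves a swap entry untouched, which
is the "do nothing for uffd if the PTE is a swap entry" behavior the commit
log describes as the old one.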