From patchwork Tue Sep 24 23:24:56 2019
X-Patchwork-Submitter: Yu Zhao
X-Patchwork-Id: 11159783
Date: Tue, 24 Sep 2019 17:24:56 -0600
Subject: [PATCH v3 1/4] mm: remove unnecessary smp_wmb() in collapse_huge_page()
From: Yu Zhao
To: Andrew Morton, Michal Hocko, Kirill A. Shutemov
Cc: Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo, Alexander Shishkin,
    Jiri Olsa, Namhyung Kim, Vlastimil Babka, Hugh Dickins, Jérôme Glisse,
    Andrea Arcangeli, Aneesh Kumar K.V, David Rientjes, Matthew Wilcox,
    Lance Roy, Ralph Campbell, Jason Gunthorpe, Dave Airlie, Thomas Hellstrom,
    Souptick Joarder, Mel Gorman, Jan Kara, Mike Kravetz, Huang Ying,
    Aaron Lu, Omar Sandoval, Thomas Gleixner, Vineeth Remanan Pillai,
    Daniel Jordan, Mike Rapoport, Joel Fernandes, Mark Rutland,
    Alexander Duyck, Pavel Tatashin, David Hildenbrand, Juergen Gross,
    Anthony Yznaga, Johannes Weiner, Darrick J. Wong,
    linux-kernel@vger.kernel.org, linux-mm@kvack.org, Yu Zhao
Message-Id: <20190924232459.214097-1-yuzhao@google.com>
In-Reply-To: <20190914070518.112954-1-yuzhao@google.com>
References: <20190914070518.112954-1-yuzhao@google.com>

__SetPageUptodate() always has a built-in smp_wmb() to make sure the user
data copied to a new page appears before the set_pmd_at() write, so the
explicit smp_wmb() in collapse_huge_page() is redundant and can be removed.

Signed-off-by: Yu Zhao
---
 mm/khugepaged.c | 7 -------
 1 file changed, 7 deletions(-)

diff --git a/mm/khugepaged.c b/mm/khugepaged.c
index ccede2425c3f..70ff98e1414d 100644
--- a/mm/khugepaged.c
+++ b/mm/khugepaged.c
@@ -1067,13 +1067,6 @@ static void collapse_huge_page(struct mm_struct *mm,
 	_pmd = mk_huge_pmd(new_page, vma->vm_page_prot);
 	_pmd = maybe_pmd_mkwrite(pmd_mkdirty(_pmd), vma);
 
-	/*
-	 * spin_lock() below is not the equivalent of smp_wmb(), so
-	 * this is needed to avoid the copy_huge_page writes to become
-	 * visible after the set_pmd_at() write.
-	 */
-	smp_wmb();
-
 	spin_lock(pmd_ptl);
 	BUG_ON(!pmd_none(*pmd));
 	page_add_new_anon_rmap(new_page, vma, address, true);
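To make the ordering argument in 1/4 concrete, here is a minimal user-space sketch of the same publish pattern, using C11 atomics in place of the kernel's barrier primitives. It is illustration only, not kernel code; the names (page_data, page_flags, published_pmd, set_page_uptodate_with_barrier) are invented stand-ins for copy_huge_page(), __SetPageUptodate() and set_pmd_at():

#include <stdatomic.h>
#include <stdio.h>

static int page_data;                /* stands in for the copied huge page */
static atomic_uint page_flags;       /* stands in for page->flags          */
static _Atomic(int *) published_pmd; /* stands in for the pmd entry        */

/* analogue of __SetPageUptodate(): write barrier first, then set the bit */
static void set_page_uptodate_with_barrier(void)
{
        atomic_thread_fence(memory_order_release);
        atomic_fetch_or_explicit(&page_flags, 1u, memory_order_relaxed);
}

static void collapse_and_publish(void)
{
        page_data = 42;                   /* copy_huge_page() analogue      */
        set_page_uptodate_with_barrier(); /* already orders the store above */
        /* a second explicit barrier here adds nothing, hence the removal */
        atomic_store_explicit(&published_pmd, &page_data,
                              memory_order_relaxed); /* set_pmd_at() analogue */
}

int main(void)
{
        collapse_and_publish();
        printf("published value: %d\n",
               *atomic_load_explicit(&published_pmd, memory_order_relaxed));
        return 0;
}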
From patchwork Tue Sep 24 23:24:57 2019
X-Patchwork-Submitter: Yu Zhao
X-Patchwork-Id: 11159785
Date: Tue, 24 Sep 2019 17:24:57 -0600
Subject: [PATCH v3 2/4] mm: don't expose hugetlb page to fast gup prematurely
From: Yu Zhao
To: Andrew Morton, Michal Hocko, Kirill A. Shutemov
Cc: Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo, Alexander Shishkin,
    Jiri Olsa, Namhyung Kim, Vlastimil Babka, Hugh Dickins, Jérôme Glisse,
    Andrea Arcangeli, Aneesh Kumar K.V, David Rientjes, Matthew Wilcox,
    Lance Roy, Ralph Campbell, Jason Gunthorpe, Dave Airlie, Thomas Hellstrom,
    Souptick Joarder, Mel Gorman, Jan Kara, Mike Kravetz, Huang Ying,
    Aaron Lu, Omar Sandoval, Thomas Gleixner, Vineeth Remanan Pillai,
    Daniel Jordan, Mike Rapoport, Joel Fernandes, Mark Rutland,
    Alexander Duyck, Pavel Tatashin, David Hildenbrand, Juergen Gross,
    Anthony Yznaga, Johannes Weiner, Darrick J. Wong,
    linux-kernel@vger.kernel.org, linux-mm@kvack.org, Yu Zhao
Message-Id: <20190924232459.214097-2-yuzhao@google.com>
In-Reply-To: <20190924232459.214097-1-yuzhao@google.com>
References: <20190914070518.112954-1-yuzhao@google.com> <20190924232459.214097-1-yuzhao@google.com>

We don't want to expose a hugetlb page to the fast gup running on a remote
CPU before the local non-atomic op __SetPageUptodate() is visible first.

For a hugetlb page, there is no memory barrier between the non-atomic op and
set_huge_pte_at(), so the page can appear to the fast gup before the flag
does. There is no evidence this would cause any problem, but there is no
point risking the race either.

This patch replaces 3 uses of the non-atomic op with its atomic version
throughout mm/hugetlb.c. The only remaining one, in hugetlbfs_fallocate(),
is safe because huge_add_to_page_cache() serves as a valid write barrier.
Signed-off-by: Yu Zhao
---
 mm/hugetlb.c | 8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/mm/hugetlb.c b/mm/hugetlb.c
index 6d7296dd11b8..0be5b7937085 100644
--- a/mm/hugetlb.c
+++ b/mm/hugetlb.c
@@ -3693,7 +3693,7 @@ static vm_fault_t hugetlb_cow(struct mm_struct *mm, struct vm_area_struct *vma,
 	copy_user_huge_page(new_page, old_page, address, vma,
 			    pages_per_huge_page(h));
-	__SetPageUptodate(new_page);
+	SetPageUptodate(new_page);
 
 	mmu_notifier_range_init(&range, MMU_NOTIFY_CLEAR, 0, vma, mm, haddr,
 				haddr + huge_page_size(h));
@@ -3879,7 +3879,7 @@ static vm_fault_t hugetlb_no_page(struct mm_struct *mm,
 			goto out;
 		}
 		clear_huge_page(page, address, pages_per_huge_page(h));
-		__SetPageUptodate(page);
+		SetPageUptodate(page);
 		new_page = true;
 
 		if (vma->vm_flags & VM_MAYSHARE) {
@@ -4180,11 +4180,11 @@ int hugetlb_mcopy_atomic_pte(struct mm_struct *dst_mm,
 	}
 
 	/*
-	 * The memory barrier inside __SetPageUptodate makes sure that
+	 * The memory barrier inside SetPageUptodate makes sure that
 	 * preceding stores to the page contents become visible before
 	 * the set_pte_at() write.
 	 */
-	__SetPageUptodate(page);
+	SetPageUptodate(page);
 
 	mapping = dst_vma->vm_file->f_mapping;
 	idx = vma_hugecache_offset(h, dst_vma, dst_addr);
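The requirement that 2/4 enforces can likewise be reproduced in user space with acquire/release atomics. The sketch below is illustrative only, and every name in it (struct hpage, pte, fast_gup) is an invented stand-in rather than the kernel API; the release ordering on the flag and on the publishing store is roughly what SetPageUptodate() followed by set_huge_pte_at() provides to a lock-free reader. Build with something like cc -pthread:

#include <pthread.h>
#include <stdatomic.h>
#include <stdio.h>

struct hpage { atomic_uint flags; int data; };

static struct hpage hpage;
static _Atomic(struct hpage *) pte;  /* set_huge_pte_at() analogue */

/* fast-gup analogue: a lock-free reader on another CPU */
static void *fast_gup(void *arg)
{
        struct hpage *p;

        /* spin until the pte is populated, then inspect the flags */
        while (!(p = atomic_load_explicit(&pte, memory_order_acquire)))
                ;
        if (atomic_load_explicit(&p->flags, memory_order_relaxed) & 1u)
                puts("page was uptodate when it became visible");
        else
                puts("saw the page before the uptodate flag"); /* premature exposure */
        return NULL;
}

int main(void)
{
        pthread_t t;

        pthread_create(&t, NULL, fast_gup, NULL);

        hpage.data = 42;  /* clear_huge_page()/copy_user_huge_page() analogue */
        /* SetPageUptodate() analogue: ordered before the publication below */
        atomic_fetch_or_explicit(&hpage.flags, 1u, memory_order_release);
        atomic_store_explicit(&pte, &hpage, memory_order_release);

        pthread_join(t, NULL);
        return 0;
}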
From patchwork Tue Sep 24 23:24:58 2019
X-Patchwork-Submitter: Yu Zhao
X-Patchwork-Id: 11159787
Date: Tue, 24 Sep 2019 17:24:58 -0600
Subject: [PATCH v3 3/4] mm: don't expose non-hugetlb page to fast gup prematurely
From: Yu Zhao
To: Andrew Morton, Michal Hocko, Kirill A. Shutemov
Cc: Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo, Alexander Shishkin,
    Jiri Olsa, Namhyung Kim, Vlastimil Babka, Hugh Dickins, Jérôme Glisse,
    Andrea Arcangeli, Aneesh Kumar K.V, David Rientjes, Matthew Wilcox,
    Lance Roy, Ralph Campbell, Jason Gunthorpe, Dave Airlie, Thomas Hellstrom,
    Souptick Joarder, Mel Gorman, Jan Kara, Mike Kravetz, Huang Ying,
    Aaron Lu, Omar Sandoval, Thomas Gleixner, Vineeth Remanan Pillai,
    Daniel Jordan, Mike Rapoport, Joel Fernandes, Mark Rutland,
    Alexander Duyck, Pavel Tatashin, David Hildenbrand, Juergen Gross,
    Anthony Yznaga, Johannes Weiner, Darrick J. Wong,
    linux-kernel@vger.kernel.org, linux-mm@kvack.org, Yu Zhao
Message-Id: <20190924232459.214097-3-yuzhao@google.com>
In-Reply-To: <20190924232459.214097-1-yuzhao@google.com>
References: <20190914070518.112954-1-yuzhao@google.com> <20190924232459.214097-1-yuzhao@google.com>
V" , David Rientjes , Matthew Wilcox , Lance Roy , Ralph Campbell , Jason Gunthorpe , Dave Airlie , Thomas Hellstrom , Souptick Joarder , Mel Gorman , Jan Kara , Mike Kravetz , Huang Ying , Aaron Lu , Omar Sandoval , Thomas Gleixner , Vineeth Remanan Pillai , Daniel Jordan , Mike Rapoport , Joel Fernandes , Mark Rutland , Alexander Duyck , Pavel Tatashin , David Hildenbrand , Juergen Gross , Anthony Yznaga , Johannes Weiner , "Darrick J . Wong" , linux-kernel@vger.kernel.org, linux-mm@kvack.org, Yu Zhao X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: We don't want to expose a non-hugetlb page to the fast gup running on a remote CPU before all local non-atomic ops on the page flags are visible first. For an anon page that isn't in swap cache, we need to make sure all prior non-atomic ops, especially __SetPageSwapBacked() in page_add_new_anon_rmap(), are ordered before set_pte_at() to prevent the following race: CPU 1 CPU1 set_pte_at() get_user_pages_fast() page_add_new_anon_rmap() gup_pte_range() __SetPageSwapBacked() SetPageReferenced() This demonstrates a non-fatal scenario. Though haven't been directly observed, the fatal ones can exist, e.g., PG_lock set by fast gup caller and then overwritten by __SetPageSwapBacked(). For an anon page that is already in swap cache or a file page, we don't need smp_wmb() before set_pte_at() because adding to swap or file cach serves as a valid write barrier. Using non-atomic ops thereafter is a bug, obviously. smp_wmb() is added following 11 of total 12 page_add_new_anon_rmap() call sites, with the only exception being do_huge_pmd_wp_page_fallback() because of an existing smp_wmb(). Signed-off-by: Yu Zhao --- kernel/events/uprobes.c | 2 ++ mm/huge_memory.c | 6 ++++++ mm/khugepaged.c | 2 ++ mm/memory.c | 10 +++++++++- mm/migrate.c | 2 ++ mm/swapfile.c | 6 ++++-- mm/userfaultfd.c | 2 ++ 7 files changed, 27 insertions(+), 3 deletions(-) diff --git a/kernel/events/uprobes.c b/kernel/events/uprobes.c index 84fa00497c49..7069785e2e52 100644 --- a/kernel/events/uprobes.c +++ b/kernel/events/uprobes.c @@ -194,6 +194,8 @@ static int __replace_page(struct vm_area_struct *vma, unsigned long addr, flush_cache_page(vma, addr, pte_pfn(*pvmw.pte)); ptep_clear_flush_notify(vma, addr, pvmw.pte); + /* commit non-atomic ops before exposing to fast gup */ + smp_wmb(); set_pte_at_notify(mm, addr, pvmw.pte, mk_pte(new_page, vma->vm_page_prot)); diff --git a/mm/huge_memory.c b/mm/huge_memory.c index de1f15969e27..21d271a29d96 100644 --- a/mm/huge_memory.c +++ b/mm/huge_memory.c @@ -616,6 +616,8 @@ static vm_fault_t __do_huge_pmd_anonymous_page(struct vm_fault *vmf, mem_cgroup_commit_charge(page, memcg, false, true); lru_cache_add_active_or_unevictable(page, vma); pgtable_trans_huge_deposit(vma->vm_mm, vmf->pmd, pgtable); + /* commit non-atomic ops before exposing to fast gup */ + smp_wmb(); set_pmd_at(vma->vm_mm, haddr, vmf->pmd, entry); add_mm_counter(vma->vm_mm, MM_ANONPAGES, HPAGE_PMD_NR); mm_inc_nr_ptes(vma->vm_mm); @@ -1276,7 +1278,9 @@ static vm_fault_t do_huge_pmd_wp_page_fallback(struct vm_fault *vmf, } kfree(pages); + /* commit non-atomic ops before exposing to fast gup */ smp_wmb(); /* make pte visible before pmd */ + pmd_populate(vma->vm_mm, vmf->pmd, pgtable); page_remove_rmap(page, true); spin_unlock(vmf->ptl); @@ -1423,6 +1427,8 @@ vm_fault_t do_huge_pmd_wp_page(struct vm_fault *vmf, pmd_t orig_pmd) page_add_new_anon_rmap(new_page, vma, haddr, true); 
 		mem_cgroup_commit_charge(new_page, memcg, false, true);
 		lru_cache_add_active_or_unevictable(new_page, vma);
+		/* commit non-atomic ops before exposing to fast gup */
+		smp_wmb();
 		set_pmd_at(vma->vm_mm, haddr, vmf->pmd, entry);
 		update_mmu_cache_pmd(vma, vmf->address, vmf->pmd);
 		if (!page) {
diff --git a/mm/khugepaged.c b/mm/khugepaged.c
index 70ff98e1414d..f2901edce6de 100644
--- a/mm/khugepaged.c
+++ b/mm/khugepaged.c
@@ -1074,6 +1074,8 @@ static void collapse_huge_page(struct mm_struct *mm,
 	count_memcg_events(memcg, THP_COLLAPSE_ALLOC, 1);
 	lru_cache_add_active_or_unevictable(new_page, vma);
 	pgtable_trans_huge_deposit(mm, pmd, pgtable);
+	/* commit non-atomic ops before exposing to fast gup */
+	smp_wmb();
 	set_pmd_at(mm, address, pmd, _pmd);
 	update_mmu_cache_pmd(vma, address, pmd);
 	spin_unlock(pmd_ptl);
diff --git a/mm/memory.c b/mm/memory.c
index aa86852d9ec2..6dabbc3cd3b7 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -2367,6 +2367,8 @@ static vm_fault_t wp_page_copy(struct vm_fault *vmf)
 		 * mmu page tables (such as kvm shadow page tables), we want the
 		 * new page to be mapped directly into the secondary page table.
 		 */
+		/* commit non-atomic ops before exposing to fast gup */
+		smp_wmb();
 		set_pte_at_notify(mm, vmf->address, vmf->pte, entry);
 		update_mmu_cache(vma, vmf->address, vmf->pte);
 		if (old_page) {
@@ -2877,7 +2879,6 @@ vm_fault_t do_swap_page(struct vm_fault *vmf)
 	flush_icache_page(vma, page);
 	if (pte_swp_soft_dirty(vmf->orig_pte))
 		pte = pte_mksoft_dirty(pte);
-	set_pte_at(vma->vm_mm, vmf->address, vmf->pte, pte);
 	arch_do_swap_page(vma->vm_mm, vma, vmf->address, pte, vmf->orig_pte);
 	vmf->orig_pte = pte;
@@ -2886,12 +2887,15 @@ vm_fault_t do_swap_page(struct vm_fault *vmf)
 		page_add_new_anon_rmap(page, vma, vmf->address, false);
 		mem_cgroup_commit_charge(page, memcg, false, false);
 		lru_cache_add_active_or_unevictable(page, vma);
+		/* commit non-atomic ops before exposing to fast gup */
+		smp_wmb();
 	} else {
 		do_page_add_anon_rmap(page, vma, vmf->address, exclusive);
 		mem_cgroup_commit_charge(page, memcg, true, false);
 		activate_page(page);
 	}
 
+	set_pte_at(vma->vm_mm, vmf->address, vmf->pte, pte);
 	swap_free(entry);
 	if (mem_cgroup_swap_full(page) ||
 	    (vma->vm_flags & VM_LOCKED) || PageMlocked(page))
@@ -3034,6 +3038,8 @@ static vm_fault_t do_anonymous_page(struct vm_fault *vmf)
 	page_add_new_anon_rmap(page, vma, vmf->address, false);
 	mem_cgroup_commit_charge(page, memcg, false, false);
 	lru_cache_add_active_or_unevictable(page, vma);
+	/* commit non-atomic ops before exposing to fast gup */
+	smp_wmb();
 setpte:
 	set_pte_at(vma->vm_mm, vmf->address, vmf->pte, entry);
@@ -3297,6 +3303,8 @@ vm_fault_t alloc_set_pte(struct vm_fault *vmf, struct mem_cgroup *memcg,
 		page_add_new_anon_rmap(page, vma, vmf->address, false);
 		mem_cgroup_commit_charge(page, memcg, false, false);
 		lru_cache_add_active_or_unevictable(page, vma);
+		/* commit non-atomic ops before exposing to fast gup */
+		smp_wmb();
 	} else {
 		inc_mm_counter_fast(vma->vm_mm, mm_counter_file(page));
 		page_add_file_rmap(page, false);
diff --git a/mm/migrate.c b/mm/migrate.c
index 9f4ed4e985c1..943d147ecc3e 100644
--- a/mm/migrate.c
+++ b/mm/migrate.c
@@ -2783,6 +2783,8 @@ static void migrate_vma_insert_page(struct migrate_vma *migrate,
 	lru_cache_add_active_or_unevictable(page, vma);
 	get_page(page);
 
+	/* commit non-atomic ops before exposing to fast gup */
+	smp_wmb();
 	if (flush) {
 		flush_cache_page(vma, addr, pte_pfn(*ptep));
 		ptep_clear_flush_notify(vma, addr, ptep);
diff --git a/mm/swapfile.c b/mm/swapfile.c
index dab43523afdd..5c5547053ee0 100644
--- a/mm/swapfile.c
+++ b/mm/swapfile.c
@@ -1880,8 +1880,6 @@ static int unuse_pte(struct vm_area_struct *vma, pmd_t *pmd,
 	dec_mm_counter(vma->vm_mm, MM_SWAPENTS);
 	inc_mm_counter(vma->vm_mm, MM_ANONPAGES);
 	get_page(page);
-	set_pte_at(vma->vm_mm, addr, pte,
-		   pte_mkold(mk_pte(page, vma->vm_page_prot)));
 	if (page == swapcache) {
 		page_add_anon_rmap(page, vma, addr, false);
 		mem_cgroup_commit_charge(page, memcg, true, false);
@@ -1889,7 +1887,11 @@ static int unuse_pte(struct vm_area_struct *vma, pmd_t *pmd,
 		page_add_new_anon_rmap(page, vma, addr, false);
 		mem_cgroup_commit_charge(page, memcg, false, false);
 		lru_cache_add_active_or_unevictable(page, vma);
+		/* commit non-atomic ops before exposing to fast gup */
+		smp_wmb();
 	}
+	set_pte_at(vma->vm_mm, addr, pte,
+		   pte_mkold(mk_pte(page, vma->vm_page_prot)));
 	swap_free(entry);
 	/*
 	 * Move the page to the active list so it is not
diff --git a/mm/userfaultfd.c b/mm/userfaultfd.c
index c7ae74ce5ff3..4f92913242a1 100644
--- a/mm/userfaultfd.c
+++ b/mm/userfaultfd.c
@@ -92,6 +92,8 @@ static int mcopy_atomic_pte(struct mm_struct *dst_mm,
 	mem_cgroup_commit_charge(page, memcg, false, false);
 	lru_cache_add_active_or_unevictable(page, dst_vma);
 
+	/* commit non-atomic ops before exposing to fast gup */
+	smp_wmb();
 	set_pte_at(dst_mm, dst_addr, dst_pte, _dst_pte);
 
 	/* No need to invalidate - it was non-present before */
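The "fatal" flavor of the race described in 3/4, a non-atomic flags update wiping out a concurrent atomic one, is easy to demonstrate outside the kernel. The sketch below is illustration only, with invented names (PG_swapbacked, PG_referenced, remote_cpu, ITERS); it emulates __SetPageSwapBacked() as a plain load/store pair racing against an atomic fetch-or, and depending on scheduling it may report a few lost PG_referenced bits per run. Build with cc -pthread:

#include <pthread.h>
#include <stdatomic.h>
#include <stdio.h>

#define PG_swapbacked (1UL << 0)
#define PG_referenced (1UL << 1)
#define ITERS         20000

static atomic_ulong flags;  /* stands in for page->flags */

/* fast-gup side: atomically mark the page referenced */
static void *remote_cpu(void *arg)
{
        atomic_fetch_or_explicit(&flags, PG_referenced, memory_order_relaxed);
        return NULL;
}

int main(void)
{
        int lost = 0;

        for (int i = 0; i < ITERS; i++) {
                pthread_t t;
                unsigned long old;

                atomic_store_explicit(&flags, 0, memory_order_relaxed);
                pthread_create(&t, NULL, remote_cpu, NULL);

                /*
                 * __SetPageSwapBacked() analogue: a non-atomic read-modify-write.
                 * If the remote fetch_or lands between the load and the store,
                 * its PG_referenced bit is silently overwritten.
                 */
                old = atomic_load_explicit(&flags, memory_order_relaxed);
                atomic_store_explicit(&flags, old | PG_swapbacked,
                                      memory_order_relaxed);

                pthread_join(t, NULL);
                if (!(atomic_load_explicit(&flags, memory_order_relaxed) &
                      PG_referenced))
                        lost++;
        }
        printf("PG_referenced lost %d times out of %d\n", lost, ITERS);
        return 0;
}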
From patchwork Tue Sep 24 23:24:59 2019
X-Patchwork-Submitter: Yu Zhao
X-Patchwork-Id: 11159789
Date: Tue, 24 Sep 2019 17:24:59 -0600
Subject: [PATCH v3 4/4] mm: remove unnecessary smp_wmb() in __SetPageUptodate()
From: Yu Zhao
To: Andrew Morton, Michal Hocko, Kirill A. Shutemov
Cc: Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo, Alexander Shishkin,
    Jiri Olsa, Namhyung Kim, Vlastimil Babka, Hugh Dickins, Jérôme Glisse,
    Andrea Arcangeli, Aneesh Kumar K.V, David Rientjes, Matthew Wilcox,
    Lance Roy, Ralph Campbell, Jason Gunthorpe, Dave Airlie, Thomas Hellstrom,
    Souptick Joarder, Mel Gorman, Jan Kara, Mike Kravetz, Huang Ying,
    Aaron Lu, Omar Sandoval, Thomas Gleixner, Vineeth Remanan Pillai,
    Daniel Jordan, Mike Rapoport, Joel Fernandes, Mark Rutland,
    Alexander Duyck, Pavel Tatashin, David Hildenbrand, Juergen Gross,
    Anthony Yznaga, Johannes Weiner, Darrick J. Wong,
    linux-kernel@vger.kernel.org, linux-mm@kvack.org, Yu Zhao
Message-Id: <20190924232459.214097-4-yuzhao@google.com>
In-Reply-To: <20190924232459.214097-1-yuzhao@google.com>
References: <20190914070518.112954-1-yuzhao@google.com> <20190924232459.214097-1-yuzhao@google.com>
V" , David Rientjes , Matthew Wilcox , Lance Roy , Ralph Campbell , Jason Gunthorpe , Dave Airlie , Thomas Hellstrom , Souptick Joarder , Mel Gorman , Jan Kara , Mike Kravetz , Huang Ying , Aaron Lu , Omar Sandoval , Thomas Gleixner , Vineeth Remanan Pillai , Daniel Jordan , Mike Rapoport , Joel Fernandes , Mark Rutland , Alexander Duyck , Pavel Tatashin , David Hildenbrand , Juergen Gross , Anthony Yznaga , Johannes Weiner , "Darrick J . Wong" , linux-kernel@vger.kernel.org, linux-mm@kvack.org, Yu Zhao X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: smp_wmb()s added in the previous patch guarantee that the user data appears before a page is exposed by set_pte_at(). So there is no need for __SetPageUptodate() to have a built-in one. There are total 13 __SetPageUptodate() for the non-hugetlb case. 12 of them reuse smp_wmb()s added in the previous patch. The one in shmem_mfill_atomic_pte() doesn't need a explicit write barrier because of the following shmem_add_to_page_cache(). Signed-off-by: Yu Zhao --- include/linux/page-flags.h | 6 +++++- kernel/events/uprobes.c | 2 +- mm/huge_memory.c | 11 +++-------- mm/khugepaged.c | 2 +- mm/memory.c | 13 ++++--------- mm/migrate.c | 7 +------ mm/swapfile.c | 2 +- mm/userfaultfd.c | 7 +------ 8 files changed, 17 insertions(+), 33 deletions(-) diff --git a/include/linux/page-flags.h b/include/linux/page-flags.h index f91cb8898ff0..2481f9ad5f5b 100644 --- a/include/linux/page-flags.h +++ b/include/linux/page-flags.h @@ -508,10 +508,14 @@ static inline int PageUptodate(struct page *page) return ret; } +/* + * Only use this function when there is a following write barrier, e.g., + * an explicit smp_wmb() and/or the page will be added to page or swap + * cache locked. + */ static __always_inline void __SetPageUptodate(struct page *page) { VM_BUG_ON_PAGE(PageTail(page), page); - smp_wmb(); __set_bit(PG_uptodate, &page->flags); } diff --git a/kernel/events/uprobes.c b/kernel/events/uprobes.c index 7069785e2e52..6ceae92afcc0 100644 --- a/kernel/events/uprobes.c +++ b/kernel/events/uprobes.c @@ -194,7 +194,7 @@ static int __replace_page(struct vm_area_struct *vma, unsigned long addr, flush_cache_page(vma, addr, pte_pfn(*pvmw.pte)); ptep_clear_flush_notify(vma, addr, pvmw.pte); - /* commit non-atomic ops before exposing to fast gup */ + /* commit non-atomic ops and user data */ smp_wmb(); set_pte_at_notify(mm, addr, pvmw.pte, mk_pte(new_page, vma->vm_page_prot)); diff --git a/mm/huge_memory.c b/mm/huge_memory.c index 21d271a29d96..101e7bd61e8f 100644 --- a/mm/huge_memory.c +++ b/mm/huge_memory.c @@ -580,11 +580,6 @@ static vm_fault_t __do_huge_pmd_anonymous_page(struct vm_fault *vmf, } clear_huge_page(page, vmf->address, HPAGE_PMD_NR); - /* - * The memory barrier inside __SetPageUptodate makes sure that - * clear_huge_page writes become visible before the set_pmd_at() - * write. 
@@ -616,7 +611,7 @@ static vm_fault_t __do_huge_pmd_anonymous_page(struct vm_fault *vmf,
 		mem_cgroup_commit_charge(page, memcg, false, true);
 		lru_cache_add_active_or_unevictable(page, vma);
 		pgtable_trans_huge_deposit(vma->vm_mm, vmf->pmd, pgtable);
-		/* commit non-atomic ops before exposing to fast gup */
+		/* commit non-atomic ops and user data */
 		smp_wmb();
 		set_pmd_at(vma->vm_mm, haddr, vmf->pmd, entry);
 		add_mm_counter(vma->vm_mm, MM_ANONPAGES, HPAGE_PMD_NR);
@@ -1278,7 +1273,7 @@ static vm_fault_t do_huge_pmd_wp_page_fallback(struct vm_fault *vmf,
 	}
 	kfree(pages);
 
-	/* commit non-atomic ops before exposing to fast gup */
+	/* commit non-atomic ops and user data */
 	smp_wmb(); /* make pte visible before pmd */
 	pmd_populate(vma->vm_mm, vmf->pmd, pgtable);
@@ -1427,7 +1422,7 @@ vm_fault_t do_huge_pmd_wp_page(struct vm_fault *vmf, pmd_t orig_pmd)
 		page_add_new_anon_rmap(new_page, vma, haddr, true);
 		mem_cgroup_commit_charge(new_page, memcg, false, true);
 		lru_cache_add_active_or_unevictable(new_page, vma);
-		/* commit non-atomic ops before exposing to fast gup */
+		/* commit non-atomic ops and user data */
 		smp_wmb();
 		set_pmd_at(vma->vm_mm, haddr, vmf->pmd, entry);
 		update_mmu_cache_pmd(vma, vmf->address, vmf->pmd);
diff --git a/mm/khugepaged.c b/mm/khugepaged.c
index f2901edce6de..668918842712 100644
--- a/mm/khugepaged.c
+++ b/mm/khugepaged.c
@@ -1074,7 +1074,7 @@ static void collapse_huge_page(struct mm_struct *mm,
 	count_memcg_events(memcg, THP_COLLAPSE_ALLOC, 1);
 	lru_cache_add_active_or_unevictable(new_page, vma);
 	pgtable_trans_huge_deposit(mm, pmd, pgtable);
-	/* commit non-atomic ops before exposing to fast gup */
+	/* commit non-atomic ops and user data */
 	smp_wmb();
 	set_pmd_at(mm, address, pmd, _pmd);
 	update_mmu_cache_pmd(vma, address, pmd);
diff --git a/mm/memory.c b/mm/memory.c
index 6dabbc3cd3b7..db001d919e60 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -2367,7 +2367,7 @@ static vm_fault_t wp_page_copy(struct vm_fault *vmf)
 		 * mmu page tables (such as kvm shadow page tables), we want the
 		 * new page to be mapped directly into the secondary page table.
 		 */
-		/* commit non-atomic ops before exposing to fast gup */
+		/* commit non-atomic ops and user data */
 		smp_wmb();
 		set_pte_at_notify(mm, vmf->address, vmf->pte, entry);
 		update_mmu_cache(vma, vmf->address, vmf->pte);
@@ -2887,7 +2887,7 @@ vm_fault_t do_swap_page(struct vm_fault *vmf)
 		page_add_new_anon_rmap(page, vma, vmf->address, false);
 		mem_cgroup_commit_charge(page, memcg, false, false);
 		lru_cache_add_active_or_unevictable(page, vma);
-		/* commit non-atomic ops before exposing to fast gup */
+		/* commit non-atomic ops and user data */
 		smp_wmb();
 	} else {
 		do_page_add_anon_rmap(page, vma, vmf->address, exclusive);
@@ -3006,11 +3006,6 @@ static vm_fault_t do_anonymous_page(struct vm_fault *vmf)
 					false))
 		goto oom_free_page;
 
-	/*
-	 * The memory barrier inside __SetPageUptodate makes sure that
-	 * preceeding stores to the page contents become visible before
-	 * the set_pte_at() write.
-	 */
 	__SetPageUptodate(page);
 
 	entry = mk_pte(page, vma->vm_page_prot);
@@ -3038,7 +3033,7 @@ static vm_fault_t do_anonymous_page(struct vm_fault *vmf)
 	page_add_new_anon_rmap(page, vma, vmf->address, false);
 	mem_cgroup_commit_charge(page, memcg, false, false);
 	lru_cache_add_active_or_unevictable(page, vma);
-	/* commit non-atomic ops before exposing to fast gup */
+	/* commit non-atomic ops and user data */
 	smp_wmb();
 setpte:
 	set_pte_at(vma->vm_mm, vmf->address, vmf->pte, entry);
@@ -3303,7 +3298,7 @@ vm_fault_t alloc_set_pte(struct vm_fault *vmf, struct mem_cgroup *memcg,
 		page_add_new_anon_rmap(page, vma, vmf->address, false);
 		mem_cgroup_commit_charge(page, memcg, false, false);
 		lru_cache_add_active_or_unevictable(page, vma);
-		/* commit non-atomic ops before exposing to fast gup */
+		/* commit non-atomic ops and user data */
 		smp_wmb();
 	} else {
 		inc_mm_counter_fast(vma->vm_mm, mm_counter_file(page));
diff --git a/mm/migrate.c b/mm/migrate.c
index 943d147ecc3e..dc0ab9fbe36e 100644
--- a/mm/migrate.c
+++ b/mm/migrate.c
@@ -2729,11 +2729,6 @@ static void migrate_vma_insert_page(struct migrate_vma *migrate,
 	if (mem_cgroup_try_charge(page, vma->vm_mm, GFP_KERNEL, &memcg, false))
 		goto abort;
 
-	/*
-	 * The memory barrier inside __SetPageUptodate makes sure that
-	 * preceding stores to the page contents become visible before
-	 * the set_pte_at() write.
-	 */
 	__SetPageUptodate(page);
 
 	if (is_zone_device_page(page)) {
@@ -2783,7 +2778,7 @@ static void migrate_vma_insert_page(struct migrate_vma *migrate,
 	lru_cache_add_active_or_unevictable(page, vma);
 	get_page(page);
 
-	/* commit non-atomic ops before exposing to fast gup */
+	/* commit non-atomic ops and user data */
 	smp_wmb();
 	if (flush) {
 		flush_cache_page(vma, addr, pte_pfn(*ptep));
diff --git a/mm/swapfile.c b/mm/swapfile.c
index 5c5547053ee0..dc9f1b1ba1a6 100644
--- a/mm/swapfile.c
+++ b/mm/swapfile.c
@@ -1887,7 +1887,7 @@ static int unuse_pte(struct vm_area_struct *vma, pmd_t *pmd,
 		page_add_new_anon_rmap(page, vma, addr, false);
 		mem_cgroup_commit_charge(page, memcg, false, false);
 		lru_cache_add_active_or_unevictable(page, vma);
-		/* commit non-atomic ops before exposing to fast gup */
+		/* commit non-atomic ops and user data */
 		smp_wmb();
 	}
 	set_pte_at(vma->vm_mm, addr, pte,
diff --git a/mm/userfaultfd.c b/mm/userfaultfd.c
index 4f92913242a1..34083680869e 100644
--- a/mm/userfaultfd.c
+++ b/mm/userfaultfd.c
@@ -58,11 +58,6 @@ static int mcopy_atomic_pte(struct mm_struct *dst_mm,
 		*pagep = NULL;
 	}
 
-	/*
-	 * The memory barrier inside __SetPageUptodate makes sure that
-	 * preceeding stores to the page contents become visible before
-	 * the set_pte_at() write.
-	 */
 	__SetPageUptodate(page);
 
 	ret = -ENOMEM;
@@ -92,7 +87,7 @@ static int mcopy_atomic_pte(struct mm_struct *dst_mm,
 	mem_cgroup_commit_charge(page, memcg, false, false);
 	lru_cache_add_active_or_unevictable(page, dst_vma);
 
-	/* commit non-atomic ops before exposing to fast gup */
+	/* commit non-atomic ops and user data */
 	smp_wmb();
 	set_pte_at(dst_mm, dst_addr, dst_pte, _dst_pte);
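Taken together, the series moves the write barrier out of __SetPageUptodate() and into its callers. A user-space sketch of the resulting contract, with invented names and a C11 release fence standing in for smp_wmb(), is below; the plain bit set on its own no longer orders anything, so the caller must issue the barrier (or add the page to a locked cache, per the new comment in page-flags.h) before the publishing store:

#include <stdatomic.h>
#include <stdio.h>

struct page_like { unsigned int flags; int data; };

static struct page_like new_page;
static _Atomic(struct page_like *) pte;  /* set_pte_at() analogue */

/* __SetPageUptodate() analogue after 4/4: no built-in barrier any more */
static void set_uptodate_plain(struct page_like *p)
{
        p->flags |= 1u;
}

static void fault_in_page(void)
{
        new_page.data = 42;            /* copy in the user data    */
        set_uptodate_plain(&new_page); /* plain, unordered bit set */
        /* the caller-provided "commit non-atomic ops and user data" barrier */
        atomic_thread_fence(memory_order_release);
        atomic_store_explicit(&pte, &new_page, memory_order_relaxed);
}

int main(void)
{
        struct page_like *p;

        fault_in_page();
        p = atomic_load_explicit(&pte, memory_order_relaxed);
        printf("flags=%#x data=%d\n", p->flags, p->data);
        return 0;
}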