From patchwork Wed Jun 7 20:48:57 2017
X-Patchwork-Submitter: Ross Zwisler
X-Patchwork-Id: 9772809
From: Ross Zwisler
To: Andrew Morton, linux-kernel@vger.kernel.org
Cc: Ross Zwisler, "Darrick J. Wong", "Theodore Ts'o", Alexander Viro,
    Andreas Dilger, Christoph Hellwig, Dan Williams, Dave Hansen,
    Ingo Molnar, Jan Kara, Jonathan Corbet, Matthew Wilcox,
    Steven Rostedt, linux-doc@vger.kernel.org, linux-ext4@vger.kernel.org,
    linux-fsdevel@vger.kernel.org, linux-mm@kvack.org,
    linux-nvdimm@lists.01.org, linux-xfs@vger.kernel.org
Subject: [PATCH 1/3] mm: add vm_insert_mixed_mkwrite()
Date: Wed, 7 Jun 2017 14:48:57 -0600
Message-Id: <20170607204859.13104-1-ross.zwisler@linux.intel.com>

To be able to use the common 4k zero page in DAX we need to have our PTE
fault path look more like our PMD fault path, where a PTE entry can be
marked as dirty and writeable as it is first inserted rather than waiting
for a follow-up dax_pfn_mkwrite() => finish_mkwrite_fault() call.

Right now we can rely on having a dax_pfn_mkwrite() call because we can
distinguish between these two cases in do_wp_page():

	case 1: 4k zero page => writeable DAX storage
	case 2: read-only DAX storage => writeable DAX storage

This distinction is made via vm_normal_page(), but vm_normal_page()
returns false for the common 4k zero page just as it does for DAX ptes.
Instead of special casing the DAX + 4k zero page case, we will simplify
our DAX PTE page fault sequence so that it matches our DAX PMD sequence
and get rid of dax_pfn_mkwrite() completely.

This means that insert_pfn() needs to follow the lead of insert_pfn_pmd()
and allow us to pass in a 'mkwrite' flag.  If 'mkwrite' is set,
insert_pfn() will do the work that was previously done by wp_page_reuse()
as part of the dax_pfn_mkwrite() call path.
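For illustration only (not part of this patch): a minimal sketch of how a
fault handler could call the new interface.  The function name
dax_example_insert_pte(), its arguments, and the error mapping below are
hypothetical, assuming the caller has already looked up the pfn_t for the
faulting address:

	/*
	 * Hypothetical caller sketch, not code from this series.  With
	 * 'write' set, the PTE is installed already dirty and writeable,
	 * so no follow-up dax_pfn_mkwrite() => finish_mkwrite_fault()
	 * round trip is needed.
	 */
	static int dax_example_insert_pte(struct vm_fault *vmf, pfn_t pfn,
					  bool write)
	{
		int error;

		error = vm_insert_mixed_mkwrite(vmf->vma, vmf->address, pfn,
						write);
		/* Translate insert_pfn()-style errnos into fault codes. */
		if (error == -ENOMEM)
			return VM_FAULT_OOM;
		if (error < 0 && error != -EBUSY)
			return VM_FAULT_SIGBUS;
		return VM_FAULT_NOPAGE;
	}

With mkwrite=false the call degenerates to the old vm_insert_mixed()
behaviour, which is why existing callers can keep using the new static
inline wrapper unchanged.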
Signed-off-by: Ross Zwisler
---
 include/linux/mm.h |  9 +++++++--
 mm/memory.c        | 21 ++++++++++++++-------
 2 files changed, 21 insertions(+), 9 deletions(-)

diff --git a/include/linux/mm.h b/include/linux/mm.h
index b892e95..11e323a 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -2294,10 +2294,15 @@ int vm_insert_pfn(struct vm_area_struct *vma, unsigned long addr,
 			unsigned long pfn);
 int vm_insert_pfn_prot(struct vm_area_struct *vma, unsigned long addr,
 			unsigned long pfn, pgprot_t pgprot);
-int vm_insert_mixed(struct vm_area_struct *vma, unsigned long addr,
-			pfn_t pfn);
+int vm_insert_mixed_mkwrite(struct vm_area_struct *vma, unsigned long addr,
+			pfn_t pfn, bool mkwrite);
 int vm_iomap_memory(struct vm_area_struct *vma, phys_addr_t start, unsigned long len);
+static inline int vm_insert_mixed(struct vm_area_struct *vma,
+		unsigned long addr, pfn_t pfn)
+{
+	return vm_insert_mixed_mkwrite(vma, addr, pfn, false);
+}
 
 
 struct page *follow_page_mask(struct vm_area_struct *vma,
 			      unsigned long address, unsigned int foll_flags,
diff --git a/mm/memory.c b/mm/memory.c
index 2e65df1..8d0eef6 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -1646,7 +1646,7 @@ int vm_insert_page(struct vm_area_struct *vma, unsigned long addr,
 EXPORT_SYMBOL(vm_insert_page);
 
 static int insert_pfn(struct vm_area_struct *vma, unsigned long addr,
-			pfn_t pfn, pgprot_t prot)
+			pfn_t pfn, pgprot_t prot, bool mkwrite)
 {
 	struct mm_struct *mm = vma->vm_mm;
 	int retval;
@@ -1658,7 +1658,7 @@ static int insert_pfn(struct vm_area_struct *vma, unsigned long addr,
 	if (!pte)
 		goto out;
 	retval = -EBUSY;
-	if (!pte_none(*pte))
+	if (!pte_none(*pte) && !mkwrite)
 		goto out_unlock;
 
 	/* Ok, finally just insert the thing.. */
@@ -1666,6 +1666,12 @@ static int insert_pfn(struct vm_area_struct *vma, unsigned long addr,
 		entry = pte_mkdevmap(pfn_t_pte(pfn, prot));
 	else
 		entry = pte_mkspecial(pfn_t_pte(pfn, prot));
+
+	if (mkwrite) {
+		entry = pte_mkyoung(entry);
+		entry = maybe_mkwrite(pte_mkdirty(entry), vma);
+	}
+
 	set_pte_at(mm, addr, pte, entry);
 	update_mmu_cache(vma, addr, pte); /* XXX: why not for insert_page? */
 
@@ -1736,14 +1742,15 @@ int vm_insert_pfn_prot(struct vm_area_struct *vma, unsigned long addr,
 
 	track_pfn_insert(vma, &pgprot, __pfn_to_pfn_t(pfn, PFN_DEV));
 
-	ret = insert_pfn(vma, addr, __pfn_to_pfn_t(pfn, PFN_DEV), pgprot);
+	ret = insert_pfn(vma, addr, __pfn_to_pfn_t(pfn, PFN_DEV), pgprot,
+			false);
 
 	return ret;
 }
 EXPORT_SYMBOL(vm_insert_pfn_prot);
 
-int vm_insert_mixed(struct vm_area_struct *vma, unsigned long addr,
-			pfn_t pfn)
+int vm_insert_mixed_mkwrite(struct vm_area_struct *vma, unsigned long addr,
+			pfn_t pfn, bool mkwrite)
 {
 	pgprot_t pgprot = vma->vm_page_prot;
 
@@ -1772,9 +1779,9 @@ int vm_insert_mixed(struct vm_area_struct *vma, unsigned long addr,
 		page = pfn_to_page(pfn_t_to_pfn(pfn));
 		return insert_page(vma, addr, page, pgprot);
 	}
-	return insert_pfn(vma, addr, pfn, pgprot);
+	return insert_pfn(vma, addr, pfn, pgprot, mkwrite);
 }
-EXPORT_SYMBOL(vm_insert_mixed);
+EXPORT_SYMBOL(vm_insert_mixed_mkwrite);
 
 /*
  * maps a range of physical memory into the requested pages. the old