From patchwork Fri Jun 8 17:00:03 2018
X-Patchwork-Submitter: Dave Jiang <dave.jiang@intel.com>
X-Patchwork-Id: 10454769
Subject: [PATCH] dax: remove VM_MIXEDMAP for fsdax and device dax
From: Dave Jiang <dave.jiang@intel.com>
To: akpm@linux-foundation.org
Cc: linux-fsdevel@vger.kernel.org, linux-mm@kvack.org,
    dan.j.williams@intel.com, jack@suse.cz, linux-nvdimm@lists.01.org
Date:
Fri, 08 Jun 2018 10:00:03 -0700
Message-ID: <152847720311.55924.16999195879201817653.stgit@djiang5-desk3.ch.intel.com>
User-Agent: StGit/0.17.1-dirty
MIME-Version: 1.0

This patch is reworked from an earlier patch that Dan posted:
https://patchwork.kernel.org/patch/10131727/

VM_MIXEDMAP is used by dax to indicate to mm paths like vm_normal_page()
that the memory page it is dealing with is not typical memory from the
linear map. The get_user_pages_fast() path, since it does not resolve the
vma, is already using {pte,pmd}_devmap() as a stand-in for VM_MIXEDMAP, so
we use that as a VM_MIXEDMAP replacement in some locations. In the cases
where there is no pte to consult we fall back to vma_is_dax() to detect
the VM_MIXEDMAP special case.

Now that we have explicit driver pfn_t-flag opt-in/opt-out for
get_user_pages() support for DAX we can stop setting VM_MIXEDMAP. This
also means we no longer need to worry about safely manipulating vm_flags
in a future where we support dynamically changing the dax mode of a file.

DAX should also now be supported with madvise_behavior(), vma_merge(), and
copy_page_range().

This patch has been tested against the ndctl unit tests. It has also been
tested against xfstests commit 625515d, using fake pmem created by memmap,
and no additional issues have been observed.
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
Acked-by: Dan Williams <dan.j.williams@intel.com>
---
 drivers/dax/device.c |    2 +-
 fs/ext2/file.c       |    1 -
 fs/ext4/file.c       |    2 +-
 fs/xfs/xfs_file.c    |    2 +-
 mm/hmm.c             |    6 ++++--
 mm/huge_memory.c     |    4 ++--
 mm/ksm.c             |    3 +++
 mm/memory.c          |    6 ++++++
 mm/migrate.c         |    3 ++-
 mm/mlock.c           |    3 ++-
 mm/mmap.c            |    9 +++++----
 11 files changed, 27 insertions(+), 14 deletions(-)

diff --git a/drivers/dax/device.c b/drivers/dax/device.c
index b33e45ee4f70..a9486f1374e4 100644
--- a/drivers/dax/device.c
+++ b/drivers/dax/device.c
@@ -487,7 +487,7 @@ static int dax_mmap(struct file *filp, struct vm_area_struct *vma)
 		return rc;
 
 	vma->vm_ops = &dax_vm_ops;
-	vma->vm_flags |= VM_MIXEDMAP | VM_HUGEPAGE;
+	vma->vm_flags |= VM_HUGEPAGE;
 	return 0;
 }
 
diff --git a/fs/ext2/file.c b/fs/ext2/file.c
index 047c327a6b23..28b2609f25c1 100644
--- a/fs/ext2/file.c
+++ b/fs/ext2/file.c
@@ -126,7 +126,6 @@ static int ext2_file_mmap(struct file *file, struct vm_area_struct *vma)
 
 	file_accessed(file);
 	vma->vm_ops = &ext2_dax_vm_ops;
-	vma->vm_flags |= VM_MIXEDMAP;
 	return 0;
 }
 #else
diff --git a/fs/ext4/file.c b/fs/ext4/file.c
index fb6f023622fe..61001b8e25ec 100644
--- a/fs/ext4/file.c
+++ b/fs/ext4/file.c
@@ -373,7 +373,7 @@ static int ext4_file_mmap(struct file *file, struct vm_area_struct *vma)
 	file_accessed(file);
 	if (IS_DAX(file_inode(file))) {
 		vma->vm_ops = &ext4_dax_vm_ops;
-		vma->vm_flags |= VM_MIXEDMAP | VM_HUGEPAGE;
+		vma->vm_flags |= VM_HUGEPAGE;
 	} else {
 		vma->vm_ops = &ext4_file_vm_ops;
 	}
diff --git a/fs/xfs/xfs_file.c b/fs/xfs/xfs_file.c
index 19b0c3e0e232..021056ad6de0 100644
--- a/fs/xfs/xfs_file.c
+++ b/fs/xfs/xfs_file.c
@@ -1170,7 +1170,7 @@ xfs_file_mmap(
 	file_accessed(filp);
 	vma->vm_ops = &xfs_file_vm_ops;
 	if (IS_DAX(file_inode(filp)))
-		vma->vm_flags |= VM_MIXEDMAP | VM_HUGEPAGE;
+		vma->vm_flags |= VM_HUGEPAGE;
 	return 0;
 }
 
diff --git a/mm/hmm.c b/mm/hmm.c
index de7b6bf77201..f40e8add84b5 100644
--- a/mm/hmm.c
+++ b/mm/hmm.c
@@ -676,7 +676,8 @@ int hmm_vma_get_pfns(struct hmm_range *range)
 		return -EINVAL;
 
 	/* FIXME support hugetlb fs */
-	if (is_vm_hugetlb_page(vma) || (vma->vm_flags & VM_SPECIAL)) {
+	if (is_vm_hugetlb_page(vma) || (vma->vm_flags & VM_SPECIAL) ||
+			vma_is_dax(vma)) {
 		hmm_pfns_special(range);
 		return -EINVAL;
 	}
@@ -849,7 +850,8 @@ int hmm_vma_fault(struct hmm_range *range, bool block)
 		return -EINVAL;
 
 	/* FIXME support hugetlb fs */
-	if (is_vm_hugetlb_page(vma) || (vma->vm_flags & VM_SPECIAL)) {
+	if (is_vm_hugetlb_page(vma) || (vma->vm_flags & VM_SPECIAL) ||
+			vma_is_dax(vma)) {
 		hmm_pfns_special(range);
 		return -EINVAL;
 	}
diff --git a/mm/huge_memory.c b/mm/huge_memory.c
index 6af976472a5d..d89ba3564562 100644
--- a/mm/huge_memory.c
+++ b/mm/huge_memory.c
@@ -765,11 +765,11 @@ vm_fault_t vmf_insert_pfn_pmd(struct vm_area_struct *vma, unsigned long addr,
 	 * but we need to be consistent with PTEs and architectures that
 	 * can't support a 'special' bit.
 	 */
-	BUG_ON(!(vma->vm_flags & (VM_PFNMAP|VM_MIXEDMAP)));
+	BUG_ON(!(vma->vm_flags & (VM_PFNMAP|VM_MIXEDMAP)) &&
+			!pfn_t_devmap(pfn));
 	BUG_ON((vma->vm_flags & (VM_PFNMAP|VM_MIXEDMAP)) ==
 						(VM_PFNMAP|VM_MIXEDMAP));
 	BUG_ON((vma->vm_flags & VM_PFNMAP) && is_cow_mapping(vma->vm_flags));
-	BUG_ON(!pfn_t_devmap(pfn));
 
 	if (addr < vma->vm_start || addr >= vma->vm_end)
 		return VM_FAULT_SIGBUS;
diff --git a/mm/ksm.c b/mm/ksm.c
index e3cbf9a92f3c..d30393e486d4 100644
--- a/mm/ksm.c
+++ b/mm/ksm.c
@@ -2400,6 +2400,9 @@ int ksm_madvise(struct vm_area_struct *vma, unsigned long start,
 				 VM_HUGETLB | VM_MIXEDMAP))
 			return 0;		/* just ignore the advice */
 
+		if (vma_is_dax(vma))
+			return 0;
+
 #ifdef VM_SAO
 		if (*vm_flags & VM_SAO)
 			return 0;
diff --git a/mm/memory.c b/mm/memory.c
index 01f5464e0fd2..2b364a5ed4d5 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -858,6 +858,10 @@ struct page *_vm_normal_page(struct vm_area_struct *vma, unsigned long addr,
 			return NULL;
 		}
 	}
+
+	if (pte_devmap(pte))
+		return NULL;
+
 	print_bad_pte(vma, addr, pte, NULL);
 	return NULL;
 }
@@ -921,6 +925,8 @@ struct page *vm_normal_page_pmd(struct vm_area_struct *vma, unsigned long addr,
 		}
 	}
 
+	if (pmd_devmap(pmd))
+		return NULL;
 	if (is_zero_pfn(pfn))
 		return NULL;
 	if (unlikely(pfn > highest_memmap_pfn))
diff --git a/mm/migrate.c b/mm/migrate.c
index 8c0af0f7cab1..4a83268e23c2 100644
--- a/mm/migrate.c
+++ b/mm/migrate.c
@@ -2951,7 +2951,8 @@ int migrate_vma(const struct migrate_vma_ops *ops,
 	/* Sanity check the arguments */
 	start &= PAGE_MASK;
 	end &= PAGE_MASK;
-	if (!vma || is_vm_hugetlb_page(vma) || (vma->vm_flags & VM_SPECIAL))
+	if (!vma || is_vm_hugetlb_page(vma) || (vma->vm_flags & VM_SPECIAL) ||
+			vma_is_dax(vma))
 		return -EINVAL;
 	if (start < vma->vm_start || start >= vma->vm_end)
 		return -EINVAL;
diff --git a/mm/mlock.c b/mm/mlock.c
index 74e5a6547c3d..41cc47e28ad6 100644
--- a/mm/mlock.c
+++ b/mm/mlock.c
@@ -527,7 +527,8 @@ static int mlock_fixup(struct vm_area_struct *vma, struct vm_area_struct **prev,
 	vm_flags_t old_flags = vma->vm_flags;
 
 	if (newflags == vma->vm_flags || (vma->vm_flags & VM_SPECIAL) ||
-	    is_vm_hugetlb_page(vma) || vma == get_gate_vma(current->mm))
+	    is_vm_hugetlb_page(vma) || vma == get_gate_vma(current->mm) ||
+	    vma_is_dax(vma))
 		/* don't set VM_LOCKED or VM_LOCKONFAULT and don't count */
 		goto out;
 
diff --git a/mm/mmap.c b/mm/mmap.c
index 78e14facdb6e..5db93f58fdb1 100644
--- a/mm/mmap.c
+++ b/mm/mmap.c
@@ -1796,11 +1796,12 @@ unsigned long mmap_region(struct file *file, unsigned long addr,
 	vm_stat_account(mm, vm_flags, len >> PAGE_SHIFT);
 
 	if (vm_flags & VM_LOCKED) {
-		if (!((vm_flags & VM_SPECIAL) || is_vm_hugetlb_page(vma) ||
-					vma == get_gate_vma(current->mm)))
-			mm->locked_vm += (len >> PAGE_SHIFT);
-		else
+		if ((vm_flags & VM_SPECIAL) || vma_is_dax(vma) ||
+				is_vm_hugetlb_page(vma) ||
+				vma == get_gate_vma(current->mm))
 			vma->vm_flags &= VM_LOCKED_CLEAR_MASK;
+		else
+			mm->locked_vm += (len >> PAGE_SHIFT);
 	}
 
 	if (file)