Message ID | 147318058165.30325.16762406881120129093.stgit@dwillia2-desk3.amr.corp.intel.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
On Tue, 06 Sep 2016 09:49:41 -0700 Dan Williams <dan.j.williams@intel.com> wrote: > track_pfn_insert() is marking dax mappings as uncacheable. > > It is used to keep mappings attributes consistent across a remapped range. > However, since dax regions are never registered via track_pfn_remap(), the > caching mode lookup for dax pfns always returns _PAGE_CACHE_MODE_UC. We do not > use track_pfn_insert() in the dax-pte path, and we always want to use the > pgprot of the vma itself, so drop this call. > > Cc: Ross Zwisler <ross.zwisler@linux.intel.com> > Cc: Matthew Wilcox <mawilcox@microsoft.com> > Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> > Cc: Andrew Morton <akpm@linux-foundation.org> > Cc: Nilesh Choudhury <nilesh.choudhury@oracle.com> > Reported-by: Kai Zhang <kai.ka.zhang@oracle.com> > Reported-by: Toshi Kani <toshi.kani@hpe.com> > Cc: <stable@vger.kernel.org> > Signed-off-by: Dan Williams <dan.j.williams@intel.com> Changelog fails to explain the user-visible effects of the patch. The stable maintainer(s) will look at this and wonder "ytf was I sent this". After fixing that, Acked-by: Andrew Morton <akpm@linux-foundation.org>
On Tue, Sep 6, 2016 at 1:17 PM, Andrew Morton <akpm@linux-foundation.org> wrote: > On Tue, 06 Sep 2016 09:49:41 -0700 Dan Williams <dan.j.williams@intel.com> wrote: > >> track_pfn_insert() is marking dax mappings as uncacheable. >> >> It is used to keep mappings attributes consistent across a remapped range. >> However, since dax regions are never registered via track_pfn_remap(), the >> caching mode lookup for dax pfns always returns _PAGE_CACHE_MODE_UC. We do not >> use track_pfn_insert() in the dax-pte path, and we always want to use the >> pgprot of the vma itself, so drop this call. >> >> Cc: Ross Zwisler <ross.zwisler@linux.intel.com> >> Cc: Matthew Wilcox <mawilcox@microsoft.com> >> Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> >> Cc: Andrew Morton <akpm@linux-foundation.org> >> Cc: Nilesh Choudhury <nilesh.choudhury@oracle.com> >> Reported-by: Kai Zhang <kai.ka.zhang@oracle.com> >> Reported-by: Toshi Kani <toshi.kani@hpe.com> >> Cc: <stable@vger.kernel.org> >> Signed-off-by: Dan Williams <dan.j.williams@intel.com> > > Changelog fails to explain the user-visible effects of the patch. The > stable maintainer(s) will look at this and wonder "ytf was I sent > this". True, I'll change it to this: track_pfn_insert() is marking dax mappings as uncacheable rendering them impractical for application usage. DAX-pte mappings are cached and the goal of establishing DAX-pmd mappings is to attain more performance, not dramatically less (3 orders of magnitude). Deleting the call to track_pfn_insert() in vmf_insert_pfn_pmd() lets the default pgprot (write-back cache enabled) from the vma be used for the mapping which yields the expected performance improvement over DAX-pte mappings. track_pfn_insert() is meant to keep the cache mode for a given range synchronized across different users of remap_pfn_range() and vm_insert_pfn_prot(). DAX uses neither of those mapping methods, and the pmem driver is already marking its memory ranges as write-back cache enabled. So, removing the call to track_pfn_insert() leaves the kernel no worse off than the current situation where a user could map the range via /dev/mem with an incompatible cache mode compared to the driver. > After fixing that, > > Acked-by: Andrew Morton <akpm@linux-foundation.org> Thanks Andrew!
On Tue, 2016-09-06 at 14:52 -0700, Dan Williams wrote: > On Tue, Sep 6, 2016 at 1:17 PM, Andrew Morton <akpm@linux-foundation. > org> wrote: > > > > On Tue, 06 Sep 2016 09:49:41 -0700 Dan Williams <dan.j.williams@int > > el.com> wrote: > > > > > > > > track_pfn_insert() is marking dax mappings as uncacheable. > > > > > > It is used to keep mappings attributes consistent across a > > > remapped range. However, since dax regions are never registered > > > via track_pfn_remap(), the caching mode lookup for dax pfns > > > always returns _PAGE_CACHE_MODE_UC. We do not use > > > track_pfn_insert() in the dax-pte path, and we always want to use > > > the pgprot of the vma itself, so drop this call. > > > > > > Cc: Ross Zwisler <ross.zwisler@linux.intel.com> > > > Cc: Matthew Wilcox <mawilcox@microsoft.com> > > > Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> > > > Cc: Andrew Morton <akpm@linux-foundation.org> > > > Cc: Nilesh Choudhury <nilesh.choudhury@oracle.com> > > > Reported-by: Kai Zhang <kai.ka.zhang@oracle.com> > > > Reported-by: Toshi Kani <toshi.kani@hpe.com> > > > Cc: <stable@vger.kernel.org> > > > Signed-off-by: Dan Williams <dan.j.williams@intel.com> > > > > Changelog fails to explain the user-visible effects of the > > patch. The stable maintainer(s) will look at this and wonder "ytf > > was I sent this". > > True, I'll change it to this: > > track_pfn_insert() is marking dax mappings as uncacheable rendering > them impractical for application usage. DAX-pte mappings are cached > and the goal of establishing DAX-pmd mappings is to attain more > performance, not dramatically less (3 orders of magnitude). > > Deleting the call to track_pfn_insert() in vmf_insert_pfn_pmd() lets > the default pgprot (write-back cache enabled) from the vma be used > for the mapping which yields the expected performance improvement > over DAX-pte mappings. > > track_pfn_insert() is meant to keep the cache mode for a given range > synchronized across different users of remap_pfn_range() and > vm_insert_pfn_prot(). DAX uses neither of those mapping methods, and > the pmem driver is already marking its memory ranges as write-back > cache enabled. So, removing the call to track_pfn_insert() leaves > the kernel no worse off than the current situation where a user could > map the range via /dev/mem with an incompatible cache mode compared > to the driver. I think devm_memremap_pages() should call reserve_memtype() on x86 to keep it consistent with devm_memremap() on this regard. We may need an arch stub for reserve_memtype(), though. Then, track_pfn_insert() should have no issue in this case. Thanks, -Toshi
On Wed, Sep 7, 2016 at 12:39 PM, Kani, Toshimitsu <toshi.kani@hpe.com> wrote: > On Tue, 2016-09-06 at 14:52 -0700, Dan Williams wrote: >> On Tue, Sep 6, 2016 at 1:17 PM, Andrew Morton <akpm@linux-foundation. >> org> wrote: >> > >> > On Tue, 06 Sep 2016 09:49:41 -0700 Dan Williams <dan.j.williams@int >> > el.com> wrote: >> > >> > > >> > > track_pfn_insert() is marking dax mappings as uncacheable. >> > > >> > > It is used to keep mappings attributes consistent across a >> > > remapped range. However, since dax regions are never registered >> > > via track_pfn_remap(), the caching mode lookup for dax pfns >> > > always returns _PAGE_CACHE_MODE_UC. We do not use >> > > track_pfn_insert() in the dax-pte path, and we always want to use >> > > the pgprot of the vma itself, so drop this call. >> > > >> > > Cc: Ross Zwisler <ross.zwisler@linux.intel.com> >> > > Cc: Matthew Wilcox <mawilcox@microsoft.com> >> > > Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> >> > > Cc: Andrew Morton <akpm@linux-foundation.org> >> > > Cc: Nilesh Choudhury <nilesh.choudhury@oracle.com> >> > > Reported-by: Kai Zhang <kai.ka.zhang@oracle.com> >> > > Reported-by: Toshi Kani <toshi.kani@hpe.com> >> > > Cc: <stable@vger.kernel.org> >> > > Signed-off-by: Dan Williams <dan.j.williams@intel.com> >> > >> > Changelog fails to explain the user-visible effects of the >> > patch. The stable maintainer(s) will look at this and wonder "ytf >> > was I sent this". >> >> True, I'll change it to this: >> >> track_pfn_insert() is marking dax mappings as uncacheable rendering >> them impractical for application usage. DAX-pte mappings are cached >> and the goal of establishing DAX-pmd mappings is to attain more >> performance, not dramatically less (3 orders of magnitude). >> >> Deleting the call to track_pfn_insert() in vmf_insert_pfn_pmd() lets >> the default pgprot (write-back cache enabled) from the vma be used >> for the mapping which yields the expected performance improvement >> over DAX-pte mappings. >> >> track_pfn_insert() is meant to keep the cache mode for a given range >> synchronized across different users of remap_pfn_range() and >> vm_insert_pfn_prot(). DAX uses neither of those mapping methods, and >> the pmem driver is already marking its memory ranges as write-back >> cache enabled. So, removing the call to track_pfn_insert() leaves >> the kernel no worse off than the current situation where a user could >> map the range via /dev/mem with an incompatible cache mode compared >> to the driver. > > I think devm_memremap_pages() should call reserve_memtype() on x86 to > keep it consistent with devm_memremap() on this regard. We may need an > arch stub for reserve_memtype(), though. Then, track_pfn_insert() > should have no issue in this case. Yes, indeed! In fact I already have that re-write getting 0day coverage before posting. It occurred to me while re-writing the changelog per Andrew's prompting.
diff --git a/mm/huge_memory.c b/mm/huge_memory.c index a6abd76baa72..338eff05c77a 100644 --- a/mm/huge_memory.c +++ b/mm/huge_memory.c @@ -676,8 +676,6 @@ int vmf_insert_pfn_pmd(struct vm_area_struct *vma, unsigned long addr, if (addr < vma->vm_start || addr >= vma->vm_end) return VM_FAULT_SIGBUS; - if (track_pfn_insert(vma, &pgprot, pfn)) - return VM_FAULT_SIGBUS; insert_pfn_pmd(vma, addr, pmd, pfn, pgprot, write); return VM_FAULT_NOPAGE; }
track_pfn_insert() is marking dax mappings as uncacheable. It is used to keep mappings attributes consistent across a remapped range. However, since dax regions are never registered via track_pfn_remap(), the caching mode lookup for dax pfns always returns _PAGE_CACHE_MODE_UC. We do not use track_pfn_insert() in the dax-pte path, and we always want to use the pgprot of the vma itself, so drop this call. Cc: Ross Zwisler <ross.zwisler@linux.intel.com> Cc: Matthew Wilcox <mawilcox@microsoft.com> Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Nilesh Choudhury <nilesh.choudhury@oracle.com> Reported-by: Kai Zhang <kai.ka.zhang@oracle.com> Reported-by: Toshi Kani <toshi.kani@hpe.com> Cc: <stable@vger.kernel.org> Signed-off-by: Dan Williams <dan.j.williams@intel.com> --- mm/huge_memory.c | 2 -- 1 file changed, 2 deletions(-)