Message ID | 20220908204809.2012451-1-saproj@gmail.com (mailing list archive) |
---|---|
State | New |
Series | mm: bring back update_mmu_cache() to finish_fault() |
On Thu, Sep 08, 2022 at 11:48:09PM +0300, Sergei Antonov wrote:
> Running this test program on ARMv4 a few times (sometimes just once)
> reproduces the bug.
>
> int main()
> {
>     unsigned i;
>     char paragon[SIZE];
>     void* ptr;
>
>     memset(paragon, 0xAA, SIZE);
>     ptr = mmap(NULL, SIZE, PROT_READ | PROT_WRITE,
>                MAP_ANON | MAP_SHARED, -1, 0);
>     if (ptr == MAP_FAILED) return 1;
>     printf("ptr = %p\n", ptr);
>     for (i=0;i<10000;i++){
>         memset(ptr, 0xAA, SIZE);
>         if (memcmp(ptr, paragon, SIZE)) {
>             printf("Unexpected bytes on iteration %u!!!\n", i);
>             break;
>         }
>     }
>     munmap(ptr, SIZE);
> }
>
> In the "ptr" buffer there appear runs of zero bytes which are aligned
> by 16 and their lengths are multiple of 16.
>
> Linux v5.11 does not have the bug, "git bisect" finds the first bad commit:
> f9ce0be71d1f ("mm: Cleanup faultaround and finish_fault() codepaths")
>
> Before the commit update_mmu_cache() was called during a call to
> filemap_map_pages() as well as finish_fault(). After the commit
> finish_fault() lacks it.
>
> Bring back update_mmu_cache() to finish_fault() to fix the bug.
> Also call update_mmu_tlb() only when returning VM_FAULT_NOPAGE to more
> closely reproduce the code of alloc_set_pte() function that existed before
> the commit.
>
> On many platforms update_mmu_cache() is nop:
> x86, see arch/x86/include/asm/pgtable
> ARMv6+, see arch/arm/include/asm/tlbflush.h
> So, it seems, few users ran into this bug.
>
> Fixes: f9ce0be71d1f ("mm: Cleanup faultaround and finish_fault() codepaths")
> Signed-off-by: Sergei Antonov <saproj@gmail.com>
> Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>

+Will.

Seems I confused update_mmu_tlb() with update_mmu_cache() :/

Looks good to me:

Acked-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>

> ---
> mm/memory.c | 14 ++++++++++----
> 1 file changed, 10 insertions(+), 4 deletions(-)
>
> diff --git a/mm/memory.c b/mm/memory.c
> index 4ba73f5aa8bb..a78814413ac0 100644
> --- a/mm/memory.c
> +++ b/mm/memory.c
> @@ -4386,14 +4386,20 @@ vm_fault_t finish_fault(struct vm_fault *vmf)
>
> 	vmf->pte = pte_offset_map_lock(vma->vm_mm, vmf->pmd,
> 				      vmf->address, &vmf->ptl);
> -	ret = 0;
> +
> 	/* Re-check under ptl */
> -	if (likely(!vmf_pte_changed(vmf)))
> +	if (likely(!vmf_pte_changed(vmf))) {
> 		do_set_pte(vmf, page, vmf->address);
> -	else
> +
> +		/* no need to invalidate: a not-present page won't be cached */
> +		update_mmu_cache(vma, vmf->address, vmf->pte);
> +
> +		ret = 0;
> +	} else {
> +		update_mmu_tlb(vma, vmf->address, vmf->pte);
> 		ret = VM_FAULT_NOPAGE;
> +	}
>
> -	update_mmu_tlb(vma, vmf->address, vmf->pte);
> 	pte_unmap_unlock(vmf->pte, vmf->ptl);
> 	return ret;
> }
> --
> 2.34.1
>
On Thu, Sep 08, 2022 at 11:48:09PM +0300, Sergei Antonov wrote:
> Running this test program on ARMv4 a few times (sometimes just once)
> reproduces the bug.
>
> int main()
> {
>     unsigned i;
>     char paragon[SIZE];
>     void* ptr;
>
>     memset(paragon, 0xAA, SIZE);
>     ptr = mmap(NULL, SIZE, PROT_READ | PROT_WRITE,
>                MAP_ANON | MAP_SHARED, -1, 0);
>     if (ptr == MAP_FAILED) return 1;
>     printf("ptr = %p\n", ptr);
>     for (i=0;i<10000;i++){
>         memset(ptr, 0xAA, SIZE);
>         if (memcmp(ptr, paragon, SIZE)) {
>             printf("Unexpected bytes on iteration %u!!!\n", i);
>             break;
>         }
>     }
>     munmap(ptr, SIZE);
> }
>
> In the "ptr" buffer there appear runs of zero bytes which are aligned
> by 16 and their lengths are multiple of 16.
>
> Linux v5.11 does not have the bug, "git bisect" finds the first bad commit:
> f9ce0be71d1f ("mm: Cleanup faultaround and finish_fault() codepaths")
>
> Before the commit update_mmu_cache() was called during a call to
> filemap_map_pages() as well as finish_fault(). After the commit
> finish_fault() lacks it.
>
> Bring back update_mmu_cache() to finish_fault() to fix the bug.
> Also call update_mmu_tlb() only when returning VM_FAULT_NOPAGE to more
> closely reproduce the code of alloc_set_pte() function that existed before
> the commit.
>
> On many platforms update_mmu_cache() is nop:
> x86, see arch/x86/include/asm/pgtable
> ARMv6+, see arch/arm/include/asm/tlbflush.h
> So, it seems, few users ran into this bug.
>
> Fixes: f9ce0be71d1f ("mm: Cleanup faultaround and finish_fault() codepaths")
> Signed-off-by: Sergei Antonov <saproj@gmail.com>
> Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
> ---
> mm/memory.c | 14 ++++++++++----
> 1 file changed, 10 insertions(+), 4 deletions(-)
>

<formletter>

This is not the correct way to submit patches for inclusion in the
stable kernel tree. Please read:
    https://www.kernel.org/doc/html/latest/process/stable-kernel-rules.html
for how to do this properly.

</formletter>
On Fri, Sep 09, 2022 at 01:24:10AM +0300, Kirill A. Shutemov wrote:
> On Thu, Sep 08, 2022 at 11:48:09PM +0300, Sergei Antonov wrote:
> > Running this test program on ARMv4 a few times (sometimes just once)
> > reproduces the bug.
> >
> > int main()
> > {
> >     unsigned i;
> >     char paragon[SIZE];
> >     void* ptr;
> >
> >     memset(paragon, 0xAA, SIZE);
> >     ptr = mmap(NULL, SIZE, PROT_READ | PROT_WRITE,
> >                MAP_ANON | MAP_SHARED, -1, 0);
> >     if (ptr == MAP_FAILED) return 1;
> >     printf("ptr = %p\n", ptr);
> >     for (i=0;i<10000;i++){
> >         memset(ptr, 0xAA, SIZE);
> >         if (memcmp(ptr, paragon, SIZE)) {
> >             printf("Unexpected bytes on iteration %u!!!\n", i);
> >             break;
> >         }
> >     }
> >     munmap(ptr, SIZE);
> > }
> >
> > In the "ptr" buffer there appear runs of zero bytes which are aligned
> > by 16 and their lengths are multiple of 16.
> >
> > Linux v5.11 does not have the bug, "git bisect" finds the first bad commit:
> > f9ce0be71d1f ("mm: Cleanup faultaround and finish_fault() codepaths")
> >
> > Before the commit update_mmu_cache() was called during a call to
> > filemap_map_pages() as well as finish_fault(). After the commit
> > finish_fault() lacks it.
> >
> > Bring back update_mmu_cache() to finish_fault() to fix the bug.
> > Also call update_mmu_tlb() only when returning VM_FAULT_NOPAGE to more
> > closely reproduce the code of alloc_set_pte() function that existed before
> > the commit.
> >
> > On many platforms update_mmu_cache() is nop:
> > x86, see arch/x86/include/asm/pgtable
> > ARMv6+, see arch/arm/include/asm/tlbflush.h
> > So, it seems, few users ran into this bug.
> >
> > Fixes: f9ce0be71d1f ("mm: Cleanup faultaround and finish_fault() codepaths")
> > Signed-off-by: Sergei Antonov <saproj@gmail.com>
> > Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
>
> +Will.
>
> Seems I confused update_mmu_tlb() with update_mmu_cache() :/

Urgh, that thing is pretty horrible! But anyway, I agree that this change
looks correct based on the other callers in the file.

> Looks good to me:
>
> Acked-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>

I'm assuming Andrew will pick this up. Otherwise, please let me know if
I should route it via the arm64 tree.

Will
diff --git a/mm/memory.c b/mm/memory.c
index 4ba73f5aa8bb..a78814413ac0 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -4386,14 +4386,20 @@ vm_fault_t finish_fault(struct vm_fault *vmf)
 
 	vmf->pte = pte_offset_map_lock(vma->vm_mm, vmf->pmd,
 				      vmf->address, &vmf->ptl);
-	ret = 0;
+
 	/* Re-check under ptl */
-	if (likely(!vmf_pte_changed(vmf)))
+	if (likely(!vmf_pte_changed(vmf))) {
 		do_set_pte(vmf, page, vmf->address);
-	else
+
+		/* no need to invalidate: a not-present page won't be cached */
+		update_mmu_cache(vma, vmf->address, vmf->pte);
+
+		ret = 0;
+	} else {
+		update_mmu_tlb(vma, vmf->address, vmf->pte);
 		ret = VM_FAULT_NOPAGE;
+	}
 
-	update_mmu_tlb(vma, vmf->address, vmf->pte);
 	pte_unmap_unlock(vmf->pte, vmf->ptl);
 	return ret;
 }
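For readers who want to see the branch structure of the patched finish_fault() in isolation, here is a small user-space model. It is only a sketch: `finish_fault_model`, `struct hook_counts` and `enum fault_result` are hypothetical names that do not exist in the kernel, and the kernel hooks are replaced by counters so the two paths can be checked without any kernel code.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for the kernel hooks: each field counts calls. */
struct hook_counts {
	int set_pte;    /* models do_set_pte() */
	int mmu_cache;  /* models update_mmu_cache() */
	int mmu_tlb;    /* models update_mmu_tlb() */
};

/* Hypothetical result codes standing in for 0 and VM_FAULT_NOPAGE. */
enum fault_result { FAULT_OK = 0, FAULT_NOPAGE = 1 };

/*
 * Model of the re-check done under the page table lock after the patch.
 * pte_changed plays the role of vmf_pte_changed(vmf).
 */
static enum fault_result finish_fault_model(bool pte_changed,
					    struct hook_counts *c)
{
	if (!pte_changed) {
		c->set_pte++;    /* the new entry is installed */
		c->mmu_cache++;  /* and the arch hook runs for it */
		return FAULT_OK;
	}

	/* Someone else changed the PTE: only the TLB hook runs. */
	c->mmu_tlb++;
	return FAULT_NOPAGE;
}
```

In this model the cache hook runs exactly when a new entry is installed, and the TLB hook runs only on the VM_FAULT_NOPAGE path, which is the pairing the commit message describes.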
Running this test program on ARMv4 a few times (sometimes just once)
reproduces the bug.

int main()
{
    unsigned i;
    char paragon[SIZE];
    void* ptr;

    memset(paragon, 0xAA, SIZE);
    ptr = mmap(NULL, SIZE, PROT_READ | PROT_WRITE,
               MAP_ANON | MAP_SHARED, -1, 0);
    if (ptr == MAP_FAILED) return 1;
    printf("ptr = %p\n", ptr);
    for (i=0;i<10000;i++){
        memset(ptr, 0xAA, SIZE);
        if (memcmp(ptr, paragon, SIZE)) {
            printf("Unexpected bytes on iteration %u!!!\n", i);
            break;
        }
    }
    munmap(ptr, SIZE);
}

In the "ptr" buffer there appear runs of zero bytes which are aligned
by 16 and their lengths are multiple of 16.

Linux v5.11 does not have the bug, "git bisect" finds the first bad commit:
f9ce0be71d1f ("mm: Cleanup faultaround and finish_fault() codepaths")

Before the commit update_mmu_cache() was called during a call to
filemap_map_pages() as well as finish_fault(). After the commit
finish_fault() lacks it.

Bring back update_mmu_cache() to finish_fault() to fix the bug.
Also call update_mmu_tlb() only when returning VM_FAULT_NOPAGE to more
closely reproduce the code of alloc_set_pte() function that existed before
the commit.

On many platforms update_mmu_cache() is nop:
x86, see arch/x86/include/asm/pgtable
ARMv6+, see arch/arm/include/asm/tlbflush.h
So, it seems, few users ran into this bug.

Fixes: f9ce0be71d1f ("mm: Cleanup faultaround and finish_fault() codepaths")
Signed-off-by: Sergei Antonov <saproj@gmail.com>
Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
---
mm/memory.c | 14 ++++++++++----
1 file changed, 10 insertions(+), 4 deletions(-)
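The test program in the commit message is shown without its #include lines or a definition of SIZE. A self-contained variant of the same check might look like the sketch below. It is not the author's exact program: the helper name `first_bad_iteration` is hypothetical, the buffer size is passed as a parameter because the original SIZE value is not given, and the reference buffer is heap-allocated instead of living on the stack.

```c
#define _DEFAULT_SOURCE  /* exposes MAP_ANON on glibc in strict modes */
#include <stddef.h>
#include <stdlib.h>
#include <string.h>
#include <sys/mman.h>

/*
 * Fill a shared anonymous mapping with 0xAA and compare it against a
 * reference buffer, as the test program in the commit message does.
 *
 * Returns -1 if every iteration read back intact, the index of the
 * first mismatching iteration otherwise, or -2 if setup failed.
 */
static long first_bad_iteration(size_t size, unsigned iterations)
{
	long bad = -1;
	unsigned char *paragon = malloc(size);

	if (!paragon)
		return -2;
	memset(paragon, 0xAA, size);

	void *ptr = mmap(NULL, size, PROT_READ | PROT_WRITE,
			 MAP_ANON | MAP_SHARED, -1, 0);
	if (ptr == MAP_FAILED) {
		free(paragon);
		return -2;
	}

	for (unsigned i = 0; i < iterations; i++) {
		memset(ptr, 0xAA, size);
		if (memcmp(ptr, paragon, size)) {
			bad = (long)i;
			break;
		}
	}

	munmap(ptr, size);
	free(paragon);
	return bad;
}
```

Calling `first_bad_iteration()` from `main()` with 10000 iterations and a size of your choosing mirrors the loop in the commit message. On a kernel without the bug, or on an architecture where update_mmu_cache() is a no-op, the function should return -1.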