Message ID | 20200807061706.unk5_0KtC%akpm@linux-foundation.org (mailing list archive) |
---|---|
State | New, archived |
Series | [001/163] mm/memory.c: avoid access flag update TLB flush for retried page fault |

On Thu, Aug 6, 2020 at 11:17 PM Andrew Morton <akpm@linux-foundation.org> wrote:
>
> From: Yang Shi <yang.shi@linux.alibaba.com>
> Subject: mm/memory.c: avoid access flag update TLB flush for retried page fault

This is not the safe version that just avoids the extra TLB flush.

This is - once again - the thing that skips the whole mkdirty and page
table update too.

I'm not taking it this time _either_.

Andrew, please flush this garbage from your system.

             Linus

On Fri, Aug 7, 2020 at 11:17 AM Linus Torvalds <torvalds@linux-foundation.org> wrote:
>
> On Thu, Aug 6, 2020 at 11:17 PM Andrew Morton <akpm@linux-foundation.org> wrote:
> >
> > From: Yang Shi <yang.shi@linux.alibaba.com>
> > Subject: mm/memory.c: avoid access flag update TLB flush for retried page fault
>
> This is not the safe version that just avoids the extra TLB flush.
>
> This is - once again - the thing that skips the whole mkdirty and page
> table update too.
>
> I'm not taking it this time _either_.

I suppose Catalin would submit his proposal (flush local TLB for
spurious TLB fault on ARM) for this specific regression per the
discussion, right?

And the more general spurious TLB fault problem doesn't sound that
urgent, since it should be very rare.

>
> Andrew, please flush this garbage from your system.
>
>              Linus
>

On Fri, Aug 7, 2020 at 1:53 PM Yang Shi <shy828301@gmail.com> wrote:
>
> I suppose Catalin would submit his proposal (flush local TLB for
> spurious TLB fault on ARM) for this specific regression per the
> discussion, right?

I think arm64 should do that regardless, yes.

But I would also be ok with a version that does the FAULT_FLAG_TRIED
testing, but does it only for that spurious TLB flushing.

This "let's not update the page tables at all" is wrong, when the only
problem was the TLB flushing.

So changing the current (but questionable)

        if (vmf->flags & FAULT_FLAG_WRITE)
                flush_tlb_fix_spurious_fault(vmf->vma, vmf->address);

to be

        if (vmf->flags & (FAULT_FLAG_WRITE | FAULT_FLAG_TRIED))
                flush_tlb_fix_spurious_fault(vmf->vma, vmf->address);

would be fine.

But this patch that changes any semantics outside just the flushing is
a complete no-no.

             Linus

On Fri, Aug 7, 2020 at 9:34 PM Linus Torvalds <torvalds@linux-foundation.org> wrote:
>
> On Fri, Aug 7, 2020 at 1:53 PM Yang Shi <shy828301@gmail.com> wrote:
> >
> > I suppose Catalin would submit his proposal (flush local TLB for
> > spurious TLB fault on ARM) for this specific regression per the
> > discussion, right?
>
> I think arm64 should do that regardless, yes.
>
> But I would also be ok with a version that does the FAULT_FLAG_TRIED
> testing, but does it only for that spurious TLB flushing.
>
> This "let's not update the page tables at all" is wrong, when the only
> problem was the TLB flushing.
>
> So changing the current (but questionable)
>
>         if (vmf->flags & FAULT_FLAG_WRITE)
>                 flush_tlb_fix_spurious_fault(vmf->vma, vmf->address);
>
> to be
>
>         if (vmf->flags & (FAULT_FLAG_WRITE | FAULT_FLAG_TRIED))
>                 flush_tlb_fix_spurious_fault(vmf->vma, vmf->address);

It looks like the retried fault still flushes the TLB with this change.

Shouldn't we do something like this to skip the spurious TLB flush:

@@ -4251,6 +4251,9 @@ static vm_fault_t handle_pte_fault(struct vm_fault *vmf)
                                vmf->flags & FAULT_FLAG_WRITE)) {
                update_mmu_cache(vmf->vma, vmf->address, vmf->pte);
        } else {
+               if (vmf->flags & FAULT_FLAG_TRIED)
+                       goto unlock;
+
                /*
                 * This is needed only for protection faults but the arch code
                 * is not yet telling us if this is a protection fault or not.

>
> would be fine.
>
> But this patch that changes any semantics outside just the flushing is
> a complete no-no.
>
>              Linus

On Mon, Aug 10, 2020 at 10:48 AM Yang Shi <shy828301@gmail.com> wrote:
>
> It looks like the retried fault still flushes the TLB with this change.
>
> Shouldn't we do something like this to skip the spurious TLB flush:

I have no idea what code-base you're basing your patches against, and
what you're comparing my patch to.

Your patch does *exactly* the same thing mine did.

Except it does a "goto unlock" to jump over the
flush_tlb_fix_spurious_fault(), while my pseudo-patch just changed the

        if (vmf->flags & FAULT_FLAG_WRITE)

to be a

        if (vmf->flags & (FAULT_FLAG_WRITE | FAULT_FLAG_TRIED))

but it has the same effect: it skips the flush_tlb_fix_spurious_fault().

So if you think your patch does something else, then your source code
doesn't match mine. The *only* thing you jumped over was that same
thing that I disabled.

Somebody is confused.

             Linus

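The disagreement above is about which flag combinations reach flush_tlb_fix_spurious_fault(). A minimal sketch, reading each condition literally as C, can make the comparison concrete. The bit values and the three helper names below are assumptions made for illustration and are not taken from the thread; the sketch only models the branch where the PTE was left unchanged.

```c
#include <stdbool.h>

/* Illustrative bit values, assumed for this sketch only. */
#define FAULT_FLAG_WRITE 0x01u
#define FAULT_FLAG_TRIED 0x20u

/* Current code: flush only when the fault is a write fault. */
static bool flush_current(unsigned int flags)
{
	return (flags & FAULT_FLAG_WRITE) != 0;
}

/* Pseudo-patch condition as literally written: true if WRITE or TRIED is set. */
static bool flush_or_mask(unsigned int flags)
{
	return (flags & (FAULT_FLAG_WRITE | FAULT_FLAG_TRIED)) != 0;
}

/* "goto unlock" variant: a retried fault never reaches the flush. */
static bool flush_goto_unlock(unsigned int flags)
{
	if (flags & FAULT_FLAG_TRIED)
		return false;	/* models the jump to unlock */
	return (flags & FAULT_FLAG_WRITE) != 0;
}
```

Under this literal reading, calling the helpers with `FAULT_FLAG_WRITE | FAULT_FLAG_TRIED` returns true for `flush_current()` and `flush_or_mask()` but false for `flush_goto_unlock()`, and a retried read fault (`FAULT_FLAG_TRIED` alone) returns true only for `flush_or_mask()`. Whether that matches the source tree either author had in mind is not something the sketch can settle.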
--- a/mm/memory.c~mm-avoid-access-flag-update-tlb-flush-for-retried-page-fault
+++ a/mm/memory.c
@@ -4241,8 +4241,14 @@ static vm_fault_t handle_pte_fault(struc
 	if (vmf->flags & FAULT_FLAG_WRITE) {
 		if (!pte_write(entry))
 			return do_wp_page(vmf);
-		entry = pte_mkdirty(entry);
 	}
+
+	if (vmf->flags & FAULT_FLAG_TRIED)
+		goto unlock;
+
+	if (vmf->flags & FAULT_FLAG_WRITE)
+		entry = pte_mkdirty(entry);
+
 	entry = pte_mkyoung(entry);
 	if (ptep_set_access_flags(vmf->vma, vmf->address, vmf->pte, entry,
 				vmf->flags & FAULT_FLAG_WRITE)) {
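
The objection raised in the thread is that this diff skips the dirty/young update for a retried fault, not just the TLB flush. The following toy model is a sketch of that difference under stated assumptions: `struct toy_pte`, `struct toy_result`, both function names, and the flag values are hypothetical, the PTE is assumed to be writable already (so the do_wp_page() path is left out), and the "flush only" variant is one possible reading of the "safe version" described in the replies, not code from the thread.

```c
#include <stdbool.h>

/* Illustrative bit values, assumed for this sketch only. */
#define FAULT_FLAG_WRITE 0x01u
#define FAULT_FLAG_TRIED 0x20u

/* Hypothetical stand-in for a PTE: only the two bits discussed here. */
struct toy_pte {
	bool dirty;
	bool young;
};

struct toy_result {
	struct toy_pte pte;	/* PTE state after the fault is handled */
	bool flushed;		/* whether the spurious-fault flush ran */
};

/* Shape of the posted diff: a retried fault bails out before any update. */
static struct toy_result fault_posted_patch(struct toy_pte pte, unsigned int flags)
{
	struct toy_result r = { pte, false };
	struct toy_pte entry = pte;
	bool changed;

	if (flags & FAULT_FLAG_TRIED)
		return r;	/* models "goto unlock": PTE left untouched */

	if (flags & FAULT_FLAG_WRITE)
		entry.dirty = true;
	entry.young = true;

	changed = entry.dirty != pte.dirty || entry.young != pte.young;
	r.pte = entry;
	if (!changed && (flags & FAULT_FLAG_WRITE))
		r.flushed = true;
	return r;
}

/* Assumed "flush only" variant: always update the PTE, skip just the flush. */
static struct toy_result fault_flush_only(struct toy_pte pte, unsigned int flags)
{
	struct toy_result r = { pte, false };
	struct toy_pte entry = pte;
	bool changed;

	if (flags & FAULT_FLAG_WRITE)
		entry.dirty = true;
	entry.young = true;

	changed = entry.dirty != pte.dirty || entry.young != pte.young;
	r.pte = entry;
	if (!changed && (flags & FAULT_FLAG_WRITE) && !(flags & FAULT_FLAG_TRIED))
		r.flushed = true;
	return r;
}
```

In this model, a retried write fault on a clean, old PTE leaves both bits clear under `fault_posted_patch()` but sets them under `fault_flush_only()`, while neither variant flushes. For a non-retried write fault the two functions behave identically.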