Message ID | 1535023848-5554-1-git-send-email-penguin-kernel@I-love.SAKURA.ne.jp
---|---
State | New, archived
Series | mm, oom: Always call tlb_finish_mmu().
On 2018/08/23 20:59, Michal Hocko wrote:
> On Thu 23-08-18 20:30:48, Tetsuo Handa wrote:
>> Commit 93065ac753e44438 ("mm, oom: distinguish blockable mode for mmu
>> notifiers") added "continue;" without calling tlb_finish_mmu(). I don't
>> know whether the tlb_flush_pending imbalance causes problems other than
>> extra cost, but at least it looks strange.
>
> tlb_flush_pending has mm scope and it would confuse
> mm_tlb_flush_pending. At least ptep_clear_flush could get confused and
> flush unnecessarily for prot_none entries AFAICS. Other paths shouldn't
> trigger for oom victims. Even ptep_clear_flush is unlikely to happen.
> So nothing really earth shattering, but I do agree that it looks weird
> and should be fixed.

OK. But what is the reason we call tlb_gather_mmu() before
mmu_notifier_invalidate_range_start_nonblock()? I want the fix to
explain why we can't do

-	tlb_gather_mmu(&tlb, mm, start, end);
 	if (mmu_notifier_invalidate_range_start_nonblock(mm, start, end)) {
 		ret = false;
 		continue;
 	}
+	tlb_gather_mmu(&tlb, mm, start, end);

instead.
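[For reference, a rough sketch of the mm-scope bookkeeping being discussed,
paraphrasing the tlb_flush_pending helpers and the x86 pte_accessible()
check as they looked around v4.18; barriers, warnings and other details are
elided. It is only meant to show why an unmatched tlb_gather_mmu() keeps
mm_tlb_flush_pending() reporting true for the victim mm:

	/* tlb_gather_mmu() takes this increment ... */
	static inline void inc_tlb_flush_pending(struct mm_struct *mm)
	{
		atomic_inc(&mm->tlb_flush_pending);
	}

	/* ... and tlb_finish_mmu() provides the matching decrement. */
	static inline void dec_tlb_flush_pending(struct mm_struct *mm)
	{
		atomic_dec(&mm->tlb_flush_pending);
	}

	static inline bool mm_tlb_flush_pending(struct mm_struct *mm)
	{
		return atomic_read(&mm->tlb_flush_pending) > 0;
	}

	/*
	 * On x86, pte_accessible() consults the counter for prot_none
	 * entries, so ptep_clear_flush() keeps flushing for them as long
	 * as the leaked increment is outstanding.
	 */
	static inline bool pte_accessible(struct mm_struct *mm, pte_t a)
	{
		if (pte_flags(a) & _PAGE_PRESENT)
			return true;
		if ((pte_flags(a) & _PAGE_PROTNONE) && mm_tlb_flush_pending(mm))
			return true;
		return false;
	}
]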
On Thu 23-08-18 22:48:22, Tetsuo Handa wrote:
> On 2018/08/23 20:59, Michal Hocko wrote:
> > On Thu 23-08-18 20:30:48, Tetsuo Handa wrote:
> >> Commit 93065ac753e44438 ("mm, oom: distinguish blockable mode for mmu
> >> notifiers") added "continue;" without calling tlb_finish_mmu(). I don't
> >> know whether the tlb_flush_pending imbalance causes problems other than
> >> extra cost, but at least it looks strange.
> >
> > tlb_flush_pending has mm scope and it would confuse
> > mm_tlb_flush_pending. At least ptep_clear_flush could get confused and
> > flush unnecessarily for prot_none entries AFAICS. Other paths shouldn't
> > trigger for oom victims. Even ptep_clear_flush is unlikely to happen.
> > So nothing really earth shattering, but I do agree that it looks weird
> > and should be fixed.
>
> OK. But what is the reason we call tlb_gather_mmu() before
> mmu_notifier_invalidate_range_start_nonblock()? I want the fix to
> explain why we can't do
>
> -	tlb_gather_mmu(&tlb, mm, start, end);
>  	if (mmu_notifier_invalidate_range_start_nonblock(mm, start, end)) {
>  		ret = false;
>  		continue;
>  	}
> +	tlb_gather_mmu(&tlb, mm, start, end);

This should indeed be doable because mmu notifiers have no way to know
about tlb_gather. I have no idea why we used to have tlb_gather_mmu like
that before. Most probably a C&P from the munmap path, where it didn't
make any difference either.

A quick check shows that tlb_flush_pending is the only mm-scope thing,
and none of the notifiers really depend on it. Still, I would be calmer
if both paths were in sync in that regard. So I think it would be better
to go with your previous version first. Maybe it makes sense to switch
the order, but I do not really see a huge win in doing so.
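[For illustration, an abbreviated sketch of what the reordered variant
discussed above would look like inside the per-VMA loop of
__oom_reap_task_mm(); names follow the v4.19-rc code, but this is a
sketch, not the applied patch:

	struct mmu_gather tlb;

	if (mmu_notifier_invalidate_range_start_nonblock(mm, start, end)) {
		ret = false;
		continue;	/* no inc_tlb_flush_pending() taken yet, nothing to unwind */
	}
	tlb_gather_mmu(&tlb, mm, start, end);
	unmap_page_range(&tlb, vma, start, end, NULL);
	mmu_notifier_invalidate_range_end(mm, start, end);
	tlb_finish_mmu(&tlb, start, end);

The fix that was actually taken (the diff below) instead keeps the
original call order and adds the balancing tlb_finish_mmu() on the
early-exit path, which keeps this path structured like the regular
munmap path.]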
diff --git a/mm/oom_kill.c b/mm/oom_kill.c
index b5b25e4..4f431c1 100644
--- a/mm/oom_kill.c
+++ b/mm/oom_kill.c
@@ -522,6 +522,7 @@ bool __oom_reap_task_mm(struct mm_struct *mm)
 
 			tlb_gather_mmu(&tlb, mm, start, end);
 			if (mmu_notifier_invalidate_range_start_nonblock(mm, start, end)) {
+				tlb_finish_mmu(&tlb, start, end);
 				ret = false;
 				continue;
 			}
Commit 93065ac753e44438 ("mm, oom: distinguish blockable mode for mmu
notifiers") added "continue;" without calling tlb_finish_mmu(). I don't
know whether the tlb_flush_pending imbalance causes problems other than
extra cost, but at least it looks strange.

The more worrisome part of that patch is that I don't know whether using
trylock when blockable == false at entry is really sufficient. For
example, __gnttab_unmap_refs_async() (called from gnttab_unmap_refs_async()
from gnttab_unmap_refs_sync() from __unmap_grant_pages() from
unmap_grant_pages() from unmap_if_in_range() from mn_invl_range_start())
involves schedule_delayed_work(), which could be blocked on memory
allocation under an OOM situation; might wait_for_completion() from
gnttab_unmap_refs_sync() then deadlock? I don't know...

Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
---
 mm/oom_kill.c | 1 +
 1 file changed, 1 insertion(+)
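[To make the deadlock concern in the changelog concrete, here is a
schematic sketch of the pattern being questioned. It is not Xen
grant-table code; every name below is made up for illustration. The point
is only that a notifier callback which queues work and then sleeps in
wait_for_completion() never returns if the worker blocks on a memory
allocation that cannot make progress during OOM:

	#include <linux/kernel.h>
	#include <linux/workqueue.h>
	#include <linux/completion.h>
	#include <linux/jiffies.h>
	#include <linux/slab.h>

	struct unmap_work {
		struct delayed_work dwork;
		struct completion done;
	};

	static void unmap_worker(struct work_struct *work)
	{
		struct unmap_work *uw = container_of(to_delayed_work(work),
						     struct unmap_work, dwork);
		/*
		 * If this allocation (or anything else the worker sleeps on)
		 * cannot make progress while the notifier caller is waiting
		 * for us, complete() below is never reached.
		 */
		void *buf = kmalloc(4096, GFP_KERNEL);

		kfree(buf);
		complete(&uw->done);
	}

	/* Roughly what a "sync" unmap from a non-blockable notifier would do. */
	static void sync_unmap_sketch(struct unmap_work *uw)
	{
		INIT_DELAYED_WORK(&uw->dwork, unmap_worker);
		init_completion(&uw->done);
		schedule_delayed_work(&uw->dwork, msecs_to_jiffies(5));
		wait_for_completion(&uw->done);	/* sleeps until the worker finishes */
	}
]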