Message ID | 20231119194740.94101-17-ryncsn@gmail.com (mailing list archive) |
---|---|
State | New |
Headers | show |
Series | Swapin path refactor for optimization and bugfix | expand |
On Mon, Nov 20, 2023 at 03:47:32AM +0800, Kairui Song wrote: > page = swapin_readahead(entry, GFP_HIGHUSER_MOVABLE, > vmf, &cache_result); > - if (page) { > + if (PTR_ERR(page) == -EBUSY) { if (page == ERR_PTR(-EBUSY)) {
Matthew Wilcox <willy@infradead.org> 于2023年11月20日周一 05:12写道: > > On Mon, Nov 20, 2023 at 03:47:32AM +0800, Kairui Song wrote: > > page = swapin_readahead(entry, GFP_HIGHUSER_MOVABLE, > > vmf, &cache_result); > > - if (page) { > > + if (PTR_ERR(page) == -EBUSY) { > > if (page == ERR_PTR(-EBUSY)) { > Thanks, I'll fix this.
On Sun, Nov 19, 2023 at 11:48 AM Kairui Song <ryncsn@gmail.com> wrote: > > From: Kairui Song <kasong@tencent.com> > > Move get_swap_device into swapin_readahead, simplify the code > and prepare for follow up commits. > > For the later part in do_swap_page, using swp_swap_info directly is fine > since in that context, the swap device is pinned by swapcache reference. > > Signed-off-by: Kairui Song <kasong@tencent.com> > --- > mm/memory.c | 16 ++++------------ > mm/swap_state.c | 8 ++++++-- > mm/swapfile.c | 4 +++- > 3 files changed, 13 insertions(+), 15 deletions(-) > > diff --git a/mm/memory.c b/mm/memory.c > index 22af9f3e8c75..e399b37ef395 100644 > --- a/mm/memory.c > +++ b/mm/memory.c > @@ -3789,7 +3789,6 @@ vm_fault_t do_swap_page(struct vm_fault *vmf) > struct folio *swapcache = NULL, *folio = NULL; > enum swap_cache_result cache_result; > struct page *page; > - struct swap_info_struct *si = NULL; > rmap_t rmap_flags = RMAP_NONE; > bool exclusive = false; > swp_entry_t entry; > @@ -3845,14 +3844,11 @@ vm_fault_t do_swap_page(struct vm_fault *vmf) > goto out; > } > > - /* Prevent swapoff from happening to us. */ > - si = get_swap_device(entry); > - if (unlikely(!si)) > - goto out; > - > page = swapin_readahead(entry, GFP_HIGHUSER_MOVABLE, > vmf, &cache_result); > - if (page) { > + if (PTR_ERR(page) == -EBUSY) { > + goto out; > + } else if (page) { As Matthew suggested, using ERR_PTR(-EBUSY). you also don't need the else here. The goto already terminates the flow. if (page == ERR_PTR(-EBUSY)) goto out; if (page) { Chris
Kairui Song <ryncsn@gmail.com> writes: > From: Kairui Song <kasong@tencent.com> > > Move get_swap_device into swapin_readahead, simplify the code > and prepare for follow up commits. No. Please don't do this. Please check the get/put_swap_device() usage rule in the comments of get_swap_device(). " * When we get a swap entry, if there aren't some other ways to * prevent swapoff, such as the folio in swap cache is locked, page * table lock is held, etc., the swap entry may become invalid because * of swapoff. Then, we need to enclose all swap related functions * with get_swap_device() and put_swap_device(), unless the swap * functions call get/put_swap_device() by themselves. " This is to simplify the reasoning about swapoff and swap entry. Why does it bother you? > For the later part in do_swap_page, using swp_swap_info directly is fine > since in that context, the swap device is pinned by swapcache reference. [snip] -- Best Regards, Huang, Ying
Huang, Ying <ying.huang@intel.com> 于2023年11月22日周三 08:38写道: > > Kairui Song <ryncsn@gmail.com> writes: > > > From: Kairui Song <kasong@tencent.com> > > > > Move get_swap_device into swapin_readahead, simplify the code > > and prepare for follow up commits. > > No. Please don't do this. Please check the get/put_swap_device() usage > rule in the comments of get_swap_device(). > > " > * When we get a swap entry, if there aren't some other ways to > * prevent swapoff, such as the folio in swap cache is locked, page > * table lock is held, etc., the swap entry may become invalid because > * of swapoff. Then, we need to enclose all swap related functions > * with get_swap_device() and put_swap_device(), unless the swap > * functions call get/put_swap_device() by themselves. > " > > This is to simplify the reasoning about swapoff and swap entry. > > Why does it bother you? Hi Ying, This is trying to reduce LOC, avoid a trivial si read, and make error checking logic easier to refactor in later commits. And besides there is one trivial change I forgot to include in this commit, get_swap_device can be put after swap_cache_get_folio in swapin_readahead, since swap_cache_get_folio doesn't need to hold the swap device, so in cache hit case this get/put_swap_device() can be saved. The comment also mentioned: "Then, we need to enclose all swap related functions with get_swap_device() and put_swap_device(), unless the swap functions call get/put_swap_device() by themselves" So I think it should be OK to do this.
Kairui Song <ryncsn@gmail.com> writes: > Huang, Ying <ying.huang@intel.com> 于2023年11月22日周三 08:38写道: >> >> Kairui Song <ryncsn@gmail.com> writes: >> >> > From: Kairui Song <kasong@tencent.com> >> > >> > Move get_swap_device into swapin_readahead, simplify the code >> > and prepare for follow up commits. >> >> No. Please don't do this. Please check the get/put_swap_device() usage >> rule in the comments of get_swap_device(). >> >> " >> * When we get a swap entry, if there aren't some other ways to >> * prevent swapoff, such as the folio in swap cache is locked, page >> * table lock is held, etc., the swap entry may become invalid because >> * of swapoff. Then, we need to enclose all swap related functions >> * with get_swap_device() and put_swap_device(), unless the swap >> * functions call get/put_swap_device() by themselves. >> " >> >> This is to simplify the reasoning about swapoff and swap entry. >> >> Why does it bother you? > > Hi Ying, > > This is trying to reduce LOC, avoid a trivial si read, and make error > checking logic easier to refactor in later commits. The race with swapoff isn't considered by many developers usually. So, we should use a simple rule as much as possible. For example, if you get a swap entry, use get/put_swap_device() to enclose all code that operate on the swap entry. This makes code easier to be maintained in the long run. Yes. Sometimes we break the rule a little, but only if we have enough benefit, such as improving performance in some practical use cases. > And besides there is one trivial change I forgot to include in this > commit, get_swap_device can be put after swap_cache_get_folio in > swapin_readahead, since swap_cache_get_folio doesn't need to hold the > swap device, so in cache hit case this get/put_swap_device() can be > saved. swapoff is rare operation, it's OK to delay it a little to make the code easier to be understood. > The comment also mentioned: > > "Then, we need to enclose all swap related functions with > get_swap_device() and put_swap_device(), unless the swap functions > call get/put_swap_device() by themselves" > > So I think it should be OK to do this. No. You should change the code with a good enough reason. Compared with complexity it introduced, the benefit isn't enough for me so far. -- Best Regards, Huang, Ying
diff --git a/mm/memory.c b/mm/memory.c index 22af9f3e8c75..e399b37ef395 100644 --- a/mm/memory.c +++ b/mm/memory.c @@ -3789,7 +3789,6 @@ vm_fault_t do_swap_page(struct vm_fault *vmf) struct folio *swapcache = NULL, *folio = NULL; enum swap_cache_result cache_result; struct page *page; - struct swap_info_struct *si = NULL; rmap_t rmap_flags = RMAP_NONE; bool exclusive = false; swp_entry_t entry; @@ -3845,14 +3844,11 @@ vm_fault_t do_swap_page(struct vm_fault *vmf) goto out; } - /* Prevent swapoff from happening to us. */ - si = get_swap_device(entry); - if (unlikely(!si)) - goto out; - page = swapin_readahead(entry, GFP_HIGHUSER_MOVABLE, vmf, &cache_result); - if (page) { + if (PTR_ERR(page) == -EBUSY) { + goto out; + } else if (page) { folio = page_folio(page); if (cache_result != SWAP_CACHE_HIT) { /* Had to read the page from swap area: Major fault */ @@ -3964,7 +3960,7 @@ vm_fault_t do_swap_page(struct vm_fault *vmf) */ exclusive = true; } else if (exclusive && folio_test_writeback(folio) && - data_race(si->flags & SWP_STABLE_WRITES)) { + (swp_swap_info(entry)->flags & SWP_STABLE_WRITES)) { /* * This is tricky: not all swap backends support * concurrent page modifications while under writeback. @@ -4068,8 +4064,6 @@ vm_fault_t do_swap_page(struct vm_fault *vmf) if (vmf->pte) pte_unmap_unlock(vmf->pte, vmf->ptl); out: - if (si) - put_swap_device(si); return ret; out_nomap: if (vmf->pte) @@ -4082,8 +4076,6 @@ vm_fault_t do_swap_page(struct vm_fault *vmf) folio_unlock(swapcache); folio_put(swapcache); } - if (si) - put_swap_device(si); return ret; } diff --git a/mm/swap_state.c b/mm/swap_state.c index 51de2a0412df..ff8a166603d0 100644 --- a/mm/swap_state.c +++ b/mm/swap_state.c @@ -922,6 +922,11 @@ struct page *swapin_readahead(swp_entry_t entry, gfp_t gfp_mask, struct page *page; pgoff_t ilx; + /* Prevent swapoff from happening to us */ + si = get_swap_device(entry); + if (unlikely(!si)) + return ERR_PTR(-EBUSY); + folio = swap_cache_get_folio(entry, vmf, &shadow); if (folio) { page = folio_file_page(folio, swp_offset(entry)); @@ -929,7 +934,6 @@ struct page *swapin_readahead(swp_entry_t entry, gfp_t gfp_mask, goto done; } - si = swp_swap_info(entry); mpol = get_vma_policy(vmf->vma, vmf->address, 0, &ilx); if (swap_use_no_readahead(si, swp_offset(entry))) { page = swapin_no_readahead(entry, gfp_mask, mpol, ilx, vmf->vma->vm_mm); @@ -944,8 +948,8 @@ struct page *swapin_readahead(swp_entry_t entry, gfp_t gfp_mask, cache_result = SWAP_CACHE_MISS; } mpol_cond_put(mpol); - done: + put_swap_device(si); if (result) *result = cache_result; diff --git a/mm/swapfile.c b/mm/swapfile.c index b6d57fff5e21..925ad92486a4 100644 --- a/mm/swapfile.c +++ b/mm/swapfile.c @@ -1857,7 +1857,9 @@ static int unuse_pte_range(struct vm_area_struct *vma, pmd_t *pmd, pte = NULL; page = swapin_readahead(entry, GFP_HIGHUSER_MOVABLE, &vmf, NULL); - if (page) + if (IS_ERR(page)) + return PTR_ERR(page); + else if (page) folio = page_folio(page); if (!folio) { /*