Message ID | 20220409093500.10329-3-linmiaohe@huawei.com (mailing list archive) |
---|---|
State | New |
Headers | show |
Series | A few cleanup and fixup patches for vmscan | expand |
On Sat, Apr 09, 2022 at 05:34:53PM +0800, Miaohe Lin wrote: > We don't need to check can_split_folio() because folio_maybe_dma_pinned() > is checked before. It will avoid the long term pinned pages to be swapped > out. And we can live with short term pinned pages. Without can_split_folio > checking we can simplify the code. Also activate_locked can be changed to > keep_locked as it's just short term pinning. > > Suggested-by: Huang, Ying <ying.huang@intel.com> > Signed-off-by: Miaohe Lin <linmiaohe@huawei.com> > --- > mm/vmscan.c | 22 ++++++++-------------- > 1 file changed, 8 insertions(+), 14 deletions(-) > > diff --git a/mm/vmscan.c b/mm/vmscan.c > index 4a76be47bed1..01f5db75a507 100644 > --- a/mm/vmscan.c > +++ b/mm/vmscan.c > @@ -1711,20 +1711,14 @@ static unsigned int shrink_page_list(struct list_head *page_list, > goto keep_locked; > if (folio_maybe_dma_pinned(folio)) > goto keep_locked; > - if (PageTransHuge(page)) { > - /* cannot split THP, skip it */ > - if (!can_split_folio(folio, NULL)) > - goto activate_locked; > - /* > - * Split pages without a PMD map right > - * away. Chances are some or all of the > - * tail pages can be freed without IO. > - */ > - if (!folio_entire_mapcount(folio) && > - split_folio_to_list(folio, > - page_list)) > - goto activate_locked; > - } > + /* > + * Split pages without a PMD map right > + * away. Chances are some or all of the > + * tail pages can be freed without IO. > + */ This could use more of the line length and be more readable: /* * Split pages without a PMD map right away. * Chances are some or all of the tail pages * can be freed without IO. */ > + if (PageTransHuge(page) && !folio_entire_mapcount(folio) && Please put the folio_entire_mapcoun ont a separate line to make this a bit more redable.
On Sat, 2022-04-09 at 17:34 +0800, Miaohe Lin wrote: > We don't need to check can_split_folio() because folio_maybe_dma_pinned() > is checked before. It will avoid the long term pinned pages to be swapped > out. And we can live with short term pinned pages. Without can_split_folio > checking we can simplify the code. Also activate_locked can be changed to > keep_locked as it's just short term pinning. > > Suggested-by: Huang, Ying <ying.huang@intel.com> > Signed-off-by: Miaohe Lin <linmiaohe@huawei.com> Look good to me. Thanks! Reviewed-by: Huang, Ying <ying.huang@intel.com> Best Regards, Huang, Ying > --- > mm/vmscan.c | 22 ++++++++-------------- > 1 file changed, 8 insertions(+), 14 deletions(-) > > diff --git a/mm/vmscan.c b/mm/vmscan.c > index 4a76be47bed1..01f5db75a507 100644 > --- a/mm/vmscan.c > +++ b/mm/vmscan.c > @@ -1711,20 +1711,14 @@ static unsigned int shrink_page_list(struct list_head *page_list, > goto keep_locked; > if (folio_maybe_dma_pinned(folio)) > goto keep_locked; > - if (PageTransHuge(page)) { > - /* cannot split THP, skip it */ > - if (!can_split_folio(folio, NULL)) > - goto activate_locked; > - /* > - * Split pages without a PMD map right > - * away. Chances are some or all of the > - * tail pages can be freed without IO. > - */ > - if (!folio_entire_mapcount(folio) && > - split_folio_to_list(folio, > - page_list)) > - goto activate_locked; > - } > + /* > + * Split pages without a PMD map right > + * away. Chances are some or all of the > + * tail pages can be freed without IO. > + */ > + if (PageTransHuge(page) && !folio_entire_mapcount(folio) && > + split_folio_to_list(folio, page_list)) > + goto keep_locked; > if (!add_to_swap(page)) { > if (!PageTransHuge(page)) > goto activate_locked_split;
On 2022/4/11 22:18, Christoph Hellwig wrote: > On Sat, Apr 09, 2022 at 05:34:53PM +0800, Miaohe Lin wrote: >> We don't need to check can_split_folio() because folio_maybe_dma_pinned() >> is checked before. It will avoid the long term pinned pages to be swapped >> out. And we can live with short term pinned pages. Without can_split_folio >> checking we can simplify the code. Also activate_locked can be changed to >> keep_locked as it's just short term pinning. >> >> Suggested-by: Huang, Ying <ying.huang@intel.com> >> Signed-off-by: Miaohe Lin <linmiaohe@huawei.com> >> --- >> mm/vmscan.c | 22 ++++++++-------------- >> 1 file changed, 8 insertions(+), 14 deletions(-) >> >> diff --git a/mm/vmscan.c b/mm/vmscan.c >> index 4a76be47bed1..01f5db75a507 100644 >> --- a/mm/vmscan.c >> +++ b/mm/vmscan.c >> @@ -1711,20 +1711,14 @@ static unsigned int shrink_page_list(struct list_head *page_list, >> goto keep_locked; >> if (folio_maybe_dma_pinned(folio)) >> goto keep_locked; >> - if (PageTransHuge(page)) { >> - /* cannot split THP, skip it */ >> - if (!can_split_folio(folio, NULL)) >> - goto activate_locked; >> - /* >> - * Split pages without a PMD map right >> - * away. Chances are some or all of the >> - * tail pages can be freed without IO. >> - */ >> - if (!folio_entire_mapcount(folio) && >> - split_folio_to_list(folio, >> - page_list)) >> - goto activate_locked; >> - } >> + /* >> + * Split pages without a PMD map right >> + * away. Chances are some or all of the >> + * tail pages can be freed without IO. >> + */ > > This could use more of the line length and be more readable: > > /* > * Split pages without a PMD map right away. > * Chances are some or all of the tail pages > * can be freed without IO. > */ > Looks better. >> + if (PageTransHuge(page) && !folio_entire_mapcount(folio) && > > Please put the folio_entire_mapcoun ont a separate line to make this a > bit more redable. Will do in next version. Many thanks for your comment! > . >
On Sat, Apr 09, 2022 at 05:34:53PM +0800, Miaohe Lin wrote: > We don't need to check can_split_folio() because folio_maybe_dma_pinned() > is checked before. It will avoid the long term pinned pages to be swapped > out. And we can live with short term pinned pages. Without can_split_folio > checking we can simplify the code. Also activate_locked can be changed to > keep_locked as it's just short term pinning. What do you mean by "we can live with short term pinned pages"? Does it mean that it was not pinned when we check folio_maybe_dma_pinned() but now it is? To me it looks like the pinning is fluctuating and we rely on split_folio_to_list() to see whether we succeed or not, and if not we give it another spin in the next round? > Suggested-by: Huang, Ying <ying.huang@intel.com> > Signed-off-by: Miaohe Lin <linmiaohe@huawei.com> > --- > mm/vmscan.c | 22 ++++++++-------------- > 1 file changed, 8 insertions(+), 14 deletions(-) > > diff --git a/mm/vmscan.c b/mm/vmscan.c > index 4a76be47bed1..01f5db75a507 100644 > --- a/mm/vmscan.c > +++ b/mm/vmscan.c > @@ -1711,20 +1711,14 @@ static unsigned int shrink_page_list(struct list_head *page_list, > goto keep_locked; > if (folio_maybe_dma_pinned(folio)) > goto keep_locked; > - if (PageTransHuge(page)) { > - /* cannot split THP, skip it */ > - if (!can_split_folio(folio, NULL)) > - goto activate_locked; > - /* > - * Split pages without a PMD map right > - * away. Chances are some or all of the > - * tail pages can be freed without IO. > - */ > - if (!folio_entire_mapcount(folio) && > - split_folio_to_list(folio, > - page_list)) > - goto activate_locked; > - } > + /* > + * Split pages without a PMD map right > + * away. Chances are some or all of the > + * tail pages can be freed without IO. > + */ > + if (PageTransHuge(page) && !folio_entire_mapcount(folio) && > + split_folio_to_list(folio, page_list)) > + goto keep_locked; > if (!add_to_swap(page)) { > if (!PageTransHuge(page)) > goto activate_locked_split; > -- > 2.23.0 > >
On 2022/4/12 16:59, Oscar Salvador wrote: > On Sat, Apr 09, 2022 at 05:34:53PM +0800, Miaohe Lin wrote: >> We don't need to check can_split_folio() because folio_maybe_dma_pinned() >> is checked before. It will avoid the long term pinned pages to be swapped >> out. And we can live with short term pinned pages. Without can_split_folio >> checking we can simplify the code. Also activate_locked can be changed to >> keep_locked as it's just short term pinning. > > What do you mean by "we can live with short term pinned pages"? > Does it mean that it was not pinned when we check > folio_maybe_dma_pinned() but now it is? > > To me it looks like the pinning is fluctuating and we rely on > split_folio_to_list() to see whether we succeed or not, and if not > we give it another spin in the next round? Yes. Short term pinned pages is relative to long term pinned pages and these pages won't be pinned for a noticeable time. So it's expected to split the folio successfully in the next round as the pinning is really fluctuating. Or am I miss something? Many thanks for your comment and reply! > >> Suggested-by: Huang, Ying <ying.huang@intel.com> >> Signed-off-by: Miaohe Lin <linmiaohe@huawei.com> >> --- >> mm/vmscan.c | 22 ++++++++-------------- >> 1 file changed, 8 insertions(+), 14 deletions(-) >> >> diff --git a/mm/vmscan.c b/mm/vmscan.c >> index 4a76be47bed1..01f5db75a507 100644 >> --- a/mm/vmscan.c >> +++ b/mm/vmscan.c >> @@ -1711,20 +1711,14 @@ static unsigned int shrink_page_list(struct list_head *page_list, >> goto keep_locked; >> if (folio_maybe_dma_pinned(folio)) >> goto keep_locked; >> - if (PageTransHuge(page)) { >> - /* cannot split THP, skip it */ >> - if (!can_split_folio(folio, NULL)) >> - goto activate_locked; >> - /* >> - * Split pages without a PMD map right >> - * away. Chances are some or all of the >> - * tail pages can be freed without IO. >> - */ >> - if (!folio_entire_mapcount(folio) && >> - split_folio_to_list(folio, >> - page_list)) >> - goto activate_locked; >> - } >> + /* >> + * Split pages without a PMD map right >> + * away. Chances are some or all of the >> + * tail pages can be freed without IO. >> + */ >> + if (PageTransHuge(page) && !folio_entire_mapcount(folio) && >> + split_folio_to_list(folio, page_list)) >> + goto keep_locked; >> if (!add_to_swap(page)) { >> if (!PageTransHuge(page)) >> goto activate_locked_split; >> -- >> 2.23.0 >> >> >
On 12.04.22 15:42, Miaohe Lin wrote: > On 2022/4/12 16:59, Oscar Salvador wrote: >> On Sat, Apr 09, 2022 at 05:34:53PM +0800, Miaohe Lin wrote: >>> We don't need to check can_split_folio() because folio_maybe_dma_pinned() >>> is checked before. It will avoid the long term pinned pages to be swapped >>> out. And we can live with short term pinned pages. Without can_split_folio >>> checking we can simplify the code. Also activate_locked can be changed to >>> keep_locked as it's just short term pinning. >> >> What do you mean by "we can live with short term pinned pages"? >> Does it mean that it was not pinned when we check >> folio_maybe_dma_pinned() but now it is? >> >> To me it looks like the pinning is fluctuating and we rely on >> split_folio_to_list() to see whether we succeed or not, and if not >> we give it another spin in the next round? > > Yes. Short term pinned pages is relative to long term pinned pages and these pages won't be > pinned for a noticeable time. So it's expected to split the folio successfully in the next > round as the pinning is really fluctuating. Or am I miss something? > Just so we're on the same page. folio_maybe_dma_pinned() only capture FOLL_PIN, but not FOLL_GET. You can have long-term FOLL_GET right now via vmsplice(). can_split_folio() is more precise then folio_maybe_dma_pinned(), but both are racy as long as the page is still mapped.
On Tue, 2022-04-12 at 16:59 +0200, David Hildenbrand wrote: > On 12.04.22 15:42, Miaohe Lin wrote: > > On 2022/4/12 16:59, Oscar Salvador wrote: > > > On Sat, Apr 09, 2022 at 05:34:53PM +0800, Miaohe Lin wrote: > > > > We don't need to check can_split_folio() because folio_maybe_dma_pinned() > > > > is checked before. It will avoid the long term pinned pages to be swapped > > > > out. And we can live with short term pinned pages. Without can_split_folio > > > > checking we can simplify the code. Also activate_locked can be changed to > > > > keep_locked as it's just short term pinning. > > > > > > What do you mean by "we can live with short term pinned pages"? > > > Does it mean that it was not pinned when we check > > > folio_maybe_dma_pinned() but now it is? > > > > > > To me it looks like the pinning is fluctuating and we rely on > > > split_folio_to_list() to see whether we succeed or not, and if not > > > we give it another spin in the next round? > > > > Yes. Short term pinned pages is relative to long term pinned pages and these pages won't be > > pinned for a noticeable time. So it's expected to split the folio successfully in the next > > round as the pinning is really fluctuating. Or am I miss something? > > > > Just so we're on the same page. folio_maybe_dma_pinned() only capture > FOLL_PIN, but not FOLL_GET. You can have long-term FOLL_GET right now > via vmsplice(). Per my original understanding, folio_maybe_dma_pinned() can be used to detect long-term pinned pages. And it seems reasonable to skip the long-term pinned pages and try short-term pinned pages during page reclaiming. But as you pointed out, vmsplice() doesn't use FOLL_PIN. So if vmsplice() is expected to pin pages for long time, and we have no way to detect it, then we should keep can_split_folio() in the original code. Copying more people who have worked on long-term pinning for comments. Best Regards, Huang, Ying > can_split_folio() is more precise then folio_maybe_dma_pinned(), but > both are racy as long as the page is still mapped. > >
On 2022/4/13 9:26, ying.huang@intel.com wrote: > On Tue, 2022-04-12 at 16:59 +0200, David Hildenbrand wrote: >> On 12.04.22 15:42, Miaohe Lin wrote: >>> On 2022/4/12 16:59, Oscar Salvador wrote: >>>> On Sat, Apr 09, 2022 at 05:34:53PM +0800, Miaohe Lin wrote: >>>>> We don't need to check can_split_folio() because folio_maybe_dma_pinned() >>>>> is checked before. It will avoid the long term pinned pages to be swapped >>>>> out. And we can live with short term pinned pages. Without can_split_folio >>>>> checking we can simplify the code. Also activate_locked can be changed to >>>>> keep_locked as it's just short term pinning. >>>> >>>> What do you mean by "we can live with short term pinned pages"? >>>> Does it mean that it was not pinned when we check >>>> folio_maybe_dma_pinned() but now it is? >>>> >>>> To me it looks like the pinning is fluctuating and we rely on >>>> split_folio_to_list() to see whether we succeed or not, and if not >>>> we give it another spin in the next round? >>> >>> Yes. Short term pinned pages is relative to long term pinned pages and these pages won't be >>> pinned for a noticeable time. So it's expected to split the folio successfully in the next >>> round as the pinning is really fluctuating. Or am I miss something? >>> >> >> Just so we're on the same page. folio_maybe_dma_pinned() only capture >> FOLL_PIN, but not FOLL_GET. You can have long-term FOLL_GET right now >> via vmsplice(). > > Per my original understanding, folio_maybe_dma_pinned() can be used to > detect long-term pinned pages. And it seems reasonable to skip the > long-term pinned pages and try short-term pinned pages during page > reclaiming. But as you pointed out, vmsplice() doesn't use FOLL_PIN. > So if vmsplice() is expected to pin pages for long time, and we have no > way to detect it, then we should keep can_split_folio() in the original > code. IIUC, even if we have no way to detect it, can_split_folio should be removed due to: """ can_split_huge_page is introduced via commit b8f593cd0896 ("mm, THP, swap: check whether THP can be split firstly") to avoid deleting the THP from the swap cache and freeing the swap cluster when the THP cannot be split. But since commit bd4c82c22c36 ("mm, THP, swap: delay splitting THP after swapped out"), splitting THP is delayed until THP is swapped out. There's no need to delete the THP from the swap cache and free the swap cluster anymore. Thus we can remove this unneeded can_split_huge_page check now to simplify the code logic. """ THP might not need to be splitted and could be freed directly after swapout. So can_split_huge_page check here is unneeded. Or am I miss something? Thanks! > > Copying more people who have worked on long-term pinning for comments. > > Best Regards, > Huang, Ying > >> can_split_folio() is more precise then folio_maybe_dma_pinned(), but >> both are racy as long as the page is still mapped. >> >> > > > . >
On Wed, 2022-04-13 at 09:26 +0800, ying.huang@intel.com wrote: > On Tue, 2022-04-12 at 16:59 +0200, David Hildenbrand wrote: > > On 12.04.22 15:42, Miaohe Lin wrote: > > > On 2022/4/12 16:59, Oscar Salvador wrote: > > > > On Sat, Apr 09, 2022 at 05:34:53PM +0800, Miaohe Lin wrote: > > > > > We don't need to check can_split_folio() because folio_maybe_dma_pinned() > > > > > is checked before. It will avoid the long term pinned pages to be swapped > > > > > out. And we can live with short term pinned pages. Without can_split_folio > > > > > checking we can simplify the code. Also activate_locked can be changed to > > > > > keep_locked as it's just short term pinning. > > > > > > > > What do you mean by "we can live with short term pinned pages"? > > > > Does it mean that it was not pinned when we check > > > > folio_maybe_dma_pinned() but now it is? > > > > > > > > To me it looks like the pinning is fluctuating and we rely on > > > > split_folio_to_list() to see whether we succeed or not, and if not > > > > we give it another spin in the next round? > > > > > > Yes. Short term pinned pages is relative to long term pinned pages and these pages won't be > > > pinned for a noticeable time. So it's expected to split the folio successfully in the next > > > round as the pinning is really fluctuating. Or am I miss something? > > > > > > > Just so we're on the same page. folio_maybe_dma_pinned() only capture > > FOLL_PIN, but not FOLL_GET. You can have long-term FOLL_GET right now > > via vmsplice(). > > Per my original understanding, folio_maybe_dma_pinned() can be used to > detect long-term pinned pages. And it seems reasonable to skip the > long-term pinned pages and try short-term pinned pages during page > reclaiming. But as you pointed out, vmsplice() doesn't use FOLL_PIN. > So if vmsplice() is expected to pin pages for long time, and we have no > way to detect it, then we should keep can_split_folio() in the original > code. > > Copying more people who have worked on long-term pinning for comments. Checked the discussion in the following thread, https://lore.kernel.org/lkml/CA+CK2bBffHBxjmb9jmSKacm0fJMinyt3Nhk8Nx6iudcQSj80_w@mail.gmail.com/ It seems that from the practical point of view, folio_maybe_dma_pinned() can identify most long-term pinned pages that may block memory hot- remove or CMA allocation. Although as David pointed out, some pages may still be GUPed for long time (e.g. via vmsplice) even if !folio_maybe_dma_pinned(). But from another point of view, can_split_huge_page() is cheap and THP swapout is expensive (swap space, disk IO, and hard to be recovered), so it may be better to keep can_split_huge_page() in shink_page_list(). Best Regards, Huang, Ying > > > can_split_folio() is more precise then folio_maybe_dma_pinned(), but > > both are racy as long as the page is still mapped. > > > > >
On 2022/4/15 11:07, ying.huang@intel.com wrote: > On Wed, 2022-04-13 at 09:26 +0800, ying.huang@intel.com wrote: >> On Tue, 2022-04-12 at 16:59 +0200, David Hildenbrand wrote: >>> On 12.04.22 15:42, Miaohe Lin wrote: >>>> On 2022/4/12 16:59, Oscar Salvador wrote: >>>>> On Sat, Apr 09, 2022 at 05:34:53PM +0800, Miaohe Lin wrote: >>>>>> We don't need to check can_split_folio() because folio_maybe_dma_pinned() >>>>>> is checked before. It will avoid the long term pinned pages to be swapped >>>>>> out. And we can live with short term pinned pages. Without can_split_folio >>>>>> checking we can simplify the code. Also activate_locked can be changed to >>>>>> keep_locked as it's just short term pinning. >>>>> >>>>> What do you mean by "we can live with short term pinned pages"? >>>>> Does it mean that it was not pinned when we check >>>>> folio_maybe_dma_pinned() but now it is? >>>>> >>>>> To me it looks like the pinning is fluctuating and we rely on >>>>> split_folio_to_list() to see whether we succeed or not, and if not >>>>> we give it another spin in the next round? >>>> >>>> Yes. Short term pinned pages is relative to long term pinned pages and these pages won't be >>>> pinned for a noticeable time. So it's expected to split the folio successfully in the next >>>> round as the pinning is really fluctuating. Or am I miss something? >>>> >>> >>> Just so we're on the same page. folio_maybe_dma_pinned() only capture >>> FOLL_PIN, but not FOLL_GET. You can have long-term FOLL_GET right now >>> via vmsplice(). >> >> Per my original understanding, folio_maybe_dma_pinned() can be used to >> detect long-term pinned pages. And it seems reasonable to skip the >> long-term pinned pages and try short-term pinned pages during page >> reclaiming. But as you pointed out, vmsplice() doesn't use FOLL_PIN. >> So if vmsplice() is expected to pin pages for long time, and we have no >> way to detect it, then we should keep can_split_folio() in the original >> code. >> >> Copying more people who have worked on long-term pinning for comments. > > Checked the discussion in the following thread, > > https://lore.kernel.org/lkml/CA+CK2bBffHBxjmb9jmSKacm0fJMinyt3Nhk8Nx6iudcQSj80_w@mail.gmail.com/ > > It seems that from the practical point of view, folio_maybe_dma_pinned() > can identify most long-term pinned pages that may block memory hot- > remove or CMA allocation. Although as David pointed out, some pages may > still be GUPed for long time (e.g. via vmsplice) even if > !folio_maybe_dma_pinned(). > > But from another point of view, can_split_huge_page() is cheap and THP > swapout is expensive (swap space, disk IO, and hard to be recovered), so > it may be better to keep can_split_huge_page() in shink_page_list(). Many thanks for your explanation. Looks convincing for me. Is it worth a comment about the above stuff? Anyway, will drop this patch. Thanks! > > Best Regards, > Huang, Ying > >> >>> can_split_folio() is more precise then folio_maybe_dma_pinned(), but >>> both are racy as long as the page is still mapped. >>> >>> >> > > > . >
diff --git a/mm/vmscan.c b/mm/vmscan.c index 4a76be47bed1..01f5db75a507 100644 --- a/mm/vmscan.c +++ b/mm/vmscan.c @@ -1711,20 +1711,14 @@ static unsigned int shrink_page_list(struct list_head *page_list, goto keep_locked; if (folio_maybe_dma_pinned(folio)) goto keep_locked; - if (PageTransHuge(page)) { - /* cannot split THP, skip it */ - if (!can_split_folio(folio, NULL)) - goto activate_locked; - /* - * Split pages without a PMD map right - * away. Chances are some or all of the - * tail pages can be freed without IO. - */ - if (!folio_entire_mapcount(folio) && - split_folio_to_list(folio, - page_list)) - goto activate_locked; - } + /* + * Split pages without a PMD map right + * away. Chances are some or all of the + * tail pages can be freed without IO. + */ + if (PageTransHuge(page) && !folio_entire_mapcount(folio) && + split_folio_to_list(folio, page_list)) + goto keep_locked; if (!add_to_swap(page)) { if (!PageTransHuge(page)) goto activate_locked_split;
We don't need to check can_split_folio() because folio_maybe_dma_pinned() is checked before. It will avoid the long term pinned pages to be swapped out. And we can live with short term pinned pages. Without can_split_folio checking we can simplify the code. Also activate_locked can be changed to keep_locked as it's just short term pinning. Suggested-by: Huang, Ying <ying.huang@intel.com> Signed-off-by: Miaohe Lin <linmiaohe@huawei.com> --- mm/vmscan.c | 22 ++++++++-------------- 1 file changed, 8 insertions(+), 14 deletions(-)