Message ID | 20231019131446.317-1-justinjiang@vivo.com (mailing list archive) |
---|---|
Headers | show |
Series | mm: the dirty folio unmap redundantly | expand |
On 19.10.23 15:14, Zhiguo Jiang wrote: > In the shrink_folio_list() the sources of the file dirty folio include > two ways below: > 1. The dirty folio is from the incoming parameter folio_list, > which is the inactive file lru. > 2. The dirty folio is from the PTE dirty bit transferred by > the try_to_unmap(). > > For the first source of the dirty folio, if the dirty folio does not > support pageout, the dirty folio can skip unmap in advance to reduce > recyling time. > > Signed-off-by: Zhiguo Jiang <justinjiang@vivo.com> > --- > > Changelog: > v1->v2: > 1. Keep the original judgment flow. > 2. Add the interface of folio_check_pageout(). > 3. The dirty folio which does not support pageout in inactive file lru > skip unmap in advance. > > mm/vmscan.c | 103 +++++++++++++++++++++++++++++++++------------------- > 1 file changed, 66 insertions(+), 37 deletions(-) > > diff --git a/mm/vmscan.c b/mm/vmscan.c > index a68d01fcc307..e067269275a5 100755 > --- a/mm/vmscan.c > +++ b/mm/vmscan.c > @@ -925,6 +925,44 @@ static void folio_check_dirty_writeback(struct folio *folio, > mapping->a_ops->is_dirty_writeback(folio, dirty, writeback); > } > > +/* Check if a dirty folio can support pageout in the recyling process*/ > +static bool folio_check_pageout(struct folio *folio, > + struct pglist_data *pgdat) > +{ > + int ret = true; > + > + /* > + * Anonymous folios are not handled by flushers and must be written > + * from reclaim context. Do not stall reclaim based on them. > + * MADV_FREE anonymous folios are put into inactive file list too. > + * They could be mistakenly treated as file lru. So further anon > + * test is needed. > + */ > + if (!folio_is_file_lru(folio) || > + (folio_test_anon(folio) && !folio_test_swapbacked(folio))) > + goto out; > + > + if (folio_test_dirty(folio) && > + (!current_is_kswapd() || > + !folio_test_reclaim(folio) || > + !test_bit(PGDAT_DIRTY, &pgdat->flags))) { > + /* > + * Immediately reclaim when written back. > + * Similar in principle to folio_deactivate() > + * except we already have the folio isolated > + * and know it's dirty > + */ > + node_stat_mod_folio(folio, NR_VMSCAN_IMMEDIATE, > + folio_nr_pages(folio)); > + folio_set_reclaim(folio); > + > + ret = false; > + } > + > +out: > + return ret; > +} > + > static struct folio *alloc_demote_folio(struct folio *src, > unsigned long private) > { > @@ -1078,6 +1116,12 @@ static unsigned int shrink_folio_list(struct list_head *folio_list, > if (dirty && !writeback) > stat->nr_unqueued_dirty += nr_pages; > > + /* If the dirty folio dose not support pageout, > + * the dirty folio can skip this recycling. > + */ > + if (!folio_check_pageout(folio, pgdat)) > + goto activate_locked; > + > /* > * Treat this folio as congested if folios are cycling > * through the LRU so quickly that the folios marked > @@ -1261,43 +1305,6 @@ static unsigned int shrink_folio_list(struct list_head *folio_list, > enum ttu_flags flags = TTU_BATCH_FLUSH; > bool was_swapbacked = folio_test_swapbacked(folio); > > - if (folio_test_dirty(folio)) { > - /* > - * Only kswapd can writeback filesystem folios > - * to avoid risk of stack overflow. But avoid > - * injecting inefficient single-folio I/O into > - * flusher writeback as much as possible: only > - * write folios when we've encountered many > - * dirty folios, and when we've already scanned > - * the rest of the LRU for clean folios and see > - * the same dirty folios again (with the reclaim > - * flag set). > - */ > - if (folio_is_file_lru(folio) && > - (!current_is_kswapd() || > - !folio_test_reclaim(folio) || > - !test_bit(PGDAT_DIRTY, &pgdat->flags))) { > - /* > - * Immediately reclaim when written back. > - * Similar in principle to folio_deactivate() > - * except we already have the folio isolated > - * and know it's dirty > - */ > - node_stat_mod_folio(folio, NR_VMSCAN_IMMEDIATE, > - nr_pages); > - folio_set_reclaim(folio); > - > - goto activate_locked; > - } > - > - if (references == FOLIOREF_RECLAIM_CLEAN) > - goto keep_locked; > - if (!may_enter_fs(folio, sc->gfp_mask)) > - goto keep_locked; > - if (!sc->may_writepage) > - goto keep_locked; > - } > - > if (folio_test_pmd_mappable(folio)) > flags |= TTU_SPLIT_HUGE_PMD; > > @@ -1323,6 +1330,28 @@ static unsigned int shrink_folio_list(struct list_head *folio_list, > > mapping = folio_mapping(folio); > if (folio_test_dirty(folio)) { > + /* > + * Only kswapd can writeback filesystem folios > + * to avoid risk of stack overflow. But avoid > + * injecting inefficient single-folio I/O into > + * flusher writeback as much as possible: only > + * write folios when we've encountered many > + * dirty folios, and when we've already scanned > + * the rest of the LRU for clean folios and see > + * the same dirty folios again (with the reclaim > + * flag set). > + */ > + if (folio_is_file_lru(folio) && > + !folio_check_pageout(folio, pgdat)) > + goto activate_locked; > + > + if (references == FOLIOREF_RECLAIM_CLEAN) > + goto keep_locked; > + if (!may_enter_fs(folio, sc->gfp_mask)) > + goto keep_locked; > + if (!sc->may_writepage) > + goto keep_locked; > + > /* > * Folio is dirty. Flush the TLB if a writable entry > * potentially exists to avoid CPU writes after I/O I'm confused. Did you apply this on top of v1 by accident?
在 2023/10/19 22:15, David Hildenbrand 写道: > [你通常不会收到来自 david@redhat.com 的电子邮件。请访问 > https://aka.ms/LearnAboutSenderIdentification,以了解这一点为什么很重要] > > On 19.10.23 15:14, Zhiguo Jiang wrote: >> In the shrink_folio_list() the sources of the file dirty folio include >> two ways below: >> 1. The dirty folio is from the incoming parameter folio_list, >> which is the inactive file lru. >> 2. The dirty folio is from the PTE dirty bit transferred by >> the try_to_unmap(). >> >> For the first source of the dirty folio, if the dirty folio does not >> support pageout, the dirty folio can skip unmap in advance to reduce >> recyling time. >> >> Signed-off-by: Zhiguo Jiang <justinjiang@vivo.com> >> --- >> >> Changelog: >> v1->v2: >> 1. Keep the original judgment flow. >> 2. Add the interface of folio_check_pageout(). >> 3. The dirty folio which does not support pageout in inactive file lru >> skip unmap in advance. >> >> mm/vmscan.c | 103 +++++++++++++++++++++++++++++++++------------------- >> 1 file changed, 66 insertions(+), 37 deletions(-) >> >> diff --git a/mm/vmscan.c b/mm/vmscan.c >> index a68d01fcc307..e067269275a5 100755 >> --- a/mm/vmscan.c >> +++ b/mm/vmscan.c >> @@ -925,6 +925,44 @@ static void folio_check_dirty_writeback(struct >> folio *folio, >> mapping->a_ops->is_dirty_writeback(folio, dirty, >> writeback); >> } >> >> +/* Check if a dirty folio can support pageout in the recyling process*/ >> +static bool folio_check_pageout(struct folio *folio, >> + struct pglist_data *pgdat) >> +{ >> + int ret = true; >> + >> + /* >> + * Anonymous folios are not handled by flushers and must be >> written >> + * from reclaim context. Do not stall reclaim based on them. >> + * MADV_FREE anonymous folios are put into inactive file list too. >> + * They could be mistakenly treated as file lru. So further anon >> + * test is needed. >> + */ >> + if (!folio_is_file_lru(folio) || >> + (folio_test_anon(folio) && !folio_test_swapbacked(folio))) >> + goto out; >> + >> + if (folio_test_dirty(folio) && >> + (!current_is_kswapd() || >> + !folio_test_reclaim(folio) || >> + !test_bit(PGDAT_DIRTY, &pgdat->flags))) { >> + /* >> + * Immediately reclaim when written back. >> + * Similar in principle to folio_deactivate() >> + * except we already have the folio isolated >> + * and know it's dirty >> + */ >> + node_stat_mod_folio(folio, NR_VMSCAN_IMMEDIATE, >> + folio_nr_pages(folio)); >> + folio_set_reclaim(folio); >> + >> + ret = false; >> + } >> + >> +out: >> + return ret; >> +} >> + >> static struct folio *alloc_demote_folio(struct folio *src, >> unsigned long private) >> { >> @@ -1078,6 +1116,12 @@ static unsigned int shrink_folio_list(struct >> list_head *folio_list, >> if (dirty && !writeback) >> stat->nr_unqueued_dirty += nr_pages; >> >> + /* If the dirty folio dose not support pageout, >> + * the dirty folio can skip this recycling. >> + */ >> + if (!folio_check_pageout(folio, pgdat)) >> + goto activate_locked; >> + >> /* >> * Treat this folio as congested if folios are cycling >> * through the LRU so quickly that the folios marked >> @@ -1261,43 +1305,6 @@ static unsigned int shrink_folio_list(struct >> list_head *folio_list, >> enum ttu_flags flags = TTU_BATCH_FLUSH; >> bool was_swapbacked = >> folio_test_swapbacked(folio); >> >> - if (folio_test_dirty(folio)) { >> - /* >> - * Only kswapd can writeback filesystem >> folios >> - * to avoid risk of stack overflow. But >> avoid >> - * injecting inefficient single-folio >> I/O into >> - * flusher writeback as much as >> possible: only >> - * write folios when we've encountered >> many >> - * dirty folios, and when we've already >> scanned >> - * the rest of the LRU for clean folios >> and see >> - * the same dirty folios again (with >> the reclaim >> - * flag set). >> - */ >> - if (folio_is_file_lru(folio) && >> - (!current_is_kswapd() || >> - !folio_test_reclaim(folio) || >> - !test_bit(PGDAT_DIRTY, >> &pgdat->flags))) { >> - /* >> - * Immediately reclaim when >> written back. >> - * Similar in principle to >> folio_deactivate() >> - * except we already have the >> folio isolated >> - * and know it's dirty >> - */ >> - node_stat_mod_folio(folio, >> NR_VMSCAN_IMMEDIATE, >> - nr_pages); >> - folio_set_reclaim(folio); >> - >> - goto activate_locked; >> - } >> - >> - if (references == FOLIOREF_RECLAIM_CLEAN) >> - goto keep_locked; >> - if (!may_enter_fs(folio, sc->gfp_mask)) >> - goto keep_locked; >> - if (!sc->may_writepage) >> - goto keep_locked; >> - } >> - >> if (folio_test_pmd_mappable(folio)) >> flags |= TTU_SPLIT_HUGE_PMD; >> >> @@ -1323,6 +1330,28 @@ static unsigned int shrink_folio_list(struct >> list_head *folio_list, >> >> mapping = folio_mapping(folio); >> if (folio_test_dirty(folio)) { >> + /* >> + * Only kswapd can writeback filesystem folios >> + * to avoid risk of stack overflow. But avoid >> + * injecting inefficient single-folio I/O into >> + * flusher writeback as much as possible: only >> + * write folios when we've encountered many >> + * dirty folios, and when we've already scanned >> + * the rest of the LRU for clean folios and see >> + * the same dirty folios again (with the reclaim >> + * flag set). >> + */ >> + if (folio_is_file_lru(folio) && >> + !folio_check_pageout(folio, pgdat)) >> + goto activate_locked; >> + >> + if (references == FOLIOREF_RECLAIM_CLEAN) >> + goto keep_locked; >> + if (!may_enter_fs(folio, sc->gfp_mask)) >> + goto keep_locked; >> + if (!sc->may_writepage) >> + goto keep_locked; >> + >> /* >> * Folio is dirty. Flush the TLB if a writable >> entry >> * potentially exists to avoid CPU writes after >> I/O > > I'm confused. Did you apply this on top of v1 by accident? Hi, According to my modified mm_vmscan_lru_shrink_inactive test tracelog, in the 32 scanned inactive file pages, 20 were dirty, and the 20 dirty pages were not reclamed, but they took 20us to perform try_to_unmap. I think unreclaimed dirty folio in inactive file lru can skip to perform try_to_unmap. Please help to continue review. Thanks. kswapd0-99 ( 99) [005] ..... 687.793724: mm_vmscan_lru_shrink_inactive: [Justin] nid 0 scan=32 isolate=32 reclamed=12 nr_dirty=20 nr_unqueued_dirty=20 nr_writeback=0 nr_congested=0 nr_immediate=0 nr_activate[0]=0 nr_activate[1]=20 nr_ref_keep=0 nr_unmap_fail=0 priority=2 file=RECLAIM_WB_FILE|RECLAIM_WB_ASYNC total=39 exe=0 reference_cost=5 reference_exe=0 unmap_cost=21 unmap_exe=0 dirty_unmap_cost=20 dirty_unmap_exe=0 pageout_cost=0 pageout_exe=0 > > -- > Cheers, > > David / dhildenb >
在 2023/10/20 11:59, zhiguojiang 写道: > > > 在 2023/10/19 22:15, David Hildenbrand 写道: >> [你通常不会收到来自 david@redhat.com 的电子邮件。请访问 >> https://aka.ms/LearnAboutSenderIdentification,以了解这一点为什么很重要] >> >> On 19.10.23 15:14, Zhiguo Jiang wrote: >>> In the shrink_folio_list() the sources of the file dirty folio include >>> two ways below: >>> 1. The dirty folio is from the incoming parameter folio_list, >>> which is the inactive file lru. >>> 2. The dirty folio is from the PTE dirty bit transferred by >>> the try_to_unmap(). >>> >>> For the first source of the dirty folio, if the dirty folio does not >>> support pageout, the dirty folio can skip unmap in advance to reduce >>> recyling time. >>> >>> Signed-off-by: Zhiguo Jiang <justinjiang@vivo.com> >>> --- >>> >>> Changelog: >>> v1->v2: >>> 1. Keep the original judgment flow. >>> 2. Add the interface of folio_check_pageout(). >>> 3. The dirty folio which does not support pageout in inactive file lru >>> skip unmap in advance. >>> >>> mm/vmscan.c | 103 >>> +++++++++++++++++++++++++++++++++------------------- >>> 1 file changed, 66 insertions(+), 37 deletions(-) >>> >>> diff --git a/mm/vmscan.c b/mm/vmscan.c >>> index a68d01fcc307..e067269275a5 100755 >>> --- a/mm/vmscan.c >>> +++ b/mm/vmscan.c >>> @@ -925,6 +925,44 @@ static void folio_check_dirty_writeback(struct >>> folio *folio, >>> mapping->a_ops->is_dirty_writeback(folio, dirty, >>> writeback); >>> } >>> >>> +/* Check if a dirty folio can support pageout in the recyling >>> process*/ >>> +static bool folio_check_pageout(struct folio *folio, >>> + struct pglist_data >>> *pgdat) >>> +{ >>> + int ret = true; >>> + >>> + /* >>> + * Anonymous folios are not handled by flushers and must be >>> written >>> + * from reclaim context. Do not stall reclaim based on them. >>> + * MADV_FREE anonymous folios are put into inactive file list >>> too. >>> + * They could be mistakenly treated as file lru. So further anon >>> + * test is needed. >>> + */ >>> + if (!folio_is_file_lru(folio) || >>> + (folio_test_anon(folio) && >>> !folio_test_swapbacked(folio))) >>> + goto out; >>> + >>> + if (folio_test_dirty(folio) && >>> + (!current_is_kswapd() || >>> + !folio_test_reclaim(folio) || >>> + !test_bit(PGDAT_DIRTY, &pgdat->flags))) { >>> + /* >>> + * Immediately reclaim when written back. >>> + * Similar in principle to folio_deactivate() >>> + * except we already have the folio isolated >>> + * and know it's dirty >>> + */ >>> + node_stat_mod_folio(folio, NR_VMSCAN_IMMEDIATE, >>> + folio_nr_pages(folio)); >>> + folio_set_reclaim(folio); >>> + >>> + ret = false; >>> + } >>> + >>> +out: >>> + return ret; >>> +} >>> + >>> static struct folio *alloc_demote_folio(struct folio *src, >>> unsigned long private) >>> { >>> @@ -1078,6 +1116,12 @@ static unsigned int shrink_folio_list(struct >>> list_head *folio_list, >>> if (dirty && !writeback) >>> stat->nr_unqueued_dirty += nr_pages; >>> >>> + /* If the dirty folio dose not support pageout, >>> + * the dirty folio can skip this recycling. >>> + */ >>> + if (!folio_check_pageout(folio, pgdat)) >>> + goto activate_locked; >>> + >>> /* >>> * Treat this folio as congested if folios are cycling >>> * through the LRU so quickly that the folios marked >>> @@ -1261,43 +1305,6 @@ static unsigned int shrink_folio_list(struct >>> list_head *folio_list, >>> enum ttu_flags flags = TTU_BATCH_FLUSH; >>> bool was_swapbacked = >>> folio_test_swapbacked(folio); >>> >>> - if (folio_test_dirty(folio)) { >>> - /* >>> - * Only kswapd can writeback >>> filesystem folios >>> - * to avoid risk of stack overflow. >>> But avoid >>> - * injecting inefficient single-folio >>> I/O into >>> - * flusher writeback as much as >>> possible: only >>> - * write folios when we've encountered >>> many >>> - * dirty folios, and when we've >>> already scanned >>> - * the rest of the LRU for clean >>> folios and see >>> - * the same dirty folios again (with >>> the reclaim >>> - * flag set). >>> - */ >>> - if (folio_is_file_lru(folio) && >>> - (!current_is_kswapd() || >>> - !folio_test_reclaim(folio) || >>> - !test_bit(PGDAT_DIRTY, >>> &pgdat->flags))) { >>> - /* >>> - * Immediately reclaim when >>> written back. >>> - * Similar in principle to >>> folio_deactivate() >>> - * except we already have the >>> folio isolated >>> - * and know it's dirty >>> - */ >>> - node_stat_mod_folio(folio, NR_VMSCAN_IMMEDIATE, >>> - nr_pages); >>> - folio_set_reclaim(folio); >>> - >>> - goto activate_locked; >>> - } >>> - >>> - if (references == FOLIOREF_RECLAIM_CLEAN) >>> - goto keep_locked; >>> - if (!may_enter_fs(folio, sc->gfp_mask)) >>> - goto keep_locked; >>> - if (!sc->may_writepage) >>> - goto keep_locked; >>> - } >>> - >>> if (folio_test_pmd_mappable(folio)) >>> flags |= TTU_SPLIT_HUGE_PMD; >>> >>> @@ -1323,6 +1330,28 @@ static unsigned int shrink_folio_list(struct >>> list_head *folio_list, >>> >>> mapping = folio_mapping(folio); >>> if (folio_test_dirty(folio)) { >>> + /* >>> + * Only kswapd can writeback filesystem folios >>> + * to avoid risk of stack overflow. But avoid >>> + * injecting inefficient single-folio I/O into >>> + * flusher writeback as much as possible: only >>> + * write folios when we've encountered many >>> + * dirty folios, and when we've already scanned >>> + * the rest of the LRU for clean folios and see >>> + * the same dirty folios again (with the reclaim >>> + * flag set). >>> + */ >>> + if (folio_is_file_lru(folio) && >>> + !folio_check_pageout(folio, pgdat)) >>> + goto activate_locked; >>> + >>> + if (references == FOLIOREF_RECLAIM_CLEAN) >>> + goto keep_locked; >>> + if (!may_enter_fs(folio, sc->gfp_mask)) >>> + goto keep_locked; >>> + if (!sc->may_writepage) >>> + goto keep_locked; >>> + >>> /* >>> * Folio is dirty. Flush the TLB if a writable >>> entry >>> * potentially exists to avoid CPU writes >>> after I/O >> >> I'm confused. Did you apply this on top of v1 by accident? > Hi, > According to my modified mm_vmscan_lru_shrink_inactive test tracelog, > in the 32 scanned inactive file pages, 20 were dirty, and the 20 dirty > pages were not reclamed, but they took 20us to perform try_to_unmap. > > I think unreclaimed dirty folio in inactive file lru can skip to > perform try_to_unmap. Please help to continue review. Thanks. > > kswapd0-99 ( 99) [005] ..... 687.793724: > mm_vmscan_lru_shrink_inactive: [Justin] nid 0 scan=32 isolate=32 > reclamed=12 nr_dirty=20 nr_unqueued_dirty=20 nr_writeback=0 > nr_congested=0 nr_immediate=0 nr_activate[0]=0 nr_activate[1]=20 > nr_ref_keep=0 nr_unmap_fail=0 priority=2 > file=RECLAIM_WB_FILE|RECLAIM_WB_ASYNC total=39 exe=0 reference_cost=5 > reference_exe=0 unmap_cost=21 unmap_exe=0 dirty_unmap_cost=20 > dirty_unmap_exe=0 pageout_cost=0 pageout_exe=0 > To supplement, I think the unreclaimed dirty folio of the inactive file lru in shrink_folio_list() can exit the recyling flow in advance and avoid to execute some time-consuming interfaces, such as folio_check_references() and try_to_unmap(). >> -- >> Cheers, >> >> David / dhildenb >> >
On Fri, Oct 20, 2023 at 11:59:33AM +0800, zhiguojiang wrote: > > > @@ -1261,43 +1305,6 @@ static unsigned int shrink_folio_list(struct > > > list_head *folio_list, > > > enum ttu_flags flags = TTU_BATCH_FLUSH; > > > bool was_swapbacked = > > > folio_test_swapbacked(folio); > > > > > > - if (folio_test_dirty(folio)) { > > > - /* > > > - * Only kswapd can writeback > > > filesystem folios > > > - * to avoid risk of stack overflow. > > > But avoid > > > - * injecting inefficient single-folio > > > I/O into > > > - * flusher writeback as much as > > > possible: only > > > - * write folios when we've encountered > > > many > > > - * dirty folios, and when we've > > > already scanned > > > - * the rest of the LRU for clean > > > folios and see > > > - * the same dirty folios again (with > > > the reclaim > > > - * flag set). > > > - */ > > > - if (folio_is_file_lru(folio) && > > > - (!current_is_kswapd() || > > > - !folio_test_reclaim(folio) || > > > - !test_bit(PGDAT_DIRTY, > > > &pgdat->flags))) { > > > - /* > > > - * Immediately reclaim when > > > written back. > > > - * Similar in principle to > > > folio_deactivate() > > > - * except we already have the > > > folio isolated > > > - * and know it's dirty > > > - */ > > > - node_stat_mod_folio(folio, > > > NR_VMSCAN_IMMEDIATE, > > > - nr_pages); > > > - folio_set_reclaim(folio); > > > - > > > - goto activate_locked; > > > - } > > > - > > > - if (references == FOLIOREF_RECLAIM_CLEAN) > > > - goto keep_locked; > > > - if (!may_enter_fs(folio, sc->gfp_mask)) > > > - goto keep_locked; > > > - if (!sc->may_writepage) > > > - goto keep_locked; > > > - } > > > - > > > if (folio_test_pmd_mappable(folio)) > > > flags |= TTU_SPLIT_HUGE_PMD; > > > > > > > I'm confused. Did you apply this on top of v1 by accident? > Hi, > According to my modified mm_vmscan_lru_shrink_inactive test tracelog, in the You're missing David's point. You've generated this patch against ... something ... that isn't upstream. Probably against v1 of your patch. Please check your git tree. > 32 scanned inactive file pages, 20 were dirty, and the 20 dirty pages were > not reclamed, but they took 20us to perform try_to_unmap. > > I think unreclaimed dirty folio in inactive file lru can skip to perform > try_to_unmap. Please help to continue review. Thanks. > > kswapd0-99 ( 99) [005] ..... 687.793724: > mm_vmscan_lru_shrink_inactive: [Justin] nid 0 scan=32 isolate=32 reclamed=12 > nr_dirty=20 nr_unqueued_dirty=20 nr_writeback=0 nr_congested=0 > nr_immediate=0 nr_activate[0]=0 nr_activate[1]=20 nr_ref_keep=0 > nr_unmap_fail=0 priority=2 file=RECLAIM_WB_FILE|RECLAIM_WB_ASYNC total=39 > exe=0 reference_cost=5 reference_exe=0 unmap_cost=21 unmap_exe=0 > dirty_unmap_cost=20 dirty_unmap_exe=0 pageout_cost=0 pageout_exe=0 Are you seeing measurable changes for any workloads? It certainly seems like you should, but it would help if you chose a test from mmtests and showed how performance changed on your system.
在 2023/10/20 12:15, Matthew Wilcox 写道: > On Fri, Oct 20, 2023 at 11:59:33AM +0800, zhiguojiang wrote: >>>> @@ -1261,43 +1305,6 @@ static unsigned int shrink_folio_list(struct >>>> list_head *folio_list, >>>> enum ttu_flags flags = TTU_BATCH_FLUSH; >>>> bool was_swapbacked = >>>> folio_test_swapbacked(folio); >>>> >>>> - if (folio_test_dirty(folio)) { >>>> - /* >>>> - * Only kswapd can writeback >>>> filesystem folios >>>> - * to avoid risk of stack overflow. >>>> But avoid >>>> - * injecting inefficient single-folio >>>> I/O into >>>> - * flusher writeback as much as >>>> possible: only >>>> - * write folios when we've encountered >>>> many >>>> - * dirty folios, and when we've >>>> already scanned >>>> - * the rest of the LRU for clean >>>> folios and see >>>> - * the same dirty folios again (with >>>> the reclaim >>>> - * flag set). >>>> - */ >>>> - if (folio_is_file_lru(folio) && >>>> - (!current_is_kswapd() || >>>> - !folio_test_reclaim(folio) || >>>> - !test_bit(PGDAT_DIRTY, >>>> &pgdat->flags))) { >>>> - /* >>>> - * Immediately reclaim when >>>> written back. >>>> - * Similar in principle to >>>> folio_deactivate() >>>> - * except we already have the >>>> folio isolated >>>> - * and know it's dirty >>>> - */ >>>> - node_stat_mod_folio(folio, >>>> NR_VMSCAN_IMMEDIATE, >>>> - nr_pages); >>>> - folio_set_reclaim(folio); >>>> - >>>> - goto activate_locked; >>>> - } >>>> - >>>> - if (references == FOLIOREF_RECLAIM_CLEAN) >>>> - goto keep_locked; >>>> - if (!may_enter_fs(folio, sc->gfp_mask)) >>>> - goto keep_locked; >>>> - if (!sc->may_writepage) >>>> - goto keep_locked; >>>> - } >>>> - >>>> if (folio_test_pmd_mappable(folio)) >>>> flags |= TTU_SPLIT_HUGE_PMD; >>>> >>> I'm confused. Did you apply this on top of v1 by accident? >> Hi, >> According to my modified mm_vmscan_lru_shrink_inactive test tracelog, in the > You're missing David's point. You've generated this patch against ... > something ... that isn't upstream. Probably against v1 of your > patch. Please check your git tree. Yes, [PATCH v2 1/2] is against my patch v1 index 2cc0cb41fb32..cf555cdfcefc. > >> 32 scanned inactive file pages, 20 were dirty, and the 20 dirty pages were >> not reclamed, but they took 20us to perform try_to_unmap. >> >> I think unreclaimed dirty folio in inactive file lru can skip to perform >> try_to_unmap. Please help to continue review. Thanks. >> >> kswapd0-99 ( 99) [005] ..... 687.793724: >> mm_vmscan_lru_shrink_inactive: [Justin] nid 0 scan=32 isolate=32 reclamed=12 >> nr_dirty=20 nr_unqueued_dirty=20 nr_writeback=0 nr_congested=0 >> nr_immediate=0 nr_activate[0]=0 nr_activate[1]=20 nr_ref_keep=0 >> nr_unmap_fail=0 priority=2 file=RECLAIM_WB_FILE|RECLAIM_WB_ASYNC total=39 >> exe=0 reference_cost=5 reference_exe=0 unmap_cost=21 unmap_exe=0 >> dirty_unmap_cost=20 dirty_unmap_exe=0 pageout_cost=0 pageout_exe=0 > Are you seeing measurable changes for any workloads? It certainly seems > like you should, but it would help if you chose a test from mmtests and > showed how performance changed on your system.
在 2023/10/20 12:15, Matthew Wilcox 写道: > On Fri, Oct 20, 2023 at 11:59:33AM +0800, zhiguojiang wrote: >>>> @@ -1261,43 +1305,6 @@ static unsigned int shrink_folio_list(struct >>>> list_head *folio_list, >>>> enum ttu_flags flags = TTU_BATCH_FLUSH; >>>> bool was_swapbacked = >>>> folio_test_swapbacked(folio); >>>> >>>> - if (folio_test_dirty(folio)) { >>>> - /* >>>> - * Only kswapd can writeback >>>> filesystem folios >>>> - * to avoid risk of stack overflow. >>>> But avoid >>>> - * injecting inefficient single-folio >>>> I/O into >>>> - * flusher writeback as much as >>>> possible: only >>>> - * write folios when we've encountered >>>> many >>>> - * dirty folios, and when we've >>>> already scanned >>>> - * the rest of the LRU for clean >>>> folios and see >>>> - * the same dirty folios again (with >>>> the reclaim >>>> - * flag set). >>>> - */ >>>> - if (folio_is_file_lru(folio) && >>>> - (!current_is_kswapd() || >>>> - !folio_test_reclaim(folio) || >>>> - !test_bit(PGDAT_DIRTY, >>>> &pgdat->flags))) { >>>> - /* >>>> - * Immediately reclaim when >>>> written back. >>>> - * Similar in principle to >>>> folio_deactivate() >>>> - * except we already have the >>>> folio isolated >>>> - * and know it's dirty >>>> - */ >>>> - node_stat_mod_folio(folio, >>>> NR_VMSCAN_IMMEDIATE, >>>> - nr_pages); >>>> - folio_set_reclaim(folio); >>>> - >>>> - goto activate_locked; >>>> - } >>>> - >>>> - if (references == FOLIOREF_RECLAIM_CLEAN) >>>> - goto keep_locked; >>>> - if (!may_enter_fs(folio, sc->gfp_mask)) >>>> - goto keep_locked; >>>> - if (!sc->may_writepage) >>>> - goto keep_locked; >>>> - } >>>> - >>>> if (folio_test_pmd_mappable(folio)) >>>> flags |= TTU_SPLIT_HUGE_PMD; >>>> >>> I'm confused. Did you apply this on top of v1 by accident? >> Hi, >> According to my modified mm_vmscan_lru_shrink_inactive test tracelog, in the > You're missing David's point. You've generated this patch against ... > something ... that isn't upstream. Probably against v1 of your > patch. Please check your git tree. > >> 32 scanned inactive file pages, 20 were dirty, and the 20 dirty pages were >> not reclamed, but they took 20us to perform try_to_unmap. >> >> I think unreclaimed dirty folio in inactive file lru can skip to perform >> try_to_unmap. Please help to continue review. Thanks. >> >> kswapd0-99 ( 99) [005] ..... 687.793724: >> mm_vmscan_lru_shrink_inactive: [Justin] nid 0 scan=32 isolate=32 reclamed=12 >> nr_dirty=20 nr_unqueued_dirty=20 nr_writeback=0 nr_congested=0 >> nr_immediate=0 nr_activate[0]=0 nr_activate[1]=20 nr_ref_keep=0 >> nr_unmap_fail=0 priority=2 file=RECLAIM_WB_FILE|RECLAIM_WB_ASYNC total=39 >> exe=0 reference_cost=5 reference_exe=0 unmap_cost=21 unmap_exe=0 >> dirty_unmap_cost=20 dirty_unmap_exe=0 pageout_cost=0 pageout_exe=0 > Are you seeing measurable changes for any workloads? It certainly seems > like you should, but it would help if you chose a test from mmtests and > showed how performance changed on your system. In one mmtest, the max times for a invalid recyling of a folio_list dirty folio that does not support pageout and has been activated in shrink_folio_list() are: cost=51us, exe=2365us. Calculate according to this formula: dirty_cost / total_cost * 100%, the recyling efficiency of dirty folios can be improved 53.13%、82.95%. So this patch can optimize shrink efficiency and reduce the workload of kswapd to a certain extent. kswapd0-96 ( 96) [005] ..... 387.218548: mm_vmscan_lru_shrink_inactive: [Justin] nid 0 nr_scanned 32 nr_taken 32 nr_reclaimed 31 nr_dirty 1 nr_unqueued_dirty 1 nr_writeback 0 nr_activate[1] 1 nr_ref_keep 0 f RECLAIM_WB_FILE|RECLAIM_WB_ASYNC total_cost 96 total_exe 2365 dirty_cost 51 total_exe 2365 kswapd0-96 ( 96) [006] ..... 412.822532: mm_vmscan_lru_shrink_inactive: [Justin] nid 0 nr_scanned 32 nr_taken 32 nr_reclaimed 0 nr_dirty 32 nr_unqueued_dirty 32 nr_writeback 0 nr_activate[1] 19 nr_ref_keep 13 f RECLAIM_WB_FILE|RECLAIM_WB_ASYNC total_cost 88 total_exe 605 dirty_cost 73 total_exe 605
On Mon, Oct 23, 2023 at 04:07:28PM +0800, zhiguojiang wrote: > > Are you seeing measurable changes for any workloads? It certainly seems > > like you should, but it would help if you chose a test from mmtests and > > showed how performance changed on your system. > In one mmtest, the max times for a invalid recyling of a folio_list dirty > folio that does not support pageout and has been activated in > shrink_folio_list() are: cost=51us, exe=2365us. > > Calculate according to this formula: dirty_cost / total_cost * 100%, the > recyling efficiency of dirty folios can be improved 53.13%、82.95%. > > So this patch can optimize shrink efficiency and reduce the workload of > kswapd to a certain extent. > > kswapd0-96 ( 96) [005] ..... 387.218548: > mm_vmscan_lru_shrink_inactive: [Justin] nid 0 nr_scanned 32 nr_taken 32 > nr_reclaimed 31 nr_dirty 1 nr_unqueued_dirty 1 nr_writeback 0 > nr_activate[1] 1 nr_ref_keep 0 f RECLAIM_WB_FILE|RECLAIM_WB_ASYNC > total_cost 96 total_exe 2365 dirty_cost 51 total_exe 2365 > > kswapd0-96 ( 96) [006] ..... 412.822532: > mm_vmscan_lru_shrink_inactive: [Justin] nid 0 nr_scanned 32 nr_taken 32 > nr_reclaimed 0 nr_dirty 32 nr_unqueued_dirty 32 nr_writeback 0 > nr_activate[1] 19 nr_ref_keep 13 f RECLAIM_WB_FILE|RECLAIM_WB_ASYNC > total_cost 88 total_exe 605 dirty_cost 73 total_exe 605 I appreciate that you can put probes in and determine the cost, but do you see improvements for a real workload? Like doing a kernel compile -- does it speed up at all?
在 2023/10/23 20:21, Matthew Wilcox 写道: > On Mon, Oct 23, 2023 at 04:07:28PM +0800, zhiguojiang wrote: >>> Are you seeing measurable changes for any workloads? It certainly seems >>> like you should, but it would help if you chose a test from mmtests and >>> showed how performance changed on your system. >> In one mmtest, the max times for a invalid recyling of a folio_list dirty >> folio that does not support pageout and has been activated in >> shrink_folio_list() are: cost=51us, exe=2365us. >> >> Calculate according to this formula: dirty_cost / total_cost * 100%, the >> recyling efficiency of dirty folios can be improved 53.13%、82.95%. >> >> So this patch can optimize shrink efficiency and reduce the workload of >> kswapd to a certain extent. >> >> kswapd0-96 ( 96) [005] ..... 387.218548: >> mm_vmscan_lru_shrink_inactive: [Justin] nid 0 nr_scanned 32 nr_taken 32 >> nr_reclaimed 31 nr_dirty 1 nr_unqueued_dirty 1 nr_writeback 0 >> nr_activate[1] 1 nr_ref_keep 0 f RECLAIM_WB_FILE|RECLAIM_WB_ASYNC >> total_cost 96 total_exe 2365 dirty_cost 51 total_exe 2365 >> >> kswapd0-96 ( 96) [006] ..... 412.822532: >> mm_vmscan_lru_shrink_inactive: [Justin] nid 0 nr_scanned 32 nr_taken 32 >> nr_reclaimed 0 nr_dirty 32 nr_unqueued_dirty 32 nr_writeback 0 >> nr_activate[1] 19 nr_ref_keep 13 f RECLAIM_WB_FILE|RECLAIM_WB_ASYNC >> total_cost 88 total_exe 605 dirty_cost 73 total_exe 605 > I appreciate that you can put probes in and determine the cost, but do > you see improvements for a real workload? Like doing a kernel compile > -- does it speed up at all? Can you help share a method for testing thread workload, like kswapd? Thanks.
On Mon, Oct 23, 2023 at 08:44:55PM +0800, zhiguojiang wrote: > 在 2023/10/23 20:21, Matthew Wilcox 写道: > > On Mon, Oct 23, 2023 at 04:07:28PM +0800, zhiguojiang wrote: > > > > Are you seeing measurable changes for any workloads? It certainly seems > > > > like you should, but it would help if you chose a test from mmtests and > > > > showed how performance changed on your system. > > > In one mmtest, the max times for a invalid recyling of a folio_list dirty > > > folio that does not support pageout and has been activated in > > > shrink_folio_list() are: cost=51us, exe=2365us. > > > > > > Calculate according to this formula: dirty_cost / total_cost * 100%, the > > > recyling efficiency of dirty folios can be improved 53.13%、82.95%. > > > > > > So this patch can optimize shrink efficiency and reduce the workload of > > > kswapd to a certain extent. > > > > > > kswapd0-96 ( 96) [005] ..... 387.218548: > > > mm_vmscan_lru_shrink_inactive: [Justin] nid 0 nr_scanned 32 nr_taken 32 > > > nr_reclaimed 31 nr_dirty 1 nr_unqueued_dirty 1 nr_writeback 0 > > > nr_activate[1] 1 nr_ref_keep 0 f RECLAIM_WB_FILE|RECLAIM_WB_ASYNC > > > total_cost 96 total_exe 2365 dirty_cost 51 total_exe 2365 > > > > > > kswapd0-96 ( 96) [006] ..... 412.822532: > > > mm_vmscan_lru_shrink_inactive: [Justin] nid 0 nr_scanned 32 nr_taken 32 > > > nr_reclaimed 0 nr_dirty 32 nr_unqueued_dirty 32 nr_writeback 0 > > > nr_activate[1] 19 nr_ref_keep 13 f RECLAIM_WB_FILE|RECLAIM_WB_ASYNC > > > total_cost 88 total_exe 605 dirty_cost 73 total_exe 605 > > I appreciate that you can put probes in and determine the cost, but do > > you see improvements for a real workload? Like doing a kernel compile > > -- does it speed up at all? > Can you help share a method for testing thread workload, like kswapd? Something dirt simple like 'time make -j8'.
在 2023/10/23 21:01, Matthew Wilcox 写道: > On Mon, Oct 23, 2023 at 08:44:55PM +0800, zhiguojiang wrote: >> 在 2023/10/23 20:21, Matthew Wilcox 写道: >>> On Mon, Oct 23, 2023 at 04:07:28PM +0800, zhiguojiang wrote: >>>>> Are you seeing measurable changes for any workloads? It certainly seems >>>>> like you should, but it would help if you chose a test from mmtests and >>>>> showed how performance changed on your system. >>>> In one mmtest, the max times for a invalid recyling of a folio_list dirty >>>> folio that does not support pageout and has been activated in >>>> shrink_folio_list() are: cost=51us, exe=2365us. >>>> >>>> Calculate according to this formula: dirty_cost / total_cost * 100%, the >>>> recyling efficiency of dirty folios can be improved 53.13%、82.95%. >>>> >>>> So this patch can optimize shrink efficiency and reduce the workload of >>>> kswapd to a certain extent. >>>> >>>> kswapd0-96 ( 96) [005] ..... 387.218548: >>>> mm_vmscan_lru_shrink_inactive: [Justin] nid 0 nr_scanned 32 nr_taken 32 >>>> nr_reclaimed 31 nr_dirty 1 nr_unqueued_dirty 1 nr_writeback 0 >>>> nr_activate[1] 1 nr_ref_keep 0 f RECLAIM_WB_FILE|RECLAIM_WB_ASYNC >>>> total_cost 96 total_exe 2365 dirty_cost 51 total_exe 2365 >>>> >>>> kswapd0-96 ( 96) [006] ..... 412.822532: >>>> mm_vmscan_lru_shrink_inactive: [Justin] nid 0 nr_scanned 32 nr_taken 32 >>>> nr_reclaimed 0 nr_dirty 32 nr_unqueued_dirty 32 nr_writeback 0 >>>> nr_activate[1] 19 nr_ref_keep 13 f RECLAIM_WB_FILE|RECLAIM_WB_ASYNC >>>> total_cost 88 total_exe 605 dirty_cost 73 total_exe 605 >>> I appreciate that you can put probes in and determine the cost, but do >>> you see improvements for a real workload? Like doing a kernel compile >>> -- does it speed up at all? >> Can you help share a method for testing thread workload, like kswapd? > Something dirt simple like 'time make -j8'. Two compilations were conducted separately, and compared to the unmodified compilation, the compilation time for adding modified patches had a certain reduction, as follows: Compilation command: make distclean -j8 make ARCH=x86_64 x86_64_defconfig time make -j8 1.Unmodified Compilation time: real 2m40.276s user 16m2.956s sys 2m14.738s real 2m40.136s user 16m2.617s sys 2m14.722s 2.[Patch v2 1/2] Modified Compilation time: real 2m40.067s user 16m3.164s sys 2m14.211s real 2m40.123s user 16m2.439s sys 2m14.508s 3 [Patch v2 1/2] + [Patch v2 2/2] Modified Compilation time: real 2m40.367s user 16m3.738s sys 2m13.662s real 2m40.014s user 16m3.108s sys 2m14.096s
在 2023/10/23 21:01, Matthew Wilcox 写道: > On Mon, Oct 23, 2023 at 08:44:55PM +0800, zhiguojiang wrote: >> 在 2023/10/23 20:21, Matthew Wilcox 写道: >>> On Mon, Oct 23, 2023 at 04:07:28PM +0800, zhiguojiang wrote: >>>>> Are you seeing measurable changes for any workloads? It certainly seems >>>>> like you should, but it would help if you chose a test from mmtests and >>>>> showed how performance changed on your system. >>>> In one mmtest, the max times for a invalid recyling of a folio_list dirty >>>> folio that does not support pageout and has been activated in >>>> shrink_folio_list() are: cost=51us, exe=2365us. >>>> >>>> Calculate according to this formula: dirty_cost / total_cost * 100%, the >>>> recyling efficiency of dirty folios can be improved 53.13%、82.95%. >>>> >>>> So this patch can optimize shrink efficiency and reduce the workload of >>>> kswapd to a certain extent. >>>> >>>> kswapd0-96 ( 96) [005] ..... 387.218548: >>>> mm_vmscan_lru_shrink_inactive: [Justin] nid 0 nr_scanned 32 nr_taken 32 >>>> nr_reclaimed 31 nr_dirty 1 nr_unqueued_dirty 1 nr_writeback 0 >>>> nr_activate[1] 1 nr_ref_keep 0 f RECLAIM_WB_FILE|RECLAIM_WB_ASYNC >>>> total_cost 96 total_exe 2365 dirty_cost 51 total_exe 2365 >>>> >>>> kswapd0-96 ( 96) [006] ..... 412.822532: >>>> mm_vmscan_lru_shrink_inactive: [Justin] nid 0 nr_scanned 32 nr_taken 32 >>>> nr_reclaimed 0 nr_dirty 32 nr_unqueued_dirty 32 nr_writeback 0 >>>> nr_activate[1] 19 nr_ref_keep 13 f RECLAIM_WB_FILE|RECLAIM_WB_ASYNC >>>> total_cost 88 total_exe 605 dirty_cost 73 total_exe 605 >>> I appreciate that you can put probes in and determine the cost, but do >>> you see improvements for a real workload? Like doing a kernel compile >>> -- does it speed up at all? >> Can you help share a method for testing thread workload, like kswapd? > Something dirt simple like 'time make -j8'. Two compilations were conducted separately, and compared to the unmodified compilation, the compilation time for adding modified patches had a certain reduction, as follows: Compilation command: make distclean -j8 make ARCH=x86_64 x86_64_defconfig time make -j8 1.Unmodified Compilation time: real 2m40.276s user 16m2.956s sys 2m14.738s real 2m40.136s user 16m2.617s sys 2m14.722s 2.[Patch v2 1/2] Modified Compilation time: real 2m40.067s user 16m3.164s sys 2m14.211s real 2m40.123s user 16m2.439s sys 2m14.508s 3.[Patch v2 1/2] + [Patch v2 2/2] Modified Compilation time: real 2m40.367s user 16m3.738s sys 2m13.662s real 2m40.014s user 16m3.108s sys 2m14.096s Thanks
On 24.10.23 04:04, zhiguojiang wrote: > > > 在 2023/10/23 21:01, Matthew Wilcox 写道: >> On Mon, Oct 23, 2023 at 08:44:55PM +0800, zhiguojiang wrote: >>> 在 2023/10/23 20:21, Matthew Wilcox 写道: >>>> On Mon, Oct 23, 2023 at 04:07:28PM +0800, zhiguojiang wrote: >>>>>> Are you seeing measurable changes for any workloads? It certainly seems >>>>>> like you should, but it would help if you chose a test from mmtests and >>>>>> showed how performance changed on your system. >>>>> In one mmtest, the max times for a invalid recyling of a folio_list dirty >>>>> folio that does not support pageout and has been activated in >>>>> shrink_folio_list() are: cost=51us, exe=2365us. >>>>> >>>>> Calculate according to this formula: dirty_cost / total_cost * 100%, the >>>>> recyling efficiency of dirty folios can be improved 53.13%、82.95%. >>>>> >>>>> So this patch can optimize shrink efficiency and reduce the workload of >>>>> kswapd to a certain extent. >>>>> >>>>> kswapd0-96 ( 96) [005] ..... 387.218548: >>>>> mm_vmscan_lru_shrink_inactive: [Justin] nid 0 nr_scanned 32 nr_taken 32 >>>>> nr_reclaimed 31 nr_dirty 1 nr_unqueued_dirty 1 nr_writeback 0 >>>>> nr_activate[1] 1 nr_ref_keep 0 f RECLAIM_WB_FILE|RECLAIM_WB_ASYNC >>>>> total_cost 96 total_exe 2365 dirty_cost 51 total_exe 2365 >>>>> >>>>> kswapd0-96 ( 96) [006] ..... 412.822532: >>>>> mm_vmscan_lru_shrink_inactive: [Justin] nid 0 nr_scanned 32 nr_taken 32 >>>>> nr_reclaimed 0 nr_dirty 32 nr_unqueued_dirty 32 nr_writeback 0 >>>>> nr_activate[1] 19 nr_ref_keep 13 f RECLAIM_WB_FILE|RECLAIM_WB_ASYNC >>>>> total_cost 88 total_exe 605 dirty_cost 73 total_exe 605 >>>> I appreciate that you can put probes in and determine the cost, but do >>>> you see improvements for a real workload? Like doing a kernel compile >>>> -- does it speed up at all? >>> Can you help share a method for testing thread workload, like kswapd? >> Something dirt simple like 'time make -j8'. > Two compilations were conducted separately, and compared to the > unmodified compilation, > the compilation time for adding modified patches had a certain > reduction, as follows: > > Compilation command: > make distclean -j8 > make ARCH=x86_64 x86_64_defconfig > time make -j8 > > 1.Unmodified Compilation time: > real 2m40.276s > user 16m2.956s > sys 2m14.738s > > real 2m40.136s > user 16m2.617s > sys 2m14.722s > > 2.[Patch v2 1/2] Modified Compilation time: > real 2m40.067s > user 16m3.164s > sys 2m14.211s > > real 2m40.123s > user 16m2.439s > sys 2m14.508s > > 3 [Patch v2 1/2] + [Patch v2 2/2] Modified Compilation time: > real 2m40.367s > user 16m3.738s > sys 2m13.662s > > real 2m40.014s > user 16m3.108s > sys 2m14.096s > To get expressive numbers two iterations are usually not sufficient. How much memory does you system have? Does vmscan even ever get active?
在 2023/10/24 15:07, David Hildenbrand 写道: > On 24.10.23 04:04, zhiguojiang wrote: >> >> >> 在 2023/10/23 21:01, Matthew Wilcox 写道: >>> On Mon, Oct 23, 2023 at 08:44:55PM +0800, zhiguojiang wrote: >>>> 在 2023/10/23 20:21, Matthew Wilcox 写道: >>>>> On Mon, Oct 23, 2023 at 04:07:28PM +0800, zhiguojiang wrote: >>>>>>> Are you seeing measurable changes for any workloads? It >>>>>>> certainly seems >>>>>>> like you should, but it would help if you chose a test from >>>>>>> mmtests and >>>>>>> showed how performance changed on your system. >>>>>> In one mmtest, the max times for a invalid recyling of a >>>>>> folio_list dirty >>>>>> folio that does not support pageout and has been activated in >>>>>> shrink_folio_list() are: cost=51us, exe=2365us. >>>>>> >>>>>> Calculate according to this formula: dirty_cost / total_cost * >>>>>> 100%, the >>>>>> recyling efficiency of dirty folios can be improved 53.13%、82.95%. >>>>>> >>>>>> So this patch can optimize shrink efficiency and reduce the >>>>>> workload of >>>>>> kswapd to a certain extent. >>>>>> >>>>>> kswapd0-96 ( 96) [005] ..... 387.218548: >>>>>> mm_vmscan_lru_shrink_inactive: [Justin] nid 0 nr_scanned 32 >>>>>> nr_taken 32 >>>>>> nr_reclaimed 31 nr_dirty 1 nr_unqueued_dirty 1 nr_writeback 0 >>>>>> nr_activate[1] 1 nr_ref_keep 0 f RECLAIM_WB_FILE|RECLAIM_WB_ASYNC >>>>>> total_cost 96 total_exe 2365 dirty_cost 51 total_exe 2365 >>>>>> >>>>>> kswapd0-96 ( 96) [006] ..... 412.822532: >>>>>> mm_vmscan_lru_shrink_inactive: [Justin] nid 0 nr_scanned 32 >>>>>> nr_taken 32 >>>>>> nr_reclaimed 0 nr_dirty 32 nr_unqueued_dirty 32 nr_writeback 0 >>>>>> nr_activate[1] 19 nr_ref_keep 13 f RECLAIM_WB_FILE|RECLAIM_WB_ASYNC >>>>>> total_cost 88 total_exe 605 dirty_cost 73 total_exe 605 >>>>> I appreciate that you can put probes in and determine the cost, >>>>> but do >>>>> you see improvements for a real workload? Like doing a kernel >>>>> compile >>>>> -- does it speed up at all? >>>> Can you help share a method for testing thread workload, like kswapd? >>> Something dirt simple like 'time make -j8'. >> Two compilations were conducted separately, and compared to the >> unmodified compilation, >> the compilation time for adding modified patches had a certain >> reduction, as follows: >> >> Compilation command: >> make distclean -j8 >> make ARCH=x86_64 x86_64_defconfig >> time make -j8 >> >> 1.Unmodified Compilation time: >> real 2m40.276s >> user 16m2.956s >> sys 2m14.738s >> >> real 2m40.136s >> user 16m2.617s >> sys 2m14.722s >> >> 2.[Patch v2 1/2] Modified Compilation time: >> real 2m40.067s >> user 16m3.164s >> sys 2m14.211s >> >> real 2m40.123s >> user 16m2.439s >> sys 2m14.508s >> >> 3 [Patch v2 1/2] + [Patch v2 2/2] Modified Compilation time: >> real 2m40.367s >> user 16m3.738s >> sys 2m13.662s >> >> real 2m40.014s >> user 16m3.108s >> sys 2m14.096s >> > > To get expressive numbers two iterations are usually not sufficient. > How much memory does you system have? Does vmscan even ever get active? Test system memory: MemTotal: 8161608 kB. When multiple Apps were opened, vmscan can get active. I can capture a lot of tracelog data through testing, I only posted two sets of tracelog data.
在 2023/10/24 15:21, zhiguojiang 写道: > > > 在 2023/10/24 15:07, David Hildenbrand 写道: >> On 24.10.23 04:04, zhiguojiang wrote: >>> >>> >>> 在 2023/10/23 21:01, Matthew Wilcox 写道: >>>> On Mon, Oct 23, 2023 at 08:44:55PM +0800, zhiguojiang wrote: >>>>> 在 2023/10/23 20:21, Matthew Wilcox 写道: >>>>>> On Mon, Oct 23, 2023 at 04:07:28PM +0800, zhiguojiang wrote: >>>>>>>> Are you seeing measurable changes for any workloads? It >>>>>>>> certainly seems >>>>>>>> like you should, but it would help if you chose a test from >>>>>>>> mmtests and >>>>>>>> showed how performance changed on your system. >>>>>>> In one mmtest, the max times for a invalid recyling of a >>>>>>> folio_list dirty >>>>>>> folio that does not support pageout and has been activated in >>>>>>> shrink_folio_list() are: cost=51us, exe=2365us. >>>>>>> >>>>>>> Calculate according to this formula: dirty_cost / total_cost * >>>>>>> 100%, the >>>>>>> recyling efficiency of dirty folios can be improved 53.13%、82.95%. >>>>>>> >>>>>>> So this patch can optimize shrink efficiency and reduce the >>>>>>> workload of >>>>>>> kswapd to a certain extent. >>>>>>> >>>>>>> kswapd0-96 ( 96) [005] ..... 387.218548: >>>>>>> mm_vmscan_lru_shrink_inactive: [Justin] nid 0 nr_scanned 32 >>>>>>> nr_taken 32 >>>>>>> nr_reclaimed 31 nr_dirty 1 nr_unqueued_dirty 1 nr_writeback 0 >>>>>>> nr_activate[1] 1 nr_ref_keep 0 f RECLAIM_WB_FILE|RECLAIM_WB_ASYNC >>>>>>> total_cost 96 total_exe 2365 dirty_cost 51 total_exe 2365 >>>>>>> >>>>>>> kswapd0-96 ( 96) [006] ..... 412.822532: >>>>>>> mm_vmscan_lru_shrink_inactive: [Justin] nid 0 nr_scanned 32 >>>>>>> nr_taken 32 >>>>>>> nr_reclaimed 0 nr_dirty 32 nr_unqueued_dirty 32 nr_writeback 0 >>>>>>> nr_activate[1] 19 nr_ref_keep 13 f RECLAIM_WB_FILE|RECLAIM_WB_ASYNC >>>>>>> total_cost 88 total_exe 605 dirty_cost 73 total_exe 605 >>>>>> I appreciate that you can put probes in and determine the cost, >>>>>> but do >>>>>> you see improvements for a real workload? Like doing a kernel >>>>>> compile >>>>>> -- does it speed up at all? >>>>> Can you help share a method for testing thread workload, like kswapd? >>>> Something dirt simple like 'time make -j8'. >>> Two compilations were conducted separately, and compared to the >>> unmodified compilation, >>> the compilation time for adding modified patches had a certain >>> reduction, as follows: >>> >>> Compilation command: >>> make distclean -j8 >>> make ARCH=x86_64 x86_64_defconfig >>> time make -j8 >>> >>> 1.Unmodified Compilation time: >>> real 2m40.276s >>> user 16m2.956s >>> sys 2m14.738s >>> >>> real 2m40.136s >>> user 16m2.617s >>> sys 2m14.722s >>> >>> 2.[Patch v2 1/2] Modified Compilation time: >>> real 2m40.067s >>> user 16m3.164s >>> sys 2m14.211s >>> >>> real 2m40.123s >>> user 16m2.439s >>> sys 2m14.508s >>> >>> 3 [Patch v2 1/2] + [Patch v2 2/2] Modified Compilation time: >>> real 2m40.367s >>> user 16m3.738s >>> sys 2m13.662s >>> >>> real 2m40.014s >>> user 16m3.108s >>> sys 2m14.096s >>> >> >> To get expressive numbers two iterations are usually not sufficient. >> How much memory does you system have? Does vmscan even ever get active? > Test system memory: MemTotal: 8161608 kB. When multiple Apps were > opened, vmscan can get active. I can capture a lot of tracelog data > through testing, I only posted two sets of tracelog data. Hi, please help to continue reviewing this path and draw a conclusion on whether it can be merged. Thanks. > >